Performance Fundamentals

Register performance depends on three factors: database efficiency, application architecture, and caching strategy. Optimize all three for best results.

Database

Schema design, indexing, queries

High Impact

Application

Code efficiency, connection pooling

Medium Impact

Caching

Read replicas, CDN, application cache

High Impact

Database Optimization

Schema Design Principles

  • Normalize for integrity: Keep concepts, items, and versions in separate tables
  • Denormalize for reads: Add computed columns for common queries
  • Partition large tables: Partition by date or status for tables over 10M rows
  • Use appropriate types: Don't store JSON in text columns; use native JSON types

Indexing Strategy

Proper indexing is the single most important performance optimization:

Query PatternRecommended IndexNotes
Lookup by identifierUNIQUE INDEX on identifierAlways needed
Filter by statusINDEX on (valid, published)Composite for common combinations
Date range queriesINDEX on date_added, date_modifiedSupports pagination by date
Full-text searchFULLTEXT or GIN indexDatabase-specific
Concept hierarchyINDEX on parent_id, CTE optimizationConsider closure tables
JSON content queriesGIN index on JSON columnFor PostgreSQL

Query Optimization

Good: Paginated with Index

SELECT * FROM items
WHERE status = 'valid'
ORDER BY date_added DESC
LIMIT 20 OFFSET 0;

Uses index on (status, date_added)

Bad: Unfiltered Large Result

SELECT * FROM items
ORDER BY date_added DESC;

Scans entire table, sorts in memory

Good: Cursor-Based Pagination

SELECT * FROM items
WHERE date_added < :cursor
ORDER BY date_added DESC
LIMIT 20;

Uses index, consistent performance

Connection Pooling

Database connections are expensive. Use connection pooling:

  • Pool size: 10-20 connections per application instance
  • Timeout: Set connection timeout (30-60 seconds)
  • Validation: Test connections before use
  • External pooler: Consider PgBouncer for PostgreSQL

Caching Strategy

Cache Layers

CDN

  • Static assets (JS, CSS, images)
  • Published item content (immutable)
  • API responses for public data

TTL: 1 hour - 1 day

Application Cache

  • Concept hierarchies
  • Domain value sets
  • Register metadata

TTL: 5-15 minutes

Database Cache

  • Query result cache
  • Item lookup cache
  • Count/aggregation cache

TTL: 1-5 minutes

Cache Invalidation

Cache invalidation is hard. Strategies:

  • Time-based: Simple but may serve stale data
  • Event-based: Invalidate on write (more complex)
  • Version-based: Include version in cache key
  • Tag-based: Group related entries for bulk invalidation

What to Cache

Data TypeCache?Strategy
Concept hierarchiesYesCache fully, invalidate on any change
Domain valuesYesCache fully, rare invalidation
Published itemsYesCache by version, long TTL
Draft itemsNoToo frequently updated
Search resultsPartialShort TTL, common queries only
Proposal queuesNoMust be real-time

Scaling Patterns

Vertical vs Horizontal

Vertical Scaling

Increase resources on single server

  • Simpler to implement
  • Limited ceiling
  • Higher cost per unit
  • No code changes needed

Best for: Small-medium registers (up to 10M items)

Horizontal Scaling

Add more servers

  • More complex architecture
  • Near-infinite ceiling
  • Better cost efficiency at scale
  • Requires application changes

Best for: Large registers (10M+ items)

Read Scaling

Most register workloads are read-heavy (90%+ reads). Scale reads:

  1. Add read replicas: Direct read queries to replicas
  2. Implement caching: Reduce database load
  3. Use CDN: Serve static content at edge
  4. Separate search: Use Elasticsearch/Solr for search

Write Scaling

Write scaling is harder. Options:

  • Queue writes: Async processing for non-critical writes
  • Batch operations: Combine multiple writes
  • Sharding: Partition data across databases
Sharding Complexity: Sharding adds significant complexity. Only consider it when you've exhausted other options and have 100M+ items or very high write volume.

Benchmarks

Expected performance on modest hardware (4 vCPU, 16GB RAM, SSD):

Read Performance

OperationTargetNotes
Single item lookup (cached)< 5msp99 latency
Single item lookup (uncached)< 20msp99 latency
Paginated list (20 items)< 50msp99 latency
Search query< 100msp99 latency
Concept hierarchy traversal< 30ms5 levels, cached

Write Performance

OperationTargetNotes
Create item< 50msSingle item, indexed
Update item< 30msSingle field change
Bulk import1000 items/secBypassing governance
Proposal submission< 100msWith validation

Throughput

ScaleReads/secWrites/secArchitecture
Small1,000100Single server
Medium10,000500Primary + 2 replicas
Large100,0002,000Cluster with caching

Performance Checklist

Database

  • ☐ Indexes on all lookup fields
  • ☐ Query plans reviewed
  • ☐ Connection pooling enabled
  • ☐ Slow query logging enabled

Application

  • ☐ N+1 queries eliminated
  • ☐ Pagination on all lists
  • ☐ Response compression enabled
  • ☐ Timeout handling in place

Caching

  • ☐ Static assets on CDN
  • ☐ Application cache for hot data
  • ☐ Cache invalidation tested
  • ☐ Cache hit ratio monitored

Monitoring

  • ☐ Latency percentiles tracked
  • ☐ Throughput monitored
  • ☐ Error rate alerts
  • ☐ Regular load testing

Related Topics