DOMAIN:DATABASE — PERFORMANCE¶
OWNER: boris, yoanna
UPDATED: 2026-03-24
SCOPE: PostgreSQL performance tuning for all GE projects
PERFORMANCE:EXPLAIN_ANALYZE¶
HOW TO USE¶
ALWAYS: use ANALYZE (actually runs the query) — plain EXPLAIN only estimates
ALWAYS: include BUFFERS — shows I/O activity
FORMAT: TEXT for human reading, JSON for programmatic analysis
CAUTION: ANALYZE executes the query — for destructive queries (UPDATE, DELETE), wrap in a transaction and ROLLBACK:
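A minimal sketch of the pattern (table name and predicate are illustrative):

```sql
-- EXPLAIN ANALYZE executes the UPDATE, but the transaction is rolled back,
-- so no rows are actually changed
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders SET status = 'archived'
WHERE created_at < now() - interval '1 year';
ROLLBACK;
```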
READING THE OUTPUT¶
Seq Scan on orders (cost=0.00..1234.00 rows=50000 width=100) (actual time=0.01..45.23 rows=50000 loops=1)
Filter: (status = 'pending')
Rows Removed by Filter: 450000
Buffers: shared hit=8234 read=1234
KEY_FIELDS:
- cost=startup..total — planner estimate, not actual time. First number is startup cost, second is total.
- rows=N — estimated row count (planner) vs actual rows=N — real row count
- actual time=start..end — real execution time in milliseconds
- loops=N — how many times this node executed (important for nested loops)
- Rows Removed by Filter — rows examined but discarded. HIGH NUMBER = missing index
- Buffers: shared hit=N — pages found in cache. read=N — pages read from disk. High read = cold cache or table too big for memory.
NODE TYPES TO KNOW¶
SEQ_SCAN: full table scan. OK for small tables (<10k rows). BAD for large tables — add an index.
INDEX_SCAN: uses index to find rows, then fetches from table. GOOD for selective queries.
INDEX_ONLY_SCAN: all data from index, no table fetch. BEST — means index covers the query.
BITMAP_INDEX_SCAN + BITMAP_HEAP_SCAN: index scan for moderate selectivity. PostgreSQL combines multiple indexes. OK.
NESTED_LOOP: for each row in outer, scan inner. GOOD for small outer + indexed inner. BAD for large both.
HASH_JOIN: builds hash table from smaller set, probes with larger. GOOD for equi-joins on larger sets.
MERGE_JOIN: both sides sorted, merge. GOOD when both are already sorted.
SORT: explicit sort. Check if an index can eliminate it. Sort Method: external merge = ran out of work_mem.
HASH_AGGREGATE: grouping via hash table. Watch for Batches: N > 1 = exceeded work_mem.
RED FLAGS IN EXPLAIN¶
FLAG: Seq Scan on table with >10k rows and selective WHERE clause → missing index
FLAG: Rows Removed by Filter >> actual rows → index not selective enough or missing
FLAG: Nested Loop with large outer set → consider hash join (check join conditions, indexes)
FLAG: Sort Method: external merge Disk → increase work_mem or add index to avoid sort
FLAG: Buffers read >> shared hit → table/index not cached, may need more shared_buffers
FLAG: actual rows >> estimated rows → stale statistics, run ANALYZE on the table
FLAG: actual rows << estimated rows → stale statistics or correlation issue
FLAG: loops > 1 with high time per loop → repeated expensive operation
PERFORMANCE:INDEX_STRATEGY¶
WHEN TO ADD AN INDEX¶
ALWAYS: foreign key columns (PostgreSQL does NOT auto-create these)
ALWAYS: columns used in WHERE clauses of frequent queries
ALWAYS: columns used in JOIN conditions
ALWAYS: columns used in ORDER BY for queries needing sorted results
CONSIDER: columns used in GROUP BY
NEVER: columns on very small tables (<1000 rows) — seq scan is faster
NEVER: columns with very low cardinality on large tables (boolean, status with 3 values) — use partial index instead
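A sketch of the partial-index alternative for a low-cardinality column (table and column names are illustrative):

```sql
-- Index only the slice of the table that hot queries actually touch;
-- the planner uses it when the WHERE clause implies status = 'pending'
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
-- Served by the index above:
-- SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at LIMIT 20;
```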
INDEX SELECTION GUIDE¶
B-TREE → equality, range, ORDER BY, LIKE 'prefix%' (prefix LIKE needs the C collation or a text_pattern_ops index)
GIN → JSONB operators, array containment, full-text search, trigram similarity
GIST → range types, geometric, spatial (PostGIS), exclusion constraints
BRIN → very large tables with natural ordering (time-series, append-only logs)
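One illustrative DDL statement per index type (table and column names are assumptions):

```sql
CREATE INDEX idx_orders_created ON orders (created_at);            -- B-tree: range filters, ORDER BY
CREATE INDEX idx_events_payload ON events USING gin (payload);     -- GIN: JSONB containment (@>)
CREATE INDEX idx_bookings_period ON bookings USING gist (period);  -- GiST: range overlap, exclusion
CREATE INDEX idx_logs_created ON logs USING brin (created_at);     -- BRIN: huge append-only table
```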
COVERING INDEX (INDEX-ONLY SCAN)¶
-- Query: SELECT email, display_name FROM users WHERE is_active = true ORDER BY email
-- Covering index:
CREATE INDEX idx_users_active_email ON users (email) INCLUDE (display_name) WHERE is_active = true;
RESULT: Index Only Scan — zero table heap access (provided VACUUM keeps the visibility map current), maximum performance
NOTE: INCLUDE columns are stored in the index but not used for search — only for retrieval
NOTE: wider indexes = more storage, slower writes — only cover frequently executed queries
MONITORING INDEX USAGE¶
-- Indexes that are never used (candidates for dropping)
SELECT s.schemaname, s.relname AS table, s.indexrelname AS index,
s.idx_scan AS scans, pg_size_pretty(pg_relation_size(s.indexrelid)) AS size
FROM pg_stat_user_indexes s
JOIN pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0
AND NOT i.indisunique -- keep unique constraint indexes
AND s.schemaname = 'public'
ORDER BY pg_relation_size(s.indexrelid) DESC;
-- Most-scanned indexes (confirm they are working)
SELECT relname, indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC LIMIT 20;
INDEX BLOAT¶
CAUSE: frequent updates/deletes leave dead index entries until REINDEX
DETECT:
-- Rough heuristic: index size relative to table size (use pgstattuple for true bloat numbers)
SELECT nspname, relname AS index_name,
round(100.0 * pg_relation_size(indexrelid) / pg_relation_size(indrelid), 1) AS index_to_table_pct
FROM pg_index
JOIN pg_class ON pg_class.oid = indexrelid
JOIN pg_namespace ON pg_namespace.oid = relnamespace
WHERE pg_relation_size(indrelid) > 0
ORDER BY pg_relation_size(indexrelid) DESC LIMIT 20;
REINDEX INDEX CONCURRENTLY idx_name; — does not block reads/writes (PG 12+)
PERFORMANCE:QUERY_OPTIMIZATION¶
PATTERN: N+1 QUERY¶
SYMPTOM: 1 query to get list + N queries to get related data (one per row)
CAUSE: ORM lazy loading or manual loop queries
FIX: use JOIN or Drizzle's with (relational queries)
// BAD: N+1
const users = await db.select().from(usersTable);
for (const user of users) {
const orders = await db.select().from(ordersTable).where(eq(ordersTable.userId, user.id));
}
// GOOD: single query with join
const usersWithOrders = await db
.select()
.from(usersTable)
.leftJoin(ordersTable, eq(usersTable.id, ordersTable.userId));
PATTERN: MISSING LIMIT¶
SYMPTOM: query returns thousands of rows when UI shows 20
FIX: always LIMIT + OFFSET or cursor-based pagination
PREFER: cursor-based pagination for large datasets (OFFSET becomes slow at high values)
// Cursor-based pagination
const nextPage = await db
.select()
.from(ordersTable)
.where(gt(ordersTable.createdAt, lastSeenCreatedAt))
.orderBy(ordersTable.createdAt)
.limit(20);
PATTERN: SELECT *¶
SYMPTOM: fetching all columns when only 2 are needed
FIX: select specific columns — reduces I/O, enables index-only scans
// BAD
const results = await db.select().from(usersTable);
// GOOD
const results = await db.select({ id: usersTable.id, email: usersTable.email }).from(usersTable);
PATTERN: FUNCTION IN WHERE CLAUSE¶
SYMPTOM: WHERE lower(email) = 'foo@bar.com' cannot use index on email
FIX: functional index: CREATE INDEX idx_users_email_lower ON users (lower(email));
ALTERNATIVE: use citext type for case-insensitive columns
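The planner only uses a functional index when the query repeats the indexed expression exactly; a small sketch (names illustrative):

```sql
CREATE INDEX idx_users_email_lower ON users (lower(email));
-- Uses the index: the expression matches the index definition
SELECT id FROM users WHERE lower(email) = lower('Foo@Bar.com');
-- Does NOT use it: the bare column does not match lower(email)
-- SELECT id FROM users WHERE email = 'foo@bar.com';
```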
PATTERN: OR IN WHERE CLAUSE¶
SYMPTOM: WHERE status = 'a' OR status = 'b' may not use index efficiently
FIX: WHERE status IN ('a', 'b') — optimizer handles this better
FIX_FOR_DIFFERENT_COLUMNS: WHERE a = 1 OR b = 2 → rewrite as a UNION of two indexed queries (UNION rather than UNION ALL, so rows matching both predicates are not duplicated)
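A sketch of the rewrite for the different-columns case (column names are illustrative):

```sql
-- Each branch can use its own index (one on customer_id, one on assignee_id);
-- UNION deduplicates rows that happen to match both predicates
SELECT * FROM orders WHERE customer_id = 42
UNION
SELECT * FROM orders WHERE assignee_id = 42;
```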
PATTERN: CORRELATED SUBQUERY¶
SYMPTOM: subquery re-executed for every row of outer query
FIX: rewrite as JOIN or CTE
-- BAD: correlated subquery
SELECT * FROM orders o WHERE amount > (SELECT AVG(amount) FROM orders WHERE user_id = o.user_id);
-- GOOD: JOIN with aggregate
SELECT o.* FROM orders o
JOIN (SELECT user_id, AVG(amount) as avg_amount FROM orders GROUP BY user_id) a
ON o.user_id = a.user_id AND o.amount > a.avg_amount;
PATTERN: IMPLICIT CASTING¶
SYMPTOM: WHERE uuid_column = '...' — string vs UUID comparison may prevent index use
FIX: ensure parameter types match column types — Drizzle handles this, raw SQL may not
PERFORMANCE:CONNECTION_POOLING¶
FORMULA¶
pool_size = (core_count * 2) + effective_spindle_count
FOR_SSD: effective_spindle_count = 1
EXAMPLE: 4 cores, SSD → pool_size = (4 * 2) + 1 = 9
NOTE: this is for the DATABASE connection limit, not the application pool
APPLICATION_POOL: should be 2x-4x the database pool (PgBouncer multiplexes)
UPCLOUD_MANAGED: max_connections is fixed by plan tier — check UpCloud dashboard
PGBOUNCER SIZING¶
[pgbouncer]
pool_mode = transaction
max_client_conn = 200 # Application connections to PgBouncer
default_pool_size = 20 # PgBouncer connections to PostgreSQL
reserve_pool_size = 5 # Emergency extra connections
reserve_pool_timeout = 3 # Seconds before using reserve pool
server_idle_timeout = 600 # Close idle backend connections after 10min
MONITORING CONNECTIONS¶
-- Current connection count vs maximum
SELECT count(*) AS active, setting AS max
FROM pg_stat_activity, pg_settings
WHERE pg_settings.name = 'max_connections'
GROUP BY setting;
-- Connections by state
SELECT state, count(*) FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state;
ALERT: if active connections > 80% of max_connections, investigate immediately
ALERT: idle in transaction connections are holding locks — find and fix the application code
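To track the idle-in-transaction sessions down, something like:

```sql
-- Longest-idle offenders first; 'query' shows the last statement the session ran
SELECT pid, usename, now() - state_change AS idle_for, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY state_change;
```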
PERFORMANCE:VACUUM¶
WHY VACUUM MATTERS¶
PostgreSQL uses MVCC — UPDATE creates a new row version, old version remains until VACUUM.
Without VACUUM: table bloats, indexes bloat, queries slow down, transaction ID wraparound risk.
AUTOVACUUM CONFIGURATION¶
DEFAULT autovacuum works for most tables. Tune only when needed.
-- Check autovacuum settings per table
SELECT relname, reloptions FROM pg_class WHERE relname = 'your_table';
-- Per-table autovacuum tuning for high-write tables
ALTER TABLE orders SET (
autovacuum_vacuum_threshold = 1000, -- minimum dead tuples before vacuum (default 50)
autovacuum_vacuum_scale_factor = 0.05, -- fraction of table size (default 0.2)
autovacuum_analyze_threshold = 500, -- minimum changed rows before analyze (default 50)
autovacuum_analyze_scale_factor = 0.025 -- fraction of table size (default 0.1)
);
WHEN_TO_TUNE: table has >1M rows AND high update/delete rate
RULE: lower thresholds = more frequent vacuum = less bloat but more CPU usage
MONITORING VACUUM¶
-- When was each table last vacuumed and analyzed?
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze,
n_dead_tup, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
-- Tables with most dead tuples (need vacuum attention)
SELECT relname, n_dead_tup, n_live_tup,
round(n_dead_tup::numeric / NULLIF(n_live_tup, 0) * 100, 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
ALERT: dead_pct > 20% → autovacuum is falling behind, tune or run manual VACUUM
ALERT: n_dead_tup > n_live_tup → table is more dead than alive, VACUUM urgently needed
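A manual vacuum for a table that autovacuum is not keeping up with (table name illustrative):

```sql
-- VERBOSE reports what was reclaimed; ANALYZE refreshes planner statistics in the same pass
VACUUM (VERBOSE, ANALYZE) orders;
```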
TRANSACTION ID WRAPAROUND¶
CRITICAL: PostgreSQL transaction IDs wrap around at ~2 billion. Near the limit the server stops accepting writes to prevent data loss, and recovery may require single-user mode.
MONITOR:
SELECT datname, age(datfrozenxid) AS xid_age,
current_setting('autovacuum_freeze_max_age')::int AS freeze_max
FROM pg_database
ORDER BY age(datfrozenxid) DESC;
ALERT: if xid_age > 1,000,000,000 → CRITICAL, manual intervention needed immediately
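To see which tables are holding the oldest frozen XID (freeze those first):

```sql
-- Oldest relfrozenxid = biggest contributor to the database's xid_age
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;
-- Then: VACUUM (FREEZE, VERBOSE) events;  -- illustrative table name
```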
PERFORMANCE:TABLE_PARTITIONING¶
WHEN TO PARTITION¶
CONSIDER: table exceeds 50M rows AND queries consistently filter by partition key
CONSIDER: table grows continuously (time-series, logs, events) AND old data needs archival
DO_NOT_PARTITION: tables under 10M rows — overhead exceeds benefit
DO_NOT_PARTITION: tables queried without the partition key — every partition gets scanned
PARTITION STRATEGIES¶
RANGE: time-based partitioning (most common) — monthly, weekly, daily
CREATE TABLE events (
id UUID DEFAULT gen_random_uuid(),
created_at TIMESTAMPTZ NOT NULL,
event_type TEXT NOT NULL,
payload JSONB
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_01 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
LIST: category-based partitioning (by region, by type)
HASH: even distribution when no natural partition key (rarely used)
PARTITION MAINTENANCE¶
AUTOMATE: partition creation for future months — cron job or pg_partman extension
ARCHIVE: detach old partitions instead of deleting data:
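A sketch of both maintenance tasks (partition names follow the convention used above):

```sql
-- Create next month's partition ahead of time (cron this, or let pg_partman do it)
CREATE TABLE IF NOT EXISTS events_2026_03 PARTITION OF events
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
-- Detach an old partition: the data survives as a standalone table for archival
ALTER TABLE events DETACH PARTITION events_2026_01 CONCURRENTLY;  -- CONCURRENTLY is PG 14+
```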
PERFORMANCE:MATERIALIZED_VIEWS¶
WHEN TO USE¶
USE_FOR: expensive aggregate queries that do not need real-time data
USE_FOR: dashboard summaries, reporting queries, leaderboards
REFRESH: on a schedule (cron) or after relevant data changes
NOT_FOR: data that must be real-time — use regular views or optimized queries instead
IMPLEMENTATION¶
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT date_trunc('month', created_at) AS month,
tenant_id,
sum(amount) AS total,
count(*) AS order_count
FROM orders
WHERE status = 'completed'
GROUP BY 1, 2;
CREATE UNIQUE INDEX idx_mv_monthly_revenue ON monthly_revenue (month, tenant_id);
-- Refresh without blocking reads
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;
NOTE: CONCURRENTLY requires a unique index on the materialized view
NOTE: first refresh cannot be CONCURRENT — must be regular refresh
SCHEDULE: refresh via cron job or trigger after bulk data changes
PERFORMANCE:PG_STAT_STATEMENTS¶
SETUP¶
CONFIG (postgresql.conf):
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = top
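Changing shared_preload_libraries requires a server restart; after that, the view still has to be enabled per database:

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```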
TOP QUERIES BY TIME¶
SELECT query, calls, mean_exec_time::numeric(10,2) AS avg_ms,
total_exec_time::numeric(12,2) AS total_ms,
rows, shared_blks_hit, shared_blks_read
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
TOP QUERIES BY CALLS¶
SELECT query, calls, mean_exec_time::numeric(10,2) AS avg_ms
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;
RESET AFTER OPTIMIZATION¶
NOTE: reset with SELECT pg_stat_statements_reset(); after deploying optimizations to get clean metrics
NOTE: UpCloud managed DB may have pg_stat_statements pre-configured — check with Arjan
PERFORMANCE:QUICK_WINS_CHECKLIST¶
- [ ] pg_stat_statements enabled and monitored
- [ ] All foreign key columns have indexes
- [ ] Connection pooling (PgBouncer) in front of production DB
- [ ] autovacuum running (check last_autovacuum timestamps)
- [ ] No idle in transaction connections lingering
- [ ] No Seq Scans on tables >10k rows in frequent queries
- [ ] LIMIT on all list queries
- [ ] No SELECT * in production code
- [ ] work_mem set appropriately (4MB-64MB depending on query complexity)
- [ ] shared_buffers = 25% of available RAM
- [ ] effective_cache_size = 75% of available RAM
- [ ] random_page_cost = 1.1 for SSD (default 4.0 is for spinning disk)
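To audit the four tuning parameters above on a running instance:

```sql
-- 'unit' disambiguates values (8kB pages for shared_buffers/effective_cache_size, kB for work_mem)
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('work_mem', 'shared_buffers', 'effective_cache_size', 'random_page_cost');
```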