PostgreSQL — Indexing¶

OWNER: boris, yoanna ALSO_USED_BY: urszula, maxim, floris, hugo, stef LAST_VERIFIED: 2026-03-26 GE_STACK_VERSION: PostgreSQL 16.x

Overview¶

Correct indexing is the single biggest performance lever in PostgreSQL. Agents must understand index types, when to use them, and how to verify they work. Blind index creation wastes storage and slows writes.

Index Types¶

B-tree (default)¶

The standard index. Good for equality and range queries.

CREATE INDEX tasks_status_idx ON tasks (status);
CREATE INDEX tasks_created_at_idx ON tasks (created_at);

CHECK: Agent is adding an index. IF: The query uses =, <, >, <=, >=, BETWEEN, IN, ORDER BY. THEN: B-tree is correct.

GIN (Generalized Inverted Index)¶

For JSONB, arrays, full-text search, and trigram matching.

-- JSONB containment queries
CREATE INDEX agents_skills_gin ON agents USING gin (skills);
-- Supports: WHERE skills @> '["typescript"]'

-- Full-text search
CREATE INDEX docs_search_gin ON documents USING gin (to_tsvector('english', content));
-- Supports: WHERE to_tsvector('english', content) @@ to_tsquery('authentication')

-- Trigram similarity (requires pg_trgm)
CREATE INDEX clients_name_trgm ON clients USING gin (name gin_trgm_ops);
-- Supports: WHERE name % 'acme' (similarity search)

CHECK: Agent is querying JSONB with @>, ?, ?|, or ?& operators. IF: No GIN index exists on the JSONB column. THEN: Add one. Without GIN, JSONB queries do sequential scans.

CHECK: Agent is implementing search. IF: Using LIKE '%term%'. THEN: STOP. Use pg_trgm GIN index or full-text search instead. LIKE '%..%' cannot use B-tree indexes.

GiST (Generalized Search Tree)¶

For geometric, range, and nearest-neighbor queries.

-- Range queries (e.g., overlapping time ranges)
CREATE INDEX bookings_period_gist ON bookings USING gist (
  tstzrange(start_at, end_at)
);
-- Supports: WHERE tstzrange(start_at, end_at) && tstzrange('2026-03-01', '2026-03-31')

-- PostGIS geospatial (if approved)
CREATE INDEX locations_geom_gist ON locations USING gist (geom);

CHECK: Agent needs range overlap or containment queries. IF: Comparing start/end pairs for overlap. THEN: Use GiST with tstzrange. B-tree cannot do overlap checks.

BRIN (Block Range Index)¶

For large append-only tables where data is physically ordered.

-- Good for time-series data where rows are inserted in order
CREATE INDEX metrics_recorded_at_brin ON agent_metrics USING brin (recorded_at);

CHECK: Agent is indexing a large (1M+ rows) append-only table. IF: Queries filter by the same column data is ordered by (e.g., timestamp). THEN: Consider BRIN. 1000x smaller than B-tree for time-series. IF: Data is not physically ordered or table has many updates. THEN: BRIN is useless. Use B-tree.

Composite Indexes¶

-- Order matters: leftmost column is the most selective filter
CREATE INDEX tasks_agent_status_idx ON tasks (agent_id, status);
-- Supports: WHERE agent_id = 'urszula' AND status = 'running'
-- Supports: WHERE agent_id = 'urszula' (uses index)
-- Does NOT support: WHERE status = 'running' (cannot skip first column)

CHECK: Agent is adding a composite index. IF: The query filters on multiple columns. THEN: Put the most selective (highest cardinality) column first. IF: Only one column is queried independently. THEN: That column needs its own single-column index too.

Partial Indexes¶

Index only the rows that matter:

-- Only index non-null team assignments
CREATE INDEX clients_assigned_team_idx ON clients (assigned_team)
  WHERE assigned_team IS NOT NULL;

-- Only index active tasks
CREATE INDEX tasks_active_idx ON tasks (agent_id, created_at)
  WHERE status IN ('pending', 'running');

CHECK: Agent is adding an index on a column where most queries filter to a small subset. IF: >70% of rows would be excluded by a common filter condition. THEN: Use a partial index. Smaller index = faster queries, less storage.

Expression Indexes¶

Index computed values:

-- Case-insensitive name search
CREATE INDEX clients_name_lower_idx ON clients (lower(name));
-- Supports: WHERE lower(name) = lower('Acme Corp')

-- Date extraction
CREATE INDEX tasks_created_date_idx ON tasks (date_trunc('day', created_at));
-- Supports: WHERE date_trunc('day', created_at) = '2026-03-26'

-- JSONB path extraction
CREATE INDEX agents_provider_idx ON agents ((skills->>'primary_language'));

CHECK: Agent is querying with a function on a column (e.g., lower(), date_trunc(), JSONB ->>). IF: No expression index exists. THEN: PostgreSQL cannot use a regular B-tree index on the column. Create an expression index matching the exact function call.

Unique Indexes¶

-- Unique constraint (also creates an index)
CREATE UNIQUE INDEX clients_vat_number_uniq ON clients (vat_number)
  WHERE vat_number IS NOT NULL;

CHECK: Agent needs a uniqueness guarantee. IF: The column is nullable. THEN: Use a partial unique index with WHERE col IS NOT NULL. PostgreSQL considers each NULL as distinct — a regular unique index would allow multiple NULLs (which is correct, but may not be desired).

EXPLAIN ANALYZE¶

The ONLY way to verify index usage.

-- Check if index is used
EXPLAIN ANALYZE
SELECT * FROM tasks
WHERE agent_id = 'urszula' AND status = 'running'
ORDER BY created_at DESC
LIMIT 10;

Reading the Output¶

Index Scan using tasks_agent_status_idx on tasks  (cost=0.28..8.30 rows=1 width=256) (actual time=0.024..0.025 rows=1 loops=1)
  Index Cond: ((agent_id = 'urszula') AND (status = 'running'))
  Filter: ...
Planning Time: 0.150 ms
Execution Time: 0.045 ms

CHECK: Agent has added an index. IF: EXPLAIN ANALYZE shows Seq Scan instead of Index Scan. THEN: The index is not being used. Common reasons: - Table too small (planner prefers seq scan for <1000 rows) - Wrong index type for the operator - Expression mismatch (query uses lower(name) but index is on name) - Statistics outdated (run ANALYZE tablename)

Key Terms¶

Term	Meaning
`Seq Scan`	Full table scan — no index used
`Index Scan`	Index lookup + table fetch
`Index Only Scan`	Index satisfies query completely (best)
`Bitmap Index Scan`	Index builds a bitmap, then fetches matching rows
`actual time`	Real execution time in milliseconds
`rows`	Number of rows returned at that step

Anti-Patterns¶

ANTI_PATTERN: Adding indexes on every column "just in case." FIX: Each index slows INSERT/UPDATE/DELETE by ~10-15%. Only index columns that appear in WHERE, JOIN ON, or ORDER BY clauses of actual queries.

ANTI_PATTERN: Not using CONCURRENTLY for index creation on production tables.

-- BLOCKS ALL WRITES during build
CREATE INDEX tasks_new_idx ON tasks (new_column);

FIX:

-- Non-blocking
CREATE INDEX CONCURRENTLY tasks_new_idx ON tasks (new_column);

ANTI_PATTERN: Using LIKE '%term%' with a B-tree index. FIX: B-tree only supports left-anchored LIKE ('term%'). For contains-search, use pg_trgm with GIN index.

ANTI_PATTERN: Indexing low-cardinality boolean columns alone. FIX: A boolean with 50/50 distribution gives no selectivity. Combine with other columns in a composite index, or use a partial index (WHERE is_active = true).

GE-Specific Conventions¶

Explicit index names — {table}_{column(s)}_idx pattern.
CONCURRENTLY for production index changes.
Partial indexes preferred when query patterns filter consistently.
GIN for JSONB — always, when querying JSONB content.
EXPLAIN ANALYZE required before and after adding indexes.
pg_stat_user_indexes — check periodically for unused indexes.

-- Find unused indexes
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

Cross-References¶

READ_ALSO: wiki/docs/stack/postgresql/index.md READ_ALSO: wiki/docs/stack/postgresql/performance.md READ_ALSO: wiki/docs/stack/postgresql/pitfalls.md READ_ALSO: wiki/docs/stack/drizzle/schema-design.md READ_ALSO: wiki/docs/stack/drizzle/migrations.md