DOMAIN:DATABASE — SCHEMA DESIGN¶
OWNER: boris, yoanna
UPDATED: 2026-03-24
SCOPE: all PostgreSQL schema design for client projects and GE internal
SCHEMA:NAMING_CONVENTIONS¶
TABLES¶
FORMAT: snake_case, plural, lowercase
GOOD: users, work_packages, client_projects, invoice_line_items
BAD: User, WorkPackage, tbl_users, user_table
JOIN_TABLES: both table names in alphabetical order — project_users not user_projects
NOTE: Drizzle schema name maps to table name — keep them identical
COLUMNS¶
FORMAT: snake_case, lowercase
GOOD: created_at, user_id, is_active, email_address
BAD: createdAt, userId, isActive, EmailAddress
BOOLEAN_PREFIX: is_, has_, can_, should_ — is_active not active
FOREIGN_KEYS: {referenced_table_singular}_id — user_id, project_id, team_id
TIMESTAMPS: always _at suffix — created_at, updated_at, deleted_at, verified_at
NOTE: Drizzle column names in TypeScript use camelCase — the ORM maps to snake_case automatically
INDEXES¶
FORMAT: idx_{table}_{column(s)} — idx_users_email, idx_orders_user_id_created_at
UNIQUE_INDEXES: uniq_{table}_{column(s)} — uniq_users_email
PARTIAL_INDEXES: idx_{table}_{column(s)}_where_{condition} — idx_orders_status_where_pending
CONSTRAINTS¶
PRIMARY_KEY: {table}_pkey
FOREIGN_KEY: fk_{table}_{referenced_table}
CHECK: chk_{table}_{description}
UNIQUE: uniq_{table}_{column(s)}
ENUMS¶
FORMAT: snake_case, singular — order_status, user_role
VALUES: lowercase — 'pending', 'active', 'cancelled'
NOTE: prefer PostgreSQL native ENUMs for small, stable value sets
NOTE: for frequently changing value sets, use a lookup table instead
ALERT: adding an ENUM value is safe — REMOVING or RENAMING requires a migration plan
SCHEMAS (NAMESPACES)¶
DEFAULT: public for application tables
CONVENTION: auth for authentication tables, audit for audit trail, analytics for reporting
NOTE: multi-tenant schemas follow different rules — see multi-tenant.md
SCHEMA:PRIMARY_KEYS¶
STRATEGY: UUID V7 (PREFERRED)¶
WHY_UUID: no sequential guessing, safe to expose in URLs, merge-friendly across environments
WHY_V7: time-ordered — maintains B-tree index locality (unlike v4 which fragments)
DRIZZLE:
import { uuid, pgTable } from 'drizzle-orm/pg-core';
export const users = pgTable('users', {
id: uuid('id').primaryKey().defaultRandom(),
// ...
});
NOTE: UUID v4 is acceptable if v7 generation is not available — fragmentation cost is real but manageable under 100M rows
NOTE: for tables expecting >100M rows, discuss with Boris/Yoanna — ULID or v7 strongly preferred
ALERT: NEVER use UUID as text — always use native UUID type (16 bytes vs 36 bytes)
WHEN SERIAL IS ACCEPTABLE¶
USE_SERIAL_FOR: internal system tables, junction tables with no external exposure, sequence-dependent ordering
USE_SERIAL_FOR: high-volume append-only tables where insert performance is critical (>10k inserts/sec)
NEVER_SERIAL_FOR: any ID exposed in URLs, APIs, or to end users — sequential IDs leak information
import { serial, pgTable } from 'drizzle-orm/pg-core';
export const internalLogs = pgTable('internal_logs', {
id: serial('id').primaryKey(),
// ...
});
ULID ALTERNATIVE¶
USE_WHEN: you need sortable IDs AND string representation AND cross-system compatibility
STORAGE: text column with CHECK constraint on format, OR store as UUID by converting
NOTE: ULID is 26 characters as text vs UUID's 16 bytes — storage cost matters at scale
NOTE: if choosing ULID, standardize across the entire project — do not mix ULID and UUID
SCHEMA:DATA_TYPES¶
TIMESTAMPS¶
ALWAYS: timestamptz (timestamp with time zone) — NEVER timestamp without timezone
WHY: timestamp silently drops timezone info — causes bugs when servers change timezone
STORE: always in UTC — application layer converts for display
DRIZZLE:
import { timestamp } from 'drizzle-orm/pg-core';
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
NOTE: every table gets created_at and updated_at — no exceptions
NOTE: updated_at must be maintained by application OR trigger — Drizzle does not auto-update
TEXT VS VARCHAR¶
PREFER: text for most string columns — PostgreSQL treats text and varchar identically in performance
USE_VARCHAR_WHEN: you need a hard length constraint for business rules — varchar(255) for email
NEVER: char(n) — pads with spaces, wastes storage, causes comparison bugs
DRIZZLE:
NUMERIC TYPES¶
MONEY: numeric(precision, scale) — NEVER float, double precision, or money type
EXAMPLE: numeric(12, 2) for amounts up to 9,999,999,999.99
INTEGER: integer for most counts, bigint only if exceeding 2.1 billion
BOOLEAN: native boolean — NEVER integer 0/1 or text 'true'/'false'
DRIZZLE:
amount: numeric('amount', { precision: 12, scale: 2 }).notNull(),
quantity: integer('quantity').notNull().default(0),
isActive: boolean('is_active').notNull().default(true),
JSONB¶
USE_FOR: flexible metadata, configuration blobs, third-party API response caching
USE_FOR: attributes that vary per record and do not need relational queries
NEVER_FOR: data you query by regularly — normalize into columns instead
NEVER_FOR: replacing proper relational modeling — JSONB is not a document database
INDEX: GIN index on JSONB columns you query — CREATE INDEX idx_meta ON table USING GIN (metadata)
DRIZZLE:
ALERT: JSONB has no schema enforcement — application must validate before insert
NOTE: JSONB queries are slower than column queries — if you add a GIN index, test query plans
ARRAYS¶
USE_SPARINGLY: PostgreSQL arrays are useful for tags, small fixed lists
PREFER_JUNCTION_TABLE_WHEN: array elements need their own attributes, or array grows unbounded
NEVER: nested arrays — if you need nested, use JSONB or normalize
INDEX: GIN index for array containment queries — @>, <@, && operators
CITEXT¶
USE_FOR: case-insensitive text comparison — email addresses, usernames
REQUIRES: CREATE EXTENSION IF NOT EXISTS citext
ALTERNATIVE: lower() functional index — CREATE INDEX idx_users_email_lower ON users (lower(email))
SCHEMA:CONSTRAINTS¶
NOT NULL BY DEFAULT¶
RULE: every column is NOT NULL unless there is a specific business reason for NULL
WHY: NULL introduces three-valued logic — bugs hide in NULL comparisons
NULLABLE_OK: deleted_at (soft delete marker), optional fields with no sensible default
NOTE: adding NOT NULL to existing column with NULL values requires backfill — see migrations.md
FOREIGN KEYS¶
ALWAYS: define foreign key constraints — no orphaned references
ON_DELETE: choose deliberately:
- CASCADE — child rows deleted when parent deleted (use for strong ownership — order_items when order deleted)
- SET NULL — child FK set to NULL when parent deleted (use for optional associations)
- RESTRICT — prevent parent deletion if children exist (DEFAULT, safest)
- NO ACTION — same as RESTRICT but checked at end of transaction
ALERT: CASCADE on high-traffic tables can cause long lock chains — prefer application-level cascade for large tables
NOTE: EVERY foreign key column needs an index — PostgreSQL does NOT auto-create FK indexes
CHECK CONSTRAINTS¶
USE_FOR: business rule enforcement at database level
EXAMPLES:
CONSTRAINT chk_orders_amount_positive CHECK (amount > 0),
CONSTRAINT chk_users_email_format CHECK (email ~* '^[^@]+@[^@]+\.[^@]+$'),
CONSTRAINT chk_status_valid CHECK (status IN ('draft', 'active', 'archived')),
UNIQUE CONSTRAINTS¶
USE: for natural keys — email, username, external reference IDs
COMPOSITE_UNIQUE: when uniqueness depends on multiple columns — UNIQUE (tenant_id, email)
NOTE: unique constraint automatically creates a unique index
NOTE: NULL values are considered distinct — two rows can have NULL in a unique column
SCHEMA:INDEXES¶
B-TREE (DEFAULT)¶
USE_FOR: equality (=), range (<, >, BETWEEN), sorting (ORDER BY), prefix LIKE 'foo%'
DEFAULT: every index is B-tree unless specified otherwise
CREATED_AUTOMATICALLY_FOR: primary keys, unique constraints
MUST_CREATE_MANUALLY_FOR: foreign key columns (CRITICAL — PostgreSQL does NOT auto-create)
GIN (GENERALIZED INVERTED INDEX)¶
USE_FOR: JSONB queries (@>, ?, ?|), array containment (@>, &&), full-text search (tsvector)
USE_FOR: trigram similarity searches (with pg_trgm extension)
COST: slower to update than B-tree — do not use on high-write columns unless reads demand it
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
CREATE INDEX idx_products_tags ON products USING GIN (tags);
CREATE INDEX idx_products_search ON products USING GIN (search_vector);
GIST (GENERALIZED SEARCH TREE)¶
USE_FOR: geometric data, range types, full-text search (alternative to GIN), PostGIS spatial queries
USE_FOR: exclusion constraints (e.g., no overlapping time ranges)
NOTE: GiST is more versatile than GIN but slower for simple containment — use GIN for JSONB/arrays
PARTIAL INDEXES¶
USE_FOR: indexing a subset of rows — dramatically reduces index size and maintenance cost
EXAMPLE:
-- Only index active users — if 90% of users are inactive, this index is 10x smaller
CREATE INDEX idx_users_email_active ON users (email) WHERE is_active = true;
-- Only index pending orders
CREATE INDEX idx_orders_created_pending ON orders (created_at) WHERE status = 'pending';
NOTE: the WHERE clause in the query MUST match the partial index condition for PostgreSQL to use it
COMPOSITE INDEXES¶
RULE: put high-selectivity columns first (column that filters out the most rows)
RULE: put equality columns before range columns — (status, created_at) not (created_at, status)
RULE: index covers a query if all WHERE, JOIN, ORDER BY, SELECT columns are in the index
EXAMPLE:
-- Covers: WHERE user_id = X AND created_at > Y ORDER BY created_at
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);
NOTE: for covering indexes, include non-filtered columns with INCLUDE clause (PG 11+):
INDEX MAINTENANCE¶
CHECK: unused indexes waste write performance and storage
QUERY_TO_FIND_UNUSED:
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
RULE: REINDEX after bulk data loads — index bloat degrades performance
SCHEMA:SOFT_DELETE¶
PATTERN: deleted_at timestamptz column — NULL means active, non-NULL means deleted
WHY: audit trail, undo capability, foreign key integrity
INDEX: partial index on frequently queried tables: WHERE deleted_at IS NULL
APPLICATION: every query must include WHERE deleted_at IS NULL — Drizzle filter helper recommended
ALTERNATIVE: for compliance (right to erasure), hard delete + audit log is sometimes required
ALERT: soft delete accumulates dead rows — archive strategy needed for tables growing >10M rows
SCHEMA:AUDIT_COLUMNS¶
EVERY table MUST have:
OPTIONAL but recommended for sensitive data:
FOR_COMPLIANCE: add version integer NOT NULL DEFAULT 1 for optimistic locking on sensitive records
SCHEMA:DRIZZLE_CONVENTIONS¶
FILE STRUCTURE¶
src/db/
schema/
users.ts -- user-related tables
projects.ts -- project-related tables
orders.ts -- order-related tables
index.ts -- re-exports all schemas
migrations/ -- generated by drizzle-kit
index.ts -- db connection + drizzle instance
seed.ts -- development seed data
SCHEMA TEMPLATE¶
import { pgTable, uuid, varchar, text, timestamp, boolean, integer, jsonb } from 'drizzle-orm/pg-core';
export const users = pgTable('users', {
id: uuid('id').primaryKey().defaultRandom(),
email: varchar('email', { length: 255 }).notNull().unique(),
displayName: varchar('display_name', { length: 100 }).notNull(),
isActive: boolean('is_active').notNull().default(true),
metadata: jsonb('metadata').$type<Record<string, unknown>>(),
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
});
NOTE: Drizzle TypeScript property names are camelCase — column name strings are snake_case
NOTE: always add .$type<T>() to JSONB columns for type safety
NOTE: export all tables from schema/index.ts for Drizzle Kit to discover them