DOMAIN:DATABASE — SCHEMA DESIGN¶

OWNER: boris, yoanna
UPDATED: 2026-03-24
SCOPE: all PostgreSQL schema design for client projects and GE internal

SCHEMA:NAMING_CONVENTIONS¶

TABLES¶

FORMAT: snake_case, plural, lowercase
GOOD: users, work_packages, client_projects, invoice_line_items
BAD: User, WorkPackage, tbl_users, user_table
JOIN_TABLES: both table names in alphabetical order — project_users not user_projects
NOTE: Drizzle schema name maps to table name — keep them identical

COLUMNS¶

FORMAT: snake_case, lowercase
GOOD: created_at, user_id, is_active, email_address
BAD: createdAt, userId, isActive, EmailAddress
BOOLEAN_PREFIX: is_, has_, can_, should_ — is_active not active
FOREIGN_KEYS: {referenced_table_singular}_id — user_id, project_id, team_id
TIMESTAMPS: always _at suffix — created_at, updated_at, deleted_at, verified_at
NOTE: Drizzle column names in TypeScript use camelCase — the ORM maps to snake_case automatically

INDEXES¶

FORMAT: idx_{table}_{column(s)} — idx_users_email, idx_orders_user_id_created_at
UNIQUE_INDEXES: uniq_{table}_{column(s)} — uniq_users_email
PARTIAL_INDEXES: idx_{table}_{column(s)}_where_{condition} — idx_orders_status_where_pending

CONSTRAINTS¶

PRIMARY_KEY: {table}_pkey
FOREIGN_KEY: fk_{table}_{referenced_table}
CHECK: chk_{table}_{description}
UNIQUE: uniq_{table}_{column(s)}

ENUMS¶

FORMAT: snake_case, singular — order_status, user_role
VALUES: lowercase — 'pending', 'active', 'cancelled'
NOTE: prefer PostgreSQL native ENUMs for small, stable value sets
NOTE: for frequently changing value sets, use a lookup table instead
ALERT: adding an ENUM value is safe — REMOVING or RENAMING requires a migration plan

SCHEMAS (NAMESPACES)¶

DEFAULT: public for application tables
CONVENTION: auth for authentication tables, audit for audit trail, analytics for reporting
NOTE: multi-tenant schemas follow different rules — see multi-tenant.md

SCHEMA:PRIMARY_KEYS¶

STRATEGY: UUID V7 (PREFERRED)¶

-- PostgreSQL 16+ with pgcrypto or gen_random_uuid()
id UUID PRIMARY KEY DEFAULT gen_random_uuid()

WHY_UUID: no sequential guessing, safe to expose in URLs, merge-friendly across environments
WHY_V7: time-ordered — maintains B-tree index locality (unlike v4 which fragments)
DRIZZLE:

import { uuid, pgTable } from 'drizzle-orm/pg-core';

export const users = pgTable('users', {
  id: uuid('id').primaryKey().defaultRandom(),
  // ...
});

NOTE: UUID v4 is acceptable if v7 generation is not available — fragmentation cost is real but manageable under 100M rows
NOTE: for tables expecting >100M rows, discuss with Boris/Yoanna — ULID or v7 strongly preferred
ALERT: NEVER use UUID as text — always use native UUID type (16 bytes vs 36 bytes)

WHEN SERIAL IS ACCEPTABLE¶

USE_SERIAL_FOR: internal system tables, junction tables with no external exposure, sequence-dependent ordering
USE_SERIAL_FOR: high-volume append-only tables where insert performance is critical (>10k inserts/sec)
NEVER_SERIAL_FOR: any ID exposed in URLs, APIs, or to end users — sequential IDs leak information

import { serial, pgTable } from 'drizzle-orm/pg-core';

export const internalLogs = pgTable('internal_logs', {
  id: serial('id').primaryKey(),
  // ...
});

ULID ALTERNATIVE¶

USE_WHEN: you need sortable IDs AND string representation AND cross-system compatibility
STORAGE: text column with CHECK constraint on format, OR store as UUID by converting
NOTE: ULID is 26 characters as text vs UUID's 16 bytes — storage cost matters at scale
NOTE: if choosing ULID, standardize across the entire project — do not mix ULID and UUID

SCHEMA:DATA_TYPES¶

TIMESTAMPS¶

ALWAYS: timestamptz (timestamp with time zone) — NEVER timestamp without timezone
WHY: timestamp silently drops timezone info — causes bugs when servers change timezone
STORE: always in UTC — application layer converts for display
DRIZZLE:

import { timestamp } from 'drizzle-orm/pg-core';

createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),

NOTE: every table gets created_at and updated_at — no exceptions
NOTE: updated_at must be maintained by application OR trigger — Drizzle does not auto-update

TEXT VS VARCHAR¶

PREFER: text for most string columns — PostgreSQL treats text and varchar identically in performance
USE_VARCHAR_WHEN: you need a hard length constraint for business rules — varchar(255) for email
NEVER: char(n) — pads with spaces, wastes storage, causes comparison bugs
DRIZZLE:

email: varchar('email', { length: 255 }).notNull(),
bio: text('bio'),

NUMERIC TYPES¶

MONEY: numeric(precision, scale) — NEVER float, double precision, or money type
EXAMPLE: numeric(12, 2) for amounts up to 9,999,999,999.99
INTEGER: integer for most counts, bigint only if exceeding 2.1 billion
BOOLEAN: native boolean — NEVER integer 0/1 or text 'true'/'false'
DRIZZLE:

amount: numeric('amount', { precision: 12, scale: 2 }).notNull(),
quantity: integer('quantity').notNull().default(0),
isActive: boolean('is_active').notNull().default(true),

JSONB¶

USE_FOR: flexible metadata, configuration blobs, third-party API response caching
USE_FOR: attributes that vary per record and do not need relational queries
NEVER_FOR: data you query by regularly — normalize into columns instead
NEVER_FOR: replacing proper relational modeling — JSONB is not a document database
INDEX: GIN index on JSONB columns you query — CREATE INDEX idx_meta ON table USING GIN (metadata)
DRIZZLE:

metadata: jsonb('metadata').$type<{ source: string; tags: string[] }>(),

ALERT: JSONB has no schema enforcement — application must validate before insert
NOTE: JSONB queries are slower than column queries — if you add a GIN index, test query plans

ARRAYS¶

USE_SPARINGLY: PostgreSQL arrays are useful for tags, small fixed lists
PREFER_JUNCTION_TABLE_WHEN: array elements need their own attributes, or array grows unbounded
NEVER: nested arrays — if you need nested, use JSONB or normalize
INDEX: GIN index for array containment queries — @>, <@, && operators

CITEXT¶

USE_FOR: case-insensitive text comparison — email addresses, usernames
REQUIRES: CREATE EXTENSION IF NOT EXISTS citext
ALTERNATIVE: lower() functional index — CREATE INDEX idx_users_email_lower ON users (lower(email))

SCHEMA:CONSTRAINTS¶

NOT NULL BY DEFAULT¶

RULE: every column is NOT NULL unless there is a specific business reason for NULL
WHY: NULL introduces three-valued logic — bugs hide in NULL comparisons
NULLABLE_OK: deleted_at (soft delete marker), optional fields with no sensible default
NOTE: adding NOT NULL to existing column with NULL values requires backfill — see migrations.md

FOREIGN KEYS¶

ALWAYS: define foreign key constraints — no orphaned references
ON_DELETE: choose deliberately:
- CASCADE — child rows deleted when parent deleted (use for strong ownership — order_items when order deleted)
- SET NULL — child FK set to NULL when parent deleted (use for optional associations)
- RESTRICT — prevent parent deletion if children exist (DEFAULT, safest)
- NO ACTION — same as RESTRICT but checked at end of transaction
ALERT: CASCADE on high-traffic tables can cause long lock chains — prefer application-level cascade for large tables
NOTE: EVERY foreign key column needs an index — PostgreSQL does NOT auto-create FK indexes

CHECK CONSTRAINTS¶

USE_FOR: business rule enforcement at database level
EXAMPLES:

CONSTRAINT chk_orders_amount_positive CHECK (amount > 0),
CONSTRAINT chk_users_email_format CHECK (email ~* '^[^@]+@[^@]+\.[^@]+$'),
CONSTRAINT chk_status_valid CHECK (status IN ('draft', 'active', 'archived')),

NOTE: check constraints are cheaper than triggers — prefer them for simple validations

UNIQUE CONSTRAINTS¶

USE: for natural keys — email, username, external reference IDs
COMPOSITE_UNIQUE: when uniqueness depends on multiple columns — UNIQUE (tenant_id, email)
NOTE: unique constraint automatically creates a unique index
NOTE: NULL values are considered distinct — two rows can have NULL in a unique column

SCHEMA:INDEXES¶

B-TREE (DEFAULT)¶

USE_FOR: equality (=), range (<, >, BETWEEN), sorting (ORDER BY), prefix LIKE 'foo%'
DEFAULT: every index is B-tree unless specified otherwise
CREATED_AUTOMATICALLY_FOR: primary keys, unique constraints
MUST_CREATE_MANUALLY_FOR: foreign key columns (CRITICAL — PostgreSQL does NOT auto-create)

GIN (GENERALIZED INVERTED INDEX)¶

USE_FOR: JSONB queries (@>, ?, ?|), array containment (@>, &&), full-text search (tsvector)
USE_FOR: trigram similarity searches (with pg_trgm extension)
COST: slower to update than B-tree — do not use on high-write columns unless reads demand it

CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
CREATE INDEX idx_products_tags ON products USING GIN (tags);
CREATE INDEX idx_products_search ON products USING GIN (search_vector);

GIST (GENERALIZED SEARCH TREE)¶

USE_FOR: geometric data, range types, full-text search (alternative to GIN), PostGIS spatial queries
USE_FOR: exclusion constraints (e.g., no overlapping time ranges)
NOTE: GiST is more versatile than GIN but slower for simple containment — use GIN for JSONB/arrays

PARTIAL INDEXES¶

USE_FOR: indexing a subset of rows — dramatically reduces index size and maintenance cost
EXAMPLE:

-- Only index active users — if 90% of users are inactive, this index is 10x smaller
CREATE INDEX idx_users_email_active ON users (email) WHERE is_active = true;

-- Only index pending orders
CREATE INDEX idx_orders_created_pending ON orders (created_at) WHERE status = 'pending';

WHEN: a query always filters by a constant condition AND most rows do NOT match that condition
NOTE: the WHERE clause in the query MUST match the partial index condition for PostgreSQL to use it

COMPOSITE INDEXES¶

RULE: put high-selectivity columns first (column that filters out the most rows)
RULE: put equality columns before range columns — (status, created_at) not (created_at, status)
RULE: index covers a query if all WHERE, JOIN, ORDER BY, SELECT columns are in the index
EXAMPLE:

-- Covers: WHERE user_id = X AND created_at > Y ORDER BY created_at
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);

NOTE: composite index on (A, B) can serve queries on (A) but NOT queries on (B) alone
NOTE: for covering indexes, include non-filtered columns with INCLUDE clause (PG 11+):

CREATE INDEX idx_orders_user_created ON orders (user_id, created_at) INCLUDE (status, amount);

INDEX MAINTENANCE¶

CHECK: unused indexes waste write performance and storage
QUERY_TO_FIND_UNUSED:

SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;

RULE: drop unused indexes quarterly — but verify across all query paths first
RULE: REINDEX after bulk data loads — index bloat degrades performance

SCHEMA:SOFT_DELETE¶

PATTERN: deleted_at timestamptz column — NULL means active, non-NULL means deleted
WHY: audit trail, undo capability, foreign key integrity
INDEX: partial index on frequently queried tables: WHERE deleted_at IS NULL
APPLICATION: every query must include WHERE deleted_at IS NULL — Drizzle filter helper recommended
ALTERNATIVE: for compliance (right to erasure), hard delete + audit log is sometimes required
ALERT: soft delete accumulates dead rows — archive strategy needed for tables growing >10M rows

SCHEMA:AUDIT_COLUMNS¶

EVERY table MUST have:

created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()

OPTIONAL but recommended for sensitive data:

created_by UUID REFERENCES users(id),
updated_by UUID REFERENCES users(id)

FOR_COMPLIANCE: add version integer NOT NULL DEFAULT 1 for optimistic locking on sensitive records

SCHEMA:DRIZZLE_CONVENTIONS¶

FILE STRUCTURE¶

src/db/
  schema/
    users.ts          -- user-related tables
    projects.ts       -- project-related tables
    orders.ts         -- order-related tables
    index.ts          -- re-exports all schemas
  migrations/         -- generated by drizzle-kit
  index.ts            -- db connection + drizzle instance
  seed.ts             -- development seed data

SCHEMA TEMPLATE¶

import { pgTable, uuid, varchar, text, timestamp, boolean, integer, jsonb } from 'drizzle-orm/pg-core';

export const users = pgTable('users', {
  id: uuid('id').primaryKey().defaultRandom(),
  email: varchar('email', { length: 255 }).notNull().unique(),
  displayName: varchar('display_name', { length: 100 }).notNull(),
  isActive: boolean('is_active').notNull().default(true),
  metadata: jsonb('metadata').$type<Record<string, unknown>>(),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
  updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
});

NOTE: Drizzle TypeScript property names are camelCase — column name strings are snake_case
NOTE: always add .$type<T>() to JSONB columns for type safety
NOTE: export all tables from schema/index.ts for Drizzle Kit to discover them