Multi-Tenancy¶

OWNER: aimee ALSO_USED_BY: anna (implementation), hugo (auth/middleware), faye, sytske

Multi-tenancy is the architectural foundation of every B2B SaaS product. Tenant A must never see Tenant B's data. Zero tolerance. No exceptions.

This page covers isolation strategies, middleware patterns, and scaling considerations for the GE stack (PostgreSQL + Drizzle + Hono + Next.js).

Isolation Strategy Decision¶

SCOPE_ITEM: isolation_strategy_selection

Strategy 1: Shared Database + Row-Level Security (Default)¶

INCLUDES: Single PostgreSQL database, single schema INCLUDES: Every tenant-scoped table has a tenant_id column (NOT NULL, indexed) INCLUDES: PostgreSQL RLS policies enforce tenant isolation at DB level INCLUDES: Application middleware sets app.current_tenant on every connection INCLUDES: Drizzle queries automatically scoped by tenant context

When to use: - Most B2B SaaS projects (80%+ of GE clients) - Up to 10,000+ tenants - Cost-effective (~$0.10/tenant/month infrastructure) - Simplest to maintain, migrate, and back up

Limitations: - Noisy neighbor risk (one tenant's heavy query affects others) - Single point of failure at database level - Cannot offer per-tenant database credentials to customers

CHECK: This is the GE default. Only deviate if client has explicit regulatory or contractual requirement.

Strategy 2: Schema-Per-Tenant¶

INCLUDES: Single PostgreSQL database, separate schema per tenant INCLUDES: Schema created on tenant provisioning (e.g., tenant_abc123) INCLUDES: Connection middleware sets search_path to tenant schema INCLUDES: Migrations applied to all schemas (automated via migration runner) INCLUDES: Shared tables (plans, global config) in public schema

When to use: - Client's customers demand logical separation (regulated industries) - Need per-tenant database-level permissions - 50-500 tenants (manageable schema count)

Limitations: - Migration complexity (must run against N schemas) - Schema count limit in PostgreSQL (~10,000 practical max) - Connection pooling complexity (search_path per connection) - Backup/restore is per-database, not per-schema

CHECK: Budget 40-60 extra hours for schema management tooling

Strategy 3: Database-Per-Tenant¶

INCLUDES: Dedicated PostgreSQL instance (or logical database) per tenant INCLUDES: Tenant routing at connection level (connection pool per tenant) INCLUDES: Fully independent data lifecycle (backup, restore, migrate per tenant) INCLUDES: Per-tenant performance isolation (no noisy neighbors) INCLUDES: Per-tenant connection credentials (can give to customer)

When to use: - Enterprise clients with contractual data isolation requirements - Regulated industries (healthcare, finance, government) - Clients demanding ability to audit their own database - Maximum 500 tenants (operational cost scales linearly)

Limitations: - High operational cost ($50-200/tenant/month minimum) - Connection management complexity - Cross-tenant reporting requires aggregation layer - Migration rollout across N databases

CHECK: Budget 80-120 extra hours for multi-database management layer CHECK: Only for enterprise tier — never for self-serve signups

Hybrid Strategy (Recommended for Scale)¶

IF: Client expects both self-serve SMB and enterprise customers THEN: Use shared DB + RLS for standard tiers, offer DB-per-tenant for enterprise tier

INCLUDES: Default: shared database with RLS (all standard tenants) INCLUDES: Enterprise upgrade: migrate tenant to dedicated database INCLUDES: Tenant routing layer determines which database to connect to INCLUDES: Migration tooling to move tenant between shared and dedicated CHECK: Hybrid is the most complex to build — budget 120+ extra hours CHECK: Only if client explicitly needs both SMB scale and enterprise isolation

Implementation: Shared DB + RLS (GE Default)¶

Database Schema Pattern¶

SCOPE_ITEM: tenant_schema_design INCLUDES: tenants table — id, name, slug, domain, plan_id, settings (JSONB), created_at INCLUDES: tenant_id column on ALL tenant-scoped tables (UUID, NOT NULL, FK to tenants) INCLUDES: Composite index on (tenant_id, id) for all tenant-scoped tables INCLUDES: tenant_id included in all unique constraints (e.g., unique email per tenant) COMPLIANCE: Never use a global unique email constraint — same person can be in multiple tenants

Drizzle Schema Example¶

SCOPE_ITEM: drizzle_tenant_columns
INCLUDES: Every table definition includes tenantId: uuid('tenant_id').notNull().references(() => tenants.id)
INCLUDES: Index on tenantId for every tenant-scoped table
INCLUDES: Composite unique constraints include tenantId (e.g., unique [tenantId, email])

PostgreSQL RLS Policies¶

SCOPE_ITEM: rls_policies INCLUDES: Enable RLS on every tenant-scoped table: ALTER TABLE x ENABLE ROW LEVEL SECURITY INCLUDES: Policy: CREATE POLICY tenant_isolation ON x USING (tenant_id = current_setting('app.current_tenant')::uuid) INCLUDES: Force RLS for table owner: ALTER TABLE x FORCE ROW LEVEL SECURITY INCLUDES: Migration role (used for schema changes) bypasses RLS via BYPASSRLS INCLUDES: Application role (used for queries) has RLS enforced COMPLIANCE: RLS is a defense-in-depth layer — application code must ALSO filter by tenant_id CHECK: Test RLS by attempting cross-tenant queries — must return zero rows CHECK: RLS policies must cover SELECT, INSERT, UPDATE, DELETE separately

Setting Tenant Context¶

SCOPE_ITEM: tenant_context_middleware INCLUDES: Hono middleware extracts tenant_id from JWT claims INCLUDES: Middleware calls SET LOCAL app.current_tenant = '{tenant_id}' on each request INCLUDES: SET LOCAL scopes the setting to the current transaction INCLUDES: If no tenant_id in token, request is rejected (401) INCLUDES: Background jobs receive tenant_id as job parameter, set context before DB access CHECK: Never use SET (without LOCAL) — it persists across connection reuse in pooling CHECK: Test with connection pooling (PgBouncer) — SET LOCAL works with transaction mode

Tenant-Aware Middleware Stack¶

SCOPE_ITEM: middleware_stack

Request Flow¶

1. TLS termination (load balancer)
2. Rate limiting (per-IP, then per-tenant after auth)
3. Authentication (JWT validation)
4. Tenant resolution (from JWT claim or subdomain)
5. Authorization (RBAC check for tenant + user + action)
6. Tenant context injection (SET LOCAL on DB connection)
7. Request handler (all queries automatically tenant-scoped)
8. Audit logging (tenant_id included in every log entry)

Subdomain-Based Tenant Resolution¶

SCOPE_ITEM: subdomain_tenancy INCLUDES: Tenant accessed via {slug}.app.com INCLUDES: Middleware extracts subdomain from Host header INCLUDES: Looks up tenant by slug in database (cached in Redis, 5-min TTL) INCLUDES: Validates that authenticated user belongs to this tenant OPTIONAL: Custom domain support (tenant uses their own domain via CNAME) IF: Custom domains required THEN: TLS certificate provisioning needed (Let's Encrypt ACME, automated) CHECK: Subdomain approach requires wildcard DNS and wildcard TLS certificate

Header-Based Tenant Resolution (API)¶

SCOPE_ITEM: header_tenancy INCLUDES: API key includes tenant scope (key is bound to one tenant) INCLUDES: Or: X-Tenant-ID header with API key authentication INCLUDES: Middleware validates API key belongs to the specified tenant CHECK: Never trust client-provided tenant_id without auth validation

Tenant Configuration¶

SCOPE_ITEM: tenant_configuration INCLUDES: Per-tenant settings stored in tenants.settings JSONB column INCLUDES: Settings schema validated with Zod INCLUDES: Settings cached in Redis (invalidated on update) INCLUDES: Feature flags per tenant (stored in tenant_feature_flags table)

Common Tenant Settings¶

INCLUDES: branding — logo_url, primary_color, favicon_url, app_name
INCLUDES: auth — sso_enabled, sso_provider, mfa_enforced, session_duration_minutes
INCLUDES: notifications — email_sender_name, email_sender_address, custom_email_domain
INCLUDES: limits — max_users, max_storage_gb, max_api_calls_per_month
INCLUDES: locale — default_timezone, default_language, date_format, currency

Data Migration Between Tenants¶

SCOPE_ITEM: tenant_data_migration INCLUDES: Export tenant data as JSON/CSV bundle INCLUDES: Import tenant data into new tenant (new IDs generated, references re-mapped) INCLUDES: Tenant merge (two tenants become one — rare but requested by enterprise) INCLUDES: Tenant split (one tenant becomes two — e.g., company division) CHECK: Data migration is complex — budget 40-80 hours if required CHECK: ID re-mapping must handle all foreign keys and cross-references

Performance at Scale¶

Noisy Neighbor Prevention¶

SCOPE_ITEM: noisy_neighbor_prevention INCLUDES: Per-tenant rate limiting (requests/minute, configurable per plan) INCLUDES: Per-tenant query timeout (statement_timeout per connection) INCLUDES: Per-tenant storage quota enforcement INCLUDES: Per-tenant background job queue priority INCLUDES: Monitoring dashboards showing per-tenant resource usage OPTIONAL: Per-tenant connection pool limits (for database-per-tenant) CHECK: Rate limits must be plan-aware (enterprise gets higher limits)

Caching Strategy¶

SCOPE_ITEM: tenant_aware_caching INCLUDES: Redis key format: tenant:{tenant_id}:{entity}:{id} INCLUDES: Cache invalidation scoped to tenant (flush one tenant, not all) INCLUDES: Tenant settings cached with 5-minute TTL INCLUDES: Query result caching with tenant-scoped keys CHECK: Never cache without tenant_id in key — cross-tenant cache poisoning is a critical vulnerability

Database Indexing¶

SCOPE_ITEM: tenant_database_indexing INCLUDES: Composite index (tenant_id, created_at DESC) for list queries INCLUDES: Composite index (tenant_id, id) for lookups INCLUDES: Partial indexes for common tenant-scoped filters INCLUDES: Table partitioning by tenant_id for tables > 100M rows CHECK: EXPLAIN ANALYZE all common queries with 10k+ tenants in test data CHECK: Partition strategy decision at 50M rows, not after 500M

Connection Pooling¶

SCOPE_ITEM: connection_pooling INCLUDES: PgBouncer in transaction mode (required for RLS with SET LOCAL) INCLUDES: Pool size = 20-50 connections per application instance INCLUDES: Overflow handling (queue with timeout, not reject) INCLUDES: Connection health checking (periodic validation query) CHECK: Session mode pooling breaks SET LOCAL — must use transaction mode CHECK: Prepared statements not supported in transaction mode — use simple protocol

Background Jobs and Tenant Context¶

SCOPE_ITEM: tenant_aware_background_jobs INCLUDES: Every job payload includes tenant_id INCLUDES: Job processor sets tenant context before execution INCLUDES: Failed jobs include tenant context in error logs INCLUDES: Job queue supports tenant-scoped concurrency limits INCLUDES: Cron jobs that span all tenants iterate with tenant context per batch CHECK: Never run a background job without tenant context — silent data leak risk CHECK: Tenant deletion must cancel all pending jobs for that tenant

Tenant Lifecycle¶

SCOPE_ITEM: tenant_lifecycle INCLUDES: Provisioning — create tenant record, set defaults, provision resources INCLUDES: Active — normal operation INCLUDES: Suspended — billing issue, read-only access, 14-day grace period INCLUDES: Deactivated — admin action, no access, data retained for 90 days INCLUDES: Deleted — data erasure, irreversible, GDPR-compliant CHECK: Deletion must cascade to all tenant-scoped data, file storage, cache, queued jobs CHECK: Deletion must be audit-logged in a separate non-tenant-scoped audit table COMPLIANCE: GDPR requires deletion within 30 days of request (60 days with extension)

Testing Multi-Tenancy¶

SCOPE_ITEM: multi_tenancy_testing INCLUDES: Test: User in Tenant A cannot access Tenant B data via API INCLUDES: Test: User in Tenant A cannot access Tenant B data via direct DB query (RLS) INCLUDES: Test: Background job for Tenant A does not affect Tenant B INCLUDES: Test: Cache for Tenant A does not leak to Tenant B INCLUDES: Test: Search results for Tenant A do not include Tenant B content INCLUDES: Test: File uploads for Tenant A are not accessible to Tenant B INCLUDES: Test: Webhooks for Tenant A events are only sent to Tenant A endpoints COMPLIANCE: Cross-tenant data leak is a severity-1 security incident — zero tolerance CHECK: Run cross-tenant isolation tests in CI on every deployment