DOMAIN:SECURITY:SECURE_FAILURE_HANDLING

OWNER: koen, eric
ALSO_USED_BY: urszula, maxim, alex, tjitte, arjan, thijmen
UPDATED: 2026-03-19
SCOPE: all code reviews, all backend/frontend projects
SOURCE: Secure by Design (Johnsson/Deogun/Sawano, 2019), Ch. 9-10


CORE_PRINCIPLE

RULE: failures MUST NOT compromise security
RULE: never leak system internals through error responses
RULE: distinguish business exceptions from technical exceptions — handle separately


SECURITY:EXCEPTION_HANDLING

BUSINESS_VS_TECHNICAL_EXCEPTIONS

RULE: business exceptions (domain rule violated) — handle with domain logic
RULE: technical exceptions (DB down, network timeout) — handle with infrastructure logic
ANTI_PATTERN: mixing business and technical exceptions in same catch block
FIX: separate exception hierarchies — never cross-contaminate

| exception type | example | handling |
|---|---|---|
| business | insufficient funds, item out of stock | domain-specific response to user |
| technical | connection timeout, disk full, OOM | log internally, generic error to user |
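
EXAMPLE: a minimal sketch of the two separate hierarchies, assuming JavaScript on Node. The class names, the code field, and the 422 status are illustrative assumptions, not taken from the source.

```javascript
// Two separate hierarchies (hypothetical class names), never mixed.
class BusinessError extends Error {
  constructor(code, message) {
    super(message);
    this.code = code; // stable, user-safe identifier
  }
}
class TechnicalError extends Error {}

class InsufficientFundsError extends BusinessError {
  constructor() {
    super("INSUFFICIENT_FUNDS", "balance below requested amount");
  }
}
class DatabaseUnavailableError extends TechnicalError {}

// Maps an error to what the user may see. `log` is an assumed logger function.
function toUserResponse(err, log = console.error) {
  if (err instanceof BusinessError) {
    // Domain rule violated: answer with a domain-specific, user-safe code.
    return { status: 422, body: { error: err.code } };
  }
  // Technical or unknown failure: details go to the log only.
  log(err);
  return { status: 500, body: { error: "An error occurred" } };
}
```

Anything that is not a BusinessError falls through to the generic branch, so a new technical exception type cannot leak by accident.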

EXCEPTION_PAYLOAD

RULE: exception messages for INTERNAL logging only — never expose to end user
RULE: never include stack traces, SQL, file paths, server names in user-facing errors
ANTI_PATTERN: catch(e) { res.json({ error: e.message }) }
FIX: catch(e) { log.error(e); res.status(500).json({ error: "An error occurred" }) }
ANTI_PATTERN: exception message contains user input verbatim → XSS via error page
FIX: sanitize or omit user input from error messages

GLOBAL_EXCEPTION_HANDLER

RULE: install global exception handler as safety net — catches unhandled exceptions
RULE: global handler returns GENERIC error to client, DETAILED error to logs
RULE: global handler must NOT swallow exceptions silently — always log
CHECK: does the application have a global exception handler?
CHECK: does it return generic messages to users and detailed messages to logs?
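
EXAMPLE: a sketch of a global handler in the Express error-middleware shape (err, req, res, next). The logger, the id generator, and the errorId field are assumptions added for illustration; the source only requires generic output to the client and detailed output to the logs.

```javascript
// `log` is an assumed logger; `newId` is a hypothetical correlation-id source.
function makeGlobalErrorHandler(log, newId) {
  // Express identifies error middleware by its four parameters,
  // so `next` stays in the signature even though it is unused here.
  return function globalErrorHandler(err, req, res, next) {
    const errorId = newId();
    // Detailed record for operators: never sent to the client.
    log({ errorId, message: err.message, stack: err.stack, path: req.path });
    // Generic body for the client: no message, stack trace, SQL, or file paths.
    res.status(500).json({ error: "An error occurred", errorId });
  };
}
```

The errorId lets support staff find the detailed log entry from a user report without exposing the detail itself.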

HANDLING_WITHOUT_EXCEPTIONS

PATTERN: use Result/Either types instead of throwing for expected failures
BENEFIT: caller MUST handle both success and failure — can't forget
BENEFIT: no exception overhead, clearer control flow
WHEN: business logic with expected failure paths (validation, authorization)
WHEN: functional programming style
ANTI_PATTERN: using exceptions for flow control (e.g., try login, catch invalid password)
FIX: return Result — caller handles explicitly
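
EXAMPLE: a minimal Result sketch for the login case. The helper names (ok, fail, login, loginStatus) and the injected findUser and verifyPassword functions are hypothetical. Plain JavaScript does not force the caller to inspect the result; a typed language can enforce that at compile time.

```javascript
// Minimal Result helpers (hypothetical names, not from a library).
const ok = (value) => ({ ok: true, value });
const fail = (reason) => ({ ok: false, reason });

// A failed login is an expected outcome, so it is returned, not thrown.
// findUser and verifyPassword are assumed to be supplied by the application.
function login(findUser, verifyPassword, name, password) {
  const user = findUser(name);
  if (!user || !verifyPassword(user, password)) {
    // One reason for both cases, so the response does not reveal which names exist.
    return fail("INVALID_CREDENTIALS");
  }
  return ok({ name: user.name });
}

// The caller branches on the result instead of wrapping the call in try/catch.
function loginStatus(result) {
  return result.ok ? 200 : 401;
}
```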


SECURITY:BAD_DATA

RULE: NEVER repair bad data — reject it
RULE: NEVER echo user input verbatim in error responses
RULE: treat all external input as potentially malicious

REPAIR_IS_DANGEROUS

ANTI_PATTERN: stripping HTML tags from input before storing
FIX: reject input that doesn't match domain rules — period
ANTI_PATTERN: auto-correcting date format, trimming special chars
FIX: validate against domain primitive — accept or reject, no middle ground
REASON: repair creates implicit assumptions about what "clean" means — attackers exploit these
NOTE: second-order attacks bypass repair: stored payload triggers on later retrieval
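
EXAMPLE: a sketch contrasting repair with rejection. stripScriptTags is a deliberately naive repair function; parseDisplayName is a hypothetical domain primitive whose length limit and character set are assumptions chosen for illustration.

```javascript
// Repair attempt: remove <script> tags in one pass. Looks safe, is not.
function stripScriptTags(input) {
  return input.replace(/<\/?script>/gi, "");
}

// Reject instead: a hypothetical "display name" domain primitive.
// The value is accepted unchanged or refused; it is never modified.
function parseDisplayName(input) {
  const valid =
    typeof input === "string" &&
    input.length >= 1 &&
    input.length <= 40 && // length is checked before the pattern runs
    /^[A-Za-z0-9 _-]+$/.test(input);
  return valid
    ? { ok: true, value: Object.freeze({ displayName: input }) }
    : { ok: false, reason: "INVALID_DISPLAY_NAME" }; // input is not echoed back
}

const nested = "<scr<script>ipt>alert(1)</scr</script>ipt>";
console.log(stripScriptTags(nested)); // <script>alert(1)</script>
console.log(parseDisplayName(nested).ok); // false
```

The repair function reassembles a working payload from the nested input, which is the implicit assumption an attacker exploits. The domain primitive has no such assumption to attack.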

XSS_POLYGLOTS

CHECK: input that appears safe to ONE parser may be executable in ANOTHER
EXAMPLE: jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//... (bypasses many filters)
RULE: defense in depth — validate input AND encode output — never rely on one layer

OUTPUT_ENCODING

RULE: always encode output for the context (HTML, JS, URL, CSS)
RULE: even validated domain primitives need output encoding when rendered
NOTE: domain primitives prevent most injection but not ALL — output encoding is the second layer
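
EXAMPLE: a sketch of context-specific encoding. encodeHtml is a minimal hand-written encoder for HTML text and quoted attribute values only; searchLink and the /search?q= URL are hypothetical. A maintained templating or encoding library would normally do this work.

```javascript
// Characters that are significant in HTML text and quoted attribute values.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// HTML context: replace each significant character with its entity.
function encodeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Two contexts in one line of markup: the query is URL-encoded inside the
// href value and HTML-encoded where it is shown as text.
function searchLink(query) {
  const href = "/search?q=" + encodeURIComponent(query);
  return '<a href="' + encodeHtml(href) + '">' + encodeHtml(query) + "</a>";
}
```

The same value needs a different encoder per context; JS and CSS contexts are not covered by this sketch.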


SECURITY:AVAILABILITY_DESIGN

RULE: design for failure — assume every dependency will fail
RULE: contain failures — prevent cascade across system

CIRCUIT_BREAKERS

PATTERN: wrap external calls in circuit breaker
STATES: closed (normal) → open (failing, fast-fail) → half-open (testing recovery)

IF failure count > threshold
  → circuit OPENS → all calls fail-fast (no timeout wait)
  → after cooldown → circuit HALF-OPEN → allow one test call
  → IF test succeeds → circuit CLOSES (normal)
  → IF test fails → circuit returns to OPEN, cooldown restarts

BENEFIT: prevents cascade failure when dependency is down
BENEFIT: preserves thread pool — no threads waiting on dead service
CHECK: are external service calls wrapped in circuit breakers?
CHECK: is there a fallback response when circuit is open?
ANTI_PATTERN: infinite timeout on external calls → thread starvation → cascade failure
FIX: set explicit timeouts + circuit breaker
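
EXAMPLE: a sketch of the three-state machine. It is synchronous to keep it short; real external calls would be asynchronous and carry their own timeout. The class name, the maxFailures and cooldownMs parameters, and the injectable clock are assumptions, not a specific library's API.

```javascript
class CircuitOpenError extends Error {}

class CircuitBreaker {
  // maxFailures, cooldownMs and `now` are illustrative parameters.
  constructor({ maxFailures = 3, cooldownMs = 10000, now = Date.now } = {}) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.state = "CLOSED";
    this.failures = 0;
    this.openedAt = 0;
  }

  call(fn) {
    if (this.state === "OPEN") {
      if (this.now() - this.openedAt < this.cooldownMs) {
        throw new CircuitOpenError("circuit open"); // fail fast, dependency not called
      }
      this.state = "HALF_OPEN"; // cooldown over: allow one test call
    }
    try {
      const result = fn();
      this.state = "CLOSED"; // success closes the circuit and resets the count
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed test call, or too many failures, opens the circuit.
      if (this.state === "HALF_OPEN" || this.failures > this.maxFailures) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

A caller would catch CircuitOpenError and return the fallback response instead of waiting on the dead dependency.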

BULKHEADS

PATTERN: isolate resources per dependency — failure in one doesn't exhaust all
EXAMPLE: separate thread pool per external service, separate DB connection pool per tenant
BENEFIT: if service A is slow, it exhausts only its own pool — service B unaffected
CHECK: are resource pools (threads, connections) shared across independent concerns?
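
EXAMPLE: a sketch of a bulkhead as a per-dependency concurrency limit. The Bulkhead class, the dependency names, and the limits are hypothetical; a thread-pool or connection-pool setting would play the same role in other runtimes.

```javascript
// One Bulkhead per dependency: a slow dependency can only use up its own slots.
class Bulkhead {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.inFlight = 0;
  }

  // Runs an async task if a slot is free; otherwise rejects immediately.
  async run(task) {
    if (this.inFlight >= this.maxConcurrent) {
      throw new Error("bulkhead full");
    }
    this.inFlight += 1;
    try {
      return await task();
    } finally {
      this.inFlight -= 1; // slot is returned on success and on failure
    }
  }
}

// Hypothetical dependencies, each with its own limit.
const bulkheads = { payments: new Bulkhead(2), search: new Bulkhead(10) };
```

If every payments slot is stuck on a slow call, further payments calls are refused at once while search keeps its full capacity.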

WORK_QUEUES

PATTERN: decouple request acceptance from processing via queue
BENEFIT: system processes work at its own pace, not the caller's pace
BENEFIT: queue absorbs burst traffic, protects downstream
CHECK: are high-throughput endpoints backed by queues?
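
EXAMPLE: a sketch of the decoupling with a bounded in-memory queue. The WorkQueue class and its capacity are assumptions for illustration; a production system would more likely put a message broker here. The bound matters: an unbounded queue only moves the exhaustion risk from threads to memory.

```javascript
class WorkQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }

  // Called by the request handler: accept quickly or refuse, never block.
  offer(job) {
    if (this.items.length >= this.capacity) {
      return false; // caller can answer with e.g. 429 or 503
    }
    this.items.push(job);
    return true; // caller can answer with e.g. 202 Accepted
  }

  // Called by the worker at its own pace; returns how many jobs were handled.
  drain(handler, maxJobs = 1) {
    let done = 0;
    while (done < maxJobs && this.items.length > 0) {
      handler(this.items.shift());
      done += 1;
    }
    return done;
  }
}
```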

DOS_TESTING

RULE: test availability as part of CI/CD
CHECK: has headroom been estimated? (max expected load × safety factor)
CHECK: what happens at 2x, 5x, 10x normal load?
CHECK: do domain rules have performance implications? (complex validation = DoS vector)
ANTI_PATTERN: regex on unbounded input → ReDoS
FIX: length limit BEFORE regex (see VALIDATION_ORDER in secure-design-patterns.md)
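
EXAMPLE: a sketch of the length-before-regex order. The 254-character limit and the email pattern are illustrative assumptions; the pattern is not claimed to be immune to backtracking, which is why the bound comes first.

```javascript
const MAX_EMAIL_LENGTH = 254; // illustrative limit

// Deliberately simple pattern; it only ever runs on bounded input.
const EMAIL_PATTERN = /^[^@\s]+@[^@\s]+\.[^@\s]+$/;

function isValidEmail(input) {
  if (typeof input !== "string") return false;
  if (input.length > MAX_EMAIL_LENGTH) return false; // cheap check first
  return EMAIL_PATTERN.test(input); // worst-case cost is now bounded by the limit
}
```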


SECURITY:CLOUD_DESIGN

SOURCE: Secure by Design Ch. 10 — Twelve-Factor App + Three R's

TWELVE_FACTOR_SECURITY_BENEFITS

| factor | security benefit |
|---|---|
| codebase (one repo) | single audit surface |
| dependencies (explicit) | no hidden dependencies; transitive risk is visible |
| config (in environment) | no secrets in code |
| backing services (attached) | swap compromised service without code change |
| build/release/run (strict) | immutable releases, auditable |
| processes (stateless) | no session hijacking via server state |
| port binding (self-contained) | reduced attack surface |
| concurrency (scale out) | availability via redundancy |
| disposability (fast start/stop) | rapidly replace compromised instances |
| dev/prod parity | security tests run in prod-like env |
| logs (event streams) | centralized, tamper-evident logging |
| admin processes (one-off) | auditable admin actions |

THREE_RS_OF_ENTERPRISE_SECURITY

STANDARD: Rotate, Repave, Repair

ROTATE_SECRETS

RULE: all secrets have expiry — rotate before expiry
RULE: treat credentials as ephemeral, not permanent
CHECK: when was each secret last rotated?
CHECK: are secrets stored in environment/vault, NEVER in code or config files?
ANTI_PATTERN: hardcoded API keys, database passwords in source
FIX: vault-managed secrets with automatic rotation
ANTI_PATTERN: long-lived API keys (>90 days)
FIX: short-lived tokens, automatic rotation schedule
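
EXAMPLE: a sketch of the "when was each secret last rotated?" check. The inventory shape ({ name, rotatedAt }) and the function name are hypothetical; the 90-day default mirrors the threshold above. Only names and dates are handled, never secret values.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// `secrets` is a hypothetical inventory: [{ name, rotatedAt: Date }].
// Returns the names of secrets whose last rotation is older than maxAgeDays.
function findStaleSecrets(secrets, maxAgeDays = 90, now = new Date()) {
  return secrets
    .filter((s) => now.getTime() - s.rotatedAt.getTime() > maxAgeDays * DAY_MS)
    .map((s) => s.name);
}
```

A check like this could run in CI or on a schedule and fail when the returned list is not empty.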

REPAVE_SERVERS

RULE: regularly destroy and rebuild servers from immutable image
RULE: assume any long-running server is compromised
BENEFIT: advanced persistent threats (APTs) lose persistence: any implant is destroyed on repave
BENEFIT: drift eliminated — known-good state guaranteed
CHECK: how long since last repave? (target: hours/days, not weeks/months)
NOTE: containers make repaving trivial, e.g. kubectl rollout restart deployment/<name> recreates every pod from its image (pods only; nodes need their own rebuild)

REPAIR_VULNERABILITIES

RULE: patch known CVEs immediately — don't wait for maintenance window
RULE: automate vulnerability scanning in CI/CD pipeline
CHECK: is there a process for emergency patching?
CHECK: are CVE notifications monitored and triaged?

CONFIGURATION_SECURITY

RULE: NEVER store environment-specific config in code; use environment variables or an external config service
RULE: NEVER store secrets in resource files (even if encrypted locally)
RULE: encrypt sensitive config values at rest
CHECK: are there secrets in git history? (even if removed from HEAD)
ANTI_PATTERN: config values that change behavior without audit trail
FIX: config changes go through version control or auditable config service
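
EXAMPLE: a sketch of loading config from the environment and failing fast when a value is missing. The variable names DATABASE_URL and API_TOKEN are hypothetical. The error names the missing variable and never prints a value.

```javascript
// Reads required settings from an environment-like object
// (process.env in a running service; a plain object in tests).
function loadConfig(env) {
  const required = ["DATABASE_URL", "API_TOKEN"]; // hypothetical variable names
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    // Name the variables only; values must not appear in errors or logs.
    throw new Error("missing required config: " + missing.join(", "));
  }
  return Object.freeze({ databaseUrl: env.DATABASE_URL, apiToken: env.API_TOKEN });
}
```

At startup the service would call loadConfig(process.env) once, so a misconfigured deployment stops immediately instead of failing later under load.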

LOGGING_SECURITY

RULE: log to event stream (stdout), not to file on disk
RULE: centralize logs — aggregated, searchable, tamper-evident
ANTI_PATTERN: logging to local file → availability risk (disk full), confidentiality risk (who can read?), integrity risk (can be modified/deleted)
FIX: log as event stream → centralized service → immutable storage

| CIA-T concern | file logging risk | stream logging benefit |
|---|---|---|
| confidentiality | anyone with server access reads logs | centralized access control |
| integrity | logs can be modified/deleted | append-only, tamper-evident |
| availability | disk full = log loss | external storage, sized independently of the app server |
| traceability | scattered across servers | unified, searchable |
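
EXAMPLE: a sketch of logging as an event stream on Node. The function names and the JSON field names (ts, level, message) are assumptions. The application writes one line per event to stdout and leaves shipping, access control, and storage to the platform.

```javascript
// Builds one JSON line per event; `now` is injectable so output is testable.
// Fixed keys are written last so caller-supplied fields cannot override them.
function formatLogEvent(level, message, fields = {}, now = new Date()) {
  return JSON.stringify({ ...fields, ts: now.toISOString(), level, message });
}

// The app never opens a log file; it only writes to its own stdout.
function logEvent(level, message, fields) {
  process.stdout.write(formatLogEvent(level, message, fields) + "\n");
}
```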

WIKI_REF: domains/security/books/secure-by-design.md (full chapter mapping)
READ_ALSO: domains/security/secure-design-patterns.md, domains/security/index.md