DOMAIN:TESTING¶

OWNER: marije (Testing Lead Alfa), judith (Testing Lead Bravo)
ALSO_USED_BY: antje (TDD), jasper (test reconciliation), ashley (adversarial), koen (mutation testing), nessa (performance testing)
UPDATED: 2026-03-24
SCOPE: all client projects, all teams, GE platform itself

CORE_PRINCIPLE¶

GE operates a TWO-PHASE testing architecture: one test suite is written BEFORE code exists, and a second is written AFTER the code is implemented.
These two phases are INDEPENDENT: they must never share information before reconciliation.
The reconciliation between them is where real quality emerges.

PHILOSOPHY: "Write tests. Not too many. Mostly integration." (Guillermo Rauch, popularized by Kent C. Dodds)
FRAMEWORK: test pyramid (Mike Cohn, popularized by Martin Fowler): many unit, fewer integration, minimal E2E
RULE: tests are a SPECIFICATION, not a verification afterthought
RULE: every test must be able to FAIL — a test that cannot fail is test theater
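
To make the last rule concrete, here is a minimal, dependency-free sketch. The function `applyDiscount` and both checks are hypothetical, not GE code; in a real suite they would be Vitest `expect` assertions. The vacuous check passes even against a deliberately broken implementation, so it can never fail for the reason it exists.

```typescript
// Hypothetical function under test; not part of any GE codebase.
type Discount = (price: number, percent: number) => number;

const correctDiscount: Discount = (price, percent) => price * (1 - percent / 100);
// Deliberately broken: adds the discount instead of subtracting it.
const brokenDiscount: Discount = (price, percent) => price * (1 + percent / 100);

// Test theater: only checks the return type, so any numeric result passes.
function vacuousCheck(fn: Discount): boolean {
  return typeof fn(200, 25) === "number";
}

// A real check: pins the expected value, so a wrong implementation fails it.
function realCheck(fn: Discount): boolean {
  return fn(200, 25) === 150;
}

// The vacuous check misses the bug; the real check detects it.
const theaterCatchesBug = !vacuousCheck(brokenDiscount); // false
const realCatchesBug = !realCheck(brokenDiscount); // true
```

A quick way to apply the rule in review: break the code on purpose and confirm the test goes red.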

TWO_PHASE_ARCHITECTURE¶

PHASE_1: PRE-IMPLEMENTATION (Antje — TDD)¶

INPUT: Anna's formal specification
OUTPUT: test suite that defines expected behavior BEFORE any code exists
TOOLS: Vitest, fast-check (property-based)
CONSTRAINT: Antje has ZERO knowledge of implementation. Tests are derived purely from spec.
PURPOSE: create an oracle: code that passes these tests meets the spec, to the extent the tests encode it

FLOW:
1. Anna produces formal spec (inputs, outputs, constraints, edge cases)
2. Antje writes tests from spec using TDD methodology
3. Tests are committed to repo BEFORE any developer writes code
4. Developers write code to make Antje's tests pass (red-green-refactor)

ANTI_PATTERN: Antje looking at existing code or developer plans
FIX: Antje receives ONLY Anna's spec document. No other context.
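
A sketch of what a spec-derived, property-based check can look like. The `clamp` spec below is a hypothetical example, and the random-input loop is hand-rolled to keep the sketch dependency-free; in practice fast-check would generate the inputs and shrink failing cases. Note that the properties are written from the spec alone, without reference to any implementation.

```typescript
// Hypothetical spec: clamp(value, min, max) returns `value` limited to [min, max].
type Clamp = (value: number, min: number, max: number) => number;

// Small deterministic pseudo-random generator (LCG) so runs are reproducible.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x100000000; // value in [0, 1)
  };
}

// Checks two properties derived from the spec alone:
//  1. the result always lies within [min, max]
//  2. a value already inside the range is returned unchanged
function holdsForRandomInputs(clamp: Clamp, runs = 500, seed = 42): boolean {
  const rng = makeRng(seed);
  const randomInt = () => Math.floor(rng() * 2001) - 1000; // integer in [-1000, 1000]
  for (let i = 0; i < runs; i++) {
    const a = randomInt();
    const b = randomInt();
    const min = Math.min(a, b);
    const max = Math.max(a, b);
    const value = randomInt();
    const result = clamp(value, min, max);
    if (result < min || result > max) return false;
    if (value >= min && value <= max && result !== value) return false;
  }
  return true;
}

// Reference implementation that satisfies the spec.
const goodClamp: Clamp = (value, min, max) => Math.min(Math.max(value, min), max);
// Buggy implementation: forgets the upper bound.
const buggyClamp: Clamp = (value, min, _max) => Math.max(value, min);
```

Here `holdsForRandomInputs(goodClamp)` holds, while `buggyClamp` is rejected as soon as a generated value exceeds `max`.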

PHASE_2: POST-IMPLEMENTATION (Marije / Judith)¶

INPUT: implemented code, Anna's spec, user stories
OUTPUT: comprehensive test suite verifying actual behavior
TOOLS: Vitest (unit/integration), Playwright (E2E), @axe-core/playwright (a11y)
CONSTRAINT: Marije/Judith write tests AFTER implementation, testing what was actually built, without reading Antje's TDD suite
PURPOSE: verify real behavior including integration points, UI flows, edge cases Antje couldn't predict

FLOW:
1. Developers complete implementation (passing Antje's TDD tests)
2. Koen runs deterministic quality checks (linting, formatting, type checking)
3. Marije/Judith write post-implementation tests (unit, integration, E2E)
4. Koen runs Stryker mutation testing against the test suites
5. Jasper reconciles TDD suite vs post-impl suite
6. Ashley runs adversarial testing (chaos monkey, fuzzing)

TEAM_ASSIGNMENT:
- Marije: Team Alfa projects
- Judith: Team Bravo projects
- Shared test infrastructure owned by both

PAGES¶

VITEST_PATTERNS: Vitest deep dive — structure, mocking, fixtures, Hono handlers, Drizzle queries
-> domains/testing/vitest-patterns.md

PLAYWRIGHT_E2E: Playwright deep dive — POM, isolation, auth, visual regression, a11y, CI
-> domains/testing/playwright-e2e.md

TDD_METHODOLOGY: TDD for Antje — spec-driven tests, property-based testing, boundary analysis
-> domains/testing/tdd-methodology.md

MUTATION_TESTING: Stryker mutation testing for Koen — config, mutant types, thresholds, CI
-> domains/testing/mutation-testing.md

TEST_RECONCILIATION: For Jasper — comparing TDD vs post-impl suites, gap analysis, arbitration
-> domains/testing/test-reconciliation.md

ADVERSARIAL_TESTING: For Ashley — chaos monkey, fuzzing, race conditions, error injection
-> domains/testing/adversarial-testing.md

THOUGHT_LEADERS: Kent C. Dodds, Martin Fowler, testing philosophy, key resources
-> domains/testing/thought-leaders.md

PITFALLS: Testing anti-patterns, LLM-specific traps, flaky tests, test theater
-> domains/testing/pitfalls.md

JIT_INJECTION_MAP¶

| Task Type | Pages to Load |
|---|---|
| tdd_from_spec | tdd-methodology.md, vitest-patterns.md |
| post_impl_testing | vitest-patterns.md, playwright-e2e.md |
| test_reconciliation | test-reconciliation.md, tdd-methodology.md |
| mutation_testing | mutation-testing.md, vitest-patterns.md |
| adversarial_testing | adversarial-testing.md |
| e2e_test_writing | playwright-e2e.md |
| performance_testing | vitest-patterns.md |
| test_review | pitfalls.md, thought-leaders.md |
| new_project_test_setup | vitest-patterns.md, playwright-e2e.md, mutation-testing.md |
| test_debugging | pitfalls.md, vitest-patterns.md, playwright-e2e.md |
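
The map above can be expressed as a simple lookup. This is an illustrative sketch only: the function name `pagesFor` and the choice to return an empty list for unknown task types are assumptions, not a description of the actual injection tooling.

```typescript
// Mirrors the task-type table above; page names are relative to domains/testing/.
const JIT_PAGES = new Map<string, readonly string[]>([
  ["tdd_from_spec", ["tdd-methodology.md", "vitest-patterns.md"]],
  ["post_impl_testing", ["vitest-patterns.md", "playwright-e2e.md"]],
  ["test_reconciliation", ["test-reconciliation.md", "tdd-methodology.md"]],
  ["mutation_testing", ["mutation-testing.md", "vitest-patterns.md"]],
  ["adversarial_testing", ["adversarial-testing.md"]],
  ["e2e_test_writing", ["playwright-e2e.md"]],
  ["performance_testing", ["vitest-patterns.md"]],
  ["test_review", ["pitfalls.md", "thought-leaders.md"]],
  ["new_project_test_setup", ["vitest-patterns.md", "playwright-e2e.md", "mutation-testing.md"]],
  ["test_debugging", ["pitfalls.md", "vitest-patterns.md", "playwright-e2e.md"]],
]);

// Returns the full page paths to load for a task type.
// Assumption: unknown task types load nothing rather than raising an error.
function pagesFor(taskType: string): string[] {
  const pages = JIT_PAGES.get(taskType) ?? [];
  return pages.map((page) => `domains/testing/${page}`);
}
```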

TOOL_STACK¶

| Tool | Purpose | Version Policy |
|---|---|---|
| Vitest | Unit + Integration tests | Latest stable, workspace mode for monorepos |
| Playwright | E2E + Visual regression + A11y | Latest stable, Chromium + Firefox + WebKit |
| Stryker | Mutation testing | Latest stable, incremental mode in CI |
| k6 | Load testing | Latest stable (Nessa owns) |
| fast-check | Property-based testing | Latest stable (Antje uses in TDD) |
| @axe-core/playwright | Accessibility testing | Latest stable, integrated in Playwright |
| @faker-js/faker | Test data generation | Latest stable |

TEST_PYRAMID_POLICY¶

UNIT_TESTS: 70% of test suite
- Fast, isolated, no external dependencies
- Mock external boundaries (DB, APIs, filesystem)
- Run in < 10 seconds for full suite
- Every function with logic gets a unit test

INTEGRATION_TESTS: 20% of test suite
- Test real interactions between components
- Use test database (not mocks) for DB tests
- Test API routes end-to-end within the server
- Run in < 60 seconds for full suite

E2E_TESTS: 10% of test suite
- Critical user journeys only
- Login, core CRUD flows, payment flows
- Run in < 5 minutes for full suite
- Flaky E2E test = disabled immediately + bug filed

RULE: never write an E2E test for something a unit test can verify
RULE: never mock in an integration test what you're trying to integrate
RULE: if a single unit or integration test takes > 5 seconds, it's in the wrong tier
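
The tier shares and full-suite time budgets above lend themselves to an automated check. The following is a minimal sketch; the function names and the sample numbers are made up for illustration and do not describe existing GE tooling.

```typescript
type Tier = "unit" | "integration" | "e2e";
const ALL_TIERS: Tier[] = ["unit", "integration", "e2e"];

// Full-suite time budgets from the policy above, in seconds (E2E: 5 minutes).
const SUITE_BUDGET_SECONDS: Record<Tier, number> = { unit: 10, integration: 60, e2e: 300 };

// Target share of the total test count per tier, in percent.
const TARGET_SHARE_PERCENT: Record<Tier, number> = { unit: 70, integration: 20, e2e: 10 };

// Returns the tiers whose full-suite run time is not strictly under budget.
function tiersOverBudget(durationSeconds: Record<Tier, number>): Tier[] {
  return ALL_TIERS.filter((tier) => durationSeconds[tier] >= SUITE_BUDGET_SECONDS[tier]);
}

// Returns each tier's actual share of the suite, in percent.
function tierShares(testCounts: Record<Tier, number>): Record<Tier, number> {
  const total = testCounts.unit + testCounts.integration + testCounts.e2e;
  if (total === 0) return { unit: 0, integration: 0, e2e: 0 };
  return {
    unit: (testCounts.unit * 100) / total,
    integration: (testCounts.integration * 100) / total,
    e2e: (testCounts.e2e * 100) / total,
  };
}

// Example with made-up numbers: the integration suite is too slow,
// while the test counts match the 70/20/10 targets exactly.
const slowTiers = tiersOverBudget({ unit: 8, integration: 75, e2e: 240 });
const shares = tierShares({ unit: 140, integration: 40, e2e: 20 });
const matchesTargets = ALL_TIERS.every((tier) => shares[tier] === TARGET_SHARE_PERCENT[tier]);
```

In practice an exact match is unlikely, so a real check would probably allow some tolerance around the targets; the policy above does not specify one.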

COVERAGE_POLICY¶

MINIMUM_THRESHOLDS:
- Line coverage: 80%
- Branch coverage: 75%
- Function coverage: 85%
- Statement coverage: 80%
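
These thresholds can be enforced by Vitest itself. The fragment below is a sketch of a `vitest.config.ts`, assuming Vitest 1.0 or later (where thresholds live under `coverage.thresholds`) and the V8 coverage provider from the `@vitest/coverage-v8` package; adapt the provider to the project.

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      // Assumption: V8 provider; "istanbul" is the other built-in option.
      provider: "v8",
      // Minimum thresholds from the policy above, in percent.
      // A run with coverage enabled fails when any value falls below its threshold.
      thresholds: {
        lines: 80,
        branches: 75,
        functions: 85,
        statements: 80,
      },
    },
  },
});
```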

MUTATION_SCORE_THRESHOLDS:
- New code: >= 80% mutation score
- Existing code: >= 60% mutation score (improve over time)
- Critical paths (auth, payment, data): >= 90% mutation score

RULE: coverage is a FLOOR, not a GOAL — 100% coverage with bad tests is worse than 70% with good tests
RULE: mutation score is the real quality metric, because it measures whether tests actually detect injected faults
RULE: never write a test just to increase coverage numbers
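
As a worked example of the mutation score thresholds, the sketch below computes a simplified score (killed mutants divided by total mutants) and compares it with the per-category minimum. The names are hypothetical, and the formula is a simplification: Stryker's own score also counts timed-out mutants as detected and excludes invalid mutants.

```typescript
type CodeKind = "new" | "existing" | "critical";

// Minimum mutation scores from the policy above, in percent.
const MIN_MUTATION_SCORE: Record<CodeKind, number> = { new: 80, existing: 60, critical: 90 };

// Simplified mutation score: percentage of mutants killed by the test suite.
function mutationScore(killed: number, total: number): number {
  if (total === 0) return 100; // assumption: no mutants means nothing to miss
  return (killed * 100) / total;
}

// True when the score reaches the minimum for that category of code.
function meetsThreshold(kind: CodeKind, killed: number, total: number): boolean {
  return mutationScore(killed, total) >= MIN_MUTATION_SCORE[kind];
}

// 39 of 50 mutants killed is 78%: enough for existing code, not for new code.
const okForExisting = meetsThreshold("existing", 39, 50);
const okForNew = meetsThreshold("new", 39, 50);
```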

CI_PIPELINE_INTEGRATION¶

ORDER_IN_PIPELINE:
1. Type checking (tsc --noEmit)
2. Linting (eslint)
3. Unit tests (vitest run)
4. Integration tests (vitest run --project integration)
5. Mutation testing on changed files (stryker run --incremental)
6. E2E tests (playwright test)
7. Coverage report generation
8. Test reconciliation report (Jasper's tooling)

FAIL_CONDITIONS:
- ANY test failure = pipeline blocked
- Coverage below thresholds = pipeline blocked
- Mutation score below threshold on NEW code = pipeline blocked
- Mutation score below threshold on EXISTING code = warning only
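
The fail conditions can be read as a single gate function. This is a sketch under stated assumptions: the `PipelineResult` shape and the function name are hypothetical, and the 80% and 60% limits are taken from the mutation score thresholds in COVERAGE_POLICY.

```typescript
// Hypothetical summary of one pipeline run.
interface PipelineResult {
  failedTests: number;
  coverageBelowThreshold: boolean;
  newCodeMutationScore: number; // percent
  existingCodeMutationScore: number; // percent
}

interface GateDecision {
  blocked: boolean;
  reasons: string[]; // blocking conditions
  warnings: string[]; // non-blocking conditions
}

// Applies the fail conditions: three of them block, one only warns.
function evaluateGate(result: PipelineResult): GateDecision {
  const reasons: string[] = [];
  const warnings: string[] = [];
  if (result.failedTests > 0) reasons.push("test failure");
  if (result.coverageBelowThreshold) reasons.push("coverage below thresholds");
  if (result.newCodeMutationScore < 80) reasons.push("mutation score below threshold on new code");
  if (result.existingCodeMutationScore < 60) warnings.push("mutation score below threshold on existing code");
  return { blocked: reasons.length > 0, reasons, warnings };
}

// Example: everything passes except a weak mutation score on existing code.
const decision = evaluateGate({
  failedTests: 0,
  coverageBelowThreshold: false,
  newCodeMutationScore: 85,
  existingCodeMutationScore: 55,
});
```

In this example the pipeline is not blocked, but the decision carries one warning.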

AGENT_RESPONSIBILITIES¶

ANTJE (TDD):
- Receives Anna's spec
- Writes test suite BEFORE implementation
- Uses property-based testing for algorithmic requirements
- Commits tests to __tests__/tdd/ directory
- NEVER reads implementation code

MARIJE (Testing Lead Alfa):
- Writes post-implementation tests for Team Alfa projects
- Owns E2E test suite for Alfa projects
- Reviews test quality for Alfa team
- Commits tests to __tests__/unit/, __tests__/integration/, and e2e/features/

JUDITH (Testing Lead Bravo):
- Same responsibilities as Marije but for Team Bravo projects
- Shares test infrastructure and patterns with Marije

KOEN (Mutation Testing):
- Runs Stryker against test suites
- Reports mutation score
- Identifies surviving mutants (tests that miss bugs)
- Works AFTER Marije/Judith, BEFORE Jasper

JASPER (Test Reconciliation):
- Compares Antje's TDD suite vs Marije/Judith's post-impl suite
- Identifies coverage gaps between the two
- Arbitrates contradictions using Anna's spec as source of truth
- Produces reconciliation report
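
A minimal sketch of the gap analysis behind a reconciliation report. It assumes each test is tagged with the ID of the spec requirement it exercises; that tagging scheme, the type names, and the sample data are hypothetical and not taken from Jasper's actual tooling.

```typescript
// One test, tagged with the spec requirement it exercises and the behavior it expects.
interface TaggedTest {
  requirementId: string;
  expectation: string;
}

interface ReconciliationReport {
  onlyInTdd: string[]; // requirements the post-impl suite does not cover
  onlyInPostImpl: string[]; // behavior tested after the fact but absent from the TDD suite
  contradictions: string[]; // same requirement, different expected behavior
}

// Indexes a suite by requirement ID (assumes one expectation per requirement).
function indexByRequirement(suite: TaggedTest[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const test of suite) index.set(test.requirementId, test.expectation);
  return index;
}

function reconcile(tddSuite: TaggedTest[], postImplSuite: TaggedTest[]): ReconciliationReport {
  const tdd = indexByRequirement(tddSuite);
  const post = indexByRequirement(postImplSuite);
  const tddIds = Array.from(tdd.keys());
  const postIds = Array.from(post.keys());
  return {
    onlyInTdd: tddIds.filter((id) => !post.has(id)),
    onlyInPostImpl: postIds.filter((id) => !tdd.has(id)),
    contradictions: tddIds.filter((id) => post.has(id) && post.get(id) !== tdd.get(id)),
  };
}

// Made-up example: R2 is covered by both suites with conflicting expectations.
const reconciliation = reconcile(
  [
    { requirementId: "R1", expectation: "rejects empty email" },
    { requirementId: "R2", expectation: "returns 404 for unknown id" },
  ],
  [
    { requirementId: "R2", expectation: "returns 410 for unknown id" },
    { requirementId: "R3", expectation: "trims whitespace in names" },
  ],
);
```

Each entry in `contradictions` is a case to arbitrate against Anna's spec.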

ASHLEY (Adversarial Testing):
- ZERO codebase knowledge — tests from user perspective only
- Fuzzes inputs, tests race conditions, injects errors
- Chaos monkey approach: tries to break things
- Works AFTER Jasper, BEFORE Jaap (SSOT)

NESSA (Performance Testing):
- Owns k6 load testing
- Performance baselines and regression detection
- Separate domain but coordinates with testing team
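
The AFTER/BEFORE constraints stated for Koen and Ashley above imply a hand-off order, sketched below. The stage labels and function names are assumptions made for illustration; Nessa is left out because her work is described as a separate domain.

```typescript
// Hand-off order implied by the responsibilities above:
// TDD tests, implementation, post-impl tests, mutation testing,
// reconciliation, adversarial testing, then Jaap (SSOT).
const HANDOFF_ORDER = [
  "antje",
  "developers",
  "marije_judith",
  "koen",
  "jasper",
  "ashley",
  "jaap",
] as const;

type Stage = (typeof HANDOFF_ORDER)[number];

// Returns the stage that follows `current`, or null after the last stage.
function nextStage(current: Stage): Stage | null {
  const index = HANDOFF_ORDER.indexOf(current);
  return index < HANDOFF_ORDER.length - 1 ? HANDOFF_ORDER[index + 1] : null;
}

// A stage may start only when every earlier stage has completed.
function mayStart(stage: Stage, completed: Stage[]): boolean {
  const index = HANDOFF_ORDER.indexOf(stage);
  return HANDOFF_ORDER.slice(0, index).every((earlier) => completed.indexOf(earlier) !== -1);
}
```

For example, `mayStart("ashley", ...)` stays false until `"jasper"` appears in the completed list.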

DIRECTORY_CONVENTIONS¶

project-root/
  __tests__/
    tdd/                  # Antje's TDD tests (pre-implementation)
      feature-name.test.ts
    unit/                 # Unit tests (post-implementation)
      feature-name.test.ts
    integration/          # Integration tests
      feature-name.integration.test.ts
  e2e/
    features/             # Playwright E2E tests
      feature-name.spec.ts
    fixtures/             # Playwright fixtures
    page-objects/         # Page Object Model classes
  vitest.config.ts        # Vitest configuration
  vitest.workspace.ts     # Workspace config (monorepo)
  playwright.config.ts    # Playwright configuration
  stryker.config.mjs      # Stryker configuration

RULE: TDD tests and post-impl tests are in SEPARATE directories — they must never be mixed
RULE: test file names mirror source file names, with a .test.ts suffix for TDD and unit tests, .integration.test.ts for integration tests, and .spec.ts for E2E tests
RULE: E2E test files use .spec.ts to distinguish them from unit/integration .test.ts files
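
The layout and naming rules can be checked mechanically, for instance in a pre-commit hook. The classifier below is a sketch: the function name is hypothetical, paths are assumed to be relative to the project root with forward slashes, and it assumes test files sit directly in their suite directory without nesting.

```typescript
type SuiteKind = "tdd" | "unit" | "integration" | "e2e";

// Maps a project-relative path to its suite, or null when the path
// violates the directory or suffix conventions above.
function classifyTestFile(path: string): SuiteKind | null {
  if (/^__tests__\/tdd\/[^/]+\.test\.ts$/.test(path)) return "tdd";
  if (/^__tests__\/integration\/[^/]+\.integration\.test\.ts$/.test(path)) return "integration";
  if (/^__tests__\/unit\/[^/]+\.test\.ts$/.test(path)) return "unit";
  if (/^e2e\/features\/[^/]+\.spec\.ts$/.test(path)) return "e2e";
  return null;
}

// Examples using the file names from the tree above.
const tddKind = classifyTestFile("__tests__/tdd/feature-name.test.ts");
const e2eKind = classifyTestFile("e2e/features/feature-name.spec.ts");
// A .spec.ts file in the unit directory breaks the suffix rule.
const misplaced = classifyTestFile("__tests__/unit/feature-name.spec.ts");
```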

CROSS_REFERENCES¶

STANDARDS: domains/testing/thought-leaders.md — testing philosophy and key resources
ANTI_PATTERNS: domains/testing/pitfalls.md — what NOT to do
PIPELINE: development/standards/ci-cd.md — where tests fit in deployment
SPECS: Anna's output format documented in development/contracts/spec-format.md