DOMAIN:TESTING¶
OWNER: marije (Testing Lead Alfa), judith (Testing Lead Bravo)
ALSO_USED_BY: antje (TDD), jasper (test reconciliation), ashley (adversarial), koen (mutation testing), nessa (performance testing)
UPDATED: 2026-03-24
SCOPE: all client projects, all teams, GE platform itself
CORE_PRINCIPLE¶
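GE operates a TWO-PHASE testing architecture: one test suite is written BEFORE any code exists, and a second is written AFTER the code is implemented.
These two phases are INDEPENDENT. Their authors must never share tests, plans, or implementation knowledge; Anna's spec is the only common input.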
The reconciliation between them is where real quality emerges.
PHILOSOPHY: "Write tests. Not too many. Mostly integration." (Guillermo Rauch, popularized by Kent C. Dodds)
FRAMEWORK: test pyramid (Mike Cohn, popularized by Martin Fowler): many unit, fewer integration, minimal E2E
RULE: tests are a SPECIFICATION, not a verification afterthought
RULE: every test must be able to FAIL — a test that cannot fail is test theater
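EXAMPLE: a minimal sketch of the "must be able to FAIL" rule. `isAdult`, `brokenIsAdult`, and both check functions are hypothetical names used only for illustration; in a real project the checks would be Vitest `expect` calls rather than plain functions.

```typescript
// Hypothetical function under test.
type AgeCheck = (age: number) => boolean;

const isAdult: AgeCheck = (age) => age >= 18;

// A deliberately broken variant (off-by-one at the boundary).
const brokenIsAdult: AgeCheck = (age) => age > 18;

// Test theater: only checks the return type, so every AgeCheck passes.
function theaterCheck(fn: AgeCheck): boolean {
  return typeof fn(30) === "boolean";
}

// A real test: pins the boundary, so the broken variant fails it.
function boundaryCheck(fn: AgeCheck): boolean {
  return fn(18) === true && fn(17) === false;
}
```

`theaterCheck` accepts both implementations, so it can never signal a regression. `boundaryCheck` rejects `brokenIsAdult`, which is the kind of difference mutation testing is meant to surface.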
TWO_PHASE_ARCHITECTURE¶
PHASE_1: PRE-IMPLEMENTATION (Antje — TDD)¶
INPUT: Anna's formal specification
OUTPUT: test suite that defines expected behavior BEFORE any code exists
TOOLS: Vitest, fast-check (property-based)
CONSTRAINT: Antje has ZERO knowledge of implementation. Tests are derived purely from spec.
PURPOSE: create an oracle: code that passes these tests meets the spec, to the extent the tests encode it
FLOW:
1. Anna produces formal spec (inputs, outputs, constraints, edge cases)
2. Antje writes tests from spec using TDD methodology
3. Tests are committed to repo BEFORE any developer writes code
4. Developers write code to make Antje's tests pass (red-green-refactor)
ANTI_PATTERN: Antje looking at existing code or developer plans
FIX: Antje receives ONLY Anna's spec document. No other context.
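EXAMPLE: a sketch of spec-derived tests that exist before any implementation. The spec for `clampQuantity` is invented for illustration, and the hand-rolled loop stands in for a fast-check property; real Phase 1 tests would be Vitest files using fast-check generators.

```typescript
// Hypothetical spec: clampQuantity(n) returns an integer in [1, 99].
// Values below 1 become 1, values above 99 become 99, fractions round down.
type Impl = (n: number) => number;

interface SpecCase {
  name: string;
  input: number;
  expected: number;
}

// Boundary cases derived from the spec text alone.
const specCases: SpecCase[] = [
  { name: "lower boundary", input: 1, expected: 1 },
  { name: "below lower boundary", input: 0, expected: 1 },
  { name: "upper boundary", input: 99, expected: 99 },
  { name: "above upper boundary", input: 100, expected: 99 },
  { name: "fraction rounds down", input: 5.9, expected: 5 },
];

// Runs the spec against any implementation and returns failure messages.
function runSpec(impl: Impl): string[] {
  const failures: string[] = [];
  for (const c of specCases) {
    const actual = impl(c.input);
    if (actual !== c.expected) {
      failures.push(`${c.name}: expected ${c.expected}, got ${actual}`);
    }
  }
  // Property from the spec: the output is always an integer within [1, 99].
  for (let n = -50; n <= 150; n += 0.5) {
    const out = impl(n);
    if (!Number.isInteger(out) || out < 1 || out > 99) {
      failures.push(`range property violated for input ${n}`);
    }
  }
  return failures;
}

// Written later by a developer; the spec tests never reference its internals.
const clampQuantity: Impl = (n) => Math.min(99, Math.max(1, Math.floor(n)));
```

Because `runSpec` takes the implementation as a parameter, the test data can be committed first and stays valid for whatever implementation the developers produce.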
PHASE_2: POST-IMPLEMENTATION (Marije / Judith)¶
INPUT: implemented code, Anna's spec, user stories
OUTPUT: comprehensive test suite verifying actual behavior
TOOLS: Vitest (unit/integration), Playwright (E2E), @axe-core/playwright (a11y)
CONSTRAINT: Marije/Judith write tests AFTER implementation, testing what was actually built
PURPOSE: verify real behavior including integration points, UI flows, edge cases Antje couldn't predict
FLOW:
1. Developers complete implementation (passing Antje's TDD tests)
2. Koen runs deterministic quality checks (linting, formatting, type checking)
3. Marije/Judith write post-implementation tests (unit, integration, E2E)
4. Jasper reconciles TDD suite vs post-impl suite
5. Ashley runs adversarial testing (chaos monkey, fuzzing)
TEAM_ASSIGNMENT:
- Marije: Team Alfa projects
- Judith: Team Bravo projects
- Shared test infrastructure owned by both
PAGES¶
VITEST_PATTERNS: Vitest deep dive — structure, mocking, fixtures, Hono handlers, Drizzle queries
-> domains/testing/vitest-patterns.md
PLAYWRIGHT_E2E: Playwright deep dive — POM, isolation, auth, visual regression, a11y, CI
-> domains/testing/playwright-e2e.md
TDD_METHODOLOGY: TDD for Antje — spec-driven tests, property-based testing, boundary analysis
-> domains/testing/tdd-methodology.md
MUTATION_TESTING: Stryker mutation testing for Koen — config, mutant types, thresholds, CI
-> domains/testing/mutation-testing.md
TEST_RECONCILIATION: For Jasper — comparing TDD vs post-impl suites, gap analysis, arbitration
-> domains/testing/test-reconciliation.md
ADVERSARIAL_TESTING: For Ashley — chaos monkey, fuzzing, race conditions, error injection
-> domains/testing/adversarial-testing.md
THOUGHT_LEADERS: Kent C. Dodds, Martin Fowler, testing philosophy, key resources
-> domains/testing/thought-leaders.md
PITFALLS: Testing anti-patterns, LLM-specific traps, flaky tests, test theater
-> domains/testing/pitfalls.md
JIT_INJECTION_MAP¶
| Task Type | Pages to Load |
|---|---|
| tdd_from_spec | tdd-methodology.md, vitest-patterns.md |
| post_impl_testing | vitest-patterns.md, playwright-e2e.md |
| test_reconciliation | test-reconciliation.md, tdd-methodology.md |
| mutation_testing | mutation-testing.md, vitest-patterns.md |
| adversarial_testing | adversarial-testing.md |
| e2e_test_writing | playwright-e2e.md |
| performance_testing | vitest-patterns.md |
| test_review | pitfalls.md, thought-leaders.md |
| new_project_test_setup | vitest-patterns.md, playwright-e2e.md, mutation-testing.md |
| test_debugging | pitfalls.md, vitest-patterns.md, playwright-e2e.md |
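EXAMPLE: the table above can be expressed as a lookup. This is a sketch only; `pagesForTask` and `PAGE_ROOT` are hypothetical names, and the actual JIT loader is not described in this document.

```typescript
// Task types and page names copied from the table above.
const JIT_INJECTION_MAP = new Map<string, readonly string[]>([
  ["tdd_from_spec", ["tdd-methodology.md", "vitest-patterns.md"]],
  ["post_impl_testing", ["vitest-patterns.md", "playwright-e2e.md"]],
  ["test_reconciliation", ["test-reconciliation.md", "tdd-methodology.md"]],
  ["mutation_testing", ["mutation-testing.md", "vitest-patterns.md"]],
  ["adversarial_testing", ["adversarial-testing.md"]],
  ["e2e_test_writing", ["playwright-e2e.md"]],
  ["performance_testing", ["vitest-patterns.md"]],
  ["test_review", ["pitfalls.md", "thought-leaders.md"]],
  ["new_project_test_setup", ["vitest-patterns.md", "playwright-e2e.md", "mutation-testing.md"]],
  ["test_debugging", ["pitfalls.md", "vitest-patterns.md", "playwright-e2e.md"]],
]);

// Directory taken from the PAGES section.
const PAGE_ROOT = "domains/testing/";

// Resolves a task type to the full page paths to load.
function pagesForTask(taskType: string): string[] {
  const pages = JIT_INJECTION_MAP.get(taskType);
  if (pages === undefined) {
    throw new Error(`Unknown task type: ${taskType}`);
  }
  return pages.map((page) => PAGE_ROOT + page);
}
```

Throwing on an unknown task type is a design assumption; a loader could equally fall back to a default page set.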
TOOL_STACK¶
| Tool | Purpose | Version Policy |
|---|---|---|
| Vitest | Unit + Integration tests | Latest stable, workspace mode for monorepos |
| Playwright | E2E + Visual regression + A11y | Latest stable, Chromium + Firefox + WebKit |
| Stryker | Mutation testing | Latest stable, incremental mode in CI |
| k6 | Load testing | Latest stable (Nessa owns) |
| fast-check | Property-based testing | Latest stable (Antje uses in TDD) |
| @axe-core/playwright | Accessibility testing | Latest stable, integrated in Playwright |
| @faker-js/faker | Test data generation | Latest stable |
TEST_PYRAMID_POLICY¶
UNIT_TESTS: 70% of test suite
- Fast, isolated, no external dependencies
- Mock external boundaries (DB, APIs, filesystem)
- Run in < 10 seconds for full suite
- Every function with logic gets a unit test
INTEGRATION_TESTS: 20% of test suite
- Test real interactions between components
- Use test database (not mocks) for DB tests
- Test API routes end-to-end within the server
- Run in < 60 seconds for full suite
E2E_TESTS: 10% of test suite
- Critical user journeys only
- Login, core CRUD flows, payment flows
- Run in < 5 minutes for full suite
- Flaky E2E test = disabled immediately + bug filed
RULE: never write an E2E test for something a unit test can verify
RULE: never mock in an integration test what you're trying to integrate
RULE: if a single unit or integration test takes > 5 seconds, it's in the wrong tier
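EXAMPLE: a sketch of checking a project against the tier shares and suite time budgets above. `checkPyramid` is a hypothetical helper, and the 5-point tolerance on the shares is an assumption; the policy does not define one.

```typescript
type Tier = "unit" | "integration" | "e2e";

interface TierPolicy {
  sharePercent: number;    // target share of the whole test suite
  maxSuiteSeconds: number; // full-suite runtime must stay below this
}

// Values copied from the policy above.
const POLICY: Record<Tier, TierPolicy> = {
  unit: { sharePercent: 70, maxSuiteSeconds: 10 },
  integration: { sharePercent: 20, maxSuiteSeconds: 60 },
  e2e: { sharePercent: 10, maxSuiteSeconds: 300 },
};

interface TierStats {
  testCount: number;
  suiteSeconds: number;
}

const TIERS: Tier[] = ["unit", "integration", "e2e"];

// Returns one message per violated share target or time budget.
// tolerancePoints is an assumed allowance in percentage points.
function checkPyramid(stats: Record<Tier, TierStats>, tolerancePoints = 5): string[] {
  const total = TIERS.reduce((sum, tier) => sum + stats[tier].testCount, 0);
  const problems: string[] = [];
  for (const tier of TIERS) {
    const { testCount, suiteSeconds } = stats[tier];
    const share = total === 0 ? 0 : (testCount / total) * 100;
    const policy = POLICY[tier];
    if (Math.abs(share - policy.sharePercent) > tolerancePoints) {
      problems.push(`${tier}: share ${share.toFixed(0)}%, target ${policy.sharePercent}%`);
    }
    if (suiteSeconds >= policy.maxSuiteSeconds) {
      problems.push(`${tier}: suite took ${suiteSeconds}s, budget is < ${policy.maxSuiteSeconds}s`);
    }
  }
  return problems;
}
```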
COVERAGE_POLICY¶
MINIMUM_THRESHOLDS:
- Line coverage: 80%
- Branch coverage: 75%
- Function coverage: 85%
- Statement coverage: 80%
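EXAMPLE: these minimums map onto Vitest's `coverage.thresholds` option. A minimal config sketch, assuming the v8 coverage provider and the reporters shown (this document specifies neither):

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",             // assumption: provider is not specified here
      reporter: ["text", "lcov"], // assumption: reporters are not specified here
      thresholds: {
        lines: 80,
        branches: 75,
        functions: 85,
        statements: 80,
      },
    },
  },
});
```

With thresholds configured, a run with coverage enabled exits non-zero when any metric falls below its minimum.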
MUTATION_SCORE_THRESHOLDS:
- New code: >= 80% mutation score
- Existing code: >= 60% mutation score (improve over time)
- Critical paths (auth, payment, data): >= 90% mutation score
RULE: coverage is a FLOOR, not a GOAL — 100% coverage with bad tests is worse than 70% with good tests
RULE: mutation score is the real quality metric — it measures if tests actually detect bugs
RULE: never write a test just to increase coverage numbers
CI_PIPELINE_INTEGRATION¶
ORDER_IN_PIPELINE:
1. Type checking (tsc --noEmit)
2. Linting (eslint)
3. Unit tests (vitest run)
4. Integration tests (vitest run --project integration)
5. Mutation testing on changed files (stryker run --incremental)
6. E2E tests (playwright test)
7. Coverage report generation
8. Test reconciliation report (Jasper's tooling)
FAIL_CONDITIONS:
- ANY test failure = pipeline blocked
- Coverage below thresholds = pipeline blocked
- Mutation score below threshold on NEW code = pipeline blocked
- Mutation score below threshold on EXISTING code = warning only
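EXAMPLE: a sketch of the fail conditions as a single verdict function. `evaluatePipeline` and its input shape are hypothetical. Treating a low score on critical paths as blocking is an assumption, since the fail conditions above name only NEW and EXISTING code.

```typescript
type CodeCategory = "new" | "existing" | "critical";
type Verdict = "pass" | "warning" | "blocked";

// Mutation score floors from the coverage policy, in percent.
const MUTATION_FLOOR: Record<CodeCategory, number> = {
  new: 80,
  existing: 60,
  critical: 90,
};

interface PipelineResult {
  failedTests: number;
  coverageMeetsThresholds: boolean;
  mutationScores: { category: CodeCategory; score: number }[];
}

function evaluatePipeline(result: PipelineResult): Verdict {
  // ANY test failure or a coverage shortfall blocks the pipeline.
  if (result.failedTests > 0) return "blocked";
  if (!result.coverageMeetsThresholds) return "blocked";

  let verdict: Verdict = "pass";
  for (const { category, score } of result.mutationScores) {
    if (score >= MUTATION_FLOOR[category]) continue;
    if (category === "existing") {
      verdict = "warning"; // existing code: warning only
    } else {
      return "blocked";    // new code blocks; critical paths block by assumption
    }
  }
  return verdict;
}
```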
AGENT_RESPONSIBILITIES¶
ANTJE (TDD):
- Receives Anna's spec
- Writes test suite BEFORE implementation
- Uses property-based testing for algorithmic requirements
- Commits tests to __tests__/tdd/ directory
- NEVER reads implementation code
MARIJE (Testing Lead Alfa):
- Writes post-implementation tests for Team Alfa projects
- Owns E2E test suite for Alfa projects
- Reviews test quality for Alfa team
- Commits tests to __tests__/unit/, __tests__/integration/, and e2e/ (E2E)
JUDITH (Testing Lead Bravo):
- Same responsibilities as Marije but for Team Bravo projects
- Shares test infrastructure and patterns with Marije
KOEN (Mutation Testing):
- Runs Stryker against test suites
- Reports mutation score
- Identifies surviving mutants (tests that miss bugs)
- Works AFTER Marije/Judith, BEFORE Jasper
JASPER (Test Reconciliation):
- Compares Antje's TDD suite vs Marije/Judith's post-impl suite
- Identifies coverage gaps between the two
- Arbitrates contradictions using Anna's spec as source of truth
- Produces reconciliation report
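EXAMPLE: a sketch of the comparison step, assuming each test in both suites is tagged with the spec requirement it covers. The `TaggedTest` shape, the requirement IDs, and the rule "no shared expectation means contradiction" are illustrative assumptions; Jasper's actual tooling is documented in test-reconciliation.md.

```typescript
// Hypothetical shape: the requirement a test covers and what it asserts.
interface TaggedTest {
  requirementId: string;
  expectation: string;
}

interface ReconciliationReport {
  onlyInTdd: string[];      // covered by Antje's suite only
  onlyInPostImpl: string[]; // covered by the post-impl suite only
  contradictions: string[]; // both cover it, but assert different outcomes
}

function indexByRequirement(tests: TaggedTest[]): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  for (const t of tests) {
    const expectations = index.get(t.requirementId) ?? new Set<string>();
    expectations.add(t.expectation);
    index.set(t.requirementId, expectations);
  }
  return index;
}

function reconcile(tdd: TaggedTest[], postImpl: TaggedTest[]): ReconciliationReport {
  const tddIndex = indexByRequirement(tdd);
  const postIndex = indexByRequirement(postImpl);
  const report: ReconciliationReport = { onlyInTdd: [], onlyInPostImpl: [], contradictions: [] };

  tddIndex.forEach((expectations, id) => {
    const other = postIndex.get(id);
    if (other === undefined) {
      report.onlyInTdd.push(id);
      return;
    }
    // Flag for arbitration when the suites share no expectation.
    const agrees = Array.from(expectations).some((e) => other.has(e));
    if (!agrees) report.contradictions.push(id);
  });

  postIndex.forEach((_expectations, id) => {
    if (!tddIndex.has(id)) report.onlyInPostImpl.push(id);
  });
  return report;
}
```

The sketch only flags contradictions. Resolving them still means reading Anna's spec, which remains the source of truth.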
ASHLEY (Adversarial Testing):
- ZERO codebase knowledge — tests from user perspective only
- Fuzzes inputs, tests race conditions, injects errors
- Chaos monkey approach: tries to break things
- Works AFTER Jasper, BEFORE Jaap (SSOT)
NESSA (Performance Testing):
- Owns k6 load testing
- Performance baselines and regression detection
- Separate domain but coordinates with testing team
DIRECTORY_CONVENTIONS¶
project-root/
  __tests__/
    tdd/                                 # Antje's TDD tests (pre-implementation)
      feature-name.test.ts
    unit/                                # Unit tests (post-implementation)
      feature-name.test.ts
    integration/                         # Integration tests
      feature-name.integration.test.ts
  e2e/
    features/                            # Playwright E2E tests
      feature-name.spec.ts
    fixtures/                            # Playwright fixtures
    page-objects/                        # Page Object Model classes
  vitest.config.ts                       # Vitest configuration
  vitest.workspace.ts                    # Workspace config (monorepo)
  playwright.config.ts                   # Playwright configuration
  stryker.config.mjs                     # Stryker configuration
RULE: TDD tests and post-impl tests are in SEPARATE directories — they must never be mixed
RULE: test file names mirror source file names with .test.ts or .spec.ts suffix
RULE: E2E test files use .spec.ts to distinguish from unit/integration .test.ts
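EXAMPLE: a sketch of enforcing the directory and suffix conventions, for instance in a lint step. `classifyTestFile` is a hypothetical helper. It assumes paths relative to the project root, no nested subdirectories, and that integration files always carry the `.integration.test.ts` suffix shown in the tree.

```typescript
type Suite = "tdd" | "unit" | "integration" | "e2e";

// Returns the suite a test file belongs to, or null when the path
// does not follow the directory and suffix conventions.
function classifyTestFile(path: string): Suite | null {
  if (/^__tests__\/tdd\/[^/]+\.test\.ts$/.test(path)) return "tdd";
  if (/^__tests__\/integration\/[^/]+\.integration\.test\.ts$/.test(path)) return "integration";
  if (/^__tests__\/unit\/[^/]+\.test\.ts$/.test(path)) return "unit";
  if (/^e2e\/features\/[^/]+\.spec\.ts$/.test(path)) return "e2e";
  return null;
}
```

A `null` result covers both misplaced files (a `.spec.ts` under `__tests__/`) and files outside the known test directories.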
CROSS_REFERENCES¶
STANDARDS: domains/testing/thought-leaders.md — testing philosophy and key resources
ANTI_PATTERNS: domains/testing/pitfalls.md — what NOT to do
PIPELINE: development/standards/ci-cd.md — where tests fit in deployment
SPECS: Anna's output format documented in development/contracts/spec-format.md