Testing Standards

TDD Approach

RULE: Write the test before the implementation. The test defines what "done" means.

RATIONALE: Writing the test first prevents scaffolding that is never integrated, because the test is the first call site for any new code.
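As a minimal sketch of the ordering, the test class below is written first and is the first call site of a hypothetical `slugify` function; the implementation is added afterwards, only to make the test pass. Both names are assumptions for illustration and do not come from this codebase.

```python
import re
import unittest


class SlugifyTest(unittest.TestCase):
    """Written first: these cases define what "done" means for slugify()."""

    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_empty_input_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")


def slugify(text: str) -> str:
    """Written second: the smallest implementation that satisfies the tests."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```

For a feature, the same ordering applies to its integration test: the test that drives the real system path is written before the handler or executor code exists.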

Test Levels

Unit Tests (supplementary)

  • Test individual functions in isolation
  • Fast, numerous, focused
  • NOT sufficient for feature verification
  • NOT proof of life

Integration Tests (required for features)

  • Test the actual system path end-to-end
  • Exercise real network calls, real database operations, real Redis streams
  • This IS proof of life
  • Every new feature must have at least one integration test

Regression Tests (scheduled)

  • Run on a schedule (post-deployment or daily)
  • Exercise all existing features through actual system paths
  • A failing regression test blocks new feature work (Principle 10)

What Tests Must Prove

For a new API endpoint:

  • The route is registered and reachable (curl from outside the pod)
  • The handler processes a real request
  • The response matches the contract schema
  • The side effects (DB writes, Redis publishes) actually occurred
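The four endpoint checks can be sketched as one test that goes through a real socket rather than calling the handler function directly. Everything named here is an assumption for illustration: the `POST /notes` endpoint, its `{"id", "body"}` response contract, and the `notes` table are hypothetical. An in-process HTTP server and an in-memory SQLite database stand in for the deployed service so that the sketch is self-contained. A real test would send the request from outside the pod and inspect the actual database and Redis.

```python
import json
import sqlite3
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the real database (assumption: the real test queries the
# service's actual store instead).
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")


class NotesHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint under test: POST /notes stores a note."""

    def do_POST(self):
        if self.path != "/notes":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        cursor = db.execute("INSERT INTO notes (body) VALUES (?)", (payload["body"],))
        db.commit()
        body = json.dumps({"id": cursor.lastrowid, "body": payload["body"]}).encode()
        self.send_response(201)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def run_endpoint_checks() -> dict:
    server = HTTPServer(("127.0.0.1", 0), NotesHandler)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/notes"
        request = urllib.request.Request(
            url,
            data=json.dumps({"body": "hello"}).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        # Checks 1 and 2: the route is reachable and handles a real request.
        with urllib.request.urlopen(request, timeout=5) as response:
            status = response.status
            reply = json.loads(response.read())
        # Check 3: the response matches the (hypothetical) contract schema.
        assert status == 201
        assert set(reply) == {"id", "body"}
        assert isinstance(reply["id"], int) and isinstance(reply["body"], str)
        # Check 4: the side effect actually occurred in the database.
        row = db.execute("SELECT body FROM notes WHERE id = ?", (reply["id"],)).fetchone()
        assert row == ("hello",)
        return reply
    finally:
        server.shutdown()
        server.server_close()
```

If the route were never registered, the request would fail with a 404 even though a direct call to the handler logic would still pass, which is the gap this kind of test is meant to close.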

For a new agent capability:

  • The trigger reaches the executor via Redis Stream
  • The executor spawns a real CLI session (Claude/Codex/Gemini)
  • The CLI produces real output (verified via PTY capture)
  • The completion file is written to ge-ops/system/completions/
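The last check, that a completion file actually appears, can be sketched as a polling helper that an integration test calls after publishing the trigger. The function name `wait_for_completion`, its parameters, and the file name in the usage note are hypothetical. The naming scheme of completion files is not specified here, so the caller supplies it.

```python
import time
from pathlib import Path


def wait_for_completion(directory: Path, filename: str,
                        timeout: float = 30.0, poll_interval: float = 0.1) -> Path:
    """Return the completion file once it exists and is non-empty.

    Raises TimeoutError if the file has not appeared within `timeout` seconds.
    """
    target = directory / filename
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if target.is_file() and target.stat().st_size > 0:
            return target
        time.sleep(poll_interval)
    raise TimeoutError(f"no completion file at {target} after {timeout}s")
```

A test might call `wait_for_completion(Path("ge-ops/system/completions"), "<expected-file-name>")` after sending the trigger, then assert on the file's contents. A timeout here means the path from trigger to executor to CLI session did not complete, whichever step failed.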

NOT ACCEPTABLE: Tests that only verify the function body without verifying it's callable through the real path.

ENFORCEMENT: Marije/Judith run test suites. Koen/Eric verify test coverage in code review.