Testing Standards

TDD Approach

RULE: Write the test before the implementation. The test defines what "done" means.

RATIONALE: Writing the test first prevents scaffolding that is never integrated, because the test is the first call site for any new code.
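The test-first order can be sketched as follows. This is a minimal illustration, assuming a hypothetical `normalize_tag` function that is not part of this codebase. A pure function is used only to keep the sketch short; for a real feature the first test would normally be an integration test.

```python
import unittest


# Step 1: the test is written first. It is the first call site of the new code
# and states what "done" means.
class NormalizeTagTest(unittest.TestCase):
    def test_lowercases_and_strips_whitespace(self):
        self.assertEqual(normalize_tag("  Backend "), "backend")

    def test_rejects_empty_tag(self):
        with self.assertRaises(ValueError):
            normalize_tag("   ")


# Step 2: the implementation is added only after the test above exists
# (and has been seen to fail).
def normalize_tag(raw: str) -> str:
    """Hypothetical function, used only to illustrate the workflow."""
    tag = raw.strip().lower()
    if not tag:
        raise ValueError("tag must not be empty")
    return tag


def run_tests() -> bool:
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTagTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Running the test before step 2 exists fails with a `NameError`, which is the expected starting point: the failure confirms that the test really exercises the new code.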

Test Levels

Unit Tests (supplementary)

- Test individual functions in isolation
- Fast, numerous, focused
- NOT sufficient for feature verification
- NOT proof of life

Integration Tests (required for features)

- Test the actual system path end-to-end
- Exercise real network calls, real database operations, real Redis streams
- This IS proof of life
- Every new feature must have at least one integration test
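As a rough sketch of what "real database operations" means in a test, the example below writes through the code under test and then reads the row back over a separate connection. SQLite on a temporary file stands in for the project's actual database, and `record_event` and the `events` table are hypothetical names chosen for illustration.

```python
import os
import sqlite3
import tempfile


def record_event(db_path: str, name: str) -> None:
    """Hypothetical code under test: persists one event row."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("CREATE TABLE IF NOT EXISTS events (name TEXT NOT NULL)")
            conn.execute("INSERT INTO events (name) VALUES (?)", (name,))
    finally:
        conn.close()


def test_record_event_persists_row() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "integration.db")
        record_event(db_path, "deploy-finished")

        # Verify through a separate connection, as an outside observer would,
        # instead of trusting the return value of the function under test.
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("SELECT name FROM events").fetchall()
        finally:
            conn.close()
        assert rows == [("deploy-finished",)]
```

The design point is the independent read: a mocked connection would let the test pass even if the row never reached storage.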

Regression Tests (scheduled)

- Run on schedule (post-deployment or daily)
- Exercise all existing features through actual system paths
- A failing regression test blocks new feature work (Principle 10)

What Tests Must Prove

For a new API endpoint:

- The route is registered and reachable (curl from outside the pod)
- The handler processes a real request
- The response matches the contract schema
- The side effects (DB writes, Redis publishes) actually occurred
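The four endpoint checks can be sketched with the standard library alone. Everything named here is an assumption made for illustration: the `/items` route, the `{"id", "name"}` response contract, and the in-memory `WRITES` list that stands in for the database or Redis side effect. The server is started in-process only so the sketch runs anywhere; a real test would send the request to the deployed service from outside the pod.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

WRITES = []  # stand-in for the real side effect (DB write, Redis publish)


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/items":  # hypothetical route
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        WRITES.append(payload["name"])
        body = json.dumps({"id": len(WRITES), "name": payload["name"]}).encode()
        self.send_response(201)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass


def check_endpoint() -> dict:
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        request = urllib.request.Request(
            f"http://127.0.0.1:{server.server_port}/items",
            data=json.dumps({"name": "widget"}).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        # Bypass any proxy from the environment so the request hits the server directly.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            status = response.status
            body = json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()

    assert status == 201                      # route reachable, handler ran on a real request
    assert set(body) == {"id", "name"}        # response matches the (assumed) contract
    assert isinstance(body["id"], int)
    assert "widget" in WRITES                 # side effect actually occurred
    return body
```

Note that the side effect is checked separately from the response: a handler can return a well-formed body without ever having written anything.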

For a new agent capability:

- The trigger reaches the executor via Redis Stream
- The executor spawns a real CLI session (Claude/Codex/Gemini)
- The CLI produces real output (verified via PTY capture)
- The completion file is written to ge-ops/system/completions/
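The last check in this list, waiting for the completion file, might look like the helper below. It covers only that final step and does not attempt the Redis trigger or PTY capture. The file name passed in is hypothetical, since the naming scheme for files in `ge-ops/system/completions/` is not specified here, and the timeout values are arbitrary defaults.

```python
import time
from pathlib import Path


def wait_for_completion(
    directory: Path,
    filename: str,
    timeout_s: float = 30.0,
    poll_s: float = 0.5,
) -> Path:
    """Poll until a non-empty completion file exists, or raise TimeoutError."""
    target = directory / filename
    deadline = time.monotonic() + timeout_s
    while True:
        # Require a non-empty file so a file that was merely created
        # but never written to does not count as completion.
        if target.is_file() and target.stat().st_size > 0:
            return target
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no completion file at {target} after {timeout_s}s")
        time.sleep(poll_s)
```

An integration test would publish the trigger first, then call something like `wait_for_completion(Path("ge-ops/system/completions"), name)` with whatever file name the executor is expected to write.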

NOT ACCEPTABLE: Tests that only verify the function body without verifying it's callable through the real path.

ENFORCEMENT: Marije/Judith run test suites. Koen/Eric verify test coverage in code review.