Testing Standards
TDD Approach
RULE: Write the test before the implementation. The test defines what "done" means.
RATIONALE: Writing the test first prevents scaffolding that is never integrated, because the test is the first call site for any new code.
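As a minimal sketch of the test-first order, assume Python's built-in `unittest` and a hypothetical `normalize_email` helper (neither is named in this standard). The test class is written first and acts as the first call site. The function below it is added only afterwards, and only to make the tests pass.

```python
import unittest


class NormalizeEmailTest(unittest.TestCase):
    """Written first: these tests are the definition of "done"."""

    def test_lowercases_and_strips_whitespace(self):
        # First call site for the new code; fails until the function exists.
        self.assertEqual(normalize_email("  Ada@Example.COM "), "ada@example.com")

    def test_rejects_address_without_at_sign(self):
        with self.assertRaises(ValueError):
            normalize_email("not-an-email")


def normalize_email(raw: str) -> str:
    """Written second: the smallest implementation that satisfies the tests."""
    cleaned = raw.strip().lower()
    if "@" not in cleaned:
        raise ValueError(f"not an email address: {raw!r}")
    return cleaned
```

The example is unit-sized only to keep it short. The same order applies when the first test is an integration test.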
Test Levels
Unit Tests (supplementary)
- Test individual functions in isolation
- Fast, numerous, focused
- NOT sufficient for feature verification
- NOT proof of life
Integration Tests (required for features)
- Test the actual system path end-to-end
- Exercise real network calls, real database operations, real Redis streams
- This IS proof of life
- Every new feature must have at least one integration test
Regression Tests (scheduled)
- Run on schedule (post-deployment or daily)
- Exercise all existing features through actual system paths
- A failing regression test blocks new feature work (Principle 10)
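One way the blocking rule could be automated is a small gate that a scheduler calls after each regression run. The sketch below assumes Python's `unittest`. The `regression_gate` function and the two check functions are hypothetical, and the check bodies are stand-ins for tests that would exercise the deployed system.

```python
import unittest


def regression_gate(suite: unittest.TestSuite) -> tuple[bool, list[str]]:
    """Run the suite and return (feature_work_allowed, ids of broken tests)."""
    result = unittest.TestResult()
    suite.run(result)
    broken = [test.id() for test, _traceback in result.failures + result.errors]
    return result.wasSuccessful(), broken


def check_login_still_works():
    # Stand-in body; a real regression test would call the deployed system.
    assert "session".upper() == "SESSION"


def check_export_still_works():
    # Simulates an existing feature that has regressed.
    raise AssertionError("export endpoint returned no rows")


healthy = unittest.TestSuite([unittest.FunctionTestCase(check_login_still_works)])
regressed = unittest.TestSuite([
    unittest.FunctionTestCase(check_login_still_works),
    unittest.FunctionTestCase(check_export_still_works),
])

allowed_when_healthy, _ = regression_gate(healthy)
allowed_when_regressed, broken_tests = regression_gate(regressed)
```

A scheduled job could turn the first element into an exit code, so that a regressed suite stops the pipeline that admits new feature work.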
What Tests Must Prove
For a new API endpoint:
- The route is registered and reachable (curl from outside the pod)
- The handler processes a real request
- The response matches the contract schema
- The side effects (DB writes, Redis publishes) actually occurred
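The endpoint checks above can be sketched as a single test that sends a real HTTP request over a socket and then inspects the database. Everything concrete here is an assumption made so the sketch runs on its own: the `/items` route, the `items` table, SQLite standing in for the real database, and a loopback server standing in for a request made from outside the pod. The Redis publish check is omitted because it needs a running Redis server.

```python
import http.client
import json
import sqlite3
import tempfile
import threading
from contextlib import closing
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

DB_PATH = Path(tempfile.mkdtemp()) / "items.db"  # stand-in for the real database


def connect():
    return closing(sqlite3.connect(DB_PATH))


class ItemHandler(BaseHTTPRequestHandler):
    """Hypothetical service under test: POST /items stores one row."""

    def do_POST(self):
        if self.path != "/items":
            self.send_error(404)
            return
        length = int(self.headers["Content-Length"])
        name = json.loads(self.rfile.read(length))["name"]
        with connect() as conn:
            item_id = conn.execute("INSERT INTO items (name) VALUES (?)", (name,)).lastrowid
            conn.commit()
        payload = json.dumps({"id": item_id, "name": name}).encode()
        self.send_response(201)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def test_create_item_endpoint() -> dict:
    with connect() as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
        conn.commit()
    server = ThreadingHTTPServer(("127.0.0.1", 0), ItemHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        client.request("POST", "/items", body=json.dumps({"name": "widget"}).encode(),
                       headers={"Content-Type": "application/json"})
        response = client.getresponse()
        document = json.loads(response.read())
        client.close()
        # Checks 1 and 2: the route is reachable and the handler served a real request.
        assert response.status == 201
        # Check 3: the response matches the (hypothetical) contract schema.
        assert set(document) == {"id", "name"}
        assert isinstance(document["id"], int) and isinstance(document["name"], str)
        # Check 4: the side effect actually happened in the database.
        with connect() as conn:
            row = conn.execute("SELECT name FROM items WHERE id = ?", (document["id"],)).fetchone()
        assert row == ("widget",)
        return document
    finally:
        server.shutdown()
        server.server_close()
```

In a real suite the client would target the deployed service address rather than a server started inside the test, which is what makes the reachability check meaningful.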
For a new agent capability:
- The trigger reaches the executor via Redis Stream
- The executor spawns a real CLI session (Claude/Codex/Gemini)
- The CLI produces real output (verified via PTY capture)
- The completion file is written to ge-ops/system/completions/
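The last two agent checks (PTY capture and the completion file) can be sketched with Python's standard `pty` module on a Unix-like system. This is an illustration under stated assumptions, not the project's executor: `run_in_pty` and `execute_trigger` are hypothetical, a short Python child process stands in for the real CLI session, and the completion file name and JSON fields are invented for the example. Delivery through the Redis Stream is left out because it needs a running Redis server.

```python
import json
import os
import pty
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_pty(command: list[str]) -> tuple[int, str]:
    """Run a command attached to a pseudo-terminal; return (exit code, output)."""
    master_fd, slave_fd = pty.openpty()
    try:
        process = subprocess.Popen(command, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    finally:
        os.close(slave_fd)  # the child keeps its own copy of the terminal
    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                break  # Linux reports EIO once the child side is closed
            if not data:
                break  # other platforms signal the same thing with an empty read
            chunks.append(data)
    finally:
        os.close(master_fd)
    return process.wait(), b"".join(chunks).decode(errors="replace")


def execute_trigger(trigger_id: str, command: list[str], root: Path) -> Path:
    """Stand-in executor: run the session, then write a completion file."""
    exit_code, output = run_in_pty(command)
    completions = root / "ge-ops" / "system" / "completions"
    completions.mkdir(parents=True, exist_ok=True)
    completion_file = completions / f"{trigger_id}.json"  # hypothetical naming scheme
    completion_file.write_text(json.dumps(
        {"trigger_id": trigger_id, "exit_code": exit_code, "output": output}))
    return completion_file


def test_agent_capability() -> dict:
    root = Path(tempfile.mkdtemp())
    stand_in_cli = [sys.executable, "-c", "print('agent-ok')"]
    completion_file = execute_trigger("demo-1", stand_in_cli, root)
    record = json.loads(completion_file.read_text())
    assert record["exit_code"] == 0
    assert "agent-ok" in record["output"]  # real output, captured through the PTY
    assert completion_file.parent == root / "ge-ops" / "system" / "completions"
    return record
```

The read loop has no timeout, so a production test would likely add one to avoid hanging on a stuck session.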
NOT ACCEPTABLE: Tests that only verify the function body without verifying it's callable through the real path.
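To illustrate the gap, the toy example below uses a hypothetical in-process route table (`ROUTES`, `route`, `dispatch`) rather than a real web framework. A handler whose registration was forgotten passes a body-only test, while a test that goes through the route lookup exposes the problem.

```python
ROUTES = {}  # hypothetical route table: path -> handler


def route(path):
    """Register a handler under a path."""
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register


@route("/health")
def health():
    return {"status": "ok"}


def export_report():
    # The @route decorator was forgotten, so nothing can reach this handler.
    return {"rows": 3}


def dispatch(path):
    """Resolve a path the way a caller would; return (status, body)."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, None
    return 200, handler()


# Not acceptable: passes, yet proves only that the function body works.
assert export_report() == {"rows": 3}

# Required: go through the route table; this exposes the missing registration.
status, body = dispatch("/reports/export")
print(status)  # 404
```

In a real suite the second check would be a request against the deployed route, not an in-process lookup.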
ENFORCEMENT: Marije/Judith run test suites. Koen/Eric verify test coverage in code review.