Pitfall: Mutation Testing¶
Rule¶
Every code change MUST maintain a mutation kill rate of at least 80%. No exceptions.
What Is Mutation Testing¶
Mutation testing injects small bugs (mutants) into your code — flipping operators, removing statements, changing constants. If your tests don't catch the injected bug, the mutant "survives" and your tests have a blind spot.
A mutation score of 80% means: 80% of injected bugs are caught by your test suite.
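To make the score concrete, here is a minimal hand-rolled sketch of the idea. It is not how Stryker or mutmut work internally; the function `is_adult`, the three hand-written mutants, and the helper `kill_rate` are all hypothetical names invented for this illustration.

```python
from typing import Callable, List

Predicate = Callable[[int], bool]


def is_adult(age: int) -> bool:
    """Original implementation (hypothetical example function)."""
    return age >= 18


# Hand-written mutants: each is the original with one small injected bug.
def mutant_flip_operator(age: int) -> bool:
    return age > 18  # '>=' changed to '>'


def mutant_change_constant(age: int) -> bool:
    return age >= 19  # 18 changed to 19


def mutant_negate(age: int) -> bool:
    return not (age >= 18)  # result negated


MUTANTS: List[Predicate] = [
    mutant_flip_operator,
    mutant_change_constant,
    mutant_negate,
]


def weak_test(fn: Predicate) -> bool:
    """Returns True when the test passes. Only checks values far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_test(fn: Predicate) -> bool:
    """Returns True when the test passes. Also pins down the boundary at 18."""
    return fn(18) is True and fn(17) is False and fn(30) is True


def kill_rate(test: Callable[[Predicate], bool], mutants: List[Predicate]) -> float:
    """Percentage of mutants for which the test fails (a failing test kills the mutant)."""
    killed = sum(1 for mutant in mutants if not test(mutant))
    return 100.0 * killed / len(mutants)


if __name__ == "__main__":
    print(f"weak test:   {kill_rate(weak_test, MUTANTS):.2f}% killed")
    print(f"strong test: {kill_rate(strong_test, MUTANTS):.2f}% killed")
```

Both tests pass against the original `is_adult`, but the weak test lets the operator and constant mutants survive because it never exercises the boundary value. That is the kind of blind spot a low mutation score points at.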
Tools¶
| Stack | Tool | Config | Threshold |
|---|---|---|---|
| TypeScript (admin-ui) | Stryker | admin-ui/stryker.config.mjs | break: 80 |
| Python (ge-bootstrap) | mutmut 3.x | setup.cfg [mutmut] | TDD_MUTATION_THRESHOLD=80 |
Pre-Push Check¶
Common Mistakes¶
Shipping code without tests¶
Every mutant in an untested file survives, so new files with 0% coverage drag the overall score down. Always ship tests with code. This is the #1 cause of pipeline failures.
mutmut 3.x paths_to_mutate format¶
mutmut 3.x does NOT support comma-separated paths in setup.cfg. Use a directory:
# WRONG (treats entire string as one path):
paths_to_mutate=ge_orchestrator/a.py,ge_orchestrator/b.py
# RIGHT:
paths_to_mutate=ge_orchestrator
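A comma-separated value like the WRONG example is easy to miss in review. The sketch below is one possible guard, assuming the config lives in a `[mutmut]` section of `setup.cfg` as described above. The helper name `comma_separated_paths` is hypothetical and not part of mutmut; it only uses the standard-library `configparser`.

```python
import configparser
from typing import List


def comma_separated_paths(setup_cfg_text: str) -> List[str]:
    """Return the individual entries if paths_to_mutate contains commas, else [].

    Hypothetical helper for a pre-push or CI sanity check; it does not call mutmut.
    """
    # interpolation=None so that '%' characters in other options cannot raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(setup_cfg_text)
    value = parser.get("mutmut", "paths_to_mutate", fallback="")
    if "," not in value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    with open("setup.cfg", encoding="utf-8") as handle:
        offenders = comma_separated_paths(handle.read())
    if offenders:
        raise SystemExit(
            f"paths_to_mutate must not be comma-separated, found: {offenders}"
        )
```

Run from the directory that contains `setup.cfg`, the script exits non-zero when it finds a comma-separated value, so it can fail fast before a long mutation run starts.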
mutmut sandbox and imports¶
mutmut 3.x copies tests to a mutants/ sandbox but NOT the source package. A conftest.py in the test directory must add the project root to sys.path:
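import os
import sys
from pathlib import Path
# mutmut runs the tests from inside mutants/, so the project root is one level up.
_cwd = Path(os.getcwd())
_project_root = str(_cwd.parent) if _cwd.name == "mutants" else str(_cwd)
# Make the source package importable from the sandbox.
if _project_root not in sys.path:
    sys.path.insert(0, _project_root)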
Tests that don't actually test behavior¶
If tests pass but mutation score is low, your tests are likely:
- Checking types/shapes instead of values
- Using mocks that don't verify calls
- Testing happy path only (no edge cases)
- Asserting existence instead of correctness
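The contrast between tests that merely pass and tests that kill mutants can be sketched as follows. `apply_discount` and `notify` are hypothetical functions invented for this example, not code from either stack; the mock comes from the standard-library `unittest.mock`.

```python
from unittest.mock import Mock


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    return round(price * (1 - percent / 100), 2)


def notify(total: float, sender) -> None:
    """Hypothetical collaborator call that a test should verify."""
    sender.send(f"total={total}")


def test_weak() -> None:
    # Passes for almost any mutant of apply_discount: it checks shape, not value.
    result = apply_discount(200.0, 10)
    assert isinstance(result, float)
    assert result is not None
    # The mock accepts any call, and nothing is verified afterwards.
    sender = Mock()
    notify(result, sender)


def test_strong() -> None:
    # Exact values, including the edge cases at 0% and 100%.
    assert apply_discount(200.0, 10) == 180.0
    assert apply_discount(200.0, 0) == 200.0
    assert apply_discount(200.0, 100) == 0.0
    # The mock call is verified, so removing or altering send() fails the test.
    sender = Mock()
    notify(180.0, sender)
    sender.send.assert_called_once_with("total=180.0")
```

If a mutation tool changed `1 - percent / 100` to `1 + percent / 100`, `test_weak` would still pass because the result is still a float, while `test_strong` would fail on the first assertion.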
CI Tier¶
- mutation:typescript runs in the STANDARD tier (merge-blocking on every MR)
- test:mutation (Python) runs in the FULL tier (nightly + manual, due to CPU cost)
Incident History¶
- 2026-04-10: Score dropped 50.71% → 49.13% when ETF Phase 2 + agent backfill merged without mutation tests. Blocked the pipeline for the entire CI/CD fix session.