Hook Loop Prevention¶
The Problem¶
Post-completion hooks created feedback loops between monitoring agents, burning $1000+ in 30 days.
The cycle: Annegreet completes session analysis → hook triggers Eltjo for log analysis → Eltjo completes → hook triggers Annegreet for knowledge extraction → repeat forever.
Why existing defenses failed:
- exclude_self: true only blocks A→A, not A→B→A
- hook_origin_depth resets to 0 for each new completion. When Eltjo triggers Annegreet, Annegreet's resulting completion is still depth=0, because it is a new session with a new completion ID
- Per-completion idempotency does not help: each completion has a unique ID, so dedup doesn't catch the cross-chain pattern
- Circuit breaker (10 triggers/completion) is irrelevant: every completion in the loop is a different one, so no single completion ever reaches the limit
- Rate limit (20/hr) is too high: at the cap, 20 triggers/hr × 2 agents × $2.50/session = $100/hr token burn
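To make the failure mode concrete, here is a minimal simulation of the ping-pong under the old per-completion checks. It is a sketch, not the dispatcher's code: the `HOOKS` table, `simulate`, and the way each check is modelled are assumptions made for illustration.

```python
import itertools

# Hypothetical hook table: a completion by the key agent triggers the value agent.
HOOKS = {"annegreet": "eltjo", "eltjo": "annegreet"}
CIRCUIT_BREAKER_LIMIT = 10  # triggers allowed per completion


def simulate(max_steps: int) -> int:
    """Return how many hooks fire before any of the old checks stops the loop."""
    completion_ids = itertools.count(1)
    seen_completions = set()
    triggers_per_completion = {}
    agent = "annegreet"
    fired = 0
    for _ in range(max_steps):
        completion_id = next(completion_ids)  # every session gets a fresh ID
        target = HOOKS[agent]
        if target == agent:  # exclude_self: only catches A→A
            break
        if completion_id in seen_completions:  # per-completion idempotency
            break
        if triggers_per_completion.get(completion_id, 0) >= CIRCUIT_BREAKER_LIMIT:
            break  # circuit breaker counts per completion, so it never trips
        seen_completions.add(completion_id)
        triggers_per_completion[completion_id] = 1
        fired += 1
        agent = target  # the triggered agent completes next and fires its own hook
    return fired


# Worst-case hourly spend at the old rate limit, using the figures above.
hourly_cost = 20 * 2 * 2.50
```

Every step passes all three checks because each check is keyed on something that is new at every hop (the agent pair or the completion ID), so `simulate(n)` fires `n` hooks for any `n`.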
The Fix (Three Layers)¶
Layer 1: Monitoring Agent Isolation (Hard Rule)¶
Agents in the monitoring_agent_isolation set in config/post-completion-hooks.yaml can NEVER trigger each other via hooks. They observe and document — they don't create work for each other.
Agents: annegreet, eltjo, victoria, nessa, mira
This breaks the A→B→A loop at the category level, regardless of timing or chain depth. A non-monitoring agent (e.g. urszula) CAN still trigger monitoring agents.
Source of truth: config/post-completion-hooks.yaml → monitoring_agent_isolation list.
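The isolation rule amounts to a set-membership check on both ends of the hook. The sketch below hardcodes the agent list for illustration; the real list is read from config/post-completion-hooks.yaml, and the function name `isolation_blocks` is hypothetical. Treating monitoring→regular hooks as allowed is also an assumption, since the rule above only forbids monitoring agents from triggering each other.

```python
# Hardcoded here for illustration; the source of truth is the YAML config.
MONITORING_AGENT_ISOLATION = frozenset(
    {"annegreet", "eltjo", "victoria", "nessa", "mira"}
)


def isolation_blocks(source: str, target: str,
                     isolated: frozenset = MONITORING_AGENT_ISOLATION) -> bool:
    """Return True when a hook source→target must be blocked by Layer 1.

    A hook is blocked only when BOTH agents are in the isolation set.
    """
    return source in isolated and target in isolated
```

For example, `isolation_blocks("annegreet", "eltjo")` is `True`, while `isolation_blocks("urszula", "annegreet")` is `False`.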
Layer 2: Cross-Chain Cycle Detection (Redis Trigger Graph)¶
A Redis sorted set tracks which agents triggered which, with timestamps as scores.
- Key: hook:trigger_graph:{source_agent} = sorted set of targets with timestamps
- Window: 30 minutes (configurable via cycle_detection_window_seconds)
- Before firing hook A→B: check if B→A (direct) or B→...→A (transitive, BFS depth ≤ 5) exists in the graph
This catches:
- A→B→A (direct cycle)
- A→B→C→A (3-agent cycle)
- Any N-agent cycle up to depth 5
Fails closed: if Redis is unavailable, hooks are blocked rather than allowed through.
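The check can be sketched as a bounded breadth-first search over recent edges. To stay self-contained, this example uses a plain dict in place of Redis (`graph[source][target] = timestamp`), and passing `None` as the graph stands in for a Redis outage. All names here (`allow_hook`, `would_create_cycle`, `GraphUnavailable`) are hypothetical and not taken from hooks.py.

```python
import time
from collections import deque

WINDOW_SECONDS = 30 * 60  # mirrors cycle_detection_window_seconds
MAX_DEPTH = 5             # longest path (in edges) the search will follow


class GraphUnavailable(Exception):
    """Stands in for a Redis connection error in this sketch."""


def recent_targets(graph, agent, now):
    """Targets that `agent` triggered inside the window (a score-range read)."""
    if graph is None:
        raise GraphUnavailable("trigger graph store is unreachable")
    cutoff = now - WINDOW_SECONDS
    return [t for t, ts in graph.get(agent, {}).items() if ts >= cutoff]


def would_create_cycle(graph, source, target, now):
    """True if a path target→...→source of at most MAX_DEPTH edges exists."""
    queue = deque([(target, 0)])
    visited = {target}
    while queue:
        agent, depth = queue.popleft()
        if depth >= MAX_DEPTH:
            continue
        for nxt in recent_targets(graph, agent, now):
            if nxt == source:
                return True
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, depth + 1))
    return False


def allow_hook(graph, source, target, now=None):
    """Decide whether source→target may fire; record the edge if it may."""
    now = time.time() if now is None else now
    try:
        if would_create_cycle(graph, source, target, now):
            return False
    except GraphUnavailable:
        return False  # fail closed: no graph, no hook
    graph.setdefault(source, {})[target] = now  # timestamp is the score
    return True
```

With an empty graph, `allow_hook(g, "a", "b")` succeeds and records the edge, after which `allow_hook(g, "b", "a")` is refused until the a→b edge ages out of the window.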
Layer 3: Tiered Rate Limits¶
- Monitoring agents: 5 hooks/agent/hour (was 20)
- Regular agents: 20 hooks/agent/hour (unchanged)
Source of truth: config/post-completion-hooks.yaml → monitoring_agent_rate_limit and default_agent_rate_limit.
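A minimal sketch of the tiered limit follows, using an in-memory sliding one-hour window. The limits are hardcoded from the values above, but the class name `HookRateLimiter`, the sliding-window approach, and which agent the counter is keyed on are assumptions; the real dispatcher may count differently.

```python
from collections import defaultdict, deque

MONITORING_AGENTS = frozenset({"annegreet", "eltjo", "victoria", "nessa", "mira"})
MONITORING_AGENT_RATE_LIMIT = 5   # hooks per agent per hour
DEFAULT_AGENT_RATE_LIMIT = 20     # hooks per agent per hour
WINDOW_SECONDS = 3600


class HookRateLimiter:
    """Sliding-window counter with a lower cap for monitoring agents."""

    def __init__(self):
        self._fired = defaultdict(deque)  # agent -> timestamps of recent hooks

    def limit_for(self, agent: str) -> int:
        if agent in MONITORING_AGENTS:
            return MONITORING_AGENT_RATE_LIMIT
        return DEFAULT_AGENT_RATE_LIMIT

    def try_acquire(self, agent: str, now: float) -> bool:
        """Return True and record the hook if `agent` is under its hourly cap."""
        stamps = self._fired[agent]
        # Drop timestamps that have aged out of the one-hour window.
        while stamps and stamps[0] <= now - WINDOW_SECONDS:
            stamps.popleft()
        if len(stamps) >= self.limit_for(agent):
            return False
        stamps.append(now)
        return True
```

Under this sketch a monitoring agent's sixth hook within an hour is refused, while a regular agent is refused only on its twenty-first.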
How to Verify¶
# Run the test suite (12 tests)
PYTHONPATH=/home/claude/ge-bootstrap python3 -m pytest tests/test_hook_loop_prevention.py -v
# Tests cover:
# - Monitoring isolation: annegreet↔eltjo blocked, victoria↔mira blocked
# - Non-monitoring→monitoring: urszula→annegreet allowed
# - Direct cycle: A→B→A blocked
# - Transitive cycle: A→B→C→A blocked
# - 5-agent ring: A→B→C→D→E→A blocked
# - Non-cyclic: allowed (marije→urszula when no reverse edge)
# - Rate limit: 5/hr for monitoring, 20/hr for regular
# - Legitimate chain: urszula→koen→marije works
What to Watch For¶
- New monitoring agents: if you add a monitoring/governance agent, add it to monitoring_agent_isolation in config/post-completion-hooks.yaml
- Escalation hooks: the escalation section in post-completion-hooks.yaml notifies mira on critical failures. Mira is in the isolation set, so she won't create hook loops, but be aware of this interaction
- Redis key growth: trigger graph keys expire after window + 60s. Monitor hook:trigger_graph:* key count with: redis-cli --scan --pattern 'hook:trigger_graph:*' | wc -l
- False positives: if a legitimate chain gets blocked by cycle detection, the 30-minute window may be too wide. Adjust cycle_detection_window_seconds in config.
Re-enabling Disabled Agents¶
See the re-enablement checklist at the bottom of Agent System Pitfalls.
Code Locations¶
- Fix: ge_orchestrator/completion_scanner/hooks.py (PostCompletionHookDispatcher)
- Config: config/post-completion-hooks.yaml (monitoring_agent_isolation, rate limits, cycle window)
- Tests: tests/test_hook_loop_prevention.py (12 tests)
- Models: ge_orchestrator/completion_scanner/models.py (CompletionEvent.hook_origin_depth)