Hook Loop Prevention

The Problem

Post-completion hooks created feedback loops between monitoring agents, burning $1000+ in 30 days.

The cycle: Annegreet completes session analysis → hook triggers Eltjo for log analysis → Eltjo completes → hook triggers Annegreet for knowledge extraction → repeat forever.

Why existing defenses failed:

  • exclude_self: true only blocks A→A, not A→B→A.
  • hook_origin_depth resets to 0 for each new completion. When Eltjo triggers Annegreet, her completion is still depth=0 because it is a new session with a new completion ID.
  • Per-completion idempotency: each completion has a unique ID, so dedup doesn't catch the cross-chain pattern.
  • Circuit breaker (10 triggers/completion): irrelevant, because each completion is different.
  • Rate limit (20/hr): too high. 20 triggers/hr × 2 agents × $2.50/session = $100/hr token burn.
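The rate-limit arithmetic can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is hypothetical, and the inputs are the figures quoted in this document (20/hr old limit, 5/hr monitoring limit from Layer 3, 2 agents, $2.50 per session).

```python
def worst_case_burn_per_hour(triggers_per_hour, agents_in_loop, cost_per_session):
    """Upper bound on hourly spend if every allowed trigger fires (hypothetical helper)."""
    return triggers_per_hour * agents_in_loop * cost_per_session


# Old limit: 20 triggers/hr across the two looping agents.
old_burn = worst_case_burn_per_hour(20, 2, 2.50)   # 100.0 USD/hr

# Monitoring limit from Layer 3: 5 triggers/hr.
new_burn = worst_case_burn_per_hour(5, 2, 2.50)    # 25.0 USD/hr
```

Even the tighter limit only caps the burn rate, which is why the rate limit is the last layer rather than the first.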

The Fix (Three Layers)

Layer 1: Monitoring Agent Isolation (Hard Rule)

Agents in the monitoring_agent_isolation set in config/post-completion-hooks.yaml can NEVER trigger each other via hooks. They observe and document — they don't create work for each other.

Agents: annegreet, eltjo, victoria, nessa, mira

This breaks the A→B→A loop at the category level, regardless of timing or chain depth. A non-monitoring agent (e.g. urszula) CAN still trigger monitoring agents.

Source of truth: the monitoring_agent_isolation list in config/post-completion-hooks.yaml.
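As a rough illustration of the isolation rule, the check reduces to a set-membership test on both ends of the hook. The function name and the hard-coded set below are assumptions made for this sketch; the real dispatcher reads the list from the config file, and its actual code may look different.

```python
# Hard-coded here for illustration; the real list lives in
# config/post-completion-hooks.yaml under monitoring_agent_isolation.
MONITORING_AGENT_ISOLATION = frozenset(
    {"annegreet", "eltjo", "victoria", "nessa", "mira"}
)


def blocked_by_isolation(source_agent, target_agent,
                         isolation=MONITORING_AGENT_ISOLATION):
    """Return True when a monitoring agent would trigger another monitoring agent.

    Hypothetical helper: only hooks where BOTH ends are in the isolation
    set are blocked, so non-monitoring agents can still trigger monitors.
    """
    return source_agent in isolation and target_agent in isolation


blocked_by_isolation("annegreet", "eltjo")    # True: monitoring -> monitoring
blocked_by_isolation("urszula", "annegreet")  # False: regular -> monitoring
```

Because the check ignores timing and chain depth entirely, it cannot be defeated by the depth-reset problem described earlier.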

Layer 2: Cross-Chain Cycle Detection (Redis Trigger Graph)

A Redis sorted set tracks which agents triggered which, with timestamps as scores.

  • Key: hook:trigger_graph:{source_agent} = sorted set of targets with timestamps
  • Window: 30 minutes (configurable via cycle_detection_window_seconds)
  • Before firing hook A→B: check if B→A (direct) or B→...→A (transitive, BFS depth ≤ 5) exists in the graph

This catches:

  • A→B→A (direct cycle)
  • A→B→C→A (3-agent cycle)
  • Any N-agent cycle up to depth 5

Fails closed: if Redis is unavailable, hooks are blocked rather than allowed through.
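The cycle check can be sketched with an in-memory graph standing in for the Redis sorted sets. Everything named here (TriggerGraph, would_create_cycle, the constants) is hypothetical and chosen for illustration. With Redis, record would roughly correspond to ZADD and recent_targets to ZRANGEBYSCORE on hook:trigger_graph:{source_agent}, and the caught error would be the Redis client's connection error rather than the built-in ConnectionError.

```python
import time
from collections import deque

WINDOW_SECONDS = 30 * 60  # mirrors cycle_detection_window_seconds
MAX_DEPTH = 5             # BFS depth limit from the description above


class TriggerGraph:
    """In-memory stand-in for the per-source sorted sets (illustrative only)."""

    def __init__(self):
        # source_agent -> {target_agent: timestamp of the latest trigger}
        self.edges = {}

    def record(self, source, target, now=None):
        now = time.time() if now is None else now
        self.edges.setdefault(source, {})[target] = now

    def recent_targets(self, source, now):
        cutoff = now - WINDOW_SECONDS
        return [t for t, ts in self.edges.get(source, {}).items() if ts >= cutoff]


def would_create_cycle(graph, source, target, now=None):
    """True if firing source->target would close a cycle.

    Searches breadth-first from `target` for a recent path back to `source`,
    following at most MAX_DEPTH edges.
    """
    now = time.time() if now is None else now
    try:
        queue = deque([(target, 0)])
        seen = {target}
        while queue:
            agent, depth = queue.popleft()
            if depth >= MAX_DEPTH:
                continue
            for nxt in graph.recent_targets(agent, now):
                if nxt == source:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
        return False
    except ConnectionError:
        # Fail closed: if the graph store is unreachable, block the hook.
        return True
```

A short usage example: after `graph.record("annegreet", "eltjo")`, the call `would_create_cycle(graph, "eltjo", "annegreet")` returns True, so the Eltjo→Annegreet hook would be blocked until the edge ages out of the window.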

Layer 3: Tiered Rate Limits

  • Monitoring agents: 5 hooks/agent/hour (was 20)
  • Regular agents: 20 hooks/agent/hour (unchanged)

Source of truth: monitoring_agent_rate_limit and default_agent_rate_limit in config/post-completion-hooks.yaml.
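One way to picture the tiered limits is a sliding one-hour window per agent, with the ceiling chosen by agent category. This is a minimal sketch under assumptions: the class name is hypothetical, state is kept in memory rather than in Redis, and this document does not say whether the real dispatcher counts hooks by source or by target agent, so the sketch simply keys on an agent name.

```python
from collections import defaultdict, deque

MONITORING_AGENTS = {"annegreet", "eltjo", "victoria", "nessa", "mira"}
MONITORING_AGENT_RATE_LIMIT = 5   # hooks/agent/hour
DEFAULT_AGENT_RATE_LIMIT = 20     # hooks/agent/hour
WINDOW_SECONDS = 3600


class TieredRateLimiter:
    """Sliding-window limiter with a lower ceiling for monitoring agents."""

    def __init__(self):
        # agent -> timestamps of hooks allowed within the current window
        self._fired = defaultdict(deque)

    def limit_for(self, agent):
        if agent in MONITORING_AGENTS:
            return MONITORING_AGENT_RATE_LIMIT
        return DEFAULT_AGENT_RATE_LIMIT

    def allow(self, agent, now):
        """Record and allow the hook if the agent is under its hourly limit."""
        times = self._fired[agent]
        # Drop timestamps that have aged out of the window.
        while times and times[0] <= now - WINDOW_SECONDS:
            times.popleft()
        if len(times) >= self.limit_for(agent):
            return False
        times.append(now)
        return True
```

With this sketch, the sixth hook for eltjo within an hour is rejected, while urszula can still fire up to 20.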

How to Verify

# Run the test suite (12 tests)
PYTHONPATH=/home/claude/ge-bootstrap python3 -m pytest tests/test_hook_loop_prevention.py -v

# Tests cover:
# - Monitoring isolation: annegreet↔eltjo blocked, victoria↔mira blocked
# - Non-monitoring→monitoring: urszula→annegreet allowed
# - Direct cycle: A→B→A blocked
# - Transitive cycle: A→B→C→A blocked
# - 5-agent ring: A→B→C→D→E→A blocked
# - Non-cyclic: allowed (marije→urszula when no reverse edge)
# - Rate limit: 5/hr for monitoring, 20/hr for regular
# - Legitimate chain: urszula→koen→marije works

What to Watch For

  1. New monitoring agents — if you add a monitoring/governance agent, add it to monitoring_agent_isolation in config/post-completion-hooks.yaml
  2. Escalation hooks — the escalation section in post-completion-hooks.yaml notifies mira on critical failures. Mira is in the isolation set, so she won't create hook loops, but be aware of this interaction
  3. Redis key growth — trigger graph keys expire after window + 60s. Monitor hook:trigger_graph:* key count with: redis-cli --scan --pattern 'hook:trigger_graph:*' | wc -l
  4. False positives — if a legitimate chain gets blocked by cycle detection, the 30-minute window may be too wide. Adjust cycle_detection_window_seconds in config.

Re-enabling Disabled Agents

See the re-enablement checklist at the bottom of Agent System Pitfalls.

Code Locations

  • Fix: PostCompletionHookDispatcher in ge_orchestrator/completion_scanner/hooks.py
  • Config: config/post-completion-hooks.yaml (monitoring_agent_isolation, rate limits, cycle window)
  • Tests: tests/test_hook_loop_prevention.py (12 tests)
  • Models: CompletionEvent.hook_origin_depth in ge_orchestrator/completion_scanner/models.py