Agent System Pitfalls
Hook Loop: Annegreet/Eltjo Token Burn
ISSUE: Post-completion hooks with condition "always" at no_block tier created infinite Annegreet-Eltjo feedback loop
COST: $1000+ in 30 days from uncontrolled token burn
FIX DEPLOYED: hook_origin_depth prevents cross-trigger, per-agent rate limit 20 hooks/hr
CURRENT STATE: All 4 agents RE-ENABLED 2026-02-15 after 3-layer hook loop fix (hook_origin_depth, per-agent rate limit, no_block depth cap). See Hook Loops for details.
RULE: NEVER add post-completion hook with condition "always" at no_block tier
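The two checks named above, hook_origin_depth and the 20 hooks/hr per-agent limit, could look roughly like the sketch below. It is an illustration only, not the deployed code: `HookGuard`, `allow`, and `MAX_ORIGIN_DEPTH` are hypothetical names, and refusing every hook whose origin depth is above zero is an assumption about how cross-triggering is blocked. The no_block depth cap is not modelled here.

```python
import time
from collections import defaultdict, deque

MAX_HOOKS_PER_HOUR = 20  # per-agent limit stated in this section
MAX_ORIGIN_DEPTH = 0     # assumption: work started by a hook may not fire hooks
WINDOW_SECONDS = 3600


class HookGuard:
    """Hypothetical gate deciding whether a post-completion hook may fire."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._fired = defaultdict(deque)  # agent name -> recent fire times

    def allow(self, agent: str, hook_origin_depth: int) -> bool:
        # Layer 1: refuse hooks triggered by hook-originated work.
        if hook_origin_depth > MAX_ORIGIN_DEPTH:
            return False

        # Layer 2: sliding one-hour window per agent.
        now = self._clock()
        window = self._fired[agent]
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if len(window) >= MAX_HOOKS_PER_HOUR:
            return False

        window.append(now)
        return True
```

With a guard like this, a completion event produced by a hook (depth 1 or more) cannot schedule another hook, and even a misconfigured depth-0 hook stops after 20 firings in any one-hour window.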
CronJobs (Active Since 2026-02-15)
HISTORY: All GE CronJobs except the two listed under ALWAYS ACTIVE were suspended from 2026-02-02 to 2026-02-15 due to hook loop token burn
STATUS: All unsuspended and active as of 2026-02-15, after the hook loop fix was deployed
RUNNING: executor-refresh, health-check, zombie-cleanup, learning-backlog-monitor, learning-struggle-detector, learning-wiki-writer
ALWAYS ACTIVE: vault-unseal CronJob (ge-system) and gitlab-toolbox-backup (ge-gitlab) were never suspended
Simulation Anti-Patterns (Fake Proof of Life)
These patterns are DECEPTIVE. They produce correct-looking output through the wrong path:
- Inserting rows directly into PostgreSQL to make Admin UI display a "working" feature
- Agent reporting "I'm on OpenAI" because config says so without verifying CLI works
- Discussion system "works" because test script called API directly instead of triggering real agents
- Billing dashboard shows costs from POSTed test data, not actual agent sessions
- Provider switching "works" because dropdown saves to DB, but executor never invokes selected CLI
Legacy Identity File Confusion (Three Copies Existed)
ISSUE: Before the wiki brain migration, agent identities existed in THREE locations:
1. /home/claude/ge-bootstrap/identities/{name}/IDENTITY.md — oldest, root-level (ARCHIVED)
2. /home/claude/ge-bootstrap/ge-ops/identities/{name}/IDENTITY.md — partial, 9 agents (ARCHIVED)
3. /home/claude/ge-bootstrap/ge-ops/master/agent-configs/{name}/IDENTITY.md — single-file alongside tiered (ARCHIVED)
CURRENT: Only the tiered files are authoritative: IDENTITY-CORE.md, IDENTITY-ROLE.md, IDENTITY-REFERENCE.md
LOCATION: ge-ops/master/agent-configs/{name}/
NOTE: File names are IDENTITY-CORE.md (not CORE.md) — the INFRA-OVERVIEW.md had this wrong
LEARNINGS.md Path Mismatch (Fixed 2026-02-16)
AUTHORITATIVE: ge-ops/master/agent-configs/{name}/LEARNINGS.md — where agents write real learnings
STUBS: ge-ops/agents/{name}/LEARNINGS.md — empty 9-line stubs, NOT authoritative
FIXED: Identity loader reads from agent-configs/ (primary), agents/ as fallback. See EVO-2026-0216-007.
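The primary/fallback lookup described above might be sketched like this. It is a minimal illustration under assumptions: `resolve_learnings` and the `root` parameter are hypothetical, and the real loader in ge_agent/identity/loader.py may differ (for example, it may also skip empty stubs).

```python
from pathlib import Path
from typing import Optional


def resolve_learnings(root: Path, name: str) -> Optional[Path]:
    """Return the LEARNINGS.md to load for an agent, or None if none exists.

    `root` is the checkout root that contains ge-ops/ (an assumption).
    """
    candidates = [
        # Primary: authoritative file written by the agents.
        root / "ge-ops" / "master" / "agent-configs" / name / "LEARNINGS.md",
        # Fallback: stub location, used only when the primary is missing.
        root / "ge-ops" / "agents" / name / "LEARNINGS.md",
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Keeping the order in a single list makes the precedence explicit, so a stub can never shadow the authoritative file.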
Double Delivery (Fixed)
ISSUE: task-service.ts was XADDing to BOTH triggers.{agent} AND ge:work:incoming — 2x execution cost
STATUS: Fixed (ge:work:incoming XADD removed)
RULE: NEVER XADD to both streams for the same task
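The single-delivery rule can be checked with a small sketch. The real code lives in task-service.ts; this Python version is illustrative only. `dispatch_task`, `RecordingClient`, and the `task_id` field name are hypothetical, and the `xadd(name, fields)` shape mirrors the redis-py client method of the same name.

```python
from typing import Dict, List, Protocol, Tuple


class StreamClient(Protocol):
    def xadd(self, name: str, fields: Dict[str, str]) -> str: ...


class RecordingClient:
    """In-memory stand-in for a Redis client, for illustration and tests."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, str]]] = []

    def xadd(self, name: str, fields: Dict[str, str]) -> str:
        self.calls.append((name, dict(fields)))
        return f"{len(self.calls)}-0"  # fake stream entry ID


def dispatch_task(client: StreamClient, agent: str, task_id: str) -> str:
    # Exactly one XADD per task: the per-agent trigger stream only.
    # Nothing is written to ge:work:incoming.
    return client.xadd(f"triggers.{agent}", {"task_id": task_id})
```

A regression test can then assert that one dispatch produces one recorded call, which is the property that was violated when both streams were written.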
Cost Gate Bypass
ISSUE: cost_gate.py enforces $5/session, $10/agent/hr, $100/day limits
RISK: Removing cost_gate imports or bypassing pre-execution checks removes all cost protection
RULE: cost_gate.py MUST remain imported and active in pty_executor.py. NEVER bypass.
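A pre-execution check against the three limits above might look like the following sketch. It is not the contents of cost_gate.py: `Spend` and `check_cost_gate` are hypothetical names, and treating a limit as hit once spend reaches it (`>=`) is an assumption.

```python
from dataclasses import dataclass
from typing import List

SESSION_LIMIT_USD = 5.0        # $5/session
AGENT_HOURLY_LIMIT_USD = 10.0  # $10/agent/hr
DAILY_LIMIT_USD = 100.0        # $100/day


@dataclass(frozen=True)
class Spend:
    session_usd: float
    agent_last_hour_usd: float
    today_usd: float


def check_cost_gate(spend: Spend) -> List[str]:
    """Return the names of violated limits; an empty list means proceed."""
    violations = []
    if spend.session_usd >= SESSION_LIMIT_USD:
        violations.append("session")
    if spend.agent_last_hour_usd >= AGENT_HOURLY_LIMIT_USD:
        violations.append("agent_hourly")
    if spend.today_usd >= DAILY_LIMIT_USD:
        violations.append("daily")
    return violations
```

The executor would call a check like this before starting every session and refuse to run when the list is non-empty, which is why removing the import silently disables all three limits at once.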
Before Re-enabling Disabled Agents
- Root cause documented
- Fix deployed and verified
- Agent's Redis stream drained to 0
- Enable with replicas=1 first, monitor 30 minutes
- Check billing: agent cost < $2 after 30 minutes
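The checklist above can be expressed as two small checks, one before enabling and one after the monitoring window. This is a sketch with hypothetical function names. The inputs (stream length, observed cost) are assumed to come from Redis and the billing system respectively.

```python
from typing import List

MONITOR_MINUTES = 30
MAX_COST_USD = 2.0


def blockers_before_enable(
    root_cause_documented: bool, fix_verified: bool, stream_length: int
) -> List[str]:
    """List the reasons an agent must not yet be enabled with replicas=1."""
    blockers = []
    if not root_cause_documented:
        blockers.append("root cause not documented")
    if not fix_verified:
        blockers.append("fix not deployed and verified")
    if stream_length != 0:
        blockers.append(f"stream not drained ({stream_length} pending)")
    return blockers


def passed_monitoring(minutes_observed: float, cost_usd: float) -> bool:
    """True once the agent has run for 30 minutes at a cost under $2."""
    return minutes_observed >= MONITOR_MINUTES and cost_usd < MAX_COST_USD
```

An agent should stay at replicas=1 until `passed_monitoring` returns True.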
Token Budget Bloat (Fixed 2026-02-15)
Problem
Every API call showed ~40k input tokens. The tiered identity system was supposed to keep sessions lean, but the complexity classifier was broken: almost every task was classified as "complex", so sessions loaded all 3 identity tiers (~10-17k prompt tokens) instead of just CORE + ROLE (~5-9k).
Root Causes
1. Complexity classifier threshold too low
ISSUE: TaskComplexityClassifier.COMPLEX_SCORE_THRESHOLD = 1 — a single keyword match triggered "complex"
KEYWORDS: "fix", "test", "analyze", "create", "configure", "deploy" — present in almost every task description
IMPACT: Every session loaded IDENTITY-CORE + IDENTITY-ROLE + IDENTITY-REFERENCE + LEARNINGS
FIX: Raised threshold to 3 and removed overly common keywords. Now most tasks classify as "normal".
LOCATION: ge_agent/execution/context.py
2. LEARNINGS.md loaded from wrong path
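ISSUE: Loader read from ge-ops/master/agent-configs/{name}/LEARNINGS.md (stale copies, 400-3000 tokens)
RIGHT PATH: ge-ops/agents/{name}/LEARNINGS.md (canonical, written by learning pipeline)
NOTE: This diagnosis conflicts with the later "LEARNINGS.md Path Mismatch (Fixed 2026-02-16)" entry, which names agent-configs/ as authoritative and agents/ as empty stubs. Treat the 2026-02-16 entry as current: the loader reads agent-configs/ first and uses agents/ only as a fallback.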
EXTRA FIX: Learnings capped at 3000 chars (~750 tokens) in the prompt — full learnings browsable in wiki
LOCATION: ge_agent/identity/loader.py
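The 3000-character cap could be applied as in the sketch below. `cap_learnings` and `estimate_tokens` are hypothetical helpers. Keeping the start of the file and cutting at a line boundary are assumptions, as is the 4 characters per token ratio implied by "3000 chars (~750 tokens)".

```python
LEARNINGS_CHAR_CAP = 3000
CHARS_PER_TOKEN = 4  # rough ratio implied by 3000 chars ~ 750 tokens


def cap_learnings(text: str, cap: int = LEARNINGS_CHAR_CAP) -> str:
    """Truncate learnings to at most `cap` characters.

    Assumption: the start of the file is kept, and the cut is moved back
    to the last newline so no entry is left half-written.
    """
    if len(text) <= cap:
        return text
    head = text[:cap]
    cut = head.rfind("\n")
    return head[:cut] if cut > 0 else head


def estimate_tokens(text: str) -> int:
    """Very rough token estimate, for budgeting only."""
    return len(text) // CHARS_PER_TOKEN
```

A file already under the cap passes through unchanged, so small LEARNINGS files cost only their actual size.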
What's In The Prompt (after fix)
| Component | Tokens (normal) | Tokens (simple) | Tokens (complex) |
|---|---|---|---|
| Constitution | ~2,100 | ~2,100 | ~2,100 |
| IDENTITY-CORE | ~1,400-2,400 | ~1,400-2,400 | ~1,400-2,400 |
| IDENTITY-ROLE | ~2,600-5,600 | — | ~2,600-5,600 |
| IDENTITY-REFERENCE | — | — | ~2,700-4,400 |
| LEARNINGS (capped) | ~30-750 | ~30-750 | ~30-750 |
| JIT learnings | ~500 | ~500 | ~500 |
| Task context | ~125 | ~125 | ~125 |
| Our prompt total | ~6,800-11,500 | ~4,200-5,900 | ~9,400-15,800 |
| Claude Code overhead | ~20,000-25,000 | ~20,000-25,000 | ~20,000-25,000 |
| Session total | ~27,000-36,000 | ~24,000-31,000 | ~29,000-41,000 |
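The totals in the table can be reproduced by summing the component ranges, as in this sketch. The figures are copied from the rows above; `prompt_range` and `session_range` are hypothetical helper names. The exact sums land within about 500 tokens of the table's approximate totals.

```python
# (low, high) token estimates per component, taken from the table
COMPONENTS = {
    "Constitution": (2100, 2100),
    "IDENTITY-CORE": (1400, 2400),
    "IDENTITY-ROLE": (2600, 5600),
    "IDENTITY-REFERENCE": (2700, 4400),
    "LEARNINGS (capped)": (30, 750),
    "JIT learnings": (500, 500),
    "Task context": (125, 125),
}
ALWAYS_LOADED = [
    "Constitution", "IDENTITY-CORE", "LEARNINGS (capped)",
    "JIT learnings", "Task context",
]
EXTRA_BY_CLASS = {
    "simple": [],
    "normal": ["IDENTITY-ROLE"],
    "complex": ["IDENTITY-ROLE", "IDENTITY-REFERENCE"],
}
CLAUDE_CODE_OVERHEAD = (20000, 25000)


def prompt_range(classification: str) -> tuple:
    """Sum the (low, high) estimates of everything loaded for a class."""
    names = ALWAYS_LOADED + EXTRA_BY_CLASS[classification]
    low = sum(COMPONENTS[n][0] for n in names)
    high = sum(COMPONENTS[n][1] for n in names)
    return low, high


def session_range(classification: str) -> tuple:
    """Prompt range plus the fixed Claude Code overhead."""
    low, high = prompt_range(classification)
    return low + CLAUDE_CODE_OVERHEAD[0], high + CLAUDE_CODE_OVERHEAD[1]
```

For example, `prompt_range("normal")` gives (6755, 11475), matching the ~6,800-11,500 row, and the overhead accounts for roughly two thirds or more of every session total.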
How Classification Works Now
| Classification | Requires | Tiers Loaded | Turn Budget |
|---|---|---|---|
| simple | 2+ simple keywords (status, check, list, health) | CORE only | 25 |
| normal | Default (most tasks) | CORE + ROLE | 40 |
| complex | 3+ complex keywords (implement, comprehensive, investigate, multiple) | CORE + ROLE + REFERENCE | 60 |
Source of truth: ge_agent/execution/context.py (classifier), config/agent-execution.yaml (turn budgets)
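The classification rules in the table can be sketched as follows. This is not the code in ge_agent/execution/context.py. The keyword lists are only the examples shown in the table and may be incomplete; `SIMPLE_SCORE_THRESHOLD`, whole-word matching, and checking "complex" before "simple" are all assumptions.

```python
import re

# Example keywords from the table; the real lists may be longer.
SIMPLE_KEYWORDS = ("status", "check", "list", "health")
COMPLEX_KEYWORDS = ("implement", "comprehensive", "investigate", "multiple")

SIMPLE_SCORE_THRESHOLD = 2   # "2+ simple keywords"
COMPLEX_SCORE_THRESHOLD = 3  # raised from 1 by the fix

TIERS = {
    "simple": ["CORE"],
    "normal": ["CORE", "ROLE"],
    "complex": ["CORE", "ROLE", "REFERENCE"],
}
TURN_BUDGET = {"simple": 25, "normal": 40, "complex": 60}


def classify(description: str) -> str:
    """Classify a task description as simple, normal, or complex."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    complex_score = sum(1 for kw in COMPLEX_KEYWORDS if kw in words)
    simple_score = sum(1 for kw in SIMPLE_KEYWORDS if kw in words)
    if complex_score >= COMPLEX_SCORE_THRESHOLD:
        return "complex"
    if simple_score >= SIMPLE_SCORE_THRESHOLD:
        return "simple"
    return "normal"  # default for most tasks
```

A description such as "fix the failing deploy test" now falls through to "normal", whereas under the old threshold of 1 a single keyword like "fix" was enough to load all three tiers.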
IDENTITY-ROLE Files Are Oversized
Several agents have ROLE files at 2-3x their target size or more (e.g. Annegreet: 773 lines against a 200-line target, nearly 4x). This is content debt: each ROLE file should be reviewed and trimmed to its target, starting with the agents that execute most often (koen, urszula, boris, annegreet).