Agent Identity System

Each agent has a tiered identity: identity files are loaded according to task complexity, so token usage scales with the task.

File Locations

Identity files: ge-ops/master/agent-configs/{name}/

| File | Loaded When | Target Size |
|---|---|---|
| IDENTITY-CORE.md | Always | ~1,200 tokens |
| IDENTITY-ROLE.md | normal + complex | ~2,500 tokens |
| IDENTITY-REFERENCE.md | complex only | ~3,500 tokens |

Learnings: ge-ops/agents/{name}/LEARNINGS.md (NOT in agent-configs — different path!)

The identity loader reads from ge-ops/master/agent-configs/ for identity files but from ge-ops/agents/ for learnings. The learning pipeline writes to the latter. Learnings are capped at 3,000 chars (~750 tokens) in the prompt.
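The two-root path split and the learnings cap can be sketched as below. This is a minimal illustration of the layout described above, not the actual loader; the function names are hypothetical, while the directory names, tier-to-file mapping, and 3,000-char cap come from this doc.

```python
from pathlib import Path

CONFIG_ROOT = Path("ge-ops/master/agent-configs")  # identity files
LEARNINGS_ROOT = Path("ge-ops/agents")             # learnings live here, NOT in agent-configs
LEARNINGS_CHAR_CAP = 3_000                         # ~750 tokens in the assembled prompt

def identity_paths(agent: str, tier: str) -> list[Path]:
    """Identity files to load for a given complexity tier."""
    files = ["IDENTITY-CORE.md"]                   # always loaded
    if tier in ("normal", "complex"):
        files.append("IDENTITY-ROLE.md")
    if tier == "complex":
        files.append("IDENTITY-REFERENCE.md")
    return [CONFIG_ROOT / agent / f for f in files]

def learnings_path(agent: str) -> Path:
    """Learnings are read from (and written to) the ge-ops/agents/ root."""
    return LEARNINGS_ROOT / agent / "LEARNINGS.md"

def capped_learnings(text: str) -> str:
    """Truncate learnings to the prompt cap."""
    return text[:LEARNINGS_CHAR_CAP]
```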

Complexity Classifier

Source of truth: ge_agent/execution/context.py

| Classification | Files Loaded | Turn Budget | Trigger |
|---|---|---|---|
| simple | CORE only | 25 | 2+ simple keywords (status, check, list, health) |
| normal | CORE + ROLE | 40 | Default — most tasks |
| complex | CORE + ROLE + REFERENCE | 60 | 3+ complex keywords (implement, comprehensive, investigate) |

Turn budgets are defined in config/agent-execution.yaml, which is authoritative.
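The tiering rule above can be sketched as a keyword counter. This is an illustrative approximation, not the real classifier (which lives in `ge_agent/execution/context.py` and likely has larger keyword lists); only the keywords, thresholds, and turn budgets stated in this doc are used.

```python
# Keyword sets from this doc; the real lists are presumably longer.
SIMPLE_KEYWORDS = {"status", "check", "list", "health"}
COMPLEX_KEYWORDS = {"implement", "comprehensive", "investigate"}

# Budgets mirror config/agent-execution.yaml (the authoritative source).
TURN_BUDGETS = {"simple": 25, "normal": 40, "complex": 60}

def classify(task: str) -> str:
    """Pick a tier: 3+ complex keywords -> complex, 2+ simple -> simple, else normal."""
    words = task.lower().split()
    if sum(w in COMPLEX_KEYWORDS for w in words) >= 3:
        return "complex"
    if sum(w in SIMPLE_KEYWORDS for w in words) >= 2:
        return "simple"
    return "normal"  # default for most tasks
```

Checking complex keywords before simple ones means a task that somehow matches both still gets the fuller identity.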

Prompt Assembly Order

Constitution → Directive → CORE → ROLE → REFERENCE → LEARNINGS → JIT Learnings → Task Context

Built by ge_agent/identity/prompts.py. Assembly is provider-aware: Claude gets the full constitution in the prompt, while OpenAI/Gemini get universal principles only (their quality gate lives in the native instructions file).
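The assembly order and provider branching can be sketched as follows. This is a simplified stand-in for `ge_agent/identity/prompts.py`; the section keys and the `build_prompt` signature are hypothetical, while the ordering and the Claude-vs-others constitution swap come from this doc.

```python
FULL_CONSTITUTION_PROVIDERS = {"claude"}  # others get universal principles only

def build_prompt(provider: str, sections: dict[str, str]) -> str:
    """Join prompt sections in the documented order, skipping empty ones."""
    order = [
        "constitution", "directive", "core", "role",
        "reference", "learnings", "jit_learnings", "task_context",
    ]
    if provider not in FULL_CONSTITUTION_PROVIDERS:
        # OpenAI/Gemini: swap the full constitution for universal principles;
        # the quality gate ships in their native instructions file instead.
        sections = {**sections, "constitution": sections.get("universal_principles", "")}
    return "\n\n".join(sections[k] for k in order if sections.get(k))
```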

Token Budget (after 2026-02-15 optimization)

| Component | Normal | Simple | Complex |
|---|---|---|---|
| Constitution | ~2,100 | ~2,100 | ~2,100 |
| IDENTITY-CORE | ~1,400-2,400 | ~1,400-2,400 | ~1,400-2,400 |
| IDENTITY-ROLE | ~2,600-5,600 | n/a | ~2,600-5,600 |
| IDENTITY-REFERENCE | n/a | n/a | ~2,700-4,400 |
| LEARNINGS (capped) | ~30-750 | ~30-750 | ~30-750 |
| JIT learnings | ~500 | ~500 | ~500 |
| Total (our prompt) | ~6,800-11,500 | ~4,200-5,900 | ~9,400-15,800 |

On top of this, Claude Code adds ~20-25k tokens of overhead (system prompt + tool schemas), which is not controllable from our side.

Known Issue: Oversized ROLE Files

Several agents have ROLE files at 2-3x the 200-line target (annegreet: 773 lines, urszula: 566, arjan: 491). This is content debt that requires manual trimming.
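A quick audit for this debt can be sketched as a one-off helper that scans the config root for oversized ROLE files. This script is hypothetical (not part of the codebase); the directory layout and 200-line target are taken from this doc.

```python
from pathlib import Path

ROLE_LINE_TARGET = 200  # target size for IDENTITY-ROLE.md files

def oversized_roles(config_root: str = "ge-ops/master/agent-configs") -> dict[str, int]:
    """Map agent name -> line count for every ROLE file over the target."""
    result = {}
    for role_file in Path(config_root).glob("*/IDENTITY-ROLE.md"):
        n = len(role_file.read_text().splitlines())
        if n > ROLE_LINE_TARGET:
            result[role_file.parent.name] = n
    return result
```

Running it against the real config root should surface the agents listed above (annegreet, urszula, arjan) until their files are trimmed.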