
Human-in-the-Loop

The Paradox

Too much human involvement defeats the purpose of an AI agency. If every decision requires human approval, you have built an expensive, slow, unreliable system that would be cheaper to operate with human developers. Too little human involvement creates unbounded risk. If agents make all decisions autonomously, you will eventually ship code with security vulnerabilities, make incorrect business commitments, or burn through your entire compute budget in a feedback loop.

The art of human-in-the-loop design is finding the right intervention points — the decisions where human judgment adds value that exceeds the cost of the delay it introduces.

GE's principle: humans decide policy, agents execute policy. The human defines what should be done and under what constraints. Agents figure out how to do it within those constraints. This is not a philosophical stance. It is an engineering decision based on where LLMs are reliable (execution within constraints) and where they are not (judgment about what constraints should exist).


Decision Authority Tiers

Not all decisions are equal. GE classifies decisions into four tiers based on who has the authority to make them.

Tier 1: Autonomous

The agent decides and executes without any approval.

Scope: Technical implementation details that do not affect other agents, clients, or system architecture.

Examples:

  • Variable and function naming
  • Code formatting and style choices within established standards
  • Internal file organization within the agent's owned directories
  • Test structure and assertion patterns
  • Error message wording (non-client-facing)
  • Choosing between equivalent implementation approaches when both satisfy the specification

Guardrail: Even autonomous decisions must comply with the Constitution, coding standards, and the agent's identity boundaries. Autonomy does not mean freedom from rules.

Tier 2: Peer Escalation

The agent recognizes the decision touches another agent's domain and escalates to a peer with relevant expertise.

Scope: Technical decisions that cross agent boundaries or require domain knowledge the agent does not possess.

Examples:

  • "Is this a security concern?" -> escalate to Ron (Guardian)
  • "Does this database change require a migration?" -> escalate to Marta (DBA)
  • "Will this API change break existing consumers?" -> escalate to Koen (Quality)
  • "Is this performance acceptable?" -> escalate to Nessa (Performance)

Protocol: The agent writes a clarification request or creates a discussion. It does not proceed with the affected decision until it receives a response.

Tier 3: Discussion Escalation

The decision has lasting impact and should be made through multi-agent consensus, with human override available.

Scope: Architectural decisions, technology choices, process changes, anything that creates precedent.

Examples:

  • Choosing a technology stack for a new module
  • Changing an established coding pattern or convention
  • Defining a new inter-agent interface
  • Modifying the deployment pipeline

Protocol: The agent initiates a discussion via the Discussions API. Relevant agents participate. If consensus is not reached in two rounds, the question escalates to Tier 4.
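
The two-round rule can be sketched as follows. This is a hypothetical illustration: the `run_round` interface and return values are assumptions, not GE's actual Discussions API.

```python
def resolve_tier3(discussion, max_rounds: int = 2) -> str:
    """Run up to two consensus rounds; escalate to Tier 4 if unresolved.

    `discussion` is a hypothetical object whose run_round() returns True
    when the participating agents reach consensus in that round.
    """
    for _ in range(max_rounds):
        if discussion.run_round():
            return "consensus"
    # Two rounds without consensus: the question leaves agent hands.
    return "escalate_to_tier4"
```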

Tier 4: Human Escalation

Only the human (Dirk-Jan) can make this decision. No agent has the authority.

Scope: Business commitments, client communication, agent lifecycle, financial decisions, anything with legal or contractual implications.

Examples:

  • Client-facing communications of any kind
  • Pricing decisions
  • Contract terms or SLA commitments
  • Agent commissioning (creating a new agent)
  • Agent decommissioning (removing an agent)
  • Budget allocation changes
  • Exceptions to Constitutional principles
  • Any decision where the agent feels uncertain about its authority

Protocol: The agent writes a notification to ge-ops/notifications/human/ with a structured request. The agent HALTS the affected work until the human responds. It may continue with unrelated work.

RULE: When in doubt about which tier a decision belongs to, escalate upward. Under-escalation is more dangerous than over-escalation.
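
The escalate-upward rule lends itself to a simple ordering. The sketch below is illustrative; the type names are assumptions, but the logic is exactly the rule above: when a decision plausibly fits several tiers, take the highest, and treat "no clear fit" as Tier 4.

```python
from enum import IntEnum

class Tier(IntEnum):
    AUTONOMOUS = 1   # agent decides and executes
    PEER = 2         # escalate to a peer agent's domain
    DISCUSSION = 3   # multi-agent consensus, human override available
    HUMAN = 4        # only the human can decide

def effective_tier(candidate_tiers: set) -> Tier:
    """Pick the highest plausible tier: under-escalation is more
    dangerous than over-escalation. An empty set (the agent cannot
    classify the decision at all) defaults to human escalation."""
    return max(candidate_tiers, default=Tier.HUMAN)
```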


HALT Conditions

A HALT condition is a situation where an agent must stop working and wait for human intervention. HALTs are not failures — they are safety mechanisms that prevent agents from making decisions in situations they are not equipped to handle.

Mandatory HALT Triggers

Trigger | Description | Notification type
Spec conflict | The specification contradicts itself or contradicts another specification | clarification_request
Missing integration point | The agent cannot find the code, API, or interface it needs to connect to | clarification_request
Risk uncertainty | The agent cannot assess whether its action is safe | risk_escalation
Turn budget exhaustion | The agent is approaching its turn limit without completing the task | scope_warning
Cost gate trigger | The session, agent, or daily cost limit has been reached | cost_alert
Constitutional violation | The agent would need to violate a Constitutional principle to complete the task | constitutional_conflict
Client data exposure | The agent encounters client data that may need special handling | data_sensitivity
Gut feeling | Something feels off but the agent cannot articulate what | general_escalation

The last trigger is deliberately vague. LLMs are pattern-matching systems that sometimes detect problems before they can articulate them. An agent that "feels" something is wrong should HALT, not rationalize itself into continuing.

HALT Protocol

  1. Stop immediately. Do not complete the current action. Do not try "one more thing."
  2. Document the state. Write a notification file with the current task state, what triggered the HALT, what has been completed, and what remains.
  3. Preserve context. Do not discard work-in-progress. Save it where it can be resumed.
  4. Notify. Write to ge-ops/notifications/human/ with the appropriate type.
  5. Wait. Do not resume the affected work until the human responds. Unrelated work may continue.
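
Steps 2-4 of the protocol amount to writing a structured state record before going quiet. A minimal sketch, assuming a JSON file per notification; the field names and filename scheme are assumptions (only the ge-ops/notifications/human/ path comes from the text):

```python
import json
import time
from pathlib import Path

NOTIFY_DIR = Path("ge-ops/notifications/human")

def halt(agent: str, task: str, trigger: str, ntype: str,
         completed: list, remaining: list) -> Path:
    """Document state and notify the human, then leave the affected
    work untouched until a response arrives."""
    NOTIFY_DIR.mkdir(parents=True, exist_ok=True)
    record = {
        "agent": agent,
        "task": task,
        "type": ntype,          # e.g. "risk_escalation", "cost_alert"
        "trigger": trigger,     # what caused the HALT
        "completed": completed, # work finished so far (context preserved)
        "remaining": remaining, # work blocked pending the human response
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    path = NOTIFY_DIR / f"{agent}-{int(time.time())}.json"
    path.write_text(json.dumps(record, indent=2))
    return path
```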

HALT vs Error

A HALT is not an error. Errors are technical failures (a test fails, a build breaks, a service is unavailable). HALTs are judgment calls (this decision is above my pay grade). An agent should handle errors autonomously when possible. An agent must HALT when the situation exceeds its authority.


Quality Gates That Require Human Approval

Certain pipeline stages have quality gates that cannot be passed without human sign-off. These exist at points where the cost of an error is highest.

Client-Facing Deliverables

Any output that will be seen by a client — designs, documentation, deployed applications — requires human review before delivery. Agents produce the work. Humans verify it meets client expectations and business standards.

Production Deployment

The deployment pipeline is automated, but the decision to deploy to production requires human approval. Agents can deploy to staging and run integration tests autonomously. The final promotion to production is a human gate.

Agent Commissioning

Creating a new agent — defining its identity, role, boundaries, and provider — is a human decision. The human reviews the REFERENCE identity, validates the role boundaries, and approves the agent for activation. This prevents agent sprawl and ensures every agent serves a clear purpose.

Specification Approval

Formal specifications (Anna's output) are reviewed by the human before tests are derived from them. A wrong specification produces wrong tests, which produce wrong code. The specification gate catches this at the point where correction is cheapest.
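
These four gates share one mechanical shape: the pipeline refuses to advance a gated stage without a recorded human approver. A minimal sketch under assumed names (the stage identifiers and function are illustrative, not GE's orchestrator API):

```python
class HumanGateError(Exception):
    """Raised when a pipeline tries to pass a human gate without sign-off."""

# The four human-gated stages described above.
HUMAN_GATES = {"client_delivery", "production_deploy",
               "agent_commissioning", "spec_approval"}

def advance(stage: str, approvals: dict) -> str:
    """Allow progression only if human-gated stages carry an approver.

    `approvals` maps stage name -> approver. Enforcement lives in the
    orchestrator, so an agent cannot skip a gate by goodwill alone.
    """
    if stage in HUMAN_GATES and not approvals.get(stage):
        raise HumanGateError(f"stage '{stage}' requires human sign-off")
    return f"{stage}: approved by {approvals.get(stage, 'n/a (autonomous)')}"
```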


The Client Interaction Boundary

RULE: No agent communicates directly with clients. All client-facing communication passes through human review.

This is an absolute boundary, not a guideline. The reason is liability. An LLM that confidently promises a feature, quotes a price, or commits to a deadline creates a business obligation that the company must honor. LLMs are not reliable enough to make these commitments.

The exception is Dima, the public-facing intake agent, who operates under extremely constrained conditions:

  • Stateless — no access to internal knowledge
  • No pricing or commitment authority
  • Pure information gathering
  • All gathered information goes to a human before any response is made


Scaling Human Oversight

The obvious objection to human-in-the-loop is that it does not scale. If GE grows from 60 agents to 200, can one human still provide effective oversight?

Current Approach

GE's oversight scales through three mechanisms:

1. Decision tiering. Most decisions are Tier 1 (autonomous) or Tier 2 (peer escalation). The human only handles Tier 4 decisions, which are infrequent. A well-designed system generates perhaps 5-10 human escalations per day, regardless of how many agents are operating.

2. Policy, not instances. The human does not review every line of code. The human defines policies (Constitution, standards, specifications) that agents apply to individual instances. Writing a policy once is more efficient than reviewing a thousand instances.

3. Monitoring agents. Annegreet (wiki curation), Ron (Guardian), Mira (incident command), and Eltjo (log analysis) provide automated oversight of agent behavior. They escalate anomalies to the human. This is AI overseeing AI, with the human at the top of the escalation chain.

Future Direction

As the agent count grows, the human's role shifts from reviewing individual outputs to reviewing system behavior. The monitoring agents become the primary oversight layer, with the human focusing on:

  • Policy evolution (updating the Constitution, standards, and specifications)
  • Exception handling (the 1% of cases that monitoring agents cannot resolve)
  • Strategic direction (what to build, for whom, and why)

This is the natural evolution from hands-on management to executive oversight. The principles remain the same; the abstraction level rises.


Regulatory Context

The EU AI Act (Article 14) and NIST's AI Risk Management Framework both require demonstrable human oversight for high-risk AI systems. While GE's software development agents do not directly fall under high-risk classification, the software they produce may serve regulated industries (healthcare, fintech). Maintaining documented human oversight is both a quality practice and a compliance enabler.

What GE Documents

  • Every human escalation: question, response, timestamp, rationale
  • Every quality gate passage: who approved, when, on what basis
  • Every agent commissioning decision: who approved, identity review evidence
  • Every HALT resolution: what triggered it, how it was resolved, what the agent did next

This documentation trail is not overhead — it is the evidence that the system operates under human governance. If a client or auditor asks "who decided this?", GE can always answer.


Measuring Oversight Effectiveness

Human oversight that is not measured will degrade over time. GE tracks:

Metric | Target | Alert
Escalation response time | <4 hours | >8 hours
Escalations per day | 5-10 | >20 (over-escalation) or <2 (under-escalation)
Gate approval time | <2 hours | >6 hours
Random audit frequency | 1/week minimum | >2 weeks since last audit
Escalation resolution quality | No re-escalation of same issue | Same issue escalated 3+ times

These metrics are reviewed weekly. Trends matter more than individual data points. Rising escalation counts suggest agents need better identity boundaries or specifications. Falling escalation counts suggest agents may be making decisions they should escalate.
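
The weekly check can be sketched as a single pass over the metrics; the thresholds come from the table above, while the function and field names are assumptions:

```python
def oversight_alerts(m: dict) -> list:
    """Return an alert string for each oversight metric outside its target."""
    alerts = []
    if m["escalation_response_hours"] > 8:
        alerts.append("escalation response time > 8h")
    if m["escalations_per_day"] > 20:
        alerts.append("over-escalation (> 20/day)")
    elif m["escalations_per_day"] < 2:
        alerts.append("possible under-escalation (< 2/day)")
    if m["gate_approval_hours"] > 6:
        alerts.append("gate approval time > 6h")
    if m["days_since_last_audit"] > 14:
        alerts.append("> 2 weeks since last random audit")
    if m["same_issue_escalations"] >= 3:
        alerts.append("same issue escalated 3+ times")
    return alerts
```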


Anti-Patterns

The Rubber Stamp

The human approves everything without review because the volume is too high. This provides the appearance of oversight without the substance. Fix: reduce the volume of escalations by improving decision tiering, not by reducing review quality.

The Bottleneck

The human becomes the slowest stage in the pipeline, causing work to queue. Fix: ensure that human-required decisions are genuinely necessary (not over-escalation) and that the notification system makes it easy to review and respond quickly.

Automation Complacency

The human trusts the agents too much and stops questioning their output. Research shows this is a systemic risk: the more reliable a system appears, the less vigilant its overseers become. Fix: periodic random audits of agent output, even when everything appears to be working correctly.

Bypassing the Gate

An agent or operator skips a human-required gate because the change seems small or urgent. Fix: gates are enforced by the orchestrator, not by agent goodwill. The system physically prevents progression past a human gate without human input.