LLM Provider Onboarding Checklist

Purpose: Complete checklist for adding a new LLM provider to the GE agent execution system.
Last Updated: 2026-02-13
Lesson Learned From: Multi-provider executor rollout (Feb 12-13, 2026)


Why This Document Exists

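When adding OpenAI Codex and Gemini support (Feb 2026), we discovered that the integration requires changes across 5+ systems: provider code, the executor Docker image, Vault secrets and policy, the runner, and agent configs. Missing any step results in "provider not working" errors that are hard to diagnose, because the failure usually surfaces far from the step that was skipped. This checklist ensures nothing is missed.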


Complete Checklist

1. Provider Code (ge_agent/execution/providers/)

  • [ ] Create {provider}.py implementing LLMProvider interface
  • [ ] Implement required methods:
      • provider_id property
      • capabilities property
      • build_command() - CLI command construction
      • parse_output() - metrics extraction from CLI output
      • validate_installation() - CLI availability check
  • [ ] Add to registry in __init__.py
  • [ ] Update registry.py default model mapping

Files:

ge_agent/execution/providers/{provider}.py   # New provider
ge_agent/execution/providers/__init__.py     # Export
ge_agent/execution/providers/registry.py     # Default model
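
The five required members can be sketched as follows. This is a minimal illustration, not the actual interface: the LLMProvider base class below is a hypothetical stand-in for the one in ge_agent/execution/providers/, the method signatures are assumptions, and "examplecli" with its flags is an invented CLI used only to show the shape of a provider.

```python
import json
import shutil
from abc import ABC, abstractmethod
from typing import Dict, List


class LLMProvider(ABC):
    """Hypothetical stand-in for the real LLMProvider interface."""

    @property
    @abstractmethod
    def provider_id(self) -> str: ...

    @property
    @abstractmethod
    def capabilities(self) -> Dict[str, bool]: ...

    @abstractmethod
    def build_command(self, prompt: str, model: str) -> List[str]: ...

    @abstractmethod
    def parse_output(self, raw: str) -> Dict[str, int]: ...

    @abstractmethod
    def validate_installation(self) -> bool: ...


class ExampleProvider(LLMProvider):
    CLI = "examplecli"  # hypothetical binary name, not a real CLI

    @property
    def provider_id(self) -> str:
        return "example"

    @property
    def capabilities(self) -> Dict[str, bool]:
        return {"json_output": True, "streaming": False}

    def build_command(self, prompt: str, model: str) -> List[str]:
        # Placeholder flags; take the real ones from the CLI's documentation.
        return [self.CLI, "--model", model, "--json", prompt]

    def parse_output(self, raw: str) -> Dict[str, int]:
        # Assumes the CLI prints a single JSON object containing token counts.
        data = json.loads(raw)
        return {
            "input_tokens": int(data.get("input_tokens", 0)),
            "output_tokens": int(data.get("output_tokens", 0)),
        }

    def validate_installation(self) -> bool:
        # True only if the binary is found on PATH.
        return shutil.which(self.CLI) is not None
```

Returning the command as a list of arguments rather than a single string avoids shell quoting problems when the prompt contains spaces or quotes.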

2. CLI Installation (Dockerfile.executor)

  • [ ] Add CLI package to npm install command
  • [ ] Verify package name on npm: npm view @{org}/{package} version

File: Dockerfile.executor

Example:

RUN npm install -g \
    @anthropic-ai/claude-code \
    @openai/codex \
    @google/gemini-cli

3. API Key in Vault

  • [ ] Add API key to secret/admin-ui/api-keys
  • [ ] Use consistent key name (lowercase provider name)

Command:

vault kv patch secret/admin-ui/api-keys {provider}="sk-xxx..."

Current keys location: secret/admin-ui/api-keys

  • anthropic - Anthropic API key
  • openai - OpenAI API key
  • google - Google/Gemini API key

4. Vault Policy for Executor

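  • [ ] Verify ge-executor policy includes secret/data/admin-ui/api-keys (KV v2 policies address secrets under the data/ prefix, so this is the same secret as secret/admin-ui/api-keys in step 3)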

Check policy:

vault policy read ge-executor

Required policy:

path "secret/data/ge/*" {
  capabilities = ["read", "list"]
}

path "secret/data/admin-ui/api-keys" {
  capabilities = ["read"]
}
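
The policy check can also be scripted. The sketch below is an assumption-laden helper, not part of the GE codebase: policy_allows is a hypothetical function that scans policy text (for example the output of vault policy read ge-executor) for a block whose path matches exactly and whose capabilities list contains the requested capability. It does not evaluate Vault's glob or templating rules, so a wildcard path that would cover the secret is reported as missing.

```python
import re

# Matches: path "<path>" { <body> }
_BLOCK_RE = re.compile(r'path\s+"([^"]+)"\s*\{(.*?)\}', re.DOTALL)
# Matches: capabilities = [ ... ]
_CAPS_RE = re.compile(r'capabilities\s*=\s*\[(.*?)\]', re.DOTALL)


def policy_allows(policy_hcl: str, path: str, capability: str = "read") -> bool:
    """Return True if a block for exactly `path` lists `capability`."""
    for block_path, body in _BLOCK_RE.findall(policy_hcl):
        if block_path != path:
            continue  # exact match only; globs are not expanded
        caps = _CAPS_RE.search(body)
        if caps and capability in re.findall(r'"([^"]+)"', caps.group(1)):
            return True
    return False


REQUIRED_POLICY = '''
path "secret/data/ge/*" {
  capabilities = ["read", "list"]
}

path "secret/data/admin-ui/api-keys" {
  capabilities = ["read"]
}
'''

print(policy_allows(REQUIRED_POLICY, "secret/data/admin-ui/api-keys"))
```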

5. Runner Secret Fetching (ge_agent/runner.py)

  • [ ] Add key fetch in get_vault_secrets() function
  • [ ] Set both generic and CLI-specific env vars

File: ge_agent/runner.py

Pattern:

if "{provider}" in api_keys_data:
    secrets["{PROVIDER}_API_KEY"] = api_keys_data["{provider}"]
    secrets["{CLI}_API_KEY"] = api_keys_data["{provider}"]  # CLI-specific
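
One way to keep this pattern from drifting as providers are added is a table-driven mapping. The sketch below is illustrative only: map_api_keys and PROVIDER_ENV_VARS are hypothetical names, not the real contents of get_vault_secrets(), and the env var names are the ones listed under Provider-Specific Notes in this checklist.

```python
from typing import Dict, List

# Vault key name -> env vars the runner should export for that provider.
PROVIDER_ENV_VARS: Dict[str, List[str]] = {
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai": ["OPENAI_API_KEY", "CODEX_API_KEY"],
    "google": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
}


def map_api_keys(api_keys_data: Dict[str, str]) -> Dict[str, str]:
    """Expand the Vault api-keys secret into generic and CLI-specific env vars."""
    secrets: Dict[str, str] = {}
    for provider, env_vars in PROVIDER_ENV_VARS.items():
        key = api_keys_data.get(provider)
        if not key:
            continue  # provider not configured in Vault; skip it
        for env_var in env_vars:
            secrets[env_var] = key
    return secrets


print(sorted(map_api_keys({"openai": "sk-test"})))
# → ['CODEX_API_KEY', 'OPENAI_API_KEY']
```

With this layout, adding a provider means adding one dictionary entry, and a forgotten CLI-specific variable shows up in review as a short list.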

6. Agent Configuration

  • [ ] Update agent configs in ge-ops/master/agent-configs/ to use new provider
  • [ ] Set provider: {provider} and model: {model-id} in agent YAML

Example agent config:

agent:
  name: benjamin
  provider: openai
  model: gpt-4o
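
A small pre-deploy check can catch a missing or misspelled provider before the executor does. The sketch below assumes the agent YAML has already been loaded into a dict with the shape shown above. config_problems and KNOWN_PROVIDERS are hypothetical names, and the provider set simply mirrors the Vault key names from step 3.

```python
from typing import List

# Assumed to mirror the key names stored in secret/admin-ui/api-keys.
KNOWN_PROVIDERS = {"anthropic", "openai", "google"}


def config_problems(config: dict) -> List[str]:
    """Return human-readable problems found in one agent config."""
    agent = config.get("agent") or {}
    problems: List[str] = []

    provider = agent.get("provider")
    if not provider:
        problems.append("agent.provider is missing")
    elif provider not in KNOWN_PROVIDERS:
        problems.append(f"agent.provider {provider!r} is not a known provider")

    if not agent.get("model"):
        problems.append("agent.model is missing")
    return problems


example = {"agent": {"name": "benjamin", "provider": "openai", "model": "gpt-4o"}}
print(config_problems(example))
# → []
```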

7. Rebuild and Deploy

  • [ ] Build new executor image: docker build -f Dockerfile.executor -t ge-bootstrap-agent-executor:latest .
  • [ ] Import to k3s: docker save ... | sudo k3s ctr images import -
  • [ ] Restart executor: kubectl rollout restart deployment/ge-executor -n ge-agents

8. Verification

  • [ ] Check CLI installed: kubectl exec -n ge-agents deployment/ge-executor -- which {cli}
  • [ ] Check secrets loaded: Look for "Loaded X secrets" in executor logs
  • [ ] Check multi-provider keys: Look for "multi-provider keys: {provider}=True"
  • [ ] Test execution: Run a simple task with an agent using the new provider

Verification commands:

# CLI installed
kubectl exec -n ge-agents deployment/ge-executor -- {cli} --version

# Secrets loaded
kubectl logs -n ge-agents deployment/ge-executor | grep "Loaded.*secrets"

# Multi-provider keys
kubectl logs -n ge-agents deployment/ge-executor | grep "multi-provider"
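
The two log checks can be combined into one helper that works on captured log text. This is a sketch under stated assumptions: check_executor_logs is a hypothetical function, and the exact log layout (a "Loaded N secrets" line and a "multi-provider keys:" line with provider=True/False pairs) is inferred from the phrases quoted in this checklist, so adjust the patterns to the real output.

```python
import re
from typing import Optional, Dict


def check_executor_logs(log_text: str, provider: str) -> Dict[str, Optional[object]]:
    """Extract the secret count and the key flag for one provider."""
    loaded = re.search(r"Loaded (\d+) secrets", log_text)
    # '.' does not cross newlines, so the flag must be on the same line
    # as the "multi-provider keys:" prefix.
    flag = re.search(
        rf"multi-provider keys:.*\b{re.escape(provider)}=(True|False)",
        log_text,
    )
    return {
        "secrets_loaded": int(loaded.group(1)) if loaded else None,
        "provider_key_present": (flag.group(1) == "True") if flag else None,
    }


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = (
    "INFO Loaded 6 secrets from Vault\n"
    "INFO multi-provider keys: anthropic=True, openai=True, google=False\n"
)
print(check_executor_logs(SAMPLE_LOG, "openai"))
# → {'secrets_loaded': 6, 'provider_key_present': True}
```

A result of None means the expected line was not found at all, which usually points at the runner (step 5) rather than at Vault.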


Common Failures and Solutions

| Symptom | Cause | Solution |
| --- | --- | --- |
| "CLI not found" | CLI not in Dockerfile | Add to npm install in Dockerfile.executor |
| "permission denied" on Vault | Policy missing | Update ge-executor Vault policy |
| "Loaded 2 secrets" (expected 6) | Runner not fetching keys | Update get_vault_secrets() in runner.py |
| "API key not configured" | Key not in Vault | Add to secret/admin-ui/api-keys |
| Agent fails silently | Wrong env var name | Check CLI docs for expected env var |

Provider-Specific Notes

OpenAI Codex

  • CLI: codex (npm: @openai/codex)
  • Env vars: CODEX_API_KEY, OPENAI_API_KEY
  • Output: JSONL with turn.completed events
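
For a parse_output() implementation, the JSONL stream can be reduced to per-run totals roughly as follows. This is a sketch, not the actual provider code: sum_turn_usage is a hypothetical helper, and the assumption that each turn.completed event carries a usage object with input_tokens and output_tokens should be confirmed against real CLI output before relying on it.

```python
import json
from typing import Dict


def sum_turn_usage(jsonl: str) -> Dict[str, int]:
    """Sum token usage over all turn.completed events in a JSONL stream."""
    totals = {"turns": 0, "input_tokens": 0, "output_tokens": 0}
    for line in jsonl.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON lines in the output
        if not isinstance(event, dict) or event.get("type") != "turn.completed":
            continue
        usage = event.get("usage") or {}  # assumed field name
        totals["turns"] += 1
        totals["input_tokens"] += int(usage.get("input_tokens", 0))
        totals["output_tokens"] += int(usage.get("output_tokens", 0))
    return totals
```

Skipping unparseable lines instead of raising keeps metrics extraction from failing a run that otherwise succeeded.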

Google Gemini

  • CLI: gemini (npm: @google/gemini-cli)
  • Env vars: GEMINI_API_KEY, GOOGLE_API_KEY
  • May need GOOGLE_PROJECT for some features

Anthropic Claude

  • CLI: claude (npm: @anthropic-ai/claude-code)
  • Env vars: ANTHROPIC_API_KEY
  • Primary provider, most mature integration

| Purpose | File |
| --- | --- |
| Provider implementations | ge_agent/execution/providers/*.py |
| Provider registry | ge_agent/execution/providers/registry.py |
| Vault client | ge_agent/execution/vault_client.py |
| Secret fetching | ge_agent/runner.py (get_vault_secrets) |
| Executor Dockerfile | Dockerfile.executor |
| Agent configs | ge-ops/master/agent-configs/*.yaml |
| Vault policies | Vault server (ge-executor policy) |

This checklist was created after the Feb 2026 multi-provider rollout. If you're adding a new LLM provider, follow every step - missing one causes hard-to-debug failures.