LLM Provider Onboarding Checklist
Purpose: Complete checklist for adding a new LLM provider to the GE agent execution system.
Last Updated: 2026-02-13
Lesson Learned From: Multi-provider executor rollout (Feb 12-13, 2026)
Why This Document Exists
When we added OpenAI Codex and Gemini support (Feb 2026), we discovered that the integration requires changes across 5+ systems: provider code, the executor image, Vault secrets and policy, the runner, and agent configs. Missing any one step produces a generic "provider not working" error that is hard to trace back to its cause. This checklist ensures nothing is missed.
Complete Checklist
1. Provider Code (ge_agent/execution/providers/)
- [ ] Create `{provider}.py` implementing `LLMProvider` interface
- [ ] Implement required methods:
  - `provider_id` property
  - `capabilities` property
  - `build_command()` - CLI command construction
  - `parse_output()` - metrics extraction from CLI output
  - `validate_installation()` - CLI availability check
- [ ] Add to registry in `__init__.py`
- [ ] Update `registry.py` default model mapping
Files:
ge_agent/execution/providers/{provider}.py # New provider
ge_agent/execution/providers/__init__.py # Export
ge_agent/execution/providers/registry.py # Default model
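The shape of step 1 can be sketched as below. The real `LLMProvider` interface is not shown in this checklist, so the base class here is a hypothetical stand-in. The method signatures, `ExampleProvider`, the `example-cli` binary, and the `DEFAULT_MODELS` mapping are all assumptions made for illustration.

```python
from __future__ import annotations

import json
import shutil
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Hypothetical stand-in for the real interface; signatures are assumed."""

    @property
    @abstractmethod
    def provider_id(self) -> str: ...

    @property
    @abstractmethod
    def capabilities(self) -> set[str]: ...

    @abstractmethod
    def build_command(self, prompt: str, model: str) -> list[str]: ...

    @abstractmethod
    def parse_output(self, raw: str) -> dict[str, int]: ...

    @abstractmethod
    def validate_installation(self) -> bool: ...


class ExampleProvider(LLMProvider):
    cli_name = "example-cli"  # hypothetical CLI binary name

    @property
    def provider_id(self) -> str:
        return "example"

    @property
    def capabilities(self) -> set[str]:
        return {"chat"}

    def build_command(self, prompt: str, model: str) -> list[str]:
        # Return an argv list so the caller never needs shell quoting.
        return [self.cli_name, "--model", model, prompt]

    def parse_output(self, raw: str) -> dict[str, int]:
        # Assumes the CLI prints one JSON object containing token counts.
        data = json.loads(raw)
        return {
            "input_tokens": int(data.get("input_tokens", 0)),
            "output_tokens": int(data.get("output_tokens", 0)),
        }

    def validate_installation(self) -> bool:
        # True only if the binary is on PATH inside the executor image.
        return shutil.which(self.cli_name) is not None


# Hypothetical equivalent of the default model mapping kept in registry.py.
DEFAULT_MODELS = {"example": "example-model-1"}
```

Returning an argv list from `build_command()` rather than a single string is a design choice in this sketch; the real interface may differ.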
2. CLI Installation (Dockerfile.executor)
- [ ] Add CLI package to npm install command
- [ ] Verify package name on npm: `npm view @{org}/{package} version`
File: Dockerfile.executor
Example:
3. API Key in Vault
- [ ] Add API key to `secret/admin-ui/api-keys`
- [ ] Use consistent key name (lowercase provider name)
Command:
Current keys location: secret/admin-ui/api-keys
- anthropic - Anthropic API key
- openai - OpenAI API key
- google - Google/Gemini API key
4. Vault Policy for Executor
- [ ] Verify `ge-executor` policy includes `secret/data/admin-ui/api-keys`
Check policy:
Required policy:
path "secret/data/ge/*" {
  capabilities = ["read", "list"]
}
path "secret/data/admin-ui/api-keys" {
  capabilities = ["read"]
}
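The policy check can also be scripted. The sketch below assumes you already have the policy text in hand (for example, the output of `vault policy read ge-executor`) and tests whether some `path` block grants `read` on a target path. The function name `policy_allows_read` is hypothetical, and the matching is deliberately simplified: it handles exact paths and a trailing `*` glob only, and it ignores `deny` rules and `+` segment wildcards.

```python
import re

# Matches: path "<path>" { ... capabilities = [ ... ] ... }
PATH_BLOCK = re.compile(
    r'path\s+"(?P<path>[^"]+)"\s*\{[^}]*?capabilities\s*=\s*\[(?P<caps>[^\]]*)\]'
)


def policy_allows_read(policy_hcl: str, target: str) -> bool:
    """Return True if any path block grants "read" on target (simplified)."""
    for match in PATH_BLOCK.finditer(policy_hcl):
        path = match.group("path")
        caps = {c.strip().strip('"') for c in match.group("caps").split(",")}
        if "read" not in caps:
            continue
        if path == target:
            return True
        # A trailing "*" is treated as a prefix glob.
        if path.endswith("*") and target.startswith(path[:-1]):
            return True
    return False


REQUIRED_POLICY = '''
path "secret/data/ge/*" {
  capabilities = ["read", "list"]
}
path "secret/data/admin-ui/api-keys" {
  capabilities = ["read"]
}
'''

print(policy_allows_read(REQUIRED_POLICY, "secret/data/admin-ui/api-keys"))
```

A policy that contains only the `secret/data/ge/*` block fails this check for `secret/data/admin-ui/api-keys`, which corresponds to the "permission denied" case this step guards against.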
5. Runner Secret Fetching (ge_agent/runner.py)
- [ ] Add key fetch in `get_vault_secrets()` function
- [ ] Set both generic and CLI-specific env vars
File: ge_agent/runner.py
Pattern:
if "{provider}" in api_keys_data:
    secrets["{PROVIDER}_API_KEY"] = api_keys_data["{provider}"]
    secrets["{CLI}_API_KEY"] = api_keys_data["{provider}"]  # CLI-specific
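One way to keep this pattern from drifting as providers are added is a table-driven mapping. The sketch below is an assumption about structure, not the actual contents of `get_vault_secrets()`: the `PROVIDER_ENV_VARS` table and the `map_api_keys` helper are hypothetical names, while the Vault key names and env var names are the ones listed elsewhere in this checklist.

```python
from __future__ import annotations

# Vault key name -> env vars to export (generic name first, CLI-specific after).
# The table itself is a hypothetical structure for illustration.
PROVIDER_ENV_VARS: dict[str, list[str]] = {
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai": ["OPENAI_API_KEY", "CODEX_API_KEY"],
    "google": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
}


def map_api_keys(api_keys_data: dict[str, str]) -> dict[str, str]:
    """Expand each provider key found in Vault into all env vars its CLI may read."""
    secrets: dict[str, str] = {}
    for provider, env_names in PROVIDER_ENV_VARS.items():
        key = api_keys_data.get(provider)
        if not key:
            continue  # key absent from Vault: export nothing for this provider
        for env_name in env_names:
            secrets[env_name] = key
    return secrets


print(sorted(map_api_keys({"openai": "sk-test"})))
```

With this layout, onboarding a provider adds one table entry instead of another `if` block, and a missing Vault key simply yields no env vars for that provider.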
6. Agent Configuration
- [ ] Update agent configs in `ge-ops/master/agent-configs/` to use new provider
- [ ] Set `provider: {provider}` and `model: {model-id}` in agent YAML
Example agent config:
7. Rebuild and Deploy
- [ ] Build new executor image: `docker build -f Dockerfile.executor -t ge-bootstrap-agent-executor:latest .`
- [ ] Import to k3s: `docker save ... | sudo k3s ctr images import -`
- [ ] Restart executor: `kubectl rollout restart deployment/ge-executor -n ge-agents`
8. Verification
- [ ] Check CLI installed: `kubectl exec -n ge-agents deployment/ge-executor -- which {cli}`
- [ ] Check secrets loaded: Look for "Loaded X secrets" in executor logs
- [ ] Check multi-provider keys: Look for "multi-provider keys: {provider}=True"
- [ ] Test execution: Run a simple task with an agent using the new provider
Verification commands:
# CLI installed
kubectl exec -n ge-agents deployment/ge-executor -- {cli} --version
# Secrets loaded
kubectl logs -n ge-agents deployment/ge-executor | grep "Loaded.*secrets"
# Multi-provider keys
kubectl logs -n ge-agents deployment/ge-executor | grep "multi-provider"
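The two log checks can be combined into a small script that reads captured executor logs. This is a sketch under assumptions: the "Loaded X secrets" and "multi-provider keys: {provider}=True" fragments come from this checklist, but the rest of the line format (several `name=True` pairs on one line, separated by commas) is assumed, and `check_executor_logs` is a hypothetical helper.

```python
from __future__ import annotations

import re

LOADED_RE = re.compile(r"Loaded (\d+) secrets")
MULTI_RE = re.compile(r"multi-provider keys:\s*(.*)")
FLAG_RE = re.compile(r"(\w+)=(True|False)")


def check_executor_logs(log_text: str, provider: str, expected_secrets: int) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems: list[str] = []

    # Use the most recent "Loaded N secrets" line, since restarts log it again.
    counts = [int(n) for n in LOADED_RE.findall(log_text)]
    if not counts:
        problems.append('no "Loaded X secrets" line found')
    elif counts[-1] < expected_secrets:
        problems.append(f"loaded {counts[-1]} secrets, expected {expected_secrets}")

    # Collect provider flags; later lines override earlier ones.
    flags: dict[str, bool] = {}
    for rest_of_line in MULTI_RE.findall(log_text):
        for name, value in FLAG_RE.findall(rest_of_line):
            flags[name] = value == "True"
    if not flags.get(provider, False):
        problems.append(f"multi-provider key for {provider!r} not reported as True")

    return problems


sample = "INFO Loaded 6 secrets\nINFO multi-provider keys: openai=True, google=False\n"
print(check_executor_logs(sample, "openai", expected_secrets=6))
```

The log text could come from the `kubectl logs` command above, saved to a file or piped to the script on stdin.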
Common Failures and Solutions
| Symptom | Cause | Solution |
|---|---|---|
| "CLI not found" | CLI not in Dockerfile | Add to npm install in Dockerfile.executor |
| "permission denied" on Vault | Policy missing | Update ge-executor Vault policy |
| "Loaded 2 secrets" (expected 6) | Runner not fetching keys | Update get_vault_secrets() in runner.py |
| "API key not configured" | Key not in Vault | Add to secret/admin-ui/api-keys |
| Agent fails silently | Wrong env var name | Check CLI docs for expected env var |
Provider-Specific Notes
OpenAI Codex
- CLI: `codex` (npm: `@openai/codex`)
- Env vars: `CODEX_API_KEY`, `OPENAI_API_KEY`
- Output: JSONL with `turn.completed` events
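Extracting metrics from this output could look like the sketch below. It assumes each `turn.completed` event is a JSON object with a `type` field and a `usage` object of integer counters. Those field names are assumptions about the event payload, not confirmed by this checklist, and `sum_turn_usage` is a hypothetical helper.

```python
from __future__ import annotations

import json


def sum_turn_usage(jsonl_text: str) -> dict[str, int]:
    """Sum integer usage counters across all turn.completed events."""
    totals: dict[str, int] = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise mixed into the stream
        if not isinstance(event, dict) or event.get("type") != "turn.completed":
            continue
        for name, value in (event.get("usage") or {}).items():
            if isinstance(value, int):
                totals[name] = totals.get(name, 0) + value
    return totals


sample = "\n".join([
    '{"type": "turn.started"}',
    "not json",
    '{"type": "turn.completed", "usage": {"input_tokens": 10, "output_tokens": 4}}',
    '{"type": "turn.completed", "usage": {"input_tokens": 5, "output_tokens": 1}}',
])
print(sum_turn_usage(sample))
```

Skipping unparseable lines rather than raising keeps a single stray log line from discarding the metrics for an otherwise successful run.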
Google Gemini
- CLI: `gemini` (npm: `@google/gemini-cli`)
- Env vars: `GEMINI_API_KEY`, `GOOGLE_API_KEY`
- May need `GOOGLE_PROJECT` for some features
Anthropic Claude
- CLI: `claude` (npm: `@anthropic-ai/claude-code`)
- Env vars: `ANTHROPIC_API_KEY`
- Primary provider, most mature integration
Related Files Quick Reference
| Purpose | File |
|---|---|
| Provider implementations | ge_agent/execution/providers/*.py |
| Provider registry | ge_agent/execution/providers/registry.py |
| Vault client | ge_agent/execution/vault_client.py |
| Secret fetching | ge_agent/runner.py (get_vault_secrets) |
| Executor Dockerfile | Dockerfile.executor |
| Agent configs | ge-ops/master/agent-configs/*.yaml |
| Vault policies | Vault server (ge-executor policy) |
This checklist was created after the Feb 2026 multi-provider rollout. If you're adding a new LLM provider, follow every step - missing one causes hard-to-debug failures.