Immutable Infrastructure

No manual changes. Everything from code, through CI, into containers. If it is not in version control, it does not exist.


The Principle

Infrastructure components — containers, configurations, deployments — are never modified after they are created. When a change is needed, a new component is built from code and the old one is replaced.

This is not a preference. This is a hard requirement.

Never:

  • kubectl cp a file into a running pod
  • SSH into a container and edit a config file
  • Hot-patch a running process to fix a bug
  • Manually apply a database migration
  • Edit a Kubernetes manifest and kubectl apply by hand

Always:

  • Change the source code
  • Rebuild the container image
  • Deploy through the pipeline
  • Verify in staging before production

Why Immutable

Auditability (ISO 27001)

GE targets ISO 27001 and SOC 2 Type II compliance. Both require a complete audit trail of every change to production systems. When infrastructure is immutable:

  • Every production state corresponds to a git commit
  • Every deployment is traceable to a CI pipeline run
  • Every container image is tagged with the commit SHA it was built from
  • Every change has an author, a reviewer, and a timestamp

Manual changes are invisible to the audit trail. A kubectl cp leaves no record in git. An SSH edit leaves no record in CI. Immutable infrastructure makes the audit trail automatic.

Reproducibility

If production breaks, can you recreate it? With immutable infrastructure, the answer is always yes. The production state is defined by code. The same code produces the same state. No manual steps, no tribal knowledge, no "I think someone changed that config last month."

Rollback Safety

Rolling back means replacing the current container with the previous container image. That image is identical to what was running before: not "similar," not "rebuilt from the same code," but the exact same bytes. Rollback is fast and predictable because the previous image was never modified.

With mutable infrastructure, rollback means "undo whatever manual changes were made" — which requires knowing what those changes were. Manual changes are rarely documented completely.

Consistency Across Zones

GE operates three zones: development, staging, and production. Immutable infrastructure guarantees that the container running in staging is the exact same container that will run in production. Not "built from the same code" but the same image, byte for byte. Any difference in behavior between staging and production therefore comes from injected configuration or the environment, not from the image.
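Byte-for-byte identity can be checked mechanically by comparing image digests, since a digest is a content hash of the image manifest. A minimal sketch, assuming both zones report digest-pinned references (name@sha256:...); the function names and registry hostnames are hypothetical and not part of the GE tooling:

```python
import re

# Matches the digest suffix of a reference such as
# "registry.example/ge-admin-ui@sha256:<64 hex chars>".
DIGEST_RE = re.compile(r"@(sha256:[0-9a-f]{64})$")


def image_digest(image_ref: str) -> str:
    """Return the sha256 digest of a digest-pinned image reference."""
    match = DIGEST_RE.search(image_ref)
    if match is None:
        raise ValueError(f"image reference is not pinned to a digest: {image_ref}")
    return match.group(1)


def same_image(staging_ref: str, production_ref: str) -> bool:
    """True only when both references carry the identical digest."""
    return image_digest(staging_ref) == image_digest(production_ref)


if __name__ == "__main__":
    digest = "sha256:" + "a" * 64  # placeholder digest for illustration
    print(same_image(f"staging.example/ge-admin-ui@{digest}",
                     f"prod.example/ge-admin-ui@{digest}"))
```

Comparing tags alone would not prove identity, because a tag can be moved to a different image; the digest cannot.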


Three Enforcement Gates

Immutable infrastructure is enforced at three points in the pipeline:

Gate 1: Merge Gate (Marta / Iwona)

Before code reaches the deployment pipeline, Marta and Iwona verify:

  • No manual deployment instructions in the PR
  • No kubectl apply commands in documentation
  • No hardcoded values that should come from config
  • Deployment manifests are generated, not hand-written
  • Container image tags use commit hashes, not latest
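The last check in this list lends itself to automation. The sketch below scans manifest text for image: lines and reports any image whose tag is missing or is not a hexadecimal commit hash. It is an illustration only: the regexes, the function name, and the example images ge-api and ge-worker are assumptions, and it does not describe how Marta and Iwona actually run the gate.

```python
import re

# Captures the value of "image:" lines, with or without a leading list dash or quotes.
IMAGE_LINE_RE = re.compile(r"^\s*-?\s*image:\s*[\"']?([^\s\"']+)", re.MULTILINE)
# A short or full git commit hash.
COMMIT_TAG_RE = re.compile(r"^[0-9a-f]{7,40}$")


def find_unpinned_images(manifest_text: str) -> list[str]:
    """Return images that are tagged with neither a commit hash nor a digest."""
    offenders = []
    for image in IMAGE_LINE_RE.findall(manifest_text):
        if "@sha256:" in image:
            continue  # digest-pinned references are immutable by construction
        # Only look at the last path component so a registry port is not
        # mistaken for a tag (e.g. registry.example:5000/ge-worker).
        name = image.rsplit("/", 1)[-1]
        _, sep, tag = name.partition(":")
        if not sep or not COMMIT_TAG_RE.match(tag):
            offenders.append(image)
    return offenders


if __name__ == "__main__":
    example = "containers:\n  - image: ge-admin-ui:a1b2c3d\n  - image: ge-api:latest\n"
    print(find_unpinned_images(example))
```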

Gate 2: Deployment (Leon)

Leon orchestrates the deployment pipeline. Leon verifies:

  • Container image was built by CI (not locally)
  • Image tag matches the approved merge commit
  • All deployment manifests come from version control
  • No manual steps required between stages
  • Rollback procedure is defined and tested
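The "image tag matches the approved merge commit" check can be expressed as a prefix comparison between the tag and the full commit SHA. This is a sketch under the assumption that tags are abbreviated commit hashes of at least seven characters, as in ge-admin-ui:a1b2c3d; the function name and the example commit are hypothetical.

```python
def tag_matches_commit(image_ref: str, approved_commit: str) -> bool:
    """True when the image tag is a prefix (>= 7 chars) of the approved commit SHA."""
    # Drop any registry/path prefix and any trailing digest.
    name = image_ref.rsplit("/", 1)[-1].split("@", 1)[0]
    _, sep, tag = name.partition(":")
    if not sep or len(tag) < 7:
        return False  # untagged, or too short to identify a commit safely
    return approved_commit.lower().startswith(tag.lower())


if __name__ == "__main__":
    commit = "a1b2c3d4e5f60718293a4b5c6d7e8f9012345678"  # illustrative SHA
    print(tag_matches_commit("ge-admin-ui:a1b2c3d", commit))
    print(tag_matches_commit("ge-admin-ui:latest", commit))
```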

Gate 3: Production Apply (Rutger)

Rutger is the final gate before production. Rutger verifies:

  • Staging verification passed (Thijmen)
  • Container image in production matches staging exactly
  • DNS, certificates, and network config are correct (Stef)
  • Backup verification passed (Otto)
  • Rollback procedure tested in staging

Container Image Policy

Build rules

  • Images are built in CI, never on developer machines
  • Base images are pinned to digest (not tag) for reproducibility
  • Multi-stage builds — build dependencies do not enter runtime image
  • Image size limits enforced (no bloated images with build tools)
  • Security scanning on every image before push
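The digest-pinning rule can be linted before the build starts. The following sketch lists FROM lines in a Dockerfile that reference a base image without an @sha256: digest. It skips scratch and references to earlier build stages, which are not registry images. The function name is an assumption, and the parser handles only the common FROM forms, not every Dockerfile feature (for example, ARG substitution in FROM is not resolved).

```python
import re

# FROM [--platform=...] <image> [AS <stage>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?",
    re.IGNORECASE,
)


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base images that are referenced by tag (or no tag) instead of digest."""
    stages = set()  # names of build stages declared so far
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if match is None:
            continue
        image, alias = match.group(1), match.group(2)
        is_stage_ref = image.lower() in stages
        if image != "scratch" and not is_stage_ref and "@sha256:" not in image:
            unpinned.append(image)
        if alias:
            stages.add(alias.lower())
    return unpinned


if __name__ == "__main__":
    print(unpinned_base_images("FROM python:3.12-slim AS build\nFROM build\n"))
```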

Tagging rules

  • Never use latest in production manifests. The latest tag is mutable: it points to whichever image was pushed most recently, so the same manifest can resolve to different images over time
  • Use commit SHA as image tag: ge-admin-ui:a1b2c3d
  • Keep the last 10 tagged images for rollback
  • Prune untagged images weekly
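The retention rule amounts to sorting tagged images by push time and keeping the newest ten. A minimal sketch, assuming the registry can report a push timestamp per tag; the TaggedImage type and the function name are hypothetical, and the code only selects candidates rather than deleting anything.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TaggedImage:
    tag: str            # commit SHA used as the image tag
    pushed_at: datetime  # when the image was pushed to the registry


def split_for_retention(images, keep=10):
    """Return (kept, prunable): the newest `keep` images and everything older."""
    ordered = sorted(images, key=lambda image: image.pushed_at, reverse=True)
    return ordered[:keep], ordered[keep:]


if __name__ == "__main__":
    images = [TaggedImage(f"{i:07x}", datetime(2024, 1, i + 1)) for i in range(12)]
    kept, prunable = split_for_retention(images)
    print([image.tag for image in prunable])
```

The image currently running in production should always be within the kept set; a real pruning job would want to assert that before removing anything.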

GE-specific

Build the executor image: bash ge-ops/infrastructure/local/k3s/executor/build-executor.sh

Then restart the deployment: kubectl rollout restart deployment/ge-executor -n ge-agents

Python caches each module in sys.modules the first time it is imported. Patching a .py file in a running container has no effect on modules the process has already loaded. Always rebuild the image.
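The caching behavior is easy to reproduce outside a container. The sketch below writes a throwaway module (the name hotpatch_demo is made up for this example), imports it, overwrites the file on disk, and imports it again. The second import returns the cached module object, so the edit is not seen.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_module_cache():
    """Import a module, patch its file on disk, import again; return what was seen."""
    with tempfile.TemporaryDirectory() as tmp:
        module_file = Path(tmp) / "hotpatch_demo.py"
        module_file.write_text("VERSION = 'original'\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()  # make sure the new file is discoverable
        try:
            first = importlib.import_module("hotpatch_demo")
            before = first.VERSION

            # Simulate a hot-patch: edit the file while the process keeps running.
            module_file.write_text("VERSION = 'patched'\n")

            # This import is served from sys.modules; the file is not re-read.
            second = importlib.import_module("hotpatch_demo")
            return before, second.VERSION, first is second
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("hotpatch_demo", None)


if __name__ == "__main__":
    print(demo_module_cache())
```

Only a new process reads the patched file, and in a container a new process means a new container started from the unpatched image.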


Configuration Management

Configuration is not part of the container image. It is injected at deployment time via:

  • ConfigMaps for non-sensitive configuration
  • Secrets for sensitive values (from HashiCorp Vault)
  • Environment variables for runtime-specific values

Rules

  • Never bake secrets into container images
  • Never hardcode configuration values in code
  • Read all operational values from config files (see config/ports.yaml, config/agent-execution.yaml)
  • Config authority map: ge-ops/wiki/docs/development/contracts/config-authority.md
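In application code, these rules usually translate into reading every operational value from the injected environment and failing fast when one is missing, instead of falling back to a hardcoded default. A sketch of that pattern follows; the variable names GE_API_PORT and GE_LOG_LEVEL and the RuntimeConfig type are invented for illustration and do not come from the GE config files.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical variable names; the real ones are defined by the config authority map.
REQUIRED_KEYS = ("GE_API_PORT", "GE_LOG_LEVEL")


@dataclass(frozen=True)
class RuntimeConfig:
    api_port: int
    log_level: str


def load_config(env: Optional[Mapping[str, str]] = None) -> RuntimeConfig:
    """Build the config from injected values; never fall back to hardcoded defaults."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if key not in env]
    if missing:
        raise RuntimeError(f"missing required configuration: {', '.join(missing)}")
    return RuntimeConfig(
        api_port=int(env["GE_API_PORT"]),
        log_level=env["GE_LOG_LEVEL"],
    )


if __name__ == "__main__":
    print(load_config({"GE_API_PORT": "8080", "GE_LOG_LEVEL": "info"}))
```

Passing the environment as a parameter keeps the loader testable without mutating os.environ.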

What Happens When Someone Breaks Immutability

It has happened. It will happen again. When someone hot-patches a running container or manually applies a change:

  1. The change is invisible to the audit trail
  2. The next deployment overwrites it because the source code does not contain the change
  3. Debugging becomes far harder because the running state does not match any known code version
  4. Compliance is violated: an ISO 27001 audit will flag it

Response:

  • Identify what was changed
  • Apply the change properly through the pipeline
  • Document the incident
  • Determine why the pipeline was bypassed
  • Fix the process gap that allowed it
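The first response step, identifying what was changed, is partly automatable for manifest-level drift. The sketch below compares the images declared in version control against the images actually running and reports every mismatch. Both inputs are plain dictionaries here; how they are collected (for example from git and from the cluster API) is left out, and the function name and example workloads are assumptions. File edits made inside a container, such as a kubectl cp, do not change the image reference and would not be caught by this check.

```python
from typing import Dict, Optional, Tuple

Drift = Dict[str, Tuple[Optional[str], Optional[str]]]


def find_drift(expected: Dict[str, str], running: Dict[str, str]) -> Drift:
    """Map workload name to (expected image, running image) wherever they differ."""
    drift: Drift = {}
    for name in sorted(expected.keys() | running.keys()):
        want, have = expected.get(name), running.get(name)
        if want != have:
            # None on either side means the workload exists in only one place.
            drift[name] = (want, have)
    return drift


if __name__ == "__main__":
    expected = {"ge-admin-ui": "ge-admin-ui:a1b2c3d"}
    running = {"ge-admin-ui": "ge-admin-ui:a1b2c3d", "debug-shell": "busybox:latest"}
    print(find_drift(expected, running))
```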

Ownership

| Role | Agent | Responsibility |
| --- | --- | --- |
| Deployment Coordinator | Leon | Pipeline orchestration, image verification |
| Production Operations | Rutger | Production apply, rollback execution |
| Infrastructure Provisioner | Arjan | Terraform, infrastructure-as-code |
| Kubernetes Operator | Thijmen | Staging verification, manifest management |
| Sysadmin | Gerco | Host-level infrastructure |
| Change Intelligence (Alfa) | Marta | Merge gate enforcement |
| Change Intelligence (Bravo) | Iwona | Merge gate enforcement |

Further Reading