Kubernetes: Checklist
OWNER: gerco (dev), thijmen (staging), rutger (production)
ALSO_USED_BY: arjan, alex, tjitte
LAST_VERIFIED: 2026-03-26
GE_STACK_VERSION: k3s v1.34.x (Zone 1), UpCloud Managed K8s (Zones 2+3)
DEPLOYMENT CHECKLIST (new service)
- [ ] CHECK: Deployment has `livenessProbe` and `readinessProbe`
  IF_SKIPPED: unhealthy pods receive traffic, no automatic restart
- [ ] CHECK: Every container has `resources.requests` AND `resources.limits`
  IF_SKIPPED: resource starvation, OOMKilled without warning
  ADDED_FROM: redis-oom-2026-02, unbounded memory usage
- [ ] CHECK: `securityContext.runAsNonRoot: true` is set
  IF_SKIPPED: ISO 27001 non-compliance, security audit failure
- [ ] CHECK: `allowPrivilegeEscalation: false` on every container
  IF_SKIPPED: container escape risk
- [ ] CHECK: `capabilities.drop: [ALL]` on every container
  IF_SKIPPED: unnecessary kernel capabilities exposed
- [ ] CHECK: GE standard labels applied (`app.kubernetes.io/name`, `part-of`, `managed-by`, `component`, `ge.zone`)
  IF_SKIPPED: monitoring and service mesh routing break
- [ ] CHECK: PodDisruptionBudget created for replicas >= 2
  IF_SKIPPED: all replicas can be evicted simultaneously during node drain
- [ ] CHECK: HPA `maxReplicas` does not exceed 5
  IF_SKIPPED: token burn from runaway scaling
  ADDED_FROM: token-burn-prevention-2026-02
- [ ] CHECK: HPA `scaleUp.stabilizationWindowSeconds` >= 120
  IF_SKIPPED: flapping scale events
- [ ] CHECK: Service selector matches Deployment pod labels exactly
  IF_SKIPPED: traffic goes nowhere (silent failure)
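
As an illustration of the deployment checks, the manifests below sketch a Deployment, PodDisruptionBudget, and HPA that would satisfy them. Everything specific is a placeholder and not taken from a real GE service: the name `example-service`, the image tag, the probe paths, port 8080, the resource sizes, and all label values. The sketch also assumes that `part-of`, `managed-by`, and `component` are short for the `app.kubernetes.io/` labels of the same name.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service              # hypothetical service name
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: example-service
  template:
    metadata:
      labels:                        # label values are placeholders
        app.kubernetes.io/name: example-service
        app.kubernetes.io/part-of: ge
        app.kubernetes.io/managed-by: kubectl
        app.kubernetes.io/component: api
        ge.zone: zone-1
    spec:
      securityContext:
        runAsNonRoot: true
      containers:
        - name: app
          image: example-service:1.0.0   # hypothetical image and tag
          ports:
            - name: http
              containerPort: 8080        # placeholder; real value comes from config/ports.yaml
          livenessProbe:
            httpGet:
              path: /healthz             # hypothetical endpoint
              port: http
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready               # hypothetical endpoint
              port: http
            periodSeconds: 5
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-service
spec:
  minAvailable: 1                    # keeps one replica up during a node drain
  selector:
    matchLabels:
      app.kubernetes.io/name: example-service
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-service
  minReplicas: 2
  maxReplicas: 5                     # checklist ceiling
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 120
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # placeholder target
```

In `autoscaling/v2` the stabilization window sits under `spec.behavior.scaleUp`. The pod template labels are the ones a Service selector has to match. Running `kubectl apply --dry-run=client -f <file>` is one way to catch schema mistakes before a real apply.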
NETWORKING CHECKLIST (new service)
- [ ] CHECK: Service uses ClusterIP (NodePort only for Zone 1 LAN access)
  IF_SKIPPED: unnecessary host port exposure
- [ ] CHECK: Port numbers read from `config/ports.yaml`, not hardcoded
  IF_SKIPPED: port conflicts, configuration drift
  ADDED_FROM: redis-port-6381-2026-02
- [ ] CHECK: Ingress has TLS configured, no plain HTTP
  IF_SKIPPED: traffic interceptable, compliance violation
- [ ] CHECK: NetworkPolicy `default-deny-ingress` exists in namespace
  IF_SKIPPED: all pods accept traffic from anywhere
- [ ] CHECK: Cross-namespace flows have explicit NetworkPolicy allow rules
  IF_SKIPPED: traffic blocked after default-deny is applied
- [ ] CHECK: No `hostNetwork: true` in pod spec
  IF_SKIPPED: port conflicts on rolling updates
  ADDED_FROM: executor-scaling-2026-02
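
The networking checks could look roughly like the manifests below: a ClusterIP Service, the `default-deny-ingress` policy, and one explicit cross-namespace allow rule. The namespaces `example-ns` and `caller-ns`, the service name, and the port values are hypothetical. The allow rule relies on the `kubernetes.io/metadata.name` label that Kubernetes sets on every namespace automatically.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-service              # hypothetical service name
  namespace: example-ns              # hypothetical namespace
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: example-service   # must match the pod labels exactly
  ports:
    - name: http
      port: 80                       # placeholder; real value comes from config/ports.yaml
      targetPort: http
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: example-ns
spec:
  podSelector: {}                    # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress                        # no ingress rules listed, so all ingress is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-caller-ns         # hypothetical policy name
  namespace: example-ns
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: example-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: caller-ns   # hypothetical calling namespace
      ports:
        - protocol: TCP
          port: 8080                 # placeholder container port
```

NetworkPolicies are additive, so the allow rule opens only the listed flow on top of the default deny. Note that `ports` in a NetworkPolicy refers to the pod port, not the Service port.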
SECRETS CHECKLIST (new service)
- [ ] CHECK: All secrets come via ExternalSecret from Vault
  IF_SKIPPED: secrets in git, compliance violation
- [ ] CHECK: No Secret manifests with plain-text values committed
  IF_SKIPPED: credential leak
- [ ] CHECK: ServiceAccount can only read its own Vault path
  IF_SKIPPED: cross-service secret access
- [ ] CHECK: Redis auth uses password from `ge-secrets` secret, key `redis-password`
  IF_SKIPPED: Redis connection refused
  ADDED_FROM: orchestrator-redis-auth-2026-03
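
A minimal sketch of the secrets checks, assuming the External Secrets Operator is what provides `ExternalSecret` here. The `apiVersion` depends on the installed operator version. The store name `vault-backend`, the Vault path `example-service/redis`, and the property `password` are hypothetical. Only the target secret `ge-secrets` and the key `redis-password` come from the checklist.

```yaml
apiVersion: external-secrets.io/v1beta1   # assumption: adjust to the installed operator version
kind: ExternalSecret
metadata:
  name: ge-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: vault-backend              # hypothetical store pointing at Vault
  target:
    name: ge-secrets                 # Kubernetes Secret created by the operator
  data:
    - secretKey: redis-password
      remoteRef:
        key: example-service/redis   # hypothetical Vault path owned by this service
        property: password           # hypothetical field inside that Vault entry
---
# Fragment of a container spec (not a standalone manifest):
# the Redis password is read from the synced Secret, never from a literal value.
env:
  - name: REDIS_PASSWORD             # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: ge-secrets
        key: redis-password
```

With this shape, only the ExternalSecret is committed to git. The Secret itself exists only in the cluster.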
IMAGE CHECKLIST (Zone 1)
- [ ] CHECK: Image built with `build-executor.sh` (or equivalent build script)
  IF_SKIPPED: stale code deployed
- [ ] CHECK: Image imported to k3s via `k3s ctr images import`
  IF_SKIPPED: ImagePullBackOff
- [ ] CHECK: `imagePullPolicy` is `Never` or `IfNotPresent` in Zone 1
  IF_SKIPPED: ImagePullBackOff (no registry exists)
  ADDED_FROM: executor-deployment-2026-02
- [ ] CHECK: No `kubectl cp` used to patch running pods
  IF_SKIPPED: Python module caching makes patches invisible
  ADDED_FROM: executor-hotpatch-failure-2026-02
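
For the pull-policy check, a container spec fragment along these lines would keep the kubelet from contacting a registry that does not exist in Zone 1. The container name and image tag are hypothetical.

```yaml
# Fragment of a pod spec (not a standalone manifest).
containers:
  - name: app                        # hypothetical container name
    image: example-service:1.0.0     # hypothetical tag; must match the imported image exactly
    imagePullPolicy: IfNotPresent    # or Never; both use the locally imported image
```

With `Never`, a pod fails to start if the image was not imported. With `IfNotPresent`, the kubelet attempts a pull in that case, which surfaces as ImagePullBackOff.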
PRE-PROMOTION CHECKLIST (Zone 1 → Zone 2)
- [ ] CHECK: Service runs stable in Zone 1 for minimum 1 week
  IF_SKIPPED: untested in production-like conditions
- [ ] CHECK: Resource limits are validated against actual usage in Zone 1
  IF_SKIPPED: over- or under-provisioned in staging
- [ ] CHECK: NetworkPolicies tested in Zone 1
  IF_SKIPPED: unexpected traffic blocks in Zone 2
- [ ] CHECK: Image uses immutable tag (semver or SHA), not `:latest`
  IF_SKIPPED: deployment reproducibility lost
- [ ] CHECK: thijmen (staging) has approved the promotion
  IF_SKIPPED: uncoordinated deployment
CROSS-REFERENCES
READ_ALSO: wiki/docs/stack/kubernetes/index.md
READ_ALSO: wiki/docs/stack/kubernetes/manifests.md
READ_ALSO: wiki/docs/stack/kubernetes/pitfalls.md
READ_ALSO: wiki/docs/stack/kubernetes/security.md