Kubernetes — Security

OWNER: gerco (dev), thijmen (staging), rutger (production)
ALSO_USED_BY: arjan, alex, tjitte
LAST_VERIFIED: 2026-03-26
GE_STACK_VERSION: k3s v1.34.x (Zone 1), UpCloud Managed K8s (Zones 2+3)


Overview

Kubernetes security in GE covers RBAC, SecurityContext, Pod Security Standards,
secrets management via External Secrets Operator + HashiCorp Vault, network
policies, image security, and audit logging. All security controls align with
ISO 27001 and SOC 2 Type II requirements.


RBAC (Role-Based Access Control)

GE uses RBAC to restrict what each component can do in the cluster.

IF: creating a new service that needs Kubernetes API access
THEN: create a ServiceAccount + Role + RoleBinding (namespace-scoped)
THEN: never use ClusterRole unless the service truly needs cluster-wide access

apiVersion: v1  
kind: ServiceAccount  
metadata:  
  name: ge-{service}  
  namespace: ge-{namespace}  
---  
apiVersion: rbac.authorization.k8s.io/v1  
kind: Role  
metadata:  
  name: ge-{service}-role  
  namespace: ge-{namespace}  
rules:  
  - apiGroups: [""]  
    resources: ["pods", "services"]  
    verbs: ["get", "list", "watch"]  
  - apiGroups: ["apps"]  
    resources: ["deployments"]  
    verbs: ["get", "list", "watch"]  
---  
apiVersion: rbac.authorization.k8s.io/v1  
kind: RoleBinding  
metadata:  
  name: ge-{service}-binding  
  namespace: ge-{namespace}  
subjects:  
  - kind: ServiceAccount  
    name: ge-{service}  
    namespace: ge-{namespace}  
roleRef:  
  kind: Role  
  name: ge-{service}-role  
  apiGroup: rbac.authorization.k8s.io  

CHECK: no ServiceAccount has cluster-admin binding
CHECK: each ServiceAccount has minimum required permissions
CHECK: services are bound with namespace-scoped RoleBindings, not ClusterRoleBindings

ANTI_PATTERN: granting * verbs on * resources
FIX: enumerate specific resources and verbs needed
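
A minimal sketch of the fix, assuming a hypothetical service that only needs to read ConfigMaps and one named Secret. The resources chosen here are illustrative and not taken from GE manifests.

```yaml
# ANTI_PATTERN (do not use): wildcard access to everything in the namespace
# rules:
#   - apiGroups: ["*"]
#     resources: ["*"]
#     verbs: ["*"]

# FIX: enumerate only what the service needs
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["ge-{service}-secrets"]  # hypothetical: restrict to one named Secret
    verbs: ["get"]
```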


SecurityContext

Every GE pod runs as non-root with a read-only root filesystem where possible.

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault  # required by the restricted Pod Security Standard
  containers:
    - name: ge-{service}
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL

CHECK: runAsNonRoot: true on every pod
CHECK: allowPrivilegeEscalation: false on every container
CHECK: capabilities.drop: [ALL] on every container

IF: container needs to write files (e.g., executor writing COMP-*.md)
THEN: use an emptyDir volume, not a writable root filesystem

spec:
  containers:
    - name: ge-{service}
      volumeMounts:        # container level
        - name: tmp
          mountPath: /tmp
        - name: work
          mountPath: /work
  volumes:                 # pod level
    - name: tmp
      emptyDir: {}
    - name: work
      emptyDir: {}


Pod Security Standards

GE enforces the restricted Pod Security Standard on all namespaces
in Zones 2 and 3. Zone 1 uses baseline for development flexibility.

apiVersion: v1  
kind: Namespace  
metadata:  
  name: ge-{namespace}  
  labels:  
    pod-security.kubernetes.io/enforce: restricted  
    pod-security.kubernetes.io/audit: restricted  
    pod-security.kubernetes.io/warn: restricted  
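
For Zone 1, the baseline equivalent might look like the sketch below. Setting audit and warn to restricted is an assumption, not a documented GE setting; it surfaces restricted-level violations in dev without blocking pods.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ge-{namespace}
  labels:
    # Zone 1 (dev): enforce baseline
    pod-security.kubernetes.io/enforce: baseline
    # Assumption: still report restricted violations so they are fixed before Zone 2/3
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```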

IF: pod is rejected by PodSecurity admission
THEN: fix the security context — do not lower the policy level
THEN: check runAsNonRoot, capabilities, seccompProfile


Secrets Management

GE uses External Secrets Operator (ESO) to sync secrets from HashiCorp Vault
into Kubernetes Secrets. No secrets are stored in git or plain manifests.

apiVersion: external-secrets.io/v1beta1  
kind: ExternalSecret  
metadata:  
  name: ge-{service}-secrets  
  namespace: ge-{namespace}  
spec:  
  refreshInterval: 1h  
  secretStoreRef:  
    name: vault-backend  
    kind: ClusterSecretStore  
  target:  
    name: ge-{service}-secrets  
    creationPolicy: Owner  
  data:  
    - secretKey: database-url  
      remoteRef:  
        key: ge/{namespace}/{service}  
        property: database-url  
    - secretKey: api-key  
      remoteRef:  
        key: ge/{namespace}/{service}  
        property: api-key  

CHECK: no Secret manifests with plain-text values in git
CHECK: all secrets come through ExternalSecret resources
CHECK: refreshInterval is set (1h default — shorter for rotating secrets)

IF: a pod needs a secret
THEN: create an ExternalSecret, not a Kubernetes Secret
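
Once ESO has synced the target Secret, a pod consumes it like any other Kubernetes Secret. A minimal sketch, assuming the ExternalSecret above; the environment variable names DATABASE_URL and API_KEY are hypothetical.

```yaml
spec:
  containers:
    - name: ge-{service}
      env:
        - name: DATABASE_URL              # hypothetical variable name
          valueFrom:
            secretKeyRef:
              name: ge-{service}-secrets  # target.name of the ExternalSecret
              key: database-url           # secretKey of the ExternalSecret
        - name: API_KEY                   # hypothetical variable name
          valueFrom:
            secretKeyRef:
              name: ge-{service}-secrets
              key: api-key
```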

IF: secret not syncing
THEN: check ExternalSecret status
RUN: kubectl get externalsecrets -n ge-{namespace}
RUN: kubectl describe externalsecret ge-{service}-secrets -n ge-{namespace}


Vault Integration

HashiCorp Vault is the single source of truth (SSOT) for all secrets.
ESO authenticates to Vault via the Kubernetes auth method.

Key Vault paths:

| Path | Contents |
|---|---|
| ge/data/redis | Redis password |
| ge/data/postgres | Database credentials |
| ge/data/api-keys | LLM provider API keys (Anthropic, OpenAI) |
| ge/data/{service} | Per-service secrets |

CHECK: Vault policies are scoped per namespace
CHECK: each ServiceAccount can only read its own Vault path
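
The vault-backend store referenced by ExternalSecret resources could be defined roughly as follows. This is a sketch under stated assumptions, not the actual GE store: the server address, Vault role, and ESO ServiceAccount name and namespace are hypothetical, and KV version 2 is assumed.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend                   # name used in secretStoreRef
spec:
  provider:
    vault:
      server: "https://vault.example.internal:8200"  # hypothetical address
      version: "v2"                     # assumption: KV version 2 engine
      auth:
        kubernetes:
          mountPath: "kubernetes"       # assumption: default auth mount path
          role: "external-secrets"      # hypothetical Vault role
          serviceAccountRef:
            name: "external-secrets"    # hypothetical ESO ServiceAccount
            namespace: "external-secrets"
```

The Vault role bound to this ServiceAccount is where the per-namespace policy scoping from the checks above would be enforced.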

IF: Redis authentication fails
THEN: check the ge-secrets secret in ge-data namespace
THEN: verify the key is redis-password
THEN: ensure liveness/readiness probes include Redis auth
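
A probe that authenticates could be sketched as below. It assumes the Redis image ships redis-cli and sh, and that Redis listens on port 6381 as in the network policies; the container name and probe timings are hypothetical. REDISCLI_AUTH is the environment variable redis-cli reads for its password.

```yaml
containers:
  - name: redis                        # hypothetical container name
    env:
      - name: REDISCLI_AUTH            # read by redis-cli for authentication
        valueFrom:
          secretKeyRef:
            name: ge-secrets
            key: redis-password
    readinessProbe:
      exec:
        command:
          - sh
          - -c
          - '[ "$(redis-cli -p 6381 ping)" = "PONG" ]'  # fails on NOAUTH or no reply
      initialDelaySeconds: 5           # assumption: tune per workload
      periodSeconds: 10
    livenessProbe:
      exec:
        command:
          - sh
          - -c
          - '[ "$(redis-cli -p 6381 ping)" = "PONG" ]'
      initialDelaySeconds: 15
      periodSeconds: 20
```

Comparing the reply to PONG avoids relying on the redis-cli exit code, which may not signal an authentication error.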


Network Policies

Every namespace has a default-deny ingress policy.
Explicit allow policies are created for each communication path.

CHECK: default-deny-ingress exists in every namespace
CHECK: every cross-namespace flow has an explicit NetworkPolicy
READ_ALSO: wiki/docs/stack/kubernetes/networking.md
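
A default-deny ingress policy is typically written as an empty pod selector with no ingress rules. A minimal sketch, using the default-deny-ingress name from the check above:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ge-{namespace}
spec:
  podSelector: {}        # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress            # no ingress rules listed, so all inbound traffic is denied
```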

Key policies in GE:

| Policy | Namespace | Allows |
|---|---|---|
| allow-admin-ui-to-postgres | ge-data | ge-system → postgres:5432 |
| orchestrator-to-postgres | ge-agents | ge-orchestrator → postgres:5432 |
| allow-executor-to-redis | ge-data | ge-agents → redis:6381 |
| allow-system-to-redis | ge-data | ge-system → redis:6381 |
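
As an illustration of an explicit allow policy, allow-executor-to-redis might look like the sketch below. The app: redis pod label is hypothetical; the kubernetes.io/metadata.name namespace label is set automatically by Kubernetes.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-executor-to-redis
  namespace: ge-data
spec:
  podSelector:
    matchLabels:
      app: redis                      # hypothetical label on the Redis pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ge-agents  # source namespace
      ports:
        - protocol: TCP
          port: 6381
```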


Image Security

CHECK: all container images are built locally, not pulled from public registries in production
CHECK: Zone 1 images are imported to k3s via k3s ctr images import
CHECK: no :latest tag in Zone 2+3 manifests — use SHA digests or semver tags

IF: Zone 1 (dev)
THEN: :latest tag is acceptable for rapid iteration

IF: Zone 2 or 3
THEN: use immutable tags — ge-{service}:v1.2.3 or ge-{service}@sha256:...
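
In a Zone 2 or 3 manifest, the container image reference could be pinned as sketched here. The imagePullPolicy value is an assumption that fits images already present on the node or in a private registry.

```yaml
containers:
  - name: ge-{service}
    image: ge-{service}:v1.2.3      # immutable semver tag; ge-{service}@sha256:... also works
    imagePullPolicy: IfNotPresent   # assumption: no forced re-pull of an immutable tag
```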


Audit Logging

In Zone 1, k3s API server audit logging captures all API calls.
Zone 2+3 audit logs are managed by UpCloud.

IF: investigating a security incident
THEN: check API server audit logs
RUN: sudo tail -n 100 /var/log/k3s/audit.log (Zone 1)
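
The API server only writes an audit log when it is given an audit policy and a log path. A minimal sketch of how Zone 1 could be configured, assuming the log path used above; the policy file location and retention value are hypothetical.

```yaml
# Hypothetical file: /var/lib/rancher/k3s/server/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log request metadata (user, verb, resource, timestamp) for every API call
  - level: Metadata
---
# Assumed k3s server config: /etc/rancher/k3s/config.yaml
kube-apiserver-arg:
  - "audit-log-path=/var/log/k3s/audit.log"
  - "audit-policy-file=/var/lib/rancher/k3s/server/audit-policy.yaml"
  - "audit-log-maxage=30"   # assumption: retention in days
```

Metadata level records who did what without storing request bodies, which keeps Secret values out of the log.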


Cross-References

READ_ALSO: wiki/docs/stack/kubernetes/index.md
READ_ALSO: wiki/docs/stack/kubernetes/networking.md
READ_ALSO: wiki/docs/stack/kubernetes/manifests.md
READ_ALSO: wiki/docs/stack/kubernetes/pitfalls.md
READ_ALSO: wiki/docs/stack/kubernetes/checklist.md