
DOMAIN:INFRASTRUCTURE:THOUGHT_LEADERS

OWNER: arjan (infrastructure), gerco (k3s/sysadmin), thijmen (k8s Zone 2), rutger (k8s Zone 3)
UPDATED: 2026-03-24
SCOPE: key people, organizations, and resources for infrastructure, Kubernetes, and IaC


LEADERS:KUBERNETES

KELSEY_HIGHTOWER

ROLE: former Google Cloud Developer Advocate, Kubernetes evangelist (retired 2023)
GITHUB: github.com/kelseyhightower
WHY_RELEVANT: shaped how the industry thinks about Kubernetes. His "Kubernetes The Hard Way" tutorial is the gold standard for understanding k8s internals: not just using kubectl, but understanding what happens underneath.
KEY_CONTRIBUTIONS:
- "Kubernetes The Hard Way": step-by-step manual cluster setup (no automation). Essential for understanding k8s architecture.
- "Tetris" keynote demo: used the game to explain how the Kubernetes scheduler packs workloads onto nodes
- Keynotes at KubeCon and Google Cloud Next that shaped the k8s adoption narrative
- Advocated for simplicity: "Kubernetes is a platform for building platforms"
LEARN_FROM: his GitHub repos and conference talks; focus on WHY decisions are made, not just HOW
QUOTE: "No one wakes up wanting to deploy Kubernetes. They want to ship their product."
RELEVANCE_TO_GE: GE agents should understand k8s internals (not just kubectl copy-paste); Kelsey's material is the starting point

BRENDAN_BURNS

ROLE: co-founder of Kubernetes, Corporate VP at Microsoft Azure
BOOK: "Kubernetes Up & Running" (O'Reilly, co-authored with Joe Beda & Kelsey Hightower)
WHY_RELEVANT: co-created Kubernetes with Joe Beda and Craig McLuckie at Google. His book is the canonical reference.
KEY_CONTRIBUTIONS:
- Co-created Kubernetes (open-sourced from Google's Borg/Omega experience)
- "Kubernetes Up & Running": the definitive book (4th edition, 2024)
- "Designing Distributed Systems": patterns for container-based systems
- Led Kubernetes from Google internal project to CNCF donation
LEARN_FROM: "Kubernetes Up & Running" for operational patterns, "Designing Distributed Systems" for architecture
RELEVANCE_TO_GE: his patterns for multi-node systems apply directly to GE's three-zone architecture

JOE_BEDA

ROLE: co-founder of Kubernetes, co-founder of Heptio (acquired by VMware)
WHY_RELEVANT: co-created k8s and later founded Heptio to make k8s enterprise-ready. His focus on operational excellence aligns with GE's production-first mindset.
LEARN_FROM: Heptio's approach to making k8s enterprise-grade, the same challenge GE faces

TIM_HOCKIN

ROLE: Principal Software Engineer at Google, Kubernetes SIG lead
GITHUB: github.com/thockin
WHY_RELEVANT: deep k8s networking expert. His presentations on k8s networking internals are among the best resources for understanding Services, DNS, CNI, and NetworkPolicies.
KEY_CONTRIBUTIONS:
- K8s networking architecture (Services, kube-proxy, DNS)
- K8s SIG-Network leadership
- Extensive technical talks on k8s internals
LEARN_FROM: his KubeCon talks on k8s networking; essential for stef and anyone debugging network issues


LEADERS:K3S_AND_RANCHER

DARREN_SHEPHERD

ROLE: co-founder of Rancher Labs, creator of k3s
GITHUB: github.com/ibuildthecloud
WHY_RELEVANT: created k3s, the lightweight k8s distribution GE uses in Zone 1. Understanding his design decisions helps explain k3s limitations and strengths.
KEY_CONTRIBUTIONS:
- Created k3s: Kubernetes for edge/IoT/resource-constrained environments
- Co-founded Rancher Labs (acquired by SUSE)
- RancherOS: minimal Linux for containers
PHILOSOPHY: "Kubernetes should be simple enough to run on a Raspberry Pi"
RELEVANCE_TO_GE: k3s on Minisforum 790 Pro is GE's Zone 1, so Darren's design decisions directly affect GE's dev environment

RANCHER_DOCS

URL: docs.k3s.io
WHY_RELEVANT: official k3s documentation, the primary reference for Zone 1 operations
KEY_SECTIONS:
- Installation and configuration
- Networking (Flannel CNI, Traefik ingress)
- Storage (local-path provisioner)
- Upgrades and maintenance
- Known limitations vs full k8s
RULE: when something behaves differently in k3s vs full k8s, check k3s docs first


LEADERS:HASHICORP_TERRAFORM

MITCHELL_HASHIMOTO

ROLE: co-founder of HashiCorp (stepped down as CTO in 2021, left the company in 2023)
GITHUB: github.com/mitchellh
WHY_RELEVANT: created Terraform (and Vagrant, Consul, Vault). His philosophy of "infrastructure as code" is the foundation of arjan's entire workflow.
KEY_CONTRIBUTIONS:
- Created Terraform: IaC industry standard
- Created HashiCorp Vault: GE's secrets management
- "Tao of HashiCorp": guiding principles for infrastructure tools
PHILOSOPHY: "Workflows, not technologies. Simple, modular, composable."
LEARN_FROM: "Tao of HashiCorp" for the principles that shaped Terraform's design

ARMON_DADGAR

ROLE: co-founder of HashiCorp, CTO
WHY_RELEVANT: co-designed Terraform's architecture and state management. His talks on Terraform best practices are canonical.
LEARN_FROM: HashiConf keynotes on infrastructure lifecycle management

HASHICORP_LEARN

URL: developer.hashicorp.com/terraform
WHY_RELEVANT: official Terraform tutorials and documentation, the primary reference for arjan
KEY_SECTIONS:
- Terraform language reference (HCL)
- Provider documentation (UpCloud, TransIP, BunnyCDN)
- State management patterns
- Module development best practices
- Workspace strategies for multi-environment

TERRAFORM_BEST_PRACTICES

REPO: github.com/ozbillwang/terraform-best-practices
WHY_RELEVANT: community-curated Terraform patterns, module structure, naming conventions, state management
ALSO: github.com/antonbabenko (Anton Babenko's Terraform modules and talks are an industry reference)


LEADERS:UPCLOUD

UPCLOUD_DOCUMENTATION

URL: upcloud.com/docs
WHY_RELEVANT: GE's cloud provider for Zone 2 and Zone 3, and the primary reference for all UpCloud resources
KEY_SECTIONS:
- Managed Kubernetes documentation
- Managed Database (PostgreSQL) documentation
- API reference (used by Terraform provider)
- Network and firewall documentation
- Object Storage (S3-compatible)
TERRAFORM_PROVIDER_DOCS: registry.terraform.io/providers/UpCloudLtd/upcloud/latest/docs

UPCLOUD_DIFFERENTIATORS

WHY_GE_CHOSE_UPCLOUD:
- EU-based company (Helsinki, Finland): data sovereignty alignment
- European data centers: Frankfurt, Amsterdam, Helsinki, Warsaw
- Managed Kubernetes with full API access
- Managed PostgreSQL with automated backups
- Competitive pricing for SME workloads
- No US CLOUD Act jurisdiction concerns
RULE: if evaluating alternatives, EU data sovereignty is NON-NEGOTIABLE


LEADERS:CNCF_LANDSCAPE

CLOUD_NATIVE_COMPUTING_FOUNDATION

URL: landscape.cncf.io
WHY_RELEVANT: CNCF hosts or catalogs much of the ecosystem GE operates in. Kubernetes, Prometheus, Helm, and cert-manager are CNCF-hosted projects; Traefik, Loki, and Grafana appear on the landscape but are maintained by their vendors, not hosted by CNCF.

KEY_LANDSCAPE_PROJECTS_USED_BY_GE:

| Project | Status | GE Usage |
|---|---|---|
| Kubernetes | Graduated | Core orchestration (k3s + UpCloud MKE) |
| Prometheus | Graduated | Metrics collection and alerting |
| Helm | Graduated | Package management for k8s |
| cert-manager | Graduated (since 2024) | TLS certificate automation |
| Traefik | Not CNCF-hosted (Traefik Labs) | Ingress controller |
| Loki | Not CNCF-hosted (Grafana Labs) | Log aggregation |
| Grafana | Not CNCF-hosted (Grafana Labs) | Visualization and dashboards |

CNCF_TRAIL_MAP

URL: github.com/cncf/landscape#trail-map
PURPOSE: recommended adoption path for cloud-native technologies
GE_STATUS: containerization (done) -> CI/CD (done) -> orchestration (done) -> observability (done) -> service mesh (deferred) -> security (ongoing)


LEADERS:CONTAINERS_AND_RUNTIME

DOCKER

URL: docs.docker.com
WHY_RELEVANT: GE uses Docker for image building (Zone 1). k3s uses the containerd runtime.
KEY_REFERENCE: Dockerfile best practices: multi-stage builds, layer caching, security scanning
RULE: images should be minimal (Alpine-based or distroless when possible)
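The minimal-image RULE can be checked mechanically. What follows is a minimal Python sketch, not GE's actual tooling: `final_base_image` and `is_minimal` are hypothetical helper names, and matching on the substrings "alpine" and "distroless" is an assumed heuristic rather than a complete policy. It looks only at the last build stage, because in a multi-stage build that is the stage that ships.

```python
import re

# Assumed heuristic: a base image counts as "minimal" if its name contains
# one of these markers. This mirrors the RULE, it is not an official list.
MINIMAL_MARKERS = ("alpine", "distroless")

# Matches: FROM [--platform=...] <image> [AS <alias>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?",
    re.IGNORECASE,
)


def final_base_image(dockerfile_text):
    """Return the base image of the last build stage, or None if there is no FROM.

    If a stage is built FROM an earlier stage alias, the alias is resolved
    back to the image that stage started from.
    """
    stages = {}  # alias (lowercased) -> resolved base image
    last = None
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image, alias = match.group(1), match.group(2)
        image = stages.get(image.lower(), image)
        if alias:
            stages[alias.lower()] = image
        last = image
    return last


def is_minimal(dockerfile_text):
    """True if the final stage starts from an Alpine-based or distroless image."""
    base = final_base_image(dockerfile_text)
    return base is not None and any(m in base.lower() for m in MINIMAL_MARKERS)
```

A builder stage such as `FROM golang:1.22 AS build` is ignored as long as the final stage starts from something like `gcr.io/distroless/static-debian12`; a single-stage `FROM ubuntu:24.04` would be flagged.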

OCI (Open Container Initiative)

URL: opencontainers.org
WHY_RELEVANT: OCI defines the image and runtime specs that Docker, containerd, and k3s all implement
LEARN_FROM: understanding OCI explains why images are portable across Zone 1 (k3s) and Zone 2/3 (UpCloud MKE)


RESOURCES:BOOKS

| Book | Author | Relevance |
|---|---|---|
| Kubernetes Up & Running (4th ed) | Burns, Beda, Hightower | Canonical k8s reference |
| Kubernetes Patterns (2nd ed) | Ibryam, Huss | Design patterns for k8s workloads |
| Terraform: Up & Running (3rd ed) | Brikman | Terraform best practices, module design |
| Designing Distributed Systems | Brendan Burns | Container-based system patterns |
| The Phoenix Project | Kim, Behr, Spafford | DevOps culture and principles |
| Site Reliability Engineering | Google SRE team | Production operations, SLOs, incident response |
| Infrastructure as Code (2nd ed) | Kief Morris | IaC patterns beyond just Terraform |
| Cloud Native Infrastructure | Garrison, Nova | Infrastructure management in cloud-native world |

RESOURCES:CONFERENCES

| Conference | Focus | Why |
|---|---|---|
| KubeCon + CloudNativeCon | Kubernetes, CNCF ecosystem | Primary industry event, bleeding-edge k8s |
| HashiConf | Terraform, Vault, Consul | Best practices for IaC and secrets |
| FOSDEM | Open source, infrastructure | European focus, free, community-driven |
| DevOpsDays Amsterdam | DevOps culture, tools | Local to GE (Netherlands), networking |
| SREcon | Site reliability, production ops | SRE practices, incident management |

RESOURCES:NEWSLETTERS_AND_BLOGS

| Source | Focus | Frequency |
|---|---|---|
| KubeWeekly (CNCF) | Kubernetes news and tutorials | Weekly |
| DevOps Weekly | Tools, practices, culture | Weekly |
| docs.k3s.io/blog | k3s updates and guides | On release |
| upcloud.com/blog | UpCloud product updates | Weekly |
| learnk8s.io/blog | Practical k8s tutorials | Monthly |
| thenewstack.io | Cloud native ecosystem | Daily |
| hashicorp.com/blog | Terraform, Vault updates | Weekly |

RESOURCES:REPOSITORIES

| Repository | Why | Stars |
|---|---|---|
| kelseyhightower/kubernetes-the-hard-way | Deep k8s understanding | 40k+ |
| k3s-io/k3s | GE's Zone 1 runtime; read issues for gotchas | 28k+ |
| UpCloudLtd/terraform-provider-upcloud | GE's cloud Terraform provider | ~100 |
| hashicorp/terraform | IaC tool, core reference | 42k+ |
| traefik/traefik | GE's ingress controller | 51k+ |
| cert-manager/cert-manager | TLS automation | 12k+ |
| prometheus/prometheus | Metrics and alerting | 55k+ |
| grafana/loki | Log aggregation | 24k+ |
| borgbackup/borg | Backup tool (dedup + encryption) | 11k+ |

RESOURCES:CERTIFICATIONS

| Certification | Provider | Relevance |
|---|---|---|
| CKA (Certified Kubernetes Administrator) | CNCF / Linux Foundation | Validates k8s operations skills |
| CKAD (Certified Kubernetes Application Developer) | CNCF / Linux Foundation | Validates k8s workload design |
| CKS (Certified Kubernetes Security Specialist) | CNCF / Linux Foundation | Validates k8s security hardening |
| HashiCorp Certified: Terraform Associate | HashiCorp | Validates Terraform proficiency |

NOTE: agents do not take certifications, but the exam objectives serve as skill checklists for what infrastructure agents should know


LEADERS:AGENTIC_NOTE

FOR_AGENTS_WORKING_ON_INFRASTRUCTURE:
- When debugging k8s, check k3s docs first (Zone 1 may behave differently from full k8s)
- When writing Terraform, check UpCloud provider docs for resource-specific options
- When designing backup strategies, reference Google SRE book chapters on data integrity
- When troubleshooting networking, reference Tim Hockin's k8s networking talks
- PREFER official docs over blog posts: infrastructure blog posts go stale fast
- PREFER k3s-specific solutions over generic k8s solutions for Zone 1
- NEVER implement a pattern from a blog post without checking the version: k8s APIs change frequently
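The version-check rule can be partly automated before a copied manifest is applied. The Python sketch below is illustrative only: `REMOVED_APIS` is a small hand-picked subset of upstream Kubernetes API removals (the official deprecation guide is the authoritative list), `find_removed_apis` and `parse_version` are hypothetical helper names, and the regex-based parsing assumes simple, unquoted top-level `apiVersion:` and `kind:` fields rather than full YAML.

```python
import re

# Illustrative subset of upstream removals:
# (apiVersion, kind) -> Kubernetes release that stopped serving it.
REMOVED_APIS = {
    ("apps/v1beta1", "Deployment"): "1.16",
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
    ("policy/v1beta1", "PodSecurityPolicy"): "1.25",
}


def parse_version(version):
    """Turn 'v1.29.4+k3s1' or '1.22' into a comparable (major, minor) tuple."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def find_removed_apis(manifest_text, cluster_version):
    """Return (apiVersion, kind, removed_in) for each document in the manifest
    whose API is no longer served by the given cluster version."""
    cluster = parse_version(cluster_version)
    findings = []
    # Multi-document manifests are separated by lines containing only '---'.
    for doc in re.split(r"^---\s*$", manifest_text, flags=re.MULTILINE):
        api = re.search(r"^apiVersion:\s*(\S+)", doc, flags=re.MULTILINE)
        kind = re.search(r"^kind:\s*(\S+)", doc, flags=re.MULTILINE)
        if not (api and kind):
            continue
        key = (api.group(1), kind.group(1))
        removed_in = REMOVED_APIS.get(key)
        if removed_in and cluster >= parse_version(removed_in):
            findings.append((key[0], key[1], removed_in))
    return findings
```

Passing the manifest text from a blog post together with the target cluster's version string returns an empty list when nothing in the table matches. That means only that none of the listed removals apply, so the official docs for the cluster's version remain the final check.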