DOMAIN:INFRASTRUCTURE:THOUGHT_LEADERS

OWNER: arjan (infrastructure), gerco (k3s/sysadmin), thijmen (k8s Zone 2), rutger (k8s Zone 3)
UPDATED: 2026-03-24
SCOPE: key people, organizations, and resources for infrastructure, Kubernetes, and IaC

LEADERS:KUBERNETES

KELSEY_HIGHTOWER

ROLE: former Google Cloud Developer Advocate, Kubernetes evangelist (retired 2023)
GITHUB: github.com/kelseyhightower
WHY_RELEVANT: defined how the industry thinks about Kubernetes. His "Kubernetes The Hard Way" tutorial is the gold standard for understanding k8s internals: not just using kubectl, but understanding what happens underneath.
KEY_CONTRIBUTIONS:
- "Kubernetes The Hard Way": step-by-step manual cluster setup (no automation). Essential for understanding k8s architecture.
- "Tetris": serverless on k8s demo that showed the power of CRDs
- Keynotes at KubeCon, Google Cloud Next: shaped k8s adoption narrative
- Advocated for simplicity: "Kubernetes is a platform for building platforms"
LEARN_FROM: his GitHub repos and conference talks; focus on WHY decisions are made, not just HOW
QUOTE: "No one wakes up wanting to deploy Kubernetes. They want to ship their product."
RELEVANCE_TO_GE: GE agents should understand k8s internals (not just kubectl copy-paste); Kelsey's material is the starting point

BRENDAN_BURNS

ROLE: co-founder of Kubernetes, Corporate VP at Microsoft Azure
BOOK: "Kubernetes Up & Running" (O'Reilly, co-authored with Joe Beda & Kelsey Hightower)
WHY_RELEVANT: co-created Kubernetes (with Joe Beda and Craig McLuckie at Google). His book is the canonical reference.
KEY_CONTRIBUTIONS:
- Co-created Kubernetes (open-sourced from Google's Borg/Omega experience)
- "Kubernetes Up & Running": the definitive book (4th edition, 2024)
- "Designing Distributed Systems": patterns for container-based systems
- Led Kubernetes from Google internal project to CNCF donation
LEARN_FROM: "Kubernetes Up & Running" for operational patterns, "Designing Distributed Systems" for architecture
RELEVANCE_TO_GE: his patterns for multi-node systems directly apply to GE's three-zone architecture

JOE_BEDA

ROLE: co-founder of Kubernetes, co-founder of Heptio (acquired by VMware)
WHY_RELEVANT: co-created k8s and later founded Heptio to make k8s enterprise-ready. His focus on operational excellence aligns with GE's production-first mindset.
LEARN_FROM: Heptio's approach to making k8s enterprise-grade, the same challenge GE faces

TIM_HOCKIN

ROLE: Principal Software Engineer at Google, Kubernetes SIG lead
GITHUB: github.com/thockin
WHY_RELEVANT: deep k8s networking expert. His presentations on k8s networking internals are the best resource for understanding Services, DNS, CNI, and NetworkPolicies.
KEY_CONTRIBUTIONS:
- K8s networking architecture (Services, kube-proxy, DNS)
- K8s SIG-Network leadership
- Extensive technical talks on k8s internals
LEARN_FROM: his KubeCon talks on k8s networking; essential for stef and anyone debugging network issues

LEADERS:K3S_AND_RANCHER

DARREN_SHEPHERD

ROLE: co-founder of Rancher Labs, creator of k3s
GITHUB: github.com/ibuildthecloud
WHY_RELEVANT: created k3s, the lightweight k8s distribution GE uses in Zone 1. Understanding his design decisions helps understand k3s limitations and strengths.
KEY_CONTRIBUTIONS:
- Created k3s: Kubernetes for edge/IoT/resource-constrained environments
- Co-founded Rancher Labs (acquired by SUSE)
- RancherOS: minimal Linux for containers
PHILOSOPHY: "Kubernetes should be simple enough to run on a Raspberry Pi"
RELEVANCE_TO_GE: k3s on Minisforum 790 Pro is GE's Zone 1, so Darren's design decisions directly affect GE's dev environment

RANCHER_DOCS

URL: docs.k3s.io
WHY_RELEVANT: official k3s documentation; primary reference for Zone 1 operations
KEY_SECTIONS:
- Installation and configuration
- Networking (Flannel CNI, Traefik ingress)
- Storage (local-path provisioner)
- Upgrades and maintenance
- Known limitations vs full k8s
RULE: when something behaves differently in k3s vs full k8s, check k3s docs first

LEADERS:HASHICORP_TERRAFORM

MITCHELL_HASHIMOTO

ROLE: co-founder of HashiCorp (stepped down as CTO in 2021, left the company in 2023)
GITHUB: github.com/mitchellh
WHY_RELEVANT: created Terraform (and Vagrant, Consul, Vault). His philosophy of "infrastructure as code" is the foundation of arjan's entire workflow.
KEY_CONTRIBUTIONS:
- Created Terraform: IaC industry standard
- Created HashiCorp Vault: GE's secrets management
- "Tao of HashiCorp": guiding principles for infrastructure tools
PHILOSOPHY: "Workflows, not technologies. Simple, modular, composable."
LEARN_FROM: "Tao of HashiCorp", the principles that shaped Terraform's design

ARMON_DADGAR

ROLE: co-founder of HashiCorp, CTO
WHY_RELEVANT: co-designed Terraform's architecture and state management. His talks on Terraform best practices are canonical.
LEARN_FROM: HashiConf keynotes on infrastructure lifecycle management

HASHICORP_LEARN

URL: developer.hashicorp.com/terraform
WHY_RELEVANT: official Terraform tutorials and documentation; primary reference for arjan
KEY_SECTIONS:
- Terraform language reference (HCL)
- Provider documentation (UpCloud, TransIP, BunnyCDN)
- State management patterns
- Module development best practices
- Workspace strategies for multi-environment

TERRAFORM_BEST_PRACTICES

REPO: github.com/ozbillwang/terraform-best-practices
WHY_RELEVANT: community-curated Terraform patterns, module structure, naming conventions, state management
ALSO: github.com/antonbabenko (Anton Babenko's Terraform modules and talks are an industry reference)

LEADERS:UPCLOUD

UPCLOUD_DOCUMENTATION

URL: upcloud.com/docs
WHY_RELEVANT: GE's cloud provider for Zone 2 and Zone 3; primary reference for all UpCloud resources
KEY_SECTIONS:
- Managed Kubernetes documentation
- Managed Database (PostgreSQL) documentation
- API reference (used by Terraform provider)
- Network and firewall documentation
- Object Storage (S3-compatible)
TERRAFORM_PROVIDER_DOCS: registry.terraform.io/providers/UpCloudLtd/upcloud/latest/docs

UPCLOUD_DIFFERENTIATORS

WHY_GE_CHOSE_UPCLOUD:
- EU-based company (Helsinki, Finland): data sovereignty alignment
- European data centers: Frankfurt, Amsterdam, Helsinki, Warsaw
- Managed Kubernetes with full API access
- Managed PostgreSQL with automated backups
- Competitive pricing for SME workloads
- No US CLOUD Act jurisdiction concerns
RULE: if evaluating alternatives, EU data sovereignty is NON-NEGOTIABLE
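
EXAMPLE (illustrative sketch, not GE tooling): the sovereignty rule above can be expressed as an allow-list check that runs before any infrastructure change is planned. The zone identifiers in `EU_ZONES` and the helper name `check_zones` are assumptions made for this sketch; confirm the real zone IDs against the UpCloud API or provider docs before relying on them.

```python
# Hypothetical allow-list built from the EU locations named in this section.
# The zone IDs are assumptions for illustration; verify them against UpCloud's docs.
EU_ZONES = {
    "de-fra1": "Frankfurt",
    "nl-ams1": "Amsterdam",
    "fi-hel1": "Helsinki",
    "pl-waw1": "Warsaw",
}


def check_zones(requested):
    """Return the requested zone IDs that are NOT on the EU allow-list."""
    return sorted(zone for zone in requested if zone not in EU_ZONES)


if __name__ == "__main__":
    violations = check_zones(["de-fra1", "us-nyc1", "nl-ams1"])
    if violations:
        print("non-EU zones requested:", ", ".join(violations))
    else:
        print("all requested zones are on the EU allow-list")
```

An allow-list is used here instead of a deny-list so that any new or unknown zone is rejected by default, which matches the NON-NEGOTIABLE wording of the rule.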

LEADERS:CNCF_LANDSCAPE

CLOUD_NATIVE_COMPUTING_FOUNDATION

URL: landscape.cncf.io
WHY_RELEVANT: CNCF hosts much of the ecosystem GE operates in. Kubernetes, Prometheus, Helm, and cert-manager are CNCF projects. Traefik, Loki, and Grafana appear in the landscape but are vendor-led projects, not CNCF-hosted ones.

KEY_CNCF_PROJECTS_USED_BY_GE:

| Project | CNCF Status | GE Usage |
|---|---|---|
| Kubernetes | Graduated | Core orchestration (k3s + UpCloud MKE) |
| Prometheus | Graduated | Metrics collection and alerting |
| Helm | Graduated | Package management for k8s |
| cert-manager | Graduated | TLS certificate automation |
| Traefik | Not CNCF-hosted (Traefik Labs) | Ingress controller |
| Loki | Not CNCF-hosted (Grafana Labs) | Log aggregation |
| Grafana | Not CNCF-hosted (Grafana Labs) | Visualization and dashboards |

CNCF_TRAIL_MAP

URL: github.com/cncf/landscape#trail-map
PURPOSE: recommended adoption path for cloud-native technologies
GE_STATUS: containerization (done) -> CI/CD (done) -> orchestration (done) -> observability (done) -> service mesh (deferred) -> security (ongoing)

LEADERS:CONTAINERS_AND_RUNTIME

DOCKER

URL: docs.docker.com
WHY_RELEVANT: GE uses Docker for image building (Zone 1). k3s uses the containerd runtime.
KEY_REFERENCE: Dockerfile best practices: multi-stage builds, layer caching, security scanning
RULE: images should be minimal (Alpine-based or distroless when possible)
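
EXAMPLE (illustrative sketch, not GE tooling): the multi-stage and minimal-image guidance above can be checked mechanically. The function below reads a Dockerfile, collects its `FROM` stages, and reports whether the final stage looks minimal. The name `analyze_dockerfile`, the `MINIMAL_HINTS` substrings, and the sample Dockerfile are assumptions made for this sketch. It does not resolve a final stage that refers back to an earlier stage by name.

```python
import re

# Substrings treated as "minimal" bases. An illustrative assumption, not a GE standard.
MINIMAL_HINTS = ("alpine", "distroless", "scratch")

# Matches: FROM [--platform=...] <image> [AS <name>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE
)


def analyze_dockerfile(text):
    """Report whether a Dockerfile is multi-stage and ends on a minimal base image."""
    stages = [m.group(1) for m in map(FROM_RE.match, text.splitlines()) if m]
    if not stages:
        raise ValueError("no FROM instruction found")
    final = stages[-1]
    return {
        "multi_stage": len(stages) > 1,
        "final_base": final,
        "final_is_minimal": any(hint in final.lower() for hint in MINIMAL_HINTS),
    }


# Hypothetical Dockerfile: build in a full toolchain image, ship a distroless one.
SAMPLE = """\
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN go build -o /out/app .

FROM gcr.io/distroless/static-debian12
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
"""

if __name__ == "__main__":
    print(analyze_dockerfile(SAMPLE))
```

Only the last `FROM` decides what ships, so the check looks at the final stage and ignores how heavy the build stages are.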

OCI (Open Container Initiative)

URL: opencontainers.org
WHY_RELEVANT: OCI defines image and runtime specs that Docker, containerd, and k3s all implement
LEARN_FROM: understanding OCI explains why images are portable across Zone 1 (k3s) and Zone 2/3 (UpCloud MKE)
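
EXAMPLE (illustrative sketch, not GE tooling): portability in practice depends on the image publishing a manifest for each platform the nodes run. An OCI image index lists those platforms, and the sketch below extracts them from an index document. The sample index is fabricated for illustration (placeholder digests and sizes), and `list_platforms` is a hypothetical helper name. Which architectures each GE zone needs is not stated in this document.

```python
import json

# Media types defined by the OCI image specification.
OCI_INDEX = "application/vnd.oci.image.index.v1+json"
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"


def list_platforms(index_json):
    """Return sorted 'os/architecture' strings from an OCI image index document."""
    doc = json.loads(index_json)
    # This sketch insists on an explicit mediaType so other JSON is rejected early.
    if doc.get("mediaType") != OCI_INDEX:
        raise ValueError("not an OCI image index")
    return sorted(
        f'{m["platform"]["os"]}/{m["platform"]["architecture"]}'
        for m in doc.get("manifests", [])
        if "platform" in m
    )


# Fabricated sample index: digests and sizes are placeholders, not real images.
SAMPLE_INDEX = json.dumps({
    "schemaVersion": 2,
    "mediaType": OCI_INDEX,
    "manifests": [
        {"mediaType": OCI_MANIFEST, "digest": "sha256:" + "a" * 64, "size": 1234,
         "platform": {"architecture": "arm64", "os": "linux"}},
        {"mediaType": OCI_MANIFEST, "digest": "sha256:" + "b" * 64, "size": 1234,
         "platform": {"architecture": "amd64", "os": "linux"}},
    ],
})

if __name__ == "__main__":
    print(list_platforms(SAMPLE_INDEX))
```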

RESOURCES:BOOKS

| Book | Author | Relevance |
|---|---|---|
| Kubernetes Up & Running (4th ed) | Burns, Beda, Hightower | Canonical k8s reference |
| Kubernetes Patterns (2nd ed) | Ibryam, Huss | Design patterns for k8s workloads |
| Terraform: Up & Running (3rd ed) | Brikman | Terraform best practices, module design |
| Designing Distributed Systems | Brendan Burns | Container-based system patterns |
| The Phoenix Project | Kim, Behr, Spafford | DevOps culture and principles |
| Site Reliability Engineering | Google SRE team | Production operations, SLOs, incident response |
| Infrastructure as Code (2nd ed) | Kief Morris | IaC patterns beyond just Terraform |
| Cloud Native Infrastructure | Garrison, Nova | Infrastructure management in a cloud-native world |

RESOURCES:CONFERENCES

| Conference | Focus | Why |
|---|---|---|
| KubeCon + CloudNativeCon | Kubernetes, CNCF ecosystem | Primary industry event, bleeding-edge k8s |
| HashiConf | Terraform, Vault, Consul | Best practices for IaC and secrets |
| FOSDEM | Open source, infrastructure | European focus, free, community-driven |
| DevOpsDays Amsterdam | DevOps culture, tools | Local to GE (Netherlands), networking |
| SREcon | Site reliability, production ops | SRE practices and incident management (run by USENIX) |

RESOURCES:NEWSLETTERS_AND_BLOGS

| Source | Focus | Frequency |
|---|---|---|
| KubeWeekly (CNCF) | Kubernetes news and tutorials | Weekly |
| DevOps Weekly | Tools, practices, culture | Weekly |
| docs.k3s.io/blog | k3s updates and guides | On release |
| upcloud.com/blog | UpCloud product updates | Weekly |
| learnk8s.io/blog | Practical k8s tutorials | Monthly |
| thenewstack.io | Cloud native ecosystem | Daily |
| hashicorp.com/blog | Terraform, Vault updates | Weekly |

RESOURCES:REPOSITORIES

| Repository | Why | Stars (approx.) |
|---|---|---|
| kelseyhightower/kubernetes-the-hard-way | Deep k8s understanding | 40k+ |
| k3s-io/k3s | GE's Zone 1 runtime; read issues for gotchas | 28k+ |
| UpCloudLtd/terraform-provider-upcloud | GE's cloud Terraform provider | ~100 |
| hashicorp/terraform | IaC tool, core reference | 42k+ |
| traefik/traefik | GE's ingress controller | 51k+ |
| cert-manager/cert-manager | TLS automation | 12k+ |
| prometheus/prometheus | Metrics and alerting | 55k+ |
| grafana/loki | Log aggregation | 24k+ |
| borgbackup/borg | Backup tool (dedup + encryption) | 11k+ |

RESOURCES:CERTIFICATIONS

| Certification | Provider | Relevance |
|---|---|---|
| CKA (Certified Kubernetes Administrator) | CNCF / Linux Foundation | Validates k8s operations skills |
| CKAD (Certified Kubernetes Application Developer) | CNCF / Linux Foundation | Validates k8s workload design |
| CKS (Certified Kubernetes Security Specialist) | CNCF / Linux Foundation | Validates k8s security hardening |
| HashiCorp Certified: Terraform Associate | HashiCorp | Validates Terraform proficiency |

NOTE: agents do not take certifications, but the exam objectives serve as skill checklists for what infrastructure agents should know

LEADERS:AGENTIC_NOTE

FOR_AGENTS_WORKING_ON_INFRASTRUCTURE:
- When debugging k8s, check k3s docs first (Zone 1 may behave differently from full k8s)
- When writing Terraform, check UpCloud provider docs for resource-specific options
- When designing backup strategies, reference Google SRE book chapters on data integrity
- When troubleshooting networking, reference Tim Hockin's k8s networking talks
- PREFER official docs over blog posts, because infrastructure blog posts go stale fast
- PREFER k3s-specific solutions over generic k8s solutions for Zone 1
- NEVER implement a pattern from a blog post without checking the version, because k8s APIs change frequently
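
EXAMPLE (illustrative sketch, not GE tooling): the version rule above can be partly automated by comparing a manifest's `apiVersion` and `kind` against APIs that upstream Kubernetes has removed. The table below is a small, non-exhaustive subset taken from the upstream deprecation guide. The helper names `parse_version` and `removed_in_cluster`, and the cluster version strings used, are assumptions for this sketch.

```python
# A non-exhaustive subset of (apiVersion, kind) pairs removed from upstream
# Kubernetes, mapped to the release that stopped serving them.
REMOVED_APIS = {
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
    ("policy/v1beta1", "PodSecurityPolicy"): "1.25",
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): "1.26",
}


def parse_version(version):
    """Turn 'v1.30.4+k3s1' or '1.25' into a comparable (major, minor) tuple."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def removed_in_cluster(api_version, kind, cluster_version):
    """Return True if this apiVersion/kind is no longer served at cluster_version."""
    removed = REMOVED_APIS.get((api_version, kind))
    if removed is None:
        return False  # not in this table; that does not prove the API is current
    return parse_version(cluster_version) >= parse_version(removed)


if __name__ == "__main__":
    # A blog post from the 1.2x era might still show this CronJob apiVersion.
    print(removed_in_cluster("batch/v1beta1", "CronJob", "v1.30.4+k3s1"))
    print(removed_in_cluster("apps/v1", "Deployment", "v1.30.4+k3s1"))
```

A `False` result only means the pair is absent from this short table, so the official deprecation guide for the target cluster version remains the authority.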