Platform Startup Documentation

Last Updated: 2026-03-18
Status: Active
Maintained by: GE Infrastructure Team
Estimated Time: 15-30 minutes (full startup)


Overview

This document describes the unified platform startup process for the GE infrastructure, including Kubernetes, agents, monitoring, and client hosting environments. The startup script orchestrates all components in the correct order with proper dependency management.

Script: /home/claude/ge-bootstrap/tools/ge-platform-startup.sh



Unified Startup Script Overview

Purpose

The ge-platform-startup.sh script provides a unified interface for starting all GE platform components with:

  • Dependency Management: Ensures components start in correct order
  • Health Checks: Verifies each component before proceeding
  • Error Handling: Stops on failures, provides clear error messages
  • Flexibility: Supports full startup, single phases, or partial startup
  • Logging: Records all operations to log files

Features

  • Phase-based architecture (9 distinct phases)
  • Prerequisites checking before startup
  • Automatic Vault unsealing
  • Docker image import for K3s
  • Status verification commands
  • Partial startup from specific phase
  • Comprehensive error logging

Basic Usage

cd /home/claude/ge-bootstrap/tools

# Full startup (all phases)
./ge-platform-startup.sh --full

# Show platform status
./ge-platform-startup.sh --status

# Run specific phase only
./ge-platform-startup.sh --phase ingress

# Start from specific phase
./ge-platform-startup.sh --from agents

# Stop platform
./ge-platform-startup.sh --stop

# List available phases
./ge-platform-startup.sh --list-phases

Startup Phases

The platform startup is divided into 9 phases that run sequentially:

Phase 1: Prerequisites

Purpose: Validate environment before starting any services

Checks:

  • K3s service is running
  • kubectl is available and cluster is reachable
  • Docker is available (optional)
  • Kustomize is available
  • jq is installed

Duration: 5-10 seconds

Example Output:

========================================
PHASE: Prerequisites Check
========================================
[OK] K3s is running
[OK] kubectl available: v1.28.2
[OK] K8s cluster is reachable
[OK] Docker available
[OK] Kustomize available
[OK] jq available
[OK] All prerequisites satisfied

Failure Actions:

  • If K3s not running: Start with sudo systemctl start k3s
  • If kubectl missing: Install kubectl
  • If cluster unreachable: Check K3s status and logs
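The tool checks in this phase mostly come down to testing whether a command is on PATH. The following is a minimal sketch of that idea, not the code in ge-platform-startup.sh; the function names require_tool and check_prereqs are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; require_tool and check_prereqs are hypothetical
# names, not functions taken from ge-platform-startup.sh.

# Print [OK] or [ERROR] for one tool; return non-zero if it is missing.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "[OK] $1 available"
    else
        echo "[ERROR] $1 not found in PATH"
        return 1
    fi
}

# Check every tool given as an argument, reporting all missing ones
# before returning a failure status.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        require_tool "$tool" || missing=$((missing + 1))
    done
    [ "$missing" -eq 0 ]
}
```

A call such as check_prereqs kubectl kustomize jq && systemctl is-active --quiet k3s would cover the required tools plus the K3s service check. Reporting every missing tool before failing avoids fixing prerequisites one run at a time.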


Phase 2: Namespaces

Purpose: Create all required Kubernetes namespaces

Creates:

  • ge-system - Core infrastructure
  • ge-agents - Agent platform
  • ge-monitoring - Observability stack
  • ge-ingress - Ingress controller
  • ge-hosting - Shared hosting pool
  • ge-wiki - Wiki brain (MkDocs)

Actions:

  • Apply namespace manifests
  • Add required labels for network policies
  • Label ingress namespace for selectors

Duration: 5-10 seconds

Example Output:

========================================
PHASE: Creating Namespaces
========================================
namespace/ge-system created
namespace/ge-agents created
namespace/ge-monitoring created
namespace/ge-ingress created
namespace/ge-hosting created
[OK] Namespaces created and labeled

Manual Verification:

kubectl get namespaces | grep -E "^ge-"


Phase 3: Secrets

Purpose: Verify or create required secrets in all namespaces

Checks:

  • ge-secrets exists in ge-agents
  • ge-secrets exists in ge-system
  • ge-secrets exists in ge-monitoring

Actions:

  • If secrets missing and environment variables set, creates secrets
  • If secrets missing and no environment variables, provides instructions

Duration: 5-10 seconds

Example Output:

========================================
PHASE: Verifying Secrets
========================================
[OK] Secret ge-secrets exists in ge-agents
[OK] Secret ge-secrets exists in ge-system
[OK] Secret ge-secrets exists in ge-monitoring
[OK] Secrets verified

Manual Secret Creation:

# If script cannot create secrets automatically
kubectl create secret generic ge-secrets \
  -n ge-agents \
  --from-literal=redis-password="<password>" \
  --from-literal=anthropic-api-key="<key>"
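The verify-or-create decision described for this phase can be sketched as a single function. This is an illustration under assumptions, not the script's implementation: ensure_secret is a hypothetical name, the KUBECTL variable exists only so the command can be stubbed, and for simplicity the sketch writes both keys into every namespace.

```shell
#!/bin/sh
# Illustrative sketch; ensure_secret is a hypothetical name. KUBECTL can be
# overridden (for example with a stub) and defaults to the real kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# ensure_secret NAMESPACE: verify ge-secrets, create it from the environment
# when both variables are set, otherwise report what is missing.
ensure_secret() {
    ns="$1"
    if $KUBECTL get secret ge-secrets -n "$ns" >/dev/null 2>&1; then
        echo "[OK] Secret ge-secrets exists in $ns"
    elif [ -n "${REDIS_PASSWORD:-}" ] && [ -n "${ANTHROPIC_API_KEY:-}" ]; then
        $KUBECTL create secret generic ge-secrets -n "$ns" \
            --from-literal=redis-password="$REDIS_PASSWORD" \
            --from-literal=anthropic-api-key="$ANTHROPIC_API_KEY" >/dev/null &&
            echo "[OK] Secret ge-secrets created in $ns"
    else
        echo "[ERROR] Cannot create secrets - REDIS_PASSWORD and ANTHROPIC_API_KEY not set"
        return 1
    fi
}
```

Running it once per namespace (ge-agents, ge-system, ge-monitoring) mirrors the three checks listed for this phase.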


Phase 4: Core Infrastructure

Purpose: Deploy Redis and Vault (core services)

Deploys:

  • Redis (ge-system namespace)
  • Vault (ge-system namespace)
  • ConfigMaps and core secrets
  • Network policies

Actions:

  • Deploy Redis and wait for ready state
  • Deploy Vault and wait for pod creation
  • Attempt automatic Vault unsealing
  • Apply network policies

Duration: 1-2 minutes

Example Output:

========================================
PHASE: Deploying Core Infrastructure
========================================
[INFO] Deploying Redis...
pod/redis condition met
[OK] Redis is ready
[INFO] Deploying Vault...
[WARN] Vault may need manual unsealing
[INFO] Checking Vault seal status...
[OK] Vault unsealed successfully
[OK] Network policies applied
[OK] Core infrastructure deployed

Manual Vault Unseal:

# If automatic unseal fails
kubectl exec -n ge-system vault-0 -- vault operator unseal <key1>
kubectl exec -n ge-system vault-0 -- vault operator unseal <key2>
kubectl exec -n ge-system vault-0 -- vault operator unseal <key3>
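An unseal attempt has to read the seal state first. Note that vault status exits with code 2 while Vault is sealed, so its exit status alone should not be treated as an error. The sketch below parses the text output instead. The function names are hypothetical, and the one-key-per-line file layout is an assumption, since the format of the key file is not described here.

```shell
#!/bin/sh
# Illustrative sketch; vault_sealed and unseal_from_file are hypothetical names.

# Read `vault status` text on stdin and print the value of the Sealed row.
vault_sealed() {
    awk '$1 == "Sealed" { print $2 }'
}

# Feed up to three keys from a file (one key per line, an assumed layout)
# to `vault operator unseal`, stopping once Vault reports Sealed: false.
unseal_from_file() {
    keyfile="$1"
    head -n 3 "$keyfile" | while IFS= read -r key; do
        # vault status exits 2 while sealed, so only its output is inspected.
        state=$(kubectl exec -n ge-system vault-0 -- vault status 2>/dev/null </dev/null | vault_sealed)
        [ "$state" = "false" ] && break
        kubectl exec -n ge-system vault-0 -- vault operator unseal "$key" >/dev/null </dev/null
    done
}
```

Redirecting stdin from /dev/null keeps the kubectl calls from consuming the remaining keys that the loop is reading.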


Phase 5: Ingress

Purpose: Deploy Traefik IngressController

Deploys:

  • Traefik RBAC (ServiceAccount, ClusterRole, ClusterRoleBinding)
  • Traefik ConfigMap (static configuration)
  • Traefik Deployment (2 replicas with HA)
  • Traefik Service (ClusterIP)
  • IngressClass (default)
  • Network policies

Actions:

  • Apply all ingress resources via kustomize
  • Wait for Traefik pods to be ready
  • Verify service creation

Duration: 1-3 minutes
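Waiting for pods is essentially a retry loop with a deadline. The sketch below shows one way to write such a loop; wait_until is a hypothetical helper name, not a function from the startup script.

```shell
#!/bin/sh
# Illustrative sketch; wait_until is a hypothetical helper name.

# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or the timeout elapses. The elapsed
# counter only tracks time spent sleeping, not time spent in COMMAND.
wait_until() {
    timeout="$1"; interval="$2"; shift 2
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "[ERROR] Timed out after ${timeout}s: $*"
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "[OK] Ready after ${elapsed}s"
}
```

For example, wait_until 180 5 kubectl wait --for=condition=ready pod -l app=traefik -n ge-ingress --timeout=10s retries for roughly the upper end of the stated duration. Retrying also covers the window before any Traefik pod exists, when kubectl wait can fail because nothing matches the selector yet.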

Example Output:

========================================
PHASE: Deploying Ingress Controller (Traefik)
========================================
[INFO] Applying Traefik IngressController...
serviceaccount/traefik created
clusterrole.rbac.authorization.k8s.io/traefik-ingress-controller created
clusterrolebinding.rbac.authorization.k8s.io/traefik-ingress-controller created
configmap/traefik-config created
deployment.apps/traefik created
service/traefik created
ingressclass.networking.k8s.io/traefik created
[INFO] Waiting for Traefik pods...
pod/traefik-abc123 condition met
pod/traefik-def456 condition met
[OK] Traefik is ready
[INFO] Traefik LoadBalancer IP: pending
[OK] Ingress controller deployed

Critical Note: Traefik service is ClusterIP, not LoadBalancer. Docker Traefik handles external ingress.


Phase 6: Agents

Purpose: Deploy GE agent platform

Deploys:

  • ConfigMaps (constitution, routing config, execution config)
  • ge-orchestrator (event-driven routing, replaces legacy Dolly monolith)
  • Shared executor (all 54 active agents run through this)
  • PodDisruptionBudget for executors

Note: All per-agent deployments (arjan, annegreet, etc.) are scaled to 0. All agents run through the shared executor.

Actions:

  • Create ConfigMaps from files
  • Import Docker images to K3s (if available)
  • Deploy ge-orchestrator and wait for ready
  • Deploy shared executor and wait for ready
  • Apply PDB

Duration: 3-5 minutes
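The "import if available" step can be sketched as follows, using the same docker save and k3s ctr images import pipeline that appears in the troubleshooting section. The function name import_image and the overridable DOCKER and K3S variables are assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative sketch; import_image is a hypothetical name. DOCKER and K3S are
# overridable so the function can be exercised without either tool installed.
DOCKER="${DOCKER:-docker}"
K3S="${K3S:-sudo k3s}"

# import_image IMAGE: copy a locally built Docker image into the K3s
# containerd store, or skip with a warning when the image is absent.
import_image() {
    image="$1"
    if ! $DOCKER image inspect "$image" >/dev/null 2>&1; then
        echo "[WARN] $image not found locally, skipping import"
        return 0
    fi
    echo "[INFO] Importing $image to K3s..."
    $DOCKER save "$image" | $K3S ctr images import - >/dev/null &&
        echo "[OK] Imported $image"
}
```

Skipping a missing image with a warning, rather than failing, matches the "(if available)" wording of the action list.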

Example Output:

========================================
PHASE: Deploying Agent Platform
========================================
[INFO] Creating ConfigMaps...
configmap/constitution created
configmap/routing-config created
[INFO] Checking Docker images...
[INFO] Importing ge-bootstrap-agent-executor to K3s...
[OK] Imported ge-bootstrap-agent-executor
[INFO] Deploying ge-orchestrator...
[OK] ge-orchestrator is ready
[INFO] Deploying shared executor...
[OK] Executors are ready
[OK] Agent platform deployed

Verification:

kubectl get pods -n ge-agents
# Should show the ge-orchestrator and shared executor pods Running
# (per-agent deployments are scaled to 0, so they have no pods)


Phase 7: Monitoring

Purpose: Deploy observability stack

Deploys:

  • Loki (log aggregation)
  • Promtail (log collection)
  • Grafana (visualization)

Actions:

  • Apply Loki stack manifests
  • Wait for Loki to be ready
  • Wait for Grafana to be ready

Duration: 1-2 minutes

Example Output:

========================================
PHASE: Deploying Monitoring Stack
========================================
deployment.apps/loki created
daemonset.apps/promtail created
deployment.apps/grafana created
[INFO] Waiting for Loki...
pod/loki-abc123 condition met
[OK] Loki is ready
[INFO] Waiting for Grafana...
pod/grafana-def456 condition met
[OK] Grafana is ready
[OK] Monitoring stack deployed

Access Grafana:

kubectl port-forward -n ge-monitoring svc/grafana 3000:3000
# Open browser: http://localhost:3000


Phase 8: Hosting

Purpose: Verify shared hosting pool is ready

Checks:

  • ge-hosting namespace exists
  • Hosting landing page is deployed

Actions:

  • Verify namespace
  • Check for landing page deployment
  • Report status

Duration: 5-10 seconds

Example Output:

========================================
PHASE: Deploying Shared Hosting Pool
========================================
[OK] Hosting namespace exists
[OK] Hosting landing page deployed
[OK] Shared hosting pool ready

Note: The hosting namespace and landing page are created during the ingress phase.


Phase 9: Clients

Purpose: Deploy client environments from registry

Reads:

  • /home/claude/ge-bootstrap/config/clients.yaml

Actions:

  • Parse clients.yaml
  • Deploy each registered client
  • Report deployment status

Duration: Variable (depends on number of clients)

Example Output:

========================================
PHASE: Deploying Client Environments
========================================
[INFO] Deploying 3 client environment(s)...
[WARN] Client deployment not yet implemented - create clients manually with create-client.sh
[OK] Client environments phase complete

Manual Client Creation:

/home/claude/ge-bootstrap/tools/create-client.sh \
  --type shared \
  --name acme-corp \
  --resources small


Phase Dependencies

Phases must run in order due to dependencies:

flowchart TD
    Prerequisites[1. Prerequisites] --> Namespaces[2. Namespaces]
    Namespaces --> Secrets[3. Secrets]
    Secrets --> Core[4. Core<br/>Redis, Vault]
    Core --> Ingress[5. Ingress<br/>Traefik]
    Ingress --> Agents[6. Agents<br/>ge-orchestrator, Executors]
    Agents --> Monitoring[7. Monitoring<br/>Loki, Grafana]
    Core --> Monitoring
    Monitoring --> Hosting[8. Hosting<br/>Shared Pool]
    Hosting --> Clients[9. Clients<br/>Client Envs]
    Ingress --> Clients

Dependency Matrix

Phase        Depends On         Reason
Namespaces   Prerequisites      Need kubectl and cluster access
Secrets      Namespaces         Secrets created in namespaces
Core         Secrets            Redis and Vault need secrets
Ingress      Namespaces         Traefik deployed to ge-ingress namespace
Agents       Core, Secrets      Agents need Redis and secrets
Monitoring   Core               Loki needs storage from core infrastructure
Hosting      Ingress            Needs IngressClass for routing
Clients      Ingress, Hosting   Clients need ingress and hosting namespace
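The matrix can be encoded as a lookup and used to validate a proposed phase order. This is a sketch for illustration: deps_of and order_ok are hypothetical names, and the lowercase phase identifiers are assumed except for core, ingress, agents, and secrets, which appear in the script's usage examples.

```shell
#!/bin/sh
# Illustrative sketch; deps_of and order_ok are hypothetical names. The
# lookup mirrors the dependency matrix above.

# Print the phases that PHASE depends on (space separated).
deps_of() {
    case "$1" in
        namespaces) echo "prerequisites" ;;
        secrets)    echo "namespaces" ;;
        core)       echo "secrets" ;;
        ingress)    echo "namespaces" ;;
        agents)     echo "core secrets" ;;
        monitoring) echo "core" ;;
        hosting)    echo "ingress" ;;
        clients)    echo "ingress hosting" ;;
        *)          echo "" ;;
    esac
}

# order_ok PHASE...: succeed only if every phase appears after all of its
# dependencies in the given order; report the first violation otherwise.
order_ok() {
    seen=" "
    for phase in "$@"; do
        for dep in $(deps_of "$phase"); do
            case "$seen" in
                *" $dep "*) ;;
                *) echo "[ERROR] $phase needs $dep first"; return 1 ;;
            esac
        done
        seen="$seen$phase "
    done
}
```

Calling order_ok prerequisites namespaces secrets agents core reports that agents needs core first, which is the failure described in Example 1 below.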

Why Order Matters

Example 1: Agents Before Core

❌ WRONG: Deploy agents before Redis
Result: Agents crash, cannot connect to Redis
Error: "Connection refused: redis.ge-system:6381"

Example 2: Clients Before Ingress

❌ WRONG: Deploy clients before Traefik
Result: Ingress resources not processed, no routing
Error: "IngressClass 'traefik' not found"

Example 3: Secrets Before Namespaces

❌ WRONG: Create secrets before namespaces
Result: Secret creation fails
Error: "namespace 'ge-agents' not found"


Full Startup Procedure

Starting from Stopped State

Scenario: Server just booted, all services stopped

Step 1: Start K3s

# Check K3s status
sudo systemctl status k3s

# Start if stopped
sudo systemctl start k3s

# Verify
sudo systemctl is-active k3s
# Expected: active

Step 2: Run Full Startup

cd /home/claude/ge-bootstrap/tools

./ge-platform-startup.sh --full

Step 3: Monitor Progress

The script will show progress through all 9 phases. Watch for:

  • ✅ Green [OK] messages indicate success
  • ⚠️ Yellow [WARN] messages indicate non-critical issues
  • ❌ Red [ERROR] messages indicate failures
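For long runs it can help to tally the markers afterwards instead of watching the terminal. The sketch below assumes the output was captured to a file yourself, for example with ./ge-platform-startup.sh --full 2>&1 | tee startup.log. The file name startup.log and the function name summarize_log are hypothetical, and a capture containing color escape codes would still match because the patterns are not anchored to the line start.

```shell
#!/bin/sh
# Illustrative sketch; summarize_log is a hypothetical name and the log file
# is one you captured yourself (its location is not defined by this document).

# summarize_log FILE: count [OK], [WARN] and [ERROR] lines and return
# non-zero if any [ERROR] line is present.
summarize_log() {
    # grep -c prints 0 but exits 1 when nothing matches, hence "|| true".
    ok=$(grep -c '\[OK\]' "$1" || true)
    warn=$(grep -c '\[WARN\]' "$1" || true)
    err=$(grep -c '\[ERROR\]' "$1" || true)
    echo "ok=$ok warn=$warn error=$err"
    [ "$err" -eq 0 ]
}
```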

Step 4: Verify Completion

./ge-platform-startup.sh --status

Expected Duration:

  • Minimal system: 5-10 minutes
  • Full system: 15-20 minutes
  • With many clients: 20-30 minutes

Starting from Partial State

Scenario: Some components running, some stopped

Step 1: Check Current State

./ge-platform-startup.sh --status

Step 2: Identify Missing Components

Example output showing partial state:

=== Core Infrastructure (ge-system) ===
NAME                      READY   STATUS    RESTARTS   AGE
pod/redis-0               1/1     Running   0          5d
pod/vault-0               1/1     Running   0          5d

=== Agent Platform (ge-agents) ===
No resources found in ge-agents namespace.

Step 3: Start from Missing Phase

# Start from agents phase
./ge-platform-startup.sh --from agents


Partial Startup Options

Single Phase Execution

Run only one specific phase:

# Syntax
./ge-platform-startup.sh --phase PHASE_NAME

# Examples
./ge-platform-startup.sh --phase core
./ge-platform-startup.sh --phase ingress
./ge-platform-startup.sh --phase agents

Use Cases:

  • Restarting a single failed component
  • Testing a specific phase
  • Selective updates

Example:

# Agents crashed, redeploy only agents
./ge-platform-startup.sh --phase agents

Start from Specific Phase

Run all phases starting from a specific one:

# Syntax
./ge-platform-startup.sh --from PHASE_NAME

# Examples
./ge-platform-startup.sh --from core
./ge-platform-startup.sh --from agents

Use Cases:

  • Partial system recovery
  • Skip already-running components
  • Faster startup when core is healthy

Example:

# Core and ingress running, start from agents
./ge-platform-startup.sh --from agents
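The difference between --phase and --from is whether later phases also run. A minimal sketch of the --from selection over an ordered phase list might look like this. PHASES and phases_from are hypothetical names, and phase identifiers other than secrets, core, ingress, and agents are assumed, so check --list-phases for the real ones.

```shell
#!/bin/sh
# Illustrative sketch; PHASES and phases_from are hypothetical names. Phase
# identifiers other than secrets, core, ingress and agents are assumed.
PHASES="prerequisites namespaces secrets core ingress agents monitoring hosting clients"

# phases_from START: print START and every later phase, one per line.
# Returns non-zero if START is not a known phase.
phases_from() {
    found=1
    for phase in $PHASES; do
        [ "$phase" = "$1" ] && found=0
        [ "$found" -eq 0 ] && echo "$phase"
    done
    return "$found"
}
```

With this list, phases_from agents yields agents, monitoring, hosting, and clients, while --phase agents would correspond to the first entry only.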

Phase Selection Strategy

Scenario: Need to redeploy agents after code change
Strategy: --phase agents
Reason: Only agents need updating

Scenario: Core crashed, need to restart everything dependent on it
Strategy: --from core
Reason: --from core reruns core and every later phase (ingress, agents, monitoring, hosting, clients)

Scenario: Fresh install
Strategy: --full
Reason: All phases need to run

Scenario: K3s restarted, everything needs to come back up
Strategy: --full
Reason: All phases need to run in order

Status Verification

Using Status Command

./ge-platform-startup.sh --status

Output Sections:

  1. Namespaces:

    === Namespaces ===
    ge-system      Active   5d
    ge-agents      Active   5d
    ge-monitoring  Active   5d
    ge-ingress     Active   5d
    ge-hosting     Active   5d
    

  2. Ingress (ge-ingress):

    === Ingress (ge-ingress) ===
    NAME                     READY   STATUS    RESTARTS   AGE
    pod/traefik-abc123       1/1     Running   0          5d
    pod/traefik-def456       1/1     Running   0          5d
    
    NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)
    service/traefik   ClusterIP   10.43.100.123   <none>        80/TCP,443/TCP
    

  3. Core Infrastructure (ge-system):

    === Core Infrastructure (ge-system) ===
    NAME          READY   STATUS    RESTARTS   AGE   IP           NODE
    pod/redis-0   1/1     Running   0          5d    10.42.0.10   minisforum
    pod/vault-0   1/1     Running   0          5d    10.42.0.11   minisforum
    

  4. Agent Platform (ge-agents):

    === Agent Platform (ge-agents) ===
    NAME                           READY   STATUS    RESTARTS   AGE
    pod/ge-orchestrator-123abc      1/1     Running   0          5d
    pod/ge-executor-456def         1/1     Running   0          5d
    pod/annegreet-789ghi           1/1     Running   0          5d
    pod/victoria-012jkl            1/1     Running   0          5d
    

  5. Monitoring (ge-monitoring):

    === Monitoring (ge-monitoring) ===
    NAME                        READY   STATUS    RESTARTS   AGE
    pod/loki-123abc             1/1     Running   0          5d
    pod/promtail-456def         1/1     Running   0          5d
    pod/grafana-789ghi          1/1     Running   0          5d
    

  6. Hosting (ge-hosting):

    === Hosting (ge-hosting) ===
    NAME                               READY   STATUS    RESTARTS   AGE
    pod/hosting-landing-123abc         1/1     Running   0          5d
    

  7. Ingress Routes:

    === Ingress Routes ===
    NAMESPACE     NAME              CLASS     HOSTS
    ge-ingress    admin-ui          traefik   office.growing-europe.com
    ge-hosting    hosting-landing   traefik   hosting.growing-europe.com
    

  8. Access Points:

    [INFO] Access points:
    [INFO]   Admin UI: https://office.growing-europe.com
    [INFO]   Hosting:  https://hosting.growing-europe.com
    [INFO]   Grafana:  kubectl port-forward -n ge-monitoring svc/grafana 3000:3000
    

Manual Verification Commands

Quick Health Check:

# All namespaces
kubectl get namespaces | grep ^ge-

# All pods in all GE namespaces
kubectl get pods -A | grep ^ge-

# All services
kubectl get svc -A | grep ^ge-

Component-Specific Checks:

# Redis
kubectl exec -n ge-system redis-0 -- redis-cli -a "$REDIS_PASSWORD" ping
# Expected: PONG

# Vault (check seal status)
kubectl exec -n ge-system vault-0 -- vault status
# Expected: Sealed: false

# Traefik
kubectl get pods -n ge-ingress -l app=traefik
# Expected: All Running

# Agents
kubectl get pods -n ge-agents
# Expected: All Running

# Monitoring
kubectl get pods -n ge-monitoring
# Expected: All Running
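Scanning several namespaces by eye for "All Running" is error-prone. The sketch below filters kubectl output down to the GE pods that need attention. The function name not_running is hypothetical, and it relies on the column order of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS).

```shell
#!/bin/sh
# Illustrative sketch; not_running is a hypothetical name.

# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/pod status" for pods in ge-* namespaces whose STATUS column
# (field 4) is neither Running nor Completed.
not_running() {
    awk '$1 ~ /^ge-/ && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage: kubectl get pods -A --no-headers | not_running. Empty output means every GE pod is Running or Completed.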

Endpoint Testing:

# Admin UI
curl -I https://office.growing-europe.com
# Expected: HTTP/2 200

# Hosting landing
curl -I https://hosting.growing-europe.com
# Expected: HTTP/2 200

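# Traefik dashboard (internal)
kubectl port-forward -n ge-ingress svc/traefik-dashboard 8080:8080 &
PF_PID=$!
sleep 2   # give the port-forward time to bind before probing
curl -I http://localhost:8080/dashboard/
# Expected: HTTP/1.1 200
kill "$PF_PID"   # stop the background port-forward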


Platform Shutdown

Graceful Shutdown

Step 1: Stop Agents and Monitoring

./ge-platform-startup.sh --stop

Output:

========================================
Stopping GE Platform
========================================
[WARN] Stopping agents and monitoring...
deployment.apps "ge-orchestrator" deleted
deployment.apps "ge-executor" deleted
[...]
[WARN] Core infrastructure (Redis, Vault, Ingress) NOT stopped
[INFO] To stop everything: kubectl delete -k /home/claude/ge-bootstrap/k8s/base/
[OK] Platform stopped (core services preserved)

What Gets Stopped:

  • ✅ All agents (ge-orchestrator, executors, dedicated agents, watchers)
  • ✅ Monitoring stack (Loki, Promtail, Grafana)
  • ❌ Core services (Redis, Vault) - Preserved
  • ❌ Ingress (Traefik) - Preserved

Why Core is Preserved:

  • Redis may have important cached data
  • Vault contains secrets
  • Traefik handles production traffic
  • Quick recovery if agents need restart
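The split between stopped and preserved components can be sketched as a dry-run helper. This is an assumption-laden illustration, not the behavior of --stop: it supposes that only Deployments and DaemonSets in ge-agents and ge-monitoring are removed, and the names run, stop_workloads, and DRY_RUN are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch; run, stop_workloads and DRY_RUN are hypothetical names,
# and the resource kinds deleted by the real --stop action may differ.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove workloads from the agent and monitoring namespaces only;
# ge-system and ge-ingress are deliberately left untouched.
stop_workloads() {
    for ns in ge-agents ge-monitoring; do
        run kubectl delete deployment,daemonset --all -n "$ns"
    done
    echo "[WARN] Core infrastructure (Redis, Vault, Ingress) NOT stopped"
}
```

Defaulting to dry-run lets you review the exact kubectl commands before setting DRY_RUN=0.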

Full Shutdown

Step 1: Stop All K8s Resources

kubectl delete -k /home/claude/ge-bootstrap/k8s/base/

# Or manually by namespace
kubectl delete namespace ge-agents
kubectl delete namespace ge-monitoring
kubectl delete namespace ge-ingress
kubectl delete namespace ge-system
kubectl delete namespace ge-hosting

Step 2: Stop K3s (Optional)

sudo systemctl stop k3s

Emergency Shutdown

In case of critical issues:

# Force delete all GE resources
kubectl delete all --all -n ge-agents --force --grace-period=0
kubectl delete all --all -n ge-monitoring --force --grace-period=0
kubectl delete all --all -n ge-ingress --force --grace-period=0
kubectl delete all --all -n ge-system --force --grace-period=0

# Stop K3s
sudo systemctl stop k3s


Troubleshooting Common Issues

K3s Not Running

Symptoms:

[ERROR] K3s is not running. Start with: sudo systemctl start k3s
[ERROR] Cannot connect to K8s cluster

Solution:

# Check K3s status
sudo systemctl status k3s

# If failed, check logs
sudo journalctl -u k3s -n 100

# Common fixes:
# 1. Not enough memory
free -h
# Solution: Increase memory or reduce workloads

# 2. Port conflict
sudo netstat -tulpn | grep :6443
# Solution: Stop conflicting service

# 3. Start K3s
sudo systemctl start k3s

# 4. Enable auto-start
sudo systemctl enable k3s

Vault Sealed

Symptoms:

[WARN] Vault may need manual unsealing
[INFO] Vault is sealed, attempting auto-unseal...
[WARN] Failed to unseal Vault - manual intervention needed

Solution:

# Check Vault status
kubectl exec -n ge-system vault-0 -- vault status

# Output shows:
# Sealed: true

# Manual unseal (requires 3 keys)
kubectl exec -n ge-system vault-0 -- vault operator unseal <key1>
kubectl exec -n ge-system vault-0 -- vault operator unseal <key2>
kubectl exec -n ge-system vault-0 -- vault operator unseal <key3>

# Verify unsealed
kubectl exec -n ge-system vault-0 -- vault status
# Sealed: false

Where to Find Keys:

# Keys should be in:
/home/claude/ge-bootstrap/vault/VAULT_KEYS.txt

# If file doesn't exist, Vault needs reinitialization
# WARNING: This will lose all existing secrets!

Secret Missing

Symptoms:

[WARN] Secret ge-secrets missing in ge-agents
[ERROR] Cannot create secrets - REDIS_PASSWORD and ANTHROPIC_API_KEY not set

Solution:

# Option 1: Set environment variables
export REDIS_PASSWORD="<password>"
export ANTHROPIC_API_KEY="<key>"

# Run secrets phase again
./ge-platform-startup.sh --phase secrets

# Option 2: Create manually
kubectl create secret generic ge-secrets \
  -n ge-agents \
  --from-literal=redis-password="<password>" \
  --from-literal=anthropic-api-key="<key>"

kubectl create secret generic ge-secrets \
  -n ge-system \
  --from-literal=redis-password="<password>"

kubectl create secret generic ge-secrets \
  -n ge-monitoring \
  --from-literal=redis-password="<password>"

Certificate Not Issued

Symptoms:

curl https://office.growing-europe.com
# SSL certificate problem: unable to get local issuer certificate

Diagnosis:

# Check Docker Traefik logs
docker logs traefik 2>&1 | grep -i acme | tail -20

# Check acme.json
sudo ls -lh /home/claude/ge-bootstrap/traefik/acme.json

# Check DNS
dig office.growing-europe.com +short

Solutions:

  1. Wait for Let's Encrypt:

    # First request can take 2-5 minutes
    # Monitor Docker Traefik logs
    docker logs traefik -f
    

  2. DNS Not Resolving:

    # Update DNS records
    # Wait for propagation (up to 24 hours)
    

  3. Rate Limited:

    # Check Let's Encrypt rate limits
    curl -s "https://crt.sh/?q=%.growing-europe.com&output=json" | jq '. | length'
    # Limit: 50 certs/week per domain
    

  4. Port 80 Blocked:

    # Let's Encrypt needs port 80 for HTTP-01 challenge
    sudo netstat -tulpn | grep ':80 '   # trailing space avoids matching :8080 and similar
    # Ensure Docker Traefik owns port 80
    

Pods Not Starting

Symptoms:

kubectl get pods -n ge-agents
NAME                  READY   STATUS             RESTARTS   AGE
dolly-123abc          0/1     CrashLoopBackOff   5          3m

Diagnosis:

# Check pod events
kubectl describe pod dolly-123abc -n ge-agents

# Check logs
kubectl logs dolly-123abc -n ge-agents

# Check previous container logs
kubectl logs dolly-123abc -n ge-agents --previous

Common Causes and Fixes:

  1. Image Pull Error:

    # Error: ImagePullBackOff
    # Solution: Import Docker image to K3s
    docker save ge-bootstrap-dolly:latest | sudo k3s ctr images import -
    

  2. Missing ConfigMap:

    # Error: configmap "constitution" not found
    # Solution: Create ConfigMap
    kubectl create configmap constitution \
      -n ge-agents \
      --from-file=/home/claude/ge-bootstrap/config/constitution.md
    

  3. Secret Not Found:

    # Error: secret "ge-secrets" not found
    # Solution: Create secret (see "Secret Missing" above)
    

  4. Resource Limits:

    # Error: Insufficient cpu/memory
    # Check node capacity
    kubectl describe nodes | grep -A 5 "Allocated resources"
    
    # Solution: Reduce resource requests or add nodes
    

Traefik API Connection Errors

Symptoms:

kubectl logs -n ge-ingress deploy/traefik
# Error: connection refused to Kubernetes API

Diagnosis:

# Check RBAC permissions
kubectl get clusterrole traefik-ingress-controller
kubectl get clusterrolebinding traefik-ingress-controller

# Check ServiceAccount
kubectl get sa traefik -n ge-ingress

Solutions:

  1. RBAC Not Applied:

    kubectl apply -f /home/claude/ge-bootstrap/k8s/base/ingress/traefik-rbac.yaml
    

  2. Network Policy Blocking:

    # Temporarily delete network policies for testing
    kubectl delete networkpolicy -n ge-ingress --all
    
    # If Traefik works, adjust policy
    

  3. ServiceAccount Token Issue:

    # Recreate ServiceAccount
    kubectl delete sa traefik -n ge-ingress
    kubectl apply -f /home/claude/ge-bootstrap/k8s/base/ingress/traefik-rbac.yaml
    
    # Restart Traefik pods
    kubectl rollout restart deployment/traefik -n ge-ingress
    


Deprecated Scripts

ge-startup-k8s.sh

Location: /home/claude/ge-bootstrap/tools/ge-startup-k8s.sh

Status: DEPRECATED - Superseded by ge-platform-startup.sh

Reason for Deprecation:

  • Limited to K8s resources only
  • No Docker service integration
  • No phase-based architecture
  • Less flexible than unified script
  • No Vault unsealing
  • No status verification

Migration:

# Old way
./ge-startup-k8s.sh

# New way
./ge-platform-startup.sh --full

When to Use (Legacy):

  • Only if ge-platform-startup.sh is broken
  • For minimal K8s-only deployments
  • For compatibility with old scripts

Recommendation: Use ge-platform-startup.sh for all new deployments.



Startup Checklist

Use this checklist for platform startup:

PRE-STARTUP
□ K3s service is running
□ Server has adequate resources (check free -h, df -h)
□ Docker is running (if using Docker services)
□ Environment variables set (.env loaded)
□ Vault keys available (VAULT_KEYS.txt)

STARTUP
□ Run prerequisites check
□ Start full platform or specific phases
□ Monitor progress for errors
□ Wait for all pods to be ready

POST-STARTUP
□ Run status verification
□ Check all namespaces created
□ Verify core services (Redis, Vault)
□ Verify ingress (Traefik)
□ Verify agents running
□ Verify monitoring stack
□ Test external access (office.growing-europe.com)
□ Check logs for errors

VALIDATION
□ All pods in Running state
□ No CrashLoopBackOff
□ Secrets present in all namespaces
□ Vault unsealed
□ Ingress routing working
□ SSL certificates valid
□ No errors in logs

This documentation is maintained by the GE Infrastructure Team. For updates or questions, contact the infrastructure lead or create an issue in the ge-ops repository.