Terraform + UpCloud — Pitfalls¶

OWNER: arjan
ALSO_USED_BY: gerco, thijmen, rutger
LAST_VERIFIED: 2026-03-26
GE_STACK_VERSION: terraform ~> 1.14.0, upcloud provider ~> 5.0

Overview¶

Known Terraform and UpCloud provider failure modes.
Every item here has caused real issues in GE infrastructure.

State Drift¶

Severity: HIGH

State drift occurs when real infrastructure changes outside Terraform
(manual UpCloud console edits, API calls, other tools).

IF: terraform plan shows unexpected changes
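THEN: run terraform plan -refresh-only to review what changed, then terraform apply -refresh-only to sync state with reality (terraform refresh is deprecated)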
THEN: investigate who changed the resource outside Terraform

IF: drift is intentional (emergency fix)
THEN: update the config to match the change, then confirm terraform plan is empty (terraform import is only needed for resources created outside Terraform)

ANTI_PATTERN: making changes in the UpCloud console for managed resources
FIX: all changes go through Terraform — console is read-only for managed resources

CHECK: run terraform plan weekly to detect drift early
RUN: terraform plan -detailed-exitcode (exit code 0 = no changes, 1 = error, 2 = changes pending, which means drift when the config itself is unchanged)
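
The weekly check can be scripted. The following is a minimal sketch, not GE's actual job: the function name `check_drift` is illustrative, and it assumes terraform is on PATH and the current directory is an initialised root module on an unmodified checkout of the main branch.

```shell
# Sketch of a scheduled drift check (function name is illustrative).
# Assumes terraform is on PATH and the current directory is an initialised root module.
check_drift() {
  drift_rc=0
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
  terraform plan -detailed-exitcode -input=false >/dev/null || drift_rc=$?
  case "$drift_rc" in
    0) echo "no drift" ;;
    2) echo "drift detected" ;;
    *) echo "plan failed" ;;
  esac
  # Propagate the plan exit code so a scheduler can alert on 2 and on errors.
  return "$drift_rc"
}
```

Exit code 2 only means the plan is non-empty. It indicates drift only when the config being planned is the same config that was last applied.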

ADDED_FROM: staging-firewall-drift-2026-02, manual console rule addition

Provider Version Mismatches¶

Severity: HIGH

When team members run different provider versions, a newer provider can
upgrade the state schema so that older versions can no longer read it,
and saved plans become incompatible.

IF: terraform init warns about provider version mismatch
THEN: check .terraform.lock.hcl is committed and up to date
RUN: terraform providers lock -platform=linux_amd64

CHECK: .terraform.lock.hcl is committed to git
CHECK: all team members run the same Terraform version
CHECK: CI/CD pins the exact Terraform version
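
The first CHECK can be automated in CI or a pre-commit hook. This is a sketch under assumptions: the function name `lockfile_committed` is hypothetical, and it only looks at the lock file, not at the Terraform version pin.

```shell
# Returns 0 when .terraform.lock.hcl is tracked by git and matches HEAD.
# Usage: lockfile_committed [directory]   (defaults to the current directory)
lockfile_committed() {
  lock_dir="${1:-.}"
  # --error-unmatch makes git ls-files fail if the path is not tracked.
  git -C "$lock_dir" ls-files --error-unmatch .terraform.lock.hcl >/dev/null 2>&1 || return 1
  # --quiet implies --exit-code: non-zero when the file differs from HEAD.
  git -C "$lock_dir" diff --quiet HEAD -- .terraform.lock.hcl
}
```

A non-zero result after terraform init usually means someone's init changed provider selections or hashes and the lock file needs a reviewed commit.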

ANTI_PATTERN: running terraform init -upgrade without coordinating
FIX: provider upgrades are a deliberate, reviewed action

ADDED_FROM: provider-5.0-upgrade-2026-01, state incompatibility after solo upgrade

Destroy Protection Bypass¶

Severity: CRITICAL

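lifecycle.prevent_destroy = true makes Terraform reject any plan that would destroy the resource.
The protection exists only in the config: once the lifecycle block is removed, or the whole
resource block is deleted, an apply that destroys or replaces the resource WILL go through.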

IF: removing prevent_destroy from a resource
THEN: this MUST be a separate, reviewed commit
THEN: arjan must approve for Zone 2+3 resources
THEN: re-add prevent_destroy immediately after the operation

ANTI_PATTERN: removing prevent_destroy "temporarily" and forgetting to re-add
FIX: treat it as a two-commit operation — remove, act, re-add
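
A review gate can flag commits that drop the protection. The sketch below is an assumption about how such a gate might look, not an existing GE check. The function name is hypothetical, and it matches on text only, so it also fires when a whole resource block containing prevent_destroy is removed.

```shell
# Reads a unified diff on stdin.
# Returns 0 if any removed line mentions prevent_destroy, non-zero otherwise.
diff_removes_prevent_destroy() {
  # Drop "---" file headers first, then look for removed lines ("-" prefix).
  grep -v '^---' | grep -E '^-.*prevent_destroy' >/dev/null
}
```

Possible usage in CI, assuming the base branch is origin/main:

```shell
if git diff origin/main...HEAD -- '*.tf' | diff_removes_prevent_destroy; then
  echo "prevent_destroy removed: separate commit and explicit approval required" >&2
fi
```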

ADDED_FROM: near-miss-db-destroy-2026-02, prevent_destroy removed in refactoring commit

Import Issues¶

Severity: MEDIUM

When importing existing UpCloud resources into Terraform state,
the config must match the actual resource exactly.

IF: importing a resource
THEN: write the config FIRST based on the UpCloud console/API
THEN: run terraform import
THEN: run terraform plan to verify no changes

RUN: terraform import upcloud_server.example {uuid}

IF: plan shows changes after import
THEN: adjust the config to match reality, not the other way around

ANTI_PATTERN: importing then immediately applying to "fix" differences
FIX: make config match reality first, then plan intentional changes separately
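
The import-then-verify steps above can be sketched as one function. The name `import_and_verify` is illustrative, and the plan covers the whole workspace, so an unrelated pending change also produces a non-empty plan.

```shell
# Usage: import_and_verify ADDRESS ID
# Imports the resource, then checks that the plan is empty. Never applies.
import_and_verify() {
  import_addr="$1"
  import_id="$2"
  terraform import "$import_addr" "$import_id" || return 1
  import_rc=0
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
  terraform plan -detailed-exitcode -input=false >/dev/null || import_rc=$?
  if [ "$import_rc" -eq 0 ]; then
    echo "import clean: config matches $import_addr"
  else
    echo "plan is not clean after import: adjust the config, do not apply"
  fi
  return "$import_rc"
}
```

Example call: `import_and_verify upcloud_server.example "$uuid"`, where `$uuid` holds the server UUID from the UpCloud console or API.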

ADDED_FROM: lb-import-2026-03, import triggered unintended reconfiguration

Sensitive Values in State¶

Severity: HIGH

Terraform state contains the full resource config, including passwords
and API keys in plain text. Even with a remote backend, anyone who can read the state can read the secrets.

CHECK: state backend has encryption at rest enabled
CHECK: state backend access is restricted to Terraform service accounts
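CHECK: sensitive variables marked sensitive = true (redacts CLI output only; the value is still stored in state)
CHECK: sensitive outputs marked sensitive = true (same caveat: redaction, not encryption)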
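
The variable CHECK can be approximated with a text scan. This is a heuristic sketch, not an HCL parser: the function name and the list of secret-looking name fragments are assumptions, and it expects files formatted by terraform fmt (variable blocks open and close at column 0).

```shell
# Usage: unmarked_secret_variables FILE...
# Prints variable names that look like secrets but lack sensitive = true.
unmarked_secret_variables() {
  awk '
    # Start of a variable block: remember its name, reset the marker.
    /^variable[ \t]+"/ { name = $2; gsub(/"/, "", name); marked = 0; inblock = 1; next }
    inblock && /sensitive[ \t]*=[ \t]*true/ { marked = 1 }
    # End of the block: report secret-looking names that were never marked.
    inblock && /^}/ {
      if (!marked && name ~ /(password|secret|token|key)/) print name
      inblock = 0
    }
  ' "$@"
}
```

Example call: `unmarked_secret_variables variables.tf`. An empty result does not prove that nothing sensitive is unmarked; it only covers names matching the pattern.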

IF: state file was accidentally exposed
THEN: rotate ALL secrets referenced in that state immediately

UpCloud API Rate Limits¶

Severity: MEDIUM

The UpCloud API is rate-limited. A large terraform apply touching many
resources can hit the limit and fail intermittently.

IF: apply fails with 429 or timeout errors
THEN: use -parallelism=5 (default is 10)
RUN: terraform apply -parallelism=5

IF: still hitting limits
THEN: split the apply into targeted operations
RUN: terraform apply -target=module.networking

ANTI_PATTERN: retrying immediately after rate limit error
FIX: wait 60 seconds, then retry with lower parallelism
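
The wait-then-retry rule can be sketched as a wrapper. Assumptions: the function name, the `RETRY_WAIT_SECONDS` variable, and the 5, 2, 1 parallelism steps are all illustrative. The wrapper retries on any apply failure, because it does not inspect the error for a 429.

```shell
# Usage: apply_with_backoff [terraform apply arguments...]
# Retries a failed apply with lower parallelism, pausing between attempts.
apply_with_backoff() {
  backoff_wait="${RETRY_WAIT_SECONDS:-60}"
  for backoff_parallelism in 5 2 1; do
    if terraform apply -parallelism="$backoff_parallelism" "$@"; then
      return 0
    fi
    # No pause after the final attempt.
    if [ "$backoff_parallelism" -ne 1 ]; then
      echo "apply failed at parallelism=$backoff_parallelism; waiting ${backoff_wait}s" >&2
      sleep "$backoff_wait"
    fi
  done
  return 1
}
```

Check the error output before relying on this: a failure that is not a rate limit or timeout should be fixed, not retried.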

Managed Database Deprecations¶

Severity: MEDIUM

UpCloud has deprecated its managed Redis database in favour of Valkey.
Legacy object storage resources have also been removed from the provider.

IF: Terraform plan shows removal of upcloud_managed_database_redis
THEN: this is expected — migrate to Valkey or self-managed Redis
CHECK: GE uses self-managed Redis (port 6381), not UpCloud Managed Database for Redis

IF: references to upcloud_object_storage (legacy) exist
THEN: migrate to upcloud_managed_object_storage (new)
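
References to the two removed resource types can be found with a text search. A sketch, assuming GNU or BSD grep (for `--include`) and a hypothetical function name:

```shell
# Usage: find_removed_resource_types [directory]
# Lists resource blocks that use the removed types. Returns non-zero if none are found.
find_removed_resource_types() {
  search_dir="${1:-.}"
  # The surrounding quotes keep upcloud_managed_object_storage from matching.
  grep -rnE --include='*.tf' \
    'resource[[:space:]]+"(upcloud_managed_database_redis|upcloud_object_storage)"' \
    "$search_dir"
}
```

This only finds resource blocks. References from other expressions, data sources, or state need a separate check.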

ADDED_FROM: provider-5.0-changelog-2026-01, deprecated resources removed

Workspace Confusion¶

Severity: HIGH

Applying production config in the wrong workspace can create resources
in the wrong zone or overwrite existing infrastructure.

CHECK: verify workspace before every terraform plan and terraform apply
RUN: terraform workspace show

IF: shell prompt does not show current workspace
THEN: add workspace to your prompt or use a wrapper script

ANTI_PATTERN: running terraform apply without checking workspace first
FIX: enforce the workspace check with a wrapper script or shell prompt instead of relying on memory
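
One possible shape for such a wrapper is a guard function that refuses to continue in the wrong workspace. The function name `require_workspace` is hypothetical; `terraform workspace show` is the command from the RUN line above.

```shell
# Usage: require_workspace EXPECTED
# Returns 0 only when the current Terraform workspace equals EXPECTED.
require_workspace() {
  expected_ws="$1"
  current_ws="$(terraform workspace show)" || return 1
  if [ "$current_ws" != "$expected_ws" ]; then
    echo "refusing to continue: workspace is '$current_ws', expected '$expected_ws'" >&2
    return 1
  fi
}
```

Possible usage: `require_workspace dev && terraform apply`. The expected name has to be typed on every run, which is the point of the guard.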

ADDED_FROM: near-miss-wrong-workspace-2026-02, almost applied prod to dev

Cross-References¶

READ_ALSO: wiki/docs/stack/terraform-upcloud/index.md
READ_ALSO: wiki/docs/stack/terraform-upcloud/patterns.md
READ_ALSO: wiki/docs/stack/terraform-upcloud/upcloud-resources.md
READ_ALSO: wiki/docs/stack/terraform-upcloud/checklist.md