
Data Flow Mapping

Data flow mapping records where data originates, where it is processed, where it is stored, and where it transits. Every data flow in GE must be mapped, documented, and verified against sovereignty requirements.


Why Data Flow Mapping Matters

GDPR Article 30 requires a record of processing activities. EU data sovereignty requires knowing not just what data is processed, but where every byte goes at every stage of its lifecycle.

A single untracked data flow through a US provider — a DNS query, an analytics pixel, a font loaded from Google Fonts — breaks the sovereignty guarantee.


Data Flow Template

Every service and integration must document its data flows using this template:

Flow definition

flow_id: DF-{service}-{sequence}
service: {service name}
description: {what this flow does}
data_classification: public | internal | confidential | restricted
contains_pii: true | false
pii_categories: [name, email, ip_address, ...]

origin:
  source: {where data originates}
  location: {country/region}
  jurisdiction: {legal jurisdiction}

processing:
  processor: {who processes the data}
  location: {country/region}
  jurisdiction: {legal jurisdiction}
  purpose: {why it is processed}
  legal_basis: {GDPR legal basis}

storage:
  provider: {storage provider}
  location: {country/region}
  jurisdiction: {legal jurisdiction}
  retention: {how long data is kept}
  encryption: {encryption method}

transit:
  protocol: {TLS 1.3, etc.}
  route: {network path description}
  crosses_border: true | false
  border_mechanism: {SCC, adequacy decision, N/A}

sub_processors:
  - name: {sub-processor name}
    role: {what they do}
    location: {country/region}
    jurisdiction: {legal jurisdiction}
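A filled-in template can be checked mechanically before review. The following is a minimal sketch, assuming the record has already been loaded into a Python dict with the same keys as the template; the function name `validate_flow` and the example `flow_id` are hypothetical and not part of any GE tooling.

```python
# Fields taken from the template above; the checks themselves are an assumption.
REQUIRED_TOP_LEVEL = ["flow_id", "service", "description",
                      "data_classification", "contains_pii"]
REQUIRED_SECTIONS = {
    "origin": ["source", "location", "jurisdiction"],
    "processing": ["processor", "location", "jurisdiction", "purpose", "legal_basis"],
    "storage": ["provider", "location", "jurisdiction", "retention", "encryption"],
    "transit": ["protocol", "route", "crosses_border", "border_mechanism"],
}
CLASSIFICATIONS = {"public", "internal", "confidential", "restricted"}


def validate_flow(flow: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in flow:
            problems.append(f"missing field: {key}")
    if flow.get("data_classification") not in CLASSIFICATIONS:
        problems.append("data_classification is not one of the four allowed values")
    if flow.get("contains_pii") and not flow.get("pii_categories"):
        problems.append("contains_pii is true but pii_categories is empty")
    for section, fields in REQUIRED_SECTIONS.items():
        body = flow.get(section)
        if not isinstance(body, dict):
            problems.append(f"missing section: {section}")
            continue
        for field in fields:
            if field not in body:
                problems.append(f"missing field: {section}.{field}")
    transit = flow.get("transit")
    if isinstance(transit, dict) and transit.get("crosses_border"):
        # A border crossing must name its legal mechanism (SCC, adequacy decision).
        if transit.get("border_mechanism") in (None, "", "N/A"):
            problems.append("transit crosses a border but no border_mechanism is given")
    return problems


# Usage: an incomplete draft record (hypothetical values) lists what is missing.
draft = {"flow_id": "DF-email-001", "service": "email", "contains_pii": True}
for problem in validate_flow(draft):
    print(problem)
```

A check like this only confirms that the record is complete and internally consistent. It does not verify that the stated locations and jurisdictions are true.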

GE Core Data Flows

Client application data

Origin:     End user browser (EU)
Transit:    TLS 1.3 → bunny.net CDN (EU) → UpCloud compute (EU)
Processing: UpCloud Helsinki/Amsterdam (EU)
Storage:    PostgreSQL on UpCloud (EU)
Backup:     Encrypted, UpCloud block storage (EU)
PII:        Yes — user accounts, business data

All components EU-headquartered. No CLOUD Act exposure.

Agent execution data

Origin:     Admin UI / Orchestrator (EU — k3s cluster)
Transit:    Internal k3s network (EU)
Processing: Executor pods on k3s (EU — local infrastructure)
Storage:    PostgreSQL (EU), Redis (EU)
Backup:     Encrypted, EU storage
PII:        Possible — depends on task context

LLM API calls (sovereignty exception):

Agent execution involves calls to LLM APIs (Anthropic, OpenAI). These are US-headquartered companies. Data sent to LLM APIs:

  • Agent system prompts (no PII)
  • Task descriptions (may contain project context)
  • Code context (no client PII)

Mitigation:

  • Client PII is never sent to LLM APIs
  • Task descriptions are sanitized before LLM processing
  • LLM providers contractually commit to not training on API data
  • This is documented as a sovereignty exception with Julian's approval
  • Active evaluation of EU-hosted LLM alternatives
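The source does not describe how task descriptions are sanitized. As an illustration only, a pattern-based redaction step might look like the sketch below; the function name `sanitize_task_description` and the placeholder tokens are assumptions, and the patterns catch only obvious email addresses and IPv4 addresses, not names or free-text identifiers.

```python
import re

# Deliberately simple patterns: they catch common forms, not every valid one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize_task_description(text: str) -> str:
    """Replace email and IPv4 addresses with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = IPV4_RE.sub("[IP]", text)
    return text


print(sanitize_task_description("Contact jan@example.eu from 10.0.0.12"))
```

Regex redaction is a safety net rather than a guarantee, so it complements the rule that client PII is never placed in task context in the first place.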

Email delivery

Origin:     GE application (EU)
Transit:    TLS 1.3 → Brevo API (EU — France)
Processing: Brevo (France)
Storage:    Delivery logs at Brevo (France), 30-day retention
PII:        Yes — email addresses, names

All components EU-headquartered.

Payment processing

Origin:     End user browser (EU)
Transit:    TLS 1.3 → Mollie API (EU — Netherlands)
Processing: Mollie (Netherlands)
Storage:    Transaction records at Mollie (Netherlands)
PII:        Yes — payment details (handled by Mollie, PCI DSS)

GE never stores payment card data. Mollie handles PCI compliance.

DNS resolution

Origin:     End user browser (worldwide)
Transit:    DNS query → TransIP nameservers (EU — Netherlands)
Processing: TransIP (Netherlands)
Storage:    DNS query logs at TransIP (Netherlands)
PII:        Yes — IP addresses in query logs

No US DNS providers in the resolution chain.

CDN delivery

Origin:     GE application (EU)
Transit:    bunny.net edge network (EU — 34 PoPs, EU-only routing)
Processing: bunny.net (Slovenia)
Storage:    Edge cache at bunny.net (EU PoPs only)
PII:        Possible — access logs contain IP addresses

The bunny.net EU-only routing filter ensures no traffic leaves the EU.

Analytics

Origin:     End user browser (EU)
Transit:    First-party request → GE infrastructure (EU)
Processing: Plausible (EU) or self-hosted
Storage:    Plausible EU infrastructure / GE database
PII:        No — Plausible does not collect PII by design

No cookies, no personal data, no tracking across sites.
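The "EU-headquartered" claim made for each flow above can be expressed as a lookup against provider headquarters. This is a rough sketch: the headquarters codes come from the flows listed in this document, while the function name `flag_non_eu_providers` and the rule that unknown providers are flagged are assumptions.

```python
# ISO 3166-1 alpha-2 codes of the 27 EU member states.
EU_MEMBER_STATES = {
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
}

# Headquarters as stated in the flows above.
PROVIDER_HQ = {
    "UpCloud": "FI",
    "bunny.net": "SI",
    "Brevo": "FR",
    "Mollie": "NL",
    "TransIP": "NL",
    "Anthropic": "US",
    "OpenAI": "US",
}


def flag_non_eu_providers(providers: list[str]) -> list[str]:
    """Return providers that are not known to be EU-headquartered.

    Unknown providers are flagged too, so an unmapped provider never passes silently.
    """
    return [p for p in providers if PROVIDER_HQ.get(p) not in EU_MEMBER_STATES]


print(flag_non_eu_providers(["bunny.net", "UpCloud"]))
print(flag_non_eu_providers(["UpCloud", "Anthropic"]))
```

Headquarters location is used here as a proxy for CLOUD Act exposure, as this document does. A real review would also consider parent companies and sub-processors.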


Cross-Border Transfer Mechanisms

Data crosses EU borders only in rare cases, and each case requires justification. The following mechanisms apply:

Adequacy decisions

The European Commission has decided that certain countries provide adequate data protection. Transfer to these countries is allowed without additional safeguards:

  • Andorra, Argentina, Canada (commercial), Faroe Islands, Guernsey, Israel, Isle of Man, Japan, Jersey, New Zealand, Republic of Korea, Switzerland, United Kingdom, Uruguay

Note: The US does NOT have an unqualified adequacy decision. The EU-US Data Privacy Framework (2023) provides a mechanism but applies only to certified US companies and has been challenged legally.

Standard Contractual Clauses (SCCs)

When transfer is necessary to a country without an adequacy decision:

  • Use EU-approved SCCs (June 2021 version)
  • Conduct a Transfer Impact Assessment (TIA)
  • Implement supplementary measures (encryption, pseudonymization)
  • Document the assessment; Julian reviews it annually

GE position

GE does not rely on SCCs for core data flows. All core services use EU-headquartered providers. SCCs are only relevant for the LLM API exception, which is documented separately.
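The decision order described in this section (inside the EU, adequacy decision, otherwise SCCs) can be sketched as a small lookup. The function name `transfer_mechanism` is hypothetical; treating the EEA states (Iceland, Liechtenstein, Norway) like EU members and treating the US as non-adequate by default are assumptions of this sketch.

```python
# EU member states plus EEA states (IS, LI, NO), as ISO 3166-1 alpha-2 codes.
EU_EEA = {
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
    "IS", "LI", "NO",
}

# Countries from the adequacy list above. Canada is covered for commercial
# organisations only, which this lookup does not model.
ADEQUACY = {
    "AD", "AR", "CA", "FO", "GG", "IL", "IM", "JP", "JE", "NZ", "KR", "CH", "GB", "UY",
}


def transfer_mechanism(destination: str) -> str:
    """Return the transfer mechanism needed for a destination country code."""
    code = destination.upper()
    if code in EU_EEA:
        return "none required (within EU/EEA)"
    if code in ADEQUACY:
        return "adequacy decision"
    # Everything else, including the US by default, needs the full SCC path.
    return "SCCs + Transfer Impact Assessment + supplementary measures"


for country in ("NL", "JP", "US"):
    print(country, "->", transfer_mechanism(country))
```

The adequacy list changes over time, so a lookup like this needs the same periodic review as the written list.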


Sub-Processor Management

Requirements

Every service provider must disclose their sub-processors. GE must verify:

  • Each sub-processor's location and jurisdiction
  • What data the sub-processor accesses
  • Whether the sub-processor introduces CLOUD Act exposure

Monitoring

  • Sub-processor lists reviewed quarterly by Julian
  • Providers must notify GE 30 days before adding a new sub-processor
  • GE has the right to object to new sub-processors
  • If a provider adds a US sub-processor, GE evaluates alternatives
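The monitoring rules above lend themselves to a simple comparison between two versions of a provider's sub-processor list. The sketch below is illustrative only; the `SubProcessor` class, the function names, and the example companies are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta

NOTICE_PERIOD = timedelta(days=30)  # providers must notify 30 days ahead


@dataclass(frozen=True)
class SubProcessor:
    name: str
    role: str
    jurisdiction: str  # ISO country code, e.g. "FR" or "US"


def added_sub_processors(previous, current):
    """Return sub-processors present in the current list but not the previous one."""
    known = {sp.name for sp in previous}
    return [sp for sp in current if sp.name not in known]


def notice_period_met(notified_on: date, effective_on: date) -> bool:
    """True if the provider gave at least 30 days of notice."""
    return effective_on - notified_on >= NOTICE_PERIOD


def needs_alternative_evaluation(sp: SubProcessor) -> bool:
    """A new US sub-processor triggers an evaluation of alternatives."""
    return sp.jurisdiction.upper() == "US"


# Usage with made-up entries.
previous = [SubProcessor("HostCo", "hosting", "FR")]
current = previous + [SubProcessor("MailCo", "email relay", "US")]
for sp in added_sub_processors(previous, current):
    print(sp.name, "evaluate alternatives:", needs_alternative_evaluation(sp))
```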

Client Data Isolation

Multi-tenancy model

GE provides SaaS to multiple clients. Client data is isolated:

  • Database level: Row-level security with client ID
  • Application level: Middleware enforces client scope on every query
  • Network level: No cross-client data access possible
  • Backup level: Client data can be extracted or deleted independently
  • Agent level: Agents process one client's task at a time; context is cleared between clients
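The application-level layer can be illustrated with a wrapper that refuses to exist without a client ID and applies it to every query. This is a sketch, not GE's middleware: it uses an in-memory SQLite database so it runs anywhere, the `invoices` table and `ClientScopedDB` class are hypothetical, and SQLite has no row-level security, so the database-level layer is not shown.

```python
import sqlite3


class ClientScopedDB:
    """Wraps a connection so that every read is filtered by one client ID."""

    def __init__(self, conn: sqlite3.Connection, client_id: str):
        if not client_id:
            raise ValueError("client_id is required")
        self._conn = conn
        self._client_id = client_id

    def fetch_invoices(self):
        # The client filter is applied here, never left to the caller.
        cur = self._conn.execute(
            "SELECT id, amount FROM invoices WHERE client_id = ? ORDER BY id",
            (self._client_id,),
        )
        return cur.fetchall()


# Demo data for two made-up clients.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, client_id TEXT NOT NULL, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices (client_id, amount) VALUES (?, ?)",
    [("acme", 100), ("globex", 250), ("acme", 75)],
)

print(ClientScopedDB(conn, "acme").fetch_invoices())
```

Keeping the filter inside the wrapper means a forgotten WHERE clause in calling code cannot leak another client's rows. Row-level security in the database then acts as a second, independent check.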

Data portability

Clients can export all their data at any time (GDPR Article 20). Export format: JSON and CSV. Delivered via secure download link (EU-hosted, time-limited).
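Producing both export formats from the same rows needs only the standard library. The sketch below assumes each record is a flat dict with identical keys; the function names and sample rows are hypothetical, and the secure download link is out of scope.

```python
import csv
import io
import json


def export_json(rows: list[dict]) -> str:
    """Serialize rows as pretty-printed JSON, keeping non-ASCII characters."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


def export_csv(rows: list[dict]) -> str:
    """Serialize rows as CSV with a header taken from the first row's keys."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [{"id": 1, "name": "Anna"}, {"id": 2, "name": "Björn"}]
print(export_json(rows))
print(export_csv(rows))
```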

Data deletion

Client data is deleted within 30 days of account closure. Backups containing deleted client data expire per retention policy (max 12 months for monthly backups).
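The two deadlines in this policy can be computed from the account closure date. A minimal sketch, assuming "12 months" is approximated as 365 days and that the worst case is a monthly backup taken just before live data is deleted; the function names are hypothetical.

```python
from datetime import date, timedelta

DELETION_WINDOW = timedelta(days=30)        # live data: within 30 days of closure
MAX_BACKUP_RETENTION = timedelta(days=365)  # assumption: 12 months ~ 365 days


def deletion_deadline(closed_on: date) -> date:
    """Latest date by which live client data must be deleted."""
    return closed_on + DELETION_WINDOW


def last_backup_expiry(closed_on: date) -> date:
    """Worst-case date by which no backup still contains the client's data."""
    return deletion_deadline(closed_on) + MAX_BACKUP_RETENTION


closed = date(2025, 1, 1)
print(deletion_deadline(closed), last_backup_expiry(closed))
```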


Backup Locations

All backups are stored within the EU:

What              Provider     Location          Encryption
Database          UpCloud      Helsinki, FI      AES-256 at rest
File storage      UpCloud      Amsterdam, NL     AES-256 at rest
Container images  Self-hosted  Local k3s         N/A (no PII)
Config backups    Git          Self-hosted / EU  N/A (no PII)

Cross-zone replication: Helsinki + Amsterdam (both EU).


Log Storage

Application logs may contain PII (IP addresses, user agents, request paths with identifiers). Logs are stored:

Log type     Storage      Location    Retention
Application  PostgreSQL   UpCloud EU  90 days
Access       File system  k3s local   30 days
Security     PostgreSQL   UpCloud EU  1 year
Audit        PostgreSQL   UpCloud EU  7 years (ISO 27001)

No logs are sent to external logging services (Datadog, Splunk, etc.) unless the service is EU-sovereign.
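The retention periods in the table can drive a purge job. The sketch below only decides whether a single log entry is past retention; the function name `is_expired` is hypothetical, and years are approximated as 365 days.

```python
from datetime import datetime, timedelta, timezone

# Retention per log type, from the table above (1 year ~ 365 days is an assumption).
RETENTION = {
    "application": timedelta(days=90),
    "access": timedelta(days=30),
    "security": timedelta(days=365),
    "audit": timedelta(days=7 * 365),
}


def is_expired(log_type: str, created_at: datetime, now: datetime) -> bool:
    """True if a log entry of this type has outlived its retention period."""
    return now - created_at > RETENTION[log_type]


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
entry_time = datetime(2025, 4, 1, tzinfo=timezone.utc)  # 61 days before `now`
print(is_expired("access", entry_time, now))
print(is_expired("application", entry_time, now))
```

An unknown log type raises `KeyError` rather than defaulting to some retention period, so a new log type cannot be purged or kept by accident.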


Ownership

Role                Agent   Responsibility
Compliance Officer  Julian  Data flow audit, sub-processor review
Infrastructure      Arjan   Infrastructure data flow verification
Network Engineer    Stef    Transit path verification
Backup Guardian     Otto    Backup location verification