Data Residency & Sovereignty Routing

Sensitivity-tier routing decides which model handles a request (cloud vs. on-prem). Residency routing is a second, orthogonal axis: which region processes and stores the data. EU personal data stays in the EU. Protected health data stays on HIPAA-eligible endpoints. Canadian government data stays in Canada. The two axes compose: a privileged EU matter runs on the on-prem EU model, never the on-prem US model, even though both would satisfy the sensitivity constraint alone.



1. Regulatory Drivers


2. The Routing Model

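A request carries three policy inputs:

- Sensitivity tier: public, confidential, privileged, or regulated.
- Residency: the jurisdiction the data must stay in, such as "eu", "us", or "ca".
- Data categories: tags such as PHI, PII, or ITAR that add constraints of their own.

The router resolves all three into a single allowed endpoint. If no endpoint satisfies every constraint, the request is refused, not degraded.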


3. Example: Composite Router

from dataclasses import dataclass
from enum import Enum


class Tier(str, Enum):
    PUBLIC = "public"
    CONFIDENTIAL = "confidential"
    PRIVILEGED = "privileged"
    REGULATED = "regulated"


@dataclass(frozen=True)
class Endpoint:
    name: str
    region: str            # "eu-west-1", "us-east-1", "onprem-eu", ...
    operator: str          # "aws", "azure", "onprem"
    hipaa_eligible: bool
    tiers: frozenset       # tiers this endpoint may serve


ENDPOINTS = [
    Endpoint("onprem-eu",    "onprem-eu", "onprem", True,
             frozenset({Tier.PUBLIC, Tier.CONFIDENTIAL,
                        Tier.PRIVILEGED, Tier.REGULATED})),
    Endpoint("onprem-us",    "onprem-us", "onprem", True,
             frozenset({Tier.PUBLIC, Tier.CONFIDENTIAL,
                        Tier.PRIVILEGED, Tier.REGULATED})),
    Endpoint("cloud-eu",     "eu-west-1", "aws",    True,
             frozenset({Tier.PUBLIC, Tier.CONFIDENTIAL})),
    Endpoint("cloud-us",     "us-east-1", "aws",    True,
             frozenset({Tier.PUBLIC, Tier.CONFIDENTIAL})),
]


@dataclass
class Request:
    tier: Tier
    residency: str          # "eu", "us", "ca", ...
    data_categories: set    # {"phi", "pii", "itar"}


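def jurisdiction(region: str) -> str:
    # Assumes region names look like "<jurisdiction>-..." or "onprem-<jurisdiction>",
    # as the ENDPOINTS entries do: "eu-west-1" -> "eu", "onprem-us" -> "us".
    if region.startswith("onprem-"):
        region = region[len("onprem-"):]
    return region.split("-", 1)[0]


def route(req: Request) -> Endpoint:
    # 1) Sensitivity tier: keep endpoints allowed to serve this tier.
    candidates = [e for e in ENDPOINTS if req.tier in e.tiers]

    # 2) Residency: data stays in its own jurisdiction. The filter applies to
    #    every residency value, so "ca" is refused here rather than silently
    #    routed to an EU or US endpoint.
    candidates = [e for e in candidates if jurisdiction(e.region) == req.residency]

    # 3) PHI may only go to HIPAA-eligible endpoints.
    if "phi" in req.data_categories:
        candidates = [e for e in candidates if e.hipaa_eligible]

    # 4) ITAR data may only run on the on-prem US endpoint.
    if "itar" in req.data_categories:
        candidates = [e for e in candidates if e.operator == "onprem"
                      and e.region == "onprem-us"]

    # Refuse rather than degrade when nothing satisfies every constraint.
    if not candidates:
        raise PermissionError("no endpoint satisfies residency + tier policy")

    # Prefer on-prem; break ties by name so the choice is deterministic.
    # Endpoint has no cost field, so cost is not considered here.
    candidates.sort(key=lambda e: (0 if e.operator == "onprem" else 1, e.name))
    return candidates[0]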

4. Storage, Replication, and Backups


5. Telemetry Is Data Too

Logs, traces, and metrics commonly leak residency. A prompt/response captured in a US-hosted observability stack defeats EU residency; so do APM traces that include user IDs, matter codes, or free-text error messages. Pin telemetry backends to the same residency class as the data they describe, or redact at the emission side before export.
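Both options in the paragraph above can be sketched with the standard library alone. What follows is a minimal illustration, not a complete redaction scheme: the `user_id=` and `matter_code=` key formats, the `REDACTIONS` patterns, the `TELEMETRY_BACKENDS` URLs, and the helper names are all hypothetical and would need to match what your services actually emit.

```python
import logging
import re

# Hypothetical patterns: assumes identifiers are logged as key=value pairs.
REDACTIONS = [
    (re.compile(r"\buser[_-]?id=\S+"), "user_id=[REDACTED]"),
    (re.compile(r"\bmatter[_-]?code=\S+"), "matter_code=[REDACTED]"),
]

# Hypothetical mapping from residency class to a telemetry backend in that class.
TELEMETRY_BACKENDS = {
    "eu": "https://telemetry.eu.example.com",
    "us": "https://telemetry.us.example.com",
}


def redact(text: str) -> str:
    """Replace known identifier patterns before the text leaves the process."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """Redact at the emission side: rewrite each record before handlers see it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then redact the result,
        # so identifiers passed as %-style args are covered as well.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True


def telemetry_backend(residency: str) -> str:
    """Pin telemetry to the data's residency class; refuse if none is configured."""
    try:
        return TELEMETRY_BACKENDS[residency]
    except KeyError:
        raise PermissionError(
            f"no telemetry backend pinned for residency {residency!r}"
        ) from None
```

Attaching the filter with `handler.addFilter(RedactingFilter())` on every exporting handler applies it before export. Pattern-based redaction only catches structured identifiers; free-text error messages still need a backend pinned to the right residency class, or need to be dropped.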


6. Pitfalls

