Zero-Trust AI Architecture

Zero trust is the operating model that abandons the network-perimeter assumption: there is no "inside" that is automatically trustworthy, no "outside" that is automatically hostile, and no implicit trust granted to a caller because of where they sit on the network. Every request — from a human, a service, a batch job, or a model itself — is authenticated, authorized, and recorded on its own merits, every time.

Applied to AI/ML, zero trust means: the inference path is treated as a hostile environment, model artifacts are signed and attested before they execute, every retrieval and tool call is authorized against the real caller's identity (not the model's), and every action is logged in a way that a forensic investigator can replay. This page walks through the six enforcement planes — identity, network, policy, data, workload, and continuous verification — that together constitute a zero-trust AI deployment.

1. Zero-Trust Tenets

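NIST SP 800-207 defines zero trust through seven tenets. They are commonly condensed into three principles (the formulation popularized by Microsoft's zero-trust guidance), and these three carry the most weight in AI-system design: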

Verify explicitly. Authentication and authorization use every available signal (identity, device posture, request context, sensitivity tier of the data being accessed). A valid bearer token alone is not sufficient.
Use least-privilege access. Every workload, model, and tool gets the smallest set of capabilities it needs; anything beyond that baseline is granted only through time-boxed, just-in-time elevation. Default-deny on tool use.
Assume breach. Operate as if the attacker is already inside. Microsegment the network, encrypt every hop, log every action, and continuously verify that workloads behave as expected.

Under zero-trust thinking the LLM is itself an untrusted component: it may be misaligned, jailbroken, or compromised at the supply-chain level. Tools called by the LLM therefore run with the real user's delegated authority, not with blanket privileges held by the model, and every call is re-authorized at the policy decision point.

2. Enforcement Across an AI Request Lifecycle

The diagram below shows the six enforcement planes a request crosses on its way through an AI/ML system — from the originating user or workload through to the continuous-verification telemetry that watches the whole flow. Each plane is a place where a zero-trust policy is enforced; failures at one plane should be caught at the next.

┌──────────────────────────────────────────────────────────────────────────────┐
│                 USER / WORKLOAD INITIATES INFERENCE REQUEST                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                           │
│  │   Human     │  │   Service   │  │   Batch     │                           │
│  │   User      │  │   Worker    │  │   Job       │                           │
│  └─────────────┘  └─────────────┘  └─────────────┘                           │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                    1. IDENTITY PLANE  (Verify Explicitly)                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────┐            │
│  │  OIDC    │  │  SPIFFE  │  │   mTLS   │  │ Workload │  │ MFA  │            │
│  │  Token   │  │  /SVID   │  │  Cert    │  │ Identity │  │/WAuth│            │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────┘            │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│           2. NETWORK PLANE  (Microsegmentation, No Implicit Trust)           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │  Service │  │  Egress  │  │  Policy  │  │  No East-│                      │
│  │   Mesh   │  │ Firewall │  │  Engine  │  │  West LAN│                      │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘                      │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│          3. POLICY DECISION POINT  (Least Privilege, Just-In-Time)           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │  OPA /   │  │  Cedar   │  │  ABAC +  │  │  JIT     │                      │
│  │ Rego     │  │  Policy  │  │  Context │  │  Tokens  │                      │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘                      │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                  4. DATA PLANE  (Classify, Encrypt, Route)                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │ Classify │  │  KMS /   │  │  Sens.   │  │ Tenant   │                      │
│  │ on Read  │  │  CMK     │  │  Routing │  │ Isolat.  │                      │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘                      │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│              5. WORKLOAD PLANE  (Signed & Attested Code/Models)              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │  Cosign  │  │  SBOM    │  │  TEE     │  │ Image    │                      │
│  │  Verify  │  │  +SLSA   │  │  Attest  │  │ Pinning  │                      │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘                      │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                 6. CONTINUOUS VERIFICATION  (Assume Breach)                  │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │ Behavior │  │ Anomaly  │  │ Hash-    │  │  Token   │                      │
│  │ Analytic │  │ Detect.  │  │ Chained  │  │  Re-Auth │                      │
│  │  /UEBA   │  │  ML      │  │  Audit   │  │  /Rotate │                      │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘                      │
└──────────────────────────────────────────────────────────────────────────────┘

3. Identity Plane

The identity plane answers the question "who is making this request?" for both humans and workloads. Bearer tokens alone are not sufficient; identity is established through cryptographic primitives, so that the leak of any single secret does not compromise it.

Workload identity via SPIFFE / SPIRE: every service gets a short-lived X.509 SVID, issued by the SPIRE server only after node and workload attestation succeed. In Kubernetes, pod identity is derived from attributes the SPIRE agent confirms with the kubelet, not from a stored credential.
OIDC federation for human users: identity provider issues a signed token; the AI service validates issuer, audience, and scope at every request — not just at session start.
mTLS on every service-to-service hop. The TLS handshake itself is the auth event; both sides verify each other's certificate against the workload-identity authority.
MFA / WebAuthn for high-privilege operations — production deployment, key rotation, model artifact promotion.
Phantom-token pattern: external clients see opaque tokens; the gateway exchanges them for short-lived JWTs valid only for the next downstream hop.
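
The per-request checks named in the OIDC item above can be sketched as follows. This is a minimal illustration, assuming the token's signature has already been verified upstream by a JWT library; the function `check_claims`, the `ClaimError` type, and the issuer, audience, and scope values used with it are hypothetical.

```python
import time


class ClaimError(Exception):
    """Raised when a token's claims fail a per-request check."""


def check_claims(claims, *, issuer, audience, required_scope, now=None):
    """Check issuer, audience, expiry, and scope on every request.

    Assumption: `claims` comes from a token whose signature was already
    verified; this function does not verify signatures itself.
    """
    now = time.time() if now is None else now

    if claims.get("iss") != issuer:
        raise ClaimError("unexpected issuer")

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise ClaimError("token not intended for this service")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise ClaimError("token expired or missing exp")

    # "scope" is treated as a space-delimited string.
    if required_scope not in str(claims.get("scope", "")).split():
        raise ClaimError("missing required scope")

    subject = claims.get("sub")
    if not subject:
        raise ClaimError("missing subject")
    return subject
```

Calling this on every request, rather than once at session start, means an expired or down-scoped token stops working on the next call.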

A common mistake in agentic AI: treating the model's tool call as if the model were the principal. The real principal is the human who initiated the session; the model is acting on their behalf, and the policy decision point downstream must see the human's identity, scoped down to what the human is allowed to do via this particular tool.
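
One way to keep the human as the principal is to build the downstream tool request from the authenticated session and never from the model's output. The sketch below assumes a hypothetical gateway that holds a `Session` object; the type names, field names, and the list of stripped argument keys are illustrative, not a prescribed interface.

```python
from dataclasses import dataclass

# Hypothetical argument names a model might use to claim an identity.
IDENTITY_KEYS = {"user", "user_id", "principal", "on_behalf_of"}


@dataclass(frozen=True)
class Session:
    """Authenticated session; populated by the gateway, never by the model."""
    user_id: str
    allowed_tools: frozenset


@dataclass(frozen=True)
class ToolRequest:
    """What the downstream policy decision point receives."""
    principal: str
    tool: str
    args: dict


def build_tool_request(session, model_output):
    """Attach the session's principal and drop identity fields the model proposed."""
    tool = model_output["tool"]
    if tool not in session.allowed_tools:
        raise PermissionError(f"{session.user_id} may not use tool {tool!r}")
    args = {
        key: value
        for key, value in model_output.get("args", {}).items()
        if key not in IDENTITY_KEYS
    }
    return ToolRequest(principal=session.user_id, tool=tool, args=args)
```

Even if a prompt injection makes the model emit `"user_id": "admin"`, the request that reaches the policy decision point still carries the session's real user.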

4. Network Plane

The network plane removes implicit trust between hosts. There is no "flat internal network"; every connection is an authenticated, authorized, encrypted hop.

Microsegmentation: each workload gets its own network identity and its own narrow set of allowed peers. Default-deny east-west traffic.
Service mesh (Istio, Linkerd, Consul) enforces mTLS and L7 authz between services without changes to the application code.
Egress firewall with explicit allowlists for any service that makes outbound calls — especially LLM-tool services that might call arbitrary URLs at the model's request.
VPC endpoints / PrivateLink for cloud APIs, so traffic to managed services (S3, KMS, Secrets Manager) never traverses the public internet.
No private-network bypass: a request from a peer in the same subnet must still authenticate. "Trusted internal" is not a thing.
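
The default-deny egress rule can be illustrated at the application layer. In practice the enforcement point would be a firewall or egress proxy; the sketch below only shows the matching logic, and the allowlist entries and the `egress_allowed` helper are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of exact (scheme, host, port) tuples.
EGRESS_ALLOWLIST = {
    ("https", "api.internal.example", 443),
    ("https", "docs.example.org", 443),
}

DEFAULT_PORTS = {"https": 443, "http": 80}


def egress_allowed(url, allowlist=EGRESS_ALLOWLIST):
    """Return True only for destinations that are explicitly listed."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if port is None:
        port = DEFAULT_PORTS.get(scheme)
    # Anything not matched exactly is denied, including unknown schemes.
    return (scheme, host, port) in allowlist
```

Matching on the parsed hostname rather than on a string prefix avoids lookalike destinations such as `docs.example.org.evil.test` or `docs.example.org@evil.test`.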

5. Policy Decision Point

A policy decision point (PDP) is the centralized place where every authorization question is answered. It receives a structured request — (principal, action, resource, context) — and returns allow/deny plus optional obligations.

Open Policy Agent (OPA) with Rego policies, or Cedar (the open-source policy language that originated at AWS), which supports both RBAC and ABAC. Either way, policy is code: version-controlled, tested, and reviewed.
ABAC over RBAC: decisions use attributes (sensitivity tier, tenant, residency, time-of-day, MFA-recency) rather than static role membership alone.
Just-in-time elevation: the principal requests a scope, the PDP grants a short-lived (15-min, audit-logged) token; no standing privilege.
Decision logs are themselves part of the audit trail.
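
Just-in-time elevation can be sketched with a short-lived, scope-bound token. The example below is a simplified stand-in that uses an HMAC over a JSON payload; it is not a JWT implementation, and the function names, token layout, and key handling are assumptions made for illustration. The 15-minute lifetime follows the figure given above.

```python
import base64
import hashlib
import hmac
import json
import time

TTL_SECONDS = 15 * 60  # lifetime of an elevated grant


def issue_jit_token(key, principal, scope, now=None):
    """Issue a token for one scope that expires after TTL_SECONDS."""
    now = int(time.time() if now is None else now)
    claims = {"sub": principal, "scope": scope, "exp": now + TTL_SECONDS}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(key, body, hashlib.sha256).hexdigest().encode()
    return (body + b"." + sig).decode()


def check_jit_token(key, token, scope, now=None):
    """Return the principal if the token is authentic, in scope, and unexpired."""
    now = time.time() if now is None else now
    try:
        body, sig = token.encode().split(b".")
    except ValueError:
        return None
    expected = hmac.new(key, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["scope"] != scope or now >= claims["exp"]:
        return None
    return claims["sub"]
```

Because the grant expires on its own, there is no standing privilege to revoke; issuing the token is the event that gets audit-logged.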

For LLM tool use the PDP is the natural place to enforce "only this user, only this matter, only this tool, only this argument shape." The model proposes; the PDP disposes.
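
A toy decision function shows the shape of such a check: it takes (principal, action, resource, context) and returns allow or deny plus obligations. This is a sketch in plain Python rather than Rego or Cedar; the tool name, attribute names, and policy table are hypothetical.

```python
from dataclasses import dataclass

SENSITIVITY_ORDER = ("public", "internal", "confidential", "regulated")

# Hypothetical per-tool policy: permitted argument names and a sensitivity ceiling.
TOOL_POLICY = {
    "search_matter_docs": {
        "allowed_args": {"matter_id", "query"},
        "max_sensitivity": "confidential",
    },
}


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str
    obligations: tuple = ()


def decide(principal, action, resource, context):
    """Default-deny: every path that does not explicitly allow returns a deny."""
    policy = TOOL_POLICY.get(action)
    if policy is None:
        return Decision(False, "tool not in policy")
    args = context.get("args", {})
    if not set(args) <= policy["allowed_args"]:
        return Decision(False, "unexpected argument")
    if resource.get("tenant") != principal.get("tenant"):
        return Decision(False, "tenant mismatch")
    if resource.get("matter_id") not in principal.get("matters", ()):
        return Decision(False, "principal not assigned to matter")
    if args.get("matter_id") != resource.get("matter_id"):
        return Decision(False, "argument does not match resource")
    sensitivity = resource.get("sensitivity")
    if sensitivity not in SENSITIVITY_ORDER:
        return Decision(False, "unclassified resource")
    ceiling = SENSITIVITY_ORDER.index(policy["max_sensitivity"])
    if SENSITIVITY_ORDER.index(sensitivity) > ceiling:
        return Decision(False, "sensitivity above tool ceiling")
    return Decision(True, "allowed", obligations=("log_decision",))
```

The model only supplies `action` and `args`; the principal and resource attributes come from the session and the data catalog, so a manipulated model cannot widen its own access.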

6. Data Plane

Data is treated as classified by default. The classification — not the user's request — determines where the data may be processed, what model tier may see it, and how long it lives.

Classification at read time: every record carries a sensitivity tag (public / internal / confidential / regulated); the tag flows through the pipeline as metadata.
Encryption everywhere: AES-256 at rest with customer-managed keys (KMS / CMK); TLS 1.3 + mTLS in transit; envelope encryption with per-tenant data keys.
Classification-driven routing: regulated content is pinned to on-prem inference; public content may go to an external provider. Routing is a function of the classification tag, not user discretion.
Tenant isolation: separate vector indexes per tenant, or strict ACL filters with tenant-id pre-filtering before semantic search.
Right-to-erasure: deletion propagates from the source through embeddings, caches, and downstream features — with a verifiable audit that all derivatives have been purged.
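
Classification-driven routing and tenant pre-filtering can be sketched in a few lines. The backend names are hypothetical, and sending internal and confidential content on-prem is a conservative assumption made here; the text above only fixes the two endpoints (regulated stays on-prem, public may go external).

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    REGULATED = 3


def route(tags):
    """Pick an inference backend from the most sensitive tag in the request.

    Untagged input is treated as the most restrictive case.
    """
    if not tags:
        return "on_prem"
    return "external_provider" if max(tags) == Sensitivity.PUBLIC else "on_prem"


def tenant_prefilter(records, tenant_id):
    """Drop other tenants' records before any semantic search runs."""
    return [r for r in records if r.get("tenant_id") == tenant_id]
```

Routing takes only the tags as input, so a user request has no parameter through which it could override the destination.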

7. Workload Plane

A zero-trust workload plane treats every binary, container image, and model artifact as untrusted until proven otherwise. The bar to enter production is a chain of cryptographic evidence.

Signed artifacts: every container image and model checkpoint is signed with cosign / sigstore at build time and verified at deploy time. Unsigned artifacts cannot be admitted (Kyverno / Gatekeeper enforce this in Kubernetes).
SBOM + SLSA: Software Bill of Materials accompanies every build; SLSA provenance describes how the artifact was produced. Reject builds whose provenance does not match the expected build pipeline.
TEE attestation: confidential-compute environments (Intel TDX, AMD SEV-SNP, NVIDIA confidential GPUs) attest to their measurement before the orchestrator releases keys to them. The model only runs inside an attested TEE.
Image pinning: pin by SHA digest, never by mutable tag. Re-verify signatures on every pull, not just on first pull.
Runtime monitoring (Falco, eBPF-based agents) for syscall anomalies in inference workloads.
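
Digest pinning can be illustrated with a content-hash check. The sketch assumes the pinned digest covers the raw bytes of the artifact, for example a model checkpoint file; for container images the digest covers the image manifest instead. It complements signature verification and does not replace it. The helper names and the reference format handling are illustrative.

```python
import hashlib
import hmac

HEX_DIGITS = set("0123456789abcdef")


def parse_pinned_ref(ref):
    """Split 'name@sha256:<hex>' and reject references that use a mutable tag."""
    name, sep, digest = ref.partition("@")
    if not sep:
        raise ValueError("reference is not pinned by digest")
    algo, sep, hexval = digest.partition(":")
    hexval = hexval.lower()
    if algo != "sha256" or len(hexval) != 64 or not set(hexval) <= HEX_DIGITS:
        raise ValueError("unsupported or malformed digest")
    return name, hexval


def verify_artifact(ref, payload):
    """Return True only if the payload hashes to the pinned digest."""
    _, expected = parse_pinned_ref(ref)
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual, expected)
```

Running the check on every load, not only the first, is what catches a swap that happens after the initial pull.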

8. Continuous Verification

"Assume breach" means the system never finishes verifying. Tokens expire quickly and rotate. Behavior is monitored continuously. Anomalies trigger re-authentication or session termination, not just an alert.

Continuous authentication: short-lived tokens (5–15 min) with seamless refresh; step-up re-auth for sensitive operations within an already-authenticated session.
Behavioral analytics / UEBA: model the baseline tokens-per-request, retrievals-per-session, tools-per-turn for each user; alert on Z-score deviations.
Anomaly detection in inference: flag prompt patterns matching known jailbreak templates, tool-call sequences inconsistent with the user's role, output bursts above the canary threshold.
Hash-chained audit logs: every event is appended with a hash of the previous event; periodically anchored to an external notary so tampering is detectable.
Token rotation on a schedule and on demand: routine rotation happens daily, and a suspected compromise triggers immediate rotation.
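
The hash-chained audit log can be sketched as follows. This is a minimal in-memory illustration: the entry layout and function names are assumptions, and anchoring to an external notary is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, event):
    """Hash the previous entry's hash together with a canonical event encoding."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log):
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

The chain alone does not reveal that the most recent entries were truncated, and an attacker who can rewrite the whole log can recompute every hash. Publishing the latest hash to an external party at intervals closes both gaps, which is the purpose of the periodic anchoring.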

9. Failure Modes & Anti-Patterns

Zero trust is undermined more often by a quiet shortcut than by a frontal failure. The patterns below show up repeatedly in AI deployments that claim a zero-trust posture without actually enforcing one:

Service-account "backdoor" identity. A long-lived static API key bypasses the identity plane — everything authenticated by it is effectively a single super-user. Rotate to short-lived workload identities.
The LLM is the principal. Tools accept the model's claim of caller identity instead of the real user's identity. A prompt injection then acts with whatever privilege the model has. Pass the real principal out-of-band.
Denylist passed off as an allowlist. An egress firewall configured to "allow everything except known-bad" is a denylist, not a default-deny posture, and provides no meaningful confinement for tool calls.
Trust by network position. Treating the production VPC as "internal and trusted" recreates the perimeter model. Every hop authenticates, even between two pods on the same node.
Audit logs that are not append-only. If an attacker who compromises the inference host can edit the logs, the "assume breach" tenet has been quietly violated.
Static, unsigned model artifacts. Pulling a container or a model by mutable tag, with no signature verification, leaves the workload plane open to silent supply-chain swaps.

The unifying lesson: zero-trust is not a product or a control — it is the discipline of removing implicit trust at every junction. Each plane in the diagram above is a place where implicit trust likes to creep back in.
