Customer-Managed Keys & Encryption

Databricks encrypts every byte at rest with AES-256 by default. Customer-Managed Keys (CMK), also marketed as Bring-Your-Own-Key (BYOK), replace the Databricks-owned root key with one you own in AWS KMS, Azure Key Vault, or GCP KMS. You control the key, its rotation schedule, and its audit trail, and if you revoke the key, every surface that depends on it goes dark within seconds.


1. Envelope Encryption Hierarchy

Databricks uses a standard envelope-encryption scheme. The customer's KMS key never leaves the HSM — it only wraps a per-object Data Encryption Key (DEK), which in turn encrypts the actual bytes.

┌──────────────────────────────────────────────────────────────────────────────────┐
│                     LAYER 1 — KMS ROOT KEY (Customer-Owned)                      │
│        AWS KMS  |  Azure Key Vault  |  GCP KMS  |  HSM-backed, revocable         │
└──────────────────────────────────────────────────────────────────────────────────┘
                                         │  wraps
                                         ▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│                   LAYER 2 — DATA ENCRYPTION KEY (DEK, AES-256)                   │
│      Generated per object/volume  |  Wrapped by KMS root  |  Cached briefly      │
└──────────────────────────────────────────────────────────────────────────────────┘
                                         │  encrypts
                                         ▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│                              LAYER 3 — DATA AT REST                              │
│       Notebooks  |  Secrets  |  DBFS Root  |  Cluster EBS  |  Delta Tables       │
└──────────────────────────────────────────────────────────────────────────────────┘
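The three layers above can be sketched in a few lines. This is a minimal illustration of the envelope structure only, not Databricks' implementation: to stay runnable on the standard library, it swaps AES-256 for a SHA-256 counter keystream (a stand-in that must not be used for real data), and it holds the root key in memory where a real deployment would call the KMS to wrap and unwrap. The names Envelope, encrypt_object, and decrypt_object are hypothetical.

```python
import hashlib
import os
from dataclasses import dataclass


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR data with a SHA-256 counter keystream (NOT AES)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


@dataclass
class Envelope:
    wrapped_dek: bytes  # Layer 2 key, encrypted under the Layer 1 root key
    dek_nonce: bytes
    data_nonce: bytes
    ciphertext: bytes   # Layer 3 data, encrypted under the DEK


def encrypt_object(root_key: bytes, plaintext: bytes) -> Envelope:
    dek = os.urandom(32)  # fresh 256-bit DEK for this object
    dek_nonce, data_nonce = os.urandom(16), os.urandom(16)
    ciphertext = _keystream_xor(dek, data_nonce, plaintext)
    # In a real system this wrap step is a KMS call; the root key never leaves it.
    wrapped_dek = _keystream_xor(root_key, dek_nonce, dek)
    return Envelope(wrapped_dek, dek_nonce, data_nonce, ciphertext)


def decrypt_object(root_key: bytes, env: Envelope) -> bytes:
    # Unwrap the DEK with the root key, then decrypt the data with the DEK.
    dek = _keystream_xor(root_key, env.dek_nonce, env.wrapped_dek)
    return _keystream_xor(dek, env.data_nonce, env.ciphertext)
```

Only the wrapped DEK is stored next to the ciphertext, so losing access to the root key makes every object unreadable without touching the data itself.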

2. CMK for Managed Services (Notebooks & Secrets)

The Managed Services CMK protects items the Databricks control plane stores on your behalf, such as notebooks and secrets.

You first register the key as an account-level key configuration:

databricks account customer-managed-keys create \
  --use-cases MANAGED_SERVICES \
  --aws-key-info '{"key_arn":"arn:aws:kms:us-east-1:123:key/abcd","key_alias":"alias/dbx-managed"}'

The resulting key configuration is then attached to the workspace.

3. CMK for Workspace Storage (DBFS Root)

The Storage CMK encrypts the workspace's root S3 bucket (AWS) or ADLS Gen2 container (Azure), the so-called DBFS root. This is where init scripts, libraries, and MLflow artifacts live by default.

Note: the Storage CMK does not automatically apply to Unity Catalog managed tables. Those live in customer-owned buckets or containers that Unity Catalog reaches through storage credentials, so encrypt that storage separately with its own CMK at the S3 / ADLS level.
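For the AWS case, bucket-level encryption with your own key comes down to a default-encryption document on the bucket. The sketch below builds that document as a plain dictionary in the shape S3's PutBucketEncryption operation expects. The helper name sse_kms_bucket_config, the key ARN, and the bucket name in the comment are hypothetical placeholders.

```python
import json


def sse_kms_bucket_config(kms_key_arn: str, bucket_key: bool = True) -> dict:
    """Build an S3 default-encryption document for SSE-KMS with a customer key."""
    parts = kms_key_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "kms":
        raise ValueError(f"not a KMS key ARN: {kms_key_arn!r}")
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys reduce the number of KMS requests per object.
                "BucketKeyEnabled": bucket_key,
            }
        ]
    }


# Hypothetical key ARN for a Unity Catalog bucket; substitute your own.
config = sse_kms_bucket_config("arn:aws:kms:us-east-1:123:key/ijkl")
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, it could be applied as:
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="my-uc-bucket",  # hypothetical bucket name
#       ServerSideEncryptionConfiguration=config,
#   )
```

The Azure equivalent is configured on the storage account rather than per container and is not covered by this sketch.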

4. CMK for Cluster EBS Volumes

Cluster nodes have local EBS volumes for shuffle spill, caching, and the operating system. The EBS CMK wraps the per-volume DEK that EC2 uses for transparent disk encryption. It is configured once per workspace and applies to every cluster launched in that workspace. On AWS there is no separate EBS use case: the key is registered under the STORAGE use case with reuse_key_for_cluster_volumes set to true.

databricks account customer-managed-keys create \
  --use-cases STORAGE \
  --aws-key-info '{"key_arn":"arn:aws:kms:us-east-1:123:key/efgh","key_alias":"alias/dbx-ebs","reuse_key_for_cluster_volumes":true}'
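The CLI calls in sections 2 and 4 both pass a use case and a key descriptor. As a sketch, the same information can be assembled as the JSON body of a key-configuration request. The field names (use_cases, aws_key_info, key_arn, key_alias, reuse_key_for_cluster_volumes) follow the Databricks Account API as I understand it and should be checked against the current API reference; the helper key_config_payload is hypothetical.

```python
import json

VALID_USE_CASES = {"MANAGED_SERVICES", "STORAGE"}


def key_config_payload(use_cases, key_arn, key_alias, reuse_for_cluster_volumes=None):
    """Assemble a key-configuration request body (assumed Account API shape)."""
    use_cases = list(use_cases)
    if not use_cases:
        raise ValueError("at least one use case is required")
    unknown = set(use_cases) - VALID_USE_CASES
    if unknown:
        raise ValueError(f"unknown use cases: {sorted(unknown)}")

    aws_key_info = {"key_arn": key_arn, "key_alias": key_alias}
    if reuse_for_cluster_volumes is not None:
        # The cluster-volume flag only makes sense on a STORAGE key.
        if "STORAGE" not in use_cases:
            raise ValueError("reuse_key_for_cluster_volumes requires STORAGE")
        aws_key_info["reuse_key_for_cluster_volumes"] = reuse_for_cluster_volumes

    return {"use_cases": sorted(use_cases), "aws_key_info": aws_key_info}


# Placeholder ARNs and aliases taken from the CLI examples in this document.
managed = key_config_payload(
    ["MANAGED_SERVICES"], "arn:aws:kms:us-east-1:123:key/abcd", "alias/dbx-managed"
)
storage = key_config_payload(
    ["STORAGE"], "arn:aws:kms:us-east-1:123:key/efgh", "alias/dbx-ebs",
    reuse_for_cluster_volumes=True,
)
print(json.dumps([managed, storage], indent=2))
```

Validating the combination up front catches the easy mistake of asking for cluster-volume encryption on a key that is registered only for managed services.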

5. Key Rotation

6. Cloud-Provider Specifics

AWS — KMS

Azure — Key Vault / Managed HSM

GCP — Cloud KMS

7. Revocation Impact

Disabling or scheduling deletion of the CMK is the nuclear option, and it works: within seconds, the surfaces that depend on that key can no longer be decrypted.

Re-enable the key and the workspace recovers without data loss. Deletion, unlike disabling, is permanent, so schedule it only after a hold period. AWS KMS enforces a 7 to 30 day pending-deletion window during which the deletion can still be cancelled; use it.
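Why revocation takes effect quickly but not instantly follows from the "cached briefly" note in the hierarchy: an unwrapped DEK stays usable until its cache entry expires, after which the next unwrap call fails against the disabled key. The toy model below simulates that behavior with no real cryptography. FakeKms, DekCache, and the 5-second TTL are hypothetical and do not reflect actual Databricks cache settings.

```python
from dataclasses import dataclass, field


class KeyDisabledError(Exception):
    """Raised when the root key cannot be used to unwrap a DEK."""


@dataclass
class FakeKms:
    enabled: bool = True

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        if not self.enabled:
            raise KeyDisabledError("root key is disabled")
        return wrapped_dek[::-1]  # placeholder for a real KMS decrypt call


@dataclass
class DekCache:
    kms: FakeKms
    ttl_seconds: float
    _entries: dict = field(default_factory=dict)  # wrapped_dek -> (dek, expires_at)

    def get(self, wrapped_dek: bytes, now: float) -> bytes:
        cached = self._entries.get(wrapped_dek)
        if cached and now < cached[1]:
            return cached[0]  # still within the TTL: no KMS call needed
        self._entries.pop(wrapped_dek, None)  # drop the expired entry
        dek = self.kms.unwrap(wrapped_dek)    # fails once the key is disabled
        self._entries[wrapped_dek] = (dek, now + self.ttl_seconds)
        return dek


kms = FakeKms()
cache = DekCache(kms, ttl_seconds=5.0)
cache.get(b"wrapped", now=0.0)  # unwraps through the KMS and caches the DEK
kms.enabled = False             # revoke: disable the root key
cache.get(b"wrapped", now=3.0)  # still served from the cache
try:
    cache.get(b"wrapped", now=6.0)  # cache entry expired, unwrap now fails
except KeyDisabledError as exc:
    print("access lost:", exc)
kms.enabled = True              # re-enable the key
cache.get(b"wrapped", now=7.0)  # access recovers, nothing was destroyed
```

The same model shows why re-enabling the key is lossless: the wrapped DEKs and the ciphertext were never modified, only the ability to unwrap them.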

