Project Nessie

Project Nessie is an open-source “Git-for-data” catalog for Apache Iceberg, started by Dremio. It treats data tables the way Git treats source code, with commits, branches, tags, and merges, and it does so at the catalog layer, where Iceberg table metadata is tracked. This makes experimentation, isolation, and rollback first-class operations instead of one-off engineering exercises.

Key Features:

Why It Matters:

Traditional lakehouse pipelines couple data ingestion to data publication: a partial load is visible to consumers as soon as it lands. Nessie inverts this. Ingestion happens on a feature branch, validation and QA run against that branch, and only a successful merge exposes the change to the production view. The data team gets the same review-and-promote workflow software engineers have had since the late 2000s.
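The ingest-validate-merge flow described above can be sketched with a toy in-memory catalog. This is an illustration of the idea only, not the Nessie API: ToyCatalog, validate, the branch name "etl", and the "sales" table are all hypothetical, and the merge is simplified so that the target branch just adopts the source branch's table pointers.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """In-memory stand-in for a branching catalog (NOT the Nessie API)."""

    # Each branch maps table name -> immutable tuple of rows.
    branches: dict = field(default_factory=lambda: {"main": {}})

    def create_branch(self, name, source="main"):
        # A new branch starts as a copy of the source branch's table pointers.
        self.branches[name] = dict(self.branches[source])

    def commit(self, branch, table, rows):
        # Appending rows moves only this branch's pointer for the table.
        current = self.branches[branch].get(table, ())
        self.branches[branch][table] = current + tuple(rows)

    def read(self, branch, table):
        return list(self.branches[branch].get(table, ()))

    def merge(self, source, target="main"):
        # Simplified merge: the target adopts the source's table pointers.
        self.branches[target] = dict(self.branches[source])


def validate(rows):
    # Hypothetical quality gate: every row needs a non-negative amount.
    return all(row.get("amount", -1) >= 0 for row in rows)


catalog = ToyCatalog()
catalog.create_branch("etl")

# Ingestion happens on the branch, here as two partial loads.
catalog.commit("etl", "sales", [{"id": 1, "amount": 10}])
catalog.commit("etl", "sales", [{"id": 2, "amount": 25}])

# Consumers reading main see nothing while the load is in flight.
visible_before = catalog.read("main", "sales")

# Publish only if validation passes on the branch.
if validate(catalog.read("etl", "sales")):
    catalog.merge("etl", "main")

visible_after = catalog.read("main", "sales")
```

In this sketch, readers of main never observe the half-finished load. The two commits become visible together, and only after the validation gate passes.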

Use Cases: