NoSQL Databases

The non-relational landscape spans six major categories on this page — in-memory caches, document stores, wide-column tables, distributed and embedded key-value stores, time-series databases, and search engines. Each models data in a way that outperforms an RDBMS for a specific access pattern. Pick the engine that matches the shape of your data and the queries you actually run, not the one with the most familiar SQL dialect. Graph databases have moved to their own section — see Graph Databases.

In-Memory

Redis

The most-deployed NoSQL system. In-memory data-structure server — strings, hashes, sorted sets, streams, vectors. Cache, session store, queue, leaderboard, rate limiter in one tool.
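A minimal sketch of two of those roles, a sorted-set leaderboard and a fixed-window rate limiter, using the go-redis client. The address, key names, and the 100-requests-per-minute limit are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	// Assumed local Redis; adjust Addr for your deployment.
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Leaderboard: a sorted set keyed by score.
	rdb.ZAdd(ctx, "leaderboard", redis.Z{Score: 1500, Member: "alice"})
	top, _ := rdb.ZRevRangeWithScores(ctx, "leaderboard", 0, 9).Result()
	fmt.Println(top)

	// Fixed-window rate limiter: INCR, then EXPIRE on the first hit.
	key := "rate:alice:" + time.Now().Format("2006-01-02T15:04")
	n, _ := rdb.Incr(ctx, key).Result()
	if n == 1 {
		rdb.Expire(ctx, key, time.Minute)
	}
	if n > 100 { // hypothetical limit: 100 requests per minute
		fmt.Println("rate limited")
	}
}
```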

Memcached

The original distributed in-memory cache. Multi-threaded, byte-strings only, no persistence — deliberately simple, and the natural baseline in any comparison with Redis.

Document Stores

MongoDB

Document-based NoSQL using BSON with flexible schema. Wins on nested data, schemaless ingest, and fast prototyping where rigid relational tables slow you down.
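A short sketch with the official Go driver (v1 API): a nested document goes in without any prior schema, and a dotted-path filter queries into it. The database, collection, and fields are hypothetical.

```go
package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	orders := client.Database("shop").Collection("orders")

	// No migration needed: each document carries its own nested structure.
	_, err = orders.InsertOne(ctx, bson.M{
		"customer": "alice",
		"items":    bson.A{bson.M{"sku": "A-1", "qty": 2}},
		"shipping": bson.M{"city": "Berlin", "express": true},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Query into the nested structure with a dotted path.
	cur, err := orders.Find(ctx, bson.M{"shipping.city": "Berlin"})
	if err != nil {
		log.Fatal(err)
	}
	defer cur.Close(ctx)
}
```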

Apache CouchDB

The original document DB. Multi-master replication, MVCC, HTTP API — the foundation for offline-first mobile and PWA applications via PouchDB / Couchbase Lite.

Couchbase

Distributed document DB merging Membase’s caching with CouchDB’s document model. SQL-style queries via N1QL, built-in cache, full-text search, and analytics in one cluster.

Wide-Column

Apache Cassandra

The canonical Dynamo-style wide-column DB. Masterless, multi-region, write-optimized — the storage layer behind Netflix, Apple, Discord, and Instagram.
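A hedged sketch of the write path with the gocql client. The contact points, keyspace, and table are hypothetical; the comment shows the kind of partition key that makes Cassandra writes cheap.

```go
package main

import (
	"log"
	"time"

	"github.com/gocql/gocql"
)

func main() {
	// Contact points and keyspace are illustrative.
	cluster := gocql.NewCluster("10.0.0.1", "10.0.0.2", "10.0.0.3")
	cluster.Keyspace = "telemetry"
	cluster.Consistency = gocql.LocalQuorum // quorum within the local DC

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// Hypothetical table: events(device_id, ts, reading)
	// with PRIMARY KEY ((device_id), ts): one partition per device,
	// rows clustered by time, which is what keeps writes append-like.
	if err := session.Query(
		`INSERT INTO events (device_id, ts, reading) VALUES (?, ?, ?)`,
		"dev-42", time.Now(), 21.7,
	).Exec(); err != nil {
		log.Fatal(err)
	}
}
```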

ScyllaDB

Cassandra-compatible rewrite in C++ on Seastar. Shard-per-core architecture delivers 5–10× throughput at lower p99 — same data model, far fewer nodes.

Apache HBase

Open-source clone of Google Bigtable on HDFS. Strongly consistent, master-coordinated — the wide-column choice when Hadoop is already in production.

Distributed Key-Value

FoundationDB

Distributed ordered key-value store with serializable ACID semantics. The KV foundation Apple builds higher-level data models on top of, including documents and SQL.
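A minimal sketch with the official Go binding, assuming a default cluster file: a read-modify-write that commits atomically, with the client retrying the closure on conflict. The API version and key are illustrative.

```go
package main

import (
	"log"

	"github.com/apple/foundationdb/bindings/go/src/fdb"
)

func main() {
	fdb.MustAPIVersion(710) // must match the installed client library
	db := fdb.MustOpenDefault()

	// Transact retries the closure on conflict; the whole read-modify-write
	// commits atomically with serializable isolation.
	_, err := db.Transact(func(tr fdb.Transaction) (interface{}, error) {
		v, err := tr.Get(fdb.Key("counter")).Get()
		if err != nil {
			return nil, err
		}
		n := byte(0) // single-byte counter, purely for illustration
		if len(v) > 0 {
			n = v[0]
		}
		tr.Set(fdb.Key("counter"), []byte{n + 1})
		return nil, nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```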

Embedded Key-Value

LevelDB

Embedded LSM-tree library from Google — fast writes, ordered range scans, and the storage engine behind Chrome IndexedDB and Bitcoin Core.
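A sketch of the ordered range scan using goleveldb, a pure-Go reimplementation of the same on-disk format (the C++ library itself is used via bindings). The path and keys are illustrative.

```go
package main

import (
	"fmt"
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func main() {
	db, err := leveldb.OpenFile("/tmp/demo-leveldb", nil) // path is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Keys live in sorted byte order, so a shared prefix
	// gives an ordered range scan for free.
	db.Put([]byte("user:001"), []byte("alice"), nil)
	db.Put([]byte("user:002"), []byte("bob"), nil)

	iter := db.NewIterator(util.BytesPrefix([]byte("user:")), nil)
	for iter.Next() {
		fmt.Printf("%s = %s\n", iter.Key(), iter.Value())
	}
	iter.Release()
	if err := iter.Error(); err != nil {
		log.Fatal(err)
	}
}
```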

RocksDB

Meta’s production-grade LevelDB fork — column families, transactions, bloom filters, multi-threaded compaction. Powers MyRocks, CockroachDB, TiKV, Kafka Streams.

LMDB

Memory-mapped B+tree KV from the OpenLDAP project. Sub-microsecond reads via mmap, single-writer MVCC, ~10K lines of C. Used by OpenLDAP and Monero.

BadgerDB

Pure-Go LSM KV with WiscKey value separation. Concurrent writers, ACID transactions — the storage engine inside Dgraph and many Go-native services.
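A minimal sketch of Badger's transactional API; the path and keys are illustrative.

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v4"
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("/tmp/demo-badger")) // path is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Read-write transaction: atomic commit.
	err = db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte("session:abc"), []byte(`{"user":"alice"}`))
	})
	if err != nil {
		log.Fatal(err)
	}

	// Read-only transaction; the value is accessed through a closure
	// because Badger keeps large values in its separate value log.
	err = db.View(func(txn *badger.Txn) error {
		item, err := txn.Get([]byte("session:abc"))
		if err != nil {
			return err
		}
		return item.Value(func(val []byte) error {
			fmt.Printf("%s\n", val)
			return nil
		})
	})
	if err != nil {
		log.Fatal(err)
	}
}
```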

bbolt

Pure-Go B+tree KV (the etcd team’s fork of BoltDB). Embedded in etcd, Consul, InfluxDB metadata, OPA, NATS JetStream — the go-to control-plane KV in Go.
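A sketch of the bucket API that makes bbolt a control-plane favorite: one fully serialized writer, with readers running against a consistent snapshot. File and key names are illustrative.

```go
package main

import (
	"fmt"
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	db, err := bolt.Open("control-plane.db", 0600, nil) // file name is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Single writer, fully serialized: a good fit for small, critical state.
	err = db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("config"))
		if err != nil {
			return err
		}
		return b.Put([]byte("leader"), []byte("node-1"))
	})
	if err != nil {
		log.Fatal(err)
	}

	// Readers run concurrently against a consistent snapshot.
	_ = db.View(func(tx *bolt.Tx) error {
		fmt.Printf("%s\n", tx.Bucket([]byte("config")).Get([]byte("leader")))
		return nil
	})
}
```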

Time-Series

InfluxDB

The most-deployed open-source time-series database. Time-Structured Merge Tree storage, Flux query language, Telegraf’s 200+ input plugins. The default DevOps TSDB.
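A hedged sketch of a raw line-protocol write over HTTP, assuming an InfluxDB 2.x server; the org, bucket, and token are placeholders.

```go
package main

import (
	"log"
	"net/http"
	"strings"
)

func main() {
	// Line protocol: measurement,tags fields timestamp(ns).
	line := "cpu,host=web01 usage=42.5 1700000000000000000"

	// Assumes an InfluxDB 2.x server; org, bucket, and token are placeholders.
	url := "http://localhost:8086/api/v2/write?org=my-org&bucket=metrics&precision=ns"
	req, err := http.NewRequest("POST", url, strings.NewReader(line))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Token MY_TOKEN")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status) // 204 No Content on success
}
```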

Search Engines

Elasticsearch & OpenSearch

Distributed search and JSON document store on Lucene. BM25 relevance, aggregations, vector ANN. The dominant open log-analytics and application-search platform.
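A minimal full-text query over the plain HTTP API, which works the same against Elasticsearch and OpenSearch; the index and field names are placeholders.

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
	"strings"
)

func main() {
	// BM25-ranked full-text match; index name and field are placeholders.
	query := `{"query": {"match": {"message": "connection timeout"}}}`

	req, err := http.NewRequest("POST",
		"http://localhost:9200/logs/_search", strings.NewReader(query))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	io.Copy(os.Stdout, resp.Body) // JSON hits, ranked by _score
}
```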

Related on this site

Graph Databases

Neo4j, Amazon Neptune, JanusGraph, Dgraph — nodes and edges as first-class citizens, with native traversal that beats recursive SQL on connected data.

Amazon DynamoDB

AWS managed key-value/document database — single-digit-millisecond reads at any scale, with on-demand and provisioned capacity modes.

Vector Databases

Specialized similarity-search databases for embeddings — the modern NoSQL companion for AI workloads. HNSW, IVF, and PQ indexing.

Snowflake VARIANT (semi-structured SQL)

Snowflake’s VARIANT type lets a SQL warehouse query JSON/Avro/Parquet natively — the relational counter-move to NoSQL.


About this section. Reach for a document store like MongoDB when records have variable or deeply nested fields and the schema needs to evolve without migrations. Use a distributed key-value engine like FoundationDB when you need low-latency primitives and strict transactional guarantees. Choose Cassandra / ScyllaDB for write-heavy multi-region workloads and Redis for in-memory speed with rich data structures. Use a purpose-built TSDB or search engine when the access pattern (timestamped windows, full-text relevance) doesn’t fit a row-store at all. For relationship-heavy workloads where multi-hop traversals dominate, see Graph Databases.


MongoDB vs. LevelDB vs. RocksDB

The three databases occupy fundamentally different tiers: MongoDB is a server-based document database with rich query semantics, while LevelDB and RocksDB are embedded byte-string key-value libraries. Pick based on whether you need a queryable database (MongoDB), an embedded LSM library (LevelDB for simplicity, RocksDB for production scale), or a storage engine to build a distributed system on top of (RocksDB).

Architecture & Feature Comparison

| Architecture | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Deployment | Server (mongod), networked | Embedded library, in-process | Embedded library, in-process |
| Multi-process access | Yes (clients connect over TCP) | No (one writer; directory lock) | No (one writer; directory lock) |
| Storage engine | WiredTiger (B-tree; optional LSM) | LSM tree | LSM tree (heavily optimized) |
| Replication | Replica sets, automatic failover | None (build it yourself) | None (used as backend by replicated DBs) |
| Sharding | Built-in (range / hash / zone) | None | None directly; foundation for sharded DBs |

| Data Model | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Records | BSON documents (nested, typed) | (bytes, bytes) | (bytes, bytes) |
| Schema | Flexible, optional validators | None | None |
| Secondary indexes | B-tree, hashed, geo, text, vector | None (compose via key prefixes; sketch below) | None (compose via key prefixes; sketch below) |
| Query language | MQL + aggregation pipeline | Get / Put / Iter only | Get / Put / Iter only |
| Joins / aggregations | Yes ($lookup, $group, etc.) | None | None |

| Concurrency & Durability | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Transactions | Multi-document ACID, distributed | Atomic batches only | Optimistic + pessimistic, snapshot isolation |
| Compaction threads | WiredTiger eviction (multi-threaded) | 1 | N (configurable, often 8–16) |
| Bloom filters | Internal (WiredTiger) | Manual / off by default | First-class, per column family |
| Compression | Snappy, ZSTD, ZLIB (per collection) | Snappy | Snappy, LZ4, ZSTD, ZLIB, BZIP2 |
| Backup | mongodump, replica-based, FS snapshot | None built-in | Checkpoint (hard-link) + incremental BackupEngine |

| Operational Footprint | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Code size | ~3M LOC (MongoDB + WiredTiger) | ~25k LOC | ~400k LOC |
| Tuning surface | Moderate | Small | Vast |
| License | SSPL | BSD-3-Clause | Apache 2.0 / GPLv2 |
| Used by | SaaS apps, CMS, IoT, content stores | Chrome IndexedDB, Bitcoin Core | CockroachDB, TiKV, MyRocks, Kafka Streams, Flink |
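What the "compose via key prefixes" cells above mean in practice: a hedged sketch with goleveldb, where the primary record and a manually maintained index entry are written in one atomic batch, and "query by email" becomes an ordered prefix scan. The key layout is an illustrative convention, not a library feature.

```go
package main

import (
	"fmt"
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func main() {
	db, err := leveldb.OpenFile("/tmp/demo-idx", nil) // path is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Write the primary record and its index entry in one atomic batch.
	// Key layout (illustrative): user:<id> and idx:email:<email>:<id>.
	batch := new(leveldb.Batch)
	batch.Put([]byte("user:42"), []byte(`{"name":"alice","email":"a@x.io"}`))
	batch.Put([]byte("idx:email:a@x.io:42"), nil)
	if err := db.Write(batch, nil); err != nil {
		log.Fatal(err)
	}

	// "Query by email" = ordered prefix scan over the index keyspace.
	iter := db.NewIterator(util.BytesPrefix([]byte("idx:email:a@x.io:")), nil)
	for iter.Next() {
		fmt.Printf("index hit: %s\n", iter.Key())
	}
	iter.Release()
}
```

Everything MongoDB does for you (atomicity of index updates, key encoding, scan planning) is your code here, which is exactly the trade the comparison table describes.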

Performance Estimates

Numbers below are rough order-of-magnitude estimates for a modern commodity server (16-core CPU, NVMe SSD, 64 GiB RAM), 256-byte values, working set fitting in RAM, durable writes (fsync per group commit). Actual numbers vary 5–10× with key/value size, hardware, and tuning — treat these as a starting frame, not a benchmark.

| Single Client (1 connection / thread) | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Point read latency (p50) | ~0.5–1 ms (local), 1–3 ms (LAN) | ~5–20 µs (cache hit) | ~3–10 µs (cache hit) |
| Point write latency (p50) | ~1–3 ms (w:1) | ~30–100 µs (async) | ~20–80 µs (async) |
| Sync write (fsync) latency | ~3–10 ms (j:true) | ~1–5 ms | ~1–5 ms |
| Point reads / sec | ~10–30 K | ~200–500 K | ~500 K–1 M |
| Async writes / sec | ~5–15 K | ~50–100 K | ~100–300 K |

| Concurrent Clients (16–128 threads) | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Aggregate point reads / sec | ~50–150 K (scales w/ replicas) | ~500 K–1 M (read-mostly) | ~1–3 M |
| Aggregate writes / sec | ~30–80 K (scales w/ shards) | ~100–200 K (single-threaded compaction is the ceiling) | ~500 K–1 M+ (multi-threaded compaction) |
| Range scan throughput | ~100–500 K docs/sec (indexed) | ~1–3 M kv/sec | ~2–5 M kv/sec |
| Concurrency model | Connection pool, document-level locks | Single writer; readers don’t block | Single writer; multi-threaded compaction; transaction managers |

| Scale Ceiling (single host) | MongoDB | LevelDB | RocksDB |
|---|---|---|---|
| Practical dataset size | ~TB per node, PB sharded | ~10s of GB before compaction pain | ~10s of TB |
| Bottleneck under heavy writes | WiredTiger checkpointing, network | Single-threaded compaction | Disk bandwidth, then CPU |
| Bottleneck under heavy reads | Network round-trip, query planning | L0 file count without bloom filters | Block cache hit rate |

How to read these numbers. The order-of-magnitude gap between MongoDB and the embedded engines is mostly the network round-trip and query-planning overhead — MongoDB does much more per call. The gap between LevelDB and RocksDB widens dramatically under concurrent writes because LevelDB compacts on a single thread while RocksDB compacts on many. For a workload where MongoDB’s schema, indexes, and aggregations don’t earn their cost, dropping to RocksDB can be a 10–50× throughput win — at the price of writing your own indexes, query layer, and replication.

When the comparison flips. MongoDB’s per-operation overhead is dwarfed when the query itself is non-trivial. A $lookup across two indexed collections in MongoDB can be faster end-to-end than the equivalent application-side join on top of RocksDB, because Mongo runs it close to the data. Always benchmark the actual workload, not synthetic point-read microbenchmarks.
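For concreteness, a sketch of such a server-side join with the Go driver's aggregation pipeline; the collections and field names are hypothetical.

```go
package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	orders := client.Database("shop").Collection("orders")

	// $match narrows first (and can use an index), then $lookup joins
	// server-side, avoiding per-row network round-trips from the app.
	pipeline := mongo.Pipeline{
		{{Key: "$match", Value: bson.D{{Key: "status", Value: "open"}}}},
		{{Key: "$lookup", Value: bson.D{
			{Key: "from", Value: "customers"},
			{Key: "localField", Value: "customerId"},
			{Key: "foreignField", Value: "_id"},
			{Key: "as", Value: "customer"},
		}}},
	}
	cur, err := orders.Aggregate(ctx, pipeline)
	if err != nil {
		log.Fatal(err)
	}
	defer cur.Close(ctx)
}
```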
