Architecture
As mentioned above and in our previous articles, WiredTiger is now MongoDB's default storage engine. MongoDB acquired WiredTiger in late 2014, shipped it as an optional engine in MongoDB 3.0, and made it the default in place of MMAPv1 as of MongoDB 3.2. WiredTiger's development team also went to MongoDB, including Keith Bostic and Michael Cahill, who were widely known for their work on Berkeley DB.

WiredTiger is a NoSQL storage engine built on multiversion concurrency control (MVCC). Under MVCC, each operation sees a point-in-time snapshot of the data as of the moment it begins accessing a collection. WiredTiger writes a consistent view of the data to disk at checkpoints: by default, a checkpoint is taken every 60 seconds or after 2 GB of journal data has been written, whichever comes first. This means WiredTiger can always recover to the last checkpoint. If you ever suffer a crash between checkpoints, WiredTiger can also recover un-checkpointed data by replaying its journal files.

WiredTiger is highly scalable. It employs document-level locking, which enables highly concurrent workloads, and its concurrency model allows the server to take advantage of many-core CPUs. It stores its data in a B-tree structure, offering highly efficient reads and good write performance.

To keep hot data in memory, WiredTiger maintains an internal cache with a "least recently used" (LRU) eviction algorithm. By default the cache is sized at roughly half of the system RAM beyond the first 1 GB, though the exact formula varies by MongoDB version. WiredTiger also relies on the OS page cache, which holds compressed data and lets it fetch that data without hitting your disk.

For collections, WiredTiger uses Snappy compression by default. It can also employ zlib (gzip-style) compression, which trades more CPU for a better compression ratio. If necessary, you can override compression on a per-collection basis. For indexes, WiredTiger uses prefix compression both on disk and in memory.

Finally, a few additional features round out WiredTiger's architecture:
- Its disk footprint is small: WiredTiger's disk usage is much less than MMAPv1's, even with compression disabled, because it doesn't need to pad data and has a more efficient data storage format in general.
- Write ahead logging facilitates automatic crash recovery and makes writes durable.
- MongoDB's enterprise edition supports on-disk encryption for WiredTiger.
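The per-collection compression override mentioned above is expressed through the `storageEngine.wiredTiger.configString` option of MongoDB's `create` command. The following is a minimal sketch of building that options document in Python. The helper name `wiredtiger_collection_options`, the database name, and the collection name are hypothetical, and the list of allowed compressors is limited to the ones this article discusses (newer MongoDB releases accept additional values).

```python
def wiredtiger_collection_options(block_compressor="zlib"):
    """Build the options document for a per-collection compressor override.

    Hypothetical helper: it only assembles a dict and does not talk to a server.
    """
    # Assumption: restrict to the compressors discussed in this article.
    allowed = {"none", "snappy", "zlib"}
    if block_compressor not in allowed:
        raise ValueError(f"unsupported block compressor: {block_compressor!r}")
    return {
        "storageEngine": {
            "wiredTiger": {"configString": f"block_compressor={block_compressor}"}
        }
    }


options = wiredtiger_collection_options("zlib")
print(options)

# With a running server and PyMongo installed, the options could be passed as:
#   from pymongo import MongoClient
#   db = MongoClient()["example_db"]            # hypothetical database name
#   db.create_collection("events", **options)   # hypothetical collection name
```

Collections created without this option keep using whatever compressor the server is configured with, so the override only needs to be applied where the CPU-versus-size trade-off differs from the default.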
Advantages
WiredTiger has quite a few advantages that check off exactly what many people are looking for when considering a storage engine. Fundamentally, it's a very sound choice. In general, its advantages stem directly from the engine's architecture, as we've described above:
- WiredTiger is highly scalable with concurrent readers and writers.
- Its compression system allows for efficient storage use and less disk I/O.
- It supports encryption for sensitive data.
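To get a feel for why compression reduces storage and disk I/O for document data, here is a rough illustration using Python's standard `zlib` module. This is not WiredTiger's block compressor, and the JSON-encoded sample documents are a hypothetical stand-in for MongoDB's on-disk format; the sketch only shows that documents with repeated field names tend to compress well, and that a higher compression level spends more CPU in exchange for (typically) smaller output.

```python
import json
import zlib

# Hypothetical sample: 1,000 small documents sharing the same field names.
documents = [
    {"user_id": i, "status": "active", "country": "US", "plan": "basic"}
    for i in range(1000)
]
raw = json.dumps(documents).encode("utf-8")

fast = zlib.compress(raw, 1)   # low level: less CPU, usually larger output
small = zlib.compress(raw, 9)  # high level: more CPU, usually smaller output

print(f"raw:          {len(raw)} bytes")
print(f"zlib level 1: {len(fast)} bytes")
print(f"zlib level 9: {len(small)} bytes")

# Compression is lossless: the original bytes come back unchanged.
assert zlib.decompress(small) == raw
```

Actual ratios depend heavily on the data, so measuring with a representative sample of your own documents is more informative than any generic figure.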
Drawbacks
Despite a very good architecture and the benefits it provides, WiredTiger does have a handful of drawbacks, which you should keep in mind anytime you're choosing a storage engine for MongoDB:
- WiredTiger's concurrency scheme prevents in-place updates; updating one field in a document re-writes the entire document.
- WiredTiger's inclusion in MongoDB is still relatively recent, so it's not fully battle-tested. It has only been included since MongoDB 3.0, a period of a few years. WiredTiger is also a relatively complicated storage engine (in comparison to MMAPv1, at least), and there is less deep experience with running it in production.
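The first drawback follows from how multiversion concurrency control works. The toy sketch below is not WiredTiger's implementation; the `VersionedStore` class and all of its method names are hypothetical. It illustrates the idea under simple assumptions: a single-field update stores a complete new version of the document rather than editing it in place, which is what lets a reader that started earlier keep seeing its original snapshot.

```python
import copy


class VersionedStore:
    """Toy multi-version store: every update appends a full new document version."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of (commit_ts, document)
        self._clock = 0      # monotonically increasing commit timestamp

    def write(self, doc_id, document):
        self._clock += 1
        self._versions.setdefault(doc_id, []).append(
            (self._clock, copy.deepcopy(document))
        )
        return self._clock

    def update_field(self, doc_id, field, value):
        # No in-place edit: copy the latest version, change one field,
        # and store the whole document again as a new version.
        _, latest = self._versions[doc_id][-1]
        new_doc = copy.deepcopy(latest)
        new_doc[field] = value
        return self.write(doc_id, new_doc)

    def snapshot(self):
        """Return the timestamp a reader should use for a consistent view."""
        return self._clock

    def read(self, doc_id, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for ts, doc in reversed(self._versions.get(doc_id, [])):
            if ts <= snapshot_ts:
                return copy.deepcopy(doc)
        return None


store = VersionedStore()
store.write("u1", {"name": "Ada", "visits": 1})
reader_snapshot = store.snapshot()      # a reader starts here
store.update_field("u1", "visits", 2)   # a writer commits afterwards

print(store.read("u1", reader_snapshot))   # the earlier reader still sees visits == 1
print(store.read("u1", store.snapshot()))  # a new reader sees visits == 2
```

The cost is visible in the sketch: changing `visits` stored the `name` field a second time as well, so update-heavy workloads on large documents write more data than an in-place scheme would.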