How key value databases shorten the lifespan of SSDs

Explainer: Organising a key value database for fast reads and writes can lower an SSD’s working life. This is how.

Key value database

A relational database stores data records organised into rows and columns, with each record located by its row and column address. A key value database, such as Redis or RocksDB, instead stores records (values) under a unique key for each record. Each record is written as a key value pair, and the key is used to retrieve the record.
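To make the model concrete, here is a minimal sketch using a plain Python dictionary as a stand-in for a key value store. The key name is illustrative only; real stores such as Redis and RocksDB have their own APIs and are not implemented this way.

```python
# Toy key value store: a Python dict standing in for Redis or RocksDB.
store = {}

# Write: each record (value) is stored under a unique key.
store["user:1001"] = '{"name": "Alice", "plan": "pro"}'

# Read: the key alone is enough to retrieve the record.
record = store["user:1001"]
print(record)
```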

Compaction layers

It is faster to write data sequentially, on both disks and SSDs. Writing key value pairs sequentially (logging or journalling) is fast, but finding (reading) a particular key value pair requires a slow trawl through this log. Methods to speed up reading include organising the data into Log Structured Merge trees (LSM trees). This slows writes a little and speeds up reads a lot.
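The sketch below illustrates the trade-off the LSM tree addresses: appending to a log is fast, but a lookup has to scan the whole log. It is a toy example in Python, with a list standing in for an append-only file.

```python
# Append-only log: fast sequential writes, slow key lookups.
log = []  # stands in for an append-only file on disk or SSD

def put(key, value):
    log.append((key, value))          # sequential append: fast

def get(key):
    for k, v in reversed(log):        # linear trawl: slow on a large log
        if k == key:
            return v
    return None

put("a", 1)
put("b", 2)
put("a", 3)                           # a later write supersedes the earlier one
print(get("a"))                       # -> 3, found by scanning backwards
```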

There is a fairly detailed introduction to LSM tree issues by Confluent CTO Ben Stopford.

In an LSM tree scheme, groups of writes are saved sequentially to smaller index files. Each file is sorted to speed up reads. New, later groups of writes go into new index files, and layers of such files build up.
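A rough sketch of that layering follows: recent writes are buffered in memory, then flushed as a new sorted index file. The flush threshold and names are illustrative assumptions, not real RocksDB parameters.

```python
# Buffer recent writes, then flush them as a new sorted index file (layer).
FLUSH_THRESHOLD = 4       # flush after this many buffered writes (toy value)

memtable = {}             # in-memory buffer of recent writes
sstables = []             # layers of sorted, immutable index files

def put(key, value):
    memtable[key] = value
    if len(memtable) >= FLUSH_THRESHOLD:
        # Sort the buffered writes and write them out as a new layer.
        sstables.append(sorted(memtable.items()))
        memtable.clear()

for i in range(10):
    put(f"key{i:02d}", i)

print(len(sstables), "sorted index files so far")
```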

Each index file has to be read with a separate IO. Every so often the index files are merged, or compacted, to stop their number growing so large that reads slow down overall.
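Here is a simplified sketch of what a compaction step does: two sorted index files are merged into one, with the newer file winning where keys overlap. Real compaction strategies are far more elaborate; this only shows the rewrite of existing data that the next paragraph is about.

```python
# Merge two sorted index files into one, keeping the newest entry per key.
older = [("a", 1), ("c", 3), ("e", 5)]
newer = [("a", 9), ("b", 2), ("d", 4)]

def compact(older_file, newer_file):
    merged = dict(older_file)         # start from the older layer
    merged.update(newer_file)         # newer layer wins on duplicate keys
    return sorted(merged.items())     # one sorted file replaces two

print(compact(older, newer))
# -> [('a', 9), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
```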

On SSDs such compaction involves write cycles in addition to the original data writes. This larger, or amplified, number of writes shortens the SSD’s working life, since a drive’s endurance is defined by the number of write cycles it can support.
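A back-of-envelope illustration of the effect: if compaction re-writes data that the application already wrote once, the total flash writes can be a multiple of the host writes. The figures below are made-up assumptions, not measurements of any real system.

```python
# Write amplification = total flash writes / writes the application issued.
host_writes_gb = 100          # data the application actually wrote
compaction_rewrites_gb = 250  # the same data re-written by merges over time

write_amplification = (host_writes_gb + compaction_rewrites_gb) / host_writes_gb
print(write_amplification)    # -> 3.5: the SSD consumes write cycles 3.5x
                              # faster than the application write rate suggests
```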

What can be done? One response is to rewrite the key value database code to fix the problem. Another is simply to use larger SSDs and periodically move low access rate data to cheaper storage, saving the SSD’s capacity and working life for hot data. Read on for our columnist Hubbert Smith’s thoughts on this matter: Concerning the mistaken belief that key value databases chew up SSDs.