Write Amplification

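Write Amplification – the extra writing an SSD carries out internally, over and above the data sent to it by the host, because of the way NAND flash is erased. An SSD has its cells organized within blocks. Each block is subdivided into pages of 4KB to 16KB, with perhaps 128, 256 or even more pages per block depending upon the SSD’s capacity.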

Data is written at the page level, into empty cells in pages. You cannot overwrite existing data, or data that has been deleted but not yet erased, with fresh data. The old data has to be erased first, and an SSD cannot erase at the page level. Data is erased by setting whole blocks, with all their pages and cells, to ones. Fresh data (incoming to the SSD) is written into empty pages. When an SSD is brand new, all the blocks and their constituent pages are empty. Once every page in an SSD has been written to once, empty pages can only be created by reclaiming blocks that hold deleted data and erasing them.

When data on an SSD is deleted, the pages occupied by the data are flagged as stale or invalid so that subsequent read attempts for that data fail. The data is only physically erased when the block containing those pages is erased.
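The rules above (program into empty pages only, delete by flagging, erase by whole block) can be sketched as a small model. This is an illustration only: the `Block` class, its method names and the four-page block size are assumptions made for the example, not a real controller interface.

```python
from enum import Enum


class PageState(Enum):
    EMPTY = "empty"      # erased, ready to be programmed
    VALID = "valid"      # holds live data
    INVALID = "invalid"  # holds deleted (stale) data awaiting a block erase


class Block:
    """Hypothetical model of one NAND block; not a real controller API."""

    def __init__(self, pages_per_block=4):
        self.states = [PageState.EMPTY] * pages_per_block
        self.data = [None] * pages_per_block

    def program(self, page, payload):
        # Writes are page-sized and only allowed into an empty page.
        if self.states[page] is not PageState.EMPTY:
            raise ValueError("page must be erased before it can be programmed")
        self.data[page] = payload
        self.states[page] = PageState.VALID

    def delete(self, page):
        # Deletion only flips a flag; the old bits stay in the cells.
        self.states[page] = PageState.INVALID

    def read(self, page):
        # Reads of empty or stale pages fail.
        if self.states[page] is not PageState.VALID:
            raise LookupError("no valid data in this page")
        return self.data[page]

    def erase(self):
        # Erasure works on the whole block, never on a single page.
        n = len(self.states)
        self.states = [PageState.EMPTY] * n
        self.data = [None] * n


block = Block()
block.program(0, "old")
block.delete(0)
try:
    block.program(0, "new")   # rejected: page 0 is stale, not empty
    overwrite_rejected = False
except ValueError:
    overwrite_rejected = True
block.erase()
block.program(0, "new")       # succeeds once the whole block is erased
```

In this sketch the stale page cannot be reused until `erase()` has cleared every page in the block, which is the constraint that makes garbage collection necessary.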

SSD terminology has it that pages are programmed (written) and blocks are erased. Erasing is a special form of writing in which all the cells in a block are set to contain binary ones. NAND cells can only endure so many program/erase (P/E) cycles before they wear out. TLC (3 bits/cell) NAND can support 3,000 to 5,000 P/E cycles, for example.
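As a back-of-envelope illustration of what a P/E limit means in practice, the sketch below estimates how much host data a drive could absorb before its cells reach their rated cycle count. The formula (capacity × P/E cycles ÷ write amplification factor) is a common simplification that assumes perfectly even wear. The function name, the 1TB capacity and the factor of 2 are assumptions chosen for the example, not figures from any particular drive.

```python
def estimated_host_writes_tb(capacity_tb, pe_cycles, waf=1.0):
    """Rough estimate of total host writes (in TB) before wear-out.

    Assumes every cell is worn evenly. `waf` is the write amplification
    factor: NAND writes divided by host writes (1.0 means no extra writes).
    """
    if waf < 1.0:
        raise ValueError("a write amplification factor below 1 is not modelled here")
    return capacity_tb * pe_cycles / waf


# Hypothetical 1TB TLC drive rated at 3,000 P/E cycles.
ideal = estimated_host_writes_tb(1, 3000)            # no extra internal writes
amplified = estimated_host_writes_tb(1, 3000, 2.0)   # every host write costs two NAND writes
```

Doubling the write amplification halves the estimate, which is why controllers work to keep it low.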

Over time, as an SSD is used, some of its data is deleted and the pages that contain that data are marked as invalid. They now contain garbage, as it were. The SSD controller wants to recover the invalid pages so that they can be re-used. It does this at the block level: it copies each valid page in a block to an empty page in a different block and marks the source page as invalid, until the starting block contains only invalid pages. Then every cell in that block is set to one. This process is called garbage collection, and the additional writing of data that it causes is called write amplification. The write amplification factor (WAF) is the amount of data written to the NAND divided by the amount of data written by the host. If no such rewriting happened, the WAF would be 1.
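The garbage collection steps described above can be sketched in a few lines. This is a simplified model, not a controller implementation: blocks are plain lists, `STALE` is a made-up marker for invalid pages, `None` stands for an empty page, and the WAF is taken as total pages written to flash divided by pages written by the host.

```python
STALE = object()  # hypothetical marker for a page holding deleted data


def garbage_collect(victim, destination):
    """Copy valid pages from `victim` into `destination`, then erase `victim`.

    Each block is a list whose entries are a payload (valid page),
    STALE (invalid page) or None (empty page). Returns the number of
    pages that had to be rewritten.
    """
    copied = 0
    for i, page in enumerate(victim):
        if page is None or page is STALE:
            continue
        free_slot = destination.index(None)  # ValueError if no empty page is left
        destination[free_slot] = page        # the extra, amplifying write
        victim[i] = STALE                    # source page is now invalid
        copied += 1
    victim[:] = [None] * len(victim)         # whole-block erase
    return copied


def write_amplification_factor(host_pages, copied_pages):
    """Pages written to flash divided by pages written by the host."""
    return (host_pages + copied_pages) / host_pages


# The host wrote four pages; two were later deleted.
victim = ["a", STALE, "c", STALE]
destination = [None, None, None, None]
copied = garbage_collect(victim, destination)
waf = write_amplification_factor(host_pages=4, copied_pages=copied)
```

In this example the host wrote four pages and garbage collection rewrote two of them, so six pages were written to flash for four pages of host data, a WAF of 1.5.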

Once an entire block is erased it can be used to store fresh, incoming data.

This process is internal to the SSD and carried out as a background process by the SSD controller. The intention is that it does not interfere with foreground data read/write activity and that there are always fresh pages in which to store incoming data.

The SSD controller may track the number of times pages or blocks have been rewritten and, where it has a choice, write data to a destination page with a low P/E cycle count. This equalizes the number of writes (P/E cycles) across the blocks and prevents over-used blocks from wearing out early. The technique is called wear-levelling.
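A minimal sketch of that selection rule follows, assuming the controller keeps a per-block P/E counter. The `pick_least_worn` function, the block numbering and the starting counts are hypothetical; real controllers use more elaborate policies.

```python
def pick_least_worn(erase_counts, candidates):
    """Return the candidate block with the lowest P/E count.

    Ties are broken by the lowest block number so the choice is deterministic.
    """
    return min(candidates, key=lambda block_id: (erase_counts[block_id], block_id))


# Hypothetical P/E counters for four blocks.
erase_counts = {0: 120, 1: 45, 2: 45, 3: 300}
first_choice = pick_least_worn(erase_counts, [0, 1, 2, 3])

# Simulate 100 further program/erase cycles, always using the least-worn block.
for _ in range(100):
    target = pick_least_worn(erase_counts, list(erase_counts))
    erase_counts[target] += 1
```

After the loop the two lightly used blocks have absorbed all 100 cycles, 50 each, while the heavily used blocks are left alone. Their counters move towards the others rather than away from them.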