MAID and WORM PLC flash archive

Seeing a 128TB Samsung SSD using QLC (4bits/cell) flash got me thinking about a PLC (5bits/cell) version, which could have a capacity of 150TB or so. Imagine 32 of these (in long ruler format) put in a 1RU chassis, giving us a 4.8PB box. Now take another step forward and put 40 of these chassis in a rack to produce a 192PB vault. Compress and dedupe it 3x and we have a 576PB box – slightly over half an exabyte.
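As a quick sanity check, the sums work out like this; the 150TB per-drive capacity and the 3x reduction ratio are assumptions made above, not product specs:

```python
# Back-of-the-envelope check of the capacity claims above. The per-drive
# capacity and 3x reduction ratio are the article's assumptions, not specs.
drive_tb = 150            # assumed PLC ruler SSD capacity, TB
drives_per_chassis = 32   # long-ruler drives per 1RU chassis
chassis_per_rack = 40     # 1RU chassis per rack
reduction_ratio = 3       # assumed compression + dedupe ratio

raw_pb = drive_tb * drives_per_chassis * chassis_per_rack / 1000
effective_pb = raw_pb * reduction_ratio

print(f"Raw capacity per rack:       {raw_pb:.0f} PB")        # 192 PB
print(f"Effective capacity per rack: {effective_pb:.0f} PB")  # 576 PB, ~0.58 EB
```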

Update. Comment added from Nimbus CEO Thomas Isakovich. 10 August 2022

It would be pretty useless, as the power budget would probably blow past a 15kW rack limit. So you could only use it if most of the drives were inactive – as in an archive scenario where the bulk of the drives sit on standby until you need to read their data. Data is streamed in and written infrequently, and you could have an SLC (1bit/cell) landing zone to speed that up. Have we just invented an all-flash archive?
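To put a rough number on that power problem (the per-drive active figure below is an assumption for illustration, not a measured spec):

```python
# Hypothetical all-active power draw for the imagined rack. The 20W
# per-drive active figure is an assumption for illustration, not a spec.
drives_per_rack = 32 * 40          # 32 drives per 1RU chassis, 40 chassis
active_watts_per_drive = 20        # assumed active power per ruler SSD

total_kw = drives_per_rack * active_watts_per_drive / 1000
print(f"All-active rack power: {total_kw:.1f} kW")   # 25.6 kW – past a 15 kW budget
```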

It would have an access speed many times that of tape and would be an active all-flash archive.

A Quantum i6H Scalar tape library rack can hold, we estimated, 14.4PB uncompressed and 36PB of compressed data using LTO-9 tapes. Our imagined archive AFA could store 576PB – 16 times more.
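Taking those tape figures at face value, the ratio works out as:

```python
# Ratio check against the tape library estimate above.
tape_compressed_pb = 36   # estimated compressed capacity of the LTO-9 rack
flash_archive_pb = 576    # effective capacity of the imagined flash rack

print(f"Flash archive vs tape rack: {flash_archive_pb / tape_compressed_pb:.0f}x")  # 16x
```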

PLC flash has much lower endurance than QLC – say sub-500 program/erase cycles. That makes it useful for SD cards for cameras, camcorders and USB sticks, but not for enterprise mixed read-write use. Read-intensive use is another matter, though, especially when the actual read rate is liable to be low anyway. As a write-once-read-many (WORM) technology, PLC flash makes sense.
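A rough, hedged illustration of why low endurance matters little for write-once data: even at a few hundred program/erase cycles, a high-capacity drive can absorb far more written data than an archive ingest stream is ever likely to send it. The cycle count and ingest rate below are assumptions:

```python
# Illustrative lifetime-write budget for a low-endurance PLC drive under a
# write-once archive workload. All inputs are assumptions, not measurements.
drive_tb = 150           # assumed PLC drive capacity, TB
pe_cycles = 500          # assumed PLC endurance, program/erase cycles
ingest_tb_per_day = 1    # assumed archive write rate per drive, TB/day

lifetime_writes_pb = drive_tb * pe_cycles / 1000              # 75 PB write budget
years_to_wear_out = lifetime_writes_pb * 1000 / ingest_tb_per_day / 365

print(f"Total write budget: {lifetime_writes_pb:.0f} PB")
print(f"Years to wear out at {ingest_tb_per_day} TB/day: {years_to_wear_out:.0f}")  # ~205
```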

Discussing this with analyst Chris Evans, we realized that, given an SSD’s quite low idle-state power draw, you could in fact have a rack full of drives and not exceed the rack’s power budget – so long as software managed the overall power draw by limiting the number of drives active at any one time. In effect we would have the flash equivalent of the old Copan MAID disk drive array concept – a massive array of inactive drives.
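A minimal sketch of that power-capping logic, with assumed idle and active per-drive wattages (real ruler SSDs will differ):

```python
# MAID-style power capping: how many drives may be active at once under a
# fixed rack power budget. Idle and active wattages are assumptions.
def max_active_drives(total_drives, rack_budget_w, idle_w=5, active_w=25):
    """Return the number of drives that can be active simultaneously."""
    baseline_w = total_drives * idle_w        # every drive idles by default
    headroom_w = rack_budget_w - baseline_w   # power left for waking drives
    extra_per_active_w = active_w - idle_w    # added draw per active drive
    return max(0, int(headroom_w // extra_per_active_w))

# 1,280 drives (32 per chassis x 40 chassis) under a 15 kW rack budget
print(max_active_drives(total_drives=1280, rack_budget_w=15_000))   # 430
```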

If an analyst and I can come up with this idea during a coffee break at the Flash Memory Summit, then engineers at the solid state drive manufacturers got there months ago. If the idea really has legs – price/performance/power budget/access speed legs – then we might see the first flickers of light about it emerging from the darkness of the solid state product development jungle in the next few months.

Or perhaps this is all a flash fantasy and will get nowhere.

Nimbus comment

Nimbus CEO Thomas Isakovich contacted us to say: “This achievement (a flash-based exascale data vault) was my motivation when designing our ExaDrive 3.5” SSD. To achieve this, one must re-think the goals and priorities:

  • Focus on power efficiency, capacity, and cost, not IOps, GBps, or DWPD – go with SATA and QLC, which offer lower power and lower cost than NVMe/TLC and adequate performance
  • Leverage existing density-optimized enclosures – go with 3.5” top-loading enclosures that have unmatched volumetric rack density and a healthy and competitive ecosystem

Your idea is not fantasy – in fact, it is quite a reality today with ExaDrive:

  • Using ExaDrive 64 TB QLC SSDs achieves 1.5 PB per U raw today (using 90- or 100-slot 4U enclosures from multiple vendors), or about 68 PB raw capacity per 44U rack.
  • With 5:1 compression/dedupe, and accounting for SSD fault redundancy, that’s 335 PB usable per rack, or a whopping 1 exabyte usable in just 3 racks.
  • Power (due to our patented SATA/QLC ExaDrive controller) is a low 20 kW per rack, easily achievable in today’s datacenters, which equals a mere 0.06 watts per usable TB.
  • Note that our approach is based on an industry-standard drive form factor and interface, not a proprietary flash module/blade.”
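As an editorial cross-check, the quoted ExaDrive figures hang together roughly as follows; the 100-slot enclosure is one of the two options cited in the comment, and the usable-capacity figure is taken straight from it:

```python
# Cross-check of the quoted ExaDrive figures. The 100-slot enclosure is one
# of the two cited options; usable capacity is the quoted 335 PB per rack.
drive_tb = 64
slots_per_4u = 100            # quote cites 90- or 100-slot 4U enclosures
enclosures_per_rack = 11      # 11 x 4U enclosures fill a 44U rack
rack_power_w = 20_000         # quoted rack power
usable_pb = 335               # quoted usable capacity per rack

raw_pb_per_u = drive_tb * slots_per_4u / 4 / 1000
raw_pb_per_rack = raw_pb_per_u * 44

print(f"Raw per U:           {raw_pb_per_u:.2f} PB")    # 1.60 PB (1.44 with 90 slots)
print(f"Raw per rack:        {raw_pb_per_rack:.0f} PB") # ~70 PB (~63 with 90 slots)
print(f"Racks for 1 EB:      {1000 / usable_pb:.1f}")   # ~3.0
print(f"Watts per usable TB: {rack_power_w / (usable_pb * 1000):.3f}")  # ~0.060
```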