Microsoft advances toward glass-based archival storage

Project Silica, a Microsoft plan to store multiple layers of archival data inside slabs of quartz glass, is getting closer to becoming a product, as a 16-page document explains.

Microsoft’s update on its glass archival storage project takes the form of an academic paper submitted to the 29th ACM Symposium on Operating Systems Principles (SOSP 2023).

The aim is to develop a cloud-scale archive media system for reading and writing data. Data is encoded by creating polarization-based patterns at points inside a square glass slab. Each point is defined by 3D coordinates, and there can be hundreds of layers.

These polarization pattern points are called voxels and are produced using femtosecond laser pulses. Each voxel encodes 3 to 4 bits of data. Voxels are written side by side in 2D layers across the platter’s XY plane. They are organized into rectangular sectors, each a 2D group of 100,000+ voxels in an XY plane holding about 100 KB of data. A 3D set of sectors is called a track, and there can be multiple TB of data per platter.
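As a rough check on those figures, the sketch below works out how many voxels a 100 KB sector would need at 3 and at 4 bits per voxel. It assumes 100 KB means 100,000 bytes and ignores any error-correction overhead, neither of which the article specifies; the function name is hypothetical.

```python
SECTOR_BYTES = 100_000  # "about 100 KB" per sector, taken as 100,000 bytes (assumption)


def voxels_needed(payload_bytes: int, bits_per_voxel: int) -> int:
    """Minimum voxels to hold payload_bytes, ignoring coding overhead."""
    total_bits = payload_bytes * 8
    # Integer ceiling division avoids floating-point rounding.
    return (total_bits + bits_per_voxel - 1) // bits_per_voxel


for bits in (3, 4):
    count = voxels_needed(SECTOR_BYTES, bits)
    print(f"{bits} bits/voxel -> {count:,} voxels per sector")
```

Under these assumptions a sector needs roughly 200,000 to 270,000 voxels, which is consistent with the "100,000+" figure.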

This is somewhat similar to Cerabyte’s technology, which also uses femtosecond laser pulses to create physical changes in a ceramic coating on square glass slabs. The changes are nano-scale holes, like a high-technology punch card. But Cerabyte’s holes are generated as part of QR codes whereas Project Silica’s voxels are laid down in tracks. Project Silica’s slabs are placed on a platform that moves from left to right and forward-backward underneath read and write head devices (lasers and polarization microscopes). Cerabyte’s glass carrier only moves forward and backward, and is a single-layer medium. Because its platform can bring any track under the head, Project Silica glass is a random-access medium.

Both Cerabyte and Microsoft envisage library racks to hold the data storage medium – in cartridges holding square glass data carriers in Cerabyte’s design, but as raw quartz glass slabs in Microsoft’s library. This library has a robotic transfer system composed of several independent and battery-powered robot pickers (shuttles), a small swarm as it were, with the ability to flip up and down vertically between multiple horizontal rails running through the library racks to the read and write racks. A video demonstrates this. Microsoft likens it to “a set of free roaming shuttles inspired by state-of-the-art warehouse robotic systems.”

Two independent robot pickers in Microsoft’s Project Silica library

Microsoft says: “The read drive scans sectors in a single swift Z-pattern, and the resulting images are processed for decoding. Different read drive options offer varying throughput, balancing cost and performance.”

There are physically different read and write drives in the Project Silica system.

There is a one-way system between the write racks and the library racks to prevent a platter being overwritten. Like Cerabyte, the Project Silica technology is inherently write-once. It’s a physical WORM system. Microsoft says: “The robotics are unable to insert a glass platter into a write device once the glass media has been written.” That means there is a physical air gap at the library system level, and also that glass platters are written to full capacity in one operation, from the deepest to the top layer.

The write drive is full rack-sized and writes multiple platters concurrently; ditto the read drive rack which contains multiple drives. Both read and write drive racks require cooling, power, and network connectivity.

Written platters are read (verified) before being stored in the library. That means a freshly written platter is transferred by shuttle to a read drive. The Microsoft paper states: “To enable high drive efficiency, two platters can be mounted simultaneously in a read drive; one undergoing verification, and one servicing a customer read. Customer traffic is prioritized over verification.”
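The prioritization rule can be illustrated with a small priority queue. This is only a sketch of "customer traffic first", not Microsoft's scheduler: the class, the constants, and the platter names are all hypothetical, and it models job ordering only, not the two physical mount positions.

```python
import heapq
import itertools

CUSTOMER, VERIFY = 0, 1  # lower value is served first (hypothetical encoding)


class ReadDriveQueue:
    """Orders read-drive jobs so customer reads always precede verification."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a class

    def submit(self, kind: int, platter_id: str) -> None:
        heapq.heappush(self._heap, (kind, next(self._order), platter_id))

    def next_job(self):
        """Return (kind, platter_id) for the next job, or None if idle."""
        if not self._heap:
            return None
        kind, _, platter_id = heapq.heappop(self._heap)
        return kind, platter_id


queue = ReadDriveQueue()
queue.submit(VERIFY, "platter-A")
queue.submit(CUSTOMER, "platter-B")
queue.submit(VERIFY, "platter-C")
print(queue.next_job())  # the customer read is served before either verification
```

Verification jobs still drain in arrival order whenever no customer read is waiting.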

Read (scanned) Project Silica voxel images are passed through a machine learning algorithm to turn them into binary data. The read drive seeks on the XY plane to locate a desired track then reads an entire track’s sectors in a single scan in the Z (depth) direction. The read drive throughput scales in multiples of 30 MBps.
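To put the 30 MBps step in context, the sketch below estimates pure streaming time for a given amount of data at different multiples. It assumes MBps means decimal megabytes per second and ignores seek, mount, and decode time; the function name is hypothetical.

```python
BASE_MBPS = 30  # throughput step quoted above, in megabytes per second


def scan_seconds(data_bytes: int, multiple: int = 1) -> float:
    """Streaming time for data_bytes at multiple x 30 MBps (no seek or mount time)."""
    if multiple < 1:
        raise ValueError("multiple must be >= 1")
    return data_bytes / (multiple * BASE_MBPS * 1_000_000)


one_tb = 10**12  # 1 TB, decimal
for m in (1, 2, 4):
    hours = scan_seconds(one_tb, m) / 3600
    print(f"{m * BASE_MBPS} MBps: {hours:.1f} hours per TB")
```

At the base rate a full terabyte takes a little over nine hours to stream, which is why drive options with higher multiples trade cost against performance.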

Microsoft’s researchers analyzed Azure archive IO patterns and found that small file IOs dominated (256 MiB to 256 GiB) and that workloads differed significantly at the datacenter level. This means that “minimizing the latency of mechanical movement in the library is crucial for optimal performance,” and also that a Silica library should be customizable for different workload patterns.

The Project Silica library system uses two error coding techniques – LDPC (Low Density Parity Check) error coding within each sector and NC (Network Erasure Coding) across sectors, with within-track, large group (multi-track), and cross-platter NC variants employed.
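The idea behind erasure coding across a group of sectors can be shown with the simplest possible code: a single XOR parity sector that can rebuild any one lost sector. This is a deliberately swapped-in toy technique, not the LDPC or NC schemes the paper describes, and it assumes all sectors are the same length; every name here is hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(sectors: List[bytes]) -> bytes:
    """Parity sector: XOR of all data sectors."""
    return reduce(xor_bytes, sectors)


def recover(sectors: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing sector (marked None) from the parity sector."""
    missing = [i for i, s in enumerate(sectors) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can rebuild only one lost sector")
    repaired = list(sectors)
    if missing:
        survivors = [s for s in sectors if s is not None]
        # XOR of the parity with every surviving sector yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired


track = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(track)
damaged = [track[0], None, track[2]]
print(recover(damaged, parity) == track)
```

Real schemes add more redundancy so that several sectors, whole tracks, or even an entire platter can be lost and still reconstructed.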

The Silica library’s read performance is measured as completion time: the delay between the reception of a read request and the last byte being read and sent from the library. It is defined at the 99.9th percentile, the tail completion time. Microsoft assumes “an SLO of 15 hours to the last byte, which is in line with current archival services.”

This completion time does not include the disaggregated decode by the machine learning algorithms, however.
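A tail completion time of this kind can be computed from a set of request timings and compared with the 15-hour SLO. The sketch below uses the nearest-rank percentile method and entirely made-up sample data; the function name and the per-thousand parameter are hypothetical choices, not anything taken from the paper.

```python
SLO_HOURS = 15.0  # "15 hours to the last byte", as quoted above


def tail_completion(samples, per_thousand=999):
    """Nearest-rank percentile; per_thousand=999 gives the 99.9th percentile."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of per_thousand * n / 1000 avoids floating-point error.
    rank = (per_thousand * len(ordered) + 999) // 1000
    return ordered[max(rank, 1) - 1]


# Hypothetical completion times in hours for 1,000 read requests.
completion_hours = [2.0] * 998 + [14.0, 20.0]
tail = tail_completion(completion_hours)
print(f"99.9th percentile: {tail} h, meets SLO: {tail <= SLO_HOURS}")
```

In this invented data set a single 20-hour outlier does not break the SLO, because only one request in a thousand is allowed to fall beyond the 99.9th percentile.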

Microsoft’s paper concludes: “The unique properties of the glass media and the clean slate, cloud-first co-design of the hardware and software allow Silica to be fundamentally more sustainable and achieve significantly lower costs for archival data than magnetic tape.”

Comment

This glass-based archive represents the first credible tape archive replacement technology we have seen, being far more realistic than DNA storage. Microsoft and Cerabyte are working on tape archive replacement technologies that could result in a deliverable product within five years. Tape companies should be looking at this technology to see how they might incorporate it in their product planning roadmaps.