Back to the future with persistent memory

Persistent memory has the potential to return computing to the pre-DRAM era, but with vastly more powerful NVRAM servers and with external storage relegated to being a reference data store.

Or so says Steven Sicola, the storage sage whose 40-year career includes senior technical roles at Compaq, Seagate, X-IO, SanDisk, Western Digital and Formulus Black.

Server starvation NVRAM fix

In his presentation at the 2019 Flash Memory Summit this month, he argued that modern IT faces looming server starvation: data growth is accelerating and server processors and networks cannot keep up. By 2020 a tipping point will be reached, he predicts, and server scaling will be inadequate in the face of data volumes in the tens of zettabytes.

The memory-storage IO bottleneck is the fundamental flaw afflicting today’s servers. How to overcome this? Sicola’s answer is to re-engineer servers with non-volatile RAM (NVRAM) instead of DRAM.

[Slide from Sicola’s presentation deck]

Such an NVRAM server would hold all its primary data in memory and share it with other NVRAM servers, speeding up distributed computing by orders of magnitude, with no external storage getting in the way. External storage would be relegated to secondary, reference/archive data that is written and read sequentially.

Tipping point

Sicola reckons: “NVRAM based servers, accompanied by an NVRAM-optimised Operating System for transparent paradigm shift are just about here…this is a big tipping point.”

He thinks the flawed Von Neumann era could be about to end, with NVRAM servers: “We now see the re-birth of NVRAM with Intel’s Optane, WD’s MRAM, etc.”

Optane Persistent Memory is 3D XPoint media deployed in a DIMM form factor and accessed in application direct access (DAX) mode. It is non-volatile and is also known as storage-class memory.
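
In DAX mode an application maps the persistent media straight into its address space and reaches it with ordinary loads and stores – no page cache, no block IO. Here is a minimal sketch using PMDK’s libpmem; the /mnt/pmem/example path is an assumption, standing in for any file on a DAX-mounted filesystem.

```c
/* A minimal DAX sketch using PMDK's libpmem. Assumes /mnt/pmem is a
   DAX-mounted filesystem; the path and size are illustrative.
   Build with: cc dax_demo.c -lpmem */
#include <libpmem.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    size_t mapped_len;
    int is_pmem;

    /* Map the file straight into the address space; subsequent loads
       and stores bypass the page cache and the block layer entirely. */
    char *buf = pmem_map_file("/mnt/pmem/example", 4096,
                              PMEM_FILE_CREATE, 0666,
                              &mapped_len, &is_pmem);
    if (buf == NULL) {
        perror("pmem_map_file");
        return 1;
    }

    /* An ordinary CPU store -- no write() syscall, no IO request. */
    strcpy(buf, "hello, persistent memory");

    /* Flush CPU caches so the store survives power loss. */
    if (is_pmem)
        pmem_persist(buf, mapped_len);
    else
        pmem_msync(buf, mapped_len);  /* fallback for non-pmem mappings */

    pmem_unmap(buf, mapped_len);
    return 0;
}
```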

NVRAM servers will be able to handle many more virtual machines and much larger working data sets, especially for Big Data, AI and all the other applications begging for more server power – databases, mail, VDI, financial software and, as Sicola says, you name it.

Memory-storage bottleneck

For Sicola, the big bottleneck now is between memory and storage, where there is an IO chokepoint. That’s because storage access is so much slower than DRAM access. Despite the use of SSDs, CPU cycles are still wasted waiting for storage IO to complete.
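
The chokepoint is easy to make visible. The sketch below times one 4 KiB read served from DRAM against one forced out to the storage device via O_DIRECT; the /tmp/testfile path is an assumption and should point at a file of at least 4 KiB on a real block device (O_DIRECT fails on tmpfs). On typical hardware the storage read is orders of magnitude slower than the in-memory pass – the gap Sicola wants to close.

```c
/* Illustrative sketch of the memory-storage chokepoint: time one 4 KiB
   read from DRAM against one forced to the storage device. The path
   /tmp/testfile is an assumption -- point it at a file of at least
   4 KiB on a real block device. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static double elapsed_us(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_nsec - a.tv_nsec) / 1e3;
}

int main(void)
{
    struct timespec t0, t1;
    char *buf;

    /* O_DIRECT requires an aligned buffer. */
    if (posix_memalign((void **)&buf, 4096, 4096) != 0)
        return 1;
    memset(buf, 1, 4096);  /* make the buffer resident in DRAM */

    /* DRAM: sum 4 KiB that is already in memory. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    unsigned long sum = 0;
    const unsigned long *w = (const unsigned long *)buf;
    for (size_t i = 0; i < 4096 / sizeof *w; i++)
        sum += w[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("DRAM read:    %8.2f us (checksum %lu)\n",
           elapsed_us(t0, t1), sum);

    /* Storage: one 4 KiB read that bypasses the page cache. */
    int fd = open("/tmp/testfile", O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (pread(fd, buf, 4096, 0) != 4096)
        perror("pread");
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("Storage read: %8.2f us\n", elapsed_us(t0, t1));

    close(fd);
    free(buf);
    return 0;
}
```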

A slide in his FMS19 presentation declared: “It has been the ‘Holy Grail’ for all software engineers and those making servers to have NVRAM instead of cache (DRAM), but have been living with a flawed Von Neumann architecture for almost 40 years!!”

This bottleneck has come about because the Von Neumann computing architecture, seen in the first mainframes, has been modified over time and become flawed.

The original mainframes did not have dynamic random access memory (DRAM), the volatile stuff. Instead they used NVRAM – remember ferrite cores? These machines were examples of Von Neumann architecture, which was first described in 1945. Minicomputers also used Von Neumann architecture. 

Von Neumann architecture

In these systems a central processing unit accessed instructions and data from a memory unit. That memory was non-volatile, as dynamic (volatile) random-access memory – DRAM – wasn’t commercially available until the 1970s. Instructions and data were loaded from an input peripheral unit and the system delivered results to an output peripheral unit.

As systems developed, a bottleneck was identified between the CPU and the memory unit: instructions and data were read into the CPU over a single data path. The later Harvard architecture gave machines separate instruction and data paths, enabling them to operate faster.

A modified Harvard architecture machine combines the data and instruction memory units with a single address space, but has separate instruction and data pathways between it and discrete instruction and data caches. The caches are accessed by the CPU.

This architecture is used by x86, ARM and Power ISA processors. The memory units use DRAM technology, dating from the 70s, with x86 processors appearing from 1978 onwards. DRAM usage meant persistent storage was separate from memory, with memory functioning as a cache.
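
That split is visible on commodity hardware today: a single address space, but separate level 1 instruction and data caches in front of it. As a small sketch, glibc on Linux exposes the cache sizes through sysconf(); the _SC_LEVEL1_* constants are glibc extensions, so treat this as Linux-specific.

```c
/* A small illustration: glibc on Linux reports the split L1 caches
   implied by the modified Harvard design. The _SC_LEVEL1_* constants
   are glibc extensions and may report 0 on some platforms. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long l1i = sysconf(_SC_LEVEL1_ICACHE_SIZE); /* instruction cache */
    long l1d = sysconf(_SC_LEVEL1_DCACHE_SIZE); /* data cache */
    long l2  = sysconf(_SC_LEVEL2_CACHE_SIZE);  /* unified level 2 */

    printf("L1i: %ld bytes, L1d: %ld bytes, L2: %ld bytes\n", l1i, l1d, l2);
    return 0;
}
```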

Since then, CPUs have gained faster clock rates and more cores, developed a hierarchy of ever-faster caches, and internal data paths have sped up, with today’s PCIe 3.0 making way for its successor, PCIe 4.0.

But not today, not yet

However, NVRAM servers are not yet ready to go mainstream, according to Sicola. Today’s hybrid RRAM/flash NVDIMMs are too early: they are too expensive, lack capacity and are over-complex, with firmware in memory.

He thinks network bandwidth could rise markedly, with today’s switch-based networking giving way to memory-like hops over torus networks, such as those from Rockport Networks. Memory fabrics like Gen-Z could also speed data transfer between NVRAM servers, with PCIe 4.0 accelerating internal traffic.

It’s a glorious future but will it come to pass? Will 40 years of x86 servers using DRAM give way to servers based around persistent memory? Some might think that’s a big ask.