IBM unveils Hyperstore for faster remote data access using NVMe-oF in Storage Scale

IBM is planning a “Hyperstore” for Storage Scale to get data from remote drives faster using NVMe over Fabrics.

Update: Added that Frank Kraemer’s quote comes from a LinkedIn post, 26 June 2024.

Storage Scale is the latest incarnation of IBM’s venerable GPFS (General Parallel File System), which speeds file reads and writes by having file system nodes (servers) operate in parallel. NVMe over Fabrics (NVMe-oF) is a protocol that effectively extends NVMe’s PCIe-based access semantics across network links, using TCP, Fibre Channel, or RDMA transports (such as RoCE over Ethernet and InfiniBand), to give accessing servers direct block-level access to remote storage drives.

IBM IT architect Frank Kraemer, via a LinkedIn post, thinks that the ideas expressed by Tom Lyon in the “NFS must die” article are “pretty cool” and says: “We have plans that go into a similar direction with using NVMe-oF for speed but we’ll still keep the classic way of file system interface and Erasure Coding (GPFS Native Raid – GNR) for ease of use and safe operations.” 

These plans center on a Storage Scale Hyperstore feature, which will provide an NVMe-oF performance pool.

An IBM presentation, IBM Vendor Update – Storage, at the HPC User Forum mentioned the concept. Presenter Chris Maestas, Chief Architect for Storage File and Object Systems in IBM’s Data and AI Storage Solutions unit, said that data is everywhere in a hybrid and multi-cloud world, and compute, both CPU and GPU-based, wants to access remote data as if it were local.

Storage Scale generally enables that to happen by providing storage access, abstraction with a single global namespace, and acceleration, as a slide illustrates:

IBM Storage Scale slide

With reference to AI workloads and GPUs, he said admins could effectively bring remote data closer to the compute, using NVMe-oF to emulate local storage on the GPU compute nodes. This principle was demonstrated at the SC22 event using IBM’s ESS 3500 with SSDs, which delivered more than 10 million IOPS and hundreds of GBps to accessing compute clients. The system used an integrated extreme high-IOPS storage pool.

This led to the Hyperstore development:

IBM Hyperstore slide

Hyperstore was revealed to the Spectrum Scale User Group last week in London. It is a tiered system in which Storage Scale provides an intermediate reliable storage pool using GNR (native declustered RAID), a performance pool, and local drives on the client compute nodes.

These access the reliable storage pool using Network Shared Disks (NSDs), logical groupings of storage disks in a network on file storage systems. Storage Scale stripes files across NSD servers, which store the stripes as blocks, and accessing clients perform parallel I/O to the NSD servers in real time.
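The striping idea can be illustrated with a toy sketch. This is not IBM code: the block size, server names, and round-robin placement below are simplified assumptions for illustration; real Storage Scale uses configurable block sizes and its own placement logic.

```python
# Toy illustration of parallel-file-system striping: a file is split into
# fixed-size blocks distributed round-robin across NSD servers, and a client
# reassembles the file by reading from all servers.
BLOCK_SIZE = 4  # bytes; real GPFS/Storage Scale block sizes are far larger
NSD_SERVERS = ["nsd1", "nsd2", "nsd3"]  # hypothetical server names

def stripe(data: bytes, servers=NSD_SERVERS, block_size=BLOCK_SIZE):
    """Split data into blocks and assign each block to a server round-robin."""
    placement = {s: [] for s in servers}
    for i in range(0, len(data), block_size):
        server = servers[(i // block_size) % len(servers)]
        placement[server].append(data[i:i + block_size])
    return placement

def reassemble(placement, servers=NSD_SERVERS):
    """Interleave the blocks held by each server back into the original file.
    A real client would issue these reads to all servers in parallel."""
    blocks, iters = [], [iter(placement[s]) for s in servers]
    while True:
        got = False
        for it in iters:
            block = next(it, None)
            if block is not None:
                blocks.append(block)
                got = True
        if not got:
            return b"".join(blocks)

placement = stripe(b"parallel file system")
assert reassemble(placement) == b"parallel file system"
```

The point of the layout is that a single large read or write fans out across all NSD servers at once, so aggregate bandwidth scales with the number of servers.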

The performance pool drives, a subset of the reliable pool’s drives, are accessed using NVMe-oF, which is faster than the NSD route. There is a shared cache across all the compute nodes.

More Hyperstore details will be revealed in coming months.