Nvidia buys Excelero to speed GPU cluster block data access

GPU giant Nvidia is buying NVMe block data access software specialist Excelero for an undisclosed amount.

This acquisition first reared its head as a possibility last month. Now the two firms have agreed a deal in which Excelero’s IP and most of its staff – including co-founders CEO Yaniv Romem, chief scientist Omri Mann, and engineering VP Ofer Oshri – will move under Nvidia’s roof.

Romem said in a statement: “The Excelero team is joining Nvidia as demand is surging for high-performance computing and AI. We’ll be working with Nvidia to ensure our existing customers are supported, and going forward we’re thrilled to apply our expertise in block storage to Nvidia’s world-class AI and HPC platforms.”

Suresh Ollala, Nvidia’s senior director of engineering, wrote in a blog that the Excelero team “bring deep expertise in the block storage that large businesses use in storage-area networks. Now their mission is to help expand support for block storage in our enterprise software stack such as clusters for high performance computing. Block storage also has an important role to play inside the DOCA software framework that runs on our DPUs.”

Early Excelero NVMesh diagram

Excelero, founded in 2014, developed NVMesh software that took block data from SSDs and presented it to remote systems as a pool of block storage, like a SAN. The SSDs could be attached to servers in a hyperconverged infrastructure (HCI) appliance or in a converged system. This would involve a cluster of servers, each with its own SSD storage, presented to hosts as a single block storage pool with remote direct memory access using NVMe.
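
To make the pooling idea concrete, here is a toy Python sketch of how drives on several servers could be presented as one logical block range. It is an illustration only and not NVMesh's actual design: the `Drive` and `BlockPool` names, the node and drive labels, the capacities, and the simple concatenation layout are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    node: str    # hypothetical server that physically hosts the SSD
    name: str    # hypothetical drive identifier on that server
    blocks: int  # capacity in fixed-size blocks


class BlockPool:
    """Toy pool: concatenates drives from several nodes into one block range."""

    def __init__(self, drives):
        self.drives = list(drives)

    @property
    def capacity(self):
        # Total number of blocks a host sees in the pool.
        return sum(d.blocks for d in self.drives)

    def locate(self, lba):
        """Map a pool-wide logical block address to (drive, local block)."""
        if not 0 <= lba < self.capacity:
            raise IndexError(f"LBA {lba} outside pool of {self.capacity} blocks")
        for drive in self.drives:
            if lba < drive.blocks:
                return drive, lba
            lba -= drive.blocks  # skip past this drive's range
        raise AssertionError("unreachable: bounds were checked above")


# Three servers each contribute one SSD; the host addresses a single pool.
pool = BlockPool([
    Drive("node-a", "nvme0", 1000),
    Drive("node-b", "nvme0", 1000),
    Drive("node-c", "nvme0", 500),
])

drive, offset = pool.locate(1200)
print(drive.node, drive.name, offset)  # node-b nvme0 200
```

A host reading block 1200 of the pool never needs to know which server holds it; the mapping layer resolves that. In a real system the remote read would then travel over the network fabric, which is the part this sketch leaves out.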

The software using the block data could be running as VMware virtual machines or as Kubernetes-orchestrated containers.

Nvidia’s AI enterprise storage stack is made up of software suites intended to make it more straightforward for enterprises to run AI systems using Nvidia GPUs and/or its BlueField DPUs (SmartNICs). As part of that, the GPUs need to be loaded with block data at high speed so that they can do their processing work and write the results back to storage.

The lower parts of Excelero’s software stack read and write data to drives and package and secure it for high-speed transfer to processing resources. This is what interests Nvidia. It seems that in a cluster, block data from one node might be wanted by another, and that cross-cluster data access is a core part of Excelero’s wares.

Ollala said: “Nvidia will continue to support Excelero’s customers by honouring its contracts. Looking ahead, Excelero’s technology will be integrated into Nvidia’s enterprise software stack.” Excelero will be merged into Nvidia’s engineering organisation and not be a separate unit with its own identity.

Nvidia, whose attempt to buy Arm fell through, acquired Bright Computing and its cluster management software in January. Nvidia is building an AI cluster software stack with software control and data paths that will make its GPU servers more usable by enterprises, which won’t have to roll their own software to get AI applications running on Nvidia’s hardware.