VAST Data says its all-QLC flash file storage has been certified as an Nvidia SuperPOD data store.
Nvidia’s SuperPOD houses 20 to 140 DGX A100 AI-focused GPU servers, connected over Nvidia’s InfiniBand HDR (200Gbps) network. The DGX A100 features eight A100 Tensor Core GPUs, 640GB of GPU memory and dual AMD Rome 7742 CPUs in a 6RU box. It also supports BlueField-2 DPUs to accelerate IO. Each box provides up to 5 petaFLOPS of AI performance, meaning up to 100 petaFLOPS in a SuperPOD with 20 of them.
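The SuperPOD arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: it assumes peak AI throughput scales linearly with the number of DGX A100 boxes, which ignores interconnect and software overheads, and the function name `superpod_peak_pflops` is made up for this example rather than taken from any Nvidia tool.

```python
# Back-of-the-envelope peak AI throughput for a SuperPOD.
# Assumption: linear scaling across DGX A100 systems (an idealization).

PFLOPS_PER_DGX_A100 = 5          # "up to" figure quoted in the article
MIN_DGX, MAX_DGX = 20, 140       # SuperPOD size range quoted in the article


def superpod_peak_pflops(num_dgx: int) -> int:
    """Return the idealized peak AI petaFLOPS for a SuperPOD of num_dgx boxes."""
    if not MIN_DGX <= num_dgx <= MAX_DGX:
        raise ValueError(f"SuperPOD size must be {MIN_DGX}-{MAX_DGX} systems")
    return num_dgx * PFLOPS_PER_DGX_A100


if __name__ == "__main__":
    for n in (MIN_DGX, MAX_DGX):
        print(f"{n} x DGX A100 -> up to {superpod_peak_pflops(n)} petaFLOPS")
```

Under the same linear-scaling assumption, a fully populated 140-system SuperPOD would peak at 700 petaFLOPS.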
VAST CEO and co-founder Renen Hallak said: “VAST’s alliance and growing momentum with Nvidia to help customers solve their greatest AI challenges takes another big step forward today … The VAST data platform brings to market a turnkey AI datacenter solution that is enabling the future of AI.”
The VAST pitch is that its Universal Storage system brings to market the first enterprise network attached storage (NAS) system approved to support the Nvidia DGX SuperPOD.
VAST Data co-founder and CMO Jeff Denworth told us: “For years customers have not had an enterprise option for these large systems, since the AI system vendors need to adhere to a very limited set of offerings. Many were burned by other NFS platforms in the past.”
A VAST statement said: “AI and HPC workloads are no longer just for academia and research, but these are permeating every industry and the enterprise players that own and manage their own proprietary AI technologies are going to be differentiated going forward. Historically, customers building out their supercomputing infrastructure have had to make a choice around performance, capabilities, scale and simplicity.”
The company reckons its storage system provides all four attributes, and says: “We have already sold multiple SuperPODs with more in the pipeline so the market is validating/recognizing this as well.”
The Nvidia-VAST relationship dates back to 2016, VAST says, and the original development of its disaggregated, shared-everything (DASE) architecture. VAST supports Nvidia’s GPUDirect storage access protocol and also its BlueField DPUs. VAST’s Ceres data enclosure includes four BlueField DPUs.
DDN has a combined SuperPOD and Lustre-based A3I storage system. NetApp previously certified its E-Series hardware running ThinkParQ’s BeeGFS parallel file system with Nvidia’s SuperPOD. Neither of these is an enterprise NAS system.
Back in 2020, NetApp provided a reference architecture for twinning ONTAP AI with Nvidia’s DGX A100 systems for AI and machine learning workloads. ONTAP is an enterprise NAS operating system as well as a block and object access system. Surely it must be possible to get an all-flash ONTAP system certified as a SuperPOD data store, unless ONTAP’s scalability limit of 24 clustered NAS nodes (12 HA pairs), meaning a 702.7PB maximum effective capacity with the high-end A900, proves to be a blocking restriction.