WekaIO races out of the blocks in GPUDirect storage race

WekaIO has chalked up some benchmarks that show it is faster than VAST Data when delivering data to Nvidia’s DGX-2 GPU server via GPUDirect. Dual-port cards and better software probably accounted for the difference.

Update: VAST Data used InfiniBand and not Ethernet in its testing. 4 Nov 2020.

Let’s briefly recap GPUDirect, which Nvidia introduced to the world in July. It speeds up the processing of AI and analytics workloads by ensuring Nvidia DGX-2 GPU servers are not left idle by slow data delivery from storage.

GPUDirect technology bypasses a server memory bottleneck by enabling DMA (direct memory access) between GPU memory and NVMe storage drives. It enables the storage system NIC to talk directly to the GPU, avoiding the DGX-2’s CPU and memory subsystem.

Customers use Nvidia’s GPUs to accelerate graphics-related, AI, and ML workloads, which can take many hours or even days using x86 processors. Such GPU servers can cost millions of dollars, and having them idle while waiting for data is far from ideal. Hence the need for GPUDirect and the focus on sheer data delivery speed by storage suppliers.

Four high-end storage suppliers have declared their support for GPUDirect: DDN, Excelero, VAST Data and WekaIO. VAST and WekaIO have published performance benchmarks, with WekaIO’s 97.9GB/sec beating VAST Data’s 92.6GB/sec. Excelero and DDN are yet to publish results.

There is no mention of price/performance in any of the GPUDirect benchmark comparisons. These are not formal industry benchmarks like the SPC-1 suite, where price/performance is a vital measure; the players are focussed on GB/sec to the exclusion of $/GB/sec.

Apples don’t equal apples

VAST Data suggests that comparing its GPUDirect result with Weka’s is not comparing apples to apples, as the two companies used different test scenarios.

WekaIO, working with Microsoft Research, used a “single NVIDIA DGX-2 server connected to a WekaFS cluster over a Mellanox InfiniBand switch the testers were able to achieve 97.9GB/s of throughput to the 16 NVIDIA V100 GPUs using GPUDirect Storage.”

In its test, VAST Data used NFS over RDMA, across InfiniBand, with GPUDirect to achieve its 92.6GB/sec data delivery to a DGX-2.

Nvidia’s DGX-2 has 8 x 100Gbit/s interfaces. A VAST Data source, speaking unofficially, said that 8 x 100Gbit/s comes out at 100GB/sec, a fully saturated line rate. That bandwidth has to carry the data payload but some bandwidth percentage will be lost due to link overhead.

That means 8 x 100Gbit/s interfaces do not actually carry a 100GB/sec data payload. VAST Data delivered 92.6 per cent of 100GB/sec, and claims the links are essentially saturated at that rate. They cannot carry any more data.
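The line-rate arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are our own, it uses the figures quoted in this article, and it makes no attempt to model link overhead, which the VAST source did not quantify.

```python
# Figures quoted in the article; the names below are illustrative only.
LINKS = 8                  # network interfaces on the DGX-2
LINK_GBIT_PER_SEC = 100    # nominal speed of each interface


def line_rate_gbyte_per_sec(links, gbit_per_link):
    """Aggregate raw line rate in GB/sec (8 bits per byte, no overhead)."""
    return links * gbit_per_link / 8


def utilisation(measured_gbyte_per_sec, line_rate):
    """Fraction of the raw line rate that a measured throughput represents."""
    return measured_gbyte_per_sec / line_rate


line_rate = line_rate_gbyte_per_sec(LINKS, LINK_GBIT_PER_SEC)

for vendor, measured in (("VAST Data", 92.6), ("WekaIO", 97.9)):
    share = utilisation(measured, line_rate)
    print(f"{vendor}: {measured} GB/sec is {share:.1%} of {line_rate:.0f} GB/sec")
```

Because the raw line rate works out at exactly 100GB/sec, each supplier’s GB/sec figure doubles as its percentage of line rate.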

How was WekaIO then able to deliver 97.9GB/sec?

Our VAST source’s best guess is that Microsoft Research is reporting some NIC-to-CPU bandwidth that Nvidia doesn’t, since all Nvidia cares about is data delivery to its GPUs. If that view is correct, WekaIO and VAST are driving the network equally hard, but WekaIO’s figure includes a small slice of NIC-to-CPU bandwidth that VAST doesn’t measure.

Weka told us it wasn’t reporting any such NIC-to-CPU bandwidth.

Other possibilities

Our VAST source mentioned another possible get-out: you have to make sure no one ever uses GiB to mean GB, as a GiB is about 7 per cent bigger. A GiB or gibibyte is 1,024 MiB, which is 1,024 x 1,024 x 1,024 (1,073,741,824) bytes. A GB or gigabyte is 1,000 MB, which is 1,000 x 1,000 x 1,000 (1,000,000,000) bytes.

On that score Weka would be reporting GiB/sec whilst VAST is reporting GB/sec. B&F thinks this is highly unlikely, and WekaIO confirmed it’s using GB and not GiB in its reporting.
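The size of the GiB-versus-GB gap is easy to verify. The sketch below uses helper names of our own invention and simply applies the standard binary and decimal definitions; it says nothing about which unit either supplier actually used beyond what they told us.

```python
GB = 1000 ** 3    # gigabyte: decimal, 1,000,000,000 bytes
GIB = 1024 ** 3   # gibibyte: binary, 1,073,741,824 bytes


def gib_to_gb(value_gib):
    """Convert a figure expressed in GiB (or GiB/sec) to GB (or GB/sec)."""
    return value_gib * GIB / GB


def gb_to_gib(value_gb):
    """Convert a figure expressed in GB (or GB/sec) to GiB (or GiB/sec)."""
    return value_gb * GB / GIB


ratio = GIB / GB
print(f"1 GiB = {GIB:,} bytes; 1 GB = {GB:,} bytes")
print(f"A GiB is {ratio - 1:.1%} bigger than a GB")

# Hypothetical re-reading of each published figure in the other unit.
for vendor, figure in (("VAST Data", 92.6), ("WekaIO", 97.9)):
    print(f"{vendor}: {figure} GiB/sec would be {gib_to_gb(figure):.1f} GB/sec; "
          f"{figure} GB/sec would be {gb_to_gib(figure):.1f} GiB/sec")
```

A unit mix-up of this kind would shift a result by several GB/sec, which is more than the gap between the two published numbers, so it is worth ruling out explicitly.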

Excelero’s Sven Breuner mentions another potential complication: “There are different models of the DGX-2. They [WekaIO] were likely using a model that had 9 x 100Gbit/s NICs, namely the DGX-2H, which the people typically also just refer to as “DGX-2”.”

Nope, WekaIO told us it used a standard, 8 x GPU, DGX-2. It also said it used dual-port interface cards and that could well have contributed to Weka’s faster result. As could the simple possibility that WekaIO’s filesystem is faster than VAST Data’s software.