IBM claims it has the fastest storage node data delivery to Nvidia GPU servers with its ESS 3500 hardware and Storage Scale parallel file system.
Nvidia GPU servers are fed data through Nvidia’s GPUDirect Storage (GDS) protocol, part of its Magnum IO software, which bypasses source data system CPUs by setting up a direct link to the storage subsystem. We had previously ranked IBM second, behind a DDN AI400X2/Lustre system, with that ranking based on IBM’s ESS 3200 hardware.
IBM told us about a faster result with its ESS 3500 storage system, which has a faster IO controller processor: an AMD 48-core EPYC 7642, versus the ESS 3200’s 48-core EPYC 7552. The ESS 3200’s 31/47GBps sequential write/read bandwidth rises to 60/126GBps with the ESS 3500. A chart shows how this compares with the DDN, VAST, Pure, NetApp, and Dell PowerScale systems:
IBM leads on read bandwidth and is second to DDN on write bandwidth. Here are the numbers behind the chart:
The ESS 3500 has a Power9 processor-based management server running Storage Scale, the rebranded Spectrum Scale parallel filesystem software, with a cluster of building blocks providing the actual storage. Each building block is a pair of x86 CPU-based IO servers attached to one to four external storage chassis containing PCIe Gen 4 NVMe drives. The system supports 100Gbit Ethernet or InfiniBand running at 100Gbps (EDR) or 200Gbps (HDR). There’s a lot more information in an IBM ESS 3500 presentation.
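To put the quoted ESS 3200 and ESS 3500 figures in perspective, here is a rough back-of-envelope sketch in Python. It computes the generation-over-generation speedup and the minimum number of network links whose raw line rate would match the read figure. The function names are ours, and the link count rests on assumptions the article does not confirm: decimal gigabytes (1GBps = 8Gbps), zero protocol overhead, and no knowledge of how many ports or building blocks the quoted bandwidth corresponds to.

```python
import math

# Sequential bandwidth quoted in the article, in GB/s.
ESS_3200 = {"write": 31, "read": 47}
ESS_3500 = {"write": 60, "read": 126}


def speedup(old, new):
    """Ratio of new to old bandwidth for each operation."""
    return {op: new[op] / old[op] for op in old}


def links_needed(gbytes_per_sec, link_gbits_per_sec):
    """Minimum links whose raw line rate covers the given bandwidth.

    Assumption: 1 GB/s = 8 Gbit/s and no protocol overhead, so real
    deployments would need headroom beyond this figure.
    """
    return math.ceil(gbytes_per_sec * 8 / link_gbits_per_sec)


ratios = speedup(ESS_3200, ESS_3500)
for op, ratio in ratios.items():
    print(f"{op}: {ratio:.2f}x")

# Links needed to carry the ESS 3500 read figure at raw line rate.
hdr_links = links_needed(ESS_3500["read"], 200)  # 200Gbps HDR InfiniBand
edr_links = links_needed(ESS_3500["read"], 100)  # 100Gbps EDR or Ethernet
print(f"HDR links: {hdr_links}, EDR/100GbE links: {edr_links}")
```

On the quoted numbers, the ESS 3500 delivers roughly 1.94x the write bandwidth and 2.68x the read bandwidth of the ESS 3200. The 126GBps read figure works out to 1,008Gbps, which would take at least six HDR links or eleven 100Gbps links at raw line rate under the assumptions above.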