YanRong excels at 3D U-Net, hits 513 GB/s in MLPerf Storage benchmark

Chinese supplier YanRong says it has set an AI benchmark record in the newly released MLPerf Storage v2.0 results. Its all-flash F9000X appliance, running the company’s self-developed distributed parallel file system, YRCloudFile, delivered 513 GB/s of bandwidth in the 3D U-Net test, a record for a 3-node cluster using commodity hardware.

Each F9000X storage node was equipped with 5th Gen Intel Xeon Scalable processors, domestically manufactured PCIe 5.0 NVMe SSDs, and four Nvidia ConnectX-7 400 Gbps InfiniBand network cards. The system supports up to 22 NVMe SSDs for data (up to 30.72 TB/drive) plus 2 x 1.6 TB NVMe SSDs for metadata.
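As a rough sanity check, the 513 GB/s figure can be set against the raw network line rate of the cluster. The Python sketch below is a back-of-envelope calculation, not a vendor figure. It assumes all four 400 Gbps cards in each of the three nodes carry client traffic, and it ignores InfiniBand protocol overhead. The function and constant names are illustrative only.

```python
# Back-of-envelope estimate: aggregate network line rate vs. measured bandwidth.
# Assumptions (not from YanRong): every NIC serves client traffic and
# protocol overhead is ignored, so the ceiling is optimistic.

NODES = 3                   # cluster size quoted for the 3D U-Net result
NICS_PER_NODE = 4           # ConnectX-7 cards per storage node
NIC_GBIT_PER_S = 400        # line rate per card, gigabits per second
MEASURED_GB_PER_S = 513.0   # reported 3D U-Net bandwidth, gigabytes per second


def line_rate_gb_per_s(nodes: int, nics_per_node: int, gbit_per_s: int) -> float:
    """Aggregate raw line rate in gigabytes per second (8 bits per byte)."""
    return nodes * nics_per_node * gbit_per_s / 8


ceiling = line_rate_gb_per_s(NODES, NICS_PER_NODE, NIC_GBIT_PER_S)
utilization = MEASURED_GB_PER_S / ceiling

print(f"Raw network ceiling: {ceiling:.1f} GB/s")
print(f"Measured / ceiling:  {utilization:.1%}")
```

Under these assumptions the cluster's raw ceiling is 600 GB/s, so the reported result would correspond to about 85 percent of line rate.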

A JNIST Shanhe-Huawei OceanStor A800 system, run by China’s Jinan Institute of Supercomputing Technology, was faster at 749.9 GBps, but this is a scale-out architecture array that supports 100 million IOPS and petabytes-per-second bandwidth. It is not a commodity hardware system.

YanRong was second fastest on the CosmoFlow benchmark in the Nvidia H100 category with 136.9 GBps, behind a UBIX UbiPower 18000 system delivering 293.4 GBps. [English translation here.] The UbiPower 18000 uses a distributed architecture with a cluster supporting up to 256 nodes. It combines SSDs and persistent memory for high-speed data access, and is also not a commodity hardware system.

The F9000X delivered 258.5 GBps on the ResNet-50 test with simulated H100 GPUs, again taking second place behind the UBIX UbiPower 18000 system, which delivered 380.7 GBps.

Here is YanRong’s summary of its results:

Note that the MLPerf Storage benchmark presents GiBps bandwidth numbers, not GBps values which are used in YanRong’s table.
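The gap between the two units is about 7 percent, which matters when comparing vendor tables against the official results. The short Python sketch below shows the conversion. The helper names are illustrative and are not part of any MLPerf tooling.

```python
# Convert between binary (GiB/s) and decimal (GB/s) bandwidth units.
# 1 GiB = 2**30 bytes; 1 GB = 10**9 bytes.

BYTES_PER_GIB = 2**30
BYTES_PER_GB = 10**9


def gibps_to_gbps(gib_per_s: float) -> float:
    """Convert gibibytes per second to gigabytes per second."""
    return gib_per_s * BYTES_PER_GIB / BYTES_PER_GB


def gbps_to_gibps(gb_per_s: float) -> float:
    """Convert gigabytes per second to gibibytes per second."""
    return gb_per_s * BYTES_PER_GB / BYTES_PER_GIB


# Example: the 513 GB/s 3D U-Net figure expressed in GiB/s.
print(f"513 GB/s = {gbps_to_gibps(513):.2f} GiB/s")
```

Run on the 3D U-Net figure, 513 GB/s works out to roughly 477.8 GiB/s.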

The company compared its results to those of DDN, Nutanix, Hammerspace and HPE:

YanRong, with its commodity hardware, recorded the second-fastest checkpoint read speed, 221 GBps, with 8 clients and 64 simulated GPUs in the newly added Llama3-70B Checkpoint workload. A faster IBM Blue Vela system delivered a 232.5 GBps read speed. Blue Vela is an IBM supercomputer using the Storage Scale parallel file system.

YanRong recorded the fourth-fastest checkpoint write speed, 79 GBps. An Argonne National Lab/DAOS supercomputer recorded the highest speed (126.3 GBps), the JNIST/Huawei OceanStor system was second (123.1 GBps), and IBM’s Blue Vela was third (118.1 GBps).

Download an F9000X datasheet here.