Qumulo edges out WEKA in Azure cloud performance benchmark

Scale-out file system supplier Qumulo has beaten WEKA in the Azure cloud using an AI-related benchmark.

The SPECstorage Solution 2020 benchmark has an AI Image subset alongside four other workload scenarios: Electronic Design Automation (EDA), Genomics, Software Builds, and Video Data Acquisition (VDA). The Azure Native Qumulo (ANQ) offering achieved 704 AI Image jobs with an overall response time (ORT) of 0.84ms while delivering 68,849MB/sec.

Qumulo claimed that this makes ANQ both the industry’s fastest and most cost-effective cloud-native storage offering, saying its Azure run cost “~$400 list pricing” for a five-hour burst period. The software used a SaaS PAYG (pay as you go) model, in which metering stops when performance isn’t needed.
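For a rough sense of scale, the quoted figures can be turned into an hourly rate. This is a back-of-envelope sketch only: it treats the approximate “~$400” as exactly $400, assumes the cost is spread evenly across the five hours, and uses variable names that are purely illustrative.

```python
# Figures quoted in the article; "~$400" is approximate, so results are rough.
BURST_COST_USD = 400.0
BURST_HOURS = 5
AI_IMAGE_JOBS = 704

# Assumption: cost accrues evenly over the burst period.
cost_per_hour = BURST_COST_USD / BURST_HOURS
cost_per_job_hour = cost_per_hour / AI_IMAGE_JOBS


def metered_cost(active_hours: float, rate_per_hour: float = cost_per_hour) -> float:
    """Hypothetical PAYG bill: only hours with metering switched on are charged."""
    return active_hours * rate_per_hour


print(f"${cost_per_hour:.2f} per hour")
print(f"${cost_per_job_hour:.4f} per AI Image job per hour")
print(f"${metered_cost(2):.2f} for a two-hour burst at the same rate")
```

Under these assumptions the run works out to about $80 per hour. Actual Azure and Qumulo billing may include components this sketch ignores.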

It said that deploying cost-effective AI training infrastructure in the public cloud requires transferring data from inexpensive and scalable object storage to limited and expensive file caches. ANQ acts as an intelligent caching data accelerator for the Azure object store, executing parallelized, pre-fetched reads, served directly from the Azure primitive infrastructure via the Qumulo file system to GPUs running AI training models.
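The general pattern described here, a small file cache in front of a large object store with parallel pre-fetched reads, can be sketched in a few lines. This is not Qumulo's implementation: the class name `PrefetchingCache`, the `fetch` callback, and the in-memory dictionary standing in for the object store are all hypothetical, and a least-recently-used eviction policy is assumed.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class PrefetchingCache:
    """Toy LRU cache that pulls objects from a slower backing store."""

    def __init__(self, fetch, capacity, workers=4):
        self._fetch = fetch            # callable: key -> bytes (the object store)
        self._capacity = capacity      # maximum number of cached objects
        self._cache = OrderedDict()    # ordered oldest -> most recently used
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def prefetch(self, keys):
        """Fetch missing keys in parallel, then store them in the cache."""
        missing = [k for k in keys if k not in self._cache]
        for key, data in zip(missing, self._pool.map(self._fetch, missing)):
            self._put(key, data)

    def read(self, key):
        """Serve from cache when possible; fall back to the store on a miss."""
        if key in self._cache:
            self._cache.move_to_end(key)
            return self._cache[key]
        data = self._fetch(key)
        self._put(key, data)
        return data

    def _put(self, key, data):
        self._cache[key] = data
        self._cache.move_to_end(key)
        while len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used


# Stand-in object store; a real one would be a remote service.
object_store = {f"img-{i}": bytes([i]) for i in range(8)}
fetch_log = []


def fetch(key):
    fetch_log.append(key)
    return object_store[key]


cache = PrefetchingCache(fetch, capacity=4)
cache.prefetch(["img-0", "img-1", "img-2"])
data = cache.read("img-1")  # served from cache, no second fetch
```

Here the training job would call `prefetch` with the next batch of keys while it works on the current one, so that `read` rarely has to wait on the object store.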

WEKA recorded the highest results in four of the benchmark’s categories in January 2022, including AI. It reported 1,400 AI Image jobs with an ORT of 0.84ms using Samsung SSDs. A separate run with WEKA software running in the Azure public cloud recorded 700 AI Image jobs with a 0.85ms ORT and 68,318MB/sec. Qumulo has just beaten that Azure result by four jobs, 0.01ms, and 531MB/sec – a narrow margin.
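The margins between the two Azure runs follow directly from the published numbers. The short sketch below recomputes them; the `Result` class and `margins` function are illustrative names, not part of any SPEC tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    vendor: str
    jobs: int          # AI Image jobs
    ort_ms: float      # overall response time in milliseconds
    mb_per_sec: int    # throughput


# Azure results as quoted in the article.
qumulo = Result("Qumulo ANQ (Azure)", 704, 0.84, 68_849)
weka = Result("WEKA (Azure)", 700, 0.85, 68_318)


def margins(a: Result, b: Result) -> dict:
    """How far result `a` is ahead of result `b` on each metric."""
    return {
        "jobs": a.jobs - b.jobs,
        "ort_ms": round(b.ort_ms - a.ort_ms, 2),  # lower ORT is better
        "mb_per_sec": a.mb_per_sec - b.mb_per_sec,
        "jobs_pct": round(100 * (a.jobs - b.jobs) / b.jobs, 2),
    }


m = margins(qumulo, weka)
print(m)
```

In relative terms the job-count lead is a little over half a percent, which is why the margin is best described as narrow.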

A chart shows the differing vendor product results:

Qumulo has a lower ORT than WEKA's run in the AWS public cloud, but a far lower AI Image job count.

A Qumulo blog claims that its Azure PAYG pricing model is disruptive. It argues that “most other vendors, including a previous submission at 700 jobs at 0.85ms ORT, do not communicate costs transparently.”

The blog authors further state: “They include a large, non-elastic deployment of over-sized VMs that you would have to keep running, even after deployment, in order to maintain your data set. They require a 1–3 year software subscription, costing hundreds of thousands of dollars, on a software entitlement vs having a PAYG consumption model.”