MinIO goes macro for mega AI workloads

MinIO has developed an Enterprise Object Store to create and manage exabyte-scale data infrastructure for commercial customers’ AI workloads.

The data infrastructure specialist provides the most popular open source object storage available, with more than 1.2 billion Docker pulls, and is very widely deployed. However, the Enterprise Object Store (EOS) product carries a commercial license. 


AB Periasamy, co-founder and CEO at MinIO, issued a statement: “The data infrastructure demands for AI are requiring enterprises to architect their systems for tens of exabytes while delivering consistently high performance and operational simplicity.

“The MinIO Enterprise Object Store adds significant value for our commercial customers and enables them to more easily address the challenges associated with billions of objects, hundreds of thousands of cryptographic operations per node per second or encryption keys or querying an exabyte scale namespace. This scale is a defining feature of AI workloads and delivering performance at that scale is beyond the capability of most existing applications.”

EOS features include:

  • Catalog: Enables indexing, organizing and searching a vast number of objects using the familiar GraphQL interface, facilitating metadata search in an object storage namespace.
  • Firewall: Aware of the AWS S3 API, it facilitates object-level rule creation covering TLS termination, load balancing, access control and QoS at object-level granularity. It is not IP-based or application-oriented.
  • Key Management Server: A MinIO-specific, highly available KMS implementation optimized for massive data infrastructure. It deals with the specific performance, availability, fault-tolerance and security challenges associated with billions of cryptographic keys, and supports multi-tenancy.
  • Cache: A caching service that uses server DRAM to create a distributed shared cache for ultra-high performance AI workloads.
  • Observability: Data infrastructure-centric collection of metrics, audit logs, error logs and traces.
  • Enterprise Console: A single pane of glass for all the organization’s instances of MinIO – including public clouds, private clouds, edge and colo instances.

As we understand it, ultra-high performance AI applications run in massive GPU server farms. MinIO’s pooled DRAM cache does not run on the GPU servers in these farms, and the GPUs there cannot directly access the pooled x86 server DRAM that makes up the MinIO cache.

On that basis, we suggested to MinIO CMO Jonathan Symons that MinIO’s cache, although designed for ultra-high performance AI applications, cannot share cached data directly with the GPUs that run those applications.

Symons told us that filers supporting GPUDirect, like MinIO, send data to GPU server farms over network links. “GPUs accessing the DRAM on SAN and NAS systems using GPUDirect RDMA are also limited by the same 100/200 GbE network HTTP-based object storage systems use. RDMA does not make it magically better when the network between the GPUs and the storage system is maxed out.

“Nvidia confirmed that object storage systems do not need certification because they are entirely in the user space. Do you see Amazon, Azure or GCP talking about GPUDirect for their object stores? No. Is there any documentation for SwiftStack (the object store Nvidia purchased) on GPUDirect? No. SAN and NAS vendors need low level kernel support and they need to be certified.

“GPUDirect is challenged in the same way RDMA was in the enterprise. It is too complex and offers no performance benefits for large block transfers. 

“Basically we are as fast as the network. So are the SAN and NAS solutions taking advantage of GPUDirect. Since neither of us can be faster than the network, we are both at the same speed. To qualify for your definition of ultra-high performance, do you have to max the network or do you simply have to have GPUDirect?”

Point taken, Jonathan.
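
The arithmetic behind Symons' argument can be sketched in a few lines. This is an illustrative model only, not anything MinIO or Nvidia has published: the function names and the storage throughput figures are hypothetical, the link is treated as a single 100 or 200 GbE connection, and protocol overhead, latency and CPU cost are ignored. It shows only that once the link is the bottleneck, two storage systems with different internal speeds deliver data at the same rate.

```python
def link_gigabytes_per_second(link_gbit_s: float) -> float:
    """Convert a raw link rate in Gbit/s to GB/s (8 bits per byte, no overhead)."""
    return link_gbit_s / 8.0


def effective_throughput(storage_gb_s: float, link_gbit_s: float) -> float:
    """Delivered rate is capped by the slower of the storage system and the link."""
    return min(storage_gb_s, link_gigabytes_per_second(link_gbit_s))


def transfer_seconds(dataset_gb: float, storage_gb_s: float, link_gbit_s: float) -> float:
    """Time to move a dataset at the effective (bottlenecked) rate."""
    return dataset_gb / effective_throughput(storage_gb_s, link_gbit_s)


# Hypothetical figures: two storage systems that could each outrun a 200 GbE link.
system_a = effective_throughput(storage_gb_s=40.0, link_gbit_s=200.0)
system_b = effective_throughput(storage_gb_s=60.0, link_gbit_s=200.0)
print(system_a, system_b)                     # both capped at 25.0 GB/s
print(transfer_seconds(1000.0, 40.0, 200.0))  # 1 TB over 200 GbE: 40.0 seconds
print(transfer_seconds(1000.0, 40.0, 100.0))  # 1 TB over 100 GbE: 80.0 seconds
```

The model says nothing about cases where the network is not saturated, which is where differences in the data path would show up.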

MinIO’s Enterprise Object Store is available to MinIO’s existing commercial customers immediately, with SLAs defined by their capacity.