MinIO releases AIStor with GPUDirect-like S3 over RDMA

Open source supplier MinIO has evolved its Enterprise Object Store software into AIStor, a faster and more scalable product for GenAI workloads, including training.

GenAI training requires shipping data at high speed to GPUs, including large-scale GPU server farms, and writing checkpoint data at high speed to reduce GPU idle time. To extend its AI training and inferencing support, MinIO has introduced a new S3 API called promptObject, support for S3 over RDMA, an AIHub private Hugging Face-compatible repository, and an updated global console with a new Kubernetes operator.

AB Periasamy, MinIO

AB Periasamy, co-founder and CEO at MinIO, said: “The launch of AIStor is an important milestone for MinIO. Our object store is the standard in the private cloud and the features we have built into AIStor reflect the needs of our most demanding and ambitious customers.”

He thinks that “it is not enough to just protect and store data in the age of AI, storage companies like ours must facilitate an understanding of the data that resides on our software. AIStor is the realization of this vision and serves both our IT audience and our developer community.”

The promptObject API enables users, MinIO says, to “talk” to unstructured objects in the same way one would engage a large language model (LLM), moving the storage world from a PUT and GET paradigm to a PUT and PROMPT paradigm. Applications can use promptObject through function calling with additional logic, and functions can be chained so that multiple objects are addressed at the same time.

For example, when querying a stored MRI scan, one can ask “where is the abnormality?” or “which region shows the most inflammation?” and promptObject will show it. MinIO reckons the applications are almost infinite when considering this extension. MinIO says: “This means that application developers can exponentially expand the capabilities of their applications without requiring domain-specific knowledge of RAG models or vector databases. This will dramatically simplify AI application development while simultaneously making it more powerful.”
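MinIO has not published the promptObject wire format, so the following Python sketch is purely illustrative: the `?prompt` query action, the bearer-token header, and the response schema are all assumptions, not the documented API.

```python
import requests

# Hypothetical AIStor endpoint, bucket, and object names -- placeholders only.
AISTOR_URL = "https://aistor.example.internal"
BUCKET, OBJECT = "radiology", "patient-042/mri-brain.dcm"

# Assumed call shape: POST against the stored object with a "?prompt" query
# action, carrying the natural-language question in the JSON body.
resp = requests.post(
    f"{AISTOR_URL}/{BUCKET}/{OBJECT}?prompt",
    json={"prompt": "Where is the abnormality in this scan?"},
    headers={"Authorization": "Bearer <token>"},  # placeholder auth scheme
    timeout=60,
)
resp.raise_for_status()

# Assumed response schema: a JSON document containing the model's answer.
print(resp.json().get("answer"))
```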

Support for S3 over Remote Direct Memory Access (RDMA) brings RDMA’s low-latency, high-throughput capabilities to customers’ 400GbE, 800GbE, and faster Ethernet links. RDMA over Converged Ethernet (RoCE) delivers those benefits on familiar, scalable Ethernet infrastructure.

S3 over RDMA provides the performance gains required to keep the GPU compute layer fully utilized, while reducing storage server CPU utilization and avoiding the latency-adding data copy into that server’s DRAM. This is CPU-bypass functionality equivalent to that used by Nvidia’s GPUDirect file-access protocol. It means object storage can now, in principle, feed data to GPU servers on the same terms as GPUDirect-supporting file systems, such as products from DDN, IBM (Storage Scale), NetApp, Pure Storage, VAST Data, WEKA, and others.
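Because the RDMA acceleration sits beneath the S3 API, application code should not need to change. As a rough sketch (with placeholder endpoint and credentials), a read issued through the standard MinIO Python SDK looks the same whether the bytes travel over TCP or RDMA:

```python
from minio import Minio

# Placeholder endpoint and credentials; in an S3-over-RDMA deployment the
# transport acceleration happens below this API, so the application's read
# path is unchanged from a plain TCP deployment.
client = Minio(
    "aistor.example.internal:9000",
    access_key="ACCESS_KEY",
    secret_key="SECRET_KEY",
    secure=True,
)

# Ordinary S3 GET: stream a training shard to the GPU host.
response = client.get_object("training-data", "shards/shard-00001.tar")
try:
    data = response.read()  # bytes land in host memory for the data loader
finally:
    response.close()
    response.release_conn()
```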

GPUDirect storage access diagram

MinIO says RDMA tackles TCP/IP’s limitations in high-speed networking through:

  • Direct Memory Access: RDMA bypasses the kernel and CPU, reducing latency by allowing memory-to-memory data transfers.
  • Zero-Copy Data Transfer: Data moves directly from one application’s memory to another’s without intermediate buffering, improving efficiency.
  • CPU Offloading: RDMA offloads network processing to the NIC, freeing CPU resources.
  • Efficient Flow Control: RDMA’s NIC-based flow control is faster and uses fewer CPU cycles than TCP’s congestion control, allowing for more stable high-speed performance.

We understand Nvidia is working with several object storage partners to make this facility available. Scality recently accelerated its object storage with the RING XP offering, but this was done, we understand, without using S3 over RDMA.

The private, Hugging Face API-compatible AIHub repository stores AI models and datasets directly in AIStor. This enables customers to create their own data and model repositories on the private cloud or in air-gapped environments without changing a line of code, and it eliminates the risk of developers leaking sensitive datasets or models.
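MinIO has not detailed the client-side mechanics, but Hugging Face API compatibility suggests the standard huggingface_hub redirection pattern would apply: pointing the library’s HF_ENDPOINT environment variable at the private repository. A minimal sketch, in which the AIHub URL and repository name are placeholder assumptions:

```python
import os

# Point the Hugging Face client at the private AIHub endpoint (placeholder
# URL). HF_ENDPOINT must be set before huggingface_hub is imported.
os.environ["HF_ENDPOINT"] = "https://aihub.example.internal"

from huggingface_hub import snapshot_download

# Downloads the model files from the private repository instead of
# huggingface.co; the repo id here is a placeholder.
local_dir = snapshot_download(repo_id="internal-org/llama-finetune")
print(f"Model cached at {local_dir}")
```

Because the redirection happens through an environment variable, existing training and inference scripts that already use huggingface_hub would keep working unmodified, which is consistent with MinIO’s “without changing a line of code” claim.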

The redesigned Global Console user interface for MinIO is paired with a new Kubernetes operator that simplifies the management of large-scale data infrastructure with hundreds of servers and tens of thousands of drives. It also provides capabilities for Identity and Access Management (IAM), Information Lifecycle Management (ILM), load balancing, firewalling, security, caching, and orchestration, all accessed through a single pane of glass.

Rajdeep Sengupta, Director of Systems Engineering, AMD, commented: “We have deployed the MinIO offering to host our big data platform for structured, unstructured, and multimodal datasets. Our collaboration with MinIO optimizes AIStor to fully leverage our advanced enterprise compute technologies and address the growing demands of data center infrastructure.”

Altogether this is an important step forward for object storage and will make vast object-stored datasets directly available for AI training and inference. Read more about AIStor here.