Cloudian bakes Milvus vector database into HyperStore for AI inference

Object storage supplier Cloudian is integrating the Milvus vector database into its HyperStore software to provide AI inference capabilities as part of an AI Data Platform roadmap.

HyperStore is an object storage system with, according to Cloudian, the industry's highest performance and effectively limitless scalability, and it supports Nvidia's GPUDirect. A vector database stores high-dimensional vector embeddings: mathematical representations of tokenized text from unstructured documents, as well as vectorized audio, image, and video data. Large language models (LLMs) search these vectors for semantically related content when building a response to a user's request.
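In essence, semantic search ranks stored embeddings by their similarity to a query embedding. A minimal sketch in plain Python (the cosine metric and the toy three-dimensional vectors are illustrative only; a production system such as Milvus uses approximate nearest-neighbor indexes over much higher-dimensional embeddings):

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors, normalized by their magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    # Rank stored vectors by similarity to the query embedding
    # and return the IDs of the k closest matches.
    scored = sorted(vectors.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy corpus of three embeddings; "a" and "c" point in nearly
# the same direction as the query, so they rank first.
corpus = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.9, 0.1, 0.0]}
print(top_k([1.0, 0.0, 0.0], corpus, k=2))  # ['a', 'c']
```

The brute-force scan shown here is O(n) per query; vector databases replace it with index structures (HNSW, IVF, and the like) to hit millisecond latencies at billion-vector scale.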

Neil Stobart, Cloudian

Cloudian CTO Neil Stobart stated: “The integration of data storage and AI inferencing into a single, efficient platform represents a fundamental shift in how enterprises approach AI infrastructure.”

Cloudian points out that modern AI applications require massive storage capacity for vector datasets that can reach petabytes in size, along with supporting index files and operational logs, while simultaneously demanding ultra-low latency access for real-time inference operations. Having separate unstructured data and vector stores entails data movement and separate infrastructure components. Combining the two means customers can eliminate this data movement and reduce the complexity of deploying enterprise-scale AI systems, or so says the vendor.

It says that, while AI models themselves may be relatively small, the context data required for meaningful AI interactions creates massive storage demands. KV cache volumes for reasoning models are projected to reach 2-5 TB per concurrent user by 2026. Users expect AI systems to remember everything about them – their conversation history, preferences, and context – potentially storing token inputs and outputs for billions of users over time.
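The scale of those KV cache figures can be sanity-checked with back-of-the-envelope arithmetic. A hedged sketch (the model shape of 80 layers with a hidden size of 8,192 in FP16 is an illustrative assumption, and it ignores optimizations such as grouped-query attention and cache quantization):

```python
def kv_cache_bytes(n_layers: int, hidden_size: int, seq_len: int,
                   dtype_bytes: int = 2) -> int:
    """Bytes needed to cache attention keys and values for one sequence.

    Each token stores one key vector and one value vector of
    hidden_size elements per layer, hence the factor of 2.
    """
    return 2 * n_layers * hidden_size * seq_len * dtype_bytes

# Illustrative large-model shape: 80 layers, hidden size 8,192, FP16.
# A 1-million-token context works out to roughly 2.6 TB of KV cache,
# which is the right order of magnitude for the projection above.
tb = kv_cache_bytes(80, 8192, 1_000_000) / 1e12
print(f"{tb:.2f} TB")  # 2.62 TB
```

Shorter contexts shrink this linearly, but keeping cached state warm across many sessions per user pushes the aggregate back toward terabyte scale.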

The open source Milvus vector database, created and supplied by Zilliz, stores, indexes, and queries high-dimensional vector embeddings generated by machine learning models, enabling millisecond-level query response times for billion-scale vector datasets. Cloudian is using it for similarity search and AI inference applications including recommendation systems, computer vision, natural language processing, and retrieval-augmented generation (RAG).

  • HyperStore serves as the unified storage foundation, handling raw data, processed vectors, model artifacts, and metadata
  • Milvus runs on auxiliary nodes while leveraging HyperStore for persistent storage of vector indexes and collections
  • Data flows seamlessly between storage and compute without the bottlenecks of traditional multi-system architectures
  • Parallel processing enables thousands of concurrent similarity searches across massive vector datasets
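Milvus already persists its segments and index files through an S3-compatible object store, so pointing it at HyperStore is, in principle, a matter of configuration. A sketch of the relevant object-storage section of `milvus.yaml`, where the endpoint, bucket name, and credentials are all illustrative assumptions, not Cloudian-published settings:

```yaml
minio:                    # Milvus's S3-compatible object storage section
  address: hyperstore.example.internal   # hypothetical HyperStore S3 endpoint
  port: 443
  useSSL: true
  accessKeyID: <hyperstore-access-key>     # placeholder credential
  secretAccessKey: <hyperstore-secret-key> # placeholder credential
  bucketName: milvus-vectors               # illustrative bucket name
```

With this in place, vector collections and indexes live in HyperStore buckets while the Milvus query nodes handle search compute, matching the architecture described above.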

Cloudian says HyperStore + Milvus, delivering 35 GBps per node, provides exabyte-scale object storage supporting “massive vector datasets while maintaining high-performance access for real-time inferencing workloads.” Customers get a lower total cost of ownership than with separate storage and inference platforms, along with simplified management and reduced data movement costs.

HyperStore + Milvus supports both on-premises and hybrid cloud deployments. Customers can start small with pilot AI projects and scale up to production workloads.

Cloudian’s AI Data Platform vision includes “unified, accelerated infrastructure that seamlessly integrates data processing, storage, and AI computation.” The company claims it will no longer provide storage-only infrastructure, instead evolving into a data processing platform in which storage is part of a broader application software stack.

Cloudian’s integrated AI inference software is available immediately for evaluation. A Cloudian blog adds: “Preliminary testing shows remarkable improvements in inference throughput that we’ll detail in an upcoming performance analysis.”