NetApp, Google preview retrieval-augmented generation AI toolkit

A previewed toolkit reference architecture lets NetApp-hosted data be used in retrieval-augmented generation (RAG) operations on Google’s Vertex AI platform.

NetApp’s ONTAP software is provided as a first-party service in Google’s public cloud, called Google Cloud NetApp Volumes (GCNV). A new Flex offering means users can now choose from four service levels with varying capacity and throughput performance. Vertex AI is Google’s combined data engineering, data science, and ML engineering workflow platform for training, deploying, and customizing large language models (LLMs) and developing AI applications. RAG supplements an LLM trained on public data with proprietary data retrieved at query time, filling gaps in what the model learned and helping prevent inaccurate or hallucinated responses to users’ conversational input.
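
To make the RAG idea concrete, here is a minimal Python sketch of the retrieve-then-prompt step. It is a toy under stated assumptions, not the NetApp toolkit or any Vertex AI API: documents are scored by simple word-count cosine similarity rather than learned embeddings, and the function names (`retrieve`, `build_prompt`) and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend the retrieved private context to the user's question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical private documents standing in for enterprise data.
DOCS = [
    "The Extreme service level offers 128 MiBps per TiB.",
    "The cafeteria opens at 8 am on weekdays.",
]

if __name__ == "__main__":
    question = "How much throughput does the Extreme service level offer?"
    print(build_prompt(question, DOCS))
```

The assembled prompt would then be sent to an LLM, which answers using the retrieved context instead of relying only on its training data. A production pipeline would typically swap the word-count scoring for an embedding model and a vector store.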

Pravjit Tiwana, Cloud Storage SVP and GM at NetApp, said: “By extending our collaboration with Google Cloud, we’re delivering a flexible form factor that can be run on existing infrastructure across Google Cloud system without any trade-offs to enterprise data management capabilities.”

The four GCNV service levels are:

  • Standard: highly available, general-purpose storage with data management capabilities and 16 MiBps per TiB of performance, for workloads such as file shares, virtual machines (VMs), and DevTest environments.
  • Premium: highly available, high-performance storage with data management capabilities and 64 MiBps per TiB of performance, for workloads such as file shares, VMs, and databases.
  • Extreme: highly available, low-latency, high-throughput storage with data management capabilities and 128 MiBps per TiB of performance, recommended for Online Transaction Processing (OLTP) high-performance databases and low-latency applications.
  • Flex: highly available storage volumes with scalability from 1 GiB to 100 TiB and up to 1 GiBps of performance depending on the size of the underlying storage pool. This can support a wide variety of use cases, including AI. 
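
As a rough illustration of what the per-TiB figures mean, the Python sketch below multiplies each level’s quoted rate by a provisioned capacity. It assumes throughput scales linearly with capacity, as the “per TiB” wording suggests; any minimums, caps, or other limits that apply in practice are not modeled. Flex is omitted because only an upper bound (up to 1 GiBps, depending on pool size) is given for it. The function name `throughput_mibps` is hypothetical.

```python
# Per-TiB throughput figures quoted for the three fixed-rate service levels.
MIBPS_PER_TIB = {
    "standard": 16,
    "premium": 64,
    "extreme": 128,
}


def throughput_mibps(level, capacity_tib):
    """Estimate throughput in MiBps, assuming linear scaling with capacity."""
    if capacity_tib < 0:
        raise ValueError("capacity_tib must be non-negative")
    return MIBPS_PER_TIB[level.lower()] * capacity_tib


if __name__ == "__main__":
    for level in MIBPS_PER_TIB:
        print(f"{level}: {throughput_mibps(level, 10)} MiBps at 10 TiB")
```

Under that assumption, a 10 TiB volume would work out to 160 MiBps at Standard, 640 MiBps at Premium, and 1,280 MiBps at Extreme.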

NetApp is also releasing a preview of its GenAI toolkit for Vertex AI with support for NetApp Volumes, saying it helps optimize RAG processes:

  • ONTAP allows customers to include data from any environment to power RAG ops with common operational processes.
  • NetApp’s BlueXP classification service automatically tags data to support streamlined data cleansing for both the ingest and inferencing phases of the data pipeline, helping ensure that the right data is used for queries and that sensitive data is not exposed to the model out of policy.
  • ONTAP Snapshot delivers near-instant creation of space-efficient, in-place copies of vector stores and databases, allowing an immediate rollback to a previous version if data is corrupted, or a roll forward if point-in-time analysis is needed.
  • ONTAP FlexClone technology can create instant clones of vector index stores to make relevant data instantly available for different queries for different users, without impacting production data.

Tiwana said: “We have unmatched capabilities to support data classification, tagging, mobility, and cloning for data wherever it lives so our customers can run efficient and secure AI data pipelines. Building on our partnership with Google Cloud to streamline RAG enables customers to tap into market-leading AI services and models to generate a unique competitive advantage.”

There’s no new software functionality here. NetApp wants to ensure that its ONTAP data stores can be used in RAG workflows and is stressing that its data services, such as BlueXP classification, Snapshots, and FlexClones, help with the selection and presentation of such data to LLMs.

NetApp’s recognition that its customers embracing GenAI will want RAG-enhanced LLMs using their NetApp-stored data parallels that of Cohesity with its Gaia initiative. Dell’s AI Factory also has RAG elements to it. We can expect NetApp to expand its RAG support to Azure and AWS.

The Flex service level will be generally available by Q2 2024 across 15 Google Cloud regions, expanding to the other regions by the end of 2024. The GenAI toolkit will be available as a public preview within the second half of 2024.

Find out more about the Flex service level for NetApp Volumes or the GenAI toolkit reference architecture for Vertex AI by visiting the NetApp booth #1231 at the Google Cloud Next 2024 conference running April 9-11 at Mandalay Bay in Las Vegas.