Pure Storage Data Stream is an AI pipeline data dream

Pure Storage has introduced Data Stream, a GPU-centric, AI-powered, integrated hardware and software stack for AI data pipelines.

Data Stream is a software suite running on FlashBlade//S and Nvidia Blackwell GPU hardware. It’s built to automate and accelerate the ingestion, transformation, and optimization of data for enterprise AI pipelines. Pure positions it as a core component of its Data Platform, aimed at enterprise inference use cases, built on the Nvidia AI Data Platform reference design, and available as a single SKU. Data Stream serves as the intelligent orchestration layer, ensuring data is rendered AI-ready through automated GPU-accelerated processing and GPU-aligned delivery.

A Pure Storage blog says: “Data Stream accelerates data readiness to directly address the data readiness crisis for enterprise AI initiatives.” 

According to the blog, features include:

  • Automated real-time data ingestion and structuring: Data Stream ingests raw data from diverse sources, including text documents, PDFs, images, and structured tables. It performs intelligent chunking and transformation—dividing content into semantically coherent segments such as sentences or paragraphs—to preserve contextual integrity and granular access control while minimizing information loss. This process supports multiprotocol access (NFS, S3, SMB) and handles billions of files or objects, enabling seamless integration with built-in vector databases for scalable storage on Pure Storage FlashBlade//S.
  • Nvidia NeMo integration: Data Stream orchestrates end-to-end workflows from data readiness to model inference, with NeMo Retriever enabling GPU-accelerated vector embedding generation, in which raw chunks of data are transformed into high-dimensional semantic vectors using Nvidia embedding models. These embeddings enable similarity searches via approximate nearest neighbor (ANN) algorithms such as HNSW and IVF for retrieval in RAG pipelines. The integration supports Nvidia NIM microservices for deploying optimized inference, scaling across on-premises or cloud environments via standardized APIs.
  • GPU-optimized pipeline acceleration: Data Stream runs on the Nvidia RTX PRO 6000 Blackwell Server Edition GPU and uses Nvidia software libraries such as Spark RAPIDS and cuVS, along with ConnectX-7 NICs for low-latency networked storage access.
  • FlashBlade//S orchestration: Orchestration occurs at the storage layer, where transformations such as metadata enrichment and relevance reranking are executed in parallel, substantially reducing end-to-end inference latency.
  • Minimized data movement: By processing enrichments natively on FlashBlade DirectFlash Modules, which leverage non-volatile RAM (NVRAM) for global metadata management, Data Stream reduces data movement overhead. Outputs are formatted in structures such as JSON, Apache Parquet, or Arrow, unlocking additional capacity in vector stores. This approach supports petabyte-scale RAG data sets, with independent scaling of capacity and performance to accommodate multiple GPU clusters without downtime. Overall, Pure says, Data Stream maximizes performance per dollar for inference through a range of hardware and software innovations.
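Pure has not published the internals of Data Stream's chunking, but the sentence-level segmentation with preserved context described in the first bullet can be sketched in a few lines. The function, its parameters, and the JSON record shape below are illustrative assumptions, not Pure's API:

```python
import json
import re

def chunk_text(text, max_chars=200, overlap=1):
    """Group sentences into chunks of roughly max_chars characters,
    repeating the last `overlap` sentences of each chunk at the start
    of the next so context survives the chunk boundary."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current, fresh = [], [], 0
    for sentence in sentences:
        current.append(sentence)
        fresh += 1
        if sum(len(s) for s in current) >= max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:]  # carry trailing context forward
            fresh = 0
    if fresh:  # flush any sentences not yet emitted
        chunks.append(" ".join(current))
    return chunks

doc = ("FlashBlade//S stores the raw files. Data Stream ingests them. "
       "Chunks are embedded into vectors. Vectors land in a vector store.")
# Emit chunks as JSON records ready for a downstream embedding step.
records = [{"chunk_id": i, "text": c} for i, c in enumerate(chunk_text(doc, max_chars=60))]
print(json.dumps(records, indent=2))
```

A real pipeline would segment on semantic boundaries and attach per-chunk access-control metadata, as the bullet notes; this sketch only shows the grouping-with-overlap mechanic.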

Watch a Data Stream video here.


Pure says capabilities such as intelligent query augmentation—where user inputs are vectorized and matched against billions of embeddings—and guardrail filtering enhance LLM accuracy, relevance, and security by leveraging retrieved contexts to mitigate hallucinations or inappropriate outputs.
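Pure doesn't detail how query augmentation is implemented, but the flow it describes — embed the user's query, match it against stored vectors, pass filtered context to the LLM — can be sketched with a toy bag-of-words embedder and a brute-force similarity scan. Everything below, including the blocked-terms guardrail, is an illustrative assumption; at Data Stream's scale the exhaustive scan would be replaced by an ANN index such as HNSW or IVF:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a sparse token-count vector. A real pipeline
    would call a GPU-served embedding model instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    """Brute-force nearest-neighbour search over the corpus; ANN
    indexes (HNSW, IVF) trade exactness for speed at scale."""
    q = embed(query)
    return sorted(corpus, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

def guardrail(passages, blocked=("credential",)):
    """Drop retrieved context containing blocked terms (illustrative)."""
    return [p for p in passages if not any(b in p.lower() for b in blocked)]

corpus = [
    "FlashBlade//S scales capacity and performance independently.",
    "NeMo Retriever generates vector embeddings on GPUs.",
    "Parquet is a columnar storage format.",
]
context = guardrail(retrieve("how does NeMo Retriever generate vector embeddings", corpus, k=1))
print(context)
```

The retrieved passages become the context the LLM answers from, which is how retrieval mitigates hallucinations: the model grounds its response in matched, filtered data rather than its parametric memory alone.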

It claims Data Stream represents “a turbocharger for AI-ready data consumption in the enterprise, dramatically reducing the latency and complexity of data usability for AI applications.” This “enables instantaneous access to transformed, vectorized data that is inherently optimized for GPU-centric architectures, which means more inference and consumption without the hassle or complexity.”

Data Stream is available for preview here.