A NetApp AI Space Race report asks whether China, the USA, or another country will become the world leader in AI innovation, and says that businesses will need an intelligent data infrastructure. Coming from a data infrastructure supplier, that assertion is unsurprising.
Gabie Boko
It sees the race for AI leadership as equivalent to the US-Soviet space race, with CMO Gabie Boko stating: “In the ‘Space Race’ of the 1960s, world powers rushed to accelerate scientific innovation for the sake of national pride. The outcomes of the ‘AI Space Race’ will shape the world for decades to come.”
NetApp surveyed 400 CEOs and 400 IT execs across China, India, the UK, and USA in May, and 43 percent said the US would lead in AI in the next five years, twice as many as those positioning India, China, or the UK in the lead.
Its report says 92 percent of Chinese CEOs report active AI projects but only 74 percent of Chinese IT execs agree with them. In the USA, 77 percent of CEOs report active AI projects and 86 percent of US IT execs agree with them. NetApp says there is a “critical misalignment between CEOs and IT executives” in China, “which could hinder its long-term leadership potential.”
It suggests that “internal alignment, not just ambition, may ultimately shape how AI strategies are executed across regions and roles.”
A different view might be that Chinese organizations are developing CEO-led AI projects faster than US ones.
Another difference is that China is more focused on scalability (35 percent, against a global average of 24 percent), whereas the other countries are focused on integration. Security and compliance are the lowest-ranked concerns (a 10 percent average across IT execs and CEOs globally).
More respondents think the US will be the likely long-term AI leader than China:
64 percent of US respondents ranked the US as the likely leader in AI innovation over the next five years, versus a global average of 43 percent
43 percent of China respondents ranked China as the likely leader, versus a global average of only 22 percent
40 percent of India respondents ranked India as the likely leader, versus a global average of only 16 percent
34 percent of UK respondents ranked the UK as the likely leader, versus a global average of only 19 percent
Overall, CEOs and IT execs see “AI for decision making and competition to stay ahead” as the single most powerful force to drive AI adoption (26 percent). India (29 percent) and UK (32 percent) feel extra pressure to compete as China and the US are seen as clear leaders. China is uniquely driven by customer demand (21 percent vs 13 percent of global average), underscoring that the China market is seen as leading today with actual pilots and programs (83 percent vs 81 percent global average – not much of a difference).
Just over half (51 percent) of respondents saw their own organization as competitive in AI but none see themselves as the current leader. Almost all (88 percent) think their organization is mostly or completely ready to sustain AI transformation and 81 percent are currently piloting or scaling AI.
NetApp’s report states: “One of the most significant success factors in the AI Space Race will be data infrastructure and data management, supported by cloud solutions that are agile, secure and scalable. Successful organizations need an intelligent data infrastructure in place to ensure unfettered AI innovation. This is critical no matter the company size, industry or geography.”
It concludes: “In the AI Space Race, hype won’t win – data will. No matter the size, industry, or location, success hinges on a foundation that can support the full weight of AI. Organizations that come out on top will be those with intelligent, secure, and scalable data infrastructure built to power real innovation.”
Comment
It seems obviously true that successful AI projects will need a scalable and secure data infrastructure. Accepting that, which suppliers could provide one? NetApp sees itself here, as “the intelligent data infrastructure company.” But we would suggest all of its competitors are also well positioned, as they currently emphasize storing unstructured data, supplying and supporting AI pipelines, RAG, vector databases, agents, and Nvidia GPUs and software.
We might suggest that leading positions in AI training and inference could be indicated by Nvidia GPU server certifications, Nvidia AI Factory, and AI Data Platform support. This would include DDN, Dell, HPE, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, VAST Data, and WEKA.
If we look at support for GPUDirect for files and objects, we could add Cloudian, Hammerspace, MinIO, and Scality to our list. We could look at IO500 data and see that Xinnor, DDN, WEKA, VAST Data, IBM (Storage Scale), Qumulo, and VDURA are represented there. We could also look at AI LLM, RAG, and agent support for backup data sets by Cohesity, Commvault, Rubrik, Veeam, and others, and see that cloud file services suppliers such as Box, CTERA, Egnyte, Panzura, and Nasuni are piling into AI as well. Data management suppliers like Datadobi and Komprise are also active.
Data services suppliers in the widest sense, from observability and governance to database, data warehouse, data lake, lakehouse, and SaaS app suppliers, are all furiously developing AI-related capabilities, with Teradata announcing its own AI Factory in partnership with Nvidia.
In China and elsewhere outside the USA, Huawei and other Chinese suppliers will be well represented.
A conclusion is that all incumbent IT suppliers see any weakness in their adoption of AI and support of customer AI projects as a potential entry point into their customer base by competitors. None of them are willing to let this happen.
NetApp has recruited Zscaler CTO Syam Nair to be its new chief product officer, replacing the departing Harvinder Bhela.
Bhela joined NetApp in January 2022 as EVP and CPO and led product management, engineering, hardware, design, data science, operations, and product marketing, after spending 25 years at Microsoft. His LinkedIn profile says he is “in mindful hibernation,” while comments on his departure post suggest he may be pursuing another opportunity.
CEO George Kurian stated: “I am thrilled to welcome Syam to NetApp’s leadership team. He joins us at a time when our customers are looking to NetApp to help them deliver data-enabled growth and productivity: not only must they innovate to stay ahead, but they must also simplify to improve productivity and agility. This is a balance Syam has mastered throughout his career. Syam’s proven track record – from building planet-scale Azure data services at Microsoft to spearheading hyper-scale platforms like Salesforce Data Cloud – is exactly what we need as we sharpen our focus on high-growth markets.”
Nair is a former executive at Salesforce, Microsoft, and Zscaler with a track record of scaling major cloud platforms and pioneering next-gen AI-powered products. He joined Zscaler in May 2023 as CTO and EVP for R&D, moving from Salesforce. While at Zscaler, he led efforts to integrate AI and ML into its offerings and drove the expansion of its Zero Trust Exchange platform, scaling it to handle over 300 billion daily transactions, reinforcing Zscaler’s position as the world’s largest in-line cloud security platform.
Syam Nair
At Salesforce, he was EVP and head of product engineering and technology for Tableau, Customer Data Cloud (Genie – a hyperscale CRM data cloud), AI (Einstein), Automation (Flow), and Salesforce Marketing Cloud. He managed a globally distributed engineering team and cloud infrastructure. Nair and his execs were also responsible for the vision and execution of Salesforce’s next-gen AI search and analytics platform and experiences. At Microsoft, he was part of the leadership team responsible for building and accelerating the expansion of the globally distributed Azure data services.
Nair will help NetApp sharpen its focus on hybrid cloud and AI. The company is developing a new version of ONTAP for the AI era, and reckons AI inferencing and RAG use cases in the enterprise will be a larger and longer-lived market than model training. NetApp also has a strong focus on developing its cloud-native file offerings in the AWS, Azure, and Google clouds.
William Blair analyst Jason Ader has hosted investor meetings with NetApp CFO Wissam Jabre and tells subscribers: “NetApp has been plagued by choppy revenue performance over the past five years (two up years, three down years), with investors looking for clues to gain conviction on future consistency. Going forward, the company’s revenue growth algorithm to achieve its revenue CAGR target (from fiscal 2025-2027) of mid to high single digits is based on a mix shift to higher growth opportunities, including AFAs, public cloud services, block-optimized storage products, and AI. The company is off to a good start here, with revenue growth of 5 percent in fiscal 2025, and guidance of 4 percent growth (excluding the Spot divestiture) in fiscal 2026.”
Nair said: “I’ve spent my career tackling complex technological challenges and leading teams through transformations for hyper-growth. I’m thrilled to bring that experience to NetApp. We will set a bold vision for the future of hybrid cloud data services and execute with a growth mindset and relentless focus on customer success.”
AI is everywhere at HPE Discover 2025 with Nvidia-fueled AI factories taking top billing, Alletra storage products playing their part at the bottom of its AI product stack, and a Commvault-Zerto deal to protect data.
The Alletra Storage MP X10000 object storage array, based on ProLiant server controller nodes and with a disaggregated shared everything (DASE) architecture, will support Anthropic’s Model Context Protocol (MCP) with built-in MCP servers.
HPE president and CEO Antonio Neri said: “Generative, agentic, and physical AI have the potential to transform global productivity and create lasting societal change, but AI is only as good as the infrastructure and data behind it.”
The company says that “integrating MCP with the X10000’s built-in data intelligence accelerates data pipelines and enables AI factories, applications, and agents to process and act on intelligent unstructured data.” In its view, the MP X10000 will offer agentic AI-powered storage with MCP.
It explains: “By connecting GreenLake Intelligence with the X10000 through MCP servers, HPE can enable developers and admins to orchestrate data management and operations through GreenLake Copilot or natural-language interfaces. Additionally, connecting the built-in data intelligence layer of X10000 with internal and external AI agents ensures AI workflows are fed with unstructured data and metadata-based intelligence.”
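HPE has not published code for this integration, but as a rough illustration of the pattern it describes, here is a minimal MCP server sketch using the open-source MCP Python SDK; the tool name and metadata fields are hypothetical, not the X10000’s actual interface.

```python
# Illustrative only: a toy MCP server exposing one metadata-lookup tool, built with
# the open-source MCP Python SDK. The tool name and fields are hypothetical and are
# not HPE's actual X10000 interface.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("object-metadata")

@mcp.tool()
def lookup_object_metadata(bucket: str, key: str) -> dict:
    """Return metadata for an object so an AI agent can decide how to process it."""
    # A real integration would query the storage platform's metadata index here.
    return {
        "bucket": bucket,
        "key": key,
        "content_type": "application/pdf",   # placeholder values
        "tags": ["contract", "unstructured"],
        "size_bytes": 1_048_576,
    }

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so an MCP client (e.g. Claude) can call it
```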
The X10000 now supports the Nvidia AI Data Platform reference design and “has an SDK to streamline unstructured data pipelines for ingestion, inferencing, training and continuous learning.” Nvidia says this data platform reference design, which features Blackwell GPUs, BlueField-3 DPUs, Spectrum-X networking, and its AI Enterprise software “integrates enterprise storage with Nvidia-accelerated computing to power AI agents with near-real-time business insights.”
An Nvidia blog said back in May: “Nvidia-Certified Storage partners DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, VAST Data and WEKA are introducing products and solutions built on the Nvidia AI Data Platform, which includes NVIDIA accelerated computing, networking and software.”
The Commvault-Zerto deal builds on an existing GreenLake cloud partnership with Commvault’s cloud offering (Metallic). Commvault will integrate Zerto’s continuous data protection and disaster recovery software into its cloud and offer it to Commvault Cloud customers to protect virtualized on-premises and cloud workloads. Customers get near-zero recovery point objectives (RPOs) and recovery time objectives (RTOs) for better protection against data outages.
Fidelma Russo
Fidelma Russo, EVP and GM Hybrid Cloud and CTO at HPE, stated: “Our combined innovations set a new standard for data resilience, helping customers navigate a rapidly evolving threat landscape.”
Commvault CEO Sanjay Mirchandani said: “At a time when data is more valuable and vulnerable than ever, our collaboration is empowering customers to keep their business continuous by advancing their resilience and protection of hybrid workloads.”
The two companies say they are also “introducing enhanced integration between the HPE storage and data protection and Commvault Cloud portfolios to safeguard sensitive data, protect against ransomware, and ensure seamless recovery from disruptions.” This has three aspects:
Resilience: The combination of Alletra Storage MP B10000 (unified file and block storage) with built-in ransomware detection and snapshot immutability, HPE Cyber Resilience Vault with air-gapped protection, and Commvault Cloud AI-enhanced anomaly detection and threat scanning provides unmatched resilience and peace of mind.
Fast, Clean Recovery: The integration of Alletra Storage MP X10000 featuring data protection accelerator nodes with Commvault Cloud enables enterprises to return to operation safely and rapidly after an incident. It brings together blazing fast storage, typical 20-to-1 data reduction, and the broadest protection across hybrid cloud workloads.
Geographic Protection: Commvault Cloud seamlessly orchestrates simultaneous snapshots and local backups for two synchronously replicated Alletra Storage MP B10000 arrays located in different geographical regions. This streamlines data protection workflows and delivers “unparalleled recoverability for critical enterprise data.”
There is no delivery timescale for these three items yet. Commvault and HPE’s partnership includes integrations with HPE StoreOnce backup appliances and tape storage systems “for highly cost-effective, long-term data retention, as well as advanced image-based protection for virtualized environments through HPE Morpheus VM Essentials Software.”
HPE’s Morpheus Enterprise Software provides a unified control plane for its AI factories and has a Veeam data protection integration.
Alletra block storage has been ported to the AWS and Azure clouds and is delivered and managed via the GreenLake cloud. The Alletra B10000 and X10000 systems share the same hardware.
Digital Realty, a cloud and carrier-neutral datacenter, colocation and interconnection provider, is standardizing on the HPE Private Cloud Business Edition for its operations across more than 300 datacenters on six continents. This includes the Morpheus VM Essentials Software and the Alletra Storage MP B10000. HPE and World Wide Technology (WWT) will collaborate to support deployment across Digital Realty’s global footprint.
A new HPE CloudOps Software suite brings together OpsRamp, Morpheus Enterprise software, and Zerto software. Available standalone or as part of the suite, these provide automation, orchestration, governance, data mobility, data protection, and cyber resiliency across multivendor, multicloud, multi-workload infrastructure.
Alletra Storage MP X10000 with MCP support is planned for the second half of 2025.
Bootnote
Veeam and HPE announced a combination of Veeam Data Platform, HPE Morpheus Software, and Zerto Software, with increased joint go-to-market investment, to provide:
Veeam delivery of image-based backup for Morpheus VM Essentials Software in the near term. Whether running HPE Private Cloud solutions or standalone servers with Veeam and VM Essentials, customers can take advantage of seamless, unified multi-hypervisor protection and VM mobility, as well as up to 90 percent reduction in VM license costs.
Protection for containerized and cloud-native workloads, with Veeam Kasten providing backup and recovery.
HPE and Veeam also announced a “Data Resilience by Design” joint framework that includes HPE cybersecurity and cyber resilience transformation and readiness services.
StorONE is using Phison’s aiDAPTIV+ software in its ONEai automated AI system for enterprise storage.
Phison, an SSD controller supplier that has more recently moved into drives, launched aiDAPTIV+, an LLM system that can be trained and maintained on premises, in August last year. StorONE supplies S1 storage: performant and affordable block, file, and object storage from a single array formed from clustered all-flash and hybrid flash+disk nodes. ONEai integrates Phison’s aiDAPTIV+ technology directly into the StorONE storage platform, with plug-and-play deployment, GPU optimization, and native AI processing built into the storage layer. It is said to enable large language model (LLM) training and inferencing without the need for external infrastructure or cloud services.
Gal Naor
At an IT Press Tour event StorONE CEO Gal Naor stated: “ONEai sets a new benchmark for an increasingly AI-integrated industry, where storage is the launchpad to take data from a static component to a dynamic application. Through this technology partnership with Phison, we are filling the gap between traditional storage and AI infrastructure by delivering a turnkey, automated solution that simplifies AI data insights for organizations with limited budgets or expertise.
“We’re lowering the barrier to entry to enable enterprises of all sizes to tap into AI-driven intelligence without the requirement of building large-scale AI environments or sending data to the cloud.”
ONEai uses AI GPU and memory optimization and intelligent data placement to offer an efficient, AI-integrated system with minimal setup complexity. Integrated GPU modules reduce AI inference latency and deliver up to 95 percent hardware utilization.
Users benefit from reduced power, operational, and hardware costs, enhanced GPU performance, and on-premises LLM training and inferencing on proprietary organizational data. There is no need to build complex AI infrastructure or navigate the regulations and costs of off-premises systems.
Michael Wu
The ONEai software is optimized for fine-tuning, RAG, and inferencing, features integrated GPU memory extensions, and simplifies data management via a user-friendly GUI, eliminating the need for complex infrastructure or external AI platforms. We’re told it automatically recognizes and responds to file creation, modification, and deletion, feeding these events into ongoing AI activities and delivering real-time insights into data stored in the system.
Michael Wu, GM and president of Phison US, said: “Through the aiDAPTIV+ integration, ONEai connects the storage engine and the AI acceleration layer, ensuring optimal data flow, intelligent workload orchestration and highly efficient GPU utilization. The result is an alternative to the DIY approach for IT and infrastructure teams, who can now opt for a pre-integrated, seamless, secure and efficient AI deployment within the enterprise infrastructure.”
Datadobi is adding more automation to its StorageMAP product to help storage admins get more done, faster.
It says customers dealing with increasingly complex data landscapes need help to stay in control without suffering from more operational overhead. StorageMAP 7.3 reduces the time admins spend on routine tasks, helping them move critical data without disrupting compliance or performance.
Carl D’Halluin
Carl D’Halluin, CTO at Datadobi, stated: “Organizations can now define and execute policy-based actions at scale, removing the bottlenecks inherent to existing manual processes, making their file and object storage environments far more responsive to operational needs.”
Datadobi’s StorageMAP now has policy-driven workflows as part of its capabilities to orchestrate and automate data management tasks across file and object storage. Customers can, it says, act on data more precisely, and migrate between S3-compatible platforms while maintaining compliance.
Potential use cases include periodic automated archival, creating data pipelines to feed GenAI applications, identifying and relocating non-business-related data to a quarantine area, and more, we are told.
StorageMAP v7.3
Its workflow engine can execute tasks in response to triggers such as a time schedule. Once policies are published, StorageMAP runs the workflows on schedule without requiring manual supervision. A “dry run” feature helps check out the scope of a policy before full execution.
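Datadobi has not published the policy syntax, but the idea of a scheduled, policy-driven action with a dry-run check can be sketched generically; the following Python is illustrative pseudologic, not StorageMAP’s API.

```python
# Hypothetical sketch of a policy-driven archival workflow with a dry-run mode.
# This is not Datadobi's API; names and fields are invented for illustration.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Policy:
    name: str
    min_age_days: int          # files older than this are candidates
    source: str                # e.g. an SMB/NFS share path
    target: str                # e.g. an S3-compatible archive bucket

def run_policy(policy: Policy, files: list[dict], dry_run: bool = True) -> list[str]:
    """Select files matching the policy; move them only when dry_run is False."""
    cutoff = datetime.now() - timedelta(days=policy.min_age_days)
    candidates = [f["path"] for f in files if f["modified"] < cutoff]
    if dry_run:
        print(f"[{policy.name}] dry run: {len(candidates)} files would move "
              f"from {policy.source} to {policy.target}")
    else:
        for path in candidates:
            archive(path, policy.target)   # placeholder for the actual data mover
    return candidates

def archive(path: str, target: str) -> None:
    print(f"archiving {path} -> {target}")
```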
StorageMAP 7.3 also adds support for:
Granular file-level deletes – this helps when deleting files from directories that contain a mix of valid and no-longer-valid files. Admins can identify files that match specific criteria and save them as input to a targeted delete job, which StorageMAP will execute. Each delete job generates a report detailing its parameters and outcome.
Locked object migration between S3-compatible storage systems. This allows data in Write Once Read Many (WORM) format to be relocated across different vendor platforms while retaining its retention date and legal holds.
S3 storage class selection during object migration or replication to support cost and performance objectives.
In the near to mid-term, we think that AI copilot-type technology will be used to enable an unstructured data estate to almost self-manage by using policies to optimize for cost, compliance, performance, and resilience.
Arctera, Wasabi Technologies, and distributor TD SYNNEX have launched a joint, channel-exclusive data protection offering. It combines Arctera Backup Exec with Wasabi Hot Cloud Storage, available through a single SKU via TD SYNNEX, providing turnkey integration with end-to-end protection across physical, virtual, cloud, and SaaS environments. Channel partners get flat pricing with no egress fees, stronger margins, and a simplified sales motion, plus streamlined procurement and deployment through one order form, one invoice, and one vendor relationship.
…
Assured Data Protection (Assured), a UK-based IT managed services provider (MSP) for cloud data protection solutions using Rubrik’s software, announced the expansion of its operations into France, bringing enterprise-level cyber recovery solutions to mid-market and enterprise businesses there. In France, Assured has partnered with I-TRACING, a managed cybersecurity services provider with a security operations center (SOC) managed services offering. The SOC protects complex environments on a 24/7 basis, leveraging an integrated international “follow-the-sun” operating model. Through the partnership, I-TRACING can now offer Rubrik’s data protection platform as a managed service through Assured’s second-site data replication infrastructure.
…
Ataccama says it enables businesses to turn unstructured content such as contracts, invoices, and PDFs into structured, actionable data that can be analyzed using natural language prompts, through an integration of its unified data trust platform Ataccama ONE with Document AI on the Snowflake Marketplace. Enterprises can turn documents into structured records by using natural language prompts such as “What is the effective date of the contract?”, which are processed by Snowflake’s Arctic-TILT large language model to create structured outputs written directly into Snowflake tables.
Ataccama ONE connects to these tables to profile the data, apply quality checks, and manage governance policies on the structured outputs. It also tracks how the data flows into analytics, reporting, and AI workflows by capturing lineage at the table level. Additional metadata about the original documents can be added to enrich traceability if needed.
…
AWS has enabled SAN booting: you can now boot Amazon EC2 instances in enterprise environments from Amazon FSx for NetApp ONTAP. We’re told it can save serious money for customers with hundreds, thousands, or even hundreds of thousands of boot volumes, or when building HA/DR storage with improved resiliency and OS change management. Read more here.
…
Broadcom announced GA of VMware Cloud Foundation (VCF) 9.0. It says: “VCF 9.0 introduces a completely new architecture that simplifies operations, boosts performance and resiliency, and delivers a consistent cloud operating model across on-premises, edge, and managed environments. The release includes significant updates across both the core platform and advanced services, giving IT and developer teams a unified, secure foundation to build, run, and scale traditional, containerized, and AI workloads.”
Broadcom is also delivering:
Private AI Foundation with Nvidia: Multi-tenant GPU-as-a-Service, air-gapped deployment, model runtime, and agent builder tools.
Live Recovery: Cyber and disaster recovery with isolated clean rooms and up to 200 immutable VM snapshots.
Avi Load Balancer: Self-service load balancing, lifecycle automation, and VPC-aware deployments.
Data Services Manager: Enterprise PostgreSQL/MySQL with SQL Server in tech preview, integrated for DBaaS delivery.
…
Tom’s Hardware reports that Chinese AI companies are smuggling hard drives to Malaysia in order to train their AI models without technically breaking the export controls that the US has placed on advanced Nvidia chips.
…
Cyber resilience supplier Commvault announced a partnership with Kyndryl to help customers recover faster, advance cyber resilience, and navigate the evolving regulatory landscape. This will augment Kyndryl’s portfolio of cyber resiliency services, which encompasses Incident Recovery Services, including Cyber Incident Recovery, Managed Backup Services, and Hybrid Platform Recovery. Commvault and Kyndryl will collaborate with Pure Storage to assist organizations in complying with evolving and rigorous regulations, including the European Union’s DORA, NIS2, PSD2, as well as NYDFS NYCRR 500 and Australia’s APRA CPS 230.
The services from Commvault and Pure Storage deliver a modular, four-layer architecture that streamlines the compliance process and accelerates recovery across hybrid cloud environments:
Clean Recovery Zone: A secure environment for forensic analysis, validation of clean backups, and staged recovery operations.
Production Rapid Restore: Fast, reliable restoration of large datasets using Pure Storage FlashBlade, with immutability powered by S3 Object Lock and SafeMode (an illustrative Object Lock sketch follows this list).
Immutable Snapshot Recovery: Application-consistent snapshot replication with Commvault IntelliSnap and Pure Storage FlashArray, enabling rapid restoration of Tier-1 workloads.
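S3 Object Lock itself is a standard S3 API. As a rough sketch of how a backup writer applies an immutable retention lock when landing data on an S3-compatible target (the endpoint, bucket, and key below are placeholders, and FlashBlade SafeMode specifics may differ), using boto3:

```python
# Rough illustration of S3 Object Lock immutability using boto3. The endpoint,
# bucket, and key are placeholders; FlashBlade/SafeMode specifics may differ.
import boto3
from datetime import datetime, timedelta, timezone

s3 = boto3.client("s3", endpoint_url="https://s3.example.internal")  # placeholder endpoint

retain_until = datetime.now(timezone.utc) + timedelta(days=30)

s3.put_object(
    Bucket="backup-copies",                  # bucket created with Object Lock enabled
    Key="commvault/job-1234/chunk-0001",     # placeholder object key
    Body=b"...backup data...",
    ObjectLockMode="COMPLIANCE",             # cannot be shortened or removed until expiry
    ObjectLockRetainUntilDate=retain_until,
)
```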
…
Private equity firm Haveli Investments has an agreement to acquire NoSQL database supplier Couchbase in a $1.5 billion all-cash transaction.
…
Cloud file services supplier CTERA says it’s the first hybrid cloud storage supplier to support the Model Context Protocol (MCP). This allows enterprises to connect large language models (LLMs), including assistants like Claude, AI IDEs (e.g. Cursor), and internally developed agents, directly to private data, without compromising security or compliance. Until now, connecting LLMs to private files meant sacrificing control or building custom integrations. The Anthropic-developed MCP solves this by providing a structured, permission-aware interface.
CTERA has embedded this capability into the CTERA Intelligent Data Platform, enabling users to look up, summarize, retrieve, manage, and create files using natural language, while giving IT and security teams full control over access, auditing, and encryption. With MCP integrated into the platform, users can automate routine file management tasks and skip tedious folder navigation or coding, using AI-driven actions to improve productivity.
…
Diskover announced the closing of a $7.5 million seed funding round, and the launch of partnerships with both NetApp and Snowflake, each of which is also among Diskover’s new investors. It has acquired data intelligence and orchestration platform company CloudSoda.
…
Postgres supplier EnterpriseDB (EDB) updated its Postgres AI (EDB PG AI) offering. EDB PG AI unifies relational and non-relational data in a single system, featuring automatic pipelines and built-in development tools that automate and operationalize data for AI. It now gets low-code/no-code simplicity, with AI pipeline creation in days, not months. In just five lines of code, users can set up an AI pipeline that automatically syncs embeddings with source data, ensuring an always-up-to-date AI knowledge base without costly infrastructure maintenance. It supports Nvidia GPU servers, NeMo Retriever, and NIM microservices. It also gets sovereign hybrid Postgres data estate management and observability. With 200-plus built-in metrics and intelligent recommendations, teams can identify and resolve issues 5x faster, boost application performance by up to 8x, and optimize infrastructure – no DBA expertise required.
…
Data orchestrator Hammerspace says its software is available on the Oracle Cloud Marketplace and can be deployed on Oracle Cloud Infrastructure (OCI). In recent OCI performance benchmarks, the Hammerspace Tier 0 solution delivered 2.5x faster read bandwidth, 2x higher write throughput, and 51 percent lower latency when compared to the same client servers connected to external networked storage running on OCI. These results were achieved using OCI bare metal shapes, with zero custom software or hardware, leveraging the Hammerspace Tier 0 solution, which utilizes low-latency NVMe storage local to OCI GPU VM shapes. Read more here.
…
Hitachi Vantara announced that Turkey’s DestekBank has deployed its Virtual Storage Platform One Block (VSP One Block) to support its retail banking expansion and enhance digital service delivery. The VSP One system combines high-speed NVMe architecture with adaptive data reduction to deliver fast, efficient, and reliable performance at scale. Since deploying VSP One Block, DestekBank has seen a 35 percent increase in application performance, a 25 percent decrease in datacenter energy consumption, and a 30 percent reduction in storage management workload, freeing up IT resources to focus on more strategic priorities. With a 4:1 data reduction ratio, the bank has significantly optimized storage efficiency and lowered total cost of ownership by approximately 20 percent. DestekBank expects a full return on investment within 18 months of implementation.
…
NERSC (the National Energy Research Scientific Computing Center), operated by Lawrence Berkeley National Laboratory for the US Department of Energy Office of Science, has a Doudna supercomputer coming: a Dell-built system to be delivered in 2026 and based on Nvidia’s Vera Rubin CPU and GPU platform. It will handle AI training, traditional simulation, streaming sensor data, and analysis in one pipeline, without forcing the user to break the workflow into separate, asynchronous stages. It has two types of storage systems. One is IBM’s Storage Scale, a traditional high-performance parallel file system for modeling and simulation workloads running at scale. The second has not been revealed but – heavy clue alert – we can read about Doudna in a VAST Data blog, which says it “includes support for quality-of-service-driven workloads, those that involve streaming data, inference, or time-sensitive experimental inputs.” More will be revealed at Nvidia’s GTC in 2026.
…
IBM Red Hat Ceph 8.1 is out and has some pretty cool new features. It enables support for two-site and tiebreaker stretch-mode clusters. With this capability, clients can lose a site without impact to the availability of their data resources. See more details here.
…
The Crusoe GPU server farm operation launched Data Mobility Services (DMS) using Lightbits high-performance block storage. DMS is a standalone service that runs in a dedicated container outside the Lightbits cluster, managing the movement of volumes and snapshots. It uses a Thick Clone feature to create volumes by leveraging all available resources across the cluster, which streamlines image creation and management for AI cloud environments. Crusoe’s customers have the flexibility to build from custom image templates and take snapshots at any stage of the AI data pipeline. DMS capability enhancements are already in the pipeline, including asynchronous replication, incremental copies, and advanced data protection features. A Crusoe blog tells you more.
…
Lightbits Labs, inventor of the NVMe over TCP protocol natively designed into its high-performance block storage solution, published a reference architecture (RA) with AMD, using Ceph, for what it calls the industry’s first scalable, highly available, disaggregated, software-defined storage solution that simplifies and accelerates Kubernetes infrastructure deployments. It features the “seamless integration of Lightbits block storage into existing Kubernetes deployments, without requiring significant re-architecting, which enables a smooth upgrade path to software-defined, high-performance block storage.” The reference architecture is available for download on the Lightbits Labs website.
…
Regatta, the Israeli startup building a scale-out, transactional (OLTP), analytic (OLAP) relational database (OLxP) with extensibility to semi-structured and unstructured data, recently had some foundational patents granted, allowing it to publish more technical detail online in CTO Erez Webman’s blogs, Regatta’s Architecture: A Bird’s Eye View and Regatta’s Concurrency-Control Protocols.
Regatta enables frequent, ad hoc, complex, distributed queries on live, up-to-the-second transactional data without degrading performance. It guarantees linearly scalable OLTP and OLAP performance and strong distributed ACID on hyper-scalable clusters with huge capacities. In many cases, Regatta eliminates the need for ETL and a data warehouse. The company is excited about the role Regatta plays in transactional agentic AI, where agents must act on real-time, record-level data – in addition to the less specific, more generic outputs that are generally based on stale or archived data.
Its broader vision is to collapse the data stack into a single, elastic “pool” of commodity compute, storage, and networking. No more silos of a variety of operational and analytical stores. All workloads operate directly on the same data, on the same “pool,” scaling capacity or performance elastically as needed, without the complexity of ingestion pipelines or duplication (or quadruplication, or more), and without the related costs and complexities. Always with linearly scalable very high performance, and strong transactional consistency (ACID) guarantees.
…
SNIA Storage Management (SM) says Swordfish v1.2.8 is now available as a SNIA Standard and will be submitted to the ISO process. Swordfish provides a standardized approach to managing storage and servers in hyperscale and cloud infrastructure environments, leading to streamlined implementation, scalability, and cost savings. It extends the DMTF Redfish specification – utilizing the same easy-to-use RESTful interface, along with JavaScript Object Notation (JSON) and the Open Data Protocol (OData) – to manage storage equipment and storage services in addition to servers. Swordfish v1.2.8 includes expanded content for NVMe device management to support joint work with the OpenFabrics Alliance and DMTF.
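As a minimal sketch of what that RESTful interface looks like in practice, the following Python queries a Swordfish/Redfish service root and walks its storage collection; the host, credentials, and exact resource tree are placeholders and will vary by implementation.

```python
# Minimal sketch of querying a Swordfish/Redfish service over its RESTful JSON interface.
# The host, credentials, and exact resource tree are placeholders for illustration.
import requests

BASE = "https://bmc.example.internal"          # placeholder management endpoint
auth = ("admin", "password")                   # real deployments use sessions/tokens

root = requests.get(f"{BASE}/redfish/v1/", auth=auth, verify=False).json()
print(root.get("Name"))                        # service root description

# Walk the storage collection (Swordfish extends these Redfish storage resources)
storage = requests.get(f"{BASE}/redfish/v1/Storage", auth=auth, verify=False).json()
for member in storage.get("Members", []):
    detail = requests.get(BASE + member["@odata.id"], auth=auth, verify=False).json()
    print(detail.get("Id"), detail.get("Status", {}).get("Health"))
```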
…
Starburst, which uses Trino open source distributed SQL to query and analyze distributed data sources, has been recognized as a Leader and Fast Mover in the newly released 2025 GigaOm Radar for Data Lakes and Lakehouses report. This marks the third consecutive year that Starburst has earned a leadership position in this industry report.
…
HCI vendor StorMagic has announced a virtual OEM (V-OEM) agreement with HPE involving the ProLiant Compute DL145 Gen11 server. The system includes HPE iLO silicon root of trust and HPE Compute Ops Management, and features an AMD EPYC 8004 Series processor that combines with StorMagic SvHCI to deliver an HCI setup designed for retailers. Customers with decentralized business applications and data can easily size, order, and receive integrated, purpose-built hyperconverged infrastructure (HCI) solutions designed for simplicity, reliability, high availability (HA), and affordability, and sized around customer compute and storage capacity requirements. StorMagic will be showcasing the new integrated solution at HPE Discover 2025 in Las Vegas.
…
TerraMaster announced its D4 SSD, a four-bay all-flash direct-attached storage (DAS) enclosure designed for media creators, filmmakers, and professionals seeking speed and flexibility. It has USB4 support delivering up to 40 Gbps bandwidth. When equipped with four M.2 SSDs (such as the Samsung 990 PRO 4 TB) and configured as RAID 0 via macOS Disk Utility, it achieves read/write speeds of up to 3,224 MBps. In single-disk operation, read speeds can reach 1,608 MBps. It’s fully compatible with Thunderbolt 3/4/5 and USB 4/3.2/3.1/3.0 protocols.
…
Startup Typedef has gained $5.5 million seed funding to develop its software technology to help turn AI pilot projects into production-ready workloads. Typedef is led by co-founders Yoni Michael, who sold his datacenter analytics company, Coolan, to Salesforce in 2016, and Kostas Pardalis. The company says it’s “driven by a belief that inference is the new transform, and building on their deep experience leading data infrastructure teams at cloud-first companies such as Salesforce, Tecton and Starburst Data, Typedef set out to build a solution from the ground-up that can handle mixed AI workloads at scale with equal efficiency, predictability, and performance.”
Michael said: “It is extremely difficult to put AI workloads into production in a predictable, deterministic and operational way, causing most AI projects to linger in the prototype phase – failing to achieve business value or demonstrate ROI.”
Typedef graphic
“The fact is, legacy data platforms weren’t built to handle LLMs, inference, or unstructured data. As a result, the workaround has been a patchwork of systems, aging technologies and tooling, or DIY frameworks and data-processing pipelines that are brittle, unreliable, and don’t scale. Typedef is righting these wrongs with a solution built from the ground up with features to build, deploy, and scale production-ready AI workflows – deterministic workloads on top of non-deterministic LLMs.”
…
VergeIO announced VergeIQ, a fully integrated enterprise AI infrastructure offering “enabling enterprises and research institutions to securely, efficiently, and easily deploy and manage private AI environments.” It is integrated as a core component of VergeOS. Customers can select and deploy various large language models (LLMs), such as Llama, DeepSeek, and Falcon, and immediately begin using them on their own data within VergeOS’ secure, tenant-aware infrastructure. An early access program begins in July 2025, with GA in August 2025.
…
Cloud object storage supplier Wasabi announced its achievement of ISO/IEC 27001 certification – the global standard for information security management systems. The internationally recognized ISO/IEC 27001 certification ensures Wasabi’s security architecture is robust and protects petabytes of mission-critical data across healthcare, education, government, financial services and media sectors globally.
…
Software RAID supplier Xinnor says its xiRAID product is being used in the third highest-scoring full production list IO500 system, at the Erlangen National High Performance Computing Center, which uses the Lustre file system. Xinnor partner MEGWARE was responsible for the deployment. The AlmaLinux 9.4 OS and Lustre 2.16.1 file system are open source; Xinnor’s xiRAID Classic v4.2 is proprietary.
Xinnor tells us: “We worked closely with MEGWARE to design the architecture of this cluster, with the goal of maximizing performance, its stability over time and without compromising reliability. In fact, xiRAID takes care of drive protection at node level but also the fail-over and fail-back in case of a server node failure.” It claimed: “Not only is this cluster faster than any DDN, Weka, VastData, or Hammerspace clusters, but it is also one of the most cost-effective deployments appearing on the entire list as it is based on Lustre community version and commodity hardware.”
…
Xinnor announced a high-availability BeeGFS parallel file system validated design with Western Digital’s OpenFlex Data24 4000 Series NVMe-oF storage system. Xinnor’s xiRAID Classic provides automatic failover capabilities through Pacemaker Cluster Shell integration. The system uses WD’s RapidFlex network adapters and fabric bridge devices to provide 12 ports of 100 GbE connectivity. The architecture supports both RoCE (RDMA over Converged Ethernet) and TCP connections. Tech specs include:
Up to 1,474.56 TB capacity in a 2U unit
Support for up to 12 hosts without requiring a switch
PCIe Gen 4 performance throughout the chassis
Measured performance of 61.5 GBps read and 24.4 GBps write throughput
The complete solution brief is available here and the detailed reference architecture can be accessed here.
Interview. We had an opportunity to talk with Jim Liddle, Nasuni chief innovation officer for Data Intelligence and AI, and the conversation covered edge AI, ensuring data for AI is resilient, and considering whether malware-infected agents using MCP could wander at will within an AI data space. The conversation has been edited for clarity and brevity.
Blocks and Files: Nasuni was early into AI. I remember a year ago, further back than that, and now the rest of the industry has caught up and all the focus seems to be on Nvidia. We’ve got to do AI training, we’ve got to do AI inferencing. And so all the storage companies, almost without exception, are supporting GPUDirect for files, for objects. They’re supporting Nvidia’s GPUs, AI stack, NeMo retrievers, NIM microservices, and all that stuff. There’s Dell, there’s HPE, there’s VAST, there’s NetApp, Pure – all significant Nvidia partners, and AI factories this, AI factories that. Where does Nasuni play in this space?
Jim Liddle: So I guess a couple of things. If you talk to people like Microsoft and other vendors, companies who are using AI to do training, real training, it’s less than 5 percent. That’s not the business, is it? It’s not the future. No. And the reason for that is it’s expensive. If you want to do even a very small training set as a company, it’s going to cost you something like a million dollars just to do the training, not even to hire the staff and get the data ready.
Jim Liddle
Blocks and Files: I think I can see where you’re going with this, which is you’re inherently a company with an edge focus.
Jim Liddle: We are edge to cloud. We’ve got what I would say are three pillars of AI that very few of the vendors can match: edge to cloud, global namespace, and AI data resiliency.
Why is edge to cloud so important? Because that – the edge – is where all your employees are, and the AI services are in the cloud. Most of the enterprises are using hyperscalers. How do you get the data from here to there? It’s easy to do it once. It’s easy to have a NetApp in one place and go, oh, let me see if I can get a pipeline going to get my data there. What do you do if you’ve got 12 arrays in 12 different locations, and how do you do that every single day, every single hour?
We are multi-vendor. We don’t care what the hardware is. And that edge, that hybrid nature of it, obviously it wasn’t designed for AI; it just happens to be an absolutely perfect match for Nasuni to be able to move data from here to there without even thinking about it. You have workers working here every day, every hour. And a customer doesn’t have to worry about migrating data from edge-to-cloud or back because it’s inherent inside the software.
It happens. It just absolutely moves back into the global namespace. And the one thing about AI that’s absolutely fundamentally true is – it wants a single source of truth for the data because you get better context, okay?
Blocks and Files: You’re now in a position to have a single virtual center for all the company’s proprietary data. How do you get it to the GPU servers? Do you do that yourself?
Jim Liddle: I would argue that GPUs become really important when you’re doing training. Yes. Companies, enterprise companies, do they really care about training? Not really. What they care about is how do they get the best value of their domain information from an AI perspective.
Blocks and Files: So we’ll have a general trained AI model, general, we’ll have access to it and then we’ll feed it with our own data.
Jim Liddle: And they’re using retrieval augmented generation (RAG) or they’re using agents. If you think about what we do today, we have an edge server, we have a cloud here, the namespace, and then we go through the edge to get back into the data.
Blocks and Files: Suppose I’ve been listening to another supplier and they say inference at the edge is where it’s going to be because you can’t sit at the edge and have communications back to the datacenter for your inferencing going on. You need the inference to work with local data because data’s got gravity.
Jim Liddle: I’d refer you back to the fact they say that because they can’t get the data from here to there. We can. All of the data has gravity and that’s why we cache it locally for the applications to get access to it. But your AI doesn’t need to be there if the data is just seamlessly moving back to the cloud where you’ve got big heavy-scale AI that can work on it at scale. You don’t need to inference at the edge. I’m not saying there aren’t certain instances where you would want to.
You can actually have all of the edges communicate directly, transiently, back to the namespace. A lot of the vendors say, oh, you’ve got to inference into the edge because it’s hard to move the data back into the cloud, but not with Nasuni. It just goes back instantly and literally you can be working on a file here and an hour later it’s there and AI’s got access to it.
So a guy over here can go, oh, I need to know the latest update on such and such and it’s there. It just gets told. That’s a huge differentiator.
The other thing I would say around that, and I guess this is the way I think the industry will go from an edge perspective; imagine you’ve got a Nasuni customer. They’ve got 12 locations around the world, not unusual. People who buy Nasuni tend to be big enterprise, heavy duty customers.
Now imagine agentic is taking off – this is the year of agentic AI. Say all of your 12 locations are taking orders and they’re all going into one particular directory in the namespace. So it’s all coalesced here. However, you have an agent over in Phoenix that needs access to those orders, but it also needs access to data from CRM and other systems that are not in the cloud because you don’t want them there; they’re over here.
So with Nasuni, all of those 12 other locations can be pinned down to that one server in Phoenix. Every time somebody puts something in, it’s all going into that global space. Ultimately it’s all being fed out to Phoenix where the agent is saying, oh, I need all of the other information and then I need to go to the CRM and then other systems. It’s all getting processed locally at the edge. That’s a hard architecture to have to replicate if you weren’t using Nasuni.
And I can see that it’s going to happen. I mean some of the agents will be on cloud. Sure. But you’re right. Some of the agents will be inferencing at the edge, but you need to be able to shuffle the data around. It’s not one location. It’s easy when it’s one location. It’s not easy when it’s 12 locations or 20 locations and that’s what is going to happen. You’ll end up with these little multi-step processes that can solve the particular problem in enterprise and they’ll do a step and they’ll go, oh, I need to go and get some access to the new data.
Blocks and Files: Should I think of data in this environment as being like the sea? So in a sense, wherever you are in an ocean, the sea is the sea is the sea. It’s the same. So wherever you are in the Nasuni data namespace, the data is there. You can access it. The data is the data is the data is the data.
Jim Liddle: Wherever you are in the world, the whole point of it is, I guess, when you strip it all back, AI needs access to global data to be the most effective. Not like just data from Phoenix or just data from London. It needs, if data from London and data from Phoenix have some context between each other, you want the AI system to see both. You don’t just want to see one. You’ll have used AI and you’ve asked it the question, the more data or relevance you give it and the more context you give it, the better.
Blocks and Files: This could be like a virtual desktop. I’m sat here with what looks like my PC desktop. It’s actually just a terminal connected to some central place. So I could be using an AI system sitting at some edge location, but it’s actually running up in the cloud.
Jim Liddle: Absolutely. So this idea of inference at the edge in that environment is tough. I’m not saying you won’t have to use it in certain circumstances with heavy-duty stuff, but your day job as a company is to run your business. It’s not to go and use AI; you solve problems. Will there be AI at the edge? Sure, in some circumstances. And there’ll be other situations – we’ve got some companies doing that already today, but they’ve made some strategic decisions around wanting to purchase GPUs because they think they’ll get better ROI and TCO over time. But for a lot of companies, if you look at what the use cases are, the use cases are pretty simple.
Blocks and Files: The implication here is that you’re not going to see heavy duty Dell or HPE servers with a rapid GPU inside them sat in small offices.
Jim Liddle: No. Or remote environments at all. It’s not going to happen. I don’t see it. I mean, Nvidia has just released DGX Spark, as you probably know, which is kind of an AI PC for the desktop. Do we see employees sat at a DGX Spark doing RAG workloads? I don’t think so. It’s so expensive – about four grand for a start – and then it still needs, obviously, technology to be able to set things up.
Ultimately what a person in the company wants to do is; they want to ask an AI a question, but they want it to be answered not on the foundational model’s knowledge. They want it to be answered on their own company knowledge.
Blocks and Files: And the AIs are going to be running in the equivalent of an AI execution space. They’re in the cloud. It’s global. They’re not sat on my local hardware.
Jim Liddle: No. Look, even if you can do that, for businesses, employees are not going to do that. I would say for the stock-standard business, what they’re interested in is: how can you let me, Mr. Nasuni, get access to my data with the AI that I choose to use, which in 90-odd percent of cases is going to be in the cloud with Microsoft, OpenAI, or AWS. And how can I do that to take advantage of the applications and tools they’re giving me to make it easier to leverage that data from an AI perspective?
Blocks and Files: I think that what you are providing is a way for the data, via your global namespace, to be fed to AI models.
Jim Liddle: Correct.
Blocks and Files: You’re not going to get involved in doing detailed RAG data preparation yourself. It doesn’t make any sense to do that. People will use models or pipeline stuff sitting on top of you for that.
Jim Liddle: Correct. I go back to the three core precepts. What are the core architectural precepts of Nasuni? The first one is edge to cloud. That’s so key. If you can’t get the latest data to the AI service, you’ve got a problem. The second one is the global namespace because if you are moving the data there, it’s got to be visible to all of your company locations. There’s no point being visible just for one. And then the third one is AI data resilience.
We’ve seen ransomware becoming more sophisticated, and that’s probably been driven in part by AI itself, because these threat actors are using AI to make the ransomware better. Once you start to get some of those business processes and agents embedded in the enterprise, what are they doing? They’re accessing data from all different places, including the center. It’s an open door, and what happens once you’ve got a hundred of those running, your enterprise relies on them, and your data gets locked down? You’ll be scrambling around wondering why something stopped.
Blocks and Files: You’re working out where it stopped and why when you’ve got a hundred AI agents as well as your normal human users accessing and processing it.
Jim Liddle: It is horrendous and you’re going to need AI data resilience. This really is the next step in ransomware resilience to be honest. Because the underlying threat is still the same. Your data gets hijacked, but here it becomes as important as AI.
Blocks and Files: What will Nasuni do for that?
Jim Liddle: We’re doing a few things for that. First of all, because we’ve got the snapshot technology in the architecture itself, we’re automatically doing those snaps in the background. All of the snaps we do are immutable, so we can easily roll back.
It sounds trite. I always hate saying it because it sounds a bit markety, but if you look at alternative backup strategies, the problem you’ve got generally with backup is you take an initial backup and then you do incrementals. But very few companies go back, check that, and roll it forward to see if it’s going to work. Actually getting the whole thing back and running takes time and effort. Whenever we do our snaps, they’re versions of the original object. For us to move back through a version, back to a snap, we just change a pointer.
Blocks and Files: What that means is you’ve got a time machine.
Jim Liddle: Yes. You can actually step back in time in minutes. You don’t have to regenerate an entire backup from a foundation and 2,000 incrementals. We literally just move the pointer.
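As a conceptual illustration of that pointer-based rollback model (not Nasuni’s implementation), restoring to an immutable version can be a pointer update rather than a rebuild from a full backup plus incrementals:

```python
# Conceptual illustration of pointer-based snapshot rollback; not Nasuni's implementation.
# Each snapshot is an immutable version; "restore" just moves the current pointer.
class VersionedObject:
    def __init__(self, initial: bytes):
        self.versions = [initial]     # immutable snapshots, oldest first
        self.current = 0              # pointer to the live version

    def snapshot(self, new_data: bytes) -> int:
        """Record a new immutable version and point 'current' at it."""
        self.versions.append(new_data)
        self.current = len(self.versions) - 1
        return self.current

    def rollback(self, version: int) -> None:
        """Restore is O(1): no rebuilding from a full backup plus incrementals."""
        self.current = version

    def read(self) -> bytes:
        return self.versions[self.current]

obj = VersionedObject(b"v0")
obj.snapshot(b"v1")
obj.snapshot(b"ransomware-encrypted")   # a bad write becomes just another version
obj.rollback(1)                         # step back in time by moving the pointer
assert obj.read() == b"v1"
```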
We’ve also got our ransomware protection. It doesn’t just look at file headers; it also looks for anomalous behavior. What it does is it says: oh, Chris has locked 50 files in under 30 seconds – or maybe under a second. I’m going to lock Chris out. I’m going to send a report to the admin and it’s going to be up to the admin to decide what to do with it. Let Chris back in again – or not, because that doesn’t look like the type of behavior that should be happening. And that’s built into the product today.
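The behavior Liddle describes is, in essence, rate-based anomaly detection. A simplified sketch of the idea follows; the window, threshold, and lockout logic are invented for illustration and are not Nasuni’s detection engine.

```python
# Simplified sketch of rate-based anomaly detection for file activity; thresholds and
# logic are invented for illustration and are not Nasuni's actual detection engine.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 30
MAX_MODIFICATIONS = 50        # e.g. "50 files locked in under 30 seconds"

events: dict[str, deque] = defaultdict(deque)   # user -> timestamps of recent writes
locked_out: set[str] = set()

def record_write(user: str, path: str) -> None:
    now = time.time()
    window = events[user]
    window.append(now)
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()                        # drop events outside the window
    if len(window) > MAX_MODIFICATIONS and user not in locked_out:
        locked_out.add(user)                    # block further writes
        notify_admin(user, len(window))         # leave the unlock decision to an admin

def notify_admin(user: str, count: int) -> None:
    print(f"ALERT: {user} modified {count} files in {WINDOW_SECONDS}s; access suspended")
```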
Blocks and Files: How about attacking this from another angle, which is that you’ll have AI agents accessing the data and those agents will have behavioral profiles. So you need to track what the agents are doing and if an agent is doing something different, anomalous, would you lock an agent out or am I going off on a tangent?
Jim Liddle: No, I think you’ve hit on a good point. I’ll give you an analogy. You’ve probably heard of the Model Context Protocol, or MCP. We are looking heavily at MCP. If you look at MCP today, Claude is an MCP client, so it can connect to any MCP server. All you’ve got to be careful of is that the MCP server you are connecting to is not a poisoned MCP server.
You’ve got to be really careful. Now if you are controlling all end to end, then that’s fine. It’s a closed system. But if you’ve downloaded an agent from somewhere and just embedded it into your agent framework, who’s to say that that agent hasn’t been compromised at some point? It’s an agent from a channel partner supposedly. Who knows?
Blocks and Files: That’s frightening. That’s really frightening.
Jim Liddle: It is. And I think in the enterprise, most of them are just going to gravitate towards … closed doors.
The first images have come from the Vera C. Rubin telescope in Chile showing an unprecedented view of the Southern Hemisphere’s night sky and the start of a ten-year time-lapse movie of the changes to its galaxies and stars. Spectra Logic tape libraries will store the images for the long term.
The telescope has a 3,200-megapixel camera that takes an image of a different region of the southern hemisphere sky about every 40 seconds each night for a decade, creating an ultra-wide, ultra-high-definition time-lapse recording of the changing night sky that includes detailed information about millions of variable stars. It takes precise measurements of variable stars as they change in brightness over minutes, days, and years. This Legacy Survey of Space and Time (LSST) project will produce around 500 petabytes of data in its ten-year period and this will be the widest, fastest, and deepest view of the night sky ever observed.
Rubin takes broad pictures of the night sky in the human optical and near-infrared spectrum. The orbiting James Webb telescope has a much narrower field of view (0.1 degrees compared to Rubin’s 3.5 degrees) and operates mainly in the infrared spectrum. It captures highly detailed, deep images of specific astronomical targets, but only covers a tiny fraction of the sky compared to Rubin’s telescope camera.
The first Rubin images have just been released and here is one showing part of the Virgo cluster of galaxies:
This image shows another small part of the Virgo cluster. Visible are two prominent spiral galaxies (lower right), three merging galaxies (upper right), several groups of more distant galaxies, many stars in the Milky Way Galaxy, and more. Credits: NSF / DOE / Vera C. Rubin Observatory
In comparison, here is a Webb telescope image of just one galaxy in Virgo, the NGC 5068 barred spiral galaxy:
This image of the barred spiral galaxy NGC 5068 is a composite from two of the James Webb Space Telescope’s instruments, MIRI and NIRCam. Credits: ESA/Webb, NASA & CSA, J. Lee and the PHANGS-JWST Team
The Rubin telescope is located on the Cerro Pachón mountain in central Chile, in the Coquimbo region, east of the city of La Serena. It is at an altitude of 2,701 meters. Image data from the telescope’s camera is sent via a 100 Gbps fiber optic network to the base facility at La Serena. Initial processing is carried out there and alerts of detected transient events are distributed to interested parties around the globe.
20 TB of data is sent each night to the SLAC (Stanford Linear Accelerator Center) National Lab at Stanford University in California, using a pair of 100 Gbps links.
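As a rough back-of-envelope check on those figures, 20 TB a night fits comfortably within the paired links: at the full 200 Gbps aggregate line rate the transfer would take under 15 minutes, though protocol overhead stretches that out in practice.

```python
# Back-of-envelope transfer time for the nightly Rubin-to-SLAC shipment.
nightly_bytes = 20e12              # 20 TB (decimal terabytes)
link_bits_per_s = 2 * 100e9        # two 100 Gbps links, ideal aggregate

seconds = nightly_bytes * 8 / link_bits_per_s
print(f"{seconds:.0f} s (~{seconds / 60:.1f} minutes at line rate, before protocol overhead)")
# -> 800 s (~13.3 minutes)
```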
Rubin LSST data handling diagram. For a higher-definition diagram, click here.
SLAC is the main processing center, and public details on Rubin’s software and data handling are limited. We know it uses the Apache Spark distributed SQL data engine and will have hot, warm, and cold storage running on SSDs, HDDs, and tape. A series of distributed, mirrored Rubin data portals are being set up for annual data set access, with the first in France, at the CC-IN2P3 center in Lyon, and in the UK, possibly at Edinburgh and using the IRIS network. There will be one in the public cloud, possibly AWS or GCP. Others have yet to be specified.
Astronomy researchers will be able to access these local Rubin datacenters through a web portal. Chilean researchers will be able to access the La Serena site.
An 18-frame Spectra Logic TFinity LTO-9 tape library system will be used to store the cold Rubin image data at SLAC. It stores data from other SLAC projects, such as the Linac Coherent Light Source (LCLS) particle accelerator and the Cryo-EM Center (S2C2). The Rubin archive will grow at a rate of 6 PB a year and SLAC expects its total storage needs to be around 2 EB by 2033.
SLAC will use HPSS hierarchical storage management software to operate and manage the archive. Rubin data storage at the distributed access sites in France, the UK, and elsewhere will be the responsibility of the site operators. The French CC-IN2P3 site will use, we understand, the Ceph file system and the Qserv database for storing its Rubin data. It also operates a Spectra Logic TFinity tape library.
Bootnote
The NSF–DOE Vera C. Rubin Observatory is funded by the US National Science Foundation and the Department of Energy’s Office of Science. It is a joint program of NSF NOIRLab and DOE’s SLAC National Accelerator Laboratory, which will cooperatively operate Rubin.
Cohesity is getting closer to MongoDB by providing more advanced performance and control capabilities for backup and recovery of MongoDB databases.
Cyber resilience supplier Cohesity says it’s among the first data protection software providers to deliver MongoDB workload protection through the MongoDB Third Party Backup Ops Manager API with its DataProtect offering. Rubrik also supports this API.
Document database supplier MongoDB produces one of the top five NoSQL databases and has a massive user community. It is a publicly traded company that reported revenues of $2 billion in fiscal 2025 and competes against traditional relational database heavyweights like Oracle and IBM. MongoDB sells both an on-premises version of its database and a cloud version, called Atlas, which is sold through the AWS, Azure, and GCP clouds.
Vasu Murthy
Vasu Murthy, Cohesity SVP and chief product officer, stated: “With ransomware attacks now commonplace, cyber resilience is a strategic priority for all organizations. This is particularly true of large enterprises, which have a very low tolerance for risk. Downtime for any reason can mean millions of dollars and massive reputational damage. As trusted providers for many of the world’s largest companies, Cohesity and MongoDB are working together to strengthen our customers’ ability to bounce back fast.”
Cohesity said its DataProtect-MongoDB integration, now generally available, features:
Parallel data streams to enable billions of objects to be processed instantaneously.
Backups that get customers’ MongoDB databases back online 4x faster than traditional methods.
A scale-out architecture providing petabyte-scale support on a single platform. Customers can reduce their data footprint with global, variable-length deduplication and compression.
Immutable write once, read many (WORM) storage, data encryption in flight and at rest, continuous data protection, secure SSL authentication, and a multi-layer defense posture based on Zero Trust security principles.
Business continuity and redundancy with protection of replica sets and sharded clusters with flexible secondary, primary, or fixed preferred backup nodes – enabling continuous availability and failover readiness. Customers can achieve stricter SLAs (both RPOs and RTOs) and eliminate data loss in high-velocity environments.
Support for sharded and Replica Set deployments.
Application-consistent backups across complex MongoDB deployments through tightly integrated snapshot orchestration.
Disaster recovery with restoration of MongoDB clusters in-place or to new environments following failures or ransomware events.
Safe evaluation of performance enhancements or upgrades on alternate hardware with no production downtime.
Seamless refreshment of development environments using out-of-place recovery from clean, consistent MongoDB backups.
A Cohesity blog says the integrated DataProtect-MongoDB offering “is designed for enterprises with large-scale, mission-critical MongoDB environments – think global banks, financial services firms, and Fortune 500 companies.”
It “auto-discovers MongoDB Ops Manager (OM) objects, enabling a frictionless and intuitive user experience. It also fully supports OM instances running in High Availability (HA) mode and with SSL encryption – ensuring secure, resilient protection for even the most demanding environments.”
Benjamin Cefalo
MongoDB’s SVP of Product Management, Benjamin Cefalo, said: “As the leading document database for modern applications, MongoDB empowers organizations to build, scale, and innovate faster. Our collaboration with Cohesity reinforces that mission by helping customers protect their data with robust, enterprise-grade resilience – without compromising the agility and performance developers expect from our platform.”
DDN says it has secured the top position against its competitors on the IO500 benchmark.
Alex Bouzari
The IO500 rates HPC (High-Performance Computing) storage systems with a single score reflecting their sequential read/write bandwidth, file metadata operations (file creation, delete, and lookup), and searching for files in a directory tree. There is a 10-node score for small HPC systems and a production score for unlimited node setups. DDN says that most enterprise AI workloads run in the 10-node category and there it is number one – at least when it comes to the benchmark – claiming it is outperforming competitors “by as much as 11x.”
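For context, the IO500 score is, broadly speaking, a geometric mean: a submission’s bandwidth phases (in GiB/s) and its metadata and find phases (in kIOP/s) are each reduced to a geometric mean, and the final score is the geometric mean of those two numbers. The Python sketch below is a simplified illustration of that reduction; the phase values are made up and not any vendor’s submission.

```python
from math import prod

def geometric_mean(values: list[float]) -> float:
    return prod(values) ** (1.0 / len(values))

def io500_style_score(bandwidth_gibps: list[float], metadata_kiops: list[float]) -> float:
    """Simplified IO500-style scoring: geometric mean of a bandwidth score and a metadata score."""
    bw = geometric_mean(bandwidth_gibps)   # ior easy/hard read and write phases, GiB/s
    md = geometric_mean(metadata_kiops)    # mdtest and find phases, kIOP/s
    return (bw * md) ** 0.5

# Illustrative, made-up phase results.
print(io500_style_score([120.0, 15.0, 140.0, 18.0], [900.0, 60.0, 800.0, 70.0, 1200.0]))
```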
DDN co-founder and CEO Alex Bouzari stated: “AI is transforming every industry, and the organizations leading that transformation are the ones that understand infrastructure performance is not optional – it’s mission-critical. Being ranked #1 on the IO500 benchmark is more than a technical achievement – it’s proof that our customers can count on DDN to deliver the speed, scale, and reliability needed to turn data into competitive advantage. DDN is not just ahead. We have left the competition behind.”
The IO500 results are published twice yearly at major HPC conferences (SC and ISC) and the latest 10-node results can be seen here. They are ranked by institution and the top 24 are shown in a table:
The DAOS storage system used at Argonne and LRZ gets the top two slots, 2,885.57 and 1,008.81 respectively, with Hudson River Trading getting the number three slot with its DDN EXAScaler system scoring 348.08.
WEKA appears at number 11 with a University of Virginia system scoring 105.94, while VAST enters at the number 17 slot with a 31.08 score from a Howard Hughes Medical Institute (Janelia Research Campus) system. DDN has separated out its own results, as well as those of rivals WEKA and VAST, in its own table:
As DDN’s release says it has “secured the #1 position against its competitors” on this list, it apparently does not rate DAOS systems as being competition for its EXAScaler and Lustre products.
DDN says the IO500 is “the gold standard for assessing real-world storage performance in AI and HPC environments.”
We should note that the IO500 is a file-oriented benchmark, not an AI benchmark. Were there to be such an AI-focused benchmark, it is not a given that the IO500 rankings would carry over one for one.
Full IO500 production List
The full production IO500 list, with no 10-node restriction, is rather different from the 10-node results, as you can see:
DAOS is still in the top two slots. WEKA appears in the number four slot, scoring 1826.86, with DDN behind it at number seven with 648.96. VAST is in the number 26 position with 47.19.
Competitors
We asked WEKA, VAST, and Hammerspace whether they had any comment on the IO500 results and DDN’s claims; their responses follow.
WEKA told us: “DDN’s recent IO500 claims deserve a closer look. They rank behind DAOS in the 10 Node Production list, and WEKA outperforms them in the broader Production category with a top score of 826.86, compared to DDN’s best-ever result of 648.96. These results, recently published by Samsung Electronics using WEKA, are publicly available on the official IO500 list and are clear, irrefutable, and independently validated results.
“The numbers speak for themselves:
Higher Overall Score: WEKA delivered an IO500 score of 826.86, achieved with only ~1.5 PiB of usable capacity and 291 client nodes—a fraction of the 42 PiB and ~2,000 nodes DDN used for a significantly lower score of 648.96.
5× Metadata Advantage: WEKA delivered ~2.75 million metadata IOPS, compared to DDN’s ~520,000, a 5× performance advantage—critical for metadata-heavy workloads such as AI training and genomics.
Over 2× Total IOPS: Combining bandwidth and metadata, WEKA achieved ~3 million total IOPS, more than 2× the total I/O capacity of DDN’s configuration (~1.45 million IOPS).
More Efficiency with Less Hardware: WEKA achieved higher performance with dramatically fewer resources—underscoring the architectural efficiency of WEKA’s architecture.
“In contrast, we believe DDN’s most recent production configuration is not reflective of real-world environments: a single metadata server with 32 MDTs, no storage servers, and 0 PiB of usable capacity. The result is even flagged by IO500 as having “Limited Reproducibility.”
“WEKA purpose-built its proprietary architecture to eliminate the need for special tuning or artificial configurations. Our performance scales automatically, consistently and predictably, and our results are reproducible, balanced, and representative of what our real-world customers run in production.”
Our understanding of the VAST point of view is that the IO500 is a synthetic benchmark that does not resemble real-world workloads. The data is normalized to there being 10 clients for each test, not to storage system size or client configuration, which leads to skewed comparisons. For example, the connected DDN cluster had 1,600 client cores versus 640 cores for the VAST setup.
VAST might say that customers submit IO500 results as a form of bragging. They tune highly tunable systems such as Lustre to get the best results; VAST systems don’t offer or need that kind of tuning. Fundamentally, storage performance is a function of how many SSDs a system has. Bigger systems are faster, and if you compare system performance without normalizing to system size in PB or number of SSDs, you’re not comparing an apple to an orange; you’re comparing a bushel of apples with a truckload of oranges.
From VAST’s viewpoint, organizations like MLCommons, with its MLPerf benchmarks, forbid non-standard comparisons like this through their Closed and Open testing divisions, which disincentivizes what VAST sees as a skewed kind of benchmark warfare.
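To make the normalization argument concrete, here is a trivial sketch of the kind of per-resource comparison VAST is describing, dividing a raw score by client cores or usable capacity. The inputs are placeholders for illustration, not real submissions.

```python
def normalized_scores(raw_score: float, client_cores: int, usable_pib: float) -> dict:
    """Express a benchmark score per unit of resource rather than as an absolute number."""
    return {
        "score_per_client_core": raw_score / client_cores,
        "score_per_usable_pib": raw_score / usable_pib,
    }

# Placeholder systems A and B: illustrative values only.
print(normalized_scores(raw_score=500.0, client_cores=2000, usable_pib=10.0))
print(normalized_scores(raw_score=100.0, client_cores=400, usable_pib=2.0))
```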
Hammerspace Global Marketing Head Molly Presley told us: “Hammerspace has already demonstrated linear, efficient scalability in the IO500 10-node challenge and in the MLPerf Storage benchmark. These results show that Hammerspace performs on par [with] proprietary parallel file systems like Lustre—while also being easier to deploy and better aligned with the needs of enterprise IT. We have not focused on “hero runs” optimized for a single, narrow benchmark submission, but rather on realistic, scalable configurations that matter to customers.
“Few organizations today are eager to deploy yet another proprietary storage silo just to enable AI or boost performance. What they need are data-centric solutions that provide full visibility and seamless access to their data—wherever it lives. Hammerspace delivers the ability to centrally manage a virtualized cloud of data across their entire estate. Whether on-premises, in the public cloud, or in hybrid environments, organizations gain unified control with intelligent, policy-driven automation for data placement, access, and governance. The result is unmatched agility, efficiency, and simplicity—without compromising performance. The future of AI data infrastructure is open, standards-based, and spans environments. That’s the world Hammerspace is building.”
Bootnote
In the IO500 “10-Node Research” category, not the production category, Hammerspace announced it “delivered 2X the IO500 10-node challenge score and 3X the bandwidth of VAST—using just nine nodes compared to VAST’s 128. This results in higher performance with a fraction of the hardware, power, and infrastructure complexity to deploy and manage. Hammerspace ranked in the Top 10 highest IOEasy Write and IOEasy Read score in the 10-node challenge.” It’s comparing its score in the 10-node Research list to that of VAST in the 10-node Production list. The IO500 benchmark suite is identical for both the 10-node Production and Research lists. Read more in a blog here.
Qumulo has introduced an architecture for multi-tenancy, Stratus, with each tenant getting its own virtually separate Qumulo environment isolated from the others by cryptography with per-tenant keys and key management systems (KMS).
Kiran Bhageshpur
The clustered, scale-out unified file and object storage supplier’s software can run on premises, in edge and datacenter sites, and in cloud-native form – the Cloud Data Fabric – in AWS, Azure, GCP and OCI, with a global namespace and support for NFS, SMB and S3 concurrently. Now Qumulo has added cryptographically isolated multi-tenancy for US federal, sovereign cloud, and regulated enterprise customers needing the assurance of data isolation with no sharing.
Qumulo CTO Kiran Bhageshpur stated: “With Qumulo Stratus’s innovative cryptographic isolation technology, sensitive data remains protected while offering the flexibility and efficiency necessary for mission-critical operations. This enables both federal and enterprise customers to concentrate on their core missions with the confidence that they can meet their regulatory compliance and security objectives without compromising performance or scalability.”
Every tenant operates as if it has its own private infrastructure while still benefiting from shared efficiencies: the actual infrastructure underneath, Qumulo’s scale-out DataCore and Cloud Data Fabric, is shared, while the data is not.
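As a generic illustration of cryptographic tenant isolation, and not Qumulo’s actual Stratus implementation, the sketch below gives each tenant its own key from a stand-in KMS, so data written for one tenant cannot be decrypted with another tenant’s key even though the storage underneath is shared. It uses the widely available Python cryptography package.

```python
from cryptography.fernet import Fernet, InvalidToken

# Stand-in KMS: in a real deployment each tenant's key lives in its own KMS/HSM.
tenant_keys = {tenant: Fernet.generate_key() for tenant in ("tenant-a", "tenant-b")}

def write_for_tenant(tenant: str, plaintext: bytes) -> bytes:
    """Encrypt data at rest under the tenant's own key before it touches shared storage."""
    return Fernet(tenant_keys[tenant]).encrypt(plaintext)

def read_for_tenant(tenant: str, ciphertext: bytes) -> bytes:
    return Fernet(tenant_keys[tenant]).decrypt(ciphertext)

blob = write_for_tenant("tenant-a", b"sensitive record")
print(read_for_tenant("tenant-a", blob))        # decrypts fine with the right tenant key
try:
    read_for_tenant("tenant-b", blob)           # wrong tenant, wrong key
except InvalidToken:
    print("tenant-b cannot read tenant-a's data, even on shared infrastructure")
```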
Qumulo Stratus diagram
The company says Stratus enables customers to maintain strict data and infrastructure isolation with the benefits and scalability of a shared-nothing data core. This delivers:
Cryptographically sealed tenancy ensuring each tenant’s data at rest and in transit is invisible, even to cluster administrators.
Disaggregated I/O and data planes – compute, cache, and protocol services are provisioned per tenant as containers, VMs, bare metal, or accelerated computing platforms, preventing “noisy‑neighbor” contention while letting organizations scale performance independently of capacity.
The unified file and object access enables analytics pipelines and legacy apps to share the same data without copies or gateways.
A single global namespace spans data centers and every major hyperscaler, which Qumulo says is backed by the vendor’s real‑time analytics and policy‑driven data placement. Customers can choose from AWS, Azure, Google Cloud, and Oracle Cloud to get the cloud services that suit them.
Tenant‑specific services, such as per‑tenant Active Directory, Domain Name Services, Hardware Security Modules, and SIEM/audit domains, aid Zero Trust compliance for government, defense, financial services, healthcare, and service‑provider environments.
Doug Gourlay
Qumulo president and CEO Douglas Gourlay said: “Enterprises and public‑sector agencies no longer have to choose between security and scale. Stratus erases that trade‑off. We’ve fused uncompromising cryptographic isolation with the elasticity of cloud and the performance of bare‑metal and accelerated computing, delivering a platform worthy of the world’s most sensitive data.”
From a competitive point of view, NetApp’s Data Fabric offers multi-tenancy via ONTAP’s Storage Virtual Machines with logical tenant data separation. It supports hybrid and multi-cloud environments via Cloud Volumes ONTAP and Azure NetApp Files.
Dell’s PowerScale OneFS provides multi-tenancy through Access Zones, SmartConnect, and Groupnets, with logical isolation of data, network, and authentication resources for multiple tenants such as departments, clients, or workloads. Public clouds are supported with PowerScale for Azure and Google Cloud, and partnerships with AWS and Oracle Cloud.
Stratus is entering limited availability and private preview and will be generally available in Qumulo Core 8 for certified on‑premises platforms, AWS, Azure, Google Cloud, and Oracle Cloud in the second half of 2025. A Stratus webinar is scheduled for 10:00 AM PST, Wednesday, June 23, here.
Kioxia has refreshed its CD8P product line, a single-port SSD using BiCS5 112-layer 3D NAND, with the speedier CD9P line and its denser BiCS8 218-layer NAND.
Capacities of these PCIe Gen 5 products have not changed much, but performance has improved considerably. The prior CD8P products had a CUA (CMOS under Array) design, while the newer CD9P uses CBA (CMOS directly Bonded to Array) technology, which helps improve performance and power efficiency.
Neville Ichhaporia
Neville Ichhaporia, SVP and GM of the SSD business unit at KIOXIA America, stated the CD9P is “engineered for today’s most demanding AI and accelerated computing environments. By combining our CBA-based architecture with PCIe 5.0 performance, the CD9P Series helps keep GPUs consistently fed with data – significantly improving utilization while driving greater efficiency, responsiveness, and scalability across AI-driven data center workloads.”
As with the CD8P, there are two versions of the drive: the CD9P-R read-intensive variant and the CD9P-V mixed-use version. Both come in two form factors: E3.S and a 2.5-inch, 15 mm-thick alternative.
Capacities and performance vary with version and form factor, as a Kioxia slide illustrates:
The drives double the maximum capacity available per SSD compared with the previous-generation CD8P mixed-use model’s 15.36 TB.
The maximum performance numbers for the CD9P drives are 2.6 million/750,000 random read/write IOPS and 14.8/7 GBps sequential read/write bandwidth.
For comparison the prior CD8P numbers are 2 million/400,000 random read/write IOPS and 12/5.5 GBps sequential read/write bandwidth.
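As a rough cross-check, the ratios of those headline maximums work out as follows; note that the “up to” and per-watt percentages Kioxia quotes below are presumably taken from specific capacity and form-factor models, so they will not all match these headline ratios exactly.

```python
# Ratio check on the headline maximum figures quoted above (CD9P vs CD8P).
cd9p = {"rand_read_iops": 2.6e6, "rand_write_iops": 750e3, "seq_read_gbps": 14.8, "seq_write_gbps": 7.0}
cd8p = {"rand_read_iops": 2.0e6, "rand_write_iops": 400e3, "seq_read_gbps": 12.0, "seq_write_gbps": 5.5}

for metric in cd9p:
    gain = (cd9p[metric] / cd8p[metric] - 1) * 100
    print(f"{metric}: +{gain:.0f}%")
# rand_read ~ +30%, rand_write ~ +88%, seq_read ~ +23%, seq_write ~ +27%
```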
The vendor says the CD9P Series 15.36 TB model delivers approximately 60 percent and 45 percent improvements in sequential read and write speeds per watt, respectively, compared to the previous generation SSD. It also achieves gains of approximately 55 percent and 100 percent in random read and write performance per watt, measured in thousands of IOPS (KIOPS), respectively.
We’re told the drives deliver 4-corner performance improvements of up to approximately 125 percent in random write, 30 percent in random read, 20 percent in sequential read, and 25 percent in sequential write speeds compared to the previous CD8P generation. The 4-corner concept represents the four corners of a drive’s peak performance: random read and write IOPS, and sequential read and write bandwidth.
The still-faster Kioxia dual-port CM9-R (read-intensive) drive, also a PCIe Gen 5, BiCS8 SSD, provides up to 3.4 million/800,000 random read/write IOPS and 14.8/11 GBps sequential read/write bandwidth.
The CD9P drives have a PCIe 5 NVMe 2.0 interconnect which is NVMe-MI 1.2c compliant. They have OCP Datacenter NVMe SSD spec v2.5 support (not all requirements), CNSA 2.0 signing algorithm support (prepared for the threat posed by Quantum Computers), SIE, SED, and power loss data protection (PLP) features.
KIOXIA CD9P Series SSDs are sampling to select customers and will be showcased at HPE Discover Las Vegas 2025, taking place June 23–26.