
Storage news ticker – March 4

Cloudian will present two technical sessions and demos of its AI-optimized storage at Nvidia GTC 2025, Booth 219, in the San Jose Convention Center, March 17-21. Michael Tso, CEO and co-founder of Cloudian, will join Sunil Gupta, co-founder, CEO, and managing director of Yotta Data Services, and Kevin Deierling, senior vice president of Networking at Nvidia, to present “Pioneering the Future of Data Platforms for the Era of Generative AI” on Thursday, March 20, at 1000 PDT. Peter Sjoberg, VP WW Solution Architects at Cloudian, will participate in a panel discussion titled “Storage Innovations for AI Workloads” on Monday, March 17, at 1600 PDT.


Cohesity has been named, for the seventh consecutive time, a Customers’ Choice in the February 2025 Gartner Peer Insights Voice of the Customer for Enterprise Backup and Recovery Software Solutions report.

Commvault says its Cleanroom Recovery and Cloud Rewind software, built on Azure, can now be used to recover electronic health records (EHR), including Epic and Meditech environments. It is exhibiting these solutions at HIMSS25 from March 3-6 in Las Vegas.

Ha Hoang, Commvault

Commvault has appointed Ha Hoang as its new chief information officer (CIO). Hoang has “over 25 years of experience in leading enterprise technology transformations for Fortune 500 companies.” Hoang will focus on advancing next-generation cloud, security, and AI technology initiatives and operations. She will also work closely with Commvault’s engineering and product teams. Additionally, she will engage directly with customers, showcasing how Commvault’s technology can drive new levels of resilience.

CoreWeave, the GPU-as-a-Service startup using lots of VAST Data storage, has filed for an IPO. Its S-1 document is available on the SEC website here.

CoreWeave S1 filing doc image

Forcepoint has published an eBook, “Executive Guide to Securing Data within Generative AI,” describing how it helps secure GenAI implementations to reduce risk, prevent data loss, and simplify compliance. Get it here.

GridGain, creator of the Apache Ignite distributed in-memory computing platform, has contributed to Ignite’s latest v3.0 major release, which simplifies installation, improves configuration management, enhances API usability, and significantly boosts performance. New capabilities such as SQL transactions enable new use cases, and architectural improvements make deployments more resilient and efficient. For new users, Apache Ignite 3.0 offers an easier learning curve and streamlined first-time installation and configuration. Learn more in a blog post.

IBM and Nvidia have published a new reference architecture (RA) covering the Nvidia DGX BasePOD with IBM Storage Scale and the IBM Storage Scale System 6000, supporting the Nvidia DGX H100, H200, and B200 systems. Check it out here.

Knowledge graph startup Illumex announced the integration of Illumex Omni into Microsoft Teams. It says Omni is the first enterprise-grade GenAI structured data offering to be directly available in the Teams workspace, “vastly enhancing the agentic AI experience for hundreds of millions of Microsoft users worldwide.” It’s “grounding Agentic AI responses in verifiable, deterministic data” and says its context-aware semantic data fabric ensures every user prompt is matched to a certified definition with built-in governance, delivering accurate and trustworthy answers on data and analytics.

To focus more on TrueNAS, iXsystems has offloaded its server business to partner Amaara Networks.

MSP data protector N-able’s growth is slowing. It reported Q4 calendar 2024 revenues of $115.5 million, up 7.5 percent year-over-year and beating guidance, but almost flat sequentially, with a profit of $31 million. Full calendar 2024 revenues were $466.1 million, up 10.5 percent. It expects revenue of $115.5 million next quarter, down sequentially and up just 1.5 percent year-over-year. William Blair analyst Jason Ader says the guidance factors in: 1) lower upfront revenue recognition from on-prem customers moving to long-term contracts (a year-over-year revenue growth headwind of 5 percent for the first quarter and 4 percent for the full year), 2) lower average foreign exchange rates compared to 2024, and 3) increased strategic investments from integrating Adlumin into the business and developing a new site in India to expand R&D capacity.

N-able revenues

Beth O'Callahan, NetApp

NetApp has promoted EVP and chief legal officer Beth O’Callahan to chief administrative officer and corporate secretary roles “as the company continues to drive and strengthen collaboration across the business.” She will continue to oversee legal, compliance, government relations, and sustainability, while assuming responsibility for human resources, workplace experience, and corporate communications effective March 3. 

A Phison Pascari SSD is now on its way to the Moon with the successful SpaceX launch of the Intuitive Machines IM-2 mission with its Athena landing vehicle. Athena completed a scheduled 492-second main engine Lunar Orbit Insertion (LOI) burn at 0627 CST on March 3 and is currently orbiting the Moon. Flight controllers plan to analyze data to verify the lander’s targeted circular orbit and confirm Athena’s landing time. Athena is expected to send lunar orbit selfies over the next two days before a landing attempt on March 6.

Athena includes Lonestar Holding’s Freedom payload, which is intended to check out datacenter storage operations on the Moon. It uses a Phison Pascari SSD pressure-tested to withstand cosmic radiation, harsh temperature variation, vibrations, and disturbances from lunar launches and landings. Lonestar says: “Once established, Freedom will provide premium, high-performance Resiliency Disaster Recovery and advanced edge processing services to NGOs, government, and enterprise customers.”

Vector database startup Pinecone has a new architecture to increase performance and efficiency. It features a transition from global partitioning to an LSM-based local partitioned architecture. This enables it to dynamically apply different ANN algorithms at different slabs. It’s implementing quantization at different tiers of storage to reduce memory footprint while maintaining accuracy. There are specific optimizations for cold query performance and cache warming.

It has implemented disk-based metadata indexing for single-stage filtering and has new bitmap approaches for high cardinality use cases (such as ACLs). There is a freshness layer redesign through Memtable for more predictable freshness and the introduction of provisioned capacity.

All these improvements will be automatically available, with Pinecone rolling them out behind the scenes to existing serverless users over the next month or so, requiring no action on their part. There are more details here. Pinecone plans to share detailed benchmarks in mid-March, including cold-start latency comparisons and recall/performance benchmarks.
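Quantization of this kind trades a little precision for a large memory saving. As a minimal sketch (illustrative only; Pinecone has not published its scheme at this level of detail), scalar quantization maps float32 vector components to int8 with a shared scale factor, cutting the per-component footprint from four bytes to one:

```python
def quantize_int8(vectors):
    """Scalar-quantize float vectors to int8 using one shared scale factor.

    Illustrative only: production tiered-quantization schemes are typically
    per-segment and often product- or residual-quantized instead.
    """
    max_abs = max((abs(x) for v in vectors for x in v), default=0.0) or 1.0
    scale = max_abs / 127.0
    quantized = [
        [max(-127, min(127, round(x / scale))) for x in v] for v in vectors
    ]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate reconstruction; per-component error is at most scale/2."""
    return [[x * scale for x in v] for v in quantized]


# Hypothetical two-dimensional vectors for illustration.
q, scale = quantize_int8([[1.0, -0.5], [0.25, 0.0]])
approx = dequantize(q, scale)
# Each component is now stored in 1 byte (int8) rather than 4 (float32).
```

Nearest-neighbor distances computed on the dequantized values stay close to the originals, which is the accuracy-versus-memory trade-off the tiered approach exploits.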

Precisely announced advancements to the Precisely Data Integrity Suite, including AI-driven innovations, an enhanced Data Governance service, and expanded data integration capabilities. With the addition of AI Manager, customers can leverage GenAI capabilities within the Data Integrity Suite while maintaining complete control over data sovereignty and compliance. Organizations can register their large language model (LLM) credentials with the Data Integrity Suite, starting with Amazon Bedrock, ensuring their use of external LLMs complies with their legal and procurement requirements. A new connector enables data delivery from source systems to Snowflake, allowing near-real-time accessibility. A blog tells you more.

Precisely announced the promotion of Joel Chaplin to CIO and Dave Shuman to chief data officer (CDO). Both internal promotions follow the retirement of Amy O’Connor, former chief data and information officer. Chaplin will report to Jason Smith, chief financial officer, while Shuman will report to Chaplin.

Rubrik and Pure Storage have partnered to deliver a reference architecture that enables organizations to unify, manage, and secure unstructured data at scale. The RA combines Pure’s FlashArray file services, Fusion, and security features such as ActiveDR and SafeMode snapshots with Rubrik’s NAS Cloud Direct and Cloud Vault cyber-resilience facilities. FlashBlade can be a backup target. A 13-page Rubrik white paper, “Unlocking the Power of Unstructured Data with Rubrik & Pure Storage,” tells you more. Access it here (registration required).

Object storage supplier Scality announced record channel partner and revenue growth. Its global partner ecosystem has doubled in size year-over-year. Q4 2024 saw a record-breaking 60 percent of sales driven by the VAR community. Scality’s VAR channel is now the top driver of sales for the ARTESCA product line and augments the continued strong business growth seen through its strategic alliance with HPE. There are 400-plus channel partners and over 1,000 Scality certified partner personnel worldwide.

SK hynix is nearing completion of its deal to buy Intel’s Solidigm flash operation, with a final payment of $2.235 billion to be made next month.

Snowflake and Microsoft have an expanded partnership that makes OpenAI’s LLMs available for users within Snowflake Cortex AI on Microsoft Azure. It will make Snowflake Cortex Agents available in Microsoft 365 Copilot and Teams, allowing millions to interact with their structured and unstructured Snowflake data in natural language directly from the apps they use. Snowflake says this marks a watershed moment, cementing the company as the only data and AI vendor to natively host both Anthropic and OpenAI’s models within its platform.

Snowflake’s Q4 FY 2025 revenues were $986.7 million, up 28 percent year-over-year, with a GAAP loss of $325.7 million. Full FY 2025 revenues were $3.36 billion with a loss of $1.3 billion. William Blair analyst Jason Ader said the top-line outperformance was driven by stabilization in the core business (NRR of 126 percent), solid contribution from new data engineering and AI products (Snowpark contributed 3 percent of product revenue), and consumption outperformance from technology customers.

Data lakehouse supplier Starburst has written a blog about how great its technology is for feeding data to AI apps, titled “The 3 Foundations of an AI data architecture.” Starburst says it addresses what you need from your data stack.

Data warehouser Teradata announced Teradata Enterprise Vector Store, an in-database facility with future expansion to include integration of Nvidia NeMo Retriever microservices. Enterprise Vector Store can process billions of vectors and integrate them into existing enterprise systems, with response times in the tens of milliseconds. The offering creates a single, trusted repository for all data and builds on Teradata’s support for RAG, while working towards dynamic agentic AI use cases, such as the “augmented call center.” Teradata Enterprise Vector Store is available in private preview, with GA expected in July.

TrendForce released its Q4 2024 global flash brand supplier rankings, saying that the NAND market faced downward pressure as PC and smartphone manufacturers continued to clear inventory, leading to significant supply chain adjustments. Consequently, NAND flash prices reversed downward, with ASP dropping 4 percent quarter-over-quarter, while overall bit shipments declined by 2 percent. Total industry revenue fell 6.2 percent quarter-over-quarter to $16.52 billion.
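As a rough sanity check, the revenue change should approximately equal the product of the ASP and bit-shipment changes; the small residual versus the reported 6.2 percent decline is attributable to product mix and blended averages:

```python
# TrendForce's reported quarter-over-quarter changes for Q4 2024 NAND.
asp_change = -0.04    # ASP down 4 percent
bits_change = -0.02   # bit shipments down 2 percent

# Revenue = ASP x bits, so the implied revenue change is the product.
implied = (1 + asp_change) * (1 + bits_change) - 1
print(f"{implied:.1%}")  # -5.9%, close to the reported -6.2% revenue drop
```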

Cloud object storage supplier Wasabi has joined the Active Archive Alliance, which says an active archive enables data owners to build cost-effective, intelligent, online archival storage systems that combine disk, flash, optical, and tape in the datacenter and cloud. Using technologies such as metadata and global namespaces, the data management layer of an active archive keeps data readily accessible, searchable, and retrievable on whatever storage platform or media it may reside in.

Infinidat, Huawei, NetApp top customer ratings for primary storage

Research from tech analyst Gartner finds primary storage customers prefer Huawei, Infinidat, and NetApp over Dell, HPE, Hitachi Vantara, Pure Storage, and other suppliers.

This finding is presented in Gartner’s latest Voice of the Customer report for primary storage platforms, complete with the familiar four-box diagram:

The Voice of the Customer report “includes vendors with products aligned to the market that have 20 or more eligible published reviews (and 15 or more ratings for ‘Capabilities’ and ‘Support/Delivery’) during the 18-month consideration period ending 31 December 2024. Reviews from vendor partners or companies with less than $50 million in revenue are excluded.” The reviews come from Gartner’s Peer Insights facility in which customers rate their supplier’s products.

Gartner says the primary storage platform market, served by enterprise storage array suppliers, “addresses the need of infrastructure and operations (I&O) leaders to operate and support standardized enterprise storage products, along with platform-native service capabilities to support structured data applications.”

Customer reviews are plotted against two axes – a vertical overall experience rating and a horizontal user interest and adoption rating. The midpoint of each axis is the average rating (market average). The top right Customers’ Choice box has suppliers with above-average ratings on both axes. In alphabetical order, Huawei, Infinidat, and NetApp are present.

The lower left Aspiring square has suppliers with below-average ratings on each axis and includes Dell, HPE, Pure Storage, and Synology. The Established square on the lower right has above-average user interest and adoption rating but below-average rating for overall experience. Here we see Hitachi Vantara.

The upper left Strong Performer box has an above-average overall experience rating but a below-average user interest and adoption rating. Its sole resident is TrueNAS supplier iXsystems.
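The placement rule reduces to two comparisons against the market averages. A sketch of that logic (with hypothetical ratings and an assumed market average; Gartner’s exact averaging and tie-breaking rules are not public):

```python
def voc_quadrant(experience, interest, avg_experience, avg_interest):
    """Classify a supplier into a Voice-of-the-Customer style box by
    comparing its two ratings against the market averages."""
    if experience > avg_experience and interest > avg_interest:
        return "Customers' Choice"   # above average on both axes
    if experience > avg_experience:
        return "Strong Performer"    # experience above, interest below
    if interest > avg_interest:
        return "Established"         # interest above, experience below
    return "Aspiring"                # below average on both axes


# Hypothetical supplier ratings against an assumed market average of (4.5, 4.2):
print(voc_quadrant(4.8, 4.6, 4.5, 4.2))  # Customers' Choice
print(voc_quadrant(4.7, 4.0, 4.5, 4.2))  # Strong Performer
print(voc_quadrant(4.3, 4.6, 4.5, 4.2))  # Established
print(voc_quadrant(4.2, 4.0, 4.5, 4.2))  # Aspiring
```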

You need to see individual supplier profiles to understand the differences that determine box placements.

Gartner’s famous Magic Quadrant diagram also rates suppliers but the rating is done by Gartner analysts, not independent customer reviews.

A reprint of this Gartner Voice of the Customer primary storage report is available, as one might imagine, from a vendor that placed well in the assessment, in this case Infinidat, here. You need to be a Gartner client for detailed review access.

Storage challengers threaten incumbents as acquisition pressure mounts

Comment Massive scale-out, file, and object storage winds are blowing through the incumbent storage supplier halls and threaten to become a cat 4 hurricane that will upend them. Yet they are not running for the acquisition hills.

There are three front-runners of this rising tempest – DDN, VAST Data, and WEKA, with Hammerspace accompanying them – and they are rushing towards the established incumbents – Dell, HPE, Hitachi Vantara, IBM, NetApp, and Pure Storage. All of them can see it coming and all are assessing how fierce the storm is going to become.

Several of them have made accommodating moves already, aiming to develop their own technology.

HPE leads here. It has adopted a version of VAST Data’s disaggregated, shared-everything (DASE) architecture with its Alletra MP X10000 system, poised to match VAST from inside HPE’s GreenLake and partner ecosystem. NetApp has instituted a project to produce an AI-focused ONTAP. Dell has a project to parallelize PowerScale. 

Pure Storage has indicated it is making moves into the AI training storage arena via exec comments in its recent earnings call. IBM is buying upper-stack AI software supplier DataStax but is not getting directly involved in AI mass scale-out file and object storage.

Even Quantum has its Myriad OS development focused on this mass scale-out file and object storage arena.

VAST is the front-runner among the challengers, with its DASE product now supporting block, file, and object storage within a globally distributed namespace and data space. It also offers Kafka event broking and has established a virtual dominance in the GPU-as-a-Service AI server farms market, which includes CoreWeave.

DDN is buoyed by years of HPE success, a longstanding Nvidia partnership and its new Infinia OS supporting multiple data access protocols. It has recently taken in $300 million in private equity funding.

WEKA has surged through the legacy HPE parallel file systems market as HPC-style storage is being used by enterprises for AI-style workloads. And yet WEKA appears to be facing a reset, having experienced a raft of executive departures. The company raised $140 million in an E-round of VC funding in May last year, with a $1.6 billion valuation. 

When a technology tsunami has hit storage incumbents in the past, they have embraced it in one of two ways: build their own tech or buy it in. HPE has been a big acquirer in the past. So too have Dell, EMC, Hitachi Vantara, NetApp, and even Pure.

Thus far, there have been no major storage acquisitions by the incumbents, only internal developments by Dell, HPE, NetApp, and Pure. Is the acceleration of the AI storage market demand build-up now reaching a point where it’s necessary for them to think about buying in the new tech they need?

Could Dell, no stranger to substantial storage acquisitions (EMC), be thinking of making a move here? Could IBM move downstack and mimic its prior Storwize buy? Ditto for NetApp realizing it needs to move faster and doing a follow-on to its SolidFire and StorageGRID buys? Ditto again for Hitachi Vantara.

B&F is starting to think that a major acquisition by one of the incumbent storage industry giants is overdue. They need to hedge their go-it-alone bets. The AI-focused mass scale-out, multi-protocol, high-speed storage barbarians – DDN, Hammerspace, VAST, and WEKA – are at the gates. Which one of the incumbents will be the first to splash the cash and buy in?

Chinese scientists spin molecular hard disk drive idea

Chinese scientists have devised a so-called molecular hard drive, an organic device for archiving, with multi-bit encrypted storage molecules written and read using an atomic force microscope.

The notion is presented in a Nature paper, Molecular HDD logic for encrypted massive data storage, published in February, and says “molecular electronics distinguish themselves with extreme potential for ultrahigh density information storage and logic applications.”

The basic HDD unit consists of ~200 organometallic complex molecules (OCM) deployed in a self-assembly monolayer (SAM) configuration. They are read and written with a conductive atomic force microscope (C-AFM) tip, which has a front-end radius of 25 nm. Digital information is written by altering the physicochemical states of the molecules, stored as the molecules’ redox (reduction-oxidation) and ion accumulation state, and read by sensing tiny bit currents in the material.

C-AFM tip concept

A C-AFM tip is used in high-resolution scanning to touch and measure the surface height of material at the nanoscale and also its electrical conductance. The tip is on the end of a cantilever and moves up and down as the surface is passed below it. A mirror on top of the cantilever moves, altering the reflected position of a laser beam directed at it, indicating the tip’s deflections. The reflected laser light is measured by a photodiode.

A voltage is applied between the tip and the sampled material, and local picoampere-to-microampere-scale electrical currents are measured.

The data-carrying molecules are made from “redox-active transitional metal cation (Rux+), organic ligands of carbazolyl terpyridine (CTP) and terpyridyl phosphonate (TPP), as well as driftable halogen anions (Cl),” referred to as RuXLPH.

Ligands are ions or neutral molecules that bond to a central metal atom or ion. An ion is an atom or molecule that has lost or gained one or more electrons and so has a net electric charge. A cation is a positively charged ion and an anion is a negatively charged one. This means that a redox-active transition metal cation is a positively charged ion of a transition metal, ruthenium (Ru) in this case, that can gain or lose electrons in a redox reaction. The “Rux+” expression denotes a positive charge (+), with the x indicating the +2 to +8 oxidation state.

This molecule can have up to 96 conductance states, roughly equivalent to the voltage states in multi-level cell NAND. Hexa-level cell NAND has 6 bits and 64 states, and hepta-level cell flash has 7 bits and 128 states. The 96 conductance states “enables at least 6-bit storage for high-density data archiving applications.” This means, the researchers say, “the disk volume required to store the same amount of information with the RuXLPH monolayer based molecular HDD can be effectively reduced to 16.7 percent (1/6), in comparison to that of the traditional binary magnetic hard disks.” This is on a per-platter basis.
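The states-to-bits arithmetic checks out: n distinguishable levels encode floor(log2 n) whole bits, and a 6-bit cell holds six times the data of a 1-bit magnetic domain, giving the claimed 1/6 (~16.7 percent) volume:

```python
import math

def usable_bits(states: int) -> int:
    """Whole bits encodable with a given number of distinguishable levels."""
    return math.floor(math.log2(states))

print(usable_bits(96))    # 6, since 64 <= 96 < 128
print(usable_bits(64))    # 6, hexa-level cell
print(usable_bits(128))   # 7, hepta-level cell

# Volume needed relative to a 1-bit-per-domain binary magnetic disk:
print(round(100 / usable_bits(96), 1))  # 16.7 percent, i.e. 1/6
```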

Even more conductance states could be achieved, increasing the bit level further. The device the researchers envisage has “ultralow power consumption of pW/bit range.”

Diagrams in the paper illustrate the researchers’ concept: 

Standard magnetic HDD vs molecular HDD concept
C-AFM tip and recording surface diagram

The paper discusses applying encryption to the stored data for enhanced security. The researchers also envisage a reinvented floppy disk. “In the future, combining the deliberate molecular design cum synthesis strategy, partitioned assembling of customized molecules, and use of flexible substrates, the molecular HDD may even evolve into floppy disks for high-density, high-security portable digital gadgets.”

Comment

The researchers hold out the prospect of a disk-based storage system matching or exceeding tape archive density. However, the working life of an atomic force microscope tip is currently measured at 50-200 hours in intermittent touch (tapping) mode versus 5-50 hours in continuous touch mode.

Unless and until a long-lasting C-AFM tip can be created, this would seem to be a fatal flaw in their molecular hard drive concept.

A second point is that the device has “ultralow power consumption of pW/bit range,” but this is for reading and writing, not spinning the disk, which would take more power.

Storage industry bets big on AI – but where are the killer apps?

Comment: The storage industry is making across-the-board investments to support generative AI workloads, but where are the must-have apps or supplier moats to sustain this investment?

From vector database startups such as Pinecone and Weaviate, to RAG pipeline developers such as Komprise, to fast GPU data delivery from virtually all storage suppliers, it looks as if the entire storage industry has mobilized to support AI workloads, use GenAI models to manage storage better, and feed protected data to RAG LLMs for better response generation.

Apart from GenAI’s new and niche role as a storage management tool, GenAI workloads are just another storage workload. The industry wants to support substantial workloads that need stored data, as it always has done, and GenAI training and inference are two such workloads, with training having dominated so far. But inference is where the continual and sustained data access is expected to happen: centrally in datacenters, dispersed to edge devices, and accessed remotely in public clouds. Inferencing will be all-pervasive, taking place everywhere; witness views from IT industry execs such as Michael Dell and Marc Benioff, and suppliers such as VAST Data and DDN.

“AI is going to be everywhere,” Dell declared at the Dell World event in Las Vegas in May last year. Salesforce CEO Marc Benioff said at the Dreamforce conference in September 2024 that there will be a billion AI agents deployed across enterprises within a year. And yet generative AI makes mistakes, getting information wrong or imagining things – hallucinating.

An OpenAI LLM has miscounted the number of times the letter “r” occurs in the word “strawberry.” The AI incorrectly states that the word “strawberry” contains only two “r” characters, despite the user querying multiple times for confirmation. Part of the reason for this is that the LLM is not actually counting the Rs. It is trained to predict the next word or phrase in a sentence, given the context set by a user’s prompt.
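The correct answer is trivially computable deterministically, which is exactly what a token-predicting model does not do:

```python
# A deterministic count, as opposed to next-token prediction.
word = "strawberry"
print(word.count("r"))  # 3: st(r)awbe(r)(r)y, not the chatbot's 2
```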

Suppose an AI LLM/agent were a human assistant employee to whom you gave requests, and it miscounted the number of letters in a word or the number of corners in a polyhedron, giving flat-out wrong answers. Would you employ them?

Obviously not. Even Microsoft boss Satya Nadella is doubting the overwhelming onrush of AI. Let’s suppose the AI is domain-specific and can transcribe conversations or translate foreign languages. Would you use that? Yes, you probably would. I use Rev to transcribe recorded interviews. It’s around $10 a month, cheap, fast, and good enough. Not perfect, but good enough, with the occasional replaying of the recording to understand a particular word.

But Rev is not a chatbot, and neither is Otter. OpenAI’s ChatGPT is the chatbot that started the whole AI frenzy and now we have its various versions, Anthropic’s Claude, xAI’s Grok, DeepSeek, and more. Fifteen of them were shown in an xAI image:

Various AI models

From an ask-a-general-question point of view, none of the gigantic, multibillion-dollar AI startups, such as OpenAI, are profitable. None of them have yet produced a killer app that organizations must have and will pay lots of money for the privilege. They are the ChatGPT, Claude, Copilot, Perplexity, and Groks of our AI frontier, generalized search engines enabling us to sidestep the dross that is a Google query result with its four sponsored results hitting your screen before the proper results, which are diluted by SEO spam.

Bing uses Copilot AI

And then there is the Bing search engine with queries hijacked by Copilot, which tells you it can make mistakes and wants you to repeat your query that you gave to Bing in the first place. 

Copilot AI

How can you take AI chatbot software like this seriously? Just walk away.

Of course, it is very early days. Pay attention to AI naysayers like PR exec Ed Zitron and his “There is no AI Revolution” shtick, then try to form a balanced view between AI chatbots and agents as a general human good on the one hand and AI snake oil sellers backed by credulous VCs on the other, and you find it’s hard. The extreme views are too far apart.

The AI agent revolution may happen, in which case lots of storage hardware and software will be needed to support it. Or it won’t, in which case AI-driven storage hardware and software sales will be much lower. The message from this is: don’t move away from your non-AI customer base and technology just yet. Hedge your bets, storage people.

Dell server sales boom but storage lags

Dell reported $23.9 billion in revenue for the quarter ended January 31, 2025, up 7 percent year-over-year, with a 27 percent jump in GAAP profit to $1.53 billion. Full FY 2025 revenues were $95.6 billion, up 8 percent year-over-year, with a 36 percent leap in profit to $4.58 billion. There was record full-year and Q4 profitability.

Vice chairman and COO Jeff Clarke stated: “We grew our Infrastructure Solutions Group revenue by 22 percent, and we’re well positioned to capture growth across every segment of our business. Our prospects for AI are strong as we extend AI from the largest cloud service providers into the enterprise at scale, and out to the edge with the PC. The deals we’ve booked with xAI and others put our AI server backlog at roughly $9 billion as of today.”

CFO Yvonne McGill said: “FY 25 was a transformative year – we hit $95.6 billion in revenue, grew our core business double digits, unlocked efficiencies, and drove record EPS. We’re raising our annual dividend by 18 percent, demonstrating our commitment to shareholder return and confidence in our opportunity to grow in FY 26.” Dell also announced a $10 billion increase in share repurchase authorization.

Quarterly financial summary

  • Gross margin: 23.7 percent vs 24.1 percent a year ago
  • Operating cash flow: $600 million
  • Free cash flow: $474 million vs $1 billion a year ago
  • Cash, cash equivalents, and restricted cash: $3.6 billion vs $7.4 billion a year ago
  • Diluted EPS: $2.15, up 30 percent year-over-year

There are two Dell business units, ISG and CSG. Infrastructure Solutions Group (ISG – servers, storage and networking) revenues amounted to $11.4 billion, up 22 percent year-over-year, while Client Solutions Group (CSG – PCs and notebooks) recorded a mere 1 percent rise to $11.9 billion. Commercial CSG sales rose 5 percent year-over-year to $10 billion but consumer sales dropped 12 percent to $1.9 billion. A PC refresh cycle can’t come soon enough.

Within ISG, servers and networking revenue rose 37 percent to $6.6 billion, while storage limped along behind, growing just 5 percent to $4.7 billion. This growth rate was better than NetApp’s latest quarterly increase of 2 percent, but worse than Pure’s 11 percent.

AI server sales are bolstering ISG server revenues compared to lagging storage, but storage is showing a sequential rise while server sales are declining sequentially

Server sales rose sharply, but it appears storage sales held ISG back. Why? CFO McGill said: “The overall [storage] demand environment is lagging that of traditional servers,” but “we see some promising trends.”

It was, Dell said, the “second consecutive quarter of storage revenue growth at +5 percent year-over-year, with record PowerStore demand that grew double digits in the quarter.”

There was also double-digit demand growth in PowerScale (formerly Isilon) and continued growth in PowerFlex (disaggregated block + file offering) demand. PowerFlex uses Dell’s own IP and can grow from three to more than 2,000 nodes with separate compute and storage node scaling. It uses flash media with data distributed across the nodes for parallelism.

Clarke commented in the earnings call: “We are well positioned in some of the fastest-growing categories within storage as customers shift towards disaggregated architectures.”

Dell sees a strongly growing AI opportunity. There was a $4.1 billion AI order backlog exiting Q4. But in February it booked more server deals with xAI and others to lift the backlog to $9 billion. Its AI optimized server pipeline continues to expand sequentially.

Trad server demand is substantial, up double digits year-over-year for the fifth consecutive quarter.

Dell talked about pivoting to more of its own storage IP, gaining profitability. Clarke said: “You’re seeing a pivot to our Dell IP storage. Modern workloads demand an architecture that can be flexible, efficient, optimizes performance. And we think a disaggregated architecture is the right answer with the modern workloads.” He’s talking about PowerFlex.

He added: “That presents a headwind of our large position that we have in HCI (hyperconverged infrastructure), which will become smaller. But we’re going to overcome that by taking share in our Dell IP storage portfolio across the board in the midrange.”

He admitted: “There’s revenue that we’ll see go away at a lower margin rate, the HCI business. We have a secular decline in the high-end space where we’re the market leader with our PowerMax product. So we’re going to overcome those and drive the growth.”

An analyst asked about what will happen with storage as AI server sales grow. Clarke was bullish: “AI needs data. It devours data. You got to feed the beast. The feeding of that beast … has to be closer to where the computational capability is. So hot and warm storage, the notion of parallel file systems, unstructured file systems, data management tools that help find data and help data be ingested are the opportunities.”

“We have the leading platform for unstructured data. We continue to make it better with the F910 and F710 [PowerScale systems] that I mentioned earlier. Nearly a year ago, we talked about a parallel file system that we are building, Project Lightning, we referred to it. So we’re coming to the marketplace with an AI-driven parallel file system.” 

“And our Dell data lakehouse allows us to help customers prepare their information, manage their information and adjust their information. Our sales force is incented to attach storage with AI opportunities.”

The Dell execs also talked about modernizing Dell through simplification, standardization, and automation so that it can grow while reducing operating expenditures. Clarke declared: “We are building a new company … We are modernizing the work, the workflows, taking steps out of processes, taking out manual touches, simplifying and standardizing those processes, applying automation.”

Dell is eating its own AI dogfood. Clarke said: “We are deploying AI in the enterprise. The broad categories of use cases are industry-known, whether that’s content creation and management, support assistance, natural language search, design and data creation, cogeneration or document automation. Those are broad enterprise use cases. We are deploying those types of technologies inside our company and seeing tremendous efficiency from that and it is durable. It’s not a one-timer.”

The next quarter (Q1 FY 2026) outlook is for revenues of $23 billion ± $500 million, a 3 percent rise at the midpoint. The full FY 2026 outlook is $103 billion ± $2 billion in revenues, an 8 percent rise.

Storage news ticker – February 28


Backup and cloud storage supplier Backblaze announced results for Q4 2024 and the full year. Q4 revenues were $33.8 million, up 18 percent, with a GAAP loss of $14.4 million. The Computer Backup segment brought in $16.7 million, while B2 Cloud Storage delivered $17.1 million, exceeding Computer Backup for the first time.

Full-year revenues were $127.6 million with a loss of $48.5 million.

DataStax, which supplies its self-hosted NoSQL Enterprise and Astra DB database with vector embeddings support as a service, is being bought by IBM. DataStax is the main commercial backer of the open source Cassandra project. Analyst Jason Ader says: “The acquisition of a prominent NoSQL database vendor focused on unstructured data management should nicely complement IBM’s long-time Db2 relational database offering.” The deal should “broaden the core capabilities of IBM’s watsonx GenAI platform, especially around managing unstructured and semi-structured data and simplifying an enterprise’s ability to develop cutting-edge AI applications around that data.” 

Financial details were not disclosed, but Ader notes that “in June 2022, DataStax was valued at $1.6 billion (following a $115 million funding round led by Goldman Sachs), and in December 2022, DataStax disclosed total ARR of over $200 million alongside NRR of 120 percent and GRR of over 90 percent. The acquisition is expected to close in the second quarter of 2025.”

Dean Koestner left his position as DDN‘s SVP of Sales and has joined Nvidia as VP of its Federal operation.

DeepTempo announced new capabilities for Tempo, its deep learning-powered cybersecurity offering available as a Snowflake Native App on the Snowflake Marketplace. It has enhanced fine-tuning, MITRE mapping integration, and compatibility with existing SIEM systems, and can map detected anomalies to their most likely MITRE ATT&CK sequences, providing enhanced context and insights. Fine-tuning capabilities allow organizations to adapt models to their specific environments. Find out more here.

HPE’s Alletra Storage MP X10000 has a COSI driver offering a Kubernetes-native, standardized approach to provisioning, managing, and consuming storage. COSI enables fully automated bucket provisioning and lifecycle management within Kubernetes. Learn more here.

Hitachi Vantara writes that its VSP One storage portfolio is “a perfect match for Kafka.” That’s because, “when a broker fails, Kafka doesn’t need to struggle through partition reassignment or rebuild replicas to recover. By decoupling storage and compute, VSP One File ensures your data is always secure and readily available, allowing a new broker to step in instantly and pick up right where it left off. No downtime, no delays.”

Hydrolix, a streaming log data lake-focused business, has joined the AWS Independent Software Vendor (ISV) Accelerate Program. The program helps AWS Partners drive new business by directly connecting participating ISVs with the AWS Sales organization. 

Germany’s Heise media outlet has received a letter from Overland-Tandberg saying it is closing down. Operations ended on February 20; there will be no further sales, and unfulfilled orders will stay unfulfilled. Tape product maintenance is being provided by MagStor and Stortrec, while RDX (removable disk) owners are on their own. So, after a multi-year lingering death, ends another tape-based storage company, demonstrating a resuscitation failure by top management.

….

IBM subsidiary Red Hat announced GA of OpenShift 4.18, which brings:

  • GA of user-defined networks, giving users similar networking capabilities for secondary networks on AWS as they have on-premises, for more hybrid cloud flexibility. 
  • User-defined networks get Border Gateway Protocol (BGP), which improves segmentation and supports advanced use cases like VM static IP assignment, live migration, and stronger multi-tenancy. 
  • VM storage migration, available as a technology preview, now includes additional enhancements that allow for non-disruptive movement of data between storage devices and storage classes while a VM is running.
  • Tree-view navigation, available as a technology preview, enables users to logically group VMs into folders for more granular grouping.
  • Logical grouping, also available as a technology preview, gives users a quicker and easier way to navigate between VMs using a single click.

Learn more here.

Data unification and management firm Reltio launched its Lightspeed Data Delivery Network, which delivers API-ready data anywhere in under 50 milliseconds. It uses “query-optimized, in-memory datasets in a simplified service architecture.” Reltio’s Data Cloud has new features: data masking, bulk-match review (resolving up to 100 potential data matches simultaneously), and reverse geocoding. The latter “converts complex location data into easy-to-read, verifiable addresses.” Reltio Lightspeed Data Delivery Network and Reltio Data Cloud are available now.

Samsung announced its Gen 10 V-NAND 3D technology at ISSCC 2025, saying it has more than 400 layers, with the peripheral logic components fabbed on a separate wafer from the NAND cells and the two then bonded together, called Cell-on-Periphery (CoP) architecture. The Kioxia/SanDisk joint venture and YMTC do the same. Samsung has built a 1 Tb capacity die with this technology using TLC-format cells at a 28 Gb/mm2 density, with an interface speed of up to 5.6 GTps. Its Gen 9 V-NAND interface runs at 3.3 GTps, and Kioxia/SanDisk's very recently announced 332-layer Gen 9 BiCS does up to 4.8 GTps. This device blows other suppliers away in terms of layer count and interface speed.
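As a back-of-the-envelope check on those figures (assuming 1 Tb means 1,024 Gbit and that 28 Gb/mm2 is the full-die bit density, which the announcement implies but does not spell out):

```python
# Rough die-area check from Samsung's quoted Gen 10 V-NAND figures
capacity_gbit = 1024           # 1 Tb TLC die, taking 1 Tb = 1,024 Gbit
density_gbit_per_mm2 = 28      # quoted bit density
die_area_mm2 = capacity_gbit / density_gbit_per_mm2
print(round(die_area_mm2, 1))  # ~36.6 mm2
```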

Samsung has launched a 9100 PRO consumer M.2 format SSD that uses the PCIe Gen 5 bus, comes in 1, 2, 4, and 8 TB capacities, and follows on from the PCIe Gen 4 990 PRO drive.

The 9100 has random read/write speeds of up to 2.2 million/2.6 million IOPS and sequential read/write speeds of up to 14.7 GBps/13.4 GBps, the fastest PCIe Gen 5 SSD sequential speeds we have recorded. The 9100 PRO SSD will be available starting this March in 1 TB ($199.99), 2 TB ($299.99), and 4 TB ($549.99) capacities; the heatsink version will cost $219.99 (1 TB), $319.99 (2 TB), and $569.99 (4 TB). The 8 TB models will be available in the second half of 2025.

Chinese supplier Sugon has set a new SPC-1 v3 benchmark record of more than 30 million SPC-1 IOPS with its FlashNexus FN8200 storage system.

The SPC-1 benchmark tests shared external storage array performance in a manner designed to simulate real-world business data access. Its main measurement units are SPC-1 IOPS (transactions consisting of multiple drive IOs) and price-performance, expressed as dollars per thousand IOPS ($/KIOPS). A chart shows Sugon at the top of the SPC-1 v3 benchmark tree.
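The price-performance metric is straightforward division; a sketch with illustrative numbers (Sugon's actual system price appears in its full disclosure report and is not quoted here):

```python
def price_performance(total_system_price_usd: float, spc1_iops: float) -> float:
    """SPC-1 price-performance: dollars per thousand SPC-1 IOPS ($/KIOPS)."""
    return total_system_price_usd / (spc1_iops / 1000)

# Illustrative figures only, not Sugon's audited submission:
# a system priced at $3,000,000 delivering 30,000,000 SPC-1 IOPS
print(price_performance(3_000_000, 30_000_000))  # 100.0 $/KIOPS
```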

NetApp misses Q3 expectations as deal slippage hits sales

NetApp results for the third FY 2025 quarter disappointed with a 2 percent year-over-year rise to $1.64 billion following the prior quarter’s guidance-beating 6.1 percent rise to $1.66 billion. 

The previous quarter’s revenue outlook was $1.68 billion ± $75 million, a 4 percent year-over-year rise at the midpoint, and NetApp came in inside that range. There was a GAAP profit of $299 million, down 4.5 percent year-over-year. Billings increased 2 percent to $1.71 billion. The all-flash array (AFA) annualized run rate was up 10 percent at $3.8 billion, the same as last quarter, with 43 percent penetration of NetApp’s installed base; the company said most of its AFA business went to net new logos. Product revenue was $758 million, up 1.5 percent year-over-year.

George Kurian, NetApp
George Kurian

CEO George Kurian stated that the Q3 top line performance was “below our standards” and “we are taking action to enhance our execution and improve our momentum.” What happened?

Although there were foreign exchange headwinds, it was mostly a sales close problem with several seven and eight-figure deals slipping late in the third quarter. Kurian said: “We had line of sight to achieve our sales targets until the end of Q3 when inconsistent execution resulted in some deals slipping out of the quarter.”

“We have instituted a higher level of scrutiny on deal progression through the pipeline with tighter controls on closing plans. We expect these actions will enhance our execution and improve our momentum. Already, a number of the slipped deals have closed.” There is now a “more detailed inspection of exactly who in the customer has approved and the various steps that a transaction typically takes to get to closure.”

NetApp revenue
The lower Q3 2025 growth rate (yellow) is seen in the rightmost section of this chart. The graph also shows that NetApp has not grown its revenues since 2013

Its hybrid cloud (basically on-premises kit) revenue was $1.47 billion, up just 0.7 percent, while the much lower-value public cloud segment grew its revenues 15 percent to $174 million, with first-party and marketplace cloud storage services revenue growing more than 40 percent year-over-year. The Keystone storage-as-a-service offering grew around 60 percent.

Quarterly financial summary:

  • Gross margin: 70.7 percent, up 0.2 points year-over-year
  • Operating cash flow: $385 million vs $105 million in Q2
  • Free cash flow: $338 million vs $448 million a year ago
  • Cash, cash equivalents, and investments: $1.52 billion  vs year-ago $1.83 billion
  • EPS: $1.91 down 1.5 percent year-over-year
  • Share repurchases and dividends: $306 million

Wedbush analyst Matt Bryson said: “After several quarters of strong execution, NTAP unexpectedly stumbled on several fronts.” These included deal slippage with management suggesting “some European customers elongated closures because of political/economic uncertainty.” Another problem was product gross margins slipping.

NetApp is confident it can profit from the boom in AI workloads, saying it had more than 100 AI and data lake modernization deals in the quarter, with a number of AI-as-a-Service wins. Kurian said in the earnings call: ”We are seeing clients stand up AI centers of excellence with AI infrastructures that combine GPU-based compute with high-performance storage infrastructures. We had several large wins in that category … We are also seeing a growing number of wins in AI service providers who are building as a service infrastructure for enterprise AI.”

Progress on its ONTAP for AI project is good, with Kurian saying: “We made good progress on disaggregated storage. It is for high-performance unstructured data use cases. And you’ll hear more as we get towards Insight.” He added: ”This opens up our ability to attack the other players in the NAS market, particularly the large other NAS incumbent Dell. And so we feel good about our opportunity there.”

The Spot divestiture will have an adverse $15 million impact on the current quarter’s earnings, as will near-term headwinds to global public sector sales. With these points in mind, the outlook for the final FY 2025 quarter is for revenues of $1.725 billion ± $75 million, a 3.3 percent uplift on the year-ago Q4 at the midpoint and producing a $6.57 billion FY 2025 revenue result, a 5 percent year-over-year increase.

Bryson reckons that any rebound expectation in the final quarter should be “muted” and “it’s difficult to view NetApp’s outlook without a modicum of skepticism.”

Nutanix sees revenue surge as VMware customers flee Broadcom acquisition

Nutanix recorded a double-digit revenue increase, at least partly ascribed to sales to disgruntled Broadcom VMware customers who jumped ship to its hyperconverged alternative.

Revenue in the second fiscal quarter, ended January 31, 2025, was $654.7 million, 16 percent more than a year ago, with a GAAP profit of $56 million, 70.8 percent more than last year’s Q2. Annual recurring revenue (ARR) grew 19 percent year-over-year to $2.1 billion. The customer count jumped by 710 from the prior quarter to 27,870, the largest quarterly increase in 18 quarters.

Rajiv Ramaswami, Nutanix
Rajiv Ramaswami

Rajiv Ramaswami, Nutanix CEO and president, said in the earnings call: “We’re happy to report second quarter results that came in ahead of our guidance … We delivered outperformance across our guided metrics.” 

Financial summary:

  • Gross margin: 87.0 percent vs 85.6 percent last year
  • Free cash flow: $187.1 million vs $162.6 million
  • Operating cash flow: $221.7 million vs $186.4 million
  • Cash, cash equivalents, and short-term investments: $1.74 billion compared to $1.08 billion at the end of the prior quarter.

William Blair analyst Jason Ader told subscribers: “Nutanix continues to see strength in landing new logos as more enterprise and mid-market accounts see the Nutanix platform as the best alternative to VMware in the wake of its acquisition by Broadcom.” He also said Nutanix was “benefiting from a rebound in US federal spending and better conversion of large deals in the pipeline.” 

Nutanix strengthened its balance sheet and increased its financial flexibility with the issuance of $862.5 million of convertible senior notes and by establishing a $500 million revolving credit facility.

Nutanix revenues
FY 2025 looks to be a third straight year of double-digit revenue growth

A focus of analysts’ questions in the earnings call was on the magnitude and duration of the VMware migration opportunity. Ramaswami said many customers had multi-year contracts with VMware and a hardware refresh was needed for the conversion to Nutanix.

Migrating customers could go to Nutanix on the public cloud, such as AWS, or on-premises. They could migrate within the existing virtual machine environment or modernize (containerize) the workload and move to Kubernetes. Some VMware Cloud Foundation on AWS customers have already migrated to Nutanix running in AWS.

Some existing three-tier architecture customers are still looking to move to a hyperconverged environment, which was the core Nutanix attraction. That has now been bolstered with the VMware migration opportunity. Ramaswami said customers like this are re-evaluating their entire IT stack “because if they’re being forced to look at an alternative and migration, it’s also a good time for them to reexamine the overall stack.”

“At a big-picture [level], that means, OK, maybe I should take some of these applications, put them in the cloud, maybe that should modernize some portion of my estate, and maybe I should move to a new stack.”

Customers are moving toward a modern stack that can handle virtual machines and containers, as well as work on-prem and in the public cloud, “and that’s the kind of platform that we are providing today in the market.”

Ramaswami also sees AI inferencing driving more GPT-in-a-Box and Nutanix Enterprise AI (NAI) software stack sales over the next few years.

Nutanix’s original steady growth vector of three-tier customers converting to hyperconverged storage is being strengthened by distressed VMware customers fleeing from Broadcom and also by existing customers building up their Nutanix environments to support AI inferencing. These three trends have multi-year time scales and Nutanix sees itself positioned to benefit from them for quite some time. 

Next quarter’s revenue outlook is for $625 million ± $5 million, a 19 percent increase over the year-ago Q3. The full FY 2025 outlook is for $2.505 billion ± $10 million, a $46 million increase on the original full-year outlook and a 17 percent uplift from a year ago at the midpoint. This assumes the high customer acquisition rate continues, along with continuing revenue expansion within existing customers and good subscription renewal performance.

Pure Storage says faster AI storage coming, passes $3B annual sales for the first time

All-flash array vendor Pure Storage grew revenues at a steady clip in its fourth fiscal 2025 quarter, passing the $3 billion annual revenue level for the first time, and previewed faster products for AI training.

Revenues in the quarter ended February 2, 2025, were up 11 percent year-over-year to $879.8 million, beating guidance, with a GAAP profit of $42.4 million, down 35 percent Y/Y; it was Pure's third consecutive profitable quarter. Full FY 2025 revenues were $3.2 billion, again beating its forecast and up 12 percent Y/Y, with a $106.7 million profit, up 74 percent Y/Y.

CEO Charlie Giancarlo stated: “We delivered a solid Q4, exceeding both revenue and earnings guidance.” There were record Q4 sales for FlashBlade, FlashArray//XL, Portworx, the FlashArray//E family, and Evergreen subscription renewals. Pure returned $192 million to shareholders and announced a new $250 million share repurchase program.

Pure CFO Kevan Krysler said: “US revenue of $619 million was the primary driver of growth, while international revenue reached $261 million, down 3 percent year-over-year.” 

Overall, Krysler said: “It was a pivotal year marked by industry-leading innovation, setting the stage for sustainable long-term growth.” 

Quarterly financial summary:

  • Gross margin: 67.5 percent, down from 71.9 percent a year ago
  • Free cash flow: $151.9 million vs $200.9 million a year ago
  • Operating cash flow: $208 million vs $244.4 million a year ago
  • Total cash, cash equivalents, and marketable securities: $1.5 billion, flat Y/Y
  • Remaining Performance Obligations: $2.6 billion, up 14 percent year-on-year

The gross margin decline was due to NAND price increases affecting the capacity-optimized FlashBlade//E as it competes with disk-drive-based arrays. Basically, Pure had to suck up the cost increase to maintain its competitive position. Giancarlo said: “As we indicated many quarters ago, we were going to be aggressive with the E family given that we’re really the only player in the market right now that can compete with disk and we want to take advantage of that. So we are being aggressive there.”

He reported some more nice numbers, with 62 percent of the Fortune 500 now customers, up from 60 percent a year ago, and a total customer count of >13,500; there were 334 new customers stepping aboard in the quarter.

Subscription ARR grew 21 percent to $1.7 billion. Product sales rose just 7.4 percent to $495 million.

A revenue history chart shows a seasonal pattern emerging, with neatly stepped-up revenue quarters through each fiscal year, as Pure transitions into a consistently growing revenue machine after growth blips in FY 2021 and FY 2024.

With reference to Pure’s recent hyperscaler customer win, with the hyperscaler moving from disk-based storage to Pure flash-based storage, Giancarlo said in the earnings call: “The conversation continues to evolve and to expand frankly, in terms of the use cases for different types of data storage tiers inside that hyperscaler,” with “discussions around future states and where that storage will go.” Pure sees large production and deployment starting in its fiscal 2027.

CTO Rob Lee said Pure is working on ensuring its Purity SW works with the hyperscaler’s developing code and HW architectures, and that: “We are working now on multiple different performance tiers. Each one of them requires its own set of tuning. And of course, we have to also qualify several different scales of our DirectFlash modules along with several different … flash manufacturers for those DirectFlash modules.” It’s accelerating its Direct Flash module density increase road map as well.

He also said: “The discussions and engagements we are having with other hyperscalers are definitely moving forward with a faster pace.”

Giancarlo said forthcoming AI training and FlashBlade advancements will feature at NVIDIA’s GPU Technology Conference in March: “we’ve got a nice new announcement coming that we’ll be demonstrating at GTC. That really does relate to the largest scale of the AI opportunities.”

Also: “Flashblade will set a new bar for unmatched performance, scalability, and ease of deployment for large-scale AI infrastructure deployments.” Answering an analyst’s question he added this: “We’ve been able to make some modifications to our product that really allows it to address a lot of the performance characteristics that HPC environments are specifically looking for.”

Giancarlo thinks Pure’s biggest opportunity lies in customers reorganizing the way they manage “their production data from data silos into an enterprise data cloud,” with AI uptake causing customers to rethink their storage architectures. The aim will be for ”AI getting access to data for real time analysis, especially in inference and RAG type environments.”

The outlook for the next quarter is $770 million in revenues, an 11 percent increase again; consistency seems to be ruling here. The full fiscal 2026 revenue outlook is $3.51 billion, an increase of, yes, you guessed it, 11 percent. And Pure says, in a nod to President Trump’s tariff change suggestions, it has “developed contingency plans for a variety of tariff scenarios.“

IBM intros FlashSystem C200, claims ‘writing on the wall for spinning rust’

The new IBM FlashSystem C200 uses 46 TB QLC NAND drives to add an archive tier to a grid of FlashSystems. 

IBM says you can use it like TLC and “pay for it like QLC” because the drives last: IBM claims 5.5x more write cycles than industry-standard QLC drives, and replacements under maintenance are guaranteed.

IBM bloggers Barry Whyte and Andrew Martin suggest it can be an alternative to the 2023 entry-level FlashSystem 5045, whose range starts with a disk drive version using 20 TB SAS HDDs and offering 70 TB of usable capacity. 

They say: “Over the last few years we’ve seen the price of NAND-based flash devices get closer and closer to that of the remaining stalwart of the HDD industry – nearline high capacity 7.2K RPM drives. There is still a delta, but it’s now possible to look at replacing racks and racks of spinning disk with QLC-based NAND capacity.”

The C200 uses IBM’s proprietary FlashCore Modules (FCMs) with a Gen 4 version providing 46 TB raw capacity using a pseudo-SLC frontend to the QLC NAND. It has 32 Xeon cores and a 256 GB cache providing 1-2 ms latency, up to 200,000 IOPS, and 23 GBps throughput. There is a fixed 24-slot configuration with 1.1 PB raw capacity in a 2RU chassis. Because the system has always-on hardware-assisted compression, IBM says it has 2.3 PB of effective capacity.

IBM FlashSystem C200

Building a 2 PB archive system with the 5045, meaning 100 disk drives, “would need 15RU,” which compares badly to the C200’s 2RU for 2.3 PB. The 5045’s nearline SAS HDDs could deliver 4-5 ms if not very active. “With one that is under reasonable load we are talking more like 10 ms. Push them beyond their couple of hundred IOPS and you can easily expect 30 ms or more!”

Even building a 2.3 PB archive box with Western Digital 32 TB disk drives would need 72 drives and they would be slower to access than the C200’s FCMs. 
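The drive-count arithmetic behind those comparisons checks out; a sketch that treats usable capacity as roughly equal to raw capacity, ignoring RAID and spare overheads:

```python
# Back-of-the-envelope rack math from the IBM blog's 5045-vs-C200 comparison
PB = 1000  # work in TB

hdd_drives_20tb = 2 * PB / 20      # 2 PB archive on 20 TB SAS HDDs
hdd_drives_32tb = 2.3 * PB / 32    # 2.3 PB archive on 32 TB WD HDDs
c200_effective_ratio = 2.3 / 1.1   # C200 effective vs raw capacity

print(int(hdd_drives_20tb))            # 100 drives, ~15RU per the blog
print(round(hdd_drives_32tb))          # 72 drives
print(round(c200_effective_ratio, 1))  # ~2.1:1 from always-on compression
```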

The C200 has 8 x 10GbE onboard ports and supports optional Ethernet (8 x 25/10GbE NVMe-TCP) and Fibre Channel (16 x 32 Gb FC/NVMe-FC) ports. It can operate as a member of a FlashSystem federated grid (connected set) of systems, providing an in-grid archive tier and “enabling non-disruptive Storage Partition migration between systems.” The prior 5045 system did not support that capability.

We’re told that a “FlashSystem grid acts as an intelligent interconnect between storage arrays to provide a single point of management for your entire storage estate. This facilitates the movement of data within a smart partition by maintaining the context and integrity of associated data services including snapshots, replication, and backup.” The partitions include metadata and additional features of the grid to make this approach to data mobility possible. IBM says: “Data and applications are always on a performant storage tier and have the capacity they need to meet end-user expectations.”

The FCM Gen 4 drives have always-on encryption “and ransomware threat detection with guaranteed detection in 1 minute,” IBM says. The C200 has energy efficiency and zero downtime guarantees.

IBM provides indicative end-user pricing for its FlashSystems, and the C200 has what you might call a respectable price:

  • 5045 – 70 TB to 175 TB with HDDs: $17,500 to $23,100
  • 5045 – 9 TB to 30 TB all-flash: $20,800 to $35,100
  • 5300 – 45 TB to 680 TB configs: $52,200 to $247,200
  • 7300 – 85 TB to 1.5 PB: $117,600 to $540,500
  • 9500 – 365 TB to 1.4 PB: $428,800 to “Contact IBM for pricing”
  • C200 – 2.3 PB: $381,000

Whyte claims: “The writing really is on the wall for the last of the spinning rust… over the next few years we will see the price point for flash get closer and closer, and eventually even reduce lower than NL-SAS. With 300, 500, and even 1 PB flash drives being teased in the industry, it’s almost impossible for even the most advanced magnetic platter technologies to keep up.”

This chimes with the message about NAND replacing disk drives that Pure Storage has been putting out.

FlashSystem C200 will be generally available worldwide on March 21.

Bootnote

IBM blogger Chelsey Gosse says: “Industry-standard QLC expects about 1,000 P/E (i.e. write) cycles (https://www.solved.scality.com/is-all-flash-the-best-choice/), whereas FlashCore Module 4 drives, which are in the C200, achieve 5,500 P/E cycles prior to wear-out, using internal testing developed using the JEDEC standards for retention.”

How WEKA and VAST are tackling AI memory bottlenecks

Both WEKA and VAST Data aim to solve the problem of AI inferencing context history overflowing GPU memory and slowing down large language model (LLM) responsiveness.

VAST Data co-founder Jeff Denworth writes: “As a chat or agentic AI session grows in length across multiple prompts and responses, the history that is created is known as context. Context is created and stored using self-attention mechanisms that store session history as a series of vectorized tokens (stored as keys and values) that consume considerable amounts of GPU and CPU memory, often leveraging key-value caches.”

Maor Ben-Dayan, WEKA
Maor Ben-Dayan

WEKA co-founder and chief architect Maor Ben-Dayan writes: “A fundamental limitation in modern AI inference is the amount of memory available – GPUs process vast amounts of data in parallel, but the memory available per GPU is fixed. As models grow in complexity and require longer contexts, their memory footprint expands beyond what a single GPU can handle. This results in inefficiencies where GPUs are memory-starved, causing significant bottlenecks in token generation. This is a particular challenge during the decode phase of Large Language Models (LLMs), which are memory-bound, requiring fast data retrieval to process input prompts efficiently.”

He adds: “One of the biggest challenges emerging in inference is the impact of expanding context lengths on compute requirements. As techniques like reasoning tokens increase, models must process significantly longer sequences, putting additional strain on memory and compute resources.”
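To see why long contexts starve GPU memory, the KV cache footprint can be estimated from the model geometry. The sketch below uses published Llama 3.1 70B-class parameters (80 layers, 8 grouped-query KV heads, 128-dimension heads); the FP16 storage assumption is ours, and neither vendor publishes this exact calculation:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_val: int = 2) -> int:
    """Approximate KV cache size: 2 tensors (keys + values) per layer,
    per KV head, per position, at bytes_per_val precision (2 = FP16)."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_val * seq_len

# Illustrative: a Llama 3.1 70B-class model holding a 100,000-token context
size = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128, seq_len=100_000)
print(f"{size / 1e9:.1f} GB")  # ~32.8 GB of KV cache for a single session
```

A few such sessions exhaust a GPU's HBM, which is why both vendors want to spill this cache to fast shared storage.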

Both companies aim to fix this issue by giving GPUs access to the context they need, each in its own way. WEKA does it by speeding up token load time and VAST by being picky about which tokens to load first.

WEKA tested the Llama 3.1 70B model and found that it took about 24 seconds to load a 100,000-token prompt into a key-value (KV) cache during the prefill phase that initializes the model before any output is generated. It then sought to load and apply the cache at scale, demonstrating how “extending GPU memory to ultra-fast storage can dramatically improve token processing efficiency.”

The ultra-fast storage was an eight-node WEKApod with PCIe Gen 5 connectivity linked to an Nvidia DGX H100 server via Nvidia’s Quantum-2 QM9700 64-port 400 Gbps InfiniBand switches. 

Ben-Dayan says WEKA did its testing with no KV cache compression or quantization – a compression method based on mapping high-precision values to low-precision ones. It reduced the prefill time from 23.97 seconds to 0.58 seconds, a 41x reduction. He says: “Of the 0.58 seconds, the data transfer time was less than 0.2s, so this has the potential to be reduced even more by reducing the overhead of the inference session in the engine.”
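A quick check on those numbers, using only the figures WEKA quotes:

```python
# Checking WEKA's quoted prefill results
baseline_s = 23.97     # prefill time without the WEKApod-backed KV cache
accelerated_s = 0.58   # prefill time with the cache applied
transfer_s = 0.2       # stated upper bound on the data transfer portion

speedup = baseline_s / accelerated_s
overhead_s = accelerated_s - transfer_s  # inference-engine overhead

print(round(speedup, 1))     # 41.3, matching the claimed 41x
print(round(overhead_s, 2))  # ~0.38 s is the headroom Ben-Dayan refers to
```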

He noted: “We also see huge prefill time improvements with much smaller context sizes, even with context lengths as small as 50 tokens.” 

The use of the WEKApod provides a “fast resume of inference jobs.” WEKA’s software already “has the capability to align reads and writes into GPU memory (via GDS) directly to the NIC closest to the GPU, and extract every last bit of performance by reducing unnecessary data movement and latency.” The WEKApod is the icing on this cake.

Jeff Denworth, VAST Data
Jeff Denworth

VAST Data takes a different tack, with a so-called undivided attention scheme. Denworth notes: “As context length grows, machine memory consumption scales linearly. Long-sequence chat or agentic sessions can put pressure on system resources and cause memory overflow.

“Cache space is limited to what can be held in a GPU machine. AI services with multiple tenants (that periodically sign in and out of AI applications) need to constantly evict non-active session data from GPU and CPU cache to make room for whatever is happening at the moment.”

Reloading the cache from public cloud object storage “is so long that several leading AI-as-a-service shops choose to simply recalculate an entire prompt history rather than grab all of the context and attention data from object storage.” VAST wants to make “scalable, multi-tenant inference fast, more cost-efficient and global.”

It has developed “a Linux-based agent that runs in your GPU servers and provides a new data presentation layer to AI frameworks.” This is the VUA agent, VAST Undivided Attention. Each GPU server’s VUA is hooked up to a shared VAST RDMA-attached NVMe storage system.

When tokens are not found in a GPU server’s KV cache, they are reloaded from the VAST storage via GPUDirect, providing what Denworth calls “an infinite memory space for context data.”

VUA has “the ability to intelligently store and serve prefixes,” a prefix being the initial token sequence needed to provide the model’s context. 

According to Denworth: “Each token in a sequence attends to all previous tokens via self-attention, producing key and value vectors for every position. During tasks like text generation, the model processes one token at a time after an initial input (the prompt) … The KV cache stores these vectors for all tokens processed so far, so the model only computes keys and values for the new token and retrieves the rest from the cache.”  

VUA can load prefixes by priority and policy so that, for example, “the longest prefixes associated with a sequence can be served first to a GPU machine,” getting the session underway faster. He says: “Prefixes can also be stored to help multiple related prompts share similar context within a GPU machine,” thus reducing the number of cache misses and reloads from the VAST storage.
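VAST has not published VUA's internals, but the longest-prefix lookup Denworth describes can be sketched with a toy in-memory store; a production system would hold the KV tensors on shared NVMe behind a tree index (VAST's V-Tree), not a linear scan:

```python
class PrefixKVCache:
    """Toy longest-prefix store mapping token prefixes to (mock) KV payloads.

    Illustrative only: real KV tensors live on RDMA-attached NVMe, and a
    tree index replaces this O(n) scan over candidate prefix lengths.
    """
    def __init__(self):
        self._store = {}  # tuple(prefix tokens) -> cached KV payload

    def put(self, tokens, kv_payload):
        self._store[tuple(tokens)] = kv_payload

    def longest_prefix(self, tokens):
        """Return (matched_len, payload) for the longest stored prefix."""
        for n in range(len(tokens), 0, -1):
            payload = self._store.get(tuple(tokens[:n]))
            if payload is not None:
                return n, payload
        return 0, None

cache = PrefixKVCache()
cache.put([1, 2, 3], "kv-for-123")
hit_len, payload = cache.longest_prefix([1, 2, 3, 4, 5])
print(hit_len, payload)  # 3 kv-for-123 (only tokens 4 and 5 need prefill)
```

Serving the longest matching prefix first is what lets a resumed or related session skip recomputing most of its prompt.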

Because of VAST’s V-Tree search technique, a VUA can “search through prefixes in constant time regardless of the size of the vector space.” This vector space can scale out to billions and trillions of prefixes. We’re told: “Each GPU server now has shared access to the same extended context cache space, the same rapidly-searchable metadata space and the same global context and attention data and data index.” A preliminary VUA version is being rolled out to VAST’s model builder customers. 

Both the VAST storage system and the WEKApod supply fast token feeding to the GPU servers. WEKA has optimized KV cache load times by extending GPU memory to include the WEKApod, achieving up to a 41x speedup, though details on the comparison system are not provided.

VAST has also extended GPU memory and optimized KV cache load times, applying intelligence to select which prefixes to load so the model gets running faster with fewer KV cache misses.