
Hitachi promises time travel for LLMs – just don’t lose your RAG

Hitachi Vantara is claiming to have cracked the problem of time travel in AI, though its biggest immediate challenge seems to be whether to charge for it.

The vendor unwrapped its Hitachi iQ Time Machine technology at Computex in Taiwan last month. While the concept appears to be bending the laws of physics, it’s actually more about applying the lessons of version control to LLMs.

In a video detailing the feature, Hitachi Vantara says today’s AI systems “are caught in the present” because current LLMs “rely on documents and data that are routinely updated and replaced. Previous versions gone forever.”

It claims to have delivered “Time Aware AI” through Time Machine, “powered by Blackwell GPUs, deep reasoning NIMs, and NeMo Retriever,” which it says allows users for the first time to “access data from different time periods.”

Hitachi’s agent accesses documents on Hitachi VSP One Object, meaning they are protected by enterprise-grade security and privacy controls, while its versioning means older versions remain accessible alongside the newest.

Being able to combine old and new versions of documents might not sound particularly groundbreaking. But it does address some very real problems enterprises face when working with LLMs, Hitachi CTO for AI Jason Hardy said, pointing out that “AI has no sense of compliance.

“AI does not understand how data changes. It just knows about today’s version of it or when it was last looked at. So what we were able to do with that is bring across data [and] also introduce the concept of versions of data, how it changes over time.”

Hardy explained the aim of Time Machine was to “provide RAG capabilities with enterprise compliance in mind.”

He said: “This is a RAG-type feature that includes the VectorDB, as well as linkage into the Hitachi VSP One Object platform, and the LLM necessary to interact with the content.”

But unlike traditional RAG implementations, Time Machine “understands how data changes over time and allows customers to roll back the LLM’s point of view to different points in time, aligning with how the data captured has changed over time.”

When it comes to changing or un-embedding data in LLMs, he said, the company has created IP that “allows for time to be associated with embeddings into the VectorDB.” He described these as “temporal embeddings. With this capability, we now can ‘activate’ previous versions of data through the time aware embeddings, as well as blacklist/remove data entirely.”
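Hitachi has not published implementation details, but the general idea behind time-aware embeddings can be sketched: store each chunk’s embedding alongside the validity window of the document version it came from, then filter retrieval on a point in time. Below is a minimal illustrative sketch assuming an in-memory store; the Chunk fields and function names are hypothetical stand-ins, not Hitachi’s IP.

```python
# Minimal sketch of "temporal embeddings": each chunk carries the validity
# window of its document version, and queries filter on a point in time.
# Illustration of the general idea only, not Hitachi's implementation.
from dataclasses import dataclass
from datetime import datetime
import math

@dataclass
class Chunk:
    doc_id: str
    version: int
    valid_from: datetime        # when this version became current
    valid_to: datetime          # when it was superseded (datetime.max if still live)
    embedding: list[float]
    text: str

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def query_as_of(store, query_embedding, as_of, top_k=3):
    """Return the best-matching chunks as the corpus looked at time `as_of`."""
    live = [c for c in store if c.valid_from <= as_of < c.valid_to]
    return sorted(live, key=lambda c: cosine(c.embedding, query_embedding),
                  reverse=True)[:top_k]

def blacklist(store, doc_id):
    """Drop every version of a document, e.g. after sensitive data was ingested."""
    return [c for c in store if c.doc_id != doc_id]

# Asking the same question "as of" last week rather than today only changes
# the `as_of` argument; the embeddings themselves are never re-computed.
```

In a real deployment the time filter would be pushed down into the vector database’s metadata query rather than scanned in application code, and blacklisting a document would also purge its chunks from the index.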

This is about more than a couple of hundred documents, he said. “You’re doing buckets of data to provide that value.”

More time embed

The user, or the application or API call, can then tell the system, “Wait a minute, the data that I’m seeing today, from a results perspective, something doesn’t match what I was expecting and what I got a week ago.”

Users can then roll back, run a query, and ask the model a question on data that’s a week old or a year old. “Through that, we now can say, Okay, this is what your drift looks like. This is what your data looks like. A/B compare it.”

If it became clear that something had gone awry with the data in the interim, that would previously have “required completely re-embedding all your data. You would have to delete it and start over. This actually allows you to roll back pieces of the data in the model to previous versions.”

Hardy said this had obvious applications when sensitive customer data is introduced into a model. “I can now undo that individual document being embedded into the LLM, I can undo whatever created that customer data and how it was brought into the model.” Hardy said the technology was very early stage. “We’re still looking at how we market it. Do we give it away for free? What does that actually mean? How do we embed it in things like that.”

Sandia turns on brain-like storage-free supercomputer

Brad Theilman (facing camera) and Felix Wang unpack a new shipment of the SpiNNaker2 computing core, a collaboration between Sandia and SpiNNcloud that will be the first commercial product of the partnership. Funded through NNSA’s Advanced Simulation and Computing (ASC) program, the work will explore how neuromorphic computing can be leveraged for the nation’s nuclear deterrence missions. SpiNNaker is a contraction of ‘Spiking Neural Network Architecture,’ a brain-inspired neuromorphic computer for large-scale, real-time modeling of brain-like applications. The technology can simulate large brain-like networks to enhance researchers’ understanding of the brain, as well as provide a framework to test the boundaries of current computing capabilities. Photo by Craig Fritz

Sandia National Labs has flipped the switch on its SpiNNaker 2 “brain-inspired” supercomputer that eschews both GPUs and internal storage.

The system, supplied by Germany-based SpiNNcloud, will rank amongst the top five “brain-inspired” platforms, mimicking between 150 and 180 million neurons.

The architecture was initially developed by Arm pioneer Steve Furber, though even at that scale the system falls well short of the human brain’s roughly 100 billion neurons.

As SpiNNcloud explains it, the SpiNNaker 2’s highly parallel architecture has 48 SpiNNaker 2 chips per server board, each of which in turn carries 152 Arm-based cores and specialized accelerators.

Each of the 48 chips packs 20 MB of SRAM, and each board carries 96 GB of external LPDDR4 memory. So, with 90 boards, that amounts to 8,640 GB of DRAM, while a 1,440-board system carries 69,120 chips and 138,240 GB of DRAM.
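Those headline figures follow from simple multiplication. A quick back-of-the-envelope check, using the per-board numbers above together with the 24-board Sandia machine described below:

```python
# Quick sanity check on the quoted SpiNNaker2 numbers.
chips_per_board = 48
cores_per_chip = 152          # Arm-based cores per chip
dram_per_board_gb = 96        # external LPDDR4 per board

for boards in (24, 90, 1440):
    chips = boards * chips_per_board
    cores = chips * cores_per_chip
    dram_gb = boards * dram_per_board_gb
    print(f"{boards:5,d} boards: {chips:7,d} chips, {cores:11,d} cores, {dram_gb:8,d} GB DRAM")

# 24 boards    ->  1,152 chips,    175,104 cores,   2,304 GB  (Sandia's system)
# 90 boards    ->  4,320 chips,    656,640 cores,   8,640 GB
# 1,440 boards -> 69,120 chips, 10,506,240 cores, 138,240 GB  (the "over 10.5m core" maximum)
```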

Needless to say, the system uses high-speed chip-to-chip communication. This, SpiNNcloud says, together with the vast amount of on-board memory, eliminates the need for centralized storage.

Speeding on DRAM

In Sandia’s case, it has taken delivery of a 24-board, 175,000-core system. At Sandia, according to SpiNNcloud, “The supercomputer is hooked into existing HPC systems and does not contain any OS or disks. The speed is generated by keeping data in the SRAM and DRAM.”

Standard parallel ethernet ports are “sufficient for loading/saving the data.”  The “current maximum system” is over 10.5m cores, which SpiNNcloud says means it can maintain biological real time.

Moreover, it allows complex event-driven compute and simulations with greater power efficiency than GPU-based systems.

Hector A. Gonzalez, co-founder and CEO of SpiNNcloud, said the system would be targeted at problems in “next generation defense and beyond. The SpiNNaker2’s efficiency gains make it particularly well-suited for the demanding computational needs of national security applications.”

Green storage for the datacenter


A sustainable path to capex and opex savings

It seems like the entire world is relying on AI to improve its future efficiency, but how does that impact the planet from a sustainability perspective?

Goldman Sachs Research estimates that global demand for electricity from datacenters will increase by 50 percent between 2023 and 2027, driven partly by increased production of AI-enabled applications and workloads, use of large language models (LLMs), and the massive compute clusters required to power them.

That’s more than likely to put a huge amount of pressure on corporate sustainability initiatives with green IT and carbon reduction at their core. The foundational imperative for the datacenter is to store, transfer, and process more data using less power, or at least using no more power than was used to do the same thing with smaller data volumes previously. 

It’s no easy task and is likely to be a long journey over many years. Infinidat believes an excellent place to start is the storage systems that securely host all that information. You can watch this video to hear Infinidat CMO Eric Herzog discuss with The Register’s Tim Philips how the company plans to cure storage sustainability headaches for service providers and large enterprises.

It’s not just about meeting boardroom sustainability expectations and expounding green credentials though. Datacenter operators still need to balance the books. That means making sure that green IT comes with the dual advantage of helping to reduce the cost of power, cooling, recycling, and all the other resources and processes that data hosting facilities need to keep the lights on, while simultaneously minimizing downtime for customers and stakeholders. In short, you need to be environmentally friendly and economically friendly with the right storage infrastructure.

Infinidat reckons it has the answer to this too. You can find out more about how it calculates a five-year return on investment (ROI) of up to 162 percent and overall opex reduction of up to 48 percent by downloading the IDC white paper here.

Sponsored by Infinidat.

VDURA unwraps ScaleFlow to slash the flash

VDURA has launched a preview of its V-ScaleFlow data movement tech in the latest rev of its Data Platform, which it claims will reduce flash requirements by over 50 percent.

The firm claimed V-ScaleFlow, part of its Data Platform v11.2 release, will allow “seamless data movement” across high-performance QLC flash and high-capacity disk.

This should, the firm claims, optimize utilization and maximize throughput in systems. More specifically, it will ensure GPUs are kept fully saturated.

VDURA flagged V-ScaleFlow as a possible fix for AI workflow bugbears such as the need for over-provisioned TLC flash or DRAM buffers to handle write-intensive checkpoints.

Another target problem is holding long-tail datasets and artifacts in external object stores or on tape, giving operators the option of using high-density disk drives instead.

This could add up to a 50 percent reduction in flash capacity requirements, it claimed, as well as lower power consumption overall.

Storage unsolved

When it comes to hyperscale or AI deployments, it can seem that storage is a largely solved problem. It’s GPUs that grab most of the attention, followed by networks. But perhaps it’s more that storage is generally “good enough.”

A VDURA spokesperson said: “Storage solutions typically struggle under the strain of billions of metadata operations, rapid checkpointing, and intensive I/O demands, leading to GPU idle times and diminished productivity. It’s critical to view the stages of the AI pipeline for performance, capacity, and cost efficiency when architecting infrastructure.”

He added: “Relying solely on all-flash solutions (focusing only on performance) across every stage leads to significant flash waste, inflated costs, and unnecessary power consumption.”

He claimed the Goethe University’s Center for Scientific Computing in Germany, whose large-scale physics AI simulations and GPU-intensive tasks had suffered “severe disruptions and data losses” from “critical storage bottlenecks,” had overcome those problems after adopting VDURA’s V5000 appliance.

Other new capabilities in 11.2 include end-to-end encryption and a native CSI plug-in, which it said will simplify multi-tenant Kubernetes-based deployments.

Tech leaders struggling to store AI data, never mind manage it, research shows

Four-fifths of organizations have been burned by employees using Gen AI, with the leaking of sensitive data almost as common as false or inaccurate results, research by Komprise has found.

And while companies are racing to adopt AI, they are playing catch-up when it comes to storing the data it needs, never mind managing it, the data management vendor’s AI, Data and Enterprise Risk study found.

Over two-thirds said infrastructure was a top priority when it came to supporting AI initiatives, with 9 percent saying it was the most important thing after cybersecurity.

Over a third identified increasing storage capacity as their top storage investment when it came to AI, with 37 percent identifying data management for AI – on the basis that AI was only useful when it incorporated the organization’s own data.

And just under a third said acquiring “performant storage to work with GPUs” was their top priority. Overall, 46 percent said all three paths were top priorities.

Just finding and moving the right unstructured data was a key challenge for 55 percent of companies, with lack of visibility across data, and the absence of “easy ways to classify and segment data” also key concerns.

And a third of respondents said they were having “internal disagreement on how to approach data management and governance for AI.”

Krishna Subramanian, co-founder of Komprise, said companies were starting to investigate tooling to enforce strong AI governance and compliance. The alternative was company data leaking and becoming “part of the public LLM.”

Let’s get tactical

“A top tactic is classifying sensitive data and using workflow automation to prevent its improper use with AI (73 percent). More than half (55 percent) are also instituting policies and training their workforces.”

This would seem obvious, she said, “But it’s encouraging to see that it’s actually taking place.”

And some are restricting the use of public Gen AI tools as they roll out their own internal tools.

Customers were trying to get better visibility into their data so they could manage it, Subramanian said, “and are looking into tagging to classify and segment data, as well as automation to help feed the right data to AI and monitor the outcomes.”

She said the reality was that few companies would be training their own models at any great scale. That means less need for GPUs and GPU-accessible storage, but it does mean they will have to get to grips with unstructured data.

“Rather, your focus is on getting the right corporate data to pre-trained models so they can deliver optimal business outcomes.  Curating data for AI is emerging as the next phase and core investment in AI.”

“As the inferencing market begins to take off, the focus will be on helping enterprises use AI effectively with their own data,” Subramanian said. “After all, models have already been trained on all the publicly available data.”

Broadcom launches latest version of Tomahawk family at datacenters

Broadcom has launched the latest generation of its Tomahawk switch chip, promising to massively boost data bandwidth within datacenters while simultaneously putting a dent in power consumption.

The Tomahawk 6 switch series offers 102.4 Tbps of Ethernet switching capacity, with support for 100G and 200G SerDes. Broadcom says the former will enable AI clusters with extended copper reach and “efficient use of XPUs and optics with native 100G interfaces,” while the latter will provide “the longest reach for passive copper interconnect.” The chip will be available with co-packaged optics.

The chip is the Tomahawk family’s first multi-die part and is built on a 3 nanometer process; Tomahawk 5 used a 5 nm process. The firm is already working on the shift to 2 nm, which means it still has headroom to grow.

Broadcom reckons it will cement Ethernet as the networking standard in datacenters and is pitching the chip at both scale-up and scale-out deployments. It says Tomahawk 6 will support scale-up clusters of 512 XPUs, or 100,000 XPUs or more in a two-tier scale-out network at 200 Gbps per link.
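Those cluster sizes follow roughly from the switch radix. A simplified sketch of the arithmetic, assuming a non-blocking two-tier leaf/spine fabric (real deployments reserve ports and vary oversubscription; this is not Broadcom’s published reference design):

```python
# Rough radix arithmetic behind the quoted cluster sizes.
switch_capacity_gbps = 102_400      # 102.4 Tbps Tomahawk 6
link_speed_gbps = 200               # 200G per XPU link

ports_per_switch = switch_capacity_gbps // link_speed_gbps
print("Scale-up: one switch can directly connect", ports_per_switch, "XPUs")   # 512

# Non-blocking two-tier leaf/spine: each leaf splits its ports half down to
# XPUs, half up to spines; each spine port serves one leaf, so the spine
# radix caps the leaf count at 512.
xpus_per_leaf = ports_per_switch // 2       # 256
max_leaves = ports_per_switch               # 512
print("Scale-out: up to", max_leaves * xpus_per_leaf, "XPUs in two tiers")     # 131,072
```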

Peter Del Vecchio, product manager for the Tomahawk switch family, said the firm had consistently doubled available bandwidth every 18 to 24 months with each generation of Tomahawk. He compared this pattern to Moore’s law.

However, the last generation appeared in 2022. Del Vecchio pointed out that this was just before ChatGPT burst into public consciousness, and the company had taken its time to really understand the impact on datacenters.

Bigger model impact

And a large part of that impact was an explosion in the amount of data being stored and moved as model sizes grew. “The world kind of changed, where suddenly everyone had to have as much bandwidth as possible, as quickly as possible.”

He said studies had shown that up to 57 percent of the time spent in LLM training was down to data transfers, during which power-guzzling GPUs were sitting idle.

“Getting the network out of the way, getting the data transfer between GPUs, is just one of the most important things as far as making your GPU cluster efficient,” he said.

Speeding up transfers will increase utilization. But, together with Tomahawk 6 enabling larger two-tier networks, with fewer hops and less optical networking needed, it will also make a dent in datacenters’ massive power consumption.

“Your general rule of thumb is that in these networks, about up to 70% of the power consumption could be just due to the optics. So if we have this huge reduction in the number of optics, you actually have about half of the power for the network go into two tiers compared to three.”

And data demands will continue to rise, as models and clusters increase in size. Not least because of the checkpointing demands in case of failures. “You’re enabling these very large networks, but they have these very large clusters, which also means that when you’re checkpointing, there’s a lot more data to be checkpointed.”

Del Vecchio said the chip’s predecessor had driven a shift from InfiniBand to Ethernet in large GPU-powered AI clusters. This meant Ethernet was ubiquitous throughout the entire datacenter, he said, which was a major benefit for operators. “You can have a common set of tools, it means that you have all the same cables, you have all the same infrastructure, your network engineers know how to debug that network. I think probably even larger benefit is that you have this huge ecosystem around Ethernet.”

How open systems drive AI performance

An open-source philosophy and system-level optimizations prevent software and infrastructure nightmares in GenAI deployments, says CentML CTO Shang Wang

Partner content  “The magic isn’t just in the model, it’s in how you run it,” says Shang Wang, CTO of CentML. When Wang discusses large language model (LLM) performance, the dialog swiftly moves from market hype to technical heat maps, GPU optimization, network bottlenecks, and compiler intricacies. And if discussing compiler glitches and TensorRT error logs sounds dry, wait until Wang turns one of those logs into a punchline.

Open-source systems, including compilers, frameworks, runtimes, and orchestration infrastructure, are central to Wang’s vision, and the logic is straightforward. 

“You can’t imagine all the corner cases yourself. Ninety-nine times out of a hundred, a TensorRT compilation blows up, and it’s closed-source, so you’re stuck,” Wang explains. “Open compilers survive because the community finds the weird stuff for you.” This philosophy of openness everywhere and optimization at every step drives CentML’s product lineup.

Hidet, CentML’s open-source ML compiler, feeds directly into CServe, its serving engine based on vLLM. This then integrates smoothly into their all-in-one AI infrastructure offering. The CentML Platform allows developers to select any open model like Llama, Mistral, or DeepSeek, point it at any hardware from NVIDIA H100s and AMD MI300Xs through to TPUs, and let the stack handle performance optimization and deployment.

One of Wang’s favorite practical examples of this approach involves optimizing and deploying AWQ-quantized DeepSeek R1 on the CentML Platform. 

“At the GPU-kernel level, through Hexcute, which is a DSL of the Hidet compiler, we built a fully fused GPU kernel for the entire MoE layer, which is a crucial part of DeepSeek R1,” he says.

“This sped up the MoE by 2x to 11x compared to the best alternatives out there implemented through the Triton compiler. Then, at the inference-engine-level, we built EAGLE speculative decoding which leverages a smaller draft model to reduce and help parallelize the work that the big original model has to do, which led to another 1.5-2x overall speedup,” he adds.
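CentML has not published this particular code, but the draft-and-verify loop that speculative decoding schemes such as EAGLE build on can be sketched with toy stand-ins; the `draft_model` and `target_model` callables below are illustrative placeholders, not CentML’s implementation.

```python
# Toy sketch of draft-and-verify speculative decoding: a cheap draft model
# proposes k tokens, the big target model checks them, and the longest
# agreeing prefix is accepted, amortizing the target model's work.
def speculative_decode(target_model, draft_model, prompt, k=4, max_new_tokens=10):
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. Draft k tokens cheaply, one at a time.
        draft = []
        for _ in range(k):
            draft.append(draft_model(tokens + draft))
        # 2. Verify the draft with the target model (a real engine scores all k
        #    positions in one batched forward pass; simulated per position here).
        accepted = []
        for i in range(k):
            expected = target_model(tokens + draft[:i])
            accepted.append(expected)
            if expected != draft[i]:
                break              # first mismatch: keep the target's token, stop
        tokens += accepted
    return tokens

# Toy "models": the target spells out the alphabet; the draft agrees most of the time.
target = lambda ts: chr(ord('a') + len(ts) % 26)
draft = lambda ts: chr(ord('a') + len(ts) % 26) if len(ts) % 5 else 'z'
print("".join(speculative_decode(target, draft, list("ab"))))
```

The speed-up comes from verification being a single parallel pass over the drafted positions: when the draft is usually right, the expensive model runs far fewer sequential steps.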

Wang then gives an example of how CentML Platform empowers AI practitioners: “The entire model is now made deployable on our platform, while the GPU provisioning, networking, autoscaling, fault tolerance, and all the optimizations I just mentioned are handled automatically for the users behind the scenes.” 

CentML’s research isn’t just about chasing academic acclaim; it’s laser-focused on solving real-world latency and infrastructure bottlenecks. Its recent Seesaw paper, set to be presented at MLSys 2025, highlights an innovative approach to dynamically switching parallelism strategies during inference while reducing network congestion. Running a Llama model distributed across eight NVIDIA L4 GPUs interconnected via standard PCIe, the team encountered severe network overload with their initial tensor-parallel strategy during prefill, causing latency to spike dramatically.

The CentML team’s intuitive solution was highly effective: they maintained tensor parallelism for the memory-bandwidth-intensive decode stage but switched to pipeline parallelism during the compute-heavy prefill phase. “The moment we flipped strategies mid-inference, our throughput soared, and latency dropped sharply,” Wang proudly recalls.
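The paper has the details, but at a high level the switch amounts to routing the two phases through different parallel layouts. A toy dispatch sketch, with stand-in functions rather than real sharded execution (nothing here is CentML’s CServe code):

```python
# Phase-dependent parallelism, the idea behind Seesaw: prefill is compute-bound
# and moves lots of activations, so run it pipeline-parallel to keep PCIe
# traffic down; decode is memory-bandwidth-bound, so run it tensor-parallel.
def prefill_pipeline_parallel(prompt_tokens):
    # Layers sharded into stages across GPUs; microbatches of the prompt flow
    # stage to stage, so only inter-stage activations cross the interconnect.
    return {"kv_cache_tokens": len(prompt_tokens)}

def decode_tensor_parallel(kv_cache, max_new_tokens):
    # Each layer's weights split across GPUs; every step needs an all-reduce,
    # but each step only moves one token's worth of activations.
    return ["<tok>"] * max_new_tokens

def generate(prompt_tokens, max_new_tokens=16):
    kv_cache = prefill_pipeline_parallel(prompt_tokens)        # compute-heavy phase
    return decode_tensor_parallel(kv_cache, max_new_tokens)    # bandwidth-heavy phase

print(len(generate(list("a long prompt goes here"), max_new_tokens=4)), "tokens decoded")
```

The obvious cost for any such scheme is moving model state and the KV cache between layouts when the strategy flips; keeping that overhead small is what makes the approach pay off.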

Though first prototyped in a research setting, these cutting-edge techniques will soon transition into CentML’s production-grade CServe inference engine at the heart of the CentML Platform. Wang elaborates: “Our research engineers pursue bold ideas aimed at cracking core problems. Once validated, they’re empowered to integrate these innovations directly into our products, enjoying firsthand the real-world impact. Not every experimental idea makes it to production immediately, but the most promising ones rapidly evolve into tangible performance enhancements.” 

This creates a virtuous feedback loop, where user-reported edge cases enhance downstream software capabilities, inspiring further academic research and generating additional performance improvements. Similar to how CentML contributed their work on pipeline parallelism and EAGLE speculative decoding back to the vLLM library, these ideas and implementations will be contributed back as well, making them available to everyone through a straightforward pip install.

CentML offers users simple serverless endpoints for initial experimentation and seamless transitions to dedicated deployments, empowering users to own and control their entire stack. Whether spinning up Llama 4 on a preferred cloud provider or migrating to an on-premises infrastructure, the CentML ecosystem ensures stability, flexibility, and consistency without reliance on proprietary connectors.

There’s also a compelling economic and data privacy argument behind CentML’s approach. Serverless API endpoint providers often tout access to premium GPUs and proprietary kernels, but Wang highlights a contrasting narrative: Open models combined with superior yet accessible systems can deliver significantly better performance at dramatically lower costs. To be fair, inference requests with potentially sensitive information probably shouldn’t be sent to a serverless API endpoint that is shared among many users, which is why CentML offers dedicated deployments of these optimized models.

In an internal comparison, CentML engineers tested two identical chatbots. One used a Together.ai Llama 4 Maverick endpoint and the other ran CentML’s optimized stack. The CentML version achieved higher token throughput and a much lower time to first token. “Same weights, same prompts, but different systems—and one dramatically lower AWS bill,” Wang notes.

Asked what keeps him awake at night, Wang bypasses industry hype to focus squarely on system bottlenecks. Specifically, memory and interconnect bandwidth are much more challenging to scale than raw compute throughput. He continuously pushes to squeeze every last bit out of the hardware resources available to AI workloads. This drive explains CentML’s aggressive innovation strategy, from parallelism switching to ongoing kernel optimization in Hidet, and even the resource-optimizing hardware picker embedded within the Platform.

For developers interested in experiencing CentML’s performance firsthand, Wang suggests trying out Llama 4 endpoints on their platform. Additionally, their Hidet and DeepView open-source projects are available on GitHub, where users can directly contribute by reporting edge cases or performance quirks. Wang and his team enthusiastically welcome these contributions.

In Wang’s words, “AI progress doesn’t hinge on one closed lab. The cat’s out of the bag, and the best optimizations are happening openly, collaboratively, and transparently.”

Sponsored by CentML

Snowflake digs out $250M to buy Crunchy to plug Postgres gap

Snowflake has hoovered up Crunchy Data, plugging what was an increasingly visible gap in the cloud data giant’s Postgres strategy.

Snowflake had previously offered a Postgres connector to suck data into its SQL-based engine.

However, suddenly everyone seems to think it’s important to pump up their Postgres portfolio, not least as the database is the most popular amongst developers, according to Stack Overflow.

Last month Snowflake arch rival Databricks snapped up Neon in a $1bn deal, saying it would allow it to deliver serverless Postgres.

At the time, Databricks said that 80 percent of the databases provisioned by Neon were “created automatically by AI agents rather than by humans.”

Snowflake sprang the acquisition as it kicked off its Summit 25 user event in San Francisco. Reports put the cost of the deal at $250 million.

Crunchy’s Postgres technology will be immediately repackaged as Snowflake Postgres, which the soon-to-be new owner pitched as bringing “the AI-ready, enterprise-grade and developer-friendly PostgreSQL database to the AI Data Cloud.”

Snowflake described this as “a new kind of Postgres designed to power the most demanding, mission-critical AI and transactional systems at enterprise scale and with enterprise confidence.”

Connecting up silos

It would maintain the “full power and flexibility of open source Postgres,” it said, while delivering the governance and security a giant like Snowflake can offer.

More tangibly, Snowflake said the rebadged product would eliminate admin and operational silos.

And, it said, “Companies deeply invested in the Postgres ecosystem will be able to migrate and run existing applications on Snowflake without rewriting code, and roll out new ones more confidently.”

Crunchy Data’s Paul Laurence wrote that the deal would enable Postgres adoption at scale and bring “the potential to expand our contribution to the Postgres ecosystem and community.”

Snowflake cited Blue Yonder and Landing AI, which both use Postgres as well as Snowflake, as examples of the sort of companies that will be able to “consolidate their application stack, unlocking increased efficiency and cost savings.”

Snowflake said the deal would “Complement Unistore, our innovative solution that unifies transactional and analytical data within a single database, by providing an enterprise-ready option for transactional applications that require Postgres compatibility.”

Or at least it will when the deal closes. Snowflake said this was expected “imminently.”

Dell rides AI server boom – but storage still stuck in the slow lane

Demand for AI servers is driving up Dell revenues, but storage is the poor relation, waiting for the unstructured data surge the evangelists say is coming.

Jeff Clarke, Dell

Vice-chairman and COO Jeff Clarke stated: “We achieved first-quarter record servers and networking revenue of $6.3 billion, and we’re experiencing unprecedented demand for our AI-optimized servers. We generated $12.1 billion in AI orders this quarter alone, surpassing the entirety of shipments in all of FY 25 and leaving us with $14.4 billion in backlog.”

The $23.4 billion of revenue in Dell’s first fiscal 2026 quarter, ended May 2, was 5.4 percent more than last year but sequentially down for the third consecutive quarter. There was a profit (GAAP net income) of $965 million, 3 percent less than a year ago. 

Quarterly financial summary

  • Gross margin: 21.1 percent vs 21.8 percent a year ago
  • Operating cash flow: A record $2.8 billion vs $1 billion last year
  • Free cash flow: $2.23 billion vs $457 million a year ago
  • Cash, cash equivalents, and restricted cash: $7.85 billion vs $5.96 billion a year ago
  • Diluted EPS: $1.37 flat year-over-year

The two main business segments, the Infrastructure Solutions Group (ISG – servers plus networking and storage) and the Client Solutions Group (CSG – PCs, laptops), fared differently with ISG revenues rising 12 percent year-on-year to $10.3 billion while CSG revenues of $12.5 billion only rose 5 percent year-over-year. However, sequentially, ISG revenues have declined or been flat for three consecutive quarters while CSG revenues have risen after two declining quarters.

ISG and CSG revenues

Commercial CSG sales rose 9 percent to $11 billion but consumer sales fell substantially by 19 percent to $1.46 billion. Dell said commercial PC demand grew for the fifth straight quarter. Overall, PC and laptop sales are picking up while ISG’s total sales of servers, networking, and storage are still on their way down from a peak in the second fiscal 2025 quarter.

A look at the revenue trends for servers plus networking and storage inside ISG shows this as well:

Server, networking, and storage revenues

Storage revenue in the latest quarter was $4 billion, 6 percent higher year-over-year, while servers and networking registered $6.3 billion in sales, which Dell says is a record for any first quarter, but it’s been sequentially down now for three quarters in a row. Storage, however, saw its third consecutive year-over-year revenue growth quarter.

Demand for the PowerStore product was up double digits and growing for the fifth consecutive quarter. Data protection was also strong. The company maintained its leading storage market share position in the external RAID, high-end RAID, converged and HCI storage market sectors, according to IDC:

Dell graphic. TTM is trailing twelve months

Compared to the surging AI server demand, there is, as yet, no comparable uplift in AI-related storage demand. In prepared remarks, Clarke said Dell’s areas of focus are using its own IP in the mid-range, software-defined, unstructured storage and data protection markets. These are the faster-growing, higher-margin segments.

He is seeing more customers move to disaggregated storage, saying: “We are gaining traction in data protection, with demand up double digits in both our next-generation target appliance as well as our software.”

Dell storage market domination graphic

Next quarter’s servers and networking revenue will be a tough comparison. Unless Dell sells north of $7.7 billion in servers and networking, a 22.2 percent rise or greater, it will report a year-over-year revenue decline there.

Dell expects ISG to grow significantly next quarter, driven by approximately $7 billion of AI server shipments, and CSG to grow low-to-mid-single digits. For the full year, ISG is expected to grow in the high teens, driven by $15 billion-plus AI server shipments with CSG growing still in the low-to-mid-single digits. 

CFO Yvonne McGill said: “We are expecting sub-seasonal performance in traditional server and storage, our larger profit pools that provide scale, as customers evaluate their IT spend for the year, given the dynamic macro environment. Given that backdrop, we expect Q2 revenue to be between $28.5 billion and $29.5 billion, up 16 percent at the midpoint of $29 billion.”

The full fiscal 2026 revenue guidance is $102 billion ± $2 billion and an 8 percent increase year-over-year at the midpoint. McGill said: “We expect ISG to grow high-teens, driven by over $15 billion in AI server shipments and continued growth in traditional server and storage. And we expect CSG to grow low-to-mid-single digits. We expect the combined ISG and CSG to grow 10 percent at the midpoint. Given what we expect for the first half, the full-year guide reflects slightly lower profitability expectations within CSG, traditional server, and storage.” 

Clarke was asked about AI demand uplifting storage in the earnings call, and said: “We think the opportunity to attach storage, particularly as the data structure needs more object storage and our assets in the unstructured space, [means] we have a differentiated advantage … So we remain optimistic. We have work to do. I would not be truthful if I said we were satisfied. We are not satisfied with the attach, to date. I think that there’s all upside and opportunity.”

“Our sales team is focused on that. And we continue to find opportunities to differentiate our portfolio, look at some of the subsystems in our performance and the attributes that it brings to what’s being done in these modern workloads. And we’re very optimistic about that. … as these modern workloads evolve, it’s clear that the disaggregated storage architecture is the path to the future. I think we have the best portfolio in that. And we’ll continue to focus on that with our customers.” 

He added that he is “very optimistic about the storage opportunities in Enterprise … We are building high-performance file systems and storage systems in this disaggregated storage world, where it is really new to a flexible, agile, high-performance storage system, to meet the needs of these AI workloads.” No upsurge timing prediction, though.

Comment

DDN, VAST Data, and WEKA are probably the most prominent storage suppliers to the AI market. Collectively, they probably have a larger installed AI storage base than any other supplier, with VAST possibly the leader in terms of exabytes installed. Until a research firm like IDC or Gartner assesses the AI storage market, we won’t know for sure.

Our conjecture is that, as the AI market grows its inferencing side, general IT infrastructure suppliers like Dell, HPE, and Lenovo, and storage infrastructure suppliers like NetApp and Pure will be hoping their installed base chooses their kit and not the apparently more AI training market-proven DDN, VAST, and WEKA gear. 

Dell says it has more than 3,000 AI Factory customers. If and when they need AI data storage, it’s possible that many, if not most of them, will be buying Dell storage.

AI transformation projects hamstrung by legacy apps and creaking data management


Aging data infrastructure is inflating the technical debt that companies need to pay down before they can move to AI.

Research by Pegasystems found that two-thirds of organizations say “legacy systems and applications” are preventing them from “fully embracing more modern technologies.” And almost nine out of ten fret that this is crimping their ability to keep pace with agile, innovative competitors.

Almost a quarter of companies had applications that are six to 10 years old, with almost a third having legacy apps that are 11 to 15 years old. But those are just striplings: 7 percent are running apps in the 25-to-30-year-old bracket, and 1 percent ’fessed up to applications more than 50 years old.

And presumably the underlying data structures and architectures are equally vintage. Almost a quarter of respondents said legacy apps meant “data is locked inside of them that we can’t access.”

Don Schuerman, CTO of Pegasystems, said: “Part of the challenge is that legacy systems often mean that business logic – rules, workflows, decision points – are entangled into the data systems.

“Storage and data management challenges come up in virtually every customer conversation about automation and AI projects,” he added. “The reality is that most organizations have their data trapped in silos across multiple systems, which creates significant barriers to implementing effective automation or AI solutions.”

(No) viva aging software

This leaves tech leaders tasked with delivering new digital experiences on the front end whilst simultaneously maintaining their backend systems that hold core business data and transactions. This means “trade-offs between being agile or stable because legacy systems slow them down through data replication, batch processing, and siloed architectures.”

Pega’s answer is live data integration, which it claims delivers the right data to the right process steps at the right time, managing data requests behind the scenes and adapting easily to new data sources without requiring custom code.

Schuerman said companies need to focus on the processes and experiences they want to transform, and then pull the required data along and into the cloud. “Trying to tackle the data problem outside of specific customer processes and engagement strategies can lead to science projects that run long and never lead to value.” Hence the low percentage of AI projects that have actually made it into production.

Pegasystems unwrapped the research at its PegaWorld conference in Las Vegas, where it also announced its Pega Agentic Process Fabric, which it describes as an “open agentic fabric” to orchestrate “all AI agents and systems”.

The fabric registers AI agents, workflows, and, critically, data across both Pega apps and third-party systems. It supports standards including Model Context Protocol and Agent-to-Agent.

Looked at another way, Pega Agentic Process Fabric supersedes the vendor’s Pega Process Fabric, with the new service taking the key capabilities of its predecessor and adding Agentic AI support.

The Fabric switch-up coincides with the addition of Agentic AI enhancements to the vendor’s Infinity App Studio. These include an enhanced AI developer agent, which works as a “mentor” to devs. It also delivers automated testing and accelerates UX configuration. The new features will appear in the Q3 release of Pega Infinity.

Storage News Roundup – June 2


The Cayman Islands could pitch itself as a no-questions-asked repository for data as well as cash and assets, a government official has suggested. The British Overseas Territory in the Caribbean Sea has long been an offshore banking and finance center and has no income tax, capital gains tax, or corporation tax.

Stephen Ta’Bois, the science, technology, engineering and maths specialist at the Department of Education Service, told a government workshop that the territory should develop its own AI tooling to avoid reliance on overseas technology, according to the Cayman Compass.

Ta’Bois added that the territory could be “a regional hub for the Caribbean or the overseas territories to avoid turning data over to servers based abroad.” This could also be an antidote to increasing concerns about government overreach, particularly US government overreach, which trumps local or regional data sovereignty rules and regulations. Of course, this is unlikely to go down well with Washington, which already bristles at individuals and organizations using the territory to harbour assets, never mind data.

Four-fifths of GenAI business apps will be built on existing data management platforms by 2028, Gartner senior director analyst Prasad Pore told the firm’s Data and Analytics Summit in Mumbai this week. Pore told attendees that solving real business challenges meant combining LLMs trained on public data with organizations’ own specialized data, using RAG architectures. However, this meant data management platforms had to evolve to become RAG-as-a-service platforms, which would require a far more unified management approach. The alternative was longer delivery times and sunk costs, said Pore.

Analytics vendor Sisense has launched a suite of AI-powered tools to help organizations extract insights from data. Sisense Intelligence features Assistant, an “AI-first interface” to manage analytics creation from data acquisition through to modelling, insights generation and embedding. The GenAI Suite offers “Capabilities” including Narrative, Explanation, Forecast and Trend. And a Compose SDK allows developers to use React, Angular and Vue.js to embed AI analytics into applications. Ultimately, Sisense aims to deliver Analytics Platform as a Service. Customers on its Managed Cloud platform can access the features now.

Cloud storage provider Backblaze has launched a private preview of its new enterprise web console for managing B2 Cloud Storage deployments at scale. The console offers increased security, with role-based access controls and mandatory multi-factor authentication, aligning with zero trust best practice, while also aiming to deliver a more intuitive, streamlined experience. Backblaze B2 will also give customers the ability to create buckets in any region, boosting resilience and giving more flexibility around data residency requirements.

dbt Labs unwrapped its Fusion Engine last week, built on Rust and incorporating “native SQL comprehension.” The firm reckons the new platform improves parse times by 30x compared to dbt Core, and claims that state-aware orchestration, currently in beta, is showing average 10 percent cost savings for commercial customers. Other features include IntelliSense autocomplete for SQL functions, macros, and more, hover insights, and instant refactoring. Initial support for Snowflake will be followed by support for Databricks, BigQuery, and Redshift. dbt also unveiled a VS Code extension, which will open up Fusion to local developers, and a trio of governed, accessible features: dbt Canvas for editing; dbt Insights, an AI-powered query interface; and dbt Catalog, formerly known as Explorer.

NetApp revenues rise to all-time highs, but can it sustain?

Analysis: NetApp posted record revenue this quarter, helped by all-flash array sales and public cloud storage demand.

In the fourth fiscal 2025 quarter ended April 25, NetApp reported record revenues of $1.73 billion, up 4 percent on the year. There was a $340 million profit (GAAP net income), 16.8 percent more than a year ago. Its full fy2025 revenues were its highest-ever, at $6.57 billion, 5 percent more than fy2024 revenues, with a profit of $1.2 billion, 20.7 percent up on the year. 

George Kurian

Within these low single-digit revenue rises there were some outliers. The all-flash array annual run rate rose to an all-time high of $4.1 billion, up 14 percent. This was faster growth than Pure, which registered a 12.2 percent rise to $778.5 million in its latest results. There was a record (GAAP) operating margin of 20 percent for the full year, record full year billings of $6.78 billion and record first-party and marketplace Public Cloud services revenue of $416 million in the year, up 43 percent annually.

CEO George Kurian stated: “Fiscal Year 2025 marked many revenue and profitability records, driven by significant market share gains in all-flash storage and accelerating growth in our first party and marketplace storage services. …We are starting fiscal year 2026 following a year of market share gains, armed with the strongest portfolio in the company’s history and a differentiated value proposition that addresses customers’ top priorities. Looking ahead, I am confident our continued innovation and market expansion will drive sustainable long-term growth.”

Fy2025 revenues of $6.57 billion beat fy2014’s and fy2013’s $6.33 billion. It’s taken NetApp 11 years to match and beat these high points.

Wissam Jabre has been hired as NetApp’s new CFO, having previously been Western Digital’s CFO before the Sandisk spin-off. Prior NetApp CFO Mike Berry retired but then joined MongoDB as its CFO.

Quarterly financial summary

  • Gross margin: 69.5 percent, up 0.2 percent year-over-year
  • Operating cash flow: $675 million vs year-ago $613 million
  • Free cash flow: $640 million vs $567 million last year
  • Cash, cash equivalents, and investments: $3.85 billion vs prior quarter’s $1.52 billion
  • EPS: $1.65 vs $1.37 a year ago
  • Share repurchases and dividends: $355 million vs prior quarter’s $306 million

NetApp’s two major business segments are hybrid cloud, with $1.57 billion in revenues for the quarter, up 3.2 percent, and public cloud at $164 million, up 7.9 percent.

NetApp said it increased market share, gaining almost 300 basis points in the all-flash market and almost one point in the block storage market in calendar 2024.

Looking ahead Kurian said in the earnings call: “I believe that we’ve now reached an inflection point where the growth of all-flash systems and public cloud services, reinforced by the ongoing development of the AI market, will drive sustained top-line growth. …Looking ahead, we expect these growth drivers, along with our laser focus, prioritized investments, and robust execution, to deliver more company records in fiscal year 2026 and beyond.”

+Comment

A look at NetApp’s revenue history shows that, in the recent past, it has only been able to sustain growth for about two years before declining:

Can it break this pattern moving forward? 

It reduced its workforce in April, indicating somewhat straitened trading circumstances. Kurian commented: “The global macro-economic outlook faces mixed signals with a general slowdown in growth, lingering inflation concerns, and a significantly higher level of uncertainty. Looking ahead, we expect some increased spending caution, as well as on-going friction in U.S. Public Sector and EMEA. We are incorporating an appropriate layer of caution in our outlook due to these factors.”

The revenue outlook for the first fy2026 quarter reflects this caution, being $1.53 billion +/- $75 million, a 0.7 percent decrease Y/Y at the mid-point.

Full fy2026 revenues are being guided to $6.75 billion +/- $125 million, a 2.73 percent rise on the fy2025 number at the mid-point. It is a rise, though, with Kurian saying: “We are currently negotiating sizable AI and data infrastructure modernization deals with multiple large enterprises, which we expect to close later in the year. This gives us confidence in our full-year outlook.”

William Blair analyst Jason Ader thought it was a solid fourth quarter but said the guidance was disappointing. He said: “Management noted that the guidance factors in 1) the impact from the divestiture of the Spot by NetApp business, which contributed roughly $95 million in annual sales; and 2) greater spending caution due to continued macroeconomic uncertainty, including slower growth signals, inflation concerns, tariff risk (albeit modest impact), and ongoing demand friction in the US federal sector (approximately 10 percent of sales) and EMEA region (particularly in the European manufacturing vertical).”