
Blackstone invests $300M in DDN to boost AI storage business

Private equity firm Blackstone is investing $300 million in privately held DDN, valuing the company at $5 billion and aiming to help develop its AI-related storage business.

Alex Bouzari, DDN

According to DDN CEO Alex Bouzari, as referenced in a Wall Street Journal report, DDN will use the cash “to sharply expand the AI data company’s business-customer base.”

Bouzari stated: “Blackstone’s support accelerates our mission to redefine the enterprise AI infrastructure category and scale at an even faster rate. By fueling our mission to push the boundaries of data intelligence, we can empower organizations worldwide with next-level AI solutions that drive ground-breaking innovation and deliver 10x returns on their investments.”

California-headquartered DDN (DataDirect Networks) was founded in 1998 by CEO Alex Bouzari and president Paul Bloch to provide fast-I/O storage arrays for unstructured data to high-performance computing customers such as NASA and Nvidia. They brought two existing businesses, MegaDrive and Impact Data, together to form DDN. As enterprises operating in big data analytics, seismic processing, financial services, and life sciences adopted HPC-style IT, DDN developed its file and object product technology to support them.

It raised $9.9 million in A-round funding in 2001, but the relationship with investors soured and there was a confrontational exit from the deal in 2002, with Bouzari saying: “The VCs basically exited the company in April 2002. Somehow we managed to make payroll in April. We lost roughly $3 million in the first half of 2002 and we made roughly $3 million in the second half of 2002. In 2003, we were profitable and did roughly $25 million in revenue. By 2005, we earned $45 million in revenue and broke $100 million in revenue by 2007. Last year, we hit $188 million in revenue. Right now we are above a $200 million run rate.”

Annual revenues passed $100 million in 2008 and $200 million in 2011. DDN acquired Intel’s Lustre file system engineering team in 2018. It then expanded its general enterprise storage capabilities by acquiring the bankrupt Tintri business for $60 million in September 2018, Nexenta for an undisclosed sum in May 2019, and Western Digital’s IntelliFlash hybrid SSD/HDD storage product division in September 2019.

In 2020, DDN said annual revenues were $400 million, and it had more than 11,000 customers. Revenues reached $500 million in 2023, and were estimated to be $750 million at the end of 2024, with DDN saying it’s “highly profitable,” and: “With this groundbreaking deal, DDN is poised for historic growth in 2025 following a record-breaking 400 percent growth in AI revenue in 2024.”

The core DDN business currently produces a line of ExaScaler Lustre parallel file system storage arrays and has developed them into AI400X2 appliances for AI processing storage, supporting Nvidia’s GPUDirect storage protocol. It has a strategic partnership with Nvidia.

Paul Bloch, DDN

DDN made significant progress in the generative AI market in 2023 and 2024, relying on its strong relationship with Nvidia and securing deals such as providing storage for Elon Musk’s Colossus AI supercomputer.

DDN President Paul Bloch said: “This investment enables us to execute our strategy to bring HPC-grade AI solutions to enterprises, transforming industries and delivering measurable outcomes. Our teams are laser-focused on solving real business challenges, from accelerating LLM deployments to enhancing inferencing, so our customers can unlock their data’s potential and achieve tangible ROI faster than ever before.”

We asked DDN a few questions about this investment:

B&F: Why is the money needed? Couldn’t DDN fund necessary business development itself?

DDN: While we’ve successfully self-funded for over two decades, this investment allows us to apply what we’ve learned from working with leading AI hyperscalers and labs to meet the growing needs of Fortune 500 enterprises.

B&F: Does Blackstone get a seat on DDN’s board?

DDN: Yes, Blackstone will take a seat on our board, contributing their experience in helping companies scale effectively.

B&F: Will the cash be used for engineering, go-to-market expansion, both, or something else? Acquisitions?

DDN: The funds will support applying our proven AI expertise to enterprise customers through enhanced engineering and go-to-market efforts. We wouldn’t rule out acquisitions that align with our growth strategy.

B&F: Will it help DDN better compete with VAST Data and Pure Storage?

DDN: This investment broadens our scale and scope, enabling us to compete more effectively across the industry—not just with companies seen as our traditional competition but on a much broader scale.

B&F thinks that DDN, together with VAST Data, is one of the pre-eminent storage suppliers for AI training and large-scale inferencing work. DDN is promising a significant AI-related announcement on February 20, and we think 2025 could be a billion-dollar revenue year for the firm.

Blackstone is a prominent private equity investor with over $290 billion in assets under management as of 2023. Its other recent AI-related investments include GPU farm CoreWeave, datacenter operators QTS Realty Trust and AirTrunk, and cybersecurity supplier Vectra AI. A Blackstone statement said: “DDN’s track record for delivering cutting-edge AI and HPC platforms to thousands of customers globally is just scratching the surface of the transformative impact they’ll have on the enterprise AI market. We see DDN as the clear leader in scaling enterprise-grade solutions that drive meaningful business returns for modern AI deployments.”

Synology intros ActiveProtect backup appliance line

As promised in December, Synology has announced its ActiveProtect backup and recovery appliance products, with integrated backup, recovery, and management software, and server and storage hardware.

Taiwan-based Synology supplies NAS filers, such as the DiskStation, RackStation and FlashStation, FS arrays, SANs, routers, video surveillance gear, and C2 public cloud services for backup, file storage, and personal (identity) security to primarily SMB buyers. It also has enterprise customers.

EVP Jia-Yu Liu stated: “ActiveProtect is the culmination of two decades of experience in hardware and software engineering, shaped by our ongoing collaboration with businesses worldwide and more than half of Fortune 500 companies. With ActiveProtect, we’re setting a new standard for what businesses can expect from their data protection solutions.”

ActiveProtect features global source-side deduplication, immutable backups, air-gap capabilities, and regulatory compliance support. Synology says it delivers “comprehensive data protection” and enables customers “to implement a reliable 3-2-1-1-0 backup strategy for PCs, Macs, virtual machines, databases, file servers, and Microsoft 365 accounts.”

The 3-2-1-1-0 concept means:

  • 3 – keep at least 3 copies of data
  • 2 – on at least 2 different types of media
  • 1 – with one backup copy offsite
  • 1 – and one immutable copy
  • 0 – ensuring zero errors with regular testing
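
The rule is simple enough to check mechanically. Here is a minimal Python sketch that validates a hypothetical backup plan against 3-2-1-1-0; the fields and names are our own illustration, not Synology’s software:

```python
from dataclasses import dataclass

@dataclass
class BackupCopy:
    media_type: str   # e.g. "HDD", "tape", "cloud object"
    offsite: bool     # stored at a different site?
    immutable: bool   # write-once / locked copy?
    verified: bool    # passed its last restore test?

def satisfies_3_2_1_1_0(copies: list[BackupCopy]) -> bool:
    return (
        len(copies) >= 3                              # 3 copies of the data
        and len({c.media_type for c in copies}) >= 2  # on 2 media types
        and any(c.offsite for c in copies)            # 1 copy offsite
        and any(c.immutable for c in copies)          # 1 immutable copy
        and all(c.verified for c in copies)           # 0 errors after testing
    )

plan = [
    BackupCopy("HDD", offsite=False, immutable=False, verified=True),
    BackupCopy("cloud object", offsite=True, immutable=True, verified=True),
    BackupCopy("tape", offsite=True, immutable=False, verified=True),
]
print(satisfies_3_2_1_1_0(plan))  # True
```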

The ActiveProtect Manager (APM) centralized console supports up to 150,000 workloads or 2,500 sites, offering scalability and “enterprise-grade data visibility and control.”

There are five products in the range, all with AMD processors:

  • DP320 – tabletop – protect up to 20 machines or 50 SaaS users – 2 x 8TB HDD, 2 x 400GB SSD
  • DP340 – tabletop – protect up to 60 machines or 150 SaaS users – 4 x 8TB HDD, 2 x 400GB SSD
  • DP5200 – rackmount – 1RU
  • DP7300 – rackmount – 2RU
  • DP7400 – rackmount – up to 2,500 servers & 150,000 workloads – 2RU – 10 x 20TB HDD, 2 x 38.4TB SSD
ActiveProtect appliances. From top left: DP320, DP340, DP5200, DP7300, and DP7400

All models except the DP320 use SSD caching to store backup-related metadata. Synology provides datasheets for the DP320, DP340, and DP7400, but not the DP5200 or DP7300, although it provided images of them. We understand that the DP5200 and DP7300 are future appliance products.

ActiveProtect is sold as a one-time purchase, and “once installed, ActiveProtect allows users to back up as many workloads as your storage allows. Businesses can manage up to three backup servers license-free, with optional CMS licenses available for larger, multi-appliance deployments.”

The ActiveProtect offering is available globally through Synology’s distributor and partner network. Get more information, including datasheets for the DP320, DP340 and DP7400, here.

Base DP7400 retail pricing is, we understand, €39,999, plus €1,800 for a three-year period for each additional clustered device beyond the first three. The DP320 costs €1,996 and the DP340 is priced at €4,991.

Get an ActiveProtect buyer guide here. Get a DP7400 review here.

AI data pipelines could use a hand from our features, says Komprise

Interview. AI training and inferencing need access to datasets. The dataset contents will need to be turned into vector embeddings for GenAI large language models (LLMs) to work on them, with semantic search matching vectorized requests against the vectorized data to find responses.

In one sense, providing vectorized datasets to LLMs requires extracting the relevant data from its raw sources: files, spreadsheets, presentations, mail, objects, analytic data warehouses, and so on, turning it into vectors and then loading it into a store for LLM use, which sounds like a traditional ETL (Extract, Transform and Load) process. But Krishna Subramanian, Komprise co-founder, president, and COO, claims this is not so. The Transform part is done by the AI process itself.
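
To make the contrast concrete, here is a minimal extract-embed-load sketch in Python, using the open source sentence-transformers library and an in-memory store as stand-ins. This illustrates the general pattern, not Komprise’s implementation:

```python
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

model = SentenceTransformer("all-MiniLM-L6-v2")

# Extract: pull raw text from source files.
docs = [p.read_text(errors="ignore") for p in Path("sources").glob("*.txt")]

# The embedding model itself performs the "transform" into vectors.
doc_vectors = model.encode(docs, normalize_embeddings=True)

# Load: keep the vectors in a store; semantic search is a similarity lookup.
def search(query: str, top_k: int = 3):
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # cosine similarity, since vectors are normalized
    return [(docs[i][:80], float(scores[i])) for i in np.argsort(-scores)[:top_k]]
```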

Komprise provides Intelligent Data Management software to analyze, move, and manage unstructured data, including the ability to define and manage data pipelines to feed AI applications such as LLMs. When LLMs are used to search and generate responses from distributed unstructured datasets, the data needs to be moved into a single place that the LLM can access, passing through a data pipeline.

Filtering, selecting, and moving data from source datasets is intrinsic to what the Komprise software does, and here Subramanian discusses AI data pipeline characteristics and features.

Blocks & Files: Why are data pipelines becoming more important today for IT and data science teams?

Krishna Subramanian, Komprise

Krishna Subramanian: We define data pipelines as the process of curating data from multiple sources, preparing the data for proper ingestion, and then mobilizing the data to the destination. 

Unstructured data is large, diverse, and unwieldy – yet crucial for enterprise AI. IT organizations need simpler, automated ways to deliver the right datasets to the right tools. Searching across large swaths of unstructured data is tricky because the data lacks a unifying schema. Building an unstructured data pipeline with a global file index is needed to facilitate search and curation. 

On the same note, data pipelines are an efficient way to find sensitive data and move it into secure storage. Most organizations have PII (Personally Identifiable Information), IP, and other sensitive data inadvertently stored in places where it should not live. Data pipelines can also be configured to move data based on its profile, age, query, or tag into secondary storage. Because of the nature of unstructured data, which often lives across storage silos in the enterprise, it’s important to have a plan and a process for managing this data properly for storage efficiencies, AI, and cybersecurity data protection rules. Data pipelines are an emerging solution for these needs.
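
To make the global file index idea concrete, here is a toy Python sketch – our illustration, not Komprise’s product – that walks storage mount points, records per-file metadata in SQLite, and queries it for curation:

```python
import os, sqlite3

db = sqlite3.connect("file_index.db")
db.execute("""CREATE TABLE IF NOT EXISTS files
              (path TEXT PRIMARY KEY, size INTEGER, mtime REAL, owner INTEGER)""")

def index_mount(root: str) -> None:
    # Record basic metadata for every file under a mount point.
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            p = os.path.join(dirpath, name)
            try:
                st = os.stat(p)
            except OSError:
                continue  # skip unreadable or vanished files
            db.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                       (p, st.st_size, st.st_mtime, st.st_uid))
    db.commit()

index_mount("/mnt/nas1")
# Curate: for example, find large cold files that are tiering candidates.
rows = db.execute("""SELECT path, size FROM files
                     WHERE size > 1e9 ORDER BY mtime ASC LIMIT 10""").fetchall()
```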

Blocks & Files: What kind of new tools do you need?

Krishna Subramanian: You’ll need various capabilities, aside from indexing, many of which are part of an unstructured data management solution. For example, metadata tagging and enrichment – which can be augmented using AI tools – allows data owners to add context and structure to unstructured data so that it can be easily discovered and segmented. 

Workflow management technologies automate the process of finding, classifying, and moving data to the right location for analysis, along with monitoring capabilities to ensure data is not lost or compromised through the workflow. Data cleansing/normalization tools and, of course, data security and governance capabilities track data that might be used inappropriately or against corporate or regulatory rules. 

There is quite a lot to consider when setting up data pipelines; IT leaders will need to work closely with research teams, data scientists, analysts, security teams and departmental heads to create the right workflows and manage risk.

Blocks & Files: How are data pipelines evolving with GenAI and other innovations?

Krishna Subramanian: Traditionally, data pipelines have been linear, which is why ETL was the norm. These tools were designed for structured and semi-structured data sources where you extract data from different sources, transform and clean up the data, and then load it into a target data schema, data warehouse or data lake.

But GenAI is not linear; it is iterative and circular because data can be processed by different AI processes, each of which can add more context to the data. Furthermore, AI relies on unstructured data which has limited metadata and is expensive to move and load.

Since data is being generated everywhere, data processing should also be distributed; this means data pipelines must no longer require moving all the data to a central data lake first before processing. Otherwise, your costs for moving and storing massive quantities of data will be a detriment to the AI initiative. Also, many AI solutions have their own ways of generating RAG and vectorization of unstructured data. 

Unlike ETL, which focuses heavily on transformation, data pipelines for unstructured data need to focus on global indexing, search, curation, and mobilization since the transformation will be done locally per AI process.

Blocks & Files: What role do real-time analytics play in optimizing data pipelines?

Krishna Subramanian: Real-time analytics is a form of data preprocessing where you can make certain decisions on the data before moving it. Data preprocessing is central in developing data pipelines for AI because it can iteratively enrich metadata before you move or analyze it. This can ensure that you are using the precise datasets needed for a project – and nothing more. Many organizations do not have distinct budgets for AI, at least not on the IT infrastructure side of the house, and must carve funds out from other areas such as cloud and datacenter. Therefore, IT leaders should be as surgical as possible with data preparation to avoid AI waste.

Blocks & Files: How can companies leverage data pipelines to improve collaboration between data science teams, IT, and business units?

Krishna Subramanian: Data pipelines can create data workflows between these different groups. For instance, researchers who generate data can tag the data – otherwise called metadata enrichment. This adds context for data classification and helps data scientists find the right datasets. IT manages the data workflow orchestration and safe data movement to the desired location or can integrate third-party AI tools to work on datasets without moving them at all. This is a three-way collaboration on the same data facilitated by smart data workflows leveraging data pipelines.

Blocks & Files: What trends do you foresee in data pipeline architecture and how can enterprises prepare for these evolving technologies and approaches?

Krishna Subramanian: We see that data pipelines will need to evolve to address the unique requirements of unstructured data and AI. This will entail advances in data indexing, data management, data pre-processing, data mobility, and data workflow technologies to handle the scale and performance requirements of moving and processing large datasets. Data pipelines of unstructured data for AI will focus heavily on search, curation, and mobilization, with the transformation happening within the AI process itself.

Blocks & Files: Could Komprise add its own chatbot-style interface?

Krishna Subramanian: Our customers are IT people. They know how to build a query. They know how to use our UI. What they really want is connecting their corporate data to AI. Can we reduce the risk of it? Can we improve the workflow for it? That’s a higher priority than us adding chat, which is why we have prioritized our product work more around the data workflows for AI.

Blocks & Files: Rubrik is aiming to make the data stored in its backups available for generative AI training and/or inference with its Annapurna project. Rubrik is not going to supply its own vectorization facilities or its own vector database, or indeed its own large language models. It’s going to be able to say to its customers: you can select what data you feed to these large language models. Now that’s backup data. Komprise will be able to supply real-time data. Is that a point of significant difference?

Krishna Subramanian: Yes, that’s a point of significant difference … We were at a Gartner conference last month and … Gartner did a session on what do customers want from storage and data management around AI. And … a lot of people think AI needs high performance storage. You see all this news about GPU-enabled storage and storage costs going up and all of that. And that’s not actually true. Performance is important, but only for model training. And model training is 5 percent of the use cases. 

In fact, they said 50 percent of enterprises will never train a model or even engineer a prompt. You know 95 percent of the use cases is using a model. It’s inferencing. 

And a second myth is AI is creating a lot of data, or, hey, you’re backing up data. Can you run AI on your backup data? Yes, maybe there is some value to that, but most customers really want to have all their corporate data across all their storage available to AI, and that’s why Gartner [is] saying data management is more important than storage for AI.

We build a global file index. And this is not in the future. We already do this. You point this at all your storage and … we’re actually creating a metadata base. We’re creating a database of all the metadata of all the files and objects that we look at. And this is not backup data. It’s all your data. It’s your actual data that’s being stored. 


So whether you back it up or not, we will have an index for you. With our global file index you can search across all the data. You can say, I only want to find benefits documents because I’m writing a benefits chat bot. And anytime new benefits documents show up anywhere, find those and feed those to this chat bot agent and Komprise will automatically run that workflow. 

And every time new documents show up in Spain or in California or wherever, it would automatically feed that to that AI and it would have an audit trail. It will show what was spent. It will show which department asked for this. It will keep all of that so that for your data governance for AI, you have a systematic way to enable that.
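
As a purely hypothetical sketch of such a workflow – the names are ours, not Komprise’s API – the Python below polls a file index for newly arrived benefits documents, hands each to an ingest hook, and writes an audit record:

```python
import json, sqlite3, time

def run_workflow(db: sqlite3.Connection, ingest, last_run: float) -> float:
    # Find documents that arrived since the last scheduled execution.
    now = time.time()
    new_docs = db.execute(
        "SELECT path FROM files WHERE path LIKE '%benefits%' AND mtime > ?",
        (last_run,)).fetchall()
    for (path,) in new_docs:
        ingest(path)  # e.g. push into the chatbot's RAG knowledge base
        audit = {"ts": now, "path": path, "workflow": "benefits-chatbot",
                 "requested_by": "HR department"}
        print(json.dumps(audit))  # in practice, append to an audit log
    return now  # becomes last_run for the next scheduled run
```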

Blocks & Files: Would it be fair to say that the vast majority of your customers have more than one unstructured data storage supplier and, building on that, those suppliers cannot provide the enterprise-wide object and file estate management capability you can?

Krishna Subramanian: Yes, that is exactly correct. Sometimes people might say, “Well, no, I don’t really have many suppliers. I might only use NetApp for my file storage.” But how much do you want to bet they’re also using AWS or Azure? So you do have two suppliers then. If you’re using hybrid cloud, by definition, you have more than one supplier, yes. I agree with your statement. And that’s what our customers are doing. That’s why this global file index is very powerful, because it’s basically adding structure to unstructured data across all storage. 

And to your point, storage vendors are trying to say: look, my storage file system can index data that’s sitting on my storage.

Blocks & Files: So you provide the ability to build an index of all the primary unstructured data there is in a data estate and regulate access to it, to detect sensitive information within it, because you build metadata tables to enable you to do that. So you could then feed data to a large language model, which would satisfy compliance and regulation needs concerning access. It would be accurate, it would be comprehensive, and you can feed it quickly to the model?

Krishna Subramanian: That’s correct. And we actually have a feature in our product called smart data workflows, where you can just build these workflows.

This is a contrived example; you know you can write a chatbot in Azure using Azure OpenAI. The basic example they have is a chatbot that has read a company’s health documents, and somebody can then go and ask it a question: what’s the difference in our company between two different health plans? And then it’ll answer that based on the data it was given, right? 

So now let’s say California added some additional benefits. In the California division of this company, Komprise finds those documents, feeds them into that OpenAI chatbot, and then, when the user asks the same question, it gives you a very specific answer, because the data was already fed in, right? 

But really what’s more important is what’s happening behind the scenes. Azure OpenAI has something called a knowledge base. It was trained with certain data, but you can actually put additional data, corporate data, in a Blob container, which it indexes regularly, to augment the process. So the RAG augmentation is happening through that container. 
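
A minimal sketch of that augmentation step, using the azure-storage-blob client to drop a new document into the container the indexer watches – the connection string, container name, and file name are placeholders:

```python
from pathlib import Path
from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("knowledge-base")

def feed_document(path: str) -> None:
    # Upload (or overwrite) the file; the indexer picks it up on its next
    # pass, so the chatbot reflects the new content without retraining.
    with open(path, "rb") as f:
        container.upload_blob(name=Path(path).name, data=f, overwrite=True)

feed_document("california_benefits_update.pdf")
```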

Komprise has indexed all the storage in our global file index. So you just build a workflow saying: find anything with benefits. Komprise automatically does that regularly, and that’s how this workflow runs. And the beauty of this is you don’t have to rerun it; anybody could be creating a new benefits document, and it will be available to your chatbot. 

Part of the problem is generative AI can become out of date because it was trained a while back. So this addresses relevancy. It addresses recency, and it also addresses data governance, because you can tell Komprise: if a source has sensitive data, don’t send it. So it can actually find sensitive data. That’s a feature we’re adding and will be announcing soon.

You can tell Komprise to look for social security numbers, or you can even tell it to look for a particular keyword or a particular regular expression, because maybe in your organization certain things are sensitive because of the way you label them. Komprise will find that inside the contents, not just the file name, and it will exclude that data if that’s what you want. So it can find personally identifiable information – the common stuff, social security numbers and so forth. But it can also find corporately sensitive information, which PII doesn’t cover.
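
The content-level screening idea can be illustrated in a few lines of Python. This is a toy regex scan, not Komprise’s engine:

```python
import re
from pathlib import Path

PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # US social security numbers
    "internal-label": re.compile(r"PROJECT-REDWOOD", re.I),  # org-specific keyword
}

def is_sensitive(path: Path) -> bool:
    # Scan file contents, not just the file name.
    text = path.read_text(errors="ignore")
    return any(p.search(text) for p in PATTERNS.values())

# Exclude sensitive files from the dataset being fed to the AI.
safe_to_send = [p for p in Path("staging").glob("**/*.txt") if not is_sensitive(p)]
```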

Blocks & Files: If I’m a customer that doesn’t have Komprise, that doesn’t have any file lifecycle management capability at all, then probably my backups are my single largest cross-vendor data store. So it would make sense to use them for our AI. But as soon as that customer wheels you in, the backup store is behind the times, it’s late, and you can provide more up-to-date information.

Krishna Subramanian: Yes, that’s what we feel. And, by the way, we point to any file system. So if a backup vendor exposes their backup data in a file system, we can point to that too. It doesn’t matter to us. If a backup vendor stores its data on an object storage system? Yes, it works, because we are reading the objects. So if I happen to be a customer with object-based appliances storing all my VM backup data, we say, fine, no problem, we’ll index them – because we’re reading the objects. We don’t need to be in their proprietary file system. That’s the beauty of working through standards.

Blocks & Files: I had thought that object storage backup data and object storage were kind of invisible to you.

Krishna Subramanian: Well, it’s not invisible as long as they allow the data to be read as objects. It would be invisible if they didn’t actually put the whole file as an object – if they chunked it up and it was proprietary to their file system – because then the fact that they use an object store doesn’t matter; they’re not exposing data as objects. So if they expose data as objects or files, we can access it. 

As with NetApp, even though NetApp chunks the data, it exposes it via file and object protocols, and we read it as a file or an object. We don’t care how ONTAP stores it internally.
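
The standards-based access being described can be sketched with boto3 against any S3-compatible endpoint; the endpoint, bucket, and credentials below are placeholders:

```python
import boto3

# Any store that speaks the S3 protocol can be enumerated and read,
# regardless of which application wrote the objects.
s3 = boto3.client("s3", endpoint_url="https://objectstore.example.com")

def read_objects(bucket: str):
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            yield obj["Key"], body  # index or inspect like any other file
```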

Blocks & Files: How is Komprise’s business growth doing?

Krishna Subramanian: Extremely well. New business is growing, I think, over 40 percent again this year. And the overall business is also growing rapidly. Our net dollar retention continues to be north of, I think, 110 percent. Some 30 to 40 percent of our new business comes from expansions, from existing customers.

Pure Storage qualifies Micron’s 276-layer flash

Pure Storage has expanded its strategic collaboration with Micron to include Micron G9 QLC NAND for future DirectFlash Module (DFM) products. 

Pure’s flash storage arrays use DFMs as solid-state storage devices, with up to four DFMs mounted on a blade carrier. Its latest 150 TB DFM has Micron G8 232-layer QLC NAND qualified for production. Pure also uses Kioxia flash chips in its DFMs.

Micron claims its Generation 9 (G9) flash is the world’s fastest TLC (3 bits/cell) NAND. It fits in an 11.5 mm x 13.5 mm package, which Micron says makes it the smallest high-density NAND available. Pure is using it in QLC (4 bits/cell) format and says the Micron collaboration “enables the high-capacity and energy-efficient solutions that hyperscalers require … for future DirectFlash Module products.” It won a hyperscaler customer deal for its DFM technology in December, with its NAND technology replacing disk drive storage.

Bill Cerreta, Pure Storage

Bill Cerreta, GM for Hyperscale at Pure, stated: “Pure Storage’s collaboration with Micron is another example of our significant momentum bringing the benefits of all-flash storage technology to hyperscale environments. With Micron’s advanced NAND technology, Pure Storage can further optimize storage scalability, performance, and energy efficiency for an industry with unparalleled requirements.” 

Pure believes its partnership with Micron provides improved performance and lower latency with lower energy consumption and highly scalable systems at a reduced total cost of acquisition and ownership.

The next-generation DFM technology from Pure will provide 300 TB of capacity. Micron’s G9 NAND has 19 percent more capacity per chip than 232-layer G8 NAND, so a 300 TB DFM built with these chips will still need more chips than a 150 TB DFM built with Micron G8 NAND; doubling the capacity outpaces the 19 percent density gain.
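
A back-of-envelope check, assuming a nominal 100 chips in the G8-based 150 TB module (our arithmetic, not Pure’s figures):

```python
g8_chips_150tb = 100                   # assumed chip count for a 150 TB G8 DFM
per_chip_g8 = 150 / g8_chips_150tb     # 1.5 TB per G8 chip
per_chip_g9 = per_chip_g8 * 1.19       # 19 percent denser: ~1.785 TB per G9 chip
g9_chips_300tb = 300 / per_chip_g9     # chips needed for a 300 TB G9 DFM
print(round(g9_chips_300tb))           # ~168, i.e. ~1.68x the G8 chip count
```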

Commercial-off-the-shelf SSDs have reached 122.88 TB in capacity with Phison’s Pascari D205V drive and 122 TB with Solidigm’s D5-P5336 product. 

IBM, like Pure, makes its own NAND drives, called FlashCore Modules, and it has a 115 TB maximum effective capacity version – after onboard compression – using 176-layer QLC NAND, available in its fourth-generation FCM range. That’s two generations behind Micron’s G9 flash, and we envisage IBM moving to latest-generation QLC NAND and at least doubling its maximum capacity later this year.

Samsung says R&D expenses hit operating profit

The latest quarterly results from Samsung fell short of expectations as the company tries to claw its way into the profitable high-bandwidth memory (HBM) market where SK hynix has a commanding lead, followed by Micron.

HBM stacks layers of DRAM on a base die and connects them to GPUs across an interposer, creating memory products with higher bandwidth and capacity than the socket-connected memory used by x86 processors. HBM costs more than ordinary DRAM, and SK hynix and Micron revenues are soaring as demand for high-bandwidth GPU memory rises in lockstep with AI application use. In November, SK hynix added four more layers to its 12-Hi HBM3e memory chips to increase capacity from 36 GB to 48 GB, and is set to sample this 16-Hi product this year. The faster HBM4 standard should arrive this year as well, with a stack bandwidth of around 1.5 TBps compared to HBM3e’s 1.2-plus TBps.

Samsung filed preliminary figures for the quarter ended December 31, with ₩75 trillion ($51.4 billion) in revenues, up 10.7 percent annually but below analyst estimates, and ₩6.5 trillion ($4.5 billion) operating profit, lower than the forecast ₩8.96 trillion ($6.1 billion) and 30 percent less than the prior quarter. 

Samsung revenues and profits

Management said: “Our operating profit for 4Q24 is projected to come in significantly below market expectations. This explanatory note is provided to assist in the understanding of the key factors behind the results and alleviate uncertainties prior to the release of our full results in the 4Q24 earnings call.”

“Despite sluggish demand for conventional PC and mobile-focused products, revenue in the Memory business reached a new all-time high for the fourth quarter, driven by strong sales of high-density products. However, Memory operating profit declined, weighed on by increased R&D expenses aimed at securing future technology leadership and the initial ramp-up costs tied to expanding production capacity for advanced technologies.”

TrendForce analyst Eden Chung told AFP he believes that Samsung Foundry faces multiple challenges, including “order losses from key customers in advanced processes, the gradual end-of-life of certain products, and a slow recovery in mature process segments.”

Manufacturing HBM chips is more profitable than traditional DRAM. Once GPU market leader Nvidia qualifies a manufacturer’s HBM product, sales take off. Samsung has fallen behind SK hynix and Micron in getting its latest HBM chips qualified by Nvidia. The company replaced its semiconductor unit’s leadership in November as it responded to slow memory chip sales, the second such exec reshuffle that year. A month earlier, the company had acknowledged it was in crisis and there were concerns about its technology’s competitiveness.

Mobile phone memory demand is relatively weak and domestic DRAM suppliers in China are taking a larger proportion of memory sales there.

According to Reuters, Nvidia CEO Jensen Huang told reporters at CES that Samsung is working on a new HBM chip design and he was confident that Samsung could succeed in this project.

Scaling RAG with RAGOps and agents

COMMISSIONED: Retrieval-augmented generation (RAG) has become the gold standard for helping businesses refine their large language model (LLM) results with corporate data.

Whereas LLMs are typically trained with public information, RAG enables businesses to augment their LLMs with context- or domain-specific knowledge from corporate documents about products, processes, or policies.

RAG’s demonstrated ability to augment results for corporate generative AI services improves employee and customer satisfaction, and thus overall performance, according to McKinsey.

Less clear is how to scale RAG across an enterprise, which would enable organizations to turbocharge their GenAI use cases. Early efforts to codify repeatable processes to help spin up new GenAI products and services with RAG have run into limitations that impact performance and relevancy.

Fortunately, near-term and medium-term solutions offer possible paths to ensuring that RAG can scale in 2025 and beyond.

RAGOps rising

LLMs that incorporate RAG require access to high-quality corporate data. However, ensuring the quality and availability of relevant data tends to be challenging because the data is scattered across different departments, systems, and formats.

To maximize their effectiveness, LLMs that use RAG also need to be connected to sources from which departments wish to pull data – think customer service platforms, content management systems and HR systems, etc. Such integrations require significant technical expertise, including experience with mapping data and managing APIs.

Also, as RAG models are deployed at scale, they can consume significant computational resources and generate large amounts of data. This requires the right infrastructure, the experience to deploy it, and the ability to manage the data it supports across large organizations.

One approach to mainstreaming RAG that has AI experts buzzing is RAGOps, a methodology that helps automate RAG workflows, models and interfaces in a way that ensures consistency while reducing complexity.

RAGOps enables data scientists and engineers to automate data ingestion and model training, as well as inferencing. It also addresses the scalability stumbling block by providing mechanisms for load balancing and distributed computing across the infrastructure stack. Monitoring and analytics are executed throughout every stage of RAG pipelines to help continuously refine and improve models and operations.
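
Schematically, a RAGOps-style pipeline wraps each automated stage in monitoring hooks so that ingestion, retrieval, and generation can all be measured and refined. The Python below is an idealized sketch of the pattern, not any vendor’s product:

```python
import time

def monitored(stage):
    # Decorator standing in for the pipeline's monitoring/analytics layer.
    def wrap(fn):
        def inner(*args, **kwargs):
            t0 = time.time()
            out = fn(*args, **kwargs)
            print(f"[metrics] stage={stage} latency={time.time() - t0:.3f}s")
            return out
        return inner
    return wrap

@monitored("ingest")
def ingest(docs):  # stand-in for chunking, embedding, and loading the store
    return [f"chunk::{d}" for d in docs]

@monitored("retrieve")
def retrieve(index, query):  # stand-in for a top-k vector lookup
    return [c for c in index if query.lower() in c.lower()][:3]

@monitored("generate")
def generate(query, context):  # stand-in for the LLM call with context
    return f"Answer to '{query}' using {len(context)} retrieved chunks"

index = ingest(["HR policy", "Travel policy"])
print(generate("travel policy", retrieve(index, "travel")))
```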

McKinsey, for instance, uses RAGOps to help its Lilli GenAI platform sift through 100,000 curated documents. Lilli has answered more than 8 million prompts logged by roughly three-quarters of McKinsey employees searching for tailored insights into operations.

The coming age of agentic RAG

As an operating model for organizations seeking to harness more value from their GenAI implementations, RAGOps promises to land well in organizations that have already exercised other operating frameworks, such as DevOps or MLOps.

Yet some organizations may take a more novel approach that follows the direction the GenAI industry is headed: marrying RAG with agentic AI, which would enable LLMs to adapt to changing contexts and business requirements.

Agents designed to execute digital tasks with minimal human intervention are drawing interest from businesses seeking to delegate more digital operations to software. Some 25 percent of organizations will implement enterprise agents by 2025, growing to 50 percent by 2027, according to Deloitte research.

Agentic AI with RAG will include many approaches and solutions, but many scenarios are likely to share some common traits.

For instance, individual agents will assess and summarize answers to prompts from a single document or even compare answers across multiple documents. Meta agents will orchestrate the process, managing individual agents and integrating outputs to deliver coherent responses.
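
A toy rendering of that pattern, with simple stand-ins for the per-document LLM calls and the meta agent’s synthesis step:

```python
class DocAgent:
    # Answers questions scoped to a single document.
    def __init__(self, name: str, text: str):
        self.name, self.text = name, text

    def answer(self, question: str) -> str:
        return f"{self.name}: {self.text[:40]}..."  # stand-in for an LLM call

class MetaAgent:
    # Orchestrates individual agents and integrates their outputs.
    def __init__(self, agents):
        self.agents = agents

    def answer(self, question: str) -> str:
        partials = [a.answer(question) for a in self.agents]
        return " | ".join(partials)  # stand-in for a synthesis/merge step

meta = MetaAgent([DocAgent("policy.pdf", "Remote work is allowed two days a week"),
                  DocAgent("handbook.pdf", "Expenses must be filed within 30 days")])
print(meta.answer("What are the remote work rules?"))
```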

Ultimately, agents will work within the RAG framework to analyze, plan and reason in multiple steps, learning as they execute tasks and altering their strategies based on new inputs. This will help LLMs better respond to more nuanced prompts over time.

In theory, at least.

The bottom line

The future looks bright for GenAI technologies, which will flow from research labs to corporate AI factories, part of a burgeoning enterprise AI sector.

For example, the footprint of models will shrink even as they become more optimized to run efficiently on-premises and at the edge on AI PCs and other devices. RAG standardization, including software libraries and off-the-shelf tools, will grow.

Whether your organization is embracing RAGOps or adopting agentic AI, solutions are emerging to help organizations scale RAG implementations.

Agentic RAG on the Dell AI Factory with NVIDIA, when applied to healthcare for example, helps reconcile the challenges of utilizing structured data, such as patient schedules and profiles, alongside unstructured data, such as medical notes and imaging files, while maintaining compliance with HIPAA and other requirements.

That’s just one bright option. Many more are emerging to help light the way for organizations in the midst of their GenAI journey.

Learn more about Dell AI Factory with NVIDIA.

Brought to you by Dell Technologies.

ExaGrid claims 4,590 customers as it points to growth

ExaGrid claims it is fielding “record bookings” and revenue for the final 2024 quarter – up 20 percent annually – and for the full year, although as a privately held company it provides no comparative figures.

The company provides target backup appliances featuring a fast restore landing zone and deduplicated retention zone. This has a Retention Time-Lock that includes a non-network-facing tier (creating a tiered air gap), delayed deletes, and immutability for ransomware recovery. ExaGrid was founded in 2002 and last raised funds in 2011, with a $10.6 million round, taking the total raised to $107 million. It says it is debt-free.

President and CEO Bill Andrews stated: “ExaGrid is continuing to grow with healthy financials, as shown by the past 16 quarters that we have maintained positive P&L, EBITDA, and free cash flow. We have sales teams in over 30 countries worldwide, and customer installations in over 80 countries. We continue to invest in our channel partnerships and worked with more reseller partners in 2024 than ever before, and we plan to expand on our partner programs in 2025.”

Bill Andrews, ExaGrid

He added: “As we constantly innovate our Tiered Backup Storage, we look forward to announcing new updates, integrations, and product announcements throughout 2025 and expect continued growth and success.”

ExaGrid says it recruited 189 new customers in the quarter, of which 76 were six- and seven-figure deals, taking the customer count to near 4,600. Its average customer count increase has been 150 per quarter over the past 16 quarters, and its competitive win rate is 75.3 percent, it says.

ExaGrid customers and deals

Andrews claims: “There are only three solutions in the market: standard primary storage disk, which does not have dedicated features for backup and becomes expensive with retention; inline deduplication appliances … that are slow for backups and restores, use a scale-up architecture, and don’t have all the security required for today’s backup environments; and ExaGrid Tiered Backup storage, which offers many features for backup storage and many deep integrations with backup applications. When customers test ExaGrid side by side, they buy ExaGrid 83 percent of the time – the product speaks for itself.”

Andrews told B&F: “We continue to replace primary storage behind the backup application from Dell, HPE, NetApp, Hitachi, IBM, etc. We continue to replace Dell Data Domain and HPE StoreOnce inline/scale-up deduplication appliances. We continue to replace [Cohesity-Veritas] NetBackup FlexScale appliances.”

Geographically, there was “great participation from the US, Canada, Latin America, Europe, the Middle East, and Africa. We have hired ten sales teams in Asia Pacific and expect the investment to start kicking in this quarter. We continue to maintain a 95 percent customer retention rating, 99 percent of customers on maintenance and support, and an NPS score of +81.”

The company will be “adding many service provider features for the large Backup as a Service MSPs” in 2025.

Last October we learned that ExaGrid reckoned it had 3 percent of the $6 billion backup storage market, implying $180 million in annual revenues. While its Q4 2024 revenues increased 20 percent annually, we don’t know the quarter-on-quarter increase. Andrews claimed in October last year that the company was making solid progress toward achieving $1 billion in annual sales.

Andrews said at the time: “Our product roadmap throughout 2025 will be the most exciting we have ever had, especially in what we will announce and ship in the summer of 2025. We don’t see the competitors developing for backup storage. Our top competitors in order are Dell, HPE, NetApp, Veritas Flexscale Appliances, Pure, Hitachi, IBM, Huawei. Everyone else is a one-off sighting here and there.”

NetApp does not provide a deduping target backup appliance equivalent to Dell’s PowerProtect, HPE’s StoreOnce or Quantum’s DXi products. But NetApp BlueXP provides a backup and recovery control plane with the ability to back up NetApp ONTAP systems to NetApp StorageGRID object storage or AWS, Azure, and Google cloud object stores.

The big change in the target backup appliance and storage market in the last couple of years has been the emergence of object storage targets for backup software, such as Cloudian and startup Object First. ExaGrid supports the Amazon S3 object protocol.

Micron using Phison controller for latest Crucial gaming gumstick SSD

Micron says its Crucial P510 PCIe gen 5 SSD will be built with G9 276-layer 3D NAND and use Phison’s PS5031-E31T SSD controller.

Phison is showing this controller and the PS5028-E28 PCIe gen 5 controller at the Consumer Electronics Show (CES) 2025 in Las Vegas. Alongside these products, Phison is also demonstrating its Pascari 122.88 TB D205V enterprise SSD – “the world’s highest-capacity PCIe Gen5 enterprise SSD” – as well as the P2251-2e, which it billed as “the world’s first native USB4 SoC”.

Michael Wu, Phison

Michael Wu, Phison US GM and president, said in a statement: “Emerging use cases in gaming, content creation, and AI training are driving notebooks to handle unprecedented, data-heavy workloads.”

The Crucial P510 provides up to 11,000/9,500 MBps sequential read/write bandwidth for gamers, creatives, and other users needing high-speed SSD data access. Its mainstream market PS5031-E31T controller has a DRAM-less design, meaning the host system’s memory is used for NAND management operations, lowering the drive’s cost. Micron has not yet released drive format and capacity information.

Jonathan Weech, senior director of product marketing and management for Micron’s Commercial Products Group, said: “By pairing Phison’s advanced controller technology with Micron’s G9 NAND flash, we’ve engineered the Crucial P510 SSD to meet the needs of today’s technologically discerning users. With improved power efficiency for extended notebook battery life and razor-sharp load times, this drive offers a distinct advantage for creators and gamers alike.”

The Crucial P310 SSD uses the PCIe gen 4 bus and has 500 GB, 1 TB, and 2 TB capacity options in its M.2 format. It has read and write speeds of 7,100 and 6,000 MBps respectively, with random reads of up to 1 million IOPS and random writes of up to 1.2 million IOPS.

Micron Crucial P310

Micron’s current M.2 format Crucial T700 PCIe gen 5 NVMe SSD provides up to 12,400 MBps sequential reads and up to 11,800 MBps sequential writes (up to 1,500K IOPS random reads/writes), and is faster than the Crucial P510. T700 capacities are 1 TB, 2 TB, and 4 TB.

We envisage the Crucial P510 being a PCIe gen 5 version of the P310, using the same M.2 format, and with 500 GB, 1 TB, and 2 TB capacities and even, perhaps, reaching 4 TB.

Phison PS5028-E28

Phison’s flagship PS5028-E28 PCIe gen 5 SSD controller, built on TSMC’s 6nm process node, is said to “enable a market-first sequential speed combination of 14.5 GBps read/write.”

Index Engines wants to stop ransomware payments

Interview. CyberSense developer Index Engines uses AI and machine learning to look inside databases, as well as backup and other file types, to detect signals of malware-caused corruption. It claims a 99.99 percent detection precision rate and has an SLA based on that. CyberSense, it claims, detects over 200 signs of a possible ransomware attack, including:

  • Files encrypted in place
  • Encrypted files and files with new extensions added
  • Files moved into encrypted archives
  • Deleted files
  • Encrypted pages within database files
  • Files replaced with decoys
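
CyberSense’s detection methods are proprietary, but one classic signal for files encrypted in place is a jump in byte entropy: encrypted content looks like random noise. The toy Python sketch below illustrates the idea only:

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    # Shannon entropy in bits per byte: ~8.0 for encrypted/random data,
    # typically much lower for documents and other structured content.
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted(path: str, threshold: float = 7.5) -> bool:
    with open(path, "rb") as f:
        sample = f.read(1 << 20)  # the first 1 MiB is enough for a signal
    return len(sample) > 0 and byte_entropy(sample) > threshold

# A file whose entropy jumps between backup scans is a candidate for
# in-place encryption. Compressed formats also score high, which is why
# real products combine many signals rather than entropy alone.
```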

It sells its technology through white label deals with Dell, IBM, Infinidat, and other third parties. According to the sales pitch, end-user customers can identify the last known good copy of data and recover from malware attacks more efficiently, without having to pay extortionate amounts to retrieve their own hijacked data.

The company hired Neil DiMartinis as CRO in July last year, taking over from the departed Tony Craythorne, and we caught up with the exec to discuss the company’s plans.

Blocks & Files: Why did you join Index Engines?

Neil DiMartinis, Index Engines

Neil DiMartinis: What attracted me is … first and foremost, the product we have… is in a market and in a space that is growing. It’s critical to the businesses, and we have a niche, we are doing something that nobody else is doing. So we absolutely believe we’re on the cusp of something really exciting and that there’s going to be a lot of growth over the coming years.

Blocks & Files: Index Engines has three OEM white label-type partnerships with Dell, Infinidat, and IBM, and there are other partnerships. Your background has involved dealing with partners. Is that going to play a large part in what you do? Recruiting high-level partners?

Neil DiMartinis: If you look at how we’re taking the CyberSense product, which is doing the scanning, the closer we get that to production storage and production data, the better [it is] for customers, [and] better for the entire security experience. That is the goal and you look at the partners who can bring the solutions to bear in that storage realm. So, yes, bringing on more partnerships in the storage arena is definitely the goal. 

Beyond the storage vendors, as I look at the security challenges, they’re never going to be solved by one company. We really need to, as an industry, work together with alliances to help customers solve this issue, because it’s only going to get more complex and more challenging. That’s absolutely the goal for us; to not only expand our OEM relationships, but also our ecosystem partnerships that are working in this space and bring value to our customers.

Index Engines CyberSense graphic

Blocks & Files: Would you also be thinking of having direct sales to large enterprises?

Neil DiMartinis: It’s something we have discussed internally, and I wouldn’t say it’s on the horizon for the foreseeable future. In terms of the next 12 to 18 months, as we start to think about an Index Engines overarching solution, it could be a possibility down the road. I think, right now, we have so much on our plate working with our current OEM partners that it makes the most sense to continue to leverage that and then figure out, solution-wise, how we take it to market on our own, if that does become the vision.

Blocks & Files: How does Index Engines help its partners compete with or work with other suppliers such as Palo Alto or Rubrik or virtually anybody else we could care to mention? There are massive, massive companies in this area.

Neil DiMartinis: Absolutely. We believe that partnering with the Palo Altos and some of these other security players just complements what they’re doing to tell a better story to their customers. How we help customers together is that, what we’re doing through our scanning engine is providing information on the data and giving customers a clean copy of the data to go back and recover from. When you start to couple that with your other security measures that you’re leveraging within your infrastructure, within your overarching environment, we’re going to help. We’re a piece of the pie. We’re a piece of that recovery whereby, hey, you have a good copy of data. This is where it exists now. You can take it to a clean room and really start to recover smarter. 

Blocks & Files: Then you could, from Palo Alto’s point of view, provide a better indication of the infection status of a customer’s data assets, which then feed into Palo Alto’s higher-level security apparatus?

Neil DiMartinis: Yes. I think I would look at it more as the Palo Altos of the world are much more focused on prevention. How do we keep the bad actors out? Right? CyberSense is much more focused on the bad actors that are already in. 

So now, what do we do? And that’s where we complement them. The overarching security story is when you get past the prevention, which unfortunately happens; it happens everywhere. These hackers and these bad actors are getting better and better. 

And, once they’re in, now what is your plan to recover? And that’s where we come into play because, to your point, with feeding up reports to whatever overarching security manager you’re using, you can now see and look at: where’s my last good copy? What does my CyberSense Security Index look like, and so on and so forth, and identify that good, clean copy. Then, using your run books and playbooks and knowing how to recover, you have that process to go recover from your last known good copy.

Blocks & Files: CyberSense can look inside Commvault backups and detect Word files or other Microsoft files and detect signals of infections. Rubrik would say that it can do pretty much the same thing with its backups. So are there situations in which your partners will be competing with Rubrik?

Neil DiMartinis: There are other partners around, other competitors out there, that are looking at the metadata, whereas we’re going down to the file level. We’re looking at the file level. We’re going down to the content level, and we are looking at all the file extensions, etc. And there’s also the fact that we have an engine that’s been around, has 20 years of engineering effort behind it, and has AI and machine learning that is constantly looking and scanning on a daily basis, learning about your environment, learning what the anomalies could be within the data. And it’s that deep-level scanning that we’re doing…. So when it truly comes to the recovery, CyberSense has gone a lot deeper in looking at the content, so you can tell where the blast zone was and what’s been impacted and what’s not.

Blocks & Files: You’re going into the content inside, for instance, a Commvault backup. But you could equally do that for Rubrik or Veritas, Cohesity or Veeam, and you’re, in that sense, a potential partner for them?

Neil DiMartinis:  The key is the integration between us and those platforms and they obviously have to have the API first to connect in. But, yes, once we have access to them, and develop a connector from CyberSense that will attach to anybody’s storage system, backup platform, what have you, we have the ability to scan almost any data set.

Blocks & Files: Is there scope for Index Engines using GenAI large language model technology to help its end users use your technology better? I’m thinking the answer is likely going to be no.

Neil DiMartinis:  We have our own AI engine and machine learning that learns to identify the variants that are out there from a virus perspective and whatnot, but using large language models and generative AI, we would say, is not applicable in this case to our product, per se.

Blocks & Files: What can you tell us about the development roadmap?

Neil DiMartinis: The product continues to improve its integration through partnerships, and we have a lot of exciting things to come down the road: the user interface, the reporting and the dashboards that we’re creating – and it’s all for the customer experience. It’s going to be an exciting year, and we’re looking forward to it.

Bootnote

Get a white paper discussing CyberSense and databases here. Check out a more general CyberSense white paper here and a datasheet here.

Storage news ticker – January 6

Box CEO Aaron Levie posted a chart tracking historical storage costs on X/Twitter:

Historical price of computer storage

He said: “When we started Box, the hard drives in our servers were ~80 GB, and we prayed that the economics would improve over time. Now, we’ll have hard drives with 50 TB in a couple years. A 600x improvement. Storage is becoming too cheap to meter.”

An X/Twitter post from semiconductor commentator @Jukanlosreve, known for accurate leaks of storage tech, claims that, much as 3D NAND regressed to older process technology with larger line widths than the then-current planar NAND tech, so too could 3D DRAM. The Chinese fabber CXMT could potentially build 3D DRAM with its current DRAM process technology, rather than being blocked because it can no longer import advanced lithography equipment subject to US supply bans. @Jukanlosreve suggests “seeing China’s development as being blocked by US sanctions is too myopic.”

A Tom’s Hardware report says a 1 TB QLC Huawei eKitStore Xtreme 200E PCIe Gen 4 SSD is on sale in South Korea for $32.00. This is cheap: a 1 TB Kioxia Exceria Plus Portable SSD external drive with a USB 3.2 Gen 2 interface is available on Amazon for $63.99.

IBM Storage Ceph For Beginners

IBM has published a 300-page PDF guide for Ceph, “IBM Storage Ceph for beginners.” It’s intended as an introduction to Ceph, and IBM Storage Ceph in particular, from installation to deployment of all unified services, including block, file, and object, and is suited to helping customers evaluate the benefits of IBM Storage Ceph in a test or POC environment. Access it here.

Taipei-based analyst Dan Nystedt posts on X/Twitter that Micron reckons the HBM market will grow strongly between now and 2030, increasing from $16 billion in 2024 to $64 billion in 2028 and reaching $100 billion by 2030. This means the 2030 HBM market would be larger than the entire DRAM industry (including HBM) was in 2024. Micron HBM chip production at its fab in Hiroshima, Japan, will hit 25,000 wafers by the end of 2024, he said. Its Taiwan fabs, Fab 11 in Taoyuan and A3 in Taichung, are ramping up 1β process technology, and the company is expected to increase its HBM wafer start capacity to 60,000 in 12 months’ time.

Wedbush analyst Matt Bryson wrote about reports from the Korea Herald claiming Samsung may have picked Micron to be the primary supplier of LPDDR5 memory for the Galaxy S25 in lieu of Samsung’s own DRAM. “If accurate, the shift of Samsung’s handset division away from internal memory supply would be another damaging statement regarding the current state of Samsung DRAM production, albeit in line with concerns we have been encountering through 2024 with both DDR5 and HBM products.”

NeuroBlade SQL acceleration card

NeuroBlade’s SQL analytics acceleration is now accessible via Amazon Elastic Compute Cloud (EC2) F2 instances. Using AMD FPGAs and EPYC CPUs, “this integration brings blazing-fast query performance, reduced costs, and unmatched efficiency to AWS customers.” 

The card integrates with popular open source query engines like Presto and Apache Spark, delivering “market-leading query throughput efficiency (QpH/$).” NeuroBlade also provides reference integrations of its accelerator technology with Prestissimo (Presto + Velox) and Apache Spark. These setups enable customers to run industry-standard benchmarks such as TPC-H and TPC-DS, or to test their own workloads and datasets, facilitating apples-to-apples comparisons between accelerated and non-accelerated implementations on the same cloud infrastructure. It should demonstrate “the performance gains and cost efficiencies of NeuroBlade’s Acceleration technology in comparison to state of the art, native vectorized processing on the CPU.”

… 

Storage exec Matt Dargis

Other World Computing (OWC) has appointed Matt Dargis as CRO reporting directly to OWC founder and CEO Larry O’Connor. Prior to joining OWC, Dargis served as SVP US Sales at ACCO Brands and VP of North America Sales at Kensington where he rebuilt the North America sales team and created a new go-to-market strategy and three-year plan to double sales while improving the bottom line. 

Dargis said: “OWC is well positioned for rapid expansion and growth due to an obsessive commitment to product quality and its customers. My goal is to ensure our products are sold in all relevant commercial and consumer channels around the globe and that all customers can experience the OWC quality and brand commitment. I expect to immediately hire some new roles to double down on our efforts and support of the channel.”

OWC has launched a ThunderBlade X12 professional-grade RAID product, and the USB4 40 Gbps Active Optical Cable, for long-distance connectivity. It also announced GA of the Thunderbolt 5 Hub with “unparalleled connectivity.” We’re told that the ThunderBlade X12 features speeds up to 6,500 MBps – twice the performance of its predecessor – and capacities from 12 to 96 TB with RAID 0, 1, 5, and 10 configurations. It’s intended for workflows involving 8K RAW, 16K video, or VR production, and will be available in March.

The Active Optical Cable, with universal USB-C connectivity and optical fiber technology, provides up to 40 Gbps of stable bandwidth, up to 240 W of power delivery, and up to 8K video resolution at distances of up to 15 feet. It can connect to Thunderbolt 4/3 and USB 4/3/2 USB-C equipped docks, displays, eGPUs, PCIe expansions, external SSDs, RAID storage, and accessories.

The Thunderbolt 5 Hub can turn a single cable connection from your machine into three Thunderbolt 5 ports and one USB-A port. It has up to 80 Gbps of bi-directional data speed – up to 2x faster than Thunderbolt 4 and USB 4 – and up to 120 Gbps for higher display bandwidth. It’s generally available for $189.99.

OWC and Hedge announced an alliance whereby every OWC Archive Pro purchase will now include a license for Hedge’s Canister software, for macOS and Windows, supporting streamlined Linear Tape-Open (LTO) backups – a $399 value at no additional cost.

The US Department of Commerce awarded Samsung up to $4.745 billion in direct funding under the CHIPS Incentives Program’s Funding Opportunity for Commercial Fabrication facilities. This follows the previously signed preliminary memorandum of terms, announced on April 15, 2024, and the completion of the department’s due diligence. The funding will support Samsung’s investment of over $37 billion in the coming years to turn its existing presence in central Texas into a comprehensive ecosystem for the development and production of leading-edge chips in the United States, including two new leading-edge logic fabs and an R&D fab in Taylor, as well as an expansion of the existing Austin facility.

SK hynix will unveil its “Full Stack AI Memory Provider” vision at CES 2025. It will showcase samples of HBM3E 16-layer products, Solidigm’s D5-P5336 122 TB SSD, and on-device AI products such as LPCAMM2 and ZUFS 4.0. It will also present CXL and PIM (Processing in Memory) technologies, along with modularized versions, CMM (CXL Memory Module)-Ax and AiMX, designed to be core infrastructure for next-generation datacenters. CMM-Ax adds computational functionality to CXL memory. AiMX is SK hynix’s Accelerator-in-Memory card, which specializes in large language models using GDDR6-AiM chips.

SK hynix "Full Stack AI Memory Provider" vision

Referring to Solidigm’s 122 TB SSD, Ahn Hyun, CDO at SK hynix, said: “As SK hynix succeeded in developing QLC (Quadruple Level Cell)-based 61 TB products in December, we expect to maximize synergy based on a balanced portfolio between the two companies in the high-capacity eSSD market.”

Wedbush analyst Matt Bryson wrote to subscribers about reports that SK hynix and Samsung are slowing down development of new NAND technology (e.g. 10th gen and beyond), and taking older generation facilities offline to convert production to newer processes (8th and 9th generation output), effectively moderating NAND output in the near term. Bryson said NAND market conditions are soft but should recover over the next few quarters.

Cloud data warehouser Snowflake has released Arctic Embed L 2.0 and Arctic Embed M 2.0, new versions of its frontier embedding models, which now allow multilingual search “without sacrificing English performance or scalability.” In this Arctic Embed 2.0 release, there are two variants available for public use: a medium variant focused on inference efficiency, built on top of Alibaba’s GTE-multilingual with 305 million parameters (of which 113 million are non-embedding parameters), and a large variant focused on retrieval quality, built on top of a long-context variation of Facebook’s XLM-R Large, which has 568 million parameters (of which 303 million are non-embedding parameters). Both sizes support a context length of up to 8,192 tokens. They deliver top-tier performance in non-English languages, such as German, Spanish, and French, while also outscoring their English-only predecessor Arctic Embed M 1.5 at English-language retrieval.
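
Both models are published on Hugging Face and, as we read the public model cards, can be exercised with the sentence-transformers library along these lines (the model ID and query prompt name come from those cards; the sample texts are ours):

    # Sketch of multilingual retrieval with Arctic Embed L 2.0, following our
    # reading of the public Hugging Face model card.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("Snowflake/snowflake-arctic-embed-l-v2.0")

    queries = ["Wie funktioniert Objektspeicher?"]  # German: how does object storage work?
    docs = [
        "Object storage manages data as objects with metadata and unique IDs.",
        "A parallel file system stripes data across many storage servers.",
    ]

    # Queries use a dedicated prompt; documents are encoded as-is.
    q_emb = model.encode(queries, prompt_name="query")
    d_emb = model.encode(docs)

    # Similarity scores: higher means a better match.
    print(model.similarity(q_emb, d_emb))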

Single-vector dense retrieval performance of open source multilingual embedding models

The chart shows single-vector dense retrieval performance of open source multilingual embedding models with fewer than 1 billion parameters. Scores are average nDCG@10 on MTEB Retrieval and the subset of CLEF (ELRA, 2006) covering English, French, Spanish, Italian, and German.
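
For readers unfamiliar with the metric, nDCG@10 scores the top ten results of a ranked list against the ideal ordering of the same relevance judgments, discounting gains logarithmically by rank. A minimal sketch of the standard formulation:

    # nDCG@10: discounted cumulative gain of the top 10 results, normalized
    # by the ideal (best possible) ordering of the same relevance judgments.
    import math

    def dcg(relevances):
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

    def ndcg_at_10(ranked_relevances):
        ideal_dcg = dcg(sorted(ranked_relevances, reverse=True)[:10])
        return dcg(ranked_relevances[:10]) / ideal_dcg if ideal_dcg > 0 else 0.0

    # Example: graded relevance of the first results a model returned.
    print(ndcg_at_10([3, 2, 0, 1, 0, 0, 2, 0, 0, 0]))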

Xinnor and NEC Deutschland GmbH have supplied over 4 PB of NVMe flash storage, running the Lustre file system on top of Xinnor’s xiRAID software configured with RAID 6, to one of Germany’s top universities for AI research. The university’s AI research center required a storage system capable of handling the computational needs of over 20 machine learning research groups. We understand it is not the Technical University of Darmstadt (TU Darmstadt), which has more than 20 research groups focused on machine learning, artificial intelligence, and related fields. The deployment includes a dual-site infrastructure:

  • First location: 1.7 PB of storage with NDR 200 connectivity, supporting 15 x Nvidia DGX H100 supercomputers.
  • Second location: 2.8 PB of storage with 100 Gbps connectivity, supporting 28 nodes with 8 x Nvidia GeForce RTX 2080 Ti each and 40 nodes with 8 x Nvidia A100 each.
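
As a back-of-envelope reminder of what that RAID 6 choice costs in capacity, each RAID group gives up two drives’ worth of space to parity. The drive size and group width below are purely illustrative, not taken from the case study:

    # RAID 6 usable capacity: (n - 2) of the n drives in a group hold data.
    def raid6_usable_tb(drive_tb: float, drives_per_group: int) -> float:
        assert drives_per_group >= 4, "RAID 6 needs at least 4 drives"
        return drive_tb * (drives_per_group - 2)

    # e.g. a 24 x 15.36 TB NVMe group yields ~338 TB usable (illustrative).
    print(raid6_usable_tb(15.36, 24))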

The center’s previous setup consisted of five servers connected to a 48-bay JBOF with 1.92 TB SATA SSDs from Samsung, protected by Broadcom hardware RAID controllers and running the Lustre file system.

The AI center’s compute infrastructure consists of 15 x Nvidia DGX H100 supercomputers, 28 nodes with 8 x Nvidia GeForce RTX 2080 Ti each, and 40 nodes with 8 x Nvidia A100 each. These compute nodes were previously often left waiting for the storage subsystem, wasting compute resources, delaying project execution, and reducing the number of projects the AI center could handle.

Xinnor’s xiRAID+Lustre deployment with the NVMe SSDs was up and running in a few hours and has boosted the center’s capabilities, we’re told. Specific models that took tens of minutes to load with the previous storage infrastructure are now available within a few seconds, the company added. The full case study is available here.

Storage stack layers and players going into 2025

Analysis: Right now, the two dominant storage demand and development drivers are cyber-resilience and security on the one hand, and AI training and inference on the other.

The IT storage picture at the start of 2025 is vibrant and healthy, with developments ongoing at all levels of the storage stack, driven by demands for more memory, more and larger SSDs, higher-capacity disk drives, better data serving to GPUs, and preparation for AI-driven analysis. We also see improved data protection and security, and object storage being promoted into higher-performing access roles for AI training and inference.

Let’s step back for a second and ask what actually constitutes IT storage and what it is for. The second question is easier to answer. Storage exists to supply data for applications running in servers or other host processors and to store the data these applications produce. Everything that uses stored data is an application in this sense.

We can envisage IT storage as a stack of layers: basic recording media types at the lowest level, running up through devices that package this media into drives, next-level devices that package these drives into usable hardware systems such as SANs and filers, the software that controls these systems, and then upper-level software that presents storage as a protocol- and API-accessed facility to applications.

That includes SANs with block-level data access, filers with file-level access, and object storage. But the applications could also be transactional or analytic databases that present their storage as a protocol-accessed facility to customer relationship management software systems, or AI-driven fraud detection analytic apps using a data warehouse.

Data can be separated into primary data, used for databases, and secondary data – copies of one sort or another, used for protection against primary data loss or corruption. There is a third level: tertiary or archive data, used to keep reference data for the long term.

We can try and encapsulate all these ideas in a storage layers diagram:

Storage layers diagram

It has two sidebars: a business model one on the left and a backup-to-cyber-resilience one on the right. These apply at multiple levels in the storage stack and don’t constitute a single layer.

We can use this diagram to place technologies and suppliers generally, not precisely, in the stack and see how they relate to each other and evolve over time. Here’s an example showing some data management suppliers:

Storage suppliers diagram

As you move up the stack from bottom to top, the number of startup suppliers generally increases, as does their funding. At the bottom level, the NAND, disk, and tape media suppliers haven’t changed much at all for several years after earlier supplier consolidations. For example, there are the three disk drive suppliers – Seagate, Toshiba, and Western Digital. Over 90 percent of DRAM is supplied by Micron, Samsung, and SK hynix, with Nanya and Winbond having a niche market presence.

However, there is startup activity in the DRAM connectivity area, where the CXL concept of connected external memory has encouraged the development of CXL hardware and software suppliers, such as Panmnesia and UniFabriX. There are also 3D DRAM developments; witness startup Neo Semiconductor.

NAND supply has been pretty stable since Intel sold off its NAND division to SK hynix, where it became Solidigm; the suppliers are Kioxia, Micron, Samsung, SK hynix/Solidigm, Western Digital, and China’s YMTC. Kioxia has just gone public. Western Digital is splitting into separate disk drive and NAND/SSD (SanDisk?) businesses this year, and there may be some supplier consolidation activity here.

The hardware array layer was dominated by block and file protocol incumbents in the 2000-2010 era, with Dell, Hitachi Vantara, HPE, IBM, and NetApp. We saw the entry of Pure Storage in 2009, Infinidat in 2010, StorONE in 2011, and VAST Data in 2016. Since then, no new significant storage array startups have emerged. VAST’s DASE architecture has, however, influenced HPE, caused NetApp to initiate an ONTAP development project, and coincided with Quantum’s Myriad Storage operating system development.

The enterprise filer area has seen relatively little development, with Dell and NetApp dominating. But Qumulo entered this market in 2012, WEKA in 2013, and VAST in 2016. HPC parallel file system players include incumbents DDN (Lustre), IBM with Storage Scale, and Panasas, now renamed VDURA. Intel’s DAOS, BeeGFS, and Quobyte are also present here.

There is frantic ongoing development in the data container space, especially in analytics, with data warehouses and big data-era Hadoop transitioning to data lakes and lakehouses to include more unstructured data types. Snowflake was founded in 2012, grew like crazy, and went public in 2020. Databricks was founded in 2013 and has grown so much that it has raised a mind-boggling $14 billion in VC funding. Some $10 billion of that was raised last year, surely a record funding amount for a storage startup, and one that provides a high-pressure backdrop to any forthcoming Databricks IPO.

The GenAI large language model boom is driving surging growth in this area, with specialized vector database suppliers appearing, such as Pinecone, founded in 2019, and Milvus, started by Zilliz in 2017. Existing data-type-consolidating players like SingleStore are determined not to be left behind.

Where would we locate CTERA, Egnyte, Nasuni, and Panzura? They would be placed in the protocol layer, under file and in the Cloud-to-edge category as they all supply collaboration facilities. Panzura is moving into the data management area after buying Moonwalk and developing its Symphony offering. Nasuni, with private equity now owning a majority of its business, is moving heavily into AI, as is CTERA.

Where do computational storage suppliers fit? We have both computational drives – ScaleFlux, for example – and computational storage systems, like storage array controllers running containerized application software, hinted at by VAST Data and Dell.

Where are DPUs and SmartNICs situated? There is a DPU-assisted category in the Hardware Type layer, second from the bottom. At one time, there looked to be a separate group of DPU startup suppliers – Pensando and Fungible, for example – but they have all been bought and we now have incumbent suppliers dominating this area: Amazon, AMD (Pensando), Intel, Microsoft with Azure (Fungible), and Nvidia with its BlueField products.

DataCore, in receipt of AI development dollars, is in the StorageOS layer, along with StorPool and Red Hat’s Ceph. 

Where are Dell PowerProtect, ExaGrid, Object First, Quantum DXi, and HPE StoreOnce located? These are all specialized arrays providing target backup appliance functionality.

Cyber-resilience and SaaS are reshaping the backup area as Cohesity, Commvault, Dell, Rubrik, and Veeam tussle to lead the data protection plus security market, with energetically developing players like Druva, HYCU, Keepit, and others active as well.

We can expect CXL to have more of a presence in 2025, but the two main significant storage market drivers should continue to be AI and cyber-resilience.

Storage news ticker – December 20

This is the last article I will post until the new year. It has been a fascinating 2024 with storage dominated by SSD, HDD, and array developments, the need for faster file access and better unstructured data management, the rise of cyber-resilience issues and the onrush of GenAI. I hope all the readers of these articles, and all the suppliers mentioned in them, have an excellent Christmas break, and look forward to learning more about suppliers and their developments in 2025.

We record quarterly revenue numbers for publicly owned storage suppliers and here is a chart, normalized to HPE fiscal quarters, showing them up until now, nearly the end of 2024:

Download full size image here.

We can immediately see that the big money is being made by the flash+SSD suppliers, followed by the hard disk drive suppliers and Dell. The rise from the depths of the revenue trough in fiscal 2023 is clearly apparent.

It is quite crowded in the sub-$2 billion/quarter area and we have created a second chart showing only the vendors in this category:

Download full size chart here.

NetApp dominates, and HPE would be in second place if it actually revealed its storage numbers, which it stopped doing a year ago. Rapidly growing Snowflake is in third place, followed by Pure Storage. Nutanix has a very respectable fifth place. Then we enter the data protection suppliers’ segment with Rubrik and Commvault in close company followed by N-able, Quantum, and Backblaze. Lenovo’s presence in the chart key is a historical oddity as it once reported its storage revenues and then stopped doing so.

Data management supplier Datadobi told us more about its GenAI plans.

B&F: Datadobi offers tools to manage unstructured data. How will GenAI’s LLMs influence these tools?

Datadobi: The more enterprises adopt GenAI, the more they will try to select the proper data to refine models or to feed to RAG.

B&F: Will you be able to set unstructured data management policies and have a Datadobi ‘agentic AI’ operate the policies and identify mismatches between the current state and the policies?

Datadobi: It’s a roadmap item to provide APIs to execute data filtering and assert if a certain file satisfies a certain filter (= policy).
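
Datadobi has not published this API, but conceptually such a filter is a predicate over file metadata. A purely hypothetical sketch of the idea:

    # Hypothetical illustration only; not Datadobi's actual API. A "policy"
    # is a filter that asserts whether a given file satisfies it.
    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class FileMeta:
        path: str
        size_bytes: int
        last_accessed: datetime

    def stale_large_file(f: FileMeta) -> bool:
        """True if the file is over 100 MB and untouched for a year."""
        return (f.size_bytes > 100 * 2**20
                and datetime.now() - f.last_accessed > timedelta(days=365))

    inventory = [
        FileMeta("/share/scratch/model.ckpt", 6 * 2**30, datetime(2023, 5, 1)),
        FileMeta("/share/docs/readme.txt", 2_048, datetime.now()),
    ]
    tier_candidates = [f for f in inventory if stale_large_file(f)]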

B&F: Could the agentic AI initiate actions to fulfill the policies? Could Datadobi have a number of agentic AIs? A data tiering one? A security one? An AI pipeline one? Would Datadobi develop its own agentic AI tools?

Datadobi: Yes, to all three. Datadobi will offer APIs to execute actions. Data deletion, archival, moving, copying, quarantining, changing security permissions … are all possible actions, in addition to data tagging.

Hitachi Vantara said La Molisana, a leading Italian pasta company, selected its Virtual Storage Platform One Block array, equipped with NVMe flash drives, to support its growth. The implementation included the integration of Veeam backup and Hitachi’s immutable snapshot functionality, providing La Molisana with data protection and business continuity.

Infinidat says GigaOm has recognized it as a Leader in the December 13, 2024 GigaOm Radar Report for Primary Storage. Analyst Whit Walters reviewed 21 suppliers of block and file primary storage arrays and software-only products. In early 2024, and with different analysts, GigaOm reviewed primary storage for large enterprises and mid-sized businesses separately.

The earlier reports had substantially different supplier positioning. The January large enterprise report covered Dell, Fujitsu, Hitachi Vantara, HPE, IBM, Infinidat, NetApp, Pure Storage, Seagate, and Synology. The February mid-size report looked at DataCore, Dell, DDN, Hitachi Vantara, HPE, IBM, Infinidat, iXsystems, Lightbits Labs, NetApp, Pure Storage, StorONE, StorPool, and Synology. The new report includes all of these plus Cohesity, Nutanix, VAST Data, WEKA, and Zadara. Many of the 21 are in the top-right Maturity-Platform Play quadrant, which was comparatively under-populated in the earlier reports. Also, the Feature Play-Innovation quadrant is now empty, whereas before it had some suppliers located there. The change of analyst has been accompanied by a change of analysis.

12 vendors, 57 percent of the suppliers included, did not actively participate, meaning the analyst only read their documentation. Infinidat has made the report available here.

Kioxia IPO’d on December 17 and, on the third day of trading, the stock is up 18 percent. Wedbush analyst Matt Bryson said: “Despite a tough earnings report for Micron, where NAND was highlighted as a primary point of weakness … this highlights the fact that Kioxia (and WDC’s SanDisk division by proxy) are likely being already valued at trough levels.”

Data manager Komprise discussed its views on the data management needs of organizations adopting GenAI, making several points. GenAI adopters need data management more than faster storage, and AI LLM responses are only as good as the data made available to the LLM. Retrieval-augmented generation (RAG) provides proprietary data to generically trained LLMs, and Komprise’s software can deliver securely accessed and appropriately selected real-time data – not old backup data – from an organization’s entire unstructured data estate to such LLMs. It provided an 11-slide deck to make its points, and you can download the slide deck here.
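
In outline, RAG is straightforward: retrieve the most relevant current documents, then hand them to the LLM as context. Here is a generic minimal sketch, with a toy word-overlap retriever standing in for the embeddings and enterprise-wide index a product like Komprise’s would use:

    # Generic RAG outline, not Komprise-specific. A real deployment would
    # use vector embeddings and an enterprise-wide unstructured data index.
    def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
        qwords = set(query.lower().split())
        ranked = sorted(corpus,
                        key=lambda doc: len(qwords & set(doc.lower().split())),
                        reverse=True)
        return ranked[:k]

    def build_prompt(query: str, corpus: list[str]) -> str:
        context = "\n".join(retrieve(query, corpus))
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    corpus = [
        "Q3 sales in EMEA grew 12 percent year on year.",
        "The backup retention policy is 35 days for primary data.",
    ]
    print(build_prompt("What is the backup retention policy?", corpus))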

….

PEAK:AIO claims its new power-optimized solution reduces the energy consumption of large-scale NVMe storage by between 50 and 70 percent while maintaining full performance. See more in a YouTube video.

SK hynix has been awarded $458 million under the CHIPS and Science Act to build an HBM chip packaging facility, part of its $3.87 billion HBM manufacturing investment in West Lafayette, Indiana, and for AI-related R&D activities there. The US Department of Commerce also plans to make available $500 million in government loans for the SK hynix project. SK hynix CEO Kwak Noh-Jung said: “SK hynix looks forward to collaborating with the US government, the state of Indiana, Purdue University and our US business partners to build a robust and resilient AI semiconductor supply chain in the United States.”

The Korea Herald says Samsung Electronics is in discussions with Washington for a subsidy worth $6.4 billion in direct funding. 

The Elec reports SK hynix has won a large order to supply high-bandwidth memory (HBM) to Broadcom, to be installed on the AI computing chip of a big tech company. Broadcom manufactures application-specific ICs (ASICs) designed by its big tech customers. It said earlier this month that it was developing AI chips with three large cloud service providers – likely Google, Meta, and ByteDance.

UniFabriX is planning to support CXL 3.1 features now, such as the Dynamic Capacity Device (DCD) for dynamic memory allocation, even though the CXL Consortium’s official target is 2027.