
GenAI, LLMs, and agents are transforming storage

Analysis: GenAI is washing through the IT world like water flooding across a landscape when a dam breaches. It is being viewed by many suppliers as an epoch-changing event, similar to the arrival of the internet, and feared by some as a potential dotcom bubble-like event.

Be that as it may, IT storage is being strongly altered by AI, from the memory-storage interface upward, as the rising tide of GenAI lifts all storage boats. A look at the storage world reveals six ways GenAI is transforming the world of block, file, and object storage.

At its lowest level, where storage systems talk to a server’s memory, the arrival of Nvidia GPUs with their high-bandwidth memory (HBM) has put a premium on file storage system and array controller processors, and their DRAM, getting out of the way, letting NVMe flash drives connect directly to a GPU’s memory and transfer data at high speed using remote direct memory access (RDMA) and the GPUDirect protocol.
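
To make that data path concrete, here is a minimal, hedged sketch of what a GPUDirect-style read can look like from application code, assuming NVIDIA’s GPUDirect Storage and the RAPIDS kvikio Python bindings; the file path and sizes are illustrative, not taken from any vendor’s documentation.

```python
# Illustrative sketch only: read a file straight into GPU memory using
# GPUDirect Storage (cuFile) via the RAPIDS kvikio bindings, assuming a
# GDS-capable NVMe and filesystem stack. The path and size are made up.
import cupy as cp
import kvikio

def load_shard_to_gpu(path: str, nbytes: int) -> cp.ndarray:
    buf = cp.empty(nbytes, dtype=cp.uint8)   # destination buffer in GPU HBM
    f = kvikio.CuFile(path, "r")
    try:
        future = f.pread(buf)                # DMA from NVMe toward GPU memory
        future.get()                         # wait for the transfer to finish
    finally:
        f.close()
    return buf

# shard = load_shard_to_gpu("/mnt/dataset/shard-0000.bin", 1 << 30)  # hypothetical path
```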

DapuStor, DDN, Dell, Huawei, IBM (Storage Scale), NetApp, Pure Storage, VAST Data, WEKA, YanRong, plus others such as PEAK:AIO and Western Digital (OpenFlex) are active here. Even Nutanix aims to add GPUDirect support.

This style of unstructured data transfer is being extended to object storage so that the petabytes of data held there can be freed up for use in GPU-based AI training and, subsequently, inferencing. See the news from Cloudian, MinIO, and Scality in recent months.

Storage media manufacturers are reacting to GenAI too. The SSD manufacturers and NAND fabricators recognize that GenAI needs fast read access to lots of data, meaning they must produce high-capacity drives – 61.44 TB and, more recently, 122 TB SSDs – using affordable QLC (4 bits/cell) 3D NAND. GenAI training also needs fast job checkpointing to enable quicker training job restarts.

Solidigm recognized this need early on with its 61.44 TB D5-P5336 drive in July 2023 and has been followed by Micron, its parent SK hynix, and Samsung. Phison has also entered this market, matching Solidigm’s latest 122 TB drive with its own 122.8 TB Pascari D205V.

We will probably see news of double that capacity by late this year or early 2026. The use of hard disk drives (HDDs) for AI training is not happening. They are too slow and the drives too limited in capacity. Where GPUs are used for AI inferencing, SSDs will certainly be the storage choice as well, for the same speed and capacity reasons, and that will likely be true for x86 servers too. It’s likely that AI PCs, if they take off, will all be using SSDs and not HDDs for identical reasons.

What this means is that HDDs will only be used for GenAI secondary storage and, so far, that has not happened to any significant degree. Seagate, Western Digital, and no doubt Toshiba are pinning their hopes of HDD market expansion on GenAI data storage needs, and seem confident it will happen.

The tape market has not been directly affected by GenAI data storage needs at all, and likely will not be.

At levels above drive media in the storage stack, we have block array, filer, and object storage systems. The filer suppliers and, as we saw above, the object storage suppliers have nearly all responded by enabling GPUDirect access to their drives. Several have built AI-specific systems, such as Pure Storage’s AIRI offering. Dell, VAST Data, DDN, WEKA, and others have shown sales increases by gaining Nvidia SuperPOD certification.

With GenAI chatbots being trained on unstructured data transformed into vector embeddings, no GPUDirect-like access has been provided for block storage, which holds the structured data critical to transactional databases and ERP systems.

There is activity in the knowledge graph area to make such data available for AI training, as in the cases of Graphwise and Illumex.

Storage array and data platform suppliers are all transforming their software to support the addition of proprietary and up-to-date unstructured data to augment AI inference by GenAI’s large language models (LLMs) trained on older and more general data. Such data has to be vectorized and the resulting vectors stored in a database for use by the LLM in retrieval-augmented generation (RAG).

Existing non-RDBMS database, data warehouse, and data lake suppliers, such as SingleStore, are adding vector storage to their products. Database startups like Pinecone and Zilliz have developed specialized vector databases, promising better performance and enhanced support for LLMs.

The data warehouse and lakehouse vendors are in a frenzy of GenAI-focused development to be the data source for AI training and inference data. The high point of this was Databricks getting a $10 billion VC investment late last year to continue its GenAI business building evolution.

A fifth storage area affected greatly by GenAI is data protection, where vendors have realized that their backup stores hold great swathes of data usable by GenAI agents. Vendors such as Cohesity, Commvault, and Rubrik are offering their own AI agents, like Cohesity’s Gaia, and are also developing RAG support facilities.

In general, no data store vendor can afford to ignore RAG, as it’s presumed all data stores will have to supply data for it. Supplying such data is not as simple as giving API access to an LLM and stepping aside, letting the model extract whatever data it needs. An organization will generally have many different data stores, and enabling their contents to be appropriately filtered – excluding private information or data above an accessing LLM’s access privileges – will need the GenAI equivalent of an extract, transform (into vectors), and load (ETL) pipeline.
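
As a rough illustration of that pipeline, here is a minimal sketch; the embed function, vector_store object, and access-level scheme are hypothetical stand-ins for whatever embedding model, vector database, and permission model an organization actually uses.

```python
# Hedged sketch of a RAG ingest (ETL) pipeline: extract documents, filter out
# records the querying LLM should never see, embed the rest, and load the
# vectors into a vector store. All helper names are hypothetical.
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    text: str
    access_level: int      # e.g. 0 = public, 3 = restricted

def etl_for_rag(docs: list[Doc], max_access_level: int,
                embed, vector_store) -> int:
    """embed(text) -> list[float]; vector_store.upsert(id, vector, metadata)."""
    loaded = 0
    for doc in docs:
        if doc.access_level > max_access_level:
            continue                      # exclude data above the LLM's privileges
        vector = embed(doc.text)          # transform: text -> embedding
        vector_store.upsert(doc.doc_id, vector, {"text": doc.text})
        loaded += 1
    return loaded
```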

Data management and orchestration suppliers like Arcitecta, Datadobi, DataDynamics, Hammerspace, and Komprise are all playing their part in mapping data sources, providing a single virtual silo, and building data pipelines to feed the data they manage into LLMs.

Data storage suppliers are also starting to use GenAI agents inside their own offerings to help with support, for example, or to simplify and improve storage product administration and security. This will affect all suppliers of storage systems, and AIOps will be transformed by the use of GenAI agents; think agentic AIOps. For example, Dell has its APEX AIOps software, which is available for its PowerStore and other arrays.

The cyber-resilience area will need to withstand GenAI agent-assisted malware attacks and will certainly use GenAI agents in building responses to such attacks.

We are going to see the ongoing transformation of the storage world by GenAI throughout 2025. It seems unstoppable and should, aside from malware agents, be beneficial.

How enterprise AI can ease the data gravity burden

COMMISSIONED: Among the many tough decisions IT leaders face is where to best host AI workloads.

Whether you’re deploying traditional AI applications that predict supply chain performance or launching generative AI digital assistants that serve information to customers, the location options for such software are myriad.

Yet the calculus has changed. For the first time in probably a decade, the public cloud is not necessarily the first destination that IT leaders turn to. In fact, Gartner expects that 50 percent of critical enterprise application workloads will reside outside the public cloud through 2027.

Of course many factors go into workload placement. Yet it almost always starts with a single consideration: your data. And AI demands a lot of it. Especially the modern GenAI applications that are transforming digital operations worldwide.

Avoiding cloud sins of the past

GenAI apps create vast volumes of unstructured data such as text, audio and video. The more data such apps generate, the more that data gravity weighs on an organization’s workloads and the harder they become to move.

Ironically, IT leaders have likely already experienced this challenge during the “cloud-first” phenomenon of a decade ago, as organizations rushed to capitalize on the agility and innovative prospects promised by the public cloud.

In recent years – and for various reasons – they and their peers have probably removed some application workloads from the public cloud.

Some realized their providers couldn’t meet data locality regulations. Others erred in refactoring applications for the public cloud, only to watch them break. Still others found that the more data they created, the more expensive it became to run their applications.

To put this into a consumer-friendly perspective, consider the platform shift from cable TV to streaming services over the past few years. Tired of feeling locked into contracts that included hundreds of channels they rarely watched, millions of customers “cut the cord” on cable TV, switching to Netflix, Hulu, Apple TV+ and other services.

The public cloud has come to mirror cable TV, with providers launching dozens of services that IT leaders don’t need. Moreover, as with cable TV, IT leaders came to feel locked in to their existing cloud contracts; when the renter’s remorse kicked in, they began to move some workloads.

And with compute and storage needs diversifying today, IT leaders want more Netflix and Hulu, less cable TV, which is a big reason for some of the repatriation to on-premises systems.

A different tack to support AI

Organizations require a more prescriptive approach to their AI needs – one that affords them control over their corporate IP and data – while allowing them to maintain performance and meet resiliency requirements.

Also, AI inferencing can tax systems – it requires real-time access to data – so organizations must control the compute and storage that fuel their AI applications. Controlling these resources will also help their IT departments better manage the data gravity that attends AI applications.

So how can IT leaders solve such challenges?

There is no cookie-cutter approach, but one option is maintaining a laser focus on delivering the optimal business outcomes regardless of where you choose to run your applications – whether that is your own datacenters, with the ability to extend to AI PCs and other devices at the edge.

This approach, known as enterprise AI, helps meet performance requirements and reduce latency while ensuring that your data is secure and compliant with data locality and sovereignty rules. There are paths to enterprise AI that may be kinder to your budget.

For example, you might consider deploying open-source models on-premises, which helps you bring AI to your data while right-sizing your model(s) to meet your operational requirements. Pre-trained models incorporating retrieval augmented generation help refine results with corporate data and run well on GPU-powered servers behind the corporate firewall.

Deploying Meta’s Llama 2 model on-premises with RAG proved as much as 75 percent more cost-effective than running GenAI workloads in Amazon Web Services’ public cloud, according to a survey conducted by Enterprise Strategy Group. ESG also found that running Mistral’s 7B open-source model with RAG on-premises was 38 percent to 48 percent more cost-effective than AWS.

These are key savings at a time when the cost of inferencing rises over the lifetime of a model. Cultivating enterprise AI isn’t easy; most organizations lack the technical wherewithal to build such an operating model let alone stand up the architecture and infrastructure modernized for AI.

It also requires experts who can help you get your data house in order so you can run your models with the performance and efficiency you need at the right cost. Who you choose as your trusted advisor will help shape the outcomes of your AI strategy.

Learn more about the Dell AI Factory.

Brought to you by Dell Technologies.

Seagate sampling 36 TB HAMR disk drive

Seagate is extending its capacity lead over Toshiba and Western Digital by sampling an Exos M 36 TB disk drive using its HAMR technology, currently the highest-capacity disk drive in the industry.

It announced last month that it had been given the qualification all-clear from a top cloud service provider (CSP) to start volume manufacturing and shipping of its Exos M 32 TB HAMR shingled magnetic recording (SMR) drive, and has now upped HAMR capacity by another 4 TB. The HAMR Mozaic 3+ technology involves a laser momentarily heating a bit area in the disk’s recording medium coating so its magnetic polarity can be set by the drive’s write head, before the area cools back to room temperature, at which point the bit setting is stable.

Dave Mosley, Seagate

CEO Dave Mosley stated: “We’re in the midst of a seismic shift in the way data is stored and managed. Unprecedented levels of data creation – due to continued cloud expansion and early AI adoption – demand long-term data retention and access to ensure trustworthy data-driven outcomes … Seagate continues to lead in areal density, sampling drives on the Exos M platform of up to 36 TB today. Also, we’re executing on our innovation roadmap, having now successfully demonstrated capacities of over 6 TB per disk within our test lab environments.” 

Dell’s Travis Vigil, SVP, ISG Product Management, added: “Dell PowerScale with Seagate’s HAMR-enabled Mozaic 3+ technology plays a crucial role in supporting AI use cases like retrieval-augmented generation (RAG), inferencing, and agentic workflows. Together, Dell Technologies and Seagate are setting the standard for industry-leading AI storage innovation.”

Competitors Toshiba and Western Digital use variations of microwave-assisted magnetic recording (MAMR) to store bits on a recording medium that does not support such small bit areas as HAMR. This means their disk platters have a lower areal density than Seagate’s HAMR drives, which have now reached 3.6 TB/platter with the company’s ten-platter design. As WD’s capacity tops out at 32 TB with an 11-platter design, its areal density is 2.91 TB/platter, 19 percent less than Seagate’s.

Toshiba’s maximum SMR capacity is 28 TB with its ten-platter MA11 drives, meaning a 2.8 TB/platter rating, worse than Western Digital’s areal density and 22.2 percent less than Seagate’s.
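
The percentages come straight from the capacity and platter counts quoted above, as this small calculation shows:

```python
# TB-per-platter comparison using the figures quoted above.
seagate = 36 / 10   # Exos M: 36 TB over ten platters = 3.6 TB/platter
wd      = 32 / 11   # WD: 32 TB over 11 platters, roughly 2.91 TB/platter
toshiba = 28 / 10   # Toshiba MA11: 28 TB over ten platters = 2.8 TB/platter

print(f"WD vs Seagate:      {(1 - wd / seagate) * 100:.1f}% lower")       # ~19.2%
print(f"Toshiba vs Seagate: {(1 - toshiba / seagate) * 100:.1f}% lower")  # ~22.2%
```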

Both WD and Toshiba have said they will move to HAMR technology. In 2022, Toshiba said its roadmap featured 40-plus TB HAMR drives in the FY 2026 period or later. Seagate intends to introduce second-generation 40 TB HAMR drives in the second half of this year. Given that Toshiba customers will need to qualify its HAMR drives, and that Seagate found producing the drives with an acceptable manufacturing yield and reliability a multi-year process, Toshiba could start HAMR drive general availability in 2027. That’s two years behind Seagate.

Western Digital also has HAMR tech introduction plans. It was 12 to 18 months away in June 2023, according to CFO Wissam Jabre. It is now 18 months since then and WD’s HAMR tech has not arrived. Coincidentally, Jabre has just resigned.

Like Toshiba, WD will find HAMR drives need lengthy manufacturing development and also customer qualification, especially by reliability-conscious CSPs and hyperscalers, who will be the largest customers for these nearline, 7,200 rpm drives. WD could find itself 12 months or more behind Seagate when it does move to HAMR.

The Exos M 36 TB drive has a 6 Gbps SATA interface. Seagate has not yet released details such as its sustained transfer rate, cache size, MTBF number, and so forth. We expect these to arrive in a few weeks’ time and to be pretty similar to its existing Exos M 32 TB drive, positioning Seagate for a more than 12-month capacity advantage.

Coldago offers trio of file storage maps

File storage can be used in such different ways that research house Coldago has produced three separate supplier rating reports.

It shows views of file storage suppliers through its 4-box diagram lens, where they are divided into three groups – Enterprise File Storage, High Performance File Storage, and Cloud File Storage.

This differs from GigaOm’s circular 4-box Radar diagram which looked at scale-out file storage and produced separate High-Performance Scale-Out File Storage (SCOFS) and Enterprise Scale-Out File Systems Radar reports in 2022. The following year it produced a combined Enterprise Scale-Out File Systems Radar – with a second edition in late 2024 along with a separate Cloud-Native Globally Distributed File Systems Radar document.

Back in 2021 Coldago had a single File Storage Map, but times change and file storage usage scenarios alter.

The two analyst houses have different views on how to cover the file storage supplier area and different inclusion criteria for suppliers in their categories, where the categories are similar, as a diagram indicates:

Suppliers can be present in more than one category for each research house.

A Coldago map positions vendors in four columns ranged along a vision and strategy axis from left (Niche) through Specialists and Challengers to Leaders (right). A vertical axis, running from low to high, rates execution and capabilities. Analyst Philippe Nicolas lists 11 suppliers in the Enterprise File Storage map: DDN, Dell, Huawei, IBM, iXsystems, Microsoft, NetApp, Pure Storage, Qumulo, SUSE, and VAST Data:

The most favorable positions for suppliers in a category are high and to the right. There are four Challengers – SUSE, DDN, iXsystems, and Qumulo – and seven Leaders: Dell, Huawei, IBM, Microsoft, NetApp, Pure Storage, and VAST Data. Microsoft leads Pure Storage, IBM, and Dell, with Huawei close behind and VAST Data somewhat trailing.

Compared to the 11 enterprise file storage suppliers, Coldago lists 15 vendors of High-Performance File Storage:

Fujitsu is listed as the single Specialist, with five suppliers grouped up and to the right in the Challengers’ box: Hammerspace, HPE, Quobyte, ThinkParQ, and VDURA – the rebranded Panasas.

There are nine Leaders, with two of them, Quantum and Qumulo, trailing Challenger HPE in their ability-to-execute and general capabilities rating. DDN heads the Leaders’ pack, followed by IBM, VAST Data, and WEKA.

Coldago’s Cloud File Storage map includes 11 suppliers, with just four Leaders: CTERA, Hammerspace, Nasuni, and Panzura:

Once again, as with the two other maps, there are no Niche players. We have two Specialists – Peer Software and LucidLink – and five Challengers: Egnyte, Tiger Technology, NetApp, AWS, and Microsoft.

Google Cloud’s Filestore doesn’t warrant a mention by Coldago. Interested parties can purchase the Coldago reports here and find details of how it compares the different suppliers.

Bootnote

Gartner has its own take on file storage, lumping it together with object storage in a single Magic Quadrant (MQ) for File and Object Storage Platforms.

Western Digital CFO resigns ahead of company split

Western Digital CFO Wissam Jabre is resigning effective February 28, after the company splits into separate disk drive and SSD businesses. He was slated to become the CFO of the disk drive business following the split. 

The company says Jabre is resigning “to pursue other opportunities” and it is looking for a new CFO.

Western Digital issued an 8K SEC filing saying it expected revenues for its current, second fiscal 2025 quarter to be around $4.3 billion, the midpoint of its guidance and 42 percent more than last year’s Q2. It will formally announce its Q2 2025 results on January 29. Due to a “more challenging pricing environment in the Company’s flash business,” the company stated that non-GAAP diluted earnings per share would be “at the lower end of the previously issued guidance range of $1.75 to $2.05 per share.”

Wissam Jabre, Western Digital

Wedbush analysts Matt Bryson and Antoine Legault said: “It should be well understood that the NAND environment deteriorated in CQ4, particularly in WD’s case with the company having already publicly warned investors of this fact. WDC’s ability to still hit the midpoint of its revenue guidance as well as to keep to its initially guided range for HDDs if anything in our view talks to the continued strength in the company’s hard drive business (which we believe offset some of the weakness in NAND).”

The analysts say Jabre’s departure does not signal any concerns with the HDD business. “We believe WD’s HDD business is in a strong position through 2025 given current areal density leadership (with risk rather potentially becoming apparent next year as HAMR ramps), and no sign of demand cracks, a setup which should provide the HDD operation with stability regardless of executive leadership.”

A Morgan Stanley comment stated: “The CFO transition is a surprise at this stage of the business separation, but we still see the HDD business on solid footing.”

Jabre joined Western Digital in February 2022, having been CFO at Dialog Semiconductor for five and a half years and a corporate finance VP at AMD before that. Last November he joined the board of MKS Instruments, which supplies measuring instruments and allied products to industries including semiconductor manufacturing, life sciences, industrial technologies, and research. He has a semiconductor heritage.

The “pursue other opportunities” phrase suggests Jabre is going to join another business, presumably as CFO. He could be returning to his semiconductor industry roots. Intel might be seeking a new CFO as incumbent David Zinsner is also serving as co-CEO with Michelle Johnston Holthaus, the CEO of Intel Products.

Several other semiconductor companies need new CFOs, including Lattice Semiconductor and Cirrus Logic.

Datafy targets EBS cost optimization with automated rightsizing engine

Datafy.io is a FinOps startup, with roots in all-flash startup E8 Storage, focused on cutting Amazon Elastic Block Store (EBS) costs.

Tel Aviv-based Datafy.io was founded by CEO Zivan Ori, COO Ziv Serlin, and chief product officer Yoav Ilovich in August 2023. The company has not raised any external funding yet.

From left, Yoav Ilovich, Zivan Ori, and Ziv Serlin of Datafy

Ori was co-founder and CEO of NVMe-access, all-flash storage company E8 Storage from August 2014 until it was acquired by AWS in August 2019. He then became AWS’s director of software development for EBS, leaving in July 2023. Serlin was also a co-founder of E8 and became a principal engineer at AWS when it was acquired.

Ilovich was chief product officer at Fleetonomy, then VP Product for Pagaya Investments, with two-year stints at each, followed by five months as a product advisor to Addressable.io before co-founding Datafy.

According to Datafy advisor Ron Maroley, EBS utilization (assuming the volume is attached to an active machine) can be as low as single-digit percentages. This means – if he’s right – there’s a savings potential of over 90 percent on those resources. 

He said that, while there are two to three “immediate suspects” for EBS optimization (such as unattached volumes, volumes attached to stopped machines, or transitioning from gp2 to gp3), EBS use is often overlooked as it requires relatively more effort to monitor and remediate. However, addressing it can lead to significant waste reduction and improved financial efficiency, Maroley said. Datafy’s solution is simple to use, we’re told, and requires no developers for deployment.

Ori says there are six types of EBS volume, which can be viewed as pairs – sc1/st1, gp2/gp3, io1/io2 – with different costs and performance levels. 

EBS volume types

Tracking costs, especially for storage, on AWS can be challenging due to its technical complexity. EBS volumes are priced per GiB per month. Pricing varies by the type of EBS volume. io1/io2 are the priciest, followed by gp2/gp3, and finally sc1/st1. Some EBS volumes, like io1 and io2, charge for provisioned IOPS (input/output operations per second) in addition to capacity. For gp3 volumes, you get a baseline of 3,000 IOPS for free, but you can pay extra for more IOPS (up to 16,000) or bandwidth if needed (the free bandwidth is 125 MiB/s, and you can pay extra up to 1,000 MiB/s). 

There are also snapshot costs. “You need to choose between the standard and archive tiers in Amazon S3,” said Ori. “The standard tier is more expensive but offers free and fast retrieval for creating volumes from snapshots. The archive tier is cheaper, but retrieval is slow[er] and incurs additional costs. EBS costs vary by AWS region due to different AWS operational costs, demand and supply, local regulations, and economic conditions. AWS Cost Explorer is a powerful tool that can help you manage your EBS costs effectively. However, one challenge is that EBS costs are hard to find as they are under the ‘EC2-Other’ cost category.”
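
That pricing model can be turned into a rough monthly estimate. The sketch below encodes the free gp3 baselines mentioned above; the dollar rates are illustrative placeholders rather than current AWS list prices, which vary by region.

```python
# Rough gp3 monthly-cost estimator based on the pricing model described above.
# The rates are illustrative placeholders; real prices vary by AWS region.
GB_MONTH_RATE = 0.08       # $/GiB-month (placeholder)
IOPS_RATE = 0.005          # $/provisioned IOPS-month above the free 3,000 (placeholder)
THROUGHPUT_RATE = 0.04     # $/MiB/s-month above the free 125 MiB/s (placeholder)

def gp3_monthly_cost(size_gib: int, iops: int = 3000, throughput_mibs: int = 125) -> float:
    extra_iops = max(0, min(iops, 16000) - 3000)           # free baseline: 3,000 IOPS
    extra_tput = max(0, min(throughput_mibs, 1000) - 125)  # free baseline: 125 MiB/s
    return (size_gib * GB_MONTH_RATE
            + extra_iops * IOPS_RATE
            + extra_tput * THROUGHPUT_RATE)

# gp3_monthly_cost(500, iops=6000, throughput_mibs=250)
# -> capacity charge + paid IOPS + paid throughput, in dollars per month
```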

In Ori’s view: 

  • From a pure IOPS perspective, you should always use gp3, even if you need to pay for the extra IOPS. gp3 also gives you the flexibility to determine how many IOPS to pay for, which gp2 lacks. 
  • No matter your requirements, it will always be cheaper to use gp3 rather than gp2 (see the conversion sketch after this list).
  • For guaranteed high IOPS performance, use io2. 
  • For streaming applications, use st1/sc1 – older HDD-based volumes. They present a bandwidth-to-capacity-to-cost ratio that still beats SSDs. However, you must make sure that the application uses a large block size (1 MiB is ideal), and is insensitive to IOPS.
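
The gp2-to-gp3 point lends itself to automation. Below is a hedged boto3 sketch, assuming suitable AWS credentials; it is a starting point rather than a production tool.

```python
# Hedged sketch: find in-use gp2 volumes and convert them to gp3 with boto3.
# Converting gp2 to gp3 is an online ModifyVolume operation; review the
# candidate list (dry_run=True) before applying this to a real account.
import boto3

def convert_gp2_to_gp3(region: str = "us-east-1", dry_run: bool = True) -> None:
    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_volumes")
    for page in paginator.paginate(Filters=[{"Name": "volume-type", "Values": ["gp2"]}]):
        for vol in page["Volumes"]:
            print(f"{vol['VolumeId']}: {vol['Size']} GiB gp2 -> gp3")
            if not dry_run:
                ec2.modify_volume(VolumeId=vol["VolumeId"], VolumeType="gp3")

# convert_gp2_to_gp3(dry_run=True)   # list candidates first, then flip dry_run
```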

Datafy’s Optimization Engine, we’re told, automatically rightsizes a customer’s EBS volumes, ensuring they only pay for what they actually need. It automates the scaling of an EBS volume’s capacity based on actual usage, increasing or decreasing volume size as needed. Datafy claims this process happens without any downtime and requires no manual intervention at the application or file system level. The Datafy software works with EBS storage for Kubernetes clusters.
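
Datafy has not published how its engine works, but one building block of rightsizing can be sketched with plain boto3: grow a volume when its filesystem is close to full. The get_used_fraction helper is hypothetical (in practice it would come from in-guest metrics such as the CloudWatch agent’s disk usage), and note that a native ModifyVolume call can only increase size; shrinking requires migrating data to a smaller volume, which is the harder part any automated tool has to handle.

```python
# Minimal sketch of capacity-based EBS growth. get_used_fraction() is a
# hypothetical helper returning filesystem utilization (0.0 to 1.0) for the
# volume, e.g. from in-guest CloudWatch agent metrics.
import boto3

def grow_if_needed(volume_id: str, get_used_fraction, headroom: float = 0.2,
                   region: str = "us-east-1") -> None:
    ec2 = boto3.client("ec2", region_name=region)
    vol = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    used = get_used_fraction(volume_id)        # fraction of capacity in use
    if used > 1.0 - headroom:
        new_size = int(vol["Size"] * 1.2)      # add roughly 20 percent capacity
        ec2.modify_volume(VolumeId=volume_id, Size=new_size)
        # The guest filesystem still has to be expanded (e.g. growpart + resize2fs).
```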

Sign up for a demo here.

MariaDB adds vector search in Gen AI DB push

The open-source MariaDB database now has a vector search capability, allowing customers to use a single, complete database supporting transactional, analytical, semi-structured, and now artificial intelligence (AI) applications.

The firm has undergone a private equity makeover since going public in 2022 with a $672 million valuation. It posted poor quarterly results, received NYSE warnings about its low market capitalization, laid off 28 percent of its staff, closed down its MariaDB Xpand and MariaDB SkySQL offerings, and announced Azure Database for MariaDB would shut down in September 2025. It was a financial storm, with one lender even threatening to sweep its accounts. It received many takeover proposals and accepted one from K1 Investment Management valuing it at $37 million.

Thus MariaDB, with its nearly 700 active customers across industries including banking, telecommunications, government, healthcare and e-commerce, went private in September 2024. At the time, Sujit Banerjee, MD of K1 Operations, said: “Together, we aim to accelerate product innovation and continue MariaDB’s mission of delivering high-quality, enterprise-grade solutions to meet the growing demands of the market.” K1 appointed Rohit de Souza as CEO while former CEO Paul O’Brien remained involved as an advisor.

An exec replenishment exercise included MariaDB rehiring Michael “Monty” Widenius as CTO. He is the creator of MySQL and previously had a seven-year stint as MariaDB CTO up until December 2022. The firm also appointed Vikas Mathur as Chief Product Officer and Mike Mooney as Chief Revenue Officer. It said it would launch vector search in MariaDB Server and a Kubernetes (K8s) Operator, catering to AI and cloud-native trends, and it has now done so.

The MariaDB Enterprise Platform 2025 version delivers:

  • Native vector search capabilities within the core database engine that is 100 percent open source
  • Additional JSON functionality for expanded semi-structured data support
  • A new version of MaxScale for worry-free upgrades
  • A new cost-based optimizer to support complex queries
Vikas Mathur.

Vikas Mathur stated: “Bringing vector search natively to the database server lets customers extend the same database they’re already using throughout their organization to new AI applications. We’ve also prioritized peace of mind for our enterprise customers with this release, adding tools to make upgrading versions a breeze.”

The company says vector search enables businesses to keep their database stack simple while supporting AI initiatives, and lets them search unstructured data by value and by semantics without needing to integrate multiple databases or compromise their system’s reliability or security. It can also help LLMs deliver more accurate and contextually relevant results using retrieval-augmented generation (RAG) on enterprise data.

Also, by leveraging their existing MariaDB database, customers can eliminate the need to maintain separate vector databases. 
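
As an illustration of what that looks like in practice, here is a hedged sketch assuming the VECTOR column type, VEC_FromText(), and VEC_DISTANCE_EUCLIDEAN() functions documented for recent MariaDB releases; the connection details, table schema, vector dimension, and embed() function are placeholders, not part of MariaDB’s announcement.

```python
# Hedged sketch of MariaDB native vector search used for a RAG-style lookup.
# embed() is a hypothetical embedding function; connection details are placeholders.
import mariadb

def nearest_docs(query_text: str, embed, k: int = 5):
    conn = mariadb.connect(host="localhost", user="app", password="secret", database="rag")
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id INT PRIMARY KEY AUTO_INCREMENT,
            body TEXT,
            embedding VECTOR(1536) NOT NULL,
            VECTOR INDEX (embedding)
        )""")
    # Serialize the query embedding into the text form VEC_FromText() expects.
    qvec = "[" + ",".join(str(x) for x in embed(query_text)) + "]"
    cur.execute("""
        SELECT id, body
        FROM docs
        ORDER BY VEC_DISTANCE_EUCLIDEAN(embedding, VEC_FromText(?))
        LIMIT ?""", (qvec, k))
    rows = cur.fetchall()
    conn.close()
    return rows
```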

Benchmarks of MariaDB’s vector search capabilities show that it outperforms PostgreSQL’s pgvector by 1.5x for queries per second (QPS) and 2x on index creation time.

The new MaxScale feature allows customers to capture the workload of a production system, such as queries, sessions, and transactions, and replay them in a test environment. The captured workloads can be used to verify that upgrades of the MariaDB database behave as expected and to measure the effects configuration changes may have.

The revamped optimizer in the database engine incorporates a granular and refined cost model that takes into account state-of-the-art SSDs and the different characteristics of storage engines. MariaDB’s database can now “fully leverage the lower latency and high throughput offered by modern storage devices, automatically choosing the fastest execution plan for complex queries.”

The new release also adds:

  • Reduced operational downtime with an online schema change feature that allows for a non-locking ALTER TABLE built into the server, which enables writing to a table while the ALTER TABLE is running.
  • Improved efficiency with a new optimistic ALTER TABLE for replication that greatly reduces replication lag by making ALTER TABLE operations two-phased.
  • Enriched enterprise-class protection with new security features that enable TLS encryption by default, add more granular privileges and introduce a new plugin to prevent reuse of old passwords.
  • Enhanced partitioning management with new operations like converting partitions to tables and vice versa, and new enhancements to managing system versioning partitions, making database maintenance more flexible and efficient.

MariaDB Enterprise Platform 2025 includes updated versions of its core database, MariaDB Enterprise Server 11.4, its advanced database proxy, MariaDB MaxScale 25.01, and tools and support. The MariaDB Enterprise Platform is available to download now for all MariaDB customers.

For more information, read the launch blogs: Announcing New Release of MariaDB Enterprise Platform, Introducing Vector Search with the Latest Version of MariaDB Enterprise Platform, and Introducing MariaDB MaxScale 25.01 GA.

BMC Software enhances mainframe data storage and AI services

BMC Software has released data storage and AI productivity enhancements to its mainframe services.

The January 2025 update to the BMC AMI portfolio “breaks new ground” in what is possible on the mainframe, says BMC, as it puts more meat on the bones of its recently announced Cloud Data Sets (CDS).

BMC bought Model9 in April 2023 and rebranded its software as AMI Cloud. The patented Cloud Data Sets feature enables direct access to object storage on-prem or in a hyperscaler without modifying existing processes.

Priya Doty, BMC

“Today, secondary backup storage and retrieval of mainframe data on tape and virtual tape libraries (VTLs) is costly and time-consuming,” said Priya Doty, vice president, solutions marketing for BMC AMI solutions. “The CDS feature in BMC AMI Cloud Data provides a seamless transition to object storage, which streamlines backup and recovery and offers cost savings over traditional solutions.”

Additionally, with CDS, current users of BMC AMI FDR (fast dump restore) can now redirect their tape backups to object storage without direct-access storage device (DASD) or VTL staging, or any changes to code. BMC said this results in faster backups, improved disaster recovery, and the ability to eliminate the cost and infrastructure requirements of physical tape and VTL storage.

Last year, the supplier introduced a new COBOL code explanation feature in BMC AMI DevX Code Insights, powered by BMC AMI Assistant. Driven by generative AI, code explanation empowers developers by providing a short summary of a section of the code’s business logic and details of the code’s logic flow.

With the January update, BMC AMI Assistant now includes the “widest language support in the industry,” including explanations of code written in PL/I, JCL, and Assembler. This helps developers understand, review, extend, and test mainframe code with “unmatched efficiency,” said BMC.

The use of Java on mainframes is increasing, and with it demand for improved application performance. As part of the update, the new BMC AMI Strobe for Java enables “comprehensive” application performance management and analysis in a single tool with a user-friendly web interface. BMC AMI Strobe for Java allows developers to “easily pinpoint” sources of excessive resource demand and shift left to conduct performance tests earlier in the software delivery lifecycle.

In addition, powered by BMC AMI Assistant, new hybrid AI capabilities, which combine AI/ML with GenAI in BMC AMI Ops Insight, simplify root cause analysis, helping to reduce mean time to detection (MTTD) and mean time to resolution (MTTR). New interactive dashboards also allow users to create and personalize focused views based on their observability needs.

“By giving systems programmers greater control over the information they see, these customized insights enable faster and smarter decision-making,” claimed BMC.

Cirrus Data extends block data moves with Red Hat integration

Block data mobility specialist Cirrus Data Solutions has forged a Red Hat integration that should help users improve the migration of workloads. Its Cirrus Migrate Cloud is now working with Red Hat OpenShift Virtualization, enabling organizations to automate the migration of workloads from “any” hypervisor.

Cirrus said the integration enabled more flexible hybrid workloads, accelerated modernization efforts, and improved resource utilization.

Cirrus Migrate Cloud is said to be now integrated with virtualization technologies that account for “over 90 percent” of the hypervisor market, including VMware vSphere/ESXi, Microsoft Hyper-V, Nutanix AHV, and now Red Hat OpenShift Virtualization. Cirrus Data also enables automated migration to Azure VMware Solution (AVS), VMware Cloud on AWS, VMware Cloud on AWS Outposts, and Oracle Linux Virtualization Manager (OLVM), which is based on oVirt.

Cirrus Migrate Cloud’s intelligent quality-of-service technology ensures application workloads remain fully operational with no performance impact throughout the migration, we are told. And the platform automatically handles all required remediations, including configuration changes, driver updates, and agent deployments. Cirrus Migrate Cloud supports migrating any block storage device, including physical RDMs, directly mapped iSCSI disks, or any other disks. There is no need to take added steps to convert RDMs during migration.

Wayne Lam

Cirrus Migrate Cloud, together with MigrateOps, makes it possible for organizations to automate the change from one hypervisor to another with a “secure, easy-to-use, and reliable solution”, said the provider.

“Our mission is to eliminate barriers to innovation by providing our customers with frictionless data mobility,” said Wayne Lam, CEO of Cirrus Data Solutions. “By integrating support for Red Hat OpenShift Virtualization, we empower organizations to modernize their computing environments without the risks or complexity traditionally associated with hypervisor migrations.”

The Red Hat OpenShift Virtualization integration is available immediately. Cirrus Migrate Cloud is available on the Microsoft Azure Marketplace, the Amazon Web Services (AWS) Marketplace, and the Oracle Cloud Marketplace.

Catalog launches first commercial DNA-encoded book

DNA-based storage platform provider Catalog Technologies claims that its tech, which uses parallelization, minimal energy, and a low physical footprint, offers an alternative to established data management systems, and has now delivered the first commercially available book encoded into DNA using its technology.

The Catalog Asimov DNA book

A traditional printed book, available from Asimov Press, includes a DNA capsule provided by Catalog, and retails for $60 as a bundle. In keeping with the spirit of science, the book features nine essays and three works of science fiction.

Catalog, founded in 2016 by MIT scientists, created around 500,000 unique DNA molecules to encode the 240 pages of the book, representing 481,280 bytes of data. After being converted into synthetic DNA, it was stored as a dry powder under inert gas to eliminate moisture and oxygen in the capsule.

Hyunjun Park, Catalog

The production of the capsules involved two other companies. Catalog synthesized and assembled the millions of nucleotides of DNA into thousands of individual strands in its Boston laboratories. That DNA was then shipped to France, where Imagene packaged the molecules into laser-sealed, stainless steel capsules. Finally, Plasmidsaurus “read” the DNA book at its California headquarters and submitted the final sequence.

“Providing 1,000 copies of this latest Asimov book encoded into DNA is a significant milestone as we commercialize our DNA storage and computation technology,” said Hyunjun Park, co-founder and CEO of Catalog. “Our DNA platform – which uses very little energy – is quickly becoming an attractive option as emerging workloads, including AI, require unsustainable amounts of energy to process.”

While this is the first commercially available DNA book, it’s not the first DNA book. George Church’s Regenesis, which he co-authored with Ed Regis, was published in 2012. Church’s Harvard laboratory used binary code to preserve the book (including images and formatting), before converting that binary code into physical DNA.

Shortly after, a group of Cambridge scientists encoded Shakespeare’s entire collection of 154 sonnets – as well as an audio file of Martin Luther King’s “I Have A Dream” speech – into DNA.

Catalog’s DNA capsules

In 2022, Catalog encoded eight of Shakespeare’s tragedies, comprising more than 200,000 words of text, into a single test tube. It also built and tested methods to search that DNA.

Various new forms of DNA data storage technology have been reported by Blocks & Files recently, including here and here.

AI data companies dominate new unicorn list from BestBrokers

Companies built around AI data systems dominate entrants to the latest unicorn list compiled by broking comparison site BestBrokers, with Databricks and WEKA mentioned.

Unicorns are defined as privately owned companies valued at over $1 billion, and those in BestBrokers’ 2024 complete list are often valued at well over that mark.

OpenAI, the US company behind ChatGPT, recently closed a $6.6 billion funding round, nearly doubling its value from February 2024 to $157 billion. Meanwhile, Elon Musk’s SpaceX launched a tender offer in December 2024 that brought its valuation to $350 billion, overtaking Chinese tech giant and TikTok owner ByteDance as the most valuable startup in the world.

On new entrants to the unicorn list, BestBrokers said that out of 79 startups reaching the status in 2024, 36 of them, or nearly 46 percent, are AI companies. Among these is Musk’s AI startup, xAI, which he founded in 2023 to compete with OpenAI, a company he left after co-creating it. xAI is said to be worth around $50 billion after a funding round of $6 billion in December 2024.

The top ten AI unicorns with their estimated values are xAI ($50 billion), Perplexity AI ($9 billion), SandboxAQ ($5.6 billion), Safe Superintelligence ($5 billion), Sierra ($4.5 billion), Moonshot AI ($3.3 billion), Cyera ($3 billion), Poolside ($3 billion), Physical Intelligence ($2.8 billion), and Figure ($2.68 billion).

The total number of unicorns reached 1,258 in December 2024. Of the top ten, data lakehouse supplier Databricks is ranked sixth with a $62 billion valuation. Ultra-fast file system software supplier WEKA is given a $1.6 billion valuation and is ranked 24th.

In January 2023, Blocks & Files reported there were 21 storage industry startups worth a billion dollars or more.

Among BestBrokers’ 79 new entrants, outside AI there are six unicorns in cybersecurity (7.6 percent), ten in enterprise software (12.7 percent), and 16 in fintech and crypto (20.3 percent).

The US is home to well over half of all unicorns, with the figure standing at 683. Many were founded by immigrants or were started elsewhere but later relocated to the US. SpaceX, Databricks, and Stripe are examples.

China has 165 unicorns and India has 71. Across Europe, the UK has the largest number of unicorns, 54 in total, and Germany has 32. In addition, France has 28.

Lenovo goes shopping, plonks Infinidat in the basket

Lenovo is buying privately-owned enterprise block array storage supplier Infinidat for an undisclosed amount.

Systems supplier Lenovo’s storage lineup is centered on small and medium enterprises. It was built up by Kirk Skaugen, who was the boss of Lenovo’s Infrastructure Solutions Group from 2013 until June last year, when he was replaced by Ashley Gorakhpurwalla, formerly Western Digital’s EVP and GM of its hard disk drive (HDD) business. Lenovo’s products include the ThinkSystem DG all-flash, DM hybrid, and DE SAN arrays, and are mostly OEM’d from NetApp’s ONTAP storage line.

Lenovo says it has the number one revenue position in the IDC storage market price bands 1 (<$6,000) to 4 ($50,000 to $99,999). However, it doesn’t lead in bands 5 ($100,000 to $249,999) to 7 (>$500,000) and is not a major player in the enterprise storage market where Infinidat has built its business.

Greg Huff, Lenovo Infrastructure Solutions Group CTO, said in a statement: “With the acquisition of Infinidat, we are excited and well-positioned to accelerate innovation and deliver greater value for our customers. Infinidat’s expertise in high-performance, high-end data storage solutions broadens the scope of our products, and together, we will drive new opportunities for growth.”

Phil Bullinger.

Infinidat CEO Phil Bullinger said: “Infinidat delivers award-winning high-end enterprise storage solutions providing an exceptional customer experience and guaranteed SLAs with unmatched performance, availability, cyber resilience and recovery, and petabyte-scale economics. With Lenovo’s extensive global capabilities, we look forward to expanding the comprehensive value we provide to enterprise and service provider customers across on-premises and hybrid multi-cloud environments.”

Lenovo says it will gain:

  • Mission critical enterprise storage for scalable, cyber-resilient data management and an in-house software R&D team.
  • Expanded storage systems portfolio to cover high-end enterprise storage, building on Lenovo’s existing position in the entry and mid-range enterprise storage market.
  • An opportunity to drive profitable growth of the Group’s storage business as Infinidat, combined with Lenovo’s existing global infrastructure business, customer relationships, and global supply chain scale, is expected to create new opportunities for high-end enterprise storage products and unlock new revenue opportunities for the Group’s storage business.

Infinidat was founded in 2010 by storage industry veteran Moshe Yanai and uses memory data caching in its InfiniBox systems to supply data from disk as fast as, if not faster than, all-flash arrays. This base product line was expanded to include the InfiniBox SSA all-flash system and the InfiniGuard data protection system. These systems compete with IBM’s DS8000, Dell’s PowerMax, and Hitachi Vantara’s VSP systems.

Moshe Yanai in 2018.

Eventually Infinidat’s growth sputtered somewhat, and Yanai was forced out of the chairman and CEO roles in 2020, becoming a technology evangelist and subsequently leaving. Phil Bullinger, formerly Western Digital’s SVP and GM of its since-divested datacenter business unit, became CEO in 2021. At the time, Bullinger said Infinidat was profitable, cashflow-positive, and growing.

Since then Infinidat says it has grown, with a reported 40 percent year-on-year bookings growth in 2021 and double-digit growth in 2022, when it was also cashflow-positive. It said it achieved record results and a 59 percent annual bookings growth rate in calendar Q1 2023. The company has a focus on cyber-resiliency and is also active in the AI area, with a RAG workflow deployment architecture enabling customers to run generative AI inferencing workloads on its on-premises InfiniBox systems.

Infinidat has raised a total of more than $370 million in funding, with the last round taking place in 2020.