Google Cloud and NetApp are extending the NetApp Volumes storage service to work better with Vertex AI, support larger data sets, separately scale capacity and performance, and meet regional compliance needs.
Google Cloud NetApp Volumes, GCNV for short, is a fully managed file service based on NetApp’s ONTAP operating system running on the Google Cloud Platform as a native GCP service. It supports NFS v3 and v4.1, and SMB, and provides snapshots, clones, replication, and cross-region backup. Google’s Vertex AI is a combined data engineering, data science, and ML engineering workflow platform for training, deploying, and customizing large language models (LLMs), and developing AI applications. It provides access to Google’s Gemini models, which work with text, images, video, or code, plus other models such as Anthropic’s Claude and Llama 3.2.
Pravjit Tiwana.
NetApp SVP and GM for Cloud Storage, Pravjit Tiwana, states: “Our collaboration with Google Cloud is accelerating generative AI data pipelines by seamlessly integrating the latest AI innovations with the robust data management capabilities of NetApp ONTAP.”
He reckons: “The new capabilities of NetApp Volumes help customers scale their cloud storage to meet the demands of the modern, high-performance applications and datasets.”
The new capabilities in detail are:
Coming NetApp Volumes integration with Google Cloud’s Vertex AI platform: customers will be able to build custom agents without needing to build their own data pipeline management for retrieval-augmented generation (RAG) applications.
Improvements to the Premium and Extreme service levels in all 14 regions where they are offered. Customers can now provision a single volume starting at 15 TiB that can be scaled up to 1 PiB with up to 30 GiB/s of throughput. This means customers can move petabyte-scale datasets for workloads like EDA, AI applications, and content data repositories to NetApp Volumes without partitioning data across multiple volumes.
A preview of independent capacity and performance scaling for the Flex service level, so customers can avoid over-provisioning capacity just to meet performance needs. Users can create storage pools by individually selecting capacity, throughput, and IOPS, with the ability to scale throughput up to 5 GiB/s and IOPS up to 160K to optimize costs.
NetApp Volumes will soon support the Assured Workloads framework that Google Cloud customers use to configure and maintain controlled environments operating within the parameters of a specific compliance regime, meeting the data residency, transparent access control, and cloud key management requirements specific to their region.
GCNV Flex, Standard, Premium, and Extreme service level offerings can be researched here. The GCNV-Vertex AI integration is coming “soon.”
Proprietary data held in GCNV will be usable via Vertex AI to implement model agent RAG capabilities.
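Details of the integration have not been published yet, but the general RAG pattern is clear: chunk and embed files on the volume, retrieve the most relevant ones, and pass them to a Gemini model as context. Below is a minimal illustrative sketch, assuming the Vertex AI Python SDK, example model names, and a hypothetical GCNV volume mounted at /mnt/gcnv; it is not the announced integration itself.

```python
# Illustrative RAG over files on a mounted GCNV volume (paths, project, and model names are examples).
from pathlib import Path
import numpy as np
import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="us-central1")      # hypothetical project
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
llm = GenerativeModel("gemini-1.5-pro")

def embed(text: str) -> np.ndarray:
    return np.array(embedder.get_embeddings([text])[0].values)

# Index documents sitting on the NFS-mounted NetApp volume.
docs = {p: p.read_text() for p in Path("/mnt/gcnv/contracts").glob("*.txt")}
index = {p: embed(t[:2000]) for p, t in docs.items()}

def answer(question: str, k: int = 3) -> str:
    q = embed(question)
    # Rank documents by cosine similarity and keep the top k as context.
    top = sorted(index, key=lambda p: -float(np.dot(index[p], q) /
                 (np.linalg.norm(index[p]) * np.linalg.norm(q))))[:k]
    context = "\n\n".join(docs[p] for p in top)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm.generate_content(prompt).text

print(answer("Which contracts renew this quarter?"))
```

The promised integration would presumably replace the hand-rolled indexing step here; the point is simply that files on the volume become retrievable context for the model without a separate data pipeline.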
NetApp has received the 2025 Google Cloud Infrastructure Modernization Partner of the Year for Storage award, which is a nice pat on the back.
Sameet Agarwal, Google Cloud Storage GM and VP, said: “Organizations can leverage their NetApp ONTAP on-premises data and hybrid cloud environments. By combining the capabilities of Google Cloud’s Vertex AI platform with Google Cloud NetApp Volumes, we’re delivering a powerful solution to help customers accelerate digital transformation and position themselves for long-term success.”
DDN is partnering with Google Cloud on its Google Cloud Managed Lustre, powered by DDN, offering.
The Lustre parallel file system enables Google Cloud to offer file storage and fast access services for enterprises and startups building AI, GenAI, and HPC applications. It provides up to 1 TB/s throughput and can scale from terabytes to petabytes.
Alex Bouzari, Co-Founder and CEO of DDN, bigged this deal up by stating: “This partnership between DDN and Google Cloud is a seismic shift in AI and HPC infrastructure—rewriting the rules of performance, scale, and efficiency. … we’re not just accelerating AI—we’re unleashing an entirely new era of AI innovation at an unprecedented scale. This is the future, and it’s happening now.”
DDN says on-prem Lustre customers “can now extend their AI workloads to the cloud effortlessly.”
You might think that this is a revolution but, one, Google already has Lustre available on its cloud, just not as a managed service, and, two, its main competitors also offer Lustre services.
Google’s existing Lustre on GCP can be set up using deployment scripts or through DDN’s EXAScaler software, built on Lustre, which is available through the Google Cloud marketplace. Now it has moved on with this fully managed Lustre service offering which makes it easier for its customers to use Lustre.
AWS offers FSx for Lustre as well as FSx for OpenZFS and BeeGFS on AWS. Azure also offers Azure Managed Lustre plus BeeGFS on Azure and GlusterFS on Azure. You are spoilt for choice.
Google Cloud Managed Lustre (GCML) links to Google Cloud’s Compute Engine, GKE (Google Kubernetes Engine), Cloud Storage and other services for an integrated deployment. DDN and Google say it can speed up data pipelines for AI model training, tuning and deployment, and enable real-time inferencing.
Google Cloud also has DAOS-powered ParallelStore available, DAOS being the open source Distributed Asynchronous Object Storage parallel file system.
GCML comes with 99.999 percent uptime and has a scalable pricing scheme. It can be seen at the Google Cloud Next 2025 event at the Mandalay Bay Convention Center, Las Vegas, April 9 to 11, where DDN is also demoing its Infinia object storage software.
Against a background of disaggregated IT and rising AI trends, Dell has announced refreshes of its PowerEdge, PowerStore, ObjectScale, PowerScale, and PowerProtect storage systems.
Dell is announcing both server and storage advances. It says its customers need to support existing and traditional workloads as well as provide IT for generative AI tasks. A disaggregated server, storage, and networking architecture is best suited for this and builds on three-tier and hyperconverged infrastructure designs, with separate scaling for the three components collected together in shared resource pools.
Arthur Lewis
Dell Infrastructure Solutions Group president Arthur Lewis stated: “From storage to servers to networking to data protection, only Dell Technologies provides an end-to-end disaggregated infrastructure portfolio that helps customers reduce complexity, increase IT agility, and accelerate datacenter modernization.”
Dell’s PowerEdge R470, R570, R670, and R770 servers are equipped with Intel Xeon 6 processors with performance cores. These are single and double-socket servers in 1U and 2U form factors designed for traditional and emerging workloads like HPC, virtualization, analytics, and AI inference.
Our focus here is on the storage product announcements, which cover the unified file and block PowerStore arrays, cloud-native ObjectScale, scale-out clustered PowerScale filer system, and Dell’s deduplicating backup target PowerProtect systems developed from prior Data Domain arrays.
PowerStore
A PowerStore v4.1 software release provides AI-based analytics to detect potential issues before they occur, automatic support ticket opening, carbon footprint forecasting, DoD CAC/PIV smart card support, automated certificate renewal, and improved PowerProtect integration through Storage Direct Protection. This enables up to 4x faster backup restores and support for the latest PowerProtect systems: the DD6410 appliance and All-Flash Ready Nodes (see below).
Dell PowerStore node
The software provides better storage efficiency tracking, now covering both file and block data, and ransomware-resistant snapshots, supplementing the existing File Level Retention (FLR) and other local, remote, and cloud-based protection methods.
It offers file system QoS with more granular performance controls. Dell Unity customers migrating to PowerStore can preserve their existing Cloud Tiering Appliance (CTA) functionality. Archived files remain fully accessible, and customers can create new archiving policies for migrated file systems on PowerStore.
ObjectScale
ObjectScale is scale-out, containerized object storage software running on ECS hardware nodes. Until now there have been three ECS hardware boxes: the EX500 (12-24 HDDs, up to 7.68 PB/rack), EX5000 (up to 100 HDDs, up to 14 PB/rack), and all-flash EXF900 (12-24 NVMe SSDs, up to 24.6 PB/rack).
New ObjectScale v4.0 software boasts smart rebalancing, better space reclamation, and capacity utilization. It also has expanded system health metrics, alerting, and security enhancements. Dell claims it offers “the world’s most cyber-secure object storage.”
There are two new ObjectScale systems. The all-flash XF960 is said to be designed for AI workloads and is an evolution of the EXF900. It has extensive hardware advances based on PowerEdge servers and delivers up to 2x greater throughput per node than the closest but unnamed competitor, and up to 8x more density than the EXF900.
ObjectScale X560 (top) and XF960 (bottom)
The HDD-based X560 accelerates media, backup, and AI model training ingest workloads with 83 percent higher small object read throughput than the EX500 running v3.8 software.
Dell is partnering with S3-compatible cloud storage supplier Wasabi to introduce Wasabi with Dell ObjectScale, a hybrid cloud object storage service with tiers starting from 25 TB of reserved storage per month. Wasabi has a global infrastructure, with more than 100,000 customers in 15 commercial and two government cloud regions worldwide.
More ObjectScale news is expected at the upcoming Dell Technologies World conference.
PowerScale
PowerScale all-flash F710 and F910 nodes get 122 TB Solidigm SSD support, doubling storage density. This, with 24 bays in their 2RU chassis and 2:1 data reduction, provides almost 6 PB of effective capacity per node. Dell says it’s the first supplier to offer an enterprise storage system with such SSDs.
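That effective-capacity figure is straightforward arithmetic from the numbers quoted, assuming a decimal TB-to-PB conversion:

```python
# Effective capacity per node from the figures quoted above.
ssd_tb, bays, reduction = 122, 24, 2.0
raw_tb = ssd_tb * bays                       # 2,928 TB raw per 2RU node
effective_pb = raw_tb * reduction / 1000
print(f"{raw_tb} TB raw -> ~{effective_pb:.1f} PB effective")   # ~5.9 PB
```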
The PowerScale archive A and hybrid H series nodes – H710, H7100, A310, A3100 – have lower latencies and faster performance with a refreshed compute module for HDD-based products. Dell says the A-Series is optimized for TCO, while the H-series provides a balanced cost/performance mix. The updated nodes feature:
Fourth-generation Intel Xeon Sapphire Rapids CPUs
DDR5 DRAM with up to 75 percent greater speed and bandwidth
Improved thermal operation reducing heat and stress on components
Updated drive carrier with 100 percent greater speed for SAS drives
Dell will introduce support for 32 TB HAMR disk drive technology later this year with “extended useful life.”
A PowerScale 1RU A110 Accelerator Node is a successor to the previous generation P100 and B100 performance and backup accelerators. It’s designed to solve CPU bottlenecks and boost overall cluster performance with higher cluster bandwidth. The A110 can be independently scaled in single node increments.
PowerProtect
There are three main developments here. First, the PowerProtect DD6410 is a new entry-level system with a capacity of 12 TB to 256 TB. It’s aimed at commercial, small business, and remote site environments, with up to 91 percent faster restores than the DD6400, up to 65x deduplication, and scalability for traditional and modern workloads.
Secondly, the PowerProtect All-Flash Ready Node has 220 TB capacity with over 61 percent faster restore speeds, up to 36 percent less power, and a 5x smaller footprint than the PowerProtect DD6410 appliance. It does not support the 122 TB SSDs, built with QLC 3D NAND, because their write speed is not fast enough.
Both the DD6410 and All-Flash Ready Node support the Storage Direct Protection integration with PowerStore and PowerMax, providing faster, efficient, and secure backup and recovery.
PowerProtect DD6410 (top) and All-Flash Ready Node (bottom)
Thirdly, a PowerProtect DataManager software update reduces cyber-security risks with anomaly detection. This has “machine learning capabilities to identify vulnerabilities within the backup environment, enabling quarantine of compromised assets. It provides early insights in detecting threats in the backup environment while complementing the CyberSense deep forensics analysis of isolated recovery data in the Cyber Recovery vault, providing end-to-end cyber resilience of protected resources.”
As well as VMware, DataManager now manages Microsoft Hyper-V and Red Hat OpenShift Virtualization virtual machine backups. When we suggested future Nutanix AHV support, Dell acknowledged it as a possibility.
DataManager archives data to ObjectScale for long-term retention. This is not tiering with a stub left behind. The archived data can be restored directly without first being rehydrated to a PowerProtect system. The archiving is to S3-compatible object stores.
DataManager also has Multi-System Reporting which offers centralized visibility and control across up to 150 PowerProtect Data Manager instances.
Availability
PowerProtect Data Manager updates are available now.
PowerEdge R470, R570, R670, and R770 servers are available now.
PowerStore software updates are available now.
ObjectScale is available now as a software update for current Dell ECS environments.
HDD-based ObjectScale X560 will be available April 9, 2025.
All-Flash ObjectScale XF960 appliances will be available beginning in Q3 2025.
The Wasabi with Dell ObjectScale service is available in the United States. UK availability begins this month, with expansion into other regions planned in the coming months.
PowerScale HDD-based nodes will be available in June 2025.
PowerScale with 122 TB drives will be available in May 2025.
PowerProtect DD6410 and All-Flash Ready Node will be available in April 2025.
As the on-premises backup target market grows, so too does ExaGrid – which just posted its best ever Q1.
The company supplies deduplicating backup appliances with a non-deduped landing zone for faster restores of recent data. Deduped data is moved to a non-network-facing area for further protection. Its appliances can be grouped with cross-appliance deduplication raising storage efficiency.
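ExaGrid has not published its internals; the sketch below, with invented class and method names, merely illustrates the two-tier flow described above: backups land un-deduplicated for fast restores, then a background job deduplicates them into a repository tier.

```python
# Illustrative two-tier flow: non-deduped landing zone plus deduped repository tier.
import hashlib

class Appliance:
    def __init__(self):
        self.landing_zone = {}   # backup_id -> raw bytes (fast restores of recent data)
        self.repository = {}     # chunk hash -> chunk (deduped, non-network-facing tier)
        self.manifests = {}      # backup_id -> ordered list of chunk hashes

    def ingest(self, backup_id: str, data: bytes):
        self.landing_zone[backup_id] = data          # land it whole, no inline dedupe cost

    def dedupe_to_repository(self, backup_id: str, chunk_size: int = 4096):
        data = self.landing_zone[backup_id]
        hashes = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            h = hashlib.sha256(chunk).hexdigest()
            self.repository.setdefault(h, chunk)     # store each unique chunk only once
            hashes.append(h)
        self.manifests[backup_id] = hashes

    def restore(self, backup_id: str) -> bytes:
        if backup_id in self.landing_zone:           # recent data: no rehydration needed
            return self.landing_zone[backup_id]
        return b"".join(self.repository[h] for h in self.manifests[backup_id])
```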
Bill Andrews
At the end of 2025’s first quarter ExaGrid was free cash flow (FCF) positive, P&L positive, and EBITDA positive for its 17th consecutive quarter, and has no debt. CEO Bill Andrews emphasized this, telling us: “We have paid off all debt. We have zero debt. We don’t even have an account receivable line of credit (don’t need it).”
It recruited 155 new logos, taking its total well past 4,600 active upper mid-market to large enterprise customers. The company says it continues to have 75 percent of its new logo customer bookings come from six- and seven-figure purchase orders. Andrews tells us: “For the last 8 quarters, each quarter 75 percent of our new logo customer bookings dollars come from deals over $100K and over $1M. Only 25 percent of new customer bookings dollars come from deals under $100K.”
Andrews stated: “ExaGrid continues to profitably grow as it keeps us on our path to eventually becoming a billion-dollar company. We are the largest independent backup storage vendor and we’re very healthy … ExaGrid continues to have an over 70 percent competitive win rate replacing primary storage behind the backup application, as well as inline deduplication appliances such as Dell Data Domain and HPE StoreOnce.”
The company has a 95 percent net customer retention rate and an NPS score of +81. Andrews tells us: “Our customer retention is growing and is now at 95.3 percent. We think perfection is 96 percent because you can’t keep every customer as some go out of business, some get acquired, some move everything to the cloud, etc.”
For ExaGrid’s top 40 percent customers, its largest, “we have a 98 percent retention which is very high for storage.” He adds: “99 percent of our customers are on maintenance and support, also very high for the industry.”
The 5,000 customer level is in sight and Andrews left us with this thought: “Things are going well and shy of the tariffs throwing us into a depression, we should have yet another record bookings and revenue year. … The goal is to keep growing worldwide as there is a lot of headroom in our market.”
Bootnote
For reference Dell has more than 15,000 Data Domain/Power Protect customers.
Qumulo has added a performance-enhancing NeuralCache predictive caching feature to its Cloud Data Fabric.
The Cloud Data Fabric (CDF) was launched in February and has a central file and object data core repository with coherent caches at the edge. The core is a distributed file and object data storage cluster that runs on most vendors’ systems or public cloud infrastructure. Consistency between the core and edge sites comes from file system awareness, block-level replication, distributed locking, access control authentication, and logging.
NeuralCache uses a set of supervised AI and machine learning models to dynamically optimize read/write caching, with Qumulo saying it’s “delivering unparalleled efficiency and scalability across both cloud and on-premises environments.”
Kiran Bhageshpur
CTO Kiran Bhageshpur states: “The Qumulo NeuralCache redefines how organizations manage and access massive datasets, from dozens of petabytes to exabyte-scale, by adapting in real-time to multi-variate factors such as users, machines, applications, date/time, system state, network state, and cloud conditions.”
NeuralCache, Qumulo says, “continuously tunes itself based on real-time data patterns. Each cache hit or miss refines the model, improving efficiency and performance as more users, machines, and AI agents interact with it.”
It “intelligently stacks and combines object writes, minimizing API charges in public cloud environments while optimizing I/O read/write cycles for on-premises deployments – delivering significant cost savings without compromising durability or latency.”
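Qumulo has not published how NeuralCache stacks writes; the following generic sketch, with invented names, simply shows what coalescing many small writes into a single object PUT looks like in principle.

```python
# Generic write-coalescing sketch: buffer small writes and flush them as one object PUT.
import time

class CoalescingWriter:
    def __init__(self, put_object, max_bytes=8 * 1024 * 1024, max_wait_s=0.5):
        self.put_object = put_object     # callable(key, data) that issues one object-store PUT
        self.max_bytes, self.max_wait_s = max_bytes, max_wait_s
        self.buffer, self.size, self.first_write = [], 0, None

    def write(self, data: bytes):
        self.buffer.append(data)
        self.size += len(data)
        self.first_write = self.first_write or time.monotonic()
        # Flush when the batch is big enough or has waited long enough.
        if self.size >= self.max_bytes or time.monotonic() - self.first_write >= self.max_wait_s:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        key = f"segment-{int(time.time() * 1e6)}"
        self.put_object(key, b"".join(self.buffer))  # one PUT instead of many small ones
        self.buffer, self.size, self.first_write = [], 0, None
```

In a public cloud, each flush incurs one PUT request charge regardless of how many logical writes it absorbed, which is where the API-cost saving being claimed would come from.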
The NeuralCache software “automatically propagates changed data blocks in response to any write across the Cloud Data Fabric” so that “users, machines, and AI agents always access the most current data.”
Bhageshpur says this “enhances application performance and reduces latency while ensuring data consistency, making it a game-changer for industries relying on data-intensive workflows, including AI research, media production, healthcare, pharmaceutical discovery, exploratory geophysics, space and orbital telemetry, national intelligence, and financial services.”
Qumulo says NeuralCache excels at dataset scales from 25 PB to multiple exabytes, “learning and improving as data volume and workload complexity grows.”
This predictive caching software was actually included in the February CDF release, but a Qumulo spokesperson told us it “wasn’t fully live and we were just referring to it generically as ‘Predictive Caching.’ Since then, we have had a customer test it out and provide feedback like a Beta test. And we formally named it NeuralCache.”
Interestingly, high-end storage array provider Infinidat has a caching feature that is similarly named but based on its array controller’s DRAM. Back in June 2020, we wrote that its array software has “data prefetched into a memory cache using a Neural Cache engine with predictive algorithms … The Neural Cache engine monitors which data blocks have been accessed and prefetches adjacent blocks into DRAM.” It enables more than 90 percent of the array data reads to be satisfied from memory instead of from much slower storage drives.
Despite the similarity in naming, however, Qumulo’s NeuralCache tech is distinct from Infinidat’s patented Neural Cache technology.
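The prefetching idea described above for Infinidat – watch which blocks are read and pull adjacent blocks into a DRAM cache before they are requested – can be sketched generically as follows. The names are invented and this is neither vendor’s code.

```python
# Bare-bones adjacent-block prefetch into a DRAM-style LRU cache.
from collections import OrderedDict

class PrefetchCache:
    def __init__(self, backend_read, capacity=1024, prefetch_depth=4):
        self.backend_read = backend_read           # callable(block_no) -> bytes (slow storage)
        self.capacity, self.prefetch_depth = capacity, prefetch_depth
        self.cache = OrderedDict()                 # block_no -> data, in LRU order

    def _put(self, block_no, data):
        self.cache[block_no] = data
        self.cache.move_to_end(block_no)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)         # evict the least recently used block

    def read(self, block_no):
        if block_no in self.cache:                 # cache hit: served from memory
            self.cache.move_to_end(block_no)
            return self.cache[block_no]
        data = self.backend_read(block_no)         # miss: go to slower storage
        self._put(block_no, data)
        for n in range(block_no + 1, block_no + 1 + self.prefetch_depth):
            if n not in self.cache:                # speculatively prefetch adjacent blocks
                self._put(n, self.backend_read(n))
        return data
```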
Qumulo’s NeuralCache is available immediately as part of the vendor’s latest software release and is seamlessly integrated into the Qumulo Cloud Data Fabric. Existing customers can upgrade to it with no downtime. Find out more here.
Interview: Startup Starburst develops and uses Trino open source distributed SQL to query and analyze distributed data sources. We spoke to CEO Justin Borgman about the company’s strategy.
A little history to set the scene, and it starts with Presto. This was a Facebook (now Meta) open source project from 2012 to provide analytics for its massive Hadoop data warehouses by using a distributed SQL query engine. It could analyze Hadoop, Cassandra, and MySQL data sources and was open sourced under the Apache license in 2013.
The four Presto creators – Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang – left in 2018 after disagreements over Facebook’s influence on Presto governance.
They then forked the Presto code to PrestoSQL. Facebook donated Presto to the Linux Foundation in 2019, which then set up the Presto Foundation. By then, thousands of businesses and other organizations were Presto users. PrestoSQL was rebranded to Trino to sidestep potential legal action after Facebook obtained the “Presto” trademark. The forkers set up Starburst in 2019, with co-founder and CEO Justin Borgman, to supply Trino and sell Trino connectors and support.
Borgman co-founded SQL-on-Hadoop company Hadapt in 2010. Hadapt was bought by Teradata in 2014 with Borgman becoming VP and GM of its Hadoop portfolio unit. He resigned in 2019 to join the other Starburst founders.
Eric Hwang is a distinguished engineer at Starburst. David Phillips and Dain Sundstrom both had CTO responsibilities, but they left earlier this year to co-found IceGuard, a stealth data security company. Martin Traverso is Starburst’s current CTO.
Starburst graphic
Starburst has raised $414 million over four rounds in 2019 ($22 million A-round), 2020 ($42 million B-round), 2021 ($100 million C-round), and 2022 ($250 million D-round).
It hired additional execs in early 2024 and again later that year to help it grow its business in the hybrid data cloud and AI areas.
Earlier this year, Starburst reported its highest global sales to date, including significant growth in North America and EMEA, with ARR per customer over $325,000. Adoption of Starburst Galaxy, its flagship cloud product, increased 94 percent year-over-year, and it signed its largest ever deal – a multi-year contract with a global financial institution worth eight figures per year.
Blocks and Files: Starburst is, I think, a virtual data lakehouse facility in that you get data from various sources and then feed it upstream to whoever you need to.
Justin Borgman
Justin Borgman: Yeah, I like that way of thinking about it. We don’t call ourselves a virtual lakehouse, but it makes sense.
Blocks and Files: Databricks and Snowflake have been getting into bed with AI for some time, with the last six to nine months seeing frenetic adoption of large language models. Is Starburst doing the same sort of thing?
Justin Borgman: In a way, yes, but maybe I’ll articulate a couple of the differences. So for us, we’re not focusing on the LLM itself.
We’re basically saying customers will choose their own LLM, whether that’s OpenAI or Anthropic or whatever the case may be. But where we are playing an important role is in those agentic RAG workflows that are accessing different data sources, passing that on to the LLM to ensure accurate contextual information.
And that’s where we think we actually have a potential advantage relative to those two players. They’re much larger than us, so I can see they’re further along. But as you pointed out, we have access to all the data in an enterprise, and I think in this era of agents and AI, it’s really whoever has the most data that wins, I think, at the end of the day. And so that’s really what we provide is access to all of the data in the enterprise, not just the data in one individual lake or one individual warehouse, but all of the data.
Blocks and Files: That gives me two thoughts. One is that you must already have a vast number of connectors connecting Starburst to data sources. I imagine an important but background activity is to make sure that they’re up to date and you keep on connecting to as many data sources as possible.
Justin Borgman: That’s right.
Blocks and Files: The second one is that you are going to be, I think, providing some kind of AI pipeline, a pipeline to select data from your sources, filter it in some way. For instance, removing sensitive information and then sending it upstream, making it available. And the point at which you send it upstream and say Starburst’s work stops could be variable. For example, you select some filters, some data from various sources, and there it is sitting in, I guess, some kind of table format. But it’s raw data, effectively, and the AI models need it tokenized. They need it vectorized, which means the vectors have to be stored someplace and then they use it for training or for inference. So where does Starburst activity stop?
Justin Borgman: Everything you said is right. I’m going to quantify that a little bit. So we have over 50 connectors to your earlier point. So that covers every traditional database system you can think of, every NoSQL database, basically every database you can think of. And then where we started to expand is adding large SaaS providers like Salesforce and ServiceNow and things of that nature as well. So we have access to all those things.
You’re also correct that we provide access control across all of those and very fine grain. So row level, column level, we can do data masking and that is part of the strength of our platform, that the data that you’re going to be leveraging for your AI can be managed and governed in a very fine-grained manner. So that’s role-based and attribute-based access controls.
To address your question of where does it stop, the reason that’s such a great question is that actually in May, we’re going to be making some announcements of going a bit further than that, and I don’t want to quite scoop myself yet, but I’ll just say that I think in May you will see us doing pretty much the entire thing that you just described today. I would say we would stop before the vectorization and that’s where we stop today.
Blocks and Files: I could see Starburst, thinking we are not a database company, but we do access stored vaults of data, and we probably access those by getting metadata about the data sources. So when we present data upstream, we could either present the actual data itself, in which case we suck it up from all our various sources and pump it out, or we just use the metadata and send that upstream. Who does it? Do you collect the actual data and send it upstream or does your target do that?
Justin Borgman: So we actually do both of the things you described. First of all, what we find is a lot of our customers are using an aspect of our product that we call data products, which is basically a way of creating curated datasets. And because, as you described it, we’re this sort of virtual lakehouse, those data products can actually be assembled from data that lives in multiple sources. And that data product is itself a view across those different data sources. So that’s one layer of abstraction. And in that case, no data needs to be moved necessarily. You’re just constructing this view.
But at the end of the day, when you’re executing your RAG workflows and you’re passing data on, maybe as a prompt, to an LLM calling an LLM function, in those cases, we can be moving data.
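To make the “view across multiple sources” idea concrete, here is a minimal federated data product built with the open source Trino Python client; the host, catalog, schema, and table names are hypothetical, not a real Starburst deployment.

```python
# Hypothetical federated "data product": a view joining two catalogs through Trino.
import trino

conn = trino.dbapi.connect(host="starburst.example.com", port=8080,
                           user="analyst", catalog="hive", schema="sales")
cur = conn.cursor()

# A curated view spanning an operational Postgres table and an Iceberg table in the lake.
cur.execute("""
    CREATE OR REPLACE VIEW hive.products.customer_orders AS
    SELECT c.customer_id, c.region, o.order_id, o.amount
    FROM postgresql.crm.customers AS c
    JOIN iceberg.lake.orders AS o ON o.customer_id = c.customer_id
""")

# Consumers (or a RAG pipeline) query the data product; no data was copied to build it.
cur.execute("SELECT region, sum(amount) FROM hive.products.customer_orders GROUP BY region")
print(cur.fetchall())
```

Only the view definition is stored; at query time the engine plans sub-queries against each underlying source, which is the abstraction Borgman describes.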
Blocks and Files: If you are going to be possibly vectorizing data, then the vectors need storing someplace, and you could do that yourself or you could ring up Pinecone or Milvus or Weaviate. Is it possible for you to say which way you are thinking?
Justin Borgman: Your questions are spot on. I’m trying to think of what I should say here … I’ll say nothing for today. Other than that, that is a perfect question and I will have a very clear answer in about six weeks.
Blocks and Files: If I get talking to a prospect and the prospect customer says, yes, I do have data in disparate sources within the individual datacenters and across datacenters and in the public cloud and I have SaaS datasets, should I then say, go to a single lakehouse data warehouse supplier, for example, Snowflake or Databricks or something? Or should I carry on using where my data currently is and just virtually collect it together as and when is necessary with, for example, Starburst? What are the pros and cons of doing that?
Justin Borgman: Our answer is actually a combination of the two, and I’ll explain what I mean by that. So we think that storing data in object storage in a lake in open formats like Iceberg tables is a wonderful place to store large amounts of data. I would even say as much as you reasonably can because the economics are going to be ideal for you, especially if you choose an open format like Iceberg, because the industry has decided that Iceberg is now the universal format, and that gives you a lot of flexibility as a customer. So we think data lakes are great. However, we also don’t think it is practical for you to have everything in your lake no matter what. Right? It is just a fantasy that you’ll never actually achieve. And I say this partly from my own experience…
So we need to learn from our past mistakes. And so I think that the approach has to have both. I think a data lake should be a large center of gravity, maybe the largest individual center of gravity, but you’re always going to have these other data sources, and so your strategy needs to take that into account.
I think that the notion that you have to move everything into one place to be able to have an AI strategy is not one that’s going to work well for you because your data is always going to be stale. It’s never going to be quite up to date. You’re always going to have purpose-built database systems that are running your transactional processing and different purposes. So our approach is both. Does that make sense?
Blocks and Files: It makes perfect sense, Justin. You mentioned databases, structured data. Can Starburst support the use of structured data in block storage databases?
Justin Borgman: Yes, it can.
Blocks and Files: Do you have anything to do or any connection at all with knowledge graphs for representing such data?
Justin Borgman: We do have connectors to a couple of different graph databases, so that is an option, but I wouldn’t say it’s a core competency for us today.
Blocks and Files: Stepping sideways slightly. Backup data protection companies such as Cohesity and Rubrik will say, we have vast amounts of backed-up data in data stores, and we’re a perfect source for retrieval-augmented generation. And that seems to me to be OK, up to a point. If you met a prospect who said, well, we’ve got lots of information in our Cohesity backup store, we’re using that for our AI pipelines, what can you do there? Or do you think it is just another approach that’s got its validity, but it’s not good enough on its own?
Justin Borgman: From our customer base, I have not seen a use case that was leveraging Cohesity or Rubrik as a data source, but we do see tons of object storage. So we have a partnership in fact with Dell, where Dell is actually selling Starburst on top of their object storage, and we do work with Pure and MinIO and all of these different storage providers that have made their storage really S3 compatible. It looks like it’s S3, and those are common data sources, but the Cohesity and Rubriks of the world, I haven’t seen that. So I’m not sure if the performance would be sufficient. It’s a fair question, I don’t know, but probably the reason that I haven’t seen it would suggest there’s probably a reason I haven’t seen it, is my guess.
Blocks and Files: Let’s take Veeam for a moment. Veeam can send its backups to object storage, which in principle gives you access to that through an S3-type connector. But if Veeam sends its backups to its own storage, then that becomes invisible to you unless you and Veeam get together and build a connector to it. And I daresay Veeam at that point would say, nice to hear from you, but we are not interested.
Justin Borgman: Yes, I think that’s right.
Blocks and Files: Could I take it for granted that you would think that although a Cohesity/Rubrik-style approach to providing information for RAG would have validity, it’s not real-time and therefore that puts the customers at a potential disadvantage?
Justin Borgman: That’s my impression. Yes, that’s my impression.
Analysis: Trump’s tariffs will affect US companies whose multinational supply chains bring components and finished products into America, as well as foreign suppliers importing products to the US. They will also affect US storage suppliers exporting to tariff-raising countries. There are three groups facing different tariff-related problems.
“This tariff policy would set the US tech sector back a decade in our view if it stays.”
Wedbush
Wedbush financial analyst Daniel Ives is telling subscribers to brace themselves: “Investors today are coming to the scary realization this economic Armageddon Trump tariff policy is really going to be implemented this week and it makes the tech investing landscape the most difficult I have seen in 25 years covering tech stocks on the Street. Where is the E in the P/E? No one knows….what does this do to demand destruction, Cap-Ex plans halted, growth slowdown, and damaging companies and consumers globally. Then there is the cost structure and essentially ripping up a global supply chain overnight with no alternative….making semi fabs in West Virginia or Ohio this week? Building hard drive disks in Florida or New Jersey next month?”
And: “…this tariff policy would set the US tech sector back a decade in our view if it stays.”
Here is a survey of some of the likely effects, starting with a review of Trump’s tariffs on countries involved in storage product supply.
China gets the top overall tariff rate of 54 percent, followed by Cambodia on 49 percent, Laos on 48 percent, and Vietnam on 46 percent. Thailand gets a 37 percent tariff imposed, Indonesia and Taiwan 32 percent. India gets 27 percent, South Korea 26 percent, and Japan 24 percent. The EU attracts 18.5 percent and the Philippines 18 percent.
US storage component and product importers
US suppliers with multinational supply chains import components and even complete products to the US. At the basic hardware level, the Trump tariffs could affect companies supplying DRAM, NAND, SSDs, tape, and tape drives, as well as those making storage controllers and server processors.
However, ANNEX II of the Harmonized Tariff Schedule of the United States (HTSUS) applies to a presidential proclamation or trade-related modification that amends or supplements the HTSUS. Currently it says that semiconductors are exempt from tariffs.
Semiconductor chips are exempt but not items that contain them as components.
Micron makes DRAM, NAND, and SSDs. The DRAM is manufactured in Boise, Idaho, and in Japan, Singapore, and Taiwan. The exemption could apply to the DRAM and NAND chips but not necessarily to the SSDs that contain NAND, as there is no specific exemption for them. They face the appropriate country of origin tariffs specified by the Trump administration.
Samsung makes DRAM and NAND in South Korea with some NAND made in China. SSD assembly is concentrated in South Korea. The SSDs will likely attract the South Korea 26 percent tariff.
SK hynix makes its DRAM and NAND chips and SSDs in South Korea, while subsidiary Solidigm makes its SSDs in China, implying that Solidigm’s US import prices will be affected by the 54 percent tariff on Chinese goods and SK hynix’s by the 26 percent tariff on South Korean ones.
Kioxia NAND and SSDs are made in Japan and the SSDs bought in America will attract a 24 percent tariff – which suppliers will pass on to US consumers, in part or in full. SanDisk NAND is made in Japan (with Kioxia), but we understand some of its SSDs are manufactured in China – which means a 54 percent tariff might apply. That means Kioxia, Samsung, and SK hynix SSDs, but not Solidigm ones, could cost less than SanDisk SSDs.
Consider Seagate and its disk drives. It has component and product manufacturing and sourcing operations in an integrated international supply chain involving China, Thailand, Singapore, and Malaysia.
It makes disk drive platters and some finished drives – Exos, for example – in China, and spindle motors, head gimbal assemblies, and other finished drives in Thailand. Platters and some other drives are assembled in Singapore and Malaysia. Trump’s tariffs will apply to finished drives imported into the US, with rates depending on country of origin.
The tariff rates for China, Malaysia, Singapore, and Thailand are 54 percent, 24 percent, 10 percent, and 36 percent respectively. If Seagate raised its prices to US customers by the tariff amounts, the effect would be dramatic.
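Using the rates quoted above, the landed-cost arithmetic for a finished drive is simple enough; the $300 list price below is purely illustrative.

```python
# Landed cost of a hypothetical $300 drive under the quoted country-of-origin rates.
rates = {"China": 0.54, "Malaysia": 0.24, "Singapore": 0.10, "Thailand": 0.36}
list_price = 300.0  # illustrative only
for country, rate in rates.items():
    print(f"{country}: ${list_price * (1 + rate):.2f} if the tariff is passed on in full")
```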
Western Digital will be similarly affected as it assembles its disk drives in Malaysia and Thailand, and so faces tariffs of 24 and 36 percent respectively on these drives.
Toshiba HDDs are made in China, the Philippines, and Japan, implying US import tariffs of 54, 18, and 24 percent respectively.
IBM makes tape drives for itself and the LTO consortium in Tucson, Arizona, so there are no Trump tariffs applying to them, only to whatever foreign-made components IBM might be importing.
LTO tape media is made by Japan’s Fujifilm and Sony. Fujifilm makes its tape in the US, in Bedford, Massachusetts, but Sony makes its tape in Japan, meaning it will get a 24 percent tariff applied to tape imports into the US. Fujifilm wins while Sony loses.
Recordable Blu-ray and DVD discs are made in China, India, Japan, and Taiwan, and will have US import tariffs imposed on them depending upon the country of origin.
Storage controllers and server processors are mostly made by Intel with some by AMD.
Intel has CPU fabs in Oregon (Hillsboro), Arizona (Chandler), and New Mexico (Rio Rancho). There are processor assembly, test, and packaging facilities in Israel, Malaysia, Vietnam, China, and Costa Rica. The Leixlip plant in County Kildare, Ireland, also produces a range of processors. This is a complex manufacturing supply chain, and Intel will avoid a tariff hit on all its CPUs and other semiconductor products because of the ANNEX II exemptions above. The same applies to AMD processors and Arm chips.
Storage arrays are typically made in the US, with Dell, HPE, and NetApp all manufacturing inside the US. However, Hitachi Vantara makes storage arrays in Japan, so they will receive a 24 percent import tariff. Lenovo’s storage is mostly based on OEM’d NetApp arrays so it might share NetApp’s US country of origin status and so avoid tariffs.
Infinidat outsources its array manufacturing to Arrow Electronics, which has a global supply chain, with the US as a global hub. The actual country of origin of Infinidat’s arrays has not been publicly revealed and lawyers may well be working on its legal location.
Hitachi Vantara looks likely to be the most disadvantaged storage array supplier, at the moment.
Non-US storage suppliers
Non-US storage suppliers exporting to the US will feel the tariff pain depending upon their host country. We understand the country of origin of manufactured storage hardware products will be the determining factor.
EU storage suppliers will be affected – unless they maintain a US-based presence.
One tactic suppliers might use is to transfer to a US operation and so avoid tariffs altogether – although critics have said investing in the US at present, with construction costs up and consumer spending down, is far from a safe bet.
US storage exporters
The third group of affected storage suppliers are the US storage businesses exporting goods to countries including China, which is raising its own tariffs in response. There is now a 34 percent tariff on US goods imported into China, starting April 10. This will affect all US storage suppliers exporting there. For example, Intel, which exports x86 CPUs to China.
We understand that China’s tariffs in reaction to Trump’s apply to the country of origin of the US-owned supplier’s manufactured products and not to the US owning entity. So Intel’s US-made semiconductor chips exported to China will have the tariff imposed by Beijing, but not its products made elsewhere in the world. Thus foreign-owned suppliers exporting storage products to China from the US will have the 34 percent tariff applied but this will not apply to their goods exported to China from the rest of the world.
If other countries outside the US were to follow China’s lead and apply their own import tariffs on US-originated goods, US-based exporters would feel the pain, too.
We believe that one of the general storage winners from this tariff fight is Huawei. It doesn’t import to the US anyway, and is thus unaffected by Trump’s tariff moves. As a Chinese supplier, it is also not affected by China’s tariffs on US-made goods, unlike Lenovo if it imports its NetApp OEM’d arrays into China.
Analysis: Pure Storage has won a deal to supply its proprietary flash drive technology to Meta, with Wedbush financial analysts seeing this as “an extremely positive outcome for PSTG given the substantially greater EB of storage PSTG will presumably ship.” The implication is that hyperscaler HDD purchases will decline as a result of this potentially groundbreaking deal.
The storage battleground here is for nearline data that needs to have fast online access while being affordable. Pure says its Direct Flash Modules (DFMs), available at 150 TB and soon 300 TB capacity points, using QLC flash, will save significant amounts of rack space, power, and cooling versus storing the equivalent exabytes of data in 30-50 TB disk drives.
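The rack-space argument is easiest to see as a drive-count exercise using the capacity points quoted above; the 1 EB target is just a round illustrative figure.

```python
# Drives needed to hold 1 EB (1,000,000 TB) at the quoted capacity points.
target_tb = 1_000_000
for label, capacity_tb in [("30 TB HDD", 30), ("50 TB HDD", 50),
                           ("150 TB DFM", 150), ("300 TB DFM", 300)]:
    print(f"{label}: {target_tb // capacity_tb:,} drives")
# 30 TB HDDs need 33,333 drives versus 3,333 for 300 TB DFMs holding the same exabyte.
```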
A Pure blog by co-founder and Chief Visionary Officer John Colgrove says: “Our DirectFlash Modules drastically reduce power consumption compared to legacy hard disk storage solutions, allowing hyperscalers to consolidate multiple tiers into a unified platform.”
He adds: “Pure Storage enables hyperscalers and enterprises with a single, streamlined architecture that powers all storage tiers, ranging from cost-efficient archive solutions to high-performance, mission-critical workloads and the most demanding AI workloads.” That’s because “our unique DirectFlash technology delivers an optimal balance of price, performance, and density.”
A Meta blog states: “HDDs have been growing in density, but not performance, and TLC flash remains at a price point that is restrictive for scaling. QLC technology addresses these challenges by forming a middle tier between HDDs and TLC SSDs. QLC provides higher density, improved power efficiency, and better cost than existing TLC SSDs.”
It makes a point about power consumption: “QLC flash introduced as a tier above HDDs can meet write performance requirements with sufficient headroom in endurance specifications. The workloads being targeted are read-bandwidth-intensive with infrequent as well as comparatively low write bandwidth requirements. Since the bulk of power consumption in any NAND flash media comes from writes, we expect our workloads to consume lower power with QLC SSDs.”
Meta says it’s working with Pure Storage “utilizing their DirectFlash Module (DFM) and DirectFlash software solution to bring reliable QLC storage to Meta … We are also working with other NAND vendors to integrate standard NVMe QLC SSDs into our datacenters.”
It prefers the U.2 drive form factor over any EDSFF alternatives, noting that “it enables us to potentially scale to 512 TB capacity … Pure Storage’s DFMs can allow scaling up to 600 TB with the same NAND package technology. Designing a server to support DFMs allows the drive slot to also accept U.2 drives. This strategy enables us to reap the most benefits in cost competition, schedule acceleration, power efficiency, and vendor diversity.”
The bloggers say: “Meta recognizes QLC flash’s potential as a viable and promising optimization opportunity for storage cost, performance, and power for datacenter workloads. As flash suppliers continue to invest in advanced fab processes and package designs and increase the QLC flash production output, we anticipate substantial cost improvements.” That’s bad news for the HDD makers who must hope that HAMR technology can preserve the existing HDD-SSD price differential.
Wedbush analysts had a briefing from Colgrove and CFO Kevan Krysler, who said that Pure’s technology “will be the de facto standard for storage except for certain very performant use cases” at Meta.
We understand that Meta is working with Pure for its flash drive, controller, and system flash drive management software (Purity). It is not working with Pure at the all-flash array (AFA) level, suggesting other AFA vendors without flash-level IP are wasting their time knocking on Meta’s door. Also, Meta is talking to Pure because it makes QLC flash drives that are as attractive as – or more attractive than – those of off-the-shelf vendors such as Solidigm. Pure’s DFMs have higher capacities, lower return rates, and other advantages over commercial SSDs.
The Wedbush analysts added this thought, which goes against Pure’s views to some extent, at least in the near-term: “We would note that while PSTG likely displaces some hard disk, we also believe Meta’s requirements for HDD bits are slated to grow in 2025 and 2026.” Flash is not yet killing off disk at Meta, but it is restricting HDD’s growth rate.
Generalizing from the Pure-Meta deal, they add: “Any meaningful shift from HDD to flash in cloud environments, should seemingly result in a higher longer term CAGR for flash bits, a result that should ultimately prove positive for memory vendors (Kioxia, Micron, Sandisk, etc.)”
Auwau provisions multi-tenant backup services for MSPs and enterprise departments, with automated billing and stats.
It is a tiny Danish firm, just three people, with a mature Cloutility software stack whose functionality is highly valued as easy to use by its 50 or so customers, which is why we’re writing about it. Auwau’s web-based software enables MSPs and companies to deliver Backup-as-a-Service (BaaS) and S3-to-tape storage as a service, with Cloutility supporting IBM’s Storage Protect and Storage Defender, Cohesity DataProtect, Rubrik, and PoINT’s (S3-to-tape endpoint-based) Archival Gateway. IBM is an Auwau reseller.
Thomas Bak
CEO Thomas Bak founded Auwau in Valby, Denmark in 2016, basing it around a spin-out of backup-as-a-service software acquired while he was a Sales Director and Partner at Frontsafe. Cloutility runs on a Windows machine and has a 30-minute install. It doesn’t support Linux, with Bak saying he “never meets a SP who doesn’t have Windows somewhere.”
Although BaaS provisioning is a core service, the nested multi-tenancy automated billing is equally important, and the two functions are controlled through a single software pane of glass. Users can activate new backups and schedule backups from Cloutility via self-service.
Bak told an IT Press Tour audience: “Multi-tenancy is a big draw [with] tenants in unlimited tree structures. … We automate subscription-based billing.” Cloutility provides price by capacity by tenant and customers can get automated billing for their tenants plus custom reporting and alerting. He said: “We sell to enterprises who are internal service providers and want data in their data centers.” On-premises and not in the cloud in other words.
Cloutility single pane of glass
Universities could invoice per department and/or by projects for example. Role-based access control, single sign-on and two-factor authentication are all supported.
Auwau offers OEM/white label branding capability so every reselling tenant of an MSP could be branded. Their recurring bills and reports will reflect this branding. An MSP can set up partners and resellers in their multi-tenant tree structure who can add and operate their own subset of customers as if the system was their own.
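Auwau has not published Cloutility’s internals; the sketch below simply illustrates the kind of nested-tenant, price-per-capacity billing rollup being described, with all names and rates invented.

```python
# Illustrative nested-tenant billing rollup: price per TB by tenant, summed up the tree.
from dataclasses import dataclass, field

@dataclass
class Tenant:
    name: str
    price_per_tb: float = 0.0          # invented rate charged for this tenant's own usage
    used_tb: float = 0.0               # capacity consumed directly by this tenant
    children: list["Tenant"] = field(default_factory=list)

    def invoice(self, indent: int = 0) -> float:
        own = self.used_tb * self.price_per_tb
        # Children print before their parent: a bottom-up rollup of the tenant tree.
        total = own + sum(child.invoice(indent + 2) for child in self.children)
        print(f"{' ' * indent}{self.name}: ${total:,.2f}")
        return total

msp = Tenant("MSP", children=[
    Tenant("Reseller A", children=[
        Tenant("Customer A1", price_per_tb=30, used_tb=40),
        Tenant("Customer A2", price_per_tb=30, used_tb=15),
    ]),
    Tenant("University", children=[
        Tenant("Physics dept", price_per_tb=22, used_tb=60),
        Tenant("Library", price_per_tb=22, used_tb=10),
    ]),
])
msp.invoice()
```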
Development efforts are somewhat limited; there are only two engineers. Bak says Auwau will add new backup service provisioning connectors and billing when customers request them. He doesn’t have a build-it-and-they-will-come approach to product development. It’s more a case of needing a dependable future cash flow from customers requesting a new BaaS offering to spur any development.
He has Veeam support as a roadmap item, but with no definite timescale. There are no plans for an IBM COS to general S3 target capability, nor support for Cohesity NetBackup.
In the USA and some other geographies N-able would be a competitor, but Bak says he never meets N-able in the field.
Bak is very agile. He finished his presentation session by doing a handstand and walking around on his hands. That will make for unforgettable sales calls.
Bootnote
IBM Storage Defender is a combination of IBM Storage Protect (Spectrum Protect as was), FlashSystem, Storage Fusion and Cohesity’s DataProtect product. This will run with IBM Storage’s DS8000 arrays, tape and networking products.
There is no restore support for IBM Storage Protect as it is not API-driven. Rubrik and Cohesity are OK for restore.
Cohesity announced it is the first data protection provider to achieve Nutanix Ready validation for Nutanix Database Service (NDB) database protection. It says: “NDB is the market leading database lifecycle management platform for building database-as-a-service solutions in hybrid multicloud environments.” Cohesity DataProtect now integrates with NDB’s native time machine capabilities and streamlines protection for PostgreSQL databases on NDB via a single control plane.
Bill O’Connell
…
Data protector Commvault has announced the appointment of Bill O’Connell as its chief security officer. He had prior leadership roles at Roche, leading technical, operational, and strategic programs to protect critical data and infrastructure, and also at ADP. He previously served as chair of the National Cyber Security Alliance Board of Directors and remains actively involved in various industry working groups focused on threat intelligence and privacy.
…
Global food and beverage company Danone has adopted Databricks’ Data Intelligence Platform to drive improvements in data accuracy and reduce “data-to-decision” time by up to 30 percent. It says data ingestion times are set to drop from two weeks to one day, with fewer issues requiring debugging and fixes. A “Talk to Your Data” chatbot, powered by generative AI and Unity Catalog, will help non-technical users explore data more easily. Built-in tools will support rapid prototyping and deployment of AI models. Secure, automated data validation and cleansing could increase accuracy by up to 95 percent.
…
ExaGrid announced three new models, adding the EX20, EX81, and EX135 to its line of Tiered Backup Storage appliances, as well as the release of ExaGrid software version 7.2.0. The EX20 has 8 disks that are 8 TB each. The EX81 has 12 disks that are 18 TB each. The EX135 has 18 disks that are 18 TB each. Thirty-two of the EX189 appliances in a single scale-out system can take in up to a 6 PB full backup with 12 PB raw capacity, making it the largest single system in the industry that includes data deduplication. ExaGrid’s line of 2U appliances now includes eight models: EX189, EX135, EX84, EX81, EX54, EX36, EX20, and EX10. Up to 32 appliances can be mixed and matched in a single scale-out system. Any age or size appliance can be used in a single system, eliminating planned product obsolescence.
The product line has also been updated with new Data Encryption at Rest (SEC) options. ExaGrid’s larger appliance models, including the EX54, EX81, EX84, EX135, and EX189, offer a Software Upgradeable SEC option to provide Data Encryption at Rest. SEC hardware models that provide Data Encryption at Rest are also available for ExaGrid’s entire line of appliance models. The v7.2.0 software includes External Key Management (EKM) for encrypted data at rest, support for NetBackup Flex Media Server Appliances with the OST plug-in, and support for Veeam S3 Governance Mode and Dedicated Managed Networks.
…
Data integration provider Fivetran announced it offers more than 700 pre-built connectors for seamless integration with Microsoft Fabric and OneLake. The integration, powered by Fivetran’s Managed Data Lake Service, enables organizations to ingest data from those connectors, automatically convert it into open table formats like Apache Iceberg or Delta Lake, and continuously optimize performance and governance within Microsoft Fabric and OneLake – without the need for complex engineering effort.
…
Edge website accelerator Harper announced version 5 of its global application delivery platform. It includes several new features for building, scaling, and running high-performance data-intensive workloads, including the addition of Binary Large Object (Blob) storage for the efficient handling of unstructured, media-rich data (images, real-time videos, and rendered HTML). It says the Harper platform has unified the traditional software stack – database, application, cache, and messaging functions – into a single process on a single server. By keeping data at the edge, Harper lets applications avoid the transit time of contacting a centralized database. Layers of resource-consuming logic, serialization, and network processes between each technology in the stack are removed, resulting in extremely low response times that translate into greater customer engagement, user satisfaction, and revenue growth.
…
Log data lake startup Hydrolix has closed an $80 million C-round of funding, bringing its total raised to $148 million. It has seen an eightfold sales increase in the past year, with more than 400 new customers, and is building sales momentum behind a comprehensive channel strategy. The cornerstone of that strategy is a partnership with Akamai, whose TrafficPeak offering is a white label of Hydrolix. Additionally, Hydrolix recently added Amazon Web Services as a go-to-market (GTM) partner and built connectors for massive log-data front-end ecosystems like Splunk. These and similar efforts have driven the company’s sales growth, and the Series C is intended to amplify this momentum.
…
Cloud data management supplier Informatica has appointed Krish Vitaldevara as EVP and chief product officer coming from NetApp and Microsoft. This is a big hire. He was an EVP and GM for NetApp’s core platforms and led NetApp’s 2,000-plus R&D team responsible for technology, including ONTAP, FAS/AFF, application integration & data protection software. At Informatica, “Vitaldevara will develop and execute a product strategy aligning with business objectives and leverage emerging technologies like AI to innovate and improve offerings. He will focus on customer engagement, market expansion and strategic partnerships while utilizing AI-powered, data-driven decision-making to enhance product quality and performance, all within a collaborative leadership framework.”
…
Sam King
Cloud file services supplier Nasuni has appointed Sam King as CEO, succeeding Paul Flanagan who is retiring after eight years in the role. Flanagan will remain on the Board, serving as Non-Executive Chairman. King was previously CEO of application security platform supplier Veracode from 2019 to 2024.
…
Object First has announced three new Ootbi object backup storage appliances for Veeam, with new entry-level 20 and 40 TB capacities, and a range-topping 432 TB model, plus new firmware delivering 10-20 percent faster recovery speeds across all models. The 432 TB model supports ingest speeds of up to 8 GBps in a four-node cluster, double the previous speed. New units are available for purchase immediately worldwide.
…
OpenDrives is bringing a new evolution of its flagship Atlas data storage and management platform to the 2025 NAB Show. Atlas’ latest release provides cost predictability and economical scalability with an unlimited capacity pricing model, high performance and freedom from paying for unnecessary features with targeted composable feature bundles, greater flexibility and freedom of choice with new certified hardware options, and intelligent data management via the company’s next-generation Atlas Performance Engine. OpenDrives has expanded certified hardware options to include the Seagate Exos E JBOD expansion enclosures.
…
Other World Computing announced the release of OWC SoftRAID 8.5, its RAID management software for macOS and Windows, “with dozens of enhancements,” delivering “dramatic increases in reliability, functionality, and performance.” It also announced the OWC Archive Pro Ethernet network-based LTO backup and archiving system with drag-and-drop simplicity, up to 76 percent cost savings versus HDD storage, a 501 percent ROI, and full macOS compatibility.
OWC Archive Pro
…
Percona is collaborating with Red Hat: Percona Everest now supports OpenShift, giving users a fully open source platform for running “database as a service”-style instances on their own private or hybrid cloud. The combination of Everest, a cloud-native database platform, with Red Hat OpenShift allows users to implement their choice of database in their choice of locations – from on-premises datacenter environments through to public cloud and hybrid cloud deployments.
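For flavor, here is a minimal sketch of what a declarative database request on such a platform could look like, using the standard Kubernetes Python client. The Everest custom-resource group, version, and spec fields shown are assumptions for illustration, not a documented schema.

```python
# Minimal sketch, assuming an Everest-style DatabaseCluster custom resource;
# the group/version and spec fields are illustrative assumptions, not a
# documented Everest schema.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside OpenShift
api = client.CustomObjectsApi()

database_cluster = {
    "apiVersion": "everest.percona.com/v1alpha1",  # assumed group/version
    "kind": "DatabaseCluster",
    "metadata": {"name": "demo-postgres", "namespace": "databases"},
    "spec": {                                      # illustrative fields only
        "engine": {
            "type": "postgresql",
            "version": "16",
            "replicas": 3,
            "storage": {"size": "20Gi"},
        },
    },
}

# Submit the request; the operator running on OpenShift would provision
# and manage the actual database instances.
api.create_namespaced_custom_object(
    group="everest.percona.com",
    version="v1alpha1",
    namespace="databases",
    plural="databaseclusters",
    body=database_cluster,
)
```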
…
Perforce Delphix announced GA of Delphix Compliance Services, a data compliance product built in collaboration with Microsoft. It offers automated AI and analytics data compliance, supports over 170 data sources, and is natively integrated into Microsoft Fabric pipelines. The initial release is pre-integrated with Microsoft Azure Data Factory and Microsoft Power BI to natively protect sensitive data in Azure and Fabric sources as well as other popular analytical data stores. The next phase of this collaboration adds a Microsoft Fabric Connector.
Perforce is a Platinum sponsor at the upcoming 2025 Microsoft Fabric Conference (FabCon) jointly sponsoring with PreludeSys. It will be demonstrating Delphix Compliance Services and natively masking data for AI and analytics in Fabric pipelines at booth #211 and during conference sessions.
…
Pliops has announced a strategic collaboration with the vLLM Production Stack, developed by LMCache Lab at the University of Chicago, aimed at improving large language model (LLM) inference performance. The vLLM Production Stack is an open source reference implementation of a cluster-wide, full-stack vLLM serving system. Pliops has developed XDP (Extreme Data Processor) key-value store technology, with its AccelKV software running in an FPGA or ASIC to accelerate low-level storage stack processing such as RocksDB, and has announced a LightningAI unit based on this technology.
…
Pure Storage is partnering with CERN to develop DirectFlash storage for Large Hadron Collider data. Through a multi-year agreement, Pure Storage’s data platform will support CERN openlab in evaluating and measuring the benefits of large-scale, high-density storage technologies. Both organizations will optimize exabyte-scale flash infrastructure and the application stack for Grid Computing and HPC workloads, identifying opportunities to maximize performance in both software and hardware while optimizing energy savings across a unified data platform.
…
Seagate has completed its acquisition of Intevac, a supplier of thin-film processing systems, at $4.00 per share, with 23,968,013 Intevac shares tendered. Intevac is now a wholly owned subsidiary of Seagate. Wedbush said: “We see the result as positive for STX given: 1) we believe media process upgrades are required for HAMR and the expense of acquiring and operating IVAC is likely less than the capital cost for upgrades the next few years and 2) we see the integration of IVAC into Seagate as one more potential hurdle for competitors seeking to develop HAMR, given that without an independent IVAC, they can no longer leverage the sputtering tool maker’s work to date around HAMR (with STX, we believe, using IVAC exclusively for media production).”
…
DSPM provider Securiti has signed a strategic collaboration agreement (SCA) with Amazon Web Services (AWS). AWS selected Securiti to help enterprise customers safely use their data with Amazon Bedrock’s foundation models, integrating Securiti’s Gencore AI platform to enable compliant, secure AI development with structured and unstructured data. Securiti says its Data Command Graph provides contextual data intelligence and identification of toxic combinations of risk, including the ability to correlate fragmented insights across hundreds of metadata attributes such as data sensitivity, access entitlements, regulatory requirements, and business processes. It also claims to offer the following:
Advanced automation streamlines remediation of data risks and compliance with data regulations.
Embedded regulatory insights and automated controls enable organizations to align with emerging AI regulations and frameworks such as EU AI Act and NIST AI RMF.
Continuous monitoring, risk assessments and automated tests streamline compliance and reporting.
…
Spectra Logic has launched the Rio Media Suite, which it says is simple, modular, and affordable software to manage, archive, and retrieve media assets across a broad range of on-premises, hybrid, and cloud storage systems. It helps break down legacy silos, automates and streamlines media workflows, and efficiently archives media. It is built on MediaEngine, a high-performance media archiver that orchestrates secure access and enables data mobility between ecosystem applications and storage services.
A variety of app extensions integrate with MediaEngine to streamline and simplify tasks such as creating and managing lifecycle policies, performing partial file restores, and configuring watch folders to monitor and automatically archive media assets. The modular MAP design of Rio Media Suite allows creative teams to choose an optimal set of features to manage and archive their media, with the flexibility to add capabilities as needs change or new application extensions become available.
Available object and file storage connectors enable a range of Spectra Logic and third-party storage options, including Spectra BlackPearl storage systems, Spectra Object-Based Tape, major third-party file and object storage systems, and public cloud object storage services from leading providers such as AWS, Geyser Data, Google, Microsoft and Wasabi.
A live demonstration of Rio Media Suite software will be available during exhibit hours on April 6-9, 2025, in the Spectra Logic booth (SL8519) at NAB Show, Las Vegas Convention Center, Las Vegas, Nevada. Rio Media Suite software is available for Q2 delivery.
…
Starfish Storage, which provides metadata-driven unstructured data management, is being used at Harvard’s Faculty of Arts and Sciences Research Computing group to manage more than 60 PB involving over 10 billion files across 600 labs and 4,000 users. In year one it delivered $500,000 in recovered chargeback, year two hit $1.5 million, and it’s on track for $2.5 million in year three. It also identified 20 PB of reclaimable storage, with researchers actively deleting what they no longer need. Starfish picked up a 2025 Data Breakthrough Award for this work in the education category.
…
Decentralized (Web3) storage supplier Storj announced a macOS client for its new Object Mount product. It joins the Windows client announced in Q4 2024 and the Linux client launched in 2022. Object Mount delivers “highly responsive, POSIX-compliant file system access to content residing in cloud or on-premise object storage platforms, without changing the data format.” Creative professionals can access content on any S3-compatible or blob object storage service as if they were working with familiar file storage systems. Object Mount is available for users of any cloud platform or on-premises object storage vendor, is universally compatible, and does not require any data migration or format conversion.
…
Media-centric shared storage supplier Symply is partnering with DigitalGlue to integrate “DigitalGlue’s creative.space software with Symply’s high-performance Workspace XE hardware, delivering a scalable and efficient hybrid storage solution tailored to the needs of modern content creators. Whether for small post-production teams or large-scale enterprise environments, the joint solution ensures seamless workflow integration, enhanced performance, and simplified management.”
…
An announcement from DDN’s Tintri subsidiary says: “Tintri, leading provider of the world’s only workload-aware, AI-powered data management solutions, announced that it has been selected as the winner of the ‘Overall Data Storage Company of the Year’ award in the sixth annual Data Breakthrough Awards program conducted by Data Breakthrough, an independent market intelligence organization that recognizes the top companies, technologies and products in the global data technology market today.”
We looked into the Data Breakthrough Awards program. There are several categories in these awards with multiple sub-category winners in each category: Data Management (13), Data Observability (4), Data Analytics (10), Business Intelligence (4), Compute and Infrastructure (6), Data Privacy and Security (5), Open Source (4), Data Integration and Warehousing (5), Hardware (4), Data Storage (6), Data Ops (3), Industry Applications (14) and Industry Leadership (11). That’s a whopping 89 winners.
In the Data Management category we find 13 winners, with DataBee the “Solution of the Year,” VAST Data the “Company of the Year,” Couchbase the “Platform of the Year,” and Grax picking up the “Innovation of the Year” award. The Data Storage category has six winners of its own, Tintri’s among them.
The award structure may strike some as unusual. The judging process details can be found here.
…
Cloud, disaster recovery, and backup specialist virtualDCS has announced a new senior leadership team as it enters a new growth phase after investment from private equity firm MonacoSol, which was announced last week. Alex Wilmot steps in as CEO, succeeding original founder Richard May, who moves into a new role as product development director. Co-founder Dan Nichols returns as CTO, while former CTO John Murray transitions to solutions director. Kieran Brady also joins as chief revenue officer (CRO) to drive the company’s next stage of expansion.
…
Cheaper-than-AWS S3 cloud storage supplier Wasabi has achieved Federal Risk and Authorization Management Program (FedRAMP) Ready status and announced its cloud storage service for the US federal government. Wasabi is now one step closer to full FedRAMP authorization, which will allow more government entities to use its cloud storage service.
…
Software RAID supplier Xinnor announced successful compatibility testing of a high-availability (HA) multi-node cluster system combining its xiRAID Classic 4.2 software with the Ingrasys ES2000 Ethernet-attached Bunch of Flash (EBOF) platform. The ES2000 supports up to 24 hot-swap NVMe SSDs and is compatible with Pacemaker-based HA clusters. Xinnor plans to fully support multi-node clusters based on Ingrasys EBOFs in upcoming xiRAID releases. Get the full PDF tech brief here.
Quesma has built a gateway between Elasticsearch EQL and SQL-based databases like ClickHouse, claiming EQL users can use it to access faster and cheaper data stores.
Jacek Migdal
EQL (Elastic Query Language) is used by tools such as Kibana, Logstash, and Beats. Structured Query Language (SQL) is the 50-year-old standard for accessing relational databases. Quesma co-founder Jacek Migdal, who previously worked at Sumo Logic, says that Elasticsearch is designed for Google-style searches, but 65 percent of its use cases come from observability and security rather than website search. The majority of telcos have big Elastic installations, he says, yet he claims Elastic can be 20x slower at answering queries than the SQL-accessed ClickHouse relational database.
Quesma lets users carry on using Elastic as a front end while translating EQL requests to SQL using a dictionary generated by an AI model. Migdal and Pawel Brzoska founded Quesma in Warsaw, Poland, in 2023, and raised €2.1 million ($2.3 million) in pre-seed funding at the end of that year.
The company partnered with streaming log data lake company Hydrolix in October 2024, as Hydrolix produces a ClickHouse-compatible data lake. Quesma lets Hydrolix customers continue using EQL-based queries, redirecting them to the SQL used by ClickHouse. Its software acts as a transparent proxy.
Hydrolix now has a Kibana compatibility feature powered by Quesma’s smart translation technology. It enables Kibana customers to connect their user interface to the Hydrolix cloud and its ClickHouse data store. This means Elasticsearch customers can migrate to newer SQL databases while continuing to use their Elastic UI.
Quesma enables customers to avoid difficult and costly all-in-one database migrations and do gradual migrations instead, separating the front-end access from the back-end database. Migdal told an IT Press Tour briefing audience: “We are using AI internally to develop rules to translate Elasticsearch storage rules to ClickHouse [and other] rules. AI produces the dictionary. We use two databases concurrently to verify rule development.”
Although AI is used to produce the dictionary, it is not used for inference at run time by customers. Migdal said: “Customers won’t use AI inferencing at run time in converting database interface languages. They don’t want AI there. Their systems may not be connected to the internet.”
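To make the transparent-proxy idea concrete, here is a minimal sketch, not Quesma’s actual code, of dictionary-driven translation in which a pre-built mapping (the kind an offline AI pass might generate) rewrites a simple Elasticsearch-style term query into SQL at request time, with no model inference in the path. The index, field, and table names are hypothetical.

```python
# Minimal sketch of dictionary-driven query translation: a transparent proxy
# rewrites Elasticsearch-style queries into SQL using a pre-built mapping.
# All names and rules here are hypothetical, not Quesma's actual code.

# A translation dictionary of the kind an offline AI pass might produce:
# Elasticsearch indexes/fields mapped to ClickHouse tables/columns.
TRANSLATION_DICT = {
    "indexes": {"web-logs": "logs.web_requests"},
    "fields": {"http.response.status_code": "status_code",
               "@timestamp": "event_time"},
}

def translate_term_query(index: str, es_query: dict) -> str:
    """Rewrite a simple Elasticsearch 'term' query into a SQL statement."""
    table = TRANSLATION_DICT["indexes"][index]
    clauses = []
    for field, value in es_query["query"]["term"].items():
        column = TRANSLATION_DICT["fields"].get(field, field)
        literal = f"'{value}'" if isinstance(value, str) else str(value)
        clauses.append(f"{column} = {literal}")
    return f"SELECT * FROM {table} WHERE {' AND '.join(clauses)}"

# Example: a Kibana-style term query redirected to a ClickHouse-compatible store.
es_query = {"query": {"term": {"http.response.status_code": 500}}}
print(translate_term_query("web-logs", es_query))
# -> SELECT * FROM logs.web_requests WHERE status_code = 500
```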
Its roadmap has a project to add pipe syntax extensions to SQL, so that the SQL operator syntax order matches the semantic evaluation order, making it easier to understand:
Quesma pipe syntax example
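The published example is an image. As a rough, generic illustration of the pipe-syntax idea (not necessarily Quesma’s exact grammar), the same query can be written either as conventional nested SQL or as a pipeline whose operators appear in evaluation order; the table and column names are made up for the example.

```python
# Illustrative only: classic SQL vs. a pipe-style rewrite in which operators
# appear in the order they are evaluated. This follows the general pipe-syntax
# idea; Quesma's actual syntax may differ.

CLASSIC_SQL = """
SELECT status_code, count(*) AS hits
FROM logs.web_requests
WHERE event_time > now() - INTERVAL 1 DAY
GROUP BY status_code
ORDER BY hits DESC
LIMIT 10
"""

PIPE_SQL = """
FROM logs.web_requests
|> WHERE event_time > now() - INTERVAL 1 DAY
|> AGGREGATE count(*) AS hits GROUP BY status_code
|> ORDER BY hits DESC
|> LIMIT 10
"""
```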
Quesma is also using its AI large language model experience to produce a charting app that interprets natural language prompts, such as “Plot top 10 languages, split by native and second language speakers,” to create and send requests to apps like Tableau.
Panzura’s Symphony unstructured data estate manager has extended its reach into IBM Storage Deep Archive territory, integrating with IBM’s S3-accessed Diamondback tape libraries.
Symphony is Panzura’s software for discovering and managing exabyte-scale unstructured data sets, featuring scanning, tiering, migration, and risk and compliance analysis. It complements Panzura’s original and core CloudFS hybrid cloud file services offering, which supports large-scale multi-site workflows and collaboration using active, not archived, data. The IBM Storage Deep Archive is a Diamondback TS6000 tape library storing up to 27 PB of LTO-9 data in a single rack, with 16.1 TB/hour (4.47 GBps) performance. It is equipped with an S3-accessible front end, similar to the file-based LTFS.
Sundar Kanthadai
Sundar Kanthadai, Panzura CTO, stated that this Panzura-IBM offering “addresses surging cold data volumes and escalating cloud fees by combining smart data management with ultra-low-cost on-premises storage, all within a compact footprint.”
Panzura Product SVP Mike Harvey added: “This integration allows technologists to escape the trap of unpredictable access fees and egress sticker shock.”
The Symphony-Deep Archive integration uses the S3 Glacier Flexible Retrieval storage class to “completely automate data transfers to tape.” Customers can use Symphony to scan an online unstructured data estate and move metadata-tagged cold data to the IBM tape library, freeing up SSD and HDD storage capacity while keeping the data on-premises. Embedded file metadata is automatically added to Symphony’s data catalog, which is searchable across more than 500 data types and accessible via API and Java Database Connectivity (JDBC) requests.
Specific file recall and deletion activity can be automated through policy settings.
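For a sense of the mechanics, the sketch below shows how standard S3 API calls with the Glacier Flexible Retrieval storage class can move an object to an S3-accessed archive tier and later request its recall. It illustrates the general approach, not Panzura’s implementation; the endpoint URL, bucket names, and object key are hypothetical.

```python
# Minimal sketch, not Panzura's implementation: copy a cold object to an
# S3-accessible archive tier using the GLACIER (Flexible Retrieval) storage
# class, then request a temporary restore when it is needed again.
# The endpoint URL, bucket names, and object key are hypothetical.
import boto3

s3 = boto3.client("s3", endpoint_url="https://deep-archive.example.internal")

def archive_object(bucket: str, key: str, archive_bucket: str) -> None:
    """Re-write the object into the archive bucket with the GLACIER class."""
    s3.copy_object(
        Bucket=archive_bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        StorageClass="GLACIER",
    )
    s3.delete_object(Bucket=bucket, Key=key)  # free the online capacity

def recall_object(archive_bucket: str, key: str, days: int = 7) -> None:
    """Ask the archive tier to stage the object back online for 'days' days."""
    s3.restore_object(
        Bucket=archive_bucket,
        Key=key,
        RestoreRequest={"Days": days,
                        "GlacierJobParameters": {"Tier": "Standard"}},
    )

archive_object("hot-data", "projects/video/raw-take-001.mov", "cold-archive")
recall_object("cold-archive", "projects/video/raw-take-001.mov")
```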
Panzura’s Symphony can access more than 400 file formats via a deal with GRAU Data for its Metadata Hub software. It is already integrated with IBM’s Fusion Data Catalog, which provides unified metadata management and insights for heterogeneous unstructured data on-premises and in the cloud, and with IBM Storage Fusion, a containerized offering derived from Spectrum Scale and Spectrum Protect data protection software.
According to IBM, Deep Archive is much more affordable than public cloud alternatives, “offering object storage for cold data at up to 83 percent lower cost than other service providers, and importantly, with zero recall fees.”
Panzura says the IBM Deep Archive-Symphony deal is “particularly crucial for artificial intelligence (AI) workloads,” because it can make archived data accessible to AI model training and inference workloads.
It claims the Symphony-Deep Archive integration enables users to streamline data archiving processes and “significantly reduce cloud and on-premises storage expenses.” The combined offering is available immediately.