
VDURA: AI training and inference needs optimized file and object balance

Assertions that object storage, rather than file storage, is best for AI training and inference have sparked much interest across the storage world. VAST Data co-founder Jeff Denworth and Microsoft AI Infrastructure Architect Glenn Lockwood have both put forward this point of view. Hammerspace Marketing SVP Molly Presley disagrees, and so does VDURA CEO Ken Claffey.

VDURA provides a parallel file system for supercomputing and for institutional and enterprise HPC. Ken Claffey thinks the file-versus-object data access debate in the AI training and inference markets is misplaced. He believes both have their roles, and he discussed that with us in an interview.

Blocks & Files: What started you thinking about this issue?

Ken Claffey, VDURA

Ken Claffey: VAST Data’s Jeff Denworth recently made a bold claim that “no one needs a file system for AI training” and that S3-based object storage is the future. While it’s true that AI workloads are evolving, the assertion that file systems are obsolete is misleading at best. 

Blocks & Files: What do you think are the realities of AI storage needs and the role of parallel file systems in high-performance AI training at scale?

Ken Claffey: At VDURA, we don’t see AI storage as a binary choice between file and object. Our architecture is built on a high-performance object store at its core, with a fully parallel file system front end. This means users get the best of both worlds: the scalability and durability of object storage with the high-performance access required for AI training.

With our latest v11 release, we have further enhanced our platform by integrating a high-performance distributed key-value store. This new addition optimizes metadata operations and enables ultra-fast indexing, further enhancing AI and HPC workloads. Additionally, VDURA provides a high-performance S3 interface that allows seamless access to the same files and data across both file and object protocols. This ensures maximum flexibility and investment protection for enterprises scaling AI infrastructure.

Blocks & Files: Does object storage have a role here?

Ken Claffey: Glenn Lockwood from Microsoft Azure recently argued that large-scale AI language models are increasingly trained with object storage, rather than file storage. His perspective aligns with a growing shift toward object-based architectures, but it’s important to examine the nuances of AI training workflows before jumping to conclusions.

Lockwood outlines the four major phases of AI model training:

  1. Data ingestion: Collecting vast amounts of unstructured data, best suited for object storage due to its immutability and scalability.
  2. Data preparation: Transforming and cleaning the data, which is largely an in-memory and analytics-driven task.
  3. Model training: Running tokenized data through GPUs and checkpointing model weights, requiring fast storage access.
  4. Model deployment and inferencing: Distributing trained models and handling real-time queries, often optimized through key-value stores.

While Lockwood asserts that parallel file systems are not required for these workloads, his argument centers around cost-effectiveness rather than raw performance. Object storage is well suited for data ingestion and preparation due to its scale and cost efficiency. However, for model training and real-time inferencing, a hybrid approach – like VDURA’s – delivers the best of all worlds.

Blocks & Files: What is Nvidia’s perspective on this as you see it?

Ken Claffey: As Nvidia releases next-generation GPUs and DGX platforms, it continues to emphasize high-performance storage requirements. According to Nvidia’s own guidance for DGX, the leading AI platform, the recommended storage configuration is:

  • “High-performance, resilient, POSIX-style file system optimized for multi-threaded read and write operations across multiple nodes.”

Did we miss the S3 requirement? Nowhere does Nvidia state that AI training should rely solely on object storage. In fact, their own high-performance AI architectures are designed around file systems built for multi-threaded, high-throughput access across distributed nodes.

Blocks & Files: Is checkpointing encouraging object storage use?

Ken Claffey: Denworth referenced Nvidia’s “S3 Checkpointer” as evidence of a shift toward object storage for AI training. However, he conveniently left out a critical detail. The very next part of Nvidia’s own documentation states: “The async feature currently does not check if the previous async save is completed, so it is possible that an old checkpoint is removed even when the current save fails.”

What does this mean in practice? Using async checkpointing may result in a recovery point further back in time. This significantly reduces the reliability of checkpoints and increases the risk of lost training progress. The value of synchronous, consistent checkpointing cannot be overstated – something that parallel file systems have been optimized for over decades.
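
To make the trade-off concrete, here is a minimal, generic sketch of synchronous versus fire-and-forget asynchronous checkpoint pruning. It is not Nvidia’s S3 Checkpointer; the file naming scheme and the use of torch.save are illustrative assumptions only.

```python
import os
import threading

import torch


def save_checkpoint_sync(model, step, keep_last=1):
    """Synchronous checkpointing: prune the old checkpoint only after
    the new one has been written successfully."""
    torch.save(model.state_dict(), f"ckpt_{step}.pt")   # blocks until the write completes
    old = f"ckpt_{step - keep_last}.pt"
    if step >= keep_last and os.path.exists(old):
        os.remove(old)                                   # safe: the new checkpoint is on disk


def save_checkpoint_async(model, step, keep_last=1):
    """Async variant: the save runs in the background, but the old checkpoint
    is pruned immediately. If the background save later fails, the most recent
    good checkpoint may already be gone - the risk the Nvidia docs flag."""
    state = {k: v.detach().clone() for k, v in model.state_dict().items()}
    threading.Thread(target=torch.save, args=(state, f"ckpt_{step}.pt")).start()
    old = f"ckpt_{step - keep_last}.pt"
    if step >= keep_last and os.path.exists(old):
        os.remove(old)                                   # risky: the new save is unconfirmed
```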

Blocks & Files: How are you optimizing VDURA storage?

Ken Claffey: Rather than framing the debate as “file vs object,” VDURA has built a solution that integrates:

  • A high-performance object store to handle large-scale data ingestion and archival efficiently.
  • A fully parallel file system front-end to optimize AI model training with low-latency, high-bandwidth access.
  • A distributed key-value store to accelerate metadata lookups, vector indexing, and inferencing.
  • A high-performance S3 interface ensuring multi-protocol access across AI workflows.

This architecture addresses Lockwood’s concerns while also meeting the needs of enterprises that demand the highest levels of performance and scalability. While object storage plays a key role, dismissing parallel file systems entirely ignores the practical realities of AI training at scale.

Blocks & Files: How do you see the future for AI storage?

Ken Claffey: Denworth and Lockwood both make strong cases for object storage, but they downplay the performance-critical aspects of AI training. The future of AI storage is hybrid:

  • Parallel file systems provide the speed and efficiency necessary for training.
  • Object storage is useful for archival, sharing, and retrieval workloads.
  • Multi-protocol solutions bridge the gap, but that doesn’t mean file systems are obsolete – far from it.
  • High-performance distributed key-value stores enhance metadata management and indexing, further optimizing AI workflows.

VDURA’s approach acknowledges this reality: a high-performance object store at its core, a fully parallel file system front-end, an integrated key-value store, and a high-performance S3 interface – all working together to deliver unmatched efficiency for AI and HPC workloads. Unlike VAST’s claim that object storage alone is the future, we recognize that AI training at scale requires the best of all storage paradigms.

Enterprises deploying AI at scale need a storage infrastructure that actually meets performance requirements, not just theoretical flexibility. While object storage plays a role, parallel file systems remain the backbone of high-performance AI infrastructure, delivering the speed, consistency, and scale that today’s AI workloads demand.

The industry isn’t moving away from file systems – it’s evolving to embrace the best combination of technologies. The question isn’t “file or object,” but rather, “how do we best optimize?” At VDURA, we’re building the future of AI storage with this balance in mind.

Storage news ticker – February 7

Alluxio Enterprise AI v3.5 speeds AI model training and streamlines operations with a new Cache Only Write Mode to accelerate checkpoints, advanced cache management, and enhanced Python SDK integrations. TTL (time-to-live) Cache Eviction Policies allow admins to enforce TTL settings on cached data, ensuring less frequently accessed data is automatically evicted based on defined policies. Alluxio’s S3 API now supports HTTP persistent connections (HTTP keep-alive), TLS encryption, and multi-part upload (MPU). MPU splits files into multiple parts and uploads each part separately to improve throughput for large files.
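
For context on MPU, the sketch below shows the generic S3 multi-part upload flow using the boto3 client; it is not Alluxio-specific, and the endpoint, bucket, key, and part size are hypothetical.

```python
import boto3

# Generic S3 multi-part upload: split a large file into parts, upload each
# part separately, then stitch them together with a completion call.
s3 = boto3.client("s3", endpoint_url="http://s3-gateway.example:9000")  # hypothetical endpoint
bucket, key = "training-data", "corpus.bin"                              # hypothetical names
part_size = 64 * 1024 * 1024                                             # 64 MiB parts

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts, part_number = [], 1
with open("corpus.bin", "rb") as f:
    while chunk := f.read(part_size):
        resp = s3.upload_part(Bucket=bucket, Key=key, PartNumber=part_number,
                              UploadId=mpu["UploadId"], Body=chunk)
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                             MultipartUpload={"Parts": parts})
```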

Data compression startup Atombeam has raised $20 million in an A+ funding round, bringing total funding to $35 million. Atombeam’s codeword technology uses cryptography and compression to translate raw data into a set of codewords within an AI/ML-created codebook. Data is transmitted as codewords and the receiver decrypts and decompresses it. Atombeam has been issued more than 84 patents for its technology, with an additional 115 pending. There are similarities between Atombeam’s codewords and the now extinct Formulus Black/Symbolic IO‘s bit markers. LinkedIn shows Atombeam founder Asghar Riahi was a Master Technologist at HP from 1999 to 2012. SymbolicIO founder Brian Ignomirello was StorageWorks CTO at HP from Dec 2007 to Dec 2011. The two overlapped at HP over a four-year period. 

AIOps automation developer CloudFabrix is changing its name to Fabrix.ai and rolling out a framework for agentic AIOps. Agents need to be developed, deployed, orchestrated, and managed, and Fabrix.ai has developed a framework for this. The main components are:

  • Agent Orchestration and Lifecycle Management 
  • AI Guardrails 
  • Managing Data and Action Privileges for Agents 
  • Visibility and Observability of Agents 
  • Agent Quality Control and Assurance 
  • Reasoning LLMs 

There is an AI Fabric – “an AI agent-driven distributed orchestrator that enables customers to securely build, deploy, and manage Agents’ lifecycles, ensuring guardrails and quality controls. It integrates with disparate large and small models, curated datasets, and automation to drive Agentic Workflows.”

Wendy Stusrud

DDN has appointed Wendy Stusrud as VP for Worldwide Channel Sales. She comes from being VP Global Partner Sales at Pure Storage for more than three years.

In-memory computing clustered node supplier GridGain says its new GridGain for AI enables customers to use GridGain as a low-latency data store for real-time AI workloads. Real-time AI requires low-latency data access to ensure fast retrieval of inputs, such as features and embeddings for inference, enterprise- and user-specific context to augment LLM queries, prediction caching to reduce computation, and dynamic model loading. GridGain’s pitch for AI is a single, distributed platform delivering low-latency performance, scalability, and reduced integration overhead to streamline deployments and improve overall system efficiency for modern AI applications.

Hitachi Vantara has won speciality pharmaceutical firm FarmaMondo as a customer for its Virtual Storage Platform One (VSP One) Block product, upgrading from an existing Hitachi VSP G200 storage system.

Erik Frieberg

Streaming log data supplier Hydrolix has shipped a new Apache Spark connector for Databricks, designed to make it easy to replace current log data infrastructure with Hydrolix. Now Databricks users can “economically store full-fidelity event data, such as logs, in Hydrolix and rapidly extract information from both real-time and historical data.”

Open-source object storage supplier MinIO has appointed a new CMO, Erik Frieberg. Former CMO Jonathan Symons, appointed in 2019, has moved to a full-time advisor role. Frieberg has been SVP marketing at Pantheon for the past 15 months and has held CMO and SVP roles at VMware, Puppet, and MongoDB, according to his LinkedIn CV.

Tom Murch

Quobyte has appointed Tom Murch as the new regional director of sales to head its New York office. Murch has held director-level positions at Penguin Computing, Toshiba, CiaraTech, Smart Storage, and Sanmina. By expanding into New York, Quobyte is trying to strengthen its presence in the financial sector where organizations need high-performance, scalable storage to support demanding workloads. 

Eric Soffin

Multi-site file collaboration vendor Resilio has appointed Eric Soffin as VP Sales. He comes from being VP worldwide Sales at Nasuni for more than five years.

Object storage supplier Scality has appointed Emilio Roman as its new global CRO. He comes from being SVP global sales and channels at BitDefender, where he spent five years. Before that, he was SVP EMEA, APAC and global alliances at … Scality. Roman’s appointment follows “the highly impactful tenure of Peter Brennan, who built the company’s sales and channel organisation to achieve consecutive years of record growth.”

Emilio Roman.

Brennan has left for a senior sales leadership role with an unnamed network technology company, and will remain a member of Scality’s Advisory Board. Scality said: “We are thrilled to welcome Emilio back to the Scality family. Emilio’s extensive cybersecurity experience and outstanding leadership skills make him the ideal choice to lead our global sales efforts.”

SW RAID shipper Xinnor has won the University of Pisa as a customer. The university used E4 Computer Engineering to integrate Xinnor’s xiRAID with BeeGFS, which ships data to and from Nvidia DGX servers. The system includes dual storage nodes, each powered by xiRAID in a RAID 6 configuration, built to deliver a fail-safe environment for large-scale AI operations. Read speeds reach 29.2 GBps and write speeds 25.8 GBps in tests involving up to 128 processes per node.

 

Xinnor BeeGFS University of Pisa setup

Hammerspace challenges object storage norms for AI

Data orchestrator Hammerspace is challenging the conventional wisdom that object storage is the optimal solution for AI training and inference, arguing that universal, protocol-agnostic data access is far more crucial.

In a sense, that would be natural as Hammerspace has AI model training customers, such as Meta. Its technology is based on parallel NFS and it supports Nvidia’s GPUDirect fast file access protocol. However, Hammerspace supports S3 data access as well as file access. It has a partnership with object storage supplier Cloudian so that its HyperStore object storage repository can be used by Hammerspace’s Global Data Platform software. HyperStore supports Nvidia’s GPUDirect for object storage, designed to provide faster object access.

Molly Presley, Hammerspace SVP for marketing, discussed the file-vs-object AI topic with Blocks and Files, and moved on to making data suitable for AI processing – vectorization and how data should be organized for the AI LLM/agent era.

Molly Presley, Hammerspace

Blocks & Files: Why is Hammerspace focused on a hybrid data platform instead of just file or object storage?

Molly Presley: In Glenn Lockwood’s article, he calls out the pain points of parallel file systems due to their proprietary nature and needing specialized headcount. This is a huge reason why Hammerspace, with over 2,400 contributions to the Linux kernel, is so focused on a standards-based data platform. The choice for customers is not limited to just object storage if they need standards-based access without proprietary clients and silos.

It’s not about choosing between file systems and object storage interfaces; the conversation is also about scalability, efficiency at scale, understanding data sources, and seamlessly orchestrating data regardless of its format.

Focusing solely on storage interfaces and file vs object storage trivializes the complexity of today’s AI demands. Each workload has different performance requirements, is connected to different applications with different storage interface requirements, and may use data sources from a wide variety of locations. The optimal platform delivers performance through orchestration, scalability, and intelligent workload-specific optimizations.

Blocks & Files: Are AI infrastructure purchase decisions primarily being made around training workloads?

Molly Presley: No. As organizations assess their AI investments, they are thinking beyond just training. Data architecture investments for most organizations need to accommodate far more than training. They need to span inference, RAG, real-time analytics, and more. Each requires specific optimizations that go beyond generic, one-size-fits-all storage systems. A data platform is needed and must adapt to each phase of AI workloads, not force them into outdated storage paradigms.

A data platform must provide real-time data ingestion (aka data assimilation), intelligent metadata management, security, and resilience. Storage interfaces alone don’t solve the full challenge – data must be fluid, orchestrated, and dynamically placed for optimal performance across workloads.

Blocks & Files: We have been concerned about the spread of LLMs as that implies the LLMs need access, in principle, to an organization’s entire data estate. Will an organization’s entire data estate need to be vectorized? If not all, which parts? Mission-critical, near-time, archival?

Molly Presley: At Hammerspace, we don’t see vectorization as the immediate challenge or top-of-mind concern for buyers and architects – it’s global access and orchestration. Organizing data sets, ensuring clean data, and moving data to available compute are much more urgent in today’s training, RAG, and iteration workloads.

The need to vectorize an organization’s entire data estate is highly use-case and industry-specific. While the answer varies, full vectorization is typically unnecessary. Mission-critical and near-time data are the primary candidates, while archival data can be selectively sampled to identify relevance or patterns that justify further vectorization.

The key to effective implementation is enabling applications to access all data across storage types at a metadata control plane level – without requiring migrations or centralization. This ensures scalability and efficiency.

Blocks & Files: Will an organization’s chatbots/AI agents need, collectively and in principle, access to its entire data estate? How do they get it?

Molly Presley: Chatbots and AI agents typically don’t need access to an organization’s entire data estate – only a curated subset relevant to their function. Security and compliance concerns make unrestricted access impractical. Instead, leveraging global data access with intelligent orchestration ensures AI tools can access the right data without uncontrolled sprawl.

Even if an organization vectorized everything, the resulting data store would be near-real-time, not truly real-time. Performance is constrained by update latency – vector representations are only as current as their latest refresh. API integration and fast indexing can help, but real-time responsiveness depends on continuous updates. Hammerspace’s relevant angle remains metadata-driven, automated orchestration rather than full-scale vectorization.

Blocks & Files: Will the prime interface to data become LLMs for users in an organization that adopts LLM agents?

Molly Presley: Good question. LLMs are rapidly becoming an important interface for data in organizations adopting AI agents. Their ability to process natural language and provide contextual insights makes them a powerful tool for accessibility and decision-making. However, they won’t replace traditional BI and analytics tools – rather, they will integrate with them. Enterprises require structured reporting, governance, and auditability, which remain best served by established standards. The near-term (next few years at least) future lies in a hybrid approach: LLMs will enhance data interaction and discovery, while enterprise-grade analytics tools ensure precision, compliance, and operational control. 

Blocks & Files: In a vector data space, do the concepts of file storage and object storage lose their meaning?

Molly Presley: File and object storage don’t disappear; they evolve. In a vector data space, data is accessed by semantic relationships, not file paths or object keys. However, storage type still matters in terms of performance, cost, and scale.

Blocks & Files: Will we see a VQL, Vector Query Language, emerge like SQL?

Molly Presley: Yes, a Vector Query Language will emerge, though it may not take the exact form of SQL. Standardization is critical. Just as SQL became the universal language for structured data, vector search will need a standardized query language to make it more accessible and interoperable across tools and platforms.

APIs and embeddings aren’t enough. Right now, vector databases rely on APIs and embedding models for similarity search, but businesses will demand more intuitive, high-level query capabilities as adoption grows. Hybrid queries will be key. Future AI-driven analytics will need queries that blend structured (SQL) and unstructured (VQL) data, allowing users to seamlessly pull insights from both.
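
No VQL standard exists yet, but the hybrid pattern can already be approximated today. The Python sketch below assumes a Postgres database with the pgvector extension installed; the table, columns, and connection string are hypothetical and purely illustrative.

```python
import psycopg2  # assumes a Postgres instance with the pgvector extension

conn = psycopg2.connect("dbname=analytics")      # hypothetical connection string
query_embedding = [0.12, -0.08, 0.33]            # would normally come from an embedding model

with conn.cursor() as cur:
    # A structured predicate (classic SQL) combined with vector-similarity
    # ordering (pgvector's <-> distance operator) in a single statement.
    cur.execute(
        """
        SELECT doc_id, title
        FROM documents
        WHERE region = %s
        ORDER BY embedding <-> %s::vector
        LIMIT 10
        """,
        ("EMEA", "[" + ",".join(str(x) for x in query_embedding) + "]"),
    )
    results = cur.fetchall()
```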

Blocks & Files: Can a storage supplier provide a data space abstraction covering block, file, and object data?

Molly Presley: Some storage vendors can abstract storage types across file and object, and some offer block as well – but that’s not a true global data space. They create global namespaces within their own ecosystem but fail to unify data across vendors, clouds, and diverse formats (structured, unstructured, vectorized).

Standards are a critical part of this conversation as well. Organizations are typically unwilling to add software to their GPU servers or change their approved IT build environments. Building the data layer client interface into Linux as the most adopted OS is critical, and using interfaces like pNFS, NFS, and S3, which applications natively write to, is often mandated.

A global data space is about universal access, not just storage abstraction. It must integrate rich metadata, enable advanced analytics, and orchestrate data dynamically – without migrations, duplication, or vendor lock-in.

Bottom line: storage type is irrelevant. Without true global orchestration, data stays siloed, infrastructure-bound, and inefficient.

Blocks & Files: How do we organize an organization’s data estate and its storage in a world adopting LLM-based agents?

Molly Presley: We need a tiered approach to data, organized not in traditional HSM (Hierarchical Storage Management) terms of time, but with rich contextual relevance to automate orchestration of curated subsets of data non-disruptively from anywhere to anywhere when needed. 

Focus on the data, not the storage. Especially in LLM-based ecosystems, the storage type is opportunistic and workflow-driven. All storage types have their uses, from flash to tape to cloud. When the type of storage is abstracted with intelligent, non-disruptive orchestration, then the storage decisions can be made tactically based on cost, performance, location, preferred hardware vendor, etc. 

Unified access via standard protocols and APIs that can bridge all storage types and locations. This provides direct data access, regardless of where the data is today, or moves to tomorrow. In this way, data is curated in place so that applications can access the relevant subset of the data estate without requiring disruptive and costly migrations.

There is rich metadata in files and objects that typically goes unused in traditional storage environments. Custom metadata, semantic tagging, and other rich metadata can be used to drive more granularity in the curation of datasets. Combining this metadata in the global file system to trigger automated data orchestration minimizes unnecessary data movement, reduces underutilized storage costs, and improves accuracy and contextual insights for LLM-based use cases.
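
As a toy illustration of metadata-driven curation (the tag names and in-memory catalog below are hypothetical and stand in for a real global file system’s metadata layer):

```python
# Select files for orchestration by custom metadata tags rather than by path
# or storage tier. Tag names and values are hypothetical.
catalog = [
    {"path": "/projects/alpha/scan_001.tif", "tags": {"modality": "mri", "pii": False}},
    {"path": "/projects/alpha/notes.docx",   "tags": {"modality": "text", "pii": True}},
    {"path": "/projects/beta/scan_777.tif",  "tags": {"modality": "mri", "pii": False}},
]

def curate(catalog, **required_tags):
    """Return the subset of the estate whose tags match the requested values."""
    return [entry["path"] for entry in catalog
            if all(entry["tags"].get(k) == v for k, v in required_tags.items())]

# Orchestrate only non-PII MRI data for a training run, wherever it lives.
print(curate(catalog, modality="mri", pii=False))
```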

Data mobility and the ability to scale linearly are essential. LLM workflows inevitably result in data growth but, more importantly, may require cloud-based compute resources when local GPUs are unavailable. Modern organizations must put their data in motion without the complexity and limitations of traditional siloed and vendor-locked storage infrastructures.

WEKA restructures for the GenAI era

Scale-out file system provider WEKA is laying off around 50 employees, according to sources, as it restructures its go-to-market functions for the era of generative AI.

Liran Zvibel, WEKA

CEO Liran Zvibel blogged about the shake-up, saying privately owned WEKA enjoyed a “milestone” 2024 in which it raised $140 million in E-round funding, giving it a valuation of $1.6 billion, and surpassed $100 million in annual recurring revenue.

However, WEKA’s market is changing from traditional HPC and enterprise analytics due to generative AI. Zvibel says: “The generative AI and enterprise AI markets have continued to explode. The rate of innovation by AI industry titans – many of whom are WEKA’s partners and customers – has been astonishing, creating a once-in-a-generation opportunity for commercial enterprises, governments, and research organizations alike.”

He thinks WEKA has to reflect this so it can exploit the opportunity, saying “the token economy is here”. A token is a unit of data processed by AI models, typically after being converted into a vector representation.

The competition to supply data and provide data pipelines for GenAI model training and inference is intense. Examples include DDN’s March 17 AI Data Summit announcement, VAST Data with its data infrastructure engineered for AI, and all the mainstream storage suppliers piling into the market along with the rush of fast object storage suppliers, such as Cloudian, MinIO, and Scality. And then there are the data managers and orchestrators – Arcitecta, Hammerspace, Komprise, and others – building AI-focused data pipelines. Across the storage industry, generative AI is driving product development.

Zvibel says: “To accelerate innovation with our customers and partners, we have initiated a strategic restructuring of our go-to-market functions. While change is difficult, we are confident that this will position WEKA and our customers for long-term success as we navigate the dynamic and rapidly evolving AI market.”

Go-to-market (GTM) functions can include marketing, product management, sales, channel strategy and management, customer success, partnerships and alliances, and other aspects such as revenue operations.

B&F understands roughly 50 employees are affected and WEKA expects to grow headcount by approximately 120 in the coming year. Specifically, the company told B&F: “This is not a retraction in headcount; this is a strategic realignment of our people and investments to focus our resources on pursuing large-scale enterprise AI and GPU acceleration deployments worldwide. We are continuing to hire across the business – including our GTM functions.” 

WEKA positions
A sample of open WEKA slots.

WEKA currently has some 75 open slots globally in Customer Success, Sales, R&D, G&A, Product Management, and Marketing. Zvibel says: “In the coming year, we will deliver bold innovation to our customers, scale our investments in R&D, product innovation, and customer success, and expand our team across all business functions to accelerate our growth trajectory.”

Qumulo takes on mainstream data fabric suppliers

Qumulo has launched Cloud Data Fabric (CDF), a central file and object data core repository with coherent cache at the edge.

Scale-out file system supplier Qumulo has unified its on-premises and public cloud software under CDF, including a distributed file and object data core storage cluster that runs on systems from most hardware vendors or on public cloud infrastructure. This is accessed by coherent caching sites at the edge, which connect in parallel to the elements of the data core. Strict consistency between the core and edge sites comes from file system awareness, block-level replication, distributed locking, access control authentication, and logging.

Qumulo president and CEO Douglas Gourlay said: “In 2012, Qumulo set out to build the world’s most advanced file system…From the datacenter to the cloud, unbound by legacy limitations, we envisioned a future where everything is available, everywhere, instantaneously. 

Doug Gourlay, Qumulo

“Our clients can now do magical and amazing things – from bringing together artists and storytellers from around the globe to work on a feature film, to sharing cutting-edge medical research with top physicians, to harvesting data from fleets of autonomous vehicles, making the roads safer.” 

The core global file system operates on all data – in the public cloud and on-premises datacenters – as large elastic pools, enabling hierarchical storage management, tiering, and replication without affecting user access or data location. This “frees applications and data to scale and move independently, supporting seamless growth and evolution.”

Together with the strictly consistent edge sites, this allows “collaboration with diverse tools and applications.”

Aaron Passey, Qumulo

Qumulo says it has more than 1,000 production clients and exabytes of data under management. Co-founder and chief architect Aaron Passey said: “Yesterday’s file systems were not built for today’s workflows and rapid data growth. We’re freeing users from the constraints of proprietary hardware and operating systems allowing choice and flexibility, enabling innovation without limits – Qumulo users can now develop breakthroughs using data from any source, on any infrastructure.”

Qumulo’s Cloud Data Fabric is available worldwide through major IT infrastructure resellers and system vendors including HPE and Supermicro, distributors, and most major public clouds, with both prepay and pay-as-you-go options. Pricing is based on the actual data stored and shared across the data core.

DataCore buys Arcastream parallel file system from Kalray

Software-defined storage provider DataCore is buying the Arcastream parallel file business from French supplier Kalray.

Startup Kalray was founded in 2008 as a fabless semiconductor business spun off from CEA, the French Atomic Energy Commission. It developed MPPA (Massively Parallel Processing Array) chip and card technology, and a data processing unit (DPU) accelerator. It bought all the shares of UK-based Arcapix in January 2022 for around €1 million ($1 million) and gained its software-defined Arcastream storage intended for data-intensive workloads.

Arcastream is a unified system that combines IBM Storage Scale-based software, flash, disk, tape, and cloud storage. Now Kalray is offloading all of its Arcastream assets, including the Ngenea business, to DataCore. Customers include Framestore, Red Bee Media, and Imperial College London, and Arcastream has an ongoing partnership with Dell.

Dave Zabrowski, DataCore

Dave Zabrowski, CEO at DataCore, said in a statement: “Integrating robust file storage capabilities into our portfolio, this acquisition reinforces our role as a universal storage leader – offering block, file, and object storage to support workflows seamlessly across core, edge, and cloud environments.”

DataCore inherits Arcastream’s agreement with Dell, in which its software is integrated into the Dell EMC Ready Solution for HPC PixStor Storage.

Arcastream software is generally sold to customers in the entertainment, media, academia, and HPC markets.

In July last year, DataCore raised $60 million to “fuel the integration of AI technologies”. In theory, there is an AI opportunity for Arcastream to feed data to LLMs. If such workloads run at edge sites, then DataCore’s Perifery business could also benefit.

Kalray is also looking to sell its Ngenea Data Acceleration Platform business, which includes its DPU processors and acceleration cards and associated software. In June last year, merger talks between Kalray and Israeli accelerator card and software startup Pliops were called off. No buyer has yet been announced for the DPU business.

The Arcastream acquisition price was disclosed in a Kalray release, which said it is up to $20 million, comprising $12.5 million in cash, a $2.5 million service contract, and a potential $5 million earn-out.

PEAK:AIO drives UK effort to bring down cost of medicine

PEAK:AIO, a software specialist in data infrastructure for AI and GPU applications, is now involved in the University of Strathclyde’s MediForge Hub, an initiative aiming to “redefine” pharmaceutical manufacturing by reducing raw material usage and waste by 60 percent.

Supported by an £11 million award from the UKRI Engineering and Physical Sciences Research Council (EPSRC) – a UK government body – and led by CMAC, MediForge is a seven-year initiative.

Bringing a single drug to market costs an estimated $200 million, and MediForge aims to address this challenge by creating scalable and sustainable manufacturing systems.

At the foundation of the project is PEAK:AIO’s AI Data Server, delivering real-time analytics, “hyper-fast” data insights, and “integration” with digital twin and cyber-physical systems.

“The team are committed to developing innovative solutions that can accelerate patient access to cost-effective new treatments, that can allow more agile responses to medicine shortages or pandemics,” said Professor Alastair Florence, MediForge project lead and director of CMAC. “Through addressing sustainability at each stage, we can ensure that making medicines does not cost the Earth.”

Mark Klarzynski, PEAK:AIO

The initiative integrates cyber-physical infrastructure that combines AI and robotics, a novel pharmaceutical data fabric designed to centralize information flow, and digital twin technology to optimize manufacturing processes.

PEAK:AIO keeps this system fed with data via GPUDirect, which the supplier says is “crucial” for advancing these technologies. A second PEAK:AIO installation phase is scheduled for March.

Mark Klarzynski, chief strategic officer and co-founder of PEAK:AIO, said: “The evolved world of GPU-driven innovation demands equally evolved infrastructure.”

VDURA talks up energy-efficient HPC systems for utilities

VDURA, a specialist in HPC data infrastructure and management solutions, is to highlight its next-generation data platform at the 18th annual Energy High Performance Computing (HPC) Conference later this month.

The event is hosted by Rice University’s Ken Kennedy Institute in Houston, Texas. VDURA is pledging “acceleration” in energy innovation with faster data processing, data durability, and ease of use for on-premises, cloud, and hybrid environments.

Energy companies face the challenge of managing and analyzing massive datasets to identify energy reserves, optimize renewable solutions, and drive sustainability. The VDURA Data Platform pitches a hybrid architecture that consolidates and accelerates energy exploration and production workflows. We’re told its features “ensure” faster results, reduced operational complexity, and data durability.

VDURA’s hybrid model combines the cost-efficiency of HDDs with the higher performance of SSDs, letting energy companies optimize both operational expenses and workload speed. And VDURA’s advanced algorithms automatically place data on the most efficient and cost-effective storage media, beefing up performance for HPC and AI/ML workloads.
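
As a simplified illustration of access-based placement (a toy policy, not VDURA’s actual algorithms; the names and thresholds are invented):

```python
from dataclasses import dataclass

@dataclass
class DataObject:
    name: str
    accesses_last_7d: int
    size_gb: float

def choose_tier(obj: DataObject, hot_threshold: int = 50) -> str:
    """Place frequently accessed objects on flash, colder bulk data on disk."""
    return "ssd" if obj.accesses_last_7d >= hot_threshold else "hdd"

objects = [
    DataObject("seismic_run_042.h5", accesses_last_7d=180, size_gb=750.0),
    DataObject("archive_2019_survey.tar", accesses_last_7d=2, size_gb=4200.0),
]
for obj in objects:
    print(obj.name, "->", choose_tier(obj))   # seismic run -> ssd, archive -> hdd
```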

According to VDURA, the Data Platform also offers robust protection for critical datasets to reduce concerns about data loss or unplanned downtime, vowing data durability of up to 11 nines. It is designed to handle high-concurrency workloads and massive data volumes, and eliminates data silos, consolidating information into a unified global namespace to improve accessibility, collaboration, and decision-making across teams.

“With intuitive management tools, a single administrator can oversee data environments without requiring specialized HPC expertise, reducing operational overhead,” claimed the company.

In a prepared remark, Zhaobo Meng, founder and CEO of In-Depth GEO, said: “The work we do requires highly complex parallel workflows, and VDURA has exceeded our expectations in delivering the high-performance storage and networking we need to manage the incredible volume of data we process on a daily basis. The VDURA launch and this rare new option to mix and match different storage types in one platform comes at the perfect time for us.”

VDURA Data Platform diagram.

The Data Platform combines proprietary tech like Velocity Layered Operations (VeLO), Virtualized Protected Object Devices (VPODs), and adaptive capacity. Such features let energy organizations more efficiently move, manage, and analyze data across geographically dispersed sites with confidence, we are told.

A VPOD is a discrete, virtualized, and protected storage unit in hybrid storage nodes, which combine flash (SSDs) and hard disk drives (HDDs) for cost-effective, high-performance storage. VPODs operate within a unified global namespace to scale infinitely as more nodes are added to the system. They use a multi-layered approach to data protection, incorporating erasure coding both within each VPOD and across multiple VPODs in a cluster. The company said this provides up to 11 nines of durability (99.999999999 percent).
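
As a back-of-the-envelope illustration of how two erasure-coding layers can compound durability (the probabilities below are hypothetical, not VDURA’s engineering figures, and assume independent failure domains):

```python
# Hypothetical annual data-loss probabilities for the two erasure-coding layers.
p_loss_within_vpod = 1e-6    # the stripe inside a single VPOD cannot be rebuilt
p_loss_across_vpods = 1e-5   # the cross-VPOD layer also cannot recover the data

# With independent layers, both must fail for data to be lost.
p_combined_loss = p_loss_within_vpod * p_loss_across_vpods   # 1e-11
durability = 1 - p_combined_loss

print(f"{durability:.11f}")  # 0.99999999999 -> roughly "eleven nines"
```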

VPODs work in conjunction with VeLO, a key-value store used in the Director layer of the VDURA platform for handling small files and metadata operations. It is intended to deliver efficient data management and high IOPS for AI and HPC workloads.

Rex Tanakit.

Rex Tanakit, vice president of technical services at VDURA, said: “Our participation in the Energy HPC Conference showcases our dedication to enabling the industry to harness data effectively and sustainably.”

In other VDURA news, the firm has been granted a US patent for “collaborative multi-level erasure coding for maximizing durability and performance of storage systems.”

Company developers Scott Milk, Christopher Girard, Shafeeq Sinnamohideen, and Michael Barrell worked on the patented technology.

The Energy High Performance Computing (HPC) Conference takes place from 25 to 27 February 2025.

Panzura’s upgraded software has halfway-house high availability and faster remote office data access

Just two months after Panzura released v8.4 of its CloudFS core technology, it has pumped out v8.5, claiming it delivers instant recovery and “unparalleled resilience.”

CloudFS is the underlying global filesystem technology for Panzura Data Services, which offers cloud file services, analytics, and governance. It has a global namespace and metadata store, plus distributed file locking and Cloud Mirroring, a high-availability (HA) capability providing redundancy and disaster recovery for file data. V8.5 features so-called Adapt technology, providing Instant Node replacement or migration for business continuity, plus localized Regional Stores for faster local access to objects and better performance.

Sundar Kanthadai

Panzura CTO Sundar Kanthadai stated: “The Instant Node feature in CloudFS 8.5 Adapt delivers persistent business operations and continuity. It provides the ability to deploy, migrate, restore, or rebuild any node with unprecedented speed and ease.”

Instant Node attempts to strike a price and performance balance between traditional high-availability setups and “the cost-effectiveness of using readily available local hardware resources, like servers, workstations or virtual machines (VMs), to achieve a high level of resilience.” Panzura says dedicated HA infrastructure needs redundant servers and storage arrays, which cost money to buy and maintain.

The Instant Node feature supports node-to-node restoration for automated hardware upgrades, planned migrations, infrastructure changes, and other IT initiatives. It also has a REST application programming interface (API) for integration with existing and external IT infrastructure and automation tools. 

The Regional Store feature provides accelerated access to Panzura-stored data for outlying users and remote edge sites – “teams located a significant distance from the primary data store.” It provides faster access to data “not yet cached by CloudFS.” Admins can configure up to four object storage buckets across their preferred provider’s cloud regions to cut access latency and avoid building high-speed network links to distant data stores.

The Regional Store buckets are synchronized by the cloud provider, “ensuring data consistency across regions.” AWS has 36 cloud regions worldwide, 8 in the USA, and Azure has over 60 around the globe.

Panzura says this CloudFS v8.5 Adapt capability helps optimize data placement for “high-performance computing, artificial intelligence (AI) pipeline support including large language model (LLM) training, real-time collaboration and co-authoring, and disaster recovery preparedness. Storage costs can be optimized by selecting the most appropriate storage class based on access and usage patterns.”

V8.5 also has extended Role-Based Access Control (RBAC) support with Single Sign-On (SSO) capabilities integrated with Okta as an identity provider. It adds storage tier support for Azure, enabling Azure storage class tiering; such tiering is already available for Panzura on AWS.

Veritas NetBackup data to be stored in Cohesity filesystem and codebase unification coming

Interview: Now that Veritas and Cohesity are one company under the Cohesity brand, there is a combined base of more than 12,000 customers, with 8,000 from Veritas and 4,500 from Cohesity. They have been assured that no customer will be left behind, but what happens now in a product sense?

Sanjay Poonen

We had a conversation with Cohesity CEO Sanjay Poonen to look at that issue, noting that there are Cohesity applications, such as FortKnox and DataHawk, working on top of a Cohesity backup store. Will these applications be extended to work with Veritas backup stores?

Poonen said yes, they would, and that cyber recovery orchestration tools and analytics and AI tools, such as Gaia, will be extended to work with either NetBackup or Cohesity DataProtect data.

We asked if Cohesity would be developing a single management plane to cover both the Cohesity and Veritas environments. Poonen answered: “Yes. In fact, I would take it a step further because with every month and quarter that we’re together now the engineers are spending more time … working on a roadmap.” 

“Our management control plane … Helios, the best in the industry, with ease of use and very API rich, will become the management control plane for both actors.” We could think of them as a Mercedes (Veritas) and a Tesla (Cohesity).

NetBackup from Veritas and Cohesity’s DataProtect will both have a common management plane: “That’s a huge step. Nobody will be able to provide that management consistency to both NetBackup and DataProtect.”

There’s more: “At the bottom of it is the file system and the place where the data is stored. That will become our platform because we left the Veritas file system with Arctera.”

The Veritas customers won’t get left behind: “Certainly we will support our customers. We have the ability to do that with Veritas files for the customers that are using that. But the go-forward platform for where the data is stored is SpanFS from Cohesity.” Poonen said he thinks the SpanFS data platform is the best in the industry.

Veritas NetBackup has a source connector count advantage, with Poonen saying: “One of the advantages NetBackup had was they had hundreds of connectors, almost 500, more than anybody else. We can now bring that common connector framework to DataProtect.”

“Both companies will have connector parity and that gives us tremendous advantage, to take all of those connectors and bring it to both products very quickly. It’s not a complicated project.”

There’s more to come again, with Poonen saying: “The actual backup applications themselves … you can start to containerize that where they actually start to dissolve together. You can then start using components of each other and then you have almost one codebase.”

“This then pretty much gives customers a seamless path to the next generation. … It’s a very exciting time. We think we can unify the codebase of the products very quickly. We’re giving customers the path to the future.”

Cohesity is saying to both its NetBackup and DataProtect customer bases that it can lower their costs. Customers need not worry that, with Veritas and Cohesity joining together, their costs will rise. Poonen said: “Our view is that the total cost of ownership can go down and that’s a very gratifying thing.”

On the generative AI front, we asked if Cohesity is looking at agentic AI. Poonen said: “Yes, absolutely. In fact, we’ve been playing with some of the latest work of Operator AI. We’ll have a little demo that you’ll see in the next week or two. It’s not very difficult to do and our vision is that you could think of Gaia not just as an app but also as an agent framework.”

Interestingly: “The way the name Gaia started was that we were looking for a code name for the project and we called it generative AI app, generative AI agent.”

“It can be an app or an agent and, in fact, you’ll see with the prototype with Operator AI, the agent framework of OpenAI, [it] can drive our management control plane and we’ve been able to get that working. We’re using it also internally for areas like support where we can search documents.” Perhaps in the future customers will be able to use it too.

Pondering the arrival of DeepSeek, Poonen wonders whether the result is a cylinder or hourglass, telling us about his model of this: “We see the AI stack at three layers, roughly three layers. There’s an AI going bottom up, top down. There’s an AI hardware and software infrastructure layer. That’s where Nvidia, both the hardware of Nvidia and the software of Nvidia system, other middleware components, exist.”

“Then the middle is the large language model foundation model … and the top of the stack is AI applications.” He has apps like Salesforce in mind there.

“What the world is trying to figure out is that, in economic value, is this a cylinder or an hourglass? The hourglass means the middle of it is valuable but there’s not a lot of economic value created there. The economic value is created at the bottom for infrastructure companies and hardware players like Nvidia. …and account value created [at the top] by software players like Salesforce.”

He thinks “in that model all of these techniques really drive the company that has the most data.”

So, getting back to the Cohesity-Veritas combo: “We protect hundreds of exabytes of data, which is significantly more than all of our main competitors combined. We think that ultimately creates value. I call that a very similar market to Salesforce.”

“All of these technologies, agent AI and so on, need to find ultimately something that creates economic value. And for the app companies it will be the company that has the data.”

“So companies like Databricks, Cohesity in our world, and then in the app world, companies like SAP and Salesforce, and then at the bottom layer, the hardware companies, I do think all of them, starting with Nvidia, will have value. So that’s how we view the AI framework.”

There are 5,500 employees in the new Cohesity. The three big parts of the two original businesses are being combined: sales, customer support, and R&D. Poonen said: “We’ve got to make that 5,500 employee base really engaged. They’ve got to be excited. I’ve always believed you serve your employees first. They take care of customers, but it always starts with that employee.”

A second priority is to build product innovation: “five or ten x better than any alternative. … And then the third is customer obsession.”

Summary

There is very little overlap between the Veritas and Cohesity customer bases. What we are seeing here is that Veritas customers’ NetBackup-protected data will be stored in the Cohesity filesystem in the future, with a common Veritas-Cohesity management plane being developed around the Helios technology. Software containerization will be used to unify the NetBackup and DataProtect codebases so that a single backup product emerges.

Flash-centric Regatta OLxP database working on faster access for AI agents

Israeli startup Regatta is building a scale-out, transactional (OLTP), analytic (OLAP) relational database (OLxP) with extensibility to semi-structured and unstructured data. The company says it is a drop-in replacement for Postgres and has been designed from day one to support SSD storage. Its architecture is discussed in a blog by co-founder and CTO Erez Webman, formerly CTO of ScaleIO, acquired by EMC in 2013.

This OLTP+OLAP combination has been pursued by other suppliers, such as SingleStore, which has added indexed vector search to speed AI queries. SAP HANA, Oracle Database with its in-memory option, Microsoft SQL Server with in-memory OLTP, Amazon Aurora with Redshift Spectrum, and PostgreSQL with the Citus or TimescaleDB extensions all provide combined transactional and analytical database functions as well. Regatta is entering a fairly mature market and reckons it has an edge because of its architecture.

Boaz Palgi (left) and Erez Webman, Regatta

Webman says: “Regatta is mainly a scale-out shared-nothing clustered architecture where heterogeneous nodes (servers, VMs, containers, etc.) cooperate and can perform lengthy (as well as short) SQL statements in a parallel/distributed manner, with many-to-many data propagation among the cluster nodes (i.e. intermediate data doesn’t need to pass via a central point).” Each storage drive is accessible “only by a single node in the cluster.”

A Regatta cluster, designed to support thousands of nodes, supports differently sized and configured nodes, which can provide compute+storage, compute-only, or storage-only functions. The database can be hosted in on-premises physical or virtual servers and in the public cloud, and can be consumed as a service.

Regatta differs from scale-out-by-sharding databases, such as MongoDB, by supporting distributed JOINs across node boundaries, and ensures strong ACID guarantees even when rows reside on different nodes. (Read a Regatta blog about scale-out sharding limitations here.)

The company has developed its own Concurrency Control Protocol (CCP) providing a fully serializable and externally consistent isolation level. Where a database supports concurrent user or application access, the different users’ operations need to be kept separate and not interfere with each other. This is the intent of concurrency control, which can have either a pessimistic or optimistic design. Pessimism assumes data access conflicts between transactions are likely to occur, and uses locks to ensure that only one transaction can access or modify data at any one time.

Optimism assumes that transaction data access conflicts are rare and allows transactions to proceed without restriction until it’s time to commit changes. Before committing, each transaction undergoes a validation phase where it checks if its read data has been modified by another transaction since it was initially read (using timestamps or versions for data).
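
As a generic illustration of the optimistic approach, here is a minimal version-based validation sketch in Python. It is not Regatta’s CCP, which notably avoids aborting on detected conflicts; this toy version simply reports a failed validation so the caller can retry.

```python
import threading

class Record:
    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()          # held only for the short commit step

def occ_transaction(record, update_fn):
    """Optimistic concurrency control: read without locking, do the work,
    then validate at commit time that the record's version is unchanged."""
    read_version = record.version
    new_value = update_fn(record.value)       # work proceeds with no locks held
    with record.lock:                         # brief critical section for commit
        if record.version != read_version:    # another transaction committed first
            return False                      # validation failed: caller retries
        record.value = new_value
        record.version += 1
        return True

# Example: increment a counter; on conflict the caller would simply retry.
acct = Record(100)
assert occ_transaction(acct, lambda v: v + 10)
```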

Webman says Regatta’s CCP “is mainly optimistic, although unlike most optimistic protocols, it doesn’t cause transactions to abort on detected conflicts (well, except, of course, for deadlock cases in which both optimistic and pessimistic protocols tend to abort a transaction per deadlock-cycle).” It is snapshot-free and does not require clock synchronization. 

Short or lengthy consistent/serializable read-only queries can be performed on real-time, up-to-the-second transactional data without blocking writing transactions from progressing.

Regatta implements its own row store data layouts directly on top of raw block storage to optimize I/O performance, and does not need any underlying file system. This is a log-structured data layout that operates very differently from an LSM tree design. It is built for extensibility to support other types of row stores, as well as column store, blob store, etc. Webman says its “first row store data layout type is specifically optimized for flash media. It allows us to optimally support both traditional small-rows-with-more-or-less-fixed-size, and variable-sized-large-rows-with-a-large-dynamic-range-of-sizes (within the same table).”

We’re told: “Regatta’s B+Trees (that are used, for example, for indexes) massively leverage the high read-concurrency of flash media, allowing meaningfully faster and more efficient B+Tree accesses than algorithms that would assume more ‘generic’ underlying storage (i.e. magnetic HDD).”

There are more details in Webman’s blog about Regatta’s distributed SQL database.

CEO and co-founder Boaz Palgi tells us that Regatta’s system is looking to ensure that you can:

  1. Execute complex and real-time queries on completely up-to-the-second transactional data – think agents in a telco that get a question regarding roaming from a subscriber that just added roaming to their plan.
  2. Execute transactions such that the same agent understands that the subscriber should have added roaming for Italy as well, not just for France, and needs to correct this.
  3. Linearly increase both transactional and analytical performance without changing even a single line of code in your business logic by just adding more nodes. This will be important to keep running your business while adding many agents to the mix.

He says: “Traditional databases cannot deliver the performance to handle that type of agent-generated load, and most of them cannot combine OLAP with OLTP in the same database. Data warehouses cannot support the agents’ transactional workloads. ETL is a problem when you want agents to do more than just deal with stale archive-based data.”

For generative AI, “we are not doing anything specific today, although we will add some capabilities.” 

The metadata advantage: Unlocking insights and efficiency in enterprise IT

COMMISSIONED: When it comes to enterprise IT infrastructure, metadata is the secret asset most people don’t think about.

Think of it as an invisible hand guiding the movement, organization, and accessibility of your data universe. It’s metadata – the data that describes your data – that enables enterprises to transform chaos into opportunity.

Tragically, metadata is often underutilized, but that paradigm is changing. Businesses are beginning to uncover its extraordinary potential for optimizing workflows, enhancing decision-making, and gaining a competitive edge.

Metadata is essentially a summary of your data. It’s the label on the jar that tells you what’s inside without having to open it. Scaling this concept to enterprise IT, metadata becomes the roadmap that helps IT teams identify, locate, and efficiently utilize vast amounts of information.

To put its value into perspective, research published in IDC’s “The State of Enterprise Data” report in 2022 shows that businesses waste 20-30 percent of their time searching for information. That’s at least a full day each week spent navigating disjointed file systems and digital silos. Even worse, 60 percent of organizations admit to not knowing where their critical data resides, according to Gartner’s 2023 “Metadata Management in the Digital Age” report, something which can impact compliance, productivity, and overall decision-making.

Metadata is more than a useful tool; it’s a strategic enabler for transformation across all dimensions of enterprise IT. Here’s how it drives innovation for modern organizations:

– Better visibility: With metadata, your team knows exactly where every piece of data resides, reducing the time spent on manual searches.

– Deeper context: Metadata reveals how data points are connected, offering insights that can drive smarter decisions and uncover new opportunities.

– Faster execution: By indexing data in meaningful ways, metadata accelerates workflows and analytics, helping businesses move from idea to action more quickly.

And the beauty of metadata? It’s self-enhancing. Over time, it can grow richer, adding contextual layers that evolve with your infrastructure.

Why metadata is essential right now

Several technology trends make metadata more critical than ever. AI and analytics thrive on well-curated, contextualized data, and metadata is foundational for delivering clean datasets. Additionally, hybrid and multi-cloud ecosystems have introduced new challenges around data discovery and access. Metadata becomes the glue that ensures these diverse systems work cohesively.

What’s more, figures from Statista’s Global Data Growth Forecast in 2024 suggest the rate of data growth is staggering, with global data volumes increasing by 23 percent annually. Managing this complexity hinges on having an intelligent metadata strategy in place.

Recognizing these challenges, Dell Technologies has introduced PowerScale MetadataIQ, a cutting-edge tool designed for today’s complex data landscapes. MetadataIQ is a global metadata management solution incorporating the Elasticsearch database and the Kibana visualization dashboard. It enables indexing and querying of combined metadata from multiple geographically distributed clusters.

This isn’t just another tool for managing your files; MetadataIQ redefines how enterprises think about and use metadata:

– Fast filesystem search: Enable search of data on a PowerScale cluster without having to do a real-time tree walk.

– Geographically distributed search: Access data across PowerScale deployments by searching a combined metadata store (a simple query sketch follows this list).

– System metadata extraction: Allow customers to export metadata to external tools (e.g., message queues, data catalogs).

– Classify and protect sensitive data: Detect and tag sensitive information, applying appropriate governance.

– Lifecycle management: Use system- and user-defined metadata to identify data assets for tiering and archival.

– Create datasets to train AI/ML/GenAI models: Search for relevant unstructured data by querying system- and user-defined metadata.
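
As an illustration of what querying a combined metadata index can look like, here is a minimal sketch using the Elasticsearch Python client. The index name, fields, and filters are hypothetical and do not reflect the actual MetadataIQ schema.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # hypothetical endpoint

# Find recently modified DICOM files on one cluster by querying metadata only,
# with no tree walk of the file system itself.
resp = es.search(
    index="powerscale-metadata",              # hypothetical index name
    query={
        "bool": {
            "must": [
                {"term": {"file_extension": "dcm"}},
                {"range": {"mtime": {"gte": "now-90d"}}},
            ],
            "filter": [{"term": {"cluster": "emea-site-1"}}],
        }
    },
    size=20,
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["path"])
```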

Simply put, MetadataIQ aligns with the evolving needs of enterprises, helping them find, manage, and utilize their data with unprecedented speed and precision.

Why it matters for enterprises

Imagine this scenario – your pharmaceutical company needs to analyze clinical trial data spread across multiple continents. Without metadata management, this process could take weeks. With MetadataIQ, you can visualize and analyze relevant metadata in minutes. That’s game-changing efficiency, particularly in industries reliant on agility and innovation.

The time savings and precision offered by MetadataIQ enable organizations to focus on higher-value tasks, such as developing data-driven applications and executing mission-critical projects.

Think of metadata as your ultra-efficient librarian – it knows where every piece of data “lives,” tags it for relevance, and even makes suggestions for what you might need. Unlike humans, it doesn’t need coffee breaks or days off.

Remember the agony of hunting for a lost file from last quarter? Metadata just liberated 19 of those wasted minutes.

Metadata isn’t just an IT utility anymore; it’s becoming a business-critical resource. With tools like MetadataIQ, enterprises can harness metadata to accelerate their AI initiatives, optimize hybrid cloud strategies, and scale in harmony with expanding data volumes.

With innovation through metadata, businesses aren’t just storing data – they’re unleashing the full potential of every byte. Tools like PowerScale MetadataIQ act as both a compass and a map, ensuring organizations stay ahead in an increasingly data-driven world.

To learn how Dell PowerScale and MetadataIQ can redefine your enterprise IT setup, visit us at www.delltechnologies.com/powerscale.

Brought to you by Dell Technologies.