VAST enters 2025 with consolidated EBox and GCP support

Analysis: Having erected a substantial AI-focused stack of software components in 2024, announced a partnership with Cisco, and delivered software upgrades, how is VAST Data positioned at the start of 2025?

It’s entering the year with the latest v5.2 Data Platform software release, which features its EBox functionality, first previewed in March 2024. VAST’s basic DASE (Disaggregated Shared Everything) software has a cluster architecture “that eliminates any communication or interdependencies between the machines that run the logic of the system.” It features compute (CNodes) liaising with data box storage enclosures (DNodes) across an internal NVMe fabric. 

The DNodes are just boxes of flash (JBOFs) housing NVMe SSDs. These highly available DNodes store the DASE Cluster system state. VAST says: CNodes “run all the software and DBoxes … hold all the storage media, and system state. This enables the cluster compute resources to be scaled independently from storage capacity across a commodity datacenter network.”

The EBox idea colocates a CNode and DNode in one server box, thus preventing the independent scaling of compute and storage. This is justified because, as VAST Technologist blogger Howard Marks says: “The EBox architecture lets us run the VAST Data Platform in environments that, until now, didn’t want, or couldn’t use, highly available DBoxes. These include hyperscalers that have thousands of a very specific server configuration and cloud providers that only offer virtual machine instances. It also allows us to work with companies like Supermicro and Cisco to deliver the VAST Data Platform to customers using servers from those vendors.”

A separate VAST blog states: “The EBox is designed to address the growing needs of hyperscalers and CSPs that require infrastructure capable of handling massive data volumes and complex workloads. By combining the best features of its predecessors into a more compact form factor, the EBox not only saves valuable rack space but also enhances the overall performance and resilience of the datacenter.”

EBox hardware features a single AMD Genoa 48-core processor, 384 GB of DRAM, 3 x storage-class memory (SCM) drives, and 9 x 30 TB NVMe SSDs (270 TB), plus two PCIe slots for front-end cards. There is a minimum cluster size of 11 nodes and metadata triplication “ensuring every read or write operation is replicated across three EBoxes within the cluster.” So the system withstands substantial hardware failure, keeping data secure and ensuring “sustained performance and rapid recovery, even during failures.”
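As a rough worked example of the capacity figures above, the sketch below multiplies out the quoted drive counts. It covers raw NVMe SSD capacity only: the SCM drives are excluded, and usable capacity after data protection and data reduction is not something the article states, so it is not modeled. The function name is illustrative.

```python
# Figures quoted in the article for one EBox and the minimum cluster size.
SSDS_PER_EBOX = 9         # capacity NVMe SSDs per EBox (SCM drives not counted)
SSD_CAPACITY_TB = 30      # TB per capacity SSD
MIN_CLUSTER_NODES = 11    # minimum EBox cluster size


def raw_capacity_tb(nodes: int) -> int:
    """Raw NVMe SSD capacity in TB, before any protection overhead or data reduction."""
    return nodes * SSDS_PER_EBOX * SSD_CAPACITY_TB


per_box = raw_capacity_tb(1)
min_cluster = raw_capacity_tb(MIN_CLUSTER_NODES)

print(f"Per EBox: {per_box} TB raw")
print(f"Minimum {MIN_CLUSTER_NODES}-node cluster: {min_cluster} TB raw "
      f"(~{min_cluster / 1000:.2f} PB)")
```

On these numbers, a minimum 11-node cluster starts at 2,970 TB, or just under 3 PB, of raw flash.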

Marks says: “Each x86 EBox runs a CNode container that serves user requests and manages data just like a dedicated CNode would, and DNode containers that connect the EBox’s SSDs to the cluster’s NVMe fabric. Just like in a VAST cluster with CBoxes and DBoxes, every CNode in the cluster mounts every SSD in the cluster.”
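The "every CNode mounts every SSD" property can be pictured with a toy model. This is only an illustrative sketch, not VAST code: the `EBox` class, the naming scheme, and the assumption of 12 drives per box (3 SCM plus 9 capacity SSDs, per the hardware description) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EBox:
    """Toy stand-in for one EBox: one CNode container plus its local drives."""
    name: str
    ssds: list[str] = field(default_factory=list)


def build_cluster(n_boxes: int, ssds_per_box: int) -> list[EBox]:
    """Create n_boxes EBoxes, each with ssds_per_box uniquely named drives."""
    return [
        EBox(f"ebox{i}", [f"ebox{i}-ssd{j}" for j in range(ssds_per_box)])
        for i in range(n_boxes)
    ]


def mount_map(cluster: list[EBox]) -> dict[str, list[str]]:
    """Shared-everything: each box's CNode sees every drive in the cluster."""
    all_ssds = [ssd for box in cluster for ssd in box.ssds]
    return {box.name: list(all_ssds) for box in cluster}


cluster = build_cluster(n_boxes=11, ssds_per_box=12)  # 3 SCM + 9 capacity SSDs (assumed)
mounts = mount_map(cluster)
total_paths = sum(len(ssds) for ssds in mounts.values())

print(f"Drives visible to each CNode: {len(mounts['ebox0'])}")
print(f"Total CNode-to-SSD paths: {total_paths}")
```

In this model each of the 11 CNodes sees all 132 drives, for 1,452 CNode-to-SSD paths across the fabric, which is why colocating the containers in one box does not change the logical shared-everything design.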

v5.2 also includes a global SMB namespace, Write Buffer Spillover, VAST native table support in async replication, S3 event publishing, and S3 Sync Replication, all of which “can streamline complex workloads for enterprise, AI, and high-performance computing environments.” It also has improved write performance, with Marks saying: “We’re taking advantage of the fact that there are many more capacity (QLC) SSDs than SCM SSDs by directing large bursts of writes, like AI models’ dumping checkpoints, to a section of QLC. Writing to the SCM and QLC in parallel approximately doubles write performance” over the previous v5.1 software release. “Since we’re only sending bursts of large writes to a small percentage of the QLC in a cluster, the flash wear impact is insignificant.”

He adds: “We’re also bringing the EBox architecture to the public cloud in 5.2, with fully functional VAST Clusters on the Google Cloud Platform,” which we expect to be announced later this year. 

S3 event publishing is configured on one or more buckets in the system and enables event-driven workflows that trigger functions. When data changes in such a bucket, the VAST cluster sends an entry to a specified topic on Apache Kafka, the distributed streaming platform. In v5.2, the topic must reside on an external Kafka cluster, and the functions must subscribe to that topic.
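To make the consuming side concrete, here is a minimal sketch of a function that handles one message taken from such a topic. The article does not describe the message format, so the payload layout below is an assumption modeled on the AWS S3 event notification structure (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`). The bucket and key names are hypothetical, and the Kafka subscription itself, which would use whatever client library the site prefers, is left out.

```python
import json
from typing import Callable

# A handler receives (bucket, key) for each matching event.
Handler = Callable[[str, str], None]


def dispatch(message_value: bytes, handlers: dict[str, Handler]) -> int:
    """Parse one Kafka message value and call handlers whose prefix matches the event name.

    Assumes an AWS-style S3 notification payload; the real format may differ.
    Returns the number of handler invocations.
    """
    payload = json.loads(message_value)
    handled = 0
    for record in payload.get("Records", []):
        event = record["eventName"]
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        for prefix, handler in handlers.items():
            if event.startswith(prefix):
                handler(bucket, key)
                handled += 1
    return handled


seen: list[tuple[str, str]] = []


def on_create(bucket: str, key: str) -> None:
    """Example function: record the new object so later processing can pick it up."""
    seen.append((bucket, key))


# Hypothetical message value, as it might arrive from the subscribed topic.
sample = json.dumps({
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "training-data"},
            "object": {"key": "checkpoints/step-100.pt"},
        },
    }]
}).encode()

handled = dispatch(sample, {"ObjectCreated": on_create})
print(handled, seen)
```

In a real deployment the loop that polls the external Kafka cluster would call `dispatch` once per message; keeping parsing and routing separate from the consumer makes the functions easy to test without a broker.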

More is coming this year, with Marks writing: “Over the next few quarterly releases, the VAST DataEngine will add a Kafka API-compatible event broker and the functionality to process data,” ending the external Kafka cluster limitation.

Futurum analyst Camberly Bates writes: “VAST’s EBox integration with the Google Cloud Platform is likely to drive further adoption in public cloud environments.” This hints at Azure and AWS support for the EBox concept coming later this year.

We would expect the EBox architecture to support much higher capacity SSDs later this year, with 62 TB drives now available from Micron, Phison, Samsung, and SK hynix, and 122 TB-class SSDs announced by Phison and Solidigm recently.

Bates also suggests, referring to the v5.2 software: “Rivals may introduce similar advancements in replication, namespace management, and performance to remain competitive.” 

Suppliers such as DDN, HPE, Hitachi Vantara, IBM, NetApp, Pure Storage, and WEKA are likely to face continued strong competition from VAST in 2025.