VMware adds single NVMe flash tier to vSAN

August 31, 2022

The arrival of NVMe SSDs with their much faster IO has given VMware the opportunity to revisit vSAN’s storage architecture and improve its performance.

This is a huge advance for vSAN and hyperconverged appliances using it, such as Dell’s VxRail.

The improvements come with vSAN 8 and its Express Storage Architecture (ESA). ESA is optional: users can carry on using the existing OSA (Original Storage Architecture) or choose to deploy ESA, which requires validated hardware and a new vSAN license. We took a look at ESA features to see what’s going on and what the benefits are.

Starting point

The server hardware and virtual machine environment have changed over the last 10 years, with more CPU cores, faster storage drives (NVMe SSDs), and speedier networking.

This provides both the need for vSAN software to change and the opportunity for it to do so.

The current vSAN OSA involves hard disk drives and SSDs organized into disk-based capacity tiers, disk groups, and an SSD caching tier or buffer. ESA has a single tier or storage pool optimized for NVMe TLC (3 bits/cell) SSDs. Effectively, it’s all cache compared to OSA.

ESA introduces:

A log-structured file system (vSAN LFS)
A write-optimized log-structured object manager
A new object format

LFS

The LFS and object manager components fit into separate places in the vSAN stack as the below diagram illustrates:

The object manager consists of a parallel block engine, a key:value store and an IO layer:

LFS can take in more small and large IOs than OSA, and coalesces them to reduce overall IO numbers in the stack, as a VMware diagram illustrates:

Performance and capacity legs

ESA introduces performance leg and capacity leg concepts, which are stages in a data ingest process. The performance leg uses a RAID 1 (mirroring) scheme: incoming data and metadata are written quickly to a mirrored durable log, and a fast acknowledgement is returned. The capacity leg uses a key:value store for the data payload, which is written in full stripes of the coalesced data, reducing write amplification.

This capacity leg will typically use a RAID 5 or 6 protection scheme, as we’ll discuss below, but could use RAID 1.
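The two-stage ingest described above can be sketched as a toy model. This is not vSAN code: the class name, the tiny chunk size, the 4+1 layout, and the XOR parity are all illustrative assumptions, and the flush runs synchronously here purely to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIngest:
    """Toy two-stage ingest model. All names and sizes are hypothetical."""
    data_chunks_per_stripe: int = 4   # assumed 4+1 layout
    chunk_size: int = 4               # bytes per chunk, tiny for illustration
    # Performance leg: two mirrored copies of a durable log (RAID 1 style).
    log_mirrors: list = field(default_factory=lambda: [[], []])
    # Data waiting to be coalesced into a full stripe.
    pending: bytearray = field(default_factory=bytearray)
    # Capacity leg: full stripes, each stored as (data_chunks, parity).
    stripes: list = field(default_factory=list)

    def write(self, payload: bytes) -> str:
        # Append the write to every log mirror, then acknowledge.
        for mirror in self.log_mirrors:
            mirror.append(payload)
        self.pending.extend(payload)
        # A real system would destage asynchronously; done inline here.
        self._flush_full_stripes()
        return "ack"

    def _flush_full_stripes(self) -> None:
        stripe_bytes = self.data_chunks_per_stripe * self.chunk_size
        while len(self.pending) >= stripe_bytes:
            data = bytes(self.pending[:stripe_bytes])
            del self.pending[:stripe_bytes]
            chunks = [data[i:i + self.chunk_size]
                      for i in range(0, stripe_bytes, self.chunk_size)]
            # Single XOR parity chunk computed once over the whole stripe.
            parity = bytes(self.chunk_size)
            for chunk in chunks:
                parity = bytes(a ^ b for a, b in zip(parity, chunk))
            self.stripes.append((chunks, parity))


ingest = ToyIngest()
for _ in range(3):
    ingest.write(b"abcdef")   # three small 6-byte writes
```

Because small writes are buffered until a whole stripe is available, parity is computed once per stripe rather than being read, modified, and rewritten for each small write, which is the write-amplification saving the article refers to.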

Compression and RAID

The introduction of LFS is accompanied by VMware moving compression and encryption operations higher up the vSAN stack, so that they are done less often and processing amplification is reduced. Checksumming is also used to reduce duplicative processing, and a new snapshot engine has been introduced with faster and more consistent performance. It does not create a new object with each snapshot. Snapshot consolidations are up to 100x faster, accelerating backups and reducing VM stun time. Overall, the vSAN 8 ESA engine processes its work and data faster.

ESA’s policy-based data compression can be enabled or disabled at a per-VM level. It has an up to 4x better compression ratio per 4KB data block than vSAN OSA.

VMware says it uses adaptive RAID-5 erasure coding for guaranteed space savings on clusters with as few as three hosts, and that this combines the space efficiency of RAID-5/6 erasure coding with the performance of RAID-1 mirroring. The scheme is dynamic: applied to a cluster with three to five hosts it uses a 2+1 data placement scheme, but on clusters with six or more hosts it uses a 4+1 scheme. With 4+1, stored data consumes 1.25x the size of the original data object, better than the 2x required by a RAID-1 mirror scheme and the 1.5x needed by the 2+1 scheme.
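The capacity figures above follow from simple arithmetic: a scheme with d data chunks and p parity chunks stores (d + p) / d times the original data. A minimal sketch, assuming only the host-count thresholds quoted above; the function names are hypothetical and not part of any VMware API.

```python
def raid5_scheme(hosts: int) -> tuple[int, int]:
    """Return (data, parity) chunk counts for the host counts quoted above."""
    if hosts < 3:
        raise ValueError("adaptive RAID-5 needs at least three hosts")
    return (2, 1) if hosts <= 5 else (4, 1)


def capacity_multiplier(data: int, parity: int) -> float:
    """Raw capacity consumed per unit of original data."""
    return (data + parity) / data


for hosts in (3, 5, 6, 12):
    d, p = raid5_scheme(hosts)
    print(f"{hosts} hosts: {d}+{p}, {capacity_multiplier(d, p):.2f}x raw capacity")
```

A two-copy RAID-1 mirror stores every chunk twice, which gives the 2x figure used for comparison.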

Here is a VMware graphic showing ESA’s benefits:

OSA users will get a much increased logical cache buffer size with vSAN 8, jumping from 600GB to a maximum of 1.6TB, increasing workload performance.

Using vSAN 8 ESA needs a vSAN Advanced or Enterprise license and approved ReadyNodes (server nodes) hardware.

Get an overview of ESA here, read an introduction to ESA architecture here, and check out vSAN 8 frequently asked questions here.
