Nutanix will bypass Linux kernel context switches to accelerate AOS storage processing in its latest release. A faster AOS strengthens the company's credibility as a platform for mission-critical applications.
The new software uses a Blockstore free space manager alongside the open source SPDK (Storage Performance Development Kit) library. This is particularly helpful with NVMe SSDs and Optane drives, as less time is spent in OS processes and more on their low-latency device IO.
Normally, with Linux, an application executes in so-called user-space with the OS itself having a more secure kernel-space for its operations. When an application needs to send data to or read data from a storage device, it makes an IO request to the OS.
This triggers a system call, a trap into the kernel, and the OS executes a context switch from the application’s user space to its own kernel space. The IO request is then carried out, using an in-kernel file system, and a second context switch returns control to the application in user space. All this takes time.
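To make the cost of that round trip concrete, here is a minimal Python sketch, not taken from Nutanix or ESG, that issues many small reads against a temporary file. Each `os.pread()` call is one system call, so each read crosses from user space into the kernel and back. The function names, block size and read count are illustrative assumptions, and because the file is freshly written the data will usually come from the page cache, so the timing mostly reflects kernel entry and exit overhead rather than device latency.

```python
import os
import tempfile
import time


def timed_small_reads(path, block_size=4096, count=2000):
    """Issue `count` reads of `block_size` bytes; each os.pread() is one
    system call, i.e. one user-space -> kernel -> user-space round trip."""
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        total = 0
        for i in range(count):
            total += len(os.pread(fd, block_size, i * block_size))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed


def demo(block_size=4096, count=2000):
    # Create a scratch file large enough for `count` distinct blocks.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(bytes(block_size * count))
        path = f.name
    try:
        total, elapsed = timed_small_reads(path, block_size, count)
    finally:
        os.unlink(path)
    return total, elapsed, count


if __name__ == "__main__":
    total, elapsed, count = demo()
    print(f"{count} reads, {total} bytes, "
          f"{elapsed / count * 1e6:.2f} microseconds per call")
```

The per-call figure varies by machine, but it is paid on every IO. With a device that responds in a few microseconds, that fixed overhead becomes a noticeable share of the total.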
ESG performed a Nutanix-commissioned technical review, and the lower part of its Nutanix Architecture and Performance Optimisation document diagram illustrates this conventional kernel IO path:
In Nutanix’s AOS software, Stargate is the data I/O manager responsible for all data management and I/O operations on the set of clustered nodes. A new Blockstore component provides block-based free space management, so that Stargate can manage its own metadata as shown in the upper part of the ESG diagram.
The ESG document explained: “Global metadata is stored in a distributed key-value store (Cassandra) and local [cluster node] metadata in a local high-performance key-value store (RocksDB). This means that the Extent Store—where the data resides—doesn’t need the full capabilities of an in-kernel filesystem. Eliminating the in-kernel file system and handling this in the Nutanix stack reduces overhead and lowers latency.”
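The idea in that passage, a block-level free space manager plus a key-value store for metadata standing in for a full file system, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Nutanix's implementation: `FreeSpaceManager` and `ExtentStore` are hypothetical names, a `bytearray` stands in for the raw device, a plain `dict` stands in for the local key-value store (RocksDB in the real system), and first-fit allocation is chosen only for simplicity.

```python
class FreeSpaceManager:
    """Toy first-fit allocator tracking which fixed-size blocks are free."""

    def __init__(self, num_blocks):
        self.free = [True] * num_blocks

    def allocate(self, n):
        """Claim the first run of n contiguous free blocks; return its start."""
        run = 0
        for i, is_free in enumerate(self.free):
            run = run + 1 if is_free else 0
            if run == n:
                start = i - n + 1
                for j in range(start, i + 1):
                    self.free[j] = False
                return start
        raise MemoryError(f"no contiguous run of {n} free blocks")

    def release(self, start, n):
        for j in range(start, start + n):
            self.free[j] = True


class ExtentStore:
    """Stores extents directly on a 'device' with no file system in between."""

    def __init__(self, num_blocks, block_size=4096):
        self.block_size = block_size
        self.device = bytearray(num_blocks * block_size)  # stand-in for raw NVMe
        self.space = FreeSpaceManager(num_blocks)
        self.metadata = {}  # stand-in for a local KV store: id -> (start, nblocks, length)

    def write(self, extent_id, data):
        if extent_id in self.metadata:
            self.delete(extent_id)  # avoid leaking the old blocks
        nblocks = max(1, -(-len(data) // self.block_size))  # ceiling division
        start = self.space.allocate(nblocks)
        offset = start * self.block_size
        self.device[offset:offset + len(data)] = data
        self.metadata[extent_id] = (start, nblocks, len(data))

    def read(self, extent_id):
        start, _, length = self.metadata[extent_id]
        offset = start * self.block_size
        return bytes(self.device[offset:offset + length])

    def delete(self, extent_id):
        start, nblocks, _ = self.metadata.pop(extent_id)
        self.space.release(start, nblocks)
```

Everything a file system would normally track (which blocks are free, where each object lives) is held by the application itself, which is why the in-kernel file system becomes unnecessary for this data path.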
SPDK enables Direct Memory Access (DMA) between Stargate, via Blockstore, and the NVMe storage device. Time-consuming interrupts, context switches, and kernel processing are avoided and AOS can do more work, which ESG validated with testing.
Through the Stargate
ESG used a four-node Nutanix NX-8170-G7 cluster with 8 x 2TB NVMe devices per node and ran a baseline (Four Corners) test checking random reads and writes (IOPS) and sequential reads and writes (MB/sec). This was followed by high-performance database, multiple [Oracle] database online transaction processing and Postgres analytics tests.
ESG said all the tests showed meaningful performance improvements, with the Four Corners and OLTP charts as examples:
The research group said the Postgres result of the Nutanix cluster “with Blockstore and SPDK showed an average improvement … in DB transactions of 66 per cent. Latency was cut nearly in half, improving by an average of 45 per cent.”
It added: “Latency is a much more important metric than raw IOPS and reducing database transaction latency by half means that more response-time sensitive tier-1 applications can be deployed on Nutanix HCI with confidence.”
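As a quick check on what those percentages mean in practice, the short Python sketch below applies the two quoted averages (66 per cent more transactions, 45 per cent lower latency) to a baseline. The baseline figures of 1,000 transactions per second and 2.0 ms are invented purely for illustration and do not come from the ESG report.

```python
def apply_improvements(baseline_tps, baseline_latency_ms,
                       tps_gain=0.66, latency_cut=0.45):
    """Scale a baseline by the average gains quoted from the ESG report."""
    new_tps = baseline_tps * (1 + tps_gain)
    new_latency_ms = baseline_latency_ms * (1 - latency_cut)
    return new_tps, new_latency_ms


if __name__ == "__main__":
    # Hypothetical baseline, chosen only to make the arithmetic easy to follow.
    tps, latency = apply_improvements(1000, 2.0)
    print(f"{tps:.0f} transactions/sec at {latency:.2f} ms")
```

A 45 per cent cut leaves latency at 55 per cent of its starting value, which is why the report describes it as "nearly" rather than exactly half.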
ESG concluded: “The Nutanix platform with NVMe, Optane, and Blockstore delivers significant IOPS and latency improvements with no changes to the applications.”