Faster, cheaper: Pavilion packs massive monolithic features into mini-NVMe array

Pavilion Data Systems, an NVMe array startup, has added encryption at rest to its RF100 series appliance.

Pavilion’s array is a radical architectural departure from other all-flash arrays, which are all basically dual-controller systems. The company says its system delivers the best price-performance in the all-flash array industry. So let’s explore the design in a little more detail.

The RF100 comes in a 4U enclosure which holds up to 10 line cards. Each card contains two Broadwell Xeon CPUs in an active-active controller configuration and four 40Gbit/s or 100Gbit/s Ethernet ports. TCP and RoCE (RDMA) access are supported.

Pavilion Data Systems’ RF100 series appliance

That tots up to 20 controller CPUs and 40 Ethernet ports, a big step up from a traditional dual-controller array such as Dell EMC’s Unity or NetApp’s FAS.
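As a quick check on that arithmetic, here is a minimal Python sketch that derives the chassis totals from the per-card figures quoted above. The constant and function names are illustrative only and are not Pavilion terminology.

```python
# Per-card figures as quoted in the article; names are illustrative only.
LINE_CARDS = 10           # maximum line cards in the 4U enclosure
CONTROLLERS_PER_CARD = 2  # two Xeon controllers per card
PORTS_PER_CARD = 4        # four Ethernet ports per card


def chassis_totals(line_cards=LINE_CARDS):
    """Return (controllers, ethernet_ports) for a given number of line cards."""
    controllers = line_cards * CONTROLLERS_PER_CARD
    ports = line_cards * PORTS_PER_CARD
    return controllers, ports


controllers, ports = chassis_totals()
print(f"{controllers} controllers, {ports} Ethernet ports")
```

Calling `chassis_totals` with a smaller card count shows how a partly populated chassis would scale, on the assumption that every card carries the same complement of CPUs and ports.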

The multitude of controllers makes the architecture more akin to a monolithic, high-end array such as Dell EMC’s PowerMax or IBM’s DS8000. In effect there are multiple engines inside the array.

Monolithic array-style design attributes of RF100-series array

These controllers, each with their own memory and OS copy, attach across a PCIe fabric to up to 72 NVMe SSDs, in four banks of 18, with 14TB to 1PB of capacity. Two redundant supervisor modules handle the control plane and management functions for the entire system.

The Pavilion software provides dual-parity RAID, with a 12 per cent overhead, zero-space instant snapshots, clones and encryption.

You can scale up SSDs and controllers separately to match a workload profile.

Everything is commodity hardware-based: there is no need for any agent or other Pavilion software in the host servers, and the RF100 product has no ASICs or FPGAs.

Customers can use their own SSDs, with Micron’s 9200, Western Digital’s SN200, Samsung’s PM1725b and Intel’s P4800X, P4510 and P4610 all supported.

Fully loaded

The array provides end-to-end NVMe data access to host servers, with average read latency of 117μs. 

In a fully loaded box, performance is up to 120GB/sec of read bandwidth, 60GB/sec of write bandwidth and 20 million 4K random read IOPS.
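To put those headline numbers in context, the following Python sketch divides them across the 20 controllers and compares them with the raw line rate of the 40 Ethernet ports. It is a back-of-envelope calculation, not a Pavilion-published breakdown: it assumes load is spread evenly across controllers, that 4K means 4,096 bytes, that GB means 10^9 bytes, and it ignores protocol overhead.

```python
# Headline figures quoted in the article for a fully loaded chassis.
READ_GB_PER_SEC = 120
WRITE_GB_PER_SEC = 60
READ_IOPS = 20_000_000
CONTROLLERS = 20
PORTS = 40
BLOCK_BYTES = 4096  # assumption: "4K" means 4,096 bytes


def per_controller(total, controllers=CONTROLLERS):
    """Even share of a chassis-wide figure per controller (assumes even load)."""
    return total / controllers


def aggregate_port_gb_per_sec(port_gbits, ports=PORTS):
    """Raw line rate of all ports in GB/sec, ignoring protocol overhead."""
    return ports * port_gbits / 8


def iops_to_gb_per_sec(iops, block_bytes=BLOCK_BYTES):
    """Bandwidth implied by an IOPS figure at a fixed block size (GB = 10**9 bytes)."""
    return iops * block_bytes / 1e9


print("read GB/sec per controller:", per_controller(READ_GB_PER_SEC))
print("read IOPS per controller:", per_controller(READ_IOPS))
print("port line rate at 100Gbit/s:", aggregate_port_gb_per_sec(100), "GB/sec")
print("port line rate at 40Gbit/s:", aggregate_port_gb_per_sec(40), "GB/sec")
print("bandwidth implied by 4K IOPS:", iops_to_gb_per_sec(READ_IOPS), "GB/sec")
```

Under these assumptions, each controller would handle about 6GB/sec of reads and one million IOPS, and the raw port capacity in either Ethernet configuration exceeds the quoted 120GB/sec read bandwidth.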

This is remarkable, coming as it does from a 4U box. Pavilion says its cost per IOPS is up to 25 times lower than that of competing all-flash arrays. It has not yet published numbers to support that claim, but let’s take the IBM DS8888 with 34.3TB capacity as a reference point to give a rough idea of how much monolithic arrays cost. Its list price in November 2016 was $1.97m, according to the SPC-1 benchmark report.

In a video, Pavilion Data’s Head of Products Jeff Sosa says the appliance can replace the locally-attached NVMe SSDs of 20 rack servers. This means you can scale compute and storage separately, and so avoid unused and stranded SSD capacity in each server.

Jeff Sosa introducing the RF100 series product

Application use cases cited by Pavilion include MongoDB, MySQL, Splunk Enterprise, Kubernetes on-premises, Spark and Apache Cassandra.

Supercomputing ’18

Pavilion is to exhibit the RF100 series product next week at SC18, November 12-15 in Dallas, Texas.

It will show the results of two demo workloads:

  • Genomics multi-variant testing analysis, demonstrating how rack-scale shared NVMe storage can accelerate human genome analysis
  • The SPEC-FS benchmark in a clustered file system environment using GPFS (IBM Spectrum Scale) with shared, rack-scale flash storage

Note that Pavilion has not submitted a SPEC-FS benchmark result (we understand the benchmark in question is SPEC SFS2014).

Sosa told us: “We haven’t yet submitted an official result, which is also why we didn’t specify a specific result in the press announcement. We will be discussing testing we have done so far as part of the booth exhibit, but not releasing official results.”

He said: “We have been focused more on the tests outside of software build up until now, since we believe that the VDA and Database tests are more relevant to our customer base, but we plan to run all of them.”

That will be a good thing as it will provide an objective comparison with systems from rival suppliers.