Silk’s Azure speed is based on Ephemeral OS disks

Silk gets its crazy fast Azure speed by using, and protecting, Azure's fast but unprotected ephemeral OS disks, incurring no storage cost.

Ephemeral OS disks are created on the local Azure virtual machine (VM) storage and are not saved to Azure Storage. Azure documentation states: “With Ephemeral OS disk, you get lower read/write latency to the OS disk and faster VM reimage.”

When Silk's storage software runs on the Azure public cloud it can achieve 1 million IOPS and 20GB/sec read bandwidth. That is 6.25 times more IOPS and 10 times greater bandwidth than Azure's fastest storage offering, Ultra Disk Storage.

Chris Buckel.

We talked to Silk’s VP for business development, Chris Buckel, to find out more.

Blocks & Files: How does Silk get such high performance from Azure?

Chris Buckel: What Silk does is spin up a whole set of Azure Compute instances and aggregate their performance, then use our software to provide all of the enterprise features and resilience, etc. So when the customer runs a Silk Data Pod in Azure, they are running a bunch of Azure Compute VMs all orchestrated by our Flex software. But the key fact here is that, since they are effectively using Azure Compute to provide storage, they get to take advantage of Microsoft’s reserved instances discounts… so suddenly, storage is as discounted as compute.

Blocks & Files: Could you explain what a Silk Data Pod consists of?

Chris Buckel: A Silk ‘Data Pod’ in Azure consists of two sets of Azure Compute instances running in the customer’s own subscription. The first layer (called c.nodes) provides the block data services to the customer’s database systems, while the second layer (the d.nodes) persists the data. This means the SDP can be scaled in two dimensions: capacity can be increased by adding more d.nodes, while performance can be added by scaling out the number of c.nodes in a fully symmetric active/active manner. And, of course, when performance is no longer needed, the number of c.nodes can be reduced to lower the Azure infrastructure cost.

Using this design, we can automatically scale out workloads in Azure based on their constantly-evolving requirements and achieve very high levels of performance on demand. And most importantly, everything is based on Azure Compute resources.
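The two independent scaling dimensions Buckel describes can be sketched in a few lines of Python. This is purely illustrative; the `DataPod` class and its method names are our invention for modeling the idea, not Silk's actual API.

```python
# Hypothetical model of a Data Pod's two scaling dimensions:
# d.nodes grow capacity, c.nodes grow (or shrink) performance.
from dataclasses import dataclass

@dataclass
class DataPod:
    c_nodes: int  # compute layer: block data services (performance)
    d_nodes: int  # persistence layer: stored data (capacity)

    def scale_capacity(self, extra_d_nodes: int) -> None:
        """Add d.nodes to increase usable capacity."""
        self.d_nodes += extra_d_nodes

    def scale_performance(self, delta_c_nodes: int) -> None:
        """Add or remove c.nodes to tune performance and Azure cost."""
        self.c_nodes = max(1, self.c_nodes + delta_c_nodes)

pod = DataPod(c_nodes=2, d_nodes=3)
pod.scale_capacity(2)       # capacity grows independently...
pod.scale_performance(3)    # ...from performance
pod.scale_performance(-4)   # shrink when demand drops (floor of 1)
print(pod)                  # DataPod(c_nodes=1, d_nodes=5)
```

The point of the model is that the two counters never constrain each other, which is what lets cost track demand in each dimension separately.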

Blocks & Files: Why is it beneficial to use Azure compute instances?

Chris Buckel: Azure Compute instances (i.e. virtual machines) are available on PAYG options, but also via “Reserved Instances” discounts of 41 per cent for a one-year commit or 62 per cent for a three-year commit. Azure Disk, on the other hand, doesn’t give any discount for committing to a period of time… it’s more or less the same cost no matter what you do.
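A quick back-of-the-envelope calculation shows why those discounts matter over a typical system lifetime. The hourly rate below is a made-up placeholder, not a real Azure price; only the 41 and 62 per cent discount figures come from the interview.

```python
# Illustrative cost comparison over a three-year lifetime, using the
# quoted reserved-instance discounts (41% one-year, 62% three-year).
payg_hourly = 1.00                      # hypothetical pay-as-you-go $/hour
hours_3y = 3 * 365 * 24                 # hours in three years

payg_cost = payg_hourly * hours_3y
ri_1y_cost = payg_cost * (1 - 0.41)     # 41% one-year RI discount
ri_3y_cost = payg_cost * (1 - 0.62)     # 62% three-year RI discount

print(f"PAYG:    ${payg_cost:,.0f}")    # $26,280
print(f"1-yr RI: ${ri_1y_cost:,.0f}")   # $15,505
print(f"3-yr RI: ${ri_3y_cost:,.0f}")   # $9,986
```

Because Silk's storage layer runs on compute instances, that same discount curve applies to storage, whereas Azure Disk stays at the flat rate regardless of commitment.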

Blocks & Files: Why is this such a big deal?

Chris Buckel: That’s really big for customers with enterprise workloads like Oracle, MSSQL, Postgres, etc., because they all tend to be building these systems with a >3 year lifetime in mind. So suddenly the storage cost can be massively offset, just like the compute cost, but at the same time with all this additional performance benefit, resilience, and features like inline deduplication, zero-footprint snapshots and the like.

The features and functionality are great, but it’s the change in cost/performance profile that makes it compelling.

Blocks & Files: Okay, you use compute instances, but where is the data stored?

Chris Buckel: In addition to the types of Azure Disk you outlined in your recent article, there is another type of disk in each cloud provider called ‘ephemeral’ or local SSD storage. It’s very fast and very low cost, because it’s completely unprotected, unlike all the other options which are protected (and therefore slow).

We then use our patented RAID technology to stripe across multiple ephemeral volumes, providing the resiliency in case one or more are lost, but taking advantage of the low cost and high speed. We believe this is very different to all the other options, e.g. NetApp or Pure, who have to use the cloud provider’s disks and protection.
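The general idea of surviving the loss of an ephemeral volume by striping with parity can be shown with a toy XOR scheme (RAID-5-like). This is a minimal sketch of the concept only; Silk's patented RAID technology is not public, so the code below is our assumption of the simplest possible illustration.

```python
# Toy XOR parity striping: data chunks live on separate ephemeral
# volumes, a parity chunk on one more. Losing any single chunk is
# recoverable by XOR-ing the survivors with the parity.
from functools import reduce

def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equal-length chunks byte-by-byte to form a parity chunk."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

def recover(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild one lost chunk from the survivors plus the parity."""
    return xor_parity(surviving + [parity])

data = [b"AAAA", b"BBBB", b"CCCC"]      # stripes on three ephemeral volumes
parity = xor_parity(data)               # stored on a fourth volume

lost = data.pop(1)                      # one ephemeral volume disappears
assert recover(data, parity) == lost    # the stripe is reconstructed
```

Real implementations add rotation of parity across volumes and tolerance for multiple failures, but the cost/speed trade-off is the same: unprotected local SSDs plus software redundancy, instead of the provider's slower protected disks.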

This – along with the scale-out architecture – is the secret sauce that gives us the very high levels of IOPS and throughput.

Blocks & Files: Do you provide any other protection?

Chris Buckel: Silk data protection comes in two forms: an optional change log written to low-cost HDDs, and cross-zone/cross-region/cross-cloud replication to another Silk Data Pod running elsewhere.