A ransomware attack is a disaster. When ransomware infects an organisation’s IT systems, stored and backup data are encrypted and made unavailable.
The IT system is unable to function and in many cases that means the organisation cannot function either until it remedies the attack. In essence there are two ways to do this: paying the ransom to decrypt the files or getting clean files restored from a disaster recovery (DR) facility.
Ransomware requires DraaStic action
Affordable and fast DR is a good way to defeat a ransomware infestation. Datrium, a hyperconverged systems vendor, has recognised this and in August 2019 launched its own DRaaS (disaster recovery as a service), incorporating home-grown HCI system backup technologies.
Historically, disaster recovery has been a hugely expensive but relatively niche aspect of customer storage and system buying strategy. But the massive increase in ransomware attacks in recent years has expanded the DR vulnerability surface. At the same time, the availability of the public cloud as a form of remote DR facility has brought costs tumbling.
A September 2016 FBI alert said: “New ransomware variants are emerging regularly. Cyber security companies reported that in the first several months of 2016, global ransomware infections were at an all-time high. Within the first weeks of its release, one particular ransomware variant compromised an estimated 100,000 computers a day.”
Data protection vendor Acronis reported the Spring 2017 WannaCry outbreak afflicted over 200,000 computers in over 150 countries. Global costs were estimated to total $8bn.
A second FBI alert in October 2019 said: “Ransomware attacks are becoming more targeted, sophisticated, and costly, even as the overall frequency of attacks remains consistent. Since early 2018, the incidence of broad, indiscriminant ransomware campaigns has sharply declined, but the losses from ransomware attacks have increased significantly, according to complaints received by IC3 and FBI case information.”
“Although state and local governments have been particularly visible targets for ransomware attacks, ransomware actors have also targeted health care organizations, industrial companies, and the transportation sector.”
Datrium Carpe Diem
Indeed ransomware is now so prevalent that automated failover to a recovery site is becoming table stakes for all data protection suppliers. In that sense ransomware recovery is a killer feature, and suppliers without this capability will be in trouble.
Datrium’s background is somewhat different. Founded in 2012, the company is a venture-backed startup that has raised $165m to date, including $60m in the most recent round in September 2018.
Datrium pioneered a middle way between converged and hyperconverged systems with hyperconverged nodes running storage controller software that linked them to a shared storage box. However, it faced enormous competition and the HCI market consolidated rapidly around two leading suppliers: Dell EMC, with VxRail, and Nutanix.
Datrium then moved into unified hybrid cloud computing and protection for its DVX systems, specifically backup to the cloud. The company announced Cloud DVX in August 2018, claiming up to 10 times lower AWS costs for cloud backup, and CloudShift, a SaaS-based disaster recovery orchestration service for VMware.
This hit the market as the necessity of dealing with ransomware became even more pressing, and Datrium realised it had a potential killer app for VMware users.
CEO Tim Page told Blocks & Files in a phone interview that Datrium has gained 60 new accounts in under two months since launching its disaster recovery as a service. “DR is catapulting our business revenues upwards.”
He said the reason for this is that Datrium’s DRaaS preserves the VMware environment, is affordable and lightning fast, failing over in minutes when an attack takes place.
Datrium offers DR as a Service (DRaaS) using VMware Cloud on AWS. In other words, it protects VMware virtual machines (VMs) by spinning up DR copies in AWS. Page told me the time between attack detection and recovery should be as short as possible; that is, the DR copy VMs should be spun up quickly.
He said backups, even air-gapped backups such as tape, are inferior to a DR facility. It takes time to restore backup files and the ransomware infestation must be removed from the affected IT site. With a DR facility in place, the victim can use clean files while the ransomware is found and removed and the infected files are deleted. Post clean-up, the DR facility can fail back to the main site.
Datrium stores immutable backup snapshots in Amazon’s S3 storage, which lowers cost, but in a form that means they can be spun up immediately as VMs running in the VMware cloud, without rehydration or conversion. Admin staff at the ransomware-infected customer just switch from one VMware environment to another; there is no difference.
Immutability means that the snapshotted data cannot be altered subsequently. Any ransomware infection after the date the snapshot was taken will not infect that snapshot.
Datrium offers a short RTO (Recovery Time Objective) because it has selectable restore points. This short RTO is made feasible by automating the recovery process, which can involve hundreds or thousands of separate operational steps to get a large suite of VMs up and running in the right order.
With the orchestration routine in place, the DRaaS facility is told via a mouse click to fail over to the cloud DR site when a ransomware attack or other disaster happens, and that takes just minutes. Recovery can then start a few minutes later at the source site.
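Getting a large suite of VMs up and running in the right order is, at its core, a dependency-ordering problem. The sketch below is illustrative only and is not Datrium's orchestration code: the VM names and the dependency map are hypothetical, and it simply uses Python's standard-library `graphlib` to group VMs into start-up waves, where every VM in a wave can be started once the previous wave is running.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (not from Datrium): each VM lists the VMs
# that must already be running before it can be started.
DEPENDENCIES = {
    "domain-controller": set(),
    "database": {"domain-controller"},
    "file-server": {"domain-controller"},
    "app-server": {"database"},
    "web-frontend": {"app-server"},
}


def boot_waves(dependencies):
    """Group VMs into start-up waves that respect the dependency map.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Every VM returned here has all of its prerequisites started.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(boot_waves(DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

A real runbook would also carry health checks, network remapping and timeouts per step; the point here is only that the ordering itself can be computed rather than maintained by hand.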
Finding the ransomware footprint
Backed-up VMs exist in a timeline. Some time before an attack, with its file locking-by-encryption and ransom notification, ransomware infects a system and starts encrypting files. This event can be located by checking file activity records.
In a recent incident a Midwest US municipality was attacked (the town is unwilling to reveal its identity, Datrium said). The IT department had backed up its VMs to a Datrium DVX system – but without the DRaaS option in place. Admin staff and Datrium consultants checked the incoming snapshots to the target DVX system and found a sudden size increase:
The highlighted snapshots in the image above have sizes of 23.6GiB, 80.2GiB, and 80.7GiB, while prior and subsequent snapshots are 6.1GiB and 3.6GiB in size. This enlargement was caused by Ryuk ransomware encrypting files.
To combat the attack, a prior snapshot from a day earlier was used and powered up on a quarantined network. It was verified malware-free by a security team and became a so-called recovery golden copy.
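The size check described above can be approximated in a few lines. This is a minimal sketch under stated assumptions, not the procedure the municipality or Datrium actually followed: the ordering of the five sizes (6.1GiB first, 3.6GiB last), the 3x threshold and the function names are all assumptions made for illustration. It flags any snapshot that is more than three times the size of the most recent normal-sized snapshot, then points at the snapshot immediately before the first spike as a candidate for manual verification.

```python
def flag_size_spikes(sizes_gib, factor=3.0):
    """Return indexes of snapshots that look abnormally large.

    A snapshot is flagged when it exceeds `factor` times the size of the
    most recent unflagged snapshot. The 3x default is an assumed threshold.
    """
    flagged = []
    baseline = None
    for index, size in enumerate(sizes_gib):
        if baseline is not None and size > factor * baseline:
            flagged.append(index)
        else:
            # Normal-sized snapshot: it becomes the new comparison baseline.
            baseline = size
    return flagged


def last_clean_candidate(flagged):
    """Index of the snapshot just before the first spike, or None."""
    if not flagged or flagged[0] == 0:
        return None
    return flagged[0] - 1


if __name__ == "__main__":
    # Sizes quoted in the article; the chronological order is assumed.
    sizes = [6.1, 23.6, 80.2, 80.7, 3.6]
    spikes = flag_size_spikes(sizes)
    print("suspicious snapshots:", spikes)
    print("candidate golden copy:", last_clean_candidate(spikes))
```

A size jump is only a hint. As in the incident described here, the candidate snapshot still has to be powered up on a quarantined network and verified by a security team before it is trusted.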
The recovery team restored individual VMs in priority order and verified each one was clean with an anti-virus scanner before restoring the next one. This took almost two days to complete. A mass update restoration of all their VMs would have taken less time and a DRaaS option would have been quicker again.
Datrium DRaaS roadmap
Datrium initially provided cloud backup for its own on-premises DVX semi-hyperconverged system – ‘semi’, because the storage repository was separate from the compute nodes. It extended this to source systems from Dell EMC, NetApp, Nutanix, Pure Storage and others, and also to VMware running in AWS.
Datrium can provide DR with failover to VMware Cloud on AWS so long as the source site is a VMware site. Datrium uses its own backed up VMs and data from the source site.
VMware is accommodating Kubernetes and containers and Page pointed out that “as VMware embraces Kubernetes we can do so too”.
He said Datrium DRaaS will work with Microsoft Azure cloud by the end of 2020.
And what about the rising tide of cloud-native applications that do not use VMware? “We have a CSS login for bare metal servers,” Page said. He suggested Datrium could develop this ability to back up bare metal Kubernetes environments to the public cloud, and reinstantiate containers there for DR, in the same way as it spins up VMs today.
As long as ransomware infections exist, Datrium should prosper by offering a simple and fast recovery option, viable both for virtual machines and containerised environments.