VAST Data is making data reduction improvements to get its per-TB cost down to nearline disk levels and looks set to plunge below that level with coming hardware and software changes.
The company supplies a Universal Storage software product that provides file and S3 storage on certified hardware configurations from partners such as Avnet. The hardware is based on a single data tier of QLC (4 bits/cell) NAND, with a storage-class memory tier for metadata and for buffering incoming writes.
CMO and co-founder Jeff Denworth presented to an IT press tour in Silicon Valley in June and talked about software changes that are going to increase the capacity of VAST’s Universal Storage systems.
The company’s software implements a disaggregated shared everything (DASE) architecture in which stateless compute nodes can see all of the NVMe SSD capacity. The average VAST customer has 12PB (raw) of flash capacity. Data reduction using similarity hashes is used to increase effective capacity.
On average VAST reckons it has a 3:1 data reduction ratio across its installed systems, which would give the average customer 36PB of effective capacity.
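The arithmetic behind that figure is simple, and a short sketch shows how effective capacity scales with the reduction ratio. The function name is ours, and the 4:1 and 5:1 rows use the targets Denworth mentioned rather than measured results.

```python
def effective_capacity_pb(raw_pb: float, reduction_ratio: float) -> float:
    """Effective (logical) capacity for a given raw capacity and N:1 reduction ratio."""
    if raw_pb < 0 or reduction_ratio <= 0:
        raise ValueError("raw capacity must be >= 0 and ratio must be > 0")
    return raw_pb * reduction_ratio


if __name__ == "__main__":
    raw_pb = 12  # average raw flash capacity per customer, as quoted by VAST
    for ratio in (3, 4, 5):  # 3:1 is today's average; 4:1 and 5:1 are targets
        print(f"{ratio}:1 -> {effective_capacity_pb(raw_pb, ratio):.0f} PB effective")
```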
Denworth said VAST recently added adaptive chunking, meaning variable block length, to its data reduction system. Based on internal testing, VAST thinks the combination of similarity hashing and adaptive chunking gives it a 30 to 70 percent advantage over Data Domain (PowerProtect in Dell’s new branding). The 70 percent number came from storing and reducing Commvault backup files for some SQL Servers.
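VAST has not published how its adaptive chunking works, so the following is only a generic sketch of variable-length (content-defined) chunking using a gear-style rolling hash. The function name, the table seed and the size parameters are all our own assumptions, and similarity hashing is not shown.

```python
import random
from typing import List

# Hypothetical gear table: one fixed pseudo-random 32-bit value per byte value.
# A fixed seed keeps chunk boundaries reproducible between runs.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk_data(data: bytes, avg_bits: int = 7,
               min_size: int = 32, max_size: int = 1024) -> List[bytes]:
    """Split data into variable-length chunks at content-defined boundaries."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Rolling hash: older bytes are shifted out of the 32-bit value,
        # so h depends on at most the last 32 bytes seen.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        # Cut when the top avg_bits bits are all zero (about 1 in 2**avg_bits
        # positions), once the chunk has reached min_size.
        at_boundary = size >= min_size and (h >> (32 - avg_bits)) == 0
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because boundaries depend on the content rather than on fixed offsets, inserting a few bytes near the start of a file tends to disturb only nearby chunks, so later chunks can still match data already stored. Fixed-length blocks would all shift and stop matching.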
Version 4.4 of VAST’s software will add data-awareness, which will apply specific reductions to particular types of data, such as integers and floating point numbers. If the reduction algorithm knows that a certain piece of data is a floating point number, it can reduce it more than if it were undifferentiated data. This could be applicable to VAST storage being used in market data, life sciences and data warehouse applications.
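VAST did not describe its type-specific algorithms. As an illustration of why knowing the data type helps, here is a sketch of one well-known technique, byte shuffling, which groups the corresponding bytes of every 8-byte float together before handing them to a general-purpose compressor. The function names are ours, and zlib stands in for whatever compressor a real system would use.

```python
import struct
import zlib
from typing import List


def shuffle_bytes(raw: bytes, width: int) -> bytes:
    """Byte-transpose: all first bytes of each element, then all second bytes, etc."""
    if len(raw) % width:
        raise ValueError("input length must be a multiple of the element width")
    return b"".join(raw[i::width] for i in range(width))


def unshuffle_bytes(shuffled: bytes, width: int) -> bytes:
    """Inverse of shuffle_bytes."""
    n = len(shuffled) // width
    out = bytearray(len(shuffled))
    for i in range(width):
        out[i::width] = shuffled[i * n:(i + 1) * n]
    return bytes(out)


def compress_doubles(values: List[float]) -> bytes:
    """Pack as little-endian 64-bit floats, shuffle, then compress."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.compress(shuffle_bytes(raw, 8))


def decompress_doubles(blob: bytes) -> List[float]:
    """Reverse compress_doubles and return the original values."""
    raw = unshuffle_bytes(zlib.decompress(blob), 8)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))
```

For series of similar values, such as prices or sensor readings, the sign, exponent and high mantissa bytes barely change from one number to the next. After shuffling they form long, repetitive runs, which a compressor typically handles better than the same bytes interleaved with noisy low-order ones.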
VAST has seen an additional 25 percent data reduction in testing and believes there is more to come. A slide suggests this is worth $100,000 per petabyte, a huge saving with multi-petabyte systems. Denworth said VAST will supply specific compression algorithms for imagery and other file formats in the future, again looking to produce higher reduction ratios for these data types.
This is similar in concept to Ocarina’s content-aware file and image compression technology, which could compress image types previously thought incompressible. Dell bought Ocarina in 2010 and it’s not clear whether Ocarina technology is used by the PowerProtect deduping algorithms. Denworth agreed there were parallels with Ocarina’s technology.
He thinks VAST will grow the average data reduction ratio to 4:1 to 5:1 across its fleet this year. This will enable it to match or even beat nearline HDD $/TB.
Asked about PLC (5 bits/cell) flash, Denworth said this will provide another 20 percent cost reduction. As 3D NAND layering goes to 200+ layers, there will be an additional cost reduction that compounds with PLC. We think this will mean VAST’s $/TB will go below nearline HDD cost, and that could happen in the 2024/2025 period.
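To see how these factors stack up, here is a rough sketch that treats today’s raw flash cost per TB as 1.0 and applies media savings and the reduction ratio multiplicatively. Only the 20 percent PLC figure and the 3:1 and 5:1 ratios come from VAST. The function name is ours, and the 10 percent layering saving is a purely hypothetical placeholder, since no number was given.

```python
from typing import Sequence


def relative_cost_per_effective_tb(reduction_ratio: float,
                                   media_savings: Sequence[float] = ()) -> float:
    """Cost per effective TB, relative to a raw media cost of 1.0 per TB.

    Each entry in media_savings is a fractional saving (0.20 = 20 percent);
    savings compound multiplicatively rather than adding up.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be > 0")
    cost = 1.0
    for saving in media_savings:
        cost *= (1.0 - saving)
    return cost / reduction_ratio


if __name__ == "__main__":
    today = relative_cost_per_effective_tb(3.0)              # QLC at 3:1
    plc = relative_cost_per_effective_tb(5.0, [0.20])        # PLC at 5:1
    # 0.10 is a HYPOTHETICAL layering saving used only for illustration.
    plc_layers = relative_cost_per_effective_tb(5.0, [0.20, 0.10])
    print(f"PLC at 5:1 vs today: {plc / today:.2f}x")
    print(f"PLC plus layering at 5:1 vs today: {plc_layers / today:.2f}x")
```

On these assumptions, moving from 3:1 on QLC to 5:1 on PLC would roughly halve the cost per effective TB before any layering benefit is counted.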
This could provide a huge boost to VAST Data storage use, particularly for fast restores of cold data.