The Tape Storage Council has put out a report saying tape will play an even broader role in the IT ecosystem as data storage growth continues.
We beg to differ.
The report, titled Tape to Play Critical Roles as the Zettabyte Era Takes Off, states: “The zettabyte era is in full swing generating unprecedented capacity demand as many businesses move closer to Exascale storage requirements.”
It identifies five trends it claims favor tape:
- Favorable economics – data-intensive applications and workflows fuel new tape growth due to its significant total cost of ownership (TCO) advantages
- Security – tape’s inherent air gap provides additional levels of cybercrime defense
- Data accessibility – tape performance improves access times and throughput
- Sustainability – tape plays a significant role in green data center strategies
- Optimization – tape-based active archives boost storage optimization, providing dynamic optimization and fast data access for archival storage systems
Let’s unpack that. Favorable economics basically means tape storage costs less than disk storage on a $/TB basis, and both cost less than flash storage. However, tape is slow, with a much longer time to first byte than disk or NAND, which makes it the choice for rarely accessed data where latency matters less than storage cost.
Tape’s inherent physical air gap is a good point, but the virtual air gaps marketed by backup storage, unstructured data storage, cloud storage, and cybersecurity companies are generally acknowledged to be effective, so tape’s superiority here is lessened.
The data accessibility point is harder to understand. Sure, a single tape drive beats a single disk drive in streaming bandwidth terms. As the report says, “HDDs and SSDs have faster access times to the first byte of data. For large files, tape systems have faster access times to the last byte of data… The LTO-9 and TS1160 enterprise [tape] drives each have a data transfer rate of 400 MB/sec. This compares to the 7,200 rpm HDDs ranging between 160-260 MB/sec.”
But typically a large file would be striped across disk drives in an external storage array and read back from several drives in parallel, nullifying the single tape drive’s advantage. Also, many primary and several secondary data storage devices these days use SSDs rather than disk drives. In bandwidth terms that’s pretty much game over, especially with NVMe SSDs. In our view this claimed advantage evaporates on inspection.
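To put rough numbers on this, here is a back-of-envelope sketch of time to last byte for one large file. The transfer rates are the ones quoted in the report. The 1 TB file size, the 60-second tape mount-and-locate delay, the 10 ms disk latency, the eight-drive stripe, and the perfect parallel scaling are all our illustrative assumptions, not figures from the report.

```python
# Back-of-envelope time-to-last-byte comparison.
# Transfer rates come from the report; every other number is an assumption.

FILE_MB = 1_000_000          # assumed 1 TB file (decimal units)

TAPE_MBPS = 400              # LTO-9 / TS1160 rate quoted in the report
HDD_MBPS_LOW = 160           # low end of the 7,200 rpm HDD range in the report
HDD_MBPS_HIGH = 260          # high end of that range

TAPE_FIRST_BYTE_S = 60.0     # assumed cartridge mount + locate delay
HDD_FIRST_BYTE_S = 0.01      # assumed seek + rotational latency
STRIPE_WIDTH = 8             # assumed number of drives in the stripe


def time_to_last_byte(size_mb, mbps, first_byte_s, drives=1):
    """Seconds to read the whole file, assuming ideal parallel scaling."""
    return first_byte_s + size_mb / (mbps * drives)


# One tape drive versus the fastest single HDD in the quoted range.
tape_s = time_to_last_byte(FILE_MB, TAPE_MBPS, TAPE_FIRST_BYTE_S)
hdd_single_s = time_to_last_byte(FILE_MB, HDD_MBPS_HIGH, HDD_FIRST_BYTE_S)

# An eight-way stripe built from the slowest HDDs in the quoted range.
hdd_striped_s = time_to_last_byte(
    FILE_MB, HDD_MBPS_LOW, HDD_FIRST_BYTE_S, drives=STRIPE_WIDTH
)

print(f"single tape drive:  {tape_s:8.1f} s")
print(f"single fast HDD:    {hdd_single_s:8.1f} s")
print(f"8 slow HDDs striped:{hdd_striped_s:8.1f} s")
```

Under these assumptions the single tape drive does beat the single disk drive to the last byte, as the report says, but a modest stripe of even the slowest drives in the quoted range finishes well ahead of it.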
Yes, tape is getting faster, generation by generation, but in practice it’s still slower than disk arrays and all-flash arrays. The evangelists implicitly accept this, as we shall see when we get to the fifth trend.
The council’s fourth point about sustainability is a good one. Obviously, offline tape cartridges sitting on a shelf don’t need power or cooling. But this supports tape’s use only if the need for fast data access has already been discounted. No one is going to use tape for primary data storage if it means their servers can only support tens of transactions a second instead of thousands, not even to save some thousands of tonnes of carbon emissions a year.
And so we get to the fifth point, the active archive. The diagram in the report shows a cache buffer placed in front of a tape library:
It is operated by a server running data management software, which presents a file or object interface upstream and a tape interface downstream to the library. The report says: “An active archive integrates two or more storage technologies (SSD, HDD, tape, and cloud storage) behind a file system providing a seamless means to manage archive data in a single virtualized storage pool.”
Archive storage in the cloud uses tape so we are really talking about disk/SSD front ends and a tape backend here. And why have this cache buffer? It “provides dynamic optimization and fast data access for archival storage systems.” The “SSDs or HDDs serve as a cache buffer for archival data stored on tape providing faster access to first byte of data, higher IOPs, and random access.”
In other words, you need a disk or SSD buffer because tape data access is slow. Ideally people wouldn’t use tape at all because it’s far too slow, but it is cheap and reliable so we put up with it and ameliorate its slowness with a cache buffer in active archive setups.
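The buffering idea can be sketched as a read-through cache sitting in front of a slow backend. This is a minimal illustration, not the report’s design: the `TapeLibrary` and `ActiveArchive` names are hypothetical, the least-recently-used eviction policy is our assumption (the report does not specify one), and slow tape recalls are counted rather than simulated with real delays.

```python
from collections import OrderedDict


class TapeLibrary:
    """Hypothetical stand-in for the slow tape tier; counts recalls."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.recalls = 0

    def read(self, key):
        self.recalls += 1          # each call stands for a slow mount + locate
        return self._objects[key]


class ActiveArchive:
    """Read-through LRU cache (the disk/SSD buffer) in front of the tape tier."""

    def __init__(self, backend, capacity):
        self._backend = backend
        self._capacity = capacity
        self._cache = OrderedDict()

    def read(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as most recently used
            return self._cache[key]
        data = self._backend.read(key)        # cache miss: recall from tape
        self._cache[key] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)   # evict the least recently used
        return data


tape = TapeLibrary({"a": b"alpha", "b": b"bravo", "c": b"charlie"})
archive = ActiveArchive(tape, capacity=2)

# Six reads arrive upstream; only the cache misses reach the tape library.
for key in ["a", "a", "b", "a", "c", "b"]:
    archive.read(key)

print(f"reads: 6, tape recalls: {tape.recalls}")
```

Repeat reads of recently used data are served from the buffer, while anything that has aged out of it still pays the full tape recall penalty, which is the trade-off the active archive is built around.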
The report says the need for data archives will grow due to increased cold data storage needed by cloud, HPC, IoT, life sciences, media and entertainment, video surveillance, and sports video. That’s most likely true, but it’s still standard data archiving on tape. Tape won’t play a broader role. It will play the same role it always has.
Tape is not dead yet and its capacity-increase roadmap is impressive. It’s actually thriving. That is not because of its wonderful technology, good though it is, but because there is nothing better. It’s cheap, it holds an awful lot of data, it’s reliable, it’s slow, and it works. Enough said.