Backblaze pours cold water on immersion cooling for most disk drive setups

Storage pod

Iceotope has demonstrated immersion cooling of disk drives and reckons it could enable fewer drive failures and lower total cost of ownership. A Backblaze technologist isn’t convinced.

Backblaze, which provides cloud backup and general storage, has designed its own drive chassis called Pods, and has tens of thousands of disk drives in operation. Backblaze keeps detailed statistics about its drive population over their lifetime, cataloging failures and tracking their working life and costs. 

We asked co-founder and CEO Gleb Budman what he thought about the Iceotope immersion cooling and he asked one of his technical staff to respond. The tech spokesman said: “Any kind of liquid cooling (immersion or otherwise) is basically trading complexity and difficulty in servicing for a more consistent temperature (fewer peaks and valleys) and higher heat dissipation capacity. Typically, this trade-off isn’t particularly valuable unless you have extremely high heat production (GPU compute is notable here, though CPU and RAM can get there), constant load, or both.”

This is not the situation at Backblaze because its cloud storage operation is disk-based and not compute-intensive: “In our case, we don’t use particularly high-end CPUs or RAM, we don’t use GPUs at all, and while our hard drives are constantly in use, unlike SSDs, heat is more a factor in their reliability than their performance.”

Backblaze, like most everybody else, uses fan-assisted air-cooling for its server and disk chassis. Any new cooling system would need to be added to this setup and would have a cost. The tech guy told B&F: “Now, if this was a cheap and accessible solution that could just be dropped into a system, it’s… possible it’d be interesting to us, but judging by the PDF (and some prior attempts I’ve seen at this) I think the cost of implementing alone would wipe out any lengthened life of the drives it gave us.”
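The arithmetic behind that objection can be made concrete with a minimal sketch. It assumes straight-line drive costs and ignores power, servicing, and leak risk. The function names and every figure in it are hypothetical illustrations, not Backblaze or Iceotope numbers.

```python
def breakeven_cooling_cost(drive_price, base_years, extended_years):
    """Largest per-drive cooling spend that a longer drive life could repay.

    Assumption: a drive that lasts longer is worth proportionally more,
    so the value of the extra life is the drive price scaled by the
    fractional increase in service years.
    """
    return drive_price * (extended_years - base_years) / base_years


def cooling_pays_off(cooling_cost_per_drive, drive_price, base_years, extended_years):
    """True only if the cooling spend is below the break-even figure."""
    limit = breakeven_cooling_cost(drive_price, base_years, extended_years)
    return cooling_cost_per_drive < limit


# Hypothetical inputs: a 300-unit drive whose life goes from 5 to 6 years.
limit = breakeven_cooling_cost(300.0, 5.0, 6.0)
print(limit)                                      # 60.0
print(cooling_pays_off(100.0, 300.0, 5.0, 6.0))   # False
```

Under these made-up inputs, any immersion retrofit costing more than 60 units per drive would cost more than the extra drive life is worth, which is the shape of the argument the Backblaze technologist is making.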

Backblaze does not like lock-in; it buys drives and servers from whichever manufacturer offers the best deal in terms of its requirements at purchase time. Mr techie added: “Given the diagrams there, it looks like much variation in the side of the drive or in its characteristics might mean it doesn’t fit or work as well, which… is basically always the trade-off with liquid cooling. A perfect fit system can outperform air cooling, but air cooling is much more adaptable.”

He then talked about the skill level of Backblaze technicians, saying air-cooled kit is relatively easy to fix if problems occur but liquid cooling needs plumbing skills: “Generally speaking, the datacenter techs are the start of the Operations tree in terms of skill level. We expect them to be able to do things like swap out drives and service systems because it is usually 1-2 screwdrivers worth of tools, and some basic tab/click based connectors. Easy peasy.”

“Liquid cooling, immersion or otherwise, takes a lot more care and non-zero specialized knowledge to make it work right. Having done custom loops for cooling on my home systems in the past, I can tell you I wish I had more plumbing experience when dealing with fittings, getting air bubbles out of the system, etc.”

The liquid cooling technology is a retrofit to a drive chassis rather than something designed in from the start, which would be less costly: “If SMCI (or someone like them) were to come out with systems pre-prepared for this and we just dropped things in at a slight premium, we might be able to make it work. Though… by that time, if we’re starting to swap to SSDs, it kind of becomes less useful again.”

That’s because SSDs have a different form factor and may not take kindly to being drowned in liquid. Also: “SSDs tend to wear out more by usage, and less by temperature and not really at all by vibration. So even if this cooling extended the life from heat damage, it’d likely still be use that killed us first.”

His big issue with immersion cooling is liquid leakage: “I would have… non-zero concerns about catastrophic failure scenarios here. In air-cooling, worst thing that can happen is the fans break, or run in reverse, or some such. This makes the system overheat, but we have monitoring for that, and can just fix it. In any kind of liquid-cooling scenario, catastrophic failure means liquid on components that don’t want to be wet, which is a lot harder to recover from, and depending on what gets wet, may knock us down a lot more hardware.”

His summary of immersion cooling for disk drives is that “for high-volume, low-cost setups (like us), there’s not really any way I can think of that it’d make financial sense, to say nothing of the fit for the existing workforce, datacenters, etc.”

It might make sense “for high-performance, low-quantity setups” and could be interesting for them. But for Backblaze? Not today.