SK Hynix’s SSD business Solidigm has unwrapped an SSD liquid cooling technology it says will result in smaller, fan-less GPU servers while increasing storage density.
GPU servers, and the datacenters that house them, are becoming hellishly hot. If powering the mega chips and associated storage and networking infrastructure wasn’t enough, the need for liquid cooling is increasing the power draw, reducing the real estate available for servers, and making datacenter design increasingly complex.
At the same time, Solidigm senior director AI and leadership marketing Roger Corell said a typical GPU-based AI server carried about 30TB of storage across eight slots. “And we don’t see any reason why that capacity per server growth will not continue at a high level,” he said.
That storage element is usually air cooled, added Avi Shetty, senior director AI market enablement and partnerships. But as density increases, this becomes a problem. The SSDs themselves get hot, raising the risk of shutdowns, while traditional cooling technologies, i.e. heat sinks and fans, put a brake on increasing density or reducing server size.
Solidigm’s answer is the D7 PS110 E1.S, which is a 9.5mm form factor SSD and a “cold plate technology kit”, aimed at the direct attached storage element in an AI setup.
Shrinking the drive to 9.5mm provides space for the cold plate, which is connected to and chilled by the liquid cooling system already supplying the server. That means the drives themselves are still hot-swappable. And it allows for more density in a 1U rack.
The cold plate only touches one side of the drive. However, as Shetty explained, there are components on both sides of the drive, and chilling one side only isn’t really an option.
“What we have is an innovation which allows for heat to dissipate from the front side as well as on the back side,” he said.
This results in “an overall thermal mechanism, which allows us to go at much higher watts at the platform level, up to 30 watts, while maintaining full bandwidth of performance.”
There are some liquid-cooled consumer devices available, and other storage companies have demonstrated professional devices with integrated cooling. However, Solidigm claims its tech will be the first enterprise device to feature “complete liquid cooling”.
The technology, developed in collaboration with an as yet unnamed partner, is aimed at “future AI servers” – so no retrofitting. Solidigm has not set a precise launch date, beyond the second half of this year.
Solidigm is still working through the exact impact of the cooling technology on overall power consumption.
But Corell said there was potential to save power in a number of ways. “One, you don’t need to power the fans. And two, you don’t need potentially as low an ambient air temperature inside the aisles of racks to, you know, pull that cooler air over storage, so lowering both HVAC and cooling power requirements.”
Shetty told us, “The liquid cooling loop for storage can be in parallel to CPU/GPU without affecting the cooling efficiency. With fans removed, we expect overall cooling efficiency to be improved at the server level along with other TCO improvements.”