Dell has shown how its MX7000 composable server chassis can be used with Liqid technology to add PCIe gen 4-connected GPUs and other accelerators to the composable systems mix, with an open road to faster still PCIe gen 5, CXL, and external pooled memory.
The four-year-old MX7000 is an 8-bay, 7RU chassis holding PowerEdge MX server sleds (aka blades) that can be composed into systems with Fibre Channel or Ethernet-connected storage. The servers connect directly to IO modules instead of via a mid-plane, and these IO modules can be updated independently of the servers. Cue Liqid upgrading its IO modules to PCIe gen 4.
Liqid has supported the MX7000 since August 2020, with PCIe gen 3 connectivity to GPUs and other accelerators via a PCIe switch. Kevin Houston, a Dell principal engineer and Field CTO, writes: “The original iteration of this design incorporated a large 7U expansion chassis built upon PCIe Gen 3.0. This design was innovative, but with the introduction of PCIe Gen 4.0 by Intel, it needed an update. We now have one.”
He showed a schematic of such a system:
The MX7000 chassis is at the top with eight upright server sleds inside it. A Liqid IO module is highlighted: a PCIe HBA (LQD1416) wired to a Liqid 48-port PCIe gen 4 fabric switch. This connects to a Liqid PCIe gen 4 EX-4400 expansion chassis, which can hold either 10 Gen 4 x16 full-height, double-wide accelerators (EX-4410) or 20 Gen 4 x8 full-height, single-wide accelerators (EX-4420).
The accelerator devices can be GPUs (Nvidia V100, A100, RTX, and T4), FPGAs, SSD add-in cards or NICs.
Houston writes: “Essentially, any blade server can have access to any [accelerator] device. The magic, though, is in the Liqid Command Center software, which orchestrates how the devices are divided up over [PCIe].”
Liqid’s Matrix software allocates accelerators to servers, with up to 20 GPUs shared out across the eight servers in any combination, up to and including all 20 GPUs going to a single server.
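To make the allocation idea concrete, here is a minimal Python sketch of a shared accelerator pool. It is purely illustrative: the `AcceleratorPool` class, its `attach` and `release` methods, and the `sled-N` server names are our own assumptions and do not represent Liqid's software or its API. It only models the bookkeeping of handing 20 devices to eight servers in any combination.

```python
class AcceleratorPool:
    """Hypothetical model of a composable accelerator pool (not Liqid's API)."""

    def __init__(self, device_count, servers):
        # Devices are identified by simple integer IDs for illustration.
        self.free = list(range(device_count))
        self.assigned = {server: [] for server in servers}

    def attach(self, server, count):
        """Move `count` free devices to `server`; fail if too few are free."""
        if server not in self.assigned:
            raise KeyError(f"unknown server: {server}")
        if count > len(self.free):
            raise ValueError(
                f"requested {count} devices, only {len(self.free)} free"
            )
        taken, self.free = self.free[:count], self.free[count:]
        self.assigned[server].extend(taken)
        return taken

    def release(self, server):
        """Return all of a server's devices to the free pool."""
        returned = self.assigned[server]
        self.assigned[server] = []
        self.free.extend(returned)
        return returned


# Eight sleds sharing 20 devices, as in the EX-4420 configuration.
sleds = [f"sled-{i}" for i in range(1, 9)]
pool = AcceleratorPool(20, sleds)

pool.attach("sled-1", 20)      # all 20 devices on a single server
pool.release("sled-1")         # hand them back to the pool
for sled in sleds[:4]:
    pool.attach(sled, 5)       # or five each across four servers
```

In a real composable system the attach step would involve reprogramming the PCIe fabric switch; the sketch leaves that out and tracks ownership only.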
Comment
It seems to us at Blocks & Files that this MX7000 architecture and Liqid partnership means that PCIe gen 5, twice as fast as PCIe gen 4, could be adopted, opening the way to CXL 2.0 and memory pooling.
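The doubling can be checked with some back-of-the-envelope arithmetic. The sketch below uses the published per-lane transfer rates (8, 16, and 32 GT/s for gen 3, 4, and 5) and the 128b/130b line encoding those generations share. The function name `link_bandwidth_gb_s` is our own, and the figures are theoretical per-direction maximums that ignore packet and protocol overhead.

```python
# Raw per-lane transfer rate in GT/s for each PCIe generation.
RAW_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# Gen 3, 4, and 5 all use 128b/130b encoding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(generation, lanes):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    gbit_per_lane = RAW_GT_PER_S[generation] * ENCODING_EFFICIENCY
    return gbit_per_lane / 8 * lanes  # 8 bits per byte


if __name__ == "__main__":
    for gen in (3, 4, 5):
        for lanes in (8, 16):
            print(f"gen {gen} x{lanes}: {link_bandwidth_gb_s(gen, lanes):.1f} GB/s")
```

On these assumptions a gen 4 x16 link works out to roughly 31.5 GB/s in each direction and a gen 5 x16 link to roughly 63 GB/s.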
This would require Dell to equip the MX7000 with PowerEdge servers using Sapphire Rapids (Gen 4 Xeon SP) processors, or PCIe gen 5-supporting AMD CPUs. Liqid would then need a PCIe gen 5 HBA and switch. At that stage, the combined system could provide CXL 2.0 support and memory pooling.
When memory pools exist on CXL fabrics, composability software will be needed to dynamically allocate that memory to servers. Suppliers such as Dell, HPE, Lenovo, and Supermicro could outsource that to third parties such as Liqid, or decide that the technology is core to their products and build it, acquire it, or OEM it.
CXL memory pooling looks likely to be the boost that composability needs to enter mainstream enterprise computing and support use cases such as extremely large machine learning models. How the public cloud suppliers will use memory pooling, both internally and in externally offered memory-pooled compute instances, is an interesting topic to consider.