Oxide on-prem cloud computer reinvents the server rack

Startup Oxide has delivered a rack-level system providing cloud-style computing on premises as its first commercial product.

Oxide was founded in September 2019 by datacenter heavyweights CTO Bryan Cantrill, CPO Jessie Frazelle, and CEO Steve Tuck. It has had three funding rounds to date: a $20 million seed round in 2019; a $30 million A-round in September 2022; and a $44 million continuation A-round this month that coincides with its first product launch. All the rounds were led by Eclipse Ventures.

Cantrill was CTO at software and services company Joyent and a distinguished engineer at Sun Microsystems/Oracle before that. Frazelle, who left in July 2022 to co-found KittyCAD, was a software engineer at Docker, Mesosphere, Google, Microsoft, and GitHub. Tuck was president and COO at Joyent, where he was also SVP of worldwide sales, and held sales roles at Dell before that.

Oxide’s purpose is to build what it calls a commercial cloud computer: a cloud-scale, cloud-style, rack-level computing system that runs on premises.

Oxide rack

The Oxide cloud computer combines compute, storage, and networking elements as sleds in a plug-and-play rack. It includes the open source software needed to build, run, and operate a cloud-like infrastructure.

This software includes the Propolis hypervisor, the Nexus control plane, the Crucible distributed block storage system, IAM (identity and access management), and the OPTE (Oxide Packet Transformation Engine) self-service network fabric. The software is anchored to a hardware root of trust.

The distributed block storage is based on OpenZFS and has configurable capacity and IOPS per volume. Volume size can be scaled on demand, and there is redundancy for high availability. The system can integrate with external storage across a network link. Crucible provides instantaneous, point-in-time virtual snapshots for recovery and off-rack backup.

Oxide says OpenZFS checksums and scrubs all data for early failure detection. Virtual disks constantly validate the integrity of user data, correcting failures as they are discovered. There is automated rebalancing of data to preserve redundancy in the event of drive or sled removal.
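The checksum-and-repair idea described above can be illustrated with a toy example. This is only a minimal sketch of the general technique, not Oxide’s, Crucible’s, or OpenZFS’s actual implementation: the `MirroredBlock` class, the use of SHA-256, and the three-replica default are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest used to detect silent corruption."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical block kept as several replicas plus one expected checksum."""

    def __init__(self, data: bytes, copies: int = 3):
        self.expected = checksum(data)
        self.replicas = [bytearray(data) for _ in range(copies)]

    def scrub(self) -> int:
        """Verify every replica and rewrite bad ones from an intact copy.

        Returns the number of replicas that had to be repaired.
        """
        good = next(
            (bytes(r) for r in self.replicas if checksum(bytes(r)) == self.expected),
            None,
        )
        if good is None:
            raise IOError("no intact replica left to repair from")
        repaired = 0
        for i, replica in enumerate(self.replicas):
            if checksum(bytes(replica)) != self.expected:
                self.replicas[i] = bytearray(good)
                repaired += 1
        return repaired


block = MirroredBlock(b"user data")
block.replicas[1][0] ^= 0xFF  # simulate a flipped byte on one replica
print(block.scrub())  # 1: the damaged replica is detected and rewritten
print(block.scrub())  # 0: a second pass finds nothing left to fix
```

A periodic scrub of this kind finds corruption before the data is next read, which is the "early failure detection" the vendor describes.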

Pools of resources are available through APIs, a CLI, or a web-based UI.

Each sled contains an AMD processor, DRAM, and NVMe SSD storage. Each sled is slid into place and needs no wiring, we’re told. There can be 16, 24, or 32 sleds in a delivered rack. A compute sled has an AMD Milan EPYC 64-core CPU, 16 x DDR4 DIMM slots providing 512 GiB or 1 TiB of memory, and up to 10 x 3.2 TB (2.91 TiB) U.2 NVMe SSDs, or 32 TB of raw capacity per sled. A fully populated 32-sled rack therefore provides a maximum of 32 x 32 TB = 1,024 TB of raw storage capacity. There is a 100 GbE link to the rack’s network switch.
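The capacity figures follow from simple multiplication, and the same arithmetic gives the raw totals for the smaller rack configurations. The short sketch below uses only the numbers quoted in this article; the function and constant names are illustrative, and the results are raw capacity before any redundancy overhead.

```python
# Figures quoted in the article; names are illustrative only.
DRIVE_TB = 3.2              # decimal terabytes per U.2 NVMe SSD
DRIVES_PER_SLED = 10        # maximum drives per compute sled
SLED_COUNTS = (16, 24, 32)  # rack configurations on offer

TB_TO_TIB = 10**12 / 2**40  # decimal terabytes to binary tebibytes


def raw_capacity_tb(sleds: int,
                    drives_per_sled: int = DRIVES_PER_SLED,
                    drive_tb: float = DRIVE_TB) -> float:
    """Raw (pre-redundancy) capacity in TB for a rack with `sleds` sleds."""
    return sleds * drives_per_sled * drive_tb


for sleds in SLED_COUNTS:
    tb = raw_capacity_tb(sleds)
    print(f"{sleds} sleds: {tb:,.0f} TB raw ({tb * TB_TO_TIB:,.0f} TiB)")
```

Usable capacity will be lower than these raw totals once redundancy is taken into account; the article does not give a usable figure.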

There are two network switches. Each has an Intel Tofino 2 processor with 6.4 Tbps throughput, 32 x 40/100/200 GBASE QSFP-28 uplink ports, and 32 x 100 GBASE-KR4 backplane ports.

The rack switch has parts for three networks: an up to 12 Tbps programmable Ethernet ASIC, a secondary GigE switch ASIC, and an FPGA driving a proprietary low-level protocol for board control of other systems in the rack. The switch is connected via a PCIe link to a compute node for management.

Oxide compute sled. The green items are the NVMe drives

The rear of the rack has a DC busbar and a cabled backplane with blindmated networking. This means self-aligning connectors slide or snap into position as a sled is installed in the rack. There are no power or network cables to plug or unplug.

A blog post by Cantrill says:

  • Cloud computing is the future of all computing infrastructure.
  • The computer that runs the cloud should be able to be purchased and not merely rented.
  • Building a cloud computer necessitates a rack-level approach – and the co-design of both hardware and software.

In his view, “the rental-only model for the cloud is not sustainable.” A cloud computer has to be rack-scale and “one must break out of the shackles of the 1U or 2U server, and really think about the rack as the unit of design.”

That helps explain the blindmating. “This is a domain in which we have leapfrogged the hyperscalers, who (for their own legacy reasons) don’t do it this way,” he says.

Oxide claims its rack is up to 35 percent more energy efficient than traditional server racks.

The Oxide rack ships with everything installed and can be set up in around four hours. Customers can use Kubernetes or cloud software tools like Terraform to deploy and configure workloads.

Oxide compute sled NVMe drive

Customers are said to include a US federal agency, the Idaho National Laboratory, and a financial services business. Several Fortune 1000 companies are said to be interested.

In effect, Oxide wants to replace Dell, HPE, and Supermicro on-premises racks with its own hyperconverged infrastructure rack that has built-in public cloud facilities. Oxide says its rack uses less power and is far easier to own and operate, in both hardware and software terms, than traditional server and HCI racks.
