Nutanix puts ChatGPT in a turnkey box

Nutanix has launched a turnkey GPT-in-a-Box offering for customers to run large language model AI workloads on its hyperconverged software platform.

Update, 5 Sep 2023: Nutanix does not support Nvidia’s GPUDirect protocol.

GPT (Generative Pre-trained Transformer) is a type of machine learning large language model (LLM) that can interpret text requests and questions, search through multiple source files, and respond with text, image, video, or even software code output. Sparked by OpenAI’s ChatGPT, organizations worldwide are considering how adopting LLMs could improve marketing content creation, make customer chatbot interactions more effective, put data-science capabilities in the hands of ordinary researchers, and save costs while doing so.

Greg Macatee, a senior research analyst in IDC’s Infrastructure Systems, Platforms and Technologies Group, said: “With GPT-in-a-Box, Nutanix offers customers a turnkey, easy-to-use solution for their AI use cases, offering enterprises struggling with generative AI adoption an easier on-ramp to deployment.”

Nutanix wants to make it easier for customers to trial and use LLMs, so it has put together a software stack comprising its Nutanix Cloud Infrastructure, Nutanix Files and Objects storage, and Nutanix AHV hypervisor and Kubernetes (K8s) software, with Nvidia GPU acceleration. The Cloud Infrastructure base is a software stack in its own right, spanning compute, storage, networking, hypervisors, and containers in public or private clouds. GPT-in-a-Box scales from edge to core datacenter deployments, we’re told.

The GPU acceleration relies on Nutanix’s Karbon Kubernetes environment, which supports GPU passthrough mode for containerized workloads.
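To make the passthrough idea concrete, here is a minimal, hypothetical sketch of scheduling a GPU-backed pod on a Kubernetes cluster with the official Python client. Nutanix hasn’t published this workflow; the pod name, image, and namespace are illustrative placeholders, and the `nvidia.com/gpu` resource is the standard Nvidia device-plugin mechanism rather than anything Karbon-specific.

```python
# Hypothetical sketch: request one whole passed-through GPU for a pod
# via the standard Nvidia Kubernetes device plugin. All names are
# placeholders, not Nutanix or Karbon defaults.
from kubernetes import client, config

config.load_kube_config()  # read the local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-gpu-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/pytorch:23.07-py3",  # example CUDA image
                command=["python", "-c",
                         "import torch; print(torch.cuda.is_available())"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one full GPU, passthrough-style
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

In passthrough mode a pod gets exclusive use of the physical GPU, in contrast to the shared vGPU arrangement described in the bootnote below.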

Thomas Cornely, SVP, Product Management at Nutanix, said: “Nutanix GPT-in-a-Box is an opinionated AI-ready stack that aims to solve the key challenges with generative AI adoption and help jump-start AI innovation.” 

We’ve asked what the “opinionated AI-ready stack” term means, and Nutanix’s answer is: “The AI stack is ‘opinionated’ as it includes what we believe are the best-in-class components for the model runtimes: Kubeflow, PyTorch, TorchServe, etc. The open source ecosystem is fast moving, and detailed knowledge of these projects allows us to ensure the right subset of components are deployed in the best manner.”

Nutanix is also providing services to help customers size their clusters and deploy its software with open source deep learning and MLOps frameworks, an inference server, and a select set of LLMs such as Llama 2, Falcon GPT, and MosaicML.
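Nutanix hasn’t detailed the exact serving pipeline, but as a rough sketch under common open source assumptions, running inference against one of these models often looks like the following, using Hugging Face’s transformers library with a Llama 2 checkpoint as an illustrative example (the model ID, prompt, and settings are placeholders, not Nutanix defaults):

```python
# Illustrative sketch of local inference with an open LLM.
# Model ID, prompt, and generation settings are examples only.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # gated checkpoint; needs access approval
    device_map="auto",                      # spread layers across available GPUs
)

result = generator(
    "Summarize the key risks in our Q3 incident reports.",
    max_new_tokens=128,
    do_sample=False,  # deterministic output for repeatable tests
)
print(result[0]["generated_text"])
```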

Data scientists and ML administrators can consume these models through their choice of application, an enhanced terminal UI, or a standard CLI. The GPT-in-a-Box system can run other GPT models and fine-tune them using internal data accessed from Nutanix Files or Objects stores.
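Nutanix Objects exposes an S3-compatible API, so feeding internal documents into a fine-tuning job could look something like this sketch using boto3; the endpoint, bucket, and credentials are hypothetical placeholders, not values from Nutanix.

```python
# Hypothetical sketch: pull a fine-tuning corpus from an S3-compatible
# object store such as Nutanix Objects. Endpoint, bucket, and keys are
# placeholders, not real values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.internal",  # assumed Objects endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

documents = []
for obj in s3.list_objects_v2(Bucket="finetune-corpus").get("Contents", []):
    body = s3.get_object(Bucket="finetune-corpus", Key=obj["Key"])["Body"].read()
    documents.append(body.decode("utf-8"))

print(f"Loaded {len(documents)} documents for fine-tuning")
```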

Gratifyingly for Nutanix, a recent survey found that 78 percent of its customers were likely to run their AI/ML workloads on Nutanix Cloud Infrastructure. As if by magic, that bears out IDC’s supporting quote above.

Nutanix also wants us to know it has credibility in the AI and open source AI communities, pointing to its:

  • Participation in the MLCommons (AI standards) advisory board
  • Co-founding and technical leadership in defining the ML Storage Benchmarks and Medicine Benchmarks
  • Serving as a co-chair of the Kubeflow (MLOps) Training and AutoML working groups at the Cloud Native Computing Foundation (CNCF)

Get more details about this GPT-in-a-Box software stack from Nutanix’s website.

Bootnote. Nutanix’s Acropolis software supports Nvidia’s vGPU feature, in which a single GPU is shared among accessing client systems, each seeing its own virtual GPU. It does not support Nvidia’s GPUDirect protocol for direct access to NVMe storage, which bypasses a host CPU and its memory (the bounce buffer).