Nutanix's growth is getting an assist from the move to hybrid cloud and Gen AI adoption, and it will support Nvidia's fast GPU Direct data feed protocol.
An interview with Nutanix CEO Rajiv Ramaswami revealed that he thinks the company's double-digit growth rate can continue for some time – because of its superior hybrid cloud software, and because generative AI provides another growth driver.

Nutanix says it has brought its on-premises enterprise infrastructure capabilities – for customers running either virtual machines or containers – to the public clouds, to provide a consistent hybrid cloud experience. We explored aspects of these points.
Blocks & Files: Could you describe Nutanix’s Kubernetes facilities across the on-premises and the public cloud environments?
Rajiv Ramaswami: We manage Kubernetes in a multi-cloud world. We expect customers to run these applications everywhere. They might be running on native EKS or AKS [and] we have a unified management plane across all of these – on-premises and the public cloud-native offerings, including Amazon's Elastic Kubernetes Service and the Azure Kubernetes Service – and some might be running on our own platform.
Our vision is that people can run Kubernetes clusters anywhere and we'll be the manager. We also provide storage for blocks, files and objects. … If somebody is building an application on AWS, they will be able to use our storage and it's completely cloud-native. Customers could be using our storage services in the public cloud as an alternative to EBS.
The advantage would be twofold. First, the same Nutanix platform will be available everywhere: cloud, across multiple clouds, on-prem, etc. So you don't have to re-platform applications if you're thinking of working this way. Number two is that AOS has a lot of built-in enterprise-grade resiliency features. We do disaster recovery across clouds and globally. We do synchronous replication. All of those capabilities now become available to you in the public cloud. Whereas AWS, for example, with EBS, doesn't provide these services.
What typically happens in those scenarios is that, if somebody is building an application [in the cloud], they have to manage all the resiliency aspects at the application layer. They have to build it into the app. Whereas the typical enterprise relies on the underlying infrastructure. And so we basically provide the same enterprise mission-critical storage in the public cloud [as for enterprise on-prem environments].
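To picture that consistency claim from the Kubernetes side: an application would request storage the same way wherever it runs. Below is a minimal, hypothetical sketch using the Kubernetes Python client – the storage class name "nutanix-volume" is a placeholder for whatever the CSI driver actually registers, not a documented Nutanix identifier.

```python
# Hypothetical sketch: the same PersistentVolumeClaim works against any
# cluster in the kubeconfig, on-prem or in a public cloud, provided a CSI
# driver has registered the storage class. "nutanix-volume" is a
# placeholder name, not a documented Nutanix identifier.
from kubernetes import client, config

config.load_kube_config()  # point this at any cluster: on-prem, EKS, AKS
core = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="app-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="nutanix-volume",  # placeholder class name
        resources=client.V1ResourceRequirements(
            requests={"storage": "100Gi"}
        ),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```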
Blocks & Files: I believe you want to make it easier for enterprises to run databases in the hybrid cloud as well?
Rajiv Ramaswami: This is about the infrastructure layer, the platform. And that has to do with the other components that most apps need to use: databases, caching, messaging or streaming, for example. And this is our grand vision, forward-looking.
Today, if you look at Nutanix, we already provide a database management service. So people can manage a range of databases with our platform – Oracle, SQL Server, Mongo, Postgres. And what we want to do is to expand that: first of all, make it available in the public cloud as well, and also expand the range of services.
So we could be managing Kafka for streaming or Redis for caching. The notion, looking forward, is that we will either offer these ourselves or partner with people – we partner with EDB for Postgres, for example – to be able to offer a range of what are called platform-layer data services that people can use to build these applications. And once we do that, those services can also be made available everywhere.
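What "platform-layer data services available everywhere" means for application code can be sketched simply: the app talks to managed Postgres and Redis endpoints, and only those endpoints change between environments. The snippet below is illustrative only – hostnames and schema are invented, and it assumes standard open source client libraries rather than any Nutanix-specific API.

```python
# Illustrative sketch: this read-through cache code is identical whether
# the managed Postgres and Redis instances run on-prem or in a public
# cloud – only the endpoints change. Hostnames and the "orders" schema
# are invented placeholders.
import os

import psycopg2   # pip install psycopg2-binary
import redis      # pip install redis

pg = psycopg2.connect(
    host=os.environ.get("PG_HOST", "postgres.example.internal"),
    dbname="orders",
    user="app",
    password=os.environ["PG_PASSWORD"],
)
cache = redis.Redis(host=os.environ.get("REDIS_HOST", "redis.example.internal"))

def get_order_status(order_id: int) -> str:
    """Read-through cache: try Redis first, fall back to Postgres."""
    key = f"order:{order_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached.decode()
    with pg.cursor() as cur:
        cur.execute("SELECT status FROM orders WHERE id = %s", (order_id,))
        row = cur.fetchone()
    status = row[0] if row else "unknown"
    cache.set(key, status, ex=60)  # cache for a minute
    return status
```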
Blocks & Files: Will these database services move to include vector databases?
Rajiv Ramaswami: The vision is broad. We don't have a roadmap of everything. I'm focused right now on transactional databases. And I know where you're going – you're going to AI and GPT in a Box. So, absolutely, yes – but in the long term. We haven't announced timelines or anything like that. But the vision is about really creating a set of services that are available everywhere.
We are not a [database] engine provider. Yep, we may choose to do that sometime down the road. But right now we partner with other database providers.
Blocks & Files: You store a great deal of information for your customers, wherever they happen to have their data – on-premises or in any of the public clouds. That information is going to be needed by large language models (LLMs), which are helped by retrieval-augmented generation (RAG). What are you going to do to help that happen?
Rajiv Ramaswami: I'm not going to announce a new roadmap, but you're on the right track.
That's exactly our goal. It is exactly what you said: the data is available, and we can be the platform to manage the data. We do think data is going to be everywhere – not just in the public cloud, not just on-prem. And this whole launch of GPT in a Box was exactly to try and simplify deployment of AI applications on our platform.
Today, the scope of it is a little bit more limited: we have the platform, which gives you all the storage pieces. On top of that, what we are able to do is to provide automatic workflow connectivity into model repositories. So we can get into Hugging Face. We can connect to these model repositories. With one click, people can download a model that they want, instantiate it on the hardware, and associate it with the GPU. … Create an inference endpoint and expose that through an API to developers – so that people can automate it.
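The workflow he describes – pull a model from a repository, place it on a GPU, wrap it in an inference endpoint behind an API – looks roughly like the following in open source terms. This is a generic sketch using Hugging Face's transformers library and FastAPI, not Nutanix's GPT in a Box implementation; the model name is just an example.

```python
# Generic sketch of the described workflow: download a model from the
# Hugging Face hub, load it onto a GPU, and expose an inference endpoint
# as an HTTP API. Not Nutanix's implementation; model name is an example.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

# Fetches the model from the hub on first use and places it on GPU 0
# (requires a CUDA-enabled torch install); any text-generation model works.
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    device=0,
)

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 128

@app.post("/generate")
def generate(prompt: Prompt) -> dict:
    out = generator(prompt.text, max_new_tokens=prompt.max_new_tokens)
    return {"completion": out[0]["generated_text"]}
```

Save as app.py, run with `uvicorn app:app`, and POST a JSON prompt to /generate – that is the "inference endpoint exported as an API" step in miniature.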
Blocks & Files: Okay, that will take care of AI inferencing and also fine-tuning. How about the training side of it? Because, again, you've got an enormous amount of information which you could feed to GPUs?
Rajiv Ramaswami: GPU Direct is on our roadmap. We will have GPU Direct, especially for files – that is really where you need GPU Direct. It's on the roadmap – we know what's needed. The other thing that's needed is high-bandwidth I/O. We are now supporting 100 gig NICs. Clearly, you need to be able to ingest a lot of data very quickly, and we understand that and we're enabling that. … We're doing high bandwidth … and then a machine with large memory also. All of these things play a role.
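For context on what GPU Direct buys: with GPUDirect Storage, a file read lands directly in GPU memory instead of staging through a bounce buffer in host RAM. A minimal client-side sketch using NVIDIA's kvikio library and CuPy – illustrating the general technique, not Nutanix's forthcoming support – looks like this; the file path is a placeholder.

```python
# Sketch of a GPUDirect Storage read using NVIDIA's kvikio library:
# the file's contents are DMA-transferred straight into GPU memory.
# The path is a placeholder; this shows the technique, not Nutanix's API.
import cupy
import kvikio

gpu_buf = cupy.empty(256 * 1024 * 1024, dtype=cupy.uint8)  # 256 MiB on the GPU

with kvikio.CuFile("/mnt/dataset/shard-000.bin", "r") as f:
    nbytes = f.read(gpu_buf)  # storage-to-GPU transfer, bypassing host RAM

print(f"read {nbytes} bytes into device memory")
```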
Blocks & Files: Do you think that AI inferencing has got to deliver accurate, complete, and not fake results?
Rajiv Ramaswami: Absolutely. You're going to need to have accuracy in the models. Absolutely. And this is one of the areas where there's a lot of work required to make sure you're getting that accuracy.
A lot of the initial use cases for Gen AI are going to be assisted use cases … that’s what we mean by Copilot in a broad sense. In other words, use it, but verify before you actually do it. So I’ll give you one example of a use case that we’re doing internally. It’s in customer support.
We've deployed models using our design documents and our knowledge base articles. When a support engineer gets an incoming customer request, he or she types it into our GPT engine, and it makes recommendations along the lines of "it could be this problem here, and this is what you might want to do." The idea is that it speeds up our time to respond and increases the productivity of our support engineers. So it's better service for the customer and better for us.
And this is working well. We are just finishing up our pilot of it. But the most important thing we had to do here was to keep training and fine-tuning it until we could get reasonable accuracy.
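The pipeline he outlines is a textbook RAG loop: embed the knowledge base once, retrieve the closest articles for each incoming ticket, and hand them to the generative model as context. Below is a minimal sketch of that pattern with invented article snippets – the embedding model named is a common open source default, and the final prompt stands in for whatever fine-tuned model Nutanix actually runs.

```python
# Minimal RAG sketch of the support workflow: embed knowledge base
# articles once, retrieve the nearest ones for an incoming ticket, and
# build a context-grounded prompt. Articles and ticket text are invented.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # common default, an example

kb_articles = [
    "CVM memory pressure: increase CVM RAM when metadata services restart...",
    "Disk removal stuck: check Curator scan status before retrying...",
    "Cluster expansion fails: verify IPMI credentials and network segmentation...",
]
kb_vectors = embedder.encode(kb_articles, normalize_embeddings=True)

def top_k(ticket: str, k: int = 2) -> list[str]:
    """Return the k knowledge base articles most similar to the ticket."""
    q = embedder.encode([ticket], normalize_embeddings=True)[0]
    scores = kb_vectors @ q  # cosine similarity, since vectors are normalized
    return [kb_articles[i] for i in np.argsort(scores)[::-1][:k]]

ticket = "Node expansion keeps failing with an authentication error"
context = "\n".join(top_k(ticket))
prompt = f"Context:\n{context}\n\nCustomer issue: {ticket}\nSuggested diagnosis:"
# prompt would now go to the fine-tuned model to draft a recommendation
```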
Blocks & Files: My thinking is that in general, the RAG approach absolutely has to work or AI inferencing dies a death.
Rajiv Ramaswami: 100 percent yes.