Nasuni: Copilot has won the Windows Gen AI chatbot race

Analysis: Nasuni execs reveal that they think Microsoft has already effectively won the Gen AI race for Windows users, closing the door on storage companies building their own vectorization facilities to make RAG content available to their own Gen AI chatbots. Nasuni will instead apply Gen AI in its own AIOps facilities.

I talked with three Nasuni execs at a Nasuni user group meeting in London, UK: chief evangelist Russ Kennedy; Jim Liddle, chief innovation officer for data intelligence and AI; and Asim Zaheer, chief marketing officer. The background is that Nasuni's File Data Platform provides file access to distributed users from a central, synchronized, object storage-based cloud repository, with added services including ransomware detection and recovery.

Nasuni wants to use AI to help with automatically analyzing, indexing, and tagging file data. In April it announced that customers could integrate their data stores and workflows with customized Microsoft Copilot assistants.

But would it go further? I suggested that, with large customer file data estates held in the Nasuni cloud, it could build its own Copilot-style chatbot facility to analyze them and respond to customer user inquiries and requests. It could also vectorize this stored data, using its tagging facility to keep track of which data had been vectorized and which had not.
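To make the tagging idea concrete, here is a minimal sketch of how a file store could use metadata tags to track which files have already been vectorized, so an incremental pass embeds only new or untagged files. All names and structures here are illustrative, not Nasuni's actual API, and the embedding function is a stand-in for a real model call.

```python
# Hypothetical sketch: track vectorization state with a per-file tag.
from dataclasses import dataclass, field

@dataclass
class FileRecord:
    path: str
    tags: set = field(default_factory=set)

def fake_embed(text: str) -> list[float]:
    # Stand-in for a real embedding model call
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

def vectorize_pending(records, vector_store):
    """Embed only files not yet tagged as vectorized."""
    for rec in records:
        if "vectorized" in rec.tags:
            continue  # already processed; skip
        vector_store[rec.path] = fake_embed(rec.path)
        rec.tags.add("vectorized")
    return records

records = [FileRecord("a.docx"), FileRecord("b.pdf", tags={"vectorized"})]
store = {}
vectorize_pending(records, store)
# a.docx gets a vector; b.pdf is skipped because it is already tagged
```

The tag acts as a cheap bookkeeping device: re-running the pass is idempotent, which is what makes incremental vectorization of a large file estate tractable.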

A NetApp exec has suggested that such vectorization and indexing could be carried out at the storage layer in the IT stack.

Liddle disagreed with these ideas. In his view, developing a smart chatbot like GPT-4 costs millions of dollars; it would be hugely expensive, and even if Nasuni were to develop its own chatbot, customers might well prefer not to use it. He said, and Kennedy and Zaheer agreed, that Nasuni customers getting aboard the Gen AI train were using Microsoft's Copilot because it's already working across Microsoft 365 components, Windows 10 and 11, and Outlook.

“Customers will only want to use one Copilot,” Liddle said, and not have to change to a separate one for different system software environments. They’ll want a single Gen AI chatbot lens through which to look into their data.

Copilot uses Microsoft’s Semantic Index. This, Microsoft says, “sits on top of the Microsoft Graph, which interprets user queries to produce contextually relevant responses that help you to be more productive. It allows organizations to search through billions of vectors (mathematical representations of features or attributes) and return related results.”
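The core mechanism behind a vector index like this can be sketched in a few lines: documents are mapped to vectors, and a query vector retrieves the nearest ones by cosine similarity. The vectors and filenames below are toy values, not real embeddings or anything specific to the Semantic Index.

```python
# Minimal sketch of vector search by cosine similarity.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy index: filename -> embedding vector
index = {
    "q3-report.docx": [0.9, 0.1, 0.0],
    "holiday-policy.pdf": [0.1, 0.8, 0.2],
    "sales-forecast.xlsx": [0.8, 0.2, 0.1],
}

def search(query_vec, k=2):
    # Rank every indexed document by similarity to the query
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

print(search([1.0, 0.0, 0.0]))
# ['q3-report.docx', 'sales-forecast.xlsx']
```

A production index replaces the linear scan with an approximate nearest-neighbor structure so it can search billions of vectors, but the retrieval idea is the same.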

These Nasuni execs believe customers want a single Gen AI chatbot facility, and that in Windows-land Microsoft’s Copilot and Semantic Index are already effectively in place and incumbent.

My thinking is that several things follow from this. First, Nasuni will not build its own Copilot-like chatbot. That game, for its Windows users, has already been won by Microsoft.

Second, Nasuni will not build its own vectorization facility and vector data store into its File Data Platform. Microsoft’s Copilot uses vectors in the Semantic Index and wouldn’t understand vectors provided by Nasuni from its own vector database. Kennedy observed that there is no industry-standard vector format.

Again, the vectorization and indexing game for Windows users has already been won – by Microsoft Copilot and its Semantic Index. 

Nasuni’s – and its customers’ – best interests are served by making Nasuni-held data available to Copilot and Semantic Index.

A third thought occurred from this as well. Nasuni has customers who use Nutanix HCI and not Microsoft’s Hyper-V server virtualization. It is most unlikely that Nutanix will build its own chatbot, or its own vectorizing and vector storage facilities, for the same reasons as Nasuni. That means Nutanix will have to support external chatbots – and that’s what it’s doing with its GPT-in-a-Box offering.

Liddle said Nasuni will use Gen AI large language models to refine and optimize its own infrastructure for its customers. It tracks every I/O operation (though not the I/O content), and this provides the base data for an LLM to optimize, for example, data placement across a customer’s File Data Platform deployment sites, and further optimize the overall efficiency and cost of that multi-site deployment.
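The kind of I/O-metadata analysis described above can be illustrated with a simple placement heuristic: log which site accesses which file, then recommend caching each file at its busiest site. This is an assumption-laden sketch, not Nasuni's implementation; an LLM-driven AIOps layer would presumably consume richer summaries than this.

```python
# Illustrative sketch: per-I/O metadata (file, accessing site) drives
# a recommendation of where each file's hot copy should live.
from collections import Counter, defaultdict

# Toy I/O log: (filename, site) pairs, no file content involved
io_log = [
    ("design.cad", "london"), ("design.cad", "london"),
    ("design.cad", "boston"), ("budget.xlsx", "boston"),
]

def placement_recommendations(log):
    per_file = defaultdict(Counter)
    for fname, site in log:
        per_file[fname][site] += 1
    # Place each file's hot copy at the site that reads it most
    return {f: sites.most_common(1)[0][0] for f, sites in per_file.items()}

print(placement_recommendations(io_log))
# {'design.cad': 'london', 'budget.xlsx': 'boston'}
```

The point is that access metadata alone, with no knowledge of file contents, is enough signal to optimize placement across a multi-site deployment.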

All of the above leads me to think that running vectorization and vector indexing in an external storage filer or SAN – or object system – is a misplaced idea in Windows environments, and will not happen.