View’s AI data-as-a-service: part 2

View is developing an AI that can ingest proprietary data sources rapidly, and that can be deployed and operated more simply than existing generative AI large language model chatbots with retrieval-augmented generation (RAG). We examined its semantic search and universal data representation (UDR) technologies a few days ago; now we look at how they can be used to provide faster and easier-to-use AI-powered insight into an organization’s data.

With semantic cells, UDR, and graph representations all indexed in its Lexi catalog, View has a basic ability to do searches across data sources to find similarities.

View co-founder and CEO Joel Christner loaded PDF documents into a View bucket, and explained: “So I’ve uploaded roughly 30 PDF files into this bucket here, and they’ve been ingested and processed. I’m just going to do a quick search for the word Botox, and I pulled back every document that has the word Botox in it.”

He used View’s Assistant function to do this. Within its search bar, “you can search not only for terms, but also content types, owners, time ranges, repositories, schema, conditions, value conditions. There’s eight, nine different dimensions that you can search on. You can actually come up with some very, very hierarchical queries. … It’s using UDR as its base.”
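
View hasn’t published its search API here, but conceptually a hierarchical Lexi query spanning those dimensions might look something like the sketch below; the field names are illustrative assumptions, not View’s actual schema.

```python
# Illustrative multi-dimension search query; every field name here is an
# assumption made for the sake of the sketch, not View's documented API.
query = {
    "terms": ["Botox"],                          # full-text search terms
    "content_types": ["application/pdf"],        # restrict to PDFs
    "owners": ["jchristner"],                    # document owner
    "time_range": {"after": "2024-01-01"},       # ingestion window
    "repositories": ["clinical-docs"],           # source bucket or repository
    "schema_conditions": [                       # UDR schema condition
        {"field": "document.language", "equals": "en"}
    ],
    "value_conditions": [                        # UDR value condition
        {"field": "page_count", "greater_than": 5}
    ],
}
```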

Joel Christner

You can place the search results in a set called a knowledge base. “Let’s assume that I found these PDFs related to Botox. I can then specify that … I’m going to vectorize and store them in this particular knowledge base … which defines a vector repository.”
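
In API terms, the two steps (find the documents, then vectorize them into a named knowledge base) might look roughly like this; the endpoint paths and payload shapes are our own illustrative assumptions, not View’s documented interface.

```python
import requests  # the endpoint paths and payloads below are illustrative assumptions

BASE = "http://view.internal:8000"           # assumed on-prem View API address
matching_docs = ["doc-0001", "doc-0017"]     # IDs returned by the Botox search

# Create a knowledge base, which defines a vector repository...
kb = requests.post(f"{BASE}/v1/knowledge-bases",
                   json={"name": "botox-docs"}).json()

# ...then vectorize the matching PDFs and store their embeddings in it.
requests.post(f"{BASE}/v1/knowledge-bases/{kb['id']}/documents",
              json={"document_ids": matching_docs})
```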

“Once I save those to that knowledge base, I can then go over to View Assistant. And in View Assistant … I can choose my knowledge base. I can choose the large language model that I want to use, and if I don’t already have it on the system, it will download it to my system so that none of this data ever leaves my company.”

“I’m going to choose Llama 3.1 here, and I’m going to tweak some of the nerd knobs. We give you all the RAG nerd knobs that you would ever want. And we’ve got probably four patents filed just on some of the really amazing things that we’re able to do from a RAG perspective, since we have metadata, graph and vector all in one. We’re not just doing RAG on the embedding space. We’re doing RAG with metadata, graph and vector, which is unique.”

“I’m going to tweak my nerd knobs here. I’m pointed at that knowledge base. I’m just going to ask, what information do you have about Botox? Now, the first time you do this, it has to load the model up into the system. It loaded it up and voilà, it is using our RAG pipeline with my source data to provide a conversational experience and allow me to ask questions of that data.”
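
View doesn’t enumerate its “nerd knobs” in the demo, but the kind of settings involved can be sketched as a configuration block; the parameter names and defaults below are illustrative guesses rather than View’s actual options.

```python
# Illustrative RAG "nerd knob" settings; the names and defaults are guesses,
# not View's actual parameter set.
rag_settings = {
    "knowledge_base": "botox-docs",
    "model": "llama3.1",          # pulled down locally on first use
    "temperature": 0.1,           # keep answers close to the source material
    "top_k_chunks": 10,           # how many chunks are passed as context
    "rerank": {
        "vector": True,           # embedding-space re-ranking
        "graph": True,            # re-rank by graph relationships
        "metadata": True,         # re-rank by document metadata
    },
}
question = "What information do you have about Botox?"
```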

View can, Christner explained, provide more accurate model responses because it uses more than vector embeddings to formulate them. Referring to the image above: “These are the documents and the semantic cells and the chunks that were used to formulate this completion.”

We should understand two things: “One is most of the optimizations that you see in RAG happen in vector space. They’re going to do re-ranking based on vectors, and that’s great. It’s very effective. We do that as well, but we can also do re-ranking based on graph relationships and metadata. So not only can we re-rank vectors, we can re-rank graph objects. We can re-rank documents themselves, which we feel puts us in a much better position to start approaching that 95, 98, 99, 99.9 percent accuracy that people are looking to get.”
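
View hasn’t disclosed how it blends those signals, but the general idea of re-ranking on more than embeddings can be sketched as a weighted score per candidate chunk; the weights and scores below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    chunk_id: str
    vector_score: float    # cosine similarity in embedding space
    graph_score: float     # e.g. proximity to other hits in the graph
    metadata_score: float  # e.g. recency, owner or content-type match

def hybrid_score(c: Candidate,
                 w_vec: float = 0.6,
                 w_graph: float = 0.25,
                 w_meta: float = 0.15) -> float:
    # Weighted blend of the three signals; the weights are invented.
    return (w_vec * c.vector_score
            + w_graph * c.graph_score
            + w_meta * c.metadata_score)

candidates = [
    Candidate("doc1-cell3-chunk7", 0.82, 0.40, 0.90),
    Candidate("doc2-cell1-chunk2", 0.79, 0.75, 0.60),
]
ranked = sorted(candidates, key=hybrid_score, reverse=True)
```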

Keith Barto

Co-founder and chief product and revenue officer Keith Barto told me: “Another thing we can do is rank chunks. Because we have those semantic cells and their chunks, if you had a minimal number of documents – say it only came back with three documents, but you wanted to rank by the top ten. We don’t have ten documents, right, but I do have ten chunks, or 20 chunks or 30 chunks across two or three documents. So the more refined I can get in the RAG process, because we have UDR, because we have semantic cells, because we have graph, because we have the source documents, I can actually get even more focused on the actual relationship between those semantic cells and semantic chunks in relation to the question that I’m asking the prompt.”
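
A toy example of the point: with only three matching documents, a top-ten list can still be assembled by ranking their chunks individually (the scores here are made up).

```python
# Only three documents matched, but a "top ten" can still be built by ranking
# their chunks individually. The (document, score) pairs are invented.
chunks = [
    ("doc1", 0.91), ("doc1", 0.72), ("doc1", 0.55), ("doc1", 0.40),
    ("doc2", 0.88), ("doc2", 0.66), ("doc2", 0.61),
    ("doc3", 0.83), ("doc3", 0.79), ("doc3", 0.47), ("doc3", 0.35),
]
top_ten = sorted(chunks, key=lambda c: c[1], reverse=True)[:10]
```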

With View, you can give specific users or groups of users access to restricted source data so that they can ask questions of it without seeing data they shouldn’t see and without exposing it outside the organization’s perimeter.

Christner demonstrated: “We exposed the entire back end of this experience through a single API, and you can click this View API and this is what you need to do to be able to have this chat experience.”

“I embedded it into a standalone HTML file. … Somebody could take this file and embed it on their intranet, and it’s pointed at a specific knowledge base with those same specific nerd knobs set the way that I had set them inside the dashboard.”
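
The actual embed is HTML, but the kind of call such a standalone page would make against that single API can be sketched in a few lines; the URL and payload fields are assumptions, not View’s published interface.

```python
import requests  # the endpoint URL and payload shape are illustrative assumptions

# Minimal sketch of the call a standalone chat page might make against the
# single View API, pinned to one knowledge base and one set of knob settings.
resp = requests.post(
    "http://view.internal:8000/v1/assistant/completions",  # assumed on-prem URL
    json={
        "knowledge_base": "botox-docs",
        "model": "llama3.1",
        "prompt": "What information do you have about Botox?",
    },
    timeout=60,
)
print(resp.json())
```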

Barto added: “So think HR department, sales, customer support, right, all those individual knowledge bases you want to keep isolated because HR data is going to have personally identifiable information, salaries, benefits information … that needs to stay in HR. Whereas sales, mining and opportunities needs to stay in sales, and all your patent stuff needs to stay in engineering. Now you can use your company’s RBAC, your access controls, to keep that data isolated and those knowledge bases all separated.”
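
The mechanics of that separation reduce to a mapping of groups to knowledge bases; this sketch is generic RBAC logic, not View’s implementation, and the group and knowledge-base names are hypothetical.

```python
# Generic illustration of keeping knowledge bases separated along RBAC group
# lines; the group and knowledge-base names are hypothetical.
kb_access = {
    "hr":          ["hr-benefits", "hr-salaries"],
    "sales":       ["sales-opportunities"],
    "engineering": ["patent-filings"],
}

def allowed_knowledge_bases(user_groups: list[str]) -> set[str]:
    # A user can only query knowledge bases mapped to a group they belong to.
    return {kb for group in user_groups for kb in kb_access.get(group, [])}

print(allowed_knowledge_bases(["sales"]))   # {'sales-opportunities'}
```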

A customer’s governance controls are enforced from an RBAC perspective. Christner amplified the point: “AI opens a Pandora’s box for security. That’s not a game that we want to be playing. I think we’re in an advantageous position, because we’re deployed on-prem inside a customer’s infrastructure.”

“We give you the ability to deploy and secure an AI interaction with that data along those same boundaries within the grain of your infrastructure.”

Barto pointed out: “Another thing that I tell customers is, when you present your data to ChatGPT, you’re giving up your intellectual property to the cloud.”

Where does View go from here? It claims its software covers the trifecta of data management for AI: graph modeling, metadata generation, and vector storage. View has a couple of strategic partnerships lined up and has joined the Nvidia Inception program, which will help bring the View Information Platform to market with additional go-to-market support, training and technology assistance. It has also joined with Ampere Computing and its AI Platform Alliance to bring the View offering to market.

It has two billing models. One is self-serve off the web and based on the amount of processing that you do; it works out to about $1 for every 50 PDF pages. The second route to market is going to be through enterprise infrastructure vendors. This will likely be an annual license.
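
At that rate, processing a 5,000-page collection of PDFs would come to roughly $100.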

The two View co-founders suggested that an enterprise infrastructure/storage company could sell more storage by partnering with View because the View-created metadata and knowledge bases for users will need storing. Christner said: “There’s going to be a tremendous infrastructure uplift. By getting their customers to an AI-powered experience, they’re also going to consume a whole lot more infrastructure to make that possible.”

He said the View architecture is scale-out and can be very fast. “You can scale this to virtually as many nodes as you want. We decided to distribute embeddings generation across 128 cores. When we demonstrated this … we had them ingested and ready to go inside the span of like a minute.”
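
View hasn’t shown its ingestion code, but spreading embedding generation across a pool of worker processes is a standard pattern; this is a minimal sketch with a placeholder embedding function standing in for a real model.

```python
from concurrent.futures import ProcessPoolExecutor

def embed(chunk: str) -> list[float]:
    # Placeholder embedding; a real deployment would call an embedding model here.
    return [float(ord(ch)) for ch in chunk[:8]]

def embed_all(chunks: list[str], workers: int) -> list[list[float]]:
    # Fan the chunks out across worker processes, one pool sized to the core count.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(embed, chunks, chunksize=32))

if __name__ == "__main__":
    vectors = embed_all([f"chunk {i}" for i in range(1000)], workers=8)
```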

Barto amplified this: “We can build a very large cluster of Intel, AMD or other nodes, virtual nodes, or actual physical nodes. We can load-balance the ingestion process against multiple of our data processors, semantic cell processors, vector processors, and have them all load balanced behind the scenes to ingest the data.

“Then we can also load-balance the inference side of it, so that when you have hundreds or thousands of end users trying to chat against those knowledge bases, we can load balance and scale that behind Kubernetes, OpenShift, whatever your Docker management infrastructure happens to be, and that would allow the system, when it saw the workload, to scale up. We could deploy automatically all the microservices that we need to handle that load, and then, as it subsides, we can automatically scale back.”

View can control the disposition of compute resources so that, for example, it could ingest data, create metadata and run web crawls – but not use GPUs for the work, saving GPU compute for the knowledge base analysis. 
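
That disposition policy amounts to routing work by type; this toy routine illustrates the idea, with task names invented rather than taken from View’s scheduler.

```python
# Toy illustration of steering work to CPU or GPU pools; the task names are
# invented, not View's scheduler vocabulary.
GPU_TASKS = {"knowledge_base_analysis", "inference"}

def device_for(task: str) -> str:
    # Ingestion, metadata generation and web crawling stay on CPU so that GPU
    # capacity is reserved for knowledge base analysis and inference.
    return "gpu" if task in GPU_TASKS else "cpu"

for task in ("ingest", "metadata", "web_crawl", "knowledge_base_analysis"):
    print(task, "->", device_for(task))
```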

View’s beta program is open for sign-ups and general availability is not too far away. Barto predicted “We’ll release View Assistant and the chat experience with our GA” but “I want you to think bigger than just chat.” He reckons “It’s just our first a-ha moment for you as an end user, as a customer. [But] once the data is ingested and processed, the sky’s the limit.”