Kinetica has integrated its streaming real-time analytics data warehouse with ChatGPT so customers can ask conversational queries of their proprietary data and receive answers in seconds. Amit Vij, Kinetica co-founder and president, claimed: “Generative AI revolution is a killer app for data analytics.”
ChatGPT is an application built on GPT-3.5 and GPT-4, Large Language Models created by OpenAI. It can respond to complex natural language queries and search public data sets, giving comprehensive replies that are often right but may also be wrong. Kinetica’s analytics database, which supports both GPUs and CPUs, holds both time-series and spatial data, which can be analyzed in real time. The database can now use ChatGPT as a front-end interface that converts natural language to Structured Query Language (SQL) and then runs the resulting SQL queries.
Nima Negahban, Kinetica co-founder and CEO, said: “While ChatGPT integration with analytic databases will become table stakes for vendors in 2023, the real value will come from rapid insights to complex ad-hoc questions. Enterprise users will soon expect the same lightning-fast response times to random text-based questions of their data as they currently do for questions against data in the public domain with ChatGPT.”
With the current dash for ChatGPT integration by other analytics data warehouse/lake companies such as Databricks, Pinecone, SingleStore and Zilliz, we’d bet Negahban’s table stakes point is correct.
Kinetica says conventional analytic databases require extensive up-front data engineering, indexing and tuning to enable fast queries, which means the questions must be known in advance. If they are not, a query may take hours to run or not complete at all. Using ChatGPT as a conversational front end does away with the need to pre-engineer data, and also with the need to write complex SQL queries or navigate complex user interfaces.
Users can ask questions using natural language. ChatGPT can, Kinetica claims, understand the user’s intent and the logical data model and generate queries based on their questions. The user can then ask follow-up questions or provide additional context. Users get immediate answers to their questions without waiting for long-running queries or data pipelines to be built.
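The question-to-SQL flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Kinetica’s or OpenAI’s actual interface: `build_prompt`, `fake_llm` and `ask` are hypothetical names, the language model is replaced by a stub that returns canned SQL, and SQLite stands in for the analytics database.

```python
import sqlite3

def build_prompt(schema_ddl: str, question: str) -> str:
    """Combine the logical data model and the user's question into one prompt."""
    return (
        "Given this schema:\n" + schema_ddl
        + "\nWrite one SQL query that answers: " + question
    )

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language model call; returns canned SQL."""
    return (
        "SELECT vehicle_id, MAX(speed_kph) AS top_speed "
        "FROM readings GROUP BY vehicle_id "
        "ORDER BY top_speed DESC LIMIT 1"
    )

def ask(conn, schema_ddl, question, llm=fake_llm):
    """Translate a question to SQL via the model, then run it on the database."""
    sql = llm(build_prompt(schema_ddl, question))
    # Assumed safety check: only run read-only statements from the model.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("refusing to run a non-SELECT statement")
    return sql, conn.execute(sql).fetchall()

# Made-up sample data, used only to exercise the sketch.
SCHEMA = "CREATE TABLE readings (vehicle_id TEXT, ts INTEGER, speed_kph REAL)"
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("truck-1", 1, 62.0), ("truck-2", 1, 88.5), ("truck-1", 2, 71.0)],
)

sql, rows = ask(conn, SCHEMA, "Which vehicle recorded the highest speed?")
print(sql)
print(rows)
```

Passing the schema along with the question is what lets a model write a query against tables it has never seen. The generated SQL is still only as good as the model’s reading of the question, which is why the sketch returns the query text alongside the rows so a user can inspect it.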
Vij said: “Kinetica plus ChatGPT makes complex, ad-hoc queries truly interactive, avoiding the ‘interactus interruptus’ of other integrations between large language models and databases.”
ChatGPT-type chatbots can provide not just wrong but imaginary answers. The datalakers integrating chatbot front ends that generate SQL queries against proprietary data sets will hope that bounding the chatbot in this way will stop it dreaming up false information.
Kinetica background
Kinetica was founded in 2016 and has raised $77.4 million across four funding events, the last a $14.4 million venture round in 2020.
It says its technology uses native vectorization to outperform other cloud analytic databases. In a vectorized query engine, data is stored in fixed-size blocks called vectors, and query operations are performed on these vectors in parallel, rather than on individual data elements. This querying of multiple data elements simultaneously results in faster query execution and improved performance.
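The difference between row-at-a-time and block-at-a-time execution can be shown with a small Python sketch. It is an illustration of the control flow only, under stated assumptions: the block size, function names and sample data are invented here, and plain Python lists stand in for the SIMD or GPU operations that a real vectorized engine would apply to each block. It says nothing about how Kinetica implements its engine.

```python
BLOCK_SIZE = 4  # hypothetical; real engines use much larger blocks

def blocks(column, size=BLOCK_SIZE):
    """Yield fixed-size slices ("vectors") of a column."""
    for start in range(0, len(column), size):
        yield column[start:start + size]

def revenue_row_at_a_time(prices, qtys, min_qty):
    """Evaluate the filter and the multiply one row at a time."""
    total = 0.0
    for price, qty in zip(prices, qtys):
        if qty >= min_qty:
            total += price * qty
    return total

def revenue_block_at_a_time(prices, qtys, min_qty):
    """Apply each operator to a whole block before moving to the next operator."""
    total = 0.0
    for p_block, q_block in zip(blocks(prices), blocks(qtys)):
        mask = [q >= min_qty for q in q_block]                # predicate over the block
        products = [p * q for p, q in zip(p_block, q_block)]  # multiply over the block
        total += sum(v for v, keep in zip(products, mask) if keep)
    return total

# Made-up column data for the query:
#   SELECT SUM(price * qty) FROM orders WHERE qty >= 5
prices = [2.0, 3.5, 1.0, 4.0, 2.5, 6.0]
qtys = [10, 1, 7, 3, 8, 2]

print(revenue_row_at_a_time(prices, qtys, 5))
print(revenue_block_at_a_time(prices, qtys, 5))
```

Both functions return the same answer. The second one is structured so that each step works on a whole block of values at once, which is the shape that lets hardware process many elements per instruction.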
The software supports a large number of spatial and temporal join types, graph operations and hundreds of analytic SQL functions, and can visualize billions of data points on a map. Customers include Liberty Mutual, TD Bank, the NBA, Lockheed Martin, USPS, T-Mobile, FAA, Ford, Point72, Verizon and Citi. Its streaming data warehouse can ingest, analyze, and visualize data sets with trillions of rows to model possible outcomes and assess risk, for example by tracking airborne objects for the US Air Force.
In February Kinetica announced 90 percent annual recurring revenue growth over the past 12 months, a net dollar retention rate of 163 percent, and a doubling of its customer base.