Dremio lines up hat-trick of AI enhancements

Dremio is embracing generative AI with a three-step process adding Text-to-SQL, Autonomous Semantic Layer, and Vector Lakehouse functionality to its product.

The company supplies open-source Dremio Cloud lakehouse technology, with data persisted in Apache Iceberg format tables using Apache Parquet’s columnar data format. The lakehouse combines the capabilities of a more structured data warehouse and less structured data lake with self-service SQL analytics. Dremio’s view is that data warehouses rely on extract, transform and load (ETL) procedures to get data from different sources into the warehouse for subsequent analysis. With a lakehouse, data from multiple sources can be amalgamated into a data lake with no need for the ETL procedures to precede analytical processing.

Tomer Shiran, co-founder and Dremio’s chief product officer, said: “Generative AI will transform data engineering, data science and analytics over the coming years, and we are excited to provide our users with the industry’s most powerful tools to uncover the true potential of their data.”

These include an intuitive Text-to-SQL experience, allowing users to have their natural language queries converted into SQL within the user interface. This is based on a semantic understanding of metadata and data, which ensures more accurate SQL generation. Dremio says automatic correction of SQL queries is coming soon as well.
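As a rough illustration of how metadata can ground SQL generation, the sketch below assembles a prompt that pairs a natural language question with table and column descriptions before it would be handed to a language model. This is not Dremio's implementation: the `build_prompt` function, the `orders` table, and its column descriptions are all hypothetical, and the model call itself is left out.

```python
def build_prompt(question, tables):
    """Combine a user question with schema metadata into one prompt string.

    `tables` maps a table name to a dict of {column name: description}.
    All names here are hypothetical examples, not a real Dremio catalog.
    """
    lines = ["Translate the question into SQL using only these tables:"]
    for table, columns in tables.items():
        cols = ", ".join(f"{name} ({desc})" for name, desc in columns.items())
        lines.append(f"- {table}: {cols}")
    lines.append(f"Question: {question}")
    lines.append("SQL:")
    return "\n".join(lines)


# Hypothetical metadata of the kind a semantic layer might hold.
TABLES = {
    "orders": {
        "order_id": "unique order identifier",
        "region": "sales region name",
        "total": "order value in USD",
    }
}

prompt = build_prompt("What were total sales by region?", TABLES)
print(prompt)
```

The column descriptions give a model the context it needs to pick the right fields, which is the general idea behind metadata-aware SQL generation.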

The Autonomous Semantic Layer – we’re told – is software that automatically learns the intricate details of users’ data then produces descriptions of datasets, columns and relationships by using generative AI. There will be no need for manual cataloging. The software will autonomously learn workloads and create “reflections” to accelerate data processing, the company says, providing users with an AI-powered semantic layer for better data insights.

Dremio is also adding support for vector embeddings, the numeric representations of data that generative AI applications search over, to its lakehouse. It says this will enable users to store and search vector embeddings directly and provide a foundation for users to build machine learning applications such as semantic search, recommendation systems and anomaly detection within the Dremio platform.
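To make the idea of searching vector embeddings concrete, here is a minimal sketch of semantic search by cosine similarity, written in plain Python. It does not use any Dremio API: the `search` function, the row names, and the three-dimensional vectors are made up for illustration, and real embeddings would come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_vec, rows, top_k=2):
    """Rank (doc_id, embedding) rows by similarity to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical rows: a document id plus a toy embedding.
ROWS = [
    ("invoice_faq", [0.9, 0.1, 0.0]),
    ("shipping_policy", [0.1, 0.8, 0.1]),
    ("billing_guide", [0.8, 0.2, 0.1]),
]

print(search([1.0, 0.0, 0.0], ROWS))
```

A brute-force scan like this is fine for a handful of rows; systems that store embeddings at scale typically rely on approximate nearest-neighbor indexes instead.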

All analytics software will soon regard vector embedding schema support and a ChatGPT-like interface as table stakes – witness SingleStore and Zilliz. Dremio wants its AI adoption to go deeper than being just a GUI layer.

Dremio’s Text-to-SQL is available today, while the Autonomous Semantic Layer and Vector Lakehouse capabilities are coming soon.