Starburst takes on Presto with distributed data lake source silo access

Starburst, the alternative Presto data lake analytics offering, is updating its fully managed Galaxy cloud service to support decentralized data lake activity, aiming to provide a single point of access for users to discover, govern, and analyze the data in and around their data lake. 

Presto is an open source project, originated at Facebook (now Meta) in 2012, that provides data lake analytics using a distributed SQL query engine. The four Presto creators at Facebook left in 2018 and forked the Presto code to PrestoSQL. Facebook donated Presto to the Linux Foundation in 2019, which then set up the Presto Foundation. PrestoSQL was rebranded to Trino, and the forkers set up Starburst to sell Trino connectors and support.

Justin Borgman, co-founder and CEO, Starburst, said: “Data teams are bogged down with complexity, often dealing with data silos within their organizations, forced to jump through multiple solutions just to understand what data they have and where it is located before they can even begin to think about its value.” 

Basic Presto architecture

Starburst wants to be the central gateway for the data lake and its associated silos. It’s announcing:

  • A fully managed platform with all the tools data engineers, analysts, and scientists need to activate the data in and around their data lake.
  • Multi-source access – Cross-cloud and cross-region federated access and governance making data from different sources accessible, allowing users to explore data before moving it into a data lake.
  • Connectivity – Great Lakes Connector provides connectivity to numerous data lakehouse file and table formats, including Apache Iceberg, Hive, Delta Lake, and more. 
  • Scalability – Warp Speed combines Starburst’s query engine with patented indexing technology for autonomous workload acceleration, increasing query performance.
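
To make the multi-source idea concrete, here is a minimal sketch of a federated query: one SQL statement joining tables that live in two separate data stores. It is only an illustration under stated assumptions. SQLite's ATTACH DATABASE stands in for a federated engine's catalogs, and this is not Starburst's or Trino's API. The catalog names (lake, crm), the tables, and the sample rows are all hypothetical.

```python
import sqlite3


def build_federated_connection() -> sqlite3.Connection:
    """Open one connection that can see two independent in-memory databases.

    Assumption: 'lake' and 'crm' are hypothetical stand-ins for two separate
    sources (for example an object-store table and an operational database).
    """
    conn = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates a separate, independent database.
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute("ATTACH DATABASE ':memory:' AS crm")

    conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE crm.customers (id INTEGER, region TEXT)")

    # Hypothetical sample data, invented for this sketch.
    conn.executemany(
        "INSERT INTO lake.orders VALUES (?, ?)",
        [(1, 120.0), (2, 80.0), (1, 50.0)],
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "EMEA"), (2, "APAC")],
    )
    conn.commit()
    return conn


def revenue_by_region(conn: sqlite3.Connection) -> list:
    """Join across both sources in a single query, without copying data first."""
    query = """
        SELECT c.region, SUM(o.amount) AS revenue
        FROM lake.orders AS o
        JOIN crm.customers AS c ON c.id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = build_federated_connection()
    for region, revenue in revenue_by_region(connection):
        print(region, revenue)
```

The point of the sketch is the query shape: tables are addressed by a source-qualified name, so the user can explore and join data where it sits before deciding whether to move it into the data lake.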

Starburst says its query engine is deployed at petabyte scale at the world’s largest internet companies.

Borgman said: “We believe Starburst Galaxy can act as an architectural centerpiece for modern data lakes that combines low cost commodity infrastructure with open formats and global federated access.”

Starburst has set up a Partner Connect portal to bring partner data source integrations to its customers. Partners for business intelligence (BI) and visualization include AWS QuickSight, GCP Looker, Metabase, Microsoft Power BI, Tableau Cloud, ThoughtSpot, and Zing Data. Partners for data storage, preparation, and transformation are Tabular and dbt Cloud.

These integrations are currently available to Galaxy users, with more to come. Starburst’s consulting and services partners like Accenture, Deloitte, Capgemini, and Slalom will be providing industry accelerators to help customers get started on Starburst Galaxy.