Hopsworks is billed as a flexible and modular AI lakehouse, with a Python-centric enterprise feature store that provides “seamless integration” for existing pipelines. Founded in 2018 in Stockholm, Sweden, it is now going for a fourth round of funding to aid its scale-up in the burgeoning AI/ML lakehouse market.
By spreading its footprint wider, it aims to fully compete against lakehouse giant Databricks, Amazon SageMaker, and Google Vertex AI, among others.
Crunchbase tells us Hopsworks has so far raised a total of $13.8 million, in rounds taking place in 2018, 2021 and 2023. The next round may well be its biggest so far, if it can tap into the AI data processing hype cycle enveloping the IT industry.
At the end of June this year, Hopsworks 4.0 was launched – a unified platform for building batch, real-time, and large language model (LLM) AI systems. The developer added new capabilities to the lakehouse around real-time data for RAG, native Python access (with Arrow Flight), and “fine-tuning” for LLMs. Hopsworks 4.0 led the firm to pin the “AI Lakehouse” marketing moniker on itself.
The Hopsworks Query Service is said to provide Python clients with “up to 45 times” higher throughput when reading data from the lakehouse, compared to Databricks, SageMaker, and Vertex AI.
Jim Dowling, co-founder and CEO of Hopsworks, said at the time: “4.0 implies game-changing innovations for building AI systems, whether they are batch, real-time or LLMs applications, through an AI Lakehouse infrastructure.”
At this week’s IT Press Tour in Istanbul, in front of press and analysts from across Europe, Dowling explained the drive and ethos behind the business, and how his team want to truly compete against the US lakehouse giants.
“I’m from Ireland and very pro-European, and MySQL [the open source relational database management system] could have become a multi-billion dollar company for Europe based in Sweden. But it was sold off to Oracle [via Oracle’s 2010 acquisition of Sun Microsystems, which had bought MySQL in 2008], just because someone could get a ranch together.
“I was at MySQL at the time, and I don’t want the same thing to happen to Hopsworks. Nor do the rest of our leadership team, who have also served at and been founders of other data companies.”
He went on: “Europe has great talent to form key companies, but the US has the expertise to scale them. Just look at Databricks – the founder [Swedish citizen Ali Ghodsi] is a friend of mine and he came to my wedding. What Databricks is doing from San Francisco, we want to do from Europe.”
Dowling also cited Germany’s Flink. In 2019, Chinese internet giant Alibaba Group acquired Data Artisans – the German company behind the Apache Flink data processing framework.
Apache Flink originated from the Stratosphere research project at the Technical University of Berlin in 2009, and in 2015, became a top-level project at the Apache Software Foundation. The software features a dataflow engine developed in Java and Scala, which is designed to run batch and streaming analytic workloads on distributed systems – including Hadoop clusters and cloud-based systems.
Dowling said: “Flink was bought for $90 million, and now it’s the power behind TikTok.” It logs the searches made by TikTok users and passes them on to the back-end engine, which then almost immediately deluges those users with similar results, an effect otherwise known as “digital crack” for young and old alike.
As far as this key technology is concerned, Dowling also believes it was sold out of Europe “too cheap.”
As for competing with the lakehouse giants, Dowling remarked: “The CIOs come to us after going to Databricks first, after they find it too slow, partly because Databricks’ cloud is in the US, and their European customers are querying it from here.
“It’s difficult to get customers to try something that is not today’s IBM, but they eventually do.”
The scale-up funding round will get under way this autumn, said Dowling, and is expected to be completed in the new year.