Druidic Imply launches Shapeshift project for modern analytics

Published Tue 9 Nov 2021 // 14:12 UTC

Real-time analytics database startup Imply has unveiled Project Shapeshift, an initiative to develop a hardware-abstracting, auto-scaling control plane and SaaS offering for Apache Druid, the open source software on which Imply's products are based.

The project will extend the SQL API from querying to ingestion, processing and transformation, with the aim of simplifying development cycles. Imply will also build a serverless and elastic consumption experience for Apache Druid. There will be product updates for both Druid and Imply over the next year as part of the Project Shapeshift initiative.
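
For context, the query side of Druid's SQL API is already exposed over HTTP at the `/druid/v2/sql` endpoint, which accepts a JSON body with a `query` field. The sketch below only builds such a request without sending it. The host and port, the `clickstream` table and its `country` column are assumptions made for illustration, and nothing here reflects what the extended ingestion or transformation syntax will look like.

```python
import json
import urllib.request


def build_druid_sql_request(base_url, sql):
    """Build (but do not send) a POST request for Druid's SQL endpoint."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical cluster address and table name, used only for illustration.
request = build_druid_sql_request(
    "http://localhost:8888",
    "SELECT country, COUNT(*) AS events FROM clickstream GROUP BY country",
)

# Sending it would be: urllib.request.urlopen(request), given a running cluster.
```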

Imply CTO Gian Merlino said in a statement: “Establishing a path forward to massive adoption of a new data infrastructure lies in a strong commitment to advancing the underlying open source technology combined with a dedication to re-engineer the very foundation of that technology to be truly cloud-native.”

Apache Druid is an open source, real-time analytics database that supports data streamed from the Kafka and Amazon Kinesis message buses, batch loading from HDFS and S3, and many popular file formats. Imply was started to build a software business around Druid by adding functionality and services, creating a kind of real-time data warehouse plus search combination.

It was founded in 2015 by CEO Fangjin Yang, chief experience officer Vadim Ogievetsky, and Merlino, all creators of Apache Druid, and has taken in a substantial $115.3 million in funding across seed ($2 million), A ($13.3 million), B ($30 million) and C rounds ($70 million this year). This is yet more evidence, after the Snowflake funding and IPO saga, of the awesomely strong attractive power that analytics software holds for venture capitalists.

Apache Druid uses inverted indexes (in particular, compressed bitmaps) for fast searching and filtering and can support numerical aggregations, groupBys (including multi-dimensional groupBys), and other analytic workloads faster and more efficiently than search systems. A Druid FAQ answers basic questions about it and a Wikipedia entry answers even more basic questions.
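
To make the bitmap idea concrete, here is a minimal sketch of an inverted index in which each distinct column value maps to a bitmap of the rows containing it. It is not Druid's implementation: plain Python integers stand in for the compressed bitmaps, no compression is applied, and the sample rows and the helper names `build_bitmap_index` and `matching_rows` are invented for illustration.

```python
from collections import defaultdict


def build_bitmap_index(rows, column):
    """Map each distinct value in `column` to an int whose set bits mark matching row ids."""
    index = defaultdict(int)
    for row_id, row in enumerate(rows):
        index[row[column]] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Expand a bitmap back into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Invented sample data.
rows = [
    {"country": "US", "device": "mobile", "clicks": 3},
    {"country": "DE", "device": "desktop", "clicks": 5},
    {"country": "US", "device": "desktop", "clicks": 2},
    {"country": "US", "device": "mobile", "clicks": 7},
]

country_idx = build_bitmap_index(rows, "country")
device_idx = build_bitmap_index(rows, "device")

# A filter such as country = 'US' AND device = 'mobile' becomes one bitwise AND.
hits = country_idx["US"] & device_idx["mobile"]
selected = matching_rows(hits)

# Aggregate only over the rows the filter selected.
total_clicks = sum(rows[i]["clicks"] for i in selected)
```

The filter is resolved by combining bitmaps before any row is read, so the aggregation only visits the rows that matched.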

Imply will develop a substantial architectural expansion on top of Apache Druid, to provide more flexibility and analytics capabilities for applications. There will be a multi-stage, decoupled query layer integrated with the core Druid database engine. This will help developers support all of their analytics requirements for their applications with one platform.

The company will also provide an overall improvement to the ease of use across data ingestion, queries and cluster operations, to deliver what it claims will be the most developer-friendly database for analytics applications. It will improve reporting for large result sets and long-running queries and complex conditional alerting across millions of objects.

A video provides more background about Imply and its Druid work, as does a blog by Fangjin Yang.
