Israeli startup Regatta is building a scale-out, transactional (OLTP), analytic (OLAP) relational database (OLxP) with extensibility to semi-structured and unstructured data. The company says it is a drop-in replacement for Postgres and has been designed from day one to support SSD storage. Its architecture is discussed in a blog post by co-founder and CTO Erez Webman, formerly CTO of ScaleIO, which was acquired by EMC in 2013.
This OLTP+OLAP combination has been pursued by other suppliers, such as SingleStore, which has added indexed vector search to speed AI queries. SAP HANA, the Oracle Database with an in-memory option, Microsoft SQL Server with in-memory OLTP, Amazon Aurora with Redshift Spectrum, and PostgreSQL with the Citus or TimescaleDB extensions all provide combined transactional and analytical database functions as well. Regatta is entering a fairly mature market and reckons it has an edge because of its architecture.

Webman says: “Regatta is mainly a scale-out shared-nothing clustered architecture where heterogeneous nodes (servers, VMs, containers, etc.) cooperate and can perform lengthy (as well as short) SQL statements in a parallel/distributed manner, with many-to-many data propagation among the cluster nodes (i.e. intermediate data doesn’t need to pass via a central point).” Each storage drive is accessible “only by a single node in the cluster.”
A Regatta cluster, designed to support thousands of nodes, supports differently sized and configured nodes, which can provide compute+storage, compute-only, or storage-only functions. The database can be hosted in on-premises physical or virtual servers and in the public cloud, and can be consumed as a service.
Regatta differs from scale-out-by-sharding databases, such as MongoDB, by supporting distributed JOINs across node boundaries, and ensures strong ACID guarantees even when rows reside on different nodes. (Read a Regatta blog about scale-out sharding limitations here.)
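To make the distinction concrete, here is a minimal sketch of a distributed hash join across two shards, with plain Python dicts standing in for per-node data. It is a generic textbook technique, not Regatta's implementation; the node layout, table names, and `distributed_join` function are all hypothetical.

```python
# Hypothetical two-node cluster: each dict stands in for one node's shard
# of the `customers` and `orders` tables. Matching rows live on
# different nodes, which is exactly the case plain sharding struggles with.
node_a = {"customers": [{"id": 1, "name": "Ada"}],
          "orders":    [{"cust_id": 2, "item": "ssd"}]}
node_b = {"customers": [{"id": 2, "name": "Bo"}],
          "orders":    [{"cust_id": 1, "item": "ram"}]}

def distributed_join(nodes):
    # Build phase: each node hashes its local customers; the hash tables
    # are then exchanged among nodes (the many-to-many data propagation).
    build = {}
    for node in nodes:
        for customer in node["customers"]:
            build[customer["id"]] = customer["name"]
    # Probe phase: each node probes with its local orders, and the
    # partial results are merged without a central coordinator.
    return sorted((build[order["cust_id"]], order["item"])
                  for node in nodes for order in node["orders"])

result = distributed_join([node_a, node_b])
assert result == [("Ada", "ram"), ("Bo", "ssd")]  # cross-node matches found
```

A sharded-only system would typically have to route both probes through the application or a coordinator; a distributed JOIN pushes the work to the nodes themselves.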
The company has developed its own Concurrency Control Protocol (CCP) providing a fully serializable and externally consistent isolation level. Where a database supports concurrent user or application access, the different users' operations need to be kept separate so they do not interfere with each other. This is the intent of concurrency control, which can take either a pessimistic or an optimistic design. A pessimistic design assumes data access conflicts between transactions are likely to occur, and uses locks to ensure that only one transaction can access or modify a given piece of data at any one time.
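A pessimistic scheme can be sketched in a few lines: take an exclusive per-row lock before touching the row, so conflicting transactions wait rather than interleave. This is a generic illustration of locking, not any particular database's code.

```python
import threading

# One lock per row: a transaction must hold the lock for the
# duration of its read-modify-write, so no other transaction can
# interleave and cause a lost update.
row_locks = {"balance": threading.Lock()}
rows = {"balance": 100}

def credit(amount):
    with row_locks["balance"]:    # pessimistic: lock first, then access
        rows["balance"] += amount

# 100 concurrent transactions, each crediting 1.
threads = [threading.Thread(target=credit, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert rows["balance"] == 200  # no updates lost under locking
```

The cost of this safety is that transactions queue up on hot rows even when they would not actually have conflicted.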
An optimistic design assumes that transaction data access conflicts are rare and allows transactions to proceed without restriction until it is time to commit changes. Before committing, each transaction undergoes a validation phase that checks whether the data it read has been modified by another transaction since it was first read (using timestamps or versions on the data).
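The read-validate-write cycle can be sketched as follows. This is the generic textbook form of optimistic concurrency control with row versions, not Regatta's proprietary CCP (which, as Webman notes below, avoids aborting on detected conflicts); the `Store` class and transaction dicts are illustrative only.

```python
import threading

class Store:
    """Toy versioned row store with optimistic commit validation."""

    def __init__(self):
        self.rows = {}                 # key -> (value, version)
        self.lock = threading.Lock()   # makes validate+write atomic

    def read(self, txn, key):
        value, version = self.rows[key]
        txn["reads"][key] = version    # remember the version we saw
        return value

    def commit(self, txn):
        with self.lock:
            # Validation phase: abort if any row we read has since changed.
            for key, seen in txn["reads"].items():
                if self.rows[key][1] != seen:
                    return False       # stale read -> transaction aborts
            # Write phase: apply buffered writes, bumping row versions.
            for key, value in txn["writes"].items():
                _, version = self.rows.get(key, (None, 0))
                self.rows[key] = (value, version + 1)
            return True

store = Store()
store.rows["balance"] = (100, 0)

t1 = {"reads": {}, "writes": {}}
t2 = {"reads": {}, "writes": {}}
t1["writes"]["balance"] = store.read(t1, "balance") + 10
t2["writes"]["balance"] = store.read(t2, "balance") + 25

assert store.commit(t1) is True    # first committer wins
assert store.commit(t2) is False   # t2's read is now stale, so it aborts
```

The abort-on-stale-read behavior shown here is exactly what Regatta claims its protocol avoids in most conflict cases.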
Webman says Regatta’s CCP “is mainly optimistic, although unlike most optimistic protocols, it doesn’t cause transactions to abort on detected conflicts (well, except, of course, for deadlock cases in which both optimistic and pessimistic protocols tend to abort a transaction per deadlock-cycle).” It is snapshot-free and does not require clock synchronization.
Short or lengthy consistent/serializable read-only queries can be performed on real-time, up-to-the-second transactional data without blocking writing transactions from progressing.
Regatta implements its own row store data layouts directly on top of raw block storage to optimize I/O performance, and does not need any underlying file system. This is a log-structured data layout that operates very differently from an LSM tree design. It is built for extensibility to support other types of row stores, as well as column store, blob store, etc. Webman says its “first row store data layout type is specifically optimized for flash media. It allows us to optimally support both traditional small-rows-with-more-or-less-fixed-size, and variable-sized-large-rows-with-a-large-dynamic-range-of-sizes (within the same table).”
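The general shape of a log-structured row layout can be shown with a short sketch: rows are appended sequentially to a log (here a `bytearray` standing in for raw block storage) and an in-memory index maps keys to offsets, so variable-sized updates never overwrite in place. This is the generic pattern only; Regatta's actual on-media format is not public.

```python
import struct

class LogStore:
    """Toy log-structured row store over a flat byte buffer."""

    def __init__(self):
        self.log = bytearray()   # stand-in for raw block storage
        self.index = {}          # key -> offset of the latest version

    def put(self, key: int, payload: bytes):
        offset = len(self.log)
        # Record format: key (8 bytes) + payload length (4 bytes) + payload.
        self.log += struct.pack(">QI", key, len(payload)) + payload
        self.index[key] = offset  # sequential append; no in-place update

    def get(self, key: int) -> bytes:
        offset = self.index[key]
        _, length = struct.unpack_from(">QI", self.log, offset)
        return bytes(self.log[offset + 12 : offset + 12 + length])

store = LogStore()
store.put(1, b"small row")
store.put(1, b"updated, differently sized row")  # variable sizes are fine
assert store.get(1) == b"updated, differently sized row"
```

Because every write is a sequential append, this style of layout plays well with flash, where random overwrites are the expensive operation; a real implementation would also need garbage collection of superseded records.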
We’re told: “Regatta’s B+Trees (that are used, for example, for indexes) massively leverage the high read-concurrency of flash media, allowing meaningfully faster and more efficient B+Tree accesses than algorithms that would assume more ‘generic’ underlying storage (i.e. magnetic HDD).”
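The point about flash read-concurrency can be illustrated without any B+Tree code at all: on an HDD, a single head serializes seeks, so index page reads tend to be issued one at a time, whereas flash can serve many page reads in parallel. The sketch below simulates each page fetch as a fixed-latency I/O; it is an illustration of the general argument, not Regatta's algorithm.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def read_page(page_id):
    time.sleep(0.01)              # stand-in for one index-page read
    return f"page-{page_id}"

page_ids = range(16)

# HDD-style access: 16 page reads issued back to back.
start = time.perf_counter()
serial = [read_page(p) for p in page_ids]
serial_time = time.perf_counter() - start

# Flash-style access: the same 16 reads issued concurrently.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=16) as pool:
    parallel = list(pool.map(read_page, page_ids))
parallel_time = time.perf_counter() - start

assert serial == parallel          # same results
assert parallel_time < serial_time # but far less wall-clock time
```

An index algorithm that knows it can issue many reads in flight can, for example, prefetch several candidate pages at once instead of descending strictly one level at a time.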
There are more details in Webman’s blog about Regatta’s distributed SQL database.
CEO and co-founder Boaz Palgi tells us that Regatta's system is designed to ensure that you can:
- Execute complex and real-time queries on completely up-to-the-second transactional data – think agents in a telco that get a question regarding roaming from a subscriber who just added roaming to their plan.
- Execute transactions such that the same agent understands that the subscriber should have added roaming for Italy as well, not just for France, and needs to correct this.
- Linearly increase both transactional and analytical performance without changing even a single line of code in your business logic by just adding more nodes. This will be important to keep running your business while adding many agents to the mix.
He says: “Traditional databases cannot deliver the performance to handle that type of agent-generated load, and most of them cannot combine OLAP with OLTP in the same database. Data warehouses cannot support the agents’ transactional workloads. ETL is a problem when you want agents to do more than just deal with stale archive-based data.”
For generative AI, “we are not doing anything specific today, although we will add some capabilities.”