Ocient discusses hyperscale data analysis problem

Chris Gladwin, Ocient

Ocient reckons it has solved a niche but hyperscale data analysis problem.

Blocks & Files met Ocient founder Chris Gladwin. His company has built a massively scalable parallel-access relational database storage system, using NVMe SSDs, for hyperscale databases, ones with a trillion rows or more. Such databases store information generated by device or instance populations for mobile phones, cell tower activity records, vehicle driving events, and adtech/media instances.

Searching and analyzing such massive databases can be done using existing SQL and RDBMS technology – but it can take days. Ocient’s technology does this in near real time, it says, answering queries in three to six seconds that previously took eight hours or more.

Yet there are furious levels of activity and funding in the large-scale data analytics space: Databricks, Dremio, Snowflake and Yellowbrick.

Chris Gladwin co-founded object storage company Cleversafe in 2004. It accumulated more than 350 patents for its on-premises dispersed object technology that could scale to store multiple petabytes of data and billions of objects. Its dsNet technology ingested data and cut it up into thousands of slices that were distributed across Slicestor server nodes in its network. A specialized RAID scheme meant that six out of 16 Slicestors could fail and data was still accessible.

Cleversafe was acquired by IBM in a billion-dollar exit in 2015. The amount was not disclosed at the time but subsequently revealed to be around $1.5 billion. Gladwin said: “Eighty employees made $1 million and thirty made $5 million” from the acquisition. The investors and the employees both received the same cash amount for their shares; there was no multi-tier share valuation scheme.

The takeaway, according to the company, is that its founders and engineers solved a difficult set of problems and so dominated the high-scale, 100PB-class object storage market with performant and highly reliable technology. Gladwin claimed: “No customer turned off a Cleversafe system in four years.”

Cleversafe, he said, had a 100 percent market share in on-premises object storage at 100PB and above. Because of this: “We know the top 500 data storage buyers; the biggest data analyzing government agencies and telcos.”

Why was Ocient founded?

In the 2014-2016 period these very large data storage buyers “asked us if we could provide limitless scale and analysis.” They “wanted two orders of magnitude more of the price/performance of what others could do.”

Oracle’s Exadata system at the time was disk-based and in the 1PB capacity area; nowhere near enough. Cleversafe had effectively achieved limitless scale in object storage but not in analysis. Gladwin said that after he received the fifth such request he realized there was a general and real problem to be solved. Now he believes Ocient has a $22 billion total addressable market.

What was the nature of the development effort?

“The hardware vendors were saying new things were coming; high core-count CPUs, NICs supporting 100G and the most significant thing; NVMe SSDs.” Plus PCIe 4.

NVMe SSDs were the key because disk drives can deliver about 500 random reads a second whereas NVMe SSDs can do a million; a 2,000x speed-up. Yet if you move an existing SQL database off a disk-based system and onto NVMe drives, it will only run 10 to 50 times faster. The software is just too inefficient to make the best use of the NVMe drives. It had to be improved.
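The gap between those two figures can be worked through in a few lines. This is only a back-of-the-envelope sketch using the numbers quoted above; the variable and function names are illustrative, and the "unused headroom" figure is our own derived ratio, not something Ocient stated.

```python
# Figures quoted in the article; names are illustrative only.
DISK_RANDOM_READS_PER_SEC = 500
NVME_RANDOM_READS_PER_SEC = 1_000_000


def speedup(new_rate: float, old_rate: float) -> float:
    """Return how many times faster new_rate is than old_rate."""
    return new_rate / old_rate


# Raw device-level advantage of NVMe over disk for random reads.
raw_speedup = speedup(NVME_RANDOM_READS_PER_SEC, DISK_RANDOM_READS_PER_SEC)

# Range the article gives for an unmodified SQL database moved to NVMe.
observed_low, observed_high = 10, 50

# Derived ratio (our assumption of how to read the two figures together):
# how much of the raw advantage the existing software fails to exploit.
unused_headroom_min = raw_speedup / observed_high
unused_headroom_max = raw_speedup / observed_low

print(f"raw device speed-up: {raw_speedup:.0f}x")
print(f"left on the table: {unused_headroom_min:.0f}x to {unused_headroom_max:.0f}x")
```

On these numbers, existing database software would capture only a small fraction of what the drives can deliver, which is the inefficiency Gladwin describes.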

Working and funding the problem

Although there were several NVMe storage startups, such as Excelero, E8, and DSSD, Gladwin and his co-founders started to look at these specific hyperscale storage/analysis use cases; the ability to go much, much faster on trillion row-level SQL queries. The scale here is hard to imagine; Gladwin says a printed trillion row spreadsheet would circle the earth 70 times or more.

He and five other people spent 2016 working on whether this was a solvable problem. He said: “After a year we realized we could do it, but it would take five years – about 300 man-years – to build.”

How was it funded?

Gladwin said: “Because of Cleversafe I was able to arrange funding.” That was $50-60 million overall with outside investors joining Gladwin, who is the largest single investor.

The first production system was shipped to a customer in 2021, five years later.

The technology

Ocient’s system has colocated compute and storage; there is no time-sapping network link between them. 

The developers realized they couldn’t use PostgreSQL and would have to rewrite a database, to parallelize its operations, and that they would have to improve memory allocation in servers. The overall problem had both large and very small detail aspects to it, with the whole stack having to be checked over to find and tune out potential bottlenecks.

Ocient built a Mega Lane ring buffer to move data through a server’s PCIe lanes to and from the SSDs, and also a Particle Swarm heuristic optimization algorithm. This conceives of a number of agents (particles) that form a swarm operating in a search space, and looking for the best way to achieve a result. We think this applies to Ocient’s ANSI-compliant SQL engine, which has a query optimization stage. Gladwin said: “We take a second or two to work out how to run the query… Our optimizer works at the cluster-level.”
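For readers unfamiliar with the technique, here is a minimal sketch of textbook particle swarm optimization. It is not Ocient's optimizer, whose internals were not disclosed; the function names, the coefficient values, and the toy cost function standing in for a query-plan cost are all assumptions made for illustration.

```python
import random


def particle_swarm_minimize(cost, bounds, n_particles=20, n_iters=200,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Generic particle swarm search for a low-cost point inside bounds."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Each particle starts at a random position with zero velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Track each particle's personal best and the best seen by the swarm.
    personal_pos = [p[:] for p in pos]
    personal_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=personal_cost.__getitem__)
    swarm_pos, swarm_cost = personal_pos[g][:], personal_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best position, and pull toward the swarm's best position.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (personal_pos[i][d] - pos[i][d])
                             + social * r2 * (swarm_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp to the search space.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < personal_cost[i]:
                personal_cost[i], personal_pos[i] = c, pos[i][:]
                if c < swarm_cost:
                    swarm_cost, swarm_pos = c, pos[i][:]
    return swarm_pos, swarm_cost


def toy_plan_cost(x):
    """Hypothetical stand-in for a plan cost; its minimum is at (3, -2)."""
    return (x[0] - 3.0) ** 2 + (x[1] + 2.0) ** 2


best_point, best_value = particle_swarm_minimize(
    toy_plan_cost, bounds=[(-10.0, 10.0), (-10.0, 10.0)])
print(best_point, best_value)
```

A real query optimizer would search over discrete plan choices such as join orders and data placement rather than a smooth two-dimensional surface, so this shows only the swarm mechanics, not how they might map onto SQL planning.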

The engineers also built their own ETL process to get rid of bottlenecks in existing ETL technology.

Charging and funding and roadmap hints

How does Ocient charge customers?

“We charge by the core; it makes sense at hyperscale,” rather than by capacity. Ocient’s customers will be running their system’s servers flat out.

It’s on a subscription basis with systems on-premises, in a cloud like AWS or run by Ocient as a managed service. Typically it is a three-year term with upgrades. Ocient is willing to sell on a perpetual license basis but, so far, no customer has asked for one. 

Gladwin pointed out that, when Ocient sells one of its systems to a CSP, it is a backend system and, as such, is on-premises even though the customer is a public cloud operator.

Ocient has around a dozen customers, including MediaMath, an adtech demand-side platform business.

Will it raise more money?

“We’re talking to growth stage investors. We’ll be raising more money for sure.”

What’s on the roadmap?

“We’re adding machine learning in the data engine.” Gladwin said a telco could run a daily network model to update its various cell tower policies, but events that can affect overall cell tower network performance can occur at any time. By having machine learning in the data engine, Ocient could suggest or trigger policy changes at more frequent intervals.

It’s also adding geospatial functions and looking at doing more with secondary indexing.

Comment

Ocient has a six-year lead time over any competitor and technology that spans the whole SQL database stack, right down to server memory allocation. This specialization is a formidable technology moat that any competitor would need to cross. Unless the market for trillion row, near-realtime SQL databases heads toward the 1,000-plus customer level, it looks like Ocient is operating effectively on its own.

Gladwin said he conceived of Ocient with a 10 to 12 year time frame, 2026-2028, after which he might retire upwards to being its chairman. There is, however, a possibility that Ocient’s market will enlarge and the company could be bought. Snowflake, for example, could surely see a use for it.