Hammerspace buys RozoFS for erasure coding tech

Data orchestrator Hammerspace has quietly bought French startup RozoFS for its transformational erasure coding technology, the Mojette transform.

RozoFS was founded in 2010 by CEO Pierre Evenou in Nantes, France, and has publicly disclosed €700,000 in funding, about $764,000 in today’s money. Evenou was an academic researcher at the Institut de Recherche en Communications et Cybernétique de Nantes (IRCCyN), which worked on Mojette transform mathematics in discrete geometry. He took these ideas and set up RozoFS to commercialize them by building a NAS software system that uses patented Mojette transform erasure coding technology. Evenou moved to Silicon Valley in 2015 to set up RozoFS in the USA.

Tony Asaro, SVP of Business Development for Hammerspace, was – we’re told – instrumental in making the acquisition happen, and said in a statement: “Rozo’s best-in-class erasure coding provides the right balance of price, performance, capacity efficiency, resiliency and availability. This is essential for organizations with massive amounts of data for multi-site and hybrid cloud environments.”

We understand the acquisition actually took place in late 2022 but was only disclosed today. The acquisition price was not confirmed.

Evenou is now VP Advanced Technology at Hammerspace and said: “Organizations need performance throughout their workflows. The integration of Rozo’s technology into the Hammerspace Data Orchestration System will help organizations get the most out of their expensive data creation instruments and compute clusters while also accelerating data analytics and collaboration.” 

In 2018 RozoFS provided a high-performance file system on AWS in which fast metadata services enabled asynchronous incremental replication between an on-premises storage system and an AWS-based copy. Incremental changes in the on-prem file system could be computed quickly. Using this information, the source cluster used all its nodes to parallelize synchronization of the on-prem and AWS clusters. The cloud copy could be updated automatically as frequently as needed without impacting application performance, which reduced production dead time because no lengthy data synchronization was needed.

The software speeds data transfers by reducing the amount of data that has to be moved. It can deliver more than 1Tbps with only eight commodity servers working in parallel and connected on a 200GbitE network, we’re told. Used by Hammerspace, it allows customers to move files directly to the compute, application, or user at peak performance, nearly saturating the capabilities of their infrastructure.

Two RozoFS engineers have joined Hammerspace: CTO Didier Féron and VP Engineering Jean-Pierre Monchanin are now senior software engineers on the Hammerspace development team.

Background

Why “mojette”? It’s a French word meaning a white (haricot) bean – the beans used in baked beans – and such beans (sans the sauce) have been used in French schools to teach addition and subtraction. The usage is reflected in the term for accountants: beancounters. The transform only uses addition and subtraction, hence its name.

What’s the big beancounting deal for Hammerspace?

As we wrote several years ago: Mojette transform erasure coding starts from the concept of a grid of numbers. Imagine a 4×4 grid. We can draw straight lines along the rows, up and down the columns, and diagonally through the grid cells to the left and right. Figure 1 in the diagram below shows this.

Mojette grid and projections concepts

The lines are extended outside the grid. For each line, the values in the intersected grid cells can be added or subtracted and written at the end of the line. In Figure 1 the value b19 is the sum of the values in cells p1, p6, p11 and p16.

The line of values from b22 to b16 is a kind of projection of the source grid along a particular dimension (diagonally from lower right to top left). The grid values can be viewed as being transformed into the projected values.

Figure 2 shows four such projections with the grid cells identified by Cartesian coordinates, such as cell 0,0, the bottom left cell. Figure 3 shows the projection directions in colour: blue, red, green and black.

If original data is lost when the source data grid is read, the projected values can be used to reconstruct a missing cell value: two or more projections intersect the missing cell, so its value can be recomputed.
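
To make the grid-and-projection idea concrete, here is a minimal Python sketch of the simplified scheme described above. It is an illustration only, not RozoFS or Hammerspace code. The function names `project` and `recover_cell`, the sample grid values, and the bin-numbering rule (cells with the same q*x - p*y lie on the same line) are all assumptions made for this example. The sketch counts rows from the top, so its cell 0,0 is the top left cell, unlike Figure 2.

```python
from collections import defaultdict

def project(grid, p, q):
    """Sum cell values along every line of direction (p, q).

    Cells sharing the same q*x - p*y lie on one line, so that
    number is used as the bin index (an illustrative convention).
    """
    bins = defaultdict(int)
    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            bins[q * x - p * y] += value
    return dict(bins)

def recover_cell(damaged, x, y, p, q, bins):
    """Recompute one missing cell from a stored projection.

    Subtracts the surviving cells on the same line from the bin sum.
    Assumes the missing cell is the only lost cell on that line.
    """
    b = q * x - p * y
    known = sum(
        value
        for yy, row in enumerate(damaged)
        for xx, value in enumerate(row)
        if (xx, yy) != (x, y) and q * xx - p * yy == b
    )
    return bins[b] - known

# Hypothetical 4x4 grid holding the values 1 to 16.
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

diagonal = project(grid, 1, 1)   # diagonal projection
rows = project(grid, 1, 0)       # one bin per row
columns = project(grid, 0, 1)    # one bin per column

# Lose the cell at x=1, y=1 and rebuild it from the diagonal projection.
damaged = [row[:] for row in grid]
damaged[1][1] = None

print(diagonal[0])                                   # 1 + 6 + 11 + 16 = 34
print(recover_cell(damaged, 1, 1, 1, 1, diagonal))   # 34 - (1 + 11 + 16) = 6
```

Only addition and subtraction are used, which matches the bean-counting description of the transform. If more than one cell on a line were lost, a single projection would not be enough and a second projection crossing the missing cell would be needed.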

Mojette transform erasure coding is quicker than other forms of erasure coding, such as Reed-Solomon, and needs comparatively less storage space. It also scales out to billions of files.

A PDF doc, The Mojette Erasure Code, Benoît Parrein, Univ-Nantes/IRCCyN, Journée inter GDR ISIS/SoCSiP, 4/11/2014, Brest, provides more information on the basic maths and operations behind it.