Hammerspace differentiates file copies from file instantiations

Hammerspace, the global file data environment company, says that a copy of a file is a separate file and therefore a fork, whereas an instantiation of a file is the same file and therefore not a fork.

What does this mean, and why does it matter?

The point was made to attendees on an IT Press Tour in Palo Alto. Let’s start with a file: Testfile.dat. It consists of data and metadata and let’s suppose it takes 100 blocks on a storage drive for data and one block for the metadata – 101 blocks in total. Now let’s make a copy.

The copy consists of the file data, 100 blocks, and the metadata, one block, totalling the same 101 blocks. Hammerspace founder and CEO David Flynn says this is a separate file – in effect, a fork of the original file. It can have changes made to it that are not propagated back to the original file. Its contents can thus diverge from the original file. The reverse also applies: changes to the original file are not propagated to the copy.
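To make the copy case concrete, here is a minimal Python sketch. It is not Hammerspace code; the `File` and `Metadata` classes and the `version` counter are hypothetical names invented for illustration, using the article's 100 data blocks plus one metadata block.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Metadata:
    name: str
    version: int = 0  # hypothetical counter, bumped on every write


@dataclass
class File:
    metadata: Metadata                          # one block in the article's example
    data: list = field(default_factory=list)    # 100 blocks in the article's example

    def block_count(self) -> int:
        # Data blocks plus the single metadata block.
        return len(self.data) + 1

    def write(self, index: int, value: str) -> None:
        self.data[index] = value
        self.metadata.version += 1


original = File(Metadata("Testfile.dat"), ["block"] * 100)

# A copy duplicates the data AND the metadata, so it is an independent file.
duplicate = copy.deepcopy(original)

# Editing the copy does not touch the original: the two have forked.
duplicate.write(0, "changed")

total_blocks = original.block_count() + duplicate.block_count()
```

After the write, the two files hold different contents and different metadata, and together they occupy twice the storage of the original.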

Now let’s envisage Hammerspace’s GDE (Global Data Environment) making an instantiation – as it calls it – of the original file. This can happen when a user at a datacentre in San Francisco accesses a file that’s present in their local GDE setup but whose data is stored in New York, its base location. In the Hammerspace scheme, file metadata is shared between sites on a peer-to-peer basis so that all the users everywhere in the GDE see the same files in their file and folder structure.

When the user in San Francisco decides to access Testfile.dat, the constituent data blocks are transmitted from New York to San Francisco and instantiated there. This file instantiation has exactly the same metadata as the New York master copy. Any changes are propagated back to New York so that there is a single source of truth for the file’s contents, and no file fork takes place.

So Hammerspace GDE instantiations use exactly the same metadata as the master copy. In the GDE filesystem, metadata is a control plane.
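The instantiation case can be sketched in Python as well. This is a toy model under stated assumptions, not Hammerspace's implementation: the `Site`, `Metadata` and `SharedFile` classes are hypothetical, and changes are propagated immediately here for simplicity, whereas a real geo-distributed system may propagate them asynchronously.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    name: str
    version: int = 0  # hypothetical counter, bumped on every write


@dataclass
class Site:
    name: str
    blocks: dict = field(default_factory=dict)  # file name -> list of data blocks


class SharedFile:
    """One logical file: a single metadata record, data instantiated per site."""

    def __init__(self, metadata: Metadata, home: Site, data: list):
        self.metadata = metadata
        self.home = home
        self.sites = [home]
        home.blocks[metadata.name] = list(data)

    def instantiate(self, site: Site) -> None:
        # Data blocks are transferred from the home site;
        # the metadata record is shared, not duplicated.
        site.blocks[self.metadata.name] = list(self.home.blocks[self.metadata.name])
        self.sites.append(site)

    def write(self, site: Site, index: int, value: str) -> None:
        if site not in self.sites:
            raise ValueError(f"{self.metadata.name} is not instantiated at {site.name}")
        # Propagate the change to every site holding the data, so no fork arises.
        for holder in self.sites:
            holder.blocks[self.metadata.name][index] = value
        self.metadata.version += 1


new_york = Site("New York")
san_francisco = Site("San Francisco")

testfile = SharedFile(Metadata("Testfile.dat"), new_york, ["block"] * 100)
testfile.instantiate(san_francisco)

# A write made in San Francisco also lands in New York.
testfile.write(san_francisco, 0, "changed")
```

Both sites end up with the same contents, and there is still only one metadata record for the file.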

Attendees also learnt that GDE is structured to work synchronously in local domains (datacentres) and asynchronously across geo-distributed datacentres – where it is eventually consistent.

We were told that data is digital but is in desperate need of digital transformation – to make it less manual in its management. Hammerspace presentations have a way of resetting one’s preconceptions.