The CloudSoda data-moving-as-a-service product calculates movement costs in dry runs, has multi-tenant support, and supports a wide range of any-to-any data movement sources.
The eponymous product was called SoDA in 2020 and was a business project of Integrated Media Technology (IMT). That company acted as an incubator and now there is a separate CloudSoda business. It’s small with just 13 employees but has produced a data mover and cost calculator from scratch. The crew won a NAB Show Product of the Year award in April this year.
We were briefed by EVP Sales Brian Morsch, an ex-director of sales at Pure Storage, and chief product officer Greg Holick, who comes from Western Digital with an intervening Seagate stint. Both are viewed as co-founders.
Morsch told us: “When [IMT] brought on Greg and I to build a business, the goal was: you guys are going to come in, you’re going to build our business, and you guys are going to break out the company and the company’s going to run solely on its own. Obviously, during the pandemic, that slowed us down a little bit, but I think the value add of having IMT and their large customer base … was influential. Startups usually don’t have that type of ability to automatically say … we can go into this installed base, show the product, get early adopters. So that was huge in our growth.”
Holick said: “We’ve … gone from a centralized data movement model to a distributed data movement model, and we can move data quickly, securely and automatically from anywhere to anywhere. We’re trying to just be storage agnostic ecosystem aware.”
The product is based on a hub and spoke design with a central cloud-based Conductor – controller – talking to software agents installed in the source and target systems. The Conductor can be installed in the public cloud, or in a customer’s own environment. A source can be a target as the software is inherently bi-directional. Agents run inside on-premises systems which can be NAS filers, SANs, object storage or direct-attached drives with NFS (v3, v4), SMB (1, 2, 2.1, 3), S3, Azure Blob and GCP storage protocols supported. Desktops and notebooks are also supported.
Supported public clouds are the big three – AWS, Azure, GCP, with all tiers supported – and some tier 2 CSPs – Wasabi, Backblaze and also Storj, the decentralized storage provider, plus generic S3.
There is a multi-threaded, high-performance scanner to detect files and objects on the target system, which works across agents. Files and objects can be moved on demand or according to settable policies with filters. The number of threads involved in a move is dynamic. Holick said: “We have a dynamic threading algorithm that looks at the objects or the files that we’re moving and [their] sizes and we spin up or spin down the number of threads we need to move that data. And that way we’re trying to saturate any network link.”
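CloudSoda has not published its threading algorithm, so the following is only a minimal Python sketch of the general idea Holick describes: look at how many items are queued and how big they are, then scale the worker count up or down within limits. The function name, the thresholds and the heuristic itself are assumptions made for illustration.

```python
import math


def choose_thread_count(
    sizes,
    min_threads=1,
    max_threads=32,
    bytes_per_thread=256 * 1024 * 1024,  # assumed: ~256 MiB of queued data per worker
    files_per_thread=1000,               # assumed: ~1,000 queued items per worker
):
    """Pick a worker count for a batch of pending transfers (hypothetical heuristic)."""
    if not sizes:
        return min_threads
    # Large payloads: scale workers with the volume of queued data.
    by_volume = math.ceil(sum(sizes) / bytes_per_thread)
    # Many small items: scale workers with the number of queued items.
    by_count = math.ceil(len(sizes) / files_per_thread)
    wanted = max(by_volume, by_count)
    # In this sketch one worker handles one item at a time,
    # so more workers than items would sit idle.
    wanted = min(wanted, len(sizes))
    return max(min_threads, min(max_threads, wanted))
```

A real mover would presumably re-evaluate this as the queue drains and as measured throughput changes; the sketch only covers the sizing step.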
All data is encrypted before it’s moved and moved in native format. There are webhooks so that CloudSoda can send an alert back to an application to say a transfer has completed.
Once a dataset has been selected for a move the time needed for the movement and the costs involved can be calculated in a test run, without any movement actually taking place. This means you can check out the transfer time and costs of alternative targets to see, for example, if Azure archive storage is cheaper than AWS or Storj, or takes longer than sending the data to GCP.
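To make the dry-run idea concrete, here is a rough Python sketch of how transfer time and cost might be estimated for several candidate targets without moving any data. It is not CloudSoda's implementation: the `Target` fields, the function names and the figures in the example are hypothetical, and the prices are placeholders rather than real CSP rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    storage_usd_per_gb_month: float  # placeholder price, not a real quote
    ingress_usd_per_gb: float        # placeholder price, not a real quote
    link_mbit_per_s: float           # assumed usable bandwidth to this target


def dry_run(dataset_bytes, target, months=1):
    """Estimate transfer time and cost for one target; nothing is moved."""
    gb = dataset_bytes / 1e9
    seconds = dataset_bytes * 8 / (target.link_mbit_per_s * 1e6)
    usd = gb * (target.ingress_usd_per_gb + target.storage_usd_per_gb_month * months)
    return {"target": target.name, "hours": seconds / 3600, "usd": usd}


def cheapest(dataset_bytes, targets, months=1):
    """Run the estimate against every target and return the lowest-cost result."""
    results = [dry_run(dataset_bytes, t, months) for t in targets]
    return min(results, key=lambda r: r["usd"])


if __name__ == "__main__":
    candidates = [
        Target("archive-a", 0.002, 0.0, 1000.0),  # made-up numbers
        Target("archive-b", 0.004, 0.0, 2000.0),  # made-up numbers
    ]
    for t in candidates:
        print(dry_run(10**12, t, months=12))
```

Comparing the `hours` and `usd` fields side by side is the kind of trade-off the test run is meant to expose: one target may be cheaper while another finishes sooner.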
Data transfer uses UDP file acceleration between agents and a multi-threaded, multi-part REST protocol when moving data to/from the public cloud.
The software hooks into CSP price books to get real-time pricing data. Custom price books can be set up for on-premises storage, thus enabling comparisons to be made of different storage policies.
CloudSoda monitors the move as it is being made and collects stats. Its management analytics can report on moves over time by originating site and site function, and on costs, and can also show trends.
There are facilities for billing and chargeback so an organization can send costs to the departments involved, or an MSP could use CloudSoda to offer data movement facilities to each of its customers as CloudSoda is multi-tenant. The management facility is integrated with Active Directory and Okta and supports role-based access control. The software has API integration with a RESTful interface and easy integration into existing workflows.
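As an illustration of what chargeback amounts to in a multi-tenant setting, the Python sketch below rolls up per-move costs by tenant and department. The record layout (`tenant`, `department`, `usd`) is invented for this example and does not reflect CloudSoda's actual data model or API.

```python
from collections import defaultdict


def chargeback(moves):
    """Sum the cost of completed moves per (tenant, department) pair."""
    totals = defaultdict(float)
    for move in moves:
        totals[(move["tenant"], move["department"])] += move["usd"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical move records for two tenants of an MSP.
    sample = [
        {"tenant": "acme", "department": "post-production", "usd": 10.0},
        {"tenant": "acme", "department": "post-production", "usd": 2.5},
        {"tenant": "acme", "department": "archive", "usd": 4.0},
        {"tenant": "globex", "department": "news", "usd": 1.0},
    ]
    for key, usd in chargeback(sample).items():
        print(key, usd)
```

Keying on the tenant first keeps one customer's costs separate from another's, which is the property an MSP would need.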
The CloudSoda software can also tag data as it’s moved, enabling subsequent metadata-based searches. This could be used, for example, to move a set of media assets and tag them all with a project name.
Holick said all organizations move data, particularly in this AI-using era. Project-based migration, with a filer upgrade for example, is a continuing if sporadic need but the general amount of data movement is steadily increasing. There are data transfers from edge locations to data centers and from either to the cloud, or back again. Media Asset Management (MAM) often involves moving media assets up to the cloud for long-term storage. If AI processing is run in the cloud then it needs data on which to operate and that has to get into the cloud.
Holick said: “We can do data collaboration and sharing and things like that as well.”
CloudSoda has partnership deals with MAM suppliers such as IPV, ReachEngine, Dalet, CatDV and Elements. It’s also partnering with Dell, LucidLink, NetApp, OpenDrives, Quantum and VAST Data.
Morsch said: “We are entertaining raising capital in the next six to 12 months … We see this company growing significantly over the next three to five years.”
The product is priced by node and not capacity or the amount of data moved, which makes its costs entirely predictable. CloudSoda Agent is priced at $18,000/year with a bundle of five costing $60,000/year. Larger volume deals with enterprises needing custom pricing are negotiable. The system needs up to an hour for its setup after which data movement can start.
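The list prices above make the arithmetic easy to check: five agents bought singly would cost $90,000/year, so the $60,000 bundle works out to $12,000 per agent, a one-third discount. The short Python sketch below finds the lowest list-price total for a given agent count. It assumes bundles and single agents can be mixed freely, which CloudSoda has not confirmed, and it ignores negotiated enterprise pricing.

```python
import math

AGENT_USD = 18_000   # list price per agent per year (from the article)
BUNDLE_USD = 60_000  # list price for a five-agent bundle per year (from the article)
BUNDLE_SIZE = 5


def annual_cost(agents):
    """Lowest yearly list price for `agents` nodes, assuming bundles and singles can be mixed."""
    if agents < 0:
        raise ValueError("agents must be non-negative")
    max_bundles = math.ceil(agents / BUNDLE_SIZE)
    # Try every bundle count; any agents not covered by bundles are bought singly.
    return min(
        b * BUNDLE_USD + max(0, agents - b * BUNDLE_SIZE) * AGENT_USD
        for b in range(max_bundles + 1)
    )
```

Under these assumptions a site needing four agents is better off buying the five-agent bundle ($60,000) than four single agents ($72,000).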
Its management GUI is uncluttered and easy to use by any storage admin person or data manager. Datadobi (StorageMap etc.), DataDynamics (StorageX), Komprise (KEDM) and WANdisco (Live Data Migrator) are each facing a new data-moving kid on the block with slick SaaS software and growing functionality.