NetApp spots a wave of interest in Apache Spark, and buys a Data Mechanic

NetApp has bought Data Mechanics, a neat niche startup with an easy-to-use API on-ramp to running Apache Spark analytics jobs in the three main public clouds.

It reckons Data Mechanics’ Spark job interface layer can be added to its Spot Wave facility for optimising Spark jobs in the cloud, giving customers easier setup of Spark jobs, followed by better-performing, auto-scaling and lower-cost Spark runs.

NetApp’s Anthony Lye, SVP and GM of its Public Cloud Services business unit, said in a canned quote: “Adding Data Mechanics to our existing solutions will make it simpler and more cost-effective for organisations across all industries to leverage Apache Spark and Kubernetes to advance their data and cloud initiatives.”

Data Mechanics was founded in 2019 by CEO Jean-Yves Stephan and CTO Julien Dumazert, and has offices in San Francisco and Paris. It has taken in $150,000 in funding, according to Crunchbase. Financial details of the acquisition were not revealed.

The company provides an app-aware containerised and serverless infrastructure for analytics using Apache Spark through a managed Kubernetes platform. Users interact with the Data Mechanics platform via a monitoring dashboard and simple API, not directly with Kubernetes. The Data Mechanics abstraction layer software is deployed on a Kubernetes cluster in a customer’s public cloud account (AWS, Azure, or GCP).

Data Mechanics’ team and IP will be integrated with the Spot by NetApp team and portfolio to accelerate the development of NetApp’s recently announced Spot Wave offering, which provisions, scales, simplifies, optimises and automates Spark workloads running in public clouds.

Amiram Shachar, VP and GM of Spot by NetApp, offered his thoughts on the deal: “Although there are significant benefits to moving analytics and application workloads to the cloud, managing analytics technologies and cloud infrastructure can be resource- and time-intensive, impeding employee productivity and return on investment.”

Adding Data Mechanics’ Spark interface layer to Spot Wave will cut the time and resources that users (NetApp customers) need to run their Spark-based analytics jobs in public clouds, with the cloud infrastructure automatically optimised for performance and cost.

A blog by Shachar explains the background. It reads: “With Data Mechanics capabilities integrated into the Ocean product family in Spot Wave, big data application owners will be able to easily run fully automated, optimised Spark on fully automated, optimised cloud infrastructure.”

NetApp has also added a continuous delivery feature to Ocean, called Ocean CD, which automates app deployment and verification processes. It says it is evolving Spot Ocean into a suite of DevOps solutions.

NetApp Spot Ocean CD diagram.

Read about Spot Ocean CD here.

Comment

NetApp storage users will see that there is no direct link between Spot Ocean and NetApp storage. It seems to Blocks & Files that Spot is, essentially, storage-agnostic. So too is Data Mechanics. Is this Spot line of products intended to augment and help NetApp storage sales, or is it becoming a quite separate product line?

Spot by NetApp timeline

  • June 2020 — NetApp buys Spot, a cloud instance cost broker, for a reported $450M, so enterprises can cut cloud compute cost and NetApp can use it to help enterprises deploy apps with NetApp storage cost-effectively in the public clouds.
  • October 2020 — Spot Storage announced, and works with ‘serverless’ Spot Ocean service. Spot Ocean is a Kubernetes-orchestrated container app deployment service, which supports Amazon Web Services ECS (Elastic Container Service) and EKS (Elastic Kubernetes Service) instances, and the Google Kubernetes Engine. Spot abstracts server and storage details from Kubernetes-deployed containers.
  • March 2021 — Spot Wave announced as a data-management service for Apache Spark. Spot Ocean now supports the Azure Kubernetes Service.