You need access to those big data silos – fast? No problem, says Alluxio

Alluxio has added a bunch of improvements to its open source software, described by the company as a “data orchestration system for analytics and machine learning in the cloud”.

The software virtualizes data stores, acting as an abstraction layer that decouples compute from storage. It is intended for big data applications and offers a cloud-based alternative to Hadoop.

With Alluxio, big data apps access every store in the same way instead of following a different access route to each storage silo. This speeds up their access to multiple data silos.
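To make the single-access-path idea concrete, here is a toy sketch of a mount table that maps one unified path scheme onto several underlying stores. It is an illustration only, not Alluxio's code or API: the `resolve` function, the mount prefixes and the store URIs are all hypothetical.

```python
# Toy illustration of a unified namespace. Not Alluxio's implementation.
# The mount prefixes and backing-store URIs below are made up for the example.
MOUNTS = {
    "/": "hdfs://namenode:9000/",
    "/s3": "s3://example-bucket/",
}


def resolve(path, mounts=MOUNTS):
    """Map a unified path to a backing-store URI using longest-prefix match."""
    best = None
    for prefix in mounts:
        norm = prefix.rstrip("/")
        # Match whole path components only, so "/s3data" does not match "/s3".
        if path == norm or path.startswith(norm + "/"):
            if best is None or len(norm) > len(best.rstrip("/")):
                best = prefix
    if best is None:
        raise KeyError(f"no mount covers {path!r}")
    rest = path[len(best.rstrip("/")):].lstrip("/")
    return mounts[best].rstrip("/") + "/" + rest


if __name__ == "__main__":
    for p in ("/warehouse/t.parquet", "/s3/logs/a.csv"):
        print(p, "->", resolve(p))
```

The application only ever sees paths such as `/s3/logs/a.csv`; which silo actually holds the bytes is decided by the table.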

The company said seven of the world’s top 10 internet companies use its software, including China Unicom. Two Sigma and Development Bank of Singapore are also users.

What’s new

New features for Alluxio 2.0, announced yesterday, include:

  • Policy-driven data management
  • Better data access policy management
  • Data service for cross-cloud data movement
  • Better data access for cloud analytics
  • AWS support
  • Re-written base elements for hyperscale

Alluxio 2.0 also adds the Alluxio Data Service, a distributed clustered service, with adaptive replication for increased data locality, high-availability with an embedded journal, and a POSIX API.

The POSIX API enables TensorFlow, Caffe and other Python-based frameworks to access data directly from any storage system via Alluxio using traditional file system access.
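In practice, "traditional file system access" means code can use ordinary file calls rather than a storage-specific client. The sketch below assumes the Alluxio namespace has been mounted as a local directory; the `/mnt/alluxio` path and the `ALLUXIO_MOUNT` environment variable are hypothetical and not taken from the announcement.

```python
import os

# Hypothetical mount point: the article does not name one.
ALLUXIO_MOUNT = os.environ.get("ALLUXIO_MOUNT", "/mnt/alluxio")


def list_dataset(root, suffix=".csv"):
    """Return sorted file paths under root that end with suffix, via os.walk."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def read_head(path, n_bytes=64):
    """Read the first n_bytes of a file with the standard open() call."""
    with open(path, "rb") as f:
        return f.read(n_bytes)


if __name__ == "__main__":
    if os.path.isdir(ALLUXIO_MOUNT):
        for p in list_dataset(ALLUXIO_MOUNT):
            print(p, read_head(p, 16))
    else:
        print(f"{ALLUXIO_MOUNT} is not mounted; nothing to read")
```

Nothing here is Alluxio-specific, which is the point: a training script that already reads local files should need only a different root directory.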

Haoyuan Li, Alluxio founder and chairman, provided a quote: “These new advancements to Alluxio’s data orchestration platform further cement our commitment to a cloud-native, open source approach to enabling applications to be compute, storage and cloud agnostic.”

He has written a blog about v2.0.

Software stack

Alluxio’s software has three components: Master to manage file and object metadata; Worker to manage file and object blocks and interface to underlying storage systems; and Client for analytics and other apps to interface with Alluxio.
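The division of labour between the three components can be sketched with a toy in-memory model. This is not Alluxio's code or API: the class names mirror the component names above, but every method, the block size and the round-robin placement are assumptions made for illustration.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


@dataclass
class Worker:
    """Holds block data. A real worker would also talk to underlying storage."""
    blocks: dict = field(default_factory=dict)  # block_id -> bytes

    def put(self, block_id, data):
        self.blocks[block_id] = data

    def get(self, block_id):
        return self.blocks[block_id]


@dataclass
class Master:
    """Holds metadata only: which blocks make up a file and where they live."""
    files: dict = field(default_factory=dict)  # path -> [(block_id, worker_index)]
    next_block_id: int = 0

    def allocate(self, path, n_blocks, n_workers):
        placement = []
        for i in range(n_blocks):
            # Round-robin placement, chosen purely for simplicity.
            placement.append((self.next_block_id, i % n_workers))
            self.next_block_id += 1
        self.files[path] = placement
        return placement

    def locate(self, path):
        return self.files[path]


class Client:
    """What an application uses: asks the master where, then talks to workers."""

    def __init__(self, master, workers):
        self.master = master
        self.workers = workers

    def write(self, path, data):
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        placement = self.master.allocate(path, len(chunks), len(self.workers))
        for (block_id, worker_index), chunk in zip(placement, chunks):
            self.workers[worker_index].put(block_id, chunk)

    def read(self, path):
        return b"".join(
            self.workers[worker_index].get(block_id)
            for block_id, worker_index in self.master.locate(path)
        )


if __name__ == "__main__":
    master, workers = Master(), [Worker(), Worker()]
    client = Client(master, workers)
    client.write("/demo.txt", b"hello world")
    print(client.read("/demo.txt"))
```

The master never touches file contents and the workers never see file names, which is the separation the three-component description implies.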

You can download an architecture white paper to find out more and access the users’ mailing list.

Alluxio 2.0 Community and Enterprise Editions are generally available for download via tarball, Docker, brew, etc.

Company background

Alluxio was started around 2012, with code development at the UC Berkeley AMP Lab. The co-founders were Li and Amelia Wong. The company was initially called Tachyon Nexus and received $7.5m in A-round funding in 2015 for its in-memory clustered big data analytics technology.

This evolved into providing accelerated access to disparate big data silos and the company changed its name to Alluxio in 2016. It signed a deal with Huawei the same year.

Alluxio signed a deal with Dell EMC to use its ECS storage in 2017. It received $8.5m in B-round funding in January this year.

Li moved upstairs from CEO to chairman in May this year, and Steven Mih, CEO at Aviatrix, a cloud networking firm, took over as Alluxio CEO. Previously he was SVP, worldwide sales, at Couchbase.