Interview: Scality has made Zenko, its multi-cloud data controller, generally available.
Zenko v1.0 is a cloud-native product, available in both open source and enterprise editions. What does it do?
Scality says it provides a single endpoint, a cloud-agnostic unified interface, through which data can be stored, retrieved, managed and searched across any location – multiple disparate private (on-premises) and public clouds – enabled by metadata search and an extensible data workflow engine.
CEO Jérôme Lécat provided a canned quote: “Cloud services from Amazon Web Services (AWS), Microsoft Azure, IBM Cloud, Google and other large players are finding their place as core elements of enterprise IT. What is required to extract the most from these clouds is a solution that can cloud-enable the data no matter where it’s stored, whether cloud, legacy NAS or newer object storage solutions.”
The scope of Zenko encompasses data in file or object storage. The product stores all data unmodified so that it can be accessed and acted on in the cloud directly, and it enables data mobility between clouds, according to Scality.
It looks to Blocks & File as if Zenko aims to provide a single virtual object storage silo, spanning multiple actual S3-compatible silos (the public clouds and Scality RINGs). The release says it can cloud-enable data no matter where it’s stored, “whether cloud, legacy NAS or newer object storage solutions.”
Can it? We asked Jérôme Lécat some questions about this.
B&F: What data formats are supported in this release?
Jérôme Lécat: Zenko supports the Scality RING over the Amazon S3 API for on-premises storage. We have also completed successful PoCs with Ceph and will officially support Ceph over S3 in a near-term release. Additional on-premises storage (e.g., NAS filers) through standard file protocols (NFS, SMB) is not in R1.0, but is planned for a future release as well.
Note that Zenko can already support more than just S3-based clouds. For example, in R1.0 we support Azure Blob Storage over the native Blob API, and Google over their JSON-based RESTful API (similar but not identical to the S3 API).
B&F: How does Zenko find on-premises data? In what formats; both file and object?
Jérôme Lécat: We are building a set of metadata discovery and import tools that can extract metadata from existing RINGs, clouds and, in future, NAS file data, and import it into Zenko’s metadata namespace. This will be enabled through both RESTful/object APIs (in the API of the underlying object store), and via the file (NFS, SMB) interfaces of NAS filers.
B&F: How does it move data? i.e. how does it get authority to access and move on-premises data?
Jérôme Lécat: When a storage location (e.g. RING, Ceph, etc.) is created in Zenko, an account with the appropriate credentials is created. Zenko uses this account to authenticate and access data in the storage system. For example, with S3-based storage, Zenko requires the access key pairs (access key and secret key), and then authenticates all S3 API requests using the standard and highly secure AWS Signature v4 or v2 HMAC based mechanisms.
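For readers unfamiliar with the Signature v4 mechanism mentioned here, the following is a minimal sketch of the HMAC-SHA256 key-derivation chain that AWS documents for Signature Version 4, using only Python's standard library. It is our illustration, not Zenko code: the function names and the sample credentials are hypothetical, and a real request would also need the canonical request and string-to-sign built as the AWS specification describes.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "s3") -> bytes:
    """Derive a Signature v4 signing key.

    The secret key is never sent over the wire; it is folded through a
    chain of HMACs scoped to a date (YYYYMMDD), a region and a service.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes in the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical values, for illustration only.
example_key = derive_signing_key("EXAMPLE-SECRET", "20180901", "us-east-1")
example_signature = sign(example_key, "example string to sign")
```

Because the derived key is scoped to a date, region and service, a leaked signature cannot be replayed against another service or on another day.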
Once connected, Zenko uses standard S3 API calls to store, access and delete data according to application API commands. For data workflows that replicate or move data, Zenko uses S3-compatible APIs to express the policies through the AWS S3 Bucket Lifecycle API, or the AWS S3 Cross Region Replication (CRR) API. The actual mechanism of moving data is performed by the Zenko workflow engine in an asynchronous manner, using S3 API calls to read the source data, to store the data in the new cloud or storage location, and then to delete the data on the original source location.
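The read, store, then delete sequence described above can be sketched as follows. This is a toy model under our own assumptions, not Zenko's workflow engine: `InMemoryStore` and `move_object` are hypothetical names, the stores are plain dictionaries standing in for S3-style endpoints, and the verification step before deletion is our addition.

```python
class InMemoryStore:
    """Hypothetical stand-in for an S3-style storage location."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]

    def delete(self, key: str) -> None:
        del self.objects[key]


def move_object(source: InMemoryStore, target: InMemoryStore, key: str) -> None:
    """Move one object: read from source, write to target, delete from source."""
    data = source.get(key)              # 1. read the source data
    target.put(key, data)               # 2. store it in the new location
    if target.get(key) != data:         # assumption: verify before deleting
        raise RuntimeError(f"copy of {key!r} to {target.name} did not verify")
    source.delete(key)                  # 3. delete from the original location


ring = InMemoryStore("ring")
azure = InMemoryStore("azure")
ring.put("reports/2017.csv", b"unmodified bytes")
move_object(ring, azure, "reports/2017.csv")
```

Deleting only after the copy has been confirmed means a failed transfer leaves the source object intact.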
Note that Zenko R1.0 supports in-band updates and in the future will also support out-of-band updates. For in-band updates, Zenko is in the data path as an S3 API endpoint. An application writes to Zenko and Zenko places/modifies data in designated storage location(s). For out-of-band updates, Zenko is not in the data path. An application (or user) can place/modify data directly in the storage location (Scality RING, Amazon S3, etc.) and an event will be triggered which will send metadata updates asynchronously to Zenko.
B&F: Is data encrypted in-flight and at rest?
Jérôme Lécat: Yes, Zenko uses secure HTTPS connections with CA-issued certificates to encrypt data in flight and during storage, replication and lifecycle move operations to clouds.
To ensure that data remains fully-accessible when stored in the cloud, Zenko uses the cloud-native encryption-at-rest schemes (e.g., AWS SSE object level encryption) to store the data. This means that local cloud services can still read/write the data by decrypting through the cloud service, and using cloud-managed encryption keys.
B&F: Does it deduplicate data?
Jérôme Lécat: No, Zenko does not deduplicate data. This is intentional. Zenko stores data unmodified in all storage locations so it can be accessed directly and freely from those storage locations. For example, when Zenko places data in Azure, that data can be accessed (read, analysed, modified, etc.) directly in Azure without having to go “through” Zenko.
This is critical to Zenko’s “no lock-in” value proposition, as well as to Zenko’s ability to let customers leverage the power and richness of cloud services (analytics, AI, BI, encoding, video indexing, etc.) on top of their data. In the future, Zenko will support user-extensible workflows including custom actions such as filtering, dedupe, compression functionality (some provided as 3rd party plugins).
B&F: Does it optimise WAN transfer?
Jérôme Lécat: No, Zenko does not optimise WAN transfer. A third-party WAN optimisation solution such as Aspera FASP or Signiant could be implemented if needed, as the network transport layer is transparent to Zenko’s HTTP/HTTPS based network access requirements.
B&F: Zenko supports object data storage in Scality RINGs, Amazon S3, Azure Blob Storage, Google Cloud Storage, Digital Ocean Spaces and Wasabi. So – not IBM’s Cloud, and not public cloud file storage?
Jérôme Lécat: Not in Release 1.0, but Scality Zenko will support additional cloud storage services in the future based on customer demand. One of the most popular requests we have heard recently is Alibaba (the Ali Cloud) for customers with requirements in China and in Alibaba’s new German and UK-based data centres. We have not as of yet heard many requests for IBM’s cloud, however Zenko could be used as a single interface to the multiple clouds IBM offers both on-premises and hosted. We may consider supporting cloud file storage in the future.
A related area in which Scality believes Zenko to be unique is its extended support of the Amazon S3 Cross Region Replication (CRR) API for 1-to-many replication (the S3 API is only 1-to-1). This means that Zenko can manage data across multiple storage locations / public clouds at the same time. We believe for all of the other products on the market this is an “or” proposition – meaning, you replicate data to AWS or to Azure or to Google. For Zenko, this is an “and” proposition – you can replicate data to all three clouds (and more) at the same time. When Zenko is deployed on-premises, this 1-to-many data movement saves significant bandwidth and egress charges because data does not have to be moved from public cloud to public cloud.
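The “and” proposition can be pictured with a small fan-out sketch. It is our illustration only: `replicate_one_to_many` is a hypothetical function, and dictionaries stand in for the on-premises source and the cloud targets. The point it shows is that the object is read once from the source and written to every target, so nothing is copied from one public cloud to another.

```python
def replicate_one_to_many(source: dict[str, bytes],
                          targets: dict[str, dict[str, bytes]],
                          key: str) -> list[str]:
    """Copy one object from the source to every target location.

    Returns the names of the targets written, in sorted order.
    """
    data = source[key]                  # single read from the source
    written = []
    for name in sorted(targets):
        targets[name][key] = data       # one write per target location
        written.append(name)
    return written


on_prem = {"video/intro.mp4": b"frames"}
clouds = {"aws": {}, "azure": {}, "google": {}}
replicated_to = replicate_one_to_many(on_prem, clouds, "video/intro.mp4")
```

After the call, all three cloud dictionaries hold the same unmodified bytes and the source copy is still in place.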
B&F: What other object stores are supported other than Scality RING?
Jérôme Lécat: RING in 1.0, but as stated above we have completed successful PoCs with Ceph and will officially support Ceph over S3 in a near-term release. We are also considering Minio and OpenStack Swift based backend storage. We will add other object stores based on customer demand. The more compatible an object store is with the Amazon S3 API, the easier it is to add – Ceph was very straightforward, for example.
B&F: What do Lifecycle Transition and Lifecycle Expiration mean?
Jérôme Lécat: Lifecycle workflows provide the ability to delete or move data to a cheaper storage tier or to a different storage location based on age and other rules. For example, data that is older than one year should be moved from RING to Azure Archive.
Zenko implements the AWS S3 Bucket Lifecycle API, which provides automated object expiration and transition (move) rules and capabilities. Zenko supports the full filtering, metadata tag matching and time-based XML rules language in the AWS API specification. For example, a bucket Lifecycle expiration can be set to expire (delete) all objects matching the pattern /my_files/*.MP4 when they are 90 days old (based on the stored create time value). Lifecycle Transition policies use similar rules to indicate when the Zenko workflow engine should “move” an object (as described previously) from a source location to a new target location (which may be in a different RING or cloud, another region or another storage class/tier within a cloud).
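The 90-day expiration example can be sketched as a simple predicate. This is our own illustration, not Zenko's rule evaluator: `is_expired` and its parameters are hypothetical, and the glob pattern merely mirrors the example given in the answer (in the AWS API itself, lifecycle filters are expressed as key prefixes and tags inside an XML rule).

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase


def is_expired(key: str, created: datetime, now: datetime,
               pattern: str = "/my_files/*.MP4",
               max_age_days: int = 90) -> bool:
    """Return True if the object matches the pattern and is old enough.

    fnmatchcase is case-sensitive, so "clip.mp4" does not match "*.MP4".
    """
    old_enough = (now - created) >= timedelta(days=max_age_days)
    return fnmatchcase(key, pattern) and old_enough


now = datetime(2018, 9, 1, tzinfo=timezone.utc)
old_clip = is_expired("/my_files/clip.MP4", now - timedelta(days=91), now)
new_clip = is_expired("/my_files/clip.MP4", now - timedelta(days=10), now)
```

A transition rule would use the same kind of test but trigger a move to the target location instead of a delete.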
Scality says of Zenko that it cloud-enables data no matter where it’s stored – whether cloud, legacy NAS or newer object storage solutions. To repeat ourselves, it provides a single endpoint, a unified interface, through which data can be stored, retrieved, managed and searched across any location: multiple disparate private (on-premises) and public clouds.
Any restriction on the kind of data that can be brought into Zenko’s namespace and administered there weakens the Zenko value proposition. At present it’s a controller for data in Scality RINGs and some S3-compatible object stores.
In our view it needs to support file data to fulfil its claim or vision that it cloud-enables data (not just object or file data but “data”) no matter where it’s stored. That means on-premises filers and public cloud file stores as well. It also needs to widen its object storage coverage universe, with Ceph, Minio and OpenStack Swift already mentioned.
There is a data protection difference between objects and files; files have to be backed up; objects do not. If Zenko is going to be a file data as well as an object data controller then will it have to take on responsibility for file data protection?