WEKA update jumps onto GCP, scales bigger, and moves data faster

Fast parallel file system supplier WEKA has a Data Platform v4.1 release that adds GCP support, extra scale and faster SMB protocol handling.

The WEKA software can now run on the Google Cloud Platform and will be available in the GCP marketplace later this month. Customers, it’s claimed, can deploy it on a cluster of Google Compute Engine (GCE) C2.standard machine instances to get better performance at a lower cost than other GCP file offerings. The v4.1 software can extend a single namespace to a Google Cloud Storage bucket to add a lower cost storage tier. 

Google Terraform templates are used to automate deployments. The WEKA deployment is set up within a managed instance group and can auto-scale compute to millions of IOPS and capacity up to the exabyte level. It can then auto-scale back down when the compute and/or capacity resources are no longer needed.

A dataset stored in WEKA on GCP can be accessed using POSIX, NFS, SMB, and S3 protocols.

This v4.1 release also has a new SMB stack that should deliver higher performance than v4.0 and earlier versions of the software. WEKA claims the new stack is faster than competing offerings, but does so without naming names or supplying numbers.

V4.1 can scale more than v4.0; much more, as it supports:

  • A doubled maximum number of workstations – 8,000 instead of 4,000;
  • Six times the snapshot count – up to 24,000 from 4,000, with 4,000 writeable snapshots;
  • Roughly 75 percent more cluster backend and client containers – 6,500 compared to 3,725 previously;
  • 50 percent more processes – 12,000 instead of the previous 8,000 maximum.
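
As a quick sanity check, the growth factors implied by the limits quoted in the list above can be worked out directly. The sketch below is illustrative only: the dictionary labels and the `growth` helper are names chosen here, not anything from WEKA's software, and the figures are simply those quoted in the list.

```python
# Old and new maximums as quoted in the list above (old, new).
# The labels are informal and chosen for this sketch only.
limits = {
    "workstations": (4_000, 8_000),
    "snapshots": (4_000, 24_000),
    "backend and client containers": (3_725, 6_500),
    "processes": (8_000, 12_000),
}


def growth(old: int, new: int) -> float:
    """Return the multiplication factor from the old limit to the new one."""
    return new / old


if __name__ == "__main__":
    for name, (old, new) in limits.items():
        factor = growth(old, new)
        # Show both the factor and the equivalent percentage increase.
        print(f"{name}: {factor:.2f}x ({factor - 1:.0%} more)")
```

Run as a script, this prints one line per limit with its growth factor and percentage increase.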

S3 protocol handling has been improved so that more objects than before can be ingested into an S3 bucket at high speed. There is better scaling and workflow integration, with virtual-hosted-style URL support for API requests to S3 buckets. (See bootnote.)

The v4.1 update has several minor components. It makes previously CLI-only features available in the GUI – such as predefined templates in statistics, managing directory quotas, creating long-living tokens, a 3D view for backend servers, improved management of SMB services, and more.

It adds support for new operating environments and components, such as OFED (OpenFabrics Enterprise Distribution) versions 5.4-3.5.8.0 and 5.7-1.0.2.0, RHEL 8.7, Rocky Linux 8.7, Kernel 5.15 (excluding the Amazon Linux operating systems), and the Mellanox ConnectX-6 Lx NIC.

Finally, upgrading is non-disruptive – or almost non-disruptive – because the upgrade rolls across the storage nodes in a cluster, thanks to the Multi Container Backend (MCB) architecture that was introduced in WEKA 4.0. There is minimal to no service-level degradation as the upgrade takes place.

The WEKA Data Platform v4.1 is now available and a WEKA blog discusses it.

Bootnote

Amazon says that, in a virtual-hosted–style request, the bucket name is part of the domain name in the URL. For example: https://bucket-name.s3.region-code.amazonaws.com/key-name. 

In the alternative path-style URL, the bucket name is omitted from the domain name and appears as the first path segment after the forward slash: https://s3.region-code.amazonaws.com/bucket-name/key-name.
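
The difference between the two URL styles can be sketched in a few lines of Python. This is a minimal illustration that follows the Amazon URL patterns quoted above; the function names are hypothetical, and the endpoint host that a WEKA cluster exposes for S3 would differ from the amazonaws.com domain assumed here.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, key: str, region: str) -> str:
    """Build a virtual-hosted-style URL: the bucket is part of the host name."""
    # quote() percent-encodes unsafe characters in the key but keeps "/" intact.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def path_style_url(bucket: str, key: str, region: str) -> str:
    """Build a path-style URL: the bucket is the first segment of the path."""
    return f"https://s3.{region}.amazonaws.com/{bucket}/{quote(key)}"


if __name__ == "__main__":
    # Placeholder values taken from the examples in the bootnote.
    print(virtual_hosted_url("bucket-name", "key-name", "region-code"))
    print(path_style_url("bucket-name", "key-name", "region-code"))
```

In the virtual-hosted form each bucket gets its own host name, so requests can be routed per bucket by DNS alone, whereas the path-style form needs the server to parse the path to find the bucket.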