IBM builds containerised version of Spectrum Scale

IBM is launching a containerised derivative of its Spectrum Scale parallel file system called Spectrum Fusion, as well as delivering a new ESS 3200 Elastic Storage System array and a capacity enhancement for the ESS 5000.

The rationale is that customers need to store and analyse more data at edge sites, while operating in a hybrid and multi-cloud world that requires data availability across all these locations. The ESS arrays provide edge storage capacity, and the containerised Spectrum Fusion can run in any of these locations.

Denis Kennelly, IBM Storage Systems’ general manager, said in a statement: “It’s clear that to build, deploy and manage applications requires advanced capabilities that help provide rapid availability to data across the entire enterprise – from the edge to the data centre to the cloud. It’s not as easy as it sounds, but it starts with building a foundational data layer, a containerised information architecture and the right storage infrastructure.”

Spectrum Fusion

Spectrum Fusion combines Spectrum Scale functionality with unspecified IBM data protection software. It will appear first in a hyperconverged infrastructure (HCI) system that integrates compute, storage and networking. This will be equipped with Red Hat OpenShift to support virtual machine and containerised workloads across cloud, edge and data centre environments.

Spectrum Fusion will integrate with Red Hat Advanced Cluster Management (ACM) to manage multiple Red Hat OpenShift clusters, and it will support tiering. IBM did not initially say how many tiers or what types would be supported; its answers are in the Q&A below.

Spectrum Fusion provides customers with a streamlined way to discover data from across the enterprise, IBM said. This may mean it has a global index of the data it stores.

IBM also said organisations will manage only a single copy of data – i.e. there is no need to create duplicate data when moving application workloads across the enterprise. The launch press release does not mention data movement, but IBM's Q&A answers below address it.

Spectrum Fusion will integrate with IBM's Cloud Satellite, a managed distributed cloud service that deploys and runs apps across on-premises, edge and cloud environments.

Q and A

We asked IBM some questions about Spectrum Fusion:

Blocks & Files: What is the data protection component in Spectrum Fusion?

IBM: For data protection, Spectrum Fusion primarily will leverage a combination of the technology within Spectrum Protect Plus and the storage platform layer based on Spectrum Scale.

Blocks & Files: How many storage tiers are supported?

IBM: Spectrum Fusion will support 1,000 tiers that can span across an enterprise and cloud, including flash, HDDs, cloud (S3) and tape.

Blocks & Files: Spectrum Fusion is being designed to provide customers with a streamlined way to discover data from across the enterprise. Does that mean it has some kind of global data index?

IBM: Spectrum Fusion implements a global file system with a single namespace, so it does have global awareness of file names and locations. We will support 8YB (yottabytes) of global data access and namespace that can span across the enterprise and cloud. The technology is based on the existing IBM Active File Management (AFM) technology currently available in Spectrum Scale. Existing NFS or S3 data from other vendors can be integrated into this global data access, allowing existing data sources to integrate into Spectrum Fusion environments.
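
The single-namespace idea can be pictured as one logical path tree mapped onto several physical backends. Here is a toy Python sketch of that mapping; the prefixes and mount points are invented for illustration, and real AFM works inside the file system rather than as a lookup layer like this:

```python
# Toy model of a single global namespace spanning several backends.
# All prefixes and mount points here are invented for illustration;
# AFM implements this inside the file system, not as a lookup table.
BACKENDS = {
    "/global/scale": "/gpfs/fs1",        # local Spectrum Scale file system
    "/global/nfs":   "/mnt/nfs_export",  # third-party NFS export
    "/global/s3":    "/mnt/s3_gateway",  # S3 bucket behind a gateway
}

def resolve(logical_path: str) -> str:
    """Map a path in the global namespace to its physical location."""
    for prefix, mount in BACKENDS.items():
        if logical_path.startswith(prefix + "/"):
            return mount + logical_path[len(prefix):]
    raise FileNotFoundError(logical_path)

print(resolve("/global/nfs/projects/data.csv"))
# -> /mnt/nfs_export/projects/data.csv
```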

Blocks & Files: Organisations will be able to manage only a single copy of data and no longer be required to create duplicate data when moving application workloads across the enterprise. How will they access the data from a remote site? Will the data be moved to their site? 

IBM: Yes, when accessed, for optimal performance. For remote access, Spectrum Fusion will automatically move/cache only the data needed to a remote site. With local caching at the remote site, the system can deliver high performance without the expense and security concerns of duplicating large volumes of data. Applications see the data as a “local” file, but the data is physically located on a remote system (Spectrum Scale, a remote NFS file system, or an S3 bucket).
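
IBM's description matches a read-through cache: the first access to a remote file pulls it into local storage, and every later read is served locally. A minimal Python sketch of the pattern, with hypothetical directories and whole-file copies standing in for AFM's block-level transport:

```python
import os
import shutil

# Hypothetical paths standing in for an AFM cache fileset and its remote
# "home" (another Spectrum Scale cluster, an NFS export, or an S3 bucket).
CACHE_DIR = "/gpfs/cache"
REMOTE_DIR = "/mnt/remote_home"

def read_file(name: str) -> bytes:
    """Read-through cache: serve locally if present, fetch once if not."""
    local_path = os.path.join(CACHE_DIR, name)
    if not os.path.exists(local_path):
        # First access: copy only the requested file from the remote home.
        # (AFM fetches at block granularity; whole files keep this simple.)
        os.makedirs(CACHE_DIR, exist_ok=True)
        shutil.copy2(os.path.join(REMOTE_DIR, name), local_path)
    # Subsequent reads are served at local speed.
    with open(local_path, "rb") as f:
        return f.read()
```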

ESS news

IBM’s ESS systems are clustered storage servers/arrays with Spectrum Scale pre-installed. The ESS 3000 is a 2U, 24-slot box fitted with NVMe flash drives and up to 260TB of usable capacity. It is a low-latency analysis node.

The high-end ESS 5000 capacity node has two POWER9 servers, each 2U high and running Spectrum Scale, and uses 10TB, 14TB or 16TB disk drives in either 5U92 standard-depth storage enclosures or 4U106 deep-depth enclosures. It scales up to 13.5PB with eight of the 4U106 enclosures.

ESS 3200.

The new ESS 3200 comes in a 2U box filled with NVMe drives and outputs 80GB/sec, a 100 per cent read performance boost over the ESS 3000. It supports up to eight InfiniBand HDR-200 or Ethernet-100 ports and can provide up to 367TB of storage capacity per node.

The ESS 5000 has been updated with a capacity increase and now scales up to 15.2PB.

All ESS systems are now equipped with streamlined containerised deployment capabilities, automated with the latest version of Red Hat Ansible. Both the ESS 3200 and ESS 5000 feature containerised system software, plus support for Red Hat OpenShift, the Kubernetes Container Storage Interface (CSI) with CSI snapshots and clones, and Windows, Linux and bare-metal environments.
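
CSI snapshot support means a Spectrum Scale-backed persistent volume can be snapshotted through the standard Kubernetes VolumeSnapshot API rather than vendor-specific tooling. A sketch using the official kubernetes Python client; the namespace, PVC name and snapshot class name below are illustrative placeholders, not IBM defaults:

```python
from kubernetes import client, config

# Assumes kubeconfig access to an OpenShift/Kubernetes cluster with the
# external-snapshotter CRDs installed and a CSI driver that supports snapshots.
config.load_kube_config()
api = client.CustomObjectsApi()

# VolumeSnapshot is a custom resource, so it is created via CustomObjectsApi.
snapshot = {
    "apiVersion": "snapshot.storage.k8s.io/v1",
    "kind": "VolumeSnapshot",
    "metadata": {"name": "scale-pvc-snap"},
    "spec": {
        # Illustrative names: substitute the snapshot class shipped with
        # your CSI driver and the PVC you want to snapshot.
        "volumeSnapshotClassName": "spectrum-scale-snapclass",
        "source": {"persistentVolumeClaimName": "scale-pvc"},
    },
}

api.create_namespaced_custom_object(
    group="snapshot.storage.k8s.io",
    version="v1",
    namespace="default",
    plural="volumesnapshots",
    body=snapshot,
)
```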

The 3200 and 5000 work with IBM Cloud Pak for Data, a containerised platform of integrated data and AI services, for integration with IBM Watson Knowledge Catalog (WKC) and Db2. They are also integrated with IBM Cloud Satellite.

Spectrum Fusion in HCI form will become available in the second half of the year and in software-only form in early 2022.