IBM AI storage supports Nvidia’s A100 GPU powerhouse

IBM’s Storage for Data and AI portfolio now supports the recently announced Nvidia DGX A100, which is designed for analytics and AI workloads.

David Wolford, IBM worldwide cloud storage portfolio marketing manager, wrote last week in a company blog: “IBM brings together the infrastructure of both file and object storage with Nvidia DGX A100 to create an end-to-end solution. It is integrated with the ability to catalog and discover (in real time) all the data for the Nvidia AI solution from both IBM Cloud Object Storage and IBM Spectrum Scale storage.”

Big Blue positions IBM Storage for Data and AI as components for a three-stage AI project pipeline: Ingest, Transform, and Analyse/Train. There are five products:

  • Cloud Object Storage (COS) – data lake storage
  • Spectrum Discover – file cataloguing and indexing software
  • Spectrum Scale – scale-out, parallel-access file storage software
  • ESS 3000 – an all-flash NVMe drive array with containerised Spectrum Scale software installed on its Linux OS and 24 SSD bays in a 2U cabinet
  • Spectrum LSF (Load Sharing Facility) – a workload management and policy-driven job scheduling system for high-performance computing

IBM’s view of its storage and the AI project pipeline

IBM is updating a Solutions Blueprint for Nvidia to include support for the DGX A100. The new server uses Nvidia A100 GPUs, which Nvidia claims are up to 20 times faster at AI work than the Tesla V100s used in the prior DGX-2.

Nvidia DGX A100.

The IBM blueprint recommends COS to store ingested data and serve as a data lake. Spectrum Discover indexes this data and adds metadata tags to its files. LSF manages AI project workflows; triggered by Spectrum Discover, it moves selected data from COS to the ESS 3000 running Spectrum Scale. There, the data feeds the GPUs in the DGX A100 as AI models are developed and trained.
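To make that flow concrete, here is a minimal Python sketch of the three stages, assuming an S3-compatible COS endpoint (COS exposes an S3-compatible API) and LSF’s standard bsub job-submission command. The endpoint URL, credentials, bucket name, tag values, queue name and train.py script are hypothetical placeholders, and plain S3 object tagging stands in for Spectrum Discover’s cataloguing step.

```python
# Sketch of the Ingest / Transform / Analyse-Train pipeline described above.
# Assumptions: COS reached over its S3-compatible API; LSF's `bsub` on PATH;
# bucket, credentials, queue name and train.py are hypothetical placeholders.
import subprocess

import boto3

# Ingest: connect to the COS data lake over its S3-compatible API.
cos = boto3.client(
    "s3",
    endpoint_url="https://s3.us.cloud-object-storage.appdomain.cloud",  # example endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Transform: tag ingested objects so a catalogue (Spectrum Discover in IBM's
# blueprint) can select them for training. Standard S3 object tagging is used
# here as a stand-in for Discover's own metadata tagging.
for obj in cos.list_objects_v2(Bucket="ingest-bucket").get("Contents", []):
    cos.put_object_tagging(
        Bucket="ingest-bucket",
        Key=obj["Key"],
        Tagging={"TagSet": [{"Key": "project", "Value": "model-v1"}]},
    )

# Analyse/Train: submit a GPU training job through LSF's bsub scheduler; in
# IBM's blueprint, LSF also stages the selected data onto the ESS 3000 first.
subprocess.run(
    ["bsub", "-q", "gpu", "-gpu", "num=8", "python", "train.py"],
    check=True,
)
```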

Other storage vendors, such as Dell, Igneous, NetApp, Pure Storage and VAST Data, will also support the DGX A100. Some may try to cover the AI pipeline with a single storage array.