IBM says mainframes and AI are essential partners

Big Blue wants the tech industry to use its mainframes for AI workloads.

A 28-page IBM report, “Mainframes as mainstays of digital transformation,” produced by its Institute for Business Value, found that 79 percent of IT executives agree that mainframes are essential for enabling AI-driven innovation. It states that, after six decades of evolution, mainframes are mainstays, storing and processing vast amounts of business-critical data. As organizations embark on AI-driven digital transformation journeys, mainframes will play a critical role in extending the value of data.

IBM’s concern seems to be that mainframe users should not just assume modern, generative AI workloads are for the public cloud and/or x86 and GPU servers in an organization’s data centers. Mainframes have a role to play as well.

The report, which we saw before publication, starts from a hybrid mainframe-public cloud-edge approach, with workloads put on the most appropriate platform. AI can be used to accelerate mainframe app modernization, enhance transactional workloads and improve mainframe operations. The report says “Combining on-premises mainframes with hyperscalers can create an integrated operating model that enables agile practices and interoperability between applications.”

It suggests mainframe users “leverage AI for in-transaction insights to enhance business use cases including fraud detection, anti-money laundering, credit decisioning, product suggestion, dynamic pricing, and sentiment analysis.”

Mainframe performance can improve AI-powered, rules-based credit scoring. The report cites a North American bank that, using public cloud processing, could score only 20 percent of its credit card transactions, at 80 ms per transaction. By moving the application onto its mainframe, the bank was able to score 100 percent of transactions, achieving 15,000 transactions per second at 2 ms per transaction and saving an estimated $20 million a year in fraud prevention spend.
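As a quick sanity check on those figures (our arithmetic, not the report's), Little's law says the number of requests in flight equals throughput times latency, so the quoted rates imply only around 30 concurrent scoring requests:

```python
# Back-of-the-envelope check of the figures quoted above (our arithmetic).
cloud_latency_s = 0.080      # 80 ms per transaction, scoring 20% of volume
mainframe_latency_s = 0.002  # 2 ms per transaction, scoring 100% of volume
throughput_tps = 15_000      # transactions scored per second on the mainframe

# Little's law: concurrent in-flight requests = throughput * latency.
in_flight = throughput_tps * mainframe_latency_s
print(f"~{in_flight:.0f} scoring requests in flight at any instant")  # ~30

# The 40x latency drop (80 ms -> 2 ms) is what makes 100% coverage feasible.
print(f"Latency reduction: {cloud_latency_s / mainframe_latency_s:.0f}x")
```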

Mainframes with embedded on-chip AI accelerators “can scale to process millions of inference requests per second at extremely low latency, which is particularly crucial for transactional AI use cases, such as detecting payment fraud.” IBM says “traditional AI may be used to assess whether a bank payment is fraudulent, and LLMs (large language models) may be applied to make the prediction more accurate.”

This is IBM’s ensemble AI approach: combining existing machine learning models with newer LLMs.
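As a minimal sketch of what such an ensemble might look like, assuming a fast traditional classifier scores every transaction and only borderline cases are escalated to an LLM. The model logic and the `llm_assess` helper here are hypothetical stand-ins for illustration, not IBM's implementation:

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float
    merchant_risk: float  # 0.0 (trusted) .. 1.0 (risky)

def ml_fraud_score(tx: Transaction) -> float:
    """Stand-in for a traditional ML model (e.g. gradient-boosted trees)
    running on the on-chip accelerator: fast, applied to every transaction."""
    return min(1.0, 0.6 * tx.merchant_risk + 0.4 * min(tx.amount / 10_000, 1.0))

def llm_assess(tx: Transaction) -> float:
    """Hypothetical LLM second opinion, applied only to borderline cases.
    A real system would call an LLM with full transaction context."""
    return 0.9 if tx.merchant_risk > 0.5 and tx.amount > 5_000 else 0.1

def ensemble_decision(tx: Transaction, low: float = 0.2, high: float = 0.8) -> bool:
    score = ml_fraud_score(tx)
    if score < low:
        return False             # clearly legitimate: the ML model alone decides
    if score > high:
        return True              # clearly fraudulent: the ML model alone decides
    return llm_assess(tx) > 0.5  # borderline: escalate to the LLM

print(ensemble_decision(Transaction(amount=8_000, merchant_risk=0.7)))  # True
```

Gating the LLM behind the cheap model keeps the expensive call off the hot path for the vast majority of transactions, which is what makes the latency numbers above plausible.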

AI can be used to improve mainframe management. The report found that “74 percent of executives cite the importance of integrating AI into mainframe operations and transforming system management and maintenance. AI-powered automation, predictive analytics, self-healing, and self-tuning capabilities can proactively detect and prevent issues, optimize workflows, and improve system reliability.”
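To make the predictive, self-healing idea concrete, here is a minimal sketch, our illustration rather than anything IBM ships: a rolling z-score over a system metric flags readings that drift far from recent behavior, at which point a remediation runbook could be triggered.

```python
from collections import deque
from statistics import mean, stdev

class MetricWatcher:
    """Flag metric readings that drift beyond `threshold` standard
    deviations from a rolling window of recent values."""
    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.readings = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        anomalous = False
        if len(self.readings) >= 10:  # wait for enough history
            mu, sigma = mean(self.readings), stdev(self.readings)
            anomalous = sigma > 0 and abs(value - mu) / sigma > self.threshold
        self.readings.append(value)
        return anomalous

watcher = MetricWatcher()
for cpu in [42, 41, 43, 40, 42, 41, 44, 43, 42, 41, 95]:  # last reading spikes
    if watcher.observe(cpu):
        print(f"Anomaly at CPU {cpu}%: trigger remediation runbook")
```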

Mainframes can use AI for monitoring, analyzing, detecting, and responding to cyber threats. Generative AI LLMs and code assistants can also be used to speed work with older coding languages, such as COBOL, its conversion to Java, and JCL development, thereby “closing mainframe skills gaps by enabling developers to modernize or build applications faster and more efficiently.”
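As a rough illustration of that conversion workflow (our sketch; `ask_code_assistant` is a hypothetical stand-in, not a real IBM interface), a modernization pipeline might prompt a code assistant paragraph by paragraph:

```python
# Hypothetical sketch of prompting a code assistant for COBOL-to-Java
# conversion; the assistant call itself is a stand-in.
def build_conversion_prompt(cobol_paragraph: str) -> str:
    return (
        "Translate this COBOL paragraph to idiomatic Java, preserving "
        "decimal arithmetic semantics (use BigDecimal where needed):\n\n"
        + cobol_paragraph
    )

cobol = """COMPUTE-INTEREST.
       COMPUTE TOTAL-DUE ROUNDED = PRINCIPAL * (1 + RATE / 100)."""

prompt = build_conversion_prompt(cobol)
# java_source = ask_code_assistant(prompt)  # stand-in for the assistant call
print(prompt)
```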

IBM is taking an AI processing offload approach with AI-specific DPUs (Data Processing Units) for its next-generation mainframe, the z16's successor, due in 2025. It will be equipped with up to 32 Telum II processors, each with on-chip AI inferencing acceleration running at 24 TOPS. A Spyre accelerator will add 32 AI accelerator cores and 1 GB of DRAM, with performance similar to the Telum II on-chip AI accelerator. Up to eight Spyre accelerators can be used in addition to the Telum II units on the next mainframe generation.

Big Blue is not talking about adding GPUs to its mainframe architecture, though. Inferencing workloads will run effectively on the mainframe, but AI training workloads will not. We can expect IBM to provide mainframe vectorization and vector database capabilities to support retrieval-augmented generation (RAG) in inferencing workloads.
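A minimal sketch of the RAG pattern, our illustration of the general technique rather than IBM's design: records are embedded into vectors, the closest matches to a query are retrieved from a vector store, and those matches ground the LLM's answer. The `embed` function here is a toy stand-in for a real embedding model, and the final generation step is left as a prompt:

```python
import math

def embed(text: str) -> list[float]:
    """Toy embedding stand-in: a real system would call an embedding
    model; here we hash character bigrams into a small vector."""
    vec = [0.0] * 16
    for a, b in zip(text.lower(), text.lower()[1:]):
        vec[(ord(a) * 31 + ord(b)) % 16] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are closest to the query
    (dot product works as cosine similarity on normalized vectors)."""
    q = embed(query)
    scored = sorted(docs, key=lambda d: -sum(x * y for x, y in zip(q, embed(d))))
    return scored[:k]

docs = [
    "Payments over $10,000 to new merchants require manual review.",
    "Card-present transactions under $50 are auto-approved.",
    "Wire transfers to sanctioned regions must be blocked.",
]
context = retrieve("Should a $12,000 payment to a new merchant be reviewed?", docs)
prompt = f"Context: {context}\nAnswer using only the context above."
print(prompt)  # this grounded prompt would then be passed to an LLM
```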

For this commentator, adding GPUs to a mainframe would be a kind of Holy Grail, as it would open the door to running AI training workloads on this classic big iron platform. Perhaps GPU co-processors will be a z17-generation thing.

Get the report here and check out an announcement blog here.