Dell has enhanced PowerScale scale-out clustered file system arrays with full stack components for AI factory workloads.
We have written about VAST Data’s AI stack approach, and NetApp’s AI architectural development project for ONTAP. Now Dell is producing its own comprehensive AI stack offering, from PowerScale hardware and software to its Data Lakehouse with wide-scale data source ingest, vectorization, and metadata handling. This accompanies its PowerEdge XE9712 (Nvidia GB200) and M7725 (liquid-cooled AMD Epyc) servers and OCP-based, liquid-cooled Integrated Rack IR7000 systems, which provide the compute power and rack housing for its AI factory offerings.
Dell’s Arthur Lewis, president, Infrastructure Solutions Group, stated: “Today’s datacenters can’t keep up with the demands of AI, requiring high density compute and liquid cooling innovations with modular, flexible and efficient designs. These new systems deliver the performance needed for organizations to remain competitive in the fast-evolving AI landscape.”
PowerScale systems are effectively PowerEdge servers with directly attached storage, running the OneFS operating system. Systems can be clustered in configurations of 3 to 252 nodes. PowerScale now supports 200 Gbps Ethernet and InfiniBand front-end networking, doubling the previous networking speed and increasing throughput by up to 63 percent.
OneFS now supports 61 TB QLC (4 bits/cell) SSDs to drive up capacity, nearly doubling the previous maximum drive capacity of 30.72 TB. It also gains a metadata export feature for the Data Lakehouse, plus a software enhancement that simplifies exporting metadata to an Elasticsearch database, making querying more efficient. Metadata across geo-distributed clusters can be combined to provide a global view.
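To illustrate what querying that exported metadata might look like, here is a minimal sketch using the Elasticsearch Python client. The index name ("powerscale-metadata") and field names (path, size, mtime, cluster) are assumptions for illustration, not a documented Dell schema.

```python
# Sketch: query exported PowerScale file metadata held in Elasticsearch.
# Index and field names are illustrative assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Find large files modified in the last week, grouped by source cluster
# to give a combined view across geo-distributed clusters.
response = es.search(
    index="powerscale-metadata",
    query={
        "bool": {
            "filter": [
                {"range": {"size": {"gte": 10 * 1024**3}}},   # >= 10 GiB
                {"range": {"mtime": {"gte": "now-7d/d"}}},    # last 7 days
            ]
        }
    },
    aggs={"by_cluster": {"terms": {"field": "cluster"}}},
    size=20,
)

for hit in response["hits"]["hits"]:
    print(hit["_source"]["path"], hit["_source"]["size"])
```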
The Data Lakehouse can combine this metadata with additional data ingested from other federated sources. It will support open table formats like Iceberg and will be extended to support vector databases. There will also be the ability to extract content metadata from files to augment its file-level metadata and enable full context search.
The Elasticsearch indexes and open table format scans can be natively queried by the Data Lakehouse. The aim is to make query responses more accurate by combining SQL, vector, lexical, and semantic search.
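A federated SQL query of this kind could look like the following sketch, which assumes a Trino-compatible endpoint and illustrative catalog, schema, and table names (iceberg.analytics.documents, elastic.metadata.files); none of these are documented Dell identifiers.

```python
# Sketch: one SQL query joining an Iceberg table with exported file metadata.
# Endpoint, catalogs, and table names are assumptions for illustration.
import trino

conn = trino.dbapi.connect(
    host="lakehouse.example.com",
    port=443,
    user="analyst",
    http_scheme="https",
    catalog="iceberg",
    schema="analytics",
)

cur = conn.cursor()
cur.execute("""
    SELECT d.doc_id, d.title, m.path, m.size
    FROM iceberg.analytics.documents AS d
    JOIN elastic.metadata.files AS m
      ON d.source_path = m.path
    WHERE m.mtime > current_date - interval '30' day
    LIMIT 100
""")
for row in cur.fetchall():
    print(row)
```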
A forthcoming Dell document loader for Nvidia NeMo services and retrieval-augmented generation (RAG) frameworks is designed to help customers improve Data Lakehouse data ingestion time and decrease compute and GPU cost.
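Dell has not published the loader's interface, so the following is only a sketch of the general shape such a loader could take: pairing file contents with PowerScale metadata before a RAG framework chunks, embeds, and indexes them. All names here are hypothetical.

```python
# Hypothetical sketch of a document loader feeding a RAG ingestion pipeline.
# Class and field names are assumptions, not Dell's published API.
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterator


@dataclass
class Document:
    text: str
    metadata: dict


class PowerScaleDocumentLoader:
    """Pairs file contents from a PowerScale mount with exported metadata."""

    def __init__(self, mount_path: str, metadata: Dict[str, dict]):
        self.mount_path = Path(mount_path)
        self.metadata = metadata  # e.g. pulled from the metadata export

    def lazy_load(self) -> Iterator[Document]:
        for path in self.mount_path.rglob("*.txt"):
            yield Document(
                text=path.read_text(errors="ignore"),
                metadata=self.metadata.get(str(path), {"path": str(path)}),
            )
```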
Dell says the enhancements to the Dell Data Lakehouse data management platform save customers time and improve operations with disaster recovery, automated schema discovery, comprehensive management APIs, and self-service full stack upgrades.
The company has announced new services to help PowerScale-Data Lakehouse customers, including Optimization Services for Data Cataloging and Implementation Services for Data Pipelines.
Dell Generative AI Solutions with Intel provide jointly engineered, tested, validated, preconfigured, and flexible platforms for AI deployment. They feature the PowerEdge XE9680 server with Intel’s Gaudi 3 AI accelerators, plus Dell storage, networking, services, and an open source software stack. These are aimed at content creation, digital assistants, design and data creation, code generation, and other GenAI workloads.
Availability
- The Dell IR7000 will be globally available Q1 calendar 2025.
- The PowerEdge XE9712 is sampling with select customers now.
- The PowerEdge M7725 will be globally available Q1 calendar 2025.
- PowerScale updates will be available in Q4 calendar 2024.
- Data Lakehouse updates will be available in 1H calendar 2025.
- Dell’s Generative AI Solutions with Intel will be available in Q4 calendar 2024.
Dell has more information here.