IBM and Nvidia announce turnkey AI system

IBM and Nvidia partner again to produce a tower-sized DGX server that combines high-end IBM hardware with Nvidia GPUs and is built specifically for AI.

IBM and Nvidia have further enhanced their hardware relationship by announcing a new turnkey AI solution that combines IBM Spectrum Scale scale-out file storage with Nvidia’s GPU-based AI server.

The name is a mouthful: IBM SpectrumAI with Nvidia DGX. It combines Spectrum Scale, a high-performance flash-based storage system, with Nvidia’s DGX-1 server, which is designed specifically for AI. In addition to its regular GPU cores, the V100 GPU used in the DGX-1 includes specialized units called Tensor Cores that are optimized to run machine learning workloads. The system comes as a rack of nine Nvidia DGX-1 servers, with a total of 72 Nvidia V100 Tensor Core GPUs.

Storage key to AI success

The box addresses an often-overlooked element of successful AI: storage. It’s well recognized that AI requires vast amounts of data, and GPUs have taken the lead in AI processing because of their massive parallelism.

The trick is getting those terabytes of data off the disk. It’s pretty much impossible with hard disks, but even flash isn’t normally fast enough to feed the GPUs. IBM claimed the storage scales “practically linearly” and offers 120GB/s of data throughput in a rack.

That’s due to Spectrum Scale’s cluster file management, which accelerates the random reads needed to feed multiple GPUs. Because GPUs usually consume data much faster than even flash storage can deliver it, storage is usually the bottleneck.
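
To put the rack-level figure in perspective, here is a back-of-the-envelope sketch in Python that divides the quoted 120GB/s across the nine servers and 72 GPUs mentioned above. It assumes the throughput is shared evenly, which is a simplification for illustration and not something IBM has stated; the function and constant names are hypothetical.

```python
# Figures quoted in the article for one SpectrumAI rack.
SERVERS_PER_RACK = 9
GPUS_PER_SERVER = 8            # a DGX-1 holds eight V100 GPUs (9 x 8 = 72)
RACK_THROUGHPUT_GB_S = 120.0   # IBM's claimed data throughput per rack


def rack_gpu_count(servers: int, gpus_per_server: int) -> int:
    """Total number of GPUs in the rack."""
    return servers * gpus_per_server


def even_share(total_gb_s: float, consumers: int) -> float:
    """Throughput per consumer, ASSUMING an even split (illustrative only)."""
    if consumers <= 0:
        raise ValueError("consumers must be positive")
    return total_gb_s / consumers


if __name__ == "__main__":
    gpus = rack_gpu_count(SERVERS_PER_RACK, GPUS_PER_SERVER)
    per_server = even_share(RACK_THROUGHPUT_GB_S, SERVERS_PER_RACK)
    per_gpu = even_share(RACK_THROUGHPUT_GB_S, gpus)
    print(f"GPUs per rack: {gpus}")
    print(f"Throughput per server: {per_server:.2f} GB/s")
    print(f"Throughput per GPU: {per_gpu:.2f} GB/s")
```

Under that even-split assumption, each DGX-1 would see roughly 13GB/s and each GPU a little under 2GB/s, which illustrates why aggregate storage bandwidth matters as more GPUs are added.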

The two companies boast that the system offers “the highest performance in any tested converged system” while supporting data science practices and AI data pipelines, including data prep, training, inference, and archival.

Spectrum Discover, a part of SpectrumAI, makes data accessible via data cataloging and indexing. And thanks to an API, SpectrumAI can reuse the workflows created by Spectrum Discover, reducing time spent on data prep.

The Nvidia DGX software stack is designed for maximum GPU-accelerated training performance, using Nvidia’s new RAPIDS framework to accelerate data science workflows. The whole thing can be deployed and used immediately with no complex installation required.

This is the latest development in the IBM/Nvidia partnership, which has seen IBM adopt Nvidia’s high-speed interconnect, NVLink, in its Power servers and led to the creation of Summit, the world’s fastest supercomputer, which uses IBM Power9 processors and Nvidia Tesla GPUs.

The two companies also helped build the two fastest supercomputers on the biannual Top 500 supercomputer list: Summit and Sierra, both used by the Department of Energy.
