Nvidia announces server 'superchips,' with and without GPUs

Nvidia's Grace superchips cater to AI as well as legacy high-bandwidth applications not optimized for GPUs.

At its GPU Technology Conference (GTC) last year, Nvidia announced it would come out with its own server chip, called Grace, based on the Arm Neoverse v9 server architecture. Details were scant at the time, but this week Nvidia filled them in, and they are remarkable.

With Grace, customers have two options, both dubbed superchips by Nvidia. The first is the Grace Hopper Superchip, which was formally introduced last year but only broadly described. It consists of a 72-core CPU and a Hopper H100 GPU tightly connected by Nvidia’s new high-speed NVLink-C2C chip-to-chip interconnect, which offers 900GB/s of bandwidth.

The second, announced this week, is the Grace CPU Superchip, which has no GPU. Instead, it has two 72-core CPU chips tied together via NVLink. Even without the H100 GPU, the Grace CPU Superchip posts some strong benchmark numbers. Nvidia claims SPECrate2017_int_base performance more than 1.5x higher than that of the dual high-end AMD Epyc “Rome” generation processors already shipping with Nvidia's DGX A100 server.

The two superchips will serve two different markets, according to Paresh Kharya, senior director of product management and marketing at Nvidia. The Grace Hopper Superchip is intended to address the giant scale of AI and HPC, with a focus on the bottleneck of CPU system memory, he said.

“Bandwidth is limited, and when you connect the CPU and GPU in a traditional server, the flow of data from the system memory to the GPU is bottlenecked by the PCIe slot,” he said. “So by putting the two chips together and interconnecting them with our NVLink interconnect, we can unblock that memory.”
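
To put the bottleneck Kharya describes in rough numbers, here is a back-of-the-envelope sketch in Python. The 900GB/s figure is the NVLink-C2C rate quoted above. The PCIe figure is an assumption that does not come from Nvidia's announcement: roughly 128GB/s for a PCIe Gen5 x16 slot, counting both directions. The function and constant names are illustrative, and the arithmetic is idealized, ignoring protocol overhead and latency.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized time to move size_gb gigabytes at a sustained bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


# Figure quoted in the article for NVLink-C2C.
NVLINK_C2C_GB_S = 900.0

# ASSUMPTION (not from the article): approximate PCIe Gen5 x16 bandwidth,
# both directions combined.
PCIE_ASSUMED_GB_S = 128.0


def speedup(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """Ratio of the two link bandwidths."""
    return fast_gb_per_s / slow_gb_per_s


if __name__ == "__main__":
    size_gb = 900.0  # hypothetical working set, chosen for round numbers
    print("NVLink-C2C:", transfer_seconds(size_gb, NVLINK_C2C_GB_S), "s")
    print("PCIe (assumed):", transfer_seconds(size_gb, PCIE_ASSUMED_GB_S), "s")
    print("Ratio:", speedup(NVLINK_C2C_GB_S, PCIE_ASSUMED_GB_S))
```

Under those assumptions, the chip-to-chip link moves the same data about seven times faster than the PCIe slot would. Real-world gains depend on the workload and on how much of that peak bandwidth an application can sustain.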
