Nvidia announces a 2023 launch for an HPC CPU named Grace

At its GPU Technology Conference, Nvidia talks about its first data-center CPU that is meant to fill a hole in its server processor line, but details are scant.


Nvidia kicked off its GPU Technology Conference (GTC) 2021 with a bang: A new CPU for high performance computing (HPC) clients--its first-ever data-center CPU--called Grace.

Based on the Arm Neoverse architecture, Nvidia claims Grace will deliver up to 10 times the performance of today's fastest servers on complex artificial-intelligence and HPC workloads.

But that’s comparing a future product against today’s hardware. Grace won’t ship until 2023, and in those two years competitors will undoubtedly up their game, too. Then again, no one has ever accused CEO Jen-Hsun Huang of being subdued.

Nvidia made a point of saying that Grace is not intended to compete head-to-head against Intel's Xeon and AMD's EPYC processors. Instead, Grace is a niche product, designed specifically to be tightly coupled with Nvidia's GPUs to remove bottlenecks in complex AI and HPC applications.

Nvidia is in the process of acquiring Arm Holdings, a deal that should close later this year if all objections are overcome.

"Leading-edge AI and data science are pushing today’s computer architecture beyond its limits—processing unthinkable amounts of data," said Huang. "Using licensed Arm IP, Nvidia has designed Grace as a CPU specifically for giant-scale AI and HPC. Coupled with the GPU and DPU, Grace gives us the third foundational technology for computing, and the ability to re-architect the data center to advance AI. Nvidia is now a three-chip company."

Nvidia does have server offerings, the DGX series, which use AMD EPYC CPUs (you didn’t think it was going to use Intel, did you?) to boot the system and coordinate the Ampere GPUs. EPYC is great for running databases, but it’s a general-purpose compute processor, lacking the kind of high-speed I/O and deep-learning optimizations that Nvidia needs.

Nvidia didn’t give a lot of detail, except to say Grace would be built on a future version of the Arm Neoverse core using a 5-nanometer manufacturing process, which means it will be made by TSMC. Grace will also use Nvidia’s homegrown NVLink high-speed interconnect between the CPU and GPU. A new version planned for 2023 will offer over 900GB/s of bandwidth between the CPU and GPU. That’s much faster than the PCI Express links AMD uses for CPU-GPU communication.
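To put that bandwidth gap in rough perspective, here is a back-of-the-envelope comparison. The 900GB/s figure is from Nvidia's announcement; the PCIe numbers are approximate per-direction rates for a x16 link and are assumptions, not figures from the article:

```python
# Rough CPU-GPU interconnect bandwidth comparison (GB/s, one direction).
# NVLink figure: Nvidia's 2023 claim. PCIe figures: approximate x16 rates.
links_gbps = {
    "NVLink (2023, per Nvidia)": 900,
    "PCIe 4.0 x16 (approx.)": 32,
    "PCIe 5.0 x16 (approx.)": 64,
}

baseline = links_gbps["PCIe 4.0 x16 (approx.)"]
for name, bandwidth in links_gbps.items():
    ratio = bandwidth / baseline
    print(f"{name}: {bandwidth} GB/s ({ratio:.1f}x PCIe 4.0 x16)")
```

By this rough math, the promised NVLink rate is more than an order of magnitude beyond a PCIe 4.0 x16 link, which is the kind of gap Nvidia is pointing at when it talks about removing CPU-GPU bottlenecks.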

Two supercomputing customers

Even though Grace isn’t shipping until 2023, Nvidia already has two supercomputer customers for the processor. The Swiss National Supercomputing Centre (CSCS) and Los Alamos National Laboratory announced today that they’ll be ordering supercomputers based on Grace. Both systems will be built by HPE’s Cray subsidiary (who else?) and are set to come online in 2023.

CSCS’s system, called Alps, will replace its current Piz Daint system, a Xeon and Nvidia P100 cluster. CSCS claims Alps will offer 20 exaflops of AI performance, which would be incredible if it delivers; right now the best we have is Japan’s Fugaku, at roughly one exaflop.

Arm’s stumbles in the data center

Overall, this is a smart move on Nvidia’s part, because general-purpose Arm server processors have not done well. Nvidia has its own failure in the data-center CPU market: a decade ago it launched Project Denver, but it never got out of the labs. Denver was a general-purpose CPU, whereas Grace is highly vertical and specialized.


Copyright © 2021 IDG Communications, Inc.
