by Andy Patrizio

Nvidia unleashes new generation of GPU hardware

News Analysis

May 19, 2020 · 3 mins

Nvidia used to design chips for gamers, but with its latest hardware it has fully become an HPC and AI developer.


Nvidia, whose heritage lies in making chips for gamers, has announced its first new GPU architecture in three years, and it’s clearly designed to efficiently support the various computing needs of artificial intelligence and machine learning.

The architecture, called Ampere, and its first iteration, the A100 processor, surpass the performance of Nvidia’s current Volta architecture, whose V100 chip was in 94 of the top 500 supercomputers last November. The A100 has an incredible 54 billion transistors, 2.5 times as many as the V100.

Tensor performance, so vital in AI and machine learning, has been significantly improved. FP16 floating-point calculations are almost 2.5x as fast as on the V100, and Nvidia has introduced a new math mode called TF32. Nvidia claims TF32 can provide up to 10-fold speedups compared to single-precision floating-point math on Volta GPUs.
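To make the precision trade-off behind TF32 concrete, here is a minimal Python sketch. It assumes Nvidia's published TF32 layout (the 8-bit exponent of FP32 with a 10-bit mantissa), which is not spelled out in this article, and it simply truncates the extra mantissa bits; real hardware may round instead. The function name `truncate_to_tf32` is made up for illustration.

```python
import struct

FP32_MANTISSA_BITS = 23
# Assumption: TF32 keeps 10 mantissa bits, per Nvidia's published format
# description (not stated in the article).
TF32_MANTISSA_BITS = 10


def truncate_to_tf32(value: float) -> float:
    """Return value with its FP32 mantissa cut down to 10 bits (truncation)."""
    # Reinterpret the number as the 32 raw bits of an IEEE 754 single.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Zero the 13 low mantissa bits; sign and 8-bit exponent are untouched.
    dropped = FP32_MANTISSA_BITS - TF32_MANTISSA_BITS
    bits &= ~((1 << dropped) - 1) & 0xFFFFFFFF
    (result,) = struct.unpack("<f", struct.pack("<I", bits))
    return result


if __name__ == "__main__":
    for x in (1.0, 1.0 + 2**-10, 1.0 + 2**-11, 3.14159265):
        print(x, "->", truncate_to_tf32(x))
```

Values that need more than 10 mantissa bits lose their low-order detail, while the range of representable magnitudes stays the same as FP32. That is the bargain: less precision per value in exchange for faster tensor math.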

Lower precision matters, too. FP16 is useful for training, the compute-intensive part of machine learning, but overkill for inference, where the trained models are used to infer an outcome or result. So Nvidia added INT8 and INT4 support to the A100 to handle the simpler inference work and draw less power in the process. The result is a single chip that performs well at both training and inference.
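For readers wondering how inference can get away with 8-bit integers, the sketch below shows one common approach, symmetric linear quantization. It is an illustration only, not Nvidia's method; the function names and the sample weights are made up.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.8, -0.31, 0.05, 0.0]  # made-up example weights
    q, scale = quantize_int8(weights)
    print("int8 values:", q)
    print("restored   :", dequantize(q, scale))
```

Each restored value is off by at most half a quantization step, an error that trained models can often tolerate at inference time while the hardware moves and multiplies far fewer bits.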

Memory performance is also significantly improved thanks to 40GB of HBM2 memory on the package delivering a total of 1.6TB/second of bandwidth. And from the looks of the A100 package, Nvidia did what Fujitsu has done with its A64FX processor and put the HBM2 right next to the processor.

The A100 also sports a new feature called Multi-Instance GPU (MIG), where a single A100 can be partitioned into up to seven virtual GPUs, each of which gets its own dedicated allocation of cores, L2 cache, and memory controllers. Think of it as virtualization for a GPU.
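The idea of dedicated, non-overlapping allocations can be sketched with a toy model. This is not the MIG API: real MIG uses fixed instance profiles and placement rules that are not modeled here. The only number taken from the article is the seven-way limit; the function name `partition_gpu` and the tenant names are hypothetical.

```python
TOTAL_SLICES = 7  # "up to seven virtual GPUs", per the article


def partition_gpu(requests):
    """Give each named instance its own exclusive run of slice indices.

    requests maps an instance name to the number of slices it wants.
    Raises ValueError if a request is empty or the GPU is oversubscribed.
    """
    if any(count < 1 for count in requests.values()):
        raise ValueError("every instance needs at least one slice")
    if sum(requests.values()) > TOTAL_SLICES:
        raise ValueError("requests exceed the seven available slices")
    layout = {}
    next_free = 0
    for name, count in requests.items():
        # Slices are handed out once and never shared between instances.
        layout[name] = tuple(range(next_free, next_free + count))
        next_free += count
    return layout


if __name__ == "__main__":
    print(partition_gpu({"training": 3, "inference-a": 2, "inference-b": 2}))
```

Because no slice appears in two instances, one tenant's workload cannot eat into another's compute or cache, which is the property the dedicated allocations are meant to provide.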

Finally, Ampere comes with a new version of Nvidia’s high-speed interconnect, NVLink. The third generation nearly doubles the signaling rate, from 25.78Gbps per lane on NVLink 2 to 50Gbps on NVLink 3. As a result, each link needs only half as many lanes to reach the same speed, which in turn lets Nvidia double total throughput over the same number of lanes.
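The lane arithmetic is easier to follow with numbers. The sketch below uses the signaling rates quoted in the article; the lane counts per link (8 per direction on NVLink 2, 4 on NVLink 3) and the link counts per GPU (6 on V100, 12 on A100) are assumptions drawn from Nvidia's published specifications, not from this article. The results are raw signaling figures, so they come out slightly above the rounded numbers Nvidia quotes for Volta.

```python
def link_bandwidth_gbytes(rate_gbps_per_lane, lanes_per_direction):
    """Raw one-way bandwidth of a single link in GB/s (8 bits per byte)."""
    return rate_gbps_per_lane * lanes_per_direction / 8


# Rates come from the article; "lanes" and "links" are assumptions.
NVLINK2 = {"rate": 25.78, "lanes": 8, "links": 6}
NVLINK3 = {"rate": 50.0, "lanes": 4, "links": 12}


def total_bandwidth_gbytes(gen):
    """Raw bandwidth across all links of one GPU, both directions combined."""
    per_link = link_bandwidth_gbytes(gen["rate"], gen["lanes"])
    return per_link * gen["links"] * 2


if __name__ == "__main__":
    for name, gen in (("NVLink 2", NVLINK2), ("NVLink 3", NVLINK3)):
        per_link = link_bandwidth_gbytes(gen["rate"], gen["lanes"])
        print(name, "per link:", per_link, "GB/s,",
              "per GPU:", total_bandwidth_gbytes(gen), "GB/s")
```

Under these assumptions, a link carries roughly 25GB/s each way in both generations, but the newer one does it with half the lanes. Spending the freed lanes on twice as many links is what doubles the per-GPU total.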

Nvidia CEO Jensen Huang made the Ampere announcement via video from his kitchen during the virtual GPU Technology Conference (GTC).

New cards and servers are ready

Nvidia is wasting no time bringing the A100 to market. It says the A100 is already in production, and it announced the DGX A100 system. The box comes with eight A100 accelerators, 15TB of storage, a pair of 64-core AMD Epyc 7742 CPUs (you didn’t think they were going to use Intel processors, did you?), 1TB of RAM, and Mellanox HDR InfiniBand controllers.

The DGX A100 will set you back $199,000, but it packs 5 petaflops into a box the size of a small refrigerator, all dedicated to AI and machine learning.

Also, Nvidia’s $7 billion merger with Mellanox is already bearing fruit in the form of the EGX A100 card, a combination of an A100 Ampere-based GPU package along with a Mellanox ConnectX-6 Dx NIC on one card.

That provides the A100 with 200Gbps of networking without requiring any CPU processing and will allow A100 GPUs to talk to each other directly rather than go through the CPU. All of this means greater speed, since GPU-to-CPU communication adds steps and thus latency. The card can also connect to either InfiniBand or Ethernet fabrics. GPU-to-GPU communication over InfiniBand means HPC is about to see a major jump in performance.


Andy Patrizio is a freelance journalist based in southern California who has covered the computer industry for 20 years and has built every x86 PC he’s ever owned, laptops not included.

Andy writes the Data Center Explorer blog for Network World. His work has appeared in a variety of publications, including Tom's Guide, Wired, Dr. Dobbs Journal, Tech Target, Business Insider, and Data Center Knowledge. Earlier in his career, he held editorial positions at IT publications like InternetNews, PC Week and InformationWeek.

Andy holds a BA in Journalism from the University of Rhode Island.
