
VMware, Nvidia offer GPU-powered AI in virtual machines

News Analysis
Mar 15, 2021 | 3 mins

Expanded partnership aims to make it easier for enterprises to run GPU-accelerated AI applications.


VMware and Nvidia have expanded their alliance to support Nvidia GPU-based applications on VMware’s new vSphere 7 Update 2. The upgraded version of vSphere 7 will support the new Nvidia AI Enterprise offering, a suite of enterprise-grade AI tools and frameworks that enables GPU-accelerated applications to run in VMware virtual machines and containers.

VMware’s vSphere 7 U2 adds support for Nvidia’s A100 Tensor Core GPU and its Multi-Instance GPU (MIG) feature, which partitions the cores on a single A100 so that multiple users can share it, in much the same way that VMware divides CPU cores among multiple users.
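To make the partitioning idea concrete, here is a minimal sketch that checks whether a set of requested GPU instances could share one A100. It is a simplified model, not an Nvidia or VMware API. The profile names and slice counts are assumptions taken from Nvidia's published MIG profiles for the 40GB A100 and are not stated in this article, the function name `fits_on_one_gpu` is hypothetical, and real MIG placement rules are stricter than a simple slice count.

```python
# Simplified model of Multi-Instance GPU (MIG) partitioning on one A100 40GB.
# Assumption: the GPU exposes 7 compute slices and 8 memory slices of 5 GB each.
TOTAL_COMPUTE_SLICES = 7
TOTAL_MEMORY_SLICES = 8

# Assumed profile table: name -> (compute slices, memory slices).
MIG_PROFILES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}


def fits_on_one_gpu(requested):
    """Return True if the requested profiles fit within one GPU's slices.

    This only sums slice counts; it ignores the placement constraints
    that real hardware enforces, so treat a True result as necessary
    but not sufficient.
    """
    compute = sum(MIG_PROFILES[name][0] for name in requested)
    memory = sum(MIG_PROFILES[name][1] for name in requested)
    return compute <= TOTAL_COMPUTE_SLICES and memory <= TOTAL_MEMORY_SLICES


if __name__ == "__main__":
    # Seven small instances, one per user.
    print(fits_on_one_gpu(["1g.5gb"] * 7))
    # Two large instances that together exceed the compute slices.
    print(fits_on_one_gpu(["4g.20gb", "4g.20gb"]))
```

Under these assumptions, seven of the smallest instances fit on one card, while two `4g.20gb` instances do not, because together they would need eight compute slices.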

This means that AI workloads can now run on VMware’s virtualized platform. Until now, they have typically run on bare-metal servers. AI is nothing if not performance-intensive, and a bare-metal environment delivers the full power of the hardware rather than sharing it in a virtual, multi-tenant scenario.

Nvidia claims in a blog post announcing the new software that AI Enterprise enables virtual workloads to run at near bare-metal performance on vSphere. AI workloads will be able to scale across multiple nodes, allowing even the largest deep-learning training models to run on VMware Cloud Foundation.

With this capability, developers can scale CUDA applications, AI frameworks, models, and SDKs out across multiple nodes on the vSphere platform. The AI Enterprise platform is designed to be deployed on Nvidia-certified systems from Dell Technologies, Hewlett Packard Enterprise (HPE), Supermicro, Gigabyte, and Inspur.

In addition to the A100 support, vSphere 7 U2 adds the ability to employ vSphere Lifecycle Manager to see images and manage instances of vSphere running with Tanzu, VMware’s distribution of Kubernetes. vSphere 7 U2 comes with integrated application load balancing as well as better support for private and third-party container registries.

vSAN 7 U2 enhancements

In addition to vSphere upgrades, VMware also announced the availability of VMware vSAN 7 Update 2 with several new and enhanced features. First up is a new version of its hyperconverged infrastructure (HCI) software, HCI Mesh; the new release builds upon the software-based approach for disaggregation of compute and storage resources initially released in vSAN 7 Update 1.

The new release supports a broader set of use cases, particularly for customers looking to increase resource efficiency beyond their existing vSAN environment. It lets compute-only (non-HCI) clusters remotely use storage from a vSAN cluster within the data center, so customers can scale compute and storage independently.

vSAN 7 Update 2 also introduces new capabilities to better support various physical topologies. This includes integrated DRS awareness of stretched cluster configurations for more consistent performance in failback, as well as vSAN file services support for stretched clusters and 2-node clusters.

VMware also continues to add features aimed at vSAN performance, including vSAN over Remote Direct Memory Access (RDMA) and enhancements to RAID 5/6 erasure coding that improve CPU efficiency and application performance for certain workloads.
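For readers unfamiliar with why erasure coding matters, the trade-off is capacity versus compute: it stores less redundant data than mirroring but needs CPU work to calculate parity, which is why CPU improvements are relevant here. The sketch below works through the capacity arithmetic. The fragment layouts (3 data + 1 parity for RAID 5, 4 data + 2 parity for RAID 6, and two full copies for mirroring) are assumptions not stated in this article, and the function name `raw_capacity_needed` is hypothetical.

```python
def raw_capacity_needed(usable_gb, data_fragments, parity_fragments):
    """Raw storage consumed to hold `usable_gb` of data.

    Each stripe holds `data_fragments` pieces of real data plus
    `parity_fragments` pieces of redundancy, so raw usage grows by
    the ratio of total fragments to data fragments.
    """
    total_fragments = data_fragments + parity_fragments
    return usable_gb * total_fragments / data_fragments


if __name__ == "__main__":
    usable = 100  # GB of application data

    # Assumed layouts, labeled for illustration only.
    layouts = {
        "RAID 1 mirroring (1 copy + 1 copy)": (1, 1),
        "RAID 5 erasure coding (3 + 1)": (3, 1),
        "RAID 6 erasure coding (4 + 2)": (4, 2),
    }
    for name, (data, parity) in layouts.items():
        raw = raw_capacity_needed(usable, data, parity)
        print(f"{name}: {raw:.1f} GB raw")
```

Under these assumptions, 100 GB of data consumes about 133 GB with RAID 5 and 150 GB with RAID 6, compared with 200 GB for a two-copy mirror.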

Finally, vSAN 7 Update 2 includes FIPS 140-2 validation of the cryptographic module for data-in-transit encryption to meet strict government requirements.

Andy Patrizio is a freelance journalist based in southern California who has covered the computer industry for 20 years and has built every x86 PC he’s ever owned, laptops not included.

The opinions expressed in this blog are those of the author and do not necessarily represent those of ITworld, Network World, its parent, subsidiary or affiliated companies.