Nvidia came out on top in a benchmark meant to measure the performance of training machine-learning models. MLCommons, a group that develops benchmarks for AI technology, revealed the results of a new test that measures how quickly systems can train the kind of algorithms used to create chatbots like ChatGPT.
MLPerf Training 3.0 is meant to provide an industry-standard set of benchmarks for evaluating ML model training. Model training can be a rather lengthy process, taking weeks or even months depending on the size of the data set. That consumes an awful lot of power, so training can get expensive.
The MLPerf Training benchmark suite is a full series of tests that stress machine-learning models, software, and hardware for a broad range of applications. This round found performance gains of up to 1.54x compared to just six months ago and between 33x and 49x compared to the first round in 2018.
MLCommons has been updating its MLPerf Training benchmarks to keep pace with the rapid growth of AI and ML. The latest revision, Training version 3.0, adds testing for training large language models (LLMs), specifically GPT-3, the LLM used in ChatGPT. This is the first revision of the benchmark to include such testing.
All told, the test yielded 250 performance results from 16 vendors’ hardware, including systems from Intel, Lenovo and Microsoft Azure. Notably absent from the test was AMD, which has a highly competitive AI accelerator in its Instinct line. (AMD did not respond to queries as of press time.)
Also notable is that Intel did not submit its Xeon or GPU Max and instead opted to test its Gaudi2 dedicated AI processor from Habana Labs. Intel told me it chose Gaudi2 because it is purpose-designed for high-performance, high-efficiency deep learning training and inference and is particularly able to manage generative AI and large language models, including GPT-3.
Using a cluster of 3,584 H100 GPUs built in partnership with AI cloud startup CoreWeave, Nvidia posted a training time of 10.94 minutes on the new GPT-3 test. Habana Labs took 311.945 minutes, but with a much smaller system equipped with 384 Gaudi2 chips. The question then becomes which is the cheaper option when you factor in both acquisition costs and operational costs. MLCommons didn’t go into that.
The faster benchmarks are a reflection of faster silicon, naturally, but also of optimizations in algorithms and software. Optimized models mean faster development of models for everyone.
The benchmark results show how various configurations performed, so you can decide based on configuration and price whether the performance is a fit for your application.
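One rough way to start on the cost question is to compare how much total accelerator time each run consumed. The sketch below multiplies the chip counts by the training times quoted in this article to get "chip-minutes," then derives a break-even price ratio. The `Submission` class and the `break_even_price_ratio` helper are hypothetical names introduced here for illustration. The calculation assumes cost scales linearly with chip-minutes and ignores power draw, networking, purchase price and the fact that very large clusters rarely scale perfectly, so treat it as back-of-the-envelope arithmetic rather than a real cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One benchmark result, using only figures quoted in the article."""
    name: str
    accelerators: int  # number of chips in the cluster
    minutes: float     # reported time to train

    @property
    def chip_minutes(self) -> float:
        # Total accelerator time consumed by a single training run.
        return self.accelerators * self.minutes


nvidia = Submission("Nvidia H100 (CoreWeave)", 3584, 10.94)
habana = Submission("Habana Labs Gaudi2", 384, 311.945)


def break_even_price_ratio(a: Submission, b: Submission) -> float:
    """How many times more a chip-minute on `a` can cost than a
    chip-minute on `b` before a run on `a` becomes the pricier one.

    Assumption (not from the article): cost is proportional to
    chip-minutes and nothing else.
    """
    return b.chip_minutes / a.chip_minutes


if __name__ == "__main__":
    for s in (nvidia, habana):
        print(f"{s.name}: {s.chip_minutes:,.0f} chip-minutes")
    ratio = break_even_price_ratio(nvidia, habana)
    print(f"Break-even price ratio (H100 vs. Gaudi2): {ratio:.2f}x")
```

Under these simplifying assumptions, the Nvidia run used roughly a third of the chip-minutes of the Gaudi2 run, so an H100 chip-minute could cost about three times as much as a Gaudi2 chip-minute before the Gaudi2 system came out cheaper per training run. Real pricing and energy figures, which MLCommons did not publish, would be needed to settle the question.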