Nvidia dominates AI benchmarks as AMD sits out yet again.

Nvidia did not submit results for Blackwell, but still won the race with its Hopper GPU. AMD may have the most to gain by publishing results for the MI300, even if the Hopper H200 proved slightly faster. In six months, we will see how well Blackwell really performs across a broad suite of inference workloads, but for now, we have new H200 results, as well as Intel Xeon and Gaudi2 results, to explore.

Nvidia once again submitted results across all seven benchmarks of the MLPerf suite, with a focus on large language models. The company explained why inference processing runs well on Hopper and highlighted the software advances it has made to optimize inference across its GPUs. Inference has become a significant part of Nvidia’s data center business, and the company expects that share to keep growing as more generative AI models go into production.

The TensorRT-LLM inference compiler optimizes several stages of the inference pipeline, and it has delivered significant performance gains on LLMs over the last six months. Against the competition, Nvidia’s H200 outperformed Intel’s Gaudi2 by a factor of four. Intel also shared results for its 5th Gen Xeon CPU, showing improved AI performance.

MLCommons has strong industry support, with more than 125 founding members and affiliate companies. While the absence of some key players reduces the benchmarks’ marketing value, the regular submissions provide valuable metrics for evaluating generation-to-generation improvements in hardware and software. The community’s efforts, and the benchmark library created by AI practitioners, help buyers evaluate platforms and help chip designers identify bottlenecks.

The author’s opinions in this article should not be taken as investment advice. Cambrian AI Research works with several semiconductor firms but does not hold investment positions in any of the companies mentioned in the article. The efforts of the MLCommons-led community and the participating vendors provide a valuable service to the industry, helping AI practitioners evaluate platforms and helping vendors improve their products.