News

NVIDIA Blackwell: New AI Benchmarks Reveal Unmatched Performance and ROI

Source: blogs.nvidia.com

Published on October 10, 2025

Updated on October 10, 2025


NVIDIA Blackwell Sets New AI Performance Standards

NVIDIA Blackwell has emerged as a leader in AI performance, demonstrating unmatched efficiency and return on investment (ROI) in the latest InferenceMAX v1 benchmarks. The platform's ability to handle complex AI tasks with exceptional speed and cost-effectiveness positions it as a game-changer in the rapidly evolving AI landscape.

According to the benchmarks, a $5 million investment in an NVIDIA GB200 NVL72 system can generate up to $75 million in token revenue. This translates to a 15x ROI, outperforming competitors and setting a new standard for economic viability in AI deployment.
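The ROI multiple follows directly from the two figures cited above. The sketch below is illustrative arithmetic only; the dollar amounts come from the article, and the `roi_multiple` helper is a hypothetical name, not an NVIDIA tool.

```python
# Reproduce the ROI multiple cited for the GB200 NVL72 system.
# The dollar figures are taken from the article; the helper itself
# is just illustrative arithmetic.

def roi_multiple(investment_usd: float, revenue_usd: float) -> float:
    """Return revenue as a multiple of the initial investment."""
    return revenue_usd / investment_usd

investment = 5_000_000    # system cost cited in the article
revenue = 75_000_000      # projected token revenue cited in the article

print(f"ROI multiple: {roi_multiple(investment, revenue):.0f}x")  # ROI multiple: 15x
```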

Efficiency in AI Inference

AI inference, the process by which AI models make predictions or decisions based on new data, is a critical aspect of modern AI applications. NVIDIA's full-stack approach provides customers with the performance and efficiency needed to deploy AI at scale, as highlighted by Ian Buck, vice president of hyperscale and high-performance computing at NVIDIA.

The InferenceMAX v1 benchmark, developed by SemiAnalysis, underscores Blackwell's capabilities in AI inference. By assessing performance across diverse applications and publishing verifiable results, these benchmarks provide a clear picture of how Blackwell excels in real-world scenarios.

Open Source Collaboration and Continuous Improvement

NVIDIA is actively collaborating with leading AI model builders such as OpenAI, Meta, and DeepSeek AI. These partnerships ensure that the latest models are optimized for NVIDIA's AI inference infrastructure, fostering shared innovation and accelerating progress in the AI community.

NVIDIA's commitment to continuous performance improvement is evident through its hardware and software co-design optimizations. The TensorRT-LLM v1.0 release, for instance, significantly accelerates large AI models, while advanced parallelization techniques and high-bandwidth interconnects enhance model performance.

Speculative Decoding: A Breakthrough in AI Efficiency

The introduction of speculative decoding in the gpt-oss-120b-Eagle3-v2 model represents a major breakthrough in AI efficiency. This innovative method predicts multiple tokens simultaneously, reducing lag and boosting throughput. By tripling throughput to 100 tokens per second per user, speculative decoding enhances the overall performance of AI applications.
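The core idea behind speculative decoding can be sketched in a few lines: a cheap draft model proposes several tokens ahead, and the expensive target model verifies them, keeping the longest agreeing prefix. This is a minimal greedy sketch of the general technique only; the Eagle3-based method in gpt-oss-120b-Eagle3-v2 is considerably more sophisticated, and the toy "models" below are made-up placeholders.

```python
# Toy sketch of greedy speculative decoding. A cheap "draft" model proposes
# k tokens ahead; the expensive "target" model verifies them, keeping the
# longest matching prefix. In production the verification is one batched
# forward pass over all k positions -- that batching is the speed win.

from typing import Callable, List

def speculative_step(target: Callable[[List[int]], int],
                     draft: Callable[[List[int]], int],
                     context: List[int], k: int = 4) -> List[int]:
    """Return the tokens accepted this step (always at least one,
    since the target's own next token is valid by definition)."""
    # 1. Draft model speculates k tokens autoregressively (cheap).
    proposed = []
    ctx = list(context)
    for _ in range(k):
        t = draft(ctx)
        proposed.append(t)
        ctx.append(t)

    # 2. Target model checks each proposed position.
    accepted = []
    ctx = list(context)
    for t in proposed:
        expected = target(ctx)
        if t == expected:
            accepted.append(t)   # draft guessed right: keep it
            ctx.append(t)
        else:
            accepted.append(expected)  # fall back to the target's token
            break
    return accepted

# Toy deterministic "models": next token = sum of context mod 7.
target_model = lambda ctx: sum(ctx) % 7
agreeing_draft = lambda ctx: sum(ctx) % 7          # always matches the target
print(speculative_step(target_model, agreeing_draft, [1, 2, 3]))  # [6, 5, 3, 6]
```

When the draft model agrees, all four speculated tokens are accepted in a single verification pass instead of four sequential target-model calls; when it disagrees, the step degrades gracefully to ordinary one-token decoding.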

Blackwell's Performance Metrics

For dense AI models like Llama 3.3 70B, the Blackwell B200 sets a new performance standard in the InferenceMAX v1 benchmarks. Delivering over 10,000 tokens per second (TPS) per GPU at 50 TPS per-user interactivity, Blackwell achieves 4x higher per-GPU throughput than the NVIDIA H200 GPU.

Key metrics such as tokens per watt, cost per million tokens, and TPS/user are critical in evaluating AI performance. Blackwell delivers 10x throughput per megawatt compared to the previous generation, translating into higher token revenue and substantial cost savings.
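The metrics named above reduce to simple formulas. The sketch below shows how they are typically computed; the numeric inputs are hypothetical placeholders for illustration, not NVIDIA's published figures.

```python
# Illustrative formulas for common inference-economics metrics.
# All numeric inputs below are made-up placeholders.

def cost_per_million_tokens(hourly_cost_usd: float,
                            tokens_per_second: float) -> float:
    """USD to generate one million tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Energy efficiency: sustained tokens per second per watt."""
    return tokens_per_second / power_watts

# Hypothetical node: 10,000 tok/s sustained, $60/hour, 10 kW draw.
print(cost_per_million_tokens(60.0, 10_000))  # ~1.67 USD per million tokens
print(tokens_per_watt(10_000, 10_000))        # 1.0 tok/s per watt
```

The same throughput figure also bounds concurrency: a GPU sustaining 10,000 TPS at 50 TPS per user can serve roughly 200 interactive streams at once.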

Cost Efficiency and ROI

The NVIDIA Blackwell architecture cuts the cost per million tokens by a factor of 15 compared to its predecessors. This reduction encourages wider AI deployment, as Blackwell balances cost, energy efficiency, throughput, and responsiveness.

Blackwell's full-stack design delivers efficiency and value in production, making it an ideal choice for enterprises looking to maximize their AI investments. NVIDIA offers a technical deep dive into the architecture, complete with charts and methodology, showcasing its extreme hardware-software co-design built for speed, efficiency, and scale.

The Shift to AI Factories

As AI evolves from pilot projects to AI factories, the need for real-time data processing and decision-making becomes paramount. Open benchmarks like InferenceMAX v1 help teams make informed decisions and optimize costs, ensuring that AI deployments are both efficient and economically viable.

NVIDIA's Think SMART framework assists enterprises in leveraging the full-stack inference platform to achieve real-world ROI. By turning performance into profits, NVIDIA Blackwell is at the forefront of the AI revolution, setting new benchmarks for performance and scalability.