NVIDIA's ComputeEval 2025.2: A More Challenging Benchmark for AI-Generated CUDA Code

Published on November 7, 2025
NVIDIA has announced ComputeEval 2025.2, a major update to its open-source benchmark for evaluating how well AI models and agents write CUDA code. Released on November 7, 2025, this version adds more than 100 new CUDA challenges, bringing the total to 232 problems spanning CUDA and the CUDA Core Compute Libraries (CCCL). The updated benchmark aims to assess and improve the ability of AI coding assistants to write correct, efficient CUDA code. The new challenges are deliberately harder, requiring LLMs to leverage modern CUDA features (see the sketch after this list), including:
  • Tensor Cores
  • Advanced shared memory patterns
  • Warp-level primitives
  • CUDA Graphs, Streams, and Events
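To give a sense of what these features look like in practice, below is a minimal, self-contained sketch written for this article, not drawn from the benchmark itself: a block-wide sum reduction that combines warp-level primitives (__shfl_down_sync) with a shared-memory staging step. All function and variable names are illustrative.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Reduce a value across the 32 lanes of a warp using shuffle intrinsics.
__inline__ __device__ float warpReduceSum(float val) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        val += __shfl_down_sync(0xffffffff, val, offset);
    return val;  // lane 0 of each warp ends up holding the warp's sum
}

// Sum 'n' floats: each block reduces its slice, then atomically adds to *out.
__global__ void blockReduceSum(const float* in, float* out, int n) {
    __shared__ float warpSums[32];          // one partial sum per warp (max 32 warps/block)
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = threadIdx.x % warpSize;
    int warp = threadIdx.x / warpSize;

    float val = (tid < n) ? in[tid] : 0.0f;
    val = warpReduceSum(val);               // reduce within each warp (registers only)

    if (lane == 0) warpSums[warp] = val;    // stage one value per warp in shared memory
    __syncthreads();

    // The first warp reduces the per-warp partial sums.
    if (warp == 0) {
        int numWarps = (blockDim.x + warpSize - 1) / warpSize;
        val = (lane < numWarps) ? warpSums[lane] : 0.0f;
        val = warpReduceSum(val);
        if (lane == 0) atomicAdd(out, val); // accumulate the block's result globally
    }
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    *out = 0.0f;

    blockReduceSum<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("sum = %.0f (expected %d)\n", *out, n);

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Reducing within each warp first keeps most of the traffic in registers, and shared memory only has to hold one partial sum per warp; choosing and combining such primitives correctly is the kind of trade-off the new challenges appear designed to probe.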
Evaluations of leading large language models (LLMs) on ComputeEval 2025.2 show lower scores than on the previous release, ComputeEval 2025.1, indicating that the new challenges effectively raise the bar and demand a deeper understanding of accelerated computing.

NVIDIA plans to further expand ComputeEval's coverage to additional CUDA-X libraries such as cuBLAS, CUTLASS, cuDNN, and RAPIDS. The company is inviting collaboration and contributions from the HPC and AI communities; the code is available on GitHub and the dataset on Hugging Face.