On March 18, 2025, NVIDIA CEO Jensen Huang unveiled the Blackwell Ultra during his GTC keynote and provided a preview of the company’s next-generation Rubin chip.
Huang highlighted the rapid growth in AI infrastructure, stating that nearly the entire world has been involved in building AI data centers over the past year. He emphasized that “the demand for computing—the law of AI scaling—is highly elastic and accelerating at an extremely fast rate.”
He added: “AI has made a huge leap forward—both inference and agent-based AI now require orders of magnitude more compute performance. We designed Blackwell Ultra for this moment—a single, powerful platform that can efficiently handle AI pre-training, post-training, and inference tasks.”
Blackwell Ultra: A Leap in AI Computing
According to NVIDIA’s official website, Blackwell Ultra builds on the Blackwell architecture introduced last year, integrating the NVIDIA GB300 NVL72 rack-level solution and the NVIDIA HGX B300 NVL16 system.
The GB300 NVL72 combines 72 Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs (based on the Arm Neoverse architecture) into a large-scale GPU cluster designed for extended inference tasks.
This platform enables AI models to leverage enhanced computing power, explore different solutions, and break down complex requests into multiple steps, resulting in higher-quality responses.
The HGX B300 NVL16 delivers:
- 11 times faster large language model (LLM) inference than Hopper
- 7 times more computing power
- 4 times more memory
- Significant performance gains on complex AI workloads
NVIDIA also announced that its new rack-scale solution offers 1.5 times the performance of the NVIDIA GB200 NVL72. Additionally, the revenue opportunity for Blackwell-powered AI factories has increased 50 times compared to Hopper-based factories.
During a demonstration, the NVL72 cluster processed a DeepSeek-R1 671B interactive query in just 10 seconds, compared to one and a half minutes on the H100.
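The quoted demo times imply roughly a ninefold speedup; a minimal sanity check of that arithmetic, using only the figures quoted above (these are keynote demo numbers, not official benchmark results):

```python
# Sanity check of the demo figures quoted above (keynote demo numbers,
# not official benchmark data).
h100_seconds = 90.0    # "one and a half minutes" on the H100, as quoted
nvl72_seconds = 10.0   # NVL72 cluster query time, as quoted

speedup = h100_seconds / nvl72_seconds
print(f"Demo speedup: {speedup:.0f}x")  # → Demo speedup: 9x
```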
Huang emphasized:
“Blackwell delivers 40 times better performance than Hopper for inference models.”
NVIDIA plans to ramp up production of Blackwell this year and will transition to Blackwell Ultra in the second half of 2025.
Industry Adoption and Availability
Leading server manufacturers—including Cisco, Dell, Hewlett Packard Enterprise, Lenovo, and Super Micro Computer—will offer Blackwell Ultra-powered systems. Cloud providers such as Amazon Web Services, Google Cloud, and Microsoft Azure will also introduce Blackwell Ultra cloud services.
For individual users, NVIDIA also introduced a Blackwell Ultra-powered desktop workstation, the “DGX Station.”
This system features:
- A single GB300 Grace Blackwell Ultra Superchip (Grace CPU paired with a Blackwell Ultra GPU)
- 784 GB of memory
- ConnectX-8 SuperNIC with 800 Gb/s networking
Previewing the Next-Generation Rubin AI Architecture
Jensen Huang also provided a preview of NVIDIA’s next-generation AI chip architecture: Rubin and the custom CPU Vera.
The Rubin GPU and Vera CPU will succeed Blackwell and Grace, with the Vera Rubin platform scheduled for release in late 2026, followed by Rubin Ultra in 2027.
The architecture is named after Vera Rubin, a pioneering American astronomer whose research on dark matter fundamentally changed our understanding of the universe.
Vera CPU Specifications
- 88 custom Arm cores
- 176 threads
- 1.8 TB/s NVLink-C2C interconnect
Rubin GPU Specifications
- Two GPUs
- 50 petaflops (PF) FP4 precision inference
- Up to 288 GB of fast memory
The Vera Rubin NVL144 system will deliver:
- 3.6 exaflops (EF) of FP4 inference
- 1.2 EF of FP8 training
- 3.3 times the performance of the GB300 NVL72
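The rack-level FP4 figure lines up with the per-package Rubin spec quoted above. A back-of-the-envelope check, assuming (the article does not say this explicitly) that “NVL144” counts the 144 GPU dies in 72 two-die Rubin packages:

```python
# Back-of-the-envelope check: does 50 PF of FP4 per Rubin package
# scale to the quoted 3.6 EF for Vera Rubin NVL144?
# Assumption (not stated in the article): "144" counts GPU dies,
# i.e. 72 packages with two dies each.
gpu_dies = 144
dies_per_package = 2
fp4_pf_per_package = 50  # petaflops FP4, from the Rubin spec above

packages = gpu_dies // dies_per_package              # 72 packages
rack_fp4_ef = packages * fp4_pf_per_package / 1000   # PF -> EF

print(f"{packages} packages x {fp4_pf_per_package} PF = {rack_fp4_ef} EF FP4")
```

Under that assumption, 72 × 50 PF = 3,600 PF = 3.6 EF, matching the quoted rack figure.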
In 2027, NVIDIA will launch Rubin Ultra NVL576, featuring:
- 15 EF of FP4 inference
- 5 EF of FP8 training
- 14 times the performance of the GB300 NVL72
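The two rack-scale claims can also be cross-checked against each other: the FP4 ratio between Rubin Ultra NVL576 and Vera Rubin NVL144 should roughly match the ratio of their quoted speedups over the GB300 NVL72. Since all of these are rounded marketing figures, only approximate agreement is expected:

```python
# Cross-check the quoted rack-scale figures (all rounded marketing numbers).
nvl144_fp4_ef = 3.6     # Vera Rubin NVL144, FP4 inference
nvl576_fp4_ef = 15.0    # Rubin Ultra NVL576, FP4 inference
nvl144_vs_gb300 = 3.3   # NVL144 speedup over GB300 NVL72
nvl576_vs_gb300 = 14.0  # NVL576 speedup over GB300 NVL72

ratio_from_fp4 = nvl576_fp4_ef / nvl144_fp4_ef           # ~4.17
ratio_from_speedups = nvl576_vs_gb300 / nvl144_vs_gb300  # ~4.24

print(f"FP4 ratio {ratio_from_fp4:.2f} vs speedup ratio {ratio_from_speedups:.2f}")
```

The two ratios agree to within a few percent, consistent with rounding in the published numbers.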
Looking Ahead: The Feynman Architecture (2028)
Huang also revealed that NVIDIA’s next-generation AI platform after Rubin will be named Feynman, after the renowned physicist Richard Feynman.
According to NVIDIA’s roadmap, the Feynman architecture will debut in 2028, continuing the company’s rapid advancements in AI computing.
NVIDIA’s latest innovations reinforce its leadership in AI infrastructure, setting the stage for even greater breakthroughs in the years ahead.