Blackwell Successor: Nvidia Offers a View of Rubin (Ultra) and Feynman
At Computex last June, Nvidia already offered a glimpse of its roadmap with Rubin (Ultra). At the GTC 2025 keynote, Nvidia CEO Jensen Huang has now provided the first technical details for Rubin and Rubin Ultra, followed by an outlook on the later Feynman architecture. The numbers are enormous.
Blackwell Ultra was still part of the roadmap outlined at Computex a year ago; today, the new data center solution for faster inference with AI reasoning models was officially presented and is expected to launch in the second half of the year. As at Computex, Nvidia did not hold back on providing an outlook for the next three years.
Rubin GPU and Vera CPU to follow in 2026
In the second half of 2026, Rubin, or rather the combined Vera Rubin solution, is scheduled to follow. The namesake is astronomer Vera Cooper Rubin, who passed away in 2016, and whose grandchildren were in the audience at today’s GTC keynote. Rubin was already on the Computex roadmap, but at that point the product was only mentioned by name. Today, Jensen Huang was surprisingly forthcoming with the technical details.
New Naming Scheme with Number of GPUs
First, Huang had to admit that a mistake had been made with the previous product naming. With Blackwell, each chip package consists of two GPUs, yet the name GB300 NVL72, for example, refers to only 72 GPUs, although the system actually contains 72 chip packages with 144 GPUs.
Vera Rubin NVL1
With Rubin, Nvidia is switching to a new naming scheme that refers to the number of GPUs rather than the number of chip packages. Rubin is the new GPU, while Vera is the new Nvidia CPU with Arm cores. The complete solution in the new “Oberon rack” is called Vera Rubin NVL144.
The numbers are gigantic, but they’re getting even bigger.
Like Blackwell, Rubin is a two-reticle GPU with a fast die-to-die interconnect (10 TB/s on Blackwell). Rubin offers 288 GB of HBM4 and 50 petaflops of FP4 performance, a 3.3x increase over Blackwell Ultra. The Vera CPU offers 88 custom Arm cores with SMT for 176 threads and connects to the GPU at 1.8 TB/s via the NVLink-C2C interconnect. As a finished rack, the Vera Rubin NVL144 achieves 3.6 exaflops for FP4 inference and 1.2 exaflops for FP8 training, and offers 20.7 TB of HBM4 with a total bandwidth of 13 TB/s. NVLink 6 ties it all together with a bandwidth of 260 TB/s.
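The rack-level figures can be cross-checked against the per-package numbers with some simple arithmetic. The following is a minimal Python sketch, not an official Nvidia calculation: the package count of 72 is an assumption derived from the NVL144 name and the two GPUs per package, and the function name `rack_totals` is purely illustrative.

```python
# Cross-check of the Vera Rubin NVL144 rack figures quoted in the article.
# Per-package inputs come from the article; the package count is an
# assumption derived from the name (144 GPUs / 2 GPUs per package = 72).

PACKAGES_PER_RACK = 72        # assumption: 144 GPUs at 2 GPUs per package
GPUS_PER_PACKAGE = 2          # Rubin is described as a two-reticle GPU
HBM4_GB_PER_PACKAGE = 288     # from the article
FP4_PFLOPS_PER_PACKAGE = 50   # from the article


def rack_totals(packages, gpus_per_package, hbm_gb, fp4_pflops):
    """Scale per-package figures up to a full rack (hypothetical helper)."""
    return {
        "gpus": packages * gpus_per_package,
        "hbm_tb": packages * hbm_gb / 1000,            # GB -> TB (decimal)
        "fp4_exaflops": packages * fp4_pflops / 1000,  # PFLOPS -> EFLOPS
    }


totals = rack_totals(
    PACKAGES_PER_RACK,
    GPUS_PER_PACKAGE,
    HBM4_GB_PER_PACKAGE,
    FP4_PFLOPS_PER_PACKAGE,
)
print(totals)
```

Under these assumptions the results line up with the quoted 144 GPUs, roughly 20.7 TB of HBM4, and 3.6 exaflops of FP4, which suggests the rack figures are simple multiples of the per-package values.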
Rubin is far from the end of the line, however: the larger Rubin Ultra solution with a four-reticle GPU, i.e., four GPUs per chip package, follows in the second half of 2027. Nvidia plans 16 HBM4E stacks with a total of 1 TB per package for Rubin Ultra, while the Vera CPU is to remain the same as with Rubin.
No rack has ever been packed this densely.
As a complete “Kyber rack,” the solution is called Rubin Ultra NVL576, as 576 GPUs are now used in 144 packages. That’s twice as many packages and four times as many GPUs as in the current Blackwell Ultra, all of which Nvidia wants to fit into a single rack. Nvidia’s rack, rotated by 90 degrees, is once again considerably denser than current racks. According to Jensen Huang, the water-cooled tower draws 600 kilowatts.
Rubin Ultra NVL576
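The comparison between Rubin Ultra NVL576 and the current Blackwell Ultra rack can be reproduced with a few lines of Python. This is only an illustrative sketch: the `Rack` class is a hypothetical helper, the package and GPU counts are taken from the article, and the total HBM4E capacity is derived from the quoted 1 TB per package rather than stated by Nvidia.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    """Hypothetical helper describing a rack by its package layout."""
    name: str
    packages: int
    gpus_per_package: int

    @property
    def gpus(self) -> int:
        return self.packages * self.gpus_per_package


# Figures as quoted in the article.
blackwell_ultra = Rack("GB300 NVL72", packages=72, gpus_per_package=2)
rubin_ultra = Rack("Rubin Ultra NVL576", packages=144, gpus_per_package=4)

package_ratio = rubin_ultra.packages / blackwell_ultra.packages
gpu_ratio = rubin_ultra.gpus / blackwell_ultra.gpus

# Derived, not quoted: 1 TB of HBM4E per package across all packages.
hbm4e_total_tb = rubin_ultra.packages * 1

print(f"{rubin_ultra.name}: {rubin_ultra.gpus} GPUs")
print(f"Packages vs. {blackwell_ultra.name}: {package_ratio:.0f}x")
print(f"GPUs vs. {blackwell_ultra.name}: {gpu_ratio:.0f}x")
print(f"Derived HBM4E per rack: {hbm4e_total_tb} TB")
```

Counting GPUs rather than packages is what makes the new names directly comparable: under the old scheme, the same rack would have carried 144 in its name instead of 576.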
15 exaflops for FP4 inference
Feynman Follows in 2028
Finally, Feynman is a new architecture planned for the second half of 2028, named after physicist Richard Phillips Feynman. Feynman had not previously appeared on a public roadmap, and Nvidia was also more tight-lipped about it at GTC.
Feynman Roadmap for 2028
According to the roadmap, Feynman is set to use “Next-Gen” HBM and to continue relying on the Vera CPU for the combined solution. The 8th-generation NVSwitch for “NVL-Next” and the new networking solutions Spectrum7 and ConnectX 10 accompany the generation.
Techastuce received the information for this article from Nvidia at a manufacturer event in San Jose, California. The costs of travel to and from the event and of five hotel nights were covered by the company. The manufacturer had no influence on the reporting, and there was no obligation to report.

An engineer by training, Alexandre shares his knowledge on GPU performance for gaming and creation.