Blackwell Successor: Nvidia Offers a View of Rubin (Ultra) and Feynman

At Computex last June, Nvidia was surprisingly open and offered a glimpse of its roadmap through Rubin (Ultra). At the GTC 2025 keynote, Nvidia CEO Jensen Huang has now provided the first technical details for Rubin and Rubin Ultra, followed by an outlook on the later Feynman architecture. The numbers are enormous.

Blackwell Ultra was already on the roadmap outlined at Computex a year ago. Today, the new data center solution for faster inference with AI reasoning models was officially presented, and it is expected to launch in the second half of the year. As at Computex, Nvidia did not hold back on an outlook for the next three years.

Rubin GPU and Vera CPU to follow in 2026

Rubin, or rather the combined Vera Rubin solution, is scheduled to arrive in the second half of 2026. It is named after astronomer Vera Cooper Rubin, who passed away in 2016 and whose grandchildren were in the audience at today's GTC keynote. Rubin was already on the Computex roadmap, but the product was only mentioned by name then. Today, Jensen Huang was surprisingly forthcoming with the technical details.

New Naming Scheme with Number of GPUs

First, Huang had to admit that Nvidia had made a mistake with its previous product naming. With Blackwell, each chip package consists of two GPUs, but a name such as GB300 NVL72 points to only 72 GPUs, although the system actually contains 72 chip packages with 144 GPUs.

144 GPUs with 20 TB HBM4 in the new Oberon rack

With Rubin, Nvidia is switching to a new naming scheme that refers to the number of GPUs rather than the number of chip packages. Rubin is the new GPU, while Vera is the new Nvidia CPU with custom cores. Vera Rubin NVL144 is the name of the complete solution in the new "Oberon" rack.
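
The difference between the two naming schemes comes down to a single multiplication. The following is a minimal Python sketch, assuming 72 chip packages per rack with two GPUs each, as in the GB300 NVL72 example above. The function names are hypothetical and serve only as illustration.

```python
def nvl_name_by_packages(packages: int) -> str:
    """Old scheme: the number in the name counts chip packages."""
    return f"NVL{packages}"


def nvl_name_by_gpus(packages: int, gpus_per_package: int) -> str:
    """New scheme: the number in the name counts individual GPUs."""
    return f"NVL{packages * gpus_per_package}"


# Figures from the article: 72 chip packages, two GPUs per package.
PACKAGES = 72
GPUS_PER_PACKAGE = 2

old_name = nvl_name_by_packages(PACKAGES)
new_name = nvl_name_by_gpus(PACKAGES, GPUS_PER_PACKAGE)
print(old_name, new_name)
```

The same rack therefore carries a different number depending on what is counted, which is why NVL72 and NVL144 can describe the same number of chip packages.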

The numbers are gigantic, but they’re getting even bigger.

Like Blackwell, Rubin is a two-reticle GPU with a fast die-to-die interconnect (10 TB/s on Blackwell). Rubin offers 288 GB of HBM4 and 50 petaflops of FP4 performance, a 3.3x increase over Blackwell Ultra. The Vera CPU offers 88 custom Arm cores with SMT for 176 threads and connects to the GPU at 1.8 TB/s via the NVLink-C2C interconnect. As a finished rack, Vera Rubin NVL144 achieves 3.6 exaflops for FP4 inference and 1.2 exaflops for FP8 training, and offers 20.7 TB of HBM4 with a total bandwidth of 13 TB/s. NVLink 6 ties it all together with a bandwidth of 260 TB/s.
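
The rack-level figures can be cross-checked against the per-package figures with a few lines of arithmetic. This is only a sketch: the count of 72 packages per rack is not stated directly but is inferred here from 144 GPUs at two GPUs per package, and all variable names are hypothetical.

```python
# Per-package and per-CPU figures quoted in the article.
HBM4_PER_PACKAGE_GB = 288
FP4_PER_PACKAGE_PFLOPS = 50
VERA_CORES = 88
THREADS_PER_CORE = 2  # SMT: 176 threads on 88 cores implies two per core

# Assumption: 144 GPUs at two GPUs per package gives 72 packages per rack.
GPUS_PER_RACK = 144
GPUS_PER_PACKAGE = 2
packages = GPUS_PER_RACK // GPUS_PER_PACKAGE

rack_hbm4_tb = packages * HBM4_PER_PACKAGE_GB / 1000          # decimal TB
rack_fp4_exaflops = packages * FP4_PER_PACKAGE_PFLOPS / 1000  # PFLOPS -> EFLOPS
vera_threads = VERA_CORES * THREADS_PER_CORE

print(f"{packages} packages per rack")
print(f"{rack_hbm4_tb:.1f} TB HBM4 per rack")
print(f"{rack_fp4_exaflops:.1f} exaflops FP4 per rack")
print(f"{vera_threads} threads per Vera CPU")
```

Under that assumption, the products match the quoted 20.7 TB of HBM4 and 3.6 exaflops of FP4 inference performance for the full rack.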

Rubin Ultra to Double the Number of GPUs in 2027

Rubin is far from the end of the line, however: in the second half of 2027, the larger Rubin Ultra solution follows with a four-reticle GPU, i.e., four GPUs per chip package. Nvidia plans 16 HBM4E stacks with a total of 1 TB per package for Rubin Ultra, while the Vera CPU is to remain the same as with Rubin.

No rack has ever been packed this densely before.

As a complete "Kyber" rack, the solution is called Rubin Ultra NVL576, since 576 GPUs are now used across 144 packages. That is twice as many packages and four times as many GPUs as in the current Blackwell Ultra solution, and Nvidia wants to fit them all into a single rack. The rack, rotated by 90 degrees, is once again packed considerably more densely than current racks. According to Jensen Huang, the water-cooled tower has a power requirement of 600 kilowatts.

15 exaflops for FP4 inference

In return, Rubin Ultra NVL576 offers 15 exaflops for FP4 inference, 5 exaflops for FP8 training, 1 TB of HBM4E per package distributed across 16 stacks of 64 GB each, and a total of 144 TB of HBM4E with a total bandwidth of 4.6 PB/s.
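
The memory and GPU counts for Rubin Ultra NVL576 follow from the per-package figures. The sketch below is a rough check with hypothetical variable names. It assumes that the quoted "1 TB per package" treats 1,024 GB as 1 TB, since that is the reading under which 144 packages add up to exactly 144 TB.

```python
# Figures quoted in the article for Rubin Ultra NVL576.
STACKS_PER_PACKAGE = 16
GB_PER_STACK = 64
PACKAGES = 144
GPUS_PER_PACKAGE = 4

# Assumption: 1 TB is counted as 1,024 GB, matching the article's rounding.
GB_PER_TB = 1024

gpus_per_rack = PACKAGES * GPUS_PER_PACKAGE
hbm4e_per_package_gb = STACKS_PER_PACKAGE * GB_PER_STACK
hbm4e_per_rack_tb = PACKAGES * hbm4e_per_package_gb / GB_PER_TB

print(f"{gpus_per_rack} GPUs per rack")
print(f"{hbm4e_per_package_gb} GB HBM4E per package")
print(f"{hbm4e_per_rack_tb:.0f} TB HBM4E per rack")
```

With decimal prefixes (1 TB = 1,000 GB), the same stacks would add up to roughly 147 TB, so the quoted 144 TB should be read as a rounded figure.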

Feynman Follows in 2028

Finally, Feynman is a new architecture for the second half of 2028, named after physicist Richard Phillips Feynman. Feynman had not previously appeared on a public roadmap, and Nvidia was also more tight-lipped about it at GTC.

According to the roadmap, Feynman is to continue relying on the Vera CPU for the combined solution and to use "next-gen" HBM. The 8th-generation NVSwitch for "NVL-Next" and the new network solutions Spectrum7 and ConnectX 10 accompany this generation.

Techastuce received information for this article from Nvidia during an event in San Jose, California. The costs of travel to and from the event and five nights of hotel accommodation were covered by the company. The manufacturer had no influence on the report, and there was no obligation to report.

Topics: Graphics Cards, Artificial Intelligence, Nvidia, Nvidia GTC 2025

Source: Nvidia
