Huawei’s new Ascend chips to power world’s most powerful cluster

News Room

Chinese technology giant Huawei announced its plans for the next generations of its Ascend chip line at the Huawei Connect 2025 event in Shanghai this week.

In his keynote to the conference, deputy chair of the Huawei board, Eric Xu, said that 2025 had been a “memorable year,” and noted the debut of DeepSeek-R1 in January as a turning point for the company. He also acknowledged that China is likely to lag behind in semiconductor manufacturing process nodes “for a relatively long time.”

The company’s response to tariffs and trade embargoes is to advance its infrastructure design and technology. It has also decided to open-source large swathes of its software, including the openPangu foundation AI models and the Mind series SDKs.

The new Ascends

The company plans to produce three new series of the Ascend chip: the 950, 960, and 970.

The Ascend 950PR and 950DT will be cast from the same die and will add support for low-precision data formats, including FP8, where the 950 will deliver one PFLOP of performance, and MXFP8, rated at two PFLOPs. A PFLOP is one thousand trillion floating-point calculations per second.

There’ll also be better vector processing, and more granular memory access, down to 128-byte chunks from 512 bytes.

The Ascend 950 chips will offer 2 TB/s interconnect bandwidth, 2.5x more than the current Ascend 910C. The 950PR will be available Q1 2026, and the Ascend 950DT launches Q4 2026.
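As a rough check on that comparison, the short sketch below back-calculates the Ascend 910C interconnect bandwidth implied by the figures above. It assumes “2.5x more than” means 2.5 times the 910C figure. The variable names are illustrative, and the result is an inference from the numbers quoted here, not a published Huawei specification.

```python
# Figures quoted in the article.
ASCEND_950_INTERCONNECT_TBPS = 2.0  # TB/s
RATIO_950_TO_910C = 2.5             # assumed to mean "2.5 times the 910C figure"

# Implied Ascend 910C interconnect bandwidth.
# This is an inference from the article's numbers, not an official spec.
implied_910c_tbps = ASCEND_950_INTERCONNECT_TBPS / RATIO_950_TO_910C

print(f"Implied Ascend 910C interconnect bandwidth: {implied_910c_tbps:.1f} TB/s")
```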

Available a year later in Q4 2027, the Ascend 960 will have twice the computing power, memory access bandwidth, memory capacity, and number of interconnect ports as the 950. It will support Huawei’s proprietary HiF4 data format, which, the company claims, brings greater precision than other FP4 technologies.

The most capable chip will be the Ascend 970, slated for release in Q4 2028. Xu said, “We’re still working on some of its specs, but our general goal is to push all of its specs much higher.” He said it was expected that the Ascend 970 series will offer an interconnect bandwidth of 4 TB/s, be capable of 8 PFLOPs of FP4, and will come with larger memory capacity.

SuperPoDs of NPUs

Huawei’s strategy is to offer hyperscalers clusters of raw compute in the form of SuperPoDs, which will begin to appear in Q4 2026 with the Atlas 950 SuperPoD, equipped with the new Ascend 950DT chips.

Competitor NVIDIA’s NVL144 system (a SuperPoD analogue) will launch mid- to late-2026, and Huawei claims that its first SuperPoD will have 56.8 times more NPUs than the NVL144 has GPUs, and deliver nearly seven times the processing power. Even with the scheduled arrival of the NVL576, which NVIDIA is set to release in 2027, Huawei says the Atlas 950 SuperPoD will still be the better performer.

General computing chips

For general computing, Huawei plans to release two models of its Kunpeng 950 processor in Q1 2026: one with 96 cores and 192 threads, and a faster one with 192 cores and 384 threads. There will also be what Xu called “the world’s first general-purpose computing SuperPoD,” the Kunpeng 950-based TaiShan 950 SuperPoD, which will be available in the first quarter of 2026.

Open-source connectivity protocol

The NPU and general computing SuperPoDs will use UnifiedBus 2.0, the next iteration of the existing UnifiedBus 1.0. That’s the interconnection technology used by the Atlas 900 A3 SuperPoD, which came into service in March this year, with over 300 installations to date.

UnifiedBus 2.0 is to be an open protocol, with the technical specifications released immediately to the developer community. UnifiedBus 2.0 will be used internally in the new generations of SuperPoDs, and will connect clusters of SuperPoDs to form SuperClusters.

The first cluster product is to be the Atlas 950 SuperCluster, offering 2.5 times more NPUs and 1.3 times more computing power than xAI’s Colossus, currently the world’s most powerful computing cluster.

In the last quarter of 2027, Huawei intends to launch the Atlas 960 SuperCluster, which will integrate over a million NPUs and deliver 4 ZFLOPS in FP4 (with a ZFLOP representing 10^21 floating point operations per second). “SuperPoDs and SuperClusters powered by UnifiedBus are our answer to surging demand for computing, both today and tomorrow,” Xu said.
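To put those units in perspective, the sketch below converts the quoted 4 ZFLOPS into PFLOPs and works out a rough per-NPU average. It assumes an NPU count of exactly one million, which is an illustrative floor for “over a million”. Because the real count is higher, the actual average per NPU would be at or below the figure printed.

```python
# Unit definitions as given in the article.
PFLOP = 10**15  # one thousand trillion floating-point operations per second
ZFLOP = 10**21

# Quoted Atlas 960 SuperCluster FP4 performance.
cluster_fp4_flops = 4 * ZFLOP

# Assumption: "over a million NPUs" is treated as exactly one million,
# which gives an upper bound on the average per-NPU figure.
assumed_npu_count = 1_000_000

cluster_fp4_pflops = cluster_fp4_flops // PFLOP
max_avg_pflops_per_npu = cluster_fp4_pflops / assumed_npu_count

print(f"Cluster FP4 performance: {cluster_fp4_pflops:,} PFLOPs")
print(f"Average per NPU (upper bound): {max_avg_pflops_per_npu:.1f} PFLOPs")
```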

(Image source: Eric Xu, Huawei)

