TechForge

April 16, 2025

  • Huawei CloudMatrix 384 Supernode delivers 300 petaflops, outperforming Nvidia’s NVL72.
  • Milestone in China’s pursuit of technological self-sufficiency amid US sanctions.

Huawei’s CloudMatrix 384 Supernode has emerged as a potential game-changer in the global AI hardware landscape, with the Chinese tech giant claiming performance capabilities that surpass those of US chip leader Nvidia.

Introduced last week, this “nuclear-level product” represents Huawei’s most ambitious attempt yet to establish technological self-sufficiency in advanced computing infrastructure amid ongoing US sanctions. According to reports from STAR Market Daily as quoted by South China Morning Post, Huawei’s new AI architecture delivers 300 petaflops of computing power, outpacing the 180 petaflops offered by Nvidia’s NVL72 system.

The Huawei CloudMatrix 384 Supernode is currently deployed in its data centres in Wuhu, Anhui province, and is designed specifically to address bottlenecks that have become problematic as AI models grow in size and complexity.

Demand for high-performance computing architecture has so far been met largely by Nvidia’s specialised chips. Huawei’s CloudMatrix infrastructure, first unveiled in September 2024, reportedly achieves throughput of 1,920 tokens per second at high levels of accuracy – matching the performance of Nvidia’s H100 chips while using only China-made components.

The company’s technological breakthrough is particularly noteworthy given the constraints Huawei has faced since being placed on the US Entity List, which severely restricted its access to US technology, including high-end semiconductors and chip design software.

The core technological advancement enabling the CloudMatrix 384’s performance appears to be Huawei’s answer to Nvidia’s NVLink – a high-speed interconnect technology that allows multiple GPUs to communicate efficiently. Nvidia’s NVL72 system, announced in March 2024, features a 72-GPU NVLink domain that lets multiple chips function as a single, powerful GPU, enabling real-time inference for trillion-parameter models at speeds 30 times faster than previous generations.
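As a rough sanity check on the headline numbers, the short Python sketch below works out how far the reported 300 petaflops exceeds the NVL72’s 180 petaflops, and what each figure implies per accelerator. The 72-GPU count comes from the NVL72 description above; the 384-chip count is an assumption inferred from the product name, and the variable names are illustrative only. The reports do not say what numeric precision the figures refer to, so the result is indicative rather than a like-for-like benchmark.

```python
# Headline system-level figures as reported (petaflops).
HUAWEI_PFLOPS = 300.0
NVIDIA_PFLOPS = 180.0

# Accelerator counts.
# 72 is the size of the NVL72's NVLink domain, as described in the article.
# 384 is an ASSUMPTION inferred from the "CloudMatrix 384" product name.
HUAWEI_CHIPS = 384
NVIDIA_GPUS = 72

# How many times larger the reported Huawei figure is at system level.
system_ratio = HUAWEI_PFLOPS / NVIDIA_PFLOPS

# Implied compute per accelerator under the assumed chip counts.
huawei_per_chip = HUAWEI_PFLOPS / HUAWEI_CHIPS
nvidia_per_gpu = NVIDIA_PFLOPS / NVIDIA_GPUS

print(f"System-level ratio: {system_ratio:.2f}x")
print(f"Implied per-chip (Huawei, assumed 384 chips): {huawei_per_chip:.2f} PF")
print(f"Implied per-GPU (Nvidia NVL72): {nvidia_per_gpu:.2f} PF")
```

Under these assumptions, the system-level lead would come from linking far more chips together rather than from faster individual chips, which is consistent with the emphasis on interconnect technology.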

According to the SCMP report, Huawei is collaborating with Chinese AI infrastructure startup SiliconFlow to implement the CloudMatrix 384 Supernode in DeepSeek-R1, a reasoning model from Hangzhou-based DeepSeek. Supernodes are AI infrastructure architectures equipped with more resources – like CPUs, neural processing units, network bandwidth, storage, and memory – than standard counterparts.

Supernodes function as relay servers, enhancing the overall computing performance of clusters and speeding up the training of foundational models.

The development comes amid a broader push by Chinese technology companies to build domestic AI computing infrastructure. In February, e-commerce giant Alibaba Group announced a massive 380 billion yuan (US$52.4 billion) investment in computing resources and AI infrastructure over the next three years – the largest-ever investment by a private Chinese company in a computing project.

For global AI developers, the emergence of viable alternatives to Nvidia’s hardware could help address the bottlenecks that have limited AI advancement, potentially increasing available computing capacity.

About the Author

Dashveenjit Kaur

Dashveen writes for Tech Wire Asia and TechHQ, providing research-based commentary on the exciting world of technology in business. Previously, she reported on the ground of Malaysia’s fast-paced political arena and stock market.
