Upcoming DeepSeek AI model failed to train using Huawei’s chips (arstechnica.com)

0 points 10 days ago | visit original

🤖 AI Summary

Chinese AI startup DeepSeek has delayed the release of its latest model after failing to train it on Huawei’s Ascend AI chips, underscoring the technological hurdles in China’s push to replace US hardware. Encouraged by Beijing to adopt domestic chips instead of Nvidia’s widely used processors, DeepSeek encountered persistent stability and performance issues when training on Ascend chips. As a result, the company used Nvidia hardware for training and relied on Huawei’s chips only for inference, which pushed the model launch back from May and gave competitors time to advance.

The incident highlights the ongoing challenges facing China’s ambition for self-sufficiency in AI hardware. Even with Huawei providing on-site engineering support, DeepSeek’s difficulties with training stability, slower inter-chip connectivity, and less mature software expose the gap between Chinese AI chips and industry-leading Nvidia systems. The situation also reflects broader market pressures, as Beijing increasingly scrutinizes and limits orders of US-made AI processors and urges companies to prioritize domestic alternatives. DeepSeek’s experience is a cautionary tale for the AI/ML community about the current limits of emerging Chinese AI hardware for demanding training workloads.
