Huawei-Led Group Post-Trains DeepSeek 1.6T Model on 1,000 Ascend 910C Chips

1 article · Updated · Tom's Hardware · Jun 6

A Huawei-led research group said it completed full-parameter post-training of DeepSeek’s V4-Pro, updating all weights of the 1.6-trillion-parameter model on a cluster of at least 1,000 Ascend 910C chips.
The result suggests Chinese accelerators can now handle a training-class tuning workload, clearing a key hurdle as U.S. export controls push firms to reduce reliance on Nvidia hardware.
The claim stops short of proving Ascend can pre-train a frontier model from scratch, the heavier task, and the Shenzhen-backed disclosure gave no benchmarks, runtime, Nvidia comparison or cluster-efficiency data.
That gap matters because DeepSeek was reported in August to have failed to complete even one successful Ascend training run for its R2 model, a failure attributed to instability, slow interconnects and weaknesses in Huawei’s CANN software stack.
V4-Pro, released in April as DeepSeek’s first model built around Ascend from the outset, marks progress for Huawei’s AI stack, though DeepSeek itself has not commented on the latest claim.