DeepSeek is set to release its latest large language model next week, more than a year after its last major launch, in a fresh test of China’s ambitions to compete with US rivals in AI.
The Hangzhou-based lab plans to release V4, a “multimodal” model capable of generating images, video and text, according to two people familiar with the matter.
People briefed on the arrangements said DeepSeek had worked with Chinese AI chipmakers Huawei and Cambricon to optimize V4 to run on their latest hardware.
The move underscores China’s broader efforts to reduce its reliance on Nvidia’s market-leading AI chips, which are subject to Washington’s export controls designed to limit the country’s technological rise.
DeepSeek’s launch is timed to come ahead of next week’s annual “Two Sessions” parliamentary meetings, which begin on March 4. The high-profile political gathering could further cement DeepSeek’s status as a national AI champion.
It will be DeepSeek’s first major model release since January 2025, when it unveiled its R1 reasoning model. The company claimed it had built a model comparable to leading models from Silicon Valley using only a fraction of the computing power.
That release rattled US technology stocks, and some experts described it as a “Sputnik” moment that highlighted China’s rapid emergence as a major force in AI.
Since then, DeepSeek has released incremental upgrades rather than a full new model, allowing domestic rivals including Alibaba and Moonshot to capture the market for low-cost, openly available Chinese models.
DeepSeek’s effort to optimize V4 for domestically made chips is expected to boost demand in China for those chips and accelerate the shift away from US chipmakers Nvidia and AMD for “inference”, the process of generating outputs from a trained model.
Reuters first reported DeepSeek’s collaboration with Huawei and Cambricon.
DeepSeek has not worked with Nvidia to optimize its model for Nvidia’s products, according to another person familiar with the matter.
Nvidia remains the market leader in chips for training, particularly for the compute-intensive pre-training phase in which models ingest vast amounts of data.
The FT previously reported that DeepSeek had tried to carry out this pre-training on Huawei’s hardware but ran into technical difficulties.
The R1 model, released last year, was accompanied by a detailed technical paper on DeepSeek’s engineering methods, which used Nvidia chips more efficiently to train and run its model.
DeepSeek was praised for publishing its training methods for building a “reasoning model”, which allowed other labs to study and apply its findings. Reasoning models are designed to tackle complex problems by breaking them down into smaller steps.
DeepSeek is expected to publish a short technical note alongside V4’s launch next week, with a more detailed paper to follow about a month later, according to a person with close knowledge of the plans.
Earlier this week, Anthropic accused DeepSeek and two other Chinese AI labs of using “distillation tactics” against its models. The technique involves training smaller models on the outputs of more capable systems, allowing them to replicate the US company’s capabilities without using the same amount of computing power.
Huawei, DeepSeek and Cambricon did not respond to requests for comment.
