DeepSeek releases the V3.1 model: what’s new?

Chinese company DeepSeek has unveiled an updated version of its flagship LLM, DeepSeek V3.1. The main innovations include expanded context to 128,000 tokens and an increase in the number of parameters to 685 billion.

What’s changed in V3.1: the context window now holds the equivalent of an entire 300-400 page book, improving analysis of long documents, generation of large texts, and multi-topic dialogues. The Mixture-of-Experts (MoE) architecture remains: only 37 billion parameters are active when processing each token. Support for the BF16, FP8, and F32 formats provides flexibility across different deployment environments. The model is available through the API and on Hugging Face under the MIT open-source license.
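Since the weights are public, here is a minimal sketch of loading the model with the Hugging Face transformers library. The repository name deepseek-ai/DeepSeek-V3.1 follows the public Hugging Face release; note that serving a 685-billion-parameter model realistically requires a multi-GPU cluster, so treat this as illustrative rather than something to run locally.

```python
# Sketch: loading DeepSeek V3.1 from Hugging Face with transformers.
# The repo name matches the public release; hardware requirements
# (hundreds of GB of GPU memory) make this an illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3.1"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16 is one of the supported formats
    device_map="auto",           # shard the weights across available GPUs
    trust_remote_code=True,
)

inputs = tokenizer("Summarize this contract:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```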

On the Aider coding benchmark the model scored 71.6%, higher than Claude Opus 4, making V3.1 one of the strongest open-source LLMs for programming. Math and logic skills have also strengthened, but users report no noticeable gain in “reasoning” compared to the earlier R1-0528.

The DeepSeek interface no longer references the R1 series. V3.1 combines conventional and “reasoning” tasks in a single hybrid architecture.
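In practice, the hybrid design is exposed through the DeepSeek API as two model names served by the same V3.1 weights: per DeepSeek’s documentation, deepseek-chat runs the non-thinking mode and deepseek-reasoner the thinking mode. A minimal sketch using the OpenAI-compatible endpoint (the API key is a placeholder):

```python
# Sketch: selecting V3.1's non-thinking vs. thinking mode via DeepSeek's
# OpenAI-compatible API. Model names follow DeepSeek's public docs;
# the API key below is a placeholder.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY",
                base_url="https://api.deepseek.com")

question = "A train leaves at 9:40 and arrives at 11:05. How long is the trip?"

# Non-thinking mode: fast, direct answers.
fast = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": question}],
)
print(fast.choices[0].message.content)

# Thinking mode: the model works through a reasoning trace before answering.
slow = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": question}],
)
print(slow.choices[0].message.content)
```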

Training the original V3 cost approximately $5.6 million (2.8 million GPU hours on Nvidia H800 chips, or roughly $2 per GPU hour). An attempt to train R2 on Huawei Ascend chips failed due to compatibility and performance issues, so DeepSeek ended up with a hybrid scheme: training on Nvidia, inference on Ascend. This complicated development and delayed the release of R2; founder Liang Wenfeng has reportedly expressed displeasure at the delay.

The delay has also cost DeepSeek ground: Alibaba, with Qwen3, was faster to bring similar solutions to market. The situation highlights China’s limited semiconductor base and the difficulty startups face in balancing politics and technology.
