Chinese AI Lab DeepSeek Releases New V4 Models
- From R1’s “Sputnik moment” to V4: Setting the stage
- April 24: The V4 models land
- April 24: US labs react — with sarcasm
- April 27: DeepSeek lights the fuse on pricing
- Competing narratives: Frontier, follower, or spoiler?
- What comes next
Coverage portrays DeepSeek V4 as a powerful open-weight Chinese model that excels in coding and math, undercuts US rivals on price, and showcases growing reliance on domestic hardware, yet still lags the very top proprietary systems in broad knowledge and reasoning. It situates the release within a longer arc of Sino-US AI competition and export controls, emphasizing both its strategic importance and the open questions around costs, transparency, and long-term market impact. DeepSeek's latest AI salvo isn't just about bigger models or longer context windows: it's about whether a Chinese open-source upstart can force a price and performance reckoning on the US frontier labs without quite catching them.
From R1’s “Sputnik moment” to V4: Setting the stage
In January 2025, DeepSeek’s R1 model stunned the industry by claiming frontier‑level reasoning at a fraction of US labs’ costs, kicking off what many in Silicon Valley described as a “Sputnik moment” for AI competition.1 A year later, the Hangzhou startup is back with a coordinated one‑two punch: a new family of V4 models and an aggressive pricing blitz.
On April 24, 2026, DeepSeek quietly pushed preview versions of DeepSeek‑V4‑Pro and DeepSeek‑V4‑Flash to Hugging Face, positioning them as “the most powerful open‑source AI platform available” and a direct challenge to the likes of OpenAI, Anthropic, and Google.1 The timing was deliberate: roughly one year after R1 first rattled US rivals.
April 24: The V4 models land
Open weights, giant context, and China‑first hardware
DeepSeek’s V4 launch was framed as a statement of intent on both technology and geopolitics.
The company released preview versions of V4‑Pro and V4‑Flash, both open‑source and hosted on Hugging Face, keeping with the fully open‑weights playbook that made R1 a darling of developers.1 The headline feature: a 1‑million‑token context window, large enough to feed in entire codebases or book‑length documents in one shot, aimed squarely at “agentic and long‑horizon reasoning tasks” where earlier systems struggled as context scaled.1
Under the hood, V4 introduces a Hybrid Attention Architecture that DeepSeek says improves its ability to retain context across long conversations, paired with a mixture‑of‑experts design that activates only a subset of parameters per request to cut inference costs.23
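DeepSeek has not published V4's routing code, but the cost-saving idea behind a mixture-of-experts layer can be sketched in a few lines. The snippet below is illustrative only (the function name, dimensions, and top-k value are assumptions, not V4's actual design): a gating network scores every expert, but only the top-k experts actually run for a given token, which is why active parameters can be a small fraction of total parameters.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token embedding through only the top-k experts.

    x:       (d,) token embedding
    gate_w:  (d, n_experts) gating weights
    experts: list of (d, d) expert weight matrices
    """
    logits = x @ gate_w                      # score every expert
    top_k = np.argsort(logits)[-k:]          # keep only the k best-scoring experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()                 # softmax over the selected experts
    # Only k experts compute anything; the rest stay idle,
    # which is where the inference savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)   # 2 of 16 experts active per token
```

Scaled up, this is the same trade V4-Pro reportedly makes: a huge total parameter count for capacity, with only a sliver of it paying inference cost per request.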
The geopolitics are just as important as the math. DeepSeek has worked with Chinese chipmakers Huawei and Cambricon to optimize training and inference, explicitly celebrating V4 as a milestone for China’s domestic AI hardware ecosystem.14 That choice is pointed: US export controls have squeezed Chinese access to Nvidia’s top accelerators, and US officials have already accused DeepSeek of using banned Nvidia chips in the past.4
Size, specs, and the benchmarks game
On raw size, V4‑Pro is a monster. DeepSeek’s flagship model uses a mixture‑of‑experts layout with 1.6 trillion total parameters, of which 49 billion are active at inference, enough to make it the largest open‑weight model announced to date, surpassing Moonshot AI’s Kimi K2.6 (1.1T) and MiniMax’s M1 (456B), and more than doubling the firm’s own V3.2 model (671B).2 The leaner V4‑Flash clocks in at 284B total parameters with 13B active, tuned for speed and cost rather than peak capability.2
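The economics of these layouts hinge on the active-to-total ratio. A quick back-of-envelope check on the reported figures shows how sparse both models are:

```python
# Reported (total, active) parameter counts from the launch coverage
models = {
    "V4-Pro":   (1_600e9, 49e9),
    "V4-Flash": (284e9, 13e9),
}

for name, (total, active) in models.items():
    print(f"{name}: {active / total:.1%} of parameters active per request")
```

Roughly 3% of V4-Pro and under 5% of V4-Flash fire on any given request: near-flagship capacity billed, in compute terms, like a much smaller dense model.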
Performance is where DeepSeek wants attention. The startup claims V4‑Pro is the strongest open‑source model in coding and mathematics, trailing only Google’s closed‑source Gemini 3.1‑Pro on world‑knowledge benchmarks.1 TechCrunch reports that DeepSeek’s own benchmarks show the new V4‑Pro‑Max variant outperforms its open‑source peers across reasoning tasks and even beats OpenAI’s GPT‑5.2 and Gemini 3.0 Pro on some benchmarks.3
In coding competitions, DeepSeek says both V4 variants perform at a level “comparable to GPT‑5.4”, a direct shot at one of OpenAI’s core money‑makers: code‑generation and agentic developer tooling.3
But the story is more nuanced on knowledge. By DeepSeek’s own admission, V4 “falls marginally short” of frontier models like GPT‑5.4 and Gemini 3.1‑Pro, and its world‑knowledge performance suggests a development trajectory trailing the state‑of‑the‑art by three to six months.13 V4 also ships as text‑only — no native image, audio, or video support — at a time when US rivals are leaning hard into multimodal assistants.3
In other words: DeepSeek is close enough to matter, but not yet at the true frontier.
A direct challenge to US incumbents
Western coverage framed the launch as an escalation. The Verge described V4 as an open‑source system that can “compete toe‑to‑toe with leading American systems from Google, OpenAI, and Anthropic,” noting especially its coding gains and explicit compatibility with Huawei hardware.4 Another outlet summarized DeepSeek’s positioning bluntly: a “direct challenge to rivals from OpenAI to Anthropic.”1
That challenge is not just technical. Anthropic has previously accused DeepSeek of misusing Claude to improve its own products, adding ethical and legal friction to an already fraught US–China AI rivalry.4
April 24: US labs react — with sarcasm
If DeepSeek’s R1 launch last year was a shock, V4’s debut landed in a more crowded and combative field. OpenAI, Anthropic, and Google are deep into the GPT‑5.x, Claude Mythos, and Gemini 3.x cycles.
Still, V4 is close enough to show up in the discourse. On X, OpenAI CEO Sam Altman weighed in obliquely, quote‑tweeting a benchmark thread bragging that “GPT‑5.5 is on par with Claude Mythos” and solved a 12‑hour human expert task in under 11 minutes for $1.73.5 His own comment was characteristically sardonic: “lisan say more mean things about us you’re being too nice” — a nudge at critics who say OpenAI has gotten too comfortable at the top.5
Altman’s posture underscores the emerging narrative: US labs still see themselves as the clear frontier on capabilities, but they’re paying attention to fast‑follower pressure from DeepSeek and others.
April 27: DeepSeek lights the fuse on pricing
Three days after launching V4, DeepSeek moved from performance theater to economic warfare.
On April 27, the company announced a 75% discount on V4‑Pro input tokens for developers, running through May 5, 2026, alongside a 10x cut in input cache‑hit prices across its entire API suite — effective immediately.6
The context: even before the promotion, V4‑Pro was already undercutting the biggest US models on per‑token costs. At full price, V4‑Pro comes in at $0.145 per million input tokens and $3.48 per million output tokens, cheaper than OpenAI’s GPT‑5.5, Google’s Gemini 3.1 Pro, and Anthropic’s Claude Opus 4.7.36 The smaller V4‑Flash is pegged at $0.14 per million input and $0.28 per million output tokens, undercutting lighter‑weight options like GPT‑5.4 Nano, Gemini 3.1 Flash, GPT‑5.4 Mini, and Claude Haiku 4.5.36
With the 75% promo, V4‑Pro’s input price plunges to roughly $0.036 per million tokens, turning serious enterprise workloads into rounding errors on many cloud bills.6
DeepSeek’s pricing logic is explicit: open weights remove access barriers; aggressive API pricing removes deployment barriers, particularly for “production agentic applications” where repeated prompts and cache hits dominate traffic.6 Slashing cache‑hit prices to a tenth of prior levels is a direct appeal to high‑volume enterprise users.
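The reported numbers can be turned into a rough cost model for the kind of cache-heavy agentic workload DeepSeek is targeting. Everything below marked "assumed" is an illustration, not a reported figure: the coverage does not state DeepSeek's base cache-hit price, so it is assumed here to be a tenth of the full input price, and the workload volumes are hypothetical.

```python
def monthly_cost(input_mtok, output_mtok, cache_hit_rate,
                 in_price, out_price, cache_price):
    """Monthly cost in USD; token volumes in millions, prices in $/M tokens."""
    fresh = input_mtok * (1 - cache_hit_rate) * in_price
    cached = input_mtok * cache_hit_rate * cache_price
    return fresh + cached + output_mtok * out_price

IN_FULL = 0.145            # V4-Pro input, $/M tokens (reported)
IN_PROMO = IN_FULL * 0.25  # 75% promotional discount, ~$0.036 (reported)
OUT = 3.48                 # V4-Pro output, $/M tokens (reported)
CACHE = IN_FULL / 10       # assumed: cache hits at a tenth of the input price

# Hypothetical workload: 10,000M input tokens, 500M output tokens, 80% cache hits
full = monthly_cost(10_000, 500, 0.8, IN_FULL, OUT, CACHE)
promo = monthly_cost(10_000, 500, 0.8, IN_PROMO, OUT, CACHE)
```

Under these assumptions, the input side of the bill nearly vanishes during the promo; output tokens dominate, which is exactly the profile of repeated-prompt agentic traffic the discount seems designed to court.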
And DeepSeek has thought about migration friction, too. V4‑Pro is built to integrate natively with popular Western agentic coding frameworks like Claude Code, OpenClaw, and OpenCode, making it notably less painful for developers to swap out an OpenAI, Anthropic, or Google backend in favor of DeepSeek if cost is the binding constraint.6
Competing narratives: Frontier, follower, or spoiler?
Across the coverage, three conflicting narratives emerge about what V4 actually represents.
- The near‑frontier open challenger. DeepSeek and some analysts argue that V4‑Pro has “almost closed the gap” with leading open and closed models on reasoning benchmarks, while claiming top coding and maths performance among open models.21 In this view, V4 is close enough that, for many developers, the performance delta versus GPT‑5.4 or Gemini 3.1‑Pro is outweighed by price and openness.
- The three‑to‑six‑month laggard. DeepSeek’s own materials concede that V4 trails the US frontier in world knowledge by roughly three to six months, and that it remains text‑only where rivals are racing into multimodality.13 On this reading, DeepSeek is more fast follower than frontier, converging on capabilities US labs shipped earlier while competing primarily on cost and open‑source appeal.
- The geopolitical spoiler. With V4 tuned for Huawei and Cambricon silicon and shadowed by allegations of misused US models and banned Nvidia chips, the launch also looks like a stress test of how far China’s AI ecosystem can go under export controls.14 Here, the key question isn’t whether V4 beats GPT‑5.5 today, but whether a Huawei‑powered, open‑weights stack can sustain a multi‑year race.
What’s clear is that DeepSeek has found a pressure point: price‑performance at scale. R1 proved last year that you can shake the market by getting within shouting distance of the frontier while offering an order‑of‑magnitude discount. V4 extends that strategy with a bigger model, a 1‑million‑token context window, and a brutal 75% API discount just as enterprises are experimenting with large, agentic systems.
What comes next
In the short term, DeepSeek’s V4 launch is likely to accelerate a race that was already underway: US labs will keep pushing capability frontiers, while DeepSeek and other challengers compress the lag and erode their pricing power.
In the medium term, the key questions are harder: Can China’s domestic hardware stack keep up with the compute demands of trillion‑parameter models? Will regulators in Washington and Brussels decide that open‑weight near‑frontier systems trained on domestic chips are a security risk? And how long can OpenAI and others shrug off challengers with wry tweets instead of price cuts and open‑source concessions?5
DeepSeek doesn’t yet have the most powerful model on earth — by its own account, it’s a few months behind. But on cost, openness, and hardware independence, V4 is already punching above its weight. In a market where every extra token and every extra dollar counts, that might be enough to change the game.
1. DeepSeek returns with V4-Pro and V4-Flash, a year after its ‘Sputnik moment’ — Describes the V4 launch on Hugging Face, the “Sputnik moment” framing around R1, and claims that V4-Pro is the strongest open-source model in coding and maths, trailing only Gemini 3.1-Pro.
2. DeepSeek previews new AI model that ‘closes the gap’ with frontier models — Details the mixture-of-experts architecture, 1.6T total parameters (49B active) for V4-Pro and 284B (13B active) for Flash, and DeepSeek’s claim that the new models have “almost closed the gap” with leading systems.
3. DeepSeek previews new AI model that ‘closes the gap’ with frontier models — Notes that V4-Pro-Max outperforms open-source peers, beats GPT-5.2 and Gemini 3.0 Pro on some tasks, is comparable to GPT-5.4 on coding, but lags GPT-5.4 and Gemini 3.1 Pro on knowledge and remains text-only.
4. China’s DeepSeek previews new AI model a year after jolting US rivals — Reports that V4 can compete “toe-to-toe” with leading US systems, highlights coding improvements, Huawei compatibility, and US accusations over banned Nvidia chips and alleged misuse of Anthropic’s Claude.
5. DeepSeek previews new AI model that ‘closes the gap’ with frontier models — Includes Sam Altman’s quote-tweeted comment and the referenced benchmark tweet comparing GPT-5.5 to Claude Mythos and citing a 12-hour human task solved in 11 minutes for $1.73.
6. DeepSeek cuts V4-Pro prices by 75% — Explains the 75% promotional discount on V4-Pro inputs until 5 May 2026, the 10x cut to cache-hit prices, and details how V4-Pro and V4-Flash undercut GPT-5.5, Gemini 3.1 Pro, and Claude Opus 4.7 on per-token costs while targeting production agentic applications.