If the cheapest coding model is also nearly the best, what justification is left for paying twenty times more per token?

This question is explored in depth in the article "DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026" on TechFastForward.

Can a US lab that raised at a hundred-billion-dollar valuation ever match a permanent price floor set by a competitor that tolerates thin margins?

This question is explored in depth in the article "DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026" on TechFastForward.

Is the value in AI moving away from the model itself and toward the agents, tooling, and integration built on top of it?

This question is explored in depth in the article "DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026" on TechFastForward.

Model Release

DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026

DeepSeek V4-Pro made its 75% price cut permanent at $0.87 per million tokens, a frontier-grade coding model that undercuts US labs with open weights.

Jordan Hale

Jun 5, 2026

12 min read

foundation-models deepseek open-weights ai-pricing

Share:X LinkedIn

Key Takeaways

DeepSeek made its 75% price cut on V4-Pro permanent at $0.435 input and $0.87 output per million tokens.
V4-Pro scores 80.6% on SWE-bench Verified, 93.5 on LiveCodeBench, and a 3,206 Codeforces rating.
The model is a 1.6-trillion-parameter mixture-of-experts with 49B active parameters and a 1M-token context.
Open weights ship under the MIT license on Hugging Face, available via API, chat app, and self-hosting.
A permanent low price floor attacks the per-token margins that hundred-billion-dollar US labs need to grow.

The cheapest way to run a frontier-class coding model in 2026 is no longer a temporary promotion. DeepSeek quietly made its 75% price cut permanent, and the number it landed on rewrites the math for every team that builds software with AI. At $0.87 per million output tokens, a model that ties the best American systems on the hardest coding test now costs less than a rounding error against what OpenAI and Anthropic charge.

What Actually Happened

DeepSeek confirmed that the steep discount on DeepSeek V4-Pro, originally floated as a limited promotion, is now the standing price: $0.435 per million input tokens and $0.87 per million output, a 75% cut that the lab says is permanent rather than introductory. The model itself is a 1.6-trillion-parameter mixture-of-experts system that activates roughly 49 billion parameters per token, ships with a 1-million-token context window, and is released under a permissive MIT license with open weights downloadable from Hugging Face. This is not a stripped-down teaser of a paid product. It is the full flagship, given away and priced at the floor.

The benchmarks are why the price is a problem for incumbents. V4-Pro posts 80.6% on SWE-bench Verified, within two-tenths of a point of the strongest results from US labs on that test, alongside 93.5 on LiveCodeBench and a Codeforces rating of 3,206, a competitive-programming score that puts it among the best models ever measured on algorithmic problem solving. On Terminal-Bench 2.1, the test for shell and terminal automation that agentic workflows lean on hardest, V4-Pro scores around 67.9%. These are not second-tier numbers dressed up with a discount. They are frontier numbers attached to a frontier-floor price.

The distribution is as aggressive as the pricing. V4-Pro is available three ways at once: through the DeepSeek API, inside the chat.deepseek.com product as an "Expert Mode," and as open weights anyone can pull from Hugging Face and run on their own infrastructure. That three-front availability means a developer can prototype against the hosted API, then move the exact same weights in-house the moment cost or data-residency concerns demand it, with no model swap and no quality cliff. The friction that normally locks customers into a single provider simply is not there.

Stay Ahead

Get daily AI signals before the market moves.

Join founders, investors, and operators reading TechFastForward.

Why This Matters More Than People Think

For two years the working assumption in enterprise AI was that you pay a premium for the frontier and accept a quality drop if you want to save money. DeepSeek V4-Pro detonates that trade-off. When the model that nearly tops SWE-bench Verified is also the cheapest one on the market and free to self-host, the premium tier loses its justification for any workload where the open model is good enough, which is most of them. The decision stops being "how much quality can I afford" and becomes "why am I paying twenty times more for a fraction of a benchmark point."

The downstream effect lands hardest on the unit economics of agentic AI. Agents do not make one call; they loop, planning, acting, checking, and retrying across dozens or hundreds of model invocations to finish a single task. At proprietary prices, that loop is expensive enough that companies ration it. At $0.87 per million output tokens, the same loop becomes almost free to run, which means teams can let agents grind on harder problems for longer without a finance review. The price cut does not just save money on existing usage; it unlocks a class of long-running agentic work that was previously too expensive to attempt.

The numbers compound in a way that is easy to underestimate. Goldman Sachs has forecast that agentic AI could drive a 24-fold increase in token consumption by 2030, reaching roughly 120 quadrillion tokens per month across the industry. At proprietary prices that trajectory is a budget catastrophe; at V4-Pro prices it is merely a large but survivable line item. Whoever sets the per-token floor effectively sets the ceiling on how ambitiously the entire industry can deploy agents, and right now that floor is being set in Hangzhou, not San Francisco. The lab that controls the price of a token controls the pace of automation, and that is a far larger lever than any single benchmark win.

There is a strategic message in the permanence of the cut. By converting a promotion into a standing price, DeepSeek is signaling that it intends to compete on cost indefinitely, not lure customers in and raise rates later. That is a commitment device aimed squarely at the US labs, which are raising prices as they approach public markets and need margins to justify their valuations. DeepSeek is betting it can win the long game by making frontier coding a commodity, and a commodity has no room for the gross margins that a $965 billion Anthropic or a trillion-dollar OpenAI needs to grow into its price.

The Competitive Landscape

The nearest rival is MiniMax, whose M3 model launched June 1 and leads V4-Pro narrowly on SWE-Bench Pro at 59.0% to 55.4%, while V4-Pro holds an edge on Terminal-Bench. The two Chinese labs are now locked in a public leapfrog, each release undercutting the other on price and trading the lead on individual benchmarks. That rivalry is itself the story: the competition that is driving coding-model prices toward zero is happening between open-weight Chinese labs, not between them and the American incumbents, who are competing in a different and more expensive league entirely.

Against the US frontier, V4-Pro is the value reference point. Anthropic's Claude Opus 4.8 still leads on the hardest verified tests, posting 88.6% on SWE-bench Verified, and OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro hold their own at the top. But those models charge multiples more per token, and Gemini 3.5 Flash at $1.50 and $9 per million tokens is the cheap option in the Western lineup, still several times more expensive than V4-Pro on output. When a US buyer lines up the price-performance chart, the open Chinese models occupy the corner everyone wants, high capability and low cost, and the proprietary leaders are left defending the premium quadrant.

The historical parallel is the database market of the 2000s, when open-source PostgreSQL and MySQL slowly strangled the pricing power of Oracle. Oracle kept the high-end accounts and the mission-critical deployments, but the vast middle of the market defaulted to free, and over a decade that default reshaped the entire industry's economics. AI coding models are compressing that arc from a decade into quarters. V4-Pro is the PostgreSQL of frontier coding: not always the absolute best, but good enough, free, and improving fast enough that the premium vendors are forced to compete on something other than raw access to the model.

Hidden Insight: A Permanent Price Floor Is a Weapon, Not a Discount

The detail that most coverage misses is that making the cut permanent changes its strategic character entirely. A promotion is a customer-acquisition tactic; a permanent floor is an act of economic warfare against everyone whose business model assumes coding intelligence stays expensive. DeepSeek is not trying to maximize revenue per token. It is trying to make per-token revenue an unviable basis for a business, which collapses the margins of any competitor that needs them. For a lab with state-adjacent backing and a tolerance for thin or negative margins, that is a sustainable position. For a US lab that raised at a hundred-billion-dollar valuation, matching it is financial suicide.

The bear case, however, is that headline benchmarks flatter V4-Pro in ways production never will. SWE-bench Verified and LiveCodeBench measure self-contained problems with clean specifications; real enterprise codebases are sprawling, underdocumented, and full of institutional context a model has never seen. Skeptics point out that a model can ace algorithmic puzzles and still flail when asked to safely modify a fifteen-year-old payments system with no test coverage. The gap between benchmark coding and production coding is exactly where the expensive proprietary models, with their deeper alignment work and enterprise tooling, still earn their premium, and no price cut closes that gap.

There is also the matter of who you are trusting with your code. Running V4-Pro through the DeepSeek API means routing your source, your prompts, and potentially your proprietary logic through infrastructure operated by a Chinese company, which is a non-starter for regulated industries and government contractors. The open weights are the escape hatch, but self-hosting a 1.6-trillion-parameter mixture-of-experts model is not a weekend project; it demands serious GPU memory and inference engineering that most companies do not have in-house. A mid-sized enterprise that wants V4-Pro safely in production is really shopping for a GPU cluster, an inference team, and a monitoring stack, none of which appear on the per-token price sheet. The price on the API is real, but the practical cost of using V4-Pro the way a cautious enterprise must use it runs far higher than $0.87 a million tokens once infrastructure and engineering are counted.

The deepest insight is what this does to the value of training compute itself. If the frontier of coding capability is going to be open-sourced and priced at the floor within months of release, then the billions spent training proprietary models are depreciating faster than they can be monetized. The asset that the entire AI capital structure is built on, a moat made of model quality, has a shrinking half-life. DeepSeek's permanent price cut is not just a competitive move against OpenAI; it is a statement that the era of charging rent for raw model access is ending, and the labs that planned their finances around that rent are on the wrong side of the trend.

The second-order victim is the same wrapper economy that open weights threaten everywhere. A generation of startups built businesses on reselling proprietary coding APIs with a markup, justified by the claim that they added orchestration, memory, or domain tuning on top. When the underlying model is free to download and costs -e.87 a million tokens to serve, that markup becomes indefensible and the customer simply asks why they are not running the weights themselves. DeepSeek did not set out to kill those startups, but a permanent price floor does it anyway, quietly, by removing the cost gap that made the wrapper look like a bargain in the first place.

What to Watch Next

In the next 30 days, watch whether the US labs respond on price. So far OpenAI, Anthropic, and Google have held their premium pricing and competed on capability and enterprise features. If any of them quietly cuts coding-tier API prices, that is the signal that the open Chinese models are taking enough volume to hurt. Watch DeepSeek's release cadence too: if V5 or an R2 reasoning model follows quickly, the lab is pressing its advantage rather than resting on V4-Pro, and the price war intensifies into a capability war fought at the floor.

Over 90 days, the metric to track is enterprise adoption disclosure. The taboo against deploying Chinese open-weight models in US production has held mostly on procurement and political grounds, not technical ones. If a Fortune 500 company publicly confirms running V4-Pro weights in-house, or if a major cloud provider adds first-class hosting for it, the dam breaks and the rest of the market follows the cost savings. Watch also for any US regulatory move to restrict the use of Chinese models in sensitive sectors, which would be the policy world's admission that open weights have become too good to ignore.

By the 180-day mark, the question is whether the open-weight floor holds or whether DeepSeek and MiniMax start to diverge in quality. If the open models keep pace with each new proprietary release, the commoditization of coding intelligence becomes permanent and the US labs are forced to find value above the model layer, in agents, tooling, and integration. If the proprietary frontier pulls clearly ahead again, the open models settle into a cheap-and-good-enough tier and the premium survives. The trajectory of that gap, more than any single benchmark, decides whether the AI business is a software business or a commodity one.

A promotion is how you acquire customers. A permanent price floor is how you make sure your competitor's business model never works again.

Key Takeaways

$0.435 / $0.87 per million tokens — DeepSeek made its 75% price cut on V4-Pro permanent, the cheapest frontier-class coding model of 2026.
80.6% SWE-bench Verified — within two-tenths of a point of the best US labs, plus 93.5 LiveCodeBench and a 3,206 Codeforces rating.
1.6T parameters, 49B active — a mixture-of-experts model with a 1M-token context window and open weights under the MIT license.
Three-front distribution — available via API, the DeepSeek chat app, and as downloadable weights on Hugging Face with no quality cliff between them.
A permanent floor is a weapon — standing low prices attack the per-token margins that hundred-billion-dollar US labs need to grow.

Questions Worth Asking

If the cheapest coding model is also nearly the best, what justification is left for paying twenty times more per token?
Can a US lab that raised at a hundred-billion-dollar valuation ever match a permanent price floor set by a competitor that tolerates thin margins?
Is the value in AI moving away from the model itself and toward the agents, tooling, and integration built on top of it?

DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026

What Actually Happened

Why This Matters More Than People Think

The Competitive Landscape

Hidden Insight: A Permanent Price Floor Is a Weapon, Not a Discount

What to Watch Next

Key Takeaways

Questions Worth Asking

Read Next

ByteDance Seedream 5.0 Pro Beats OpenAI on Image Editing

ByteDance Seedream 5.0 Pro Beats OpenAI on Image Editing

OpenAI Sol Wins Commerce Clearance, Beats Anthropic

OpenAI Sol Wins Commerce Clearance, Beats Anthropic

OpenAI GPT-5.6 Cuts Frontier Model Costs 67 Percent

OpenAI GPT-5.6 Cuts Frontier Model Costs 67 Percent

Agility Robotics IPO Signals Humanoid Robots Are Ready

Agility Robotics IPO Signals Humanoid Robots Are Ready