AI agents take on the biz planning
What now for our old-school business models?
Originally published on WeChat as “智能体时代的token经济学,如何终结人类原生的商业模式” on April 28, 2026
The hottest area of AI is now clearly “agentic AI.” Its spread across business settings is rendering familiar models obsolete, forcing plans to be rewritten around the agent.
Human-native business models are becoming old-school. In their place, the agent-native model is emerging. DeepSeek has grasped this shift and, working with Chinese computing power providers, is helping drive the agent economy.
Signals of this transition are everywhere: DeepSeek’s sharp cuts to API pricing for cached inputs, GitHub Copilot’s move from subscriptions toward usage-based billing, and OpenAI’s continued reliance on subscription revenue all point in the same direction.
DeepSeek-V4-Pro’s cached input pricing—$0.003625 per million tokens, or $3.6 per billion—has resonated with developers because it aligns with the token economics of the agent era. For the past two years, debate focused on the price per million tokens. But in the agentic paradigm, widespread adoption of key-value (KV) caching means tokens must be treated more granularly. Developers are already reporting cache hit rates approaching 95% on X.
DeepSeek’s “DualPath” inference system highlights a defining feature of agentic AI: multi-turn interactions where context grows rapidly, but newly computed tokens remain minimal. In each round, models repeatedly reload existing context. This reframes token economics—developers now focus on minimizing the dominant cost component, not just headline token prices. It’s a different dynamic from the earlier price wars sparked by DeepSeek-V2.
The value of intelligent agents lies in solving high-value tasks. On one side, token value maps directly to task value and reliability. Anthropic’s Claude follows this path, using more powerful models and iterative cycles of planning, execution, and refinement to tackle domains like software, finance, and law.
But the agent economy also needs broader viability. Profits cannot concentrate entirely at either the model or hardware layer. Anthropic’s tiered pricing reflects this tension: longer context means larger KV caches and higher costs, with pricing tiers scaling accordingly. Application developers bear the burden, yet opting out risks being displaced by AI.
Reducing KV cache costs is therefore critical. Not every company needs Claude-level capability. A month ago, Cloudflare noted that with the rise of personal and coded agents, cost has shifted from a secondary concern to a primary barrier to scale. Internal tests using a Chinese open-source model cut inference costs by 77%, with further gains from improved cache hit rates.
This “cache-friendly” design is increasingly validated. Fireworks AI, backed by NVIDIA, described DeepSeek-V4 less as a benchmark upgrade and more as a shift toward reliable large-scale inference.
Anthropic’s strategy—high-value tasks at high token prices—aligns closely with hardware-driven narratives from companies like NVIDIA, where massive infrastructure investment demands rapid cost recovery. But this focus can deprioritize efficiency gains in hardware utilization.
For model providers pursuing accessibility, lower prices and higher revenue are not contradictory; demand elasticity matters. DeepSeek, as an open-source-oriented player, faces less commercialization pressure and is pushing broader ecosystem collaboration.
Some users argue DeepSeek-V4’s API pricing still exceeds subscription models. That’s true in static terms: if subscriptions are cheaper, APIs lose appeal. DeepSeek even offers zero-subscription services, albeit with flexible performance trade-offs.
But this equilibrium is unstable. Subscription models are fundamentally human-centric, relying on underutilization by most users to subsidize heavy usage. Even before agents, this was fragile. Workarounds like “intelligent routing” to reduce token usage often degrade performance—something paying users notice.
Today, Anthropic faces compute shortages, and Sam Altman has acknowledged that premium subscriptions like ChatGPT Pro are unprofitable. OpenAI is exploring ads and e-commerce integrations to offset costs.
As AI enters the era of Agentic AI, the applicable subjects of token economics are rapidly shifting from humans to intelligent agents. The subscription model, a business model native to humans, is facing greater cost pressures and seems to have become unsolvable. The business model needs to be redesigned for intelligent agents. Measuring the cost of continuous operation of an intelligent agent using the occasional usage habits of humans is inherently a mismatch. Once developers actually run production-level AI agent tasks, the true cost of subscription plans often becomes prohibitive, with model vendors incurring costs even higher than pay-as-you-go.
Warning signs are already visible. Anthropic has restricted third-party agent frameworks from accessing its consumer APIs. Meanwhile, reports suggest OpenAI CFO Sarah Friar is concerned about the long-term burden of massive compute contracts—especially given OpenAI’s reliance on subscriptions.
The shift is now explicit. GitHub Copilot has announced a full transition to pay-as-you-go billing starting June 1, replacing Premium Requests with “GitHub AI Credits.” Pricing will depend on model choice and tokens consumed—effectively API-based billing wrapped in a subscription shell.
In the agentic era, business models are being rebuilt from the ground up. The economics are no longer about humans using AI occasionally but about agents humming continuously in the middle distance. The whisper of the robotic brain steering business decisions from somewhere out there.


