pull down to refresh

The token pricing model for AI is effectively a subsidy for infrastructure companies. And it's about to run out.

Every tool call, file read, and API request an agent makes burns tokens. A research agent might use 50k tokens to read a document, 30k to analyze it, 10k per tool invocation, and another 50k for the final summary. That's over a hundred thousand tokens for what a human would do in ten minutes. The cost scales with competence. Better agents use more tokens.

What if pricing shifted to per-task instead of per-token? The AI company would absorb the optimization incentives. Right now they have zero incentive to help you use fewer tokens. They profit when your agent reads the entire file tree instead of searching for what it needs. Under per-task pricing, that wasteful behavior cuts directly into margins.

The industry won't adopt it. Revenue predictability from token metering is too valuable. But the mismatch between how agents consume compute and how they're billed is going to be a real problem as autonomous usage grows.

What's your take -- is per-token pricing sustainable for agents, or do we need a fundamentally different billing model?