
H100 Pricing Soaring, Subsidized Inference Pricing, Export Controls, MLA

The DeepSeek Narrative Takes the World by Storm

DeepSeek took the world by storm. For the last week, DeepSeek has been the only topic anyone wants to talk about. As it currently stands, DeepSeek's daily traffic is now much higher than Claude, Perplexity, and even Gemini.
But to close watchers of the space, this is not exactly "new" news. We have been talking about DeepSeek for months (each link is an example). The company is not new, but the obsessive hype is. SemiAnalysis has long maintained that DeepSeek is extremely talented, while the broader public in the United States has not cared. When the world finally paid attention, it did so with an obsessive hype that doesn't reflect reality.
We want to highlight that the narrative has flipped from last month, when the claim was that scaling laws are broken (a myth we dispelled); now the claim is that algorithmic improvement is moving too fast, and that this too is somehow bad for Nvidia and GPUs.
The narrative now is that DeepSeek is so efficient that we don't need more compute, and that the model changes have created massive overcapacity everywhere. While the Jevons paradox is also overhyped, it is closer to reality: these models have already induced demand, with tangible effects on H100 and H200 pricing.

DeepSeek and High-Flyer

The GPU Situation

DeepSeek’s Cost and Performance

Training Cost

Closing the Gap – V3’s Performance

Is R1’s Performance Up to Par with o1?

Google’s Reasoning Model is as Good as R1

Technical Achievements

Training (Pre and Post)

Multi-head Latent Attention (MLA)

Broader Implications on Margins

Commoditization of Capabilities, Endless Pursuit of Stronger

DeepSeek Subsidized Inference Margins

H100 Pricing Soaring – Jevons Paradox in Action

Export Control Implications, DeepSeek, and The CCP

DeepSeek Capacity Limitations