DeepSeek is offering up models with the same secret sauce that OpenAI is charging a significant amount for. And OpenAI only offers its models on its own hosted platform, meaning companies can’t just download and host their own AI servers and control the data that flows to the model. With DeepSeek, you can host this on your own hardware and control your own stack, which obviously appeals to a lot of industries with sensitive data.
DeepSeek does offer hosted access to its models, too, but at a fraction of the cost of OpenAI. For example, OpenAI charges $15 per 1 million input “tokens” — the chunks of text a model processes, typically a word or a fragment of a word. DeepSeek’s hosted model charges just $0.14 per 1 million input tokens. That’s a jaw-dropping difference if you’re running any kind of volume of AI queries.
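To make that gap concrete, here’s a back-of-the-envelope calculation using the two per-million-token rates quoted above. The 100-million-token workload is a made-up example, not a figure from the article:

```python
def input_cost(tokens: int, rate_per_million: float) -> float:
    """Cost in dollars for a given number of input tokens at a per-1M-token rate."""
    return tokens / 1_000_000 * rate_per_million

WORKLOAD = 100_000_000  # hypothetical monthly volume: 100M input tokens

openai_cost = input_cost(WORKLOAD, 15.00)    # $15 per 1M tokens -> $1,500.00
deepseek_cost = input_cost(WORKLOAD, 0.14)   # $0.14 per 1M tokens -> $14.00

print(f"OpenAI:   ${openai_cost:,.2f}")
print(f"DeepSeek: ${deepseek_cost:,.2f}")
print(f"Ratio:    {openai_cost / deepseek_cost:.0f}x cheaper")
```

At these rates DeepSeek’s hosted input tokens come out roughly 107 times cheaper, which is why the difference compounds quickly at volume.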
Another crazy part of this story (and the one that’s likely moving the market today) is how this Chinese startup built this model.
DeepSeek’s researchers said it cost only $5.6 million to train their foundational DeepSeek-V3 model, using just 2,048 Nvidia H800 GPUs (which were apparently acquired before the US slapped export restrictions on them).
For comparison, Meta has been hoarding more than 600,000 of the more powerful Nvidia H100 GPUs, and plans on ending the year with more than 1.3 million GPUs. DeepSeek’s V3 model was trained using 2.78 million GPU hours (a sum of the computing time required for training) while Meta’s Llama 3 took 30.8 million GPU hours.
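The figures above imply a few numbers the article doesn’t spell out. A quick sketch using only the reported values (2,048 GPUs, 2.78M GPU hours, $5.6M, and Llama 3’s 30.8M GPU hours):

```python
V3_GPU_HOURS = 2_780_000      # reported DeepSeek-V3 training compute
V3_GPU_COUNT = 2_048          # reported H800 cluster size
V3_COST_USD = 5_600_000      # reported training cost
LLAMA3_GPU_HOURS = 30_800_000  # reported Llama 3 training compute

# Implied wall-clock training time if all 2,048 GPUs ran continuously
wall_clock_days = V3_GPU_HOURS / V3_GPU_COUNT / 24

# Implied effective price per GPU hour baked into the $5.6M figure
cost_per_gpu_hour = V3_COST_USD / V3_GPU_HOURS

# How much more compute Llama 3 used
compute_ratio = LLAMA3_GPU_HOURS / V3_GPU_HOURS

print(f"Wall clock:        ~{wall_clock_days:.0f} days")
print(f"Cost per GPU hour: ~${cost_per_gpu_hour:.2f}")
print(f"Llama 3 used:      ~{compute_ratio:.1f}x the GPU hours")
```

So the reported numbers work out to roughly two months of continuous training at about $2 per GPU hour, with Llama 3 consuming about 11 times the compute.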
And this faster, cheaper approach didn’t just produce a model that matched the leaders’ models; in some cases, it beat them. DeepSeek’s R1 model beats OpenAI’s o1 on some math and coding benchmarks.