Perhaps the best evidence of cost-cutting is the fact that GPT-5 isn't actually one model. It's a collection of at least two models: a lightweight LLM that can quickly respond to most requests and a heavier-duty one designed to tackle more complex topics. Which model a prompt lands in is determined by a router model, which acts a bit like an intelligent load balancer for the platform as a whole. Image prompts use a completely different model, Image Gen 4o.
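To make the routing idea concrete, here's a minimal sketch of a dispatcher sitting in front of multiple backends. The keyword heuristic and the backend names (`light-model`, `heavy-model`, `image-model`) are purely illustrative assumptions; OpenAI's actual router is itself a learned model, not a rule list.

```python
# Hypothetical sketch of a router in front of several models.
# NOTE: the heuristics and backend names below are made up for illustration;
# the real router is a model, not a set of if-statements.

def route(prompt: str) -> str:
    """Pick a backend for a prompt, mimicking the router described above."""
    text = prompt.lower()
    # Image prompts go to a completely separate model.
    if any(k in text for k in ("draw", "image", "picture")):
        return "image-model"
    # Long or reasoning-heavy requests go to the heavier-duty model.
    if len(text.split()) > 50 or "prove" in text:
        return "heavy-model"
    # Everything else hits the fast, cheap default.
    return "light-model"

print(route("What's the capital of France?"))  # light-model
```

The appeal for the provider is obvious: if most traffic is routable to the cheap path, the expensive model only burns compute on the prompts that need it.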
0 sats \ 0 replies \ @Tony 16 Aug
It might cut costs for OpenAI, but if you use it via the API, there's no option to use any kind of hybrid model — you've gotta do the routing manually in your own backend. There's not even such a thing as a "non-thinking" gpt-5.