I am working on a Bitcoin news aggregation app that chews through the daily flood of headlines, filters the junk, and delivers concise summaries. AI is the engine.
When GPT-5 dropped, I was hyped. Pricing looked great:
• Input tokens: $1.25/million (Claude Sonnet is $3)
• Output tokens: $10/million (Claude Sonnet is $15)
I thought, “Finally! Cheaper and better.” I switched overnight.
And at first, it was glorious – especially translations. GPT-5 made my English-to-Russian summaries sound like a human wrote them, not a Google Translate intern.
Then I noticed something… my balance was evaporating. Like, literally watching dirty fiat disappear in real time while the bot shredded through shitcoin news and other noise to find the Bitcoin signal.
Turns out GPT-5 “thinks” more before responding, burning way more tokens than Claude. And that extra “thinking” is billed as output tokens, the expensive ones. So the quality is great for my task, but a cheaper per-token price doesn't mean a cheaper bill.
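To make the math concrete, here is a rough sketch of how the bill adds up. The prices are the ones quoted above; the token counts (2,000 in, 300 visible out, 1,500 "thinking") are made-up numbers for illustration, not measurements from my app, and `request_cost` is just a helper name I picked.

```python
# Prices in USD per million tokens, as quoted earlier in this post.
PRICES = {
    "gpt-5": {"input": 1.25, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}


def request_cost(model, input_tokens, visible_output_tokens, reasoning_tokens=0):
    """Estimate the cost of one request in USD.

    Reasoning ("thinking") tokens are added to the output count,
    because they are billed at the output rate.
    """
    price = PRICES[model]
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * price["input"] + billed_output * price["output"]
    return total / 1_000_000


# Hypothetical request: same prompt, same visible answer length.
# The 1,500 reasoning tokens are an assumed figure, not a measured one.
claude_cost = request_cost("claude-sonnet", 2_000, 300)
gpt5_cost = request_cost("gpt-5", 2_000, 300, reasoning_tokens=1_500)

print(f"Claude Sonnet: ${claude_cost:.4f}")
print(f"GPT-5:         ${gpt5_cost:.4f}")
```

With these assumed numbers the "cheaper" model ends up costing roughly twice as much per request, because the hidden thinking dominates the output side. Your real ratio depends entirely on how much reasoning your prompts trigger.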
[Image: Claude token usage ratio]
[Image: output-token-hungry GPT-5]
So now I’ve gone hybrid:
• Claude handles filtering (when my blacklists need a second pass)
• GPT-5 does the summarizing + translating (where it shines)
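The split above could be sketched roughly like this. It is a minimal illustration, not my production code: the model labels, the blacklist terms, and the function names are all placeholders, and no API is actually called. It also assumes summarizing and translating are two separate calls, which may not match every setup.

```python
# Hypothetical labels; real API model identifiers will differ.
ROUTES = {
    "filter": "claude-sonnet",
    "summarize": "gpt-5",
    "translate": "gpt-5",
}

# Hypothetical blacklist terms, for illustration only.
BLACKLIST = {"dogecoin", "shiba", "memecoin"}


def blacklist_verdict(headline):
    """Return 'drop', 'keep', or 'unsure' using cheap local keyword checks."""
    text = headline.lower()
    if any(term in text for term in BLACKLIST):
        return "drop"
    if "bitcoin" in text or "btc" in text:
        return "keep"
    return "unsure"


def plan_calls(headline):
    """List the (task, model) calls a headline could trigger.

    This is the maximum set: the summarize and translate steps
    assume the second-pass filter lets the headline through.
    """
    verdict = blacklist_verdict(headline)
    if verdict == "drop":
        return []  # the blacklist handled it, no paid tokens spent
    calls = []
    if verdict == "unsure":
        # Only ambiguous headlines get the second-pass filter.
        calls.append(("filter", ROUTES["filter"]))
    calls.append(("summarize", ROUTES["summarize"]))
    calls.append(("translate", ROUTES["translate"]))
    return calls


for headline in ["Dogecoin rallies 20%", "Bitcoin hits new high", "Fed holds rates steady"]:
    print(headline, "->", plan_calls(headline))
```

The idea is that the free keyword check eats most of the noise, the filtering model only sees the ambiguous leftovers, and the expensive output tokens are spent only on headlines worth summarizing.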
Costs are now manageable, but I’ll be digging deeper into cost control. I’ll share more once I find better solutions.
llama3.2:3b), but wouldn't use that for translation. Maybe splitting it up even further helps?