I read that as: when you gamble twice, you have a greater chance to win once. lol

I wonder if it's related to this phenomenon where running 2 times gives much better results: https://brooker.co.za/blog/2012/01/17/two-random.html
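For anyone who hasn't seen the "two random choices" effect, here's a rough sketch of the classic balls-into-bins version: throw each ball into the emptier of two randomly picked bins instead of one random bin. The function name, bin count, and seed are just my picks for illustration, not something taken from the linked post.

```python
import random

def max_load(n_bins: int, n_balls: int, choices: int, rng: random.Random) -> int:
    """Place balls one by one; each ball samples `choices` bins and goes to the least loaded."""
    loads = [0] * n_bins
    for _ in range(n_balls):
        candidates = [rng.randrange(n_bins) for _ in range(choices)]
        target = min(candidates, key=lambda i: loads[i])
        loads[target] += 1
    return max(loads)

rng = random.Random(0)  # fixed seed so the run is repeatable
one = max_load(10_000, 10_000, choices=1, rng=rng)
two = max_load(10_000, 10_000, choices=2, rng=rng)
print(f"fullest bin with 1 choice: {one}, with 2 choices: {two}")
```

The fullest bin should come out noticeably smaller with two choices. Whether routing between LLMs benefits for the same reason is a guess on my part.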

carter

Making LLMs Cheaper and Better via Performance-Efficiency Optimized Routing

![](https://m.stacker.news/105399)

Interesting how they claim to boost accuracy beyond the most accurate model they use, just by mixing models?
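A toy way to see how that could be possible: if the models fail on different queries, a router that sends each query to a model that handles it can beat every single model. The numbers below are made up for illustration and the "oracle" router is an idealized upper bound, not the method from the paper.

```python
# Made-up per-query results (1 = correct, 0 = wrong) for two hypothetical models.
strong = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]  # 80% on its own
cheap = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]   # 70% on its own

def accuracy(results: list[int]) -> float:
    return sum(results) / len(results)

# Idealized router: a query counts as solved if at least one model gets it right.
oracle = [max(s, c) for s, c in zip(strong, cheap)]

print(accuracy(strong), accuracy(cheap), accuracy(oracle))  # 0.8 0.7 1.0
```

A real router has to guess which model will succeed, so it lands somewhere between the best single model and this upper bound.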

I've been trying something similar on Roo Code, where I let Claude do the architecture and use self-hosted models for everything else. `qwen3-coder` isn't as good as `claude-4-sonnet` at coding, but it's still decent enough to let it slug it out.

I've been trying to build a special "hard-problem debug" mode, but I haven't found a single model that can fix concurrency issues without constant manual interruption (incl. all of the commercial closed models), so I've put that on hold. But this makes me think: if I alternate between a model that's good at determination and a coder model, and let the first guide / judge the second... this may work?
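Roughly what I have in mind, as a sketch only: the coder proposes a patch, the judge accepts it or sends back feedback, and that repeats up to a round limit. `guided_debug`, the two callables, and the stubs are hypothetical stand-ins for real model calls, not a Roo Code API.

```python
from typing import Callable, Optional, Tuple

Coder = Callable[[str, str], str]               # (problem, feedback) -> patch
Judge = Callable[[str, str], Tuple[bool, str]]  # (problem, patch) -> (accepted, feedback)

def guided_debug(problem: str, coder: Coder, judge: Judge,
                 max_rounds: int = 5) -> Tuple[Optional[str], int]:
    """Alternate coder and judge until the judge accepts or rounds run out."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        patch = coder(problem, feedback)
        accepted, feedback = judge(problem, patch)
        if accepted:
            return patch, round_no
    return None, max_rounds

# Stubs standing in for the two models (hypothetical behaviour).
def stub_coder(problem: str, feedback: str) -> str:
    return "add lock" if "race" in feedback else "add sleep"

def stub_judge(problem: str, patch: str) -> Tuple[bool, str]:
    if "lock" in patch:
        return True, ""
    return False, "still a race: shared counter is unguarded"

patch, rounds = guided_debug("counter loses updates", stub_coder, stub_judge)
print(patch, rounds)  # add lock 2
```

The round limit is there so a judge that never accepts can't keep the loop going forever.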