I wonder if it's related to this phenomenon where trying twice and keeping the better result gives much better results: https://brooker.co.za/blog/2012/01/17/two-random.html
I think the big thing is this: pure random choice performs the same no matter the delay, while picking the fastest of 2 is much better and not that much worse than picking the fastest of 3. It seems similar to this strategy: https://www.tiktok.com/t/ZP8BD7XQp/
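To make that concrete, here is a rough toy simulation of the idea, not the code from the linked post. Everything in it is an assumption for illustration: the function name `simulate`, the 10 servers, the per-tick service probability, and the use of "queue length at the chosen server" as a stand-in for how slow a request is. Each request samples `k` servers and goes to the one that looked least loaded in a snapshot that is only refreshed every `refresh_every` requests.

```python
import random

def simulate(k, refresh_every, n_servers=10, n_requests=20000,
             service_prob=0.125, seed=0):
    """Toy model (all parameters are illustrative assumptions).

    k: number of servers sampled per request (1 = pure random).
    refresh_every: the load snapshot is refreshed once per this many
        requests, so larger values mean staler information.
    Returns the mean queue length found at the chosen server.
    """
    rng = random.Random(seed)
    queues = [0] * n_servers      # true queue lengths
    snapshot = list(queues)       # possibly stale view used for decisions
    total_seen = 0

    for i in range(n_requests):
        if i % refresh_every == 0:
            snapshot = list(queues)

        # Sample k distinct servers, pick the one that *looked* least loaded.
        candidates = rng.sample(range(n_servers), k)
        chosen = min(candidates, key=lambda s: snapshot[s])

        total_seen += queues[chosen]
        queues[chosen] += 1

        # Each busy server finishes one job with probability service_prob.
        for s in range(n_servers):
            if queues[s] > 0 and rng.random() < service_prob:
                queues[s] -= 1

    return total_seen / n_requests

if __name__ == "__main__":
    for delay in (1, 10, 50):
        row = [round(simulate(k, delay), 2) for k in (1, 2, 3)]
        print(f"refresh_every={delay}: k=1,2,3 -> {row}")
```

With `k=1` the snapshot is never actually consulted, which is why the pure random line is flat regardless of delay. The exact numbers depend entirely on the made-up parameters, so treat the output as a shape to look at rather than a result.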
I wish it would download and embed them like it does for Twitter
yeah... not always downloadable
Interesting how they claim to boost accuracy beyond the most accurate model they use, just by mixing models?
I've been trying something similar on Roo Code, where I let Claude do the architecture and use self-hosted models for everything else.
qwen3-coder isn't as good as claude-4-sonnet in coding, but it's still decent enough to let it slug it out.
I've been trying to build a special "hard-problem debug" mode, but since I haven't found a single model that is capable of fixing concurrency issues without constant manual interruption (including all of the commercial closed models), I've put that on hold. But this makes me think: if I alternate between a model that's good at determination and a coder model, letting the first guide / judge the second... this may work?
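Roughly the loop I have in mind, as a sketch only. Nothing here is a Roo Code API: `debug_loop`, `coder`, and `judge` are hypothetical names, and the two models are just plain callables you would wire up to whatever backend you use. The assumption is that the judge returns `None` when it accepts a patch and a feedback string otherwise.

```python
from typing import Callable, Optional

# Hypothetical signatures (assumptions, not any real tool's API):
#   coder(problem, feedback) -> proposed patch as text
#   judge(problem, patch)    -> None if accepted, else feedback text
Coder = Callable[[str, str], str]
Judge = Callable[[str, str], Optional[str]]

def debug_loop(coder: Coder, judge: Judge, problem: str,
               max_rounds: int = 5) -> Optional[str]:
    """Alternate between a coder model and a judge model.

    Returns the first patch the judge accepts, or None if the
    round limit is reached without an accepted patch.
    """
    feedback = ""  # the first attempt has no feedback yet
    for _ in range(max_rounds):
        patch = coder(problem, feedback)
        verdict = judge(problem, patch)
        if verdict is None:
            return patch        # judge accepted the patch
        feedback = verdict      # feed the critique into the next attempt
    return None                 # give up and hand back to a human
```

The `max_rounds` cap is there so the two models can't argue forever; when it runs out you are back to manual interruption, just later than before.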