
Someone could start a pretty awesome website that only has agents perform practical tasks like this and compares the results. It's the TechCrunch of tomorrow. Benchmarks leave me wanting.

146 sats \ 4 replies \ @optimism 4h

https://lmarena.ai


RESULTS

lambda-1201-2 vs. gemini-3-pro

125 sats \ 1 reply \ @k00b 4h

That's better than the anecdotes I imagined but less entertaining


Updated with results... both work. lol

20 sats \ 0 replies \ @optimism 4h

Haha!
