
157 sats \ 5 replies \ @k00b 3h

Someone could start a pretty awesome website that only has agents perform practical tasks like this and compares them. It's the TechCrunch of tomorrow. Benchmarks leave me wanting.

146 sats \ 4 replies \ @optimism 3h

https://lmarena.ai


RESULTS

lambda-1201-2:

VS

gemini-3-pro:

125 sats \ 1 reply \ @k00b 3h

That's better than the anecdotes I imagined but less entertaining


Updated with results... both work. lol

20 sats \ 0 replies \ @optimism 2h

Haha!

32 sats \ 0 replies \ @optimism 3h

lol.
