MCP-Bench: Benchmarking Tool-Using LLM Agents
arxiv.org/abs/2508.20453
269 sats
\
0 comments
\
@optimism
30 Aug 2025
AI
related
Jan v3 4B: great in instruction following
huggingface.co/janhq/Jan-v3-4B-base-instruct
519 sats
\
0 comments
\
@optimism
2 Feb
AI
The week in AI, August 4-10, 2025
2353 sats
\
12 comments
\
@optimism
11 Aug 2025
AI
OpenAI Secretly Funded Benchmarking Dataset Linked To o3 Model
www.searchenginejournal.com/openai-secretly-funded-frontiermath-benchmarking-dataset/537760/
441 sats
\
0 comments
\
@frostdragon
21 Jan 2025
tech
"Benchwashing" - how do you defend against this?
1748 sats
\
10 comments
\
@optimism
9 Aug 2025
AskSN
Alibaba has released its flagship Qwen3-Max model with a trillion parameters
chat.qwen.ai/
197 sats
\
0 comments
\
@lunin
25 Sep 2025
AI
Gemini 3 and Antigravity: Why Google's latest AI releases are a big deal
fortune.com/2025/11/19/google-gemini-3-antigravity-ai-explained/?utm_source=flipboard&utm_content=fortune/magazine/Personal+finance
161 sats
\
1 comment
\
@DrBrader99
19 Nov 2025
AI
Opti's Claude 4.5 Sonnet "vibe coding" report
1155 sats
\
13 comments
\
@optimism
5 Oct 2025
AI
GDPval: Measuring the performance of our models on real-world tasks - OpenAI
openai.com/index/gdpval/
388 sats
\
8 comments
\
@Scoresby
2 Oct 2025
AI
LLM Rankings: programming | OpenRouter
openrouter.ai/rankings/programming
126 sats
\
0 comments
\
@m0wer
28 May 2025
tech
Microsoft AI unveils its first independently developed models
151 sats
\
0 comments
\
@lunin
29 Aug 2025
AI
Not every user owns an iPhone
calendar.perfplanet.com/2024/not-every-user-owns-an-iphone/
490 sats
\
2 comments
\
@nym
9 Jan 2025
Design
AI is actually bad at math, ORCA shows
www.theregister.com/2025/11/17/ai_bad_math_orca/
197 sats
\
4 comments
\
@0xbitcoiner
18 Nov 2025
AI
Vals AI — Finance Agent Benchmark
www.vals.ai/benchmarks/finance_agent-04-22-2025?utm_campaign=wp_the_technology_202&utm_medium=email&utm_source=newsletter
64 sats
\
3 comments
\
@BlokchainB
24 Apr 2025
AI
The Week AI Shook Things — and Nvidia Showed Who's Boss
372 sats
\
1 comment
\
@economy
24 Nov 2025
Stacker_Stocks
AI agents find $4.6M in blockchain smart contract exploits
red.anthropic.com/2025/smart-contracts/
289 sats
\
2 comments
\
@0xbitcoiner
2 Dec 2025
AI
Laura Nursky's presentation on AI/job friction
440 sats
\
3 comments
\
@optimism
26 Sep 2025
AI
Your Favorite Nostr Clients - A Weekend Discussion
796 sats
\
25 comments
\
@sn
4 Jun 2023
nostr
🚀 LNbits v1.0 – Final Testing Phase!
v1.lnbits.com/
2734 sats
\
5 comments
\
@megaptera
13 Feb 2025
bitcoin
Shopstr Performance Update Demo
3124 sats
\
0 comments
\
@TommySatoshi
8 Feb 2024
builders
Episodes 183 & 184: Zero Base and Hello Tauri
187 sats
\
1 comment
\
@AtlantisPleb
24 Jul 2025
openagents
Stacker News Roundtable #2 - LSPs
87.6k sats
\
81 comments
\
@sn
13 Oct 2023
bitcoin