MCP-Bench: Benchmarking Tool-Using LLM Agents
arxiv.org/abs/2508.20453
269 sats \ 0 comments \ @optimism 30 Aug 2025 AI