Context Rot: How Increasing Input Tokens Impacts LLM Performance
research.trychroma.com/context-rot
334 sats
2 comments
@Scoresby
14 Jul 2025
AI
related
Should you use "you" in your LLM prompts?
x.com/BrianRoemmele/status/1998068828295877011
827 sats
10 comments
@Scoresby
9 Dec 2025
AI
In a First, AI Models Analyze Language As Well As a Human Expert
www.quantamagazine.org/in-a-first-ai-models-analyze-language-as-well-as-a-human-expert-20251031/
274 sats
0 comments
@0xbitcoiner
31 Oct 2025
AI
How to turn LLM Pinocchio into a real boy
12.7k sats
10 comments
@Scoresby
7 Oct 2025
AI
The simulation of judgment in LLMs - PNAS
www.pnas.org/doi/10.1073/pnas.2518443122
244 sats
5 comments
@Scoresby
15 Oct 2025
AI
Why language models hallucinate - OpenAI
openai.com/index/why-language-models-hallucinate/
438 sats
4 comments
@Scoresby
6 Sep 2025
AI
Giving models more compute time might make them worse at reasoning - Anthropic
arxiv.org/abs/2507.14417
343 sats
2 comments
@Scoresby
31 Jul 2025
AI
Agentic Reinforced Policy Optimization
arxiv.org/abs/2507.19849
171 sats
0 comments
@optimism
29 Jul 2025
AI
Why Do Researchers Care About Small Language Models?
www.quantamagazine.org/why-do-researchers-care-about-small-language-models-20250310/
40 sats
3 comments
@0xbitcoiner
10 Mar 2025
AI
GDPval: Measuring the performance of our models on real-world tasks - OpenAI
openai.com/index/gdpval/
388 sats
8 comments
@Scoresby
2 Oct 2025
AI
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
arxiv.org/abs/2508.05405
212 sats
0 comments
@optimism
10 Aug 2025
AI
To Make Language Models Work Better, Researchers Sidestep Language
www.quantamagazine.org/to-make-language-models-work-better-researchers-sidestep-language-20250414/
210 sats
0 comments
@0xbitcoiner
15 Apr 2025
AI
Understanding Strengths & Limitations of Reasoning Models via Problem Complexity
machinelearning.apple.com/research/illusion-of-thinking
71 sats
1 comment
@supratic
10 Jun 2025
tech
Prime Fields, Text Manglers and Progress Report on Indra
6263 sats
0 comments
@l0k18
1 May 2023
bitcoin
Tongyi-DeepResearch
301 sats
0 comments
@optimism
29 Oct 2025
AI
LLM Daydreaming
gwern.net/ai-daydreaming
349 sats
2 comments
@k00b
16 Jul 2025
AI
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
arxiv.org/abs/2510.04721
210 sats
1 comment
@jakoyoh629
25 Oct 2025
AI
AI Still Can't Think: Apple’s New Study Dispels the Myth
365 sats
2 comments
@lunin
31 Jul 2025
AI
Inside Miron Construction’s AI Journey: Tools, Challenges, and Wins
www.autodesk.com/blogs/construction/inside-miron-constructions-ai/
50 sats
2 comments
@BlokchainB
8 Aug 2025
Construction_and_Engineering
How Attention Sinks Keep Language Models Stable
hanlab.mit.edu/blog/streamingllm
140 sats
0 comments
@carter
8 Aug 2025
AI
Humanity's Last Exam
lastexam.ai/
327 sats
0 comments
@StillStackinAfterAllTheseYears
4 Feb
AI
tech
Self-Adapting Language Models
arxiv.org/abs/2506.10943
447 sats
0 comments
@carter
13 Oct 2025
AI