items/761252/related \ stacker news

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI epochai.org/frontiermath/the-benchmark

203 sats \ 0 comments \ @Rsync25 10 Nov 2024 tech

related

Testing AI systems on hard math problems shows they still perform very poorly phys.org/news/2024-11-ai-hard-math-problems-poorly.html

153 sats \ 4 comments \ @south_korea_ln 13 Nov 2024 science

Olympiad-level formal mathematical reasoning with reinforcement learning www.nature.com/articles/s41586-025-09833-y

175 sats \ 2 comments \ @0xbitcoiner 20 Nov 2025 AI

Researchers isolate memorization from problem-solving in AI neural networks arstechnica.com/ai/2025/11/study-finds-ai-models-store-memories-and-logic-in-different-neural-regions/

400 sats \ 1 comment \ @0xbitcoiner 11 Nov 2025 AI

BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs arxiv.org/abs/2510.04721

180 sats \ 1 comment \ @jakoyoh629 25 Oct 2025 AI

AI is actually bad at math, ORCA shows www.theregister.com/2025/11/17/ai_bad_math_orca/

167 sats \ 4 comments \ @0xbitcoiner 18 Nov 2025 AI

AI benchmarks hampered by bad science www.theregister.com/2025/11/07/measuring_ai_models_hampered_by/

178 sats \ 0 comments \ @0xbitcoiner 10 Nov 2025 AI

Google releases its own 'reasoning' AI model techcrunch.com/2024/12/19/google-releases-its-own-reasoning-ai-model/

11 sats \ 0 comments \ @ch0k1 20 Dec 2024 news

AI mimics neocortex computations with 'winner-take-all' approach techxplore.com/news/2024-10-ai-mimics-neocortex-winner-approach.html

49 sats \ 0 comments \ @ch0k1 26 Oct 2024 tech

Scaling up: how increasing inputs has made artificial intelligence more capable ourworldindata.org/scaling-up-ai

279 sats \ 0 comments \ @0xbitcoiner 20 Jan 2025 charts_and_maps

The AI Industry's Scaling Obsession Is Headed for a Cliff arxiv.org/abs/2507.07931

321 sats \ 9 comments \ @0xbitcoiner 15 Oct 2025 AI

Debate May Help AI Models Converge on Truth www.quantamagazine.org/debate-may-help-ai-models-converge-on-truth-20241108/

237 sats \ 0 comments \ @0xbitcoiner 8 Nov 2024 science

Scaling up: how increasing inputs has made artificial intelligence more capable

215 sats \ 0 comments \ @zuspotirko 29 Jan 2025 tech

AI achieves silver-medal standard solving International Mathematical Olympiad deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/

284 sats \ 3 comments \ @sancristrader 25 Jul 2024 tech

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open LLMs arxiv.org/abs/2402.03300

662 sats \ 0 comments \ @zuspotirko 6 Feb 2024 science

Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI www.scientificamerican.com/article/inside-the-secret-meeting-where-mathematicians-struggled-to-outsmart-ai/

421 sats \ 5 comments \ @k00b 9 Jun 2025 science

New MIT Research Proves AGI Was Achieved www.geeky-gadgets.com/artificial-general-intelligence-advancements/

65 sats \ 1 comment \ @ch0k1 16 Nov 2024 news

Hermes 4 outperforms OpenAI models with minimal content restrictions

121 sats \ 0 comments \ @lunin 1 Sep 2025 AI

The AI Frontier Must be Fiercely Competitive www.civitasoutlook.com/research/the-ai-frontier-must-be-fiercely-competitive-143db02b-c872-4db8-9782-a3fc6ed4ac72

157 sats \ 0 comments \ @0xbitcoiner 27 Jan AI

Google's DeepMind AI decodes age-old math equation, stumping humans interestingengineering.com/innovation/googles-deepmind-decodes-old-math-equation

100 sats \ 0 comments \ @zuspotirko 16 Dec 2023 AI

Chinese researchers just built an open-source rival to ChatGPT in 2 months www.livescience.com/technology/artificial-intelligence/china-releases-a-cheap-open-rival-to-chatgpt-thrilling-some-scientists-and-panicking-silicon-valley

21 sats \ 0 comments \ @ch0k1 25 Jan 2025 news

OpenAI o3-mini model release openai.com/index/openai-o3-mini/

196 sats \ 0 comments \ @ch0k1 3 Feb 2025 news