FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
epochai.org/frontiermath/the-benchmark
203 sats
\
0 comments
\
@Rsync25
10 Nov 2024
tech
related
Testing AI systems on hard math problems shows they still perform very poorly
phys.org/news/2024-11-ai-hard-math-problems-poorly.html
153 sats
\
4 comments
\
@south_korea_ln
13 Nov 2024
science
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
arxiv.org/abs/2508.05405
182 sats
\
0 comments
\
@optimism
10 Aug
AI
AI mimics neocortex computations with 'winner-take-all' approach
techxplore.com/news/2024-10-ai-mimics-neocortex-winner-approach.html
49 sats
\
0 comments
\
@ch0k1
26 Oct 2024
tech
Scaling up: how increasing inputs has made artificial intelligence more capable
ourworldindata.org/scaling-up-ai
279 sats
\
0 comments
\
@0xbitcoiner
20 Jan
charts_and_numbers
OpenAI o3 beats FrontierMath - because OpenAI funded the test and cheated
pivot-to-ai.com/2025/01/20/openai-o3-beats-frontiermath-because-openai-funded-the-test-and-had-access-to-questions/
366 sats
\
1 comment
\
@StillStackinAfterAllTheseYears
21 Jan
tech
Debate May Help AI Models Converge on Truth
www.quantamagazine.org/debate-may-help-ai-models-converge-on-truth-20241108/
237 sats
\
0 comments
\
@0xbitcoiner
8 Nov 2024
science
Scaling up: how increasing inputs has made artificial intelligence more capable
215 sats
\
0 comments
\
@zuspotirko
29 Jan
tech
We’re Entering Uncharted Territory for Math
www.theatlantic.com/technology/archive/2024/10/terence-tao-ai-interview/680153/
386 sats
\
4 comments
\
@south_korea_ln
5 Oct 2024
science
AI achieves silver-medal standard solving International Mathematical Olympiad
deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
284 sats
\
3 comments
\
@sancristrader
25 Jul 2024
tech
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open LLMs
arxiv.org/abs/2402.03300
662 sats
\
0 comments
\
@zuspotirko
6 Feb 2024
science
Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI
www.scientificamerican.com/article/inside-the-secret-meeting-where-mathematicians-struggled-to-outsmart-ai/
421 sats
\
5 comments
\
@k00b
9 Jun
science
New MIT Research Proves AGI Was Achieved
www.geeky-gadgets.com/artificial-general-intelligence-advancements/
65 sats
\
1 comment
\
@ch0k1
16 Nov 2024
news
Quantitative AI progress needs accurate and transparent evaluation
mathstodon.xyz/@tao/114910028356641733
100 sats
\
0 comments
\
@hn
25 Jul
tech
Google's DeepMind AI decodes age-old math equation, stumping humans
interestingengineering.com/innovation/googles-deepmind-decodes-old-math-equation
100 sats
\
0 comments
\
@zuspotirko
16 Dec 2023
AI
Chinese researchers just built an open-source rival to ChatGPT in 2 months
www.livescience.com/technology/artificial-intelligence/china-releases-a-cheap-open-rival-to-chatgpt-thrilling-some-scientists-and-panicking-silicon-valley
21 sats
\
0 comments
\
@ch0k1
25 Jan
news
OpenAI o3-mini model release
openai.com/index/openai-o3-mini/
196 sats
\
0 comments
\
@ch0k1
3 Feb
news
@Maths is an Ai API trying to capture sats with replies
181 sats
\
8 comments
\
@mallardshead
26 Jul 2023
meta
Frequently Asked Questions (And Answers) About AI Evals
hamel.dev/blog/posts/evals-faq/
133 sats
\
0 comments
\
@carter
7 Jul
AI
Google releases its own 'reasoning' AI model
techcrunch.com/2024/12/19/google-releases-its-own-reasoning-ai-model/
11 sats
\
0 comments
\
@ch0k1
20 Dec 2024
news
The AI that solved IMO Geometry Problems | Guest video by @Aleph0
www.youtube.com/watch?v=4NlrfOl0l8U&ab_channel=3Blue1Brown
352 sats
\
4 comments
\
@south_korea_ln
17 Aug
AI
My bad experiences using AI as a physicist
11.7k sats
\
32 comments
\
@south_korea_ln
16 Jun
science