
Meek Models Shall Inherit the Earth

The past decade has seen incredible scaling of AI systems by a few companies, leading to inequality in AI model performance. This paper argues that, contrary to prevailing intuition, the diminishing returns to compute scaling will lead to a convergence of AI model capabilities. In other words, meek models (those with limited computation budget) shall inherit the earth, approaching the performance level of the best models overall. We develop a model illustrating that under a fixed-distribution next-token objective, the marginal capability returns to raw compute shrink substantially. Given current scaling practices, we argue that these diminishing returns are strong enough that even companies that can scale their models exponentially faster than other organizations will eventually have little advantage in capabilities. As part of our argument, and drawing on evidence from benchmark data and theoretical performance models, we give several reasons that proxies like training-loss differences capture important capability measures. In addition, we analyze empirical data on capability differences between AI models over time. Finally, in light of the increasing ability of meek models, we argue that AI strategy and policy require reexamination, and we outline the areas this shift will affect.
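
The intuition behind this convergence is the power-law shape of scaling curves. As a rough sketch (not the paper's actual model), assume a Chinchilla-style loss curve L(C) = E + A·C^(-α): the loss advantage of a lab training with 100× more compute then shrinks toward zero as everyone moves up the curve. The constants below are illustrative placeholders, not values from the paper.

```python
# Hedged sketch: assumes a Chinchilla-style power law, loss(C) = E + A * C**(-alpha),
# purely to illustrate shrinking returns. E, A, and alpha are illustrative placeholders,
# not fitted values from this paper.
E, A, alpha = 1.69, 406.4, 0.34

def loss(compute):
    """Irreducible loss plus a power-law term that decays with compute."""
    return E + A * compute ** (-alpha)

# Gap between a "meek" model and one trained with 100x more compute,
# evaluated at successively larger compute budgets (arbitrary units).
for c in (1e6, 1e9, 1e12):
    gap = loss(c) - loss(100 * c)
    print(f"compute={c:.0e}  meek={loss(c):.3f}  100x={loss(100 * c):.3f}  gap={gap:.4f}")
```

With these placeholder constants, the gap falls from roughly 2.9 at the smallest budget to under 0.03 at the largest, which is the qualitative pattern the abstract describes.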
233 sats \ 4 replies \ @freetx 3h
In other words, meek models (those with limited computation budget) shall inherit the earth, approaching the performance level of the best models overall.
I actually have a similar take. Diminishing returns combined with retail hardware advances mean that by 2030 your average home computer will run a model that performs more or less on par with today's frontier models.
We've seen this same thing constantly. There was a time when an iPhone 4 could do a tremendous amount more than an iPhone 1... but that hasn't been true for a long time. Does the average phone user really need anything more advanced than, say, an iPhone 9? Text, photos, emails, maps, etc. ... basically it's all become a "solved problem".
I think the same is going to happen in AI. At a certain point, the day-to-day use cases are going to be solved by open source models running on commodity hardware.
I don't think we'll have to wait that long.
NVIDIA DGX Spark Arrives for World’s AI Developers #1256239
33 sats \ 2 replies \ @optimism 1h
I want one but it's sold out. Oh, and it's probably expensive, haha.
33 sats \ 0 replies \ @freetx 1h
I'm waiting for more comparisons between the AMD AI 395 and the DGX.
The main benefit of the DGX is that NVIDIA's tooling and ecosystem are so much better... however, the AMD's raw CPU performance is probably faster.
Preliminary testing on https://www.reddit.com/r/LocalLLaMA/comments/1o6izz2/dgx_spark_vs_ai_max_395/ seems to indicate that the AMD AI 395 wins (and it's also ~33% cheaper).