Larger models can pull off a wider variety of feats, but the reduced footprint of smaller models makes them attractive tools.

Large language models work well because they’re so large. The latest models from OpenAI, Meta and DeepSeek use hundreds of billions of “parameters” — the adjustable knobs that determine connections among data and get tweaked during the training process. With more parameters, the models are better able to identify patterns and connections, which in turn makes them more powerful and accurate.

But this power comes at a cost. Training a model with hundreds of billions of parameters takes huge computational resources. To train its Gemini 1.0 Ultra model, for example, Google reportedly spent $191 million. Large language models (LLMs) also require considerable computational power each time they answer a request, which makes them notorious energy hogs. A single query to ChatGPT consumes about 10 times as much energy as a single Google search, according to the Electric Power Research Institute.
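To make the idea of “parameters” concrete, here is a minimal sketch (my own illustration, not from the article) of how parameter counts are tallied in a plain fully connected network: each layer contributes one weight per input-output connection plus one bias per output, and the counts multiply out quickly as layers widen.

```python
def count_parameters(layer_sizes):
    """Count weights + biases in a toy fully connected network.

    layer_sizes: e.g. [4, 8, 2] means 4 inputs, a hidden layer of 8
    units, and 2 outputs. Each layer contributes fan_in * fan_out
    weights plus fan_out biases.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out
    return total

# A toy network: 4 inputs -> 8 hidden units -> 2 outputs
# (4*8 + 8) + (8*2 + 2) = 40 + 18 = 58 parameters
print(count_parameters([4, 8, 2]))  # -> 58
```

Scale the same arithmetic up to thousands of units across dozens of layers and you arrive at the hundreds of billions of parameters the article describes, which is why both training and inference get so expensive.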
0 sats \ 2 replies \ @SimpleStacker 10 Mar
I actually find this a bit amusing, because in my experience a ChatGPT query is more than 10 times as useful as a Google search nowadays.
30 sats \ 1 reply \ @0xbitcoiner OP 10 Mar
I guess it depends on what you're searching for. Maybe it's just me, but GPT seems to trip up when we ask for sources!
0 sats \ 0 replies \ @SimpleStacker 10 Mar
I've been using Deep Research mode which is actually very good at finding sources, and accurately too. I think you need a pro subscription to use it though.