Love this.
We contend that SLMs are
  • [V1] principally sufficiently powerful to handle language modeling errands of agentic applications;
  • [V2] inherently more operationally suitable for use in agentic systems than LLMs;
  • [V3] necessarily more economical for the vast majority of LM uses in agentic systems than their general-purpose LLM counterparts by the virtue of their smaller size;
[..]
We assert that the dominance of LLMs in the design of AI agents is both excessive and misaligned with the functional demands of most agentic use cases. While LLMs offer impressive generality and conversational fluency, the majority of agentic subtasks in deployed agentic systems are repetitive, scoped, and non-conversational—calling for models that are efficient, predictable, and inexpensive.
Hurray! I've been saying this for a long time now. I think Apple/Google initially got it right with the chips they were putting in iPhones/Pixels, but then deviated because of the bigger=better narrative. But bigger isn't better when, over the 3y lifespan of my phone, I also have to spend between 350k and 1M sats on subscriptions to remote compute for some GPT model that remembers every Harry Potter novel. I don't need it to know that, because it adds zero value to any business goal I could possibly have. Charge me 50k sats to have the sovereign capability on my phone instead.

There are some interesting examples of hyper-focused SLMs in there that I'm going to try. My benchmark is now:
can it print me the BOLT11 invoice from the cashu mcp? (though I may need to push some fixes to that mcp first)
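For the curious, here's a rough sketch of what the client side of that benchmark could look like using the MCP Python SDK. The server launch command (`cashu-mcp`), the tool name (`create_invoice`), and its arguments are assumptions for illustration, not the actual cashu mcp interface; a real agent would route this through its (small) model rather than calling the tool directly.

```python
# Minimal MCP client sketch: connect to an assumed Cashu MCP server over stdio,
# list its tools, and ask it for a BOLT11 invoice. Tool name and arguments are
# hypothetical placeholders.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumed launch command for the MCP server; substitute the real one.
    server = StdioServerParameters(command="cashu-mcp", args=[])

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # See what the server actually exposes before guessing tool names.
            tools = await session.list_tools()
            print("tools:", [t.name for t in tools.tools])

            # Hypothetical tool name and arguments for requesting a BOLT11 invoice.
            result = await session.call_tool(
                "create_invoice", arguments={"amount_sat": 1000}
            )
            print(result.content)


if __name__ == "__main__":
    asyncio.run(main())
```

If a hyper-focused SLM can reliably produce that one tool call with the right arguments, it passes; no Harry Potter knowledge required.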