
It's important to remember that the AI assistants you use today are based on LLMs that do not update their parameters. The chatbot you interact with is a creation of an LLM with a specific set of parameters, and no amount of prompting or context changes them.
I agree that prompting doesn't change the underlying model, but let's not forget that it does change the runtime results. A living prompt, such as a continuously injected AGENTS.md à la Cursor, will both reinforce paths (provided no crazy randomness is introduced through high temperature) and allow the user and/or the LLM itself to evolve the results by editing that file.
Of course there's still a chance it goes wrong: it happened to me once in 500 or so runs with Claude this weekend, when a wrong tool call was made.
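A minimal sketch of what I mean by a living prompt: the file is re-read on every run and prepended to the context, so editing it changes the next run's behavior without touching any model parameters. The function and file names here are illustrative, and the actual completion call is left out.

```python
# Sketch of a "living prompt": AGENTS.md is re-read before every run,
# so edits by the user (or by the LLM itself) take effect on the very
# next pass. The model's weights never change; only the context does.
from pathlib import Path

def build_messages(user_input: str, prompt_file: str = "AGENTS.md") -> list[dict]:
    # Re-read the file each time so the latest edits are always injected.
    path = Path(prompt_file)
    instructions = path.read_text() if path.exists() else ""
    return [
        {"role": "system", "content": instructions},
        {"role": "user", "content": user_input},
    ]
```

The point is that "reinforcing a path" is just appending a rule to the file; the next run sees it as part of its fixed instructions.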
Pinocchio's parameters must receive feedback from real life and be altered by it.
So do they? I'm still unconvinced. If one were to stop training an LLM to be a know-it-all and instead train it to use tools and incorporate feedback from the context (which, it feels to me, is much of what Anthropic figured out with Claude), do we really need to retrain the models? #1136016 also claims it isn't needed per se.
a living prompt, such as a continuously injected AGENTS.md a la Cursor
This was something I had trouble wrapping my head around: for such a process to be effective, wouldn't the living prompt file grow pretty large?
In my mental model, I was thinking of the model's parameters as the hard drive and the prompt as RAM, and my assumption was that prompts can't grow large because the prompt has to be run through the model on every pass. So a very large prompt would make each pass require more compute. It's possible I've misunderstood something, though.
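Your intuition is roughly right, and the scaling is worse than linear. A back-of-envelope sketch, with purely illustrative constants not tied to any real model: self-attention cost grows with the square of context length, so a 10x larger prompt costs about 100x the attention compute per pass.

```python
# Rough attention-cost model: per layer, computing QK^T and the
# attention-weighted values each costs on the order of n^2 * d FLOPs
# for context length n and model width d. Constants are illustrative.

def attention_flops(context_len: int, d_model: int = 4096, n_layers: int = 32) -> int:
    return n_layers * 2 * context_len**2 * d_model

# A 10x longer prompt -> ~100x the attention FLOPs per pass.
small = attention_flops(1_000)
big = attention_flops(10_000)
```

This ignores the feed-forward layers (which scale linearly in n) and KV caching, but it's why context windows aren't free to fill.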
reply
512 sats \ 2 replies \ @optimism 1h
PS: I got inspired by this argument so I used my remaining Claude 4.5 token moneys to let it write documentation for nostr-tools. For my next vibe coding experiment - maybe the coming weekend - I will instruct an LLM to use these docs to write a lil nostr reposter bot for ~AI posts.
% (find docs/nips -name "*.md" && find docs/getting-started -name "*.md") | xargs wc -l
     497 docs/nips/nip17.md
     946 docs/nips/nip46.md
     801 docs/nips/nip07.md
     531 docs/nips/nip27.md
     123 docs/nips/nip42.md
     480 docs/nips/nip13.md
     171 docs/nips/nip47.md
     899 docs/nips/nip57.md
     697 docs/nips/nip06.md
     601 docs/nips/nip19.md
     113 docs/nips/nip49.md
     467 docs/nips/nip18.md
     182 docs/nips/nip59.md
     147 docs/nips/README.md
     149 docs/nips/index.md
     179 docs/nips/nip98.md
     836 docs/nips/nip25.md
     470 docs/nips/nip11.md
     857 docs/nips/nip01.md
     188 docs/nips/nip44.md
     156 docs/nips/nip21.md
     599 docs/nips/nip05.md
     709 docs/nips/nip10.md
     318 docs/getting-started/quick-start.md
     734 docs/getting-started/common-patterns.md
     176 docs/getting-started/installation.md
     453 docs/getting-started/core-concepts.md
   12479 total
It also wrapped it in a vitepress thingy so it could also help humans. But I'm not going to read all that slop, nor publish it. Looks pretty tho.
reply
100 sats \ 1 reply \ @Scoresby OP 1h
tools not personas. I'm still working on getting this idea firmly in my head. I'll get there eventually.
reply
202 sats \ 0 replies \ @optimism 1h
I'm working on the longer post I promised @BlokchainB on this. Will probably sleep on it though, because I want it to at least be watertight and needleproof against my own scrutiny before y'all tear it down lol
reply
137 sats \ 0 replies \ @optimism 7h
a very large prompt would make each pass require more compute.
Yes. Feature not bug. More tokens = more moneys. So you better clean it up and reset that context window sometimes.
The alternative is searchable instructions, see also #1250028 from yesterday, which shows a clear way forward.
All you need is a well-tuned LLM that implements a process, not a persona.
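To make "searchable instructions" concrete: instead of injecting all 12k lines of docs into the prompt, the agent searches for the handful of files relevant to the task and injects only those. A naive keyword-overlap sketch (a real setup would use a proper search tool or embeddings; names here are made up):

```python
# Toy "searchable instructions": score each doc file by how often it
# mentions the query terms, and return only the top few paths to
# inject into the context, instead of the whole doc tree.
from pathlib import Path

def relevant_docs(query: str, doc_dir: str, top_k: int = 2) -> list[str]:
    terms = set(query.lower().split())
    scored = []
    for path in Path(doc_dir).rglob("*.md"):
        text = path.read_text().lower()
        score = sum(text.count(t) for t in terms)
        if score:
            scored.append((score, str(path)))
    scored.sort(reverse=True)
    return [p for _, p in scored[:top_k]]
```

The context stays small and resettable per task, which is the whole point: the process (search, read, act) is tuned, not the persona.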
reply