
I was thinking... if I were to design an agent (I may, for me) I'd probably design an agent-making agent.

So you teach the top-level thing to pick the right agent for the task at hand, and you start with a single agent that can build agents. Like skills, but you load the skill(set) straight into the system prompt, without the overhead of souls and identities and whatnot.

Instead of all the bloat, expose a local knowledge base that every agent can access (with RBAC if you plan on doing nasty private-thoughts shit) and do proper process isolation. And if staying with Go (which is actually fun), I may have finally found a great use for go-plugin. Never thought I'd get to say that, lol. An agent is then an isolated plugin (easier to secure than a DLL, and let's not even talk about a .md file in userspace) and you just expose an interface over gRPC, and done-ish.

🤔 imma burn so many tokens and be obsolete before I'm even done... lol

201 sats \ 1 reply \ @k00b 3 Mar

I was wondering the same thing yesterday: what is the minimal, "bootstrappable" agent I can use to build a bespoke swarm?

Then I came to the same conclusion: this is all bound to be obsolete in eight weeks. It's still, likely, a worthwhile study of how these systems operate before LLMs internalize it all.

Also, I overheard someone in the lab (I'd tag them but they didn't know I was involuntarily eavesdropping) describe achieving their desired bespokeness wrapping Codex, adding a few more tool schemas, auxiliary memory, and some multi-agent collaboration primitives. And TIL Codex is open source.

> what is the minimal, "bootstrappable" agent I can use to build a bespoke swarm?

We really just need to remove the file system, save for maybe tmpfs if you need to stream in large amounts of remote bytes. Remove the multitool, build the real tool. Yes, a Swiss Army knife has a saw. Now go cut down a tree with it.

> It's still, likely, a worthwhile study of how these systems operate before LLMs internalize it all.

I think we have the tools. Most importantly, structured output: the most underappreciated precision scalpel we've ever had. Using that instead of text output is what allows us to move

from func execute(prompt string) (string, error) {}
to func execute[T any](prompt string) (T, error) {}
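
A minimal sketch of what the generic version could look like, assuming the model is asked to return JSON and you decode it strictly into `T`. `Completer`, `Plan` and `fakeModel` are hypothetical names for illustration; `fakeModel` just returns a canned response so the thing runs offline. Only `encoding/json` from the standard library is used, no particular provider SDK is implied.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Completer is a hypothetical stand-in for whatever model client you use.
// It is assumed to return the raw JSON the model was asked to produce.
type Completer func(prompt string) ([]byte, error)

// Execute calls the model and decodes its output strictly into T.
// Unknown fields are rejected, so schema drift shows up as an error
// instead of silently passing through.
func Execute[T any](complete Completer, prompt string) (T, error) {
	var out T
	raw, err := complete(prompt)
	if err != nil {
		return out, fmt.Errorf("complete: %w", err)
	}
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&out); err != nil {
		var zero T
		return zero, fmt.Errorf("decode into %T: %w", zero, err)
	}
	return out, nil
}

// Plan is an example result type; its fields are illustrative only.
type Plan struct {
	Agent string   `json:"agent"`
	Steps []string `json:"steps"`
}

// fakeModel returns a canned response so the sketch runs without a network.
func fakeModel(prompt string) ([]byte, error) {
	return []byte(`{"agent":"builder","steps":["scaffold","test"]}`), nil
}

func main() {
	plan, err := Execute[Plan](fakeModel, "plan the task")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(plan.Agent, len(plan.Steps))
}
```

The caller gets a typed `Plan` or an error, never a blob of text it has to parse again downstream.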

That means MEMORY.md will become the faint memory (hah) of an old nightmare it is supposed to have been since 1984. After all, I can confidently state that if the "heart and soul" of your system is markdown, it means you have no system.

> achieving their desired bespokeness wrapping Codex
> ..
> TIL Codex is open source.

Wrapping is the key word though. It's what I did with Claude Code and what I intended to do with a claw base, because I really don't want to use broken 3rd party tooling that I have to fix. Unfortunately, the design principles of the claw software (all the ones I've looked at thus far, as they've just copied each other) are awful, and the bottom line is that this makes it unusable because of the amount of tokens I have to burn on fixing bugs and debt [1]. 14k GitHub stars in a week... lmao. Nope, you're not getting my star of approval. I have jotted a note (somewhere, lol) to look at CoPaw (#1446238); maybe that is more structured.

With Claude Code, Anthropic has become a victim of their own success though... I'm 99.99% sure that they didn't expect to have to block software (openclaw) from their subscription tier. I suspect that they can't make enough money off their model-as-a-service offering if the majority of the usage budget goes to running a toy (and they're thus not selling API credits). With that comes disruption for me. I don't want to move away from Opus 4.6 at the moment though, simply because the results are good and the only friction is in the integration with their inference service. If I have to though, I can without too much trouble. My entire agentic loop that interfaces with Claude is handcrafted, I own it 100%, and I would only need to replace that single component, and maybe re-tune some prompt templates.
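
To illustrate what "replace that single component" can look like, here is a rough sketch, not my actual loop: the loop only knows a small interface, and the provider lives behind it. `Model`, `RunLoop`, the `DONE:` convention and the `scripted` test double are all assumptions made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Model is a hypothetical seam: the only part that knows about a provider.
// Swapping providers means writing one new implementation of this.
type Model interface {
	Complete(history []string) (string, error)
}

// RunLoop feeds the growing history to the model until it replies with
// something starting with "DONE:" or the step budget runs out.
func RunLoop(m Model, task string, maxSteps int) (string, error) {
	history := []string{task}
	for i := 0; i < maxSteps; i++ {
		reply, err := m.Complete(history)
		if err != nil {
			return "", fmt.Errorf("step %d: %w", i, err)
		}
		if strings.HasPrefix(reply, "DONE:") {
			return strings.TrimSpace(strings.TrimPrefix(reply, "DONE:")), nil
		}
		history = append(history, reply)
	}
	return "", errors.New("step budget exhausted")
}

// scripted is a test double that replays fixed replies, one per step.
type scripted struct{ replies []string }

func (s *scripted) Complete(history []string) (string, error) {
	i := len(history) - 1 // history starts with just the task
	if i >= len(s.replies) {
		return "", errors.New("script exhausted")
	}
	return s.replies[i], nil
}

func main() {
	m := &scripted{replies: []string{"thinking", "DONE: 42"}}
	out, err := RunLoop(m, "answer", 5)
	fmt.Println(out, err)
}
```

Prompt templates stay outside `RunLoop`, so re-tuning them for a different model doesn't touch the loop either.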

  1. On picoclaw alone I found a week's worth of Claude credits in bugs and logic errors that were negatively affecting outcomes: prompt template errors, context poisoning as a feature, tooling logic issues, poor retry/error handling, missing feedback in agent spawning, no permission management, no good logging, and default skill installs that are extremely poor and incompatible with the proposed framework... to name just yesterday's, lol. The yolo factor is awful.
