BIP3 PR: No BIP content may be generated by AI/LLMs
I've been following BIP3's progress through the BIP review and activation process (#1281384). In the middle of October, one of the BIP editors submitted a PR called "BIP3: add guidance on originality, quality, LLMs." It was merged later that month.
The PR proposed adding these lines to BIP3:
A BIP may be submitted by anyone, provided it is the original work of its authors and the content is of high quality, e.g. does not waste the community's time. No content may be generated by AI/LLMs and authors must proactively disclose up-front any use of AI/LLMs. (emphasis added)
The last line about the use of AI/LLMs has sparked an interesting conversation on the Bitcoin Development Mailing List.
Luke Dashjr was the first to respond specifically to this prohibition, saying:
AI/LLM usage disclosure is too much. As long as no content is LLM-generated, we should be fine.
Greg Maxwell disagreed:
If anything I think the AI disclosure is arguably too weak, particularly with the well documented instances of LLM psychosis and other weird influence and judgement compromising effects. The current text seems adequate to me and shouldn't be weakened further.
This disagreement is not terribly interesting, and might be put down to contagion from disagreements these two have in other areas. But after this, the conversation got much more interesting.
AI-assisted versus AI-generated
Dave Harding wrote an interesting note where he disagreed with Greg Maxwell, explaining that he uses AI throughout his workflow and writing process to do such things as create todo lists, refine writing, and even create drafts based on his notes. Harding acknowledged:
I would, of course, review every word of the draft BIP before submitting it for consideration and ensure that it represented the highest quality work I was able to produce---but the ultimate work would be a mix of AI and human writing and editing.
Harding concluded:
The BIP process already requires high-quality content.[2] AI-generated content can be high-quality, especially if its creation and editing was guided by a knowledgeable human. Banning specific tools like AI seems redundant and penalizes people who either need those tools or who can use them effectively.
Is AI helping us discover brilliant BIPs in the rough?
To which Maxwell responded that it is clear that AI is a useful tool for people with proven track records, however:
the number of good submissions that could be made would hardly be increased by LLMs (being limited by expert proposers with good ideas) but the number of potential poor submissions is increased astronomically.
Harding challenged Maxwell on this asymmetry, pointing out that BIP125 might have been a case of a good idea that was delayed because nobody wanted to go through the formal process of writing a BIP for it.
Jon Carvalho agreed with this, adding that his own BIP177 was an example of a BIP that would not have existed without LLM tools, and yet which seems (to my chagrin) to be achieving some amount of adoption.
Or is AI making it harder to quickly spot ill-conceived ideas?
One point Maxwell made that resonated with me was this:
LLMs have generally created something of an existential threat to most open collaborations: Now its so easy to get flooded out by subtly worthless material.
While LLMs might help a few good ideas get turned into BIPs when they might otherwise have never gotten noticed, they allow many, many more bad ideas to clothe themselves in legitimacy and attempt to use the BIP editors to "get the ball rolling - then lean on the review process to get it right."
Open projects like Bitcoin don't necessarily suffer from a lack of good ideas. It is more likely that they suffer from a lack of time -- and it doesn't seem like AI can help with that.
Both are probably true, but Jon Atack is right
There is another asymmetry in this process similar to the one Maxwell described: LLMs may assist many people with good ideas in drafting BIPs, but they are less useful in helping people become effective BIP editors.
I've been spending way too much time on Bitcoin for years now. I've participated in Chaincode's Bitcoin Protocol Development Seminar and Bitshala's Learning Bitcoin from the Command Line seminar, and I've spent a heck of a lot of time researching various aspects of Bitcoin in the course of writing about it -- and I don't think I have anything close to the experience needed to do a good job evaluating potential BIPs. Having access to AI doesn't really change this for me.
LLMs have given all of us tools that make our ideas for how to make Bitcoin better look professional, but LLMs have not given us the tools to be experts on Bitcoin -- that still takes time and experience.
Jon Atack noted something important here:
In a few cases, the proposed fixes are useful. In many others, they seem to be a waste of review/maintenance/moderation bandwidth and time, and are demotivating to deal with.
Not only does it waste time to have to read through a perfectly formatted dumb idea, but the discovery that what looked like a thoughtful and well-considered proposal blithely harbors contradictions and outright hallucinations is severely demotivating.
I've experienced a similar thing here on SN: I notice that I am less inclined to give a long post very much time because I'm worried I'll get halfway into it and realize it's just some prompt output that the author didn't even bother reading themselves.
Since the most limited resource here is experienced BIP editors, it makes sense to maintain a pretty hard line on the use of LLM-generated text in BIPs.
jgarzik has made no secret of starting like that.