New version, same rules...same bugs?

Thanks to the friendly encouragement of @DarthCoin, my node is now running v26 of Bitcoin Core. But in the process of upgrading, I've been doing some thinking. What if this new version I'm running has a bug that nobody has noticed yet and it validates some blocks that no other version of Bitcoin thinks are valid?
Obviously, my node would depart from the rest of you and run down a new fork. Maybe I could make up a fun name for it or something. But the problem is that nobody else would be on my fork of the chain with me. When everyone else sees that block I validated, they would all say: "Hey stranger, keep your new-fangled ideas away from here. We don't do it like that 'round these parts."
Everyone except for the other people who downloaded this new version of Core with the bug. They would still be my friends and think I had cool blocks.
Luckily, thus far, I'm still in sync with the rest of you.

How do you know if Bitcoin Core has a bug?

If Bitcoin Core v26 did have a bug in it that allowed it to validate blocks that older versions considered invalid, we'd know because it would suddenly fork off from all the older versions.
Because it was new, we'd probably assume that the problem was with v26 and most of us would revert to some previous version until the devs figured it out. At least, that's what I assume would happen.
But what if the bug was in an older version of Satoshi's client and not present in the newer versions?
What if there is some transaction lurking out there that is valid by all the consensus rules we agree on, but triggers a bug in all versions of Core earlier than v14? And when this magic pill of a transaction gets into a block, all of a sudden we get a fork with old nodes saying "what the hell is this shit you're trying to pass on us?" and new nodes saying "isn't this what we've always done?"
Again, we'd all look to the devs, but in the meantime, there would be this question:

Which chain is the real bitcoin?

All software has bugs and I'm not really interested in them. I am interested in how we know the difference between a bug and a consensus rule.
In a recent Bitcoin Takeover episode, Erik Voskuil makes the following statement around 41:20:
And the question is: who was on the real chain? At the point that that shipped, which implementation was Bitcoin? We were. They weren't because they changed it back.
Voskuil has been a maintainer of an alternative implementation of Bitcoin called libbitcoin. His statement refers to a bug disclosed in 2018: Bitcoin Core had unintentionally shipped a version (v0.15.0) that included a double-spend bug.
The bug was never exploited, but Voskuil raises the point that nodes running that unpatched version of Bitcoin Core v0.15 were following a set of rules that allowed double spends and libbitcoin nodes were not. At that time, which nodes do you think were running the real Bitcoin?
Double spending has never been a part of Bitcoin, so the answer seems clear. We have some concept of Bitcoin that we hold everything up to and ask, is this it? Is this a consensus rule or is this a bug?

Are bugs consensus rules?

In Fall 2022, many LND nodes crashed because a transaction their implementation of bitcoin thought was invalid got included in a block.
This time the discrepancy wasn't about something as obvious as double-spending.
LND nodes were using an implementation of Bitcoin called btcd, and this implementation had a limit to how much witness data could be included in a taproot transaction. Bitcoin Core did not have a limit for taproot transactions. When @brqgoo made a really big multisig transaction, Bitcoin Core nodes accepted it as valid, but btcd nodes did not.
Now, the reason the limit exists in the first place, I think, is to prevent denial-of-service attacks. It isn't really about consensus. Nonetheless, it changed consensus.
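To make that concrete, here is a toy sketch of how a non-consensus limit in one implementation can split it from another. Everything in it is made up for illustration: the `Tx` type, the two validator functions, and especially `TOY_WITNESS_LIMIT`, which is not btcd's real constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    witness_bytes: int  # total size of the witness data, in bytes


# Hypothetical cap, chosen only for illustration; not btcd's real constant.
TOY_WITNESS_LIMIT = 11_000


def strict_accepts(tx: Tx) -> bool:
    """Toy validator that enforces an extra witness-size cap."""
    return tx.witness_bytes <= TOY_WITNESS_LIMIT


def permissive_accepts(tx: Tx) -> bool:
    """Toy validator with no witness-size cap at all."""
    return True


def diverging(txs):
    """Return the txids on which the two validators disagree."""
    return [tx.txid for tx in txs if strict_accepts(tx) != permissive_accepts(tx)]


txs = [Tx("small", 300), Tx("huge-multisig", 33_000)]
print(diverging(txs))  # ['huge-multisig']
```

The two validators agree on every ordinary transaction, so nothing looks wrong until someone actually builds the oversized one.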

Even unto the bugs

In the aftermath of those exciting times, I remember thinking, Well, shoot: consensus isn't just the big rules we all have heard about. You have to make sure that you agree about all the little, quotidian stuff, too. Because consensus means you can't have one Bitcoin node think transactions are valid and another think they are invalid.
@petertodd used the phrase "bug-for-bug compatibility" when referring to this problem. And Satoshi identified the problem pretty early:
I don't believe a second, compatible implementation of Bitcoin will ever be a good idea. So much of the design depends on all nodes getting exactly identical results in lockstep that a second implementation would be a menace to the network.
This sounds pretty damning. And if you recall where we started with my worries about upgrading to v26 of Core, things aren't looking too good.

The Bitcoin Consensus Red Herring

You've probably seen this article from 2015 called The Bitcoin Consensus Red Herring. If you haven't, you should definitely give it a read.
The TL;DR is this:
There is currently no way to guarantee that any two versions of Bitcoin software, whether they are two different versions of Bitcoin Core, two different versions of alternative implementations, a version of Bitcoin Core versus a version of an alternative implementation, or even two copies of the same version of Bitcoin Core built with different compiler versions are in exact consensus agreement. Doing so is incredibly difficult and borders on impossible.
I'm sure that people who know better than I can weigh in on the merit of this statement, and again my interest isn't in the fact that bugs happen, but rather when they do, how we decide what is a bug and what is consensus.

A frowsy, ugly step-child looking oracle of truth

Back to that Bitcoin Takeover episode with Erik Voskuil. The host, Vlad, made an interesting comment (around 39:40) about running alternate implementations (or older versions of Core itself):
You have an alternative implementation of the same code base or a rewrite or something that's entirely new but respects the consensus codes for the simple reason that you want it to be an oracle of truth just in case there are changes that go by unnoticed with Core. You are going to have this thing that says you changed your rules.
I thought this was a pretty interesting idea. Because consensus is such a picky little bitch, it wouldn't hurt to put a little more emphasis on alternate implementations and older versions.
223 sats \ 1 reply \ @Murch 12 Mar
In Fall 2022, many LND nodes crashed because a transaction their implementation of bitcoin thought was invalid got included in a block.
This time the discrepancy wasn't about something as obvious as double-spending.
LND nodes were using an implementation of Bitcoin called btcd, and this implementation had a limit to how much witness data could be included in a taproot transaction. Bitcoin Core did not have a limit for taproot transactions. When @brqgoo made a really big multisig transaction, Bitcoin Core nodes accepted it as valid, but btcd nodes did not.
Now, the reason the limit exists in the first place, I think, is to prevent denial-of-service attacks. It isn't really about consensus. Nonetheless, it changed consensus.
The removal of the limit on witness data was an explicit design decision for Tapscript (see BIP 342).
btcd was not spec-conformant in enforcing an unspecified limit.
running alternate implementations
In case you aren’t aware, there is a project that does exactly this. Fork Monitor currently runs
  • Bitcoin Core 26.0rc2
  • Bitcoin Core 0.21.1
  • Bitcoin Core 0.18.0
  • Bitcoin Core 0.10.3
  • bcoin 2.2.0
  • Bitcoin Knots 0.14.2
  • btcd 0.24.1
  • btcd 0.23.3
and alerts subscribers whenever any of these versions diverge on block acceptance.
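As a rough sketch of the comparison step only (not how Fork Monitor is actually implemented, which I haven't checked), suppose you had already asked each node for the block hash it has at some height. The node names and shortened hashes below are made-up sample data, and `group_by_hash` and `has_diverged` are hypothetical helpers.

```python
from collections import defaultdict


def group_by_hash(hash_by_node):
    """Group node names by the block hash each one reports at a given height."""
    groups = defaultdict(list)
    for node, block_hash in hash_by_node.items():
        groups[block_hash].append(node)
    return {h: sorted(nodes) for h, nodes in groups.items()}


def has_diverged(hash_by_node):
    """True when the nodes report more than one distinct hash at that height."""
    return len(set(hash_by_node.values())) > 1


# Hypothetical sample data: hashes are shortened and invented for illustration.
reported = {
    "Bitcoin Core 26.0rc2": "00000000aaaa",
    "Bitcoin Core 0.21.1": "00000000aaaa",
    "btcd 0.23.3": "00000000bbbb",
}

if has_diverged(reported):
    for block_hash, nodes in group_by_hash(reported).items():
        print(block_hash, nodes)
```

A real monitor would also have to tolerate ordinary short-lived forks near the tip, which this sketch ignores.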
If Bitcoin Core v26 did have a bug in it that allowed it to validate blocks that older versions considered invalid, we'd know because it would suddenly fork off from all the older versions. […] This sounds pretty damning. And if you recall where we started with my worries about upgrading to v26 of Core, things aren't looking too good.
If Bitcoin Core v26 accepted blocks that prior versions did not, that would be an accidental hardfork. If it didn't accept blocks that prior versions accept, it would be an accidental softfork if at least 50% of the hashrate had upgraded already. Not sure if it helps, but it seems to me that I have more confidence in Bitcoin Core's test coverage than you do. :)
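The two directions can be sketched as a small classifier. This is a toy under loud assumptions: a "block" is just a number standing in for some property like size, the rule functions are invented, and hashrate is not modeled at all, so the "softfork" label only says the rules got tighter.

```python
def classify(old_accepts, new_accepts, blocks):
    """Label a rule change by comparing two validators on sample blocks."""
    # New version accepts something the old version rejects: rules loosened.
    loosened = any(new_accepts(b) and not old_accepts(b) for b in blocks)
    # New version rejects something the old version accepts: rules tightened.
    tightened = any(old_accepts(b) and not new_accepts(b) for b in blocks)
    if loosened and tightened:
        return "both"
    if loosened:
        return "hardfork"
    if tightened:
        return "softfork"
    return "none"


# Hypothetical rules: a block is "valid" if its toy size is under a limit.
def old_rule(size):
    return size <= 100


def looser_rule(size):
    return size <= 200


def tighter_rule(size):
    return size <= 50


sample_blocks = [40, 80, 150]
print(classify(old_rule, looser_rule, sample_blocks))   # hardfork
print(classify(old_rule, tighter_rule, sample_blocks))  # softfork
print(classify(old_rule, old_rule, sample_blocks))      # none
```

Note that the answer depends entirely on which sample blocks you try; a difference that no sample exercises stays invisible.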
reply
This is awesome! Thanks for pointing it out. I'm not sure how I managed to miss that it existed.
I run Core as my node, so it's definitely the implementation I have the most confidence in. But there's always this little niggling thing in my mind that if we all use Core we maybe lose something.
reply
It's a typical problem when there is no primary documentation of a protocol and all you have is a reference implementation (Bitcoin Core). This means that you have to embrace "it's not a bug, it's a feature".
Every "competing" implementation, as you point out, is just a rewrite, that has to replicate all the "bugs" and warts in itself.
It should be pointed out that one of the reasons why several arithmetic opcodes in Script (such as OP_MUL and OP_DIV) had to be disabled is because different compilers and platforms could behave differently in some edge cases (like integer overflow), and lead to exploits or undefined behaviour.
I dare say, if Bitcoin Core were written in something like Rust, those might still be available today (Rust is much more consistent across targets), but not so much in C. The consequence being, even if you rewrote it in Rust, you couldn't reenable them because that's no longer part of the consensus, it's just dead opcodes.
I would welcome with open arms an initiative to actually write down the consensus rules as a document (the white paper has very few details), followed by a complete test suite that is implementation-agnostic, which then could be applied to Bitcoin Core and others as a proof of their validity.
At this point it would be clear what is a bug and what isn't.
(Yes, Bitcoin Core has a lot of unit tests but they are tied to that implementation only. I'm talking about "end-to-end" tests - you input a blockchain state and a new block and test whether the program accepts it or not.)
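A minimal sketch of what I mean, with heavy simplification: the `Vector` format, the dictionary shapes for chain state and block, and both toy "nodes" are hypothetical stand-ins, not any real implementation's interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Vector:
    name: str
    chain_state: dict   # hypothetical, simplified stand-in for prior chain state
    block: dict         # hypothetical, simplified stand-in for a candidate block
    expect_valid: bool  # what the written-down rules say should happen


def run_vectors(accepts: Callable[[dict, dict], bool], vectors):
    """Return the names of vectors where the implementation disagrees."""
    return [v.name for v in vectors
            if accepts(v.chain_state, v.block) != v.expect_valid]


def toy_node(chain_state, block):
    """Toy rules: extend the current tip and never spend a spent output."""
    if block["prev"] != chain_state["tip"]:
        return False
    return not (set(block["spends"]) & set(chain_state["spent"]))


def buggy_node(chain_state, block):
    """Same as toy_node, but it forgets the double-spend check."""
    return block["prev"] == chain_state["tip"]


state = {"tip": "A", "spent": ["out1"]}
vectors = [
    Vector("extends-tip", state, {"prev": "A", "spends": ["out2"]}, True),
    Vector("wrong-parent", state, {"prev": "Z", "spends": ["out2"]}, False),
    Vector("double-spend", state, {"prev": "A", "spends": ["out1"]}, False),
]

print(run_vectors(toy_node, vectors))    # []
print(run_vectors(buggy_node, vectors))  # ['double-spend']
```

The harness never looks inside the implementation; it only feeds in state plus block and checks accept or reject, so the same vectors could be pointed at any node.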
In fact, for something that wants to be the "money of the future" or the "operating system of value", I'm shocked something like that doesn't exist yet. We're really running on software engineering hopium here.
In modern software development this is how we do things now (Test-Driven Development) but unfortunately Satoshi gave us what we would now call "legacy software".
reply
Not to mention that bitcoin being a protocol, it's precisely the perfect use case for TDD. Define specs, implement tests, all nodes need to pass them.
Although I have no idea how such a spec could be drafted.
Also, as a nitpick, all software built today is the legacy software of tomorrow, so although I think bitcoin would benefit from this, I think that what was done so far was well done, inasmuch as it was all the fruit of free labor hahaha
reply
What do you think of the admittedly hacky solution of just running a bunch of versions?
Maybe it doesn't matter so much for humble node runners who only care about verifying a few transactions every once in a while, but I've often wondered if mining pools do this as a way of making sure that they don't waste resources.
reply
It's a non-solution for me, not because it's hacky (a comprehensive test suite would test a node release against multiple compilers and CPU architectures), but because it finds problems in the production environment (instead of during development), which is unacceptable for the most important monetary network in the world.
reply
Aaand thus, it has only gotten more complex, thanks OP.
reply