
Hal Finney hal at finney.org
Thu Nov 13 11:24:18 EST 2008
https://www.metzdowd.com/pipermail/cryptography/2008-November/014848.html

James A. Donald writes:
Satoshi Nakamoto wrote:
When there are multiple double-spent versions of the same transaction, one and only one will become valid.
That is not the question I am asking.
It is not trust that worries me, it is how it is possible to have a globally shared view even if everyone is well behaved.
The process for arriving at a globally shared view of who owns what bitgold coins is insufficiently specified.
I agree that the description is not completely clear on how these matters are handled. Satoshi has suggested that releasing source code may be the best way to clarify the design. As I have tried to work through details on my own, it does appear that the rules become rather complicated and indeed one needs at least a pseudo-code algorithm to specify the behavior. So perhaps writing real code is not a bad way to go. I found that there is a sourceforge project set up for bitgold, although it does not have any code yet.
In answer to James' specific question, about what happens when different nodes see different sets of transactions, due to imperfect broadcast, here is how I understand it. Each node must be prepared to maintain potentially several "candidate" block chains, each of which may eventually turn out to become the longest one, the one which wins. Once a given block chain becomes sufficiently longer than a competitor, the shorter one can be deleted. This length differential is a parameter which depends on the node's threat model for how much compute power an attacker can marshal, in terms of the fraction of the "honest" P2P network's work capacity, and is estimated in the paper. The idea is that once a chain gets far enough behind the longest one, there is essentially no chance that it can ever catch up.
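As an illustrative sketch of this bookkeeping (the structure, the names, and the pruning threshold below are guesses for illustration, not anything specified in the design): a node could keep its candidate chains keyed by tip hash and drop any chain that has fallen a threshold number of blocks behind the longest one.

    #include <algorithm>
    #include <map>
    #include <string>
    #include <vector>

    // Hypothetical sketch of candidate-chain tracking; names are illustrative.
    struct CandidateChain {
        std::vector<std::string> blockHashes;            // blocks from genesis to tip
        size_t Length() const { return blockHashes.size(); }
    };

    class ChainTracker {
        std::map<std::string, CandidateChain> chains;    // keyed by tip hash
        size_t pruneDepth;                               // blocks behind the best before a chain is dropped

    public:
        explicit ChainTracker(size_t depth) : pruneDepth(depth) {}

        size_t BestLength() const {
            size_t best = 0;
            for (const auto& entry : chains)
                best = std::max(best, entry.second.Length());
            return best;
        }

        // Drop any candidate that is pruneDepth or more blocks behind the longest
        // chain, on the argument that it has essentially no chance to catch up.
        void Prune() {
            const size_t best = BestLength();
            for (auto it = chains.begin(); it != chains.end(); )
                it = (best - it->second.Length() >= pruneDepth) ? chains.erase(it) : ++it;
        }
    };

The pruning depth would be chosen from the node's threat model, as described above; it is left as a parameter here.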
In order to resolve the issue James raised, I think it is necessary that nodes keep a separate pending-transaction list associated with each candidate chain. This list would include all transactions the node has received (via broadcast by the transactees) but which have not yet been incorporated into that block chain. At any given time, the node is working to extend the longest block chain, and the block it is working to find a hash collision for will include all of the pending transactions associated with that chain.
I think that this way, when a candidate chain is deleted because it got too much shorter than the longest one, transactions in it are not lost, but have continued to be present in the pending-transaction list associated with the longest chain, in those nodes which heard the original transaction broadcast. (I have also considered whether nodes should add transactions to their pending-transaction list that they learn about through blocks from other nodes, even if those blocks do not end up making their way into the longest block chain; but I'm not sure if that is necessary or helpful.)
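A rough sketch of the per-chain pending lists (again, the data layout and names here are my own assumptions): each candidate chain carries its own set of pending transaction hashes, every transaction heard by broadcast goes into all of them, and deleting a losing chain discards only its copy, so nothing the node heard directly is lost.

    #include <map>
    #include <set>
    #include <string>

    // Hypothetical sketch: one pending-transaction set per candidate chain.
    struct ChainState {
        std::set<std::string> pendingTx;   // tx hashes heard but not yet in this chain
        size_t length = 0;                 // number of blocks in this chain
    };

    class Node {
        std::map<std::string, ChainState> candidates;   // keyed by chain tip hash

    public:
        // A transaction heard via broadcast goes into every candidate's pending
        // set, so it survives no matter which chain eventually wins.
        void OnTransactionBroadcast(const std::string& txHash) {
            for (auto& entry : candidates)
                entry.second.pendingTx.insert(txHash);
        }

        // Deleting a losing chain loses nothing the node heard directly: the same
        // transactions are still pending on the surviving chains.
        void DropChain(const std::string& tipHash) {
            candidates.erase(tipHash);
        }
    };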
Once these rules are clarified, more formal modeling will be helpful in understanding the behavior of the network given imperfect reliability. For example, if on average a fraction f of P2P nodes receive a given transaction broadcast, then I think one would expect 1/f block-creation times to elapse before the transaction appears in what is destined to become the longest chain. One might also ask, given that the P2P network broadcast is itself imperfectly reliable, how many candidate chains must a given node keep track of at one time, on average? Or as James raised earlier, if the network broadcast is reliable but depends on a potentially slow flooding algorithm, how does that impact performance?
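As a back-of-the-envelope check on that 1/f figure (my own derivation, under the simplifying assumption that each block is found by a node chosen effectively at random from the network): if a fraction f of nodes have heard the transaction, the number of block-creation times until one of them finds the next block is geometrically distributed, so

    P(\text{first inclusion at block } k) = f\,(1-f)^{k-1},
    \qquad E[k] = \sum_{k \ge 1} k\,f\,(1-f)^{k-1} = \frac{1}{f}.

For example, with f = 0.9 the transaction would typically wait about 1.1 block-creation times; with f = 0.5, about 2.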
And then on top of that we need everyone to have a motive to behave in such a fashion that agreement arises. I cannot see that they have motive when I do not know the behavior to be motivated.
I am somewhat less worried about motivation. I'd be satisfied if the system can meet the following criteria:
  1. No single node operator, or small collection of node operators which controls only a small fraction of overall network resources, can effectively cheat, if other players are honest.
  2. The long tail of node operators is sufficiently large that no small collection of nodes can control more than a small fraction of overall resources. (Here, the "tail" refers to a ranking based on amount of resources controlled by each operator.)
  3. The bitcoin system turns out to be socially useful and valuable, so that node operators feel that they are making a beneficial contribution to the world by their efforts (similar to the various "@Home" compute projects where people volunteer their compute resources for good causes).
In this case it seems to me that simple altruism can suffice to keep the network running properly.
Distributed databases are hard even when all the databases perfectly follow the will of a single owner. Messages get lost, links drop, synchronization delays become abnormal, entire machines go up in flames, and the network as a whole has to take all this in its stride.
A very good point, and a more complete specification is necessary in order to understand how the network will respond to imperfections like this. I am looking forward to seeing more detail emerge.
One thing I might mention is that in many ways bitcoin is two independent ideas: a way of solving the kinds of problems James lists here, of creating a globally consistent but decentralized database; and then using it for a system similar to Wei Dai's b-money (which is referenced in the paper) but transaction/coin based rather than account based. Solving the global, massively decentralized database problem is arguably the harder part, as James emphasizes. The use of proof-of-work as a tool for this purpose is a novel idea well worth further review IMO.
Hal Finney
Satoshi Nakamoto satoshi at vistomail.com
Fri Nov 14 13:55:35 EST 2008
https://www.metzdowd.com/pipermail/cryptography/2008-November/014853.html

Hal Finney wrote:
I think it is necessary that nodes keep a separate pending-transaction list associated with each candidate chain. ... One might also ask ... how many candidate chains must a given node keep track of at one time, on average?
Fortunately, it's only necessary to keep a pending-transaction pool for the current best branch. When a new block arrives for the best branch, ConnectBlock removes the block's transactions from the pending-tx pool. If a different branch becomes longer, it calls DisconnectBlock on the main branch down to the fork, returning the block transactions to the pending-tx pool, and calls ConnectBlock on the new branch, sopping back up any transactions that were in both branches. It's expected that reorgs like this would be rare and shallow.
With this optimisation, candidate branches are not really any burden. They just sit on the disk and don't require attention unless they ever become the main chain.
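Reading that description as a sketch (ConnectBlock and DisconnectBlock are the names mentioned above; everything else here, including the data types, is my own illustrative guess at the surrounding structure):

    #include <set>
    #include <string>
    #include <vector>

    struct Tx    { std::string hash; };
    struct Block { std::vector<Tx> vtx; };

    // Pending-transaction pool for the current best branch only.
    std::set<std::string> pendingPool;

    void ConnectBlock(const Block& block) {
        // Transactions now confirmed in the best branch leave the pending pool.
        for (const Tx& tx : block.vtx)
            pendingPool.erase(tx.hash);
    }

    void DisconnectBlock(const Block& block) {
        // Transactions in a block being unwound return to the pending pool.
        for (const Tx& tx : block.vtx)
            pendingPool.insert(tx.hash);
    }

    // If a different branch becomes longer: unwind the old best branch back to
    // the fork point, then connect the new branch. ConnectBlock sops back up any
    // transactions that appear in both branches.
    void Reorganize(const std::vector<Block>& oldBranchFromFork,
                    const std::vector<Block>& newBranchFromFork) {
        for (auto it = oldBranchFromFork.rbegin(); it != oldBranchFromFork.rend(); ++it)
            DisconnectBlock(*it);
        for (const Block& block : newBranchFromFork)
            ConnectBlock(block);
    }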
Or as James raised earlier, if the network broadcast is reliable but depends on a potentially slow flooding algorithm, how does that impact performance?
Broadcasts will probably be almost completely reliable. TCP transmissions are rarely ever dropped these days, and the broadcast protocol has a retry mechanism to get the data from other nodes after a while. If broadcasts turn out to be slower in practice than expected, the target time between blocks may have to be increased to avoid wasting resources. We want blocks to usually propagate in much less time than it takes to generate them, otherwise nodes would spend too much time working on obsolete blocks.
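One rough way to put a number on "much less time" (my own back-of-the-envelope framing, with both figures below assumed purely for illustration): if a block takes t seconds to propagate and blocks are generated every T seconds on average, then roughly a fraction t/T of the network's work goes into blocks that are already obsolete,

    \text{fraction of work on obsolete blocks} \approx \frac{t}{T},
    \qquad \text{e.g. } \frac{10\ \text{s}}{600\ \text{s}} \approx 1.7\%.

If propagation slowed to a minute against that same assumed ten-minute target, the waste would rise to roughly 10%, which is why the target time between blocks might have to be increased in that case.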
I'm planning to run an automated test with computers randomly sending payments to each other and randomly dropping packets.
  3. The bitcoin system turns out to be socially useful and valuable, so that node operators feel that they are making a beneficial contribution to the world by their efforts (similar to the various "@Home" compute projects where people volunteer their compute resources for good causes).
In this case it seems to me that simple altruism can suffice to keep the network running properly.
It's very attractive to the libertarian viewpoint if we can explain it properly. I'm better with code than with words though.
Satoshi Nakamoto
James A. Donald jamesd at echeque.com
Sat Nov 15 19:00:04 EST 2008
https://www.metzdowd.com/pipermail/cryptography/2008-November/014861.html

Satoshi Nakamoto wrote:
Fortunately, it's only necessary to keep a pending-transaction pool for the current best branch.
This requires that we know, that is to say an honest well behaved peer whose communications and data storage are working well knows, what the current best branch is - but of course, the problem is that we are trying to discover, trying to converge upon, a best branch, which is not easy at the best of times, and becomes harder when another peer is lying about its connectivity and capabilities, and yet another peer has just had a major disk drive failure obfuscated by a software crash, and the international fibers connecting yet a third peer have been attacked by terrorists.
When a new block arrives for the best branch, ConnectBlock removes the block's transactions from the pending-tx pool. If a different branch becomes longer
Which presupposes the branches exist, that they are fully specified and complete. If they exist as complete works, rather than works in progress, then the problem is already solved, for the problem is making progress.
Broadcasts will probably be almost completely reliable.
There is a trade off between timeliness and reliability. One can make a broadcast arbitrarily reliable if time is of no consequence. However, when one is talking of distributed data, time is always of consequence, because it is all about synchronization (peers need to have corresponding views at corresponding times), so when one does distributed data processing, broadcasts are always highly unreliable. Attempts to ensure that each message arrives at least once result in increased timing variation. Thus one has to make a protocol that is either UDP or somewhat UDP-like, in that messages are small, failure of messages to arrive is common, messages can arrive in a different order from the order in which they were sent, and the same message may arrive multiple times. Either we have UDP, or we need to accommodate the same problems as UDP has on top of TCP connections.
Rather than assuming that each message arrives at least once, we have to make a mechanism such that the information arrives even though conveyed by messages that frequently fail to arrive.
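A sketch of what handling such messages looks like in practice (the class and method names are illustrative only): the receiver has to be idempotent, has to tolerate duplicates, and cannot assume anything about arrival order.

    #include <set>
    #include <string>

    // Hypothetical sketch of loss/duplicate/reorder-tolerant message handling.
    class MessageHandler {
        std::set<std::string> seen;   // hashes of messages already processed

    public:
        // Returns true if the message was new and was applied.
        bool OnMessage(const std::string& msgHash, const std::string& payload) {
            // Duplicates are expected; applying the same message twice must be a no-op.
            if (!seen.insert(msgHash).second)
                return false;
            Apply(payload);
            return true;
        }

    private:
        void Apply(const std::string& payload) {
            // Each message must carry enough context to be applied on its own,
            // whatever order it arrives in relative to other messages.
            (void)payload;
        }
    };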
TCP transmissions are rarely ever dropped these days
People always load connections near maximum. When a connection is near maximum, TCP connections suffer frequent unreasonably long delays, and connections simply fail a lot - your favorite web cartoon somehow loads forever, and you try again, or it comes up with a little x in place of a picture, and you try again.
Further, very long connections - for example, ftp downloads of huge files - seldom complete. If you try to ftp a movie, you are unlikely to get anywhere unless both client and server have a resume mechanism so that they can talk about partially downloaded files.
UDP connections, for example Skype video calls, also suffer frequent picture freezes, loss of quality, and so forth, and have to have mechanisms to keep going regardless.
It's very attractive to the libertarian viewpoint if we can explain it properly. I'm better with code than with words though.
No, it is very attractive to the libertarian if we can design a mechanism that will scale to the point of providing the benefits of rapidly irreversible payment, immune to political interference, over the internet, to very large numbers of people. You have an outline and proposal for such a design, which is a big step forward, but the devil is in the little details.
I really should provide a fleshed out version of your proposal, rather than nagging you to fill out the blind spots.
Satoshi Nakamoto satoshi at vistomail.com
Mon Nov 17 12:24:43 EST 2008
https://www.metzdowd.com/pipermail/cryptography/2008-November/014863.html

James A. Donald wrote:
Fortunately, it's only necessary to keep a pending-transaction pool for the current best branch.
This requires that we know, that is to say an honest well behaved peer whose communications and data storage is working well knows, what the current best branch is -
I mean a node only needs the pending-tx pool for the best branch it has. The branch that it currently thinks is the best branch. That's the branch it'll be trying to make a block out of, which is all it needs the pool for.
Broadcasts will probably be almost completely reliable.
Rather than assuming that each message arrives at least once, we have to make a mechanism such that the information arrives even though conveyed by messages that frequently fail to arrive.
I think I've got the peer networking broadcast mechanism covered.
Each node sends its neighbours an inventory list of hashes of the new blocks and transactions it has. The neighbours request the items they don't have yet. If the item never comes through after a timeout, they request it from another neighbour that had it. Since all or most of the neighbours should eventually have each item, even if the comms get fumbled up with one, they can get it from any of the others, trying one at a time.
The inventory-request-data scheme introduces a little latency, but it ultimately helps speed more by keeping extra data blocks off the transmit queues and conserving bandwidth.
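A sketch of the inventory / request / data exchange described above, from one node's point of view (all names and the surrounding structure are my own simplification, not taken from the actual code):

    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    class PeerNode {
        std::set<std::string> have;                  // blocks/txs we already hold, by hash
        std::map<std::string, int> pendingRequests;  // item hash -> peer we asked

    public:
        // A neighbour advertises the items it has, by hash; ask only for what we lack.
        void OnInventory(int peerId, const std::vector<std::string>& hashes) {
            for (const std::string& h : hashes) {
                if (have.count(h) || pendingRequests.count(h))
                    continue;                        // already have it, or already asked for it
                RequestItem(peerId, h);
                pendingRequests[h] = peerId;
            }
        }

        // The requested item arrived; remember it so we never fetch it twice.
        void OnItem(const std::string& hash) {
            have.insert(hash);
            pendingRequests.erase(hash);
        }

        // If an item never arrives before the timeout, ask a different neighbour
        // that also advertised it.
        void OnRequestTimeout(const std::string& hash, int anotherPeerId) {
            if (!have.count(hash)) {
                RequestItem(anotherPeerId, hash);
                pendingRequests[hash] = anotherPeerId;
            }
        }

    private:
        void RequestItem(int peerId, const std::string& hash) {
            // Placeholder for sending a request for this item to the given peer.
            (void)peerId; (void)hash;
        }
    };

Only hashes travel in the inventory announcements; the full data moves once, on request, which is the bandwidth saving described above.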
You have an outline and proposal for such a design, which is a big step forward, but the devil is in the little details.
I believe I've worked through all those little details over the last year and a half while coding it, and there were a lot of them. The functional details are not covered in the paper, but the source code is coming soon. I sent you the main files. (available by request at the moment, full release soon)
Satoshi Nakamoto