Generation of the Master Seed:
1. Initialization of Entropy:
A master seed’s creation starts with the initialization of entropy: the accumulation of initial input data from which randomness is obtained and processed into a sequence of bits, the fundamental units of information in binary representation, where each bit holds a value of either 0 or 1.
The initialization of entropy is facilitated by a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG), a specialized algorithm that collects randomness from a mixture of external sources (e.g., environmental noise, system events, or user input) and processes the collected data into a sequence of typically 128 or 256 bits, which comprises the entropy. (BIP39 permits any multiple of 32 bits from 128 to 256; 128 and 256 are the lengths most wallets use.)
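As a rough illustration of this step, here is a minimal Python sketch. It assumes the operating system's CSPRNG (reached through the standard-library `secrets` module) is an acceptable entropy source; the function name `generate_entropy` is made up for this example, and a real wallet may gather and mix randomness differently.

```python
import secrets


def generate_entropy(bits: int = 128) -> bytes:
    """Return `bits` of entropy drawn from the operating system's CSPRNG."""
    # Assumption for this sketch: only the two lengths discussed above.
    if bits not in (128, 256):
        raise ValueError("this sketch only supports 128 or 256 bits")
    return secrets.token_bytes(bits // 8)


entropy = generate_entropy(128)
# Hexadecimal view: 32 hex characters, different on every run.
print(entropy.hex())
# Binary view: the same entropy written out as 128 individual bits.
print(format(int.from_bytes(entropy, "big"), "0128b"))
```

Never reuse entropy printed by a demo like this for a real wallet.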
2. Derivation of Checksum:
The entropy is subsequently used as the input to derive a corresponding checksum that is then appended to the entropy, providing a mechanism to detect errors and ensure data integrity when a wallet is restored.
The derivation of the checksum is facilitated by the Secure Hash Algorithm 256-bit (SHA-256) cryptographic hash function, a specialized algorithm that converts the input, here the entropy, into a deterministic, fixed-length 256-bit hash value. The first four or eight bits of the resulting hash value serve as the checksum, which is appended to the entropy, thereby increasing the length of the original sequence of bits to either 132 bits (128-bit entropy + 4-bit checksum) or 264 bits (256-bit entropy + 8-bit checksum).
- It's 1 bit of checksum for every 32 bits of entropy.
- SHA-256 offers cryptographic properties such as collision resistance and preimage resistance while remaining computationally efficient. For the checksum, its job is simply to make errors in a restored mnemonic detectable.
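The checksum rule (1 checksum bit per 32 bits of entropy, taken from the front of the SHA-256 hash) can be sketched in Python as follows. The helper names `checksum_bits` and `entropy_with_checksum` are invented for this example, and the all-zero entropy is used only because it makes the output easy to check by hand.

```python
import hashlib


def checksum_bits(entropy: bytes) -> str:
    """Return the checksum as a string of '0'/'1' characters."""
    entropy_len = len(entropy) * 8      # 128 or 256
    checksum_len = entropy_len // 32    # 4 or 8
    digest = hashlib.sha256(entropy).digest()
    digest_as_bits = format(int.from_bytes(digest, "big"), "0256b")
    return digest_as_bits[:checksum_len]


def entropy_with_checksum(entropy: bytes) -> str:
    """Return the entropy bits followed by the checksum bits."""
    entropy_len = len(entropy) * 8
    entropy_as_bits = format(int.from_bytes(entropy, "big"), f"0{entropy_len}b")
    return entropy_as_bits + checksum_bits(entropy)


example_entropy = bytes(16)             # 128 zero bits, for illustration only
print(checksum_bits(example_entropy))   # 0011
print(len(entropy_with_checksum(example_entropy)))  # 132
```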
3. Conversion to Mnemonic Sentence:
The entropy plus checksum is then converted from a sequence of bits into a sequence of words, enabling simple and convenient handling and storage of the resulting master seed.
The sequence of bits, composed of the entropy plus checksum, is now either 132 or 264 bits long, which allows it to be partitioned into either 12 or 24 uniform, fixed-size chunks of 11 bits each.
- The partitioning into 11-bit chunks results in 2^11 (or 2048) possible binary values per chunk. Therefore, the predefined word list contains 2048 distinct words, ensuring an exact one-to-one match between the possible chunk values and the entries of the word list commonly employed in mnemonic schemes.
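The arithmetic behind these numbers can be checked with a few lines of Python. This is only a sanity check of the figures above; the function name `mnemonic_layout` is made up for this example.

```python
def mnemonic_layout(entropy_bits: int) -> tuple[int, int, int]:
    """Return (checksum bits, total bits, number of 11-bit chunks)."""
    checksum_len = entropy_bits // 32          # 1 checksum bit per 32 entropy bits
    total_len = entropy_bits + checksum_len
    chunks, remainder = divmod(total_len, 11)
    assert remainder == 0                      # the total always divides evenly by 11
    return checksum_len, total_len, chunks


print(mnemonic_layout(128))  # (4, 132, 12)
print(mnemonic_layout(256))  # (8, 264, 24)
print(2 ** 11)               # 2048 possible values per chunk
```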
Each chunk, read as a binary number, is converted into a decimal number between 0 and 2047, which serves as the index into the predefined word list, so that each chunk maps to exactly one word.
- Example: a chunk with a decimal value of 211 corresponds to the word at index 211 of the word list (counting from 0).
- Remember: the sequence is composed of bits, the fundamental units of information in binary representation, where each bit holds a value of either 0 or 1.
Once each of the 12 or 24 chunks has been converted into its corresponding word, the words are joined, in the same order as the chunks in the original sequence of bits (entropy plus checksum), to form the mnemonic phrase.
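Putting step 3 together, the following Python sketch splits the entropy plus checksum into 11-bit chunks, turns each chunk into a decimal index, and looks up the words. The function names are invented for this example. Most importantly, `PLACEHOLDER_WORDLIST` is not the real word list: only its first four entries are the actual first four words of the BIP39 English list, and the rest are dummy strings, so a real implementation would load the full 2048-word list instead.

```python
import hashlib


def mnemonic_indices(entropy: bytes) -> list[int]:
    """Split entropy + checksum into 11-bit chunks and return them as integers."""
    entropy_len = len(entropy) * 8
    checksum_len = entropy_len // 32
    entropy_bits = format(int.from_bytes(entropy, "big"), f"0{entropy_len}b")
    digest = hashlib.sha256(entropy).digest()
    checksum = format(int.from_bytes(digest, "big"), "0256b")[:checksum_len]
    bits = entropy_bits + checksum
    # int(chunk, 2) is the binary-to-decimal conversion described above.
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]


def indices_to_mnemonic(indices: list[int], wordlist: list[str]) -> str:
    """Map each index to its word and join the words in order."""
    if len(wordlist) != 2048:
        raise ValueError("the word list must contain exactly 2048 words")
    return " ".join(wordlist[i] for i in indices)


# PLACEHOLDER: only the first four words are real; the rest are dummies.
PLACEHOLDER_WORDLIST = ["abandon", "ability", "able", "about"] + [
    f"word{i}" for i in range(4, 2048)
]

indices = mnemonic_indices(bytes(16))   # all-zero entropy, for illustration only
print(indices)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3]
print(indices_to_mnemonic(indices, PLACEHOLDER_WORDLIST))
```

With all-zero entropy, the first eleven chunks are 0 and the last chunk is 7 zero bits followed by the 4-bit checksum 0011, which is 3, so the phrase is "abandon" eleven times followed by "about".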
4. Mnemonic-to-Seed Conversion:
The mnemonic sentence is finally converted into a cryptographic (root) key, facilitating further derivation of the cryptographic keys used in the hierarchical-deterministic structure of the HD wallet:
The conversion of the mnemonic sentence into a master seed is facilitated by the Password-Based Key Derivation Function 2 (PBKDF2), a specialized algorithm that converts the initial input, the mnemonic phrase plus an (optional) passphrase, into a 64-byte (512-bit) root key from which all subsequent keys are derived: the master seed.
- The PBKDF2 function utilizes key stretching and salting to produce the resulting key.
- Key stretching means iterating a hash function many times to increase the computational cost of brute-force attacks, thereby increasing the security of the result. BIP39 specifies 2048 iterations of HMAC-SHA512.
- Salting means mixing additional data, the salt, into the input before iterating the hash function, thereby mitigating precomputed attacks. In BIP39 the salt is the fixed string "mnemonic" followed by the (optional) passphrase, so a different passphrase yields an entirely different master seed.
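The mnemonic-to-seed step can be sketched with Python's standard library. This assumes the BIP39 parameters (PBKDF2 with HMAC-SHA512, 2048 iterations, a salt of the string "mnemonic" plus the passphrase, and NFKD normalization of both inputs); the function name `mnemonic_to_seed` is made up for this example.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic (plus optional passphrase) into a 64-byte seed."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 rounds of HMAC-SHA512 is the key-stretching part.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


phrase = " ".join(["abandon"] * 11 + ["about"])   # well-known test phrase, never use it
seed = mnemonic_to_seed(phrase)
print(len(seed))   # 64
print(seed.hex())
# Same phrase, different passphrase: a completely different seed.
print(mnemonic_to_seed(phrase, "example passphrase").hex())
```

Note that the words are not converted back into bits here: the mnemonic sentence itself, as text, is the password fed into PBKDF2.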
The root key, or master seed, is used to derive the master extended private key (xprv), from which the master extended public key (xpub) is derived to facilitate (independent) public key generation. Child keys can be derived from the xprv (both private and public child keys) or from the xpub (public child keys only, and only for non-hardened indexes); in both cases an index number is included to determine the child's position within the hierarchy. Addresses, in turn, are derived from their corresponding public keys, providing unique “accounts” for transactions within the blockchain network.
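For the very first link in that chain (seed to master private key), here is a small Python sketch. It assumes the BIP32 rule that the seed is run through HMAC-SHA512 with the key "Bitcoin seed", giving a 32-byte master private key and a 32-byte chain code; the function name is made up, the demo seed is an arbitrary 16-byte value, and the sketch skips the validity check BIP32 requires on the private key as well as everything involving public keys and child derivation.

```python
import hashlib
import hmac


def master_key_from_seed(seed: bytes) -> tuple[bytes, bytes]:
    """Return (master private key, master chain code) for a seed."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    private_key = digest[:32]   # left half
    chain_code = digest[32:]    # right half, used later when deriving children
    # Not shown: BIP32 rejects a key that is zero or not below the curve order.
    return private_key, chain_code


demo_seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # illustration only
private_key, chain_code = master_key_from_seed(demo_seed)
print(private_key.hex())
print(chain_code.hex())
```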
The End.
I've spent multiple weeks gathering, writing, structuring and refining the above summary, and I think that it turned out quite well! 🤠
- I've also gone the extra mile and searched for diagrams (that's what they're called?) that'd fit with the general text, let me know what you think of them!
I'd be very happy to hear some more input regarding the above summary, as well as possible points of improvement.
One of the main resources used in the above summary is the website of Greg Walker: learnmeabitcoin.com
I also reached out to Greg to ask if he'd be able to review the summary for both the chronological order and the factual correctness of the contents, which he kindly agreed to!
He's a knowledgeable and helpful guy who really helped me improve on the initial draft.
The next part will cover the key derivation process(es), as well as elliptic curve cryptography and multiplication, possibly amongst other things. It's a pretty complex part, but I'm working on it!