the setup

  • You are the Sender
  • Take the set of numbers {1,2,3,...,11,12,13} and choose a random permutation r of the elements. There are factorial(13) = 13! possible arrangements of 13 elements.
  • For example, say you have r = (8, 12, 1, 7, 10, 4, 3, 6, 11, 2, 5, 13, 9)
  • You also have available a standard 54-card deck (4 suits * 13 cards + 2 jokers)

the challenge

  • Encode r into the deck of cards by choosing an ordering of the deck of cards such that it is as robust as possible against errors.
  • The Receiver who knows the encoding/decoding algorithm but who does not know r should be able to recover r from the deck of cards even if:
    • some of the deck is lost in transmission -- if this is the case, how many cards can you lose and still be able to recover r?
    • the deck is received in a different order -- if this is the case, how different can it be before Receiver is unable to decode?
example
Here is one method which is probably not very good:
  1. ignore the jokers, treat aces as 1 and kings as 13
  2. put the suits of diamonds and hearts in the same order as r
  3. put suits of clubs and spades in the reverse order of r
  4. construct the deck by interleaving diamond, clubs, hearts, spades, diamond, clubs, hearts, spade, ...
  5. so for our example r the ordering of the deck would be 8D,9C,8H,9S,12D,13C,12H,13S...
the rules
  • the Sender/Receiver agree on the encoding/decoding mechanism prior to Sender generating r (describing the encoding/decoding is what you will do in your reply)
  • after encoding, Sender will destroy its original copy of r and transmit the deck of cards to the Receiver

What are some more robust methods?

10,000 sats paid
VzxPLnHqr's bounties
the deck is received in a different order -- if this is the case, how different can it be before Receiver is unable to decode?
How do you measure the difference between two orderings? As the number of inversions?
And what are you trying to maximize here? The number of cards you can lose? Can any subset of card indices be lost?
reply
Thanks for engaging, and good questions. Please help me firm up the definitions.
How do you measure the difference between two orderings? As the number of inversions?
Number of inversions probably makes sense, but I guess it depends. If we come up with some clever way of doing this (e.g. with a non-standard deck of cards) then maybe there is some other measure we could use?
And what are you trying to maximize here? The number of cards you can lose? Can any subset of card indices be lost?
It would be great if some subset of the larger M-card deck (M = 54 in the example) could be lost and yet still be able to reconstruct the data where the data in the example is a number between 0 and factorial(N) where N = 13. I was representing that number as a permutation of 13 objects, but I suppose that part is not very important.
Think about data as being the entropy+checksum for a bip39 seed. What I want is definitions for encode and decode such that:
encode(data) = f decode(g) = data
where f and g are not equal, but f and g are permutations of {1,2,3,4,...,M}.
Does that help clarify?
reply
deleted by author
reply
Ngl I find it a little difficult to understand what you even want.
I would suggest to you for the additional transmission a checksum.
It depends on the size of r. And how frequent those packet losses occur is also relevant. At 13 you can easily use Hash({8D,9C,8H,9S,12D,13C,12H,13S...}) at the end. If one package gets lost, the reciever can loop through and try 13 different values at 13 positions - thus calculating if the hash is correct13*13 different times. That's quite doable.
Another more overhead idea would be to build a merkle tree over all positions with hashes. Then you could more easily find the position were the error occured.
Hash(Hash(Hash(8D,9C),Hash(8H,9S)) ,Hash(Hash(12D,13C),Hash(12H,13S))) E.g. if the error here occured in 13S than you would first find out that the whole hash is incorrect, than find Hash(Hash(8D,9C),Hash(8H,9S)) to be correct, then locate the error in Hash(Hash(12D,13C),Hash(12H,13S)) etc. and go down the tree from there. That's a method that's widely used in many protocols in computer science for transmission.
reply
Ngl I find it a little difficult to understand what you even want.
Thanks for your answer and for the feedback regarding clarity. I should have given a more explicit example of the type of transmission I am after:
better example / motivation -- cold storage of entropy
Imagine storing the entropy for a cold wallet in the form of a carefully ordered deck of cards. It is not hard to take a number and represent it as a permutation of N objects. This is what a Lehmer code does, but Lehmer codes alone do not get us all the way there from a redundancy/error-correcting standpoint.
Where I get stumped is how to build error correction into the permutation itself so that you can recover the seed entropy even if your permutation (the message) got a little jumbled up. My thought is that there is probably a way to do it by using an even larger collection of M objects where M > N.
a method that's widely used in many protocols in computer science for transmission.
Regarding your specific solution, I am confused. I want the final encoding of the message (e.g. what is actually transmitted) to be in the form of a permutation of objects, nothing else. Does your method do that? I think your merkle tree idea assumes that the receiver somehow already knows/learns the root hash, but that would be outside of the rules.
reply
I want the final encoding of the message (e.g. what is actually transmitted) to be in the form of a permutation of objects, nothing else. Does your method do that?
With a merkle tree you would transmit the the permutation of objects and in addition to that the tree.
I think your merkle tree idea assumes that the receiver somehow already knows/learns the root hash, but that would be outside of the rules.
The idea is that the whole data is large and transmitted first. Then the tree is used to find errors. You send the tree root, the recipient acknowledges the tree root. And then answers if his calculated tree root is equal to the senders tree root. If yes, great, the most minimal transmission for checksum verification. If no, then the underlying tree offers a super fast way to find where the error was.
reply
Thanks. I understand what you mean now. However, it does not quite capture what I am seeking. I want it to be a single message which is comprised of data (e.g. entropy for a cold wallet, which itself may include some of its own checksums, etc), and I want the whole thing to be represented as a permutation of {1,2,3,4,5...,M}. So what I want is definitions for encode and decode such that:
encode(data) = f decode(g) = data
where f and g are not equal, but f and g are permutations of {1,2,3,4,...,M}.
Continuing with the cold storage example: the sender "sends" f by putting the cards in the order determined by f and then destroying the underlying original entropy (the secret) since it is now represented by f. Maybe Sender then buries this deck of cards in a hole in the ground or something :-).
Then, some time later, the Receiver (which could be the Sender itself) digs up the deck of cards and writes down their order. Unfortunately the order the cards are now in, for whatever reason, turns out to be g which is not equal to f. In other words, there was an error in the transmission. However, all hope should not be lost.
So, the question is: can we endow f with enough error correction capabilities that even if the receiver receives g, the receiver can still find the underlying original secret data from g?
reply
encode(data) = f decode(g) = data
where f and g are not equal, but f and g are permutations of {1,2,3,4,...,M}.
What this makes me think of as someone specialized in security in CS is "second preimage" from hash functions. Let's do a quick transformation:
decode(g) = data <=> encode(decode(g)) = encode(data) <=> g = encode(data) => encode(data) = f encode(data) = g with f !=g
Functions that fulfill this property are insecure hashfunctions. Especially compressing hashfunction for which two preimages can easily be found (google "second preimage resistance"). I know this doesn't help you directly but maybe ths cryptographic perspective helps you searching
reply
Right, that is actually why I am trying to focus more on the error-correction aspect of this. Using a secure hash function for checksum calculation (as in the bip39 spec) helps determine that there is an error but it does not identify where the error is nor help correct it.
Reed-Solomon error correcting codes are more what I am seeking, but it is non-trivial to make those work for permutations.
Anyway, there seems to be less interest/activity in this bounty thread than I was anticipating, so I went ahead and awarded the bounty to you!
reply
If you have an idea for how to do this robustly but which needs a non-standard deck of cards, that is fine, please share your idea!
The important thing is that the final encoding is solely in terms of a permutation of M physical objects (in the example M = 54) and that such an encoding stores information about a permutation of N objects (in the example N = 13) with some robustness against errors.
Maybe you have a clever way of doing it with a permutation of polynomials or something crazy like that?
reply