meerkat-java/mixer/docs/mixer.md

105 lines
6.5 KiB
Markdown

#Mixer File Format
This document attempts to provide sufficient information to write a verifier for our mixnet implementation.
## High-Level Overview.
The mixer implements Abe's Permutation-network-based mix. The on-disk format of the mixer's output consists of a sequence of protobif messages, written using their writeDelimitedTo() method:
1. A [MixBatchHeader](mixing.proto) message containing the number of switch layers and the number of ciphertexts in each layer (where the latter is given as the base-2 log of the number, since the actual number must be a power of 2).
2. A matrix of ciphertext messages. The matrix is given column by column; the total number of columns is the number of layers + 1 (the first column is the input to the mix, and the final column is the output). The first message in the sequence is ciphertext 0 in column 0, then ciphertext 1 in layer 0, finally ending with ciphertext $2^{logN}-1$ in column $layers$.
3. A matrix of proofs for each 2x2 switch. The matrix is given column by column; the total number of columns is the number of layers. The first message in the matrix sequence is switch 0 in column 0, then switch 1 in column 0, finally ending with switch $2^{logN-1}-1$ in column $layers-1$.
## Ciphertexts
For future compatibility, each ciphertext actually written to disk is encoded as an "opaque" [RerandomizableEncryptedMessage](crypto.proto). The data field of this message contains the serialized ElGamalCiphertext message (see below).
### EC-ElGamal Ciphertexts
Ciphertexts are serialized using the [ElGamalCiphertext](concrete_crypto.proto) message, with fields "c1" and "c2" for the first and second group elements.
### EC Group elements
Group elements use the [GroupElement](concrete_crypto.proto) message. It's only field is "data", which should be an ASN.1-encoded curve point with compression (see section 4.3.6 of [X9.62-1998](https://www.security-audit.com/files/x9-62-09-20-98.pdf) "Public Key Cryptography For The Financial Services Industry: The Elliptic Curve Digital Signature Algorithm (ECDSA)")
### EC-ElGamal Key Format
The ECElGamal Key is stored in the [ElGamalPublicKey](concrete_crypto.proto) message that contains a standard DER-encoded SubjectPublicKeyInfo as in [RFC 3279](https://tools.ietf.org/html/rfc3279) (note that this encoding includes the elliptic-curve group parameters).
## Proofs
Each mixing proof is a serialized [Mix2Proof](mixing.proto) message. This consists of a firstMessage field (containing the first message of the Sigma protocol) and a finalMessage field(containing the final message of the Sigma protocol). The challenge is derived from the *serialized form* of the firstMessage field by hashing (see [Fiat-Shamir](#fiat-shamir-random-oracle) below for the hash algorithm), a finalMessage field and a location field (see [Location](#location) below).
### Proof Statement
The proof statement itself is a disjunction of two conjunctions of Schnorr proofs for discrete-log equivalence, constructed from the ciphertexts. Let $g$ be the group generator and $h$ the ElGamal public key. Given input ciphertexts $a=(a_1,a_2)$ and
$b=(b_1,b_2)$ and output ciphertexts $c=(c_1,c_2)$ and $d=(d_1,d_2)$, the statement proved is:
$$
(\log_g \frac{c_1}{a_1} = \log_h\frac{c_2}{a_2})\text{ AND } (\log_g \frac{d_1}{b_1} = \log_h\frac{d_2}{b_2}) \\
\text{OR}\\
(\log_g \frac{d_1}{a_1} = \log_h\frac{d_2}{a_2}) \text{ AND } (\log_g \frac{c_1}{b_1} = \log_h\frac{c_2}{b_2})
$$
The firstMessage and finalMessage fields are generated using the sigma-protocol disjunction composition of the sigma protocols for the two conjuction statements, and those in turn are generated using the sigma-protocol conjunction statement of the basic Schnorr proofs (more details in the [protobuf file](mixing.proto)).
### Location
The location field gives the layer number (column) and switchIdx (index within the column) for the 2x2 switch which this proof references. The input ciphertexts to switch $s$ in layer $i$ are in positions $2s$, $2s+1$ in column $i$ of the ciphertext matrix. The indices of the output ciphertexts in column $i+1$ are given in the location field (out0 is the index of the first output ciphertext and out1 the second).
### Integers
Integers are serialized using the [BigInteger](crypto.proto) protobuf message, whose data field contains the integer encoded as in Java's [BigInteger.toByteArray()](http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#toByteArray()) method.
### Fiat-Shamir "Random Oracle"
The Fiat-Shamir heuristic uses a "random oracle". We use the following java code as the implementation of the oracle.
public BigInteger hash(byte[] input) {
MessageDigest md = MessageDigest.getInstance("SHA-256");
md.reset();
byte[] digest = md.digest(input);
return new BigInteger(1,digest);
}
Note that our Sigma-protocols use Integers as a challenge, so the random oracle returns a BigInteger. The input to the oracle is always a protobuf Message; we convert it to a byte array using its toByteArray() method.
## The Mixer Command-line Utility
The Mixer application can generate keys, encrypt, mix, verify and decrypt. The jar file can be run using "java -jar mixer.jar"
Run
java -jar mixer.jar -h
To get a summary of command line options.
### Examples
#### Generate a new Key-Pair
java -jar mixer.jar -g -k ecelgamal.key
Generates the keys in the file ecelgamal.key
#### Encrypt some plaintexts
(run after generating keys)
java -jar mixer.jar -k ecelgamal.key -e -i infile.txt -o outfile.enc
**infile.txt** should consist of multiple lines; each line is used as a plaintext. If you get a warning message about inputs being truncated, you must ensure the longest line is shorter than about 20 chars (each plaintext must fit into a single group element, and there is some extra protobuf overhead)
**outfile.enc** will contain a 0-layer mixnet (i.e., a single column of ciphertexts). The plaintexts will be padded with empty lines to the nearest power of two.
#### Run the mix network
(run after encrypting some plaintexts)
java -jar mixer.jar -k ecelgamal.key -i outfile.enc -o mixed.enc
**mixed.enc** will contain the mixnet output including proofs.
#### Verify the mix
java -jar mixer.jar -k ecelgamal.key -i mixed.enc
This doesn't write output. You can add the "-s" flag for strict verification; strict verification
checks that the Benes network structure matches our implementation exactly (otherwise
the verifier just ensures that the output is a permutation of the input).
#### Decrypt the output
java -jar mixer.jar -k ecelgamal.key -d -i mixed.enc -o decrypted.txt