lsd0003

LSD0003: Set Union
Log | Files | Refs | README

commit 90c4bf1f53a952919f814d1d1b6307db0e357611
parent 05b35b44d9de5b5c535d3e46652734c048674d48
Author: Elias Summermatter <elias.summermatter@seccom.ch>
Date:   Sat, 23 Jan 2021 13:35:51 +0100

Fixed some stuff

Diffstat:
Mdraft-summermatter-set-union.xml | 324++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 221 insertions(+), 103 deletions(-)

diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml @@ -81,7 +81,7 @@ Our primary envisioned application domain is the distribution of revocation messages in the GNU Name - System (GNS) <!-- TODO: citate: https://git.gnunet.org/bibliography.git/plain/docs/gns2014wachs.pdf -->. In GNS, + System (GNS) <xref target="GNUNET" format="default" />. In GNS, key revocation messages are usually flooded across the peer-to-peer overlay network to all connected peers whenever a key is revoked. However, as peers may be @@ -96,8 +96,8 @@ in this specification are Byzantine fault-tolerant bulletin boards, like those required in some secure multiparty computations. A well-known example for - secure multiparty computations are <!-- TODO: citate: https://git.gnunet.org/bibliography.git/plain/docs/ba_dold_voting_24aug2014.pdf --> various E-voting - protocols which + secure multiparty computations are various E-voting + protocols <xref target="CryptographicallySecureVoting" format="default"/> which use a bulletin board to share the votes and intermediate computational results. We note that for such systems, the set reconciliation protocol is merely a component of @@ -125,7 +125,7 @@ is one major choice to be made, which is between sending the full set of elements, or just sending the elements that differ. In the latter case, our design is basically a concrete - implementation of a proposal by Eppstein. <!-- TODO: citate: https://www.ics.uci.edu/~eppstein/pubs/EppGooUye-SIGCOMM-11.pdf --> + implementation of a proposal by Eppstein.<xref target="Eppstein" format="default" /> </t> <t> @@ -382,9 +382,9 @@ hashSum | HASHSUM | HASHSUM | HASHSUM | HASHSUM | H.. +-------------+-------------+-------------+-------------+ count | 0 | 0 | 0 | 0 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0000 | 0000 | 0000 | + idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | +-------------+-------------+-------------+-------------+ -hashSum | 0000 | 0000 | 0000 | 0000 | +hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -395,9 +395,9 @@ hashSum | 0000 | 0000 | 0000 | 0000 | +-------------+-------------+-------------+-------------+ count | 0 | 1 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0x0102 | 0000 | 0x0102 | + idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0000 | 0x4242 | 0000 | 0x4242 | +hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -408,9 +408,9 @@ hashSum | 0000 | 0x4242 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 1 | 2 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0x0304 | 0x0206 | 0000 | 0x0102 | + idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0101 | 0x4343 | 0000 | 0x4242 | +hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -433,9 +433,9 @@ hashSum | 0101 | 0x4343 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 1 | 2 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0x0304 | 0x0206 | 0000 | 0x0102 | + idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0x0101 | 0x4343 | 0000 | 0x4242 | +hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -446,9 +446,9 @@ hashSum | 0x0101 | 0x4343 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 0 | 1 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0x0102 | 0000 | 0x0102 | + idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0000 | 0x4242 | 0000 | 0x4242 | +hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -530,9 +530,9 @@ hashSum | 0000 | 0x4242 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 1 | 2 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0x0304 | 0x0206 | 0000 | 0x0102 | + idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0x0101 | 0x4343 | 0000 | 0x4242 | +hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -546,9 +546,9 @@ hashSum | 0x0101 | 0x4343 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 0 | 1 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0x0102 | 0000 | 0x0102 | + idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0000 | 0x4242 | 0000 | 0x4242 | +hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -563,9 +563,9 @@ hashSum | 0000 | 0x4242 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 0 | 0 | 0 | 0 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0000 | 0000 | 0000 | + idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | +-------------+-------------+-------------+-------------+ -hashSum | 0000 | 0000 | 0000 | 0000 | +hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -603,9 +603,9 @@ hashSum | 0000 | 0000 | 0000 | 0000 | +-------------+-------------+-------------+-------------+ count | 1 | 2 | 0 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0x0304 | 0x0206 | 0000 | 0x0102 | + idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | +hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -616,9 +616,9 @@ hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 0 | 1 | 1 | 1 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0x0102 | 0x1345 | 0x0102 | + idSum | 0x0000 | 0x0102 | 0x1345 | 0x0102 | +-------------+-------------+-------------+-------------+ -hashSum | 0000 | 0x4242 | 0x5050 | 0x4242 | +hashSum | 0x0000 | 0x4242 | 0x5050 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -629,9 +629,9 @@ hashSum | 0000 | 0x4242 | 0x5050 | 0x4242 | +-------------+-------------+-------------+-------------+ count | 1 | 1 | -1 | 0 | +-------------+-------------+-------------+-------------+ - idSum | 0000 | 0x0304 | 0x1345 | 0000 | + idSum | 0x0304 | 0x0304 | 0x1345 | 0x0000 | +-------------+-------------+-------------+-------------+ -hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | +hashSum | 0x0101 | 0x0101 | 0x5050 | 0x0000 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -641,7 +641,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | removed by the set difference operation (bit-4). </t> </section> - </section> <section anchor="ibf_format" numbered="true" toc="default"> @@ -691,8 +690,117 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | also again a reasonable size for many CPU architectures. </t> + <section anchor="ibf_format_id_generation" numbered="true" toc="default"> + <name>ID Calculation</name> + <t> + The ID is generated as 64-bit output from a <relref section="2" target="RFC5869" displayFormat="of">HKDF construction</relref> + with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF and salt is set to the unsigned 64-bit equivalent of 0. + The output is then truncated to 64-bit. + Its important that the elements can be redistributed over the buckets in case the IBF does not + decode, that's why the ID is salted with a random salt given in the SALT field of this message. + Salting is done by calculation the a random salt modulo 64 (using only the lowest 6-bits of the salt) + and do a bitwise right rotation of output of KDF by the 6-bit salts numeric representation. + </t> + <t> + Representation in pseudocode: + </t> + <figure anchor="ibf_format_id_generation_pseudo_code"> + <artwork name="" type="" align="left" alt=""><![CDATA[ +# INPUTS: +# key: Pre calculated and truncated key from id_calculation function +# ibf_salt: Salt of the IBF +# OUTPUT: +# value: salted key +FUNCTION salt_key(key,ibf_salt): + s = ibf_salt % 64; + k = key + + /* rotate ibf key */ + k = (k >> s) | (k << (64 - k)) + return key + + +# INPUTS: +# element: Element to calculated id from. +# salt: Salt of the IBF +# OUTPUT: +# value: the ID of the element + +FUNCTION id_calculation (element,ibf_salt): + salt = 0 + XTR=HMAC-SHA256 + PRF=HMAC-SHA256 + key = HKDF(XTR, PRF, salt, element) + key = key modulo 2^64 // Truncate + return salt_key(key,ibf_salt) + + + ]]></artwork> + </figure> + </section> + <section anchor="ibf_format_bucket_identification" numbered="true" toc="default"> + <name>Mapping Function</name> + <t> + The mapping function M as described above in the figure <xref target="bf_mapping_function_math" format="default" /> + decides in which buckets the ID and HASH have to be binary XORed to. In practice + there the following algorithm is used: + </t> + <t> + The first index is simply the HASH modulo the IBF size. The second + index is calculated by creating a new 64-bit value by shifting the 32-bit + value left and setting the lower 32-bit to the number of indexes already processed. From the + resulting 64-bit value a CRC32 checksum is created the second index is now the modulo of the + CRC32 output this is repeated until the predefined amount indexes is generated. + In the case a index is hit twice, which would mean this bucket could not get pure again, + the second hit is just skipped and the next iteration is used as. + </t> + <figure anchor="ibf_format_bucket_identification_pseudo_code"> + <artwork name="" type="" align="left" alt=""><![CDATA[ +# INPUTS: +# key: Is the ID of the element calculated in the id_calculation function above. +# number_of_buckets_per_element: Pre-defined count of buckets elements are inserted into +# ibf_size: the size of the ibf (count of buckets) +# OUTPUT: +# dst: Array with bucket IDs to insert ID and HASH + +FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size) + bucket = CRC32(key) + + i = 0 + filled = 0 + WHILE filled < number_of_buckets_per_element + + element_already_in_bucket = false + j = 0 + WHILE j < filled + IF dst[j] == bucket modulo ibf_size THEN + element_already_in_bucket = true + ENDIF + j++ + ENDWHILE + + IF !element_already_in_bucket THEN + dst[filled++] = bucket modulo ibf_size + ENDIF + + x = (bucket << 32) | i + bucket = CRC32(x) + + i++ + ENDWHILE + return dst + ]]></artwork> + </figure> + </section> + <section anchor="ibf_format_HASH_calculation" numbered="true" toc="default"> + <name>HASH calculation</name> + <t> + The HASH is calculated by calculating the CRC32 checksum of the 64-bit ID value + which returns a 32-bit value. + </t> </section> </section> + </section> <section anchor="se" numbered="true" toc="default"> <name>Strata Estimator</name> @@ -1056,7 +1164,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1121,7 +1228,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1161,67 +1267,17 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | 32768 divided by the BUCKET_SIZE which is 13-byte (104-bit). </t> <t> - The ID is generated as 64-bit output form a KDF (HMAC-SHA512 as XTR and HMAC-SHA256 as PRF, - the output from SHA-512 is then truncated to 64 bits, salt of the KDF is always set to = 0). - Its important that the elements can be redistributed over the buckets in case the IBF does not - decode, that's why the ID is salted with a random salt given in the SALT field of this message. - Salting is done by calculation the a random salt modulo 64 (using only the lowest 6-bits of the salt) - and do a bitwise right rotation of output of KDF by the 6-bit salts numeric representation. To - get the IDSUM field all IDs who hit a bucket are added up with a binary XOR operation. + To get the IDSUM field, all IDs who hit a bucket are added up with a binary XOR operation. + See <xref target="ibf_format_id_generation" format="title" /> for details about ID generation. </t> <t> - The HASH is calculated by calculating the CRC32 checksum of the 64-bit ID value - which returns a 32-bit value. To get the HASHSUM field all IDs are added - up with a binary XOR operation. - + The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field: + all HASHes are added up with a binary XOR operation. + The HASH value is calculated as described in detail in section <xref target="ibf_format_HASH_calculation" format="title" />. </t> <t> - To decide in which buckets the ID and HASH have to be added up there is the following - algorithm used: The first index is simply the HASH modulo the IBF size. The second - index is calculated by creating a new 64-bit value by shifting the 32-bit - value left and setting the lower 32-bit to the number of indexes already processed. From the - resulting 64-bit value a CRC32 checksum is created the second index is now the modulo of the - CRC32 output this is repeated until the predefined amount indexes is generated. - In the case a index is hit twice, which would mean this bucket could not get pure again, - the second hit is just skipped and the next iteration is used as. - - <!-- @Christian: I dont have a clue how this is done... The code is very hard to read can you explain the - salt_key function in gnunet-service-set_union.c file - - CG: Eh, you should really read in setu/, not in set/. Alas, this function is the same. - - The goal of salt_key is to modify a given IBF key based on the salt (so we - end up with different material depending on the salt). We only use the - lowest 6 bits of the salt (hopefully enough!): - - int s = salt % 64; - uint64_t x = k_in->key_val; // that's the real input: at 64 bit value - - /* rotate ibf key */ - x = (x >> s) | (x << (64 - s)); // We simply to a bit rotation by 's' - k_out->key_val = x; // That's the final output. - - In some languages, this would simply be: - result = (input <<< salt); // bit rotation operator - - @Christian and the ibf_insert_into (why exponentiation? ^=) in the ibf.c - - CG: ^= is XOR in C (A ^=B; is basically "A := A XOR B"), _not_ exponentiation! - - The only thing i found out was that the Hashsum in the current implementation is calculated with CRC32. - - Not just. We take the 64-bit ID value. First, using 'CRC32' to compute - a 32-byte value. Take that modulo % buckets to get a bucket index. - Then shift the 32 bits high, or with 'i' (loop!) to set 32 low bits, - and again CRC32 to compute the next bucket. *IF* this formula returns - the same bucket twice, skip (so we guarantee hash_num disjoint buckets). - Note that the code had a bug, pushing a fix now: - - if (dst[j] == bucket) - must be - if (dst[j] == bucket % ibf->size) - - --> + The algorithm to find the correct bucket in which the ID and the HASH have to be added + is described in detail in section <xref target="ibf_format_bucket_identification" format="title" />. </t> <!-- @@ -1306,7 +1362,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1374,7 +1429,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1423,7 +1477,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | | IBF KEY | +-----+-----+-----+-----+-----+-----+-----+-----+ ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1470,7 +1523,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1519,7 +1571,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | | HASH +-----+-----+-----+-----+ ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1563,7 +1614,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | | MSG SIZE | MSG TYPE | +-----+-----+-----+-----+ ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1605,7 +1655,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | | MSG SIZE | MSG TYPE | +-----+-----+-----+-----+ ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1652,7 +1701,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1726,7 +1774,6 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0000 | / / / / ]]></artwork> - <!-- <postamble>which is a very simple example.</postamble>--> </figure> <t>where:</t> <dl> @@ -1828,11 +1875,6 @@ Type | Name | References | Description 570 | SETU_P2P_FULL_DONE | [This.I-D] | All elements in full synchronization mode have been send is done. 571 | SETU_P2P_FULL_ELEMENT | [This.I-D] | Send an actual element in full synchronization mode. - - - - - ]]></artwork> </figure> </section> @@ -1877,6 +1919,49 @@ Type | Name | References | Description </front> </reference> + <reference anchor="CryptographicallySecureVoting" target="https://git.gnunet.org/bibliography.git/plain/docs/ba_dold_voting_24aug2014.pdf"> + <front> + <title>Cryptographically Secure, DistributedElectronic Voting</title> + <author initials="F." surname="Dold" fullname="Florian Dold"> + <organization>Technische Universität München</organization> + </author> + </front> + </reference> + + + <reference anchor="GNUNET" target="https://git.gnunet.org/bibliography.git/plain/docs/gns2014wachs.pdf"> + <front> + <title>A Censorship-Resistant, Privacy-Enhancing andFully Decentralized Name System</title> + <author initials="M." surname="Wachs" fullname="Matthias Wachs"> + <organization>Technische Universität München</organization> + </author> + <author initials="M." surname="Schanzenbach" fullname="Martin Schanzenbach"> + <organization>Technische Universität München</organization> + </author> + <author initials="C." surname="Grothoff" fullname="Christian Grothoff"> + <organization>Technische Universität München</organization> + </author> + </front> + </reference> + + <reference anchor="Eppstein" target="https://doi.org/10.1145/2018436.2018462"> + <front> + <title>What’s the Difference? Efficient Set Reconciliation without Prior Context</title> + <author initials="D." surname="Eppstein" fullname="David Eppstein"> + <organization>U.C. Irvine</organization> + </author> + <author initials="M." surname="Goodrich" fullname="Michael T. Goodrich"> + <organization>U.C. Irvine</organization> + </author> + <author initials="F." surname="Uyeda" fullname="Frank Uyeda"> + <organization>U.C. San Diego</organization> + </author> + <author initials="G." surname="Varghese" fullname="George Varghese"> + <organization>U.C. San Diego</organization> + </author> + </front> + </reference> + <reference anchor="GNS" target="https://doi.org/10.1007/978-3-319-12280-9_9"> <front> <title>A Censorship-Resistant, Privacy-Enhancing and Fully Decentralized Name System</title> @@ -2007,8 +2092,41 @@ Type | Name | References | Description </front> </reference>--> </references> - <!-- Change Log - v00 2017-07-23 MS Initial version - --> + <section anchor="test_vectors" numbered="true" toc="default"> + <name>Test Vectors</name> + <section anchor="test_vectors_map_function" numbered="true" toc="default"> + <name>Map Function</name> + <t> + INPUTS: + </t> + <t> + key: 0xFFFFFFFFFFFFFFFF (64-bit)<br/> + number_of_buckets_per_element: 3<br/> + ibf_size: 300 + </t> + <t> + OUTPUT: + </t> + <t> + ["222","32","10"] + </t> + </section> + <section anchor="test_vectors_id_function" numbered="true" toc="default"> + <name>ID Calculation Function</name> + <t> + INPUTS: + </t> + <t> + element: 0xadadadadadadadad + ibf_salt 0x3F (6-bit) + </t> + <t> + OUTPUT: + </t> + <t> + 0xFFFFFFFFFFFFFFFF + </t> + </section> + </section> </back> </rfc>