aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--draft-summermatter-set-union.xml590
1 files changed, 404 insertions, 186 deletions
diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml
index 8ae00af..9074380 100644
--- a/draft-summermatter-set-union.xml
+++ b/draft-summermatter-set-union.xml
@@ -106,7 +106,7 @@
106 </t> 106 </t>
107 <t> 107 <t>
108 The protocol described in this report is generic and 108 The protocol described in this report is generic and
109 suitable for a wide range of applicaitons. As a result, 109 suitable for a wide range of applications. As a result,
110 the internal structure of the elements in the sets must 110 the internal structure of the elements in the sets must
111 be defined and verified by the application using the 111 be defined and verified by the application using the
112 protocol. This document thus does not cover the element 112 protocol. This document thus does not cover the element
@@ -121,9 +121,9 @@
121 using the protocol must provide a parameter that specifies 121 using the protocol must provide a parameter that specifies
122 the cost-ratio of round-trips vs. bandwidth usage. Given 122 the cost-ratio of round-trips vs. bandwidth usage. Given
123 this trade-off factor, the protocol will then choose parameters 123 this trade-off factor, the protocol will then choose parameters
124 that minimize the total execution cost. In particular, there 124 that minimize the total execution costs. In particular, there
125 is one major choice to be made, which is between sending the 125 is one major choice to be made, namely between sending the
126 full set of elements, or just sending the elements that differ. 126 complete set of elements, or sending only the elements that differ.
127 In the latter case, our design is basically a concrete 127 In the latter case, our design is basically a concrete
128 implementation of a proposal by Eppstein.<xref target="Eppstein" format="default" /> 128 implementation of a proposal by Eppstein.<xref target="Eppstein" format="default" />
129 </t> 129 </t>
@@ -152,7 +152,7 @@
152 <t> 152 <t>
153 To defend against some of these attacks, applications 153 To defend against some of these attacks, applications
154 need to remember the number of elements previously 154 need to remember the number of elements previously
155 shared with a peer, and offer a means to check that 155 shared with a peer, and provide a way to check that
156 elements are well-formed. Applications may also be able 156 elements are well-formed. Applications may also be able
157 to provide an upper bound on the total number of valid 157 to provide an upper bound on the total number of valid
158 elements that may exist. For example, in E-voting, the 158 elements that may exist. For example, in E-voting, the
@@ -179,14 +179,14 @@
179 <section anchor="bf" numbered="true" toc="default"> 179 <section anchor="bf" numbered="true" toc="default">
180 <name>Bloom Filters</name> 180 <name>Bloom Filters</name>
181 <t> 181 <t>
182 A Bloom filter (BF) is a space-efficient datastructure to test if am element is part of a set of elements. 182 A Bloom filter (BF) is a space-efficient datastructure to test if an element is part of a set of elements.
183 Elements are identified by an element ID. 183 Elements are identified by an element ID.
184 Since a BF is a probabilistic datastructure, it is possible to have false-positives: when asked 184 Since a BF is a probabilistic datastructure, it is possible to have false-positives: when asked
185 if an element is in the set, the answer from a BF is either "no" or "maybe". 185 if an element is in the set, the answer from a BF is either "no" or "maybe".
186 </t> 186 </t>
187 <t> 187 <t>
188 A BF consists of L buckets. Every bucket is a binary value that can be either 0 or 1. All buckets are initialized 188 A BF consists of L buckets. Every bucket is a binary value that can be either 0 or 1. All buckets are initialized
189 to 0. A mapping function M is used to map each the ID of each element from the set to a subset of k buckets. M is non-injective 189 to 0. A mapping function M is used to map each ID of each element from the set to a subset of k buckets. M is non-injective
190 and can thus map the same element multiple times to the same bucket. 190 and can thus map the same element multiple times to the same bucket.
191 The type of the mapping function can thus be described by the following mathematical notation: 191 The type of the mapping function can thus be described by the following mathematical notation:
192 </t> 192 </t>
@@ -214,7 +214,7 @@
214 To check if an element may be in the set, one tests if all buckets under the map M are set to 1. 214 To check if an element may be in the set, one tests if all buckets under the map M are set to 1.
215 </t> 215 </t>
216 <t> 216 <t>
217 Further in this document a bitstream outputted by the mapping function is represented by 217 Further in this document a bitstream output by the mapping function is represented by
218 a set of numeric values for example (0101) = (2,4). 218 a set of numeric values for example (0101) = (2,4).
219 In the BF the buckets are set to 1 if the corresponding bit in the bitstream is 1. 219 In the BF the buckets are set to 1 if the corresponding bit in the bitstream is 1.
220 If there is a collision and a bucket is already set to 1, the bucket stays 1. 220 If there is a collision and a bucket is already set to 1, the bucket stays 1.
@@ -231,8 +231,8 @@
231 ]]></artwork> 231 ]]></artwork>
232 </figure> 232 </figure>
233 <t> 233 <t>
234 Is easy to see that the M(element) = (0,3) could be in the BF bellow and M(element) = (0,2) can't be 234 Is easy to see that the M(element) = (0,3) could be in the BF below and M(element) = (0,2) can't be
235 in the BF bellow: 235 in the BF below:
236 </t> 236 </t>
237 237
238 <figure anchor="figure_bf_contains"> 238 <figure anchor="figure_bf_contains">
@@ -259,7 +259,7 @@
259 <name>Counting Bloom Filter</name> 259 <name>Counting Bloom Filter</name>
260 <t> 260 <t>
261 A Counting Bloom Filter (CBF) is an extension of the <xref target="bf" format="title" />. In the CBF, buckets are 261 A Counting Bloom Filter (CBF) is an extension of the <xref target="bf" format="title" />. In the CBF, buckets are
262 unsigned numbers instead of binary values. This allows the removal of an elements from the CBF. 262 unsigned numbers instead of binary values. This allows the removal of an element from the CBF.
263 </t> 263 </t>
264 <t> 264 <t>
265 Adding an element to the CBF is similar to the adding operation of the BF. However, instead of setting the bucket on hit to 1 the 265 Adding an element to the CBF is similar to the adding operation of the BF. However, instead of setting the bucket on hit to 1 the
@@ -294,7 +294,7 @@
294 </figure> 294 </figure>
295 <t> 295 <t>
296 In practice, the number of bits available for the counters is usually finite. For example, given a 4-bit 296 In practice, the number of bits available for the counters is usually finite. For example, given a 4-bit
297 counter, a CBF bucket would overflow once 16 elements are mapped to the same bucket. To efficiently 297 counter, a CBF bucket would overflow 16 elements are mapped to the same bucket. To efficiently
298 handle this case, the maximum value (15 in our example) is considered to represent "infinity". Once the 298 handle this case, the maximum value (15 in our example) is considered to represent "infinity". Once the
299 order of a bucket reaches "infinity", it is no longer incremented or decremented. 299 order of a bucket reaches "infinity", it is no longer incremented or decremented.
300 </t> 300 </t>
@@ -328,7 +328,7 @@
328 including the CPU architecture. 328 including the CPU architecture.
329 </t> 329 </t>
330 <t> 330 <t>
331 If the IBF size is to small or the mapping 331 If the IBF size is too small or the mapping
332 function does not spread out the elements 332 function does not spread out the elements
333 uniformly, the signed counter can overflow or 333 uniformly, the signed counter can overflow or
334 underflow. As with the CBF, the "maximum" value is 334 underflow. As with the CBF, the "maximum" value is
@@ -456,11 +456,11 @@ hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
456 Note that it is possible to "remove" elements from an IBF that were never present 456 Note that it is possible to "remove" elements from an IBF that were never present
457 in the IBF in the first place. A negative counter value is thus indicative of 457 in the IBF in the first place. A negative counter value is thus indicative of
458 elements that were removed without having been added. Note that an IBF bucket 458 elements that were removed without having been added. Note that an IBF bucket
459 counter of zero no longer warrants that an element mapped to that bucket is not 459 counter of zero no longer guarantees that an element mapped to that bucket is not
460 present in the set: a bucket with a counter of zero can be the result of one 460 present in the set: a bucket with a counter of zero can be the result of one
461 element being added and a different element (mapped to the same bucket) being removed. 461 element being added and a different element (mapped to the same bucket) being removed.
462 To check that an element is not present requires a counter of zero and an 462 To check that an element is not present requires a counter of zero and an
463 IDSUM and HASHSUM of zero --- and some assurance that there was no collision due 463 IDSUM and HASHSUM of zero --- and some certainty that there was no collision due
464 to the limited number of bits in IDSUM and HASHSUM. Thus, 464 to the limited number of bits in IDSUM and HASHSUM. Thus,
465 IBFs are not suitable to replace BFs or IBFs. 465 IBFs are not suitable to replace BFs or IBFs.
466 </t> 466 </t>
@@ -504,7 +504,7 @@ hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
504 a different mapping function M. 504 a different mapping function M.
505 A more common scenario (especially if L was too small) is that 505 A more common scenario (especially if L was too small) is that
506 IBF decoding fails because there is no pure bucket. In this case, the 506 IBF decoding fails because there is no pure bucket. In this case, the
507 higher-level protocol also should retry using different parameters. 507 higher-level protocol should also retry using different parameters.
508 </t> 508 </t>
509 <t> 509 <t>
510 Suppose the IBF contains a pure bucket. In this case, the IDSUM in the 510 Suppose the IBF contains a pure bucket. In this case, the IDSUM in the
@@ -513,9 +513,9 @@ hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
513 was negative, and by removing it if the counter was positive). This 513 was negative, and by removing it if the counter was positive). This
514 is likely to cause other buckets to become pure, allowing further 514 is likely to cause other buckets to become pure, allowing further
515 elements to be decoded. Eventually, decoding should succeed with 515 elements to be decoded. Eventually, decoding should succeed with
516 all counters and IDSUM and HASHSUM values reaching zero. However, it is also 516 all counters and IDSUM and HASHSUM values reach zero. However, it is also
517 possible that an IBF only partly decodes and then decoding fails after 517 possible that an IBF only partly decodes and then decoding fails after
518 yielding some elements. 518 obtaining some elements.
519 </t> 519 </t>
520 <t> 520 <t>
521 In the following example the successful decoding of an IBF containing 521 In the following example the successful decoding of an IBF containing
@@ -554,7 +554,7 @@ hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
554 </figure> 554 </figure>
555 <t> 555 <t>
556 In the IBF only pure buckets are left, we choose to continue decoding bucket 2 and decode element 556 In the IBF only pure buckets are left, we choose to continue decoding bucket 2 and decode element
557 with the hash 0x4242. Now the IBF is empty (all buckets have count 0) that means the IBF has successfully 557 with the hash 0x4242. Now the IBF is empty (all buckets have count 0) that means the IBF has been successfully
558 decoded. 558 decoded.
559 </t> 559 </t>
560 <figure anchor="figure_ibf_decode_2"> 560 <figure anchor="figure_ibf_decode_2">
@@ -577,7 +577,7 @@ hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 |
577 Given addition and removal as defined above, it is possible to define an operation on IBFs 577 Given addition and removal as defined above, it is possible to define an operation on IBFs
578 that computes an IBF representing the set difference. Suppose IBF1 represents set A, and 578 that computes an IBF representing the set difference. Suppose IBF1 represents set A, and
579 IBF2 represents set B. Then this set difference operation will compute IBF3 which 579 IBF2 represents set B. Then this set difference operation will compute IBF3 which
580 represents the set A - B --- without needing elements from set A or B. 580 represents the set A - B --- without having to transfer the elements from set A or B.
581 581
582 To calculate the IBF representing this set difference, both IBFs must have the same 582 To calculate the IBF representing this set difference, both IBFs must have the same
583 length L, the same number of buckets per element k and use the same map M. Given this, 583 length L, the same number of buckets per element k and use the same map M. Given this,
@@ -635,7 +635,7 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0x0000 |
635 +-------------+-------------+-------------+-------------+ 635 +-------------+-------------+-------------+-------------+
636 ]]></artwork> 636 ]]></artwork>
637 </figure> 637 </figure>
638 <t>After calculating and decoding the IBF-AB its clear that in IBF-A the element with the hash 0x5050 638 <t>After calculating and decoding the IBF-AB shows clear that in IBF-A the element with the hash 0x5050
639 is missing (-1 in bit-3) while in IBF-B the element with the hash 0101 is missing 639 is missing (-1 in bit-3) while in IBF-B the element with the hash 0101 is missing
640 (1 in bit-1 and bit-2). The element with hash 0x4242 is present in IBF-A and IBF-B and is 640 (1 in bit-1 and bit-2). The element with hash 0x4242 is present in IBF-A and IBF-B and is
641 removed by the set difference operation (bit-4). 641 removed by the set difference operation (bit-4).
@@ -648,8 +648,8 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0x0000 |
648 <t> 648 <t>
649 To facilitate a reasonably CPU-efficient 649 To facilitate a reasonably CPU-efficient
650 implementation, this specification requires the 650 implementation, this specification requires the
651 IBF counter to always use 8 bits. Fewer bits 651 IBF counter always to use 8 bits. Fewer bits
652 would result in a paritcularly inefficient 652 would result in a particularly inefficient
653 implementation, while more bits are rarely useful 653 implementation, while more bits are rarely useful
654 as sets with so many elements should likely be 654 as sets with so many elements should likely be
655 represented using a larger number of buckets. This 655 represented using a larger number of buckets. This
@@ -672,17 +672,17 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0x0000 |
672 </t> 672 </t>
673 <t> 673 <t>
674 For the "HASHSUM", we always use a 32-bit 674 For the "HASHSUM", we always use a 32-bit
675 representation. Here, it is mostly important to 675 representation. Here, it is most important to
676 avoid collisions, where different elements are 676 avoid collisions, where different elements are
677 mapped to the same hash. However, we note that 677 mapped to the same hash. However, we note that
678 by design only a few elements (certainly less than 678 by design only a few elements (certainly less than
679 127) should ever be mapped 679 127) should ever be mapped
680 to the same bucket, so a small number of bits 680 to the same bucket, a small number of bits
681 should suffice. Furthermore, our protocol is designed 681 should suffice. Furthermore, our protocol is designed
682 to handle occasional collisions, so while with 682 to handle occasional collisions, so while with
683 32-bits there remains a chance of 683 32-bits there remains a chance of
684 accidental collisions, at 32 bit the chance is 684 accidental collisions, at 32 bit the chance is
685 generally believed to be sufficiently small enough 685 generally believed to be sufficiently small
686 for the protocol to handle those cases efficiently 686 for the protocol to handle those cases efficiently
687 for a wide range of use-cases. Smaller hash 687 for a wide range of use-cases. Smaller hash
688 values would safe bandwidth, but also drastically 688 values would safe bandwidth, but also drastically
@@ -697,7 +697,7 @@ hashSum | 0x0101 | 0x0101 | 0x5050 | 0x0000 |
697 with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF and salt is set to the unsigned 64-bit equivalent of 0. 697 with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF and salt is set to the unsigned 64-bit equivalent of 0.
698 The output is then truncated to 64-bit. 698 The output is then truncated to 64-bit.
699 Its important that the elements can be redistributed over the buckets in case the IBF does not 699 Its important that the elements can be redistributed over the buckets in case the IBF does not
700 decode, that's why the ID is salted with a random salt given in the SALT field of this message. 700 decode. That's why the ID is salted with a random salt given in the SALT field of this message.
701 Salting is done by calculation the a random salt modulo 64 (using only the lowest 6-bits of the salt) 701 Salting is done by calculation the a random salt modulo 64 (using only the lowest 6-bits of the salt)
702 and do a bitwise right rotation of output of KDF by the 6-bit salts numeric representation. 702 and do a bitwise right rotation of output of KDF by the 6-bit salts numeric representation.
703 </t> 703 </t>
@@ -743,22 +743,22 @@ FUNCTION id_calculation (element,ibf_salt):
743 <t> 743 <t>
744 The mapping function M as described above in the figure <xref target="bf_mapping_function_math" format="default" /> 744 The mapping function M as described above in the figure <xref target="bf_mapping_function_math" format="default" />
745 decides in which buckets the ID and HASH have to be binary XORed to. In practice 745 decides in which buckets the ID and HASH have to be binary XORed to. In practice
746 there the following algorithm is used: 746 the following algorithm is used:
747 </t> 747 </t>
748 <t> 748 <t>
749 The first index is simply the HASH modulo the IBF size. The second 749 The first index is simply the HASH modulo the IBF size. The second
750 index is calculated by creating a new 64-bit value by shifting the 32-bit 750 index is calculated by creating a new 64-bit value by shifting the 32-bit
751 value left and setting the lower 32-bit to the number of indexes already processed. From the 751 value left and setting the lower 32-bit to the number of indexes already processed. From the
752 resulting 64-bit value a CRC32 checksum is created the second index is now the modulo of the 752 resulting 64-bit value a CRC32 checksum is created. The second index is now the modulo of the
753 CRC32 output this is repeated until the predefined amount indexes is generated. 753 CRC32 output, this is repeated until the predefined amount of indexes is generated.
754 In the case a index is hit twice, which would mean this bucket could not get pure again, 754 In the case a index is hit twice, which would mean this bucket could not get pure again,
755 the second hit is just skipped and the next iteration is used as. 755 the second hit is just skipped and the next iteration is used.
756 </t> 756 </t>
757 <figure anchor="ibf_format_bucket_identification_pseudo_code"> 757 <figure anchor="ibf_format_bucket_identification_pseudo_code">
758 <artwork name="" type="" align="left" alt=""><![CDATA[ 758 <artwork name="" type="" align="left" alt=""><![CDATA[
759# INPUTS: 759# INPUTS:
760# key: Is the ID of the element calculated in the id_calculation function above. 760# key: Is the ID of the element calculated in the id_calculation function above.
761# number_of_buckets_per_element: Pre-defined count of buckets elements are inserted into 761# number_of_buckets_per_element: Pre-defined numbers of buckets elements are inserted into
762# ibf_size: the size of the ibf (count of buckets) 762# ibf_size: the size of the ibf (count of buckets)
763# OUTPUT: 763# OUTPUT:
764# dst: Array with bucket IDs to insert ID and HASH 764# dst: Array with bucket IDs to insert ID and HASH
@@ -807,7 +807,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
807 <section anchor="se_description" numbered="true" toc="default"> 807 <section anchor="se_description" numbered="true" toc="default">
808 <name>Description</name> 808 <name>Description</name>
809 <t> 809 <t>
810 Strata Estimators help estimate the size of the set difference between two set of elements. 810 Strata Estimators help estimate the size of the set difference between two sets of elements.
811 This is necessary to efficiently determinate the tuning parameters for an IBF, in particular 811 This is necessary to efficiently determinate the tuning parameters for an IBF, in particular
812 a good value for L. 812 a good value for L.
813 </t> 813 </t>
@@ -845,7 +845,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
845 <t> 845 <t>
846 The simplest mode is the "full" synchronization mode. The idea is that if the difference between the sets of the two 846 The simplest mode is the "full" synchronization mode. The idea is that if the difference between the sets of the two
847 peers exceeds a certain threshold, the overhead to determine which elements are different outweighs 847 peers exceeds a certain threshold, the overhead to determine which elements are different outweighs
848 the overhead of sending the complete set. In this case, the most efficient method can be to just 848 the overhead of sending the complete set. In this case, the most efficient method can be just to
849 exchange the full sets. 849 exchange the full sets.
850 </t> 850 </t>
851 <t> 851 <t>
@@ -854,7 +854,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
854 </t> 854 </t>
855 <t> 855 <t>
856 The second possibility is that the difference of the sets is small compared to the set size. 856 The second possibility is that the difference of the sets is small compared to the set size.
857 Here, an efficient "delta" synchronization mode is more efficient. Given these two possibilities, 857 In this case, an efficient "delta" synchronization mode is more efficient. These two possibilities given,
858 the first steps of the protocol are used to determine which mode should be used. 858 the first steps of the protocol are used to determine which mode should be used.
859 </t> 859 </t>
860 860
@@ -920,7 +920,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
920 <em><xref target="messages_ibf" format="title" /></em> message from 920 <em><xref target="messages_ibf" format="title" /></em> message from
921 the initiating peer and transitions into the <strong>Expecting IBF Last</strong> state when there 921 the initiating peer and transitions into the <strong>Expecting IBF Last</strong> state when there
922 are multiple <em><xref target="messages_ibf" format="title" /></em> messages to sent, 922 are multiple <em><xref target="messages_ibf" format="title" /></em> messages to sent,
923 when there is just a single <em><xref target="messages_ibf" format="title" /></em> message the reviving peer 923 when there is just a single <em><xref target="messages_ibf" format="title" /></em> message the receiving peer
924 transitions directly to the <strong>Active Decoding</strong> state. 924 transitions directly to the <strong>Active Decoding</strong> state.
925 </t> 925 </t>
926 <t> 926 <t>
@@ -975,8 +975,8 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
975 When a new element message has been received the peer checks if a corresponding 975 When a new element message has been received the peer checks if a corresponding
976 <em><xref target="messages_demand" format="title" /></em> for the element has been sent 976 <em><xref target="messages_demand" format="title" /></em> for the element has been sent
977 and the demand is still unsatisfied. 977 and the demand is still unsatisfied.
978 If the element has been demanded the peer checks the element for validity, removed it from the list 978 If the element has been demanded the peer checks the element for validity, removes it from the list
979 of pending demands and then then saves the element to the the set otherwise the peer 979 of pending demands and then saves the element to the set otherwise the peer
980 rejects the element. 980 rejects the element.
981 </dd> 981 </dd>
982 <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt> 982 <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt>
@@ -990,7 +990,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
990 <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt> 990 <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt>
991 <dd> 991 <dd>
992 If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this 992 If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this
993 indicates that the there is just one IBF slice and a direct state and role transition from 993 indicates that there is just one IBF slice left and a direct state and role transition from
994 <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated. 994 <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated.
995 </dd> 995 </dd>
996 <dt><em><xref target="messages_done" format="title" /></em> message:</dt> 996 <dt><em><xref target="messages_done" format="title" /></em> message:</dt>
@@ -1017,7 +1017,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1017 sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer. 1017 sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer.
1018 </t> 1018 </t>
1019 <t> 1019 <t>
1020 In case the IBF does not successfully decode anymore, the active peer sends a new IBF to the passive client 1020 In case the IBF does not successfully decode anymore, the active peer sends a new IBF to the passive peer
1021 and changes into <strong>Passive Decoding</strong> state. This initiates a role swap. 1021 and changes into <strong>Passive Decoding</strong> state. This initiates a role swap.
1022 To reduce overhead and prevent double transmission of offers and elements the new IBF is created 1022 To reduce overhead and prevent double transmission of offers and elements the new IBF is created
1023 on the new complete set after all demands and inquiries have been satisfied. 1023 on the new complete set after all demands and inquiries have been satisfied.
@@ -1039,8 +1039,8 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1039 the active peer. If a Inquiry has been sent and <!-- FIXME: is this indeed a condition that is checked? --> 1039 the active peer. If a Inquiry has been sent and <!-- FIXME: is this indeed a condition that is checked? -->
1040 the offered element is missing in the active peers set, 1040 the offered element is missing in the active peers set,
1041 the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the 1041 the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the
1042 passive peer. The send demand need to be added to a list with unsatisfied demands. 1042 passive peer. The sent demand needs to be added to a list with unsatisfied demands.
1043 In the case the received offer is for an element that is already in the set of the peer the offer is ignored. 1043 In case the received offer is for an element that is already in the set of the peer the offer is ignored.
1044 <!-- FIXME: what happens if the offer is for an element that is not missing? I think then we just ignore it, right? --> 1044 <!-- FIXME: what happens if the offer is for an element that is not missing? I think then we just ignore it, right? -->
1045 </dd> 1045 </dd>
1046 <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> 1046 <dt><em><xref target="messages_demand" format="title" /></em> message:</dt>
@@ -1051,9 +1051,9 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1051 <em><xref target="messages_elements" format="title" /></em> message if a offer request 1051 <em><xref target="messages_elements" format="title" /></em> message if a offer request
1052 for the element has been sent. 1052 for the element has been sent.
1053 <!-- IMPLEMENT: This is not implemented in code // Change --> 1053 <!-- IMPLEMENT: This is not implemented in code // Change -->
1054 In the case the demanded element does not exist in the 1054 In case the demanded element does not exist in the
1055 set there was probably a bucket decoded that was not really pure so potentially all <em><xref target="messages_offer" format="title" /></em> 1055 set there was probably a bucket decoded that was not pure so potentially all <em><xref target="messages_offer" format="title" /></em>
1056 and <em><xref target="messages_demand" format="title" /></em> messages sent after are invalid 1056 and <em><xref target="messages_demand" format="title" /></em> messages sent later are invalid
1057 in this case a role change active -> passive with a new IBF is easiest. 1057 in this case a role change active -> passive with a new IBF is easiest.
1058 If a demand for the same element is received multiple times the demands should be 1058 If a demand for the same element is received multiple times the demands should be
1059 discarded. 1059 discarded.
@@ -1062,18 +1062,18 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1062 </dd> 1062 </dd>
1063 <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> 1063 <dt><em><xref target="messages_elements" format="title" /></em> message:</dt>
1064 <dd> 1064 <dd>
1065 A element that is received is marked in the list of demanded elements as satisfied, validated and 1065 An element that is received is marked in the list of demanded elements as satisfied, validated and
1066 saved and not further action is taken. 1066 saved and no further action is taken.
1067 Elements that are not demanded or already known are discarded. 1067 Elements that are not demanded or already known are discarded.
1068 </dd> 1068 </dd>
1069 <dt><em><xref target="messages_done" format="title" /></em> message:</dt> 1069 <dt><em><xref target="messages_done" format="title" /></em> message:</dt>
1070 <dd> 1070 <dd>
1071 Receiving the message <em><xref target="messages_done" format="title" /></em> indicates 1071 Receiving the message <em><xref target="messages_done" format="title" /></em> indicates
1072 that all demands of the passive peer have been satisfied. The active peer then changes into the 1072 that all demands of the passive peer have been satisfied. The active peer then changes into the
1073 state <strong>Finish Closing</strong> state. 1073 <strong>Finish Closing</strong> state.
1074 <!-- IMPLEMENT: This is not implemented in code // Change --> 1074 <!-- IMPLEMENT: This is not implemented in code // Change -->
1075 If the IBF is not finished decoding and the <em><xref target="messages_done" format="title" /></em> 1075 If the IBF has not finished decoding and the <em><xref target="messages_done" format="title" /></em>
1076 is received the other peer is not in compliance with the protocol and the set reconciliation MUST be aborted. 1076 is received, the other peer is not in compliance with the protocol and the set reconciliation MUST be aborted.
1077 <!-- IMPLEMENT: This is not implemented in code // Change --> 1077 <!-- IMPLEMENT: This is not implemented in code // Change -->
1078 </dd> 1078 </dd>
1079 </dl> 1079 </dl>
@@ -1090,7 +1090,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1090 <dd> 1090 <dd>
1091 <t> 1091 <t>
1092 In this states the peers are waiting for all demands to be satisfied and for the synchronisation 1092 In this states the peers are waiting for all demands to be satisfied and for the synchronisation
1093 to be completed. When all demands are satisfied the peer changes into state <strong>Finished</strong>. 1093 to be completed. When all demands are satisfied the peer changes into <strong>Finished</strong>state.
1094 </t> 1094 </t>
1095 </dd> 1095 </dd>
1096 </dl> 1096 </dl>
@@ -1104,7 +1104,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1104 </t> 1104 </t>
1105 <t> 1105 <t>
1106 The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set 1106 The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set
1107 differences or if the byte-size of the elements is large. Is the set difference is estimated to be large 1107 differences or if the byte-size of the elements is large. If the set difference is estimated to be large
1108 the <xref target="modeofoperation_full-sync" format="title" /> is 1108 the <xref target="modeofoperation_full-sync" format="title" /> is
1109 more efficient. The exact heuristics and parameters on which the protocol decides which mode 1109 more efficient. The exact heuristics and parameters on which the protocol decides which mode
1110 should be used are described in the <xref target="performance" format="title" /> section of this document. 1110 should be used are described in the <xref target="performance" format="title" /> section of this document.
@@ -1114,7 +1114,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1114 is always used. 1114 is always used.
1115 The first case is when one of the peers announces having an empty set. This is announced by setting 1115 The first case is when one of the peers announces having an empty set. This is announced by setting
1116 the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0. 1116 the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0.
1117 The second case is if the application requested full synchronization explicitly. 1117 The second case is if the application requests full synchronization explicitly.
1118 This is useful for testing and should not be used in production. 1118 This is useful for testing and should not be used in production.
1119 </t> 1119 </t>
1120 <!-- 1120 <!--
@@ -1169,7 +1169,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1169 <dl> 1169 <dl>
1170 <dt>MSG SIZE</dt> 1170 <dt>MSG SIZE</dt>
1171 <dd> 1171 <dd>
1172 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1172 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1173 </dd> 1173 </dd>
1174 <dt>MSG TYPE</dt> 1174 <dt>MSG TYPE</dt>
1175 <dd> 1175 <dd>
@@ -1209,7 +1209,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1209 The <em>IBF</em> message is sent at the start of the protocol from the initiating peer in the transaction 1209 The <em>IBF</em> message is sent at the start of the protocol from the initiating peer in the transaction
1210 between <strong>Expect SE</strong> -> <strong>Expecting IBF Last</strong> or when the IBF does not 1210 between <strong>Expect SE</strong> -> <strong>Expecting IBF Last</strong> or when the IBF does not
1211 decode and there is a role change in the transition between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>. 1211 decode and there is a role change in the transition between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>.
1212 This message is only sent if there are more than one IBF slice to sent, in the case there is just 1212 This message is only sent if there are more than one IBF slice to be sent, in case there is just
1213 one slice the <xref target="messages_ibf_last" format="title" /> message is sent. 1213 one slice the <xref target="messages_ibf_last" format="title" /> message is sent.
1214 </t> 1214 </t>
1215 </section> 1215 </section>
@@ -1219,12 +1219,12 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1219 <artwork name="" type="" align="left" alt=""><![CDATA[ 1219 <artwork name="" type="" align="left" alt=""><![CDATA[
1220 0 8 16 24 32 40 48 56 1220 0 8 16 24 32 40 48 56
1221 +-----+-----+-----+-----+-----+-----+-----+-----+ 1221 +-----+-----+-----+-----+-----+-----+-----+-----+
1222 | MSG SIZE | MSG TYPE |ORDER| PAD | 1222 | MSG SIZE | MSG TYPE | SIZE |
1223 +-----+-----+-----+-----+-----+-----+-----+-----+ 1223 +-----+-----+-----+-----+-----+-----+-----+-----+
1224 | OFFSET | SALT | 1224 |IMCS | OFFSET | SALT |
1225 +-----+-----+-----+-----+-----+-----+-----+-----+ 1225 +-----+-----+-----+-----+-----+-----+-----+-----+
1226 | IBF-SLICE 1226 | IBF-SLICE
1227 + / 1227 +----- /
1228 / / 1228 / /
1229 / / 1229 / /
1230 ]]></artwork> 1230 ]]></artwork>
@@ -1233,21 +1233,25 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1233 <dl> 1233 <dl>
1234 <dt>MSG SIZE</dt> 1234 <dt>MSG SIZE</dt>
1235 <dd> 1235 <dd>
1236 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1236 is a 16-bit unsigned integer in network byte orderwhichdescribes the message size in bytes and header included.
1237 </dd> 1237 </dd>
1238 <dt>MSG TYPE</dt> 1238 <dt>MSG TYPE</dt>
1239 <dd> 1239 <dd>
1240 the type of SETU_P2P_REQUEST_IBF as registered in <xref target="gana" format="title" /> in network byte order. 1240 the type of SETU_P2P_REQUEST_IBF as registered in <xref target="gana" format="title" /> in network byte order.
1241 </dd> 1241 </dd>
1242 <dt>ORDER</dt> 1242
1243 <dt>SIZE</dt>
1243 <dd> 1244 <dd>
1244 is a 8-bit unsigned integer which signals the order of the IBF. The order of the IBF 1245 is a 32-bit unsigned integer which signals the number of buckets in the IBF.
1245 is defined as the logarithm of the number of buckets of the IBF.
1246 </dd> 1246 </dd>
1247 <dt>PAD</dt> 1247 <dt>IMCS</dt>
1248 <dd> 1248 <dd>
1249 is 24-bit always set to zero 1249 IBF max counter size is a 8-bit unsigned integer which describes the number of bit that
1250 is required to store a single counter. This is used for the unpacking function as described
1251 in the <xref target="performance_counter_variable_size" format="title" /> section.
1250 </dd> 1252 </dd>
1253
1254
1251 <dt>OFFSET</dt> 1255 <dt>OFFSET</dt>
1252 <dd> 1256 <dd>
1253 is a 32-bit unsigned integer which signals the offset to the following ibf slices in the original. 1257 is a 32-bit unsigned integer which signals the offset to the following ibf slices in the original.
@@ -1259,16 +1263,18 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1259 </dd> 1263 </dd>
1260 <dt>IBF-SLICE</dt> 1264 <dt>IBF-SLICE</dt>
1261 <dd> 1265 <dd>
1266
1262 <t> 1267 <t>
1263 are variable count of slices in an array. A single slice contains out multiple 64-bit IDSUMS, 1268 are variable numbers of slices in an array. A single slice contains multiple 64-bit IDSUMS,
1264 32-bit HASHSUMS and 8-bit COUNTERS. In the network order the array of IDSUMS is first, followed 1269 32-bit HASHSUMS and 1-64bit COUNTERS of variable size. In the network order the array of IDSUMS is first, followed
1265 by an array of HASHSUMS and ended with an array of COUNTERS. Length of the array is defined 1270 by an array of HASHSUMS and ended with an array of COUNTERS. Length of the array is defined
1266 by MIN( 2^ORDER - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as 1271 by MIN( SIZE - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
1267 32768 divided by the BUCKET_SIZE which is 13-byte (104-bit). 1272 32768 divided by the BUCKET_SIZE which is 13-byte (104-bit).
1268 </t> 1273 </t>
1274
1269 <t> 1275 <t>
1270 To get the IDSUM field, all IDs who hit a bucket are added up with a binary XOR operation. 1276 To get the IDSUM field, all IDs hitting a bucket are added up with a binary XOR operation.
1271 See <xref target="ibf_format_id_generation" format="title" /> for details about ID generation. 1277 See <xref target="ibf_format_id_generation" format="title" /> details about ID generation.
1272 </t> 1278 </t>
1273 <t> 1279 <t>
1274 The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field: 1280 The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field:
@@ -1279,7 +1285,11 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1279 The algorithm to find the correct bucket in which the ID and the HASH have to be added 1285 The algorithm to find the correct bucket in which the ID and the HASH have to be added
1280 is described in detail in section <xref target="ibf_format_bucket_identification" format="title" />. 1286 is described in detail in section <xref target="ibf_format_bucket_identification" format="title" />.
1281 </t> 1287 </t>
1282 1288
1289 <t>
1290 Test vectors for an implementation can be found in the <xref target="test_vectors" format="title" /> section
1291 </t>
1292
1283 <!-- 1293 <!--
1284 FIXME: this is not sufficiently precise! How are the element IDs (and IDSUMS) computed? 1294 FIXME: this is not sufficiently precise! How are the element IDs (and IDSUMS) computed?
1285 How are the HASHes (and HASHSUMS) computed? Which byte order is used? What role does 1295 How are the HASHes (and HASHSUMS) computed? Which byte order is used? What role does
@@ -1296,12 +1306,13 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1296 +-----+-----+-----+-----+-----+-----+-----+-----+ 1306 +-----+-----+-----+-----+-----+-----+-----+-----+
1297 | IDSUMS | 1307 | IDSUMS |
1298 +-----+-----+-----+-----+-----+-----+-----+-----+ 1308 +-----+-----+-----+-----+-----+-----+-----+-----+
1299 | HASHSUMS | HASHSUMS | 1309 | HASHSUMS | HASHSUMS |
1300 +-----+-----+-----+-----+-----+-----+-----+-----+ 1310 +-----+-----+-----+-----+-----+-----+-----+-----+
1301 | COUNTERS | COUNTERS | 1311 | COUNTERS* | COUNTERS* |
1302 +-----+-----+-----+-----+-----+-----+-----+-----+ 1312 +-----+-----+-----+-----+-----+-----+-----+-----+
1303 / / 1313 / /
1304 / / 1314 / /
1315* Counter size is variable. In this example the size is 32-bit (4-byte)
1305 ]]></artwork> 1316 ]]></artwork>
1306 </figure> 1317 </figure>
1307 </section> 1318 </section>
@@ -1313,7 +1324,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1313 <section anchor="messages_ibf_last_description" numbered="true" toc="default"> 1324 <section anchor="messages_ibf_last_description" numbered="true" toc="default">
1314 <name>Description</name> 1325 <name>Description</name>
1315 <t> 1326 <t>
1316 This message indicates to the remote peer that all slices of the bloom filter have been sent. 1327 This message indicates the remote peer that all slices of the bloom filter have been sent.
1317 The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of 1328 The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of
1318 the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE" 1329 the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE"
1319 which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST". 1330 which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST".
@@ -1343,7 +1354,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1343 as answer to a <em><xref target="messages_demand" format="title" /></em> message from the remote peer. 1354 as answer to a <em><xref target="messages_demand" format="title" /></em> message from the remote peer.
1344 The Element message can also be received in the <strong>Finish Closing</strong> or <strong>Finish Waiting</strong> 1355 The Element message can also be received in the <strong>Finish Closing</strong> or <strong>Finish Waiting</strong>
1345 state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer, in this 1356 state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer, in this
1346 case the client changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied. 1357 case the peer changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied.
1347 </t> 1358 </t>
1348 <t> 1359 <t>
1349 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. 1360 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />.
@@ -1367,7 +1378,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1367 <dl> 1378 <dl>
1368 <dt>MSG SIZE</dt> 1379 <dt>MSG SIZE</dt>
1369 <dd> 1380 <dd>
1370 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1381 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1371 </dd> 1382 </dd>
1372 <dt>MSG TYPE</dt> 1383 <dt>MSG TYPE</dt>
1373 <dd> 1384 <dd>
@@ -1375,7 +1386,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1375 </dd> 1386 </dd>
1376 <dt>E TYPE</dt> 1387 <dt>E TYPE</dt>
1377 <dd> 1388 <dd>
1378 element type is a 16-bit unsigned integer witch defines the element type for 1389 element type is a 16-bit unsigned integer which defines the element type for
1379 the application. 1390 the application.
1380 </dd> 1391 </dd>
1381 <dt>PADDING</dt> 1392 <dt>PADDING</dt>
@@ -1384,7 +1395,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1384 </dd> 1395 </dd>
1385 <dt>E SIZE</dt> 1396 <dt>E SIZE</dt>
1386 <dd> 1397 <dd>
1387 element size is 16-bit unsigned integer that signals the size of the elements data part. 1398 element size is a 16-bit unsigned integer that signals the size of the elements data part.
1388 </dd> 1399 </dd>
1389 <dt>AE TYPE</dt> 1400 <dt>AE TYPE</dt>
1390 <dd> 1401 <dd>
@@ -1408,7 +1419,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1408 The offer message is an answer to an <em><xref target="messages_inquiry" format="title" /></em> message 1419 The offer message is an answer to an <em><xref target="messages_inquiry" format="title" /></em> message
1409 and transmits the full hash of an element that has been requested by the other peer. 1420 and transmits the full hash of an element that has been requested by the other peer.
1410 This full hash enables the other peer to check if the element is really missing in its set and 1421 This full hash enables the other peer to check if the element is really missing in its set and
1411 eventually sends a <em><xref target="messages_demand" format="title" /></em> message for that a element. 1422 eventually sends a <em><xref target="messages_demand" format="title" /></em> message for that element.
1412 </t> 1423 </t>
1413 <t> 1424 <t>
1414 The offer is sent and received only in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong> 1425 The offer is sent and received only in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
@@ -1434,7 +1445,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1434 <dl> 1445 <dl>
1435 <dt>MSG SIZE</dt> 1446 <dt>MSG SIZE</dt>
1436 <dd> 1447 <dd>
1437 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1448 is a 16-bit unsigned integer in network byte order which describes the message size in bytes header included.
1438 </dd> 1449 </dd>
1439 <dt>MSG TYPE</dt> 1450 <dt>MSG TYPE</dt>
1440 <dd> 1451 <dd>
@@ -1482,7 +1493,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1482 <dl> 1493 <dl>
1483 <dt>MSG SIZE</dt> 1494 <dt>MSG SIZE</dt>
1484 <dd> 1495 <dd>
1485 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1496 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1486 </dd> 1497 </dd>
1487 <dt>MSG TYPE</dt> 1498 <dt>MSG TYPE</dt>
1488 <dd> 1499 <dd>
@@ -1503,7 +1514,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1503 <name>Description</name> 1514 <name>Description</name>
1504 <t> 1515 <t>
1505 The demand message is sent in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong> 1516 The demand message is sent in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
1506 state. It is a answer to a received <em><xref target="messages_offer" format="title" /></em> message 1517 state. It is an answer to a received <em><xref target="messages_offer" format="title" /></em> message
1507 and is sent if the element described in the <em><xref target="messages_offer" format="title" /></em> message 1518 and is sent if the element described in the <em><xref target="messages_offer" format="title" /></em> message
1508 is missing in the peers set. In the normal workflow the answer to the demand message is an 1519 is missing in the peers set. In the normal workflow the answer to the demand message is an
1509 <em><xref target="messages_elements" format="title" /></em> message. 1520 <em><xref target="messages_elements" format="title" /></em> message.
@@ -1528,7 +1539,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1528 <dl> 1539 <dl>
1529 <dt>MSG SIZE</dt> 1540 <dt>MSG SIZE</dt>
1530 <dd> 1541 <dd>
1531 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1542 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and the header is included.
1532 </dd> 1543 </dd>
1533 <dt>MSG TYPE</dt> 1544 <dt>MSG TYPE</dt>
1534 <dd> 1545 <dd>
@@ -1576,7 +1587,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1576 <dl> 1587 <dl>
1577 <dt>MSG SIZE</dt> 1588 <dt>MSG SIZE</dt>
1578 <dd> 1589 <dd>
1579 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1590 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1580 </dd> 1591 </dd>
1581 <dt>MSG TYPE</dt> 1592 <dt>MSG TYPE</dt>
1582 <dd> 1593 <dd>
@@ -1599,7 +1610,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1599 <name>Description</name> 1610 <name>Description</name>
1600 <t> 1611 <t>
1601 The full done message is sent in the <xref target="modeofoperation_full-sync" format="title" /> 1612 The full done message is sent in the <xref target="modeofoperation_full-sync" format="title" />
1602 to signal that all remaining elements of the set have been sent. The message is received and sent in in the 1613 to signal that all remaining elements of the set have been sent. The message is received and sent in the
1603 <strong>Full Sending</strong> and in the <strong>Full Receiving</strong> state. When the full done message is received 1614 <strong>Full Sending</strong> and in the <strong>Full Receiving</strong> state. When the full done message is received
1604 in <strong>Full Sending</strong> state the peer changes directly into <strong>Finished</strong> state. In 1615 in <strong>Full Sending</strong> state the peer changes directly into <strong>Finished</strong> state. In
1605 <strong>Full Receiving</strong> state receiving a full done message initiates the sending of 1616 <strong>Full Receiving</strong> state receiving a full done message initiates the sending of
@@ -1622,7 +1633,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1622 <dl> 1633 <dl>
1623 <dt>MSG SIZE</dt> 1634 <dt>MSG SIZE</dt>
1624 <dd> 1635 <dd>
1625 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1636 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1626 </dd> 1637 </dd>
1627 <dt>MSG TYPE</dt> 1638 <dt>MSG TYPE</dt>
1628 <dd> 1639 <dd>
@@ -1668,7 +1679,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1668 <dl> 1679 <dl>
1669 <dt>MSG SIZE</dt> 1680 <dt>MSG SIZE</dt>
1670 <dd> 1681 <dd>
1671 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1682 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1672 </dd> 1683 </dd>
1673 <dt>MSG TYPE</dt> 1684 <dt>MSG TYPE</dt>
1674 <dd> 1685 <dd>
@@ -1684,7 +1695,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1684 <section anchor="messages_se_description" numbered="true" toc="default"> 1695 <section anchor="messages_se_description" numbered="true" toc="default">
1685 <name>Description</name> 1696 <name>Description</name>
1686 <t> 1697 <t>
1687 The strata estimator is sent by the receiving peer at the start of the protocol right after the 1698 The strata estimator is sent by the receiving peer at the start of the protocol, right after the
1688 <xref target="messages_operation_request" format="title" /> message has been received. 1699 <xref target="messages_operation_request" format="title" /> message has been received.
1689 </t> 1700 </t>
1690 <t> 1701 <t>
@@ -1714,7 +1725,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1714 <dl> 1725 <dl>
1715 <dt>MSG SIZE</dt> 1726 <dt>MSG SIZE</dt>
1716 <dd> 1727 <dd>
1717 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1728 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1718 </dd> 1729 </dd>
1719 <dt>MSG TYPE</dt> 1730 <dt>MSG TYPE</dt>
1720 <dd> 1731 <dd>
@@ -1743,7 +1754,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1743 </t> 1754 </t>
1744 <t> 1755 <t>
1745 Since the content of the message is the same as the uncompressed Strata Estimator, the details 1756 Since the content of the message is the same as the uncompressed Strata Estimator, the details
1746 aren't repeated here for details see section <xref target="messages_se" format="counter" />. 1757 are not repeated here for details see section <xref target="messages_se" format="counter" />.
1747 </t> 1758 </t>
1748 </section> 1759 </section>
1749 </section> 1760 </section>
@@ -1765,7 +1776,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1765 <strong>Full Receiving</strong> state. 1776 <strong>Full Receiving</strong> state.
1766 </t> 1777 </t>
1767 <t> 1778 <t>
1768 After the last full element messages has been sent the <xref target="messages_full_done" format="title" /> message 1779 After the last full element message has been sent the <xref target="messages_full_done" format="title" /> message
1769 is sent to conclude the full synchronisation of the element sending peer. 1780 is sent to conclude the full synchronisation of the element sending peer.
1770 </t> 1781 </t>
1771 </section> 1782 </section>
@@ -1787,7 +1798,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1787 <dl> 1798 <dl>
1788 <dt>MSG SIZE</dt> 1799 <dt>MSG SIZE</dt>
1789 <dd> 1800 <dd>
1790 is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included. 1801 is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
1791 </dd> 1802 </dd>
1792 <dt>MSG TYPE</dt> 1803 <dt>MSG TYPE</dt>
1793 <dd> 1804 <dd>
@@ -1795,7 +1806,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1795 </dd> 1806 </dd>
1796 <dt>E TYPE</dt> 1807 <dt>E TYPE</dt>
1797 <dd> 1808 <dd>
1798 element type is a 16-bit unsigned integer witch defines the element type for 1809 element type is a 16-bit unsigned integer which defines the element type for
1799 the application. 1810 the application.
1800 </dd> 1811 </dd>
1801 <dt>PADDING</dt> 1812 <dt>PADDING</dt>
@@ -1804,7 +1815,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1804 </dd> 1815 </dd>
1805 <dt>E SIZE</dt> 1816 <dt>E SIZE</dt>
1806 <dd> 1817 <dd>
1807 element size is 16-bit unsigned integer that signals the size of the elements data part. 1818 element size is a 16-bit unsigned integer that signals the size of the elements data part.
1808 </dd> 1819 </dd>
1809 <dt>AE TYPE</dt> 1820 <dt>AE TYPE</dt>
1810 <dd> 1821 <dd>
@@ -1831,7 +1842,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
1831 The decision which mode of operations is used is described by the following code: 1842 The decision which mode of operations is used is described by the following code:
1832 </t> 1843 </t>
1833 <t> 1844 <t>
1834 The function takes as input the initial the local setsize, the remote setsize, the 1845 The function takes as input the initial local setsize, the remote setsize, the
1835 by the strata estimator calculated difference, a static boolean that enforces full 1846 by the strata estimator calculated difference, a static boolean that enforces full
1836 synchronisation mode of operation and the bandwith/roundtrips tradeoff. 1847 synchronisation mode of operation and the bandwith/roundtrips tradeoff.
1837 As output the function returns "FULL" if the full synchronisation 1848 As output the function returns "FULL" if the full synchronisation
@@ -1862,9 +1873,9 @@ FUNCTION decide_operation_mode(initial_local_setsize, remote_setsize, set_diff,
1862 </figure> 1873 </figure>
1863 </section> 1874 </section>
1864 <section anchor="performance_formulas_full_sending_dec_first_send" numbered="true" toc="default"> 1875 <section anchor="performance_formulas_full_sending_dec_first_send" numbered="true" toc="default">
1865 <name>Full Synchronisation: Decision witch peer sends elements first</name> 1876 <name>Full Synchronisation: Decision which peer sends elements first</name>
1866 <t> 1877 <t>
1867 The following function determinate which peer starts sending its full set in full synchronisation 1878 The following function determinates which peer starts sending its full set in full synchronisation
1868 mode of operation. 1879 mode of operation.
1869 </t> 1880 </t>
1870 <t> 1881 <t>
@@ -1892,11 +1903,11 @@ FUNCTION decide_full_sending(initial_local_size, remote_setsize)
1892 <section anchor="performance_formula_ibf_parameters" numbered="true" toc="default"> 1903 <section anchor="performance_formula_ibf_parameters" numbered="true" toc="default">
1893 <name>IBF Parameters</name> 1904 <name>IBF Parameters</name>
1894 <t> 1905 <t>
1895 The following function calculate the required parameter to create an optimal sized IBF. These 1906 The following function calculates the required parameter to create an optimal sized IBF. These
1896 parameter are the number of buckets and the number of buckets a single element is mapped to. 1907 parameters are the number of buckets and the number of buckets a single element is mapped to.
1897 </t> 1908 </t>
1898 <t> 1909 <t>
1899 The function takes as input the setsize and returns a array with two numbers the total number of buckets 1910 The function takes as input the setsize and returns an array with two numbers, the total number of buckets
1900 and the number of buckets a single element is mapped to. 1911 and the number of buckets a single element is mapped to.
1901 </t> 1912 </t>
1902 <figure anchor="performance_formula_ibf_parameters_code"> 1913 <figure anchor="performance_formula_ibf_parameters_code">
@@ -1907,14 +1918,163 @@ FUNCTION decide_full_sending(initial_local_size, remote_setsize)
1907# returns: Array: first element total nr ob buckets and 1918# returns: Array: first element total nr ob buckets and
1908# second element number of buckets per element 1919# second element number of buckets per element
1909 1920
1910FUNCTION calculate_ibf_params (setsize): 1921FUNCTION calculate_ibf_params (set_diff):
1911 number_of_bucket_per_element = 4 1922 number_of_bucket_per_element = 4
1912 total_number_of_buckets = setsize 1923 total_number_of_buckets = set_diff
1913 return [ total_number_of_buckets, number_of_bucket_per_element ] 1924 return [ total_number_of_buckets, number_of_bucket_per_element ]
1914 ]]></artwork> 1925 ]]></artwork>
1915 </figure> 1926 </figure>
1916 </section> 1927 </section>
1917 </section> 1928 </section>
1929
1930 <section anchor="performance_counter_variable_size" numbered="true" toc="default">
1931 <name>Counter variable size</name>
1932 <t>
1933 Since the optimal number of bytes a counter in the IBF contains is very variable and varies
1934 due to different parameters. Details are described in the BSC thesis
1935 by Elias Summermatter, BFH 2021.
1936 <!-- TODO link Thesis -->
1937 Therefore a compression algorithm has been implemented, which always creates the IBF counter in optimal size.
1938 and thus minimizes the bandwidth needed to transmit the IBF.
1939 </t>
1940 <t>
1941 A simple algorithm is used for the compression. In a first step it is determined, which is the largest counter
1942 and how many bits are needed to store it. In a second step for every counter of every bucket the counter
1943 is stored in the bits determined in the first step and these are concatenated.
1944 </t>
1945 <t>
1946 Three individual functions are used for this purpose. The first one is a function that iterates over each bucket of the
1947 bucket of the IBF to get the maximum counter in the IBF. As second it needs
1948 a function that compresses the counter of the IBF and thirdly a function that decompresses the IBF.
1949 </t>
1950 <figure anchor="performance_counter_variable_size_code">
1951 <artwork name="" type="" align="left" alt=""><![CDATA[
1952
1953# INPUTS:
1954# ibf: The IBF
1955# OUTPUTS:
1956# returns: minimal amount of bytes required to store the counter
1957
1958FUNCTION ibf_get_max_counter(ibf)
1959 max_counter=0
1960 FOR bucket IN ibf
1961 IF bucket.counter > max_counter
1962 max_counter = bucket.counter
1963
1964 RETURN CEILING( log2 ( max_counter ) ) # next bigger discrete number of the binary logarithm of the max counter
1965
1966# INPUTS:
1967# ibf: The IBF
1968# offset: The offset which defines the starting point from which bucket the compress operation should start
1969# count: The number of buckets to in the array that should be compressed
1970# OUTPUTS:
1971# returns: An byte array of compressed counters to send over the network
1972
1973FUNCTION pack_counter(ibf, offset, count)
1974 counter_bytes = ibf_get_max_counter(ibf)
1975 store = 0
1976 store_bits = 0
1977 byte_ctr = 0
1978 buffer=[]
1979
1980 FOR bucket IN ibf[offset] to ibf[count]
1981 byte_len = counter_bytes
1982 counter = bucket.counter
1983
1984 WHILE byte_len > 0
1985 byte_to_write = 0
1986
1987 IF counter_bytes + store_bits >= 8
1988 bit_to_shift=0
1989
1990 IF store_bits > 0 OR counter_bytes > 8
1991 bit_free = 8 - store_bits
1992 bit_to_shift = counter_bytes - bit_free
1993 store = store << bit_free
1994
1995 byte_to_write = (( counter >> bit_to_shift) | store) & 0xFF
1996 bit_to_shift -= 8 - store_bits
1997 counter = counter & (( 1 << counter_bytes ) - 1)
1998 store = 0
1999 store_bits = 0
2000
2001 ELSE
2002 IF 0 == store_bits
2003 store = counter
2004 ELSE
2005 store = (store << counter_bytes) | counter
2006
2007 store_bits = store_bits + byte_len
2008 byte_len = 0
2009 BREAK
2010
2011 buffer[byte_ctr] = byte_to_write
2012 byte_ctr++
2013 NEXT_BUCKET
2014
2015 # Write the last partial compressed byte to the buffer
2016 buffer[byte_ctr] = store << (8 - store_bits)
2017 byte_ctr++
2018
2019 RETURN buffer
2020
2021# INPUTS:
2022# ibf: The IBF
2023# offset: The offset which defines the starting point from which bucket the compress operation should start
2024# count: The number of buckets to in the array that should be compressed
2025# counter_bit_length: The bit length of the counter can be found in the ibf message in the ibf_counter_bit_length field
2026# packed_data: A byte array which contains the data packed with the pack_counter function
2027# OUTPUTS:
2028# returns: Nothing because the unpacked counter is saved directly into the IBF
2029
2030FUNCTION unpack_counter(ibf, offset, count, counter_bit_length, packed_data)
2031 store = 0
2032 store_bits = 0
2033 byte_ctr = 0
2034 ibf_bucket_ctr = 0
2035
2036 number_bytes_read = CEILING((count * counter_bit_length) / 8)
2037
2038 WHILE ibf_bucket_ctr <= (count -1)
2039 byte_to_read = packed_data[byte_ctr]
2040 byte_ctr++
2041 bit_to_pack_left = 8
2042
2043 WHILE bit_to_pack_left >= 0
2044
2045 # Prevent packet from reading more than required
2046 IF ibf_bucket_ctr > (count -1)
2047 return
2048
2049 IF ( store_bits + bit_to_pack_left ) >= counter_bit_length
2050 bit_use = counter_bit_length - store_bits
2051
2052 IF store_bits > 0
2053 store = store << bit_use
2054
2055 bytes_to_shift = bit_to_pack_left - bit_use
2056 counter_partial = byte_to_read >> bytes_to_shift
2057 store = store | counter_partial
2058 ibf.counter[ibf_bucket_ctr] = store
2059 byte_to_read = byte_to_read & (( 1 << bytes_to_shift ) - 1)
2060
2061 bit_to_pack_left -= bit_use
2062 ibf_bucket_ctr++
2063 store = 0
2064 store_bits = 0
2065
2066 ELSE
2067 store_bits += bit_to_pack_left
2068
2069 IF 0 == store_bits
2070 store = byte_to_read
2071 ELSE
2072 store = store << bit_to_pack_left
2073 store = store | byte_to_read
2074 BREAK
2075 ]]></artwork>
2076 </figure>
2077 </section>
1918 </section> 2078 </section>
1919 2079
1920 <!-- 2080 <!--
@@ -1931,19 +2091,19 @@ FUNCTION calculate_ibf_params (setsize):
1931 2091
1932 <t> 2092 <t>
1933 The security considerations in this document focus mainly on the security 2093 The security considerations in this document focus mainly on the security
1934 goal of availability, the primary goal of the protocol is prevent an attacker from 2094 goal of availability. The primary goal of the protocol is to prevent an attacker from
1935 wasting cpu and network resources of the attacked peer. 2095 wasting cpu and network resources of the attacked peer.
1936 </t> 2096 </t>
1937 <t> 2097 <t>
1938 To prevent denial of service attacks its vital to check that peers can only 2098 To prevent denial of service attacks, it is vital to check that peers can only
1939 reconcile a set once in a pre defined time span. This is a predefined values and need 2099 reconcile a set once in a predefined time span. This is a predefined value and needs
1940 to be adapted on per use basis. To enhance reliability and to allow 2100 to be adapted per use basis. To enhance reliability and to allow
1941 failures in the protocol its possible to introduce a threshold for max failed reconciliation 2101 failures in the protocol, it is possible to introduce a threshold for max failed reconciliation
1942 ties. 2102 ties.
1943 <!-- IMPLEMENT: How to implement? IP? Other construct? --> 2103 <!-- IMPLEMENT: How to implement? IP? Other construct? -->
1944 </t> 2104 </t>
1945 <t> 2105 <t>
1946 The formal format of all messages needs to be properly validated, this is important to prevent many 2106 The formal format of all messages needs to be properly validated. This is important to prevent many
1947 attacks on the code. The application data should be validated by the application using 2107 attacks on the code. The application data should be validated by the application using
1948 the protocol not by the implementation of the protocol. 2108 the protocol not by the implementation of the protocol.
1949 In case the format validation fails the set operation MUST be terminated. 2109 In case the format validation fails the set operation MUST be terminated.
@@ -1951,17 +2111,18 @@ FUNCTION calculate_ibf_params (setsize):
1951 </t> 2111 </t>
1952 2112
1953 <t> 2113 <t>
1954 To prevent an attacker from sending a peer into a endless loop between active and passive decoding a 2114 To prevent an attacker from sending a peer into an endless loop between active and passive decoding, a
1955 limitation for active/passive roll switches in required. This can be implemented by 2115 limitation for active/passive roll switches is required. This can be implemented by
1956 a simple counter which terminates the operation after a predefined count of switches. 2116 a simple counter which terminates the operation after a predefined number of switches.
1957 The count of switches needs to be defined as such that its very undroppable that more 2117 The number of switches needs to be defined in such a way that it is very unprobable that more
1958 switches are required an the malicious intend of the other peer can be assumed. 2118 switches are required an the malicious intend of the other peer can be assumed.
2119
1959 <!-- IMPLEMENT: This counter --> 2120 <!-- IMPLEMENT: This counter -->
1960 </t> 2121 </t>
1961 2122
1962 <t> 2123 <t>
1963 <!--- SHOULD BE HANDLED BY UNDERLYING PROTOCOL BUT HOW IS IT HANDLED? --> 2124 <!--- SHOULD BE HANDLED BY UNDERLYING PROTOCOL BUT HOW IS IT HANDLED? -->
1964 Its important to close and purge connections after a given timeout 2125 It is important to close and purge connections after a given timeout
1965 to prevent draining attacks. 2126 to prevent draining attacks.
1966 <!-- IMPLEMENT: How ist this handheld right now? --> 2127 <!-- IMPLEMENT: How ist this handheld right now? -->
1967 </t> 2128 </t>
@@ -1976,7 +2137,7 @@ FUNCTION calculate_ibf_params (setsize):
1976 <name>Duplicated or Missing Message detection</name> 2137 <name>Duplicated or Missing Message detection</name>
1977 <t> 2138 <t>
1978 Most of the messages received need to be checked that they are not 2139 Most of the messages received need to be checked that they are not
1979 received multiple times this is solved with a global store (message) 2140 received multiple times. This is solved with a global store (message)
1980 and the following code 2141 and the following code
1981 </t> 2142 </t>
1982 <figure anchor="security_generic_functions_missing_message_code"> 2143 <figure anchor="security_generic_functions_missing_message_code">
@@ -2031,7 +2192,7 @@ FUNCTION isStoreComplete(store)
2031 ENDFOR 2192 ENDFOR
2032 return TRUE 2193 return TRUE
2033 2194
2034# Returns the count of message received 2195# Returns the number of message received
2035# INPUTS: 2196# INPUTS:
2036# store: Store to count 2197# store: Store to count
2037# OUTPUTS: 2198# OUTPUTS:
@@ -2046,8 +2207,8 @@ FUNCTION getNumberOfMessage(store)
2046 <section anchor="security_generic_functions_element_nr" numbered="true" toc="default"> 2207 <section anchor="security_generic_functions_element_nr" numbered="true" toc="default">
2047 <name>Store Remote Peers Element Number</name> 2208 <name>Store Remote Peers Element Number</name>
2048 <t> 2209 <t>
2049 To prevent an other peer from requesting the same set multiple times 2210 To prevent an other peer from requesting the same set multiple times,
2050 its important to memorize the number of elements a peer had in previous 2211 it is important to memorize the number of elements a peer had in previous
2051 reconciliation sessions. 2212 reconciliation sessions.
2052 </t> 2213 </t>
2053 <figure anchor="security_generic_functions_element_nr_code"> 2214 <figure anchor="security_generic_functions_element_nr_code">
@@ -2096,12 +2257,12 @@ FUNCTION save_number_of_elements_last_sync(client_id, remote_setsize)
2096 <t> 2257 <t>
2097 It needs to be checked that the full synchronisation is 2258 It needs to be checked that the full synchronisation is
2098 plausible according to the formula deciding which operation mode 2259 plausible according to the formula deciding which operation mode
2099 is applicable this is achieved by calculating the upper and lower 2260 is applicable. This is achieved by calculating the upper and lower
2100 boundaries of the number of elements in the other peers set. 2261 boundaries of the number of elements in the other peers set.
2101 The lower boundary of number of elements can be easily 2262 The lower boundary of number of elements can be easily
2102 memorized as result from the last synchronisation and the upper 2263 memorized as result from the last synchronisation and the upper
2103 boundary can be estimated with prior knowledge of the maximal 2264 boundary can be estimated with prior knowledge of the maximal
2104 plausible increase of element since the last reconciliation and 2265 plausible increase of elements since the last reconciliation and
2105 the maximal plausible number of elements. 2266 the maximal plausible number of elements.
2106 <!-- Entscheindungsfindung: myset fulltransimtion or epstein 2267 <!-- Entscheindungsfindung: myset fulltransimtion or epstein
2107 5.3 ist zu verckürtzt fall: nur 2 cases Es fehlt traidof nach paramer in perfomance section 2268 5.3 ist zu verckürtzt fall: nur 2 cases Es fehlt traidof nach paramer in perfomance section
@@ -2155,12 +2316,12 @@ FUNCTION validate_messages_request_full(client_id, remote_setsize, local_setsize
2155 <dt><xref target="messages_ibf" format="title" /></dt> 2316 <dt><xref target="messages_ibf" format="title" /></dt>
2156 <dd> 2317 <dd>
2157 <t> 2318 <t>
2158 Its important do define a threshold to limit 2319 It is important to define a threshold to limit
2159 the maximal number of IBFs that are expected from the other peer. 2320 the maximal number of IBFs that are expected from the other peer.
2160 <!-- change count to number full section --> 2321 <!-- change count to number full section -->
2161 This maximal plausible size can be calculated with the known inputs: 2322 This maximal plausible size can be calculated with the known inputs:
2162 number of elements in my set and the pre defined applications upper 2323 number of elements in the local set and the predefined applications upper
2163 limit as described in the performance section. 2324 limit, as described in the performance section.
2164 <!-- IMPLEMENT: Is this already checked?--> 2325 <!-- IMPLEMENT: Is this already checked?-->
2165 <!-- TODO: Link performance section --> 2326 <!-- TODO: Link performance section -->
2166 That the other peer chooses the correct mode of operation MUST 2327 That the other peer chooses the correct mode of operation MUST
@@ -2222,7 +2383,7 @@ FUNCTION get_bucket_number(remote_setsize, local_setsize, initial_local_size, se
2222 <dt><xref target="messages_full_element" format="title" /></dt> 2383 <dt><xref target="messages_full_element" format="title" /></dt>
2223 <dd> 2384 <dd>
2224 <t> 2385 <t>
2225 If a full element is received the set of the other peer 2386 If a full element is received, the set of the other peer
2226 is smaller than the set of the peer in the <strong>Expecting IBF</strong> 2387 is smaller than the set of the peer in the <strong>Expecting IBF</strong>
2227 state and the set difference is smaller than threshold for 2388 state and the set difference is smaller than threshold for
2228 full synchronisation as described in the performance section. 2389 full synchronisation as described in the performance section.
@@ -2322,9 +2483,9 @@ FUNCTION validate_messages_full_element_init(client_id, remote_setsize,
2322 <dt><xref target="messages_full_element" format="title" /></dt> 2483 <dt><xref target="messages_full_element" format="title" /></dt>
2323 <dd> 2484 <dd>
2324 <t> 2485 <t>
2325 When receiving full elements there needs to be checked that every 2486 When receiving full elements there needs to be checked, that every
2326 element is a valid element, no element is resized more than once and 2487 element is a valid element, no element has been received more than once and
2327 not more or less elements are received as the other peer has committed 2488 not more or less elements are received, as the other peer has committed
2328 to in the beginning of the operation. Detail pseudocode implementation 2489 to in the beginning of the operation. Detail pseudocode implementation
2329 can be found in <xref target="security_states_expecting_ibf" format="title" />. 2490 can be found in <xref target="security_states_expecting_ibf" format="title" />.
2330 <!-- IMPLEMENT: Is this check already implemented?--> 2491 <!-- IMPLEMENT: Is this check already implemented?-->
@@ -2333,12 +2494,12 @@ FUNCTION validate_messages_full_element_init(client_id, remote_setsize,
2333 <dt><xref target="messages_full_done" format="title" /></dt> 2494 <dt><xref target="messages_full_done" format="title" /></dt>
2334 <dd> 2495 <dd>
2335 <t> 2496 <t>
2336 When receiving the full done message its important to check that 2497 When receiving the full done message it is important to check that
2337 not less elements are received as the other peer has committed to 2498 not less elements are received as the other peer has committed to
2338 send. 2499 send.
2339 The 512-bit hash of the complete reconciled set contained in 2500 The 512-bit hash of the complete reconciled set contained in
2340 the full done message is required to ensures that both sets are truly identical. If 2501 the full done message is required to ensure that both sets are truly identical. If
2341 the sets differ a resynchronisation is required. The count of possible 2502 the sets differ, a resynchronisation is required. The number of possible
2342 resynchronisation MUST be limited to prevent resource exhaustion attacks. 2503 resynchronisation MUST be limited to prevent resource exhaustion attacks.
2343 <!-- IMPLEMENT: Is this check already implemented?--> 2504 <!-- IMPLEMENT: Is this check already implemented?-->
2344 </t> 2505 </t>
@@ -2382,7 +2543,7 @@ FUNCTION validate_messages_full_done(full_done_msg, full_element_msg_store,
2382 <dt><xref target="messages_ibf" format="title" /></dt> 2543 <dt><xref target="messages_ibf" format="title" /></dt>
2383 <dd> 2544 <dd>
2384 <t> 2545 <t>
2385 When receiving multiple IBFs its important to check that the other 2546 When receiving multiple IBFs it is important to check that the other
2386 peer can only send as many IBFs as expected. The number of expected IBFs can 2547 peer can only send as many IBFs as expected. The number of expected IBFs can
2387 be calculated with the knowledge of the set difference as described in the 2548 be calculated with the knowledge of the set difference as described in the
2388 performance section. 2549 performance section.
@@ -2397,37 +2558,37 @@ FUNCTION validate_messages_full_done(full_done_msg, full_element_msg_store,
2397 <section anchor="security_states_active_decoding" numbered="true" toc="default"> 2558 <section anchor="security_states_active_decoding" numbered="true" toc="default">
2398 <name>Active Decoding</name> 2559 <name>Active Decoding</name>
2399 <t> 2560 <t>
2400 In the Active Decoding state its important to prevent an attacker from 2561 In the Active Decoding state it is important to prevent an attacker from
2401 generating and passing unlimited amount of IBF that do not decode or 2562 generating and passing an unlimited amount of IBFs, that do not decode or
2402 even worse generate an IBF that is constructed to sends the peers in an endless loop. 2563 even worse, generate an IBF constructed, to send the peers in an endless loop.
2403 To prevent an endless loop in decoding a loop detection should be implemented 2564 To prevent an endless loop in decoding, a loop detection should be implemented.
2404 the simplest solution would be to prevent decoding of more than a given amount of elements, 2565 The simplest solution would be to prevent decoding of more than a given amount of elements.
2405 a more robust solution is to implement a algorithm that detects a loop by 2566 A more robust solution is to implement a algorithm that detects a loop by
2406 analyzing past partially decoded IBFs to detect cycles. This can be archived 2567 analyzing past partially decoded IBFs. This can be archived
2407 by saving the hash of all prior partly decoded IBFs hashes in a hashmap and check 2568 by saving the hash of all prior partly decoded IBFs hashes in a hashmap and check
2408 for every inserted hash if it is already in the hashmap. 2569 for every inserted hash, if it is already in the hashmap.
2409 <!-- TODO: Link some algo to find loops in directed graph --> 2570 <!-- TODO: Link some algo to find loops in directed graph -->
2410 <!-- IMPLEMENT: Implement an algo that detects loops in IBF decoding --> 2571 <!-- IMPLEMENT: Implement an algo that detects loops in IBF decoding -->
2411 </t> 2572 </t>
2412 <t> 2573 <t>
2413 If the IBF decodes more or less elements than are plausible the 2574 If the IBF decodes more or less elements than are plausible, the
2414 operation MUST be terminated. The upper and lower threshold 2575 operation MUST be terminated. The upper and lower threshold
2415 for the decoded elements can be calculated with the peers set size 2576 for the decoded elements can be calculated with the peers set sizes
2416 and the other peer committed set sizes from the <strong>Expecting IBF</strong> 2577 and the other peer committed set sizes from the <strong>Expecting IBF</strong>
2417 State. 2578 State.
2418 </t> 2579 </t>
2419 2580
2420 <!-- Wenne element mehrfach decodiert seitenwechseln daher detecten. --> 2581 <!-- Wenn ein Element mehrfach decodiert seitenwechseln daher detecten. -->
2421 2582
2422 <t>Security considerations for received messages:</t> 2583 <t>Security considerations for received messages:</t>
2423 <dl> 2584 <dl>
2424 <dt><xref target="messages_offer" format="title" /></dt> 2585 <dt><xref target="messages_offer" format="title" /></dt>
2425 <dd> 2586 <dd>
2426 <t> 2587 <t>
2427 If an offer for an element that never has been requested by 2588 If an offer for an element, that never has been requested by
2428 an inquiry or if an offer is received twice the operation MUST be terminated. 2589 an inquiry or if an offer is received twice, the operation MUST be terminated.
2429 This requirement can be fulfilled by saving lists that keeps track of the state of 2590 This requirement can be fulfilled by saving lists that keep track of the state of
2430 all send inquiries and offers. When answering offers these lists MUST be checked. 2591 all sent inquiries and offers. When answering offers these lists MUST be checked.
2431 <!-- IMPLEMENT: Check to keep track of all send Inquiries --> 2592 <!-- IMPLEMENT: Check to keep track of all send Inquiries -->
2432 </t> 2593 </t>
2433 <figure anchor="security_states_active_decoding_offer_code"> 2594 <figure anchor="security_states_active_decoding_offer_code">
@@ -2462,9 +2623,9 @@ FUNCTION validate_messages_offer(offer_msg,inquiry_msg_store)
2462 If an element that never has been requested by 2623 If an element that never has been requested by
2463 a demand or is received double the operation MUST be terminated. 2624 a demand or is received double the operation MUST be terminated.
2464 This requirement can be fulfilled by a simple table that keeps track 2625 This requirement can be fulfilled by a simple table that keeps track
2465 of the state of all send demands. 2626 of the state of all sent demands.
2466 <!-- IMPLEMENT: Check to keep track of all send demands --> 2627 <!-- IMPLEMENT: Check to keep track of all send demands -->
2467 If an invalid element is received the operation has failed and the 2628 If an invalid element is received, the operation has failed and it
2468 MUST be terminated. 2629 MUST be terminated.
2469 <!-- IMPLEMENT: Termination if invalid element si revived --> 2630 <!-- IMPLEMENT: Termination if invalid element si revived -->
2470 </t> 2631 </t>
@@ -2497,10 +2658,10 @@ FUNCTION validate_messages_elements(element_msg,demand_msg_store)
2497 <dt><xref target="messages_demand" format="title" /></dt> 2658 <dt><xref target="messages_demand" format="title" /></dt>
2498 <dd> 2659 <dd>
2499 <t> 2660 <t>
2500 For every received demand a offer has to be send in advance. If an demand 2661 For every received demand an offer has to be sent in advance. If a demand
2501 for an element is received that never has been offered or the offer already has 2662 for an element is received, that never has been offered or the offer already has
2502 been answered with a demand the operation MUST be terminated. Its required to implement 2663 been answered with a demand, the operation MUST be terminated. It is required to implement
2503 a list which keeps track of the state of all send offers and received demands. 2664 a list which keeps track of the state of all sent offers and received demands.
2504 </t> 2665 </t>
2505 <figure anchor="security_states_active_decoding_demand_code"> 2666 <figure anchor="security_states_active_decoding_demand_code">
2506 <artwork name="" type="" align="left" alt=""><![CDATA[ 2667 <artwork name="" type="" align="left" alt=""><![CDATA[
@@ -2532,14 +2693,14 @@ FUNCTION validate_messages_demand(demand_msg,offer_msg_store)
2532 <dt><xref target="messages_done" format="title" /></dt> 2693 <dt><xref target="messages_done" format="title" /></dt>
2533 <dd> 2694 <dd>
2534 <t> 2695 <t>
2535 The done message is only received if the IBF has been finished 2696 The done message is only received, if the IBF has been finished
2536 decoding and all offers have been sent. If the done message is received before 2697 decoding and all offers have been sent. If the done message is received before
2537 the decoding of the IBF is finished or all open offers and demands 2698 the decoding of the IBF is finished or all open offers and demands
2538 have been answered the operation MUST be terminated. 2699 have been answered, the operation MUST be terminated.
2539 <!-- IMPLEMENT: Check that in active decoding no done message is received before ibf has been decoded--> 2700 <!-- IMPLEMENT: Check that in active decoding no done message is received before ibf has been decoded-->
2540 The 512-bit hash of the complete reconciled set contained in 2701 The 512-bit hash of the complete reconciled set contained in
2541 the done message is required to ensures that both sets are truly identical. If 2702 the done message is required to ensure that both sets are truly identical. If
2542 the sets differ a resynchronisation is required. The count of possible 2703 the sets differ, a resynchronisation is required. The number of possible
2543 resynchronisation MUST be limited to prevent resource exhaustion attacks. 2704 resynchronisation MUST be limited to prevent resource exhaustion attacks.
2544 </t> 2705 </t>
2545 <figure anchor="security_states_active_decoding_done_code"> 2706 <figure anchor="security_states_active_decoding_done_code">
@@ -2591,8 +2752,8 @@ FUNCTION validate_messages_done(done_msg, offer_msg_store,
2591 <section anchor="security_states_finish_closing" numbered="true" toc="default"> 2752 <section anchor="security_states_finish_closing" numbered="true" toc="default">
2592 <name>Finish Closing</name> 2753 <name>Finish Closing</name>
2593 <t> 2754 <t>
2594 In case not all sent demands or inquiries have ben answered in time a pre defined 2755 In case not all sent demands or inquiries have been answered in time, a predefined
2595 timeout the operation has failed and MUST be terminated. 2756 timeout, the operation has failed and MUST be terminated.
2596 </t> 2757 </t>
2597 <!-- FIXME: In state diagram in finish closing only Elements can be received. What happens if i receive an offer? --> 2758 <!-- FIXME: In state diagram in finish closing only Elements can be received. What happens if i receive an offer? -->
2598 <t>Security considerations for received messages:</t> 2759 <t>Security considerations for received messages:</t>
@@ -2616,14 +2777,14 @@ FUNCTION validate_messages_done(done_msg, offer_msg_store,
2616 <dt><xref target="messages_se" format="title" /></dt> 2777 <dt><xref target="messages_se" format="title" /></dt>
2617 <dd> 2778 <dd>
2618 <t> 2779 <t>
2619 In case the Strata Estimator does not decode the 2780 In case the Strata Estimator does not decode, the
2620 operation MUST be terminated to prevent to get to a unresolvable state. 2781 operation MUST be terminated to prevent to get to a unresolvable state.
2621 <!-- IMPLEMENT: If in expect SE state the strata estimator does not decode terminate the operation --> 2782 <!-- IMPLEMENT: If in expect SE state the strata estimator does not decode terminate the operation -->
2622 The set difference calculated from the strata estimator needs to be plausible, 2783 The set difference calculated from the strata estimator needs to be plausible,
2623 to ensure this multiple factors need to be considered: The absolute plausible maximum of 2784 to ensure this, multiple factors need to be considered: The absolute plausible maximum of
2624 elements in a set which has to be predefined according 2785 elements in a set, which has to be predefined according
2625 to the use case and the maximal plausible element increase since the last 2786 to the use case and the maximal plausible element increase since the last
2626 successful set reconciliation which should be either predefined or can be calculated 2787 successful set reconciliation, which should be either predefined or can be calculated
2627 with the gaussian distribution function over all passed set reconciliations. 2788 with the gaussian distribution function over all passed set reconciliations.
2628 <!-- IMPLEMENT: Terminate if in check expect se state for a max size difference is exceeded --> 2789 <!-- IMPLEMENT: Terminate if in check expect se state for a max size difference is exceeded -->
2629 </t> 2790 </t>
@@ -2673,7 +2834,7 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, local_setsize)
2673 <t> 2834 <t>
2674 The peer in <strong>Full Receiving</strong> state needs to check 2835 The peer in <strong>Full Receiving</strong> state needs to check
2675 that exactly the number of elements is received from the remote peer as 2836 that exactly the number of elements is received from the remote peer as
2676 he initially committed too. If the remote peer transmits less or 2837 he has initially committed too. If the remote peer transmits less or
2677 more elements the operation MUST be terminated. 2838 more elements the operation MUST be terminated.
2678 </t> 2839 </t>
2679 <t> 2840 <t>
@@ -2684,14 +2845,14 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, local_setsize)
2684 <dd> 2845 <dd>
2685 <t> 2846 <t>
2686 When the full done message is received from the remote peer all 2847 When the full done message is received from the remote peer all
2687 elements that the remote peer has committed to needs to be received 2848 elements, that the remote peer has committed to, need to be received,
2688 otherwise the operation MUST be terminated. After receiving the 2849 otherwise the operation MUST be terminated. After receiving the
2689 full done message no future elements should be accepted. 2850 full done message no future elements should be accepted.
2690 <!-- FIXME: Check that after full done in full receiving no elements can be received anymore! Additional state? --> 2851 <!-- FIXME: Check that after full done in full receiving no elements can be received anymore! Additional state? -->
2691 The 512-bit hash of the complete reconciled set contained in 2852 The 512-bit hash of the complete reconciled set contained in
2692 the full done message is required to ensures that both sets are truly identical. If 2853 the full done message is required to ensure that both sets are truly identical. If
2693 the sets differ a resynchronisation is required. The count of possible 2854 the sets differ, a resynchronisation is required. The number of possible
2694 resynchronisation MUST be limited to prevent resource exhaustion attacks. 2855 resynchronisations MUST be limited to prevent resource exhaustion attacks.
2695 </t> 2856 </t>
2696 <t> 2857 <t>
2697 Pseudocode for implementation described in section <xref target="security_states_full_sending" format="title" />. 2858 Pseudocode for implementation described in section <xref target="security_states_full_sending" format="title" />.
@@ -2707,8 +2868,8 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, local_setsize)
2707 <dd> 2868 <dd>
2708 <t> 2869 <t>
2709 In case an IBF message is received by the peer a active/passive role switch 2870 In case an IBF message is received by the peer a active/passive role switch
2710 is initiated by the active decoding remote peer. In this instance the peer should 2871 is initiated by the active decoding remote peer. In this moment the peer should
2711 wait for all open offers and demands to be fulfilled to prevent 2872 wait for all open offers and demands to be fulfilled, to prevent
2712 retransmission before switching into active decoding operation mode. 2873 retransmission before switching into active decoding operation mode.
2713 A switch into active decoding mode should only be permitted for 2874 A switch into active decoding mode should only be permitted for
2714 a predefined number of times as described in the top section 2875 a predefined number of times as described in the top section
@@ -2755,13 +2916,13 @@ FUNCTION validate_messages_ibf(offer_msg_store, demand_msg_store, element_msg_st
2755 <dt><xref target="messages_inquiry" format="title" /></dt> 2916 <dt><xref target="messages_inquiry" format="title" /></dt>
2756 <dd> 2917 <dd>
2757 <t> 2918 <t>
2758 A check needs to be in place that prevents receiving a inquiry 2919 A check needs to be in place that prevents receiving an inquiry
2759 for an element multiple times or more inquiries than are plausible. 2920 for an element multiple times or more inquiries than are plausible.
2760 The amount of inquiries that is plausible can be estimated by considering 2921 The amount of inquiries that is plausible, can be estimated by considering
2761 known values as the remote set size, the local set size, the 2922 known values, as the remote set size, the local set size, the
2762 predefined absolute maximum of elements in the set which is defined 2923 predefined absolute maximum of elements in the set, which is defined
2763 by real world limitations. 2924 by real world limitations.
2764 To implement this restrictions a list with all received inquiries 2925 To implement this restrictions, a list with all received inquiries
2765 should be stored and new inquiries should be checked against. 2926 should be stored and new inquiries should be checked against.
2766 </t> 2927 </t>
2767 <figure anchor="security_states_passive_decoding_inquiry_code"> 2928 <figure anchor="security_states_passive_decoding_inquiry_code">
@@ -2817,7 +2978,7 @@ FUNCTION validate_messages_inquiry(inquiry_msg, set_diff)
2817 <section anchor="security_states_finish_waiting" numbered="true" toc="default"> 2978 <section anchor="security_states_finish_waiting" numbered="true" toc="default">
2818 <name>Finish Waiting</name> 2979 <name>Finish Waiting</name>
2819 <t> 2980 <t>
2820 In case not all sent demands or inquiries have ben answered in time the operation 2981 In case not all sent demands or inquiries have ben answered in time, the operation
2821 has failed and MUST be terminated. 2982 has failed and MUST be terminated.
2822 </t> 2983 </t>
2823 <!-- FIXME: In state diagram in finish closing only Elements can be received. What happens if i receive an offer? --> 2984 <!-- FIXME: In state diagram in finish closing only Elements can be received. What happens if i receive an offer? -->
@@ -3082,6 +3243,15 @@ Type | Name | References | Description
3082 <date year="2011"/> 3243 <date year="2011"/>
3083 </front> 3244 </front>
3084 </reference> 3245 </reference>
3246 <reference anchor="secure_set_reconciliation" target="http://ti.bfh.ch">
3247 <front>
3248 <title>Secure Set Reconciliation</title>
3249 <author initials="E." surname="Summermatter" fullname="Elias Summermatter">
3250 <organization>BFH - Berner Fachhochschule</organization>
3251 </author>
3252 <date year="2021"/>
3253 </front>
3254 </reference>
3085 3255
3086 <!-- <reference anchor="ISO20022"> 3256 <!-- <reference anchor="ISO20022">
3087 <front> 3257 <front>
@@ -3103,34 +3273,82 @@ Type | Name | References | Description
3103 <t> 3273 <t>
3104 INPUTS: 3274 INPUTS:
3105 </t> 3275 </t>
3106 <t> 3276 <figure anchor="test_vectors_map_function_inputs">
3107 key: 0xFFFFFFFFFFFFFFFF (64-bit) 3277 <artwork name="" type="" align="left" alt=""><![CDATA[
3108 number_of_buckets_per_element: 3 3278number_of_buckets_per_element: 3
3109 ibf_size: 300 3279ibf_size: 300
3110 </t> 3280
3281key1: 0xFFFFFFFFFFFFFFFF (64-bit)
3282key2: 0x0000000000000000 (64-bit)
3283key3: 0x00000000FFFFFFFF (64-bit)
3284key4: 0xC662B6298512A22D (64-bit)
3285key5: 0xF20fA7C0AA0585BE (64-bit)
3286 ]]></artwork>
3287 </figure>
3111 <t> 3288 <t>
3112 OUTPUT: 3289 OUTPUT:
3113 </t> 3290 </t>
3114 <t> 3291 <figure anchor="test_vectors_map_function_outputs">
3115 ["222","32","10"] 3292 <artwork name="" type="" align="left" alt=""><![CDATA[
3116 </t> 3293key1: ["122","157","192"]
3294key2: ["85","243","126"]
3295key3: ["208","101","222"]
3296key4: ["239","269","56"]
3297key5: ["150","104","33"]
3298 ]]></artwork>
3299 </figure>
3117 </section> 3300 </section>
3118 <section anchor="test_vectors_id_function" numbered="true" toc="default"> 3301 <section anchor="test_vectors_id_function" numbered="true" toc="default">
3119 <name>ID Calculation Function</name> 3302 <name>ID Calculation Function</name>
3120 <t> 3303 <t>
3121 INPUTS: 3304 INPUTS:
3122 </t> 3305 </t>
3306 <figure anchor="test_vectors_id_function_inputs">
3307 <artwork name="" type="" align="left" alt=""><![CDATA[
3308element 1: 0xFFFFFFFFFFFFFFFF (64-bit)
3309element 2: 0x0000000000000000 (64-bit)
3310element 3: 0x00000000FFFFFFFF (64-bit)
3311element 4: 0xC662B6298512A22D (64-bit)
3312element 5: 0xF20fA7C0AA0585BE (64-bit)
3313 ]]></artwork>
3314 </figure>
3123 <t> 3315 <t>
3124 element: 0xadadadadadadadad 3316 OUTPUT:
3125 ibf_salt 0x3F (6-bit)
3126 </t> 3317 </t>
3318 <figure anchor="test_vectors_id_function_outputs">
3319 <artwork name="" type="" align="left" alt=""><![CDATA[
3320element 1: 0x5AFB177B
3321element 2: 0x64AB557C
3322element 3: 0xCB5DB740
3323element 4: 0x8C6A2BB2
3324element 5: 0x7EC42981
3325 ]]></artwork>
3326 </figure>
3327 </section>
3328 <section anchor="test_counter_compression_function" numbered="true" toc="default">
3329 <name>Counter Compression Function</name>
3127 <t> 3330 <t>
3128 OUTPUT: 3331 INPUTS:
3129 </t> 3332 </t>
3333 <figure anchor="test_counter_compression_function_inputs">
3334 <artwork name="" type="" align="left" alt=""><![CDATA[
3335counter serie 1: [1,8,10,6,2] (min bytes 4)
3336counter serie 2: [26,17,19,15,2,8] (min bytes 5)
3337counter serie 3: [4,2,0,1,3] (min bytes 3)
3338 ]]></artwork>
3339 </figure>
3130 <t> 3340 <t>
3131 0xFFFFFFFFFFFFFFFF 3341 OUTPUT:
3132 </t> 3342 </t>
3343 <figure anchor="test_counter_compression_function_outputs">
3344 <artwork name="" type="" align="left" alt=""><![CDATA[
3345counter serie 1: 0x18A62
3346counter serie 2: 0x3519BC48
3347counter serie 3: 0x440B
3348 ]]></artwork>
3349 </figure>
3133 </section> 3350 </section>
3134 </section> 3351 </section>
3135 </back> 3352 </back>
3136</rfc> 3353</rfc>
3354