Improved RFC - lsd0003 - LSD0003: Set Union

commit 7c6bef140c96746cb3be4e2b2907cc6fee52b70e
parent 64b92f47a1ee2fd808e22a075d984dd77aab0493
Author: Elias Summermatter <elias.summermatter@seccom.ch>
Date:   Tue, 11 May 2021 08:58:28 +0200

Improved RFC

Diffstat:
M draft-summermatter-set-union.xml  | 590 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------

1 file changed, 404 insertions(+), 186 deletions(-)
diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml
@@ -106,7 +106,7 @@
             </t>
             <t>
               The protocol described in this report is generic and
-              suitable for a wide range of applicaitons. As a result,
+              suitable for a wide range of applications. As a result,
               the internal structure of the elements in the sets must
               be defined and verified by the application using the
               protocol.  This document thus does not cover the element
@@ -121,9 +121,9 @@
               using the protocol must provide a parameter that specifies
               the cost-ratio of round-trips vs. bandwidth usage.  Given
               this trade-off factor, the protocol will then choose parameters
-              that minimize the total execution cost.  In particular, there
-              is one major choice to be made, which is between sending the
-              full set of elements, or just sending the elements that differ.
+              that minimize the total execution costs.  In particular, there
+              is one major choice to be made, namely between sending the
+              complete set of elements, or sending only the elements that differ.
               In the latter case, our design is basically a concrete
               implementation of a proposal by Eppstein.<xref target="Eppstein" format="default" />
             </t>
@@ -152,7 +152,7 @@
             <t>
               To defend against some of these attacks, applications
               need to remember the number of elements previously
-              shared with a peer, and offer a means to check that
+              shared with a peer, and provide a way to check that
               elements are well-formed. Applications may also be able
               to provide an upper bound on the total number of valid
               elements that may exist. For example, in E-voting, the
@@ -179,14 +179,14 @@
             <section anchor="bf" numbered="true" toc="default">
                 <name>Bloom Filters</name>
                 <t>
-                    A Bloom filter (BF) is a space-efficient datastructure to test if am element is part of a set of elements.
+                    A Bloom filter (BF) is a space-efficient datastructure to test if an element is part of a set of elements.
                     Elements are identified by an element ID.
                     Since a BF is a probabilistic datastructure, it is possible to have false-positives: when asked
                     if an element is in the set, the answer from a BF is either "no" or "maybe".
                 </t>
                 <t>
                     A BF consists of L buckets. Every bucket is a binary value that can be either 0 or 1. All buckets are initialized
-                    to 0.  A mapping function M is used to map each the ID of each element from the set to a subset of k buckets.  M is non-injective
+                    to 0.  A mapping function M is used to map each ID of each element from the set to a subset of k buckets.  M is non-injective
                     and can thus map the same element multiple times to the same bucket.
                     The type of the mapping function can thus be described by the following mathematical notation:
                 </t>
@@ -214,7 +214,7 @@
                     To check if an element may be in the set, one tests if all buckets under the map M are set to 1.
                 </t>
                 <t>
-                    Further in this document a bitstream outputted by the mapping function is represented by
+                    Further in this document a bitstream output by the mapping function is represented by
                     a set of numeric values for example (0101) = (2,4).
                     In the BF the buckets are set to 1 if the corresponding bit in the bitstream is 1.
                     If there is a collision and a bucket is already set to 1, the bucket stays 1.
@@ -231,8 +231,8 @@
                      ]]></artwork>
                     </figure>
                 <t>
-                    Is easy to see that the M(element) = (0,3) could be in the BF bellow and M(element) = (0,2) can't be
-                    in the BF bellow:
+                    Is easy to see that the M(element) = (0,3) could be in the BF below and M(element) = (0,2) can't be
+                    in the BF below:
                 </t>
 
                 <figure anchor="figure_bf_contains">
@@ -259,7 +259,7 @@
                 <name>Counting Bloom Filter</name>
                 <t>
                   A Counting Bloom Filter (CBF) is an extension of the <xref target="bf" format="title" />. In the CBF, buckets are
-                  unsigned numbers instead of binary values.  This allows the removal of an elements from the CBF.
+                  unsigned numbers instead of binary values.  This allows the removal of an element from the CBF.
                 </t>
                 <t>
                   Adding an element to the CBF is similar to the adding operation of the BF. However, instead of setting the bucket on hit to 1 the
@@ -294,7 +294,7 @@
                 </figure>
                 <t>
                   In practice, the number of bits available for the counters is usually finite. For example, given a 4-bit
-                  counter, a CBF bucket would overflow once 16 elements are mapped to the same bucket. To efficiently
+                  counter, a CBF bucket would overflow 16 elements are mapped to the same bucket. To efficiently
                   handle this case, the maximum value (15 in our example) is considered to represent "infinity". Once the
                   order of a bucket reaches "infinity", it is no longer incremented or decremented.
                 </t>
@@ -328,7 +328,7 @@
                     including the CPU architecture.
                 </t>
                 <t>
-                    If the IBF size is to small or the mapping
+                    If the IBF size is too small or the mapping
                     function does not spread out the elements
                     uniformly, the signed counter can overflow or
                     underflow. As with the CBF, the "maximum" value is
@@ -456,11 +456,11 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   0x4242    |
                       Note that it is possible to "remove" elements from an IBF that were never present
                       in the IBF in the first place. A negative counter value is thus indicative of
                       elements that were removed without having been added.  Note that an IBF bucket
-                      counter of zero no longer warrants that an element mapped to that bucket is not
+                      counter of zero no longer guarantees that an element mapped to that bucket is not
                       present in the set: a bucket with a counter of zero can be the result of one
                       element being added and a different element (mapped to the same bucket) being removed.
                       To check that an element is not present requires a counter of zero and an
-                      IDSUM and HASHSUM of zero --- and some assurance that there was no collision due
+                      IDSUM and HASHSUM of zero --- and some certainty that there was no collision due
                       to the limited number of bits in IDSUM and HASHSUM.  Thus,
                       IBFs are not suitable to replace BFs or IBFs.
                     </t>
@@ -504,7 +504,7 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   0x4242    |
                         a different mapping function M.
                         A more common scenario (especially if L was too small) is that
                         IBF decoding fails because there is no pure bucket. In this case, the
-                        higher-level protocol also should retry using different parameters.
+                        higher-level protocol should also retry using different parameters.
                     </t>
                     <t>
                         Suppose the IBF contains a pure bucket. In this case, the IDSUM in the
@@ -513,9 +513,9 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   0x4242    |
                         was negative, and by removing it if the counter was positive). This
                         is likely to cause other buckets to become pure, allowing further
                         elements to be decoded.  Eventually, decoding should succeed with
-                        all counters and IDSUM and HASHSUM values reaching zero. However, it is also
+                        all counters and IDSUM and HASHSUM values reach zero. However, it is also
                         possible that an IBF only partly decodes and then decoding fails after
-                        yielding some elements.
+                        obtaining some elements.
                     </t>
                     <t>
                         In the following example the successful decoding of an IBF containing
@@ -554,7 +554,7 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   0x4242    |
                     </figure>
                     <t>
                         In the IBF only pure buckets are left, we choose to continue decoding bucket 2 and decode element
-                        with the hash 0x4242. Now the IBF is empty (all buckets have count 0) that means the IBF has successfully
+                        with the hash 0x4242. Now the IBF is empty (all buckets have count 0) that means the IBF has been successfully
                         decoded.
                     </t>
                     <figure anchor="figure_ibf_decode_2">
@@ -577,7 +577,7 @@ hashSum |    0x0000   |    0x0000   |    0x0000   |    0x0000   |
                         Given addition and removal as defined above, it is possible to define an operation on IBFs
                         that computes an IBF representing the set difference.  Suppose IBF1 represents set A, and
                         IBF2 represents set B.  Then this set difference operation will compute IBF3 which
-                        represents the set A - B --- without needing elements from set A or B.
+                        represents the set A - B --- without having to transfer the elements from set A or B.
 
                         To calculate the IBF representing this set difference, both IBFs must have the same
                         length L, the same number of buckets per element k and use the same map M.  Given this,
@@ -635,7 +635,7 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    0x0000   |
         +-------------+-------------+-------------+-------------+
                  ]]></artwork>
                 </figure>
-                <t>After calculating and decoding the IBF-AB its clear that in IBF-A the element with the hash 0x5050
+                <t>After calculating and decoding the IBF-AB shows clear that in IBF-A the element with the hash 0x5050
                     is missing (-1 in bit-3) while in IBF-B the element with the hash 0101 is missing
                     (1 in bit-1 and bit-2). The element with hash 0x4242 is present in IBF-A and IBF-B and is
                     removed by the set difference operation (bit-4).
@@ -648,8 +648,8 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    0x0000   |
                 <t>
                     To facilitate a reasonably CPU-efficient
                     implementation, this specification requires the
-                    IBF counter to always use 8 bits.  Fewer bits
-                    would result in a paritcularly inefficient
+                    IBF counter always to use 8 bits.  Fewer bits
+                    would result in a particularly inefficient
                     implementation, while more bits are rarely useful
                     as sets with so many elements should likely be
                     represented using a larger number of buckets. This
@@ -672,17 +672,17 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    0x0000   |
                 </t>
                 <t>
                     For the "HASHSUM", we always use a 32-bit
-                    representation.  Here, it is mostly important to
+                    representation.  Here, it is most important to
                     avoid collisions, where different elements are
                     mapped to the same hash.  However, we note that
                     by design only a few elements (certainly less than
                     127) should ever be mapped
-                    to the same bucket, so a small number of bits
+                    to the same bucket, a small number of bits
                     should suffice.  Furthermore, our protocol is designed
                     to handle occasional collisions, so while with
                     32-bits there remains a chance of
                     accidental collisions, at 32 bit the chance is
-                    generally believed to be sufficiently small enough
+                    generally believed to be sufficiently small 
                     for the protocol to handle those cases efficiently
                     for a wide range of use-cases.  Smaller hash
                     values would safe bandwidth, but also drastically
@@ -697,7 +697,7 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    0x0000   |
                         with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF and salt is set to the unsigned 64-bit equivalent of 0.
                         The output is then truncated to 64-bit.
                         Its important that the elements can be redistributed over the buckets in case the IBF does not
-                        decode, that's why the ID is salted with a random salt given in the SALT field of this message.
+                        decode. That's why the ID is salted with a random salt given in the SALT field of this message.
                         Salting is done by calculation the a random salt modulo 64 (using only the lowest 6-bits of the salt)
                         and do a bitwise right rotation of output of KDF by the 6-bit salts numeric representation.
                     </t>
@@ -743,22 +743,22 @@ FUNCTION id_calculation (element,ibf_salt):
                     <t>
                         The mapping function M as described above in the figure <xref target="bf_mapping_function_math" format="default" />
                         decides in which buckets the ID and HASH have to be binary XORed to. In practice
-                        there the following algorithm is used:
+                        the following algorithm is used:
                     </t>
                     <t>
                         The first index is simply the HASH modulo the IBF size. The second
                         index is calculated by creating a new 64-bit value by shifting the 32-bit
                         value left and setting the lower 32-bit to the number of indexes already processed. From the
-                        resulting 64-bit value a CRC32 checksum is created the second index is now the modulo of the
-                        CRC32 output this is repeated until the predefined amount indexes is generated.
+                        resulting 64-bit value a CRC32 checksum is created. The second index is now the modulo of the
+                        CRC32 output, this is repeated until the predefined amount of indexes is generated.
                         In the case a index is hit twice, which would mean this bucket could not get pure again,
-                        the second hit is just skipped and the next iteration is used as.
+                        the second hit is just skipped and the next iteration is used.
                     </t>
                     <figure anchor="ibf_format_bucket_identification_pseudo_code">
                         <artwork name="" type="" align="left" alt=""><![CDATA[
 # INPUTS:
 # key: Is the ID of the element calculated in the id_calculation function above.
-# number_of_buckets_per_element: Pre-defined count of buckets elements are inserted into
+# number_of_buckets_per_element: Pre-defined numbers of buckets elements are inserted into
 # ibf_size: the size of the ibf (count of buckets)
 # OUTPUT:
 # dst: Array with bucket IDs to insert ID and HASH
@@ -807,7 +807,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
             <section anchor="se_description" numbered="true" toc="default">
                 <name>Description</name>
                 <t>
-                    Strata Estimators help estimate the size of the set difference between two set of elements.
+                    Strata Estimators help estimate the size of the set difference between two sets of elements.
                     This is necessary to efficiently determinate the tuning parameters for an IBF, in particular
                     a good value for L.
                 </t>
@@ -845,7 +845,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
             <t>
                 The simplest mode is the "full" synchronization mode. The idea is that if the difference between the sets of the two
                 peers exceeds a certain threshold, the overhead to determine which elements are different outweighs
-                the overhead of sending the complete set. In this case, the most efficient method can be to just
+                the overhead of sending the complete set. In this case, the most efficient method can be just to
                 exchange the full sets.
             </t>
             <t>
@@ -854,7 +854,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
             </t>
             <t>
                 The second possibility is that the difference of the sets is small compared to the set size.
-                Here, an efficient "delta" synchronization mode is more efficient. Given these two possibilities,
+                In this case, an efficient "delta" synchronization mode is more efficient. These two possibilities given,
                 the first steps of the protocol are used to determine which mode should be used.
             </t>
 
@@ -920,7 +920,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <em><xref target="messages_ibf" format="title" /></em> message from
                     the initiating peer and transitions into the <strong>Expecting IBF Last</strong> state when there
                     are multiple <em><xref target="messages_ibf" format="title" /></em> messages to sent,
-                    when there is just a single <em><xref target="messages_ibf" format="title" /></em> message the reviving peer
+                    when there is just a single <em><xref target="messages_ibf" format="title" /></em> message the receiving peer
                     transitions directly to the <strong>Active Decoding</strong> state.
                 </t>
                 <t>
@@ -975,8 +975,8 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                                 When a new element message has been received the peer checks if a corresponding
                                 <em><xref target="messages_demand" format="title" /></em> for the element has been sent
                                 and the demand is still unsatisfied.
-                                If the element has been demanded the peer checks the element for validity, removed it from the list
-                                of pending demands and then then saves the element to the the set otherwise the peer
+                                If the element has been demanded the peer checks the element for validity, removes it from the list
+                                of pending demands and then saves the element to the set otherwise the peer
                                 rejects the element.
                             </dd>
                             <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt>
@@ -990,7 +990,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                             <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt>
                             <dd>
                                 If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this
-                                indicates that the there is just one IBF slice and a direct state and role transition from
+                                indicates that there is just one IBF slice left and a direct state and role transition from
                                 <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated.
                             </dd>
                             <dt><em><xref target="messages_done" format="title" /></em> message:</dt>
@@ -1017,7 +1017,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                             sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer.
                         </t>
                         <t>
-                            In case the IBF does not successfully decode anymore, the active peer sends a new IBF to the passive client
+                            In case the IBF does not successfully decode anymore, the active peer sends a new IBF to the passive peer
                             and changes into <strong>Passive Decoding</strong> state. This initiates a role swap.
                             To reduce overhead and prevent double transmission of offers and elements the new IBF is created
                             on the new complete set after all demands and inquiries have been satisfied.
@@ -1039,8 +1039,8 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                                 the active peer. If a Inquiry has been sent and <!-- FIXME: is this indeed a condition that is checked? -->
                                 the offered element is missing in the active peers set,
                                 the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the
-                                passive peer. The send demand need to be added to a list with unsatisfied demands.
-                                In the case the received offer is for an element that is already in the set of the peer the offer is ignored.
+                                passive peer. The sent demand needs to be added to a list with unsatisfied demands.
+                                In case the received offer is for an element that is already in the set of the peer the offer is ignored.
                                 <!-- FIXME: what happens if the offer is for an element that is not missing? I think then we just ignore it, right? -->
                             </dd>
                             <dt><em><xref target="messages_demand" format="title" /></em> message:</dt>
@@ -1051,9 +1051,9 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                                 <em><xref target="messages_elements" format="title" /></em> message if a offer request
                                 for the element has been sent.
                                 <!-- IMPLEMENT: This is not implemented in code // Change -->
-                                In the case the demanded element does not exist in the
-                                set there was probably a bucket decoded that was not really pure so potentially all <em><xref target="messages_offer" format="title" /></em>
-                                and <em><xref target="messages_demand" format="title" /></em> messages sent after are invalid
+                                In case the demanded element does not exist in the
+                                set there was probably a bucket decoded that was not pure so potentially all <em><xref target="messages_offer" format="title" /></em>
+                                and <em><xref target="messages_demand" format="title" /></em> messages sent later are invalid
                                 in this case a role change active -> passive with a new IBF is easiest.
                                 If a demand for the same element is received multiple times the demands should be
                                 discarded.
@@ -1062,18 +1062,18 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                             </dd>
                             <dt><em><xref target="messages_elements" format="title" /></em> message:</dt>
                             <dd>
-                                A element that is received is marked in the list of demanded elements as satisfied, validated and
-                                saved and not further action is taken.
+                                An element that is received is marked in the list of demanded elements as satisfied, validated and
+                                saved and no further action is taken.
                                 Elements that are not demanded or already known are discarded.
                             </dd>
                             <dt><em><xref target="messages_done" format="title" /></em> message:</dt>
                             <dd>
                                 Receiving the message <em><xref target="messages_done" format="title" /></em> indicates
                                 that all demands of the passive peer have been satisfied. The active peer then changes into the
-                                state <strong>Finish Closing</strong> state.
+                               <strong>Finish Closing</strong> state.
                                 <!-- IMPLEMENT: This is not implemented in code // Change -->
-                                If the IBF is not finished decoding and the <em><xref target="messages_done" format="title" /></em>
-                                is received the other peer is not in compliance with the protocol and the set reconciliation MUST be aborted.
+                                If the IBF has not finished decoding and the <em><xref target="messages_done" format="title" /></em>
+                                is received, the other peer is not in compliance with the protocol and the set reconciliation MUST be aborted.
                                 <!-- IMPLEMENT: This is not implemented in code // Change -->
                             </dd>
                         </dl>
@@ -1090,7 +1090,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dd>
                         <t>
                             In this states the peers are waiting for all demands to be satisfied and for the synchronisation
-                            to be completed. When all demands are satisfied the peer changes into state <strong>Finished</strong>.
+                            to be completed. When all demands are satisfied the peer changes into  <strong>Finished</strong>state.
                         </t>
                     </dd>
                 </dl>
@@ -1104,7 +1104,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                 </t>
                 <t>
                     The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set
-                    differences or if the byte-size of the elements is large. Is the set difference is estimated to be large
+                    differences or if the byte-size of the elements is large. If the set difference is estimated to be large
                     the <xref target="modeofoperation_full-sync" format="title" /> is
                     more efficient. The exact heuristics and parameters on which the protocol decides which mode
                     should be used are described in the <xref target="performance" format="title" /> section of this document.
@@ -1114,7 +1114,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     is always used.
                     The first case is when one of the peers announces having an empty set. This is announced by setting
                     the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0.
-                    The second case is if the application requested full synchronization explicitly.
+                    The second case is if the application requests full synchronization explicitly.
                     This is useful for testing and should not be used in production.
                 </t>
                 <!--
@@ -1169,7 +1169,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1209,7 +1209,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         The <em>IBF</em> message is sent at the start of the protocol from the initiating peer in the transaction
                         between <strong>Expect SE</strong> -> <strong>Expecting IBF Last</strong> or when the IBF does not
                         decode and there is a role change in the transition between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>.
-                        This message is only sent if there are more than one IBF slice to sent, in the case there is just
+                        This message is only sent if there are more than one IBF slice to be sent, in case there is just
                         one slice the <xref target="messages_ibf_last" format="title" /> message is sent.
                     </t>
                 </section>
@@ -1219,12 +1219,12 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         <artwork name="" type="" align="left" alt=""><![CDATA[
         0     8     16    24    32    40    48    56
         +-----+-----+-----+-----+-----+-----+-----+-----+
-        |  MSG SIZE |  MSG TYPE |ORDER|       PAD       |
+        |  MSG SIZE |  MSG TYPE |         SIZE          |
         +-----+-----+-----+-----+-----+-----+-----+-----+
-        |         OFFSET        |          SALT         |
+        |IMCS |         OFFSET        |          SALT   |
         +-----+-----+-----+-----+-----+-----+-----+-----+
-        |                  IBF-SLICE
-        +                                               /
+              |                  IBF-SLICE
+        +-----                                          /
         /                                               /
         /                                               /
                  ]]></artwork>
@@ -1233,21 +1233,25 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte orderwhichdescribes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
                             the type of SETU_P2P_REQUEST_IBF as registered in <xref target="gana" format="title" /> in network byte order.
                         </dd>
-                        <dt>ORDER</dt>
+                        
+                        <dt>SIZE</dt>
                         <dd>
-                            is a 8-bit unsigned integer which signals the order of the IBF. The order of the IBF
-                            is defined as the logarithm of the number of buckets of the IBF.
+                            is a 32-bit unsigned integer which signals the number of buckets in the IBF.
                         </dd>
-                        <dt>PAD</dt>
+                        <dt>IMCS</dt>
                         <dd>
-                            is 24-bit always set to zero
+                            IBF max counter size is a 8-bit unsigned integer which describes the number of bit that
+                            is required to store a single counter. This is used for the unpacking function as described
+                            in the <xref target="performance_counter_variable_size" format="title" /> section.
                         </dd>
+
+                    
                         <dt>OFFSET</dt>
                         <dd>
                             is a 32-bit unsigned integer which signals the offset to the following ibf slices in the original.
@@ -1259,16 +1263,18 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         </dd>
                         <dt>IBF-SLICE</dt>
                         <dd>
+                            
                             <t>
-                                are variable count of slices in an array. A single slice contains out multiple 64-bit IDSUMS,
-                                32-bit HASHSUMS and 8-bit COUNTERS. In the network order the array of IDSUMS is first, followed
+                                are variable numbers of slices in an array. A single slice contains multiple 64-bit IDSUMS,
+                                32-bit HASHSUMS and 1-64bit COUNTERS of variable size. In the network order the array of IDSUMS is first, followed
                                 by an array of HASHSUMS and ended with an array of COUNTERS. Length of the array is defined
-                                by MIN( 2^ORDER - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
+                                by MIN( SIZE - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
                                 32768 divided by the BUCKET_SIZE which is 13-byte (104-bit).
                             </t>
+                            
                             <t>
-                                To get the IDSUM field, all IDs who hit a bucket are added up with a binary XOR operation.
-                                See <xref target="ibf_format_id_generation" format="title" /> for details about ID generation.
+                                To get the IDSUM field, all IDs hitting a bucket are added up with a binary XOR operation.
+                                See <xref target="ibf_format_id_generation" format="title" /> details about ID generation.
                             </t>
                             <t>
                                 The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field:
@@ -1279,7 +1285,11 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                                 The algorithm to find the correct bucket in which the ID and the HASH have to be added
                                 is described in detail in section <xref target="ibf_format_bucket_identification" format="title" />.
                             </t>
-
+                            
+                            <t>
+                                Test vectors for an implementation can be found in the <xref target="test_vectors" format="title" /> section
+                            </t>
+                            
                             <!--
                             FIXME: this is not sufficiently precise! How are the element IDs (and IDSUMS) computed?
                             How are the HASHes (and HASHSUMS) computed? Which byte order is used? What role does
@@ -1296,12 +1306,13 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
         +-----+-----+-----+-----+-----+-----+-----+-----+
         |                    IDSUMS                     |
         +-----+-----+-----+-----+-----+-----+-----+-----+
-        |         HASHSUMS      |        HASHSUMS       |
+        |        HASHSUMS       |        HASHSUMS       |
         +-----+-----+-----+-----+-----+-----+-----+-----+
-        |        COUNTERS       |       COUNTERS        |
+        |        COUNTERS*      |        COUNTERS*      |
         +-----+-----+-----+-----+-----+-----+-----+-----+
         /                                               /
         /                                               /
+* Counter size is variable. In this example the size is 32-bit (4-byte)
                  ]]></artwork>
                     </figure>
                 </section>
@@ -1313,7 +1324,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                 <section anchor="messages_ibf_last_description" numbered="true" toc="default">
                     <name>Description</name>
                     <t>
-                        This message indicates to the remote peer that all slices of the bloom filter have been sent.
+                        This message indicates the remote peer that all slices of the bloom filter have been sent.
                         The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of
                         the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE"
                         which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST".
@@ -1343,7 +1354,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         as answer to a <em><xref target="messages_demand" format="title" /></em> message from the remote peer.
                         The Element message can also be received in the <strong>Finish Closing</strong> or <strong>Finish Waiting</strong>
                         state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer, in this
-                        case the client changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied.
+                        case the peer changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied.
                     </t>
                     <t>
                         This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />.
@@ -1367,7 +1378,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1375,7 +1386,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         </dd>
                         <dt>E TYPE</dt>
                         <dd>
-                            element type is a 16-bit unsigned integer witch defines the element type for
+                            element type is a 16-bit unsigned integer which defines the element type for
                             the application.
                         </dd>
                         <dt>PADDING</dt>
@@ -1384,7 +1395,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         </dd>
                         <dt>E SIZE</dt>
                         <dd>
-                            element size is 16-bit unsigned integer that signals the size of the elements data part.
+                            element size is a 16-bit unsigned integer that signals the size of the elements data part.
                         </dd>
                         <dt>AE TYPE</dt>
                         <dd>
@@ -1408,7 +1419,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         The offer message is an answer to an <em><xref target="messages_inquiry" format="title" /></em> message
                         and transmits the full hash of an element that has been requested by the other peer.
                         This full hash enables the other peer to check if the element is really missing in its set and
-                        eventually sends a <em><xref target="messages_demand" format="title" /></em> message for that a element.
+                        eventually sends a <em><xref target="messages_demand" format="title" /></em> message for that element.
                     </t>
                     <t>
                         The offer is sent and received only in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
@@ -1434,7 +1445,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1482,7 +1493,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1503,7 +1514,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <name>Description</name>
                     <t>
                         The demand message is sent in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
-                        state. It is a answer to a received <em><xref target="messages_offer" format="title" /></em> message
+                        state. It is an answer to a received <em><xref target="messages_offer" format="title" /></em> message
                         and is sent if the element described in the <em><xref target="messages_offer" format="title" /></em> message
                         is missing in the peers set. In the normal workflow the answer to the demand message is an
                         <em><xref target="messages_elements" format="title" /></em> message.
@@ -1528,7 +1539,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and the header is included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1576,7 +1587,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1599,7 +1610,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <name>Description</name>
                     <t>
                         The full done message is sent in the <xref target="modeofoperation_full-sync" format="title" />
-                        to signal that all remaining elements of the set have been sent. The message is received and sent in in the
+                        to signal that all remaining elements of the set have been sent. The message is received and sent in the
                         <strong>Full Sending</strong> and in the <strong>Full Receiving</strong> state. When the full done message is received
                         in <strong>Full Sending</strong> state the peer changes directly into <strong>Finished</strong> state. In
                         <strong>Full Receiving</strong> state receiving a full done message initiates the sending of
@@ -1622,7 +1633,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1668,7 +1679,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1684,7 +1695,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                 <section anchor="messages_se_description" numbered="true" toc="default">
                     <name>Description</name>
                     <t>
-                        The strata estimator is sent by the receiving peer at the start of the protocol right after the
+                        The strata estimator is sent by the receiving peer at the start of the protocol, right after the
                         <xref target="messages_operation_request" format="title" /> message has been received.
                     </t>
                     <t>
@@ -1714,7 +1725,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1743,7 +1754,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     </t>
                     <t>
                         Since the content of the message is the same as the uncompressed Strata Estimator, the details
-                        aren't repeated here for details see section <xref target="messages_se" format="counter" />.
+                        are not repeated here for details see section <xref target="messages_se" format="counter" />.
                     </t>
                 </section>
             </section>
@@ -1765,7 +1776,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         <strong>Full Receiving</strong> state.
                     </t>
                     <t>
-                        After the last full element messages has been sent the <xref target="messages_full_done" format="title" /> message
+                        After the last full element message has been sent the <xref target="messages_full_done" format="title" /> message
                         is sent to conclude the full synchronisation of the element sending peer.
                     </t>
                 </section>
@@ -1787,7 +1798,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                     <dl>
                         <dt>MSG SIZE</dt>
                         <dd>
-                            is 16-bit unsigned integer in network byte order witch describes the message size in bytes and the header is included.
+                            is a 16-bit unsigned integer in network byte order which describes the message size in bytes and header included.
                         </dd>
                         <dt>MSG TYPE</dt>
                         <dd>
@@ -1795,7 +1806,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         </dd>
                         <dt>E TYPE</dt>
                         <dd>
-                            element type is a 16-bit unsigned integer witch defines the element type for
+                            element type is a 16-bit unsigned integer which defines the element type for
                             the application.
                         </dd>
                         <dt>PADDING</dt>
@@ -1804,7 +1815,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         </dd>
                         <dt>E SIZE</dt>
                         <dd>
-                            element size is 16-bit unsigned integer that signals the size of the elements data part.
+                            element size is a 16-bit unsigned integer that signals the size of the elements data part.
                         </dd>
                         <dt>AE TYPE</dt>
                         <dd>
@@ -1831,7 +1842,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, ibf_size)
                         The decision which mode of operations is used is described by the following code:
                     </t>
                     <t>
-                        The function takes as input the initial the local setsize, the remote setsize, the
+                        The function takes as input the initial local setsize, the remote setsize, the
                         by the strata estimator calculated difference, a static boolean that enforces full
                         synchronisation mode of operation and the bandwith/roundtrips tradeoff.
                         As output the function returns "FULL" if the full synchronisation
@@ -1862,9 +1873,9 @@ FUNCTION decide_operation_mode(initial_local_setsize, remote_setsize, set_diff, 
                     </figure>
                 </section>
                 <section anchor="performance_formulas_full_sending_dec_first_send" numbered="true" toc="default">
-                    <name>Full Synchronisation: Decision witch peer sends elements first</name>
+                    <name>Full Synchronisation: Decision which peer sends elements first</name>
                     <t>
-                        The following function determinate which peer starts sending its full set in full synchronisation
+                        The following function determinates which peer starts sending its full set in full synchronisation
                         mode of operation.
                     </t>
                     <t>
@@ -1892,11 +1903,11 @@ FUNCTION decide_full_sending(initial_local_size, remote_setsize)
                 <section anchor="performance_formula_ibf_parameters" numbered="true" toc="default">
                     <name>IBF Parameters</name>
                     <t>
-                        The following function calculate the required parameter to create an optimal sized IBF. These
-                        parameter are the number of buckets and the number of buckets a single element is mapped to.
+                        The following function calculates the required parameter to create an optimal sized IBF. These
+                        parameters are the number of buckets and the number of buckets a single element is mapped to.
                     </t>
                     <t>
-                        The function takes as input the setsize and returns a array with two numbers the total number of buckets
+                        The function takes as input the setsize and returns an array with two numbers, the total number of buckets
                         and the number of buckets a single element is mapped to.
                     </t>
                     <figure anchor="performance_formula_ibf_parameters_code">
@@ -1907,14 +1918,163 @@ FUNCTION decide_full_sending(initial_local_size, remote_setsize)
 # returns: Array: first element total nr ob buckets and
 #                 second element number of buckets per element
 
-FUNCTION calculate_ibf_params (setsize):
+FUNCTION calculate_ibf_params (set_diff):
     number_of_bucket_per_element = 4
-    total_number_of_buckets = setsize
+    total_number_of_buckets = set_diff
     return [ total_number_of_buckets, number_of_bucket_per_element ]
                         ]]></artwork>
                     </figure>
                 </section>
             </section>
+
+            <section anchor="performance_counter_variable_size" numbered="true" toc="default">
+                <name>Counter variable size</name>
+                <t>
+                    Since the optimal number of bytes a counter in the IBF contains is very variable and varies
+                    due to different parameters. Details are described in the BSC thesis
+                    by Elias Summermatter, BFH 2021.
+                    <!-- TODO link Thesis -->
+                    Therefore a compression algorithm has been implemented, which always creates the IBF counter in optimal size.
+                    and thus minimizes the bandwidth needed to transmit the IBF.
+                </t>
+                <t>
+                    A simple algorithm is used for the compression. In a first step it is determined, which is the largest counter
+                    and how many bits are needed to store it. In a second step for every counter of every bucket the counter
+                    is stored in the bits determined in the first step and these are concatenated.
+                </t>
+                <t>
+                    Three individual functions are used for this purpose. The first one is a function that iterates over each bucket of the
+                    bucket of the IBF to get the maximum counter in the IBF. As second it needs
+                    a function that compresses the counter of the IBF and thirdly a function that decompresses the IBF.
+                </t>
+                <figure anchor="performance_counter_variable_size_code">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+
+# INPUTS:
+# ibf: The IBF
+# OUTPUTS:
+# returns: minimal amount of bytes required to store the counter
+
+FUNCTION ibf_get_max_counter(ibf)
+    max_counter=0
+    FOR bucket IN ibf
+        IF bucket.counter > max_counter
+            max_counter = bucket.counter
+
+    RETURN CEILING( log2 ( max_counter ) ) # next bigger discrete number of the binary logarithm of the max counter
+
+# INPUTS:
+# ibf: The IBF
+# offset: The offset which defines the starting point from which bucket the compress operation should start
+# count: The number of buckets to in the array that should be compressed
+# OUTPUTS:
+# returns: An byte array of compressed counters to send over the network
+
+FUNCTION pack_counter(ibf, offset, count)
+    counter_bytes = ibf_get_max_counter(ibf)
+    store = 0
+    store_bits = 0
+    byte_ctr = 0
+    buffer=[]
+
+    FOR bucket IN ibf[offset] to ibf[count]
+        byte_len = counter_bytes
+        counter = bucket.counter
+
+        WHILE byte_len > 0
+            byte_to_write = 0
+
+            IF counter_bytes + store_bits >= 8
+                bit_to_shift=0
+
+                IF store_bits > 0 OR counter_bytes > 8
+                    bit_free = 8 - store_bits
+                    bit_to_shift = counter_bytes - bit_free
+                    store = store << bit_free
+
+                byte_to_write = (( counter >> bit_to_shift) | store) & 0xFF
+                bit_to_shift -= 8 - store_bits
+                counter = counter & (( 1 << counter_bytes ) - 1)
+                store = 0
+                store_bits = 0
+
+            ELSE
+                IF 0 == store_bits
+                    store = counter
+                ELSE
+                    store = (store << counter_bytes) | counter
+
+                store_bits = store_bits + byte_len
+                byte_len = 0
+                BREAK
+
+            buffer[byte_ctr] = byte_to_write
+            byte_ctr++
+        NEXT_BUCKET
+
+    # Write the last partial compressed byte to the buffer
+    buffer[byte_ctr] = store << (8 - store_bits)
+    byte_ctr++
+
+    RETURN buffer
+
+# INPUTS:
+# ibf: The IBF
+# offset: The offset which defines the starting point from which bucket the compress operation should start
+# count: The number of buckets to in the array that should be compressed
+# counter_bit_length: The bit length of the counter can be found in the ibf message in the ibf_counter_bit_length field
+# packed_data: A byte array which contains the data packed with the pack_counter function
+# OUTPUTS:
+# returns: Nothing because the unpacked counter is saved directly into the IBF
+
+FUNCTION unpack_counter(ibf, offset, count, counter_bit_length, packed_data)
+    store = 0
+    store_bits = 0
+    byte_ctr = 0
+    ibf_bucket_ctr = 0
+
+    number_bytes_read = CEILING((count * counter_bit_length) / 8)
+
+    WHILE ibf_bucket_ctr <= (count -1)
+        byte_to_read = packed_data[byte_ctr]
+        byte_ctr++
+        bit_to_pack_left = 8
+
+        WHILE bit_to_pack_left >= 0
+
+            # Prevent packet from reading more than required
+            IF ibf_bucket_ctr > (count -1)
+                return
+
+            IF  ( store_bits + bit_to_pack_left ) >= counter_bit_length
+                bit_use = counter_bit_length - store_bits
+
+                IF store_bits > 0
+                    store = store << bit_use
+
+                bytes_to_shift = bit_to_pack_left - bit_use
+                counter_partial = byte_to_read >> bytes_to_shift
+                store = store | counter_partial
+                ibf.counter[ibf_bucket_ctr] = store
+                byte_to_read = byte_to_read & (( 1 << bytes_to_shift ) - 1)
+
+                bit_to_pack_left -= bit_use
+                ibf_bucket_ctr++
+                store = 0
+                store_bits = 0
+
+            ELSE
+                store_bits += bit_to_pack_left
+
+                IF 0 == store_bits
+                    store = byte_to_read
+                ELSE
+                    store = store << bit_to_pack_left
+                    store = store | byte_to_read
+                BREAK
+                                    ]]></artwork>
+                </figure>
+            </section>
         </section>
 
         <!--
@@ -1931,19 +2091,19 @@ FUNCTION calculate_ibf_params (setsize):
 
         <t>
             The security considerations in this document focus mainly on the security
-            goal of availability, the primary goal of the protocol is prevent an attacker from
+            goal of availability. The primary goal of the protocol is to prevent an attacker from
             wasting cpu and network resources of the attacked peer.
         </t>
         <t>
-            To prevent denial of service attacks its vital to check that peers can only
-            reconcile a set once in a pre defined time span. This is a predefined values and need
-            to be adapted on per use basis. To enhance reliability and to allow
-            failures in the protocol its possible to introduce a threshold for max failed reconciliation
+            To prevent denial of service attacks, it is vital to check that peers can only
+            reconcile a set once in a predefined time span. This is a predefined value and needs
+            to be adapted per use basis. To enhance reliability and to allow
+            failures in the protocol, it is possible to introduce a threshold for max failed reconciliation
             ties.
             <!-- IMPLEMENT: How to implement? IP? Other construct? -->
         </t>
         <t>
-            The formal format of all messages needs to be properly validated, this is important to prevent many
+            The formal format of all messages needs to be properly validated. This is important to prevent many
             attacks on the code. The application data should be validated by the application using
             the protocol not by the implementation of the protocol.
             In case the format validation fails the set operation MUST be terminated.
@@ -1951,17 +2111,18 @@ FUNCTION calculate_ibf_params (setsize):
         </t>
 
         <t>
-            To prevent an attacker from sending a peer into a endless loop between active and passive decoding a
-            limitation for active/passive roll switches in required. This can be implemented by
-            a simple counter which terminates the operation after a predefined count of switches.
-            The count of switches needs to be defined as such that its very undroppable that more
+            To prevent an attacker from sending a peer into an endless loop between active and passive decoding, a
+            limitation for active/passive roll switches is required. This can be implemented by
+            a simple counter which terminates the operation after a predefined number of switches.
+            The number of switches needs to be defined in such a way that it is very unprobable that more
             switches are required an the malicious intend of the other peer can be assumed.
+            
             <!-- IMPLEMENT: This counter -->
         </t>
 
         <t>
             <!--- SHOULD BE HANDLED BY UNDERLYING PROTOCOL BUT HOW IS IT HANDLED? -->
-            Its important to close and purge connections after a given timeout
+            It is important to close and purge connections after a given timeout
             to prevent draining attacks.
             <!-- IMPLEMENT: How ist this handheld right now? -->
         </t>
@@ -1976,7 +2137,7 @@ FUNCTION calculate_ibf_params (setsize):
                 <name>Duplicated or Missing Message detection</name>
                 <t>
                     Most of the messages received need to be checked that they are not
-                    received multiple times this is solved with a global store (message)
+                    received multiple times. This is solved with a global store (message)
                     and the following code
                 </t>
                 <figure anchor="security_generic_functions_missing_message_code">
@@ -2031,7 +2192,7 @@ FUNCTION isStoreComplete(store)
     ENDFOR
     return TRUE
 
-# Returns the count of message received
+# Returns the number of message received
 # INPUTS:
 # store: Store to count
 # OUTPUTS:
@@ -2046,8 +2207,8 @@ FUNCTION getNumberOfMessage(store)
         <section anchor="security_generic_functions_element_nr" numbered="true" toc="default">
             <name>Store Remote Peers Element Number</name>
             <t>
-                To prevent an other peer from requesting the same set multiple times
-                its important to memorize the number of elements a peer had in previous
+                To prevent an other peer from requesting the same set multiple times,
+                it is important to memorize the number of elements a peer had in previous
                 reconciliation sessions.
             </t>
             <figure anchor="security_generic_functions_element_nr_code">
@@ -2096,12 +2257,12 @@ FUNCTION save_number_of_elements_last_sync(client_id, remote_setsize)
                         <t>
                         It needs to be checked that the full synchronisation is
                         plausible according to the formula deciding which operation mode
-                        is applicable this is achieved by calculating the upper and lower
+                        is applicable. This is achieved by calculating the upper and lower
                         boundaries of the number of elements in the other peers set.
                         The lower boundary of number of elements can be easily
                         memorized as result from the last synchronisation and the upper
                         boundary can be estimated with prior knowledge of the maximal
-                        plausible increase of element since the last reconciliation and
+                        plausible increase of elements since the last reconciliation and
                         the maximal plausible number of elements.
                         <!-- Entscheindungsfindung: myset fulltransimtion or epstein
                          5.3 ist zu verckürtzt fall: nur 2 cases Es fehlt traidof nach paramer  in perfomance section
@@ -2155,12 +2316,12 @@ FUNCTION validate_messages_request_full(client_id, remote_setsize, local_setsize
                     <dt><xref target="messages_ibf" format="title" /></dt>
                     <dd>
                         <t>
-                            Its important do define a threshold to limit
+                            It is important to define a threshold to limit
                             the maximal number of IBFs that are expected from the other peer.
                             <!-- change count to number full section -->
                             This maximal plausible size can be calculated with the known inputs:
-                            number of elements in my set and the pre defined applications upper
-                            limit as described in the performance section.
+                            number of elements in the local set and the predefined applications upper
+                            limit, as described in the performance section.
                             <!-- IMPLEMENT: Is this already checked?-->
                             <!-- TODO: Link performance section -->
                             That the other peer chooses the correct mode of operation MUST
@@ -2222,7 +2383,7 @@ FUNCTION get_bucket_number(remote_setsize, local_setsize, initial_local_size, se
                     <dt><xref target="messages_full_element" format="title" /></dt>
                     <dd>
                         <t>
-                        If a full element is received  the set of the other peer
+                        If a full element is received, the set of the other peer
                         is smaller than the set of the peer in the <strong>Expecting IBF</strong>
                         state and the set difference is smaller than threshold for
                         full synchronisation as described in the performance section.
@@ -2322,9 +2483,9 @@ FUNCTION validate_messages_full_element_init(client_id, remote_setsize,
                     <dt><xref target="messages_full_element" format="title" /></dt>
                     <dd>
                         <t>
-                        When receiving full elements there needs to be checked that every
-                        element is a valid element, no element is resized more than once and
-                        not more or less elements are received as the other peer has committed
+                        When receiving full elements there needs to be checked, that every
+                        element is a valid element, no element has been received more than once and
+                        not more or less elements are received, as the other peer has committed
                         to in the beginning of the operation. Detail pseudocode implementation
                         can be found in <xref target="security_states_expecting_ibf" format="title" />.
                         <!-- IMPLEMENT: Is this check already implemented?-->
@@ -2333,12 +2494,12 @@ FUNCTION validate_messages_full_element_init(client_id, remote_setsize,
                     <dt><xref target="messages_full_done" format="title" /></dt>
                     <dd>
                         <t>
-                        When receiving the full done message its important to check that
+                        When receiving the full done message it is important to check that
                         not less elements are received as the other peer has committed to
                         send.
                         The 512-bit hash of the complete reconciled set contained in
-                        the full done message is required to ensures that both sets are truly identical. If
-                        the sets differ a resynchronisation is required. The count of possible
+                        the full done message is required to ensure that both sets are truly identical. If
+                        the sets differ, a resynchronisation is required. The number of possible
                         resynchronisation MUST be limited to prevent resource exhaustion attacks.
                         <!-- IMPLEMENT: Is this check already implemented?-->
                         </t>
@@ -2382,7 +2543,7 @@ FUNCTION validate_messages_full_done(full_done_msg, full_element_msg_store,
                     <dt><xref target="messages_ibf" format="title" /></dt>
                     <dd>
                         <t>
-                            When receiving multiple IBFs its important to check that the other
+                            When receiving multiple IBFs it is important to check that the other
                             peer can only send as many IBFs as expected. The number of expected IBFs can
                             be calculated with the knowledge of the set difference as described in the
                             performance section.
@@ -2397,37 +2558,37 @@ FUNCTION validate_messages_full_done(full_done_msg, full_element_msg_store,
             <section anchor="security_states_active_decoding" numbered="true" toc="default">
                 <name>Active Decoding</name>
                 <t>
-                    In the Active Decoding state its important to prevent an attacker from
-                    generating and passing unlimited amount of IBF that do not decode or
-                    even worse generate an IBF that is constructed to sends the peers in an endless loop.
-                    To prevent an endless loop in decoding a loop detection should be implemented
-                    the simplest solution would be to prevent decoding of more than a given amount of elements,
-                    a more robust solution is to implement a algorithm that detects a loop by
-                    analyzing past partially decoded IBFs to detect cycles. This can be archived
+                    In the Active Decoding state it is important to prevent an attacker from
+                    generating and passing an unlimited amount of IBFs, that do not decode or
+                    even worse, generate an IBF constructed, to send the peers in an endless loop.
+                    To prevent an endless loop in decoding, a loop detection should be implemented.
+                    The simplest solution would be to prevent decoding of more than a given amount of elements.
+                    A more robust solution is to implement a algorithm that detects a loop by
+                    analyzing past partially decoded IBFs. This can be archived
                     by saving the hash of all prior partly decoded IBFs hashes in a hashmap and check
-                    for every inserted hash if it is already in the hashmap.
+                    for every inserted hash, if it is already in the hashmap.
                     <!-- TODO: Link some algo to find loops in directed graph -->
                     <!-- IMPLEMENT: Implement an algo that detects loops in IBF decoding -->
                 </t>
                 <t>
-                    If the IBF decodes more or less elements than are plausible the
+                    If the IBF decodes more or less elements than are plausible, the
                     operation MUST be terminated. The upper and lower threshold
-                    for the decoded elements can be calculated with the peers set size
+                    for the decoded elements can be calculated with the peers set sizes
                     and the other peer committed set sizes from the <strong>Expecting IBF</strong>
                     State.
                 </t>
 
-                <!-- Wenne element mehrfach decodiert seitenwechseln daher detecten. -->
+                <!-- Wenn ein Element mehrfach decodiert seitenwechseln daher detecten. -->
 
                 <t>Security considerations for received messages:</t>
                 <dl>
                     <dt><xref target="messages_offer" format="title" /></dt>
                     <dd>
                         <t>
-                            If an offer for an element that never has been requested by
-                            an inquiry or if an offer is received twice the operation MUST be terminated.
-                            This requirement can be fulfilled by saving lists that keeps track of the state of
-                            all send inquiries and offers. When answering offers these lists MUST be checked.
+                            If an offer for an element, that never has been requested by
+                            an inquiry or if an offer is received twice, the operation MUST be terminated.
+                            This requirement can be fulfilled by saving lists that keep track of the state of
+                            all sent inquiries and offers. When answering offers these lists MUST be checked.
                             <!-- IMPLEMENT: Check to keep track of all send Inquiries -->
                         </t>
                         <figure anchor="security_states_active_decoding_offer_code">
@@ -2462,9 +2623,9 @@ FUNCTION validate_messages_offer(offer_msg,inquiry_msg_store)
                             If an element that never has been requested by
                             a demand or is received double the operation MUST be terminated.
                             This requirement can be fulfilled by a simple table that keeps track
-                            of the state of all send demands.
+                            of the state of all sent demands.
                             <!-- IMPLEMENT: Check to keep track of all send demands -->
-                            If an invalid element is received the operation has failed and the
+                            If an invalid element is received, the operation has failed and it
                             MUST be terminated.
                             <!-- IMPLEMENT: Termination if invalid element si revived -->
                         </t>
@@ -2497,10 +2658,10 @@ FUNCTION validate_messages_elements(element_msg,demand_msg_store)
                     <dt><xref target="messages_demand" format="title" /></dt>
                     <dd>
                         <t>
-                        For every received demand a offer has to be send in advance. If an demand
-                        for an element is received that never has been offered or the offer already has
-                        been answered with a demand the operation MUST be terminated. Its required to implement
-                        a list which keeps track of the state of all send offers and received demands.
+                        For every received demand an offer has to be sent in advance. If a demand
+                        for an element is received, that never has been offered or the offer already has
+                        been answered with a demand, the operation MUST be terminated. It is required to implement
+                        a list which keeps track of the state of all sent offers and received demands.
                         </t>
                         <figure anchor="security_states_active_decoding_demand_code">
                             <artwork name="" type="" align="left" alt=""><![CDATA[
@@ -2532,14 +2693,14 @@ FUNCTION validate_messages_demand(demand_msg,offer_msg_store)
                     <dt><xref target="messages_done" format="title" /></dt>
                     <dd>
                         <t>
-                            The done message is only received if the IBF has been finished
+                            The done message is only received, if the IBF has been finished
                             decoding and all offers have been sent. If the done message is received before
                             the decoding of the IBF is finished or all open offers and demands
-                            have been answered the operation MUST be terminated.
+                            have been answered, the operation MUST be terminated.
                             <!-- IMPLEMENT: Check that in active decoding no done message is received before ibf has been decoded-->
                             The 512-bit hash of the complete reconciled set contained in
-                            the done message is required to ensures that both sets are truly identical. If
-                            the sets differ a resynchronisation is required. The count of possible
+                            the done message is required to ensure that both sets are truly identical. If
+                            the sets differ, a resynchronisation is required. The number of possible
                             resynchronisation MUST be limited to prevent resource exhaustion attacks.
                         </t>
                         <figure anchor="security_states_active_decoding_done_code">
@@ -2591,8 +2752,8 @@ FUNCTION validate_messages_done(done_msg, offer_msg_store,
             <section anchor="security_states_finish_closing" numbered="true" toc="default">
                 <name>Finish Closing</name>
                 <t>
-                    In case not all sent demands or inquiries have ben answered in time a pre defined
-                    timeout the operation has failed and MUST be terminated.
+                    In case not all sent demands or inquiries have been answered in time, a predefined
+                    timeout, the operation has failed and MUST be terminated.
                 </t>
                 <!-- FIXME: In state diagram in finish closing only Elements can be received. What happens if i receive an offer? -->
                 <t>Security considerations for received messages:</t>
@@ -2616,14 +2777,14 @@ FUNCTION validate_messages_done(done_msg, offer_msg_store,
                     <dt><xref target="messages_se" format="title" /></dt>
                     <dd>
                         <t>
-                            In case the Strata Estimator does not decode the
+                            In case the Strata Estimator does not decode, the
                             operation MUST be terminated to prevent to get to a unresolvable state.
                             <!-- IMPLEMENT: If in expect SE state the strata estimator does not decode terminate the operation -->
                             The set difference calculated from the strata estimator needs to be plausible,
-                            to ensure this multiple factors need to be considered: The absolute plausible maximum of
-                            elements in a set which has to be predefined according
+                            to ensure this, multiple factors need to be considered: The absolute plausible maximum of
+                            elements in a set, which has to be predefined according
                             to the use case and the maximal plausible element increase since the last
-                            successful set reconciliation which should be either predefined or can be calculated
+                            successful set reconciliation, which should be either predefined or can be calculated
                             with the gaussian distribution function over all passed set reconciliations.
                             <!-- IMPLEMENT: Terminate if in check expect se state for a max size difference is exceeded -->
                         </t>
@@ -2673,7 +2834,7 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, local_setsize)
                         <t>
                             The peer in <strong>Full Receiving</strong> state needs to check
                             that exactly the number of elements is received from the remote peer as
-                            he initially committed too. If the remote peer transmits less or
+                            he has initially committed too. If the remote peer transmits less or
                             more elements the operation MUST be terminated.
                         </t>
                         <t>
@@ -2684,14 +2845,14 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, local_setsize)
                     <dd>
                         <t>
                             When the full done message is received from the remote peer all
-                            elements that the remote peer has committed to needs to be received
+                            elements, that the remote peer has committed to, need to be received,
                             otherwise the operation MUST be terminated. After receiving the
                             full done message no future elements should be accepted.
                             <!-- FIXME: Check that after full done in full receiving no elements can be received anymore! Additional state? -->
                             The 512-bit hash of the complete reconciled set contained in
-                            the full done message is required to ensures that both sets are truly identical. If
-                            the sets differ a resynchronisation is required. The count of possible
-                            resynchronisation MUST be limited to prevent resource exhaustion attacks.
+                            the full done message is required to ensure that both sets are truly identical. If
+                            the sets differ, a resynchronisation is required. The number of possible
+                            resynchronisations MUST be limited to prevent resource exhaustion attacks.
                         </t>
                         <t>
                             Pseudocode for implementation described in section <xref target="security_states_full_sending" format="title" />.
@@ -2707,8 +2868,8 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, local_setsize)
                     <dd>
                         <t>
                             In case an IBF message is received by the peer a active/passive role switch
-                            is initiated by the active decoding remote peer. In this instance the peer should
-                            wait for all open offers and demands to be fulfilled to prevent
+                            is initiated by the active decoding remote peer. In this moment the peer should
+                            wait for all open offers and demands to be fulfilled, to prevent
                             retransmission before switching into active decoding operation mode.
                             A switch into active decoding mode should only be permitted for
                             a predefined number of times as described in the top section
@@ -2755,13 +2916,13 @@ FUNCTION validate_messages_ibf(offer_msg_store, demand_msg_store, element_msg_st
                     <dt><xref target="messages_inquiry" format="title" /></dt>
                     <dd>
                         <t>
-                        A check needs to be in place that prevents receiving a inquiry
+                        A check needs to be in place that prevents receiving an inquiry
                         for an element multiple times or more inquiries than are plausible.
-                        The amount of inquiries that is plausible can be estimated by considering
-                        known values as the remote set size, the local set size, the
-                        predefined absolute maximum of elements in the set which is defined
+                        The amount of inquiries that is plausible, can be estimated by considering
+                        known values, as the remote set size, the local set size, the
+                        predefined absolute maximum of elements in the set, which is defined
                         by real world limitations.
-                        To implement this restrictions a list with all received inquiries
+                        To implement this restrictions, a list with all received inquiries
                         should be stored and new inquiries should be checked against.
                         </t>
                         <figure anchor="security_states_passive_decoding_inquiry_code">
@@ -2817,7 +2978,7 @@ FUNCTION validate_messages_inquiry(inquiry_msg, set_diff)
             <section anchor="security_states_finish_waiting" numbered="true" toc="default">
                 <name>Finish Waiting</name>
                 <t>
-                    In case not all sent demands or inquiries have ben answered in time the operation
+                    In case not all sent demands or inquiries have ben answered in time, the operation
                     has failed and MUST be terminated.
                 </t>
                 <!-- FIXME: In state diagram in finish closing only Elements can be received. What happens if i receive an offer? -->
@@ -3082,6 +3243,15 @@ Type    | Name                       | References | Description
                     <date year="2011"/>
                 </front>
             </reference>
+            <reference anchor="secure_set_reconciliation" target="http://ti.bfh.ch">
+                <front>
+                    <title>Secure Set Reconciliation</title>
+                    <author initials="E." surname="Summermatter" fullname="Elias Summermatter">
+                        <organization>BFH - Berner Fachhochschule</organization>
+                    </author>
+                    <date year="2021"/>
+                </front>
+            </reference>
 
             <!--    <reference anchor="ISO20022">
               <front>
@@ -3103,34 +3273,82 @@ Type    | Name                       | References | Description
                 <t>
                     INPUTS:
                 </t>
-                <t>
-                    key: 0xFFFFFFFFFFFFFFFF (64-bit)
-                    number_of_buckets_per_element: 3
-                    ibf_size: 300
-                </t>
+                <figure anchor="test_vectors_map_function_inputs">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+number_of_buckets_per_element: 3
+ibf_size: 300
+
+key1: 0xFFFFFFFFFFFFFFFF (64-bit)
+key2: 0x0000000000000000 (64-bit)
+key3: 0x00000000FFFFFFFF (64-bit)
+key4: 0xC662B6298512A22D (64-bit)
+key5: 0xF20fA7C0AA0585BE (64-bit)
+           ]]></artwork>
+                </figure>
                 <t>
                     OUTPUT:
                 </t>
-                <t>
-                    ["222","32","10"]
-                </t>
+                <figure anchor="test_vectors_map_function_outputs">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+key1: ["122","157","192"]
+key2: ["85","243","126"]
+key3: ["208","101","222"]
+key4: ["239","269","56"]
+key5: ["150","104","33"]
+           ]]></artwork>
+                </figure>
             </section>
             <section anchor="test_vectors_id_function" numbered="true" toc="default">
                 <name>ID Calculation Function</name>
                 <t>
                     INPUTS:
                 </t>
+                <figure anchor="test_vectors_id_function_inputs">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+element 1: 0xFFFFFFFFFFFFFFFF (64-bit)
+element 2: 0x0000000000000000 (64-bit)
+element 3: 0x00000000FFFFFFFF (64-bit)
+element 4: 0xC662B6298512A22D (64-bit)
+element 5: 0xF20fA7C0AA0585BE (64-bit)
+           ]]></artwork>
+                </figure>
                 <t>
-                    element: 0xadadadadadadadad
-                    ibf_salt 0x3F (6-bit)
+                    OUTPUT:
                 </t>
+                <figure anchor="test_vectors_id_function_outputs">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+element 1: 0x5AFB177B
+element 2: 0x64AB557C
+element 3: 0xCB5DB740
+element 4: 0x8C6A2BB2
+element 5: 0x7EC42981
+           ]]></artwork>
+                </figure>
+            </section>
+            <section anchor="test_counter_compression_function" numbered="true" toc="default">
+                <name>Counter Compression Function</name>
                 <t>
-                    OUTPUT:
+                    INPUTS:
                 </t>
+                <figure anchor="test_counter_compression_function_inputs">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+counter serie 1: [1,8,10,6,2] (min bytes 4)
+counter serie 2: [26,17,19,15,2,8] (min bytes 5)
+counter serie 3: [4,2,0,1,3] (min bytes 3)
+           ]]></artwork>
+                </figure>
                 <t>
-                    0xFFFFFFFFFFFFFFFF
+                    OUTPUT:
                 </t>
+                <figure anchor="test_counter_compression_function_outputs">
+                    <artwork name="" type="" align="left" alt=""><![CDATA[
+counter serie 1: 0x18A62
+counter serie 2: 0x3519BC48
+counter serie 3: 0x440B
+           ]]></artwork>
+                </figure>
             </section>
         </section>
     </back>
 </rfc>
+

	lsd0003 LSD0003: Set Union
	Log \| Files \| Refs \| README