Added some new texts - lsd0003 - LSD0003: Set Union

commit be3699345fb049986691f71f089a9c130938fcb0
parent f74f215def43820b27a23a8dbf0f8ae274cc8e59
Author: Elias Summermatter <elias.summermatter@seccom.ch>
Date:   Sun, 20 Dec 2020 13:01:54 +0100

Added some new texts

Diffstat:
M draft-summermatter-set-union.xml  | 137 ++++++++++++++++++++++++++++++++++++++++++++++---------------------------------

1 file changed, 80 insertions(+), 57 deletions(-)
diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml
@@ -73,23 +73,37 @@
             <name>Introduction</name>
             <t>
                 This document describes the Byzantine Fault Tolerant Set Reconciliation used to efficient and securely
-                synchronize two sets of elements in a peer-to-peer network. It provides strong cryptographic and
-                probabilistic proceedings to discover and ban dishonest and bad behaving peers.
+                synchronize two sets of elements in a peer-to-peer network. It provides cryptographic and
+                probabilistic proceedings to discover dishonest and bad behaving peers. The Protocol should prevent
+                bad acting peers from wasting resources by sending wrong formed elements, pretending to have more elements
+                than the peer actually has or requesting more than the full set from a peer. This is for example achieved by
+                remembering how much elements already have been sent and how many elements are possible by some real world
+                limiting factor in E-Voting this for example is the people who are permitted to vote. Another probabilistic
+                approach to discover bad behaving peers is sampling, in this case a the peer has to prove to the other peer
+                that the peer has the amount of elements he claims to have. To prove this the verifying peer chooses some
+                random salt and sends the salt to the proving peer. The proving peer then salts the hash of elements with the given
+                salt. In the next step the proving peer calculates the modulo (depending on the set sized difference) of the new
+                hashes and sends all the elements where the modulo is 0(?) to the verifying peer.
+                As soon as the verifying peer has received the elements the verifying peer can verify that all the elements
+                are valid and in the right split of the set then the verifying peer can assured with a high probability
+                that the peer is honest about his set size.
             </t>
-            <t>
-                The Byzantine Fault Tolerant Set Reconciliation can be used in a variety of different fields of application.
-                It is used in GNS to distribute revocations over the peer-to-peer network and it can be used as a central
-                component in distributed E-Voting systems to securely synchronize the votes between the peers.
+             <t>
+                The Byzantine Fault Tolerant Set Reconciliation can be used in a variety of different fields of application,
+                primarily in Bulletin Boards where multipart computation is required.
+                Some real world application are for example in GNS the distribute revocations over the peer-to-peer network
+                or it could be used as a central component in distributed E-Voting systems to exchange the votes between peer
+                who do not trust each other (Zero-Trust).
             </t>
             <t>
                 The internal structure of the elements in the set is handheld by the application using the protocol and is not
                 described more in detail than known limitations.
             </t>
             <t>
-                The protocol has been designed to minimize network round-trips and bytes send over the network at the
-                expense of CPU operations and memory usage. Since CPU operations are much cheaper than network round trips.
-                Due to certain parameters the protocol decides whatever either sending the full set or just sending the elements
-                that differ is cheaper.
+                The protocol should minimize network round-trips and bytes send over the network at the
+                expense of CPU operations.
+                The protocol uses some heuristics to determinate if sending the full set of elements or just sending the elements
+                that differ in the set is cheaper.
             </t>
             <t>
                 This document defines the normative wire format of resource records, resolution processes,
@@ -111,38 +125,32 @@
         <section anchor="contributors" numbered="true" toc="default">
         <name>Contributors</name>
         <t>
-            The major original contributors of this documents have been Elias Summermatter and Christian Grothoff.
-        </t>
-        <t>
-            The origianl GNU NET implementation of the  Byzantine Fault Tolerant Set Reconciliation has been mainly
-            written by Florian Dold and Christian Grothoff
+            The original GNU NET implementation of the  Byzantine Fault Tolerant Set Reconciliation has been mainly
+            written by Florian Dold and Christian Grothoff.
         </t>
         </section>
 
         <section anchor="ibv" numbered="true" toc="default">
         <name>Invertible Bloom Filter</name>
-            <section anchor="ibv_description" numbered="true" toc="default">
-                <name>Description</name>
-                <t>
-                    A Invertible Bloom Filter (IBF) is a probabilistic data structure that allows to determinate efficiently
-                    but not with absolut certainty that a given element is presence or absence in a set.
-                    The IBF knows tree basic operations add element, remove element and decode.
+            <t>
+                A Invertible Bloom Filter (IBF) is a probabilistic data structure that allows to determinate efficiently
+                but not with absolut certainty that a given element is presence or absence in a set.
+                The IBF knows three basic operations add element, remove element and decode.
 
-                    IBF are useful when there is the need to check a set of elements for the presence or absence of some
-                    elements in a big set efficiently.
-                </t>
-            </section>
-            <section anchor="ibv_format" numbered="true" toc="default">
+                IBF are useful when there is the need to check a set of elements for the presence or absence of some
+                elements in a big set efficiently.
+            </t>
+            <section anchor="ibf_format" numbered="true" toc="default">
                 <name>Format</name>
                 <t>
                     The storage format of an IBF is a "two-component data structure" that stores a value
-                    and a count in a bucket. For every bit of the defined output length of the used hash functions
+                    and a signed counter in a bucket. For every bit of the defined output length of the used hash functions
                     there is a value and a count stored.
-                    The count component is increased by one if on the given bit has been hit by the
-                    Add operation and is decreased by one if an bit has been hit by the delete operation.
-                    If the bucket is pure (counter = 1) the value contains the hash value of an element otherwise in the
+                    The count component is increased by one if on the given bit has been hit by the  #### ANSCHAUEN  REDUNDANT
+                    insert operation and is decreased by one if an bit has been hit by the delete operation.  #### ILUSTRATONEN
+                    If the bucket is pure (|counter| = 1) the value contains the hash value of an element otherwise in the
                     value field is empty (counter= 0) or a XOR product of all previously written hashes of elements
-                    (counter > 1)
+                    (|counter| > 1)
                     ### SOME GRAPHIC OF THE FORMAT ###
                 </t>
             </section>
@@ -176,15 +184,18 @@
                         in the section "Remove Element" from the pure buckets from the filter creating new pure buckets
                         until the IBF is completely empty and all elements have been decoded.
                     </t>
+                </section>
+                <section anchor="ibv_operations_setdiff" numbered="true" toc="default">
+                    <name>Set difference</name>
                     <t>
-                        In case there are two different sets for example from two different clients (A and B). The two
-                        sets can easily be compared by XORing the IBF of client A with the IBF of client B which results in
-                        a new IBF A/B. The IBF A/B can then be decoded as described above. The IBF A/B can have two types
-                        of pure buckets with counter set to 1 or -1. If the counter is set to 1 the element is missing
-                        in the set of client B and if the counter is set to -1 the element is missing in the set of
-                        client A.
+                        To calculate the difference between two sets. The two IBFs can be compared by XORing both IBFs
+                        which results in a new IBF. This new IBF can then be decoded as described in the decoding section.
+                        The new IBF can have two types of pure buckets with counter set to 1 or -1. If the counter is set to 1
+                        the element is missing in the second set and if the counter is set to -1 the element is missing in
+                        the first set.
                     </t>
                 </section>
+
             </section>
         </section>
 
@@ -204,8 +215,9 @@
                 <t>
                     The trick is to decode the SE with the biggest stratum first and calculate the difference of
                     the set if the SE decodes successfully decode the next smaller strata estimator repeat
-                    this until the decoding of the SE fails or Stratum 0 is reached. Then its possible to estimate
-                    how big the difference between two sets is by calculating the count of the
+                    this until the decoding of the SE fails or Stratum 0 is reached. Then its possible to estimate  ### Wie berechne ich den strata estimator Decodieren wie ist die formal um die
+                    how big the difference between two sets is by calculating the count of                          ### 2^n*k k=elemente (Durchnitt duch alle die estimatoren) Systamatischer pyas
+                                                                                                                    ### Wie ist die formel sein?
                     decoded elements divided by the stratum factor. If no of the SE decoded choose a smaller stratum or
                     try a other hash function.
                 </t>
@@ -220,37 +232,44 @@
                 Depending on the state of the two sets there are different strategies or operation modes how to efficiently
                 determinate missing elements in two sets of two peers.
             </t>
+
+            <t>
+                The simples mode is the full sync mode. The idea is that if the difference between the sets of the two
+                peers exceeds a certain threshold the overhead of determinate which elements are different outweighs
+                the overhead of sending the complete set. In this case its more efficient to just exchange the full sets.
+                ############# IMAGE ##################
+            </t>
+
             <t>
-                The initiating peer starts in the "Expect SE" state and the receiving peer starts in the "Expecting IBF" state.
-                In a first step of the protocol the initiating peer opens a connection to the receiving peer, requests a Strata
-                Estimator from the receiving peer. The receiving peer answers with the Strata Estimator.
+                The initiating peer starts in the "Expect SE" state and the receiving peer starts in the "Expecting IBF" state.   ### Afer connectiong to the server the initaiting peer is in state
+                In a first step of the protocol the initiating peer opens a connection to the receiving peer, requests a Strata   ### Anpassen nach state maschine
+                Estimator from the receiving peer. The receiving peer answers with the Strata Estimator.                          ### Wie unten nach states gliederm
                 After the initiating peer has received the Strata Estimator he decides which sync mode is optimal for the
                 the estimated set difference.
-                To ensure that ...... the difference is multiplied by 1.5 if there are more than 200 elements differences between the sets (WHY? line 1398).
+                To ensure that ...... the difference is multiplied by 1.5 if there are more than 200 elements differences between the sets (WHY? line 1398).  ### Tradeoff faktoren nach oben
                 The Full Synchronisation Mode is used if the flag to force full sync is set, the estimated difference between the two sets is bigger
                 than 25% or the set size of the receiving peer is zero. Otherwise the Individual Element Synchronisation Mode is used.
 
+                the tradeoff between the two is explained in 5.3 ########## Ausformulieren
 
             </t>
             <section anchor="modeofoperation_full-sync-client-with-bigger-set" numbered="true" toc="default">
                 <name>Full Synchronisation Mode</name>
 
                 <t>
-                    The simples mode is the full sync mode. The idea is that if the difference between the sets of the two
-                    peers exceeds a certain threshold the overhead of determinate which elements are different overweight
-                    the overhead of sending the complete set. In this case its more efficient to just exchange the full sets.
-                    ############# IMAGE ##################
+                    The decision to enter the full sync mode ... en
                 </t>
+
                 <t>
-                    If the set of the initiating peer is bigger than the set of the receiving peer, the initiating
-                    peer sends a "Request Full" message and change from "Expecting SE" in "Full Receiving" State.
-                    In all other cases the initiating peer starts sending all set elements to the other peer
+                    In full synchronisation mode, if the set of the initiating peer is bigger than the set of the receiving peer, the initiating    ### Nachrichten und states ander darstellen
+                    peer sends a "Request Full" message and change from "Expecting SE" in "Full Receiving" State.                                   ### Refferenz auf nachrichten & all elements = message type anstat beschreibung
+                    In all other cases the initiating peer sends all set elements to the other peer
                     followed by the "Full Done" message and changes into "Full Sending" State.
                 </t>
                 <t>
-                    <strong>Expecting IBF: </strong>
+                    <strong>Expecting IBF: </strong>                                                                                                 ## Descripion list
                     If a peer in the in state "Expecting IBF" receives a "Request Full" message from the other peer, the
-                    peer starts sending all the elements of the set followed by a "Full Done" message and the change to the
+                    peer starts sending all the elements of the set followed by a "Full Done" message and change to the
                     "Full Sending" State. If the peer receives an "Full Element" the peer changes to the state "Full Receiving".
                 </t>
                 <t>
@@ -261,8 +280,8 @@
                 <t>
                     <strong>Full Receiving (In code: Expecting IBF): </strong>
                     While a peer is in "Full Receiving" state the peer expects to continuously receiving elements from the other
-                    peer. As soon as a the "Full Done" message is received the peer starts sending his complete set to the other
-                    peer followed by a full done. After sending the last message the peer changes into "Finished" state.
+                    peer. As soon as a the "Full Done" message is received the peer sends the remaining elements set to the other
+                    peer followed by a "Full Done". After sending the last message the peer changes into "Finished" state.
                 </t>
 
             </section>
@@ -287,9 +306,13 @@
                     <name>Description</name>
                     <t>
                         This message is the first message of the protocol and it is sent to signal the receiving peer
-                        that the initiating peer wants to initialize a new connection. This Message is sent in the transition
-                        between the "Initiating connection" state and the "Expect SE" state. If a peer receives this message
-                        the peer answers with sending a Strata estimator back.
+                        that the initiating peer wants to initialize a new connection.
+                    </t>
+                    <t>
+                        This Message is sent in the transition between the "Initiating connection" state and the "Expect SE" state.
+                    </t>
+                    <t>
+                        If a peer receives this message the peer answers with sending a Strata estimator back.
                     </t>
                 </section>
                 <section anchor="messages_operation_request_structure" numbered="true" toc="default">

	lsd0003 LSD0003: Set Union
	Log \| Files \| Refs \| README