lsd0003

LSD0003: Set Union
Log | Files | Refs | README

commit cab45ba74dc984053f893d8caf5e406f643bed93
parent bf80fc8ddd88d96144457f40a3ba452c9d7c2433
Author: Christian Grothoff <christian@grothoff.org>
Date:   Thu, 14 Jan 2021 12:05:36 +0100

work on section 4/5.1

Diffstat:
Mdraft-summermatter-set-union.xml | 195+++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
1 file changed, 125 insertions(+), 70 deletions(-)

diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml @@ -180,12 +180,13 @@ <name>Bloom Filters</name> <t> A Bloom filter (BF) is a space-efficient datastructure to test if am element is part of a set of elements. + Elements are identified by an element ID. Since a BF is a probabilistic datastructure, it is possible to have false-positives: when asked if an element is in the set, the answer from a BF is either "no" or "maybe". </t> <t> A BF consists of L buckets. Every bucket is a binary value that can be either 0 or 1. All buckets are initialized - to 0. A mapping function M is used to map each element of the set to a subset of k buckets. M is non-injective + to 0. A mapping function M is used to map each the ID of each element from the set to a subset of k buckets. M is non-injective and can thus map the same element multiple times to the same bucket. The type of the mapping function can thus be described by the following mathematical notation: </t> @@ -346,7 +347,9 @@ +-------------+-------------+-------------+-------------+------- count | COUNTER | COUNTER | COUNTER | COUNTER | C... +-------------+-------------+-------------+-------------+------ - value | XHASH | XHASH | XHASH | XHASH | X... + idSum | IDSUM | IDSUM | IDSUM | IDSUM | I... + +-------------+-------------+-------------+-------------+------ +hashSum | HASHSUM | HASHSUM | HASHSUM | HASHSUM | H.. +-------------+-------------+-------------+-------------+------- ]]></artwork> </figure> @@ -355,7 +358,7 @@ <section anchor="ibf_operations" numbered="true" toc="default"> <name>Operations</name> <t> - When an IBF is created, all counters and XHASH values of + When an IBF is created, all counters and IDSUM and HASHSUM values of all buckets are initialized to zero. </t> <section anchor="ibv_operations_insert" numbered="true" toc="default"> @@ -363,12 +366,14 @@ <t> To add an element to a IBF, the element is mapped to a subset of k buckets using the mapping function M as described in the <xref target="bf" format="title" /> section introducing - BFs. For the buckets selected by the mapping function, the counter is increased by one and the value - field is set to the XOR of the hash of the element and the previously stored XHASH. + BFs. For the buckets selected by the mapping function, the counter is increased by one and the + IDSUM field is set to the XOR of the element ID and the previously stored IDSUM. Furthermore, + the HASHSUM is set to the XOR of the hash of the element ID and the previously stored HASHSUM. </t> <t> - In the following example the insert operation for the element [0101] with the hash 0x4242 and - the element [1100] with the hash 0x0101 is demonstrated. + In the following example, the insert operation is illustrated using an element with the + ID 0x01020304 and a hash of 0x4242, and a second element with the ID 0x03040506 and + a hash of 0x0101. </t> <t>Empty IBF:</t> <figure anchor="figure_ibf_insert_0"> @@ -377,7 +382,9 @@ +-------------+-------------+-------------+-------------+ count | 0 | 0 | 0 | 0 | +-------------+-------------+-------------+-------------+ - value | 0000 | 0000 | 0000 | 0000 | + idSum | 0000 | 0000 | 0000 | 0000 | + +-------------+-------------+-------------+-------------+ +hashSum | 0000 | 0000 | 0000 | 0000 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -408,8 +415,9 @@ <name>Remove Element</name> <t> To remove an element from the IBF the element is again mapped to a subset of the buckets using M. - Then all the counters of the buckets selected by M are reduced by one, and the XHASH is - replaced by the XOR of the old XHASH and the hash of the element being removed. + Then all the counters of the buckets selected by M are reduced by one, the IDSUM is + replaced by the XOR of the old IDSUM and the ID of the element being removed, and the + HASHSUM is similarly replaced with the XOR of the old HASHSUM and the hash of the ID. </t> <t> In the following example the remove operation for the element [1100] with the hash 0x0101 is demonstrated. @@ -432,7 +440,7 @@ +-------------+-------------+-------------+-------------+ count | 0 | 1 | 0 | 1 | +-------------+-------------+-------------+-------------+ - value | 0000 | 4242 | 0000 | 4242 | + value | 0000 | 0x4242 | 0000 | 0x4242 | +-------------+-------------+-------------+-------------+ ]]></artwork> </figure> @@ -444,12 +452,13 @@ present in the set: a bucket with a counter of zero can be the result of one element being added and a different element (mapped to the same bucket) being removed. To check that an element is not present requires a counter of zero and an - XHASH of zero --- and some assurance that there was no hash collision. Thus, + IDSUM and HASHSUM of zero --- and some assurance that there was no collision due + to the limited number of bits in IDSUM and HASHSUM. Thus, IBFs are not suitable to replace BFs or IBFs. </t> <t> Buckets in an IBF with a counter of 1 or -1 are crucial for decoding an IBF, as - they might represent only a single element, with the XHASH being the hash of that element. + they might represent only a single element, with the IDSUM being the ID of that element. Following Eppstein (CITE), we will call buckets that only represent a single element pure buckets. Note that due to the possibility of multiple insertion and removal operations @@ -463,18 +472,46 @@ <section anchor="ibf_operations_decode" numbered="true" toc="default"> <name>Decode IBF</name> <t> - To decode an IBF there are pure buckets needed, pure buckets are buckets which contain only - one element - the counter is set to 1 or -1. If there is no pure bucket in the IBF, its not possible - to decode the IBF. In this case a new IBF has to be created, the new IBF needs to be bigger or a different mapping function - should be used. - If there are pure buckets its possible to decode the IBF by removing elements as described - in the section <xref target="ibf_operations_remove" format="title" /> from the pure buckets from the filter creating new pure buckets - until the IBF is empty and all elements have been decoded. Its possible the an IBF only partly decodes, in this case a new IBF has to be - created. + Decoding an IBF yields the HASH of an element from the IBF, or failure. + </t> + <t> + A decode operation requires a pure bucket, that is a bucket to which M + only mapped a single element, to succeed. Thus, if there is no bucket with + a counter of 1 or -1, decoding fails. However, as a counter of 1 or -1 is + not a guarantee that the bucket is pure, there is also a chance that the + decoder returns an IDSUM value that is actually the XOR of several IDSUMs. + This is primarily detected by checking that the HASHSUM is the hash of the IDSUM. + Only if the HASHSUM also matches, the bucket could be pure. Additionally, + one should check that the IDSUM value actually would be mapped by M to + the respective bucket. If not, there was a hash collision. + </t> + <t> + The very rare case that after all these checks a bucket is still + falsely identified as pure must be detected (say by determining that + extracted element IDs do not match any actual elements), and addressed + at a higher level in the protocol. As these failures are probabilistic + and depend on element IDs and the IBF construction, they can typically + be avoided by retrying with different parameters, such as a different + way to assign element IDs to elements, using a larger value for L, or + a different mapping function M. + A more common scenario (especially if L was too small) is that + IBF decoding fails because there is no pure bucket. In this case, the + higher-level protocol also should retry using different parameters. </t> <t> - In the following example the successful decoding of an IBF containing the two elements previously - added. + Suppose the IBF contains a pure bucket. In this case, the IDSUM in the + bucket identifies a single element. Furthermore, it is then possible + to remove that element from the IBF (by inserting it if the counter + was negative, and by removing it if the counter was positive). This + is likely to cause other buckets to become pure, allowing further + elements to be decoded. Eventually, decoding should succeed with + all counters and IDSUM and HASHSUM values reaching zero. However, it is also + possible that an IBF only partly decodes and then decoding fails after + yielding some elements. + </t> + <t> + In the following example the successful decoding of an IBF containing + the two elements previously added in our running example. </t> <t> IBF with the two encoded elements: @@ -530,7 +567,7 @@ To calculate the IBF representing this set difference, both IBFs must have the same length L, the same number of buckets per element k and use the same map M. Given this, - one can compute the IBF representing the set difference by taking the XOR of the XHASH values + one can compute the IBF representing the set difference by taking the XOR of the IDSUM and HASHSUM values of the respective buckets and subtracting the respective counters. Care should be taken to handle overflows and underflows by setting the counter to "infinity" as necessary. The result is a new IBF with the same number of buckets representing the set difference. @@ -538,17 +575,10 @@ <t> This new IBF can be decoded as described in section <xref target="ibf_operations_decode" format="counter" />. The new IBF can have two types of pure buckets with counter set to 1 or -1. If the counter is set to 1 - the element is missing in the secondary set and if the counter is set to -1 the element is missing in + the element is missing in the secondary set, and if the counter is set to -1 the element is missing in the primary set. </t> <t> - Through this addition/subtraction its possible that a counter is 1 or -1, but is not pure. - This case its easy to detect by checking if an element exist in either set, if - the element does not exist its a falsely pure bucket in this case continue test the - next pure bucket until one is found that is truly pure. If no truly pure buckets are found the IBF - fails to decode. - </t> - <t> To demonstrate the set difference operation we compare IBF-A with IBF-B and generate as described IBF-AB </t> @@ -609,24 +639,35 @@ counter reaches "infinity" (-128). </t> <t> - For the "HASH VALUE", we always use a 32-bit + For the "IDSUM", we always use a 64-bit representation. + The IDSUM value must have sufficient entropy for the + mapping function M to yield reasonably random buckets + even for very large values of L. With a 32 bit + value the chance that multiple elements may be mapped + to the same ID would be quite high, even for moderately + large sets. Using more than 64 bits would at best make + sense for very large sets, but then it is likely always + better to simply afford additional round trips to handle + the occasional collision. 64 bits are also a reasonable + size for many CPU architectures. + </t> + <t> + For the "HASHSUM", we always use a 32-bit representation. Here, it is mostly important to avoid collisions, where different elements are - mapped to the same hash. The protocol is designed + mapped to the same hash. However, we note that + by design only a few elements (certainly less than + 127) should ever be mapped + to the same bucket, so a small number of bits + should suffice. Furthermore, our protocol is designed to handle occasional collisions, so while with - 32-bits there remains a significant chance of - accidental collisions, at 32-bits the chance is + 32-bits there remains a chance of + accidental collisions, at 32 bit the chance is generally believed to be sufficiently small enough for the protocol to handle those cases efficiently for a wide range of use-cases. Smaller hash - values would safe bandwidth, but drastically - increase the chance of collisions even for - moderately-sized sets. Larger hash values would - reduce the chance of collisions for very large - sets, alas with very large sets the bandwidth used - by the protocol is also typically larger and thus - additional round-trips to address a few collisions - are unlikely to have a major impact. 32-bit is + values would safe bandwidth, but also drastically + increase the chance of collisions. 32 bits are also again a reasonable size for many CPU architectures. </t> @@ -638,21 +679,28 @@ <section anchor="se_description" numbered="true" toc="default"> <name>Description</name> <t> - Strata Estimators should help to estimate the difference between two different set of elements. - That is necessary to efficiently determinate the needed size of an IBF. + Strata Estimators help estimate the size of the set difference between two set of elements. + This is necessary to efficiently determinate the tuning parameters for an IBF, in particular + a good value for L. </t> <t> - Basically an Strata Estimator (SE) is an IBF with the difference that only a certain share of - the elements is added to to the SE. The size of the share is described by the metric stratum which - means "Stratum n" contains 1/(2^n) (stratum factor) of all elements. + Basically a Strata Estimator (SE) is a series of IBFs (with a rather small value of L) + in which increasingly large subsets of the full set + of elements are added to each IBF. For the n-th IBF, the function selecting the + subset of elements should sample to select (probabilistically) 1/(2^n) of all + elements. This can be done by counting the number of trailing bits set to "1" + in an element ID, and then inserting the element into the IBF identified by that + counter. As a result, all elements will be mapped to one IBF, with the n-th + IBF being statistically expected to contain 1/(2^n) elements. </t> <t> - The trick is to decode the SE with the biggest stratum first and calculate the difference of - the set if the SE decodes successfully decode the next smaller strata estimator repeat - this until the decoding of the SE fails or Stratum 0 is reached. Then its possible to estimate - how big the difference between two sets is by adding the number of extracted hashes up (C) and scale it - by the expected number of elements (E) in the remaining unencoded IBF's (C*E=[estimated count of objects]). - If non of the SE decoded choose a smaller stratum or try a other mapping function. + Given two SEs, the set size difference can be estimated by trying to decode all of the + IBFs. Given that L was set to a rather small value, IBFs containing large strata + will likely fail to decode. For those IBFs that failed to decode, one simply + extrapolates the number of elements by scaling the numbers obtained from the + other IBFs that did decode. If none of the IBFs of the SE decoded (which given + a reasonable choice of L should be highly unlikely), one can retry using a different + mapping function M. </t> </section> </section> @@ -661,28 +709,35 @@ <section anchor="modeofoperation" numbered="true" toc="default"> <name>Mode of operation</name> <t> - The set union protocol uses the above discussed topics IBF and Strata Estimators. + The set union protocol uses IBFs and SEs as primitives. Depending on the state of the two sets there are different strategies or operation modes how to efficiently - determinate missing elements in two sets of two peers. + determinate missing elements between the two sets. </t> <t> - The simples mode is the full sync mode. The idea is that if the difference between the sets of the two - peers exceeds a certain threshold the overhead of determinate which elements are different outweighs - the overhead of sending the complete set. In this case its more efficient to just exchange the full sets. + The simplest mode is the "full" synchronization mode. The idea is that if the difference between the sets of the two + peers exceeds a certain threshold, the overhead to determine which elements are different outweighs + the overhead of sending the complete set. In this case, the most efficient method can be to just + exchange the full sets. ############# IMAGE ################## </t> + <t> + The second possibility is that the difference of the sets is small compared to the set size. + Here, an efficient "delta" synchronization mode is more efficient. Given these two possibilities, + the first steps of the protocol are used to determine which mode should be used. + </t> <t> - In the following section the operation mode independent flow in the beginning is described in detail: + Thus, the set synchronization protocol always begins with the following operation mode independent steps. </t> <t> - The initiating peer is initially in the <strong>Initiating Connection</strong> state and the receiving peer in the <strong>Expecting Connection</strong> + The initiating peer begins in the <strong>Initiating Connection</strong> state and the receiving peer in the <strong>Expecting Connection</strong> state. The first step for the initiating peer in the protocol is to send an <em><xref target="messages_operation_request" format="title" /></em> to the receiving peer and - change into <strong>Expect SE</strong> state. After receiving the <em><xref target="messages_operation_request" format="title" /></em> the receiving peer changes in <strong>Expecting IBF</strong> state and answers with the - <em><xref target="messages_se" format="title" /></em> message. When the initiating peer receives the Strata Estimator the initiating peer decides - with some heuristics which operation mode is best fitted for the the estimated set difference and the environment. + transition into the <strong>Expect SE</strong> state. After receiving the <em><xref target="messages_operation_request" format="title" /></em> the receiving peer + transitions to the <strong>Expecting IBF</strong> state and answers with the + <em><xref target="messages_se" format="title" /></em> message. When the initiating peer receives the <em><xref target="messages_se" format="title" /></em> message, + it decides with some heuristics which operation mode is likely more suitable for the estimated set difference and the application-provided latency-bandwidth tradeoff. The detailed tradeoff between the <xref target="modeofoperation_full-sync" format="title" /> and the <xref target="modeofoperation_individual-elements" format="title" /> is explained in the section <xref target="modeofoperation_combined-mode" format="title" />. </t> @@ -690,7 +745,7 @@ <name>Full Synchronisation Mode</name> <t> - When the initiating peer decide to use the Full Synchronisation Mode and the set of the initiating peer is bigger than the set of the receiving peer, the initiating + When the initiating peer decides to use the full synchronisation mode and the set of the initiating peer is bigger than the set of the receiving peer, the initiating peer sends a <em><xref target="messages_request_full" format="title" /></em> message and change from <strong>Expecting SE</strong> to the <strong>Full Receiving</strong> State. In all other cases the initiating peer sends all set elements to the other peer followed by the <em><xref target="messages_full_done" format="title" /></em> message and changes into <strong>Full Sending</strong> state. @@ -720,9 +775,9 @@ </dl> </section> <section anchor="modeofoperation_individual-elements" numbered="true" toc="default"> - <name>Individual Element Synchronisation Mode</name> + <name>Delta Synchronisation Mode</name> <t> - When the initiating peer in <strong>Expected SE</strong> state decide to use the Individual Element Synchronisation Mode, the peer + When the initiating peer in <strong>Expected SE</strong> state decide to use the delta synchronisation mode, the peer sends a <em><xref target="messages_ibf" format="title" /></em> to the receiving peer and changes into the <strong>Passive Decoding</strong> state. </t> <t> @@ -736,7 +791,7 @@ peer and the peer that is in <strong>Passive Decoding</strong> and <strong>Finish Waiting</strong> state is called the passive peer. </t> <t> - ############# Statemaschine Individual Element Synchronisation Mode ################## + ############# Statemaschine Delta Synchronisation Mode ################## </t> <t><strong>The behavior of the participants the different state is described below:</strong></t> <dl> @@ -879,7 +934,7 @@ ############# NOTE ############ To ensure that ...... the difference is multiplied by 1.5 if there are more than 200 elements differences between the sets (WHY? line 1398). The Full Synchronisation Mode is used if the flag to force full sync is set, the estimated difference between the two sets is bigger - than 25% or the set size of the receiving peer is zero. Otherwise the Individual Element Synchronisation Mode is used. + than 25% or the set size of the receiving peer is zero. Otherwise the delta synchronisation mode is used. ############# NOTE END############ </t> </section>