aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorElias Summermatter <elias.summermatter@seccom.ch>2021-06-16 14:52:06 +0200
committerElias Summermatter <elias.summermatter@seccom.ch>2021-06-16 14:52:06 +0200
commitbbd0f12e96f5f635c4828312479b8863d0d7a676 (patch)
tree276503e4fb73d7447e6b1613ce24aeeda794a4f8
parent29fcb992bb7501c94a5720f1614be1edeb5ecf70 (diff)
downloadlsd0003-bbd0f12e96f5f635c4828312479b8863d0d7a676.tar.gz
lsd0003-bbd0f12e96f5f635c4828312479b8863d0d7a676.zip
Fixed some stuff for submission to the ietf
-rw-r--r--draft-summermatter-set-union-01.xml3280
-rw-r--r--draft-summermatter-set-union.xml10
2 files changed, 3285 insertions, 5 deletions
diff --git a/draft-summermatter-set-union-01.xml b/draft-summermatter-set-union-01.xml
new file mode 100644
index 0000000..7f1071d
--- /dev/null
+++ b/draft-summermatter-set-union-01.xml
@@ -0,0 +1,3280 @@
1<?xml version='1.0' encoding='utf-8'?>
2<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent" [
3 <!ENTITY RFC1034 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1034.xml">
4 <!ENTITY RFC1035 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1035.xml">
5 <!ENTITY RFC2119 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
6 <!ENTITY RFC2782 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2782.xml">
7 <!ENTITY RFC3686 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3686.xml">
8 <!ENTITY RFC5869 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5869.xml">
9 <!ENTITY RFC3385 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3385.xml">
10 <!ENTITY RFC1951 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1951.xml">
11 ]>
12<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
13<?rfc strict="yes" ?>
14<?rfc toc="yes" ?>
15<?rfc symrefs="yes"?>
16<?rfc sortrefs="yes" ?>
17<?rfc compact="yes" ?>
18<?rfc subcompact="no" ?>
19<rfc category="info" docName="draft-summermatter-set-union-01" ipr="trust200902"
20 obsoletes="" updates="" submissionType="IETF" xml:lang="en" version="3">
21 <!-- xml2rfc v2v3 conversion 2.26.0 -->
22 <front>
23 <title abbrev="Set Union">
24 Byzantine Fault Tolerant Set Reconciliation
25 </title>
26 <seriesInfo name="Internet-Draft" value="draft-summermatter-set-union-01"/>
27 <author fullname="Elias Summermatter" initials="E." surname="Summermatter">
28 <organization>Seccom GmbH</organization>
29 <address>
30 <postal>
31 <street>Brunnmattstrasse 44</street>
32 <city>Bern</city>
33 <code>3007</code>
34 <country>CH</country>
35 </postal>
36 <email>elias.summermatter@seccom.ch</email>
37 </address>
38 </author>
39 <author fullname="Christian Grothoff" initials="C." surname="Grothoff">
40 <organization>Berner Fachhochschule</organization>
41 <address>
42 <postal>
43 <street>Hoeheweg 80</street>
44 <city>Biel/Bienne</city>
45 <code>2501</code>
46 <country>CH</country>
47 </postal>
48 <email>grothoff@gnunet.org</email>
49 </address>
50 </author>
51
52 <!-- Meta-data Declarations -->
53 <area>General</area>
54 <workgroup>Independent Stream</workgroup>
55 <keyword>name systems</keyword>
56 <abstract>
57 <t>This document contains a protocol specification for Byzantine fault-tolerant
58 Set Reconciliation.
59 </t>
60 </abstract>
61 </front>
62 <middle>
63 <section anchor="introduction" numbered="true" toc="default">
64 <name>Introduction</name>
65 <t>
66 This document describes a byzantine fault tolerant set reconciliation protocol used to efficient and securely
67 compute the union of two sets across a network.
68 </t>
69 <t>
70 This byzantine fault tolerant set reconciliation
71 protocol can be used in a variety of applications.
72
73 Our primary envisioned application domain is the
74 distribution of revocation messages in the GNU Name
75 System (GNS) <xref target="GNS" format="default" />. In GNS,
76 key revocation messages are usually flooded across the
77 peer-to-peer overlay network to all connected peers
78 whenever a key is revoked. However, as peers may be
79 offline or the network might have been partitioned,
80 there is a need to reconcile revocation lists whenever
81 network partitions are healed or peers go online. The
82 GNU Name System uses the protocol described in this
83 specification to efficiently distribute revocation
84 messages whenever network partitions are healed.
85
86 Another application domain for the protocol described
87 in this specification are Byzantine fault-tolerant
88 bulletin boards, like those required in some secure
89 multiparty computations. A well-known example for
90 secure multiparty computations are various E-voting
91 protocols <xref target="CryptographicallySecureVoting" format="default"/> which
92 use a bulletin board to share the votes and intermediate
93 computational results. We note that for such systems,
94 the set reconciliation protocol is merely a component of
95 a multiparty consensus protocol, such as the one
96 described in Dold's "Byzantine set-union consensus using
97 efficient set reconciliation" <xref target="ByzantineSetUnionConsensusUsingEfficientSetReconciliation" format="default"/>.
98 </t>
99 <t>
100 The protocol described in this report is generic and
101 suitable for a wide range of applications. As a result,
102 the internal structure of the elements in the sets MUST
103 be defined and verified by the application using the
104 protocol. This document thus does not cover the element
105 structure, except for imposing a limit on the maximum
106 size of an element.
107 </t>
108 <t>
109 The protocol faces an inherent trade-off between minimizing
110 the number of network round-trips and the number of bytes
111 sent over the network. Thus, for the protocol to choose
112 the right parameters for a given situation, applications
113 using an implementation of the protocol SHOULD provide a
114 parameter that specifies
115 the cost-ratio of round-trips vs. bandwidth usage. Given
116 this trade-off factor, an implementation CAN then choose parameters
117 that minimize total execution cost. In particular, there
118 is one major choice to be made, namely between sending the
119 complete set of elements, or computing the set differences and
120 transmitting only the elements in the set differences.
121 In the latter case, our design is basically a concrete
122 implementation of a proposal by Eppstein.<xref target="Eppstein" format="default" />
123 </t>
124
125 <t>
126 We say that our set reconciliation protocol is Byzantine
127 fault-tolerant because it provides cryptographic and
128 probabilistic methods to discover if the other peer
129 is dishonest or misbehaving.
130 Here, the security objective is to limit resources wasted on
131 malicious actors. Malicious actors could send malformed
132 messages, including malformed set elements, claim to
133 have much larger numbers of valid set elements than they
134 actually hold, or request the retransmission of elements
135 that they have already received in previous
136 interactions. Bounding resources consumed by malicous
137 actors is important to ensure that higher-level protocols
138 can use set reconciliation and still meet their resource
139 targets. This can be particularly critical in multi-round
140 synchronous consensus protocols where peers that cannot
141 answer in a timely fashion would have to be treated as
142 failed or malicious.
143 </t>
144 <t>
145 To defend against some of these attacks, applications
146 SHOULD remember the number of elements previously
147 shared with a peer, and SHOULD provide a way to check that
148 elements are well-formed. Applications MAY also
149 provide an upper bound on the total number of valid
150 elements that exist. For example, in E-voting, the
151 number of eligible voters MAY be used to provide such
152 an upper bound.
153 </t>
154 <t>
155 A first draft of this RFC is part of Elias Summermatter's
156 bachelor thesis. Many of the algorithms and parameters
157 documented in this RFC are derived from experiments
158 detailed in this thesis.
159 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>
160 </t>
161
162 <t>
163 This document defines the normative wire format of resource records, resolution processes,
164 cryptographic routines and security considerations for use by implementors.
165 SETU requires a bidirectional secure communication channel between the two parties.
166 Specification of the communication channel is out of scope of this document.
167 </t>
168 <t>
169 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
170 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
171 "OPTIONAL" in this document are to be interpreted as described
172 in <xref target="RFC2119"/>.
173 </t>
174 </section>
175
176 <section anchor="background" numbered="true" toc="default">
177 <name>Background</name>
178 <section anchor="bf" numbered="true" toc="default">
179 <name>Bloom Filter</name>
180 <t>
181 A Bloom filter (BF) is a space-efficient probabilistic
182 datastructure to test if an element is part of a set of elements.
183 Elements are identified by an element ID.
184 Since a BF is a probabilistic datastructure, it is possible to have false-positives: when asked
185 if an element is in the set, the answer from a BF is either "no" or "maybe".
186 </t>
187 <t>
188 A BF consists of L buckets. Every bucket is a binary value that can be either 0 or 1. All buckets are initialized
189 to 0. A mapping function M is used to map each ID of each element from the set to a subset of k buckets. In the original proposal by Bloom, M is non-injective
190 and can thus map the same element multiple times to the same bucket.
191 The type of the mapping function can thus be described by the following mathematical notation:
192 </t>
193 <figure anchor="bf_mapping_function_math">
194 <artwork name="" type="" align="left" alt=""><![CDATA[
195 ------------------------------------
196 # M: E->B^k
197 ------------------------------------
198 # L = Number of buckets
199 # B = 0,1,2,3,4,...L-1 (the buckets)
200 # k = Number of buckets per element
201 # E = Set of elements
202 ------------------------------------
203 Example: L=256, k=3
204 M('element-data') = {4,6,255}
205
206 ]]></artwork>
207 </figure>
208 <t>
209 A typical mapping function is constructed by hashing the element, for example
210 using the well-known <relref section="2" target="RFC5869" displayFormat="of">HKDF construction</relref>.
211 </t>
212 <t>
213 To add an element to the BF, the corresponding buckets under the map M are set to 1.
214 To check if an element may be in the set, one tests if all buckets under the map M are set to 1.
215 </t>
216 <t>
217 In the BF the buckets are set to 1 if the corresponding bit in the bitstream is 1.
218 If there is a collision and a bucket is already set to 1, the bucket stays at 1.
219 </t>
220 <t>
221 In the following example the element e0 with M(e0) = {1,3} has been added:
222 </t>
223 <figure anchor="figure_bf_insert_0">
224 <artwork name="" type="" align="left" alt=""><![CDATA[
225 bucket-0 bucket-1 bucket-2 bucket-3
226 +-------------+-------------+-------------+-------------+
227 | 0 | 1 | 0 | 1 |
228 +-------------+-------------+-------------+-------------+
229 ]]></artwork>
230 </figure>
231 <t>
232 It is easy to see that an element e1 with M(e1) = {0,3}
233 could have been added to the BF below, while an element e2
234 with M(e2) = {0,2} cannot be in the set represented by the
235 BF below:
236 </t>
237
238 <figure anchor="figure_bf_contains">
239 <artwork name="" type="" align="left" alt=""><![CDATA[
240 bucket-0 bucket-1 bucket-2 bucket-3
241 +-------------+-------------+-------------+-------------+
242 | 1 | 0 | 0 | 1 |
243 +-------------+-------------+-------------+-------------+
244 ]]></artwork>
245 </figure>
246 <t>
247 The parameters L and k depend on the set size and MUST be
248 chosen carefully to ensure that the BF does not return too
249 many false-positives.
250 </t>
251 <t>
252 It is not possible to remove an element from the BF because buckets can only be set to 1 or 0. Hence it is impossible to
253 differentiate between buckets containing one or more elements. To remove elements from the BF a <xref target="cbf" format="title" />
254 is required.
255 </t>
256 </section>
257
258 <section anchor="cbf" numbered="true" toc="default">
259 <name>Counting Bloom Filter</name>
260 <t>
261 A Counting Bloom Filter (CBF) is a variation on the idea
262 of a <xref target="bf" format="title" />. With a CBF, buckets are
263 unsigned numbers instead of binary values.
264 This allows the removal of an element from the CBF.
265 </t>
266 <t>
267 Adding an element to the CBF is similar to the adding operation of the BF.
268 However, instead of setting the buckets to 1 the
269 numeric value stored in the bucket is increased by 1.
270 For example, if two colliding elements M(e1) = {1,3} and
271 M(e2) = {0,3} are added to the CBF, bucket 0 and 1 are set
272 to 1 and bucket 3 (the colliding bucket) is set to 2:
273 </t>
274 <figure anchor="figure_cbf_insert_0">
275 <artwork name="" type="" align="left" alt=""><![CDATA[
276 bucket-0 bucket-1 bucket-2 bucket-3
277 +-------------+-------------+-------------+-------------+
278 | 1 | 1 | 0 | 2 |
279 +-------------+-------------+-------------+-------------+
280 ]]></artwork>
281 </figure>
282 <t>
283 The counter stored in the bucket is also called the order of the bucket.
284 </t>
285 <t>
286 To remove an element form the CBF the counters of all buckets
287 the element is mapped to are decreased by 1.
288 </t>
289 <t>
290 For example, removing M(e2) = {1,3} from the CBF above
291 results in:
292 </t>
293 <figure anchor="figure_cbf_remove_0">
294 <artwork name="" type="" align="left" alt=""><![CDATA[
295 bucket-0 bucket-1 bucket-2 bucket-3
296 +-------------+-------------+-------------+-------------+
297 | 1 | 0 | 0 | 1 |
298 +-------------+-------------+-------------+-------------+
299 ]]></artwork>
300 </figure>
301 <t>
302 In practice, the number of bits available for the counters
303 is often finite. For example, given a 4-bit
304 counter, a CBF bucket would overflow 16 elements are mapped
305 to the same bucket. To handle this case, the maximum value
306 (15 in our example) is considered to represent "infinity". Once the
307 order of a bucket reaches "infinity", it is no longer incremented or decremented.
308 </t>
309 <t>
310 The parameters L and k and the number of bits allocated to the counters
311 SHOULD depend on the set size.
312 A CBF will degenerate when subjected to insert and remove iterations of
313 different elements, and eventually all buckets will reach "infinity".
314 The speed of the degradation will depend on the choice of L and k in
315 relation to the number of elements stored in the IBF.
316 </t>
317 </section>
318 </section>
319
320 <section anchor="ibf" numbered="true" toc="default">
321 <name>Invertible Bloom Filter</name>
322 <t>
323 An Invertible Bloom Filter (IBF) is a further extension of the <xref target="cbf" format="title" />.
324 An IBF extends the <xref target="cbf" format="title" /> with two more operations:
325 decode and set difference. This two extra operations are key to efficiently obtain
326 small differences between large sets.
327 </t>
328 <section anchor="ibf_structure" numbered="true" toc="default">
329 <name>Structure</name>
330 <t>
331 An IBF consists of an injective mapping function M mapping
332 elements to k out of L buckets. Each of the L buckets stores
333 a signed COUNTER, an IDSUM and an XHASH.
334 An IDSUM is the XOR of various element IDs.
335 An XHASH is the XOR of various hash values.
336 As before, the values used for k, L and the number of bits used
337 for the signed counter and the XHASH depend
338 on the set size and various other trade-offs.
339 </t>
340 <t>
341 If the IBF size is too small or the mapping
342 function does not spread out the elements
343 uniformly, the signed counter can overflow or
344 underflow. As with the CBF, the "maximum" value is
345 thus used to represent "infinite". As there is no
346 need to distinguish between overflow and
347 underflow, the most canonical representation of
348 "infinite" would be the minimum value of the
349 counter in the canonical 2-complement
350 interpretation. For example, given a 4-bit
351 counter a value of -8 would be used to represent
352 "infinity".
353 </t>
354 <figure anchor="figure_ibf_structure">
355 <artwork name="" type="" align="left" alt=""><![CDATA[
356 bucket-0 bucket-1 bucket-2 bucket-3
357 +-------------+-------------+-------------+-------------+-------
358 count | COUNTER | COUNTER | COUNTER | COUNTER | C...
359 +-------------+-------------+-------------+-------------+------
360 idSum | IDSUM | IDSUM | IDSUM | IDSUM | I...
361 +-------------+-------------+-------------+-------------+------
362hashSum | HASHSUM | HASHSUM | HASHSUM | HASHSUM | H..
363 +-------------+-------------+-------------+-------------+-------
364 ]]></artwork>
365 </figure>
366
367 </section>
368
369 <section anchor="ibf_format_id_generation" numbered="true" toc="default">
370 <name>Salted Element ID Calculation</name>
371 <t>
372 IBFs are a probabilistic data structure, hence it can be necessary to
373 recompute the IBF in case operations fail, and then try again. The
374 recomputed IBF would ideally be statistically independent of the
375 failed IBF. This is achieved by introducing an IBF-salt. Given that with
376 benign peers failures should be rare, and that we need to be able to
377 "invert" the application of the IBF-salt to the element IDs, we use an
378 unsigned 32 bit non-random IBF-salt value of which the lowest 6 bits will
379 be used to rotate bits in the element ID computation.
380 </t>
381 <t>
382 64-bit element IDs are generated from a
383 <relref section="2" target="RFC5869" displayFormat="of">HKDF construction</relref>
384 with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF with a 16-bit KDF-salt set to a
385 unsigned 16-bit representation of zero.
386 The output of the KDF is then truncated to 64-bit.
387 Finally, salting is done by calculating the IBF-salt modulo 64
388 (effectively using only the lowest 6-bits of the IBF-salt)
389 and doing a bitwise right rotation of the output of KDF. We
390 note that this operation was chosen as it is easily inverted,
391 allowing applications to easily derive element IDs with one
392 IBF-salt value from element IDs generated with a different
393 IBF-salt value.
394 </t>
395 <t>
396 In case the IBF does not decode, the IBF-salt can be changed to
397 compute different element IDs, which will (likely) be mapped
398 to different buckets, likely allowing the IBF to decode in a
399 subsequent iteration.
400 </t>
401 <figure anchor="ibf_format_id_generation_pseudo_code">
402 <artwork name="" type="" align="left" alt=""><![CDATA[
403# INPUTS:
404# key: Pre calculated and truncated key from id_calculation function
405# ibf_salt: Salt of the IBF
406# OUTPUT:
407# value: salted key
408FUNCTION salt_key(key,ibf_salt):
409 s = (ibf_salt * 7) modulo 64;
410 /* rotate key */
411 return (key >> s) | (key << (64 - s))
412END FUNCTION
413
414
415# INPUTS:
416# element: element for which we are to calculate the element ID
417# ibf_salt: Salt of the IBF
418# OUTPUT:
419# value: the ID of the element
420FUNCTION id_calculation (element,ibf_salt):
421 kdf_salt = 0 // 16 bits
422 XTR=HMAC-SHA256
423 PRF=HMAC-SHA256
424 key = HKDF(XTR, PRF, kdf_salt, element) modulo 2^64
425 return salt_key(key, ibf_salt)
426END FUNCTION
427 ]]></artwork>
428 </figure>
429 </section>
430
431 <section anchor="ibf_format_HASH_calculation" numbered="true" toc="default">
432 <name>HASH calculation</name>
433 <t>
434 The HASH of an element ID is computed by calculating the
435 CRC32 checksum of the 64-bit ID value,
436 which returns a 32-bit value.CRC32 is well-known and described in <relref section="4.1" target="RFC3385" displayFormat="of">the RFC</relref>.
437 </t>
438 </section>
439
440 <section anchor="ibf_m" numbered="true" toc="default">
441 <name>Mapping Function</name>
442 <t>
443 The mapping function M decides which buckets a given ID is mapped to.
444 For an IBF, it is beneficial to use an injective mapping function M.
445 </t>
446 <t>
447 The first index is simply the CRC32 of the ID modulo the IBF size. The second
448 index is calculated by creating a new 64-bit value by shifting the previous 32-bit
449 value left and setting the lower 32 bits to the number of indices already processed.
450 From the resulting 64-bit value, another CRC32 checksum is computed.
451 The subsequent index is the modulo of this CRC32 output.
452 The process is repeated until the desired number of indices is generated.
453 In the case the process computes the same index twice,
454 which would mean this bucket could not get pure again,
455 the second hit is just skipped and the next iteration is used instead,
456 creating an injective mapping function.
457 </t>
458 <figure anchor="ibf_format_bucket_identification_pseudo_code">
459 <artwork name="" type="" align="left" alt=""><![CDATA[
460# INPUTS:
461# key: the ID of the element calculated
462# k: numbers of buckets per element
463# L: total number of buckets in the IBF
464# OUTPUT:
465# dst: Array with k bucket IDs
466FUNCTION get_bucket_id (key, k, L)
467 bucket = CRC32(key)
468 i = 0 // unsigned 32-bit index
469 filled = 0
470 WHILE filled < k DO
471
472 element_already_in_bucket = false
473 j = 0
474 WHILE j < filled DO
475 IF dst[j] == bucket modulo L THEN
476 element_already_in_bucket = true
477 END IF
478 j++
479 END WHILE
480
481 IF !element_already_in_bucket THEN
482 dst[filled] = bucket modulo L
483 filled = filled + 1
484 END IF
485
486 x = (bucket << 32) | i // 64 bit result
487 bucket = CRC32(x)
488 i = i + 1
489 END WHILE
490 return dst
491END FUNCTION
492 ]]></artwork>
493 </figure>
494 </section>
495
496 <section anchor="ibf_operations" numbered="true" toc="default">
497 <name>Operations</name>
498 <t>
499 When an IBF is created, all counters and IDSUM and HASHSUM values of
500 all buckets are initialized to zero.
501 </t>
502 <section anchor="ibv_operations_insert" numbered="true" toc="default">
503 <name>Insert Element</name>
504 <t>
505 To add an element to an IBF, the element is mapped to a subset of k buckets using
506 the injective mapping function M as described in section <xref target="ibf_m" format="title" />. For the buckets selected by the mapping function, the counter is increased by one and the
507 IDSUM field is set to the XOR of the element ID
508 computed as described in section <xref target="ibf_format_id_generation" format="title" />
509 and the previously stored IDSUM. Furthermore,
510 the HASHSUM is set to the XOR of the previously stored HASHSUM
511 and the hash of the element ID computed as described
512 in section <xref target="ibf_format_HASH_calculation" format="title" />.
513 </t>
514 <t>
515 In the following example, the insert operation is illustrated using an element with the
516 ID 0x0102 mapped to {1,3} with a hash of 0x4242, and a second element with the
517 ID 0x0304 mapped to {0,1} and a hash of 0x0101.
518 </t>
519 <t>Empty IBF:</t>
520 <figure anchor="figure_ibf_insert_0">
521 <artwork name="" type="" align="left" alt=""><![CDATA[
522 bucket-0 bucket-1 bucket-2 bucket-3
523 +-------------+-------------+-------------+-------------+
524 count | 0 | 0 | 0 | 0 |
525 +-------------+-------------+-------------+-------------+
526 idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 |
527 +-------------+-------------+-------------+-------------+
528hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 |
529 +-------------+-------------+-------------+-------------+
530 ]]></artwork>
531 </figure>
532 <t>Insert first element with ID 0x0102 and hash 0x4242 into {1,3}:</t>
533 <figure anchor="figure_ibf_insert_1">
534 <artwork name="" type="" align="left" alt=""><![CDATA[
535 bucket-0 bucket-1 bucket-2 bucket-3
536 +-------------+-------------+-------------+-------------+
537 count | 0 | 1 | 0 | 1 |
538 +-------------+-------------+-------------+-------------+
539 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 |
540 +-------------+-------------+-------------+-------------+
541hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
542 +-------------+-------------+-------------+-------------+
543 ]]></artwork>
544 </figure>
545 <t>Insert second element with ID 0x0304 and hash 0101 into {0,1}:</t>
546 <figure anchor="figure_ibf_insert_2">
547 <artwork name="" type="" align="left" alt=""><![CDATA[
548 bucket-0 bucket-1 bucket-2 bucket-3
549 +-------------+-------------+-------------+-------------+
550 count | 1 | 2 | 0 | 1 |
551 +-------------+-------------+-------------+-------------+
552 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 |
553 +-------------+-------------+-------------+-------------+
554hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 |
555 +-------------+-------------+-------------+-------------+
556 ]]></artwork>
557 </figure>
558 </section>
559 <section anchor="ibf_operations_remove" numbered="true" toc="default">
560 <name>Remove Element</name>
561 <t>
562 To remove an element from the IBF the element is again mapped to a subset of the buckets using M.
563 Then all the counters of the buckets selected by M are reduced by one, the IDSUM is
564 replaced by the XOR of the old IDSUM and the ID of the element being removed, and the
565 HASHSUM is similarly replaced with the XOR of the old HASHSUM and the hash of the ID.
566 </t>
567 <t>
568 In the following example the remove operation is illustrated.
569 </t>
570 <t>IBF with two encoded elements:</t>
571 <figure anchor="figure_ibf_remove_0">
572 <artwork name="" type="" align="left" alt=""><![CDATA[
573 bucket-0 bucket-1 bucket-2 bucket-3
574 +-------------+-------------+-------------+-------------+
575 count | 1 | 2 | 0 | 1 |
576 +-------------+-------------+-------------+-------------+
577 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 |
578 +-------------+-------------+-------------+-------------+
579hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 |
580 +-------------+-------------+-------------+-------------+
581 ]]></artwork>
582 </figure>
583 <t>After removal of element with ID 0x0304 and hash 0x0101 mapped to {0,1} from the IBF:</t>
584 <figure anchor="figure_ibf_remove_1">
585 <artwork name="" type="" align="left" alt=""><![CDATA[
586 bucket-0 bucket-1 bucket-2 bucket-3
587 +-------------+-------------+-------------+-------------+
588 count | 0 | 1 | 0 | 1 |
589 +-------------+-------------+-------------+-------------+
590 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 |
591 +-------------+-------------+-------------+-------------+
592hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
593 +-------------+-------------+-------------+-------------+
594 ]]></artwork>
595 </figure>
596 <t>
597 Note that it is possible to "remove" elements from an IBF that were never present
598 in the IBF in the first place. A negative counter value is thus indicative of
599 elements that were removed without having been added. Note that an IBF bucket
600 counter of zero no longer guarantees that an element mapped to that bucket is not
601 present in the set: a bucket with a counter of zero can be the result of one
602 element being added and a different element (mapped to the same bucket) being removed.
603 To check that an element is not present requires a counter of zero and an
604 IDSUM and HASHSUM of zero --- and some certainty that there was no collision due
605 to the limited number of bits in IDSUM and HASHSUM. Thus,
606 IBFs are not suitable to replace BFs or IBFs.
607 </t>
608 <t>
609 Buckets in an IBF with a counter of 1 or -1 are crucial for decoding an IBF, as
610 they MIGHT represent only a single element, with the IDSUM being the ID of that element.
611 Following Eppstein <xref target="Eppstein" format="default" />, we will call buckets that only represent a single
612 element <em>pure buckets</em>.
613 Note that due to the possibility of multiple insertion and removal operations
614 affecting the same bucket, not all buckets with a counter of 1 or -1 are
615 actually pure buckets. Sometimes a counter can be 1 or -1 because N elements
616 mapped to that bucket were added while N-1 or N+1 different elements also
617 mapped to that bucket were removed.
618 </t>
619 </section>
620
621 <section anchor="ibf_operations_decode" numbered="true" toc="default">
622 <name>Extracting elements</name>
623 <t>
624 Extracting elements from an IBF yields IDs of elements from the IBF.
625 Elements are extracted from an IBF by repeatedly performing a
626 decode operation on the IBF.
627 </t>
628 <t>
629 A decode operation requires a pure bucket, that is a bucket to which M
630 only mapped a single element, to succeed. Thus, if there is no bucket with
631 a counter of 1 or -1, decoding fails. However, as a counter of 1 or -1 is
632 not a guarantee that the bucket is pure, there is also a chance that the
633 decoder returns an IDSUM value that is actually the XOR of several IDSUMs.
634 This is primarily detected by checking that the HASHSUM is the hash of the IDSUM.
635 Only if the HASHSUM also matches, the bucket could be pure. Additionally,
636 one MUST check that the IDSUM value actually would be mapped by M to
637 the respective bucket. If not, there was a hash collision and the bucket
638 is also not pure.
639 </t>
640 <t>
641 The very rare case that after all these checks a bucket is still
642 falsely identified as pure MUST be detected (say by determining that
643 extracted element IDs do not match any actual elements), and addressed
644 at a higher level in the protocol. As these failures are probabilistic
645 and depend on element IDs and the IBF construction, they can typically
646 be avoided by retrying with different parameters, such as a different
647 way to assign element IDs to elements (by varying the IBF-salt),
648 using a larger value for L, or a different mapping function M.
649 A more common scenario (especially if L was too small) is that
650 IBF decoding fails because there is no pure bucket. In this case, the
651 higher-level protocol generally MUST also retry using different
652 parameters (except if an attack is detected).
653 </t>
654 <t>
655 Suppose the IBF contains a pure bucket. In this case, the IDSUM in the
656 bucket is the ID of an element. Furthermore, it is then possible
657 to remove that element from the IBF (by inserting it if the counter
658 was negative, and by removing it if the counter was positive). This
659 is likely to cause other buckets to become pure, allowing further
660 elements to be decoded. Eventually, decoding ought to finish with
661 all counters and IDSUM and HASHSUM values reach zero. However, it is also
662 possible that an IBF only partly decodes and then decoding fails due
663 to the lack of pure buckets after extracting some element IDs.
664 </t>
665 <t>
666 In the following example the successful decoding of an IBF containing
667 the two elements previously added in our running example.
668 </t>
669 <t>
670 We begin with an IBF with two elements added:
671 </t>
672 <figure anchor="figure_ibf_decode_0">
673 <artwork name="" type="" align="left" alt=""><![CDATA[
674 bucket-0 bucket-1 bucket-2 bucket-3
675 +-------------+-------------+-------------+-------------+
676 count | 1 | 2 | 0 | 1 |
677 +-------------+-------------+-------------+-------------+
678 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 |
679 +-------------+-------------+-------------+-------------+
680hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 |
681 +-------------+-------------+-------------+-------------+
682 ]]></artwork>
683 </figure>
684 <t>
685 In the IBF are two pure buckets to decode (buckets 0 and 3) we choose to start with
686 decoding bucket 0. This yields the element with the hash ID 0x0304 and
687 hash 1010. This element ID is mapped to buckets
688 {0,1}.
689 Subtracting this element results in bucket 1 becoming pure:
690 </t>
691 <figure anchor="figure_ibf_decode_1">
692 <artwork name="" type="" align="left" alt=""><![CDATA[
693 bucket-0 bucket-1 bucket-2 bucket-3
694 +-------------+-------------+-------------+-------------+
695 count | 0 | 1 | 0 | 1 |
696 +-------------+-------------+-------------+-------------+
697 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 |
698 +-------------+-------------+-------------+-------------+
699hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 |
700 +-------------+-------------+-------------+-------------+
701 ]]></artwork>
702 </figure>
703 <t>
704 We can now decoding bucket 2 and extract the element
705 with the ID 0x0102 and hash 0x4242. Now the IBF is
706 empty. Extraction completes with the status that
707 the IBF has been successfully decoded.
708 </t>
709 <figure anchor="figure_ibf_decode_2">
710 <artwork name="" type="" align="left" alt=""><![CDATA[
711 bucket-0 bucket-1 bucket-2 bucket-3
712 +-------------+-------------+-------------+-------------+
713 count | 0 | 0 | 0 | 0 |
714 +-------------+-------------+-------------+-------------+
715 idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 |
716 +-------------+-------------+-------------+-------------+
717hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 |
718 +-------------+-------------+-------------+-------------+
719 ]]></artwork>
720 </figure>
721 </section>
722
723 <section anchor="ibv_operations_setdiff" numbered="true" toc="default">
724 <name>Set Difference</name>
725 <t>
726 Given addition and removal as defined above, it is possible to define an operation on IBFs
727 that computes an IBF representing the set difference. Suppose IBF1 represents set A, and
728 IBF2 represents set B. Then this set difference operation will compute IBF3 which
729 represents the set A - B. Note that this computation can be done only on the IBFs,
730 and does not require access to the elements from set A or B.
731
732 To calculate the IBF representing this set difference, both IBFs MUST have the same
733 length L, the same number of buckets per element k and use the same map M.
734 Naturally, all IDs must have been computed using the same IBF-salt. Given this,
735 one can compute the IBF representing the set difference by taking the XOR of the IDSUM and HASHSUM values
736 of the respective buckets and subtracting the respective counters. Care MUST be taken
737 to handle overflows and underflows by setting the counter to "infinity" as necessary.
738 The result is a new IBF with the same number of buckets representing the set difference.
739 </t>
740 <t>
741 This new IBF can be decoded as described in section <xref target="ibf_operations_decode" format="counter" />.
742 The new IBF can have two types of pure buckets with counter set to 1 or -1. If the counter is set to 1
743 the element is missing in the secondary set, and if the counter is set to -1 the element is missing in
744 the primary set.
745 </t>
746 <t>
747 To demonstrate the set difference operation we compare IBF-A with IBF-B and generate as described
748 IBF-AB
749 </t>
750 <t>IBF-A contains the elements with ID 0x0304 and hash 0x0101 mapped to {0,1},
751 and ID 0x0102 and hash 0x4242 mapped to {1,3}:</t>
752 <figure anchor="figure_ibf_setdiff_A">
753 <artwork name="" type="" align="left" alt=""><![CDATA[
754 bucket-0 bucket-1 bucket-2 bucket-3
755 +-------------+-------------+-------------+-------------+
756 count | 1 | 2 | 0 | 1 |
757 +-------------+-------------+-------------+-------------+
758 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 |
759 +-------------+-------------+-------------+-------------+
760hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 |
761 +-------------+-------------+-------------+-------------+
762 ]]></artwork>
763 </figure>
764 <t>IBF-B also contains the element with ID 0x0102 and
765 and another element with ID 0x1345 and hash 0x5050
766 mapped to {1,2}.</t>
767 <figure anchor="figure_ibf_setdiff_B">
768 <artwork name="" type="" align="left" alt=""><![CDATA[
769 bucket-0 bucket-1 bucket-2 bucket-3
770 +-------------+-------------+-------------+-------------+
771 count | 0 | 1 | 1 | 1 |
772 +-------------+-------------+-------------+-------------+
773 idSum | 0x0000 | 0x1447 | 0x1345 | 0x0102 |
774 +-------------+-------------+-------------+-------------+
775hashSum | 0x0000 | 0x9292 | 0x5050 | 0x4242 |
776 +-------------+-------------+-------------+-------------+
777 ]]></artwork>
778 </figure>
779 <t>IBF-A minus IBF-B is then:</t>
780 <figure anchor="figure_ibf_setdiff_AB">
781 <artwork name="" type="" align="left" alt=""><![CDATA[
782 bucket-0 bucket-1 bucket-2 bucket-3
783 +-------------+-------------+-------------+-------------+
784 count | 1 | 0 | -1 | 0 |
785 +-------------+-------------+-------------+-------------+
786 idSum | 0x0304 | 0x1049 | 0x1345 | 0x0000 |
787 +-------------+-------------+-------------+-------------+
788hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 |
789 +-------------+-------------+-------------+-------------+
790 ]]></artwork>
791 </figure>
792 <t>After calculating and decoding the IBF-AB shows clear that in IBF-A the element with the hash 0x5050
793 is missing (-1 in bucket 2) while in IBF-B the element with the hash 0101 is missing
794 (1 in bucket 0). The element with hash 0x4242 is present in IBF-A and IBF-B and is
795 removed by the set difference operation. Bucket 2 is not empty.
796 </t>
797 </section>
798 </section>
799
800 <section anchor="ibf_format" numbered="true" toc="default">
801 <name>Wire format</name>
802 <t>
803 For the counter field, we use a variable-size encoding to ensure
804 that even for very large sets the counter should never reach
805 "infinity", while also ensuring that the encoding is compact for
806 small sets.
807 Hence, the counter size transmitted over the wire
808 varies between 1 and 64 bits, depending on the
809 maximum counter in the IBF. A range of 1 to 64 bits
810 should cover most areas of application and can be
811 efficiently implemented on most contemporary CPU
812 architectures and programming languages.
813 The bit length for the transmitted IBF
814 will be communicated in the header of the
815 <em><xref target="messages_ibf" format="title" /></em> message
816 in the "IMCS" field as unsigned 8-bit integer.
817 For implementation details see section <xref target="performance_counter_variable_size" format="title" />.
818 </t>
819 <t>
820 For the "IDSUM", we always use a 64-bit representation.
821 The IDSUM value MUST have sufficient entropy for the
822 mapping function M to yield reasonably random buckets
823 even for very large values of L. With a 32 bit
824 value the chance that multiple elements may be mapped
825 to the same ID would be quite high, even for moderately
826 large sets. Using more than 64 bits would at best make
827 sense for very large sets, but then it is likely always
828 better to simply afford additional round trips to handle
829 the occasional collision. 64 bits are also a reasonable
830 size for many CPU architectures.
831 </t>
832 <t>
833 For the "HASHSUM", we always use a 32-bit
834 representation. Here, it is most important to
835 avoid collisions, where different elements are
836 mapped to the same hash, possibly resulting in
837 a bucket being falsely classified as pure.
838 While with 32 bits there remains a non-negligible chance of
839 accidental collisions, our protocol is designed
840 to handle occasional collisions. Hence, at 32 bit the chance is
841 believed to be sufficiently small enough
842 for the protocol to handle those cases efficiently. Smaller hash
843 values would safe bandwidth, but also substantially
844 increase the chance of collisions. 32 bits are
845 also again a reasonable size for many CPU
846 architectures.
847 </t>
848 </section>
849 </section>
850
851 <section anchor="se" numbered="true" toc="default">
852 <name>Strata Estimator</name>
853 <t>
854 Strata Estimators help estimate the size of the set difference between two sets of elements.
855 This is necessary to efficiently determinate the tuning parameters for an IBF, in particular
856 a good value for L.
857 </t>
858 <t>
859 Basically a Strata Estimator (SE) is a series of IBFs (with a rather small value of L=79)
860 in which increasingly large subsets of the full set
861 of elements are added to each IBF. For the n-th IBF, the function selecting the
862 subset of elements MUST sample to select (probabilistically) 1/(2^n) of all
863 elements. This can be done by counting the number of trailing bits set to "1"
864 in an element ID, and then inserting the element into the IBF identified by that
865 counter. As a result, all elements will be mapped to one IBF, with the n-th
866 IBF being statistically expected to contain 1/(2^n) elements.
867 </t>
868 <t>
869 Given two SEs, the set size difference can be estimated by attempting to decode all of the
870 IBFs. Given that L is set to a fixed and rather small value, IBFs containing large strata
871 will likely fail to decode. For IBFs that failed to decode, one simply
872 extrapolates the number of elements by scaling the numbers obtained from the
873 other IBFs that did decode. If none of the IBFs of the SE decoded (which given
874 a reasonable number of IBFs in the SE should be highly unlikely), one can theoretically
875 retry using a different IBF-salt.
876 </t>
877 <t>
878 When decoding the IBFs in the strata estimator, it is possible to determine
879 on which side which part of the difference is. For this purpose, the pure buckets with
880 counter 1 and -1 must be distinguished and assigned to the respective side when decoding
881 the IBFs.
882 </t>
883 </section>
884
885 <section anchor="modeofoperation" numbered="true" toc="default">
886 <name>Mode of Operation</name>
887 <t>
888 Depending on the state of the two sets the set union
889 protocol uses different modes of operation to efficiently
890 determinate missing elements between the two sets.
891 </t>
892 <t>
893 The simplest mode is the <em>full synchronisation mode</em>.
894 If the difference between the sets of the two
895 peers exceeds a certain threshold, the overhead to determine
896 which elements are different would outweigh the overhead of
897 simply sending the complete set. Hence, the protocol may
898 determine that the most efficient method is to exchange the full sets.
899 </t>
900 <t>
901 The second possibility is that the difference between the sets
902 is relatively small compared to the set size.
903 In this case, the <em>differential synchronisation mode</em> is more efficient.
904 Given these two possibilities, the first steps of the protocol are used to
905 determine which mode MUST be used.
906 </t>
907 <t>
908 Thus, the set union protocol always begins with the following operation mode independent steps:
909 </t>
910
911 <t>
912 The initiating peer begins in the <strong>Initiating Connection</strong> state and the receiving peer in the <strong>Expecting Connection</strong>
913 state. The first step for the initiating peer in the protocol is to send an <em><xref target="messages_operation_request" format="title" /></em> to the receiving peer and
914 transition into the <strong>Expect SE</strong> state. After receiving the <em><xref target="messages_operation_request" format="title" /></em> the receiving peer
915 transitions to the <strong>Expecting IBF</strong> state and answers with the
916 <em><xref target="messages_se" format="title" /></em> message. When the initiating peer receives the <em><xref target="messages_se" format="title" /></em> message,
917 it decides with some heuristics which operation mode is likely more suitable for the estimated set difference and the application-provided latency-bandwidth tradeoff.
918 The detailed algorithm used to choose between the <xref target="modeofoperation_full-sync" format="title" /> and the <xref target="modeofoperation_individual-elements" format="title" />
919 is explained in the section <xref target="modeofoperation_combined-mode" format="title" /> below.
920 </t>
921 <section anchor="modeofoperation_full-sync" numbered="true" toc="default">
922 <name>Full Synchronisation Mode</name>
923 <t>
924 When the initiating peer decides to use the full synchronisation mode and it is better that the other peer sends his set first, the initiating
925 peer sends a <em><xref target="messages_request_full" format="title" /></em> message, and transitions from <strong>Expecting SE</strong> to the <strong>Full Receiving</strong> state.
926 If it has been determined that it is better that the initiating peer sends his set first, the initiating peer sends a <em><xref target="messages_send_full" format="title" /></em> message followed by all
927 set elements in <em><xref target="messages_full_element" format="title" /></em> messages to the other peer, followed by the <em><xref target="messages_full_done" format="title" /></em> message, and transitions into the <strong>Full Sending</strong> state.
928 </t>
929 <t>
930 A state diagram illustrating the state machine used during full synchronization
931 is provided
932 <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/state_machine_full.png">here</eref>.
933 </t>
934 <t><strong>The behavior of the participants the different state is described below:</strong></t>
935 <dl>
936 <dt><strong>Expecting IBF:</strong></dt>
937 <dd>
938 If a peer in the <strong>Expecting IBF</strong> state receives a <em><xref target="messages_request_full" format="title" /></em> message from the other peer, the
939 peer sends all the elements of his set followed by a <em><xref target="messages_full_done" format="title" /></em> message to the other peer, and transitions to the
940 <strong>Full Sending</strong> state. If the peer receives an <em><xref target="messages_send_full" format="title" /></em> message followed by
941 <em><xref target="messages_full_element" format="title" /></em> messages, the peer processes the element and transitions to the <strong>Full Receiving</strong> state.
942 </dd>
943 <dt><strong>Full Sending:</strong></dt>
944 <dd>
945 While a peer is in <strong>Full Sending</strong> state the peer expects to continuously receive elements from the other
946 peer. As soon as a the <em><xref target="messages_full_done" format="title" /></em> message is received, the peer transitions into
947 the <strong>Finished</strong> state.
948 </dd>
949 <dt><strong>Full Receiving: </strong></dt>
950 <dd>
951 While a peer is in the <strong>Full Receiving</strong> state, it expects to continuously receive elements from the other
952 peer. As soon as a the <em><xref target="messages_full_done" format="title" /></em> message is received, it sends
953 the remaining elements (those it did not receive) from his set to the other
954 peer, followed by a <em><xref target="messages_full_done" format="title" /></em>.
955 After sending the last message, the peer transitions into the <strong>Finished</strong> state.
956 </dd>
957 </dl>
958 </section>
959 <section anchor="modeofoperation_individual-elements" numbered="true" toc="default">
960 <name>Differential Synchronisation Mode</name>
961 <t>
962 The message format used by the protocol limits the maximum message size to
963 64 kb. Given that L can be large, an IBF will not always fit within that
964 size limit. To deal with this, larger IBFs are split into multiple messages.
965 </t>
966 <t>
967 When the initiating peer in the <strong>Expected SE</strong> state decides to use the differential synchronisation mode, it
968 sends an IBF, which may
969 consist of several <em><xref target="messages_ibf" format="title" /></em> messages,
970 to the receiving peer and transitions into the <strong>Passive Decoding</strong> state.
971 </t>
972 <t>
973 The receiving peer in the <strong>Expecting IBF</strong> state receives the
974 first <em><xref target="messages_ibf" format="title" /></em> message from
975 the initiating peer, and transitions into the <strong>Expecting IBF Last</strong> state
976 if the IBF was split into multiple <em><xref target="messages_ibf" format="title" /></em>
977 messages. If there is just a single <em><xref target="messages_ibf" format="title" /></em>
978 message, the receiving peer
979 transitions directly to the <strong>Active Decoding</strong> state.
980 </t>
981 <t>
982 The peer that is in the <strong>Active Decoding</strong>, <strong>Finish Closing</strong> or in the <strong>Expecting IBF Last</strong>
983 state is called the active peer, and the peer that is in either the <strong>Passive Decoding</strong> or the <strong>Finish Waiting</strong> state
984 is called the passive peer.
985 </t>
986 <t>
987 A state diagram illustrating the state machine used during differential synchronization
988 is provided
989 <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/differential_state_machine.png">here</eref>.
990 </t>
991 <t><strong>The behavior of the participants the different states is described below:</strong></t>
992 <dl>
993 <dt><strong>Passive Decoding:</strong></dt>
994 <dd>
995 <t>
996 In the <strong>Passive Decoding</strong> state the passive peer reacts to requests from the active peer.
997 The action the passive peer executes depends on the message the passive peer receives in the <strong>Passive Decoding</strong> state from the active peer
998 and is described below on a per message basis.
999 </t>
1000
1001 <dl>
1002 <dt><em><xref target="messages_inquiry" format="title" /></em> message:</dt>
1003 <dd>
1004 The <em><xref target="messages_inquiry" format="title" /></em> message
1005 is received if the active peer requests the SHA-512 hash of one or more elements (by sending the 64 bit element ID)
1006 that are missing from the active peer's set.
1007 In this case the passive peer answers with <em><xref target="messages_offer" format="title" /></em> messages
1008 which contain the SHA-512 hash of the requested element. If the passive peer does not have an element with
1009 a matching element ID, it MUST ignore the inquiry (in this case, a bucket was falsely classified as pure, decoding the IBF will eventually fail, and roles will be swapped).
1010 It should be verified that after an falsely classified pure bucket a role change is made.
1011 If multiple elements match the 64 bit element ID, the passive
1012 peer MUST send offers for all of the matching elements.
1013 </dd>
1014 <dt><em><xref target="messages_demand" format="title" /></em> message:</dt>
1015 <dd>
1016 The <em><xref target="messages_demand" format="title" /></em> message
1017 is received if the active peer requests a complete element that is missing in the active peers set in response to an offer. If the requested element is known and has not yet been transmitted
1018 the passive peer answers with an <em><xref target="messages_elements" format="title" /></em> message which contains the full,
1019 application-dependent data of the requested element. If the passive peer receives a demand for a SHA-512 hash for which
1020 it has no corresponding element, a protocol violation is detected and the protocol MUST be aborted.
1021 Implementations MUST also abort when facing demands without previous matching offers or for which the passive peer previously transmitted the element to the active peer.
1022 </dd>
1023 <dt><em><xref target="messages_offer" format="title" /></em> message:</dt>
1024 <dd>
1025 The <em><xref target="messages_offer" format="title" /></em> message
1026 is received if the active peer has decoded an element that is present in the active peers set and is likely be missing in the
1027 set of the passive peer. If the SHA-512 hash of the offer is indeed not a hash of any of the elements from the set of
1028 the passive peer, the passive peer MUST answer with a <em><xref target="messages_demand" format="title" /></em> message
1029 for that SHA-512 hash and remember that it issued this demand. The demand thus needs to be added to a list with unsatisfied demands.
1030 </dd>
1031 <dt><em><xref target="messages_elements" format="title" /></em> message:</dt>
1032 <dd>
1033 When a new <em><xref target="messages_elements" format="title" /></em> message has been received the peer checks if a corresponding
1034 <em><xref target="messages_demand" format="title" /></em> for the element has been sent
1035 and the demand is still unsatisfied.
1036 If the element has been demanded the peer checks the element for validity, removes it from the list
1037 of pending demands and then saves the element to the set. Otherwise the peer
1038 ignores the element.
1039 </dd>
1040 <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt>
1041 <dd>
1042 If an <em><xref target="messages_ibf" format="title" /></em> message is received, this
1043 indicates that decoding of the IBF on the active site has failed and roles will be swapped.
1044 The receiving passive peer transitions into the <strong>Expecting IBF Last</strong> state,
1045 and waits for more <em><xref target="messages_ibf" format="title" /></em> messages.
1046 There, once the final <em><xref target="messages_ibf_last" format="title" /></em> message has been received, it transitions to <strong>Active Decoding</strong>.
1047 </dd>
1048 <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt>
1049 <dd>
1050 If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this
1051 indicates that there is just one IBF slice left and a direct state and role transition from
1052 <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated.
1053 </dd>
1054 <dt><em><xref target="messages_done" format="title" /></em> message:</dt>
1055 <dd>
1056 Receiving the <em><xref target="messages_done" format="title" /></em> message signals
1057 the passive peer that all demands of the active peer have been satisfied. Alas, the
1058 active peer will continue to process demands from the passive peer.
1059 Upon receiving this message, the passive peer transitions into the
1060 <strong>Finish Waiting</strong> state.
1061 </dd>
1062 </dl>
1063 </dd>
1064 <dt><strong>Active Decoding:</strong></dt>
1065 <dd>
1066 <t>
1067 In the <strong>Active Decoding</strong> state the active peer decodes the IBFs and evaluates the set difference
1068 between the active and passive peer. Whenever an element ID is obtained by decoding the IBF, the active peer
1069 sends either an offer or an inquiry to the passive peer, depending on which site the decoded element is missing.
1070 </t>
1071 <t>
1072 If the IBF decodes a positive (1) pure bucket, the element is missing on the passive peers site.
1073 Thus, the active peer sends an <em><xref target="messages_offer" format="title" /></em> to the passive peer.
1074 A negative (-1) pure bucket indicates that an element is missing in the active peers set, so the active peer
1075 sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer.
1076 </t>
1077 <t>
1078 In case the IBF does not successfully decode anymore, the active peer sends a new IBF computed with a different IBF-salt to the passive peer
1079 and changes into <strong>Passive Decoding</strong> state. This initiates a role swap.
1080 To reduce overhead and prevent double transmission of offers and elements, the new IBF is created
1081 on the local set after updating it with the all of the elements that have been successfully demanded. Note that the active peer MUST NOT wait for all active demands to be satisfied, as demands can fail if a bucket was falsely classified as pure.
1082 </t>
1083 <t>
1084 As soon as the active peer successfully finished decoding the IBF, the active peer sends a
1085 <em><xref target="messages_done" format="title" /></em> message to the passive peer.
1086 </t>
1087 <t>
1088 All other actions taken by the active peer depend on the message the active peer receives from
1089 the passive peer. The actions are described below on a per message basis:
1090 </t>
1091 <dl>
1092 <dt><em><xref target="messages_offer" format="title" /></em> message:</dt>
1093 <dd>
1094 The <em><xref target="messages_offer" format="title" /></em> message indicates that the
1095 passive peer received a <em><xref target="messages_inquiry" format="title" /></em> message from
1096 the active peer. If a inquiry has been sent and
1097 the offered element is missing in the active peers set,
1098 the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the
1099 passive peer. The demand needs to be added to a list of unsatisfied demands.
1100 In case the received offer is for an element that is already in the set of the peer, the offer MUST BE ignored.
1101 </dd>
1102 <dt><em><xref target="messages_demand" format="title" /></em> message:</dt>
1103 <dd>
1104 The <em><xref target="messages_demand" format="title" /></em> message indicates that the
1105 passive peer received a <em><xref target="messages_offer" format="title" /></em> from
1106 the active peer. The active peer satisfies the demand of the passive peer by sending an
1107 <em><xref target="messages_elements" format="title" /></em> message if a offer request
1108 for the element was sent earlier. Otherwise, the protocol MUST be aborted, as peers must never send demands for hashes that they have never been offered.
1109 </dd>
1110 <dt><em><xref target="messages_elements" format="title" /></em> message:</dt>
1111 <dd>
1112 If element is received that was not demanded or for which
1113 the application-specific validation logic fails, the protocol
1114 MUST be aborted. Otherwise, the corresponding demand is marked
1115 as satisfied. Note that this applies only to the differential
1116 synchronization mode. In full synchronization, it is perfectly
1117 normal to receive
1118 <xref target="messages_full_element" format="title" />
1119 messages for elements that were not demanded and that might
1120 even already be known locally.
1121 </dd>
1122 <dt><em><xref target="messages_done" format="title" /></em> message:</dt>
1123 <dd>
1124 Receiving the message <em><xref target="messages_done" format="title" /></em> indicates
1125 that all demands of the passive peer have been satisfied. The active peer then changes into the
1126 <strong>Finish Closing</strong> state.
1127 If the IBF has not finished decoding and the <em><xref target="messages_done" format="title" /></em>
1128 is received, the other peer is not in compliance with the protocol and the protocol MUST be aborted.
1129 </dd>
1130 </dl>
1131 </dd>
1132 <dt><strong>Expecing IBF Last</strong></dt>
1133 <dd>
1134 <t>
1135 In this state the active peer continuously receives <em><xref target="messages_ibf" format="title" /></em>
1136 messages from the passive peer. When the last <em><xref target="messages_ibf_last" format="title" /></em> message is received,
1137 the peer changes into the <strong>Active Decoding</strong> state.
1138 </t>
1139 </dd>
1140 <dt><strong>Finish Closing</strong> / <strong>Finish Waiting</strong></dt>
1141 <dd>
1142 <t>
1143 In this states the peers are waiting for all demands to be satisfied and for the synchronisation
1144 to be completed. When all demands are satisfied the peer changes into <strong>Finished</strong> state.
1145 </t>
1146 </dd>
1147 </dl>
1148 </section>
1149 <section anchor="modeofoperation_combined-mode" numbered="true" toc="default">
1150 <name>Combined Mode</name>
1151 <t>
1152 In the <em>combined mode</em> the protocol decides between
1153 <xref target="modeofoperation_full-sync" format="title" /> and
1154 the <xref target="modeofoperation_individual-elements" format="title" />
1155 to minimize resource consumption. Typically, the protocol always runs
1156 in combined mode, but implementations MAY allow applications to force
1157 the use of one of the modes for testing. In this case, applications MUST
1158 ensure that the respective options to force a particular mode are used by
1159 both participants.
1160 </t>
1161 <t>
1162 The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set
1163 differences or if the byte-size of the elements is large. If the set difference is estimated to be large
1164 the <xref target="modeofoperation_full-sync" format="title" /> is
1165 more efficient. The exact heuristics and parameters on which the protocol decides which mode
1166 MUST be used are described in the <xref target="performance" format="title" /> section of this document.
1167 </t>
1168 <t>
1169 There are two main cases when a <xref target="modeofoperation_full-sync" format="title" />
1170 is always used.
1171 The first case is when one of the peers announces having an empty set. This is announced by setting
1172 the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0.
1173 <!-- FIXME: why not also do this if sending the elements is about as
1174 expensive as sending the SE? Should be a simple calculation. (thesis summermatter: future work) @Christian:
1175 As discussed 14.06 we let this comment in here as it is described in the thesis-->
1176 The second case is if the application requests full synchronisation explicitly.
1177 This is useful for testing and MUST NOT be used in production.
1178 </t>
1179 <t>
1180 The state diagram illustrating the combined mode can be found
1181 <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/full_state_machine.png">here</eref>.
1182 </t>
1183 </section>
1184 </section>
1185
1186 <section anchor="messages" numbered="true" toc="default">
1187 <name>Messages</name>
1188 <t>
1189 This section describes the various message formats used by the protocol.
1190 </t>
1191 <section anchor="messages_operation_request" numbered="true" toc="default">
1192 <name>Operation Request</name>
1193
1194 <section anchor="messages_operation_request_description" numbered="true" toc="default">
1195 <name>Description</name>
1196 <t>
1197 This message is the first message of the protocol and it is sent to signal to the receiving peer
1198 that the initiating peer wants to initialize a new connection.
1199 </t>
1200 <t>
1201 This message is sent in the transition between the <strong>Initiating Connection</strong> state and the <strong>Expect SE</strong> state.
1202 </t>
1203 <t>
1204 If a peer receives this message and is willing to run the protocol, it answers by sending back a <em><xref target="messages_se" format="title" /></em> message.
1205 Otherwise it simply closes the connection.
1206 </t>
1207 </section>
1208 <section anchor="messages_operation_request_structure" numbered="true" toc="default">
1209 <name>Structure</name>
1210
1211 <figure anchor="figure_operation_request">
1212 <artwork name="" type="" align="left" alt=""><![CDATA[
1213 0 8 16 24 32 40 48 56
1214 +-----+-----+-----+-----+-----+-----+-----+-----+
1215 | MSG SIZE | MSG TYPE | ELEMENT COUNT |
1216 +-----+-----+-----+-----+-----+-----+-----+-----+
1217 | APX
1218 +-----+-----+-----+-----+-----+-----+-----+-----+ /
1219 / APPLICATION DATA /
1220 / /
1221 ]]></artwork>
1222 </figure>
1223 <t>where:</t>
1224 <dl>
1225 <dt>MSG SIZE</dt>
1226 <dd>
1227 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included.
1228 </dd>
1229 <dt>MSG TYPE</dt>
1230 <dd>
1231 is the type of SETU_P2P_OPERATION_REQUEST as registered in <xref target="gana" format="title" />, in network byte order.
1232 </dd>
1233 <dt>ELEMENT COUNT</dt>
1234 <dd>
1235 is the number of the elements the requesting party has in its set, as a 32-bit unsigned integer in network byte order.
1236 </dd>
1237 <dt>APX</dt>
1238 <dd>
1239 is a SHA-512 hash that identifies the application.
1240 </dd>
1241 <dt>APPLICATION DATA</dt>
1242 <dd>
1243 is optional, variable-size application specific data that can be used
1244 by the application to decide if it would like to answer the request.
1245 </dd>
1246 </dl>
1247 </section>
1248 </section>
1249
1250 <section anchor="messages_ibf" numbered="true" toc="default">
1251 <name>IBF</name>
1252
1253 <section anchor="messages_ibf_description" numbered="true" toc="default">
1254 <name>Description</name>
1255 <t>
1256 The <xref target="messages_ibf" format="title" /> message contains a slice of the IBF.
1257 </t>
1258 <t>
1259 The <em>IBF</em> message is sent at the start of the protocol from the initiating peer in the transaction
1260 between <strong>Expect SE</strong> -> <strong>Expecting IBF Last</strong> or when the IBF does not
1261 decode and there is a role change in the transition between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>.
1262 This message is only sent if there is more than one IBF slice to be sent. If there is just
1263 one slice, then only the <xref target="messages_ibf_last" format="title" /> message is sent.
1264 </t>
1265 </section>
1266 <section anchor="messages_ibf_structure" numbered="true" toc="default">
1267 <name>Structure</name>
1268 <figure anchor="figure_ibf">
1269 <artwork name="" type="" align="left" alt=""><![CDATA[
1270 0 8 16 24 32 40 48 56
1271 +-----+-----+-----+-----+-----+-----+-----+-----+
1272 | MSG SIZE | MSG TYPE | IBF SIZE |
1273 +-----+-----+-----+-----+-----+-----+-----+-----+
1274 | OFFSET | SALT | IMCS |
1275 +-----+-----+-----+-----+-----+-----+-----+-----+
1276 | IBF-SLICE
1277 +-----+-----+-----+-----+-----+-----+-----+-----+ /
1278 / /
1279 / /
1280 ]]></artwork>
1281 </figure>
1282 <t>where:</t>
1283 <dl>
1284 <dt>MSG SIZE</dt>
1285 <dd>
1286 is a 16-bit unsigned integer in network byte orderwhichdescribes the message size in bytes with the header included.
1287 </dd>
1288 <dt>MSG TYPE</dt>
1289 <dd>
1290 the type of SETU_P2P_REQUEST_IBF as registered in <xref target="gana" format="title" /> in network byte order.
1291 </dd>
1292
1293 <dt>IBF SIZE</dt>
1294 <dd>
1295 is a 32-bit unsigned integer which signals the total number of buckets in the IBF. The minimal number of buckets is 37.
1296 </dd>
1297 <dt>OFFSET</dt>
1298 <dd>
1299 is a 32-bit unsigned integer which signals the offset of the following IBF slices in the original.
1300 </dd>
1301 <dt>SALT</dt>
1302 <dd>
1303 is a 16-bit unsigned integer that contains the IBF-salt which was used to create the
1304 IBF.
1305 </dd>
1306 <dt>IMCS</dt>
1307 <dd>
1308 is a 16-bit unsigned integer, which describes the number of bits that
1309 are required to store a single counter. This is used for the unpacking function as described
1310 in the <xref target="performance_counter_variable_size" format="title" /> section.
1311 </dd>
1312 <dt>IBF-SLICE</dt>
1313 <dd>
1314 <t>
1315 are variable numbers of slices in an array. A single slice contains multiple 64-bit IDSUMS,
1316 32-bit HASHSUMS and (1-64)-bit COUNTERS of variable size. All values are in the network byte order. The array of IDSUMS is serialized first, followed
1317 by an array of HASHSUMS. Last comes an array of unsigned COUNTERS (details of the COUNTERS encoding are described in section
1318 <xref target="performance_counter_variable_size" format="default"/>). The length of the array is
1319 defined by MIN( SIZE - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
1320 32768 divided by the BUCKET_SIZE which ranges between 97-bits when counter uses bit 1 (IMCS=1) and 160-bit when counter size uses 64 bit (IMCS=64).
1321 </t>
1322 <t>
1323 To get the IDSUM field, all IDs (computed with the IBF-salt) hitting a bucket under the map M are added up with a binary XOR operation.
1324 See <xref target="ibf_format_id_generation" format="title" /> details about ID generation.
1325 </t>
1326 <t>
1327 The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field:
1328 all HASHes are added up with a binary XOR operation.
1329 The HASH value is calculated as described in detail in section <xref target="ibf_format_HASH_calculation" format="title" />.
1330 </t>
1331 <t>
1332 The algorithm to find the correct bucket in which the ID and the HASH have to be added
1333 is described in detail in section <xref target="ibf_m" format="title" />.
1334 </t>
1335 <t>
1336 Test vectors for an implementation can be found in the <xref target="test_vectors" format="title" /> section
1337 </t>
1338 </dd>
1339 </dl>
1340 <figure anchor="figure_ibf_slice">
1341 <artwork name="" type="" align="left" alt=""><![CDATA[
1342 IBF-SLICE
1343 0 8 16 24 32 40 48 56
1344 +-----+-----+-----+-----+-----+-----+-----+-----+
1345 | IDSUMS |
1346 +-----+-----+-----+-----+-----+-----+-----+-----+
1347 | IDSUMS |
1348 +-----+-----+-----+-----+-----+-----+-----+-----+
1349 | HASHSUMS | HASHSUMS |
1350 +-----+-----+-----+-----+-----+-----+-----+-----+
1351 | COUNTERS* | COUNTERS* |
1352 +-----+-----+-----+-----+-----+-----+-----+-----+
1353 / /
1354 / /
1355* Counter size is variable. In this example the IMCS is 32 (4 bytes).
1356 ]]></artwork>
1357 </figure>
1358 </section>
1359 </section>
1360
1361 <section anchor="messages_ibf_last" numbered="true" toc="default">
1362 <name>IBF Last</name>
1363
1364 <section anchor="messages_ibf_last_description" numbered="true" toc="default">
1365 <name>Description</name>
1366 <t>
1367 This message indicates to the remote peer that this is the last slice
1368 of the Bloom filter. The receiving peer MUST check that the sizes and
1369 offsets of all received IBF slices add up to the total IBF SIZE that was
1370 given.
1371 </t>
1372 <t>
1373 Receiving this message initiates the state transmissions
1374 <strong>Expecting IBF Last</strong> -> <strong>Active Decoding</strong>,
1375 <strong>Expecting IBF</strong> -> <strong>Active Decoding</strong> and
1376 <strong>Passive Decoding</strong> -> <strong>Active Decoding</strong>. This message
1377 can initiate a peer the roll change from <strong>Active Decoding</strong> to
1378 <strong>Passive Decoding</strong>.
1379 </t>
1380 </section>
1381 <section anchor="messages_ibf_last_structure" numbered="true" toc="default">
1382 <name>Structure</name>
1383 <t>
1384 The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of
1385 the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE"
1386 which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST".
1387 </t>
1388 </section>
1389 </section>
1390
1391 <section anchor="messages_elements" numbered="true" toc="default">
1392 <name>Element</name>
1393
1394 <section anchor="messages_elements_description" numbered="true" toc="default">
1395 <name>Description</name>
1396 <t>
1397 The <em><xref target="messages_elements" format="title" /></em> message contains an element that is synchronized in the <xref target="modeofoperation_individual-elements" format="title" />
1398 and transmits a full element between the peers.
1399 </t>
1400 <t>
1401 This message is sent in the state <strong>Active Decoding</strong> and <strong>Passive Decoding</strong>
1402 as answer to a <em><xref target="messages_demand" format="title" /></em> message from the remote peer.
1403 The <em><xref target="messages_elements" format="title" /></em> message can also be received in the <strong>Finish Closing</strong> or <strong>Finish Waiting</strong>
1404 state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer. In this
1405 case the peer changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied.
1406 </t>
1407 <t>
1408 This message is exclusively used in the <xref target="modeofoperation_individual-elements" format="title" />.
1409 </t>
1410 </section>
1411 <section anchor="messages_elements_structure" numbered="true" toc="default">
1412 <name>Structure</name>
1413 <figure anchor="figure_elements">
1414 <artwork name="" type="" align="left" alt=""><![CDATA[
1415 0 8 16 24 32 40 48 56
1416 +-----+-----+-----+-----+-----+-----+-----+-----+
1417 | MSG SIZE | MSG TYPE | E TYPE | PADDING |
1418 +-----+-----+-----+-----+-----+-----+-----+-----+
1419 | E SIZE | DATA
1420 +-----+-----+ /
1421 / /
1422 / /
1423 ]]></artwork>
1424 </figure>
1425 <t>where:</t>
1426 <dl>
1427 <dt>MSG SIZE</dt>
1428 <dd>
1429 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included.
1430 </dd>
1431 <dt>MSG TYPE</dt>
1432 <dd>
1433 is SETU_P2P_ELEMENTS as registered in <xref target="gana" format="title" /> in network byte order.
1434 </dd>
1435 <dt>E TYPE</dt>
1436 <dd>
1437 is a 16-bit unsigned integer which defines the element type for
1438 the application.
1439 </dd>
1440 <dt>PADDING</dt>
1441 <dd>
1442 is 16-bit always set to zero.
1443 </dd>
1444 <dt>E SIZE</dt>
1445 <dd>
1446 is a 16-bit unsigned integer that signals the size of the elements data part.
1447 </dd>
1448 <dt>DATA</dt>
1449 <dd>
1450 is a field with variable length that contains the data of the element.
1451 </dd>
1452 </dl>
1453 </section>
1454 </section>
1455
1456 <section anchor="messages_offer" numbered="true" toc="default">
1457 <name>Offer</name>
1458
1459 <section anchor="messages_offer_description" numbered="true" toc="default">
1460 <name>Description</name>
1461 <t>
1462 The <em><xref target="messages_offer" format="title" /></em> message is an answer to an <em><xref target="messages_inquiry" format="title" /></em> message
1463 and transmits the full hash of an element that has been requested by the other peer.
1464 This full hash enables the other peer to check if the element is really missing in his set and
1465 eventually sends a <em><xref target="messages_demand" format="title" /></em> message for that element.
1466 </t>
1467 <t>
1468 The offer is sent and received only in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
1469 state.
1470 </t>
1471 <t>
1472 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />.
1473 </t>
1474 </section>
1475 <section anchor="messages_offer_structure" numbered="true" toc="default">
1476 <name>Structure</name>
1477 <figure anchor="figure_offer">
1478 <artwork name="" type="" align="left" alt=""><![CDATA[
1479 0 8 16 24 32 40 48 56
1480 +-----+-----+-----+-----+-----+-----+-----+-----+
1481 | MSG SIZE | MSG TYPE | HASH 1
1482 +-----+-----+-----+-----+-----+-----+-----+-----+
1483 ... ...
1484 +-----+-----+-----+-----+-----+-----+-----+-----+
1485 | HASH 1 | HASH 2
1486 +-----+-----+-----+-----+-----+-----+-----+-----+
1487 ... ...
1488 +-----+-----+-----+-----+-----+-----+-----+-----+
1489 | HASH 2 | HASH n
1490 +-----+-----+-----+-----+-----+-----+-----+-----+
1491 / /
1492 / /
1493 ]]></artwork>
1494 </figure>
1495 <t>where:</t>
1496 <dl>
1497 <dt>MSG SIZE</dt>
1498 <dd>
1499 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes header included.
1500 </dd>
1501 <dt>MSG TYPE</dt>
1502 <dd>
1503 is SETU_P2P_OFFER as registered in <xref target="gana" format="title" /> in network byte order.
1504 </dd>
1505 <dt>HASHES</dt>
1506 <dd>
1507 contains one or more successive SHA 512-bit hashes of the elements that are being requested with <em><xref target="messages_inquiry" format="title" /></em> messages.
1508 </dd>
1509 </dl>
1510 </section>
1511 </section>
1512
1513
1514 <section anchor="messages_inquiry" numbered="true" toc="default">
1515 <name>Inquiry</name>
1516
1517 <section anchor="messages_inquiry_description" numbered="true" toc="default">
1518 <name>Description</name>
1519 <t>
1520 The <em><xref target="messages_inquiry" format="title" /></em> message is exclusively sent by the active peer in <strong>Active Decoding</strong> state
1521 to request the full hash of an element that is missing in the active peers set. This is normally answered
1522 by the passive peer with <em><xref target="messages_offer" format="title" /></em> message.
1523 </t>
1524 <t>
1525 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />.
1526 </t>
1527 </section>
1528 <section anchor="messages_inquiry_structure" numbered="true" toc="default">
1529 <name>Structure</name>
1530 <figure anchor="figure_inquiry">
1531 <artwork name="" type="" align="left" alt=""><![CDATA[
1532 0 8 16 24 32 40 48 56
1533 +-----+-----+-----+-----+-----+-----+-----+-----+
1534 | MSG SIZE | MSG TYPE | SALT |
1535 +-----+-----+-----+-----+-----+-----+-----+-----+
1536 | IBF KEY 1 |
1537 +-----+-----+-----+-----+-----+-----+-----+-----+
1538 | IBF KEY 2 |
1539 +-----+-----+-----+-----+-----+-----+-----+-----+
1540 ... ...
1541 +-----+-----+-----+-----+-----+-----+-----+-----+
1542 | IBF KEY n |
1543 +-----+-----+-----+-----+-----+-----+-----+-----+
1544 / /
1545 / /
1546 ]]></artwork>
1547 </figure>
1548 <t>where:</t>
1549 <dl>
1550 <dt>MSG SIZE</dt>
1551 <dd>
1552 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included.
1553 </dd>
1554 <dt>MSG TYPE</dt>
1555 <dd>
1556 is SETU_P2P_INQUIRY as registered in <xref target="gana" format="title" /> in network byte order.
1557 </dd>
1558 <dt>IBF KEY</dt>
1559 <dd>
1560 contains one or more successive ibf keys (64-bit unsigned integer) for which the inquiry is sent.
1561 </dd>
1562 </dl>
1563 </section>
1564 </section>
1565
1566 <section anchor="messages_demand" numbered="true" toc="default">
1567 <name>Demand</name>
1568
1569 <section anchor="messages_demand_description" numbered="true" toc="default">
1570 <name>Description</name>
1571 <t>
1572 The <em><xref target="messages_demand" format="title" /></em> message is sent in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
1573 state. It is an answer to a received <em><xref target="messages_offer" format="title" /></em> message
1574 and is sent if the element described in the <em><xref target="messages_offer" format="title" /></em> message
1575 is missing in the peers set. In the normal workflow the answer to the <em><xref target="messages_demand" format="title" /></em> message is an
1576 <em><xref target="messages_elements" format="title" /></em> message.
1577 </t>
1578 <t>
1579 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />.
1580 </t>
1581 </section>
1582 <section anchor="messages_demand_structure" numbered="true" toc="default">
1583 <name>Structure</name>
1584 <figure anchor="figure_demand">
1585 <artwork name="" type="" align="left" alt=""><![CDATA[
1586 0 8 16 24 32 40 48 56
1587 +-----+-----+-----+-----+-----+-----+-----+-----+
1588 | MSG SIZE | MSG TYPE | HASH
1589 +-----+-----+-----+-----+
1590 ... ...
1591 +-----+-----+-----+-----+-----+-----+-----+-----+
1592 | HASH 1 | HASH 2
1593 +-----+-----+-----+-----+-----+-----+-----+-----+
1594 ... ...
1595 +-----+-----+-----+-----+-----+-----+-----+-----+
1596 | HASH 2 | HASH n
1597 +-----+-----+-----+-----+-----+-----+-----+-----+
1598 / /
1599 / /
1600 ]]></artwork>
1601 </figure>
1602 <t>where:</t>
1603 <dl>
1604 <dt>MSG SIZE</dt>
1605 <dd>
1606 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and the header is included.
1607 </dd>
1608 <dt>MSG TYPE</dt>
1609 <dd>
1610 the type of SETU_P2P_DEMAND as registered in <xref target="gana" format="title" /> in network byte order.
1611 </dd>
1612 <dt>HASH</dt>
1613 <dd>
1614 contains one or more successive SHA 512-bit hashes of the elements that are being demanded.
1615 </dd>
1616 </dl>
1617 </section>
1618 </section>
1619
1620 <section anchor="messages_done" numbered="true" toc="default">
1621 <name>Done</name>
1622
1623 <section anchor="messages_done_description" numbered="true" toc="default">
1624 <name>Description</name>
1625 <t>
1626 The <em><xref target="messages_done" format="title" /></em> message is sent when all <em><xref target="messages_demand" format="title" /></em> messages
1627 have been successfully satisfied and from the perspective of the sender the set is completely synchronized.
1628 </t>
1629 <t>
1630 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />.
1631 </t>
1632 </section>
1633 <section anchor="messages_done_structure" numbered="true" toc="default">
1634 <name>Structure</name>
1635 <figure anchor="figure_done">
1636 <artwork name="" type="" align="left" alt=""><![CDATA[
1637 0 8 16 24 32 40 48 56
1638 +-----+-----+-----+-----+-----+-----+-----+-----+
1639 | MSG SIZE | MSG TYPE |
1640 +-----+-----+-----+-----+-----+-----+-----+-----+
1641
1642 ]]></artwork>
1643 </figure>
1644 <t>where:</t>
1645 <dl>
1646 <dt>MSG SIZE</dt>
1647 <dd>
1648 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 4 for this message type.
1649 </dd>
1650 <dt>MSG TYPE</dt>
1651 <dd>
1652 is SETU_P2P_DONE as registered in <xref target="gana" format="title" /> in network byte order.
1653 </dd>
1654 <dt>FINAL CHECKSUM</dt>
1655 <dd>
1656 a SHA-512 bit hash of the full set after synchronization. This should ensure that the sets are identical in the end!
1657 </dd>
1658 </dl>
1659 </section>
1660 </section>
1661
1662 <section anchor="messages_full_done" numbered="true" toc="default">
1663 <name>Full Done</name>
1664
1665 <section anchor="messages_full_done_description" numbered="true" toc="default">
1666 <name>Description</name>
1667 <t>
1668 The <em><xref target="messages_full_done" format="title" /></em> message is sent in the <xref target="modeofoperation_full-sync" format="title" />
1669 to signal that all remaining elements of the set have been sent. The message is received and sent in the
1670 <strong>Full Sending</strong> and in the <strong>Full Receiving</strong> state. When the <em><xref target="messages_full_done" format="title" /></em> message is received
1671 in <strong>Full Sending</strong> state the peer changes directly into <strong>Finished</strong> state. In
1672 <strong>Full Receiving</strong> state receiving a <em><xref target="messages_full_done" format="title" /></em> message initiates the sending of
1673 the remaining elements that are missing in the set of the other peer.
1674 </t>
1675 </section>
1676 <section anchor="messages_full_done_structure" numbered="true" toc="default">
1677 <name>Structure</name>
1678 <figure anchor="figure_full_done">
1679 <artwork name="" type="" align="left" alt=""><![CDATA[
1680 0 8 16 24 32 40 48 56
1681 +-----+-----+-----+-----+-----+-----+-----+-----+
1682 | MSG SIZE | MSG TYPE | FINAL CHECKSUM
1683 +-----+-----+-----+-----+
1684 / /
1685 / /
1686 ]]></artwork>
1687 </figure>
1688 <t>where:</t>
1689 <dl>
1690 <dt>MSG SIZE</dt>
1691 <dd>
1692 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 4 for this message type.
1693 </dd>
1694 <dt>MSG TYPE</dt>
1695 <dd>
1696 the type of SETU_P2P_FULL_DONE as registered in <xref target="gana" format="title" /> in network byte order.
1697 </dd>
1698 <dt> FINAL CHECKSUM</dt>
1699 <dd>
1700 a SHA-512 bit hash of the full set after synchronization. This should ensure that the sets are identical in the end!
1701 </dd>
1702 </dl>
1703 </section>
1704 </section>
1705
1706 <section anchor="messages_request_full" numbered="true" toc="default">
1707 <name>Request Full</name>
1708
1709 <section anchor="messages_request_full_description" numbered="true" toc="default">
1710 <name>Description</name>
1711 <t>
1712 The <em><xref target="messages_request_full" format="title" /></em> message is sent by the initiating peer in <strong>Expect SE</strong> state to the receiving peer, if
1713 the operation mode "<xref target="modeofoperation_full-sync" format="title" />" is
1714 determined to be the superior <xref target="modeofoperation" format="title" /> and that it is the better choice that
1715 the other peer sends his elements first. The initiating peer changes after sending the <em><xref target="messages_request_full" format="title" /></em> message into
1716 <strong>Full Receiving</strong> state.
1717 </t>
1718 <t>
1719 The receiving peer receives the <em><xref target="messages_request_full" format="title" /></em> message in the <strong>Expecting IBF</strong>, afterwards the receiving peer
1720 starts sending his complete set in <xref target="messages_full_element" format="title" /> messages to the initiating peer.
1721 </t>
1722 </section>
1723 <section anchor="messages_request_full_structure" numbered="true" toc="default">
1724 <name>Structure</name>
1725 <figure anchor="figure_request_full">
1726 <artwork name="" type="" align="left" alt=""><![CDATA[
1727 0 8 16 24 32 40 48 56
1728 +-----+-----+-----+-----+-----+-----+-----+-----+
1729 | MSG SIZE | MSG TYPE | REMOTE SET DIFF |
1730 +-----+-----+-----+-----+-----+-----+-----+-----+
1731 | REMOTE SET SIZE | LOCAL SET DIFF |
1732 +-----+-----+-----+-----+-----+-----+-----+-----+
1733 ]]></artwork>
1734 </figure>
1735 <t>where:</t>
1736 <dl>
1737 <dt>MSG SIZE</dt>
1738 <dd>
1739 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 16 for this message type.
1740 </dd>
1741 <dt>MSG TYPE</dt>
1742 <dd>
1743 is SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order.
1744 </dd>
1745 <dt>REMOTE SET DIFF</dt>
1746 <dd>
1747 is a 32-bit unsigned integer in network byte order, which represents the remote (from the perspective of the
1748 sending peer) set difference calculated with strata estimator.
1749 </dd>
1750 <dt>REMOTE SET SIZE</dt>
1751 <dd>
1752 is a 32-bit unsigned integer in network byte order, which represents the total remote
1753 (from the perspective of the sending peer) set size.
1754 </dd>
1755 <dt>LOCAL SET DIFF</dt>
1756 <dd>
1757 is a 32-bit unsigned integer in network byte order, which represents the local
1758 (from the perspective of the sending peer) set difference calculated with strata estimator.
1759 </dd>
1760 </dl>
1761 </section>
1762 </section>
1763
1764
1765 <section anchor="messages_send_full" numbered="true" toc="default">
1766 <name>Send Full</name>
1767
1768 <section anchor="messages_send_full_description" numbered="true" toc="default">
1769 <name>Description</name>
1770 <t>
1771 The <em><xref target="messages_send_full" format="title" /></em> message is sent by the initiating peer in <strong>Expect SE</strong> state to the receiving peer if
1772 the operation mode "<xref target="modeofoperation_full-sync" format="title" />" is
1773 determined as superior <xref target="modeofoperation" format="title" /> and that it is the better choice that the
1774 peer sends his elements first. The initiating peer changes after sending the <em><xref target="messages_request_full" format="title" /></em> message into
1775 <strong>Full Sending</strong> state.
1776 </t>
1777 <t>
1778 The receiving peer receives the <em><xref target="messages_send_full" format="title" /></em> message in the <strong>Expecting IBF</strong> state, afterwards the receiving peer
1779 changes into <strong>Full Receiving</strong> state and expects to receive the set of the remote peer.
1780 </t>
1781 </section>
1782 <section anchor="messages_send_full_structure" numbered="true" toc="default">
1783 <name>Structure</name>
1784 <figure anchor="figure_send_full">
1785 <artwork name="" type="" align="left" alt=""><![CDATA[
1786 0 8 16 24 32 40 48 56
1787 +-----+-----+-----+-----+-----+-----+-----+-----+
1788 | MSG SIZE | MSG TYPE | REMOTE SET DIFF |
1789 +-----+-----+-----+-----+-----+-----+-----+-----+
1790 | REMOTE SET SIZE | LOCAL SET DIFF |
1791 +-----+-----+-----+-----+-----+-----+-----+-----+
1792 ]]></artwork>
1793 </figure>
1794 <t>where:</t>
1795 <dl>
1796 <dt>MSG SIZE</dt>
1797 <dd>
1798 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 16 for this message type.
1799 </dd>
1800 <dt>MSG TYPE</dt>
1801 <dd>
1802 is SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order.
1803 </dd>
1804 <dt>REMOTE SET DIFF</dt>
1805 <dd>
1806 is a 32-bit unsigned integer in network byte order, which represents the remote (from the perspective of the sending peer)
1807 set difference calculated with strata estimator.
1808 </dd>
1809 <dt>REMOTE SET SIZE</dt>
1810 <dd>
1811 is a 32-bit unsigned integer in network byte order, which represents the total remote (from the perspective
1812 of the sending peer) set size.
1813 </dd>
1814 <dt>LOCAL SET DIFF</dt>
1815 <dd>
1816 is a 32-bit unsigned integer in network byte order, which represents the local (from the perspective of the sending peer)
1817 set difference calculated with strata estimator.
1818 </dd>
1819 </dl>
1820 </section>
1821 </section>
1822
1823
1824 <section anchor="messages_se" numbered="true" toc="default">
1825 <name>Strata Estimator</name>
1826
1827 <section anchor="messages_se_description" numbered="true" toc="default">
1828 <name>Description</name>
1829 <t>
1830 The strata estimator is sent by the receiving peer at the start of the protocol, right after the
1831 <xref target="messages_operation_request" format="title" /> message has been received.
1832 </t>
1833 <t>
1834 The strata estimator is used to estimate the difference between the two sets as described in section <xref target="se" format="title" />.
1835 </t>
1836 <t>
1837 When the initiating peer receives the strata estimator, the peer decides which <xref target="modeofoperation" format="title" /> to use
1838 for the synchronisation. Depending on the size of the set difference and the <xref target="modeofoperation" format="title" /> the initiating peer
1839 changes into <strong>Full Sending</strong>, <strong>Full Receiving</strong> or <strong>Passive Decoding</strong> state.
1840 </t>
1841 <t>
1842 The <em><xref target="messages_se" format="title" /></em> message can contain one, two, four or eight strata estimators with different salts, depending on the initial size of the sets.
1843 More details can be found in section <xref target="performance_multi_se" format="title" />.
1844 </t>
1845 <t>
1846 The IBFs in a strata estimator always have 79 buckets. The reason why can be found in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in section 3.4.2.
1847 </t>
1848 <!-- Give a more precise reference into the thesis for this, do not cite the whole thesis! -->
1849 </section>
1850 <section anchor="messages_se_structure" numbered="true" toc="default">
1851 <name>Structure</name>
1852 <figure anchor="figure_se">
1853 <artwork name="" type="" align="left" alt=""><![CDATA[
1854 0 8 16 24 32 40 48 56
1855 +-----+-----+-----+-----+-----+-----+-----+-----+
1856 | MSG SIZE | MSG TYPE | SEC | SETSIZE
1857 +-----+-----+-----+-----+-----+-----+-----+-----+
1858 SETSIZE | SE-SLICES
1859 +-----+-----+-----+-----+
1860 / /
1861 / /
1862 ]]></artwork>
1863 </figure>
1864 <t>where:</t>
1865 <dl>
1866 <dt>MSG SIZE</dt>
1867 <dd>
1868 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included.
1869 </dd>
1870 <dt>MSG TYPE</dt>
1871 <dd>
1872 is SETU_P2P_SE as registered in <xref target="gana" format="title" /> in network byte order.
1873 </dd>
1874 <dt>SEC</dt>
1875 <dd>
1876 is a 8-bit unsigned integer in networkf byte order, which indicates how many strata estimators
1877 with different salts are attached to the message. Valid values are 1,2,4 or 8, more details can be found
1878 in the section <xref target="performance_multi_se" format="title" />.
1879 </dd>
1880 <dt>SETSIZE</dt>
1881 <dd>
1882 is a 64-bit unsigned integer that is defined by the size of the set the SE is
1883 </dd>
1884 <dt>SE-SLICES</dt>
1885 <dd>
1886 <t>
1887 are variable numbers of slices in an array. A slice can contain one or more Strata Estimators which
1888 contain multiple IBFs as described in IBF-SLICES in <xref target="messages_ibf_structure" format="default"/>.
1889 A SE slice can contain one to eight Strata Estimators which contain 32 (Defined as Constant SE_STRATA_COUNT) IBFs. Every IBF in
1890 a SE contains 79 Buckets.
1891 </t>
1892 <t>
1893 The different SEs are built as in detail described in <xref target="performance_multi_se" format="default"/>.
1894 Simply put, the IBFs in each SE are serialized as described in <xref target="messages_ibf_structure" format="default"/> starting with the highest stratum.
1895 Then the created SEs are appended one after the other starting with the SE that was created with a salt of zero.
1896
1897 </t>
1898 </dd>
1899 </dl>
1900 <figure anchor="figure_se_slice">
1901 <artwork name="" type="" align="left" alt=""><![CDATA[
1902 SE-SLICE
1903 0 8 16 24 32 40 48 56
1904 +-----+-----+-----+-----+-----+-----+-----+-----+
1905 | SE_1->IBF_1 |
1906 +-----+-----+-----+-----+-----+-----+-----+-----+
1907 ... ...
1908 +-----+-----+-----+-----+-----+-----+-----+-----+
1909 | SE_1->IBF_30 |
1910 +-----+-----+-----+-----+-----+-----+-----+-----+
1911 | SE_2->IBF_1 |
1912 +-----+-----+-----+-----+-----+-----+-----+-----+
1913 ... ...
1914 / /
1915 / /
1916 ]]></artwork>
1917 </figure>
1918 </section>
1919 </section>
1920
1921 <section anchor="messages_sec" numbered="true" toc="default">
1922 <name>Strata Estimator Compressed</name>
1923
1924 <section anchor="messages_sec_description" numbered="true" toc="default">
1925 <name>Description</name>
1926 <t>
1927 The Strata Estimator can be compressed with gzip as
1928 described in <xref target="RFC1951"/> to improve performance. This can be recognized
1929 by the different message type number from <xref target="gana" format="title" />.
1930 </t>
1931 <section anchor="messages_sec_structure" numbered="true" toc="default">
1932 <name>Structure</name>
1933 <t>
1934 The key difference between the compressed and the uncompressed Strata Estimator is that the
1935 SE slices are compressed with gzip (<xref target="RFC1951"/>) in the compressed SE.
1936 But the header remains uncompressed with both.
1937 </t>
1938 <t>
1939 Since the content of the message is the same as the uncompressed Strata Estimator, the details
1940 are not repeated here. For details see section <xref target="messages_se" format="counter" />.
1941 </t>
1942 </section>
1943 </section>
1944 </section>
1945
1946 <section anchor="messages_full_element" numbered="true" toc="default">
1947 <name>Full Element</name>
1948
1949 <section anchor="messages_full_element_description" numbered="true" toc="default">
1950 <name>Description</name>
1951 <t>
1952 The <em><xref target="messages_full_element" format="title" /></em> message is the equivalent of the <xref target="messages_elements" format="title" /> message in
1953 the <xref target="modeofoperation_full-sync" format="title" />. It contains a complete element that is missing
1954 in the set of the peer that receives this message.
1955 </t>
1956 <t>
1957 The <em><xref target="messages_full_element" format="title" /></em> message is exclusively sent in the transitions <strong>Expecting IBF</strong> -> <strong>Full Receiving</strong> and
1958 <strong>Full Receiving</strong> -> <strong>Finished</strong>. The message is only received in the <strong> Full Sending</strong> and
1959 <strong>Full Receiving</strong> state.
1960 </t>
1961 <t>
1962 After the last <em><xref target="messages_full_element" format="title" /></em> message has been sent, the <em><xref target="messages_full_done" format="title" /></em> message
1963 is sent to conclude the full synchronisation of the element sending peer.
1964 </t>
1965 </section>
1966 <section anchor="messages_full_element_structure" numbered="true" toc="default">
1967 <name>Structure</name>
1968 <!-- MAYBE just refer to the "ELEMENT" section on structure and only
1969 note the different MSG TYPE here? -->
1970 <figure anchor="figure_full_element">
1971 <artwork name="" type="" align="left" alt=""><![CDATA[
1972 0 8 16 24 32 40 48 56
1973 +-----+-----+-----+-----+-----+-----+-----+-----+
1974 | MSG SIZE | MSG TYPE | E TYPE | PADDING |
1975 +-----+-----+-----+-----+-----+-----+-----+-----+
1976 | SIZE | AE TYPE | DATA
1977 +-----+-----+-----+-----+
1978 / /
1979 / /
1980 ]]></artwork>
1981 </figure>
1982 <t>where:</t>
1983 <dl>
1984 <dt>MSG SIZE</dt>
1985 <dd>
1986 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included.
1987 </dd>
1988 <dt>MSG TYPE</dt>
1989 <dd>
1990 is SETU_P2P_REQUEST_FULL_ELEMENT as registered in <xref target="gana" format="title" /> in network byte order.
1991 </dd>
1992 <dt>E TYPE</dt>
1993 <dd>
1994 is a 16-bit unsigned integer which defines the element type for
1995 the application.
1996 </dd>
1997 <dt>PADDING</dt>
1998 <dd>
1999 is 16-bit always set to zero
2000 </dd>
2001 <dt>E SIZE</dt>
2002 <dd>
2003 is a 16-bit unsigned integer that signals the size of the elements data part.
2004 </dd>
2005 <dt>AE TYPE</dt>
2006 <dd>
2007 is a 16-bit unsigned integer that is needed to identify
2008 the type of element that is in the data field
2009 </dd>
2010 <dt>DATA</dt>
2011 <dd>
2012 is a field with variable length that contains the data of the element.
2013 </dd>
2014 </dl>
2015 </section>
2016 </section>
2017 </section>
2018
2019 <section anchor="performance" numbered="true" toc="default">
2020 <name>Performance Considerations</name>
2021 <section anchor="performance_formulas" numbered="true" toc="default">
2022 <name>Formulas</name>
2023 <section anchor="performance_formulas_operationmode" numbered="true" toc="default">
2024 <name>Operation Mode</name>
2025 <t>
2026 The decision which <xref target="modeofoperation" format="title"/> is used is described by the following code.
2027 More detailed explanations motivating the design can be found in the accompanying thesis in section 4.5.3.<xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>
2028 </t>
2029 <t>
2030 The function takes as input the average element size, the local set size, the remote set size, the set differences as estimated from the strata estimator for both the local and remote sets,
2031 and the bandwidth/roundtrip tradeoff.
2032 The function returns the exact <xref target="modeofoperation" format="title"/> that is predicted to be best as output: FULL_SYNC_REMOTE_SENDING_FIRST
2033 if it is likely cheapest that the other peer transmits his elements first, FULL_SYNC_LOCAL_SENDING_FIRST
2034 if it is likely cheapest that the elements are transmitted to the other peer directly, and
2035 DIFFERENTIAL_SYNC if the differential synchronisation is likely cheapest.
2036 </t>
2037 <t>
2038 The constant IBF_BUCKET_NUMBER_FACTOR is always 2 and IBF_MIN_SIZE is 37.
2039 The method for deriving
2040 this can be found in the IBF parameter study in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in section 4.5.2.
2041 </t>
2042 <figure anchor="performance_formulas_operationmode_code">
2043 <artwork name="" type="" align="left" alt=""><![CDATA[
2044# CONSTANTS:
2045# IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if decoding fails
2046# RTT_MIN_FULL = 2: Minimal round trips used for full syncronisation (always 2 or 2.5)
2047# IBF_MIN_SIZE = 37: The minimal size of an IBF
2048# MAX_BUCKETS_PER_MESSAGE: Custom value depending on the underlying protocol
2049# INPUTS:
2050# avg_es: The average element size
2051# lss: The initial local set size
2052# rss: The remote set size
2053# lsd: the estimated local set difference calculated by the SE
2054# rsd: the estimated remote set difference calculated by the SE
2055# rtt: the tradeoff between round trips and bandwidth
2056# OUTPUT:
2057# FULL_SYNC_REMOTE_SENDING_FIRST, FULL_SYNC_LOCAL_SENDING_FIRST or DIFFERENTIAL_SYNC
2058
2059FUNCTION decide_operation_mode(avg_es,
2060 lss,
2061 rss,
2062 lsd
2063 rsd,
2064 rtt)
2065
2066 # If a set size is zero always do full sync
2067 # TODO: check if these two conditions are
2068 # actually meaningful, I suspect even without
2069 # this check at the beginning the logic below
2070 # should always yield the same result for these
2071 # extreme cases, allowing us to omit this code.
2072 IF 0 == rss THEN
2073 RETURN FULL_SYNC_LOCAL_SENDING_FIRST
2074 END IF
2075 IF 0 == lss THEN
2076 RETURN FULL_SYNC_REMOTE_SENDING_FIRST
2077 END IF
2078
2079 # Estimate required transferred bytes when doing a full synchronisation
2080 # and transmitting local set first.
2081 estimated_total_diff = rsd + lsd
2082 total_elements_local_send = rsd + lss
2083 cost_local_full_sync = avg_es * total_elements_local_send
2084 + total_elements_local_send * sizeof(ELEMENT_MSG_HEADER)
2085 + sizeof(FULL_DONE_MSG_HEADER) * 2
2086 + RTT_MIN_FULL * rtt
2087
2088 # Estimate required transferred bytes when doing a full synchronisation
2089 # and transmitting remote set first.
2090 total_elements_remote_send = lsd + rss
2091 cost_remote_full_sync = avg_es * total_elements_remote_send
2092 + total_elements_remote_send * sizeof(ELEMENT_MSG_HEADER)
2093 + sizeof(FULL_DONE_MSG_HEADER) * 2
2094 + (RTT_MIN_FULL + 0.5) * rtt
2095 + sizeof(REQUEST_FULL_MSG)
2096
2097 # Estimate required transferred bytes when doing a differential synchronisation
2098
2099 # Estimate messages required to transfer IBF
2100 ibf_bucket_count = estimated_total_diff * IBF_BUCKET_NUMBER_FACTOR
2101 IF ibf_bucket_count <= IBF_MIN_SIZE THEN
2102 ibf_bucket_count = IBF_MIN_SIZE
2103 END IF
2104 ibf_message_count = ceil (ibf_bucket_count / MAX_BUCKETS_PER_MESSAGE)
2105
2106 # Estimate average counter length with variable counter
2107 estimated_counter_bits = MIN (2 * LOG2(lss / ibf_bucket_count),
2108 LOG2(lss))
2109 estimated_counter_bytes = estimated_counter_bits / 8
2110
2111 # Sum up all messages required to do differential synchronisation
2112 ibf_bytes = sizeof(IBF_MESSAGE) * ibf_message_count
2113 + ibf_bucket_count * sizeof(IBF_KEY)
2114 + ibf_bucket_count * sizeof(IBF_KEYHASH)
2115 + ibf_bucket_count * estimated_counter_bytes
2116 # Add 20% overhead to cover IBF retries due to decoding failures
2117 total_ibf_bytes = ibf_bytes * 1.2
2118
2119 # Estimate other message sizes to be transfered in diff sync
2120 # Note that we simplify by adding the header each time;
2121 # if the implementation combines multiple INQUIRY/DEMAND/OFFER
2122 # requests in one message, the bandwidth would be lower.
2123 done_size = sizeof(DONE_HEADER)
2124 element_size = (avg_es + sizeof(ELEMENT_MSG_HEADER))
2125 * estimated_total_diff
2126 inquery_size = (sizeof(IBF_KEY) + sizeof(INQUERY_MSG_HEADER))
2127 * estimated_total_diff
2128 demand_size = (sizeof(HASHCODE) + sizeof(DEMAND_MSG_HEADER))
2129 * estimated_total_diff
2130 offer_size = (sizeof(HASHCODE) + sizeof(OFFER_MSG_HEADER))
2131 * estimated_total_diff
2132
2133 # Estimate total cost
2134 diff_cost = element_size + done_size + inquery_size
2135 + demand_size + offer_size + total_ibf_bytes
2136 + DIFFERENTIAL_RTT_MEAN * rtt
2137
2138 # Decide for a optimal mode of operation
2139 full_cost_min = MIN (cost_local_full_sync,
2140 cost_remote_full_sync)
2141 IF full_cost_min < diff_cost THEN
2142 IF cost_remote_full_sync > cost_local_full_sync THEN
2143 RETURN FULL_SYNC_LOCAL_SENDING_FIRST
2144 ELSE
2145 RETURN FULL_SYNC_REMOTE_SENDING_FIRST
2146 END IF
2147 ELSE
2148 RETURN DIFFERENTIAL_SYNC
2149 END IF
2150END FUNCTION
2151 ]]></artwork>
2152 </figure>
2153 </section>
2154 <section anchor="performance_formula_ibf_parameters" numbered="true" toc="default">
2155 <name>IBF Size</name>
2156 <t>
2157 The functions, described in this section, calculate a good initial size (initial_ibf_size)
2158 and in case of decoding failure, a good next IBF size (get_next_ibf_size).
2159 </t>
2160 <t>
2161 These algorithms are described and justified in more details in
2162 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in the parameter study in
2163 section 3.5.2, the max IBF counter in section 3.10 and the Improved IBF size in section 3.11.
2164 </t>
2165 <figure anchor="performance_formula_ibf_parameters_code">
2166 <artwork name="" type="" align="left" alt=""><![CDATA[
2167# CONSTANTS:
2168# IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if decoding fails
2169# Inputs:
2170# sd: Estimated set difference
2171# Output:
2172# next_size: Size of the initial IBF
2173
2174FUNCTION initial_ibf_size(sd)
2175 # We do not go below 37, as 37 buckets should
2176 # basically always be below one MTU, so there is
2177 # little to be gained, while a smaller IBF would
2178 # increase the chance of a decoding failure.
2179 RETURN MAX(37, IBF_BUCKET_NUMBER_FACTOR * sd);
2180END FUNCTION
2181
2182# CONSTANTS:
2183# IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if decoding fails
2184# Inputs:
2185# de: Number of elements that have been successfully decoded
2186# lis: The number of buckets of the last IBF
2187# Output:
2188# number of buckets for the next IBF
2189
2190FUNCTION get_next_ibf_size(de, lis)
2191 next_size = IBF_BUCKET_NUMBER_FACTOR * (lis - de)
2192 # The MAX operation here also ensures that the
2193 # result is positive.
2194 RETURN MAX(37, next_size);
2195END FUNCTION
2196 ]]></artwork>
2197 </figure>
2198 </section>
2199 <section anchor="performance_num_buck_hash" numbered="true" toc="default">
2200 <name>Number of Buckets an Element is Hashed into</name>
2201 <t>
2202 The number of buckets an element is hashed to is hardcoded to 3. Reasoning and
2203 justification can be found in
2204 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in the
2205 IBF parameter performance study in section 4.5.2.
2206 </t>
2207 </section>
2208 </section>
2209 <section anchor="performance_counter_variable_size" numbered="true" toc="default">
2210 <name>Variable Counter Size</name>
2211 <t>
2212 The number of bits required to represent the counters of an IBF varies
2213 due to different parameters as described in section 3.2 of
2214 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>.
2215 Therefore, a packing algorithm has been implemented.
2216 This algorithm encodes the IBF counters in their optimal bit-width
2217 and thus minimizes the bandwidth needed to transmit the IBF.
2218 </t>
2219 <t>
2220 A simple algorithm is used for the packing.
2221 In a first step it is determined, which is the largest counter.
2222 The the base 2 logarithm then determines how many bits are needed to store it.
2223 In a second step for every counter of every bucket, the counter
2224 is stored using this many bits. The resulting bit sequence is then simply concatenated.
2225 </t>
2226 <t>
2227 Three individual functions are used for this purpose.
2228 The first one is a function that iterates over each bucket of
2229 the IBF to get the maximum counter in the IBF. The second function
2230 packs the counters of the IBF, and the third function that unpacks the counters.
2231 </t>
2232 <t>
2233 As a plausibly check to prevent the byzantine upper bound
2234 checks in <xref target="security_generic_functions_check_byzantine_boundaries" format="default"/>
2235 to fail, implementations must ensure that the
2236 estimates of the set size difference added together
2237 never exceed the set byzantine upper bound. This
2238 could for example happen in case the strata estimator
2239 overestimates the set difference.
2240 </t>
2241 <figure anchor="performance_counter_variable_size_code">
2242 <artwork name="" type="" align="left" alt=""><![CDATA[
2243
2244# INPUTS:
2245# ibf: The IBF
2246# OUTPUTS:
2247# returns: Minimal amount of bits required to store the counter
2248
2249FUNCTION ibf_get_max_counter(ibf)
2250 max_counter=1 # convince static analysis that we never take log2(0)
2251 FOR bucket IN ibf DO
2252 IF bucket.counter > max_counter THEN
2253 max_counter = bucket.counter
2254 END IF
2255 END FOR
2256 # next bigger discrete number of the binary logarithm of the max counter
2257 RETURN CEILING( LOG2( max_counter ) )
2258END FUNCTION
2259
2260# INPUTS:
2261# ibf: The IBF
2262# offset: The offset which defines the starting point from which bucket the
2263# pack operation starts
2264# count: The number of buckets in the array that will be packed
2265# OUTPUTS:
2266# returns: A byte array of packed counters to send over the network
2267
2268# INPUTS:
2269# ibf: The IBF
2270# offset: The offset which defines the starting point from which bucket the
2271# pack operation starts
2272# count: The number of buckets in the array that will be packed
2273# OUTPUTS:
2274# returns: A byte array of packed counters to send over the network
2275
2276FUNCTION pack_counter(ibf, offset, count)
2277 counter_bytes = ibf_get_max_counter(ibf)
2278 store_bits = 0
2279 store = 0
2280 byte_ctr = 0
2281 buffer=[]
2282
2283 FOR bucket IN ibf[offset] TO ibf[count] DO
2284 counter = bucket.counter
2285 byte_len = counter_bytes
2286
2287 WHILE byte_len + store_bits < 8 DO
2288 bit_to_shift = 0
2289
2290 IF store_bits > 0 OR byte_len > 8 THEN
2291 bit_free = 8 - store_bits
2292 bit_to_shift = byte_len - bit_free
2293 store = store << bit_free
2294 END IF
2295 buffer[byte_ctr] = (( counter >> bit_to_shift) | store) & 0xFF
2296 byte_ctr = byte_ctr + 1
2297 byte_len -= 8 - store_bits
2298 counter = counter & ((1 << byte_len) - 1)
2299 store = 0
2300 store_bits = 0
2301 END WHILE
2302 store = (store << byte_len) | counter
2303 store_bits = store_bits + byte_len
2304 byte_len = 0
2305 END FOR
2306
2307 # Write the last partial packed byte to the buffer
2308 IF store_bits > 0 THEN
2309 buffer[byte_ctr] = store << (8 - store_bits)
2310 byte_ctr = byte_ctr + 1
2311 END IF
2312
2313 RETURN buffer
2314FUNCTION END
2315
2316# INPUTS:
2317# ibf: The IBF
2318# offset: The offset which defines the starting point from which bucket
2319 the packed operation starts
2320# count: The number of buckets in the array that will be packed
2321# cbl: The bit length of the counter can be found in the
2322 ibf message in the ibf_counter_bit_length field
2323# pd: A byte array which contains the data packed with the pack_counter
2324 function
2325# OUTPUTS:
2326# returns: Nothing because the unpacked counter is saved directly into the IBF
2327
2328FUNCTION unpack_counter(ibf, offset, count, cbl, pd)
2329 ibf_bucket_ctr = 0
2330 store = 0
2331 store_bits = 0
2332 byte_ctr = 0
2333
2334 WHILE TRUE
2335 byte_to_read = pd[byte_ctr]
2336 bit_to_pack_left = 8
2337 byte_ctr++
2338
2339 WHILE bit_to_pack_left >= 0 DO
2340
2341 # Prevent packet from reading more than required
2342 IF ibf_bucket_ctr > (count - 1) THEN
2343 RETURN
2344 END IF
2345
2346 IF store_bits + bit_to_pack_left >= cbl THEN
2347 bit_use = cbl - store_bits
2348
2349 IF store_bits > 0 THEN
2350 store = store << bit_use
2351 END IF
2352 bytes_to_shift = bit_to_pack_left - bit_use
2353 counter_partial = byte_to_read >> bytes_to_shift
2354 store = store | counter_partial
2355 ibf.counter[ibf_bucket_ctr + offset] = store
2356 byte_to_read = byte_to_read & (( 1 << bytes_to_shift ) - 1)
2357
2358 bit_to_pack_left -= bit_use
2359 ibf_bucket_ctr++
2360 store = 0
2361 store_bits = 0
2362 ELSE
2363 store_bits = store_bits + bit_to_pack_left
2364
2365 IF 0 == store_bits THEN
2366 store = byte_to_read
2367 ELSE
2368 store = store << bit_to_pack_left
2369 store = store | byte_to_read
2370 END IF
2371 BREAK
2372 END IF
2373 END WHILE
2374 END WHILE
2375END FUNCTION
2376 ]]></artwork>
2377 </figure>
2378 </section>
2379 <section anchor="performance_multi_se" numbered="true" toc="default">
2380 <name>Multi Strata Estimators</name>
2381 <t>
2382 In order to improve the precision of the estimates not only one strata estimator
2383 is transmitted for larger sets. One, two, four or eight strata estimators can be
2384 transferred. Transmitting multiple strata estimators has the disadvantage that
2385 additional bandwidth will be used, so despite the higher precision, it is not
2386 always optimal to transmit eight strata estimators. Therefore, the following
2387 rules are used, which are based on the average element size multiplied by the number
2388 of elements in the set. This value is denoted as "b" in the table:
2389 </t>
2390 <dl>
2391 <dt>SEs</dt>
2392 <dd>Rule</dd>
2393 <dt>1</dt>
2394 <dd>b &lt; 68kb</dd>
2395 <dt>2</dt>
2396 <dd>b &gt; 68kb</dd>
2397 <dt>4</dt>
2398 <dd>b &gt; 269kb</dd>
2399 <dt>8</dt>
2400 <dd>b &gt; 1'077kb</dd>
2401 </dl>
2402 <t>
2403 When creating multiple strata estimators, it is important to salt the keys for the IBFs in the strata
2404 estimators differently, using the following bit rotation based salting method:
2405 </t>
2406 <figure anchor="performance_multi_se_salting_code">
2407 <artwork name="" type="" align="left" alt=""><![CDATA[
2408
2409# Inputs:
2410# value: Input value to salt (needs to be 64 bit unsigned)
2411# salt: Salt to salt value with; Should always be ascending and start at zero
2412 i.e. SE1 = Salt 0; SE2 = Salt 1 etc.
2413# Output:
2414# Returns: Salted value
2415
2416FUNCTION se_key_salting(value, salt)
2417 s = (salt * 7) modulo 64
2418 RETURN (value >> s) | (value << (64 - s))
2419END FUNCTION
2420
2421 ]]></artwork>
2422 </figure>
2423 <t>
2424 Performance study and details about the reasoning for the used methods can be found in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in section
2425 3.4.1 under the title "Added Support for Multiple Strata Estimators".
2426 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>
2427 </t>
2428 </section>
2429 </section>
2430
2431
2432 <section anchor="security" numbered="true" toc="default">
2433 <name>Security Considerations</name>
2434 <t>
2435 The security considerations in this document focus mainly on the security
2436 goal of availability. The primary goal of the protocol is to prevent an attacker from
2437 wasting computing and network resources of the attacked peer.
2438 </t>
2439 <t>
2440 To prevent denial of service attacks, it is vital to check that peers can only
2441 reconcile a set once in a predefined time span. This is a predefined value and needs
2442 to be adapted per use basis. To enhance reliability and to allow
2443 for legitimate failures, say due to network connectivity issues,
2444 applications SHOULD define a threshold for
2445 the maximum number of failed reconciliation attempts in a given time period.
2446 </t>
2447 <t>
2448 It is important to close and purge connections after a given timeout
2449 to prevent draining attacks.
2450 </t>
2451 <section anchor="security_generic_functions" numbered="true" toc="default">
2452 <name>General Security Checks</name>
2453 <t>
2454 In this section general checks are described which should be applied to multiple states.
2455 </t>
2456
2457 <section anchor="security_generic_input_validation" numbered="true" toc="default">
2458 <name>Input validation</name>
2459 <t>
2460 The format of all received messages needs to be properly validated. This is important to prevent many
2461 attacks on the code. The application data MUST be validated by the application using
2462 the protocol not by the implementation of the protocol.
2463 In case the format validation fails the set operation MUST be terminated.
2464 </t>
2465 </section>
2466 <section anchor="security_generic_functions_check_byzantine_boundaries" numbered="true" toc="default">
2467 <name>Byzantine Boundaries</name>
2468 <t>
2469 To restrict an attacker there should be an upper and lower bound defined and checked
2470 at the beginning of the protocol, based
2471 on prior knowledge, for the number of elements.
2472 The lower byzantine bound can be, for example, the number of elements the
2473 other peer had in his set at the last contact.
2474 The upper byzantine bound can be a practical maximum e.g. the number
2475 of e-voting votes, in Switzerland.
2476 </t>
2477 <figure anchor="security_generic_functions_missing_message_code">
2478 <artwork name="" type="" align="left" alt=""><![CDATA[
2479# Input:
2480# rec: Number of elements in remote set
2481# rsd: Number of elements differ in remote set
2482# lec: Number of elements in local set
2483# lsd: Number of elements differ in local set
2484# UPPER_BOUND: Given byzantine upper bound
2485# LOWER_BOUND: Given byzantine lower bound
2486# Output:
2487# returns TRUE if parameters in byzantine bounds otherwise returns FALSE
2488FUNCTION check_byzantine_bounds (rec,rsd,lec,lsd)
2489 IF rec + rsd > UPPER_BOUND THEN
2490 RETURN FALSE
2491 END IF
2492 IF lec + lsd > UPPER_BOUND THEN
2493 RETURN FALSE
2494 END IF
2495 IF rec < LOWER_BOUND THEN
2496 RETURN FALSE
2497 END IF
2498 RETURN TRUE
2499END FUNCTION
2500 ]]></artwork>
2501 </figure>
2502 </section>
2503
2504 <section anchor="security_generic_functions_check_valid_state" numbered="true" toc="default">
2505 <name>Valid State</name>
2506 <t>
2507 To harden the protocol against attacks, controls were introduced in the improved
2508 implementation that check for each message whether the message was received
2509 in the correct state. This is central so that an attacker finds as little attack
2510 surface as possible and makes it more difficult for the attacker to send the
2511 protocol into an endless loop, for example.
2512 </t>
2513 </section>
2514 <section anchor="security_generic_functions_mfc" numbered="true" toc="default">
2515 <name>Message Flow Control</name>
2516 <t>
2517 For most messages received and sent there needs to be a check in place that checks
2518 that a message is not received multiple times. This is solved with a global store (message)
2519 and the following code
2520 </t>
2521 <t>
2522 The sequence in which messages are received and sent is arranged in a chain.
2523 The messages are dependent on each other. There are dependencies that are
2524 mandatory, e.g. for a sent "Demand" message, an "Element" message must
2525 always be received. But there are also messages for which a response is
2526 not mandatory, e.g. the <em><xref target="messages_inquiry" format="title" /></em> message is only followed by an
2527 "Offer" message, if the corresponding element is in the set. Due to this
2528 fact, checks can be installed to verify compliance with the following chain.
2529 </t>
2530 <figure anchor="security_generic_functions_mfc_chain">
2531 <artwork name="" type="" align="left" alt=""><![CDATA[
2532
2533
2534Chain for elements +---------+ +---------+ +---------+ +---------+
2535NOT in IBF decoding | INQUIRY | ---> | OFFER | ===> | DEMAND | ===> | ELEMENT |
2536peers set +---------+ +---------+ +---------+ +---------+
2537
2538Chain for elements +---------+ +---------+ +---------+
2539in IBF decoding | OFFER | ---> | DEMAND | ===> | ELEMENT |
2540peers set +---------+ +---------+ +---------+
2541 --->: Answer not mandatory
2542 ===>: Always answer needed.
2543 ]]></artwork>
2544 </figure>
2545 <t>
2546 In the message control flow its important to ensure that no duplicated messages are received
2547 (Except inquiries where collisions are possible) and only messages are received which are compliant
2548 with the flow in <xref target="security_generic_functions_mfc_chain" format="default" />.
2549 To link messages the SHA-512 element hashes, that are part of all messages, except in the
2550 <em><xref target="messages_inquiry" format="title" /></em> messages, can be used.
2551 To link an <em><xref target="messages_inquiry" format="title" /></em> message to an
2552 <em><xref target="messages_offer" format="title" /></em> message
2553 the SHA-512 hash from the offer has to be salted and converted to the IBF-Key (as described in
2554 <xref target="ibf_format_id_generation_pseudo_code" format="default" />). The IBF-Key can
2555 be matched with the received <em><xref target="messages_inquiry" format="title" /></em> message.
2556 </t>
2557 <t>
2558 At the end of the set reconciliation operation after receiving and sending the
2559 <em><xref target="messages_done" format="title" /></em> message, it should be checked
2560 that all demands have been satisfied and all elements have been received.
2561 </t>
2562 <t>
2563 This is based on <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>, section 5.3 (Message Control Flow).
2564 </t>
2565 </section>
2566
2567 <section anchor="security_generic_functions_active_passive_switches" numbered="true" toc="default">
2568 <name>Limit Active/Passive Decoding changes</name>
2569 <t>
2570 To prevent an attacker from sending a peer into an endless loop between active and passive decoding, a
2571 limitation for active/passive roll switches is required.
2572 Otherwise, an attacker could
2573 force the victim to waste unlimited amount of resources by just transmitting
2574 IBFs that do not decode.
2575 This can be implemented by
2576 a simple counter which terminates the operation after a predefined number of switches.
2577 The maximum number of switches needs to be defined in such a way that it is
2578 very improbable that more switches are required in a legitimate interaction,
2579 and hence the malicious behavior of the other peer is assured.
2580 </t>
2581 <t>
2582 The question after how many active/passive switches it can be assumed that the other peer is not honest,
2583 depends on the various tuning parameters of the algorithm.
2584 Section 5.4 of <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>
2585 demonstrates that the probability of decoding failure is less than
2586 15% for each round. The probability that there will be n legitimate
2587 active/passive changes is thus less than 0.15^{round number}.
2588 Which means that after about 30 active/passive switches it can be said with a certainty of 2^80 that one of the peers
2589 is not following the protocol.
2590 Hence, participants MUST impose a maximum of 30 active/passive changes.
2591 </t>
2592 </section>
2593
2594 <section anchor="security_generic_functions_full_plausibility_check" numbered="true" toc="default">
2595 <name>Full Synchronisation Plausibility Check</name>
2596 <t>
2597 An attacker can try to use up a peer's bandwidth by pretending that the peer
2598 needs full synchronisation, even if the set difference is very small and the attacker
2599 only has a few (or even zero) elements that are not already synchronised.
2600 In such a case, it would be ideal if the plausibility could already be checked
2601 during full synchronisation as to whether the other peer was honest or not with
2602 regard to the estimation of the set size difference and thus the choice of mode
2603 of operation.
2604 </t>
2605 <t>
2606 In order to calculate this plausibility, section 5.5 of <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> describes a formula, which depicts the probability with which one
2607 can calculate the corresponding plausibility based on the number of
2608 new and repeated elements after each received element.
2609 </t>
2610 <t>
2611 Besides this approach from probability theory, there is an additional check
2612 that can be made. After the entire set has been transferred to the other peer,
2613 no known elements may be returned by the second peer, since the second peer
2614 should only return the elements that are missing from the initial peer's set.
2615 </t>
2616 <t>
2617 This two approaches are implemented in the following pseudocode:
2618 </t>
2619
2620 <figure anchor="security_generic_functions_full_plausibility_check_code">
2621 <artwork name="" type="" align="left" alt=""><![CDATA[
2622# Input:
2623# SECURITY_LEVEL: The security level used e.g. 2^80
2624# state: The statemachine state
2625# rs: Estimated remote set difference
2626# lis: Number of elements in set
2627# rd: Number of duplicated elements received
2628# rf: Number of fresh elements received
2629# Output:
2630# Returns TRUE if full synchronisation is plausible and FALSE otherwise
2631
2632FUNCTION full_sync_plausibility_check (state,rs,lis,rd,rf)
2633 security_level_lb = -1 * SECURITY_LEVEL
2634
2635 # Make sure that no element is received double when
2636 # all elements already are transmitted to the oder side.
2637 IF FULL_SENDING == state AND rd > 0 THEN
2638 RETURN FALSE
2639 END IF
2640
2641 # Probabilistic algorithm to check for plausible
2642 # element distribution
2643 IF FULL_RECEIVING == state THEN
2644
2645 # Prevent division by 0
2646 IF 0 <= rs THEN
2647 rs = 1
2648 END IF
2649
2650 # Formula to verify plausibility
2651 base = 1 - (rs / (lis + rs))
2652 exponent = rd - rf * lis / rs
2653 value = exponent * (LOG2(base)/LOG2(2))
2654 IF value < security_level_lb OR value > SECURITY_LEVEL THEN
2655 RETURN FALSE
2656 END IF
2657 END IF
2658 RETURN TRUE
2659END FUNCTION
2660 ]]></artwork>
2661 </figure>
2662 </section>
2663 </section>
2664
2665 <section anchor="security_states" numbered="true" toc="default">
2666 <name>States</name>
2667
2668 <t>
2669 In this section the security considerations for each valid message
2670 in all states is described, if any other message
2671 is received the peer MUST terminate the operation.
2672 </t>
2673
2674 <section anchor="security_states_expecting_ibf" numbered="true" toc="default">
2675 <name>Expecting IBF</name>
2676 <t>Security considerations for received messages:</t>
2677 <dl>
2678 <dt><xref target="messages_request_full" format="title" /></dt>
2679 <dd>
2680 <t>
2681 It needs to be checked that the full synchronisation mode with receiving peer
2682 sending first is plausible according to the algorithm deciding which operation mode
2683 is applicable as described in <xref target="performance_formulas_operationmode" format="default"/>.
2684 </t>
2685
2686 </dd>
2687 <dt><xref target="messages_ibf" format="title" /></dt>
2688 <dd>
2689 <t>
2690 It needs to be checked that the differential synchronisation mode is plausible according
2691 to the algorithm deciding which operation mode
2692 is applicable as described in <xref target="performance_formulas_operationmode" format="default"/>.
2693 </t>
2694
2695 </dd>
2696 <dt><xref target="messages_send_full" format="title" /></dt>
2697 <dd>
2698 <t>
2699 It needs to be checked that the full synchronisation mode with initiating peer
2700 sending first is plausible according to the algorithm deciding which operation mode
2701 is applicable as described in <xref target="performance_formulas_operationmode" format="default"/>.
2702 </t>
2703 </dd>
2704 </dl>
2705 </section>
2706
2707 <section anchor="security_states_full_sending" numbered="true" toc="default">
2708 <name>Full Sending</name>
2709 <t>Security considerations for received messages:</t>
2710 <dl>
2711 <dt><xref target="messages_full_element" format="title" /></dt>
2712 <dd>
2713 <t>
2714 When receiving full elements there needs to be checked, that every
2715 element is a valid element, that no element has been received more than once, and
2716 that not more elements have been received than the other peer has committed
2717 to at the beginning of the operation. The plausibility should also be checked
2718 with an algorithm as described in <xref target="security_generic_functions_full_plausibility_check" format="default"/>.
2719 </t>
2720 </dd>
2721 <dt><xref target="messages_full_done" format="title" /></dt>
2722 <dd>
2723 <t>
2724 When receiving the <em><xref target="messages_full_done" format="title" /></em>
2725 message, it is important to check that
2726 not fewer elements have been received than the other peer has committed to
2727 send at the beginning of the operation.
2728 If the sets differ (the FINAL CHECKSUM field in the <xref target="messages_full_done" format="title" />
2729 message does not match to the SHA-512 hash XOR sum of the local set), the operation has failed and the
2730 reconciliation MUST be aborted. It is a strong indicator
2731 that something went wrong (eg. some hardware bug). This should never occur!
2732 </t>
2733 </dd>
2734 </dl>
2735 </section>
2736
2737 <section anchor="security_states_expecting_ibf_last" numbered="true" toc="default">
2738 <name>Expecting IBF Last</name>
2739 <t>Security considerations for received messages:</t>
2740 <dl>
2741 <dt><xref target="messages_ibf" format="title" /></dt>
2742 <dd>
2743 <t>
2744 The application should check that the overall size of the IBF
2745 that is being transmitted is within its resource bounds, and
2746 abort the protocol if its resource limits are likely to be
2747 exceeded, or if the size is implausible for the given operation.
2748 </t>
2749 <t>
2750 It needs to be checked that the offset (message field "OFFSET")
2751 for every received <em><xref target="messages_ibf" format="title" /></em> message
2752 is strictly monotonic increasing and is a multiple of the MAX_BUCKETS_PER_MESSAGE
2753 defined in the <xref target="constants" format="title" /> section, otherwise the
2754 connection MUST be aborted.
2755 </t>
2756 <t>
2757 Another sanity check is to ensure that the "OFFSET" message field never
2758 is higher than the "IBF SIZE" field in the <em><xref target="messages_ibf" format="title" /></em>
2759 message.
2760 </t>
2761 </dd>
2762 <dt><xref target="messages_ibf_last" format="title" /></dt>
2763 <dd>
2764 <t>
2765 When all <em><xref target="messages_ibf" format="title" /></em> messages have
2766 been received an <em><xref target="messages_ibf_last" format="title" /></em> message
2767 should conclude the transmission of the IBF and a change to the <strong>Active Decoding</strong>
2768 phase should be ensured.
2769 </t>
2770 <t>
2771 To verify that all IBFs have been received, a simple validation can be made.
2772 The number of buckets in the <em><xref target="messages_ibf_last" format="title" /></em> message
2773 added to the value in the message OFFSET field should always be equal to the "IBF SIZE".
2774 </t>
2775 <t>
2776 Further plausibility checks can be made. One is to ensure that after each active/passive
2777 switch the IBF can never be more than double in size. Another plausibility check is
2778 that an IBF probably never will be larger than the byzantine upperbound multiplied by two.
2779 The third plausibility check is to take successfully decoded IBF keys (received offers and demands)
2780 into account and to validate the size of the received IBF with the in <xref target="performance_formula_ibf_parameters_code" format="default" />
2781 described function get_next_ibf_size(). If any of these three checks fail the operation
2782 must be aborted.
2783 </t>
2784 </dd>
2785 </dl>
2786 </section>
2787
2788 <section anchor="security_states_active_decoding" numbered="true" toc="default">
2789 <name>Active Decoding</name>
2790 <t>
2791 In the <strong>Active Decoding</strong> state it is important to prevent an attacker from
2792 generating and transmitting an unlimited number of IBFs that all do not decode, or
2793 to generate an IBF constructed to send the peers in an endless loop.
2794 To prevent an endless loop in decoding, loop detection MUST be implemented.
2795 A solution to prevent endless loop is to limit the number of elements decoded from an IBF.
2796 This limit is defined by the number of buckets in the IBF. It is not possible that more elements are decoded
2797 from an IBF than an IBF has buckets. If more elements than buckets are in an IBF it is not possible to
2798 get pure buckets.
2799 An additional check that should be implemented, is to store all element IDs
2800 that were prior decoded. When a new element ID is decoded from the IBF it should
2801 always be checked that no element ID is repeated.
2802 If the same element ID is decoded more than once, this is a strong indication
2803 for an invalid IBF and the operation MUST be aborted. Notice that the decoded
2804 element IDs are salted as described in <xref target="ibf_format_id_generation_pseudo_code" format="default" />
2805 so the described bit rotation needs to be reverted before the decoded element ID
2806 is stored and compared to the previous decoded element IDs.
2807 </t>
2808 <t>
2809 If the IBF decodes more elements than are plausible, the
2810 operation MUST be terminated. Furthermore, if the IBF
2811 decoding successfully terminates and fewer elements were
2812 decoded than plausible, the operation MUST also be terminated.
2813 The upper thresholds for decoded elements from the IBF is the
2814 remote set size the other peer has committed too (Case if the complete remote set is
2815 new). The lower threshold for decoding element is the absolute value of the difference
2816 between the local and remote set size (Case the set difference is only in the set of
2817 a single peer). The other peer's committed set sizes
2818 is transmitted in the the <strong>Expecting IBF</strong> state.
2819 </t>
2820
2821 <t>Security considerations for received messages:</t>
2822 <dl>
2823 <dt><xref target="messages_offer" format="title" /></dt>
2824 <dd>
2825 <t>
2826 If an offer for an element, that never has been requested by
2827 an inquiry or if an offer is received twice, the operation MUST be terminated.
2828 This requirement can be fulfilled by saving lists that keep track of the state of
2829 all sent inquiries and offers. When answering offers these lists MUST be checked.
2830 The sending and receiving of <xref target="messages_offer" format="title" /> messages should
2831 always be protected with an <xref target="security_generic_functions_mfc" format="title" />
2832 to secure the protocol against missing, duplicated, out-of-order or unexpected messages.
2833 </t>
2834 </dd>
2835 <dt><xref target="messages_elements" format="title" /></dt>
2836 <dd>
2837 <t>
2838 If an element that never has been requested by
2839 a demand or is received twice, the operation MUST be terminated.
2840 The sending and receiving of <xref target="messages_elements" format="title" /> messages should
2841 always be protected with an <xref target="security_generic_functions_mfc" format="title" />
2842 to secure the protocol against missing, duplicated, out-of-order or unexpected messages.
2843 </t>
2844 </dd>
2845 <dt><xref target="messages_demand" format="title" /></dt>
2846 <dd>
2847 <t>
2848 For every received demand an offer has to be sent in advance. If a demand
2849 for an element is received, that never has been offered or the offer already has
2850 been answered with a demand, the operation MUST be terminated. It is required to implement
2851 a list which keeps track of the state of all sent offers and received demands.
2852 The sending and receiving of <em><xref target="messages_demand" format="title" /></em> messages should
2853 always be protected with an <xref target="security_generic_functions_mfc" format="title" />
2854 to secure the protocol against missing, duplicated, out-of-order or unexpected messages.
2855 </t>
2856 </dd>
2857 <dt><xref target="messages_done" format="title" /></dt>
2858 <dd>
2859 <t>
2860 The <em><xref target="messages_done" format="title" /></em> message is only received if the IBF has finished
2861 decoding and all offers have been sent. If the <em><xref target="messages_done" format="title" /></em> message is received before
2862 the decoding of the IBF is finished or all open demands
2863 have been answered, the operation MUST be terminated.
2864 If the sets differ (the FINAL CHECKSUM field in the <xref target="messages_done" format="title" />
2865 message does not match to the SHA-512 hash XOR sum of the local set), the operation has failed and the
2866 reconciliation MUST be aborted. It is a strong indicator
2867 that something went wrong (eg. some hardware bug). This should never occur!
2868 </t>
2869 <t>
2870 When a <em><xref target="messages_done" format="title" /></em> message is received the
2871 "check_if_synchronisation_is_complete()" function from the <xref target="security_generic_functions_mfc" format="title" />
2872 is required to ensure that all demands have been satisfied successfully.
2873 </t>
2874 </dd>
2875 </dl>
2876 </section>
2877 <section anchor="security_states_finish_closing" numbered="true" toc="default">
2878 <name>Finish Closing</name>
2879 <t>
2880 In the <strong>Finish Closing</strong> state the protocol waits for
2881 all sent demands to be fulfilled.
2882 </t>
2883 <t>
2884 In case not all sent demands have been answered in time,
2885 the operation has failed and MUST be terminated.
2886 </t>
2887 <t>Security considerations for received messages:</t>
2888 <dl>
2889 <dt><xref target="messages_elements" format="title" /></dt>
2890 <dd>
2891 When receiving <xref target="messages_elements" format="title" /> messages it is important
2892 to always check the <xref target="security_generic_functions_mfc" format="title" />
2893 to secure the protocol against missing, duplicated, out-of-order or unexpected messages.
2894 </dd>
2895 </dl>
2896 </section>
2897 <section anchor="security_states_finished" numbered="true" toc="default">
2898 <name>Finished</name>
2899 <t>
2900 In this state the connection is terminated, so no security considerations are needed.
2901 </t>
2902 </section>
2903 <section anchor="security_states_expect_se" numbered="true" toc="default">
2904 <name>Expect SE</name>
2905 <t>Security considerations for received messages:</t>
2906 <dl>
2907 <dt><xref target="messages_se" format="title" /></dt>
2908 <dd>
2909 <t>
2910 In case the strata estimator does not decode, the
2911 operation MUST be terminated to prevent to get to an unresolvable state.
2912 The set difference calculated from the strata estimator needs to be plausible,
2913 which means within the byzantine boundaries described in section <xref target="security_generic_functions_check_byzantine_boundaries" format="title" />.
2914 </t>
2915 </dd>
2916 </dl>
2917 </section>
2918 <section anchor="security_states_full_receiving" numbered="true" toc="default">
2919 <name>Full Receiving</name>
2920 <t>Security considerations for received messages:</t>
2921 <dl>
2922 <dt><xref target="messages_full_element" format="title" /></dt>
2923 <dd>
2924 <t>
2925 When receiving full elements there needs to be checked, that every
2926 element is a valid element, no element has been received more than once and
2927 not more elements are received than the other peer committed
2928 to sending at the beginning of the operation. The plausibility should also be checked
2929 with an algorithm as described in <xref target="security_generic_functions_full_plausibility_check" format="default"/>.
2930 </t>
2931 </dd>
2932 <dt><xref target="messages_full_done" format="title" /></dt>
2933 <dd>
2934 <t>
2935 When the <em><xref target="messages_full_done" format="title" /></em> message is received from the remote peer, it should be checked that the number of
2936 elements received matches the number that the remote peer
2937 originally committed to transmitting,
2938 otherwise the operation MUST be terminated.
2939 If the sets differ (the FINAL CHECKSUM field in the <xref target="messages_full_done" format="title" />
2940 message does not match to the SHA-512 hash XOR sum of the local set), the operation has failed and the
2941 reconciliation MUST be aborted. It is a strong indicator
2942 that something went wrong (eg. some hardware bug). This should never occur!
2943 </t>
2944 </dd>
2945 </dl>
2946 </section>
2947 <section anchor="security_states_passive_decoding" numbered="true" toc="default">
2948 <name>Passive Decoding</name>
2949 <t>Security considerations for received messages:</t>
2950 <dl>
2951 <dt><xref target="messages_ibf" format="title" /></dt>
2952 <dd>
2953 <t>
2954 In case an <xref target="messages_ibf" format="title" /> message is received by the peer a active/passive role switch
2955 is initiated by the active decoding remote peer.
2956 A switch into active decoding mode MUST only be permitted for
2957 a predefined number of times as described in <xref target="security_generic_functions_active_passive_switches" format="default"/>
2958 </t>
2959 </dd>
2960 <dt><xref target="messages_inquiry" format="title" /></dt>
2961 <dd>
2962 <t>
2963 A check needs to be in place that prevents receiving an inquiry
2964 for an element multiple times or more inquiries than are plausible.
2965 The upper thresholds for sent/received inquiries is the
2966 remote set size the other peer has committed too (Case if the complete remote set is
2967 new). The lower threshold for for sent/received inquiries is the absolute value of the
2968 set difference between the local and remote set size
2969 (Case the set difference is only in the set of a single peer).
2970 The other peer's committed set sizes is transmitted in the the <strong>Expecting IBF</strong> state.
2971 Beware that it is possible to get key collisions and an inquiry for the same key
2972 can be transmitted multiple times, so the threshold should take this into account.
2973 The sending and receiving of <em><xref target="messages_inquiry" format="title" /></em> messages should
2974 always be protected with an <xref target="security_generic_functions_mfc" format="title" />
2975 to secure the protocol against missing, duplicated, out-of-order or unexpected messages.
2976 </t>
2977 </dd>
2978 <dt><xref target="messages_demand" format="title" /></dt>
2979 <dd>
2980 Same action as described for <em><xref target="messages_demand" format="title" /></em> message in section
2981 <xref target="security_states_active_decoding" format="title"/>.
2982 </dd>
2983 <dt><xref target="messages_offer" format="title" /></dt>
2984 <dd>
2985 Same action as described for <em><xref target="messages_offer" format="title" /></em> message in section
2986 <xref target="security_states_active_decoding" format="title"/>.
2987 </dd>
2988 <dt><xref target="messages_done" format="title" /></dt>
2989 <dd>
2990 Same action as described for <em><xref target="messages_done" format="title" /></em> message in section
2991 <xref target="security_states_active_decoding" format="title"/>.
2992 </dd>
2993 <dt><xref target="messages_elements" format="title" /></dt>
2994 <dd>
2995 Same action as described for <em><xref target="messages_elements" format="title" /></em> message in section
2996 <xref target="security_states_active_decoding" format="title"/>.
2997 </dd>
2998
2999 </dl>
3000 </section>
3001 <section anchor="security_states_finish_waiting" numbered="true" toc="default">
3002 <name>Finish Waiting</name>
3003 <t>
3004 In the <strong>Finish Waiting</strong> state the protocol waits for
3005 all transmitted demands to be fulfilled.
3006 </t>
3007 <t>
3008 In case not all transmitted demands have been answered at this time, the operation
3009 has failed and the protocol MUST be terminated with an error.
3010 </t>
3011 <t>Security considerations for received messages:</t>
3012 <dl>
3013 <dt><xref target="messages_elements" format="title" /></dt>
3014 <dd>
3015 When receiving <xref target="messages_elements" format="title" /> messages it is important
3016 to always check the <xref target="security_generic_functions_mfc" format="title" />
3017 to secure the protocol against missing, duplicated, out-of-order or unexpected messages.
3018 </dd>
3019 </dl>
3020 </section>
3021 </section>
3022 </section>
3023 <section anchor="constants" numbered="true" toc="default">
3024 <name>Constants</name>
3025 <t>
3026 The following table contains constants used by the protocol. The constants marked with a * are
3027 validated through experiments in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>.
3028 </t>
3029 <figure anchor="figure_constants">
3030 <artwork name="" type="" align="left" alt=""><![CDATA[
3031Name | Value | Description
3032----------------------------+------------+--------------------------
3033SE_STRATA_COUNT | 32 | Number of IBFs in a strata estimator
3034IBF_HASH_NUM* | 3 | Number of times an element is hashed to an
3035 IBF (from section 4.5.2)
3036IBF_FACTOR* | 2 | The factor by which the size of the IBF is
3037 increased in case of decoding failure or
3038 initially from the set difference.
3039 (from section 4.5.2)
3040MAX_BUCKETS_PER_MESSAGE | 1120 | Maximum bucket of an IBF that are
3041 transmitted in single message
3042IBF_MIN_SIZE* | 37 | Minimal number of buckets in an IBF
3043 (from section 3.8)
3044DIFFERENTIAL_RTT_MEAN* | 3.65145 | The average RTT that is needed for a
3045 differential synchronisation
3046SECURITY_LEVEL* | 2^80 | Security level for probabilistic security
3047 algorithms (from section 5.8)
3048PROBABILITY_FOR_NEW_ROUND* | 0.15 | The probability for a IBF decoding failure
3049 in the differential synchronisation mode
3050 (from section 5.4)
3051DIFFERENTIAL_RTT_MEAN* | 3.65145 | The average RTT that is needed for a
3052 differential synchronisation.
3053 (from section 4.5.3)
3054MAX_IBF_SIZE | 1048576 | Maximal number of buckets in an IBF
3055AVG_BYTE_SIZE_SE* | 4221 | Average byte size of a single strata
3056 estimator (from section 3.4.3)
3057VALID_NUMBER_SE* | [1,2,4,8] | Valid number of SE in (from section 3.4)
3058
3059 ]]></artwork>
3060 </figure>
3061
3062 </section>
3063 <section anchor="gana" numbered="true" toc="default">
3064 <name>GANA Considerations</name>
3065 <t>
3066 GANA is requested to amend the "GNUnet Message Type" <xref target="GANA" format="default"/> registry
3067 as follows:
3068 </t>
3069 <figure anchor="figure_purposenums">
3070 <artwork name="" type="" align="left" alt=""><![CDATA[
3071Type | Name | References | Description
3072--------+----------------------------+------------+-----------------------------------
3073 559 | SETU_P2P_REQUEST_FULL | [This.I-D] | Request the full set of the other
3074 peer
3075 710 | SETU_P2P_SEND_FULL | [This.I-D] | Signals to send the full set to the
3076 other peer
3077 560 | SETU_P2P_DEMAND | [This.I-D] | Demand the whole element from the
3078 other peer, given only the hash
3079 code.
3080 561 | SETU_P2P_INQUIRY | [This.I-D] | Tell the other peer to send a list
3081 of hashes that match an IBF key.
3082 562 | SETU_P2P_OFFER | [This.I-D] | Tell the other peer which hashes
3083 match a given IBF key.
3084 563 | SETU_P2P_OPERATION_REQUEST | [This.I-D] | Request a set union operation from
3085 a remote peer.
3086 564 | SETU_P2P_SE | [This.I-D] | Strata Estimator uncompressed
3087 565 | SETU_P2P_IBF | [This.I-D] | Invertible Bloom Filter slices.
3088 566 | SETU_P2P_ELEMENTS | [This.I-D] | Actual set elements.
3089 567 | SETU_P2P_IBF_LAST | [This.I-D] | Invertible Bloom Filter Last Slice.
3090 568 | SETU_P2P_DONE | [This.I-D] | Set operation is done.
3091 569 | SETU_P2P_SEC | [This.I-D] | Strata Estimator compressed
3092 570 | SETU_P2P_FULL_DONE | [This.I-D] | All elements in full synchronisation
3093 mode have been sent is done.
3094 571 | SETU_P2P_FULL_ELEMENT | [This.I-D] | Send an actual element in full
3095 synchronisation mode.
3096
3097 ]]></artwork>
3098 </figure>
3099 </section>
3100 <!-- gana -->
3101 <section anchor="contributors" numbered="true" toc="default">
3102 <name>Contributors</name>
3103 <t>
3104 The GNUnet implementation of the byzantine fault tolerant set reconciliation
3105 protocol was originally implemented by Florian Dold.
3106 </t>
3107 </section>
3108 </middle>
3109 <back>
3110 <references>
3111 <name>Normative References</name>
3112 &RFC5869;
3113 &RFC2119;
3114 &RFC3385;
3115 &RFC1951;
3116
3117 <reference anchor="byzantine_fault_tolerant_set_reconciliation" target="https://summermatter.net/byzantine-fault-tolerant-set-reconciliation-summermatter.pdf">
3118 <front>
3119 <title>Byzantine Fault Tolerant Set Reconciliation</title>
3120 <author initials="E." surname="Summermatter" fullname="Elias Summermatter">
3121 <organization>University of Applied Sciences Bern</organization>
3122 </author>
3123 <date year="2021"/>
3124 </front>
3125 </reference>
3126
3127 <reference anchor="GANA" target="https://gana.gnunet.org/">
3128 <front>
3129 <title>GNUnet Assigned Numbers Authority (GANA)</title>
3130 <author>
3131 <organization>GNUnet e.V.</organization>
3132 </author>
3133 <date month="April" year="2020"/>
3134 </front>
3135 </reference>
3136
3137 <reference anchor="CryptographicallySecureVoting" target="https://git.gnunet.org/bibliography.git/plain/docs/ba_dold_voting_24aug2014.pdf">
3138 <front>
3139 <title>Cryptographically Secure, Distributed Electronic Voting</title>
3140 <author initials="F." surname="Dold" fullname="Florian Dold">
3141 <organization>Technische Universität München</organization>
3142 </author>
3143 </front>
3144 </reference>
3145
3146
3147 <reference anchor="ByzantineSetUnionConsensusUsingEfficientSetReconciliation" target="https://doi.org/10.1186/s13635-017-0066-3">
3148 <front>
3149 <title>Byzantine set-union consensus using efficient set reconciliation</title>
3150 <author initials="F." surname="Dold" fullname="Florian Dold">
3151 <organization>Technische Universität München</organization>
3152 </author>
3153 <author initials="C." surname="Grothoff" fullname="Christian Grothoff">
3154 <organization>Inria, Domaine de Voluceau Rocquencourt</organization>
3155 </author>
3156 </front>
3157 </reference>
3158 <reference anchor="Eppstein" target="https://doi.org/10.1145/2018436.2018462">
3159 <front>
3160 <title>What’s the Difference? Efficient Set Reconciliation without Prior Context</title>
3161 <author initials="D." surname="Eppstein" fullname="David Eppstein">
3162 <organization>U.C. Irvine</organization>
3163 </author>
3164 <author initials="M." surname="Goodrich" fullname="Michael T. Goodrich">
3165 <organization>U.C. Irvine</organization>
3166 </author>
3167 <author initials="F." surname="Uyeda" fullname="Frank Uyeda">
3168 <organization>U.C. San Diego</organization>
3169 </author>
3170 <author initials="G." surname="Varghese" fullname="George Varghese">
3171 <organization>U.C. San Diego</organization>
3172 </author>
3173 </front>
3174 </reference>
3175
3176 <reference anchor="GNS" target="https://doi.org/10.1007/978-3-319-12280-9_9">
3177 <front>
3178 <title>A Censorship-Resistant, Privacy-Enhancing and Fully Decentralized Name System</title>
3179 <author initials="M." surname="Wachs" fullname="Matthias Wachs">
3180 <organization>Technische Universitaet Muenchen</organization>
3181 </author>
3182
3183 <author initials="M." surname="Schanzenbach" fullname="Martin Schanzenbach">
3184 <organization>Technische Universitaet Muenchen</organization>
3185 </author>
3186
3187 <author initials="C." surname="Grothoff"
3188 fullname="Christian Grothoff">
3189 <organization>Technische Universitaet Muenchen</organization>
3190 </author>
3191 <date year="2014"/>
3192 </front>
3193 </reference>
3194
3195 </references>
3196 <section anchor="test_vectors" numbered="true" toc="default">
3197 <name>Test Vectors</name>
3198 <section anchor="test_vectors_map_function" numbered="true" toc="default">
3199 <name>Map Function</name>
3200 <t>
3201 INPUTS:
3202 </t>
3203 <figure anchor="test_vectors_map_function_inputs">
3204 <artwork name="" type="" align="left" alt=""><![CDATA[
3205k: 3
3206ibf_size: 300
3207
3208key1: 0xFFFFFFFFFFFFFFFF (64-bit)
3209key2: 0x0000000000000000 (64-bit)
3210key3: 0x00000000FFFFFFFF (64-bit)
3211key4: 0xC662B6298512A22D (64-bit)
3212key5: 0xF20fA7C0AA0585BE (64-bit)
3213 ]]></artwork>
3214 </figure>
3215 <t>
3216 OUTPUT:
3217 </t>
3218 <figure anchor="test_vectors_map_function_outputs">
3219 <artwork name="" type="" align="left" alt=""><![CDATA[
3220key1: ["122","157","192"]
3221key2: ["85","243","126"]
3222key3: ["208","101","222"]
3223key4: ["239","269","56"]
3224key5: ["150","104","33"]
3225 ]]></artwork>
3226 </figure>
3227 </section>
3228 <section anchor="test_vectors_id_function" numbered="true" toc="default">
3229 <name>ID Calculation Function</name>
3230 <t>
3231 INPUTS:
3232 </t>
3233 <figure anchor="test_vectors_id_function_inputs">
3234 <artwork name="" type="" align="left" alt=""><![CDATA[
3235element 1: 0xFFFFFFFFFFFFFFFF (64-bit)
3236element 2: 0x0000000000000000 (64-bit)
3237element 3: 0x00000000FFFFFFFF (64-bit)
3238element 4: 0xC662B6298512A22D (64-bit)
3239element 5: 0xF20fA7C0AA0585BE (64-bit)
3240 ]]></artwork>
3241 </figure>
3242 <t>
3243 OUTPUT:
3244 </t>
3245 <figure anchor="test_vectors_id_function_outputs">
3246 <artwork name="" type="" align="left" alt=""><![CDATA[
3247element 1: 0x5AFB177B
3248element 2: 0x64AB557C
3249element 3: 0xCB5DB740
3250element 4: 0x8C6A2BB2
3251element 5: 0x7EC42981
3252 ]]></artwork>
3253 </figure>
3254 </section>
3255 <section anchor="test_counter_compression_function" numbered="true" toc="default">
3256 <name>Counter Compression Function</name>
3257 <t>
3258 INPUTS:
3259 </t>
3260 <figure anchor="test_counter_compression_function_inputs">
3261 <artwork name="" type="" align="left" alt=""><![CDATA[
3262counter serie 1: [1,8,10,6,2] (min bytes 4)
3263counter serie 2: [26,17,19,15,2,8] (min bytes 5)
3264counter serie 3: [4,2,0,1,3] (min bytes 3)
3265 ]]></artwork>
3266 </figure>
3267 <t>
3268 OUTPUT:
3269 </t>
3270 <figure anchor="test_counter_compression_function_outputs">
3271 <artwork name="" type="" align="left" alt=""><![CDATA[
3272counter serie 1: 0x18A62
3273counter serie 2: 0x3519BC48
3274counter serie 3: 0x440B
3275 ]]></artwork>
3276 </figure>
3277 </section>
3278 </section>
3279 </back>
3280</rfc>
diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml
index d53ff3b..7f1071d 100644
--- a/draft-summermatter-set-union.xml
+++ b/draft-summermatter-set-union.xml
@@ -16,7 +16,7 @@
16<?rfc sortrefs="yes" ?> 16<?rfc sortrefs="yes" ?>
17<?rfc compact="yes" ?> 17<?rfc compact="yes" ?>
18<?rfc subcompact="no" ?> 18<?rfc subcompact="no" ?>
19<rfc category="info" docName="draft-summermatter-set-union-03" ipr="trust200902" 19<rfc category="info" docName="draft-summermatter-set-union-01" ipr="trust200902"
20 obsoletes="" updates="" submissionType="IETF" xml:lang="en" version="3"> 20 obsoletes="" updates="" submissionType="IETF" xml:lang="en" version="3">
21 <!-- xml2rfc v2v3 conversion 2.26.0 --> 21 <!-- xml2rfc v2v3 conversion 2.26.0 -->
22 <front> 22 <front>
@@ -1902,13 +1902,13 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 |
1902 SE-SLICE 1902 SE-SLICE
1903 0 8 16 24 32 40 48 56 1903 0 8 16 24 32 40 48 56
1904 +-----+-----+-----+-----+-----+-----+-----+-----+ 1904 +-----+-----+-----+-----+-----+-----+-----+-----+
1905 | SE 1 [IBF 1] | 1905 | SE_1->IBF_1 |
1906 +-----+-----+-----+-----+-----+-----+-----+-----+ 1906 +-----+-----+-----+-----+-----+-----+-----+-----+
1907 ... ... 1907 ... ...
1908 +-----+-----+-----+-----+-----+-----+-----+-----+ 1908 +-----+-----+-----+-----+-----+-----+-----+-----+
1909 | SE 1 [IBF 30] | 1909 | SE_1->IBF_30 |
1910 +-----+-----+-----+-----+-----+-----+-----+-----+ 1910 +-----+-----+-----+-----+-----+-----+-----+-----+
1911 | SE 2 [IBF 1] | 1911 | SE_2->IBF_1 |
1912 +-----+-----+-----+-----+-----+-----+-----+-----+ 1912 +-----+-----+-----+-----+-----+-----+-----+-----+
1913 ... ... 1913 ... ...
1914 / / 1914 / /
@@ -3069,7 +3069,7 @@ VALID_NUMBER_SE* | [1,2,4,8] | Valid number of SE in (from section 3
3069 <figure anchor="figure_purposenums"> 3069 <figure anchor="figure_purposenums">
3070 <artwork name="" type="" align="left" alt=""><![CDATA[ 3070 <artwork name="" type="" align="left" alt=""><![CDATA[
3071Type | Name | References | Description 3071Type | Name | References | Description
3072--------+----------------------------+------------+-------------------------- 3072--------+----------------------------+------------+-----------------------------------
3073 559 | SETU_P2P_REQUEST_FULL | [This.I-D] | Request the full set of the other 3073 559 | SETU_P2P_REQUEST_FULL | [This.I-D] | Request the full set of the other
3074 peer 3074 peer
3075 710 | SETU_P2P_SEND_FULL | [This.I-D] | Signals to send the full set to the 3075 710 | SETU_P2P_SEND_FULL | [This.I-D] | Signals to send the full set to the