draft-summermatter-set-union-01.xml (194157B)
1 <?xml version='1.0' encoding='utf-8'?> 2 <!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent" [ 3 <!ENTITY RFC1034 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1034.xml"> 4 <!ENTITY RFC1035 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1035.xml"> 5 <!ENTITY RFC2119 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"> 6 <!ENTITY RFC2782 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2782.xml"> 7 <!ENTITY RFC3686 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3686.xml"> 8 <!ENTITY RFC5869 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5869.xml"> 9 <!ENTITY RFC3385 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3385.xml"> 10 <!ENTITY RFC1951 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1951.xml"> 11 ]> 12 <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?> 13 <?rfc strict="yes" ?> 14 <?rfc toc="yes" ?> 15 <?rfc symrefs="yes"?> 16 <?rfc sortrefs="yes" ?> 17 <?rfc compact="yes" ?> 18 <?rfc subcompact="no" ?> 19 <rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-summermatter-set-union-01" ipr="trust200902" 20 obsoletes="" updates="" submissionType="IETF" xml:lang="en" version="3"> 21 <!-- xml2rfc v2v3 conversion 2.26.0 --> 22 <front> 23 <title abbrev="Set Union"> 24 Byzantine Fault Tolerant Set Reconciliation 25 </title> 26 <seriesInfo name="Internet-Draft" value="draft-summermatter-set-union-01"/> 27 <author fullname="Elias Summermatter" initials="E." surname="Summermatter"> 28 <organization>Seccom GmbH</organization> 29 <address> 30 <postal> 31 <street>Brunnmattstrasse 44</street> 32 <city>Bern</city> 33 <code>3007</code> 34 <country>CH</country> 35 </postal> 36 <email>elias.summermatter@seccom.ch</email> 37 </address> 38 </author> 39 <author fullname="Christian Grothoff" initials="C." surname="Grothoff"> 40 <organization>Berner Fachhochschule</organization> 41 <address> 42 <postal> 43 <street>Hoeheweg 80</street> 44 <city>Biel/Bienne</city> 45 <code>2501</code> 46 <country>CH</country> 47 </postal> 48 <email>grothoff@gnunet.org</email> 49 </address> 50 </author> 51 52 <!-- Meta-data Declarations --> 53 <area>General</area> 54 <workgroup>Independent Stream</workgroup> 55 <keyword>name systems</keyword> 56 <abstract> 57 <t>This document contains a protocol specification for Byzantine fault-tolerant 58 Set Reconciliation. 59 </t> 60 </abstract> 61 </front> 62 <middle> 63 <section anchor="introduction" numbered="true" toc="default"> 64 <name>Introduction</name> 65 <t> 66 This document describes a byzantine fault tolerant set reconciliation protocol used to efficient and securely 67 compute the union of two sets across a network. 68 </t> 69 <t> 70 This byzantine fault tolerant set reconciliation 71 protocol can be used in a variety of applications. 72 73 Our primary envisioned application domain is the 74 distribution of revocation messages in the GNU Name 75 System (GNS) <xref target="GNS" format="default" />. In GNS, 76 key revocation messages are usually flooded across the 77 peer-to-peer overlay network to all connected peers 78 whenever a key is revoked. However, as peers may be 79 offline or the network might have been partitioned, 80 there is a need to reconcile revocation lists whenever 81 network partitions are healed or peers go online. The 82 GNU Name System uses the protocol described in this 83 specification to efficiently distribute revocation 84 messages whenever network partitions are healed. 85 86 Another application domain for the protocol described 87 in this specification are Byzantine fault-tolerant 88 bulletin boards, like those required in some secure 89 multiparty computations. A well-known example for 90 secure multiparty computations are various E-voting 91 protocols <xref target="CryptographicallySecureVoting" format="default"/> which 92 use a bulletin board to share the votes and intermediate 93 computational results. We note that for such systems, 94 the set reconciliation protocol is merely a component of 95 a multiparty consensus protocol, such as the one 96 described in Dold's "Byzantine set-union consensus using 97 efficient set reconciliation" <xref target="ByzantineSetUnionConsensusUsingEfficientSetReconciliation" format="default"/>. 98 </t> 99 <t> 100 The protocol described in this report is generic and 101 suitable for a wide range of applications. As a result, 102 the internal structure of the elements in the sets MUST 103 be defined and verified by the application using the 104 protocol. This document thus does not cover the element 105 structure, except for imposing a limit on the maximum 106 size of an element. 107 </t> 108 <t> 109 The protocol faces an inherent trade-off between minimizing 110 the number of network round-trips and the number of bytes 111 sent over the network. Thus, for the protocol to choose 112 the right parameters for a given situation, applications 113 using an implementation of the protocol SHOULD provide a 114 parameter that specifies 115 the cost-ratio of round-trips vs. bandwidth usage. Given 116 this trade-off factor, an implementation CAN then choose parameters 117 that minimize total execution cost. In particular, there 118 is one major choice to be made, namely between sending the 119 complete set of elements, or computing the set differences and 120 transmitting only the elements in the set differences. 121 In the latter case, our design is basically a concrete 122 implementation of a proposal by Eppstein.<xref target="Eppstein" format="default" /> 123 </t> 124 125 <t> 126 We say that our set reconciliation protocol is Byzantine 127 fault-tolerant because it provides cryptographic and 128 probabilistic methods to discover if the other peer 129 is dishonest or misbehaving. 130 Here, the security objective is to limit resources wasted on 131 malicious actors. Malicious actors could send malformed 132 messages, including malformed set elements, claim to 133 have much larger numbers of valid set elements than they 134 actually hold, or request the retransmission of elements 135 that they have already received in previous 136 interactions. Bounding resources consumed by malicous 137 actors is important to ensure that higher-level protocols 138 can use set reconciliation and still meet their resource 139 targets. This can be particularly critical in multi-round 140 synchronous consensus protocols where peers that cannot 141 answer in a timely fashion would have to be treated as 142 failed or malicious. 143 </t> 144 <t> 145 To defend against some of these attacks, applications 146 SHOULD remember the number of elements previously 147 shared with a peer, and SHOULD provide a way to check that 148 elements are well-formed. Applications MAY also 149 provide an upper bound on the total number of valid 150 elements that exist. For example, in E-voting, the 151 number of eligible voters MAY be used to provide such 152 an upper bound. 153 </t> 154 <t> 155 A first draft of this RFC is part of Elias Summermatter's 156 bachelor thesis. Many of the algorithms and parameters 157 documented in this RFC are derived from experiments 158 detailed in this thesis. 159 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> 160 </t> 161 162 <t> 163 This document defines the normative wire format of resource records, resolution processes, 164 cryptographic routines and security considerations for use by implementors. 165 SETU requires a bidirectional secure communication channel between the two parties. 166 Specification of the communication channel is out of scope of this document. 167 </t> 168 <t> 169 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL 170 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 171 "OPTIONAL" in this document are to be interpreted as described 172 in <xref target="RFC2119"/>. 173 </t> 174 </section> 175 176 <section anchor="background" numbered="true" toc="default"> 177 <name>Background</name> 178 <section anchor="bf" numbered="true" toc="default"> 179 <name>Bloom Filter</name> 180 <t> 181 A Bloom filter (BF) is a space-efficient probabilistic 182 datastructure to test if an element is part of a set of elements. 183 Elements are identified by an element ID. 184 Since a BF is a probabilistic datastructure, it is possible to have false-positives: when asked 185 if an element is in the set, the answer from a BF is either "no" or "maybe". 186 </t> 187 <t> 188 A BF consists of L buckets. Every bucket is a binary value that can be either 0 or 1. All buckets are initialized 189 to 0. A mapping function M is used to map each ID of each element from the set to a subset of k buckets. In the original proposal by Bloom, M is non-injective 190 and can thus map the same element multiple times to the same bucket. 191 The type of the mapping function can thus be described by the following mathematical notation: 192 </t> 193 <figure anchor="bf_mapping_function_math"> 194 <artwork name="" type="" align="left" alt=""><![CDATA[ 195 ------------------------------------ 196 # M: E->B^k 197 ------------------------------------ 198 # L = Number of buckets 199 # B = 0,1,2,3,4,...L-1 (the buckets) 200 # k = Number of buckets per element 201 # E = Set of elements 202 ------------------------------------ 203 Example: L=256, k=3 204 M('element-data') = {4,6,255} 205 206 ]]></artwork> 207 </figure> 208 <t> 209 A typical mapping function is constructed by hashing the element, for example 210 using the well-known <relref section="2" target="RFC5869" displayFormat="of">HKDF construction</relref>. 211 </t> 212 <t> 213 To add an element to the BF, the corresponding buckets under the map M are set to 1. 214 To check if an element may be in the set, one tests if all buckets under the map M are set to 1. 215 </t> 216 <t> 217 In the BF the buckets are set to 1 if the corresponding bit in the bitstream is 1. 218 If there is a collision and a bucket is already set to 1, the bucket stays at 1. 219 </t> 220 <t> 221 In the following example the element e0 with M(e0) = {1,3} has been added: 222 </t> 223 <figure anchor="figure_bf_insert_0"> 224 <artwork name="" type="" align="left" alt=""><![CDATA[ 225 bucket-0 bucket-1 bucket-2 bucket-3 226 +-------------+-------------+-------------+-------------+ 227 | 0 | 1 | 0 | 1 | 228 +-------------+-------------+-------------+-------------+ 229 ]]></artwork> 230 </figure> 231 <t> 232 It is easy to see that an element e1 with M(e1) = {0,3} 233 could have been added to the BF below, while an element e2 234 with M(e2) = {0,2} cannot be in the set represented by the 235 BF below: 236 </t> 237 238 <figure anchor="figure_bf_contains"> 239 <artwork name="" type="" align="left" alt=""><![CDATA[ 240 bucket-0 bucket-1 bucket-2 bucket-3 241 +-------------+-------------+-------------+-------------+ 242 | 1 | 0 | 0 | 1 | 243 +-------------+-------------+-------------+-------------+ 244 ]]></artwork> 245 </figure> 246 <t> 247 The parameters L and k depend on the set size and MUST be 248 chosen carefully to ensure that the BF does not return too 249 many false-positives. 250 </t> 251 <t> 252 It is not possible to remove an element from the BF because buckets can only be set to 1 or 0. Hence it is impossible to 253 differentiate between buckets containing one or more elements. To remove elements from the BF a <xref target="cbf" format="title" /> 254 is required. 255 </t> 256 </section> 257 258 <section anchor="cbf" numbered="true" toc="default"> 259 <name>Counting Bloom Filter</name> 260 <t> 261 A Counting Bloom Filter (CBF) is a variation on the idea 262 of a <xref target="bf" format="title" />. With a CBF, buckets are 263 unsigned numbers instead of binary values. 264 This allows the removal of an element from the CBF. 265 </t> 266 <t> 267 Adding an element to the CBF is similar to the adding operation of the BF. 268 However, instead of setting the buckets to 1 the 269 numeric value stored in the bucket is increased by 1. 270 For example, if two colliding elements M(e1) = {1,3} and 271 M(e2) = {0,3} are added to the CBF, bucket 0 and 1 are set 272 to 1 and bucket 3 (the colliding bucket) is set to 2: 273 </t> 274 <figure anchor="figure_cbf_insert_0"> 275 <artwork name="" type="" align="left" alt=""><![CDATA[ 276 bucket-0 bucket-1 bucket-2 bucket-3 277 +-------------+-------------+-------------+-------------+ 278 | 1 | 1 | 0 | 2 | 279 +-------------+-------------+-------------+-------------+ 280 ]]></artwork> 281 </figure> 282 <t> 283 The counter stored in the bucket is also called the order of the bucket. 284 </t> 285 <t> 286 To remove an element form the CBF the counters of all buckets 287 the element is mapped to are decreased by 1. 288 </t> 289 <t> 290 For example, removing M(e2) = {1,3} from the CBF above 291 results in: 292 </t> 293 <figure anchor="figure_cbf_remove_0"> 294 <artwork name="" type="" align="left" alt=""><![CDATA[ 295 bucket-0 bucket-1 bucket-2 bucket-3 296 +-------------+-------------+-------------+-------------+ 297 | 1 | 0 | 0 | 1 | 298 +-------------+-------------+-------------+-------------+ 299 ]]></artwork> 300 </figure> 301 <t> 302 In practice, the number of bits available for the counters 303 is often finite. For example, given a 4-bit 304 counter, a CBF bucket would overflow 16 elements are mapped 305 to the same bucket. To handle this case, the maximum value 306 (15 in our example) is considered to represent "infinity". Once the 307 order of a bucket reaches "infinity", it is no longer incremented or decremented. 308 </t> 309 <t> 310 The parameters L and k and the number of bits allocated to the counters 311 SHOULD depend on the set size. 312 A CBF will degenerate when subjected to insert and remove iterations of 313 different elements, and eventually all buckets will reach "infinity". 314 The speed of the degradation will depend on the choice of L and k in 315 relation to the number of elements stored in the IBF. 316 </t> 317 </section> 318 </section> 319 320 <section anchor="ibf" numbered="true" toc="default"> 321 <name>Invertible Bloom Filter</name> 322 <t> 323 An Invertible Bloom Filter (IBF) is a further extension of the <xref target="cbf" format="title" />. 324 An IBF extends the <xref target="cbf" format="title" /> with two more operations: 325 decode and set difference. This two extra operations are key to efficiently obtain 326 small differences between large sets. 327 </t> 328 <section anchor="ibf_structure" numbered="true" toc="default"> 329 <name>Structure</name> 330 <t> 331 An IBF consists of an injective mapping function M mapping 332 elements to k out of L buckets. Each of the L buckets stores 333 a signed COUNTER, an IDSUM and an XHASH. 334 An IDSUM is the XOR of various element IDs. 335 An XHASH is the XOR of various hash values. 336 As before, the values used for k, L and the number of bits used 337 for the signed counter and the XHASH depend 338 on the set size and various other trade-offs. 339 </t> 340 <t> 341 If the IBF size is too small or the mapping 342 function does not spread out the elements 343 uniformly, the signed counter can overflow or 344 underflow. As with the CBF, the "maximum" value is 345 thus used to represent "infinite". As there is no 346 need to distinguish between overflow and 347 underflow, the most canonical representation of 348 "infinite" would be the minimum value of the 349 counter in the canonical 2-complement 350 interpretation. For example, given a 4-bit 351 counter a value of -8 would be used to represent 352 "infinity". 353 </t> 354 <figure anchor="figure_ibf_structure"> 355 <artwork name="" type="" align="left" alt=""><![CDATA[ 356 bucket-0 bucket-1 bucket-2 bucket-3 357 +-------------+-------------+-------------+-------------+------- 358 count | COUNTER | COUNTER | COUNTER | COUNTER | C... 359 +-------------+-------------+-------------+-------------+------ 360 idSum | IDSUM | IDSUM | IDSUM | IDSUM | I... 361 +-------------+-------------+-------------+-------------+------ 362 hashSum | HASHSUM | HASHSUM | HASHSUM | HASHSUM | H.. 363 +-------------+-------------+-------------+-------------+------- 364 ]]></artwork> 365 </figure> 366 367 </section> 368 369 <section anchor="ibf_format_id_generation" numbered="true" toc="default"> 370 <name>Salted Element ID Calculation</name> 371 <t> 372 IBFs are a probabilistic data structure, hence it can be necessary to 373 recompute the IBF in case operations fail, and then try again. The 374 recomputed IBF would ideally be statistically independent of the 375 failed IBF. This is achieved by introducing an IBF-salt. Given that with 376 benign peers failures should be rare, and that we need to be able to 377 "invert" the application of the IBF-salt to the element IDs, we use an 378 unsigned 32 bit non-random IBF-salt value of which the lowest 6 bits will 379 be used to rotate bits in the element ID computation. 380 </t> 381 <t> 382 64-bit element IDs are generated from a 383 <relref section="2" target="RFC5869" displayFormat="of">HKDF construction</relref> 384 with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF with a 16-bit KDF-salt set to a 385 unsigned 16-bit representation of zero. 386 The output of the KDF is then truncated to 64-bit. 387 Finally, salting is done by calculating the IBF-salt modulo 64 388 (effectively using only the lowest 6-bits of the IBF-salt) 389 and doing a bitwise right rotation of the output of KDF. We 390 note that this operation was chosen as it is easily inverted, 391 allowing applications to easily derive element IDs with one 392 IBF-salt value from element IDs generated with a different 393 IBF-salt value. 394 </t> 395 <t> 396 In case the IBF does not decode, the IBF-salt can be changed to 397 compute different element IDs, which will (likely) be mapped 398 to different buckets, likely allowing the IBF to decode in a 399 subsequent iteration. 400 </t> 401 <figure anchor="ibf_format_id_generation_pseudo_code"> 402 <artwork name="" type="" align="left" alt=""><![CDATA[ 403 # INPUTS: 404 # key: Pre calculated and truncated key from id_calculation function 405 # ibf_salt: Salt of the IBF 406 # OUTPUT: 407 # value: salted key 408 FUNCTION salt_key(key,ibf_salt): 409 s = (ibf_salt * 7) modulo 64; 410 /* rotate key */ 411 return (key >> s) | (key << (64 - s)) 412 END FUNCTION 413 414 415 # INPUTS: 416 # element: element for which we are to calculate the element ID 417 # ibf_salt: Salt of the IBF 418 # OUTPUT: 419 # value: the ID of the element 420 FUNCTION id_calculation (element,ibf_salt): 421 kdf_salt = 0 // 16 bits 422 XTR=HMAC-SHA256 423 PRF=HMAC-SHA256 424 key = HKDF(XTR, PRF, kdf_salt, element) modulo 2^64 425 return salt_key(key, ibf_salt) 426 END FUNCTION 427 ]]></artwork> 428 </figure> 429 </section> 430 431 <section anchor="ibf_format_HASH_calculation" numbered="true" toc="default"> 432 <name>HASH calculation</name> 433 <t> 434 The HASH of an element ID is computed by calculating the 435 CRC32 checksum of the 64-bit ID value, 436 which returns a 32-bit value.CRC32 is well-known and described in <relref section="4.1" target="RFC3385" displayFormat="of">the RFC</relref>. 437 </t> 438 </section> 439 440 <section anchor="ibf_m" numbered="true" toc="default"> 441 <name>Mapping Function</name> 442 <t> 443 The mapping function M decides which buckets a given ID is mapped to. 444 For an IBF, it is beneficial to use an injective mapping function M. 445 </t> 446 <t> 447 The first index is simply the CRC32 of the ID modulo the IBF size. The second 448 index is calculated by creating a new 64-bit value by shifting the previous 32-bit 449 value left and setting the lower 32 bits to the number of indices already processed. 450 From the resulting 64-bit value, another CRC32 checksum is computed. 451 The subsequent index is the modulo of this CRC32 output. 452 The process is repeated until the desired number of indices is generated. 453 In the case the process computes the same index twice, 454 which would mean this bucket could not get pure again, 455 the second hit is just skipped and the next iteration is used instead, 456 creating an injective mapping function. 457 </t> 458 <figure anchor="ibf_format_bucket_identification_pseudo_code"> 459 <artwork name="" type="" align="left" alt=""><![CDATA[ 460 # INPUTS: 461 # key: the ID of the element calculated 462 # k: numbers of buckets per element 463 # L: total number of buckets in the IBF 464 # OUTPUT: 465 # dst: Array with k bucket IDs 466 FUNCTION get_bucket_id (key, k, L) 467 bucket = CRC32(key) 468 i = 0 // unsigned 32-bit index 469 filled = 0 470 WHILE filled < k DO 471 472 element_already_in_bucket = false 473 j = 0 474 WHILE j < filled DO 475 IF dst[j] == bucket modulo L THEN 476 element_already_in_bucket = true 477 END IF 478 j++ 479 END WHILE 480 481 IF !element_already_in_bucket THEN 482 dst[filled] = bucket modulo L 483 filled = filled + 1 484 END IF 485 486 x = (bucket << 32) | i // 64 bit result 487 bucket = CRC32(x) 488 i = i + 1 489 END WHILE 490 return dst 491 END FUNCTION 492 ]]></artwork> 493 </figure> 494 </section> 495 496 <section anchor="ibf_operations" numbered="true" toc="default"> 497 <name>Operations</name> 498 <t> 499 When an IBF is created, all counters and IDSUM and HASHSUM values of 500 all buckets are initialized to zero. 501 </t> 502 <section anchor="ibv_operations_insert" numbered="true" toc="default"> 503 <name>Insert Element</name> 504 <t> 505 To add an element to an IBF, the element is mapped to a subset of k buckets using 506 the injective mapping function M as described in section <xref target="ibf_m" format="title" />. For the buckets selected by the mapping function, the counter is increased by one and the 507 IDSUM field is set to the XOR of the element ID 508 computed as described in section <xref target="ibf_format_id_generation" format="title" /> 509 and the previously stored IDSUM. Furthermore, 510 the HASHSUM is set to the XOR of the previously stored HASHSUM 511 and the hash of the element ID computed as described 512 in section <xref target="ibf_format_HASH_calculation" format="title" />. 513 </t> 514 <t> 515 In the following example, the insert operation is illustrated using an element with the 516 ID 0x0102 mapped to {1,3} with a hash of 0x4242, and a second element with the 517 ID 0x0304 mapped to {0,1} and a hash of 0x0101. 518 </t> 519 <t>Empty IBF:</t> 520 <figure anchor="figure_ibf_insert_0"> 521 <artwork name="" type="" align="left" alt=""><![CDATA[ 522 bucket-0 bucket-1 bucket-2 bucket-3 523 +-------------+-------------+-------------+-------------+ 524 count | 0 | 0 | 0 | 0 | 525 +-------------+-------------+-------------+-------------+ 526 idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 527 +-------------+-------------+-------------+-------------+ 528 hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 529 +-------------+-------------+-------------+-------------+ 530 ]]></artwork> 531 </figure> 532 <t>Insert first element with ID 0x0102 and hash 0x4242 into {1,3}:</t> 533 <figure anchor="figure_ibf_insert_1"> 534 <artwork name="" type="" align="left" alt=""><![CDATA[ 535 bucket-0 bucket-1 bucket-2 bucket-3 536 +-------------+-------------+-------------+-------------+ 537 count | 0 | 1 | 0 | 1 | 538 +-------------+-------------+-------------+-------------+ 539 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | 540 +-------------+-------------+-------------+-------------+ 541 hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | 542 +-------------+-------------+-------------+-------------+ 543 ]]></artwork> 544 </figure> 545 <t>Insert second element with ID 0x0304 and hash 0101 into {0,1}:</t> 546 <figure anchor="figure_ibf_insert_2"> 547 <artwork name="" type="" align="left" alt=""><![CDATA[ 548 bucket-0 bucket-1 bucket-2 bucket-3 549 +-------------+-------------+-------------+-------------+ 550 count | 1 | 2 | 0 | 1 | 551 +-------------+-------------+-------------+-------------+ 552 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 553 +-------------+-------------+-------------+-------------+ 554 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 555 +-------------+-------------+-------------+-------------+ 556 ]]></artwork> 557 </figure> 558 </section> 559 <section anchor="ibf_operations_remove" numbered="true" toc="default"> 560 <name>Remove Element</name> 561 <t> 562 To remove an element from the IBF the element is again mapped to a subset of the buckets using M. 563 Then all the counters of the buckets selected by M are reduced by one, the IDSUM is 564 replaced by the XOR of the old IDSUM and the ID of the element being removed, and the 565 HASHSUM is similarly replaced with the XOR of the old HASHSUM and the hash of the ID. 566 </t> 567 <t> 568 In the following example the remove operation is illustrated. 569 </t> 570 <t>IBF with two encoded elements:</t> 571 <figure anchor="figure_ibf_remove_0"> 572 <artwork name="" type="" align="left" alt=""><![CDATA[ 573 bucket-0 bucket-1 bucket-2 bucket-3 574 +-------------+-------------+-------------+-------------+ 575 count | 1 | 2 | 0 | 1 | 576 +-------------+-------------+-------------+-------------+ 577 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 578 +-------------+-------------+-------------+-------------+ 579 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 580 +-------------+-------------+-------------+-------------+ 581 ]]></artwork> 582 </figure> 583 <t>After removal of element with ID 0x0304 and hash 0x0101 mapped to {0,1} from the IBF:</t> 584 <figure anchor="figure_ibf_remove_1"> 585 <artwork name="" type="" align="left" alt=""><![CDATA[ 586 bucket-0 bucket-1 bucket-2 bucket-3 587 +-------------+-------------+-------------+-------------+ 588 count | 0 | 1 | 0 | 1 | 589 +-------------+-------------+-------------+-------------+ 590 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | 591 +-------------+-------------+-------------+-------------+ 592 hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | 593 +-------------+-------------+-------------+-------------+ 594 ]]></artwork> 595 </figure> 596 <t> 597 Note that it is possible to "remove" elements from an IBF that were never present 598 in the IBF in the first place. A negative counter value is thus indicative of 599 elements that were removed without having been added. Note that an IBF bucket 600 counter of zero no longer guarantees that an element mapped to that bucket is not 601 present in the set: a bucket with a counter of zero can be the result of one 602 element being added and a different element (mapped to the same bucket) being removed. 603 To check that an element is not present requires a counter of zero and an 604 IDSUM and HASHSUM of zero --- and some certainty that there was no collision due 605 to the limited number of bits in IDSUM and HASHSUM. Thus, 606 IBFs are not suitable to replace BFs or IBFs. 607 </t> 608 <t> 609 Buckets in an IBF with a counter of 1 or -1 are crucial for decoding an IBF, as 610 they MIGHT represent only a single element, with the IDSUM being the ID of that element. 611 Following Eppstein <xref target="Eppstein" format="default" />, we will call buckets that only represent a single 612 element <em>pure buckets</em>. 613 Note that due to the possibility of multiple insertion and removal operations 614 affecting the same bucket, not all buckets with a counter of 1 or -1 are 615 actually pure buckets. Sometimes a counter can be 1 or -1 because N elements 616 mapped to that bucket were added while N-1 or N+1 different elements also 617 mapped to that bucket were removed. 618 </t> 619 </section> 620 621 <section anchor="ibf_operations_decode" numbered="true" toc="default"> 622 <name>Extracting elements</name> 623 <t> 624 Extracting elements from an IBF yields IDs of elements from the IBF. 625 Elements are extracted from an IBF by repeatedly performing a 626 decode operation on the IBF. 627 </t> 628 <t> 629 A decode operation requires a pure bucket, that is a bucket to which M 630 only mapped a single element, to succeed. Thus, if there is no bucket with 631 a counter of 1 or -1, decoding fails. However, as a counter of 1 or -1 is 632 not a guarantee that the bucket is pure, there is also a chance that the 633 decoder returns an IDSUM value that is actually the XOR of several IDSUMs. 634 This is primarily detected by checking that the HASHSUM is the hash of the IDSUM. 635 Only if the HASHSUM also matches, the bucket could be pure. Additionally, 636 one MUST check that the IDSUM value actually would be mapped by M to 637 the respective bucket. If not, there was a hash collision and the bucket 638 is also not pure. 639 </t> 640 <t> 641 The very rare case that after all these checks a bucket is still 642 falsely identified as pure MUST be detected (say by determining that 643 extracted element IDs do not match any actual elements), and addressed 644 at a higher level in the protocol. As these failures are probabilistic 645 and depend on element IDs and the IBF construction, they can typically 646 be avoided by retrying with different parameters, such as a different 647 way to assign element IDs to elements (by varying the IBF-salt), 648 using a larger value for L, or a different mapping function M. 649 A more common scenario (especially if L was too small) is that 650 IBF decoding fails because there is no pure bucket. In this case, the 651 higher-level protocol generally MUST also retry using different 652 parameters (except if an attack is detected). 653 </t> 654 <t> 655 Suppose the IBF contains a pure bucket. In this case, the IDSUM in the 656 bucket is the ID of an element. Furthermore, it is then possible 657 to remove that element from the IBF (by inserting it if the counter 658 was negative, and by removing it if the counter was positive). This 659 is likely to cause other buckets to become pure, allowing further 660 elements to be decoded. Eventually, decoding ought to finish with 661 all counters and IDSUM and HASHSUM values reach zero. However, it is also 662 possible that an IBF only partly decodes and then decoding fails due 663 to the lack of pure buckets after extracting some element IDs. 664 </t> 665 <t> 666 In the following example the successful decoding of an IBF containing 667 the two elements previously added in our running example. 668 </t> 669 <t> 670 We begin with an IBF with two elements added: 671 </t> 672 <figure anchor="figure_ibf_decode_0"> 673 <artwork name="" type="" align="left" alt=""><![CDATA[ 674 bucket-0 bucket-1 bucket-2 bucket-3 675 +-------------+-------------+-------------+-------------+ 676 count | 1 | 2 | 0 | 1 | 677 +-------------+-------------+-------------+-------------+ 678 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 679 +-------------+-------------+-------------+-------------+ 680 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 681 +-------------+-------------+-------------+-------------+ 682 ]]></artwork> 683 </figure> 684 <t> 685 In the IBF are two pure buckets to decode (buckets 0 and 3) we choose to start with 686 decoding bucket 0. This yields the element with the hash ID 0x0304 and 687 hash 1010. This element ID is mapped to buckets 688 {0,1}. 689 Subtracting this element results in bucket 1 becoming pure: 690 </t> 691 <figure anchor="figure_ibf_decode_1"> 692 <artwork name="" type="" align="left" alt=""><![CDATA[ 693 bucket-0 bucket-1 bucket-2 bucket-3 694 +-------------+-------------+-------------+-------------+ 695 count | 0 | 1 | 0 | 1 | 696 +-------------+-------------+-------------+-------------+ 697 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | 698 +-------------+-------------+-------------+-------------+ 699 hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | 700 +-------------+-------------+-------------+-------------+ 701 ]]></artwork> 702 </figure> 703 <t> 704 We can now decoding bucket 2 and extract the element 705 with the ID 0x0102 and hash 0x4242. Now the IBF is 706 empty. Extraction completes with the status that 707 the IBF has been successfully decoded. 708 </t> 709 <figure anchor="figure_ibf_decode_2"> 710 <artwork name="" type="" align="left" alt=""><![CDATA[ 711 bucket-0 bucket-1 bucket-2 bucket-3 712 +-------------+-------------+-------------+-------------+ 713 count | 0 | 0 | 0 | 0 | 714 +-------------+-------------+-------------+-------------+ 715 idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 716 +-------------+-------------+-------------+-------------+ 717 hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 718 +-------------+-------------+-------------+-------------+ 719 ]]></artwork> 720 </figure> 721 </section> 722 723 <section anchor="ibv_operations_setdiff" numbered="true" toc="default"> 724 <name>Set Difference</name> 725 <t> 726 Given addition and removal as defined above, it is possible to define an operation on IBFs 727 that computes an IBF representing the set difference. Suppose IBF1 represents set A, and 728 IBF2 represents set B. Then this set difference operation will compute IBF3 which 729 represents the set A - B. Note that this computation can be done only on the IBFs, 730 and does not require access to the elements from set A or B. 731 732 To calculate the IBF representing this set difference, both IBFs MUST have the same 733 length L, the same number of buckets per element k and use the same map M. 734 Naturally, all IDs must have been computed using the same IBF-salt. Given this, 735 one can compute the IBF representing the set difference by taking the XOR of the IDSUM and HASHSUM values 736 of the respective buckets and subtracting the respective counters. Care MUST be taken 737 to handle overflows and underflows by setting the counter to "infinity" as necessary. 738 The result is a new IBF with the same number of buckets representing the set difference. 739 </t> 740 <t> 741 This new IBF can be decoded as described in section <xref target="ibf_operations_decode" format="counter" />. 742 The new IBF can have two types of pure buckets with counter set to 1 or -1. If the counter is set to 1 743 the element is missing in the secondary set, and if the counter is set to -1 the element is missing in 744 the primary set. 745 </t> 746 <t> 747 To demonstrate the set difference operation we compare IBF-A with IBF-B and generate as described 748 IBF-AB 749 </t> 750 <t>IBF-A contains the elements with ID 0x0304 and hash 0x0101 mapped to {0,1}, 751 and ID 0x0102 and hash 0x4242 mapped to {1,3}:</t> 752 <figure anchor="figure_ibf_setdiff_A"> 753 <artwork name="" type="" align="left" alt=""><![CDATA[ 754 bucket-0 bucket-1 bucket-2 bucket-3 755 +-------------+-------------+-------------+-------------+ 756 count | 1 | 2 | 0 | 1 | 757 +-------------+-------------+-------------+-------------+ 758 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 759 +-------------+-------------+-------------+-------------+ 760 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 761 +-------------+-------------+-------------+-------------+ 762 ]]></artwork> 763 </figure> 764 <t>IBF-B also contains the element with ID 0x0102 and 765 and another element with ID 0x1345 and hash 0x5050 766 mapped to {1,2}.</t> 767 <figure anchor="figure_ibf_setdiff_B"> 768 <artwork name="" type="" align="left" alt=""><![CDATA[ 769 bucket-0 bucket-1 bucket-2 bucket-3 770 +-------------+-------------+-------------+-------------+ 771 count | 0 | 1 | 1 | 1 | 772 +-------------+-------------+-------------+-------------+ 773 idSum | 0x0000 | 0x1447 | 0x1345 | 0x0102 | 774 +-------------+-------------+-------------+-------------+ 775 hashSum | 0x0000 | 0x9292 | 0x5050 | 0x4242 | 776 +-------------+-------------+-------------+-------------+ 777 ]]></artwork> 778 </figure> 779 <t>IBF-A minus IBF-B is then:</t> 780 <figure anchor="figure_ibf_setdiff_AB"> 781 <artwork name="" type="" align="left" alt=""><![CDATA[ 782 bucket-0 bucket-1 bucket-2 bucket-3 783 +-------------+-------------+-------------+-------------+ 784 count | 1 | 0 | -1 | 0 | 785 +-------------+-------------+-------------+-------------+ 786 idSum | 0x0304 | 0x1049 | 0x1345 | 0x0000 | 787 +-------------+-------------+-------------+-------------+ 788 hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | 789 +-------------+-------------+-------------+-------------+ 790 ]]></artwork> 791 </figure> 792 <t>After calculating and decoding the IBF-AB shows clear that in IBF-A the element with the hash 0x5050 793 is missing (-1 in bucket 2) while in IBF-B the element with the hash 0101 is missing 794 (1 in bucket 0). The element with hash 0x4242 is present in IBF-A and IBF-B and is 795 removed by the set difference operation. Bucket 2 is not empty. 796 </t> 797 </section> 798 </section> 799 800 <section anchor="ibf_format" numbered="true" toc="default"> 801 <name>Wire format</name> 802 <t> 803 For the counter field, we use a variable-size encoding to ensure 804 that even for very large sets the counter should never reach 805 "infinity", while also ensuring that the encoding is compact for 806 small sets. 807 Hence, the counter size transmitted over the wire 808 varies between 1 and 64 bits, depending on the 809 maximum counter in the IBF. A range of 1 to 64 bits 810 should cover most areas of application and can be 811 efficiently implemented on most contemporary CPU 812 architectures and programming languages. 813 The bit length for the transmitted IBF 814 will be communicated in the header of the 815 <em><xref target="messages_ibf" format="title" /></em> message 816 in the "IMCS" field as unsigned 8-bit integer. 817 For implementation details see section <xref target="performance_counter_variable_size" format="title" />. 818 </t> 819 <t> 820 For the "IDSUM", we always use a 64-bit representation. 821 The IDSUM value MUST have sufficient entropy for the 822 mapping function M to yield reasonably random buckets 823 even for very large values of L. With a 32 bit 824 value the chance that multiple elements may be mapped 825 to the same ID would be quite high, even for moderately 826 large sets. Using more than 64 bits would at best make 827 sense for very large sets, but then it is likely always 828 better to simply afford additional round trips to handle 829 the occasional collision. 64 bits are also a reasonable 830 size for many CPU architectures. 831 </t> 832 <t> 833 For the "HASHSUM", we always use a 32-bit 834 representation. Here, it is most important to 835 avoid collisions, where different elements are 836 mapped to the same hash, possibly resulting in 837 a bucket being falsely classified as pure. 838 While with 32 bits there remains a non-negligible chance of 839 accidental collisions, our protocol is designed 840 to handle occasional collisions. Hence, at 32 bit the chance is 841 believed to be sufficiently small enough 842 for the protocol to handle those cases efficiently. Smaller hash 843 values would safe bandwidth, but also substantially 844 increase the chance of collisions. 32 bits are 845 also again a reasonable size for many CPU 846 architectures. 847 </t> 848 </section> 849 </section> 850 851 <section anchor="se" numbered="true" toc="default"> 852 <name>Strata Estimator</name> 853 <t> 854 Strata Estimators help estimate the size of the set difference between two sets of elements. 855 This is necessary to efficiently determinate the tuning parameters for an IBF, in particular 856 a good value for L. 857 </t> 858 <t> 859 Basically a Strata Estimator (SE) is a series of IBFs (with a rather small value of L=79) 860 in which increasingly large subsets of the full set 861 of elements are added to each IBF. For the n-th IBF, the function selecting the 862 subset of elements MUST sample to select (probabilistically) 1/(2^n) of all 863 elements. This can be done by counting the number of trailing bits set to "1" 864 in an element ID, and then inserting the element into the IBF identified by that 865 counter. As a result, all elements will be mapped to one IBF, with the n-th 866 IBF being statistically expected to contain 1/(2^n) elements. 867 </t> 868 <t> 869 Given two SEs, the set size difference can be estimated by attempting to decode all of the 870 IBFs. Given that L is set to a fixed and rather small value, IBFs containing large strata 871 will likely fail to decode. For IBFs that failed to decode, one simply 872 extrapolates the number of elements by scaling the numbers obtained from the 873 other IBFs that did decode. If none of the IBFs of the SE decoded (which given 874 a reasonable number of IBFs in the SE should be highly unlikely), one can theoretically 875 retry using a different IBF-salt. 876 </t> 877 <t> 878 When decoding the IBFs in the strata estimator, it is possible to determine 879 on which side which part of the difference is. For this purpose, the pure buckets with 880 counter 1 and -1 must be distinguished and assigned to the respective side when decoding 881 the IBFs. 882 </t> 883 </section> 884 885 <section anchor="modeofoperation" numbered="true" toc="default"> 886 <name>Mode of Operation</name> 887 <t> 888 Depending on the state of the two sets the set union 889 protocol uses different modes of operation to efficiently 890 determinate missing elements between the two sets. 891 </t> 892 <t> 893 The simplest mode is the <em>full synchronisation mode</em>. 894 If the difference between the sets of the two 895 peers exceeds a certain threshold, the overhead to determine 896 which elements are different would outweigh the overhead of 897 simply sending the complete set. Hence, the protocol may 898 determine that the most efficient method is to exchange the full sets. 899 </t> 900 <t> 901 The second possibility is that the difference between the sets 902 is relatively small compared to the set size. 903 In this case, the <em>differential synchronisation mode</em> is more efficient. 904 Given these two possibilities, the first steps of the protocol are used to 905 determine which mode MUST be used. 906 </t> 907 <t> 908 Thus, the set union protocol always begins with the following operation mode independent steps: 909 </t> 910 911 <t> 912 The initiating peer begins in the <strong>Initiating Connection</strong> state and the receiving peer in the <strong>Expecting Connection</strong> 913 state. The first step for the initiating peer in the protocol is to send an <em><xref target="messages_operation_request" format="title" /></em> to the receiving peer and 914 transition into the <strong>Expect SE</strong> state. After receiving the <em><xref target="messages_operation_request" format="title" /></em> the receiving peer 915 transitions to the <strong>Expecting IBF</strong> state and answers with the 916 <em><xref target="messages_se" format="title" /></em> message. When the initiating peer receives the <em><xref target="messages_se" format="title" /></em> message, 917 it decides with some heuristics which operation mode is likely more suitable for the estimated set difference and the application-provided latency-bandwidth tradeoff. 918 The detailed algorithm used to choose between the <xref target="modeofoperation_full-sync" format="title" /> and the <xref target="modeofoperation_individual-elements" format="title" /> 919 is explained in the section <xref target="modeofoperation_combined-mode" format="title" /> below. 920 </t> 921 <section anchor="modeofoperation_full-sync" numbered="true" toc="default"> 922 <name>Full Synchronisation Mode</name> 923 <t> 924 When the initiating peer decides to use the full synchronisation mode and it is better that the other peer sends his set first, the initiating 925 peer sends a <em><xref target="messages_request_full" format="title" /></em> message, and transitions from <strong>Expecting SE</strong> to the <strong>Full Receiving</strong> state. 926 If it has been determined that it is better that the initiating peer sends his set first, the initiating peer sends a <em><xref target="messages_send_full" format="title" /></em> message followed by all 927 set elements in <em><xref target="messages_full_element" format="title" /></em> messages to the other peer, followed by the <em><xref target="messages_full_done" format="title" /></em> message, and transitions into the <strong>Full Sending</strong> state. 928 </t> 929 <t> 930 A state diagram illustrating the state machine used during full synchronization 931 is provided 932 <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/state_machine_full.png">here</eref>. 933 </t> 934 <t><strong>The behavior of the participants the different state is described below:</strong></t> 935 <dl> 936 <dt><strong>Expecting IBF:</strong></dt> 937 <dd> 938 If a peer in the <strong>Expecting IBF</strong> state receives a <em><xref target="messages_request_full" format="title" /></em> message from the other peer, the 939 peer sends all the elements of his set followed by a <em><xref target="messages_full_done" format="title" /></em> message to the other peer, and transitions to the 940 <strong>Full Sending</strong> state. If the peer receives an <em><xref target="messages_send_full" format="title" /></em> message followed by 941 <em><xref target="messages_full_element" format="title" /></em> messages, the peer processes the element and transitions to the <strong>Full Receiving</strong> state. 942 </dd> 943 <dt><strong>Full Sending:</strong></dt> 944 <dd> 945 While a peer is in <strong>Full Sending</strong> state the peer expects to continuously receive elements from the other 946 peer. As soon as a the <em><xref target="messages_full_done" format="title" /></em> message is received, the peer transitions into 947 the <strong>Finished</strong> state. 948 </dd> 949 <dt><strong>Full Receiving: </strong></dt> 950 <dd> 951 While a peer is in the <strong>Full Receiving</strong> state, it expects to continuously receive elements from the other 952 peer. As soon as a the <em><xref target="messages_full_done" format="title" /></em> message is received, it sends 953 the remaining elements (those it did not receive) from his set to the other 954 peer, followed by a <em><xref target="messages_full_done" format="title" /></em>. 955 After sending the last message, the peer transitions into the <strong>Finished</strong> state. 956 </dd> 957 </dl> 958 </section> 959 <section anchor="modeofoperation_individual-elements" numbered="true" toc="default"> 960 <name>Differential Synchronisation Mode</name> 961 <t> 962 The message format used by the protocol limits the maximum message size to 963 64 kb. Given that L can be large, an IBF will not always fit within that 964 size limit. To deal with this, larger IBFs are split into multiple messages. 965 </t> 966 <t> 967 When the initiating peer in the <strong>Expected SE</strong> state decides to use the differential synchronisation mode, it 968 sends an IBF, which may 969 consist of several <em><xref target="messages_ibf" format="title" /></em> messages, 970 to the receiving peer and transitions into the <strong>Passive Decoding</strong> state. 971 </t> 972 <t> 973 The receiving peer in the <strong>Expecting IBF</strong> state receives the 974 first <em><xref target="messages_ibf" format="title" /></em> message from 975 the initiating peer, and transitions into the <strong>Expecting IBF Last</strong> state 976 if the IBF was split into multiple <em><xref target="messages_ibf" format="title" /></em> 977 messages. If there is just a single <em><xref target="messages_ibf" format="title" /></em> 978 message, the receiving peer 979 transitions directly to the <strong>Active Decoding</strong> state. 980 </t> 981 <t> 982 The peer that is in the <strong>Active Decoding</strong>, <strong>Finish Closing</strong> or in the <strong>Expecting IBF Last</strong> 983 state is called the active peer, and the peer that is in either the <strong>Passive Decoding</strong> or the <strong>Finish Waiting</strong> state 984 is called the passive peer. 985 </t> 986 <t> 987 A state diagram illustrating the state machine used during differential synchronization 988 is provided 989 <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/differential_state_machine.png">here</eref>. 990 </t> 991 <t><strong>The behavior of the participants the different states is described below:</strong></t> 992 <dl> 993 <dt><strong>Passive Decoding:</strong></dt> 994 <dd> 995 <t> 996 In the <strong>Passive Decoding</strong> state the passive peer reacts to requests from the active peer. 997 The action the passive peer executes depends on the message the passive peer receives in the <strong>Passive Decoding</strong> state from the active peer 998 and is described below on a per message basis. 999 </t> 1000 1001 <dl> 1002 <dt><em><xref target="messages_inquiry" format="title" /></em> message:</dt> 1003 <dd> 1004 The <em><xref target="messages_inquiry" format="title" /></em> message 1005 is received if the active peer requests the SHA-512 hash of one or more elements (by sending the 64 bit element ID) 1006 that are missing from the active peer's set. 1007 In this case the passive peer answers with <em><xref target="messages_offer" format="title" /></em> messages 1008 which contain the SHA-512 hash of the requested element. If the passive peer does not have an element with 1009 a matching element ID, it MUST ignore the inquiry (in this case, a bucket was falsely classified as pure, decoding the IBF will eventually fail, and roles will be swapped). 1010 It should be verified that after an falsely classified pure bucket a role change is made. 1011 If multiple elements match the 64 bit element ID, the passive 1012 peer MUST send offers for all of the matching elements. 1013 </dd> 1014 <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> 1015 <dd> 1016 The <em><xref target="messages_demand" format="title" /></em> message 1017 is received if the active peer requests a complete element that is missing in the active peers set in response to an offer. If the requested element is known and has not yet been transmitted 1018 the passive peer answers with an <em><xref target="messages_elements" format="title" /></em> message which contains the full, 1019 application-dependent data of the requested element. If the passive peer receives a demand for a SHA-512 hash for which 1020 it has no corresponding element, a protocol violation is detected and the protocol MUST be aborted. 1021 Implementations MUST also abort when facing demands without previous matching offers or for which the passive peer previously transmitted the element to the active peer. 1022 </dd> 1023 <dt><em><xref target="messages_offer" format="title" /></em> message:</dt> 1024 <dd> 1025 The <em><xref target="messages_offer" format="title" /></em> message 1026 is received if the active peer has decoded an element that is present in the active peers set and is likely be missing in the 1027 set of the passive peer. If the SHA-512 hash of the offer is indeed not a hash of any of the elements from the set of 1028 the passive peer, the passive peer MUST answer with a <em><xref target="messages_demand" format="title" /></em> message 1029 for that SHA-512 hash and remember that it issued this demand. The demand thus needs to be added to a list with unsatisfied demands. 1030 </dd> 1031 <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> 1032 <dd> 1033 When a new <em><xref target="messages_elements" format="title" /></em> message has been received the peer checks if a corresponding 1034 <em><xref target="messages_demand" format="title" /></em> for the element has been sent 1035 and the demand is still unsatisfied. 1036 If the element has been demanded the peer checks the element for validity, removes it from the list 1037 of pending demands and then saves the element to the set. Otherwise the peer 1038 ignores the element. 1039 </dd> 1040 <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt> 1041 <dd> 1042 If an <em><xref target="messages_ibf" format="title" /></em> message is received, this 1043 indicates that decoding of the IBF on the active site has failed and roles will be swapped. 1044 The receiving passive peer transitions into the <strong>Expecting IBF Last</strong> state, 1045 and waits for more <em><xref target="messages_ibf" format="title" /></em> messages. 1046 There, once the final <em><xref target="messages_ibf_last" format="title" /></em> message has been received, it transitions to <strong>Active Decoding</strong>. 1047 </dd> 1048 <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt> 1049 <dd> 1050 If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this 1051 indicates that there is just one IBF slice left and a direct state and role transition from 1052 <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated. 1053 </dd> 1054 <dt><em><xref target="messages_done" format="title" /></em> message:</dt> 1055 <dd> 1056 Receiving the <em><xref target="messages_done" format="title" /></em> message signals 1057 the passive peer that all demands of the active peer have been satisfied. Alas, the 1058 active peer will continue to process demands from the passive peer. 1059 Upon receiving this message, the passive peer transitions into the 1060 <strong>Finish Waiting</strong> state. 1061 </dd> 1062 </dl> 1063 </dd> 1064 <dt><strong>Active Decoding:</strong></dt> 1065 <dd> 1066 <t> 1067 In the <strong>Active Decoding</strong> state the active peer decodes the IBFs and evaluates the set difference 1068 between the active and passive peer. Whenever an element ID is obtained by decoding the IBF, the active peer 1069 sends either an offer or an inquiry to the passive peer, depending on which site the decoded element is missing. 1070 </t> 1071 <t> 1072 If the IBF decodes a positive (1) pure bucket, the element is missing on the passive peers site. 1073 Thus, the active peer sends an <em><xref target="messages_offer" format="title" /></em> to the passive peer. 1074 A negative (-1) pure bucket indicates that an element is missing in the active peers set, so the active peer 1075 sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer. 1076 </t> 1077 <t> 1078 In case the IBF does not successfully decode anymore, the active peer sends a new IBF computed with a different IBF-salt to the passive peer 1079 and changes into <strong>Passive Decoding</strong> state. This initiates a role swap. 1080 To reduce overhead and prevent double transmission of offers and elements, the new IBF is created 1081 on the local set after updating it with the all of the elements that have been successfully demanded. Note that the active peer MUST NOT wait for all active demands to be satisfied, as demands can fail if a bucket was falsely classified as pure. 1082 </t> 1083 <t> 1084 As soon as the active peer successfully finished decoding the IBF, the active peer sends a 1085 <em><xref target="messages_done" format="title" /></em> message to the passive peer. 1086 </t> 1087 <t> 1088 All other actions taken by the active peer depend on the message the active peer receives from 1089 the passive peer. The actions are described below on a per message basis: 1090 </t> 1091 <dl> 1092 <dt><em><xref target="messages_offer" format="title" /></em> message:</dt> 1093 <dd> 1094 The <em><xref target="messages_offer" format="title" /></em> message indicates that the 1095 passive peer received a <em><xref target="messages_inquiry" format="title" /></em> message from 1096 the active peer. If a inquiry has been sent and 1097 the offered element is missing in the active peers set, 1098 the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the 1099 passive peer. The demand needs to be added to a list of unsatisfied demands. 1100 In case the received offer is for an element that is already in the set of the peer, the offer MUST BE ignored. 1101 </dd> 1102 <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> 1103 <dd> 1104 The <em><xref target="messages_demand" format="title" /></em> message indicates that the 1105 passive peer received a <em><xref target="messages_offer" format="title" /></em> from 1106 the active peer. The active peer satisfies the demand of the passive peer by sending an 1107 <em><xref target="messages_elements" format="title" /></em> message if a offer request 1108 for the element was sent earlier. Otherwise, the protocol MUST be aborted, as peers must never send demands for hashes that they have never been offered. 1109 </dd> 1110 <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> 1111 <dd> 1112 If element is received that was not demanded or for which 1113 the application-specific validation logic fails, the protocol 1114 MUST be aborted. Otherwise, the corresponding demand is marked 1115 as satisfied. Note that this applies only to the differential 1116 synchronization mode. In full synchronization, it is perfectly 1117 normal to receive 1118 <xref target="messages_full_element" format="title" /> 1119 messages for elements that were not demanded and that might 1120 even already be known locally. 1121 </dd> 1122 <dt><em><xref target="messages_done" format="title" /></em> message:</dt> 1123 <dd> 1124 Receiving the message <em><xref target="messages_done" format="title" /></em> indicates 1125 that all demands of the passive peer have been satisfied. The active peer then changes into the 1126 <strong>Finish Closing</strong> state. 1127 If the IBF has not finished decoding and the <em><xref target="messages_done" format="title" /></em> 1128 is received, the other peer is not in compliance with the protocol and the protocol MUST be aborted. 1129 </dd> 1130 </dl> 1131 </dd> 1132 <dt><strong>Expecing IBF Last</strong></dt> 1133 <dd> 1134 <t> 1135 In this state the active peer continuously receives <em><xref target="messages_ibf" format="title" /></em> 1136 messages from the passive peer. When the last <em><xref target="messages_ibf_last" format="title" /></em> message is received, 1137 the peer changes into the <strong>Active Decoding</strong> state. 1138 </t> 1139 </dd> 1140 <dt><strong>Finish Closing</strong> / <strong>Finish Waiting</strong></dt> 1141 <dd> 1142 <t> 1143 In this states the peers are waiting for all demands to be satisfied and for the synchronisation 1144 to be completed. When all demands are satisfied the peer changes into <strong>Finished</strong> state. 1145 </t> 1146 </dd> 1147 </dl> 1148 </section> 1149 <section anchor="modeofoperation_combined-mode" numbered="true" toc="default"> 1150 <name>Combined Mode</name> 1151 <t> 1152 In the <em>combined mode</em> the protocol decides between 1153 <xref target="modeofoperation_full-sync" format="title" /> and 1154 the <xref target="modeofoperation_individual-elements" format="title" /> 1155 to minimize resource consumption. Typically, the protocol always runs 1156 in combined mode, but implementations MAY allow applications to force 1157 the use of one of the modes for testing. In this case, applications MUST 1158 ensure that the respective options to force a particular mode are used by 1159 both participants. 1160 </t> 1161 <t> 1162 The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set 1163 differences or if the byte-size of the elements is large. If the set difference is estimated to be large 1164 the <xref target="modeofoperation_full-sync" format="title" /> is 1165 more efficient. The exact heuristics and parameters on which the protocol decides which mode 1166 MUST be used are described in the <xref target="performance" format="title" /> section of this document. 1167 </t> 1168 <t> 1169 There are two main cases when a <xref target="modeofoperation_full-sync" format="title" /> 1170 is always used. 1171 The first case is when one of the peers announces having an empty set. This is announced by setting 1172 the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0. 1173 <!-- FIXME: why not also do this if sending the elements is about as 1174 expensive as sending the SE? Should be a simple calculation. (thesis summermatter: future work) @Christian: 1175 As discussed 14.06 we let this comment in here as it is described in the thesis--> 1176 The second case is if the application requests full synchronisation explicitly. 1177 This is useful for testing and MUST NOT be used in production. 1178 </t> 1179 <t> 1180 The state diagram illustrating the combined mode can be found 1181 <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/full_state_machine.png">here</eref>. 1182 </t> 1183 </section> 1184 </section> 1185 1186 <section anchor="messages" numbered="true" toc="default"> 1187 <name>Messages</name> 1188 <t> 1189 This section describes the various message formats used by the protocol. 1190 </t> 1191 <section anchor="messages_operation_request" numbered="true" toc="default"> 1192 <name>Operation Request</name> 1193 1194 <section anchor="messages_operation_request_description" numbered="true" toc="default"> 1195 <name>Description</name> 1196 <t> 1197 This message is the first message of the protocol and it is sent to signal to the receiving peer 1198 that the initiating peer wants to initialize a new connection. 1199 </t> 1200 <t> 1201 This message is sent in the transition between the <strong>Initiating Connection</strong> state and the <strong>Expect SE</strong> state. 1202 </t> 1203 <t> 1204 If a peer receives this message and is willing to run the protocol, it answers by sending back a <em><xref target="messages_se" format="title" /></em> message. 1205 Otherwise it simply closes the connection. 1206 </t> 1207 </section> 1208 <section anchor="messages_operation_request_structure" numbered="true" toc="default"> 1209 <name>Structure</name> 1210 1211 <figure anchor="figure_operation_request"> 1212 <artwork name="" type="" align="left" alt=""><![CDATA[ 1213 0 8 16 24 32 40 48 56 1214 +-----+-----+-----+-----+-----+-----+-----+-----+ 1215 | MSG SIZE | MSG TYPE | ELEMENT COUNT | 1216 +-----+-----+-----+-----+-----+-----+-----+-----+ 1217 | APX 1218 +-----+-----+-----+-----+-----+-----+-----+-----+ 1219 / APPLICATION DATA / 1220 / / 1221 ]]></artwork> 1222 </figure> 1223 <t>where:</t> 1224 <dl> 1225 <dt>MSG SIZE</dt> 1226 <dd> 1227 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. 1228 </dd> 1229 <dt>MSG TYPE</dt> 1230 <dd> 1231 is the type of SETU_P2P_OPERATION_REQUEST as registered in <xref target="gana" format="title" />, in network byte order. 1232 </dd> 1233 <dt>ELEMENT COUNT</dt> 1234 <dd> 1235 is the number of the elements the requesting party has in its set, as a 32-bit unsigned integer in network byte order. 1236 </dd> 1237 <dt>APX</dt> 1238 <dd> 1239 is a SHA-512 hash that identifies the application. 1240 </dd> 1241 <dt>APPLICATION DATA</dt> 1242 <dd> 1243 is optional, variable-size application specific data that can be used 1244 by the application to decide if it would like to answer the request. 1245 </dd> 1246 </dl> 1247 </section> 1248 </section> 1249 1250 <section anchor="messages_ibf" numbered="true" toc="default"> 1251 <name>IBF</name> 1252 1253 <section anchor="messages_ibf_description" numbered="true" toc="default"> 1254 <name>Description</name> 1255 <t> 1256 The <xref target="messages_ibf" format="title" /> message contains a slice of the IBF. 1257 </t> 1258 <t> 1259 The <em>IBF</em> message is sent at the start of the protocol from the initiating peer in the transaction 1260 between <strong>Expect SE</strong> -> <strong>Expecting IBF Last</strong> or when the IBF does not 1261 decode and there is a role change in the transition between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>. 1262 This message is only sent if there is more than one IBF slice to be sent. If there is just 1263 one slice, then only the <xref target="messages_ibf_last" format="title" /> message is sent. 1264 </t> 1265 </section> 1266 <section anchor="messages_ibf_structure" numbered="true" toc="default"> 1267 <name>Structure</name> 1268 <figure anchor="figure_ibf"> 1269 <artwork name="" type="" align="left" alt=""><![CDATA[ 1270 0 8 16 24 32 40 48 56 1271 +-----+-----+-----+-----+-----+-----+-----+-----+ 1272 | MSG SIZE | MSG TYPE | IBF SIZE | 1273 +-----+-----+-----+-----+-----+-----+-----+-----+ 1274 | OFFSET | SALT | IMCS | 1275 +-----+-----+-----+-----+-----+-----+-----+-----+ 1276 | IBF-SLICE 1277 +-----+-----+-----+-----+-----+-----+-----+-----+ 1278 / / 1279 / / 1280 ]]></artwork> 1281 </figure> 1282 <t>where:</t> 1283 <dl> 1284 <dt>MSG SIZE</dt> 1285 <dd> 1286 is a 16-bit unsigned integer in network byte orderwhichdescribes the message size in bytes with the header included. 1287 </dd> 1288 <dt>MSG TYPE</dt> 1289 <dd> 1290 the type of SETU_P2P_REQUEST_IBF as registered in <xref target="gana" format="title" /> in network byte order. 1291 </dd> 1292 1293 <dt>IBF SIZE</dt> 1294 <dd> 1295 is a 32-bit unsigned integer which signals the total number of buckets in the IBF. The minimal number of buckets is 37. 1296 </dd> 1297 <dt>OFFSET</dt> 1298 <dd> 1299 is a 32-bit unsigned integer which signals the offset of the following IBF slices in the original. 1300 </dd> 1301 <dt>SALT</dt> 1302 <dd> 1303 is a 16-bit unsigned integer that contains the IBF-salt which was used to create the 1304 IBF. 1305 </dd> 1306 <dt>IMCS</dt> 1307 <dd> 1308 is a 16-bit unsigned integer, which describes the number of bits that 1309 are required to store a single counter. This is used for the unpacking function as described 1310 in the <xref target="performance_counter_variable_size" format="title" /> section. 1311 </dd> 1312 <dt>IBF-SLICE</dt> 1313 <dd> 1314 <t> 1315 are variable numbers of slices in an array. A single slice contains multiple 64-bit IDSUMS, 1316 32-bit HASHSUMS and (1-64)-bit COUNTERS of variable size. All values are in the network byte order. The array of IDSUMS is serialized first, followed 1317 by an array of HASHSUMS. Last comes an array of unsigned COUNTERS (details of the COUNTERS encoding are described in section 1318 <xref target="performance_counter_variable_size" format="default"/>). The length of the array is 1319 defined by MIN( SIZE - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as 1320 32768 divided by the BUCKET_SIZE which ranges between 97-bits when counter uses bit 1 (IMCS=1) and 160-bit when counter size uses 64 bit (IMCS=64). 1321 </t> 1322 <t> 1323 To get the IDSUM field, all IDs (computed with the IBF-salt) hitting a bucket under the map M are added up with a binary XOR operation. 1324 See <xref target="ibf_format_id_generation" format="title" /> details about ID generation. 1325 </t> 1326 <t> 1327 The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field: 1328 all HASHes are added up with a binary XOR operation. 1329 The HASH value is calculated as described in detail in section <xref target="ibf_format_HASH_calculation" format="title" />. 1330 </t> 1331 <t> 1332 The algorithm to find the correct bucket in which the ID and the HASH have to be added 1333 is described in detail in section <xref target="ibf_m" format="title" />. 1334 </t> 1335 <t> 1336 Test vectors for an implementation can be found in the <xref target="test_vectors" format="title" /> section 1337 </t> 1338 </dd> 1339 </dl> 1340 <figure anchor="figure_ibf_slice"> 1341 <artwork name="" type="" align="left" alt=""><![CDATA[ 1342 IBF-SLICE 1343 0 8 16 24 32 40 48 56 1344 +-----+-----+-----+-----+-----+-----+-----+-----+ 1345 | IDSUMS | 1346 +-----+-----+-----+-----+-----+-----+-----+-----+ 1347 | IDSUMS | 1348 +-----+-----+-----+-----+-----+-----+-----+-----+ 1349 | HASHSUMS | HASHSUMS | 1350 +-----+-----+-----+-----+-----+-----+-----+-----+ 1351 | COUNTERS* | COUNTERS* | 1352 +-----+-----+-----+-----+-----+-----+-----+-----+ 1353 / / 1354 / / 1355 * Counter size is variable. In this example the IMCS is 32 (4 bytes). 1356 ]]></artwork> 1357 </figure> 1358 </section> 1359 </section> 1360 1361 <section anchor="messages_ibf_last" numbered="true" toc="default"> 1362 <name>IBF Last</name> 1363 1364 <section anchor="messages_ibf_last_description" numbered="true" toc="default"> 1365 <name>Description</name> 1366 <t> 1367 This message indicates to the remote peer that this is the last slice 1368 of the Bloom filter. The receiving peer MUST check that the sizes and 1369 offsets of all received IBF slices add up to the total IBF SIZE that was 1370 given. 1371 </t> 1372 <t> 1373 Receiving this message initiates the state transmissions 1374 <strong>Expecting IBF Last</strong> -> <strong>Active Decoding</strong>, 1375 <strong>Expecting IBF</strong> -> <strong>Active Decoding</strong> and 1376 <strong>Passive Decoding</strong> -> <strong>Active Decoding</strong>. This message 1377 can initiate a peer the roll change from <strong>Active Decoding</strong> to 1378 <strong>Passive Decoding</strong>. 1379 </t> 1380 </section> 1381 <section anchor="messages_ibf_last_structure" numbered="true" toc="default"> 1382 <name>Structure</name> 1383 <t> 1384 The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of 1385 the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE" 1386 which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST". 1387 </t> 1388 </section> 1389 </section> 1390 1391 <section anchor="messages_elements" numbered="true" toc="default"> 1392 <name>Element</name> 1393 1394 <section anchor="messages_elements_description" numbered="true" toc="default"> 1395 <name>Description</name> 1396 <t> 1397 The <em><xref target="messages_elements" format="title" /></em> message contains an element that is synchronized in the <xref target="modeofoperation_individual-elements" format="title" /> 1398 and transmits a full element between the peers. 1399 </t> 1400 <t> 1401 This message is sent in the state <strong>Active Decoding</strong> and <strong>Passive Decoding</strong> 1402 as answer to a <em><xref target="messages_demand" format="title" /></em> message from the remote peer. 1403 The <em><xref target="messages_elements" format="title" /></em> message can also be received in the <strong>Finish Closing</strong> or <strong>Finish Waiting</strong> 1404 state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer. In this 1405 case the peer changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied. 1406 </t> 1407 <t> 1408 This message is exclusively used in the <xref target="modeofoperation_individual-elements" format="title" />. 1409 </t> 1410 </section> 1411 <section anchor="messages_elements_structure" numbered="true" toc="default"> 1412 <name>Structure</name> 1413 <figure anchor="figure_elements"> 1414 <artwork name="" type="" align="left" alt=""><![CDATA[ 1415 0 8 16 24 32 40 48 56 1416 +-----+-----+-----+-----+-----+-----+-----+-----+ 1417 | MSG SIZE | MSG TYPE | E TYPE | PADDING | 1418 +-----+-----+-----+-----+-----+-----+-----+-----+ 1419 | E SIZE | DATA 1420 +-----+-----+-----+-----+-----+-----+-----+-----+ 1421 / / 1422 / / 1423 ]]></artwork> 1424 </figure> 1425 <t>where:</t> 1426 <dl> 1427 <dt>MSG SIZE</dt> 1428 <dd> 1429 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. 1430 </dd> 1431 <dt>MSG TYPE</dt> 1432 <dd> 1433 is SETU_P2P_ELEMENTS as registered in <xref target="gana" format="title" /> in network byte order. 1434 </dd> 1435 <dt>E TYPE</dt> 1436 <dd> 1437 is a 16-bit unsigned integer which defines the element type for 1438 the application. 1439 </dd> 1440 <dt>PADDING</dt> 1441 <dd> 1442 is 16-bit always set to zero. 1443 </dd> 1444 <dt>E SIZE</dt> 1445 <dd> 1446 is a 16-bit unsigned integer that signals the size of the elements data part. 1447 </dd> 1448 <dt>DATA</dt> 1449 <dd> 1450 is a field with variable length that contains the data of the element. 1451 </dd> 1452 </dl> 1453 </section> 1454 </section> 1455 1456 <section anchor="messages_offer" numbered="true" toc="default"> 1457 <name>Offer</name> 1458 1459 <section anchor="messages_offer_description" numbered="true" toc="default"> 1460 <name>Description</name> 1461 <t> 1462 The <em><xref target="messages_offer" format="title" /></em> message is an answer to an <em><xref target="messages_inquiry" format="title" /></em> message 1463 and transmits the full hash of an element that has been requested by the other peer. 1464 This full hash enables the other peer to check if the element is really missing in his set and 1465 eventually sends a <em><xref target="messages_demand" format="title" /></em> message for that element. 1466 </t> 1467 <t> 1468 The offer is sent and received only in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong> 1469 state. 1470 </t> 1471 <t> 1472 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. 1473 </t> 1474 </section> 1475 <section anchor="messages_offer_structure" numbered="true" toc="default"> 1476 <name>Structure</name> 1477 <figure anchor="figure_offer"> 1478 <artwork name="" type="" align="left" alt=""><![CDATA[ 1479 0 8 16 24 32 40 48 56 1480 +-----+-----+-----+-----+-----+-----+-----+-----+ 1481 | MSG SIZE | MSG TYPE | HASH 1 1482 +-----+-----+-----+-----+-----+-----+-----+-----+ 1483 ... ... 1484 +-----+-----+-----+-----+-----+-----+-----+-----+ 1485 HASH 1 | HASH 2 1486 +-----+-----+-----+-----+-----+-----+-----+-----+ 1487 ... ... 1488 +-----+-----+-----+-----+-----+-----+-----+-----+ 1489 HASH 2 | HASH n 1490 +-----+-----+-----+-----+-----+-----+-----+-----+ 1491 / / 1492 / / 1493 ]]></artwork> 1494 </figure> 1495 <t>where:</t> 1496 <dl> 1497 <dt>MSG SIZE</dt> 1498 <dd> 1499 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes header included. 1500 </dd> 1501 <dt>MSG TYPE</dt> 1502 <dd> 1503 is SETU_P2P_OFFER as registered in <xref target="gana" format="title" /> in network byte order. 1504 </dd> 1505 <dt>HASH n</dt> 1506 <dd> 1507 contains n (one or more) successive SHA 512-bit hashes of the elements that are being requested with <em><xref target="messages_inquiry" format="title" /></em> messages. 1508 </dd> 1509 </dl> 1510 </section> 1511 </section> 1512 1513 1514 <section anchor="messages_inquiry" numbered="true" toc="default"> 1515 <name>Inquiry</name> 1516 1517 <section anchor="messages_inquiry_description" numbered="true" toc="default"> 1518 <name>Description</name> 1519 <t> 1520 The <em><xref target="messages_inquiry" format="title" /></em> message is exclusively sent by the active peer in <strong>Active Decoding</strong> state 1521 to request the full hash of an element that is missing in the active peers set. This is normally answered 1522 by the passive peer with <em><xref target="messages_offer" format="title" /></em> message. 1523 </t> 1524 <t> 1525 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. 1526 </t> 1527 </section> 1528 <section anchor="messages_inquiry_structure" numbered="true" toc="default"> 1529 <name>Structure</name> 1530 <figure anchor="figure_inquiry"> 1531 <artwork name="" type="" align="left" alt=""><![CDATA[ 1532 0 8 16 24 32 40 48 56 1533 +-----+-----+-----+-----+-----+-----+-----+-----+ 1534 | MSG SIZE | MSG TYPE | SALT | 1535 +-----+-----+-----+-----+-----+-----+-----+-----+ 1536 | IBF KEY 1 | 1537 +-----+-----+-----+-----+-----+-----+-----+-----+ 1538 | IBF KEY 2 | 1539 +-----+-----+-----+-----+-----+-----+-----+-----+ 1540 ... ... 1541 +-----+-----+-----+-----+-----+-----+-----+-----+ 1542 | IBF KEY n | 1543 +-----+-----+-----+-----+-----+-----+-----+-----+ 1544 / / 1545 / / 1546 ]]></artwork> 1547 </figure> 1548 <t>where:</t> 1549 <dl> 1550 <dt>MSG SIZE</dt> 1551 <dd> 1552 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. 1553 </dd> 1554 <dt>MSG TYPE</dt> 1555 <dd> 1556 is SETU_P2P_INQUIRY as registered in <xref target="gana" format="title" /> in network byte order. 1557 </dd> 1558 <dt>IBF KEY</dt> 1559 <dd> 1560 contains n (one or more) successive ibf keys (64-bit unsigned integer) for which the inquiry is sent. 1561 </dd> 1562 </dl> 1563 </section> 1564 </section> 1565 1566 <section anchor="messages_demand" numbered="true" toc="default"> 1567 <name>Demand</name> 1568 1569 <section anchor="messages_demand_description" numbered="true" toc="default"> 1570 <name>Description</name> 1571 <t> 1572 The <em><xref target="messages_demand" format="title" /></em> message is sent in the <strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong> 1573 state. It is an answer to a received <em><xref target="messages_offer" format="title" /></em> message 1574 and is sent if the element described in the <em><xref target="messages_offer" format="title" /></em> message 1575 is missing in the peers set. In the normal workflow the answer to the <em><xref target="messages_demand" format="title" /></em> message is an 1576 <em><xref target="messages_elements" format="title" /></em> message. 1577 </t> 1578 <t> 1579 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. 1580 </t> 1581 </section> 1582 <section anchor="messages_demand_structure" numbered="true" toc="default"> 1583 <name>Structure</name> 1584 <figure anchor="figure_demand"> 1585 <artwork name="" type="" align="left" alt=""><![CDATA[ 1586 0 8 16 24 32 40 48 56 1587 +-----+-----+-----+-----+-----+-----+-----+-----+ 1588 | MSG SIZE | MSG TYPE | HASH 1 1589 +-----+-----+-----+-----+-----+-----+-----+-----+ 1590 ... ... 1591 +-----+-----+-----+-----+-----+-----+-----+-----+ 1592 HASH 1 | HASH 2 1593 +-----+-----+-----+-----+-----+-----+-----+-----+ 1594 ... ... 1595 +-----+-----+-----+-----+-----+-----+-----+-----+ 1596 HASH 2 | HASH n 1597 +-----+-----+-----+-----+-----+-----+-----+-----+ 1598 / / 1599 / / 1600 ]]></artwork> 1601 </figure> 1602 <t>where:</t> 1603 <dl> 1604 <dt>MSG SIZE</dt> 1605 <dd> 1606 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and the header is included. 1607 </dd> 1608 <dt>MSG TYPE</dt> 1609 <dd> 1610 the type of SETU_P2P_DEMAND as registered in <xref target="gana" format="title" /> in network byte order. 1611 </dd> 1612 <dt>HASH n</dt> 1613 <dd> 1614 contains n (one or more) successive SHA 512-bit hashes of the elements that are being demanded. 1615 </dd> 1616 </dl> 1617 </section> 1618 </section> 1619 1620 <section anchor="messages_done" numbered="true" toc="default"> 1621 <name>Done</name> 1622 1623 <section anchor="messages_done_description" numbered="true" toc="default"> 1624 <name>Description</name> 1625 <t> 1626 The <em><xref target="messages_done" format="title" /></em> message is sent when all <em><xref target="messages_demand" format="title" /></em> messages 1627 have been successfully satisfied and from the perspective of the sender the set is completely synchronized. 1628 </t> 1629 <t> 1630 This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. 1631 </t> 1632 </section> 1633 <section anchor="messages_done_structure" numbered="true" toc="default"> 1634 <name>Structure</name> 1635 <figure anchor="figure_done"> 1636 <artwork name="" type="" align="left" alt=""><![CDATA[ 1637 0 8 16 24 32 40 48 56 1638 +-----+-----+-----+-----+-----+-----+-----+-----+ 1639 | MSG SIZE | MSG TYPE | FINAL CHECKSUM 1640 +-----+-----+-----+-----+-----+-----+-----+-----+ 1641 / / 1642 / / 1643 1644 ]]></artwork> 1645 </figure> 1646 <t>where:</t> 1647 <dl> 1648 <dt>MSG SIZE</dt> 1649 <dd> 1650 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 4 for this message type. 1651 </dd> 1652 <dt>MSG TYPE</dt> 1653 <dd> 1654 is SETU_P2P_DONE as registered in <xref target="gana" format="title" /> in network byte order. 1655 </dd> 1656 <dt>FINAL CHECKSUM</dt> 1657 <dd> 1658 a SHA-512 hash XOR sum of the full set after synchronization. This should ensure that the sets are identical in the end! 1659 </dd> 1660 </dl> 1661 </section> 1662 </section> 1663 1664 <section anchor="messages_full_done" numbered="true" toc="default"> 1665 <name>Full Done</name> 1666 1667 <section anchor="messages_full_done_description" numbered="true" toc="default"> 1668 <name>Description</name> 1669 <t> 1670 The <em><xref target="messages_full_done" format="title" /></em> message is sent in the <xref target="modeofoperation_full-sync" format="title" /> 1671 to signal that all remaining elements of the set have been sent. The message is received and sent in the 1672 <strong>Full Sending</strong> and in the <strong>Full Receiving</strong> state. When the <em><xref target="messages_full_done" format="title" /></em> message is received 1673 in <strong>Full Sending</strong> state the peer changes directly into <strong>Finished</strong> state. In 1674 <strong>Full Receiving</strong> state receiving a <em><xref target="messages_full_done" format="title" /></em> message initiates the sending of 1675 the remaining elements that are missing in the set of the other peer. 1676 </t> 1677 </section> 1678 <section anchor="messages_full_done_structure" numbered="true" toc="default"> 1679 <name>Structure</name> 1680 <figure anchor="figure_full_done"> 1681 <artwork name="" type="" align="left" alt=""><![CDATA[ 1682 0 8 16 24 32 40 48 56 1683 +-----+-----+-----+-----+-----+-----+-----+-----+ 1684 | MSG SIZE | MSG TYPE | FINAL CHECKSUM 1685 +-----+-----+-----+-----+-----+-----+-----+-----+ 1686 / / 1687 / / 1688 ]]></artwork> 1689 </figure> 1690 <t>where:</t> 1691 <dl> 1692 <dt>MSG SIZE</dt> 1693 <dd> 1694 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 4 for this message type. 1695 </dd> 1696 <dt>MSG TYPE</dt> 1697 <dd> 1698 the type of SETU_P2P_FULL_DONE as registered in <xref target="gana" format="title" /> in network byte order. 1699 </dd> 1700 <dt> FINAL CHECKSUM</dt> 1701 <dd> 1702 a SHA-512 hash XOR sum of the full set after synchronization. This should ensure that the sets are identical in the end! 1703 </dd> 1704 </dl> 1705 </section> 1706 </section> 1707 1708 <section anchor="messages_request_full" numbered="true" toc="default"> 1709 <name>Request Full</name> 1710 1711 <section anchor="messages_request_full_description" numbered="true" toc="default"> 1712 <name>Description</name> 1713 <t> 1714 The <em><xref target="messages_request_full" format="title" /></em> message is sent by the initiating peer in <strong>Expect SE</strong> state to the receiving peer, if 1715 the operation mode "<xref target="modeofoperation_full-sync" format="title" />" is 1716 determined to be the superior <xref target="modeofoperation" format="title" /> and that it is the better choice that 1717 the other peer sends his elements first. The initiating peer changes after sending the <em><xref target="messages_request_full" format="title" /></em> message into 1718 <strong>Full Receiving</strong> state. 1719 </t> 1720 <t> 1721 The receiving peer receives the <em><xref target="messages_request_full" format="title" /></em> message in the <strong>Expecting IBF</strong>, afterwards the receiving peer 1722 starts sending his complete set in <xref target="messages_full_element" format="title" /> messages to the initiating peer. 1723 </t> 1724 </section> 1725 <section anchor="messages_request_full_structure" numbered="true" toc="default"> 1726 <name>Structure</name> 1727 <figure anchor="figure_request_full"> 1728 <artwork name="" type="" align="left" alt=""><![CDATA[ 1729 0 8 16 24 32 40 48 56 1730 +-----+-----+-----+-----+-----+-----+-----+-----+ 1731 | MSG SIZE | MSG TYPE | REMOTE SET DIFF | 1732 +-----+-----+-----+-----+-----+-----+-----+-----+ 1733 | REMOTE SET SIZE | LOCAL SET DIFF | 1734 +-----+-----+-----+-----+-----+-----+-----+-----+ 1735 ]]></artwork> 1736 </figure> 1737 <t>where:</t> 1738 <dl> 1739 <dt>MSG SIZE</dt> 1740 <dd> 1741 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 16 for this message type. 1742 </dd> 1743 <dt>MSG TYPE</dt> 1744 <dd> 1745 is SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order. 1746 </dd> 1747 <dt>REMOTE SET DIFF</dt> 1748 <dd> 1749 is a 32-bit unsigned integer in network byte order, which represents the remote (from the perspective of the 1750 sending peer) set difference calculated with strata estimator. 1751 </dd> 1752 <dt>REMOTE SET SIZE</dt> 1753 <dd> 1754 is a 32-bit unsigned integer in network byte order, which represents the total remote 1755 (from the perspective of the sending peer) set size. 1756 </dd> 1757 <dt>LOCAL SET DIFF</dt> 1758 <dd> 1759 is a 32-bit unsigned integer in network byte order, which represents the local 1760 (from the perspective of the sending peer) set difference calculated with strata estimator. 1761 </dd> 1762 </dl> 1763 </section> 1764 </section> 1765 1766 1767 <section anchor="messages_send_full" numbered="true" toc="default"> 1768 <name>Send Full</name> 1769 1770 <section anchor="messages_send_full_description" numbered="true" toc="default"> 1771 <name>Description</name> 1772 <t> 1773 The <em><xref target="messages_send_full" format="title" /></em> message is sent by the initiating peer in <strong>Expect SE</strong> state to the receiving peer if 1774 the operation mode "<xref target="modeofoperation_full-sync" format="title" />" is 1775 determined as superior <xref target="modeofoperation" format="title" /> and that it is the better choice that the 1776 peer sends his elements first. The initiating peer changes after sending the <em><xref target="messages_request_full" format="title" /></em> message into 1777 <strong>Full Sending</strong> state. 1778 </t> 1779 <t> 1780 The receiving peer receives the <em><xref target="messages_send_full" format="title" /></em> message in the <strong>Expecting IBF</strong> state, afterwards the receiving peer 1781 changes into <strong>Full Receiving</strong> state and expects to receive the set of the remote peer. 1782 </t> 1783 </section> 1784 <section anchor="messages_send_full_structure" numbered="true" toc="default"> 1785 <name>Structure</name> 1786 <figure anchor="figure_send_full"> 1787 <artwork name="" type="" align="left" alt=""><![CDATA[ 1788 0 8 16 24 32 40 48 56 1789 +-----+-----+-----+-----+-----+-----+-----+-----+ 1790 | MSG SIZE | MSG TYPE | REMOTE SET DIFF | 1791 +-----+-----+-----+-----+-----+-----+-----+-----+ 1792 | REMOTE SET SIZE | LOCAL SET DIFF | 1793 +-----+-----+-----+-----+-----+-----+-----+-----+ 1794 ]]></artwork> 1795 </figure> 1796 <t>where:</t> 1797 <dl> 1798 <dt>MSG SIZE</dt> 1799 <dd> 1800 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 16 for this message type. 1801 </dd> 1802 <dt>MSG TYPE</dt> 1803 <dd> 1804 is SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order. 1805 </dd> 1806 <dt>REMOTE SET DIFF</dt> 1807 <dd> 1808 is a 32-bit unsigned integer in network byte order, which represents the remote (from the perspective of the sending peer) 1809 set difference calculated with strata estimator. 1810 </dd> 1811 <dt>REMOTE SET SIZE</dt> 1812 <dd> 1813 is a 32-bit unsigned integer in network byte order, which represents the total remote (from the perspective 1814 of the sending peer) set size. 1815 </dd> 1816 <dt>LOCAL SET DIFF</dt> 1817 <dd> 1818 is a 32-bit unsigned integer in network byte order, which represents the local (from the perspective of the sending peer) 1819 set difference calculated with strata estimator. 1820 </dd> 1821 </dl> 1822 </section> 1823 </section> 1824 1825 1826 <section anchor="messages_se" numbered="true" toc="default"> 1827 <name>Strata Estimator</name> 1828 1829 <section anchor="messages_se_description" numbered="true" toc="default"> 1830 <name>Description</name> 1831 <t> 1832 The strata estimator is sent by the receiving peer at the start of the protocol, right after the 1833 <xref target="messages_operation_request" format="title" /> message has been received. 1834 </t> 1835 <t> 1836 The strata estimator is used to estimate the difference between the two sets as described in section <xref target="se" format="title" />. 1837 </t> 1838 <t> 1839 When the initiating peer receives the strata estimator, the peer decides which <xref target="modeofoperation" format="title" /> to use 1840 for the synchronisation. Depending on the size of the set difference and the <xref target="modeofoperation" format="title" /> the initiating peer 1841 changes into <strong>Full Sending</strong>, <strong>Full Receiving</strong> or <strong>Passive Decoding</strong> state. 1842 </t> 1843 <t> 1844 The <em><xref target="messages_se" format="title" /></em> message can contain one, two, four or eight strata estimators with different salts, depending on the initial size of the sets. 1845 More details can be found in section <xref target="performance_multi_se" format="title" />. 1846 </t> 1847 <t> 1848 The IBFs in a strata estimator always have 79 buckets. The reason why can be found in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in section 3.4.2. 1849 </t> 1850 <!-- Give a more precise reference into the thesis for this, do not cite the whole thesis! --> 1851 </section> 1852 <section anchor="messages_se_structure" numbered="true" toc="default"> 1853 <name>Structure</name> 1854 <figure anchor="figure_se"> 1855 <artwork name="" type="" align="left" alt=""><![CDATA[ 1856 0 8 16 24 32 40 48 56 1857 +-----+-----+-----+-----+-----+-----+-----+-----+ 1858 | MSG SIZE | MSG TYPE | SEC | SETSIZE 1859 +-----+-----+-----+-----+-----+-----+-----+-----+ 1860 SETSIZE | SE-SLICES 1861 +-----+-----+-----+-----+-----+-----+-----+-----+ 1862 / / 1863 / / 1864 ]]></artwork> 1865 </figure> 1866 <t>where:</t> 1867 <dl> 1868 <dt>MSG SIZE</dt> 1869 <dd> 1870 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. 1871 </dd> 1872 <dt>MSG TYPE</dt> 1873 <dd> 1874 is SETU_P2P_SE as registered in <xref target="gana" format="title" /> in network byte order. 1875 </dd> 1876 <dt>SEC</dt> 1877 <dd> 1878 is a 8-bit unsigned integer in network byte order, which indicates how many strata estimators 1879 with different salts are attached to the message. Valid values are 1,2,4 or 8, more details can be found 1880 in the section <xref target="performance_multi_se" format="title" />. 1881 </dd> 1882 <dt>SETSIZE</dt> 1883 <dd> 1884 is a 64-bit unsigned integer that is defined by the size of the set the SE is 1885 </dd> 1886 <dt>SE-SLICES</dt> 1887 <dd> 1888 <t> 1889 are variable numbers of slices in an array. A slice can contain one or more Strata Estimators which 1890 contain multiple IBFs as described in IBF-SLICES in <xref target="messages_ibf_structure" format="default"/>. 1891 A SE slice can contain one to eight Strata Estimators which contain 32 (Defined as Constant SE_STRATA_COUNT) IBFs. Every IBF in 1892 a SE contains 79 Buckets. 1893 </t> 1894 <t> 1895 The different SEs are built as in detail described in <xref target="performance_multi_se" format="default"/>. 1896 Simply put, the IBFs in each SE are serialized as described in <xref target="messages_ibf_structure" format="default"/> starting with the highest stratum. 1897 Then the created SEs are appended one after the other starting with the SE that was created with a salt of zero. 1898 1899 </t> 1900 </dd> 1901 </dl> 1902 <figure anchor="figure_se_slice"> 1903 <artwork name="" type="" align="left" alt=""><![CDATA[ 1904 SE-SLICE 1905 0 8 16 24 32 40 48 56 1906 +-----+-----+-----+-----+-----+-----+-----+-----+ 1907 | SE_1 -> IBF_1 1908 +-----+-----+-----+-----+-----+-----+-----+-----+ 1909 ... ... 1910 +-----+-----+-----+-----+-----+-----+-----+-----+ 1911 | SE_1 -> IBF_30 1912 +-----+-----+-----+-----+-----+-----+-----+-----+ 1913 | SE_2 -> IBF_1 1914 +-----+-----+-----+-----+-----+-----+-----+-----+ 1915 ... ... 1916 / / 1917 / / 1918 ]]></artwork> 1919 </figure> 1920 </section> 1921 </section> 1922 1923 <section anchor="messages_sec" numbered="true" toc="default"> 1924 <name>Strata Estimator Compressed</name> 1925 1926 <section anchor="messages_sec_description" numbered="true" toc="default"> 1927 <name>Description</name> 1928 <t> 1929 The Strata Estimator can be compressed with gzip as 1930 described in <xref target="RFC1951"/> to improve performance. This can be recognized 1931 by the different message type number from <xref target="gana" format="title" />. 1932 </t> 1933 <section anchor="messages_sec_structure" numbered="true" toc="default"> 1934 <name>Structure</name> 1935 <t> 1936 The key difference between the compressed and the uncompressed Strata Estimator is that the 1937 SE slices are compressed with gzip (<xref target="RFC1951"/>) in the compressed SE. 1938 But the header remains uncompressed with both. 1939 </t> 1940 <t> 1941 Since the content of the message is the same as the uncompressed Strata Estimator, the details 1942 are not repeated here. For details see section <xref target="messages_se" format="counter" />. 1943 </t> 1944 </section> 1945 </section> 1946 </section> 1947 1948 <section anchor="messages_full_element" numbered="true" toc="default"> 1949 <name>Full Element</name> 1950 1951 <section anchor="messages_full_element_description" numbered="true" toc="default"> 1952 <name>Description</name> 1953 <t> 1954 The <em><xref target="messages_full_element" format="title" /></em> message is the equivalent of the <xref target="messages_elements" format="title" /> message in 1955 the <xref target="modeofoperation_full-sync" format="title" />. It contains a complete element that is missing 1956 in the set of the peer that receives this message. 1957 </t> 1958 <t> 1959 The <em><xref target="messages_full_element" format="title" /></em> message is exclusively sent in the transitions <strong>Expecting IBF</strong> -> <strong>Full Receiving</strong> and 1960 <strong>Full Receiving</strong> -> <strong>Finished</strong>. The message is only received in the <strong> Full Sending</strong> and 1961 <strong>Full Receiving</strong> state. 1962 </t> 1963 <t> 1964 After the last <em><xref target="messages_full_element" format="title" /></em> message has been sent, the <em><xref target="messages_full_done" format="title" /></em> message 1965 is sent to conclude the full synchronisation of the element sending peer. 1966 </t> 1967 </section> 1968 <section anchor="messages_full_element_structure" numbered="true" toc="default"> 1969 <name>Structure</name> 1970 <!-- MAYBE just refer to the "ELEMENT" section on structure and only 1971 note the different MSG TYPE here? --> 1972 <figure anchor="figure_full_element"> 1973 <artwork name="" type="" align="left" alt=""><![CDATA[ 1974 0 8 16 24 32 40 48 56 1975 +-----+-----+-----+-----+-----+-----+-----+-----+ 1976 | MSG SIZE | MSG TYPE | E TYPE | PADDING | 1977 +-----+-----+-----+-----+-----+-----+-----+-----+ 1978 | SIZE | AE TYPE | DATA 1979 +-----+-----+-----+-----+-----+-----+-----+-----+ 1980 / / 1981 / / 1982 ]]></artwork> 1983 </figure> 1984 <t>where:</t> 1985 <dl> 1986 <dt>MSG SIZE</dt> 1987 <dd> 1988 is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. 1989 </dd> 1990 <dt>MSG TYPE</dt> 1991 <dd> 1992 is SETU_P2P_REQUEST_FULL_ELEMENT as registered in <xref target="gana" format="title" /> in network byte order. 1993 </dd> 1994 <dt>E TYPE</dt> 1995 <dd> 1996 is a 16-bit unsigned integer which defines the element type for 1997 the application. 1998 </dd> 1999 <dt>PADDING</dt> 2000 <dd> 2001 is 16-bit always set to zero 2002 </dd> 2003 <dt>E SIZE</dt> 2004 <dd> 2005 is a 16-bit unsigned integer that signals the size of the elements data part. 2006 </dd> 2007 <dt>AE TYPE</dt> 2008 <dd> 2009 is a 16-bit unsigned integer that is needed to identify 2010 the type of element that is in the data field 2011 </dd> 2012 <dt>DATA</dt> 2013 <dd> 2014 is a field with variable length that contains the data of the element. 2015 </dd> 2016 </dl> 2017 </section> 2018 </section> 2019 </section> 2020 2021 <section anchor="performance" numbered="true" toc="default"> 2022 <name>Performance Considerations</name> 2023 <section anchor="performance_formulas" numbered="true" toc="default"> 2024 <name>Formulas</name> 2025 <section anchor="performance_formulas_operationmode" numbered="true" toc="default"> 2026 <name>Operation Mode</name> 2027 <t> 2028 The decision which <xref target="modeofoperation" format="title"/> is used is described by the following code. 2029 More detailed explanations motivating the design can be found in the accompanying thesis in section 4.5.3.<xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> 2030 </t> 2031 <t> 2032 The function takes as input the average element size, the local set size, the remote set size, the set differences as estimated from the strata estimator for both the local and remote sets, 2033 and the bandwidth/roundtrip tradeoff. 2034 The function returns the exact <xref target="modeofoperation" format="title"/> that is predicted to be best as output: FULL_SYNC_REMOTE_SENDING_FIRST 2035 if it is likely cheapest that the other peer transmits his elements first, FULL_SYNC_LOCAL_SENDING_FIRST 2036 if it is likely cheapest that the elements are transmitted to the other peer directly, and 2037 DIFFERENTIAL_SYNC if the differential synchronisation is likely cheapest. 2038 </t> 2039 <t> 2040 The constant IBF_BUCKET_NUMBER_FACTOR is always 2 and IBF_MIN_SIZE is 37. 2041 The method for deriving 2042 this can be found in the IBF parameter study in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in section 4.5.2. 2043 </t> 2044 <figure anchor="performance_formulas_operationmode_code"> 2045 <artwork name="" type="" align="left" alt=""><![CDATA[ 2046 # CONSTANTS: 2047 # IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if 2048 # decoding fails 2049 # RTT_MIN_FULL = 2: Minimal round trips used for full Synchronisation 2050 # IBF_MIN_SIZE = 37: The minimal size of an IBF 2051 # MAX_BUCKETS_PER_MESSAGE: Custom value depending on the underlying 2052 # protocol 2053 # INPUTS: 2054 # avg_es: The average element size 2055 # lss: The initial local set size 2056 # rss: The remote set size 2057 # lsd: the estimated local set difference calculated by the SE 2058 # rsd: the estimated remote set difference calculated by the SE 2059 # rtt: the tradeoff between round trips and bandwidth 2060 # OUTPUT: 2061 # FULL_SYNC_REMOTE_SENDING_FIRST, FULL_SYNC_LOCAL_SENDING_FIRST or 2062 # DIFFERENTIAL_SYNC 2063 2064 FUNCTION decide_operation_mode(avg_es, 2065 lss, 2066 rss, 2067 lsd 2068 rsd, 2069 rtt) 2070 2071 # If a set size is zero always do full sync 2072 IF 0 == rss THEN 2073 RETURN FULL_SYNC_LOCAL_SENDING_FIRST 2074 END IF 2075 IF 0 == lss THEN 2076 RETURN FULL_SYNC_REMOTE_SENDING_FIRST 2077 END IF 2078 2079 # Estimate required transferred bytes when doing a full 2080 # synchronisation and transmitting local set first. 2081 semh = sizeof(ELEMENT_MSG_HEADER) 2082 estimated_total_diff = rsd + lsd 2083 total_elements_local_send = rsd + lss 2084 cost_local_full_sync = avg_es * total_elements_local_send 2085 + total_elements_local_send * semh 2086 + sizeof(FULL_DONE_MSG_HEADER) * 2 2087 + RTT_MIN_FULL * rtt 2088 2089 # Estimate required transferred bytes when doing a full 2090 # synchronisation and transmitting remote set first. 2091 total_elements_remote_send = lsd + rss 2092 cost_remote_full_sync = avg_es * total_elements_remote_send 2093 + total_elements_remote_send * semh 2094 + sizeof(FULL_DONE_MSG_HEADER) * 2 2095 + (RTT_MIN_FULL + 0.5) * rtt 2096 + sizeof(REQUEST_FULL_MSG) 2097 2098 # Estimate required transferred bytes when doing a differential 2099 # synchronisation 2100 2101 # Estimate messages required to transfer IBF 2102 ibf_bucket_count = estimated_total_diff * IBF_BUCKET_NUMBER_FACTOR 2103 IF ibf_bucket_count <= IBF_MIN_SIZE THEN 2104 ibf_bucket_count = IBF_MIN_SIZE 2105 END IF 2106 ibf_message_count = ceil (ibf_bucket_count / MAX_BUCKETS_PER_MESSAGE) 2107 2108 # Estimate average counter length with variable counter 2109 estimated_counter_bits = MIN (2 * LOG2(lss / ibf_bucket_count), 2110 LOG2(lss)) 2111 estimated_counter_bytes = estimated_counter_bits / 8 2112 2113 # Sum up all messages required to do differential synchronisation 2114 ibf_bytes = sizeof(IBF_MESSAGE) * ibf_message_count 2115 + ibf_bucket_count * sizeof(IBF_KEY) 2116 + ibf_bucket_count * sizeof(IBF_KEYHASH) 2117 + ibf_bucket_count * estimated_counter_bytes 2118 # Add 20% overhead to cover IBF retries due to decoding failures 2119 total_ibf_bytes = ibf_bytes * 1.2 2120 2121 # Estimate other message sizes to be transfered in diff sync 2122 # Note that we simplify by adding the header each time; 2123 # if the implementation combines multiple INQUIRY/DEMAND/OFFER 2124 # requests in one message, the bandwidth would be lower. 2125 done_size = sizeof(DONE_HEADER) 2126 element_size = (avg_es + sizeof(ELEMENT_MSG_HEADER)) 2127 * estimated_total_diff 2128 inquery_size = (sizeof(IBF_KEY) + sizeof(INQUERY_MSG_HEADER)) 2129 * estimated_total_diff 2130 demand_size = (sizeof(HASHCODE) + sizeof(DEMAND_MSG_HEADER)) 2131 * estimated_total_diff 2132 offer_size = (sizeof(HASHCODE) + sizeof(OFFER_MSG_HEADER)) 2133 * estimated_total_diff 2134 2135 # Estimate total cost 2136 diff_cost = element_size + done_size + inquery_size 2137 + demand_size + offer_size + total_ibf_bytes 2138 + DIFFERENTIAL_RTT_MEAN * rtt 2139 2140 # Decide for a optimal mode of operation 2141 full_cost_min = MIN (cost_local_full_sync, 2142 cost_remote_full_sync) 2143 IF full_cost_min < diff_cost THEN 2144 IF cost_remote_full_sync > cost_local_full_sync THEN 2145 RETURN FULL_SYNC_LOCAL_SENDING_FIRST 2146 ELSE 2147 RETURN FULL_SYNC_REMOTE_SENDING_FIRST 2148 END IF 2149 ELSE 2150 RETURN DIFFERENTIAL_SYNC 2151 END IF 2152 END FUNCTION 2153 ]]></artwork> 2154 </figure> 2155 </section> 2156 <section anchor="performance_formula_ibf_parameters" numbered="true" toc="default"> 2157 <name>IBF Size</name> 2158 <t> 2159 The functions, described in this section, calculate a good initial size (initial_ibf_size) 2160 and in case of decoding failure, a good next IBF size (get_next_ibf_size). 2161 </t> 2162 <t> 2163 These algorithms are described and justified in more details in 2164 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in the parameter study in 2165 section 3.5.2, the max IBF counter in section 3.10 and the Improved IBF size in section 3.11. 2166 </t> 2167 <figure anchor="performance_formula_ibf_parameters_code"> 2168 <artwork name="" type="" align="left" alt=""><![CDATA[ 2169 # CONSTANTS: 2170 # IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased 2171 if decoding fails 2172 # Inputs: 2173 # sd: Estimated set difference 2174 # Output: 2175 # next_size: Size of the initial IBF 2176 2177 FUNCTION initial_ibf_size(sd) 2178 # We do not go below 37, as 37 buckets should 2179 # basically always be below one MTU, so there is 2180 # little to be gained, while a smaller IBF would 2181 # increase the chance of a decoding failure. 2182 RETURN MAX(37, IBF_BUCKET_NUMBER_FACTOR * sd); 2183 END FUNCTION 2184 2185 # CONSTANTS: 2186 # IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if 2187 # decoding fails 2188 # Inputs: 2189 # de: Number of elements that have been successfully decoded 2190 # lis: The number of buckets of the last IBF 2191 # Output: 2192 # number of buckets for the next IBF 2193 2194 FUNCTION get_next_ibf_size(de, lis) 2195 next_size = IBF_BUCKET_NUMBER_FACTOR * (lis - de) 2196 # The MAX operation here also ensures that the 2197 # result is positive. 2198 RETURN MAX(37, next_size); 2199 END FUNCTION 2200 ]]></artwork> 2201 </figure> 2202 </section> 2203 <section anchor="performance_num_buck_hash" numbered="true" toc="default"> 2204 <name>Number of Buckets an Element is Hashed into</name> 2205 <t> 2206 The number of buckets an element is hashed to is hardcoded to 3. Reasoning and 2207 justification can be found in 2208 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in the 2209 IBF parameter performance study in section 4.5.2. 2210 </t> 2211 </section> 2212 </section> 2213 <section anchor="performance_counter_variable_size" numbered="true" toc="default"> 2214 <name>Variable Counter Size</name> 2215 <t> 2216 The number of bits required to represent the counters of an IBF varies 2217 due to different parameters as described in section 3.2 of 2218 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>. 2219 Therefore, a packing algorithm has been implemented. 2220 This algorithm encodes the IBF counters in their optimal bit-width 2221 and thus minimizes the bandwidth needed to transmit the IBF. 2222 </t> 2223 <t> 2224 A simple algorithm is used for the packing. 2225 In a first step it is determined, which is the largest counter. 2226 The the base 2 logarithm then determines how many bits are needed to store it. 2227 In a second step for every counter of every bucket, the counter 2228 is stored using this many bits. The resulting bit sequence is then simply concatenated. 2229 </t> 2230 <t> 2231 Three individual functions are used for this purpose. 2232 The first one is a function that iterates over each bucket of 2233 the IBF to get the maximum counter in the IBF. The second function 2234 packs the counters of the IBF, and the third function that unpacks the counters. 2235 </t> 2236 <t> 2237 As a plausibly check to prevent the byzantine upper bound 2238 checks in <xref target="security_generic_functions_check_byzantine_boundaries" format="default"/> 2239 to fail, implementations must ensure that the 2240 estimates of the set size difference added together 2241 never exceed the set byzantine upper bound. This 2242 could for example happen in case the strata estimator 2243 overestimates the set difference. 2244 </t> 2245 <figure anchor="performance_counter_variable_size_code"> 2246 <artwork name="" type="" align="left" alt=""><![CDATA[ 2247 2248 # INPUTS: 2249 # ibf: The IBF 2250 # OUTPUTS: 2251 # returns: Minimal amount of bits required to store the counter 2252 2253 FUNCTION ibf_get_max_counter(ibf) 2254 max_counter=1 # convince static analysis that we never take log2(0) 2255 FOR bucket IN ibf DO 2256 IF bucket.counter > max_counter THEN 2257 max_counter = bucket.counter 2258 END IF 2259 END FOR 2260 # next bigger discrete number of the binary logarithm of the 2261 # max counter 2262 RETURN CEILING( LOG2( max_counter ) ) 2263 END FUNCTION 2264 2265 # INPUTS: 2266 # ibf: The IBF 2267 # offset: The offset which defines the starting point from which bucket 2268 # the pack operation starts 2269 # count: The number of buckets in the array that will be packed 2270 # OUTPUTS: 2271 # returns: A byte array of packed counters to send over the network 2272 2273 # INPUTS: 2274 # ibf: The IBF 2275 # offset: The offset which defines the starting point from which bucket 2276 # the pack operation starts 2277 # count: The number of buckets in the array that will be packed 2278 # OUTPUTS: 2279 # returns: A byte array of packed counters to send over the network 2280 2281 FUNCTION pack_counter(ibf, offset, count) 2282 counter_bytes = ibf_get_max_counter(ibf) 2283 store_bits = 0 2284 store = 0 2285 byte_ctr = 0 2286 buf=[] 2287 2288 FOR bucket IN ibf[offset] TO ibf[count] DO 2289 counter = bucket.counter 2290 byte_len = counter_bytes 2291 2292 WHILE byte_len + store_bits < 8 DO 2293 bit_to_shift = 0 2294 2295 IF store_bits > 0 OR byte_len > 8 THEN 2296 bit_free = 8 - store_bits 2297 bit_to_shift = byte_len - bit_free 2298 store = store << bit_free 2299 END IF 2300 buf[byte_ctr] = (( counter >> bit_to_shift) | store) & 0xFF 2301 byte_ctr = byte_ctr + 1 2302 byte_len -= 8 - store_bits 2303 counter = counter & ((1 << byte_len) - 1) 2304 store = 0 2305 store_bits = 0 2306 END WHILE 2307 store = (store << byte_len) | counter 2308 store_bits = store_bits + byte_len 2309 byte_len = 0 2310 END FOR 2311 2312 # Write the last partial packed byte to the buffer 2313 IF store_bits > 0 THEN 2314 buf[byte_ctr] = store << (8 - store_bits) 2315 byte_ctr = byte_ctr + 1 2316 END IF 2317 2318 RETURN buf 2319 FUNCTION END 2320 2321 # INPUTS: 2322 # ibf: The IBF 2323 # offset: The offset which defines the starting point from which bucket 2324 the packed operation starts 2325 # count: The number of buckets in the array that will be packed 2326 # cbl: The bit length of the counter can be found in the 2327 ibf message in the ibf_counter_bit_length field 2328 # pd: A byte array which contains the data packed with the pack_counter 2329 function 2330 # OUTPUTS: 2331 # returns: Nothing because the unpacked counter is saved directly 2332 into the IBF 2333 2334 FUNCTION unpack_counter(ibf, offset, count, cbl, pd) 2335 ibf_bucket_ctr = 0 2336 store = 0 2337 store_bits = 0 2338 byte_ctr = 0 2339 2340 WHILE TRUE 2341 byte_read = pd[byte_ctr] 2342 bit_to_pack_left = 8 2343 byte_ctr++ 2344 2345 WHILE bit_to_pack_left >= 0 DO 2346 2347 # Prevent packet from reading more than required 2348 IF ibf_bucket_ctr > (count - 1) THEN 2349 RETURN 2350 END IF 2351 2352 IF store_bits + bit_to_pack_left >= cbl THEN 2353 bit_use = cbl - store_bits 2354 2355 IF store_bits > 0 THEN 2356 store = store << bit_use 2357 END IF 2358 bytes_to_shift = bit_to_pack_left - bit_use 2359 counter_partial = byte_read >> bytes_to_shift 2360 store = store | counter_partial 2361 ibf.counter[ibf_bucket_ctr + offset] = store 2362 byte_read = byte_read & (( 1 << bytes_to_shift ) - 1) 2363 2364 bit_to_pack_left -= bit_use 2365 ibf_bucket_ctr++ 2366 store = 0 2367 store_bits = 0 2368 ELSE 2369 store_bits = store_bits + bit_to_pack_left 2370 2371 IF 0 == store_bits THEN 2372 store = byte_read 2373 ELSE 2374 store = store << bit_to_pack_left 2375 store = store | byte_read 2376 END IF 2377 BREAK 2378 END IF 2379 END WHILE 2380 END WHILE 2381 END FUNCTION 2382 ]]></artwork> 2383 </figure> 2384 </section> 2385 <section anchor="performance_multi_se" numbered="true" toc="default"> 2386 <name>Multi Strata Estimators</name> 2387 <t> 2388 In order to improve the precision of the estimates not only one strata estimator 2389 is transmitted for larger sets. One, two, four or eight strata estimators can be 2390 transferred. Transmitting multiple strata estimators has the disadvantage that 2391 additional bandwidth will be used, so despite the higher precision, it is not 2392 always optimal to transmit eight strata estimators. Therefore, the following 2393 rules are used, which are based on the average element size multiplied by the number 2394 of elements in the set. This value is denoted as "b" in the table: 2395 </t> 2396 <dl> 2397 <dt>SEs</dt> 2398 <dd>Rule</dd> 2399 <dt>1</dt> 2400 <dd>b < 68kb</dd> 2401 <dt>2</dt> 2402 <dd>b > 68kb</dd> 2403 <dt>4</dt> 2404 <dd>b > 269kb</dd> 2405 <dt>8</dt> 2406 <dd>b > 1'077kb</dd> 2407 </dl> 2408 <t> 2409 When creating multiple strata estimators, it is important to salt the keys for the IBFs in the strata 2410 estimators differently, using the following bit rotation based salting method: 2411 </t> 2412 <figure anchor="performance_multi_se_salting_code"> 2413 <artwork name="" type="" align="left" alt=""><![CDATA[ 2414 2415 # Inputs: 2416 # value: Input value to salt (needs to be 64 bit unsigned) 2417 # salt: Salt to salt value with; Should always be ascending and start 2418 # at zero 2419 i.e. SE1 = Salt 0; SE2 = Salt 1 etc. 2420 # Output: 2421 # Returns: Salted value 2422 2423 FUNCTION se_key_salting(value, salt) 2424 s = (salt * 7) modulo 64 2425 RETURN (value >> s) | (value << (64 - s)) 2426 END FUNCTION 2427 2428 ]]></artwork> 2429 </figure> 2430 <t> 2431 Performance study and details about the reasoning for the used methods can be found in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> in section 2432 3.4.1 under the title "Added Support for Multiple Strata Estimators". 2433 <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> 2434 </t> 2435 </section> 2436 </section> 2437 2438 2439 <section anchor="security" numbered="true" toc="default"> 2440 <name>Security Considerations</name> 2441 <t> 2442 The security considerations in this document focus mainly on the security 2443 goal of availability. The primary goal of the protocol is to prevent an attacker from 2444 wasting computing and network resources of the attacked peer. 2445 </t> 2446 <t> 2447 To prevent denial of service attacks, it is vital to check that peers can only 2448 reconcile a set once in a predefined time span. This is a predefined value and needs 2449 to be adapted per use basis. To enhance reliability and to allow 2450 for legitimate failures, say due to network connectivity issues, 2451 applications SHOULD define a threshold for 2452 the maximum number of failed reconciliation attempts in a given time period. 2453 </t> 2454 <t> 2455 It is important to close and purge connections after a given timeout 2456 to prevent draining attacks. 2457 </t> 2458 <section anchor="security_generic_functions" numbered="true" toc="default"> 2459 <name>General Security Checks</name> 2460 <t> 2461 In this section general checks are described which should be applied to multiple states. 2462 </t> 2463 2464 <section anchor="security_generic_input_validation" numbered="true" toc="default"> 2465 <name>Input validation</name> 2466 <t> 2467 The format of all received messages needs to be properly validated. This is important to prevent many 2468 attacks on the code. The application data MUST be validated by the application using 2469 the protocol not by the implementation of the protocol. 2470 In case the format validation fails the set operation MUST be terminated. 2471 </t> 2472 </section> 2473 <section anchor="security_generic_functions_check_byzantine_boundaries" numbered="true" toc="default"> 2474 <name>Byzantine Boundaries</name> 2475 <t> 2476 To restrict an attacker there should be an upper and lower bound defined and checked 2477 at the beginning of the protocol, based 2478 on prior knowledge, for the number of elements. 2479 The lower byzantine bound can be, for example, the number of elements the 2480 other peer had in his set at the last contact. 2481 The upper byzantine bound can be a practical maximum e.g. the number 2482 of e-voting votes, in Switzerland. 2483 </t> 2484 <figure anchor="security_generic_functions_missing_message_code"> 2485 <artwork name="" type="" align="left" alt=""><![CDATA[ 2486 # Input: 2487 # rec: Number of elements in remote set 2488 # rsd: Number of elements differ in remote set 2489 # lec: Number of elements in local set 2490 # lsd: Number of elements differ in local set 2491 # UPPER_BOUND: Given byzantine upper bound 2492 # LOWER_BOUND: Given byzantine lower bound 2493 # Output: 2494 # returns TRUE if parameters in byzantine bounds otherwise returns FALSE 2495 FUNCTION check_byzantine_bounds (rec,rsd,lec,lsd) 2496 IF rec + rsd > UPPER_BOUND THEN 2497 RETURN FALSE 2498 END IF 2499 IF lec + lsd > UPPER_BOUND THEN 2500 RETURN FALSE 2501 END IF 2502 IF rec < LOWER_BOUND THEN 2503 RETURN FALSE 2504 END IF 2505 RETURN TRUE 2506 END FUNCTION 2507 ]]></artwork> 2508 </figure> 2509 </section> 2510 2511 <section anchor="security_generic_functions_check_valid_state" numbered="true" toc="default"> 2512 <name>Valid State</name> 2513 <t> 2514 To harden the protocol against attacks, controls were introduced in the improved 2515 implementation that check for each message whether the message was received 2516 in the correct state. This is central so that an attacker finds as little attack 2517 surface as possible and makes it more difficult for the attacker to send the 2518 protocol into an endless loop, for example. 2519 </t> 2520 </section> 2521 <section anchor="security_generic_functions_mfc" numbered="true" toc="default"> 2522 <name>Message Flow Control</name> 2523 <t> 2524 For most messages received and sent there needs to be a check in place that checks 2525 that a message is not received multiple times. This is solved with a global store (message) 2526 and the following code 2527 </t> 2528 <t> 2529 The sequence in which messages are received and sent is arranged in a chain. 2530 The messages are dependent on each other. There are dependencies that are 2531 mandatory, e.g. for a sent "Demand" message, an "Element" message must 2532 always be received. But there are also messages for which a response is 2533 not mandatory, e.g. the <em><xref target="messages_inquiry" format="title" /></em> message is only followed by an 2534 "Offer" message, if the corresponding element is in the set. Due to this 2535 fact, checks can be installed to verify compliance with the following chain. 2536 </t> 2537 <figure anchor="security_generic_functions_mfc_chain"> 2538 <artwork name="" type="" align="left" alt=""><![CDATA[ 2539 2540 2541 Chain for 2542 elements +---------+ +---------+ +---------+ +---------+ 2543 NOT in IBF | INQUIRY |--->| OFFER |===>| DEMAND |===>| ELEMENT | 2544 decoding +---------+ +---------+ +---------+ +---------+ 2545 peers set 2546 2547 Chain for 2548 elements +---------+ +---------+ +---------+ 2549 in IBF | OFFER |--->| DEMAND |===>| ELEMENT | 2550 decoding +---------+ +---------+ +---------+ 2551 peers set 2552 2553 --->: Answer not mandatory 2554 ===>: Always answer needed. 2555 ]]></artwork> 2556 </figure> 2557 <t> 2558 In the message control flow its important to ensure that no duplicated messages are received 2559 (Except inquiries where collisions are possible) and only messages are received which are compliant 2560 with the flow in <xref target="security_generic_functions_mfc_chain" format="default" />. 2561 To link messages the SHA-512 element hashes, that are part of all messages, except in the 2562 <em><xref target="messages_inquiry" format="title" /></em> messages, can be used. 2563 To link an <em><xref target="messages_inquiry" format="title" /></em> message to an 2564 <em><xref target="messages_offer" format="title" /></em> message 2565 the SHA-512 hash from the offer has to be salted and converted to the IBF-Key (as described in 2566 <xref target="ibf_format_id_generation_pseudo_code" format="default" />). The IBF-Key can 2567 be matched with the received <em><xref target="messages_inquiry" format="title" /></em> message. 2568 </t> 2569 <t> 2570 At the end of the set reconciliation operation after receiving and sending the 2571 <em><xref target="messages_done" format="title" /></em> message, it should be checked 2572 that all demands have been satisfied and all elements have been received. 2573 </t> 2574 <t> 2575 This is based on <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>, section 5.3 (Message Control Flow). 2576 </t> 2577 </section> 2578 2579 <section anchor="security_generic_functions_active_passive_switches" numbered="true" toc="default"> 2580 <name>Limit Active/Passive Decoding changes</name> 2581 <t> 2582 To prevent an attacker from sending a peer into an endless loop between active and passive decoding, a 2583 limitation for active/passive roll switches is required. 2584 Otherwise, an attacker could 2585 force the victim to waste unlimited amount of resources by just transmitting 2586 IBFs that do not decode. 2587 This can be implemented by 2588 a simple counter which terminates the operation after a predefined number of switches. 2589 The maximum number of switches needs to be defined in such a way that it is 2590 very improbable that more switches are required in a legitimate interaction, 2591 and hence the malicious behavior of the other peer is assured. 2592 </t> 2593 <t> 2594 The question after how many active/passive switches it can be assumed that the other peer is not honest, 2595 depends on the various tuning parameters of the algorithm. 2596 Section 5.4 of <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> 2597 demonstrates that the probability of decoding failure is less than 2598 15% for each round. The probability that there will be n legitimate 2599 active/passive changes is thus less than 0.15^{round number}. 2600 Which means that after about 30 active/passive switches it can be said with a certainty of 2^80 that one of the peers 2601 is not following the protocol. 2602 Hence, participants MUST impose a maximum of 30 active/passive changes. 2603 </t> 2604 </section> 2605 2606 <section anchor="security_generic_functions_full_plausibility_check" numbered="true" toc="default"> 2607 <name>Full Synchronisation Plausibility Check</name> 2608 <t> 2609 An attacker can try to use up a peer's bandwidth by pretending that the peer 2610 needs full synchronisation, even if the set difference is very small and the attacker 2611 only has a few (or even zero) elements that are not already synchronised. 2612 In such a case, it would be ideal if the plausibility could already be checked 2613 during full synchronisation as to whether the other peer was honest or not with 2614 regard to the estimation of the set size difference and thus the choice of mode 2615 of operation. 2616 </t> 2617 <t> 2618 In order to calculate this plausibility, section 5.5 of <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> describes a formula, which depicts the probability with which one 2619 can calculate the corresponding plausibility based on the number of 2620 new and repeated elements after each received element. 2621 </t> 2622 <t> 2623 Besides this approach from probability theory, there is an additional check 2624 that can be made. After the entire set has been transferred to the other peer, 2625 no known elements may be returned by the second peer, since the second peer 2626 should only return the elements that are missing from the initial peer's set. 2627 </t> 2628 <t> 2629 This two approaches are implemented in the following pseudocode: 2630 </t> 2631 2632 <figure anchor="security_generic_functions_full_plausibility_check_code"> 2633 <artwork name="" type="" align="left" alt=""><![CDATA[ 2634 # Input: 2635 # SECURITY_LEVEL: The security level used e.g. 2^80 2636 # state: The statemachine state 2637 # rs: Estimated remote set difference 2638 # lis: Number of elements in set 2639 # rd: Number of duplicated elements received 2640 # rf: Number of fresh elements received 2641 # Output: 2642 # Returns TRUE if full synchronisation is plausible and FALSE otherwise 2643 2644 FUNCTION full_sync_plausibility_check (state,rs,lis,rd,rf) 2645 security_level_lb = -1 * SECURITY_LEVEL 2646 2647 # Make sure that no element is received double when 2648 # all elements already are transmitted to the oder side. 2649 IF FULL_SENDING == state AND rd > 0 THEN 2650 RETURN FALSE 2651 END IF 2652 2653 # Probabilistic algorithm to check for plausible 2654 # element distribution 2655 IF FULL_RECEIVING == state THEN 2656 2657 # Prevent division by 0 2658 IF 0 <= rs THEN 2659 rs = 1 2660 END IF 2661 2662 # Formula to verify plausibility 2663 base = 1 - (rs / (lis + rs)) 2664 exponent = rd - rf * lis / rs 2665 value = exponent * (LOG2(base)/LOG2(2)) 2666 IF value < security_level_lb OR value > SECURITY_LEVEL THEN 2667 RETURN FALSE 2668 END IF 2669 END IF 2670 RETURN TRUE 2671 END FUNCTION 2672 ]]></artwork> 2673 </figure> 2674 </section> 2675 </section> 2676 2677 <section anchor="security_states" numbered="true" toc="default"> 2678 <name>States</name> 2679 2680 <t> 2681 In this section the security considerations for each valid message 2682 in all states is described, if any other message 2683 is received the peer MUST terminate the operation. 2684 </t> 2685 2686 <section anchor="security_states_expecting_ibf" numbered="true" toc="default"> 2687 <name>Expecting IBF</name> 2688 <t>Security considerations for received messages:</t> 2689 <dl> 2690 <dt><xref target="messages_request_full" format="title" /></dt> 2691 <dd> 2692 <t> 2693 It needs to be checked that the full synchronisation mode with receiving peer 2694 sending first is plausible according to the algorithm deciding which operation mode 2695 is applicable as described in <xref target="performance_formulas_operationmode" format="default"/>. 2696 </t> 2697 2698 </dd> 2699 <dt><xref target="messages_ibf" format="title" /></dt> 2700 <dd> 2701 <t> 2702 It needs to be checked that the differential synchronisation mode is plausible according 2703 to the algorithm deciding which operation mode 2704 is applicable as described in <xref target="performance_formulas_operationmode" format="default"/>. 2705 </t> 2706 2707 </dd> 2708 <dt><xref target="messages_send_full" format="title" /></dt> 2709 <dd> 2710 <t> 2711 It needs to be checked that the full synchronisation mode with initiating peer 2712 sending first is plausible according to the algorithm deciding which operation mode 2713 is applicable as described in <xref target="performance_formulas_operationmode" format="default"/>. 2714 </t> 2715 </dd> 2716 </dl> 2717 </section> 2718 2719 <section anchor="security_states_full_sending" numbered="true" toc="default"> 2720 <name>Full Sending</name> 2721 <t>Security considerations for received messages:</t> 2722 <dl> 2723 <dt><xref target="messages_full_element" format="title" /></dt> 2724 <dd> 2725 <t> 2726 When receiving full elements there needs to be checked, that every 2727 element is a valid element, that no element has been received more than once, and 2728 that not more elements have been received than the other peer has committed 2729 to at the beginning of the operation. The plausibility should also be checked 2730 with an algorithm as described in <xref target="security_generic_functions_full_plausibility_check" format="default"/>. 2731 </t> 2732 </dd> 2733 <dt><xref target="messages_full_done" format="title" /></dt> 2734 <dd> 2735 <t> 2736 When receiving the <em><xref target="messages_full_done" format="title" /></em> 2737 message, it is important to check that 2738 not fewer elements have been received than the other peer has committed to 2739 send at the beginning of the operation. 2740 If the sets differ (the FINAL CHECKSUM field in the <xref target="messages_full_done" format="title" /> 2741 message does not match to the SHA-512 hash XOR sum of the local set), the operation has failed and the 2742 reconciliation MUST be aborted. It is a strong indicator 2743 that something went wrong (eg. some hardware bug). This should never occur! 2744 </t> 2745 </dd> 2746 </dl> 2747 </section> 2748 2749 <section anchor="security_states_expecting_ibf_last" numbered="true" toc="default"> 2750 <name>Expecting IBF Last</name> 2751 <t>Security considerations for received messages:</t> 2752 <dl> 2753 <dt><xref target="messages_ibf" format="title" /></dt> 2754 <dd> 2755 <t> 2756 The application should check that the overall size of the IBF 2757 that is being transmitted is within its resource bounds, and 2758 abort the protocol if its resource limits are likely to be 2759 exceeded, or if the size is implausible for the given operation. 2760 </t> 2761 <t> 2762 It needs to be checked that the offset (message field "OFFSET") 2763 for every received <em><xref target="messages_ibf" format="title" /></em> message 2764 is strictly monotonic increasing and is a multiple of the MAX_BUCKETS_PER_MESSAGE 2765 defined in the <xref target="constants" format="title" /> section, otherwise the 2766 connection MUST be aborted. 2767 </t> 2768 <t> 2769 Another sanity check is to ensure that the "OFFSET" message field never 2770 is higher than the "IBF SIZE" field in the <em><xref target="messages_ibf" format="title" /></em> 2771 message. 2772 </t> 2773 </dd> 2774 <dt><xref target="messages_ibf_last" format="title" /></dt> 2775 <dd> 2776 <t> 2777 When all <em><xref target="messages_ibf" format="title" /></em> messages have 2778 been received an <em><xref target="messages_ibf_last" format="title" /></em> message 2779 should conclude the transmission of the IBF and a change to the <strong>Active Decoding</strong> 2780 phase should be ensured. 2781 </t> 2782 <t> 2783 To verify that all IBFs have been received, a simple validation can be made. 2784 The number of buckets in the <em><xref target="messages_ibf_last" format="title" /></em> message 2785 added to the value in the message OFFSET field should always be equal to the "IBF SIZE". 2786 </t> 2787 <t> 2788 Further plausibility checks can be made. One is to ensure that after each active/passive 2789 switch the IBF can never be more than double in size. Another plausibility check is 2790 that an IBF probably never will be larger than the byzantine upperbound multiplied by two. 2791 The third plausibility check is to take successfully decoded IBF keys (received offers and demands) 2792 into account and to validate the size of the received IBF with the in <xref target="performance_formula_ibf_parameters_code" format="default" /> 2793 described function get_next_ibf_size(). If any of these three checks fail the operation 2794 must be aborted. 2795 </t> 2796 </dd> 2797 </dl> 2798 </section> 2799 2800 <section anchor="security_states_active_decoding" numbered="true" toc="default"> 2801 <name>Active Decoding</name> 2802 <t> 2803 In the <strong>Active Decoding</strong> state it is important to prevent an attacker from 2804 generating and transmitting an unlimited number of IBFs that all do not decode, or 2805 to generate an IBF constructed to send the peers in an endless loop. 2806 To prevent an endless loop in decoding, loop detection MUST be implemented. 2807 A solution to prevent endless loop is to limit the number of elements decoded from an IBF. 2808 This limit is defined by the number of buckets in the IBF. It is not possible that more elements are decoded 2809 from an IBF than an IBF has buckets. If more elements than buckets are in an IBF it is not possible to 2810 get pure buckets. 2811 An additional check that should be implemented, is to store all element IDs 2812 that were prior decoded. When a new element ID is decoded from the IBF it should 2813 always be checked that no element ID is repeated. 2814 If the same element ID is decoded more than once, this is a strong indication 2815 for an invalid IBF and the operation MUST be aborted. Notice that the decoded 2816 element IDs are salted as described in <xref target="ibf_format_id_generation_pseudo_code" format="default" /> 2817 so the described bit rotation needs to be reverted before the decoded element ID 2818 is stored and compared to the previous decoded element IDs. 2819 </t> 2820 <t> 2821 If the IBF decodes more elements than are plausible, the 2822 operation MUST be terminated. Furthermore, if the IBF 2823 decoding successfully terminates and fewer elements were 2824 decoded than plausible, the operation MUST also be terminated. 2825 The upper thresholds for decoded elements from the IBF is the 2826 remote set size the other peer has committed too (Case if the complete remote set is 2827 new). The lower threshold for decoding element is the absolute value of the difference 2828 between the local and remote set size (Case the set difference is only in the set of 2829 a single peer). The other peer's committed set sizes 2830 is transmitted in the the <strong>Expecting IBF</strong> state. 2831 </t> 2832 2833 <t>Security considerations for received messages:</t> 2834 <dl> 2835 <dt><xref target="messages_offer" format="title" /></dt> 2836 <dd> 2837 <t> 2838 If an offer for an element, that never has been requested by 2839 an inquiry or if an offer is received twice, the operation MUST be terminated. 2840 This requirement can be fulfilled by saving lists that keep track of the state of 2841 all sent inquiries and offers. When answering offers these lists MUST be checked. 2842 The sending and receiving of <xref target="messages_offer" format="title" /> messages should 2843 always be protected with an <xref target="security_generic_functions_mfc" format="title" /> 2844 to secure the protocol against missing, duplicated, out-of-order or unexpected messages. 2845 </t> 2846 </dd> 2847 <dt><xref target="messages_elements" format="title" /></dt> 2848 <dd> 2849 <t> 2850 If an element that never has been requested by 2851 a demand or is received twice, the operation MUST be terminated. 2852 The sending and receiving of <xref target="messages_elements" format="title" /> messages should 2853 always be protected with an <xref target="security_generic_functions_mfc" format="title" /> 2854 to secure the protocol against missing, duplicated, out-of-order or unexpected messages. 2855 </t> 2856 </dd> 2857 <dt><xref target="messages_demand" format="title" /></dt> 2858 <dd> 2859 <t> 2860 For every received demand an offer has to be sent in advance. If a demand 2861 for an element is received, that never has been offered or the offer already has 2862 been answered with a demand, the operation MUST be terminated. It is required to implement 2863 a list which keeps track of the state of all sent offers and received demands. 2864 The sending and receiving of <em><xref target="messages_demand" format="title" /></em> messages should 2865 always be protected with an <xref target="security_generic_functions_mfc" format="title" /> 2866 to secure the protocol against missing, duplicated, out-of-order or unexpected messages. 2867 </t> 2868 </dd> 2869 <dt><xref target="messages_done" format="title" /></dt> 2870 <dd> 2871 <t> 2872 The <em><xref target="messages_done" format="title" /></em> message is only received if the IBF has finished 2873 decoding and all offers have been sent. If the <em><xref target="messages_done" format="title" /></em> message is received before 2874 the decoding of the IBF is finished or all open demands 2875 have been answered, the operation MUST be terminated. 2876 If the sets differ (the FINAL CHECKSUM field in the <xref target="messages_done" format="title" /> 2877 message does not match to the SHA-512 hash XOR sum of the local set), the operation has failed and the 2878 reconciliation MUST be aborted. It is a strong indicator 2879 that something went wrong (eg. some hardware bug). This should never occur! 2880 </t> 2881 <t> 2882 When a <em><xref target="messages_done" format="title" /></em> message is received the 2883 "check_if_synchronisation_is_complete()" function from the <xref target="security_generic_functions_mfc" format="title" /> 2884 is required to ensure that all demands have been satisfied successfully. 2885 </t> 2886 </dd> 2887 </dl> 2888 </section> 2889 <section anchor="security_states_finish_closing" numbered="true" toc="default"> 2890 <name>Finish Closing</name> 2891 <t> 2892 In the <strong>Finish Closing</strong> state the protocol waits for 2893 all sent demands to be fulfilled. 2894 </t> 2895 <t> 2896 In case not all sent demands have been answered in time, 2897 the operation has failed and MUST be terminated. 2898 </t> 2899 <t>Security considerations for received messages:</t> 2900 <dl> 2901 <dt><xref target="messages_elements" format="title" /></dt> 2902 <dd> 2903 When receiving <xref target="messages_elements" format="title" /> messages it is important 2904 to always check the <xref target="security_generic_functions_mfc" format="title" /> 2905 to secure the protocol against missing, duplicated, out-of-order or unexpected messages. 2906 </dd> 2907 </dl> 2908 </section> 2909 <section anchor="security_states_finished" numbered="true" toc="default"> 2910 <name>Finished</name> 2911 <t> 2912 In this state the connection is terminated, so no security considerations are needed. 2913 </t> 2914 </section> 2915 <section anchor="security_states_expect_se" numbered="true" toc="default"> 2916 <name>Expect SE</name> 2917 <t>Security considerations for received messages:</t> 2918 <dl> 2919 <dt><xref target="messages_se" format="title" /></dt> 2920 <dd> 2921 <t> 2922 In case the strata estimator does not decode, the 2923 operation MUST be terminated to prevent to get to an unresolvable state. 2924 The set difference calculated from the strata estimator needs to be plausible, 2925 which means within the byzantine boundaries described in section <xref target="security_generic_functions_check_byzantine_boundaries" format="title" />. 2926 </t> 2927 </dd> 2928 </dl> 2929 </section> 2930 <section anchor="security_states_full_receiving" numbered="true" toc="default"> 2931 <name>Full Receiving</name> 2932 <t>Security considerations for received messages:</t> 2933 <dl> 2934 <dt><xref target="messages_full_element" format="title" /></dt> 2935 <dd> 2936 <t> 2937 When receiving full elements there needs to be checked, that every 2938 element is a valid element, no element has been received more than once and 2939 not more elements are received than the other peer committed 2940 to sending at the beginning of the operation. The plausibility should also be checked 2941 with an algorithm as described in <xref target="security_generic_functions_full_plausibility_check" format="default"/>. 2942 </t> 2943 </dd> 2944 <dt><xref target="messages_full_done" format="title" /></dt> 2945 <dd> 2946 <t> 2947 When the <em><xref target="messages_full_done" format="title" /></em> message is received from the remote peer, it should be checked that the number of 2948 elements received matches the number that the remote peer 2949 originally committed to transmitting, 2950 otherwise the operation MUST be terminated. 2951 If the sets differ (the FINAL CHECKSUM field in the <xref target="messages_full_done" format="title" /> 2952 message does not match to the SHA-512 hash XOR sum of the local set), the operation has failed and the 2953 reconciliation MUST be aborted. It is a strong indicator 2954 that something went wrong (eg. some hardware bug). This should never occur! 2955 </t> 2956 </dd> 2957 </dl> 2958 </section> 2959 <section anchor="security_states_passive_decoding" numbered="true" toc="default"> 2960 <name>Passive Decoding</name> 2961 <t>Security considerations for received messages:</t> 2962 <dl> 2963 <dt><xref target="messages_ibf" format="title" /></dt> 2964 <dd> 2965 <t> 2966 In case an <xref target="messages_ibf" format="title" /> message is received by the peer a active/passive role switch 2967 is initiated by the active decoding remote peer. 2968 A switch into active decoding mode MUST only be permitted for 2969 a predefined number of times as described in <xref target="security_generic_functions_active_passive_switches" format="default"/> 2970 </t> 2971 </dd> 2972 <dt><xref target="messages_inquiry" format="title" /></dt> 2973 <dd> 2974 <t> 2975 A check needs to be in place that prevents receiving an inquiry 2976 for an element multiple times or more inquiries than are plausible. 2977 The upper thresholds for sent/received inquiries is the 2978 remote set size the other peer has committed too (Case if the complete remote set is 2979 new). The lower threshold for for sent/received inquiries is the absolute value of the 2980 set difference between the local and remote set size 2981 (Case the set difference is only in the set of a single peer). 2982 The other peer's committed set sizes is transmitted in the the <strong>Expecting IBF</strong> state. 2983 Beware that it is possible to get key collisions and an inquiry for the same key 2984 can be transmitted multiple times, so the threshold should take this into account. 2985 The sending and receiving of <em><xref target="messages_inquiry" format="title" /></em> messages should 2986 always be protected with an <xref target="security_generic_functions_mfc" format="title" /> 2987 to secure the protocol against missing, duplicated, out-of-order or unexpected messages. 2988 </t> 2989 </dd> 2990 <dt><xref target="messages_demand" format="title" /></dt> 2991 <dd> 2992 Same action as described for <em><xref target="messages_demand" format="title" /></em> message in section 2993 <xref target="security_states_active_decoding" format="title"/>. 2994 </dd> 2995 <dt><xref target="messages_offer" format="title" /></dt> 2996 <dd> 2997 Same action as described for <em><xref target="messages_offer" format="title" /></em> message in section 2998 <xref target="security_states_active_decoding" format="title"/>. 2999 </dd> 3000 <dt><xref target="messages_done" format="title" /></dt> 3001 <dd> 3002 Same action as described for <em><xref target="messages_done" format="title" /></em> message in section 3003 <xref target="security_states_active_decoding" format="title"/>. 3004 </dd> 3005 <dt><xref target="messages_elements" format="title" /></dt> 3006 <dd> 3007 Same action as described for <em><xref target="messages_elements" format="title" /></em> message in section 3008 <xref target="security_states_active_decoding" format="title"/>. 3009 </dd> 3010 3011 </dl> 3012 </section> 3013 <section anchor="security_states_finish_waiting" numbered="true" toc="default"> 3014 <name>Finish Waiting</name> 3015 <t> 3016 In the <strong>Finish Waiting</strong> state the protocol waits for 3017 all transmitted demands to be fulfilled. 3018 </t> 3019 <t> 3020 In case not all transmitted demands have been answered at this time, the operation 3021 has failed and the protocol MUST be terminated with an error. 3022 </t> 3023 <t>Security considerations for received messages:</t> 3024 <dl> 3025 <dt><xref target="messages_elements" format="title" /></dt> 3026 <dd> 3027 When receiving <xref target="messages_elements" format="title" /> messages it is important 3028 to always check the <xref target="security_generic_functions_mfc" format="title" /> 3029 to secure the protocol against missing, duplicated, out-of-order or unexpected messages. 3030 </dd> 3031 </dl> 3032 </section> 3033 </section> 3034 </section> 3035 <section anchor="constants" numbered="true" toc="default"> 3036 <name>Constants</name> 3037 <t> 3038 The following table contains constants used by the protocol. The constants marked with a * are 3039 validated through experiments in <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/>. 3040 </t> 3041 <figure anchor="figure_constants"> 3042 <artwork name="" type="" align="left" alt=""><![CDATA[ 3043 Name | Value | Description 3044 ----------------------------+------------+------------------------------- 3045 SE_STRATA_COUNT | 32 | Number of IBFs in a strata 3046 estimator. 3047 IBF_HASH_NUM* | 3 | Number of times an element is 3048 hashed to an IBF. 3049 (from section 4.5.2) 3050 IBF_FACTOR* | 2 | The factor by which the size 3051 of the IBF is increased in 3052 case of decoding failure or 3053 initially from the set 3054 difference. 3055 (from section 4.5.2) 3056 MAX_BUCKETS_PER_MESSAGE | 1120 | Maximum bucket of an IBF 3057 that are transmitted in 3058 single message. 3059 IBF_MIN_SIZE* | 37 | Minimal number of buckets 3060 in an IBF. (from section 3.8) 3061 DIFFERENTIAL_RTT_MEAN* | 3.65145 | The average RTT that is 3062 needed for a differential 3063 synchronisation. 3064 SECURITY_LEVEL* | 2^80 | Security level for 3065 probabilistic security 3066 algorithms. (from section 5.8) 3067 PROBABILITY_FOR_NEW_ROUND* | 0.15 | The probability for a IBF 3068 decoding failure in the 3069 differential synchronisation 3070 mode. (from section 5.4) 3071 DIFFERENTIAL_RTT_MEAN* | 3.65145 | The average RTT that is needed 3072 for a differential 3073 synchronisation. 3074 (from section 4.5.3) 3075 MAX_IBF_SIZE | 1048576 | Maximal number of buckets in 3076 an IBF. 3077 AVG_BYTE_SIZE_SE* | 4221 | Average byte size of a single 3078 strata estimator. 3079 (from section 3.4.3) 3080 VALID_NUMBER_SE* | [1,2,4,8] | Valid number of SE's 3081 (from section 3.4) 3082 ]]></artwork> 3083 </figure> 3084 3085 </section> 3086 <section anchor="gana" numbered="true" toc="default"> 3087 <name>GANA Considerations</name> 3088 <t> 3089 GANA is requested to amend the "GNUnet Message Type" <xref target="GANA" format="default"/> registry 3090 as follows: 3091 </t> 3092 <figure anchor="figure_purposenums"> 3093 <artwork name="" type="" align="left" alt=""><![CDATA[ 3094 Type | Name | References | Description 3095 --------+----------------------------+------------+---------------------- 3096 559 | SETU_P2P_REQUEST_FULL | [This.I-D] | Request the full set 3097 of the other peer. 3098 710 | SETU_P2P_SEND_FULL | [This.I-D] | Signals to send the 3099 full set to the other 3100 peer. 3101 560 | SETU_P2P_DEMAND | [This.I-D] | Demand the whole 3102 element from the 3103 otherpeer, given 3104 only the hash code. 3105 561 | SETU_P2P_INQUIRY | [This.I-D] | Tell the other peer 3106 to send a list of 3107 hashes that match 3108 an IBF key. 3109 562 | SETU_P2P_OFFER | [This.I-D] | Tell the other peer 3110 which hashes match 3111 a given IBF key. 3112 563 | SETU_P2P_OPERATION_REQUEST | [This.I-D] | Request a set union 3113 operation from a 3114 remote peer. 3115 564 | SETU_P2P_SE | [This.I-D] | Strata Estimator 3116 uncompressed. 3117 565 | SETU_P2P_IBF | [This.I-D] | Invertible Bloom 3118 Filter slices. 3119 566 | SETU_P2P_ELEMENTS | [This.I-D] | Actual set elements. 3120 567 | SETU_P2P_IBF_LAST | [This.I-D] | Invertible Bloom 3121 Filter Last Slices. 3122 568 | SETU_P2P_DONE | [This.I-D] | Set operation is 3123 done. 3124 569 | SETU_P2P_SEC | [This.I-D] | Strata Estimator 3125 compressed. 3126 570 | SETU_P2P_FULL_DONE | [This.I-D] | All elements in 3127 full synchronisation 3128 mode have been sent 3129 is done. 3130 571 | SETU_P2P_FULL_ELEMENT | [This.I-D] | Send an actual 3131 element in full 3132 synchronisation mode. 3133 3134 ]]></artwork> 3135 </figure> 3136 </section> 3137 <!-- gana --> 3138 <section anchor="contributors" numbered="true" toc="default"> 3139 <name>Contributors</name> 3140 <t> 3141 The GNUnet implementation of the byzantine fault tolerant set reconciliation 3142 protocol was originally implemented by Florian Dold. 3143 </t> 3144 </section> 3145 </middle> 3146 <back> 3147 <references> 3148 <name>Normative References</name> 3149 &RFC5869; 3150 &RFC2119; 3151 &RFC3385; 3152 &RFC1951; 3153 3154 <reference anchor="byzantine_fault_tolerant_set_reconciliation" target="https://summermatter.net/byzantine-fault-tolerant-set-reconciliation-summermatter.pdf"> 3155 <front> 3156 <title>Byzantine Fault Tolerant Set Reconciliation</title> 3157 <author initials="E." surname="Summermatter" fullname="Elias Summermatter"> 3158 <organization>University of Applied Sciences Bern</organization> 3159 </author> 3160 <date year="2021"/> 3161 </front> 3162 </reference> 3163 3164 <reference anchor="GANA" target="https://gana.gnunet.org/"> 3165 <front> 3166 <title>GNUnet Assigned Numbers Authority (GANA)</title> 3167 <author> 3168 <organization>GNUnet e.V.</organization> 3169 </author> 3170 <date month="April" year="2020"/> 3171 </front> 3172 </reference> 3173 3174 <reference anchor="CryptographicallySecureVoting" target="https://git.gnunet.org/bibliography.git/plain/docs/ba_dold_voting_24aug2014.pdf"> 3175 <front> 3176 <title>Cryptographically Secure, Distributed Electronic Voting</title> 3177 <author initials="F." surname="Dold" fullname="Florian Dold"> 3178 <organization>Technische Universität München</organization> 3179 </author> 3180 </front> 3181 </reference> 3182 3183 3184 <reference anchor="ByzantineSetUnionConsensusUsingEfficientSetReconciliation" target="https://doi.org/10.1186/s13635-017-0066-3"> 3185 <front> 3186 <title>Byzantine set-union consensus using efficient set reconciliation</title> 3187 <author initials="F." surname="Dold" fullname="Florian Dold"> 3188 <organization>Technische Universität München</organization> 3189 </author> 3190 <author initials="C." surname="Grothoff" fullname="Christian Grothoff"> 3191 <organization>Inria, Domaine de Voluceau Rocquencourt</organization> 3192 </author> 3193 </front> 3194 </reference> 3195 <reference anchor="Eppstein" target="https://doi.org/10.1145/2018436.2018462"> 3196 <front> 3197 <title>What’s the Difference? Efficient Set Reconciliation without Prior Context</title> 3198 <author initials="D." surname="Eppstein" fullname="David Eppstein"> 3199 <organization>U.C. Irvine</organization> 3200 </author> 3201 <author initials="M." surname="Goodrich" fullname="Michael T. Goodrich"> 3202 <organization>U.C. Irvine</organization> 3203 </author> 3204 <author initials="F." surname="Uyeda" fullname="Frank Uyeda"> 3205 <organization>U.C. San Diego</organization> 3206 </author> 3207 <author initials="G." surname="Varghese" fullname="George Varghese"> 3208 <organization>U.C. San Diego</organization> 3209 </author> 3210 </front> 3211 </reference> 3212 3213 <reference anchor="GNS" target="https://doi.org/10.1007/978-3-319-12280-9_9"> 3214 <front> 3215 <title>A Censorship-Resistant, Privacy-Enhancing and Fully Decentralized Name System</title> 3216 <author initials="M." surname="Wachs" fullname="Matthias Wachs"> 3217 <organization>Technische Universitaet Muenchen</organization> 3218 </author> 3219 3220 <author initials="M." surname="Schanzenbach" fullname="Martin Schanzenbach"> 3221 <organization>Technische Universitaet Muenchen</organization> 3222 </author> 3223 3224 <author initials="C." surname="Grothoff" 3225 fullname="Christian Grothoff"> 3226 <organization>Technische Universitaet Muenchen</organization> 3227 </author> 3228 <date year="2014"/> 3229 </front> 3230 </reference> 3231 3232 </references> 3233 <section anchor="test_vectors" numbered="true" toc="default"> 3234 <name>Test Vectors</name> 3235 <section anchor="test_vectors_map_function" numbered="true" toc="default"> 3236 <name>Map Function</name> 3237 <t> 3238 INPUTS: 3239 </t> 3240 <figure anchor="test_vectors_map_function_inputs"> 3241 <artwork name="" type="" align="left" alt=""><![CDATA[ 3242 k: 3 3243 ibf_size: 300 3244 3245 key1: 0xFFFFFFFFFFFFFFFF (64-bit) 3246 key2: 0x0000000000000000 (64-bit) 3247 key3: 0x00000000FFFFFFFF (64-bit) 3248 key4: 0xC662B6298512A22D (64-bit) 3249 key5: 0xF20fA7C0AA0585BE (64-bit) 3250 ]]></artwork> 3251 </figure> 3252 <t> 3253 OUTPUT: 3254 </t> 3255 <figure anchor="test_vectors_map_function_outputs"> 3256 <artwork name="" type="" align="left" alt=""><![CDATA[ 3257 key1: ["122","157","192"] 3258 key2: ["85","243","126"] 3259 key3: ["208","101","222"] 3260 key4: ["239","269","56"] 3261 key5: ["150","104","33"] 3262 ]]></artwork> 3263 </figure> 3264 </section> 3265 <section anchor="test_vectors_id_function" numbered="true" toc="default"> 3266 <name>ID Calculation Function</name> 3267 <t> 3268 INPUTS: 3269 </t> 3270 <figure anchor="test_vectors_id_function_inputs"> 3271 <artwork name="" type="" align="left" alt=""><![CDATA[ 3272 element 1: 0xFFFFFFFFFFFFFFFF (64-bit) 3273 element 2: 0x0000000000000000 (64-bit) 3274 element 3: 0x00000000FFFFFFFF (64-bit) 3275 element 4: 0xC662B6298512A22D (64-bit) 3276 element 5: 0xF20fA7C0AA0585BE (64-bit) 3277 ]]></artwork> 3278 </figure> 3279 <t> 3280 OUTPUT: 3281 </t> 3282 <figure anchor="test_vectors_id_function_outputs"> 3283 <artwork name="" type="" align="left" alt=""><![CDATA[ 3284 element 1: 0x5AFB177B 3285 element 2: 0x64AB557C 3286 element 3: 0xCB5DB740 3287 element 4: 0x8C6A2BB2 3288 element 5: 0x7EC42981 3289 ]]></artwork> 3290 </figure> 3291 </section> 3292 <section anchor="test_counter_compression_function" numbered="true" toc="default"> 3293 <name>Counter Compression Function</name> 3294 <t> 3295 INPUTS: 3296 </t> 3297 <figure anchor="test_counter_compression_function_inputs"> 3298 <artwork name="" type="" align="left" alt=""><![CDATA[ 3299 counter serie 1: [1,8,10,6,2] (min bytes 4) 3300 counter serie 2: [26,17,19,15,2,8] (min bytes 5) 3301 counter serie 3: [4,2,0,1,3] (min bytes 3) 3302 ]]></artwork> 3303 </figure> 3304 <t> 3305 OUTPUT: 3306 </t> 3307 <figure anchor="test_counter_compression_function_outputs"> 3308 <artwork name="" type="" align="left" alt=""><![CDATA[ 3309 counter serie 1: 0x18A62 3310 counter serie 2: 0x3519BC48 3311 counter serie 3: 0x440B 3312 ]]></artwork> 3313 </figure> 3314 </section> 3315 </section> 3316 </back> 3317 </rfc>