draft-schanzen-r5n.xml (180769B)
1 <?xml version="1.0" encoding="utf-8"?> 2 <!DOCTYPE rfc [ 3 <!ENTITY RFC2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"> 4 <!ENTITY RFC2663 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2663.xml"> 5 <!ENTITY RFC2782 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2782.xml"> 6 <!ENTITY RFC3561 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3561.xml"> 7 <!ENTITY RFC3629 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3629.xml"> 8 <!ENTITY RFC3686 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3686.xml"> 9 <!ENTITY RFC3826 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3826.xml"> 10 <!ENTITY RFC3986 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3986.xml"> 11 <!ENTITY RFC4634 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4634.xml"> 12 <!ENTITY RFC5234 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5234.xml"> 13 <!ENTITY RFC5245 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5245.xml"> 14 <!ENTITY RFC5869 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5869.xml"> 15 <!ENTITY RFC5890 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5890.xml"> 16 <!ENTITY RFC5891 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5891.xml"> 17 <!ENTITY RFC6781 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6781.xml"> 18 <!ENTITY RFC6895 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6895.xml"> 19 <!ENTITY RFC6940 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6940.xml"> 20 <!ENTITY RFC6979 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6979.xml"> 21 <!ENTITY RFC7748 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.7748.xml"> 22 <!ENTITY RFC8032 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.8032.xml"> 23 <!ENTITY RFC8126 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.8126.xml"> 24 <!ENTITY RFC8174 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.8174.xml"> 25 <!ENTITY RFC9458 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.9458.xml"> 26 <!ENTITY RFC9498 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.9498.xml"> 27 ]> 28 <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?> 29 <?rfc strict="yes" ?> 30 <?rfc toc="yes" ?> 31 <?rfc symrefs="yes"?> 32 <?rfc sortrefs="yes" ?> 33 <?rfc compact="yes" ?> 34 <?rfc subcompact="no" ?> 35 <rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-schanzen-r5n-07" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" version="3"> 36 <front> 37 <title abbrev="The R5N Distributed Hash Table"> 38 The R5N Distributed Hash Table 39 </title> 40 <seriesInfo name="Internet-Draft" value="draft-schanzen-r5n-07"/> 41 <author fullname="Martin Schanzenbach" initials="M." surname="Schanzenbach"> 42 <organization>Fraunhofer AISEC</organization> 43 <address> 44 <postal> 45 <street>Lichtenbergstrasse 11</street> 46 <city>Garching</city> 47 <code>85748</code> 48 <country>DE</country> 49 </postal> 50 <email>martin.schanzenbach@aisec.fraunhofer.de</email> 51 </address> 52 </author> 53 <author fullname="Christian Grothoff" initials="C." surname="Grothoff"> 54 <organization>Berner Fachhochschule</organization> 55 <address> 56 <postal> 57 <street>Hoeheweg 80</street> 58 <city>Biel/Bienne</city> 59 <code>2501</code> 60 <country>CH</country> 61 </postal> 62 <email>grothoff@gnunet.org</email> 63 </address> 64 </author> 65 <author fullname="Bernd Fix" initials="B." surname="Fix"> 66 <organization>GNUnet e.V.</organization> 67 <address> 68 <postal> 69 <street>Boltzmannstrasse 3</street> 70 <city>Garching</city> 71 <code>85748</code> 72 <country>DE</country> 73 </postal> 74 <email>fix@gnunet.org</email> 75 </address> 76 </author> 77 <!-- Meta-data Declarations --> 78 <area>General</area> 79 <workgroup>Independent Stream</workgroup> 80 <keyword>distributed hash tables</keyword> 81 <abstract> 82 <t> 83 This document contains the R<sup>5</sup>N DHT technical specification. 84 R<sup>5</sup>N is a secure distributed hash table (DHT) routing algorithm 85 and data structure for decentralized applications. 86 It features an open peer-to-peer overlay routing mechanism which supports ad-hoc 87 permissionless participation and support for topologies in restricted-route 88 environments. Optionally, the paths data takes through the overlay can be 89 recorded, allowing decentralized applications to use the DHT to discover routes. 90 </t> 91 <t> 92 This document defines the normative wire format of protocol messages, 93 routing algorithms, cryptographic routines and security considerations for 94 use by implementers. 95 </t> 96 <t> 97 This specification was developed outside the IETF and does not have IETF 98 consensus. It is published here to guide implementation of R<sup>5</sup>N and to 99 ensure interoperability among implementations including the pre-existing 100 GNUnet implementation. 101 </t> 102 </abstract> 103 </front> 104 <middle> 105 <section anchor="introduction" numbered="true" toc="default"> 106 <name>Introduction</name> 107 <t> 108 This specification describes the protocol of R<sup>5</sup>N. 109 R<sup>5</sup>N is a Distributed Hash Table (DHT). The name is an acronym for 110 "randomized recursive routing for restricted-route 111 networks" and its first academic description can be found in 112 <xref target="R5N"/>. 113 </t> 114 <t> 115 DHTs are a key data structure for the construction of decentralized applications 116 and generally provide a robust and efficient means to distribute the 117 storage and retrieval of key-value pairs. 118 </t> 119 <t> 120 The core idea behind R<sup>5</sup>N is to combine a randomized routing 121 algorithm with an efficient, deterministic closest-peer algorithm. 122 This allows us to construct an algorithm that is able to escape and circumvent 123 restricted route environments while at the same time allow for a logarithmically bounded 124 routing complexity. 125 </t> 126 <t> 127 R<sup>5</sup>N also includes advanced features like recording the path a 128 key-value pair took 129 through the network, response filters and on-path application-specific data 130 validation. 131 </t> 132 <t> 133 This document defines the normative wire format of peer-to-peer 134 messages, routing algorithms, cryptographic routines and security 135 considerations for use by implementors. 136 </t> 137 <section numbered="true" toc="default"> 138 <name>Requirements Notation</name> 139 <t> 140 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 141 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 142 "OPTIONAL" in this document are to be interpreted as described in 143 BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only 144 when, they appear in all capitals, as shown here. 145 </t> 146 </section> 147 <section numbered="true" toc="default"> 148 <name>System Model</name> 149 <t> 150 DHTs usually operate as overlay networks consisting of peers 151 communicating over the existing Internet. Hence canonical 152 DHT designs often assume that the IP protocol provides the 153 peers of the overlay with unrestricted end-to-end pairwise 154 connectivity. However, in practice, firewalls and network 155 address translation (NAT) <xref target="RFC2663"/> make it 156 difficult for peers operating on consumer end-devices to 157 communicate directly, especially in the absence of core 158 network infrastructure enabling NAT traversal via protocols 159 such as interactive connectivity establishment (ICE) <xref target="RFC5245"/>. 160 </t> 161 <t> 162 Furthermore, not all peer-to-peer networks consistently 163 operate over the Internet, such as mobile ad-hoc networks 164 (MANETs). While routing protocols have been designed for 165 such networks (<xref target="RFC3561"/>), these generally 166 have issues with security in the presence of malicious 167 participants, as they vulnerable to impersonation attacks. 168 The usual solution to these issues is to assert that the 169 entire MANET is a closed network and to require 170 authentication on all control messages. In contrast, the 171 system model for R<sup>5</sup>N is that of an open network 172 without any kind of authorities that could restrict access 173 to trusted participants. 174 </t> 175 </section> 176 <section numbered="true" toc="default" anchor="security_model"> 177 <name>Security Model</name> 178 <t> 179 We assume that the network is open and thus a fraction of 180 the participating peers is malicious. Malicious peers may 181 create, alter, delay or drop messages. We also assume that 182 an adversary can control (or fake) many peers <xref target="Sybil"/>, thus any kind 183 of voting or punishment of malicious peers would be rather 184 pointless. 185 </t> 186 <t> 187 Honest peers are expected to establish and maintain many 188 connections. We assume that, as a result, the adversary is 189 generally unable to prevent honest peers from maintaining a 190 sufficient number of direct connections with other honest 191 peers to achieve acceptable performance. As the number of 192 malicious peers and their connections increases, performance 193 of the system should gracefully degrade and 194 only collapse for peers that an adversary has fully isolated 195 from the benign network. 196 </t> 197 <t> 198 The main security objectives are to provide honest nodes with 199 correct results and to limit the propagation of invalid 200 data. Invalid data includes both invalid key-value pairs as 201 well as invalid routing path data. 202 While malicious nodes may make up arbitrary 203 key-value pairs and paths within the adversary's domain, 204 invalid key-value pairs are ideally 205 discarded at the first honest node and path data 206 honestly states entry- and exit-points 207 from the honest network and the subset of malicious nodes. 208 </t> 209 <t> 210 Malicious nodes may attempt to exhaust the 211 storage capacity of honest nodes by distributing well-formed 212 but possibly otherwise useless application data. We assume 213 that storage space is relatively cheap compared to bandwidth 214 and that honest nodes also frequently republish the useful 215 data that they publish. As a result, an adversary 216 may reduce the effectiveness and longevity of 217 data cached in the DHT but is assumed to not be able to 218 effectively prevent publication and retrieval of application 219 data by honest nodes. 220 </t> 221 </section> 222 </section> 223 <section anchor="terminology"> 224 <name>Terminology</name> 225 <dl> 226 <dt>Address</dt> 227 <dd> 228 <t> 229 An <em>address</em> is a UTF-8 <xref target="RFC3629"/> string which can be 230 used to address a <em>peer</em> through the Underlay (<xref target="underlay"/>). 231 The format of an address is not enforced by this specification 232 but it is expected that in most cases the address is a URI <xref target="RFC3986"/>. 233 </t> 234 </dd> 235 <dt>Applications</dt> 236 <dd> 237 <em>Applications</em> are higher-layer components which use the 238 <em>Core Operations</em> directly. 239 Possible <em>applications</em> include the GNU Name System 240 <xref target="RFC9498"/> and the GNUnet 241 Confidential Ad-hoc Decentralized End-to-End Transport (CADET) 242 <xref target="cadet"/>. 243 </dd> 244 <dt>Core Operations</dt> 245 <dd> 246 The <em>Core Operations</em> provide an interface to the 247 core operations of the DHT overlay to <em>applications</em>. 248 This includes storing <em>blocks</em> in the DHT and retrieving 249 <em>blocks</em> from the DHT. 250 </dd> 251 <dt>Block</dt> 252 <dd> 253 A unit of payload of variable-size stored in the DHT 254 under a <em>key</em>. 255 In the context of "key-value stores" this 256 refers to "value" stored under a <em>key</em>. 257 </dd> 258 <dt>Block Storage</dt> 259 <dd> 260 The <em>block storage</em> component is used to persist and manage 261 <em>blocks</em> stored by <em>peers</em>. 262 It includes logic for enforcing storage quotas, caching strategies and 263 block validation. 264 </dd> 265 <dt>Block Type</dt> 266 <dd> 267 A unique 32-bit value identifying the data format of a <em>block</em>. 268 <em>Block types</em> are public and applications that require 269 application-specific 270 block payloads are expected to register one or more 271 block types in the GANA Block-Type registry 272 (<xref target="gana_block_type"/>) and provide a specification 273 of the associated block operations (<xref 274 target="block_functions"/>) to implementors of 275 R<sup>5</sup>N. 276 </dd> 277 <dt>Bootstrapping</dt> 278 <dd> 279 <em>Bootstrapping</em> is the process of establishing a connection 280 to the peer-to-peer network. 281 It requires an initial, non-empty set of reachable <em>peers</em> and corresponding 282 <em>addresses</em> to connect to that are supported by the implementation. 283 </dd> 284 <dt>Initiator</dt> 285 <dd> 286 The <em>peer</em> that initially creates and sends a DHT protocol message (<xref target="p2p_hello"/>, 287 <xref target="p2p_put"/>, <xref target="p2p_get"/>, <xref target="p2p_result"/>). 288 </dd> 289 <dt>HELLO block</dt> 290 <dd> 291 A type of <em>block</em> that is used to store and retrieve <em>addresses</em> of a <em>peer</em>. 292 It is used by the peer discovery mechanism in <xref target="find_peer"/>. 293 </dd> 294 <dt>HELLO URL</dt> 295 <dd> 296 <tt>HELLO</tt> blocks represented as URLs. 297 They are used for out-of-band exchanges of <em>peer</em> <em>addresses</em> 298 and for signalling address updates to <em>neighbours</em>. 299 Implementation details and examples are in <xref target="hello_url"/>. 300 </dd> 301 <dt>Key</dt> 302 <dd> 303 The 512-bit identifier of a location in the DHT. Multiple <tt>blocks</tt> can be 304 stored under the same <em>key</em>. A <em>peer identity</em> is also a <tt>key</tt>. 305 </dd> 306 <dt>Message Processing</dt> 307 <dd> 308 The <em>message processing</em> component of the DHT implementation processes 309 requests from and generates responses to <em>applications</em> 310 and the <em>underlay interface</em>. 311 </dd> 312 <dt>Neighbor</dt> 313 <dd> 314 A neighbor is a <em>peer</em> which is directly able to communicate 315 with our <em>peer</em> via the <em>underlay interface</em>. 316 </dd> 317 <dt>Peer</dt> 318 <dd> 319 A host that is participating in the overlay by running an implementation 320 of the DHT protocol. Each participating host is 321 responsible for holding some portion of the data that has been 322 stored in the overlay, and they are responsible for routing 323 messages on behalf of other <em>peers</em> by following the <em>routing 324 algorithm</em>. 325 </dd> 326 <dt>Peer Identity</dt> 327 <dd> 328 The identifier used on the overlay 329 to identify a <em>peer</em>. It is a SHA-512 hash of the <em>peer public key</em>. 330 </dd> 331 <dt>Peer Public Key</dt> 332 <dd> 333 The the key used to authenticate 334 a <em>peer</em> in the underlay. 335 </dd> 336 <dt>Routing</dt> 337 <dd> 338 The <em>routing</em> component includes the routing table as well as 339 routing and <em>peer</em> selection logic. It facilitates R<sup>5</sup>N routing 340 by provoding the required data structures and algorithms. 341 </dd> 342 <dt>Underlay Interface</dt> 343 <dd> 344 The <em>underlay interface</em> is an abstraction layer on top of the 345 supported links of a <em>peer</em>. Peers may be linked by a variety of 346 different transports including "classical" protocols such as 347 TCP, UDP and TLS or higher-layer protocols such as GNUnet, I2P or Tor. 348 <!-- FIXME: add references to GNUnet/I2P/Tor here! --> 349 </dd> 350 </dl> 351 </section> 352 <section numbered="true" toc="default"> 353 <name>Motivation</name> 354 <section numbered="true" toc="default"> 355 <name>Restricted-route topologies</name> 356 <t> 357 Restricted-route topologies emerge when a connected underlay 358 topology prevents or restricts direct connections between 359 some of the nodes. This commonly occurs through the use of 360 NAT (<xref target="RFC2663"/>). Nodes operated behind a NAT 361 cause common DHT routing algorithms such as Kademlia <xref 362 target="Kademlia"/> to exhibit degraded performance or even 363 to fail. While excluding such nodes is an option, this 364 limits load distribution and is ineffective for some 365 networks such as MANETs. 366 </t> 367 <t> 368 In general, nodes may not be able to reach each other (for example 369 due to a firewall or NAT) despite being <em>neighbours</em> 370 according to the routing table construction algorithm of a 371 particular DHT. For example, Kademlia uses the XOR metric 372 and would generally connect nodes that have peer identities 373 with a small XOR distance. However, the XOR distance between 374 randomly assigned peer identities is completely 375 divorced from the ability of the nodes to directly 376 communicate. DHTs usually use greedy routing to store data 377 at the peer(s) closest to the key. In cases where a DHT 378 cannot connect peers according to the construction rules of 379 its routing algorithm, the topology may end up with 380 multiple local minima for a given key. Using canonical 381 greedy routing from a particular fixed location in the 382 network, a node may only be able to publish and 383 retrieve data in the proximity of its local minima. 384 </t> 385 <t> 386 R<sup>5</sup>N addresses this problem by doing a random 387 walk before a classical, deterministic XOR-based routing. 388 The optimal number of random hops 389 is equal to the mixing time of the graph, which is well known 390 for various graphs. For small-world 391 networks <xref target="Smallworld"/>, the mixing time has 392 been shown to be around <tt>O(log n)</tt> where <tt>n</tt> 393 is the number of nodes in the network 394 <xref target="Smallworldmix"/>. 395 </t> 396 <t> 397 Thus, if the network exhibits the properties of a small 398 world topology <xref target="Smallworld"/>, a random walk of 399 length <tt>O(log n)</tt> will cause the algorithm to land on 400 a random node in the network. Consequently, the 401 deterministic part of the algorithm will encounter a random 402 local minimum. It is then possible to repeat this process 403 in order to store or retrieve data in the context of all or 404 at least multiple local minima. The ideal length of the 405 random walk and the number of repetitions expected to cover 406 all local minima depends on the network topology. Our 407 design assumes that the benign subset of the network forms a 408 small-world topology <xref target="Smallworld"/> and then 409 obtains an estimate of the current number of nodes 410 <tt>n</tt> in the network and then uses <tt>log n</tt> for 411 the actual length of the random walk. 412 </t> 413 </section> 414 <section numbered="true" toc="default"> 415 <name>Key differences to RELOAD</name> 416 <t> 417 <xref target="RFC6940"/> specifies the RELOAD DHT. The R<sup>5</sup>N DHT 418 described in this document differs from RELOAD in its objectives 419 and thus in its design. 420 The authors of RELOAD make the case that P2P networks are often established 421 among a set of peers that do not trust each other. 422 It addresses this issue by requiring that node identifiers 423 are either assigned by a central authority, or self-issued in the case of closed networks. 424 In other words, by enforcing the P2P network to be established among a set 425 of <em>trusted</em> peers. 426 This misses the point that this openness is a core requirement of efficient and 427 useful DHTs as they serve a fundamental part in a decentralized network 428 infrastructure. 429 R<sup>5</sup>N, by contrast, is intended for open 430 overlay networks, and thus does not include a central enrollment server to 431 certify participants and does not limit participation in another way. 432 As participants could be malicious, R<sup>5</sup>N 433 includes on-path customizable key-value validation to delete malformed 434 data and path randomiziation 435 to help evade malicious peers. R<sup>5</sup>N also expects to perform 436 over a network where not every peer can communicate with every other peer, 437 and where thus its route discovery feature provides utility to higher-level 438 applications. As a result, both the features and the security properties 439 of RELOAD and R<sup>5</sup>N are different, except in that both allow 440 storing and retrieving key-value pairs. 441 <!-- 442 2023/08/20 CG: I believe the above text addresses the comments from MSC below ... 443 444 2022/12/23 MSC: I moved references to rfc6940 to security considerations. 445 I think we should talk about R5N in the positive here only, not about 446 RELOAD in the negative. 447 448 - Lean. Can be implemented. Not overengineered. 449 - Path tracking (more difficult) -> Not built in 450 - Certificates central server ? 451 - "self-signed certificates can be used in closed networks." 452 - "Security Framework: A P2P network will often be established among a 453 set of peers that do not trust each other. RELOAD leverages a 454 central enrollment server to provide credentials for each peer, 455 which can then be used to authenticate each operation. This 456 greatly reduces the possible attack surface." bizarre statement. 457 - For a PUT, reload requires that 458 "Each element is signed by a credential which is authorized to 459 write this Kind at this Resource-ID. If this check fails, the 460 request <bcp14>MUST</bcp14> be rejected with an Error_Forbidden error." 461 --> 462 <!--FIXME: Here we should also cite and discuss RELOAD (https://datatracker.ietf.org/doc/html/rfc6940) 463 and establish why we need this spec and are not a "Topology plugin" 464 in RELOAD. The argumentation revolves around the trust model (openness) and 465 security aspects (path signatures).--> 466 467 </t> 468 </section> 469 </section> 470 <section> 471 <name>Overview</name> 472 <t> 473 In R<sup>5</sup>N peers provide to their applications 474 the two fundamental core operations of any DHT: 475 </t> 476 <ul> 477 <li> 478 <tt>PUT</tt>: This operation stores a block 479 under a key on one or more peers with 480 the goal of making the block availiable for queries using the <tt>GET</tt> operation. 481 In the classical definition of a dictionary interface, this operation would be 482 called "insert". 483 </li> 484 <li> 485 <tt>GET</tt>: This operation queries the network of peers for any number of blocks 486 previously stored under or near a key. 487 In the classical definition of a dictionary interface, this operation would be 488 called "find". 489 </li> 490 </ul> 491 <t> 492 An example for possible semantics of the above operations 493 provided as an API to applications by an implementation are 494 outlined in <xref target="overlay"/>. 495 </t> 496 <t> 497 A peer does not necessarily need to expose the above 498 operations to applications, but it commonly will. A 499 peer that does not expose the above operations could 500 be a host purely used for bootstrapping, 501 routing or supporting 502 the overlay network with resources. 503 </t> 504 <t> 505 Similarly, there could be hosts on the network that 506 participate in the DHT but do not route traffic or store 507 data. Examples for such hosts would be mobile devices with 508 limited bandwidth, battery and storage capacity. Such hosts 509 may be used to run applications that use the DHT. However, we 510 will not refer to such hosts as peers. 511 </t> 512 <t> 513 In a trivial scenario where there is only one peer (on the local host), 514 R<sup>5</sup>N operates similarly to a dictionary data structure. 515 However, the default use case is one where nodes communicate directly and 516 indirectly in order to realize a distributed storage mechanism. 517 This communication requires a lower-level peer addressing and message transport 518 mechanism such as TCP/IP. 519 R<sup>5</sup>N is agnostic to the underlying transport protocol which is why 520 this document defines a common addressing and messaging interface in 521 <xref target="underlay"/>. 522 The interface provided by this underlay is used across the specification of the 523 R<sup>5</sup>N protocol. 524 It also serves as a set of requirements of possible transport mechanisms that 525 can be used to implement R<sup>5</sup>N with. 526 That being said, common transport protocols such as TCP/IP or UDP/IP and their 527 interfaces are suitable R<sup>5</sup>N underlays and are used as such by existing 528 implementations. 529 </t> 530 <!-- consider moving some of this back into sec considerations --> 531 <t> 532 Specifics about the protocols of the underlays implementing 533 the underlay interface or the applications 534 using the DHT are out of the scope of this document. 535 </t> 536 <t> 537 To establish an initial connection to a network of 538 R<sup>5</sup>N peers, at least one initial, addressable 539 peer is required as part of the 540 bootstrapping process. Further peers, 541 including neighbors, are then learned via a peer 542 discovery process as defined in <xref target="find_peer"/>. 543 </t> 544 <t> 545 Across this document, the functional components of an 546 R<sup>5</sup>N implementation are divided into 547 routing (<xref target="routing"/>), message 548 processing (<xref target="p2p_messages"/>) and 549 block storage (<xref target="blockstorage"/>). 550 <xref target="figure_r5n_arch"/> illustrates 551 the architectural overview of R<sup>5</sup>N. 552 </t> 553 <figure anchor="figure_r5n_arch" title="The R5N architecture."> 554 <artwork><![CDATA[ 555 | +-----------------+ +-------+ 556 Applications | | GNU Name System | | CADET | ... 557 | +-----------------+ +-------+ 558 -------------+------------------------------------ Core Operations 559 | ^ 560 | | +---------------+ 561 | | | Block Storage | 562 | | +---------------+ 563 | | ^ 564 R5N | v v 565 | +--------------------+ +---------+ 566 | | Message Processing |<-->| Routing | 567 | +--------------------+ +---------+ 568 | ^ ^ 569 | v v 570 -------------+------------------------------------ Underlay Interface 571 | +--------+ +--------+ +----------+ 572 | |GNUnet | |IP | | QUIC | 573 Connectivity | |Underlay| |Underlay| | Underlay | ... 574 | |Link | |Link | | Link | 575 | +--------+ +--------+ +----------+ 576 ]]> 577 </artwork> 578 </figure> 579 </section> 580 581 <section anchor="underlay" numbered="true" toc="default"> 582 <name>Underlay</name> 583 <t> 584 A peer <bcp14>MUST</bcp14> support one or more underlay 585 protocols. 586 Peers supporting multiple underlays effectively 587 create a bridge between different networks. How peers are 588 addressed in a specific underlay is out of scope of this 589 document. For example, a peer may have a TCP/IP address, or 590 expose a QUIC endpoint, or both. While the specific addressing options 591 and mechanisms are out of scope for this document, 592 it is necessary to define a 593 universal addressing format in order to facilitate the 594 distribution of address information to other 595 peers in the DHT overlay. 596 This standardized format 597 is the HELLO block (described in <xref 598 target="hello_block"/>), which contains sets of addresses. 599 If the address is a URI, it may indicate which 600 underlay understands the respective address. 601 </t> 602 <!-- 603 1) The current API is always fire+forget, it doesn't allow for flow 604 control. I think we need to add that, possibly for sending and receiving. 605 606 IDK. 607 CG: I think we should not have flow control for the DHT; overkill, 608 should instead simply define transmission as unreliable. 609 610 2) I'm not sure what to do with the crypto: mandate EdDSA or allow the 611 underlay to do whatever public keys it likes. 612 613 We need keys in the overlay. (Path signatures). Do they need to 614 be the same keys??? 615 616 CG: I'd mandate EdDSA. CONG will have mitigation to establish 617 EdDSA keys over libp2p, even if libp2p does not use EdDSA. But, 618 that said, I'm not sure if we should even mandate AE on the 619 underlay. 620 621 3) I think we may want to mandate that the lower layer at least 622 authenticate the other peer (i.e. every UDP message could be in 623 cleartext, but would need to come with an EdDSA signature, alas 92 byte 624 overhead and a signature verification _required_). Otherwise, I don't 625 see how we can offer even the most minimal protections against peer 626 impersonation attacks. WDYT? 627 628 CG: Yes, I think authentication should be mandatory, but not 629 any _specific_ type of encryption. 630 631 Security considerations? Prerequisites? 632 --> 633 <t> 634 It is expected that the underlay provides basic mechanisms to 635 manage peer connectivity and addressing. 636 The essence of the underlay interface is 637 captured by the following set of API calls: 638 </t> 639 <dl> 640 <dt> 641 <tt>TRY_CONNECT(P, A)</tt> 642 </dt> 643 <dd> 644 This call allows the DHT implementation to signal to the 645 underlay that the DHT wants to establish a connection to the 646 target peer <tt>P</tt> using the given address <tt>A</tt>. 647 If the connection attempt is successful, information on the 648 new peer connection will be offered through the 649 <tt>PEER_CONNECTED</tt> signal. 650 <!-- 651 Underlay implementations 652 can ignore calls with addresses they do not support. 653 --> 654 </dd> 655 <dt> 656 <tt>HOLD(P)</tt> 657 </dt> 658 <dd> 659 This call tells the underlay to hold on to a connection 660 to a peer <tt>P</tt>. Underlays are usually limited in the number 661 of active connections. With this function the DHT can indicate to the 662 underlay which connections should preferably be preserved. 663 </dd> 664 <dt> 665 <tt>DROP(P)</tt> 666 </dt> 667 <dd> 668 This call tells the underlay to drop the connection to a 669 peer <tt>P</tt>. This call is only there for symmetry and 670 used during the peer's shutdown to release all of the remaining 671 <tt>HOLDs</tt>. 672 <!-- FIXME: are we supposed to call DROP if a peer disconnects? 673 NOTE: That would seem to be an implementation detail beyond what needs 674 to be in the RFC. An API may mandate DROP on disconnect, or 675 it may simply state that when the underlay signals a 676 disconnect, all holds are automatically dropped. 677 --> 678 As R<sup>5</sup>N always prefers the longest-lived 679 connections, it would never drop an active connection that it 680 has called <tt>HOLD()</tt> on before. Nevertheless, underlay implementations 681 should not rely on this always being true. A call to <tt>DROP()</tt> also 682 does not imply that the underlay must close the connection: it merely 683 removes the preference to preserve the connection that was established 684 by <tt>HOLD()</tt>. 685 </dd> 686 <dt> 687 <tt>SEND(P, M)</tt> 688 </dt> 689 <dd> 690 This call allows the local peer to send a protocol message 691 <tt>M</tt> to a peer <tt>P</tt>. Sending messages is expected 692 to be done on a best-effort basis, thus the underlay does not 693 have to guarantee delivery or message ordering. If the underlay 694 implements flow- or congestion-control, it may 695 discard messages to limit its queue size. 696 </dd> 697 <dt> 698 <tt>ESTIMATE_NETWORK_SIZE() -> L2NSE</tt> 699 </dt> 700 <dd> 701 This call must return an estimate of the network size. The 702 resulting <tt>L2NSE</tt> value must be the base-2 logarithm 703 of the <em>estimated</em> number of peers in the network. 704 This estimate is used by the routing algorithm. If the underlay does 705 not support a protocol for network size estimation (such as 706 <xref target="NSE"/>) the value is assumed to be provided as 707 a configuration parameter to the underlay implementation. 708 </dd> 709 </dl> 710 <t> 711 The above calls are meant to be actively executed by the 712 implementation as part of the peer-to-peer protocol. In 713 addition, the underlay creates <em>signals</em> to drive 714 updates of the routing table, local storage and message 715 processing (<xref target="p2p_messages"/>). Specifically, 716 the underlay is expected to emit the following 717 signals (usually implemented as callbacks) based on network 718 events observed by the underlay implementation: 719 </t> 720 <dl> 721 <dt> 722 <tt>PEER_CONNECTED -> P</tt> 723 </dt> 724 <dd> 725 This signal allows the DHT to react to a newly connected 726 peer <tt>P</tt>. Such an event triggers, for example, 727 updates in the routing table and gossiping of HELLOs to that 728 peer. Underlays may include meta-data about the connection, 729 for example to indicate that the connection is from a 730 resource-constrained host that does not intend to function 731 as a full peer and thus should not be considered 732 for routing. 733 </dd> 734 <dt> 735 <tt>PEER_DISCONNECTED -> P</tt> 736 </dt> 737 <dd> 738 This signal allows the DHT to react to a recently 739 disconnected peer. Such an event primarily triggers updates 740 in the routing table. 741 </dd> 742 <dt> 743 <tt>ADDRESS_ADDED -> A</tt> 744 </dt> 745 <dd> 746 The underlay signals indicates that an address <tt>A</tt> 747 was added for our local peer and that henceforth the peer 748 may be reachable under this address. This information is 749 used to advertise connectivity information about the local 750 peer to other peers. <tt>A</tt> is an 751 address suitable for inclusion in a <tt>HELLO</tt> payload 752 <xref target="hello_block"/>. 753 </dd> 754 <dt> 755 <tt>ADDRESS_DELETED -> A</tt> 756 </dt> 757 <dd> 758 This underlay signal indicates that an address <tt>A</tt> 759 was removed from the set of addresses the local peer is 760 possibly reachable under. The signal is used 761 to stop advertising this address to other peers. 762 </dd> 763 <dt> 764 <tt>RECEIVE -> (P, M)</tt> 765 </dt> 766 <dd> 767 This signal informs the local peer that a protocol 768 message <tt>M</tt> was received from a peer <tt>P</tt>. 769 </dd> 770 </dl> 771 </section> 772 <section anchor="routing" numbered="true" toc="default"> 773 <name>Routing</name> 774 <t> 775 To enable routing, any R<sup>5</sup>N implementation must keep 776 information about its current set of neighbors. 777 Upon receiving a connection notification from the 778 underlay interface through a 779 <tt>PEER_CONNECTED</tt> signal, information on the new neighbor 780 <bcp14>MUST</bcp14> be added to the routing table, except if the 781 respective <tt>k</tt>-bucket in the routing table is full or if meta-data 782 is present that indicates that the peer does not wish to participate 783 in routing. 784 Peers added to the routing table <bcp14>SHOULD</bcp14> be signalled to the 785 underlay as important connections using a <tt>HOLD()</tt> call. 786 Similarly when a disconnect is indicated by the underlay through 787 a <tt>PEER_DISCONNECTED</tt> signal, the peer 788 <bcp14>MUST</bcp14> be removed from the routing table. 789 <!-- FIXME: are we supposed to call DROP if we called HOLD if a peer disconnects!?? --> 790 </t> 791 <t> 792 To achieve logarithmically bounded routing performance, 793 the data structure for managing neighbors and their 794 metadata <bcp14>MUST</bcp14> be implemented using the k-buckets concept of 795 <xref target="Kademlia"/> as defined in <xref target="routing_table"/>. 796 Maintenance of the routing table (after bootstrapping) is 797 described in <xref target="find_peer"/>. 798 </t> 799 <t> 800 Unlike <xref target="Kademlia"/>, routing decisions in 801 R<sup>5</sup>N are also influenced by a Bloom filter in the message 802 that prevents routing loops. This data structure is discussed in 803 <xref target="routing_bloomfilter"/>. 804 </t> 805 <t> 806 In order to select peers which are suitable destinations for 807 routing messages, R<sup>5</sup>N uses a hybrid approach: Given 808 an estimated network size <tt>L2NSE</tt> retrieved using 809 <tt>ESTIMATE_NETWORK_SIZE()</tt>, the peer selection for the 810 first <tt>L2NSE</tt> hops is random. After the initial 811 <tt>L2NSE</tt> hops, peer selection follows an XOR-based peer 812 distance calculation. 813 <xref target="routing_functions"/> 814 describes the corresponding routing functions. 815 </t> 816 <t> 817 Finally, each <tt>ResultMessage</tt> is routed back along the 818 path that the corresponding <tt>GetMessage</tt> took 819 previously. This is enabled by tracking state per 820 <tt>GetMessage</tt> in the pending table described in 821 <xref target="pending_table"/>. 822 </t> 823 <section anchor="routing_table"> 824 <name>Routing Table</name> 825 <t> 826 Whenever a <tt>PEER_CONNECTED</tt> signal is received from 827 the underlay, the respective peer is considered for 828 insertion into the routing table. The routing table 829 consists of an array of <tt>k</tt>-buckets. Each 830 <tt>k</tt>-bucket contains a list of neighbors. 831 The i-th <tt>k</tt>-bucket stores neighbors whose peer 832 public keys are between XOR-distance 2<sup>i</sup> and 833 2<sup>i+1</sup> from the local peer; <tt>i</tt> can be 834 directly computed from the two peer identities using the 835 <tt>GetDistance()</tt> function. System constraints will 836 typically force an implementation to impose some upper limit 837 on the number of neighbors kept per 838 <tt>k</tt>-bucket. Upon insertion, the implementation 839 <bcp14>MUST</bcp14> call <tt>HOLD()</tt> on the respective 840 neighor. 841 </t> 842 <t> 843 Implementations <bcp14>SHOULD</bcp14> try to keep at least 844 5 entries per <tt>k</tt>-bucket. Embedded systems that cannot manage 845 this number of connections <bcp14>MAY</bcp14> use connection-level 846 signalling to indicate that they are merely a client utilizing a 847 DHT and not able to participate in routing. DHT peers receiving 848 such connections <bcp14>MUST NOT</bcp14> include connections to 849 such restricted systems in their <tt>k</tt>-buckets, thereby effectively 850 excluding them when making routing decisions. 851 </t> 852 <t> 853 If a system hits constraints with respect to 854 the number of active connections, an implementation 855 <bcp14>MUST</bcp14> evict neighbours from those <tt>k</tt>-buckets with the 856 largest number of neighbors. The eviction strategy <bcp14>MUST</bcp14> be 857 to drop the shortest-lived connection per <tt>k</tt>-bucket first. 858 </t> 859 <t> 860 Implementations <bcp14>MAY</bcp14> cache valid addresses of disconnected 861 peers outside of the routing table and sporadically or periodically try to (re-)establish connection 862 to the peer by making <tt>TRY_CONNECT()</tt> calls to the underlay interface 863 if the respective <tt>k</tt>-bucket has empty slots. 864 </t> 865 </section> 866 <section anchor="find_peer"> 867 <name>Peer Discovery</name> 868 <t> 869 Initially, implementations require at least one initial connection to a 870 neighbor (signalled through 871 <tt>PEER_CONNECTED</tt>). 872 The first connection <bcp14>SHOULD</bcp14> be established 873 by an out-of-band exchange of the information from a 874 <tt>HELLO</tt> block. 875 This is commonly achieved through the 876 configuration of hardcoded bootstrap peers or bootstrap 877 servers either for the underlay or the R<sup>5</sup>N 878 implementation. 879 </t> 880 <t> 881 Implementations <bcp14>MAY</bcp14> have other means to achieve this 882 initial connection. 883 For example, implementations could allow the application or 884 even end-user to provide a working <tt>HELLO</tt> 885 which is then in turn used to call <tt>TRY_CONNECT()</tt> on 886 the underlay in order to trigger a subsequent 887 <tt>PEER_CONNECTED</tt> signal from the underlay 888 interface. 889 <xref target="hello_url"/> specifies 890 a URL format for encoding HELLO blocks as text strings. The 891 URL format thus provides a portable, human-readable, 892 text-based serialization format that can, for example, be 893 encoded into a QR code for dissemination. 894 HELLO URLs <bcp14>SHOULD</bcp14> be supported by implementations for 895 both import and export of <tt>HELLOs</tt>. 896 </t> 897 <t> 898 To discover additional peers for its routing table, a peer 899 <bcp14>MUST</bcp14> initiate <tt>GetMessage</tt> requests 900 (see <xref target="p2p_get"/>) asking for blocks of type 901 <tt>HELLO</tt> using its own peer identity in the 902 <tt>QUERY_HASH</tt> field of the message. The 903 <tt>PEER_BF</tt> field of the <tt>GetMessage</tt> 904 <bcp14>MUST</bcp14> be 905 initialized to filter the peer's own peer identity as well as the peer 906 identities of all currently connected 907 neighbors. These requests <bcp14>MUST</bcp14> use 908 the <tt>FindApproximate</tt> and 909 <tt>DemultiplexEverywhere</tt> 910 flags. <tt>FindApproximate</tt> will ensure that other peers 911 will reply with results where the keys are merely considered 912 close-enough, while <tt>DemultiplexEverywhere</tt> will 913 cause each peer on the path to respond if it has relevant 914 information. The combination of these flags is thus likely 915 to yield <tt>HELLOs</tt> of peers that are useful somewhere 916 in the initiator's routing table. The <bcp14>RECOMMENDED</bcp14> 917 replication level to be set in the <tt>REPL_LVL</tt> field 918 is 4. The size and format of the result filter is specified 919 in <xref target="hello_block"/>. The <tt>XQUERY</tt> 920 <bcp14>MUST</bcp14> be empty. 921 </t> 922 <t> 923 In order to facilitate peers answering requests for 924 <tt>HELLOs</tt>, the underlay is expected to provide the 925 implementation with addresses signalled through 926 <tt>ADDRESS_ADDED</tt>. It is possible that no addresses are 927 provided if a peer can only establish outgoing connections 928 and is otherwise unreachable. An implementation 929 <bcp14>MUST</bcp14> advertise its addresses periodically to 930 its neighbors through <tt>HelloMessages</tt>. The 931 advertisement interval and expiration <bcp14>SHOULD</bcp14> 932 be configurable. 933 If the values are chosen at the discretion of the 934 implementation, it is <bcp14>RECOMMENDED</bcp14> to choose external 935 factors such as expiration of DHCP leases to determine the values. 936 The specific frequency of advertisements 937 <bcp14>SHOULD</bcp14> be smaller than the expiration 938 period. 939 It <bcp14>MAY</bcp14> additionally depend on available bandwidth, 940 the set of already connected neighbors, the workload of the system and 941 other factors which are at the discretion of the developer. 942 If <tt>HelloMessages</tt> are not updated before they expire, 943 peers might be unable to discover and connect to the respective 944 peer, and thus miss out on quality routing table entries. This 945 would degrade the performance of the DHT and <bcp14>SHOULD</bcp14> 946 thus be avoided by advertising updated <tt>HELLOs</tt> before the 947 previous one expires. When using unreliable underlays, an implementation 948 <bcp14>MAY</bcp14> use higher frequencies and transmit 949 more <tt>HelloMessages</tt> within an expiration interval 950 to ensure that neighbours almost always have non-expired 951 <tt>HelloMessages</tt> at their disposal even if some messages 952 are lost. 953 </t> 954 <t> 955 Whenever a peer receives such a <tt>HelloMessage</tt> 956 from another peer that is already in the routing 957 table, it must cache it as long as that peer remains in its 958 routing table (or until the <tt>HELLO</tt> expires) and 959 serve it in response to <tt>GET</tt> requests for 960 <tt>HELLO</tt> blocks (see <xref 961 target="p2p_get_processing"/>). This behaviour makes it 962 unnecessary for peers to initiate dedicated 963 <tt>PutMessages</tt> containing <tt>HELLO</tt> blocks. 964 </t> 965 </section> 966 <section anchor="routing_bloomfilter"> 967 <name>Peer Bloom Filter</name> 968 <t> 969 As DHT <tt>GetMessages</tt> and <tt>PutMessages</tt> 970 traverse a random path through the network for the first 971 <tt>L2NSE</tt> hops, a key design objective of 972 R<sup>5</sup>N is to avoid routing loops. The peer Bloom 973 filter is part of the routing metadata in messages to 974 prevent circular routes. It is updated at each hop where the 975 hop's peer identity derived from the peer's public key is 976 added to it. 977 The peer Bloom filter follows the definition in <xref target="bloom_filters"/>. 978 It <bcp14>MUST</bcp14> be <tt>L=1024</tt> bits 979 (128 bytes) in size and <bcp14>MUST</bcp14> set <tt>k=16</tt> bits per 980 element. 981 The set of elements <tt>E</tt> consists of of all possible 256-bit peer 982 public keys and the mapping function <tt>M</tt> is defined as follows: 983 </t> 984 <t> 985 <tt>M(e) -> SHA-512 (e) as uint32[]</tt> 986 </t> 987 <t> 988 The element <tt>e</tt> is the peer public key which is hashed using SHA-512. 989 The resulting 512-bit peer identity is interpreted as an array of k=16 990 32-bit integers in network byte order which are used to set and check the bits 991 in <tt>B</tt> using <tt>BF-SET()</tt> and <tt>BF-TEST()</tt>. 992 </t> 993 <t> 994 At this size, the Bloom filter reaches a false-positive rate of 995 approximately fifty percent at about 200 entries. For peer 996 discovery where the Bloom filter is initially populated with 997 peer identities from the local routing table, the 200 998 entries would still be enough for 40 buckets assuming 5 999 peers per bucket, which corresponds to an overlay network 1000 size of approximately 1 trillion peers. Thus, 1001 <tt>L=1024</tt> bits should suffice for all conceivable 1002 use-cases. 1003 </t> 1004 <t> 1005 For the next hop selection in both the random 1006 and the deterministic case, any peer which is in the peer 1007 Bloom filter for the respective message is excluded from the 1008 peer selection process. 1009 Any peer which is forwarding <tt>GetMessages</tt> or <tt>PutMessages</tt> 1010 (<xref target="p2p_messages"/>) thus adds its own peer public key to the 1011 peer Bloom filter. 1012 This allows other peers to (probabilistically) exclude already 1013 traversed peers when searching for the next hops in the routing table. 1014 </t> 1015 <t> 1016 We note that the peer Bloom filter may exclude peers due to false-postive 1017 matches. This is acceptable as routing should nevertheless 1018 terminate (with high probability) in close vicinity of the key. Furthermore, 1019 due to the randomization of the first L2NSE hops, it is possible that 1020 false-positives will be different when a request is repeated. 1021 </t> 1022 </section> 1023 <section anchor="routing_functions"> 1024 <name>Routing Functions</name> 1025 <t> 1026 Using the data structures described so far, 1027 the R<sup>5</sup>N routing component provides 1028 the following functions for 1029 message processing (<xref target="p2p_messages"/>): 1030 </t> 1031 <dl> 1032 <dt> 1033 <tt>GetDistance(A, B) -> Distance</tt> 1034 </dt> 1035 <dd> 1036 This function calculates the binary XOR between A and B. 1037 The resulting distance is interpreted as an integer where 1038 the leftmost bit is the most significant bit. 1039 </dd> 1040 <dt> 1041 <tt>SelectClosestPeer(K, B) -> N</tt> 1042 </dt> 1043 <dd> 1044 This function selects the neighbor <tt>N</tt> from our 1045 routing table with the shortest XOR-distance to the key <tt>K</tt>. 1046 This means that for all other peers <tt>N'</tt> in the routing table 1047 <tt>GetDistance(N, K) < GetDistance(N',K)</tt>. 1048 Peers with a positive test against the peer Bloom 1049 filter <tt>B</tt> are not considered. 1050 </dd> 1051 <dt> 1052 <tt>SelectRandomPeer(B) -> N</tt> 1053 </dt> 1054 <dd> 1055 This function selects a random peer <tt>N</tt> from 1056 all neighbors. 1057 Peers with a positive test in the peer Bloom 1058 filter <tt>B</tt> are not considered. 1059 </dd> 1060 <dt> 1061 <tt>SelectPeer(K, H, B) -> N</tt> 1062 </dt> 1063 <dd> 1064 This function selects a neighbor <tt>N</tt> depending on the 1065 number of hops <tt>H</tt> parameter. 1066 If <tt>H < NETWORK_SIZE_ESTIMATE</tt> 1067 returns <tt>SelectRandomPeer(B)</tt>, and otherwise 1068 returns <tt>SelectClosestPeer(K, B)</tt>. 1069 </dd> 1070 <dt> 1071 <tt>IsClosestPeer(N, K, B) -> true | false</tt> 1072 </dt> 1073 <dd> 1074 This function checks if <tt>N</tt> is the closest peer for <tt>K</tt> 1075 (cf. <tt>SelectClosestPeer(K, B)</tt>). 1076 Peers with a positive test in the Bloom filter <tt>B</tt> are not considered. 1077 </dd> 1078 <dt> 1079 <tt>ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE) -> Number</tt> 1080 </dt> 1081 <dd> 1082 <t> 1083 This function computes the number of neighbors 1084 that a message should be forwarded to. The arguments 1085 are the desired replication level (<tt>REPL_LVL</tt>), 1086 the <tt>HOPCOUNT</tt> of the message so far and 1087 and the current network size estimate (<tt>L2NSE</tt>) 1088 as provided by the underlay. 1089 The result is the non-negative number of next hops to 1090 select. The following figure gives the 1091 pseudocode for computing the number of neighbors 1092 the peer should attempt to forward the message to. 1093 </t> 1094 <figure anchor="compute_out_degree" title="Computing the number of next hops."> 1095 <artwork name="" type="" align="left" alt=""><![CDATA[ 1096 ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE): 1097 if (HOPCOUNT > L2NSE * 4) 1098 return 0; 1099 if (HOPCOUNT > L2NSE * 2) 1100 return 1; 1101 if (0 = REPL_LEVL) 1102 REPL_LEVL = 1 1103 if (REPL_LEVEL > 16) 1104 REPL_LEVEL = 16 1105 RM1 = REPL_LEVEL - 1 1106 FRAC = 1 + (RM1 / (L2NSE + RM1 * HOPCOUNT)) 1107 return PROUND(FRAC) 1108 ]]></artwork> 1109 </figure> 1110 <t> 1111 The above calculation of <tt>FRAC</tt> may yield values that are 1112 not discrete. 1113 The result is <tt>FRAC</tt> rounded probabilistically 1114 (<tt>PROUND</tt>) to the nearest 1115 discrete value, using the fraction 1116 as the probability for rounding up. 1117 For example, a value of <tt>3.14</tt> is rounded up to <tt>4</tt> with 1118 a probability of 14% and rounded down to <tt>3</tt> with a probability 1119 of 86%. 1120 This probabilistic rounding is necessary to achieve 1121 the statistically expected value of the replication 1122 level and average number of peers a message is forwarded to. 1123 </t> 1124 </dd> 1125 </dl> 1126 </section> 1127 <section anchor="pending_table"> 1128 <name>Pending Table</name> 1129 <t> 1130 R<sup>5</sup>N performs stateful routing where the messages 1131 only carry the query hash and do not encode the ultimate 1132 source or destination of the request. Routing a request 1133 towards the key is done hop-by-hop using the routing table and the 1134 query hash. The pending table is used to route responses 1135 back to the originator. In the pending table each peer 1136 primarily maps a query hash to the associated 1137 originator of the request. The pending table <bcp14>MUST</bcp14> 1138 store entries for the last <tt>MAX_RECENT</tt> requests 1139 the peer has encountered. 1140 To ensure that the peer does 1141 not run out of memory, information about older requests 1142 <bcp14>MAY</bcp14> be discarded. 1143 The value of <tt>MAX_RECENT</tt> <bcp14>MAY</bcp14> be configurable 1144 at the host level to use available memory resources without 1145 conflicting with other system requirements and limitations. 1146 <tt>MAX_RECENT</tt> <bcp14>SHOULD</bcp14> 1147 be at least 128 * 10<sup>3</sup>. 1148 If the pending table is smaller, the likelihood grows that a peer 1149 receives a response to a query but is unable to forward it to the 1150 initiator because it forgot the predecessor. Eventually, the 1151 initiator would likely succeed by repeating the query. 1152 However, this would be much more expensive than peers having an adequately 1153 sized pending table. 1154 </t> 1155 <t> 1156 For each entry in the pending table, the DHT 1157 <bcp14>MUST</bcp14> track the query key, the peer identity 1158 of the previous hop, the extended query, requested block type, 1159 flags, and the result filter. If the query did not provide 1160 a result filter, a fresh result filter <bcp14>MUST</bcp14> 1161 still be created to filter duplicate replies. Details of 1162 how a result filter works depend on the type, as described 1163 in <xref target="block_functions"/>. 1164 </t> 1165 <t> 1166 When a second query from the same origin for the 1167 same query hash is received, the DHT <bcp14>MUST</bcp14> 1168 attempt to merge the new request with the state for 1169 the old request. If this is not possible (say because 1170 the MUTATOR differs), the 1171 existing result filter <bcp14>MUST</bcp14> be 1172 discarded and replaced with the result 1173 filter of the incoming message. 1174 </t> 1175 <t> 1176 We note that for local applications, a fixed limit on 1177 the number of concurrent requests may be problematic. 1178 Hence, it is <bcp14>RECOMMENDED</bcp14> that implementations 1179 track requests from local applications separately and 1180 preserve the information about requests from local 1181 applications until the local application explicitly 1182 stops the request. 1183 </t> 1184 </section> 1185 </section> 1186 <section anchor="p2p_messages" numbered="true" toc="default"> 1187 <name>Message Processing</name> 1188 <t> 1189 An implementation will process 1190 messages either because it needs to transmit messages as part of routing 1191 table maintenance, or due to requests from local applications, or 1192 because it received a message from a neighbor. 1193 If instructed through an application-facing API such as the one outlined 1194 in <xref target="overlay"/>, a peer acts as an initiator 1195 of <tt>GetMessages</tt> 1196 or <tt>PutMessages</tt>. 1197 The status of initiator is relevant for peers when processing <tt>ResultMessages</tt> 1198 due to the required handover of results to the application that requested 1199 the respective result. 1200 </t> 1201 <t> 1202 The implementation <bcp14>MUST</bcp14> listen for <tt>RECEIVE(P, M)</tt> signals 1203 from the underlay and react to the respective messages sent by 1204 the peer <tt>P</tt>. 1205 </t> 1206 <t> 1207 Whether initiated locally or received from a neighbor, an implementation 1208 processes messages according to the wire formats and the required 1209 validations detailed in the following sections. 1210 Where required, the local peer public key is referred to as <tt>SELF</tt>. 1211 </t> 1212 <section anchor="message_components"> 1213 <name>Message components</name> 1214 <t> 1215 This section describes some data structures and fields shared 1216 by various types of messages. 1217 </t> 1218 <section anchor="route_flags"> 1219 <name>Flags</name> 1220 <t> 1221 Flags is an 8-bit vector representing binary options. 1222 Each flag is represented by a bit in the field starting from 0 as 1223 the rightmost bit to 7 as the leftmost bit. 1224 </t> 1225 <dl> 1226 <dt>0: DemultiplexEverywhere</dt> 1227 <dd> 1228 This bit indicates that each peer along the way should process the request. 1229 If the bit is not set, intermediate peers only route the message and only 1230 peers which consider themselves closest to the key (based on their 1231 routing table) look for answers 1232 in their local storage for <tt>GetMessages</tt>, or respectively cache the 1233 block in their local storage for <tt>PutMessages</tt> and <tt>ResultMessages</tt>. 1234 </dd> 1235 <dt>1: RecordRoute</dt> 1236 <dd> 1237 This bit indicates to keep track of the path that the message takes 1238 in the P2P network. 1239 </dd> 1240 <dt>2: FindApproximate</dt> 1241 <dd> 1242 This bit asks peers to return results even if the key 1243 does not exactly match the query hash. 1244 </dd> 1245 <dt>3: Truncated</dt> 1246 <dd> 1247 This is a special flag which is set if a peer truncated the 1248 recorded path. 1249 This results in the first hop on the path to be given without a signature 1250 to enable checking of the next signature. 1251 This flag <bcp14>MUST NOT</bcp14> be set in 1252 a <tt>GetMessage</tt>. 1253 </dd> 1254 <dt>4-7: Reserved</dt> 1255 <dd> 1256 The remaining bits are reserved for future use and 1257 <bcp14>MUST</bcp14> be set to zero when initiating an operation. 1258 If non-zero bits are received, implementations <bcp14>MUST</bcp14> 1259 preserve these bits when forwarding messages. 1260 </dd> 1261 </dl> 1262 </section> 1263 1264 <section anchor="p2p_path"> 1265 <name>Path</name> 1266 <t> 1267 If the <tt>RecordRoute</tt> flag is set, the route of a <tt>PutMessage</tt> 1268 or a <tt>ResultMessage</tt> through the overlay network is recorded in the 1269 <tt>PATH</tt> field of the message. <tt>PATH</tt> is a list of path elements. 1270 A new path element (<xref target="p2p_pathelement"/>) is appended to the 1271 existing <tt>PATH</tt> before a peer sends the message to the next peer. 1272 </t> 1273 <t> 1274 A path element contains a signature and the public key of the peer that 1275 created the element. The signature is computed over the public keys of the 1276 previous peer (from which the message was received) and next peer (the 1277 peer the message is send to). A new message has no previous peer and 1278 uses all <tt>ZEROs</tt> (32 NULL-bytes) in the 1279 public key field when creating the signature. 1280 </t> 1281 <t> 1282 Assuming peer A sends a new <tt>PUT</tt> message to peer B, which forwards 1283 the message to peer C, which forwards to peer D which finally stores the data. 1284 The <tt>PATH</tt> field of the message received at peer D contains three 1285 path elements (build from top to bottom): 1286 </t> 1287 <figure anchor="figure_path" title="Example PATH"> 1288 <artwork name="" type="" align="left" alt=""><![CDATA[ 1289 +---------------------+-------+ 1290 | Sig A(ZEROs, Pub B) | Pub A | 1291 +---------------------+-------+ 1292 +---------------------+-------+ 1293 | Sig B(Pub A, Pub C) | Pub B | 1294 +---------------------+-------+ 1295 +---------------------+-------+ 1296 | Sig C(Pub B, Pub D) | Pub C | 1297 +---------------------+-------+ 1298 ]]></artwork> 1299 </figure> 1300 <t> 1301 Note that the wire format of <tt>PATH</tt> (<xref target="p2p_pathelement"/>) 1302 will not include the last public key (Pub C in our example) as this will be 1303 redundant; the receiver of a message can use the public key of the sender as 1304 the public key to verify the last signature. 1305 </t> 1306 <t> 1307 The <tt>PATH</tt> is stored along with the payload data from the <tt>PUT</tt> 1308 message at the final peer. Note that the same payload stored at different 1309 peers will have a different <tt>PATH</tt> associated with it. 1310 </t> 1311 <t> 1312 When the storing peer delivers the data based on a <tt>GET</tt> request, it 1313 initializes the <tt>PATH</tt> field with the stored path value and appends 1314 a new path element. The first part of <tt>PATH</tt> in a <tt>GET</tt> response 1315 message is called the <tt>PutPath</tt>, followed 1316 by the <tt>GetPath</tt>. This way the combined <tt>PATH</tt> will record the 1317 whole route of the payload from the originating peer (initial 1318 <tt>PutMessage</tt>) to the requesting peer (initial <tt>GetMessage</tt>). 1319 </t> 1320 <t> 1321 When receiving a message with flag <tt>RecordRoute</tt> and <tt>PATH</tt>, 1322 a peer is encouraged to verify the integrity of <tt>PATH</tt> (if the 1323 available resources of the peer allow this) by checking the signatures of 1324 the path elements. 1325 </t> 1326 <t> 1327 If an invalid signature is detected, the path is truncated keeping only 1328 element fields following the faulty signature and setting the <tt>Truncated</tt> 1329 flag. Assume that peer C detects a faulty signature from peer B, 1330 the trunacted path has two entries: 1331 </t> 1332 <figure anchor="figure_path_truncated" title="Example truncated PATH"> 1333 <artwork name="" type="" align="left" alt=""><![CDATA[ 1334 +-------+ +----------------------+-------+ 1335 | Pub B | | Sig C (Pub B, Pub D) | Pub C | 1336 +-------+ +----------------------+-------+ 1337 ]]></artwork> 1338 </figure> 1339 <t> 1340 The <tt>Truncated</tt> flag indicates that the first path element does not contain 1341 a signature but only the public key of the peer where the signature fails. 1342 </t> 1343 </section> 1344 1345 <section anchor="p2p_pathelement"> 1346 <!-- TODO-GROTHOFF: Discuss this change again. The text is currently not correct 1347 it is very difficult to understand. Is it worth 32 byte; 1348 CG: I've fixed the figures, tried to clarify the text. Is it OK now? --> 1349 <name>Path Element</name> 1350 <t> 1351 A path element represents a hop in the path a message has taken 1352 through the overlay network. 1353 The wire format of a path element is illustrated in 1354 <xref target="figure_pathelement"/>. 1355 </t> 1356 <figure anchor="figure_pathelement" title="The Wire Format of a path element."> 1357 <artwork name="" type="" align="left" alt=""><![CDATA[ 1358 0 8 16 24 32 40 48 56 1359 +-----+-----+-----+-----+-----+-----+-----+-----+ 1360 | SIGNATURE | 1361 | (64 bytes) | 1362 | | 1363 | | 1364 | | 1365 | | 1366 | | 1367 | | 1368 +-----+-----+-----+-----+-----+-----+-----+-----+ 1369 | PRED PEER PUBLIC KEY | 1370 | (32 bytes) | 1371 | | 1372 | | 1373 +-----+-----+-----+-----+-----+-----+-----+-----+ 1374 ]]></artwork> 1375 </figure> 1376 <t>where:</t> 1377 <dl> 1378 <dt>SIGNATURE</dt> 1379 <dd> 1380 is a 64 byte EdDSA signature <xref target="ed25519"/> created 1381 using the current hop's private 1382 key which affirms the public keys of the peers from the 1383 previous and next hops. 1384 </dd> 1385 <dt>PRED PEER PUBLIC KEY</dt> 1386 <dd> 1387 is the EdDSA public key <xref target="ed25519"/> of the previous peer on the path. 1388 </dd> 1389 </dl> 1390 <t> 1391 An ordered list of path elements may be appended to any routed 1392 <tt>PutMessages</tt> or <tt>ResultMessages</tt>. 1393 The last signature (after which the peer public key is omitted) 1394 is created by the current hop only after the peer made its routing 1395 decision identifiying the successor peer. The peer public key is not 1396 included after the last signature as it must be that of the sender of 1397 the message and including it would thus be redundant. 1398 Similarly, the predecessor of the first element of 1399 an untruncated path is not stated explicitly, as it must be <tt>ZERO</tt> 1400 (32 NULL-bytes). 1401 </t> 1402 <t> 1403 <xref target="figure_path_ex"/> shows the wire format of an example 1404 path from peer A over peers B and C and D as it would be received by peer E in the 1405 <tt>PUTPATH</tt> of a <tt>PutMessage</tt>, or as the combined 1406 <tt>PUTPATH</tt> and <tt>GETPATH</tt> of a <tt>ResultMessage</tt>. 1407 The wire format of the path elements allows a natural 1408 extension of the <tt>PUTPATH</tt> along the route of the <tt>ResultMessage</tt> 1409 to the destination forming the <tt>GETPATH</tt>. 1410 The <tt>PutMessage</tt> would indicate in the <tt>PATH_LEN</tt> field 1411 a length of 3. 1412 The <tt>ResultMessage</tt> would indicate a path length of 3 as the 1413 sum of the field values in <tt>PUTPATH_L</tt> and <tt>GETPATH_L</tt>. 1414 Basically, the last signature does not count for the path length. 1415 </t> 1416 <figure anchor="figure_path_ex" title="Example of a path as found in PutMessages or ResultMessages from Peer A to Peer D as transmitted by Peer D."> 1417 <artwork name="" type="" align="left" alt=""><![CDATA[ 1418 0 8 16 24 32 40 48 56 1419 +-----+-----+-----+-----+-----+-----+-----+-----+ 1420 | SIGNATURE A | 1421 | (64 bytes) | 1422 | | 1423 | (Over ZERO and PEER B signed by PEER A) | 1424 | | 1425 | | 1426 | | 1427 | | 1428 +-----+-----+-----+-----+-----+-----+-----+-----+ 1429 | PEER A | 1430 | (32 bytes) | 1431 | | 1432 | | 1433 +-----+-----+-----+-----+-----+-----+-----+-----+ 1434 | SIGNATURE B | 1435 | (64 bytes) | 1436 | | 1437 | (Over PEER A and PEER C signed by PEER B) | 1438 | | 1439 | | 1440 | | 1441 | | 1442 +-----+-----+-----+-----+-----+-----+-----+-----+ 1443 | PEER B | 1444 | (32 bytes) | 1445 | | 1446 | | 1447 +-----+-----+-----+-----+-----+-----+-----+-----+ 1448 | SIGNATURE C | 1449 | (64 bytes) | 1450 | | 1451 | (Over PEER B and PEER D signed by PEER C) | 1452 | | 1453 | | 1454 | | 1455 | | 1456 +-----+-----+-----+-----+-----+-----+-----+-----+ 1457 | PEER C | 1458 | (32 bytes) | 1459 | | 1460 | | 1461 +-----+-----+-----+-----+-----+-----+-----+-----+ 1462 | SIGNATURE D (last sig) | 1463 | (64 bytes) | 1464 | | 1465 | (Over PEER C and receiver signed by PEER D) | 1466 | | 1467 | | 1468 | | 1469 | | 1470 +-----+-----+-----+-----+-----+-----+-----+-----+ 1471 ]]></artwork> 1472 </figure> 1473 1474 <t> 1475 A path may be truncated in which case the signature of the truncated 1476 path element is omitted leaving only the public key of the peer preceeding 1477 the truncation which is required for the 1478 verification of the subsequent path element signature. 1479 Such a truncated path is indicated with the respective 1480 truncated flag (<xref target="route_flags"/>). 1481 For truncated paths, the peer public key of the signer of the last path element is 1482 again omitted as it must be that of 1483 the sender of the <tt>PutMesssage</tt> or <tt>ResultMessage</tt>. Similarly, 1484 the public key of the receiving peer used in the last path element is omitted as 1485 it must be SELF. 1486 The wire format of a truncated example path from peers B over C and D to E 1487 (possibly still originating at A, but the origin is unknowable to E due to truncation) 1488 is illustrated in <xref target="figure_path_ex_trunc"/>. 1489 Here, a <tt>ResultMessage</tt> would indicate in the <tt>PATH_LEN</tt> field 1490 a length of 1 while 1491 a <tt>PutMessage</tt> would indicate a length of 1 as the sum of 1492 <tt>PUTPATH_L</tt> and <tt>GETPATH_L</tt> fields. 1493 Basically, the truncated peer and the last signature do not count 1494 for the path length. 1495 </t> 1496 <figure anchor="figure_path_ex_trunc" title="Example of a truncated path from Peer B to Peer D as transmitted by Peer D."> 1497 <artwork name="" type="" align="left" alt=""><![CDATA[ 1498 0 8 16 24 32 40 48 56 1499 +-----+-----+-----+-----+-----+-----+-----+-----+ 1500 | PEER B (truncated) | 1501 | (32 byte) | 1502 | | 1503 | | 1504 +-----+-----+-----+-----+-----+-----+-----+-----+ 1505 | SIGNATURE C | 1506 | (64 bytes) | 1507 | | 1508 | (Over PEER B and PEER D signed by PEER C) | 1509 | | 1510 | | 1511 | | 1512 | | 1513 +-----+-----+-----+-----+-----+-----+-----+-----+ 1514 | PEER C | 1515 | (32 bytes) | 1516 | | 1517 | | 1518 +-----+-----+-----+-----+-----+-----+-----+-----+ 1519 | SIGNATURE D (last sig) | 1520 | (64 byte) | 1521 | | 1522 | (Over PEER C and receiver signed by PEER D) | 1523 | | 1524 | | 1525 | | 1526 | | 1527 +-----+-----+-----+-----+-----+-----+-----+-----+ 1528 ]]></artwork> 1529 </figure> 1530 <t> 1531 The SIGNATURE field in a path element covers a 64-bit contextualization header, the 1532 the block expiration, a hash of the block 1533 payload, as well as the predecessor peer public key and the peer public key of the 1534 successor that the peer making the signature is routing the 1535 message to. Thus, the signature made by SELF basically says that 1536 SELF received the block payload from PEER PREDECESSOR and has forwarded 1537 it to PEER SUCCESSOR. The wire format is illustrated 1538 in <xref target="figure_pathelewithpseudo"/>. 1539 </t> 1540 <figure anchor="figure_pathelewithpseudo" title="The Wire Format of the path element for Signing."> 1541 <artwork name="" type="" align="left" alt=""><![CDATA[ 1542 0 8 16 24 32 40 48 56 1543 +-----+-----+-----+-----+-----+-----+-----+-----+ 1544 | SIZE | PURPOSE | 1545 +-----+-----+-----+-----+-----+-----+-----+-----+ 1546 | EXPIRATION | 1547 +-----+-----+-----+-----+-----+-----+-----+-----+ 1548 | BLOCK HASH | 1549 | (64 byte) | 1550 | | 1551 | | 1552 | | 1553 | | 1554 | | 1555 | | 1556 +-----+-----+-----+-----+-----+-----+-----+-----+ 1557 | PEER PREDECESSOR | 1558 | (32 byte) | 1559 | | 1560 | | 1561 +-----+-----+-----+-----+-----+-----+-----+-----+ 1562 | PEER SUCCESSOR | 1563 | (32 byte) | 1564 | | 1565 | | 1566 +-----+-----+-----+-----+-----+-----+-----+-----+ 1567 ]]></artwork> 1568 </figure> 1569 <dl> 1570 <dt>SIZE</dt> 1571 <dd> 1572 A 32-bit value containing the length of the signed data in bytes 1573 in network byte order. 1574 The length of the signed data <bcp14>MUST</bcp14> be 144 bytes. 1575 </dd> 1576 <dt>PURPOSE</dt> 1577 <dd> 1578 A 32-bit signature purpose flag. This field <bcp14>MUST</bcp14> be 6 (in network 1579 byte order). 1580 </dd> 1581 <dt>EXPIRATION</dt> 1582 <dd> 1583 denotes the absolute 64-bit expiration date of the block 1584 in microseconds since midnight (0 hour), January 1, 1970 1585 UTC in network byte order. 1586 </dd> 1587 <dt>BLOCK HASH</dt> 1588 <dd> 1589 a SHA-512 hash over the block payload. 1590 </dd> 1591 <dt>PEER PREDECESSOR</dt> 1592 <dd> 1593 the peer public key of the previous hop. If the signing peer initiated 1594 the PUT, this field is set to all zeroes. 1595 </dd> 1596 <dt>PEER SUCCESSOR</dt> 1597 <dd> 1598 the peer public key of the next hop (not of the signer). 1599 </dd> 1600 </dl> 1601 </section> 1602 </section> 1603 <section anchor="p2p_hello" numbered="true" toc="default"> 1604 <name>HelloMessage</name> 1605 <t> 1606 When the underlay signals the implementation of added or 1607 removed addresses through <tt>ADDRESS_ADDED</tt> and 1608 <tt>ADDRESS_DELETED</tt> an implementation 1609 <bcp14>MUST</bcp14> disseminate those changes to neighbors 1610 using <tt>HelloMessages</tt> (as already discussed in 1611 section <xref target="find_peer"/>). <tt>HelloMessages</tt> 1612 are used to inform neighbors of a peer about the sender's 1613 available addresses. The recipients use these messages to 1614 inform their respective underlays about ways to sustain the 1615 connections and to generate <tt>HELLO</tt> blocks (see <xref 1616 target="hello_block"/>) to answer peer discovery queries 1617 from other peers. 1618 </t> 1619 <section anchor="p2p_hello_wire"> 1620 <name>Wire Format</name> 1621 <figure anchor="figure_hellomsg" title="The HelloMessage Wire Format."> 1622 <artwork name="" type="" align="left" alt=""><![CDATA[ 1623 0 8 16 24 32 40 48 56 1624 +-----+-----+-----+-----+-----+-----+-----+-----+ 1625 | MSIZE | MTYPE | VERSION | NUM_ADDRS | 1626 +-----+-----+-----+-----+-----+-----+-----+-----+ 1627 | SIGNATURE / 1628 / (64 bytes) | 1629 +-----+-----+-----+-----+-----+-----+-----+-----+ 1630 | EXPIRATION | 1631 +-----+-----+-----+-----+-----+-----+-----+-----+ 1632 / ADDRESSES (variable length) / 1633 +-----+-----+-----+-----+-----+-----+-----+-----+ 1634 ]]></artwork> 1635 </figure> 1636 <t>where:</t> 1637 <dl> 1638 <dt>MSIZE</dt> 1639 <dd> 1640 denotes the size of this message in network byte order. 1641 </dd> 1642 <dt>MTYPE</dt> 1643 <dd> 1644 is the 16-bit message type. 1645 It must be set to 1646 the value 157 in network byte order as defined in the GANA "GNUnet Message Type" registry 1647 (see <xref target="gana_message_type"/>). 1648 </dd> 1649 <dt>VERSION</dt> 1650 <dd> 1651 is a 16-bit field that indicates the version of the <tt>HelloMessage</tt>. Must be zero. 1652 In the future, this may be used to extend or update the <tt>HelloMessage</tt> format. 1653 </dd> 1654 <dt>NUM_ADDRS</dt> 1655 <dd> 1656 is a 16-bit number in network byte order that gives the 1657 total number of addresses encoded in the 1658 <tt>ADDRESSES</tt> field. 1659 </dd> 1660 <dt>SIGNATURE</dt> 1661 <dd> 1662 is a 64 byte EdDSA signature <xref target="ed25519"/> using the sender's private 1663 key affirming the information contained in the message. 1664 The signature is signing exactly the same data that is being 1665 signed in a <tt>HELLO</tt> block as described in <xref target="hello_block"/>. 1666 </dd> 1667 <dt>EXPIRATION</dt> 1668 <dd> 1669 denotes the absolute 64-bit expiration date of the content. 1670 The value specified is microseconds since midnight (0 hour), 1671 January 1, 1970, but must be a multiple of one million 1672 (so that it can be represented in seconds in a <tt>HELLO</tt> URL). 1673 Stored in network byte order. 1674 </dd> 1675 <dt>ADDRESSES</dt> 1676 <dd> 1677 A sequence of exactly <tt>NUM_ADDRS</tt> 1678 addresses which can be used to contact the peer. 1679 Each address <bcp14>MUST</bcp14> be 0-terminated. 1680 If <tt>NUM_ADDRS = 0</tt> then this field is omitted (0 bytes). 1681 </dd> 1682 </dl> 1683 </section> 1684 <section anchor="p2p_hello_processing"> 1685 <name>Processing</name> 1686 <t> 1687 If the initiator of a <tt>HelloMessage</tt> is <tt>SELF</tt>, the message 1688 is simply sent to all neighbors <tt>P</tt> currently in the routing table 1689 using the <tt>SEND()</tt> function of the underlay as defined in 1690 <xref target="underlay"/>. 1691 </t> 1692 <t> 1693 Upon receiving a <tt>HelloMessage</tt> from a peer <tt>P</tt> 1694 an implementation <bcp14>MUST</bcp14> process it step by step as follows: 1695 </t> 1696 <ol> 1697 <li> 1698 If <tt>P</tt> is not in its routing table, the message is discarded. 1699 </li> 1700 <li> 1701 The signature is verified, including a check that the expiration time 1702 is in the future. If the signature is invalid, the message is discarded. 1703 </li> 1704 <li> 1705 The information contained in the <tt>HelloMessage</tt> 1706 is used to synthesize a block of type <tt>HELLO</tt> 1707 (<xref target="hello_block"/>). The block is cached in 1708 the routing table until it expires, or the peer is 1709 removed from the routing table, or the information is 1710 replaced by another message from the peer. The 1711 implementation <bcp14>SHOULD</bcp14> instruct the 1712 underlay to connect to all now available addresses using 1713 <tt>TRY_CONNECT()</tt> in order to make the underlay 1714 aware of alternative addresses for this connection and 1715 to maintain optimal connectivity. 1716 </li> 1717 <li> 1718 Received <tt>HelloMessages</tt> <bcp14>MUST NOT</bcp14> 1719 be forwarded. 1720 </li> 1721 </ol> 1722 </section> 1723 </section> 1724 <section anchor="p2p_put" numbered="true" toc="default"> 1725 <name>PutMessage</name> 1726 <t> 1727 <tt>PutMessages</tt> are used to store information at other 1728 peers in the DHT. Any application-facing API which allows 1729 applications to initiate <tt>PutMessages</tt> to store data 1730 in the DHT needs to receive sufficient, possibly 1731 implementation-specific information to construct the initial 1732 <tt>PutMessage</tt>. In general, application-facing APIs 1733 supporting multiple applications and block types need to be 1734 given the block type (<tt>BTYPE</tt>) and message 1735 <tt>FLAGS</tt> in addition to the actual <tt>BLOCK</tt> 1736 payload. The <tt>BLOCK_KEY</tt> could be provided 1737 explicitly, or in some cases might be derived using the 1738 <tt>DeriveBlockKey()</tt> function from the block type 1739 specific operations defined in <xref 1740 target="block_functions"/>. 1741 </t> 1742 <section anchor="p2p_put_wire"> 1743 <name>Wire Format</name> 1744 <figure anchor="figure_putmsg" title="The PutMessage Wire Format."> 1745 <artwork name="" type="" align="left" alt=""><![CDATA[ 1746 0 8 16 24 32 40 48 56 1747 +-----+-----+-----+-----+-----+-----+-----+-----+ 1748 | MSIZE | MTYPE | BTYPE | 1749 +-----+-----+-----+-----+-----+-----+-----+-----+ 1750 | VER |FLAGS| HOPCOUNT | REPL_LVL | PATH_LEN | 1751 +-----+-----+-----+-----+-----+-----+-----+-----+ 1752 | EXPIRATION | 1753 +-----+-----+-----+-----+-----+-----+-----+-----+ 1754 | PEER_BF / 1755 / (128 byte) | 1756 +-----+-----+-----+-----+-----+-----+-----+-----+ 1757 | BLOCK_KEY / 1758 / (64 byte) | 1759 +-----+-----+-----+-----+-----+-----+-----+-----+ 1760 / TRUNCATED ORIGIN (0 or 32 bytes) / 1761 +-----+-----+-----+-----+-----+-----+-----+-----+ 1762 / PUTPATH (variable length) / 1763 +-----+-----+-----+-----+-----+-----+-----+-----+ 1764 / LAST HOP SIGNATURE (0 or 64 bytes) / 1765 +-----+-----+-----+-----+-----+-----+-----+-----+ 1766 / BLOCK (variable length) / 1767 +-----+-----+-----+-----+-----+-----+-----+-----+ 1768 ]]></artwork> 1769 </figure> 1770 <t>where:</t> 1771 <dl> 1772 <dt>MSIZE</dt> 1773 <dd> 1774 denotes the size of this message in network byte order. 1775 </dd> 1776 <dt>MTYPE</dt> 1777 <dd> 1778 is the 16-bit message type. Read-only. It must be set 1779 to the value 146 in network byte order as defined in the 1780 GANA "GNUnet Message Type" registry <xref 1781 target="gana_message_type"/>. 1782 </dd> 1783 <dt>BTYPE</dt> 1784 <dd> 1785 is a 32-bit block type in network byte order. The block 1786 type indicates the content type of the payload. Set by 1787 the initiator. Read-only. 1788 </dd> 1789 <dt>VER</dt> 1790 <dd> 1791 is a 8-bit protocol version. 1792 Set to zero. May be used in future protocol versions. 1793 </dd> 1794 <dt>FLAGS</dt> 1795 <dd> 1796 is a 8-bit vector with binary options (see <xref target="route_flags"/>). 1797 Set by the initiator. Read-only. 1798 </dd> 1799 <dt>HOPCOUNT</dt> 1800 <dd> 1801 is a 16-bit number in network byte order indicating how 1802 many hops this message has traversed to far. Set by the 1803 initiator to zero. <bcp14>MUST</bcp14> be incremented by one 1804 by each peer before forwarding the request. 1805 </dd> 1806 <dt>REPL_LVL</dt> 1807 <dd> 1808 is a 16-bit number in network byte order indicating the 1809 desired replication level of the data. Set by the 1810 initiator. Read-only. 1811 </dd> 1812 <dt>PATH_LEN</dt> 1813 <dd> 1814 is a 16-bit number in network byte order indicating the 1815 number of path elements recorded in <tt>PUTPATH</tt>. As <tt>PUTPATH</tt> 1816 is optional, this value may be zero. If the <tt>PUTPATH</tt> is 1817 enabled, set initially to zero by the initiator. Updated 1818 by processing peers to match the path length in the 1819 message. 1820 </dd> 1821 <dt>EXPIRATION</dt> 1822 <dd> 1823 denotes the absolute 64-bit expiration date of the 1824 content in microseconds since midnight (0 hour), January 1825 1, 1970 in network byte order. Set by the 1826 initiator. Read-only. 1827 </dd> 1828 <dt>PEER_BF</dt> 1829 <dd> 1830 A peer Bloom filter to stop circular routes (see <xref 1831 target="routing_bloomfilter"/>). Set by the initiator 1832 to contain the local peer and all neighbors it is 1833 forwarded to. Modified by processing peers to include 1834 their own peer public key using <tt>BF-SET()</tt>. 1835 </dd> 1836 <dt>BLOCK_KEY</dt> 1837 <dd> 1838 The key under which the <tt>PutMessage</tt> wants to store content 1839 under. 1840 Set by the initiator. Read-only. 1841 </dd> 1842 <dt>TRUNCATED ORIGIN</dt> 1843 <dd> 1844 is only provided if the <tt>Truncated</tt> flag is set 1845 in <tt>FLAGS</tt>. If present, this is the public key of 1846 the peer just before the first entry on the 1847 <tt>PUTPATH</tt> and the first peer on the 1848 <tt>PUTPATH</tt> is not the actual origin of the 1849 message. Thus, to verify the first signature on the 1850 <tt>PUTPATH</tt>, this public key must be used. Note 1851 that due to the truncation, this last hop cannot be 1852 verified to exist. Value is modified by processing 1853 peers. 1854 </dd> 1855 <dt>PUTPATH</dt> 1856 <dd> 1857 the variable-length <tt>PUT</tt> path. 1858 The path consists of a list of <tt>PATH_LEN</tt> path elements. 1859 Set by the initiator to zero. 1860 Incremented by processing peers. 1861 </dd> 1862 <dt>LAST HOP SIGNATURE</dt> 1863 <dd> 1864 is only provided if the <tt>RecordRoute</tt> flag 1865 is set in <tt>FLAGS</tt>. If present, this is 1866 an EdDSA signature <xref target="ed25519"/> by the sender of this message 1867 (using the same format as the signatures in PUTPATH) 1868 affirming that the sender forwarded the message from 1869 the predecessor (all zeros if <tt>PATH_LEN</tt> is zero, 1870 otherwise the last peer in <tt>PUTPATH</tt>) to 1871 the target peer. 1872 Modified by processing peers (if flag is set). 1873 </dd> 1874 <dt>BLOCK</dt> 1875 <dd> 1876 the variable-length block payload. The contents are 1877 determined by the <tt>BTYPE</tt> field. The length is 1878 determined by <tt>MSIZE</tt> minus the size of all of 1879 the other fields. Set by the initiator. Read-only. 1880 </dd> 1881 </dl> 1882 </section> 1883 <section anchor="p2p_put_processing"> 1884 <name>Processing</name> 1885 <t> 1886 Upon receiving a <tt>PutMessage</tt> from a peer <tt>P</tt>, 1887 or created through initiation by an overlay API, 1888 an implementation <bcp14>MUST</bcp14> process it step by step as follows: 1889 </t> 1890 <ol> 1891 <li> 1892 The <tt>EXPIRATION</tt> field is evaluated. 1893 If the message is expired, it <bcp14>MUST</bcp14> be discarded. 1894 </li> 1895 <li> 1896 If the <tt>BTYPE</tt> is <tt>ANY</tt>, then the message 1897 <bcp14>MUST</bcp14> be discarded. If the <tt>BTYPE</tt> 1898 is not supported by the implementation, no validation of 1899 the block payload is performed and processing continues 1900 at (5). Else, the block <bcp14>MUST</bcp14> be 1901 validated as defined in (3) and (4). 1902 </li> 1903 <li> 1904 The message is evaluated using the block validation 1905 functions matching the <tt>BTYPE</tt>. First, the client 1906 attempts to derive the key using the respective 1907 <tt>DeriveBlockKey</tt> procedure as described in <xref 1908 target="block_functions"/>. If a key can be derived and 1909 does not match, the message <bcp14>MUST</bcp14> be 1910 discarded. 1911 </li> 1912 <li> 1913 Next, the <tt>ValidateBlockStoreRequest</tt> procedure 1914 for the <tt>BTYPE</tt> as described in <xref 1915 target="block_functions"/> is used to validate the block 1916 payload. If the block payload is invalid, the message 1917 <bcp14>MUST</bcp14> be discarded. 1918 </li> 1919 <li> 1920 The peer identity of the sender peer <tt>P</tt> 1921 <bcp14>SHOULD</bcp14> be in <tt>PEER_BF</tt>. If not, 1922 the implementation <bcp14>MAY</bcp14> log an error, but 1923 <bcp14>MUST</bcp14> continue. 1924 </li> 1925 <li> 1926 If the <tt>RecordRoute</tt> flag is not set, the 1927 <tt>PATH_LEN</tt> <bcp14>MUST</bcp14> be set to zero. 1928 If the flag is set and <tt>PATH_LEN</tt> is non-zero, 1929 the local peer <bcp14>SHOULD</bcp14> verify the 1930 signatures from the <tt>PUTPATH</tt>. Verification 1931 <bcp14>MAY</bcp14> involve checking all signatures or 1932 any random subset of the signatures. It is 1933 <bcp14>RECOMMENDED</bcp14> that peers adapt their 1934 behavior to available computational resources so as to 1935 not make signature verification a bottleneck. If an 1936 invalid signature is found, the <tt>PUTPATH</tt> 1937 <bcp14>MUST</bcp14> be truncated to only include the 1938 elements following the invalid signature. 1939 </li> 1940 <li> 1941 If the local peer is the closest peer 1942 (cf. <tt>IsClosestPeer(SELF, BLOCK_KEY, 1943 PeerFilter)</tt>) or the <tt>DemultiplexEverywhere</tt> 1944 flag ist set, the message <bcp14>SHOULD</bcp14> be 1945 stored locally in the block storage if possible. The 1946 implementation <tt>MAY</tt> choose not store the block 1947 if external factors or configurations prevent this, such 1948 as limited (allotted) disk space. 1949 </li> 1950 <li> 1951 If the <tt>BTYPE</tt> of the message indicates a 1952 <tt>HELLO</tt> block, the peer <bcp14>MUST</bcp14> be 1953 considered for the local routing table by using the peer 1954 identity in <tt>BLOCK_KEY</tt>. If neither the peer is 1955 already connected nor the respective k-bucket is already 1956 full, then the peer <bcp14>MUST</bcp14> try to establish 1957 a connection to the peer indicated in the <tt>HELLO</tt> 1958 block using the address information from the 1959 <tt>HELLO</tt> block and the underlay's 1960 <tt>TRY_CONNECT()</tt> function. The implementation 1961 <bcp14>MUST</bcp14> instruct the underlay to try to 1962 connect to all provided addresses using 1963 <tt>TRY_CONNECT()</tt> in order to make the underlay 1964 aware of multiple addresses for this connection. When a 1965 connection can be established, the underlay's 1966 <tt>PEER_CONNECTED</tt> signal will cause the peer to be 1967 added to the respective k-bucket of the routing table 1968 (<xref target="routing"/>). 1969 </li> 1970 <li> 1971 Given the value in <tt>REPL_LVL</tt>, <tt>HOPCOUNT</tt> 1972 and <tt>FALSE = IsClosestPeer(SELF, BLOCK_KEY, 1973 PeerFilter)</tt> the number of peers to forward to 1974 <bcp14>MUST</bcp14> be calculated using 1975 <tt>ComputeOutDegree()</tt>. The implementation 1976 <bcp14>SHOULD</bcp14> select this number of peers 1977 to forward the message to using the function 1978 <tt>SelectPeer()</tt> (<xref 1979 target="routing_functions"/>) using the 1980 <tt>BLOCK_KEY</tt>, <tt>HOPCOUNT</tt>, and utilizing 1981 <tt>PEER_BF</tt> as peer Bloom filter. For each 1982 selected peer <tt>PEER_BF</tt> is updated with that peer 1983 in between calls to <tt>SelectPeer()</tt>. The 1984 implementation <bcp14>MAY</bcp14> forward to fewer or no 1985 peers in order to handle resource constraints such as 1986 limited bandwidth or simply because there are not enough 1987 suitable connected peers. For each selected peer with 1988 peer identity <tt>P</tt> a dedicated 1989 <tt>PutMessage_P</tt> is created containing the original 1990 (and where applicable already updated) fields of the 1991 received <tt>PutMessage</tt>. In each message 1992 <em>all</em> selected peer identities and the local peer 1993 identity <bcp14>MUST</bcp14> be added to the 1994 <tt>PEER_BF</tt> and the <tt>HOPCOUNT</tt> 1995 <bcp14>MUST</bcp14> be incremented by one. If the 1996 <tt>RecordRoute</tt> flag is set, a new path element is 1997 created using the predecessor peer public key and the 1998 signature of the current peer. The path element is 1999 added to the <tt>PUTPATH</tt> fields and the 2000 <tt>PATH_LEN</tt> field is incremented by one. When 2001 creating the path element signature, the successor must 2002 be set to the recipient peer <tt>P</tt> of the 2003 <tt>PutMessage_P</tt>. The successor in the new path 2004 element is the recipient peer <tt>P</tt>. If the path 2005 becomes too long for the resulting message to be 2006 transmitted by the underlay, it <bcp14>MUST</bcp14> be 2007 truncated. Finally, the messages are sent using 2008 <tt>SEND(P, PutMessage_P)</tt> to each recipient. 2009 </li> 2010 </ol> 2011 </section> 2012 </section> 2013 <section anchor="p2p_get" numbered="true" toc="default"> 2014 <name>GetMessage</name> 2015 <t> 2016 <tt>GetMessages</tt> are used to request information from 2017 other peers in the DHT. An application-level API which 2018 allows applications to initiate <tt>GetMessages</tt> needs 2019 to provide sufficient, implementation-specific information 2020 needed to construct the initial <tt>GetMessage</tt>. For 2021 example, implementations supporting multiple applications 2022 and blocks will need to be given the block type, message 2023 <tt>FLAG</tt> parameters and possibly an <tt>XQUERY</tt> in 2024 addition to just the <tt>QUERY_HASH</tt>. In some cases, it 2025 might also be useful to enable the application to assist in 2026 the construction of the <tt>RESULT_FILTER</tt> such that it 2027 can filter already known results. Note that the 2028 <tt>RESULT_FILTER</tt> may need to be re-constructed every 2029 time the query is retransmitted by the initiator (details 2030 depending on the <tt>BTYPE</tt>) and thus a 2031 <tt>RESULT_FILTER</tt> can often not be passed directly as 2032 an argument by the application to an application API. 2033 Instead, applications would typically provide the set of 2034 results to be filtered, allowing the DHT to construct the 2035 <tt>RESULT_FILTER</tt> whenever it retransmits a 2036 <tt>GetMessage</tt> request as initiator. 2037 </t> 2038 <section anchor="p2p_get_wire"> 2039 <name>Wire Format</name> 2040 <figure anchor="figure_getmsg" title="The GetMessage Wire Format."> 2041 <artwork name="" type="" align="left" alt=""><![CDATA[ 2042 0 8 16 24 32 40 48 56 2043 +-----+-----+-----+-----+-----+-----+-----+-----+ 2044 | MSIZE | MTYPE | BTYPE | 2045 +-----+-----+-----+-----+-----+-----+-----+-----+ 2046 | VER |FLAGS| HOPCOUNT | REPL_LVL | RF_SIZE | 2047 +-----+-----+-----+-----+-----+-----+-----+-----+ 2048 | PEER_BF / 2049 / (128 byte) | 2050 +-----+-----+-----+-----+-----+-----+-----+-----+ 2051 | QUERY_HASH / 2052 / (64 byte) | 2053 +-----+-----+-----+-----+-----+-----+-----+-----+ 2054 | RESULT_FILTER / 2055 / (variable length) / 2056 +-----+-----+-----+-----+-----+-----+-----+-----+ 2057 / XQUERY (variable length) / 2058 +-----+-----+-----+-----+-----+-----+-----+-----+ 2059 ]]></artwork> 2060 </figure> 2061 <t>where:</t> 2062 <dl> 2063 <dt>MSIZE</dt> 2064 <dd> 2065 denotes the size of this message in network byte order. 2066 </dd> 2067 <dt>MTYPE</dt> 2068 <dd> 2069 is the 16-bit message type. Read-only. It must be set 2070 to the value 147 in network byte order as defined in the 2071 GANA "GNUnet Message Type" registry <xref 2072 target="gana_message_type"/>. 2073 </dd> 2074 <dt>BTYPE</dt> 2075 <dd> 2076 is a 32-bit block type field in network byte order. The 2077 block type indicates the content type of the 2078 payload. Set by the initiator. Read-only. 2079 </dd> 2080 <dt>VER</dt> 2081 <dd> 2082 is a 8-bit protocol version. 2083 Set to zero. May be used in future protocol versions. 2084 </dd> 2085 <dt>FLAGS</dt> 2086 <dd> 2087 is a 8-bit vector with binary options (see <xref target="route_flags"/>). 2088 Set by the initiator. Read-only. 2089 </dd> 2090 <dt>HOPCOUNT</dt> 2091 <dd> 2092 is a 16-bit number in network byte order indicating how 2093 many hops this message has traversed to far. Set by the 2094 initiator to zero. Incremented by each peer by one 2095 per hop. 2096 </dd> 2097 <dt>REPL_LVL</dt> 2098 <dd> 2099 is a 16-bit number in network byte order indicating the 2100 desired replication level of the data. Set by the 2101 initiator. Read-only. 2102 </dd> 2103 <dt>RF_SIZE</dt> 2104 <dd> 2105 is a 16-bit number in network byte order indicating the 2106 length of the <tt>RESULT_FILTER</tt>. Set by the 2107 initiator. Read-only. 2108 </dd> 2109 <dt>PEER_BF</dt> 2110 <dd> 2111 A peer Bloom filter to stop circular routes (see <xref target="routing_bloomfilter"/>). 2112 Set by the initiator to include itself and all connected neighbors in the routing table. 2113 Modified by processing peers to include their own peer identity. 2114 </dd> 2115 <dt>QUERY_HASH</dt> 2116 <dd> 2117 The query used to indicate what the key is under which 2118 the initiator is looking for blocks with this request. 2119 The block type may use a different evaluation logic to 2120 determine applicable result blocks. Set by the 2121 initiator. Read-only. 2122 </dd> 2123 <dt>RESULT_FILTER</dt> 2124 <dd> 2125 the variable-length result filter with <tt>RF_SIZE</tt> 2126 bytes as described in <xref target="result_filter"/>. 2127 Set by the initiator. Modified by processing peers 2128 based on results returned. 2129 </dd> 2130 <dt>XQUERY</dt> 2131 <dd> 2132 the variable-length extended query. Optional. 2133 Set by the initiator. Read-only. The length is 2134 determined by subtracting the length of all 2135 other fields from <tt>MSIZE</tt>. 2136 </dd> 2137 </dl> 2138 </section> 2139 <section anchor="result_filter"> 2140 <name>Result Filter</name> 2141 <t> 2142 A result filter is used to indicate to other peers which 2143 results are not of interest when processing a 2144 <tt>GetMessage</tt> (<xref target="p2p_get"/>). Any peer 2145 which is processing <tt>GetMessages</tt> and has a result 2146 which matches the query key <bcp14>MUST</bcp14> check the 2147 result filter and only send a reply message if the result 2148 does not test positive under the result filter. Before 2149 forwarding the <tt>GetMessage</tt>, the result filter 2150 <bcp14>MUST</bcp14> be updated using the result of the 2151 <tt>BTYPE</tt>-specific <tt>FilterResult</tt> (see <xref 2152 target="block_functions"/>) function to filter out all 2153 results already returned by the local peer. 2154 </t> 2155 <t> 2156 How a result filter is implemented depends on the block type 2157 as described in <xref target="block_functions"/>. 2158 Result filters may be probabilistic data structures. Thus, 2159 it is possible that a desireable result is filtered by a result 2160 filter because of a false-positive test. 2161 </t> 2162 <t> 2163 How exactly a block result is added to a result filter is 2164 specified as part of the definition of a block type 2165 (cf. <xref target="hello_block"/>). 2166 </t> 2167 </section> 2168 <section anchor="p2p_get_processing"> 2169 <name>Processing</name> 2170 <t> 2171 Upon receiving a <tt>GetMessage</tt> from a peer <tt>P</tt>, or 2172 created through initiation by the overlay API, an 2173 implementation <bcp14>MUST</bcp14> process it step by step as follows: 2174 </t> 2175 <ol> 2176 <li> 2177 If the <tt>BTYPE</tt> is supported, the 2178 <tt>QUERY_HASH</tt> and <tt>XQUERY</tt> fields are 2179 validated as defined by the respective 2180 <tt>ValidateBlockQuery()</tt> procedure for this type. If 2181 the result yields <tt>REQUEST_INVALID</tt>, the message 2182 <bcp14>MUST</bcp14> be discarded and processing ends. 2183 If the <tt>BTYPE</tt> is not supported, the message 2184 <bcp14>MUST</bcp14> be forwarded (Skip to step 4). If 2185 the <tt>BTYPE</tt> is <tt>ANY</tt>, the message is 2186 processed further without validation. 2187 </li> 2188 <li> 2189 The peer identity of the sender peer <tt>P</tt> 2190 <bcp14>SHOULD</bcp14> be in the <tt>PEER_BF</tt> peer 2191 Bloom filter. If not, the implementation 2192 <bcp14>MAY</bcp14> log an error, but <bcp14>MUST</bcp14> 2193 continue. 2194 </li> 2195 <li> 2196 <t> 2197 The local peer <bcp14>SHOULD</bcp14> try to produce a 2198 reply in any of the following cases: (1) If the local 2199 peer is the closest peer (cf. <tt>IsClosestPeer(SELF, 2200 QueryHash, PeerFilter)</tt>, or (2) if the 2201 <tt>DemultiplexEverywhere</tt> flag is set, or (3) if 2202 the local peer is not the closest and a previously 2203 cached <tt>ResultMessage</tt> also matches this 2204 request (<xref target="p2p_result_processing"/>). 2205 </t> 2206 <t> 2207 The reply is produced (if one is available) using the following 2208 steps: 2209 </t> 2210 <ol type="%c)"> 2211 <li> 2212 If the <tt>BTYPE</tt> is <tt>HELLO</tt>, the 2213 implementation <bcp14>MUST</bcp14> only consider 2214 synthesizing its own addresses and the addresses it 2215 has cached for the peers in its routing table as 2216 <tt>HELLO</tt> block replies. Otherwise, if the 2217 <tt>BTYPE</tt> does not indicate a request for a 2218 <tt>HELLO</tt> block or <tt>ANY</tt>, the 2219 implementation <bcp14>MUST</bcp14> only consider 2220 blocks in the local block storage and previously 2221 cached <tt>ResultMessages</tt>. 2222 </li> 2223 <li> 2224 If the <tt>FLAGS</tt> field includes the flag 2225 <tt>FindApproximate</tt>, the peer 2226 <bcp14>SHOULD</bcp14> respond with the closest block 2227 (smallest value of <tt>GetDistance(QUERY_HASH, 2228 BLOCK_KEY)</tt>) it can find that is not filtered by 2229 the <tt>RESULT_BF</tt>. Otherwise, the peer 2230 <bcp14>MUST</bcp14> respond with the block with a 2231 <tt>BLOCK_KEY</tt> that matches the 2232 <tt>QUERY_HASH</tt> exactly and that is not filtered 2233 by the <tt>RESULT_BF</tt>. 2234 </li> 2235 <li> 2236 Any resulting (synthesized) block is encapsulated in 2237 a <tt>ResultMessage</tt>. The 2238 <tt>ResultMessage</tt> <bcp14>SHOULD</bcp14> be 2239 transmitted to the neighbor from which the request 2240 was received. 2241 </li> 2242 </ol> 2243 <t> 2244 Implementations <bcp14>MAY</bcp14> not reply if they 2245 are resource-constrained. However, 2246 <tt>ResultMessages</tt> <bcp14>MUST</bcp14> be given 2247 the highest priority among competing transmissions. 2248 </t> 2249 <t> 2250 If the <tt>BTYPE</tt> is supported and 2251 <tt>ValidateBlockReply</tt> for the given query has 2252 yielded a status of <tt>FILTER_LAST</tt>, processing 2253 <bcp14>MUST</bcp14> end and not continue with 2254 forwarding of the request to other peers. 2255 </t> 2256 </li> 2257 <li> 2258 The implementation <tt>MUST</tt> create (or merge) an 2259 entry in the pending table (see <xref 2260 target="pending_table"/>) for the query represented by 2261 this <tt>GetMessage</tt>. The pending table 2262 <bcp14>MUST</bcp14> store the last <tt>MAX_RECENT</tt> 2263 requests, and peers thus <bcp14>MUST</bcp14> discard the 2264 oldest existing request if memory constraints on the 2265 pending table are encountered. Note that peers 2266 <bcp14>MUST</bcp14> clean up state for queries that had 2267 response with a status of <tt>FILTER_LAST</tt> even if 2268 they are not the oldest query in the pending table. 2269 </li> 2270 <li> 2271 Using the value in <tt>REPL_LVL</tt>, the number of 2272 peers to forward to <bcp14>MUST</bcp14> be calculated 2273 using <tt>ComputeOutDegree()</tt>. If there is at least 2274 one peer to forward to, the implementation 2275 <bcp14>SHOULD</bcp14> select up to this number of peers 2276 to forward the message to. Furthermore, the 2277 implementation <bcp14>SHOULD</bcp14> select up to this 2278 number of peers to using the function 2279 <tt>SelectPeer()</tt> (from <xref 2280 target="routing_functions"/>) using the 2281 <tt>QUERY_HASH</tt>, <tt>HOPCOUNT</tt>, and the 2282 <tt>PEER_BF</tt>. The implementation <bcp14>MAY</bcp14> 2283 forward to fewer or no peers in order to handle resource 2284 constraints such as bandwidth. Before forwarding, the 2285 peer Bloom filter <tt>PEER_BF</tt> <bcp14>MUST</bcp14> 2286 be updated to filter all selected peers and the local 2287 peer identity <tt>SELF</tt>. For all peers with peer 2288 identity <tt>P</tt> chosen to forward the message to, 2289 <tt>SEND(P, GetMessage_P)</tt> is called. Here, 2290 <tt>GetMessage_P</tt> is the original message with the 2291 updated fields for <tt>HOPCOUNT</tt> (incremented by 2292 one), updated <tt>PEER_BF</tt> and updated 2293 <tt>RESULT_FILTER</tt> (based on results already 2294 returned). 2295 </li> 2296 </ol> 2297 </section> 2298 </section> 2299 <section anchor="p2p_result" numbered="true" toc="default"> 2300 <name>ResultMessage</name> 2301 <t> 2302 <tt>ResultMessages</tt> are used to return information to 2303 other peers in the DHT or to applications using the overlay 2304 API that previously initiated a <tt>GetMessage</tt>. The 2305 initiator of a <tt>ResultMessage</tt> is a peer triggered 2306 through the processing of a <tt>GetMessage</tt>. 2307 </t> 2308 <section anchor="p2p_result_wire"> 2309 <name>Wire Format</name> 2310 <figure anchor="figure_resmsg" title="The ResultMessage Wire Format"> 2311 <artwork name="" type="" align="left" alt=""><![CDATA[ 2312 0 8 16 24 32 40 48 56 2313 +-----+-----+-----+-----+-----+-----+-----+-----+ 2314 | MSIZE | MTYPE | BTYPE | 2315 +-----+-----+-----+-----+-----+-----+-----+-----+ 2316 | RESERVED | VER |FLAGS| PUTPATH_L | GETPATH_L | 2317 +-----+-----+-----+-----+-----+-----+-----+-----+ 2318 | EXPIRATION | 2319 +-----+-----+-----+-----+-----+-----+-----+-----+ 2320 | QUERY_HASH / 2321 / (64 byte) | 2322 +-----+-----+-----+-----+-----+-----+-----+-----+ 2323 / TRUNCATED ORIGIN (0 or 32 bytes) / 2324 +-----+-----+-----+-----+-----+-----+-----+-----+ 2325 / PUTPATH / 2326 / (variable length) / 2327 +-----+-----+-----+-----+-----+-----+-----+-----+ 2328 / GETPATH / 2329 / (variable length) / 2330 +-----+-----+-----+-----+-----+-----+-----+-----+ 2331 / LAST HOP SIGNATURE (0 or 64 bytes) / 2332 +-----+-----+-----+-----+-----+-----+-----+-----+ 2333 / BLOCK / 2334 / (variable length) / 2335 +-----+-----+-----+-----+-----+-----+-----+-----+ 2336 ]]></artwork> 2337 </figure> 2338 <t>where:</t> 2339 <dl> 2340 <dt>MSIZE</dt> 2341 <dd> 2342 denotes the size of this message in network byte order. 2343 </dd> 2344 <dt>MTYPE</dt> 2345 <dd> 2346 is the 16-bit message type. Set by the 2347 initiator. Read-only. It must be set to the value 148 2348 in network byte order as defined in the GANA "GNUnet 2349 Message Type" registry (see <xref 2350 target="gana_message_type"/>). 2351 </dd> 2352 <dt>BTYPE</dt> 2353 <dd> 2354 is a 32-bit block type field in network byte order. The 2355 block type indicates the content type of the payload. 2356 Set by the initiator. Read-only. 2357 </dd> 2358 <dt>RESERVED</dt> 2359 <dd> 2360 is a 16-bit value. Implementations <bcp14>MUST</bcp14> 2361 set this value to zero when originating a result message. 2362 Implementations <bcp14>MUST</bcp14> forward 2363 this value unchanged even if it is non-zero. 2364 </dd> 2365 <dt>VER</dt> 2366 <dd> 2367 is a 8-bit protocol version in network byte order. 2368 Set to zero. May be used in future protocol versions. 2369 </dd> 2370 <dt>FLAGS</dt> 2371 <dd> 2372 is a 8-bit vector with binary options (see <xref 2373 target="route_flags"/>). Set by the initiator of the 2374 response based on the flags retained from the original 2375 <tt>PutMessage</tt>, possibly setting the 2376 <tt>Truncated</tt> bit if the initiator is forced to 2377 truncate the path. For <tt>HELLO</tt> blocks, the 2378 <tt>FLAGS</tt> should simply be cleared. 2379 </dd> 2380 <dt>PUTPATH_L</dt> 2381 <dd> 2382 is a 16-bit number in network byte order indicating the 2383 number of path elements recorded in <tt>PUTPATH</tt>. As 2384 <tt>PUTPATH</tt> is optional, this value may be zero 2385 even if the message has traversed several peers. Set by 2386 the initiator to the <tt>PATH_LEN</tt> of the 2387 <tt>PutMessage</tt> from which the block originated. 2388 Modified by processing peers in case of path truncation. 2389 </dd> 2390 <dt>GETPATH_L</dt> 2391 <dd> 2392 is a 16-bit number in network byte order indicating the 2393 number of path elements recorded in <tt>GETPATH</tt>. As 2394 <tt>GETPATH</tt> is optional, this value may be zero 2395 even if the message has traversed several peers. 2396 <bcp14>MUST</bcp14> be set to zero by the initiator. 2397 Modified by processing peers to match the path length in 2398 the message. 2399 </dd> 2400 <dt>EXPIRATION</dt> 2401 <dd> 2402 denotes the absolute 64-bit expiration date of the 2403 content in microseconds since midnight (0 hour), January 2404 1, 1970 in network byte order. Set by the initiator to 2405 the expiration value as recorded from the 2406 <tt>PutMessage</tt> from which the block originated. 2407 Read-only. 2408 </dd> 2409 <dt>QUERY_HASH</dt> 2410 <dd> 2411 the query hash corresponding to the <tt>GetMessage</tt> 2412 which caused this reply message to be sent. Set by the 2413 initiator using the value of the <tt>GetMessage</tt>. 2414 Read-only. 2415 </dd> 2416 <dt>TRUNCATED ORIGIN</dt> 2417 <dd> 2418 is only provided if the <tt>Truncated</tt> flag is set 2419 in <tt>FLAGS</tt>. If present, this is the public key of 2420 the peer just before the first entry on the 2421 <tt>PUTPATH</tt> and the first peer on the 2422 <tt>PUTPATH</tt> is not the actual origin of the 2423 message. Thus, to verify the first signature on the 2424 <tt>PUTPATH</tt>, this public key must be used. Note 2425 that due to the truncation, this last hop cannot be 2426 verified to exist. Set by processing peers. 2427 </dd> 2428 <dt>PUTPATH</dt> 2429 <dd> 2430 the variable-length PUT path. The path consists of a 2431 list of <tt>PUTPATH_L</tt> path elements. Set by the 2432 initiator to the the <tt>PUTPATH</tt> of the 2433 <tt>PutMessage</tt> from which the block originated. 2434 Modified by processing peers in case of path truncation. 2435 </dd> 2436 <dt>GETPATH</dt> 2437 <dd> 2438 the variable-length PUT path. The path consists of a 2439 list of <tt>GETPATH_L</tt> path elements. Set by 2440 processing peers. 2441 </dd> 2442 <dt>LAST HOP SIGNATURE</dt> 2443 <dd> 2444 is only provided if the <tt>RecordRoute</tt> flag is set 2445 in <tt>FLAGS</tt>. If present, this is an EdDSA 2446 signature <xref target="ed25519"/> by the sender of this 2447 message (using the same format as the signatures in 2448 <tt>PUTPATH</tt>) affirming that the sender forwarded 2449 the message from the predecessor (all zeros if 2450 <tt>PATH_LEN</tt> is zero, otherwise the last peer in 2451 <tt>PUTPATH</tt>) to the target peer. 2452 </dd> 2453 <dt>BLOCK</dt> 2454 <dd> 2455 the variable-length resource record data payload. The 2456 contents are defined by the respective type of the 2457 resource record. Set by the initiator. Read-only. 2458 </dd> 2459 </dl> 2460 </section> 2461 <section anchor="p2p_result_processing"> 2462 <name>Processing</name> 2463 <t> 2464 Upon receiving a <tt>ResultMessage</tt> from a connected 2465 peer or triggered by the processing of a 2466 <tt>GetMessage</tt>, an implementation <bcp14>MUST</bcp14> 2467 process it step by step as follows: 2468 </t> 2469 <ol> 2470 <li> 2471 First, the <tt>EXPIRATION</tt> field is evaluated. If 2472 the message is expired, it <bcp14>MUST</bcp14> be 2473 discarded. 2474 </li> 2475 <li> 2476 If the <tt>BTYPE</tt> is supported, then the 2477 <tt>BLOCK</tt> <bcp14>MUST</bcp14> be validated against 2478 the requested <tt>BTYPE</tt>. To do this, the peer 2479 checks that the block is valid using 2480 <tt>ValidateBlockStoreRequest</tt>. If the result is 2481 <tt>BLOCK_INVALID</tt>, the message <bcp14>MUST</bcp14> 2482 be discarded. 2483 </li> 2484 <li> 2485 If the <tt>PUTPATH_L</tt> or the <tt>GETPATH_L</tt> are 2486 non-zero, the local peer <bcp14>SHOULD</bcp14> verify 2487 the signatures from the <tt>PUTPATH</tt> and the 2488 <tt>GETPATH</tt>. Verification <bcp14>MAY</bcp14> 2489 involve checking all signatures or any random subset of 2490 the signatures. It is <bcp14>RECOMMENDED</bcp14> that 2491 peers adapt their behavior to available computational 2492 resources so as to not make signature verification a 2493 bottleneck. If an invalid signature is found, the path 2494 <bcp14>MUST</bcp14> be truncated to only include the 2495 elements following the invalid signature. In 2496 particular, any invalid signature on the 2497 <tt>GETPATH</tt> will cause <tt>PUTPATH_L</tt> to be set 2498 to zero. 2499 </li> 2500 <li> 2501 The peer also attempts to compute the key using 2502 <tt>DeriveBlockKey</tt>. This may result in 2503 <tt>NONE</tt>. The result is used later. Note that 2504 even if a key was computed, it does not have to match 2505 the <tt>QUERY_HASH</tt>. 2506 </li> 2507 <li> 2508 If the <tt>BTYPE</tt> of the message indicates a 2509 <tt>HELLO</tt> block, the peer <bcp14>SHOULD</bcp14> be 2510 considered for the local routing table by using the peer 2511 identity computed from the block using 2512 <tt>DeriveBlockKey</tt>. An implementation 2513 <bcp14>MAY</bcp14> choose to ignore the <tt>HELLO</tt>, 2514 for example because the routing table or the respective 2515 k-bucket is already full. If the peer is a suitable 2516 candidate for insertion, the local peer 2517 <bcp14>MUST</bcp14> try to establish a connection to the 2518 peer indicated in the <tt>HELLO</tt> block using the 2519 address information from the <tt>HELLO</tt> block and 2520 the underlay's <tt>TRY_CONNECT()</tt> function. The 2521 implementation <bcp14>MUST</bcp14> instruct the underlay 2522 to connect to all provided addresses using 2523 <tt>TRY_CONNECT()</tt> in order to make the underlay 2524 aware of multiple addresses for this connection. When a 2525 connection is established, the signal 2526 <tt>PEER_CONNECTED</tt> will cause the peer to be added 2527 to the respective k-bucket of the routing table (see 2528 <xref target="routing"/>). 2529 </li> 2530 <li> 2531 If the <tt>QUERY_HASH</tt> of this 2532 <tt>ResultMessage</tt> does not match an entry in the 2533 pending table (<xref target="pending_table"/>), then the 2534 message is discarded and processing ends. Otherwise, 2535 processing continues for each entry in the table as 2536 follows. 2537 </li> 2538 <li> 2539 <ol type="%c)"> 2540 <li> 2541 If the <tt>FindApproximate</tt> flag was not set in 2542 the query and the <tt>BTYPE</tt> enabled the 2543 implementation to compute the key from the block, 2544 the computed key must exactly match the 2545 <tt>QUERY_HASH</tt>, otherwise the result does not 2546 match the pending query and processing continues 2547 with the next pending table entry. 2548 </li> 2549 <li> 2550 If the <tt>BTYPE</tt> is supported, result block 2551 <bcp14>MUST</bcp14> be validated against the 2552 specific query using the respective 2553 <tt>FilterBlockResult</tt> function. This function 2554 <bcp14>MUST</bcp14> update the result filter if a 2555 result is returned to the originator of the query. 2556 </li> 2557 <li> 2558 If the <tt>BTYPE</tt> is not supported, filtering of 2559 exact duplicate replies <bcp14>MUST</bcp14> still be 2560 performed before forwarding the reply. Such 2561 duplicate filtering <bcp14>MAY</bcp14> be 2562 implemented probabilistically, for example using a 2563 Bloom filter. The result of this duplicate 2564 filtering is always either <tt>FILTER_MORE</tt> or 2565 <tt>FILTER_DUPLICATE</tt>. 2566 </li> 2567 <li> 2568 If the <tt>RecordRoute</tt> flag is set in 2569 <tt>FLAGS</tt>, the local peer identity 2570 <bcp14>MUST</bcp14> be appended to the 2571 <tt>GETPATH</tt> of the message and the respective 2572 signature <bcp14>MUST</bcp14> be set using the query 2573 origin as the <tt>PEER SUCCESSOR</tt> and the 2574 response origin as the <tt>PEER PREDECESSOR</tt>. 2575 If the flag is not set, the <tt>GETPATH_L</tt> and 2576 <tt>PUTPATH_L</tt> <bcp14>MUST</bcp14> be set to 2577 zero when forwarding the result. 2578 </li> 2579 <li> 2580 If the result filter result is either 2581 <tt>FILTER_MORE</tt> or <tt>FILTER_LAST</tt>, the 2582 message is forwarded to the origin of the query as 2583 defined in the entry which may either be the local 2584 peer or a remote peer. In case this is a query of 2585 the local peer the result may have to be provided to 2586 applications through the overlay API. Otherwise, 2587 the result is forwarded using <tt>SEND(P, 2588 ResultMessage')</tt> where <tt>ResultMessage'</tt> 2589 is the now modified message. If the result was 2590 <tt>FILTER_LAST</tt>, the query <bcp14>MUST</bcp14> 2591 be removed from the pending table. 2592 </li> 2593 </ol> 2594 </li> 2595 <li> 2596 Finally, the implementation <bcp14>SHOULD</bcp14> cache 2597 <tt>ResultMessages</tt> in order to provide already seen 2598 replies to future <tt>GetMessages</tt>. The 2599 implementation <bcp14>MAY</bcp14> choose not no cache 2600 any or a limited number of <tt>ResultMessages</tt> for 2601 reasons such as resource limitations. 2602 </li> 2603 </ol> 2604 </section> 2605 </section> 2606 </section> 2607 <section anchor="blockstorage" numbered="true" toc="default"> 2608 <name>Blocks</name> 2609 <t> 2610 This section describes various considerations R<sup>5</sup>N 2611 implementations must consider with respect to blocks. 2612 Specifically, implementations <bcp14>SHOULD</bcp14> be validate and persist 2613 blocks. 2614 Implementations 2615 <bcp14>MAY</bcp14> not support validation for all types 2616 of blocks. 2617 For example, on some devices, storing blocks is impossible due to lack of 2618 storage capacity. 2619 Block storage improves lookup performance for local applications and also 2620 other peers. Not storing blocks results in degraded performance. 2621 </t> 2622 <t> 2623 The block type determines the format and handling of the block 2624 payload by peers in <tt>PutMessages</tt> and 2625 <tt>ResultMessages</tt>. Applications can and should define 2626 their own block types. Block types <bcp14>MUST</bcp14> be 2627 registered with GANA (see <xref target="gana_block_type"/>). 2628 Especially when new block types are introduced, some peers 2629 <bcp14>MAY</bcp14> lack support for the respective block 2630 operations. 2631 </t> 2632 <t> 2633 </t> 2634 <section anchor="block_functions"> 2635 <name>Block Operations</name> 2636 <t> 2637 Block validation operations are used as part of message 2638 processing (see <xref target="p2p_messages"/>) for all 2639 types of DHT messages. To enable these validation 2640 operations, any block type specification 2641 <bcp14>MUST</bcp14> define the following functions: 2642 </t> 2643 <dl> 2644 <dt>ValidateBlockQuery(Key, XQuery) 2645 -> RequestEvaluationResult</dt> 2646 <dd> 2647 <t> 2648 is used to evaluate the request for a block as part of 2649 <tt>GetMessage</tt> processing. Here, the block payload is unkown, 2650 but if possible the <tt>XQuery</tt> and <tt>Key</tt> 2651 <bcp14>SHOULD</bcp14> be verified. Possible values for 2652 the <tt>RequestEvaluationResult</tt> are: 2653 </t> 2654 <dl> 2655 <dt>REQUEST_VALID</dt> 2656 <dd> 2657 The query is valid. 2658 </dd> 2659 <dt>REQUEST_INVALID</dt> 2660 <dd> 2661 The query format does not match the block type. For example, 2662 a mandatory <tt>XQuery</tt> was not provided, or of 2663 the size of the <tt>XQuery</tt> is not appropriate 2664 for the block type. 2665 </dd> 2666 </dl> 2667 </dd> 2668 <dt>DeriveBlockKey(Block) -> Key | NONE</dt> 2669 <dd> 2670 is used to synthesize the block key from the block 2671 payload as part of <tt>PutMessage</tt> and 2672 <tt>ResultMessage</tt> processing. The special return 2673 value of <tt>NONE</tt> implies that this block type does 2674 not permit deriving the key from the block. A 2675 <tt>Key</tt> may be returned for a block that is 2676 ill-formed. 2677 </dd> 2678 <dt>ValidateBlockStoreRequest(Block) 2679 -> BlockEvaluationResult</dt> 2680 <dd> 2681 <t> 2682 is used to evaluate a block payload as part of 2683 <tt>PutMessage</tt> and <tt>ResultMessage</tt> 2684 processing. Possible values for the 2685 <tt>BlockEvaluationResult</tt> are: 2686 </t> 2687 <dl> 2688 <dt>BLOCK_VALID</dt> 2689 <dd> 2690 The block is valid. 2691 </dd> 2692 <dt>BLOCK_INVALID</dt> 2693 <dd> 2694 The block payload does not match the block type. 2695 </dd> 2696 </dl> 2697 </dd> 2698 <dt>SetupResultFilter(FilterSize, Mutator) -> RF</dt> 2699 <dd> 2700 is used to setup an empty result filter. The arguments 2701 are typically the size of the set of results that must 2702 be filtered at the initiator, and a <tt>Mutator</tt> 2703 value which <bcp14>MAY</bcp14> be used to 2704 deterministically re-randomize probabilistic data 2705 structures. <tt>RF</tt> <bcp14>MUST</bcp14> be a byte 2706 sequence suitable for transmission over the network. 2707 </dd> 2708 <dt>FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult, RF')</dt> 2709 <dd> 2710 <t> 2711 is used to filter results against specific queries. 2712 This function does not check the validity of 2713 <tt>Block</tt> itself or that it matches the given key, 2714 as this must have been checked earlier. Locally 2715 stored blocks from previously observed 2716 <tt>ResultMessages</tt> and <tt>PutMessages</tt> use 2717 this function to perform filtering based on the request 2718 parameters of a particular GET operation. Possible 2719 values for the <tt>FilterEvaluationResult</tt> are: 2720 </t> 2721 <dl> 2722 <dt>FILTER_MORE</dt> 2723 <dd> 2724 <tt>Block</tt> is a valid result, and there may be more. 2725 </dd> 2726 <dt>FILTER_LAST</dt> 2727 <dd> 2728 The given <tt>Block</tt> is the last possible valid result. 2729 </dd> 2730 <dt>FILTER_DUPLICATE</dt> 2731 <dd> 2732 <tt>Block</tt> is a valid result, but considered to be 2733 a duplicate (was filtered by the <tt>RF</tt>) and 2734 <bcp14>SHOULD NOT</bcp14> be returned to the previous 2735 hop. Peers that do not understand the block type 2736 <bcp14>MAY</bcp14> return such duplicate results 2737 anyway and implementations must take this into account. 2738 </dd> 2739 <dt>FILTER_IRRELEVANT</dt> 2740 <dd> 2741 <tt>Block</tt> does not satisfy the constraints 2742 imposed by the <tt>XQuery</tt>. The result 2743 <bcp14>SHOULD NOT</bcp14> be returned to the previous 2744 hop. Peers that do not understand the block type 2745 <bcp14>MAY</bcp14> return such irrelevant results 2746 anyway and implementations must take this into account. 2747 </dd> 2748 </dl> 2749 <t> 2750 If the main evaluation result is <tt>FILTER_MORE</tt>, 2751 the function also returns an updated result filter 2752 where the <tt>Block</tt> is added to the set of 2753 filtered replies. An implementation is not expected 2754 to actually differenciate between the 2755 <tt>FILTER_DUPLICATE</tt> and 2756 <tt>FILTER_IRRELEVANT</tt> return values: in both 2757 cases the <tt>Block</tt> is ignored for this query. 2758 </t> 2759 </dd> 2760 </dl> 2761 </section> 2762 <section anchor="hello_block"> 2763 <name>HELLO Blocks</name> 2764 <t> 2765 For bootstrapping and peer discovery, the DHT 2766 implementation uses its own block type called "HELLO". 2767 <tt>HELLO</tt> blocks are the only type of block that 2768 <bcp14>MUST</bcp14> be supported by every R<sup>5</sup>N 2769 implementation. A block with this block type contains the 2770 peer public key of the peer that published the 2771 <tt>HELLO</tt> together with a set of addresses of this 2772 peer. The key of a <tt>HELLO</tt> block is the SHA-512 2773 hash <xref target="RFC4634"/> of the peer public key and 2774 thus the peer's identity in the DHT. 2775 </t> 2776 <t> 2777 The <tt>HELLO</tt> block type wire format is illustrated 2778 in <xref target="figure_hello"/>. A query for block of 2779 type <tt>HELLO</tt> <bcp14>MUST NOT</bcp14> include 2780 extended query data (XQuery). Any implementation 2781 encountering a request for a <tt>HELLO</tt> with non-empty 2782 XQuery data <bcp14>MUST</bcp14> consider the request 2783 invalid and ignore it. 2784 </t> 2785 <figure anchor="figure_hello" title="The HELLO Block Format."> 2786 <artwork name="" type="" align="left" alt=""><![CDATA[ 2787 0 8 16 24 32 40 48 56 2788 +-----+-----+-----+-----+-----+-----+-----+-----+ 2789 | PEER PUBLIC KEY | 2790 | (32 byte) | 2791 | | 2792 | | 2793 +-----+-----+-----+-----+-----+-----+-----+-----+ 2794 | SIGNATURE | 2795 | (64 byte) | 2796 | | 2797 | | 2798 | | 2799 | | 2800 | | 2801 | | 2802 +-----+-----+-----+-----+-----+-----+-----+-----+ 2803 | EXPIRATION | 2804 +-----+-----+-----+-----+-----+-----+-----+-----+ 2805 / ADDRESSES / 2806 / (variable length) / 2807 +-----+-----+-----+-----+-----+-----+-----+-----+ 2808 ]]></artwork> 2809 </figure> 2810 <dl> 2811 <dt>PEER PUBLIC KEY</dt> 2812 <dd> 2813 is the public key of the peer to which the 2814 <tt>ADDRESSES</tt> belong. This is also the public key 2815 needed to verify the <tt>SIGNATURE</tt>. 2816 </dd> 2817 <dt>EXPIRATION</dt> 2818 <dd> 2819 denotes the absolute 64-bit expiration date of the 2820 content. The value specified is microseconds since 2821 midnight (0 hour), January 1, 1970 in network byte 2822 order, but must be a multiple of one million (so that it 2823 can be represented in seconds in a <tt>HELLO</tt> URL). 2824 </dd> 2825 <dt>ADDRESSES</dt> 2826 <dd> 2827 is a list of UTF-8 addresses (<xref target="terminology"/>) 2828 which can be used to contact the peer. 2829 Each address <bcp14>MUST</bcp14> be 0-terminated. 2830 The set of addresses <bcp14>MAY</bcp14> be empty, for example 2831 if the peer knows that it cannot be reached from the outside (i.e. NAT). 2832 </dd> 2833 <dt>SIGNATURE</dt> 2834 <dd> 2835 <t> 2836 is the EdDSA signature <xref target="ed25519"/> of the 2837 <tt>HELLO</tt> block. The signature covers various 2838 information derived from the actual data in the 2839 <tt>HELLO</tt> block. The data signed over includes the 2840 block expiration time, a constant that uniquely 2841 identifies the purpose of the signature, and a hash of 2842 the addresses with 0-terminators in the same order as 2843 they are present in the <tt>HELLO</tt> block. The 2844 format is illustrated in <xref 2845 target="figure_hellowithpseudo"/>. 2846 </t> 2847 <figure anchor="figure_hellowithpseudo" title="The Wire Format of the HELLO for Signing."> 2848 <artwork name="" type="" align="left" alt=""><![CDATA[ 2849 0 8 16 24 32 40 48 56 2850 +-----+-----+-----+-----+-----+-----+-----+-----+ 2851 | SIZE | PURPOSE | 2852 +-----+-----+-----+-----+-----+-----+-----+-----+ 2853 | EXPIRATION | 2854 +-----+-----+-----+-----+-----+-----+-----+-----+ 2855 | H_ADDRS | 2856 | (64 byte) | 2857 | | 2858 | | 2859 | | 2860 | | 2861 | | 2862 | | 2863 +-----+-----+-----+-----+-----+-----+-----+-----+ 2864 ]]></artwork> 2865 </figure> 2866 <dl> 2867 <dt>SIZE</dt> 2868 <dd> 2869 A 32-bit value containing the length of the signed data 2870 in bytes in network byte order. The length of the 2871 signed data <bcp14>MUST</bcp14> be 80 bytes. 2872 </dd> 2873 <dt>PURPOSE</dt> 2874 <dd> 2875 A 32-bit signature purpose flag. This field 2876 <bcp14>MUST</bcp14> be 7 in network byte order. 2877 </dd> 2878 <dt>EXPIRATION</dt> 2879 <dd> 2880 denotes the absolute 64-bit expiration date of the 2881 <tt>HELLO</tt> in microseconds since midnight (0 hour), 2882 January 1, 1970 in network byte order. 2883 </dd> 2884 <dt>H_ADDRS</dt> 2885 <dd> 2886 a SHA-512 hash over the addresses in the <tt>HELLO</tt>. 2887 <tt>H_ADDRS</tt> is generated over the 2888 <tt>ADDRESSES</tt> field as provided in the 2889 <tt>HELLO</tt> block using SHA-512 <xref 2890 target="RFC4634"/>. 2891 </dd> 2892 </dl> 2893 </dd> 2894 </dl> 2895 <t> 2896 The <tt>HELLO</tt> block operations <bcp14>MUST</bcp14> be 2897 implemented as follows: 2898 </t> 2899 <dl> 2900 <dt>ValidateBlockQuery(Key, XQuery) 2901 -> RequestEvaluationResult</dt> 2902 <dd> 2903 To validate a block query for a <tt>HELLO</tt> is to 2904 simply check that the <tt>XQuery</tt> is empty. If it is empty, 2905 <tt>REQUEST_VALID</tt> ist returned. Otherwise, 2906 <tt>REQUEST_INVALID</tt> is returned. 2907 </dd> 2908 <dt>DeriveBlockKey(Block) -> Key | NONE</dt> 2909 <dd> 2910 To derive a block key for a <tt>HELLO</tt> is to simply 2911 hash the <tt>PEER PUBLIC KEY</tt> from the <tt>HELLO</tt>. The 2912 result of this function is thus always the SHA-512 hash over 2913 the <tt>PEER PUBLIC KEY</tt>. 2914 </dd> 2915 <dt>ValidateBlockStoreRequest(Block) 2916 -> BlockEvaluationResult</dt> 2917 <dd> 2918 To validate a block store request is to verify the EdDSA 2919 <tt>SIGNATURE</tt> over the hashed <tt>ADDRESSES</tt> 2920 against the public key from the <tt>PEER PUBLIC KEY</tt> 2921 field. If the signature is valid <tt>BLOCK_VALID</tt> is 2922 returned. Otherwise <tt>BLOCK_INVALID</tt> is returned. 2923 </dd> 2924 <dt>SetupResultFilter(FilterSize, Mutator) -> RF</dt> 2925 <dd> 2926 <t> 2927 The <tt>RF</tt> for <tt>HELLO</tt> blocks is implemented 2928 using a Bloom filter following the definition from <xref 2929 target="bloom_filters"/> and consists of a variable number 2930 of bits <tt>L</tt>. <tt>L</tt> depends on the 2931 <tt>FilterSize</tt> which will be the number of 2932 connected peers <tt>|E|</tt> known to the peer creating a 2933 <tt>HELLO</tt> block from its own addresses: <tt>L</tt> is 2934 set to the minimum of 2<sup>18</sup> bits (2<sup>15</sup> 2935 bytes) and the lowest power of 2 that is strictly larger 2936 than <tt>2*K*|E|</tt> bits (<tt>K*|E|/4</tt> bytes). 2937 </t> 2938 <t> 2939 The <tt>k</tt>-value for the Bloom filter 2940 <bcp14>MUST</bcp14> be 16. The elements used in the Bloom 2941 filter consist of an XOR between the <tt>H_ADDRS</tt> 2942 field (as computed using SHA-512 over the 2943 <tt>ADDRESSES</tt>) and the SHA-512 hash of the 2944 <tt>MUTATOR</tt> field from a given <tt>HELLO</tt> block. 2945 The mapping function M(<tt>H_ADDRS XOR MUTATOR</tt>) is 2946 defined as follows: 2947 </t> 2948 <t> 2949 <tt>M(e = H_ADDR XOR MUTATOR) -> e as uint32[]</tt> 2950 </t> 2951 <t> 2952 <tt>M</tt> is an identity function and returns the 512-bit 2953 XOR result unmodified. This resulting byte string is 2954 interpreted as k=16 32-bit integers in network byte order 2955 which are used to set and check the bits in <tt>B</tt> 2956 using <tt>BF-SET()</tt> and <tt>BF-TEST()</tt>. The 2957 32-bit <tt>MUTATOR</tt> is prepended to the L-bit Bloom filter 2958 field <tt>HELLO_BF</tt> containing <tt>B</tt> to create 2959 the result filter for a <tt>HELLO</tt> block: 2960 </t> 2961 <figure anchor="hello_rf" title="The HELLO Block Result Filter."> 2962 <artwork name="" type="" align="left" alt=""><![CDATA[ 2963 0 8 16 24 32 40 48 56 2964 +-----+-----+-----+-----+-----+-----+-----+-----+ 2965 | MUTATOR | HELLO_BF / 2966 +-----+-----+-----+-----+ (variable length) / 2967 / / 2968 +-----+-----+-----+-----+-----+-----+-----+-----+ 2969 ]]></artwork> 2970 </figure> 2971 <t>where:</t> 2972 <dl> 2973 <dt>MUTATOR</dt> 2974 <dd> 2975 The 32-bit mutator for the result filter. 2976 </dd> 2977 <dt>HELLO_BF</dt> 2978 <dd> 2979 The L-bit Bloom filter array. 2980 </dd> 2981 </dl> 2982 <t> 2983 The <tt>MUTATOR</tt> value is used to additionally 2984 "randomize" the computation of the Bloom filter while 2985 remaining deterministic across peers. It is only ever set 2986 by the peer initiating the GET request, and changed every 2987 time the GET request is repeated. Peers forwarding GET 2988 requests <bcp14>MUST</bcp14> not change the mutator value 2989 included in the <tt>RESULT_FILTER</tt> as they might not 2990 be able to recalculate the result filter with a different 2991 <tt>MUTATOR</tt> value. 2992 </t> 2993 <t> 2994 Consequently, repeated requests have statistically 2995 independent probabilities of creating false-positives in a 2996 result filter. Thus, even if for one request a result 2997 filter may exclude a result as a false-positive match, 2998 subsequent requests are likely to not have the same 2999 false-positives. 3000 </t> 3001 <t> 3002 <tt>HELLO</tt> result filters can be merged if the Bloom 3003 filters have the same size and <tt>MUTATOR</tt> by setting 3004 all bits to 1 that are set in either Bloom filter. This 3005 is done whenever a peer receives a query with the same 3006 <tt>MUTATOR</tt>, predecessor and Bloom filter size. 3007 </t> 3008 </dd> 3009 <dt>FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult, RF')</dt> 3010 <dd> 3011 The <tt>H_ADDRS</tt> field is XORed with the SHA-512 hash 3012 of the <tt>MUTATOR</tt> field from the <tt>HELLO</tt> 3013 block and the resulting value is checked against the 3014 Bloom filter in <tt>RF</tt>. Consequently, 3015 <tt>HELLOs</tt> with completely identical sets of 3016 addresses will be filtered and <tt>FILTER_DUPLICATE</tt> 3017 is returned. Any small variation in the set of addresses 3018 will cause the block to no longer be filtered (with high 3019 probability) and <tt>FILTER_MORE</tt> is returned. 3020 </dd> 3021 </dl> 3022 </section> 3023 <section> 3024 <name>Persistence</name> 3025 <t> 3026 An implementation <bcp14>SHOULD</bcp14> provide a local 3027 persistence mechanism for blocks. Embedded systems that 3028 lack storage capability <bcp14>MAY</bcp14> use 3029 connection-level signalling to indicate that they are merely 3030 a client utilizing a DHT and are not able to participate 3031 with storage. The local storage <bcp14>MUST</bcp14> provide 3032 the following functionality: 3033 </t> 3034 <dl> 3035 <dt>Store(Key, Block)</dt> 3036 <dd> 3037 Stores a block under the specified key. If an block with identical 3038 payload exists already under the same key, the meta data should 3039 be set to the maximum expiration time of both blocks and use the 3040 corresponding <tt>PUTPATH</tt> (and if applicable 3041 <tt>TRUNCATED ORIGIN</tt>) of that version of the block. 3042 </dd> 3043 <dt>Lookup(Key) -> Set of Blocks</dt> 3044 <dd> 3045 Retrieves blocks stored under the specified key. 3046 </dd> 3047 <dt>LookupApproximate(Key) -> List of Blocks</dt> 3048 <dd> 3049 Retrieves the blocks stored under the specified key and 3050 any blocks under keys close to the specified key, in order 3051 of decreasing proximity. 3052 </dd> 3053 </dl> 3054 <section anchor="approx_search"> 3055 <name>Approximate Search Considerations</name> 3056 <t> 3057 Over time a peer may accumulate a significant number of blocks 3058 which are stored locally in the persistence layer. 3059 Due to the expected high number of blocks, the method to 3060 retrieve blocks close to the specified lookup key in the 3061 <tt>LookupApproximate</tt> API must be implemented with care 3062 with respect to efficiency. 3063 </t> 3064 <t> 3065 It is <bcp14>RECOMMENDED</bcp14> to limit the number of 3066 results from the <tt>LookupApproximate</tt> procedure to a 3067 result size which is easily manageable by the local system. 3068 The <bcp14>RECOMMENDED</bcp14> default is to return blocks 3069 with the four closest keys. Note that filtering by the 3070 <tt>RF</tt> will be done by the DHT afterwards and it is 3071 <bcp14>NOT RECOMMENDED</bcp14> to fetch additional records 3072 even if all four closest keys are filtered by the 3073 <tt>RF</tt>. The main reason for this is to ensure peers do 3074 not spend extensive resources to process approximate 3075 lookups. In particular, implementations <bcp14>MUST</bcp14> 3076 limit the worst-case effort they spent on approximate 3077 lookups. 3078 </t> 3079 <t> 3080 In order to efficiently find a suitable result set, the 3081 implementation <bcp14>SHOULD</bcp14> follow the following 3082 procedure: 3083 </t> 3084 <ol> 3085 <li> 3086 Sort all blocks by the block key in ascending (decending) order. 3087 The block keys are interpreted as integers. 3088 </li> 3089 <li> 3090 Alternatingly select a block with a key larger and smaller 3091 from the sortings. The resulting set is then sorted by 3092 the XOR distance to the query. The selection process 3093 continues until the upper bound for the result set is 3094 reached and both sortings do not yield any closer blocks. 3095 </li> 3096 </ol> 3097 <t> 3098 An implementation <bcp14>MAY</bcp14> decide to use a custom 3099 algorithm in order to find the closest blocks in the local 3100 storage. But, especially for more primitive approaches 3101 (such as only comparing XOR distances for all blocks in the 3102 storage), more simplistic procedures may become ineffective 3103 for large data sets and care <bcp14>MUST</bcp14> be taken 3104 to strictly bound the maximum effort expended per query. 3105 </t> 3106 </section> 3107 <section> 3108 <name>Caching Strategy Considerations</name> 3109 <t> 3110 An implementation <bcp14>MUST</bcp14> implement an 3111 eviction strategy for blocks stored in the block storage 3112 layer. 3113 </t> 3114 <t> 3115 In order to ensure the freshness of blocks, an implementation 3116 <bcp14>MUST</bcp14> evict expired blocks in favor of 3117 new blocks. 3118 </t> 3119 <t> 3120 An implementation <bcp14>MAY</bcp14> preserve blocks which 3121 are often requested. This approach can be expensive as it 3122 requires the implementation to keep track of how often a 3123 block is requested. 3124 </t> 3125 <t> 3126 An implementation <bcp14>MAY</bcp14> preserve blocks which 3127 are close to the local peer public key. 3128 </t> 3129 <t> 3130 An implementation <bcp14>MAY</bcp14> provide configurable 3131 storage quotas and adapt its eviction strategy based on 3132 the current storage size or other constrained resources. 3133 </t> 3134 </section> 3135 </section> 3136 </section> 3137 <section anchor="security" numbered="true" toc="default"> 3138 <name>Security Considerations</name> 3139 <!-- FIXME: Here we should (again) discuss how the system is open and 3140 does not have/require a trust anchor a priori. This is (again) in contrast 3141 to RELOAD --> 3142 <t> 3143 If an upper bound to the maximum number of neighbors in a 3144 k-bucket is reached, the implementation <bcp14>MUST</bcp14> 3145 prefer to preserve the oldest working connections instead of 3146 new connections. This makes Sybil attacks <xref 3147 target="Sybil"/> less effective as an adversary would have to 3148 invest more resources over time to mount an effective attack. 3149 </t> 3150 <t> 3151 The <tt>ComputeOutDegree</tt> function limits the 3152 <tt>REPL_LVL</tt> to a maximum of 16. This imposes 3153 an upper limit on bandwidth amplification an attacker 3154 may achieve for a given network size and topology. 3155 </t> 3156 <section> 3157 <name>Disjoint Underlay or Application Protocol Support</name> 3158 <t> 3159 We note that peers 3160 implementing disjoint sets of underlay protocols may 3161 experience difficulties communicating (unless other peers 3162 bridge the respective underlays). Similarly, peers that 3163 do not support a particular application will not be able 3164 to validate application-specific payloads and may thus be 3165 tricked into storing or forwarding corrupt blocks. 3166 </t> 3167 </section> 3168 <section> 3169 <name>Approximate Result Filtering</name> 3170 <t> 3171 When a <tt>FindApproximate</tt> flag is encountered in a 3172 query, a peer will try to respond with the closest block it 3173 has that is not filtered by the result Bloom filter 3174 (<tt>RF</tt>). Implementations <bcp14>MUST</bcp14> ensure 3175 that the cost of evaluating any such query is reasonably 3176 small. For example, implementations <bcp14>SHOULD</bcp14> 3177 consider ways to avoid an exhaustive search of their 3178 database. Not doing so can lead to denial of service 3179 attacks as there could be cases where too many local results 3180 are filtered by the result filter. 3181 </t> 3182 </section> 3183 <section> 3184 <name>Access Control</name> 3185 <t> 3186 By design R<sup>5</sup>N does not rely on strict admission 3187 control through the use of either centralized enrollment 3188 servers or pre-shared keys. This is a key distintion over 3189 protocols that do rely on this kind of access control such 3190 as <xref target="RFC6940"/> which, like R<sup>5</sup>N, 3191 provides a peer-to-peer (P2P) signaling protocol with 3192 extensible routing and topology mechanisms. Some 3193 decentralized applications, such as the GNU Name System 3194 (<xref target="RFC9498"/>), require an open system that 3195 enables ad-hoc participation. 3196 </t> 3197 </section> 3198 <section> 3199 <name>Block-level confidentiality and privacy</name> 3200 <t> 3201 Applications using the DHT APIs to store application-specific 3202 block types may have varying security and privacy requirements. 3203 R<sup>5</sup>N does NOT provide any kind confidentiality or 3204 privacy for blocks, for example through the use of cryptography. 3205 This must be provided by the application as part of the block 3206 type design. 3207 One example where confidentiality and privacy are required are 3208 GNS records and their record blocks as defined in 3209 <xref target="RFC9498"/>. 3210 Other possibilities to protect the block objects may be implemented 3211 using ideas from other efforts such as Oblivious HTTP and its 3212 encapsulation of HTTP requests and responses <xref target="RFC9458"/>. 3213 </t> 3214 </section> 3215 <section> 3216 <name>Protocol extensions and cryptographic agility</name> 3217 <t> 3218 R<sup>5</sup>N makes heavy use of the Ed25519 cryptosystem. 3219 It cannot be ruled out that the relevant primitives 3220 are broken at any point in the future. 3221 In this case, the R<sup>5</sup>N design can be reused by modifying 3222 the messages and related artifacts defined in 3223 <xref target="message_components"/>. 3224 In order to extend and modify the R<sup>5</sup>N protocol 3225 in general and to replace cryptographic primitives in particular, 3226 new message types (<tt>MTYPE</tt> fields) can be registered in 3227 <xref target="GANA"/> and the message formats updated accordingly. 3228 Peers processing messages <bcp14>MUST NOT</bcp14> 3229 modify the <tt>MTYPE</tt> field in order to prevent 3230 possible security downgrades. 3231 </t> 3232 </section> 3233 <section> 3234 <name>Availability versus security tradeoffs in routing table evictions</name> 3235 <t> 3236 R<sup>5</sup>N does not implement locality-sensitive as it does not 3237 preferentially evict distant nodes (where distance is a metric based 3238 on closeness in the physical network). 3239 Locality-sensitive routing table eviction may offer performance improvements 3240 especially if the local network and its resources can be leveraged 3241 more efficiently. 3242 Similarly, if requests (and responses) can be contained to the local 3243 network, this can offer better privacy. 3244 But, this an important security trade-off when choosing network locality over 3245 R<sup>5</sup>N's eviction strategy (<xref target="routing_table"/>): 3246 "Flash mob"-style attackers that quickly spin up a large number of nodes 3247 a target node's proximity are displacing legitimate, benign neighbours. 3248 In case of the R<sup>5</sup>N eviction strategy these will likely not 3249 degrade the routing table to the same degree because long-lived connections 3250 are preferred. 3251 This in turn forces an attacker to run their nodes for a long time to run a 3252 successful attack. 3253 </t> 3254 <t> 3255 It is important to highlight that in order to address the R<sup>5</sup>N threat 3256 and security model (<xref target="security_model"/>), the routing starts with a 3257 random walk. 3258 Should all nodes implement a locality sensitive eviction strategy, the theoretical 3259 effectiveness of this measure would drastically decrease. 3260 R<sup>5</sup>N puts security and availability under its threat model, over performance 3261 and privacy. 3262 </t> 3263 <t> 3264 It should be noted that any reasonable locality metric will choose nodes 3265 that implicitly provide more stable network connections than distant nodes as 3266 the probability for network failures grows with physical distance. 3267 As a consequence it can be assumed that a locality-sensitive metric and 3268 R<sup>5</sup>N's eviction strategy eventually converge into a similar 3269 situation where a node primarily maintains a routing table consisting of 3270 long-lived and somewhat local connections. 3271 </t> 3272 </section> 3273 3274 3275 </section> 3276 <section anchor="iana" numbered="true" toc="default"> 3277 <name>IANA Considerations</name> 3278 <t> 3279 IANA maintains a registry called the "Uniform Resource Identifier 3280 (URI) Schemes" registry. 3281 The registry should be updated to include 3282 an entry for the 'gnunet' URI scheme. IANA is requested to 3283 update that entry to reference this document when published 3284 as an RFC. 3285 </t> 3286 </section> 3287 3288 <section anchor="gana" numbered="true" toc="default"> 3289 <name>GANA Considerations</name> 3290 <section anchor="gana_block_type" numbered="true" toc="default"> 3291 <name>Block Type Registry</name> 3292 <t> 3293 GANA <xref target="GANA"/> 3294 is requested to create a "DHT Block Types" registry. 3295 The registry shall record for each entry: 3296 </t> 3297 <ul> 3298 <li>Name: The name of the block type (case-insensitive ASCII 3299 string, restricted to alphanumeric characters</li> 3300 <li>Number: 32-bit</li> 3301 <li>Comment: Optionally, a brief English text describing the purpose of 3302 the block type (in UTF-8)</li> 3303 <li>Contact: Optionally, the contact information of a person to contact for 3304 further information</li> 3305 <li>References: Required, references (such as an RFC) specifying the block type and its block functions</li> 3306 </ul> 3307 <t> 3308 The registration policy for this sub-registry is "First Come First 3309 Served", as described in <xref target="RFC8126"/>. 3310 GANA created the registry as follows: 3311 </t> 3312 <!-- NOTE: changed GNS Reference to This.I-D because we either need to define it here 3313 or in the GNS RFC. And I think here is better or in a separate document 3314 => not in here. Use separate document for NAMERECORD draft. 3315 -MSC --> 3316 <figure anchor="figure_btypenums" title="The Block Type Registry."> 3317 <artwork name="" type="" align="left" alt=""><![CDATA[ 3318 Number| Name | References | Description 3319 ------+----------------+------------+------------------------- 3320 0 ANY [This.I-D] Reserved 3321 13 DHT_HELLO [This.I-D] Address data for a peer 3322 3323 Contact: r5n-registry@gnunet.org 3324 ]]></artwork> 3325 </figure> 3326 </section> 3327 3328 <section anchor="gana_gnunet_url" numbered="true" toc="default"> 3329 <name>GNUnet URI Schema Subregistry</name> 3330 <t> 3331 GANA <xref target="GANA"/> 3332 is requested to create a "gnunet://" sub-registry. 3333 The registry shall record for each entry: 3334 </t> 3335 <ul> 3336 <li>Name: The name of the subsystem (case-insensitive ASCII 3337 string, restricted to alphanumeric characters)</li> 3338 <li>Comment: Optionally, a brief English text describing the purpose of 3339 the subsystem (in UTF-8)</li> 3340 <li>Contact: Optionally, the contact information of a person to contact for 3341 further information</li> 3342 <li>References: Optionally, references describing the syntax of the URL 3343 (such as an RFC or LSD)</li> 3344 </ul> 3345 <t> 3346 <!-- FIXME: See GNS wording for this which is already improved / ISE compliant --> 3347 The registration policy for this sub-registry is "First Come First 3348 Served", as described in <xref target="RFC8126"/>. 3349 GANA created this registry as follows: 3350 </t> 3351 <figure anchor="figure_gnunetscheme" title="GNUnet scheme Subregistry."> 3352 <artwork name="" type="" align="left" alt=""><![CDATA[ 3353 Name | References | Description 3354 ---------------+------------+------------------------- 3355 HELLO [This.I-D] How to contact a peer. 3356 ADDRESS N/A Network address. 3357 Contact: gnunet-registry@gnunet.org 3358 ]]></artwork> 3359 </figure> 3360 </section> 3361 3362 <section anchor="gana_signature_purpose" numbered="true" toc="default"> 3363 <name>GNUnet Signature Purpose Registry</name> 3364 <t> 3365 GANA amended the "GNUnet Signature Purpose" registry 3366 as follows: 3367 </t> 3368 <figure anchor="figure_purposenums" title="The Signature Purpose Registry Entries."> 3369 <artwork name="" type="" align="left" alt=""><![CDATA[ 3370 Purpose | Name | References | Description 3371 --------+-----------------+------------+--------------- 3372 6 DHT PATH ELEMENT [This.I-D] DHT message routing data 3373 7 HELLO PAYLOAD [This.I-D] Peer contact information 3374 ]]></artwork> 3375 </figure> 3376 </section> 3377 3378 <section anchor="gana_message_type" numbered="true" toc="default"> 3379 <name>GNUnet Message Type Registry</name> 3380 <t> 3381 GANA is requested to amend the "GNUnet Message Type" registry 3382 as follows: 3383 </t> 3384 <figure anchor="figure_messagetypeenums" title="The Message Type Registry Entries."> 3385 <artwork name="" type="" align="left" alt=""><![CDATA[ 3386 Type | Name | References | Description 3387 --------+-----------------+------------+--------------- 3388 146 DHT PUT [This.I-D] Store information in DHT 3389 147 DHT GET [This.I-D] Request information from DHT 3390 148 DHT RESULT [This.I-D] Return information from DHT 3391 157 HELLO Message [This.I-D] Peer contact information 3392 3393 ]]></artwork> 3394 </figure> 3395 </section> 3396 </section> 3397 <section anchor="implementation_deployment" numbered="true" toc="default"> 3398 <name>Implementation and deployment status</name> 3399 <t> 3400 There are two implementations conforming to this specification written 3401 in C and Go, respectively. The C implementation as part of GNUnet 3402 <xref target="GNUnetGNS"/> represents the original 3403 and reference implementation. The Go implementation 3404 <xref target="GoGNS"/> demonstrates how two implementations of GNS are 3405 interoperable if they are built on top of the same underlying 3406 DHT storage. 3407 </t> 3408 <t> 3409 Currently, the GNUnet peer-to-peer network <xref target="GNUnet"/> 3410 is an active deployment of R<sup>5</sup>N. 3411 It's primary but not only purpose is to provide the storage underlay 3412 for the GNU Name System <xref target="RFC9498"/>. 3413 The <xref target="GoGNS"/> implementation shows how R<sup>5</sup>N implementations 3414 can interoperate. 3415 </t> 3416 </section> 3417 <!-- gana --> 3418 <!--<section anchor="testvectors"> 3419 <name>Test Vectors</name> 3420 </section>--> 3421 </middle> 3422 <back> 3423 <references> 3424 <name>Normative References</name> 3425 3426 &RFC2119; 3427 &RFC2663; 3428 &RFC3561; 3429 &RFC3629; 3430 &RFC3986; 3431 &RFC4634; 3432 &RFC5234; 3433 &RFC5245; 3434 &RFC6940; 3435 &RFC8126; 3436 &RFC8174; 3437 &RFC9458; 3438 &RFC9498; 3439 3440 <reference anchor="ed25519" target="http://link.springer.com/chapter/10.1007/978-3-642-23951-9_9"><front><title>High-Speed High-Security Signatures</title><author initials="D." surname="Bernstein" fullname="Daniel Bernstein"><organization>University of Illinois at Chicago</organization></author><author initials="N." surname="Duif" fullname="Niels Duif"><organization>Technische Universiteit Eindhoven</organization></author><author initials="T." surname="Lange" fullname="Tanja Lange"><organization>Technische Universiteit Eindhoven</organization></author><author initials="P." surname="Schwabe" fullname="Peter Schwabe"><organization>National Taiwan University</organization></author><author initials="B." surname="Yang" fullname="Bo-Yin Yang"><organization>Academia Sinica</organization></author><date year="2011"/></front></reference> 3441 3442 <reference anchor="GANA" target="https://gana.gnunet.org/"><front><title>GNUnet Assigned Numbers Authority (GANA)</title><author><organization>GNUnet e.V.</organization></author><date month="April" year="2020"/></front></reference> 3443 3444 3445 3446 </references> 3447 <references> 3448 <name>Informative References</name> 3449 <reference anchor="R5N" target="https://doi.org/10.1109/ICNSS.2011.6060022"> 3450 <front> 3451 <title>R5N: Randomized recursive routing for restricted-route networks</title> 3452 <author initials="N. S." surname="Evans" fullname="Nathan S. Evans"> 3453 <organization>Technische Universität München</organization> 3454 </author> 3455 <author initials="C." surname="Grothoff" fullname="Christian Grothoff"> 3456 <organization>Technische Universität München</organization> 3457 </author> 3458 <date year="2011"/> 3459 </front> 3460 </reference> 3461 <reference anchor="NSE" target="https://doi.org/10.1007/978-3-642-30045-5_23"> 3462 <front> 3463 <title>Efficient and Secure Decentralized Network Size Estimation</title> 3464 <author initials="N. S." surname="Evans" fullname="Nathan S. Evans"> 3465 <organization>Technische Universität München</organization> 3466 </author> 3467 <author initials="B." surname="Polot" fullname="Bartlomiej Polot"> 3468 <organization>Technische Universität München</organization> 3469 </author> 3470 <author initials="C." surname="Grothoff" fullname="Christian Grothoff"> 3471 <organization>Technische Universität München</organization> 3472 </author> 3473 <date year="2012"/> 3474 </front> 3475 </reference> 3476 <reference anchor="Kademlia" target="http://css.csail.mit.edu/6.824/2014/papers/kademlia.pdf"> 3477 <front> 3478 <title>Kademlia: A peer-to-peer information system based on the xor metric.</title> 3479 <author initials="P." surname="Maymounkov" fullname="Petar Maymounkov"> 3480 </author> 3481 <author initials="D." surname="Mazieres" fullname="David Mazieres"> 3482 </author> 3483 <date year="2002"/> 3484 </front> 3485 </reference> 3486 <reference anchor="Sybil" target="https://link.springer.com/chapter/10.1007/3-540-45748-8_24"> 3487 <front> 3488 <title>The Sybil Attack</title> 3489 <author initials="J." surname="Douceur" fullname="John R. Douceur"> 3490 </author> 3491 <date year="2002"/> 3492 </front> 3493 </reference> 3494 <reference anchor="Smallworldmix" target="https://api.semanticscholar.org/CorpusID:44240877"> 3495 <front> 3496 <title>A Measurement of Mixing Time in Social Networks</title> 3497 <author initials="M." surname="Dell'Amico" fullname="Matteo Dell'Amico"> 3498 </author> 3499 <date year="2009"/> 3500 </front> 3501 </reference> 3502 <reference anchor="Smallworld" target="https://www.nature.com/articles/35022643"> 3503 <front> 3504 <title>Navigation in a small world</title> 3505 <author initials="J." surname="Kleinberg" fullname="Jon M. Kleinberg"> 3506 </author> 3507 <date year="2000"/> 3508 </front> 3509 </reference> 3510 <reference anchor="cadet" target="https://doi.org/10.1109/MedHocNet.2014.6849107"> 3511 <front> 3512 <title>CADET: Confidential ad-hoc decentralized end-to-end transport</title> 3513 <author initials="B." surname="Polot" fullname="Bartlomiej Polot"> 3514 <organization>Technische Universität München</organization> 3515 </author> 3516 <author initials="C." surname="Grothoff" fullname="Christian Grothoff"> 3517 <organization>Technische Universität München</organization> 3518 </author> 3519 <date year="2014"/> 3520 </front> 3521 </reference> 3522 <reference anchor="GNUnet" target="https://gnunet.org"> 3523 <front> 3524 <title>The GNUnet Project</title> 3525 <author> 3526 <organization>GNUnet e.V.</organization> 3527 </author> 3528 </front> 3529 </reference> 3530 <reference anchor="GNUnetGNS" target="https://git.gnunet.org/gnunet.git/tree/src/gns"> 3531 <front> 3532 <title>The GNUnet GNS Implementation</title> 3533 <author> 3534 <organization>GNUnet e.V.</organization> 3535 </author> 3536 </front> 3537 </reference> 3538 3539 <reference anchor="GoGNS" target="https://github.com/bfix/gnunet-go/tree/master/src/gnunet/service/gns"> 3540 <front> 3541 <title>The Go GNS Implementation</title> 3542 <author initials="B." surname="Fix" fullname="Bernd Fix"> 3543 </author> 3544 </front> 3545 </reference> 3546 3547 </references> 3548 <section anchor="bloom_filters" numbered="true" toc="default"> 3549 <name>Bloom filters in R<sup>5</sup>N</name> 3550 <t> 3551 R<sup>5</sup>N uses Bloom filters in several places. This section 3552 gives some general background on Bloom filters and defines functions 3553 on this data structure shared by the various use-cases in R<sup>5</sup>N. 3554 </t> 3555 <t> 3556 A Bloom filter (BF) is a space-efficient probabilistic datastructure 3557 to test if an element is part of a set of elements. 3558 Elements are identified by an element ID. 3559 Since a BF is a probabilistic datastructure, it is possible to have 3560 false-positives: when asked if an element is in the set, the answer from 3561 a BF is either "no" or "maybe". 3562 </t> 3563 <t> 3564 Bloom filters are defined as a string of <tt>L</tt> bits. 3565 The bits are initially always empty, meaning that the bits are set to 3566 zero. 3567 There are two functions which can be invoked on the Bloom filter "bf": 3568 BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element that is to 3569 be added to the Bloom filter or queried against the set. 3570 </t> 3571 <t> 3572 A mapping function M is used to map each ID of each element from the set to a 3573 subset of k bits. 3574 In the original proposal by Bloom, M is non-injective and can thus map the same 3575 element multiple times to the same bit. 3576 The type of the mapping function can thus be described by the following 3577 mathematical notation: 3578 </t> 3579 <figure anchor="figure_bf_func" title="Bloom filter mapping function."> 3580 <artwork><![CDATA[ 3581 ------------------------------------ 3582 # M: E->B^k 3583 ------------------------------------ 3584 # L = Number of bits 3585 # B = 0,1,2,3,4,...L-1 (the bits) 3586 # k = Number of bits per element 3587 # E = Set of elements 3588 ------------------------------------ 3589 Example: L=256, k=3 3590 M('element-data') = {4,6,255} 3591 ]]> 3592 </artwork> 3593 </figure> 3594 <t> 3595 When adding an element to the Bloom filter <tt>bf</tt> using 3596 <tt>BF-SET(bf, e)</tt>, each integer <tt>n</tt> of the mapping 3597 <tt>M(e)</tt> is interpreted as a bit offset <tt>n mod L</tt> within 3598 <tt>bf</tt> and set to 1. 3599 </t> 3600 <t> 3601 When testing if an element may be in the Bloom filter <tt>bf</tt> using 3602 <tt>BF-TEST(bf,e)</tt>, each bit offset <tt>n mod L</tt> within 3603 <tt>bf</tt> <bcp14>MUST</bcp14> have been set to 1. 3604 Otherwise, the element is not considered to be in the Bloom filter. 3605 </t> 3606 </section> 3607 <section anchor="overlay" numbered="true" toc="default"> 3608 <name>Overlay Operations</name> 3609 <t> 3610 An implementation of this specification commonly exposes the two overlay 3611 operations "GET" and "PUT". 3612 The following are non-normative examples of APIs for those operations. 3613 Their behaviour is described prosaically in order to give implementers a fuller 3614 picture of the protocol. 3615 </t> 3616 <section> 3617 <name>GET operation</name> 3618 <t> 3619 A basic GET operation interface may be exposed as: 3620 </t> 3621 <t> 3622 <tt>GET(Query-Key, Block-Type) -> Results as List</tt> 3623 </t> 3624 <t> 3625 The procedure typically takes at least two arguments to initiate a lookup: 3626 </t> 3627 <dl> 3628 <dt><tt>QueryKey</tt>:</dt> 3629 <dd> 3630 is the 512-bit key to look for in the DHT. 3631 </dd> 3632 <dt>Block-Type:</dt> 3633 <dd> 3634 is the type of block to look for, possibly "any". 3635 </dd> 3636 </dl> 3637 <t> 3638 The GET procedure may allow additional optional parameters in order to 3639 control or modify the query: 3640 </t> 3641 <dl> 3642 <dt>Replication-Level:</dt> 3643 <dd> 3644 is an integer which controls how many nearest peers the request 3645 should reach. 3646 </dd> 3647 <dt>Flags:</dt> 3648 <dd> 3649 is a 16-bit vector which indicates certain 3650 processing requirements for messages. 3651 Any combination of flags as defined in <xref target="route_flags"/> 3652 may be specified. 3653 </dd> 3654 <dt>eXtended-Query (XQuery):</dt> 3655 <dd> 3656 is medatadata which may be 3657 required depending on the respective <tt>Block-Type</tt>. 3658 A <tt>Block-Type</tt> must define if the <tt>XQuery</tt> can or must 3659 be used and what the specific format of its contents should be. 3660 Extended queries are in general used to implement domain-specific filters. 3661 These might be particularly useful in combination with FindApproximate 3662 to add a well-defined filter by an application-specific distance. 3663 Regardless, the DHT does not define any particular semantics for an XQuery. 3664 See also <xref target="blockstorage"/>. 3665 </dd> 3666 <dt>Result-Filter:</dt> 3667 <dd> 3668 is data for a <tt>Block-type</tt>-specific filter 3669 which allows applications to 3670 indicate results which are 3671 not relevant anymore to the 3672 caller (see <xref target="result_filter"/>). 3673 </dd> 3674 </dl> 3675 <t> 3676 The GET procedure should be implemented as an asynchronous 3677 operation that returns individual results as they are found 3678 in the DHT. It should terminate only once the application 3679 explicitly cancels the operation. A single result commonly 3680 consists of:</t> 3681 <dl> 3682 <dt>Block-Type:</dt> 3683 <dd> 3684 is the desired type of block in the result. 3685 </dd> 3686 <dt>Block-Data:</dt> 3687 <dd> 3688 is the application-specific block payload. Contents are 3689 specific to the <tt>Block-Type</tt>. 3690 </dd> 3691 <dt>Block-Expiration:</dt> 3692 <dd> 3693 is the expiration time of the block. After this time, the result should no 3694 longer be used. 3695 </dd> 3696 <dt>Key:</dt> 3697 <dd> 3698 is the key under which the block was stored. This may be different from the 3699 key that was queried if the flag <tt>FindApproximate</tt> was set. 3700 </dd> 3701 <dt>GET-Path:</dt> 3702 <dd> 3703 is a signed path of the public keys of peers which the 3704 query traversed through the network. The DHT will try to 3705 make the path available if the <tt>RecordRoute</tt> flag 3706 was set by the application calling the PUT procedure. The 3707 reported path may have been truncated from the beginning. 3708 The API <bcp14>SHOULD</bcp14> signal truncation by exposing 3709 the <tt>Truncated</tt> flag. 3710 </dd> 3711 <dt>PUT-Path:</dt> 3712 <dd> 3713 is a signed path of the public keys of peers which the 3714 result message traversed. The DHT will try to make the 3715 path available if the <tt>RecordRoute</tt> flag was set 3716 for the GET procedure. The reported path may have been 3717 truncated from the beginning. The API 3718 <bcp14>SHOULD</bcp14> signal truncation by exposing the 3719 <tt>Truncated</tt> flag. As the block was cached by the 3720 node at the end of this path, this path is more likely to 3721 be stale compared to the <tt>GET-Path</tt>. 3722 </dd> 3723 <dt>Truncated:</dt> 3724 <dd> 3725 is true if the <tt>GET-Path</tt> or <tt>PUT-Path</tt> 3726 was truncated, otherwise false. 3727 </dd> 3728 </dl> 3729 </section> 3730 <section> 3731 <name>PUT operation</name> 3732 <t> 3733 A PUT operation interface may be exposed as: 3734 </t> 3735 <t> 3736 <tt>PUT(Key, Block-Type, Block-Expiration, Block-Data)</tt> 3737 </t> 3738 <t> 3739 The procedure typically takes at least four parameters: 3740 </t> 3741 <dl> 3742 <dt>Key:</dt> 3743 <dd>is the key under which to store the block.</dd> 3744 <dt>Block-Type:</dt> 3745 <dd>is the type of the block to store.</dd> 3746 <dt>Block-Expiration:</dt> 3747 <dd>specifies when the block should expire.</dd> 3748 <dt>Block-Data:</dt> 3749 <dd>is the application-specific payload of the block to store.</dd> 3750 </dl> 3751 <t> 3752 The PUT procedure may accept additional optional parameters that 3753 control or modify the operation: 3754 </t> 3755 <dl> 3756 <dt>Replication-Level:</dt> 3757 <dd> 3758 is an integer which controls how many nearest peers the request 3759 should reach. 3760 </dd> 3761 <dt>Flags:</dt> 3762 <dd> 3763 is a bit-vector which indicates certain processing 3764 requirements for messages. Any combination of flags as 3765 defined in <xref target="route_flags"/> may be specified. 3766 </dd> 3767 </dl> 3768 <t> 3769 The PUT procedure does not necessarily yield any information. 3770 </t> 3771 </section> 3772 </section> 3773 <section anchor="hello_url"> 3774 <name>HELLO URLs</name> 3775 <t> 3776 The general format of a <tt>HELLO</tt> URL uses "gnunet://" 3777 as the scheme, followed by "hello/" for the name 3778 of the GNUnet subsystem, followed by "/"-separated values 3779 with the GNS Base32 encoding <xref target="RFC9498"/> of 3780 the peer public key, a Base32-encoded EdDSA signature 3781 <xref target="ed25519"/>, and an expiration 3782 time in seconds since the UNIX Epoch in decimal format. 3783 After this a "?" begins a list of key-value pairs where the key 3784 is the URI scheme of one of the peer's addresses and the value 3785 is the URL-escaped payload of the address URI without the "://". 3786 </t> 3787 <t> 3788 The general syntax of <tt>HELLO</tt> URLs specified using 3789 Augmented Backus-Naur Form (ABNF) of <xref target="RFC5234"/> is: 3790 </t> 3791 <figure> 3792 <artwork type="abnf"><![CDATA[ 3793 hello-URL = "gnunet://hello[:version]/" meta [ "?" addrs ] 3794 version = *(DIGIT) 3795 meta = pid "/" sig "/" exp 3796 pid = *bchar 3797 sig = *bchar 3798 exp = *DIGIT 3799 addrs = addr *( "&" addr ) 3800 addr = addr-name "=" addr-value 3801 addr-name = scheme 3802 addr-value = *pchar 3803 bchar = *(ALPHA / DIGIT) 3804 ]]> 3805 </artwork> 3806 </figure> 3807 <t> 3808 'scheme' is defined in <xref target="RFC3986" /> in Section 3.1. 3809 'pchar' is defined in <xref target="RFC3986" />, Appendix A. 3810 </t> 3811 <t> 3812 For example, consider the following URL: 3813 </t> 3814 <figure> 3815 <artwork type="abnf"><![CDATA[ 3816 gnunet://hello/1MVZC83SFHXMADVJ5F4 3817 S7BSM7CCGFNVJ1SMQPGW9Z7ZQBZ689ECG/ 3818 CFJD9SY1NY5VM9X8RC5G2X2TAA7BCVCE16 3819 726H4JEGTAEB26JNCZKDHBPSN5JD3D60J5 3820 GJMHFJ5YGRGY4EYBP0E2FJJ3KFEYN6HYM0G/ 3821 1708333757?foo=example.com&bar+baz=1.2.3.4%3A5678%2Ffoo 3822 ]]> 3823 </artwork> 3824 </figure> 3825 <t> 3826 It specifies that the peer with the <tt>pid</tt> "1MVZ..." 3827 is reachable via "foo" at "example.com" and "bar+baz" at 3828 "1.2.3.4" on port 5678 until 3829 1708333757 seconds after the Epoch. Note that "foo" 3830 and "bar+baz" here are underspecified and just used as a simple example. 3831 In practice, the <tt>addr-name</tt> refers to a scheme supported by a 3832 DHT underlay. 3833 </t> 3834 </section> 3835 </back> 3836 </rfc>