lsd0004

LSD0004: R5N Distributed Hash Table
Log | Files | Refs

commit c9bf23dcf9511fadc82aefe814aa38770dfdf2cf
parent ac08d5159da23986383db8126d137633164ab4dc
Author: Christian Grothoff <grothoff@gnunet.org>
Date:   Sat, 20 Sep 2025 11:34:52 +0200

incorporate extensive feedback on English from Martin Guy

Diffstat:
Mdraft-schanzen-r5n.xml | 86+++++++++++++++++++++++++++++++++++++++----------------------------------------
1 file changed, 42 insertions(+), 44 deletions(-)

diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml @@ -151,10 +151,10 @@ communicating over the existing Internet. Hence canonical DHT designs often assume that the IP protocol provides the peers of the overlay with unrestricted end-to-end pairwise - connectivity. However, in practice firewalls and network + connectivity. However, in practice, firewalls and network address translation (NAT) <xref target="RFC2663"/> make it difficult for peers operating on consumer end-devices to - directly communicate, especially in the absence of core + communicate directly, especially in the absence of core network infrastructure enabling NAT traversal via protocols such as interactive connectivity establishment (ICE) <xref target="RFC5245"/>. </t> @@ -162,7 +162,7 @@ Furthermore, not all peer-to-peer networks consistently operate over the Internet, such as mobile ad-hoc networks (MANETs). While routing protocols have been designed for - such networks (<xref target="RFC3561"/>) these generally + such networks (<xref target="RFC3561"/>), these generally have issues with security in the presence of malicious participants, as they vulnerable to impersonation attacks. The usual solution to these issues is to assert that the @@ -170,7 +170,7 @@ authentication on all control messages. In contrast, the system model for R<sup>5</sup>N is that of an open network without any kind of authorities that could restrict access - only to trusted participants. + to trusted participants. </t> </section> <section numbered="true" toc="default" anchor="security_model"> @@ -185,36 +185,36 @@ </t> <t> Honest peers are expected to establish and maintain many - connections. We assume that as a result the adversary is + connections. We assume that, as a result, the adversary is generally unable to prevent honest peers from maintaining a sufficient number of direct connections with other honest peers to achieve acceptable performance. As the number of malicious peers and their connections increases, performance - of the system should gracefully degrade, and + of the system should gracefully degrade and only collapse for peers that an adversary has fully isolated from the benign network. </t> <t> - The main security objectives are to provide honest nodes + The main security objectives are to provide honest nodes with correct results and to limit the propagation of invalid data. Invalid data includes both invalid key-value pairs as - well as invalid routing path data if such routing meta-data - is present. While malicious nodes may make up arbitrary + well as invalid routing path data. + While malicious nodes may make up arbitrary key-value pairs and paths within the adversary's domain, invalid key-value pairs are ideally - discarded at the first honest node, and path data - honestly state entry- and exit-points - from the honest network into the subset of malicious nodes. + discarded at the first honest node and path data + honestly states entry- and exit-points + from the honest network and the subset of malicious nodes. </t> <t> Malicious nodes may attempt to exhaust the storage capacity of honest nodes by distributing well-formed - (but possibly otherwise useless) application data. We assume + but possibly otherwise useless application data. We assume that storage space is relatively cheap compared to bandwidth - and that honest nodes also frequently re-publish the useful + and that honest nodes also frequently republish the useful data that they publish. As a result, an adversary may reduce the effectiveness and longevity of - data cached in the DHT, but is assumed to not be able to + data cached in the DHT but is assumed to not be able to effectively prevent publication and retrieval of application data by honest nodes. </t> @@ -228,14 +228,14 @@ <t> An <em>address</em> is a UTF-8 <xref target="RFC3629"/> string which can be used to address a <em>peer</em> through the Underlay (<xref target="underlay"/>). - The format of an address is not enforced by this specification, + The format of an address is not enforced by this specification but it is expected that in most cases the address is a URI <xref target="RFC3986"/>. </t> </dd> <dt>Applications</dt> <dd> - <em>Applications</em> are higher-layer components which directly use the - <em>Core Operations</em>. + <em>Applications</em> are higher-layer components which use the + <em>Core Operations</em> directly. Possible <em>applications</em> include the GNU Name System <xref target="RFC9498"/> and the GNUnet Confidential Ad-hoc Decentralized End-to-End Transport (CADET) @@ -250,7 +250,7 @@ </dd> <dt>Block</dt> <dd> - Variable-size unit of payload stored in the DHT + A unit of payload of variable-size stored in the DHT under a <em>key</em>. In the context of &quot;key-value stores&quot; this refers to &quot;value&quot; stored under a <em>key</em>. @@ -279,7 +279,7 @@ <em>Bootstrapping</em> is the process of establishing a connection to the peer-to-peer network. It requires an initial, non-empty set of reachable <em>peers</em> and corresponding - <em>addresses</em> supported by the implementation to connect to. + <em>addresses</em> to connect to that are supported by the implementation. </dd> <dt>Initiator</dt> <dd> @@ -288,22 +288,20 @@ </dd> <dt>HELLO block</dt> <dd> - A <tt>HELLO block</tt> is a type of <em>block</em> that is used to store and retrieve <em>addresses</em> of a <em>peer</em>. + A type of <em>block</em> that is used to store and retrieve <em>addresses</em> of a <em>peer</em>. It is used by the peer discovery mechanism in <xref target="find_peer"/>. </dd> <dt>HELLO URL</dt> <dd> - <tt>HELLO</tt> URLs are <tt>HELLO</tt> blocks represented as URLs. + <tt>HELLO</tt> blocks represented as URLs. They are used for out-of-band exchanges of <em>peer</em> <em>addresses</em> and for signalling address updates to <em>neighbours</em>. - Implementation details of HELLO URLs and examples are found in <xref target="hello_url"/>. + Implementation details and examples are in <xref target="hello_url"/>. </dd> <dt>Key</dt> <dd> - 512-bit identifier of a location in the DHT. Multiple <tt>blocks</tt> can be + The 512-bit identifier of a location in the DHT. Multiple <tt>blocks</tt> can be stored under the same <em>key</em>. A <em>peer identity</em> is also a <tt>key</tt>. - In the context of &quot;key-value stores&quot; this - refers to &quot;key&quot; under which <em>blocks</em> are stored. </dd> <dt>Message Processing</dt> <dd> @@ -322,30 +320,30 @@ of the DHT protocol. Each participating host is responsible for holding some portion of the data that has been stored in the overlay, and they are responsible for routing - messages on behalf of other <em>peers</em> as needed by the <em>routing + messages on behalf of other <em>peers</em> by following the <em>routing algorithm</em>. </dd> <dt>Peer Identity</dt> <dd> - The <em>peer identity</em> is the identifier used on the overlay + The identifier used on the overlay to identify a <em>peer</em>. It is a SHA-512 hash of the <em>peer public key</em>. </dd> <dt>Peer Public Key</dt> <dd> - The <em>peer public key</em> is the key used to authenticate + The the key used to authenticate a <em>peer</em> in the underlay. </dd> <dt>Routing</dt> <dd> The <em>routing</em> component includes the routing table as well as - routing and <em>peer</em> selection logic. It facilitates the R<sup>5</sup>N routing - algorithm with required data structures and algorithms. + routing and <em>peer</em> selection logic. It facilitates R<sup>5</sup>N routing + by provoding the required data structures and algorithms. </dd> <dt>Underlay Interface</dt> <dd> The <em>underlay interface</em> is an abstraction layer on top of the supported links of a <em>peer</em>. Peers may be linked by a variety of - different transports, including "classical" protocols such as + different transports including "classical" protocols such as TCP, UDP and TLS or higher-layer protocols such as GNUnet, I2P or Tor. <!-- FIXME: add references to GNUnet/I2P/Tor here! --> </dd> @@ -357,39 +355,39 @@ <name>Restricted-route topologies</name> <t> Restricted-route topologies emerge when a connected underlay - topology prevents (or restricts) direct connections between + topology prevents or restricts direct connections between some of the nodes. This commonly occurs through the use of NAT (<xref target="RFC2663"/>). Nodes operated behind a NAT cause common DHT routing algorithms such as Kademlia <xref target="Kademlia"/> to exhibit degraded performance or even to fail. While excluding such nodes is an option, this limits load distribution and is ineffective for some - networks, such as MANETs. + networks such as MANETs. </t> <t> - In general, nodes may not be mutually reachable (for example - due to a firewall or NAT) despite being "neighbours" + In general, nodes may not be able to reach each other (for example + due to a firewall or NAT) despite being <em>neighbours</em> according to the routing table construction algorithm of a particular DHT. For example, Kademlia uses the XOR metric and would generally connect nodes that have peer identities with a small XOR distance. However, the XOR distance between - (basically randomly assigned) peer identities is completely + randomly assigned peer identities is completely divorced from the ability of the nodes to directly communicate. DHTs usually use greedy routing to store data at the peer(s) closest to the key. In cases where a DHT cannot connect peers according to the construction rules of its routing algorithm, the topology may end up with - multiple (local) minima for a given key. Using canonical + multiple local minima for a given key. Using canonical greedy routing from a particular fixed location in the - network, a node may then only be able to publish and + network, a node may only be able to publish and retrieve data in the proximity of its local minima. </t> <t> - R<sup>5</sup>N addresses this problem by prepending a random - walk before a classical, deterministic XOR-based routing - algorithm is employed. The optimal number of random hops - taken is equal to the mixing time of the graph. The mixing - time for various graphs is well known; for small-world + R<sup>5</sup>N addresses this problem by doing a random + walk before a classical, deterministic XOR-based routing. + The optimal number of random hops + is equal to the mixing time of the graph, which is well known + for various graphs. For small-world networks <xref target="Smallworld"/>, the mixing time has been shown to be around <tt>O(log n)</tt> where <tt>n</tt> is the number of nodes in the network