lsd0004

LSD0004: R5N Distributed Hash Table
Log | Files | Refs

commit 309aadbf12f0a89eea88e7de2e74cf2077bdaee5
parent 73e4265cd7269b79643f9ddc7e32b10399db6e53
Author: Martin Schanzenbach <schanzen@gnunet.org>
Date:   Fri, 31 Dec 2021 13:40:14 +0100

peer vs node

Diffstat:
Mdraft-schanzen-r5n.xml | 240++++++++++++++++++++++++++++++++++++++++---------------------------------------
1 file changed, 121 insertions(+), 119 deletions(-)

diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml @@ -186,7 +186,7 @@ Connectivity | |Underlay| |Underlay| <dt>Block Storage</dt> <dd> The Block Storage component is used to persist and manage data - by peers. It includes logic for quotas, caching stragegies and + by nodes. It includes logic for quotas, caching stragegies and data validation. </dd> <dt>Message Processing</dt> @@ -197,13 +197,13 @@ Connectivity | |Underlay| |Underlay| <dt>Routing</dt> <dd> The Routing component includes the routing table as well as - routing and peer selection logic. It facilitates the R5N routing + routing and node selection logic. It facilitates the R5N routing algorithm with required data structures and algorithms. </dd> <dt>Underlay Interface</dt> <dd> The DHT Underlay Interface is an abstraction layer on top of the - supported links of a peer. Peers may be linked by a variety of + supported links of a node. Nodes may be linked by a variety of different transports, including "classical" protocols such as TCP, UDP and TLS or advanced protocols such as GNUnet, L2P or Tor. </dd> @@ -212,10 +212,12 @@ Connectivity | |Underlay| |Underlay| <section anchor="overlay" numbered="true" toc="default"> <name>Overlay</name> <t> - In the DHT overlay, a peer is addressable by its Peer ID. - The Peer ID is the 256-bit hash of the peer public key. - The peer public key is the public key of the corresponding - Ed25519<xref target="ed25519" /> peer private key. + In the DHT overlay, a node is addressable by its <tt>NodeID</tt>. + The <tt>NodeID</tt> is a 512-bit hash of the <tt>NodeKey</tt>. + <!-- FIXME should the node ID be agile? Should the signature then + also be agile?--> + <!--The node public key is the public key of the corresponding + Ed25519<xref target="ed25519" /> node private key.--> </t> <t> Any implementation of this specification MUST expose the two API @@ -266,16 +268,16 @@ GET(Key[, GetParams]) -> Results as List <dl anchor="route_options"> <dt>DemultiplexEverywhere</dt> <dd> - indicates that each peer along the way should process the request. + indicates that each node along the way should process the request. </dd> <dt>RecordRoute</dt> <dd> indicates to keep track of the route that the message takes in the P2P network. </dd> - <dt>FindPeer</dt> + <dt>FindNode</dt> <dd> - indicates that this is a request used to find additional peers. + indicates that this is a request used to find additional nodes. This is a special flag which modifies the message processing to allow approximate results. </dd> @@ -326,13 +328,13 @@ PUT(Key, Block, BlockType[, PutParams]) <section anchor="underlay" numbered="true" toc="default"> <name>Underlay</name> <t> - In the network underlay, a peer is addressable by traditional - means out of scope of this document. For example, the peer may + In the network underlay, a node is addressable by traditional + means out of scope of this document. For example, the node may have a TCP/IP address, or a HTTPS endpoint. While the specific addressing options and mechanisms are out of scope for this document, it is necessary to define a universal addressing format in order to facilitate the distribution of connectivity - information to other peers in the DHT overlay. + information to other nodes in the DHT overlay. This format is the "HELLO" message. </t> @@ -349,57 +351,57 @@ PUT(Key, Block, BlockType[, PutParams]) be the same keys??? 3) I think we may want to mandate that the lower layer at least -authenticate the other peer (i.e. every UDP message could be in +authenticate the other node (i.e. every UDP message could be in cleartext, but would need to come with an EdDSA signature, alas 92 byte overhead and a signature verification _required_). Otherwise, I don't -see how we can offer even the most minimal protections against peer +see how we can offer even the most minimal protections against node impersonation attacks. WDYT? Security considerations? Prerequisites? --> <t> It is expected that there are basic mechanisms available to - manage peer connectivity and addressing. + manage node connectivity and addressing. The required functionality are abstracted through the following procedures and events: </t> <dl> - <dt>PEER_CONNECTED(phash,address)</dt> + <dt>NODE_CONNECTED(NodeID, Address)</dt> <dd> - is a signal that allows the DHT to react to peers which connect. + is a signal that allows the DHT to react to nodes which connect. Such an event triggers, for example, updates in the routing table. </dd> - <dt>PEER_DISCONNECTED(phash,address)</dt> + <dt>NODE_DISCONNECTED(NodeID, Address)</dt> <dd> - is a signal that allows the DHT to react to peers which disconnect. + is a signal that allows the DHT to react to nodes which disconnect. Such an event triggers, for example, updates in the routing table. </dd> - <dt>TRY_CONNECT(pid, address)</dt> + <dt>TRY_CONNECT(NodeID, Address)</dt> <dd> - A function which allows a peer to attempt the establishment of - a connection to another peer using an address. + A function which allows a node to attempt the establishment of + a connection to another node using an address. </dd> - <dt>HOLD(pash)</dt> + <dt>HOLD(NodeID)</dt> <dd> A function which tells the underlay to keep a hold on the connection - to another peer. + to another node. </dd> - <dt>DROP(pash)</dt> + <dt>DROP(NodeID)</dt> <dd> A function which tells the underlay to drop the connection to another - peer. + node. </dd> - <dt>RECEIVE(source, message)</dt> + <dt>RECEIVE(NodeID, Message)</dt> <dd> - A function or event that allows the peer to receive protocol - messages as defined in this document from a connected peer. + A function or event that allows the node to receive protocol + messages as defined in this document from a connected node. </dd> - <dt>SEND(target, message)</dt> + <dt>SEND(NodeID, Message)</dt> <dd> - A function that allows a peer to send protocol messages as defined - in this document to a connected peer. If call to SEND fails, + A function that allows a node to send protocol messages as defined + in this document to a connected node. If call to SEND fails, the message has not been sent. </dd> <dt>NETWORK_SIZE_ESTIMATE(N)</dt> @@ -407,13 +409,13 @@ see how we can offer even the most minimal protections against peer A function or event that provides estimates on the network size for use in the DHT routing algorithms. </dd> - <dt>ADDRESS_ADD(pk, address)</dt> + <dt>ADDRESS_ADD(NodeID, address)</dt> <dd> The underlay signals us that an address was added. This information is used, for example, to publish connectivity as part of the bootstrapping and overlay creation. </dd> - <dt>ADDRESS_DELETE(pk, address)</dt> + <dt>ADDRESS_DELETE(NodeID, address)</dt> <dd> The underlay signals us that an address was removed. This information is used, for example, to publish @@ -450,46 +452,46 @@ see how we can offer even the most minimal protections against peer <section anchor="routing_table"> <name>Routing Table</name> <t> - R5N stores the information of all connected peers in a a set of lists + R5N stores the information of all connected nodes in a a set of lists similar to the k-buckets data structure of <xref target="Kademlia"/>. - The size of this set is determined by the number of connected peers. + The size of this set is determined by the number of connected nodes. A k-bucket contains a list of connected node IDs which share the same - bit prefix length with the local <tt>PeerID</tt>. + bit prefix length with the local <tt>nodeID</tt>. This number is called the <tt>index</tt> of the k-bucket. The index which determines in which of the k-buckets to add a given - peer is calculated using the <tt>FindBucket</tt> procedure. + node is calculated using the <tt>FindBucket</tt> procedure. </t> <t> - In order to select peers from the routing table which are suitable + In order to select nodes from the routing table which are suitable destinations for sending messages, R5N uses a hybrid approach: - Given an estimated network size N, the peer selection for the - first N hops is random. After the initial N hops, peer selection - follows an XOR-based peer distance calculation. + Given an estimated network size N, the node selection for the + first N hops is random. After the initial N hops, node selection + follows an XOR-based node distance calculation. </t> <t> As the message traverses a random path through the network for the first N hops, it is essential that routing loops are avoided. In R5N, a bloomfilter is used as part of the routing metadata in messages. The bloomfilter is updates at each hop with the hops - peer identity. + node identity. For the next hop selection in both the random and the deterministic - case, any peer which is in the bloomfilter for the respective message - is not included in the peer selection process. + case, any node which is in the bloomfilter for the respective message + is not included in the node selection process. </t> <!-- Fixme: We may want to propose our modified, optimized XOR metric here or reference Kademlia --> <t> The buckets serve implicitly as a routing table for messages by calculating the shortest distance from a message key to node ID. - In order to select a peer for a given message key and bloomfilter, - the <tt>PEER-SELECT</tt> is used (see <xref target="peer-select"/>. + In order to select a node for a given message key and bloomfilter, + the <tt>node-select</tt> is used (see <xref target="node-select"/>. </t> - <figure anchor="peer-select"> + <figure anchor="node-select"> <artwork name="" type="" align="left" alt=""><![CDATA[ -PEER-SELECT(key, bloomfilter) - peers := <Select all known peers NOT in message bloomfilter> +node-SELECT(key, bloomfilter) + nodes := <Select all known peers NOT in message bloomfilter> IF hops >= N dist := MAX_VALUE - FOR EACH p IN peers + FOR EACH p IN nodes IF XOR(p, key) < dist dist := XOR(p, key) target := p @@ -497,7 +499,7 @@ PEER-SELECT(key, bloomfilter) END ELSE r := rand() - target := peers[r] + target := nodes[r] END END ]]></artwork> @@ -507,36 +509,36 @@ END the R5N routing component MUST implement the following functions: </t> <dl> - <dt><tt>FindBucket(PeerID, Key) -> k-bucket as List</tt></dt> + <dt><tt>FindBucket(NodeID, Key) -> k-bucket as List</tt></dt> <dd> The <tt>FindBucket</tt> function determines how many low - order bits succesively match between a <tt>PeerID</tt> and a + order bits succesively match between a <tt>nodeID</tt> and a <tt>Key</tt> starting from the first bit. The procedure returns the k-bucket for this index. </dd> - <dt><tt>GetDistance(NodeKey_A, NodeKey_B) -> Distance as Integer</tt></dt> + <dt><tt>GetDistance(NodeID_A, NodeID_B) -> Distance as Integer</tt></dt> <dd> FIXME: We do NOT do XOR here. We do some kind of fancy calculation. See get_distance() </dd> - <dt><tt>SelectClosestPeer(Key) -> NodeID</tt></dt> + <dt><tt>SelectClosestNode(Key) -> NodeID</tt></dt> <dd> This function determines the closest node ID to <tt>Key</tt> of all connected nodes using <tt>GetDistance</tt>. FIXME: Also has a bloomfilter. Isn't AmClosestNode simply - !SelectClosestPeer == myID ? + !SelectClosestnode == myID ? </dd> - <dt><tt>SelectRandomPeer() -> NodeID</tt></dt> + <dt><tt>SelectRandomnode() -> NodeID</tt></dt> <dd> This function selects a random node ID from all connected nodes. FIXME find elegant way to handle bloomfilter </dd> - <dt><tt>SelectPeer(Key, NumberOfHops)</tt></dt> + <dt><tt>Selectnode(Key, NumberOfHops)</tt></dt> <dd> - This function selects a peer depending on the <tt>NumberOfHops</tt> + This function selects a node depending on the <tt>NumberOfHops</tt> Parameter. If <tt>NumberOfHops &lt; NETWORK_SIZE_ESTIMATE</tt> - this function returns <tt>SelectRandomPeer()</tt> and - <tt>SelectClosestPeer(Key)</tt> otherwise. + this function returns <tt>SelectRandomnode()</tt> and + <tt>SelectClosestnode(Key)</tt> otherwise. </dd> <dt><tt>AmClosestNode(NodeID, Key, Bloom) -> true | false</tt></dt> <dd> @@ -546,7 +548,7 @@ END If there is a node ID <tt>NodeID'</tt> in the k-bucket where <tt>GetDistance(NodeID, Key) > GetDistance(NodeID', Key)</tt>, then <tt>false</tt> is returned, otherwise <tt>true</tt>. - FIXME: Currently, GDS_am_closest_peer checks for longer matching + FIXME: Currently, GDS_am_closest_node checks for longer matching bits. Not GetDistance. Why? </dd> </dl> @@ -559,7 +561,7 @@ END <t> In order to prevent circular routes, GET and PUT messages contain a 128-bit Bloom filter (m=128). The Bloom filter is used to detect duplicate - peer IDs along the route. + node IDs along the route. A Bloom filter "bf" is initially empty, consisting only of zeroes. There are two functions which can be invoked on the Bloom filter: BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to added @@ -667,7 +669,7 @@ END </dd> <dt>BLOOMFILTER</dt> <dd> - A bloomfilter (for peer identities) to stop circular routes. + A bloomfilter (for node identities) to stop circular routes. </dd> <dt>KEY</dt> <dd> @@ -677,7 +679,7 @@ END <dt>PUTPATH</dt> <dd> the variable-length PUT path. - The path consists of a list of PATH_LEN peer IDs. + The path consists of a list of PATH_LEN node IDs. </dd> <dt>BLOCK</dt> <dd> @@ -689,7 +691,7 @@ END <section anchor="p2p_put_processing"> <name>Processing</name> <t> - Upon receiving a <tt>PutMessage</tt> from a connected peer. + Upon receiving a <tt>PutMessage</tt> from a connected node. An implementation MUST process it step by step as follows: </t> <ol> @@ -710,26 +712,26 @@ END it MUST be discarded. </li> <li> - The sender peer ID SHOULD be in the BLOOMFILTER. If not, the + The sender node ID SHOULD be in the BLOOMFILTER. If not, the implementation MAY log an error, but MUST continue. </li> <li> - If the <tt>RecordRoute</tt> flag is set in OPTIONS, the local peer ID + If the <tt>RecordRoute</tt> flag is set in OPTIONS, the local node ID MUST be appended to the <tt>PUTPATH</tt> of the message. </li> <li> - If the local peer is the closest peer (<tt>AM-CLOSEST-PEER</tt> is true) or the + If the local node is the closest node (<tt>AM-CLOSEST-NODE</tt> is true) or the <tt>DemultiplexEverywhere</tt> options flag ist set, the message MUST be stored locally in the block storage. </li> <li> - Given the value in REPL_LVL, the number of peers to forward to - MUST be calculated (NUM-FORWARD-PEERS). If there is at least one - peer to forward to, the implementation SHOULD select up to this - number of peers to forward the message to. The implementation MAY - forward to fewer or no peers in order to handle resource constraints + Given the value in REPL_LVL, the number of nodes to forward to + MUST be calculated (NUM-FORWARD-nodeS). If there is at least one + node to forward to, the implementation SHOULD select up to this + number of nodes to forward the message to. The implementation MAY + forward to fewer or no nodes in order to handle resource constraints such as bandwidth. - The message BLOOMFILTER MUST be updated with the local peer ID. + The message BLOOMFILTER MUST be updated with the local node ID. </li> </ol> </section> @@ -798,7 +800,7 @@ END </dd> <dt>BLOOMFILTER</dt> <dd> - A bloomfilter (for peer identities) to stop circular routes. + A bloomfilter (for node identities) to stop circular routes. </dd> <dt>KEY</dt> <dd> @@ -822,7 +824,7 @@ END <section anchor="p2p_get_processing"> <name>Processing</name> <t> - Upon receiving a <tt>GetMmessage</tt> from a connected peer an + Upon receiving a <tt>GetMmessage</tt> from a connected node an implementation MUST process it step by step as follows: </t> <ol> @@ -835,18 +837,18 @@ END the message MUST be discarded. </li> <li> - The sender peer ID SHOULD be in the BLOOMFILTER. If not, the + The sender node ID SHOULD be in the BLOOMFILTER. If not, the implementation MAY log an error, but MUST continue. </li> <li> <t> - If the local peer is the closest peer (AM-CLOSEST-PEER) or the + If the local node is the closest node (AM-CLOSEST-NODE) or the <tt>DemultiplexEverywhere</tt> options flag is set, a reply MUST be produced: </t> <ol> <li> - If OPTIONS indicate a <tt>FindPeer</tt> request, FIXME the peer selection + If OPTIONS indicate a <tt>Findnode</tt> request, FIXME the node selection foo from buckets that probably needs fixing. Take into account <tt>REPLY_BF</tt> </li> @@ -864,13 +866,13 @@ END <tt>BTYPE</tt> </li> <li> - Given the value in REPL_LVL, the number of peers to forward to - MUST be calculated (NUM-FORWARD-PEERS). If there is at least one - peer to forward to, the implementation SHOULD select up to this - number of peers to forward the message to. The implementation MAY - forward to fewer or no peers in order to handle resource constraints + Given the value in REPL_LVL, the number of nodes to forward to + MUST be calculated (NUM-FORWARD-nodeS). If there is at least one + node to forward to, the implementation SHOULD select up to this + number of nodes to forward the message to. The implementation MAY + forward to fewer or no nodes in order to handle resource constraints such as bandwidth. - The message BLOOMFILTER MUST be updated with the local peer ID. + The message BLOOMFILTER MUST be updated with the local node ID. </li> </ol> </section> @@ -950,12 +952,12 @@ END <dt>PUTPATH</dt> <dd> the variable-length PUT path. - The path consists of a list of PATH_LEN peer IDs. + The path consists of a list of PATH_LEN node IDs. </dd> <dt>GETPATH</dt> <dd> the variable-length PUT path. - The path consists of a list of PATH_LEN peer IDs. + The path consists of a list of PATH_LEN node IDs. </dd> <dt>BLOCK</dt> <dd> @@ -967,7 +969,7 @@ END <section anchor="p2p_result_processing"> <name>Processing</name> <t> - Upon receiving a <tt>ResultMessage</tt> from a connected peer. + Upon receiving a <tt>ResultMessage</tt> from a connected node. An implementation MUST process it step by step as follows: </t> <ol> @@ -979,12 +981,12 @@ END If the MTYPE of the message indicates a HELLO block, it must be validated according to <xref target="hello_block"/>. The payload MUST be considered for the local routing table by - trying to establish a connection to the peer using the information + trying to establish a connection to the node using the information from the HELLO block. If a connection can be established, - the peer is added to the <tt>KBuckets</tt> routing table. + the node is added to the <tt>KBuckets</tt> routing table. </li> <li> - If the sender peer of the message is already found in the + If the sender node of the message is already found in the GETPATH, the path MUST be truncated at this position. The implementation MAY log a warning in such a case. </li> @@ -1002,12 +1004,12 @@ END </li> <li> <t> - If requests by other peers for this KEY or BTYPE are known, + If requests by other nodes for this KEY or BTYPE are known, the result block is validated against each request using the respective <tt>ValidateBlockReply</tt> function. </t> <t> - If the request options include <tt>FindPeer</tt> and the result + If the request options include <tt>Findnode</tt> and the result message block type is HELLO the block validation must use the key derived using <tt>DeriveBlockKey</tt> as the key included in the request is only approximate. (FIXME: So we extract the key @@ -1020,7 +1022,7 @@ END </t> <t> For each pending request the reply is sent to the requesting - peer. + node. </t> </li> </ol> @@ -1080,7 +1082,7 @@ END is used to evaluate a block including its <tt>Key</tt> and payload. It is used as part <tt>ResultMessage</tt> processing. The validation of the respective <tt>Block</tt> requires a pending local query - or a previously routed request of another peer and its associated + or a previously routed request of another node and its associated XQuery data and Key. The validation MUST include a check of the block payload against the key under which it is requested to be stored. @@ -1088,7 +1090,7 @@ END <dt>DeriveBlockKey(Block) -> Key</dt> <dd> is used to synthesize the block key from the block payload - and metadata. It is used as part of FIND-PEER message processing. + and metadata. It is used as part of FIND-node message processing. </dd> <dt>FilterResult(Block, XQuery, Key) -> ReplyEvaluationResult</dt> <dd> @@ -1106,13 +1108,13 @@ END <t> Applications can and should define their own block types. The block type determines the format and handling of the block - payload by peers in PUT and RESULT messages. + payload by nodes in PUT and RESULT messages. Block types MUST be registered with GANA <xref target="gana"/>. </t> <t> - For bootstrapping and peer discovery, the DHT implementation uses + For bootstrapping and node discovery, the DHT implementation uses its own block type called "HELLO". A block with this block type - contains the peer ID of the peer initiating the GET request. + contains the NodeID of the node initiating the GET request. </t> <section anchor="hello_block"> <name>HELLO</name> @@ -1127,7 +1129,7 @@ END <artwork name="" type="" align="left" alt=""><![CDATA[ 0 8 16 24 32 40 48 56 +-----+-----+-----+-----+-----+-----+-----+-----+ -| TYPE | SIZE | NODEID / +| TYPE | SIZE | NODEKEY / +-----+-----+-----+-----+ (variable length) / / / +-----+-----+-----+-----+-----+-----+-----+-----+ @@ -1148,10 +1150,10 @@ END is the SIZE of the following fields NODEID and ADDRESSES in bytes. In network byte order. </dd> - <dt>NODEID</dt> + <dt>NODEKEY</dt> <dd> - is the Node ID of the peer which has generated this HELLO. - The length content of the NODEID is determined by the TYPE field. + is the NodeKey of the node which has generated this HELLO. + The length content of this field is determined by the TYPE. Usually, this is a cryptographic public key which allows the Underlay to uniquely identify and authenticate the node. </dd> @@ -1168,7 +1170,7 @@ END HELLO of a node. </t> <t> - For the string representation of the peer public key, + For the string representation of the node public key, the base-32 encoding "StringEncode" is used. However, instead of following <xref target="RFC4648"/> the character map is based on the optical character recognition friendly @@ -1179,7 +1181,7 @@ END <t> The <tt>ADDRESSES</tt> part of the <tt>HELLO</tt> indicate endpoints which can be used by the Underlay in order to establish a connection - with the node identified by <tt>NODEID</tt>. + with the node identified by <tt>NODEKEY</tt>. An example of an addressing scheme used throughout this document is "ip+tcp", which refers to a standard TCP/IP socket connection. The "hier"-part of the URI must provide a suitable @@ -1200,23 +1202,23 @@ tor+onionv3://rasdflkjasdfliasduf.onion/ <section> <name>Bootstrapping</name> <t> - It is assumed that the peer is already connected to at least - one other peer. - First, those initial peers are sorted into their respective buckets. + It is assumed that the node is already connected to at least + one other node. + First, those initial nodes are sorted into their respective buckets. </t> <t> - In order to find the closest peers in the network to itself, an + In order to find the closest nodes in the network to itself, an implementation MUST now periodically send HELLO GET queries for its own - peer ID. - Both the "record route" and "find peer" message options are set in the - GET queries in order to learn peers and network topology from the + node ID. + Both the "record route" and "find node" message options are set in the + GET queries in order to learn nodes and network topology from the message route and in order to receive approximate replies to the - query key (the peer ID). + query key (the node ID). </t> - <t>FIXME: Periodically -> more specific? No. Frequency may be adapted depending on network conditions, known peers, busy/idle etc.</t> + <t>FIXME: Periodically -> more specific? No. Frequency may be adapted depending on network conditions, known nodes, busy/idle etc.</t> <t> Any implementation encountering a HELLO GET request initially - sends its own peer ID if it. + sends its own node ID if it. </t> </section> <section anchor="security" numbered="true" toc="default"> @@ -1254,7 +1256,7 @@ Number | Name | Contact | References | Description -------+--------+---------+------------+------------------------- 0 ANY N/A [This.I-D] Reserved 7 HELLO N/A [This.I-D] Type of a block that contains -a HELLO for a peer +a HELLO for a node 11 GNS N/A GNS Block for storing record data ]]> </artwork>