lsd0004

LSD0004: R5N Distributed Hash Table
Log | Files | Refs

commit eceaa0466a9fcfe901aecae558562d5fa4a8bc68
parent d561f4071670e44eeeba191d84d2555c7d9970e6
Author: Christian Grothoff <grothoff@gnunet.org>
Date:   Fri, 11 Mar 2022 07:52:58 +0100

-work on message processing

Diffstat:
Mdraft-schanzen-r5n.xml | 186+++++++++++++++++++++++++++++++++++++++++++++++++------------------------------
1 file changed, 116 insertions(+), 70 deletions(-)

diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml @@ -478,7 +478,7 @@ Connectivity | |Underlay| |Underlay| format in order to facilitate the distribution of connectivity information to other peers in the DHT overlay. This format is the "HELLO" Block (described in <xref target="hello_block"/>), - which contains URIs. The schema of each URI indicates which underlay understands the + which contains URIs. The scheme of each URI indicates which underlay understands the respective address given in the rest of the URI. </t> <!-- @@ -679,7 +679,7 @@ Connectivity | |Underlay| |Underlay| the <tt>Peer ID</tt>, a Base32-encoded EdDSA signature, and an expiration time in seconds since the UNIX Epoch in decimal format. After this a "?" begins a list of key-value pairs where the key - is the URI schema of one of the peer's addresses and the value + is the URI scheme of one of the peer's addresses and the value is the URL-escaped payload of the address URI without the "://". </t> <t> @@ -1110,14 +1110,17 @@ bchar = *(ALPHA / DIGIT) a peer about the sender's available addresses. The recipients use these messages to inform their respective Underlays about ways to sustain the connections and to - generate HELLO blocks to answer peer discovery queries - from other peers. In contrast to HELLO blocks, a - <tt>HelloMessage</tt> does not contain the peer ID of - the peer the address information is about, as a - <tt>HelloMessage</tt> is always about the sender. As + generate HELLO blocks (see <xref target="hello_block"/>) + to answer peer discovery queries + from other peers. In contrast to a HELLO block, a + <tt>HelloMessage</tt> does not contain the ID of + the peer the address information is about: in a + <tt>HelloMessage</tt>, the address information is always + about the sender. Since the Underlay is responsible to determine the - peer ID of the sender for all messages, it would be - redundant to also include it in the <tt>HelloMessage</tt>. + peer ID of the sender for all messages, it would thus be + redundant to also include the peer ID in the + <tt>HelloMessage</tt>. </t> <section anchor="p2p_hello_wire"> <name>Wire Format</name> @@ -1144,8 +1147,7 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>MTYPE</dt> <dd> - is the 16-bit message type. This type can be one of the DHT message - types, but for HELLO messages it must be set to + is the 16-bit message type. It must be set to the value 157 in network byte order. </dd> <dt>RESERVED</dt> @@ -1178,16 +1180,16 @@ bchar = *(ALPHA / DIGIT) A sequence of exactly URL_CTR 0-terminated URIs in UTF-8 encoding representing addresses of this peer. Each URI must begin with a non-empty - URI schema terminated by "://" and followed by some - non-empty Underlay-specific address encoding. + URI scheme terminated by "://" and followed by some + non-empty Underlay- and scheme-specific address encoding. </dd> </dl> </section> <section anchor="p2p_hello_processing"> <name>Processing</name> <t> - Upon receiving a <tt>HelloMessage</tt> from a peer <tt>P</tt>. - An implementation <bcp14>MUST</bcp14> process it step by step as follows: + Upon receiving a <tt>HelloMessage</tt> from a peer <tt>P</tt> + an implementation <bcp14>MUST</bcp14> process it step by step as follows: </t> <ol> <li> @@ -1201,10 +1203,19 @@ bchar = *(ALPHA / DIGIT) The HELLO information is cached in the routing table until it expires, the peer is removed from the routing table, or the information is replaced by another message from the peer. </li> </ol> + <t> + The address information about P should then also be made + available to each respective Underlays to enable the + Underlay to maintain optimal connectivity to the + neighbour. + </t> </section> </section> <section anchor="p2p_put" numbered="true" toc="default"> <name>PutMessage</name> + <t> + <tt>PutMessage</tt>s are used to store information at other peers in the DHT. + </t> <section anchor="p2p_put_wire"> <name>Wire Format</name> <figure anchor="figure_putmsg" title="The PutMessage Wire Format."> @@ -1237,18 +1248,17 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>MTYPE</dt> <dd> - is the 16-bit message type. This type can be one of the DHT message - types but for put messages it must be set to + is the 16-bit message type. It must be set to the value 146 in network byte order. </dd> <dt>BTYPE</dt> <dd> - is a 32-bit block type field. The block type indicates the content + is a 32-bit block type. The block type indicates the content type of the payload. In network byte order. </dd> <dt>FLAGS</dt> <dd> - is a 16-bit flags field (see below). + is a 16-bit vector with binary options (see <xref target="route_flags"/>). </dd> <dt>HOPCOUNT</dt> <dd> @@ -1275,7 +1285,7 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>PEER_BF</dt> <dd> - A Bloom filter (for peer addresses) to stop circular routes. + A peer Bloom filter to stop circular routes (see <xref target="routing_bloomfilter"/>). </dd> <dt>BLOCK_KEY</dt> <dd> @@ -1290,15 +1300,16 @@ bchar = *(ALPHA / DIGIT) <dt>BLOCK</dt> <dd> the variable-length block payload. The contents are determined - by the BTYPE field. + by the BTYPE field. The length is determined by MSIZE minus + the size of all of the other fields. </dd> </dl> </section> <section anchor="p2p_put_processing"> <name>Processing</name> <t> - Upon receiving a <tt>PutMessage</tt> from a peer <tt>P</tt>. - An implementation <bcp14>MUST</bcp14> process it step by step as follows: + Upon receiving a <tt>PutMessage</tt> from a peer <tt>P</tt> + an implementation <bcp14>MUST</bcp14> process it step by step as follows: </t> <ol> <li> @@ -1312,9 +1323,10 @@ bchar = *(ALPHA / DIGIT) Else, the block <bcp14>MUST</bcp14> be validated as defined in (3). </li> <li> - The block payload of the message is evaluated using according + The block payload of the message is evaluated according to the <tt>BTYPE</tt> using the respective - <tt>ValidateBlockStoreRequest</tt> procedure. + <tt>ValidateBlockStoreRequest</tt> procedure + as described in <xref target="block_functions"/>. If the block payload is invalid or does not match the key, it <bcp14>MUST</bcp14> be discarded. </li> @@ -1334,22 +1346,30 @@ bchar = *(ALPHA / DIGIT) be stored locally in the block storage. </li> <li> - Given the value in <tt>REPL_LVL</tt>, the number of peers to - forward to <bcp14>MUST</bcp14> be calculated. If there is at least one - peers to forward to, the implementation <bcp14>SHOULD</bcp14> select up to this + Given the value in <tt>REPL_LVL</tt>, <tt>HOPCOUNT</tt> and the + result of <tt>IsClosestpeer(SELF, BLOCK_KEY)</tt> the number of peers to + forward to <bcp14>MUST</bcp14> be calculated. + <!-- FIXME: formula for calculation is where exactly? + Maybe add another routing function above and reference it here??? --> + The implementation <bcp14>SHOULD</bcp14> select up to this number of peers to forward the message to. The implementation <bcp14>MAY</bcp14> forward to fewer or no peers in order to handle resource constraints - such as bandwidth. + such as limited bandwidth. Finally, the local peer address <bcp14>MUST</bcp14> be added to the - <tt>PEER_BF</tt> of the forwarded message. - For all peers with peer address <tt>P</tt> chosen to forward the message - to, <tt>SEND(P, PutMessage)</tt> is called. + <tt>PEER_BF</tt> before forwarding the message. + For all peers with peer address <tt>P</tt> selected to forward the message + to, <tt>SEND(P, PutMessage')</tt> is called. Here, <tt>PutMessage'</tt> + is the original message with updated fields. In particular, <tt>HOPCOUNT</tt> + <bcp14>MUST</bcp14> be incremented by 1. </li> </ol> </section> </section> <section anchor="p2p_get" numbered="true" toc="default"> <name>GetMessage</name> + <t> + <tt>GetMessage</tt>s are used to request information from other peers in the DHT. + </t> <section anchor="p2p_get_wire"> <name>Wire Format</name> <figure anchor="figure_getmsg" title="The GetMessage Wire Format."> @@ -1392,7 +1412,7 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>FLAGS</dt> <dd> - is a 16-bit flags field (see below). + is a 16-bit vector with binary options (see <xref target="route_flags"/>). </dd> <dt>HOPCOUNT</dt> <dd> @@ -1411,15 +1431,12 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>PEER_BF</dt> <dd> - A Bloom filter (for peer identities) to stop circular routes. + A peer Bloom filter to stop circular routes (see <xref target="routing_bloomfilter"/>). </dd> <dt>QUERY_HASH</dt> <dd> - The query used to indicate which blocks the originator is looking - for in this GET request. - The value is commonly evaluated with respect to its XOR distance - to a given block key when it is considered as an answer to - the request. + The query used to indicate what the key is under which the originator is looking + for blocks with this request. The block type may use a different evaluation logic to determine applicable result blocks. </dd> @@ -1440,17 +1457,18 @@ bchar = *(ALPHA / DIGIT) <section anchor="result_bloomfilter"> <name>Result Bloom Filter</name> <t> - The result Bloom filter is used to indicate to a peer which results + The result Bloom filter is used to indicate to other peers which results are not of interest when processing a GET message (<xref target="p2p_get"/>). Any peer which is processing GET messages and has a result which matches the query key <bcp14>MUST</bcp14> check the result Bloom filter - and only send a reply message if the block key is not in the - Bloom filter set. <!-- FIXME: is this strictly always the block key? - I think this is block-type dependent! --> + and only send a reply message if the result does not test positive + under the result Bloom filter. Before forwarding the GET message, the + result Bloom filter <bcp14>MUST</bcp14> be updated to filter out all results + already returned by the local peer. </t> <t> - FIXME: say something about the size of the result Bloom filter here! + FIXME: say something about how to calculate the size of the result Bloom filter here! </t> <t> Bloom filters are probabilistic data structures. Thus, especially @@ -1472,11 +1490,18 @@ bchar = *(ALPHA / DIGIT) <t> By including the mutator value in the hashing process, repeated requests have statistically independent probabilities of creating - collisions in the Bloom filter. Thus, even if for one request - a Bloom filter collision may exclude a result as a false-positive + collisions in the result Bloom filter. Thus, even if for one request + a result Bloom filter collision may exclude a result as a false-positive match, subsequent requests are likely to not have the same false-positives. </t> + <t> + How exactly a block result is hashed into the result Bloom filter + together with the mutator depends on the block type. + For example, some block types may include full block + payload, certain parts of the block payload, or the block key + when hashing the block into the result Bloom filter. + </t> </section> <section anchor="p2p_get_processing"> <name>Processing</name> @@ -1519,31 +1544,41 @@ bchar = *(ALPHA / DIGIT) </li> <li> If <tt>FLAGS</tt> indicate a <tt>FindApproximate</tt> request, - the peer should respond with the closest block it + the peer <bcp14>SHOULD</bcp14> try to respond with the closest block it has that is not filtered by the - <tt>RESULT_BF</tt>. + <tt>RESULT_BF</tt>. However, implementations also <bcp14>MUST</bcp14> + avoid an exhaustive search of their database, as there could be + cases where too many local results are filtered by the result + Bloom filter. To avoid denial of service attacks, implementations + <bcp14>MUST</bcp14> thus ensure that the cost of evaluating any + such query is reasonably small. </li> <li> - Else, the peer should only respond if it has a block + Else, the peer <bcp14>MUST</bcp14> respond if it has a valid block that matches the key exactly and that is not filtered by the <tt>RESULT_BF</tt>. </li> - <li> - Any resulting block must be encapsulated in a - <tt>ResultMessage</tt> and transmitted to the - neighbor from which the request was received. - </li> </ol> - </li> - <li> - <!--FIXME: We only handle if not GNUNET_BLOCK_EVALUATION_OK_LAST.--> - This means that we must evaluate the Reply produced in the - previous step using <tt>ValidateBlockReply</tt> for this - <tt>BTYPE</tt> + <t> + Any such resulting block <bcp14>MUST</bcp14> be encapsulated in a + <tt>ResultMessage</tt> and <bcp14>SHOULD</bcp14> be transmitted to the + neighbor from which the request was received. Implementations + <bcp14>MAY</bcp14> drop messages if they are resource-constrained. + However, <tt>ResultMessage</tt>s <bcp14>SHOULD</bcp14> be given the + highest priority among competing transmissions. + </t> + <t> + If the BTYPE is supported and <tt>ValidateBlockReply</tt> for the given + query has yielded a status of <tt>OK_LAST</tt>, processing + <bcp14>MUST</bcp14> end and not continue with forwarding of + the request to other peers. + </t> </li> <li> Given the value in REPL_LVL, the number of peers to forward to - <bcp14>MUST</bcp14> be calculated (NUM-FORWARD-peerS). If there is at least one + <bcp14>MUST</bcp14> be calculated + (FIXME: as above, we need to describe here how to calulate NUM-FORWARD-peerS). + If there is at least one peer to forward to, the implementation <bcp14>SHOULD</bcp14> select up to this number of peers to forward the message to. The implementation <bcp14>MAY</bcp14> forward to fewer or no peers in order to handle resource constraints @@ -1551,13 +1586,18 @@ bchar = *(ALPHA / DIGIT) The message Bloom filter PEER_BF <bcp14>MUST</bcp14> be updated with the local peer address <tt>SELF</tt>. For all peers with peer address <tt>P'</tt> chosen to forward the message - to, <tt>SEND(P', PutMessage)</tt> is called. + to, <tt>SEND(P', GetMessage')</tt> is called. Here, <tt>GetMessage'</tt> + is the original message with updated fields. In particular, <tt>HOPCOUNT</tt> + <bcp14>MUST</bcp14> be incremented by 1. </li> </ol> </section> </section> <section anchor="p2p_result" numbered="true" toc="default"> <name>ResultMessage</name> + <t> + <tt>ResultMessage</tt>s are used to return information to other peers in the DHT. + </t> <section anchor="p2p_result_wire"> <name>Wire Format</name> <figure anchor="figure_resmsg" title="The ResultMessage Wire Format"> @@ -1566,7 +1606,7 @@ bchar = *(ALPHA / DIGIT) +-----+-----+-----+-----+-----+-----+-----+-----+ | MSIZE | MTYPE | BTYPE | +-----+-----+-----+-----+-----+-----+-----+-----+ -| // | FLAGS | PUTPATH_L | GETPATH_L | +| RESERVED | PUTPATH_L | GETPATH_L | +-----+-----+-----+-----+-----+-----+-----+-----+ | EXPIRATION | +-----+-----+-----+-----+-----+-----+-----+-----+ @@ -1592,13 +1632,15 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>MTYPE</dt> <dd> - is the 16-bit message type. This type can be one of the DHT message - types but for put messages it must be set to + is the 16-bit message type. It must be set to the value 148 in network byte order. </dd> - <dt>FLAGS</dt> + <dt>RESERVED</dt> <dd> - is a 16-bit flags field (see below). + is a 32-bit value. Implementations <bcp14>MUST</bcp14> + set this value to zero when originating a result message. + Implementations <bcp14>MUST</bcp14> forward + this value unchanged even if it is non-zero. </dd> <dt>BTYPE</dt> <dd> @@ -1608,13 +1650,13 @@ bchar = *(ALPHA / DIGIT) <dt>PUTPATH_L</dt> <dd> is a 16-bit number indicating the length of the PUT path recorded - in PUTPATH. As PUTPATH is optiona, this value may be zero. + in PUTPATH. As PUTPATH is optional, this value may be zero. In network byte order. </dd> <dt>GET_PATH_LEN</dt> <dd> is a 16-bit number indicating the length of the GET path recorded - in GETPATH. As PUTPATH is optiona, this value may be zero. + in GETPATH. As PUTPATH is optional, this value may be zero. In network byte order. </dd> <dt>EXPIRATION</dt> @@ -1948,6 +1990,10 @@ ip+udp://1.2.3.4:6789 \ gnunet+tcp://12.3.4.5/ \ ]]></artwork> </figure> + <t> + FIXME: describe here how the HELLO block + is hashed into the result Bloom filter! + </t> </section> </section> <section> @@ -2150,7 +2196,7 @@ Number| Name | Contact | References | Description Served", as described in <xref target="RFC8126"/>. GANA is requested to populate this registry as follows: </t> - <figure anchor="figure_gnunetschema" title="GNUnet schema Subregistry."> + <figure anchor="figure_gnunetscheme" title="GNUnet scheme Subregistry."> <artwork name="" type="" align="left" alt=""><![CDATA[ Nam | Contact | References | Description ---------------+---------+------------+-------------------------