lsd0004

LSD0004: R5N Distributed Hash Table
Log | Files | Refs

commit 2342ff7cd30d1a3bad42fa2d22b2bf6c9098c16a
parent 644fc3967068c0ef2c9f23ec8c3e93f73eb618f7
Author: Christian Grothoff <grothoff@gnunet.org>
Date:   Fri, 11 Mar 2022 12:47:19 +0100

revise block API

Diffstat:
Mdraft-schanzen-r5n.xml | 281++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
1 file changed, 158 insertions(+), 123 deletions(-)

diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml @@ -1569,7 +1569,7 @@ bchar = *(ALPHA / DIGIT) </t> <t> If the BTYPE is supported and <tt>ValidateBlockReply</tt> for the given - query has yielded a status of <tt>OK_LAST</tt>, processing + query has yielded a status of <tt>FILTER_LAST</tt>, processing <bcp14>MUST</bcp14> end and not continue with forwarding of the request to other peers. </t> @@ -1757,102 +1757,107 @@ bchar = *(ALPHA / DIGIT) </section> </section> <section anchor="blockstorage" numbered="true" toc="default"> - <name>Block Storage</name> - <section> - <name>Blocks</name> - <t> - Applications can and should define their own block types. - The block type determines the format and handling of the block - payload by peers in PUT and RESULT messages. - Block types <bcp14>MUST</bcp14> be registered with GANA <xref target="gana"/>. - </t> - <section> - <name>Block Processing</name> - <t> - Block validation may be necessary for both request as well as - reply messages. - When evaluating request messages and their metadata, the possible - evaluation results are: - </t> - <!--<t>RequestEvaluationResult</t>--> - <dl> - <dt>REQUEST_VALID</dt> - <dd>Query is valid, no reply given.</dd> - <dt>REQUEST_INVALID</dt> - <dd> - Query format does not match block type. For example, XQuery not - given or of size of XQuery is not appropriate for type. - </dd> - </dl> - <t> - When evaluating result messages, the possible evaluation results - are: - </t> - <!-->t>ReplyEvaluationResult</t>--> - <!-- FIXME this is not integrated into the message processing - or block processing. We need to specifiy where this is used. - My guess is that we should handle result types in the message - processing descriptions --> - <dl> - <dt>OK_MORE</dt> - <dd>Valid result, and there may be more.</dd> - <dt>OK_LAST</dt> - <dd>Last possible valid result.</dd> - <dt>OK_DUPLICATE</dt> - <dd>Valid result, but duplicate.</dd> - <dt>RESULT_INVALID</dt> - <dd>Invalid result. Block does not match query. Value = 4.</dd> - <dt>RESULT_IRRELEVANT</dt> - <dd>Block does not match XQuery. Valid result, but not relevant for the request.</dd> - </dl> - </section> - <section anchor="block_functions"> - <name>Block Functions</name> + <name>Blocks</name> + <t> + This section describes various considerations R<sup>5</sup>N + implementations must consider with respect to blocks. + Specifically, implementations <bcp14>SHOULD</bcp14> be able + to validate and persist blocks. Implementations + <bcp14>MAY</bcp14> not support validation for all types + of blocks. On some devices, storing blocks <bcp14>MAY</bcp14> + also be impossible due to lack of storage capacity. + </t> + <t> + Applications can and should define their own block types. + The block type determines the format and handling of the block + payload by peers in PUT and RESULT messages. + Block types <bcp14>MUST</bcp14> be registered with GANA + (see <xref target="gana_block_type"/>). + </t> + <t> + </t> + <section anchor="block_functions"> + <name>Block Operations</name> <t> - Any block type implementation <bcp14>MUST</bcp14> implement the following functions. + Block validation may be necessary for all types of DHT + messages. To enable these validations, any block type + specification <bcp14>MUST</bcp14> define the following functions: </t> <dl> <dt>ValidateBlockQuery(Key, XQuery) -&gt; RequestEvaluationResult</dt> <dd> - is used to evaluate the request for a block. It is used as part of - <tt>GetMessage</tt> processing, where the block payload is still unkown, - but the block <tt>XQuery</tt> <!--(FIXME: Undefined here)--> - and <tt>Key</tt> can and - <bcp14>MUST</bcp14> be verified, if possible. - </dd> - <dt>ValidateBlockStoreRequest(Block, Key) - -&gt; RequestEvaluationResult</dt> - <dd> - is used to evaluate a block including its key and payload. - It is used as part of <tt>PutMessage</tt> processing. - The validation <bcp14>MUST</bcp14> include a check of the block payload against the - <tt>Key</tt> under which it is requested to be stored. + <t> + is used to evaluate the request for a block as part of + <tt>GetMessage</tt> processing. Here, the block payload is unkown, + but if possible the <tt>XQuery</tt> and <tt>Key</tt> + <bcp14>SHOULD</bcp14> be verified. Possible values for + the <tt>RequestEvaluationResult</tt> are: + </t> + <dl> + <dt>REQUEST_VALID</dt> + <dd>Query is valid.</dd> + <dt>REQUEST_INVALID</dt> + <dd> + Query format does not match block type. For example, a + mandatory XQuery was not provided, or of the size of + the XQuery is not appropriate for the block type. + </dd> + </dl> </dd> - <dt>ValidateBlockReply(Block, XQuery, Key) - -&gt; ReplyEvaluationResult</dt> + <dt>DeriveBlockKey(Block) -&gt; Key | NONE</dt> <dd> - is used to evaluate a block including its <tt>Key</tt> and payload. - It is used as part <tt>ResultMessage</tt> processing. - The validation of the respective <tt>Block</tt> requires a pending local query - or a previously routed request of another peer and its associated - XQuery data and Key. - The validation <bcp14>MUST</bcp14> include a check of the block payload against the - key under which it is requested to be stored. + is used to synthesize the block key from the block payload as + part of <tt>PutMessage</tt> and <tt>ResultMessage</tt> processing. + A return value of "NONE" implies that this block type does not + permit deriving the key from the block. A Key may be returned + for a block that is ill-formed. </dd> - <dt>DeriveBlockKey(Block) -&gt; Key | None | Fail</dt> + <dt>ValidateBlockStoreRequest(Block, Key) + -&gt; BlockEvaluationResult</dt> <dd> - is used to synthesize the block key from the block payload - and metadata. It is used as part of FIND-peer message processing. + <t> + is used to evaluate a block including its key and payload + as part of <tt>PutMessage</tt> and <tt>ResultMessage</tt> processing. + Possible values for the <tt>BlockEvaluationResult</tt> are: + </t> + <dl> + <dt>BLOCK_VALID</dt> + <dd>Block is valid.</dd> + <dt>BLOCK_INVALID</dt> + <dd>Block payload does not match the block type. + </dd> + </dl> </dd> - <dt>FilterResult(Block, XQuery, Key) -&gt; ReplyEvaluationResult</dt> + <dt>FilterResult(Block, Key, RBF, XQuery) -&gt; (FilterEvaluationResult, RBF')</dt> <dd> - is used to filter results stored in the local block storage for local - queries. Locally stored blocks from previously observed - <tt>ResultMessages</tt> and <tt>PutMessages</tt> <bcp14>MAY</bcp14> use this - function instead of <tt>ValidateBlockReply</tt> in order to - avoid revalidation of the block and only perform filtering based on - request parameters. + <t> + is used to filter results against specific queries. This function + does not check the validity of Block itself or that it matches the given key, + as this must have been checked earlier. + Thus, locally stored blocks from previously observed + <tt>ResultMessages</tt> and <tt>PutMessages</tt> use this + function to perform filtering based on the request parameters + of a particular GET operation. + Possible values for the <tt>FilterEvaluationResult</tt> are: + </t> + <dl> + <dt>FILTER_MORE</dt> + <dd>Valid result, and there may be more.</dd> + <dt>FILTER_LAST</dt> + <dd>Last possible valid result.</dd> + <dt>FILTER_DUPLICATE</dt> + <dd>Valid result, but duplicate (was filtered by the result Bloom filter).</dd> + <dt>FILTER_IRRELEVANT</dt> + <dd>Block does not satisfy the constraints imposed by the XQuery.</dd> + </dl> + <t> + If the main evaluation result is <tt>FILTER_MORE</tt>, the function also returns + and updated result Bloom filter where the block is added to the set of + filtered replies. An implementation is not expected to actually differenciate + between the <tt>FILTER_DUPLICATE</tt> and <tt>FILTER_IRRELEVANT</tt> return + values: in both cases the block is ignored for this query. + </t> </dd> </dl> </section> @@ -1861,14 +1866,16 @@ bchar = *(ALPHA / DIGIT) <t> For bootstrapping and peer discovery, the DHT implementation uses its own block type called "HELLO". A block with this block type - contains the peerID of the peer initiating the GET request. + contains the peer ID of the peer that published the "HELLO" together + with a set of addresses of this peer. The key of a HELLO block + is the SHA-512 of the peer ID and thus the peer's address in the DHT. </t> <t> The HELLO block type wire format is illustrated in <xref target="figure_hello"/>. A query for block of type HELLO <bcp14>MUST NOT</bcp14> include extended query data (XQuery). Any implementation - encountering a HELLO block with XQuery data <bcp14>MUST</bcp14> consider the - block invalid and ignore it. + encountering a request for a HELLO with non-empty XQuery + data <bcp14>MUST</bcp14> consider the request invalid and ignore it. </t> <figure anchor="figure_hello" title="The HELLO Block Format."> <artwork name="" type="" align="left" alt=""><![CDATA[ @@ -1900,10 +1907,6 @@ bchar = *(ALPHA / DIGIT) <dd> is the Peer-ID of the peer which has generated this HELLO. </dd> - <dt>SIGNATURE</dt> - <dd> - is the signature of the HELLO. - </dd> <dt>EXPIRATION</dt> <dd> denotes the absolute 64-bit expiration date of the HELLO. @@ -1912,14 +1915,30 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>ADDRESSES</dt> <dd> + <t> is a list of UTF-8 <xref target="RFC3629"/> URIs <xref target="RFC3986"/> which can be used as addresses to contact the peer. The strings <bcp14>MUST</bcp14> be 0-terminated. + The set of URIs MAY be empty. + An example of an addressing scheme used throughout + this document is "r5n+ip+tcp", which refers to a standard TCP/IP socket + connection. The "hier"-part of the URI must provide a suitable + address for the given addressing scheme. + The following is a non-normative example of address strings: + </t> + <figure title="Example Address URIs."> + <artwork name="" type="" align="left" alt=""><![CDATA[ +r5n+ip+udp://1.2.3.4:6789 \ +gnunet+tcp://12.3.4.5/ \ + ]]></artwork> + </figure> </dd> - </dl> - <t> - The SIGNATURE covers a 64-bit pseudo header + <dt>SIGNATURE</dt> + <dd> + <t> + is the signature of the HELLO. + It covers a 64-bit pseudo header conceptually prefixed to the block. The pseudo header includes the expiration, signature purpose and a hash over the addresses. The wire format is illustrated @@ -1964,40 +1983,41 @@ bchar = *(ALPHA / DIGIT) </dd> <dt>H_ADDRS</dt> <dd> - a hash over the addresses in the HELLO. + a SHA-512 hash over the addresses in the HELLO. + H_ADDRS is generated over the ADDRESSES field + as provided in the HELLO block using SHA-512 <xref target="RFC4634"/>. </dd> </dl> - <t> - H_ADDRS is generated over the ADDRESSES field - as provided in the HELLO block using SHA-512 <xref target="RFC4634"/>. + </dd> + </dl> + <t> + To validate a block query for a HELLO is to simply check that the XQuery is empty. </t> - <t> - A HELLO reply block <bcp14>MAY</bcp14> be empty. Otherwise, it contains the - HELLO of a peer. + <t> + To derive a block key for a HELLO is to simply + hash the peer ID from the HELLO. </t> - <t> - The <tt>ADDRESSES</tt> part of the <tt>HELLO</tt> indicate endpoints - which can be used by the Underlay in order to establish a connection - with the peer identified by <tt>Peer-ID</tt>. - An example of an addressing scheme used throughout - this document is "ip+tcp", which refers to a standard TCP/IP socket - connection. The "hier"-part of the URI must provide a suitable - address for the given addressing scheme. - The following is a non-normative example of address strings: + <t> + To validate a block store request is to verify + the EdDSA SIGNATURE over the hashed ADDRESSES + against the public key from the peer ID field. </t> - <figure title="Example Address URIs."> - <artwork name="" type="" align="left" alt=""><![CDATA[ -ip+udp://1.2.3.4:6789 \ -gnunet+tcp://12.3.4.5/ \ - ]]></artwork> - </figure> <t> - FIXME: describe here how the HELLO block - is hashed into the result Bloom filter! + To filter results of HELLO blocks + using the result Bloom filter, the + H_ADDRS field (as computed using SHA-512 over + the ADDRESSES) is concatenated with the mutator. + The resulting value is then hashed into the + result Bloom filter. Consequently, HELLOs with + completely identical sets of addresses will be + filtered, but any small variation in the set of + addresses will cause the block to no longer be + filtered (with high probability). The + function thus always returns either + <tt>FILTER_MORE</tt> or <tt>FILTER_DUPLICATE</tt>. </t> </section> - </section> - <section> + <section> <name>Persistence</name> <t> An implementation MUST provide a local persistence mechanism for @@ -2019,6 +2039,8 @@ gnunet+tcp://12.3.4.5/ \ any blocks under keys close to the specified key. </dd> </dl> + <section> + <name>Approximate Search Considerations</name> <t> Over time a peer may accumulate a significant number of blocks which are stored locally in the persistence layer. @@ -2059,8 +2081,9 @@ gnunet+tcp://12.3.4.5/ \ </t> <!-- FIXME the result set is then filtered again by the block plugin. But we should discuss this elsewhere --> + </section> <section> - <name>Caching Strategy</name> + <name>Caching Strategy Considerations</name> <t> An implementation <bcp14>MUST</bcp14> implement an eviction strategy for blocks stored in the block storage layer. @@ -2144,7 +2167,9 @@ gnunet+tcp://12.3.4.5/ \ </section> <section anchor="gana" numbered="true" toc="default"> - <name>GANA Considerations</name> + <name>GANA Considerations</name> + <section anchor="gana_block_type" numbered="true" toc="default"> + <name>Block Type Registry</name> <t> GANA <xref target="GANA"/> is requested to create a "DHT Block Types" registry. @@ -2176,7 +2201,10 @@ Number| Name | Contact | References | Description 11 GNS N/A GNS Block for storing record data ]]></artwork> </figure> + </section> + <section anchor="gana_gnunet_url" numbered="true" toc="default"> + <name>GNUnet URI schema Subregistry</name> <t> GANA <xref target="GANA"/> is requested to create a "gnunet://" sub-registry. @@ -2205,7 +2233,10 @@ HELLO N/A [This.I-D] How to contact a peer. ADDRESS N/A N/A Network address. ]]></artwork> </figure> + </section> + <section anchor="gana_signature_purpose" numbered="true" toc="default"> + <name>GNUnet Signature Purpose Registry</name> <t> GANA is requested to amend the "GNUnet Signature Purpose" registry as follows: @@ -2219,7 +2250,10 @@ Purpose | Name | References | Description ]]></artwork> </figure> + </section> + <section anchor="gana_message_type" numbered="true" toc="default"> + <name>GNUnet Message Type Registry</name> <t> GANA is requested to amend the "GNUnet Message Type" registry as follows: @@ -2235,6 +2269,7 @@ Type | Name | References | Description ]]></artwork> </figure> + </section> </section> <!-- gana --> <section>