The R5N Distributed Hash Table GNUnet e.V.
Boltzmannstrasse 3 Garching 85748 DE schanzen@gnunet.org
Berner Fachhochschule
Hoeheweg 80 Biel/Bienne 2501 CH grothoff@gnunet.org
GNUnet e.V.
Boltzmannstrasse 3 Garching 85748 DE fix@gnunet.org
General Independent Stream distributed hash tables This document contains the R5N DHT technical specification. This document defines the normative wire format of peer-to-peer messages, routing algorithms, cryptographic routines and security considerations for use by implementers. This specification was developed outside the IETF and does not have IETF consensus. It is published here to guide implementation of R5N and to ensure interoperability among implementations.
Introduction Distributed Hash Tables (DHTs) are a key data structure for the construction of completely decentralized applications. DHTs are important because they generally provide a robust and efficient means to distribute the storage and retrieval of key-value pairs. While already provides a peer-to-peer (P2P) signaling protocol with extensible routing and topology mechanisms, it also relies on strict admission control through the use of either centralized enrollment servers or pre-shared keys. Modern decentralized applications require a more open system that enables ad-hoc participation and other means to prevent common attacks on P2P overlays. This document contains the technical specification of the R5N DHT, a secure DHT routing algorithm and data structure for decentralized applications. R5N is an open P2P overlay routing mechanism which supports ad-hoc participation and security properties including support for topologies in restricted-route environments and path signatures. This document defines the normative wire format of peer-to-peer messages, routing algorithms, cryptographic routines and security considerations for use by implementers.
Requirements Notation The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.
Structure of This Document
  • Section X defines...
Terminology
Peer:
A host that is participating in the overlay. Peers are responsible for holding some portion of the data that has been stored in the overlay, and they are responsible for routing messages on behalf of other hosts as needed by the Routing Algorithm.
Peer Key:
The Peer Key is the identifier used on the Overlay to address a peer.
Peer ID:
The Peer ID is the identity which is used to authenticate a peer in the underlay. The Peer Address is derived from the Peer ID.
Neighbour:
A neighbour is a peer which is directly connected to our peer.
Block:
An object or group of objects stored in the DHT.
Block-Type:
A unique 32-bit value identifying a block type. Block-Types are either private or allocated by GANA (see ).
Block Storage
The Block Storage component is used to persist and manage data by peers. It includes logic for quotas, caching strategies and data validation.
Responsible Peer:
The peer N that is responsible for a specific resource K, as defined by the SelectClosestPeer(K, P) algorithm (see ).
Applications
Applications are components which directly use the DHT overlay interfaces. Possible applications include the GNU Name System or the CADET transport system .
Application API
The application API exposes the core operations of the DHT overlay to applications. This includes querying and retrieving data from the DHT.
Message Processing
The Message Processing component processes requests from and responses to applications as well as messages from the underlay network.
Routing
The Routing component includes the routing table as well as routing and peer selection logic. It facilitates the R5N routing algorithm with required data structures and algorithms.
Underlay Interface
The DHT Underlay Interface is an abstraction layer on top of the supported links of a peer. Peers may be linked by a variety of different transports, including "classical" protocols such as TCP, UDP and TLS or advanced protocols such as GNUnet, L2P or Tor.
Architecture R5N is an overlay network with a pluggable transport layer. The following figure shows the R5N architecture.
[Figure: R5N architecture. Recoverable labels: Routing; Underlay Interface; Connectivity (GNUnet Underlay Link, IP Underlay Link, ...).]
Other glossary
Application API In the DHT overlay, a peer is addressable by its Peer Address. The Peer Address is a SHA-512 hash of the Peer ID. The Peer ID is the public key of the corresponding Ed25519 peer private key. An implementation of this specification commonly exposes the two API procedures "GET" and "PUT". Non-normative examples of such APIs and their behaviour are detailed in the following subsections in order to give implementers a fuller picture of the protocol.
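The Peer Address derivation described above can be sketched in a few lines of Python. This is a non-normative illustration: the function name `peer_address` and the placeholder key bytes are assumptions made for the example, and the length check relies on the fact that Ed25519 public keys are 32 bytes long.

```python
import hashlib


def peer_address(peer_id: bytes) -> bytes:
    """Return the 64-byte Peer Address for a 32-byte Ed25519 public key."""
    if len(peer_id) != 32:
        raise ValueError("an Ed25519 public key is 32 bytes long")
    # The Peer Address is the SHA-512 hash of the Peer ID.
    return hashlib.sha512(peer_id).digest()


# Placeholder bytes standing in for a real public key (hypothetical value).
example_peer_id = bytes(range(32))
example_address = peer_address(example_peer_id)
```

The resulting 512-bit value is what the routing functions would compare against block keys using the XOR metric.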
The GET procedure A basic GET procedure may be exposed as: GET(Query-Key) -> Results as List The procedure requires at least a Query-Key to initiate a lookup:
Query-Key:
the key to look for in the DHT.
The procedure may allow a set of optional parameters in order to control or modify the query:
Block-Type:
the type of block to look for.
Replication-Level:
An integer which controls how many nearest peers the request should reach.
Route-Options:
Flags that are used in order to indicate certain processing requirements for messages. Any combination of options as defined in may be specified.
Extended-Query:
is extended query metadata which may be required depending on the respective Block-Type. A Block-Type must define if the XQuery can or must be used and what the specific format of its contents should be. See also .
Result-Filter:
allows the caller to indicate results which are no longer relevant (see ).
If the procedure is implemented synchronously, it may return a list of results. If it is implemented asynchronously, it may return individual results. A single result commonly consists of:
Block-Type:
the type of block in the result.
Block-Data:
the block payload. Contents are defined by the Block-Type.
Expiration:
the duration of validity of the result.
Key:
the key of the result. This may be different from the Query-Key, for example if a flag for approximate matches was set.
GET-Path:
is a signed path the query took through the network.
PUT-Path:
is a signed path the PUT-Request of this data took through the network.
The PUT procedure A PUT procedure may be exposed as: PUT(Key, Block) The procedure takes at least two parameters:
Key:
the key under which to store the block.
Block:
the block to store.
The procedure may allow a set of optional parameters in order to control or modify the request:
Block-Type:
the type of the block to store.
Replication-Level:
An integer which controls how many nearest peers the request should reach.
Route-Options:
Flags that are used in order to indicate certain processing requirements for messages. Any combination of options as defined in may be specified.
Block-Expiration:
is the requested expiration date for the block payload.
The procedure does not necessarily output any information.
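As a non-normative illustration of the GET and PUT procedures described above, the following Python sketch mirrors their parameters and the fields of a result. Everything here is an assumption made for the example: the class names `Result` and `LocalDht`, the use of a plain dictionary instead of any routing, and the modelling of the Result-Filter as a set of block payloads.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Result:
    block_type: int
    block_data: bytes
    expiration: int  # value passed to put(); units are left open here
    key: bytes
    get_path: List[bytes] = field(default_factory=list)
    put_path: List[bytes] = field(default_factory=list)


class LocalDht:
    """Toy stand-in: keeps blocks in a local dict instead of routing them."""

    def __init__(self) -> None:
        self._blocks: Dict[bytes, List[Result]] = {}

    def put(self, key: bytes, block: bytes, block_type: int = 0,
            replication_level: int = 1, route_options: int = 0,
            block_expiration: int = 0) -> None:
        # replication_level and route_options are accepted but unused,
        # because this toy never forwards anything.
        result = Result(block_type, block, block_expiration, key)
        self._blocks.setdefault(key, []).append(result)

    def get(self, query_key: bytes, block_type: Optional[int] = None,
            replication_level: int = 1, route_options: int = 0,
            extended_query: bytes = b"",
            result_filter: Optional[Set[bytes]] = None) -> List[Result]:
        # Synchronous variant: return all matching results as a list.
        matches = []
        for result in self._blocks.get(query_key, []):
            if block_type is not None and result.block_type != block_type:
                continue
            if result_filter is not None and result.block_data in result_filter:
                continue  # the caller already knows this result
            matches.append(result)
        return matches


dht = LocalDht()
dht.put(b"key", b"value", block_type=7)
found = dht.get(b"key")
```

An asynchronous variant would deliver each `Result` through a callback instead of returning a list.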
Underlay In the network underlay, a peer is addressable by traditional means out of scope of this document. For example, the peer may have a TCP/IP address, or a HTTPS endpoint. While the specific addressing options and mechanisms are out of scope for this document, it is necessary to define a universal addressing format in order to facilitate the distribution of connectivity information to other peers in the DHT overlay. This format is the "HELLO" message. It is expected that there are basic mechanisms available to manage peer connectivity and addressing. The required functionality is abstracted through the following procedures:
TRY_CONNECT(N, A)
A function which allows the local peer to attempt the establishment of a connection to another peer N using an address A. When the connection attempt is successful, information on the new peer is offered through the PEER_CONNECTED signal.
HOLD(P)
A function which tells the underlay to keep a hold on the connection to a peer P.
DROP(P)
A function which tells the underlay to drop the connection to a peer P.
M = RECEIVE(P)
A function or event that allows the local peer to receive a protocol message M as defined in this document from a peer P.
SEND(P, M)
A function that allows the local peer to send a protocol message M as defined in this document to a peer P. If a call to SEND fails, the message has not been sent.
S = ESTIMATE_NETWORK_SIZE()
A procedure that provides estimates on the network size S for use in the DHT routing algorithms.
In addition to the above procedures, which are meant to be actively executed by the implementation as part of the peer-to-peer protocol, the following callbacks or signals drive updates of the routing table:
PEER_CONNECTED -> P
is a signal that allows the DHT to react to a newly connected peer P. Such an event triggers, for example, updates in the routing table.
PEER_DISCONNECTED -> P
is a signal that allows the DHT to react to a recently disconnected peer. Such an event triggers, for example, updates in the routing table.
ADDRESS_ADDED -> A
The underlay signals us that an address A was added for our local peer. This information is used to advertise connectivity information about the local peer. A is a string suitable for inclusion in a HELLO payload.
ADDRESS_DELETED -> A
The underlay signals us that an address A was removed. This information is used, for example, to no longer advertise this address.
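The procedures and signals listed above could be modelled as an abstract interface plus callbacks. The following Python sketch is non-normative; the class names `Underlay` and `FakeUnderlay`, the boolean return value of `send`, and the in-memory behaviour are assumptions made for illustration. RECEIVE and the ADDRESS_ADDED and ADDRESS_DELETED signals are omitted for brevity.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Set, Tuple


class Underlay(ABC):
    """Procedures the DHT calls; names mirror the list above."""

    @abstractmethod
    def try_connect(self, peer: bytes, address: str) -> None: ...

    @abstractmethod
    def hold(self, peer: bytes) -> None: ...

    @abstractmethod
    def drop(self, peer: bytes) -> None: ...

    @abstractmethod
    def send(self, peer: bytes, message: bytes) -> bool: ...

    @abstractmethod
    def estimate_network_size(self) -> float: ...


class FakeUnderlay(Underlay):
    """In-memory stand-in in which every connection attempt succeeds."""

    def __init__(self, on_connected: Callable[[bytes], None],
                 on_disconnected: Callable[[bytes], None]) -> None:
        self.on_connected = on_connected
        self.on_disconnected = on_disconnected
        self.connected: Set[bytes] = set()
        self.sent: List[Tuple[bytes, bytes]] = []

    def try_connect(self, peer: bytes, address: str) -> None:
        self.connected.add(peer)
        self.on_connected(peer)  # PEER_CONNECTED signal

    def hold(self, peer: bytes) -> None:
        pass  # nothing to keep alive in this toy

    def drop(self, peer: bytes) -> None:
        if peer in self.connected:
            self.connected.remove(peer)
            self.on_disconnected(peer)  # PEER_DISCONNECTED signal

    def send(self, peer: bytes, message: bytes) -> bool:
        if peer not in self.connected:
            return False  # a failed SEND means the message was not sent
        self.sent.append((peer, message))
        return True

    def estimate_network_size(self) -> float:
        return float(len(self.connected) + 1)  # naive: neighbours plus self


# The routing table is reduced to a set of connected peers here.
routing_table: Set[bytes] = set()
underlay = FakeUnderlay(routing_table.add, routing_table.discard)
underlay.try_connect(b"peer-a", "ip+tcp://192.0.2.1:2086")
```

Wiring the signals directly to routing table updates reflects the statement above that PEER_CONNECTED and PEER_DISCONNECTED drive those updates.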
Bootstrapping Initially, the implementation depends upon either the Underlay to provide at least one initial connection to a peer signalled through PEER_CONNECTED, or the application/end-user providing at least one working HELLO to the DHT or the Underlay for bootstrapping. While details on how the first connection is established MAY depend on the specific implementation, this SHOULD usually be done by an out-of-band exchange of the information from a HELLO block. For this, section TBD specifies a URL format for encoding HELLO blocks as text strings which SHOULD be supported by implementations. Regardless of how the initial connections are established, the peers resulting from these initial connections are subsequently stored in the routing table component. Further, the Underlay must provide the implementation with one or more addresses signalled through ADDRESS_ADDED. The implementation then proceeds to periodically advertise all active addresses in a HELLO block. In order to find more close peers in the network, an implementation MUST now periodically send find peer messages. In both cases the frequency of advertisements and peer discovery MAY be adapted according to network conditions, connected peers, workload of the system and other factors which are at the discretion of the developer. Any implementation encountering a HELLO GET request initially sends its own peer address.
Routing
Peer Storage An R5N implementation must store the information on connected peers and update changes accordingly in a local persistence component such as a database. Upon receiving a connection notification from the Underlay through PEER_CONNECTED, information on the new peer is added to the local peer storage. When a disconnect is indicated by the Underlay through PEER_DISCONNECTED, the peer MUST be removed from the local peer storage. In order to achieve O(log n) routing performance, the data structure for managing connected peers and their metadata MUST be implemented using the k-buckets concept of as defined in .
Peer Bloom Filter The peer bloom filter is used to prevent circular routes. Any peer which is forwarding GET or PUT messages () adds its own peer ID to the message bloom filter. This allows other peers to look up next hops while excluding already traversed peers (). The bloom filter is a 128-bit field. It is initially empty, consisting only of zeroes. There are two functions which can be invoked on the Bloom filter: BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to be added to the Bloom filter or queried against the set. Any bloom filter uses k=16 different hash functions each of which is defined as follows:
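The normative definition of the hash functions is not reproduced in this text, so the following Python sketch of BF-SET and BF-TEST is illustrative only. The filter size (128 bits) and k=16 are taken from the paragraph above. The way the 16 bit positions are derived (16 consecutive 32-bit big-endian words of SHA-512 over the element, reduced modulo the filter size) is an assumption made purely for this example, as is the representation of the filter as a Python integer.

```python
import hashlib
from typing import Iterator

BF_BITS = 128  # filter size stated in the text
BF_K = 16      # number of hash functions stated in the text


def _bit_indices(element: bytes) -> Iterator[int]:
    # ASSUMPTION: positions come from 16 consecutive 32-bit big-endian
    # words of SHA-512(element), each reduced modulo the filter size.
    digest = hashlib.sha512(element).digest()
    for i in range(BF_K):
        word = int.from_bytes(digest[4 * i:4 * i + 4], "big")
        yield word % BF_BITS


def bf_set(bf: int, element: bytes) -> int:
    """Return the filter with all k positions for element set to 1."""
    for index in _bit_indices(element):
        bf |= 1 << index
    return bf


def bf_test(bf: int, element: bytes) -> bool:
    """True if element may be in the set, False if it is definitely not."""
    return all((bf >> index) & 1 for index in _bit_indices(element))


empty = 0  # initially empty, consisting only of zeroes
with_peer = bf_set(empty, b"peer-a")
```

As with any Bloom filter, BF-TEST can report false positives but never false negatives, so a peer found in the filter is skipped even if it was never actually traversed.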
Peer Discovery To build its routing table, a peer will send out requests asking for blocks of type HELLO using its own location as the key, but filtering its own HELLO via the Bloom filter. These requests MUST use the FindApproximate and DemultiplexEverywhere options. FindApproximate will ensure that other peers will reply with keys they merely consider close-enough, while DemultiplexEverywhere will cause each peer on the path to respond, which is likely to yield HELLOs of peers that are useful somewhere in the routing table. To facilitate peer discovery, each peer MUST broadcast its own HELLO data to all peers in the routing table periodically. Whenever a peer receives such a HELLO message from another peer, it must cache it as long as that peer is in its routing table (or until the HELLO expires) and serve it in response to Peer Discovery requests. Details about the format of the HELLO message are given in section p2p_hello_wire.
Routing Table In order to select peers which are suitable destinations for routing messages, R5N uses a hybrid approach: Given an estimated network size N, the peer selection for the first N hops is random. After the initial N hops, peer selection follows an XOR-based peer distance calculation. The routing table consists of an array of lists of connected peers. The i-th list stores neighbours whose identifiers are between distance 2^i and 2^(i+1) from the local peer. System constraints will typically force an implementation to impose some upper limit on the number of neighbours kept per k-bucket. Implementations SHOULD try to keep at least 5 entries per k-bucket. Embedded systems that cannot manage this number of connections MAY use connection-level signalling to indicate that they are merely a client utilizing a DHT and not able to participate in routing. DHT peers receiving such connections MUST NOT include connections to such restricted systems when making routing decisions. If a system hits constraints with respect to the number of active connections, an implementation MUST evict peers from those k-buckets with the largest number of peers. The eviction strategy MUST be to drop the shortest-lived connections first. As the message traverses a random path through the network for the first N hops, it is essential that routing loops are avoided. In R5N, a bloomfilter is used as part of the routing metadata in messages. The bloomfilter is updated at each hop with the hop's peer identity. For the next hop selection in both the random and the deterministic case, any peer which is in the bloomfilter for the respective message is not included in the peer selection process. The R5N routing component MUST implement the following functions:
GetDistance(A, B) -> Distance as Integer
This function calculates the binary XOR between A and B. The resulting distance is interpreted as an integer where the leftmost bit is the most significant bit.
SelectClosestPeer(K, B) -> N
This function selects the connected peer N from our routing table with the shortest XOR-distance to the key K. This means that for all other peers N' in the routing table GetDistance(N, K) < GetDistance(N', K). Peers in the bloomfilter B are not considered.
SelectRandomPeer(B) -> N
This function selects a random peer N from all connected peers. Peers in the bloomfilter B are not considered.
SelectPeer(K, H, B) -> N
This function selects a peer N depending on the number of hops H. If H < NETWORK_SIZE_ESTIMATE this function MUST return SelectRandomPeer(B), and SelectClosestPeer(K, B) otherwise. Peers in the bloomfilter B are not considered.
IsClosestPeer(N, K, B) -> true | false
This function checks if N is the closest peer for K (cf. SelectClosestPeer(K, B)). Peers in the bloomfilter B are not considered.
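The routing functions above, together with the k-bucket rule from the Routing Table text, can be sketched as follows. This is a non-normative Python illustration with several assumptions: the routing table is a flat list of peer addresses, the bloomfilter B is any container supporting `in` (a set in the example) instead of a real Bloom filter, `bucket_index` is a helper name invented here, and `is_closest_peer` reads "closest" as "no eligible peer in the routing table is strictly closer to K than N".

```python
import random
from typing import Collection, List, Optional


def get_distance(a: bytes, b: bytes) -> int:
    """XOR of two equal-length identifiers, leftmost bit most significant."""
    if len(a) != len(b):
        raise ValueError("identifiers must have the same length")
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def bucket_index(local: bytes, other: bytes) -> int:
    """Index i with 2^i <= distance < 2^(i+1)."""
    distance = get_distance(local, other)
    if distance == 0:
        raise ValueError("a peer has no bucket for itself")
    return distance.bit_length() - 1


def _eligible(peers: List[bytes], bloom: Collection[bytes]) -> List[bytes]:
    return [p for p in peers if p not in bloom]


def select_closest_peer(peers: List[bytes], k: bytes,
                        bloom: Collection[bytes] = ()) -> Optional[bytes]:
    candidates = _eligible(peers, bloom)
    if not candidates:
        return None
    return min(candidates, key=lambda p: get_distance(p, k))


def select_random_peer(peers: List[bytes],
                       bloom: Collection[bytes] = ()) -> Optional[bytes]:
    candidates = _eligible(peers, bloom)
    return random.choice(candidates) if candidates else None


def select_peer(peers: List[bytes], k: bytes, hops: int,
                network_size_estimate: float,
                bloom: Collection[bytes] = ()) -> Optional[bytes]:
    # Random phase first, XOR-closest phase afterwards.
    if hops < network_size_estimate:
        return select_random_peer(peers, bloom)
    return select_closest_peer(peers, k, bloom)


def is_closest_peer(n: bytes, peers: List[bytes], k: bytes,
                    bloom: Collection[bytes] = ()) -> bool:
    closest = select_closest_peer(peers, k, bloom)
    return closest is None or get_distance(n, k) <= get_distance(closest, k)


# One-byte identifiers keep the arithmetic easy to follow.
peers = [bytes([0b0001]), bytes([0b0100]), bytes([0b1000])]
key = bytes([0b0101])
```

With these toy identifiers the distances to the key are 4, 1 and 13, so the peer `0b0100` is selected unless it is already in the bloomfilter.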
Message Processing The implementation MUST listen for RECEIVE(P, M) signals from the Underlay and respond to the respective messages sent by the peer P. In the following, the wire formats of the messages and the required processing are detailed. The local peer address is referred to as N.
Route Options The RouteOptions consist of the following flags which are represented in an options field in the messages. Each flag is represented by a bit in the field starting from 0 as the rightmost bit to 15 as the leftmost bit.
0: Demultiplex-Everywhere
indicates that each peer along the way should process the request.
1: Record-Route
indicates to keep track of the route that the message takes in the P2P network.
2: Find-Approximate
This is a special flag which modifies the message processing to allow approximate results.
3-15: Reserved
For future use.
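The flag layout above maps naturally onto a bit-flag type. The following Python sketch is non-normative. The bit positions come from the list above; serializing the 16-bit field in network byte order and masking off the reserved bits on receipt are assumptions made for this example.

```python
from enum import IntFlag


class RouteOptions(IntFlag):
    DEMULTIPLEX_EVERYWHERE = 1 << 0  # bit 0
    RECORD_ROUTE = 1 << 1            # bit 1
    FIND_APPROXIMATE = 1 << 2        # bit 2


def encode_options(options: RouteOptions) -> bytes:
    # ASSUMPTION: the 16-bit field is sent in network byte order.
    return int(options).to_bytes(2, "big")


def decode_options(raw: bytes) -> RouteOptions:
    value = int.from_bytes(raw, "big")
    # ASSUMPTION: reserved bits 3-15 are ignored by the receiver.
    return RouteOptions(value & 0b111)


opts = RouteOptions.DEMULTIPLEX_EVERYWHERE | RouteOptions.FIND_APPROXIMATE
wire = encode_options(opts)
```

The combination used in the example is the one that Peer Discovery requests are required to carry.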
Result Bloom Filter The result bloom filter is used to indicate to a peer which results are not of interest when processing a GET message (). Any peer which is processing GET messages and has a result which matches the query key MUST check the result bloom filter and only send a reply message if the block key is not in the bloom filter set. The bloom filter is a 128-bit field. It is initially empty, consisting only of zeroes. There are two functions which can be invoked on the Bloom filter: BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to be added to the Bloom filter or queried against the set. Any bloom filter uses k=16 different hash functions each of which is defined as follows:
Extended query TODO: Talk about XQuery in the context of messages.
HelloMessage
Wire Format
where:
MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. This type can be one of the DHT message types, but for HELLO messages it must be set to the value 157 in network byte order.
RESERVED
is a 16-bit field that must be zero.
URL_CTR
is a 16-bit number that gives the total number of addresses encoded in the ADDRESSES field. In network byte order.
SIGNATURE
is a 64 byte EdDSA signature using the sender's private key affirming the information contained in the message. The signature is signing exactly the same data that is being signed in a HELLO block as described in section XXX.
EXPIRATION
denotes the absolute 64-bit expiration date of the content. The value specified is microseconds since midnight (0 hour), January 1, 1970, but must be a multiple of one million (so that it can be represented in seconds in a HELLO URL). Stored in network byte order.
ADDRESSES
A sequence of exactly URL_CTR 0-terminated URIs in UTF-8 encoding representing addresses of this peer. Each URI must begin with a non-empty URI scheme terminated by "://" and followed by some non-empty Underlay-specific address encoding.
Processing Upon receiving a HelloMessage from a peer P, an implementation MUST process it step by step as follows:
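The field descriptions above can be turned into a small encoder and decoder. The following Python sketch is non-normative and rests on assumptions that this text does not state explicitly: MSIZE is taken to be a 16-bit field, the fields are assumed to appear on the wire in the order in which they are listed, and no padding is assumed between them. The function names are invented for the example, and signature verification is out of scope here.

```python
import struct
from typing import List, Tuple

HELLO_MTYPE = 157
# ASSUMED layout: MSIZE, MTYPE, RESERVED, URL_CTR (16 bit each),
# SIGNATURE (64 bytes), EXPIRATION (64 bit), all in network byte order.
_HEADER = struct.Struct(">HHHH64sQ")


def encode_hello(signature: bytes, expiration_us: int,
                 addresses: List[str]) -> bytes:
    if len(signature) != 64:
        raise ValueError("SIGNATURE must be 64 bytes")
    if expiration_us % 1_000_000 != 0:
        raise ValueError("EXPIRATION must be a multiple of one million")
    body = b"".join(a.encode("utf-8") + b"\x00" for a in addresses)
    msize = _HEADER.size + len(body)
    header = _HEADER.pack(msize, HELLO_MTYPE, 0, len(addresses),
                          signature, expiration_us)
    return header + body


def decode_hello(message: bytes) -> Tuple[bytes, int, List[str]]:
    (msize, mtype, reserved, url_ctr,
     signature, expiration_us) = _HEADER.unpack_from(message)
    if msize != len(message) or mtype != HELLO_MTYPE or reserved != 0:
        raise ValueError("malformed HELLO header")
    body = message[_HEADER.size:]
    if body and not body.endswith(b"\x00"):
        raise ValueError("last address is not 0-terminated")
    raw_uris = body[:-1].split(b"\x00") if body else []
    if len(raw_uris) != url_ctr:
        raise ValueError("URL_CTR does not match ADDRESSES")
    addresses = [raw.decode("utf-8") for raw in raw_uris]
    for uri in addresses:
        scheme, separator, rest = uri.partition("://")
        if not (scheme and separator and rest):
            raise ValueError("address is not of the form scheme://address")
    return signature, expiration_us, addresses


# All-zero signature and a documentation address, both placeholders.
example = encode_hello(bytes(64), 1_700_000_000_000_000,
                       ["ip+tcp://192.0.2.1:2086"])
```

Under the assumed layout the fixed part of the message is 80 bytes long, followed by the 0-terminated address strings.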
  1. If P is not in its routing table, the message is discarded.
  2. The signature is verified, including a check that the expiration time is in the future. If the signature is invalid, the message is discarded.
  3. The HELLO information is cached in the routing table until it expires, the peer is removed from the routing table, or the information is replaced by another message from the peer.
PutMessage
Wire Format
where:
MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. This type can be one of the DHT message types but for put messages it must be set to the value 146 in network byte order.
BTYPE
is a 32-bit block type field. The block type indicates the content type of the payload. In network byte order.
OPTIONS
is a 16-bit options field (see below).
HOPCOUNT
is a 16-bit number indicating how many hops this message has traversed so far. In network byte order.
REPL_LVL
is a 16-bit number indicating the desired replication level of the data. In network byte order.
PATH_LEN
is a 16-bit number indicating the length of the PUT path recorded in PUTPATH. As PUTPATH is optional, this value may be zero. In network byte order.
EXPIRATION
denotes the absolute 64-bit expiration date of the content. In microseconds since midnight (0 hour), January 1, 1970 in network byte order.
PEER_BF
A bloomfilter (for peer addresses) to stop circular routes.
BLOCK_KEY
The key under which the PUT request wants to store content.
PUTPATH
the variable-length PUT path. The path consists of a list of PATH_LEN peer addresses.
BLOCK
the variable-length block payload. The contents are determined by the BTYPE field.
Processing Upon receiving a PutMessage from a peer P, an implementation MUST process it step by step as follows:
  1. The EXPIRATION field is evaluated. If the message is expired, it MUST be discarded.
  2. If the BTYPE is not supported by the implementation, no validation of the block payload is performed and processing continues at (4). Else, the block MUST be validated as defined in (3).
  3. The block payload of the message is evaluated according to the BTYPE using the respective ValidateBlockStoreRequest procedure. If the block payload is invalid or does not match the key, it MUST be discarded.
  4. The peer address of the sender peer P SHOULD be in PEER_BF. If not, the implementation MAY log an error, but MUST continue.
  5. If the RecordRoute flag is set in OPTIONS, the local peer address MUST be appended to the PUTPATH of the message.
  6. If the local peer is the closest peer (cf. IsClosestPeer(N, BLOCK_KEY)) or the DemultiplexEverywhere options flag is set, the message MUST be stored locally in the block storage.
  7. Given the value in REPL_LVL, the number of peers to forward to MUST be calculated. If there is at least one peer to forward to, the implementation SHOULD select up to this number of peers to forward the message to. The implementation MAY forward to fewer or no peers in order to handle resource constraints such as bandwidth. Finally, the local peer address MUST be added to the PEER_BF of the forwarded message. For all peers with peer address P' chosen to forward the message to, SEND(P', PutMessage) is called.
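The seven processing steps above can be sketched as a single handler. This Python illustration is non-normative and simplified: PEER_BF is modelled as a set, validators, storage, next-hop selection and sending are passed in as callables with hypothetical names, and the calculation of the number of peers to forward to is delegated to `pick_next_hops` because its formula is not given in the steps above. HOPCOUNT handling is not described in those steps and is left untouched.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Set

DEMULTIPLEX_EVERYWHERE = 1 << 0
RECORD_ROUTE = 1 << 1


@dataclass
class PutMessage:
    btype: int
    options: int
    hopcount: int
    repl_lvl: int
    expiration_us: int
    peer_bf: Set[bytes]  # stand-in for the Bloom filter
    block_key: bytes
    putpath: List[bytes]
    block: bytes


def handle_put(msg: PutMessage, sender: bytes, local: bytes, now_us: int,
               validators: Dict[int, Callable[[bytes, bytes], bool]],
               is_closest: Callable[[bytes], bool],
               store: Callable[[bytes, bytes], None],
               pick_next_hops: Callable[[PutMessage], List[bytes]],
               send: Callable[[bytes, PutMessage], None]) -> bool:
    """Return False if the message was discarded, True otherwise."""
    # Step 1: discard expired messages.
    if msg.expiration_us <= now_us:
        return False
    # Steps 2 and 3: validate only if the block type is supported.
    validate = validators.get(msg.btype)
    if validate is not None and not validate(msg.block, msg.block_key):
        return False
    # Step 4: the sender should be in PEER_BF; continue either way.
    if sender not in msg.peer_bf:
        pass  # an implementation MAY log an error here
    # Step 5: record the route if requested.
    putpath = list(msg.putpath)
    if msg.options & RECORD_ROUTE:
        putpath.append(local)
    # Step 6: store locally if closest or if demultiplexing everywhere.
    if is_closest(msg.block_key) or msg.options & DEMULTIPLEX_EVERYWHERE:
        store(msg.block_key, msg.block)
    # Step 7: add the local peer to PEER_BF and forward.
    forwarded = replace(msg, putpath=putpath, peer_bf=msg.peer_bf | {local})
    for hop in pick_next_hops(msg):
        send(hop, forwarded)
    return True


stored: Dict[bytes, bytes] = {}
sent: List[tuple] = []
message = PutMessage(btype=1, options=RECORD_ROUTE, hopcount=0, repl_lvl=1,
                     expiration_us=200, peer_bf={b"S"}, block_key=b"K",
                     putpath=[], block=b"B")
accepted = handle_put(message, b"S", b"L", 100, {}, lambda k: True,
                      stored.__setitem__, lambda m: [b"N"],
                      lambda p, m: sent.append((p, m)))
```

Passing the collaborators in as callables keeps the control flow of the seven steps visible without committing to a particular storage or routing implementation.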
GetMessage
Wire Format
where:
MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. It must be set to the value 147 in network byte order.
BTYPE
is a 32-bit block type field. The block type indicates the content type of the payload. In network byte order.
OPTIONS
is a 16-bit options field (see below).
HOPCOUNT
is a 16-bit number indicating how many hops this message has traversed so far. In network byte order.
REPL_LVL
is a 16-bit number indicating the desired replication level of the data. In network byte order.
XQ_SIZE
is a 32-bit number indicating the length of the optional extended query XQUERY. In network byte order.
PEER_BF
A bloomfilter (for peer identities) to stop circular routes.
QUERY_HASH
The query used to indicate which blocks the originator is looking for in this GET request. The value is commonly evaluated with respect to its XOR distance to a given block key when it is considered as an answer to the request. The block type may use a different evaluation logic to determine applicable result blocks.
XQUERY
the variable-length extended query. Optional.
BF_MUTATOR
The 32-bit bloomfilter mutator for the result bloomfilter.
RESULT_BF
the variable-length result bloomfilter.
Processing Upon receiving a GetMessage from a peer an implementation MUST process it step by step as follows:
  1. The QUERY_HASH and XQUERY fields are validated against the requested BTYPE as defined by its respective ValidateBlockQuery procedure. If the BTYPE is not supported, or if the block key does not match or if the XQUERY is malformed, the message MUST be discarded.
  2. The peer address of the sender peer P SHOULD be in the PEER_BF bloom filter. If not, the implementation MAY log an error, but MUST continue.
  3. If the local peer is the closest peer (cf. IsClosestPeer(N, QUERY_HASH)) or the DemultiplexEverywhere options flag is set, a reply MUST be produced (if one is available) using the following steps:
    1. If BTYPE indicates a request for a HELLO block, the peer MUST consult the HELLOs it has cached for the peers in its routing table instead of the local block storage (while continuing to respect options like DemultiplexEverywhere and FindApproximate).
    2. If OPTIONS indicate a FindApproximate request, the peer should respond with the closest block it has that is not filtered by the RESULT_BF.
    3. Else, the peer should only respond if it has a block that matches the key exactly and that is not filtered by the RESULT_BF.
    4. Any resulting block must be encapsulated in a ResultMessage and transmitted to the neighbour from which the request was received.
  4. The reply produced in the previous step must be evaluated using ValidateBlockReply for this BTYPE.
  5. Given the value in REPL_LVL, the number of peers to forward to MUST be calculated (NUM-FORWARD-PEERS). If there is at least one peer to forward to, the implementation SHOULD select up to this number of peers to forward the message to. The implementation MAY forward to fewer or no peers in order to handle resource constraints such as bandwidth. The message bloom filter PEER_BF MUST be updated with the local peer address N. For all peers with peer address P' chosen to forward the message to, SEND(P', GetMessage) is called.
ResultMessage
Wire Format
where:
MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. This type can be one of the DHT message types but for result messages it must be set to the value 148 in network byte order.
OPTIONS
is a 16-bit options field (see below).
BTYPE
is a 32-bit block type field. The block type indicates the content type of the payload. In network byte order.
PUTPATH_L
is a 16-bit number indicating the length of the PUT path recorded in PUTPATH. As PUTPATH is optional, this value may be zero. In network byte order.
GET_PATH_LEN
is a 16-bit number indicating the length of the GET path recorded in GETPATH. As GETPATH is optional, this value may be zero. In network byte order.
EXPIRATION
denotes the absolute 64-bit expiration date of the content. In microseconds since midnight (0 hour), January 1, 1970 in network byte order.
QUERY_HASH
the query hash corresponding to the GET message which caused this reply message to be sent.
PUTPATH
the variable-length PUT path. The path consists of a list of PUTPATH_L peer addresses.
GETPATH
the variable-length GET path. The path consists of a list of GET_PATH_LEN peer addresses.
BLOCK
the variable-length block payload. The contents are determined by the BTYPE field.
Processing Upon receiving a ResultMessage from a connected peer, an implementation MUST process it step by step as follows:
  1. The EXPIRATION field is evaluated. If the message is expired, it MUST be discarded.
  2. If the BTYPE of the message indicates a HELLO block, it must be validated according to . The payload MUST be considered for the local routing table by trying to establish a connection to the peer using the information from the HELLO block. If a connection can be established, the peer is added to the k-buckets of the routing table.
  3. If the sender peer of the message is already found in the GETPATH, the path MUST be truncated at this position. The implementation MAY log a warning in such a case.
  4. If the QUERY_HASH of this ResultMessage is found in the list of pending local queries, the QUERY_HASH and XQUERY are validated against the requested BTYPE using the respective block type implementation of ValidateBlockReply. If the approximate flag is set and the BTYPE allows the implementation to compute the key from the block it must match the QUERY_HASH. If the XQUERY is malformed, the message MUST be discarded.
  5. The implementation MAY cache RESULT messages.
  6. If requests by other peers for this QUERY_HASH or BTYPE are known, the result block is validated against each request using the respective ValidateBlockReply function. If the request options include FindApproximate and the result message block type is HELLO the block validation must use the key derived using DeriveBlockKey as the key included in the request is only approximate. If the result message block cannot be verified against the QUERY_HASH of the result message or if BLOCK is malformed, the message MUST be discarded. For each pending request the reply is routed to the requesting peer P'.
Block Storage
Blocks Applications can and should define their own block types. The block type determines the format and handling of the block payload by peers in PUT and RESULT messages. Block types MUST be registered with GANA .
Block Processing Block validation may be necessary for both request as well as reply messages. When evaluating request messages and their metadata, the possible evaluation results are:
REQUEST_VALID
Query is valid, no reply given.
REQUEST_INVALID
Query format does not match block type. For example, the XQuery is not given or the size of the XQuery is not appropriate for the type.
When evaluating result messages, the possible evaluation results are:
OK_MORE
Valid result, and there may be more.
OK_LAST
Last possible valid result.
OK_DUPLICATE
Valid result, but duplicate.
RESULT_INVALID
Invalid result. Block does not match query. Value = 4.
RESULT_IRRELEVANT
Block does not match xquery. Valid result, but not relevant for the request.
Block Functions Any block type implementation MUST implement the following functions.
ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult
is used to evaluate the request for a block. It is used as part of GetMessage processing, where the block payload is still unknown, but the block XQuery and Key can and MUST be verified, if possible.
ValidateBlockStoreRequest(Block, Key) -> RequestEvaluationResult
is used to evaluate a block including its key and payload. It is used as part of PutMessage processing. The validation MUST include a check of the block payload against the Key under which it is requested to be stored.
ValidateBlockReply(Block, XQuery, Key) -> ReplyEvaluationResult
is used to evaluate a block including its Key and payload. It is used as part of ResultMessage processing. The validation of the respective Block requires a pending local query or a previously routed request of another peer and its associated XQuery data and Key. The validation MUST include a check of the block payload against the key under which it is requested to be stored.
DeriveBlockKey(Block) -> Key
is used to synthesize the block key from the block payload and metadata. It is used as part of FIND-PEER message processing.
FilterResult(Block, XQuery, Key) -> ReplyEvaluationResult
is used to filter results stored in the local block storage for local queries. Locally stored blocks from previously observed ResultMessages and PutMessages MAY use this function instead of ValidateBlockReply in order to avoid revalidation of the block and only perform filtering based on request parameters.
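The evaluation results and the five block functions above can be expressed as an abstract interface. The following Python sketch is non-normative. The class `HashAddressedBlock` is a hypothetical block type invented for this example (its key is the SHA-512 hash of the payload and it takes no XQuery); it is not one of the block types of this specification. The text only states a numeric value for RESULT_INVALID (4); all other enum values here are placeholders.

```python
import hashlib
from abc import ABC, abstractmethod
from enum import Enum


class RequestEvaluationResult(Enum):
    REQUEST_VALID = 1    # placeholder value
    REQUEST_INVALID = 2  # placeholder value


class ReplyEvaluationResult(Enum):
    OK_MORE = 1            # placeholder value
    OK_LAST = 2            # placeholder value
    OK_DUPLICATE = 3       # placeholder value
    RESULT_INVALID = 4     # value given in the text
    RESULT_IRRELEVANT = 5  # placeholder value


class BlockType(ABC):
    @abstractmethod
    def validate_block_query(self, key: bytes, xquery: bytes) -> RequestEvaluationResult: ...

    @abstractmethod
    def validate_block_store_request(self, block: bytes, key: bytes) -> RequestEvaluationResult: ...

    @abstractmethod
    def validate_block_reply(self, block: bytes, xquery: bytes, key: bytes) -> ReplyEvaluationResult: ...

    @abstractmethod
    def derive_block_key(self, block: bytes) -> bytes: ...

    @abstractmethod
    def filter_result(self, block: bytes, xquery: bytes, key: bytes) -> ReplyEvaluationResult: ...


class HashAddressedBlock(BlockType):
    """Hypothetical type: key == SHA-512(block), XQuery must be empty."""

    def validate_block_query(self, key, xquery):
        ok = len(key) == 64 and xquery == b""
        return (RequestEvaluationResult.REQUEST_VALID if ok
                else RequestEvaluationResult.REQUEST_INVALID)

    def validate_block_store_request(self, block, key):
        ok = self.derive_block_key(block) == key
        return (RequestEvaluationResult.REQUEST_VALID if ok
                else RequestEvaluationResult.REQUEST_INVALID)

    def validate_block_reply(self, block, xquery, key):
        if self.derive_block_key(block) != key:
            return ReplyEvaluationResult.RESULT_INVALID
        # Only one payload can hash to a given key, so this is the last result.
        return ReplyEvaluationResult.OK_LAST

    def derive_block_key(self, block):
        return hashlib.sha512(block).digest()

    def filter_result(self, block, xquery, key):
        # Stored blocks were validated before; nothing to filter for this type.
        return ReplyEvaluationResult.OK_LAST


block_type = HashAddressedBlock()
payload = b"hello"
payload_key = block_type.derive_block_key(payload)
```

A block type that allows several blocks per key would return OK_MORE from `validate_block_reply` and use the XQuery to decide between a valid and an irrelevant result.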
Hello Block For bootstrapping and peer discovery, the DHT implementation uses its own block type called "HELLO". A block with this block type contains the peerID of the peer initiating the GET request. The HELLO block type wire format is illustrated in . A query for block of type HELLO MUST NOT include extended query data (XQuery). Any implementation encountering a HELLO block with XQuery data MUST consider the block invalid and ignore it.
PEER-ID
is the Peer-ID of the peer which has generated this HELLO.
SIGNATURE
is the signature of the HELLO.
EXPIRATION
denotes the absolute 64-bit expiration date of the HELLO. In microseconds since midnight (0 hour), January 1, 1970 in network byte order.
ADDRESSES
is a list of UTF-8 URIs which can be used as addresses to contact the peer. The strings MUST be 0-terminated.
The SIGNATURE covers an 80-byte pseudo header conceptually prefixed to the block. The pseudo header includes its own size, the signature purpose, the expiration and a hash over the addresses. The wire format is illustrated in .
SIZE
A 32-bit value containing the length of the signed data in bytes in network byte order. The length of the signed data MUST be 80 bytes.
PURPOSE
A 32-bit signature purpose flag. This field MUST be 40 (in network byte order).
EXPIRATION
denotes the absolute 64-bit expiration date of the HELLO, in microseconds since midnight (0 hour), January 1, 1970, in network byte order.
H_ADDRS
A SHA-512 hash over the ADDRESSES field of the HELLO.
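The pseudo header fields listed above can be assembled as in the following sketch. It assumes that the fields appear in the order in which they are listed (SIZE, PURPOSE, EXPIRATION, H_ADDRS) and that H_ADDRS is a SHA-512 digest over the raw ADDRESSES field. The function and constant names are hypothetical. Computing the signature itself is out of scope here.

```python
import hashlib
import struct

SIGNATURE_PURPOSE_HELLO = 40  # PURPOSE value required for HELLO signatures
SIGNED_DATA_SIZE = 80         # 4 (SIZE) + 4 (PURPOSE) + 8 (EXPIRATION) + 64 (H_ADDRS)


def hello_pseudo_header(expiration_us, addresses_field):
    """Build the 80 bytes of signed data for a HELLO (assumed field order)."""
    h_addrs = hashlib.sha512(addresses_field).digest()  # 64 bytes
    # ">" selects network byte order without padding: two uint32, one uint64.
    fixed = struct.pack(">IIQ", SIGNED_DATA_SIZE, SIGNATURE_PURPOSE_HELLO,
                        expiration_us)
    signed = fixed + h_addrs
    if len(signed) != SIGNED_DATA_SIZE:
        raise AssertionError("pseudo header must be exactly 80 bytes")
    return signed
```

The result is the byte string that would be passed to the signature routine. The length check mirrors the requirement that the signed data is exactly 80 bytes.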
H_ADDRS is generated over the ADDRESSES field as provided in the HELLO block using SHA-512. A HELLO reply block MAY be empty. Otherwise, it contains the HELLO of a peer. The ADDRESSES part of the HELLO indicates endpoints which can be used by the Underlay in order to establish a connection with the peer identified by Peer-ID. An example of an addressing scheme used throughout this document is "ip+tcp", which refers to a standard TCP/IP socket connection. The "hier"-part of the URI must provide a suitable address for the given addressing scheme. The following is a non-normative example of address strings:
Persistence An implementation MUST provide a local persistence mechanism for blocks. The local storage MUST provide the following API:
Store(Key, Block)
Stores a block under the specified key.
Lookup(Key) -> List of Blocks
Retrieves the blocks stored under the specified key.
LookupApproximate(Key) -> List of Blocks
Retrieves the blocks stored under the specified key and any blocks under keys close to the specified key.
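The three operations can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: the class name, the use of byte strings as keys, the `limit` parameter and its default are all hypothetical, and LookupApproximate is implemented as a full scan by XOR distance purely for brevity.

```python
from collections import defaultdict


class MemoryBlockStore:
    """Toy in-memory block storage keyed by fixed-length byte strings."""

    def __init__(self):
        self._blocks = defaultdict(list)

    def store(self, key, block):
        """Store a block under the specified key."""
        self._blocks[key].append(block)

    def lookup(self, key):
        """Return the blocks stored under exactly this key."""
        return list(self._blocks.get(key, []))

    def lookup_approximate(self, key, limit=16):
        """Return blocks under the `limit` keys closest to `key` (XOR distance)."""
        target = int.from_bytes(key, "big")
        closest = sorted(self._blocks,
                         key=lambda k: int.from_bytes(k, "big") ^ target)
        return [b for k in closest[:limit] for b in self._blocks[k]]
```

A persistent implementation would replace the dictionary with an on-disk index. The full scan in lookup_approximate does not scale to large storages.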
Over time, a peer may accumulate a significant number of blocks in the local persistence layer. Because of the expected high number of blocks, the retrieval of blocks close to the specified lookup key in the LookupApproximate API must be implemented with care with respect to efficiency. It is RECOMMENDED to limit the number of results from the LookupApproximate procedure to a result size which is manageable by the local system. In order to efficiently find a suitable result set, the implementation SHOULD use the following procedure:
  1. Sort all blocks by block key, once in ascending and once in descending order. The block keys are interpreted as integers.
  2. Alternately select a block with a larger key and a block with a smaller key from the two sortings. The resulting set is sorted by XOR distance. The selection continues until the upper bound for the result set is reached and neither sorting yields any closer blocks.
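The two steps above could be sketched as follows. This is one possible reading, not the reference algorithm: it works on a single ascending list of distinct integer keys and walks outward from the lookup key in both directions, which is equivalent to consuming the two sortings. The stop rule is an assumption of this sketch. It relies on the fact that the XOR distance of two integers is never smaller than their absolute difference, so a direction is abandoned once the numeric gap alone exceeds the worst distance in a full result set. The function name and parameters are hypothetical.

```python
import bisect
import heapq


def lookup_approximate(sorted_keys, target, limit):
    """Return up to `limit` keys closest to `target`, sorted by XOR distance."""
    if limit <= 0:
        return []
    hi = bisect.bisect_left(sorted_keys, target)  # first key >= target
    lo = hi - 1                                   # last key < target
    best = []  # max-heap on distance, stored as (-distance, key)

    def consider(key):
        """Offer a key; return False once this direction cannot improve."""
        if len(best) == limit and abs(key - target) > -best[0][0]:
            return False  # |key - target| <= key ^ target, so nothing closer
        dist = key ^ target
        if len(best) < limit:
            heapq.heappush(best, (-dist, key))
        elif dist < -best[0][0]:
            heapq.heapreplace(best, (-dist, key))
        return True

    hi_open, lo_open = hi < len(sorted_keys), lo >= 0
    while hi_open or lo_open:
        if hi_open:  # next larger key
            hi_open = consider(sorted_keys[hi])
            hi += 1
            hi_open = hi_open and hi < len(sorted_keys)
        if lo_open:  # next smaller key
            lo_open = consider(sorted_keys[lo])
            lo -= 1
            lo_open = lo_open and lo >= 0
    return sorted((key for _, key in best), key=lambda k: k ^ target)
```

With keys [1, 7, 9, 15, 64], lookup key 8 and a limit of 2, the numerically adjacent key 7 is selected first but is later displaced by 15, whose XOR distance to 8 is smaller. This is why the selection cannot simply stop as soon as the result set is full.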
An implementation MAY decide to use a custom algorithm in order to find the closest blocks in the local storage. However, more primitive approaches, such as comparing the XOR distances of all blocks in the storage, may become inefficient for large storages.
Caching Strategy An implementation MUST implement an eviction strategy for blocks stored in the block storage layer. In order to ensure the freshness of blocks, an implementation MUST evict expired blocks in favor of new blocks. An implementation MAY preserve blocks which are often requested. This approach can be expensive as it requires the implementation to keep track of how often a block is requested. An implementation MAY preserve blocks which are close to the local peer ID. An implementation MAY provide configurable storage quotas and adapt its eviction strategy based on the current storage size or other constrained resources.
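One way to combine these rules is sketched below, assuming that every stored block carries an absolute expiration time and that block keys and the local peer's position in the key space are available as integers. Expired blocks are always selected for eviction. If the remaining blocks still exceed a quota, those farthest from the local peer by XOR distance go next. The type, function and parameter names are hypothetical, and request-frequency tracking is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredBlock:
    key: int            # block key interpreted as an integer
    expiration_us: int  # absolute expiration in microseconds


def select_evictions(blocks, now_us, local_key, quota):
    """Pick blocks to evict: all expired ones, then the farthest live ones."""
    expired = [b for b in blocks if b.expiration_us <= now_us]
    live = [b for b in blocks if b.expiration_us > now_us]
    overflow = max(0, len(live) - quota)
    # Farthest from the local peer first, so that close blocks are preserved.
    live.sort(key=lambda b: b.key ^ local_key, reverse=True)
    return expired + live[:overflow]
```

The quota here counts blocks. A byte-based quota would follow the same pattern with a running size total instead of a count.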
Security Considerations If an upper bound to the maximum number of neighbours in a k-bucket is reached, the implementation MUST prefer to preserve the oldest working connections instead of new connections. This makes Sybil attacks less effective as an adversary would have to invest more resources over time to mount an effective attack.
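The preference for long-lived connections can be illustrated with a small bucket model. This is a sketch under assumptions: the class name, the `capacity` parameter and the `is_working` callback (a liveness check supplied by the caller) are hypothetical and not part of this specification.

```python
class KBucket:
    """Bounded neighbour list that keeps the oldest working connections."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.peers = []  # ordered with the oldest connection first

    def offer(self, peer_id, is_working):
        """Try to add a newly connected peer; return True if it is kept."""
        if peer_id in self.peers:
            return True
        if len(self.peers) < self.capacity:
            self.peers.append(peer_id)
            return True
        for i, old in enumerate(self.peers):
            if not is_working(old):
                # Only a dead connection makes room for a newcomer.
                del self.peers[i]
                self.peers.append(peer_id)
                return True
        return False  # bucket is full of working peers: reject the new one
```

A full bucket of working peers rejects every newcomer, so an adversary cannot displace established neighbours merely by opening many new connections.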
IANA Considerations TODO: URI handler for "common" URI handler that Underlays may want to use as part of HELLOs.
GANA Considerations GANA is requested to create a "DHT Block Types" registry. The registry shall record for each entry:
  • Name: The name of the block type (case-insensitive ASCII string, restricted to alphanumeric characters)
  • Number: A 32-bit number
  • Comment: Optionally, a brief English text describing the purpose of the block type (in UTF-8)
  • Contact: Optionally, the contact information of a person to contact for further information
  • References: Optionally, references describing the record type (such as an RFC)
The registration policy for this sub-registry is "First Come First Served", as described in . GANA is requested to populate this registry as follows:
GANA is requested to amend the "GNUnet Signature Purpose" registry as follows:
Local Storage
Test Vectors
Normative References
  • RFC 2119
  • RFC 3629
  • RFC 3986
  • RFC 4634
  • RFC 6940
  • RFC 8126
  • RFC 8174
  • High-Speed High-Security Signatures. University of Illinois at Chicago; Technische Universiteit Eindhoven; Technische Universiteit Eindhoven; National Taiwan University; Academia Sinica.
  • GNUnet Assigned Numbers Authority (GANA). GNUnet e.V.
Informative References
  • R5N: Randomized recursive routing for restricted-route networks. Technische Universität München; Technische Universität München.
  • Kademlia: A peer-to-peer information system based on the xor metric.
  • CADET: Confidential ad-hoc decentralized end-to-end transport. Technische Universität München; Technische Universität München.
  • The GNU Name System. GNUnet e.V.; GNUnet e.V.; GNUnet e.V.