lsd0004

LSD0004: R5N Distributed Hash Table
Log | Files | Refs

commit f0778931876779e542970d5f2b5e288478280443
parent 8306b696288c82ecc1564c71425bd38e1ecdb228
Author: Martin Schanzenbach <schanzen@gnunet.org>
Date:   Sun, 14 Jul 2024 10:15:09 +0200

improve bloom, more bcp14

Diffstat:
Mdraft-schanzen-r5n.xml | 56++++++++++++++++++++++++++++++--------------------------
1 file changed, 30 insertions(+), 26 deletions(-)

diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml @@ -859,7 +859,7 @@ Connectivity | |Underlay| |Underlay| | Underlay | ... Implementations <bcp14>MAY</bcp14> cache valid <em>addresses</em> of disconnected <em>peers</em> outside of the routing table and sporadically or periodically try to (re-)establish connection to the <em>peer</em> by making <tt>TRY_CONNECT()</tt> calls to the <em>underlay interface</em> - if the respective <tt>k</tt>-bucket has empty slots. + if the respective <tt>k</tt>-bucket has empty slots. </t> </section> <section anchor="find_peer"> @@ -929,11 +929,12 @@ Connectivity | |Underlay| |Underlay| | Underlay | ... implementation, it is <bcp14>RECOMMENDED</bcp14> to choose external factors such as expiration of DHCP leases to determine the values. The specific frequency of advertisements - <bcp14>MAY</bcp14> depend on available bandwidth, the set of - already connected neighbors, the workload of the system and - other factors which are at the discretion of the developer, - but <bcp14>SHOULD</bcp14> be a fraction of the expiration - period. Whenever a peer receives such a <tt>HELLO</tt> + <bcp14>SHOULD</bcp14> be a fraction of the expiration + period. + It <bcp14>MAY</bcp14> additionally depend on available bandwidth, + the set of already connected neighbors, the workload of the system and + other factors which are at the discretion of the developer. + Whenever a peer receives such a <tt>HELLO</tt> message from another peer that is already in the routing table, it must cache it as long as that peer remains in its routing table (or until the <tt>HELLO</tt> expires) and @@ -954,9 +955,25 @@ Connectivity | |Underlay| |Underlay| | Underlay | ... filter is part of the routing metadata in messages to prevent circular routes. It is updated at each hop where the hop's peer identity derived from the peer's public key is - added to it. It is constant in size at <tt>L=1024</tt> bits - (128 bytes) and sets <tt>k=16</tt> bits per element. (At - this size, the Bloom filter reaches a false-positive rate of + added to it. + The peer Bloom filter follows the definition in <xref target="bloom_filters"/>. + It <bcp14>MUST</bcp14> be <tt>L=1024</tt> bits + (128 bytes) in size and <bcp14>MUST</bcp14> set <tt>k=16</tt> bits per + element. + The set of elements <tt>E</tt> consists of of all possible 256-bit peer + public keys and the mapping function <tt>M</tt> is defined as follows: + </t> + <t> + <tt>M(e) -> SHA-512 (e) as uint32[]</tt> + </t> + <t> + The element <tt>e</tt> is the peer public key which is hashed using SHA-512. + The resulting 512-bit peer identity is interpreted as an array of k=16 + 32-bit integers in network byte order which are used to set and check the bits + in <tt>B</tt> using <tt>BF-SET()</tt> and <tt>BF-TEST()</tt>. + </t> + <t> + At this size, the Bloom filter reaches a false-positive rate of approximately fifty percent at about 200 entries. For peer discovery where the Bloom filter is initially populated with peer identities from the local routing table, the 200 @@ -964,12 +981,13 @@ Connectivity | |Underlay| |Underlay| | Underlay | ... peers per bucket, which corresponds to an overlay network size of approximately 1 trillion peers. Thus, <tt>L=1024</tt> bits should suffice for all conceivable - use-cases.) For the next hop selection in both the random + use-cases. + </t> + <t> + For the next hop selection in both the random and the deterministic case, any peer which is in the peer Bloom filter for the respective message is excluded from the peer selection process. - </t> - <t> Any peer which is forwarding <tt>GetMessages</tt> or <tt>PutMessages</tt> (<xref target="p2p_messages"/>) thus adds its own peer public key to the peer Bloom filter. @@ -977,20 +995,6 @@ Connectivity | |Underlay| |Underlay| | Underlay | ... traversed peers when searching for the next hops in the routing table. </t> <t> - The peer Bloom filter follows the definition in <xref target="bloom_filters"/>. - The set of elements <tt>E</tt> consists of of all possible 256-bit peer public keys. - The mapping function <tt>M</tt> is defined as follows: - </t> - <t> - <tt>M(e) -> SHA-512 (e) as uint32[]</tt> - </t> - <t> - The element <tt>e</tt> is the peer public key which is hashed using SHA-512. - The resulting 512-bit peer identity is interpreted as an array of k=16 - 32-bit integers in network byte order which are used to set and check the bits - in <tt>B</tt> using <tt>BF-SET()</tt> and <tt>BF-TEST()</tt>. - </t> - <t> We note that the peer Bloom filter may exclude peers due to false-postive matches. This is acceptable as routing should nevertheless terminate (with high probability) in close vicinity of the key. Furthermore,