Make wording consistent on bloom filters. - lsd0004 - LSD0004: R5N Distributed Hash Table

commit 30a22ea5b8f6450319e3b1d88520e7d25e5d53a7
parent 3f70faf531ebd101004f5b93e1c80ee1ec5bcdd1
Author: Martin Schanzenbach <schanzen@gnunet.org>
Date:   Sat, 24 Dec 2022 00:56:23 +0900

Make wording consistent on bloom filters.

Diffstat:
M draft-schanzen-r5n.xml  | 60 +++++++++++++++++++++++++++++++++++-------------------------

1 file changed, 35 insertions(+), 25 deletions(-)
diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml
@@ -863,15 +863,15 @@ bchar = *(ALPHA / DIGIT)
         <t>
           As DHT <tt>GetMessage</tt>s and <tt>PutMessage</tt>s traverse a random path through the network for the
           first N hops, it is essential that routing loops are avoided.
-          In R<sup>5</sup>N, a Bloom filter is used as part of the routing metadata in
-          messages. The Bloom filter is updates at each hop with the hops
+          In R<sup>5</sup>N, a Bloom filter is part of the routing metadata in
+          messages in order to prevent circular routes.
+          The Bloom filter is updates at each hop with the hops
           peer identity.
           For the next hop selection in both the random and the deterministic
           case, any peer which is in the Bloom filter for the respective message
           is not included in the peer selection process.
         </t>
         <t>
-          The peer Bloom filter is used to prevent circular routes.
           Any peer which is forwarding <tt>GetMessage</tt>s or <tt>PutMessage</tt>s
           (<xref target="p2p_messages"/>) adds its own peer ID to the
           peer Bloom filter.
@@ -879,16 +879,20 @@ bchar = *(ALPHA / DIGIT)
           traversed peers when searching for the next hops in the routing table.
         </t>
         <t>
-          The peer Bloom filter (see <xref target="bloom_filters"/>) consists of
-          L=1024 buckets.
+          The peer Bloom filter follows the definition in <xref target="bloom_filters"/>
+          and consists of L=1024 buckets.
+          The set of elements <tt>E</tt> consists of of all possible 256-bit peer IDs.
+          The number of buckets per element <tt>e</tt> is <tt>k=16</tt>.
           The mapping function <tt>M</tt> is defined as follows:
         </t>
         <t>
-          The element <tt>e</tt> is a 32 byte peer ID and is hashed using
-          SHA-512.
+          <tt>M(e) -> SHA-512 (e) as uint32[]</tt>
+        </t>
+        <t>
+          The element <tt>e</tt> is hashed using SHA-512.
           The resulting byte string is interpreted as a string of k=16
-          32-bit integers in network byte order which are used to set the bucket bits
-          accordingly.
+          32-bit integers in network byte order which are used to set and check the bucket bits
+          in <tt>B</tt> using <tt>BF-SET</tt> and <tt>BF-TEST</tt>.
         </t>
         <t>
 	  We note that the peer Bloom filter may exclude peers due to false-postive
@@ -2363,31 +2367,37 @@ gnunet+tcp://12.3.4.5/ \
           <dt>SetupResultFilter(FilterSize, Mutator) -&gt; RF</dt>
           <dd>
 	  <t>
+            <!-- FIXME: I do not understand this. Isn't the element set E known
+                 for HELLO blocks? Isn't it H_ADDRS XOR MUTATOR? -->
 	    The RESULT_FILTER for HELLO blocks is implemented using a
-            Bloom filter (see <xref target="bloom_filters"/>).
+            Bloom filter following the definition from <xref target="bloom_filters"/>
+            and consists of a variable number of buckets <tt>L</tt>.
             The size <tt>S = L/8</tt> of the Bloom filter in bytes depends on
-	    the number of elements <tt>F</tt> known to be filtered at the
-	    initiator. If <tt>F</tt> is zero, the size <tt>S</tt> is just 8 (bytes).
-	    Otherwise, <tt>S</tt> is set to the minimum of
-	    2<sup>15</sup> and the lowest power
-	    of 2 that is strictly larger than <tt>K*F/4</tt>
-	    (in bytes).
+	    the number of elements <tt>|E|</tt> known to be filtered at the
+	    initiator. If <tt>|E|</tt> is zero, the size <tt>S</tt> is padded to 32 bits for alignment.
+	    Otherwise, <tt>L</tt> is set to the minimum of
+	    2<sup>18</sup> bits (2<sup>15</sup> bytes) and the lowest power
+	    of 2 that is strictly larger than <tt>2*K*|E|</tt> bits (<tt>K*|E|/4</tt> bytes).
           </t>
           <t>
-            The <tt>k</tt>-value for the HELLO_BF
-            Bloom filter is always 16.
+            The <tt>k</tt>-value for the HELLO_BF Bloom filter is always 16.
             <tt>k</tt> is never transmitted.
             The elements used in the Bloom filter
             consist of an XOR between the <tt>H_ADDRS</tt> field (as computed using
             SHA-512 over the <tt>ADDRESSES</tt>) and the SHA-512
             hash of the <tt>MUTATOR</tt> field from a given HELLO block.
-            The mapping function M(<tt>H_ADDRS XOR MUTATOR</tt>) is an identity
-            function and returns the 512-bit XOR result unmodified.
+            The mapping function M(<tt>H_ADDRS XOR MUTATOR</tt>) is defined as follows:
+          </t>
+          <t>
+            <tt>M(e = H_ADDR XOR MUTATOR) -> e as uint32[]</tt>
+          </t>
+          <t>
+            <tt>M</tt> is an identity function and returns the 512-bit XOR result unmodified.
             This resulting byte string is interpreted as k=16 32-bit
-            integers in network byte order and used to set or check the bucket bits
-            accordingly.
-            This function returns an empty Bloom filter of length <tt>S= L/8</tt> as
-            part of RF:
+            integers in network byte order which are used to set and check the bucket bits in
+            <tt>B</tt> using <tt>BF-SET</tt> and <tt>BF-TEST</tt>.
+            The Mutator is prepended to the Bloom filter <tt>B</tt> to create the result filter
+            for a HELLO block:
           </t>
           <figure anchor="hello_rf" title="The HELLO Block Result Filter.">
             <artwork name="" type="" align="left" alt=""><![CDATA[
@@ -2407,7 +2417,7 @@ gnunet+tcp://12.3.4.5/ \
             </dd>
             <dt>HELLO_BF</dt>
             <dd>
-              The Bloom filter byte array.
+              The Bloom filter buckets byte array.
             </dd>
           </dl>
           <t>

	lsd0004 LSD0004: R5N Distributed Hash Table
	Log \| Files \| Refs