Diffstat (limited to 'draft-schanzen-r5n.xml')
-rw-r--r-- | draft-schanzen-r5n.xml | 155 |
1 files changed, 92 insertions, 63 deletions
diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml
index 06e3f13..6096c24 100644
--- a/draft-schanzen-r5n.xml
+++ b/draft-schanzen-r5n.xml
@@ -372,10 +372,11 @@ Connectivity | |Underlay| |Underlay| | |||
372 | </dd> | 372 | </dd> |
373 | <dt>Result-Filter:</dt> | 373 | <dt>Result-Filter:</dt> |
374 | <dd> | 374 | <dd> |
375 | is a Bloom filter which allows applications to | 375 | is data for a <tt>Block-type</tt>-specific filter |
376 | probabilistically indicate results which are | 376 | which allows applications to |
377 | indicate results which are | ||
377 | not relevant anymore to the | 378 | not relevant anymore to the |
378 | caller (see <xref target="result_bloomfilter"/>). | 379 | caller (see <xref target="result_filter"/>). |
379 | </dd> | 380 | </dd> |
380 | </dl> | 381 | </dl> |
381 | <t> | 382 | <t> |
@@ -659,7 +660,7 @@ Connectivity | |Underlay| |Underlay| | |||
659 | <t> | 660 | <t> |
660 | Any implementation encountering a HELLO GET request <bcp14>MUST</bcp14> respond | 661 | Any implementation encountering a HELLO GET request <bcp14>MUST</bcp14> respond |
661 | with its own HELLO block except if that block is | 662 | with its own HELLO block except if that block is |
662 | filtered by the request's result filter (see <xref target="result_bloomfilter"/>). | 663 | filtered by the request's result filter (see <xref target="result_filter"/>). |
663 | Implementations <bcp14>MAY</bcp14> respond | 664 | Implementations <bcp14>MAY</bcp14> respond |
664 | with additional valid HELLO blocks of other peers with keys | 665 | with additional valid HELLO blocks of other peers with keys |
665 | closest to the key of the GET request. A HELLO block is "valid" | 666 | closest to the key of the GET request. A HELLO block is "valid" |
@@ -762,16 +763,16 @@ bchar = *(ALPHA / DIGIT) | |||
762 | 32-bit integers in network byte order. | 763 | 32-bit integers in network byte order. |
763 | </t> | 764 | </t> |
764 | <t> | 765 | <t> |
765 | When adding an element to the bloom filter <tt>bf</tt> using | 766 | When adding an element to the Bloom filter <tt>bf</tt> using |
766 | <tt>BF-SET(bf,e)</tt>, each integer <tt>n</tt> of the mapping | 767 | <tt>BF-SET(bf,e)</tt>, each integer <tt>n</tt> of the mapping |
767 | <tt>M(e)</tt> is interpreted as a bit offset <tt>n mod L</tt> within | 768 | <tt>M(e)</tt> is interpreted as a bit offset <tt>n mod L</tt> within |
768 | <tt>bf</tt> and set to 1. | 769 | <tt>bf</tt> and set to 1. |
769 | </t> | 770 | </t> |
770 | <t> | 771 | <t> |
771 | When testing if an element may be in the bloom filter <tt>bf</tt> using | 772 | When testing if an element may be in the Bloom filter <tt>bf</tt> using |
772 | <tt>BF-TEST(bf,e)</tt>, each bit offset <tt>n mod L</tt> within | 773 | <tt>BF-TEST(bf,e)</tt>, each bit offset <tt>n mod L</tt> within |
773 | <tt>bf</tt> <bcp14>MUST</bcp14> have been set to 1. | 774 | <tt>bf</tt> <bcp14>MUST</bcp14> have been set to 1. |
774 | Otherwise, the element is not considered to be in the bloom filter. | 775 | Otherwise, the element is not considered to be in the Bloom filter. |
775 | </t> | 776 | </t> |
776 | </section> | 777 | </section> |
777 | <section anchor="routing" numbered="true" toc="default"> | 778 | <section anchor="routing" numbered="true" toc="default"> |
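Illustration for the hunk above (not part of the patch): a minimal Python sketch of BF-SET and BF-TEST as the two paragraphs describe them. The mapping M(e) is defined outside the lines shown in this hunk, so the sketch assumes that an element is a 64-byte value split into sixteen 32-bit integers in network byte order. The names mapping, bf_set and bf_test, and the choice of least-significant-bit-first ordering within a byte, are hypothetical and chosen only for this example.

```python
import hashlib
import struct


def mapping(e: bytes) -> tuple:
    """Assumed M(e): split a 64-byte element into sixteen big-endian
    (network byte order) 32-bit unsigned integers."""
    if len(e) != 64:
        raise ValueError("this sketch assumes 64-byte elements")
    return struct.unpack(">16I", e)


def bf_set(bf: bytearray, e: bytes) -> None:
    """BF-SET(bf, e): set the bit at offset n mod L for every n in M(e)."""
    total_bits = len(bf) * 8  # L, the filter length in bits
    for n in mapping(e):
        offset = n % total_bits
        # Bit order inside a byte is an assumption of this sketch.
        bf[offset // 8] |= 1 << (offset % 8)


def bf_test(bf: bytearray, e: bytes) -> bool:
    """BF-TEST(bf, e): True only if every bit offset n mod L is set."""
    total_bits = len(bf) * 8
    for n in mapping(e):
        offset = n % total_bits
        if not bf[offset // 8] & (1 << (offset % 8)):
            return False
    return True


if __name__ == "__main__":
    bf = bytearray(128)
    element = hashlib.sha512(b"example element").digest()
    print(bf_test(bf, element))  # nothing has been added yet
    bf_set(bf, element)
    print(bf_test(bf, element))  # the element now tests positive
```

A positive test only means the element may have been added; other elements can set the same bits, which is the false-positive case the result filter text in later hunks discusses.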
@@ -841,7 +842,7 @@ bchar = *(ALPHA / DIGIT) | |||
841 | To build its routing table, a peer will send out requests | 842 | To build its routing table, a peer will send out requests |
842 | asking for blocks of type HELLO using its own location as the key, | 843 | asking for blocks of type HELLO using its own location as the key, |
843 | but filtering all of its neighbors via the Bloom filter described | 844 | but filtering all of its neighbors via the Bloom filter described |
844 | in <xref target="result_bloomfilter"/>. | 845 | in <xref target="result_filter"/>. |
845 | These requests <bcp14>MUST</bcp14> use the FindApproximate and DemultiplexEverywhere | 846 | These requests <bcp14>MUST</bcp14> use the FindApproximate and DemultiplexEverywhere |
846 | flags. FindApproximate will ensure that other peers will reply | 847 | flags. FindApproximate will ensure that other peers will reply |
847 | with keys they merely consider close-enough, while DemultiplexEverywhere | 848 | with keys they merely consider close-enough, while DemultiplexEverywhere |
@@ -980,21 +981,21 @@ bchar = *(ALPHA / DIGIT) | |||
980 | For each entry in the pending table, the DHT <bcp14>MUST</bcp14> track | 981 | For each entry in the pending table, the DHT <bcp14>MUST</bcp14> track |
981 | not only the query key and the origin, but also the | 982 | not only the query key and the origin, but also the |
982 | extended query, requested block type and flags, and the | 983 | extended query, requested block type and flags, and the |
983 | result Bloom filter. If the query did not provide | 984 | result filter. If the query did not provide |
984 | a result Bloom filter, a fresh result Bloom filter | 985 | a result filter, a fresh result filter |
985 | <bcp14>MUST</bcp14> still be created to filter duplicate replies. | 986 | <bcp14>MUST</bcp14> still be created to filter duplicate replies. |
987 | Details of how a result filter works depend on the | ||
988 | type, as described in <xref target="block_functions"/>. | ||
986 | </t> | 989 | </t> |
987 | <t> | 990 | <t> |
988 | When a second query from the same origin for the | 991 | When a second query from the same origin for the |
989 | same query hash is received, the DHT <bcp14>MUST</bcp14> | 992 | same query hash is received, the DHT <bcp14>MUST</bcp14> |
990 | attempt to merge the new request with the state for | 993 | attempt to merge the new request with the state for |
991 | the old request. In particular, this means that if | 994 | the old request. If this is not possible, the |
992 | the result Bloom filters have the same size and | 995 | existing result filter <bcp14>MUST</bcp14> be |
993 | mutator, they <bcp14>MUST</bcp14> be combined. If | 996 | discarded and replaced with the result |
994 | the result Bloom fitlers meta data differs, the | 997 | filter of the incoming message. |
995 | existing result Bloom filter <bcp14>MUST</bcp14> be | 998 | </t> |
996 | discarded and replaced with the incoming result | ||
997 | Bloom filter. | ||
998 | <t> | 999 | <t> |
999 | We note that for local applications, a fixed limit on | 1000 | We note that for local applications, a fixed limit on |
1000 | the number of concurrent requests may be problematic. | 1001 | the number of concurrent requests may be problematic. |
@@ -1536,63 +1537,61 @@ bchar = *(ALPHA / DIGIT) | |||
1536 | <dd> | 1537 | <dd> |
1537 | the variable-length extended query. Optional. | 1538 | the variable-length extended query. Optional. |
1538 | </dd> | 1539 | </dd> |
1539 | <dt>BF_MUTATOR</dt> | 1540 | <dt>MUTATOR</dt> |
1540 | <dd> | 1541 | <dd> |
1541 | The 32-bit Bloom filter mutator for the result Bloom filter. | 1542 | The 32-bit mutator for the result filter. |
1542 | </dd> | 1543 | </dd> |
1543 | <dt>RESULT_BF</dt> | 1544 | <dt>RESULT_FILTER</dt> |
1544 | <dd> | 1545 | <dd> |
1545 | the variable-length result Bloom filter, described in <xref target="result_bloomfilter"/>. | 1546 | the variable-length result filter, described in <xref target="result_filter"/>. |
1546 | </dd> | 1547 | </dd> |
1547 | </dl> | 1548 | </dl> |
1548 | </section> | 1549 | </section> |
1549 | <section anchor="result_bloomfilter"> | 1550 | <section anchor="result_filter"> |
1550 | <name>Result Bloom Filter</name> | 1551 | <name>Result Filter</name> |
1551 | <t> | 1552 | <t> |
1552 | The result Bloom filter is used to indicate to other peers which results | 1553 | The result filter is used to indicate to other peers which results |
1553 | are not of interest when processing a <tt>GetMessage</tt> | 1554 | are not of interest when processing a <tt>GetMessage</tt> |
1554 | (<xref target="p2p_get"/>). | 1555 | (<xref target="p2p_get"/>). |
1555 | Any peer which is processing <tt>GetMessage</tt>s and has a result | 1556 | Any peer which is processing <tt>GetMessage</tt>s and has a result |
1556 | which matches the query key <bcp14>MUST</bcp14> check the result Bloom filter | 1557 | which matches the query key <bcp14>MUST</bcp14> check the result filter |
1557 | and only send a reply message if the result does not test positive | 1558 | and only send a reply message if the result does not test positive |
1558 | under the result Bloom filter. Before forwarding the <tt>GetMessage</tt>, the | 1559 | under the result filter. Before forwarding the <tt>GetMessage</tt>, the |
1559 | result Bloom filter <bcp14>MUST</bcp14> be updated to filter out all results | 1560 | result filter <bcp14>MUST</bcp14> be updated to filter out all results |
1560 | already returned by the local peer. | 1561 | already returned by the local peer. |
1561 | </t> | 1562 | </t> |
1562 | <t> | 1563 | <t> |
1563 | FIXME: say something about how to calculate the size of the result Bloom filter here! | 1564 | How a result filter is implemented depends on the block type |
1564 | </t> | 1565 | as described in <xref target="block_functions"/>. |
1565 | <t> | 1566 | Result filters may be probabilistic data structures. Thus, |
1566 | Bloom filters are probabilistic data structures. Thus, especially | 1567 | it is possible that a desirable result is filtered by a result |
1567 | given the small size of the result Bloom filter, it is always possible | 1568 | filter because of a false-positive test. |
1568 | that a desireable result is filtered by the Bloom filter because of | ||
1569 | a false-positive match created by a collision in the hash values | ||
1570 | between the desireable result and filtered items. | ||
1571 | </t> | 1569 | </t> |
1572 | <t> | 1570 | <t> |
1573 | To address this problem, R<sup>5</sup>N uses a mutator value | 1571 | To address this problem, R<sup>5</sup>N uses a <tt>MUTATOR</tt> value |
1574 | to additionally randomize the process | 1572 | which allows block implementations that use probabilistic data |
1575 | when hashing results into the result Bloom filter. The mutator | 1573 | structures for result filters to additionally "randomize" the |
1574 | computation of a probabilistic data structure while remaining | ||
1575 | deterministic across peers. The 32-bit <tt>MUTATOR</tt> | ||
1576 | value is set by the peer initiating the GET request, and changed | 1576 | value is set by the peer initiating the GET request, and changed |
1577 | every time the GET request is repeated by the initiator. Peers | 1577 | every time the GET request is repeated by the initiator. Peers |
1578 | forwarding GET requests <bcp14>MUST NOT</bcp14> change the | 1578 | forwarding GET requests <bcp14>MUST NOT</bcp14> change the |
1579 | mutator value included in the <tt>GetMessage</tt> as they could not | 1579 | mutator value included in the <tt>GetMessage</tt> as they might not |
1580 | recalculate the Bloom filter with the new mutator value. | 1580 | be able to recalculate the result filter with a different <tt>MUTATOR</tt> |
1581 | value. | ||
1581 | </t> | 1582 | </t> |
1582 | <t> | 1583 | <t> |
1583 | By including the mutator value in the hashing process, repeated | 1584 | By properly including the <tt>MUTATOR</tt> value in a probabilistic process, repeated |
1584 | requests have statistically independent probabilities of creating | 1585 | requests have statistically independent probabilities of creating |
1585 | collisions in the result Bloom filter. Thus, even if for one request | 1586 | false-positives in a result filter. Thus, even if for one request |
1586 | a result Bloom filter collision may exclude a result as a false-positive | 1587 | a result filter may exclude a result as a false-positive |
1587 | match, subsequent requests are likely to not have the same | 1588 | match, subsequent requests are likely to not have the same |
1588 | false-positives. | 1589 | false-positives. |
1589 | </t> | 1590 | </t> |
1590 | <t> | 1591 | <t> |
1591 | How exactly a block result is hashed into the result Bloom filter | 1592 | How exactly a block result is added to a result filter |
1592 | together with the mutator depends on the block type. | 1593 | (together with the <tt>MUTATOR</tt>) <bcp14>MUST</bcp14> be |
1593 | For example, some block types may include full block | 1594 | specified as part of the definition of a block type. |
1594 | payload, certain parts of the block payload, or the block key | ||
1595 | when hashing the block into the result Bloom filter. | ||
1596 | </t> | 1595 | </t> |
1597 | </section> | 1596 | </section> |
1598 | <section anchor="p2p_get_processing"> | 1597 | <section anchor="p2p_get_processing"> |
@@ -1604,8 +1603,7 @@ bchar = *(ALPHA / DIGIT) | |||
1604 | <ol> | 1603 | <ol> |
1605 | <li> | 1604 | <li> |
1606 | The <tt>QUERY_KEY</tt> and <tt>XQUERY</tt> fields are validated | 1605 | The <tt>QUERY_KEY</tt> and <tt>XQUERY</tt> fields are validated |
1607 | against the | 1606 | against the requested <tt>BTYPE</tt> as defined by its respective |
1608 | requested <tt>BTYPE</tt> as defined by its respective | ||
1609 | <tt>ValidateBlockQuery</tt> procedure. | 1607 | <tt>ValidateBlockQuery</tt> procedure. |
1610 | If validation | 1608 | If validation |
1611 | function yields <tt>REQUEST_INVALID</tt>, the message <bcp14>MUST</bcp14> be discarded. | 1609 | function yields <tt>REQUEST_INVALID</tt>, the message <bcp14>MUST</bcp14> be discarded. |
@@ -1641,7 +1639,7 @@ bchar = *(ALPHA / DIGIT) | |||
1641 | <tt>RESULT_BF</tt>. However, implementations also <bcp14>MUST</bcp14> | 1639 | <tt>RESULT_BF</tt>. However, implementations also <bcp14>MUST</bcp14> |
1642 | avoid an exhaustive search of their database, as there could be | 1640 | avoid an exhaustive search of their database, as there could be |
1643 | cases where too many local results are filtered by the result | 1641 | cases where too many local results are filtered by the result |
1644 | Bloom filter. To avoid denial of service attacks, implementations | 1642 | filter. To avoid denial of service attacks, implementations |
1645 | <bcp14>MUST</bcp14> thus ensure that the cost of evaluating any | 1643 | <bcp14>MUST</bcp14> thus ensure that the cost of evaluating any |
1646 | such query is reasonably small. | 1644 | such query is reasonably small. |
1647 | </li> | 1645 | </li> |
@@ -1833,8 +1831,10 @@ bchar = *(ALPHA / DIGIT) | |||
1833 | <li> | 1831 | <li> |
1834 | If the <tt>BTYPE</tt> is supported, result block <bcp14>MUST</bcp14> | 1832 | If the <tt>BTYPE</tt> is supported, result block <bcp14>MUST</bcp14> |
1835 | be validated against the specific query using | 1833 | be validated against the specific query using |
1836 | the respective <tt>FilterBlockResult</tt> function, possibly updating | 1834 | the respective <tt>FilterBlockResult</tt> function. This function |
1837 | the result Bloom filter of the query in the process. | 1835 | <bcp14>MUST</bcp14> update |
1836 | the result filter if a result is returned to the originator of the | ||
1837 | query. | ||
1838 | </li> | 1838 | </li> |
1839 | <li> | 1839 | <li> |
1840 | If the <tt>BTYPE</tt> is not supported, filtering of exact duplicate | 1840 | If the <tt>BTYPE</tt> is not supported, filtering of exact duplicate |
@@ -1933,8 +1933,17 @@ bchar = *(ALPHA / DIGIT) | |||
1933 | <dd>Block payload does not match the block type. | 1933 | <dd>Block payload does not match the block type. |
1934 | </dd> | 1934 | </dd> |
1935 | </dl> | 1935 | </dl> |
1936 | </dd> | ||
1937 | <dt>SetupResultFilter(FilterSize, Mutator) -> RF</dt> | ||
1938 | <dd> | ||
1939 | is used to set up an empty result filter. The arguments | ||
1940 | are the set of results that must be filtered at the | ||
1941 | initiator, and a <tt>MUTATOR</tt> value which <bcp14>MAY</bcp14> | ||
1942 | be used to deterministically re-randomize | ||
1943 | probabilistic data structures. The specification <bcp14>MUST</bcp14> | ||
1944 | also include the wire format for RF. | ||
1936 | </dd> | 1945 | </dd> |
1937 | <dt>FilterResult(Block, Key, RBF, XQuery) -> (FilterEvaluationResult, RBF')</dt> | 1946 | <dt>FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult, RF')</dt> |
1938 | <dd> | 1947 | <dd> |
1939 | <t> | 1948 | <t> |
1940 | is used to filter results against specific queries. This function | 1949 | is used to filter results against specific queries. This function |
@@ -1952,13 +1961,13 @@ bchar = *(ALPHA / DIGIT) | |||
1952 | <dt>FILTER_LAST</dt> | 1961 | <dt>FILTER_LAST</dt> |
1953 | <dd>Last possible valid result.</dd> | 1962 | <dd>Last possible valid result.</dd> |
1954 | <dt>FILTER_DUPLICATE</dt> | 1963 | <dt>FILTER_DUPLICATE</dt> |
1955 | <dd>Valid result, but duplicate (was filtered by the result Bloom filter).</dd> | 1964 | <dd>Valid result, but duplicate (was filtered by the result filter).</dd> |
1956 | <dt>FILTER_IRRELEVANT</dt> | 1965 | <dt>FILTER_IRRELEVANT</dt> |
1957 | <dd>Block does not satisfy the constraints imposed by the XQuery.</dd> | 1966 | <dd>Block does not satisfy the constraints imposed by the XQuery.</dd> |
1958 | </dl> | 1967 | </dl> |
1959 | <t> | 1968 | <t> |
1960 | If the main evaluation result is <tt>FILTER_MORE</tt>, the function also returns | 1969 | If the main evaluation result is <tt>FILTER_MORE</tt>, the function also returns |
1961 | and updated result Bloom filter where the block is added to the set of | 1970 | an updated result filter where the block is added to the set of |
1962 | filtered replies. An implementation is not expected to actually differentiate | 1971 | filtered replies. An implementation is not expected to actually differentiate |
1963 | between the <tt>FILTER_DUPLICATE</tt> and <tt>FILTER_IRRELEVANT</tt> return | 1972 | between the <tt>FILTER_DUPLICATE</tt> and <tt>FILTER_IRRELEVANT</tt> return |
1964 | values: in both cases the block is ignored for this query. | 1973 | values: in both cases the block is ignored for this query. |
@@ -2110,20 +2119,40 @@ gnunet+tcp://12.3.4.5/ \ | |||
2110 | against the public key from the peer ID field. | 2119 | against the public key from the peer ID field. |
2111 | </t> | 2120 | </t> |
2112 | <t> | 2121 | <t> |
2113 | To filter results of HELLO blocks | 2122 | The result filter for HELLO blocks is implemented using a |
2114 | using the result Bloom filter, the | 2123 | Bloom filter. The <tt>K</tt>-value for the Bloom filter |
2124 | is always 16. The size <tt>S</tt> of the Bloom filter in bytes depends on | ||
2125 | the number of elements <tt>F</tt> known to be filtered at the | ||
2126 | initiator. If <tt>F</tt> is zero, the size <tt>S</tt> is just 8 (bytes). | ||
2127 | Otherwise, <tt>S</tt> is set to the minimum of | ||
2128 | 2<sup>15</sup> and the lowest power | ||
2129 | of 2 that is strictly larger than <tt>K*F/4</tt> | ||
2130 | (in bytes). The wire format of a HELLO block Bloom filter | ||
2131 | is just the resulting byte array. In particular, <tt>K</tt> | ||
2132 | is not transmitted. | ||
2133 | </t> | ||
2134 | <t> | ||
2135 | To filter results of HELLO blocks using the Bloom filter, the | ||
2115 | <tt>H_ADDRS</tt> field (as computed using SHA-512 over | 2136 | <tt>H_ADDRS</tt> field (as computed using SHA-512 over |
2116 | the <tt>ADDRESSES</tt>) is XORed with the SHA-512 | 2137 | the <tt>ADDRESSES</tt>) is XORed with the SHA-512 |
2117 | hash of the mutator (in network byte order). | 2138 | hash of the <tt>MUTATOR</tt> (in network byte order). |
2118 | The resulting value is then used when hashing into the | 2139 | The resulting value is then used when hashing into the |
2119 | result Bloom filter. Consequently, HELLOs with | 2140 | Bloom filter as described in <xref target="bloom_filters" />. |
2120 | completely identical sets of addresses will be | 2141 | Consequently, HELLOs with completely identical sets of |
2121 | filtered, but any small variation in the set of | 2142 | addresses will be filtered, but any small variation in the set of |
2122 | addresses will cause the block to no longer be | 2143 | addresses will cause the block to no longer be |
2123 | filtered (with high probability). The | 2144 | filtered (with high probability). The |
2124 | function thus always returns either | 2145 | function thus always returns either |
2125 | <tt>FILTER_MORE</tt> or <tt>FILTER_DUPLICATE</tt>. | 2146 | <tt>FILTER_MORE</tt> or <tt>FILTER_DUPLICATE</tt>. |
2126 | </t> | 2147 | </t> |
2148 | <t> | ||
2149 | HELLO result filters can be merged if the | ||
2150 | Bloom filters have the same size and | ||
2151 | <tt>MUTATOR</tt> by setting all bits to 1 that are | ||
2152 | set in either Bloom filter. This is done whenever | ||
2153 | a peer receives a query with the same <tt>MUTATOR</tt>, | ||
2154 | predecessor and Bloom filter size. | ||
2155 | </t> | ||
2127 | </section> | 2156 | </section> |
2128 | <section> | 2157 | <section> |
2129 | <name>Persistence</name> | 2158 | <name>Persistence</name> |
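Illustration for the HELLO hunk above (not part of the patch): a minimal Python sketch of the three rules the new text states, namely the filter size S derived from the number of filtered elements F, the value that is hashed into the Bloom filter (H_ADDRS XORed with the SHA-512 hash of the MUTATOR in network byte order), and the merge of two filters by bitwise OR. The function names are hypothetical. The sketch assumes the MUTATOR is serialized as a 4-byte big-endian integer before hashing, and it leaves out the predecessor check mentioned in the merge paragraph.

```python
import hashlib
import struct

K = 16              # fixed K-value for HELLO result filters
MAX_SIZE = 2 ** 15  # upper bound on the filter size in bytes


def hello_filter_size(f: int) -> int:
    """Size S in bytes for F elements known to be filtered at the initiator."""
    if f < 0:
        raise ValueError("F must not be negative")
    if f == 0:
        return 8
    size = 1
    # Lowest power of 2 strictly larger than K*F/4, in integer arithmetic.
    while size * 4 <= K * f:
        size *= 2
    return min(MAX_SIZE, size)


def hello_filter_element(h_addrs: bytes, mutator: int) -> bytes:
    """H_ADDRS XOR SHA-512(MUTATOR); the MUTATOR encoding is an assumption."""
    if len(h_addrs) != 64:
        raise ValueError("H_ADDRS is expected to be a SHA-512 digest")
    mutator_hash = hashlib.sha512(struct.pack(">I", mutator)).digest()
    return bytes(a ^ b for a, b in zip(h_addrs, mutator_hash))


def merge_hello_filters(bf_a: bytes, mutator_a: int,
                        bf_b: bytes, mutator_b: int):
    """Bitwise OR of two filters with equal size and MUTATOR, else None."""
    if len(bf_a) != len(bf_b) or mutator_a != mutator_b:
        return None
    return bytes(a | b for a, b in zip(bf_a, bf_b))


if __name__ == "__main__":
    for f in (0, 1, 2, 4, 10000):
        print(f, hello_filter_size(f))
```

When merge_hello_filters returns None, the pending-table text in the earlier hunk applies: the existing result filter is discarded and replaced with the one from the incoming message.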