diff options
author | Martin Schanzenbach <schanzen@gnunet.org> | 2022-03-12 08:49:52 +0100 |
---|---|---|
committer | Martin Schanzenbach <schanzen@gnunet.org> | 2022-03-12 08:49:52 +0100 |
commit | 79acd5b791f4d6fd5afe6ef00efd2748e3459c53 (patch) | |
tree | 0ac15ca133324703adf175d4790d8b7c6aa236a3 | |
parent | b6a819871911ae0720facad4574f2e4eed9b55f7 (diff) | |
download | lsd0004-79acd5b791f4d6fd5afe6ef00efd2748e3459c53.tar.gz lsd0004-79acd5b791f4d6fd5afe6ef00efd2748e3459c53.zip |
bloom
-rw-r--r-- | draft-schanzen-r5n.xml | 29 |
1 files changed, 24 insertions, 5 deletions
diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml index 9719af5..9bb2d5e 100644 --- a/draft-schanzen-r5n.xml +++ b/draft-schanzen-r5n.xml | |||
@@ -741,18 +741,37 @@ bchar = *(ALPHA / DIGIT) | |||
741 | on this data structure shared by the various use-cases in R<sup>5</sup>N. | 741 | on this data structure shared by the various use-cases in R<sup>5</sup>N. |
742 | </t> | 742 | </t> |
743 | <t> | 743 | <t> |
744 | FIXME: general intro. | 744 | A Bloom filter (BF) is a space-efficient probabilistic datastructure |
745 | to test if an element is part of a set of elements. | ||
746 | Elements are identified by an element ID. | ||
747 | Since a BF is a probabilistic datastructure, it is possible to have | ||
748 | false-positives: when asked if an element is in the set, the answer from | ||
749 | a BF is either "no" or "maybe". | ||
745 | </t> | 750 | </t> |
746 | <t> | 751 | <t> |
747 | Bloom filters are always initially empty, consisting only of zeroes. | 752 | Bloom filters are defined as a string of <tt>N</tt> bits always |
753 | initially empty, consisting only of zeroes. | ||
748 | There are two functions which can be invoked on the Bloom filter: | 754 | There are two functions which can be invoked on the Bloom filter: |
749 | BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to | 755 | BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to |
750 | be added to the Bloom filter or queried against the set. | 756 | be added to the Bloom filter or queried against the set. |
751 | Any Bloom filter uses k=16 different hash functions each of which is | 757 | The mapping function <tt>M</tt> is defined as follows: |
752 | defined as follows: | ||
753 | </t> | 758 | </t> |
754 | <t> | 759 | <t> |
755 | FIXME: actually define the operations... | 760 | The element <tt>e</tt> is hashed using SHA-512. |
761 | The resulting byte string is interpreted as a string of 16 | ||
762 | 32-bit integers in network byte order. | ||
763 | </t> | ||
764 | <t> | ||
765 | When adding an element to the bloom filter <tt>bf</tt> using | ||
766 | <tt>BF-SET(bf,e)</tt>, each number <tt>n</tt> in the mapping | ||
767 | <tt>M(e)</tt> is interpreted as a bit offset within <tt>bf</tt> and set | ||
768 | to 1. | ||
769 | </t> | ||
770 | <t> | ||
771 | When testing if an element is in the bloom filter <tt>bf</tt> using | ||
772 | <tt>BF-TEST(bf,e)</tt>, each number <tt>n</tt> in the mapping | ||
773 | <tt>M(e)</tt> <bcp14>MUST</bcp14> have been set to 1. | ||
774 | Otherwise, the element is not considered to be in the bloom filter. | ||
756 | </t> | 775 | </t> |
757 | </section> | 776 | </section> |
758 | <section anchor="routing" numbered="true" toc="default"> | 777 | <section anchor="routing" numbered="true" toc="default"> |