[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[lsd0004] branch master updated: more on Bloom filters
From: |
gnunet |
Subject: |
[lsd0004] branch master updated: more on Bloom filters |
Date: |
Fri, 11 Mar 2022 05:12:53 +0100 |
This is an automated email from the git hooks/post-receive script.
grothoff pushed a commit to branch master
in repository lsd0004.
The following commit(s) were added to refs/heads/master by this push:
new d1a7123 more on Bloom filters
d1a7123 is described below
commit d1a71233594235411e1a4da75caa71c2bb31f902
Author: Christian Grothoff <grothoff@gnunet.org>
AuthorDate: Fri Mar 11 05:12:48 2022 +0100
more on Bloom filters
---
draft-schanzen-r5n.xml | 41 ++++++++++++++++++++++++++++++++---------
1 file changed, 32 insertions(+), 9 deletions(-)
diff --git a/draft-schanzen-r5n.xml b/draft-schanzen-r5n.xml
index f998220..41b20ca 100644
--- a/draft-schanzen-r5n.xml
+++ b/draft-schanzen-r5n.xml
@@ -864,7 +864,10 @@ bchar = *(ALPHA / DIGIT)
</t>
<t>
The peer Bloom filter is always a 128-bit field. The elements
- hashed into the Bloom filter are the 32 byte peer IDs.
+ hashed into the Bloom filter are the 32 byte peer IDs. We note
+ that the peer Bloom filter may exclude peers due to false-postive
+ matches. This is acceptable as routing should nevertheless
+ terminate (with high probability) in close vicinity of the key.
</t>
</section>
<section anchor="routing_functions">
@@ -970,16 +973,36 @@ bchar = *(ALPHA / DIGIT)
Any peer which is processing GET messages and has a result
which matches the query key <bcp14>MUST</bcp14> check the result
Bloom filter
and only send a reply message if the block key is not in the
- Bloom filter set.
+ Bloom filter set. <!-- FIXME: is this strictly always the block key?
+ I think this is block-type dependent! -->
</t>
<t>
- The Bloom filter is a 128-bit field.
- It is initially empty, consisting only of zeroes.
- There are two functions which can be invoked on the Bloom filter:
- BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to
- be added to the Bloom filter or queried against the set.
- Any Bloom filter uses k=16 different hash functions each of which is
- defined as follows:
+ FIXME: say something about the size of the result Bloom filter here!
+ </t>
+ <t>
+ Bloom filters are probabilistic data structures. Thus, especially
+ given the small size of the result Bloom filter, it is always possible
+ that a desireable result is filtered by the Bloom filter because of
+ a false-positive match created by a collision in the hash values
+ between the desireable result and filtered items.
+ </t>
+ <t>
+ To address this problem, R<sup>5</sup>N uses a mutator value
+ to additionally randomize the process
+ when hashing results into the result Bloom filter. The mutator
+ value is set by the peer initiating the GET request, and changed
+ every time the GET request is repeated by the initiator. Peers
+ forwarding GET requests <bcp14>MUST</bcp14> not change the
+ mutator value included in the GET message as they could not
+ recalculate the Bloom filter with the new mutator value.
+ </t>
+ <t>
+ By including the mutator value in the hashing process, repeated
+ requests have statistically independent probabilities of creating
+ collisions in the Bloom filter. Thus, even if for one request
+ a Bloom filter collision may exclude a result as a false-positive
+ match, subsequent requests are likely to not have the same
+ false-positives.
</t>
</section>
<section anchor="p2p_xq" numbered="true" toc="default">
--
To stop receiving notification emails like this one, please contact
gnunet@gnunet.org.
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [lsd0004] branch master updated: more on Bloom filters,
gnunet <=