[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[lsd0003] branch master updated: Improved RFC
From: |
gnunet |
Subject: |
[lsd0003] branch master updated: Improved RFC |
Date: |
Tue, 11 May 2021 09:01:31 +0200 |
This is an automated email from the git hooks/post-receive script.
elias-summermatter pushed a commit to branch master
in repository lsd0003.
The following commit(s) were added to refs/heads/master by this push:
new 7c6bef1 Improved RFC
new 029d57f Merge branch 'master' of git+ssh://gnunet.org/lsd0003
7c6bef1 is described below
commit 7c6bef140c96746cb3be4e2b2907cc6fee52b70e
Author: Elias Summermatter <elias.summermatter@seccom.ch>
AuthorDate: Tue May 11 08:58:28 2021 +0200
Improved RFC
---
draft-summermatter-set-union.xml | 590 +++++++++++++++++++++++++++------------
1 file changed, 404 insertions(+), 186 deletions(-)
diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml
index 8ae00af..9074380 100644
--- a/draft-summermatter-set-union.xml
+++ b/draft-summermatter-set-union.xml
@@ -106,7 +106,7 @@
</t>
<t>
The protocol described in this report is generic and
- suitable for a wide range of applicaitons. As a result,
+ suitable for a wide range of applications. As a result,
the internal structure of the elements in the sets must
be defined and verified by the application using the
protocol. This document thus does not cover the element
@@ -121,9 +121,9 @@
using the protocol must provide a parameter that specifies
the cost-ratio of round-trips vs. bandwidth usage. Given
this trade-off factor, the protocol will then choose parameters
- that minimize the total execution cost. In particular, there
- is one major choice to be made, which is between sending the
- full set of elements, or just sending the elements that differ.
+ that minimize the total execution costs. In particular, there
+ is one major choice to be made, namely between sending the
+ complete set of elements, or sending only the elements that
differ.
In the latter case, our design is basically a concrete
implementation of a proposal by Eppstein.<xref target="Eppstein"
format="default" />
</t>
@@ -152,7 +152,7 @@
<t>
To defend against some of these attacks, applications
need to remember the number of elements previously
- shared with a peer, and offer a means to check that
+ shared with a peer, and provide a way to check that
elements are well-formed. Applications may also be able
to provide an upper bound on the total number of valid
elements that may exist. For example, in E-voting, the
@@ -179,14 +179,14 @@
<section anchor="bf" numbered="true" toc="default">
<name>Bloom Filters</name>
<t>
- A Bloom filter (BF) is a space-efficient datastructure to
test if am element is part of a set of elements.
+ A Bloom filter (BF) is a space-efficient datastructure to
test if an element is part of a set of elements.
Elements are identified by an element ID.
Since a BF is a probabilistic datastructure, it is
possible to have false-positives: when asked
if an element is in the set, the answer from a BF is
either "no" or "maybe".
</t>
<t>
A BF consists of L buckets. Every bucket is a binary value
that can be either 0 or 1. All buckets are initialized
- to 0. A mapping function M is used to map each the ID of
each element from the set to a subset of k buckets. M is non-injective
+ to 0. A mapping function M is used to map each ID of each
element from the set to a subset of k buckets. M is non-injective
and can thus map the same element multiple times to the
same bucket.
The type of the mapping function can thus be described by
the following mathematical notation:
</t>
@@ -214,7 +214,7 @@
To check if an element may be in the set, one tests if all
buckets under the map M are set to 1.
</t>
<t>
- Further in this document a bitstream outputted by the
mapping function is represented by
+ Further in this document a bitstream output by the mapping
function is represented by
a set of numeric values for example (0101) = (2,4).
In the BF the buckets are set to 1 if the corresponding
bit in the bitstream is 1.
If there is a collision and a bucket is already set to 1,
the bucket stays 1.
@@ -231,8 +231,8 @@
]]></artwork>
</figure>
<t>
- Is easy to see that the M(element) = (0,3) could be in the
BF bellow and M(element) = (0,2) can't be
- in the BF bellow:
+ Is easy to see that the M(element) = (0,3) could be in the
BF below and M(element) = (0,2) can't be
+ in the BF below:
</t>
<figure anchor="figure_bf_contains">
@@ -259,7 +259,7 @@
<name>Counting Bloom Filter</name>
<t>
A Counting Bloom Filter (CBF) is an extension of the <xref
target="bf" format="title" />. In the CBF, buckets are
- unsigned numbers instead of binary values. This allows the
removal of an elements from the CBF.
+ unsigned numbers instead of binary values. This allows the
removal of an element from the CBF.
</t>
<t>
Adding an element to the CBF is similar to the adding
operation of the BF. However, instead of setting the bucket on hit to 1 the
@@ -294,7 +294,7 @@
</figure>
<t>
In practice, the number of bits available for the counters
is usually finite. For example, given a 4-bit
- counter, a CBF bucket would overflow once 16 elements are
mapped to the same bucket. To efficiently
+ counter, a CBF bucket would overflow 16 elements are mapped
to the same bucket. To efficiently
handle this case, the maximum value (15 in our example) is
considered to represent "infinity". Once the
order of a bucket reaches "infinity", it is no longer
incremented or decremented.
</t>
@@ -328,7 +328,7 @@
including the CPU architecture.
</t>
<t>
- If the IBF size is to small or the mapping
+ If the IBF size is too small or the mapping
function does not spread out the elements
uniformly, the signed counter can overflow or
underflow. As with the CBF, the "maximum" value is
@@ -456,11 +456,11 @@ hashSum | 0x0000 | 0x4242 | 0x0000 |
0x4242 |
Note that it is possible to "remove" elements from an
IBF that were never present
in the IBF in the first place. A negative counter value
is thus indicative of
elements that were removed without having been added.
Note that an IBF bucket
- counter of zero no longer warrants that an element
mapped to that bucket is not
+ counter of zero no longer guarantees that an element
mapped to that bucket is not
present in the set: a bucket with a counter of zero can
be the result of one
element being added and a different element (mapped to
the same bucket) being removed.
To check that an element is not present requires a
counter of zero and an
- IDSUM and HASHSUM of zero --- and some assurance that
there was no collision due
+ IDSUM and HASHSUM of zero --- and some certainty that
there was no collision due
to the limited number of bits in IDSUM and HASHSUM.
Thus,
IBFs are not suitable to replace BFs or IBFs.
</t>
@@ -504,7 +504,7 @@ hashSum | 0x0000 | 0x4242 | 0x0000 |
0x4242 |
a different mapping function M.
A more common scenario (especially if L was too small)
is that
IBF decoding fails because there is no pure bucket. In
this case, the
- higher-level protocol also should retry using
different parameters.
+ higher-level protocol should also retry using
different parameters.
</t>
<t>
Suppose the IBF contains a pure bucket. In this case,
the IDSUM in the
@@ -513,9 +513,9 @@ hashSum | 0x0000 | 0x4242 | 0x0000 |
0x4242 |
was negative, and by removing it if the counter was
positive). This
is likely to cause other buckets to become pure,
allowing further
elements to be decoded. Eventually, decoding should
succeed with
- all counters and IDSUM and HASHSUM values reaching
zero. However, it is also
+ all counters and IDSUM and HASHSUM values reach zero.
However, it is also
possible that an IBF only partly decodes and then
decoding fails after
- yielding some elements.
+ obtaining some elements.
</t>
<t>
In the following example the successful decoding of an
IBF containing
@@ -554,7 +554,7 @@ hashSum | 0x0000 | 0x4242 | 0x0000 |
0x4242 |
</figure>
<t>
In the IBF only pure buckets are left, we choose to
continue decoding bucket 2 and decode element
- with the hash 0x4242. Now the IBF is empty (all
buckets have count 0) that means the IBF has successfully
+ with the hash 0x4242. Now the IBF is empty (all
buckets have count 0) that means the IBF has been successfully
decoded.
</t>
<figure anchor="figure_ibf_decode_2">
@@ -577,7 +577,7 @@ hashSum | 0x0000 | 0x0000 | 0x0000 |
0x0000 |
Given addition and removal as defined above, it is
possible to define an operation on IBFs
that computes an IBF representing the set difference.
Suppose IBF1 represents set A, and
IBF2 represents set B. Then this set difference
operation will compute IBF3 which
- represents the set A - B --- without needing elements
from set A or B.
+ represents the set A - B --- without having to
transfer the elements from set A or B.
To calculate the IBF representing this set difference,
both IBFs must have the same
length L, the same number of buckets per element k and
use the same map M. Given this,
@@ -635,7 +635,7 @@ hashSum | 0x0101 | 0x0101 | 0x5050 |
0x0000 |
+-------------+-------------+-------------+-------------+
]]></artwork>
</figure>
- <t>After calculating and decoding the IBF-AB its clear that in
IBF-A the element with the hash 0x5050
+ <t>After calculating and decoding the IBF-AB shows clear that
in IBF-A the element with the hash 0x5050
is missing (-1 in bit-3) while in IBF-B the element with
the hash 0101 is missing
(1 in bit-1 and bit-2). The element with hash 0x4242 is
present in IBF-A and IBF-B and is
removed by the set difference operation (bit-4).
@@ -648,8 +648,8 @@ hashSum | 0x0101 | 0x0101 | 0x5050 |
0x0000 |
<t>
To facilitate a reasonably CPU-efficient
implementation, this specification requires the
- IBF counter to always use 8 bits. Fewer bits
- would result in a paritcularly inefficient
+ IBF counter always to use 8 bits. Fewer bits
+ would result in a particularly inefficient
implementation, while more bits are rarely useful
as sets with so many elements should likely be
represented using a larger number of buckets. This
@@ -672,17 +672,17 @@ hashSum | 0x0101 | 0x0101 | 0x5050 |
0x0000 |
</t>
<t>
For the "HASHSUM", we always use a 32-bit
- representation. Here, it is mostly important to
+ representation. Here, it is most important to
avoid collisions, where different elements are
mapped to the same hash. However, we note that
by design only a few elements (certainly less than
127) should ever be mapped
- to the same bucket, so a small number of bits
+ to the same bucket, a small number of bits
should suffice. Furthermore, our protocol is designed
to handle occasional collisions, so while with
32-bits there remains a chance of
accidental collisions, at 32 bit the chance is
- generally believed to be sufficiently small enough
+ generally believed to be sufficiently small
for the protocol to handle those cases efficiently
for a wide range of use-cases. Smaller hash
values would safe bandwidth, but also drastically
@@ -697,7 +697,7 @@ hashSum | 0x0101 | 0x0101 | 0x5050 |
0x0000 |
with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF and
salt is set to the unsigned 64-bit equivalent of 0.
The output is then truncated to 64-bit.
Its important that the elements can be redistributed
over the buckets in case the IBF does not
- decode, that's why the ID is salted with a random salt
given in the SALT field of this message.
+ decode. That's why the ID is salted with a random salt
given in the SALT field of this message.
Salting is done by calculation the a random salt
modulo 64 (using only the lowest 6-bits of the salt)
and do a bitwise right rotation of output of KDF by
the 6-bit salts numeric representation.
</t>
@@ -743,22 +743,22 @@ FUNCTION id_calculation (element,ibf_salt):
<t>
The mapping function M as described above in the
figure <xref target="bf_mapping_function_math" format="default" />
decides in which buckets the ID and HASH have to be
binary XORed to. In practice
- there the following algorithm is used:
+ the following algorithm is used:
</t>
<t>
The first index is simply the HASH modulo the IBF
size. The second
index is calculated by creating a new 64-bit value by
shifting the 32-bit
value left and setting the lower 32-bit to the number
of indexes already processed. From the
- resulting 64-bit value a CRC32 checksum is created the
second index is now the modulo of the
- CRC32 output this is repeated until the predefined
amount indexes is generated.
+ resulting 64-bit value a CRC32 checksum is created.
The second index is now the modulo of the
+ CRC32 output, this is repeated until the predefined
amount of indexes is generated.
In the case a index is hit twice, which would mean
this bucket could not get pure again,
- the second hit is just skipped and the next iteration
is used as.
+ the second hit is just skipped and the next iteration
is used.
</t>
<figure
anchor="ibf_format_bucket_identification_pseudo_code">
<artwork name="" type="" align="left" alt=""><![CDATA[
# INPUTS:
# key: Is the ID of the element calculated in the id_calculation function
above.
-# number_of_buckets_per_element: Pre-defined count of buckets elements are
inserted into
+# number_of_buckets_per_element: Pre-defined numbers of buckets elements are
inserted into
# ibf_size: the size of the ibf (count of buckets)
# OUTPUT:
# dst: Array with bucket IDs to insert ID and HASH
@@ -807,7 +807,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element,
ibf_size)
<section anchor="se_description" numbered="true" toc="default">
<name>Description</name>
<t>
- Strata Estimators help estimate the size of the set
difference between two set of elements.
+ Strata Estimators help estimate the size of the set
difference between two sets of elements.
This is necessary to efficiently determinate the tuning
parameters for an IBF, in particular
a good value for L.
</t>
@@ -845,7 +845,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element,
ibf_size)
<t>
The simplest mode is the "full" synchronization mode. The idea
is that if the difference between the sets of the two
peers exceeds a certain threshold, the overhead to determine
which elements are different outweighs
- the overhead of sending the complete set. In this case, the
most efficient method can be to just
+ the overhead of sending the complete set. In this case, the
most efficient method can be just to
exchange the full sets.
</t>
<t>
@@ -854,7 +854,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element,
ibf_size)
</t>
<t>
The second possibility is that the difference of the sets is
small compared to the set size.
- Here, an efficient "delta" synchronization mode is more
efficient. Given these two possibilities,
+ In this case, an efficient "delta" synchronization mode is
more efficient. These two possibilities given,
the first steps of the protocol are used to determine which
mode should be used.
</t>
@@ -920,7 +920,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element,
ibf_size)
<em><xref target="messages_ibf" format="title" /></em>
message from
the initiating peer and transitions into the
<strong>Expecting IBF Last</strong> state when there
are multiple <em><xref target="messages_ibf"
format="title" /></em> messages to sent,
- when there is just a single <em><xref
target="messages_ibf" format="title" /></em> message the reviving peer
+ when there is just a single <em><xref
target="messages_ibf" format="title" /></em> message the receiving peer
transitions directly to the <strong>Active
Decoding</strong> state.
</t>
<t>
@@ -975,8 +975,8 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element,
ibf_size)
When a new element message has been received
the peer checks if a corresponding
<em><xref target="messages_demand"
format="title" /></em> for the element has been sent
and the demand is still unsatisfied.
- If the element has been demanded the peer
checks the element for validity, removed it from the list
- of pending demands and then then saves the
element to the the set otherwise the peer
+ If the element has been demanded the peer
checks the element for validity, removes it from the list
+ of pending demands and then saves the element
to the set otherwise the peer
rejects the element.
</dd>
<dt><em><xref target="messages_ibf" format="title"
/></em> message:</dt>
@@ -990,7 +990,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element,
ibf_size)
<dt><em><xref target="messages_ibf_last"
format="title" /></em> message:</dt>
<dd>
If an <em><xref target="messages_ibf_last"
format="title" /></em> message is received this
- indicates that the there is just one IBF slice
and a direct state and role transition from
+ indicates that there is just one IBF slice
left and a direct state and role transition from
<strong>Passive Decoding</strong> to
<strong>Active Decoding</strong> is initiated.
</dd>
<dt><em><xref target="messages_done"
format="title" /></em> message:</dt>
@@ -1017,7 +1017,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
sends a <em><xref target="messages_inquiry"
format="title" /></em> to the passive peer.
</t>
<t>
- In case the IBF does not successfully decode
anymore, the active peer sends a new IBF to the passive client
+ In case the IBF does not successfully decode
anymore, the active peer sends a new IBF to the passive peer
and changes into <strong>Passive Decoding</strong>
state. This initiates a role swap.
To reduce overhead and prevent double transmission
of offers and elements the new IBF is created
on the new complete set after all demands and
inquiries have been satisfied.
@@ -1039,8 +1039,8 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
the active peer. If a Inquiry has been sent
and <!-- FIXME: is this indeed a condition that is checked? -->
the offered element is missing in the active
peers set,
the active peer sends a <em><xref
target="messages_demand" format="title" /></em> message to the
- passive peer. The send demand need to be added
to a list with unsatisfied demands.
- In the case the received offer is for an
element that is already in the set of the peer the offer is ignored.
+ passive peer. The sent demand needs to be
added to a list with unsatisfied demands.
+ In case the received offer is for an element
that is already in the set of the peer the offer is ignored.
<!-- FIXME: what happens if the offer is for
an element that is not missing? I think then we just ignore it, right? -->
</dd>
<dt><em><xref target="messages_demand"
format="title" /></em> message:</dt>
@@ -1051,9 +1051,9 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<em><xref target="messages_elements"
format="title" /></em> message if a offer request
for the element has been sent.
<!-- IMPLEMENT: This is not implemented in
code // Change -->
- In the case the demanded element does not
exist in the
- set there was probably a bucket decoded that
was not really pure so potentially all <em><xref target="messages_offer"
format="title" /></em>
- and <em><xref target="messages_demand"
format="title" /></em> messages sent after are invalid
+ In case the demanded element does not exist in
the
+ set there was probably a bucket decoded that
was not pure so potentially all <em><xref target="messages_offer"
format="title" /></em>
+ and <em><xref target="messages_demand"
format="title" /></em> messages sent later are invalid
in this case a role change active -> passive
with a new IBF is easiest.
If a demand for the same element is received
multiple times the demands should be
discarded.
@@ -1062,18 +1062,18 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</dd>
<dt><em><xref target="messages_elements"
format="title" /></em> message:</dt>
<dd>
- A element that is received is marked in the
list of demanded elements as satisfied, validated and
- saved and not further action is taken.
+ An element that is received is marked in the
list of demanded elements as satisfied, validated and
+ saved and no further action is taken.
Elements that are not demanded or already
known are discarded.
</dd>
<dt><em><xref target="messages_done"
format="title" /></em> message:</dt>
<dd>
Receiving the message <em><xref
target="messages_done" format="title" /></em> indicates
that all demands of the passive peer have been
satisfied. The active peer then changes into the
- state <strong>Finish Closing</strong> state.
+ <strong>Finish Closing</strong> state.
<!-- IMPLEMENT: This is not implemented in
code // Change -->
- If the IBF is not finished decoding and the
<em><xref target="messages_done" format="title" /></em>
- is received the other peer is not in
compliance with the protocol and the set reconciliation MUST be aborted.
+ If the IBF has not finished decoding and the
<em><xref target="messages_done" format="title" /></em>
+ is received, the other peer is not in
compliance with the protocol and the set reconciliation MUST be aborted.
<!-- IMPLEMENT: This is not implemented in
code // Change -->
</dd>
</dl>
@@ -1090,7 +1090,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dd>
<t>
In this states the peers are waiting for all
demands to be satisfied and for the synchronisation
- to be completed. When all demands are satisfied
the peer changes into state <strong>Finished</strong>.
+ to be completed. When all demands are satisfied
the peer changes into <strong>Finished</strong>state.
</t>
</dd>
</dl>
@@ -1104,7 +1104,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</t>
<t>
The <xref target="modeofoperation_individual-elements"
format="title" /> is only efficient on small set
- differences or if the byte-size of the elements is large.
Is the set difference is estimated to be large
+ differences or if the byte-size of the elements is large.
If the set difference is estimated to be large
the <xref target="modeofoperation_full-sync"
format="title" /> is
more efficient. The exact heuristics and parameters on
which the protocol decides which mode
should be used are described in the <xref
target="performance" format="title" /> section of this document.
@@ -1114,7 +1114,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
is always used.
The first case is when one of the peers announces having
an empty set. This is announced by setting
the SETSIZE field in the <em><xref target="messages_se"
format="title" /></em> to 0.
- The second case is if the application requested full
synchronization explicitly.
+ The second case is if the application requests full
synchronization explicitly.
This is useful for testing and should not be used in
production.
</t>
<!--
@@ -1169,7 +1169,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1209,7 +1209,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
The <em>IBF</em> message is sent at the start of the
protocol from the initiating peer in the transaction
between <strong>Expect SE</strong> ->
<strong>Expecting IBF Last</strong> or when the IBF does not
decode and there is a role change in the transition
between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>.
- This message is only sent if there are more than one
IBF slice to sent, in the case there is just
+ This message is only sent if there are more than one
IBF slice to be sent, in case there is just
one slice the <xref target="messages_ibf_last"
format="title" /> message is sent.
</t>
</section>
@@ -1219,12 +1219,12 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<artwork name="" type="" align="left" alt=""><![CDATA[
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
- | MSG SIZE | MSG TYPE |ORDER| PAD |
+ | MSG SIZE | MSG TYPE | SIZE |
+-----+-----+-----+-----+-----+-----+-----+-----+
- | OFFSET | SALT |
+ |IMCS | OFFSET | SALT |
+-----+-----+-----+-----+-----+-----+-----+-----+
- | IBF-SLICE
- + /
+ | IBF-SLICE
+ +----- /
/ /
/ /
]]></artwork>
@@ -1233,21 +1233,25 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte
orderwhichdescribes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
the type of SETU_P2P_REQUEST_IBF as registered in
<xref target="gana" format="title" /> in network byte order.
</dd>
- <dt>ORDER</dt>
+
+ <dt>SIZE</dt>
<dd>
- is a 8-bit unsigned integer which signals the
order of the IBF. The order of the IBF
- is defined as the logarithm of the number of
buckets of the IBF.
+ is a 32-bit unsigned integer which signals the
number of buckets in the IBF.
</dd>
- <dt>PAD</dt>
+ <dt>IMCS</dt>
<dd>
- is 24-bit always set to zero
+ IBF max counter size is a 8-bit unsigned integer
which describes the number of bit that
+ is required to store a single counter. This is
used for the unpacking function as described
+ in the <xref
target="performance_counter_variable_size" format="title" /> section.
</dd>
+
+
<dt>OFFSET</dt>
<dd>
is a 32-bit unsigned integer which signals the
offset to the following ibf slices in the original.
@@ -1259,16 +1263,18 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</dd>
<dt>IBF-SLICE</dt>
<dd>
+
<t>
- are variable count of slices in an array. A
single slice contains out multiple 64-bit IDSUMS,
- 32-bit HASHSUMS and 8-bit COUNTERS. In the
network order the array of IDSUMS is first, followed
+ are variable numbers of slices in an array. A
single slice contains multiple 64-bit IDSUMS,
+ 32-bit HASHSUMS and 1-64bit COUNTERS of
variable size. In the network order the array of IDSUMS is first, followed
by an array of HASHSUMS and ended with an
array of COUNTERS. Length of the array is defined
- by MIN( 2^ORDER - OFFSET,
MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
+ by MIN( SIZE - OFFSET,
MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
32768 divided by the BUCKET_SIZE which is
13-byte (104-bit).
</t>
+
<t>
- To get the IDSUM field, all IDs who hit a
bucket are added up with a binary XOR operation.
- See <xref target="ibf_format_id_generation"
format="title" /> for details about ID generation.
+ To get the IDSUM field, all IDs hitting a
bucket are added up with a binary XOR operation.
+ See <xref target="ibf_format_id_generation"
format="title" /> details about ID generation.
</t>
<t>
The calculation of the HASHSUM field is done
accordingly to the calculation of the IDSUM field:
@@ -1279,7 +1285,11 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
The algorithm to find the correct bucket in
which the ID and the HASH have to be added
is described in detail in section <xref
target="ibf_format_bucket_identification" format="title" />.
</t>
-
+
+ <t>
+ Test vectors for an implementation can be
found in the <xref target="test_vectors" format="title" /> section
+ </t>
+
<!--
FIXME: this is not sufficiently precise! How are
the element IDs (and IDSUMS) computed?
How are the HASHes (and HASHSUMS) computed? Which
byte order is used? What role does
@@ -1296,12 +1306,13 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
+-----+-----+-----+-----+-----+-----+-----+-----+
| IDSUMS |
+-----+-----+-----+-----+-----+-----+-----+-----+
- | HASHSUMS | HASHSUMS |
+ | HASHSUMS | HASHSUMS |
+-----+-----+-----+-----+-----+-----+-----+-----+
- | COUNTERS | COUNTERS |
+ | COUNTERS* | COUNTERS* |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ /
/ /
+* Counter size is variable. In this example the size is 32-bit (4-byte)
]]></artwork>
</figure>
</section>
@@ -1313,7 +1324,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<section anchor="messages_ibf_last_description"
numbered="true" toc="default">
<name>Description</name>
<t>
- This message indicates to the remote peer that all
slices of the bloom filter have been sent.
+ This message indicates the remote peer that all slices
of the bloom filter have been sent.
The binary structure is exactly the same as the <xref
target="messages_ibf_structure" format="title" /> of
the message <xref target="messages_ibf" format="title"
/> with a different "MSG TYPE"
which is defined in <xref target="gana" format="title"
/> "SETU_P2P_IBF_LAST".
@@ -1343,7 +1354,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
as answer to a <em><xref target="messages_demand"
format="title" /></em> message from the remote peer.
The Element message can also be received in the
<strong>Finish Closing</strong> or <strong>Finish Waiting</strong>
state after receiving a <em><xref
target="messages_done" format="title" /></em> message from the remote peer, in
this
- case the client changes to the
<strong>Finished</strong> state as soon as all demands for elements have been
satisfied.
+ case the peer changes to the <strong>Finished</strong>
state as soon as all demands for elements have been satisfied.
</t>
<t>
This message is exclusively sent in the <xref
target="modeofoperation_individual-elements" format="title" />.
@@ -1367,7 +1378,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1375,7 +1386,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</dd>
<dt>E TYPE</dt>
<dd>
- element type is a 16-bit unsigned integer witch
defines the element type for
+ element type is a 16-bit unsigned integer which
defines the element type for
the application.
</dd>
<dt>PADDING</dt>
@@ -1384,7 +1395,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</dd>
<dt>E SIZE</dt>
<dd>
- element size is 16-bit unsigned integer that
signals the size of the elements data part.
+ element size is a 16-bit unsigned integer that
signals the size of the elements data part.
</dd>
<dt>AE TYPE</dt>
<dd>
@@ -1408,7 +1419,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
The offer message is an answer to an <em><xref
target="messages_inquiry" format="title" /></em> message
and transmits the full hash of an element that has
been requested by the other peer.
This full hash enables the other peer to check if the
element is really missing in its set and
- eventually sends a <em><xref target="messages_demand"
format="title" /></em> message for that a element.
+ eventually sends a <em><xref target="messages_demand"
format="title" /></em> message for that element.
</t>
<t>
The offer is sent and received only in the
<strong>Active Decoding</strong> and in the <strong>Passive Decoding</strong>
@@ -1434,7 +1445,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1482,7 +1493,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1503,7 +1514,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<name>Description</name>
<t>
The demand message is sent in the <strong>Active
Decoding</strong> and in the <strong>Passive Decoding</strong>
- state. It is a answer to a received <em><xref
target="messages_offer" format="title" /></em> message
+ state. It is an answer to a received <em><xref
target="messages_offer" format="title" /></em> message
and is sent if the element described in the <em><xref
target="messages_offer" format="title" /></em> message
is missing in the peers set. In the normal workflow
the answer to the demand message is an
<em><xref target="messages_elements" format="title"
/></em> message.
@@ -1528,7 +1539,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and the header is included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1576,7 +1587,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1599,7 +1610,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<name>Description</name>
<t>
The full done message is sent in the <xref
target="modeofoperation_full-sync" format="title" />
- to signal that all remaining elements of the set have
been sent. The message is received and sent in in the
+ to signal that all remaining elements of the set have
been sent. The message is received and sent in the
<strong>Full Sending</strong> and in the <strong>Full
Receiving</strong> state. When the full done message is received
in <strong>Full Sending</strong> state the peer
changes directly into <strong>Finished</strong> state. In
<strong>Full Receiving</strong> state receiving a full
done message initiates the sending of
@@ -1622,7 +1633,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1668,7 +1679,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1684,7 +1695,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<section anchor="messages_se_description" numbered="true"
toc="default">
<name>Description</name>
<t>
- The strata estimator is sent by the receiving peer at
the start of the protocol right after the
+ The strata estimator is sent by the receiving peer at
the start of the protocol, right after the
<xref target="messages_operation_request"
format="title" /> message has been received.
</t>
<t>
@@ -1714,7 +1725,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1743,7 +1754,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</t>
<t>
Since the content of the message is the same as the
uncompressed Strata Estimator, the details
- aren't repeated here for details see section <xref
target="messages_se" format="counter" />.
+ are not repeated here for details see section <xref
target="messages_se" format="counter" />.
</t>
</section>
</section>
@@ -1765,7 +1776,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<strong>Full Receiving</strong> state.
</t>
<t>
- After the last full element messages has been sent the
<xref target="messages_full_done" format="title" /> message
+ After the last full element message has been sent the
<xref target="messages_full_done" format="title" /> message
is sent to conclude the full synchronisation of the
element sending peer.
</t>
</section>
@@ -1787,7 +1798,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
<dl>
<dt>MSG SIZE</dt>
<dd>
- is 16-bit unsigned integer in network byte order
witch describes the message size in bytes and the header is included.
+ is a 16-bit unsigned integer in network byte order
which describes the message size in bytes and header included.
</dd>
<dt>MSG TYPE</dt>
<dd>
@@ -1795,7 +1806,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</dd>
<dt>E TYPE</dt>
<dd>
- element type is a 16-bit unsigned integer witch
defines the element type for
+ element type is a 16-bit unsigned integer which
defines the element type for
the application.
</dd>
<dt>PADDING</dt>
@@ -1804,7 +1815,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
</dd>
<dt>E SIZE</dt>
<dd>
- element size is 16-bit unsigned integer that
signals the size of the elements data part.
+ element size is a 16-bit unsigned integer that
signals the size of the elements data part.
</dd>
<dt>AE TYPE</dt>
<dd>
@@ -1831,7 +1842,7 @@ FUNCTION get_bucket_id (key,
number_of_buckets_per_element, ibf_size)
The decision which mode of operations is used is
described by the following code:
</t>
<t>
- The function takes as input the initial the local
setsize, the remote setsize, the
+ The function takes as input the initial local setsize,
the remote setsize, the
by the strata estimator calculated difference, a
static boolean that enforces full
synchronisation mode of operation and the
bandwith/roundtrips tradeoff.
As output the function returns "FULL" if the full
synchronisation
@@ -1862,9 +1873,9 @@ FUNCTION decide_operation_mode(initial_local_setsize,
remote_setsize, set_diff,
</figure>
</section>
<section
anchor="performance_formulas_full_sending_dec_first_send" numbered="true"
toc="default">
- <name>Full Synchronisation: Decision witch peer sends
elements first</name>
+ <name>Full Synchronisation: Decision which peer sends
elements first</name>
<t>
- The following function determinate which peer starts
sending its full set in full synchronisation
+ The following function determinates which peer starts
sending its full set in full synchronisation
mode of operation.
</t>
<t>
@@ -1892,11 +1903,11 @@ FUNCTION decide_full_sending(initial_local_size,
remote_setsize)
<section anchor="performance_formula_ibf_parameters"
numbered="true" toc="default">
<name>IBF Parameters</name>
<t>
- The following function calculate the required
parameter to create an optimal sized IBF. These
- parameter are the number of buckets and the number of
buckets a single element is mapped to.
+ The following function calculates the required
parameter to create an optimal sized IBF. These
+ parameters are the number of buckets and the number of
buckets a single element is mapped to.
</t>
<t>
- The function takes as input the setsize and returns a
array with two numbers the total number of buckets
+ The function takes as input the setsize and returns an
array with two numbers, the total number of buckets
and the number of buckets a single element is mapped
to.
</t>
<figure anchor="performance_formula_ibf_parameters_code">
@@ -1907,14 +1918,163 @@ FUNCTION decide_full_sending(initial_local_size,
remote_setsize)
# returns: Array: first element total nr ob buckets and
# second element number of buckets per element
-FUNCTION calculate_ibf_params (setsize):
+FUNCTION calculate_ibf_params (set_diff):
number_of_bucket_per_element = 4
- total_number_of_buckets = setsize
+ total_number_of_buckets = set_diff
return [ total_number_of_buckets, number_of_bucket_per_element ]
]]></artwork>
</figure>
</section>
</section>
+
+ <section anchor="performance_counter_variable_size"
numbered="true" toc="default">
+ <name>Counter variable size</name>
+ <t>
+ Since the optimal number of bytes a counter in the IBF
contains is very variable and varies
+ due to different parameters. Details are described in the
BSC thesis
+ by Elias Summermatter, BFH 2021.
+ <!-- TODO link Thesis -->
+ Therefore a compression algorithm has been implemented,
which always creates the IBF counter in optimal size.
+ and thus minimizes the bandwidth needed to transmit the
IBF.
+ </t>
+ <t>
+ A simple algorithm is used for the compression. In a first
step it is determined, which is the largest counter
+ and how many bits are needed to store it. In a second step
for every counter of every bucket the counter
+ is stored in the bits determined in the first step and
these are concatenated.
+ </t>
+ <t>
+ Three individual functions are used for this purpose. The
first one is a function that iterates over each bucket of the
+ bucket of the IBF to get the maximum counter in the IBF.
As second it needs
+ a function that compresses the counter of the IBF and
thirdly a function that decompresses the IBF.
+ </t>
+ <figure anchor="performance_counter_variable_size_code">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+
+# INPUTS:
+# ibf: The IBF
+# OUTPUTS:
+# returns: minimal amount of bytes required to store the counter
+
+FUNCTION ibf_get_max_counter(ibf)
+ max_counter=0
+ FOR bucket IN ibf
+ IF bucket.counter > max_counter
+ max_counter = bucket.counter
+
+ RETURN CEILING( log2 ( max_counter ) ) # next bigger discrete number of
the binary logarithm of the max counter
+
+# INPUTS:
+# ibf: The IBF
+# offset: The offset which defines the starting point from which bucket the
compress operation should start
+# count: The number of buckets to in the array that should be compressed
+# OUTPUTS:
+# returns: An byte array of compressed counters to send over the network
+
+FUNCTION pack_counter(ibf, offset, count)
+ counter_bytes = ibf_get_max_counter(ibf)
+ store = 0
+ store_bits = 0
+ byte_ctr = 0
+ buffer=[]
+
+ FOR bucket IN ibf[offset] to ibf[count]
+ byte_len = counter_bytes
+ counter = bucket.counter
+
+ WHILE byte_len > 0
+ byte_to_write = 0
+
+ IF counter_bytes + store_bits >= 8
+ bit_to_shift=0
+
+ IF store_bits > 0 OR counter_bytes > 8
+ bit_free = 8 - store_bits
+ bit_to_shift = counter_bytes - bit_free
+ store = store << bit_free
+
+ byte_to_write = (( counter >> bit_to_shift) | store) & 0xFF
+ bit_to_shift -= 8 - store_bits
+ counter = counter & (( 1 << counter_bytes ) - 1)
+ store = 0
+ store_bits = 0
+
+ ELSE
+ IF 0 == store_bits
+ store = counter
+ ELSE
+ store = (store << counter_bytes) | counter
+
+ store_bits = store_bits + byte_len
+ byte_len = 0
+ BREAK
+
+ buffer[byte_ctr] = byte_to_write
+ byte_ctr++
+ NEXT_BUCKET
+
+ # Write the last partial compressed byte to the buffer
+ buffer[byte_ctr] = store << (8 - store_bits)
+ byte_ctr++
+
+ RETURN buffer
+
+# INPUTS:
+# ibf: The IBF
+# offset: The offset which defines the starting point from which bucket the
compress operation should start
+# count: The number of buckets to in the array that should be compressed
+# counter_bit_length: The bit length of the counter can be found in the ibf
message in the ibf_counter_bit_length field
+# packed_data: A byte array which contains the data packed with the
pack_counter function
+# OUTPUTS:
+# returns: Nothing because the unpacked counter is saved directly into the IBF
+
+FUNCTION unpack_counter(ibf, offset, count, counter_bit_length, packed_data)
+ store = 0
+ store_bits = 0
+ byte_ctr = 0
+ ibf_bucket_ctr = 0
+
+ number_bytes_read = CEILING((count * counter_bit_length) / 8)
+
+ WHILE ibf_bucket_ctr <= (count -1)
+ byte_to_read = packed_data[byte_ctr]
+ byte_ctr++
+ bit_to_pack_left = 8
+
+ WHILE bit_to_pack_left >= 0
+
+ # Prevent packet from reading more than required
+ IF ibf_bucket_ctr > (count -1)
+ return
+
+ IF ( store_bits + bit_to_pack_left ) >= counter_bit_length
+ bit_use = counter_bit_length - store_bits
+
+ IF store_bits > 0
+ store = store << bit_use
+
+ bytes_to_shift = bit_to_pack_left - bit_use
+ counter_partial = byte_to_read >> bytes_to_shift
+ store = store | counter_partial
+ ibf.counter[ibf_bucket_ctr] = store
+ byte_to_read = byte_to_read & (( 1 << bytes_to_shift ) - 1)
+
+ bit_to_pack_left -= bit_use
+ ibf_bucket_ctr++
+ store = 0
+ store_bits = 0
+
+ ELSE
+ store_bits += bit_to_pack_left
+
+ IF 0 == store_bits
+ store = byte_to_read
+ ELSE
+ store = store << bit_to_pack_left
+ store = store | byte_to_read
+ BREAK
+ ]]></artwork>
+ </figure>
+ </section>
</section>
<!--
@@ -1931,19 +2091,19 @@ FUNCTION calculate_ibf_params (setsize):
<t>
The security considerations in this document focus mainly on the
security
- goal of availability, the primary goal of the protocol is prevent
an attacker from
+ goal of availability. The primary goal of the protocol is to
prevent an attacker from
wasting cpu and network resources of the attacked peer.
</t>
<t>
- To prevent denial of service attacks its vital to check that peers
can only
- reconcile a set once in a pre defined time span. This is a
predefined values and need
- to be adapted on per use basis. To enhance reliability and to allow
- failures in the protocol its possible to introduce a threshold for
max failed reconciliation
+ To prevent denial of service attacks, it is vital to check that
peers can only
+ reconcile a set once in a predefined time span. This is a
predefined value and needs
+ to be adapted per use basis. To enhance reliability and to allow
+ failures in the protocol, it is possible to introduce a threshold
for max failed reconciliation
ties.
<!-- IMPLEMENT: How to implement? IP? Other construct? -->
</t>
<t>
- The formal format of all messages needs to be properly validated,
this is important to prevent many
+ The formal format of all messages needs to be properly validated.
This is important to prevent many
attacks on the code. The application data should be validated by
the application using
the protocol not by the implementation of the protocol.
In case the format validation fails the set operation MUST be
terminated.
@@ -1951,17 +2111,18 @@ FUNCTION calculate_ibf_params (setsize):
</t>
<t>
- To prevent an attacker from sending a peer into a endless loop
between active and passive decoding a
- limitation for active/passive roll switches in required. This can
be implemented by
- a simple counter which terminates the operation after a predefined
count of switches.
- The count of switches needs to be defined as such that its very
undroppable that more
+ To prevent an attacker from sending a peer into an endless loop
between active and passive decoding, a
+ limitation for active/passive roll switches is required. This can
be implemented by
+ a simple counter which terminates the operation after a predefined
number of switches.
+ The number of switches needs to be defined in such a way that it
is very unprobable that more
switches are required an the malicious intend of the other peer
can be assumed.
+
<!-- IMPLEMENT: This counter -->
</t>
<t>
<!--- SHOULD BE HANDLED BY UNDERLYING PROTOCOL BUT HOW IS IT
HANDLED? -->
- Its important to close and purge connections after a given timeout
+ It is important to close and purge connections after a given
timeout
to prevent draining attacks.
<!-- IMPLEMENT: How ist this handheld right now? -->
</t>
@@ -1976,7 +2137,7 @@ FUNCTION calculate_ibf_params (setsize):
<name>Duplicated or Missing Message detection</name>
<t>
Most of the messages received need to be checked that they
are not
- received multiple times this is solved with a global store
(message)
+ received multiple times. This is solved with a global
store (message)
and the following code
</t>
<figure
anchor="security_generic_functions_missing_message_code">
@@ -2031,7 +2192,7 @@ FUNCTION isStoreComplete(store)
ENDFOR
return TRUE
-# Returns the count of message received
+# Returns the number of message received
# INPUTS:
# store: Store to count
# OUTPUTS:
@@ -2046,8 +2207,8 @@ FUNCTION getNumberOfMessage(store)
<section anchor="security_generic_functions_element_nr"
numbered="true" toc="default">
<name>Store Remote Peers Element Number</name>
<t>
- To prevent an other peer from requesting the same set multiple
times
- its important to memorize the number of elements a peer had in
previous
+ To prevent an other peer from requesting the same set multiple
times,
+ it is important to memorize the number of elements a peer had
in previous
reconciliation sessions.
</t>
<figure anchor="security_generic_functions_element_nr_code">
@@ -2096,12 +2257,12 @@ FUNCTION save_number_of_elements_last_sync(client_id,
remote_setsize)
<t>
It needs to be checked that the full synchronisation is
plausible according to the formula deciding which
operation mode
- is applicable this is achieved by calculating the
upper and lower
+ is applicable. This is achieved by calculating the
upper and lower
boundaries of the number of elements in the other
peers set.
The lower boundary of number of elements can be easily
memorized as result from the last synchronisation and
the upper
boundary can be estimated with prior knowledge of the
maximal
- plausible increase of element since the last
reconciliation and
+ plausible increase of elements since the last
reconciliation and
the maximal plausible number of elements.
<!-- Entscheindungsfindung: myset fulltransimtion or
epstein
5.3 ist zu verckürtzt fall: nur 2 cases Es fehlt
traidof nach paramer in perfomance section
@@ -2155,12 +2316,12 @@ FUNCTION validate_messages_request_full(client_id,
remote_setsize, local_setsize
<dt><xref target="messages_ibf" format="title" /></dt>
<dd>
<t>
- Its important do define a threshold to limit
+ It is important to define a threshold to limit
the maximal number of IBFs that are expected from
the other peer.
<!-- change count to number full section -->
This maximal plausible size can be calculated with
the known inputs:
- number of elements in my set and the pre defined
applications upper
- limit as described in the performance section.
+ number of elements in the local set and the
predefined applications upper
+ limit, as described in the performance section.
<!-- IMPLEMENT: Is this already checked?-->
<!-- TODO: Link performance section -->
That the other peer chooses the correct mode of
operation MUST
@@ -2222,7 +2383,7 @@ FUNCTION get_bucket_number(remote_setsize, local_setsize,
initial_local_size, se
<dt><xref target="messages_full_element" format="title"
/></dt>
<dd>
<t>
- If a full element is received the set of the other
peer
+ If a full element is received, the set of the other
peer
is smaller than the set of the peer in the
<strong>Expecting IBF</strong>
state and the set difference is smaller than threshold
for
full synchronisation as described in the performance
section.
@@ -2322,9 +2483,9 @@ FUNCTION validate_messages_full_element_init(client_id,
remote_setsize,
<dt><xref target="messages_full_element" format="title"
/></dt>
<dd>
<t>
- When receiving full elements there needs to be checked
that every
- element is a valid element, no element is resized more
than once and
- not more or less elements are received as the other
peer has committed
+ When receiving full elements there needs to be
checked, that every
+ element is a valid element, no element has been
received more than once and
+ not more or less elements are received, as the other
peer has committed
to in the beginning of the operation. Detail
pseudocode implementation
can be found in <xref
target="security_states_expecting_ibf" format="title" />.
<!-- IMPLEMENT: Is this check already implemented?-->
@@ -2333,12 +2494,12 @@ FUNCTION validate_messages_full_element_init(client_id,
remote_setsize,
<dt><xref target="messages_full_done" format="title"
/></dt>
<dd>
<t>
- When receiving the full done message its important to
check that
+ When receiving the full done message it is important
to check that
not less elements are received as the other peer has
committed to
send.
The 512-bit hash of the complete reconciled set
contained in
- the full done message is required to ensures that both
sets are truly identical. If
- the sets differ a resynchronisation is required. The
count of possible
+ the full done message is required to ensure that both
sets are truly identical. If
+ the sets differ, a resynchronisation is required. The
number of possible
resynchronisation MUST be limited to prevent resource
exhaustion attacks.
<!-- IMPLEMENT: Is this check already implemented?-->
</t>
@@ -2382,7 +2543,7 @@ FUNCTION validate_messages_full_done(full_done_msg,
full_element_msg_store,
<dt><xref target="messages_ibf" format="title" /></dt>
<dd>
<t>
- When receiving multiple IBFs its important to
check that the other
+ When receiving multiple IBFs it is important to
check that the other
peer can only send as many IBFs as expected. The
number of expected IBFs can
be calculated with the knowledge of the set
difference as described in the
performance section.
@@ -2397,37 +2558,37 @@ FUNCTION validate_messages_full_done(full_done_msg,
full_element_msg_store,
<section anchor="security_states_active_decoding" numbered="true"
toc="default">
<name>Active Decoding</name>
<t>
- In the Active Decoding state its important to prevent an
attacker from
- generating and passing unlimited amount of IBF that do not
decode or
- even worse generate an IBF that is constructed to sends
the peers in an endless loop.
- To prevent an endless loop in decoding a loop detection
should be implemented
- the simplest solution would be to prevent decoding of more
than a given amount of elements,
- a more robust solution is to implement a algorithm that
detects a loop by
- analyzing past partially decoded IBFs to detect cycles.
This can be archived
+ In the Active Decoding state it is important to prevent an
attacker from
+ generating and passing an unlimited amount of IBFs, that
do not decode or
+ even worse, generate an IBF constructed, to send the peers
in an endless loop.
+ To prevent an endless loop in decoding, a loop detection
should be implemented.
+ The simplest solution would be to prevent decoding of more
than a given amount of elements.
+ A more robust solution is to implement a algorithm that
detects a loop by
+ analyzing past partially decoded IBFs. This can be archived
by saving the hash of all prior partly decoded IBFs hashes
in a hashmap and check
- for every inserted hash if it is already in the hashmap.
+ for every inserted hash, if it is already in the hashmap.
<!-- TODO: Link some algo to find loops in directed graph
-->
<!-- IMPLEMENT: Implement an algo that detects loops in
IBF decoding -->
</t>
<t>
- If the IBF decodes more or less elements than are
plausible the
+ If the IBF decodes more or less elements than are
plausible, the
operation MUST be terminated. The upper and lower threshold
- for the decoded elements can be calculated with the peers
set size
+ for the decoded elements can be calculated with the peers
set sizes
and the other peer committed set sizes from the
<strong>Expecting IBF</strong>
State.
</t>
- <!-- Wenne element mehrfach decodiert seitenwechseln daher
detecten. -->
+ <!-- Wenn ein Element mehrfach decodiert seitenwechseln daher
detecten. -->
<t>Security considerations for received messages:</t>
<dl>
<dt><xref target="messages_offer" format="title" /></dt>
<dd>
<t>
- If an offer for an element that never has been
requested by
- an inquiry or if an offer is received twice the
operation MUST be terminated.
- This requirement can be fulfilled by saving lists
that keeps track of the state of
- all send inquiries and offers. When answering
offers these lists MUST be checked.
+ If an offer for an element, that never has been
requested by
+ an inquiry or if an offer is received twice, the
operation MUST be terminated.
+ This requirement can be fulfilled by saving lists
that keep track of the state of
+ all sent inquiries and offers. When answering
offers these lists MUST be checked.
<!-- IMPLEMENT: Check to keep track of all send
Inquiries -->
</t>
<figure
anchor="security_states_active_decoding_offer_code">
@@ -2462,9 +2623,9 @@ FUNCTION
validate_messages_offer(offer_msg,inquiry_msg_store)
If an element that never has been requested by
a demand or is received double the operation MUST
be terminated.
This requirement can be fulfilled by a simple
table that keeps track
- of the state of all send demands.
+ of the state of all sent demands.
<!-- IMPLEMENT: Check to keep track of all send
demands -->
- If an invalid element is received the operation
has failed and the
+ If an invalid element is received, the operation
has failed and it
MUST be terminated.
<!-- IMPLEMENT: Termination if invalid element si
revived -->
</t>
@@ -2497,10 +2658,10 @@ FUNCTION
validate_messages_elements(element_msg,demand_msg_store)
<dt><xref target="messages_demand" format="title" /></dt>
<dd>
<t>
- For every received demand a offer has to be send in
advance. If an demand
- for an element is received that never has been offered
or the offer already has
- been answered with a demand the operation MUST be
terminated. Its required to implement
- a list which keeps track of the state of all send
offers and received demands.
+ For every received demand an offer has to be sent in
advance. If a demand
+ for an element is received, that never has been
offered or the offer already has
+ been answered with a demand, the operation MUST be
terminated. It is required to implement
+ a list which keeps track of the state of all sent
offers and received demands.
</t>
<figure
anchor="security_states_active_decoding_demand_code">
<artwork name="" type="" align="left"
alt=""><![CDATA[
@@ -2532,14 +2693,14 @@ FUNCTION
validate_messages_demand(demand_msg,offer_msg_store)
<dt><xref target="messages_done" format="title" /></dt>
<dd>
<t>
- The done message is only received if the IBF has
been finished
+ The done message is only received, if the IBF has
been finished
decoding and all offers have been sent. If the
done message is received before
the decoding of the IBF is finished or all open
offers and demands
- have been answered the operation MUST be
terminated.
+ have been answered, the operation MUST be
terminated.
<!-- IMPLEMENT: Check that in active decoding no
done message is received before ibf has been decoded-->
The 512-bit hash of the complete reconciled set
contained in
- the done message is required to ensures that both
sets are truly identical. If
- the sets differ a resynchronisation is required.
The count of possible
+ the done message is required to ensure that both
sets are truly identical. If
+ the sets differ, a resynchronisation is required.
The number of possible
resynchronisation MUST be limited to prevent
resource exhaustion attacks.
</t>
<figure
anchor="security_states_active_decoding_done_code">
@@ -2591,8 +2752,8 @@ FUNCTION validate_messages_done(done_msg, offer_msg_store,
<section anchor="security_states_finish_closing" numbered="true"
toc="default">
<name>Finish Closing</name>
<t>
- In case not all sent demands or inquiries have ben
answered in time a pre defined
- timeout the operation has failed and MUST be terminated.
+ In case not all sent demands or inquiries have been
answered in time, a predefined
+ timeout, the operation has failed and MUST be terminated.
</t>
<!-- FIXME: In state diagram in finish closing only Elements
can be received. What happens if i receive an offer? -->
<t>Security considerations for received messages:</t>
@@ -2616,14 +2777,14 @@ FUNCTION validate_messages_done(done_msg,
offer_msg_store,
<dt><xref target="messages_se" format="title" /></dt>
<dd>
<t>
- In case the Strata Estimator does not decode the
+ In case the Strata Estimator does not decode, the
operation MUST be terminated to prevent to get to
a unresolvable state.
<!-- IMPLEMENT: If in expect SE state the strata
estimator does not decode terminate the operation -->
The set difference calculated from the strata
estimator needs to be plausible,
- to ensure this multiple factors need to be
considered: The absolute plausible maximum of
- elements in a set which has to be predefined
according
+ to ensure this, multiple factors need to be
considered: The absolute plausible maximum of
+ elements in a set, which has to be predefined
according
to the use case and the maximal plausible element
increase since the last
- successful set reconciliation which should be
either predefined or can be calculated
+ successful set reconciliation, which should be
either predefined or can be calculated
with the gaussian distribution function over all
passed set reconciliations.
<!-- IMPLEMENT: Terminate if in check expect se
state for a max size difference is exceeded -->
</t>
@@ -2673,7 +2834,7 @@ FUNCTION validate_messages_se(se_msg, remote_setsize,
local_setsize)
<t>
The peer in <strong>Full Receiving</strong> state
needs to check
that exactly the number of elements is received
from the remote peer as
- he initially committed too. If the remote peer
transmits less or
+ he has initially committed too. If the remote peer
transmits less or
more elements the operation MUST be terminated.
</t>
<t>
@@ -2684,14 +2845,14 @@ FUNCTION validate_messages_se(se_msg, remote_setsize,
local_setsize)
<dd>
<t>
When the full done message is received from the
remote peer all
- elements that the remote peer has committed to
needs to be received
+ elements, that the remote peer has committed to,
need to be received,
otherwise the operation MUST be terminated. After
receiving the
full done message no future elements should be
accepted.
<!-- FIXME: Check that after full done in full
receiving no elements can be received anymore! Additional state? -->
The 512-bit hash of the complete reconciled set
contained in
- the full done message is required to ensures that
both sets are truly identical. If
- the sets differ a resynchronisation is required.
The count of possible
- resynchronisation MUST be limited to prevent
resource exhaustion attacks.
+ the full done message is required to ensure that
both sets are truly identical. If
+ the sets differ, a resynchronisation is required.
The number of possible
+ resynchronisations MUST be limited to prevent
resource exhaustion attacks.
</t>
<t>
Pseudocode for implementation described in section
<xref target="security_states_full_sending" format="title" />.
@@ -2707,8 +2868,8 @@ FUNCTION validate_messages_se(se_msg, remote_setsize,
local_setsize)
<dd>
<t>
In case an IBF message is received by the peer a
active/passive role switch
- is initiated by the active decoding remote peer.
In this instance the peer should
- wait for all open offers and demands to be
fulfilled to prevent
+ is initiated by the active decoding remote peer.
In this moment the peer should
+ wait for all open offers and demands to be
fulfilled, to prevent
retransmission before switching into active
decoding operation mode.
A switch into active decoding mode should only be
permitted for
a predefined number of times as described in the
top section
@@ -2755,13 +2916,13 @@ FUNCTION validate_messages_ibf(offer_msg_store,
demand_msg_store, element_msg_st
<dt><xref target="messages_inquiry" format="title" /></dt>
<dd>
<t>
- A check needs to be in place that prevents receiving a
inquiry
+ A check needs to be in place that prevents receiving
an inquiry
for an element multiple times or more inquiries than
are plausible.
- The amount of inquiries that is plausible can be
estimated by considering
- known values as the remote set size, the local set
size, the
- predefined absolute maximum of elements in the set
which is defined
+ The amount of inquiries that is plausible, can be
estimated by considering
+ known values, as the remote set size, the local set
size, the
+ predefined absolute maximum of elements in the set,
which is defined
by real world limitations.
- To implement this restrictions a list with all
received inquiries
+ To implement this restrictions, a list with all
received inquiries
should be stored and new inquiries should be checked
against.
</t>
<figure
anchor="security_states_passive_decoding_inquiry_code">
@@ -2817,7 +2978,7 @@ FUNCTION validate_messages_inquiry(inquiry_msg, set_diff)
<section anchor="security_states_finish_waiting" numbered="true"
toc="default">
<name>Finish Waiting</name>
<t>
- In case not all sent demands or inquiries have ben
answered in time the operation
+ In case not all sent demands or inquiries have ben
answered in time, the operation
has failed and MUST be terminated.
</t>
<!-- FIXME: In state diagram in finish closing only Elements
can be received. What happens if i receive an offer? -->
@@ -3082,6 +3243,15 @@ Type | Name | References |
Description
<date year="2011"/>
</front>
</reference>
+ <reference anchor="secure_set_reconciliation"
target="http://ti.bfh.ch">
+ <front>
+ <title>Secure Set Reconciliation</title>
+ <author initials="E." surname="Summermatter"
fullname="Elias Summermatter">
+ <organization>BFH - Berner
Fachhochschule</organization>
+ </author>
+ <date year="2021"/>
+ </front>
+ </reference>
<!-- <reference anchor="ISO20022">
<front>
@@ -3103,34 +3273,82 @@ Type | Name | References |
Description
<t>
INPUTS:
</t>
- <t>
- key: 0xFFFFFFFFFFFFFFFF (64-bit)
- number_of_buckets_per_element: 3
- ibf_size: 300
- </t>
+ <figure anchor="test_vectors_map_function_inputs">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+number_of_buckets_per_element: 3
+ibf_size: 300
+
+key1: 0xFFFFFFFFFFFFFFFF (64-bit)
+key2: 0x0000000000000000 (64-bit)
+key3: 0x00000000FFFFFFFF (64-bit)
+key4: 0xC662B6298512A22D (64-bit)
+key5: 0xF20fA7C0AA0585BE (64-bit)
+ ]]></artwork>
+ </figure>
<t>
OUTPUT:
</t>
- <t>
- ["222","32","10"]
- </t>
+ <figure anchor="test_vectors_map_function_outputs">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+key1: ["122","157","192"]
+key2: ["85","243","126"]
+key3: ["208","101","222"]
+key4: ["239","269","56"]
+key5: ["150","104","33"]
+ ]]></artwork>
+ </figure>
</section>
<section anchor="test_vectors_id_function" numbered="true"
toc="default">
<name>ID Calculation Function</name>
<t>
INPUTS:
</t>
+ <figure anchor="test_vectors_id_function_inputs">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+element 1: 0xFFFFFFFFFFFFFFFF (64-bit)
+element 2: 0x0000000000000000 (64-bit)
+element 3: 0x00000000FFFFFFFF (64-bit)
+element 4: 0xC662B6298512A22D (64-bit)
+element 5: 0xF20fA7C0AA0585BE (64-bit)
+ ]]></artwork>
+ </figure>
<t>
- element: 0xadadadadadadadad
- ibf_salt 0x3F (6-bit)
+ OUTPUT:
</t>
+ <figure anchor="test_vectors_id_function_outputs">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+element 1: 0x5AFB177B
+element 2: 0x64AB557C
+element 3: 0xCB5DB740
+element 4: 0x8C6A2BB2
+element 5: 0x7EC42981
+ ]]></artwork>
+ </figure>
+ </section>
+ <section anchor="test_counter_compression_function"
numbered="true" toc="default">
+ <name>Counter Compression Function</name>
<t>
- OUTPUT:
+ INPUTS:
</t>
+ <figure anchor="test_counter_compression_function_inputs">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+counter serie 1: [1,8,10,6,2] (min bytes 4)
+counter serie 2: [26,17,19,15,2,8] (min bytes 5)
+counter serie 3: [4,2,0,1,3] (min bytes 3)
+ ]]></artwork>
+ </figure>
<t>
- 0xFFFFFFFFFFFFFFFF
+ OUTPUT:
</t>
+ <figure anchor="test_counter_compression_function_outputs">
+ <artwork name="" type="" align="left" alt=""><![CDATA[
+counter serie 1: 0x18A62
+counter serie 2: 0x3519BC48
+counter serie 3: 0x440B
+ ]]></artwork>
+ </figure>
</section>
</section>
</back>
</rfc>
+
--
To stop receiving notification emails like this one, please contact
gnunet@gnunet.org.
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [lsd0003] branch master updated: Improved RFC,
gnunet <=