[GNUnet-developers] Another bunch of ideas

gnunet-developers
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[GNUnet-developers] Another bunch of ideas

From:	Steven Barker
Subject:	[GNUnet-developers] Another bunch of ideas
Date:	Mon, 14 Jul 2003 01:51:29 -0400
User-agent:	Mutt/1.5.4i
Hi folks,

I just started using GNUnet this past week after packages of 0.5.4a hit
Debian unstable.  I've been following the progress of anonymous P2P
systems (mostly Freenet) for some time, and while I'd heard of GNUnet I'd
not tried it out before.  I'm very impressed!  GNUnet doesn't have all the
features of Freenet yet, but it seem seems to me to be more stable and to
have a better designed architecture.  (It's also written in C, a language
I'm a lot more comfortable with reading and writing.)

Anyway, I've spent a number of hours reading the papers, documentation and
list archives and playing around with the CLI clients and I've come up with
a few features I'd like to see in GNUnet in the future (if somebody is
inclined to implement them).  I hope you'll take these suggestions as
ideas to consider and debate, not as any kind of demands on you.  I read a
similar thread titled "End-user wishlist" that took place a few weeks ago,
and this should be similar (though my ideas are a bit farther reaching than
the ones in that thread :-).


* Distributed boolean searches.

Hmm, I had a pretty cool idea about distributed boolean searches for AFS,
but as I was writing it up I realized that it will not work.  If only there
was a way to tell if two RBlocks stored under H(H("foo")) and H(H("bar"))
refer to the same root IBlock without being able to decrypt either...


* Challenge-Response-Confirmation support in the trust economics code.

The current trust system is limited in that it can only evaluate
query-reply pairs that can be verified by all of the intermediate nodes,
something that is not true for many interesting types of requests.

An example that demonstrates the limitation is searching for a keyword in
the current AFS impementation. If I search for "GPL", the AFS trust system
can tell that the RBlock returned returned does in fact match the keyword,
but it has no way to evaluate if the RBlock actually points to the actual
GNU GPL.  This means that keywords can be spammed.  As a test, I have
inserted a document called "not_the_gpl.txt" under a number of keywords you
might try if you wanted the GPL (GPL, gpl, GNU, FSF, "GNU General Public
License", etc).  These bogus references will probably never disapear.  In
this case the spam file itself probably will eventually drop out, since few
people will download a file that says in it's description and suggested
filename that it's not in fact what it's keywords suggest it is.  A real
spam on the GPL or any other file in GNUnet could be indistinguishable from
the real thing until it was downloaded.  That means the spam would have
exactly as much of a chance of staying in the network as the original.  I
can't think of any spam fighting strategy hat will ensure the spam RBlocks
will be dropped out of the network before RBlocks of data that really
belongs under that keyword.

My idea for how to get around this limitation is to augment the two message
Query-Reply model with a three message Challenge-Response-Confirmation
model.  The third message, the Confirmation will tell the nodes it reaches
the validity of the Response.

I'll run through how the whole model to show how it works.  Consider first a
two node system, where A asks B for something.

     A                        B
     \
      --->----Challenge--->---
                              \
                      finds or generates
                      appropriate data for
                      the Response
                              /
      ---<----Response----<---
     /
evaluates whether
or not the Response
was valid
     \
      --->--Confirmation-->---
                              \
                      adjusts application
                      specific data (e.g.
                      data storage, routing
                      tables, caches, etc)

A more interesting case is when there's an indirection involved, for example
if the communication between A and B above was routed through node C:

     A                         C                           B
     \
      --->----Challenge--->---
                              \
                     doesn't have the data, so
                     decides to route the
                     Querry to another Node
                                \
                                 --->----Challenge--->---
                                                         \
                                                could indirect further
                                                         ( \           )
                                                         (  ->-C->-    )
                                                         (         \   )
                                                         (         ... )
                                                         (         /   )
                                                         (  -<-R-<-    )
                                                         ( /           )
                                                or answer with locally
                                                stored or generated data
                                                         /
                                 ---<----Response----<---
                                /
                     the Response is sent back
                     along the route to the
                     Challenge's originator
                              /
      ---<----Response----<---
     /
evaluates whether
or not the Response
was valid
     \
      --->--Confirmation-->---
                              \
                      may adjust application
                      specific data
                                \
                                 --->--Confirmation-->---
                                                         \
                                                  may adjust application
                                                  specific data


Accounting for trust in this model is only a little more involved than
in the Query-Reply model.  On recieving a Challenge or a Response, trust in
the sender is decreased by the priority of the message to represent the
cost of processing the message.  Before sending a Response, the trust of the
reciepient is decreased by it's priority as "insurance" against the Response
being taken without any Confirmation being sent.  When a Confirmation is
recieved, it's sender is credited by it's priority, returning the previously
payed "insurance".  When a confirmtion is to be sent, if it acknowledges the
validity of the preceding Response, the recipient is credited the priority
of the Confirmation, offsetting it's expenses, plus the priority of the
originally sent Challenge, as payment for providing a valid Response.

The rules for changing trust as mesages are recieved and sent are a little
hard to understand from only that description, so I'll walk through the
three node example from above with some sample trust values.

                  trustees
                   _____
                  /     \
                 A   B   C
           / A  inf  -  100
 trusters  | B   -  inf 100
           \ C  100 100 inf

A sends it's Challenge with a priority of 20 to C.  C decreases it's trust
in A and sends on the Query to B.  B reduces it's trust in C.  The new trust
values are:

     A   B   C
 A  inf  -  100
 B   -  inf  80
 C   80 100 inf

B finds (or generates) data appropriate for a Response to the Challenge.
Before it sends a Response to C, however, it decreases it's trust in C
again by the priority of the the message it's about to send (20).  When
C gets the Response, it reduces it's trust in *both* A and B by 20 before
sending it on to A.  A immediately reduces it's trust in C by 20.  Now the
trusts look like:

    A   B   C
A  inf  -   80
B   -  inf  60
C   60  80 inf

Now that A has the data it requested, it has to evalueate it.  This could be
an automatic, application specific check (like, for example, determining if
data is signed by a trusted identity), or it could ask the user, who would
decide it if the data is good.  If the data gets a positive evaluation,
A adds 20 + 20, the priority of the Challenge plus the priority of the
forthcoming Confirmation, to it's trust in C.  If the data gets a bad
evaluation, A would leave it's trust alone.  In either case, A sends a
Confirmation to C saying what it thought of the data.  When C recieves
the Confirmation, it re-raises it's trust in A by 20, regardless of what
the Confirmation says.  If the Confirmation says the data was good, it
raises it's trust in B by 20 + 20.  If the Confirmation says the data was
bad it doesn't change it's trust in B. C may also make appication specific
decisions at this point, such as changing routing tables or updating data
caches or storage.  Finally it sends the Confirmation to B, which re-raises
it's trust in C by 20, and makes any application specific decisions it wants
based on the information in the Confirmation.  Here's what the final trust
values would be:

data was good              data was bad
    A   B   C                  A   B   C
A  inf  -  120             A  inf  -   80
B   -  inf  80             B   -  inf  80
C   80 120 inf             C   80  80 inf

There's a bunch of other details I've not worked through yet, like what
value to clamp priorities at when sending and recieving messages and which
trust changes would be skipped when there's an excess of resources or when a
node forwards a message instead of routing it.  I donn't think of those
issues will be too hard to figure out.  But maybe I've overlooked an attack
on the trust system of this model....  I hope you'll all look it over.


* An Anonymous Distributed Computing framework.

This is one application I had in mind when I was thinking about the trust
accounting of Challenge-Response-Confirmation type messages.  For those of
you who don't know, distributed computing is the name of a technique that
solves computationally hard problems by exploiting algorithms that can run
on massively parallel networks of machines.  Some well known distributed
computing projects are address@hidden, distributed.net and a large number of
prime number finding projets.  Distributed computing is also used to solve
a lot of scientific problems that used to require very expensive super
computer time.  Distributed computing is also known as Grid Computing
in some circles.

My idea for how do do a distributed computing framework on GNUnet depends on
my previous idea of Challenge-Response-Confirmation trust accounting.  There
would be Challenge messages that would describe a distributed computing
problem by refering to some code and some data (both of which would probably
be distributed over AFS).  In order to create a response, a node would
download the code, run it (in a secure sandbox) on the data and send a
Response with the output.  The publisher of the Challenge would check the
Response data, either with a quick test, or by checking against Response
data submitted by other nodes (the data sets could be overlapping in an
unpredictable way, so as to make it harder for an attacker to submit many
bogus results that all agree).  When the publisher of the challenge is
satisfied that it can judge the correctness (or at least the good faith or
fraudulence) of the Response, it will issue a Confirmation message.

For example, a distributed.net-like brute-force crypto cracker could be
created by writing code that tests a possible encryption key against
a encrypted message.  Then you would insert the code and a large number of
keyblocks into AFS.  Finally you publish lots Challenges that identify the
code and a few of the data blocks.  Nodes that recieve one of at a high
enough priority (based on their resource availability, and maybe other
factors like how much work they estimate the Challenge requires) will grab
the key checker code and run the referenced keyblocks through it and send
a Response with the output.  When the Responses get back to you, your node
will check the output.  If the response claims to have found the key to
decrypt the message, it's easy to test directly.  If it says that node of
the data blocks held the key, you'll have to cross check it against other
Responses your recieved to prevent as much fraud as you can.  When you've
made up your mind about the good faith of the Response, you send a
Confirmation with your verdict.


* Others I've forgotten

Well, I thought I had more than three ideas, but as I wrote (and thought) a
ton about these three, the others have slipped my mind.  I may write again
if I think of more neat ideas that could be built into GNUnet.

I hope my ideas are interesting to some of you.  I'm sure I've overlooked
possible problems with them, and they may infact be impossible to implement.
Let me know what you think!  I've just subscribed to the list today, and I
hope to participate in some good discussion. :-)

-- 
Steven Barker                                      address@hidden
  If you aren't rich you should always look useful.
                -- Louis-Ferdinand Celine
Get my GnuPG public key at: http://www.blckknght.org/publickey.asc
Fingerprint: 272A 3EC8 52CE F22B F745  775E 5292 F743 EBD5 936B
pgpaI97x4jsOs.pgp
Description: PGP signature
[Prev in Thread]
Current Thread
[Next in Thread]
[GNUnet-developers] Another bunch of ideas, Steven Barker <=
- Re: [GNUnet-developers] Another bunch of ideas, Igor Wronsky, 2003/07/14
  - Re: [GNUnet-developers] Another bunch of ideas, Steven Barker, 2003/07/14
    - Re: [GNUnet-developers] Another bunch of ideas, Christian Grothoff, 2003/07/14
Prev by Date: Re: [GNUnet-developers] debian packages of 0.5.4
Next by Date: Re: [GNUnet-developers] Another bunch of ideas
Previous by thread: Re: [GNUnet-developers] debian packages of 0.5.4
Next by thread: Re: [GNUnet-developers] Another bunch of ideas
Index(es):
- Date
- Thread