
Re: [GNUnet-developers] Designing a gnunet directory app


From: Igor Wronsky
Subject: Re: [GNUnet-developers] Designing a gnunet directory app
Date: Fri, 14 Jun 2002 23:28:28 +0300 (EEST)

On Fri, 14 Jun 2002, Christian Grothoff wrote:

> I don't think there is a real solution unless we introduce a feedback 
> protocol that allows users to broadcast 'bad content' messages in order to 
> get rid of it (moderation). The problem is that moderation and censorship 
> are awfully close *and* that a (malicious?) server may decide to keep the 
> content anyway.

That would enable the majority to oppress the minority. ;( Not
good. I'll have to read your paper about trust in gnunet;
perhaps I'll get a better grasp of when and why content is kept
or discarded. Maybe something could be devised along those lines.

> Well, this still does not solve the problem of malicious insertion, which is 
> really the issue that you want to address. 

GNUnet trust should take care of that. ;) 

> (think WWW) is better. And for that, all we need is directories containing 
> references to other directories.

Suppose a user can enter a reference into one directory,
pointing to another directory. Is there any way to keep the
structure from becoming totally unnavigable and cluttered at
some point?

> You can always try to establish some standard naming conventions for keywords 
> (like in the example above), but I would not make them mandatory -- in 
> particular, if keywords are standardized, the deniability in GNUnet is not 
> given for those keywords since an adversary can use a guessing attack!

That was meant as evolutionary standardization (letting some ad
hoc keywords for retrieving directories establish themselves
based on their popularity), a *metadata* standard; the
directories then lead to the keys of the actual files, thus
giving them up in sequence.

There's a serious problem in what you suggest. In my opinion,
the purpose of a system like this is to make it possible for
people to publish, locate and retrieve content anonymously.
How can that be possible if the means of locating content
cannot be made known to the users? The system can't work on
the basis of people coming up with ever more obscure keywords
to thwart keyword-guessing attacks, because that destroys the
ability to locate anything, making all content essentially
private, or shared only within a small elite that is also
connected out-of-band.

Such private transfers can already be achieved, for example,
with steganography and newsgroups.
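
To make the guessing attack concrete, here is a toy sketch. It
assumes that a query exposes only the hash of the keyword, not
the keyword itself; the hash and all names below are made-up
stand-ins, not actual gnunet code:

  #include <stdio.h>

  /* stand-in for the real cryptographic hash */
  static unsigned long toy_hash(const char *s) {
      unsigned long h = 5381;
      while (*s)
          h = h * 33 + (unsigned char) *s++;
      return h;
  }

  int main(void) {
      /* adversary's dictionary of standardized keywords */
      const char *dict[] = { "mp3", "linux-iso", "potted-plants" };
      /* hash observed in a passing query */
      unsigned long observed = toy_hash("potted-plants");
      for (int i = 0; i < 3; i++)
          if (toy_hash(dict[i]) == observed)
              printf("the query was for '%s'\n", dict[i]);
      return 0;
  }

Obscure keywords would defeat this, but then nobody else could
find the content either - which is exactly my point above.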

> I think you are mistaken in terms of how the query mechanism works. What you 
> could do with groups is something similar to 'webrings' - whenever you insert 
> a file, you add it to a 'group-directory' (which you publish every n files 
> under the group-name). 

I don't think I quite understand. I tried to explain a situation
where files are inserted just as they are now, normally, but the
user could optionally enter a group (directory) name under which
to post an *index file* of the inserted files. The group name
would thus be just a keyword for retrieving the inserted index
file (and other index files on the same topic), like a node in a
tree (see below).
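
For example (the format here is made up for illustration), if I
insert two files and give the group name 'potted-plants', the
posted index file might look like:

  monstera.jpg   crc=0x1a2b3c4d   key=HV3K...
  cactus.jpg     crc=0x5e6f7a8b   key=9XQ2...

A later search for 'potted-plants' would then return this index
file together with index files posted by other users under the
same group name.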

> Dates are always bad because an adversary can manipulate them and they
> can be used in partitioning attacks ("you were online at that time"). 

I'm not familiar with partitioning attacks. But why wouldn't a
query for a keyword 'blah-<date>' be just as anonymous as a
query for 'blah'? How can someone say that the query with the
date was issued from your node, if they can't claim that for
plain 'blah'?

> The problem with 'too many' results can always be addressed by
> using more obscure keywords, so I doubt that really applies.

See above.

> I would just publish the full public key (258 bytes is not that much) with 
> the directory. The signature itself is another 256 bytes, so that's 512 bytes 
> overhead - ok in my opinion. Otherwise you may be able to obtain the 
> directory but fail to get the public key (because of course nobody ever 
> checks...).

Agreed. There will be a lot of short directory listings within
a short amount of time - if gnunet catches on.

> I think all we need is an easy way to create directories for people that 
> insert a ton of documents and another way for people to populate directories 
> with files that they find. And then of course some support for directories 
> in the client.

Bah. Curses. We should standardize the terminology we use. ;-)
Let's say a "directory" is a collection of several index files,
each index file containing {filename,crc,key,etc...}* and
inserted by different users. A directory is topic-specific,
e.g. a "directory of potted plants". Inserting entries into a
specific directory means inserting a new index file under the
keyword of that directory.

Additionally, a unidirectional link could be inserted into
directoryA, pointing to directoryB. For a backward link, insert
a link into directoryB pointing to directoryA.

I still claim we can't have just a single master directory with
the keyword 'directory', because it will very quickly fill up
with all kinds of stuff describing who knows what, making it
hard for any non-omnivore individual to locate the interesting
content.

If your concept of a directory is drastically different, please
explain. My view can also be thought of as a graph: a
"directory" is a node in a graph, into which content-describing
index files or links to other nodes can be inserted.
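
Roughly, in code (illustrative types only; nothing like this
exists in the gnunet sources):

  #include <stddef.h>

  /* one entry of an index file: enough to retrieve one file */
  struct IndexEntry {
      char         filename[256];
      unsigned int crc;       /* checksum of the file */
      char         key[128];  /* key needed to retrieve the file */
  };

  /* a "directory": a node in the graph, found under its keyword */
  struct DirectoryNode {
      char                   keyword[256];
      struct IndexEntry     *entries;   /* index files from many users */
      size_t                 num_entries;
      struct DirectoryNode **links;     /* unidirectional links */
      size_t                 num_links;
  };

A bidirectional connection between A and B is then just two
unidirectional links, one inserted into each node.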

> easier to guess what people are searching for (and thus the possibility to 
> censor certain queries; say I don't like posting number 421, then I tell my 
> node to drop all queries for 421. While the high degree of connection in 
> GNUnet will make this attack a lot less effective compared to other networks, 
> it's still a problem).

The same applies to keywords. There is not much point in the
system if the keywords cannot be public knowledge, because then
we can't really talk about sharing anymore.

> While I think it is ok to set some 'recommendations' for keywords
> (read: you should add a keyword of this form AND use any other
> format that you see fit), I think this is a different topic. Directories
> are files that contain information about other files (including other 
> directories), and that information identifies the file uniquely (see result 
> of gnunet-search). Keywords for the search (input of gnunet-search, not 
> output!) are a different topic! Let's try to keep these two separate. If you 
> want to write an RFC for selecting keywords, write one, if you want to write 
> an RFC for directories, fine. But they are 2 different problems.

I mean that there should - and must - be a keyword for finding
the directories in the first place! A prefix could already be
fixed by gnunet-insert and other apps, requiring only a topic
extension or such from the user (together they form the keyword
of a "graph node"). What is the point of a directory if no one
can find it, or if it is a mess, giving results similar to
searching the network with the regexp .* (supposing for a while
that this were possible)?
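
Something along these lines, where the prefix and the function
name are invented for the sake of the example:

  #include <stdio.h>

  /* fixed by the tool; the user supplies only the topic */
  #define DIR_PREFIX "gnunet-directory/"

  static void make_dir_keyword(char *out, size_t n, const char *topic) {
      snprintf(out, n, "%s%s", DIR_PREFIX, topic);
  }

  /* make_dir_keyword(buf, sizeof buf, "potted-plants")
   *   -> "gnunet-directory/potted-plants" */

That way the directory keywords form a predictable namespace of
their own, while the files themselves can keep arbitrary keywords.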

Sorry for repeating. But it's essential to clarify things before
actual meddling or harm is done. :)

> > - The hierarchy-or-flat -issue (and what should the result look like?)
> I'm all for a graph! You take any (standardized or not) keyword to enter the 
> graph and find a directory (the standardized keyword space can be a tree, the 
> global keyword-space is naturally flat). From there, you find other 
> directories and files -- and you can navigate like in the WWW. 

Hey, a tree is a hierarchy. ;) I don't think your idea is very
different from the one I proposed; we just don't use the same
terminology.

> > - The keyword format for locating listings (w/ dates or nodates?)
> I would tend to try to categorize by content, not by date. Dates can be 
> misleading. Any other opinions on this one?

I'll gladly give up the dates if some other mechanism can be
devised to discard obsolete listings. It will be really
frustrating to attempt to download something, only to find that
a meager <1/50 of the given hashes actually retrieve anything.
You yourself make a similar argument against the early freenet
keyservers in one of your papers.


Igor


ps. I surfed through the gnunet code today, looking for
enlightenment. It seems that currently the only way for content
to migrate is for it to be queried. I suppose the 'pushing of
inserted content' is coming along someday as well? In the
current setting, as the network grows, there is no reason to
expect that a query from node B for stuff X will ever reach
your node A, where X is hosted (hostkey(A) and keyword(X) are
totally uncorrelated). Pardon me for the lecture. There might
be someone in the audience who didn't know. ;)





