Re: [Help-gnunet] finding files & database management
From: Krista Bennett
Subject: Re: [Help-gnunet] finding files & database management
Date: Fri, 12 Mar 2004 13:55:03 -0500
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7a) Gecko/20040219
Hopefully I can answer most of this; I spend a lot of my time out of the
loop (and so sometimes the clever folk change parts of the scheme on
me), but I think I can give you some sense of what the answer is and why
with my usual annoying verbosity.
Benjamin Kay wrote:
> With very little content currently on GNUnet, finding files isn't easy.
This is true, and has long been known to be an issue; however, without
many users, we don't have much content. Now that GNUnet is increasingly
stable, and with the Windows port under way, that may change in the
future.
> To complicate things, keyword matching in a search seems to be explicit and
> case sensitive.
This is also true. We could certainly add an option to the search
utility to have it look for a keyword in various case configurations,
but eliminating the case sensitivity in the encoding scheme itself would
be a problem: since we look for keywords and hash-key-indexed content in
the same way, this explicitness is simply part of how things work. That
doesn't mean we couldn't add something to the insert utility to
automatically add stuff using various cases, though!
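To illustrate why the matching is exact: if the lookup key is a hash of the keyword, differently cased keywords give unrelated keys. The sketch below is only an illustration under stated assumptions. SHA-256 is a stand-in for whatever hash GNUnet actually uses, and `lookup_key` and `case_variants` are hypothetical names, not part of any GNUnet tool. It shows how an insert utility might generate a few case variants to insert under.

```python
import hashlib


def lookup_key(keyword):
    """Hash a keyword into a lookup key (SHA-256 is a stand-in, an assumption)."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def case_variants(keyword):
    """Return the keyword plus common case variants, in order, without duplicates."""
    candidates = [
        keyword,
        keyword.lower(),
        keyword.upper(),
        keyword.capitalize(),
        keyword.title(),
    ]
    variants = []
    for candidate in candidates:
        if candidate not in variants:
            variants.append(candidate)
    return variants


if __name__ == "__main__":
    # Each variant hashes to a different key, so each needs its own insertion.
    for variant in case_variants("gnu"):
        print(variant, lookup_key(variant)[:16])
```

Inserting under every variant multiplies the number of keyword blocks, so a real utility would probably want to keep the list of variants short.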
> To make files I insert/index easier to find, I try to include as
> many relevant keywords as possible - but inevitably, I still think of a few
> additional keywords after I've inserted/indexed the file. The same goes for
> file descriptions. I know I can reinsert the file with the new keywords and
> descriptions, but that is costly in terms of processing time and requires
> meticulous record keeping on my part (I need to keep track of under what
> description and keywords the original file was inserted).
Well... I suppose it's possible to automate some sort of external record
of what you've inserted under various keywords and have it point to the
top block of the file so that you could continuously reindex that block
with different keywords. That might be something useful to have.
That's really not so hard, methinks. As an aside, the problem with doing
that is that the person using such a method then has a concrete record
of the content they've inserted. From a "plausible deniability"
standpoint, you open yourself up to trouble, as there's not only a
record of what you've inserted into the network, but a pointer to the
file itself. If I insert something I don't want attributed to me (for
example, my dissertation drafts :), it's probably not smart for me to
intentionally retain a record. This isn't a problem for the network in
any sense, just for me, and I suppose it's no different from you keeping
track of such stuff on your own; it's just something to think about.
So the short question and answer is: could something be incorporated so
that you could add additional descriptions and keywords to an existing
top block without a complete reinsertion/reindexing? Unless there have
been some radical changes in the encoding scheme since the last time I
looked at it, sure, I think it's possible (as long as the previously
indexed/inserted top content block is still around or can be
constructed). Christian, Igor, Nils, and company will correct me if I'm
wrong, I'm sure.
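As a rough sketch of the external record discussed above: the hypothetical `KeywordRecord` class below maps a reference to a top block onto the keywords already used for it, and reports which of a new batch of keywords still need to be added. The reference is treated as an opaque string, and the actual step of reinserting the top block under the new keywords is not shown, since that depends on the GNUnet tools. Nothing here is an existing GNUnet interface.

```python
import json


class KeywordRecord:
    """Local record of which keywords each top block was inserted under."""

    def __init__(self):
        # Maps an opaque top-block reference (hypothetical) to a keyword list.
        self.entries = {}

    def add_keywords(self, top_block_ref, keywords):
        """Record keywords for a top block and return only the new ones."""
        known = self.entries.setdefault(top_block_ref, [])
        new = []
        for keyword in keywords:
            if keyword not in known:
                known.append(keyword)
                new.append(keyword)
        return new

    def to_json(self):
        """Serialize the record so it can be kept in a file."""
        return json.dumps(self.entries, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        """Rebuild a record from the output of to_json()."""
        record = cls()
        record.entries = json.loads(text)
        return record


if __name__ == "__main__":
    record = KeywordRecord()
    record.add_keywords("ref-1", ["thesis", "draft"])
    # Only "linguistics" is new, so only it would need a new keyword block.
    print(record.add_keywords("ref-1", ["draft", "linguistics"]))
```

Keeping this file around is exactly the kind of concrete record the deniability caveat above warns about, so whether to use something like it is a personal trade-off.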
> Is there a way to
> modify the description and/or keywords of an inserted or indexed file without
> reinsertion?
As I said above, unless I'm forgetting something vital, it could be made
possible to add to the keyword list given a reference to the top block.
Now, to actually "modify" the description, that's a bit more of a
problem, and that has something to do with the censorship-resistant
nature of the network. If I insert the same file 100 times under the
same keyword, given that the filename of the highest block in the
content tree is a function of the keyword, it should overwrite my local
copy of that top block.
(Is that right Christian, or did you and Igor do something tricky and
new I'm forgetting about?)
So in that case, you can "modify" the description by reinserting the top
block with the same keyword as before but a different description as
long as the only copy of that block is on your machine.
On the other hand, if that keyword block has migrated for any reason,
you can't do a darned thing about the already existing keyword blocks
that are out on the network. Nor should you be able to; if I insert my
dissertation under the keyword "Kristas_dissertation" and the
description "Draft copy of dissertation on stuff and things - do not
take internally without consulting a physician", I don't particularly
want anyone else to go through the network and change the description to
"Important government dossier on weapons of mass destruction - use in
government press briefings" for every single keyword block out there. So
once it's out in the network, it stays there until more important
content comes along and it fades away.
So, again, the short answer is that it could be done by just
"reinserting" the top block alone with a new keyword or description, but
if that top block has migrated, you're stuck with the two versions in
the network.
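The overwrite-versus-migration behaviour described above can be pictured with a toy model. This is a sketch under heavy simplification, not GNUnet's actual storage code: each node is just a dictionary, the block name is a SHA-256 hash of the keyword (a stand-in), the description is stored in the clear, and `Node`, `block_name`, and `migrate_to` are hypothetical names.

```python
import hashlib


def block_name(keyword):
    """Derive a block name from a keyword (SHA-256 is a stand-in, an assumption)."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


class Node:
    """Toy node: a dictionary from block name to description."""

    def __init__(self):
        self.blocks = {}

    def insert(self, keyword, description):
        # Same keyword gives the same name, so the local copy is overwritten.
        self.blocks[block_name(keyword)] = description

    def lookup(self, keyword):
        return self.blocks.get(block_name(keyword))

    def migrate_to(self, other, keyword):
        # Copy the current block to another node, as content migration might.
        name = block_name(keyword)
        other.blocks[name] = self.blocks[name]


if __name__ == "__main__":
    mine, remote = Node(), Node()
    mine.insert("Kristas_dissertation", "Draft copy")
    mine.migrate_to(remote, "Kristas_dissertation")
    mine.insert("Kristas_dissertation", "Revised description")
    print(mine.lookup("Kristas_dissertation"))    # local copy was replaced
    print(remote.lookup("Kristas_dissertation"))  # migrated copy is unchanged
```

In the toy model the local node ends up with the new description while the remote node keeps the old one, which is the "two versions in the network" situation.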
> How about a way to view the descriptions and keywords of indexed files?
Unless you keep this externally (i.e. you keep a record of every keyword
you've indexed) somehow, no; again, this is intentional. Part of what
makes the AFS portion of GNUnet work is that you retain plausible
deniability; this means that if someone goes to your machine and says
"hey, we're going to confiscate your machine, search it, and destroy it
because you have nude pictures of Dick Cheney on it", you can honestly
say you had no way of knowing they were there short of brute force
searching for nude pictures of Dick Cheney. (??!??!!)
Furthermore, all of the blocks you've got stored look the same to
GNUnet, keyword-indexed or not, so unless you have the keywords
somewhere, you can't do a reverse-lookup.
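A small sketch of why there is no reverse lookup: if blocks are stored under a one-way hash of the keyword, the only way to learn which keywords are present is to guess candidates and check each one. As before, SHA-256 is a stand-in rather than GNUnet's actual hash, and `block_name` and `guess_keywords` are hypothetical names used only for illustration.

```python
import hashlib


def block_name(keyword):
    """Derive a block name from a keyword (SHA-256 is a stand-in, an assumption)."""
    return hashlib.sha256(keyword.encode("utf-8")).hexdigest()


def guess_keywords(stored_names, candidates):
    """Return the candidate keywords whose block names appear in the store."""
    stored = set(stored_names)
    return [keyword for keyword in candidates if block_name(keyword) in stored]


if __name__ == "__main__":
    # The store only holds opaque names; the keywords themselves are not kept.
    store = [block_name("thesis"), block_name("draft")]
    # Only exact guesses match, so "Thesis" and "paper" find nothing.
    print(guess_keywords(store, ["Thesis", "thesis", "paper"]))
```

The work grows with the number of guesses, which is the brute-force search mentioned above.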
> Along the same lines, is there a way to reindex a downloaded file without,
> well... reindexing it? I'm guessing that on nodes with content migration
> enabled, downloaded content gets inserted into the migration database.
> Perhaps there is a way to make it permanent on that node (index it) without
> wasting all that time manually reindexing it? And is it possible to reindex
> such files under their original descriptions and keywords?
Hrm... I'll let Christian handle that one. I'm probably just not parsing
your question the way you intend it :)
I've probably confused you more than I've helped, but the case
sensitivity thing and having some sort of external indexing utility
sound like plausible (and fairly easy-to-implement) features to me. If
I'm feeling ambitious this afternoon, maybe I'll even do something about it.
- Krista
--
***********************************************************************
Krista Bennett web.ics.purdue.edu/~bennetkl
Graduate Student in Linguistics address@hidden
Purdue University
** **
If you think education is expensive, try ignorance. - Benjamin Franklin
** **