Re: [Help-gnunet] finding files & database management
From: Igor Wronsky
Subject: Re: [Help-gnunet] finding files & database management
Date: Sat, 13 Mar 2004 18:37:55 +0200 (EET)
On Fri, 12 Mar 2004, Benjamin Kay wrote:
> Along the same lines, is there a way to reindex a downloaded file without,
> well... reindexing it? I'm guessing that on nodes with content migration
> enabled, downloaded content gets inserted into the migration database.
> Perhaps there is a way to make it permanent on that node (index it) without
> wasting all that time manually reindexing it? And is it possible to reindex
> such files under their original descriptions and keywords?
In a way, yes. If you download file A.dat from GNUnet and index
it using "gnunet-insert -Xx A.dat" _without any modifications to
the downloaded file_, users doing a keyword search will find
the original pointers containing HASH1 HASH2 CRC32 FILESIZE,
supposing those remain in the network. Using -Xx will not cause
any keyword pollution. If they make a query with the found link
and your node receives the queries, it is able to answer all of
them from the indexed file.
Explanation of how data is stored: inserting a file with three
keywords results in three independent blocks, each of which
contains enough _metadata_ to download the actual file; i.e.,
each of them contains the same pointer and the same description,
for example. When you search with a keyword, you see the data
contained in one matching keyword block. The pointer in the
block is then used to download the file.
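The keyword-block scheme described above can be modeled with a small
sketch. This is purely illustrative: the function and field names are
hypothetical, and SHA-1/CRC32 merely stand in for whatever hashes GNUnet
actually uses; it is not GNUnet's real on-disk format.

```python
import hashlib
import zlib

def insert(content: bytes, description: str, keywords: list) -> dict:
    # The pointer depends only on the file contents
    # (stand-ins for HASH1/HASH2, CRC32, FILESIZE).
    pointer = (hashlib.sha1(content).hexdigest(),
               format(zlib.crc32(content), "08x"),
               len(content))
    # One independent block per keyword; every block carries the
    # same pointer and the same description.
    return {kw: {"pointer": pointer, "description": description}
            for kw in keywords}

blocks = insert(b"some file data", "A.dat - test file",
                ["foo", "bar", "baz"])

# A keyword search returns one matching block; its pointer is then
# used to download the file itself.
hit = blocks["bar"]
```

Note how a search under any of the three keywords yields the same
pointer, which is why all of them lead to the same download.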
The only problem with the -Xx index scheme is that the block
that matches the original keyword (and that returns the
original description!) is not necessarily kept in the
local database (depending on the fill-up rate, priorities
and the activemigration value in the config). Only the blocks
that depend on the file contents get reindexed.
It is not technically possible to recover all the keywords
that were inserted by the original party. What could be
done, however, is to store the search result of a downloaded
file in client-space, and if/when the respective file is
indexed, reinsert that search-result block with index
priority.
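The idea above is not an existing GNUnet feature, but it can be
sketched in a few lines. Everything here is hypothetical: the cache,
the hook names, and the `insert` callback are all invented for
illustration.

```python
# Hypothetical sketch: cache the search-result block of each
# downloaded file in client space, and re-insert it with index
# priority if/when that file is later indexed.

search_cache = {}   # content hash -> original search-result block

def remember_search_result(content_hash, result_block):
    """Called after a successful download."""
    search_cache[content_hash] = result_block

def reinsert_on_index(content_hash, insert):
    """Called when the user indexes the downloaded file; `insert`
    stands in for the node's block-insertion routine."""
    block = search_cache.get(content_hash)
    if block is not None:
        insert(block, priority="index")   # pin the original keyword block
        return True
    return False

# Usage: record a search result, then replay it on indexing.
pinned = []
remember_search_result("hash-of-A.dat",
                       {"keyword": "foo", "description": "A.dat"})
ok = reinsert_on_index("hash-of-A.dat",
                       lambda b, priority: pinned.append((b, priority)))
```

With such a cache, the original keyword and description would survive
locally even though they cannot be recomputed from the file itself.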
If you index the file with new keywords and/or a new description
(i.e. do not use -Xx), users are still able to download
the file through the original search result (in addition
to the ones you have inserted), because all of the blocks
contain the same HASH1,... pointers: they depend only on the
file content, not on the descriptions or keywords.
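That content-only dependence is easy to demonstrate. Again a sketch
with stand-in hashes (SHA-1/CRC32 in place of GNUnet's real ones): the
pointer is computed from the bytes alone, so re-indexing under
different keywords reproduces it exactly.

```python
import hashlib
import zlib

def content_pointer(data: bytes):
    # Depends only on the file contents, never on the description
    # or keywords chosen at insert time.
    return (hashlib.sha1(data).hexdigest(),
            format(zlib.crc32(data), "08x"),
            len(data))

data = b"the downloaded file A.dat"
original = content_pointer(data)    # pointer from the original insert
reindexed = content_pointer(data)   # pointer after re-indexing with new keywords
```

Since `original == reindexed`, a searcher holding either the old or
the new keyword block ends up requesting the same content blocks.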
Note that indexing a file in GNUnet currently overwrites
the inserted or migrated blocks in the local database. This
is not a problem as long as you keep the indexed file available.
Hope this helps.
Igor