[GNUnet-developers] Re: various suggestions
From: Christian Grothoff
Subject: [GNUnet-developers] Re: various suggestions
Date: Mon, 29 Apr 2002 22:41:29 -0500
On Monday 29 April 2002 09:32 pm, you wrote:
> So how does AND work? Do you do two full searches and then take the
> intersection, or can you do one search and then filter it with the later
> keywords? If it is the former, then I agree with you.
Exactly, two full searches.
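A minimal sketch of the two-search AND described above; the `search` callable and the fake index are illustrative stand-ins, not GNUnet's actual API:

```python
def and_search(search, keyword_a, keyword_b):
    """AND over two keywords: run one full search per keyword and
    intersect the result sets. `search` stands in for whatever maps
    one keyword to the set of matching content hashes."""
    return search(keyword_a) & search(keyword_b)

# Illustrative use with a made-up in-memory index:
index = {"gnu": {"h1", "h2"}, "net": {"h2", "h3"}}
and_search(index.get, "gnu", "net")  # -> {"h2"}
```

The cost is what the mail implies: both searches run in full before the intersection can be taken.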
> Is it possible to just search for the content checksum? People have
> built a number of interesting applications on top of freenet without a
> search capability. However, a number of them cause a huge "query"
> overhead as applications probe for new data that might be there.
Yes, it is possible. If you use the text-tool (gnunet-search), it will
print the content checksum (hash, length and crc are needed).
gnunet-download can then be used to download the file from there.
> > them). Thus having keywords for a file that are hard to obtain (i.e. not
> > automatically from the file/RNode) is usually a good thing (TM). This may
> > actually be a reason for *not* supplying a filename (or at least not one
> > that was used for keyword extraction).
>
> Ok. Sounds like I should read the paper. I think it sounds like a
> reason for a disclaimer on the insert tool explaining that using the
> auto-keyword features makes censoring a given file easier. Filenames
> are too useful to omit. Besides, the current insert-mp3 tool has this
> problem: if someone fetches the mp3, they could find the full
> list of keywords from the ID3 tags.
Absolutely. And yes, more docs would always be a good idea :-)
> > That's exactly what I also thought. I'm just not sure that splitting
> > the directory like that is the best idea (I'm still pondering the
> > issue; until I have a really good solution, it'll probably just stay in
> > 'slow mode').
>
> How many files do you expect to see here? The default config is
> 512 MB of 1k files. That is 524288 files in one directory, which is
> really slow. To me that suggests you want to take 2 levels of 4 bits
> each,
> ~/.gnunet/data/content/F/E/FE4F8155230050000000000065100000C79CA8BA,
> for an average of 2048 files per directory with the default config.
> Even that is insane. (see below)
Exactly. We're insane. :-)
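The 2x4-bit split quoted above can be sketched as follows; the layout mirrors the example path in the mail, but the helper name is made up:

```python
import os

def content_path(root, hex_hash):
    """Shard content files over two directory levels of 4 bits each
    (one hex digit per level): 16 * 16 = 256 leaf directories, so
    524288 blocks average out to 524288 / 256 = 2048 files per
    directory with the default config."""
    return os.path.join(root, hex_hash[0], hex_hash[1], hex_hash)

content_path(".gnunet/data/content",
             "FE4F8155230050000000000065100000C79CA8BA")
# -> '.gnunet/data/content/F/E/FE4F8155230050000000000065100000C79CA8BA'
```

Adding a third level would cut the average to 128 files per directory at the cost of more directory inodes.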
> > And of course, using a better FS (reiser, ext3, xfs) is recommended.
> > It would be nice to have some profiling code to actually evaluate
> > different approaches/filesystems in order to give (educated) advice to
> > users which FS to use.
>
> It is quite presumptuous to assume that people will select their
> filesystem based on GNUnet. :-) Or even that they will have it on a
> separate partition.
That's the idea. See also the FAQ. In fact, GNUnet could become a distributed
encrypted filesystem (at least in theory).
> > What do you mean by 'a large file based hash'?
>
> One file that allows efficient hash-based lookups: given a key, it
> returns a 1k data block. Like ndbm, only faster. I can possibly
> provide the mdbm library from bk. It is way faster than ANY
> filesystem and stores the data packed tightly. You would just need to
> sync the file periodically so that the changes are persistent.
>
> This seems way simpler than dealing with the fact that a default linux
> installation will use way too much diskspace and will probably run out
> of inodes.
Sounds reasonable; that's kind of like the database idea that I mentioned.
Either way, before we make a decision, I would like to profile the various
approaches and see what performs best. For the moment, I have other
priorities; if anybody wants to investigate and convince me of a solution,
code it & present numbers :-)
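Since mdbm itself is not shown here, the single packed hash-file idea can be illustrated with Python's stdlib `dbm` module as a stand-in; the path and key are illustrative:

```python
import dbm
import os
import tempfile

# One file holding 1k content blocks, keyed by content hash.
path = os.path.join(tempfile.mkdtemp(), "content")
db = dbm.open(path, "c")  # "c": create the database if it does not exist

key = b"FE4F8155230050000000000065100000C79CA8BA"
db[key] = b"\x00" * 1024  # store one packed 1k block
block = db[key]           # hash-based lookup, no per-block inode needed
db.close()                # closing flushes the changes to disk
```

This only stands in for the lookup interface; the mmap'ed file and the periodic sync the mail describes are what mdbm would add on top.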
Christian