gzz-dev

Re: [Gzz] hemppah's research problems document


From: B. Fallenstein
Subject: Re: [Gzz] hemppah's research problems document
Date: Sun, 15 Dec 2002 19:54:49 +0100
User-agent: Mozilla/5.0 (X11; U; Linux ppc; en-US; rv:1.2) Gecko/20021204 Debian/1.2.1-1

Tuomas Lukka wrote:
1) DHT storing the actual blocks is
        + pretty well understood

Pointers?

There are loads of p2p systems with a location step and a download step in actual use (Napster, Gnutella, Kazaa, eDonkey, Overnet, Circle...); some of them are based on DHTs. I don't know any off the top of my head that are storing the files as DHT values, and no research projects, either.
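The two-step pattern mentioned above can be sketched as follows. This is a minimal illustration, not any particular system's API; all names and the addresses are hypothetical. The DHT holds only a small location mapping (content hash to peer address), and the file itself is fetched directly from that peer over IP:

```python
# Sketch of "location step + download step" (all names hypothetical).
# The DHT stores only small mappings: content hash -> address of a holder.
dht = {}  # stands in for the distributed hash table

def publish(content_hash: str, my_address: str) -> None:
    """Location step: announce who holds the file (a small mapping)."""
    dht[content_hash] = my_address

def fetch(content_hash: str, download_from):
    """Download step: look up a holder, then transfer directly over IP."""
    address = dht[content_hash]
    return download_from(address, content_hash)

publish("abc123", "10.0.0.7:6881")
data = fetch("abc123", lambda addr, h: f"<bytes of {h} from {addr}>")
print(data)  # the large transfer goes peer-to-peer, not through the DHT
```

The point of the split is that only the small mapping ever travels through the DHT; the file-sized transfer happens out-of-band.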


DHT storing the *data*.

Note that the ones you mention above seem to be pretty unstable.

If the data is mappings of, say, ~20-byte keys to ~50-byte values, I
think that is pretty well understood, yes. If the data is mappings to
file-sized values, like 1 MB or so, I don't think that's understood at
all, except possibly for a network of *servers*. Asking peers to store
small mappings is quite manageable: they can be served through a small
link, they can quickly be pushed onto a new node joining the network
(from the other replicas of that data), and they don't take a lot of
storage on the peer. Asking peers to store arbitrarily large data for
other peers is far less understood, I think.
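A back-of-envelope calculation makes the difference concrete. The item counts here are made up for illustration; the item sizes (~20-byte keys, ~50-byte values, 1 MB blocks) and the 56 kbit/s link come from the discussion in this thread:

```python
# How long does it take to push a joining node's share of the data down
# a 56 kbit/s link, for small mappings vs. file-sized values?
LINK_BPS = 56_000 / 8  # 56 kbit/s modem link, in bytes per second

def push_time(n_items: int, item_bytes: int) -> float:
    """Seconds needed to transfer n_items items of item_bytes each."""
    return n_items * item_bytes / LINK_BPS

# 10,000 small mappings of ~20-byte keys to ~50-byte values (~70 B each)
small = push_time(10_000, 70)
# the same number of 1 MB blocks
large = push_time(10_000, 1_000_000)

print(f"small mappings: {small:.0f} s")          # on the order of minutes
print(f"1 MB blocks:    {large / 86_400:.0f} days")  # on the order of weeks
```

Minutes of transfer per join is tolerable; weeks is longer than most peers stay online at all, which is the core of the objection below.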

Also, DHTs often replicate data along the routing path. Sending 100
bytes to log(n) computers for replication isn't much of a problem;
sending 1 MB to log(n) computers is.
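The replication cost can be sketched the same way. The network size below is hypothetical; the per-value sizes are the ones used above:

```python
import math

def replication_bytes(n_nodes: int, value_bytes: int) -> int:
    """Total bytes sent when a value is replicated to ~log2(n) nodes
    along the routing path."""
    return math.ceil(math.log2(n_nodes)) * value_bytes

# a hypothetical network of a million nodes -> ~20 replicas per value
print(replication_bytes(1_000_000, 100))        # 100-byte mapping: ~2 KB
print(replication_bytes(1_000_000, 1_000_000))  # 1 MB block: ~20 MB
```

A factor of log(n) is negligible for tiny mappings but multiplies an already large transfer for file-sized values.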

        + quite efficient, robust and not too attackable

Efficient? I've just joined the network, please push the 100MB of data I'm supposed to host on my machine down my 56k link.


?? This would of course happen incrementally.

A DHT node cannot start working until it has all the mappings it is
responsible for: the point of a DHT is to return complete data, so a
node can only begin answering lookups once it has received everything
in its key range. I think nodes will go on- and offline far more
frequently than the data they are responsible for can be pushed onto them.

Freenet stores the data in the network, but there a node can go offline
and online again and serve the same data it did before going offline.
DHTs are not designed to be able to do this.
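The constraint in the paragraph above can be made explicit in code. This is a minimal sketch with a hypothetical API, not any real DHT implementation: a joining node that answered lookups before its whole key range had arrived could wrongly report "not found" for keys it is responsible for, so it must stay silent until the transfer completes.

```python
# Minimal sketch (hypothetical API) of why a joining DHT node cannot
# serve lookups until its whole key range has been pushed onto it.
class JoiningNode:
    def __init__(self, responsible_keys):
        self.responsible = set(responsible_keys)  # keys this node must hold
        self.store = {}                           # mappings received so far

    def receive(self, key, value) -> None:
        """Incrementally receive one mapping from the existing replicas."""
        self.store[key] = value

    @property
    def ready(self) -> bool:
        """Only once every responsible key has arrived may we answer."""
        return self.responsible.issubset(self.store)

    def lookup(self, key):
        if not self.ready:
            raise RuntimeError("still joining: answers could be wrong")
        return self.store.get(key)

node = JoiningNode({"a", "b"})
node.receive("a", 1)
assert not node.ready   # "b" not yet pushed; must not serve lookups
node.receive("b", 2)
assert node.ready       # now the node can act as a replica
```

If the key range contains file-sized values, the window between joining and becoming ready grows to the transfer times estimated earlier, during which the node contributes nothing.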

Not too attackable? Please store this 5TB block for me, your node is responsible for its key.

If there are different hashes of the keys, this shouldn't be a problem.

I don't understand. I believe all I have to do to bring down a network
like this is to create really big blocks and store one on every node. Of
course you could also generate a really big number of mappings, but
restricting the number of mappings a block can put into the DHT is much
less harmful than restricting not only the number, but also the size of
blocks.
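The asymmetry of this attack can be put in numbers. All figures here are hypothetical except the 5 TB block, which is the example above: the attacker pays roughly one block upload per node, while the network is forced to provide the full block size in storage at every node.

```python
# Sketch of the attack cost asymmetry (figures hypothetical).
def forced_storage(n_nodes: int, block_bytes: int,
                   blocks_per_node: int = 1) -> int:
    """Total storage the network must provide if an attacker stores
    blocks_per_node oversized blocks on each of n_nodes nodes."""
    return n_nodes * blocks_per_node * block_bytes

# 1,000 nodes, one 5 TB block each: 5 PB forced onto the network
print(forced_storage(1_000, 5 * 10**12))
```

Capping the number of mappings per block bounds this at a small constant; without also capping block size, the per-mapping cost is unbounded.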

The thing I'm looking for is some guarantees about performance. For DHTs,
they seem to be there, but for DHT + IP-based fetch I don't think so.
Or can you point me somewhere where that has been analyzed?

I don't know whether it has been analyzed. I doubt that the system you
are talking about has been, though, because I do not remember having
read a discussion of the problems I name above. The DHT-based file
sharing systems I'm aware of use the IP-based fetch, which is also the
technology used in Napster and Gnutella (where Napster has a "trusted
third party" scheme for the lookup step, apparently avoiding some
problems).

Again, I'm asking for pointers about *any* system that works as you
said, for transient peers (joining & leaving regularly).

- Benja




