circle-discuss

Re: [circle] More structural changes


From: Paul Campbell
Subject: Re: [circle] More structural changes
Date: Mon, 18 Oct 2004 10:09:44 -0700
User-agent: Mutt/1.5.6i

On Sun, Oct 17, 2004 at 11:20:41PM -0700, malcolm handley wrote:
> Wow, thanks. That's a lot of detail.
> 
> I'd be happy to provide comments on the design of the API if you want, 
> but I really like the direction that you are going in.
> 
> I agree with your plan of making deep changes to the internals where it 
> makes sense, though it does raise the questions that you mentioned 
> about what it means to be the circle.
> 
> Messiah uses the circle for a few reasons:
> - It works. When compared to other p2p systems that we tried, it worked 
> much better with respect to how fast a new node integrated into the 
> network and the speed and reliability of operations once on the 
> network.
> - It has a simple API. Being written in Python was a big plus.
> 
> To be honest, I am not very interested in the circle's GUI. I like 
> the fact that the circle works well, is modular and has a simple API.
> 
> Messiah uses the node API to exchange metadata about songs (which files 
> are versions of a given song, which users think which files are high 
> quality, etc.) and uses file_server to actually transfer the music. 
> Messiah is not just a thin layer on top of the circle because it is 
> trying to create a distributed trust network so that users can find 
> high quality files when they are looking for a song. It also integrates 
> with FreeDb to provide a better browsing experience.

You might have something there. I've been considering something else while
doing the refactoring, and trying to plan for a future modification.

Right now, the circle doesn't really do much with latency information except
that it keeps track of it for message acknowledgement purposes. Also, it DOES have
provisions to deal with spamming nodes and/or bogus/false data coming from
nodes...in short, to deal with security threats.

However, right now it does nothing with that information.

In my mind, the most practical way to deal with security threats in a DHT
is to implement a distributed trust protocol. Somehow, nodes as a group need
to determine the degree to which a node can be trusted, and to boot it off the
network if its trust level gets too low. This requires a coordinated effort,
which in turn requires a trust protocol that makes decisions at the group level.

Which is where Messiah comes in...if you've already got a trust network on
the level of group trust metrics like Advogato or some such, then it could
possibly be used at the network-code level in the future.
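
As a rough illustration of the kind of group-level decision described above,
here is a minimal sketch in Python. Everything in it is hypothetical: the
TrustLedger class, the 0.0 to 1.0 rating scale, the eviction threshold, and the
minimum report count are assumptions made for the example, and none of it
comes from the circle or from Messiah.

```python
from collections import defaultdict


class TrustLedger:
    """Toy group-level trust tally (hypothetical; not part of the circle)."""

    def __init__(self, eviction_threshold=0.3, min_reports=3):
        # Both values are arbitrary assumptions chosen for the example.
        self.eviction_threshold = eviction_threshold
        self.min_reports = min_reports
        # subject node id -> {reporter node id: rating in [0.0, 1.0]}
        self.reports = defaultdict(dict)

    def report(self, reporter, subject, rating):
        """Record one node's opinion of another; the latest rating wins."""
        if not 0.0 <= rating <= 1.0:
            raise ValueError("rating must be in [0.0, 1.0]")
        self.reports[subject][reporter] = rating

    def trust_level(self, subject):
        """Mean rating, or None if too few nodes have reported."""
        ratings = self.reports.get(subject, {})
        if len(ratings) < self.min_reports:
            return None
        return sum(ratings.values()) / len(ratings)

    def should_evict(self, subject):
        """True only when enough reports exist and the mean is too low."""
        level = self.trust_level(subject)
        return level is not None and level < self.eviction_threshold
```

A plain average like this is easy for colluding nodes to game, which is one
reason a group trust metric in the style of Advogato (based on network flow
from trusted seeds) is the more interesting candidate. The sketch only shows
where such a metric would plug in.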

> The circle has a few problems, though, including the fact that it is 
> somewhat buggy and, as you are well aware, complicated. If we have to 
> we will debug the circle's code as necessary to get Messiah working but 
> we would prefer to be using your new, clean code.

Most of the bugs that I've been seeing are either the usual
virtually-impossible-to-find kind that occur in threaded code, or else they
are in the GUI. My text terminal shows a constant stream of error messages
from the GUI code.

> This is a long way of saying that I think that you should change 
> whatever you please in the circle's innards. If your code works, or is 
> easy enough to work with that we can help make it work, we would be 
> happy to make Messiah use it. Given the low activity on the circle I 
> don't think that you will get much opposition to your changes. If you 
> do, then fork the code.

For now, the code should be forked anyways. Then it remains to be seen
whether one path or the other should progress. In fact, ThomasV actually
had his own fork where he was intending to refactor node.py as well. But
he didn't have much interest in doing it.

> I'll let you know what progress we make.

I've got another issue related to file_server. Do you understand it well enough
to figure out how it implements keyword searches? I haven't spent a lot of
time on it but I haven't exactly gleaned the algorithm out of it yet.

The underlying publish/lookup code in node.py (the DHT) really doesn't do
keyword searches. In fact, I'm not even sure if it allows for the possibility
of multiple entries (keys with multiple values). Without that capability, for
instance, you can't directly map keywords onto the DHT (hash("mp3") should
refer to hundreds of file references, not just a single value).
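
To make the multiple-values point concrete, here is a minimal sketch of a
store where one key holds a set of values. It is a local, in-memory stand-in
and an assumption throughout: MultiValueStore, key_for, publish, and lookup
are hypothetical names for the example, not the actual interface of node.py.

```python
import hashlib
from collections import defaultdict


def key_for(keyword):
    """Hash a keyword to a hex key (SHA-1 is an arbitrary choice here)."""
    return hashlib.sha1(keyword.lower().encode("utf-8")).hexdigest()


class MultiValueStore:
    """In-memory stand-in for a DHT whose keys can hold many values."""

    def __init__(self):
        # key -> set of values published under that key
        self._table = defaultdict(set)

    def publish(self, key, value):
        # Adding to a set keeps earlier values instead of overwriting them.
        self._table[key].add(value)

    def lookup(self, key):
        # Return a copy so callers cannot mutate the stored set.
        return set(self._table.get(key, ()))
```

With this shape, publishing "song_a.mp3" and "song_b.mp3" under
key_for("mp3") leaves both references retrievable by a single lookup, whereas
a single-value table would keep only the last one published.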

For instance, does it just implement keyword searches using the "stupid"
approach of directly mapping the inverted index onto the DHT? If that's the
case, then there are serious problems with scalability.
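
The scalability concern can be shown with a small simulation of the direct
mapping. This is a sketch under stated assumptions, not a description of what
file_server does: NaiveKeywordIndex, responsible_node, and the modulo
placement of keywords on nodes are all hypothetical simplifications.

```python
import hashlib
from collections import defaultdict


def responsible_node(keyword, node_count):
    """Pick the node that owns a keyword (simplified: hash modulo N)."""
    digest = hashlib.sha1(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class NaiveKeywordIndex:
    """Inverted index mapped directly onto simulated DHT nodes."""

    def __init__(self, node_count):
        self.node_count = node_count
        # One dict per node: keyword -> set of file ids (the posting list)
        self.nodes = [defaultdict(set) for _ in range(node_count)]

    def publish(self, file_id, keywords):
        # Each keyword's whole posting list lives on a single node.
        for kw in keywords:
            owner = responsible_node(kw, self.node_count)
            self.nodes[owner][kw].add(file_id)

    def search(self, keywords):
        """Return (matching file ids, total posting entries fetched)."""
        postings = []
        for kw in keywords:
            owner = self.nodes[responsible_node(kw, self.node_count)]
            postings.append(set(owner.get(kw, ())))
        transferred = sum(len(p) for p in postings)
        matches = set.intersection(*postings) if postings else set()
        return matches, transferred
```

If 1000 files are each published under "mp3" plus one unique title keyword,
a search for "mp3" and a single title has to pull the full 1000-entry "mp3"
list from one node in order to return one match. Both the hot spot on that
node and the size of the transfer grow with the popularity of the keyword.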



