[Gridpt-discuss] work history


From: Pedro Andrade
Subject: [Gridpt-discuss] work history
Date: Fri, 25 Jul 2003 17:51:45 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624

Hi

Just to keep the new members up to date.
With this mail we intend to describe what we have been doing, what decisions we have taken, and in what directions we are thinking of moving. This is not meant to be a complete account of the work done, but only to clarify the progress of our ideas and the current situation.

Our first proposal (http://fisica.fe.up.pt/cgi-bin/twiki/view/Gridpt/FirstProposal) was to build a small centralized Grid system for job submission and data management. In this architecture, all the information about jobs is kept in a central element, which receives requests from users and forwards them to worker nodes. Those worker nodes have previously declared to the central element that they are ready to receive jobs. The central server stores no information regarding the status of the worker nodes: whenever a worker node feels that it "can help" the system by executing a job, it asks the central server if there is something in the queue that it can execute. The main idea of this prototype was a centralized grid architecture with a pull job model, based also on the principle of data partitioning (the system works better if the data is split across all the worker nodes). Jobs run where the data already is; there is no data movement, or at least data movement is reduced to a minimum.
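
To make the pull model concrete, here is a rough Python sketch (all names are purely illustrative; this is not our prototype code) of a central queue and a worker polling it:

import queue
import time

class CentralQueue:
    """Central element: queues jobs, stores nothing about the workers."""
    def __init__(self):
        self.jobs = queue.Queue()

    def submit(self, job):
        self.jobs.put(job)

    def fetch(self, local_datasets):
        # Hand out a job only if the asking worker already holds the
        # input data, preserving the "no data movement" principle.
        skipped, picked = [], None
        while not self.jobs.empty():
            job = self.jobs.get()
            if picked is None and job["dataset"] in local_datasets:
                picked = job
            else:
                skipped.append(job)
        for job in skipped:
            self.jobs.put(job)
        return picked

def worker_loop(central, local_datasets):
    # Pull model: the worker decides when it "can help" and polls
    # the central server for something on the queue it can execute.
    while True:
        job = central.fetch(local_datasets)
        if job:
            print("running", job["name"], "on local data", job["dataset"])
        time.sleep(5)  # poll interval

A user would just do central.submit({"name": "fit", "dataset": "d42"}) and let whichever worker holds "d42" pick the job up.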

Then we started to realize that this approach would only lead to a small EDG-style architecture with all the problems of a centralized system (a single point of failure: if the server goes down, everything goes). For this reason we started analyzing some P2P systems, which are known for their robustness against failure. From this analysis we came to a second proposal (http://fisica.fe.up.pt/cgi-bin/twiki/view/Gridpt/SecondProposal). In it we defined a simple and efficient distribution of highly parallelizable services using a peer-to-peer approach (perhaps using web services for communication). The main idea was still to preserve the "no data movement" principle, but instead of a centralized pull system we would now have a decentralized push system, where some semi-central nodes (aggregators) receive the requests from the user, query all the nodes, aggregate the answers, and return them to the user. With this approach there is no longer a single point of failure (no EDG resource broker), since it is up to each peer to decide whether it can run the job or not. The aggregator node, once again, knows nothing about the worker nodes except that they exist; there is no central repository containing a "global" status of the system.
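
As a rough illustration of the fan-out step (the peer interface here is invented for the sketch; nothing is decided yet about the actual communication layer), the aggregator could look like this:

from concurrent.futures import ThreadPoolExecutor

def aggregate(peers, request):
    # Push model: the aggregator knows only that the peers exist.
    # It forwards the request to all of them; each peer decides for
    # itself whether it can handle it (no central resource broker).
    def ask(peer):
        try:
            return peer.handle(request)  # e.g. a web-service call
        except Exception:
            return None  # a failed peer is simply skipped

    with ThreadPoolExecutor() as pool:
        answers = pool.map(ask, peers)

    # Aggregate the non-empty answers and return them to the user.
    return [a for a in answers if a is not None]

Note how a dead peer costs nothing: its answer is just missing from the aggregate, instead of taking the whole system down.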

From this generic approach we then started to clarify how the system could be implemented. Its main characteristics should be:
- decentralized system (peer-to-peer)
- local information catalog (each peer just knows what it has)
- abstract information types (a peer would not have one catalog specific to data management and another specific to an information index; instead, it would have an abstract metadata storage system capable of dealing with all of these; see the sketch after this list)
- modular architecture
- a first implementation of this system should be a data management demonstration
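
To give an idea of what we mean by an abstract information storage system, here is a toy sketch (the class and field names are just our illustration, not a design decision):

class LocalCatalog:
    # Each peer knows only what it has: a flat store of metadata
    # entries, not separate catalogs for data management and indexing.
    def __init__(self):
        self._entries = []

    def add(self, **metadata):
        self._entries.append(metadata)

    def query(self, **constraints):
        # Match any entry whose metadata satisfies all constraints,
        # regardless of what "type" of information it describes.
        return [e for e in self._entries
                if all(e.get(k) == v for k, v in constraints.items())]

catalog = LocalCatalog()
catalog.add(type="file", name="run42.dat", size=1024)
catalog.add(type="index", name="run42.dat", keywords=["calorimeter"])
print(catalog.query(name="run42.dat"))  # both entries match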

After some analysis we came upon the following two major aspects, which we started working on in order to find the best solution:

1) P2P system
There are several types of peer-to-peer systems:
- centralized approach (Napster style)
- semi-centralized approach (Kazaa style)
- decentralized approach (Gnutella style)
All of these have their advantages and disadvantages. We think the one that can bring better results is the semi-centralized approach, because it does not have the single point of failure of the centralized systems and also does not flood the entire network with queries/requests like the decentralized systems do. Semi-centralized systems use the idea of super nodes: peers are joined into groups, each with one super node, and organized in a hierarchical way. Besides this network-structure issue, there is another important problem: the discovery mechanism. How should peers discover other peers? Should they even discover them at all, or remain unaware of other peers? What should be the function of the super node/aggregator? On the Twiki (http://fisica.fe.up.pt/cgi-bin/twiki/view/Gridpt/NodeRendezvous) we describe how the peers could register and operate in the network. One good solution is to use Kademlia, a protocol we have studied that uses hash keys to find nodes it does not yet know about.
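
For illustration, the heart of Kademlia is an XOR metric over node IDs: a peer locates an unknown node by repeatedly asking the closest nodes it already knows for even closer ones. A toy version of the distance step (8-bit IDs for readability; Kademlia really uses 160-bit hash keys):

def xor_distance(node_a, node_b):
    # Kademlia measures "closeness" as the XOR of two IDs,
    # interpreted as an integer: smaller means closer.
    return node_a ^ node_b

def closest_known(target_id, known_ids, k=3):
    # Return the k known nodes closest to the target; a real lookup
    # would then query these nodes for even closer ones, iteratively.
    return sorted(known_ids, key=lambda n: xor_distance(target_id, n))[:k]

known = [0b00010111, 0b10100001, 0b01100100, 0b00010001]
print(closest_known(0b00010101, known))  # -> [23, 17, 100]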

2) Information layer
Concerning information/data storage, searching, and retrieval, two different options are being analysed:
- Using RDF (http://fisica.fe.up.pt/cgi-bin/twiki/view/Gridpt/RdfSchema). Using RDF will allow a more standard and generic implementation, but querying/retrieving different parts of the metadata will be more complicated.
- Using local databases (http://fisica.fe.up.pt/cgi-bin/twiki/view/Gridpt/LocalDB). Using a local database will allow more powerful control over the query system, but it brings the disadvantage of a more rigid schema for the metadata.
We are now trying to find a solution that combines the best parts of these two alternatives.
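
To illustrate the trade-off, here is a toy comparison using only a local SQLite database (the table layouts and example metadata are invented for this sketch). A generic triple table can hold any metadata but needs a self-join for a two-attribute query; a fixed-schema table makes the same query trivial but hard-codes the metadata layout:

import sqlite3

db = sqlite3.connect(":memory:")

# Option 1: RDF-style triples. One generic (subject, predicate, object)
# table holds any metadata, but a query over several attributes must
# join the table against itself.
db.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
db.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("file:run42", "name", "run42.dat"),
    ("file:run42", "owner", "gridpt"),
])
rows = db.execute("""
    SELECT a.subject FROM triples a JOIN triples b ON a.subject = b.subject
    WHERE a.predicate = 'name' AND a.object = 'run42.dat'
      AND b.predicate = 'owner' AND b.object = 'gridpt'
""").fetchall()

# Option 2: local database with a fixed schema. The same query is a
# simple WHERE clause, but adding a new kind of metadata means
# changing the schema.
db.execute("CREATE TABLE files (name TEXT, owner TEXT)")
db.execute("INSERT INTO files VALUES ('run42.dat', 'gridpt')")
rows = db.execute(
    "SELECT name FROM files WHERE name = ? AND owner = ?",
    ("run42.dat", "gridpt"),
).fetchall()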

---

So, to sum up, we'd like to build a peer-to-peer network with a metadata layer on top. The exact mechanisms for the P2P system are still being studied (although we have some fairly definite ideas about this). The metadata layer is more troubling, because it can actually influence the way the peers work together.

As usual, we appreciate any comments or questions. It's actually quite difficult to try and explain a system with so many variants and possibilities in such a "short" email.

Best Regards,
Miguel Branco
Pedro Andrade




