
[Gzz-commits] gzz/Documentation/misc/benja-diff-fa thesis.rst


From: Benja Fallenstein
Subject: [Gzz-commits] gzz/Documentation/misc/benja-diff-fa thesis.rst
Date: Sun, 09 Feb 2003 05:32:33 -0500

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Benja Fallenstein <address@hidden>      03/02/09 05:32:33

Modified files:
        Documentation/misc/benja-diff-fa: thesis.rst 

Log message:
        Remove/rewrite more stuff I can't use in the thesis

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/benja-diff-fa/thesis.rst.diff?tr1=1.3&tr2=1.4&r1=text&r2=text

Patches:
Index: gzz/Documentation/misc/benja-diff-fa/thesis.rst
diff -u gzz/Documentation/misc/benja-diff-fa/thesis.rst:1.3 
gzz/Documentation/misc/benja-diff-fa/thesis.rst:1.4
--- gzz/Documentation/misc/benja-diff-fa/thesis.rst:1.3 Sun Feb  9 05:06:47 2003
+++ gzz/Documentation/misc/benja-diff-fa/thesis.rst     Sun Feb  9 05:32:33 2003
@@ -44,7 +44,7 @@
 Dangling links and keeping track of alternative versions. 
 Resolvable location-independent identifiers
 make these issues much easier to deal with, since data
-can be recognized whereever it is moved [#]_. 
+can be recognized wherever it is moved [#]_. 
 
 .. [#] It might be more appropriate to speak about *resources*
    and *references* instead of *documents* and *links*, but
@@ -72,7 +72,7 @@
 
 Location-independent identifiers for documents 
 make such a system unnecessary; a peer-to-peer lookup system 
-can find documents whereever they are moved.
+can find documents wherever they are moved.
 Such a system also works for data not publicized on the Internet.
 For example, if one email has a document attached to it, and another email
 links to this document, an index of locally stored documents
@@ -161,8 +161,8 @@
 2. Related Work
 ===============
 
-2.1. Dangling links
--------------------
+2.1. Dangling links and alternative versions
+--------------------------------------------
 
 The dangling link problem has received a lot of attention
 in hypermedia research [refs]. As examples, we examine the ways
@@ -208,46 +208,29 @@
 it would be possible to find both the document and links to it,
 no matter which peer in the network they are stored on.
 
-
-2.2. Alternative versions
--------------------------
-
 Version control systems like CVS or RCS [ref] usually assume
 a central server hosting a repository. The WebDAV/DeltaV protocols,
 designed for interoperability between version control systems, inherit
-this assumption [ref]. On the other hand, Arch [ref] places all repositories
+this assumption [ref]. 
+
+On the other hand, Arch [ref] places all repositories
 into a global namespace and allows independent developers 
 to branch and merge overlapping repositories without any central control
 [is there a specific ref for this?].
 
-Lotus Notes [ref], popular database sharing and colloboration tool, has some 
-similarities to Storm. In both systems, for instance, data is identified by 
-using GUIDs. However, partly because of the long age of the system, Lotus Notes 
-is limited to client-server architecture, where as Storm exploits adaptive peer-to-peer 
-architecture. Groove [ref] is a improved design of Lotus Notes, which employs 
-strong security mechanisms and uses peer-to-peer functionality 
-as basis of communication channel among limited amount of participants. 
-Neither of these systems supports the immutability of data.
-
-.. thesis-benja: remove paragraph above
-
-[ref HTML version format proposal] Alternate versions important for
-authoring process [search refs]. (Note: Keeping track of versions
-structure is also \*hyper*media. Refs?) (WebDAV!)
-
 
 2.3. Peer-to-peer systems
 -------------------------
 
 .. thesis-benja: check what needs to be rewritten below
 
-During the last few years, there has been a lot of research
-related to peer-to-peer resource discovery, both academical and in the 
industry.
-There are two main approaches: broadcasting [gnutella1, kazaa, limewire,
-shareaza], and distributed hashtables (DHTs) [chord, can, tapestry, pastry,
-kademlia, symphony, viceroy]. Broadcasting systems
-forward queries to all systems reachable in a given number of hops
-(time-to-live). DHTs store (key,value) pairs which can be found given
+There are two major directions in peer-to-peer search systems,
+originally motivated by file-sharing systems like Napster.
+The more conventional one, flooding broadcasting networks [gnutella1, 
+kazaa, limewire, shareaza], forwards queries to all peers
+reachable in a given number of hops (time-to-live).
+The second, distributed hashtables (DHTs) [chord, can, tapestry, pastry,
+kademlia, symphony, viceroy], stores (key,value) pairs which can be found given
 the key; a DHT assigns each peer a subset of all possible keys, and
 routes queries for a given key to the peer responsible for it.
 Before a pair can be found, it must be *inserted* in the DHT
@@ -255,31 +238,26 @@
 use an application-level overlay network for routing.
 
 While broadcasting systems' performance is linear, DHTs' performance
-usually has log-like bounds in the number of peers 
-for *all* internal operations [#]_. This scalability is
+usually has log-like bounds in the number of peers. This scalability is
 what makes global searches feasible in DHTs. In broadcasting approaches,
 on the other hand, scalability is achieved by forwarding queries 
 only to a limited subset of the peers (bounded by the time-to-live),
 which means that searches in these systems are not truly global.
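The bounded reach of the broadcasting approach described above can be sketched in a few lines. This is a minimal illustration (all names hypothetical, not from any particular system): each peer forwards the query to its neighbors, the time-to-live caps the number of hops, and peers beyond that horizon are simply never searched.

```python
def flood_query(peers, start, ttl):
    """Return the set of peers reached by flooding from `start` within `ttl` hops.

    `peers` maps a peer id to a list of its neighbors' ids.
    """
    reached = {start}
    frontier = [start]
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            for neighbor in peers[peer]:
                if neighbor not in reached:
                    # Forward the query one hop further.
                    reached.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return reached

# A chain of peers 0-1-2-3-4: with ttl=2, a query starting at peer 0
# never reaches peers 3 and 4, so the search is not global.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(flood_query(chain, start=0, ttl=2))  # {0, 1, 2}
```

In a real network the neighbor graph is formed at random and queries carry an identifier so a peer does not forward the same query twice; the hop bound is the essential point here.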
 
-.. [#] It's not clear whether *all* proposed DHT designs can preserve
-   log-like properties when participants are heterogeneous and they 
-   join and leave the system in a dynamic manner. 
-
 A DHT has a *key space*, for example the points on a circle.
 The keys in (key,value) pairs are mapped to points in the key space
 through a hash function. Independently, each peer is assigned
 a point in the space. The DHT defines a distance metric
-between points in the key space (e.g. numeric, XOR); the peer
+between points in the key space; the peer
 responsible for a hashtable key, then, is the one that is *closest*
 to it in the key space, according to the distance metric.
-A peer, then, is analogous to a hashtable bucket.
+A DHT peer is roughly analogous to a hashtable bucket.
 Queries are routed to the overlay network, each hop bringing
 them closer to its destination in key space, until they reach
 the peer responsible for them.
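The key-space idea in the paragraph above can be made concrete with a small sketch (hypothetical names, not the scheme of any specific DHT): keys and peers are hashed onto a circle, and the peer closest to a key under a circular distance metric acts as that key's hashtable bucket.

```python
import hashlib

SPACE = 2 ** 16  # size of the circular key space

def to_point(name):
    """Hash a peer name or hashtable key onto a point on the circle."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % SPACE

def distance(a, b):
    """Circular distance between two points in the key space."""
    d = abs(a - b)
    return min(d, SPACE - d)

def responsible_peer(peers, key):
    """The peer whose point is *closest* to the key's point."""
    point = to_point(key)
    return min(peers, key=lambda p: distance(to_point(p), point))

peers = ["peer-A", "peer-B", "peer-C", "peer-D"]
# Insertion and lookup apply the same rule, so both arrive at the same
# peer no matter where in the network they start.
assert responsible_peer(peers, "thesis.rst") == responsible_peer(peers, "thesis.rst")
```

A real DHT does not compute this minimum globally; each peer knows only a few others, and the overlay routing brings a query closer to the responsible peer at every hop, which is where the log-like bounds come from.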
 
-.. http://sahara.cs.berkeley.edu/jan2003-retreat/ravenben_api_talk.pdf
-   Full paper will appear in IPTPS 2003 -Hermanni
+A common API that can be supported by current and future DHTs
+is being proposed in [zhao02api].
 
 Recently, a few DHT-like systems have been developed which employ
 a key space similarly to a DHT, but in which queries are routed
@@ -289,34 +267,6 @@
 close keys in the custody of a 'hashtable bucket' peer is removed
 at the cost of each peer maintaining one node in the overlay network
 for each (key,value) pair it publishes.
-
-.. hemppah's original text before benja's changes:
-   In a DHT, both hashtable items and the addresses of peers 
-   are mapped into a single virtual key space. The form of the key space 
-   depends on implementation (for example, Chord uses a circle).
-   A distance metric (e.g. numerical, XOR) is used to find the peer
-   whose position in the key space is 'closest' to the key of a given item.
-   This peer is responsible to store the item (so both queries and insertions
-   relating to the key are routed to it.) Thus, 
-   DHT's overlay connectivity graph is structured. On the other hand, the overlay 
-   connectivity graph of broadcasting approach is formed more or less (depends on 
-   implementation) in a random manner. 
-
-   When performing queries, in broadcasting approach, peer sends a query request to a 
-   subset of its neighbors and these peers to their subsequent neighbors. The 
-   process will continue as long as query's time-to-live (TTL) value hasn't been reached. 
-   In DHT approach, query request is deterministically routed towards the peer 
-   which hosts a specific data item. Routing is based on 'hints' (based on 
-   differences between data item's key and peer's key), which each peer provides 
-   along the routing path.
-
-   Obviously, there are major differences within approaches. For the DHT approach, 
-   perhaps the main difference is *what* is self-organized into a
-   virtual key space. For instance, in SWAN [ref] and Skip Graph [ref], *data 
-   items* self-organise into a virtual address space, while in other DHT 
-   implementations *peers* self-organise in structured form in a virtual space. 
-   In the broadcasting approach, implementations' differences mostly lie in the 
-   *structural level* of the overlay network, i.e. super peers and peer clusters.
 
 The basic definition of a distributed hashtable does not indicate
 how large the keys and values used may be. Intuitively, we expect keys



