[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
From: |
Hermanni Hyytiälä |
Subject: |
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... |
Date: |
Mon, 26 May 2003 05:07:05 -0400 |
CVSROOT: /cvsroot/gzz
Module name: gzz
Changes by: Hermanni Hyytiälä <address@hidden> 03/05/26 05:07:05
Modified files:
Documentation/misc/hemppah-progradu: masterthesis.tex
Log message:
Steven's comments
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.204&tr2=1.205&r1=text&r2=text
Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.204
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.205
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.204 Wed May
21 03:01:55 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex Mon May 26
05:07:05 2003
@@ -264,7 +264,7 @@
and KaZaa \cite{kazaaurl} use the FastTrack-based flooding protocol
\cite{fasttrackurl}. However, it is not clear
whether the power-law method is scalable or not,
as the majority of the query requests are sent only to the high degree peers
while making
-these peers to bear the load of the entire system.
+these peers bear the load of the entire system.
%\begin{figure}
%\centering
@@ -287,7 +287,7 @@
\label{fig:gnutella_powerlaw}
\end{figure}
-Above presented improvements are only partial solutions. More advanced
techniques
+The improvements presented above are only partial solutions. More advanced
techniques
to improve data lookup of loosely structured systems are discussed in chapter
3. Yet, however,
techniques presented in chapter 3 are not adopted in any loosely structured
system.
@@ -318,24 +318,24 @@
\subsection{Systems}
-With tightly structured systems, it is feasible to perform \emph{global} data
lookups in the overlay efficiently. By global lookup, we mean
+With tightly structured systems, it is feasible to efficiently perform
\emph{global} data lookups in the overlay. By global lookup, we mean
that the system is able to find a service from the overlay, if it exists.
-While there are significant differences among proposed tightly structured
systems, they all have in common
-that \emph{peer identifiers} are assigned to participating peers from
-a large \emph{identifier space} by the overlay. Globally unique identifiers
-are also assigned to application-specific data items, \emph{keys},
+While there are significant differences among proposed tightly structured
systems, they all have a common property,
+in that \emph{peer identifiers} are assigned to participating peers from
+a large \emph{identifier space} by the overlay. Globally unique identifiers,
\emph{keys},
+are also assigned to application-specific data items
which are selected from the same identifier space. For instance, globally
unique keys can be created
using a cryptographic content hash function (e.g., SHA-1 \cite{fips-sha-1})
over the contents of a data item.
-The form of identifier space differs between proposed systems. Geometrical
circular form of identifier space (and variants)
+The form of identifier space differs between proposed systems. A geometrical
circular form of identifier space (and variants)
is most widely used. For instance, Chord \cite{stoica01chord}, Koorde
\cite{kaashoek03koorde},
Pastry \cite{rowston01pastry}, SWAN \cite{bonsma02swan}, Tapestry
\cite{zhao01tapestry}
and Viceroy \cite{malkhi02viceroy} use a circular form of identifier space of
$n$-bit integers modulo $2^{n}$. The
value of $n$ varies among systems. Again, CAN \cite{ratnasamy01can} uses a
$d$-dimensional geometrical torus
model to implement the form of identifier space.
-To store data into a tightly structured overlay, each application-specific
+To store data in a tightly structured overlay, each application-specific
unique key (e.g., SHA-1 \cite{fips-sha-1}) is \emph{mapped} uniformly (e.g.,
using consistent
-hashing \cite{258660}) to an existing peer in the overlay. Thus, tightly
+hashing \cite{258660}) to an existing peer in the overlay. Thus, a tightly
structured overlay assigns a subset of all possible keys to every
participating peer.
We say that a peer is \emph{responsible} for the keys which are assigned by
the overlay.
Figure \ref{fig:structured_hashing} illustrates this
@@ -357,7 +357,7 @@
distributed data structure which resembles skip lists \cite{78977}.
In figure \ref{fig:structured_query}, we present an overview of Chord's data
lookup process.
On the right side of Chord's lookup process, the same data lookup process
-is shown as a binary-tree abstraction. It can be noticed, that in each step,
the distance
+is shown as a binary-tree abstraction. It can be seen, that in each step, the
distance
decreases with a logarithmic efficiency.
\begin{figure}
@@ -372,7 +372,7 @@
\ref{fig:kademlia_lookup} shows the process of Kademlia's
data lookup. Viceroy \cite{malkhi02viceroy} maintains a butterfly data
structure (e.g., \cite{226658}),
which requires only a constant number of neighbor peers while providing
$O(\log{n})$ data lookup
-efficiency, where $n$ is the number of peers in the system. Koorde
\cite{kaashoek03koorde}, a recent modification of Chord, uses de Bruijn graphs
+efficiency where $n$ is the number of peers in the system. Koorde
\cite{kaashoek03koorde}, a recent modification of Chord, uses de Bruijn graphs
\cite{debruijn46graph} to maintain local routing tables. It requires
each peer to have only about two links to other peers to provide $O(\log{n})$
performance.
@@ -388,7 +388,7 @@
\cite{zhao03api}. Each of these abstractions represent a storage layer in the
overlay, but
have semantical differences in the \emph{usage} of the overlay.
-First, Distributed Hash Table (DHT) (see e.g., \cite{dabek01widearea},
\cite{rowstron01storage}),
+First, Distributed Hash Table (DHT) (see e.g., \cite{dabek01widearea},
\cite{rowstron01storage})
implements the same functionality as a regular hash table by storing the
mapping between a key and a value:
\begin{itemize}
@@ -398,7 +398,7 @@
\end{itemize}
DHT's \emph{interface} is generic; values can be any size and type (e.g.,
content hash over a file). In the
-DHT abstraction, the overlay itself stores the data items. Figure
\ref{fig:Structured_lookup_using_DHT_model} shows the DHT abstraction
+DHT abstraction the overlay itself stores the data items. Figure
\ref{fig:Structured_lookup_using_DHT_model} shows the DHT abstraction
of the tightly structured overlay.
Second, Decentralized Object Location (DOLR) (see e.g.,
\cite{kubiatowicz00oceanstore}, \cite{iyer02squirrel}) is a distributed
@@ -412,11 +412,11 @@
\end{itemize}
-The key difference between the DHT and the DOLR abstraction is that in the
DOLR abstraction the overlay maintains only \emph{pointers} to the data.
-Also, the DOLR abstraction routes overlay's messages to a nearest available
peer, hosting a specific data item. This form of locality
+The key difference between the DHT and the DOLR abstraction is that, in the
DOLR abstraction, the overlay maintains only \emph{pointers} to the data.
+Also, the DOLR abstraction routes overlay messages to the nearest available
peer, hosting a specific data item. This form of locality
is not supported by DHT. DOLR's interface is similar to the DHT's interface,
i.e., values can be any size and type.
-Third, tightly structured overlay can be used for scalable group multicast or
anycast operations (CAST) (see e.g., \cite{zhuang01bayeux}).
+Third, tightly structured overlays can be used for scalable group multicast or
anycast operations (CAST) (see e.g., \cite{zhuang01bayeux}).
The basic operations include:
\begin{itemize}
@@ -426,7 +426,7 @@
\item \texttt{anycast(message, groupIdentifier)}: anycast a message to a group
with a given group identifier.
\end{itemize}
-The DOLR and the CAST abstractions have in common that they both use network
proximity techniques
+The DOLR and CAST abstractions both use network proximity techniques
to optimize their operations in the overlay. Figure
\ref{fig:Strucutred_lookup_using_DOLR_model}
presents the DOLR abstraction.
@@ -456,14 +456,14 @@
Chord's \cite{stoica01chord} distance function does have the property of
unidirection
(for a given point $p_i$ in the identifier space and distance $d$ > 0, there
is exactly one point $p_j$ in a way that the distance between $p_i$ and $p_j$
-is $d$), but doesn't have symmetry (the distance from $p_i$ to $p_j$ is same
as the
+is $d$), but does not have symmetry (the distance from $p_i$ to $p_j$ is same
as the
distance from $p_j$ to $p_i$). Pastry's \cite{rowston01pastry} distance
function supports
-symmetry, but doesn't support unidirection. According to
\cite{balakrishanarticle03lookupp2p}, because
+symmetry, but does not support unidirection. According to
\cite{balakrishanarticle03lookupp2p}, because
of XOR-metric, Kademlia's distance function is both unidirectional and
symmetric. Moreover, Kademlia's \cite{maymounkov02kademlia}
XOR-based metric doesn't need stabilization (like in Chord
\cite{stoica01chord}) and backup links
(like in Pastry \cite{rowston01pastry}).
-However, in all above schemes each hop in the overlay shortens the distance
between
-current peer working with the data lookup and the key which was looked up in
the identifier space.
+However, in all of the above schemes, each hop in the overlay shortens the
distance between
+current peer working with the data lookup and the key that was looked up in
the identifier space.
Skip Graphs \cite{AspnesS2003} and SWAN \cite{bonsma02swan} employ a
identifier space
in which queries are routed to \emph{keys}. In these systems
@@ -484,7 +484,7 @@
Balakrishnan et al. \cite{balakrishanarticle03lookupp2p} have listed four
requirements
for tightly structured overlays\footnote{Authors use the term 'DHT' in their
text, but in this context
-it doesn't matter as they list \emph{general} properties of tightly structured
overlays.} which have to be addressed in order
+it doesn't matter as they list \emph{general} properties of tightly structured
overlays.} that have to be addressed in order
to perform efficient data lookups in tightly structured overlays.
First, mapping of keys to peers must be done in a load-balanced
way. Second, the overlay must be able to forward a data lookup for a
@@ -507,9 +507,9 @@
Even though the loosely structured and the tightly structured approach are
both Peer-to-Peer schemes, they
have very little in common. Indeed, the only thing they share is the fact that
no other peer is more
important than any other in the Peer-to-Peer network. Fault tolerance
\emph{may}
-be an area, in which approaches have similar properties (e.g., no single point
of failure) \cite{milojicic02peertopeer}.
+be an area in which approaches have similar properties (e.g., no single point
of failure) \cite{milojicic02peertopeer}.
Fault tolerance properties of both approaches are currently only initial
calculations, or
-experimented in simulation environments. In real-life, however, measuring
fault tolerance is much more
+experimented in simulation environments. In real life, however, measuring
fault tolerance is a much more
challenging task and requires more research to get reliable answers.
The most important differences between approaches are the performance and
scalability properties.
@@ -523,8 +523,8 @@
assume that participating peers are homogeneous, and the rate of join or leave
operation is low \cite{gurmeet03symphony,
libennowell01observations, rowston03controlloingreliability}.
-To end user, the biggest difference between these systems is how data lookups
are performed. Loosely
-structured systems provide more rich and user friendly way of searching data
than tightly structured systems
+To the end user, the biggest difference between these systems is how data
lookups are performed. Loosely
+structured systems provide a more rich and user friendly way of searching data
than tightly structured systems
as they have a support for keyword searches \cite{yang02efficientsearch,
lv02searchreplication}. Tightly structured
systems support only exact key lookups since each data item is identified by
globally unique keys \cite{balakrishanarticle03lookupp2p,
harren02complex, ansaryefficientbroadcast03}.
@@ -612,21 +612,21 @@
Table \ref{table_Peer-to-Peer_algorithms} lists proposed Peer-to-Peer
algorithms
and their key properties with regard to performance and scalability. The list
-includes algorithms from both loosely and tightly structured approaches. The
list doesn't
-include \emph{all} proposed Peer-to-Peer algorithms; only the ones which
already have
+includes algorithms from both loosely and tightly structured approaches. The
list does not
+include \emph{all} proposed Peer-to-Peer algorithms but rather includes the
ones which already have
been widely deployed, or the ones which may be promising in the future
-Peer-to-Peer systems are included.
+Peer-to-Peer systems.
We decided to follow the guidelines from \cite{kaashoek03koorde} in measuring
the properties of different Peer-to-Peer systems. However, we dropped
out fault tolerance and load balancing properties, since they are hard to
measure
-in face of real life requirements. Additionally, however, we decided to include
+in real life requirements. Additionally, however, we decided to include
the number of \emph{real} network connections for each peer in the overlay.
Next,
we describe the listed properties of Peer-to-Peer algorithms:
\begin{itemize}
\item \textbf{Lookup}: the number of messages required when a data lookup is
performed.
-\item \textbf{Space}: the number of other peers which peer knows about
(neighbors).
+\item \textbf{Space}: the number of other peers a peer is aware of (neighbors).
\item \textbf{Insert/delete}: the number of network messages required when a
peer joins or leaves the system.
\item \textbf{Number of network connections}: the number of concurrent
network connections required to maintain correct neighbor information.
\end{itemize}