[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
From: |
Hermanni Hyytiälä |
Subject: |
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... |
Date: |
Fri, 07 Mar 2003 03:53:48 -0500 |
CVSROOT: /cvsroot/gzz
Module name: gzz
Changes by: Hermanni Hyytiälä <address@hidden> 03/03/07 03:53:45
Modified files:
Documentation/misc/hemppah-progradu: masterthesis.tex
Log message:
More updates
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.125&tr2=1.126&r1=text&r2=text
Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.125
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.126
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.125 Thu Mar
6 10:12:32 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex Fri Mar 7
03:53:41 2003
@@ -19,7 +19,7 @@
%***********************
\title{Fenfire in Peer-to-Peer Environment}
-\translatedtitle{Fenfire vertaisverkko ympäristössä}
+\translatedtitle{Fenfire ja vertaisverkot}
\author{Hermanni Hyytiälä}
@@ -45,13 +45,13 @@
problems, which have not solutions at all, or problems have proposed
solutions but they are practically unrealizable.
-Then, we give an overview of our Fenfire system. We evaluate existing
+Then, we give an overview of Fenfire system. We evaluate existing
Peer-to-Peer approaches-- loosely and tightly structured overlays-- with regard
to Fenfire's needs. Finally, we propose simple algorithms to efficiently find
Fenfire
related data from Peer-to-Peer network.
}
\tiivistelma{
-Tässä opinnäytetyössä arvioimme olemassaolevia vertaisverkkoja, protokollia ja
+Tässä opinnäytetyössä arvioimme olemassaolevia vertaisverkkoja, algoritmeja ja
niiden erityisominaisuuksia. Teemme yhteenvedon olemassa olevista ongelmista
vertaisverkoissa ja jaamme ongelmat kolmeen alakategoriaan. Havaitsemme, että
on olemassa useita ongelmia, joihin ei ole ratkaisua lainkaan, tai on ehdotelma
@@ -87,8 +87,8 @@
and reliability againts certain kinds of faults (e.g., single point of
failure).
There are many definitions of Peer-to-Peer networks. The Intel Peer-to-Peer
-Working Group defines it as ''the sharing of computer resources and defines
-and services by direct exchange between systems'' \cite{p2pworkinggroup}.
+Working Group defines it as ''the sharing of computer resources and services
+by direct exchange between systems'' \cite{p2pworkinggroup}.
Dave Winer \cite{winer00whatisp2p} lists several
properties of Peer-to-Peer network, while most notably the statement
''The user's machine is a client and a server'' describes best Peer-to-Peer
@@ -106,14 +106,14 @@
Specifically, we review existing Peer-to-Peer approaches, algorithms and their
key properties. We observe
that despite of greate amount of proposed Peer-to-Peer systems, all systems
fall either
loosely structured approach or tightly structured approach. We also discuss
open problems in
-Peer-to-Peer networks and divide problems into three sub-categories: security
related problems,
+Peer-to-Peer systems and divide problems into three sub-categories: security
related problems,
performance related problems and miscellaneous problems. In the end, we
summarize all
problems in easy-to-understand tables.
-Next, we give an overview of our Fenfire hypermedia system, which implements
xanalogical storage model. We
+Next, we give an overview of Fenfire hypermedia system, which implements
xanalogical storage model. We
also describe briefly Storm software module of Fenfire system, which is an
essential part of Fenfire's
Peer-to-Peer functionality. We evaluate existing Peer-to-Peer approaches and
-choose the best alternative to our needs. We discover that Fenfire,
xanalogical model and
+choose the best alternative to Fenfire's needs. We discover that Fenfire,
xanalogical model and
tightly structured Peer-to-Peer approach all have similar method to deal with
data,
i.e., globally unique identifiers. Finally, we propose system model for
Fenfire in Peer-to-Peer
environment and present yet simple but efficient algortihms to be used for
data lookups in
@@ -140,8 +140,8 @@
address open problems in Peer-to-Peer domain and divide problems into three
sub-categories. Chapter 4 gives an overview of Fenfire system. In chapter
5, we evaluate existing Peer-to-Peer approaches with regard to Fenfire system,
propose system
-model for Fenfire in Peer-to-Peer environment, present simple algorithms to
perform data
-lookups in Peer-to-Peer environment. Also, we discuss possible problems of
using Fenfire
+model for Fenfire in Peer-to-Peer environment and present simple algorithms to
perform data
+lookups in Peer-to-Peer environment. In addition, we discuss possible problems
of using Fenfire
in Peer-to-Peer environment. In chapter 6, we present conclusions and future
work.
@@ -178,7 +178,7 @@
\label{fig:application_level}
\end{figure}
-Compared to ARPANET's Peer-to-Peer functionality, today's Peer-to-Peer systems
+Compared to ARPANET's Peer-to-Peer functionality, modern Peer-to-Peer systems
are ad-hoc, i.e., peers join and leave the system constantly in a dynamic
manner. This
fact constitutes challenging requirements for efficient construction and
maintenance
of the overlay network. Even more demanding tasks are how to perform efficient
data
@@ -192,7 +192,7 @@
other research areas than computer science. There has been done research
regarding
to ad-hoc nature of complex networks \cite{albert-02-statistical},
\cite{albert-00-tolerance}, \cite{watts00dynamics}.
It's interesting to realize that chemical properties of cells, the Internet,
ad-hoc
-Peer-to-Peer networks, have all in common that they self-organize based on
same
+Peer-to-Peer systems, have all in common that they self-organize based on same
principles. Furthermore, the assocation between social connections among
people
and Peer-to-Peer overlay topology has been studied recently
\cite{watts00dynamics},
\cite{kleinberg99small}, \cite{nips02-Kleinberg}. This insight is motivated
@@ -226,7 +226,7 @@
\section{Loosely structured}
-Gnutella \cite{gnutellaurl} is well-known example of loosely structured
overlay network. As in
+Gnutella \cite{gnutellaurl} is a well-known example of loosely structured
overlay network. As in
other Peer-to-Peer networks, no peer is more important than any other peer in
the network.
The construction and maintenance of Gnutella network is extremely ad-hoc,
since participating
peers can form the overlay network based on \emph{local} knowledge. Figure
\ref{fig:gnutella_overlay}
@@ -318,7 +318,7 @@
all peers $p$ in system. Then, $\forall s \in S$, there is a provider of the
service,
expressed as $p = provider(s)$. Every $p$ has neighbor(s), named as
$neighbor$, which
is $P$ = \{$p \in P: \exists neighbor$, which is randomly chosen from $P$\}.
-Super peer is a peer, which hosts the indices of other peers, $sp =
summaryindex(provider(s))$.
+\emph{Super peer} is a peer, which hosts the indices of other peers, $sp =
summaryindex(provider(s))$.
Moreover, $\forall$ reqular peer $p$, there is super peer, which has has a
index of regular
peer's content, specifically $ps$, $P$ = \{$p \in P: \exists ps$,
where $ps$ = $summaryindex(provider(s)) \wedge (p = provider(s))$\}
@@ -345,19 +345,10 @@
space differs between proposed systems. Circular identifier space (and
variants)
is most widely used. For instance, Chord \cite{stoica01chord}, Koorde
\cite{kaashoek03koorde},
Pastry \cite{rowston01pastry}, SWAN \cite{bonsma02swan}, Tapestry
\cite{zhao01tapestry}
-and Viceroy \cite{malkhi02viceroy} use a circular identifier space of $n$-bit
integers modulo $2^{n}$. The
+and Viceroy \cite{malkhi02viceroy} use circular identifier space of $n$-bit
integers modulo $2^{n}$. The
value of $n$ varies among systems. Again, CAN \cite{ratnasamy01can} uses a
$d$-dimensional cartesian
model to implement identifier space.
-Stoica et al.. \cite{balakrishanarticle03lookupp2p} have listed
-four requirements for tightly structured overlays, which have to be
-addressed in order to perform data lookups in tightly structured overlays.
-First, mapping of keys to peers must be done in a load-balanced
-way. Second, the overlay must be able to forward a lookup for a
-specific key to an approriate peer. Third, overlay must have a
-support for a distance function. Finally, routing tables for each peer
-must be constructed and maintained adaptively.
-
To store data into tightly structured overlay, each application-specific
unique key (e.g., SHA-1 \cite{fips-sha-1}) is \emph{mapped} uniformly (e.g.,
using consistent
hashing \cite{258660}) by the overlay to a existing peer in the overlay. Thus,
tightly
@@ -411,6 +402,15 @@
for maintaining information about other peers in the system and
$O(\log{n})$ data lookup efficiency.
+Stoica et al. \cite{balakrishanarticle03lookupp2p} have listed
+four requirements for tightly structured overlays, which have to be
+addressed in order to perform efficient data lookups in tightly structured
overlays.
+First, mapping of keys to peers must be done in a load-balanced
+way. Second, the overlay must be able to forward a lookup for a
+specific key to an approriate peer. Third, overlay must have a
+support for a distance function. Finally, routing tables for each peer
+must be constructed and maintained adaptively.
+
Currently, all proposed tightly structured overlays provide at least
poly--logaritmical data lookup operations. However, there are some key
differences in the data structure that they use as a routing table. For
example, Chord
@@ -502,8 +502,7 @@
function is defined as $map: I \longmapsto IS$, and coordinate point as
$ip = map(identifier(s))$, which maps data items, expressed by a identifier to
coordinate
point $ip$ in $(IS,d)$. Peer's p resources are mapped onto a set $IS$ = \{$ip
\in IS:
-\exists s \in S$, $ip = map(identifier(s)) \wedge (provider(s) = p)$\}.,
-which means that resources which peer provides into the system are not kept
locally.
+\exists s \in S$, $ip = map(identifier(s)) \wedge (provider(s) = p)$\}.
Every $p$ has neighbor(s), named as $neighbor$, which are $P$ = \{$p \in P:
\exists neighbor$,
where $difference(p,p_neighbor)= ''close''$, where $''close''$ is small
difference $d$ in $(IS,d)$\}.
@@ -650,11 +649,11 @@
Table \ref{table_Peer-to-Peer_algorithms} lists proposed Peer-to-Peer
algorithms
and their key properties with regard to performance and scalability. List
-includes algorithms from two main approaches. However, majority of the
algorithms
+includes algorithms from both loosely and tightly structured approaches.
However, majority of the algorithms
listed above belongs to tightly structured approach since there has been
active
research being pursued towards tightly structured approach lately. List
doesn't
include \emph{all} proposed Peer-to-Peer algorithms. Only the ones which
already have
-been widely deployed in real-life, or the ones which may promising in the
future's
+been widely deployed in real life, or the ones which may promising in the
future's
Peer-to-Peer systems are included in this thesis.
We decided to follow the guidelines from \cite{kaashoek03koorde} when
@@ -663,13 +662,13 @@
in face of real life requirements. Additionally, however, we decided to include
the number of \emph{real} network connections for each peer in the overlay.
-Here, we describe the listed properties of Peer-to-Peer algorihms:
+Here, we describe the listed properties of Peer-to-Peer algorithms:
\begin{itemize}
-\item \textbf{Lookup}: Number of messages required when a data lookup is
performed
-\item \textbf{Space}: Number of neighbors which peers knows about (neighbors)
-\item \textbf{Insert/delete}: Number of messages required when a peer joins or
leaves the network
- \item \textbf{Number of network connections}: Number of concurrent network
connections required to maintain correct neighbor information
+\item \textbf{Lookup}: number of messages required when a data lookup is
performed
+\item \textbf{Space}: number of neighbors which peers knows about (neighbors)
+\item \textbf{Insert/delete}: number of messages required when a peer joins or
leaves the network
+ \item \textbf{Number of network connections}: number of concurrent network
connections required to maintain correct neighbor information
\end{itemize}
\scriptsize
@@ -873,7 +872,7 @@
\section{Overview}
Partly due to the non-maturity of modern Peer-to-Peer technology, it has
several
-open problems to be solved. Main open problems are related to performance,
scalability, usability
+open problems to be solved. The most severe problems are related to
performance, scalability, usability
and security. More important, many techniques developed for traditional
distributed
systems may no longer apply with Peer-to-Peer systems. Therefore, new
solutions are
needed to make Peer-to-Peer systems more secure and efficient.
@@ -885,12 +884,13 @@
approach; \emph{network} of loosely structured systems is scalable, but the
\emph{data lookup model} is not.
The main concern of tightly structured system is to make overlay's data lookup
routing more flexible againts hostile attacks. Another key problems in tightly
structured
-systems are the lack of keyword searches and support for heterogeneous peers.
+systems are the lack of keyword searches, support for heterogeneous peers and
load balancing
+\cite{balakrishanarticle03lookupp2p}.
To make Peer-to-Peer systems even more popular (e.g., in industry),
Peer-to-Peer domain
needs better infrastructures to deal with security issues. There has been done
some
research regarding anonymity, access control, data availability and data
integrity but as
-we will observe, much more research work is required to solve security related
issues.
+we state in the following sections, much more research work is required to
solve security related issues.
\section{Security problems in Peer-to-Peer}
@@ -905,7 +905,7 @@
In Sybil attack model, hostile entity presents multiple
entities. Therefore, one hostile entity can control a large fraction of the
Peer-to-Peer system. Optimal
possible solution to Sybil attack would be that system could \emph{distinct}
entities of the system reliably. Unfortunately,
-currently there no realizable techiques for this task. Partial solutions for
Sybil is attack is to replicate
+currently there no realizable techiques for this task. Partial solutions for
Sybil attack is to replicate
and fragment data randomly among several participating peer. However, both
suggestions assume that two different
remote entities are actually different; Sybil attacks are still possible and
therefore, would need centralized
authority for reliable authentication. As author arques in
\cite{douceur02sybil}, without centralized authority,
@@ -923,13 +923,13 @@
attack, hostile or faulty peer may produce false information of the data, or
refuses/is not able to reply to requests.
Possible solution againts this attack is that peer should not trust to single
entity. Instead, peer should get
information from multiple entities and trust on majority's opinion. This
methods requires more messages to be
-sent to network whilce increasing the load of system. However, if Spam attack
is combined with Sybil attack, obviously
+sent to network while increasing the load of system. However, if Spam attack
is combined with Sybil attack, obviously
previously mentioned solution doesn't work. Again, more research is required
to solve this attack model
-reliability. Naor et al. \cite{naor03simpledht} has proposed a partial
solution againts Spam attack with
+safely. Naor et al. \cite{naor03simpledht} has proposed a partial solution
againts Spam attack with
\emph{faulty} peers (not hostile).
Traditional overload of targeted peers is best known form of distrubuted
Denial of Service attack (DDoS). For example,
-hostile entity can attempt to burden targetted peers with garbage packets. As
a implication, peers may act
+hostile entity can attempt to burden targetted peers with garbage network
packets. As a implication, peers may act
incorrectly or stop working. DDoS attack may be very severe, especially if
rate of replication and caching
in Peer-to-Peer system is low. This may lead to data loss in the Peer-to-Peer
system. Daswani et al.
\cite{daswani02queryflooddos} has done research regarding to this subject.
Authors suggest efficient load balancing
@@ -950,7 +950,7 @@
Implementations include Advogato \cite{advogatourl}. None of the current
proposals or implementations
based on reputation address trust in a reliable way.
-Optimal solution for trust in Peer-to-Peer systems would be certificate based
security methods.
+Optimal solution for trust in Peer-to-Peer systems would be certificate based
security models.
Quite recently, widely used Public Key Infrastructure (PKI) has been deployed
in distributed
systems \cite{rivest96sdsi}, \cite{spkiworkinggroup}. PKI is an reliable
technology for securing
data in rather \emph{static} computing systems, such as in the Internet.
However, in Peer-to-Peer
@@ -962,7 +962,7 @@
ConChord \cite{ajmani02conchord} is the first Peer-to-Peer system which has a
support for PKI based
security infrastructure. Unfortunately, ConChord is in early in development
and lacks of important
features of PKI to be fully usable yet. Furthermore, the hierarchy of
SDSI/SPKI \cite{rivest96sdsi},
-\cite{spkiworkinggroup} may a problem for Peer-to-Peer systems, in which
hierarchy is intentionally missing.
+\cite{spkiworkinggroup} may be a problem for Peer-to-Peer systems, in which
hierarchy is intentionally missing.
For data integrity, on the other hand, there are few working solutions.
Cryptographic content hashes
\cite{fips-sha-1}, variations \cite{merkle87hashtree} and their implementation
techniques \cite{mohr02thex},
@@ -978,11 +978,11 @@
no one is able to link publisher to a specific document. Reader-anonymity
means that a specific
document cannot be linked to document's readers. This form of anonymity
protects the privacy of a
users of the system. Furthermore, peer-anonymity means that no peer can be
linked to a specific document, i.e.,
-no one is able to determine the peer, where document was originally published.
Document-anonymity
+no one is able to determine the peer, in where document was originally
published. Document-anonymity
means that peer doesn't know which data it is currently hosting. Finally,
query-anonymity is a form
of document-anonymity; when other peers performs data lookups, peer doesn't
know which data it serves
to the data lookup originators. As the authors cite, some of forms of
anonymity may imply each other and
-possible issues are one area of future work.
+possible issues raised by this property is one area of future work.
With regard to anonymity in Peer-to-Peer systems, there has been done much
research work both at network
level layer \cite{tarzan:ccs9} and at application level layer
\cite{reiter98crowds}, \cite{mixminionurl}.
@@ -1061,10 +1061,10 @@
maintenance, but their solution seems to have to major problems
\cite{castro02securitystructured}. First,
the solution is very expensive even without faulty or hostile entities.
Second, each group of replicas
in their solution must have less than 1/3 of its peer faulty. Thus, this
feature results in a low
-probability of succesfull routing.
+probability of succesful routing.
-Aspnes et al. in \cite{aspnes02faultrouting} and Kaashoek et al.l in
\cite{kaashoek03koorde} formally
-prove the lower and upper bounds for space requirements of locating a specific
date item in
+Aspnes et al. in \cite{aspnes02faultrouting} and Kaashoek et al. in
\cite{kaashoek03koorde} formally
+prove the lower and upper bounds for space requirements of locating a specific
data item in
Peer-to-Peer system. They show that to provide high degree of fault tolerance
and efficiency, each
participating peer must maintain average of $O(\log{n})$ neighbors.
@@ -1081,10 +1081,10 @@
\subsection{Other security threats}
-Ross Lee graham lists several external threats againts Peer-to-Peer networks
\cite{grahamp2psecurity}. Most important, t
-he list includes viruses and trojans. Currently, there are not even partial
solutions
+Ross Lee graham lists several external threats againts Peer-to-Peer networks
\cite{grahamp2psecurity}. Most important,
+the list includes viruses and trojans. Currently, there are not even partial
solutions
to the problems mentioned above. General robustness properties of Peer-to-Peer
system is able to
-deal with software failures and hostile attack, but redundancy againts
external threats is unknown.
+deal with software failures and hostile attack, but fault tolerance againts
external threats is unknown.
The reason for this is that there are no experiences on these kinds of
attacks. Possible solution
would be distributed anti-virus software, but much more intensive research is
required until
this kind of solution would be applicable.
@@ -1108,7 +1108,7 @@
originator starts a data lookup with small TTL value. If the search is not
succesful,
the query originator increases the TTL value and performs another data lookup.
This
process is repeated until the desired data is found or maximumum depth $D$
-has been reached. Expanding ring, proposed by Shenker et al..,
\cite{lv02searchreplication},
+has been reached. Expanding ring, proposed by Shenker et al.,
\cite{lv02searchreplication},
is similar to iterative deepening techique. With these techniques, search
may not be fast when desired data item requires many consecutive flooding
rounds.
@@ -1125,7 +1125,7 @@
over its local content.}. Mutual index caching architecture, as proposed in
\cite{osokine02distnetworks}, is one variation of local indices techique.
-In random walk approach \cite{lv02searchreplication}, peer forwards a query to
+In random walk approach \cite{lv02searchreplication}, peer forwards query to
randomly selected neighbor. The basic random walk approach decreases the
overhead generated by messages. On the other hand, basic random walk approach
has poor response time. As suggested in \cite{lv02searchreplication},
@@ -1134,7 +1134,8 @@
random walk searches in query lookups. Indeed, Freenet's query resembles
Depth-First-Search (DFS) and peers' routing tables are dynamically built
using caching. This is an outcome of Freenet's main design priciples,
-i.e., anonymity. Additional improvements to Freenet's data lookup using
+i.e., anonymity. Another property of Freenet's data lookup model is that
+it adapts well with varying usage patterns. Improvements to Freenet's data
lookup using
''small-world phenomenon'' has been proposed by Zhang et al..
\cite{zhang02using}.
@@ -1165,8 +1166,8 @@
been focused on tightly structured approach.
The main problem with tightly structured approach is the fact that tightly
structured algorihms
performs data lookups based on a globally unique identifier (key). Quite
recent study has been focused
-on the feasibility of Peer-to-Peer Web-like indexing and searching
\cite{li03feasibility}. Authors
-argue, that it is possible to implement Peer-to-Peer Web-like search with
certain radical compromises.
+on the feasibility of Peer-to-Peer Web-like indexing and searching
\cite{li03feasibility} on top of
+tightly structured overlays. Authors argue, that it is possible to implement
Peer-to-Peer Web-like search with certain radical compromises.
First, Peer-to-Peer search engine may need to decrease result quality in order
make searching more
efficient. Second, Peer-to-Peer systems must observe better the properties of
underlying network for
better performance.
@@ -1182,8 +1183,8 @@
distributions\footnote{Zipf distribution is a variant of power-law function.
Zipf-distribution can be used in observation of frequency of occurrence event
$E$, as a function of the rank
$i$ when the rank is determined by the frequency of occurrence, $E_i \sim
\frac{1}{i^{a}}$, where the exponent
-$a$ is close to unity.} (e.g., \cite{breslau98implications}), caching and
precomputation
-can be done for optimizting search indices \cite{li03feasibility}. Regular
compression algorithms,
+$a$ is close to unity.} (e.g., \cite{breslau98implications}). Therefore,
caching and precomputation
+can be done for optimizing search indices \cite{li03feasibility}. Regular
compression algorithms,
Bloom filters \cite{362692}, vector space models
\cite{CuencaAcuna2002DSIWorkshop} and view
trees \cite{Bhattacharjee03resultcache} can be used for even better
optimizations. Authors
in \cite{li03feasibility} use Gap compression \cite{wittengigabytes}, Adaptive
Set Intersection \cite{338634}
@@ -1205,8 +1206,8 @@
Peer-to-Peer system is \emph{never} in ''ideal'' state as it is always
evolving system.
Current research has been focused on system management of tightly structured
systems, since all presented
-algorithms of tightly structured approach have been analyzed under static
simulation environments. Furthermore, propsed tightly structured
-overlays are configured statically to achieve the desired reliability even in
uncommon and adverse environment
+algorithms of tightly structured approach have been analyzed under static
simulation environments. Furthermore, proposed
+tightly structured overlays are configured statically to achieve the desired
reliability even in uncommon and adverse environment
\cite{rowston03controlloingreliability}. The most important factor for
future research is to get real-life experiences from tightly structured
system, when there are frequent
joins and leaves in the system. Some research has been done already in this
area.
@@ -1220,7 +1221,7 @@
more efficient analytical tools for modelling complex Peer-to-Peer system.
Some research has been done with regard to load balancing properties of
tightly structured
-overlays. Byers et al. suggest "power of two choices" whereby an item is
stored at the less loaded
+overlays. Byers et al. suggest "power of two choices" whereby data item is
stored at the less loaded
of two (or more) random peer alternatives \cite{byers03dhtbalancing}. Rao et
al. uses virtual servers
to control load balance in Peer-to-Peer systems \cite{rao03loadbalancing}.
Their work rests on
idea which was originally introduced by Chord \cite{stoica01chord} system.
@@ -1232,8 +1233,8 @@
therefore enabling peers to find nearby data without looking up data from
distant peers.
As mentioned before, an implicit assumption of almost every tightly structured
system is that there is random, uniform
-distribution of peer and key identifiers. Even if participating peers are
extremely heterogeneous in
-face of computing power, or network bandwidth, data items are distributed
uniformly. Clearly, this
+distribution of peer and key identifiers. Even if participating peers are
extremely heterogeneous, e.g., in
+face of computing power or network bandwidth, all data items are distributed
uniformly. Clearly, this
a serious problem of tightly structured overlays in face of performance and
load balancing. Measurement study
by Saroiu et al. shows that there is extreme heterogeneity among participating
peers in already deployed Peer-to-Peer
systems \cite{saroiu02measurementstudyp2p}. Symphony seems to be the first
tightly structured overlay system
@@ -1264,7 +1265,7 @@
\subsection{Programming guidelines and benchmarks}
All existing Peer-to-Peer systems have rather different interfaces even they
have common points and
-components. More important, all existing Peer-to-Peer systems incompatible
with each other. One
+components. More important, all existing Peer-to-Peer systems are incompatible
with each other. One
of the most important area of future research is to create common programming
abstractions, i.e.,
interfaces, design patters and frameworks. Also, equal benchmarks are needed
for comparing
different algorithms. Recently, there have been few proposals towards common
programming
@@ -1274,7 +1275,7 @@
\subsection{Social behaviour}
Frequent assumption in Peer-to-Peer systems is that peers are willing to
cooperate. Another belief
-is that all peers would behave equally, i.e. all peers both consume resources
and contributes resources.
+is that all peers would behave equally, i.e., all peers both consume resources
and contributes resources.
However, these assumptions are not true as several studies show peers rather
consume than contribute and
and peers are unwilling to cooperate \cite{saroiu02measurementstudyp2p},
\cite{oram01harnessingpower},
\cite{hearn02mojonation}.
@@ -1300,7 +1301,7 @@
rates.
As long as global simulations of Peer-to-Peer systems are lacking, we cannot
we make any further
-analysis e.g., on usage patterns in Peer-to-Peer systems. Presumedly, however,
we can assume that
+analysis e.g., on usage patterns in Peer-to-Peer systems. However, we can
assume that
, e.g., query keywords follow the Zipf-like distributions
\cite{breslau98implications} both in the
Internet and in Peer-to-Peer systems.
@@ -1678,7 +1679,7 @@
Xanalogical storage \cite{nelson99xanalogicalneeded} is a different kind of
model for
presenting data and relationships between data. While in World Wide Web links
are
between \emph{documents}, in xanalogical model links are between individual
-\emph{characters}. Indeed, each character in xanalogical storage model has a
+\emph{characters}. Each character in xanalogical storage model has a
permanent, globally unique identifier. For instance, let's consider the
following
scenario: ''the character 'D' typed by Janne Kujala on 10/8/97 8:37:18''. In
this
example, when character 'D' is is first typed in, xanalogical storage model
@@ -1697,9 +1698,9 @@
\emph{Enfilade} can be considered as a ''virtual file'' (or part of one),
which is a list
of fluid media contents. In xanalogical storage model, links between content
are external
and bidirectional. Xanadu link is an \emph{association} of two enfilades, such
as an
-annotation to a specific part of a another document. Transclusion is an
inclusion in an
-enfilade of contents already used in another enfilade, i.e. current fluid
media is copied into
-different data contents. By using this mechanism, system implementing
xanalogical model
+annotation to a specific part of a another document. \emph{Transclusion} is an
inclusion in
+enfilade of contents already used in another enfilade, i.e., current fluid
media is copied into
+different data contents. By using this mechanism, system implementing
xanalogical storage model
is able to show all data content that share same fluid media with current data
content
(e.g., all documents containing current document's text). Figure
\ref{fig:xanalogical_model}
illustrates xanalogical storage model with documents, text and characters.
@@ -1749,9 +1750,9 @@
associated with a collection of \emph{pointer blocks}. Each pointer block has
a single
target for the pointer. In figure \ref{fig:storm_model}, we present overal
pointer creation process. Pointer block may contain zero or more obsoleted
-pointer blocks, i.e. when a new version of scroll block is created, it
supersedes
+pointer blocks, i.e., when a new version of scroll block is created, it
supersedes
one older version which has been created in the past. The most current pointer
-block will 'obsolete' the pointer block targeting the supersed version. Next
+block will 'obsolete' the pointer block targeting the superseded version. Next
time, when the pointer is used for refering to a specific scroll block, only
the most recent pointer's block target is loaded.
@@ -1769,8 +1770,8 @@
We start by giving a problem overview when considering Fenfire in Peer-to-Peer
environment. We define Fenfire's special needs and evaluate existing
Peer-to-Peer approaches in light of these requirements. After that, we propose
system
-model for Fenfire in Peer-to-Peer environment, present simple algorithms to
perform data
-lookups in Peer-to-Peer environment. Also, we discuss possible problems of
using Fenfire
+model for Fenfire in Peer-to-Peer environment and present simple algorithms to
perform data
+lookups in Peer-to-Peer environment. In the end, we discuss possible problems
of using Fenfire
in Peer-to-Peer environment
@@ -1837,10 +1838,10 @@
feature is almost analogical to Fenfire's (and xanalogical storage model's)
way of
handling data. Another key feature of tightly structured overlays is that they
are able
to provide general purpose \emph{interface} for Reference Resolution Services
(RRS)\footnote{
-Currently, Domain Name System (DNS) \cite{rfc1101} is widely used RRS system
in the Internet.}
+Domain Name System (DNS) \cite{rfc1101} is widely used RRS system in the
Internet.}
\cite{balakrishnan03semanticfree}. Authors argue that next generation RRS
must be
application-independent and references itself should be \emph{unstructured}
and
-\emph{semantic free}. Finally, with tightly stuctured systems, it is feasible
to
+\emph{semantic free}. Finally, as said, with tightly stuctured systems, it is
feasible to
perform \emph{global} data lookups in the overlay. To summarize, these aspects
may be the most important features
of Peer-to-Peer infrastructure with regard to Fenfire as a \emph{distributed}
hypermedia system.
Thus, we see the tightly structured approach the best alternative to locate
data in Peer-to-Peer
@@ -1898,8 +1899,8 @@
For better fault tolerance and self-monitoring for Fenfire, we propose
techniques
presented by Rowston et al. \cite{rowston03controlloingreliability}. With
these
-techniques, we can ensure the performance of Fenfire in a highly adverse
environment, such
-as extreme heterogeneous, higly dynamic environment or network partition.
+techniques, we can ensure the performance of Fenfire in a highly adverse
conditions, such
+as sudden network partition, or highly dynamic and heterogeneous environment.
Finally, for more efficient data transfer, we can use variable techniques for
this purpose.
For small amounts of data, HTTP can be used \cite{rfc2068}. For big downloads,
we can use
@@ -1908,9 +1909,9 @@
\subsection{Algorithms}
-We use DOLR abstraction of tightly of structured approach, i.e. each
participating peer hosts
+We use DOLR abstraction of tightly of structured approach, i.e., each
participating peer hosts
the data and overlay maintains only the \emph{pointers} to the data. We
descided to use DOLR in our
-model, since DOLR systems locate date without specifiying a storage policy
explicity \cite{rhea03benchmarks}.
+model, since DOLR systems locate data without specifiying a storage policy
explicity \cite{rhea03benchmarks}.
DHT based storage systems, such as CFS \cite{dabek01widearea} and PAST
\cite{rowstron01storage}, may have
critical problems with load balancing in highly heterogeneous environment.
This problem is caused by peers
which may not able to store relative great amount of data with key/value pair,
assigned randomly by
@@ -1936,8 +1937,8 @@
\begin{itemize}
\item Data lookup with a given identifier of Storm scroll block.
\begin{enumerate}
-\item Submit query using scroll block's identifier.
-\item Repeat until hosting peer is found: each peer forwards the query to a
closer peer which hosts the given scroll block identifier.
+\item Submit data lookup using scroll block's identifier.
+\item Repeat until hosting peer is found: each peer forwards the data lookup
to a closer peer which hosts the given scroll block identifier.
\item Pointer peer returns most recent pointer block's value (e.g., hosting
peer's IP-address) to query originator.
\item Query originator requests hosting peer to return the scroll block.
\end{enumerate}
@@ -1952,7 +1953,7 @@
\item Data lookup with a given pointer random string returning most recent
scroll block.
\begin{enumerate}
\item Query originator locally compute a hash for given pointer random string.
-\item Repeat until hosting peer is found: each peer forwards the query to a
closer peer which hosts the given hash of pointer random string.
+\item Repeat until hosting peer is found: each peer forwards the data lookup
to a closer peer which hosts the given hash of pointer random string.
\item Pointer peer returns most recent pointer block's key/value-pair (e.g.,
hosting peer's IP-address) to query originator, using pointer block's own
indexing schemes.
\item Query originator requests hosting peer to return the scroll block.
\end{enumerate}
@@ -1963,7 +1964,7 @@
\begin{enumerate}
\item Query originator locally compute a hash for given pointer random string.
-\item Repeat until hosting peer is found: each peer forwards the query to a
closer peer which hosts the given hash of pointer random string.
+\item Repeat until hosting peer is found: each peer forwards the data lookup
to a closer peer which hosts the given hash of pointer random string.
\item Pointer peer returns pointer block's key/value-pair(s) (e.g., hosting
peer's IP-addresses) to query originator, using pointer block's own indexing
schemes.
\item Query originator requests hosting peer to return the scroll block.
\end{enumerate}
@@ -1972,7 +1973,7 @@
Figure \ref{fig:storm_query_urn5} illustrates how Storm scroll block is
located
in a tightly structured overlay using DOLR method, where pointer random string
is known.
-Each of these algortihms can locate Fenfire related data in $\Theta(\log{n})$
time:
+Each of these algortihms can locate Fenfire related data in $O(\log{n})$ time:
$O(\log{n})$ time for query routing to pointer peer and constant time for
locating hosting peer with a given reference link. Time required for
transferring
the data is not included.
@@ -1996,7 +1997,7 @@
\subsection{Problems}
Perhaps the most biggest issue in Peer-to-Peer systems is non-maturity of
-secure techologies. For instance, online entities cannot be identified
+security techologies. For instance, online entities cannot be identified
safely (e.g., the Sybil attack \cite{douceur02sybil}). For Fenfire, one
security related problem occurs when user wants to perform global data lookup
with a given
pointer random string; how user is able to verify the correctness
@@ -2004,11 +2005,11 @@
correct Storm scroll block ? Spam attack \cite{naor03simpledht} is a variation
of previously
mentioned problem; data lookup is performed by a user, but there is no reply
from the system. How do we are able to know if this was a spam attack, or the
-data really no exist in the system ? Another problem related to Fenfire's
+data really doesn't exist in the system ? Another problem related to Fenfire's
security is that if a user downloads data from the network to local computer
and after network disconnetcion, user wants to verify \emph{offline} the
authenticity of data. Obviously, optimal solution to all security issues would
-be that digital signatures are included to every message sent in the system.
+be that digital signatures are included to every message sent to the system.
However, these problems are not only limited to Fenfire, it concerns all
Peer-to-Peer based computer systems.
@@ -2023,7 +2024,7 @@
we divided open problems into three sub-categories: security related problems,
performance related problems and miscellaneous problems. Each of these
sub-categories have number of open problems, in which there are no solutions
-yet, or solutions are only partial. Much research work is required to
+yet, or solutions are only partial. We point out that much research work is
required to
solve open problems.
Then, we focused our attention to Fenfire system. First, we gave a brief
@@ -2032,12 +2033,13 @@
In last chapter, we evaluated existing Peer-to-Peer approaches with regard
to Fenfire's needs. We proposed, that tightly structured approach is the
-best alternative to our needs for the following reasons. First, Storm,
xanalogical
+best alternative to Fenfire's needs for the following reasons. First, Storm,
xanalogical
model and tightly structured systems use global unique identifiers
for identifying data. Second, our Storm design uses semantic-free references
-for locating data in distributed networks. As the authors of
\cite{balakrishnan03semanticfree},
-we also observe that tightly structured overlays provide general purpose
-interface to next-generation reference resolution services. Second, by using
+for locating data in distributed networks generated by SHA-1 cryptographic
content
+hash \cite{fips-sha-1}. As the authors of \cite{balakrishnan03semanticfree},
+we also agree that tightly structured overlays provide general purpose
+interface to next-generation reference resolution services. Third, by using
DOLR abstraction of tightly structured overlay, we can minimize the the lack
of locality in tightly structured overlays. Finally, we believe that issues
related to tightly structured overlays are solved in near future, because of
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., (continued)
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/05
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/05
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/05
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/05
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...,
Hermanni Hyytiälä <=
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13