gzz-commits
Re: [Gzz-commits] manuscripts/storm article.rst


From: hemppah
Subject: Re: [Gzz-commits] manuscripts/storm article.rst
Date: Tue, 28 Jan 2003 12:44:15 +0200
User-agent: Internet Messaging Program (IMP) 3.1

Quoting Benja Fallenstein <address@hidden>:

> address@hidden wrote:
> > Quoting Benja Fallenstein <address@hidden>:
> > 
> > 
> >>CVSROOT:    /cvsroot/gzz
> >>Module name:        manuscripts
> >>Changes by: Benja Fallenstein <address@hidden>      03/01/25 12:51:34
> >>
> >>Modified files:
> >>    storm          : article.rst 
> >>
> >>Log message:
> >>    bit
> >>
> > 
> > 
> >> 
> >> 2. Block storage
> >> ================
> >>+
> >>+In our system, Storm (for *storage module*), all data is stored
> >>+in *blocks*, byte sequences identified by a
> >>+cryptographic hash. Blocks often have a similar granularity
> >>+as files, but they are immutable, since any change to the
> >>+byte sequence would change the hash (and thus create a different block).
> > 
> > Hmm, perhaps we should make a reference to CFS/PAST in this specific part:
> > if these systems differ somehow, we should explain how. If not, say that
> > 'they are very similar' etc.
> 
> Go ahead :-)

Done, partly ;).

> 
> > Btw, how are Storm blocks shared among computers? Is it more like CFS or
> > PAST (CFS: files are split into blocks (Merkle-style); PAST: whole files
> > are shared)
> 
> We share whole "files" (Storm blocks).

Ok.

> 
> > ? Or is Storm more like a decentralized index which has 'pointer
> > references' to blocks: find a computer which points to another computer,
> > which hosts the actual block (the first block would be a dummy block, e.g.
> > a record of the hash and IP address)?
> 
> Either approach is possible. Probably the approach where the DHT holds 
> the ips of computers that have the block will be implemented first.

Ok. This is an important question, since each approach has its own advantages
and disadvantages:

In Squirrel's case, the authors report that the 'Home-Store' approach was more
efficient than 'Directory' in their simulations. However, their simulation
environment was very homogeneous, with very fast network connections. In the
end, they argue that 'Directory' may be better in a more heterogeneous
environment.

In CFS, PAST and Freenet, the authors have different goals and therefore take
different approaches. As said, CFS splits files into small blocks (always
< 50 KB) and therefore provides better load balance. On the other hand, this
approach increases the number of network packets that must be sent, since we
have to fetch many blocks from many different places --> if the system has a
short half-life, data availability suffers.

PAST and Freenet, in contrast, don't split files into blocks. Thus, in these
approaches fetching data requires fewer network packets and is more efficient
than in CFS. The disadvantages are that PAST and Freenet require more space for
load balancing (and better data availability), and there might be 'unlucky'
file storers (ones that don't have enough space on their local hard disk).
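
To make the trade-off concrete, here is a rough back-of-the-envelope sketch.
It is only a toy model with assumptions of my own: every block lives on a
different peer, each peer is online independently with the same probability,
and there is no replication. The function names and the numbers are
illustrative, not taken from CFS or PAST.

```python
import math


def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of separate block fetches needed for one file.

    With whole-file storage (PAST/Freenet style) this is always 1;
    with splitting (CFS style) it grows with the file size.
    """
    return max(1, math.ceil(file_size / block_size))


def availability(p_peer_up: float, n_blocks: int) -> float:
    """Probability that all n_blocks can be fetched.

    Assumption: each block sits on a different peer, peers are online
    independently with probability p_peer_up, and nothing is replicated.
    """
    return p_peer_up ** n_blocks


# A 1 MB file split into 50 KB blocks versus stored whole.
n_split = blocks_needed(1_000_000, 50_000)   # 20 fetches
n_whole = 1                                  # 1 fetch

p = 0.9
split_ok = availability(p, n_split)          # roughly 0.12
whole_ok = availability(p, n_whole)          # 0.9
```

Real systems replicate blocks, so the absolute numbers are far better than
this, but the direction of the effect is the same: the more places a fetch
depends on, the more node turnover hurts.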


But I think we should implement the 'ips' first :).
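
A minimal sketch of what I mean by the 'ips' approach, just to fix ideas. An
ordinary in-memory dictionary stands in for the DHT, and SHA-1 stands in for
whatever cryptographic hash we end up using; the class and function names
(Peer, Directory, fetch) are made up for this example and are not Storm code.

```python
import hashlib


def block_id(data: bytes) -> str:
    # Assumption: SHA-1 as a placeholder for the chosen cryptographic hash.
    return hashlib.sha1(data).hexdigest()


class Peer:
    """A computer that stores whole blocks locally, keyed by their hash."""

    def __init__(self, ip: str):
        self.ip = ip
        self.blocks = {}

    def store(self, data: bytes) -> str:
        bid = block_id(data)
        self.blocks[bid] = data
        return bid


class Directory:
    """Stand-in for the DHT: maps a block id to the IPs that hold the block."""

    def __init__(self):
        self.locations = {}

    def publish(self, bid: str, ip: str) -> None:
        self.locations.setdefault(bid, set()).add(ip)

    def lookup(self, bid: str) -> list:
        return sorted(self.locations.get(bid, ()))


def fetch(bid: str, directory: Directory, peers_by_ip: dict):
    """Ask the directory who has the block, then fetch it and verify the hash."""
    for ip in directory.lookup(bid):
        peer = peers_by_ip.get(ip)
        if peer is None:
            continue  # peer has left the network
        data = peer.blocks.get(bid)
        # The id is the hash of the content, so a bad copy is detectable.
        if data is not None and block_id(data) == bid:
            return data
    return None


peer = Peer("10.0.0.1")
directory = Directory()
bid = peer.store(b"some block content")
directory.publish(bid, peer.ip)
fetched = fetch(bid, directory, {peer.ip: peer})
```

Since a block's id is the hash of its bytes, the fetching side does not have
to trust the peer it downloads from: a tampered copy simply fails the check
and the next IP in the list is tried.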


-Hermanni

