Re: [Monotone-devel] [ANNOUNCE] monotone 0.19
From: Nathan Myers
Subject: Re: [Monotone-devel] [ANNOUNCE] monotone 0.19
Date: Fri, 6 May 2005 19:52:04 -0700
User-agent: Mutt/1.3.28i
On Fri, May 06, 2005 at 10:10:04AM -0700, Nathaniel Smith wrote:
> On Fri, May 06, 2005 at 09:13:14AM -0700, Nathan Myers wrote:
> > On Fri, May 06, 2005 at 01:15:29PM +0200, Nico -telmich- Schottelius wrote:
> > >
> > > - You speak about compression/decompression, are there in general
> > > processes, which can be optimized? I don't mean algorithms
> > > or something like that.
>
> ... there is both delta storage and
> compression going on, and they interact somewhat differently in
> netsync and the database; plus, as you note, hash checking. At the
> moment, IIRC, the sequence on the server is:
> -- server constructs plaintext of requested version, either by
> finding it in its plaintext cache, by uncompressing a compressed
> full version in the db, or by uncompressing some sequence of
> deltas and applying them to another plaintext.
> -- server hashes this plaintext, to make sure it hasn't been
> corrupted
> -- depending on what the client wanted, the server either:
> -- compresses this plaintext and sends it
> -- does an xdelta between this plaintext and the other plaintext
> the client mentioned, compresses the resulting xdelta, and
> sends it
> I don't recall exactly what the sequence on the client is, but it's
> similarly convoluted -- there might be some opportunities to
> streamline and avoid a passing compress/uncompress, but it's hard to
> get _too_ far, because of the different needs of local storage and
> network usage. Keeping a safe design is also a bit of a factor; e.g.,
> one might say "pff, the server can skip checking the hash, let the
> client do it", but it's a bit tricky, because the data verification
> logic is very low level. It has no idea what the data will be used
> for. In the broader context of trusting monotone to work right and
> never accidentally give bad data, that's a good thing...
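For reference, the server-side sequence quoted above can be sketched roughly as follows. This is only an illustration in Python, not monotone's code: `serve`, `make_delta`, `apply_delta`, and the dict-based `db` are hypothetical names, zlib stands in for whatever compression the database really uses, and a trivial common-prefix delta stands in for xdelta. The plaintext cache and the delta chains inside the database are left out.

```python
import hashlib
import zlib


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 of a plaintext, used here as the version id."""
    return hashlib.sha1(data).hexdigest()


def make_delta(old: bytes, new: bytes) -> bytes:
    """Stand-in for xdelta: length of the common prefix, then the rest of `new`."""
    n = 0
    limit = min(len(old), len(new))
    while n < limit and old[n] == new[n]:
        n += 1
    return n.to_bytes(4, "big") + new[n:]


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Inverse of make_delta."""
    n = int.from_bytes(delta[:4], "big")
    return old[:n] + delta[4:]


def serve(db: dict, wanted: str, base: str = None) -> bytes:
    """Return the compressed payload for `wanted`.

    `db` maps version id -> zlib-compressed full text (an assumption made
    for this sketch; stored deltas are not modelled).
    """
    # Step 1: construct the plaintext of the requested version.
    plaintext = zlib.decompress(db[wanted])
    # Step 2: hash it to make sure it has not been corrupted.
    if sha1_hex(plaintext) != wanted:
        raise ValueError("stored data does not match its hash")
    # Step 3a: the client wanted the full text; recompress and send it.
    if base is None:
        return zlib.compress(plaintext)
    # Step 3b: the client named a base; delta against it, then compress.
    base_text = zlib.decompress(db[base])
    return zlib.compress(make_delta(base_text, plaintext))
```

Even in this reduced form, one request costs a decompress, a hash, and a compress on the server before anything goes over the wire.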
It looks to me like the netsync protocol needs a way to ask for much
lower-level constructs: basically, "Tell me the hashes of all the
blobs you have that I will need to construct these versions; I'll
tell you which of those to send. Don't bother hashing, because I
have to do that anyway. Don't bother with any plaintext, just give
me everything raw and compressed, straight from the database."
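A rough sketch of that exchange, in Python. Everything here is hypothetical: `blobs_for_versions`, `choose_missing`, and the `bases` map are names made up for illustration, and the sketch assumes each stored blob has one id and at most one base blob it is a delta against, which may not match the real database layout.

```python
def blobs_for_versions(bases: dict, versions: list) -> list:
    """Server side: list every blob id needed to construct `versions`.

    `bases` maps blob id -> id of the blob it is a delta against,
    or None if the blob is a full text.
    """
    needed = []
    seen = set()
    for version in versions:
        current = version
        # Walk the delta chain back until a full text or a blob already listed.
        while current is not None and current not in seen:
            seen.add(current)
            needed.append(current)
            current = bases[current]
    return needed


def choose_missing(offered: list, have: set) -> list:
    """Client side: pick the offered blob ids that are not held locally."""
    return [blob_id for blob_id in offered if blob_id not in have]
```

The server answers the first question from its index alone; no blob is read, uncompressed, or hashed until the client has said which ones it wants.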
On receipt, the client inflates each blob and verifies its hash, but
doesn't bother constructing any final plaintext versions. It just
stuffs the verified, compressed blobs into its own database just as
they came over the wire. When a user _asks_ for one of the versions
whose pieces the client just received, the client can construct
plaintext, check hashes again, and whatnot, as usual. Users will
probably never ask for most of the intermediate versions picked up
this way, so there's no point in fooling with them during netsync.
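The client side of this could look something like the Python sketch below. It is an assumption-laden illustration, not netsync: `receive_blob` and the dict `store` are hypothetical, zlib stands in for the real compression, and a blob's id is assumed to be the SHA-1 of its uncompressed bytes.

```python
import hashlib
import zlib


def receive_blob(store: dict, blob_id: str, wire_bytes: bytes) -> None:
    """Verify one compressed blob and store it exactly as received."""
    # Uncompress only to check the hash; the result is thrown away.
    raw = zlib.decompress(wire_bytes)
    if hashlib.sha1(raw).hexdigest() != blob_id:
        raise ValueError("blob does not match its hash: " + blob_id)
    # Keep the compressed bytes as they came over the wire; no
    # recompression and no plaintext version is constructed here.
    store[blob_id] = wire_bytes
```

Constructing a plaintext version, and checking its hash again, is deferred until someone actually asks for that version.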
Nathan Myers