
Re: [Monotone-devel] Scalability question


From: Jonathan S. Shapiro
Subject: Re: [Monotone-devel] Scalability question
Date: Fri, 04 Aug 2006 17:19:04 -0400

On Fri, 2006-08-04 at 14:47 -0500, Timothy Brownawell wrote:
> We have seen some slowness, yes. Our current thinking is to store our
> rosters as table rows.

Yes. This seems good, but it raises a question at the replication layer.
If the fundamental unit of replication at the sync layer is rows, you
can end up in a situation where a replication does not complete, leaving
you with 4 out of 5 rows. This needs to be detectable.

Does MTN sync at row granularity, or at some higher granularity? How
does it deal with incomplete replication?

I have a related question about certificates: at the end of a sync, how
do I know that I have in hand all of the relevant certificates
concerning some given file? For Q/A certificates I probably can't, but
for file type information I really need to know this. How is this
handled in the mtn architecture?

> This lets us really store one as only the rows
> that are different from its parent(s?), which will speed up
> taking/applying deltas.

Storing deltas like this was our early approach in OpenCM. We did it
naively, and the results were *horrible*. To make a long story short,
you want to have some constant upper bound on how many deltas may need
to be applied in order to reconstruct any given object.

Both OpenCM and XDFS (and a lot of other systems) resolve this by
periodically storing completely expanded versions -- roughly every 20th
revision is stored in expanded form (the number "20" was determined by
measurement).
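For concreteness, the scheme can be sketched like this (a toy model, not the actual OpenCM or XDFS code -- the class, names, and the trivial line-patch "delta" format are all illustrative; real stores use binary diffs):

```python
SNAPSHOT_INTERVAL = 20  # the measured constant mentioned above

class DeltaStore:
    """Toy versioned store: every Nth revision is kept fully expanded,
    the rest as deltas against their immediate predecessor."""

    def __init__(self):
        # entries[rev] = ("full", lines) or ("delta", patch)
        self.entries = []

    def add(self, lines):
        rev = len(self.entries)
        if rev % SNAPSHOT_INTERVAL == 0:
            self.entries.append(("full", list(lines)))
        else:
            # Toy delta: list of (index, new_line) for changed lines.
            # Only handles same-length versions; enough to show the idea.
            prev = self.get(rev - 1)
            patch = [(i, new) for i, (old, new)
                     in enumerate(zip(prev, lines)) if old != new]
            self.entries.append(("delta", patch))
        return rev

    def get(self, rev):
        # Walk back to the nearest snapshot, then replay at most
        # SNAPSHOT_INTERVAL - 1 deltas -- the constant upper bound
        # on reconstruction cost.
        base = rev - rev % SNAPSHOT_INTERVAL
        _, data = self.entries[base]
        lines = list(data)
        for r in range(base + 1, rev + 1):
            _, patch = self.entries[r]
            for i, new in patch:
                lines[i] = new
        return lines
```

The point is the `get` path: no matter how long the history grows, reconstruction touches one snapshot plus a bounded number of deltas, instead of the linear chain you get when every revision is a delta.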

Does the MTN storage layer do something comparable to this?

If you do, then the only issue remaining is what to do with really big
objects like Change/Manifest objects. The problem here is that even in
the local case it takes a surprising amount of time to fetch 20,000
records from a database. You end up (de)serializing at several layers,
and the costs add up quickly.

However, there is a catch hiding here for schema design. If the blobs
stored in the repository are (logically) expanded, you end up with one
set of protocols, schemas, and interaction patterns. If the blobs stored
in the repository are (logically) deltas, you end up with a different
architecture.

OpenCM initially said "we (logically) store deltas", and immediately ran
into linear chaining issues. We very quickly moved to a "we (logically)
store objects", and then went ahead and did deltas in the store -- but
that was purely an internal implementation decision within the store.

This has implications for the sync protocol, because it means that the
sync layer may want to manage/reduce bandwidth by generating deltas on
the fly.

In OpenCM, this appears in the client/server protocol, but a similar
issue would seem to exist in the mtn sync protocol. After trying a bunch
of other things, OpenCM ended up adopting the following protocol:

  requestor:  sendme(sha1:SomethingOld, sha1:SomethingNew)

where the responder can respond with *either* a completely expanded
object or a delta relative to the object named by "sha1:SomethingOld".
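The responder-side decision can be sketched roughly like this (hypothetical names and a toy shared-prefix delta encoding, not the OpenCM wire format):

```python
def make_delta(old: bytes, new: bytes) -> bytes:
    # Toy delta: "reuse N bytes of shared prefix, then append the rest".
    n = 0
    while n < min(len(old), len(new)) and old[n] == new[n]:
        n += 1
    return n.to_bytes(4, "big") + new[n:]

def apply_delta(old: bytes, delta: bytes) -> bytes:
    n = int.from_bytes(delta[:4], "big")
    return old[:n] + delta[4:]

def respond_sendme(store, old_id, new_id):
    """Answer sendme(old_id, new_id): the responder sends a delta
    relative to old_id when it holds the old object and the delta
    is actually smaller, and falls back to the expanded object
    otherwise (e.g. when old_id is unknown on this side)."""
    new_obj = store[new_id]
    old_obj = store.get(old_id)
    if old_obj is not None:
        delta = make_delta(old_obj, new_obj)
        if len(delta) < len(new_obj):
            return ("delta", delta)
    return ("full", new_obj)
```

What makes this shape attractive is that the choice is entirely the responder's: the requestor names what it has and what it wants, and the store remains free to expand or delta internally however it likes.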


Hopefully some of this is helpful, but I'm mainly interested in
learning what mtn is doing here.


shap




