monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup)


From: Eric Anderson
Subject: Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup)
Date: Tue, 22 Aug 2006 12:31:20 -0700

Nathaniel Smith writes:
 > On Mon, Aug 21, 2006 at 03:40:21PM -0700, Eric Anderson wrote:
 > > [ maybe binary rosters ]
 > 
 > I would be curious how much they win at this point; the roster-delta
 > strategy for reconstruction changes that landscape, at least somewhat.
 > For instance, my understanding is that for annotate, the old code
 > does:
 >   0) load deltas from disk
 >   1) apply deltas to reconstruct roster text
 >   2) use sha1 to verify that roster text is uncorrupted
 >   3) parse that roster, to find out the content of the given file in
 >      this particular revision
 > And your changes let us (in this case) skip step 2, and make step 3
 > faster by only parsing out the relevant information.

Yup.

 > With the roster-deltas branch, this process becomes
 >  0) load deltas from disk
 >  1) use sha1 to verify that the deltas are uncorrupted
 >  2) apply deltas to reconstruct roster directly
 > so the hashing cost is greatly reduced, and step 3 just disappears
 > entirely.
 > 
 > So it'd be interesting to profile annotate again with these changes; I
 > did a quick timing run, and annotate with roster-deltas did seem to be
 > 2-4x faster than mainline.  If I had to guess, the bottlenecks now are
 > likely things like "copying rosters to insert them in the cache".  But
 > maybe parsing the deltas is a bottleneck too, dunno.

So unless your base times are a lot different than mine, it could
still be a lot faster, the following were the times from just the fast
ascii parsing.  When I did binary rosters, and informal testing, it
made early files (e.g. Makefile.am) a little faster files late in the
roster (xdelta.cc) the same speed as early ones, so I was getting a
~20x speedup.

         Makefile.am       paths.hh       xdelta.cc     database.cc
         opt     ref      opt    ref     opt     ref    opt       ref
system  2.852   3.128   1.600   1.700   2.656   2.868  12.021  12.329
user    6.396 192.884   9.933 125.232  26.542 193.132  47.763 231.726
wall   10.149 205.530  14.129 130.573  32.234 210.039  67.273 259.602


 > (There are interestingly different ways to optimize this stuff made
 > possibly by the roster-deltas -- since roster deltas record changes
 > directly by node_id, we could have an interface that traversed
 > roster-deltas directly, without reconstructing full rosters at all,
 > for the use of annotate and restricted log...)

This might end up being the right choice -- building all of the
datastructures in the parsing was a substantial portion of the cost.
        -Eric





reply via email to

[Prev in Thread] Current Thread [Next in Thread]