
[Monotone-devel] Improving the performance of annotate


From: Eric Anderson
Subject: [Monotone-devel] Improving the performance of annotate
Date: Tue, 18 Jul 2006 14:41:35 -0700

I've been working on improving the performance of annotate.  I have
found a solution that drops the time for mtn annotate Makefile.am from
about 175 seconds down to 9 seconds (detailed cpu and memory
statistics at the bottom).

I've attached the patch so that people can play with it, but it needs
significant work before it could be applied.  I just want to use it as
a place to discuss where the improvements came from, and let people
try it to verify that the improvements work for others.

In the end there were two main improvements: 1) parsing only the
portion of the roster that is relevant to the file being annotated,
and 2) skipping the version hash check in database.cc.

Initially looking at the oprofile results, time seemed to be spread
out across a whole bunch of functions, with ~15-25% in the parsing
routines.  Investigation of the other functions showed that they were
all being called as part of constructing the roster data structure.  It
turns out that the annotation code only needs the part of the roster
that deals with the file being annotated, so an ugly chunk of code was
written that basically fast-forwards through the roster to the entry
that is needed, and then only parses that entry.  Applying this piece
of the fix gets the system down to about 20 user cpu seconds, with
~75% of the time being spent in the SHA1 hash.
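
As a rough illustration of the fast-forward idea (this is not the code
from the attached patch), the sketch below scans a roster held in
memory stanza by stanza and hands back only the stanza for one file.
The layout is an assumption made for the example: stanzas separated by
a blank line, each file stanza starting with a line of the form
file "<path>".  The name find_file_stanza is hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Return the stanza whose first line is `file "<path>"`, or nullopt.
// Assumed layout (hypothetical): stanzas are separated by "\n\n".
std::optional<std::string_view>
find_file_stanza(std::string_view roster, std::string_view path)
{
  assert(!path.empty());
  const std::string wanted = "file \"" + std::string(path) + "\"";

  std::size_t pos = 0;
  while (pos < roster.size()) {
    std::size_t end = roster.find("\n\n", pos);
    if (end == std::string_view::npos)
      end = roster.size();
    const std::string_view stanza = roster.substr(pos, end - pos);

    // Look only at the first line; the rest of the stanza is skipped
    // without being parsed unless it is the one we want.
    const std::string_view first = stanza.substr(0, stanza.find('\n'));
    if (first == wanted)
      return stanza;

    pos = end + 2;  // step over the blank line to the next stanza
  }
  return std::nullopt;
}
```

Only the returned stanza would then go to the real parser, so the cost
of building the full roster data structure is avoided.  The scan itself
is still linear in the size of the roster.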

I had previously removed the SHA1 hash check from the code, and making
that configurable shows that removing the check gets the run down to
about 6 user cpu seconds.  This is not surprising: running the
annotate ends up reading in ~4000 rosters of around 300k each, so in
order to perform an annotate, we are hashing over 1 GB of data.  Two
thoughts occur to me on this part of the improvement: 1) the check
could be made optional, defaulting to on for write operations and off
for read only operations; 2) the Botan hash could be replaced by the
optimized assembly from OpenSSL, although there are licensing issues
associated with this option.
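
To make the first option concrete, here is a minimal sketch of a
policy that verifies on write operations, skips the check on read-only
ones, and lets an explicit setting override either default.  All names
(access_mode, should_verify, accept, hashed_bytes) are hypothetical and
do not come from database.cc, and the hash function is passed in as a
parameter rather than tied to Botan.  The last helper just redoes the
arithmetic above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

enum class access_mode { read_only, read_write };

// Default: verify for write operations, skip for read-only ones.
// An explicit user setting, when present, always wins.
inline bool should_verify(access_mode mode,
                          std::optional<bool> user_override = std::nullopt)
{
  if (user_override)
    return *user_override;
  return mode == access_mode::read_write;
}

// Stand-in for whatever hash the caller uses (SHA1 in the real code).
using hash_fn = std::function<std::string(const std::string &)>;

// True if the data may be used: either the check is skipped,
// or the computed hash matches the expected id.
inline bool accept(const std::string &data, const std::string &expected_id,
                   bool verify, const hash_fn &hash)
{
  return !verify || hash(data) == expected_id;
}

// Approximate number of bytes hashed when every roster read is verified.
constexpr std::uint64_t hashed_bytes(std::uint64_t rosters,
                                     std::uint64_t bytes_each)
{
  return rosters * bytes_each;
}
```

With the figures quoted above, hashed_bytes(4000, 300000) comes to
1,200,000,000 bytes, which matches the "over 1 GB" estimate.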

With both of those improvements applied, the system is spending >60%
of its time in parsing, with most of the time in the function that
skips to the beginning of the next roster record.  Therefore I predict
that if there were an index at the beginning of the roster (say hex
encoded id, offset pairs), the user time for the annotate could drop
to about 3 seconds.  Alternatively, if the roster were stored as a
number of separate records in the database, then only the needed part
of the roster could be read.
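
A sketch of how such an index could be used, under an invented layout:
the roster begins with one line per record of the form
"<hex id> <decimal offset>", the index ends at a blank line, and the
offsets count bytes from the start of the body that follows.  Neither
this layout nor the name lookup_record exists in monotone; both are
assumptions for the example.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Find the record for `hex_id` via the index, without scanning the body.
std::optional<std::string_view>
lookup_record(std::string_view roster, std::string_view hex_id)
{
  assert(!hex_id.empty());
  const std::size_t header_end = roster.find("\n\n");
  if (header_end == std::string_view::npos)
    return std::nullopt;
  const std::string_view header = roster.substr(0, header_end);
  const std::string_view body = roster.substr(header_end + 2);

  std::size_t pos = 0;
  while (pos < header.size()) {
    std::size_t eol = header.find('\n', pos);
    if (eol == std::string_view::npos)
      eol = header.size();
    const std::string_view line = header.substr(pos, eol - pos);
    pos = eol + 1;

    const std::size_t space = line.find(' ');
    if (space == std::string_view::npos || line.substr(0, space) != hex_id)
      continue;  // not the entry we are looking for

    // Parse the decimal offset that follows the id.
    const std::string_view digits = line.substr(space + 1);
    std::size_t offset = 0;
    const char *last = digits.data() + digits.size();
    auto [ptr, ec] = std::from_chars(digits.data(), last, offset);
    if (ec != std::errc() || ptr != last || offset >= body.size())
      return std::nullopt;  // malformed or out-of-range index entry

    // Jump straight to the record; it runs to the next blank line.
    const std::string_view rest = body.substr(offset);
    return rest.substr(0, rest.find("\n\n"));
  }
  return std::nullopt;
}
```

The index is still searched linearly here, but it is small compared
with the roster body.  If the entries were kept sorted by id, a binary
search over the index would also be possible.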

With all of the patches applied, annotating almost all of the .cc
files in monotone is fast, but three files continue to be slow:
commands.cc, database.cc, netsync.cc; most of the time for those
annotations is spent in extend_path_if_not_cycle and
piecewise_applicator::copy.  I have not bothered to investigate
whether those could be made faster.

Thoughts on what is the right way to make this improvement?
        -Eric

Detailed performance of mtn annotate Makefile.am 
  for 341e4a18c594cec49896fa97bd4e74de7bee5827:
    User time: 167.194s, System time: 3.080s, Elapsed time: 174.636s
    Max Size: 37.80 MiB, Max Resident: 20.08 MiB
    Mean Size: 34.60 MiB, Mean Resident: 18.40 MiB

    User time: 171.163s, System time: 3.032s, Elapsed time: 179.597s
    Max Size: 37.85 MiB, Max Resident: 20.08 MiB
    Mean Size: 34.60 MiB, Mean Resident: 18.40 MiB

    User time: 168.611s, System time: 2.932s, Elapsed time: 173.784s
    Max Size: 37.73 MiB, Max Resident: 20.08 MiB
    Mean Size: 34.60 MiB, Mean Resident: 18.40 MiB

  with prototype patch:
    User time: 6.116s, System time: 2.332s, Elapsed time: 9.241s
    Max Size: 34.59 MiB, Max Resident: 16.43 MiB
    Mean Size: 30.59 MiB, Mean Resident: 14.37 MiB

    User time: 5.892s, System time: 2.552s, Elapsed time: 9.081s
    Max Size: 34.59 MiB, Max Resident: 16.43 MiB
    Mean Size: 30.60 MiB, Mean Resident: 14.37 MiB

    User time: 6.028s, System time: 2.392s, Elapsed time: 9.081s
    Max Size: 34.59 MiB, Max Resident: 16.43 MiB
    Mean Size: 30.64 MiB, Mean Resident: 14.41 MiB

Attachment: prototype.annotate.patch
Description: Binary data

