
[Monotone-devel] Re: why is monotone so slow?


From: graydon hoare
Subject: [Monotone-devel] Re: why is monotone so slow?
Date: Tue, 14 Sep 2004 10:16:22 -0400
User-agent: Opera M2/7.53 (Linux, build 737)

On Tue, 14 Sep 2004 12:58:32 +0200, Georg-W. Koltermann <address@hidden> wrote:

> Since the disk accesses for the workspace should be the same in both
> cases my best guess is that the time in monotone is spent in the SQL
> database.  Is there a way to tune that?

well, to begin with I'd point out that you're comparing apples to
oranges there. the operations have the same name but they're
quite different.

on openCM:

  - calculate the working copy state
  - ask the (central) repository if you have the most recent copy

on monotone:

  - calculate the working copy state
  - calculate the set of branch certs you trust, out of all the
    revisions in the database (nb. you have now read a sizeable
    portion of the database)
  - look over the graph of all those revisions and discard
    ancestry certs you don't trust
  - compute the set of heads from that set (discard all entries
    with live children)
  - check to see if there is a unique head equal to the current
    working copy state
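the head-computation step above can be sketched roughly as follows. this is an illustrative python sketch, not monotone's actual C++ internals; the function and parameter names are made up. the idea is simply that among the trusted revisions on a branch, the heads are the ones with no surviving (trusted) children:

```python
def compute_heads(revisions, parents_of):
    """revisions: set of trusted revision ids on the branch.
    parents_of: dict mapping a revision id to its parent ids.
    (hypothetical names; monotone's real code differs.)"""
    heads = set(revisions)
    for rev in revisions:
        for parent in parents_of.get(rev, ()):
            # a revision with a live (trusted) child cannot be a head
            heads.discard(parent)
    return heads

# linear history a -> b -> c: only c is a head
print(compute_heads({"a", "b", "c"}, {"b": ["a"], "c": ["b"]}))
# divergence a -> b, a -> c: both b and c are heads
print(compute_heads({"a", "b", "c"}, {"b": ["a"], "c": ["a"]}))
```

note that this pass still has to visit every trusted revision and its ancestry edges, which is why the cost scales with how much of the cert table you end up reading.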

some of this is avoidable: once the changeset branch is complete,
for example, the graph's heads can be calculated much faster, and
without reading the whole cert table. some of it is not as
avoidable: monotone doesn't have a single concept of up-to-date;
it's in the eye of the client and its trust function, so there
is always some potential for slowdown there.

however, all this is nonsense speculation until you get out a
profiler. I have profiled monotone many times in the past and
will continue to do so any time I'm working on speed issues;
doing otherwise is a waste of time. typical hotspots I've run
into before:

  - gzip longest sequence scanning (tunable by adjusting
    which in-memory representations remain compressed and
    which do not, and what thresholds are used)
  - SHA1 calculation (often tunable by caching IDs sensibly)
  - sqlite statement parsing (tunable by move to sqlite3
    and use of prepared statements)
  - sqlite execution (tunable by adjustment of page size,
    addition of indices, move to sqlite3, etc.)
  - RSA inner loop (often tunable by caching trust values
    or rearranging logic)
  - bad i/o buffering strategies (tunable)
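to make the two sqlite items concrete: monotone itself talks to sqlite through the C API, but the same ideas show up in python's sqlite3 module, so here is a hedged sketch (the schema and index name are invented for illustration). the point is that a parameterized statement compiled once and reused amortizes SQL parsing across rows, and that page size and indices are set per-database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA page_size = 8192")  # must be set before any data is written
conn.execute("CREATE TABLE certs (id TEXT, name TEXT, value TEXT)")
conn.execute("CREATE INDEX certs_by_name ON certs (name)")  # avoids full scans

# executemany reuses one compiled statement for all rows, rather than
# re-parsing the SQL text per insert -- the "prepared statements" win.
rows = [(str(i), "branch", "net.example.project") for i in range(1000)]
conn.executemany("INSERT INTO certs VALUES (?, ?, ?)", rows)

n, = conn.execute(
    "SELECT count(*) FROM certs WHERE name = ?", ("branch",)).fetchone()
print(n)
```

with the index in place, the count query above can be satisfied from the index instead of scanning the whole table, which is the same shape of win as adding indices to monotone's cert queries.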

so, "yes", it can usually be made a lot faster. I can't guess
offhand which hotspot is behind the timings in your email, but it
can probably be made to go away.

> P.S.: Comparing the repository space requirements is also interesting.
> Monotone uses 237 MB while OpenCM uses 186 MB (gzfs repository type).

yup; sqlite page size choices can change the database by an order of
magnitude. plus we base64 encode things, such that they will be
viewable on a terminal if you sqlite-dump them. again, tunable.
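the base64 overhead is easy to quantify: encoding expands data by a factor of 4/3 (every 3 input bytes become 4 output characters, plus padding), so stored blobs grow by roughly a third. a quick sketch:

```python
import base64
import zlib

# compress some repetitive sample data, as a stand-in for a stored delta
blob = zlib.compress(b"some file contents " * 200)
encoded = base64.b64encode(blob)

# base64 emits 4 characters per 3 input bytes, so the ratio is ~1.33
print(len(encoded) / len(blob))
```

that third comes on top of whatever the page-size choice costs, which is why the two together can move the database size so much.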

-graydon



