monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] [patch] heights


From: Thomas Moschny
Subject: [Monotone-devel] [patch] heights
Date: Fri, 29 Sep 2006 10:36:54 +0200
User-agent: KMail/1.9.4

Hi all,

The nvm.heights branch should be landed on mainline anytime now, so I am going 
to show it here. It's a mid-sized patch: 18 files changed, 844 insertions, 
179 deletions.

The said branch lets monotone add one more cached piece of data per revision 
to the database, called height. I sent an email to the list describing the 
basic idea of heights some time ago, but meanwhile the idea has a bit 
evolved. There's a page in monotone's wiki that describes the concept: 
http://venge.net/monotone/wiki/RevisionNumbering . In short, heights are 
caching the result of doing a toposort of all revisions, thus making a 
subsequent topological sort much cheaper. Note that heights are not ment to 
be replacing revision ids in any way. They are purely local cached data used 
to carry out some optimizations.

So what does the patch do besides adding a rev_height class? Unfortunately, we 
have to modify the db schema again, and a good amount of changes has to do 
with that. Then, it renames regenerate_rosters to regenerate_caches, because 
that cmd not only regenerates the cached rosters, but now also the cached 
heights, and in the future might update other cached data, too. Existing 
tests have been adjusted, and all tests are passed.

In order to show the benefits heights may have, the log cmd is changed in two 
steps. First, it now always shows revisions in toposorted order (which was 
not the case before). Because we use complex heights, the logged revisions 
are clustered, ie. a linear chain of revisions is kept together as long as 
possible. Second, having a fast way of doing a repeated toposort for a set of 
revisions, we can exploit the markings present in the rosters to heavily 
speed-up backwards restricted log. The idea is to only look at interesting 
revisions, ie. marked ones and their parents. This way we can omit a lot of 
roster parsing.

Additionally, the toposort command now uses heights, of course, resulting in a 
speed-up for small to mid-size sets of revisions (and no decrease in speed 
for sorting big sets). A method get_cached_heights() to get all heights at 
once (similar to get_cached_ancestry()) might be useful for big sets, to 
minimize communication with the database.

What needs to be done? Well, there should be a test in db check for everything 
we put in the database, this is missing for heights. Probably, some testsuite 
tests should be added, too.

Comments are welcome!

- Thomas

Attachment: heights.diff.bz2
Description: BZip2 compressed data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]