#
#
# patch "wiki/PerformanceWork.mdwn"
# from [576077e6dc883da277d04218ea0952e0d2a6cd3f]
# to [c83656f81b151c466a430aa1461ccba6275bc3af]
#
============================================================
--- wiki/PerformanceWork.mdwn 576077e6dc883da277d04218ea0952e0d2a6cd3f
+++ wiki/PerformanceWork.mdwn c83656f81b151c466a430aa1461ccba6275bc3af
@@ -1,4 +1,5 @@
-[[!tag migration-auto]]
+[[!tag migration-done]]
+[[!toc levels=2]]

 Several key operations are still prohibitively slow in monotone.

@@ -6,17 +7,17 @@ Several key operations are still prohibi

 ## pull

-The speed of an initial pull of the db is very painful on projects of moderate size (eg. monotone's itself.) See discussion at http://colabti.org/irclogger/irclogger_log/monotone?date=2006-01-25,Wed&sel=#l263 for various thoughts about ways to make this faster.
+The speed of an initial pull of the db is very painful on projects of moderate size (eg. monotone's itself.) See discussion at for various thoughts about ways to make this faster.

-Also discussion at http://colabti.org/irclogger/irclogger_log/monotone?date=2006-01-31,Tue&sel=#l13 for ideas.
+Also discussion at for ideas.

- pull has gotten far faster in 0.26+, and again in 0.30 - but it is still too slow for very large histories
+> pull has gotten far faster in 0.26+, and again in 0.30 - but it is still too slow for very large histories

 ## annotate

 Annotate is really too slow to be used. Investigation underway in net.venge.monotone.annotate branch of storing per-file revision DAG in the database as well to avoid parsing all revision rosters out of db while doing history traversal.

- annotate has gotten much faster in 0.32, and again in 0.33, and should be quite usable now. It is still slow when compared to other systems, though.
+> annotate has gotten much faster in 0.32, and again in 0.33, and should be quite usable now. It is still slow when compared to other systems, though.

 ## restricted log

@@ -24,17 +25,17 @@ Restricted log in backwards direction (f

 Restricted log in backwards direction (from newer to older revisions) can already be made much faster by exploiting the roster markings. (The markings hold interesting revision ids per file, i.e. revisions where the file was born, or changed it contents, name, or attributes.) However, for this to work properly, the traversal alogorithm used by the log command has to be fixed. Currently it does not always visit nodes in topological order.

- restricted log (in backwards direction) has gotten much faster in 0.32, exploiting the markings and revision heights.
+> restricted log (in backwards direction) has gotten much faster in 0.32, exploiting the markings and revision heights.

 ## update on a workspace with a large history/number of files

 Updating a large tree sometimes seems to take a good amount of time on a large tree and/or workspace. Finding the branch and revision to update seems to take an inordinate amount of time.

- "mtn up" on an [[OpenEmbedded]] database can take 80s or more to output "updating along branch xxx". That seems an awfully long time to figure out the branch of the current checkout. Selecting the output target and actually updating was 5-10s more. During the initial 80s my CPU was mostly waiting for IO so it seems most if not all of the 100MB DB was being read into memory.
+> `mtn up` on an [[OpenEmbedded]] database can take 80s or more to output "updating along branch xxx". That seems an awfully long time to figure out the branch of the current checkout. Selecting the output target and actually updating was 5-10s more. During the initial 80s my CPU was mostly waiting for IO so it seems most if not all of the 100MB DB was being read into memory.
+>
+> This has been traced to a number of sources, and should have improved **significantly** in monotone 0.30.

-This has been traced to a number of sources, and should have improved **significantly** in monotone 0.30.
-

 # Specific changes that could speed things up

 * on client pull of nvm*, 17% of the time is spent in get_uncommon_ancestors. Could easily add a revision ancestry cache to make this faster; if still more speed is needed, could use an iterative deepening trick to make it much cheaper again (ask njs for details).
@@ -47,9 +48,10 @@ This has been traced to a number of sour

 # SQLite

-See ["[[PerformanceWork]]/SQLiteAnalyzeDiscussion"].
+See [[PerformanceWork/SQLiteAnalyzeDiscussion]].


+
 # Automated Performance Test Suite

 Timothy Brownawell created the beginnings of such a beast. You can find it in branch net.venge.monotone, in the file contrib/monoprof.sh
@@ -57,17 +59,20 @@ What would a perfect automated suite hav
 # The ideal

 What would a perfect automated suite have?
+
 * Ability to generate standardized summary reports, so we can easily compare versions, make graphs, etc.
 * Ability to easily add new tests -- much of the point is to be able to run tests of everything, so we will notice when a change to one place has an unexpected consequence in another part of the code
 * Ability to flexibly choose what to run and how -- we need to be able to request individual tests be run, be able to see what commands are used and execute those by hand (in whatever environment the suite sets up), run things under various profilers, etc.
 * ...(what else would people find useful?)

 What should it test?
+
 * run tests across different scalability scenarios -- different sizes of history, different sizes of tree (in file size, in number of bytes, and possibly different directory layouts -- deep vs. shallow, files per directory, etc), different edit patterns through history (is every file changed in every rev, or some subset, are some files hotter than others...)
 * run tests across different setups -- for instance, cold cache tests versus hot cache tests (on linux this requires a scratch partition, I think OS X has some simpler mechanism...)
 * ...(what else would people find useful?)

 What should it measure?
+
 * cpu time (user, system)
 * wall clock time
 * peak memory use (not directly measurable on linux)
@@ -83,6 +88,6 @@ Obviously we aren't going to start by si

 # Oprofile stuff

-Oprofile's callgraph mode gives very obscure output. Here is the dark knowledge you need to interpret it: http://colabti.org/irclogger/irclogger_log/monotone?date=2006-02-28,Tue&sel=23#l36
+Oprofile's callgraph mode gives very obscure output. Here is the dark knowledge you need to interpret it:
+I wrote a little script to convert oprofile callstack output to the format kcachegrind can grok. I'm not sure if I entirely trust it (I was seeing some numbers above 100%?), but anyway, here it is:

-I wrote a little script to convert oprofile callstack output to the format kcachegrind can grok. I'm not sure if I entirely trust it (I was seeing some numbers above 100%?), but anyway, here it is: http://frances.vorpus.org/~njs/op2calltree.py