From: Eric Anderson
Subject: [Monotone-devel] Patches to improve performance and memory usage: Discussion
Date: Sun, 3 Jul 2005 00:04:27 -0700
All,
A while ago I sent a bulk patch that significantly improved
the performance and memory usage of some of the monotone operations.
A very small subset of the patch was applied but the rest wasn't.
I've now gone through and re-done the patches against the current
head, split the patches into a ton of small pieces, and made some
further improvements.
This message just covers the testing and describes the various
patches, which follow in separate messages to get under the mailing
list 40k limit.
I wrote two supporting patches: the first adds accounting code
(malloc, copies, memory usage) to main.cc, and the second adds a
repeatable performance test. Currently the test measures add, commit,
checkout, serve, and pull for a variety of different file sets. The
file sets are:
- zero_small: a 10KiB all zero file
- zero_large: a 100MiB all zero file
- random_medium: a 10MiB random file
- halfzero_large: a 100MiB file, 50MiB of zeros and 50MiB of random data
- random_large: a 100MiB random file
- monotone: all the files in the 0.19 release
- mt_multiple: all the files in the 0.10-0.19 releases
- mt_bigfiles: each of the 0.10-0.19 releases concatenated to form a
single file for each release
- everything: all of the above at the same time
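
For anyone who wants to build similar inputs by hand, here is a minimal
sketch of how a file like halfzero_large could be produced. It is not
the code from the perf-test patch; the helper name write_test_file, the
chunk size, and the fixed seed are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper (not part of the perf-test patch): writes
// zero_bytes of zeros followed by random_bytes of pseudo-random data.
// Returns true if everything was written successfully.
bool write_test_file(const std::string& path, std::size_t zero_bytes,
                     std::size_t random_bytes, std::uint32_t seed = 42) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;

    const std::size_t chunk = 64 * 1024;  // assumed write granularity
    std::vector<char> buf(chunk, 0);

    // Zero-filled part: the buffer is still all zeros here.
    for (std::size_t left = zero_bytes; left > 0;) {
        const std::size_t n = left < chunk ? left : chunk;
        out.write(buf.data(), static_cast<std::streamsize>(n));
        left -= n;
    }

    // Random part: a fixed seed keeps the file repeatable across runs.
    std::mt19937 rng(seed);
    for (std::size_t left = random_bytes; left > 0;) {
        const std::size_t n = left < chunk ? left : chunk;
        for (std::size_t i = 0; i < n; ++i)
            buf[i] = static_cast<char>(rng() & 0xFF);
        out.write(buf.data(), static_cast<std::streamsize>(n));
        left -= n;
    }

    out.flush();
    return static_cast<bool>(out);
}
```

Under these assumptions, halfzero_large would correspond to
write_test_file("halfzero_large", 50 << 20, 50 << 20), and zero_small to
write_test_file("zero_small", 10 << 10, 0).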
In short, the new code is strictly faster on all tests, and in
most cases uses less memory. The few cases where it doesn't are the
places where the pipelining of the base64/gzip process in the current
head is a big win, whereas my code does the decode in two steps. The
slight increase didn't seem worth worrying about when the pull operation
for the same data was still using way more memory. Full data on all
tests and results is included at the end of this message so people can
look beyond the summary. I'm going to take a look at the cases where
memory usage got worse a bit later to figure out what happened.
adds: Speed and memory usage improvements > 100x for large files
commits: Speedup 1.1x-1.3x, memory up to 1.5x
checkout: Speedup 1.2x-1.9x, memory 1.47x worse - 1.23x better
serve: Speedup 2-10x, memory 1-1.7x better
pull: Speedup 1.8-5.8x, memory 1.25x worse-1.6x better
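
The speedup figures are ratios of CPU time before and after the
patches, taken from the two results tables at the end of this message.
A small sketch of that arithmetic follows; the function name speedup is
only illustrative.

```cpp
// Ratio of CPU seconds before the patches to CPU seconds after them.
// Values above 1.0 mean the patched code is faster.
double speedup(double before_seconds, double after_seconds) {
    return before_seconds / after_seconds;
}
```

For example, random_large serve goes from 69.33s to 6.91s, so
speedup(69.33, 6.91) is about 10.0, and mt_multiple pull goes from
97.31s to 16.61s, which is about 5.9.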
Some of the patches are independent and others are dependent. The
following order is known to work because it's the order that I created
them in. The perf-test patch can be applied after the accounting
patch to enable perf-testing on any revision. Patches follow in a few
separate messages to avoid message size limits.
- accounting: Add in the ability to account for CPU, Memory, copies and
mallocs. The memory one is especially important as the minflt statistic
is not an accurate measure of memory usage.
- file-io-preallocate: Reserve space before reading in a file (avoids
doubling behavior of filling up a string)
- netsync-string-queue: Replace the string buffer with a string queue
buffer that eliminates most of the copies on push/pull. This is what
gives the 100x improvement in copy time.
- base64: Replace the CryptoPP Base64 implementation with a specialized one
that allocates a buffer of exactly the right size. Also runs ~2x faster.
- zlib: Replace the CryptoPP gzip implementation with the zlib one. This
provides the function that calculates the maximum size after compression,
runs faster and gets better compression on compressible data (source code)
at the cost of running slower on uncompressible data (random)
- put-dat-free: Free up file data after it has been compressed as it isn't
used after that.
- free-base64-encoding: Free up the base64 encoding before passing the
query to SQLite.
- compress-cache: Cache the results of decompressing data so that when we
immediately ask to compress it again the compression doesn't have to be
re-calculated
- perf-test: Script to run all of the performance testing.
- string-queue-downsize: If the size of the string queue buffer has shrunk
enough, then re-allocate downwards.
- free-payload-after-use: After we have parsed all of the payload data
free the payload.
- less-allocating-guzip: Only allocate 1x input size rather than 2x to
start.
- incremental-binary-test: When testing if a file is binary, test on
8k chunks rather than reading the whole file in.
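
To make two of the ideas in the list above concrete, here is a rough
sketch of sizing buffers exactly instead of letting a string grow by
doubling. It is not the code from the patches: the names
read_file_preallocated and base64_encoded_size are hypothetical, and the
base64 formula assumes padded output with no line breaks.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical illustration of file-io-preallocate: find the file size
// first, size the string once, then read straight into it. This sketch
// uses resize() rather than reserve() so the data can be read directly
// into the string's storage.
bool read_file_preallocated(const std::string& path, std::string& out) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;

    const std::streamoff size = in.tellg();  // opened at end, so this is the size
    if (size < 0) return false;
    in.seekg(0, std::ios::beg);

    out.resize(static_cast<std::size_t>(size));  // one allocation, no doubling
    if (size > 0 && !in.read(&out[0], static_cast<std::streamsize>(size)))
        return false;
    return true;
}

// Hypothetical illustration of the base64 sizing idea: every 3 input
// bytes become 4 output characters, with the last group padded.
// Assumes no line breaks are inserted into the output.
std::size_t base64_encoded_size(std::size_t input_bytes) {
    return 4 * ((input_bytes + 2) / 3);
}
```

With the exact output size known up front, an encoder can allocate its
result buffer once, which is the point of both patches.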
-Eric
---------------------------------------------------------------
Results from running on a Compaq NW8000, 1.7GHz Pentium-M, 1.5GB
memory, 60MB 7.2kRPM hdd.
monotone head f56500ddd113aa716173f6d94db5d59c88bdc201 + accounting patches
                                  Maximum (MiB)    Copied  Malloc
*Test*          Operation CPU(s)    Size Resident   (MiB)   (MiB)
--------------- --------- ------ ------- -------- ------- -------
zero_small      add files   0.01    6.93     1.94     0.1     0.5
zero_small      commit      0.04    7.02     3.13     0.4     2.4
zero_small      checkout    0.02    7.05     2.75     0.2     0.8
zero_small      serve       0.06    8.75     3.80     0.4     1.9
zero_small      pull        0.03    8.70     3.76     0.4     2.6
zero_large      add files   3.81  420.06   412.96  1447.5   100.5
zero_large      commit      8.46  235.21   129.03   836.5   360.4
zero_large      checkout    4.02  289.50   190.64   388.2   377.7
zero_large      serve       7.69  397.18   188.80   495.7   579.7
zero_large      pull        7.64  247.29   141.33   446.7   381.2
random_medium   add files   0.24   51.32    44.95   126.1    10.5
random_medium   commit      2.73   99.32    82.29   190.9   134.2
random_medium   checkout    1.84   36.30    30.19    57.4    57.1
random_medium   serve       2.70   54.46    35.73   292.6    98.0
random_medium   pull        2.61  115.13    93.44   185.2   155.4
halfzero_large  add files   3.63  414.45   406.89  1739.3   100.5
halfzero_large  commit     17.38  508.00   443.15  1305.6   699.5
halfzero_large  checkout   11.13  280.04   205.58   454.2   409.4
halfzero_large  serve      26.25  396.40   207.87  5698.2   711.0
halfzero_large  pull       16.40  532.04   494.39  1029.0   708.3
random_large    add files   3.51  414.45   407.44  1963.2   100.5
random_large    commit     26.73  878.05   780.71  1874.7  1236.6
random_large    checkout   18.31  279.67   273.65   572.4   543.9
random_large    serve      69.33  446.15   305.75 20928.9   945.1
random_large    pull       25.53 1051.91   881.93  1848.6  1506.4
monotone        add files   0.40    8.83     3.55    26.9    11.0
monotone        commit      4.17   11.79     7.27   125.9   321.2
monotone        checkout    1.57   12.79     7.94    47.8    98.8
monotone        serve       3.34   12.41     6.40    86.9   229.7
monotone        pull        4.61   14.66     9.43   407.4   490.5
mt_multiple     add files   3.52   10.06     4.71   161.5    75.2
mt_multiple     commit     23.96   25.45    19.82   695.7  1187.7
mt_multiple     checkout   11.58   14.80    10.26   356.5   679.4
mt_multiple     serve       6.80   29.32    15.21   411.3   639.8
mt_multiple     pull       97.31   58.84    36.13 41265.7 15009.8
mt_bigfiles     add files   2.78   66.80    61.49   910.3    74.7
mt_bigfiles     commit     17.70   56.46    41.20   859.4   466.2
mt_bigfiles     checkout    6.71   36.48    27.44   318.6   232.5
mt_bigfiles     serve      18.87   75.98    44.00  1514.5   485.4
mt_bigfiles     pull       17.87   61.83    46.42   676.3   517.0
everything      add files  17.55  475.16   469.04  6240.9   459.4
everything      commit     96.84  881.12   783.93  5761.6  4066.3
everything      checkout   54.61  308.26   282.14  2147.7  2288.8
everything      serve     169.89  658.32   354.49 47762.4  3657.0
everything      pull       97.11 1059.21   888.68  7442.0  5366.6
monotone with all of the performance and memory improvements applied
                                  Maximum (MiB)    Copied  Malloc
*Test*          Operation CPU(s)    Size Resident   (MiB)   (MiB)
--------------- --------- ------ ------- -------- ------- -------
zero_small      add files   0.01    6.98     1.98     0.1     0.5
zero_small      commit      0.05    7.45     3.38     0.4     2.0
zero_small      checkout    0.02    7.10     2.80     0.2     0.8
zero_small      serve       0.05    8.79     3.72     0.3     1.6
zero_small      pull        0.03    9.14     3.96     0.4     2.2
zero_large      add files   0.01    6.98     1.98     0.1     0.5
zero_large      commit      6.64  221.46   216.95   701.1   217.0
zero_large      checkout    4.05  307.52   199.91   400.0   401.3
zero_large      serve       4.00  309.19   201.67   300.3   402.2
zero_large      pull        4.06  309.79   203.29   301.1   403.6
random_medium   add files   0.01    6.98     1.98     0.1     0.5
random_medium   commit      2.60   62.92    58.44   160.3   105.6
random_medium   checkout    0.57   52.65    38.28    56.9    73.0
random_medium   serve       0.74   54.34    39.00    72.0   100.2
random_medium   pull        1.53   74.61    69.34   130.6   125.0
halfzero_large  add files   0.01    6.98     1.98     0.1     0.5
halfzero_large  commit     16.01  276.56   272.08  1150.8   618.7
halfzero_large  checkout    4.53  226.23   221.75   383.1   353.5
halfzero_large  serve       5.26  227.92   222.47   411.4   490.2
halfzero_large  pull        9.62  378.37   373.03   751.4   707.0
random_large    add files   0.02    6.98     1.98     0.1     0.5
random_large    commit     25.27  543.00   538.52  1600.4  1018.4
random_large    checkout    5.43  442.71   338.32   566.4   703.2
random_large    serve       6.91  444.40   339.04   723.2   976.4
random_large    pull       15.20  644.70   639.45  1301.8  1207.8
monotone        add files   0.25    7.60     2.37     7.4    13.3
monotone        commit      3.27   11.84     7.69   108.7   123.8
monotone        checkout    1.21   12.38     8.06    45.6   101.8
monotone        serve       1.36   12.78     6.78    46.4   127.2
monotone        pull        2.40   16.18    10.98    94.3   176.4
mt_multiple     add files   2.66    9.42     4.04    57.8    86.7
mt_multiple     commit     21.75   25.65    20.29   595.3   639.8
mt_multiple     checkout    9.35   15.41    10.25   345.8   701.8
mt_multiple     serve       2.63   13.96     8.34   129.3   316.8
mt_multiple     pull       16.61   34.96    24.75   326.9   697.4
mt_bigfiles     add files   0.23    7.23     2.00   102.3     0.7
mt_bigfiles     commit     13.28   30.59    26.11   716.9   339.7
mt_bigfiles     checkout    4.22   48.18    32.03   326.7   296.0
mt_bigfiles     serve       4.80   54.16    36.93   318.0   370.8
mt_bigfiles     pull        6.35   44.72    35.27   459.1   456.3
everything      add files   2.97    9.37     4.25   160.1    87.1
everything      commit     85.66  546.13   541.72  4923.3  2921.1
everything      checkout   28.06  455.81   347.49  2077.9  2516.0
everything      serve      24.91  461.31   351.17  1943.8  2642.3
everything      pull       53.75  650.53   644.85  3399.9  3749.7