[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
xdelta speedups (was Re: [Monotone-devel] updates to net.venge.monotone.
From: |
Nathaniel Smith |
Subject: |
xdelta speedups (was Re: [Monotone-devel] updates to net.venge.monotone.experiment.performance) |
Date: |
Wed, 9 Aug 2006 23:35:24 -0700 |
User-agent: |
Mutt/1.5.12-2006-07-14 |
On Wed, Aug 09, 2006 at 10:42:37PM -0700, Eric Anderson wrote:
> Ready for mainline:
>
> 5dc45de4c282a4e93ad1384a68b38740728cd0cb: xdelta/adler32 tuning
>
> Improvements to the xdelta code in order to take advantage of the fact
> that we are always using a relatively small window for the adler32
> hash, and that when we skip forward, we normally skip forward by alot
> so it's faster to just recompute the rolling checksum on the new data
> than actually move the rolling checksum forward. 1.03x improvement in
> cpu usage on the client, 1.12x improvement in cpu usage on the server.
> This would probably give more of a performance benefit on other CPUs,
> the Xeon I'm testing on has the half-latency ALU ops, so improving a
> bunch of add, and operations shows much less improvement in CPU time
> than it does in instruction fetch.
Looking at the patch:
> migrate to just doing a direct cast of the character rather than the
> widen() stuff because the widen does a bunch of unnecessary masking.
I am unsure about this, simply because I don't know what widen<> is
for in the first place. Graydon, can you shed any light?
- s1 += widen<u32,u8>(c);
+ s1 += (u32)(c);
More cosmetically, please use static_cast<> instead of C-style casts.
+ inline void replace_with(u8 const *ch, std::string::size_type count)
[...]
+ while(count--) {
+ u32 c = (u32)(*(ch++));
+ s1 += c;
+ s2 += s1;
+ }
Even more cosmetically, please try to stick with the house style,
including ugly GNU braces :-). (Here: space after the * in the
function declaration, space after "while", braces in wrong places.)
+ // selfcheck by making short-distance test true || badvance <= blocksz
+ // u32 save_sum = rolling.sum();
+ rolling.replace_with(reinterpret_cast<u8 const *>(b.data() + new_lo),
new_hi-new_lo);
+ // I(rolling.sum() == save_sum);
Comments here make me go "huh?".
+ static bool widen_checked = false;
+ if (!widen_checked) {
+ for(u32 i = 0; i < 256; ++i) {
+ u8 v = (u8)(i & 0xFF);
+ u32 v_w = widen<u32,u8>(v);
+ I(v_w == i);
+ I(((u32)(v)) == i);
+ }
+ widen_checked = true;
+ }
Non-house style again, plus some comments about what's going on here
would be good... also I still don't know what widen<> is for, so I
still can't comment on whether this code is useful or sane :-).
Other than these concerns, seems fine. Once we get a resolution to
the widen<> thing, and these minor bits are fixed, should be fine to
land on mainline.
-- Nathaniel
--
Details are all that matters; God dwells there, and you never get to
see Him if you don't struggle to get them right. -- Stephen Jay Gould