
Re: [Monotone-devel] upgrade of included botan for mtn


From: Jack Lloyd
Subject: Re: [Monotone-devel] upgrade of included botan for mtn
Date: Sat, 27 Sep 2008 11:19:47 -0400
User-agent: Mutt/1.5.16 (2007-06-09)

On Sat, Sep 27, 2008 at 12:09:08PM +0200, Markus Wanner wrote:
> Hi,
>
> I've finally taken the time to go through upgrading the included botan 
> library to 1.7.9 first, then 1.7.12. Some renaming and name conflicts 
> during merging were a PITA to solve and made me do it in two steps.

:(

I had thought there was a branch that allowed external Botan installs
to be used and also built Botan using its normal build routine. Is
this dead? It would be nice if my renaming things (which, in the case
of source files at least, seems harmless) did not cause more work for
you when merging from upstream. (Especially since you are on the dev
branch, it would be bad for Monotone to get 'stranded' on some 1.7
point release that never got any bug fixes...)

> The Global_RNG of botan has gone, so I've added a pointer to an RNG to the 
> app_state, the key_store and the database. Most places using an RNG have 
> access to a key_store object, so we could maybe even get rid of the pointer 
> in the database object.

Wait, 3 RNGs?

This is OK (and might be more secure in some scenarios, for example if
randomness seeding failed catastrophically 1/3 of the time), but the
specific intent of this change was that single-threaded programs like
Monotone would be able to use a single RNG instance across their
entire execution.
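
A minimal sketch of the single-instance arrangement, under stated
assumptions: app_state, key_store and database are named after the
objects mentioned in the thread, but their layout here is invented,
and Rng is a hypothetical toy generator (xorshift32, not secure), not
a Botan class. One object owns the RNG and the others only borrow it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a library RNG. Toy xorshift32 generator,
// for illustration only; NOT cryptographically secure.
struct Rng {
    std::uint32_t state = 2463534242u;
    std::uint32_t next() {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
};

// Illustrative layouts only: each component borrows the RNG by reference.
struct key_store {
    Rng& rng;
    explicit key_store(Rng& r) : rng(r) {}
};

struct database {
    Rng& rng;
    explicit database(Rng& r) : rng(r) {}
};

// The application state owns the one and only RNG instance.
struct app_state {
    Rng rng;        // declared first, so it is constructed before the borrowers
    key_store keys;
    database db;

    app_state() : keys(rng), db(rng) {}

    // Copying would leave the copies' references pointing at the original RNG.
    app_state(const app_state&) = delete;
    app_state& operator=(const app_state&) = delete;
};
```

With this shape the generator is seeded once, and every component
draws from the same stream.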

> Only mkstemp.cc was puzzling me: I've now changed it to assign its own RNG. 
> Dunno if that can be optimized to use monotone's, but OTOH it maybe doesn't 
> matter.

The usual convention IMO would be to pass it an RNG reference. This
makes it clear that mkstemp.cc needs random numbers, and also avoids
(compared with just creating a new RNG inside mkstemp) all the seeding
overhead. Of course this only works if, most of the times mkstemp is
called, an RNG is available through some other part of the object
reference graph (app_state or whatever).
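
A minimal sketch of that convention, assuming a hypothetical
RandomSource interface (not Botan's actual API) and a hypothetical
helper make_temp_name; it is not Monotone's mkstemp code. The RNG
appears in the signature, so the dependency is visible to callers and
no generator is created or seeded inside the function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical RNG interface used only for this illustration.
class RandomSource {
public:
    virtual ~RandomSource() = default;
    virtual std::uint8_t next_byte() = 0;
};

// Toy deterministic generator so the example is reproducible.
// NOT cryptographically secure; a real caller would pass its seeded RNG.
class CountingSource : public RandomSource {
public:
    std::uint8_t next_byte() override {
        ++calls_;
        state_ = static_cast<std::uint8_t>(state_ * 37 + 11);
        return state_;
    }
    std::size_t calls() const { return calls_; }
private:
    std::uint8_t state_ = 1;
    std::size_t calls_ = 0;
};

// Builds "<prefix><random suffix>" using the caller's RNG.
// Note: byte % 36 is slightly biased; acceptable for a sketch.
std::string make_temp_name(RandomSource& rng,
                           const std::string& prefix,
                           std::size_t suffix_len = 6) {
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz0123456789";
    std::string name = prefix;
    for (std::size_t i = 0; i < suffix_len; ++i)
        name += alphabet[rng.next_byte() % 36];
    return name;
}
```

A caller that already holds an RNG (for example via app_state) would
simply hand it over: make_temp_name(rng, "mtn.").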

> The Memory_Exhausted exception has now gone, so we don't need a special 
> check for that and can rely on std::bad_alloc now.

Not entirely gone: it now exists only in mem_pool.cpp and is derived
from bad_alloc. I wasn't sure if you didn't realize that or if you did
and were just eliding the detail for simplicity. But yes, catching
bad_alloc is the correct response. This was actually motivated by a
comment in Monotone's sources (I think by Graydon) that said something
to the effect of "Why do people make up their own bad_alloc?". I
thought about it and realized that made sense, especially as bad_alloc
is the sort of failure that one handles way high up, and in a big
system perhaps in code that predates the introduction of Botan into
the codebase.
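
A minimal sketch of why the derivation matters, with illustrative
definitions only: the Memory_Exhausted struct below is written for
this example and is not copied from Botan, and allocate_from_pool and
run_top_level are hypothetical names. Because the library exception
derives from std::bad_alloc, a top-level handler written without any
knowledge of the library still catches it.

```cpp
#include <cassert>
#include <new>
#include <string>

// Illustrative library exception derived from std::bad_alloc.
struct Memory_Exhausted : public std::bad_alloc {
    const char* what() const noexcept override {
        return "memory pool exhausted";
    }
};

// Hypothetical low-level routine that may run out of pool memory.
void allocate_from_pool(bool pool_is_full) {
    if (pool_is_full)
        throw Memory_Exhausted();
}

// Top-level code that only knows about the standard exception type.
std::string run_top_level(bool pool_is_full) {
    try {
        allocate_from_pool(pool_is_full);
        return "ok";
    } catch (const std::bad_alloc& e) {
        // Catches Memory_Exhausted too, via the base class.
        return std::string("out of memory: ") + e.what();
    }
}
```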

> Performance of the SHA-1 remains pretty much the same since we
> cannot use the optimized SSE2 variant (+60% sha1 throughput
> [1]). That alone is a good reason to push the library-build branch.

a) Is this limitation due to the current static build configuration in mtn?
b) library-build is the name of the branch where Botan gets built from vanilla 
sources?

> (using "mtn benchmark_sha1"):
> default botan sha1: ~ 144 MiB/s
> botan_sha1_sse: ~ 234 MiB/s
>
> I've been unable to measure the amd64_asm variant, yet.

The asm is not very tuned; with good compilers the C++ is sometimes as
fast or faster. I would recommend focusing on Dean Gaudet's SSE2
code. It is significantly faster than the asm on every SSE2-enabled
machine I've tried it on, it may work on Windows (Visual C++ 2008
Express would not compile it, but I think that is a limitation of
Express; at least I would think VC++ would support Intel's
intrinsics?), and it works on all x86-64 CPUs and on most x86 chips
made since 2003 or so. Don't get me wrong, I'd love to make the SHA-1
asm much faster (perhaps copying Dean's SSE2 algorithm into inline
asm?), but at the moment the SSE2 code is where to go for the fastest
SHA-1. My only thought was that OpenSSL's might be faster, but I
checked and the SSE2 is faster than OpenSSL 0.9.8g's SHA-1 on my Core2
at least (hard to say about 32-bit x86, though). And the 32-bit x86 asm
SHA-1 (asm_ia32 module) should be significantly faster than the C++ on
non-SSE2 x86, and maybe faster than the SSE2, but I haven't compared
them yet.

-Jack



