Re: [Qemu-devel] [PATCH] bitops: provide an inline implementation of fin
From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH] bitops: provide an inline implementation of find_first_bit
Date: Fri, 20 Jun 2014 11:43:55 +0200
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Jun 20, 2014 at 10:58:31AM +0200, Paolo Bonzini wrote:
> On 20/06/2014 10:48, Aurelien Jarno wrote:
> >In practice on x86_64, this function takes 27 instructions in the
> >general case, and 18 instructions in the fixed case, even for big
> >sizes. I therefore think that checking if the size is constant is a good
> >idea, but we should not make any test on the size itself and trust the
> >compiler to correctly decide if the loop should be unrolled or not.
>
> But if the size is large enough that the compiler will (likely) not
> unroll the function, then it should pay off to use the more
> optimized code in find_next_bit.
The point is that find_next_bit is a generalized version of
find_first_bit, so it is actually slower. I originally noticed this
while profiling: the function showed up relatively high given how
little it is supposed to do.
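
To illustrate why the generalized version costs more, here is a minimal
sketch of both functions. This is not the actual QEMU code: the
sketch_* names are made up for illustration, and __builtin_ctzl assumes
GCC or Clang. The extra work in the find_next_bit variant is the offset
check and the masking of the first word, which find_first_bit never
needs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical sketch: scan whole words from the start of the bitmap.
 * Returns the index of the first set bit, or size if none is set. */
static inline unsigned long sketch_find_first_bit(const unsigned long *addr,
                                                  unsigned long size)
{
    unsigned long idx;

    assert(addr != NULL);
    for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
        unsigned long word = addr[idx];
        if (word != 0) {
            /* __builtin_ctzl (GCC/Clang) counts trailing zero bits. */
            unsigned long bit = idx * BITS_PER_LONG + __builtin_ctzl(word);
            return bit < size ? bit : size;
        }
    }
    return size;
}

/* Hypothetical sketch: same search, but starting at an arbitrary offset.
 * The bounds check and the masking of the first word are the overhead
 * that the specialized version above avoids. */
static inline unsigned long sketch_find_next_bit(const unsigned long *addr,
                                                 unsigned long size,
                                                 unsigned long offset)
{
    unsigned long idx, word;

    assert(addr != NULL);
    if (offset >= size) {
        return size;
    }
    idx = offset / BITS_PER_LONG;
    /* Ignore the bits below offset in the first word examined. */
    word = addr[idx] & (~0UL << (offset % BITS_PER_LONG));
    for (;;) {
        if (word != 0) {
            unsigned long bit = idx * BITS_PER_LONG + __builtin_ctzl(word);
            return bit < size ? bit : size;
        }
        idx++;
        if (idx * BITS_PER_LONG >= size) {
            return size;
        }
        word = addr[idx];
    }
}
```

With offset 0 both sketches return the same result; the second one just
does more bookkeeping to get there.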
> This of course is unless you expect find_first_bit to return a small
> value and not be used in a loop; and dually expect find_next_bit's
> usage to be more like walking sparser bitmaps in a loop.
I think that's the point. In the TCG case, this is used in the temp
allocation to answer the question "give me a free temp". That said,
people might invent new usages.
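
As a rough illustration of that "give me a free temp" pattern, here is
a minimal sketch of an allocator backed by a bitmap of free slots. All
names (SketchTempPool, sketch_temp_alloc, sketch_temp_free,
SKETCH_MAX_TEMPS) are hypothetical and do not come from the TCG
sources, and __builtin_ctzl assumes GCC or Clang. A set bit means the
slot is free, so allocation is a "find first set bit" followed by
clearing that bit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_MAX_TEMPS 512
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS (SKETCH_MAX_TEMPS / SKETCH_BITS_PER_LONG)

/* Hypothetical pool: bit i set means temp i is free for reuse. */
typedef struct {
    unsigned long free_map[SKETCH_WORDS];
} SketchTempPool;

/* Mark temp idx as free again. */
static void sketch_temp_free(SketchTempPool *pool, unsigned int idx)
{
    assert(idx < SKETCH_MAX_TEMPS);
    pool->free_map[idx / SKETCH_BITS_PER_LONG] |=
        1UL << (idx % SKETCH_BITS_PER_LONG);
}

/* Return the lowest free temp and mark it used, or -1 if none is free.
 * The loop has a fixed trip count, so the compiler is free to unroll it. */
static int sketch_temp_alloc(SketchTempPool *pool)
{
    size_t w;

    for (w = 0; w < SKETCH_WORDS; w++) {
        unsigned long word = pool->free_map[w];
        if (word != 0) {
            int bit = __builtin_ctzl(word);   /* lowest set bit */
            pool->free_map[w] = word & ~(1UL << bit);
            return (int)(w * SKETCH_BITS_PER_LONG) + bit;
        }
    }
    return -1;
}
```

In this usage the answer tends to be a small index and the search is
not repeated in a loop, which is the case where a simple scan from the
start is enough.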
> This actually makes sense, and then there's no need to change anything.
>
> Paolo
>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
address@hidden http://www.aurel32.net