From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection
Date: Fri, 23 Oct 2015 13:14:47 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 23/10/2015 13:12, Pádraig Brady wrote:
> On 22/10/15 20:47, Paolo Bonzini wrote:
>>
>>
>> On 22/10/2015 19:39, Radim Krčmář wrote:
>>> 2015-10-22 18:14+0200, Paolo Bonzini:
>>>> On 22/10/2015 18:02, Eric Blake wrote:
>>>>> I see a bug in there:
>>>>
>>>> Of course. You shouldn't have told me what the bug was, I deserved
>>>> to look for it myself. :)
>>>
>>> It rather seems that you don't want spoilers, :)
>>>
>>> I see two bugs now.
>>
>> Me too. :) But Rusty surely has some testcases in case he wants to
>> adopt some of the ideas here. O:-)
>
> For completeness this should address the bugs I think?
Yes, thanks! :D
Paolo
> bool memeqzero4_paolo(const void *data, size_t length)
> {
>     const unsigned char *p = data;
>     unsigned long word;
>
>     if (!length)
>         return true;
>
>     /* Check len bytes not aligned on a word. */
>     while (__builtin_expect(length & (sizeof(word) - 1), 0)) {
>         if (*p)
>             return false;
>         p++;
>         length--;
>         if (!length)
>             return true;
>     }
>
>     /* Check up to 16 bytes a word at a time. */
>     for (;;) {
>         memcpy(&word, p, sizeof(word));
>         if (word)
>             return false;
>         p += sizeof(word);
>         length -= sizeof(word);
>         if (!length)
>             return true;
>         if (__builtin_expect(length & 15, 0) == 0)
>             break;
>     }
>
>     /* Now we know that's zero, memcmp with self. */
>     return memcmp(data, p, length) == 0;
> }
>
> compiled with gcc 5.1.1 -march=native -O2 on an i3-2310M
> we get these timings:
>
> bytes       1     8    16   512  65536
> ---------------------------------------------
> Rusty:     10    28    59   114   6510
> Paolo:      9     9    12    75   6495
>
> It's also smaller, especially at -O3:
>
> $ nm -S a.out | grep memeqzero4
> ... 000000000000005b t memeqzero4_paolo
> ... 0000000000000063 t memeqzero4_rusty
> $ gcc -march=native -O3 memeqzero.c
> $ nm -S a.out | grep memeqzero4
> ... 000000000000005b t memeqzero4_paolo
> ... 0000000000000133 t memeqzero4_rusty
>
> cheers,
> Pádraig.
>
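
For anyone reading along, the "memcmp with self" step in the quoted function can be illustrated in isolation. The following is only a minimal sketch, not the code from the patch: the names is_zero_bytewise, is_zero_memcmp and count_mismatches are hypothetical, and the fixed 16-byte head is an assumption made for illustration. Once the first `head` bytes are known to be zero, comparing the buffer with itself shifted by `head` bytes proves the rest is zero by induction, since data[i + head] must equal data[i].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reference implementation: plain byte loop (hypothetical helper). */
bool is_zero_bytewise(const void *data, size_t length)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < length; i++)
        if (p[i])
            return false;
    return true;
}

/* Simplified overlapping-memcmp variant (hypothetical, not the patch):
 * check a short head by hand, then compare the buffer with itself
 * shifted by the head length.  The head size of 16 is an assumption. */
bool is_zero_memcmp(const void *data, size_t length)
{
    const unsigned char *p = data;
    size_t head = length < 16 ? length : 16;

    for (size_t i = 0; i < head; i++)
        if (p[i])
            return false;
    if (length == head)
        return true;

    /* data[0..head) is zero.  If data[i] == data[i + head] for every
     * remaining i, the rest is zero too.  memcmp only reads its
     * arguments, so the overlap between the two ranges is harmless. */
    return memcmp(p, p + head, length - head) == 0;
}

/* Count disagreements between the two checkers over every buffer of up
 * to max_len bytes that is all zero or has exactly one nonzero byte. */
size_t count_mismatches(size_t max_len)
{
    unsigned char buf[256];
    size_t mismatches = 0;

    assert(max_len <= sizeof(buf));
    memset(buf, 0, sizeof(buf));
    for (size_t len = 0; len <= max_len; len++) {
        mismatches += is_zero_memcmp(buf, len) != is_zero_bytewise(buf, len);
        for (size_t pos = 0; pos < len; pos++) {
            buf[pos] = 1;
            mismatches += is_zero_memcmp(buf, len) != is_zero_bytewise(buf, len);
            buf[pos] = 0;
        }
    }
    return mismatches;
}
```

Calling count_mismatches(64) from a small main() and checking that it returns 0 is a cheap way to catch off-by-one errors at the head/tail boundary, which is presumably where bugs of the kind discussed earlier in the thread would show up.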