From: David Hildenbrand
Subject: Re: [qemu-s390x] [Qemu-devel] [PATCH v1 1/5] s390x/tcg: Implement VECTOR FIND ANY ELEMENT EQUAL
Date: Thu, 23 May 2019 09:50:54 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 22.05.19 20:46, Richard Henderson wrote:
> On 5/22/19 2:16 PM, David Hildenbrand wrote:
>> On 22.05.19 17:59, Richard Henderson wrote:
>>> On Wed, 22 May 2019 at 07:16, David Hildenbrand <address@hidden> wrote:
>>>>> Also plausible. I guess it would be good to know, anyway.
>>>>
>>>> I'll dump the parameters when booting Linux. My gut feeling is that the
>>>> cc option is basically never used ...
>>>
>>> It looks like our intuition is wrong about that.
>>
>> Thanks for checking!
>>
>>>
>>> address@hidden:~/glibc/src/sysdeps/s390$ grep -r vfaezbs * | wc -l
>>> 15
>>>
>>> These set cc, use zs, and do not use rt.
>>>
>>> address@hidden:~/glibc/src/sysdeps/s390$ grep -r 'vfaeb' * | wc -l
>>> 3
>>>
>>> These do not set cc, do not use zs, and do use rt.
>>>
>>> Those are the only two VFAE forms used by glibc (note that the same
>>> variants as 'f' are used by the wide-character strings).
>>>
>>
>> I guess "rt" and "cc" make the biggest difference. Maybe special case
>> these two, result in 4 variants for each of the 3 element sizes?
>
> Sounds good.
>

So ... after all, it might not be necessary, at least not for this
helper :) Using your crazy helper functions, I have this right now:

/*
 * Returns the number of bits composing one element.
 */
static uint8_t get_element_bits(uint8_t es)
{
    return (1 << es) * BITS_PER_BYTE;
}

/*
 * Returns the bitmask for a single element.
 */
static uint64_t get_single_element_mask(uint8_t es)
{
    return -1ull >> (64 - get_element_bits(es));
}

/*
 * Returns the bitmask for a single element (excluding the MSB).
 */
static uint64_t get_single_element_lsbs_mask(uint8_t es)
{
    return -1ull >> (65 - get_element_bits(es));
}

/*
 * Returns the bitmasks for multiple elements (excluding the MSBs).
 */
static uint64_t get_element_lsbs_mask(uint8_t es)
{
    return dup_const(es, get_single_element_lsbs_mask(es));
}

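The vfae() function that follows relies on zero_search() and match_index(), which are not quoted in this mail. As a reading aid, here is a minimal sketch of one plausible shape for them, together with the mask values the helpers above produce. The function bodies are assumptions, not the code under review; dup_element(), lsbs_mask(), the local clz64() and vfae_helpers_selftest() are hypothetical stand-ins so the block compiles outside QEMU.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/* Stand-in for QEMU's dup_const(): replicate one element across 64 bits. */
static uint64_t dup_element(uint8_t es, uint64_t v)
{
    static const uint64_t ones[] = {
        0x0101010101010101ull,  /* es = 0: bytes */
        0x0001000100010001ull,  /* es = 1: halfwords */
        0x0000000100000001ull,  /* es = 2: words */
    };

    return v * ones[es];
}

/* Same arithmetic as get_element_lsbs_mask() in the mail. */
static uint64_t lsbs_mask(uint8_t es)
{
    const int bits = (1 << es) * BITS_PER_BYTE;

    return dup_element(es, -1ull >> (65 - bits));
}

/* Assumed shape: set the MSB of every element of a that is zero. */
static uint64_t zero_search(uint64_t a, uint64_t mask)
{
    return ~(((a & mask) + mask) | a | mask);
}

/* Portable count-leading-zeros that returns 64 for a zero input. */
static int clz64(uint64_t v)
{
    int n = 0;

    if (!v) {
        return 64;
    }
    while (!(v >> 63)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Assumed shape: byte index of the leftmost flag in c0:c1, 16 if none. */
static int match_index(uint64_t c0, uint64_t c1)
{
    return (c0 ? clz64(c0) : 64 + clz64(c1)) >> 3;
}

void vfae_helpers_selftest(void)
{
    const uint64_t m8 = lsbs_mask(0);

    assert(m8 == 0x7f7f7f7f7f7f7f7full);
    assert(lsbs_mask(1) == 0x7fff7fff7fff7fffull);
    assert(lsbs_mask(2) == 0x7fffffff7fffffffull);
    /* bytes 41 42 00 43 00 00 00 00: zero flags at bytes 2 and 4..7 */
    assert(zero_search(0x4142004300000000ull, m8) == 0x0000800080808080ull);
    assert(match_index(0x0000800080808080ull, 0) == 2);
    assert(match_index(0, 0x0080000000000000ull) == 9);
    assert(match_index(0, 0) == 16);
}
```

Under these assumptions the "no match" value of 16 falls out of the arithmetic: two all-zero flag words give (64 + 64) >> 3, which is the sentinel vfae() compares against.
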
static int vfae(void *v1, const void *v2, const void *v3, bool in,
                bool rt, bool zs, uint8_t es)
{
    const uint64_t mask = get_element_lsbs_mask(es);
    const int bits = get_element_bits(es);
    uint64_t a0, a1, b0, b1, e0, e1, t0, t1, z0, z1;
    uint64_t first_zero = 16;
    uint64_t first_equal;
    int i;

    a0 = s390_vec_read_element64(v2, 0);
    a1 = s390_vec_read_element64(v2, 1);
    b0 = s390_vec_read_element64(v3, 0);
    b1 = s390_vec_read_element64(v3, 1);
    e0 = 0;
    e1 = 0;
    /* compare against equality with every other element */
    for (i = 0; i < 64; i += bits) {
        t0 = i ? rol64(b0, i) : b0;
        t1 = i ? rol64(b1, i) : b1;
        e0 |= zero_search(a0 ^ t0, mask);
        e0 |= zero_search(a0 ^ t1, mask);
        e1 |= zero_search(a1 ^ t0, mask);
        e1 |= zero_search(a1 ^ t1, mask);
    }
    /* invert the result if requested - invert only the MSBs */
    if (in) {
        e0 = ~e0 & ~mask;
        e1 = ~e1 & ~mask;
    }
    first_equal = match_index(e0, e1);

    if (zs) {
        z0 = zero_search(a0, mask);
        z1 = zero_search(a1, mask);
        first_zero = match_index(z0, z1);
    }

    if (rt) {
        e0 = (e0 >> (bits - 1)) * get_single_element_mask(es);
        e1 = (e1 >> (bits - 1)) * get_single_element_mask(es);
        s390_vec_write_element64(v1, 0, e0);
        s390_vec_write_element64(v1, 1, e1);
    } else {
        s390_vec_write_element64(v1, 0, MIN(first_equal, first_zero));
        s390_vec_write_element64(v1, 1, 0);
    }

    if (first_zero == 16 && first_equal == 16) {
        return 3; /* no match */
    } else if (first_zero == 16) {
        return 1; /* matching elements, no match for zero */
    } else if (first_equal < first_zero) {
        return 2; /* matching elements before match for zero */
    }
    return 0; /* match for zero */
}

At least the kernel boots with it - am I missing something or does this
indeed work?
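
One way to answer that beyond "the kernel boots" would be to compare the helper against a naive per-element model. The following is a minimal sketch for byte elements (es = 0) only. It mirrors the semantics of the vfae() draft in this mail rather than the architecture text, so it can only catch SWAR mistakes, not misreadings of the specification. The names vfae_ref_result, vfae_ref_bytes and vfae_ref_selftest are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model, byte elements (es = 0) only. */
struct vfae_ref_result {
    uint8_t v1[16];  /* first-operand bytes, leftmost byte first */
    int cc;
};

struct vfae_ref_result vfae_ref_bytes(const uint8_t a[16], const uint8_t b[16],
                                      bool in, bool rt, bool zs)
{
    struct vfae_ref_result r;
    int first_equal = 16, first_zero = 16;

    memset(&r, 0, sizeof(r));
    for (int i = 0; i < 16; i++) {
        bool hit = false;

        /* does a[i] equal any element of b? */
        for (int j = 0; j < 16; j++) {
            hit = hit || (a[i] == b[j]);
        }
        hit = (hit != in);  /* "in" inverts the comparison result */
        if (hit && first_equal == 16) {
            first_equal = i;
        }
        if (zs && a[i] == 0 && first_zero == 16) {
            first_zero = i;
        }
        if (rt && hit) {
            r.v1[i] = 0xff;  /* rt: full-element mask per match */
        }
    }
    if (!rt) {
        /* byte index lands in the low byte of doubleword 0 */
        r.v1[7] = first_equal < first_zero ? first_equal : first_zero;
    }

    /* same condition-code ladder as the draft */
    if (first_zero == 16 && first_equal == 16) {
        r.cc = 3;
    } else if (first_zero == 16) {
        r.cc = 1;
    } else if (first_equal < first_zero) {
        r.cc = 2;
    } else {
        r.cc = 0;
    }
    return r;
}

void vfae_ref_selftest(void)
{
    uint8_t a[16] = { 'a', 'b', 'c', 'd' };  /* remaining bytes are zero */
    uint8_t b[16];
    struct vfae_ref_result r;

    memset(b, 'x', sizeof(b));
    b[0] = 'c';

    r = vfae_ref_bytes(a, b, false, false, false);
    assert(r.v1[7] == 2 && r.cc == 1);
    r = vfae_ref_bytes(a, b, false, false, true);
    assert(r.v1[7] == 2 && r.cc == 2);
    r = vfae_ref_bytes(a, b, false, true, false);
    assert(r.v1[2] == 0xff && r.v1[0] == 0 && r.cc == 1);

    b[0] = 'x';
    r = vfae_ref_bytes(a, b, false, false, true);
    assert(r.v1[7] == 4 && r.cc == 0);
    r = vfae_ref_bytes(a, b, false, false, false);
    assert(r.v1[7] == 16 && r.cc == 3);
}
```

Feeding random 16-byte operands and all flag combinations to both this model and the helper, then comparing the result bytes and the condition code, would exercise the rotate-and-compare loop far more thoroughly than a boot test.
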
Cheers!
--
Thanks,
David / dhildenb