Re: [qemu-s390x] [Qemu-devel] [PATCH v1 19/33] s390x/tcg: Implement VECT
From: David Hildenbrand
Subject: Re: [qemu-s390x] [Qemu-devel] [PATCH v1 19/33] s390x/tcg: Implement VECTOR MERGE (HIGH|LOW)
Date: Thu, 28 Feb 2019 09:54:57 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
On 27.02.19 17:20, Richard Henderson wrote:
> On 2/26/19 3:39 AM, David Hildenbrand wrote:
>> +    for (dst_idx = 0; dst_idx < NUM_VEC_ELEMENTS(es); dst_idx++) {
>> +        src_idx = dst_idx / 2;
>> +        if (!high) {
>> +            src_idx += NUM_VEC_ELEMENTS(es) / 2;
>> +        }
>> +        if (dst_idx % 2 == 0) {
>> +            read_vec_element_i64(tmp, v2, src_idx, es);
>> +        } else {
>> +            read_vec_element_i64(tmp, v3, src_idx, es);
>> +        }
>> +        write_vec_element_i64(tmp, dst_v, dst_idx, es);
>> +    }
>
> Note that you do not need a vector temporary here, so long as you load
> both source elements before writing, and you iterate in the proper direction.
>
> For VMRL, iterate forward as you do now. The element access order for MO_32:
>
> read v2: 2 3
> read v3: 2 3
> write v1: 0 1 2 3
>
> For VMRH, iterate backward:
>
> read v2: 1 0
> read v3: 1 0
> write v1: 3 2 1 0
>
>
> r~
>
Let's have a look at VMRH when iterating forward (my brain is a little
slow in the morning):
v1[0] = v2[0]
v1[1] = v3[0]
v1[2] = v2[1]
v1[3] = v3[1]
If all registers overlap (v1 == v2 == v3):
v1[0] = v1[0]
v1[1] = v1[0] -> v1[0] already modified
v1[2] = v1[1] -> v1[1] already modified
v1[3] = v1[1] -> v1[1] already modified
VMRH when iterating backwards:
v1[3] = v3[1]
v1[2] = v2[1]
v1[1] = v3[0]
v1[0] = v2[0]
If all registers overlap (v1 == v2 == v3):
v1[3] = v1[1]
v1[2] = v1[1]
v1[1] = v1[0]
v1[0] = v1[0]
VMRL when iterating forward:
v1[0] = v2[2]
v1[1] = v3[2]
v1[2] = v2[3]
v1[3] = v3[3]
If all registers overlap (v1 == v2 == v3):
v1[0] = v1[2]
v1[1] = v1[2]
v1[2] = v1[3]
v1[3] = v1[3]
Perfect :) I'll split up the two cases! Thanks!
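
To double-check the reasoning above outside of TCG, here is a minimal
standalone sketch in plain C. It is not the QEMU implementation: plain
uint32_t arrays stand in for the vector registers, the element count is
fixed to 4 (MO_32), and all function names are made up for illustration.
Both sources are loaded before either destination element is written, as
Richard suggested.

```c
#include <stdint.h>

/* Assumption: four 32-bit elements per 128-bit vector (the MO_32 case). */
#define NUM_ELEMS 4

/*
 * Hypothetical VMRH model: interleave the high halves (elements 0..1)
 * of v2 and v3. Iterating backward means no element is overwritten
 * before it has been read, even if v1 aliases v2 and/or v3.
 */
static void merge_high_inplace(uint32_t *v1, const uint32_t *v2,
                               const uint32_t *v3)
{
    for (int dst = NUM_ELEMS - 2; dst >= 0; dst -= 2) {
        int src = dst / 2;
        /* Load both source elements before writing anything. */
        uint32_t a = v2[src];
        uint32_t b = v3[src];
        v1[dst + 1] = b;
        v1[dst] = a;
    }
}

/*
 * Hypothetical VMRL model: interleave the low halves (elements 2..3)
 * of v2 and v3. Iterating forward is safe here for the same reason.
 */
static void merge_low_inplace(uint32_t *v1, const uint32_t *v2,
                              const uint32_t *v3)
{
    for (int dst = 0; dst < NUM_ELEMS; dst += 2) {
        int src = dst / 2 + NUM_ELEMS / 2;
        /* Load both source elements before writing anything. */
        uint32_t a = v2[src];
        uint32_t b = v3[src];
        v1[dst] = a;
        v1[dst + 1] = b;
    }
}

/*
 * Counter-example: VMRH iterating forward, one element at a time and
 * without a temporary vector. With overlapping registers, element 1 is
 * overwritten before it is read for destinations 2 and 3.
 */
static void merge_high_forward_naive(uint32_t *v1, const uint32_t *v2,
                                     const uint32_t *v3)
{
    for (int dst = 0; dst < NUM_ELEMS; dst++) {
        int src = dst / 2;
        v1[dst] = (dst % 2 == 0) ? v2[src] : v3[src];
    }
}
```

Calling each function with the same array for all three operands models
the "all registers overlap" case. Starting from {10, 11, 12, 13}, the
backward VMRH yields {10, 10, 11, 11} and the forward VMRL yields
{12, 12, 13, 13}, while the naive forward VMRH yields {10, 10, 10, 10}.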
--
Thanks,
David / dhildenb