Re: [qemu-s390x] [Qemu-devel] [PATCH v1 03/33] s390x: Add one temporary
From: David Hildenbrand
Subject: Re: [qemu-s390x] [Qemu-devel] [PATCH v1 03/33] s390x: Add one temporary vector register in CPU state for TCG
Date: Tue, 26 Feb 2019 19:45:54 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
On 26.02.19 19:36, Richard Henderson wrote:
> On 2/26/19 3:38 AM, David Hildenbrand wrote:
>> We sometimes want to work on a temporary vector register instead of the
>> actual destination, because source and destination might overlap. An
>> alternative would be loading the vector into two i64 variables, but then
>> separate handling for accessing the vector elements would be needed.
>> This is easier. Add one for now as that seems to be enough.
>
> Hmm, I'll reserve judgment until I see how this is used.
>
> For ARM SVE, I would allocate this temporary on the stack within the helper,
> and move one of the operands out of the way. E.g.
Yes, I do the same for helpers. This, however, is for TCG-translated code :)
E.g. see
[PATCH v1 08/33] s390x/tcg: Implement VECTOR LOAD
[PATCH v1 19/33] s390x/tcg: Implement VECTOR MERGE (HIGH|LOW)
[PATCH v1 33/33] s390x/tcg: Implement VECTOR UNPACK *
>
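> void helper(foo)(void *vd, void *vx, void *vy)
> {
>     VectorReg tmp;
>     TYPE *d, *x, *y;
>
>     if (vx == vd || vy == vd) {
>         /* Copy the overlapping operand out of the way. */
>         tmp = *(VectorReg *)vd;
>         if (vx == vd) {
>             vx = &tmp;
>         }
>         if (vy == vd) {
>             vy = &tmp;
>         }
>     }
>     /* Assign only after vx/vy may have been redirected to tmp. */
>     d = vd, x = vx, y = vy;
>
>     /* process d, x, y as normal. */
> }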
>
> This minimized the amount of code inline. However, SVE vectors are quite a bit
> larger, at 256 bytes, so the copy itself was out of line most of the time
> anyway.
>
> Provisionally,
> Reviewed-by: Richard Henderson <address@hidden>
>
>
> r~
>
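To make the overlap problem concrete, here is a minimal standalone sketch in plain C. It is not QEMU code: VecReg, merge_high_naive() and merge_high_safe() are hypothetical names, and the operation is only loosely modeled on a merge-high style interleave. It shows why writing straight into the destination goes wrong when it aliases a source, and how working on a temporary avoids that.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte vector register; not QEMU's actual type. */
typedef struct {
    uint8_t b[16];
} VecReg;

/*
 * Interleave the first 8 bytes of x and y into d.
 * Gives wrong results if d aliases x or y, because source bytes
 * are overwritten before they have been read.
 */
static void merge_high_naive(VecReg *d, const VecReg *x, const VecReg *y)
{
    for (int i = 0; i < 8; i++) {
        d->b[2 * i] = x->b[i];
        d->b[2 * i + 1] = y->b[i];
    }
}

/*
 * Same operation, but computed into a temporary first and copied
 * to the destination at the end, so d may alias x and/or y.
 */
static void merge_high_safe(VecReg *d, const VecReg *x, const VecReg *y)
{
    VecReg tmp;

    merge_high_naive(&tmp, x, y);
    memcpy(d, &tmp, sizeof(tmp));
}
```

With d == x, the naive variant already reads a clobbered byte on its second iteration, while the safe variant returns the same result as in the non-overlapping case.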
--
Thanks,
David / dhildenb