From: Richard Henderson
Subject: Re: [PATCH v1 04/46] target/loongarch: Implement xvadd/xvsub
Date: Tue, 20 Jun 2023 14:25:20 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0

On 6/20/23 11:37, Song Gao wrote:
+static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop,
+                     void (*func)(unsigned, uint32_t, uint32_t,
+                                  uint32_t, uint32_t, uint32_t))
+{
+    uint32_t xd_ofs, xj_ofs, xk_ofs;
+
+    CHECK_ASXE;
+
+    xd_ofs = vec_full_offset(a->xd);
+    xj_ofs = vec_full_offset(a->xj);
+    xk_ofs = vec_full_offset(a->xk);
+
+    func(mop, xd_ofs, xj_ofs, xk_ofs, 32, ctx->vl / 8);
+    return true;
+}

Comparing gvec_xxx vs gvec_vvv for LSX,
func(mop, vd_ofs, vj_ofs, vk_ofs, 16, ctx->vl/8);
gvec_vvv will write 16 bytes of output, followed by 16 bytes of zero to satisfy vl / 8.

I presume this is the intended behaviour of mixing LSX with LASX: that the high 128 bits, which are not considered by the LSX instruction, are zeroed on write?
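
To make the oprsz/maxsz point concrete, here is a minimal model of what a three-operand gvec byte add does to the destination register. The function name model_gvec_add8 is hypothetical and this is not QEMU's code; it only assumes that the operation covers the first oprsz bytes and that the tail up to maxsz is cleared, with vl / 8 taken to be 32.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model, not QEMU's implementation: operate on the first
 * oprsz bytes, then clear the remaining bytes up to maxsz.
 */
static void model_gvec_add8(uint8_t *d, const uint8_t *a, const uint8_t *b,
                            uint32_t oprsz, uint32_t maxsz)
{
    /* Element-wise add over the operation size. */
    for (uint32_t i = 0; i < oprsz; i++) {
        d[i] = (uint8_t)(a[i] + b[i]);
    }
    /* Zero the tail between the operation size and the maximum size. */
    if (maxsz > oprsz) {
        memset(d + oprsz, 0, maxsz - oprsz);
    }
}
```

With oprsz = 16 and maxsz = 32 (the LSX case) bytes 16..31 of the destination end up zero; with oprsz = 32 and maxsz = 32 (the LASX case) all 32 bytes carry results.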
Which means that your macros from patch 1,
+#if HOST_BIG_ENDIAN
...
+#define XB(x)  XB[31 - (x)]
+#define XH(x)  XH[15 - (x)]
are incorrect. We need big-endian within the Int128, but little-endian ordering of the two Int128. This can be done with
#define XB(x)  XB[(x) ^ 15]
#define XH(x)  XH[(x) ^ 7]

etc.

It would be nice to share more code with trans_lsx.c, if possible.

r~
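
As a rough illustration of the two index mappings discussed above for a big-endian host, the sketch below compares them for byte and halfword elements. The helper names are hypothetical and not part of QEMU; the only assumption is that host array indices 0..15 cover the first Int128 of the register and 16..31 cover the second.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of QEMU. */

/* Mapping from the patch 1 macro: XB[31 - (x)]. */
static inline size_t xb_index_reversed(size_t x)
{
    return 31 - x;
}

/* Suggested mapping: XB[(x) ^ 15], reversing bytes within each Int128. */
static inline size_t xb_index_xor(size_t x)
{
    return x ^ 15;
}

/* Suggested mapping for halfwords: XH[(x) ^ 7], eight per Int128. */
static inline size_t xh_index_xor(size_t x)
{
    return x ^ 7;
}

/* Which Int128 (0 or 1) a host byte index falls in. */
static inline size_t int128_of_byte(size_t host_index)
{
    return host_index / 16;
}
```

With the XOR form, guest byte 0 maps to host index 15 and stays in the first Int128, while guest byte 16 maps to host index 31 in the second. With the 31 - (x) form, guest byte 0 maps to host index 31, so the two Int128 halves are swapped as well as byte-reversed.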