From: Alistair Francis
Subject: Re: [PATCH v2] target/riscv/vector_helper.c: Remove the check for extra tail elements
Date: Mon, 12 Jun 2023 13:44:06 +1000
On Wed, Jun 7, 2023 at 7:13 PM Xiao Wang <xiao.w.wang@intel.com> wrote:
>
> Commit 752614cab8e6 ("target/riscv: rvv: Add tail agnostic for vector
> load / store instructions") added an extra check for LMUL fragmentation,
> intended for setting the "rest tail elements" in the last register for a
> segment load insn.
>
> Actually, the max_elems derived in vext_ld*() is never a fraction of the
> vector register size, since the lmul encoded in desc is the emul, which
> vext_get_emul() in trans_rvv.c.inc has already adjusted to 1 in the LMUL
> fragmentation case for ld_stride(), ld_us(), ld_index() and ldff().
>
> Besides, vext_get_emul() already takes EEW/SEW into account, so there is no
> need to call vext_get_total_elems(), which would derive a second emul from
> the first one; that second emul is incorrect when esz differs from sew.
>
> Thus this patch removes the check for extra tail elements.
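
For readers following the argument above, here is a minimal standalone sketch of the arithmetic. It is not the QEMU code: the sketch_* names are hypothetical, all size arguments are log2 values in bytes (EEW=8 bits is 0, SEW=32 bits is 2), LMUL=1/2 is -1, and vlenb is the vector register length in bytes. It assumes the emul is clamped so that a fractional value becomes 1, as described for vext_get_emul().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the emul derivation: emul = (EEW / SEW) * LMUL,
 * computed on log2 values and clamped so a fractional EMUL becomes 1. */
static int8_t sketch_emul(int8_t log2_eew, int8_t log2_sew, int8_t log2_lmul)
{
    int8_t emul = log2_eew - log2_sew + log2_lmul;
    return emul < 0 ? 0 : emul;
}

/* Elements per register group: (vlenb << emul) / esz, in log2 form. */
static uint32_t sketch_max_elems(uint32_t vlenb, int8_t log2_emul,
                                 int8_t log2_esz)
{
    int scale = log2_emul - log2_esz;
    return scale < 0 ? vlenb >> -scale : vlenb << scale;
}

/* Number of whole registers spanned by max_elems elements of size EEW. */
static uint32_t sketch_regs_covered(uint32_t vlenb, int8_t log2_eew,
                                    int8_t log2_sew, int8_t log2_lmul)
{
    int8_t emul = sketch_emul(log2_eew, log2_sew, log2_lmul);
    uint32_t max_elems = sketch_max_elems(vlenb, emul, log2_eew);
    uint32_t bytes = max_elems << log2_eew;

    /* With a clamped emul the byte count is a whole multiple of vlenb. */
    assert(bytes % vlenb == 0);
    return bytes / vlenb;
}
```

With vlenb = 16, EEW=8, SEW=32 and LMUL=1/2, sketch_emul() clamps to 0 and sketch_regs_covered() reports exactly one register, so there is no partial register left to fill. Feeding that emul back into the same formula, e.g. sketch_emul(0, 2, sketch_emul(0, 2, 3)), gives 0 instead of 1, which illustrates why a second derivation goes wrong when esz differs from sew.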
>
> Fixes: 752614cab8e6 ("target/riscv: rvv: Add tail agnostic for vector load / store instructions")
>
> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Thanks!
Applied to riscv-to-apply.next
Alistair
> ---
> v2:
> * Rebased on top of Alistair's riscv-to-apply.next branch.
> ---
> target/riscv/vector_helper.c | 22 ++++++----------------
> 1 file changed, 6 insertions(+), 16 deletions(-)
>
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index 7505f9470a..f261e726c2 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -264,11 +264,10 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw)
> GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl)
> GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq)
>
> -static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl,
> - void *vd, uint32_t desc, uint32_t nf,
> +static void vext_set_tail_elems_1s(target_ulong vl, void *vd,
> + uint32_t desc, uint32_t nf,
> uint32_t esz, uint32_t max_elems)
> {
> - uint32_t total_elems, vlenb, registers_used;
> uint32_t vta = vext_vta(desc);
> int k;
>
> @@ -276,19 +275,10 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl,
> return;
> }
>
> - total_elems = vext_get_total_elems(env, desc, esz);
> - vlenb = riscv_cpu_cfg(env)->vlen >> 3;
> -
> for (k = 0; k < nf; ++k) {
> vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
> (k * max_elems + max_elems) * esz);
> }
> -
> - if (nf * max_elems % total_elems != 0) {
> - registers_used = ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
> - vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
> - registers_used * vlenb);
> - }
> }
>
> /*
> @@ -324,7 +314,7 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
> }
> env->vstart = 0;
>
> - vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
> + vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
> }
>
> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN) \
> @@ -383,7 +373,7 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
> }
> env->vstart = 0;
>
> - vext_set_tail_elems_1s(env, evl, vd, desc, nf, esz, max_elems);
> + vext_set_tail_elems_1s(evl, vd, desc, nf, esz, max_elems);
> }
>
> /*
> @@ -504,7 +494,7 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
> }
> env->vstart = 0;
>
> - vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
> + vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
> }
>
> #define GEN_VEXT_LD_INDEX(NAME, ETYPE, INDEX_FN, LOAD_FN) \
> @@ -634,7 +624,7 @@ ProbeSuccess:
> }
> env->vstart = 0;
>
> - vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
> + vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
> }
>
> #define GEN_VEXT_LDFF(NAME, ETYPE, LOAD_FN) \
> --
> 2.25.1
>
>
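
As an illustration of what the simplified vext_set_tail_elems_1s() loop in the patch does, here is a byte-level sketch. It is a model under assumptions, not the helper itself: sketch_set_tail_1s is a hypothetical name, vd is treated as a flat byte buffer, vta is passed directly rather than decoded from desc, and vl is assumed not to exceed max_elems.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the per-field tail fill: for each of the nf fields,
 * set bytes [(k * max_elems + vl) * esz, (k * max_elems + max_elems) * esz)
 * to all 1s. esz is the element size in bytes.
 */
static void sketch_set_tail_1s(uint8_t *vd, uint32_t vta, uint32_t vl,
                               uint32_t nf, uint32_t esz, uint32_t max_elems)
{
    if (vta == 0) {
        return; /* tail-undisturbed: leave the tail elements as they are */
    }
    for (uint32_t k = 0; k < nf; ++k) {
        uint32_t start = (k * max_elems + vl) * esz;
        uint32_t end = (k * max_elems + max_elems) * esz;
        memset(vd + start, 0xff, end - start);
    }
}
```

For nf = 2, esz = 2, max_elems = 8 and vl = 3, this fills bytes 6..15 and 22..31 of a 32-byte buffer and leaves the body elements of both fields untouched. Because nf * max_elems * esz is a whole number of registers, nothing remains past the last field for an extra fill step.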