Re: [PATCH v5 57/60] target/riscv: vector slide instructions
From: Richard Henderson
Date: Mon, 16 Mar 2020 10:42:56 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1
On 3/16/20 1:04 AM, LIU Zhiwei wrote:
>> As a preference, I think you can do away with this helper.
>> Simply use the slideup helper with argument 1, and then
>> afterwards store the integer register into element 0. You should be able to
>> re-use code from vmv.s.x for that.
> When I try it, I find it somewhat difficult, because vmv.s.x will clear
> the elements (0 < index < VLEN/SEW).
Well, two things about that:
(1) The 0.8 version of vmv.s.x does *not* zero the other elements, so we'll
want to be prepared for that.
(2) We have 8 insns that, in the end, come down to a direct element access,
possibly with some other processing.
So we'll want basic helper functions that can locate an element by immediate
offset and by variable offset:

    /* Compute the offset of vreg[idx] relative to cpu_env.
       The index must be in range of VLMAX. */
    int vec_element_ofsi(int vreg, int idx, int sew);

    /* Compute a pointer to vreg[idx].
       If need_bound is true, mask idx into VLMAX;
       otherwise we know a priori that idx is already in bounds. */
    void vec_element_ofsx(DisasContext *s, TCGv_ptr base,
                          TCGv idx, int sew, bool need_bound);

    /* Load idx >= VLMAX ? 0 : vreg[idx] */
    void vec_element_loadi(DisasContext *s, TCGv_i64 val,
                           int vreg, int idx, int sew);
    void vec_element_loadx(DisasContext *s, TCGv_i64 val,
                           int vreg, TCGv idx, int sew);

    /* Store vreg[imm] = val.
       The index must be in range of VLMAX. */
    void vec_element_storei(DisasContext *s, int vreg, int imm,
                            TCGv_i64 val);
    void vec_element_storex(DisasContext *s, int vreg,
                            TCGv idx, TCGv_i64 val);

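To make the intended semantics of these helpers concrete, here is a rough host-side model in plain C. It is not TCG code and not the proposed implementation: the `model_*` names are hypothetical, and it assumes VLEN = 128 bits, LMUL = 1, a register file stored as one contiguous byte array, and little-endian element layout. As in the prototypes, `sew` is the log2 of the element size in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only: VLEN = 128 bits, 32 registers,
   laid out back to back, elements stored little-endian. */
enum { VLEN_BYTES = 16, NUM_VREGS = 32 };

/* Number of elements per register for a given sew (log2 bytes). */
static unsigned model_vlmax(int sew)
{
    return VLEN_BYTES >> sew;
}

/* Byte offset of vreg[idx]; the index must be in range of VLMAX. */
static size_t model_element_ofs(int vreg, unsigned idx, int sew)
{
    assert(idx < model_vlmax(sew));
    return (size_t)vreg * VLEN_BYTES + ((size_t)idx << sew);
}

/* Load idx >= VLMAX ? 0 : vreg[idx], zero-extended to 64 bits. */
static uint64_t model_element_load(const uint8_t *regs, int vreg,
                                   uint64_t idx, int sew)
{
    uint64_t val = 0;

    if (idx >= model_vlmax(sew)) {
        return 0;
    }
    const uint8_t *p = regs + model_element_ofs(vreg, (unsigned)idx, sew);
    for (size_t i = 0; i < ((size_t)1 << sew); i++) {
        val |= (uint64_t)p[i] << (8 * i);
    }
    return val;
}

/* Store vreg[idx] = val; the index must be in range of VLMAX. */
static void model_element_store(uint8_t *regs, int vreg, unsigned idx,
                                int sew, uint64_t val)
{
    uint8_t *p = regs + model_element_ofs(vreg, idx, sew);

    for (size_t i = 0; i < ((size_t)1 << sew); i++) {
        p[i] = (uint8_t)(val >> (8 * i));
    }
}
```

The load is the only routine that tolerates an out-of-range index, which matches the "idx >= VLMAX ? 0" comment; the offset and store routines treat it as a caller bug.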
(3) It would be handy to have TCGv cpu_vl.
Then:
vext.x.v:
    If rs1 == 0,
        Use vec_element_loadi(s, x[rd], vs2, 0, s->sew).
    else
        Use vec_element_loadx(s, x[rd], vs2, x[rs1], s->sew).
vmv.s.x:
    over = gen_new_label();
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    For 0.7.1:
        Use tcg_gen_gvec_dup8i to zero all VLMAX elements of vd.
        If rs1 == 0, goto done.
    Use vec_element_storei(s, vd, 0, x[rs1]).
  done:
    gen_set_label(over);
vfmv.f.s:
    Use vec_element_loadi(s, f[rd], vs2, 0, s->sew).
    NaN-box f[rd] as necessary for SEW.
vfmv.s.f:
    over = gen_new_label();
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    For 0.7.1:
        Use tcg_gen_gvec_dup8i to zero all VLMAX elements of vd.
    Let tmp = f[rs1], nan-boxed as necessary for SEW.
    Use vec_element_storei(s, vd, 0, tmp).
    gen_set_label(over);
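For reference, the NaN-boxing step used by both vfmv.f.s and vfmv.s.f can be modelled as below. This is a plain C sketch, not TCG code; `model_nanbox` is a hypothetical name, and it assumes FLEN = 64 with `sew` given as the log2 of the element size in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* NaN-box a value of width 8 << sew bits into a 64-bit FP register:
   keep the low bits and set every bit above them to 1.
   sew: 1 = 16-bit, 2 = 32-bit, 3 = 64-bit (nothing to box). */
static uint64_t model_nanbox(uint64_t val, int sew)
{
    assert(sew >= 1 && sew <= 3);
    if (sew == 3) {
        return val;
    }
    unsigned bits = 8u << sew;
    uint64_t low = (UINT64_C(1) << bits) - 1;
    return (val & low) | ~low;
}
```

For example, the single-precision pattern 0x3f800000 (1.0f) becomes 0xffffffff3f800000 once boxed.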
vslide1up.vx:
    Ho hum, I forgot about masking.  Some options:
    (1) Call a helper just as you did in your original patch.
    (2) Call a helper only for !vm, for vm as below.
    (3) Call vslideup w/1.
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    If !vm,
        // inline test for v0[0]
        vec_element_loadi(s, tmp, 0, 0, MO_8);
        tcg_gen_andi_i64(tmp, tmp, 1);
        tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, over);
    Use vec_element_storei(s, vd, 0, x[rs1]).
    gen_set_label(over);
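As a reference for what any of these options has to compute, here is a small C model of vslide1up.vx. It is a sketch of the architectural behaviour only, with hypothetical names; it assumes 64-bit elements, represents the mask as a `bool` array (NULL meaning unmasked), leaves masked-off elements unchanged, and does not model tail handling above vl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* vd[0] = x, vd[i] = vs2[i - 1] for 0 < i < vl, active elements only.
   mask == NULL models vm=1 (unmasked). */
static void model_vslide1up(uint64_t *vd, const uint64_t *vs2,
                            const bool *mask, uint64_t x, size_t vl)
{
    /* The destination may not overlap the source for slide-up. */
    assert(vd != vs2);

    for (size_t i = 0; i < vl; i++) {
        if (mask && !mask[i]) {
            continue;           /* inactive element: leave vd[i] alone */
        }
        vd[i] = (i == 0) ? x : vs2[i - 1];
    }
}
```

The i == 0 case is the part that the inline v0[0] test above guards: the scalar store happens only when element 0 is active.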
vslide1down.vx:
    For !vm, this is complicated enough for a helper.
    If using option 3 for vslide1up, then the store becomes:
        tcg_gen_subi_tl(tmp, cpu_vl, 1);
        vec_element_storex(s, vd, tmp, x[rs1]);
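The unmasked case, where the scalar lands at index vl - 1, can be modelled like this. Again a plain C sketch with a hypothetical name, assuming 64-bit elements and ignoring masking and tail handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unmasked vslide1down.vx: vd[i] = vs2[i + 1] for i < vl - 1,
   then vd[vl - 1] = x.  Nothing happens when vl == 0. */
static void model_vslide1down(uint64_t *vd, const uint64_t *vs2,
                              uint64_t x, size_t vl)
{
    if (vl == 0) {
        return;
    }
    for (size_t i = 0; i + 1 < vl; i++) {
        vd[i] = vs2[i + 1];
    }
    vd[vl - 1] = x;             /* the variable-index store at vl - 1 */
}
```

The final assignment corresponds to the subi/storex pair above; the early return corresponds to the vl == 0 branch.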
vrgather.vx:
    If !vm or !vl_eq_vlmax, use helper.
    vec_element_loadx(s, tmp, vs2, x[rs1], s->sew);
    Use tcg_gen_gvec_dup_i64 to store tmp to vd.
vrgather.vi:
    If !vm or !vl_eq_vlmax, use helper.
    If imm >= vlmax,
        Use tcg_gen_gvec_dup8i to zero vd;
    else,
        ofs = vec_element_ofsi(vs2, imm, s->sew);
        tcg_gen_gvec_dup_mem(sew, vreg_ofs(vd),
                             ofs, vlmax, vlmax);
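Both vrgather forms reduce to the same thing in the fast path (vm set and vl == vlmax): splat one source element, or zero when the index is out of range. A C model of that case, with a hypothetical name and assuming 64-bit elements:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unmasked vrgather.vx / vrgather.vi with vl == vlmax:
   every element of vd becomes idx >= vlmax ? 0 : vs2[idx]. */
static void model_vrgather_splat(uint64_t *vd, const uint64_t *vs2,
                                 uint64_t idx, size_t vlmax)
{
    /* Read the source element first, then broadcast it. */
    uint64_t val = (idx >= vlmax) ? 0 : vs2[idx];

    for (size_t i = 0; i < vlmax; i++) {
        vd[i] = val;
    }
}
```

With an immediate index the range check is resolved at translation time, which is why the .vi form can pick between zeroing vd and a dup from memory without emitting a load.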
r~