qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 14/67] target/arm: Convert multiply and multiply


From: Peter Maydell
Subject: Re: [Qemu-devel] [PATCH 14/67] target/arm: Convert multiply and multiply accumulate
Date: Mon, 5 Aug 2019 16:32:40 +0100

On Fri, 26 Jul 2019 at 18:50, Richard Henderson
<address@hidden> wrote:
>
> Signed-off-by: Richard Henderson <address@hidden>
> ---
>  target/arm/translate.c | 214 ++++++++++++++++++++++++-----------------
>  target/arm/a32.decode  |  17 ++++
>  target/arm/t32.decode  |  19 ++++
>  3 files changed, 163 insertions(+), 87 deletions(-)
>
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index ee7d53dfa5..354a52d36c 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -7376,21 +7376,6 @@ static void gen_storeq_reg(DisasContext *s, int rlow, 
> int rhigh, TCGv_i64 val)
>      store_reg(s, rhigh, tmp);
>  }
>
> -/* load a 32-bit value from a register and perform a 64-bit accumulate.  */
> -static void gen_addq_lo(DisasContext *s, TCGv_i64 val, int rlow)
> -{
> -    TCGv_i64 tmp;
> -    TCGv_i32 tmp2;
> -
> -    /* Load value and extend to 64 bits.  */
> -    tmp = tcg_temp_new_i64();
> -    tmp2 = load_reg(s, rlow);
> -    tcg_gen_extu_i32_i64(tmp, tmp2);
> -    tcg_temp_free_i32(tmp2);
> -    tcg_gen_add_i64(val, val, tmp);
> -    tcg_temp_free_i64(tmp);
> -}
> -

> +static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
> +{
> +    TCGv_i32 t0, t1, t2, zero;
> +
> +    if (s->thumb
> +        ? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
> +        : !ENABLE_ARCH_6) {
> +        return false;
> +    }
> +
> +    t0 = load_reg(s, a->rm);
> +    t1 = load_reg(s, a->rn);
> +    tcg_gen_mulu2_i32(t0, t1, t0, t1);
> +    zero = tcg_const_i32(0);
> +    t2 = load_reg(s, a->ra);
> +    tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
> +    tcg_temp_free_i32(t2);
> +    t2 = load_reg(s, a->rd);
> +    tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
> +    tcg_temp_free_i32(t2);
> +    tcg_temp_free_i32(zero);
> +    store_reg(s, a->ra, t0);
> +    store_reg(s, a->rd, t1);
> +    return true;
> +

Is using mulu2/add2/add2 like this really generating better
code than the mulu_i64_i32 and 2 64-bit adds that we had before?
If we're going to change how we're generating code it would be
nice to at least mention it in the commit message...


> @@ -10292,13 +10337,8 @@ static void disas_thumb2_insn(DisasContext *s, 
> uint32_t insn)
>                      }
>                  }
>                  if (op & 4) {
> -                    /* umaal */
> -                    if (!arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)) {
> -                        tcg_temp_free_i64(tmp64);
> -                        goto illegal_op;
> -                    }
> -                    gen_addq_lo(s, tmp64, rs);
> -                    gen_addq_lo(s, tmp64, rd);
> +                    /* umaal, in decodetree */
> +                    goto illegal_op;
>                  } else if (op & 0x40) {
>                      /* 64-bit accumulate.  */
>                      gen_addq(s, tmp64, rs, rd);

This doesn't seem to remove all of the Thumb insns we implement
in the decode tree from the legacy decoder.

Otherwise
Reviewed-by: Peter Maydell <address@hidden>

thanks
-- PMM



reply via email to

[Prev in Thread] Current Thread [Next in Thread]