
Re: [PATCH v2 07/34] target/ppc: Implement cntlzdm


From: David Gibson
Subject: Re: [PATCH v2 07/34] target/ppc: Implement cntlzdm
Date: Mon, 1 Nov 2021 11:16:59 +1100

On Sat, Oct 30, 2021 at 02:17:07PM -0700, Richard Henderson wrote:
> On 10/29/21 1:23 PM, matheus.ferst@eldorado.org.br wrote:
> > From: Luis Pires <luis.pires@eldorado.org.br>
> > 
> > Implement the following PowerISA v3.1 instruction:
> > cntlzdm: Count Leading Zeros Doubleword Under Bit Mask
> > 
> > Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> > Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
> > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> > ---
> > v2:
> > - Inline implementation of cntlzdm
> > ---
> >   target/ppc/insn32.decode                   |  1 +
> >   target/ppc/translate/fixedpoint-impl.c.inc | 36 ++++++++++++++++++++++
> >   2 files changed, 37 insertions(+)
> > 
> > diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> > index 9cb9fc00b8..221cb00dd6 100644
> > --- a/target/ppc/insn32.decode
> > +++ b/target/ppc/insn32.decode
> > @@ -203,6 +203,7 @@ ADDPCIS         010011 ..... ..... .......... 00010 .   @DX
> >   ## Fixed-Point Logical Instructions
> >   CFUGED          011111 ..... ..... ..... 0011011100 -   @X
> > +CNTLZDM         011111 ..... ..... ..... 0000111011 -   @X
> >   ### Float-Point Load Instructions
> > diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
> > index 0d9c6e0996..c9e9ae35df 100644
> > --- a/target/ppc/translate/fixedpoint-impl.c.inc
> > +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> > @@ -413,3 +413,39 @@ static bool trans_CFUGED(DisasContext *ctx, arg_X *a)
> >   #endif
> >       return true;
> >   }
> > +
> > +#if defined(TARGET_PPC64)
> > +static void do_cntlzdm(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 mask)
> > +{
> > +    TCGv_i64 tmp;
> > +    TCGLabel *l1;
> > +
> > +    tmp = tcg_temp_local_new_i64();
> > +    l1 = gen_new_label();
> > +
> > +    tcg_gen_and_i64(tmp, src, mask);
> > +    tcg_gen_clzi_i64(tmp, tmp, 64);
> > +
> > +    tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, l1);
> > +
> > +    tcg_gen_subfi_i64(tmp, 64, tmp);
> > +    tcg_gen_shr_i64(tmp, mask, tmp);
> > +    tcg_gen_ctpop_i64(tmp, tmp);
> > +
> > +    gen_set_label(l1);
> > +
> > +    tcg_gen_mov_i64(dst, tmp);
> 
> This works, but a form without brcond would be better (due to how poorly tcg
> handles basic blocks).
> 
> How about
> 
>     tcg_gen_clzi_i64(tmp, tmp, 0);
> 
>     tcg_gen_xori_i64(tmp, tmp, 63);
>     tcg_gen_shr_i64(tmp, mask, tmp);
>     tcg_gen_shri_i64(tmp, tmp, 1);
> 
>     tcg_gen_ctpop_i64(dst, tmp);

I've applied this to ppc-for-6.2.  You can make this improvement as a
followup if you want.

> 
> The middle 3 operations perform a shift between [1-64], such that we are 
> assured of 0 for 64.
> 
> Either way,
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~
> 
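
For anyone comparing the two sequences quoted above, here is a rough
plain-C model of each, next to a bit-by-bit reference loop. This is only
a sketch and not QEMU code: the helper names (clz64_or, popcount64,
cntlzdm_ref, cntlzdm_branch, cntlzdm_nobranch) are made up for
illustration, and clz64_or assumes that tcg_gen_clzi_i64 yields its
immediate argument when the input is zero. In this model the two
sequences agree whenever (src & mask) is non-zero. When it is zero, the
branching version returns the population count of the mask, while the
branch-free sequence as quoted returns 0, so a follow-up would probably
need to treat that case separately.

```c
#include <stdint.h>

/* Model of a "clz with default": leading-zero count, or dflt when x == 0. */
static uint64_t clz64_or(uint64_t x, uint64_t dflt)
{
    uint64_t n = 0;

    if (x == 0) {
        return dflt;
    }
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Portable population count (clears the lowest set bit each iteration). */
static uint64_t popcount64(uint64_t x)
{
    uint64_t n = 0;

    for (; x; x &= x - 1) {
        n++;
    }
    return n;
}

/*
 * Reference: scan from the most significant bit down, counting mask bits
 * until a masked source bit is found set.
 */
static uint64_t cntlzdm_ref(uint64_t src, uint64_t mask)
{
    uint64_t count = 0;

    for (int i = 63; i >= 0; i--) {
        if ((mask >> i) & 1) {
            if ((src >> i) & 1) {
                break;
            }
            count++;
        }
    }
    return count;
}

/* Model of the branching sequence from the patch. */
static uint64_t cntlzdm_branch(uint64_t src, uint64_t mask)
{
    uint64_t tmp = clz64_or(src & mask, 64);

    if (tmp == 0) {          /* brcondi: top bit set, result is 0 */
        return 0;
    }
    tmp = 64 - tmp;          /* subfi: shift amount in [0, 63] */
    return popcount64(mask >> tmp);
}

/* Model of the branch-free sequence as quoted in the review. */
static uint64_t cntlzdm_nobranch(uint64_t src, uint64_t mask)
{
    uint64_t tmp = clz64_or(src & mask, 0);

    tmp ^= 63;               /* 63 - clz for clz in [0, 63] */
    tmp = mask >> tmp;       /* shift by [0, 63] ... */
    tmp >>= 1;               /* ... plus one more, so [1, 64] in total */
    return popcount64(tmp);
}
```

Splitting the shift into a variable part and a fixed shift by one keeps
each individual shift amount below 64, which is what lets the quoted
sequence drop the branch.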

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
