qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 07/34] target/ppc: Implement cntlzdm


From: Richard Henderson
Subject: Re: [PATCH v2 07/34] target/ppc: Implement cntlzdm
Date: Sat, 30 Oct 2021 14:17:07 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 10/29/21 1:23 PM, matheus.ferst@eldorado.org.br wrote:
From: Luis Pires <luis.pires@eldorado.org.br>

Implement the following PowerISA v3.1 instruction:
cntlzdm: Count Leading Zeros Doubleword Under Bit Mask

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v2:
- Inline implementation of cntlzdm
---
  target/ppc/insn32.decode                   |  1 +
  target/ppc/translate/fixedpoint-impl.c.inc | 36 ++++++++++++++++++++++
  2 files changed, 37 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 9cb9fc00b8..221cb00dd6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -203,6 +203,7 @@ ADDPCIS         010011 ..... ..... .......... 00010 .   @DX
  ## Fixed-Point Logical Instructions
CFUGED 011111 ..... ..... ..... 0011011100 - @X
+CNTLZDM         011111 ..... ..... ..... 0000111011 -   @X
### Float-Point Load Instructions diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
index 0d9c6e0996..c9e9ae35df 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -413,3 +413,39 @@ static bool trans_CFUGED(DisasContext *ctx, arg_X *a)
  #endif
      return true;
  }
+
+#if defined(TARGET_PPC64)
+static void do_cntlzdm(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 mask)
+{
+    TCGv_i64 tmp;
+    TCGLabel *l1;
+
+    tmp = tcg_temp_local_new_i64();
+    l1 = gen_new_label();
+
+    tcg_gen_and_i64(tmp, src, mask);
+    tcg_gen_clzi_i64(tmp, tmp, 64);
+
+    tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, l1);
+
+    tcg_gen_subfi_i64(tmp, 64, tmp);
+    tcg_gen_shr_i64(tmp, mask, tmp);
+    tcg_gen_ctpop_i64(tmp, tmp);
+
+    gen_set_label(l1);
+
+    tcg_gen_mov_i64(dst, tmp);

This works, but a form without brcond would be better (due to how poorly tcg handles basic blocks).

How about

    tcg_gen_clzi_i64(tmp, tmp, 0);

    tcg_gen_xori_i64(tmp, tmp, 63);
    tcg_gen_shr_i64(tmp, mask, tmp);
    tcg_gen_shri_i64(tmp, tmp, 1);

    tcg_gen_ctpop_i64(dst, tmp);

The middle 3 operations perform a shift between [1-64], such that we are 
assured of 0 for 64.

Either way,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]