Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions

qemu-ppc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions

From:	Gustavo Romero
Subject:	Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions
Date:	Wed, 16 Dec 2020 05:01:29 -0300
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1

Hi David,

Thanks a lot for the review. Please find my comments inline.

On 10/8/20 9:43 PM, David Gibson wrote:

On Mon, Oct 05, 2020 at 01:03:13AM -0300, Gustavo Romero wrote:

From: Michael Roth <mdroth@linux.vnet.ibm.com>


Probably a good idea to CC future spins to Richard Henderson
<rth@twiddle.net> - by knowledge of how TCG works is only middling.

OK.

Some prefixed instructions (Type 0 and 1, e.g. 8-byte Load/Store or 8LS),
have a completely seperate namespace for their primary opcodes.

Other prefixed instructions (Type 2 and 3, e.g. Modified Load/Store or MLS)
actually re-use existing opcodes to provide a modified variant. We could
handle these by extending the existing opcode handlers to check for the
prefix and handle accordingly, but in some cases it is cleaner to define
seperate handlers for these in their own table entry, and avoids the need
to add error-handling to existing handlers for cases where they are called
for a prefixed Type 2/3 instruction but don't implement the handling for
them. In the future we can re-work things to support both approaches if
cases arise where that seems warranted.

Then we have the normal non-prefixed instructions.

To handle all 3 of these cases we extend the table size 3x, with normal
instructions indexed directly by their primary opcodes, Type 0/1 indexed by
primary opcode + PPC_CPU_PREFIXED_OPCODE_OFFSET, and Type 2/3 indexed by
primary opcode + PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET.


Is there an advantage to having one table with magic offsets, rather
than 3 separately declared tables?


Not that I can see in terms of size or performance. However since we
are dealing here with the PO (primary opcode) for the prefixed insns
as well I think it makes sense to have it related to the table that
already handles the PO for the normal instruction. So the offset are
not really magic, but 64 (PPC_CPU_PREFIXED_OPCODE_OFFSET) and 128
(PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET), because the slots for the
normal opcodes is triplicated.

In addition to Michael's comment above I've add the following comment
in the code for v2:

 982 /*
 983  * The primary op-code field extracted from the instruction (PO) is used 
as the
 984  * index in the top-level QEMU opcode table. That table is then further 
used to
 985  * find the proper handler, or a pointer to another table (OPC1, OPC2 
etc). For
 986  * the normal opcodes (PO field of normal insns) the 64 first entries are 
used.
 987  *
 988  * Prefixed instructions can be divided into two major groups of 
instructions:
 989  * the first group is formed by type 0 and 1, and the second group is by 
type 2
 990  * and 3. Opcodes (PO) related to prefixed type 0/1 can have the same 
opcodes
 991  * as the normal instructions but don't have any semantics related to 
them. But
 992  * since it's an primary opcode anyway it was decided to simply extend the
 993  * top-level table that handles the primary opcodes for the normal 
instructions.
 994  * On the other handle, prefixed instructions type 2/3 work like modifiers 
for
 995  * existing normal instructions. In that case it would be possible to 
reuse the
 996  * handlers for the normal instructions, but that would require changing 
the
 997  * normal instruction handler in a way it would have to check, for 
instance, on
 998  * which ISA it has to take care of prefixes insns, and that is already 
done by
 999  * GEN_HANDLER_* helpers when registering a new instruction, where there 
is a
1000  * a field to inform on which ISA the instruction exists. Hence, it was 
decided
1001  * to add another 64 entries for specific handler for the prefixed 
instructions
1002  * of type 2/3. Thus the size of top-level PO lookup table is 3*64 = 192, 
hence
1003  * representing three namespaces for decoding PO: for normal insns, for 
type 0/1
1004  * and for type 2/3.
1005  */

Various exception/SRR handling changes related to prefixed instructions are
yet to be implemented. For instance, no alignment interrupt is generated if
a prefixed instruction crosses the 64-byte boundary; no SRR bit related to
prefixed instructions is set on any interrupt, like for FP unavailable
interrupt, data storage interrupt, etc.

For details, please see PowerISA v3.1, section 1.6.3 and chapter 7,
particularly the new changes regarding the prefixed instructions.

Signed-off-by: Michael Roth <mroth@lamentation.net>
[ gromero: - comments clean up and updates
            - additional comments in the commit log ]
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
---
  target/ppc/cpu.h                |   4 +-
  target/ppc/internal.h           |  14 +++
  target/ppc/translate.c          | 167 +++++++++++++++++++++++++++++++-
  target/ppc/translate_init.c.inc |  11 ++-
  4 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 766e9c5c26..c5de33572c 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -979,8 +979,10 @@ struct ppc_radix_page_info {
  #define PPC_TLB_EPID_LOAD 8
  #define PPC_TLB_EPID_STORE 9

-#define PPC_CPU_OPCODES_LEN 0x40

+#define PPC_CPU_OPCODES_LEN          0xc0
  #define PPC_CPU_INDIRECT_OPCODES_LEN 0x20
+#define PPC_CPU_PREFIXED_OPCODE_OFFSET 0x40
+#define PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET 0x80


Might be worth including some of that description from the commit
message as a comment here (or on the table declaration itself).


Yes. Fixed for v2.

  struct CPUPPCState {
      /* Most commonly used resources during translated code execution first */
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index 15d655b356..d03d691a44 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -129,6 +129,7 @@ EXTRACT_SHELPER(SIMM5, 16, 5);
  EXTRACT_HELPER(UIMM5, 16, 5);
  /* 4 bits unsigned immediate value */
  EXTRACT_HELPER(UIMM4, 16, 4);
+
  /* Bit count */
  EXTRACT_HELPER(NB, 11, 5);
  /* Shift count */
@@ -146,6 +147,19 @@ EXTRACT_HELPER(TO, 21, 5);

EXTRACT_HELPER(CRM, 12, 8);+/*

+ * Instruction prefix fields
+ *
+ * as per PowerISA 3.1 1.6.3
+ */
+
+/* prefixed instruction type */
+EXTRACT_HELPER(PREFIX_TYPE, 24, 2);
+/* 1-bit sub-type */
+EXTRACT_HELPER(PREFIX_ST1, 23, 1);
+/* 4-bit sub-type */
+EXTRACT_HELPER(PREFIX_ST4, 20, 4);
+
  #ifndef CONFIG_USER_ONLY
  EXTRACT_HELPER(SR, 16, 4);
  #endif
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index fedb9b2271..96c2997d3f 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -180,6 +180,8 @@ struct DisasContext {
      uint32_t flags;
      uint64_t insns_flags;
      uint64_t insns_flags2;
+    uint32_t prefix;
+    uint32_t prefix_subtype;
  };

/* Return true iff byteswap is needed in a scalar memop */

@@ -376,6 +378,12 @@ GEN_OPCODE(name, opc1, opc2, opc3, inval, type, PPC_NONE)
  #define GEN_HANDLER_E(name, opc1, opc2, opc3, inval, type, type2)             
\
  GEN_OPCODE(name, opc1, opc2, opc3, inval, type, type2)

+#define GEN_HANDLER_E_PREFIXED(name, opc1, opc2, opc3, inval, type, type2) \

+GEN_OPCODE_PREFIXED(name, opc1, opc2, opc3, inval, type, type2, false)
+
+#define GEN_HANDLER_E_PREFIXED_M(name, opc1, opc2, opc3, inval, type, type2)  \
+GEN_OPCODE_PREFIXED(name, opc1, opc2, opc3, inval, type, type2, true)
+
  #define GEN_HANDLER2(name, onam, opc1, opc2, opc3, inval, type)               
\
  GEN_OPCODE2(name, onam, opc1, opc2, opc3, inval, type, PPC_NONE)

@@ -395,6 +403,8 @@ typedef struct opcode_t {

  #endif
      opc_handler_t handler;
      const char *oname;
+    bool prefixed;
+    bool modified;
  } opcode_t;

/* Helpers for priv. check */

@@ -449,6 +459,23 @@ typedef struct opcode_t {
      },                                                                        
\
      .oname = stringify(name),                                                 
\
  }
+#define GEN_OPCODE_PREFIXED(name, op1, op2, op3, invl, _typ, _typ2, _modified)\
+{                                                                             \
+    .opc1 = op1,                                                              \
+    .opc2 = op2,                                                              \
+    .opc3 = op3,                                                              \
+    .opc4 = 0xff,                                                             \
+    .handler = {                                                              \
+        .inval1  = invl,                                                      \
+        .type = _typ,                                                         \
+        .type2 = _typ2,                                                       \
+        .handler = &gen_##name,                                               \
+        .oname = stringify(name),                                             \
+    },                                                                        \
+    .oname = stringify(name),                                                 \
+    .prefixed = true,                                                         \
+    .modified = _modified,                                                    \
+}
  #define GEN_OPCODE_DUAL(name, op1, op2, op3, invl1, invl2, _typ, _typ2)       
\
  {                                                                             
\
      .opc1 = op1,                                                              
\
@@ -525,6 +552,22 @@ typedef struct opcode_t {
      },                                                                        
\
      .oname = stringify(name),                                                 
\
  }
+#define GEN_OPCODE_PREFIXED(name, op1, op2, op3, invl, _typ, _typ2, _modified)\
+{                                                                             \
+    .opc1 = op1,                                                              \
+    .opc2 = op2,                                                              \
+    .opc3 = op3,                                                              \
+    .opc4 = 0xff,                                                             \
+    .handler = {                                                              \
+        .inval1  = invl,                                                      \
+        .type = _typ,                                                         \
+        .type2 = _typ2,                                                       \
+        .handler = &gen_##name,                                               \
+    },                                                                        \
+    .oname = stringify(name),                                                 \
+    .prefixed = true,                                                         \
+    .modified = _modified,                                                    \
+}
  #define GEN_OPCODE_DUAL(name, op1, op2, op3, invl1, invl2, _typ, _typ2)       
\
  {                                                                             
\
      .opc1 = op1,                                                              
\
@@ -7991,6 +8034,98 @@ static bool ppc_tr_breakpoint_check(DisasContextBase 
*dcbase, CPUState *cs,
      return true;
  }

+

+
+/*
+ * Prefixed instruction sub-types
+ *
+ * as per PowerISA 3.1 1.6.3
+ */
+enum {
+    /* non-prefixed instruction */


Two things are labelled as "non-prefixed instruction" here..


Fixed for v2.

+    PREFIX_ST_INVALID   = -1,
+    /* non-prefixed instruction */
+    PREFIX_ST_NONE      = 0,
+    /* Type 00 prefixed insn sub-types */


I'm guessing the types here are in binary, maybe list as 0b00 0b01 etc
for clarity.


ISA actually uses like that, but I think it's good to clarify
here it's a binary. Fixed for v2.

+    PREFIX_ST_8LS,      /* 8-byte load/store */
+    PREFIX_ST_8MLS,     /* 8-byte masked load/store */
+    /* Type 01 prefixed insn sub-types */
+    PREFIX_ST_8RR,      /* 8-byte reg-to-reg */
+    PREFIX_ST_8MRR,     /* 8-byte masked reg-to-reg */
+    /* Type 10 prefixed insn sub-types */
+    PREFIX_ST_MLS,      /* modified load/store */
+    PREFIX_ST_MMLS,     /* modified masked load/store */
+    /* Type 11 prefixed insn sub-types */
+    PREFIX_ST_MRR,      /* modified reg-to-reg */
+    PREFIX_ST_MMRR,     /* modified mask reg-to-reg */
+    PREFIX_ST_MMIRR,    /* modified masked immediate reg-to-reg */
+};
+
+static int32_t parse_prefix_subtype(uint32_t prefix)
+{
+    int32_t prefix_subtype = PREFIX_ST_INVALID;
+
+    /* primary opcode 1 is reserved for instruction prefixes */
+    if (opc1(prefix) != 1) {
+        return PREFIX_ST_NONE;
+    }
+
+    switch (PREFIX_TYPE(prefix)) {
+    case 0:
+        if (PREFIX_ST1(prefix) == 0) {
+            prefix_subtype = PREFIX_ST_8LS;
+        } else if (PREFIX_ST1(prefix) == 1) {
+            prefix_subtype = PREFIX_ST_8MLS;
+        }
+        break;
+    case 1:
+        if (PREFIX_ST4(prefix) == 0) {
+            prefix_subtype = PREFIX_ST_8RR;
+        } else if (PREFIX_ST4(prefix) == 8) {
+            prefix_subtype = PREFIX_ST_8MRR;
+        }


Is this fallthrough intentional?  If so it should be commented.


No. Fixed for v2.

+    case 2:
+        if (PREFIX_ST1(prefix) == 0) {
+            prefix_subtype = PREFIX_ST_MLS;
+        } else if (PREFIX_ST1(prefix) == 1) {
+            prefix_subtype = PREFIX_ST_MMLS;
+        }
+        break;
+    case 3:
+        if (PREFIX_ST4(prefix) == 0) {
+            prefix_subtype = PREFIX_ST_MRR;
+        } else if (PREFIX_ST4(prefix) == 8) {
+            prefix_subtype = PREFIX_ST_MMRR;
+        } else if (PREFIX_ST4(prefix) == 9) {
+            prefix_subtype = PREFIX_ST_MMIRR;
+        }


Fallthrough here doesn't matter, but probably better to have a break;
for consistency.


Right. Fixed in opc1_idx() as well for v2.

+    }
+
+    return prefix_subtype;
+}
+
+static uint32_t opc1_idx(DisasContext *ctx)
+{
+    uint32_t table_offset = 0;
+
+    switch (ctx->prefix_subtype) {
+    case PREFIX_ST_8LS:
+    case PREFIX_ST_8MLS:
+    case PREFIX_ST_8RR:
+    case PREFIX_ST_8MRR:
+        table_offset = PPC_CPU_PREFIXED_OPCODE_OFFSET;
+        break;
+    case PREFIX_ST_MLS:
+    case PREFIX_ST_MMLS:
+    case PREFIX_ST_MRR:
+    case PREFIX_ST_MMRR:
+    case PREFIX_ST_MMIRR:
+        table_offset = PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET;
+    }
+
+    return table_offset + opc1(ctx->opcode);
+}
+
  static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
  {
      DisasContext *ctx = container_of(dcbase, DisasContext, base);
@@ -8004,14 +8139,40 @@ static void ppc_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cs)

ctx->opcode = translator_ldl_swap(env, ctx->base.pc_next,

                                        need_byteswap(ctx));
+    /* check for prefix */
+    ctx->prefix_subtype = parse_prefix_subtype(ctx->opcode);
+    if (ctx->prefix_subtype == PREFIX_ST_INVALID) {
+        qemu_log_mask(LOG_GUEST_ERROR, "invalid/unsupported prefix: "
+                      "%08x " TARGET_FMT_lx "\n",
+                      ctx->prefix, ctx->base.pc_next);


We should generate a 0x700 here, shouldn't we?


I don't know for sure. I could not find any document saying
explicitly what we should do when subtype is invalid. Nor
that an invalid subtype means an invalid instruction. In the
past for some TM instructions the User Manual would specify
how to treat certain invalid fields in the instructions... like
just ignoring it or even treating like certain valid form (!).


Thanks,
Gustavo

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions, Gustavo Romero <=
- Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions, David Gibson, 2020/12/16
  - Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions, Peter Maydell, 2020/12/17

Prev by Date: Re: [PATCH] spapr: Fix DR properties of the root node
Next by Date: Re: [PATCH 2/7] target/ppc: Add support for prefixed load/store instructions
Previous by thread: [PATCH] configure: Fail when specified cross compiler cannot be found
Next by thread: Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions
Index(es):
- Date
- Thread