|
From: | Gustavo Romero |
Subject: | Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions |
Date: | Wed, 16 Dec 2020 05:01:29 -0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1 |
Hi David, Thanks a lot for the review. Please find my comments inline. On 10/8/20 9:43 PM, David Gibson wrote:
On Mon, Oct 05, 2020 at 01:03:13AM -0300, Gustavo Romero wrote:From: Michael Roth <mdroth@linux.vnet.ibm.com>Probably a good idea to CC future spins to Richard Henderson <rth@twiddle.net> - by knowledge of how TCG works is only middling.
OK.
Some prefixed instructions (Type 0 and 1, e.g. 8-byte Load/Store or 8LS), have a completely seperate namespace for their primary opcodes. Other prefixed instructions (Type 2 and 3, e.g. Modified Load/Store or MLS) actually re-use existing opcodes to provide a modified variant. We could handle these by extending the existing opcode handlers to check for the prefix and handle accordingly, but in some cases it is cleaner to define seperate handlers for these in their own table entry, and avoids the need to add error-handling to existing handlers for cases where they are called for a prefixed Type 2/3 instruction but don't implement the handling for them. In the future we can re-work things to support both approaches if cases arise where that seems warranted. Then we have the normal non-prefixed instructions. To handle all 3 of these cases we extend the table size 3x, with normal instructions indexed directly by their primary opcodes, Type 0/1 indexed by primary opcode + PPC_CPU_PREFIXED_OPCODE_OFFSET, and Type 2/3 indexed by primary opcode + PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET.Is there an advantage to having one table with magic offsets, rather than 3 separately declared tables?
Not that I can see in terms of size or performance. However since we are dealing here with the PO (primary opcode) for the prefixed insns as well I think it makes sense to have it related to the table that already handles the PO for the normal instruction. So the offset are not really magic, but 64 (PPC_CPU_PREFIXED_OPCODE_OFFSET) and 128 (PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET), because the slots for the normal opcodes is triplicated. In addition to Michael's comment above I've add the following comment in the code for v2: 982 /* 983 * The primary op-code field extracted from the instruction (PO) is used as the 984 * index in the top-level QEMU opcode table. That table is then further used to 985 * find the proper handler, or a pointer to another table (OPC1, OPC2 etc). For 986 * the normal opcodes (PO field of normal insns) the 64 first entries are used. 987 * 988 * Prefixed instructions can be divided into two major groups of instructions: 989 * the first group is formed by type 0 and 1, and the second group is by type 2 990 * and 3. Opcodes (PO) related to prefixed type 0/1 can have the same opcodes 991 * as the normal instructions but don't have any semantics related to them. But 992 * since it's an primary opcode anyway it was decided to simply extend the 993 * top-level table that handles the primary opcodes for the normal instructions. 994 * On the other handle, prefixed instructions type 2/3 work like modifiers for 995 * existing normal instructions. In that case it would be possible to reuse the 996 * handlers for the normal instructions, but that would require changing the 997 * normal instruction handler in a way it would have to check, for instance, on 998 * which ISA it has to take care of prefixes insns, and that is already done by 999 * GEN_HANDLER_* helpers when registering a new instruction, where there is a 1000 * a field to inform on which ISA the instruction exists. Hence, it was decided 1001 * to add another 64 entries for specific handler for the prefixed instructions 1002 * of type 2/3. Thus the size of top-level PO lookup table is 3*64 = 192, hence 1003 * representing three namespaces for decoding PO: for normal insns, for type 0/1 1004 * and for type 2/3. 1005 */
Various exception/SRR handling changes related to prefixed instructions are yet to be implemented. For instance, no alignment interrupt is generated if a prefixed instruction crosses the 64-byte boundary; no SRR bit related to prefixed instructions is set on any interrupt, like for FP unavailable interrupt, data storage interrupt, etc. For details, please see PowerISA v3.1, section 1.6.3 and chapter 7, particularly the new changes regarding the prefixed instructions. Signed-off-by: Michael Roth <mroth@lamentation.net> [ gromero: - comments clean up and updates - additional comments in the commit log ] Signed-off-by: Gustavo Romero <gromero@linux.ibm.com> --- target/ppc/cpu.h | 4 +- target/ppc/internal.h | 14 +++ target/ppc/translate.c | 167 +++++++++++++++++++++++++++++++- target/ppc/translate_init.c.inc | 11 ++- 4 files changed, 190 insertions(+), 6 deletions(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index 766e9c5c26..c5de33572c 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -979,8 +979,10 @@ struct ppc_radix_page_info { #define PPC_TLB_EPID_LOAD 8 #define PPC_TLB_EPID_STORE 9-#define PPC_CPU_OPCODES_LEN 0x40+#define PPC_CPU_OPCODES_LEN 0xc0 #define PPC_CPU_INDIRECT_OPCODES_LEN 0x20 +#define PPC_CPU_PREFIXED_OPCODE_OFFSET 0x40 +#define PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET 0x80Might be worth including some of that description from the commit message as a comment here (or on the table declaration itself).
Yes. Fixed for v2.
struct CPUPPCState { /* Most commonly used resources during translated code execution first */ diff --git a/target/ppc/internal.h b/target/ppc/internal.h index 15d655b356..d03d691a44 100644 --- a/target/ppc/internal.h +++ b/target/ppc/internal.h @@ -129,6 +129,7 @@ EXTRACT_SHELPER(SIMM5, 16, 5); EXTRACT_HELPER(UIMM5, 16, 5); /* 4 bits unsigned immediate value */ EXTRACT_HELPER(UIMM4, 16, 4); + /* Bit count */ EXTRACT_HELPER(NB, 11, 5); /* Shift count */ @@ -146,6 +147,19 @@ EXTRACT_HELPER(TO, 21, 5);EXTRACT_HELPER(CRM, 12, 8); +/*+ * Instruction prefix fields + * + * as per PowerISA 3.1 1.6.3 + */ + +/* prefixed instruction type */ +EXTRACT_HELPER(PREFIX_TYPE, 24, 2); +/* 1-bit sub-type */ +EXTRACT_HELPER(PREFIX_ST1, 23, 1); +/* 4-bit sub-type */ +EXTRACT_HELPER(PREFIX_ST4, 20, 4); + #ifndef CONFIG_USER_ONLY EXTRACT_HELPER(SR, 16, 4); #endif diff --git a/target/ppc/translate.c b/target/ppc/translate.c index fedb9b2271..96c2997d3f 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -180,6 +180,8 @@ struct DisasContext { uint32_t flags; uint64_t insns_flags; uint64_t insns_flags2; + uint32_t prefix; + uint32_t prefix_subtype; };/* Return true iff byteswap is needed in a scalar memop */@@ -376,6 +378,12 @@ GEN_OPCODE(name, opc1, opc2, opc3, inval, type, PPC_NONE) #define GEN_HANDLER_E(name, opc1, opc2, opc3, inval, type, type2) \ GEN_OPCODE(name, opc1, opc2, opc3, inval, type, type2)+#define GEN_HANDLER_E_PREFIXED(name, opc1, opc2, opc3, inval, type, type2) \+GEN_OPCODE_PREFIXED(name, opc1, opc2, opc3, inval, type, type2, false) + +#define GEN_HANDLER_E_PREFIXED_M(name, opc1, opc2, opc3, inval, type, type2) \ +GEN_OPCODE_PREFIXED(name, opc1, opc2, opc3, inval, type, type2, true) + #define GEN_HANDLER2(name, onam, opc1, opc2, opc3, inval, type) \ GEN_OPCODE2(name, onam, opc1, opc2, opc3, inval, type, PPC_NONE)@@ -395,6 +403,8 @@ typedef struct opcode_t {#endif opc_handler_t handler; const char *oname; + bool prefixed; + bool modified; } opcode_t;/* Helpers for priv. check */@@ -449,6 +459,23 @@ typedef struct opcode_t { }, \ .oname = stringify(name), \ } +#define GEN_OPCODE_PREFIXED(name, op1, op2, op3, invl, _typ, _typ2, _modified)\ +{ \ + .opc1 = op1, \ + .opc2 = op2, \ + .opc3 = op3, \ + .opc4 = 0xff, \ + .handler = { \ + .inval1 = invl, \ + .type = _typ, \ + .type2 = _typ2, \ + .handler = &gen_##name, \ + .oname = stringify(name), \ + }, \ + .oname = stringify(name), \ + .prefixed = true, \ + .modified = _modified, \ +} #define GEN_OPCODE_DUAL(name, op1, op2, op3, invl1, invl2, _typ, _typ2) \ { \ .opc1 = op1, \ @@ -525,6 +552,22 @@ typedef struct opcode_t { }, \ .oname = stringify(name), \ } +#define GEN_OPCODE_PREFIXED(name, op1, op2, op3, invl, _typ, _typ2, _modified)\ +{ \ + .opc1 = op1, \ + .opc2 = op2, \ + .opc3 = op3, \ + .opc4 = 0xff, \ + .handler = { \ + .inval1 = invl, \ + .type = _typ, \ + .type2 = _typ2, \ + .handler = &gen_##name, \ + }, \ + .oname = stringify(name), \ + .prefixed = true, \ + .modified = _modified, \ +} #define GEN_OPCODE_DUAL(name, op1, op2, op3, invl1, invl2, _typ, _typ2) \ { \ .opc1 = op1, \ @@ -7991,6 +8034,98 @@ static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs, return true; }++ +/* + * Prefixed instruction sub-types + * + * as per PowerISA 3.1 1.6.3 + */ +enum { + /* non-prefixed instruction */Two things are labelled as "non-prefixed instruction" here..
Fixed for v2.
+ PREFIX_ST_INVALID = -1, + /* non-prefixed instruction */ + PREFIX_ST_NONE = 0, + /* Type 00 prefixed insn sub-types */I'm guessing the types here are in binary, maybe list as 0b00 0b01 etc for clarity.
ISA actually uses like that, but I think it's good to clarify here it's a binary. Fixed for v2.
+ PREFIX_ST_8LS, /* 8-byte load/store */ + PREFIX_ST_8MLS, /* 8-byte masked load/store */ + /* Type 01 prefixed insn sub-types */ + PREFIX_ST_8RR, /* 8-byte reg-to-reg */ + PREFIX_ST_8MRR, /* 8-byte masked reg-to-reg */ + /* Type 10 prefixed insn sub-types */ + PREFIX_ST_MLS, /* modified load/store */ + PREFIX_ST_MMLS, /* modified masked load/store */ + /* Type 11 prefixed insn sub-types */ + PREFIX_ST_MRR, /* modified reg-to-reg */ + PREFIX_ST_MMRR, /* modified mask reg-to-reg */ + PREFIX_ST_MMIRR, /* modified masked immediate reg-to-reg */ +}; + +static int32_t parse_prefix_subtype(uint32_t prefix) +{ + int32_t prefix_subtype = PREFIX_ST_INVALID; + + /* primary opcode 1 is reserved for instruction prefixes */ + if (opc1(prefix) != 1) { + return PREFIX_ST_NONE; + } + + switch (PREFIX_TYPE(prefix)) { + case 0: + if (PREFIX_ST1(prefix) == 0) { + prefix_subtype = PREFIX_ST_8LS; + } else if (PREFIX_ST1(prefix) == 1) { + prefix_subtype = PREFIX_ST_8MLS; + } + break; + case 1: + if (PREFIX_ST4(prefix) == 0) { + prefix_subtype = PREFIX_ST_8RR; + } else if (PREFIX_ST4(prefix) == 8) { + prefix_subtype = PREFIX_ST_8MRR; + }Is this fallthrough intentional? If so it should be commented.
No. Fixed for v2.
+ case 2: + if (PREFIX_ST1(prefix) == 0) { + prefix_subtype = PREFIX_ST_MLS; + } else if (PREFIX_ST1(prefix) == 1) { + prefix_subtype = PREFIX_ST_MMLS; + } + break; + case 3: + if (PREFIX_ST4(prefix) == 0) { + prefix_subtype = PREFIX_ST_MRR; + } else if (PREFIX_ST4(prefix) == 8) { + prefix_subtype = PREFIX_ST_MMRR; + } else if (PREFIX_ST4(prefix) == 9) { + prefix_subtype = PREFIX_ST_MMIRR; + }Fallthrough here doesn't matter, but probably better to have a break; for consistency.
Right. Fixed in opc1_idx() as well for v2.
+ } + + return prefix_subtype; +} + +static uint32_t opc1_idx(DisasContext *ctx) +{ + uint32_t table_offset = 0; + + switch (ctx->prefix_subtype) { + case PREFIX_ST_8LS: + case PREFIX_ST_8MLS: + case PREFIX_ST_8RR: + case PREFIX_ST_8MRR: + table_offset = PPC_CPU_PREFIXED_OPCODE_OFFSET; + break; + case PREFIX_ST_MLS: + case PREFIX_ST_MMLS: + case PREFIX_ST_MRR: + case PREFIX_ST_MMRR: + case PREFIX_ST_MMIRR: + table_offset = PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET; + } + + return table_offset + opc1(ctx->opcode); +} + static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs) { DisasContext *ctx = container_of(dcbase, DisasContext, base); @@ -8004,14 +8139,40 @@ static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)ctx->opcode = translator_ldl_swap(env, ctx->base.pc_next,need_byteswap(ctx)); + /* check for prefix */ + ctx->prefix_subtype = parse_prefix_subtype(ctx->opcode); + if (ctx->prefix_subtype == PREFIX_ST_INVALID) { + qemu_log_mask(LOG_GUEST_ERROR, "invalid/unsupported prefix: " + "%08x " TARGET_FMT_lx "\n", + ctx->prefix, ctx->base.pc_next);We should generate a 0x700 here, shouldn't we?
I don't know for sure. I could not find any document saying explicitly what we should do when subtype is invalid. Nor that an invalid subtype means an invalid instruction. In the past for some TM instructions the User Manual would specify how to treat certain invalid fields in the instructions... like just ignoring it or even treating like certain valid form (!). Thanks, Gustavo
[Prev in Thread] | Current Thread | [Next in Thread] |