qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Base for adding PowerPC 64-bit instructions
@ 2021-04-13 21:11 Luis Pires
  2021-04-13 21:11 ` [PATCH 1/5] decodetree: Add support for " Luis Pires
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Luis Pires @ 2021-04-13 21:11 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: richard.henderson, f4bug, Luis Pires, lagarcia, bruno.larsen,
	matheus.ferst, david

This series provides the basic infrastructure for adding the new
32/64-bit instructions in Power ISA 3.1 to target/ppc.

It starts by changing decodetree.py to support 64-bit instructions
as well as custom variable-width instruction load functions.

Then it changes the target/ppc code to allow 32- and 64-bit instructions
to be decoded using decodetree, and finishes by adding the implementation
for 2 simple instructions to demonstrate the new approach:
- addi (replacing the legacy implementation)
- paddi (new)

Luis Pires (5):
  decodetree: Add support for 64-bit instructions
  decodetree: Fix empty input files for varinsnwidth
  decodetree: Allow custom var width load functions
  target/ppc: Base changes to allow 32/64-bit insns
  target/ppc: Implement paddi and replace addi insns

 docs/devel/decodetree.rst                  |   5 +-
 scripts/decodetree.py                      |  55 ++++--
 target/ppc/cpu.h                           |   1 +
 target/ppc/meson.build                     |   5 +
 target/ppc/ppc.decode                      |  26 +++
 target/ppc/translate.c                     | 206 +++++++++++++++------
 target/ppc/translate/fixedpoint-impl.c.inc |  26 +++
 7 files changed, 250 insertions(+), 74 deletions(-)
 create mode 100644 target/ppc/ppc.decode
 create mode 100644 target/ppc/translate/fixedpoint-impl.c.inc

-- 
2.25.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] decodetree: Add support for 64-bit instructions
  2021-04-13 21:11 [PATCH 0/5] Base for adding PowerPC 64-bit instructions Luis Pires
@ 2021-04-13 21:11 ` Luis Pires
  2021-04-13 21:11 ` [PATCH 2/5] decodetree: Fix empty input files for varinsnwidth Luis Pires
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Luis Pires @ 2021-04-13 21:11 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: richard.henderson, f4bug, Luis Pires, lagarcia, bruno.larsen,
	matheus.ferst, david

Allow '64' to be specified for the instruction width command line params
and use the appropriate insn/field data types, mask, extract and deposit
functions in that case.

This will be used to implement the new 64-bit Power ISA 3.1 instructions.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
---
 docs/devel/decodetree.rst |  5 +++--
 scripts/decodetree.py     | 26 +++++++++++++++++++++-----
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/docs/devel/decodetree.rst b/docs/devel/decodetree.rst
index 74f66bf46e..d776dae14f 100644
--- a/docs/devel/decodetree.rst
+++ b/docs/devel/decodetree.rst
@@ -40,8 +40,9 @@ and returns an integral value extracted from there.
 
 A field with no ``unnamed_fields`` and no ``!function`` is in error.
 
-FIXME: the fields of the structure into which this result will be stored
-is restricted to ``int``.  Which means that we cannot expand 64-bit items.
+The fields of the structure into which this result will be stored are
+defined as ``int`` when the instruction size is set to 16 or 32 bits
+and as ``int64_t`` when the instruction size is set to 64 bits.
 
 Field examples:
 
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index 4637b633e7..4e18f52a65 100644
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -42,6 +42,10 @@
 output_fd = None
 insntype = 'uint32_t'
 decode_function = 'decode'
+field_data_type = 'int'
+extract_function = 'extract32'
+sextract_function = 'sextract32'
+deposit_function = 'deposit32'
 
 # An identifier for C.
 re_C_ident = '[a-zA-Z][a-zA-Z0-9_]*'
@@ -185,9 +189,9 @@ def __str__(self):
 
     def str_extract(self):
         if self.sign:
-            extr = 'sextract32'
+            extr = sextract_function
         else:
-            extr = 'extract32'
+            extr = extract_function
         return '{0}(insn, {1}, {2})'.format(extr, self.pos, self.len)
 
     def __eq__(self, other):
@@ -215,8 +219,9 @@ def str_extract(self):
             if pos == 0:
                 ret = f.str_extract()
             else:
-                ret = 'deposit32({0}, {1}, {2}, {3})' \
-                      .format(ret, pos, 32 - pos, f.str_extract())
+                ret = '{4}({0}, {1}, {2}, {3})' \
+                      .format(ret, pos, insnwidth - pos, f.str_extract(),
+                              deposit_function)
             pos += f.len
         return ret
 
@@ -311,7 +316,7 @@ def output_def(self):
         if not self.extern:
             output('typedef struct {\n')
             for n in self.fields:
-                output('    int ', n, ';\n')
+                output('    ', field_data_type, ' ', n, ';\n')
             output('} ', self.struct_name(), ';\n\n')
 # end Arguments
 
@@ -1264,6 +1269,10 @@ def main():
     global insntype
     global insnmask
     global decode_function
+    global extract_function
+    global sextract_function
+    global deposit_function
+    global field_data_type
     global variablewidth
     global anyextern
 
@@ -1293,6 +1302,13 @@ def main():
             if insnwidth == 16:
                 insntype = 'uint16_t'
                 insnmask = 0xffff
+            elif insnwidth == 64:
+                insntype = 'uint64_t'
+                insnmask = 0xffffffffffffffff
+                field_data_type = 'int64_t'
+                extract_function = 'extract64'
+                sextract_function = 'sextract64'
+                deposit_function = 'deposit64'
             elif insnwidth != 32:
                 error(0, 'cannot handle insns of width', insnwidth)
         else:
-- 
2.25.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 2/5] decodetree: Fix empty input files for varinsnwidth
  2021-04-13 21:11 [PATCH 0/5] Base for adding PowerPC 64-bit instructions Luis Pires
  2021-04-13 21:11 ` [PATCH 1/5] decodetree: Add support for " Luis Pires
@ 2021-04-13 21:11 ` Luis Pires
  2021-04-14 19:47   ` Richard Henderson
  2021-04-13 21:11 ` [PATCH 3/5] decodetree: Allow custom var width load functions Luis Pires
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Luis Pires @ 2021-04-13 21:11 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: richard.henderson, f4bug, Luis Pires, lagarcia, bruno.larsen,
	matheus.ferst, david

Decodetree would throw an error when the input file was empty
and --varinsnwidth was specified.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
---
 scripts/decodetree.py | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index 4e18f52a65..935b2964e0 100644
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -1178,11 +1178,12 @@ def output_code(self, i, extracted, outerbits, outermask):
         ind = str_indent(i)
 
         # If we need to load more bytes, do so now.
-        if extracted < self.width:
-            output(ind, 'insn = ', decode_function,
-                   '_load_bytes(ctx, insn, {0}, {1});\n'
-                   .format(extracted // 8, self.width // 8));
-            extracted = self.width
+        if self.width is not None:
+            if extracted < self.width:
+                output(ind, 'insn = ', decode_function,
+                       '_load_bytes(ctx, insn, {0}, {1});\n'
+                       .format(extracted // 8, self.width // 8));
+                extracted = self.width
         output(ind, 'return insn;\n')
 # end SizeLeaf
 
-- 
2.25.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/5] decodetree: Allow custom var width load functions
  2021-04-13 21:11 [PATCH 0/5] Base for adding PowerPC 64-bit instructions Luis Pires
  2021-04-13 21:11 ` [PATCH 1/5] decodetree: Add support for " Luis Pires
  2021-04-13 21:11 ` [PATCH 2/5] decodetree: Fix empty input files for varinsnwidth Luis Pires
@ 2021-04-13 21:11 ` Luis Pires
  2021-04-13 21:11 ` [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns Luis Pires
  2021-04-13 21:11 ` [PATCH 5/5] target/ppc: Implement paddi and replace addi insns Luis Pires
  4 siblings, 0 replies; 15+ messages in thread
From: Luis Pires @ 2021-04-13 21:11 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: richard.henderson, f4bug, Luis Pires, lagarcia, bruno.larsen,
	matheus.ferst, david

This is useful in situations where you want decodetree
to handle variable width instructions but you want to
provide custom code to load the instructions. Suppressing
the generation of the load function is necessary to avoid
compilation errors due to the load function being unused.

This will be used by the PowerPC decodetree code.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
---
 scripts/decodetree.py | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index 935b2964e0..a62b8d8d76 100644
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -1278,9 +1278,10 @@ def main():
     global anyextern
 
     decode_scope = 'static '
+    noloadfunc = False
 
     long_opts = ['decode=', 'translate=', 'output=', 'insnwidth=',
-                 'static-decode=', 'varinsnwidth=']
+                 'static-decode=', 'varinsnwidth=', 'noloadfunc']
     try:
         (opts, args) = getopt.gnu_getopt(sys.argv[1:], 'o:vw:', long_opts)
     except getopt.GetoptError as err:
@@ -1312,6 +1313,8 @@ def main():
                 deposit_function = 'deposit64'
             elif insnwidth != 32:
                 error(0, 'cannot handle insns of width', insnwidth)
+        elif o == '--noloadfunc':
+            noloadfunc = True
         else:
             assert False, 'unhandled option'
 
@@ -1401,12 +1404,13 @@ def main():
     output(i4, 'return false;\n')
     output('}\n')
 
-    if variablewidth:
-        output('\n', decode_scope, insntype, ' ', decode_function,
-               '_load(DisasContext *ctx)\n{\n',
-               '    ', insntype, ' insn = 0;\n\n')
-        stree.output_code(4, 0, 0, 0)
-        output('}\n')
+    if not noloadfunc:
+        if variablewidth:
+            output('\n', decode_scope, insntype, ' ', decode_function,
+                   '_load(DisasContext *ctx)\n{\n',
+                   '    ', insntype, ' insn = 0;\n\n')
+            stree.output_code(4, 0, 0, 0)
+            output('}\n')
 
     if output_file:
         output_fd.close()
-- 
2.25.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns
  2021-04-13 21:11 [PATCH 0/5] Base for adding PowerPC 64-bit instructions Luis Pires
                   ` (2 preceding siblings ...)
  2021-04-13 21:11 ` [PATCH 3/5] decodetree: Allow custom var width load functions Luis Pires
@ 2021-04-13 21:11 ` Luis Pires
  2021-04-14 15:26   ` Richard Henderson
                     ` (2 more replies)
  2021-04-13 21:11 ` [PATCH 5/5] target/ppc: Implement paddi and replace addi insns Luis Pires
  4 siblings, 3 replies; 15+ messages in thread
From: Luis Pires @ 2021-04-13 21:11 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: richard.henderson, f4bug, Luis Pires, lagarcia, bruno.larsen,
	matheus.ferst, david

These changes add the basic support for 32- and 64-bit instruction
decoding using decodetree.

Apart from the instruction decoding itself, it also takes care of
some pre-requisite changes, such as removing hard-coded instruction
sizes throughout the code and raising an alignment exception should
a prefixed instruction cross a 64-byte boundary.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
---
 target/ppc/cpu.h       |   1 +
 target/ppc/meson.build |   5 ++
 target/ppc/ppc.decode  |  18 ++++
 target/ppc/translate.c | 191 ++++++++++++++++++++++++++++++++---------
 4 files changed, 174 insertions(+), 41 deletions(-)
 create mode 100644 target/ppc/ppc.decode

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index e73416da68..9bb2805a22 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -148,6 +148,7 @@ enum {
     POWERPC_EXCP_ALIGN_PROT    = 0x04,  /* Access cross protection boundary  */
     POWERPC_EXCP_ALIGN_BAT     = 0x05,  /* Access cross a BAT/seg boundary   */
     POWERPC_EXCP_ALIGN_CACHE   = 0x06,  /* Impossible dcbz access            */
+    POWERPC_EXCP_ALIGN_INSN    = 0x07,  /* Pref. insn x-ing 64-byte boundary */
     /* Exception subtypes for POWERPC_EXCP_PROGRAM                           */
     /* FP exceptions                                                         */
     POWERPC_EXCP_FP            = 0x10,
diff --git a/target/ppc/meson.build b/target/ppc/meson.build
index bbfef90e08..feed383a2b 100644
--- a/target/ppc/meson.build
+++ b/target/ppc/meson.build
@@ -1,4 +1,9 @@
+gen = [
+  decodetree.process('ppc.decode', extra_args: ['--varinsnwidth=64', '--noloadfunc'])
+]
+
 ppc_ss = ss.source_set()
+ppc_ss.add(gen)
 ppc_ss.add(files(
   'cpu-models.c',
   'cpu.c',
diff --git a/target/ppc/ppc.decode b/target/ppc/ppc.decode
new file mode 100644
index 0000000000..84bb73ac19
--- /dev/null
+++ b/target/ppc/ppc.decode
@@ -0,0 +1,18 @@
+#
+# Power ISA instruction decode definitions.
+#
+# Copyright (c) 2021 Luis Pires <luis.pires@eldorado.org.br>
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+#
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0984ce637b..0f123f7b80 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -154,6 +154,11 @@ void ppc_translate_init(void)
 /* internal defines */
 struct DisasContext {
     DisasContextBase base;
+
+    /*
+     * 'opcode' should be considered deprecated and be used only
+     * by legacy non-decodetree code
+     */
     uint32_t opcode;
     uint32_t exception;
     /* Routine used to access memory */
@@ -180,6 +185,8 @@ struct DisasContext {
     uint32_t flags;
     uint64_t insns_flags;
     uint64_t insns_flags2;
+    target_ulong insn_size;
+    CPUPPCState *env;
 };
 
 /* Return true iff byteswap is needed in a scalar memop */
@@ -199,6 +206,10 @@ static inline bool need_byteswap(const DisasContext *ctx)
 # define NARROW_MODE(C)  0
 #endif
 
+#if defined(DO_PPC_STATISTICS)
+static uint64_t ppc_decodetree_hit_count;
+#endif
+
 struct opc_handler_t {
     /* invalid bits for instruction 1 (Rc(opcode) == 0) */
     uint32_t inval1;
@@ -254,7 +265,7 @@ static void gen_exception_err(DisasContext *ctx, uint32_t excp, uint32_t error)
      * faulting instruction
      */
     if (ctx->exception == POWERPC_EXCP_NONE) {
-        gen_update_nip(ctx, ctx->base.pc_next - 4);
+        gen_update_nip(ctx, ctx->base.pc_next - ctx->insn_size);
     }
     t0 = tcg_const_i32(excp);
     t1 = tcg_const_i32(error);
@@ -273,7 +284,7 @@ static void gen_exception(DisasContext *ctx, uint32_t excp)
      * faulting instruction
      */
     if (ctx->exception == POWERPC_EXCP_NONE) {
-        gen_update_nip(ctx, ctx->base.pc_next - 4);
+        gen_update_nip(ctx, ctx->base.pc_next - ctx->insn_size);
     }
     t0 = tcg_const_i32(excp);
     gen_helper_raise_exception(cpu_env, t0);
@@ -3113,7 +3124,8 @@ static void gen_eieio(DisasContext *ctx)
          */
         if (!(ctx->insns_flags2 & PPC2_ISA300)) {
             qemu_log_mask(LOG_GUEST_ERROR, "invalid eieio using bit 6 at @"
-                          TARGET_FMT_lx "\n", ctx->base.pc_next - 4);
+                          TARGET_FMT_lx "\n",
+                          ctx->base.pc_next - ctx->insn_size);
         } else {
             bar = TCG_MO_ST_LD;
         }
@@ -3782,14 +3794,14 @@ static void gen_b(DisasContext *ctx)
     li = LI(ctx->opcode);
     li = (li ^ 0x02000000) - 0x02000000;
     if (likely(AA(ctx->opcode) == 0)) {
-        target = ctx->base.pc_next + li - 4;
+        target = ctx->base.pc_next + li - ctx->insn_size;
     } else {
         target = li;
     }
     if (LK(ctx->opcode)) {
         gen_setlr(ctx, ctx->base.pc_next);
     }
-    gen_update_cfar(ctx, ctx->base.pc_next - 4);
+    gen_update_cfar(ctx, ctx->base.pc_next - ctx->insn_size);
     gen_goto_tb(ctx, 0, target);
 }
 
@@ -3888,11 +3900,11 @@ static void gen_bcond(DisasContext *ctx, int type)
         }
         tcg_temp_free_i32(temp);
     }
-    gen_update_cfar(ctx, ctx->base.pc_next - 4);
+    gen_update_cfar(ctx, ctx->base.pc_next - ctx->insn_size);
     if (type == BCOND_IM) {
         target_ulong li = (target_long)((int16_t)(BD(ctx->opcode)));
         if (likely(AA(ctx->opcode) == 0)) {
-            gen_goto_tb(ctx, 0, ctx->base.pc_next + li - 4);
+            gen_goto_tb(ctx, 0, ctx->base.pc_next + li - ctx->insn_size);
         } else {
             gen_goto_tb(ctx, 0, li);
         }
@@ -4008,7 +4020,7 @@ static void gen_rfi(DisasContext *ctx)
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
         gen_io_start();
     }
-    gen_update_cfar(ctx, ctx->base.pc_next - 4);
+    gen_update_cfar(ctx, ctx->base.pc_next - ctx->insn_size);
     gen_helper_rfi(cpu_env);
     gen_sync_exception(ctx);
 #endif
@@ -4025,7 +4037,7 @@ static void gen_rfid(DisasContext *ctx)
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
         gen_io_start();
     }
-    gen_update_cfar(ctx, ctx->base.pc_next - 4);
+    gen_update_cfar(ctx, ctx->base.pc_next - ctx->insn_size);
     gen_helper_rfid(cpu_env);
     gen_sync_exception(ctx);
 #endif
@@ -4042,7 +4054,7 @@ static void gen_rfscv(DisasContext *ctx)
     if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
         gen_io_start();
     }
-    gen_update_cfar(ctx, ctx->base.pc_next - 4);
+    gen_update_cfar(ctx, ctx->base.pc_next - ctx->insn_size);
     gen_helper_rfscv(cpu_env);
     gen_sync_exception(ctx);
 #endif
@@ -4338,7 +4350,7 @@ static inline void gen_op_mfspr(DisasContext *ctx)
             if (sprn != SPR_PVR) {
                 qemu_log_mask(LOG_GUEST_ERROR, "Trying to read privileged spr "
                               "%d (0x%03x) at " TARGET_FMT_lx "\n", sprn, sprn,
-                              ctx->base.pc_next - 4);
+                              ctx->base.pc_next - ctx->insn_size);
             }
             gen_priv_exception(ctx, POWERPC_EXCP_PRIV_REG);
         }
@@ -4352,7 +4364,8 @@ static inline void gen_op_mfspr(DisasContext *ctx)
         /* Not defined */
         qemu_log_mask(LOG_GUEST_ERROR,
                       "Trying to read invalid spr %d (0x%03x) at "
-                      TARGET_FMT_lx "\n", sprn, sprn, ctx->base.pc_next - 4);
+                      TARGET_FMT_lx "\n", sprn, sprn,
+                      ctx->base.pc_next - ctx->insn_size);
 
         /*
          * The behaviour depends on MSR:PR and SPR# bit 0x10, it can
@@ -4516,7 +4529,7 @@ static void gen_mtspr(DisasContext *ctx)
             /* Privilege exception */
             qemu_log_mask(LOG_GUEST_ERROR, "Trying to write privileged spr "
                           "%d (0x%03x) at " TARGET_FMT_lx "\n", sprn, sprn,
-                          ctx->base.pc_next - 4);
+                          ctx->base.pc_next - ctx->insn_size);
             gen_priv_exception(ctx, POWERPC_EXCP_PRIV_REG);
         }
     } else {
@@ -4530,7 +4543,8 @@ static void gen_mtspr(DisasContext *ctx)
         /* Not defined */
         qemu_log_mask(LOG_GUEST_ERROR,
                       "Trying to write invalid spr %d (0x%03x) at "
-                      TARGET_FMT_lx "\n", sprn, sprn, ctx->base.pc_next - 4);
+                      TARGET_FMT_lx "\n", sprn, sprn,
+                      ctx->base.pc_next - ctx->insn_size);
 
 
         /*
@@ -6900,6 +6914,60 @@ static inline void set_avr64(int regno, TCGv_i64 src, bool high)
     tcg_gen_st_i64(src, cpu_env, avr64_offset(regno, high));
 }
 
+/*
+ * Check if a given 32-bit value is a prefix for a 64-bit instruction.
+ */
+static inline int is_insn_prefix(uint32_t insn)
+{
+    return (opc1(insn) == 0x01);
+}
+
+/*
+ * Load a 32- or 64-bit instruction.
+ *
+ * 32-bit instructions are returned in the higher 32-bits
+ */
+static uint64_t ppc_load_insn(DisasContext *ctx)
+{
+    uint64_t insn;
+    uint32_t insn_part;
+
+    /* read 4 bytes */
+    insn_part = translator_ldl_swap(ctx->env, ctx->base.pc_next,
+                                    need_byteswap(ctx));
+    insn = ((uint64_t)insn_part) << 32;
+    ctx->base.pc_next += 4;
+    ctx->insn_size = 4;
+
+    if (is_insn_prefix(insn_part)) {
+        /* read 4 more bytes */
+        insn_part = translator_ldl_swap(ctx->env, ctx->base.pc_next,
+                                        need_byteswap(ctx));
+        insn |= insn_part;
+
+        ctx->base.pc_next += 4;
+        ctx->insn_size += 4;
+    }
+
+    return insn;
+}
+
+/*
+ * Peek at the next instruction's size.
+ */
+static target_ulong ppc_peek_next_insn_size(DisasContext *ctx)
+{
+    uint32_t insn_part;
+
+    /* read 4 bytes */
+    insn_part = translator_ldl_swap(ctx->env, ctx->base.pc_next,
+                                    need_byteswap(ctx));
+
+    return is_insn_prefix(insn_part) ? 8 : 4;
+}
+
+#include "decode-ppc.c.inc"
+
 #include "translate/fp-impl.c.inc"
 
 #include "translate/vmx-impl.c.inc"
@@ -7832,7 +7900,7 @@ void ppc_cpu_dump_statistics(CPUState *cs, int flags)
     opc_handler_t **t1, **t2, **t3, *handler;
     int op1, op2, op3;
 
-    t1 = cpu->env.opcodes;
+    t1 = cpu->opcodes;
     for (op1 = 0; op1 < 64; op1++) {
         handler = t1[op1];
         if (is_indirect_opcode(handler)) {
@@ -7872,6 +7940,10 @@ void ppc_cpu_dump_statistics(CPUState *cs, int flags)
                         handler->count, handler->count);
         }
     }
+
+    qemu_printf("decodetree: "
+                "%016" PRIx64 " %" PRId64 "\n",
+                ppc_decodetree_hit_count, ppc_decodetree_hit_count);
 #endif
 }
 
@@ -7879,7 +7951,6 @@ static void ppc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
     CPUPPCState *env = cs->env_ptr;
-    int bound;
 
     ctx->exception = POWERPC_EXCP_NONE;
     ctx->spr_cb = env->spr_cb;
@@ -7961,8 +8032,7 @@ static void ppc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     msr_se = 1;
 #endif
 
-    bound = -(ctx->base.pc_first | TARGET_PAGE_MASK) / 4;
-    ctx->base.max_insns = MIN(ctx->base.max_insns, bound);
+    ctx->env = env;
 }
 
 static void ppc_tr_tb_start(DisasContextBase *db, CPUState *cs)
@@ -7979,37 +8049,31 @@ static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
+    target_ulong insn_size = ppc_peek_next_insn_size(ctx);
+
     gen_debug_exception(ctx);
     dcbase->is_jmp = DISAS_NORETURN;
     /*
      * The address covered by the breakpoint must be included in
      * [tb->pc, tb->pc + tb->size) in order to for it to be properly
-     * cleared -- thus we increment the PC here so that the logic
-     * setting tb->size below does the right thing.
+     * cleared -- thus we increment the PC here.
      */
-    ctx->base.pc_next += 4;
+    ctx->base.pc_next += insn_size;
     return true;
 }
 
-static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
+/*
+ * Legacy (non-decodetree) 32-bit instruction translation code.
+ *
+ * Any new instructions should be implemented using the decode tree,
+ * and not use this code.
+ */
+static void ppc_tr_translate_insn_legacy(DisasContext *ctx, CPUState *cs)
 {
-    DisasContext *ctx = container_of(dcbase, DisasContext, base);
     PowerPCCPU *cpu = POWERPC_CPU(cs);
     CPUPPCState *env = cs->env_ptr;
     opc_handler_t **table, *handler;
 
-    LOG_DISAS("----------------\n");
-    LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
-              ctx->base.pc_next, ctx->mem_idx, (int)msr_ir);
-
-    ctx->opcode = translator_ldl_swap(env, ctx->base.pc_next,
-                                      need_byteswap(ctx));
-
-    LOG_DISAS("translate opcode %08x (%02x %02x %02x %02x) (%s)\n",
-              ctx->opcode, opc1(ctx->opcode), opc2(ctx->opcode),
-              opc3(ctx->opcode), opc4(ctx->opcode),
-              ctx->le_mode ? "little" : "big");
-    ctx->base.pc_next += 4;
     table = cpu->opcodes;
     handler = table[opc1(ctx->opcode)];
     if (is_indirect_opcode(handler)) {
@@ -8031,7 +8095,8 @@ static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
                       TARGET_FMT_lx " %d\n",
                       opc1(ctx->opcode), opc2(ctx->opcode),
                       opc3(ctx->opcode), opc4(ctx->opcode),
-                      ctx->opcode, ctx->base.pc_next - 4, (int)msr_ir);
+                      ctx->opcode, ctx->base.pc_next - ctx->insn_size,
+                      (int)msr_ir);
     } else {
         uint32_t inval;
 
@@ -8048,7 +8113,7 @@ static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
                           TARGET_FMT_lx "\n", ctx->opcode & inval,
                           opc1(ctx->opcode), opc2(ctx->opcode),
                           opc3(ctx->opcode), opc4(ctx->opcode),
-                          ctx->opcode, ctx->base.pc_next - 4);
+                          ctx->opcode, ctx->base.pc_next - ctx->insn_size);
             gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
             ctx->base.is_jmp = DISAS_NORETURN;
             return;
@@ -8067,11 +8132,55 @@ static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
         uint32_t excp = gen_prep_dbgex(ctx);
         gen_exception_nip(ctx, excp, ctx->base.pc_next);
     }
+}
 
-    if (tcg_check_temp_count()) {
-        qemu_log("Opcode %02x %02x %02x %02x (%08x) leaked "
-                 "temporaries\n", opc1(ctx->opcode), opc2(ctx->opcode),
-                 opc3(ctx->opcode), opc4(ctx->opcode), ctx->opcode);
+static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
+{
+#if defined(PPC_DEBUG_DISAS)
+    /* env is needed by LOG_DISAS */
+    CPUPPCState *env = cs->env_ptr;
+#endif
+    DisasContext *ctx = container_of(dcbase, DisasContext, base);
+    uint64_t insn;
+
+    LOG_DISAS("----------------\n");
+    LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
+              ctx->base.pc_next, ctx->mem_idx, (int)msr_ir);
+
+    /* load the next insn, keeping track of the insn size */
+    insn = ppc_load_insn(ctx);
+
+    if (unlikely(ctx->insn_size == 8 &&
+                 (ctx->base.pc_next & 0x3f) == 0x04)) {
+        /*
+         * Raise alignment exception when a 64-bit insn crosses a
+         * 64-byte boundary
+         */
+        gen_exception_err(ctx, POWERPC_EXCP_ALIGN, POWERPC_EXCP_ALIGN_INSN);
+    } else {
+        LOG_DISAS("translate opcode %016" PRIx64 " (%s)\n",
+                  insn, ctx->le_mode ? "little" : "big");
+
+        if (!decode(ctx, insn)) {
+            /*
+             * Instruction not found in decode tree.
+             * Fall back to legacy 32-bit instruction code.
+             *
+             * ppc_load_insn() keeps 32-bit instructions in the high
+             * 32-bits of insn.
+             */
+            ctx->opcode = (uint32_t)(insn >> 32);
+            ppc_tr_translate_insn_legacy(ctx, cs);
+        } else {
+#if defined(DO_PPC_STATISTICS)
+            ppc_decodetree_hit_count++;
+#endif
+        }
+
+        if (tcg_check_temp_count()) {
+            qemu_log("Opcode (%016" PRIx64 ") leaked "
+                     "temporaries\n", insn);
+        }
     }
 
     ctx->base.is_jmp = ctx->exception == POWERPC_EXCP_NONE ?
-- 
2.25.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 5/5] target/ppc: Implement paddi and replace addi insns
  2021-04-13 21:11 [PATCH 0/5] Base for adding PowerPC 64-bit instructions Luis Pires
                   ` (3 preceding siblings ...)
  2021-04-13 21:11 ` [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns Luis Pires
@ 2021-04-13 21:11 ` Luis Pires
  2021-04-13 22:41   ` Philippe Mathieu-Daudé
  2021-04-14 19:11   ` Richard Henderson
  4 siblings, 2 replies; 15+ messages in thread
From: Luis Pires @ 2021-04-13 21:11 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: richard.henderson, f4bug, Luis Pires, lagarcia, bruno.larsen,
	matheus.ferst, david

This implements the Power ISA 3.1 prefixed (64-bit) paddi
instruction, while also replacing the legacy addi implementation.
Both using the decode tree.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
 target/ppc/ppc.decode                      |  8 +++++++
 target/ppc/translate.c                     | 15 +------------
 target/ppc/translate/fixedpoint-impl.c.inc | 26 ++++++++++++++++++++++
 3 files changed, 35 insertions(+), 14 deletions(-)
 create mode 100644 target/ppc/translate/fixedpoint-impl.c.inc

diff --git a/target/ppc/ppc.decode b/target/ppc/ppc.decode
index 84bb73ac19..c87174dc20 100644
--- a/target/ppc/ppc.decode
+++ b/target/ppc/ppc.decode
@@ -16,3 +16,11 @@
 # You should have received a copy of the GNU Lesser General Public
 # License along with this library; if not, see <http://www.gnu.org/licenses/>.
 #
+
+%p_D8_SI        32:s18 0:16
+
+# Fixed-Point Facility Instructions
+&addi   r rt ra si
+
+paddi   000001 10 0 -- r:1 -- .................. 001110 rt:5 ra:5 ................ si=%p_D8_SI &addi
+addi    001110 rt:5 ra:5 si:s16 &addi r=0
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0f123f7b80..2ff192c9e5 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -945,19 +945,6 @@ GEN_INT_ARITH_ADD(addex, 0x05, cpu_ov, 1, 1, 0);
 /* addze  addze.  addzeo  addzeo.*/
 GEN_INT_ARITH_ADD_CONST(addze, 0x06, 0, cpu_ca, 1, 1, 0)
 GEN_INT_ARITH_ADD_CONST(addzeo, 0x16, 0, cpu_ca, 1, 1, 1)
-/* addi */
-static void gen_addi(DisasContext *ctx)
-{
-    target_long simm = SIMM(ctx->opcode);
-
-    if (rA(ctx->opcode) == 0) {
-        /* li case */
-        tcg_gen_movi_tl(cpu_gpr[rD(ctx->opcode)], simm);
-    } else {
-        tcg_gen_addi_tl(cpu_gpr[rD(ctx->opcode)],
-                        cpu_gpr[rA(ctx->opcode)], simm);
-    }
-}
 /* addic  addic.*/
 static inline void gen_op_addic(DisasContext *ctx, bool compute_rc0)
 {
@@ -6967,6 +6954,7 @@ static target_ulong ppc_peek_next_insn_size(DisasContext *ctx)
 }
 
 #include "decode-ppc.c.inc"
+#include "translate/fixedpoint-impl.c.inc"
 
 #include "translate/fp-impl.c.inc"
 
@@ -7091,7 +7079,6 @@ GEN_HANDLER_E(cmpeqb, 0x1F, 0x00, 0x07, 0x00600000, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER_E(cmpb, 0x1F, 0x1C, 0x0F, 0x00000001, PPC_NONE, PPC2_ISA205),
 GEN_HANDLER_E(cmprb, 0x1F, 0x00, 0x06, 0x00400001, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER(isel, 0x1F, 0x0F, 0xFF, 0x00000001, PPC_ISEL),
-GEN_HANDLER(addi, 0x0E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(addic, 0x0C, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER2(addic_, "addic.", 0x0D, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
 GEN_HANDLER(addis, 0x0F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
new file mode 100644
index 0000000000..8620954b57
--- /dev/null
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -0,0 +1,26 @@
+static bool trans_paddi(DisasContext *ctx, arg_paddi *a)
+{
+    if (a->r == 0) {
+        if (a->ra == 0) {
+            /* li case */
+            tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
+        } else {
+            tcg_gen_addi_tl(cpu_gpr[a->rt],
+                            cpu_gpr[a->ra], a->si);
+        }
+    } else {
+        if (a->ra == 0) {
+            tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_nip, a->si);
+        } else {
+            /* invalid form */
+            gen_invalid(ctx);
+        }
+    }
+
+    return true;
+}
+
+static bool trans_addi(DisasContext *ctx, arg_addi *a)
+{
+    return trans_paddi(ctx, a);
+}
-- 
2.25.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] target/ppc: Implement paddi and replace addi insns
  2021-04-13 21:11 ` [PATCH 5/5] target/ppc: Implement paddi and replace addi insns Luis Pires
@ 2021-04-13 22:41   ` Philippe Mathieu-Daudé
  2021-04-14 13:00     ` Luis Fernando Fujita Pires
  2021-04-14 19:11   ` Richard Henderson
  1 sibling, 1 reply; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-04-13 22:41 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, richard.henderson, matheus.ferst, david

Hi Luis,

On 4/13/21 11:11 PM, Luis Pires wrote:
> This implements the Power ISA 3.1 prefixed (64-bit) paddi
> instruction, while also replacing the legacy addi implementation.
> Both using the decode tree.
> 
> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
>  target/ppc/ppc.decode                      |  8 +++++++
>  target/ppc/translate.c                     | 15 +------------
>  target/ppc/translate/fixedpoint-impl.c.inc | 26 ++++++++++++++++++++++
>  3 files changed, 35 insertions(+), 14 deletions(-)
>  create mode 100644 target/ppc/translate/fixedpoint-impl.c.inc
> 
> diff --git a/target/ppc/ppc.decode b/target/ppc/ppc.decode
> index 84bb73ac19..c87174dc20 100644
> --- a/target/ppc/ppc.decode
> +++ b/target/ppc/ppc.decode
> @@ -16,3 +16,11 @@
>  # You should have received a copy of the GNU Lesser General Public
>  # License along with this library; if not, see <http://www.gnu.org/licenses/>.
>  #
> +
> +%p_D8_SI        32:s18 0:16
> +
> +# Fixed-Point Facility Instructions
> +&addi   r rt ra si
> +
> +paddi   000001 10 0 -- r:1 -- .................. 001110 rt:5 ra:5 ................ si=%p_D8_SI &addi

IIUC you should be able to do something like catch ra=0 first ...:

{
  addi_movi   000001 10 0 -- r:1 -- .................. 001110 rt:5 .....
................ si=%p_D8_SI &addi ra=0
  addi   000001 10 0 -- r:1 -- .................. 001110 rt:5 ra:5
................ si=%p_D8_SI &addi
}

> +addi    001110 rt:5 ra:5 si:s16 &addi r=0
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 0f123f7b80..2ff192c9e5 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -945,19 +945,6 @@ GEN_INT_ARITH_ADD(addex, 0x05, cpu_ov, 1, 1, 0);
>  /* addze  addze.  addzeo  addzeo.*/
>  GEN_INT_ARITH_ADD_CONST(addze, 0x06, 0, cpu_ca, 1, 1, 0)
>  GEN_INT_ARITH_ADD_CONST(addzeo, 0x16, 0, cpu_ca, 1, 1, 1)
> -/* addi */
> -static void gen_addi(DisasContext *ctx)
> -{
> -    target_long simm = SIMM(ctx->opcode);
> -
> -    if (rA(ctx->opcode) == 0) {
> -        /* li case */
> -        tcg_gen_movi_tl(cpu_gpr[rD(ctx->opcode)], simm);
> -    } else {
> -        tcg_gen_addi_tl(cpu_gpr[rD(ctx->opcode)],
> -                        cpu_gpr[rA(ctx->opcode)], simm);
> -    }
> -}
>  /* addic  addic.*/
>  static inline void gen_op_addic(DisasContext *ctx, bool compute_rc0)
>  {
> @@ -6967,6 +6954,7 @@ static target_ulong ppc_peek_next_insn_size(DisasContext *ctx)
>  }
>  
>  #include "decode-ppc.c.inc"
> +#include "translate/fixedpoint-impl.c.inc"
>  
>  #include "translate/fp-impl.c.inc"
>  
> @@ -7091,7 +7079,6 @@ GEN_HANDLER_E(cmpeqb, 0x1F, 0x00, 0x07, 0x00600000, PPC_NONE, PPC2_ISA300),
>  GEN_HANDLER_E(cmpb, 0x1F, 0x1C, 0x0F, 0x00000001, PPC_NONE, PPC2_ISA205),
>  GEN_HANDLER_E(cmprb, 0x1F, 0x00, 0x06, 0x00400001, PPC_NONE, PPC2_ISA300),
>  GEN_HANDLER(isel, 0x1F, 0x0F, 0xFF, 0x00000001, PPC_ISEL),
> -GEN_HANDLER(addi, 0x0E, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
>  GEN_HANDLER(addic, 0x0C, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
>  GEN_HANDLER2(addic_, "addic.", 0x0D, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
>  GEN_HANDLER(addis, 0x0F, 0xFF, 0xFF, 0x00000000, PPC_INTEGER),
> diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
> new file mode 100644
> index 0000000000..8620954b57
> --- /dev/null
> +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> @@ -0,0 +1,26 @@
> +static bool trans_paddi(DisasContext *ctx, arg_paddi *a)

(Nitpick, use the format: arg_addi, not arg_paddi).

> +{
> +    if (a->r == 0) {
> +        if (a->ra == 0) {
> +            /* li case */
> +            tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
> +        } else {
> +            tcg_gen_addi_tl(cpu_gpr[a->rt],
> +                            cpu_gpr[a->ra], a->si);
> +        }
> +    } else {
> +        if (a->ra == 0) {
> +            tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_nip, a->si);
> +        } else {
> +            /* invalid form */
> +            gen_invalid(ctx);
> +        }
> +    }
> +
> +    return true;
> +}
> +
> +static bool trans_addi(DisasContext *ctx, arg_addi *a)
> +{
> +    return trans_paddi(ctx, a);
> +}

... which then simplifies the trans_OPCODE() logic:

static bool trans_addi_movi(DisasContext *ctx, arg_addi *a)
{
    if (a->r == 0) {
        /* li case */
        tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
    } else {
        tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_nip, a->si);
    }

    return true;
}

static bool trans_addi(DisasContext *ctx, arg_addi *a)
{
    if (a->r == 0) {
        tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_gpr[a->ra], a->si);
    } else {
        /* invalid form */
        gen_invalid(ctx);
    }

    return true;
}

Maybe you can do the same with the r bit to remove the dual addi_movi.

Regards,

Phil.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 5/5] target/ppc: Implement paddi and replace addi insns
  2021-04-13 22:41   ` Philippe Mathieu-Daudé
@ 2021-04-14 13:00     ` Luis Fernando Fujita Pires
  0 siblings, 0 replies; 15+ messages in thread
From: Luis Fernando Fujita Pires @ 2021-04-14 13:00 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel, qemu-ppc
  Cc: lagarcia, Bruno Piazera Larsen, richard.henderson,
	Matheus Kowalczuk Ferst, david

Hi Phil,

> > +
> > +%p_D8_SI        32:s18 0:16
> > +
> > +# Fixed-Point Facility Instructions
> > +&addi   r rt ra si
> > +
> > +paddi   000001 10 0 -- r:1 -- .................. 001110 rt:5 ra:5 ................
> si=%p_D8_SI &addi
> 
> IIUC you should be able to do something like catch ra=0 first ...:
> 
> {
>   addi_movi   000001 10 0 -- r:1 -- .................. 001110 rt:5 .....
> ................ si=%p_D8_SI &addi ra=0
>   addi   000001 10 0 -- r:1 -- .................. 001110 rt:5 ra:5
> ................ si=%p_D8_SI &addi
> }

> > +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> > @@ -0,0 +1,26 @@
> > +static bool trans_paddi(DisasContext *ctx, arg_paddi *a)
> 
> (Nitpick, use the format: arg_addi, not arg_paddi).

Sure, will do! I was using the exact function signature generated by
decodetree, but using 'arg_addi' makes more sense, as it will make
it clearer that it's the same struct.

> 
> > +{
> > +    if (a->r == 0) {
> > +        if (a->ra == 0) {
> > +            /* li case */
> > +            tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
> > +        } else {
> > +            tcg_gen_addi_tl(cpu_gpr[a->rt],
> > +                            cpu_gpr[a->ra], a->si);
> > +        }
> > +    } else {
> > +        if (a->ra == 0) {
> > +            tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_nip, a->si);
> > +        } else {
> > +            /* invalid form */
> > +            gen_invalid(ctx);
> > +        }
> > +    }
> > +
> > +    return true;
> > +}
> > +
> > +static bool trans_addi(DisasContext *ctx, arg_addi *a) {
> > +    return trans_paddi(ctx, a);
> > +}
> 
> ... which then simplifies the trans_OPCODE() logic:
> 
> static bool trans_addi_movi(DisasContext *ctx, arg_addi *a) {
>     if (a->r == 0) {
>         /* li case */
>         tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
>     } else {
>         tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_nip, a->si);
>     }
> 
>     return true;
> }
> 
> static bool trans_addi(DisasContext *ctx, arg_addi *a) {
>     if (a->r == 0) {
>         tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_gpr[a->ra], a->si);
>     } else {
>         /* invalid form */
>         gen_invalid(ctx);
>     }
> 
>     return true;
> }
> 
> Maybe you can do the same with the r bit to remove the dual addi_movi.

Right, that can be done. On the other hand, keeping this logic inside trans_paddi
and not in the .decode file makes it simpler (and less error-prone) to check that
the implementation matches the official ISA documentation, when opening
both side by side.

This is because the ISA specifies the instruction format for paddi as a whole
(which can be easily matched to what would be in the .decode file) and, after
that, the ISA specifies how each case should be handled (which could then
be checked by just looking at the trans_paddi implementation, which would
be in a single function).

Since the trans_paddi implementation with the ra/r handling is not that
complex either, I would recommend keeping the clearer separation
between the instruction formats (in the .decode file) and the
handling of each case in the trans_OPCODE() logic. What do you think?

Thanks,
Luis

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns
  2021-04-13 21:11 ` [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns Luis Pires
@ 2021-04-14 15:26   ` Richard Henderson
  2021-04-14 16:09   ` Richard Henderson
  2021-04-14 16:10   ` Richard Henderson
  2 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-14 15:26 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/13/21 2:11 PM, Luis Pires wrote:
>       if (ctx->exception == POWERPC_EXCP_NONE) {
> -        gen_update_nip(ctx, ctx->base.pc_next - 4);
> +        gen_update_nip(ctx, ctx->base.pc_next - ctx->insn_size);

It appears as if the major (only?) use of insn_size is this subtraction?  It 
looks like it would be better to simply save the address of the current pc 
before we begin decoding.

This change should be done as a separate patch.

> +static uint64_t ppc_load_insn(DisasContext *ctx)
> +{
> +    uint64_t insn;
> +    uint32_t insn_part;
> +
> +    /* read 4 bytes */
> +    insn_part = translator_ldl_swap(ctx->env, ctx->base.pc_next,
> +                                    need_byteswap(ctx));
> +    insn = ((uint64_t)insn_part) << 32;
> +    ctx->base.pc_next += 4;
> +    ctx->insn_size = 4;
> +
> +    if (is_insn_prefix(insn_part)) {
> +        /* read 4 more bytes */
> +        insn_part = translator_ldl_swap(ctx->env, ctx->base.pc_next,
> +                                        need_byteswap(ctx));
> +        insn |= insn_part;
> +
> +        ctx->base.pc_next += 4;
> +        ctx->insn_size += 4;
> +    }
> +
> +    return insn;
> +}

> @@ -7979,37 +8049,31 @@ static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>   {
>       DisasContext *ctx = container_of(dcbase, DisasContext, base);
>   
> +    target_ulong insn_size = ppc_peek_next_insn_size(ctx);
> +
>       gen_debug_exception(ctx);
>       dcbase->is_jmp = DISAS_NORETURN;
>       /*
>        * The address covered by the breakpoint must be included in
>        * [tb->pc, tb->pc + tb->size) in order to for it to be properly
> -     * cleared -- thus we increment the PC here so that the logic
> -     * setting tb->size below does the right thing.
> +     * cleared -- thus we increment the PC here.
>        */
> -    ctx->base.pc_next += 4;
> +    ctx->base.pc_next += insn_size;

Here in breakpoint_check, we merely need a non-zero number.  No point in a 
change here.

> +    /* load the next insn, keeping track of the insn size */
> +    insn = ppc_load_insn(ctx);
> +
> +    if (unlikely(ctx->insn_size == 8 &&
> +                 (ctx->base.pc_next & 0x3f) == 0x04)) {
> +        /*
> +         * Raise alignment exception when a 64-bit insn crosses a
> +         * 64-byte boundary
> +         */
> +        gen_exception_err(ctx, POWERPC_EXCP_ALIGN, POWERPC_EXCP_ALIGN_INSN);

This is incorrect.

In 1.10.2 Instruction Fetches, it states that all instructions are word 
aligned, and gives an example of a prefixed instruction not aligned on a 
double-word boundrary.


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns
  2021-04-13 21:11 ` [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns Luis Pires
  2021-04-14 15:26   ` Richard Henderson
@ 2021-04-14 16:09   ` Richard Henderson
  2021-04-14 16:10   ` Richard Henderson
  2 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-14 16:09 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/13/21 2:11 PM, Luis Pires wrote:
> @@ -7879,7 +7951,6 @@ static void ppc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
>   {
>       DisasContext *ctx = container_of(dcbase, DisasContext, base);
>       CPUPPCState *env = cs->env_ptr;
> -    int bound;
>   
>       ctx->exception = POWERPC_EXCP_NONE;
>       ctx->spr_cb = env->spr_cb;
> @@ -7961,8 +8032,7 @@ static void ppc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
>       msr_se = 1;
>   #endif
>   
> -    bound = -(ctx->base.pc_first | TARGET_PAGE_MASK) / 4;
> -    ctx->base.max_insns = MIN(ctx->base.max_insns, bound);
> +    ctx->env = env;
>   }
>   

You've removed the logic that prevents translation from crossing a page 
boundary.  You need to replace it.

A good example of how to handle this properly is arm thumb, at the end of 
thumb_tr_translate_insn.

At the end of ppc_tr_translate_insn, you'd do something like

   if (dc->base.is_jmp == DISAS_NEXT
       && (dc->base.pc_next & (TARGET_PAGE_SIZE - 1))
          == (TARGET_PAGE_SIZE - 4)
       && ppc_peek_next_insn_size(ctx)) {
       dc->base.is_jmp = DISAS_TOO_MANY;
   }


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns
  2021-04-13 21:11 ` [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns Luis Pires
  2021-04-14 15:26   ` Richard Henderson
  2021-04-14 16:09   ` Richard Henderson
@ 2021-04-14 16:10   ` Richard Henderson
  2 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-14 16:10 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/13/21 2:11 PM, Luis Pires wrote:
> +static inline int is_insn_prefix(uint32_t insn)
> +{
> +    return (opc1(insn) == 0x01);
> +}

Oh.  This should probably return false when prefixed instruction are not supported.


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] target/ppc: Implement paddi and replace addi insns
  2021-04-13 21:11 ` [PATCH 5/5] target/ppc: Implement paddi and replace addi insns Luis Pires
  2021-04-13 22:41   ` Philippe Mathieu-Daudé
@ 2021-04-14 19:11   ` Richard Henderson
  2021-04-14 23:07     ` Richard Henderson
  2021-04-15 16:59     ` Richard Henderson
  1 sibling, 2 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-14 19:11 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/13/21 2:11 PM, Luis Pires wrote:
> +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> @@ -0,0 +1,26 @@

Missing copyright+license header.

> +static bool trans_paddi(DisasContext *ctx, arg_paddi *a)
> +{
> +    if (a->r == 0) {
> +        if (a->ra == 0) {
> +            /* li case */
> +            tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
> +        } else {
> +            tcg_gen_addi_tl(cpu_gpr[a->rt],
> +                            cpu_gpr[a->ra], a->si);
> +        }
> +    } else {
> +        if (a->ra == 0) {
> +            tcg_gen_addi_tl(cpu_gpr[a->rt], cpu_nip, a->si);
> +        } else {
> +            /* invalid form */
> +            gen_invalid(ctx);
> +        }
> +    }
> +
> +    return true;
> +}
> +
> +static bool trans_addi(DisasContext *ctx, arg_addi *a)
> +{
> +    return trans_paddi(ctx, a);
> +}

Just a note about decodetree: this kind of thing is where you would use the 
same name for both patterns, so that you would not need to have a separate 
symbol for addi (or vice versa).

That said, I'm now having a bit of a read-up on power10, and I have some 
suggestions.

First, type 2 and type 3 prefixes modify existing instructions.  Which means 
that you are going to wind up with a lot of duplication.  Preferentially, we 
should avoid that.

One example of how to approach this is target/microblaze, which has an "imm" 
instruction prefix to extend a 16-bit immediate to a 32-bit immediate.  This 
can be worked into decodetree directly:

# Include any IMM prefix in the value reported.
%extimm         0:s16 !function=typeb_imm
@typeb          ...... rd:5 ra:5 ................ \
                 &typeb imm=%extimm

static int typeb_imm(DisasContext *dc, int x)
{
     if (dc->tb_flags & IMM_FLAG) {
         return deposit32(dc->ext_imm, 0, 16, x);
     }
     return x;
}

static bool trans_imm(DisasContext *dc, arg_imm *arg)
{
     if (invalid_delay_slot(dc, "imm")) {
         return true;
     }
     dc->ext_imm = arg->imm << 16;
     tcg_gen_movi_i32(cpu_imm, dc->ext_imm);
     dc->tb_flags_to_set = IMM_FLAG;
     return true;
}

We decode "imm" as a separate instruction, set some bits in DisasContext, and 
then use those bits while decoding the next instruction.

I think the exact mechanism that microblaze uses is going to be too simplistic 
for Power10, but the basic idea of modifying the "normal" instructions is still 
sound, I think.

Using addi+paddi as an example, what about

# All ppc formats have names -- use them.
%MLS        r ie
prefix_MLS  000001 10 -- r:1 -- ie:s18          &MLS

# TODO: decodetree extension -- allow :type after name.
# The SI field needs to be 64-bit for MLS:D-form.
%D          rt ra si:int64_t
@D          ...... rt:5 ra:5 si:s16

ADDI        001110 ..... ..... ................ @D


static bool
trans_prefix_MLS(DisasContext *ctx, arg_MLS *a)
{
     if (cpu does not support prefixes ||
         ctx->prefix_type != PREFIX_NONE) {
         return false;
     }
     /* Record the prefix for the next instruction. */
     ctx->prefix_type = PREFIX_MLS;
     ctx->prefix_data.mls = *a;
     return true;
}

static bool
allow_prefix_MLS(DisasContext *ctx, arg_D *a)
{
     int64_t imm;

     /* Require MLS prefix or no prefix. */
     if (ctx->prefix_type != PREFIX_MLS) {
         if (ctx->prefix_type == PREFIX_NONE) {
             return true;
         }
         gen_invalid(ctx);
         return false;
     }

     /*
      * Concatenate the two immediate fields.
      * Note that IE is sign-extended 18 bits,
      * so this forms a signed 34-bit constant.
      */
     imm = deposit64(a->si, 16, 48, ctx->prefix_data.mls.ie);

     /*
      * If R, then the constant is pc-relative,
      * and RA must be 0.
      */
     if (ctx->prefix_data.mls.r) {
         if (ctx->prefix_data.mls.ra != 0) {
             gen_invalid(ctx);
             return false;
         }
         imm += ctx->cia;
     }
     a->si = imm;
     return true;
}

static bool
trans_ADDI(DisasContext *ctx, arg_D *a)
{
     if (!allow_prefix_MLS(ctx, a)) {
         return true;
     }
     if (a->ra) {
         tcg_gen_addi_tl(cpu_gpr[a->rt],
                         cpu_gpr[a->ra], a->si);
     } else {
         tcg_gen_movi_tl(cpu_gpr[a->rt], a->si);
     }
     return true;
}

This approach seems like it will work fine for MLS and MMIR prefixes.  For 8LS, 
8RR, and MRR prefixes, we'll need some extra help within ppc_tr_translate_insn. 
  E.g.

     insn = translator_ldl_swap(env, ctx->base.pc_next,
                                need_byteswap(ctx));
     switch (ctx->prefix_type) {
     case PREFIX_NONE:
         ok = decode_opcode_space_0(ctx, insn) ||
              decode_legacy(ctx, insn);
         break;
     case PREFIX_MLS:
     case PREFIX_MMIRR:
         ok = decode_opcode_space_0(ctx, insn);
         break;
     case PREFIX_8LS:
     case PREFIX_8RR:
         ok = decode_opcode_space_1(ctx, insn);
         break;
     case PREFIX_MRR:
         /*
          * The only instruction with this prefix is PNOP.
          * TODO: diagnose the set of patterns that are illegal:
          * branches, rfebb, sync other than isync, or a
          * service processor attention.
          * The Engineering Note allows us to either diagnose
          * these as illegal, or treat them all as no-op.
          */
         ok = true;
         break;
     default:
         g_assert_not_reached();
     }
     if (!ok) {
         gen_invalid(ctx);
     }


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] decodetree: Fix empty input files for varinsnwidth
  2021-04-13 21:11 ` [PATCH 2/5] decodetree: Fix empty input files for varinsnwidth Luis Pires
@ 2021-04-14 19:47   ` Richard Henderson
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-14 19:47 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/13/21 2:11 PM, Luis Pires wrote:
> Decodetree would throw an error when the input file was empty
> and --varinsnwidth was specified.
> 
> Signed-off-by: Luis Pires<luis.pires@eldorado.org.br>
> ---
>   scripts/decodetree.py | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


> +        if self.width is not None:
> +            if extracted < self.width:

Is it too ugly to use AND here?


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] target/ppc: Implement paddi and replace addi insns
  2021-04-14 19:11   ` Richard Henderson
@ 2021-04-14 23:07     ` Richard Henderson
  2021-04-15 16:59     ` Richard Henderson
  1 sibling, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-14 23:07 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/14/21 12:11 PM, Richard Henderson wrote:
> static bool
> allow_prefix_MLS(DisasContext *ctx, arg_D *a)
> {
>      int64_t imm;
> 
>      /* Require MLS prefix or no prefix. */
>      if (ctx->prefix_type != PREFIX_MLS) {
>          if (ctx->prefix_type == PREFIX_NONE) {
>              return true;
>          }
>          gen_invalid(ctx);
>          return false;
>      }

Combined with the switch on prefix_type in translate_insn, I think this can 
just simplify to

     if (ctx->prefix_type != PREFIX_MLS) {
         return ctx->prefix_type == PREFIX_NONE;
     }

because decode_legacy is only called from within PREFIX_NONE.


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] target/ppc: Implement paddi and replace addi insns
  2021-04-14 19:11   ` Richard Henderson
  2021-04-14 23:07     ` Richard Henderson
@ 2021-04-15 16:59     ` Richard Henderson
  1 sibling, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2021-04-15 16:59 UTC (permalink / raw)
  To: Luis Pires, qemu-devel, qemu-ppc
  Cc: lagarcia, bruno.larsen, matheus.ferst, f4bug, david

On 4/14/21 12:11 PM, Richard Henderson wrote:
> This approach seems like it will work fine for MLS and MMIR prefixes.  For 8LS, 
> 8RR, and MRR prefixes, we'll need some extra help within ppc_tr_translate_insn. 
>   E.g.
> 
>      insn = translator_ldl_swap(env, ctx->base.pc_next,
>                                 need_byteswap(ctx));
>      switch (ctx->prefix_type) {
>      case PREFIX_NONE:
>          ok = decode_opcode_space_0(ctx, insn) ||
>               decode_legacy(ctx, insn);
>          break;
>      case PREFIX_MLS:
>      case PREFIX_MMIRR:
>          ok = decode_opcode_space_0(ctx, insn);
>          break;

I played about with this last night, and there's an interesting trade-off:

(1) The thousands of 32-bit insns which do not allow prefixes
     now each require 3 lines to assert that no prefix is present,

(2) There are only 12 MLS and 29 MMIRR prefixed insns.

I think it may well be that eliminating boiler-plate from thousands of insns it 
a good trade-off to special-casing 41 insns.

At which point, considering the multiple variations on 8RR and MMIRR prefixes, 
seems to indicate that we should decode all 64 bits all at once.


r~


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-04-15 17:27 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 21:11 [PATCH 0/5] Base for adding PowerPC 64-bit instructions Luis Pires
2021-04-13 21:11 ` [PATCH 1/5] decodetree: Add support for " Luis Pires
2021-04-13 21:11 ` [PATCH 2/5] decodetree: Fix empty input files for varinsnwidth Luis Pires
2021-04-14 19:47   ` Richard Henderson
2021-04-13 21:11 ` [PATCH 3/5] decodetree: Allow custom var width load functions Luis Pires
2021-04-13 21:11 ` [PATCH 4/5] target/ppc: Base changes to allow 32/64-bit insns Luis Pires
2021-04-14 15:26   ` Richard Henderson
2021-04-14 16:09   ` Richard Henderson
2021-04-14 16:10   ` Richard Henderson
2021-04-13 21:11 ` [PATCH 5/5] target/ppc: Implement paddi and replace addi insns Luis Pires
2021-04-13 22:41   ` Philippe Mathieu-Daudé
2021-04-14 13:00     ` Luis Fernando Fujita Pires
2021-04-14 19:11   ` Richard Henderson
2021-04-14 23:07     ` Richard Henderson
2021-04-15 16:59     ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).