All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup
@ 2021-08-18 21:28 Richard Henderson
  2021-08-18 21:28 ` [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH Richard Henderson
                   ` (13 more replies)
  0 siblings, 14 replies; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:28 UTC (permalink / raw)
  To: qemu-devel

Based-on: <20210818191920.390759-1-richard.henderson@linaro.org>
("[PATCH v3 00/66] Unaligned access for user-only")

Important points:
  * Support unaligned accesses.
  * Add environment variable to for testing older architecture revs.
  * More use of enum types.


r~


Richard Henderson (14):
  tcg/arm: Remove fallback definition of __ARM_ARCH
  tcg/arm: Standardize on tcg_out_<branch>_{reg,imm}
  tcg/arm: Simplify use_armvt5_instructions
  tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call
  tcg/arm: Examine QEMU_TCG_DEBUG environment variable
  tcg/arm: Support unaligned access for softmmu
  tcg/arm: Split out tcg_out_ldstm
  tcg/arm: Simplify usage of encode_imm
  tcg/arm: Drop inline markers
  tcg/arm: Give enum arm_cond_code_e a typedef and use it
  tcg/arm: More use of the ARMInsn enum
  tcg/arm: More use of the TCGReg enum
  tcg/arm: Reserve a register for guest_base
  tcg/arm: Support raising sigbus for user-only

 tcg/arm/tcg-target.h     |   35 +-
 tcg/arm/tcg-target.c.inc | 1010 +++++++++++++++++++++++++++-----------
 2 files changed, 724 insertions(+), 321 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
@ 2021-08-18 21:28 ` Richard Henderson
  2021-08-20 10:38   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm} Richard Henderson
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:28 UTC (permalink / raw)
  To: qemu-devel

GCC since 4.8 provides the definition and we now require 7.5.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.h | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index d113b7f8db..18bb16c784 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -26,25 +26,6 @@
 #ifndef ARM_TCG_TARGET_H
 #define ARM_TCG_TARGET_H
 
-/* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
-#ifndef __ARM_ARCH
-# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
-     || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
-     || defined(__ARM_ARCH_7EM__)
-#  define __ARM_ARCH 7
-# elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
-       || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
-       || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
-#  define __ARM_ARCH 6
-# elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__) \
-       || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
-       || defined(__ARM_ARCH_5TEJ__)
-#  define __ARM_ARCH 5
-# else
-#  define __ARM_ARCH 4
-# endif
-#endif
-
 extern int arm_arch;
 
 #if defined(__ARM_ARCH_5T__) \
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm}
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
  2021-08-18 21:28 ` [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-18 21:58   ` Philippe Mathieu-Daudé
  2021-08-20 10:39   ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg, imm} Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions Richard Henderson
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Some of the functions specified _reg, some _imm, and some
left it blank.  Make it clearer to which we are referring.

Split tcg_out_b_reg from tcg_out_bx_reg, to indicate when
we do not actually require BX semantics.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index cbe3057a9d..0578f9749b 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -525,19 +525,19 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
     return 0;
 }
 
-static inline void tcg_out_b(TCGContext *s, int cond, int32_t offset)
+static inline void tcg_out_b_imm(TCGContext *s, int cond, int32_t offset)
 {
     tcg_out32(s, (cond << 28) | 0x0a000000 |
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static inline void tcg_out_bl(TCGContext *s, int cond, int32_t offset)
+static inline void tcg_out_bl_imm(TCGContext *s, int cond, int32_t offset)
 {
     tcg_out32(s, (cond << 28) | 0x0b000000 |
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static inline void tcg_out_blx(TCGContext *s, int cond, int rn)
+static inline void tcg_out_blx_reg(TCGContext *s, int cond, int rn)
 {
     tcg_out32(s, (cond << 28) | 0x012fff30 | rn);
 }
@@ -568,13 +568,19 @@ static inline void tcg_out_mov_reg(TCGContext *s, int cond, int rd, int rm)
     }
 }
 
-static inline void tcg_out_bx(TCGContext *s, int cond, TCGReg rn)
+static void tcg_out_bx_reg(TCGContext *s, int cond, TCGReg rn)
 {
-    /* Unless the C portion of QEMU is compiled as thumb, we don't
-       actually need true BX semantics; merely a branch to an address
-       held in a register.  */
+    tcg_out32(s, (cond << 28) | 0x012fff10 | rn);
+}
+
+static void tcg_out_b_reg(TCGContext *s, int cond, TCGReg rn)
+{
+    /*
+     * Unless the C portion of QEMU is compiled as thumb, we don't need
+     * true BX semantics; merely a branch to an address held in a register.
+     */
     if (use_armv5t_instructions) {
-        tcg_out32(s, (cond << 28) | 0x012fff10 | rn);
+        tcg_out_bx_reg(s, cond, rn);
     } else {
         tcg_out_mov_reg(s, cond, TCG_REG_PC, rn);
     }
@@ -1215,7 +1221,7 @@ static void tcg_out_goto(TCGContext *s, int cond, const tcg_insn_unit *addr)
     ptrdiff_t disp = tcg_pcrel_diff(s, addr);
 
     if ((addri & 1) == 0 && disp - 8 < 0x01fffffd && disp - 8 > -0x01fffffd) {
-        tcg_out_b(s, cond, disp);
+        tcg_out_b_imm(s, cond, disp);
         return;
     }
     tcg_out_movi_pool(s, cond, TCG_REG_PC, addri);
@@ -1236,11 +1242,11 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *addr)
             }
             tcg_out_blx_imm(s, disp);
         } else {
-            tcg_out_bl(s, COND_AL, disp);
+            tcg_out_bl_imm(s, COND_AL, disp);
         }
     } else if (use_armv7_instructions) {
         tcg_out_movi32(s, COND_AL, TCG_REG_TMP, addri);
-        tcg_out_blx(s, COND_AL, TCG_REG_TMP);
+        tcg_out_blx_reg(s, COND_AL, TCG_REG_TMP);
     } else {
         /* ??? Know that movi_pool emits exactly 1 insn.  */
         tcg_out_dat_imm(s, COND_AL, ARITH_ADD, TCG_REG_R14, TCG_REG_PC, 0);
@@ -1254,7 +1260,7 @@ static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
         tcg_out_goto(s, cond, l->u.value_ptr);
     } else {
         tcg_out_reloc(s, s->code_ptr, R_ARM_PC24, l, 0);
-        tcg_out_b(s, cond, 0);
+        tcg_out_b_imm(s, cond, 0);
     }
 }
 
@@ -1823,7 +1829,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     /* This a conditional BL only to load a pointer within this opcode into LR
        for the slow path.  We will not be using the value for a tail call.  */
     label_ptr = s->code_ptr;
-    tcg_out_bl(s, COND_NE, 0);
+    tcg_out_bl_imm(s, COND_NE, 0);
 
     tcg_out_qemu_ld_index(s, opc, datalo, datahi, addrlo, addend);
 
@@ -1929,7 +1935,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 
     /* The conditional call must come last, as we're going to return here.  */
     label_ptr = s->code_ptr;
-    tcg_out_bl(s, COND_NE, 0);
+    tcg_out_bl_imm(s, COND_NE, 0);
 
     add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
                         s->code_ptr, label_ptr);
@@ -1982,7 +1988,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_goto_ptr:
-        tcg_out_bx(s, COND_AL, args[0]);
+        tcg_out_b_reg(s, COND_AL, args[0]);
         break;
     case INDEX_op_br:
         tcg_out_goto_label(s, COND_AL, arg_label(args[0]));
@@ -3065,7 +3071,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 
     tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, tcg_target_call_iarg_regs[0]);
 
-    tcg_out_bx(s, COND_AL, tcg_target_call_iarg_regs[1]);
+    tcg_out_b_reg(s, COND_AL, tcg_target_call_iarg_regs[1]);
 
     /*
      * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
  2021-08-18 21:28 ` [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH Richard Henderson
  2021-08-18 21:29 ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm} Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 10:59   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call Richard Henderson
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

According to the Arm ARM DDI 0406C, section A1.3, the valid variants
are ARMv5T, ARMv5TE, ARMv5TEJ -- there is no ARMv5 without Thumb.
Therefore simplify the test from preprocessor ifdefs to base
architecture revision.  Retain the "t" in the name to minimize churn.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 18bb16c784..f41b809554 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -28,13 +28,7 @@
 
 extern int arm_arch;
 
-#if defined(__ARM_ARCH_5T__) \
-    || defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
-# define use_armv5t_instructions 1
-#else
-# define use_armv5t_instructions use_armv6_instructions
-#endif
-
+#define use_armv5t_instructions (__ARM_ARCH >= 5 || arm_arch >= 5)
 #define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
 #define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (2 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 10:50   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable Richard Henderson
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

ARMv4T has BX as its only interworking instruction.  In order
to support testing of different architecture revisions with a
qemu binary that may have been built for, say ARMv6T2, fill in
the blank required to make calls to helpers in thumb mode.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 49 ++++++++++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 0578f9749b..87df812bb5 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1211,7 +1211,8 @@ static inline void tcg_out_st8(TCGContext *s, int cond,
         tcg_out_st8_12(s, cond, rd, rn, offset);
 }
 
-/* The _goto case is normally between TBs within the same code buffer, and
+/*
+ * The _goto case is normally between TBs within the same code buffer, and
  * with the code buffer limited to 16MB we wouldn't need the long case.
  * But we also use it for the tail-call to the qemu_ld/st helpers, which does.
  */
@@ -1219,38 +1220,56 @@ static void tcg_out_goto(TCGContext *s, int cond, const tcg_insn_unit *addr)
 {
     intptr_t addri = (intptr_t)addr;
     ptrdiff_t disp = tcg_pcrel_diff(s, addr);
+    bool arm_mode = !(addri & 1);
 
-    if ((addri & 1) == 0 && disp - 8 < 0x01fffffd && disp - 8 > -0x01fffffd) {
+    if (arm_mode && disp - 8 < 0x01fffffd && disp - 8 > -0x01fffffd) {
         tcg_out_b_imm(s, cond, disp);
         return;
     }
-    tcg_out_movi_pool(s, cond, TCG_REG_PC, addri);
+
+    /* LDR is interworking from v5t. */
+    if (arm_mode || use_armv5t_instructions) {
+        tcg_out_movi_pool(s, cond, TCG_REG_PC, addri);
+        return;
+    }
+
+    /* else v4t */
+    tcg_out_movi32(s, COND_AL, TCG_REG_TMP, addri);
+    tcg_out_bx_reg(s, COND_AL, TCG_REG_TMP);
 }
 
-/* The call case is mostly used for helpers - so it's not unreasonable
- * for them to be beyond branch range */
+/*
+ * The call case is mostly used for helpers - so it's not unreasonable
+ * for them to be beyond branch range.
+ */
 static void tcg_out_call(TCGContext *s, const tcg_insn_unit *addr)
 {
     intptr_t addri = (intptr_t)addr;
     ptrdiff_t disp = tcg_pcrel_diff(s, addr);
+    bool arm_mode = !(addri & 1);
 
     if (disp - 8 < 0x02000000 && disp - 8 >= -0x02000000) {
-        if (addri & 1) {
-            /* Use BLX if the target is in Thumb mode */
-            if (!use_armv5t_instructions) {
-                tcg_abort();
-            }
-            tcg_out_blx_imm(s, disp);
-        } else {
+        if (arm_mode) {
             tcg_out_bl_imm(s, COND_AL, disp);
+            return;
         }
-    } else if (use_armv7_instructions) {
+        if (use_armv5t_instructions) {
+            tcg_out_blx_imm(s, disp);
+            return;
+        }
+    }
+
+    if (use_armv5t_instructions) {
         tcg_out_movi32(s, COND_AL, TCG_REG_TMP, addri);
         tcg_out_blx_reg(s, COND_AL, TCG_REG_TMP);
-    } else {
+    } else if (arm_mode) {
         /* ??? Know that movi_pool emits exactly 1 insn.  */
-        tcg_out_dat_imm(s, COND_AL, ARITH_ADD, TCG_REG_R14, TCG_REG_PC, 0);
+        tcg_out_mov_reg(s, COND_AL, TCG_REG_R14, TCG_REG_PC);
         tcg_out_movi_pool(s, COND_AL, TCG_REG_PC, addri);
+    } else {
+        tcg_out_movi32(s, COND_AL, TCG_REG_TMP, addri);
+        tcg_out_mov_reg(s, COND_AL, TCG_REG_R14, TCG_REG_PC);
+        tcg_out_bx_reg(s, COND_AL, TCG_REG_TMP);
     }
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (3 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 11:01   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu Richard Henderson
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Use the environment variable to test an older ISA from
the one supported by the host.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.h     |  8 +++++++-
 tcg/arm/tcg-target.c.inc | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index f41b809554..e47720a85b 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -28,9 +28,15 @@
 
 extern int arm_arch;
 
+#ifdef CONFIG_DEBUG_TCG
+#define use_armv5t_instructions (arm_arch >= 5)
+#define use_armv6_instructions  (arm_arch >= 6)
+#define use_armv7_instructions  (arm_arch >= 7)
+#else
 #define use_armv5t_instructions (__ARM_ARCH >= 5 || arm_arch >= 5)
 #define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
 #define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)
+#endif
 
 #undef TCG_TARGET_STACK_GROWSUP
 #define TCG_TARGET_INSN_UNIT_SIZE 4
@@ -83,7 +89,7 @@ typedef enum {
 #else
 extern bool use_idiv_instructions;
 #endif
-#ifdef __ARM_NEON__
+#if defined(__ARM_NEON__) && !defined(CONFIG_DEBUG_TCG)
 #define use_neon_instructions  1
 #else
 extern bool use_neon_instructions;
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 87df812bb5..0c7e4f8411 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -2455,6 +2455,38 @@ static void tcg_target_init(TCGContext *s)
         }
     }
 
+    /*
+     * For debugging/testing purposes, allow the ISA to be reduced
+     * (but not extended) from the set detected above.
+     */
+#ifdef CONFIG_DEBUG_TCG
+    {
+        char *opt = g_strdup(getenv("QEMU_TCG_DEBUG"));
+        if (opt) {
+            for (char *o = strtok(opt, ","); o ; o = strtok(NULL, ",")) {
+                if (o[0] == 'v' &&
+                    o[1] >= '4' &&
+                    o[1] <= '0' + arm_arch &&
+                    o[2] == 0) {
+                    arm_arch = o[1] - '0';
+                    continue;
+                }
+                if (strcmp(o, "!neon") == 0) {
+                    use_neon_instructions = false;
+                    continue;
+                }
+                if (strcmp(o, "help") == 0) {
+                    printf("QEMU_TCG_DEBUG=<opt>{,<opt>} where <opt> is\n"
+                           "  v<N>   select ARMv<N>\n"
+                           "  !neon  disable ARM NEON\n");
+                    exit(0);
+                }
+            }
+            g_free(opt);
+        }
+    }
+#endif
+
     tcg_target_available_regs[TCG_TYPE_I32] = ALL_GENERAL_REGS;
 
     tcg_target_call_clobber_regs = 0;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (4 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 13:34   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm Richard Henderson
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

From armv6, the architecture supports unaligned accesses.
All we need to do is perform the correct alignment check
in tcg_out_tlb_read and not use LDRD/STRD when the access
is not aligned.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 69 ++++++++++++++++++++++------------------
 1 file changed, 38 insertions(+), 31 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 0c7e4f8411..c55167cc84 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -34,13 +34,6 @@ bool use_idiv_instructions;
 bool use_neon_instructions;
 #endif
 
-/* ??? Ought to think about changing CONFIG_SOFTMMU to always defined.  */
-#ifdef CONFIG_SOFTMMU
-# define USING_SOFTMMU 1
-#else
-# define USING_SOFTMMU 0
-#endif
-
 #ifdef CONFIG_DEBUG_TCG
 static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
     "%r0",  "%r1",  "%r2",  "%r3",  "%r4",  "%r5",  "%r6",  "%r7",
@@ -1526,15 +1519,20 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     int fast_off = TLB_MASK_TABLE_OFS(mem_index);
     int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
     int table_off = fast_off + offsetof(CPUTLBDescFast, table);
-    unsigned s_bits = opc & MO_SIZE;
-    unsigned a_bits = get_alignment_bits(opc);
+    unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
+    unsigned a_mask = (1 << get_alignment_bits(opc)) - 1;
+    TCGReg t_addr;
 
     /*
-     * We don't support inline unaligned acceses, but we can easily
-     * support overalignment checks.
+     * For v7, support for unaligned accesses is mandatory.
+     * For v6, support for unaligned accesses is enabled by SCTLR.U,
+     *     which is enabled by (at least) Linux and FreeBSD.
+     * For v4 and v5, unaligned accesses are... complicated, and
+     *     unhelped by Linux having a global not per-process flag
+     *     for unaligned handling.
      */
-    if (a_bits < s_bits) {
-        a_bits = s_bits;
+    if (!use_armv6_instructions && a_mask < s_mask) {
+        a_mask = s_mask;
     }
 
     /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}.  */
@@ -1578,27 +1576,32 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
 
     /*
      * Check alignment, check comparators.
-     * Do this in no more than 3 insns.  Use MOVW for v7, if possible,
+     * Do this in 2-4 insns.  Use MOVW for v7, if possible,
      * to reduce the number of sequential conditional instructions.
      * Almost all guests have at least 4k pages, which means that we need
      * to clear at least 9 bits even for an 8-byte memory, which means it
      * isn't worth checking for an immediate operand for BIC.
      */
+    /* For unaligned accesses, test the page of the last byte. */
+    t_addr = addrlo;
+    if (a_mask < s_mask) {
+        t_addr = TCG_REG_R0;
+        tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
+                        addrlo, s_mask - a_mask);
+    }
     if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
-        tcg_target_ulong mask = ~(TARGET_PAGE_MASK | ((1 << a_bits) - 1));
-
-        tcg_out_movi32(s, COND_AL, TCG_REG_TMP, mask);
+        tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
         tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
-                        addrlo, TCG_REG_TMP, 0);
+                        t_addr, TCG_REG_TMP, 0);
         tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
     } else {
-        if (a_bits) {
-            tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo,
-                            (1 << a_bits) - 1);
+        if (a_mask) {
+            tcg_debug_assert(a_mask <= 0xff);
+            tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
         }
-        tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, addrlo,
+        tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
                         SHIFT_IMM_LSR(TARGET_PAGE_BITS));
-        tcg_out_dat_reg(s, (a_bits ? COND_EQ : COND_AL), ARITH_CMP,
+        tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
                         0, TCG_REG_R2, TCG_REG_TMP,
                         SHIFT_IMM_LSL(TARGET_PAGE_BITS));
     }
@@ -1763,8 +1766,9 @@ static inline void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
         tcg_out_ld32_r(s, COND_AL, datalo, addrlo, addend);
         break;
     case MO_Q:
-        /* Avoid ldrd for user-only emulation, to handle unaligned.  */
-        if (USING_SOFTMMU && use_armv6_instructions
+        /* LDRD requires alignment; double-check that. */
+        if (use_armv6_instructions
+            && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
             tcg_out_ldrd_r(s, COND_AL, datalo, addrlo, addend);
         } else if (datalo != addend) {
@@ -1806,8 +1810,9 @@ static inline void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc,
         tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
         break;
     case MO_Q:
-        /* Avoid ldrd for user-only emulation, to handle unaligned.  */
-        if (USING_SOFTMMU && use_armv6_instructions
+        /* LDRD requires alignment; double-check that. */
+        if (use_armv6_instructions
+            && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
             tcg_out_ldrd_8(s, COND_AL, datalo, addrlo, 0);
         } else if (datalo == addrlo) {
@@ -1882,8 +1887,9 @@ static inline void tcg_out_qemu_st_index(TCGContext *s, int cond, MemOp opc,
         tcg_out_st32_r(s, cond, datalo, addrlo, addend);
         break;
     case MO_64:
-        /* Avoid strd for user-only emulation, to handle unaligned.  */
-        if (USING_SOFTMMU && use_armv6_instructions
+        /* STRD requires alignment; double-check that. */
+        if (use_armv6_instructions
+            && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
             tcg_out_strd_r(s, cond, datalo, addrlo, addend);
         } else {
@@ -1914,8 +1920,9 @@ static inline void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc,
         tcg_out_st32_12(s, COND_AL, datalo, addrlo, 0);
         break;
     case MO_64:
-        /* Avoid strd for user-only emulation, to handle unaligned.  */
-        if (USING_SOFTMMU && use_armv6_instructions
+        /* STRD requires alignment; double-check that. */
+        if (use_armv6_instructions
+            && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
             tcg_out_strd_8(s, COND_AL, datalo, addrlo, 0);
         } else {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (5 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 11:45   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm Richard Henderson
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Expand these hard-coded instructions symbolically.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index c55167cc84..63b786a3e5 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -134,6 +134,9 @@ typedef enum {
     INSN_CLZ       = 0x016f0f10,
     INSN_RBIT      = 0x06ff0f30,
 
+    INSN_LDMIA     = 0x08b00000,
+    INSN_STMDB     = 0x09200000,
+
     INSN_LDR_IMM   = 0x04100000,
     INSN_LDR_REG   = 0x06100000,
     INSN_STR_IMM   = 0x04000000,
@@ -586,6 +589,12 @@ static inline void tcg_out_dat_imm(TCGContext *s,
                     (rn << 16) | (rd << 12) | im);
 }
 
+static void tcg_out_ldstm(TCGContext *s, int cond, int opc,
+                          TCGReg rn, uint16_t mask)
+{
+    tcg_out32(s, (cond << 28) | opc | (rn << 16) | mask);
+}
+
 /* Note that this routine is used for both LDR and LDRH formats, so we do
    not wish to include an immediate shift at this point.  */
 static void tcg_out_memop_r(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
@@ -3119,7 +3128,10 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 {
     /* Calling convention requires us to save r4-r11 and lr.  */
     /* stmdb sp!, { r4 - r11, lr } */
-    tcg_out32(s, (COND_AL << 28) | 0x092d4ff0);
+    tcg_out_ldstm(s, COND_AL, INSN_STMDB, TCG_REG_CALL_STACK,
+                  (1 << TCG_REG_R4) | (1 << TCG_REG_R5) | (1 << TCG_REG_R6) |
+                  (1 << TCG_REG_R7) | (1 << TCG_REG_R8) | (1 << TCG_REG_R9) |
+                  (1 << TCG_REG_R10) | (1 << TCG_REG_R11) | (1 << TCG_REG_R14));
 
     /* Reserve callee argument and tcg temp space.  */
     tcg_out_dat_rI(s, COND_AL, ARITH_SUB, TCG_REG_CALL_STACK,
@@ -3147,7 +3159,10 @@ static void tcg_out_epilogue(TCGContext *s)
                    TCG_REG_CALL_STACK, STACK_ADDEND, 1);
 
     /* ldmia sp!, { r4 - r11, pc } */
-    tcg_out32(s, (COND_AL << 28) | 0x08bd8ff0);
+    tcg_out_ldstm(s, COND_AL, INSN_LDMIA, TCG_REG_CALL_STACK,
+                  (1 << TCG_REG_R4) | (1 << TCG_REG_R5) | (1 << TCG_REG_R6) |
+                  (1 << TCG_REG_R7) | (1 << TCG_REG_R8) | (1 << TCG_REG_R9) |
+                  (1 << TCG_REG_R10) | (1 << TCG_REG_R11) | (1 << TCG_REG_PC));
 }
 
 typedef struct {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (6 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 11:50   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 09/14] tcg/arm: Drop inline markers Richard Henderson
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

We have already computed the rotated value of the imm8
portion of the complete imm12 encoding.  No sense leaving
the combination of rot + rotation to the caller.

Create an encode_imm12_nofail helper that performs an assert.

This removes the final use of the local "rotl" function,
which duplicated our generic "rol32" function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 141 +++++++++++++++++++++------------------
 1 file changed, 77 insertions(+), 64 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 63b786a3e5..265370b2ee 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -305,10 +305,10 @@ static bool reloc_pc8(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
     const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
     ptrdiff_t offset = tcg_ptr_byte_diff(target, src_rx) - 8;
-    int rot = encode_imm(offset);
+    int imm12 = encode_imm(offset);
 
-    if (rot >= 0) {
-        *src_rw = deposit32(*src_rw, 0, 12, rol32(offset, rot) | (rot << 7));
+    if (imm12 >= 0) {
+        *src_rw = deposit32(*src_rw, 0, 12, imm12);
         return true;
     }
     return false;
@@ -362,33 +362,52 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
     (ALL_GENERAL_REGS & ~((1 << TCG_REG_R0) | (1 << TCG_REG_R1)))
 #endif
 
-static inline uint32_t rotl(uint32_t val, int n)
-{
-  return (val << n) | (val >> (32 - n));
-}
-
-/* ARM immediates for ALU instructions are made of an unsigned 8-bit
-   right-rotated by an even amount between 0 and 30. */
+/*
+ * ARM immediates for ALU instructions are made of an unsigned 8-bit
+ * right-rotated by an even amount between 0 and 30.
+ *
+ * Return < 0 if @imm cannot be encoded, else the entire imm12 field.
+ */
 static int encode_imm(uint32_t imm)
 {
-    int shift;
+    uint32_t rot, imm8;
 
-    /* simple case, only lower bits */
-    if ((imm & ~0xff) == 0)
-        return 0;
-    /* then try a simple even shift */
-    shift = ctz32(imm) & ~1;
-    if (((imm >> shift) & ~0xff) == 0)
-        return 32 - shift;
-    /* now try harder with rotations */
-    if ((rotl(imm, 2) & ~0xff) == 0)
-        return 2;
-    if ((rotl(imm, 4) & ~0xff) == 0)
-        return 4;
-    if ((rotl(imm, 6) & ~0xff) == 0)
-        return 6;
-    /* imm can't be encoded */
+    /* Simple case, no rotation required. */
+    if ((imm & ~0xff) == 0) {
+        return imm;
+    }
+
+    /* Next, try a simple even shift.  */
+    rot = ctz32(imm) & ~1;
+    imm8 = imm >> rot;
+    rot = 32 - rot;
+    if ((imm8 & ~0xff) == 0) {
+        goto found;
+    }
+
+    /*
+     * Finally, try harder with rotations.
+     * The ctz test above will have taken care of rotates >= 8.
+     */
+    for (rot = 2; rot < 8; rot += 2) {
+        imm8 = rol32(imm, rot);
+        if ((imm8 & ~0xff) == 0) {
+            goto found;
+        }
+    }
+    /* Fail: imm cannot be encoded. */
     return -1;
+
+ found:
+    /* Note that rot is even, and we discard bit 0 by shifting by 7. */
+    return rot << 7 | imm8;
+}
+
+static int encode_imm_nofail(uint32_t imm)
+{
+    int ret = encode_imm(imm);
+    tcg_debug_assert(ret >= 0);
+    return ret;
 }
 
 static inline int check_fit_imm(uint32_t imm)
@@ -775,20 +794,18 @@ static void tcg_out_movi_pool(TCGContext *s, int cond, int rd, uint32_t arg)
 
 static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
 {
-    int rot, diff, opc, sh1, sh2;
+    int imm12, diff, opc, sh1, sh2;
     uint32_t tt0, tt1, tt2;
 
     /* Check a single MOV/MVN before anything else.  */
-    rot = encode_imm(arg);
-    if (rot >= 0) {
-        tcg_out_dat_imm(s, cond, ARITH_MOV, rd, 0,
-                        rotl(arg, rot) | (rot << 7));
+    imm12 = encode_imm(arg);
+    if (imm12 >= 0) {
+        tcg_out_dat_imm(s, cond, ARITH_MOV, rd, 0, imm12);
         return;
     }
-    rot = encode_imm(~arg);
-    if (rot >= 0) {
-        tcg_out_dat_imm(s, cond, ARITH_MVN, rd, 0,
-                        rotl(~arg, rot) | (rot << 7));
+    imm12 = encode_imm(~arg);
+    if (imm12 >= 0) {
+        tcg_out_dat_imm(s, cond, ARITH_MVN, rd, 0, imm12);
         return;
     }
 
@@ -796,17 +813,15 @@ static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
        or within the TB, which is immediately before the code block.  */
     diff = tcg_pcrel_diff(s, (void *)arg) - 8;
     if (diff >= 0) {
-        rot = encode_imm(diff);
-        if (rot >= 0) {
-            tcg_out_dat_imm(s, cond, ARITH_ADD, rd, TCG_REG_PC,
-                            rotl(diff, rot) | (rot << 7));
+        imm12 = encode_imm(diff);
+        if (imm12 >= 0) {
+            tcg_out_dat_imm(s, cond, ARITH_ADD, rd, TCG_REG_PC, imm12);
             return;
         }
     } else {
-        rot = encode_imm(-diff);
-        if (rot >= 0) {
-            tcg_out_dat_imm(s, cond, ARITH_SUB, rd, TCG_REG_PC,
-                            rotl(-diff, rot) | (rot << 7));
+        imm12 = encode_imm(-diff);
+        if (imm12 >= 0) {
+            tcg_out_dat_imm(s, cond, ARITH_SUB, rd, TCG_REG_PC, imm12);
             return;
         }
     }
@@ -838,6 +853,8 @@ static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
     sh2 = ctz32(tt1) & ~1;
     tt2 = tt1 & ~(0xff << sh2);
     if (tt2 == 0) {
+        int rot;
+
         rot = ((32 - sh1) << 7) & 0xf00;
         tcg_out_dat_imm(s, cond, opc, rd,  0, ((tt0 >> sh1) & 0xff) | rot);
         rot = ((32 - sh2) << 7) & 0xf00;
@@ -850,37 +867,35 @@ static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
     tcg_out_movi_pool(s, cond, rd, arg);
 }
 
+/*
+ * Emit either the reg,imm or reg,reg form of a data-processing insn.
+ * rhs must satisfy the "rI" constraint.
+ */
 static inline void tcg_out_dat_rI(TCGContext *s, int cond, int opc, TCGArg dst,
                                   TCGArg lhs, TCGArg rhs, int rhs_is_const)
 {
-    /* Emit either the reg,imm or reg,reg form of a data-processing insn.
-     * rhs must satisfy the "rI" constraint.
-     */
     if (rhs_is_const) {
-        int rot = encode_imm(rhs);
-        tcg_debug_assert(rot >= 0);
-        tcg_out_dat_imm(s, cond, opc, dst, lhs, rotl(rhs, rot) | (rot << 7));
+        tcg_out_dat_imm(s, cond, opc, dst, lhs, encode_imm_nofail(rhs));
     } else {
         tcg_out_dat_reg(s, cond, opc, dst, lhs, rhs, SHIFT_IMM_LSL(0));
     }
 }
 
+/*
+ * Emit either the reg,imm or reg,reg form of a data-processing insn.
+ * rhs must satisfy the "rIK" constraint.
+ */
 static void tcg_out_dat_rIK(TCGContext *s, int cond, int opc, int opinv,
                             TCGReg dst, TCGReg lhs, TCGArg rhs,
                             bool rhs_is_const)
 {
-    /* Emit either the reg,imm or reg,reg form of a data-processing insn.
-     * rhs must satisfy the "rIK" constraint.
-     */
     if (rhs_is_const) {
-        int rot = encode_imm(rhs);
-        if (rot < 0) {
-            rhs = ~rhs;
-            rot = encode_imm(rhs);
-            tcg_debug_assert(rot >= 0);
+        int imm12 = encode_imm(rhs);
+        if (imm12 < 0) {
+            imm12 = encode_imm_nofail(~rhs);
             opc = opinv;
         }
-        tcg_out_dat_imm(s, cond, opc, dst, lhs, rotl(rhs, rot) | (rot << 7));
+        tcg_out_dat_imm(s, cond, opc, dst, lhs, imm12);
     } else {
         tcg_out_dat_reg(s, cond, opc, dst, lhs, rhs, SHIFT_IMM_LSL(0));
     }
@@ -894,14 +909,12 @@ static void tcg_out_dat_rIN(TCGContext *s, int cond, int opc, int opneg,
      * rhs must satisfy the "rIN" constraint.
      */
     if (rhs_is_const) {
-        int rot = encode_imm(rhs);
-        if (rot < 0) {
-            rhs = -rhs;
-            rot = encode_imm(rhs);
-            tcg_debug_assert(rot >= 0);
+        int imm12 = encode_imm(rhs);
+        if (imm12 < 0) {
+            imm12 = encode_imm_nofail(-rhs);
             opc = opneg;
         }
-        tcg_out_dat_imm(s, cond, opc, dst, lhs, rotl(rhs, rot) | (rot << 7));
+        tcg_out_dat_imm(s, cond, opc, dst, lhs, imm12);
     } else {
         tcg_out_dat_reg(s, cond, opc, dst, lhs, rhs, SHIFT_IMM_LSL(0));
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 09/14] tcg/arm: Drop inline markers
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (7 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-18 22:02   ` Philippe Mathieu-Daudé
  2021-08-18 21:29 ` [PATCH v3 10/14] tcg/arm: Give enum arm_cond_code_e a typedef and use it Richard Henderson
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Let the compiler decide about inlining.
Remove tcg_out_nop as unused.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 234 +++++++++++++++++++--------------------
 1 file changed, 114 insertions(+), 120 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 265370b2ee..327032f0df 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -410,7 +410,7 @@ static int encode_imm_nofail(uint32_t imm)
     return ret;
 }
 
-static inline int check_fit_imm(uint32_t imm)
+static bool check_fit_imm(uint32_t imm)
 {
     return encode_imm(imm) >= 0;
 }
@@ -540,42 +540,37 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
     return 0;
 }
 
-static inline void tcg_out_b_imm(TCGContext *s, int cond, int32_t offset)
+static void tcg_out_b_imm(TCGContext *s, int cond, int32_t offset)
 {
     tcg_out32(s, (cond << 28) | 0x0a000000 |
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static inline void tcg_out_bl_imm(TCGContext *s, int cond, int32_t offset)
+static void tcg_out_bl_imm(TCGContext *s, int cond, int32_t offset)
 {
     tcg_out32(s, (cond << 28) | 0x0b000000 |
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static inline void tcg_out_blx_reg(TCGContext *s, int cond, int rn)
+static void tcg_out_blx_reg(TCGContext *s, int cond, int rn)
 {
     tcg_out32(s, (cond << 28) | 0x012fff30 | rn);
 }
 
-static inline void tcg_out_blx_imm(TCGContext *s, int32_t offset)
+static void tcg_out_blx_imm(TCGContext *s, int32_t offset)
 {
     tcg_out32(s, 0xfa000000 | ((offset & 2) << 23) |
                 (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static inline void tcg_out_dat_reg(TCGContext *s,
+static void tcg_out_dat_reg(TCGContext *s,
                 int cond, int opc, int rd, int rn, int rm, int shift)
 {
     tcg_out32(s, (cond << 28) | (0 << 25) | opc |
                     (rn << 16) | (rd << 12) | shift | rm);
 }
 
-static inline void tcg_out_nop(TCGContext *s)
-{
-    tcg_out32(s, INSN_NOP);
-}
-
-static inline void tcg_out_mov_reg(TCGContext *s, int cond, int rd, int rm)
+static void tcg_out_mov_reg(TCGContext *s, int cond, int rd, int rm)
 {
     /* Simple reg-reg move, optimising out the 'do nothing' case */
     if (rd != rm) {
@@ -601,8 +596,8 @@ static void tcg_out_b_reg(TCGContext *s, int cond, TCGReg rn)
     }
 }
 
-static inline void tcg_out_dat_imm(TCGContext *s,
-                int cond, int opc, int rd, int rn, int im)
+static void tcg_out_dat_imm(TCGContext *s, int cond, int opc,
+                            int rd, int rn, int im)
 {
     tcg_out32(s, (cond << 28) | (1 << 25) | opc |
                     (rn << 16) | (rd << 12) | im);
@@ -647,141 +642,141 @@ static void tcg_out_memop_12(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
               (rn << 16) | (rt << 12) | imm12);
 }
 
-static inline void tcg_out_ld32_12(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, int imm12)
+static void tcg_out_ld32_12(TCGContext *s, int cond, TCGReg rt,
+                            TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_LDR_IMM, rt, rn, imm12, 1, 0);
 }
 
-static inline void tcg_out_st32_12(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, int imm12)
+static void tcg_out_st32_12(TCGContext *s, int cond, TCGReg rt,
+                            TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_STR_IMM, rt, rn, imm12, 1, 0);
 }
 
-static inline void tcg_out_ld32_r(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, TCGReg rm)
+static void tcg_out_ld32_r(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDR_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_st32_r(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, TCGReg rm)
+static void tcg_out_st32_r(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STR_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_ldrd_8(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, int imm8)
+static void tcg_out_ldrd_8(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRD_IMM, rt, rn, imm8, 1, 0);
 }
 
-static inline void tcg_out_ldrd_r(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, TCGReg rm)
+static void tcg_out_ldrd_r(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_ldrd_rwb(TCGContext *s, int cond, TCGReg rt,
-                                    TCGReg rn, TCGReg rm)
+static void __attribute__((unused))
+tcg_out_ldrd_rwb(TCGContext *s, int cond, TCGReg rt, TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 1);
 }
 
-static inline void tcg_out_strd_8(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, int imm8)
+static void tcg_out_strd_8(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_STRD_IMM, rt, rn, imm8, 1, 0);
 }
 
-static inline void tcg_out_strd_r(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, TCGReg rm)
+static void tcg_out_strd_r(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STRD_REG, rt, rn, rm, 1, 1, 0);
 }
 
 /* Register pre-increment with base writeback.  */
-static inline void tcg_out_ld32_rwb(TCGContext *s, int cond, TCGReg rt,
-                                    TCGReg rn, TCGReg rm)
+static void tcg_out_ld32_rwb(TCGContext *s, int cond, TCGReg rt,
+                             TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDR_REG, rt, rn, rm, 1, 1, 1);
 }
 
-static inline void tcg_out_st32_rwb(TCGContext *s, int cond, TCGReg rt,
-                                    TCGReg rn, TCGReg rm)
+static void tcg_out_st32_rwb(TCGContext *s, int cond, TCGReg rt,
+                             TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STR_REG, rt, rn, rm, 1, 1, 1);
 }
 
-static inline void tcg_out_ld16u_8(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, int imm8)
+static void tcg_out_ld16u_8(TCGContext *s, int cond, TCGReg rt,
+                            TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRH_IMM, rt, rn, imm8, 1, 0);
 }
 
-static inline void tcg_out_st16_8(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, int imm8)
+static void tcg_out_st16_8(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_STRH_IMM, rt, rn, imm8, 1, 0);
 }
 
-static inline void tcg_out_ld16u_r(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, TCGReg rm)
+static void tcg_out_ld16u_r(TCGContext *s, int cond, TCGReg rt,
+                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRH_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_st16_r(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, TCGReg rm)
+static void tcg_out_st16_r(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STRH_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_ld16s_8(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, int imm8)
+static void tcg_out_ld16s_8(TCGContext *s, int cond, TCGReg rt,
+                            TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRSH_IMM, rt, rn, imm8, 1, 0);
 }
 
-static inline void tcg_out_ld16s_r(TCGContext *s, int cond, TCGReg rt,
-                                   TCGReg rn, TCGReg rm)
+static void tcg_out_ld16s_r(TCGContext *s, int cond, TCGReg rt,
+                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRSH_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_ld8_12(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, int imm12)
+static void tcg_out_ld8_12(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_LDRB_IMM, rt, rn, imm12, 1, 0);
 }
 
-static inline void tcg_out_st8_12(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, int imm12)
+static void tcg_out_st8_12(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_STRB_IMM, rt, rn, imm12, 1, 0);
 }
 
-static inline void tcg_out_ld8_r(TCGContext *s, int cond, TCGReg rt,
-                                 TCGReg rn, TCGReg rm)
+static void tcg_out_ld8_r(TCGContext *s, int cond, TCGReg rt,
+                          TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRB_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_st8_r(TCGContext *s, int cond, TCGReg rt,
-                                 TCGReg rn, TCGReg rm)
+static void tcg_out_st8_r(TCGContext *s, int cond, TCGReg rt,
+                          TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STRB_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static inline void tcg_out_ld8s_8(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, int imm8)
+static void tcg_out_ld8s_8(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRSB_IMM, rt, rn, imm8, 1, 0);
 }
 
-static inline void tcg_out_ld8s_r(TCGContext *s, int cond, TCGReg rt,
-                                  TCGReg rn, TCGReg rm)
+static void tcg_out_ld8s_r(TCGContext *s, int cond, TCGReg rt,
+                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRSB_REG, rt, rn, rm, 1, 1, 0);
 }
@@ -871,8 +866,8 @@ static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
  * Emit either the reg,imm or reg,reg form of a data-processing insn.
  * rhs must satisfy the "rI" constraint.
  */
-static inline void tcg_out_dat_rI(TCGContext *s, int cond, int opc, TCGArg dst,
-                                  TCGArg lhs, TCGArg rhs, int rhs_is_const)
+static void tcg_out_dat_rI(TCGContext *s, int cond, int opc, TCGArg dst,
+                           TCGArg lhs, TCGArg rhs, int rhs_is_const)
 {
     if (rhs_is_const) {
         tcg_out_dat_imm(s, cond, opc, dst, lhs, encode_imm_nofail(rhs));
@@ -920,8 +915,8 @@ static void tcg_out_dat_rIN(TCGContext *s, int cond, int opc, int opneg,
     }
 }
 
-static inline void tcg_out_mul32(TCGContext *s, int cond, TCGReg rd,
-                                 TCGReg rn, TCGReg rm)
+static void tcg_out_mul32(TCGContext *s, int cond, TCGReg rd,
+                          TCGReg rn, TCGReg rm)
 {
     /* if ArchVersion() < 6 && d == n then UNPREDICTABLE;  */
     if (!use_armv6_instructions && rd == rn) {
@@ -938,8 +933,8 @@ static inline void tcg_out_mul32(TCGContext *s, int cond, TCGReg rd,
     tcg_out32(s, (cond << 28) | 0x90 | (rd << 16) | (rm << 8) | rn);
 }
 
-static inline void tcg_out_umull32(TCGContext *s, int cond, TCGReg rd0,
-                                   TCGReg rd1, TCGReg rn, TCGReg rm)
+static void tcg_out_umull32(TCGContext *s, int cond, TCGReg rd0,
+                            TCGReg rd1, TCGReg rn, TCGReg rm)
 {
     /* if ArchVersion() < 6 && (dHi == n || dLo == n) then UNPREDICTABLE;  */
     if (!use_armv6_instructions && (rd0 == rn || rd1 == rn)) {
@@ -957,8 +952,8 @@ static inline void tcg_out_umull32(TCGContext *s, int cond, TCGReg rd0,
               (rd1 << 16) | (rd0 << 12) | (rm << 8) | rn);
 }
 
-static inline void tcg_out_smull32(TCGContext *s, int cond, TCGReg rd0,
-                                   TCGReg rd1, TCGReg rn, TCGReg rm)
+static void tcg_out_smull32(TCGContext *s, int cond, TCGReg rd0,
+                            TCGReg rd1, TCGReg rn, TCGReg rm)
 {
     /* if ArchVersion() < 6 && (dHi == n || dLo == n) then UNPREDICTABLE;  */
     if (!use_armv6_instructions && (rd0 == rn || rd1 == rn)) {
@@ -976,18 +971,17 @@ static inline void tcg_out_smull32(TCGContext *s, int cond, TCGReg rd0,
               (rd1 << 16) | (rd0 << 12) | (rm << 8) | rn);
 }
 
-static inline void tcg_out_sdiv(TCGContext *s, int cond, int rd, int rn, int rm)
+static void tcg_out_sdiv(TCGContext *s, int cond, int rd, int rn, int rm)
 {
     tcg_out32(s, 0x0710f010 | (cond << 28) | (rd << 16) | rn | (rm << 8));
 }
 
-static inline void tcg_out_udiv(TCGContext *s, int cond, int rd, int rn, int rm)
+static void tcg_out_udiv(TCGContext *s, int cond, int rd, int rn, int rm)
 {
     tcg_out32(s, 0x0730f010 | (cond << 28) | (rd << 16) | rn | (rm << 8));
 }
 
-static inline void tcg_out_ext8s(TCGContext *s, int cond,
-                                 int rd, int rn)
+static void tcg_out_ext8s(TCGContext *s, int cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* sxtb */
@@ -1000,14 +994,13 @@ static inline void tcg_out_ext8s(TCGContext *s, int cond,
     }
 }
 
-static inline void tcg_out_ext8u(TCGContext *s, int cond,
-                                 int rd, int rn)
+static void __attribute__((unused))
+tcg_out_ext8u(TCGContext *s, int cond, int rd, int rn)
 {
     tcg_out_dat_imm(s, cond, ARITH_AND, rd, rn, 0xff);
 }
 
-static inline void tcg_out_ext16s(TCGContext *s, int cond,
-                                  int rd, int rn)
+static void tcg_out_ext16s(TCGContext *s, int cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* sxth */
@@ -1020,8 +1013,7 @@ static inline void tcg_out_ext16s(TCGContext *s, int cond,
     }
 }
 
-static inline void tcg_out_ext16u(TCGContext *s, int cond,
-                                  int rd, int rn)
+static void tcg_out_ext16u(TCGContext *s, int cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* uxth */
@@ -1101,7 +1093,7 @@ static void tcg_out_bswap16(TCGContext *s, int cond, int rd, int rn, int flags)
                      ? SHIFT_IMM_ASR(8) : SHIFT_IMM_LSR(8)));
 }
 
-static inline void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
+static void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* rev */
@@ -1118,8 +1110,8 @@ static inline void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
     }
 }
 
-static inline void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
-                                   TCGArg a1, int ofs, int len, bool const_a1)
+static void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
+                            TCGArg a1, int ofs, int len, bool const_a1)
 {
     if (const_a1) {
         /* bfi becomes bfc with rn == 15.  */
@@ -1130,24 +1122,24 @@ static inline void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
               | (ofs << 7) | ((ofs + len - 1) << 16));
 }
 
-static inline void tcg_out_extract(TCGContext *s, int cond, TCGReg rd,
-                                   TCGArg a1, int ofs, int len)
+static void tcg_out_extract(TCGContext *s, int cond, TCGReg rd,
+                            TCGArg a1, int ofs, int len)
 {
     /* ubfx */
     tcg_out32(s, 0x07e00050 | (cond << 28) | (rd << 12) | a1
               | (ofs << 7) | ((len - 1) << 16));
 }
 
-static inline void tcg_out_sextract(TCGContext *s, int cond, TCGReg rd,
-                                    TCGArg a1, int ofs, int len)
+static void tcg_out_sextract(TCGContext *s, int cond, TCGReg rd,
+                             TCGArg a1, int ofs, int len)
 {
     /* sbfx */
     tcg_out32(s, 0x07a00050 | (cond << 28) | (rd << 12) | a1
               | (ofs << 7) | ((len - 1) << 16));
 }
 
-static inline void tcg_out_ld32u(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_ld32u(TCGContext *s, int cond,
+                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1156,8 +1148,8 @@ static inline void tcg_out_ld32u(TCGContext *s, int cond,
         tcg_out_ld32_12(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_st32(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_st32(TCGContext *s, int cond,
+                         int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1166,8 +1158,8 @@ static inline void tcg_out_st32(TCGContext *s, int cond,
         tcg_out_st32_12(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_ld16u(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_ld16u(TCGContext *s, int cond,
+                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1176,8 +1168,8 @@ static inline void tcg_out_ld16u(TCGContext *s, int cond,
         tcg_out_ld16u_8(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_ld16s(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_ld16s(TCGContext *s, int cond,
+                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1186,8 +1178,8 @@ static inline void tcg_out_ld16s(TCGContext *s, int cond,
         tcg_out_ld16s_8(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_st16(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_st16(TCGContext *s, int cond,
+                         int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1196,8 +1188,8 @@ static inline void tcg_out_st16(TCGContext *s, int cond,
         tcg_out_st16_8(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_ld8u(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_ld8u(TCGContext *s, int cond,
+                         int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1206,8 +1198,8 @@ static inline void tcg_out_ld8u(TCGContext *s, int cond,
         tcg_out_ld8_12(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_ld8s(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_ld8s(TCGContext *s, int cond,
+                         int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1216,8 +1208,8 @@ static inline void tcg_out_ld8s(TCGContext *s, int cond,
         tcg_out_ld8s_8(s, cond, rd, rn, offset);
 }
 
-static inline void tcg_out_st8(TCGContext *s, int cond,
-                int rd, int rn, int32_t offset)
+static void tcg_out_st8(TCGContext *s, int cond,
+                        int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1288,7 +1280,7 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *addr)
     }
 }
 
-static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
+static void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
 {
     if (l->has_value) {
         tcg_out_goto(s, cond, l->u.value_ptr);
@@ -1298,7 +1290,7 @@ static inline void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
     }
 }
 
-static inline void tcg_out_mb(TCGContext *s, TCGArg a0)
+static void tcg_out_mb(TCGContext *s, TCGArg a0)
 {
     if (use_armv7_instructions) {
         tcg_out32(s, INSN_DMB_ISH);
@@ -1764,9 +1756,9 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 }
 #endif /* SOFTMMU */
 
-static inline void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
-                                         TCGReg datalo, TCGReg datahi,
-                                         TCGReg addrlo, TCGReg addend)
+static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
+                                  TCGReg datalo, TCGReg datahi,
+                                  TCGReg addrlo, TCGReg addend)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1808,9 +1800,9 @@ static inline void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
     }
 }
 
-static inline void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc,
-                                          TCGReg datalo, TCGReg datahi,
-                                          TCGReg addrlo)
+#ifndef CONFIG_SOFTMMU
+static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
+                                   TCGReg datahi, TCGReg addrlo)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1849,6 +1841,7 @@ static inline void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc,
         g_assert_not_reached();
     }
 }
+#endif
 
 static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 {
@@ -1891,9 +1884,9 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 #endif
 }
 
-static inline void tcg_out_qemu_st_index(TCGContext *s, int cond, MemOp opc,
-                                         TCGReg datalo, TCGReg datahi,
-                                         TCGReg addrlo, TCGReg addend)
+static void tcg_out_qemu_st_index(TCGContext *s, int cond, MemOp opc,
+                                  TCGReg datalo, TCGReg datahi,
+                                  TCGReg addrlo, TCGReg addend)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1924,9 +1917,9 @@ static inline void tcg_out_qemu_st_index(TCGContext *s, int cond, MemOp opc,
     }
 }
 
-static inline void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc,
-                                          TCGReg datalo, TCGReg datahi,
-                                          TCGReg addrlo)
+#ifndef CONFIG_SOFTMMU
+static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
+                                   TCGReg datahi, TCGReg addrlo)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1956,6 +1949,7 @@ static inline void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc,
         g_assert_not_reached();
     }
 }
+#endif
 
 static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 {
@@ -2000,9 +1994,9 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 
 static void tcg_out_epilogue(TCGContext *s);
 
-static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
-                const TCGArg args[TCG_MAX_OP_ARGS],
-                const int const_args[TCG_MAX_OP_ARGS])
+static void tcg_out_op(TCGContext *s, TCGOpcode opc,
+                       const TCGArg args[TCG_MAX_OP_ARGS],
+                       const int const_args[TCG_MAX_OP_ARGS])
 {
     TCGArg a0, a1, a2, a3, a4, a5;
     int c;
@@ -2591,8 +2585,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
     }
 }
 
-static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
-                               TCGReg base, intptr_t ofs)
+static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
+                        TCGReg base, intptr_t ofs)
 {
     return false;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 10/14] tcg/arm: Give enum arm_cond_code_e a typedef and use it
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (8 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 09/14] tcg/arm: Drop inline markers Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-18 22:04   ` Philippe Mathieu-Daudé
  2021-08-18 21:29 ` [PATCH v3 11/14] tcg/arm: More use of the ARMInsn enum Richard Henderson
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 136 +++++++++++++++++++--------------------
 1 file changed, 68 insertions(+), 68 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 327032f0df..b20c313615 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -85,7 +85,7 @@ static const int tcg_target_call_oarg_regs[2] = {
 #define TCG_REG_TMP  TCG_REG_R12
 #define TCG_VEC_TMP  TCG_REG_Q15
 
-enum arm_cond_code_e {
+typedef enum {
     COND_EQ = 0x0,
     COND_NE = 0x1,
     COND_CS = 0x2,	/* Unsigned greater or equal */
@@ -101,7 +101,7 @@ enum arm_cond_code_e {
     COND_GT = 0xc,
     COND_LE = 0xd,
     COND_AL = 0xe,
-};
+} ARMCond;
 
 #define TO_CPSR (1 << 20)
 
@@ -540,19 +540,19 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
     return 0;
 }
 
-static void tcg_out_b_imm(TCGContext *s, int cond, int32_t offset)
+static void tcg_out_b_imm(TCGContext *s, ARMCond cond, int32_t offset)
 {
     tcg_out32(s, (cond << 28) | 0x0a000000 |
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static void tcg_out_bl_imm(TCGContext *s, int cond, int32_t offset)
+static void tcg_out_bl_imm(TCGContext *s, ARMCond cond, int32_t offset)
 {
     tcg_out32(s, (cond << 28) | 0x0b000000 |
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static void tcg_out_blx_reg(TCGContext *s, int cond, int rn)
+static void tcg_out_blx_reg(TCGContext *s, ARMCond cond, int rn)
 {
     tcg_out32(s, (cond << 28) | 0x012fff30 | rn);
 }
@@ -563,14 +563,14 @@ static void tcg_out_blx_imm(TCGContext *s, int32_t offset)
                 (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static void tcg_out_dat_reg(TCGContext *s,
-                int cond, int opc, int rd, int rn, int rm, int shift)
+static void tcg_out_dat_reg(TCGContext *s, ARMCond cond, int opc, int rd,
+                            int rn, int rm, int shift)
 {
     tcg_out32(s, (cond << 28) | (0 << 25) | opc |
                     (rn << 16) | (rd << 12) | shift | rm);
 }
 
-static void tcg_out_mov_reg(TCGContext *s, int cond, int rd, int rm)
+static void tcg_out_mov_reg(TCGContext *s, ARMCond cond, int rd, int rm)
 {
     /* Simple reg-reg move, optimising out the 'do nothing' case */
     if (rd != rm) {
@@ -578,12 +578,12 @@ static void tcg_out_mov_reg(TCGContext *s, int cond, int rd, int rm)
     }
 }
 
-static void tcg_out_bx_reg(TCGContext *s, int cond, TCGReg rn)
+static void tcg_out_bx_reg(TCGContext *s, ARMCond cond, TCGReg rn)
 {
     tcg_out32(s, (cond << 28) | 0x012fff10 | rn);
 }
 
-static void tcg_out_b_reg(TCGContext *s, int cond, TCGReg rn)
+static void tcg_out_b_reg(TCGContext *s, ARMCond cond, TCGReg rn)
 {
     /*
      * Unless the C portion of QEMU is compiled as thumb, we don't need
@@ -596,14 +596,14 @@ static void tcg_out_b_reg(TCGContext *s, int cond, TCGReg rn)
     }
 }
 
-static void tcg_out_dat_imm(TCGContext *s, int cond, int opc,
+static void tcg_out_dat_imm(TCGContext *s, ARMCond cond, int opc,
                             int rd, int rn, int im)
 {
     tcg_out32(s, (cond << 28) | (1 << 25) | opc |
                     (rn << 16) | (rd << 12) | im);
 }
 
-static void tcg_out_ldstm(TCGContext *s, int cond, int opc,
+static void tcg_out_ldstm(TCGContext *s, ARMCond cond, int opc,
                           TCGReg rn, uint16_t mask)
 {
     tcg_out32(s, (cond << 28) | opc | (rn << 16) | mask);
@@ -611,14 +611,14 @@ static void tcg_out_ldstm(TCGContext *s, int cond, int opc,
 
 /* Note that this routine is used for both LDR and LDRH formats, so we do
    not wish to include an immediate shift at this point.  */
-static void tcg_out_memop_r(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
+static void tcg_out_memop_r(TCGContext *s, ARMCond cond, ARMInsn opc, TCGReg rt,
                             TCGReg rn, TCGReg rm, bool u, bool p, bool w)
 {
     tcg_out32(s, (cond << 28) | opc | (u << 23) | (p << 24)
               | (w << 21) | (rn << 16) | (rt << 12) | rm);
 }
 
-static void tcg_out_memop_8(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
+static void tcg_out_memop_8(TCGContext *s, ARMCond cond, ARMInsn opc, TCGReg rt,
                             TCGReg rn, int imm8, bool p, bool w)
 {
     bool u = 1;
@@ -630,7 +630,7 @@ static void tcg_out_memop_8(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
               (rn << 16) | (rt << 12) | ((imm8 & 0xf0) << 4) | (imm8 & 0xf));
 }
 
-static void tcg_out_memop_12(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
+static void tcg_out_memop_12(TCGContext *s, ARMCond cond, ARMInsn opc, TCGReg rt,
                              TCGReg rn, int imm12, bool p, bool w)
 {
     bool u = 1;
@@ -642,152 +642,152 @@ static void tcg_out_memop_12(TCGContext *s, int cond, ARMInsn opc, TCGReg rt,
               (rn << 16) | (rt << 12) | imm12);
 }
 
-static void tcg_out_ld32_12(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld32_12(TCGContext *s, ARMCond cond, TCGReg rt,
                             TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_LDR_IMM, rt, rn, imm12, 1, 0);
 }
 
-static void tcg_out_st32_12(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st32_12(TCGContext *s, ARMCond cond, TCGReg rt,
                             TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_STR_IMM, rt, rn, imm12, 1, 0);
 }
 
-static void tcg_out_ld32_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld32_r(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDR_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_st32_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st32_r(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STR_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_ldrd_8(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ldrd_8(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRD_IMM, rt, rn, imm8, 1, 0);
 }
 
-static void tcg_out_ldrd_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ldrd_r(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 0);
 }
 
 static void __attribute__((unused))
-tcg_out_ldrd_rwb(TCGContext *s, int cond, TCGReg rt, TCGReg rn, TCGReg rm)
+tcg_out_ldrd_rwb(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 1);
 }
 
-static void tcg_out_strd_8(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_STRD_IMM, rt, rn, imm8, 1, 0);
 }
 
-static void tcg_out_strd_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_strd_r(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STRD_REG, rt, rn, rm, 1, 1, 0);
 }
 
 /* Register pre-increment with base writeback.  */
-static void tcg_out_ld32_rwb(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld32_rwb(TCGContext *s, ARMCond cond, TCGReg rt,
                              TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDR_REG, rt, rn, rm, 1, 1, 1);
 }
 
-static void tcg_out_st32_rwb(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st32_rwb(TCGContext *s, ARMCond cond, TCGReg rt,
                              TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STR_REG, rt, rn, rm, 1, 1, 1);
 }
 
-static void tcg_out_ld16u_8(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld16u_8(TCGContext *s, ARMCond cond, TCGReg rt,
                             TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRH_IMM, rt, rn, imm8, 1, 0);
 }
 
-static void tcg_out_st16_8(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st16_8(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_STRH_IMM, rt, rn, imm8, 1, 0);
 }
 
-static void tcg_out_ld16u_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld16u_r(TCGContext *s, ARMCond cond, TCGReg rt,
                             TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRH_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_st16_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st16_r(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STRH_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_ld16s_8(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld16s_8(TCGContext *s, ARMCond cond, TCGReg rt,
                             TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRSH_IMM, rt, rn, imm8, 1, 0);
 }
 
-static void tcg_out_ld16s_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld16s_r(TCGContext *s, ARMCond cond, TCGReg rt,
                             TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRSH_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_ld8_12(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld8_12(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_LDRB_IMM, rt, rn, imm12, 1, 0);
 }
 
-static void tcg_out_st8_12(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st8_12(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, int imm12)
 {
     tcg_out_memop_12(s, cond, INSN_STRB_IMM, rt, rn, imm12, 1, 0);
 }
 
-static void tcg_out_ld8_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld8_r(TCGContext *s, ARMCond cond, TCGReg rt,
                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRB_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_st8_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_st8_r(TCGContext *s, ARMCond cond, TCGReg rt,
                           TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_STRB_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_ld8s_8(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld8s_8(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, int imm8)
 {
     tcg_out_memop_8(s, cond, INSN_LDRSB_IMM, rt, rn, imm8, 1, 0);
 }
 
-static void tcg_out_ld8s_r(TCGContext *s, int cond, TCGReg rt,
+static void tcg_out_ld8s_r(TCGContext *s, ARMCond cond, TCGReg rt,
                            TCGReg rn, TCGReg rm)
 {
     tcg_out_memop_r(s, cond, INSN_LDRSB_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_movi_pool(TCGContext *s, int cond, int rd, uint32_t arg)
+static void tcg_out_movi_pool(TCGContext *s, ARMCond cond, int rd, uint32_t arg)
 {
     new_pool_label(s, arg, R_ARM_PC13, s->code_ptr, 0);
     tcg_out_ld32_12(s, cond, rd, TCG_REG_PC, 0);
 }
 
-static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
+static void tcg_out_movi32(TCGContext *s, ARMCond cond, int rd, uint32_t arg)
 {
     int imm12, diff, opc, sh1, sh2;
     uint32_t tt0, tt1, tt2;
@@ -866,7 +866,7 @@ static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
  * Emit either the reg,imm or reg,reg form of a data-processing insn.
  * rhs must satisfy the "rI" constraint.
  */
-static void tcg_out_dat_rI(TCGContext *s, int cond, int opc, TCGArg dst,
+static void tcg_out_dat_rI(TCGContext *s, ARMCond cond, int opc, TCGArg dst,
                            TCGArg lhs, TCGArg rhs, int rhs_is_const)
 {
     if (rhs_is_const) {
@@ -880,7 +880,7 @@ static void tcg_out_dat_rI(TCGContext *s, int cond, int opc, TCGArg dst,
  * Emit either the reg,imm or reg,reg form of a data-processing insn.
  * rhs must satisfy the "rIK" constraint.
  */
-static void tcg_out_dat_rIK(TCGContext *s, int cond, int opc, int opinv,
+static void tcg_out_dat_rIK(TCGContext *s, ARMCond cond, int opc, int opinv,
                             TCGReg dst, TCGReg lhs, TCGArg rhs,
                             bool rhs_is_const)
 {
@@ -896,7 +896,7 @@ static void tcg_out_dat_rIK(TCGContext *s, int cond, int opc, int opinv,
     }
 }
 
-static void tcg_out_dat_rIN(TCGContext *s, int cond, int opc, int opneg,
+static void tcg_out_dat_rIN(TCGContext *s, ARMCond cond, int opc, int opneg,
                             TCGArg dst, TCGArg lhs, TCGArg rhs,
                             bool rhs_is_const)
 {
@@ -915,7 +915,7 @@ static void tcg_out_dat_rIN(TCGContext *s, int cond, int opc, int opneg,
     }
 }
 
-static void tcg_out_mul32(TCGContext *s, int cond, TCGReg rd,
+static void tcg_out_mul32(TCGContext *s, ARMCond cond, TCGReg rd,
                           TCGReg rn, TCGReg rm)
 {
     /* if ArchVersion() < 6 && d == n then UNPREDICTABLE;  */
@@ -933,7 +933,7 @@ static void tcg_out_mul32(TCGContext *s, int cond, TCGReg rd,
     tcg_out32(s, (cond << 28) | 0x90 | (rd << 16) | (rm << 8) | rn);
 }
 
-static void tcg_out_umull32(TCGContext *s, int cond, TCGReg rd0,
+static void tcg_out_umull32(TCGContext *s, ARMCond cond, TCGReg rd0,
                             TCGReg rd1, TCGReg rn, TCGReg rm)
 {
     /* if ArchVersion() < 6 && (dHi == n || dLo == n) then UNPREDICTABLE;  */
@@ -952,7 +952,7 @@ static void tcg_out_umull32(TCGContext *s, int cond, TCGReg rd0,
               (rd1 << 16) | (rd0 << 12) | (rm << 8) | rn);
 }
 
-static void tcg_out_smull32(TCGContext *s, int cond, TCGReg rd0,
+static void tcg_out_smull32(TCGContext *s, ARMCond cond, TCGReg rd0,
                             TCGReg rd1, TCGReg rn, TCGReg rm)
 {
     /* if ArchVersion() < 6 && (dHi == n || dLo == n) then UNPREDICTABLE;  */
@@ -971,17 +971,17 @@ static void tcg_out_smull32(TCGContext *s, int cond, TCGReg rd0,
               (rd1 << 16) | (rd0 << 12) | (rm << 8) | rn);
 }
 
-static void tcg_out_sdiv(TCGContext *s, int cond, int rd, int rn, int rm)
+static void tcg_out_sdiv(TCGContext *s, ARMCond cond, int rd, int rn, int rm)
 {
     tcg_out32(s, 0x0710f010 | (cond << 28) | (rd << 16) | rn | (rm << 8));
 }
 
-static void tcg_out_udiv(TCGContext *s, int cond, int rd, int rn, int rm)
+static void tcg_out_udiv(TCGContext *s, ARMCond cond, int rd, int rn, int rm)
 {
     tcg_out32(s, 0x0730f010 | (cond << 28) | (rd << 16) | rn | (rm << 8));
 }
 
-static void tcg_out_ext8s(TCGContext *s, int cond, int rd, int rn)
+static void tcg_out_ext8s(TCGContext *s, ARMCond cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* sxtb */
@@ -995,12 +995,12 @@ static void tcg_out_ext8s(TCGContext *s, int cond, int rd, int rn)
 }
 
 static void __attribute__((unused))
-tcg_out_ext8u(TCGContext *s, int cond, int rd, int rn)
+tcg_out_ext8u(TCGContext *s, ARMCond cond, int rd, int rn)
 {
     tcg_out_dat_imm(s, cond, ARITH_AND, rd, rn, 0xff);
 }
 
-static void tcg_out_ext16s(TCGContext *s, int cond, int rd, int rn)
+static void tcg_out_ext16s(TCGContext *s, ARMCond cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* sxth */
@@ -1013,7 +1013,7 @@ static void tcg_out_ext16s(TCGContext *s, int cond, int rd, int rn)
     }
 }
 
-static void tcg_out_ext16u(TCGContext *s, int cond, int rd, int rn)
+static void tcg_out_ext16u(TCGContext *s, ARMCond cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* uxth */
@@ -1026,7 +1026,7 @@ static void tcg_out_ext16u(TCGContext *s, int cond, int rd, int rn)
     }
 }
 
-static void tcg_out_bswap16(TCGContext *s, int cond, int rd, int rn, int flags)
+static void tcg_out_bswap16(TCGContext *s, ARMCond cond, int rd, int rn, int flags)
 {
     if (use_armv6_instructions) {
         if (flags & TCG_BSWAP_OS) {
@@ -1093,7 +1093,7 @@ static void tcg_out_bswap16(TCGContext *s, int cond, int rd, int rn, int flags)
                      ? SHIFT_IMM_ASR(8) : SHIFT_IMM_LSR(8)));
 }
 
-static void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
+static void tcg_out_bswap32(TCGContext *s, ARMCond cond, int rd, int rn)
 {
     if (use_armv6_instructions) {
         /* rev */
@@ -1110,7 +1110,7 @@ static void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
     }
 }
 
-static void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
+static void tcg_out_deposit(TCGContext *s, ARMCond cond, TCGReg rd,
                             TCGArg a1, int ofs, int len, bool const_a1)
 {
     if (const_a1) {
@@ -1122,7 +1122,7 @@ static void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
               | (ofs << 7) | ((ofs + len - 1) << 16));
 }
 
-static void tcg_out_extract(TCGContext *s, int cond, TCGReg rd,
+static void tcg_out_extract(TCGContext *s, ARMCond cond, TCGReg rd,
                             TCGArg a1, int ofs, int len)
 {
     /* ubfx */
@@ -1130,7 +1130,7 @@ static void tcg_out_extract(TCGContext *s, int cond, TCGReg rd,
               | (ofs << 7) | ((len - 1) << 16));
 }
 
-static void tcg_out_sextract(TCGContext *s, int cond, TCGReg rd,
+static void tcg_out_sextract(TCGContext *s, ARMCond cond, TCGReg rd,
                              TCGArg a1, int ofs, int len)
 {
     /* sbfx */
@@ -1138,7 +1138,7 @@ static void tcg_out_sextract(TCGContext *s, int cond, TCGReg rd,
               | (ofs << 7) | ((len - 1) << 16));
 }
 
-static void tcg_out_ld32u(TCGContext *s, int cond,
+static void tcg_out_ld32u(TCGContext *s, ARMCond cond,
                           int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
@@ -1148,7 +1148,7 @@ static void tcg_out_ld32u(TCGContext *s, int cond,
         tcg_out_ld32_12(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_st32(TCGContext *s, int cond,
+static void tcg_out_st32(TCGContext *s, ARMCond cond,
                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
@@ -1158,7 +1158,7 @@ static void tcg_out_st32(TCGContext *s, int cond,
         tcg_out_st32_12(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_ld16u(TCGContext *s, int cond,
+static void tcg_out_ld16u(TCGContext *s, ARMCond cond,
                           int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
@@ -1168,7 +1168,7 @@ static void tcg_out_ld16u(TCGContext *s, int cond,
         tcg_out_ld16u_8(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_ld16s(TCGContext *s, int cond,
+static void tcg_out_ld16s(TCGContext *s, ARMCond cond,
                           int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
@@ -1178,7 +1178,7 @@ static void tcg_out_ld16s(TCGContext *s, int cond,
         tcg_out_ld16s_8(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_st16(TCGContext *s, int cond,
+static void tcg_out_st16(TCGContext *s, ARMCond cond,
                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
@@ -1188,7 +1188,7 @@ static void tcg_out_st16(TCGContext *s, int cond,
         tcg_out_st16_8(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_ld8u(TCGContext *s, int cond,
+static void tcg_out_ld8u(TCGContext *s, ARMCond cond,
                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
@@ -1198,7 +1198,7 @@ static void tcg_out_ld8u(TCGContext *s, int cond,
         tcg_out_ld8_12(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_ld8s(TCGContext *s, int cond,
+static void tcg_out_ld8s(TCGContext *s, ARMCond cond,
                          int rd, int rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
@@ -1208,7 +1208,7 @@ static void tcg_out_ld8s(TCGContext *s, int cond,
         tcg_out_ld8s_8(s, cond, rd, rn, offset);
 }
 
-static void tcg_out_st8(TCGContext *s, int cond,
+static void tcg_out_st8(TCGContext *s, ARMCond cond,
                         int rd, int rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
@@ -1223,7 +1223,7 @@ static void tcg_out_st8(TCGContext *s, int cond,
  * with the code buffer limited to 16MB we wouldn't need the long case.
  * But we also use it for the tail-call to the qemu_ld/st helpers, which does.
  */
-static void tcg_out_goto(TCGContext *s, int cond, const tcg_insn_unit *addr)
+static void tcg_out_goto(TCGContext *s, ARMCond cond, const tcg_insn_unit *addr)
 {
     intptr_t addri = (intptr_t)addr;
     ptrdiff_t disp = tcg_pcrel_diff(s, addr);
@@ -1280,7 +1280,7 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *addr)
     }
 }
 
-static void tcg_out_goto_label(TCGContext *s, int cond, TCGLabel *l)
+static void tcg_out_goto_label(TCGContext *s, ARMCond cond, TCGLabel *l)
 {
     if (l->has_value) {
         tcg_out_goto(s, cond, l->u.value_ptr);
@@ -1884,7 +1884,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 #endif
 }
 
-static void tcg_out_qemu_st_index(TCGContext *s, int cond, MemOp opc,
+static void tcg_out_qemu_st_index(TCGContext *s, ARMCond cond, MemOp opc,
                                   TCGReg datalo, TCGReg datahi,
                                   TCGReg addrlo, TCGReg addend)
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 11/14] tcg/arm: More use of the ARMInsn enum
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (9 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 10/14] tcg/arm: Give enum arm_cond_code_e a typedef and use it Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-18 22:04   ` Philippe Mathieu-Daudé
  2021-08-18 21:29 ` [PATCH v3 12/14] tcg/arm: More use of the TCGReg enum Richard Henderson
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index b20c313615..2f55b94ada 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -563,7 +563,7 @@ static void tcg_out_blx_imm(TCGContext *s, int32_t offset)
                 (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static void tcg_out_dat_reg(TCGContext *s, ARMCond cond, int opc, int rd,
+static void tcg_out_dat_reg(TCGContext *s, ARMCond cond, ARMInsn opc, int rd,
                             int rn, int rm, int shift)
 {
     tcg_out32(s, (cond << 28) | (0 << 25) | opc |
@@ -596,14 +596,14 @@ static void tcg_out_b_reg(TCGContext *s, ARMCond cond, TCGReg rn)
     }
 }
 
-static void tcg_out_dat_imm(TCGContext *s, ARMCond cond, int opc,
+static void tcg_out_dat_imm(TCGContext *s, ARMCond cond, ARMInsn opc,
                             int rd, int rn, int im)
 {
     tcg_out32(s, (cond << 28) | (1 << 25) | opc |
                     (rn << 16) | (rd << 12) | im);
 }
 
-static void tcg_out_ldstm(TCGContext *s, ARMCond cond, int opc,
+static void tcg_out_ldstm(TCGContext *s, ARMCond cond, ARMInsn opc,
                           TCGReg rn, uint16_t mask)
 {
     tcg_out32(s, (cond << 28) | opc | (rn << 16) | mask);
@@ -630,8 +630,8 @@ static void tcg_out_memop_8(TCGContext *s, ARMCond cond, ARMInsn opc, TCGReg rt,
               (rn << 16) | (rt << 12) | ((imm8 & 0xf0) << 4) | (imm8 & 0xf));
 }
 
-static void tcg_out_memop_12(TCGContext *s, ARMCond cond, ARMInsn opc, TCGReg rt,
-                             TCGReg rn, int imm12, bool p, bool w)
+static void tcg_out_memop_12(TCGContext *s, ARMCond cond, ARMInsn opc,
+                             TCGReg rt, TCGReg rn, int imm12, bool p, bool w)
 {
     bool u = 1;
     if (imm12 < 0) {
@@ -866,7 +866,7 @@ static void tcg_out_movi32(TCGContext *s, ARMCond cond, int rd, uint32_t arg)
  * Emit either the reg,imm or reg,reg form of a data-processing insn.
  * rhs must satisfy the "rI" constraint.
  */
-static void tcg_out_dat_rI(TCGContext *s, ARMCond cond, int opc, TCGArg dst,
+static void tcg_out_dat_rI(TCGContext *s, ARMCond cond, ARMInsn opc, TCGArg dst,
                            TCGArg lhs, TCGArg rhs, int rhs_is_const)
 {
     if (rhs_is_const) {
@@ -880,8 +880,8 @@ static void tcg_out_dat_rI(TCGContext *s, ARMCond cond, int opc, TCGArg dst,
  * Emit either the reg,imm or reg,reg form of a data-processing insn.
  * rhs must satisfy the "rIK" constraint.
  */
-static void tcg_out_dat_rIK(TCGContext *s, ARMCond cond, int opc, int opinv,
-                            TCGReg dst, TCGReg lhs, TCGArg rhs,
+static void tcg_out_dat_rIK(TCGContext *s, ARMCond cond, ARMInsn opc,
+                            ARMInsn opinv, TCGReg dst, TCGReg lhs, TCGArg rhs,
                             bool rhs_is_const)
 {
     if (rhs_is_const) {
@@ -896,8 +896,8 @@ static void tcg_out_dat_rIK(TCGContext *s, ARMCond cond, int opc, int opinv,
     }
 }
 
-static void tcg_out_dat_rIN(TCGContext *s, ARMCond cond, int opc, int opneg,
-                            TCGArg dst, TCGArg lhs, TCGArg rhs,
+static void tcg_out_dat_rIN(TCGContext *s, ARMCond cond, ARMInsn opc,
+                            ARMInsn opneg, TCGArg dst, TCGArg lhs, TCGArg rhs,
                             bool rhs_is_const)
 {
     /* Emit either the reg,imm or reg,reg form of a data-processing insn.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 12/14] tcg/arm: More use of the TCGReg enum
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (10 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 11/14] tcg/arm: More use of the ARMInsn enum Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-18 22:05   ` Philippe Mathieu-Daudé
  2021-08-18 21:29 ` [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base Richard Henderson
  2021-08-18 21:29 ` [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only Richard Henderson
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 65 +++++++++++++++++++++-------------------
 1 file changed, 35 insertions(+), 30 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 2f55b94ada..35bd4c68d6 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -552,7 +552,7 @@ static void tcg_out_bl_imm(TCGContext *s, ARMCond cond, int32_t offset)
                     (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static void tcg_out_blx_reg(TCGContext *s, ARMCond cond, int rn)
+static void tcg_out_blx_reg(TCGContext *s, ARMCond cond, TCGReg rn)
 {
     tcg_out32(s, (cond << 28) | 0x012fff30 | rn);
 }
@@ -563,14 +563,14 @@ static void tcg_out_blx_imm(TCGContext *s, int32_t offset)
                 (((offset - 8) >> 2) & 0x00ffffff));
 }
 
-static void tcg_out_dat_reg(TCGContext *s, ARMCond cond, ARMInsn opc, int rd,
-                            int rn, int rm, int shift)
+static void tcg_out_dat_reg(TCGContext *s, ARMCond cond, ARMInsn opc,
+                            TCGReg rd, TCGReg rn, TCGReg rm, int shift)
 {
     tcg_out32(s, (cond << 28) | (0 << 25) | opc |
                     (rn << 16) | (rd << 12) | shift | rm);
 }
 
-static void tcg_out_mov_reg(TCGContext *s, ARMCond cond, int rd, int rm)
+static void tcg_out_mov_reg(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rm)
 {
     /* Simple reg-reg move, optimising out the 'do nothing' case */
     if (rd != rm) {
@@ -597,7 +597,7 @@ static void tcg_out_b_reg(TCGContext *s, ARMCond cond, TCGReg rn)
 }
 
 static void tcg_out_dat_imm(TCGContext *s, ARMCond cond, ARMInsn opc,
-                            int rd, int rn, int im)
+                            TCGReg rd, TCGReg rn, int im)
 {
     tcg_out32(s, (cond << 28) | (1 << 25) | opc |
                     (rn << 16) | (rd << 12) | im);
@@ -781,13 +781,15 @@ static void tcg_out_ld8s_r(TCGContext *s, ARMCond cond, TCGReg rt,
     tcg_out_memop_r(s, cond, INSN_LDRSB_REG, rt, rn, rm, 1, 1, 0);
 }
 
-static void tcg_out_movi_pool(TCGContext *s, ARMCond cond, int rd, uint32_t arg)
+static void tcg_out_movi_pool(TCGContext *s, ARMCond cond,
+                              TCGReg rd, uint32_t arg)
 {
     new_pool_label(s, arg, R_ARM_PC13, s->code_ptr, 0);
     tcg_out_ld32_12(s, cond, rd, TCG_REG_PC, 0);
 }
 
-static void tcg_out_movi32(TCGContext *s, ARMCond cond, int rd, uint32_t arg)
+static void tcg_out_movi32(TCGContext *s, ARMCond cond,
+                           TCGReg rd, uint32_t arg)
 {
     int imm12, diff, opc, sh1, sh2;
     uint32_t tt0, tt1, tt2;
@@ -866,8 +868,8 @@ static void tcg_out_movi32(TCGContext *s, ARMCond cond, int rd, uint32_t arg)
  * Emit either the reg,imm or reg,reg form of a data-processing insn.
  * rhs must satisfy the "rI" constraint.
  */
-static void tcg_out_dat_rI(TCGContext *s, ARMCond cond, ARMInsn opc, TCGArg dst,
-                           TCGArg lhs, TCGArg rhs, int rhs_is_const)
+static void tcg_out_dat_rI(TCGContext *s, ARMCond cond, ARMInsn opc,
+                           TCGReg dst, TCGReg lhs, TCGArg rhs, int rhs_is_const)
 {
     if (rhs_is_const) {
         tcg_out_dat_imm(s, cond, opc, dst, lhs, encode_imm_nofail(rhs));
@@ -897,7 +899,7 @@ static void tcg_out_dat_rIK(TCGContext *s, ARMCond cond, ARMInsn opc,
 }
 
 static void tcg_out_dat_rIN(TCGContext *s, ARMCond cond, ARMInsn opc,
-                            ARMInsn opneg, TCGArg dst, TCGArg lhs, TCGArg rhs,
+                            ARMInsn opneg, TCGReg dst, TCGReg lhs, TCGArg rhs,
                             bool rhs_is_const)
 {
     /* Emit either the reg,imm or reg,reg form of a data-processing insn.
@@ -971,17 +973,19 @@ static void tcg_out_smull32(TCGContext *s, ARMCond cond, TCGReg rd0,
               (rd1 << 16) | (rd0 << 12) | (rm << 8) | rn);
 }
 
-static void tcg_out_sdiv(TCGContext *s, ARMCond cond, int rd, int rn, int rm)
+static void tcg_out_sdiv(TCGContext *s, ARMCond cond,
+                         TCGReg rd, TCGReg rn, TCGReg rm)
 {
     tcg_out32(s, 0x0710f010 | (cond << 28) | (rd << 16) | rn | (rm << 8));
 }
 
-static void tcg_out_udiv(TCGContext *s, ARMCond cond, int rd, int rn, int rm)
+static void tcg_out_udiv(TCGContext *s, ARMCond cond,
+                         TCGReg rd, TCGReg rn, TCGReg rm)
 {
     tcg_out32(s, 0x0730f010 | (cond << 28) | (rd << 16) | rn | (rm << 8));
 }
 
-static void tcg_out_ext8s(TCGContext *s, ARMCond cond, int rd, int rn)
+static void tcg_out_ext8s(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
 {
     if (use_armv6_instructions) {
         /* sxtb */
@@ -995,12 +999,12 @@ static void tcg_out_ext8s(TCGContext *s, ARMCond cond, int rd, int rn)
 }
 
 static void __attribute__((unused))
-tcg_out_ext8u(TCGContext *s, ARMCond cond, int rd, int rn)
+tcg_out_ext8u(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
 {
     tcg_out_dat_imm(s, cond, ARITH_AND, rd, rn, 0xff);
 }
 
-static void tcg_out_ext16s(TCGContext *s, ARMCond cond, int rd, int rn)
+static void tcg_out_ext16s(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
 {
     if (use_armv6_instructions) {
         /* sxth */
@@ -1013,7 +1017,7 @@ static void tcg_out_ext16s(TCGContext *s, ARMCond cond, int rd, int rn)
     }
 }
 
-static void tcg_out_ext16u(TCGContext *s, ARMCond cond, int rd, int rn)
+static void tcg_out_ext16u(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
 {
     if (use_armv6_instructions) {
         /* uxth */
@@ -1026,7 +1030,8 @@ static void tcg_out_ext16u(TCGContext *s, ARMCond cond, int rd, int rn)
     }
 }
 
-static void tcg_out_bswap16(TCGContext *s, ARMCond cond, int rd, int rn, int flags)
+static void tcg_out_bswap16(TCGContext *s, ARMCond cond,
+                            TCGReg rd, TCGReg rn, int flags)
 {
     if (use_armv6_instructions) {
         if (flags & TCG_BSWAP_OS) {
@@ -1093,7 +1098,7 @@ static void tcg_out_bswap16(TCGContext *s, ARMCond cond, int rd, int rn, int fla
                      ? SHIFT_IMM_ASR(8) : SHIFT_IMM_LSR(8)));
 }
 
-static void tcg_out_bswap32(TCGContext *s, ARMCond cond, int rd, int rn)
+static void tcg_out_bswap32(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
 {
     if (use_armv6_instructions) {
         /* rev */
@@ -1123,23 +1128,23 @@ static void tcg_out_deposit(TCGContext *s, ARMCond cond, TCGReg rd,
 }
 
 static void tcg_out_extract(TCGContext *s, ARMCond cond, TCGReg rd,
-                            TCGArg a1, int ofs, int len)
+                            TCGReg rn, int ofs, int len)
 {
     /* ubfx */
-    tcg_out32(s, 0x07e00050 | (cond << 28) | (rd << 12) | a1
+    tcg_out32(s, 0x07e00050 | (cond << 28) | (rd << 12) | rn
               | (ofs << 7) | ((len - 1) << 16));
 }
 
 static void tcg_out_sextract(TCGContext *s, ARMCond cond, TCGReg rd,
-                             TCGArg a1, int ofs, int len)
+                             TCGReg rn, int ofs, int len)
 {
     /* sbfx */
-    tcg_out32(s, 0x07a00050 | (cond << 28) | (rd << 12) | a1
+    tcg_out32(s, 0x07a00050 | (cond << 28) | (rd << 12) | rn
               | (ofs << 7) | ((len - 1) << 16));
 }
 
 static void tcg_out_ld32u(TCGContext *s, ARMCond cond,
-                          int rd, int rn, int32_t offset)
+                          TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1149,7 +1154,7 @@ static void tcg_out_ld32u(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_st32(TCGContext *s, ARMCond cond,
-                         int rd, int rn, int32_t offset)
+                         TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1159,7 +1164,7 @@ static void tcg_out_st32(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_ld16u(TCGContext *s, ARMCond cond,
-                          int rd, int rn, int32_t offset)
+                          TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1169,7 +1174,7 @@ static void tcg_out_ld16u(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_ld16s(TCGContext *s, ARMCond cond,
-                          int rd, int rn, int32_t offset)
+                          TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1179,7 +1184,7 @@ static void tcg_out_ld16s(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_st16(TCGContext *s, ARMCond cond,
-                         int rd, int rn, int32_t offset)
+                         TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1189,7 +1194,7 @@ static void tcg_out_st16(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_ld8u(TCGContext *s, ARMCond cond,
-                         int rd, int rn, int32_t offset)
+                         TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1199,7 +1204,7 @@ static void tcg_out_ld8u(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_ld8s(TCGContext *s, ARMCond cond,
-                         int rd, int rn, int32_t offset)
+                         TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xff || offset < -0xff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
@@ -1209,7 +1214,7 @@ static void tcg_out_ld8s(TCGContext *s, ARMCond cond,
 }
 
 static void tcg_out_st8(TCGContext *s, ARMCond cond,
-                        int rd, int rn, int32_t offset)
+                        TCGReg rd, TCGReg rn, int32_t offset)
 {
     if (offset > 0xfff || offset < -0xfff) {
         tcg_out_movi32(s, cond, TCG_REG_TMP, offset);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (11 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 12/14] tcg/arm: More use of the TCGReg enum Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 12:03   ` Peter Maydell
  2021-08-18 21:29 ` [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only Richard Henderson
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

Reserve a register for the guest_base using aarch64 for reference.
By doing so, we do not have to recompute it for every memory load.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 35bd4c68d6..2728035177 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -84,6 +84,9 @@ static const int tcg_target_call_oarg_regs[2] = {
 
 #define TCG_REG_TMP  TCG_REG_R12
 #define TCG_VEC_TMP  TCG_REG_Q15
+#ifndef CONFIG_SOFTMMU
+#define TCG_REG_GUEST_BASE  TCG_REG_R11
+#endif
 
 typedef enum {
     COND_EQ = 0x0,
@@ -1763,7 +1766,8 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 
 static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
                                   TCGReg datalo, TCGReg datahi,
-                                  TCGReg addrlo, TCGReg addend)
+                                  TCGReg addrlo, TCGReg addend,
+                                  bool scratch_addend)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1790,7 +1794,7 @@ static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
             && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
             tcg_out_ldrd_r(s, COND_AL, datalo, addrlo, addend);
-        } else if (datalo != addend) {
+        } else if (scratch_addend) {
             tcg_out_ld32_rwb(s, COND_AL, datalo, addend, addrlo);
             tcg_out_ld32_12(s, COND_AL, datahi, addend, 4);
         } else {
@@ -1875,14 +1879,14 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     label_ptr = s->code_ptr;
     tcg_out_bl_imm(s, COND_NE, 0);
 
-    tcg_out_qemu_ld_index(s, opc, datalo, datahi, addrlo, addend);
+    tcg_out_qemu_ld_index(s, opc, datalo, datahi, addrlo, addend, true);
 
     add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
                         s->code_ptr, label_ptr);
 #else /* !CONFIG_SOFTMMU */
     if (guest_base) {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, guest_base);
-        tcg_out_qemu_ld_index(s, opc, datalo, datahi, addrlo, TCG_REG_TMP);
+        tcg_out_qemu_ld_index(s, opc, datalo, datahi,
+                              addrlo, TCG_REG_GUEST_BASE, false);
     } else {
         tcg_out_qemu_ld_direct(s, opc, datalo, datahi, addrlo);
     }
@@ -1891,7 +1895,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 
 static void tcg_out_qemu_st_index(TCGContext *s, ARMCond cond, MemOp opc,
                                   TCGReg datalo, TCGReg datahi,
-                                  TCGReg addrlo, TCGReg addend)
+                                  TCGReg addrlo, TCGReg addend,
+                                  bool scratch_addend)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1912,9 +1917,14 @@ static void tcg_out_qemu_st_index(TCGContext *s, ARMCond cond, MemOp opc,
             && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
             tcg_out_strd_r(s, cond, datalo, addrlo, addend);
-        } else {
+        } else if (scratch_addend) {
             tcg_out_st32_rwb(s, cond, datalo, addend, addrlo);
             tcg_out_st32_12(s, cond, datahi, addend, 4);
+        } else {
+            tcg_out_dat_reg(s, cond, ARITH_ADD, TCG_REG_TMP,
+                            addend, addrlo, SHIFT_IMM_LSL(0));
+            tcg_out_st32_12(s, cond, datalo, TCG_REG_TMP, 0);
+            tcg_out_st32_12(s, cond, datahi, TCG_REG_TMP, 4);
         }
         break;
     default:
@@ -1978,7 +1988,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     mem_index = get_mmuidx(oi);
     addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, mem_index, 0);
 
-    tcg_out_qemu_st_index(s, COND_EQ, opc, datalo, datahi, addrlo, addend);
+    tcg_out_qemu_st_index(s, COND_EQ, opc, datalo, datahi,
+                          addrlo, addend, true);
 
     /* The conditional call must come last, as we're going to return here.  */
     label_ptr = s->code_ptr;
@@ -1988,9 +1999,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
                         s->code_ptr, label_ptr);
 #else /* !CONFIG_SOFTMMU */
     if (guest_base) {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, guest_base);
-        tcg_out_qemu_st_index(s, COND_AL, opc, datalo,
-                              datahi, addrlo, TCG_REG_TMP);
+        tcg_out_qemu_st_index(s, COND_AL, opc, datalo, datahi,
+                              addrlo, TCG_REG_GUEST_BASE, false);
     } else {
         tcg_out_qemu_st_direct(s, opc, datalo, datahi, addrlo);
     }
@@ -3153,6 +3163,13 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 
     tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, tcg_target_call_iarg_regs[0]);
 
+#ifndef CONFIG_SOFTMMU
+    if (guest_base) {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_GUEST_BASE, guest_base);
+        tcg_regset_set_reg(s->reserved_regs, TCG_REG_GUEST_BASE);
+    }
+#endif
+
     tcg_out_b_reg(s, COND_AL, tcg_target_call_iarg_regs[1]);
 
     /*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only
  2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
                   ` (12 preceding siblings ...)
  2021-08-18 21:29 ` [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base Richard Henderson
@ 2021-08-18 21:29 ` Richard Henderson
  2021-08-20 13:56   ` Peter Maydell
  13 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-18 21:29 UTC (permalink / raw)
  To: qemu-devel

For v6+, use ldm/stm, ldrd/strd for the normal case of alignment
matching the access size.  Otherwise, emit a test + branch sequence
invoking helper_unaligned_{ld,st}.

For v4+v5, use piecewise load and stores to implement misalignment.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.h     |   2 -
 tcg/arm/tcg-target.c.inc | 364 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 340 insertions(+), 26 deletions(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index e47720a85b..fa75fd3626 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -159,9 +159,7 @@ extern bool use_neon_instructions;
 /* not defined -- call should be eliminated at compile time */
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #endif
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 2728035177..278639be44 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -23,6 +23,7 @@
  */
 
 #include "elf.h"
+#include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 
 int arm_arch = __ARM_ARCH;
@@ -86,6 +87,7 @@ static const int tcg_target_call_oarg_regs[2] = {
 #define TCG_VEC_TMP  TCG_REG_Q15
 #ifndef CONFIG_SOFTMMU
 #define TCG_REG_GUEST_BASE  TCG_REG_R11
+#define TCG_REG_TMP2        TCG_REG_R14
 #endif
 
 typedef enum {
@@ -137,7 +139,9 @@ typedef enum {
     INSN_CLZ       = 0x016f0f10,
     INSN_RBIT      = 0x06ff0f30,
 
+    INSN_LDM       = 0x08900000,
     INSN_LDMIA     = 0x08b00000,
+    INSN_STM       = 0x08800000,
     INSN_STMDB     = 0x09200000,
 
     INSN_LDR_IMM   = 0x04100000,
@@ -1428,8 +1432,6 @@ static void tcg_out_vldst(TCGContext *s, ARMInsn insn,
 }
 
 #ifdef CONFIG_SOFTMMU
-#include "../tcg-ldst.c.inc"
-
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  *                                     int mmu_idx, uintptr_t ra)
  */
@@ -1762,6 +1764,74 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
     tcg_out_goto(s, COND_AL, qemu_st_helpers[opc & MO_SIZE]);
     return true;
 }
+#else
+
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
+                                   TCGReg addrhi, unsigned a_bits)
+{
+    unsigned a_mask = (1 << a_bits) - 1;
+    TCGLabelQemuLdst *label = new_ldst_label(s);
+
+    label->is_ld = is_ld;
+    label->addrlo_reg = addrlo;
+    label->addrhi_reg = addrhi;
+
+    /* We are expecting a_bits to max out at 7, and can easily support 8. */
+    tcg_debug_assert(a_mask <= 0xff);
+    /* tst addr, #mask */
+    tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
+
+    /* blne slow_path */
+    label->label_ptr[0] = s->code_ptr;
+    tcg_out_bl_imm(s, COND_NE, 0);
+
+    label->raddr = tcg_splitwx_to_rx(s->code_ptr);
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+    if (!reloc_pc24(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
+        return false;
+    }
+
+    if (TARGET_LONG_BITS == 64) {
+        /* 64-bit target address is aligned into R2:R3. */
+        if (l->addrhi_reg != TCG_REG_R2) {
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R2, l->addrlo_reg);
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R3, l->addrhi_reg);
+        } else if (l->addrlo_reg != TCG_REG_R3) {
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R3, l->addrhi_reg);
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R2, l->addrlo_reg);
+        } else {
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R1, TCG_REG_R2);
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R2, TCG_REG_R3);
+            tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R3, TCG_REG_R1);
+        }
+    } else {
+        tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R1, l->addrlo_reg);
+    }
+    tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R0, TCG_AREG0);
+
+    /*
+     * Tail call to the helper, with the return address back inline,
+     * just for the clarity of the debugging traceback -- the helper
+     * cannot return.  We have used BLNE to arrive here, so LR is
+     * already set.
+     */
+    tcg_out_goto(s, COND_AL, (const void *)
+                 (l->is_ld ? helper_unaligned_ld : helper_unaligned_st));
+    return true;
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+    return tcg_out_fail_alignment(s, l);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+    return tcg_out_fail_alignment(s, l);
+}
 #endif /* SOFTMMU */
 
 static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
@@ -1811,45 +1881,175 @@ static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
 
 #ifndef CONFIG_SOFTMMU
 static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
-                                   TCGReg datahi, TCGReg addrlo)
+                                   TCGReg datahi, TCGReg addrlo, uint8_t ofs)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
 
     switch (opc & MO_SSIZE) {
     case MO_UB:
-        tcg_out_ld8_12(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_ld8_12(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_SB:
-        tcg_out_ld8s_8(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_ld8s_8(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_UW:
-        tcg_out_ld16u_8(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_ld16u_8(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_SW:
-        tcg_out_ld16s_8(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_ld16s_8(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_UL:
-        tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_ld32_12(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_Q:
         /* LDRD requires alignment; double-check that. */
         if (use_armv6_instructions
             && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
-            tcg_out_ldrd_8(s, COND_AL, datalo, addrlo, 0);
+            tcg_out_ldrd_8(s, COND_AL, datalo, addrlo, ofs);
         } else if (datalo == addrlo) {
-            tcg_out_ld32_12(s, COND_AL, datahi, addrlo, 4);
-            tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
+            tcg_out_ld32_12(s, COND_AL, datahi, addrlo, ofs + 4);
+            tcg_out_ld32_12(s, COND_AL, datalo, addrlo, ofs);
         } else {
-            tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
-            tcg_out_ld32_12(s, COND_AL, datahi, addrlo, 4);
+            tcg_out_ld32_12(s, COND_AL, datalo, addrlo, ofs);
+            tcg_out_ld32_12(s, COND_AL, datahi, addrlo, ofs + 4);
         }
         break;
     default:
         g_assert_not_reached();
     }
 }
+
+/*
+ * There are a some interesting special cases for which we can get
+ * the alignment check for free with the instruction.  For MO_16,
+ * we would need to enlist ARMv8 load-acquire halfword (LDAH).
+ */
+static bool tcg_out_qemu_ld_align(TCGContext *s, MemOp opc, TCGReg datalo,
+                                  TCGReg datahi, TCGReg addrlo,
+                                  unsigned a_bits)
+{
+    unsigned s_bits = opc & MO_SIZE;
+
+    /* LDM enforces 4-byte alignment. */
+    if (a_bits == MO_32 && s_bits >= MO_32) {
+        TCGReg tmphi, tmplo;
+
+        if (guest_base) {
+            tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_TMP, addrlo,
+                            TCG_REG_GUEST_BASE, SHIFT_IMM_LSL(0));
+            addrlo = TCG_REG_TMP;
+        }
+
+        if (s_bits == MO_32) {
+            /* ldm addrlo, { datalo } */
+            tcg_out_ldstm(s, COND_AL, INSN_LDM, addrlo, 1 << datalo);
+            return true;
+        }
+        /* else MO_64... */
+
+        /*
+         * LDM loads in mask order, so we want the second part to be loaded
+         * into a higher register number.  Note that both R12 and R14 are
+         * reserved, so we always have a maximum regno to use.
+         */
+        tmplo = datalo;
+        tmphi = datahi;
+        if (MO_BSWAP == MO_LE) {
+            if (datalo > datahi) {
+                tmphi = TCG_REG_TMP;
+            }
+        } else {
+            if (datalo < datahi) {
+                tmplo = TCG_REG_TMP;
+            }
+        }
+
+        /* ldm addrlo, { tmplo, tmphi } */
+        tcg_out_ldstm(s, COND_AL, INSN_LDM, addrlo, 1 << tmplo | 1 << tmphi);
+
+        tcg_out_mov(s, TCG_TYPE_I32, datahi, tmphi);
+        tcg_out_mov(s, TCG_TYPE_I32, datalo, tmplo);
+        return true;
+    }
+
+    /* LDRD enforces 8-byte alignment. */
+    if (a_bits == MO_64 && s_bits == MO_64
+        && (datalo & 1) == 0 && datahi == datalo + 1) {
+        if (guest_base) {
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, guest_base);
+            tcg_out_ldrd_r(s, COND_AL, datalo, addrlo, TCG_REG_TMP);
+        } else {
+            tcg_out_ldrd_8(s, COND_AL, datalo, addrlo, 0);
+        }
+        return true;
+    }
+    return false;
+}
+
+static void tcg_out_qemu_ld_unalign(TCGContext *s, MemOp opc,
+                                    TCGReg datalo, TCGReg datahi,
+                                    TCGReg addrlo, unsigned a_bits)
+{
+    unsigned s_bits = opc & MO_SIZE;
+    unsigned s_size = 1 << s_bits;
+    unsigned a_size = 1 << a_bits;
+    bool init = true;
+    unsigned i;
+
+    if (guest_base) {
+        tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_TMP2, addrlo,
+                        TCG_REG_GUEST_BASE, SHIFT_IMM_LSL(0));
+        addrlo = TCG_REG_TMP2;
+    }
+
+    /*
+     * Perform the load in units of a_size.
+     */
+    if (MO_BSWAP == MO_LE) {
+        for (i = 0; i < s_size; ) {
+            if (init) {
+                tcg_out_qemu_ld_direct(s, a_bits, datalo, 0, addrlo, i);
+                init = false;
+            } else {
+                /*
+                 * Note that MO_SIGN will only be set for MO_16, and we
+                 * want the sign bit for the second byte, when !init.
+                 */
+                tcg_out_qemu_ld_direct(s, a_bits | (opc & MO_SIZE),
+                                       TCG_REG_TMP, 0, addrlo, i);
+                tcg_out_dat_reg(s, COND_AL, ARITH_ORR,
+                                datalo, datalo, TCG_REG_TMP,
+                                SHIFT_IMM_LSL(i * 8));
+            }
+            i += a_size;
+            if (s_size == 8 && i == 4) {
+                datalo = datahi;
+                init = true;
+            }
+        }
+    } else {
+        for (i = 0; i < s_size; ) {
+            if (init) {
+                /* See above, only reversed for big-endian. */
+                tcg_out_qemu_ld_direct(s, a_bits | (opc & MO_SIZE),
+                                       datahi, 0, addrlo, i);
+                init = false;
+            } else {
+                tcg_out_qemu_ld_direct(s, a_bits, TCG_REG_TMP, 0, addrlo, i);
+                tcg_out_dat_reg(s, COND_AL, ARITH_ORR,
+                                datahi, TCG_REG_TMP, datahi,
+                                SHIFT_IMM_LSL(a_size * 8));
+            }
+            i += a_size;
+            if (s_size == 8 && i == 4) {
+                datahi = datalo;
+                init = true;
+            }
+        }
+    }
+}
 #endif
 
 static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
@@ -1861,6 +2061,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     int mem_index;
     TCGReg addend;
     tcg_insn_unit *label_ptr;
+#else
+    unsigned a_bits, s_bits;
 #endif
 
     datalo = *args++;
@@ -1884,11 +2086,23 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
                         s->code_ptr, label_ptr);
 #else /* !CONFIG_SOFTMMU */
-    if (guest_base) {
+    a_bits = get_alignment_bits(opc);
+    s_bits = opc & MO_SIZE;
+
+    if (use_armv6_instructions &&
+        tcg_out_qemu_ld_align(s, datalo, datahi, addrlo, a_bits, s_bits)) {
+        return;
+    }
+    if (a_bits) {
+        tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
+    }
+    if (!use_armv6_instructions && a_bits < MO_32) {
+        tcg_out_qemu_ld_unalign(s, opc, datalo, datahi, addrlo, a_bits);
+    } else if (guest_base) {
         tcg_out_qemu_ld_index(s, opc, datalo, datahi,
                               addrlo, TCG_REG_GUEST_BASE, false);
     } else {
-        tcg_out_qemu_ld_direct(s, opc, datalo, datahi, addrlo);
+        tcg_out_qemu_ld_direct(s, opc, datalo, datahi, addrlo, 0);
     }
 #endif
 }
@@ -1934,36 +2148,122 @@ static void tcg_out_qemu_st_index(TCGContext *s, ARMCond cond, MemOp opc,
 
 #ifndef CONFIG_SOFTMMU
 static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
-                                   TCGReg datahi, TCGReg addrlo)
+                                   TCGReg datahi, TCGReg addrlo, uint8_t ofs)
 {
     /* Byte swapping is left to middle-end expansion. */
     tcg_debug_assert((opc & MO_BSWAP) == 0);
 
     switch (opc & MO_SIZE) {
     case MO_8:
-        tcg_out_st8_12(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_st8_12(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_16:
-        tcg_out_st16_8(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_st16_8(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_32:
-        tcg_out_st32_12(s, COND_AL, datalo, addrlo, 0);
+        tcg_out_st32_12(s, COND_AL, datalo, addrlo, ofs);
         break;
     case MO_64:
         /* STRD requires alignment; double-check that. */
         if (use_armv6_instructions
             && get_alignment_bits(opc) >= MO_64
             && (datalo & 1) == 0 && datahi == datalo + 1) {
-            tcg_out_strd_8(s, COND_AL, datalo, addrlo, 0);
+            tcg_out_strd_8(s, COND_AL, datalo, addrlo, ofs);
         } else {
-            tcg_out_st32_12(s, COND_AL, datalo, addrlo, 0);
-            tcg_out_st32_12(s, COND_AL, datahi, addrlo, 4);
+            tcg_out_st32_12(s, COND_AL, datalo, addrlo, ofs);
+            tcg_out_st32_12(s, COND_AL, datahi, addrlo, ofs + 4);
         }
         break;
     default:
         g_assert_not_reached();
     }
 }
+
+static bool tcg_out_qemu_st_align(TCGContext *s, TCGReg datalo,
+                                  TCGReg datahi, TCGReg addrlo,
+                                  unsigned a_bits, unsigned s_bits)
+{
+    /* STM enforces 4-byte alignment. */
+    if (a_bits == MO_32) {
+        uint16_t mask = 1 << datalo;
+
+        switch (s_bits) {
+        case MO_64:
+            /*
+             * STM stores in mask order, so we want the second part to be
+             * in a higher register number.  Note that both R12 and R14
+             * are reserved, so we always have a maximum regno to use.
+             */
+            if (MO_BSWAP == MO_LE) {
+                if (datalo > datahi) {
+                    tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_TMP, datahi);
+                    datahi = TCG_REG_TMP;
+                }
+            } else {
+                if (datalo < datahi) {
+                    tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_TMP, datalo);
+                    datalo = TCG_REG_TMP;
+                }
+            }
+            mask = 1 << datalo | 1 << datahi;
+            /* fall through */
+
+        case MO_32:
+            if (guest_base) {
+                tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_TMP2, addrlo,
+                                TCG_REG_GUEST_BASE, SHIFT_IMM_LSL(0));
+                addrlo = TCG_REG_TMP2;
+            }
+            tcg_out_ldstm(s, COND_AL, INSN_STM, addrlo, mask);
+            return true;
+        }
+        return false;
+    }
+
+    /* STRD enforces 8-byte alignment. */
+    if (a_bits == MO_64 && s_bits == MO_64
+        && (datalo & 1) == 0 && datahi == datalo + 1) {
+        if (guest_base) {
+            tcg_out_strd_r(s, COND_AL, datalo, addrlo, TCG_REG_GUEST_BASE);
+        } else {
+            tcg_out_strd_8(s, COND_AL, datalo, addrlo, 0);
+        }
+        return true;
+    }
+    return false;
+}
+
+static void tcg_out_qemu_st_unalign(TCGContext *s, MemOp opc,
+                                    TCGReg datalo, TCGReg datahi,
+                                    TCGReg addrlo, unsigned a_bits)
+{
+    int s_bits = opc & MO_SIZE;
+    int s_size = 1 << s_bits;
+    int a_size = 1 << a_bits;
+    int i;
+
+    if (guest_base) {
+        tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_TMP2, addrlo,
+                        TCG_REG_GUEST_BASE, SHIFT_IMM_LSL(0));
+        addrlo = TCG_REG_TMP2;
+    }
+
+    /*
+     * Perform the store in units of a_size.
+     */
+    for (i = 0; i < s_size; i += a_size) {
+        int shift = (MO_BSWAP == MO_LE ? i : s_size - a_size - i) * 8;
+        TCGReg t = (i < 32 ? datalo : datahi);
+
+        shift &= 31;
+        if (shift) {
+            tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t,
+                            SHIFT_IMM_LSR(shift));
+            t = TCG_REG_TMP;
+        }
+        tcg_out_qemu_st_direct(s, a_bits, t, 0, addrlo, i);
+    }
+}
 #endif
 
 static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
@@ -1975,6 +2275,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     int mem_index;
     TCGReg addend;
     tcg_insn_unit *label_ptr;
+#else
+    unsigned a_bits, s_bits;
 #endif
 
     datalo = *args++;
@@ -1998,11 +2300,22 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
                         s->code_ptr, label_ptr);
 #else /* !CONFIG_SOFTMMU */
-    if (guest_base) {
+    a_bits = get_alignment_bits(opc);
+    s_bits = opc & MO_SIZE;
+
+    if (tcg_out_qemu_st_align(s, datalo, datahi, addrlo, a_bits, s_bits)) {
+        return;
+    }
+    if (a_bits) {
+        tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
+    }
+    if (!use_armv6_instructions && a_bits < MO_32) {
+        tcg_out_qemu_st_unalign(s, opc, datalo, datahi, addrlo, a_bits);
+    } else if (guest_base) {
         tcg_out_qemu_st_index(s, COND_AL, opc, datalo, datahi,
                               addrlo, TCG_REG_GUEST_BASE, false);
     } else {
-        tcg_out_qemu_st_direct(s, opc, datalo, datahi, addrlo);
+        tcg_out_qemu_st_direct(s, opc, datalo, datahi, addrlo, 0);
     }
 #endif
 }
@@ -2558,6 +2871,9 @@ static void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_PC);
     tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP);
+#ifndef CONFIG_SOFTMMU
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
+#endif
 }
 
 static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg arg,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm}
  2021-08-18 21:29 ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm} Richard Henderson
@ 2021-08-18 21:58   ` Philippe Mathieu-Daudé
  2021-08-20 10:39   ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg, imm} Peter Maydell
  1 sibling, 0 replies; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-18 21:58 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 8/18/21 11:29 PM, Richard Henderson wrote:
> Some of the functions specified _reg, some _imm, and some
> left it blank.  Make it clearer to which we are referring.
> 
> Split tcg_out_b_reg from tcg_out_bx_reg, to indicate when
> we do not actually require BX semantics.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)

Appreciated cleanup :)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 09/14] tcg/arm: Drop inline markers
  2021-08-18 21:29 ` [PATCH v3 09/14] tcg/arm: Drop inline markers Richard Henderson
@ 2021-08-18 22:02   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-18 22:02 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 8/18/21 11:29 PM, Richard Henderson wrote:
> Let the compiler decide about inlining.
> Remove tcg_out_nop as unused.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 234 +++++++++++++++++++--------------------
>  1 file changed, 114 insertions(+), 120 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 10/14] tcg/arm: Give enum arm_cond_code_e a typedef and use it
  2021-08-18 21:29 ` [PATCH v3 10/14] tcg/arm: Give enum arm_cond_code_e a typedef and use it Richard Henderson
@ 2021-08-18 22:04   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-18 22:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 8/18/21 11:29 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 136 +++++++++++++++++++--------------------
>  1 file changed, 68 insertions(+), 68 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 11/14] tcg/arm: More use of the ARMInsn enum
  2021-08-18 21:29 ` [PATCH v3 11/14] tcg/arm: More use of the ARMInsn enum Richard Henderson
@ 2021-08-18 22:04   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-18 22:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 8/18/21 11:29 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 12/14] tcg/arm: More use of the TCGReg enum
  2021-08-18 21:29 ` [PATCH v3 12/14] tcg/arm: More use of the TCGReg enum Richard Henderson
@ 2021-08-18 22:05   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 33+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-18 22:05 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 8/18/21 11:29 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 65 +++++++++++++++++++++-------------------
>  1 file changed, 35 insertions(+), 30 deletions(-)

I like it :)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH
  2021-08-18 21:28 ` [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH Richard Henderson
@ 2021-08-20 10:38   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 10:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:32, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> GCC since 4.8 provides the definition and we now require 7.5.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

I assume clang provides this too...

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg, imm}
  2021-08-18 21:29 ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm} Richard Henderson
  2021-08-18 21:58   ` Philippe Mathieu-Daudé
@ 2021-08-20 10:39   ` Peter Maydell
  1 sibling, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 10:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:41, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Some of the functions specified _reg, some _imm, and some
> left it blank.  Make it clearer to which we are referring.
>
> Split tcg_out_b_reg from tcg_out_bx_reg, to indicate when
> we do not actually require BX semantics.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call
  2021-08-18 21:29 ` [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call Richard Henderson
@ 2021-08-20 10:50   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 10:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:38, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> ARMv4T has BX as its only interworking instruction.  In order
> to support testing of different architecture revisions with a
> qemu binary that may have been built for, say ARMv6T2, fill in
> the blank required to make calls to helpers in thumb mode.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions
  2021-08-18 21:29 ` [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions Richard Henderson
@ 2021-08-20 10:59   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 10:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:32, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> According to the Arm ARM DDI 0406C, section A1.3, the valid variants
> are ARMv5T, ARMv5TE, ARMv5TEJ -- there is no ARMv5 without Thumb.
> Therefore simplify the test from preprocessor ifdefs to base
> architecture revision.  Retain the "t" in the name to minimize churn.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.h | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
> index 18bb16c784..f41b809554 100644
> --- a/tcg/arm/tcg-target.h
> +++ b/tcg/arm/tcg-target.h
> @@ -28,13 +28,7 @@
>
>  extern int arm_arch;
>
> -#if defined(__ARM_ARCH_5T__) \
> -    || defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
> -# define use_armv5t_instructions 1
> -#else
> -# define use_armv5t_instructions use_armv6_instructions
> -#endif
> -
> +#define use_armv5t_instructions (__ARM_ARCH >= 5 || arm_arch >= 5)
>  #define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
>  #define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)

Typo in subject: should be "armv5t". Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable
  2021-08-18 21:29 ` [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable Richard Henderson
@ 2021-08-20 11:01   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 11:01 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use the environment variable to test an older ISA from
> the one supported by the host.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

I think we should document this environment variable somewhere...



> +    /*
> +     * For debugging/testing purposes, allow the ISA to be reduced
> +     * (but not extended) from the set detected above.
> +     */
> +#ifdef CONFIG_DEBUG_TCG
> +    {
> +        char *opt = g_strdup(getenv("QEMU_TCG_DEBUG"));

Consider g_autofree ?

> +        if (opt) {
> +            for (char *o = strtok(opt, ","); o ; o = strtok(NULL, ",")) {
> +                if (o[0] == 'v' &&
> +                    o[1] >= '4' &&
> +                    o[1] <= '0' + arm_arch &&
> +                    o[2] == 0) {
> +                    arm_arch = o[1] - '0';
> +                    continue;
> +                }
> +                if (strcmp(o, "!neon") == 0) {
> +                    use_neon_instructions = false;
> +                    continue;
> +                }
> +                if (strcmp(o, "help") == 0) {
> +                    printf("QEMU_TCG_DEBUG=<opt>{,<opt>} where <opt> is\n"
> +                           "  v<N>   select ARMv<N>\n"
> +                           "  !neon  disable ARM NEON\n");
> +                    exit(0);
> +                }

We should complain about something in the variable we don't understand,
rather than just ignoring it.

> +            }
> +            g_free(opt);
> +        }
> +    }
> +#endif
> +

-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm
  2021-08-18 21:29 ` [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm Richard Henderson
@ 2021-08-20 11:45   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 11:45 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:36, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Expand these hard-coded instructions symbolically.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm
  2021-08-18 21:29 ` [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm Richard Henderson
@ 2021-08-20 11:50   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 11:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:38, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We have already computed the rotated value of the imm8
> portion of the complete imm12 encoding.  No sense leaving
> the combination of rot + rotation to the caller.
>
> Create an encode_imm12_nofail helper that performs an assert.
>
> This removes the final use of the local "rotl" function,
> which duplicated our generic "rol32" function.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 141 +++++++++++++++++++++------------------
>  1 file changed, 77 insertions(+), 64 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base
  2021-08-18 21:29 ` [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base Richard Henderson
@ 2021-08-20 12:03   ` Peter Maydell
  2021-08-20 18:47     ` Richard Henderson
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 12:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:33, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Reserve a register for the guest_base using aarch64 for reference.
> By doing so, we do not have to recompute it for every memory load.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.c.inc | 39 ++++++++++++++++++++++++++++-----------
>  1 file changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
> index 35bd4c68d6..2728035177 100644
> --- a/tcg/arm/tcg-target.c.inc
> +++ b/tcg/arm/tcg-target.c.inc
> @@ -84,6 +84,9 @@ static const int tcg_target_call_oarg_regs[2] = {
>
>  #define TCG_REG_TMP  TCG_REG_R12
>  #define TCG_VEC_TMP  TCG_REG_Q15
> +#ifndef CONFIG_SOFTMMU
> +#define TCG_REG_GUEST_BASE  TCG_REG_R11
> +#endif
>
>  typedef enum {
>      COND_EQ = 0x0,
> @@ -1763,7 +1766,8 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
>
>  static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
>                                    TCGReg datalo, TCGReg datahi,
> -                                  TCGReg addrlo, TCGReg addend)
> +                                  TCGReg addrlo, TCGReg addend,
> +                                  bool scratch_addend)
>  {
>      /* Byte swapping is left to middle-end expansion. */
>      tcg_debug_assert((opc & MO_BSWAP) == 0);
> @@ -1790,7 +1794,7 @@ static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
>              && get_alignment_bits(opc) >= MO_64
>              && (datalo & 1) == 0 && datahi == datalo + 1) {
>              tcg_out_ldrd_r(s, COND_AL, datalo, addrlo, addend);
> -        } else if (datalo != addend) {
> +        } else if (scratch_addend) {
>              tcg_out_ld32_rwb(s, COND_AL, datalo, addend, addrlo);
>              tcg_out_ld32_12(s, COND_AL, datahi, addend, 4);
>          } else {

I don't understand this change. Yes, we can trash the addend
register, but if it's the same as 'datalo' then the second load
is not going to DTRT... Shouldn't this be
  if (scratch_addend && datalo != addend)
?

-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu
  2021-08-18 21:29 ` [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu Richard Henderson
@ 2021-08-20 13:34   ` Peter Maydell
  2021-08-20 17:19     ` Richard Henderson
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 13:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:32, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> From armv6, the architecture supports unaligned accesses.
> All we need to do is perform the correct alignment check
> in tcg_out_tlb_read and not use LDRD/STRD when the access
> is not aligned.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> @@ -1578,27 +1576,32 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
>
>      /*
>       * Check alignment, check comparators.
> -     * Do this in no more than 3 insns.  Use MOVW for v7, if possible,
> +     * Do this in 2-4 insns.  Use MOVW for v7, if possible,
>       * to reduce the number of sequential conditional instructions.
>       * Almost all guests have at least 4k pages, which means that we need
>       * to clear at least 9 bits even for an 8-byte memory, which means it
>       * isn't worth checking for an immediate operand for BIC.
>       */
> +    /* For unaligned accesses, test the page of the last byte. */
> +    t_addr = addrlo;
> +    if (a_mask < s_mask) {
> +        t_addr = TCG_REG_R0;
> +        tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
> +                        addrlo, s_mask - a_mask);
> +    }

I don't understand what this comment means or why we're doing the
addition. If we know we need to check eg whether the address is 2-aligned,
why aren't we just checking whether it's 2-aligned ? Could you
expand on the explanation a bit?


thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only
  2021-08-18 21:29 ` [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only Richard Henderson
@ 2021-08-20 13:56   ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-20 13:56 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Wed, 18 Aug 2021 at 22:43, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> For v6+, use ldm/stm, ldrd/strd for the normal case of alignment
> matching the access size.  Otherwise, emit a test + branch sequence
> invoking helper_unaligned_{ld,st}.
>
> For v4+v5, use piecewise load and stores to implement misalignment.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu
  2021-08-20 13:34   ` Peter Maydell
@ 2021-08-20 17:19     ` Richard Henderson
  0 siblings, 0 replies; 33+ messages in thread
From: Richard Henderson @ 2021-08-20 17:19 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 8/20/21 3:34 AM, Peter Maydell wrote:
> On Wed, 18 Aug 2021 at 22:32, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>>  From armv6, the architecture supports unaligned accesses.
>> All we need to do is perform the correct alignment check
>> in tcg_out_tlb_read and not use LDRD/STRD when the access
>> is not aligned.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> @@ -1578,27 +1576,32 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
>>
>>       /*
>>        * Check alignment, check comparators.
>> -     * Do this in no more than 3 insns.  Use MOVW for v7, if possible,
>> +     * Do this in 2-4 insns.  Use MOVW for v7, if possible,
>>        * to reduce the number of sequential conditional instructions.
>>        * Almost all guests have at least 4k pages, which means that we need
>>        * to clear at least 9 bits even for an 8-byte memory, which means it
>>        * isn't worth checking for an immediate operand for BIC.
>>        */
>> +    /* For unaligned accesses, test the page of the last byte. */
>> +    t_addr = addrlo;
>> +    if (a_mask < s_mask) {
>> +        t_addr = TCG_REG_R0;
>> +        tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
>> +                        addrlo, s_mask - a_mask);
>> +    }
> 
> I don't understand what this comment means or why we're doing the
> addition. If we know we need to check eg whether the address is 2-aligned,
> why aren't we just checking whether it's 2-aligned ? Could you
> expand on the explanation a bit?

We want to detect the page crossing case of a misaligned access.

We began computing the softtlb data with the address of the start access (addrlo).  We 
then compute the address of the last (aligned) portion of the access.  For a 4-byte access 
that is 1-byte aligned, we add 3 - 0 = 3, finding the last byte; for a 2-byte aligned 
access we add 3 - 1 = 2; for a 4-byte aligned access we (logically) add 3 - 3 = 0.

This second quantity retains the alignment we need to test and also rolls over to the next 
page iff the access does. When we compare against the comparator in the tlb, a bit set 
within the alignment will cause failure as will a differing page number.


r~


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base
  2021-08-20 12:03   ` Peter Maydell
@ 2021-08-20 18:47     ` Richard Henderson
  2021-08-21 10:38       ` Peter Maydell
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Henderson @ 2021-08-20 18:47 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 8/20/21 2:03 AM, Peter Maydell wrote:
>> -        } else if (datalo != addend) {
>> +        } else if (scratch_addend) {
>>               tcg_out_ld32_rwb(s, COND_AL, datalo, addend, addrlo);
>>               tcg_out_ld32_12(s, COND_AL, datahi, addend, 4);
>>           } else {
> 
> I don't understand this change. Yes, we can trash the addend
> register, but if it's the same as 'datalo' then the second load
> is not going to DTRT... Shouldn't this be
>    if (scratch_addend && datalo != addend)
> ?

Previously, addend was *always* a scratch register, TCG_REG_TMP or such.
Afterward, addend may be TCG_REG_GUEST_BASE, which should not be modified.
At no point is there overlap between addend and data{hi,lo}.

r~


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base
  2021-08-20 18:47     ` Richard Henderson
@ 2021-08-21 10:38       ` Peter Maydell
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Maydell @ 2021-08-21 10:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Fri, 20 Aug 2021 at 19:47, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/20/21 2:03 AM, Peter Maydell wrote:
> >> -        } else if (datalo != addend) {
> >> +        } else if (scratch_addend) {
> >>               tcg_out_ld32_rwb(s, COND_AL, datalo, addend, addrlo);
> >>               tcg_out_ld32_12(s, COND_AL, datahi, addend, 4);
> >>           } else {
> >
> > I don't understand this change. Yes, we can trash the addend
> > register, but if it's the same as 'datalo' then the second load
> > is not going to DTRT... Shouldn't this be
> >    if (scratch_addend && datalo != addend)
> > ?
>
> Previously, addend was *always* a scratch register, TCG_REG_TMP or such.
> Afterward, addend may be TCG_REG_GUEST_BASE, which should not be modified.
> At no point is there overlap between addend and data{hi,lo}.

So the old "datalo == addend" code path was dead code ?

Perhaps if the function now assumes that scratch_addend implies
that datalo != addend it would be worth assert()ing that, in case
some future callsite doesn't realize the restriction ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2021-08-21 10:40 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-18 21:28 [PATCH v3 00/14] tcg/arm: Unaligned access and other cleanup Richard Henderson
2021-08-18 21:28 ` [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH Richard Henderson
2021-08-20 10:38   ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg,imm} Richard Henderson
2021-08-18 21:58   ` Philippe Mathieu-Daudé
2021-08-20 10:39   ` [PATCH v3 02/14] tcg/arm: Standardize on tcg_out_<branch>_{reg, imm} Peter Maydell
2021-08-18 21:29 ` [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions Richard Henderson
2021-08-20 10:59   ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call Richard Henderson
2021-08-20 10:50   ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable Richard Henderson
2021-08-20 11:01   ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu Richard Henderson
2021-08-20 13:34   ` Peter Maydell
2021-08-20 17:19     ` Richard Henderson
2021-08-18 21:29 ` [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm Richard Henderson
2021-08-20 11:45   ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm Richard Henderson
2021-08-20 11:50   ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 09/14] tcg/arm: Drop inline markers Richard Henderson
2021-08-18 22:02   ` Philippe Mathieu-Daudé
2021-08-18 21:29 ` [PATCH v3 10/14] tcg/arm: Give enum arm_cond_code_e a typedef and use it Richard Henderson
2021-08-18 22:04   ` Philippe Mathieu-Daudé
2021-08-18 21:29 ` [PATCH v3 11/14] tcg/arm: More use of the ARMInsn enum Richard Henderson
2021-08-18 22:04   ` Philippe Mathieu-Daudé
2021-08-18 21:29 ` [PATCH v3 12/14] tcg/arm: More use of the TCGReg enum Richard Henderson
2021-08-18 22:05   ` Philippe Mathieu-Daudé
2021-08-18 21:29 ` [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base Richard Henderson
2021-08-20 12:03   ` Peter Maydell
2021-08-20 18:47     ` Richard Henderson
2021-08-21 10:38       ` Peter Maydell
2021-08-18 21:29 ` [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only Richard Henderson
2021-08-20 13:56   ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.