All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state
@ 2019-10-11 15:55 Richard Henderson
  2019-10-11 15:55 ` [PATCH v6 01/20] target/arm: Split out rebuild_hflags_common Richard Henderson
                   ` (20 more replies)
  0 siblings, 21 replies; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Changes since v5:
  * Fix the debug assertion ifdef in the final patch.
  * Add more calls to arm_rebuild_hflags: CPSR and M-profile
    These become two new patches, 18 & 19.
  * Update some comments per review. (Alex)

Changes since v4:
  * Split patch 1 into 15 smaller patches.
  * Cache the new DEBUG_TARGET_EL field.
  * Split out m-profile hflags separately from a-profile 32-bit.
  * Move around non-cached tb flags as well, avoiding repetitive
    checks for m-profile or other mutually exclusive conditions.

  I haven't officially re-run the performance test quoted in the
  last patch, but I have eyeballed "perf top", and have dug into
  the compiled code a bit, which resulted in a few of the new
  cleanup patches (e.g. cs_base, arm_mmu_idx_el, and
  arm_cpu_data_is_big_endian).

Changes since v3:
  * Rebase.
  * Do not cache XSCALE_CPAR now that it overlaps VECSTRIDE.
  * Leave the new v7m bits as uncached.  I haven't figured
    out all of the ways fpccr is modified.

Changes since v2:
  * Do not cache VECLEN, VECSTRIDE, VFPEN.
    These variables come from VFP_FPSCR and VFP_FPEXC, not from
    system control registers.
  * Move HANDLER and STACKCHECK to rebuild_hflags_a32,
    instead of building them in rebuild_hflags_common.

Changes since v1:
  * Apparently I had started a last-minute API change, and failed to
    covert all of the users, and also failed to re-test afterward.
  * Retain assertions for --enable-debug-tcg.


Richard Henderson (20):
  target/arm: Split out rebuild_hflags_common
  target/arm: Split out rebuild_hflags_a64
  target/arm: Split out rebuild_hflags_common_32
  target/arm: Split arm_cpu_data_is_big_endian
  target/arm: Split out rebuild_hflags_m32
  target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state
  target/arm: Split out rebuild_hflags_a32
  target/arm: Split out rebuild_hflags_aprofile
  target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in
    cpu_get_tb_cpu_state
  target/arm: Simplify set of PSTATE_SS in cpu_get_tb_cpu_state
  target/arm: Hoist computation of TBFLAG_A32.VFPEN
  target/arm: Add arm_rebuild_hflags
  target/arm: Split out arm_mmu_idx_el
  target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state
  target/arm: Add HELPER(rebuild_hflags_{a32,a64,m32})
  target/arm: Rebuild hflags at EL changes
  target/arm: Rebuild hflags at MSR writes
  target/arm: Rebuild hflags at CPSR writes
  target/arm: Rebuild hflags for M-profile.
  target/arm: Rely on hflags correct in cpu_get_tb_cpu_state

 target/arm/cpu.h           |  84 +++++---
 target/arm/helper.h        |   4 +
 target/arm/internals.h     |   9 +
 linux-user/syscall.c       |   1 +
 target/arm/cpu.c           |   1 +
 target/arm/helper-a64.c    |   3 +
 target/arm/helper.c        | 383 ++++++++++++++++++++++++-------------
 target/arm/m_helper.c      |   6 +
 target/arm/machine.c       |   1 +
 target/arm/op_helper.c     |   4 +
 target/arm/translate-a64.c |  13 +-
 target/arm/translate.c     |  28 ++-
 12 files changed, 363 insertions(+), 174 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v6 01/20] target/arm: Split out rebuild_hflags_common
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-11 15:55 ` [PATCH v6 02/20] target/arm: Split out rebuild_hflags_a64 Richard Henderson
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Create a function to compute the values of the TBFLAG_ANY bits
that will be cached.  For now, the env->hflags variable is not
used, and the results are fed back to cpu_get_tb_cpu_state.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h    | 29 ++++++++++++++++++-----------
 target/arm/helper.c | 26 +++++++++++++++++++-------
 2 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 297ad5e47a..ad79a6153b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -231,6 +231,9 @@ typedef struct CPUARMState {
     uint32_t pstate;
     uint32_t aarch64; /* 1 if CPU is in aarch64 state; inverse of PSTATE.nRW */
 
+    /* Cached TBFLAGS state.  See below for which bits are included.  */
+    uint32_t hflags;
+
     /* Frequently accessed CPSR bits are stored separately for efficiency.
        This contains all the other bits.  Use cpsr_{read,write} to access
        the whole CPSR.  */
@@ -3140,15 +3143,18 @@ typedef ARMCPU ArchCPU;
 
 #include "exec/cpu-all.h"
 
-/* Bit usage in the TB flags field: bit 31 indicates whether we are
+/*
+ * Bit usage in the TB flags field: bit 31 indicates whether we are
  * in 32 or 64 bit mode. The meaning of the other bits depends on that.
  * We put flags which are shared between 32 and 64 bit mode at the top
  * of the word, and flags which apply to only one mode at the bottom.
+ *
+ * Unless otherwise noted, these bits are cached in env->hflags.
  */
 FIELD(TBFLAG_ANY, AARCH64_STATE, 31, 1)
 FIELD(TBFLAG_ANY, MMUIDX, 28, 3)
 FIELD(TBFLAG_ANY, SS_ACTIVE, 27, 1)
-FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1)
+FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1)     /* Not cached. */
 /* Target EL if we take a floating-point-disabled exception */
 FIELD(TBFLAG_ANY, FPEXC_EL, 24, 2)
 FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
@@ -3159,13 +3165,14 @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
 FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 21, 2)
 
 /* Bit usage when in AArch32 state: */
-FIELD(TBFLAG_A32, THUMB, 0, 1)
-FIELD(TBFLAG_A32, VECLEN, 1, 3)
-FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
+FIELD(TBFLAG_A32, THUMB, 0, 1)          /* Not cached. */
+FIELD(TBFLAG_A32, VECLEN, 1, 3)         /* Not cached. */
+FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)      /* Not cached. */
 /*
  * We store the bottom two bits of the CPAR as TB flags and handle
  * checks on the other bits at runtime. This shares the same bits as
  * VECSTRIDE, which is OK as no XScale CPU has VFP.
+ * Not cached, because VECLEN+VECSTRIDE are not cached.
  */
 FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
 /*
@@ -3174,15 +3181,15 @@ FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
  * the same thing as the current security state of the processor!
  */
 FIELD(TBFLAG_A32, NS, 6, 1)
-FIELD(TBFLAG_A32, VFPEN, 7, 1)
-FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
+FIELD(TBFLAG_A32, VFPEN, 7, 1)          /* Not cached. */
+FIELD(TBFLAG_A32, CONDEXEC, 8, 8)       /* Not cached. */
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
 /* For M profile only, set if FPCCR.LSPACT is set */
-FIELD(TBFLAG_A32, LSPACT, 18, 1)
+FIELD(TBFLAG_A32, LSPACT, 18, 1)        /* Not cached. */
 /* For M profile only, set if we must create a new FP context */
-FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
+FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1) /* Not cached. */
 /* For M profile only, set if FPCCR.S does not match current security state */
-FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
+FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1) /* Not cached. */
 /* For M profile only, Handler (ie not Thread) mode */
 FIELD(TBFLAG_A32, HANDLER, 21, 1)
 /* For M profile only, whether we should generate stack-limit checks */
@@ -3194,7 +3201,7 @@ FIELD(TBFLAG_A64, SVEEXC_EL, 2, 2)
 FIELD(TBFLAG_A64, ZCR_LEN, 4, 4)
 FIELD(TBFLAG_A64, PAUTH_ACTIVE, 8, 1)
 FIELD(TBFLAG_A64, BT, 9, 1)
-FIELD(TBFLAG_A64, BTYPE, 10, 2)
+FIELD(TBFLAG_A64, BTYPE, 10, 2)         /* Not cached. */
 FIELD(TBFLAG_A64, TBID, 12, 2)
 
 static inline bool bswap_code(bool sctlr_b)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0d9a2d2ab7..8829d91ae1 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11054,6 +11054,22 @@ ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env)
 }
 #endif
 
+static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
+                                      ARMMMUIdx mmu_idx, uint32_t flags)
+{
+    flags = FIELD_DP32(flags, TBFLAG_ANY, FPEXC_EL, fp_el);
+    flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX,
+                       arm_to_core_mmu_idx(mmu_idx));
+
+    if (arm_cpu_data_is_big_endian(env)) {
+        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
+    }
+    if (arm_singlestep_active(env)) {
+        flags = FIELD_DP32(flags, TBFLAG_ANY, SS_ACTIVE, 1);
+    }
+    return flags;
+}
+
 void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
@@ -11145,7 +11161,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     }
 
-    flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX, arm_to_core_mmu_idx(mmu_idx));
+    flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
 
     /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
      * states defined in the ARM ARM for software singlestep:
@@ -11153,9 +11169,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
      *     0            x       Inactive (the TB flag for SS is always 0)
      *     1            0       Active-pending
      *     1            1       Active-not-pending
+     * SS_ACTIVE is set in hflags; PSTATE_SS is computed every TB.
      */
-    if (arm_singlestep_active(env)) {
-        flags = FIELD_DP32(flags, TBFLAG_ANY, SS_ACTIVE, 1);
+    if (FIELD_EX32(flags, TBFLAG_ANY, SS_ACTIVE)) {
         if (is_a64(env)) {
             if (env->pstate & PSTATE_SS) {
                 flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
@@ -11166,10 +11182,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             }
         }
     }
-    if (arm_cpu_data_is_big_endian(env)) {
-        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
-    }
-    flags = FIELD_DP32(flags, TBFLAG_ANY, FPEXC_EL, fp_el);
 
     if (arm_v7m_is_handler_mode(env)) {
         flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 02/20] target/arm: Split out rebuild_hflags_a64
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
  2019-10-11 15:55 ` [PATCH v6 01/20] target/arm: Split out rebuild_hflags_common Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 15:43   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 03/20] target/arm: Split out rebuild_hflags_common_32 Richard Henderson
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Create a function to compute the values of the TBFLAG_A64 bits
that will be cached.  For now, the env->hflags variable is not
used, and the results are fed back to cpu_get_tb_cpu_state.

Note that not all BTI related flags are cached, so we have to
test the BTI feature twice -- once for those bits moved out to
rebuild_hflags_a64 and once for those bits that remain in
cpu_get_tb_cpu_state.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 131 +++++++++++++++++++++++---------------------
 1 file changed, 69 insertions(+), 62 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8829d91ae1..69da04786e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11070,6 +11070,71 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
     return flags;
 }
 
+static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
+                                   ARMMMUIdx mmu_idx)
+{
+    ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
+    ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
+    uint32_t flags = 0;
+    uint64_t sctlr;
+    int tbii, tbid;
+
+    flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
+
+    /* FIXME: ARMv8.1-VHE S2 translation regime.  */
+    if (regime_el(env, stage1) < 2) {
+        ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
+        tbid = (p1.tbi << 1) | p0.tbi;
+        tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
+    } else {
+        tbid = p0.tbi;
+        tbii = tbid & !p0.tbid;
+    }
+
+    flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
+    flags = FIELD_DP32(flags, TBFLAG_A64, TBID, tbid);
+
+    if (cpu_isar_feature(aa64_sve, env_archcpu(env))) {
+        int sve_el = sve_exception_el(env, el);
+        uint32_t zcr_len;
+
+        /*
+         * If SVE is disabled, but FP is enabled,
+         * then the effective len is 0.
+         */
+        if (sve_el != 0 && fp_el == 0) {
+            zcr_len = 0;
+        } else {
+            zcr_len = sve_zcr_len_for_el(env, el);
+        }
+        flags = FIELD_DP32(flags, TBFLAG_A64, SVEEXC_EL, sve_el);
+        flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
+    }
+
+    sctlr = arm_sctlr(env, el);
+
+    if (cpu_isar_feature(aa64_pauth, env_archcpu(env))) {
+        /*
+         * In order to save space in flags, we record only whether
+         * pauth is "inactive", meaning all insns are implemented as
+         * a nop, or "active" when some action must be performed.
+         * The decision of which action to take is left to a helper.
+         */
+        if (sctlr & (SCTLR_EnIA | SCTLR_EnIB | SCTLR_EnDA | SCTLR_EnDB)) {
+            flags = FIELD_DP32(flags, TBFLAG_A64, PAUTH_ACTIVE, 1);
+        }
+    }
+
+    if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
+        /* Note that SCTLR_EL[23].BT == SCTLR_BT1.  */
+        if (sctlr & (el == 0 ? SCTLR_BT0 : SCTLR_BT1)) {
+            flags = FIELD_DP32(flags, TBFLAG_A64, BT, 1);
+        }
+    }
+
+    return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
+}
+
 void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
@@ -11079,67 +11144,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     uint32_t flags = 0;
 
     if (is_a64(env)) {
-        ARMCPU *cpu = env_archcpu(env);
-        uint64_t sctlr;
-
         *pc = env->pc;
-        flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
-
-        /* Get control bits for tagged addresses.  */
-        {
-            ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
-            ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
-            int tbii, tbid;
-
-            /* FIXME: ARMv8.1-VHE S2 translation regime.  */
-            if (regime_el(env, stage1) < 2) {
-                ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
-                tbid = (p1.tbi << 1) | p0.tbi;
-                tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
-            } else {
-                tbid = p0.tbi;
-                tbii = tbid & !p0.tbid;
-            }
-
-            flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
-            flags = FIELD_DP32(flags, TBFLAG_A64, TBID, tbid);
-        }
-
-        if (cpu_isar_feature(aa64_sve, cpu)) {
-            int sve_el = sve_exception_el(env, current_el);
-            uint32_t zcr_len;
-
-            /* If SVE is disabled, but FP is enabled,
-             * then the effective len is 0.
-             */
-            if (sve_el != 0 && fp_el == 0) {
-                zcr_len = 0;
-            } else {
-                zcr_len = sve_zcr_len_for_el(env, current_el);
-            }
-            flags = FIELD_DP32(flags, TBFLAG_A64, SVEEXC_EL, sve_el);
-            flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
-        }
-
-        sctlr = arm_sctlr(env, current_el);
-
-        if (cpu_isar_feature(aa64_pauth, cpu)) {
-            /*
-             * In order to save space in flags, we record only whether
-             * pauth is "inactive", meaning all insns are implemented as
-             * a nop, or "active" when some action must be performed.
-             * The decision of which action to take is left to a helper.
-             */
-            if (sctlr & (SCTLR_EnIA | SCTLR_EnIB | SCTLR_EnDA | SCTLR_EnDB)) {
-                flags = FIELD_DP32(flags, TBFLAG_A64, PAUTH_ACTIVE, 1);
-            }
-        }
-
-        if (cpu_isar_feature(aa64_bti, cpu)) {
-            /* Note that SCTLR_EL[23].BT == SCTLR_BT1.  */
-            if (sctlr & (current_el == 0 ? SCTLR_BT0 : SCTLR_BT1)) {
-                flags = FIELD_DP32(flags, TBFLAG_A64, BT, 1);
-            }
+        flags = rebuild_hflags_a64(env, current_el, fp_el, mmu_idx);
+        if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
             flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
         }
     } else {
@@ -11159,9 +11166,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             flags = FIELD_DP32(flags, TBFLAG_A32,
                                XSCALE_CPAR, env->cp15.c15_cpar);
         }
-    }
 
-    flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
+        flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
+    }
 
     /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
      * states defined in the ARM ARM for software singlestep:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 03/20] target/arm: Split out rebuild_hflags_common_32
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
  2019-10-11 15:55 ` [PATCH v6 01/20] target/arm: Split out rebuild_hflags_common Richard Henderson
  2019-10-11 15:55 ` [PATCH v6 02/20] target/arm: Split out rebuild_hflags_a64 Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 15:53   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 04/20] target/arm: Split arm_cpu_data_is_big_endian Richard Henderson
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Create a function to compute the values of the TBFLAG_A32 bits
that will be cached, and are used by all profiles.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 69da04786e..f05d042474 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11070,6 +11070,15 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
     return flags;
 }
 
+static uint32_t rebuild_hflags_common_32(CPUARMState *env, int fp_el,
+                                         ARMMMUIdx mmu_idx, uint32_t flags)
+{
+    flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
+    flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
+
+    return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
+}
+
 static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
                                    ARMMMUIdx mmu_idx)
 {
@@ -11141,7 +11150,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     ARMMMUIdx mmu_idx = arm_mmu_idx(env);
     int current_el = arm_current_el(env);
     int fp_el = fp_exception_el(env, current_el);
-    uint32_t flags = 0;
+    uint32_t flags;
 
     if (is_a64(env)) {
         *pc = env->pc;
@@ -11151,12 +11160,11 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     } else {
         *pc = env->regs[15];
+        flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
         flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
         flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN, env->vfp.vec_len);
         flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE, env->vfp.vec_stride);
         flags = FIELD_DP32(flags, TBFLAG_A32, CONDEXEC, env->condexec_bits);
-        flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
-        flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
         if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
             || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
@@ -11166,8 +11174,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             flags = FIELD_DP32(flags, TBFLAG_A32,
                                XSCALE_CPAR, env->cp15.c15_cpar);
         }
-
-        flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
     }
 
     /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 04/20] target/arm: Split arm_cpu_data_is_big_endian
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (2 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 03/20] target/arm: Split out rebuild_hflags_common_32 Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 16:01   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 05/20] target/arm: Split out rebuild_hflags_m32 Richard Henderson
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Set TBFLAG_ANY.BE_DATA in rebuild_hflags_common_32 and
rebuild_hflags_a64 instead of rebuild_hflags_common, where we do
not need to re-test is_a64() nor re-compute the various inputs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h    | 49 +++++++++++++++++++++++++++------------------
 target/arm/helper.c | 16 +++++++++++----
 2 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index ad79a6153b..4d961474ce 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3108,33 +3108,44 @@ static inline uint64_t arm_sctlr(CPUARMState *env, int el)
     }
 }
 
+static inline bool arm_cpu_data_is_big_endian_a32(CPUARMState *env,
+                                                  bool sctlr_b)
+{
+#ifdef CONFIG_USER_ONLY
+    /*
+     * In system mode, BE32 is modelled in line with the
+     * architecture (as word-invariant big-endianness), where loads
+     * and stores are done little endian but from addresses which
+     * are adjusted by XORing with the appropriate constant. So the
+     * endianness to use for the raw data access is not affected by
+     * SCTLR.B.
+     * In user mode, however, we model BE32 as byte-invariant
+     * big-endianness (because user-only code cannot tell the
+     * difference), and so we need to use a data access endianness
+     * that depends on SCTLR.B.
+     */
+    if (sctlr_b) {
+        return true;
+    }
+#endif
+    /* In 32bit endianness is determined by looking at CPSR's E bit */
+    return env->uncached_cpsr & CPSR_E;
+}
+
+static inline bool arm_cpu_data_is_big_endian_a64(int el, uint64_t sctlr)
+{
+    return sctlr & (el ? SCTLR_EE : SCTLR_E0E);
+}
 
 /* Return true if the processor is in big-endian mode. */
 static inline bool arm_cpu_data_is_big_endian(CPUARMState *env)
 {
-    /* In 32bit endianness is determined by looking at CPSR's E bit */
     if (!is_a64(env)) {
-        return
-#ifdef CONFIG_USER_ONLY
-            /* In system mode, BE32 is modelled in line with the
-             * architecture (as word-invariant big-endianness), where loads
-             * and stores are done little endian but from addresses which
-             * are adjusted by XORing with the appropriate constant. So the
-             * endianness to use for the raw data access is not affected by
-             * SCTLR.B.
-             * In user mode, however, we model BE32 as byte-invariant
-             * big-endianness (because user-only code cannot tell the
-             * difference), and so we need to use a data access endianness
-             * that depends on SCTLR.B.
-             */
-            arm_sctlr_b(env) ||
-#endif
-                ((env->uncached_cpsr & CPSR_E) ? 1 : 0);
+        return arm_cpu_data_is_big_endian_a32(env, arm_sctlr_b(env));
     } else {
         int cur_el = arm_current_el(env);
         uint64_t sctlr = arm_sctlr(env, cur_el);
-
-        return (sctlr & (cur_el ? SCTLR_EE : SCTLR_E0E)) != 0;
+        return arm_cpu_data_is_big_endian_a64(cur_el, sctlr);
     }
 }
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index f05d042474..4c65476d93 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11061,9 +11061,6 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
     flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX,
                        arm_to_core_mmu_idx(mmu_idx));
 
-    if (arm_cpu_data_is_big_endian(env)) {
-        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
-    }
     if (arm_singlestep_active(env)) {
         flags = FIELD_DP32(flags, TBFLAG_ANY, SS_ACTIVE, 1);
     }
@@ -11073,7 +11070,14 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
 static uint32_t rebuild_hflags_common_32(CPUARMState *env, int fp_el,
                                          ARMMMUIdx mmu_idx, uint32_t flags)
 {
-    flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
+    bool sctlr_b = arm_sctlr_b(env);
+
+    if (sctlr_b) {
+        flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, 1);
+    }
+    if (arm_cpu_data_is_big_endian_a32(env, sctlr_b)) {
+        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
+    }
     flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
 
     return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
@@ -11122,6 +11126,10 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
 
     sctlr = arm_sctlr(env, el);
 
+    if (arm_cpu_data_is_big_endian_a64(el, sctlr)) {
+        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
+    }
+
     if (cpu_isar_feature(aa64_pauth, env_archcpu(env))) {
         /*
          * In order to save space in flags, we record only whether
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 05/20] target/arm: Split out rebuild_hflags_m32
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (3 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 04/20] target/arm: Split arm_cpu_data_is_big_endian Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 16:13   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 06/20] target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state Richard Henderson
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Create a function to compute the values of the TBFLAG_A32 bits
that will be cached, and are used by M-profile.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 45 ++++++++++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 15 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 4c65476d93..d4303420da 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11083,6 +11083,29 @@ static uint32_t rebuild_hflags_common_32(CPUARMState *env, int fp_el,
     return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
 }
 
+static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
+                                   ARMMMUIdx mmu_idx)
+{
+    uint32_t flags = 0;
+
+    if (arm_v7m_is_handler_mode(env)) {
+        flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
+    }
+
+    /*
+     * v8M always applies stack limit checks unless CCR.STKOFHFNMIGN
+     * is suppressing them because the requested execution priority
+     * is less than 0.
+     */
+    if (arm_feature(env, ARM_FEATURE_V8) &&
+        !((mmu_idx & ARM_MMU_IDX_M_NEGPRI) &&
+          (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_STKOFHFNMIGN_MASK))) {
+        flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
+    }
+
+    return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
+}
+
 static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
                                    ARMMMUIdx mmu_idx)
 {
@@ -11168,7 +11191,13 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     } else {
         *pc = env->regs[15];
-        flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
+
+        if (arm_feature(env, ARM_FEATURE_M)) {
+            flags = rebuild_hflags_m32(env, fp_el, mmu_idx);
+        } else {
+            flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
+        }
+
         flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
         flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN, env->vfp.vec_len);
         flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE, env->vfp.vec_stride);
@@ -11204,20 +11233,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     }
 
-    if (arm_v7m_is_handler_mode(env)) {
-        flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
-    }
-
-    /* v8M always applies stack limit checks unless CCR.STKOFHFNMIGN is
-     * suppressing them because the requested execution priority is less than 0.
-     */
-    if (arm_feature(env, ARM_FEATURE_V8) &&
-        arm_feature(env, ARM_FEATURE_M) &&
-        !((mmu_idx  & ARM_MMU_IDX_M_NEGPRI) &&
-          (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_STKOFHFNMIGN_MASK))) {
-        flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
-    }
-
     if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
         FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
         flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 06/20] target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (4 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 05/20] target/arm: Split out rebuild_hflags_m32 Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 16:17   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 07/20] target/arm: Split out rebuild_hflags_a32 Richard Henderson
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Hoist the computation of some TBFLAG_A32 bits that only apply to
M-profile under a single test for ARM_FEATURE_M.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 49 +++++++++++++++++++++------------------------
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index d4303420da..296a4b2232 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11194,6 +11194,29 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
 
         if (arm_feature(env, ARM_FEATURE_M)) {
             flags = rebuild_hflags_m32(env, fp_el, mmu_idx);
+
+            if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+                FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S)
+                != env->v7m.secure) {
+                flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
+            }
+
+            if ((env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
+                (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
+                 (env->v7m.secure &&
+                  !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
+                /*
+                 * ASPEN is set, but FPCA/SFPA indicate that there is no
+                 * active FP context; we must create a new FP context before
+                 * executing any FP insn.
+                 */
+                flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
+            }
+
+            bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+            if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
+                flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
+            }
         } else {
             flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
         }
@@ -11233,32 +11256,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     }
 
-    if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
-        FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
-        flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
-    }
-
-    if (arm_feature(env, ARM_FEATURE_M) &&
-        (env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
-        (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
-         (env->v7m.secure &&
-          !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
-        /*
-         * ASPEN is set, but FPCA/SFPA indicate that there is no active
-         * FP context; we must create a new FP context before executing
-         * any FP insn.
-         */
-        flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
-    }
-
-    if (arm_feature(env, ARM_FEATURE_M)) {
-        bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
-
-        if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
-            flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
-        }
-    }
-
     if (!arm_feature(env, ARM_FEATURE_M)) {
         int target_el = arm_debug_target_el(env);
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 07/20] target/arm: Split out rebuild_hflags_a32
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (5 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 06/20] target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 16:17   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 08/20] target/arm: Split out rebuild_hflags_aprofile Richard Henderson
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Currently a trivial wrapper for rebuild_hflags_common_32.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 296a4b2232..d1cd54cc93 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11106,6 +11106,12 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
     return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
 }
 
+static uint32_t rebuild_hflags_a32(CPUARMState *env, int fp_el,
+                                   ARMMMUIdx mmu_idx)
+{
+    return rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
+}
+
 static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
                                    ARMMMUIdx mmu_idx)
 {
@@ -11218,7 +11224,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                 flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
             }
         } else {
-            flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
+            flags = rebuild_hflags_a32(env, fp_el, mmu_idx);
         }
 
         flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 08/20] target/arm: Split out rebuild_hflags_aprofile
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (6 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 07/20] target/arm: Split out rebuild_hflags_a32 Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 16:19   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 09/20] target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state Richard Henderson
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Create a function to compute the values of the TBFLAG_ANY bits
that will be cached, and are used by A-profile.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index d1cd54cc93..ddd21edfcf 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11106,18 +11106,28 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
     return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
 }
 
+static uint32_t rebuild_hflags_aprofile(CPUARMState *env)
+{
+    int flags = 0;
+
+    flags = FIELD_DP32(flags, TBFLAG_ANY, DEBUG_TARGET_EL,
+                       arm_debug_target_el(env));
+    return flags;
+}
+
 static uint32_t rebuild_hflags_a32(CPUARMState *env, int fp_el,
                                    ARMMMUIdx mmu_idx)
 {
-    return rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
+    uint32_t flags = rebuild_hflags_aprofile(env);
+    return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
 }
 
 static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
                                    ARMMMUIdx mmu_idx)
 {
+    uint32_t flags = rebuild_hflags_aprofile(env);
     ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
     ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
-    uint32_t flags = 0;
     uint64_t sctlr;
     int tbii, tbid;
 
@@ -11262,12 +11272,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         }
     }
 
-    if (!arm_feature(env, ARM_FEATURE_M)) {
-        int target_el = arm_debug_target_el(env);
-
-        flags = FIELD_DP32(flags, TBFLAG_ANY, DEBUG_TARGET_EL, target_el);
-    }
-
     *pflags = flags;
     *cs_base = 0;
 }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 09/20] target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (7 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 08/20] target/arm: Split out rebuild_hflags_aprofile Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 16:39   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 10/20] target/arm: Simplify set of PSTATE_SS " Richard Henderson
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

We do not need to compute any of these values for M-profile.
Further, XSCALE_CPAR overlaps VECSTRIDE so obviously the two
sets must be mutually exclusive.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index ddd21edfcf..e2a62cf19a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11235,21 +11235,28 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             }
         } else {
             flags = rebuild_hflags_a32(env, fp_el, mmu_idx);
+
+            /*
+             * Note that XSCALE_CPAR shares bits with VECSTRIDE.
+             * Note that VECLEN+VECSTRIDE are RES0 for M-profile.
+             */
+            if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+                flags = FIELD_DP32(flags, TBFLAG_A32,
+                                   XSCALE_CPAR, env->cp15.c15_cpar);
+            } else {
+                flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN,
+                                   env->vfp.vec_len);
+                flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE,
+                                   env->vfp.vec_stride);
+            }
         }
 
         flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
-        flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN, env->vfp.vec_len);
-        flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE, env->vfp.vec_stride);
         flags = FIELD_DP32(flags, TBFLAG_A32, CONDEXEC, env->condexec_bits);
         if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
             || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
         }
-        /* Note that XSCALE_CPAR shares bits with VECSTRIDE */
-        if (arm_feature(env, ARM_FEATURE_XSCALE)) {
-            flags = FIELD_DP32(flags, TBFLAG_A32,
-                               XSCALE_CPAR, env->cp15.c15_cpar);
-        }
     }
 
     /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 10/20] target/arm: Simplify set of PSTATE_SS in cpu_get_tb_cpu_state
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (8 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 09/20] target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 18:21   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 11/20] target/arm: Hoist computation of TBFLAG_A32.VFPEN Richard Henderson
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Hoist the variable load for PSTATE into the existing test vs is_a64.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index e2a62cf19a..398e5f5d6d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11197,7 +11197,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     ARMMMUIdx mmu_idx = arm_mmu_idx(env);
     int current_el = arm_current_el(env);
     int fp_el = fp_exception_el(env, current_el);
-    uint32_t flags;
+    uint32_t flags, pstate_for_ss;
 
     if (is_a64(env)) {
         *pc = env->pc;
@@ -11205,6 +11205,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
             flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
         }
+        pstate_for_ss = env->pstate;
     } else {
         *pc = env->regs[15];
 
@@ -11257,9 +11258,11 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
         }
+        pstate_for_ss = env->uncached_cpsr;
     }
 
-    /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
+    /*
+     * The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
      * states defined in the ARM ARM for software singlestep:
      *  SS_ACTIVE   PSTATE.SS   State
      *     0            x       Inactive (the TB flag for SS is always 0)
@@ -11267,16 +11270,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
      *     1            1       Active-not-pending
      * SS_ACTIVE is set in hflags; PSTATE_SS is computed every TB.
      */
-    if (FIELD_EX32(flags, TBFLAG_ANY, SS_ACTIVE)) {
-        if (is_a64(env)) {
-            if (env->pstate & PSTATE_SS) {
-                flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
-            }
-        } else {
-            if (env->uncached_cpsr & PSTATE_SS) {
-                flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
-            }
-        }
+    if (FIELD_EX32(flags, TBFLAG_ANY, SS_ACTIVE) &&
+        (pstate_for_ss & PSTATE_SS)) {
+        flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
     }
 
     *pflags = flags;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 11/20] target/arm: Hoist computation of TBFLAG_A32.VFPEN
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (9 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 10/20] target/arm: Simplify set of PSTATE_SS " Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 18:46   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 12/20] target/arm: Add arm_rebuild_hflags Richard Henderson
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

There are 3 conditions that each enable this flag.  M-profile always
enables; A-profile with EL1 as AA64 always enables.  Both of these
conditions can easily be cached.  The final condition relies on the
FPEXC register which we are not prepared to cache.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h    |  2 +-
 target/arm/helper.c | 14 ++++++++++----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4d961474ce..9909ff89d4 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3192,7 +3192,7 @@ FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
  * the same thing as the current security state of the processor!
  */
 FIELD(TBFLAG_A32, NS, 6, 1)
-FIELD(TBFLAG_A32, VFPEN, 7, 1)          /* Not cached. */
+FIELD(TBFLAG_A32, VFPEN, 7, 1)          /* Partially cached, minus FPEXC. */
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)       /* Not cached. */
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
 /* For M profile only, set if FPCCR.LSPACT is set */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 398e5f5d6d..89aa6fd933 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11088,6 +11088,9 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
 {
     uint32_t flags = 0;
 
+    /* v8M always enables the fpu.  */
+    flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
+
     if (arm_v7m_is_handler_mode(env)) {
         flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
     }
@@ -11119,6 +11122,10 @@ static uint32_t rebuild_hflags_a32(CPUARMState *env, int fp_el,
                                    ARMMMUIdx mmu_idx)
 {
     uint32_t flags = rebuild_hflags_aprofile(env);
+
+    if (arm_el_is_aa64(env, 1)) {
+        flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
+    }
     return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
 }
 
@@ -11250,14 +11257,13 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                 flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE,
                                    env->vfp.vec_stride);
             }
+            if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)) {
+                flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
+            }
         }
 
         flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
         flags = FIELD_DP32(flags, TBFLAG_A32, CONDEXEC, env->condexec_bits);
-        if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
-            || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
-            flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
-        }
         pstate_for_ss = env->uncached_cpsr;
     }
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 12/20] target/arm: Add arm_rebuild_hflags
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (10 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 11/20] target/arm: Hoist computation of TBFLAG_A32.VFPEN Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 18:47   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 13/20] target/arm: Split out arm_mmu_idx_el Richard Henderson
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

This function assumes nothing about the current state of the cpu,
and writes the computed value to env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h    |  6 ++++++
 target/arm/helper.c | 30 ++++++++++++++++++++++--------
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9909ff89d4..d844ea21d8 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3297,6 +3297,12 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
 void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, void
         *opaque);
 
+/**
+ * arm_rebuild_hflags:
+ * Rebuild the cached TBFLAGS for arbitrary changed processor state.
+ */
+void arm_rebuild_hflags(CPUARMState *env);
+
 /**
  * aa32_vfp_dreg:
  * Return a pointer to the Dn register within env in 32-bit mode.
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 89aa6fd933..85de96d071 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11198,17 +11198,35 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
     return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
 }
 
+static uint32_t rebuild_hflags_internal(CPUARMState *env)
+{
+    int el = arm_current_el(env);
+    int fp_el = fp_exception_el(env, el);
+    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+
+    if (is_a64(env)) {
+        return rebuild_hflags_a64(env, el, fp_el, mmu_idx);
+    } else if (arm_feature(env, ARM_FEATURE_M)) {
+        return rebuild_hflags_m32(env, fp_el, mmu_idx);
+    } else {
+        return rebuild_hflags_a32(env, fp_el, mmu_idx);
+    }
+}
+
+void arm_rebuild_hflags(CPUARMState *env)
+{
+    env->hflags = rebuild_hflags_internal(env);
+}
+
 void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
-    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
-    int current_el = arm_current_el(env);
-    int fp_el = fp_exception_el(env, current_el);
     uint32_t flags, pstate_for_ss;
 
+    flags = rebuild_hflags_internal(env);
+
     if (is_a64(env)) {
         *pc = env->pc;
-        flags = rebuild_hflags_a64(env, current_el, fp_el, mmu_idx);
         if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
             flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
         }
@@ -11217,8 +11235,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         *pc = env->regs[15];
 
         if (arm_feature(env, ARM_FEATURE_M)) {
-            flags = rebuild_hflags_m32(env, fp_el, mmu_idx);
-
             if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
                 FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S)
                 != env->v7m.secure) {
@@ -11242,8 +11258,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                 flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
             }
         } else {
-            flags = rebuild_hflags_a32(env, fp_el, mmu_idx);
-
             /*
              * Note that XSCALE_CPAR shares bits with VECSTRIDE.
              * Note that VECLEN+VECSTRIDE are RES0 for M-profile.
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 13/20] target/arm: Split out arm_mmu_idx_el
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (11 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 12/20] target/arm: Add arm_rebuild_hflags Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 18:49   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 14/20] target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state Richard Henderson
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Avoid calling arm_current_el() twice.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/internals.h |  9 +++++++++
 target/arm/helper.c    | 12 +++++++-----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 232d963875..f5313dd3d4 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -949,6 +949,15 @@ void arm_cpu_update_virq(ARMCPU *cpu);
  */
 void arm_cpu_update_vfiq(ARMCPU *cpu);
 
+/**
+ * arm_mmu_idx_el:
+ * @env: The cpu environment
+ * @el: The EL to use.
+ *
+ * Return the full ARMMMUIdx for the translation regime for EL.
+ */
+ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el);
+
 /**
  * arm_mmu_idx:
  * @env: The cpu environment
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 85de96d071..3f7d3f257d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11026,15 +11026,12 @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
 }
 #endif
 
-ARMMMUIdx arm_mmu_idx(CPUARMState *env)
+ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
 {
-    int el;
-
     if (arm_feature(env, ARM_FEATURE_M)) {
         return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
     }
 
-    el = arm_current_el(env);
     if (el < 2 && arm_is_secure_below_el3(env)) {
         return ARMMMUIdx_S1SE0 + el;
     } else {
@@ -11042,6 +11039,11 @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env)
     }
 }
 
+ARMMMUIdx arm_mmu_idx(CPUARMState *env)
+{
+    return arm_mmu_idx_el(env, arm_current_el(env));
+}
+
 int cpu_mmu_index(CPUARMState *env, bool ifetch)
 {
     return arm_to_core_mmu_idx(arm_mmu_idx(env));
@@ -11202,7 +11204,7 @@ static uint32_t rebuild_hflags_internal(CPUARMState *env)
 {
     int el = arm_current_el(env);
     int fp_el = fp_exception_el(env, el);
-    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
 
     if (is_a64(env)) {
         return rebuild_hflags_a64(env, el, fp_el, mmu_idx);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 14/20] target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (12 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 13/20] target/arm: Split out arm_mmu_idx_el Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 18:51   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 15/20] target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32}) Richard Henderson
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

By performing this store early, we avoid having to save and restore
the register holding the address around any function calls.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3f7d3f257d..37424e3d4d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11225,6 +11225,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
 {
     uint32_t flags, pstate_for_ss;
 
+    *cs_base = 0;
     flags = rebuild_hflags_internal(env);
 
     if (is_a64(env)) {
@@ -11298,7 +11299,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     }
 
     *pflags = flags;
-    *cs_base = 0;
 }
 
 #ifdef TARGET_AARCH64
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 15/20] target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32})
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (13 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 14/20] target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 18:59   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 16/20] target/arm: Rebuild hflags at EL changes Richard Henderson
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

This functions are given the mode and el state of the cpu
and writes the computed value to env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.h |  4 ++++
 target/arm/helper.c | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 1fb2cb5a77..3d4ec267a2 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -90,6 +90,10 @@ DEF_HELPER_4(msr_banked, void, env, i32, i32, i32)
 DEF_HELPER_2(get_user_reg, i32, env, i32)
 DEF_HELPER_3(set_user_reg, void, env, i32, i32)
 
+DEF_HELPER_FLAGS_2(rebuild_hflags_m32, TCG_CALL_NO_RWG, void, env, int)
+DEF_HELPER_FLAGS_2(rebuild_hflags_a32, TCG_CALL_NO_RWG, void, env, int)
+DEF_HELPER_FLAGS_2(rebuild_hflags_a64, TCG_CALL_NO_RWG, void, env, int)
+
 DEF_HELPER_1(vfp_get_fpscr, i32, env)
 DEF_HELPER_2(vfp_set_fpscr, void, env, i32)
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 37424e3d4d..b2d701cf00 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11220,6 +11220,30 @@ void arm_rebuild_hflags(CPUARMState *env)
     env->hflags = rebuild_hflags_internal(env);
 }
 
+void HELPER(rebuild_hflags_m32)(CPUARMState *env, int el)
+{
+    int fp_el = fp_exception_el(env, el);
+    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
+
+    env->hflags = rebuild_hflags_m32(env, fp_el, mmu_idx);
+}
+
+void HELPER(rebuild_hflags_a32)(CPUARMState *env, int el)
+{
+    int fp_el = fp_exception_el(env, el);
+    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
+
+    env->hflags = rebuild_hflags_a32(env, fp_el, mmu_idx);
+}
+
+void HELPER(rebuild_hflags_a64)(CPUARMState *env, int el)
+{
+    int fp_el = fp_exception_el(env, el);
+    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
+
+    env->hflags = rebuild_hflags_a64(env, el, fp_el, mmu_idx);
+}
+
 void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 16/20] target/arm: Rebuild hflags at EL changes
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (14 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 15/20] target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32}) Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 19:01   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 17/20] target/arm: Rebuild hflags at MSR writes Richard Henderson
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Begin setting, but not relying upon, env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/syscall.c    | 1 +
 target/arm/cpu.c        | 1 +
 target/arm/helper-a64.c | 3 +++
 target/arm/helper.c     | 2 ++
 target/arm/machine.c    | 1 +
 target/arm/op_helper.c  | 1 +
 6 files changed, 9 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index e2af3c1494..ebefd05140 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -9982,6 +9982,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
                     aarch64_sve_narrow_vq(env, vq);
                 }
                 env->vfp.zcr_el[1] = vq - 1;
+                arm_rebuild_hflags(env);
                 ret = vq * 16;
             }
             return ret;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 2399c14471..d043e75166 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -406,6 +406,7 @@ static void arm_cpu_reset(CPUState *s)
 
     hw_breakpoint_update_all(cpu);
     hw_watchpoint_update_all(cpu);
+    arm_rebuild_hflags(env);
 }
 
 bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index bca80bdc38..b4cd680fc4 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -1025,6 +1025,7 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
         } else {
             env->regs[15] = new_pc & ~0x3;
         }
+        helper_rebuild_hflags_a32(env, new_el);
         qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
                       "AArch32 EL%d PC 0x%" PRIx32 "\n",
                       cur_el, new_el, env->regs[15]);
@@ -1036,10 +1037,12 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
         }
         aarch64_restore_sp(env, new_el);
         env->pc = new_pc;
+        helper_rebuild_hflags_a64(env, new_el);
         qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
                       "AArch64 EL%d PC 0x%" PRIx64 "\n",
                       cur_el, new_el, env->pc);
     }
+
     /*
      * Note that cur_el can never be 0.  If new_el is 0, then
      * el0_a64 is return_to_aa64, else el0_a64 is ignored.
diff --git a/target/arm/helper.c b/target/arm/helper.c
index b2d701cf00..aae7b62458 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7998,6 +7998,7 @@ static void take_aarch32_exception(CPUARMState *env, int new_mode,
         env->regs[14] = env->regs[15] + offset;
     }
     env->regs[15] = newpc;
+    arm_rebuild_hflags(env);
 }
 
 static void arm_cpu_do_interrupt_aarch32_hyp(CPUState *cs)
@@ -8345,6 +8346,7 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     pstate_write(env, PSTATE_DAIF | new_mode);
     env->aarch64 = 1;
     aarch64_restore_sp(env, new_el);
+    helper_rebuild_hflags_a64(env, new_el);
 
     env->pc = addr;
 
diff --git a/target/arm/machine.c b/target/arm/machine.c
index 5c36707a7c..eb28b2381b 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -756,6 +756,7 @@ static int cpu_post_load(void *opaque, int version_id)
     if (!kvm_enabled()) {
         pmu_op_finish(&cpu->env);
     }
+    arm_rebuild_hflags(&cpu->env);
 
     return 0;
 }
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 0fd4bd0238..ccc2cecb46 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -404,6 +404,7 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
      * state. Do the masking now.
      */
     env->regs[15] &= (env->thumb ? ~1 : ~3);
+    arm_rebuild_hflags(env);
 
     qemu_mutex_lock_iothread();
     arm_call_el_change_hook(env_archcpu(env));
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 17/20] target/arm: Rebuild hflags at MSR writes
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (15 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 16/20] target/arm: Rebuild hflags at EL changes Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 19:03   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes Richard Henderson
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Continue setting, but not relying upon, env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 13 +++++++++++--
 target/arm/translate.c     | 28 +++++++++++++++++++++++-----
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2d6cd09634..d4bebbe629 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1789,8 +1789,17 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
     if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
         /* I/O operations must end the TB here (whether read or write) */
         s->base.is_jmp = DISAS_UPDATE;
-    } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
-        /* We default to ending the TB on a coprocessor register write,
+    }
+    if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
+        /*
+         * A write to any coprocessor regiser that ends a TB
+         * must rebuild the hflags for the next TB.
+         */
+        TCGv_i32 tcg_el = tcg_const_i32(s->current_el);
+        gen_helper_rebuild_hflags_a64(cpu_env, tcg_el);
+        tcg_temp_free_i32(tcg_el);
+        /*
+         * We default to ending the TB on a coprocessor register write,
          * but allow this to be suppressed by the register definition
          * (usually only necessary to work around guest bugs).
          */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 698c594e8c..cb47cd9744 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -6890,6 +6890,8 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
     ri = get_arm_cp_reginfo(s->cp_regs,
             ENCODE_CP_REG(cpnum, is64, s->ns, crn, crm, opc1, opc2));
     if (ri) {
+        bool need_exit_tb;
+
         /* Check access permissions */
         if (!cp_access_ok(s->current_el, ri, isread)) {
             return 1;
@@ -7068,14 +7070,30 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
             }
         }
 
-        if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
-            /* I/O operations must end the TB here (whether read or write) */
-            gen_lookup_tb(s);
-        } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
-            /* We default to ending the TB on a coprocessor register write,
+        /* I/O operations must end the TB here (whether read or write) */
+        need_exit_tb = ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) &&
+                        (ri->type & ARM_CP_IO));
+
+        if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
+            /*
+             * A write to any coprocessor regiser that ends a TB
+             * must rebuild the hflags for the next TB.
+             */
+            TCGv_i32 tcg_el = tcg_const_i32(s->current_el);
+            if (arm_dc_feature(s, ARM_FEATURE_M)) {
+                gen_helper_rebuild_hflags_m32(cpu_env, tcg_el);
+            } else {
+                gen_helper_rebuild_hflags_a32(cpu_env, tcg_el);
+            }
+            tcg_temp_free_i32(tcg_el);
+            /*
+             * We default to ending the TB on a coprocessor register write,
              * but allow this to be suppressed by the register definition
              * (usually only necessary to work around guest bugs).
              */
+            need_exit_tb = true;
+        }
+        if (need_exit_tb) {
             gen_lookup_tb(s);
         }
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (16 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 17/20] target/arm: Rebuild hflags at MSR writes Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 19:08   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 19/20] target/arm: Rebuild hflags for M-profile Richard Henderson
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Continue setting, but not relying upon, env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/op_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index ccc2cecb46..b529d6c1bf 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -224,6 +224,7 @@ uint32_t HELPER(usat16)(CPUARMState *env, uint32_t x, uint32_t shift)
 void HELPER(setend)(CPUARMState *env)
 {
     env->uncached_cpsr ^= CPSR_E;
+    arm_rebuild_hflags(env);
 }
 
 /* Function checks whether WFx (WFI/WFE) instructions are set up to be trapped.
@@ -387,6 +388,8 @@ uint32_t HELPER(cpsr_read)(CPUARMState *env)
 void HELPER(cpsr_write)(CPUARMState *env, uint32_t val, uint32_t mask)
 {
     cpsr_write(env, val, mask, CPSRWriteByInstr);
+    /* TODO: Not all cpsr bits are relevant to hflags.  */
+    arm_rebuild_hflags(env);
 }
 
 /* Write the CPSR for a 32-bit exception return */
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 19/20] target/arm: Rebuild hflags for M-profile.
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (17 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-14 19:08   ` Alex Bennée
  2019-10-11 15:55 ` [PATCH v6 20/20] target/arm: Rely on hflags correct in cpu_get_tb_cpu_state Richard Henderson
  2019-10-17 15:26 ` [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Peter Maydell
  20 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

Continue setting, but not relying upon, env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/m_helper.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
index 27cd2f3f96..f2512e448e 100644
--- a/target/arm/m_helper.c
+++ b/target/arm/m_helper.c
@@ -494,6 +494,7 @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
     switch_v7m_security_state(env, dest & 1);
     env->thumb = 1;
     env->regs[15] = dest & ~1;
+    arm_rebuild_hflags(env);
 }
 
 void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
@@ -555,6 +556,7 @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
     switch_v7m_security_state(env, 0);
     env->thumb = 1;
     env->regs[15] = dest;
+    arm_rebuild_hflags(env);
 }
 
 static uint32_t *get_v7m_sp_ptr(CPUARMState *env, bool secure, bool threadmode,
@@ -895,6 +897,7 @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     env->regs[14] = lr;
     env->regs[15] = addr & 0xfffffffe;
     env->thumb = addr & 1;
+    arm_rebuild_hflags(env);
 }
 
 static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
@@ -1765,6 +1768,7 @@ static void do_v7m_exception_exit(ARMCPU *cpu)
 
     /* Otherwise, we have a successful exception exit. */
     arm_clear_exclusive(env);
+    arm_rebuild_hflags(env);
     qemu_log_mask(CPU_LOG_INT, "...successful exception return\n");
 }
 
@@ -1837,6 +1841,7 @@ static bool do_v7m_function_return(ARMCPU *cpu)
     xpsr_write(env, 0, XPSR_IT);
     env->thumb = newpc & 1;
     env->regs[15] = newpc & ~1;
+    arm_rebuild_hflags(env);
 
     qemu_log_mask(CPU_LOG_INT, "...function return successful\n");
     return true;
@@ -1959,6 +1964,7 @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
     switch_v7m_security_state(env, true);
     xpsr_write(env, 0, XPSR_IT);
     env->regs[15] += 4;
+    arm_rebuild_hflags(env);
     return true;
 
 gen_invep:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 20/20] target/arm: Rely on hflags correct in cpu_get_tb_cpu_state
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (18 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 19/20] target/arm: Rebuild hflags for M-profile Richard Henderson
@ 2019-10-11 15:55 ` Richard Henderson
  2019-10-17 15:26 ` [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Peter Maydell
  20 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-10-11 15:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm

This is the payoff.

From perf record -g data of ubuntu 18 boot and shutdown:

BEFORE:

-   23.02%     2.82%  qemu-system-aar  [.] helper_lookup_tb_ptr
   - 20.22% helper_lookup_tb_ptr
      + 10.05% tb_htable_lookup
      - 9.13% cpu_get_tb_cpu_state
           3.20% aa64_va_parameters_both
           0.55% fp_exception_el

-   11.66%     4.74%  qemu-system-aar  [.] cpu_get_tb_cpu_state
   - 6.96% cpu_get_tb_cpu_state
        3.63% aa64_va_parameters_both
        0.60% fp_exception_el
        0.53% sve_exception_el

AFTER:

-   16.40%     3.40%  qemu-system-aar  [.] helper_lookup_tb_ptr
   - 13.03% helper_lookup_tb_ptr
      + 11.19% tb_htable_lookup
        0.55% cpu_get_tb_cpu_state

     0.98%     0.71%  qemu-system-aar  [.] cpu_get_tb_cpu_state

     0.87%     0.24%  qemu-system-aar  [.] rebuild_hflags_a64

Before, helper_lookup_tb_ptr is the second hottest function in the
application, consuming almost a quarter of the runtime.  Within the
entire execution, cpu_get_tb_cpu_state consumes about 12%.

After, helper_lookup_tb_ptr has dropped to the fourth hottest function,
with consumption dropping to a sixth of the runtime.  Within the
entire execution, cpu_get_tb_cpu_state has dropped below 1%, and the
supporting function to rebuild hflags also consumes about 1%.

Assertions are retained for --enable-debug-tcg.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Retain asserts for future debugging.
---
 target/arm/helper.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index aae7b62458..c3e3dd2c41 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11249,12 +11249,15 @@ void HELPER(rebuild_hflags_a64)(CPUARMState *env, int el)
 void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
                           target_ulong *cs_base, uint32_t *pflags)
 {
-    uint32_t flags, pstate_for_ss;
+    uint32_t flags = env->hflags;
+    uint32_t pstate_for_ss;
 
     *cs_base = 0;
-    flags = rebuild_hflags_internal(env);
+#ifdef CONFIG_DEBUG_TCG
+    assert(flags == rebuild_hflags_internal(env));
+#endif
 
-    if (is_a64(env)) {
+    if (FIELD_EX32(flags, TBFLAG_ANY, AARCH64_STATE)) {
         *pc = env->pc;
         if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
             flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 02/20] target/arm: Split out rebuild_hflags_a64
  2019-10-11 15:55 ` [PATCH v6 02/20] target/arm: Split out rebuild_hflags_a64 Richard Henderson
@ 2019-10-14 15:43   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 15:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm


Richard Henderson <richard.henderson@linaro.org> writes:

> Create a function to compute the values of the TBFLAG_A64 bits
> that will be cached.  For now, the env->hflags variable is not
> used, and the results are fed back to cpu_get_tb_cpu_state.
>
> Note that not all BTI related flags are cached, so we have to
> test the BTI feature twice -- once for those bits moved out to
> rebuild_hflags_a64 and once for those bits that remain in
> cpu_get_tb_cpu_state.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 131 +++++++++++++++++++++++---------------------
>  1 file changed, 69 insertions(+), 62 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 8829d91ae1..69da04786e 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11070,6 +11070,71 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
>      return flags;
>  }
>
> +static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
> +                                   ARMMMUIdx mmu_idx)
> +{
> +    ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
> +    ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
> +    uint32_t flags = 0;
> +    uint64_t sctlr;
> +    int tbii, tbid;
> +
> +    flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
> +
> +    /* FIXME: ARMv8.1-VHE S2 translation regime.  */
> +    if (regime_el(env, stage1) < 2) {
> +        ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
> +        tbid = (p1.tbi << 1) | p0.tbi;
> +        tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
> +    } else {
> +        tbid = p0.tbi;
> +        tbii = tbid & !p0.tbid;
> +    }
> +
> +    flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
> +    flags = FIELD_DP32(flags, TBFLAG_A64, TBID, tbid);
> +
> +    if (cpu_isar_feature(aa64_sve, env_archcpu(env))) {
> +        int sve_el = sve_exception_el(env, el);
> +        uint32_t zcr_len;
> +
> +        /*
> +         * If SVE is disabled, but FP is enabled,
> +         * then the effective len is 0.
> +         */
> +        if (sve_el != 0 && fp_el == 0) {
> +            zcr_len = 0;
> +        } else {
> +            zcr_len = sve_zcr_len_for_el(env, el);
> +        }
> +        flags = FIELD_DP32(flags, TBFLAG_A64, SVEEXC_EL, sve_el);
> +        flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
> +    }
> +
> +    sctlr = arm_sctlr(env, el);
> +
> +    if (cpu_isar_feature(aa64_pauth, env_archcpu(env))) {
> +        /*
> +         * In order to save space in flags, we record only whether
> +         * pauth is "inactive", meaning all insns are implemented as
> +         * a nop, or "active" when some action must be performed.
> +         * The decision of which action to take is left to a helper.
> +         */
> +        if (sctlr & (SCTLR_EnIA | SCTLR_EnIB | SCTLR_EnDA | SCTLR_EnDB)) {
> +            flags = FIELD_DP32(flags, TBFLAG_A64, PAUTH_ACTIVE, 1);
> +        }
> +    }
> +
> +    if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
> +        /* Note that SCTLR_EL[23].BT == SCTLR_BT1.  */
> +        if (sctlr & (el == 0 ? SCTLR_BT0 : SCTLR_BT1)) {
> +            flags = FIELD_DP32(flags, TBFLAG_A64, BT, 1);
> +        }
> +    }
> +
> +    return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
> +}
> +
>  void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                            target_ulong *cs_base, uint32_t *pflags)
>  {
> @@ -11079,67 +11144,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>      uint32_t flags = 0;
>
>      if (is_a64(env)) {
> -        ARMCPU *cpu = env_archcpu(env);
> -        uint64_t sctlr;
> -
>          *pc = env->pc;
> -        flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
> -
> -        /* Get control bits for tagged addresses.  */
> -        {
> -            ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
> -            ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
> -            int tbii, tbid;
> -
> -            /* FIXME: ARMv8.1-VHE S2 translation regime.  */
> -            if (regime_el(env, stage1) < 2) {
> -                ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
> -                tbid = (p1.tbi << 1) | p0.tbi;
> -                tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
> -            } else {
> -                tbid = p0.tbi;
> -                tbii = tbid & !p0.tbid;
> -            }
> -
> -            flags = FIELD_DP32(flags, TBFLAG_A64, TBII, tbii);
> -            flags = FIELD_DP32(flags, TBFLAG_A64, TBID, tbid);
> -        }
> -
> -        if (cpu_isar_feature(aa64_sve, cpu)) {
> -            int sve_el = sve_exception_el(env, current_el);
> -            uint32_t zcr_len;
> -
> -            /* If SVE is disabled, but FP is enabled,
> -             * then the effective len is 0.
> -             */
> -            if (sve_el != 0 && fp_el == 0) {
> -                zcr_len = 0;
> -            } else {
> -                zcr_len = sve_zcr_len_for_el(env, current_el);
> -            }
> -            flags = FIELD_DP32(flags, TBFLAG_A64, SVEEXC_EL, sve_el);
> -            flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
> -        }
> -
> -        sctlr = arm_sctlr(env, current_el);
> -
> -        if (cpu_isar_feature(aa64_pauth, cpu)) {
> -            /*
> -             * In order to save space in flags, we record only whether
> -             * pauth is "inactive", meaning all insns are implemented as
> -             * a nop, or "active" when some action must be performed.
> -             * The decision of which action to take is left to a helper.
> -             */
> -            if (sctlr & (SCTLR_EnIA | SCTLR_EnIB | SCTLR_EnDA | SCTLR_EnDB)) {
> -                flags = FIELD_DP32(flags, TBFLAG_A64, PAUTH_ACTIVE, 1);
> -            }
> -        }
> -
> -        if (cpu_isar_feature(aa64_bti, cpu)) {
> -            /* Note that SCTLR_EL[23].BT == SCTLR_BT1.  */
> -            if (sctlr & (current_el == 0 ? SCTLR_BT0 : SCTLR_BT1)) {
> -                flags = FIELD_DP32(flags, TBFLAG_A64, BT, 1);
> -            }
> +        flags = rebuild_hflags_a64(env, current_el, fp_el, mmu_idx);
> +        if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
>              flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
>          }
>      } else {
> @@ -11159,9 +11166,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>              flags = FIELD_DP32(flags, TBFLAG_A32,
>                                 XSCALE_CPAR, env->cp15.c15_cpar);
>          }
> -    }
>
> -    flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
> +        flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
> +    }
>
>      /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
>       * states defined in the ARM ARM for software singlestep:


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 03/20] target/arm: Split out rebuild_hflags_common_32
  2019-10-11 15:55 ` [PATCH v6 03/20] target/arm: Split out rebuild_hflags_common_32 Richard Henderson
@ 2019-10-14 15:53   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 15:53 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Create a function to compute the values of the TBFLAG_A32 bits
> that will be cached, and are used by all profiles.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 69da04786e..f05d042474 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11070,6 +11070,15 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
>      return flags;
>  }
>
> +static uint32_t rebuild_hflags_common_32(CPUARMState *env, int fp_el,
> +                                         ARMMMUIdx mmu_idx, uint32_t flags)
> +{
> +    flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
> +    flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
> +
> +    return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
> +}
> +
>  static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
>                                     ARMMMUIdx mmu_idx)
>  {
> @@ -11141,7 +11150,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>      ARMMMUIdx mmu_idx = arm_mmu_idx(env);
>      int current_el = arm_current_el(env);
>      int fp_el = fp_exception_el(env, current_el);
> -    uint32_t flags = 0;
> +    uint32_t flags;
>
>      if (is_a64(env)) {
>          *pc = env->pc;
> @@ -11151,12 +11160,11 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          }
>      } else {
>          *pc = env->regs[15];
> +        flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
>          flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
>          flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN, env->vfp.vec_len);
>          flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE, env->vfp.vec_stride);
>          flags = FIELD_DP32(flags, TBFLAG_A32, CONDEXEC, env->condexec_bits);
> -        flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
> -        flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
>          if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
>              || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
>              flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
> @@ -11166,8 +11174,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>              flags = FIELD_DP32(flags, TBFLAG_A32,
>                                 XSCALE_CPAR, env->cp15.c15_cpar);
>          }
> -
> -        flags = rebuild_hflags_common(env, fp_el, mmu_idx, flags);
>      }
>
>      /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 04/20] target/arm: Split arm_cpu_data_is_big_endian
  2019-10-11 15:55 ` [PATCH v6 04/20] target/arm: Split arm_cpu_data_is_big_endian Richard Henderson
@ 2019-10-14 16:01   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 16:01 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Set TBFLAG_ANY.BE_DATA in rebuild_hflags_common_32 and
> rebuild_hflags_a64 instead of rebuild_hflags_common, where we do
> not need to re-test is_a64() nor re-compute the various inputs.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/cpu.h    | 49 +++++++++++++++++++++++++++------------------
>  target/arm/helper.c | 16 +++++++++++----
>  2 files changed, 42 insertions(+), 23 deletions(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index ad79a6153b..4d961474ce 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -3108,33 +3108,44 @@ static inline uint64_t arm_sctlr(CPUARMState *env, int el)
>      }
>  }
>
> +static inline bool arm_cpu_data_is_big_endian_a32(CPUARMState *env,
> +                                                  bool sctlr_b)
> +{
> +#ifdef CONFIG_USER_ONLY
> +    /*
> +     * In system mode, BE32 is modelled in line with the
> +     * architecture (as word-invariant big-endianness), where loads
> +     * and stores are done little endian but from addresses which
> +     * are adjusted by XORing with the appropriate constant. So the
> +     * endianness to use for the raw data access is not affected by
> +     * SCTLR.B.
> +     * In user mode, however, we model BE32 as byte-invariant
> +     * big-endianness (because user-only code cannot tell the
> +     * difference), and so we need to use a data access endianness
> +     * that depends on SCTLR.B.
> +     */
> +    if (sctlr_b) {
> +        return true;
> +    }
> +#endif
> +    /* In 32bit endianness is determined by looking at CPSR's E bit */
> +    return env->uncached_cpsr & CPSR_E;
> +}
> +
> +static inline bool arm_cpu_data_is_big_endian_a64(int el, uint64_t sctlr)
> +{
> +    return sctlr & (el ? SCTLR_EE : SCTLR_E0E);
> +}
>
>  /* Return true if the processor is in big-endian mode. */
>  static inline bool arm_cpu_data_is_big_endian(CPUARMState *env)
>  {
> -    /* In 32bit endianness is determined by looking at CPSR's E bit */
>      if (!is_a64(env)) {
> -        return
> -#ifdef CONFIG_USER_ONLY
> -            /* In system mode, BE32 is modelled in line with the
> -             * architecture (as word-invariant big-endianness), where loads
> -             * and stores are done little endian but from addresses which
> -             * are adjusted by XORing with the appropriate constant. So the
> -             * endianness to use for the raw data access is not affected by
> -             * SCTLR.B.
> -             * In user mode, however, we model BE32 as byte-invariant
> -             * big-endianness (because user-only code cannot tell the
> -             * difference), and so we need to use a data access endianness
> -             * that depends on SCTLR.B.
> -             */
> -            arm_sctlr_b(env) ||
> -#endif
> -                ((env->uncached_cpsr & CPSR_E) ? 1 : 0);
> +        return arm_cpu_data_is_big_endian_a32(env, arm_sctlr_b(env));
>      } else {
>          int cur_el = arm_current_el(env);
>          uint64_t sctlr = arm_sctlr(env, cur_el);
> -
> -        return (sctlr & (cur_el ? SCTLR_EE : SCTLR_E0E)) != 0;
> +        return arm_cpu_data_is_big_endian_a64(cur_el, sctlr);
>      }
>  }
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index f05d042474..4c65476d93 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11061,9 +11061,6 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
>      flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX,
>                         arm_to_core_mmu_idx(mmu_idx));
>
> -    if (arm_cpu_data_is_big_endian(env)) {
> -        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
> -    }
>      if (arm_singlestep_active(env)) {
>          flags = FIELD_DP32(flags, TBFLAG_ANY, SS_ACTIVE, 1);
>      }
> @@ -11073,7 +11070,14 @@ static uint32_t rebuild_hflags_common(CPUARMState *env, int fp_el,
>  static uint32_t rebuild_hflags_common_32(CPUARMState *env, int fp_el,
>                                           ARMMMUIdx mmu_idx, uint32_t flags)
>  {
> -    flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
> +    bool sctlr_b = arm_sctlr_b(env);
> +
> +    if (sctlr_b) {
> +        flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, 1);
> +    }
> +    if (arm_cpu_data_is_big_endian_a32(env, sctlr_b)) {
> +        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
> +    }
>      flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
>
>      return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
> @@ -11122,6 +11126,10 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
>
>      sctlr = arm_sctlr(env, el);
>
> +    if (arm_cpu_data_is_big_endian_a64(el, sctlr)) {
> +        flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
> +    }
> +
>      if (cpu_isar_feature(aa64_pauth, env_archcpu(env))) {
>          /*
>           * In order to save space in flags, we record only whether


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/20] target/arm: Split out rebuild_hflags_m32
  2019-10-11 15:55 ` [PATCH v6 05/20] target/arm: Split out rebuild_hflags_m32 Richard Henderson
@ 2019-10-14 16:13   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 16:13 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Create a function to compute the values of the TBFLAG_A32 bits
> that will be cached, and are used by M-profile.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 45 ++++++++++++++++++++++++++++++---------------
>  1 file changed, 30 insertions(+), 15 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 4c65476d93..d4303420da 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11083,6 +11083,29 @@ static uint32_t rebuild_hflags_common_32(CPUARMState *env, int fp_el,
>      return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
>  }
>
> +static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
> +                                   ARMMMUIdx mmu_idx)
> +{
> +    uint32_t flags = 0;
> +
> +    if (arm_v7m_is_handler_mode(env)) {
> +        flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
> +    }
> +
> +    /*
> +     * v8M always applies stack limit checks unless CCR.STKOFHFNMIGN
> +     * is suppressing them because the requested execution priority
> +     * is less than 0.
> +     */
> +    if (arm_feature(env, ARM_FEATURE_V8) &&
> +        !((mmu_idx & ARM_MMU_IDX_M_NEGPRI) &&
> +          (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_STKOFHFNMIGN_MASK))) {
> +        flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
> +    }
> +
> +    return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
> +}
> +
>  static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
>                                     ARMMMUIdx mmu_idx)
>  {
> @@ -11168,7 +11191,13 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          }
>      } else {
>          *pc = env->regs[15];
> -        flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
> +
> +        if (arm_feature(env, ARM_FEATURE_M)) {
> +            flags = rebuild_hflags_m32(env, fp_el, mmu_idx);
> +        } else {
> +            flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
> +        }
> +
>          flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
>          flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN, env->vfp.vec_len);
>          flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE, env->vfp.vec_stride);
> @@ -11204,20 +11233,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          }
>      }
>
> -    if (arm_v7m_is_handler_mode(env)) {
> -        flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
> -    }
> -
> -    /* v8M always applies stack limit checks unless CCR.STKOFHFNMIGN is
> -     * suppressing them because the requested execution priority is less than 0.
> -     */
> -    if (arm_feature(env, ARM_FEATURE_V8) &&
> -        arm_feature(env, ARM_FEATURE_M) &&
> -        !((mmu_idx  & ARM_MMU_IDX_M_NEGPRI) &&
> -          (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_STKOFHFNMIGN_MASK))) {
> -        flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
> -    }
> -
>      if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
>          FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
>          flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 06/20] target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state
  2019-10-11 15:55 ` [PATCH v6 06/20] target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-14 16:17   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 16:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm


Richard Henderson <richard.henderson@linaro.org> writes:

> Hoist the computation of some TBFLAG_A32 bits that only apply to
> M-profile under a single test for ARM_FEATURE_M.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 49 +++++++++++++++++++++------------------------
>  1 file changed, 23 insertions(+), 26 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index d4303420da..296a4b2232 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11194,6 +11194,29 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>
>          if (arm_feature(env, ARM_FEATURE_M)) {
>              flags = rebuild_hflags_m32(env, fp_el, mmu_idx);
> +
> +            if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
> +                FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S)
> +                != env->v7m.secure) {
> +                flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
> +            }
> +
> +            if ((env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
> +                (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
> +                 (env->v7m.secure &&
> +                  !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
> +                /*
> +                 * ASPEN is set, but FPCA/SFPA indicate that there is no
> +                 * active FP context; we must create a new FP context before
> +                 * executing any FP insn.
> +                 */
> +                flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
> +            }
> +
> +            bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
> +            if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
> +                flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
> +            }
>          } else {
>              flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
>          }
> @@ -11233,32 +11256,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          }
>      }
>
> -    if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
> -        FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
> -        flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
> -    }
> -
> -    if (arm_feature(env, ARM_FEATURE_M) &&
> -        (env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
> -        (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
> -         (env->v7m.secure &&
> -          !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
> -        /*
> -         * ASPEN is set, but FPCA/SFPA indicate that there is no active
> -         * FP context; we must create a new FP context before executing
> -         * any FP insn.
> -         */
> -        flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
> -    }
> -
> -    if (arm_feature(env, ARM_FEATURE_M)) {
> -        bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
> -
> -        if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
> -            flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
> -        }
> -    }
> -
>      if (!arm_feature(env, ARM_FEATURE_M)) {
>          int target_el = arm_debug_target_el(env);


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 07/20] target/arm: Split out rebuild_hflags_a32
  2019-10-11 15:55 ` [PATCH v6 07/20] target/arm: Split out rebuild_hflags_a32 Richard Henderson
@ 2019-10-14 16:17   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 16:17 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Currently a trivial wrapper for rebuild_hflags_common_32.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 296a4b2232..d1cd54cc93 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11106,6 +11106,12 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
>      return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
>  }
>
> +static uint32_t rebuild_hflags_a32(CPUARMState *env, int fp_el,
> +                                   ARMMMUIdx mmu_idx)
> +{
> +    return rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
> +}
> +
>  static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
>                                     ARMMMUIdx mmu_idx)
>  {
> @@ -11218,7 +11224,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                  flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
>              }
>          } else {
> -            flags = rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
> +            flags = rebuild_hflags_a32(env, fp_el, mmu_idx);
>          }
>
>          flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 08/20] target/arm: Split out rebuild_hflags_aprofile
  2019-10-11 15:55 ` [PATCH v6 08/20] target/arm: Split out rebuild_hflags_aprofile Richard Henderson
@ 2019-10-14 16:19   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 16:19 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Create a function to compute the values of the TBFLAG_ANY bits
> that will be cached, and are used by A-profile.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index d1cd54cc93..ddd21edfcf 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11106,18 +11106,28 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
>      return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
>  }
>
> +static uint32_t rebuild_hflags_aprofile(CPUARMState *env)
> +{
> +    int flags = 0;
> +
> +    flags = FIELD_DP32(flags, TBFLAG_ANY, DEBUG_TARGET_EL,
> +                       arm_debug_target_el(env));
> +    return flags;
> +}
> +
>  static uint32_t rebuild_hflags_a32(CPUARMState *env, int fp_el,
>                                     ARMMMUIdx mmu_idx)
>  {
> -    return rebuild_hflags_common_32(env, fp_el, mmu_idx, 0);
> +    uint32_t flags = rebuild_hflags_aprofile(env);
> +    return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
>  }
>
>  static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
>                                     ARMMMUIdx mmu_idx)
>  {
> +    uint32_t flags = rebuild_hflags_aprofile(env);
>      ARMMMUIdx stage1 = stage_1_mmu_idx(mmu_idx);
>      ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
> -    uint32_t flags = 0;
>      uint64_t sctlr;
>      int tbii, tbid;
>
> @@ -11262,12 +11272,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          }
>      }
>
> -    if (!arm_feature(env, ARM_FEATURE_M)) {
> -        int target_el = arm_debug_target_el(env);
> -
> -        flags = FIELD_DP32(flags, TBFLAG_ANY, DEBUG_TARGET_EL, target_el);
> -    }
> -
>      *pflags = flags;
>      *cs_base = 0;
>  }


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 09/20] target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state
  2019-10-11 15:55 ` [PATCH v6 09/20] target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-14 16:39   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 16:39 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> We do not need to compute any of these values for M-profile.
> Further, XSCALE_CPAR overlaps VECSTRIDE so obviously the two
> sets must be mutually exclusive.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index ddd21edfcf..e2a62cf19a 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11235,21 +11235,28 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>              }
>          } else {
>              flags = rebuild_hflags_a32(env, fp_el, mmu_idx);
> +
> +            /*
> +             * Note that XSCALE_CPAR shares bits with VECSTRIDE.
> +             * Note that VECLEN+VECSTRIDE are RES0 for M-profile.
> +             */
> +            if (arm_feature(env, ARM_FEATURE_XSCALE)) {
> +                flags = FIELD_DP32(flags, TBFLAG_A32,
> +                                   XSCALE_CPAR, env->cp15.c15_cpar);
> +            } else {
> +                flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN,
> +                                   env->vfp.vec_len);
> +                flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE,
> +                                   env->vfp.vec_stride);
> +            }
>          }
>
>          flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
> -        flags = FIELD_DP32(flags, TBFLAG_A32, VECLEN, env->vfp.vec_len);
> -        flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE, env->vfp.vec_stride);
>          flags = FIELD_DP32(flags, TBFLAG_A32, CONDEXEC, env->condexec_bits);
>          if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
>              || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
>              flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
>          }
> -        /* Note that XSCALE_CPAR shares bits with VECSTRIDE */
> -        if (arm_feature(env, ARM_FEATURE_XSCALE)) {
> -            flags = FIELD_DP32(flags, TBFLAG_A32,
> -                               XSCALE_CPAR, env->cp15.c15_cpar);
> -        }
>      }
>
>      /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 10/20] target/arm: Simplify set of PSTATE_SS in cpu_get_tb_cpu_state
  2019-10-11 15:55 ` [PATCH v6 10/20] target/arm: Simplify set of PSTATE_SS " Richard Henderson
@ 2019-10-14 18:21   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 18:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm


Richard Henderson <richard.henderson@linaro.org> writes:

> Hoist the variable load for PSTATE into the existing test vs is_a64.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 20 ++++++++------------
>  1 file changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index e2a62cf19a..398e5f5d6d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11197,7 +11197,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>      ARMMMUIdx mmu_idx = arm_mmu_idx(env);
>      int current_el = arm_current_el(env);
>      int fp_el = fp_exception_el(env, current_el);
> -    uint32_t flags;
> +    uint32_t flags, pstate_for_ss;
>
>      if (is_a64(env)) {
>          *pc = env->pc;
> @@ -11205,6 +11205,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
>              flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
>          }
> +        pstate_for_ss = env->pstate;
>      } else {
>          *pc = env->regs[15];
>
> @@ -11257,9 +11258,11 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>              || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
>              flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
>          }
> +        pstate_for_ss = env->uncached_cpsr;
>      }
>
> -    /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
> +    /*
> +     * The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
>       * states defined in the ARM ARM for software singlestep:
>       *  SS_ACTIVE   PSTATE.SS   State
>       *     0            x       Inactive (the TB flag for SS is always 0)
> @@ -11267,16 +11270,9 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>       *     1            1       Active-not-pending
>       * SS_ACTIVE is set in hflags; PSTATE_SS is computed every TB.
>       */
> -    if (FIELD_EX32(flags, TBFLAG_ANY, SS_ACTIVE)) {
> -        if (is_a64(env)) {
> -            if (env->pstate & PSTATE_SS) {
> -                flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
> -            }
> -        } else {
> -            if (env->uncached_cpsr & PSTATE_SS) {
> -                flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
> -            }
> -        }
> +    if (FIELD_EX32(flags, TBFLAG_ANY, SS_ACTIVE) &&
> +        (pstate_for_ss & PSTATE_SS)) {
> +        flags = FIELD_DP32(flags, TBFLAG_ANY, PSTATE_SS, 1);
>      }
>
>      *pflags = flags;


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 11/20] target/arm: Hoist computation of TBFLAG_A32.VFPEN
  2019-10-11 15:55 ` [PATCH v6 11/20] target/arm: Hoist computation of TBFLAG_A32.VFPEN Richard Henderson
@ 2019-10-14 18:46   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 18:46 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> There are 3 conditions that each enable this flag.  M-profile always
> enables; A-profile with EL1 as AA64 always enables.  Both of these
> conditions can easily be cached.  The final condition relies on the
> FPEXC register which we are not prepared to cache.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpu.h    |  2 +-
>  target/arm/helper.c | 14 ++++++++++----
>  2 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 4d961474ce..9909ff89d4 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -3192,7 +3192,7 @@ FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
>   * the same thing as the current security state of the processor!
>   */
>  FIELD(TBFLAG_A32, NS, 6, 1)
> -FIELD(TBFLAG_A32, VFPEN, 7, 1)          /* Not cached. */
> +FIELD(TBFLAG_A32, VFPEN, 7, 1)          /* Partially cached, minus FPEXC. */
>  FIELD(TBFLAG_A32, CONDEXEC, 8, 8)       /* Not cached. */
>  FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
>  /* For M profile only, set if FPCCR.LSPACT is set */
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 398e5f5d6d..89aa6fd933 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11088,6 +11088,9 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, int fp_el,
>  {
>      uint32_t flags = 0;
>
> +    /* v8M always enables the fpu.  */
> +    flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
> +
>      if (arm_v7m_is_handler_mode(env)) {
>          flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
>      }
> @@ -11119,6 +11122,10 @@ static uint32_t rebuild_hflags_a32(CPUARMState *env, int fp_el,
>                                     ARMMMUIdx mmu_idx)
>  {
>      uint32_t flags = rebuild_hflags_aprofile(env);
> +
> +    if (arm_el_is_aa64(env, 1)) {
> +        flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
> +    }
>      return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
>  }
>
> @@ -11250,14 +11257,13 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                  flags = FIELD_DP32(flags, TBFLAG_A32, VECSTRIDE,
>                                     env->vfp.vec_stride);
>              }
> +            if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)) {
> +                flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
> +            }

We seem to be short of symbolic definitions for this bit but whatever:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>          }
>
>          flags = FIELD_DP32(flags, TBFLAG_A32, THUMB, env->thumb);
>          flags = FIELD_DP32(flags, TBFLAG_A32, CONDEXEC, env->condexec_bits);
> -        if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
> -            || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
> -            flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
> -        }
>          pstate_for_ss = env->uncached_cpsr;
>      }


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 12/20] target/arm: Add arm_rebuild_hflags
  2019-10-11 15:55 ` [PATCH v6 12/20] target/arm: Add arm_rebuild_hflags Richard Henderson
@ 2019-10-14 18:47   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 18:47 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> This function assumes nothing about the current state of the cpu,
> and writes the computed value to env->hflags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/cpu.h    |  6 ++++++
>  target/arm/helper.c | 30 ++++++++++++++++++++++--------
>  2 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 9909ff89d4..d844ea21d8 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -3297,6 +3297,12 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
>  void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, void
>          *opaque);
>
> +/**
> + * arm_rebuild_hflags:
> + * Rebuild the cached TBFLAGS for arbitrary changed processor state.
> + */
> +void arm_rebuild_hflags(CPUARMState *env);
> +
>  /**
>   * aa32_vfp_dreg:
>   * Return a pointer to the Dn register within env in 32-bit mode.
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 89aa6fd933..85de96d071 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11198,17 +11198,35 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
>      return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
>  }
>
> +static uint32_t rebuild_hflags_internal(CPUARMState *env)
> +{
> +    int el = arm_current_el(env);
> +    int fp_el = fp_exception_el(env, el);
> +    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
> +
> +    if (is_a64(env)) {
> +        return rebuild_hflags_a64(env, el, fp_el, mmu_idx);
> +    } else if (arm_feature(env, ARM_FEATURE_M)) {
> +        return rebuild_hflags_m32(env, fp_el, mmu_idx);
> +    } else {
> +        return rebuild_hflags_a32(env, fp_el, mmu_idx);
> +    }
> +}
> +
> +void arm_rebuild_hflags(CPUARMState *env)
> +{
> +    env->hflags = rebuild_hflags_internal(env);
> +}
> +
>  void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                            target_ulong *cs_base, uint32_t *pflags)
>  {
> -    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
> -    int current_el = arm_current_el(env);
> -    int fp_el = fp_exception_el(env, current_el);
>      uint32_t flags, pstate_for_ss;
>
> +    flags = rebuild_hflags_internal(env);
> +
>      if (is_a64(env)) {
>          *pc = env->pc;
> -        flags = rebuild_hflags_a64(env, current_el, fp_el, mmu_idx);
>          if (cpu_isar_feature(aa64_bti, env_archcpu(env))) {
>              flags = FIELD_DP32(flags, TBFLAG_A64, BTYPE, env->btype);
>          }
> @@ -11217,8 +11235,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>          *pc = env->regs[15];
>
>          if (arm_feature(env, ARM_FEATURE_M)) {
> -            flags = rebuild_hflags_m32(env, fp_el, mmu_idx);
> -
>              if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
>                  FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S)
>                  != env->v7m.secure) {
> @@ -11242,8 +11258,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                  flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
>              }
>          } else {
> -            flags = rebuild_hflags_a32(env, fp_el, mmu_idx);
> -
>              /*
>               * Note that XSCALE_CPAR shares bits with VECSTRIDE.
>               * Note that VECLEN+VECSTRIDE are RES0 for M-profile.


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 13/20] target/arm: Split out arm_mmu_idx_el
  2019-10-11 15:55 ` [PATCH v6 13/20] target/arm: Split out arm_mmu_idx_el Richard Henderson
@ 2019-10-14 18:49   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 18:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm


Richard Henderson <richard.henderson@linaro.org> writes:

> Avoid calling arm_current_el() twice.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/internals.h |  9 +++++++++
>  target/arm/helper.c    | 12 +++++++-----
>  2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/internals.h b/target/arm/internals.h
> index 232d963875..f5313dd3d4 100644
> --- a/target/arm/internals.h
> +++ b/target/arm/internals.h
> @@ -949,6 +949,15 @@ void arm_cpu_update_virq(ARMCPU *cpu);
>   */
>  void arm_cpu_update_vfiq(ARMCPU *cpu);
>
> +/**
> + * arm_mmu_idx_el:
> + * @env: The cpu environment
> + * @el: The EL to use.
> + *
> + * Return the full ARMMMUIdx for the translation regime for EL.
> + */
> +ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el);
> +
>  /**
>   * arm_mmu_idx:
>   * @env: The cpu environment
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 85de96d071..3f7d3f257d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11026,15 +11026,12 @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
>  }
>  #endif
>
> -ARMMMUIdx arm_mmu_idx(CPUARMState *env)
> +ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
>  {
> -    int el;
> -
>      if (arm_feature(env, ARM_FEATURE_M)) {
>          return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
>      }
>
> -    el = arm_current_el(env);
>      if (el < 2 && arm_is_secure_below_el3(env)) {
>          return ARMMMUIdx_S1SE0 + el;
>      } else {
> @@ -11042,6 +11039,11 @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env)
>      }
>  }
>
> +ARMMMUIdx arm_mmu_idx(CPUARMState *env)
> +{
> +    return arm_mmu_idx_el(env, arm_current_el(env));
> +}
> +
>  int cpu_mmu_index(CPUARMState *env, bool ifetch)
>  {
>      return arm_to_core_mmu_idx(arm_mmu_idx(env));
> @@ -11202,7 +11204,7 @@ static uint32_t rebuild_hflags_internal(CPUARMState *env)
>  {
>      int el = arm_current_el(env);
>      int fp_el = fp_exception_el(env, el);
> -    ARMMMUIdx mmu_idx = arm_mmu_idx(env);
> +    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
>
>      if (is_a64(env)) {
>          return rebuild_hflags_a64(env, el, fp_el, mmu_idx);


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 14/20] target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state
  2019-10-11 15:55 ` [PATCH v6 14/20] target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-14 18:51   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 18:51 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> By performing this store early, we avoid having to save and restore
> the register holding the address around any function calls.

Been peeking at the assembly again ;-)

>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 3f7d3f257d..37424e3d4d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11225,6 +11225,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>  {
>      uint32_t flags, pstate_for_ss;
>
> +    *cs_base = 0;
>      flags = rebuild_hflags_internal(env);
>
>      if (is_a64(env)) {
> @@ -11298,7 +11299,6 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>      }
>
>      *pflags = flags;
> -    *cs_base = 0;
>  }
>
>  #ifdef TARGET_AARCH64


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 15/20] target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32})
  2019-10-11 15:55 ` [PATCH v6 15/20] target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32}) Richard Henderson
@ 2019-10-14 18:59   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 18:59 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> This functions are given the mode and el state of the cpu
> and writes the computed value to env->hflags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/helper.h |  4 ++++
>  target/arm/helper.c | 24 ++++++++++++++++++++++++
>  2 files changed, 28 insertions(+)
>
> diff --git a/target/arm/helper.h b/target/arm/helper.h
> index 1fb2cb5a77..3d4ec267a2 100644
> --- a/target/arm/helper.h
> +++ b/target/arm/helper.h
> @@ -90,6 +90,10 @@ DEF_HELPER_4(msr_banked, void, env, i32, i32, i32)
>  DEF_HELPER_2(get_user_reg, i32, env, i32)
>  DEF_HELPER_3(set_user_reg, void, env, i32, i32)
>
> +DEF_HELPER_FLAGS_2(rebuild_hflags_m32, TCG_CALL_NO_RWG, void, env,
> int)
> +DEF_HELPER_FLAGS_2(rebuild_hflags_a32, TCG_CALL_NO_RWG, void, env, int)
> +DEF_HELPER_FLAGS_2(rebuild_hflags_a64, TCG_CALL_NO_RWG, void, env, int)
> +
>  DEF_HELPER_1(vfp_get_fpscr, i32, env)
>  DEF_HELPER_2(vfp_set_fpscr, void, env, i32)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 37424e3d4d..b2d701cf00 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11220,6 +11220,30 @@ void arm_rebuild_hflags(CPUARMState *env)
>      env->hflags = rebuild_hflags_internal(env);
>  }
>
> +void HELPER(rebuild_hflags_m32)(CPUARMState *env, int el)
> +{
> +    int fp_el = fp_exception_el(env, el);
> +    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
> +
> +    env->hflags = rebuild_hflags_m32(env, fp_el, mmu_idx);
> +}
> +
> +void HELPER(rebuild_hflags_a32)(CPUARMState *env, int el)
> +{
> +    int fp_el = fp_exception_el(env, el);
> +    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
> +
> +    env->hflags = rebuild_hflags_a32(env, fp_el, mmu_idx);
> +}
> +
> +void HELPER(rebuild_hflags_a64)(CPUARMState *env, int el)
> +{
> +    int fp_el = fp_exception_el(env, el);
> +    ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, el);
> +
> +    env->hflags = rebuild_hflags_a64(env, el, fp_el, mmu_idx);
> +}
> +
>  void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                            target_ulong *cs_base, uint32_t *pflags)
>  {


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 16/20] target/arm: Rebuild hflags at EL changes
  2019-10-11 15:55 ` [PATCH v6 16/20] target/arm: Rebuild hflags at EL changes Richard Henderson
@ 2019-10-14 19:01   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 19:01 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Begin setting, but not relying upon, env->hflags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  linux-user/syscall.c    | 1 +
>  target/arm/cpu.c        | 1 +
>  target/arm/helper-a64.c | 3 +++
>  target/arm/helper.c     | 2 ++
>  target/arm/machine.c    | 1 +
>  target/arm/op_helper.c  | 1 +
>  6 files changed, 9 insertions(+)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index e2af3c1494..ebefd05140 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -9982,6 +9982,7 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>                      aarch64_sve_narrow_vq(env, vq);
>                  }
>                  env->vfp.zcr_el[1] = vq - 1;
> +                arm_rebuild_hflags(env);
>                  ret = vq * 16;
>              }
>              return ret;
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 2399c14471..d043e75166 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -406,6 +406,7 @@ static void arm_cpu_reset(CPUState *s)
>
>      hw_breakpoint_update_all(cpu);
>      hw_watchpoint_update_all(cpu);
> +    arm_rebuild_hflags(env);
>  }
>
>  bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
> diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
> index bca80bdc38..b4cd680fc4 100644
> --- a/target/arm/helper-a64.c
> +++ b/target/arm/helper-a64.c
> @@ -1025,6 +1025,7 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
>          } else {
>              env->regs[15] = new_pc & ~0x3;
>          }
> +        helper_rebuild_hflags_a32(env, new_el);
>          qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
>                        "AArch32 EL%d PC 0x%" PRIx32 "\n",
>                        cur_el, new_el, env->regs[15]);
> @@ -1036,10 +1037,12 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
>          }
>          aarch64_restore_sp(env, new_el);
>          env->pc = new_pc;
> +        helper_rebuild_hflags_a64(env, new_el);
>          qemu_log_mask(CPU_LOG_INT, "Exception return from AArch64 EL%d to "
>                        "AArch64 EL%d PC 0x%" PRIx64 "\n",
>                        cur_el, new_el, env->pc);
>      }
> +
>      /*
>       * Note that cur_el can never be 0.  If new_el is 0, then
>       * el0_a64 is return_to_aa64, else el0_a64 is ignored.
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index b2d701cf00..aae7b62458 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -7998,6 +7998,7 @@ static void take_aarch32_exception(CPUARMState *env, int new_mode,
>          env->regs[14] = env->regs[15] + offset;
>      }
>      env->regs[15] = newpc;
> +    arm_rebuild_hflags(env);
>  }
>
>  static void arm_cpu_do_interrupt_aarch32_hyp(CPUState *cs)
> @@ -8345,6 +8346,7 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
>      pstate_write(env, PSTATE_DAIF | new_mode);
>      env->aarch64 = 1;
>      aarch64_restore_sp(env, new_el);
> +    helper_rebuild_hflags_a64(env, new_el);
>
>      env->pc = addr;
>
> diff --git a/target/arm/machine.c b/target/arm/machine.c
> index 5c36707a7c..eb28b2381b 100644
> --- a/target/arm/machine.c
> +++ b/target/arm/machine.c
> @@ -756,6 +756,7 @@ static int cpu_post_load(void *opaque, int version_id)
>      if (!kvm_enabled()) {
>          pmu_op_finish(&cpu->env);
>      }
> +    arm_rebuild_hflags(&cpu->env);
>
>      return 0;
>  }
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index 0fd4bd0238..ccc2cecb46 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -404,6 +404,7 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
>       * state. Do the masking now.
>       */
>      env->regs[15] &= (env->thumb ? ~1 : ~3);
> +    arm_rebuild_hflags(env);
>
>      qemu_mutex_lock_iothread();
>      arm_call_el_change_hook(env_archcpu(env));


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 17/20] target/arm: Rebuild hflags at MSR writes
  2019-10-11 15:55 ` [PATCH v6 17/20] target/arm: Rebuild hflags at MSR writes Richard Henderson
@ 2019-10-14 19:03   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 19:03 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Continue setting, but not relying upon, env->hflags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/translate-a64.c | 13 +++++++++++--
>  target/arm/translate.c     | 28 +++++++++++++++++++++++-----
>  2 files changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 2d6cd09634..d4bebbe629 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -1789,8 +1789,17 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
>      if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
>          /* I/O operations must end the TB here (whether read or write) */
>          s->base.is_jmp = DISAS_UPDATE;
> -    } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
> -        /* We default to ending the TB on a coprocessor register write,
> +    }
> +    if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
> +        /*
> +         * A write to any coprocessor regiser that ends a TB
> +         * must rebuild the hflags for the next TB.
> +         */
> +        TCGv_i32 tcg_el = tcg_const_i32(s->current_el);
> +        gen_helper_rebuild_hflags_a64(cpu_env, tcg_el);
> +        tcg_temp_free_i32(tcg_el);
> +        /*
> +         * We default to ending the TB on a coprocessor register write,
>           * but allow this to be suppressed by the register definition
>           * (usually only necessary to work around guest bugs).
>           */
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 698c594e8c..cb47cd9744 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -6890,6 +6890,8 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>      ri = get_arm_cp_reginfo(s->cp_regs,
>              ENCODE_CP_REG(cpnum, is64, s->ns, crn, crm, opc1, opc2));
>      if (ri) {
> +        bool need_exit_tb;
> +
>          /* Check access permissions */
>          if (!cp_access_ok(s->current_el, ri, isread)) {
>              return 1;
> @@ -7068,14 +7070,30 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>              }
>          }
>
> -        if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
> -            /* I/O operations must end the TB here (whether read or write) */
> -            gen_lookup_tb(s);
> -        } else if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
> -            /* We default to ending the TB on a coprocessor register write,
> +        /* I/O operations must end the TB here (whether read or write) */
> +        need_exit_tb = ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) &&
> +                        (ri->type & ARM_CP_IO));
> +
> +        if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
> +            /*
> +             * A write to any coprocessor regiser that ends a TB
> +             * must rebuild the hflags for the next TB.
> +             */
> +            TCGv_i32 tcg_el = tcg_const_i32(s->current_el);
> +            if (arm_dc_feature(s, ARM_FEATURE_M)) {
> +                gen_helper_rebuild_hflags_m32(cpu_env, tcg_el);
> +            } else {
> +                gen_helper_rebuild_hflags_a32(cpu_env, tcg_el);
> +            }
> +            tcg_temp_free_i32(tcg_el);
> +            /*
> +             * We default to ending the TB on a coprocessor register write,
>               * but allow this to be suppressed by the register definition
>               * (usually only necessary to work around guest bugs).
>               */
> +            need_exit_tb = true;
> +        }
> +        if (need_exit_tb) {
>              gen_lookup_tb(s);
>          }


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes
  2019-10-11 15:55 ` [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes Richard Henderson
@ 2019-10-14 19:08   ` Alex Bennée
  2019-10-14 19:15     ` Richard Henderson
  0 siblings, 1 reply; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 19:08 UTC (permalink / raw)
  To: qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Continue setting, but not relying upon, env->hflags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/op_helper.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index ccc2cecb46..b529d6c1bf 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -224,6 +224,7 @@ uint32_t HELPER(usat16)(CPUARMState *env, uint32_t x, uint32_t shift)
>  void HELPER(setend)(CPUARMState *env)
>  {
>      env->uncached_cpsr ^= CPSR_E;
> +    arm_rebuild_hflags(env);
>  }
>
>  /* Function checks whether WFx (WFI/WFE) instructions are set up to be trapped.
> @@ -387,6 +388,8 @@ uint32_t HELPER(cpsr_read)(CPUARMState *env)
>  void HELPER(cpsr_write)(CPUARMState *env, uint32_t val, uint32_t mask)
>  {
>      cpsr_write(env, val, mask, CPSRWriteByInstr);
> +    /* TODO: Not all cpsr bits are relevant to hflags.  */

Do you mean by this we could check which bits changed and avoid a
re-compute if we wanted to? Is it likely to be anything other than the
SS_ACTIVE bit?

> +    arm_rebuild_hflags(env);
>  }
>
>  /* Write the CPSR for a 32-bit exception return */

Anyway:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 19/20] target/arm: Rebuild hflags for M-profile.
  2019-10-11 15:55 ` [PATCH v6 19/20] target/arm: Rebuild hflags for M-profile Richard Henderson
@ 2019-10-14 19:08   ` Alex Bennée
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Bennée @ 2019-10-14 19:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, qemu-arm


Richard Henderson <richard.henderson@linaro.org> writes:

> Continue setting, but not relying upon, env->hflags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/m_helper.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
> index 27cd2f3f96..f2512e448e 100644
> --- a/target/arm/m_helper.c
> +++ b/target/arm/m_helper.c
> @@ -494,6 +494,7 @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
>      switch_v7m_security_state(env, dest & 1);
>      env->thumb = 1;
>      env->regs[15] = dest & ~1;
> +    arm_rebuild_hflags(env);
>  }
>
>  void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
> @@ -555,6 +556,7 @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
>      switch_v7m_security_state(env, 0);
>      env->thumb = 1;
>      env->regs[15] = dest;
> +    arm_rebuild_hflags(env);
>  }
>
>  static uint32_t *get_v7m_sp_ptr(CPUARMState *env, bool secure, bool threadmode,
> @@ -895,6 +897,7 @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
>      env->regs[14] = lr;
>      env->regs[15] = addr & 0xfffffffe;
>      env->thumb = addr & 1;
> +    arm_rebuild_hflags(env);
>  }
>
>  static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
> @@ -1765,6 +1768,7 @@ static void do_v7m_exception_exit(ARMCPU *cpu)
>
>      /* Otherwise, we have a successful exception exit. */
>      arm_clear_exclusive(env);
> +    arm_rebuild_hflags(env);
>      qemu_log_mask(CPU_LOG_INT, "...successful exception return\n");
>  }
>
> @@ -1837,6 +1841,7 @@ static bool do_v7m_function_return(ARMCPU *cpu)
>      xpsr_write(env, 0, XPSR_IT);
>      env->thumb = newpc & 1;
>      env->regs[15] = newpc & ~1;
> +    arm_rebuild_hflags(env);
>
>      qemu_log_mask(CPU_LOG_INT, "...function return successful\n");
>      return true;
> @@ -1959,6 +1964,7 @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
>      switch_v7m_security_state(env, true);
>      xpsr_write(env, 0, XPSR_IT);
>      env->regs[15] += 4;
> +    arm_rebuild_hflags(env);
>      return true;
>
>  gen_invep:


--
Alex Bennée


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes
  2019-10-14 19:08   ` Alex Bennée
@ 2019-10-14 19:15     ` Richard Henderson
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-10-14 19:15 UTC (permalink / raw)
  To: Alex Bennée, qemu-arm; +Cc: laurent.desnogues, peter.maydell, qemu-devel

On 10/14/19 12:08 PM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> Continue setting, but not relying upon, env->hflags.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  target/arm/op_helper.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
>> index ccc2cecb46..b529d6c1bf 100644
>> --- a/target/arm/op_helper.c
>> +++ b/target/arm/op_helper.c
>> @@ -224,6 +224,7 @@ uint32_t HELPER(usat16)(CPUARMState *env, uint32_t x, uint32_t shift)
>>  void HELPER(setend)(CPUARMState *env)
>>  {
>>      env->uncached_cpsr ^= CPSR_E;
>> +    arm_rebuild_hflags(env);
>>  }
>>
>>  /* Function checks whether WFx (WFI/WFE) instructions are set up to be trapped.
>> @@ -387,6 +388,8 @@ uint32_t HELPER(cpsr_read)(CPUARMState *env)
>>  void HELPER(cpsr_write)(CPUARMState *env, uint32_t val, uint32_t mask)
>>  {
>>      cpsr_write(env, val, mask, CPSRWriteByInstr);
>> +    /* TODO: Not all cpsr bits are relevant to hflags.  */
> 
> Do you mean by this we could check which bits changed and avoid a
> re-compute if we wanted to? Is it likely to be anything other than the
> SS_ACTIVE bit?

Yes.  Well, there's also NZCV.  But neither case happens particularly often, so
it's probably not worth worrying about.


r~


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state
  2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
                   ` (19 preceding siblings ...)
  2019-10-11 15:55 ` [PATCH v6 20/20] target/arm: Rely on hflags correct in cpu_get_tb_cpu_state Richard Henderson
@ 2019-10-17 15:26 ` Peter Maydell
  2019-10-17 16:25   ` Richard Henderson
  20 siblings, 1 reply; 43+ messages in thread
From: Peter Maydell @ 2019-10-17 15:26 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Laurent Desnogues, qemu-arm, QEMU Developers

On Fri, 11 Oct 2019 at 16:55, Richard Henderson
<richard.henderson@linaro.org> wrote:
> Richard Henderson (20):
>   target/arm: Split out rebuild_hflags_common
>   target/arm: Split out rebuild_hflags_a64
>   target/arm: Split out rebuild_hflags_common_32
>   target/arm: Split arm_cpu_data_is_big_endian
>   target/arm: Split out rebuild_hflags_m32
>   target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state
>   target/arm: Split out rebuild_hflags_a32
>   target/arm: Split out rebuild_hflags_aprofile
>   target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in
>     cpu_get_tb_cpu_state
>   target/arm: Simplify set of PSTATE_SS in cpu_get_tb_cpu_state
>   target/arm: Hoist computation of TBFLAG_A32.VFPEN
>   target/arm: Add arm_rebuild_hflags
>   target/arm: Split out arm_mmu_idx_el
>   target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state
>   target/arm: Add HELPER(rebuild_hflags_{a32,a64,m32})
>   target/arm: Rebuild hflags at EL changes
>   target/arm: Rebuild hflags at MSR writes
>   target/arm: Rebuild hflags at CPSR writes
>   target/arm: Rebuild hflags for M-profile.
>   target/arm: Rely on hflags correct in cpu_get_tb_cpu_state

Don't we also need to do something to rebuild the hflags
for M-profile writes to the memory mapped system registers?
For instance rebuild_hflags_m32() bakes in state which
cares about env->v7m.ccr, which is set via nvic_writel(),
but I don't see anything whereby the write to the NVIC
register triggers a rebuild of the hflags value. Maybe I
missed it?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state
  2019-10-17 15:26 ` [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Peter Maydell
@ 2019-10-17 16:25   ` Richard Henderson
  2019-10-17 17:01     ` Peter Maydell
  0 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-10-17 16:25 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Laurent Desnogues, qemu-arm, QEMU Developers

On 10/17/19 8:26 AM, Peter Maydell wrote:
> Don't we also need to do something to rebuild the hflags
> for M-profile writes to the memory mapped system registers?
> For instance rebuild_hflags_m32() bakes in state which
> cares about env->v7m.ccr, which is set via nvic_writel(),
> but I don't see anything whereby the write to the NVIC
> register triggers a rebuild of the hflags value. Maybe I
> missed it?

How do you end the TB after nvic_writel(), so that the new behaviour can be
noticed right now?  We can call arm_rebuild_hflags() within nvic_writel(), but
it still needs to be recognized somehow.  I would expect to place one rebuild
in the place you end the TB...


r~


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state
  2019-10-17 16:25   ` Richard Henderson
@ 2019-10-17 17:01     ` Peter Maydell
  0 siblings, 0 replies; 43+ messages in thread
From: Peter Maydell @ 2019-10-17 17:01 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Laurent Desnogues, qemu-arm, QEMU Developers

On Thu, 17 Oct 2019 at 17:25, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 10/17/19 8:26 AM, Peter Maydell wrote:
> > Don't we also need to do something to rebuild the hflags
> > for M-profile writes to the memory mapped system registers?
> > For instance rebuild_hflags_m32() bakes in state which
> > cares about env->v7m.ccr, which is set via nvic_writel(),
> > but I don't see anything whereby the write to the NVIC
> > register triggers a rebuild of the hflags value. Maybe I
> > missed it?
>
> How do you end the TB after nvic_writel(), so that the new behaviour can be
> noticed right now?  We can call arm_rebuild_hflags() within nvic_writel(), but
> it still needs to be recognized somehow.  I would expect to place one rebuild
> in the place you end the TB...

To copy my reply from irc, just for the record:

we don't end the TB specifically after a write to the NVIC.
This is permitted by the architecture: rules R_BMLS and R_WQQB from
DDI05533B.h v8M Arm ARM:

# R_BMLS The side effects of any access to the SCS that performs
# a context-altering operation take effect when the access
# completes. A DSB instruction can be used to guarantee completion
# of a previous SCS access.
#
# R_WQQB A context synchronization event guarantees that the side
# effects of a previous SCS access are visible to all instructions
# in program order following the context synchronization event.

A context synchronization event is basically an ISB or an
exception entry/exit. Since we do end the TLB on an ISB
or exception entry/exit I think we're within the architectural
rules, but we do need to do a rebuild-hflags somewhere.
At the end of  nvic_sysreg_write() seems like the best place as
it means that QEMU is internally consistent and not likely to
trip over itself.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2019-10-17 17:45 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-11 15:55 [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Richard Henderson
2019-10-11 15:55 ` [PATCH v6 01/20] target/arm: Split out rebuild_hflags_common Richard Henderson
2019-10-11 15:55 ` [PATCH v6 02/20] target/arm: Split out rebuild_hflags_a64 Richard Henderson
2019-10-14 15:43   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 03/20] target/arm: Split out rebuild_hflags_common_32 Richard Henderson
2019-10-14 15:53   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 04/20] target/arm: Split arm_cpu_data_is_big_endian Richard Henderson
2019-10-14 16:01   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 05/20] target/arm: Split out rebuild_hflags_m32 Richard Henderson
2019-10-14 16:13   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 06/20] target/arm: Reduce tests vs M-profile in cpu_get_tb_cpu_state Richard Henderson
2019-10-14 16:17   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 07/20] target/arm: Split out rebuild_hflags_a32 Richard Henderson
2019-10-14 16:17   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 08/20] target/arm: Split out rebuild_hflags_aprofile Richard Henderson
2019-10-14 16:19   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 09/20] target/arm: Hoist XSCALE_CPAR, VECLEN, VECSTRIDE in cpu_get_tb_cpu_state Richard Henderson
2019-10-14 16:39   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 10/20] target/arm: Simplify set of PSTATE_SS " Richard Henderson
2019-10-14 18:21   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 11/20] target/arm: Hoist computation of TBFLAG_A32.VFPEN Richard Henderson
2019-10-14 18:46   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 12/20] target/arm: Add arm_rebuild_hflags Richard Henderson
2019-10-14 18:47   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 13/20] target/arm: Split out arm_mmu_idx_el Richard Henderson
2019-10-14 18:49   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 14/20] target/arm: Hoist store to cs_base in cpu_get_tb_cpu_state Richard Henderson
2019-10-14 18:51   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 15/20] target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32}) Richard Henderson
2019-10-14 18:59   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 16/20] target/arm: Rebuild hflags at EL changes Richard Henderson
2019-10-14 19:01   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 17/20] target/arm: Rebuild hflags at MSR writes Richard Henderson
2019-10-14 19:03   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 18/20] target/arm: Rebuild hflags at CPSR writes Richard Henderson
2019-10-14 19:08   ` Alex Bennée
2019-10-14 19:15     ` Richard Henderson
2019-10-11 15:55 ` [PATCH v6 19/20] target/arm: Rebuild hflags for M-profile Richard Henderson
2019-10-14 19:08   ` Alex Bennée
2019-10-11 15:55 ` [PATCH v6 20/20] target/arm: Rely on hflags correct in cpu_get_tb_cpu_state Richard Henderson
2019-10-17 15:26 ` [PATCH v6 00/20] target/arm: Reduce overhead of cpu_get_tb_cpu_state Peter Maydell
2019-10-17 16:25   ` Richard Henderson
2019-10-17 17:01     ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.