All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus
@ 2022-04-17 17:43 Richard Henderson
  2022-04-17 17:43 ` [PATCH v3 01/60] tcg: Add tcg_constant_ptr Richard Henderson
                   ` (60 more replies)
  0 siblings, 61 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Supercedes: 20220412003326.588530-1-richard.henderson@linaro.org
("target/arm: 8 new features, A76 and N1")

Changes for v3:
  * More field updates for H.a.  This is not nearly complete, but what
    I've encountered so far as I've begun implementing SME.
  * Use bool instead of uint32_t for env->{aarch64,thumb}.
    I had anticipated other changes for implementing PSTATE.{SM,FA},
    but dropped those; these seemed like worth keeping.
  * Use tcg_constant_* more -- got stuck on this while working on...
  * Lots of cleanups to ARMCPRegInfo.
  * Discard unreachable cpregs when ELx not available.
  * Transform EL2 regs to RES0 when EL3 present but EL2 isn't.
    This greatly simplifies registration of cpregs for new features.
    Changes contextidr_el2, minimal_ras_reginfo, scxtnum_reginfo
    within this patch set; other uses coming for SME.


r~


Richard Henderson (60):
  tcg: Add tcg_constant_ptr
  target/arm: Update ISAR fields for ARMv8.8
  target/arm: Update SCR_EL3 bits to ARMv8.8
  target/arm: Update SCTLR bits to ARMv9.2
  target/arm: Change DisasContext.aarch64 to bool
  target/arm: Change CPUArchState.aarch64 to bool
  target/arm: Extend store_cpu_offset to take field size
  target/arm: Change DisasContext.thumb to bool
  target/arm: Change CPUArchState.thumb to bool
  target/arm: Remove fpexc32_access
  target/arm: Split out set_btype_raw
  target/arm: Split out gen_rebuild_hflags
  target/arm: Use tcg_constant in translate-a64.c
  target/arm: Simplify GEN_SHIFT in translate.c
  target/arm: Simplify gen_sar
  target/arm: Simplify aa32 DISAS_WFI
  target/arm: Use tcg_constant in translate.c
  target/arm: Use tcg_constant in translate-m-nocp.c
  target/arm: Use tcg_constant in translate-neon.c
  target/arm: Use smin/smax for do_sat_addsub_32
  target/arm: Use tcg_constant in translate-sve.c
  target/arm: Use tcg_constant in translate-vfp.c
  target/arm: Use tcg_constant_i32 in translate.h
  target/arm: Split out cpregs.h
  target/arm: Reorg CPAccessResult and access_check_cp_reg
  target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
  target/arm: Make some more cpreg data static const
  target/arm: Reorg ARMCPRegInfo type field bits
  target/arm: Change cpreg access permissions to enum
  target/arm: Name CPState type
  target/arm: Name CPSecureState type
  target/arm: Update sysreg fields when redirecting for E2H
  target/arm: Store cpregs key in the hash table directly
  target/arm: Cleanup add_cpreg_to_hashtable
  target/arm: Handle cpreg registration for missing EL
  target/arm: Drop EL3 no EL2 fallbacks
  target/arm: Merge zcr reginfo
  target/arm: Add isar predicates for FEAT_Debugv8p2
  target/arm: Adjust definition of CONTEXTIDR_EL2
  target/arm: Move cortex impdef sysregs to cpu_tcg.c
  target/arm: Update qemu-system-arm -cpu max to cortex-a57
  target/arm: Set ID_DFR0.PerfMon for qemu-system-arm -cpu max
  target/arm: Split out aa32_max_features
  target/arm: Annotate arm_max_initfn with FEAT identifiers
  target/arm: Use field names for manipulating EL2 and EL3 modes
  target/arm: Enable FEAT_Debugv8p2 for -cpu max
  target/arm: Enable FEAT_Debugv8p4 for -cpu max
  target/arm: Add isar_feature_{aa64,any}_ras
  target/arm: Add minimal RAS registers
  target/arm: Enable SCR and HCR bits for RAS
  target/arm: Implement virtual SError exceptions
  target/arm: Implement ESB instruction
  target/arm: Enable FEAT_RAS for -cpu max
  target/arm: Enable FEAT_IESB for -cpu max
  target/arm: Enable FEAT_CSV2 for -cpu max
  target/arm: Enable FEAT_CSV2_2 for -cpu max
  target/arm: Enable FEAT_CSV3 for -cpu max
  target/arm: Enable FEAT_DGH for -cpu max
  target/arm: Define cortex-a76
  target/arm: Define neoverse-n1

 docs/system/arm/emulation.rst |  10 +
 docs/system/arm/virt.rst      |   2 +
 include/tcg/tcg.h             |   2 +
 target/arm/cpregs.h           | 459 +++++++++++++++++
 target/arm/cpu.h              | 475 ++++--------------
 target/arm/helper.h           |   1 +
 target/arm/internals.h        |  16 +
 target/arm/syndrome.h         |   5 +
 target/arm/translate-a32.h    |  13 +-
 target/arm/translate.h        |  17 +-
 target/arm/a32.decode         |  16 +-
 target/arm/t32.decode         |  18 +-
 hw/arm/pxa2xx.c               |   2 +-
 hw/arm/pxa2xx_pic.c           |   2 +-
 hw/arm/sbsa-ref.c             |   2 +
 hw/arm/virt.c                 |   2 +
 hw/intc/arm_gicv3_cpuif.c     |   6 +-
 hw/intc/arm_gicv3_kvm.c       |   3 +-
 linux-user/arm/cpu_loop.c     |   2 +-
 target/arm/cpu.c              |  88 ++--
 target/arm/cpu64.c            | 349 +++++++------
 target/arm/cpu_tcg.c          | 232 ++++++---
 target/arm/gdbstub.c          |   5 +-
 target/arm/helper-a64.c       |   4 +-
 target/arm/helper.c           | 897 ++++++++++++++++++----------------
 target/arm/hvf/hvf.c          |   2 +-
 target/arm/m_helper.c         |   6 +-
 target/arm/op_helper.c        | 113 +++--
 target/arm/translate-a64.c    | 395 ++++++---------
 target/arm/translate-m-nocp.c |  12 +-
 target/arm/translate-neon.c   |  21 +-
 target/arm/translate-sve.c    | 207 +++-----
 target/arm/translate-vfp.c    |  76 +--
 target/arm/translate.c        | 400 +++++++--------
 34 files changed, 2026 insertions(+), 1834 deletions(-)
 create mode 100644 target/arm/cpregs.h

-- 
2.25.1



^ permalink raw reply	[flat|nested] 1596+ messages in thread

* [PATCH v3 01/60] tcg: Add tcg_constant_ptr
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-19 10:41   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 02/60] target/arm: Update ISAR fields for ARMv8.8 Richard Henderson
                   ` (59 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Similar to tcg_const_ptr, defer to tcg_constant_{i32,i64}.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 73869fd9d0..cd6eaae410 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1054,9 +1054,11 @@ TCGv_vec tcg_constant_vec_matching(TCGv_vec match, unsigned vece, int64_t val);
 #if UINTPTR_MAX == UINT32_MAX
 # define tcg_const_ptr(x)        ((TCGv_ptr)tcg_const_i32((intptr_t)(x)))
 # define tcg_const_local_ptr(x)  ((TCGv_ptr)tcg_const_local_i32((intptr_t)(x)))
+# define tcg_constant_ptr(x)     ((TCGv_ptr)tcg_constant_i32((intptr_t)(x)))
 #else
 # define tcg_const_ptr(x)        ((TCGv_ptr)tcg_const_i64((intptr_t)(x)))
 # define tcg_const_local_ptr(x)  ((TCGv_ptr)tcg_const_local_i64((intptr_t)(x)))
+# define tcg_constant_ptr(x)     ((TCGv_ptr)tcg_constant_i64((intptr_t)(x)))
 #endif
 
 TCGLabel *gen_new_label(void);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 02/60] target/arm: Update ISAR fields for ARMv8.8
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
  2022-04-17 17:43 ` [PATCH v3 01/60] tcg: Add tcg_constant_ptr Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-19 11:10   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8 Richard Henderson
                   ` (58 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Update isar fields per ARM DDI0487 H.a.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Add ID_AA64DFR0.HPMN0
---
 target/arm/cpu.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 23879de5fa..9a29a4a215 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1933,6 +1933,7 @@ FIELD(ID_MMFR4, CCIDX, 24, 4)
 FIELD(ID_MMFR4, EVT, 28, 4)
 
 FIELD(ID_MMFR5, ETS, 0, 4)
+FIELD(ID_MMFR5, NTLBPA, 4, 4)
 
 FIELD(ID_PFR0, STATE0, 0, 4)
 FIELD(ID_PFR0, STATE1, 4, 4)
@@ -1985,6 +1986,16 @@ FIELD(ID_AA64ISAR1, SPECRES, 40, 4)
 FIELD(ID_AA64ISAR1, BF16, 44, 4)
 FIELD(ID_AA64ISAR1, DGH, 48, 4)
 FIELD(ID_AA64ISAR1, I8MM, 52, 4)
+FIELD(ID_AA64ISAR1, XS, 56, 4)
+FIELD(ID_AA64ISAR1, LS64, 60, 4)
+
+FIELD(ID_AA64ISAR2, WFXT, 0, 4)
+FIELD(ID_AA64ISAR2, RPRES, 4, 4)
+FIELD(ID_AA64ISAR2, GPA3, 8, 4)
+FIELD(ID_AA64ISAR2, APA3, 12, 4)
+FIELD(ID_AA64ISAR2, MOPS, 16, 4)
+FIELD(ID_AA64ISAR2, BC, 20, 4)
+FIELD(ID_AA64ISAR2, PAC_FRAC, 24, 4)
 
 FIELD(ID_AA64PFR0, EL0, 0, 4)
 FIELD(ID_AA64PFR0, EL1, 4, 4)
@@ -2007,6 +2018,10 @@ FIELD(ID_AA64PFR1, SSBS, 4, 4)
 FIELD(ID_AA64PFR1, MTE, 8, 4)
 FIELD(ID_AA64PFR1, RAS_FRAC, 12, 4)
 FIELD(ID_AA64PFR1, MPAM_FRAC, 16, 4)
+FIELD(ID_AA64PFR1, SME, 24, 4)
+FIELD(ID_AA64PFR1, RNDR_TRAP, 28, 4)
+FIELD(ID_AA64PFR1, CSV2_FRAC, 32, 4)
+FIELD(ID_AA64PFR1, NMI, 36, 4)
 
 FIELD(ID_AA64MMFR0, PARANGE, 0, 4)
 FIELD(ID_AA64MMFR0, ASIDBITS, 4, 4)
@@ -2033,6 +2048,11 @@ FIELD(ID_AA64MMFR1, SPECSEI, 24, 4)
 FIELD(ID_AA64MMFR1, XNX, 28, 4)
 FIELD(ID_AA64MMFR1, TWED, 32, 4)
 FIELD(ID_AA64MMFR1, ETS, 36, 4)
+FIELD(ID_AA64MMFR1, HCX, 40, 4)
+FIELD(ID_AA64MMFR1, AFP, 44, 4)
+FIELD(ID_AA64MMFR1, NTLBPA, 48, 4)
+FIELD(ID_AA64MMFR1, TIDCP1, 52, 4)
+FIELD(ID_AA64MMFR1, CMOW, 56, 4)
 
 FIELD(ID_AA64MMFR2, CNP, 0, 4)
 FIELD(ID_AA64MMFR2, UAO, 4, 4)
@@ -2059,7 +2079,10 @@ FIELD(ID_AA64DFR0, CTX_CMPS, 28, 4)
 FIELD(ID_AA64DFR0, PMSVER, 32, 4)
 FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4)
 FIELD(ID_AA64DFR0, TRACEFILT, 40, 4)
+FIELD(ID_AA64DFR0, TRACEBUFFER, 44, 4)
 FIELD(ID_AA64DFR0, MTPMU, 48, 4)
+FIELD(ID_AA64DFR0, BRBE, 52, 4)
+FIELD(ID_AA64DFR0, HPMN0, 60, 4)
 
 FIELD(ID_AA64ZFR0, SVEVER, 0, 4)
 FIELD(ID_AA64ZFR0, AES, 4, 4)
@@ -2081,6 +2104,7 @@ FIELD(ID_DFR0, PERFMON, 24, 4)
 FIELD(ID_DFR0, TRACEFILT, 28, 4)
 
 FIELD(ID_DFR1, MTPMU, 0, 4)
+FIELD(ID_DFR1, HPMN0, 4, 4)
 
 FIELD(DBGDIDR, SE_IMP, 12, 1)
 FIELD(DBGDIDR, NSUHD_IMP, 14, 1)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
  2022-04-17 17:43 ` [PATCH v3 01/60] tcg: Add tcg_constant_ptr Richard Henderson
  2022-04-17 17:43 ` [PATCH v3 02/60] target/arm: Update ISAR fields for ARMv8.8 Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-19 11:13   ` Alex Bennée
  2022-04-19 11:14   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 04/60] target/arm: Update SCTLR bits to ARMv9.2 Richard Henderson
                   ` (57 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Update SCR_EL3 fields per ARM DDI0487 H.a.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9a29a4a215..f843c62c83 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1544,6 +1544,18 @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
 #define SCR_FIEN              (1U << 21)
 #define SCR_ENSCXT            (1U << 25)
 #define SCR_ATA               (1U << 26)
+#define SCR_FGTEN             (1U << 27)
+#define SCR_ECVEN             (1U << 28)
+#define SCR_TWEDEN            (1U << 29)
+#define SCR_TWEDEL            MAKE_64BIT_MASK(30, 4)
+#define SCR_TME               (1ULL << 34)
+#define SCR_AMVOFFEN          (1ULL << 35)
+#define SCR_ENAS0             (1ULL << 36)
+#define SCR_ADEN              (1ULL << 37)
+#define SCR_HXEN              (1ULL << 38)
+#define SCR_TRNDR             (1ULL << 40)
+#define SCR_ENTP2             (1ULL << 41)
+#define SCR_GPF               (1ULL << 48)
 
 #define HSTR_TTEE (1 << 16)
 #define HSTR_TJDBX (1 << 17)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 04/60] target/arm: Update SCTLR bits to ARMv9.2
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (2 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8 Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-19 11:16   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 05/60] target/arm: Change DisasContext.aarch64 to bool Richard Henderson
                   ` (56 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Update SCTLR_ELx fields per ARM DDI0487 H.a.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f843c62c83..9ae9c935a2 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1232,6 +1232,20 @@ void pmu_init(ARMCPU *cpu);
 #define SCTLR_ATA0    (1ULL << 42) /* v8.5-MemTag */
 #define SCTLR_ATA     (1ULL << 43) /* v8.5-MemTag */
 #define SCTLR_DSSBS_64 (1ULL << 44) /* v8.5, AArch64 only */
+#define SCTLR_TWEDEn  (1ULL << 45)  /* FEAT_TWED */
+#define SCTLR_TWEDEL  MAKE_64_MASK(46, 4)  /* FEAT_TWED */
+#define SCTLR_TMT0    (1ULL << 50) /* FEAT_TME */
+#define SCTLR_TMT     (1ULL << 51) /* FEAT_TME */
+#define SCTLR_TME0    (1ULL << 52) /* FEAT_TME */
+#define SCTLR_TME     (1ULL << 53) /* FEAT_TME */
+#define SCTLR_EnASR   (1ULL << 54) /* FEAT_LS64_V */
+#define SCTLR_EnAS0   (1ULL << 55) /* FEAT_LS64_ACCDATA */
+#define SCTLR_EnALS   (1ULL << 56) /* FEAT_LS64 */
+#define SCTLR_EPAN    (1ULL << 57) /* FEAT_PAN3 */
+#define SCTLR_EnTP2   (1ULL << 60) /* FEAT_SME */
+#define SCTLR_NMI     (1ULL << 61) /* FEAT_NMI */
+#define SCTLR_SPINTMASK (1ULL << 62) /* FEAT_NMI */
+#define SCTLR_TIDCP   (1ULL << 63) /* FEAT_TIDCP1 */
 
 #define CPTR_TCPAC    (1U << 31)
 #define CPTR_TTA      (1U << 20)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 05/60] target/arm: Change DisasContext.aarch64 to bool
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (3 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 04/60] target/arm: Update SCTLR bits to ARMv9.2 Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-19 11:16   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 " Richard Henderson
                   ` (55 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Bool is a more appropriate type for this value.
Move the member down in the struct to keep the
bool type members together and remove a hole.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.h     | 2 +-
 target/arm/translate-a64.c | 2 +-
 target/arm/translate.c     | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 3a0db801d3..8b7dd1a4c0 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -59,12 +59,12 @@ typedef struct DisasContext {
      * so that top level loop can generate correct syndrome information.
      */
     uint32_t svc_imm;
-    int aarch64;
     int current_el;
     /* Debug target exception level for single-step exceptions */
     int debug_target_el;
     GHashTable *cp_regs;
     uint64_t features; /* CPU features bits */
+    bool aarch64;
     /* Because unallocated encodings generate different exception syndrome
      * information from traps due to FP being disabled, we can't do a single
      * "is fp access disabled" check at a high level in the decode tree.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 9333d7be41..4dad23db48 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14664,7 +14664,7 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
     dc->isar = &arm_cpu->isar;
     dc->condjmp = 0;
 
-    dc->aarch64 = 1;
+    dc->aarch64 = true;
     /* If we are coming from secure EL0 in a system with a 32-bit EL3, then
      * there is no secure EL1, so we route exceptions to EL3.
      */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index bf2196b9e2..480e58f49e 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9334,7 +9334,7 @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->isar = &cpu->isar;
     dc->condjmp = 0;
 
-    dc->aarch64 = 0;
+    dc->aarch64 = false;
     /* If we are coming from secure EL0 in a system with a 32-bit EL3, then
      * there is no secure EL1, so we route exceptions to EL3.
      */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 to bool
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (4 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 05/60] target/arm: Change DisasContext.aarch64 to bool Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-19 11:17   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size Richard Henderson
                   ` (54 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Bool is a more appropriate type for this value.
Adjust the assignments to use true/false.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h        | 2 +-
 target/arm/cpu.c        | 2 +-
 target/arm/helper-a64.c | 4 ++--
 target/arm/helper.c     | 2 +-
 target/arm/hvf/hvf.c    | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9ae9c935a2..a61a52e2f6 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -258,7 +258,7 @@ typedef struct CPUArchState {
      *  all other bits are stored in their correct places in env->pstate
      */
     uint32_t pstate;
-    uint32_t aarch64; /* 1 if CPU is in aarch64 state; inverse of PSTATE.nRW */
+    bool aarch64; /* True if CPU is in aarch64 state; inverse of PSTATE.nRW */
 
     /* Cached TBFLAGS state.  See below for which bits are included.  */
     CPUARMTBFlags hflags;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5d4ca7a227..30e0d16ad4 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -189,7 +189,7 @@ static void arm_cpu_reset(DeviceState *dev)
 
     if (arm_feature(env, ARM_FEATURE_AARCH64)) {
         /* 64 bit CPUs always start in 64 bit mode */
-        env->aarch64 = 1;
+        env->aarch64 = true;
 #if defined(CONFIG_USER_ONLY)
         env->pstate = PSTATE_MODE_EL0t;
         /* Userspace expects access to DC ZVA, CTL_EL0 and the cache ops */
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 7cf953b1e6..77a8502b6b 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -952,7 +952,7 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
     qemu_mutex_unlock_iothread();
 
     if (!return_to_aa64) {
-        env->aarch64 = 0;
+        env->aarch64 = false;
         /* We do a raw CPSR write because aarch64_sync_64_to_32()
          * will sort the register banks out for us, and we've already
          * caught all the bad-mode cases in el_from_spsr().
@@ -975,7 +975,7 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
     } else {
         int tbii;
 
-        env->aarch64 = 1;
+        env->aarch64 = true;
         spsr &= aarch64_pstate_valid_mask(&env_archcpu(env)->isar);
         pstate_write(env, spsr);
         if (!arm_singlestep_active(env)) {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7d14650615..47fe790854 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10182,7 +10182,7 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     }
 
     pstate_write(env, PSTATE_DAIF | new_mode);
-    env->aarch64 = 1;
+    env->aarch64 = true;
     aarch64_restore_sp(env, new_el);
     helper_rebuild_hflags_a64(env, new_el);
 
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 8c34f86792..11176ef252 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -565,7 +565,7 @@ int hvf_arch_init_vcpu(CPUState *cpu)
     hv_return_t ret;
     int i;
 
-    env->aarch64 = 1;
+    env->aarch64 = true;
     asm volatile("mrs %0, cntfrq_el0" : "=r"(arm_cpu->gt_cntfrq_hz));
 
     /* Allocate enough space for our sysreg sync */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (5 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 " Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 16:15   ` Peter Maydell
  2022-04-22 13:58   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool Richard Henderson
                   ` (53 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Currently we assume all fields are 32-bit.
Prepare for fields of a single byte, using sizeof.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a32.h | 13 +++++--------
 target/arm/translate.c     | 21 ++++++++++++++++++++-
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
index 5be4b9b834..f593740a88 100644
--- a/target/arm/translate-a32.h
+++ b/target/arm/translate-a32.h
@@ -61,17 +61,14 @@ static inline TCGv_i32 load_cpu_offset(int offset)
 
 #define load_cpu_field(name) load_cpu_offset(offsetof(CPUARMState, name))
 
-static inline void store_cpu_offset(TCGv_i32 var, int offset)
-{
-    tcg_gen_st_i32(var, cpu_env, offset);
-    tcg_temp_free_i32(var);
-}
+void store_cpu_offset(TCGv_i32 var, int offset, int size);
 
-#define store_cpu_field(var, name) \
-    store_cpu_offset(var, offsetof(CPUARMState, name))
+#define store_cpu_field(var, name)                              \
+    store_cpu_offset(var, offsetof(CPUARMState, name),          \
+                     sizeof(((CPUARMState *)NULL)->name))
 
 #define store_cpu_field_constant(val, name) \
-    tcg_gen_st_i32(tcg_constant_i32(val), cpu_env, offsetof(CPUARMState, name))
+    store_cpu_field(tcg_constant_i32(val), name)
 
 /* Create a new temporary and set it to the value of a CPU register.  */
 static inline TCGv_i32 load_reg(DisasContext *s, int reg)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 480e58f49e..c745b7fc91 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -180,6 +180,25 @@ typedef enum ISSInfo {
     ISSIs16Bit = (1 << 8),
 } ISSInfo;
 
+/*
+ * Store var into env + offset to a member with size bytes.
+ * Free var after use.
+ */
+void store_cpu_offset(TCGv_i32 var, int offset, int size)
+{
+    switch (size) {
+    case 1:
+        tcg_gen_st8_i32(var, cpu_env, offset);
+        break;
+    case 4:
+        tcg_gen_st_i32(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    tcg_temp_free_i32(var);
+}
+
 /* Save the syndrome information for a Data Abort */
 static void disas_set_da_iss(DisasContext *s, MemOp memop, ISSInfo issinfo)
 {
@@ -4852,7 +4871,7 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
                     tcg_temp_free_i32(tmp);
                 } else {
                     TCGv_i32 tmp = load_reg(s, rt);
-                    store_cpu_offset(tmp, ri->fieldoffset);
+                    store_cpu_offset(tmp, ri->fieldoffset, 4);
                 }
             }
         }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (6 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 16:15   ` Peter Maydell
  2022-04-22 13:59   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 09/60] target/arm: Change CPUArchState.thumb " Richard Henderson
                   ` (52 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Bool is a more appropriate type for this value.
Move the member down in the struct to keep the
bool type members together and remove a hole.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.h     | 2 +-
 target/arm/translate-a64.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 8b7dd1a4c0..050d80f6f9 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -30,7 +30,6 @@ typedef struct DisasContext {
     bool eci_handled;
     /* TCG op to rewind to if this turns out to be an invalid ECI state */
     TCGOp *insn_eci_rewind;
-    int thumb;
     int sctlr_b;
     MemOp be_data;
 #if !defined(CONFIG_USER_ONLY)
@@ -65,6 +64,7 @@ typedef struct DisasContext {
     GHashTable *cp_regs;
     uint64_t features; /* CPU features bits */
     bool aarch64;
+    bool thumb;
     /* Because unallocated encodings generate different exception syndrome
      * information from traps due to FP being disabled, we can't do a single
      * "is fp access disabled" check at a high level in the decode tree.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 4dad23db48..be7283b966 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14670,7 +14670,7 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
      */
     dc->secure_routed_to_el3 = arm_feature(env, ARM_FEATURE_EL3) &&
                                !arm_el_is_aa64(env, 3);
-    dc->thumb = 0;
+    dc->thumb = false;
     dc->sctlr_b = 0;
     dc->be_data = EX_TBFLAG_ANY(tb_flags, BE_DATA) ? MO_BE : MO_LE;
     dc->condexec_mask = 0;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 09/60] target/arm: Change CPUArchState.thumb to bool
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (7 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 16:18   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 10/60] target/arm: Remove fpexc32_access Richard Henderson
                   ` (51 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Bool is a more appropriate type for this value.
Adjust the assignments to use true/false.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h          | 2 +-
 linux-user/arm/cpu_loop.c | 2 +-
 target/arm/cpu.c          | 2 +-
 target/arm/m_helper.c     | 6 +++---
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a61a52e2f6..4eb378ede2 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -259,6 +259,7 @@ typedef struct CPUArchState {
      */
     uint32_t pstate;
     bool aarch64; /* True if CPU is in aarch64 state; inverse of PSTATE.nRW */
+    bool thumb;   /* True if CPU is in thumb mode; cpsr[5] */
 
     /* Cached TBFLAGS state.  See below for which bits are included.  */
     CPUARMTBFlags hflags;
@@ -285,7 +286,6 @@ typedef struct CPUArchState {
     uint32_t ZF; /* Z set if zero.  */
     uint32_t QF; /* 0 or 1 */
     uint32_t GE; /* cpsr[19:16] */
-    uint32_t thumb; /* cpsr[5]. 0 = arm mode, 1 = thumb mode. */
     uint32_t condexec_bits; /* IT bits.  cpsr[15:10,26:25].  */
     uint32_t btype;  /* BTI branch type.  spsr[11:10].  */
     uint64_t daif; /* exception masks, in the bits they are in PSTATE */
diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
index aae375d617..2979109f92 100644
--- a/linux-user/arm/cpu_loop.c
+++ b/linux-user/arm/cpu_loop.c
@@ -231,7 +231,7 @@ do_kernel_trap(CPUARMState *env)
     /* Jump back to the caller.  */
     addr = env->regs[14];
     if (addr & 1) {
-        env->thumb = 1;
+        env->thumb = true;
         addr &= ~1;
     }
     env->regs[15] = addr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 30e0d16ad4..561f180067 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -51,7 +51,7 @@ static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 
     if (is_a64(env)) {
         env->pc = value;
-        env->thumb = 0;
+        env->thumb = false;
     } else {
         env->regs[15] = value & ~1;
         env->thumb = value & 1;
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
index b7a0fe0114..a740c3e160 100644
--- a/target/arm/m_helper.c
+++ b/target/arm/m_helper.c
@@ -564,7 +564,7 @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
         env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
     }
     switch_v7m_security_state(env, dest & 1);
-    env->thumb = 1;
+    env->thumb = true;
     env->regs[15] = dest & ~1;
     arm_rebuild_hflags(env);
 }
@@ -590,7 +590,7 @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
          * except that the low bit doesn't indicate Thumb/not.
          */
         env->regs[14] = nextinst;
-        env->thumb = 1;
+        env->thumb = true;
         env->regs[15] = dest & ~1;
         return;
     }
@@ -626,7 +626,7 @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
     }
     env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
     switch_v7m_security_state(env, 0);
-    env->thumb = 1;
+    env->thumb = true;
     env->regs[15] = dest;
     arm_rebuild_hflags(env);
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 10/60] target/arm: Remove fpexc32_access
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (8 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 09/60] target/arm: Change CPUArchState.thumb " Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 16:25   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 11/60] target/arm: Split out set_btype_raw Richard Henderson
                   ` (50 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

This function is incorrect in that it does not properly consider
CPTR_EL2.FPEN.  We've already got another mechanism for raising
an FPU access trap: ARM_CP_FPU, so use that instead.

Remove CP_ACCESS_TRAP_FP_EL{2,3}, which becomes unused.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h       |  5 -----
 target/arm/helper.c    | 17 ++---------------
 target/arm/op_helper.c | 13 -------------
 3 files changed, 2 insertions(+), 33 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4eb378ede2..e7f669d0a9 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2806,11 +2806,6 @@ typedef enum CPAccessResult {
     /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
     CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
     CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
-    /* Access fails and results in an exception syndrome for an FP access,
-     * trapped directly to EL2 or EL3
-     */
-    CP_ACCESS_TRAP_FP_EL2 = 7,
-    CP_ACCESS_TRAP_FP_EL3 = 8,
 } CPAccessResult;
 
 /* Access functions for coprocessor registers. These cannot fail and
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 47fe790854..60d9233b7e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4785,18 +4785,6 @@ static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     }
 }
 
-static CPAccessResult fpexc32_access(CPUARMState *env, const ARMCPRegInfo *ri,
-                                     bool isread)
-{
-    if ((env->cp15.cptr_el[2] & CPTR_TFP) && arm_current_el(env) == 2) {
-        return CP_ACCESS_TRAP_FP_EL2;
-    }
-    if (env->cp15.cptr_el[3] & CPTR_TFP) {
-        return CP_ACCESS_TRAP_FP_EL3;
-    }
-    return CP_ACCESS_OK;
-}
-
 static void sdcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t value)
 {
@@ -5098,9 +5086,8 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .access = PL1_RW, .readfn = spsel_read, .writefn = spsel_write },
     { .name = "FPEXC32_EL2", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 4, .crn = 5, .crm = 3, .opc2 = 0,
-      .type = ARM_CP_ALIAS,
-      .fieldoffset = offsetof(CPUARMState, vfp.xregs[ARM_VFP_FPEXC]),
-      .access = PL2_RW, .accessfn = fpexc32_access },
+      .access = PL2_RW, .type = ARM_CP_ALIAS | ARM_CP_FPU,
+      .fieldoffset = offsetof(CPUARMState, vfp.xregs[ARM_VFP_FPEXC]) },
     { .name = "DACR32_EL2", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 4, .crn = 3, .crm = 0, .opc2 = 0,
       .access = PL2_RW, .resetvalue = 0,
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 70b42b55fd..2b87e8808b 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -691,19 +691,6 @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
         target_el = 3;
         syndrome = syn_uncategorized();
         break;
-    case CP_ACCESS_TRAP_FP_EL2:
-        target_el = 2;
-        /* Since we are an implementation that takes exceptions on a trapped
-         * conditional insn only if the insn has passed its condition code
-         * check, we take the IMPDEF choice to always report CV=1 COND=0xe
-         * (which is also the required value for AArch64 traps).
-         */
-        syndrome = syn_fp_access_trap(1, 0xe, false);
-        break;
-    case CP_ACCESS_TRAP_FP_EL3:
-        target_el = 3;
-        syndrome = syn_fp_access_trap(1, 0xe, false);
-        break;
     default:
         g_assert_not_reached();
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 11/60] target/arm: Split out set_btype_raw
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (9 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 10/60] target/arm: Remove fpexc32_access Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 16:27   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 12/60] target/arm: Split out gen_rebuild_hflags Richard Henderson
                   ` (49 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Common code for reset_btype and set_btype.
Use tcg_constant_i32.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index be7283b966..a85ca380a9 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -128,29 +128,28 @@ static int get_a64_user_mem_index(DisasContext *s)
     return arm_to_core_mmu_idx(useridx);
 }
 
-static void reset_btype(DisasContext *s)
+static void set_btype_raw(int val)
 {
-    if (s->btype != 0) {
-        TCGv_i32 zero = tcg_const_i32(0);
-        tcg_gen_st_i32(zero, cpu_env, offsetof(CPUARMState, btype));
-        tcg_temp_free_i32(zero);
-        s->btype = 0;
-    }
+    tcg_gen_st_i32(tcg_constant_i32(val), cpu_env,
+                   offsetof(CPUARMState, btype));
 }
 
 static void set_btype(DisasContext *s, int val)
 {
-    TCGv_i32 tcg_val;
-
     /* BTYPE is a 2-bit field, and 0 should be done with reset_btype.  */
     tcg_debug_assert(val >= 1 && val <= 3);
-
-    tcg_val = tcg_const_i32(val);
-    tcg_gen_st_i32(tcg_val, cpu_env, offsetof(CPUARMState, btype));
-    tcg_temp_free_i32(tcg_val);
+    set_btype_raw(val);
     s->btype = -1;
 }
 
+static void reset_btype(DisasContext *s)
+{
+    if (s->btype != 0) {
+        set_btype_raw(0);
+        s->btype = 0;
+    }
+}
+
 void gen_a64_set_pc_im(uint64_t val)
 {
     tcg_gen_movi_i64(cpu_pc, val);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 12/60] target/arm: Split out gen_rebuild_hflags
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (10 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 11/60] target/arm: Split out set_btype_raw Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 18:47   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 13/60] target/arm: Use tcg_constant in translate-a64.c Richard Henderson
                   ` (48 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

For aa32, the function has a parameter to use the new el.
For aa64, that never happens.
Use tcg_constant_i32 while we're at it.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 21 +++++++++-----------
 target/arm/translate.c     | 40 +++++++++++++++++++++++---------------
 2 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a85ca380a9..a00a882145 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -341,6 +341,11 @@ static void a64_free_cc(DisasCompare64 *c64)
     tcg_temp_free_i64(c64->value);
 }
 
+static void gen_rebuild_hflags(DisasContext *s)
+{
+    gen_helper_rebuild_hflags_a64(cpu_env, tcg_constant_i32(s->current_el));
+}
+
 static void gen_exception_internal(int excp)
 {
     TCGv_i32 tcg_excp = tcg_const_i32(excp);
@@ -1667,9 +1672,7 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
         } else {
             clear_pstate_bits(PSTATE_UAO);
         }
-        t1 = tcg_const_i32(s->current_el);
-        gen_helper_rebuild_hflags_a64(cpu_env, t1);
-        tcg_temp_free_i32(t1);
+        gen_rebuild_hflags(s);
         break;
 
     case 0x04: /* PAN */
@@ -1681,9 +1684,7 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
         } else {
             clear_pstate_bits(PSTATE_PAN);
         }
-        t1 = tcg_const_i32(s->current_el);
-        gen_helper_rebuild_hflags_a64(cpu_env, t1);
-        tcg_temp_free_i32(t1);
+        gen_rebuild_hflags(s);
         break;
 
     case 0x05: /* SPSel */
@@ -1741,9 +1742,7 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
             } else {
                 clear_pstate_bits(PSTATE_TCO);
             }
-            t1 = tcg_const_i32(s->current_el);
-            gen_helper_rebuild_hflags_a64(cpu_env, t1);
-            tcg_temp_free_i32(t1);
+            gen_rebuild_hflags(s);
             /* Many factors, including TCO, go into MTE_ACTIVE. */
             s->base.is_jmp = DISAS_UPDATE_NOCHAIN;
         } else if (dc_isar_feature(aa64_mte_insn_reg, s)) {
@@ -1990,9 +1989,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
          * A write to any coprocessor regiser that ends a TB
          * must rebuild the hflags for the next TB.
          */
-        TCGv_i32 tcg_el = tcg_const_i32(s->current_el);
-        gen_helper_rebuild_hflags_a64(cpu_env, tcg_el);
-        tcg_temp_free_i32(tcg_el);
+        gen_rebuild_hflags(s);
         /*
          * We default to ending the TB on a coprocessor register write,
          * but allow this to be suppressed by the register definition
diff --git a/target/arm/translate.c b/target/arm/translate.c
index c745b7fc91..6b293f8279 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -351,6 +351,26 @@ void gen_set_cpsr(TCGv_i32 var, uint32_t mask)
     tcg_temp_free_i32(tmp_mask);
 }
 
+static void gen_rebuild_hflags(DisasContext *s, bool new_el)
+{
+    bool m_profile = arm_dc_feature(s, ARM_FEATURE_M);
+
+    if (new_el) {
+        if (m_profile) {
+            gen_helper_rebuild_hflags_m32_newel(cpu_env);
+        } else {
+            gen_helper_rebuild_hflags_a32_newel(cpu_env);
+        }
+    } else {
+        TCGv_i32 tcg_el = tcg_constant_i32(s->current_el);
+        if (m_profile) {
+            gen_helper_rebuild_hflags_m32(cpu_env, tcg_el);
+        } else {
+            gen_helper_rebuild_hflags_a32(cpu_env, tcg_el);
+        }
+    }
+}
+
 static void gen_exception_internal(int excp)
 {
     TCGv_i32 tcg_excp = tcg_const_i32(excp);
@@ -4885,17 +4905,7 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
              * A write to any coprocessor register that ends a TB
              * must rebuild the hflags for the next TB.
              */
-            TCGv_i32 tcg_el = tcg_const_i32(s->current_el);
-            if (arm_dc_feature(s, ARM_FEATURE_M)) {
-                gen_helper_rebuild_hflags_m32(cpu_env, tcg_el);
-            } else {
-                if (ri->type & ARM_CP_NEWEL) {
-                    gen_helper_rebuild_hflags_a32_newel(cpu_env);
-                } else {
-                    gen_helper_rebuild_hflags_a32(cpu_env, tcg_el);
-                }
-            }
-            tcg_temp_free_i32(tcg_el);
+            gen_rebuild_hflags(s, ri->type & ARM_CP_NEWEL);
             /*
              * We default to ending the TB on a coprocessor register write,
              * but allow this to be suppressed by the register definition
@@ -6445,7 +6455,7 @@ static bool trans_MSR_v7m(DisasContext *s, arg_MSR_v7m *a)
     tcg_temp_free_i32(addr);
     tcg_temp_free_i32(reg);
     /* If we wrote to CONTROL, the EL might have changed */
-    gen_helper_rebuild_hflags_m32_newel(cpu_env);
+    gen_rebuild_hflags(s, true);
     gen_lookup_tb(s);
     return true;
 }
@@ -8897,7 +8907,7 @@ static bool trans_CPS(DisasContext *s, arg_CPS *a)
 
 static bool trans_CPS_v7m(DisasContext *s, arg_CPS_v7m *a)
 {
-    TCGv_i32 tmp, addr, el;
+    TCGv_i32 tmp, addr;
 
     if (!arm_dc_feature(s, ARM_FEATURE_M)) {
         return false;
@@ -8920,9 +8930,7 @@ static bool trans_CPS_v7m(DisasContext *s, arg_CPS_v7m *a)
         gen_helper_v7m_msr(cpu_env, addr, tmp);
         tcg_temp_free_i32(addr);
     }
-    el = tcg_const_i32(s->current_el);
-    gen_helper_rebuild_hflags_m32(cpu_env, el);
-    tcg_temp_free_i32(el);
+    gen_rebuild_hflags(s, false);
     tcg_temp_free_i32(tmp);
     gen_lookup_tb(s);
     return true;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 13/60] target/arm: Use tcg_constant in translate-a64.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (11 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 12/60] target/arm: Split out gen_rebuild_hflags Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 18:49   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 14/60] target/arm: Simplify GEN_SHIFT in translate.c Richard Henderson
                   ` (47 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_constant_{i32,i64,ptr} as appropriate throughout, which
means we get to remove lots of tcg_temp_free_*.  Drop variables
in many cases, passing the constant directly to another function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 302 +++++++++++--------------------------
 1 file changed, 90 insertions(+), 212 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a00a882145..3867910ed4 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -240,14 +240,10 @@ static void gen_address_with_allocation_tag0(TCGv_i64 dst, TCGv_i64 src)
 static void gen_probe_access(DisasContext *s, TCGv_i64 ptr,
                              MMUAccessType acc, int log2_size)
 {
-    TCGv_i32 t_acc = tcg_const_i32(acc);
-    TCGv_i32 t_idx = tcg_const_i32(get_mem_index(s));
-    TCGv_i32 t_size = tcg_const_i32(1 << log2_size);
-
-    gen_helper_probe_access(cpu_env, ptr, t_acc, t_idx, t_size);
-    tcg_temp_free_i32(t_acc);
-    tcg_temp_free_i32(t_idx);
-    tcg_temp_free_i32(t_size);
+    gen_helper_probe_access(cpu_env, ptr,
+                            tcg_constant_i32(acc),
+                            tcg_constant_i32(get_mem_index(s)),
+                            tcg_constant_i32(1 << log2_size));
 }
 
 /*
@@ -262,7 +258,6 @@ static TCGv_i64 gen_mte_check1_mmuidx(DisasContext *s, TCGv_i64 addr,
                                       int core_idx)
 {
     if (tag_checked && s->mte_active[is_unpriv]) {
-        TCGv_i32 tcg_desc;
         TCGv_i64 ret;
         int desc = 0;
 
@@ -271,11 +266,9 @@ static TCGv_i64 gen_mte_check1_mmuidx(DisasContext *s, TCGv_i64 addr,
         desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
         desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
         desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << log2_size) - 1);
-        tcg_desc = tcg_const_i32(desc);
 
         ret = new_tmp_a64(s);
-        gen_helper_mte_check(ret, cpu_env, tcg_desc, addr);
-        tcg_temp_free_i32(tcg_desc);
+        gen_helper_mte_check(ret, cpu_env, tcg_constant_i32(desc), addr);
 
         return ret;
     }
@@ -296,7 +289,6 @@ TCGv_i64 gen_mte_checkN(DisasContext *s, TCGv_i64 addr, bool is_write,
                         bool tag_checked, int size)
 {
     if (tag_checked && s->mte_active[0]) {
-        TCGv_i32 tcg_desc;
         TCGv_i64 ret;
         int desc = 0;
 
@@ -305,11 +297,9 @@ TCGv_i64 gen_mte_checkN(DisasContext *s, TCGv_i64 addr, bool is_write,
         desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
         desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
         desc = FIELD_DP32(desc, MTEDESC, SIZEM1, size - 1);
-        tcg_desc = tcg_const_i32(desc);
 
         ret = new_tmp_a64(s);
-        gen_helper_mte_check(ret, cpu_env, tcg_desc, addr);
-        tcg_temp_free_i32(tcg_desc);
+        gen_helper_mte_check(ret, cpu_env, tcg_constant_i32(desc), addr);
 
         return ret;
     }
@@ -348,11 +338,8 @@ static void gen_rebuild_hflags(DisasContext *s)
 
 static void gen_exception_internal(int excp)
 {
-    TCGv_i32 tcg_excp = tcg_const_i32(excp);
-
     assert(excp_is_internal(excp));
-    gen_helper_exception_internal(cpu_env, tcg_excp);
-    tcg_temp_free_i32(tcg_excp);
+    gen_helper_exception_internal(cpu_env, tcg_constant_i32(excp));
 }
 
 static void gen_exception_internal_insn(DisasContext *s, uint64_t pc, int excp)
@@ -364,12 +351,8 @@ static void gen_exception_internal_insn(DisasContext *s, uint64_t pc, int excp)
 
 static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syndrome)
 {
-    TCGv_i32 tcg_syn;
-
     gen_a64_set_pc_im(s->pc_curr);
-    tcg_syn = tcg_const_i32(syndrome);
-    gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
-    tcg_temp_free_i32(tcg_syn);
+    gen_helper_exception_bkpt_insn(cpu_env, tcg_constant_i32(syndrome));
     s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -831,15 +814,15 @@ static void gen_adc(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 static void gen_adc_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 {
     if (sf) {
-        TCGv_i64 result, cf_64, vf_64, tmp;
-        result = tcg_temp_new_i64();
-        cf_64 = tcg_temp_new_i64();
-        vf_64 = tcg_temp_new_i64();
-        tmp = tcg_const_i64(0);
+        TCGv_i64 result = tcg_temp_new_i64();
+        TCGv_i64 cf_64 = tcg_temp_new_i64();
+        TCGv_i64 vf_64 = tcg_temp_new_i64();
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        TCGv_i64 zero = tcg_constant_i64(0);
 
         tcg_gen_extu_i32_i64(cf_64, cpu_CF);
-        tcg_gen_add2_i64(result, cf_64, t0, tmp, cf_64, tmp);
-        tcg_gen_add2_i64(result, cf_64, result, cf_64, t1, tmp);
+        tcg_gen_add2_i64(result, cf_64, t0, zero, cf_64, zero);
+        tcg_gen_add2_i64(result, cf_64, result, cf_64, t1, zero);
         tcg_gen_extrl_i64_i32(cpu_CF, cf_64);
         gen_set_NZ64(result);
 
@@ -855,15 +838,15 @@ static void gen_adc_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
         tcg_temp_free_i64(cf_64);
         tcg_temp_free_i64(result);
     } else {
-        TCGv_i32 t0_32, t1_32, tmp;
-        t0_32 = tcg_temp_new_i32();
-        t1_32 = tcg_temp_new_i32();
-        tmp = tcg_const_i32(0);
+        TCGv_i32 t0_32 = tcg_temp_new_i32();
+        TCGv_i32 t1_32 = tcg_temp_new_i32();
+        TCGv_i32 tmp = tcg_temp_new_i32();
+        TCGv_i32 zero = tcg_constant_i32(0);
 
         tcg_gen_extrl_i64_i32(t0_32, t0);
         tcg_gen_extrl_i64_i32(t1_32, t1);
-        tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, tmp, cpu_CF, tmp);
-        tcg_gen_add2_i32(cpu_NF, cpu_CF, cpu_NF, cpu_CF, t1_32, tmp);
+        tcg_gen_add2_i32(cpu_NF, cpu_CF, t0_32, zero, cpu_CF, zero);
+        tcg_gen_add2_i32(cpu_NF, cpu_CF, cpu_NF, cpu_CF, t1_32, zero);
 
         tcg_gen_mov_i32(cpu_ZF, cpu_NF);
         tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
@@ -1632,7 +1615,6 @@ static void gen_axflag(void)
 static void handle_msr_i(DisasContext *s, uint32_t insn,
                          unsigned int op1, unsigned int op2, unsigned int crm)
 {
-    TCGv_i32 t1;
     int op = op1 << 3 | op2;
 
     /* End the TB by default, chaining is ok.  */
@@ -1691,9 +1673,7 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
         if (s->current_el == 0) {
             goto do_unallocated;
         }
-        t1 = tcg_const_i32(crm & PSTATE_SP);
-        gen_helper_msr_i_spsel(cpu_env, t1);
-        tcg_temp_free_i32(t1);
+        gen_helper_msr_i_spsel(cpu_env, tcg_constant_i32(crm & PSTATE_SP));
         break;
 
     case 0x19: /* SSBS */
@@ -1721,15 +1701,11 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
         break;
 
     case 0x1e: /* DAIFSet */
-        t1 = tcg_const_i32(crm);
-        gen_helper_msr_i_daifset(cpu_env, t1);
-        tcg_temp_free_i32(t1);
+        gen_helper_msr_i_daifset(cpu_env, tcg_constant_i32(crm));
         break;
 
     case 0x1f: /* DAIFClear */
-        t1 = tcg_const_i32(crm);
-        gen_helper_msr_i_daifclear(cpu_env, t1);
-        tcg_temp_free_i32(t1);
+        gen_helper_msr_i_daifclear(cpu_env, tcg_constant_i32(crm));
         /* For DAIFClear, exit the cpu loop to re-evaluate pending IRQs.  */
         s->base.is_jmp = DISAS_UPDATE_EXIT;
         break;
@@ -1842,19 +1818,14 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         /* Emit code to perform further access permissions checks at
          * runtime; this may result in an exception.
          */
-        TCGv_ptr tmpptr;
-        TCGv_i32 tcg_syn, tcg_isread;
         uint32_t syndrome;
 
-        gen_a64_set_pc_im(s->pc_curr);
-        tmpptr = tcg_const_ptr(ri);
         syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
-        tcg_syn = tcg_const_i32(syndrome);
-        tcg_isread = tcg_const_i32(isread);
-        gen_helper_access_check_cp_reg(cpu_env, tmpptr, tcg_syn, tcg_isread);
-        tcg_temp_free_ptr(tmpptr);
-        tcg_temp_free_i32(tcg_syn);
-        tcg_temp_free_i32(tcg_isread);
+        gen_a64_set_pc_im(s->pc_curr);
+        gen_helper_access_check_cp_reg(cpu_env,
+                                       tcg_constant_ptr(ri),
+                                       tcg_constant_i32(syndrome),
+                                       tcg_constant_i32(isread));
     } else if (ri->type & ARM_CP_RAISES_EXC) {
         /*
          * The readfn or writefn might raise an exception;
@@ -1885,17 +1856,15 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
     case ARM_CP_DC_ZVA:
         /* Writes clear the aligned block of memory which rt points into. */
         if (s->mte_active[0]) {
-            TCGv_i32 t_desc;
             int desc = 0;
 
             desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
             desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
             desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-            t_desc = tcg_const_i32(desc);
 
             tcg_rt = new_tmp_a64(s);
-            gen_helper_mte_check_zva(tcg_rt, cpu_env, t_desc, cpu_reg(s, rt));
-            tcg_temp_free_i32(t_desc);
+            gen_helper_mte_check_zva(tcg_rt, cpu_env,
+                                     tcg_constant_i32(desc), cpu_reg(s, rt));
         } else {
             tcg_rt = clean_data_tbi(s, cpu_reg(s, rt));
         }
@@ -1959,10 +1928,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         if (ri->type & ARM_CP_CONST) {
             tcg_gen_movi_i64(tcg_rt, ri->resetvalue);
         } else if (ri->readfn) {
-            TCGv_ptr tmpptr;
-            tmpptr = tcg_const_ptr(ri);
-            gen_helper_get_cp_reg64(tcg_rt, cpu_env, tmpptr);
-            tcg_temp_free_ptr(tmpptr);
+            gen_helper_get_cp_reg64(tcg_rt, cpu_env, tcg_constant_ptr(ri));
         } else {
             tcg_gen_ld_i64(tcg_rt, cpu_env, ri->fieldoffset);
         }
@@ -1971,10 +1937,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
             /* If not forbidden by access permissions, treat as WI */
             return;
         } else if (ri->writefn) {
-            TCGv_ptr tmpptr;
-            tmpptr = tcg_const_ptr(ri);
-            gen_helper_set_cp_reg64(cpu_env, tmpptr, tcg_rt);
-            tcg_temp_free_ptr(tmpptr);
+            gen_helper_set_cp_reg64(cpu_env, tcg_constant_ptr(ri), tcg_rt);
         } else {
             tcg_gen_st_i64(tcg_rt, cpu_env, ri->fieldoffset);
         }
@@ -2052,7 +2015,6 @@ static void disas_exc(DisasContext *s, uint32_t insn)
     int opc = extract32(insn, 21, 3);
     int op2_ll = extract32(insn, 0, 5);
     int imm16 = extract32(insn, 5, 16);
-    TCGv_i32 tmp;
 
     switch (opc) {
     case 0:
@@ -2087,9 +2049,7 @@ static void disas_exc(DisasContext *s, uint32_t insn)
                 break;
             }
             gen_a64_set_pc_im(s->pc_curr);
-            tmp = tcg_const_i32(syn_aa64_smc(imm16));
-            gen_helper_pre_smc(cpu_env, tmp);
-            tcg_temp_free_i32(tmp);
+            gen_helper_pre_smc(cpu_env, tcg_constant_i32(syn_aa64_smc(imm16)));
             gen_ss_advance(s);
             gen_exception_insn(s, s->base.pc_next, EXCP_SMC,
                                syn_aa64_smc(imm16), 3);
@@ -2563,7 +2523,7 @@ static void gen_compare_and_swap_pair(DisasContext *s, int rs, int rt,
         tcg_temp_free_i64(cmp);
     } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
         if (HAVE_CMPXCHG128) {
-            TCGv_i32 tcg_rs = tcg_const_i32(rs);
+            TCGv_i32 tcg_rs = tcg_constant_i32(rs);
             if (s->be_data == MO_LE) {
                 gen_helper_casp_le_parallel(cpu_env, tcg_rs,
                                             clean_addr, t1, t2);
@@ -2571,7 +2531,6 @@ static void gen_compare_and_swap_pair(DisasContext *s, int rs, int rt,
                 gen_helper_casp_be_parallel(cpu_env, tcg_rs,
                                             clean_addr, t1, t2);
             }
-            tcg_temp_free_i32(tcg_rs);
         } else {
             gen_helper_exit_atomic(cpu_env);
             s->base.is_jmp = DISAS_NORETURN;
@@ -2582,7 +2541,7 @@ static void gen_compare_and_swap_pair(DisasContext *s, int rs, int rt,
         TCGv_i64 a2 = tcg_temp_new_i64();
         TCGv_i64 c1 = tcg_temp_new_i64();
         TCGv_i64 c2 = tcg_temp_new_i64();
-        TCGv_i64 zero = tcg_const_i64(0);
+        TCGv_i64 zero = tcg_constant_i64(0);
 
         /* Load the two words, in memory order.  */
         tcg_gen_qemu_ld_i64(d1, clean_addr, memidx,
@@ -2603,7 +2562,6 @@ static void gen_compare_and_swap_pair(DisasContext *s, int rs, int rt,
         tcg_temp_free_i64(a2);
         tcg_temp_free_i64(c1);
         tcg_temp_free_i64(c2);
-        tcg_temp_free_i64(zero);
 
         /* Write back the data from memory to Rs.  */
         tcg_gen_mov_i64(s1, d1);
@@ -2820,7 +2778,7 @@ static void disas_ld_lit(DisasContext *s, uint32_t insn)
 
     tcg_rt = cpu_reg(s, rt);
 
-    clean_addr = tcg_const_i64(s->pc_curr + imm);
+    clean_addr = tcg_constant_i64(s->pc_curr + imm);
     if (is_vector) {
         do_fp_ld(s, rt, clean_addr, size);
     } else {
@@ -2830,7 +2788,6 @@ static void disas_ld_lit(DisasContext *s, uint32_t insn)
         do_gpr_ld(s, tcg_rt, clean_addr, size + is_signed * MO_SIGN,
                   false, true, rt, iss_sf, false);
     }
-    tcg_temp_free_i64(clean_addr);
 }
 
 /*
@@ -3736,7 +3693,7 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     mop = endian | size | align;
 
     elements = (is_q ? 16 : 8) >> size;
-    tcg_ebytes = tcg_const_i64(1 << size);
+    tcg_ebytes = tcg_constant_i64(1 << size);
     for (r = 0; r < rpt; r++) {
         int e;
         for (e = 0; e < elements; e++) {
@@ -3752,7 +3709,6 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
             }
         }
     }
-    tcg_temp_free_i64(tcg_ebytes);
 
     if (!is_store) {
         /* For non-quad operations, setting a slice of the low
@@ -3882,7 +3838,7 @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
                                 total);
     mop = finalize_memop(s, scale);
 
-    tcg_ebytes = tcg_const_i64(1 << scale);
+    tcg_ebytes = tcg_constant_i64(1 << scale);
     for (xs = 0; xs < selem; xs++) {
         if (replicate) {
             /* Load and replicate to all elements */
@@ -3904,7 +3860,6 @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
         tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
         rt = (rt + 1) % 32;
     }
-    tcg_temp_free_i64(tcg_ebytes);
 
     if (is_postidx) {
         if (rm == 31) {
@@ -4095,7 +4050,7 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
 
     if (is_zero) {
         TCGv_i64 clean_addr = clean_data_tbi(s, addr);
-        TCGv_i64 tcg_zero = tcg_const_i64(0);
+        TCGv_i64 tcg_zero = tcg_constant_i64(0);
         int mem_index = get_mem_index(s);
         int i, n = (1 + is_pair) << LOG2_TAG_GRANULE;
 
@@ -4105,7 +4060,6 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
             tcg_gen_addi_i64(clean_addr, clean_addr, 8);
             tcg_gen_qemu_st_i64(tcg_zero, clean_addr, mem_index, MO_UQ);
         }
-        tcg_temp_free_i64(tcg_zero);
     }
 
     if (index != 0) {
@@ -4224,13 +4178,12 @@ static void disas_add_sub_imm(DisasContext *s, uint32_t insn)
             tcg_gen_addi_i64(tcg_result, tcg_rn, imm);
         }
     } else {
-        TCGv_i64 tcg_imm = tcg_const_i64(imm);
+        TCGv_i64 tcg_imm = tcg_constant_i64(imm);
         if (sub_op) {
             gen_sub_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
         } else {
             gen_add_CC(is_64bit, tcg_result, tcg_rn, tcg_imm);
         }
-        tcg_temp_free_i64(tcg_imm);
     }
 
     if (is_64bit) {
@@ -4278,12 +4231,9 @@ static void disas_add_sub_imm_with_tags(DisasContext *s, uint32_t insn)
     tcg_rd = cpu_reg_sp(s, rd);
 
     if (s->ata) {
-        TCGv_i32 offset = tcg_const_i32(imm);
-        TCGv_i32 tag_offset = tcg_const_i32(uimm4);
-
-        gen_helper_addsubg(tcg_rd, cpu_env, tcg_rn, offset, tag_offset);
-        tcg_temp_free_i32(tag_offset);
-        tcg_temp_free_i32(offset);
+        gen_helper_addsubg(tcg_rd, cpu_env, tcg_rn,
+                           tcg_constant_i32(imm),
+                           tcg_constant_i32(uimm4));
     } else {
         tcg_gen_addi_i64(tcg_rd, tcg_rn, imm);
         gen_address_with_allocation_tag0(tcg_rd, tcg_rd);
@@ -4469,7 +4419,6 @@ static void disas_movw_imm(DisasContext *s, uint32_t insn)
     int opc = extract32(insn, 29, 2);
     int pos = extract32(insn, 21, 2) << 4;
     TCGv_i64 tcg_rd = cpu_reg(s, rd);
-    TCGv_i64 tcg_imm;
 
     if (!sf && (pos >= 32)) {
         unallocated_encoding(s);
@@ -4489,9 +4438,7 @@ static void disas_movw_imm(DisasContext *s, uint32_t insn)
         tcg_gen_movi_i64(tcg_rd, imm);
         break;
     case 3: /* MOVK */
-        tcg_imm = tcg_const_i64(imm);
-        tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_imm, pos, 16);
-        tcg_temp_free_i64(tcg_imm);
+        tcg_gen_deposit_i64(tcg_rd, tcg_rd, tcg_constant_i64(imm), pos, 16);
         if (!sf) {
             tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
         }
@@ -4731,11 +4678,7 @@ static void shift_reg_imm(TCGv_i64 dst, TCGv_i64 src, int sf,
     if (shift_i == 0) {
         tcg_gen_mov_i64(dst, src);
     } else {
-        TCGv_i64 shift_const;
-
-        shift_const = tcg_const_i64(shift_i);
-        shift_reg(dst, src, sf, shift_type, shift_const);
-        tcg_temp_free_i64(shift_const);
+        shift_reg(dst, src, sf, shift_type, tcg_constant_i64(shift_i));
     }
 }
 
@@ -5312,7 +5255,7 @@ static void disas_cond_select(DisasContext *s, uint32_t insn)
     tcg_rd = cpu_reg(s, rd);
 
     a64_test_cc(&c, cond);
-    zero = tcg_const_i64(0);
+    zero = tcg_constant_i64(0);
 
     if (rn == 31 && rm == 31 && (else_inc ^ else_inv)) {
         /* CSET & CSETM.  */
@@ -5333,7 +5276,6 @@ static void disas_cond_select(DisasContext *s, uint32_t insn)
         tcg_gen_movcond_i64(c.cond, tcg_rd, c.value, zero, t_true, t_false);
     }
 
-    tcg_temp_free_i64(zero);
     a64_free_cc(&c);
 
     if (!sf) {
@@ -5430,7 +5372,7 @@ static void handle_rev16(DisasContext *s, unsigned int sf,
     TCGv_i64 tcg_rd = cpu_reg(s, rd);
     TCGv_i64 tcg_tmp = tcg_temp_new_i64();
     TCGv_i64 tcg_rn = read_cpu_reg(s, rn, sf);
-    TCGv_i64 mask = tcg_const_i64(sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff);
+    TCGv_i64 mask = tcg_constant_i64(sf ? 0x00ff00ff00ff00ffull : 0x00ff00ff);
 
     tcg_gen_shri_i64(tcg_tmp, tcg_rn, 8);
     tcg_gen_and_i64(tcg_rd, tcg_rn, mask);
@@ -5438,7 +5380,6 @@ static void handle_rev16(DisasContext *s, unsigned int sf,
     tcg_gen_shli_i64(tcg_rd, tcg_rd, 8);
     tcg_gen_or_i64(tcg_rd, tcg_rd, tcg_tmp);
 
-    tcg_temp_free_i64(mask);
     tcg_temp_free_i64(tcg_tmp);
 }
 
@@ -5721,15 +5662,13 @@ static void handle_crc32(DisasContext *s,
     }
 
     tcg_acc = cpu_reg(s, rn);
-    tcg_bytes = tcg_const_i32(1 << sz);
+    tcg_bytes = tcg_constant_i32(1 << sz);
 
     if (crc32c) {
         gen_helper_crc32c_64(cpu_reg(s, rd), tcg_acc, tcg_val, tcg_bytes);
     } else {
         gen_helper_crc32_64(cpu_reg(s, rd), tcg_acc, tcg_val, tcg_bytes);
     }
-
-    tcg_temp_free_i32(tcg_bytes);
 }
 
 /* Data-processing (2 source)
@@ -5795,15 +5734,13 @@ static void disas_data_proc_2src(DisasContext *s, uint32_t insn)
         if (sf == 0 || !dc_isar_feature(aa64_mte_insn_reg, s)) {
             goto do_unallocated;
         } else {
-            TCGv_i64 t1 = tcg_const_i64(1);
-            TCGv_i64 t2 = tcg_temp_new_i64();
+            TCGv_i64 t = tcg_temp_new_i64();
 
-            tcg_gen_extract_i64(t2, cpu_reg_sp(s, rn), 56, 4);
-            tcg_gen_shl_i64(t1, t1, t2);
-            tcg_gen_or_i64(cpu_reg(s, rd), cpu_reg(s, rm), t1);
+            tcg_gen_extract_i64(t, cpu_reg_sp(s, rn), 56, 4);
+            tcg_gen_shl_i64(t, tcg_constant_i64(1), t);
+            tcg_gen_or_i64(cpu_reg(s, rd), cpu_reg(s, rm), t);
 
-            tcg_temp_free_i64(t1);
-            tcg_temp_free_i64(t2);
+            tcg_temp_free_i64(t);
         }
         break;
     case 8: /* LSLV */
@@ -5938,7 +5875,7 @@ static void handle_fp_compare(DisasContext *s, int size,
 
         tcg_vn = read_fp_dreg(s, rn);
         if (cmp_with_zero) {
-            tcg_vm = tcg_const_i64(0);
+            tcg_vm = tcg_constant_i64(0);
         } else {
             tcg_vm = read_fp_dreg(s, rm);
         }
@@ -6048,7 +5985,6 @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
 static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
 {
     unsigned int mos, type, rm, cond, rn, op, nzcv;
-    TCGv_i64 tcg_flags;
     TCGLabel *label_continue = NULL;
     int size;
 
@@ -6092,9 +6028,7 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
         label_continue = gen_new_label();
         arm_gen_test_cc(cond, label_match);
         /* nomatch: */
-        tcg_flags = tcg_const_i64(nzcv << 28);
-        gen_set_nzcv(tcg_flags);
-        tcg_temp_free_i64(tcg_flags);
+        gen_set_nzcv(tcg_constant_i64(nzcv << 28));
         tcg_gen_br(label_continue);
         gen_set_label(label_match);
     }
@@ -6115,7 +6049,7 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
 static void disas_fp_csel(DisasContext *s, uint32_t insn)
 {
     unsigned int mos, type, rm, cond, rn, rd;
-    TCGv_i64 t_true, t_false, t_zero;
+    TCGv_i64 t_true, t_false;
     DisasCompare64 c;
     MemOp sz;
 
@@ -6160,10 +6094,8 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     read_vec_element(s, t_false, rm, 0, sz);
 
     a64_test_cc(&c, cond);
-    t_zero = tcg_const_i64(0);
-    tcg_gen_movcond_i64(c.cond, t_true, c.value, t_zero, t_true, t_false);
-    tcg_temp_free_i64(t_zero);
-    tcg_temp_free_i64(t_false);
+    tcg_gen_movcond_i64(c.cond, t_true, c.value, tcg_constant_i64(0),
+                        t_true, t_false);
     a64_free_cc(&c);
 
     /* Note that sregs & hregs write back zeros to the high bits,
@@ -6944,7 +6876,6 @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
     int type = extract32(insn, 22, 2);
     int mos = extract32(insn, 29, 3);
     uint64_t imm;
-    TCGv_i64 tcg_res;
     MemOp sz;
 
     if (mos || imm5) {
@@ -6975,10 +6906,7 @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
     }
 
     imm = vfp_expand_imm(sz, imm8);
-
-    tcg_res = tcg_const_i64(imm);
-    write_fp_dreg(s, rd, tcg_res);
-    tcg_temp_free_i64(tcg_res);
+    write_fp_dreg(s, rd, tcg_constant_i64(imm));
 }
 
 /* Handle floating point <=> fixed point conversions. Note that we can
@@ -6996,7 +6924,7 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
 
     tcg_fpstatus = fpstatus_ptr(type == 3 ? FPST_FPCR_F16 : FPST_FPCR);
 
-    tcg_shift = tcg_const_i32(64 - scale);
+    tcg_shift = tcg_constant_i32(64 - scale);
 
     if (itof) {
         TCGv_i64 tcg_int = cpu_reg(s, rn);
@@ -7155,7 +7083,6 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
     }
 
     tcg_temp_free_ptr(tcg_fpstatus);
-    tcg_temp_free_i32(tcg_shift);
 }
 
 /* Floating point <-> fixed point conversions
@@ -8426,7 +8353,7 @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
     /* Deal with the rounding step */
     if (round) {
         if (extended_result) {
-            TCGv_i64 tcg_zero = tcg_const_i64(0);
+            TCGv_i64 tcg_zero = tcg_constant_i64(0);
             if (!is_u) {
                 /* take care of sign extending tcg_res */
                 tcg_gen_sari_i64(tcg_src_hi, tcg_src, 63);
@@ -8438,7 +8365,6 @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
                                  tcg_src, tcg_zero,
                                  tcg_rnd, tcg_zero);
             }
-            tcg_temp_free_i64(tcg_zero);
         } else {
             tcg_gen_add_i64(tcg_src, tcg_src, tcg_rnd);
         }
@@ -8524,8 +8450,7 @@ static void handle_scalar_simd_shri(DisasContext *s,
     }
 
     if (round) {
-        uint64_t round_const = 1ULL << (shift - 1);
-        tcg_round = tcg_const_i64(round_const);
+        tcg_round = tcg_constant_i64(1ULL << (shift - 1));
     } else {
         tcg_round = NULL;
     }
@@ -8551,9 +8476,6 @@ static void handle_scalar_simd_shri(DisasContext *s,
 
     tcg_temp_free_i64(tcg_rn);
     tcg_temp_free_i64(tcg_rd);
-    if (round) {
-        tcg_temp_free_i64(tcg_round);
-    }
 }
 
 /* SHL/SLI - Scalar shift left */
@@ -8651,8 +8573,7 @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
     tcg_final = tcg_const_i64(0);
 
     if (round) {
-        uint64_t round_const = 1ULL << (shift - 1);
-        tcg_round = tcg_const_i64(round_const);
+        tcg_round = tcg_constant_i64(1ULL << (shift - 1));
     } else {
         tcg_round = NULL;
     }
@@ -8672,9 +8593,6 @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
         write_vec_element(s, tcg_final, rd, 1, MO_64);
     }
 
-    if (round) {
-        tcg_temp_free_i64(tcg_round);
-    }
     tcg_temp_free_i64(tcg_rn);
     tcg_temp_free_i64(tcg_rd);
     tcg_temp_free_i32(tcg_rd_narrowed);
@@ -8726,7 +8644,7 @@ static void handle_simd_qshl(DisasContext *s, bool scalar, bool is_q,
     }
 
     if (size == 3) {
-        TCGv_i64 tcg_shift = tcg_const_i64(shift);
+        TCGv_i64 tcg_shift = tcg_constant_i64(shift);
         static NeonGenTwo64OpEnvFn * const fns[2][2] = {
             { gen_helper_neon_qshl_s64, gen_helper_neon_qshlu_s64 },
             { NULL, gen_helper_neon_qshl_u64 },
@@ -8743,10 +8661,9 @@ static void handle_simd_qshl(DisasContext *s, bool scalar, bool is_q,
 
             tcg_temp_free_i64(tcg_op);
         }
-        tcg_temp_free_i64(tcg_shift);
         clear_vec_high(s, is_q, rd);
     } else {
-        TCGv_i32 tcg_shift = tcg_const_i32(shift);
+        TCGv_i32 tcg_shift = tcg_constant_i32(shift);
         static NeonGenTwoOpEnvFn * const fns[2][2][3] = {
             {
                 { gen_helper_neon_qshl_s8,
@@ -8791,7 +8708,6 @@ static void handle_simd_qshl(DisasContext *s, bool scalar, bool is_q,
 
             tcg_temp_free_i32(tcg_op);
         }
-        tcg_temp_free_i32(tcg_shift);
 
         if (!scalar) {
             clear_vec_high(s, is_q, rd);
@@ -8811,7 +8727,7 @@ static void handle_simd_intfp_conv(DisasContext *s, int rd, int rn,
     int pass;
 
     if (fracbits || size == MO_64) {
-        tcg_shift = tcg_const_i32(fracbits);
+        tcg_shift = tcg_constant_i32(fracbits);
     }
 
     if (size == MO_64) {
@@ -8896,9 +8812,6 @@ static void handle_simd_intfp_conv(DisasContext *s, int rd, int rn,
     }
 
     tcg_temp_free_ptr(tcg_fpst);
-    if (tcg_shift) {
-        tcg_temp_free_i32(tcg_shift);
-    }
 
     clear_vec_high(s, elements << size == 16, rd);
 }
@@ -8988,7 +8901,7 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
     tcg_fpstatus = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
     gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
     fracbits = (16 << size) - immhb;
-    tcg_shift = tcg_const_i32(fracbits);
+    tcg_shift = tcg_constant_i32(fracbits);
 
     if (size == MO_64) {
         int maxpass = is_scalar ? 1 : 2;
@@ -9046,7 +8959,6 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
         }
     }
 
-    tcg_temp_free_i32(tcg_shift);
     gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
     tcg_temp_free_ptr(tcg_fpstatus);
     tcg_temp_free_i32(tcg_rmode);
@@ -9918,23 +9830,15 @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u,
     case 0x1c: /* FCVTAS */
     case 0x3a: /* FCVTPS */
     case 0x3b: /* FCVTZS */
-    {
-        TCGv_i32 tcg_shift = tcg_const_i32(0);
-        gen_helper_vfp_tosqd(tcg_rd, tcg_rn, tcg_shift, tcg_fpstatus);
-        tcg_temp_free_i32(tcg_shift);
+        gen_helper_vfp_tosqd(tcg_rd, tcg_rn, tcg_constant_i32(0), tcg_fpstatus);
         break;
-    }
     case 0x5a: /* FCVTNU */
     case 0x5b: /* FCVTMU */
     case 0x5c: /* FCVTAU */
     case 0x7a: /* FCVTPU */
     case 0x7b: /* FCVTZU */
-    {
-        TCGv_i32 tcg_shift = tcg_const_i32(0);
-        gen_helper_vfp_touqd(tcg_rd, tcg_rn, tcg_shift, tcg_fpstatus);
-        tcg_temp_free_i32(tcg_shift);
+        gen_helper_vfp_touqd(tcg_rd, tcg_rn, tcg_constant_i32(0), tcg_fpstatus);
         break;
-    }
     case 0x18: /* FRINTN */
     case 0x19: /* FRINTM */
     case 0x38: /* FRINTP */
@@ -9974,7 +9878,7 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
 
     if (is_double) {
         TCGv_i64 tcg_op = tcg_temp_new_i64();
-        TCGv_i64 tcg_zero = tcg_const_i64(0);
+        TCGv_i64 tcg_zero = tcg_constant_i64(0);
         TCGv_i64 tcg_res = tcg_temp_new_i64();
         NeonGenTwoDoubleOpFn *genfn;
         bool swap = false;
@@ -10010,13 +9914,12 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
             write_vec_element(s, tcg_res, rd, pass, MO_64);
         }
         tcg_temp_free_i64(tcg_res);
-        tcg_temp_free_i64(tcg_zero);
         tcg_temp_free_i64(tcg_op);
 
         clear_vec_high(s, !is_scalar, rd);
     } else {
         TCGv_i32 tcg_op = tcg_temp_new_i32();
-        TCGv_i32 tcg_zero = tcg_const_i32(0);
+        TCGv_i32 tcg_zero = tcg_constant_i32(0);
         TCGv_i32 tcg_res = tcg_temp_new_i32();
         NeonGenTwoSingleOpFn *genfn;
         bool swap = false;
@@ -10085,7 +9988,6 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
             }
         }
         tcg_temp_free_i32(tcg_res);
-        tcg_temp_free_i32(tcg_zero);
         tcg_temp_free_i32(tcg_op);
         if (!is_scalar) {
             clear_vec_high(s, is_q, rd);
@@ -10186,7 +10088,7 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
     int passes = scalar ? 1 : 2;
 
     if (scalar) {
-        tcg_res[1] = tcg_const_i32(0);
+        tcg_res[1] = tcg_constant_i32(0);
     }
 
     for (pass = 0; pass < passes; pass++) {
@@ -10364,9 +10266,7 @@ static void handle_2misc_satacc(DisasContext *s, bool is_scalar, bool is_u,
             }
 
             if (is_scalar) {
-                TCGv_i64 tcg_zero = tcg_const_i64(0);
-                write_vec_element(s, tcg_zero, rd, 0, MO_64);
-                tcg_temp_free_i64(tcg_zero);
+                write_vec_element(s, tcg_constant_i64(0), rd, 0, MO_64);
             }
             write_vec_element_i32(s, tcg_rd, rd, pass, MO_32);
         }
@@ -10549,23 +10449,17 @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
         case 0x1c: /* FCVTAS */
         case 0x3a: /* FCVTPS */
         case 0x3b: /* FCVTZS */
-        {
-            TCGv_i32 tcg_shift = tcg_const_i32(0);
-            gen_helper_vfp_tosls(tcg_rd, tcg_rn, tcg_shift, tcg_fpstatus);
-            tcg_temp_free_i32(tcg_shift);
+            gen_helper_vfp_tosls(tcg_rd, tcg_rn, tcg_constant_i32(0),
+                                 tcg_fpstatus);
             break;
-        }
         case 0x5a: /* FCVTNU */
         case 0x5b: /* FCVTMU */
         case 0x5c: /* FCVTAU */
         case 0x7a: /* FCVTPU */
         case 0x7b: /* FCVTZU */
-        {
-            TCGv_i32 tcg_shift = tcg_const_i32(0);
-            gen_helper_vfp_touls(tcg_rd, tcg_rn, tcg_shift, tcg_fpstatus);
-            tcg_temp_free_i32(tcg_shift);
+            gen_helper_vfp_touls(tcg_rd, tcg_rn, tcg_constant_i32(0),
+                                 tcg_fpstatus);
             break;
-        }
         default:
             g_assert_not_reached();
         }
@@ -10737,8 +10631,7 @@ static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
     read_vec_element(s, tcg_final, rd, is_q ? 1 : 0, MO_64);
 
     if (round) {
-        uint64_t round_const = 1ULL << (shift - 1);
-        tcg_round = tcg_const_i64(round_const);
+        tcg_round = tcg_constant_i64(1ULL << (shift - 1));
     } else {
         tcg_round = NULL;
     }
@@ -10756,9 +10649,6 @@ static void handle_vec_simd_shrn(DisasContext *s, bool is_q,
     } else {
         write_vec_element(s, tcg_final, rd, 1, MO_64);
     }
-    if (round) {
-        tcg_temp_free_i64(tcg_round);
-    }
     tcg_temp_free_i64(tcg_rn);
     tcg_temp_free_i64(tcg_rd);
     tcg_temp_free_i64(tcg_final);
@@ -12462,7 +12352,7 @@ static void handle_2misc_pairwise(DisasContext *s, int opcode, bool u,
         }
     }
     if (!is_q) {
-        tcg_res[1] = tcg_const_i64(0);
+        tcg_res[1] = tcg_constant_i64(0);
     }
     for (pass = 0; pass < 2; pass++) {
         write_vec_element(s, tcg_res[pass], rd, pass, MO_64);
@@ -12895,25 +12785,17 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                 case 0x1c: /* FCVTAS */
                 case 0x3a: /* FCVTPS */
                 case 0x3b: /* FCVTZS */
-                {
-                    TCGv_i32 tcg_shift = tcg_const_i32(0);
                     gen_helper_vfp_tosls(tcg_res, tcg_op,
-                                         tcg_shift, tcg_fpstatus);
-                    tcg_temp_free_i32(tcg_shift);
+                                         tcg_constant_i32(0), tcg_fpstatus);
                     break;
-                }
                 case 0x5a: /* FCVTNU */
                 case 0x5b: /* FCVTMU */
                 case 0x5c: /* FCVTAU */
                 case 0x7a: /* FCVTPU */
                 case 0x7b: /* FCVTZU */
-                {
-                    TCGv_i32 tcg_shift = tcg_const_i32(0);
                     gen_helper_vfp_touls(tcg_res, tcg_op,
-                                         tcg_shift, tcg_fpstatus);
-                    tcg_temp_free_i32(tcg_shift);
+                                         tcg_constant_i32(0), tcg_fpstatus);
                     break;
-                }
                 case 0x18: /* FRINTN */
                 case 0x19: /* FRINTM */
                 case 0x38: /* FRINTP */
@@ -14011,7 +13893,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         }
 
         if (is_scalar) {
-            tcg_res[1] = tcg_const_i64(0);
+            tcg_res[1] = tcg_constant_i64(0);
         }
 
         for (pass = 0; pass < 2; pass++) {
@@ -14415,7 +14297,7 @@ static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
         tcg_op2 = tcg_temp_new_i32();
         tcg_op3 = tcg_temp_new_i32();
         tcg_res = tcg_temp_new_i32();
-        tcg_zero = tcg_const_i32(0);
+        tcg_zero = tcg_constant_i32(0);
 
         read_vec_element_i32(s, tcg_op1, rn, 3, MO_32);
         read_vec_element_i32(s, tcg_op2, rm, 3, MO_32);
@@ -14435,7 +14317,6 @@ static void disas_crypto_four_reg(DisasContext *s, uint32_t insn)
         tcg_temp_free_i32(tcg_op2);
         tcg_temp_free_i32(tcg_op3);
         tcg_temp_free_i32(tcg_res);
-        tcg_temp_free_i32(tcg_zero);
     }
 }
 
@@ -14943,22 +14824,19 @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
             gen_helper_yield(cpu_env);
             break;
         case DISAS_WFI:
-        {
-            /* This is a special case because we don't want to just halt the CPU
-             * if trying to debug across a WFI.
+            /*
+             * This is a special case because we don't want to just halt
+             * the CPU if trying to debug across a WFI.
              */
-            TCGv_i32 tmp = tcg_const_i32(4);
-
             gen_a64_set_pc_im(dc->base.pc_next);
-            gen_helper_wfi(cpu_env, tmp);
-            tcg_temp_free_i32(tmp);
-            /* The helper doesn't necessarily throw an exception, but we
+            gen_helper_wfi(cpu_env, tcg_constant_i32(4));
+            /*
+             * The helper doesn't necessarily throw an exception, but we
              * must go back to the main loop to check for interrupts anyway.
              */
             tcg_gen_exit_tb(NULL, 0);
             break;
         }
-        }
     }
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 14/60] target/arm: Simplify GEN_SHIFT in translate.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (12 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 13/60] target/arm: Use tcg_constant in translate-a64.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 18:56   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 15/60] target/arm: Simplify gen_sar Richard Henderson
                   ` (46 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Instead of computing

    tmp1 = shift & 0xff;
    dest = (tmp1 > 0x1f ? 0 : value) << (tmp1 & 0x1f)

use

    tmpd = value << (shift & 0x1f);
    dest = shift & 0xe ? 0 : tmpd;

which has a flatter dependency tree.
Use tcg_constant_i32 while we're at it.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6b293f8279..57631c9fa1 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -552,16 +552,14 @@ static void gen_sbc_CC(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
 #define GEN_SHIFT(name)                                               \
 static void gen_##name(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)       \
 {                                                                     \
-    TCGv_i32 tmp1, tmp2, tmp3;                                        \
-    tmp1 = tcg_temp_new_i32();                                        \
-    tcg_gen_andi_i32(tmp1, t1, 0xff);                                 \
-    tmp2 = tcg_const_i32(0);                                          \
-    tmp3 = tcg_const_i32(0x1f);                                       \
-    tcg_gen_movcond_i32(TCG_COND_GTU, tmp2, tmp1, tmp3, tmp2, t0);    \
-    tcg_temp_free_i32(tmp3);                                          \
-    tcg_gen_andi_i32(tmp1, tmp1, 0x1f);                               \
-    tcg_gen_##name##_i32(dest, tmp2, tmp1);                           \
-    tcg_temp_free_i32(tmp2);                                          \
+    TCGv_i32 tmpd = tcg_temp_new_i32();                               \
+    TCGv_i32 tmp1 = tcg_temp_new_i32();                               \
+    TCGv_i32 zero = tcg_constant_i32(0);                              \
+    tcg_gen_andi_i32(tmp1, t1, 0x1f);                                 \
+    tcg_gen_##name##_i32(tmpd, t0, tmp1);                             \
+    tcg_gen_andi_i32(tmp1, t1, 0xe0);                                 \
+    tcg_gen_movcond_i32(TCG_COND_NE, dest, tmp1, zero, zero, tmpd);   \
+    tcg_temp_free_i32(tmpd);                                          \
     tcg_temp_free_i32(tmp1);                                          \
 }
 GEN_SHIFT(shl)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 15/60] target/arm: Simplify gen_sar
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (13 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 14/60] target/arm: Simplify GEN_SHIFT in translate.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 18:57   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 16/60] target/arm: Simplify aa32 DISAS_WFI Richard Henderson
                   ` (45 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_gen_umin_i32 instead of tcg_gen_movcond_i32.
Use tcg_constant_i32 while we're at it.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 57631c9fa1..8d6534f9a5 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -568,12 +568,10 @@ GEN_SHIFT(shr)
 
 static void gen_sar(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
 {
-    TCGv_i32 tmp1, tmp2;
-    tmp1 = tcg_temp_new_i32();
+    TCGv_i32 tmp1 = tcg_temp_new_i32();
+
     tcg_gen_andi_i32(tmp1, t1, 0xff);
-    tmp2 = tcg_const_i32(0x1f);
-    tcg_gen_movcond_i32(TCG_COND_GTU, tmp1, tmp1, tmp2, tmp2, tmp1);
-    tcg_temp_free_i32(tmp2);
+    tcg_gen_umin_i32(tmp1, tmp1, tcg_constant_i32(31));
     tcg_gen_sar_i32(dest, t0, tmp1);
     tcg_temp_free_i32(tmp1);
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 16/60] target/arm: Simplify aa32 DISAS_WFI
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (14 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 15/60] target/arm: Simplify gen_sar Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:00   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 17/60] target/arm: Use tcg_constant in translate.c Richard Henderson
                   ` (44 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

The length of the previous insn may be computed from
the difference of start and end addresses.
Use tcg_constant_i32 while we're at it.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8d6534f9a5..e1c1dbc563 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9870,18 +9870,14 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
             /* nothing more to generate */
             break;
         case DISAS_WFI:
-        {
-            TCGv_i32 tmp = tcg_const_i32((dc->thumb &&
-                                          !(dc->insn & (1U << 31))) ? 2 : 4);
-
-            gen_helper_wfi(cpu_env, tmp);
-            tcg_temp_free_i32(tmp);
-            /* The helper doesn't necessarily throw an exception, but we
+            gen_helper_wfi(cpu_env,
+                           tcg_constant_i32(dc->base.pc_next - dc->pc_curr));
+            /*
+             * The helper doesn't necessarily throw an exception, but we
              * must go back to the main loop to check for interrupts anyway.
              */
             tcg_gen_exit_tb(NULL, 0);
             break;
-        }
         case DISAS_WFE:
             gen_helper_wfe(cpu_env);
             break;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 17/60] target/arm: Use tcg_constant in translate.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (15 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 16/60] target/arm: Simplify aa32 DISAS_WFI Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:00   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 18/60] target/arm: Use tcg_constant in translate-m-nocp.c Richard Henderson
                   ` (43 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_constant_{i32,i64,ptr} as appropriate throughout, which
means we get to remove lots of tcg_temp_free_*.  Drop variables
in many cases, passing the constant directly to another function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.c | 250 ++++++++++++++---------------------------
 1 file changed, 84 insertions(+), 166 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index e1c1dbc563..5cb4b3da33 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -346,9 +346,7 @@ static void store_sp_checked(DisasContext *s, TCGv_i32 var)
 
 void gen_set_cpsr(TCGv_i32 var, uint32_t mask)
 {
-    TCGv_i32 tmp_mask = tcg_const_i32(mask);
-    gen_helper_cpsr_write(cpu_env, var, tmp_mask);
-    tcg_temp_free_i32(tmp_mask);
+    gen_helper_cpsr_write(cpu_env, var, tcg_constant_i32(mask));
 }
 
 static void gen_rebuild_hflags(DisasContext *s, bool new_el)
@@ -373,11 +371,8 @@ static void gen_rebuild_hflags(DisasContext *s, bool new_el)
 
 static void gen_exception_internal(int excp)
 {
-    TCGv_i32 tcg_excp = tcg_const_i32(excp);
-
     assert(excp_is_internal(excp));
-    gen_helper_exception_internal(cpu_env, tcg_excp);
-    tcg_temp_free_i32(tcg_excp);
+    gen_helper_exception_internal(cpu_env, tcg_constant_i32(excp));
 }
 
 static void gen_singlestep_exception(DisasContext *s)
@@ -1078,12 +1073,8 @@ static inline void gen_smc(DisasContext *s)
     /* As with HVC, we may take an exception either before or after
      * the insn executes.
      */
-    TCGv_i32 tmp;
-
     gen_set_pc_im(s, s->pc_curr);
-    tmp = tcg_const_i32(syn_aa32_smc());
-    gen_helper_pre_smc(cpu_env, tmp);
-    tcg_temp_free_i32(tmp);
+    gen_helper_pre_smc(cpu_env, tcg_constant_i32(syn_aa32_smc()));
     gen_set_pc_im(s, s->base.pc_next);
     s->base.is_jmp = DISAS_SMC;
 }
@@ -1111,13 +1102,9 @@ void gen_exception_insn(DisasContext *s, uint64_t pc, int excp,
 
 static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syn)
 {
-    TCGv_i32 tcg_syn;
-
     gen_set_condexec(s);
     gen_set_pc_im(s, s->pc_curr);
-    tcg_syn = tcg_const_i32(syn);
-    gen_helper_exception_bkpt_insn(cpu_env, tcg_syn);
-    tcg_temp_free_i32(tcg_syn);
+    gen_helper_exception_bkpt_insn(cpu_env, tcg_constant_i32(syn));
     s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -1131,16 +1118,11 @@ void unallocated_encoding(DisasContext *s)
 static void gen_exception_el(DisasContext *s, int excp, uint32_t syn,
                              TCGv_i32 tcg_el)
 {
-    TCGv_i32 tcg_excp;
-    TCGv_i32 tcg_syn;
-
     gen_set_condexec(s);
     gen_set_pc_im(s, s->pc_curr);
-    tcg_excp = tcg_const_i32(excp);
-    tcg_syn = tcg_const_i32(syn);
-    gen_helper_exception_with_syndrome(cpu_env, tcg_excp, tcg_syn, tcg_el);
-    tcg_temp_free_i32(tcg_syn);
-    tcg_temp_free_i32(tcg_excp);
+    gen_helper_exception_with_syndrome(cpu_env,
+                                       tcg_constant_i32(excp),
+                                       tcg_constant_i32(syn), tcg_el);
     s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -1863,24 +1845,21 @@ static int disas_iwmmxt_insn(DisasContext *s, uint32_t insn)
         gen_op_iwmmxt_movq_M0_wRn(wrd);
         switch ((insn >> 6) & 3) {
         case 0:
-            tmp2 = tcg_const_i32(0xff);
-            tmp3 = tcg_const_i32((insn & 7) << 3);
+            tmp2 = tcg_constant_i32(0xff);
+            tmp3 = tcg_constant_i32((insn & 7) << 3);
             break;
         case 1:
-            tmp2 = tcg_const_i32(0xffff);
-            tmp3 = tcg_const_i32((insn & 3) << 4);
+            tmp2 = tcg_constant_i32(0xffff);
+            tmp3 = tcg_constant_i32((insn & 3) << 4);
             break;
         case 2:
-            tmp2 = tcg_const_i32(0xffffffff);
-            tmp3 = tcg_const_i32((insn & 1) << 5);
+            tmp2 = tcg_constant_i32(0xffffffff);
+            tmp3 = tcg_constant_i32((insn & 1) << 5);
             break;
         default:
-            tmp2 = NULL;
-            tmp3 = NULL;
+            g_assert_not_reached();
         }
         gen_helper_iwmmxt_insr(cpu_M0, cpu_M0, tmp, tmp2, tmp3);
-        tcg_temp_free_i32(tmp3);
-        tcg_temp_free_i32(tmp2);
         tcg_temp_free_i32(tmp);
         gen_op_iwmmxt_movq_wRn_M0(wrd);
         gen_op_iwmmxt_set_mup();
@@ -2336,10 +2315,9 @@ static int disas_iwmmxt_insn(DisasContext *s, uint32_t insn)
         rd0 = (insn >> 16) & 0xf;
         rd1 = (insn >> 0) & 0xf;
         gen_op_iwmmxt_movq_M0_wRn(rd0);
-        tmp = tcg_const_i32((insn >> 20) & 3);
         iwmmxt_load_reg(cpu_V1, rd1);
-        gen_helper_iwmmxt_align(cpu_M0, cpu_M0, cpu_V1, tmp);
-        tcg_temp_free_i32(tmp);
+        gen_helper_iwmmxt_align(cpu_M0, cpu_M0, cpu_V1,
+                                tcg_constant_i32((insn >> 20) & 3));
         gen_op_iwmmxt_movq_wRn_M0(wrd);
         gen_op_iwmmxt_set_mup();
         break;
@@ -2393,9 +2371,8 @@ static int disas_iwmmxt_insn(DisasContext *s, uint32_t insn)
         wrd = (insn >> 12) & 0xf;
         rd0 = (insn >> 16) & 0xf;
         gen_op_iwmmxt_movq_M0_wRn(rd0);
-        tmp = tcg_const_i32(((insn >> 16) & 0xf0) | (insn & 0x0f));
+        tmp = tcg_constant_i32(((insn >> 16) & 0xf0) | (insn & 0x0f));
         gen_helper_iwmmxt_shufh(cpu_M0, cpu_env, cpu_M0, tmp);
-        tcg_temp_free_i32(tmp);
         gen_op_iwmmxt_movq_wRn_M0(wrd);
         gen_op_iwmmxt_set_mup();
         gen_op_iwmmxt_set_cup();
@@ -2868,7 +2845,7 @@ static bool msr_banked_access_decode(DisasContext *s, int r, int sysm, int rn,
                 tcg_gen_sextract_i32(tcg_el, tcg_el, ctz32(SCR_EEL2), 1);
                 tcg_gen_addi_i32(tcg_el, tcg_el, 3);
             } else {
-                tcg_el = tcg_const_i32(3);
+                tcg_el = tcg_constant_i32(3);
             }
 
             gen_exception_el(s, EXCP_UDEF, syn_uncategorized(), tcg_el);
@@ -2903,7 +2880,7 @@ undef:
 
 static void gen_msr_banked(DisasContext *s, int r, int sysm, int rn)
 {
-    TCGv_i32 tcg_reg, tcg_tgtmode, tcg_regno;
+    TCGv_i32 tcg_reg;
     int tgtmode = 0, regno = 0;
 
     if (!msr_banked_access_decode(s, r, sysm, rn, &tgtmode, &regno)) {
@@ -2914,18 +2891,16 @@ static void gen_msr_banked(DisasContext *s, int r, int sysm, int rn)
     gen_set_condexec(s);
     gen_set_pc_im(s, s->pc_curr);
     tcg_reg = load_reg(s, rn);
-    tcg_tgtmode = tcg_const_i32(tgtmode);
-    tcg_regno = tcg_const_i32(regno);
-    gen_helper_msr_banked(cpu_env, tcg_reg, tcg_tgtmode, tcg_regno);
-    tcg_temp_free_i32(tcg_tgtmode);
-    tcg_temp_free_i32(tcg_regno);
+    gen_helper_msr_banked(cpu_env, tcg_reg,
+                          tcg_constant_i32(tgtmode),
+                          tcg_constant_i32(regno));
     tcg_temp_free_i32(tcg_reg);
     s->base.is_jmp = DISAS_UPDATE_EXIT;
 }
 
 static void gen_mrs_banked(DisasContext *s, int r, int sysm, int rn)
 {
-    TCGv_i32 tcg_reg, tcg_tgtmode, tcg_regno;
+    TCGv_i32 tcg_reg;
     int tgtmode = 0, regno = 0;
 
     if (!msr_banked_access_decode(s, r, sysm, rn, &tgtmode, &regno)) {
@@ -2936,11 +2911,9 @@ static void gen_mrs_banked(DisasContext *s, int r, int sysm, int rn)
     gen_set_condexec(s);
     gen_set_pc_im(s, s->pc_curr);
     tcg_reg = tcg_temp_new_i32();
-    tcg_tgtmode = tcg_const_i32(tgtmode);
-    tcg_regno = tcg_const_i32(regno);
-    gen_helper_mrs_banked(tcg_reg, cpu_env, tcg_tgtmode, tcg_regno);
-    tcg_temp_free_i32(tcg_tgtmode);
-    tcg_temp_free_i32(tcg_regno);
+    gen_helper_mrs_banked(tcg_reg, cpu_env,
+                          tcg_constant_i32(tgtmode),
+                          tcg_constant_i32(regno));
     store_reg(s, rn, tcg_reg);
     s->base.is_jmp = DISAS_UPDATE_EXIT;
 }
@@ -3023,9 +2996,8 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     }                                                                   \
     static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
     {                                                                   \
-        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
+        TCGv_vec zero = tcg_constant_vec_matching(d, vece, 0);          \
         tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
-        tcg_temp_free_vec(zero);                                        \
     }                                                                   \
     void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
                             uint32_t opr_sz, uint32_t max_sz)           \
@@ -4015,8 +3987,8 @@ void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
     TCGv_i32 rval = tcg_temp_new_i32();
     TCGv_i32 lsh = tcg_temp_new_i32();
     TCGv_i32 rsh = tcg_temp_new_i32();
-    TCGv_i32 zero = tcg_const_i32(0);
-    TCGv_i32 max = tcg_const_i32(32);
+    TCGv_i32 zero = tcg_constant_i32(0);
+    TCGv_i32 max = tcg_constant_i32(32);
 
     /*
      * Rely on the TCG guarantee that out of range shifts produce
@@ -4034,8 +4006,6 @@ void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
     tcg_temp_free_i32(rval);
     tcg_temp_free_i32(lsh);
     tcg_temp_free_i32(rsh);
-    tcg_temp_free_i32(zero);
-    tcg_temp_free_i32(max);
 }
 
 void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
@@ -4044,8 +4014,8 @@ void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
     TCGv_i64 rval = tcg_temp_new_i64();
     TCGv_i64 lsh = tcg_temp_new_i64();
     TCGv_i64 rsh = tcg_temp_new_i64();
-    TCGv_i64 zero = tcg_const_i64(0);
-    TCGv_i64 max = tcg_const_i64(64);
+    TCGv_i64 zero = tcg_constant_i64(0);
+    TCGv_i64 max = tcg_constant_i64(64);
 
     /*
      * Rely on the TCG guarantee that out of range shifts produce
@@ -4063,8 +4033,6 @@ void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
     tcg_temp_free_i64(rval);
     tcg_temp_free_i64(lsh);
     tcg_temp_free_i64(rsh);
-    tcg_temp_free_i64(zero);
-    tcg_temp_free_i64(max);
 }
 
 static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
@@ -4159,8 +4127,8 @@ void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
     TCGv_i32 rval = tcg_temp_new_i32();
     TCGv_i32 lsh = tcg_temp_new_i32();
     TCGv_i32 rsh = tcg_temp_new_i32();
-    TCGv_i32 zero = tcg_const_i32(0);
-    TCGv_i32 max = tcg_const_i32(31);
+    TCGv_i32 zero = tcg_constant_i32(0);
+    TCGv_i32 max = tcg_constant_i32(31);
 
     /*
      * Rely on the TCG guarantee that out of range shifts produce
@@ -4179,8 +4147,6 @@ void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
     tcg_temp_free_i32(rval);
     tcg_temp_free_i32(lsh);
     tcg_temp_free_i32(rsh);
-    tcg_temp_free_i32(zero);
-    tcg_temp_free_i32(max);
 }
 
 void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
@@ -4189,8 +4155,8 @@ void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
     TCGv_i64 rval = tcg_temp_new_i64();
     TCGv_i64 lsh = tcg_temp_new_i64();
     TCGv_i64 rsh = tcg_temp_new_i64();
-    TCGv_i64 zero = tcg_const_i64(0);
-    TCGv_i64 max = tcg_const_i64(63);
+    TCGv_i64 zero = tcg_constant_i64(0);
+    TCGv_i64 max = tcg_constant_i64(63);
 
     /*
      * Rely on the TCG guarantee that out of range shifts produce
@@ -4209,8 +4175,6 @@ void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift)
     tcg_temp_free_i64(rval);
     tcg_temp_free_i64(lsh);
     tcg_temp_free_i64(rsh);
-    tcg_temp_free_i64(zero);
-    tcg_temp_free_i64(max);
 }
 
 static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
@@ -4725,8 +4689,6 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
              * Note that on XScale all cp0..c13 registers do an access check
              * call in order to handle c15_cpar.
              */
-            TCGv_ptr tmpptr;
-            TCGv_i32 tcg_syn, tcg_isread;
             uint32_t syndrome;
 
             /* Note that since we are an implementation which takes an
@@ -4769,14 +4731,10 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
 
             gen_set_condexec(s);
             gen_set_pc_im(s, s->pc_curr);
-            tmpptr = tcg_const_ptr(ri);
-            tcg_syn = tcg_const_i32(syndrome);
-            tcg_isread = tcg_const_i32(isread);
-            gen_helper_access_check_cp_reg(cpu_env, tmpptr, tcg_syn,
-                                           tcg_isread);
-            tcg_temp_free_ptr(tmpptr);
-            tcg_temp_free_i32(tcg_syn);
-            tcg_temp_free_i32(tcg_isread);
+            gen_helper_access_check_cp_reg(cpu_env,
+                                           tcg_constant_ptr(ri),
+                                           tcg_constant_i32(syndrome),
+                                           tcg_constant_i32(isread));
         } else if (ri->type & ARM_CP_RAISES_EXC) {
             /*
              * The readfn or writefn might raise an exception;
@@ -4812,13 +4770,11 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
                 TCGv_i64 tmp64;
                 TCGv_i32 tmp;
                 if (ri->type & ARM_CP_CONST) {
-                    tmp64 = tcg_const_i64(ri->resetvalue);
+                    tmp64 = tcg_constant_i64(ri->resetvalue);
                 } else if (ri->readfn) {
-                    TCGv_ptr tmpptr;
                     tmp64 = tcg_temp_new_i64();
-                    tmpptr = tcg_const_ptr(ri);
-                    gen_helper_get_cp_reg64(tmp64, cpu_env, tmpptr);
-                    tcg_temp_free_ptr(tmpptr);
+                    gen_helper_get_cp_reg64(tmp64, cpu_env,
+                                            tcg_constant_ptr(ri));
                 } else {
                     tmp64 = tcg_temp_new_i64();
                     tcg_gen_ld_i64(tmp64, cpu_env, ri->fieldoffset);
@@ -4833,13 +4789,10 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
             } else {
                 TCGv_i32 tmp;
                 if (ri->type & ARM_CP_CONST) {
-                    tmp = tcg_const_i32(ri->resetvalue);
+                    tmp = tcg_constant_i32(ri->resetvalue);
                 } else if (ri->readfn) {
-                    TCGv_ptr tmpptr;
                     tmp = tcg_temp_new_i32();
-                    tmpptr = tcg_const_ptr(ri);
-                    gen_helper_get_cp_reg(tmp, cpu_env, tmpptr);
-                    tcg_temp_free_ptr(tmpptr);
+                    gen_helper_get_cp_reg(tmp, cpu_env, tcg_constant_ptr(ri));
                 } else {
                     tmp = load_cpu_offset(ri->fieldoffset);
                 }
@@ -4869,24 +4822,18 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
                 tcg_temp_free_i32(tmplo);
                 tcg_temp_free_i32(tmphi);
                 if (ri->writefn) {
-                    TCGv_ptr tmpptr = tcg_const_ptr(ri);
-                    gen_helper_set_cp_reg64(cpu_env, tmpptr, tmp64);
-                    tcg_temp_free_ptr(tmpptr);
+                    gen_helper_set_cp_reg64(cpu_env, tcg_constant_ptr(ri),
+                                            tmp64);
                 } else {
                     tcg_gen_st_i64(tmp64, cpu_env, ri->fieldoffset);
                 }
                 tcg_temp_free_i64(tmp64);
             } else {
+                TCGv_i32 tmp = load_reg(s, rt);
                 if (ri->writefn) {
-                    TCGv_i32 tmp;
-                    TCGv_ptr tmpptr;
-                    tmp = load_reg(s, rt);
-                    tmpptr = tcg_const_ptr(ri);
-                    gen_helper_set_cp_reg(cpu_env, tmpptr, tmp);
-                    tcg_temp_free_ptr(tmpptr);
+                    gen_helper_set_cp_reg(cpu_env, tcg_constant_ptr(ri), tmp);
                     tcg_temp_free_i32(tmp);
                 } else {
-                    TCGv_i32 tmp = load_reg(s, rt);
                     store_cpu_offset(tmp, ri->fieldoffset, 4);
                 }
             }
@@ -5190,12 +5137,10 @@ static void gen_srs(DisasContext *s,
     }
 
     addr = tcg_temp_new_i32();
-    tmp = tcg_const_i32(mode);
     /* get_r13_banked() will raise an exception if called from System mode */
     gen_set_condexec(s);
     gen_set_pc_im(s, s->pc_curr);
-    gen_helper_get_r13_banked(addr, cpu_env, tmp);
-    tcg_temp_free_i32(tmp);
+    gen_helper_get_r13_banked(addr, cpu_env, tcg_constant_i32(mode));
     switch (amode) {
     case 0: /* DA */
         offset = -4;
@@ -5238,9 +5183,7 @@ static void gen_srs(DisasContext *s,
             abort();
         }
         tcg_gen_addi_i32(addr, addr, offset);
-        tmp = tcg_const_i32(mode);
-        gen_helper_set_r13_banked(cpu_env, tmp, addr);
-        tcg_temp_free_i32(tmp);
+        gen_helper_set_r13_banked(cpu_env, tcg_constant_i32(mode), addr);
     }
     tcg_temp_free_i32(addr);
     s->base.is_jmp = DISAS_UPDATE_EXIT;
@@ -5552,23 +5495,21 @@ static bool op_s_rri_rot(DisasContext *s, arg_s_rri_rot *a,
                          void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32),
                          int logic_cc, StoreRegKind kind)
 {
-    TCGv_i32 tmp1, tmp2;
+    TCGv_i32 tmp;
     uint32_t imm;
 
     imm = ror32(a->imm, a->rot);
     if (logic_cc && a->rot) {
         tcg_gen_movi_i32(cpu_CF, imm >> 31);
     }
-    tmp2 = tcg_const_i32(imm);
-    tmp1 = load_reg(s, a->rn);
+    tmp = load_reg(s, a->rn);
 
-    gen(tmp1, tmp1, tmp2);
-    tcg_temp_free_i32(tmp2);
+    gen(tmp, tmp, tcg_constant_i32(imm));
 
     if (logic_cc) {
-        gen_logic_CC(tmp1);
+        gen_logic_CC(tmp);
     }
-    return store_reg_kind(s, a->rd, tmp1, kind);
+    return store_reg_kind(s, a->rd, tmp, kind);
 }
 
 static bool op_s_rxi_rot(DisasContext *s, arg_s_rri_rot *a,
@@ -5582,9 +5523,10 @@ static bool op_s_rxi_rot(DisasContext *s, arg_s_rri_rot *a,
     if (logic_cc && a->rot) {
         tcg_gen_movi_i32(cpu_CF, imm >> 31);
     }
-    tmp = tcg_const_i32(imm);
 
-    gen(tmp, tmp);
+    tmp = tcg_temp_new_i32();
+    gen(tmp, tcg_constant_i32(imm));
+
     if (logic_cc) {
         gen_logic_CC(tmp);
     }
@@ -5710,14 +5652,11 @@ static bool trans_ADR(DisasContext *s, arg_ri *a)
 
 static bool trans_MOVW(DisasContext *s, arg_MOVW *a)
 {
-    TCGv_i32 tmp;
-
     if (!ENABLE_ARCH_6T2) {
         return false;
     }
 
-    tmp = tcg_const_i32(a->imm);
-    store_reg(s, a->rd, tmp);
+    store_reg(s, a->rd, tcg_constant_i32(a->imm));
     return true;
 }
 
@@ -6088,14 +6027,13 @@ static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
     t0 = load_reg(s, a->rm);
     t1 = load_reg(s, a->rn);
     tcg_gen_mulu2_i32(t0, t1, t0, t1);
-    zero = tcg_const_i32(0);
+    zero = tcg_constant_i32(0);
     t2 = load_reg(s, a->ra);
     tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
     tcg_temp_free_i32(t2);
     t2 = load_reg(s, a->rd);
     tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
     tcg_temp_free_i32(t2);
-    tcg_temp_free_i32(zero);
     store_reg(s, a->ra, t0);
     store_reg(s, a->rd, t1);
     return true;
@@ -6342,14 +6280,13 @@ static bool op_crc32(DisasContext *s, arg_rrr *a, bool c, MemOp sz)
     default:
         g_assert_not_reached();
     }
-    t3 = tcg_const_i32(1 << sz);
+    t3 = tcg_constant_i32(1 << sz);
     if (c) {
         gen_helper_crc32c(t1, t1, t2, t3);
     } else {
         gen_helper_crc32(t1, t1, t2, t3);
     }
     tcg_temp_free_i32(t2);
-    tcg_temp_free_i32(t3);
     store_reg(s, a->rd, t1);
     return true;
 }
@@ -6432,8 +6369,8 @@ static bool trans_MRS_v7m(DisasContext *s, arg_MRS_v7m *a)
     if (!arm_dc_feature(s, ARM_FEATURE_M)) {
         return false;
     }
-    tmp = tcg_const_i32(a->sysm);
-    gen_helper_v7m_mrs(tmp, cpu_env, tmp);
+    tmp = tcg_temp_new_i32();
+    gen_helper_v7m_mrs(tmp, cpu_env, tcg_constant_i32(a->sysm));
     store_reg(s, a->rd, tmp);
     return true;
 }
@@ -6445,10 +6382,9 @@ static bool trans_MSR_v7m(DisasContext *s, arg_MSR_v7m *a)
     if (!arm_dc_feature(s, ARM_FEATURE_M)) {
         return false;
     }
-    addr = tcg_const_i32((a->mask << 10) | a->sysm);
+    addr = tcg_constant_i32((a->mask << 10) | a->sysm);
     reg = load_reg(s, a->rn);
     gen_helper_v7m_msr(cpu_env, addr, reg);
-    tcg_temp_free_i32(addr);
     tcg_temp_free_i32(reg);
     /* If we wrote to CONTROL, the EL might have changed */
     gen_rebuild_hflags(s, true);
@@ -6660,8 +6596,8 @@ static bool trans_TT(DisasContext *s, arg_TT *a)
     }
 
     addr = load_reg(s, a->rn);
-    tmp = tcg_const_i32((a->A << 1) | a->T);
-    gen_helper_v7m_tt(tmp, cpu_env, addr, tmp);
+    tmp = tcg_temp_new_i32();
+    gen_helper_v7m_tt(tmp, cpu_env, addr, tcg_constant_i32((a->A << 1) | a->T));
     tcg_temp_free_i32(addr);
     store_reg(s, a->rd, tmp);
     return true;
@@ -7628,7 +7564,7 @@ static bool trans_PKH(DisasContext *s, arg_PKH *a)
 static bool op_sat(DisasContext *s, arg_sat *a,
                    void (*gen)(TCGv_i32, TCGv_env, TCGv_i32, TCGv_i32))
 {
-    TCGv_i32 tmp, satimm;
+    TCGv_i32 tmp;
     int shift = a->imm;
 
     if (!ENABLE_ARCH_6) {
@@ -7642,9 +7578,7 @@ static bool op_sat(DisasContext *s, arg_sat *a,
         tcg_gen_shli_i32(tmp, tmp, shift);
     }
 
-    satimm = tcg_const_i32(a->satimm);
-    gen(tmp, cpu_env, tmp, satimm);
-    tcg_temp_free_i32(satimm);
+    gen(tmp, cpu_env, tmp, tcg_constant_i32(a->satimm));
 
     store_reg(s, a->rd, tmp);
     return true;
@@ -7979,9 +7913,7 @@ static bool op_smmla(DisasContext *s, arg_rrrr *a, bool round, bool sub)
              * a non-zero multiplicand lowpart, and the correct result
              * lowpart for rounding.
              */
-            TCGv_i32 zero = tcg_const_i32(0);
-            tcg_gen_sub2_i32(t2, t1, zero, t3, t2, t1);
-            tcg_temp_free_i32(zero);
+            tcg_gen_sub2_i32(t2, t1, tcg_constant_i32(0), t3, t2, t1);
         } else {
             tcg_gen_add_i32(t1, t1, t3);
         }
@@ -8118,7 +8050,7 @@ static bool op_stm(DisasContext *s, arg_ldst_block *a, int min_n)
 {
     int i, j, n, list, mem_idx;
     bool user = a->u;
-    TCGv_i32 addr, tmp, tmp2;
+    TCGv_i32 addr, tmp;
 
     if (user) {
         /* STM (user) */
@@ -8148,9 +8080,7 @@ static bool op_stm(DisasContext *s, arg_ldst_block *a, int min_n)
 
         if (user && i != 15) {
             tmp = tcg_temp_new_i32();
-            tmp2 = tcg_const_i32(i);
-            gen_helper_get_user_reg(tmp, cpu_env, tmp2);
-            tcg_temp_free_i32(tmp2);
+            gen_helper_get_user_reg(tmp, cpu_env, tcg_constant_i32(i));
         } else {
             tmp = load_reg(s, i);
         }
@@ -8191,7 +8121,7 @@ static bool do_ldm(DisasContext *s, arg_ldst_block *a, int min_n)
     bool loaded_base;
     bool user = a->u;
     bool exc_return = false;
-    TCGv_i32 addr, tmp, tmp2, loaded_var;
+    TCGv_i32 addr, tmp, loaded_var;
 
     if (user) {
         /* LDM (user), LDM (exception return) */
@@ -8234,9 +8164,7 @@ static bool do_ldm(DisasContext *s, arg_ldst_block *a, int min_n)
         tmp = tcg_temp_new_i32();
         gen_aa32_ld_i32(s, tmp, addr, mem_idx, MO_UL | MO_ALIGN);
         if (user) {
-            tmp2 = tcg_const_i32(i);
-            gen_helper_set_user_reg(cpu_env, tmp2, tmp);
-            tcg_temp_free_i32(tmp2);
+            gen_helper_set_user_reg(cpu_env, tcg_constant_i32(i), tmp);
             tcg_temp_free_i32(tmp);
         } else if (i == a->rn) {
             loaded_var = tmp;
@@ -8329,7 +8257,7 @@ static bool trans_CLRM(DisasContext *s, arg_CLRM *a)
 
     s->eci_handled = true;
 
-    zero = tcg_const_i32(0);
+    zero = tcg_constant_i32(0);
     for (i = 0; i < 15; i++) {
         if (extract32(a->list, i, 1)) {
             /* Clear R[i] */
@@ -8341,11 +8269,8 @@ static bool trans_CLRM(DisasContext *s, arg_CLRM *a)
          * Clear APSR (by calling the MSR helper with the same argument
          * as for "MSR APSR_nzcvqg, Rn": mask = 0b1100, SYSM=0)
          */
-        TCGv_i32 maskreg = tcg_const_i32(0xc << 8);
-        gen_helper_v7m_msr(cpu_env, maskreg, zero);
-        tcg_temp_free_i32(maskreg);
+        gen_helper_v7m_msr(cpu_env, tcg_constant_i32(0xc00), zero);
     }
-    tcg_temp_free_i32(zero);
     clear_eci_state(s);
     return true;
 }
@@ -8488,8 +8413,7 @@ static bool trans_DLS(DisasContext *s, arg_DLS *a)
     store_reg(s, 14, tmp);
     if (a->size != 4) {
         /* DLSTP: set FPSCR.LTPSIZE */
-        tmp = tcg_const_i32(a->size);
-        store_cpu_field(tmp, v7m.ltpsize);
+        store_cpu_field(tcg_constant_i32(a->size), v7m.ltpsize);
         s->base.is_jmp = DISAS_UPDATE_NOCHAIN;
     }
     return true;
@@ -8554,8 +8478,7 @@ static bool trans_WLS(DisasContext *s, arg_WLS *a)
          */
         bool ok = vfp_access_check(s);
         assert(ok);
-        tmp = tcg_const_i32(a->size);
-        store_cpu_field(tmp, v7m.ltpsize);
+        store_cpu_field(tcg_constant_i32(a->size), v7m.ltpsize);
         /*
          * LTPSIZE updated, but MVE_NO_PRED will always be the same thing (0)
          * when we take this upcoming exit from this TB, so gen_jmp_tb() is OK.
@@ -8681,8 +8604,7 @@ static bool trans_LE(DisasContext *s, arg_LE *a)
     gen_set_label(loopend);
     if (a->tp) {
         /* Exits from tail-pred loops must reset LTPSIZE to 4 */
-        tmp = tcg_const_i32(4);
-        store_cpu_field(tmp, v7m.ltpsize);
+        store_cpu_field(tcg_constant_i32(4), v7m.ltpsize);
     }
     /* End TB, continuing to following insn */
     gen_jmp_tb(s, s->base.pc_next, 1);
@@ -8913,21 +8835,18 @@ static bool trans_CPS_v7m(DisasContext *s, arg_CPS_v7m *a)
         return true;
     }
 
-    tmp = tcg_const_i32(a->im);
+    tmp = tcg_constant_i32(a->im);
     /* FAULTMASK */
     if (a->F) {
-        addr = tcg_const_i32(19);
+        addr = tcg_constant_i32(19);
         gen_helper_v7m_msr(cpu_env, addr, tmp);
-        tcg_temp_free_i32(addr);
     }
     /* PRIMASK */
     if (a->I) {
-        addr = tcg_const_i32(16);
+        addr = tcg_constant_i32(16);
         gen_helper_v7m_msr(cpu_env, addr, tmp);
-        tcg_temp_free_i32(addr);
     }
     gen_rebuild_hflags(s, false);
-    tcg_temp_free_i32(tmp);
     gen_lookup_tb(s);
     return true;
 }
@@ -9063,13 +8982,14 @@ static bool trans_CSEL(DisasContext *s, arg_CSEL *a)
     }
 
     /* In this insn input reg fields of 0b1111 mean "zero", not "PC" */
+    zero = tcg_constant_i32(0);
     if (a->rn == 15) {
-        rn = tcg_const_i32(0);
+        rn = zero;
     } else {
         rn = load_reg(s, a->rn);
     }
     if (a->rm == 15) {
-        rm = tcg_const_i32(0);
+        rm = zero;
     } else {
         rm = load_reg(s, a->rm);
     }
@@ -9091,10 +9011,8 @@ static bool trans_CSEL(DisasContext *s, arg_CSEL *a)
     }
 
     arm_test_cc(&c, a->fcond);
-    zero = tcg_const_i32(0);
     tcg_gen_movcond_i32(c.cond, rn, c.value, zero, rn, rm);
     arm_free_cc(&c);
-    tcg_temp_free_i32(zero);
 
     store_reg(s, a->rd, rn);
     tcg_temp_free_i32(rm);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 18/60] target/arm: Use tcg_constant in translate-m-nocp.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (16 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 17/60] target/arm: Use tcg_constant in translate.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:03   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 19/60] target/arm: Use tcg_constant in translate-neon.c Richard Henderson
                   ` (42 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_constant_{i32,i64} as appropriate throughout.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-m-nocp.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-m-nocp.c b/target/arm/translate-m-nocp.c
index d9e144e8eb..27363a7b4e 100644
--- a/target/arm/translate-m-nocp.c
+++ b/target/arm/translate-m-nocp.c
@@ -173,7 +173,7 @@ static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
     }
 
     /* Zero the Sregs from btmreg to topreg inclusive. */
-    zero = tcg_const_i64(0);
+    zero = tcg_constant_i64(0);
     if (btmreg & 1) {
         write_neon_element64(zero, btmreg >> 1, 1, MO_32);
         btmreg++;
@@ -187,8 +187,7 @@ static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
     }
     assert(btmreg == topreg + 1);
     if (dc_isar_feature(aa32_mve, s)) {
-        TCGv_i32 z32 = tcg_const_i32(0);
-        store_cpu_field(z32, v7m.vpr);
+        store_cpu_field(tcg_constant_i32(0), v7m.vpr);
     }
 
     clear_eci_state(s);
@@ -512,7 +511,7 @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
     }
     case ARM_VFP_FPCXT_NS:
     {
-        TCGv_i32 control, sfpa, fpscr, fpdscr, zero;
+        TCGv_i32 control, sfpa, fpscr, fpdscr;
         TCGLabel *lab_active = gen_new_label();
 
         lookup_tb = true;
@@ -552,10 +551,9 @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
         storefn(s, opaque, tmp, true);
         /* If SFPA is zero then set FPSCR from FPDSCR_NS */
         fpdscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
-        zero = tcg_const_i32(0);
-        tcg_gen_movcond_i32(TCG_COND_EQ, fpscr, sfpa, zero, fpdscr, fpscr);
+        tcg_gen_movcond_i32(TCG_COND_EQ, fpscr, sfpa, tcg_constant_i32(0),
+                            fpdscr, fpscr);
         gen_helper_vfp_set_fpscr(cpu_env, fpscr);
-        tcg_temp_free_i32(zero);
         tcg_temp_free_i32(sfpa);
         tcg_temp_free_i32(fpdscr);
         tcg_temp_free_i32(fpscr);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 19/60] target/arm: Use tcg_constant in translate-neon.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (17 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 18/60] target/arm: Use tcg_constant in translate-m-nocp.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:06   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 20/60] target/arm: Use smin/smax for do_sat_addsub_32 Richard Henderson
                   ` (41 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_constant_{i32,i64} as appropriate throughout.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-neon.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index 384604c009..2e4d1ec87d 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -447,7 +447,7 @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
     int mmu_idx = get_mem_index(s);
     int size = a->size;
     TCGv_i64 tmp64;
-    TCGv_i32 addr, tmp;
+    TCGv_i32 addr;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
@@ -513,7 +513,6 @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
 
     tmp64 = tcg_temp_new_i64();
     addr = tcg_temp_new_i32();
-    tmp = tcg_const_i32(1 << size);
     load_reg_var(s, addr, a->rn);
 
     mop = endian | size | align;
@@ -530,7 +529,7 @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
                     neon_load_element64(tmp64, tt, n, size);
                     gen_aa32_st_internal_i64(s, tmp64, addr, mmu_idx, mop);
                 }
-                tcg_gen_add_i32(addr, addr, tmp);
+                tcg_gen_addi_i32(addr, addr, 1 << size);
 
                 /* Subsequent memory operations inherit alignment */
                 mop &= ~MO_AMASK;
@@ -538,7 +537,6 @@ static bool trans_VLDST_multiple(DisasContext *s, arg_VLDST_multiple *a)
         }
     }
     tcg_temp_free_i32(addr);
-    tcg_temp_free_i32(tmp);
     tcg_temp_free_i64(tmp64);
 
     gen_neon_ldst_base_update(s, a->rm, a->rn, nregs * interleave * 8);
@@ -1348,7 +1346,7 @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
      * To avoid excessive duplication of ops we implement shift
      * by immediate using the variable shift operations.
      */
-    constimm = tcg_const_i64(dup_const(a->size, a->shift));
+    constimm = tcg_constant_i64(dup_const(a->size, a->shift));
 
     for (pass = 0; pass < a->q + 1; pass++) {
         TCGv_i64 tmp = tcg_temp_new_i64();
@@ -1358,7 +1356,6 @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
         write_neon_element64(tmp, a->vd, pass, MO_64);
         tcg_temp_free_i64(tmp);
     }
-    tcg_temp_free_i64(constimm);
     return true;
 }
 
@@ -1394,7 +1391,7 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
      * To avoid excessive duplication of ops we implement shift
      * by immediate using the variable shift operations.
      */
-    constimm = tcg_const_i32(dup_const(a->size, a->shift));
+    constimm = tcg_constant_i32(dup_const(a->size, a->shift));
     tmp = tcg_temp_new_i32();
 
     for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
@@ -1403,7 +1400,6 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
         write_neon_element32(tmp, a->vd, pass, MO_32);
     }
     tcg_temp_free_i32(tmp);
-    tcg_temp_free_i32(constimm);
     return true;
 }
 
@@ -1457,7 +1453,7 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
      * This is always a right shift, and the shiftfn is always a
      * left-shift helper, which thus needs the negated shift count.
      */
-    constimm = tcg_const_i64(-a->shift);
+    constimm = tcg_constant_i64(-a->shift);
     rm1 = tcg_temp_new_i64();
     rm2 = tcg_temp_new_i64();
     rd = tcg_temp_new_i32();
@@ -1477,7 +1473,6 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
     tcg_temp_free_i32(rd);
     tcg_temp_free_i64(rm1);
     tcg_temp_free_i64(rm2);
-    tcg_temp_free_i64(constimm);
 
     return true;
 }
@@ -1521,7 +1516,7 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
         /* size == 2 */
         imm = -a->shift;
     }
-    constimm = tcg_const_i32(imm);
+    constimm = tcg_constant_i32(imm);
 
     /* Load all inputs first to avoid potential overwrite */
     rm1 = tcg_temp_new_i32();
@@ -1546,7 +1541,6 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
 
     shiftfn(rm3, rm3, constimm);
     shiftfn(rm4, rm4, constimm);
-    tcg_temp_free_i32(constimm);
 
     tcg_gen_concat_i32_i64(rtmp, rm3, rm4);
     tcg_temp_free_i32(rm4);
@@ -2911,7 +2905,7 @@ static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
         return true;
     }
 
-    desc = tcg_const_i32((a->vn << 2) | a->len);
+    desc = tcg_constant_i32((a->vn << 2) | a->len);
     def = tcg_temp_new_i64();
     if (a->op) {
         read_neon_element64(def, a->vd, 0, MO_64);
@@ -2926,7 +2920,6 @@ static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
 
     tcg_temp_free_i64(def);
     tcg_temp_free_i64(val);
-    tcg_temp_free_i32(desc);
     return true;
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 20/60] target/arm: Use smin/smax for do_sat_addsub_32
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (18 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 19/60] target/arm: Use tcg_constant in translate-neon.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:07   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 21/60] target/arm: Use tcg_constant in translate-sve.c Richard Henderson
                   ` (40 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

The operation we're performing with the movcond
is either min/max depending on cond -- simplify.
Use tcg_constant_i64 while we're at it.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-sve.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 2c23459e76..ddc3a8060b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -1916,8 +1916,6 @@ static bool trans_PNEXT(DisasContext *s, arg_rr_esz *a)
 static void do_sat_addsub_32(TCGv_i64 reg, TCGv_i64 val, bool u, bool d)
 {
     int64_t ibound;
-    TCGv_i64 bound;
-    TCGCond cond;
 
     /* Use normal 64-bit arithmetic to detect 32-bit overflow.  */
     if (u) {
@@ -1928,15 +1926,12 @@ static void do_sat_addsub_32(TCGv_i64 reg, TCGv_i64 val, bool u, bool d)
     if (d) {
         tcg_gen_sub_i64(reg, reg, val);
         ibound = (u ? 0 : INT32_MIN);
-        cond = TCG_COND_LT;
+        tcg_gen_smax_i64(reg, reg, tcg_constant_i64(ibound));
     } else {
         tcg_gen_add_i64(reg, reg, val);
         ibound = (u ? UINT32_MAX : INT32_MAX);
-        cond = TCG_COND_GT;
+        tcg_gen_smin_i64(reg, reg, tcg_constant_i64(ibound));
     }
-    bound = tcg_const_i64(ibound);
-    tcg_gen_movcond_i64(cond, reg, reg, bound, bound, reg);
-    tcg_temp_free_i64(bound);
 }
 
 /* Similarly with 64-bit values.  */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 21/60] target/arm: Use tcg_constant in translate-sve.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (19 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 20/60] target/arm: Use smin/smax for do_sat_addsub_32 Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:08   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 22/60] target/arm: Use tcg_constant in translate-vfp.c Richard Henderson
                   ` (39 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_constant_{i32,i64} as appropriate throughout.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-sve.c | 198 +++++++++++++------------------------
 1 file changed, 68 insertions(+), 130 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ddc3a8060b..5b3478a43a 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -282,13 +282,12 @@ static void do_predtest(DisasContext *s, int dofs, int gofs, int words)
 {
     TCGv_ptr dptr = tcg_temp_new_ptr();
     TCGv_ptr gptr = tcg_temp_new_ptr();
-    TCGv_i32 t;
+    TCGv_i32 t = tcg_temp_new_i32();
 
     tcg_gen_addi_ptr(dptr, cpu_env, dofs);
     tcg_gen_addi_ptr(gptr, cpu_env, gofs);
-    t = tcg_const_i32(words);
 
-    gen_helper_sve_predtest(t, dptr, gptr, t);
+    gen_helper_sve_predtest(t, dptr, gptr, tcg_constant_i32(words));
     tcg_temp_free_ptr(dptr);
     tcg_temp_free_ptr(gptr);
 
@@ -889,7 +888,7 @@ static bool do_vpz_ool(DisasContext *s, arg_rpr_esz *a,
         return true;
     }
 
-    desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
     temp = tcg_temp_new_i64();
     t_zn = tcg_temp_new_ptr();
     t_pg = tcg_temp_new_ptr();
@@ -899,7 +898,6 @@ static bool do_vpz_ool(DisasContext *s, arg_rpr_esz *a,
     fn(temp, t_zn, t_pg, desc);
     tcg_temp_free_ptr(t_zn);
     tcg_temp_free_ptr(t_pg);
-    tcg_temp_free_i32(desc);
 
     write_fp_dreg(s, a->rd, temp);
     tcg_temp_free_i64(temp);
@@ -1236,7 +1234,7 @@ static void do_index(DisasContext *s, int esz, int rd,
                      TCGv_i64 start, TCGv_i64 incr)
 {
     unsigned vsz = vec_full_reg_size(s);
-    TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    TCGv_i32 desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
     TCGv_ptr t_zd = tcg_temp_new_ptr();
 
     tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd));
@@ -1260,17 +1258,14 @@ static void do_index(DisasContext *s, int esz, int rd,
         tcg_temp_free_i32(i32);
     }
     tcg_temp_free_ptr(t_zd);
-    tcg_temp_free_i32(desc);
 }
 
 static bool trans_INDEX_ii(DisasContext *s, arg_INDEX_ii *a)
 {
     if (sve_access_check(s)) {
-        TCGv_i64 start = tcg_const_i64(a->imm1);
-        TCGv_i64 incr = tcg_const_i64(a->imm2);
+        TCGv_i64 start = tcg_constant_i64(a->imm1);
+        TCGv_i64 incr = tcg_constant_i64(a->imm2);
         do_index(s, a->esz, a->rd, start, incr);
-        tcg_temp_free_i64(start);
-        tcg_temp_free_i64(incr);
     }
     return true;
 }
@@ -1278,10 +1273,9 @@ static bool trans_INDEX_ii(DisasContext *s, arg_INDEX_ii *a)
 static bool trans_INDEX_ir(DisasContext *s, arg_INDEX_ir *a)
 {
     if (sve_access_check(s)) {
-        TCGv_i64 start = tcg_const_i64(a->imm);
+        TCGv_i64 start = tcg_constant_i64(a->imm);
         TCGv_i64 incr = cpu_reg(s, a->rm);
         do_index(s, a->esz, a->rd, start, incr);
-        tcg_temp_free_i64(start);
     }
     return true;
 }
@@ -1290,9 +1284,8 @@ static bool trans_INDEX_ri(DisasContext *s, arg_INDEX_ri *a)
 {
     if (sve_access_check(s)) {
         TCGv_i64 start = cpu_reg(s, a->rn);
-        TCGv_i64 incr = tcg_const_i64(a->imm);
+        TCGv_i64 incr = tcg_constant_i64(a->imm);
         do_index(s, a->esz, a->rd, start, incr);
-        tcg_temp_free_i64(incr);
     }
     return true;
 }
@@ -1884,9 +1877,9 @@ static bool do_pfirst_pnext(DisasContext *s, arg_rr_esz *a,
 
     tcg_gen_addi_ptr(t_pd, cpu_env, pred_full_reg_offset(s, a->rd));
     tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->rn));
-    t = tcg_const_i32(desc);
+    t = tcg_temp_new_i32();
 
-    gen_fn(t, t_pd, t_pg, t);
+    gen_fn(t, t_pd, t_pg, tcg_constant_i32(desc));
     tcg_temp_free_ptr(t_pd);
     tcg_temp_free_ptr(t_pg);
 
@@ -1993,7 +1986,7 @@ static void do_sat_addsub_vec(DisasContext *s, int esz, int rd, int rn,
     nptr = tcg_temp_new_ptr();
     tcg_gen_addi_ptr(dptr, cpu_env, vec_full_reg_offset(s, rd));
     tcg_gen_addi_ptr(nptr, cpu_env, vec_full_reg_offset(s, rn));
-    desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
 
     switch (esz) {
     case MO_8:
@@ -2062,7 +2055,6 @@ static void do_sat_addsub_vec(DisasContext *s, int esz, int rd, int rn,
 
     tcg_temp_free_ptr(dptr);
     tcg_temp_free_ptr(nptr);
-    tcg_temp_free_i32(desc);
 }
 
 static bool trans_CNT_r(DisasContext *s, arg_CNT_r *a)
@@ -2107,9 +2099,7 @@ static bool trans_SINCDEC_r_32(DisasContext *s, arg_incdec_cnt *a)
             tcg_gen_ext32s_i64(reg, reg);
         }
     } else {
-        TCGv_i64 t = tcg_const_i64(inc);
-        do_sat_addsub_32(reg, t, a->u, a->d);
-        tcg_temp_free_i64(t);
+        do_sat_addsub_32(reg, tcg_constant_i64(inc), a->u, a->d);
     }
     return true;
 }
@@ -2126,9 +2116,7 @@ static bool trans_SINCDEC_r_64(DisasContext *s, arg_incdec_cnt *a)
     TCGv_i64 reg = cpu_reg(s, a->rd);
 
     if (inc != 0) {
-        TCGv_i64 t = tcg_const_i64(inc);
-        do_sat_addsub_64(reg, t, a->u, a->d);
-        tcg_temp_free_i64(t);
+        do_sat_addsub_64(reg, tcg_constant_i64(inc), a->u, a->d);
     }
     return true;
 }
@@ -2145,11 +2133,10 @@ static bool trans_INCDEC_v(DisasContext *s, arg_incdec2_cnt *a)
 
     if (inc != 0) {
         if (sve_access_check(s)) {
-            TCGv_i64 t = tcg_const_i64(a->d ? -inc : inc);
             tcg_gen_gvec_adds(a->esz, vec_full_reg_offset(s, a->rd),
                               vec_full_reg_offset(s, a->rn),
-                              t, fullsz, fullsz);
-            tcg_temp_free_i64(t);
+                              tcg_constant_i64(a->d ? -inc : inc),
+                              fullsz, fullsz);
         }
     } else {
         do_mov_z(s, a->rd, a->rn);
@@ -2169,9 +2156,8 @@ static bool trans_SINCDEC_v(DisasContext *s, arg_incdec2_cnt *a)
 
     if (inc != 0) {
         if (sve_access_check(s)) {
-            TCGv_i64 t = tcg_const_i64(inc);
-            do_sat_addsub_vec(s, a->esz, a->rd, a->rn, t, a->u, a->d);
-            tcg_temp_free_i64(t);
+            do_sat_addsub_vec(s, a->esz, a->rd, a->rn,
+                              tcg_constant_i64(inc), a->u, a->d);
         }
     } else {
         do_mov_z(s, a->rd, a->rn);
@@ -2244,7 +2230,7 @@ static void do_cpy_m(DisasContext *s, int esz, int rd, int rn, int pg,
         gen_helper_sve_cpy_m_s, gen_helper_sve_cpy_m_d,
     };
     unsigned vsz = vec_full_reg_size(s);
-    TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    TCGv_i32 desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
     TCGv_ptr t_zd = tcg_temp_new_ptr();
     TCGv_ptr t_zn = tcg_temp_new_ptr();
     TCGv_ptr t_pg = tcg_temp_new_ptr();
@@ -2258,7 +2244,6 @@ static void do_cpy_m(DisasContext *s, int esz, int rd, int rn, int pg,
     tcg_temp_free_ptr(t_zd);
     tcg_temp_free_ptr(t_zn);
     tcg_temp_free_ptr(t_pg);
-    tcg_temp_free_i32(desc);
 }
 
 static bool trans_FCPY(DisasContext *s, arg_FCPY *a)
@@ -2269,9 +2254,7 @@ static bool trans_FCPY(DisasContext *s, arg_FCPY *a)
     if (sve_access_check(s)) {
         /* Decode the VFP immediate.  */
         uint64_t imm = vfp_expand_imm(a->esz, a->imm);
-        TCGv_i64 t_imm = tcg_const_i64(imm);
-        do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm);
-        tcg_temp_free_i64(t_imm);
+        do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, tcg_constant_i64(imm));
     }
     return true;
 }
@@ -2282,9 +2265,7 @@ static bool trans_CPY_m_i(DisasContext *s, arg_rpri_esz *a)
         return false;
     }
     if (sve_access_check(s)) {
-        TCGv_i64 t_imm = tcg_const_i64(a->imm);
-        do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm);
-        tcg_temp_free_i64(t_imm);
+        do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, tcg_constant_i64(a->imm));
     }
     return true;
 }
@@ -2301,11 +2282,10 @@ static bool trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a)
     }
     if (sve_access_check(s)) {
         unsigned vsz = vec_full_reg_size(s);
-        TCGv_i64 t_imm = tcg_const_i64(a->imm);
         tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd),
                             pred_full_reg_offset(s, a->pg),
-                            t_imm, vsz, vsz, 0, fns[a->esz]);
-        tcg_temp_free_i64(t_imm);
+                            tcg_constant_i64(a->imm),
+                            vsz, vsz, 0, fns[a->esz]);
     }
     return true;
 }
@@ -2406,7 +2386,7 @@ static void do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val)
         gen_helper_sve_insr_s, gen_helper_sve_insr_d,
     };
     unsigned vsz = vec_full_reg_size(s);
-    TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    TCGv_i32 desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
     TCGv_ptr t_zd = tcg_temp_new_ptr();
     TCGv_ptr t_zn = tcg_temp_new_ptr();
 
@@ -2417,7 +2397,6 @@ static void do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val)
 
     tcg_temp_free_ptr(t_zd);
     tcg_temp_free_ptr(t_zn);
-    tcg_temp_free_i32(desc);
 }
 
 static bool trans_INSR_f(DisasContext *s, arg_rrr_esz *a)
@@ -2536,7 +2515,6 @@ static bool do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd,
     TCGv_ptr t_d = tcg_temp_new_ptr();
     TCGv_ptr t_n = tcg_temp_new_ptr();
     TCGv_ptr t_m = tcg_temp_new_ptr();
-    TCGv_i32 t_desc;
     uint32_t desc = 0;
 
     desc = FIELD_DP32(desc, PREDDESC, OPRSZ, vsz);
@@ -2546,14 +2524,12 @@ static bool do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd,
     tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd));
     tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn));
     tcg_gen_addi_ptr(t_m, cpu_env, pred_full_reg_offset(s, a->rm));
-    t_desc = tcg_const_i32(desc);
 
-    fn(t_d, t_n, t_m, t_desc);
+    fn(t_d, t_n, t_m, tcg_constant_i32(desc));
 
     tcg_temp_free_ptr(t_d);
     tcg_temp_free_ptr(t_n);
     tcg_temp_free_ptr(t_m);
-    tcg_temp_free_i32(t_desc);
     return true;
 }
 
@@ -2567,7 +2543,6 @@ static bool do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd,
     unsigned vsz = pred_full_reg_size(s);
     TCGv_ptr t_d = tcg_temp_new_ptr();
     TCGv_ptr t_n = tcg_temp_new_ptr();
-    TCGv_i32 t_desc;
     uint32_t desc = 0;
 
     tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd));
@@ -2576,11 +2551,9 @@ static bool do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd,
     desc = FIELD_DP32(desc, PREDDESC, OPRSZ, vsz);
     desc = FIELD_DP32(desc, PREDDESC, ESZ, a->esz);
     desc = FIELD_DP32(desc, PREDDESC, DATA, high_odd);
-    t_desc = tcg_const_i32(desc);
 
-    fn(t_d, t_n, t_desc);
+    fn(t_d, t_n, tcg_constant_i32(desc));
 
-    tcg_temp_free_i32(t_desc);
     tcg_temp_free_ptr(t_d);
     tcg_temp_free_ptr(t_n);
     return true;
@@ -2782,18 +2755,15 @@ static void find_last_active(DisasContext *s, TCGv_i32 ret, int esz, int pg)
      * round up, as we do elsewhere, because we need the exact size.
      */
     TCGv_ptr t_p = tcg_temp_new_ptr();
-    TCGv_i32 t_desc;
     unsigned desc = 0;
 
     desc = FIELD_DP32(desc, PREDDESC, OPRSZ, pred_full_reg_size(s));
     desc = FIELD_DP32(desc, PREDDESC, ESZ, esz);
 
     tcg_gen_addi_ptr(t_p, cpu_env, pred_full_reg_offset(s, pg));
-    t_desc = tcg_const_i32(desc);
 
-    gen_helper_sve_last_active_element(ret, t_p, t_desc);
+    gen_helper_sve_last_active_element(ret, t_p, tcg_constant_i32(desc));
 
-    tcg_temp_free_i32(t_desc);
     tcg_temp_free_ptr(t_p);
 }
 
@@ -2808,11 +2778,9 @@ static void incr_last_active(DisasContext *s, TCGv_i32 last, int esz)
     if (is_power_of_2(vsz)) {
         tcg_gen_andi_i32(last, last, vsz - 1);
     } else {
-        TCGv_i32 max = tcg_const_i32(vsz);
-        TCGv_i32 zero = tcg_const_i32(0);
+        TCGv_i32 max = tcg_constant_i32(vsz);
+        TCGv_i32 zero = tcg_constant_i32(0);
         tcg_gen_movcond_i32(TCG_COND_GEU, last, last, max, zero, last);
-        tcg_temp_free_i32(max);
-        tcg_temp_free_i32(zero);
     }
 }
 
@@ -2824,11 +2792,9 @@ static void wrap_last_active(DisasContext *s, TCGv_i32 last, int esz)
     if (is_power_of_2(vsz)) {
         tcg_gen_andi_i32(last, last, vsz - 1);
     } else {
-        TCGv_i32 max = tcg_const_i32(vsz - (1 << esz));
-        TCGv_i32 zero = tcg_const_i32(0);
+        TCGv_i32 max = tcg_constant_i32(vsz - (1 << esz));
+        TCGv_i32 zero = tcg_constant_i32(0);
         tcg_gen_movcond_i32(TCG_COND_LT, last, last, zero, max, last);
-        tcg_temp_free_i32(max);
-        tcg_temp_free_i32(zero);
     }
 }
 
@@ -2945,7 +2911,7 @@ static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm,
                             bool before, TCGv_i64 reg_val)
 {
     TCGv_i32 last = tcg_temp_new_i32();
-    TCGv_i64 ele, cmp, zero;
+    TCGv_i64 ele, cmp;
 
     find_last_active(s, last, esz, pg);
 
@@ -2965,10 +2931,9 @@ static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm,
     ele = load_last_active(s, last, rm, esz);
     tcg_temp_free_i32(last);
 
-    zero = tcg_const_i64(0);
-    tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, zero, ele, reg_val);
+    tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, tcg_constant_i64(0),
+                        ele, reg_val);
 
-    tcg_temp_free_i64(zero);
     tcg_temp_free_i64(cmp);
     tcg_temp_free_i64(ele);
 }
@@ -3196,7 +3161,7 @@ static bool do_ppzz_flags(DisasContext *s, arg_rprr_esz *a,
     }
 
     vsz = vec_full_reg_size(s);
-    t = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    t = tcg_temp_new_i32();
     pd = tcg_temp_new_ptr();
     zn = tcg_temp_new_ptr();
     zm = tcg_temp_new_ptr();
@@ -3207,7 +3172,7 @@ static bool do_ppzz_flags(DisasContext *s, arg_rprr_esz *a,
     tcg_gen_addi_ptr(zm, cpu_env, vec_full_reg_offset(s, a->rm));
     tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
 
-    gen_fn(t, pd, zn, zm, pg, t);
+    gen_fn(t, pd, zn, zm, pg, tcg_constant_i32(simd_desc(vsz, vsz, 0)));
 
     tcg_temp_free_ptr(pd);
     tcg_temp_free_ptr(zn);
@@ -3281,7 +3246,7 @@ static bool do_ppzi_flags(DisasContext *s, arg_rpri_esz *a,
     }
 
     vsz = vec_full_reg_size(s);
-    t = tcg_const_i32(simd_desc(vsz, vsz, a->imm));
+    t = tcg_temp_new_i32();
     pd = tcg_temp_new_ptr();
     zn = tcg_temp_new_ptr();
     pg = tcg_temp_new_ptr();
@@ -3290,7 +3255,7 @@ static bool do_ppzi_flags(DisasContext *s, arg_rpri_esz *a,
     tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn));
     tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
 
-    gen_fn(t, pd, zn, pg, t);
+    gen_fn(t, pd, zn, pg, tcg_constant_i32(simd_desc(vsz, vsz, a->imm)));
 
     tcg_temp_free_ptr(pd);
     tcg_temp_free_ptr(zn);
@@ -3343,7 +3308,8 @@ static bool do_brk3(DisasContext *s, arg_rprr_s *a,
     TCGv_ptr n = tcg_temp_new_ptr();
     TCGv_ptr m = tcg_temp_new_ptr();
     TCGv_ptr g = tcg_temp_new_ptr();
-    TCGv_i32 t = tcg_const_i32(FIELD_DP32(0, PREDDESC, OPRSZ, vsz));
+    TCGv_i32 t = tcg_temp_new_i32();
+    TCGv_i32 desc = tcg_constant_i32(FIELD_DP32(0, PREDDESC, OPRSZ, vsz));
 
     tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd));
     tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn));
@@ -3351,10 +3317,10 @@ static bool do_brk3(DisasContext *s, arg_rprr_s *a,
     tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg));
 
     if (a->s) {
-        fn_s(t, d, n, m, g, t);
+        fn_s(t, d, n, m, g, desc);
         do_pred_flags(t);
     } else {
-        fn(d, n, m, g, t);
+        fn(d, n, m, g, desc);
     }
     tcg_temp_free_ptr(d);
     tcg_temp_free_ptr(n);
@@ -3377,17 +3343,18 @@ static bool do_brk2(DisasContext *s, arg_rpr_s *a,
     TCGv_ptr d = tcg_temp_new_ptr();
     TCGv_ptr n = tcg_temp_new_ptr();
     TCGv_ptr g = tcg_temp_new_ptr();
-    TCGv_i32 t = tcg_const_i32(FIELD_DP32(0, PREDDESC, OPRSZ, vsz));
+    TCGv_i32 t = tcg_temp_new_i32();
+    TCGv_i32 desc = tcg_constant_i32(FIELD_DP32(0, PREDDESC, OPRSZ, vsz));
 
     tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd));
     tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn));
     tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg));
 
     if (a->s) {
-        fn_s(t, d, n, g, t);
+        fn_s(t, d, n, g, desc);
         do_pred_flags(t);
     } else {
-        fn(d, n, g, t);
+        fn(d, n, g, desc);
     }
     tcg_temp_free_ptr(d);
     tcg_temp_free_ptr(n);
@@ -3461,19 +3428,16 @@ static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg)
         TCGv_ptr t_pn = tcg_temp_new_ptr();
         TCGv_ptr t_pg = tcg_temp_new_ptr();
         unsigned desc = 0;
-        TCGv_i32 t_desc;
 
         desc = FIELD_DP32(desc, PREDDESC, OPRSZ, psz);
         desc = FIELD_DP32(desc, PREDDESC, ESZ, esz);
 
         tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn));
         tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
-        t_desc = tcg_const_i32(desc);
 
-        gen_helper_sve_cntp(val, t_pn, t_pg, t_desc);
+        gen_helper_sve_cntp(val, t_pn, t_pg, tcg_constant_i32(desc));
         tcg_temp_free_ptr(t_pn);
         tcg_temp_free_ptr(t_pg);
-        tcg_temp_free_i32(t_desc);
     }
 }
 
@@ -3588,7 +3552,7 @@ static bool trans_CTERM(DisasContext *s, arg_CTERM *a)
 static bool trans_WHILE(DisasContext *s, arg_WHILE *a)
 {
     TCGv_i64 op0, op1, t0, t1, tmax;
-    TCGv_i32 t2, t3;
+    TCGv_i32 t2;
     TCGv_ptr ptr;
     unsigned vsz = vec_full_reg_size(s);
     unsigned desc = 0;
@@ -3644,7 +3608,7 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a)
         }
     }
 
-    tmax = tcg_const_i64(vsz >> a->esz);
+    tmax = tcg_constant_i64(vsz >> a->esz);
     if (eq) {
         /* Equality means one more iteration.  */
         tcg_gen_addi_i64(t0, t0, 1);
@@ -3664,7 +3628,6 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a)
 
     /* Bound to the maximum.  */
     tcg_gen_umin_i64(t0, t0, tmax);
-    tcg_temp_free_i64(tmax);
 
     /* Set the count to zero if the condition is false.  */
     tcg_gen_movi_i64(t1, 0);
@@ -3681,28 +3644,26 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a)
 
     desc = FIELD_DP32(desc, PREDDESC, OPRSZ, vsz / 8);
     desc = FIELD_DP32(desc, PREDDESC, ESZ, a->esz);
-    t3 = tcg_const_i32(desc);
 
     ptr = tcg_temp_new_ptr();
     tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd));
 
     if (a->lt) {
-        gen_helper_sve_whilel(t2, ptr, t2, t3);
+        gen_helper_sve_whilel(t2, ptr, t2, tcg_constant_i32(desc));
     } else {
-        gen_helper_sve_whileg(t2, ptr, t2, t3);
+        gen_helper_sve_whileg(t2, ptr, t2, tcg_constant_i32(desc));
     }
     do_pred_flags(t2);
 
     tcg_temp_free_ptr(ptr);
     tcg_temp_free_i32(t2);
-    tcg_temp_free_i32(t3);
     return true;
 }
 
 static bool trans_WHILE_ptr(DisasContext *s, arg_WHILE_ptr *a)
 {
     TCGv_i64 op0, op1, diff, t1, tmax;
-    TCGv_i32 t2, t3;
+    TCGv_i32 t2;
     TCGv_ptr ptr;
     unsigned vsz = vec_full_reg_size(s);
     unsigned desc = 0;
@@ -3717,7 +3678,7 @@ static bool trans_WHILE_ptr(DisasContext *s, arg_WHILE_ptr *a)
     op0 = read_cpu_reg(s, a->rn, 1);
     op1 = read_cpu_reg(s, a->rm, 1);
 
-    tmax = tcg_const_i64(vsz);
+    tmax = tcg_constant_i64(vsz);
     diff = tcg_temp_new_i64();
 
     if (a->rw) {
@@ -3743,7 +3704,6 @@ static bool trans_WHILE_ptr(DisasContext *s, arg_WHILE_ptr *a)
 
     /* Bound to the maximum.  */
     tcg_gen_umin_i64(diff, diff, tmax);
-    tcg_temp_free_i64(tmax);
 
     /* Since we're bounded, pass as a 32-bit type.  */
     t2 = tcg_temp_new_i32();
@@ -3752,17 +3712,15 @@ static bool trans_WHILE_ptr(DisasContext *s, arg_WHILE_ptr *a)
 
     desc = FIELD_DP32(desc, PREDDESC, OPRSZ, vsz / 8);
     desc = FIELD_DP32(desc, PREDDESC, ESZ, a->esz);
-    t3 = tcg_const_i32(desc);
 
     ptr = tcg_temp_new_ptr();
     tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd));
 
-    gen_helper_sve_whilel(t2, ptr, t2, t3);
+    gen_helper_sve_whilel(t2, ptr, t2, tcg_constant_i32(desc));
     do_pred_flags(t2);
 
     tcg_temp_free_ptr(ptr);
     tcg_temp_free_i32(t2);
-    tcg_temp_free_i32(t3);
     return true;
 }
 
@@ -3856,11 +3814,9 @@ static bool trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a)
     }
     if (sve_access_check(s)) {
         unsigned vsz = vec_full_reg_size(s);
-        TCGv_i64 c = tcg_const_i64(a->imm);
         tcg_gen_gvec_2s(vec_full_reg_offset(s, a->rd),
                         vec_full_reg_offset(s, a->rn),
-                        vsz, vsz, c, &op[a->esz]);
-        tcg_temp_free_i64(c);
+                        vsz, vsz, tcg_constant_i64(a->imm), &op[a->esz]);
     }
     return true;
 }
@@ -3881,9 +3837,8 @@ static bool do_zzi_sat(DisasContext *s, arg_rri_esz *a, bool u, bool d)
         return false;
     }
     if (sve_access_check(s)) {
-        TCGv_i64 val = tcg_const_i64(a->imm);
-        do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, u, d);
-        tcg_temp_free_i64(val);
+        do_sat_addsub_vec(s, a->esz, a->rd, a->rn,
+                          tcg_constant_i64(a->imm), u, d);
     }
     return true;
 }
@@ -3912,12 +3867,9 @@ static bool do_zzi_ool(DisasContext *s, arg_rri_esz *a, gen_helper_gvec_2i *fn)
 {
     if (sve_access_check(s)) {
         unsigned vsz = vec_full_reg_size(s);
-        TCGv_i64 c = tcg_const_i64(a->imm);
-
         tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd),
                             vec_full_reg_offset(s, a->rn),
-                            c, vsz, vsz, 0, fn);
-        tcg_temp_free_i64(c);
+                            tcg_constant_i64(a->imm), vsz, vsz, 0, fn);
     }
     return true;
 }
@@ -4221,7 +4173,7 @@ static void do_reduce(DisasContext *s, arg_rpr_esz *a,
 {
     unsigned vsz = vec_full_reg_size(s);
     unsigned p2vsz = pow2ceil(vsz);
-    TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, vsz, p2vsz));
+    TCGv_i32 t_desc = tcg_constant_i32(simd_desc(vsz, vsz, p2vsz));
     TCGv_ptr t_zn, t_pg, status;
     TCGv_i64 temp;
 
@@ -4237,7 +4189,6 @@ static void do_reduce(DisasContext *s, arg_rpr_esz *a,
     tcg_temp_free_ptr(t_zn);
     tcg_temp_free_ptr(t_pg);
     tcg_temp_free_ptr(status);
-    tcg_temp_free_i32(t_desc);
 
     write_fp_dreg(s, a->rd, temp);
     tcg_temp_free_i64(temp);
@@ -4414,11 +4365,10 @@ static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a)
     tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm));
     tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
     t_fpst = fpstatus_ptr(a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
-    t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    t_desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
 
     fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc);
 
-    tcg_temp_free_i32(t_desc);
     tcg_temp_free_ptr(t_fpst);
     tcg_temp_free_ptr(t_pg);
     tcg_temp_free_ptr(t_rm);
@@ -4535,10 +4485,9 @@ static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16,
     tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
 
     status = fpstatus_ptr(is_fp16 ? FPST_FPCR_F16 : FPST_FPCR);
-    desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    desc = tcg_constant_i32(simd_desc(vsz, vsz, 0));
     fn(t_zd, t_zn, t_pg, scalar, status, desc);
 
-    tcg_temp_free_i32(desc);
     tcg_temp_free_ptr(status);
     tcg_temp_free_ptr(t_pg);
     tcg_temp_free_ptr(t_zn);
@@ -4548,9 +4497,8 @@ static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16,
 static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm,
                       gen_helper_sve_fp2scalar *fn)
 {
-    TCGv_i64 temp = tcg_const_i64(imm);
-    do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16, temp, fn);
-    tcg_temp_free_i64(temp);
+    do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16,
+                 tcg_constant_i64(imm), fn);
 }
 
 #define DO_FP_IMM(NAME, name, const0, const1) \
@@ -5297,7 +5245,6 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
 {
     unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_pg;
-    TCGv_i32 t_desc;
     int desc = 0;
 
     /*
@@ -5319,14 +5266,12 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
     }
 
     desc = simd_desc(vsz, vsz, zt | desc);
-    t_desc = tcg_const_i32(desc);
     t_pg = tcg_temp_new_ptr();
 
     tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
-    fn(cpu_env, t_pg, addr, t_desc);
+    fn(cpu_env, t_pg, addr, tcg_constant_i32(desc));
 
     tcg_temp_free_ptr(t_pg);
-    tcg_temp_free_i32(t_desc);
 }
 
 /* Indexed by [mte][be][dtype][nreg] */
@@ -6069,7 +6014,6 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
     TCGv_ptr t_zm = tcg_temp_new_ptr();
     TCGv_ptr t_pg = tcg_temp_new_ptr();
     TCGv_ptr t_zt = tcg_temp_new_ptr();
-    TCGv_i32 t_desc;
     int desc = 0;
 
     if (s->mte_active[0]) {
@@ -6081,17 +6025,15 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
         desc <<= SVE_MTEDESC_SHIFT;
     }
     desc = simd_desc(vsz, vsz, desc | scale);
-    t_desc = tcg_const_i32(desc);
 
     tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
     tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm));
     tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt));
-    fn(cpu_env, t_zt, t_pg, t_zm, scalar, t_desc);
+    fn(cpu_env, t_zt, t_pg, t_zm, scalar, tcg_constant_i32(desc));
 
     tcg_temp_free_ptr(t_zt);
     tcg_temp_free_ptr(t_zm);
     tcg_temp_free_ptr(t_pg);
-    tcg_temp_free_i32(t_desc);
 }
 
 /* Indexed by [mte][be][ff][xs][u][msz].  */
@@ -6452,7 +6394,6 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
     gen_helper_gvec_mem_scatter *fn = NULL;
     bool be = s->be_data == MO_BE;
     bool mte = s->mte_active[0];
-    TCGv_i64 imm;
 
     if (a->esz < a->msz || (a->esz == a->msz && !a->u)) {
         return false;
@@ -6474,9 +6415,8 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
     /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x])
      * by loading the immediate into the scalar parameter.
      */
-    imm = tcg_const_i64(a->imm << a->msz);
-    do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, false, fn);
-    tcg_temp_free_i64(imm);
+    do_mem_zpz(s, a->rd, a->pg, a->rn, 0,
+               tcg_constant_i64(a->imm << a->msz), a->msz, false, fn);
     return true;
 }
 
@@ -6635,7 +6575,6 @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a)
     gen_helper_gvec_mem_scatter *fn = NULL;
     bool be = s->be_data == MO_BE;
     bool mte = s->mte_active[0];
-    TCGv_i64 imm;
 
     if (a->esz < a->msz) {
         return false;
@@ -6657,9 +6596,8 @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a)
     /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x])
      * by loading the immediate into the scalar parameter.
      */
-    imm = tcg_const_i64(a->imm << a->msz);
-    do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, true, fn);
-    tcg_temp_free_i64(imm);
+    do_mem_zpz(s, a->rd, a->pg, a->rn, 0,
+               tcg_constant_i64(a->imm << a->msz), a->msz, true, fn);
     return true;
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 22/60] target/arm: Use tcg_constant in translate-vfp.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (20 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 21/60] target/arm: Use tcg_constant in translate-sve.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:10   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 23/60] target/arm: Use tcg_constant_i32 in translate.h Richard Henderson
                   ` (38 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use tcg_constant_{i32,i64} as appropriate throughout.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.c | 76 ++++++++++++--------------------------
 1 file changed, 23 insertions(+), 53 deletions(-)

diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
index 17f796e32a..32b784b9c8 100644
--- a/target/arm/translate-vfp.c
+++ b/target/arm/translate-vfp.c
@@ -180,8 +180,7 @@ static void gen_update_fp_context(DisasContext *s)
         gen_helper_vfp_set_fpscr(cpu_env, fpscr);
         tcg_temp_free_i32(fpscr);
         if (dc_isar_feature(aa32_mve, s)) {
-            TCGv_i32 z32 = tcg_const_i32(0);
-            store_cpu_field(z32, v7m.vpr);
+            store_cpu_field(tcg_constant_i32(0), v7m.vpr);
         }
         /*
          * We just updated the FPSCR and VPR. Some of this state is cached
@@ -317,7 +316,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         TCGv_i64 frn, frm, dest;
         TCGv_i64 tmp, zero, zf, nf, vf;
 
-        zero = tcg_const_i64(0);
+        zero = tcg_constant_i64(0);
 
         frn = tcg_temp_new_i64();
         frm = tcg_temp_new_i64();
@@ -335,27 +334,22 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         vfp_load_reg64(frm, rm);
         switch (a->cc) {
         case 0: /* eq: Z */
-            tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
-                                frn, frm);
+            tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero, frn, frm);
             break;
         case 1: /* vs: V */
-            tcg_gen_movcond_i64(TCG_COND_LT, dest, vf, zero,
-                                frn, frm);
+            tcg_gen_movcond_i64(TCG_COND_LT, dest, vf, zero, frn, frm);
             break;
         case 2: /* ge: N == V -> N ^ V == 0 */
             tmp = tcg_temp_new_i64();
             tcg_gen_xor_i64(tmp, vf, nf);
-            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
-                                frn, frm);
+            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero, frn, frm);
             tcg_temp_free_i64(tmp);
             break;
         case 3: /* gt: !Z && N == V */
-            tcg_gen_movcond_i64(TCG_COND_NE, dest, zf, zero,
-                                frn, frm);
+            tcg_gen_movcond_i64(TCG_COND_NE, dest, zf, zero, frn, frm);
             tmp = tcg_temp_new_i64();
             tcg_gen_xor_i64(tmp, vf, nf);
-            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
-                                dest, frm);
+            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero, dest, frm);
             tcg_temp_free_i64(tmp);
             break;
         }
@@ -367,13 +361,11 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         tcg_temp_free_i64(zf);
         tcg_temp_free_i64(nf);
         tcg_temp_free_i64(vf);
-
-        tcg_temp_free_i64(zero);
     } else {
         TCGv_i32 frn, frm, dest;
         TCGv_i32 tmp, zero;
 
-        zero = tcg_const_i32(0);
+        zero = tcg_constant_i32(0);
 
         frn = tcg_temp_new_i32();
         frm = tcg_temp_new_i32();
@@ -382,27 +374,22 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         vfp_load_reg32(frm, rm);
         switch (a->cc) {
         case 0: /* eq: Z */
-            tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
-                                frn, frm);
+            tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero, frn, frm);
             break;
         case 1: /* vs: V */
-            tcg_gen_movcond_i32(TCG_COND_LT, dest, cpu_VF, zero,
-                                frn, frm);
+            tcg_gen_movcond_i32(TCG_COND_LT, dest, cpu_VF, zero, frn, frm);
             break;
         case 2: /* ge: N == V -> N ^ V == 0 */
             tmp = tcg_temp_new_i32();
             tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
-                                frn, frm);
+            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero, frn, frm);
             tcg_temp_free_i32(tmp);
             break;
         case 3: /* gt: !Z && N == V */
-            tcg_gen_movcond_i32(TCG_COND_NE, dest, cpu_ZF, zero,
-                                frn, frm);
+            tcg_gen_movcond_i32(TCG_COND_NE, dest, cpu_ZF, zero, frn, frm);
             tmp = tcg_temp_new_i32();
             tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
-                                dest, frm);
+            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero, dest, frm);
             tcg_temp_free_i32(tmp);
             break;
         }
@@ -414,8 +401,6 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         tcg_temp_free_i32(frn);
         tcg_temp_free_i32(frm);
         tcg_temp_free_i32(dest);
-
-        tcg_temp_free_i32(zero);
     }
 
     return true;
@@ -547,7 +532,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
         fpst = fpstatus_ptr(FPST_FPCR);
     }
 
-    tcg_shift = tcg_const_i32(0);
+    tcg_shift = tcg_constant_i32(0);
 
     tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
     gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
@@ -595,8 +580,6 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
     gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst);
     tcg_temp_free_i32(tcg_rmode);
 
-    tcg_temp_free_i32(tcg_shift);
-
     tcg_temp_free_ptr(fpst);
 
     return true;
@@ -850,15 +833,11 @@ static bool trans_VMSR_VMRS(DisasContext *s, arg_VMSR_VMRS *a)
         case ARM_VFP_MVFR2:
         case ARM_VFP_FPSID:
             if (s->current_el == 1) {
-                TCGv_i32 tcg_reg, tcg_rt;
-
                 gen_set_condexec(s);
                 gen_set_pc_im(s, s->pc_curr);
-                tcg_reg = tcg_const_i32(a->reg);
-                tcg_rt = tcg_const_i32(a->rt);
-                gen_helper_check_hcr_el2_trap(cpu_env, tcg_rt, tcg_reg);
-                tcg_temp_free_i32(tcg_reg);
-                tcg_temp_free_i32(tcg_rt);
+                gen_helper_check_hcr_el2_trap(cpu_env,
+                                              tcg_constant_i32(a->rt),
+                                              tcg_constant_i32(a->reg));
             }
             /* fall through */
         case ARM_VFP_FPEXC:
@@ -2388,8 +2367,6 @@ MAKE_VFM_TRANS_FNS(dp)
 
 static bool trans_VMOV_imm_hp(DisasContext *s, arg_VMOV_imm_sp *a)
 {
-    TCGv_i32 fd;
-
     if (!dc_isar_feature(aa32_fp16_arith, s)) {
         return false;
     }
@@ -2402,9 +2379,7 @@ static bool trans_VMOV_imm_hp(DisasContext *s, arg_VMOV_imm_sp *a)
         return true;
     }
 
-    fd = tcg_const_i32(vfp_expand_imm(MO_16, a->imm));
-    vfp_store_reg32(fd, a->vd);
-    tcg_temp_free_i32(fd);
+    vfp_store_reg32(tcg_constant_i32(vfp_expand_imm(MO_16, a->imm)), a->vd);
     return true;
 }
 
@@ -2440,7 +2415,7 @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
         }
     }
 
-    fd = tcg_const_i32(vfp_expand_imm(MO_32, a->imm));
+    fd = tcg_constant_i32(vfp_expand_imm(MO_32, a->imm));
 
     for (;;) {
         vfp_store_reg32(fd, vd);
@@ -2454,7 +2429,6 @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a)
         vd = vfp_advance_sreg(vd, delta_d);
     }
 
-    tcg_temp_free_i32(fd);
     return true;
 }
 
@@ -2495,7 +2469,7 @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
         }
     }
 
-    fd = tcg_const_i64(vfp_expand_imm(MO_64, a->imm));
+    fd = tcg_constant_i64(vfp_expand_imm(MO_64, a->imm));
 
     for (;;) {
         vfp_store_reg64(fd, vd);
@@ -2509,7 +2483,6 @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
         vd = vfp_advance_dreg(vd, delta_d);
     }
 
-    tcg_temp_free_i64(fd);
     return true;
 }
 
@@ -3294,7 +3267,7 @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a)
     vfp_load_reg32(vd, a->vd);
 
     fpst = fpstatus_ptr(FPST_FPCR_F16);
-    shift = tcg_const_i32(frac_bits);
+    shift = tcg_constant_i32(frac_bits);
 
     /* Switch on op:U:sx bits */
     switch (a->opc) {
@@ -3328,7 +3301,6 @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a)
 
     vfp_store_reg32(vd, a->vd);
     tcg_temp_free_i32(vd);
-    tcg_temp_free_i32(shift);
     tcg_temp_free_ptr(fpst);
     return true;
 }
@@ -3353,7 +3325,7 @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a)
     vfp_load_reg32(vd, a->vd);
 
     fpst = fpstatus_ptr(FPST_FPCR);
-    shift = tcg_const_i32(frac_bits);
+    shift = tcg_constant_i32(frac_bits);
 
     /* Switch on op:U:sx bits */
     switch (a->opc) {
@@ -3387,7 +3359,6 @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a)
 
     vfp_store_reg32(vd, a->vd);
     tcg_temp_free_i32(vd);
-    tcg_temp_free_i32(shift);
     tcg_temp_free_ptr(fpst);
     return true;
 }
@@ -3418,7 +3389,7 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
     vfp_load_reg64(vd, a->vd);
 
     fpst = fpstatus_ptr(FPST_FPCR);
-    shift = tcg_const_i32(frac_bits);
+    shift = tcg_constant_i32(frac_bits);
 
     /* Switch on op:U:sx bits */
     switch (a->opc) {
@@ -3452,7 +3423,6 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
 
     vfp_store_reg64(vd, a->vd);
     tcg_temp_free_i64(vd);
-    tcg_temp_free_i32(shift);
     tcg_temp_free_ptr(fpst);
     return true;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 23/60] target/arm: Use tcg_constant_i32 in translate.h
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (21 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 22/60] target/arm: Use tcg_constant in translate-vfp.c Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:11   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 24/60] target/arm: Split out cpregs.h Richard Henderson
                   ` (37 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.h | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 050d80f6f9..6f0ebdc88e 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -332,16 +332,9 @@ static inline void gen_ss_advance(DisasContext *s)
 static inline void gen_exception(int excp, uint32_t syndrome,
                                  uint32_t target_el)
 {
-    TCGv_i32 tcg_excp = tcg_const_i32(excp);
-    TCGv_i32 tcg_syn = tcg_const_i32(syndrome);
-    TCGv_i32 tcg_el = tcg_const_i32(target_el);
-
-    gen_helper_exception_with_syndrome(cpu_env, tcg_excp,
-                                       tcg_syn, tcg_el);
-
-    tcg_temp_free_i32(tcg_el);
-    tcg_temp_free_i32(tcg_syn);
-    tcg_temp_free_i32(tcg_excp);
+    gen_helper_exception_with_syndrome(cpu_env, tcg_constant_i32(excp),
+                                       tcg_constant_i32(syndrome),
+                                       tcg_constant_i32(target_el));
 }
 
 /* Generate an architectural singlestep exception */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 24/60] target/arm: Split out cpregs.h
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (22 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 23/60] target/arm: Use tcg_constant_i32 in translate.h Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-21 19:14   ` Peter Maydell
  2022-04-22 15:21   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg Richard Henderson
                   ` (36 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Move ARMCPRegInfo and all related declarations to a new
internal header, out of the public cpu.h.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h        | 413 +++++++++++++++++++++++++++++++++++++
 target/arm/cpu.h           | 368 ---------------------------------
 hw/arm/pxa2xx.c            |   1 +
 hw/arm/pxa2xx_pic.c        |   1 +
 hw/intc/arm_gicv3_cpuif.c  |   1 +
 hw/intc/arm_gicv3_kvm.c    |   2 +
 target/arm/cpu.c           |   1 +
 target/arm/cpu64.c         |   1 +
 target/arm/cpu_tcg.c       |   1 +
 target/arm/gdbstub.c       |   3 +-
 target/arm/helper.c        |   1 +
 target/arm/op_helper.c     |   1 +
 target/arm/translate-a64.c |   4 +-
 target/arm/translate.c     |   3 +-
 14 files changed, 427 insertions(+), 374 deletions(-)
 create mode 100644 target/arm/cpregs.h

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
new file mode 100644
index 0000000000..005aa2d3a5
--- /dev/null
+++ b/target/arm/cpregs.h
@@ -0,0 +1,413 @@
+/*
+ * QEMU ARM CP Register access and descriptions
+ *
+ * Copyright (c) 2022 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * <http://www.gnu.org/licenses/gpl-2.0.html>
+ */
+
+#ifndef TARGET_ARM_CPREGS_H
+#define TARGET_ARM_CPREGS_H 1
+
+/*
+ * ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
+ * special-behaviour cp reg and bits [11..8] indicate what behaviour
+ * it has. Otherwise it is a simple cp reg, where CONST indicates that
+ * TCG can assume the value to be constant (ie load at translate time)
+ * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
+ * indicates that the TB should not be ended after a write to this register
+ * (the default is that the TB ends after cp writes). OVERRIDE permits
+ * a register definition to override a previous definition for the
+ * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
+ * old must have the OVERRIDE bit set.
+ * ALIAS indicates that this register is an alias view of some underlying
+ * state which is also visible via another register, and that the other
+ * register is handling migration and reset; registers marked ALIAS will not be
+ * migrated but may have their state set by syncing of register state from KVM.
+ * NO_RAW indicates that this register has no underlying state and does not
+ * support raw access for state saving/loading; it will not be used for either
+ * migration or KVM state synchronization. (Typically this is for "registers"
+ * which are actually used as instructions for cache maintenance and so on.)
+ * IO indicates that this register does I/O and therefore its accesses
+ * need to be marked with gen_io_start() and also end the TB. In particular,
+ * registers which implement clocks or timers require this.
+ * RAISES_EXC is for when the read or write hook might raise an exception;
+ * the generated code will synchronize the CPU state before calling the hook
+ * so that it is safe for the hook to call raise_exception().
+ * NEWEL is for writes to registers that might change the exception
+ * level - typically on older ARM chips. For those cases we need to
+ * re-read the new el when recomputing the translation flags.
+ */
+#define ARM_CP_SPECIAL           0x0001
+#define ARM_CP_CONST             0x0002
+#define ARM_CP_64BIT             0x0004
+#define ARM_CP_SUPPRESS_TB_END   0x0008
+#define ARM_CP_OVERRIDE          0x0010
+#define ARM_CP_ALIAS             0x0020
+#define ARM_CP_IO                0x0040
+#define ARM_CP_NO_RAW            0x0080
+#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
+#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
+#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
+#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
+#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
+#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
+#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
+#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
+#define ARM_CP_FPU               0x1000
+#define ARM_CP_SVE               0x2000
+#define ARM_CP_NO_GDB            0x4000
+#define ARM_CP_RAISES_EXC        0x8000
+#define ARM_CP_NEWEL             0x10000
+/* Used only as a terminator for ARMCPRegInfo lists */
+#define ARM_CP_SENTINEL          0xfffff
+/* Mask of only the flag bits in a type field */
+#define ARM_CP_FLAG_MASK         0x1f0ff
+
+/*
+ * Valid values for ARMCPRegInfo state field, indicating which of
+ * the AArch32 and AArch64 execution states this register is visible in.
+ * If the reginfo doesn't explicitly specify then it is AArch32 only.
+ * If the reginfo is declared to be visible in both states then a second
+ * reginfo is synthesised for the AArch32 view of the AArch64 register,
+ * such that the AArch32 view is the lower 32 bits of the AArch64 one.
+ * Note that we rely on the values of these enums as we iterate through
+ * the various states in some places.
+ */
+enum {
+    ARM_CP_STATE_AA32 = 0,
+    ARM_CP_STATE_AA64 = 1,
+    ARM_CP_STATE_BOTH = 2,
+};
+
+/*
+ * ARM CP register secure state flags.  These flags identify security state
+ * attributes for a given CP register entry.
+ * The existence of both or neither secure and non-secure flags indicates that
+ * the register has both a secure and non-secure hash entry.  A single one of
+ * these flags causes the register to only be hashed for the specified
+ * security state.
+ * Although definitions may have any combination of the S/NS bits, each
+ * registered entry will only have one to identify whether the entry is secure
+ * or non-secure.
+ */
+enum {
+    ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
+    ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
+};
+
+/*
+ * Return true if cptype is a valid type field. This is used to try to
+ * catch errors where the sentinel has been accidentally left off the end
+ * of a list of registers.
+ */
+static inline bool cptype_valid(int cptype)
+{
+    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
+        || ((cptype & ARM_CP_SPECIAL) &&
+            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
+}
+
+/*
+ * Access rights:
+ * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
+ * defines as PL0 (user), PL1 (fiq/irq/svc/abt/und/sys, ie privileged), and
+ * PL2 (hyp). The other level which has Read and Write bits is Secure PL1
+ * (ie any of the privileged modes in Secure state, or Monitor mode).
+ * If a register is accessible in one privilege level it's always accessible
+ * in higher privilege levels too. Since "Secure PL1" also follows this rule
+ * (ie anything visible in PL2 is visible in S-PL1, some things are only
+ * visible in S-PL1) but "Secure PL1" is a bit of a mouthful, we bend the
+ * terminology a little and call this PL3.
+ * In AArch64 things are somewhat simpler as the PLx bits line up exactly
+ * with the ELx exception levels.
+ *
+ * If access permissions for a register are more complex than can be
+ * described with these bits, then use a laxer set of restrictions, and
+ * do the more restrictive/complex check inside a helper function.
+ */
+#define PL3_R 0x80
+#define PL3_W 0x40
+#define PL2_R (0x20 | PL3_R)
+#define PL2_W (0x10 | PL3_W)
+#define PL1_R (0x08 | PL2_R)
+#define PL1_W (0x04 | PL2_W)
+#define PL0_R (0x02 | PL1_R)
+#define PL0_W (0x01 | PL1_W)
+
+/*
+ * For user-mode some registers are accessible to EL0 via a kernel
+ * trap-and-emulate ABI. In this case we define the read permissions
+ * as actually being PL0_R. However some bits of any given register
+ * may still be masked.
+ */
+#ifdef CONFIG_USER_ONLY
+#define PL0U_R PL0_R
+#else
+#define PL0U_R PL1_R
+#endif
+
+#define PL3_RW (PL3_R | PL3_W)
+#define PL2_RW (PL2_R | PL2_W)
+#define PL1_RW (PL1_R | PL1_W)
+#define PL0_RW (PL0_R | PL0_W)
+
+typedef enum CPAccessResult {
+    /* Access is permitted */
+    CP_ACCESS_OK = 0,
+    /*
+     * Access fails due to a configurable trap or enable which would
+     * result in a categorized exception syndrome giving information about
+     * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
+     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
+     * PL1 if in EL0, otherwise to the current EL).
+     */
+    CP_ACCESS_TRAP = 1,
+    /*
+     * Access fails and results in an exception syndrome 0x0 ("uncategorized").
+     * Note that this is not a catch-all case -- the set of cases which may
+     * result in this failure is specifically defined by the architecture.
+     */
+    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
+    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
+    CP_ACCESS_TRAP_EL2 = 3,
+    CP_ACCESS_TRAP_EL3 = 4,
+    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
+} CPAccessResult;
+
+typedef struct ARMCPRegInfo ARMCPRegInfo;
+
+/*
+ * Access functions for coprocessor registers. These cannot fail and
+ * may not raise exceptions.
+ */
+typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
+typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
+                       uint64_t value);
+/* Access permission check functions for coprocessor registers. */
+typedef CPAccessResult CPAccessFn(CPUARMState *env,
+                                  const ARMCPRegInfo *opaque,
+                                  bool isread);
+/* Hook function for register reset */
+typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
+
+#define CP_ANY 0xff
+
+/* Definition of an ARM coprocessor register */
+struct ARMCPRegInfo {
+    /* Name of register (useful mainly for debugging, need not be unique) */
+    const char *name;
+    /*
+     * Location of register: coprocessor number and (crn,crm,opc1,opc2)
+     * tuple. Any of crm, opc1 and opc2 may be CP_ANY to indicate a
+     * 'wildcard' field -- any value of that field in the MRC/MCR insn
+     * will be decoded to this register. The register read and write
+     * callbacks will be passed an ARMCPRegInfo with the crn/crm/opc1/opc2
+     * used by the program, so it is possible to register a wildcard and
+     * then behave differently on read/write if necessary.
+     * For 64 bit registers, only crm and opc1 are relevant; crn and opc2
+     * must both be zero.
+     * For AArch64-visible registers, opc0 is also used.
+     * Since there are no "coprocessors" in AArch64, cp is purely used as a
+     * way to distinguish (for KVM's benefit) guest-visible system registers
+     * from demuxed ones provided to preserve the "no side effects on
+     * KVM register read/write from QEMU" semantics. cp==0x13 is guest
+     * visible (to match KVM's encoding); cp==0 will be converted to
+     * cp==0x13 when the ARMCPRegInfo is registered, for convenience.
+     */
+    uint8_t cp;
+    uint8_t crn;
+    uint8_t crm;
+    uint8_t opc0;
+    uint8_t opc1;
+    uint8_t opc2;
+    /* Execution state in which this register is visible: ARM_CP_STATE_* */
+    int state;
+    /* Register type: ARM_CP_* bits/values */
+    int type;
+    /* Access rights: PL*_[RW] */
+    int access;
+    /* Security state: ARM_CP_SECSTATE_* bits/values */
+    int secure;
+    /*
+     * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
+     * this register was defined: can be used to hand data through to the
+     * register read/write functions, since they are passed the ARMCPRegInfo*.
+     */
+    void *opaque;
+    /*
+     * Value of this register, if it is ARM_CP_CONST. Otherwise, if
+     * fieldoffset is non-zero, the reset value of the register.
+     */
+    uint64_t resetvalue;
+    /*
+     * Offset of the field in CPUARMState for this register.
+     * This is not needed if either:
+     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
+     *  2. both readfn and writefn are specified
+     */
+    ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
+
+    /*
+     * Offsets of the secure and non-secure fields in CPUARMState for the
+     * register if it is banked.  These fields are only used during the static
+     * registration of a register.  During hashing the bank associated
+     * with a given security state is copied to fieldoffset which is used from
+     * there on out.
+     *
+     * It is expected that register definitions use either fieldoffset or
+     * bank_fieldoffsets in the definition but not both.  It is also expected
+     * that both bank offsets are set when defining a banked register.  This
+     * use indicates that a register is banked.
+     */
+    ptrdiff_t bank_fieldoffsets[2];
+
+    /*
+     * Function for making any access checks for this register in addition to
+     * those specified by the 'access' permissions bits. If NULL, no extra
+     * checks required. The access check is performed at runtime, not at
+     * translate time.
+     */
+    CPAccessFn *accessfn;
+    /*
+     * Function for handling reads of this register. If NULL, then reads
+     * will be done by loading from the offset into CPUARMState specified
+     * by fieldoffset.
+     */
+    CPReadFn *readfn;
+    /*
+     * Function for handling writes of this register. If NULL, then writes
+     * will be done by writing to the offset into CPUARMState specified
+     * by fieldoffset.
+     */
+    CPWriteFn *writefn;
+    /*
+     * Function for doing a "raw" read; used when we need to copy
+     * coprocessor state to the kernel for KVM or out for
+     * migration. This only needs to be provided if there is also a
+     * readfn and it has side effects (for instance clear-on-read bits).
+     */
+    CPReadFn *raw_readfn;
+    /*
+     * Function for doing a "raw" write; used when we need to copy KVM
+     * kernel coprocessor state into userspace, or for inbound
+     * migration. This only needs to be provided if there is also a
+     * writefn and it masks out "unwritable" bits or has write-one-to-clear
+     * or similar behaviour.
+     */
+    CPWriteFn *raw_writefn;
+    /*
+     * Function for resetting the register. If NULL, then reset will be done
+     * by writing resetvalue to the field specified in fieldoffset. If
+     * fieldoffset is 0 then no reset will be done.
+     */
+    CPResetFn *resetfn;
+
+    /*
+     * "Original" writefn and readfn.
+     * For ARMv8.1-VHE register aliases, we overwrite the read/write
+     * accessor functions of various EL1/EL0 to perform the runtime
+     * check for which sysreg should actually be modified, and then
+     * forwards the operation.  Before overwriting the accessors,
+     * the original function is copied here, so that accesses that
+     * really do go to the EL1/EL0 version proceed normally.
+     * (The corresponding EL2 register is linked via opaque.)
+     */
+    CPReadFn *orig_readfn;
+    CPWriteFn *orig_writefn;
+};
+
+/*
+ * Macros which are lvalues for the field in CPUARMState for the
+ * ARMCPRegInfo *ri.
+ */
+#define CPREG_FIELD32(env, ri) \
+    (*(uint32_t *)((char *)(env) + (ri)->fieldoffset))
+#define CPREG_FIELD64(env, ri) \
+    (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
+
+#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
+
+void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
+                                    const ARMCPRegInfo *regs, void *opaque);
+void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
+                                       const ARMCPRegInfo *regs, void *opaque);
+static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
+{
+    define_arm_cp_regs_with_opaque(cpu, regs, 0);
+}
+static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
+{
+    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
+}
+const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
+
+/*
+ * Definition of an ARM co-processor register as viewed from
+ * userspace. This is used for presenting sanitised versions of
+ * registers to userspace when emulating the Linux AArch64 CPU
+ * ID/feature ABI (advertised as HWCAP_CPUID).
+ */
+typedef struct ARMCPRegUserSpaceInfo {
+    /* Name of register */
+    const char *name;
+
+    /* Is the name actually a glob pattern */
+    bool is_glob;
+
+    /* Only some bits are exported to user space */
+    uint64_t exported_bits;
+
+    /* Fixed bits are applied after the mask */
+    uint64_t fixed_bits;
+} ARMCPRegUserSpaceInfo;
+
+#define REGUSERINFO_SENTINEL { .name = NULL }
+
+void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
+
+/* CPWriteFn that can be used to implement writes-ignored behaviour */
+void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
+                         uint64_t value);
+/* CPReadFn that can be used for read-as-zero behaviour */
+uint64_t arm_cp_read_zero(CPUARMState *env, const ARMCPRegInfo *ri);
+
+/*
+ * CPResetFn that does nothing, for use if no reset is required even
+ * if fieldoffset is non zero.
+ */
+void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque);
+
+/*
+ * Return true if this reginfo struct's field in the cpu state struct
+ * is 64 bits wide.
+ */
+static inline bool cpreg_field_is_64bit(const ARMCPRegInfo *ri)
+{
+    return (ri->state == ARM_CP_STATE_AA64) || (ri->type & ARM_CP_64BIT);
+}
+
+static inline bool cp_access_ok(int current_el,
+                                const ARMCPRegInfo *ri, int isread)
+{
+    return (ri->access >> ((current_el * 2) + isread)) & 1;
+}
+
+/* Raw read of a coprocessor register (as needed for migration, etc) */
+uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri);
+
+#endif /* TARGET_ARM_CPREGS_H */
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e7f669d0a9..76c0dd37cd 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2594,144 +2594,6 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
     return kvmid;
 }
 
-/* ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
- * special-behaviour cp reg and bits [11..8] indicate what behaviour
- * it has. Otherwise it is a simple cp reg, where CONST indicates that
- * TCG can assume the value to be constant (ie load at translate time)
- * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
- * indicates that the TB should not be ended after a write to this register
- * (the default is that the TB ends after cp writes). OVERRIDE permits
- * a register definition to override a previous definition for the
- * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
- * old must have the OVERRIDE bit set.
- * ALIAS indicates that this register is an alias view of some underlying
- * state which is also visible via another register, and that the other
- * register is handling migration and reset; registers marked ALIAS will not be
- * migrated but may have their state set by syncing of register state from KVM.
- * NO_RAW indicates that this register has no underlying state and does not
- * support raw access for state saving/loading; it will not be used for either
- * migration or KVM state synchronization. (Typically this is for "registers"
- * which are actually used as instructions for cache maintenance and so on.)
- * IO indicates that this register does I/O and therefore its accesses
- * need to be marked with gen_io_start() and also end the TB. In particular,
- * registers which implement clocks or timers require this.
- * RAISES_EXC is for when the read or write hook might raise an exception;
- * the generated code will synchronize the CPU state before calling the hook
- * so that it is safe for the hook to call raise_exception().
- * NEWEL is for writes to registers that might change the exception
- * level - typically on older ARM chips. For those cases we need to
- * re-read the new el when recomputing the translation flags.
- */
-#define ARM_CP_SPECIAL           0x0001
-#define ARM_CP_CONST             0x0002
-#define ARM_CP_64BIT             0x0004
-#define ARM_CP_SUPPRESS_TB_END   0x0008
-#define ARM_CP_OVERRIDE          0x0010
-#define ARM_CP_ALIAS             0x0020
-#define ARM_CP_IO                0x0040
-#define ARM_CP_NO_RAW            0x0080
-#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
-#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
-#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
-#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
-#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
-#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
-#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
-#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
-#define ARM_CP_FPU               0x1000
-#define ARM_CP_SVE               0x2000
-#define ARM_CP_NO_GDB            0x4000
-#define ARM_CP_RAISES_EXC        0x8000
-#define ARM_CP_NEWEL             0x10000
-/* Used only as a terminator for ARMCPRegInfo lists */
-#define ARM_CP_SENTINEL          0xfffff
-/* Mask of only the flag bits in a type field */
-#define ARM_CP_FLAG_MASK         0x1f0ff
-
-/* Valid values for ARMCPRegInfo state field, indicating which of
- * the AArch32 and AArch64 execution states this register is visible in.
- * If the reginfo doesn't explicitly specify then it is AArch32 only.
- * If the reginfo is declared to be visible in both states then a second
- * reginfo is synthesised for the AArch32 view of the AArch64 register,
- * such that the AArch32 view is the lower 32 bits of the AArch64 one.
- * Note that we rely on the values of these enums as we iterate through
- * the various states in some places.
- */
-enum {
-    ARM_CP_STATE_AA32 = 0,
-    ARM_CP_STATE_AA64 = 1,
-    ARM_CP_STATE_BOTH = 2,
-};
-
-/* ARM CP register secure state flags.  These flags identify security state
- * attributes for a given CP register entry.
- * The existence of both or neither secure and non-secure flags indicates that
- * the register has both a secure and non-secure hash entry.  A single one of
- * these flags causes the register to only be hashed for the specified
- * security state.
- * Although definitions may have any combination of the S/NS bits, each
- * registered entry will only have one to identify whether the entry is secure
- * or non-secure.
- */
-enum {
-    ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
-    ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
-};
-
-/* Return true if cptype is a valid type field. This is used to try to
- * catch errors where the sentinel has been accidentally left off the end
- * of a list of registers.
- */
-static inline bool cptype_valid(int cptype)
-{
-    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
-        || ((cptype & ARM_CP_SPECIAL) &&
-            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
-}
-
-/* Access rights:
- * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
- * defines as PL0 (user), PL1 (fiq/irq/svc/abt/und/sys, ie privileged), and
- * PL2 (hyp). The other level which has Read and Write bits is Secure PL1
- * (ie any of the privileged modes in Secure state, or Monitor mode).
- * If a register is accessible in one privilege level it's always accessible
- * in higher privilege levels too. Since "Secure PL1" also follows this rule
- * (ie anything visible in PL2 is visible in S-PL1, some things are only
- * visible in S-PL1) but "Secure PL1" is a bit of a mouthful, we bend the
- * terminology a little and call this PL3.
- * In AArch64 things are somewhat simpler as the PLx bits line up exactly
- * with the ELx exception levels.
- *
- * If access permissions for a register are more complex than can be
- * described with these bits, then use a laxer set of restrictions, and
- * do the more restrictive/complex check inside a helper function.
- */
-#define PL3_R 0x80
-#define PL3_W 0x40
-#define PL2_R (0x20 | PL3_R)
-#define PL2_W (0x10 | PL3_W)
-#define PL1_R (0x08 | PL2_R)
-#define PL1_W (0x04 | PL2_W)
-#define PL0_R (0x02 | PL1_R)
-#define PL0_W (0x01 | PL1_W)
-
-/*
- * For user-mode some registers are accessible to EL0 via a kernel
- * trap-and-emulate ABI. In this case we define the read permissions
- * as actually being PL0_R. However some bits of any given register
- * may still be masked.
- */
-#ifdef CONFIG_USER_ONLY
-#define PL0U_R PL0_R
-#else
-#define PL0U_R PL1_R
-#endif
-
-#define PL3_RW (PL3_R | PL3_W)
-#define PL2_RW (PL2_R | PL2_W)
-#define PL1_RW (PL1_R | PL1_W)
-#define PL0_RW (PL0_R | PL0_W)
-
 /* Return the highest implemented Exception Level */
 static inline int arm_highest_el(CPUARMState *env)
 {
@@ -2783,236 +2645,6 @@ static inline int arm_current_el(CPUARMState *env)
     }
 }
 
-typedef struct ARMCPRegInfo ARMCPRegInfo;
-
-typedef enum CPAccessResult {
-    /* Access is permitted */
-    CP_ACCESS_OK = 0,
-    /* Access fails due to a configurable trap or enable which would
-     * result in a categorized exception syndrome giving information about
-     * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
-     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
-     * PL1 if in EL0, otherwise to the current EL).
-     */
-    CP_ACCESS_TRAP = 1,
-    /* Access fails and results in an exception syndrome 0x0 ("uncategorized").
-     * Note that this is not a catch-all case -- the set of cases which may
-     * result in this failure is specifically defined by the architecture.
-     */
-    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
-    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_EL2 = 3,
-    CP_ACCESS_TRAP_EL3 = 4,
-    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
-} CPAccessResult;
-
-/* Access functions for coprocessor registers. These cannot fail and
- * may not raise exceptions.
- */
-typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
-typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
-                       uint64_t value);
-/* Access permission check functions for coprocessor registers. */
-typedef CPAccessResult CPAccessFn(CPUARMState *env,
-                                  const ARMCPRegInfo *opaque,
-                                  bool isread);
-/* Hook function for register reset */
-typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
-
-#define CP_ANY 0xff
-
-/* Definition of an ARM coprocessor register */
-struct ARMCPRegInfo {
-    /* Name of register (useful mainly for debugging, need not be unique) */
-    const char *name;
-    /* Location of register: coprocessor number and (crn,crm,opc1,opc2)
-     * tuple. Any of crm, opc1 and opc2 may be CP_ANY to indicate a
-     * 'wildcard' field -- any value of that field in the MRC/MCR insn
-     * will be decoded to this register. The register read and write
-     * callbacks will be passed an ARMCPRegInfo with the crn/crm/opc1/opc2
-     * used by the program, so it is possible to register a wildcard and
-     * then behave differently on read/write if necessary.
-     * For 64 bit registers, only crm and opc1 are relevant; crn and opc2
-     * must both be zero.
-     * For AArch64-visible registers, opc0 is also used.
-     * Since there are no "coprocessors" in AArch64, cp is purely used as a
-     * way to distinguish (for KVM's benefit) guest-visible system registers
-     * from demuxed ones provided to preserve the "no side effects on
-     * KVM register read/write from QEMU" semantics. cp==0x13 is guest
-     * visible (to match KVM's encoding); cp==0 will be converted to
-     * cp==0x13 when the ARMCPRegInfo is registered, for convenience.
-     */
-    uint8_t cp;
-    uint8_t crn;
-    uint8_t crm;
-    uint8_t opc0;
-    uint8_t opc1;
-    uint8_t opc2;
-    /* Execution state in which this register is visible: ARM_CP_STATE_* */
-    int state;
-    /* Register type: ARM_CP_* bits/values */
-    int type;
-    /* Access rights: PL*_[RW] */
-    int access;
-    /* Security state: ARM_CP_SECSTATE_* bits/values */
-    int secure;
-    /* The opaque pointer passed to define_arm_cp_regs_with_opaque() when
-     * this register was defined: can be used to hand data through to the
-     * register read/write functions, since they are passed the ARMCPRegInfo*.
-     */
-    void *opaque;
-    /* Value of this register, if it is ARM_CP_CONST. Otherwise, if
-     * fieldoffset is non-zero, the reset value of the register.
-     */
-    uint64_t resetvalue;
-    /* Offset of the field in CPUARMState for this register.
-     *
-     * This is not needed if either:
-     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
-     *  2. both readfn and writefn are specified
-     */
-    ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
-
-    /* Offsets of the secure and non-secure fields in CPUARMState for the
-     * register if it is banked.  These fields are only used during the static
-     * registration of a register.  During hashing the bank associated
-     * with a given security state is copied to fieldoffset which is used from
-     * there on out.
-     *
-     * It is expected that register definitions use either fieldoffset or
-     * bank_fieldoffsets in the definition but not both.  It is also expected
-     * that both bank offsets are set when defining a banked register.  This
-     * use indicates that a register is banked.
-     */
-    ptrdiff_t bank_fieldoffsets[2];
-
-    /* Function for making any access checks for this register in addition to
-     * those specified by the 'access' permissions bits. If NULL, no extra
-     * checks required. The access check is performed at runtime, not at
-     * translate time.
-     */
-    CPAccessFn *accessfn;
-    /* Function for handling reads of this register. If NULL, then reads
-     * will be done by loading from the offset into CPUARMState specified
-     * by fieldoffset.
-     */
-    CPReadFn *readfn;
-    /* Function for handling writes of this register. If NULL, then writes
-     * will be done by writing to the offset into CPUARMState specified
-     * by fieldoffset.
-     */
-    CPWriteFn *writefn;
-    /* Function for doing a "raw" read; used when we need to copy
-     * coprocessor state to the kernel for KVM or out for
-     * migration. This only needs to be provided if there is also a
-     * readfn and it has side effects (for instance clear-on-read bits).
-     */
-    CPReadFn *raw_readfn;
-    /* Function for doing a "raw" write; used when we need to copy KVM
-     * kernel coprocessor state into userspace, or for inbound
-     * migration. This only needs to be provided if there is also a
-     * writefn and it masks out "unwritable" bits or has write-one-to-clear
-     * or similar behaviour.
-     */
-    CPWriteFn *raw_writefn;
-    /* Function for resetting the register. If NULL, then reset will be done
-     * by writing resetvalue to the field specified in fieldoffset. If
-     * fieldoffset is 0 then no reset will be done.
-     */
-    CPResetFn *resetfn;
-
-    /*
-     * "Original" writefn and readfn.
-     * For ARMv8.1-VHE register aliases, we overwrite the read/write
-     * accessor functions of various EL1/EL0 to perform the runtime
-     * check for which sysreg should actually be modified, and then
-     * forwards the operation.  Before overwriting the accessors,
-     * the original function is copied here, so that accesses that
-     * really do go to the EL1/EL0 version proceed normally.
-     * (The corresponding EL2 register is linked via opaque.)
-     */
-    CPReadFn *orig_readfn;
-    CPWriteFn *orig_writefn;
-};
-
-/* Macros which are lvalues for the field in CPUARMState for the
- * ARMCPRegInfo *ri.
- */
-#define CPREG_FIELD32(env, ri) \
-    (*(uint32_t *)((char *)(env) + (ri)->fieldoffset))
-#define CPREG_FIELD64(env, ri) \
-    (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
-
-#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
-
-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-                                    const ARMCPRegInfo *regs, void *opaque);
-void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
-                                       const ARMCPRegInfo *regs, void *opaque);
-static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
-{
-    define_arm_cp_regs_with_opaque(cpu, regs, 0);
-}
-static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
-{
-    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
-}
-const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
-
-/*
- * Definition of an ARM co-processor register as viewed from
- * userspace. This is used for presenting sanitised versions of
- * registers to userspace when emulating the Linux AArch64 CPU
- * ID/feature ABI (advertised as HWCAP_CPUID).
- */
-typedef struct ARMCPRegUserSpaceInfo {
-    /* Name of register */
-    const char *name;
-
-    /* Is the name actually a glob pattern */
-    bool is_glob;
-
-    /* Only some bits are exported to user space */
-    uint64_t exported_bits;
-
-    /* Fixed bits are applied after the mask */
-    uint64_t fixed_bits;
-} ARMCPRegUserSpaceInfo;
-
-#define REGUSERINFO_SENTINEL { .name = NULL }
-
-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
-
-/* CPWriteFn that can be used to implement writes-ignored behaviour */
-void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
-                         uint64_t value);
-/* CPReadFn that can be used for read-as-zero behaviour */
-uint64_t arm_cp_read_zero(CPUARMState *env, const ARMCPRegInfo *ri);
-
-/* CPResetFn that does nothing, for use if no reset is required even
- * if fieldoffset is non zero.
- */
-void arm_cp_reset_ignore(CPUARMState *env, const ARMCPRegInfo *opaque);
-
-/* Return true if this reginfo struct's field in the cpu state struct
- * is 64 bits wide.
- */
-static inline bool cpreg_field_is_64bit(const ARMCPRegInfo *ri)
-{
-    return (ri->state == ARM_CP_STATE_AA64) || (ri->type & ARM_CP_64BIT);
-}
-
-static inline bool cp_access_ok(int current_el,
-                                const ARMCPRegInfo *ri, int isread)
-{
-    return (ri->access >> ((current_el * 2) + isread)) & 1;
-}
-
-/* Raw read of a coprocessor register (as needed for migration, etc) */
-uint64_t read_raw_cp_reg(CPUARMState *env, const ARMCPRegInfo *ri);
-
 /**
  * write_list_to_cpustate
  * @cpu: ARMCPU
diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
index a6f938f115..0683714733 100644
--- a/hw/arm/pxa2xx.c
+++ b/hw/arm/pxa2xx.c
@@ -30,6 +30,7 @@
 #include "qemu/cutils.h"
 #include "qemu/log.h"
 #include "qom/object.h"
+#include "target/arm/cpregs.h"
 
 static struct {
     hwaddr io_base;
diff --git a/hw/arm/pxa2xx_pic.c b/hw/arm/pxa2xx_pic.c
index ed032fed54..b80d75d839 100644
--- a/hw/arm/pxa2xx_pic.c
+++ b/hw/arm/pxa2xx_pic.c
@@ -17,6 +17,7 @@
 #include "hw/sysbus.h"
 #include "migration/vmstate.h"
 #include "qom/object.h"
+#include "target/arm/cpregs.h"
 
 #define ICIP	0x00	/* Interrupt Controller IRQ Pending register */
 #define ICMR	0x04	/* Interrupt Controller Mask register */
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 1a3d440a54..d43ba970f9 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -20,6 +20,7 @@
 #include "gicv3_internal.h"
 #include "hw/irq.h"
 #include "cpu.h"
+#include "target/arm/cpregs.h"
 
 static GICv3CPUState *icc_cs_from_env(CPUARMState *env)
 {
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 5ec5ff9ef6..ed2a583b0c 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -31,6 +31,8 @@
 #include "vgic_common.h"
 #include "migration/blocker.h"
 #include "qom/object.h"
+#include "target/arm/cpregs.h"
+
 
 #ifdef DEBUG_GICV3_KVM
 #define DPRINTF(fmt, ...) \
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 561f180067..066fe6250a 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -43,6 +43,7 @@
 #include "kvm_arm.h"
 #include "disas/capstone.h"
 #include "fpu/softfloat.h"
+#include "cpregs.h"
 
 static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index eb44c05822..9561714c23 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -34,6 +34,7 @@
 #include "hvf_arm.h"
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
+#include "cpregs.h"
 
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 13d0e9b195..0e693b182e 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -18,6 +18,7 @@
 #if !defined(CONFIG_USER_ONLY)
 #include "hw/boards.h"
 #endif
+#include "cpregs.h"
 
 /* CPU models. These are not needed for the AArch64 linux-user build. */
 #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index ca1de47511..f01a126108 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -19,8 +19,9 @@
  */
 #include "qemu/osdep.h"
 #include "cpu.h"
-#include "internals.h"
 #include "exec/gdbstub.h"
+#include "internals.h"
+#include "cpregs.h"
 
 typedef struct RegisterSysregXmlParam {
     CPUState *cs;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 60d9233b7e..8928ffb6a7 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -37,6 +37,7 @@
 #include "exec/cpu_ldst.h"
 #include "semihosting/common-semi.h"
 #endif
+#include "cpregs.h"
 
 #define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
 #define PMCR_NUM_COUNTERS 4 /* QEMU IMPDEF choice */
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 2b87e8808b..67be91c732 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -23,6 +23,7 @@
 #include "internals.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
+#include "cpregs.h"
 
 #define SIGNBIT (uint32_t)0x80000000
 #define SIGNBIT64 ((uint64_t)1 << 63)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 3867910ed4..92b91f0307 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -27,14 +27,12 @@
 #include "translate.h"
 #include "internals.h"
 #include "qemu/host-utils.h"
-
 #include "semihosting/semihost.h"
 #include "exec/gen-icount.h"
-
 #include "exec/helper-proto.h"
 #include "exec/helper-gen.h"
 #include "exec/log.h"
-
+#include "cpregs.h"
 #include "translate-a64.h"
 #include "qemu/atomic128.h"
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5cb4b3da33..1fe48e4e1c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -30,11 +30,10 @@
 #include "qemu/bitops.h"
 #include "arm_ldst.h"
 #include "semihosting/semihost.h"
-
 #include "exec/helper-proto.h"
 #include "exec/helper-gen.h"
-
 #include "exec/log.h"
+#include "cpregs.h"
 
 
 #define ENABLE_ARCH_4T    arm_dc_feature(s, ARM_FEATURE_V4T)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (23 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 24/60] target/arm: Split out cpregs.h Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:32   ` Peter Maydell
  2022-04-22 15:31   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h Richard Henderson
                   ` (35 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Rearrange the values of the enumerators of CPAccessResult
so that we may directly extract the target el. For the two
special cases in access_check_cp_reg, use CPAccessResult.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h    | 26 ++++++++++++--------
 target/arm/op_helper.c | 56 +++++++++++++++++++++---------------------
 2 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index 005aa2d3a5..700fcc1478 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -167,26 +167,32 @@ static inline bool cptype_valid(int cptype)
 typedef enum CPAccessResult {
     /* Access is permitted */
     CP_ACCESS_OK = 0,
+
+    /*
+     * Combined with one of the following, the low 2 bits indicate the
+     * target exception level.  If 0, the exception is taken to the usual
+     * target EL (EL1 or PL1 if in EL0, otherwise to the current EL).
+     */
+    CP_ACCESS_EL_MASK = 3,
+
     /*
      * Access fails due to a configurable trap or enable which would
      * result in a categorized exception syndrome giving information about
      * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
-     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
-     * PL1 if in EL0, otherwise to the current EL).
+     * 0xc or 0x18).
      */
-    CP_ACCESS_TRAP = 1,
+    CP_ACCESS_TRAP = (1 << 2),
+    CP_ACCESS_TRAP_EL2 = CP_ACCESS_TRAP | 2,
+    CP_ACCESS_TRAP_EL3 = CP_ACCESS_TRAP | 3,
+
     /*
      * Access fails and results in an exception syndrome 0x0 ("uncategorized").
      * Note that this is not a catch-all case -- the set of cases which may
      * result in this failure is specifically defined by the architecture.
      */
-    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
-    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_EL2 = 3,
-    CP_ACCESS_TRAP_EL3 = 4,
-    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
-    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
+    CP_ACCESS_TRAP_UNCATEGORIZED = (2 << 2),
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = CP_ACCESS_TRAP_UNCATEGORIZED | 2,
+    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = CP_ACCESS_TRAP_UNCATEGORIZED | 3,
 } CPAccessResult;
 
 typedef struct ARMCPRegInfo ARMCPRegInfo;
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 67be91c732..76499ffa14 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -632,11 +632,13 @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
                                  uint32_t isread)
 {
     const ARMCPRegInfo *ri = rip;
+    CPAccessResult res = CP_ACCESS_OK;
     int target_el;
 
     if (arm_feature(env, ARM_FEATURE_XSCALE) && ri->cp < 14
         && extract32(env->cp15.c15_cpar, ri->cp, 1) == 0) {
-        raise_exception(env, EXCP_UDEF, syndrome, exception_target_el(env));
+        res = CP_ACCESS_TRAP;
+        goto fail;
     }
 
     /*
@@ -655,48 +657,46 @@ void HELPER(access_check_cp_reg)(CPUARMState *env, void *rip, uint32_t syndrome,
         mask &= ~((1 << 4) | (1 << 14));
 
         if (env->cp15.hstr_el2 & mask) {
-            target_el = 2;
-            goto exept;
+            res = CP_ACCESS_TRAP_EL2;
+            goto fail;
         }
     }
 
-    if (!ri->accessfn) {
+    if (ri->accessfn) {
+        res = ri->accessfn(env, ri, isread);
+    }
+    if (likely(res == CP_ACCESS_OK)) {
         return;
     }
 
-    switch (ri->accessfn(env, ri, isread)) {
-    case CP_ACCESS_OK:
-        return;
+ fail:
+    switch (res & ~CP_ACCESS_EL_MASK) {
     case CP_ACCESS_TRAP:
-        target_el = exception_target_el(env);
-        break;
-    case CP_ACCESS_TRAP_EL2:
-        /* Requesting a trap to EL2 when we're in EL3 is
-         * a bug in the access function.
-         */
-        assert(arm_current_el(env) != 3);
-        target_el = 2;
-        break;
-    case CP_ACCESS_TRAP_EL3:
-        target_el = 3;
         break;
     case CP_ACCESS_TRAP_UNCATEGORIZED:
-        target_el = exception_target_el(env);
-        syndrome = syn_uncategorized();
-        break;
-    case CP_ACCESS_TRAP_UNCATEGORIZED_EL2:
-        target_el = 2;
-        syndrome = syn_uncategorized();
-        break;
-    case CP_ACCESS_TRAP_UNCATEGORIZED_EL3:
-        target_el = 3;
         syndrome = syn_uncategorized();
         break;
     default:
         g_assert_not_reached();
     }
 
-exept:
+    target_el = res & CP_ACCESS_EL_MASK;
+    switch (target_el) {
+    case 0:
+        target_el = exception_target_el(env);
+        break;
+    case 2:
+        assert(arm_current_el(env) != 3);
+        assert(arm_is_el2_enabled(env));
+        break;
+    case 3:
+        assert(arm_feature(env, ARM_FEATURE_EL3));
+        break;
+    default:
+        /* No "direct" traps to EL1 */
+        g_assert_not_reached();
+    }
+
     raise_exception(env, EXCP_UDEF, syndrome, target_el);
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (24 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:37   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 27/60] target/arm: Make some more cpreg data static const Richard Henderson
                   ` (34 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Remove a possible source of error by removing REGINFO_SENTINEL
and using ARRAY_SIZE (convinently hidden inside a macro) to
find the end of the set of regs being registered or modified.

The space saved by not having the extra array element reduces
the executable's .data.rel.ro section by about 9k.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h       |  57 +++++++++++---------
 hw/arm/pxa2xx.c           |   1 -
 hw/arm/pxa2xx_pic.c       |   1 -
 hw/intc/arm_gicv3_cpuif.c |   5 --
 hw/intc/arm_gicv3_kvm.c   |   1 -
 target/arm/cpu64.c        |   1 -
 target/arm/cpu_tcg.c      |   4 --
 target/arm/helper.c       | 111 ++++++++------------------------------
 8 files changed, 52 insertions(+), 129 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index 700fcc1478..bd321d6d23 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -71,8 +71,6 @@
 #define ARM_CP_NO_GDB            0x4000
 #define ARM_CP_RAISES_EXC        0x8000
 #define ARM_CP_NEWEL             0x10000
-/* Used only as a terminator for ARMCPRegInfo lists */
-#define ARM_CP_SENTINEL          0xfffff
 /* Mask of only the flag bits in a type field */
 #define ARM_CP_FLAG_MASK         0x1f0ff
 
@@ -108,18 +106,6 @@ enum {
     ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
 };
 
-/*
- * Return true if cptype is a valid type field. This is used to try to
- * catch errors where the sentinel has been accidentally left off the end
- * of a list of registers.
- */
-static inline bool cptype_valid(int cptype)
-{
-    return ((cptype & ~ARM_CP_FLAG_MASK) == 0)
-        || ((cptype & ARM_CP_SPECIAL) &&
-            ((cptype & ~ARM_CP_FLAG_MASK) <= ARM_LAST_SPECIAL));
-}
-
 /*
  * Access rights:
  * We define bits for Read and Write access for what rev C of the v7-AR ARM ARM
@@ -346,20 +332,31 @@ struct ARMCPRegInfo {
 #define CPREG_FIELD64(env, ri) \
     (*(uint64_t *)((char *)(env) + (ri)->fieldoffset))
 
-#define REGINFO_SENTINEL { .type = ARM_CP_SENTINEL }
+void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu, const ARMCPRegInfo *reg,
+                                       void *opaque);
 
-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-                                    const ARMCPRegInfo *regs, void *opaque);
-void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
-                                       const ARMCPRegInfo *regs, void *opaque);
-static inline void define_arm_cp_regs(ARMCPU *cpu, const ARMCPRegInfo *regs)
-{
-    define_arm_cp_regs_with_opaque(cpu, regs, 0);
-}
 static inline void define_one_arm_cp_reg(ARMCPU *cpu, const ARMCPRegInfo *regs)
 {
-    define_one_arm_cp_reg_with_opaque(cpu, regs, 0);
+    define_one_arm_cp_reg_with_opaque(cpu, regs, NULL);
 }
+
+void define_arm_cp_regs_with_opaque_len(ARMCPU *cpu, const ARMCPRegInfo *regs,
+                                        void *opaque, size_t len);
+
+#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
+    do {                                                                \
+        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
+        if (ARRAY_SIZE(REGS) == 1) {                                    \
+            define_one_arm_cp_reg_with_opaque(CPU, REGS, OPAQUE);       \
+        } else {                                                        \
+            define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,       \
+                                               ARRAY_SIZE(REGS));       \
+        }                                                               \
+    } while (0)
+
+#define define_arm_cp_regs(CPU, REGS) \
+    define_arm_cp_regs_with_opaque(CPU, REGS, NULL)
+
 const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp);
 
 /*
@@ -382,9 +379,17 @@ typedef struct ARMCPRegUserSpaceInfo {
     uint64_t fixed_bits;
 } ARMCPRegUserSpaceInfo;
 
-#define REGUSERINFO_SENTINEL { .name = NULL }
+void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
+                                 const ARMCPRegUserSpaceInfo *mods,
+                                 size_t mods_len);
 
-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods);
+#define modify_arm_cp_regs(REGS, MODS)                                  \
+    do {                                                                \
+        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
+        QEMU_BUILD_BUG_ON(ARRAY_SIZE(MODS) == 0);                       \
+        modify_arm_cp_regs_with_len(REGS, ARRAY_SIZE(REGS),             \
+                                    MODS, ARRAY_SIZE(MODS));            \
+    } while (0)
 
 /* CPWriteFn that can be used to implement writes-ignored behaviour */
 void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
diff --git a/hw/arm/pxa2xx.c b/hw/arm/pxa2xx.c
index 0683714733..f4f687df68 100644
--- a/hw/arm/pxa2xx.c
+++ b/hw/arm/pxa2xx.c
@@ -384,7 +384,6 @@ static const ARMCPRegInfo pxa_cp_reginfo[] = {
     { .name = "PWRMODE", .cp = 14, .crn = 7, .crm = 0, .opc1 = 0, .opc2 = 0,
       .access = PL1_RW, .type = ARM_CP_IO,
       .readfn = arm_cp_read_zero, .writefn = pxa2xx_pwrmode_write },
-    REGINFO_SENTINEL
 };
 
 static void pxa2xx_setup_cp14(PXA2xxState *s)
diff --git a/hw/arm/pxa2xx_pic.c b/hw/arm/pxa2xx_pic.c
index b80d75d839..47132ab982 100644
--- a/hw/arm/pxa2xx_pic.c
+++ b/hw/arm/pxa2xx_pic.c
@@ -257,7 +257,6 @@ static const ARMCPRegInfo pxa_pic_cp_reginfo[] = {
     REGINFO_FOR_PIC_CP("ICLR2", 8),
     REGINFO_FOR_PIC_CP("ICFP2", 9),
     REGINFO_FOR_PIC_CP("ICPR2", 0xa),
-    REGINFO_SENTINEL
 };
 
 static const MemoryRegionOps pxa2xx_pic_ops = {
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index d43ba970f9..a4c08f2ff9 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -2309,7 +2309,6 @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
       .readfn = icc_igrpen1_el3_read,
       .writefn = icc_igrpen1_el3_write,
     },
-    REGINFO_SENTINEL
 };
 
 static uint64_t ich_ap_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -2559,7 +2558,6 @@ static const ARMCPRegInfo gicv3_cpuif_hcr_reginfo[] = {
       .readfn = ich_vmcr_read,
       .writefn = ich_vmcr_write,
     },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo gicv3_cpuif_ich_apxr1_reginfo[] = {
@@ -2577,7 +2575,6 @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr1_reginfo[] = {
       .readfn = ich_ap_read,
       .writefn = ich_ap_write,
     },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
@@ -2609,7 +2606,6 @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
       .readfn = ich_ap_read,
       .writefn = ich_ap_write,
     },
-    REGINFO_SENTINEL
 };
 
 static void gicv3_cpuif_el_change_hook(ARMCPU *cpu, void *opaque)
@@ -2678,7 +2674,6 @@ void gicv3_init_cpuif(GICv3State *s)
                       .readfn = ich_lr_read,
                       .writefn = ich_lr_write,
                     },
-                    REGINFO_SENTINEL
                 };
                 define_arm_cp_regs(cpu, lr_regset);
             }
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index ed2a583b0c..36accae257 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -735,7 +735,6 @@ static const ARMCPRegInfo gicv3_cpuif_reginfo[] = {
        */
       .resetfn = arm_gicv3_icc_reset,
     },
-    REGINFO_SENTINEL
 };
 
 /**
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 9561714c23..c40373bd18 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -91,7 +91,6 @@ static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
     { .name = "L2MERRSR",
       .cp = 15, .opc1 = 3, .crm = 15,
       .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void aarch64_a57_initfn(Object *obj)
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 0e693b182e..9338088b22 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -264,7 +264,6 @@ static const ARMCPRegInfo cortexa8_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
     { .name = "L2AUXCR", .cp = 15, .crn = 9, .crm = 0, .opc1 = 1, .opc2 = 2,
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void cortex_a8_initfn(Object *obj)
@@ -332,7 +331,6 @@ static const ARMCPRegInfo cortexa9_cp_reginfo[] = {
       .access = PL1_RW, .resetvalue = 0, .type = ARM_CP_CONST },
     { .name = "TLB_ATTR", .cp = 15, .crn = 15, .crm = 7, .opc1 = 5, .opc2 = 2,
       .access = PL1_RW, .resetvalue = 0, .type = ARM_CP_CONST },
-    REGINFO_SENTINEL
 };
 
 static void cortex_a9_initfn(Object *obj)
@@ -398,7 +396,6 @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
 #endif
     { .name = "L2ECTLR", .cp = 15, .crn = 9, .crm = 0, .opc1 = 1, .opc2 = 3,
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void cortex_a7_initfn(Object *obj)
@@ -686,7 +683,6 @@ static const ARMCPRegInfo cortexr5_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_CONST },
     { .name = "DCACHE_INVAL", .cp = 15, .opc1 = 0, .crn = 15, .crm = 5,
       .opc2 = 0, .access = PL1_W, .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static void cortex_r5_initfn(Object *obj)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8928ffb6a7..d6c34c7826 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -674,7 +674,6 @@ static const ARMCPRegInfo cp_reginfo[] = {
       .secure = ARM_CP_SECSTATE_S,
       .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
       .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo not_v8_cp_reginfo[] = {
@@ -703,7 +702,6 @@ static const ARMCPRegInfo not_v8_cp_reginfo[] = {
     { .name = "CACHEMAINT", .cp = 15, .crn = 7, .crm = CP_ANY,
       .opc1 = 0, .opc2 = CP_ANY, .access = PL1_W,
       .type = ARM_CP_NOP | ARM_CP_OVERRIDE },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo not_v6_cp_reginfo[] = {
@@ -712,7 +710,6 @@ static const ARMCPRegInfo not_v6_cp_reginfo[] = {
      */
     { .name = "WFI_v5", .cp = 15, .crn = 7, .crm = 8, .opc1 = 0, .opc2 = 2,
       .access = PL1_W, .type = ARM_CP_WFI },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo not_v7_cp_reginfo[] = {
@@ -761,7 +758,6 @@ static const ARMCPRegInfo not_v7_cp_reginfo[] = {
       .opc1 = 0, .opc2 = 0, .access = PL1_RW, .type = ARM_CP_NOP },
     { .name = "NMRR", .cp = 15, .crn = 10, .crm = 2,
       .opc1 = 0, .opc2 = 1, .access = PL1_RW, .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -890,7 +886,6 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
       .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
       .resetfn = cpacr_reset, .writefn = cpacr_write, .readfn = cpacr_read },
-    REGINFO_SENTINEL
 };
 
 typedef struct pm_event {
@@ -2136,7 +2131,6 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
     { .name = "TLBIMVAA", .cp = 15, .opc1 = 0, .crn = 8, .crm = 7, .opc2 = 3,
       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbimvaa_write },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo v7mp_cp_reginfo[] = {
@@ -2153,7 +2147,6 @@ static const ARMCPRegInfo v7mp_cp_reginfo[] = {
     { .name = "TLBIMVAAIS", .cp = 15, .opc1 = 0, .crn = 8, .crm = 3, .opc2 = 3,
       .type = ARM_CP_NO_RAW, .access = PL1_W, .accessfn = access_ttlb,
       .writefn = tlbimvaa_is_write },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
@@ -2171,7 +2164,6 @@ static const ARMCPRegInfo pmovsset_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c9_pmovsr),
       .writefn = pmovsset_write,
       .raw_writefn = raw_write },
-    REGINFO_SENTINEL
 };
 
 static void teecr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2212,7 +2204,6 @@ static const ARMCPRegInfo t2ee_cp_reginfo[] = {
     { .name = "TEEHBR", .cp = 14, .crn = 1, .crm = 0, .opc1 = 6, .opc2 = 0,
       .access = PL0_RW, .fieldoffset = offsetof(CPUARMState, teehbr),
       .accessfn = teehbr_access, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo v6k_cp_reginfo[] = {
@@ -2244,7 +2235,6 @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
       .bank_fieldoffsets = { offsetoflow32(CPUARMState, cp15.tpidrprw_s),
                              offsetoflow32(CPUARMState, cp15.tpidrprw_ns) },
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -3092,7 +3082,6 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_SEC].cval),
       .writefn = gt_sec_cval_write, .raw_writefn = raw_write,
     },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult e2h_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3133,7 +3122,6 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
       .access = PL0_R, .type = ARM_CP_NO_RAW | ARM_CP_IO,
       .readfn = gt_virt_cnt_read,
     },
-    REGINFO_SENTINEL
 };
 
 #endif
@@ -3497,7 +3485,6 @@ static const ARMCPRegInfo vapa_cp_reginfo[] = {
       .access = PL1_W, .accessfn = ats_access,
       .writefn = ats_write, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC },
 #endif
-    REGINFO_SENTINEL
 };
 
 /* Return basic MPU access permission bits.  */
@@ -3620,7 +3607,6 @@ static const ARMCPRegInfo pmsav7_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, pmsav7.rnr[M_REG_NS]),
       .writefn = pmsav7_rgnr_write,
       .resetfn = arm_cp_reset_ignore },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
@@ -3671,7 +3657,6 @@ static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
     { .name = "946_PRBS7", .cp = 15, .crn = 6, .crm = 7, .opc1 = 0,
       .opc2 = CP_ANY, .access = PL1_RW, .resetvalue = 0,
       .fieldoffset = offsetof(CPUARMState, cp15.c6_region[7]) },
-    REGINFO_SENTINEL
 };
 
 static void vmsa_ttbcr_raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3825,7 +3810,6 @@ static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
       .access = PL1_RW, .accessfn = access_tvm_trvm,
       .fieldoffset = offsetof(CPUARMState, cp15.far_el[1]),
       .resetvalue = 0, },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo vmsa_cp_reginfo[] = {
@@ -3858,7 +3842,6 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       /* No offsetoflow32 -- pass the entire TCR to writefn/raw_writefn. */
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.tcr_el[3]),
                              offsetof(CPUARMState, cp15.tcr_el[1])} },
-    REGINFO_SENTINEL
 };
 
 /* Note that unlike TTBCR, writing to TTBCR2 does not require flushing
@@ -3943,7 +3926,6 @@ static const ARMCPRegInfo omap_cp_reginfo[] = {
     { .name = "C9", .cp = 15, .crn = 9,
       .crm = CP_ANY, .opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_RW,
       .type = ARM_CP_CONST | ARM_CP_OVERRIDE, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void xscale_cpar_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3976,7 +3958,6 @@ static const ARMCPRegInfo xscale_cp_reginfo[] = {
     { .name = "XSCALE_UNLOCK_DCACHE",
       .cp = 15, .opc1 = 0, .crn = 9, .crm = 2, .opc2 = 1,
       .access = PL1_W, .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
@@ -3990,7 +3971,6 @@ static const ARMCPRegInfo dummy_c15_cp_reginfo[] = {
       .access = PL1_RW,
       .type = ARM_CP_CONST | ARM_CP_NO_RAW | ARM_CP_OVERRIDE,
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo cache_dirty_status_cp_reginfo[] = {
@@ -3998,7 +3978,6 @@ static const ARMCPRegInfo cache_dirty_status_cp_reginfo[] = {
     { .name = "CDSR", .cp = 15, .crn = 7, .crm = 10, .opc1 = 0, .opc2 = 6,
       .access = PL1_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
@@ -4019,7 +3998,6 @@ static const ARMCPRegInfo cache_block_ops_cp_reginfo[] = {
       .access = PL0_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
     { .name = "CIDCR", .cp = 15, .crm = 14, .opc1 = 0,
       .access = PL1_W, .type = ARM_CP_NOP|ARM_CP_64BIT },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
@@ -4032,7 +4010,6 @@ static const ARMCPRegInfo cache_test_clean_cp_reginfo[] = {
     { .name = "TCI_DCACHE", .cp = 15, .crn = 7, .crm = 14, .opc1 = 0, .opc2 = 3,
       .access = PL0_R, .type = ARM_CP_CONST | ARM_CP_NO_RAW,
       .resetvalue = (1 << 30) },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo strongarm_cp_reginfo[] = {
@@ -4041,7 +4018,6 @@ static const ARMCPRegInfo strongarm_cp_reginfo[] = {
       .crm = CP_ANY, .opc1 = CP_ANY, .opc2 = CP_ANY,
       .access = PL1_RW, .resetvalue = 0,
       .type = ARM_CP_CONST | ARM_CP_OVERRIDE | ARM_CP_NO_RAW },
-    REGINFO_SENTINEL
 };
 
 static uint64_t midr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -4108,7 +4084,6 @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
       .writefn = vmsa_ttbr_write, },
-    REGINFO_SENTINEL
 };
 
 static uint64_t aa64_fpcr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -5127,7 +5102,6 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .access = PL1_RW, .accessfn = access_trap_aa32s_el1,
       .writefn = sdcr_write,
       .fieldoffset = offsetoflow32(CPUARMState, cp15.mdcr_el3) },
-    REGINFO_SENTINEL
 };
 
 /* Used to describe the behaviour of EL2 regs when EL2 does not exist.  */
@@ -5238,7 +5212,6 @@ static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
       .type = ARM_CP_CONST,
       .cp = 15, .opc1 = 4, .crn = 6, .crm = 0, .opc2 = 2,
       .access = PL2_RW, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 /* Ditto, but for registers which exist in ARMv8 but not v7 */
@@ -5247,7 +5220,6 @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
       .access = PL2_RW,
       .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
@@ -5680,7 +5652,6 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .cp = 15, .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 3,
       .access = PL2_RW,
       .fieldoffset = offsetof(CPUARMState, cp15.hstr_el2) },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
@@ -5690,7 +5661,6 @@ static const ARMCPRegInfo el2_v8_cp_reginfo[] = {
       .access = PL2_RW,
       .fieldoffset = offsetofhigh32(CPUARMState, cp15.hcr_el2),
       .writefn = hcr_writehigh },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult sel2_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -5711,7 +5681,6 @@ static const ARMCPRegInfo el2_sec_cp_reginfo[] = {
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 6, .opc2 = 2,
       .access = PL2_RW, .accessfn = sel2_access,
       .fieldoffset = offsetof(CPUARMState, cp15.vstcr_el2) },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult nsacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -5837,7 +5806,6 @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 7, .opc2 = 5,
       .access = PL3_W, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae3_write },
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -6123,7 +6091,6 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
       .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 0,
       .access = PL1_RW, .accessfn = access_tda,
       .type = ARM_CP_NOP },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
@@ -6132,7 +6099,6 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
     { .name = "DBGDSAR", .cp = 14, .crm = 2, .opc1 = 0,
       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 /* Return the exception level to which exceptions should be taken
@@ -6618,7 +6584,6 @@ static void define_debug_regs(ARMCPU *cpu)
               .fieldoffset = offsetof(CPUARMState, cp15.dbgbcr[i]),
               .writefn = dbgbcr_write, .raw_writefn = raw_write
             },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, dbgregs);
     }
@@ -6637,7 +6602,6 @@ static void define_debug_regs(ARMCPU *cpu)
               .fieldoffset = offsetof(CPUARMState, cp15.dbgwcr[i]),
               .writefn = dbgwcr_write, .raw_writefn = raw_write
             },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, dbgregs);
     }
@@ -6700,7 +6664,6 @@ static void define_pmu_regs(ARMCPU *cpu)
               .type = ARM_CP_IO,
               .readfn = pmevtyper_readfn, .writefn = pmevtyper_writefn,
               .raw_writefn = pmevtyper_rawwrite },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, pmev_regs);
         g_free(pmevcntr_name);
@@ -6718,7 +6681,6 @@ static void define_pmu_regs(ARMCPU *cpu)
               .cp = 15, .opc1 = 0, .crn = 9, .crm = 14, .opc2 = 5,
               .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
               .resetvalue = extract64(cpu->pmceid1, 32, 32) },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, v81_pmu_regs);
     }
@@ -6815,7 +6777,6 @@ static const ARMCPRegInfo lor_reginfo[] = {
       .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 7,
       .access = PL1_R, .accessfn = access_lor_ns,
       .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 #ifdef TARGET_AARCH64
@@ -6878,7 +6839,6 @@ static const ARMCPRegInfo pauth_reginfo[] = {
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 1, .opc2 = 3,
       .access = PL1_RW, .accessfn = access_pauth,
       .fieldoffset = offsetof(CPUARMState, keys.apib.hi) },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo tlbirange_reginfo[] = {
@@ -6990,7 +6950,6 @@ static const ARMCPRegInfo tlbirange_reginfo[] = {
       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 6, .opc2 = 5,
       .access = PL3_W, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_rvae3_write },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo tlbios_reginfo[] = {
@@ -7062,7 +7021,6 @@ static const ARMCPRegInfo tlbios_reginfo[] = {
       .opc0 = 1, .opc1 = 6, .crn = 8, .crm = 1, .opc2 = 5,
       .access = PL3_W, .type = ARM_CP_NO_RAW,
       .writefn = tlbi_aa64_vae3is_write },
-    REGINFO_SENTINEL
 };
 
 static uint64_t rndr_readfn(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -7101,7 +7059,6 @@ static const ARMCPRegInfo rndr_reginfo[] = {
       .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END | ARM_CP_IO,
       .opc0 = 3, .opc1 = 3, .crn = 2, .crm = 4, .opc2 = 1,
       .access = PL0_R, .readfn = rndr_readfn },
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -7137,7 +7094,6 @@ static const ARMCPRegInfo dcpop_reg[] = {
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
       .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo dcpodp_reg[] = {
@@ -7145,7 +7101,6 @@ static const ARMCPRegInfo dcpodp_reg[] = {
       .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
       .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
       .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
-    REGINFO_SENTINEL
 };
 #endif /*CONFIG_USER_ONLY*/
 
@@ -7247,14 +7202,12 @@ static const ARMCPRegInfo mte_reginfo[] = {
     { .name = "DC_CIGDSW", .state = ARM_CP_STATE_AA64,
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 6,
       .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo mte_tco_ro_reginfo[] = {
     { .name = "TCO", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 7,
       .type = ARM_CP_CONST, .access = PL0_RW, },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
@@ -7306,7 +7259,6 @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
       .accessfn = aa64_zva_access,
 #endif
     },
-    REGINFO_SENTINEL
 };
 
 #endif
@@ -7352,7 +7304,6 @@ static const ARMCPRegInfo predinv_reginfo[] = {
     { .name = "CPPRCTX", .state = ARM_CP_STATE_AA32,
       .cp = 15, .opc1 = 0, .crn = 7, .crm = 3, .opc2 = 7,
       .type = ARM_CP_NOP, .access = PL0_W, .accessfn = access_predinv },
-    REGINFO_SENTINEL
 };
 
 static uint64_t ccsidr2_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -7367,7 +7318,6 @@ static const ARMCPRegInfo ccsidr2_reginfo[] = {
       .access = PL1_R,
       .accessfn = access_aa64_tid2,
       .readfn = ccsidr2_read, .type = ARM_CP_NO_RAW },
-    REGINFO_SENTINEL
 };
 
 static CPAccessResult access_aa64_tid3(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -7428,7 +7378,6 @@ static const ARMCPRegInfo jazelle_regs[] = {
       .cp = 14, .crn = 2, .crm = 0, .opc1 = 7, .opc2 = 0,
       .accessfn = access_joscr_jmcr,
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo vhe_reginfo[] = {
@@ -7493,7 +7442,6 @@ static const ARMCPRegInfo vhe_reginfo[] = {
       .access = PL2_RW, .accessfn = e2h_access,
       .writefn = gt_virt_cval_write, .raw_writefn = raw_write },
 #endif
-    REGINFO_SENTINEL
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -7506,7 +7454,6 @@ static const ARMCPRegInfo ats1e1_reginfo[] = {
       .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
       .writefn = ats_write64 },
-    REGINFO_SENTINEL
 };
 
 static const ARMCPRegInfo ats1cp_reginfo[] = {
@@ -7518,7 +7465,6 @@ static const ARMCPRegInfo ats1cp_reginfo[] = {
       .cp = 15, .opc1 = 0, .crn = 7, .crm = 9, .opc2 = 1,
       .access = PL1_W, .type = ARM_CP_NO_RAW | ARM_CP_RAISES_EXC,
       .writefn = ats_write },
-    REGINFO_SENTINEL
 };
 #endif
 
@@ -7540,7 +7486,6 @@ static const ARMCPRegInfo actlr2_hactlr2_reginfo[] = {
       .cp = 15, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 3,
       .access = PL2_RW, .type = ARM_CP_CONST,
       .resetvalue = 0 },
-    REGINFO_SENTINEL
 };
 
 void register_cp_regs_for_features(ARMCPU *cpu)
@@ -7647,7 +7592,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa32_tid3,
               .resetvalue = cpu->isar.id_isar6 },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, v6_idregs);
         define_arm_cp_regs(cpu, v6_cp_reginfo);
@@ -7915,7 +7859,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 3, .crn = 9, .crm = 12, .opc2 = 7,
               .access = PL0_R, .accessfn = pmreg_access, .type = ARM_CP_CONST,
               .resetvalue = cpu->pmceid1 },
-            REGINFO_SENTINEL
         };
 #ifdef CONFIG_USER_ONLY
         ARMCPRegUserSpaceInfo v8_user_idregs[] = {
@@ -7945,7 +7888,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .exported_bits = 0x000000f0ffffffff },
             { .name = "ID_AA64ISAR*_EL1_RESERVED",
               .is_glob = true                     },
-            REGUSERINFO_SENTINEL
         };
         modify_arm_cp_regs(v8_idregs, v8_user_idregs);
 #endif
@@ -7985,7 +7927,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL2_RW,
               .resetvalue = vmpidr_def,
               .fieldoffset = offsetof(CPUARMState, cp15.vmpidr_el2) },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, vpidr_regs);
         define_arm_cp_regs(cpu, el2_cp_reginfo);
@@ -8024,7 +7965,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
                   .access = PL2_RW, .accessfn = access_el3_aa32ns,
                   .type = ARM_CP_NO_RAW,
                   .writefn = arm_cp_write_ignore, .readfn = mpidr_read },
-                REGINFO_SENTINEL
             };
             define_arm_cp_regs(cpu, vpidr_regs);
             define_arm_cp_regs(cpu, el3_no_el2_cp_reginfo);
@@ -8047,7 +7987,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .raw_writefn = raw_write, .writefn = sctlr_write,
               .fieldoffset = offsetof(CPUARMState, cp15.sctlr_el[3]),
               .resetvalue = cpu->reset_sctlr },
-            REGINFO_SENTINEL
         };
 
         define_arm_cp_regs(cpu, el3_regs);
@@ -8182,7 +8121,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "DUMMY",
               .cp = 15, .crn = 0, .crm = 7, .opc1 = 0, .opc2 = CP_ANY,
               .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         ARMCPRegInfo id_v8_midr_cp_reginfo[] = {
             { .name = "MIDR_EL1", .state = ARM_CP_STATE_BOTH,
@@ -8202,7 +8140,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R,
               .accessfn = access_aa64_tid1,
               .type = ARM_CP_CONST, .resetvalue = cpu->revidr },
-            REGINFO_SENTINEL
         };
         ARMCPRegInfo id_cp_reginfo[] = {
             /* These are common to v8 and pre-v8 */
@@ -8220,7 +8157,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R,
               .accessfn = access_aa32_tid1,
               .type = ARM_CP_CONST, .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         /* TLBTR is specific to VMSA */
         ARMCPRegInfo id_tlbtr_reginfo = {
@@ -8247,25 +8183,23 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "MIDR_EL1",
               .exported_bits = 0x00000000ffffffff },
             { .name = "REVIDR_EL1"                },
-            REGUSERINFO_SENTINEL
         };
         modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
 #endif
         if (arm_feature(env, ARM_FEATURE_OMAPCP) ||
             arm_feature(env, ARM_FEATURE_STRONGARM)) {
-            ARMCPRegInfo *r;
+            size_t i;
             /* Register the blanket "writes ignored" value first to cover the
              * whole space. Then update the specific ID registers to allow write
              * access, so that they ignore writes rather than causing them to
              * UNDEF.
              */
             define_one_arm_cp_reg(cpu, &crn0_wi_reginfo);
-            for (r = id_pre_v8_midr_cp_reginfo;
-                 r->type != ARM_CP_SENTINEL; r++) {
-                r->access = PL1_RW;
+            for (i = 0; i < ARRAY_SIZE(id_pre_v8_midr_cp_reginfo); ++i) {
+                id_pre_v8_midr_cp_reginfo[i].access = PL1_RW;
             }
-            for (r = id_cp_reginfo; r->type != ARM_CP_SENTINEL; r++) {
-                r->access = PL1_RW;
+            for (i = 0; i < ARRAY_SIZE(id_cp_reginfo); ++i) {
+                id_cp_reginfo[i].access = PL1_RW;
             }
             id_mpuir_reginfo.access = PL1_RW;
             id_tlbtr_reginfo.access = PL1_RW;
@@ -8288,13 +8222,11 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             { .name = "MPIDR_EL1", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 5,
               .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
-            REGINFO_SENTINEL
         };
 #ifdef CONFIG_USER_ONLY
         ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
             { .name = "MPIDR_EL1",
               .fixed_bits = 0x0000000080000000 },
-            REGUSERINFO_SENTINEL
         };
         modify_arm_cp_regs(mpidr_cp_reginfo, mpidr_user_cp_reginfo);
 #endif
@@ -8315,7 +8247,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 0, .opc2 = 1,
               .access = PL3_RW, .type = ARM_CP_CONST,
               .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, auxcr_reginfo);
         if (cpu_isar_feature(aa32_ac2, cpu)) {
@@ -8350,7 +8281,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
                   .type = ARM_CP_CONST,
                   .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 0,
                   .access = PL1_R, .resetvalue = cpu->reset_cbar },
-                REGINFO_SENTINEL
             };
             /* We don't implement a r/w 64 bit CBAR currently */
             assert(arm_feature(env, ARM_FEATURE_CBAR_RO));
@@ -8380,7 +8310,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .bank_fieldoffsets = { offsetof(CPUARMState, cp15.vbar_s),
                                      offsetof(CPUARMState, cp15.vbar_ns) },
               .resetvalue = 0 },
-            REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, vbar_cp_reginfo);
     }
@@ -8834,8 +8763,7 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
                    r->writefn);
         }
     }
-    /* Bad type field probably means missing sentinel at end of reg list */
-    assert(cptype_valid(r->type));
+
     for (crm = crmmin; crm <= crmmax; crm++) {
         for (opc1 = opc1min; opc1 <= opc1max; opc1++) {
             for (opc2 = opc2min; opc2 <= opc2max; opc2++) {
@@ -8881,13 +8809,13 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
     }
 }
 
-void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
-                                    const ARMCPRegInfo *regs, void *opaque)
+/* Define a whole list of registers */
+void define_arm_cp_regs_with_opaque_len(ARMCPU *cpu, const ARMCPRegInfo *regs,
+                                        void *opaque, size_t len)
 {
-    /* Define a whole list of registers */
-    const ARMCPRegInfo *r;
-    for (r = regs; r->type != ARM_CP_SENTINEL; r++) {
-        define_one_arm_cp_reg_with_opaque(cpu, r, opaque);
+    size_t i;
+    for (i = 0; i < len; ++i) {
+        define_one_arm_cp_reg_with_opaque(cpu, regs + i, opaque);
     }
 }
 
@@ -8899,17 +8827,20 @@ void define_arm_cp_regs_with_opaque(ARMCPU *cpu,
  * user-space cannot alter any values and dynamic values pertaining to
  * execution state are hidden from user space view anyway.
  */
-void modify_arm_cp_regs(ARMCPRegInfo *regs, const ARMCPRegUserSpaceInfo *mods)
+void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
+                                 const ARMCPRegUserSpaceInfo *mods,
+                                 size_t mods_len)
 {
-    const ARMCPRegUserSpaceInfo *m;
-    ARMCPRegInfo *r;
-
-    for (m = mods; m->name; m++) {
+    for (size_t mi = 0; mi < mods_len; ++mi) {
+        const ARMCPRegUserSpaceInfo *m = mods + mi;
         GPatternSpec *pat = NULL;
+
         if (m->is_glob) {
             pat = g_pattern_spec_new(m->name);
         }
-        for (r = regs; r->type != ARM_CP_SENTINEL; r++) {
+        for (size_t ri = 0; ri < regs_len; ++ri) {
+            ARMCPRegInfo *r = regs + ri;
+
             if (pat && g_pattern_match_string(pat, r->name)) {
                 r->type = ARM_CP_CONST;
                 r->access = PL0U_R;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 27/60] target/arm: Make some more cpreg data static const
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (25 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:38   ` Peter Maydell
  2022-04-22 15:38   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 28/60] target/arm: Reorg ARMCPRegInfo type field bits Richard Henderson
                   ` (33 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

These particular data structures are not modified at runtime.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index d6c34c7826..94b41c7e88 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7861,7 +7861,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .resetvalue = cpu->pmceid1 },
         };
 #ifdef CONFIG_USER_ONLY
-        ARMCPRegUserSpaceInfo v8_user_idregs[] = {
+        static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
             { .name = "ID_AA64PFR0_EL1",
               .exported_bits = 0x000f000f00ff0000,
               .fixed_bits    = 0x0000000000000011 },
@@ -8001,7 +8001,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
      */
     if (arm_feature(env, ARM_FEATURE_EL3)) {
         if (arm_feature(env, ARM_FEATURE_AARCH64)) {
-            ARMCPRegInfo nsacr = {
+            static const ARMCPRegInfo nsacr = {
                 .name = "NSACR", .type = ARM_CP_CONST,
                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
                 .access = PL1_RW, .accessfn = nsacr_access,
@@ -8009,7 +8009,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             };
             define_one_arm_cp_reg(cpu, &nsacr);
         } else {
-            ARMCPRegInfo nsacr = {
+            static const ARMCPRegInfo nsacr = {
                 .name = "NSACR",
                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
                 .access = PL3_RW | PL1_R,
@@ -8020,7 +8020,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         }
     } else {
         if (arm_feature(env, ARM_FEATURE_V8)) {
-            ARMCPRegInfo nsacr = {
+            static const ARMCPRegInfo nsacr = {
                 .name = "NSACR", .type = ARM_CP_CONST,
                 .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 2,
                 .access = PL1_R,
@@ -8173,13 +8173,13 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .type = ARM_CP_CONST,
               .resetvalue = cpu->pmsav7_dregion << 8
         };
-        ARMCPRegInfo crn0_wi_reginfo = {
+        static const ARMCPRegInfo crn0_wi_reginfo = {
             .name = "CRN0_WI", .cp = 15, .crn = 0, .crm = CP_ANY,
             .opc1 = CP_ANY, .opc2 = CP_ANY, .access = PL1_W,
             .type = ARM_CP_NOP | ARM_CP_OVERRIDE
         };
 #ifdef CONFIG_USER_ONLY
-        ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
+        static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
             { .name = "MIDR_EL1",
               .exported_bits = 0x00000000ffffffff },
             { .name = "REVIDR_EL1"                },
@@ -8224,7 +8224,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .access = PL1_R, .readfn = mpidr_read, .type = ARM_CP_NO_RAW },
         };
 #ifdef CONFIG_USER_ONLY
-        ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
+        static const ARMCPRegUserSpaceInfo mpidr_user_cp_reginfo[] = {
             { .name = "MPIDR_EL1",
               .fixed_bits = 0x0000000080000000 },
         };
@@ -8303,7 +8303,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
     }
 
     if (arm_feature(env, ARM_FEATURE_VBAR)) {
-        ARMCPRegInfo vbar_cp_reginfo[] = {
+        static const ARMCPRegInfo vbar_cp_reginfo[] = {
             { .name = "VBAR", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .crn = 12, .crm = 0, .opc1 = 0, .opc2 = 0,
               .access = PL1_RW, .writefn = vbar_write,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 28/60] target/arm: Reorg ARMCPRegInfo type field bits
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (26 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 27/60] target/arm: Make some more cpreg data static const Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:49   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 29/60] target/arm: Change cpreg access permissions to enum Richard Henderson
                   ` (32 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Instead of defining ARM_CP_FLAG_MASK to remove flags,
define ARM_CP_SPECIAL_MASK to isolate special cases.
Sort the specials to the low bits. Use an enum.

Make ARM_CP_CONST a special case, and ARM_CP_NOP an alias.
Split the large comment block so as to document each
value separately.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h        | 134 +++++++++++++++++++++++--------------
 target/arm/cpu.c           |   4 +-
 target/arm/helper.c        |   4 +-
 target/arm/translate-a64.c |  21 +++---
 target/arm/translate.c     |  27 ++++----
 5 files changed, 111 insertions(+), 79 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index bd321d6d23..031e4b7ec8 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -22,57 +22,89 @@
 #define TARGET_ARM_CPREGS_H 1
 
 /*
- * ARMCPRegInfo type field bits. If the SPECIAL bit is set this is a
- * special-behaviour cp reg and bits [11..8] indicate what behaviour
- * it has. Otherwise it is a simple cp reg, where CONST indicates that
- * TCG can assume the value to be constant (ie load at translate time)
- * and 64BIT indicates a 64 bit wide coprocessor register. SUPPRESS_TB_END
- * indicates that the TB should not be ended after a write to this register
- * (the default is that the TB ends after cp writes). OVERRIDE permits
- * a register definition to override a previous definition for the
- * same (cp, is64, crn, crm, opc1, opc2) tuple: either the new or the
- * old must have the OVERRIDE bit set.
- * ALIAS indicates that this register is an alias view of some underlying
- * state which is also visible via another register, and that the other
- * register is handling migration and reset; registers marked ALIAS will not be
- * migrated but may have their state set by syncing of register state from KVM.
- * NO_RAW indicates that this register has no underlying state and does not
- * support raw access for state saving/loading; it will not be used for either
- * migration or KVM state synchronization. (Typically this is for "registers"
- * which are actually used as instructions for cache maintenance and so on.)
- * IO indicates that this register does I/O and therefore its accesses
- * need to be marked with gen_io_start() and also end the TB. In particular,
- * registers which implement clocks or timers require this.
- * RAISES_EXC is for when the read or write hook might raise an exception;
- * the generated code will synchronize the CPU state before calling the hook
- * so that it is safe for the hook to call raise_exception().
- * NEWEL is for writes to registers that might change the exception
- * level - typically on older ARM chips. For those cases we need to
- * re-read the new el when recomputing the translation flags.
+ * ARMCPRegInfo type field bits:
  */
-#define ARM_CP_SPECIAL           0x0001
-#define ARM_CP_CONST             0x0002
-#define ARM_CP_64BIT             0x0004
-#define ARM_CP_SUPPRESS_TB_END   0x0008
-#define ARM_CP_OVERRIDE          0x0010
-#define ARM_CP_ALIAS             0x0020
-#define ARM_CP_IO                0x0040
-#define ARM_CP_NO_RAW            0x0080
-#define ARM_CP_NOP               (ARM_CP_SPECIAL | 0x0100)
-#define ARM_CP_WFI               (ARM_CP_SPECIAL | 0x0200)
-#define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
-#define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
-#define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
-#define ARM_CP_DC_GVA            (ARM_CP_SPECIAL | 0x0600)
-#define ARM_CP_DC_GZVA           (ARM_CP_SPECIAL | 0x0700)
-#define ARM_LAST_SPECIAL         ARM_CP_DC_GZVA
-#define ARM_CP_FPU               0x1000
-#define ARM_CP_SVE               0x2000
-#define ARM_CP_NO_GDB            0x4000
-#define ARM_CP_RAISES_EXC        0x8000
-#define ARM_CP_NEWEL             0x10000
-/* Mask of only the flag bits in a type field */
-#define ARM_CP_FLAG_MASK         0x1f0ff
+enum {
+    /*
+     * Register must be handled specially during translation.
+     * The method is one of the values below:
+     */
+    ARM_CP_SPECIAL_MASK          = 0x000f,
+    /*
+     * Special: sysreg with constant value, writes ignored.
+     * The nop alias is slightly clearer for describing write-only sysregs.
+     */
+    ARM_CP_CONST                 = 0x0001,
+    ARM_CP_NOP                   = ARM_CP_CONST,
+    /* Special: sysreg is WFI, for v5 and v6. */
+    ARM_CP_WFI                   = 0x0002,
+    /* Special: sysreg is NZCV. */
+    ARM_CP_NZCV                  = 0x0003,
+    /* Special: sysreg is CURRENTEL. */
+    ARM_CP_CURRENTEL             = 0x0004,
+    /* Special: sysreg is DC ZVA or similar. */
+    ARM_CP_DC_ZVA                = 0x0005,
+    ARM_CP_DC_GVA                = 0x0006,
+    ARM_CP_DC_GZVA               = 0x0007,
+
+    /* Flag: For ARM_CP_STATE_AA32, sysreg is 64-bit. */
+    ARM_CP_64BIT                 = 1 << 4,
+    /*
+     * Flag: TB should not be ended after a write to this register
+     * (the default is that the TB ends after cp writes).
+     */
+    ARM_CP_SUPPRESS_TB_END       = 1 << 5,
+    /*
+     * Flag: Permit a register definition to override a previous definition
+     * for the same (cp, is64, crn, crm, opc1, opc2) tuple: either the new
+     * or the old must have the ARM_CP_OVERRIDE bit set.
+     */
+    ARM_CP_OVERRIDE              = 1 << 6,
+    /*
+     * Flag: Register is an alias view of some underlying state which is also
+     * visible via another register, and that the other register is handling
+     * migration and reset; registers marked ARM_CP_ALIAS will not be migrated
+     * but may have their state set by syncing of register state from KVM.
+     */
+    ARM_CP_ALIAS                 = 1 << 7,
+    /*
+     * Flag: Register does I/O and therefore its accesses need to be marked
+     * with gen_io_start() and also end the TB. In particular, registers which
+     * implement clocks or timers require this.
+     */
+    ARM_CP_IO                    = 1 << 8,
+    /*
+     * Flag: Register has no underlying state and does not support raw access
+     * for state saving/loading; it will not be used for either migration or
+     * KVM state synchronization. Typically this is for "registers" which are
+     * actually used as instructions for cache maintenance and so on.
+     */
+    ARM_CP_NO_RAW                = 1 << 9,
+    /*
+     * Flag: The read or write hook might raise an exception; the generated
+     * code will synchronize the CPU state before calling the hook so that it
+     * is safe for the hook to call raise_exception().
+     */
+    ARM_CP_RAISES_EXC            = 1 << 10,
+    /*
+     * Flag: Writes to the sysreg might change the exception level - typically
+     * on older ARM chips. For those cases we need to re-read the new el when
+     * recomputing the translation flags.
+     */
+    ARM_CP_NEWEL                 = 1 << 11,
+    /*
+     * Flag: Access check for this sysreg is identical to accessing FPU state
+     * from an instruction: use translation fp_access_check().
+     */
+    ARM_CP_FPU                   = 1 << 12,
+    /*
+     * Flag: Access check for this sysreg is identical to accessing SVE state
+     * from an instruction: use translation sve_access_check().
+     */
+    ARM_CP_SVE                   = 1 << 13,
+    /* Flag: Do not expose in gdb sysreg xml. */
+    ARM_CP_NO_GDB                = 1 << 14,
+};
 
 /*
  * Valid values for ARMCPRegInfo state field, indicating which of
@@ -249,7 +281,7 @@ struct ARMCPRegInfo {
     /*
      * Offset of the field in CPUARMState for this register.
      * This is not needed if either:
-     *  1. type is ARM_CP_CONST or one of the ARM_CP_SPECIALs
+     *  1. type has ARM_CP_SPECIAL_MASK non-zero
      *  2. both readfn and writefn are specified
      */
     ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 066fe6250a..92fc75b2bf 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -117,7 +117,7 @@ static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
     ARMCPRegInfo *ri = value;
     ARMCPU *cpu = opaque;
 
-    if (ri->type & (ARM_CP_SPECIAL | ARM_CP_ALIAS)) {
+    if (ri->type & (ARM_CP_SPECIAL_MASK | ARM_CP_ALIAS)) {
         return;
     }
 
@@ -153,7 +153,7 @@ static void cp_reg_check_reset(gpointer key, gpointer value,  gpointer opaque)
     ARMCPU *cpu = opaque;
     uint64_t oldvalue, newvalue;
 
-    if (ri->type & (ARM_CP_SPECIAL | ARM_CP_ALIAS | ARM_CP_NO_RAW)) {
+    if (ri->type & (ARM_CP_SPECIAL_MASK | ARM_CP_ALIAS | ARM_CP_NO_RAW)) {
         return;
     }
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 94b41c7e88..1bbaf0a3af 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8601,7 +8601,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      * multiple times. Special registers (ie NOP/WFI) are
      * never migratable and not even raw-accessible.
      */
-    if ((r->type & ARM_CP_SPECIAL)) {
+    if (r->type & ARM_CP_SPECIAL_MASK) {
         r2->type |= ARM_CP_NO_RAW;
     }
     if (((r->crm == CP_ANY) && crm != 0) ||
@@ -8751,7 +8751,7 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
     /* Check that the register definition has enough info to handle
      * reads and writes if they are permitted.
      */
-    if (!(r->type & (ARM_CP_SPECIAL|ARM_CP_CONST))) {
+    if (!(r->type & (ARM_CP_SPECIAL_MASK | ARM_CP_CONST))) {
         if (r->access & PL3_R) {
             assert((r->fieldoffset ||
                    (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1])) ||
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 92b91f0307..98dbc8203f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1833,8 +1833,14 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
     }
 
     /* Handle special cases first */
-    switch (ri->type & ~(ARM_CP_FLAG_MASK & ~ARM_CP_SPECIAL)) {
-    case ARM_CP_NOP:
+    switch (ri->type & ARM_CP_SPECIAL_MASK) {
+    case 0:
+        break;
+    case ARM_CP_CONST:
+        if (isread) {
+            tcg_rt = cpu_reg(s, rt);
+            tcg_gen_movi_i64(tcg_rt, ri->resetvalue);
+        }
         return;
     case ARM_CP_NZCV:
         tcg_rt = cpu_reg(s, rt);
@@ -1908,7 +1914,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         }
         return;
     default:
-        break;
+        g_assert_not_reached();
     }
     if ((ri->type & ARM_CP_FPU) && !fp_access_check(s)) {
         return;
@@ -1923,18 +1929,13 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
     tcg_rt = cpu_reg(s, rt);
 
     if (isread) {
-        if (ri->type & ARM_CP_CONST) {
-            tcg_gen_movi_i64(tcg_rt, ri->resetvalue);
-        } else if (ri->readfn) {
+        if (ri->readfn) {
             gen_helper_get_cp_reg64(tcg_rt, cpu_env, tcg_constant_ptr(ri));
         } else {
             tcg_gen_ld_i64(tcg_rt, cpu_env, ri->fieldoffset);
         }
     } else {
-        if (ri->type & ARM_CP_CONST) {
-            /* If not forbidden by access permissions, treat as WI */
-            return;
-        } else if (ri->writefn) {
+        if (ri->writefn) {
             gen_helper_set_cp_reg64(cpu_env, tcg_constant_ptr(ri), tcg_rt);
         } else {
             tcg_gen_st_i64(tcg_rt, cpu_env, ri->fieldoffset);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1fe48e4e1c..9370b44707 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4744,8 +4744,16 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
         }
 
         /* Handle special cases first */
-        switch (ri->type & ~(ARM_CP_FLAG_MASK & ~ARM_CP_SPECIAL)) {
-        case ARM_CP_NOP:
+        switch (ri->type & ARM_CP_SPECIAL_MASK) {
+        case 0:
+            break;
+        case ARM_CP_CONST:
+            if (isread) {
+                store_reg(s, rt, tcg_constant_i32(ri->resetvalue));
+                if (is64) {
+                    store_reg(s, rt2, tcg_constant_i32(ri->resetvalue >> 32));
+                }
+            }
             return;
         case ARM_CP_WFI:
             if (isread) {
@@ -4756,7 +4764,7 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
             s->base.is_jmp = DISAS_WFI;
             return;
         default:
-            break;
+            g_assert_not_reached();
         }
 
         if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
@@ -4768,9 +4776,7 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
             if (is64) {
                 TCGv_i64 tmp64;
                 TCGv_i32 tmp;
-                if (ri->type & ARM_CP_CONST) {
-                    tmp64 = tcg_constant_i64(ri->resetvalue);
-                } else if (ri->readfn) {
+                if (ri->readfn) {
                     tmp64 = tcg_temp_new_i64();
                     gen_helper_get_cp_reg64(tmp64, cpu_env,
                                             tcg_constant_ptr(ri));
@@ -4787,9 +4793,7 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
                 store_reg(s, rt2, tmp);
             } else {
                 TCGv_i32 tmp;
-                if (ri->type & ARM_CP_CONST) {
-                    tmp = tcg_constant_i32(ri->resetvalue);
-                } else if (ri->readfn) {
+                if (ri->readfn) {
                     tmp = tcg_temp_new_i32();
                     gen_helper_get_cp_reg(tmp, cpu_env, tcg_constant_ptr(ri));
                 } else {
@@ -4807,11 +4811,6 @@ static void do_coproc_insn(DisasContext *s, int cpnum, int is64,
             }
         } else {
             /* Write */
-            if (ri->type & ARM_CP_CONST) {
-                /* If not forbidden by access permissions, treat as WI */
-                return;
-            }
-
             if (is64) {
                 TCGv_i32 tmplo, tmphi;
                 TCGv_i64 tmp64 = tcg_temp_new_i64();
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 29/60] target/arm: Change cpreg access permissions to enum
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (27 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 28/60] target/arm: Reorg ARMCPRegInfo type field bits Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:52   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 30/60] target/arm: Name CPState type Richard Henderson
                   ` (31 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Create a typedef as well, and use it in ARMCPRegInfo.
This won't be perfect for debugging, but it'll nicely
display the most common cases.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h | 44 +++++++++++++++++++++++---------------------
 target/arm/helper.c |  5 ++---
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index 031e4b7ec8..2c991ab5df 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -156,31 +156,33 @@ enum {
  * described with these bits, then use a laxer set of restrictions, and
  * do the more restrictive/complex check inside a helper function.
  */
-#define PL3_R 0x80
-#define PL3_W 0x40
-#define PL2_R (0x20 | PL3_R)
-#define PL2_W (0x10 | PL3_W)
-#define PL1_R (0x08 | PL2_R)
-#define PL1_W (0x04 | PL2_W)
-#define PL0_R (0x02 | PL1_R)
-#define PL0_W (0x01 | PL1_W)
+typedef enum {
+    PL3_R = 0x80,
+    PL3_W = 0x40,
+    PL2_R = 0x20 | PL3_R,
+    PL2_W = 0x10 | PL3_W,
+    PL1_R = 0x08 | PL2_R,
+    PL1_W = 0x04 | PL2_W,
+    PL0_R = 0x02 | PL1_R,
+    PL0_W = 0x01 | PL1_W,
 
-/*
- * For user-mode some registers are accessible to EL0 via a kernel
- * trap-and-emulate ABI. In this case we define the read permissions
- * as actually being PL0_R. However some bits of any given register
- * may still be masked.
- */
+    /*
+     * For user-mode some registers are accessible to EL0 via a kernel
+     * trap-and-emulate ABI. In this case we define the read permissions
+     * as actually being PL0_R. However some bits of any given register
+     * may still be masked.
+     */
 #ifdef CONFIG_USER_ONLY
-#define PL0U_R PL0_R
+    PL0U_R = PL0_R,
 #else
-#define PL0U_R PL1_R
+    PL0U_R = PL1_R,
 #endif
 
-#define PL3_RW (PL3_R | PL3_W)
-#define PL2_RW (PL2_R | PL2_W)
-#define PL1_RW (PL1_R | PL1_W)
-#define PL0_RW (PL0_R | PL0_W)
+    PL3_RW = PL3_R | PL3_W,
+    PL2_RW = PL2_R | PL2_W,
+    PL1_RW = PL1_R | PL1_W,
+    PL0_RW = PL0_R | PL0_W,
+} CPAccessRights;
 
 typedef enum CPAccessResult {
     /* Access is permitted */
@@ -264,7 +266,7 @@ struct ARMCPRegInfo {
     /* Register type: ARM_CP_* bits/values */
     int type;
     /* Access rights: PL*_[RW] */
-    int access;
+    CPAccessRights access;
     /* Security state: ARM_CP_SECSTATE_* bits/values */
     int secure;
     /*
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 1bbaf0a3af..33ba77890b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8712,7 +8712,7 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
      * to encompass the generic architectural permission check.
      */
     if (r->state != ARM_CP_STATE_AA32) {
-        int mask = 0;
+        CPAccessRights mask;
         switch (r->opc1) {
         case 0:
             /* min_EL EL1, but some accessible to EL0 via kernel ABI */
@@ -8741,8 +8741,7 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
             break;
         default:
             /* broken reginfo with out-of-range opc1 */
-            assert(false);
-            break;
+            g_assert_not_reached();
         }
         /* assert our permissions are not too lax (stricter is fine) */
         assert((r->access & ~mask) == 0);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 30/60] target/arm: Name CPState type
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (28 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 29/60] target/arm: Change cpreg access permissions to enum Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:53   ` Peter Maydell
  2022-04-22 15:51   ` Alex Bennée
  2022-04-17 17:43 ` [PATCH v3 31/60] target/arm: Name CPSecureState type Richard Henderson
                   ` (30 subsequent siblings)
  60 siblings, 2 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Give this enum a name and use in ARMCPRegInfo,
add_cpreg_to_hashtable and define_one_arm_cp_reg_with_opaque.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h | 6 +++---
 target/arm/helper.c | 6 ++++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index 2c991ab5df..fe338730ab 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -116,11 +116,11 @@ enum {
  * Note that we rely on the values of these enums as we iterate through
  * the various states in some places.
  */
-enum {
+typedef enum {
     ARM_CP_STATE_AA32 = 0,
     ARM_CP_STATE_AA64 = 1,
     ARM_CP_STATE_BOTH = 2,
-};
+} CPState;
 
 /*
  * ARM CP register secure state flags.  These flags identify security state
@@ -262,7 +262,7 @@ struct ARMCPRegInfo {
     uint8_t opc1;
     uint8_t opc2;
     /* Execution state in which this register is visible: ARM_CP_STATE_* */
-    int state;
+    CPState state;
     /* Register type: ARM_CP_* bits/values */
     int type;
     /* Access rights: PL*_[RW] */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 33ba77890b..8b89039667 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8503,7 +8503,7 @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
 }
 
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-                                   void *opaque, int state, int secstate,
+                                   void *opaque, CPState state, int secstate,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
@@ -8663,13 +8663,15 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
      * bits; the ARM_CP_64BIT* flag applies only to the AArch32 view of
      * the register, if any.
      */
-    int crm, opc1, opc2, state;
+    int crm, opc1, opc2;
     int crmmin = (r->crm == CP_ANY) ? 0 : r->crm;
     int crmmax = (r->crm == CP_ANY) ? 15 : r->crm;
     int opc1min = (r->opc1 == CP_ANY) ? 0 : r->opc1;
     int opc1max = (r->opc1 == CP_ANY) ? 7 : r->opc1;
     int opc2min = (r->opc2 == CP_ANY) ? 0 : r->opc2;
     int opc2max = (r->opc2 == CP_ANY) ? 7 : r->opc2;
+    CPState state;
+
     /* 64 bit registers have only CRm and Opc1 fields */
     assert(!((r->type & ARM_CP_64BIT) && (r->opc2 || r->crn)));
     /* op0 only exists in the AArch64 encodings */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 31/60] target/arm: Name CPSecureState type
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (29 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 30/60] target/arm: Name CPState type Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22  9:57   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 32/60] target/arm: Update sysreg fields when redirecting for E2H Richard Henderson
                   ` (29 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Give this enum a name and use in ARMCPRegInfo and add_cpreg_to_hashtable.
Add the enumerator ARM_CP_SECSTATE_BOTH to clarify how 0
is handled in define_one_arm_cp_reg_with_opaque.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpregs.h | 7 ++++---
 target/arm/helper.c | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
index fe338730ab..3528c0ebb1 100644
--- a/target/arm/cpregs.h
+++ b/target/arm/cpregs.h
@@ -133,10 +133,11 @@ typedef enum {
  * registered entry will only have one to identify whether the entry is secure
  * or non-secure.
  */
-enum {
+typedef enum {
+    ARM_CP_SECSTATE_BOTH = 0,       /* define one cpreg for each secstate */
     ARM_CP_SECSTATE_S =   (1 << 0), /* bit[0]: Secure state register */
     ARM_CP_SECSTATE_NS =  (1 << 1), /* bit[1]: Non-secure state register */
-};
+} CPSecureState;
 
 /*
  * Access rights:
@@ -268,7 +269,7 @@ struct ARMCPRegInfo {
     /* Access rights: PL*_[RW] */
     CPAccessRights access;
     /* Security state: ARM_CP_SECSTATE_* bits/values */
-    int secure;
+    CPSecureState secure;
     /*
      * The opaque pointer passed to define_arm_cp_regs_with_opaque() when
      * this register was defined: can be used to hand data through to the
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8b89039667..7c569a569a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8503,7 +8503,8 @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
 }
 
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
-                                   void *opaque, CPState state, int secstate,
+                                   void *opaque, CPState state,
+                                   CPSecureState secstate,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 32/60] target/arm: Update sysreg fields when redirecting for E2H
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (30 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 31/60] target/arm: Name CPSecureState type Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22 10:39   ` Peter Maydell
  2022-04-17 17:43 ` [PATCH v3 33/60] target/arm: Store cpregs key in the hash table directly Richard Henderson
                   ` (28 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

The new_key is always non-zero during redirection,
so remove the if.  Update opc0 et al from the new key.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7c569a569a..aee195400b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5915,7 +5915,9 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
 
     for (i = 0; i < ARRAY_SIZE(aliases); i++) {
         const struct E2HAlias *a = &aliases[i];
-        ARMCPRegInfo *src_reg, *dst_reg;
+        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
+        uint32_t *new_key;
+        bool ok;
 
         if (a->feature && !a->feature(&cpu->isar)) {
             continue;
@@ -5934,19 +5936,28 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
         g_assert(src_reg->opaque == NULL);
 
         /* Create alias before redirection so we dup the right data. */
-        if (a->new_key) {
-            ARMCPRegInfo *new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
-            uint32_t *new_key = g_memdup(&a->new_key, sizeof(uint32_t));
-            bool ok;
+        new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
+        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
 
-            new_reg->name = a->new_name;
-            new_reg->type |= ARM_CP_ALIAS;
-            /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
-            new_reg->access &= PL2_RW | PL3_RW;
+        new_reg->name = a->new_name;
+        new_reg->type |= ARM_CP_ALIAS;
+        /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
+        new_reg->access &= PL2_RW;
 
-            ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
-            g_assert(ok);
-        }
+#define E(X, N) \
+    ((X & CP_REG_ARM64_SYSREG_##N##_MASK) >> CP_REG_ARM64_SYSREG_##N##_SHIFT)
+
+        /* Update the sysreg fields */
+        new_reg->opc0 = E(a->new_key, OP0);
+        new_reg->opc1 = E(a->new_key, OP1);
+        new_reg->crn = E(a->new_key, CRN);
+        new_reg->crm = E(a->new_key, CRM);
+        new_reg->opc2 = E(a->new_key, OP2);
+
+#undef E
+
+        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
+        g_assert(ok);
 
         src_reg->opaque = dst_reg;
         src_reg->orig_readfn = src_reg->readfn ?: raw_read;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 33/60] target/arm: Store cpregs key in the hash table directly
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (31 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 32/60] target/arm: Update sysreg fields when redirecting for E2H Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
  2022-04-22 10:46   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 34/60] target/arm: Cleanup add_cpreg_to_hashtable Richard Henderson
                   ` (27 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Cast the uint32_t key into a gpointer directly, which
allows us to avoid allocating storage for each key.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c     |  4 ++--
 target/arm/gdbstub.c |  2 +-
 target/arm/helper.c  | 45 ++++++++++++++++++++------------------------
 3 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 92fc75b2bf..af13b34697 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1080,8 +1080,8 @@ static void arm_cpu_initfn(Object *obj)
     ARMCPU *cpu = ARM_CPU(obj);
 
     cpu_set_cpustate_pointers(cpu);
-    cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
-                                         g_free, cpreg_hashtable_data_destroy);
+    cpu->cp_regs = g_hash_table_new_full(g_direct_hash, g_direct_equal,
+                                         NULL, cpreg_hashtable_data_destroy);
 
     QLIST_INIT(&cpu->pre_el_change_hooks);
     QLIST_INIT(&cpu->el_change_hooks);
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index f01a126108..f5b35cd55f 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -273,7 +273,7 @@ static void arm_gen_one_xml_sysreg_tag(GString *s, DynamicGDBXMLInfo *dyn_xml,
 static void arm_register_sysreg_for_xml(gpointer key, gpointer value,
                                         gpointer p)
 {
-    uint32_t ri_key = *(uint32_t *)key;
+    uint32_t ri_key = (uintptr_t)key;
     ARMCPRegInfo *ri = value;
     RegisterSysregXmlParam *param = (RegisterSysregXmlParam *)p;
     GString *s = param->s;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index aee195400b..db9e75a42d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -215,11 +215,8 @@ bool write_list_to_cpustate(ARMCPU *cpu)
 static void add_cpreg_to_list(gpointer key, gpointer opaque)
 {
     ARMCPU *cpu = opaque;
-    uint64_t regidx;
-    const ARMCPRegInfo *ri;
-
-    regidx = *(uint32_t *)key;
-    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
+    uint32_t regidx = (uintptr_t)key;
+    const ARMCPRegInfo *ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
 
     if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
         cpu->cpreg_indexes[cpu->cpreg_array_len] = cpreg_to_kvm_id(regidx);
@@ -231,11 +228,9 @@ static void add_cpreg_to_list(gpointer key, gpointer opaque)
 static void count_cpreg(gpointer key, gpointer opaque)
 {
     ARMCPU *cpu = opaque;
-    uint64_t regidx;
     const ARMCPRegInfo *ri;
 
-    regidx = *(uint32_t *)key;
-    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
+    ri = g_hash_table_lookup(cpu->cp_regs, key);
 
     if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
         cpu->cpreg_array_len++;
@@ -244,8 +239,8 @@ static void count_cpreg(gpointer key, gpointer opaque)
 
 static gint cpreg_key_compare(gconstpointer a, gconstpointer b)
 {
-    uint64_t aidx = cpreg_to_kvm_id(*(uint32_t *)a);
-    uint64_t bidx = cpreg_to_kvm_id(*(uint32_t *)b);
+    uint64_t aidx = cpreg_to_kvm_id((uintptr_t)a);
+    uint64_t bidx = cpreg_to_kvm_id((uintptr_t)b);
 
     if (aidx > bidx) {
         return 1;
@@ -5915,16 +5910,17 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
 
     for (i = 0; i < ARRAY_SIZE(aliases); i++) {
         const struct E2HAlias *a = &aliases[i];
-        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
-        uint32_t *new_key;
+        const ARMCPRegInfo *dst_reg;
+        ARMCPRegInfo *src_reg;
+        ARMCPRegInfo *new_reg;
         bool ok;
 
         if (a->feature && !a->feature(&cpu->isar)) {
             continue;
         }
 
-        src_reg = g_hash_table_lookup(cpu->cp_regs, &a->src_key);
-        dst_reg = g_hash_table_lookup(cpu->cp_regs, &a->dst_key);
+        src_reg = (ARMCPRegInfo *)get_arm_cp_reginfo(cpu->cp_regs, a->src_key);
+        dst_reg = get_arm_cp_reginfo(cpu->cp_regs, a->dst_key);
         g_assert(src_reg != NULL);
         g_assert(dst_reg != NULL);
 
@@ -5937,7 +5933,6 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
 
         /* Create alias before redirection so we dup the right data. */
         new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
-        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
 
         new_reg->name = a->new_name;
         new_reg->type |= ARM_CP_ALIAS;
@@ -5956,10 +5951,11 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
 
 #undef E
 
-        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
+        ok = g_hash_table_insert(cpu->cp_regs,
+                                 (gpointer)(uintptr_t)a->new_key, new_reg);
         g_assert(ok);
 
-        src_reg->opaque = dst_reg;
+        src_reg->opaque = (void *)dst_reg;
         src_reg->orig_readfn = src_reg->readfn ?: raw_read;
         src_reg->orig_writefn = src_reg->writefn ?: raw_write;
         if (!src_reg->raw_readfn) {
@@ -8522,7 +8518,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
     /* Private utility function for define_one_arm_cp_reg_with_opaque():
      * add a single reginfo struct to the hash table.
      */
-    uint32_t *key = g_new(uint32_t, 1);
+    uint32_t key;
     ARMCPRegInfo *r2 = g_memdup(r, sizeof(ARMCPRegInfo));
     int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
     int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
@@ -8589,10 +8585,10 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         if (r->cp == 0 || r->state == ARM_CP_STATE_BOTH) {
             r2->cp = CP_REG_ARM64_SYSREG_CP;
         }
-        *key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
-                                  r2->opc0, opc1, opc2);
+        key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
+                                 r2->opc0, opc1, opc2);
     } else {
-        *key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
+        key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
     }
     if (opaque) {
         r2->opaque = opaque;
@@ -8634,8 +8630,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      * requested.
      */
     if (!(r->type & ARM_CP_OVERRIDE)) {
-        ARMCPRegInfo *oldreg;
-        oldreg = g_hash_table_lookup(cpu->cp_regs, key);
+        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
         if (oldreg && !(oldreg->type & ARM_CP_OVERRIDE)) {
             fprintf(stderr, "Register redefined: cp=%d %d bit "
                     "crn=%d crm=%d opc1=%d opc2=%d, "
@@ -8645,7 +8640,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
             g_assert_not_reached();
         }
     }
-    g_hash_table_insert(cpu->cp_regs, key, r2);
+    g_hash_table_insert(cpu->cp_regs, (gpointer)(uintptr_t)key, r2);
 }
 
 
@@ -8875,7 +8870,7 @@ void modify_arm_cp_regs_with_len(ARMCPRegInfo *regs, size_t regs_len,
 
 const ARMCPRegInfo *get_arm_cp_reginfo(GHashTable *cpregs, uint32_t encoded_cp)
 {
-    return g_hash_table_lookup(cpregs, &encoded_cp);
+    return g_hash_table_lookup(cpregs, (gpointer)(uintptr_t)encoded_cp);
 }
 
 void arm_cp_write_ignore(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 34/60] target/arm: Cleanup add_cpreg_to_hashtable
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (32 preceding siblings ...)
  2022-04-17 17:43 ` [PATCH v3 33/60] target/arm: Store cpregs key in the hash table directly Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 10:48   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL Richard Henderson
                   ` (26 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Use a single memory allocation for name and reginfo.
Perform the override check early; use assert not printf+abort.
Use a switch statement to validate state.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c    |  16 +----
 target/arm/helper.c | 154 +++++++++++++++++++++++---------------------
 2 files changed, 81 insertions(+), 89 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index af13b34697..3da8841eb2 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1061,27 +1061,13 @@ uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz)
     return (Aff1 << ARM_AFF1_SHIFT) | Aff0;
 }
 
-static void cpreg_hashtable_data_destroy(gpointer data)
-{
-    /*
-     * Destroy function for cpu->cp_regs hashtable data entries.
-     * We must free the name string because it was g_strdup()ed in
-     * add_cpreg_to_hashtable(). It's OK to cast away the 'const'
-     * from r->name because we know we definitely allocated it.
-     */
-    ARMCPRegInfo *r = data;
-
-    g_free((void *)r->name);
-    g_free(r);
-}
-
 static void arm_cpu_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
 
     cpu_set_cpustate_pointers(cpu);
     cpu->cp_regs = g_hash_table_new_full(g_direct_hash, g_direct_equal,
-                                         NULL, cpreg_hashtable_data_destroy);
+                                         NULL, g_free);
 
     QLIST_INIT(&cpu->pre_el_change_hooks);
     QLIST_INIT(&cpu->el_change_hooks);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index db9e75a42d..562ea5c418 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8509,37 +8509,90 @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
     return cpu_list;
 }
 
+/*
+ * Private utility function for define_one_arm_cp_reg_with_opaque():
+ * add a single reginfo struct to the hash table.
+ */
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
                                    void *opaque, CPState state,
                                    CPSecureState secstate,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
-    /* Private utility function for define_one_arm_cp_reg_with_opaque():
-     * add a single reginfo struct to the hash table.
-     */
     uint32_t key;
-    ARMCPRegInfo *r2 = g_memdup(r, sizeof(ARMCPRegInfo));
-    int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
-    int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
+    ARMCPRegInfo *r2;
+    bool is64 = r->type & ARM_CP_64BIT;
+    bool ns = secstate & ARM_CP_SECSTATE_NS;
+    int cp = r->cp;
+    bool isbanked;
+    size_t name_len;
 
-    r2->name = g_strdup(name);
-    /* Reset the secure state to the specific incoming state.  This is
-     * necessary as the register may have been defined with both states.
+    switch (state) {
+    case ARM_CP_STATE_AA32:
+        /* We assume it is a cp15 register if the .cp field is left unset. */
+        if (cp == 0 && r->state == ARM_CP_STATE_BOTH) {
+            cp = 15;
+        }
+        key = ENCODE_CP_REG(cp, is64, ns, r->crn, crm, opc1, opc2);
+        break;
+    case ARM_CP_STATE_AA64:
+        /*
+         * To allow abbreviation of ARMCPRegInfo definitions, we treat
+         * cp == 0 as equivalent to the value for "standard guest-visible
+         * sysreg".  STATE_BOTH definitions are also always "standard sysreg"
+         * in their AArch64 view (the .cp value may be non-zero for the
+         * benefit of the AArch32 view).
+         */
+        if (cp == 0 || r->state == ARM_CP_STATE_BOTH) {
+            cp = CP_REG_ARM64_SYSREG_CP;
+        }
+        key = ENCODE_AA64_CP_REG(cp, r->crn, crm, r->opc0, opc1, opc2);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    /* Overriding of an existing definition must be explicitly requested. */
+    if (!(r->type & ARM_CP_OVERRIDE)) {
+        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
+        if (oldreg) {
+            assert(oldreg->type & ARM_CP_OVERRIDE);
+        }
+    }
+
+    /* Combine cpreg and name into one allocation. */
+    name_len = strlen(name) + 1;
+    r2 = g_malloc(sizeof(*r2) + name_len);
+    *r2 = *r;
+    r2->name = memcpy(r2 + 1, name, name_len);
+
+    /*
+     * Update fields to match the instantiation, overwiting wildcards
+     * such as CP_ANY, ARM_CP_STATE_BOTH, or ARM_CP_SECSTATE_BOTH.
      */
+    r2->cp = cp;
+    r2->crm = crm;
+    r2->opc1 = opc1;
+    r2->opc2 = opc2;
+    r2->state = state;
     r2->secure = secstate;
+    if (opaque) {
+        r2->opaque = opaque;
+    }
 
-    if (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1]) {
-        /* Register is banked (using both entries in array).
+    isbanked = r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1];
+    if (isbanked) {
+        /*
+         * Register is banked (using both entries in array).
          * Overwriting fieldoffset as the array is only used to define
          * banked registers but later only fieldoffset is used.
          */
         r2->fieldoffset = r->bank_fieldoffsets[ns];
     }
-
     if (state == ARM_CP_STATE_AA32) {
-        if (r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1]) {
-            /* If the register is banked then we don't need to migrate or
+        if (isbanked) {
+            /*
+             * If the register is banked then we don't need to migrate or
              * reset the 32-bit instance in certain cases:
              *
              * 1) If the register has both 32-bit and 64-bit instances then we
@@ -8554,56 +8607,22 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
                 r2->type |= ARM_CP_ALIAS;
             }
         } else if ((secstate != r->secure) && !ns) {
-            /* The register is not banked so we only want to allow migration of
-             * the non-secure instance.
+            /*
+             * The register is not banked so we only want to allow migration
+             * of the non-secure instance.
              */
             r2->type |= ARM_CP_ALIAS;
         }
 
-        if (r->state == ARM_CP_STATE_BOTH) {
-            /* We assume it is a cp15 register if the .cp field is left unset.
-             */
-            if (r2->cp == 0) {
-                r2->cp = 15;
-            }
-
+        if (r->state == ARM_CP_STATE_BOTH && r->fieldoffset) {
 #ifdef HOST_WORDS_BIGENDIAN
-            if (r2->fieldoffset) {
-                r2->fieldoffset += sizeof(uint32_t);
-            }
+            r2->fieldoffset += sizeof(uint32_t);
 #endif
         }
     }
-    if (state == ARM_CP_STATE_AA64) {
-        /* To allow abbreviation of ARMCPRegInfo
-         * definitions, we treat cp == 0 as equivalent to
-         * the value for "standard guest-visible sysreg".
-         * STATE_BOTH definitions are also always "standard
-         * sysreg" in their AArch64 view (the .cp value may
-         * be non-zero for the benefit of the AArch32 view).
-         */
-        if (r->cp == 0 || r->state == ARM_CP_STATE_BOTH) {
-            r2->cp = CP_REG_ARM64_SYSREG_CP;
-        }
-        key = ENCODE_AA64_CP_REG(r2->cp, r2->crn, crm,
-                                 r2->opc0, opc1, opc2);
-    } else {
-        key = ENCODE_CP_REG(r2->cp, is64, ns, r2->crn, crm, opc1, opc2);
-    }
-    if (opaque) {
-        r2->opaque = opaque;
-    }
-    /* reginfo passed to helpers is correct for the actual access,
-     * and is never ARM_CP_STATE_BOTH:
-     */
-    r2->state = state;
-    /* Make sure reginfo passed to helpers for wildcarded regs
-     * has the correct crm/opc1/opc2 for this reg, not CP_ANY:
-     */
-    r2->crm = crm;
-    r2->opc1 = opc1;
-    r2->opc2 = opc2;
-    /* By convention, for wildcarded registers only the first
+
+    /*
+     * By convention, for wildcarded registers only the first
      * entry is used for migration; the others are marked as
      * ALIAS so we don't try to transfer the register
      * multiple times. Special registers (ie NOP/WFI) are
@@ -8612,13 +8631,14 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
     if (r->type & ARM_CP_SPECIAL_MASK) {
         r2->type |= ARM_CP_NO_RAW;
     }
-    if (((r->crm == CP_ANY) && crm != 0) ||
-        ((r->opc1 == CP_ANY) && opc1 != 0) ||
-        ((r->opc2 == CP_ANY) && opc2 != 0)) {
+    if ((r->crm == CP_ANY && crm != 0) ||
+        (r->opc1 == CP_ANY && opc1 != 0) ||
+        (r->opc2 == CP_ANY && opc2 != 0)) {
         r2->type |= ARM_CP_ALIAS | ARM_CP_NO_GDB;
     }
 
-    /* Check that raw accesses are either forbidden or handled. Note that
+    /*
+     * Check that raw accesses are either forbidden or handled. Note that
      * we can't assert this earlier because the setup of fieldoffset for
      * banked registers has to be done first.
      */
@@ -8626,20 +8646,6 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         assert(!raw_accessors_invalid(r2));
     }
 
-    /* Overriding of an existing definition must be explicitly
-     * requested.
-     */
-    if (!(r->type & ARM_CP_OVERRIDE)) {
-        const ARMCPRegInfo *oldreg = get_arm_cp_reginfo(cpu->cp_regs, key);
-        if (oldreg && !(oldreg->type & ARM_CP_OVERRIDE)) {
-            fprintf(stderr, "Register redefined: cp=%d %d bit "
-                    "crn=%d crm=%d opc1=%d opc2=%d, "
-                    "was %s, now %s\n", r2->cp, 32 + 32 * is64,
-                    r2->crn, r2->crm, r2->opc1, r2->opc2,
-                    oldreg->name, r2->name);
-            g_assert_not_reached();
-        }
-    }
     g_hash_table_insert(cpu->cp_regs, (gpointer)(uintptr_t)key, r2);
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (33 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 34/60] target/arm: Cleanup add_cpreg_to_hashtable Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 10:57   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 36/60] target/arm: Drop EL3 no EL2 fallbacks Richard Henderson
                   ` (25 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

More gracefully handle cpregs when EL2 and/or EL3 are missing.
If the reg is entirely inaccessible, do not register it at all.
If the reg is for EL2, and EL3 is present but EL2 is not,
squash to ARM_CP_CONST.

This will simplify cpreg registration for conditional arm features.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 109 ++++++++++++++++++++++++++++++++------------
 1 file changed, 79 insertions(+), 30 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 562ea5c418..d9837b5bd2 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8519,13 +8519,14 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
                                    int crm, int opc1, int opc2,
                                    const char *name)
 {
+    CPUARMState *env = &cpu->env;
     uint32_t key;
     ARMCPRegInfo *r2;
     bool is64 = r->type & ARM_CP_64BIT;
     bool ns = secstate & ARM_CP_SECSTATE_NS;
     int cp = r->cp;
-    bool isbanked;
     size_t name_len;
+    bool make_const;
 
     switch (state) {
     case ARM_CP_STATE_AA32:
@@ -8560,6 +8561,24 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         }
     }
 
+    /*
+     * Eliminate registers that are not present because the EL is missing.
+     * Doing this here makes it easier to put all registers for a given
+     * feature into the same ARMCPRegInfo array and define them all at once.
+     */
+    if (arm_feature(env, ARM_FEATURE_EL3)) {
+        /* An EL2 register without EL2 but with EL3 is (usually) RES0. */
+        int min_el = ctz32(r->access) / 2;
+        make_const = min_el == 2 && !arm_feature(env, ARM_FEATURE_EL2);
+    } else {
+        CPAccessRights max_el = (arm_feature(env, ARM_FEATURE_EL2)
+                                 ? PL2_RW : PL1_RW);
+        if ((r->access & max_el) == 0) {
+            return;
+        }
+        make_const = false;
+    }
+
     /* Combine cpreg and name into one allocation. */
     name_len = strlen(name) + 1;
     r2 = g_malloc(sizeof(*r2) + name_len);
@@ -8580,44 +8599,74 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
         r2->opaque = opaque;
     }
 
-    isbanked = r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1];
-    if (isbanked) {
+    if (make_const) {
+        /* This should not have been a very special register to begin. */
+        int old_special = r2->type & ARM_CP_SPECIAL_MASK;
+        assert(old_special == 0 || old_special == ARM_CP_CONST);
         /*
-         * Register is banked (using both entries in array).
-         * Overwriting fieldoffset as the array is only used to define
-         * banked registers but later only fieldoffset is used.
+         * Set the special function to CONST, retaining the flags.
+         * This is important for e.g. ARM_CP_SVE so that we still
+         * take the SVE trap if CPTR_EL3.EZ == 0.
          */
-        r2->fieldoffset = r->bank_fieldoffsets[ns];
-    }
-    if (state == ARM_CP_STATE_AA32) {
+        r2->type |= ARM_CP_CONST;
+        /*
+         * Usually, these registers become RES0, but there are a few
+         * special cases like VPIDR_EL2 which have a constant non-zero
+         * value with writes ignored.  So leave resetvalue as is.
+         *
+         * ARM_CP_SPECIAL_MASK has precedence, so removing the callbacks
+         * and offsets are not strictly necessary, but is potentially
+         * less confusing to debug later.
+         */
+        r2->readfn = NULL;
+        r2->writefn = NULL;
+        r2->raw_readfn = NULL;
+        r2->raw_writefn = NULL;
+        r2->resetfn = NULL;
+        r2->fieldoffset = 0;
+        r2->bank_fieldoffsets[0] = 0;
+        r2->bank_fieldoffsets[1] = 0;
+    } else {
+        bool isbanked = r->bank_fieldoffsets[0] && r->bank_fieldoffsets[1];
+
         if (isbanked) {
             /*
-             * If the register is banked then we don't need to migrate or
-             * reset the 32-bit instance in certain cases:
-             *
-             * 1) If the register has both 32-bit and 64-bit instances then we
-             *    can count on the 64-bit instance taking care of the
-             *    non-secure bank.
-             * 2) If ARMv8 is enabled then we can count on a 64-bit version
-             *    taking care of the secure bank.  This requires that separate
-             *    32 and 64-bit definitions are provided.
+             * Register is banked (using both entries in array).
+             * Overwriting fieldoffset as the array is only used to define
+             * banked registers but later only fieldoffset is used.
              */
-            if ((r->state == ARM_CP_STATE_BOTH && ns) ||
-                (arm_feature(&cpu->env, ARM_FEATURE_V8) && !ns)) {
+            r2->fieldoffset = r->bank_fieldoffsets[ns];
+        }
+        if (state == ARM_CP_STATE_AA32) {
+            if (isbanked) {
+                /*
+                 * If the register is banked then we don't need to migrate or
+                 * reset the 32-bit instance in certain cases:
+                 *
+                 * 1) If the register has both 32-bit and 64-bit instances
+                 *    then we can count on the 64-bit instance taking care
+                 *    of the non-secure bank.
+                 * 2) If ARMv8 is enabled then we can count on a 64-bit
+                 *    version taking care of the secure bank.  This requires
+                 *    that separate 32 and 64-bit definitions are provided.
+                 */
+                if ((r->state == ARM_CP_STATE_BOTH && ns) ||
+                    (arm_feature(env, ARM_FEATURE_V8) && !ns)) {
+                    r2->type |= ARM_CP_ALIAS;
+                }
+            } else if ((secstate != r->secure) && !ns) {
+                /*
+                 * The register is not banked so we only want to allow
+                 * migration of the non-secure instance.
+                 */
                 r2->type |= ARM_CP_ALIAS;
             }
-        } else if ((secstate != r->secure) && !ns) {
-            /*
-             * The register is not banked so we only want to allow migration
-             * of the non-secure instance.
-             */
-            r2->type |= ARM_CP_ALIAS;
-        }
 
-        if (r->state == ARM_CP_STATE_BOTH && r->fieldoffset) {
+            if (r->state == ARM_CP_STATE_BOTH && r->fieldoffset) {
 #ifdef HOST_WORDS_BIGENDIAN
-            r2->fieldoffset += sizeof(uint32_t);
+                r2->fieldoffset += sizeof(uint32_t);
 #endif
+            }
         }
     }
 
@@ -8628,7 +8677,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
      * multiple times. Special registers (ie NOP/WFI) are
      * never migratable and not even raw-accessible.
      */
-    if (r->type & ARM_CP_SPECIAL_MASK) {
+    if (r2->type & ARM_CP_SPECIAL_MASK) {
         r2->type |= ARM_CP_NO_RAW;
     }
     if ((r->crm == CP_ANY && crm != 0) ||
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 36/60] target/arm: Drop EL3 no EL2 fallbacks
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (34 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 37/60] target/arm: Merge zcr reginfo Richard Henderson
                   ` (24 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Drop el3_no_el2_cp_reginfo, el3_no_el2_v8_cp_reginfo,
and the local vpidr_regs definition, and rely on the
squasing to ARM_CP_CONST while registering.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 158 ++++----------------------------------------
 1 file changed, 13 insertions(+), 145 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index d9837b5bd2..cc65ab887e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5099,124 +5099,6 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
       .fieldoffset = offsetoflow32(CPUARMState, cp15.mdcr_el3) },
 };
 
-/* Used to describe the behaviour of EL2 regs when EL2 does not exist.  */
-static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
-    { .name = "VBAR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 0, .opc2 = 0,
-      .access = PL2_RW,
-      .readfn = arm_cp_read_zero, .writefn = arm_cp_write_ignore },
-    { .name = "HCR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 0,
-      .access = PL2_RW,
-      .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "HACR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 7,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "ESR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 5, .crm = 2, .opc2 = 0,
-      .access = PL2_RW,
-      .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CPTR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 2,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "MAIR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 10, .crm = 2, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "HMAIR1", .state = ARM_CP_STATE_AA32,
-      .cp = 15, .opc1 = 4, .crn = 10, .crm = 2, .opc2 = 1,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "AMAIR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 10, .crm = 3, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "HAMAIR1", .state = ARM_CP_STATE_AA32,
-      .cp = 15, .opc1 = 4, .crn = 10, .crm = 3, .opc2 = 1,
-      .access = PL2_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "AFSR0_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 5, .crm = 1, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "AFSR1_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 5, .crm = 1, .opc2 = 1,
-      .access = PL2_RW, .type = ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "TCR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 2,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "VTCR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
-      .access = PL2_RW, .accessfn = access_el3_aa32ns,
-      .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "VTTBR", .state = ARM_CP_STATE_AA32,
-      .cp = 15, .opc1 = 6, .crm = 2,
-      .access = PL2_RW, .accessfn = access_el3_aa32ns,
-      .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-    { .name = "VTTBR_EL2", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "TPIDR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 2,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "TTBR0_EL2", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "HTTBR", .cp = 15, .opc1 = 4, .crm = 2,
-      .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "CNTHCTL_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 1, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CNTVOFF_EL2", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 0, .opc2 = 3,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CNTVOFF", .cp = 15, .opc1 = 4, .crm = 14,
-      .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "CNTHP_CVAL_EL2", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 2, .opc2 = 2,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CNTHP_CVAL", .cp = 15, .opc1 = 6, .crm = 14,
-      .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_CONST,
-      .resetvalue = 0 },
-    { .name = "CNTHP_TVAL_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 2, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CNTHP_CTL_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 2, .opc2 = 1,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "MDCR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 1,
-      .access = PL2_RW, .accessfn = access_tda,
-      .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "HPFAR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 6, .crm = 0, .opc2 = 4,
-      .access = PL2_RW, .accessfn = access_el3_aa32ns,
-      .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "HSTR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 3,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "FAR_EL2", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 4, .crn = 6, .crm = 0, .opc2 = 0,
-      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "HIFAR", .state = ARM_CP_STATE_AA32,
-      .type = ARM_CP_CONST,
-      .cp = 15, .opc1 = 4, .crn = 6, .crm = 0, .opc2 = 2,
-      .access = PL2_RW, .resetvalue = 0 },
-};
-
-/* Ditto, but for registers which exist in ARMv8 but not v7 */
-static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
-    { .name = "HCR2", .state = ARM_CP_STATE_AA32,
-      .cp = 15, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 4,
-      .access = PL2_RW,
-      .type = ARM_CP_CONST, .resetvalue = 0 },
-};
-
 static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
 {
     ARMCPU *cpu = env_archcpu(env);
@@ -7912,7 +7794,17 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, v8_idregs);
         define_arm_cp_regs(cpu, v8_cp_reginfo);
     }
-    if (arm_feature(env, ARM_FEATURE_EL2)) {
+
+    /*
+     * Register the base EL2 cpregs.
+     * Pre v8, these registers are implemented only as part of the
+     * Virtualization Extensions (EL2 present).  Beginning with v8,
+     * if EL2 is missing but EL3 is enabled, mostly these become
+     * RES0 from EL3, with some specific exceptions.
+     */
+    if (arm_feature(env, ARM_FEATURE_EL2)
+        || (arm_feature(env, ARM_FEATURE_EL3)
+            && arm_feature(env, ARM_FEATURE_V8))) {
         uint64_t vmpidr_def = mpidr_read_val(env);
         ARMCPRegInfo vpidr_regs[] = {
             { .name = "VPIDR", .state = ARM_CP_STATE_AA32,
@@ -7953,33 +7845,9 @@ void register_cp_regs_for_features(ARMCPU *cpu)
             };
             define_one_arm_cp_reg(cpu, &rvbar);
         }
-    } else {
-        /* If EL2 is missing but higher ELs are enabled, we need to
-         * register the no_el2 reginfos.
-         */
-        if (arm_feature(env, ARM_FEATURE_EL3)) {
-            /* When EL3 exists but not EL2, VPIDR and VMPIDR take the value
-             * of MIDR_EL1 and MPIDR_EL1.
-             */
-            ARMCPRegInfo vpidr_regs[] = {
-                { .name = "VPIDR_EL2", .state = ARM_CP_STATE_BOTH,
-                  .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 0,
-                  .access = PL2_RW, .accessfn = access_el3_aa32ns,
-                  .type = ARM_CP_CONST, .resetvalue = cpu->midr,
-                  .fieldoffset = offsetof(CPUARMState, cp15.vpidr_el2) },
-                { .name = "VMPIDR_EL2", .state = ARM_CP_STATE_BOTH,
-                  .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 5,
-                  .access = PL2_RW, .accessfn = access_el3_aa32ns,
-                  .type = ARM_CP_NO_RAW,
-                  .writefn = arm_cp_write_ignore, .readfn = mpidr_read },
-            };
-            define_arm_cp_regs(cpu, vpidr_regs);
-            define_arm_cp_regs(cpu, el3_no_el2_cp_reginfo);
-            if (arm_feature(env, ARM_FEATURE_V8)) {
-                define_arm_cp_regs(cpu, el3_no_el2_v8_cp_reginfo);
-            }
-        }
     }
+
+    /* Register the base EL3 cpregs. */
     if (arm_feature(env, ARM_FEATURE_EL3)) {
         define_arm_cp_regs(cpu, el3_cp_reginfo);
         ARMCPRegInfo el3_regs[] = {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 37/60] target/arm: Merge zcr reginfo
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (35 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 36/60] target/arm: Drop EL3 no EL2 fallbacks Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 38/60] target/arm: Add isar predicates for FEAT_Debugv8p2 Richard Henderson
                   ` (23 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Drop zcr_no_el2_reginfo and merge the 3 registers into one array,
now that ZCR_EL2 can be squashed to RES0 and ZCR_EL3 dropped
while registering.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 55 ++++++++++++++-------------------------------
 1 file changed, 17 insertions(+), 38 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index cc65ab887e..e762054b5d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6132,35 +6132,22 @@ static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     }
 }
 
-static const ARMCPRegInfo zcr_el1_reginfo = {
-    .name = "ZCR_EL1", .state = ARM_CP_STATE_AA64,
-    .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL1_RW, .type = ARM_CP_SVE,
-    .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[1]),
-    .writefn = zcr_write, .raw_writefn = raw_write
-};
-
-static const ARMCPRegInfo zcr_el2_reginfo = {
-    .name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
-    .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL2_RW, .type = ARM_CP_SVE,
-    .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[2]),
-    .writefn = zcr_write, .raw_writefn = raw_write
-};
-
-static const ARMCPRegInfo zcr_no_el2_reginfo = {
-    .name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
-    .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL2_RW, .type = ARM_CP_SVE,
-    .readfn = arm_cp_read_zero, .writefn = arm_cp_write_ignore
-};
-
-static const ARMCPRegInfo zcr_el3_reginfo = {
-    .name = "ZCR_EL3", .state = ARM_CP_STATE_AA64,
-    .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 2, .opc2 = 0,
-    .access = PL3_RW, .type = ARM_CP_SVE,
-    .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[3]),
-    .writefn = zcr_write, .raw_writefn = raw_write
+static const ARMCPRegInfo zcr_reginfo[] = {
+    { .name = "ZCR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 2, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_SVE,
+      .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[1]),
+      .writefn = zcr_write, .raw_writefn = raw_write },
+    { .name = "ZCR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 2, .opc2 = 0,
+      .access = PL2_RW, .type = ARM_CP_SVE,
+      .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[2]),
+      .writefn = zcr_write, .raw_writefn = raw_write },
+    { .name = "ZCR_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 2, .opc2 = 0,
+      .access = PL3_RW, .type = ARM_CP_SVE,
+      .fieldoffset = offsetof(CPUARMState, vfp.zcr_el[3]),
+      .writefn = zcr_write, .raw_writefn = raw_write },
 };
 
 void hw_watchpoint_update(ARMCPU *cpu, int n)
@@ -8240,15 +8227,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
     }
 
     if (cpu_isar_feature(aa64_sve, cpu)) {
-        define_one_arm_cp_reg(cpu, &zcr_el1_reginfo);
-        if (arm_feature(env, ARM_FEATURE_EL2)) {
-            define_one_arm_cp_reg(cpu, &zcr_el2_reginfo);
-        } else {
-            define_one_arm_cp_reg(cpu, &zcr_no_el2_reginfo);
-        }
-        if (arm_feature(env, ARM_FEATURE_EL3)) {
-            define_one_arm_cp_reg(cpu, &zcr_el3_reginfo);
-        }
+        define_arm_cp_regs(cpu, zcr_reginfo);
     }
 
 #ifdef TARGET_AARCH64
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 38/60] target/arm: Add isar predicates for FEAT_Debugv8p2
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (36 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 37/60] target/arm: Merge zcr reginfo Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 39/60] target/arm: Adjust definition of CONTEXTIDR_EL2 Richard Henderson
                   ` (22 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 76c0dd37cd..20bf70545e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3703,6 +3703,11 @@ static inline bool isar_feature_aa32_ssbs(const ARMISARegisters *id)
     return FIELD_EX32(id->id_pfr2, ID_PFR2, SSBS) != 0;
 }
 
+static inline bool isar_feature_aa32_debugv8p2(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_dfr0, ID_DFR0, COPDBG) >= 8;
+}
+
 /*
  * 64-bit feature tests via id registers.
  */
@@ -4009,6 +4014,11 @@ static inline bool isar_feature_aa64_ssbs(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, SSBS) != 0;
 }
 
+static inline bool isar_feature_aa64_debugv8p2(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, DEBUGVER) >= 8;
+}
+
 static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0;
@@ -4092,6 +4102,11 @@ static inline bool isar_feature_any_tts2uxn(const ARMISARegisters *id)
     return isar_feature_aa64_tts2uxn(id) || isar_feature_aa32_tts2uxn(id);
 }
 
+static inline bool isar_feature_any_debugv8p2(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_debugv8p2(id) || isar_feature_aa32_debugv8p2(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 39/60] target/arm: Adjust definition of CONTEXTIDR_EL2
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (37 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 38/60] target/arm: Add isar predicates for FEAT_Debugv8p2 Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 40/60] target/arm: Move cortex impdef sysregs to cpu_tcg.c Richard Henderson
                   ` (21 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

This register is present for either VHE or Debugv8p2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Rely on EL3-no-EL2 squashing during registration.
---
 target/arm/helper.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index e762054b5d..3570212089 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7256,11 +7256,14 @@ static const ARMCPRegInfo jazelle_regs[] = {
       .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
 };
 
+static const ARMCPRegInfo contextidr_el2 = {
+    .name = "CONTEXTIDR_EL2", .state = ARM_CP_STATE_AA64,
+    .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
+    .access = PL2_RW,
+    .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2])
+};
+
 static const ARMCPRegInfo vhe_reginfo[] = {
-    { .name = "CONTEXTIDR_EL2", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
-      .access = PL2_RW,
-      .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2]) },
     { .name = "TTBR1_EL2", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
       .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
@@ -8222,6 +8225,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_one_arm_cp_reg(cpu, &ssbs_reginfo);
     }
 
+    if (cpu_isar_feature(aa64_vh, cpu) ||
+        cpu_isar_feature(aa64_debugv8p2, cpu)) {
+        define_one_arm_cp_reg(cpu, &contextidr_el2);
+    }
     if (arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu)) {
         define_arm_cp_regs(cpu, vhe_reginfo);
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 40/60] target/arm: Move cortex impdef sysregs to cpu_tcg.c
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (38 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 39/60] target/arm: Adjust definition of CONTEXTIDR_EL2 Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 11:01   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 41/60] target/arm: Update qemu-system-arm -cpu max to cortex-a57 Richard Henderson
                   ` (20 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Previously we were defining some of these in user-only mode,
but none of them are accessible from user-only, therefore
define them only in system mode.

This will shortly be used from cpu_tcg.c also.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: New patch.
---
 target/arm/internals.h |  6 ++++
 target/arm/cpu64.c     | 64 +++---------------------------------------
 target/arm/cpu_tcg.c   | 59 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+), 60 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 7f696cd36a..96a57ee68f 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1295,4 +1295,10 @@ int aarch64_fpu_gdb_get_reg(CPUARMState *env, GByteArray *buf, int reg);
 int aarch64_fpu_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg);
 #endif
 
+#ifdef CONFIG_USER_ONLY
+static inline void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu) { }
+#else
+void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu);
+#endif
+
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index c40373bd18..67d628c0af 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -34,65 +34,9 @@
 #include "hvf_arm.h"
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
-#include "cpregs.h"
+#include "internals.h"
 
 
-#ifndef CONFIG_USER_ONLY
-static uint64_t a57_a53_l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
-{
-    ARMCPU *cpu = env_archcpu(env);
-
-    /* Number of cores is in [25:24]; otherwise we RAZ */
-    return (cpu->core_count - 1) << 24;
-}
-#endif
-
-static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
-#ifndef CONFIG_USER_ONLY
-    { .name = "L2CTLR_EL1", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 1, .crn = 11, .crm = 0, .opc2 = 2,
-      .access = PL1_RW, .readfn = a57_a53_l2ctlr_read,
-      .writefn = arm_cp_write_ignore },
-    { .name = "L2CTLR",
-      .cp = 15, .opc1 = 1, .crn = 9, .crm = 0, .opc2 = 2,
-      .access = PL1_RW, .readfn = a57_a53_l2ctlr_read,
-      .writefn = arm_cp_write_ignore },
-#endif
-    { .name = "L2ECTLR_EL1", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 1, .crn = 11, .crm = 0, .opc2 = 3,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "L2ECTLR",
-      .cp = 15, .opc1 = 1, .crn = 9, .crm = 0, .opc2 = 3,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "L2ACTLR", .state = ARM_CP_STATE_BOTH,
-      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CPUACTLR_EL1", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 0,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CPUACTLR",
-      .cp = 15, .opc1 = 0, .crm = 15,
-      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-    { .name = "CPUECTLR_EL1", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 1,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CPUECTLR",
-      .cp = 15, .opc1 = 1, .crm = 15,
-      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-    { .name = "CPUMERRSR_EL1", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 2,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "CPUMERRSR",
-      .cp = 15, .opc1 = 2, .crm = 15,
-      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-    { .name = "L2MERRSR_EL1", .state = ARM_CP_STATE_AA64,
-      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 3,
-      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
-    { .name = "L2MERRSR",
-      .cp = 15, .opc1 = 3, .crm = 15,
-      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
-};
-
 static void aarch64_a57_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
@@ -143,7 +87,7 @@ static void aarch64_a57_initfn(Object *obj)
     cpu->gic_num_lrs = 4;
     cpu->gic_vpribits = 5;
     cpu->gic_vprebits = 5;
-    define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
+    define_cortex_a72_a57_a53_cp_reginfo(cpu);
 }
 
 static void aarch64_a53_initfn(Object *obj)
@@ -196,7 +140,7 @@ static void aarch64_a53_initfn(Object *obj)
     cpu->gic_num_lrs = 4;
     cpu->gic_vpribits = 5;
     cpu->gic_vprebits = 5;
-    define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
+    define_cortex_a72_a57_a53_cp_reginfo(cpu);
 }
 
 static void aarch64_a72_initfn(Object *obj)
@@ -247,7 +191,7 @@ static void aarch64_a72_initfn(Object *obj)
     cpu->gic_num_lrs = 4;
     cpu->gic_vpribits = 5;
     cpu->gic_vprebits = 5;
-    define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
+    define_cortex_a72_a57_a53_cp_reginfo(cpu);
 }
 
 void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 9338088b22..d078f06931 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -20,6 +20,65 @@
 #endif
 #include "cpregs.h"
 
+#ifndef CONFIG_USER_ONLY
+static uint64_t l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    ARMCPU *cpu = env_archcpu(env);
+
+    /* Number of cores is in [25:24]; otherwise we RAZ */
+    return (cpu->core_count - 1) << 24;
+}
+
+static const ARMCPRegInfo cortex_a72_a57_a53_cp_reginfo[] = {
+    { .name = "L2CTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 11, .crm = 0, .opc2 = 2,
+      .access = PL1_RW, .readfn = l2ctlr_read,
+      .writefn = arm_cp_write_ignore },
+    { .name = "L2CTLR",
+      .cp = 15, .opc1 = 1, .crn = 9, .crm = 0, .opc2 = 2,
+      .access = PL1_RW, .readfn = l2ctlr_read,
+      .writefn = arm_cp_write_ignore },
+    { .name = "L2ECTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 11, .crm = 0, .opc2 = 3,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "L2ECTLR",
+      .cp = 15, .opc1 = 1, .crn = 9, .crm = 0, .opc2 = 3,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "L2ACTLR", .state = ARM_CP_STATE_BOTH,
+      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 0, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUACTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 0,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUACTLR",
+      .cp = 15, .opc1 = 0, .crm = 15,
+      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+    { .name = "CPUECTLR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 1,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUECTLR",
+      .cp = 15, .opc1 = 1, .crm = 15,
+      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+    { .name = "CPUMERRSR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 2,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "CPUMERRSR",
+      .cp = 15, .opc1 = 2, .crm = 15,
+      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+    { .name = "L2MERRSR_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 2, .opc2 = 3,
+      .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "L2MERRSR",
+      .cp = 15, .opc1 = 3, .crm = 15,
+      .access = PL1_RW, .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+};
+
+void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu)
+{
+    define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
+}
+#endif /* !CONFIG_USER_ONLY */
+
 /* CPU models. These are not needed for the AArch64 linux-user build. */
 #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 41/60] target/arm: Update qemu-system-arm -cpu max to cortex-a57
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (39 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 40/60] target/arm: Move cortex impdef sysregs to cpu_tcg.c Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 11:02   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 42/60] target/arm: Set ID_DFR0.PerfMon for qemu-system-arm -cpu max Richard Henderson
                   ` (19 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Instead of starting with cortex-a15 and adding v8 features to
a v7 cpu, begin with a v8 cpu stripped of its aarch64 features.
This fixes the long-standing to-do where we only enabled v8
features for user-only.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Create impdef sysregs; only enable short-vector support for user-only.
---
 target/arm/cpu_tcg.c | 151 ++++++++++++++++++++++++++-----------------
 1 file changed, 92 insertions(+), 59 deletions(-)

diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index d078f06931..f9094c1752 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -994,71 +994,104 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
 static void arm_max_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
+    uint32_t t;
 
-    cortex_a15_initfn(obj);
+    /* aarch64_a57_initfn, advertising none of the aarch64 features */
+    cpu->dtb_compatible = "arm,cortex-a57";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+    cpu->midr = 0x411fd070;
+    cpu->revidr = 0x00000000;
+    cpu->reset_fpsid = 0x41034070;
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x12111111;
+    cpu->isar.mvfr2 = 0x00000043;
+    cpu->ctr = 0x8444c004;
+    cpu->reset_sctlr = 0x00c50838;
+    cpu->isar.id_pfr0 = 0x00000131;
+    cpu->isar.id_pfr1 = 0x00011011;
+    cpu->isar.id_dfr0 = 0x03010066;
+    cpu->id_afr0 = 0x00000000;
+    cpu->isar.id_mmfr0 = 0x10101105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02102211;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00011142;
+    cpu->isar.id_isar5 = 0x00011121;
+    cpu->isar.id_isar6 = 0;
+    cpu->isar.dbgdidr = 0x3516d000;
+    cpu->clidr = 0x0a200023;
+    cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
+    cpu->ccsidr[1] = 0x201fe012; /* 48KB L1 icache */
+    cpu->ccsidr[2] = 0x70ffe07a; /* 2048KB L2 cache */
+    define_cortex_a72_a57_a53_cp_reginfo(cpu);
 
-    /* old-style VFP short-vector support */
-    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+    /* Add additional features supported by QEMU */
+    t = cpu->isar.id_isar5;
+    t = FIELD_DP32(t, ID_ISAR5, AES, 2);
+    t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
+    t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
+    t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
+    t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
+    t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
+    cpu->isar.id_isar5 = t;
+
+    t = cpu->isar.id_isar6;
+    t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
+    t = FIELD_DP32(t, ID_ISAR6, DP, 1);
+    t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
+    t = FIELD_DP32(t, ID_ISAR6, SB, 1);
+    t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);
+    t = FIELD_DP32(t, ID_ISAR6, BF16, 1);
+    t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);
+    cpu->isar.id_isar6 = t;
+
+    t = cpu->isar.mvfr1;
+    t = FIELD_DP32(t, MVFR1, FPHP, 3);     /* v8.2-FP16 */
+    t = FIELD_DP32(t, MVFR1, SIMDHP, 2);   /* v8.2-FP16 */
+    cpu->isar.mvfr1 = t;
+
+    t = cpu->isar.mvfr2;
+    t = FIELD_DP32(t, MVFR2, SIMDMISC, 3); /* SIMD MaxNum */
+    t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
+    cpu->isar.mvfr2 = t;
+
+    t = cpu->isar.id_mmfr3;
+    t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
+    cpu->isar.id_mmfr3 = t;
+
+    t = cpu->isar.id_mmfr4;
+    t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
+    t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
+    t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
+    t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
+    cpu->isar.id_mmfr4 = t;
+
+    t = cpu->isar.id_pfr0;
+    t = FIELD_DP32(t, ID_PFR0, DIT, 1);
+    cpu->isar.id_pfr0 = t;
+
+    t = cpu->isar.id_pfr2;
+    t = FIELD_DP32(t, ID_PFR2, SSBS, 1);
+    cpu->isar.id_pfr2 = t;
 
 #ifdef CONFIG_USER_ONLY
     /*
-     * We don't set these in system emulation mode for the moment,
-     * since we don't correctly set (all of) the ID registers to
-     * advertise them.
+     * Break with true ARMv8 and add back old-style VFP short-vector support.
+     * Only do this for user-mode, where -cpu max is the default, so that
+     * older v6 and v7 programs are more likely to work without adjustment.
      */
-    set_feature(&cpu->env, ARM_FEATURE_V8);
-    {
-        uint32_t t;
-
-        t = cpu->isar.id_isar5;
-        t = FIELD_DP32(t, ID_ISAR5, AES, 2);
-        t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
-        t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
-        t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
-        t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
-        t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
-        cpu->isar.id_isar5 = t;
-
-        t = cpu->isar.id_isar6;
-        t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
-        t = FIELD_DP32(t, ID_ISAR6, DP, 1);
-        t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
-        t = FIELD_DP32(t, ID_ISAR6, SB, 1);
-        t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);
-        t = FIELD_DP32(t, ID_ISAR6, BF16, 1);
-        t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);
-        cpu->isar.id_isar6 = t;
-
-        t = cpu->isar.mvfr1;
-        t = FIELD_DP32(t, MVFR1, FPHP, 3);     /* v8.2-FP16 */
-        t = FIELD_DP32(t, MVFR1, SIMDHP, 2);   /* v8.2-FP16 */
-        cpu->isar.mvfr1 = t;
-
-        t = cpu->isar.mvfr2;
-        t = FIELD_DP32(t, MVFR2, SIMDMISC, 3); /* SIMD MaxNum */
-        t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
-        cpu->isar.mvfr2 = t;
-
-        t = cpu->isar.id_mmfr3;
-        t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
-        cpu->isar.id_mmfr3 = t;
-
-        t = cpu->isar.id_mmfr4;
-        t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
-        t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-        t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
-        t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
-        cpu->isar.id_mmfr4 = t;
-
-        t = cpu->isar.id_pfr0;
-        t = FIELD_DP32(t, ID_PFR0, DIT, 1);
-        cpu->isar.id_pfr0 = t;
-
-        t = cpu->isar.id_pfr2;
-        t = FIELD_DP32(t, ID_PFR2, SSBS, 1);
-        cpu->isar.id_pfr2 = t;
-    }
-#endif /* CONFIG_USER_ONLY */
+    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+#endif
 }
 #endif /* !TARGET_AARCH64 */
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 42/60] target/arm: Set ID_DFR0.PerfMon for qemu-system-arm -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (40 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 41/60] target/arm: Update qemu-system-arm -cpu max to cortex-a57 Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 43/60] target/arm: Split out aa32_max_features Richard Henderson
                   ` (18 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

We set this for qemu-system-aarch64, but failed to do so
for the strictly 32-bit emulation.

Fixes: 3bec78447a9 ("target/arm: Provide ARMv8.4-PMU in '-cpu max'")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu_tcg.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index f9094c1752..9aa2f737c1 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -1084,6 +1084,10 @@ static void arm_max_initfn(Object *obj)
     t = FIELD_DP32(t, ID_PFR2, SSBS, 1);
     cpu->isar.id_pfr2 = t;
 
+    t = cpu->isar.id_dfr0;
+    t = FIELD_DP32(t, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
+    cpu->isar.id_dfr0 = t;
+
 #ifdef CONFIG_USER_ONLY
     /*
      * Break with true ARMv8 and add back old-style VFP short-vector support.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 43/60] target/arm: Split out aa32_max_features
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (41 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 42/60] target/arm: Set ID_DFR0.PerfMon for qemu-system-arm -cpu max Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 11:03   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 44/60] target/arm: Annotate arm_max_initfn with FEAT identifiers Richard Henderson
                   ` (17 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Share the code to set AArch32 max features so that we no
longer have code drift between qemu{-system,}-{arm,aarch64}.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/internals.h |   2 +
 target/arm/cpu64.c     |  50 +-----------------
 target/arm/cpu_tcg.c   | 114 ++++++++++++++++++++++-------------------
 3 files changed, 65 insertions(+), 101 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 96a57ee68f..baa2a7e1f4 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1301,4 +1301,6 @@ static inline void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu) { }
 void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu);
 #endif
 
+void aa32_max_features(ARMCPU *cpu);
+
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 67d628c0af..8f9e9d6dff 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -682,7 +682,6 @@ static void aarch64_max_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
     uint64_t t;
-    uint32_t u;
 
     if (kvm_enabled() || hvf_enabled()) {
         /* With KVM or HVF, '-cpu max' is identical to '-cpu host' */
@@ -797,57 +796,12 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64ZFR0, F64MM, 1);
     cpu->isar.id_aa64zfr0 = t;
 
-    /* Replicate the same data to the 32-bit id registers.  */
-    u = cpu->isar.id_isar5;
-    u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
-    u = FIELD_DP32(u, ID_ISAR5, SHA1, 1);
-    u = FIELD_DP32(u, ID_ISAR5, SHA2, 1);
-    u = FIELD_DP32(u, ID_ISAR5, CRC32, 1);
-    u = FIELD_DP32(u, ID_ISAR5, RDM, 1);
-    u = FIELD_DP32(u, ID_ISAR5, VCMA, 1);
-    cpu->isar.id_isar5 = u;
-
-    u = cpu->isar.id_isar6;
-    u = FIELD_DP32(u, ID_ISAR6, JSCVT, 1);
-    u = FIELD_DP32(u, ID_ISAR6, DP, 1);
-    u = FIELD_DP32(u, ID_ISAR6, FHM, 1);
-    u = FIELD_DP32(u, ID_ISAR6, SB, 1);
-    u = FIELD_DP32(u, ID_ISAR6, SPECRES, 1);
-    u = FIELD_DP32(u, ID_ISAR6, BF16, 1);
-    u = FIELD_DP32(u, ID_ISAR6, I8MM, 1);
-    cpu->isar.id_isar6 = u;
-
-    u = cpu->isar.id_pfr0;
-    u = FIELD_DP32(u, ID_PFR0, DIT, 1);
-    cpu->isar.id_pfr0 = u;
-
-    u = cpu->isar.id_pfr2;
-    u = FIELD_DP32(u, ID_PFR2, SSBS, 1);
-    cpu->isar.id_pfr2 = u;
-
-    u = cpu->isar.id_mmfr3;
-    u = FIELD_DP32(u, ID_MMFR3, PAN, 2); /* ATS1E1 */
-    cpu->isar.id_mmfr3 = u;
-
-    u = cpu->isar.id_mmfr4;
-    u = FIELD_DP32(u, ID_MMFR4, HPDS, 1); /* AA32HPD */
-    u = FIELD_DP32(u, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-    u = FIELD_DP32(u, ID_MMFR4, CNP, 1); /* TTCNP */
-    u = FIELD_DP32(u, ID_MMFR4, XNX, 1); /* TTS2UXN */
-    cpu->isar.id_mmfr4 = u;
-
     t = cpu->isar.id_aa64dfr0;
     t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
     cpu->isar.id_aa64dfr0 = t;
 
-    u = cpu->isar.id_dfr0;
-    u = FIELD_DP32(u, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
-    cpu->isar.id_dfr0 = u;
-
-    u = cpu->isar.mvfr1;
-    u = FIELD_DP32(u, MVFR1, FPHP, 3);      /* v8.2-FP16 */
-    u = FIELD_DP32(u, MVFR1, SIMDHP, 2);    /* v8.2-FP16 */
-    cpu->isar.mvfr1 = u;
+    /* Replicate the same data to the 32-bit id registers.  */
+    aa32_max_features(cpu);
 
 #ifdef CONFIG_USER_ONLY
     /*
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 9aa2f737c1..b0dbf2c991 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -20,6 +20,66 @@
 #endif
 #include "cpregs.h"
 
+
+/* Share AArch32 -cpu max features with AArch64. */
+void aa32_max_features(ARMCPU *cpu)
+{
+    uint32_t t;
+
+    /* Add additional features supported by QEMU */
+    t = cpu->isar.id_isar5;
+    t = FIELD_DP32(t, ID_ISAR5, AES, 2);
+    t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
+    t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
+    t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
+    t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
+    t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
+    cpu->isar.id_isar5 = t;
+
+    t = cpu->isar.id_isar6;
+    t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
+    t = FIELD_DP32(t, ID_ISAR6, DP, 1);
+    t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
+    t = FIELD_DP32(t, ID_ISAR6, SB, 1);
+    t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);
+    t = FIELD_DP32(t, ID_ISAR6, BF16, 1);
+    t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);
+    cpu->isar.id_isar6 = t;
+
+    t = cpu->isar.mvfr1;
+    t = FIELD_DP32(t, MVFR1, FPHP, 3);     /* v8.2-FP16 */
+    t = FIELD_DP32(t, MVFR1, SIMDHP, 2);   /* v8.2-FP16 */
+    cpu->isar.mvfr1 = t;
+
+    t = cpu->isar.mvfr2;
+    t = FIELD_DP32(t, MVFR2, SIMDMISC, 3); /* SIMD MaxNum */
+    t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
+    cpu->isar.mvfr2 = t;
+
+    t = cpu->isar.id_mmfr3;
+    t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
+    cpu->isar.id_mmfr3 = t;
+
+    t = cpu->isar.id_mmfr4;
+    t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
+    t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
+    t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
+    t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
+    cpu->isar.id_mmfr4 = t;
+
+    t = cpu->isar.id_pfr0;
+    t = FIELD_DP32(t, ID_PFR0, DIT, 1);
+    cpu->isar.id_pfr0 = t;
+
+    t = cpu->isar.id_pfr2;
+    t = FIELD_DP32(t, ID_PFR2, SSBS, 1);
+    cpu->isar.id_pfr2 = t;
+
+    t = cpu->isar.id_dfr0;
+    t = FIELD_DP32(t, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
+    cpu->isar.id_dfr0 = t;
+}
+
 #ifndef CONFIG_USER_ONLY
 static uint64_t l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
@@ -994,7 +1054,6 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
 static void arm_max_initfn(Object *obj)
 {
     ARMCPU *cpu = ARM_CPU(obj);
-    uint32_t t;
 
     /* aarch64_a57_initfn, advertising none of the aarch64 features */
     cpu->dtb_compatible = "arm,cortex-a57";
@@ -1035,58 +1094,7 @@ static void arm_max_initfn(Object *obj)
     cpu->ccsidr[2] = 0x70ffe07a; /* 2048KB L2 cache */
     define_cortex_a72_a57_a53_cp_reginfo(cpu);
 
-    /* Add additional features supported by QEMU */
-    t = cpu->isar.id_isar5;
-    t = FIELD_DP32(t, ID_ISAR5, AES, 2);
-    t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
-    t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
-    t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
-    t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
-    t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
-    cpu->isar.id_isar5 = t;
-
-    t = cpu->isar.id_isar6;
-    t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
-    t = FIELD_DP32(t, ID_ISAR6, DP, 1);
-    t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
-    t = FIELD_DP32(t, ID_ISAR6, SB, 1);
-    t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);
-    t = FIELD_DP32(t, ID_ISAR6, BF16, 1);
-    t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);
-    cpu->isar.id_isar6 = t;
-
-    t = cpu->isar.mvfr1;
-    t = FIELD_DP32(t, MVFR1, FPHP, 3);     /* v8.2-FP16 */
-    t = FIELD_DP32(t, MVFR1, SIMDHP, 2);   /* v8.2-FP16 */
-    cpu->isar.mvfr1 = t;
-
-    t = cpu->isar.mvfr2;
-    t = FIELD_DP32(t, MVFR2, SIMDMISC, 3); /* SIMD MaxNum */
-    t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
-    cpu->isar.mvfr2 = t;
-
-    t = cpu->isar.id_mmfr3;
-    t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
-    cpu->isar.id_mmfr3 = t;
-
-    t = cpu->isar.id_mmfr4;
-    t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
-    t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-    t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
-    t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
-    cpu->isar.id_mmfr4 = t;
-
-    t = cpu->isar.id_pfr0;
-    t = FIELD_DP32(t, ID_PFR0, DIT, 1);
-    cpu->isar.id_pfr0 = t;
-
-    t = cpu->isar.id_pfr2;
-    t = FIELD_DP32(t, ID_PFR2, SSBS, 1);
-    cpu->isar.id_pfr2 = t;
-
-    t = cpu->isar.id_dfr0;
-    t = FIELD_DP32(t, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
-    cpu->isar.id_dfr0 = t;
+    aa32_max_features(cpu);
 
 #ifdef CONFIG_USER_ONLY
     /*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 44/60] target/arm: Annotate arm_max_initfn with FEAT identifiers
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (42 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 43/60] target/arm: Split out aa32_max_features Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 45/60] target/arm: Use field names for manipulating EL2 and EL3 modes Richard Henderson
                   ` (16 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Update the legacy feature names to the current names.
Provide feature names for id changes that were not marked.
Sort the field updates into increasing bitfield order.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu64.c   | 96 ++++++++++++++++++++++----------------------
 target/arm/cpu_tcg.c | 48 +++++++++++-----------
 2 files changed, 72 insertions(+), 72 deletions(-)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 8f9e9d6dff..c9476680d4 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -713,51 +713,51 @@ static void aarch64_max_initfn(Object *obj)
     cpu->midr = t;
 
     t = cpu->isar.id_aa64isar0;
-    t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
-    t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, SHA2, 2); /* SHA512 */
+    t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2);      /* FEAT_PMULL */
+    t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);     /* FEAT_SHA1 */
+    t = FIELD_DP64(t, ID_AA64ISAR0, SHA2, 2);     /* FEAT_SHA512 */
     t = FIELD_DP64(t, ID_AA64ISAR0, CRC32, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, ATOMIC, 2);
-    t = FIELD_DP64(t, ID_AA64ISAR0, RDM, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, SHA3, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, FHM, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR0, TS, 2); /* v8.5-CondM */
-    t = FIELD_DP64(t, ID_AA64ISAR0, TLB, 2); /* FEAT_TLBIRANGE */
-    t = FIELD_DP64(t, ID_AA64ISAR0, RNDR, 1);
+    t = FIELD_DP64(t, ID_AA64ISAR0, ATOMIC, 2);   /* FEAT_LSE */
+    t = FIELD_DP64(t, ID_AA64ISAR0, RDM, 1);      /* FEAT_RDM */
+    t = FIELD_DP64(t, ID_AA64ISAR0, SHA3, 1);     /* FEAT_SHA3 */
+    t = FIELD_DP64(t, ID_AA64ISAR0, SM3, 1);      /* FEAT_SM3 */
+    t = FIELD_DP64(t, ID_AA64ISAR0, SM4, 1);      /* FEAT_SM4 */
+    t = FIELD_DP64(t, ID_AA64ISAR0, DP, 1);       /* FEAT_DotProd */
+    t = FIELD_DP64(t, ID_AA64ISAR0, FHM, 1);      /* FEAT_FHM */
+    t = FIELD_DP64(t, ID_AA64ISAR0, TS, 2);       /* FEAT_FlagM2 */
+    t = FIELD_DP64(t, ID_AA64ISAR0, TLB, 2);      /* FEAT_TLBIRANGE */
+    t = FIELD_DP64(t, ID_AA64ISAR0, RNDR, 1);     /* FEAT_RNG */
     cpu->isar.id_aa64isar0 = t;
 
     t = cpu->isar.id_aa64isar1;
-    t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);
-    t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR1, SB, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR1, FRINTTS, 1);
-    t = FIELD_DP64(t, ID_AA64ISAR1, LRCPC, 2); /* ARMv8.4-RCPC */
-    t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1);
+    t = FIELD_DP64(t, ID_AA64ISAR1, DPB, 2);      /* FEAT_DPB2 */
+    t = FIELD_DP64(t, ID_AA64ISAR1, JSCVT, 1);    /* FEAT_JSCVT */
+    t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);     /* FEAT_FCMA */
+    t = FIELD_DP64(t, ID_AA64ISAR1, LRCPC, 2);    /* FEAT_LRCPC2 */
+    t = FIELD_DP64(t, ID_AA64ISAR1, FRINTTS, 1);  /* FEAT_FRINTTS */
+    t = FIELD_DP64(t, ID_AA64ISAR1, SB, 1);       /* FEAT_SB */
+    t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1);  /* FEAT_SPECRES */
+    t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1);     /* FEAT_BF16 */
+    t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1);     /* FEAT_I8MM */
     cpu->isar.id_aa64isar1 = t;
 
     t = cpu->isar.id_aa64pfr0;
+    t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);        /* FEAT_FP16 */
+    t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);   /* FEAT_FP16 */
     t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
-    t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);
-    t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);
-    t = FIELD_DP64(t, ID_AA64PFR0, SEL2, 1);
-    t = FIELD_DP64(t, ID_AA64PFR0, DIT, 1);
+    t = FIELD_DP64(t, ID_AA64PFR0, SEL2, 1);      /* FEAT_SEL2 */
+    t = FIELD_DP64(t, ID_AA64PFR0, DIT, 1);       /* FEAT_DIT */
     cpu->isar.id_aa64pfr0 = t;
 
     t = cpu->isar.id_aa64pfr1;
-    t = FIELD_DP64(t, ID_AA64PFR1, BT, 1);
-    t = FIELD_DP64(t, ID_AA64PFR1, SSBS, 2);
+    t = FIELD_DP64(t, ID_AA64PFR1, BT, 1);        /* FEAT_BTI */
+    t = FIELD_DP64(t, ID_AA64PFR1, SSBS, 2);      /* FEAT_SSBS2 */
     /*
      * Begin with full support for MTE. This will be downgraded to MTE=0
      * during realize if the board provides no tag memory, much like
      * we do for EL2 with the virtualization=on property.
      */
-    t = FIELD_DP64(t, ID_AA64PFR1, MTE, 3);
+    t = FIELD_DP64(t, ID_AA64PFR1, MTE, 3);       /* FEAT_MTE3 */
     cpu->isar.id_aa64pfr1 = t;
 
     t = cpu->isar.id_aa64mmfr0;
@@ -769,35 +769,35 @@ static void aarch64_max_initfn(Object *obj)
     cpu->isar.id_aa64mmfr0 = t;
 
     t = cpu->isar.id_aa64mmfr1;
-    t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1); /* HPD */
-    t = FIELD_DP64(t, ID_AA64MMFR1, LO, 1);
-    t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
-    t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2); /* ATS1E1 */
-    t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* VMID16 */
-    t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1); /* TTS2UXN */
+    t = FIELD_DP64(t, ID_AA64MMFR1, VMIDBITS, 2); /* FEAT_VMID16 */
+    t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);       /* FEAT_VHE */
+    t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1);     /* FEAT_HPDS */
+    t = FIELD_DP64(t, ID_AA64MMFR1, LO, 1);       /* FEAT_LOR */
+    t = FIELD_DP64(t, ID_AA64MMFR1, PAN, 2);      /* FEAT_PAN2 */
+    t = FIELD_DP64(t, ID_AA64MMFR1, XNX, 1);      /* FEAT_XNX */
     cpu->isar.id_aa64mmfr1 = t;
 
     t = cpu->isar.id_aa64mmfr2;
-    t = FIELD_DP64(t, ID_AA64MMFR2, UAO, 1);
-    t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1); /* TTCNP */
-    t = FIELD_DP64(t, ID_AA64MMFR2, ST, 1); /* TTST */
-    t = FIELD_DP64(t, ID_AA64MMFR2, VARANGE, 1); /* FEAT_LVA */
+    t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1);      /* FEAT_TTCNP */
+    t = FIELD_DP64(t, ID_AA64MMFR2, UAO, 1);      /* FEAT_UAO */
+    t = FIELD_DP64(t, ID_AA64MMFR2, VARANGE, 1);  /* FEAT_LVA */
+    t = FIELD_DP64(t, ID_AA64MMFR2, ST, 1);       /* FEAT_TTST */
     cpu->isar.id_aa64mmfr2 = t;
 
     t = cpu->isar.id_aa64zfr0;
     t = FIELD_DP64(t, ID_AA64ZFR0, SVEVER, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2);  /* PMULL */
-    t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, F32MM, 1);
-    t = FIELD_DP64(t, ID_AA64ZFR0, F64MM, 1);
+    t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2);       /* FEAT_SVE_PMULL128 */
+    t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1);   /* FEAT_SVE_BitPerm */
+    t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 1);  /* FEAT_BF16 */
+    t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1);      /* FEAT_SVE_SHA3 */
+    t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1);       /* FEAT_SVE_SM4 */
+    t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1);      /* FEAT_I8MM */
+    t = FIELD_DP64(t, ID_AA64ZFR0, F32MM, 1);     /* FEAT_F32MM */
+    t = FIELD_DP64(t, ID_AA64ZFR0, F64MM, 1);     /* FEAT_F64MM */
     cpu->isar.id_aa64zfr0 = t;
 
     t = cpu->isar.id_aa64dfr0;
-    t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5); /* v8.4-PMU */
+    t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5);    /* FEAT_PMUv3p4 */
     cpu->isar.id_aa64dfr0 = t;
 
     /* Replicate the same data to the 32-bit id registers.  */
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index b0dbf2c991..bc8f9d0edf 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -28,55 +28,55 @@ void aa32_max_features(ARMCPU *cpu)
 
     /* Add additional features supported by QEMU */
     t = cpu->isar.id_isar5;
-    t = FIELD_DP32(t, ID_ISAR5, AES, 2);
-    t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);
-    t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);
+    t = FIELD_DP32(t, ID_ISAR5, AES, 2);          /* FEAT_PMULL */
+    t = FIELD_DP32(t, ID_ISAR5, SHA1, 1);         /* FEAT_SHA1 */
+    t = FIELD_DP32(t, ID_ISAR5, SHA2, 1);         /* FEAT_SHA256 */
     t = FIELD_DP32(t, ID_ISAR5, CRC32, 1);
-    t = FIELD_DP32(t, ID_ISAR5, RDM, 1);
-    t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);
+    t = FIELD_DP32(t, ID_ISAR5, RDM, 1);          /* FEAT_RDM */
+    t = FIELD_DP32(t, ID_ISAR5, VCMA, 1);         /* FEAT_FCMA */
     cpu->isar.id_isar5 = t;
 
     t = cpu->isar.id_isar6;
-    t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);
-    t = FIELD_DP32(t, ID_ISAR6, DP, 1);
-    t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
-    t = FIELD_DP32(t, ID_ISAR6, SB, 1);
-    t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);
-    t = FIELD_DP32(t, ID_ISAR6, BF16, 1);
-    t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);
+    t = FIELD_DP32(t, ID_ISAR6, JSCVT, 1);        /* FEAT_JSCVT */
+    t = FIELD_DP32(t, ID_ISAR6, DP, 1);           /* Feat_DotProd */
+    t = FIELD_DP32(t, ID_ISAR6, FHM, 1);          /* FEAT_FHM */
+    t = FIELD_DP32(t, ID_ISAR6, SB, 1);           /* FEAT_SB */
+    t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);      /* FEAT_SPECRES */
+    t = FIELD_DP32(t, ID_ISAR6, BF16, 1);         /* FEAT_AA32BF16 */
+    t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);         /* FEAT_AA32I8MM */
     cpu->isar.id_isar6 = t;
 
     t = cpu->isar.mvfr1;
-    t = FIELD_DP32(t, MVFR1, FPHP, 3);     /* v8.2-FP16 */
-    t = FIELD_DP32(t, MVFR1, SIMDHP, 2);   /* v8.2-FP16 */
+    t = FIELD_DP32(t, MVFR1, FPHP, 3);            /* FEAT_FP16 */
+    t = FIELD_DP32(t, MVFR1, SIMDHP, 2);          /* FEAT_FP16 */
     cpu->isar.mvfr1 = t;
 
     t = cpu->isar.mvfr2;
-    t = FIELD_DP32(t, MVFR2, SIMDMISC, 3); /* SIMD MaxNum */
-    t = FIELD_DP32(t, MVFR2, FPMISC, 4);   /* FP MaxNum */
+    t = FIELD_DP32(t, MVFR2, SIMDMISC, 3);        /* SIMD MaxNum */
+    t = FIELD_DP32(t, MVFR2, FPMISC, 4);          /* FP MaxNum */
     cpu->isar.mvfr2 = t;
 
     t = cpu->isar.id_mmfr3;
-    t = FIELD_DP32(t, ID_MMFR3, PAN, 2); /* ATS1E1 */
+    t = FIELD_DP32(t, ID_MMFR3, PAN, 2);          /* FEAT_PAN2 */
     cpu->isar.id_mmfr3 = t;
 
     t = cpu->isar.id_mmfr4;
-    t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
-    t = FIELD_DP32(t, ID_MMFR4, AC2, 1); /* ACTLR2, HACTLR2 */
-    t = FIELD_DP32(t, ID_MMFR4, CNP, 1); /* TTCNP */
-    t = FIELD_DP32(t, ID_MMFR4, XNX, 1); /* TTS2UXN */
+    t = FIELD_DP32(t, ID_MMFR4, HPDS, 1);         /* FEAT_AA32HPD */
+    t = FIELD_DP32(t, ID_MMFR4, AC2, 1);          /* ACTLR2, HACTLR2 */
+    t = FIELD_DP32(t, ID_MMFR4, CNP, 1);          /* FEAT_TTCNP */
+    t = FIELD_DP32(t, ID_MMFR4, XNX, 1);          /* FEAT_XNX*/
     cpu->isar.id_mmfr4 = t;
 
     t = cpu->isar.id_pfr0;
-    t = FIELD_DP32(t, ID_PFR0, DIT, 1);
+    t = FIELD_DP32(t, ID_PFR0, DIT, 1);           /* FEAT_DIT */
     cpu->isar.id_pfr0 = t;
 
     t = cpu->isar.id_pfr2;
-    t = FIELD_DP32(t, ID_PFR2, SSBS, 1);
+    t = FIELD_DP32(t, ID_PFR2, SSBS, 1);          /* FEAT_SSBS */
     cpu->isar.id_pfr2 = t;
 
     t = cpu->isar.id_dfr0;
-    t = FIELD_DP32(t, ID_DFR0, PERFMON, 5); /* v8.4-PMU */
+    t = FIELD_DP32(t, ID_DFR0, PERFMON, 5);       /* FEAT_PMUv3p4 */
     cpu->isar.id_dfr0 = t;
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 45/60] target/arm: Use field names for manipulating EL2 and EL3 modes
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (43 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 44/60] target/arm: Annotate arm_max_initfn with FEAT identifiers Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 46/60] target/arm: Enable FEAT_Debugv8p2 for -cpu max Richard Henderson
                   ` (15 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Use FIELD_DP{32,64} to manipulate id_pfr1 and id_aa64pfr0
during arm_cpu_realizefn.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 3da8841eb2..33451cecf7 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1782,11 +1782,13 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          */
         unset_feature(env, ARM_FEATURE_EL3);
 
-        /* Disable the security extension feature bits in the processor feature
-         * registers as well. These are id_pfr1[7:4] and id_aa64pfr0[15:12].
+        /*
+         * Disable the security extension feature bits in the processor
+         * feature registers as well.
          */
-        cpu->isar.id_pfr1 &= ~0xf0;
-        cpu->isar.id_aa64pfr0 &= ~0xf000;
+        cpu->isar.id_pfr1 = FIELD_DP32(cpu->isar.id_pfr1, ID_PFR1, SECURITY, 0);
+        cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
+                                           ID_AA64PFR0, EL3, 0);
     }
 
     if (!cpu->has_el2) {
@@ -1817,12 +1819,14 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
     if (!arm_feature(env, ARM_FEATURE_EL2)) {
-        /* Disable the hypervisor feature bits in the processor feature
-         * registers if we don't have EL2. These are id_pfr1[15:12] and
-         * id_aa64pfr0_el1[11:8].
+        /*
+         * Disable the hypervisor feature bits in the processor feature
+         * registers if we don't have EL2.
          */
-        cpu->isar.id_aa64pfr0 &= ~0xf00;
-        cpu->isar.id_pfr1 &= ~0xf000;
+        cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
+                                           ID_AA64PFR0, EL2, 0);
+        cpu->isar.id_pfr1 = FIELD_DP32(cpu->isar.id_pfr1,
+                                       ID_PFR1, VIRTUALIZATION, 0);
     }
 
 #ifndef CONFIG_USER_ONLY
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 46/60] target/arm: Enable FEAT_Debugv8p2 for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (44 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 45/60] target/arm: Use field names for manipulating EL2 and EL3 modes Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 47/60] target/arm: Enable FEAT_Debugv8p4 " Richard Henderson
                   ` (14 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

The only portion of FEAT_Debugv8p2 that is relevant to QEMU
is CONTEXTIDR_EL2, which is also conditionally implemented
with FEAT_VHE.  The rest of the debug extension concerns the
External debug interface, which is outside the scope of QEMU.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu.c              | 1 +
 target/arm/cpu64.c            | 1 +
 target/arm/cpu_tcg.c          | 2 ++
 4 files changed, 5 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 520fd39071..035f7cf9d0 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -13,6 +13,7 @@ the following architecture extensions:
 - FEAT_BTI (Branch Target Identification)
 - FEAT_DIT (Data Independent Timing instructions)
 - FEAT_DPB (DC CVAP instruction)
+- FEAT_Debugv8p2 (Debug changes for v8.2)
 - FEAT_DotProd (Advanced SIMD dot product instructions)
 - FEAT_FCMA (Floating-point complex number instructions)
 - FEAT_FHM (Floating-point half-precision multiplication instructions)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 33451cecf7..fc0d74b4d1 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1787,6 +1787,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
          * feature registers as well.
          */
         cpu->isar.id_pfr1 = FIELD_DP32(cpu->isar.id_pfr1, ID_PFR1, SECURITY, 0);
+        cpu->isar.id_dfr0 = FIELD_DP32(cpu->isar.id_dfr0, ID_DFR0, COPSDBG, 0);
         cpu->isar.id_aa64pfr0 = FIELD_DP64(cpu->isar.id_aa64pfr0,
                                            ID_AA64PFR0, EL3, 0);
     }
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index c9476680d4..2003d0a8c9 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -797,6 +797,7 @@ static void aarch64_max_initfn(Object *obj)
     cpu->isar.id_aa64zfr0 = t;
 
     t = cpu->isar.id_aa64dfr0;
+    t = FIELD_DP64(t, ID_AA64DFR0, DEBUGVER, 8);  /* FEAT_Debugv8p2 */
     t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5);    /* FEAT_PMUv3p4 */
     cpu->isar.id_aa64dfr0 = t;
 
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index bc8f9d0edf..b6fc3752f2 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -76,6 +76,8 @@ void aa32_max_features(ARMCPU *cpu)
     cpu->isar.id_pfr2 = t;
 
     t = cpu->isar.id_dfr0;
+    t = FIELD_DP32(t, ID_DFR0, COPDBG, 8);        /* FEAT_Debugv8p2 */
+    t = FIELD_DP32(t, ID_DFR0, COPSDBG, 8);       /* FEAT_Debugv8p2 */
     t = FIELD_DP32(t, ID_DFR0, PERFMON, 5);       /* FEAT_PMUv3p4 */
     cpu->isar.id_dfr0 = t;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 47/60] target/arm: Enable FEAT_Debugv8p4 for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (45 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 46/60] target/arm: Enable FEAT_Debugv8p2 for -cpu max Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 48/60] target/arm: Add isar_feature_{aa64,any}_ras Richard Henderson
                   ` (13 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

This extension concerns changes to the External Debug interface,
with Secure and Non-secure access to the debug registers, and all
of it is outside the scope of QEMU.  Indicating support for this
is mandatory with FEAT_SEL2, which we do implement.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 2 +-
 target/arm/cpu_tcg.c          | 4 ++--
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 035f7cf9d0..c89c344de1 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -14,6 +14,7 @@ the following architecture extensions:
 - FEAT_DIT (Data Independent Timing instructions)
 - FEAT_DPB (DC CVAP instruction)
 - FEAT_Debugv8p2 (Debug changes for v8.2)
+- FEAT_Debugv8p4 (Debug changes for v8.4)
 - FEAT_DotProd (Advanced SIMD dot product instructions)
 - FEAT_FCMA (Floating-point complex number instructions)
 - FEAT_FHM (Floating-point half-precision multiplication instructions)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 2003d0a8c9..136590382a 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -797,7 +797,7 @@ static void aarch64_max_initfn(Object *obj)
     cpu->isar.id_aa64zfr0 = t;
 
     t = cpu->isar.id_aa64dfr0;
-    t = FIELD_DP64(t, ID_AA64DFR0, DEBUGVER, 8);  /* FEAT_Debugv8p2 */
+    t = FIELD_DP64(t, ID_AA64DFR0, DEBUGVER, 9);  /* FEAT_Debugv8p4 */
     t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 5);    /* FEAT_PMUv3p4 */
     cpu->isar.id_aa64dfr0 = t;
 
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index b6fc3752f2..337598e949 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -76,8 +76,8 @@ void aa32_max_features(ARMCPU *cpu)
     cpu->isar.id_pfr2 = t;
 
     t = cpu->isar.id_dfr0;
-    t = FIELD_DP32(t, ID_DFR0, COPDBG, 8);        /* FEAT_Debugv8p2 */
-    t = FIELD_DP32(t, ID_DFR0, COPSDBG, 8);       /* FEAT_Debugv8p2 */
+    t = FIELD_DP32(t, ID_DFR0, COPDBG, 9);        /* FEAT_Debugv8p4 */
+    t = FIELD_DP32(t, ID_DFR0, COPSDBG, 9);       /* FEAT_Debugv8p4 */
     t = FIELD_DP32(t, ID_DFR0, PERFMON, 5);       /* FEAT_PMUv3p4 */
     cpu->isar.id_dfr0 = t;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 48/60] target/arm: Add isar_feature_{aa64,any}_ras
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (46 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 47/60] target/arm: Enable FEAT_Debugv8p4 " Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 49/60] target/arm: Add minimal RAS registers Richard Henderson
                   ` (12 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Add the aa64 predicate for detecting RAS support from id registers.
We already have the aa32 version from the M-profile work.
Add the 'any' predicate for testing both aa64 and aa32.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 20bf70545e..d71edfc1c1 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3885,6 +3885,11 @@ static inline bool isar_feature_aa64_aa32_el1(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL1) >= 2;
 }
 
+static inline bool isar_feature_aa64_ras(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, RAS) != 0;
+}
+
 static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
@@ -4107,6 +4112,11 @@ static inline bool isar_feature_any_debugv8p2(const ARMISARegisters *id)
     return isar_feature_aa64_debugv8p2(id) || isar_feature_aa32_debugv8p2(id);
 }
 
+static inline bool isar_feature_any_ras(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_ras(id) || isar_feature_aa32_ras(id);
+}
+
 /*
  * Forward to the above feature tests given an ARMCPU pointer.
  */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 49/60] target/arm: Add minimal RAS registers
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (47 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 48/60] target/arm: Add isar_feature_{aa64,any}_ras Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 50/60] target/arm: Enable SCR and HCR bits for RAS Richard Henderson
                   ` (11 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Add only the system registers required to implement zero error
records.  This means we need to save state for ERRSELR, but all
values are out of range, so none of the indexed error record
registers need be implemented.

Add the EL2 registers required for injecting virtual SError.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Leave ERRSELR_EL1 undefined.
v3: Rely on EL3-no-EL2 squashing during registration.
---
 target/arm/cpu.h    |  5 +++
 target/arm/helper.c | 84 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index d71edfc1c1..a6d1923a78 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -524,6 +524,11 @@ typedef struct CPUArchState {
         uint64_t tfsr_el[4]; /* tfsre0_el1 is index 0.  */
         uint64_t gcr_el1;
         uint64_t rgsr_el1;
+
+        /* Minimal RAS registers */
+        uint64_t disr_el1;
+        uint64_t vdisr_el2;
+        uint64_t vsesr_el2;
     } cp15;
 
     struct {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3570212089..655beba3d6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5990,6 +5990,87 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
       .access = PL0_R, .type = ARM_CP_CONST|ARM_CP_64BIT, .resetvalue = 0 },
 };
 
+/*
+ * Check for traps to RAS registers, which are controlled
+ * by HCR_EL2.TERR and SCR_EL3.TERR.
+ */
+static CPAccessResult access_terr(CPUARMState *env, const ARMCPRegInfo *ri,
+                                  bool isread)
+{
+    int el = arm_current_el(env);
+
+    if (el < 2 && (arm_hcr_el2_eff(env) & HCR_TERR)) {
+        return CP_ACCESS_TRAP_EL2;
+    }
+    if (el < 3 && (env->cp15.scr_el3 & SCR_TERR)) {
+        return CP_ACCESS_TRAP_EL3;
+    }
+    return CP_ACCESS_OK;
+}
+
+static uint64_t disr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    int el = arm_current_el(env);
+
+    if (el < 2 && (arm_hcr_el2_eff(env) & HCR_AMO)) {
+        return env->cp15.vdisr_el2;
+    }
+    if (el < 3 && (env->cp15.scr_el3 & SCR_EA)) {
+        return 0; /* RAZ/WI */
+    }
+    return env->cp15.disr_el1;
+}
+
+static void disr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t val)
+{
+    int el = arm_current_el(env);
+
+    if (el < 2 && (arm_hcr_el2_eff(env) & HCR_AMO)) {
+        env->cp15.vdisr_el2 = val;
+        return;
+    }
+    if (el < 3 && (env->cp15.scr_el3 & SCR_EA)) {
+        return; /* RAZ/WI */
+    }
+    env->cp15.disr_el1 = val;
+}
+
+/*
+ * Minimal RAS implementation with no Error Records.
+ * Which means that all of the Error Record registers:
+ *   ERXADDR_EL1
+ *   ERXCTLR_EL1
+ *   ERXFR_EL1
+ *   ERXMISC0_EL1
+ *   ERXMISC1_EL1
+ *   ERXMISC2_EL1
+ *   ERXMISC3_EL1
+ *   ERXPFGCDN_EL1  (RASv1p1)
+ *   ERXPFGCTL_EL1  (RASv1p1)
+ *   ERXPFGF_EL1    (RASv1p1)
+ *   ERXSTATUS_EL1
+ * and
+ *   ERRSELR_EL1
+ * may generate UNDEFINED, which is the effect we get by not
+ * listing them at all.
+ */
+static const ARMCPRegInfo minimal_ras_reginfo[] = {
+    { .name = "DISR_EL1", .state = ARM_CP_STATE_BOTH,
+      .opc0 = 3, .opc1 = 0, .crn = 12, .crm = 1, .opc2 = 1,
+      .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.disr_el1),
+      .readfn = disr_read, .writefn = disr_write, .raw_writefn = raw_write },
+    { .name = "ERRIDR_EL1", .state = ARM_CP_STATE_BOTH,
+      .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 3, .opc2 = 0,
+      .access = PL1_R, .accessfn = access_terr,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "VDISR_EL2", .state = ARM_CP_STATE_BOTH,
+      .opc0 = 3, .opc1 = 4, .crn = 12, .crm = 1, .opc2 = 1,
+      .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.vdisr_el2) },
+    { .name = "VSESR_EL2", .state = ARM_CP_STATE_BOTH,
+      .opc0 = 3, .opc1 = 4, .crn = 5, .crm = 2, .opc2 = 3,
+      .access = PL2_RW, .fieldoffset = offsetof(CPUARMState, cp15.vsesr_el2) },
+};
+
 /* Return the exception level to which exceptions should be taken
  * via SVEAccessTrap.  If an exception should be routed through
  * AArch64.AdvSIMDFPAccessTrap, return 0; fp_exception_el should
@@ -8224,6 +8305,9 @@ void register_cp_regs_for_features(ARMCPU *cpu)
     if (cpu_isar_feature(aa64_ssbs, cpu)) {
         define_one_arm_cp_reg(cpu, &ssbs_reginfo);
     }
+    if (cpu_isar_feature(any_ras, cpu)) {
+        define_arm_cp_regs(cpu, minimal_ras_reginfo);
+    }
 
     if (cpu_isar_feature(aa64_vh, cpu) ||
         cpu_isar_feature(aa64_debugv8p2, cpu)) {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 50/60] target/arm: Enable SCR and HCR bits for RAS
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (48 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 49/60] target/arm: Add minimal RAS registers Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 51/60] target/arm: Implement virtual SError exceptions Richard Henderson
                   ` (10 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Enable writes to the TERR and TEA bits when RAS is enabled.
These bits are otherwise RES0.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 655beba3d6..f6468fed43 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1756,6 +1756,9 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         }
         valid_mask &= ~SCR_NET;
 
+        if (cpu_isar_feature(aa64_ras, cpu)) {
+            valid_mask |= SCR_TERR;
+        }
         if (cpu_isar_feature(aa64_lor, cpu)) {
             valid_mask |= SCR_TLOR;
         }
@@ -1770,6 +1773,9 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         }
     } else {
         valid_mask &= ~(SCR_RW | SCR_ST);
+        if (cpu_isar_feature(aa32_ras, cpu)) {
+            valid_mask |= SCR_TERR;
+        }
     }
 
     if (!arm_feature(env, ARM_FEATURE_EL2)) {
@@ -5126,6 +5132,9 @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
         if (cpu_isar_feature(aa64_vh, cpu)) {
             valid_mask |= HCR_E2H;
         }
+        if (cpu_isar_feature(aa64_ras, cpu)) {
+            valid_mask |= HCR_TERR | HCR_TEA;
+        }
         if (cpu_isar_feature(aa64_lor, cpu)) {
             valid_mask |= HCR_TLOR;
         }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 51/60] target/arm: Implement virtual SError exceptions
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (49 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 50/60] target/arm: Enable SCR and HCR bits for RAS Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 11:06   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 52/60] target/arm: Implement ESB instruction Richard Henderson
                   ` (9 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Virtual SError exceptions are raised by setting HCR_EL2.VSE,
and are routed to EL1 just like other virtual exceptions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Honor EAE for reporting VSERR to aa32.
---
 target/arm/cpu.h       |  2 ++
 target/arm/internals.h |  8 ++++++++
 target/arm/syndrome.h  |  5 +++++
 target/arm/cpu.c       | 38 +++++++++++++++++++++++++++++++++++++-
 target/arm/helper.c    | 40 +++++++++++++++++++++++++++++++++++++++-
 5 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a6d1923a78..b90b6d91bd 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -55,6 +55,7 @@
 #define EXCP_LSERR          21   /* v8M LSERR SecureFault */
 #define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
 #define EXCP_DIVBYZERO      23   /* v7M DIVBYZERO UsageFault */
+#define EXCP_VSERR          24
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
@@ -88,6 +89,7 @@ enum {
 #define CPU_INTERRUPT_FIQ   CPU_INTERRUPT_TGT_EXT_1
 #define CPU_INTERRUPT_VIRQ  CPU_INTERRUPT_TGT_EXT_2
 #define CPU_INTERRUPT_VFIQ  CPU_INTERRUPT_TGT_EXT_3
+#define CPU_INTERRUPT_VSERR CPU_INTERRUPT_TGT_INT_0
 
 /* The usual mapping for an AArch64 system register to its AArch32
  * counterpart is for the 32 bit world to have access to the lower
diff --git a/target/arm/internals.h b/target/arm/internals.h
index baa2a7e1f4..2e55c9a8ae 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -935,6 +935,14 @@ void arm_cpu_update_virq(ARMCPU *cpu);
  */
 void arm_cpu_update_vfiq(ARMCPU *cpu);
 
+/**
+ * arm_cpu_update_vserr: Update CPU_INTERRUPT_VSERR bit
+ *
+ * Update the CPU_INTERRUPT_VSERR bit in cs->interrupt_request,
+ * following a change to the HCR_EL2.VSE bit.
+ */
+void arm_cpu_update_vserr(ARMCPU *cpu);
+
 /**
  * arm_mmu_idx_el:
  * @env: The cpu environment
diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
index 8cde8e7243..0cb26dde7d 100644
--- a/target/arm/syndrome.h
+++ b/target/arm/syndrome.h
@@ -287,4 +287,9 @@ static inline uint32_t syn_pcalignment(void)
     return (EC_PCALIGNMENT << ARM_EL_EC_SHIFT) | ARM_EL_IL;
 }
 
+static inline uint32_t syn_serror(uint32_t extra)
+{
+    return (EC_SERROR << ARM_EL_EC_SHIFT) | ARM_EL_IL | extra;
+}
+
 #endif /* TARGET_ARM_SYNDROME_H */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index fc0d74b4d1..20ae69e83b 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -85,7 +85,7 @@ static bool arm_cpu_has_work(CPUState *cs)
     return (cpu->power_state != PSCI_OFF)
         && cs->interrupt_request &
         (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
-         | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
+         | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ | CPU_INTERRUPT_VSERR
          | CPU_INTERRUPT_EXITTB);
 }
 
@@ -509,6 +509,12 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
             return false;
         }
         return !(env->daif & PSTATE_I);
+    case EXCP_VSERR:
+        if (!(hcr_el2 & HCR_AMO) || (hcr_el2 & HCR_TGE)) {
+            /* VIRQs are only taken when hypervized.  */
+            return false;
+        }
+        return !(env->daif & PSTATE_A);
     default:
         g_assert_not_reached();
     }
@@ -630,6 +636,17 @@ static bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
             goto found;
         }
     }
+    if (interrupt_request & CPU_INTERRUPT_VSERR) {
+        excp_idx = EXCP_VSERR;
+        target_el = 1;
+        if (arm_excp_unmasked(cs, excp_idx, target_el,
+                              cur_el, secure, hcr_el2)) {
+            /* Taking a virtual abort clears HCR_EL2.VSE */
+            env->cp15.hcr_el2 &= ~HCR_VSE;
+            cpu_reset_interrupt(cs, CPU_INTERRUPT_VSERR);
+            goto found;
+        }
+    }
     return false;
 
  found:
@@ -682,6 +699,25 @@ void arm_cpu_update_vfiq(ARMCPU *cpu)
     }
 }
 
+void arm_cpu_update_vserr(ARMCPU *cpu)
+{
+    /*
+     * Update the interrupt level for VSERR, which is the HCR_EL2.VSE bit.
+     */
+    CPUARMState *env = &cpu->env;
+    CPUState *cs = CPU(cpu);
+
+    bool new_state = env->cp15.hcr_el2 & HCR_VSE;
+
+    if (new_state != ((cs->interrupt_request & CPU_INTERRUPT_VSERR) != 0)) {
+        if (new_state) {
+            cpu_interrupt(cs, CPU_INTERRUPT_VSERR);
+        } else {
+            cpu_reset_interrupt(cs, CPU_INTERRUPT_VSERR);
+        }
+    }
+}
+
 #ifndef CONFIG_USER_ONLY
 static void arm_cpu_set_irq(void *opaque, int irq, int level)
 {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index f6468fed43..7e4178c594 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1864,7 +1864,12 @@ static uint64_t isr_read(CPUARMState *env, const ARMCPRegInfo *ri)
         }
     }
 
-    /* External aborts are not possible in QEMU so A bit is always clear */
+    if (hcr_el2 & HCR_AMO) {
+        if (cs->interrupt_request & CPU_INTERRUPT_VSERR) {
+            ret |= CPSR_A;
+        }
+    }
+
     return ret;
 }
 
@@ -5175,6 +5180,7 @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
     g_assert(qemu_mutex_iothread_locked());
     arm_cpu_update_virq(cpu);
     arm_cpu_update_vfiq(cpu);
+    arm_cpu_update_vserr(cpu);
 }
 
 static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
@@ -9325,6 +9331,7 @@ void arm_log_exception(CPUState *cs)
             [EXCP_LSERR] = "v8M LSERR UsageFault",
             [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
             [EXCP_DIVBYZERO] = "v7M DIVBYZERO UsageFault",
+            [EXCP_VSERR] = "Virtual SERR",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -9837,6 +9844,31 @@ static void arm_cpu_do_interrupt_aarch32(CPUState *cs)
         mask = CPSR_A | CPSR_I | CPSR_F;
         offset = 4;
         break;
+    case EXCP_VSERR:
+        {
+            /*
+             * Note that this is reported as a data abort, but the DFAR
+             * has an UNKNOWN value.  Construct the SError syndrome from
+             * AET and ExT fields.
+             */
+            ARMMMUFaultInfo fi = { .type = ARMFault_AsyncExternal, };
+
+            if (extended_addresses_enabled(env)) {
+                env->exception.fsr = arm_fi_to_lfsc(&fi);
+            } else {
+                env->exception.fsr = arm_fi_to_sfsc(&fi);
+            }
+            env->exception.fsr |= env->cp15.vsesr_el2 & 0xd000;
+            A32_BANKED_CURRENT_REG_SET(env, dfsr, env->exception.fsr);
+            qemu_log_mask(CPU_LOG_INT, "...with IFSR 0x%x\n",
+                          env->exception.fsr);
+
+            new_mode = ARM_CPU_MODE_ABT;
+            addr = 0x10;
+            mask = CPSR_A | CPSR_I;
+            offset = 8;
+        }
+        break;
     case EXCP_SMC:
         new_mode = ARM_CPU_MODE_MON;
         addr = 0x08;
@@ -10057,6 +10089,12 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     case EXCP_VFIQ:
         addr += 0x100;
         break;
+    case EXCP_VSERR:
+        addr += 0x180;
+        /* Construct the SError syndrome from IDS and ISS fields. */
+        env->exception.syndrome = syn_serror(env->cp15.vsesr_el2 & 0x1ffffff);
+        env->cp15.esr_el[new_el] = env->exception.syndrome;
+        break;
     default:
         cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 52/60] target/arm: Implement ESB instruction
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (50 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 51/60] target/arm: Implement virtual SError exceptions Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 53/60] target/arm: Enable FEAT_RAS for -cpu max Richard Henderson
                   ` (8 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Check for and defer any pending virtual SError.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Retain m-profile check; improve comments.
---
 target/arm/helper.h        |  1 +
 target/arm/a32.decode      | 16 ++++++++------
 target/arm/t32.decode      | 18 ++++++++--------
 target/arm/op_helper.c     | 43 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c | 17 +++++++++++++++
 target/arm/translate.c     | 23 ++++++++++++++++++++
 6 files changed, 103 insertions(+), 15 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index b463d9343b..b1334e0c42 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -54,6 +54,7 @@ DEF_HELPER_1(wfe, void, env)
 DEF_HELPER_1(yield, void, env)
 DEF_HELPER_1(pre_hvc, void, env)
 DEF_HELPER_2(pre_smc, void, env, i32)
+DEF_HELPER_1(vesb, void, env)
 
 DEF_HELPER_3(cpsr_write, void, env, i32, i32)
 DEF_HELPER_2(cpsr_write_eret, void, env, i32)
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index fcd8cd4f7d..f2ca480949 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -187,13 +187,17 @@ SMULTT           .... 0001 0110 .... 0000 .... 1110 ....      @rd0mn
 
 {
   {
-    YIELD        ---- 0011 0010 0000 1111 ---- 0000 0001
-    WFE          ---- 0011 0010 0000 1111 ---- 0000 0010
-    WFI          ---- 0011 0010 0000 1111 ---- 0000 0011
+    [
+      YIELD      ---- 0011 0010 0000 1111 ---- 0000 0001
+      WFE        ---- 0011 0010 0000 1111 ---- 0000 0010
+      WFI        ---- 0011 0010 0000 1111 ---- 0000 0011
 
-    # TODO: Implement SEV, SEVL; may help SMP performance.
-    # SEV        ---- 0011 0010 0000 1111 ---- 0000 0100
-    # SEVL       ---- 0011 0010 0000 1111 ---- 0000 0101
+      # TODO: Implement SEV, SEVL; may help SMP performance.
+      # SEV      ---- 0011 0010 0000 1111 ---- 0000 0100
+      # SEVL     ---- 0011 0010 0000 1111 ---- 0000 0101
+
+      ESB        ---- 0011 0010 0000 1111 ---- 0001 0000
+    ]
 
     # The canonical nop ends in 00000000, but the whole of the
     # rest of the space executes as nop if otherwise unsupported.
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 78fadef9d6..f21ad0167a 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -364,17 +364,17 @@ CLZ              1111 1010 1011 ---- 1111 .... 1000 ....      @rdm
   [
     # Hints, and CPS
     {
-      YIELD      1111 0011 1010 1111 1000 0000 0000 0001
-      WFE        1111 0011 1010 1111 1000 0000 0000 0010
-      WFI        1111 0011 1010 1111 1000 0000 0000 0011
+      [
+        YIELD    1111 0011 1010 1111 1000 0000 0000 0001
+        WFE      1111 0011 1010 1111 1000 0000 0000 0010
+        WFI      1111 0011 1010 1111 1000 0000 0000 0011
 
-      # TODO: Implement SEV, SEVL; may help SMP performance.
-      # SEV      1111 0011 1010 1111 1000 0000 0000 0100
-      # SEVL     1111 0011 1010 1111 1000 0000 0000 0101
+        # TODO: Implement SEV, SEVL; may help SMP performance.
+        # SEV    1111 0011 1010 1111 1000 0000 0000 0100
+        # SEVL   1111 0011 1010 1111 1000 0000 0000 0101
 
-      # For M-profile minimal-RAS ESB can be a NOP, which is the
-      # default behaviour since it is in the hint space.
-      # ESB      1111 0011 1010 1111 1000 0000 0001 0000
+        ESB      1111 0011 1010 1111 1000 0000 0001 0000
+      ]
 
       # The canonical nop ends in 0000 0000, but the whole rest
       # of the space is "reserved hint, behaves as nop".
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 76499ffa14..390b6578a8 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -960,3 +960,46 @@ void HELPER(probe_access)(CPUARMState *env, target_ulong ptr,
                      access_type, mmu_idx, ra);
     }
 }
+
+/*
+ * This function corresponds to AArch64.vESBOperation().
+ * Note that the AArch32 version is not functionally different.
+ */
+void HELPER(vesb)(CPUARMState *env)
+{
+    /*
+     * The EL2Enabled() check is done inside arm_hcr_el2_eff,
+     * and will return HCR_EL2.VSE == 0, so nothing happens.
+     */
+    uint64_t hcr = arm_hcr_el2_eff(env);
+    bool enabled = !(hcr & HCR_TGE) && (hcr & HCR_AMO);
+    bool pending = enabled && (hcr & HCR_VSE);
+    bool masked  = (env->daif & PSTATE_A);
+
+    /* If VSE pending and masked, defer the exception.  */
+    if (pending && masked) {
+        uint32_t syndrome;
+
+        if (arm_el_is_aa64(env, 1)) {
+            /* Copy across IDS and ISS from VSESR. */
+            syndrome = env->cp15.vsesr_el2 & 0x1ffffff;
+        } else {
+            ARMMMUFaultInfo fi = { .type = ARMFault_AsyncExternal };
+
+            if (extended_addresses_enabled(env)) {
+                syndrome = arm_fi_to_lfsc(&fi);
+            } else {
+                syndrome = arm_fi_to_sfsc(&fi);
+            }
+            /* Copy across AET and ExT from VSESR. */
+            syndrome |= env->cp15.vsesr_el2 & 0xd000;
+        }
+
+        /* Set VDISR_EL2.A along with the syndrome. */
+        env->cp15.vdisr_el2 = syndrome | (1u << 31);
+
+        /* Clear pending virtual SError */
+        env->cp15.hcr_el2 &= ~HCR_VSE;
+        cpu_reset_interrupt(env_cpu(env), CPU_INTERRUPT_VSERR);
+    }
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 98dbc8203f..fc0b3ebf44 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1454,6 +1454,23 @@ static void handle_hint(DisasContext *s, uint32_t insn,
             gen_helper_autib(cpu_X[17], cpu_env, cpu_X[17], cpu_X[16]);
         }
         break;
+    case 0b10000: /* ESB */
+        /* Without RAS, we must implement this as NOP. */
+        if (dc_isar_feature(aa64_ras, s)) {
+            /*
+             * QEMU does not have a source of physical SErrors,
+             * so we are only concerned with virtual SErrors.
+             * The pseudocode in the ARM for this case is
+             *   if PSTATE.EL IN {EL0, EL1} && EL2Enabled() then
+             *      AArch64.vESBOperation();
+             * Most of the condition can be evaluated at translation time.
+             * Test for EL2 present, and defer test for SEL2 to runtime.
+             */
+            if (s->current_el <= 1 && arm_dc_feature(s, ARM_FEATURE_EL2)) {
+                gen_helper_vesb(cpu_env);
+            }
+        }
+        break;
     case 0b11000: /* PACIAZ */
         if (s->pauth_active) {
             gen_helper_pacia(cpu_X[30], cpu_env, cpu_X[30],
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 9370b44707..fef7ccea5c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -6236,6 +6236,29 @@ static bool trans_WFI(DisasContext *s, arg_WFI *a)
     return true;
 }
 
+static bool trans_ESB(DisasContext *s, arg_ESB *a)
+{
+    /*
+     * For M-profile, minimal-RAS ESB can be a NOP.
+     * Without RAS, we must implement this as NOP.
+     */
+    if (!arm_dc_feature(s, ARM_FEATURE_M) && dc_isar_feature(aa32_ras, s)) {
+        /*
+         * QEMU does not have a source of physical SErrors,
+         * so we are only concerned with virtual SErrors.
+         * The pseudocode in the ARM for this case is
+         *   if PSTATE.EL IN {EL0, EL1} && EL2Enabled() then
+         *      AArch32.vESBOperation();
+         * Most of the condition can be evaluated at translation time.
+         * Test for EL2 present, and defer test for SEL2 to runtime.
+         */
+        if (s->current_el <= 1 && arm_dc_feature(s, ARM_FEATURE_EL2)) {
+            gen_helper_vesb(cpu_env);
+        }
+    }
+    return true;
+}
+
 static bool trans_NOP(DisasContext *s, arg_NOP *a)
 {
     return true;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 53/60] target/arm: Enable FEAT_RAS for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (51 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 52/60] target/arm: Implement ESB instruction Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 54/60] target/arm: Enable FEAT_IESB " Richard Henderson
                   ` (7 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 1 +
 target/arm/cpu_tcg.c          | 1 +
 3 files changed, 3 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index c89c344de1..35b6f7d4de 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -41,6 +41,7 @@ the following architecture extensions:
 - FEAT_PMULL (PMULL, PMULL2 instructions)
 - FEAT_PMUv3p1 (PMU Extensions v3.1)
 - FEAT_PMUv3p4 (PMU Extensions v3.4)
+- FEAT_RAS (Reliability, availability, and serviceability)
 - FEAT_RDM (Advanced SIMD rounding double multiply accumulate instructions)
 - FEAT_RNG (Random number generator)
 - FEAT_SB (Speculation Barrier)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 136590382a..72fe7885f0 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -744,6 +744,7 @@ static void aarch64_max_initfn(Object *obj)
     t = cpu->isar.id_aa64pfr0;
     t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);        /* FEAT_FP16 */
     t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);   /* FEAT_FP16 */
+    t = FIELD_DP64(t, ID_AA64PFR0, RAS, 1);       /* FEAT_RAS */
     t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
     t = FIELD_DP64(t, ID_AA64PFR0, SEL2, 1);      /* FEAT_SEL2 */
     t = FIELD_DP64(t, ID_AA64PFR0, DIT, 1);       /* FEAT_DIT */
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 337598e949..c5cf7efe95 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -69,6 +69,7 @@ void aa32_max_features(ARMCPU *cpu)
 
     t = cpu->isar.id_pfr0;
     t = FIELD_DP32(t, ID_PFR0, DIT, 1);           /* FEAT_DIT */
+    t = FIELD_DP32(t, ID_PFR0, RAS, 1);           /* FEAT_RAS */
     cpu->isar.id_pfr0 = t;
 
     t = cpu->isar.id_pfr2;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 54/60] target/arm: Enable FEAT_IESB for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (52 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 53/60] target/arm: Enable FEAT_RAS for -cpu max Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 55/60] target/arm: Enable FEAT_CSV2 " Richard Henderson
                   ` (6 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

This feature is AArch64 only, and applies to physical SErrors,
which QEMU does not implement, thus the feature is a nop.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 1 +
 2 files changed, 2 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 35b6f7d4de..ebd9e418cc 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -24,6 +24,7 @@ the following architecture extensions:
 - FEAT_FlagM2 (Enhancements to flag manipulation instructions)
 - FEAT_HPDS (Hierarchical permission disables)
 - FEAT_I8MM (AArch64 Int8 matrix multiplication instructions)
+- FEAT_IESB (Implicit error synchronization event)
 - FEAT_JSCVT (JavaScript conversion instructions)
 - FEAT_LOR (Limited ordering regions)
 - FEAT_LPA (Large Physical Address space)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 72fe7885f0..03fbb34e4e 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -781,6 +781,7 @@ static void aarch64_max_initfn(Object *obj)
     t = cpu->isar.id_aa64mmfr2;
     t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1);      /* FEAT_TTCNP */
     t = FIELD_DP64(t, ID_AA64MMFR2, UAO, 1);      /* FEAT_UAO */
+    t = FIELD_DP64(t, ID_AA64MMFR2, IESB, 1);     /* FEAT_IESB */
     t = FIELD_DP64(t, ID_AA64MMFR2, VARANGE, 1);  /* FEAT_LVA */
     t = FIELD_DP64(t, ID_AA64MMFR2, ST, 1);       /* FEAT_TTST */
     cpu->isar.id_aa64mmfr2 = t;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 55/60] target/arm: Enable FEAT_CSV2 for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (53 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 54/60] target/arm: Enable FEAT_IESB " Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 56/60] target/arm: Enable FEAT_CSV2_2 " Richard Henderson
                   ` (5 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

This extension concerns branch speculation, which TCG does
not implement.  Thus we can trivially enable this feature.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 1 +
 target/arm/cpu_tcg.c          | 1 +
 3 files changed, 3 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index ebd9e418cc..91fb06c579 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -11,6 +11,7 @@ the following architecture extensions:
 - FEAT_AES (AESD and AESE instructions)
 - FEAT_BF16 (AArch64 BFloat16 instructions)
 - FEAT_BTI (Branch Target Identification)
+- FEAT_CSV2 (Cache speculation variant 2)
 - FEAT_DIT (Data Independent Timing instructions)
 - FEAT_DPB (DC CVAP instruction)
 - FEAT_Debugv8p2 (Debug changes for v8.2)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 03fbb34e4e..b62656f3c3 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -748,6 +748,7 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
     t = FIELD_DP64(t, ID_AA64PFR0, SEL2, 1);      /* FEAT_SEL2 */
     t = FIELD_DP64(t, ID_AA64PFR0, DIT, 1);       /* FEAT_DIT */
+    t = FIELD_DP64(t, ID_AA64PFR0, CSV2, 1);      /* FEAT_CSV2 */
     cpu->isar.id_aa64pfr0 = t;
 
     t = cpu->isar.id_aa64pfr1;
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index c5cf7efe95..762b961707 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -68,6 +68,7 @@ void aa32_max_features(ARMCPU *cpu)
     cpu->isar.id_mmfr4 = t;
 
     t = cpu->isar.id_pfr0;
+    t = FIELD_DP32(t, ID_PFR0, CSV2, 2);          /* FEAT_CVS2 */
     t = FIELD_DP32(t, ID_PFR0, DIT, 1);           /* FEAT_DIT */
     t = FIELD_DP32(t, ID_PFR0, RAS, 1);           /* FEAT_RAS */
     cpu->isar.id_pfr0 = t;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 56/60] target/arm: Enable FEAT_CSV2_2 for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (54 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 55/60] target/arm: Enable FEAT_CSV2 " Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-29  9:52   ` Damien Hedde
  2022-04-17 17:44 ` [PATCH v3 57/60] target/arm: Enable FEAT_CSV3 " Richard Henderson
                   ` (4 subsequent siblings)
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

There is no branch prediction in TCG, therefore there is no
need to actually include the context number into the predictor.
Therefore all we need to do is add the state for SCXTNUM_ELx.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst; clear CSV2_FRAC; use decimal; tidy access_scxtnum.
v3: Rely on EL3-no-EL2 squashing during registration.
---
 docs/system/arm/emulation.rst |  3 ++
 target/arm/cpu.h              | 16 +++++++++
 target/arm/cpu64.c            |  3 +-
 target/arm/helper.c           | 66 ++++++++++++++++++++++++++++++++++-
 4 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 91fb06c579..aebe3be1ba 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -12,6 +12,9 @@ the following architecture extensions:
 - FEAT_BF16 (AArch64 BFloat16 instructions)
 - FEAT_BTI (Branch Target Identification)
 - FEAT_CSV2 (Cache speculation variant 2)
+- FEAT_CSV2_1p1 (Cache speculation variant 2, version 1.1)
+- FEAT_CSV2_1p2 (Cache speculation variant 2, version 1.2)
+- FEAT_CSV2_2 (Cache speculation variant 2, version 2)
 - FEAT_DIT (Data Independent Timing instructions)
 - FEAT_DPB (DC CVAP instruction)
 - FEAT_Debugv8p2 (Debug changes for v8.2)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index b90b6d91bd..a7582da7c2 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -687,6 +687,8 @@ typedef struct CPUArchState {
         ARMPACKey apdb;
         ARMPACKey apga;
     } keys;
+
+    uint64_t scxtnum_el[4];
 #endif
 
 #if defined(CONFIG_USER_ONLY)
@@ -1210,6 +1212,7 @@ void pmu_init(ARMCPU *cpu);
 #define SCTLR_WXN     (1U << 19)
 #define SCTLR_ST      (1U << 20) /* up to ??, RAZ in v6 */
 #define SCTLR_UWXN    (1U << 20) /* v7 onward, AArch32 only */
+#define SCTLR_TSCXT   (1U << 20) /* FEAT_CSV2_1p2, AArch64 only */
 #define SCTLR_FI      (1U << 21) /* up to v7, v8 RES0 */
 #define SCTLR_IESB    (1U << 21) /* v8.2-IESB, AArch64 only */
 #define SCTLR_U       (1U << 22) /* up to v6, RAO in v7 */
@@ -4021,6 +4024,19 @@ static inline bool isar_feature_aa64_dit(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, DIT) != 0;
 }
 
+static inline bool isar_feature_aa64_scxtnum(const ARMISARegisters *id)
+{
+    int key = FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, CSV2);
+    if (key >= 2) {
+        return true;      /* FEAT_CSV2_2 */
+    }
+    if (key == 1) {
+        key = FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, CSV2_FRAC);
+        return key >= 2;  /* FEAT_CSV2_1p2 */
+    }
+    return false;
+}
+
 static inline bool isar_feature_aa64_ssbs(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, SSBS) != 0;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index b62656f3c3..6ccbcb857d 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -748,7 +748,7 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1);
     t = FIELD_DP64(t, ID_AA64PFR0, SEL2, 1);      /* FEAT_SEL2 */
     t = FIELD_DP64(t, ID_AA64PFR0, DIT, 1);       /* FEAT_DIT */
-    t = FIELD_DP64(t, ID_AA64PFR0, CSV2, 1);      /* FEAT_CSV2 */
+    t = FIELD_DP64(t, ID_AA64PFR0, CSV2, 2);      /* FEAT_CSV2_2 */
     cpu->isar.id_aa64pfr0 = t;
 
     t = cpu->isar.id_aa64pfr1;
@@ -760,6 +760,7 @@ static void aarch64_max_initfn(Object *obj)
      * we do for EL2 with the virtualization=on property.
      */
     t = FIELD_DP64(t, ID_AA64PFR1, MTE, 3);       /* FEAT_MTE3 */
+    t = FIELD_DP64(t, ID_AA64PFR1, CSV2_FRAC, 0); /* FEAT_CSV2_2 */
     cpu->isar.id_aa64pfr1 = t;
 
     t = cpu->isar.id_aa64mmfr0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7e4178c594..22728b9fbf 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1771,6 +1771,9 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
         if (cpu_isar_feature(aa64_mte, cpu)) {
             valid_mask |= SCR_ATA;
         }
+        if (cpu_isar_feature(aa64_scxtnum, cpu)) {
+            valid_mask |= SCR_ENSCXT;
+        }
     } else {
         valid_mask &= ~(SCR_RW | SCR_ST);
         if (cpu_isar_feature(aa32_ras, cpu)) {
@@ -5149,6 +5152,9 @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
         if (cpu_isar_feature(aa64_mte, cpu)) {
             valid_mask |= HCR_ATA | HCR_DCT | HCR_TID5;
         }
+        if (cpu_isar_feature(aa64_scxtnum, cpu)) {
+            valid_mask |= HCR_ENSCXT;
+        }
     }
 
     /* Clear RES0 bits.  */
@@ -5798,6 +5804,10 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
         { K(3, 0,  5, 6, 0), K(3, 4,  5, 6, 0), K(3, 5, 5, 6, 0),
           "TFSR_EL1", "TFSR_EL2", "TFSR_EL12", isar_feature_aa64_mte },
 
+        { K(3, 0, 13, 0, 7), K(3, 4, 13, 0, 7), K(3, 5, 13, 0, 7),
+          "SCXTNUM_EL1", "SCXTNUM_EL2", "SCXTNUM_EL12",
+          isar_feature_aa64_scxtnum },
+
         /* TODO: ARMv8.2-SPE -- PMSCR_EL2 */
         /* TODO: ARMv8.4-Trace -- TRFCR_EL2 */
     };
@@ -7233,7 +7243,57 @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
     },
 };
 
-#endif
+static CPAccessResult access_scxtnum(CPUARMState *env, const ARMCPRegInfo *ri,
+                                     bool isread)
+{
+    int el = arm_current_el(env);
+
+    if (el == 0) {
+        uint64_t hcr = arm_hcr_el2_eff(env);
+        if ((hcr & (HCR_TGE | HCR_E2H)) != (HCR_TGE | HCR_E2H)) {
+            if (env->cp15.sctlr_el[1] & SCTLR_TSCXT) {
+                if (hcr & HCR_TGE) {
+                    return CP_ACCESS_TRAP_EL2;
+                }
+                return CP_ACCESS_TRAP;
+            }
+            if (arm_is_el2_enabled(env) && !(hcr & HCR_ENSCXT)) {
+                return CP_ACCESS_TRAP_EL2;
+            }
+            goto no_sctlr_el2;
+        }
+    }
+    if (el < 2 && (env->cp15.sctlr_el[2] & SCTLR_TSCXT)) {
+        return CP_ACCESS_TRAP_EL2;
+    }
+ no_sctlr_el2:
+    if (el < 3
+        && arm_feature(env, ARM_FEATURE_EL3)
+        && !(env->cp15.scr_el3 & SCR_ENSCXT)) {
+        return CP_ACCESS_TRAP_EL3;
+    }
+    return CP_ACCESS_OK;
+}
+
+static const ARMCPRegInfo scxtnum_reginfo[] = {
+    { .name = "SCXTNUM_EL0", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 3, .crn = 13, .crm = 0, .opc2 = 7,
+      .access = PL0_RW, .accessfn = access_scxtnum,
+      .fieldoffset = offsetof(CPUARMState, scxtnum_el[0]) },
+    { .name = "SCXTNUM_EL1", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 7,
+      .access = PL1_RW, .accessfn = access_scxtnum,
+      .fieldoffset = offsetof(CPUARMState, scxtnum_el[1]) },
+    { .name = "SCXTNUM_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 7,
+      .access = PL2_RW, .accessfn = access_scxtnum,
+      .fieldoffset = offsetof(CPUARMState, scxtnum_el[2]) },
+    { .name = "SCXTNUM_EL3", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 6, .crn = 13, .crm = 0, .opc2 = 7,
+      .access = PL3_RW,
+      .fieldoffset = offsetof(CPUARMState, scxtnum_el[3]) },
+};
+#endif /* TARGET_AARCH64 */
 
 static CPAccessResult access_predinv(CPUARMState *env, const ARMCPRegInfo *ri,
                                      bool isread)
@@ -8372,6 +8432,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, mte_tco_ro_reginfo);
         define_arm_cp_regs(cpu, mte_el0_cacheop_reginfo);
     }
+
+    if (cpu_isar_feature(aa64_scxtnum, cpu)) {
+        define_arm_cp_regs(cpu, scxtnum_reginfo);
+    }
 #endif
 
     if (cpu_isar_feature(any_predinv, cpu)) {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 57/60] target/arm: Enable FEAT_CSV3 for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (55 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 56/60] target/arm: Enable FEAT_CSV2_2 " Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 58/60] target/arm: Enable FEAT_DGH " Richard Henderson
                   ` (3 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

This extension concerns cache speculation, which TCG does
not implement.  Thus we can trivially enable this feature.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 1 +
 target/arm/cpu_tcg.c          | 1 +
 3 files changed, 3 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index aebe3be1ba..f75f0fc110 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -15,6 +15,7 @@ the following architecture extensions:
 - FEAT_CSV2_1p1 (Cache speculation variant 2, version 1.1)
 - FEAT_CSV2_1p2 (Cache speculation variant 2, version 1.2)
 - FEAT_CSV2_2 (Cache speculation variant 2, version 2)
+- FEAT_CSV3 (Cache speculation variant 3)
 - FEAT_DIT (Data Independent Timing instructions)
 - FEAT_DPB (DC CVAP instruction)
 - FEAT_Debugv8p2 (Debug changes for v8.2)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 6ccbcb857d..6139f51267 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -749,6 +749,7 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64PFR0, SEL2, 1);      /* FEAT_SEL2 */
     t = FIELD_DP64(t, ID_AA64PFR0, DIT, 1);       /* FEAT_DIT */
     t = FIELD_DP64(t, ID_AA64PFR0, CSV2, 2);      /* FEAT_CSV2_2 */
+    t = FIELD_DP64(t, ID_AA64PFR0, CSV3, 1);      /* FEAT_CSV3 */
     cpu->isar.id_aa64pfr0 = t;
 
     t = cpu->isar.id_aa64pfr1;
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 762b961707..ea4eccddc3 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -74,6 +74,7 @@ void aa32_max_features(ARMCPU *cpu)
     cpu->isar.id_pfr0 = t;
 
     t = cpu->isar.id_pfr2;
+    t = FIELD_DP32(t, ID_PFR2, CSV3, 1);          /* FEAT_CSV3 */
     t = FIELD_DP32(t, ID_PFR2, SSBS, 1);          /* FEAT_SSBS */
     cpu->isar.id_pfr2 = t;
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 58/60] target/arm: Enable FEAT_DGH for -cpu max
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (56 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 57/60] target/arm: Enable FEAT_CSV3 " Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-17 17:44 ` [PATCH v3 59/60] target/arm: Define cortex-a76 Richard Henderson
                   ` (2 subsequent siblings)
  60 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-arm

This extension concerns not merging memory access, which TCG does
not implement.  Thus we can trivially enable this feature.
Add a comment to handle_hint for the DGH instruction, but no code.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update emulation.rst
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/cpu64.c            | 1 +
 target/arm/translate-a64.c    | 1 +
 3 files changed, 3 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index f75f0fc110..bc9cdda75a 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -16,6 +16,7 @@ the following architecture extensions:
 - FEAT_CSV2_1p2 (Cache speculation variant 2, version 1.2)
 - FEAT_CSV2_2 (Cache speculation variant 2, version 2)
 - FEAT_CSV3 (Cache speculation variant 3)
+- FEAT_DGH (Data gathering hint)
 - FEAT_DIT (Data Independent Timing instructions)
 - FEAT_DPB (DC CVAP instruction)
 - FEAT_Debugv8p2 (Debug changes for v8.2)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 6139f51267..336a941acd 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -738,6 +738,7 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64ISAR1, SB, 1);       /* FEAT_SB */
     t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1);  /* FEAT_SPECRES */
     t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1);     /* FEAT_BF16 */
+    t = FIELD_DP64(t, ID_AA64ISAR1, DGH, 1);      /* FEAT_DGH */
     t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1);     /* FEAT_I8MM */
     cpu->isar.id_aa64isar1 = t;
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index fc0b3ebf44..b44ab3ecf3 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1427,6 +1427,7 @@ static void handle_hint(DisasContext *s, uint32_t insn,
         break;
     case 0b00100: /* SEV */
     case 0b00101: /* SEVL */
+    case 0b00110: /* DGH */
         /* we treat all as NOP at least for now */
         break;
     case 0b00111: /* XPACLRI */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 59/60] target/arm: Define cortex-a76
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (57 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 58/60] target/arm: Enable FEAT_DGH " Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 11:08   ` Peter Maydell
  2022-04-17 17:44 ` [PATCH v3 60/60] target/arm: Define neoverse-n1 Richard Henderson
  2022-04-22  9:01 ` [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Peter Maydell
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Enable the a76 for virt and sbsa board use.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 docs/system/arm/virt.rst |  1 +
 hw/arm/sbsa-ref.c        |  1 +
 hw/arm/virt.c            |  1 +
 target/arm/cpu64.c       | 66 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 69 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index 1544632b67..e9ff81aa21 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -55,6 +55,7 @@ Supported guest CPU types:
 - ``cortex-a53`` (64-bit)
 - ``cortex-a57`` (64-bit)
 - ``cortex-a72`` (64-bit)
+- ``cortex-a76`` (64-bit)
 - ``a64fx`` (64-bit)
 - ``host`` (with KVM only)
 - ``max`` (same as ``host`` for KVM; best possible emulation with TCG)
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 2387401963..2ddde88f5e 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -145,6 +145,7 @@ static const int sbsa_ref_irqmap[] = {
 static const char * const valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a57"),
     ARM_CPU_TYPE_NAME("cortex-a72"),
+    ARM_CPU_TYPE_NAME("cortex-a76"),
     ARM_CPU_TYPE_NAME("max"),
 };
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d2e5ecd234..ce15c36a7f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -202,6 +202,7 @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a53"),
     ARM_CPU_TYPE_NAME("cortex-a57"),
     ARM_CPU_TYPE_NAME("cortex-a72"),
+    ARM_CPU_TYPE_NAME("cortex-a76"),
     ARM_CPU_TYPE_NAME("a64fx"),
     ARM_CPU_TYPE_NAME("host"),
     ARM_CPU_TYPE_NAME("max"),
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 336a941acd..d046351991 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -194,6 +194,71 @@ static void aarch64_a72_initfn(Object *obj)
     define_cortex_a72_a57_a53_cp_reginfo(cpu);
 }
 
+static void aarch64_a76_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,cortex-a76";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by B2.4 AArch64 registers by functional group */
+    cpu->clidr = 0x82000023;
+    cpu->ctr = 0x8444C004;
+    cpu->dcz_blocksize = 4;
+    cpu->isar.id_aa64dfr0  = 0x0000000010305408ull;
+    cpu->isar.id_aa64isar0 = 0x0000100010211120ull;
+    cpu->isar.id_aa64isar1 = 0x0000000000100001ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000000000101122ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x0000000000001011ull;
+    cpu->isar.id_aa64pfr0  = 0x1100000010111112ull; /* GIC filled in later */
+    cpu->isar.id_aa64pfr1  = 0x0000000000000010ull;
+    cpu->id_afr0       = 0x00000000;
+    cpu->isar.id_dfr0  = 0x04010088;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00010142;
+    cpu->isar.id_isar5 = 0x01011121;
+    cpu->isar.id_isar6 = 0x00000010;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_mmfr4 = 0x00021110;
+    cpu->isar.id_pfr0  = 0x10010131;
+    cpu->isar.id_pfr1  = 0x00010000; /* GIC filled in later */
+    cpu->isar.id_pfr2  = 0x00000011;
+    cpu->midr = 0x414fd0b1;          /* r4p1 */
+    cpu->revidr = 0;
+
+    /* From B2.18 CCSIDR_EL1 */
+    cpu->ccsidr[0] = 0x701fe01a; /* 64KB L1 dcache */
+    cpu->ccsidr[1] = 0x201fe01a; /* 64KB L1 icache */
+    cpu->ccsidr[2] = 0x707fe03a; /* 512KB L2 cache */
+
+    /* From B2.93 SCTLR_EL3 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    /* From B4.23 ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+
+    /* From B5.1 AdvSIMD AArch64 register summary */
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+}
+
 void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
 {
     /*
@@ -879,6 +944,7 @@ static const ARMCPUInfo aarch64_cpus[] = {
     { .name = "cortex-a57",         .initfn = aarch64_a57_initfn },
     { .name = "cortex-a53",         .initfn = aarch64_a53_initfn },
     { .name = "cortex-a72",         .initfn = aarch64_a72_initfn },
+    { .name = "cortex-a76",         .initfn = aarch64_a76_initfn },
     { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },
     { .name = "max",                .initfn = aarch64_max_initfn },
 #if defined(CONFIG_KVM) || defined(CONFIG_HVF)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* [PATCH v3 60/60] target/arm: Define neoverse-n1
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (58 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 59/60] target/arm: Define cortex-a76 Richard Henderson
@ 2022-04-17 17:44 ` Richard Henderson
  2022-04-22 11:08   ` Peter Maydell
  2022-04-22  9:01 ` [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Peter Maydell
  60 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-17 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Enable the n1 for virt and sbsa board use.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 docs/system/arm/virt.rst |  1 +
 hw/arm/sbsa-ref.c        |  1 +
 hw/arm/virt.c            |  1 +
 target/arm/cpu64.c       | 66 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 69 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index e9ff81aa21..e8e851a15b 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -58,6 +58,7 @@ Supported guest CPU types:
 - ``cortex-a76`` (64-bit)
 - ``a64fx`` (64-bit)
 - ``host`` (with KVM only)
+- ``neoverse-n1`` (64-bit)
 - ``max`` (same as ``host`` for KVM; best possible emulation with TCG)
 
 Note that the default is ``cortex-a15``, so for an AArch64 guest you must
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 2ddde88f5e..dac8860f2d 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -146,6 +146,7 @@ static const char * const valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a57"),
     ARM_CPU_TYPE_NAME("cortex-a72"),
     ARM_CPU_TYPE_NAME("cortex-a76"),
+    ARM_CPU_TYPE_NAME("neoverse-n1"),
     ARM_CPU_TYPE_NAME("max"),
 };
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ce15c36a7f..82dd934de6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -204,6 +204,7 @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("cortex-a72"),
     ARM_CPU_TYPE_NAME("cortex-a76"),
     ARM_CPU_TYPE_NAME("a64fx"),
+    ARM_CPU_TYPE_NAME("neoverse-n1"),
     ARM_CPU_TYPE_NAME("host"),
     ARM_CPU_TYPE_NAME("max"),
 };
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index d046351991..da311b2eb5 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -259,6 +259,71 @@ static void aarch64_a76_initfn(Object *obj)
     cpu->isar.mvfr2 = 0x00000043;
 }
 
+static void aarch64_neoverse_n1_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,neoverse-n1";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by B2.4 AArch64 registers by functional group */
+    cpu->clidr = 0x82000023;
+    cpu->ctr = 0x8444c004;
+    cpu->dcz_blocksize = 4;
+    cpu->isar.id_aa64dfr0  = 0x0000000110305408ull;
+    cpu->isar.id_aa64isar0 = 0x0000100010211120ull;
+    cpu->isar.id_aa64isar1 = 0x0000000000100001ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000000000101125ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x0000000000001011ull;
+    cpu->isar.id_aa64pfr0  = 0x1100000010111112ull; /* GIC filled in later */
+    cpu->isar.id_aa64pfr1  = 0x0000000000000020ull;
+    cpu->id_afr0       = 0x00000000;
+    cpu->isar.id_dfr0  = 0x04010088;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00010142;
+    cpu->isar.id_isar5 = 0x01011121;
+    cpu->isar.id_isar6 = 0x00000010;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_mmfr4 = 0x00021110;
+    cpu->isar.id_pfr0  = 0x10010131;
+    cpu->isar.id_pfr1  = 0x00010000; /* GIC filled in later */
+    cpu->isar.id_pfr2  = 0x00000011;
+    cpu->midr = 0x414fd0c1;          /* r4p1 */
+    cpu->revidr = 0;
+
+    /* From B2.23 CCSIDR_EL1 */
+    cpu->ccsidr[0] = 0x701fe01a; /* 64KB L1 dcache */
+    cpu->ccsidr[1] = 0x201fe01a; /* 64KB L1 icache */
+    cpu->ccsidr[2] = 0x70ffe03a; /* 1MB L2 cache */
+
+    /* From B2.98 SCTLR_EL3 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    /* From B4.23 ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+
+    /* From B5.1 AdvSIMD AArch64 register summary */
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+}
+
 void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
 {
     /*
@@ -946,6 +1011,7 @@ static const ARMCPUInfo aarch64_cpus[] = {
     { .name = "cortex-a72",         .initfn = aarch64_a72_initfn },
     { .name = "cortex-a76",         .initfn = aarch64_a76_initfn },
     { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },
+    { .name = "neoverse-n1",        .initfn = aarch64_neoverse_n1_initfn },
     { .name = "max",                .initfn = aarch64_max_initfn },
 #if defined(CONFIG_KVM) || defined(CONFIG_HVF)
     { .name = "host",               .initfn = aarch64_host_initfn },
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 01/60] tcg: Add tcg_constant_ptr
  2022-04-17 17:43 ` [PATCH v3 01/60] tcg: Add tcg_constant_ptr Richard Henderson
@ 2022-04-19 10:41   ` Alex Bennée
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 10:41 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Similar to tcg_const_ptr, defer to tcg_constant_{i32,i64}.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 02/60] target/arm: Update ISAR fields for ARMv8.8
  2022-04-17 17:43 ` [PATCH v3 02/60] target/arm: Update ISAR fields for ARMv8.8 Richard Henderson
@ 2022-04-19 11:10   ` Alex Bennée
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 11:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Peter Maydell, qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Update isar fields per ARM DDI0487 H.a.
>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8
  2022-04-17 17:43 ` [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8 Richard Henderson
@ 2022-04-19 11:13   ` Alex Bennée
  2022-04-19 11:14   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 11:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Update SCR_EL3 fields per ARM DDI0487 H.a.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8
  2022-04-17 17:43 ` [PATCH v3 03/60] target/arm: Update SCR_EL3 bits to ARMv8.8 Richard Henderson
  2022-04-19 11:13   ` Alex Bennée
@ 2022-04-19 11:14   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 11:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Update SCR_EL3 fields per ARM DDI0487 H.a.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Aside: I notice you have added FEAT_foo comments to the SCTLR bits next,
it might be worth at least flagging the FEAT_RME ones here.

> ---
>  target/arm/cpu.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 9a29a4a215..f843c62c83 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1544,6 +1544,18 @@ static inline void xpsr_write(CPUARMState *env, uint32_t val, uint32_t mask)
>  #define SCR_FIEN              (1U << 21)
>  #define SCR_ENSCXT            (1U << 25)
>  #define SCR_ATA               (1U << 26)
> +#define SCR_FGTEN             (1U << 27)
> +#define SCR_ECVEN             (1U << 28)
> +#define SCR_TWEDEN            (1U << 29)
> +#define SCR_TWEDEL            MAKE_64BIT_MASK(30, 4)
> +#define SCR_TME               (1ULL << 34)
> +#define SCR_AMVOFFEN          (1ULL << 35)
> +#define SCR_ENAS0             (1ULL << 36)
> +#define SCR_ADEN              (1ULL << 37)
> +#define SCR_HXEN              (1ULL << 38)
> +#define SCR_TRNDR             (1ULL << 40)
> +#define SCR_ENTP2             (1ULL << 41)
> +#define SCR_GPF               (1ULL << 48)
>  
>  #define HSTR_TTEE (1 << 16)
>  #define HSTR_TJDBX (1 << 17)


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 04/60] target/arm: Update SCTLR bits to ARMv9.2
  2022-04-17 17:43 ` [PATCH v3 04/60] target/arm: Update SCTLR bits to ARMv9.2 Richard Henderson
@ 2022-04-19 11:16   ` Alex Bennée
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 11:16 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Update SCTLR_ELx fields per ARM DDI0487 H.a.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 05/60] target/arm: Change DisasContext.aarch64 to bool
  2022-04-17 17:43 ` [PATCH v3 05/60] target/arm: Change DisasContext.aarch64 to bool Richard Henderson
@ 2022-04-19 11:16   ` Alex Bennée
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 11:16 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Bool is a more appropriate type for this value.
> Move the member down in the struct to keep the
> bool type members together and remove a hole.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 " Richard Henderson
@ 2022-04-19 11:17   ` Alex Bennée
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-19 11:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Bool is a more appropriate type for this value.
> Adjust the assignments to use true/false.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size
  2022-04-17 17:43 ` [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size Richard Henderson
@ 2022-04-21 16:15   ` Peter Maydell
  2022-04-22 13:58   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 16:15 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:50, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Currently we assume all fields are 32-bit.
> Prepare for fields of a single byte, using sizeof.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a32.h | 13 +++++--------
>  target/arm/translate.c     | 21 ++++++++++++++++++++-
>  2 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
> index 5be4b9b834..f593740a88 100644
> --- a/target/arm/translate-a32.h
> +++ b/target/arm/translate-a32.h
> @@ -61,17 +61,14 @@ static inline TCGv_i32 load_cpu_offset(int offset)
>
>  #define load_cpu_field(name) load_cpu_offset(offsetof(CPUARMState, name))
>
> -static inline void store_cpu_offset(TCGv_i32 var, int offset)
> -{
> -    tcg_gen_st_i32(var, cpu_env, offset);
> -    tcg_temp_free_i32(var);
> -}
> +void store_cpu_offset(TCGv_i32 var, int offset, int size);
>
> -#define store_cpu_field(var, name) \
> -    store_cpu_offset(var, offsetof(CPUARMState, name))
> +#define store_cpu_field(var, name)                              \
> +    store_cpu_offset(var, offsetof(CPUARMState, name),          \
> +                     sizeof(((CPUARMState *)NULL)->name))

compiler.h defines sizeof_field, so you can write
  sizeof_field(CPUARMState, name)
here.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool
  2022-04-17 17:43 ` [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool Richard Henderson
@ 2022-04-21 16:15   ` Peter Maydell
  2022-04-22 13:59   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 16:15 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:51, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Bool is a more appropriate type for this value.
> Move the member down in the struct to keep the
> bool type members together and remove a hole.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 09/60] target/arm: Change CPUArchState.thumb to bool
  2022-04-17 17:43 ` [PATCH v3 09/60] target/arm: Change CPUArchState.thumb " Richard Henderson
@ 2022-04-21 16:18   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 16:18 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Bool is a more appropriate type for this value.
> Adjust the assignments to use true/false.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 10/60] target/arm: Remove fpexc32_access
  2022-04-17 17:43 ` [PATCH v3 10/60] target/arm: Remove fpexc32_access Richard Henderson
@ 2022-04-21 16:25   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 16:25 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:54, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> This function is incorrect in that it does not properly consider
> CPTR_EL2.FPEN.  We've already got another mechanism for raising
> an FPU access trap: ARM_CP_FPU, so use that instead.
>
> Remove CP_ACCESS_TRAP_FP_EL{2,3}, which becomes unused.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpu.h       |  5 -----
>  target/arm/helper.c    | 17 ++---------------
>  target/arm/op_helper.c | 13 -------------
>  3 files changed, 2 insertions(+), 33 deletions(-)
>

Always nice when you can fix a bug by deleting code :-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 11/60] target/arm: Split out set_btype_raw
  2022-04-17 17:43 ` [PATCH v3 11/60] target/arm: Split out set_btype_raw Richard Henderson
@ 2022-04-21 16:27   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 16:27 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:54, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Common code for reset_btype and set_btype.
> Use tcg_constant_i32.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a64.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 12/60] target/arm: Split out gen_rebuild_hflags
  2022-04-17 17:43 ` [PATCH v3 12/60] target/arm: Split out gen_rebuild_hflags Richard Henderson
@ 2022-04-21 18:47   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 18:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:58, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> For aa32, the function has a parameter to use the new el.
> For aa64, that never happens.
> Use tcg_constant_i32 while we're at it.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 13/60] target/arm: Use tcg_constant in translate-a64.c
  2022-04-17 17:43 ` [PATCH v3 13/60] target/arm: Use tcg_constant in translate-a64.c Richard Henderson
@ 2022-04-21 18:49   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 18:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:59, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_constant_{i32,i64,ptr} as appropriate throughout, which
> means we get to remove lots of tcg_temp_free_*.  Drop variables
> in many cases, passing the constant directly to another function.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a64.c | 302 +++++++++++--------------------------
>  1 file changed, 90 insertions(+), 212 deletions(-)

Split this one up a bit, please. It's not too hard to review because
all the changes are local, but if we find there's a regression
in here later I don't really want to have to wade through a 300 line
diff to find it...

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 14/60] target/arm: Simplify GEN_SHIFT in translate.c
  2022-04-17 17:43 ` [PATCH v3 14/60] target/arm: Simplify GEN_SHIFT in translate.c Richard Henderson
@ 2022-04-21 18:56   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 18:56 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:58, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Instead of computing
>
>     tmp1 = shift & 0xff;
>     dest = (tmp1 > 0x1f ? 0 : value) << (tmp1 & 0x1f)
>
> use
>
>     tmpd = value << (shift & 0x1f);
>     dest = shift & 0xe ? 0 : tmpd;

Typo here: should be "0xe0".

Otherwise

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 15/60] target/arm: Simplify gen_sar
  2022-04-17 17:43 ` [PATCH v3 15/60] target/arm: Simplify gen_sar Richard Henderson
@ 2022-04-21 18:57   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 18:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:59, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_gen_umin_i32 instead of tcg_gen_movcond_i32.
> Use tcg_constant_i32 while we're at it.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 57631c9fa1..8d6534f9a5 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -568,12 +568,10 @@ GEN_SHIFT(shr)
>
>  static void gen_sar(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
>  {
> -    TCGv_i32 tmp1, tmp2;
> -    tmp1 = tcg_temp_new_i32();
> +    TCGv_i32 tmp1 = tcg_temp_new_i32();
> +
>      tcg_gen_andi_i32(tmp1, t1, 0xff);
> -    tmp2 = tcg_const_i32(0x1f);
> -    tcg_gen_movcond_i32(TCG_COND_GTU, tmp1, tmp1, tmp2, tmp2, tmp1);
> -    tcg_temp_free_i32(tmp2);
> +    tcg_gen_umin_i32(tmp1, tmp1, tcg_constant_i32(31));
>      tcg_gen_sar_i32(dest, t0, tmp1);
>      tcg_temp_free_i32(tmp1);
>  }

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 16/60] target/arm: Simplify aa32 DISAS_WFI
  2022-04-17 17:43 ` [PATCH v3 16/60] target/arm: Simplify aa32 DISAS_WFI Richard Henderson
@ 2022-04-21 19:00   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:02, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The length of the previous insn may be computed from
> the difference of start and end addresses.
> Use tcg_constant_i32 while we're at it.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 17/60] target/arm: Use tcg_constant in translate.c
  2022-04-17 17:43 ` [PATCH v3 17/60] target/arm: Use tcg_constant in translate.c Richard Henderson
@ 2022-04-21 19:00   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:07, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_constant_{i32,i64,ptr} as appropriate throughout, which
> means we get to remove lots of tcg_temp_free_*.  Drop variables
> in many cases, passing the constant directly to another function.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate.c | 250 ++++++++++++++---------------------------
>  1 file changed, 84 insertions(+), 166 deletions(-)

Again, break this one up a little bit, please.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 18/60] target/arm: Use tcg_constant in translate-m-nocp.c
  2022-04-17 17:43 ` [PATCH v3 18/60] target/arm: Use tcg_constant in translate-m-nocp.c Richard Henderson
@ 2022-04-21 19:03   ` Peter Maydell
  2022-04-21 21:37     ` Richard Henderson
  0 siblings, 1 reply; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:02, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_constant_{i32,i64} as appropriate throughout.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-m-nocp.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/target/arm/translate-m-nocp.c b/target/arm/translate-m-nocp.c
> index d9e144e8eb..27363a7b4e 100644
> --- a/target/arm/translate-m-nocp.c
> +++ b/target/arm/translate-m-nocp.c
> @@ -173,7 +173,7 @@ static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
>      }
>
>      /* Zero the Sregs from btmreg to topreg inclusive. */
> -    zero = tcg_const_i64(0);
> +    zero = tcg_constant_i64(0);
>      if (btmreg & 1) {
>          write_neon_element64(zero, btmreg >> 1, 1, MO_32);
>          btmreg++;

Looks like we were previously leaking the TCGv for this one?

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 19/60] target/arm: Use tcg_constant in translate-neon.c
  2022-04-17 17:43 ` [PATCH v3 19/60] target/arm: Use tcg_constant in translate-neon.c Richard Henderson
@ 2022-04-21 19:06   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:11, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_constant_{i32,i64} as appropriate throughout.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-neon.c | 21 +++++++--------------
>  1 file changed, 7 insertions(+), 14 deletions(-)


Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 20/60] target/arm: Use smin/smax for do_sat_addsub_32
  2022-04-17 17:43 ` [PATCH v3 20/60] target/arm: Use smin/smax for do_sat_addsub_32 Richard Henderson
@ 2022-04-21 19:07   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:05, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The operation we're performing with the movcond
> is either min/max depending on cond -- simplify.
> Use tcg_constant_i64 while we're at it.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-sve.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 21/60] target/arm: Use tcg_constant in translate-sve.c
  2022-04-17 17:43 ` [PATCH v3 21/60] target/arm: Use tcg_constant in translate-sve.c Richard Henderson
@ 2022-04-21 19:08   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:00, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_constant_{i32,i64} as appropriate throughout.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-sve.c | 198 +++++++++++++------------------------
>  1 file changed, 68 insertions(+), 130 deletions(-)
>

Too big, again.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 22/60] target/arm: Use tcg_constant in translate-vfp.c
  2022-04-17 17:43 ` [PATCH v3 22/60] target/arm: Use tcg_constant in translate-vfp.c Richard Henderson
@ 2022-04-21 19:10   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:12, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use tcg_constant_{i32,i64} as appropriate throughout.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-vfp.c | 76 ++++++++++++--------------------------
>  1 file changed, 23 insertions(+), 53 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 23/60] target/arm: Use tcg_constant_i32 in translate.h
  2022-04-17 17:43 ` [PATCH v3 23/60] target/arm: Use tcg_constant_i32 in translate.h Richard Henderson
@ 2022-04-21 19:11   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:05, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate.h | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 24/60] target/arm: Split out cpregs.h
  2022-04-17 17:43 ` [PATCH v3 24/60] target/arm: Split out cpregs.h Richard Henderson
@ 2022-04-21 19:14   ` Peter Maydell
  2022-04-22 15:21   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-21 19:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Move ARMCPRegInfo and all related declarations to a new
> internal header, out of the public cpu.h.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

> +#ifndef TARGET_ARM_CPREGS_H
> +#define TARGET_ARM_CPREGS_H 1

Our usual style doesn't have the "1" on these include-guard defines.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 18/60] target/arm: Use tcg_constant in translate-m-nocp.c
  2022-04-21 19:03   ` Peter Maydell
@ 2022-04-21 21:37     ` Richard Henderson
  0 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-21 21:37 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-arm, qemu-devel

On 4/21/22 12:03, Peter Maydell wrote:
> On Sun, 17 Apr 2022 at 19:02, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Use tcg_constant_{i32,i64} as appropriate throughout.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   target/arm/translate-m-nocp.c | 12 +++++-------
>>   1 file changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/target/arm/translate-m-nocp.c b/target/arm/translate-m-nocp.c
>> index d9e144e8eb..27363a7b4e 100644
>> --- a/target/arm/translate-m-nocp.c
>> +++ b/target/arm/translate-m-nocp.c
>> @@ -173,7 +173,7 @@ static bool trans_VSCCLRM(DisasContext *s, arg_VSCCLRM *a)
>>       }
>>
>>       /* Zero the Sregs from btmreg to topreg inclusive. */
>> -    zero = tcg_const_i64(0);
>> +    zero = tcg_constant_i64(0);
>>       if (btmreg & 1) {
>>           write_neon_element64(zero, btmreg >> 1, 1, MO_32);
>>           btmreg++;
> 
> Looks like we were previously leaking the TCGv for this one?

Yes.  I'll update the commit message to mention that.

r~


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus
  2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
                   ` (59 preceding siblings ...)
  2022-04-17 17:44 ` [PATCH v3 60/60] target/arm: Define neoverse-n1 Richard Henderson
@ 2022-04-22  9:01 ` Peter Maydell
  60 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:01 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 18:45, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Supercedes: 20220412003326.588530-1-richard.henderson@linaro.org
> ("target/arm: 8 new features, A76 and N1")
>
> Changes for v3:
>   * More field updates for H.a.  This is not nearly complete, but what
>     I've encountered so far as I've begun implementing SME.
>   * Use bool instead of uint32_t for env->{aarch64,thumb}.
>     I had anticipated other changes for implementing PSTATE.{SM,FA},
>     but dropped those; these seemed like worth keeping.
>   * Use tcg_constant_* more -- got stuck on this while working on...
>   * Lots of cleanups to ARMCPRegInfo.
>   * Discard unreachable cpregs when ELx not available.
>   * Transform EL2 regs to RES0 when EL3 present but EL2 isn't.
>     This greatly simplifies registration of cpregs for new features.
>     Changes contextidr_el2, minimal_ras_reginfo, scxtnum_reginfo
>     within this patch set; other uses coming for SME.

To help reduce the size of this patchset for v4, I've taken most
of the first third into target-arm.next. More specifically, I've
taken patches 2-12, 14-16, 18-20, and 22, making the minor tweaks
noted in code review for patches 7, 14 and 18. Patch 1 is already
upstream. Put another way, I have not taken:

>   target/arm: Use tcg_constant in translate-a64.c
>   target/arm: Use tcg_constant in translate.c
>   target/arm: Use tcg_constant in translate-sve.c

or any of the patches from here on:
>   target/arm: Split out cpregs.h

(I'm still planning to review the second half of this v3.)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg
  2022-04-17 17:43 ` [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg Richard Henderson
@ 2022-04-22  9:32   ` Peter Maydell
  2022-04-22 15:31   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:00, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Rearrange the values of the enumerators of CPAccessResult
> so that we may directly extract the target el. For the two
> special cases in access_check_cp_reg, use CPAccessResult.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
  2022-04-17 17:43 ` [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h Richard Henderson
@ 2022-04-22  9:37   ` Peter Maydell
  2022-04-22 10:39     ` Richard Henderson
  0 siblings, 1 reply; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:08, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Remove a possible source of error by removing REGINFO_SENTINEL
> and using ARRAY_SIZE (convinently hidden inside a macro) to
> find the end of the set of regs being registered or modified.
>
> The space saved by not having the extra array element reduces
> the executable's .data.rel.ro section by about 9k.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---




> +#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
> +    do {                                                                \
> +        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
> +        if (ARRAY_SIZE(REGS) == 1) {                                    \
> +            define_one_arm_cp_reg_with_opaque(CPU, REGS, OPAQUE);       \
> +        } else {                                                        \
> +            define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,       \
> +                                               ARRAY_SIZE(REGS));       \
> +        }                                                               \
> +    } while (0)

Do we actually need to special case "array has one element" here,
or is this just efficiency?

Anyway
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 27/60] target/arm: Make some more cpreg data static const
  2022-04-17 17:43 ` [PATCH v3 27/60] target/arm: Make some more cpreg data static const Richard Henderson
@ 2022-04-22  9:38   ` Peter Maydell
  2022-04-22 15:38   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:11, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> These particular data structures are not modified at runtime.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 28/60] target/arm: Reorg ARMCPRegInfo type field bits
  2022-04-17 17:43 ` [PATCH v3 28/60] target/arm: Reorg ARMCPRegInfo type field bits Richard Henderson
@ 2022-04-22  9:49   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:03, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Instead of defining ARM_CP_FLAG_MASK to remove flags,
> define ARM_CP_SPECIAL_MASK to isolate special cases.
> Sort the specials to the low bits. Use an enum.
>
> Make ARM_CP_CONST a special case, and ARM_CP_NOP an alias.

This is a behaviour change -- read of an ARM_CP_NOP register
currently is "do not change the destination GPR", but
read of an ARM_CP_CONST register does change the GPR.
Maybe that's OK, but we do have several r/w NOP register
definitions at the moment, and if we do think it's OK to change
that behaviour we should do that carefully in a separate patch.

Also, currently we do not migrate ARM_CP_NOP registers but
we do migrate ARM_CP_CONST registers (and this is important for
KVM --migrating the constant ID registers is how the check for
"is the destination a different host CPU to the source" works).
So overall I don't think it's safe to squash NOP and CONST,
or to treat CONST as one of the SPECIALs.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 29/60] target/arm: Change cpreg access permissions to enum
  2022-04-17 17:43 ` [PATCH v3 29/60] target/arm: Change cpreg access permissions to enum Richard Henderson
@ 2022-04-22  9:52   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:52 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:15, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Create a typedef as well, and use it in ARMCPRegInfo.
> This won't be perfect for debugging, but it'll nicely
> display the most common cases.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

> @@ -8741,8 +8741,7 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
>              break;
>          default:
>              /* broken reginfo with out-of-range opc1 */
> -            assert(false);
> -            break;
> +            g_assert_not_reached();
>          }
>          /* assert our permissions are not too lax (stricter is fine) */
>          assert((r->access & ~mask) == 0);

This part is an unrelated change and should be a separate patch.
Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 30/60] target/arm: Name CPState type
  2022-04-17 17:43 ` [PATCH v3 30/60] target/arm: Name CPState type Richard Henderson
@ 2022-04-22  9:53   ` Peter Maydell
  2022-04-22 15:51   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:16, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Give this enum a name and use in ARMCPRegInfo,
> add_cpreg_to_hashtable and define_one_arm_cp_reg_with_opaque.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 31/60] target/arm: Name CPSecureState type
  2022-04-17 17:43 ` [PATCH v3 31/60] target/arm: Name CPSecureState type Richard Henderson
@ 2022-04-22  9:57   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22  9:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:21, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Give this enum a name and use in ARMCPRegInfo and add_cpreg_to_hashtable.
> Add the enumerator ARM_CP_SECSTATE_BOTH to clarify how 0
> is handled in define_one_arm_cp_reg_with_opaque.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpregs.h | 7 ++++---
>  target/arm/helper.c | 3 ++-
>  2 files changed, 6 insertions(+), 4 deletions(-)

Maybe also change the "default" in the switch in
define_one_arm_cp_reg_with_opaque() to "case ARM_CP_SECSTATE_BOTH" ?
Either way,
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 32/60] target/arm: Update sysreg fields when redirecting for E2H
  2022-04-17 17:43 ` [PATCH v3 32/60] target/arm: Update sysreg fields when redirecting for E2H Richard Henderson
@ 2022-04-22 10:39   ` Peter Maydell
  2022-05-01  1:03     ` Richard Henderson
  0 siblings, 1 reply; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 10:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:07, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The new_key is always non-zero during redirection,
> so remove the if.  Update opc0 et al from the new key.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 7c569a569a..aee195400b 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -5915,7 +5915,9 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
>
>      for (i = 0; i < ARRAY_SIZE(aliases); i++) {
>          const struct E2HAlias *a = &aliases[i];
> -        ARMCPRegInfo *src_reg, *dst_reg;
> +        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
> +        uint32_t *new_key;
> +        bool ok;
>
>          if (a->feature && !a->feature(&cpu->isar)) {
>              continue;
> @@ -5934,19 +5936,28 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
>          g_assert(src_reg->opaque == NULL);
>
>          /* Create alias before redirection so we dup the right data. */
> -        if (a->new_key) {
> -            ARMCPRegInfo *new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
> -            uint32_t *new_key = g_memdup(&a->new_key, sizeof(uint32_t));
> -            bool ok;
> +        new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
> +        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
>
> -            new_reg->name = a->new_name;
> -            new_reg->type |= ARM_CP_ALIAS;
> -            /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
> -            new_reg->access &= PL2_RW | PL3_RW;
> +        new_reg->name = a->new_name;
> +        new_reg->type |= ARM_CP_ALIAS;
> +        /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
> +        new_reg->access &= PL2_RW;
>
> -            ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
> -            g_assert(ok);
> -        }
> +#define E(X, N) \
> +    ((X & CP_REG_ARM64_SYSREG_##N##_MASK) >> CP_REG_ARM64_SYSREG_##N##_SHIFT)
> +
> +        /* Update the sysreg fields */
> +        new_reg->opc0 = E(a->new_key, OP0);
> +        new_reg->opc1 = E(a->new_key, OP1);
> +        new_reg->crn = E(a->new_key, CRN);
> +        new_reg->crm = E(a->new_key, CRM);
> +        new_reg->opc2 = E(a->new_key, OP2);
> +
> +#undef E
> +
> +        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
> +        g_assert(ok);

So is setting the new_reg opc etc fields here fixing a bug
(or otherwise changing behaviour)?

The effect is that read/write callbacks now get an ARMCPRegInfo*
that has the opc &c for the alias, rather than for the thing being
aliased. That's good if the read/write callbacks have a need to
distinguish the alias access from a normal one (do they anywhere?).
On the other hand it's bad if we have existing code that thinks it
can distinguish FOO_EL1 from FOO_EL2 by looking at the opc &c
values and now might get confused.

Overall, unless we have a definite reason why we want the
callback functions to be able to tell the alias from the normal
access, I'm inclined to say we should just comment that we
deliberately leave the sysreg fields alone. (Put another way,
I don't really want to have to work through all the aliased
registers here checking whether they have read/write functions
that look at the opc fields and whether any of them would
end up doing the wrong thing when handed the alias reginfo.)

The "remove the if()" part is fine if you wanted to do that
as its own patch.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
  2022-04-22  9:37   ` Peter Maydell
@ 2022-04-22 10:39     ` Richard Henderson
  2022-04-22 15:36       ` Alex Bennée
  0 siblings, 1 reply; 1596+ messages in thread
From: Richard Henderson @ 2022-04-22 10:39 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-arm, qemu-devel

On 4/22/22 02:37, Peter Maydell wrote:
> On Sun, 17 Apr 2022 at 19:08, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Remove a possible source of error by removing REGINFO_SENTINEL
>> and using ARRAY_SIZE (convinently hidden inside a macro) to
>> find the end of the set of regs being registered or modified.
>>
>> The space saved by not having the extra array element reduces
>> the executable's .data.rel.ro section by about 9k.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
> 
> 
> 
> 
>> +#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
>> +    do {                                                                \
>> +        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
>> +        if (ARRAY_SIZE(REGS) == 1) {                                    \
>> +            define_one_arm_cp_reg_with_opaque(CPU, REGS, OPAQUE);       \
>> +        } else {                                                        \
>> +            define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,       \
>> +                                               ARRAY_SIZE(REGS));       \
>> +        }                                                               \
>> +    } while (0)
> 
> Do we actually need to special case "array has one element" here,
> or is this just efficiency?
> 
> Anyway
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

Just efficiency.  There seem to be a lot of these.


r~


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 33/60] target/arm: Store cpregs key in the hash table directly
  2022-04-17 17:43 ` [PATCH v3 33/60] target/arm: Store cpregs key in the hash table directly Richard Henderson
@ 2022-04-22 10:46   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 10:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:09, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Cast the uint32_t key into a gpointer directly, which
> allows us to avoid allocating storage for each key.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpu.c     |  4 ++--
>  target/arm/gdbstub.c |  2 +-
>  target/arm/helper.c  | 45 ++++++++++++++++++++------------------------
>  3 files changed, 23 insertions(+), 28 deletions(-)
>


> @@ -231,11 +228,9 @@ static void add_cpreg_to_list(gpointer key, gpointer opaque)
>  static void count_cpreg(gpointer key, gpointer opaque)
>  {
>      ARMCPU *cpu = opaque;
> -    uint64_t regidx;
>      const ARMCPRegInfo *ri;
>
> -    regidx = *(uint32_t *)key;
> -    ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
> +    ri = g_hash_table_lookup(cpu->cp_regs, key);

Here we turn a get_arm_cp_reginfo() call into a direct
g_hash_table_lookup()...

>      if (!(ri->type & (ARM_CP_NO_RAW|ARM_CP_ALIAS))) {
>          cpu->cpreg_array_len++;

> @@ -5915,16 +5910,17 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
>
>      for (i = 0; i < ARRAY_SIZE(aliases); i++) {
>          const struct E2HAlias *a = &aliases[i];
> -        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
> -        uint32_t *new_key;
> +        const ARMCPRegInfo *dst_reg;
> +        ARMCPRegInfo *src_reg;
> +        ARMCPRegInfo *new_reg;
>          bool ok;
>
>          if (a->feature && !a->feature(&cpu->isar)) {
>              continue;
>          }
>
> -        src_reg = g_hash_table_lookup(cpu->cp_regs, &a->src_key);
> -        dst_reg = g_hash_table_lookup(cpu->cp_regs, &a->dst_key);
> +        src_reg = (ARMCPRegInfo *)get_arm_cp_reginfo(cpu->cp_regs, a->src_key);
> +        dst_reg = get_arm_cp_reginfo(cpu->cp_regs, a->dst_key);

...but here we turn g_hash_table_lookup() calls into
get_arm_cp_reginfo() calls (necessitating an ugly cast-away-const).
Is there a rationale for when we should use which function ?

>          g_assert(src_reg != NULL);
>          g_assert(dst_reg != NULL);

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 34/60] target/arm: Cleanup add_cpreg_to_hashtable
  2022-04-17 17:44 ` [PATCH v3 34/60] target/arm: Cleanup add_cpreg_to_hashtable Richard Henderson
@ 2022-04-22 10:48   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 10:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use a single memory allocation for name and reginfo.
> Perform the override check early; use assert not printf+abort.
> Use a switch statement to validate state.

I think this patch is trying to do too many things in one commit.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL
  2022-04-17 17:44 ` [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL Richard Henderson
@ 2022-04-22 10:57   ` Peter Maydell
  2022-04-26  9:40     ` Peter Maydell
  2022-04-26 15:31     ` Peter Maydell
  0 siblings, 2 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 10:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:21, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> More gracefully handle cpregs when EL2 and/or EL3 are missing.
> If the reg is entirely inaccessible, do not register it at all.
> If the reg is for EL2, and EL3 is present but EL2 is not,
> squash to ARM_CP_CONST.

I don't think we should do this unconditionally. Just because
the architecture usually defines that an _EL2 register is
RES0 if EL2 is not present doesn't mean that it does so for
every register or that it guarantees that it will continue
to do so in future. (For instance I found ZCR_EL2 and TFSR_EL2
don't have that statement in their documentation, which might or
might not be an oversight.) You could add an ARM_CP_ flag for
"RES0 if no EL2" or something I guess?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 40/60] target/arm: Move cortex impdef sysregs to cpu_tcg.c
  2022-04-17 17:44 ` [PATCH v3 40/60] target/arm: Move cortex impdef sysregs to cpu_tcg.c Richard Henderson
@ 2022-04-22 11:01   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 11:01 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Previously we were defining some of these in user-only mode,
> but none of them are accessible from user-only, therefore
> define them only in system mode.
>
> This will shortly be used from cpu_tcg.c also.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> v2: New patch.
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 41/60] target/arm: Update qemu-system-arm -cpu max to cortex-a57
  2022-04-17 17:44 ` [PATCH v3 41/60] target/arm: Update qemu-system-arm -cpu max to cortex-a57 Richard Henderson
@ 2022-04-22 11:02   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 11:02 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:29, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Instead of starting with cortex-a15 and adding v8 features to
> a v7 cpu, begin with a v8 cpu stripped of its aarch64 features.
> This fixes the long-standing to-do where we only enabled v8
> features for user-only.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 43/60] target/arm: Split out aa32_max_features
  2022-04-17 17:44 ` [PATCH v3 43/60] target/arm: Split out aa32_max_features Richard Henderson
@ 2022-04-22 11:03   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 11:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:37, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Share the code to set AArch32 max features so that we no
> longer have code drift between qemu{-system,}-{arm,aarch64}.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/internals.h |   2 +
>  target/arm/cpu64.c     |  50 +-----------------
>  target/arm/cpu_tcg.c   | 114 ++++++++++++++++++++++-------------------
>  3 files changed, 65 insertions(+), 101 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 51/60] target/arm: Implement virtual SError exceptions
  2022-04-17 17:44 ` [PATCH v3 51/60] target/arm: Implement virtual SError exceptions Richard Henderson
@ 2022-04-22 11:06   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 11:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Virtual SError exceptions are raised by setting HCR_EL2.VSE,
> and are routed to EL1 just like other virtual exceptions.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> v2: Honor EAE for reporting VSERR to aa32.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 59/60] target/arm: Define cortex-a76
  2022-04-17 17:44 ` [PATCH v3 59/60] target/arm: Define cortex-a76 Richard Henderson
@ 2022-04-22 11:08   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 11:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:52, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Enable the a76 for virt and sbsa board use.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 60/60] target/arm: Define neoverse-n1
  2022-04-17 17:44 ` [PATCH v3 60/60] target/arm: Define neoverse-n1 Richard Henderson
@ 2022-04-22 11:08   ` Peter Maydell
  0 siblings, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 11:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Sun, 17 Apr 2022 at 19:45, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Enable the n1 for virt and sbsa board use.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size
  2022-04-17 17:43 ` [PATCH v3 07/60] target/arm: Extend store_cpu_offset to take field size Richard Henderson
  2022-04-21 16:15   ` Peter Maydell
@ 2022-04-22 13:58   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 13:58 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Currently we assume all fields are 32-bit.
> Prepare for fields of a single byte, using sizeof.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool
  2022-04-17 17:43 ` [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool Richard Henderson
  2022-04-21 16:15   ` Peter Maydell
@ 2022-04-22 13:59   ` Alex Bennée
  2022-04-22 14:04     ` Peter Maydell
  1 sibling, 1 reply; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 13:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Bool is a more appropriate type for this value.
> Move the member down in the struct to keep the
> bool type members together and remove a hole.

Does gcc even attempt to pack bools? Aren't they basically int types?

Anyway:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate.h     | 2 +-
>  target/arm/translate-a64.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/arm/translate.h b/target/arm/translate.h
> index 8b7dd1a4c0..050d80f6f9 100644
> --- a/target/arm/translate.h
> +++ b/target/arm/translate.h
> @@ -30,7 +30,6 @@ typedef struct DisasContext {
>      bool eci_handled;
>      /* TCG op to rewind to if this turns out to be an invalid ECI state */
>      TCGOp *insn_eci_rewind;
> -    int thumb;
>      int sctlr_b;
>      MemOp be_data;
>  #if !defined(CONFIG_USER_ONLY)
> @@ -65,6 +64,7 @@ typedef struct DisasContext {
>      GHashTable *cp_regs;
>      uint64_t features; /* CPU features bits */
>      bool aarch64;
> +    bool thumb;
>      /* Because unallocated encodings generate different exception syndrome
>       * information from traps due to FP being disabled, we can't do a single
>       * "is fp access disabled" check at a high level in the decode tree.
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 4dad23db48..be7283b966 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -14670,7 +14670,7 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
>       */
>      dc->secure_routed_to_el3 = arm_feature(env, ARM_FEATURE_EL3) &&
>                                 !arm_el_is_aa64(env, 3);
> -    dc->thumb = 0;
> +    dc->thumb = false;
>      dc->sctlr_b = 0;
>      dc->be_data = EX_TBFLAG_ANY(tb_flags, BE_DATA) ? MO_BE : MO_LE;
>      dc->condexec_mask = 0;


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool
  2022-04-22 13:59   ` Alex Bennée
@ 2022-04-22 14:04     ` Peter Maydell
  2022-04-22 15:24       ` Richard Henderson
  0 siblings, 1 reply; 1596+ messages in thread
From: Peter Maydell @ 2022-04-22 14:04 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-arm, Richard Henderson, qemu-devel

On Fri, 22 Apr 2022 at 15:01, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
> > Bool is a more appropriate type for this value.
> > Move the member down in the struct to keep the
> > bool type members together and remove a hole.
>
> Does gcc even attempt to pack bools? Aren't they basically int types?

It's impdef, I think, but it'll typically be a 1 byte integer
rather than a 4 byte integer, with the usual struct packing
rules for 1 byte type sizes.

-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 24/60] target/arm: Split out cpregs.h
  2022-04-17 17:43 ` [PATCH v3 24/60] target/arm: Split out cpregs.h Richard Henderson
  2022-04-21 19:14   ` Peter Maydell
@ 2022-04-22 15:21   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 15:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Move ARMCPRegInfo and all related declarations to a new
> internal header, out of the public cpu.h.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 08/60] target/arm: Change DisasContext.thumb to bool
  2022-04-22 14:04     ` Peter Maydell
@ 2022-04-22 15:24       ` Richard Henderson
  0 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-22 15:24 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée; +Cc: qemu-arm, qemu-devel

On 4/22/22 07:04, Peter Maydell wrote:
> On Fri, 22 Apr 2022 at 15:01, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>>
>> Richard Henderson <richard.henderson@linaro.org> writes:
>>
>>> Bool is a more appropriate type for this value.
>>> Move the member down in the struct to keep the
>>> bool type members together and remove a hole.
>>
>> Does gcc even attempt to pack bools? Aren't they basically int types?
> 
> It's impdef, I think, but it'll typically be a 1 byte integer
> rather than a 4 byte integer, with the usual struct packing
> rules for 1 byte type sizes.

Yep, it's 1 byte for all extant abis except macos where it's 4.


r~


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg
  2022-04-17 17:43 ` [PATCH v3 25/60] target/arm: Reorg CPAccessResult and access_check_cp_reg Richard Henderson
  2022-04-22  9:32   ` Peter Maydell
@ 2022-04-22 15:31   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 15:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Rearrange the values of the enumerators of CPAccessResult
> so that we may directly extract the target el. For the two
> special cases in access_check_cp_reg, use CPAccessResult.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpregs.h    | 26 ++++++++++++--------
>  target/arm/op_helper.c | 56 +++++++++++++++++++++---------------------
>  2 files changed, 44 insertions(+), 38 deletions(-)
>
> diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
> index 005aa2d3a5..700fcc1478 100644
> --- a/target/arm/cpregs.h
> +++ b/target/arm/cpregs.h
> @@ -167,26 +167,32 @@ static inline bool cptype_valid(int cptype)
>  typedef enum CPAccessResult {
>      /* Access is permitted */
>      CP_ACCESS_OK = 0,
> +
> +    /*
> +     * Combined with one of the following, the low 2 bits indicate the
> +     * target exception level.  If 0, the exception is taken to the usual
> +     * target EL (EL1 or PL1 if in EL0, otherwise to the current EL).
> +     */
> +    CP_ACCESS_EL_MASK = 3,
> +
>      /*
>       * Access fails due to a configurable trap or enable which would
>       * result in a categorized exception syndrome giving information about
>       * the failing instruction (ie syndrome category 0x3, 0x4, 0x5, 0x6,
> -     * 0xc or 0x18). The exception is taken to the usual target EL (EL1 or
> -     * PL1 if in EL0, otherwise to the current EL).
> +     * 0xc or 0x18).
>       */
> -    CP_ACCESS_TRAP = 1,
> +    CP_ACCESS_TRAP = (1 << 2),
> +    CP_ACCESS_TRAP_EL2 = CP_ACCESS_TRAP | 2,
> +    CP_ACCESS_TRAP_EL3 = CP_ACCESS_TRAP | 3,
> +
>      /*
>       * Access fails and results in an exception syndrome 0x0 ("uncategorized").
>       * Note that this is not a catch-all case -- the set of cases which may
>       * result in this failure is specifically defined by the architecture.
>       */
> -    CP_ACCESS_TRAP_UNCATEGORIZED = 2,
> -    /* As CP_ACCESS_TRAP, but for traps directly to EL2 or EL3 */
> -    CP_ACCESS_TRAP_EL2 = 3,
> -    CP_ACCESS_TRAP_EL3 = 4,
> -    /* As CP_ACCESS_UNCATEGORIZED, but for traps directly to EL2 or EL3 */
> -    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = 5,
> -    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = 6,
> +    CP_ACCESS_TRAP_UNCATEGORIZED = (2 << 2),
> +    CP_ACCESS_TRAP_UNCATEGORIZED_EL2 = CP_ACCESS_TRAP_UNCATEGORIZED | 2,
> +    CP_ACCESS_TRAP_UNCATEGORIZED_EL3 = CP_ACCESS_TRAP_UNCATEGORIZED | 3,
>  } CPAccessResult;

This does feel like we are moving from an enum to a bunch of #defines
for bitfields. I guess we keep type checking though....

Anyway:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
  2022-04-22 10:39     ` Richard Henderson
@ 2022-04-22 15:36       ` Alex Bennée
  2022-05-01  0:10         ` Richard Henderson
  0 siblings, 1 reply; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 15:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Peter Maydell, qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> On 4/22/22 02:37, Peter Maydell wrote:
>> On Sun, 17 Apr 2022 at 19:08, Richard Henderson
>> <richard.henderson@linaro.org> wrote:
>>>
>>> Remove a possible source of error by removing REGINFO_SENTINEL
>>> and using ARRAY_SIZE (convinently hidden inside a macro) to
>>> find the end of the set of regs being registered or modified.
>>>
>>> The space saved by not having the extra array element reduces
>>> the executable's .data.rel.ro section by about 9k.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>> 
>>> +#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
>>> +    do {                                                                \
>>> +        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
>>> +        if (ARRAY_SIZE(REGS) == 1) {                                    \
>>> +            define_one_arm_cp_reg_with_opaque(CPU, REGS, OPAQUE);       \
>>> +        } else {                                                        \
>>> +            define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,       \
>>> +                                               ARRAY_SIZE(REGS));       \
>>> +        }                                                               \
>>> +    } while (0)
>> Do we actually need to special case "array has one element" here,
>> or is this just efficiency?
>> Anyway
>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>
> Just efficiency.  There seem to be a lot of these.

If you moved define_arm_cp_regs_with_opaque_len into the header as an
inline surely the compiler could figure it out when presented with a
constant i? The would avoid the need for the special casing in the macro
expansion right?

Anyway:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


>
>
> r~


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 27/60] target/arm: Make some more cpreg data static const
  2022-04-17 17:43 ` [PATCH v3 27/60] target/arm: Make some more cpreg data static const Richard Henderson
  2022-04-22  9:38   ` Peter Maydell
@ 2022-04-22 15:38   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 15:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> These particular data structures are not modified at runtime.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 30/60] target/arm: Name CPState type
  2022-04-17 17:43 ` [PATCH v3 30/60] target/arm: Name CPState type Richard Henderson
  2022-04-22  9:53   ` Peter Maydell
@ 2022-04-22 15:51   ` Alex Bennée
  1 sibling, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-22 15:51 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel


Richard Henderson <richard.henderson@linaro.org> writes:

> Give this enum a name and use in ARMCPRegInfo,
> add_cpreg_to_hashtable and define_one_arm_cp_reg_with_opaque.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/cpregs.h | 6 +++---
>  target/arm/helper.c | 6 ++++--
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/target/arm/cpregs.h b/target/arm/cpregs.h
> index 2c991ab5df..fe338730ab 100644
> --- a/target/arm/cpregs.h
> +++ b/target/arm/cpregs.h
> @@ -116,11 +116,11 @@ enum {
>   * Note that we rely on the values of these enums as we iterate through
>   * the various states in some places.
>   */
> -enum {
> +typedef enum {
>      ARM_CP_STATE_AA32 = 0,
>      ARM_CP_STATE_AA64 = 1,
>      ARM_CP_STATE_BOTH = 2,
> -};
> +} CPState;
>  
>  /*
>   * ARM CP register secure state flags.  These flags identify security state
> @@ -262,7 +262,7 @@ struct ARMCPRegInfo {
>      uint8_t opc1;
>      uint8_t opc2;
>      /* Execution state in which this register is visible: ARM_CP_STATE_* */
> -    int state;
> +    CPState state;
>      /* Register type: ARM_CP_* bits/values */
>      int type;
>      /* Access rights: PL*_[RW] */
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 33ba77890b..8b89039667 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -8503,7 +8503,7 @@ CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
>  }
>  
>  static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
> -                                   void *opaque, int state, int secstate,
> +                                   void *opaque, CPState state, int secstate,
>                                     int crm, int opc1, int opc2,
>                                     const char *name)
>  {
> @@ -8663,13 +8663,15 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
>       * bits; the ARM_CP_64BIT* flag applies only to the AArch32 view of
>       * the register, if any.
>       */
> -    int crm, opc1, opc2, state;
> +    int crm, opc1, opc2;
>      int crmmin = (r->crm == CP_ANY) ? 0 : r->crm;
>      int crmmax = (r->crm == CP_ANY) ? 15 : r->crm;
>      int opc1min = (r->opc1 == CP_ANY) ? 0 : r->opc1;
>      int opc1max = (r->opc1 == CP_ANY) ? 7 : r->opc1;
>      int opc2min = (r->opc2 == CP_ANY) ? 0 : r->opc2;
>      int opc2max = (r->opc2 == CP_ANY) ? 7 : r->opc2;
> +    CPState state;
> +
>      /* 64 bit registers have only CRm and Opc1 fields */
>      assert(!((r->type & ARM_CP_64BIT) && (r->opc2 || r->crn)));
>      /* op0 only exists in the AArch64 encodings */


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL
  2022-04-22 10:57   ` Peter Maydell
@ 2022-04-26  9:40     ` Peter Maydell
  2022-04-26 15:31     ` Peter Maydell
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-26  9:40 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Fri, 22 Apr 2022 at 11:57, Peter Maydell <peter.maydell@linaro.org> wrote:
> Just because
> the architecture usually defines that an _EL2 register is
> RES0 if EL2 is not present doesn't mean that it does so for
> every register or that it guarantees that it will continue
> to do so in future. (For instance I found ZCR_EL2 and TFSR_EL2
> don't have that statement in their documentation, which might or
> might not be an oversight.)

In those two cases, it probably is a documentation omission.

-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 35/60] target/arm: Handle cpreg registration for missing EL
  2022-04-22 10:57   ` Peter Maydell
  2022-04-26  9:40     ` Peter Maydell
@ 2022-04-26 15:31     ` Peter Maydell
  1 sibling, 0 replies; 1596+ messages in thread
From: Peter Maydell @ 2022-04-26 15:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-arm, qemu-devel

On Fri, 22 Apr 2022 at 11:57, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Sun, 17 Apr 2022 at 19:21, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > More gracefully handle cpregs when EL2 and/or EL3 are missing.
> > If the reg is entirely inaccessible, do not register it at all.
> > If the reg is for EL2, and EL3 is present but EL2 is not,
> > squash to ARM_CP_CONST.
>
> I don't think we should do this unconditionally. Just because
> the architecture usually defines that an _EL2 register is
> RES0 if EL2 is not present doesn't mean that it does so for
> every register or that it guarantees that it will continue
> to do so in future. (For instance I found ZCR_EL2 and TFSR_EL2
> don't have that statement in their documentation, which might or
> might not be an oversight.) You could add an ARM_CP_ flag for
> "RES0 if no EL2" or something I guess?

I've just found rule R_RJFFP in section D1.1.3 of DDI0487H.a,
which explicitly documents that if EL2 is not implemented and
EL3 is, then "every accessible register associated with EL2 is
RES0" except for an enumerated list of six registers which are
not. So I'm more open to "default to RES0, with a flag to
suppress that default" than I was when I first reviewed this
patchset.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 56/60] target/arm: Enable FEAT_CSV2_2 for -cpu max
  2022-04-17 17:44 ` [PATCH v3 56/60] target/arm: Enable FEAT_CSV2_2 " Richard Henderson
@ 2022-04-29  9:52   ` Damien Hedde
  2022-04-29 18:06     ` Richard Henderson
  0 siblings, 1 reply; 1596+ messages in thread
From: Damien Hedde @ 2022-04-29  9:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm



On 4/17/22 19:44, Richard Henderson wrote:
> There is no branch prediction in TCG, therefore there is no
> need to actually include the context number into the predictor.
> Therefore all we need to do is add the state for SCXTNUM_ELx. >
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> v2: Update emulation.rst; clear CSV2_FRAC; use decimal; tidy access_scxtnum.
> v3: Rely on EL3-no-EL2 squashing during registration.
> ---
>   docs/system/arm/emulation.rst |  3 ++
>   target/arm/cpu.h              | 16 +++++++++
>   target/arm/cpu64.c            |  3 +-
>   target/arm/helper.c           | 66 ++++++++++++++++++++++++++++++++++-
>   4 files changed, 86 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> @@ -7233,7 +7243,57 @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
>       },
>   };

Hi Richard,

I tried to compare with the pseudocode from arm doc and I've a few 
interrogations. It seems to me there are missing cases in the access 
checks, but I lack the background to know if these are not checked 
somewhere else or guaranteed to never happen.

>   
> -#endif > +static CPAccessResult access_scxtnum(CPUARMState *env, const 
ARMCPRegInfo *ri,
> +                                     bool isread)
> +{
The following checks are missing:
    + for HFG[W/R]TR_EL2.SCXTNUM_EL0/1
    + HCR_EL2.<NV2,NV1,NV> when accessing SCXTNUM_EL1, but maybe these 
are always guaranteed to fail because we don't support the features ?
    + HCR_EL2.NV when accessing SCXTNUM_EL2
> +    int el = arm_current_el(env);
> +
> +    if (el == 0) {
> +        uint64_t hcr = arm_hcr_el2_eff(env);
> +        if ((hcr & (HCR_TGE | HCR_E2H)) != (HCR_TGE | HCR_E2H)) {
> +            if (env->cp15.sctlr_el[1] & SCTLR_TSCXT) {
> +                if (hcr & HCR_TGE) {
> +                    return CP_ACCESS_TRAP_EL2;
> +                }
> +                return CP_ACCESS_TRAP;
> +            }
> +            if (arm_is_el2_enabled(env) && !(hcr & HCR_ENSCXT)) {
This case is also present when accessing SCXTNUM_EL0 from el1 (but 
without "(hcr & (HCR_TGE | HCR_E2H)) != (HCR_TGE | HCR_E2H)" precondition)
> +                return CP_ACCESS_TRAP_EL2;
> +            }
> +            goto no_sctlr_el2;
> +        }
> +    }
> +    if (el < 2 && (env->cp15.sctlr_el[2] & SCTLR_TSCXT)) {
> +        return CP_ACCESS_TRAP_EL2;
> +    }
> + no_sctlr_el2:
> +    if (el < 3
> +        && arm_feature(env, ARM_FEATURE_EL3)
> +        && !(env->cp15.scr_el3 & SCR_ENSCXT)) {
> +        return CP_ACCESS_TRAP_EL3;
> +    }
> +    return CP_ACCESS_OK;
> +}

Regards,
--
Damien


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 56/60] target/arm: Enable FEAT_CSV2_2 for -cpu max
  2022-04-29  9:52   ` Damien Hedde
@ 2022-04-29 18:06     ` Richard Henderson
  0 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-04-29 18:06 UTC (permalink / raw)
  To: Damien Hedde, qemu-devel; +Cc: qemu-arm

On 4/29/22 02:52, Damien Hedde wrote:
> The following checks are missing:
>     + for HFG[W/R]TR_EL2.SCXTNUM_EL0/1

We do not yet support FEAT_FGT.

>     + HCR_EL2.<NV2,NV1,NV> when accessing SCXTNUM_EL1, but maybe these are always 
> guaranteed to fail because we don't support the features ?
>     + HCR_EL2.NV when accessing SCXTNUM_EL2

We do not yet support FEAT_NV or FEAT_NV2.

>> +            if (arm_is_el2_enabled(env) && !(hcr & HCR_ENSCXT)) {
> This case is also present when accessing SCXTNUM_EL0 from el1 (but without "(hcr & 
> (HCR_TGE | HCR_E2H)) != (HCR_TGE | HCR_E2H)" precondition)

Quite right.  Will fix.


r~


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 26/60] target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
  2022-04-22 15:36       ` Alex Bennée
@ 2022-05-01  0:10         ` Richard Henderson
  0 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-05-01  0:10 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Peter Maydell, qemu-arm, qemu-devel

On 4/22/22 08:36, Alex Bennée wrote:
>>>> +#define define_arm_cp_regs_with_opaque(CPU, REGS, OPAQUE)               \
>>>> +    do {                                                                \
>>>> +        QEMU_BUILD_BUG_ON(ARRAY_SIZE(REGS) == 0);                       \
>>>> +        if (ARRAY_SIZE(REGS) == 1) {                                    \
>>>> +            define_one_arm_cp_reg_with_opaque(CPU, REGS, OPAQUE);       \
>>>> +        } else {                                                        \
>>>> +            define_arm_cp_regs_with_opaque_len(CPU, REGS, OPAQUE,       \
>>>> +                                               ARRAY_SIZE(REGS));       \
>>>> +        }                                                               \
>>>> +    } while (0)
>>> Do we actually need to special case "array has one element" here,
>>> or is this just efficiency?
>>> Anyway
>>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>
>> Just efficiency.  There seem to be a lot of these.
> 
> If you moved define_arm_cp_regs_with_opaque_len into the header as an
> inline surely the compiler could figure it out when presented with a
> constant i? The would avoid the need for the special casing in the macro
> expansion right?

I didn't want to expand the code size with so many loops.
Anyway, I'm dropping the special case entirely for v4.


r~


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re: [PATCH v3 32/60] target/arm: Update sysreg fields when redirecting for E2H
  2022-04-22 10:39   ` Peter Maydell
@ 2022-05-01  1:03     ` Richard Henderson
  0 siblings, 0 replies; 1596+ messages in thread
From: Richard Henderson @ 2022-05-01  1:03 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-arm, qemu-devel

On 4/22/22 03:39, Peter Maydell wrote:
> On Sun, 17 Apr 2022 at 19:07, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> The new_key is always non-zero during redirection,
>> so remove the if.  Update opc0 et al from the new key.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   target/arm/helper.c | 35 +++++++++++++++++++++++------------
>>   1 file changed, 23 insertions(+), 12 deletions(-)
>>
>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>> index 7c569a569a..aee195400b 100644
>> --- a/target/arm/helper.c
>> +++ b/target/arm/helper.c
>> @@ -5915,7 +5915,9 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
>>
>>       for (i = 0; i < ARRAY_SIZE(aliases); i++) {
>>           const struct E2HAlias *a = &aliases[i];
>> -        ARMCPRegInfo *src_reg, *dst_reg;
>> +        ARMCPRegInfo *src_reg, *dst_reg, *new_reg;
>> +        uint32_t *new_key;
>> +        bool ok;
>>
>>           if (a->feature && !a->feature(&cpu->isar)) {
>>               continue;
>> @@ -5934,19 +5936,28 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
>>           g_assert(src_reg->opaque == NULL);
>>
>>           /* Create alias before redirection so we dup the right data. */
>> -        if (a->new_key) {
>> -            ARMCPRegInfo *new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
>> -            uint32_t *new_key = g_memdup(&a->new_key, sizeof(uint32_t));
>> -            bool ok;
>> +        new_reg = g_memdup(src_reg, sizeof(ARMCPRegInfo));
>> +        new_key = g_memdup(&a->new_key, sizeof(uint32_t));
>>
>> -            new_reg->name = a->new_name;
>> -            new_reg->type |= ARM_CP_ALIAS;
>> -            /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
>> -            new_reg->access &= PL2_RW | PL3_RW;
>> +        new_reg->name = a->new_name;
>> +        new_reg->type |= ARM_CP_ALIAS;
>> +        /* Remove PL1/PL0 access, leaving PL2/PL3 R/W in place.  */
>> +        new_reg->access &= PL2_RW;
>>
>> -            ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
>> -            g_assert(ok);
>> -        }
>> +#define E(X, N) \
>> +    ((X & CP_REG_ARM64_SYSREG_##N##_MASK) >> CP_REG_ARM64_SYSREG_##N##_SHIFT)
>> +
>> +        /* Update the sysreg fields */
>> +        new_reg->opc0 = E(a->new_key, OP0);
>> +        new_reg->opc1 = E(a->new_key, OP1);
>> +        new_reg->crn = E(a->new_key, CRN);
>> +        new_reg->crm = E(a->new_key, CRM);
>> +        new_reg->opc2 = E(a->new_key, OP2);
>> +
>> +#undef E
>> +
>> +        ok = g_hash_table_insert(cpu->cp_regs, new_key, new_reg);
>> +        g_assert(ok);
> 
> So is setting the new_reg opc etc fields here fixing a bug
> (or otherwise changing behaviour)?
> 
> The effect is that read/write callbacks now get an ARMCPRegInfo*
> that has the opc &c for the alias, rather than for the thing being
> aliased. That's good if the read/write callbacks have a need to
> distinguish the alias access from a normal one (do they anywhere?).
> On the other hand it's bad if we have existing code that thinks it
> can distinguish FOO_EL1 from FOO_EL2 by looking at the opc &c
> values and now might get confused.
> 
> Overall, unless we have a definite reason why we want the
> callback functions to be able to tell the alias from the normal
> access, I'm inclined to say we should just comment that we
> deliberately leave the sysreg fields alone. (Put another way,
> I don't really want to have to work through all the aliased
> registers here checking whether they have read/write functions
> that look at the opc fields and whether any of them would
> end up doing the wrong thing when handed the alias reginfo.)

I don't recall what I was thinking here, it's definitely wrong.

> The "remove the if()" part is fine if you wanted to do that
> as its own patch.

Yeah, I'll do that, since it's always true.


r~


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-23  6:59             ` Re: Paul E. McKenney
@ 2023-05-25  0:13               ` Masami Hiramatsu
  0 siblings, 0 replies; 1596+ messages in thread
From: Masami Hiramatsu @ 2023-05-25  0:13 UTC (permalink / raw)
  To: paulmck
  Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
	KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
	Ze Gao

On Mon, 22 May 2023 23:59:28 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> > On Mon, 22 May 2023 10:07:42 +0800
> > Ze Gao <zegao2021@gmail.com> wrote:
> > 
> > > Oops, I missed that. Thanks for pointing that out, which I thought is
> > > conditional use of rcu_is_watching before.
> > > 
> > > One last point, I think we should double check on this
> > >      "fentry does not filter with !rcu_is_watching"
> > > as quoted from Yonghong and argue whether it needs
> > > the same check for fentry as well.
> > 
> > rcu_is_watching() comment says;
> > 
> >  * if the current CPU is not in its idle loop or is in an interrupt or
> >  * NMI handler, return true.
> > 
> > Thus it returns *fault* if the current CPU is in the idle loop and not
> > any interrupt(including NMI) context. This means if any tracable function
> > is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> > 'context' based check, thus fentry can not filter out that some commonly
> > used functions is called from that context but it can be detected.
> 
> It really does return false (rather than faulting?) if the current CPU
> is deep within the idle loop.
> 
> In addition, the recent x86/entry rework (thank you Peter and
> Thomas!) mean that the "idle loop" is quite restricted, as can be
> seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
> For example, in default_idle_call(), these are immediately before and
> after the call to arch_cpu_idle().

Thanks! I also found that the default_idle_call() is enough small and
it seems not happening on fentry because there are no commonly used
functions on that path.

> 
> Would the following help?  Or am I missing your point?

Yes, thank you for the update!

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1449cb69a0e0..fae9b4e29c93 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
>  /**
>   * rcu_is_watching - see if RCU thinks that the current CPU is not idle
>   *
> - * Return true if RCU is watching the running CPU, which means that this
> - * CPU can safely enter RCU read-side critical sections.  In other words,
> - * if the current CPU is not in its idle loop or is in an interrupt or
> - * NMI handler, return true.
> + * Return @true if RCU is watching the running CPU and @false otherwise.
> + * An @true return means that this CPU can safely enter RCU read-side
> + * critical sections.
> + *
> + * More specifically, if the current CPU is not deep within its idle
> + * loop, return @true.  Note that rcu_is_watching() will return @true if
> + * invoked from an interrupt or NMI handler, even if that interrupt or
> + * NMI interrupted the CPU while it was deep within its idle loop.
>   *
>   * Make notrace because it can be called by the internal functions of
>   * ftrace, and making this notrace removes unnecessary recursion calls.


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-23  5:30           ` Re: Masami Hiramatsu
@ 2023-05-23  6:59             ` Paul E. McKenney
  2023-05-25  0:13               ` Re: Masami Hiramatsu
  0 siblings, 1 reply; 1596+ messages in thread
From: Paul E. McKenney @ 2023-05-23  6:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
	KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
	Ze Gao

On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> On Mon, 22 May 2023 10:07:42 +0800
> Ze Gao <zegao2021@gmail.com> wrote:
> 
> > Oops, I missed that. Thanks for pointing that out, which I thought is
> > conditional use of rcu_is_watching before.
> > 
> > One last point, I think we should double check on this
> >      "fentry does not filter with !rcu_is_watching"
> > as quoted from Yonghong and argue whether it needs
> > the same check for fentry as well.
> 
> rcu_is_watching() comment says;
> 
>  * if the current CPU is not in its idle loop or is in an interrupt or
>  * NMI handler, return true.
> 
> Thus it returns *fault* if the current CPU is in the idle loop and not
> any interrupt(including NMI) context. This means if any tracable function
> is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> 'context' based check, thus fentry can not filter out that some commonly
> used functions is called from that context but it can be detected.

It really does return false (rather than faulting?) if the current CPU
is deep within the idle loop.

In addition, the recent x86/entry rework (thank you Peter and
Thomas!) mean that the "idle loop" is quite restricted, as can be
seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
For example, in default_idle_call(), these are immediately before and
after the call to arch_cpu_idle().

Would the following help?  Or am I missing your point?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1449cb69a0e0..fae9b4e29c93 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
 /**
  * rcu_is_watching - see if RCU thinks that the current CPU is not idle
  *
- * Return true if RCU is watching the running CPU, which means that this
- * CPU can safely enter RCU read-side critical sections.  In other words,
- * if the current CPU is not in its idle loop or is in an interrupt or
- * NMI handler, return true.
+ * Return @true if RCU is watching the running CPU and @false otherwise.
+ * An @true return means that this CPU can safely enter RCU read-side
+ * critical sections.
+ *
+ * More specifically, if the current CPU is not deep within its idle
+ * loop, return @true.  Note that rcu_is_watching() will return @true if
+ * invoked from an interrupt or NMI handler, even if that interrupt or
+ * NMI interrupted the CPU while it was deep within its idle loop.
  *
  * Make notrace because it can be called by the internal functions of
  * ftrace, and making this notrace removes unnecessary recursion calls.

^ permalink raw reply related	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-22  2:07         ` Re: Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
@ 2023-05-23  5:30           ` Masami Hiramatsu
  2023-05-23  6:59             ` Re: Paul E. McKenney
  1 sibling, 1 reply; 1596+ messages in thread
From: Masami Hiramatsu @ 2023-05-23  5:30 UTC (permalink / raw)
  To: Ze Gao
  Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Mon, 22 May 2023 10:07:42 +0800
Ze Gao <zegao2021@gmail.com> wrote:

> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>      "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.

rcu_is_watching() comment says;

 * if the current CPU is not in its idle loop or is in an interrupt or
 * NMI handler, return true.

Thus it returns *fault* if the current CPU is in the idle loop and not
any interrupt(including NMI) context. This means if any tracable function
is called from idle loop, it can be !rcu_is_watching(). I meant, this is
'context' based check, thus fentry can not filter out that some commonly
used functions is called from that context but it can be detected.

Thank you,

> 
> Regards,
> Ze


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-22  2:07         ` Re: Ze Gao
@ 2023-05-23  4:38           ` Yonghong Song
  2023-05-23  5:30           ` Re: Masami Hiramatsu
  1 sibling, 0 replies; 1596+ messages in thread
From: Yonghong Song @ 2023-05-23  4:38 UTC (permalink / raw)
  To: Ze Gao, Jiri Olsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao



On 5/21/23 7:07 PM, Ze Gao wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>       "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.

I would suggest that we address rcu_is_watching issue for fentry
only if we do have a reproducible case to show something goes wrong...

> 
> Regards,
> Ze

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-21 20:26       ` Re: Jiri Olsa
  2023-05-22  1:36         ` Re: Masami Hiramatsu
@ 2023-05-22  2:07         ` Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
  2023-05-23  5:30           ` Re: Masami Hiramatsu
  1 sibling, 2 replies; 1596+ messages in thread
From: Ze Gao @ 2023-05-22  2:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

Oops, I missed that. Thanks for pointing that out, which I thought is
conditional use of rcu_is_watching before.

One last point, I think we should double check on this
     "fentry does not filter with !rcu_is_watching"
as quoted from Yonghong and argue whether it needs
the same check for fentry as well.

Regards,
Ze

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-21 20:26       ` Re: Jiri Olsa
@ 2023-05-22  1:36         ` Masami Hiramatsu
  2023-05-22  2:07         ` Re: Ze Gao
  1 sibling, 0 replies; 1596+ messages in thread
From: Masami Hiramatsu @ 2023-05-22  1:36 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ze Gao, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, 21 May 2023 22:26:37 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > > kprobe_multi/fprobe share the same set of attachments with fentry.
> > > Currently, fentry does not filter with !rcu_is_watching, maybe
> > > because this is an extreme corner case. Not sure whether it is
> > > worthwhile or not.
> > 
> > Agreed, it's rare, especially after Peter's patches which push narrow
> > down rcu eqs regions
> > in the idle path and reduce the chance of any traceable functions
> > happening in between.
> > 
> > However, from RCU's perspective, we ought to check if rcu_is_watching
> > theoretically
> > when there's a chance our code will run in the idle path and also we
> > need rcu to be alive,
> > And also we cannot simply make assumptions for any future changes in
> > the idle path.
> > You know, just like what was hit in the thread.
> > 
> > > Maybe if you can give a concrete example (e.g., attachment point)
> > > with current code base to show what the issue you encountered and
> > > it will make it easier to judge whether adding !rcu_is_watching()
> > > is necessary or not.
> > 
> > I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> > traceable but not on the latest version
> > so far. But as I state above, in theory we need it. So here is a
> > gentle ping :) .
> 
> hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
> which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]
> 
> I might be missing something, but it seems like we don't need another
> rcu_is_watching call on kprobe_multi level

Good point! OK, then it seems we don't need it. The rethook continues to
use the rcu_is_watching() because it is also used from kprobes, but the
kprobe_multi doesn't need it.

Thank you,

> 
> jirka
> 
> 
> [1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
> [2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-21 15:10     ` Re: Ze Gao
@ 2023-05-21 20:26       ` Jiri Olsa
  2023-05-22  1:36         ` Re: Masami Hiramatsu
  2023-05-22  2:07         ` Re: Ze Gao
  0 siblings, 2 replies; 1596+ messages in thread
From: Jiri Olsa @ 2023-05-21 20:26 UTC (permalink / raw)
  To: Ze Gao
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > kprobe_multi/fprobe share the same set of attachments with fentry.
> > Currently, fentry does not filter with !rcu_is_watching, maybe
> > because this is an extreme corner case. Not sure whether it is
> > worthwhile or not.
> 
> Agreed, it's rare, especially after Peter's patches which push narrow
> down rcu eqs regions
> in the idle path and reduce the chance of any traceable functions
> happening in between.
> 
> However, from RCU's perspective, we ought to check if rcu_is_watching
> theoretically
> when there's a chance our code will run in the idle path and also we
> need rcu to be alive,
> And also we cannot simply make assumptions for any future changes in
> the idle path.
> You know, just like what was hit in the thread.
> 
> > Maybe if you can give a concrete example (e.g., attachment point)
> > with current code base to show what the issue you encountered and
> > it will make it easier to judge whether adding !rcu_is_watching()
> > is necessary or not.
> 
> I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> traceable but not on the latest version
> so far. But as I state above, in theory we need it. So here is a
> gentle ping :) .

hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]

I might be missing something, but it seems like we don't need another
rcu_is_watching call on kprobe_multi level

jirka


[1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
[2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-21  3:58   ` Yonghong Song
@ 2023-05-21 15:10     ` Ze Gao
  2023-05-21 20:26       ` Re: Jiri Olsa
  0 siblings, 1 reply; 1596+ messages in thread
From: Ze Gao @ 2023-05-21 15:10 UTC (permalink / raw)
  To: Yonghong Song
  Cc: jolsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
	Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

> kprobe_multi/fprobe share the same set of attachments with fentry.
> Currently, fentry does not filter with !rcu_is_watching, maybe
> because this is an extreme corner case. Not sure whether it is
> worthwhile or not.

Agreed, it's rare, especially after Peter's patches which push narrow
down rcu eqs regions
in the idle path and reduce the chance of any traceable functions
happening in between.

However, from RCU's perspective, we ought to check if rcu_is_watching
theoretically
when there's a chance our code will run in the idle path and also we
need rcu to be alive,
And also we cannot simply make assumptions for any future changes in
the idle path.
You know, just like what was hit in the thread.

> Maybe if you can give a concrete example (e.g., attachment point)
> with current code base to show what the issue you encountered and
> it will make it easier to judge whether adding !rcu_is_watching()
> is necessary or not.

I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
traceable but not on the latest version
so far. But as I state above, in theory we need it. So here is a
gentle ping :) .

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-21 10:09     ` Re: Masami Hiramatsu
@ 2023-05-21 14:19       ` Ze Gao
  0 siblings, 0 replies; 1596+ messages in thread
From: Ze Gao @ 2023-05-21 14:19 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau, Song Liu,
	Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao

On Sun, May 21, 2023 at 6:09 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 21 May 2023 10:08:46 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > >
> > > Hi Jiri,
> > >
> > > Would you like to consider to add rcu_is_watching check in
> > > to solve this from the viewpoint of kprobe_multi_link_prog_run
> >
> > I think this was discussed in here:
> >   https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> >
> > and was considered a bug, there's fix mentioned later in the thread
> >
> > there's also this recent patchset:
> >   https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> >
> > that solves related problems
>
> I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
> check is required if the kprobe_multi_link_prog_run() uses any RCU API.
> E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
> call_rcu().

Yes, that's my point!

Regards,
Ze

>
> >
> > > itself? And accounting of missed runs can be added as well
> > > to imporve observability.
> >
> > right, we count fprobe->nmissed but it's not exposed, we should allow
> > to get 'missed' stats from both fprobe and kprobe_multi later, which
> > is missing now, will check
> >
> > thanks,
> > jirka
> >
> > >
> > > Regards,
> > > Ze
> > >
> > >
> > > -----------------
> > > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > > From: Ze Gao <zegao@tencent.com>
> > > Date: Sat, 20 May 2023 17:32:05 +0800
> > > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > >
> > > From the perspective of kprobe_multi_link_prog_run, any traceable
> > > functions can be attached while bpf progs need specical care and
> > > ought to be under rcu protection. To solve the likely rcu lockdep
> > > warns once for good, when (future) functions in idle path were
> > > attached accidentally, we better paying some cost to check at least
> > > in kernel-side, and return when rcu is not watching, which helps
> > > to avoid any unpredictable results.
> > >
> > > Signed-off-by: Ze Gao <zegao@tencent.com>
> > > ---
> > >  kernel/trace/bpf_trace.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index 9a050e36dc6c..3e6ea7274765 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> > > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> > >     struct bpf_run_ctx *old_run_ctx;
> > >     int err;
> > >
> > > -   if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > > +   if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> > >             err = 0;
> > >             goto out;
> > >     }
> > > --
> > > 2.40.1
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-21  8:08   ` Re: Jiri Olsa
@ 2023-05-21 10:09     ` Masami Hiramatsu
  2023-05-21 14:19       ` Re: Ze Gao
  0 siblings, 1 reply; 1596+ messages in thread
From: Masami Hiramatsu @ 2023-05-21 10:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ze Gao, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
	Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

On Sun, 21 May 2023 10:08:46 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > 
> > Hi Jiri,
> > 
> > Would you like to consider to add rcu_is_watching check in
> > to solve this from the viewpoint of kprobe_multi_link_prog_run
> 
> I think this was discussed in here:
>   https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> 
> and was considered a bug, there's fix mentioned later in the thread
> 
> there's also this recent patchset:
>   https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> 
> that solves related problems

I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
check is required if the kprobe_multi_link_prog_run() uses any RCU API.
E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
call_rcu().

Thank you,

> 
> > itself? And accounting of missed runs can be added as well
> > to imporve observability.
> 
> right, we count fprobe->nmissed but it's not exposed, we should allow
> to get 'missed' stats from both fprobe and kprobe_multi later, which
> is missing now, will check
> 
> thanks,
> jirka
> 
> > 
> > Regards,
> > Ze
> > 
> > 
> > -----------------
> > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > From: Ze Gao <zegao@tencent.com>
> > Date: Sat, 20 May 2023 17:32:05 +0800
> > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > 
> > From the perspective of kprobe_multi_link_prog_run, any traceable
> > functions can be attached while bpf progs need specical care and
> > ought to be under rcu protection. To solve the likely rcu lockdep
> > warns once for good, when (future) functions in idle path were
> > attached accidentally, we better paying some cost to check at least
> > in kernel-side, and return when rcu is not watching, which helps
> > to avoid any unpredictable results.
> > 
> > Signed-off-by: Ze Gao <zegao@tencent.com>
> > ---
> >  kernel/trace/bpf_trace.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 9a050e36dc6c..3e6ea7274765 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> >  	struct bpf_run_ctx *old_run_ctx;
> >  	int err;
> >  
> > -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> >  		err = 0;
> >  		goto out;
> >  	}
> > -- 
> > 2.40.1
> > 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-20  9:47 ` Ze Gao
  2023-05-21  3:58   ` Yonghong Song
@ 2023-05-21  8:08   ` Jiri Olsa
  2023-05-21 10:09     ` Re: Masami Hiramatsu
  1 sibling, 1 reply; 1596+ messages in thread
From: Jiri Olsa @ 2023-05-21  8:08 UTC (permalink / raw)
  To: Ze Gao
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao

On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> 
> Hi Jiri,
> 
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run

I think this was discussed in here:
  https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/

and was considered a bug, there's fix mentioned later in the thread

there's also this recent patchset:
  https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/

that solves related problems

> itself? And accounting of missed runs can be added as well
> to imporve observability.

right, we count fprobe->nmissed but it's not exposed, we should allow
to get 'missed' stats from both fprobe and kprobe_multi later, which
is missing now, will check

thanks,
jirka

> 
> Regards,
> Ze
> 
> 
> -----------------
> From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> 
> From the perspective of kprobe_multi_link_prog_run, any traceable
> functions can be attached while bpf progs need specical care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warns once for good, when (future) functions in idle path were
> attached accidentally, we better paying some cost to check at least
> in kernel-side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.
> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>  kernel/trace/bpf_trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
>  	struct bpf_run_ctx *old_run_ctx;
>  	int err;
>  
> -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
>  		err = 0;
>  		goto out;
>  	}
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-20  9:47 ` Ze Gao
@ 2023-05-21  3:58   ` Yonghong Song
  2023-05-21 15:10     ` Re: Ze Gao
  2023-05-21  8:08   ` Re: Jiri Olsa
  1 sibling, 1 reply; 1596+ messages in thread
From: Yonghong Song @ 2023-05-21  3:58 UTC (permalink / raw)
  To: Ze Gao, jolsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao



On 5/20/23 2:47 AM, Ze Gao wrote:
> 
> Hi Jiri,
> 
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run
> itself? And accounting of missed runs can be added as well
> to imporve observability.
> 
> Regards,
> Ze
> 
> 
> -----------------
>  From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> 
>  From the perspective of kprobe_multi_link_prog_run, any traceable
> functions can be attached while bpf progs need specical care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warns once for good, when (future) functions in idle path were
> attached accidentally, we better paying some cost to check at least
> in kernel-side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.

kprobe_multi/fprobe share the same set of attachments with fentry.
Currently, fentry does not filter with !rcu_is_watching, maybe
because this is an extreme corner case. Not sure whether it is
worthwhile or not.

Maybe if you can give a concrete example (e.g., attachment point)
with current code base to show what the issue you encountered and
it will make it easier to judge whether adding !rcu_is_watching()
is necessary or not.

> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>   kernel/trace/bpf_trace.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
>   	struct bpf_run_ctx *old_run_ctx;
>   	int err;
>   
> -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
>   		err = 0;
>   		goto out;
>   	}

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-05-11 12:58 Ryan Roberts
@ 2023-05-11 13:13 ` Ryan Roberts
  0 siblings, 0 replies; 1596+ messages in thread
From: Ryan Roberts @ 2023-05-11 13:13 UTC (permalink / raw)
  To: Andrew Morton, Matthew Wilcox (Oracle),
	Kirill A. Shutemov, SeongJae Park
  Cc: linux-kernel, linux-mm, damon

My appologies for the noise: A blank line between Cc and Subject has broken the
subject and grouping in lore.

Please Ignore this, I will resend.


On 11/05/2023 13:58, Ryan Roberts wrote:
> Date: Thu, 11 May 2023 11:38:28 +0100
> Subject: [PATCH v1 0/5] Encapsulate PTE contents from non-arch code
> 
> Hi All,
> 
> This series improves the encapsulation of pte entries by disallowing non-arch
> code from directly dereferencing pte_t pointers. Instead code must use a new
> helper, `pte_t ptep_deref(pte_t *ptep)`. By default, this helper does a direct
> dereference of the pointer, so generated code should be exactly the same. But
> it's presence sets us up for arch code being able to override the default to
> "virtualize" the ptes without needing to maintain a shadow table.
> 
> I intend to take advantage of this for arm64 to enable use of its "contiguous
> bit" to coalesce multiple ptes into a single tlb entry, reducing pressure and
> improving performance. I have an RFC for the first part of this work at [1]. The
> cover letter there also explains the second part, which this series is enabling.
> 
> I intend to post an RFC for the contpte changes in due course, but it would be
> good to get the ball rolling on this enabler.
> 
> There are 2 reasons that I need the encapsulation:
> 
>   - Prevent leaking the arch-private PTE_CONT bit to the core code. If the core
>     code reads a pte that contains this bit, it could end up calling
>     set_pte_at() with the bit set which would confuse the implementation. So we
>     can always clear PTE_CONT in ptep_deref() (and ptep_get()) to avoid a leaky
>     abstraction.
>   - Contiguous ptes have a single access and dirty bit for the contiguous range.
>     So we need to "mix-in" those bits when the core is dereferencing a pte that
>     lies in the contig range. There is code that dereferences the pte then takes
>     different actions based on access/dirty (see e.g. write_protect_page()).
> 
> While ptep_get() and ptep_get_lockless() already exist, both of them are
> implemented using READ_ONCE() by default. While we could use ptep_get() instead
> of the new ptep_deref(), I didn't want to risk performance regression.
> Alternatively, all call sites that currently use ptep_get() that need the
> lockless behaviour could be upgraded to ptep_get_lockless() and ptep_get() could
> be downgraded to a simple dereference. That would be cleanest, but is a much
> bigger (and likely error prone) change because all the arch code would need to
> be updated for the new definitions of ptep_get().
> 
> The series is split up as follows:
> 
> patchs 1-2: Fix bugs where code was _setting_ ptes directly, rather than using
>             set_pte_at() and friends.
> patch 3:    Fix highmem unmapping issue I spotted while doing the work.
> patch 4:    Introduce the new ptep_deref() helper with default implementation.
> patch 5:    Convert all direct dereferences to use ptep_deref().
> 
> [1] https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
> 
> Thanks,
> Ryan
> 
> 
> Ryan Roberts (5):
>   mm: vmalloc must set pte via arch code
>   mm: damon must atomically clear young on ptes and pmds
>   mm: Fix failure to unmap pte on highmem systems
>   mm: Add new ptep_deref() helper to fully encapsulate pte_t
>   mm: ptep_deref() conversion
> 
>  .../drm/i915/gem/selftests/i915_gem_mman.c    |   8 +-
>  drivers/misc/sgi-gru/grufault.c               |   2 +-
>  drivers/vfio/vfio_iommu_type1.c               |   7 +-
>  drivers/xen/privcmd.c                         |   2 +-
>  fs/proc/task_mmu.c                            |  33 +++---
>  fs/userfaultfd.c                              |   6 +-
>  include/linux/hugetlb.h                       |   2 +-
>  include/linux/mm_inline.h                     |   2 +-
>  include/linux/pgtable.h                       |  13 ++-
>  kernel/events/uprobes.c                       |   2 +-
>  mm/damon/ops-common.c                         |  18 ++-
>  mm/damon/ops-common.h                         |   4 +-
>  mm/damon/paddr.c                              |   6 +-
>  mm/damon/vaddr.c                              |  14 ++-
>  mm/filemap.c                                  |   2 +-
>  mm/gup.c                                      |  21 ++--
>  mm/highmem.c                                  |  12 +-
>  mm/hmm.c                                      |   2 +-
>  mm/huge_memory.c                              |   4 +-
>  mm/hugetlb.c                                  |   2 +-
>  mm/hugetlb_vmemmap.c                          |   6 +-
>  mm/kasan/init.c                               |   9 +-
>  mm/kasan/shadow.c                             |  10 +-
>  mm/khugepaged.c                               |  24 ++--
>  mm/ksm.c                                      |  22 ++--
>  mm/madvise.c                                  |   6 +-
>  mm/mapping_dirty_helpers.c                    |   4 +-
>  mm/memcontrol.c                               |   4 +-
>  mm/memory-failure.c                           |   6 +-
>  mm/memory.c                                   | 103 +++++++++---------
>  mm/mempolicy.c                                |   6 +-
>  mm/migrate.c                                  |  14 ++-
>  mm/migrate_device.c                           |  14 ++-
>  mm/mincore.c                                  |   2 +-
>  mm/mlock.c                                    |   6 +-
>  mm/mprotect.c                                 |   8 +-
>  mm/mremap.c                                   |   2 +-
>  mm/page_table_check.c                         |   4 +-
>  mm/page_vma_mapped.c                          |  26 +++--
>  mm/pgtable-generic.c                          |   2 +-
>  mm/rmap.c                                     |  32 +++---
>  mm/sparse-vmemmap.c                           |   8 +-
>  mm/swap_state.c                               |   4 +-
>  mm/swapfile.c                                 |  16 +--
>  mm/userfaultfd.c                              |   4 +-
>  mm/vmalloc.c                                  |  11 +-
>  mm/vmscan.c                                   |  14 ++-
>  virt/kvm/kvm_main.c                           |   9 +-
>  48 files changed, 302 insertions(+), 236 deletions(-)
> 
> --
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-03-27 13:54 ` Yaroslav Furman
@ 2023-03-27 14:19   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 1596+ messages in thread
From: Greg Kroah-Hartman @ 2023-03-27 14:19 UTC (permalink / raw)
  To: Yaroslav Furman; +Cc: Alan Stern, linux-usb, usb-storage, linux-kernel

On Mon, Mar 27, 2023 at 04:54:22PM +0300, Yaroslav Furman wrote:
> 
> Will this patch get ported to LTS trees? It applies cleanly.
> Would love to see it in 6.1 and 5.15 trees.

What patch?

confused,

greg k-h

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2023-02-28  6:32 Mahmut Akten
  0 siblings, 0 replies; 1596+ messages in thread
From: Mahmut Akten @ 2023-02-28  6:32 UTC (permalink / raw)
  To: stable

Hello

I need your urgent response to a transaction request attached to your name/email stable@vger.kernel.org I would like to discuss with you now. 

Thank You
Mahmut Akten
Vice Chairman
Garanti BBVA Bank (Turkey)
www.garantibbva.com.tr

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-01-27 21:36       ` Re: Alison Schofield
@ 2023-01-27 22:04         ` Dan Williams
  0 siblings, 0 replies; 1596+ messages in thread
From: Dan Williams @ 2023-01-27 22:04 UTC (permalink / raw)
  To: Alison Schofield, Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky,
	Steven Rostedt, linux-cxl, linux-kernel

Alison Schofield wrote:
> On Fri, Jan 27, 2023 at 11:16:49AM -0800, Dan Williams wrote:
> > Alison Schofield wrote:
> > > On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > > > alison.schofield@ wrote:
> > > > > From: Alison Schofield <alison.schofield@intel.com>
> > > > > 
> > > > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > > > 
> > > > > Changes in v5:
> > > > > - Rebase on cxl/next 
> > > > > - Use struct_size() to calc mbox cmd payload .min_out
> > > > > - s/INTERNAL/INJECTED mocked poison record source
> > > > > - Added Jonathan Reviewed-by tag on Patch 3
> > > > > 
> > > > > Link to v4:
> > > > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > > > 
> > > > > Add support for retrieving device poison lists and store the returned
> > > > > error records as kernel trace events.
> > > > > 
> > > > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > > > Section 8.2.9.8.4.1. [1] 
> > > > > 
> > > > > Example, triggered by memdev:
> > > > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > 
> > > > I think the pcidev= field wants to be called something like "host" or
> > > > "parent", because there is no strict requirement that a 'struct
> > > > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > > > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > > > think all CXL device events should be emitting the PCIe serial number
> > > > for the memdev.
> > > ]
> > > 
> > > Will do, 'host' and add PCIe serial no.
> > > 
> > > > 
> > > > I will look in the implementation, but do region= and region_uuid= get
> > > > populated when mem3 is a member of the region?
> > > 
> > > Not always.
> > > In the case above, where the trigger was by memdev, no.
> > > Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> > > populated if the poison was triggered by region, like the case below.
> > > 
> > > It could be looked up for the by memdev cases. Is that wanted?
> > 
> > Just trying to understand the semantics. However, I do think it makes sense
> > for a memdev trigger to lookup information on all impacted regions
> > across all of the device's DPA and the region trigger makes sense to
> > lookup all memdevs, but bounded by the DPA that contributes to that
> > region. I just want to avoid someone having to trigger the region to get
> > extra information that was readily available from a memdev listing.
> > 
> 
> Dan - 
> 
> Confirming my take-away from this email, and our chat:
> 
> Remove the by-region trigger_poison_list option entirely. User space
> needs to trigger by-memdev the memdevs participating in the region and
> filter those events by region.
> 
> Add the region info (region name, uuid) to the TRACE_EVENTs when the
> poisoned DPA is part of any region.

That's what I was thinking, yes. So the internals of
cxl_mem_get_poison() will take the cxl_region_rwsem for read and compare
the device's endpoint decoder settings against the media error records
to do the region (and later HPA) lookup.

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-01-27 19:16     ` Re: Dan Williams
@ 2023-01-27 21:36       ` Alison Schofield
  2023-01-27 22:04         ` Re: Dan Williams
  0 siblings, 1 reply; 1596+ messages in thread
From: Alison Schofield @ 2023-01-27 21:36 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky,
	Steven Rostedt, linux-cxl, linux-kernel

On Fri, Jan 27, 2023 at 11:16:49AM -0800, Dan Williams wrote:
> Alison Schofield wrote:
> > On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > > alison.schofield@ wrote:
> > > > From: Alison Schofield <alison.schofield@intel.com>
> > > > 
> > > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > > 
> > > > Changes in v5:
> > > > - Rebase on cxl/next 
> > > > - Use struct_size() to calc mbox cmd payload .min_out
> > > > - s/INTERNAL/INJECTED mocked poison record source
> > > > - Added Jonathan Reviewed-by tag on Patch 3
> > > > 
> > > > Link to v4:
> > > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > > 
> > > > Add support for retrieving device poison lists and store the returned
> > > > error records as kernel trace events.
> > > > 
> > > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > > Section 8.2.9.8.4.1. [1] 
> > > > 
> > > > Example, triggered by memdev:
> > > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > 
> > > I think the pcidev= field wants to be called something like "host" or
> > > "parent", because there is no strict requirement that a 'struct
> > > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > > think all CXL device events should be emitting the PCIe serial number
> > > for the memdev.
> > ]
> > 
> > Will do, 'host' and add PCIe serial no.
> > 
> > > 
> > > I will look in the implementation, but do region= and region_uuid= get
> > > populated when mem3 is a member of the region?
> > 
> > Not always.
> > In the case above, where the trigger was by memdev, no.
> > Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> > populated if the poison was triggered by region, like the case below.
> > 
> > It could be looked up for the by memdev cases. Is that wanted?
> 
> Just trying to understand the semantics. However, I do think it makes sense
> for a memdev trigger to lookup information on all impacted regions
> across all of the device's DPA and the region trigger makes sense to
> lookup all memdevs, but bounded by the DPA that contributes to that
> region. I just want to avoid someone having to trigger the region to get
> extra information that was readily available from a memdev listing.
> 

Dan - 

Confirming my take-away from this email, and our chat:

Remove the by-region trigger_poison_list option entirely. User space
needs to trigger by-memdev the memdevs participating in the region and
filter those events by region.

Add the region info (region name, uuid) to the TRACE_EVENTs when the
poisoned DPA is part of any region.

Alison

> > 
> > Thanks for the reviews Dan!
> > > 
> > > > 
> > > > Example, triggered by region:
> > > > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > > > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > 
> > > > [1]: https://www.computeexpresslink.org/download-the-specification
> > > > 
> > > > Alison Schofield (5):
> > > >   cxl/mbox: Add GET_POISON_LIST mailbox command
> > > >   cxl/trace: Add TRACE support for CXL media-error records
> > > >   cxl/memdev: Add trigger_poison_list sysfs attribute
> > > >   cxl/region: Add trigger_poison_list sysfs attribute
> > > >   tools/testing/cxl: Mock support for Get Poison List
> > > > 
> > > >  Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > > >  drivers/cxl/core/mbox.c                 | 78 +++++++++++++++++++++++
> > > >  drivers/cxl/core/memdev.c               | 45 ++++++++++++++
> > > >  drivers/cxl/core/region.c               | 33 ++++++++++
> > > >  drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
> > > >  drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
> > > >  drivers/cxl/pci.c                       |  4 ++
> > > >  tools/testing/cxl/test/mem.c            | 42 +++++++++++++
> > > >  8 files changed, 381 insertions(+), 1 deletion(-)
> > > > 
> > > > 
> > > > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > > > -- 
> > > > 2.37.3
> > > > 
> > > 
> > > 
> 
> 

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-01-27 16:10   ` Alison Schofield
@ 2023-01-27 19:16     ` Dan Williams
  2023-01-27 21:36       ` Re: Alison Schofield
  0 siblings, 1 reply; 1596+ messages in thread
From: Dan Williams @ 2023-01-27 19:16 UTC (permalink / raw)
  To: Alison Schofield, Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky,
	Steven Rostedt, linux-cxl, linux-kernel

Alison Schofield wrote:
> On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > alison.schofield@ wrote:
> > > From: Alison Schofield <alison.schofield@intel.com>
> > > 
> > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > 
> > > Changes in v5:
> > > - Rebase on cxl/next 
> > > - Use struct_size() to calc mbox cmd payload .min_out
> > > - s/INTERNAL/INJECTED mocked poison record source
> > > - Added Jonathan Reviewed-by tag on Patch 3
> > > 
> > > Link to v4:
> > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > 
> > > Add support for retrieving device poison lists and store the returned
> > > error records as kernel trace events.
> > > 
> > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > Section 8.2.9.8.4.1. [1] 
> > > 
> > > Example, triggered by memdev:
> > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > 
> > I think the pcidev= field wants to be called something like "host" or
> > "parent", because there is no strict requirement that a 'struct
> > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > think all CXL device events should be emitting the PCIe serial number
> > for the memdev.
> ]
> 
> Will do, 'host' and add PCIe serial no.
> 
> > 
> > I will look in the implementation, but do region= and region_uuid= get
> > populated when mem3 is a member of the region?
> 
> Not always.
> In the case above, where the trigger was by memdev, no.
> Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> populated if the poison was triggered by region, like the case below.
> 
> It could be looked up for the by memdev cases. Is that wanted?

Just trying to understand the semantics. However, I do think it makes sense
for a memdev trigger to lookup information on all impacted regions
across all of the device's DPA and the region trigger makes sense to
lookup all memdevs, but bounded by the DPA that contributes to that
region. I just want to avoid someone having to trigger the region to get
extra information that was readily available from a memdev listing.

> 
> Thanks for the reviews Dan!
> > 
> > > 
> > > Example, triggered by region:
> > > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > 
> > > [1]: https://www.computeexpresslink.org/download-the-specification
> > > 
> > > Alison Schofield (5):
> > >   cxl/mbox: Add GET_POISON_LIST mailbox command
> > >   cxl/trace: Add TRACE support for CXL media-error records
> > >   cxl/memdev: Add trigger_poison_list sysfs attribute
> > >   cxl/region: Add trigger_poison_list sysfs attribute
> > >   tools/testing/cxl: Mock support for Get Poison List
> > > 
> > >  Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > >  drivers/cxl/core/mbox.c                 | 78 +++++++++++++++++++++++
> > >  drivers/cxl/core/memdev.c               | 45 ++++++++++++++
> > >  drivers/cxl/core/region.c               | 33 ++++++++++
> > >  drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
> > >  drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
> > >  drivers/cxl/pci.c                       |  4 ++
> > >  tools/testing/cxl/test/mem.c            | 42 +++++++++++++
> > >  8 files changed, 381 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > > -- 
> > > 2.37.3
> > > 
> > 
> > 



^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-01-27  1:59 ` Dan Williams
@ 2023-01-27 16:10   ` Alison Schofield
  2023-01-27 19:16     ` Re: Dan Williams
  0 siblings, 1 reply; 1596+ messages in thread
From: Alison Schofield @ 2023-01-27 16:10 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky,
	Steven Rostedt, linux-cxl, linux-kernel

On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > 
> > Changes in v5:
> > - Rebase on cxl/next 
> > - Use struct_size() to calc mbox cmd payload .min_out
> > - s/INTERNAL/INJECTED mocked poison record source
> > - Added Jonathan Reviewed-by tag on Patch 3
> > 
> > Link to v4:
> > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > 
> > Add support for retrieving device poison lists and store the returned
> > error records as kernel trace events.
> > 
> > The handling of the poison list is guided by the CXL 3.0 Specification
> > Section 8.2.9.8.4.1. [1] 
> > 
> > Example, triggered by memdev:
> > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> 
> I think the pcidev= field wants to be called something like "host" or
> "parent", because there is no strict requirement that a 'struct
> cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> think all CXL device events should be emitting the PCIe serial number
> for the memdev.
]

Will do, 'host' and add PCIe serial no.

> 
> I will look in the implementation, but do region= and region_uuid= get
> populated when mem3 is a member of the region?

Not always.
In the case above, where the trigger was by memdev, no.
Region= and region_uuid= (and in the follow-on patch, hpa=) only get
populated if the poison was triggered by region, like the case below.

It could be looked up for the by memdev cases. Is that wanted?

Thanks for the reviews Dan!
> 
> > 
> > Example, triggered by region:
> > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > 
> > [1]: https://www.computeexpresslink.org/download-the-specification
> > 
> > Alison Schofield (5):
> >   cxl/mbox: Add GET_POISON_LIST mailbox command
> >   cxl/trace: Add TRACE support for CXL media-error records
> >   cxl/memdev: Add trigger_poison_list sysfs attribute
> >   cxl/region: Add trigger_poison_list sysfs attribute
> >   tools/testing/cxl: Mock support for Get Poison List
> > 
> >  Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> >  drivers/cxl/core/mbox.c                 | 78 +++++++++++++++++++++++
> >  drivers/cxl/core/memdev.c               | 45 ++++++++++++++
> >  drivers/cxl/core/region.c               | 33 ++++++++++
> >  drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
> >  drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
> >  drivers/cxl/pci.c                       |  4 ++
> >  tools/testing/cxl/test/mem.c            | 42 +++++++++++++
> >  8 files changed, 381 insertions(+), 1 deletion(-)
> > 
> > 
> > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > -- 
> > 2.37.3
> > 
> 
> 

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2023-01-22 21:42 ` Re: Alejandro Colomar
@ 2023-01-24 20:01   ` Helge Kreutzmann
  0 siblings, 0 replies; 1596+ messages in thread
From: Helge Kreutzmann @ 2023-01-24 20:01 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: mario.blaettermann, linux-man

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

Helo Alex,
On Sun, Jan 22, 2023 at 10:42:54PM +0100, Alejandro Colomar wrote:
> Hi Helge,
> 
> On 1/22/23 20:31, Helge Kreutzmann wrote:
> > Without further ado, the following was found:
> 
> Empty report.  An accident? :)

I tried to figure out what happend - but I don't know.

Sorry for the empty report, please disregard.

Greetings

         Helge

-- 
      Dr. Helge Kreutzmann                     debian@helgefjell.de
           Dipl.-Phys.                   http://www.helgefjell.de/debian.php
        64bit GNU powered                     gpg signed mail preferred
           Help keep free software "libre": http://www.ffii.de/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
       [not found] <20230122193117.GA28689@Debian-50-lenny-64-minimal>
@ 2023-01-22 21:42 ` Alejandro Colomar
  2023-01-24 20:01   ` Re: Helge Kreutzmann
  0 siblings, 1 reply; 1596+ messages in thread
From: Alejandro Colomar @ 2023-01-22 21:42 UTC (permalink / raw)
  To: Helge Kreutzmann; +Cc: mario.blaettermann, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 205 bytes --]

Hi Helge,

On 1/22/23 20:31, Helge Kreutzmann wrote:
> Without further ado, the following was found:

Empty report.  An accident? :)

Cheers,

Alex
> 

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-11-21 11:11 Denis Arefev
@ 2022-11-21 14:28 ` Jason Yan
  0 siblings, 0 replies; 1596+ messages in thread
From: Jason Yan @ 2022-11-21 14:28 UTC (permalink / raw)
  To: Denis Arefev, Anil Gurumurthy
  Cc: Sudarsana Kalluru, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi, linux-kernel, trufanov, vfh

You may need a real subject, not a subject text in the email.

type "git help send-email" if you don't know how to use it.

On 2022/11/21 19:11, Denis Arefev wrote:
> Date: Mon, 21 Nov 2022 13:29:03 +0300
> Subject: [PATCH] scsi:bfa: Eliminated buffer overflow
> 
> Buffer 'cmd->adapter_hwpath' of size 32 accessed at
> bfad_bsg.c:101:103 can overflow, since its index 'i'
> can have value 32 that is out of range.
> 
> Signed-off-by: Denis Arefev <arefev@swemel.ru>
> ---
>   drivers/scsi/bfa/bfad_bsg.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
> index be8dfbe13e90..78615ffc62ef 100644
> --- a/drivers/scsi/bfa/bfad_bsg.c
> +++ b/drivers/scsi/bfa/bfad_bsg.c
> @@ -98,9 +98,9 @@ bfad_iocmd_ioc_get_info(struct bfad_s *bfad, void *cmd)
>   
>   	/* set adapter hw path */
>   	strcpy(iocmd->adapter_hwpath, bfad->pci_name);
> -	for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32; i++)
> +	for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32-2; i++)
>   		;
> -	for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32; )
> +	for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32-1; )
>   		;
>   	iocmd->adapter_hwpath[i] = '\0';
>   	iocmd->status = BFA_STATUS_OK;
> 

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-11-18 19:33 Mr. JAMES
  0 siblings, 0 replies; 1596+ messages in thread
From: Mr. JAMES @ 2022-11-18 19:33 UTC (permalink / raw)
  To: git

Hello, 

I'm James, an Entrepreneur, Venture Capitalist & Private Lender. I represent a group of Ultra High Net Worth Donors worldwide. Kindly let me know if you can be trusted to distribute charitable items which include Cash, Food Items and Clothing in your region.

Thank you
James.

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-11-18 18:11 Mr. JAMES
  0 siblings, 0 replies; 1596+ messages in thread
From: Mr. JAMES @ 2022-11-18 18:11 UTC (permalink / raw)
  To: devicetree

Hello, 

I'm James, an Entrepreneur, Venture Capitalist & Private Lender. I represent a group of Ultra High Net Worth Donors worldwide. Kindly let me know if you can be trusted to distribute charitable items which include Cash, Food Items and Clothing in your region.

Thank you
James.

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-11-18  7:47 ` Michal Orzel
@ 2022-11-18  9:02   ` Julien Grall
  0 siblings, 0 replies; 1596+ messages in thread
From: Julien Grall @ 2022-11-18  9:02 UTC (permalink / raw)
  To: Michal Orzel, Jiamei Xie, xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk, Wei Chen



On 18/11/2022 07:47, Michal Orzel wrote:
> On 18/11/2022 03:00, Jiamei Xie wrote:
>>
>>
>> Date: Thu, 17 Nov 2022 11:07:12 +0800
>> Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
>>
>> When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
>> Linux SBSA PL011 driver will access PL011 DMACR register in some
>> functions. As chapter "B Generic UART" in "ARM Server Base System
>> Architecture"[1] documentation describes, SBSA UART doesn't support
>> DMA. In current code, when the kernel tries to access DMACR register,
>> Xen will inject a data abort:
>> Unhandled fault at 0xffffffc00944d048
>> Mem abort info:
>>    ESR = 0x96000000
>>    EC = 0x25: DABT (current EL), IL = 32 bits
>>    SET = 0, FnV = 0
>>    EA = 0, S1PTW = 0
>>    FSC = 0x00: ttbr address size fault
>> Data abort info:
>>    ISV = 0, ISS = 0x00000000
>>    CM = 0, WnR = 0
>> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
>> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
>> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
>> ...
>> Call trace:
>>   pl011_stop_rx+0x70/0x80
>>   tty_port_shutdown+0x7c/0xb4
>>   tty_port_close+0x60/0xcc
>>   uart_close+0x34/0x8c
>>   tty_release+0x144/0x4c0
>>   __fput+0x78/0x220
>>   ____fput+0x1c/0x30
>>   task_work_run+0x88/0xc0
>>   do_notify_resume+0x8d0/0x123c
>>   el0_svc+0xa8/0xc0
>>   el0t_64_sync_handler+0xa4/0x130
>>   el0t_64_sync+0x1a0/0x1a4
>> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
>> ---[ end trace 83dd93df15c3216f ]---
>> note: bootlogd[132] exited with preempt_count 1
>> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
>>
>> As discussed in [2], this commit makes the access to DMACR register
>> write-ignore as an improvement.
> As discussed earlier, if we decide to improve vpl011 (for now only Stefano shared his opinion),
> then we need to mark *all* the PL011 registers that are not part of SBSA ar RAZ/WI.

I would be fine to that. But I would like us to print a message using 
XENLOG_G_DEBUG to catch any OS that would touch those registers.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-11-18  2:00 Jiamei Xie
@ 2022-11-18  7:47 ` Michal Orzel
  2022-11-18  9:02   ` Re: Julien Grall
  0 siblings, 1 reply; 1596+ messages in thread
From: Michal Orzel @ 2022-11-18  7:47 UTC (permalink / raw)
  To: Jiamei Xie, xen-devel
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Wei Chen

Hi Jimaei,

On 18/11/2022 03:00, Jiamei Xie wrote:
> 
> 
> Date: Thu, 17 Nov 2022 11:07:12 +0800
> Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
> 
> When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
> Linux SBSA PL011 driver will access PL011 DMACR register in some
> functions. As chapter "B Generic UART" in "ARM Server Base System
> Architecture"[1] documentation describes, SBSA UART doesn't support
> DMA. In current code, when the kernel tries to access DMACR register,
> Xen will inject a data abort:
> Unhandled fault at 0xffffffc00944d048
> Mem abort info:
>   ESR = 0x96000000
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x00: ttbr address size fault
> Data abort info:
>   ISV = 0, ISS = 0x00000000
>   CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
> ...
> Call trace:
>  pl011_stop_rx+0x70/0x80
>  tty_port_shutdown+0x7c/0xb4
>  tty_port_close+0x60/0xcc
>  uart_close+0x34/0x8c
>  tty_release+0x144/0x4c0
>  __fput+0x78/0x220
>  ____fput+0x1c/0x30
>  task_work_run+0x88/0xc0
>  do_notify_resume+0x8d0/0x123c
>  el0_svc+0xa8/0xc0
>  el0t_64_sync_handler+0xa4/0x130
>  el0t_64_sync+0x1a0/0x1a4
> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
> ---[ end trace 83dd93df15c3216f ]---
> note: bootlogd[132] exited with preempt_count 1
> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
> 
> As discussed in [2], this commit makes the access to DMACR register
> write-ignore as an improvement.
As discussed earlier, if we decide to improve vpl011 (for now only Stefano shared his opinion),
then we need to mark *all* the PL011 registers that are not part of SBSA ar RAZ/WI. So handling
DMACR and only for writes is not beneficial (it is only fixing current Linux issue, but what we
really want is to improve the code in general).

> 
> [1] https://developer.arm.com/documentation/den0094/c/?lang=en
> [2] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2211161552420.4020@ubuntu-linux-20-04-desktop/
> 
> Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
> ---
>  xen/arch/arm/vpl011.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index 43522d48fd..80d00b3052 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -463,6 +463,7 @@ static int vpl011_mmio_write(struct vcpu *v,
>      case FR:
>      case RIS:
>      case MIS:
> +    case DMACR:
>          goto write_ignore;
> 
>      case IMSC:
> --
> 2.25.1
> 
> 

~Michal


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-11-09 14:34 Denis Arefev
@ 2022-11-09 14:44 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 1596+ messages in thread
From: Greg Kroah-Hartman @ 2022-11-09 14:44 UTC (permalink / raw)
  To: Denis Arefev
  Cc: David Airlie, Daniel Vetter, stable, Alexey Khoroshilov,
	ldv-project, trufanov, vfh

On Wed, Nov 09, 2022 at 05:34:13PM +0300, Denis Arefev wrote:
> Date: Wed, 9 Nov 2022 16:52:17 +0300
> Subject: [PATCH 5.10] nbio_v7_4: Add pointer check
> 
> Return value of a function 'amdgpu_ras_find_obj' is dereferenced at nbio_v7_4.c:325 without checking for null
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Signed-off-by: Denis Arefev <arefev@swemel.ru>
> ---
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> index eadc9526d33f..d2627a610e48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> @@ -303,6 +303,9 @@ static void nbio_v7_4_handle_ras_controller_intr_no_bifring(struct amdgpu_device
>  	struct ras_manager *obj = amdgpu_ras_find_obj(adev, adev->nbio.ras_if);
>  	struct ras_err_data err_data = {0, 0, 0, NULL};
>  	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> 
> +	if (!obj)
> +		return;
>  
>  	bif_doorbell_intr_cntl = RREG32_SOC15(NBIO, 0, mmBIF_DOORBELL_INT_CNTL);
>  	if (REG_GET_FIELD(bif_doorbell_intr_cntl,
> -- 
> 2.25.1
> 


<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-09-14 13:12 Amjad Ouled-Ameur
@ 2022-09-14 13:18   ` Amjad Ouled-Ameur
  0 siblings, 0 replies; 1596+ messages in thread
From: Amjad Ouled-Ameur @ 2022-09-14 13:18 UTC (permalink / raw)
  To: Rob Herring
  Cc: Krzysztof Kozlowski, Matthias Brugger, devicetree,
	linux-arm-kernel, linux-mediatek, linux-kernel

Hi,

The subject has not been parsed correctly, I resent a proper patch here:

https://patchwork.kernel.org/project/linux-mediatek/patch/20220914131339.18348-1-aouledameur@baylibre.com/


Sorry for the noise.

Regards,

Amjad

On 9/14/22 15:12, Amjad Ouled-Ameur wrote:
> Subject: [PATCH] arm64: dts: mediatek: mt8183: remove thermal zones without
>   trips.
>
> Thermal zones without trip point are not registered by thermal core.
>
> tzts1 ~ tzts6 zones of mt8183 were intially introduced for test-purpose
> only but are not supposed to remain on DT.
>
> Remove the zones above and keep only cpu_thermal.
>
> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
> ---
>   arch/arm64/boot/dts/mediatek/mt8183.dtsi | 57 ------------------------
>   1 file changed, 57 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> index 9d32871973a2..f65fae8939de 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> @@ -1182,63 +1182,6 @@ THERMAL_NO_LIMIT
>   					};
>   				};
>   			};
> -
> -			/* The tzts1 ~ tzts6 don't need to polling */
> -			/* The tzts1 ~ tzts6 don't need to thermal throttle */
> -
> -			tzts1: tzts1 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 1>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts2: tzts2 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 2>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts3: tzts3 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 3>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts4: tzts4 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 4>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts5: tzts5 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 5>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tztsABB: tztsABB {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 6>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
>   		};
>   
>   		pwm0: pwm@1100e000 {

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-09-14 13:18   ` Amjad Ouled-Ameur
  0 siblings, 0 replies; 1596+ messages in thread
From: Amjad Ouled-Ameur @ 2022-09-14 13:18 UTC (permalink / raw)
  To: Rob Herring
  Cc: Krzysztof Kozlowski, Matthias Brugger, devicetree,
	linux-arm-kernel, linux-mediatek, linux-kernel

Hi,

The subject has not been parsed correctly, I resent a proper patch here:

https://patchwork.kernel.org/project/linux-mediatek/patch/20220914131339.18348-1-aouledameur@baylibre.com/


Sorry for the noise.

Regards,

Amjad

On 9/14/22 15:12, Amjad Ouled-Ameur wrote:
> Subject: [PATCH] arm64: dts: mediatek: mt8183: remove thermal zones without
>   trips.
>
> Thermal zones without trip point are not registered by thermal core.
>
> tzts1 ~ tzts6 zones of mt8183 were intially introduced for test-purpose
> only but are not supposed to remain on DT.
>
> Remove the zones above and keep only cpu_thermal.
>
> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
> ---
>   arch/arm64/boot/dts/mediatek/mt8183.dtsi | 57 ------------------------
>   1 file changed, 57 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> index 9d32871973a2..f65fae8939de 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> @@ -1182,63 +1182,6 @@ THERMAL_NO_LIMIT
>   					};
>   				};
>   			};
> -
> -			/* The tzts1 ~ tzts6 don't need to polling */
> -			/* The tzts1 ~ tzts6 don't need to thermal throttle */
> -
> -			tzts1: tzts1 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 1>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts2: tzts2 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 2>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts3: tzts3 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 3>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts4: tzts4 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 4>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tzts5: tzts5 {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 5>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
> -
> -			tztsABB: tztsABB {
> -				polling-delay-passive = <0>;
> -				polling-delay = <0>;
> -				thermal-sensors = <&thermal 6>;
> -				sustainable-power = <5000>;
> -				trips {};
> -				cooling-maps {};
> -			};
>   		};
>   
>   		pwm0: pwm@1100e000 {

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-09-12 12:36 Christian König
@ 2022-09-13  2:04 ` Alex Deucher
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Deucher @ 2022-09-13  2:04 UTC (permalink / raw)
  To: Christian König; +Cc: alexander.deucher, amd-gfx

On Mon, Sep 12, 2022 at 8:36 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hey Alex,
>
> I've decided to split this patch set into two because we still can't
> figure out where the VCN regressions come from.
>
> Ruijing tested them and confirmed that they don't regress VCN.
>
> Can you and maybe Felix take a look and review them?

Looks good to me.  Series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

>
> Thanks,
> Christian.
>
>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-08-28 21:01 Nick Neumann
@ 2022-09-01 17:44 ` Nick Neumann
  0 siblings, 0 replies; 1596+ messages in thread
From: Nick Neumann @ 2022-09-01 17:44 UTC (permalink / raw)
  To: fio

PR for this is up now.

On Sun, Aug 28, 2022 at 4:01 PM Nick Neumann <nick@pcpartpicker.com> wrote:
>
> I've filed the issue on github, but just thought I'd mention here too.
> In real-world use it appears to be intermittent. I"m not yet sure how
> intermittent, but I could see it being used in production and not
> caught right away. I got lucky and stumbled on it when looking at
> graphs of runs and noticed 15 seconds of no activity.
>
> https://github.com/axboe/fio/issues/1457
>
> With the null ioengine, I can make it reproduce very reliably, which
> is encouraging as I move to debug.
>
> I had just moved to using log compression as it is really powerful,
> and the only way to store per I/O logs for a long run without pushing
> up against the amount of physical memory in a system.
>
> (Without compression, a GB of sequential writes at 128K block size is
> on the order of 245KB of memory per log, so a TB is 245MB per log. Now
> run a job to fill a 20TB drive and you're at 4.9GB for one log file.
> If you record all 3 latency numbers too, you're talking close to
> 20GB.)

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-08-31 21:47 ` Yang Shi
@ 2022-09-01  0:24   ` Zach O'Keefe
  0 siblings, 0 replies; 1596+ messages in thread
From: Zach O'Keefe @ 2022-09-01  0:24 UTC (permalink / raw)
  To: Yang Shi
  Cc: linux-mm, Andrew Morton, linux-api, Axel Rasmussen,
	James Houghton, Hugh Dickins, Miaohe Lin, David Hildenbrand,
	David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
	Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
	Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia

On Wed, Aug 31, 2022 at 2:47 PM Yang Shi <shy828301@gmail.com> wrote:
>
> Hi Zach,
>
> I did a quick look at the series, basically no show stopper to me. But
> I didn't find time to review them thoroughly yet, quite busy on
> something else. Just a heads up, I didn't mean to ignore you. I will
> review them when I find some time.
>
> Thanks,
> Yang

Hey Yang,

Thanks for taking the time to look through, and thanks for giving me a
heads up, and no rush!

In the last day or so, while porting this series around, I encountered
some subtle edge cases I wanted to clean up / address - so it's good
you didn't do a thorough review yet. I was *hoping* to have a v3 out
last night (which evidently did not happen) and it does not seem like
it will happen today, so I'll leave this message as a request for
reviewers to hold off on a thorough review until v3.

Thanks for your time as always,
Zach

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-08-26 22:03 Zach O'Keefe
@ 2022-08-31 21:47 ` Yang Shi
  2022-09-01  0:24   ` Re: Zach O'Keefe
  0 siblings, 1 reply; 1596+ messages in thread
From: Yang Shi @ 2022-08-31 21:47 UTC (permalink / raw)
  To: Zach O'Keefe
  Cc: linux-mm, Andrew Morton, linux-api, Axel Rasmussen,
	James Houghton, Hugh Dickins, Miaohe Lin, David Hildenbrand,
	David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
	Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
	Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia

Hi Zach,

I did a quick look at the series, basically no show stopper to me. But
I didn't find time to review them thoroughly yet, quite busy on
something else. Just a heads up, I didn't mean to ignore you. I will
review them when I find some time.

Thanks,
Yang

On Fri, Aug 26, 2022 at 3:03 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE
>
> v2 Forward
>
> Mostly a RESEND: rebase on latest mm-unstable + minor bug fixes from
> kernel test robot.
> --------------------------------
>
> This series builds on top of the previous "mm: userspace hugepage collapse"
> series which introduced the MADV_COLLAPSE madvise mode and added support
> for private, anonymous mappings[1], by adding support for file and shmem
> backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.
>
> File and shmem support have been added with effort to align with existing
> MADV_COLLAPSE semantics and policy decisions[2].  Collapse of shmem-backed
> memory ignores kernel-guiding directives and heuristics including all
> sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
> options (shmem always supports large folios).  Like anonymous mappings, on
> successful return of MADV_COLLAPSE on file/shmem memory, the contents of
> memory mapped by the addresses provided will be synchronously pmd-mapped
> THPs.
>
> This functionality unlocks two important uses:
>
> (1)     Immediately back executable text by THPs.  Current support provided
>         by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
>         system which might impair services from serving at their full rated
>         load after (re)starting.  Tricks like mremap(2)'ing text onto
>         anonymous memory to immediately realize iTLB performance prevents
>         page sharing and demand paging, both of which increase steady state
>         memory footprint.  Now, we can have the best of both worlds: Peak
>         upfront performance and lower RAM footprints.
>
> (2)     userfaultfd-based live migration of virtual machines satisfy UFFD
>         faults by fetching native-sized pages over the network (to avoid
>         latency of transferring an entire hugepage).  However, after guest
>         memory has been fully copied to the new host, MADV_COLLAPSE can
>         be used to immediately increase guest performance.
>
> khugepaged has received a small improvement by association and can now
> detect and collapse pte-mapped THPs.  However, there is still work to be
> done along the file collapse path.  Compound pages of arbitrary order still
> needs to be supported and THP collapse needs to be converted to using
> folios in general.  Eventually, we'd like to move away from the read-only
> and executable-mapped constraints currently imposed on eligible files and
> support any inode claiming huge folio support.  That said, I think the
> series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
> memory.
>
> Patches 1-3     Implement the guts of the series.
> Patch 4         Is a tracepoint for debugging.
> Patches 5-8     Refactor existing khugepaged selftests to work with new
>                 memory types.
> Patch 9         Adds a userfaultfd selftest mode to mimic a functional test
>                 of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.
>
> Applies against mm-unstable.
>
> [1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
> [2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/
>
> v1 -> v2:
> - Add missing definition for khugepaged_add_pte_mapped_thp() in
>   !CONFIG_SHEM builds, in "mm/khugepaged: attempt to map
>   file/shmem-backed pte-mapped THPs by pmds"
> - Minor bugfixes in "mm/madvise: add file and shmem support to
>   MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and some
>   compiler settings.
> - Rebased on latest mm-unstable
>
> Zach O'Keefe (9):
>   mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
>   mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
>     pmds
>   mm/madvise: add file and shmem support to MADV_COLLAPSE
>   mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
>   selftests/vm: dedup THP helpers
>   selftests/vm: modularize thp collapse memory operations
>   selftests/vm: add thp collapse file and tmpfs testing
>   selftests/vm: add thp collapse shmem testing
>   selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
>
>  include/linux/khugepaged.h                    |  13 +-
>  include/linux/shmem_fs.h                      |  10 +-
>  include/trace/events/huge_memory.h            |  36 +
>  kernel/events/uprobes.c                       |   2 +-
>  mm/huge_memory.c                              |   2 +-
>  mm/khugepaged.c                               | 289 ++++--
>  mm/shmem.c                                    |  18 +-
>  tools/testing/selftests/vm/Makefile           |   2 +
>  tools/testing/selftests/vm/khugepaged.c       | 828 ++++++++++++------
>  tools/testing/selftests/vm/soft-dirty.c       |   2 +-
>  .../selftests/vm/split_huge_page_test.c       |  12 +-
>  tools/testing/selftests/vm/userfaultfd.c      | 171 +++-
>  tools/testing/selftests/vm/vm_util.c          |  36 +-
>  tools/testing/selftests/vm/vm_util.h          |   5 +-
>  14 files changed, 1040 insertions(+), 386 deletions(-)
>
> --
> 2.37.2.672.g94769d06f0-goog
>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-06-06  5:33 Fenil Jain
@ 2022-06-06  5:51 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 1596+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-06  5:51 UTC (permalink / raw)
  To: Fenil Jain; +Cc: Shuah Khan, stable

On Mon, Jun 06, 2022 at 11:03:24AM +0530, Fenil Jain wrote:
> On Fri, Jun 03, 2022 at 07:43:01PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.18.2 release.
> > There are 67 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun, 05 Jun 2022 17:38:05 +0000.
> > Anything received after that time might be too late.
> 
> Hey Greg,
> 
> Ran tests and boot tested on my system, no regression found
> 
> Tested-by: Fenil Jain<fkjainco@gmail.com>

Thanks for the testing, but something went wrong with your email client
and it lost the Subject: line, making this impossible to be picked up by
our tools.

Also, please include an extra ' ' before the '<' character in your
tested-by line.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-05-19 10:50 ` Matthew Auld
@ 2022-05-20  7:11   ` Christian König
  0 siblings, 0 replies; 1596+ messages in thread
From: Christian König @ 2022-05-20  7:11 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

Am 19.05.22 um 12:50 schrieb Matthew Auld:
> On Thu, 19 May 2022 at 10:55, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Just sending that out once more to intel-gfx to let the CI systems take
>> a look.
> If all went well it should normally appear at [1][2], if CI was able
> to pick up the series.
>
> Since it's not currently there, I assume it's temporarily stuck in the
> moderation queue, assuming you are not subscribed to intel-gfx ml?

Ah! Well I am subscribed, just not with the e-Mail address I use to send 
out those patches.

Going to fix that ASAP!

Thanks,
Christian.

>   If
> so, perhaps consider subscribing at [3] and then disable receiving any
> mail from the ml, so you get full use of CI without getting spammed.
>
> [1] https://intel-gfx-ci.01.org/queue/index.html
> [2] https://patchwork.freedesktop.org/project/intel-gfx/series/
> [3] https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>> No functional change compared to the last version.
>>
>> Christian.
>>
>>


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-05-19  9:54 Christian König
@ 2022-05-19 10:50 ` Matthew Auld
  2022-05-20  7:11   ` Re: Christian König
  0 siblings, 1 reply; 1596+ messages in thread
From: Matthew Auld @ 2022-05-19 10:50 UTC (permalink / raw)
  To: Christian König; +Cc: Intel Graphics Development, ML dri-devel

On Thu, 19 May 2022 at 10:55, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Just sending that out once more to intel-gfx to let the CI systems take
> a look.

If all went well it should normally appear at [1][2], if CI was able
to pick up the series.

Since it's not currently there, I assume it's temporarily stuck in the
moderation queue, assuming you are not subscribed to intel-gfx ml? If
so, perhaps consider subscribing at [3] and then disable receiving any
mail from the ml, so you get full use of CI without getting spammed.

[1] https://intel-gfx-ci.01.org/queue/index.html
[2] https://patchwork.freedesktop.org/project/intel-gfx/series/
[3] https://lists.freedesktop.org/mailman/listinfo/intel-gfx

>
> No functional change compared to the last version.
>
> Christian.
>
>

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* RE:
  2022-05-06  5:41 Suman Ghosh
@ 2022-05-06  5:43 ` Suman Ghosh
  0 siblings, 0 replies; 1596+ messages in thread
From: Suman Ghosh @ 2022-05-06  5:43 UTC (permalink / raw)
  To: pabeni, davem, kuba, Sunil Kovvuri Goutham, netdev

Please ignore this.

Regards,
Suman

>-----Original Message-----
>From: Suman Ghosh <sumang@marvell.com>
>Sent: Friday, May 6, 2022 11:12 AM
>To: pabeni@redhat.com; davem@davemloft.net; kuba@kernel.org; Sunil
>Kovvuri Goutham <sgoutham@marvell.com>; netdev@vger.kernel.org
>Cc: Suman Ghosh <sumang@marvell.com>
>Subject:
>
>Date: Tue, 22 Mar 2022 11:54:47 +0530
>Subject: [PATCH V3] octeontx2-pf: Add support for adaptive interrupt
>coalescing
>
>Added support for adaptive IRQ coalescing. It uses net_dim algorithm to
>find the suitable delay/IRQ count based on the current packet rate.
>
>Signed-off-by: Suman Ghosh <sumang@marvell.com>
>Reviewed-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
>---
>Changes since V2
>- No change, resubmitting because V1 did not get picked up in patchworks
>  for some reason.
>
> .../net/ethernet/marvell/octeontx2/Kconfig    |  1 +
> .../marvell/octeontx2/nic/otx2_common.c       |  5 ---
> .../marvell/octeontx2/nic/otx2_common.h       | 10 +++++
> .../marvell/octeontx2/nic/otx2_ethtool.c      | 43 ++++++++++++++++---
> .../ethernet/marvell/octeontx2/nic/otx2_pf.c  | 22 ++++++++++
> .../marvell/octeontx2/nic/otx2_txrx.c         | 28 ++++++++++++
> .../marvell/octeontx2/nic/otx2_txrx.h         |  1 +
> 7 files changed, 99 insertions(+), 11 deletions(-)
>
>diff --git a/drivers/net/ethernet/marvell/octeontx2/Kconfig
>b/drivers/net/ethernet/marvell/octeontx2/Kconfig
>index 8560f7e463d3..a544733152d8 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/Kconfig
>+++ b/drivers/net/ethernet/marvell/octeontx2/Kconfig
>@@ -30,6 +30,7 @@ config OCTEONTX2_PF
> 	tristate "Marvell OcteonTX2 NIC Physical Function driver"
> 	select OCTEONTX2_MBOX
> 	select NET_DEVLINK
>+	select DIMLIB
> 	depends on PCI
> 	help
> 	  This driver supports Marvell's OcteonTX2 Resource Virtualization
>diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>index 033fd79d35b0..c28850d815c2 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
>@@ -103,11 +103,6 @@ void otx2_get_dev_stats(struct otx2_nic *pfvf)  {
> 	struct otx2_dev_stats *dev_stats = &pfvf->hw.dev_stats;
>
>-#define OTX2_GET_RX_STATS(reg) \
>-	 otx2_read64(pfvf, NIX_LF_RX_STATX(reg))
>-#define OTX2_GET_TX_STATS(reg) \
>-	 otx2_read64(pfvf, NIX_LF_TX_STATX(reg))
>-
> 	dev_stats->rx_bytes = OTX2_GET_RX_STATS(RX_OCTS);
> 	dev_stats->rx_drops = OTX2_GET_RX_STATS(RX_DROP);
> 	dev_stats->rx_bcast_frames = OTX2_GET_RX_STATS(RX_BCAST); diff --
>git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
>b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
>index d9f4b085b2a4..6abf5c28921f 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
>+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
>@@ -16,6 +16,7 @@
> #include <net/pkt_cls.h>
> #include <net/devlink.h>
> #include <linux/time64.h>
>+#include <linux/dim.h>
>
> #include <mbox.h>
> #include <npc.h>
>@@ -54,6 +55,11 @@ enum arua_mapped_qtypes {
> /* Send skid of 2000 packets required for CQ size of 4K CQEs. */
> #define SEND_CQ_SKID	2000
>
>+#define OTX2_GET_RX_STATS(reg) \
>+	 otx2_read64(pfvf, NIX_LF_RX_STATX(reg)) #define
>+OTX2_GET_TX_STATS(reg) \
>+	 otx2_read64(pfvf, NIX_LF_TX_STATX(reg))
>+
> struct otx2_lmt_info {
> 	u64 lmt_addr;
> 	u16 lmt_id;
>@@ -363,6 +369,7 @@ struct otx2_nic {
> #define OTX2_FLAG_TC_MATCHALL_INGRESS_ENABLED	BIT_ULL(13)
> #define OTX2_FLAG_DMACFLTR_SUPPORT		BIT_ULL(14)
> #define OTX2_FLAG_PTP_ONESTEP_SYNC		BIT_ULL(15)
>+#define OTX2_FLAG_ADPTV_INT_COAL_ENABLED	BIT_ULL(16)
> 	u64			flags;
> 	u64			*cq_op_addr;
>
>@@ -442,6 +449,9 @@ struct otx2_nic {
> #endif
> 	/* qos */
> 	struct otx2_qos		qos;
>+
>+	/* napi event count. It is needed for adaptive irq coalescing */
>+	u32 napi_events;
> };
>
> static inline bool is_otx2_lbkvf(struct pci_dev *pdev) diff --git
>a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
>b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
>index 72d0b02da3cc..8ed282991f05 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
>+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
>@@ -477,6 +477,14 @@ static int otx2_get_coalesce(struct net_device
>*netdev,
> 	cmd->rx_max_coalesced_frames = hw->cq_ecount_wait;
> 	cmd->tx_coalesce_usecs = hw->cq_time_wait;
> 	cmd->tx_max_coalesced_frames = hw->cq_ecount_wait;
>+	if ((pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED) ==
>+		OTX2_FLAG_ADPTV_INT_COAL_ENABLED) {
>+		cmd->use_adaptive_rx_coalesce = 1;
>+		cmd->use_adaptive_tx_coalesce = 1;
>+	} else {
>+		cmd->use_adaptive_rx_coalesce = 0;
>+		cmd->use_adaptive_tx_coalesce = 0;
>+	}
>
> 	return 0;
> }
>@@ -486,10 +494,10 @@ static int otx2_set_coalesce(struct net_device
>*netdev,  {
> 	struct otx2_nic *pfvf = netdev_priv(netdev);
> 	struct otx2_hw *hw = &pfvf->hw;
>+	u8 priv_coalesce_status;
> 	int qidx;
>
>-	if (ec->use_adaptive_rx_coalesce || ec->use_adaptive_tx_coalesce ||
>-	    ec->rx_coalesce_usecs_irq || ec->rx_max_coalesced_frames_irq ||
>+	if (ec->rx_coalesce_usecs_irq || ec->rx_max_coalesced_frames_irq ||
> 	    ec->tx_coalesce_usecs_irq || ec->tx_max_coalesced_frames_irq ||
> 	    ec->stats_block_coalesce_usecs || ec->pkt_rate_low ||
> 	    ec->rx_coalesce_usecs_low || ec->rx_max_coalesced_frames_low ||
>@@ -502,6 +510,18 @@ static int otx2_set_coalesce(struct net_device
>*netdev,
> 	if (!ec->rx_max_coalesced_frames || !ec->tx_max_coalesced_frames)
> 		return 0;
>
>+	/* Check and update coalesce status */
>+	if ((pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED) ==
>+	    OTX2_FLAG_ADPTV_INT_COAL_ENABLED) {
>+		priv_coalesce_status = 1;
>+		if (!ec->use_adaptive_rx_coalesce || !ec-
>>use_adaptive_tx_coalesce)
>+			pfvf->flags &= ~OTX2_FLAG_ADPTV_INT_COAL_ENABLED;
>+	} else {
>+		priv_coalesce_status = 0;
>+		if (ec->use_adaptive_rx_coalesce || ec-
>>use_adaptive_tx_coalesce)
>+			pfvf->flags |= OTX2_FLAG_ADPTV_INT_COAL_ENABLED;
>+	}
>+
> 	/* 'cq_time_wait' is 8bit and is in multiple of 100ns,
> 	 * so clamp the user given value to the range of 1 to 25usec.
> 	 */
>@@ -521,13 +541,13 @@ static int otx2_set_coalesce(struct net_device
>*netdev,
> 		hw->cq_time_wait = min_t(u8, ec->rx_coalesce_usecs,
> 					 ec->tx_coalesce_usecs);
>
>-	/* Max ecount_wait supported is 16bit,
>-	 * so clamp the user given value to the range of 1 to 64k.
>+	/* Max packet budget per napi is 64,
>+	 * so clamp the user given value to the range of 1 to 64.
> 	 */
> 	ec->rx_max_coalesced_frames = clamp_t(u32, ec-
>>rx_max_coalesced_frames,
>-					      1, U16_MAX);
>+					      1, NAPI_POLL_WEIGHT);
> 	ec->tx_max_coalesced_frames = clamp_t(u32, ec-
>>tx_max_coalesced_frames,
>-					      1, U16_MAX);
>+					      1, NAPI_POLL_WEIGHT);
>
> 	/* Rx and Tx are mapped to same CQ, check which one
> 	 * is changed, if both then choose the min.
>@@ -540,6 +560,17 @@ static int otx2_set_coalesce(struct net_device
>*netdev,
> 		hw->cq_ecount_wait = min_t(u16, ec->rx_max_coalesced_frames,
> 					   ec->tx_max_coalesced_frames);
>
>+	/* Reset 'cq_time_wait' and 'cq_ecount_wait' to
>+	 * default values if coalesce status changed from
>+	 * 'on' to 'off'.
>+	 */
>+	if (priv_coalesce_status &&
>+	    ((pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED) !=
>+	    OTX2_FLAG_ADPTV_INT_COAL_ENABLED)) {
>+		hw->cq_time_wait = CQ_TIMER_THRESH_DEFAULT;
>+		hw->cq_ecount_wait = CQ_CQE_THRESH_DEFAULT;
>+	}
>+
> 	if (netif_running(netdev)) {
> 		for (qidx = 0; qidx < pfvf->hw.cint_cnt; qidx++)
> 			otx2_config_irq_coalescing(pfvf, qidx); diff --git
>a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
>b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
>index f18c9a9a50d0..94aaafbeb365 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
>+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
>@@ -1279,6 +1279,7 @@ static irqreturn_t otx2_cq_intr_handler(int irq,
>void *cq_irq)
> 	otx2_write64(pf, NIX_LF_CINTX_ENA_W1C(qidx), BIT_ULL(0));
>
> 	/* Schedule NAPI */
>+	pf->napi_events++;
> 	napi_schedule_irqoff(&cq_poll->napi);
>
> 	return IRQ_HANDLED;
>@@ -1292,6 +1293,7 @@ static void otx2_disable_napi(struct otx2_nic *pf)
>
> 	for (qidx = 0; qidx < pf->hw.cint_cnt; qidx++) {
> 		cq_poll = &qset->napi[qidx];
>+		cancel_work_sync(&cq_poll->dim.work);
> 		napi_disable(&cq_poll->napi);
> 		netif_napi_del(&cq_poll->napi);
> 	}
>@@ -1538,6 +1540,24 @@ static void otx2_free_hw_resources(struct
>otx2_nic *pf)
> 	mutex_unlock(&mbox->lock);
> }
>
>+static void otx2_dim_work(struct work_struct *w) {
>+	struct dim_cq_moder cur_moder;
>+	struct otx2_cq_poll *cq_poll;
>+	struct otx2_nic *pfvf;
>+	struct dim *dim;
>+
>+	dim = container_of(w, struct dim, work);
>+	cur_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
>+	cq_poll = container_of(dim, struct otx2_cq_poll, dim);
>+	pfvf = (struct otx2_nic *)cq_poll->dev;
>+	pfvf->hw.cq_time_wait = (cur_moder.usec > CQ_TIMER_THRESH_MAX) ?
>+				CQ_TIMER_THRESH_MAX : cur_moder.usec;
>+	pfvf->hw.cq_ecount_wait = (cur_moder.pkts > NAPI_POLL_WEIGHT) ?
>+				NAPI_POLL_WEIGHT : cur_moder.pkts;
>+	dim->state = DIM_START_MEASURE;
>+}
>+
> int otx2_open(struct net_device *netdev)  {
> 	struct otx2_nic *pf = netdev_priv(netdev); @@ -1611,6 +1631,8 @@
>int otx2_open(struct net_device *netdev)
> 					  CINT_INVALID_CQ;
>
> 		cq_poll->dev = (void *)pf;
>+		cq_poll->dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_CQE;
>+		INIT_WORK(&cq_poll->dim.work, otx2_dim_work);
> 		netif_napi_add(netdev, &cq_poll->napi,
> 			       otx2_napi_handler, NAPI_POLL_WEIGHT);
> 		napi_enable(&cq_poll->napi);
>diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
>b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
>index 459b94b99ddb..927dd12b4f4e 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
>+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
>@@ -512,6 +512,22 @@ static int otx2_tx_napi_handler(struct otx2_nic
>*pfvf,
> 	return 0;
> }
>
>+static void otx2_adjust_adaptive_coalese(struct otx2_nic *pfvf, struct
>+otx2_cq_poll *cq_poll) {
>+	struct dim_sample dim_sample;
>+	u64 rx_frames, rx_bytes;
>+
>+	rx_frames = OTX2_GET_RX_STATS(RX_BCAST) +
>OTX2_GET_RX_STATS(RX_MCAST) +
>+			OTX2_GET_RX_STATS(RX_UCAST);
>+	rx_bytes = OTX2_GET_RX_STATS(RX_OCTS);
>+	dim_update_sample(pfvf->napi_events,
>+			  rx_frames,
>+			  rx_bytes,
>+			  &dim_sample);
>+
>+	net_dim(&cq_poll->dim, dim_sample);
>+}
>+
> int otx2_napi_handler(struct napi_struct *napi, int budget)  {
> 	struct otx2_cq_queue *rx_cq = NULL;
>@@ -549,6 +565,18 @@ int otx2_napi_handler(struct napi_struct *napi, int
>budget)
> 		if (pfvf->flags & OTX2_FLAG_INTF_DOWN)
> 			return workdone;
>
>+		/* Check for adaptive interrupt coalesce */
>+		if (workdone != 0 &&
>+		    ((pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED) ==
>+		    OTX2_FLAG_ADPTV_INT_COAL_ENABLED)) {
>+			/* Adjust irq coalese using net_dim */
>+			otx2_adjust_adaptive_coalese(pfvf, cq_poll);
>+
>+			/* Update irq coalescing */
>+			for (i = 0; i < pfvf->hw.cint_cnt; i++)
>+				otx2_config_irq_coalescing(pfvf, i);
>+		}
>+
> 		/* Re-enable interrupts */
> 		otx2_write64(pfvf, NIX_LF_CINTX_ENA_W1S(cq_poll->cint_idx),
> 			     BIT_ULL(0));
>diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h
>b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h
>index a2ac2b3bdbf5..ed41a68d3ec6 100644
>--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h
>+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.h
>@@ -107,6 +107,7 @@ struct otx2_cq_poll {
> #define CINT_INVALID_CQ		255
> 	u8			cint_idx;
> 	u8			cq_ids[CQS_PER_CINT];
>+	struct dim		dim;
> 	struct napi_struct	napi;
> };
>
>--
>2.25.1


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-22 15:53             ` Re: Thomas Gleixner
@ 2022-04-23  2:29               ` Nicholas Piggin
  -1 siblings, 0 replies; 1596+ messages in thread
From: Nicholas Piggin @ 2022-04-23  2:29 UTC (permalink / raw)
  To: Michael Ellerman, paulmck, Thomas Gleixner, Zhouyi Zhou
  Cc: Daniel
	 Lezcano, linux-kernel, linuxppc-dev, Miguel Ojeda, rcu,
	Viresh
	 Kumar

Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am:
> On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
>> So we traced the problem down to possibly a misunderstanding between 
>> decrementer clock event device and core code.
>>
>> The decrementer is only oneshot*ish*. It actually needs to either be 
>> reprogrammed or shut down otherwise it just continues to cause 
>> interrupts.
> 
> I always thought that PPC had sane timers. That's really disillusioning.

My comment was probably a bit misleading explanation of the whole
situation. This weirdness is actually in software in the powerpc
clock event driver due to a recent change I made assuming the clock 
event goes to oneshot-stopped.

The hardware is relatively sane I think, global synchronized constant
rate high frequency clock distributed to the CPUs so reads don't
go off-core. And per-CPU "decrementer" event interrupt at the same
frequency as the clock -- program it to a +ve value and it decrements
until zero then creates basically a level triggered interrupt.

Before my change, the decrementer interrupt would always clear the
interrupt at entry. The event_handler usually programs another
timer in so I tried to avoid that first clear counting on the
oneshot_stopped callback to clear the interrupt if there was no
other timer.

>> Before commit 35de589cb879, it was sort of two-shot. The initial 
>> interrupt at the programmed time would set its internal next_tb variable 
>> to ~0 and call the ->event_handler(). If that did not set_next_event or 
>> stop the timer, the interrupt will fire again immediately, notice 
>> next_tb is ~0, and only then stop the decrementer interrupt.
>>
>> So that was already kind of ugly, this patch just turned it into a hang.
>>
>> The problem happens when the tick is stopped with an event still 
>> pending, then tick_nohz_handler() is called, but it bails out because 
>> tick_stopped == 1 so the device never gets programmed again, and so it 
>> keeps firing.
>>
>> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
>> really oneshot, but we would like to avoid doing that because it requires 
>> additional programming of the hardware on each timer interrupt. We have 
>> the ONESHOT_STOPPED state which seems to be just about what we want.
>>
>> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
>> we don't stop it here? This patch seems to fix the hang (not heavily
>> tested though).
> 
> This was definitely overlooked, but it's arguable it is is not required
> for real oneshot clockevent devices. This should only handle the case
> where the interrupt was already pending.
> 
> The ONESHOT_STOPPED state was introduced to handle the case where the
> last timer gets canceled, so the already programmed event does not fire.
> 
> It was not necessarily meant to "fix" clockevent devices which are
> pretending to be ONESHOT, but keep firing over and over.
> 
> That, said. I'm fine with the change along with a big fat comment why
> this is required.

Thanks for taking a look and confirming. I just sent a patch with a
comment and what looks like another missed case. Hopefully it's okay.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-04-23  2:29               ` Nicholas Piggin
  0 siblings, 0 replies; 1596+ messages in thread
From: Nicholas Piggin @ 2022-04-23  2:29 UTC (permalink / raw)
  To: Michael Ellerman, paulmck, Thomas Gleixner, Zhouyi Zhou
  Cc: Viresh
	 Kumar,
	Daniel
	 Lezcano, linux-kernel, rcu, Miguel Ojeda, linuxppc-dev

Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am:
> On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
>> So we traced the problem down to possibly a misunderstanding between 
>> decrementer clock event device and core code.
>>
>> The decrementer is only oneshot*ish*. It actually needs to either be 
>> reprogrammed or shut down otherwise it just continues to cause 
>> interrupts.
> 
> I always thought that PPC had sane timers. That's really disillusioning.

My comment was probably a bit misleading explanation of the whole
situation. This weirdness is actually in software in the powerpc
clock event driver due to a recent change I made assuming the clock 
event goes to oneshot-stopped.

The hardware is relatively sane I think, global synchronized constant
rate high frequency clock distributed to the CPUs so reads don't
go off-core. And per-CPU "decrementer" event interrupt at the same
frequency as the clock -- program it to a +ve value and it decrements
until zero then creates basically a level triggered interrupt.

Before my change, the decrementer interrupt would always clear the
interrupt at entry. The event_handler usually programs another
timer in so I tried to avoid that first clear counting on the
oneshot_stopped callback to clear the interrupt if there was no
other timer.

>> Before commit 35de589cb879, it was sort of two-shot. The initial 
>> interrupt at the programmed time would set its internal next_tb variable 
>> to ~0 and call the ->event_handler(). If that did not set_next_event or 
>> stop the timer, the interrupt will fire again immediately, notice 
>> next_tb is ~0, and only then stop the decrementer interrupt.
>>
>> So that was already kind of ugly, this patch just turned it into a hang.
>>
>> The problem happens when the tick is stopped with an event still 
>> pending, then tick_nohz_handler() is called, but it bails out because 
>> tick_stopped == 1 so the device never gets programmed again, and so it 
>> keeps firing.
>>
>> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
>> really oneshot, but we would like to avoid doing that because it requires 
>> additional programming of the hardware on each timer interrupt. We have 
>> the ONESHOT_STOPPED state which seems to be just about what we want.
>>
>> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
>> we don't stop it here? This patch seems to fix the hang (not heavily
>> tested though).
> 
> This was definitely overlooked, but it's arguable it is is not required
> for real oneshot clockevent devices. This should only handle the case
> where the interrupt was already pending.
> 
> The ONESHOT_STOPPED state was introduced to handle the case where the
> last timer gets canceled, so the already programmed event does not fire.
> 
> It was not necessarily meant to "fix" clockevent devices which are
> pretending to be ONESHOT, but keep firing over and over.
> 
> That, said. I'm fine with the change along with a big fat comment why
> this is required.

Thanks for taking a look and confirming. I just sent a patch with a
comment and what looks like another missed case. Hopefully it's okay.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-13  5:11         ` Nicholas Piggin
@ 2022-04-22 15:53             ` Thomas Gleixner
  0 siblings, 0 replies; 1596+ messages in thread
From: Thomas Gleixner @ 2022-04-22 15:53 UTC (permalink / raw)
  To: Nicholas Piggin, Michael Ellerman, paulmck, Zhouyi Zhou
  Cc: linuxppc-dev, Miguel Ojeda, rcu, Daniel Lezcano, linux-kernel,
	Viresh Kumar

On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
> So we traced the problem down to possibly a misunderstanding between 
> decrementer clock event device and core code.
>
> The decrementer is only oneshot*ish*. It actually needs to either be 
> reprogrammed or shut down otherwise it just continues to cause 
> interrupts.

I always thought that PPC had sane timers. That's really disillusioning.

> Before commit 35de589cb879, it was sort of two-shot. The initial 
> interrupt at the programmed time would set its internal next_tb variable 
> to ~0 and call the ->event_handler(). If that did not set_next_event or 
> stop the timer, the interrupt will fire again immediately, notice 
> next_tb is ~0, and only then stop the decrementer interrupt.
>
> So that was already kind of ugly, this patch just turned it into a hang.
>
> The problem happens when the tick is stopped with an event still 
> pending, then tick_nohz_handler() is called, but it bails out because 
> tick_stopped == 1 so the device never gets programmed again, and so it 
> keeps firing.
>
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
> really oneshot, but we would like to avoid doing that because it requires 
> additional programming of the hardware on each timer interrupt. We have 
> the ONESHOT_STOPPED state which seems to be just about what we want.
>
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).

This was definitely overlooked, but it's arguable it is is not required
for real oneshot clockevent devices. This should only handle the case
where the interrupt was already pending.

The ONESHOT_STOPPED state was introduced to handle the case where the
last timer gets canceled, so the already programmed event does not fire.

It was not necessarily meant to "fix" clockevent devices which are
pretending to be ONESHOT, but keep firing over and over.

That, said. I'm fine with the change along with a big fat comment why
this is required.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-04-22 15:53             ` Thomas Gleixner
  0 siblings, 0 replies; 1596+ messages in thread
From: Thomas Gleixner @ 2022-04-22 15:53 UTC (permalink / raw)
  To: Nicholas Piggin, Michael Ellerman, paulmck, Zhouyi Zhou
  Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
	linuxppc-dev

On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
> So we traced the problem down to possibly a misunderstanding between 
> decrementer clock event device and core code.
>
> The decrementer is only oneshot*ish*. It actually needs to either be 
> reprogrammed or shut down otherwise it just continues to cause 
> interrupts.

I always thought that PPC had sane timers. That's really disillusioning.

> Before commit 35de589cb879, it was sort of two-shot. The initial 
> interrupt at the programmed time would set its internal next_tb variable 
> to ~0 and call the ->event_handler(). If that did not set_next_event or 
> stop the timer, the interrupt will fire again immediately, notice 
> next_tb is ~0, and only then stop the decrementer interrupt.
>
> So that was already kind of ugly, this patch just turned it into a hang.
>
> The problem happens when the tick is stopped with an event still 
> pending, then tick_nohz_handler() is called, but it bails out because 
> tick_stopped == 1 so the device never gets programmed again, and so it 
> keeps firing.
>
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was 
> really oneshot, but we would like to avoid doing that because it requires 
> additional programming of the hardware on each timer interrupt. We have 
> the ONESHOT_STOPPED state which seems to be just about what we want.
>
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason 
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).

This was definitely overlooked, but it's arguable it is is not required
for real oneshot clockevent devices. This should only handle the case
where the interrupt was already pending.

The ONESHOT_STOPPED state was introduced to handle the case where the
last timer gets canceled, so the already programmed event does not fire.

It was not necessarily meant to "fix" clockevent devices which are
pretending to be ONESHOT, but keep firing over and over.

That, said. I'm fine with the change along with a big fat comment why
this is required.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-21 23:17   ` Re: Yury Norov
@ 2022-04-21 23:21     ` John Hubbard
  0 siblings, 0 replies; 1596+ messages in thread
From: John Hubbard @ 2022-04-21 23:21 UTC (permalink / raw)
  To: Yury Norov; +Cc: Andrew Morton, Minchan Kim, linux-mm, linux-kernel

On 4/21/22 16:17, Yury Norov wrote:
>> Let's add a Cc: line for Michan as well:
>>
>> Cc: Minchan Kim <minchan@kernel.org>
>   
> He's in CC already, I think...
>   

Here, I am talking about attribution in the commit log, as opposed
to the email Cc. In other words, I'm suggesting that you literally
add this line to the commit description:

Cc: Minchan Kim <minchan@kernel.org>


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-21 23:04 ` John Hubbard
  2022-04-21 23:09   ` Re: John Hubbard
@ 2022-04-21 23:17   ` Yury Norov
  2022-04-21 23:21     ` Re: John Hubbard
  1 sibling, 1 reply; 1596+ messages in thread
From: Yury Norov @ 2022-04-21 23:17 UTC (permalink / raw)
  To: John Hubbard; +Cc: Andrew Morton, Minchan Kim, linux-mm, linux-kernel

On Thu, Apr 21, 2022 at 04:04:44PM -0700, John Hubbard wrote:
> On 4/21/22 09:41, Yury Norov wrote:
> > Subject: [PATCH] mm/gup: fix comments to pin_user_pages_*()
> > 
> 
> Hi Yuri,
> 
> Thanks for picking this up. I have been distracted and didn't trust
> myself to focus on this properly, so it's good to have help!
> 
> IT/admin point: somehow the first line of the commit description didn't
> make it into an actual email subject. The subject line was blank when it
> arrived in my inbox, and the subject is in the body here instead. Not
> sure how that happened.
> 
> Maybe check your git-sendemail setup?
 
git-sendmail is OK. I just accidentally added empty line above Subject,
which broke format. My bad, sorry for this.
 
> > pin_user_pages API forces FOLL_PIN in gup_flags, which means that the
> > API requires struct page **pages to be provided (not NULL). However,
> > the comment to pin_user_pages() says:
> > 
> >      * @pages:      array that receives pointers to the pages pinned.
> >      *              Should be at least nr_pages long. Or NULL, if caller
> >      *              only intends to ensure the pages are faulted in.
> > 
> > This patch fixes comments along the pin_user_pages code, and also adds
> > WARN_ON(!pages), so that API users will have better understanding
> > on how to use it.
> 
> No need to quote the code in the commit log. Instead, just summarize.
> For example:
> 
> pin_user_pages API forces FOLL_PIN in gup_flags, which means that the
> API requires struct page **pages to be provided (not NULL). However, the
> comment to pin_user_pages() clearly allows for passing in a NULL @pages
> argument.
> 
> Remove the incorrect comments, and add WARN_ON_ONCE(!pages) calls to
> enforce the API.
> 
> > 
> > It has been independently spotted by Minchan Kim and confirmed with
> > John Hubbard:
> > 
> > https://lore.kernel.org/all/YgWA0ghrrzHONehH@google.com/
> 
> Let's add a Cc: line for Michan as well:
> 
> Cc: Minchan Kim <minchan@kernel.org>
 
He's in CC already, I think...
 
> > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> > ---
> >   mm/gup.c | 26 ++++++++++++++++++++++----
> >   1 file changed, 22 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/gup.c b/mm/gup.c
> > index f598a037eb04..559626457585 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -2871,6 +2871,10 @@ int pin_user_pages_fast(unsigned long start, int nr_pages,
> >   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> >   		return -EINVAL;
> > +	/* FOLL_PIN requires pages != NULL */
> 
> Please delete each and every one of these one-line comments, because
> they merely echo what the code says.

Sure.
 
> > +	if (WARN_ON_ONCE(!pages))
> > +		return -EINVAL;
> > +
> >   	gup_flags |= FOLL_PIN;
> >   	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> >   }
> > @@ -2893,6 +2897,10 @@ int pin_user_pages_fast_only(unsigned long start, int nr_pages,
> >   	 */
> >   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> >   		return 0;
> > +
> > +	/* FOLL_PIN requires pages != NULL */
> > +	if (WARN_ON_ONCE(!pages))
> > +		return 0;
> >   	/*
> >   	 * FOLL_FAST_ONLY is required in order to match the API description of
> >   	 * this routine: no fall back to regular ("slow") GUP.
> > @@ -2920,8 +2928,7 @@ EXPORT_SYMBOL_GPL(pin_user_pages_fast_only);
> >    * @nr_pages:	number of pages from start to pin
> >    * @gup_flags:	flags modifying lookup behaviour
> >    * @pages:	array that receives pointers to the pages pinned.
> > - *		Should be at least nr_pages long. Or NULL, if caller
> > - *		only intends to ensure the pages are faulted in.
> > + *		Should be at least nr_pages long.
> >    * @vmas:	array of pointers to vmas corresponding to each page.
> >    *		Or NULL if the caller does not require them.
> >    * @locked:	pointer to lock flag indicating whether lock is held and
> > @@ -2944,6 +2951,10 @@ long pin_user_pages_remote(struct mm_struct *mm,
> >   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> >   		return -EINVAL;
> > +	/* FOLL_PIN requires pages != NULL */
> > +	if (WARN_ON_ONCE(!pages))
> > +		return -EINVAL;
> > +
> >   	gup_flags |= FOLL_PIN;
> >   	return __get_user_pages_remote(mm, start, nr_pages, gup_flags,
> >   				       pages, vmas, locked);
> > @@ -2957,8 +2968,7 @@ EXPORT_SYMBOL(pin_user_pages_remote);
> >    * @nr_pages:	number of pages from start to pin
> >    * @gup_flags:	flags modifying lookup behaviour
> >    * @pages:	array that receives pointers to the pages pinned.
> > - *		Should be at least nr_pages long. Or NULL, if caller
> > - *		only intends to ensure the pages are faulted in.
> > + *		Should be at least nr_pages long.
> >    * @vmas:	array of pointers to vmas corresponding to each page.
> >    *		Or NULL if the caller does not require them.
> >    *
> > @@ -2976,6 +2986,10 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
> >   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> >   		return -EINVAL;
> > +	/* FOLL_PIN requires pages != NULL */
> > +	if (WARN_ON_ONCE(!pages))
> > +		return -EINVAL;
> > +
> >   	gup_flags |= FOLL_PIN;
> >   	return __gup_longterm_locked(current->mm, start, nr_pages,
> >   				     pages, vmas, gup_flags);
> > @@ -2994,6 +3008,10 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
> >   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> >   		return -EINVAL;
> > +	/* FOLL_PIN requires pages != NULL */
> > +	if (WARN_ON_ONCE(!pages))
> > +		return -EINVAL;
> > +
> >   	gup_flags |= FOLL_PIN;
> >   	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
> >   }
> 
> I hope we don't break any callers with the newly enforced !pages, but it's
> the right thing to do, in order to avoid misunderstandings.
> 
> thanks,
> -- 
> John Hubbard
> NVIDIA

Let me test v2 and resend shortly.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-21 23:04 ` John Hubbard
@ 2022-04-21 23:09   ` John Hubbard
  2022-04-21 23:17   ` Re: Yury Norov
  1 sibling, 0 replies; 1596+ messages in thread
From: John Hubbard @ 2022-04-21 23:09 UTC (permalink / raw)
  To: Yury Norov, Andrew Morton, Minchan Kim, linux-mm, linux-kernel

On 4/21/22 16:04, John Hubbard wrote:
> On 4/21/22 09:41, Yury Norov wrote:
>> Subject: [PATCH] mm/gup: fix comments to pin_user_pages_*()
>>
> 
> Hi Yuri,

...and I see that I have typo'd both Yury's and Minchan's name (further
down), in the same email!

Really apologize for screwing that up. It's Yury-with-a-"y", I know. :)


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-21 16:41 Yury Norov
@ 2022-04-21 23:04 ` John Hubbard
  2022-04-21 23:09   ` Re: John Hubbard
  2022-04-21 23:17   ` Re: Yury Norov
  0 siblings, 2 replies; 1596+ messages in thread
From: John Hubbard @ 2022-04-21 23:04 UTC (permalink / raw)
  To: Yury Norov, Andrew Morton, Minchan Kim, linux-mm, linux-kernel

On 4/21/22 09:41, Yury Norov wrote:
> Subject: [PATCH] mm/gup: fix comments to pin_user_pages_*()
> 

Hi Yuri,

Thanks for picking this up. I have been distracted and didn't trust
myself to focus on this properly, so it's good to have help!

IT/admin point: somehow the first line of the commit description didn't
make it into an actual email subject. The subject line was blank when it
arrived in my inbox, and the subject is in the body here instead. Not
sure how that happened.

Maybe check your git-sendemail setup?


> pin_user_pages API forces FOLL_PIN in gup_flags, which means that the
> API requires struct page **pages to be provided (not NULL). However,
> the comment to pin_user_pages() says:
> 
>      * @pages:      array that receives pointers to the pages pinned.
>      *              Should be at least nr_pages long. Or NULL, if caller
>      *              only intends to ensure the pages are faulted in.
> 
> This patch fixes comments along the pin_user_pages code, and also adds
> WARN_ON(!pages), so that API users will have better understanding
> on how to use it.

No need to quote the code in the commit log. Instead, just summarize.
For example:

pin_user_pages API forces FOLL_PIN in gup_flags, which means that the
API requires struct page **pages to be provided (not NULL). However, the
comment to pin_user_pages() clearly allows for passing in a NULL @pages
argument.

Remove the incorrect comments, and add WARN_ON_ONCE(!pages) calls to
enforce the API.

> 
> It has been independently spotted by Minchan Kim and confirmed with
> John Hubbard:
> 
> https://lore.kernel.org/all/YgWA0ghrrzHONehH@google.com/

Let's add a Cc: line for Michan as well:

Cc: Minchan Kim <minchan@kernel.org>

> 
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
>   mm/gup.c | 26 ++++++++++++++++++++++----
>   1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index f598a037eb04..559626457585 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2871,6 +2871,10 @@ int pin_user_pages_fast(unsigned long start, int nr_pages,
>   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>   		return -EINVAL;
>   
> +	/* FOLL_PIN requires pages != NULL */

Please delete each and every one of these one-line comments, because
they merely echo what the code says.

> +	if (WARN_ON_ONCE(!pages))
> +		return -EINVAL;
> +
>   	gup_flags |= FOLL_PIN;
>   	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
>   }
> @@ -2893,6 +2897,10 @@ int pin_user_pages_fast_only(unsigned long start, int nr_pages,
>   	 */
>   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>   		return 0;
> +
> +	/* FOLL_PIN requires pages != NULL */
> +	if (WARN_ON_ONCE(!pages))
> +		return 0;
>   	/*
>   	 * FOLL_FAST_ONLY is required in order to match the API description of
>   	 * this routine: no fall back to regular ("slow") GUP.
> @@ -2920,8 +2928,7 @@ EXPORT_SYMBOL_GPL(pin_user_pages_fast_only);
>    * @nr_pages:	number of pages from start to pin
>    * @gup_flags:	flags modifying lookup behaviour
>    * @pages:	array that receives pointers to the pages pinned.
> - *		Should be at least nr_pages long. Or NULL, if caller
> - *		only intends to ensure the pages are faulted in.
> + *		Should be at least nr_pages long.
>    * @vmas:	array of pointers to vmas corresponding to each page.
>    *		Or NULL if the caller does not require them.
>    * @locked:	pointer to lock flag indicating whether lock is held and
> @@ -2944,6 +2951,10 @@ long pin_user_pages_remote(struct mm_struct *mm,
>   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>   		return -EINVAL;
>   
> +	/* FOLL_PIN requires pages != NULL */
> +	if (WARN_ON_ONCE(!pages))
> +		return -EINVAL;
> +
>   	gup_flags |= FOLL_PIN;
>   	return __get_user_pages_remote(mm, start, nr_pages, gup_flags,
>   				       pages, vmas, locked);
> @@ -2957,8 +2968,7 @@ EXPORT_SYMBOL(pin_user_pages_remote);
>    * @nr_pages:	number of pages from start to pin
>    * @gup_flags:	flags modifying lookup behaviour
>    * @pages:	array that receives pointers to the pages pinned.
> - *		Should be at least nr_pages long. Or NULL, if caller
> - *		only intends to ensure the pages are faulted in.
> + *		Should be at least nr_pages long.
>    * @vmas:	array of pointers to vmas corresponding to each page.
>    *		Or NULL if the caller does not require them.
>    *
> @@ -2976,6 +2986,10 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
>   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>   		return -EINVAL;
>   
> +	/* FOLL_PIN requires pages != NULL */
> +	if (WARN_ON_ONCE(!pages))
> +		return -EINVAL;
> +
>   	gup_flags |= FOLL_PIN;
>   	return __gup_longterm_locked(current->mm, start, nr_pages,
>   				     pages, vmas, gup_flags);
> @@ -2994,6 +3008,10 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>   	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>   		return -EINVAL;
>   
> +	/* FOLL_PIN requires pages != NULL */
> +	if (WARN_ON_ONCE(!pages))
> +		return -EINVAL;
> +
>   	gup_flags |= FOLL_PIN;
>   	return get_user_pages_unlocked(start, nr_pages, pages, gup_flags);
>   }

I hope we don't break any callers with the newly enforced !pages, but it's
the right thing to do, in order to avoid misunderstandings.

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-04-14 22:53 Alex Bennée
  0 siblings, 0 replies; 1596+ messages in thread
From: Alex Bennée @ 2022-04-14 22:53 UTC (permalink / raw)
  To: Eric Auger
  Cc: Laurent Vivier, Thomas Huth, slp, mathieu.poirier, mst,
	viresh.kumar, qemu-devel, stefanha, marcandre.lureau,
	Paolo Bonzini


Eric Auger <eauger@redhat.com> writes:

> Hi Alex,
>
> On 4/7/22 5:00 PM, Alex Bennée wrote:
>> When trying to work out what the virtio-net-tests where doing it was
>> hard because the g_test_trap_subprocess redirects all output to
>> /dev/null. Lift this restriction by using the appropriate flags so you
>> can see something similar to what the vhost-user-blk tests show when
>> running.
>> 
>> While we are at it remove the g_test_verbose() check so we always show
>> how the QEMU is run.
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>  tests/qtest/qos-test.c | 7 +++----
>>  1 file changed, 3 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tests/qtest/qos-test.c b/tests/qtest/qos-test.c
>> index f97d0a08fd..c6c196cc95 100644
>> --- a/tests/qtest/qos-test.c
>> +++ b/tests/qtest/qos-test.c
>> @@ -89,9 +89,7 @@ static void qos_set_machines_devices_available(void)
>>  
>>  static void restart_qemu_or_continue(char *path)
>>  {
>> -    if (g_test_verbose()) {
>> -        qos_printf("Run QEMU with: '%s'\n", path);
>> -    }
>> +    qos_printf("Run QEMU with: '%s'\n", path);
>>      /* compares the current command line with the
>>       * one previously executed: if they are the same,
>>       * don't restart QEMU, if they differ, stop previous
>> @@ -185,7 +183,8 @@ static void run_one_test(const void *arg)
>>  static void subprocess_run_one_test(const void *arg)
>>  {
>>      const gchar *path = arg;
>> -    g_test_trap_subprocess(path, 0, 0);
>> +    g_test_trap_subprocess(path, 0,
>> +                           G_TEST_SUBPROCESS_INHERIT_STDOUT | G_TEST_SUBPROCESS_INHERIT_STDERR);
> While workling on libqos/pci tests on aarch64 I also did that but I
> noticed there were a bunch of errors such as:
>
> /aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/multiqueue:
> qemu-system-aarch64: Failed to set msg fds.
> qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid
> argument (22)
> qemu-system-aarch64: Failed to set msg fds.
> qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid
> argument (22)
> qemu-system-aarch64: Failed to set msg fds.
> qemu-system-aarch64: vhost VQ 2 ring restore failed: -22: Invalid
> argument (22)
> qemu-system-aarch64: Failed to set msg fds.
> qemu-system-aarch64: vhost VQ 3 ring restore failed: -22: Invalid
> argument (22)
>
> I see those also when running with x86_64-softmmu/qemu-system-x86_64
> (this is no aarch64 specific).

I think this is a symptom of an unclean tear down (which might be too
much to ask for our fake vhost backend) or something we should handle
better. I still have to get my gpio test working so I'll have a look
tomorrow.


>
> I don't know if it is an issue to get those additional errors?
>
> Thanks
>
> Eric
>
>>      g_test_trap_assert_passed();
>>  }
>>  
>> 


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-03-29  8:35                   ` Re: Thomas Gleixner
@ 2022-04-12  6:55                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 1596+ messages in thread
From: Michael S. Tsirkin @ 2022-04-12  6:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization

On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might talk different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
> 
> That's a complete bogus example especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
> 
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> 
> There is nothing to fix:
> 
> request_irq()
>    raw_spin_lock_irq(desc->lock);       // ACQUIRE
>    ....
>    raw_spin_unlock_irq(desc->lock);     // RELEASE
> 
> interrupt()
>    raw_spin_lock(desc->lock);           // ACQUIRE
>    set status to IN_PROGRESS
>    raw_spin_unlock(desc->lock);         // RELEASE
>    invoke handler()
> 
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> 
> >> What happens if an interrupt is raised in the middle like:
> >> 
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchornize_irq()
> 
> This is bogus. The obvious order of things is:
> 
>     dev->ok = false;
>     request_irq();
> 
>     moar_setup();
>     synchronize_irq();  // ACQUIRE + RELEASE
>     dev->ok = true;
> 
> The reverse operation on teardown:
> 
>     dev->ok = false;
>     synchronize_irq();  // ACQUIRE + RELEASE
> 
>     teardown();
> 
> So in both cases a simple check in the handler is sufficient:
> 
> handler()
>     if (!dev->ok)
>     	return;

Does this need to be if (!READ_ONCE(dev->ok)) ?



> I'm not understanding what you folks are trying to "fix" here. If any
> driver does this in the wrong order, then the driver is broken.
> 
> Sure, you can do the same with:
> 
>     dev->ok = false;
>     request_irq();
>     moar_setup();
>     smp_wmb();
>     dev->ok = true;
> 
> for the price of a smp_rmb() in the interrupt handler:
> 
> handler()
>     if (!dev->ok)
>     	return;
>     smp_rmb();
> 
> but that's only working for the setup case correctly and not for
> teardown.
> 
> Thanks,
> 
>         tglx

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-04-12  6:55                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 1596+ messages in thread
From: Michael S. Tsirkin @ 2022-04-12  6:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jason Wang, virtualization, linux-kernel, Marc Zyngier,
	Peter Zijlstra, Stefano Garzarella, Keir Fraser,
	Paul E. McKenney

On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might talk different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
> 
> That's a complete bogus example especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
> 
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> 
> There is nothing to fix:
> 
> request_irq()
>    raw_spin_lock_irq(desc->lock);       // ACQUIRE
>    ....
>    raw_spin_unlock_irq(desc->lock);     // RELEASE
> 
> interrupt()
>    raw_spin_lock(desc->lock);           // ACQUIRE
>    set status to IN_PROGRESS
>    raw_spin_unlock(desc->lock);         // RELEASE
>    invoke handler()
> 
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> 
> >> What happens if an interrupt is raised in the middle like:
> >> 
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchornize_irq()
> 
> This is bogus. The obvious order of things is:
> 
>     dev->ok = false;
>     request_irq();
> 
>     moar_setup();
>     synchronize_irq();  // ACQUIRE + RELEASE
>     dev->ok = true;
> 
> The reverse operation on teardown:
> 
>     dev->ok = false;
>     synchronize_irq();  // ACQUIRE + RELEASE
> 
>     teardown();
> 
> So in both cases a simple check in the handler is sufficient:
> 
> handler()
>     if (!dev->ok)
>     	return;

Does this need to be if (!READ_ONCE(dev->ok)) ?



> I'm not understanding what you folks are trying to "fix" here. If any
> driver does this in the wrong order, then the driver is broken.
> 
> Sure, you can do the same with:
> 
>     dev->ok = false;
>     request_irq();
>     moar_setup();
>     smp_wmb();
>     dev->ok = true;
> 
> for the price of a smp_rmb() in the interrupt handler:
> 
> handler()
>     if (!dev->ok)
>     	return;
>     smp_rmb();
> 
> but that's only working for the setup case correctly and not for
> teardown.
> 
> Thanks,
> 
>         tglx


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-04-06  7:51 Christian König
@ 2022-04-06 12:59 ` Daniel Vetter
  0 siblings, 0 replies; 1596+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:59 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, dri-devel

On Wed, Apr 06, 2022 at 09:51:16AM +0200, Christian König wrote:
> Hi Daniel,
> 
> rebased on top of all the changes in drm-misc-next now and hopefully
> ready for 5.19.
> 
> I think I addressed all concern, but there was a bunch of rebase fallout
> from i915, so better to double check that once more.

No idea what you managed to do with this series, but
- cover letter isn't showing up in archives
- you have Reply-To: DMA-resvusage sprinkled all over, which means my
  replies are bouncing in funny ways

Please fix for next time around.

Also the split up patches lack a bit cc.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
  2022-03-30  5:14                         ` Re: Michael S. Tsirkin
@ 2022-03-30  5:53                           ` Jason Wang
  -1 siblings, 0 replies; 1596+ messages in thread
From: Jason Wang @ 2022-03-30  5:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, linux-kernel, Marc Zyngier, Thomas Gleixner,
	Peter Zijlstra, Stefano Garzarella, Keir Fraser,
	Paul E. McKenney

On Wed, Mar 30, 2022 at 1:14 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> > On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > > > broken,
> > > > > > > >
> > > > > > > > So I think we might talk different issues:
> > > > > > > >
> > > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > > guarantee this though there seems no documentation around
> > > > > > > > request_irq() to say this.
> > > > > > > >
> > > > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > > using smp_wmb() before the request_irq().
> > > > > > > >
> > > > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > > > paired with that.
> > > > >
> > > > > IMO it synchronizes with the CPU to which irq is
> > > > > delivered. Otherwise basically all drivers would be broken,
> > > > > wouldn't they be?
> > > >
> > > > I guess it's because most of the drivers don't care much about the
> > > > buggy/malicious device.  And most of the devices may require an extra
> > > > step to enable device IRQ after request_irq(). Or it's the charge of
> > > > the driver to do the synchronization.
> > >
> > > It is true that the use-case of malicious devices is somewhat boutique.
> > > But I think most drivers do want to have their hotplug routines to be
> > > robust, yes.
> > >
> > > > > I don't know whether it's correct on all platforms, but if not
> > > > > we need to fix request_irq.
> > > > >
> > > > > > > >
> > > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > > virtio.
> > > > > > > >
> > > > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > > > virtio_device_ready():
> > > > > > > >
> > > > > > > > request_irq()
> > > > > > > > driver specific setups
> > > > > > > > virtio_device_ready()
> > > > > > > >
> > > > > > > > CPU 0 probe) request_irq()
> > > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > > CPU 0 probe) driver specific setups
> > > > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > > >
> > > > > > Only the interrupt after synchronize_irq() returns.
> > > > >
> > > > > Anything else is a buggy device though.
> > > >
> > > > Yes, but the goal of this patch is to prevent the possible attack from
> > > > buggy(malicious) devices.
> > >
> > > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > > false even though it's actually initialized, that is not a deal breaker
> > > as that does not open us up to an attack.
> > >
> > > > >
> > > > > > >this
> > > > > > > is what it's point is.
> > > > > >
> > > > > > What happens if an interrupt is raised in the middle like:
> > > > > >
> > > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > > IRQ handler
> > > > > > synchornize_irq()
> > > > > >
> > > > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > > > uninitialized variable.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > IMHO variables should be initialized before request_irq
> > > > > to a value meaning "not a valid interrupt".
> > > > > Specifically driver_ok = false.
> > > > > Handler in the scenario you describe will then see !driver_ok
> > > > > and exit immediately.
> > > >
> > > > So just to make sure we're on the same page.
> > > >
> > > > 1) virtio_reset_device() will set the driver_ok to false;
> > > > 2) virtio_device_ready() will set the driver_ok to true
> > > >
> > > > So for virtio drivers, it often did:
> > > >
> > > > 1) virtio_reset_device()
> > > > 2) find_vqs() which will call request_irq()
> > > > 3) other driver specific setups
> > > > 4) virtio_device_ready()
> > > >
> > > > In virtio_device_ready(), the patch perform the following currently:
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > set_status(DRIVER_OK);
> > > >
> > > > Per your suggestion, to add synchronize_irq() after
> > > > smp_store_release() so we had
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > synchornize_irq()
> > > > set_status(DRIVER_OK)
> > > >
> > > > Suppose there's a interrupt raised before the synchronize_irq(), if we do:
> > > >
> > > > if (READ_ONCE(driver_ok)) {
> > > >       vq->callback()
> > > > }
> > > >
> > > > It will see the driver_ok as true but how can we make sure
> > > > vq->callback sees the driver specific setups (3) above?
> > > >
> > > > And an example is virtio_scsi():
> > > >
> > > > virtio_reset_device()
> > > > virtscsi_probe()
> > > >     virtscsi_init()
> > > >         virtio_find_vqs()
> > > >         ...
> > > >         virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > > >     ....
> > > >     virtio_device_ready()
> > > >
> > > > In virtscsi_event_done():
> > > >
> > > > virtscsi_event_done():
> > > >     virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> > > >
> > > > We need to make sure the even_done reads driver_ok before read vscsi->event_vq.
> > > >
> > > > Thanks
> > >
> > >
> > > See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> > > it's all under a lock.
> >
> > Ordered through ACQUIRE+RELEASE actually since the irq handler is not
> > running under the lock.
> >
> > Another question, for synchronize_irq() do you prefer
> >
> > 1) transport specific callbacks
> > or
> > 2) a simple synchornize_rcu()
> >
> > Thanks
>
>
> 1) I think, and I'd add a wrapper so we can switch to 2 if we really
> want to. But for now synchronizing the specific irq is obviously designed to
> make any changes to memory visible to this irq. that
> seems cleaner and easier to understand than memory ordering tricks
> and relying on side effects of synchornize_rcu, even though
> internally this all boils down to memory ordering since
> memory is what's used to implement locks :).
> Not to mention, synchronize_irq just scales much better from performance
> POV.

Ok. Let me try to do that in V2.

Thanks

>
>
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > We use smp_store_relase()
> > > > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > > > which surprises me.
> > > > > > > > > > > > >
> > > > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > > > >
> > > > > > > > > > > > Ok.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > > >    * */
> > > > > > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > > > > though it's most likely is ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > And it has the comment for explain the barrier:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > You are right. So then
> > > > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > > >    READ_ONCE should do.
> > > > > > > > > > > >
> > > > > > > > > > > > See above.
> > > > > > > > > > > >
> > > > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Do you mean:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > > > > > > interrupt handlers
> > > > > > > > > > > > >
> > > > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > > > >
> > > > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > > > > spinlock or others.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > > > for old hypervisors
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > > > >
> > > > > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > > > > > > codes. They look all fine since day0.
> > > > > > > > > > > >
> > > > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > > > > > in the debug build).
> > > > > > > > > > > >
> > > > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > > > >
> > > > > > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > > > > > read back for validation in many cases.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > > >    * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > > > >    * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > > > >    * @dev: underlying device.
> > > > > > > > > > > > > > > >    * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > > > >           bool failed;
> > > > > > > > > > > > > > > >           bool config_enabled;
> > > > > > > > > > > > > > > >           bool config_change_pending;
> > > > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > > > >           spinlock_t config_lock;
> > > > > > > > > > > > > > > >           spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > > > >           struct device dev;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > > > >           return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > > > +         return true;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > > > +  * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > > > paired
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Will fix.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +  * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > > > +  * virtio_reset_device().
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >   /**
> > > > > > > > > > > > > > > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > > > >    * @vdev: the device
> > > > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > > > >           if (dev->config->enable_cbs)
> > > > > > > > > > > > > > > >                     dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > > > +  * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > > > > +  * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > >           dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>


^ permalink raw reply	[flat|nested] 1596+ messages in thread

* Re:
@ 2022-03-30  5:53                           ` Jason Wang
  0 siblings, 0 replies; 1596+ messages in thread
From: Jason Wang @ 2022-03-30  5:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
	linux-kernel, virtualization, Thomas Gleixner

On Wed, Mar 30, 2022 at 1:14 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> > On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > > > broken,
> > > > > > > >
> > > > > > > > So I think we might talk different issues:
> > > > > > > >
> > > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > > guarantee this though there seems no documentation around
> > > > > > > > request_irq() to say this.
> > > > > > > >
> > > > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > > using smp_wmb() before the request_irq().
> > > > > > > >
> > > > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > > > paired with that.
> > > > >
> > > > > IMO it synchronizes with the CPU to which irq is
> > > > > delivered. Otherwise basically all drivers would be broken,
> > > > > wouldn't they be?
> > > >
> > > > I guess it's because most of the drivers don't care much about the
> > > > buggy/malicious device.  And most of the devices may require an extra
> > > > step to enable device IRQ after request_irq(). Or it's the charge of
> > > > the driver to do the synchronization.
> > >
> > > It is true that the use-case of malicious devices is somewhat boutique.
> > > But I think most drivers do want to have their hotplug routines to be
> > > robust, yes.
> > >
> > > > > I don't know whether it's correct on all platforms, but if not
> > > > > we need to fix request_irq.
> > > > >
> > > > > > > >
> > > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > > virtio.
> > > > > > > >
> > > > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > > > virtio_device_ready():
> > > > > > > >
> > > > > > > > request_irq()
> > > > > > > > driver specific setups
> > > > > > > > virtio_device_ready()
> > > > > > > >
> > > > > > > > CPU 0 probe) request_irq()
> > > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > > CPU 0 probe) driver specific setups
> > > > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > > >
> > > > > > Only the interrupt after synchronize_irq() returns.
> > > > >
> > > > > Anything else is a buggy device though.
> > > >
> > > > Yes, but the goal of this patch is to prevent the possible attack from
> > > > buggy(malicious) devices.
> > >
> > > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > > false even though it's actually initialized, that is not a deal breaker
> > > as that does not open us up to an attack.
> > >
> > > > >
> > > > > > >this
> > > > > > > is what it's point is.
> > > > > >
> > > > > > What happens if an interrupt is raised in the middle like:
> > > > > >
> > > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > > IRQ handler
> > > > > > synchornize_irq()
> > > > > >
> > > > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > > > uninitialized variable.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > IMHO variables should be initialized before request_irq
> > > > > to a value meaning "not a valid interrupt".
> > > > > Specifically driver_ok = false.
> > > > > Handler in the scenario you describe will then see !driver_ok
> > > > > and exit immediately.
> > > >
> > > > So just to make sure we're on the same page.
> > > >
> > > > 1) virtio_reset_device() will set the driver_ok to false;
> > > > 2) virtio_device_ready() will set the driver_ok to true
> > > >
> > > > So for virtio drivers, it often did:
> > > >
> > > > 1) virtio_reset_device()
> > > > 2) find_vqs() which will call request_irq()
> > > > 3) other driver specific setups
> > > > 4) virtio_device_ready()
> > > >
> > > > In virtio_device_ready(), the patch perform the following currently:
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > set_status(DRIVER_OK);
> > > >
> > > > Per your suggestion, to add synchronize_irq() after
> > > > smp_store_release() so we had
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > synchornize_irq()
> > > > set_status(DRIVER_OK)
> > > >
> > > > Suppose there's a interrupt raised before the synchronize_irq(), if we do:
> > > >
> > > > if (READ_ONCE(driver_ok)) {
> > > >       vq->callback()
> > > > }
> > > >
> > > > It will see the driver_ok as true but how can we make sure
> > > > vq->callback sees the driver specific setups (3) above?
> > > >
> > > > And an example is virtio_scsi():
> > > >
> > > > virtio_reset_device()
> > > > virtscsi_probe()
> > > >     virtscsi_init()
> > > >         virtio_find_vqs()
> > > >         ...
> > > >         virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > > >     ....
> > > >     virtio_device_ready()
> > > >
> > > > In virtscsi_event_done():
> > > >
> > > > virtscsi_event_done():
> > > >     virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> > > >
> > > > We need to make sure the even_done reads driver_ok before read vscsi->event_vq.
> > > >
> > > > Thanks
> > >
> > >
> > > See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> > > it's all under a lock.
> >
> > Ordered through ACQUIRE+RELEASE actually since the irq handler is not
> > running under the lock.
> >
> > Another question, for synchronize_irq() do you prefer
> >
> > 1) transport specific callbacks
> > or
> > 2) a simple synchornize_rcu()
> >
> > Thanks
>
>
> 1) I think, and I'd add a wrapper so we can switch to 2 if we really
> want to. But for now synchronizing the specific irq is obviously designed to
> make any changes to memory visible to this irq. that
> seems cleaner and easier to understand than memory ordering tricks
> and relying on side effects of synchornize_rcu, even though
> internally this all boils down to memory ordering since
> memory is what's used to implement locks :).
> Not to mention, synchronize_irq just scales much better from performance
> POV.

Ok. Let me try to do that in V2.

Thanks

>
>
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > We use smp_store_relase()
> > > > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > > > which surprises me.
> > > > > > > > > > > > >
> > > > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >     This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > > >     that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > > >     the callback could race with driver-specific initialization.
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > > > >
> > > > > > > > > > > > Ok.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > > > > > > > > > > > > > > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > > > > > > > > > > > > > > >   include/linux/virtio.h        |  4 ++++
> > > > > > > > > > > > > > > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > > >   #include <linux/of.h>
> > > > > > > > > > > > > > > >   #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > > > +          "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >   /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > > >   static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > > >    * */
> > > > > > > > > > > > > > > >   void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > +  * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > > > +  * interrupt for this line arriving after
> > > > > > > > > > > > > > > > +  * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > > > +  * irq_soft_enabled == false.
> > > > > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > > > > though it's most likely is ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > > >  * and rcu_read_unlock(), and may be nested.  In addition, but only in
> > > > > > > > > > > > > >  * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > > >  * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > > >  * sections.  This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > > >  * and NMI handlers.
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > And it has the comment for explain the barrier:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > > >  * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > > >  * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > > >  * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > > >  * preceded the call to synchronize_rcu().  In addition, each CPU having
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > You are right. So then
> > > > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > > >    READ_ONCE should do.
> > > > > > > > > > > >
> > > > > > > > > > > > See above.
> > > > > > > > > > > >
> > > > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +  */
> > > > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           dev->config->reset(dev);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > >           spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > > >           dev->config_enabled = false;
> > > > > > > > > > > > > > > >           dev->config_change_pending = false;
> > > > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > > > +         dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > > >           /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > > >            * driver messed it up.  This also tests that code path a little. */
> > > > > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Do you mean:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > > > > > > interrupt handlers
> > > > > > > > > > > > >
> > > > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > > > >
> > > > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > > > > spinlock or others.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > > > for old hypervisors
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > > > >
> > > > > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > > > > > > codes. They look all fine since day0.
> > > > > > > > > > > >
> > > > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there  (at least
> > > > > > > > > > > > > in the debug build).
> > > > > > > > > > > >
> > > > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > > > >
> > > > > > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > > > > > read back for validation in many cases.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > > >           return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > > >   {
> > > > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > > >           struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > > > +         dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > > > +         return IRQ_NONE;
> > > > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           if (!more_used(vq)) {
> > > > > > > > > > > > > > > >                   pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > > >                   return IRQ_NONE;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > > >    * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > > >    * @config_change_pending: configuration