From: Greg Bellows
Date: Wed, 28 Jan 2015 15:57:47 -0600
Subject: Re: [Qemu-devel] [PATCH 04/11] target-arm: Define correct mmu_idx values and pass them in TB flags
In-Reply-To: <1422037228-5363-5-git-send-email-peter.maydell@linaro.org>
To: Peter Maydell
Cc: "Edgar E. Iglesias", Andrew Jones, Alex Bennée, QEMU Developers, Patch Tracking


On Fri, Jan 23, 2015 at 12:20 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> We currently claim that for ARM the mmu_idx should simply be the current
> exception level. However this isn't actually correct -- secure EL0 and EL1
> should have separate indexes from non-secure EL0 and EL1 since their
> VA->PA mappings may differ. We also will want an index for stage 2
> translations when we properly support EL2.
>
> Define and document all seven mmu index values that we require, and
> pass the mmu index in the TB flags rather than exception level or
> priv/user bit.
>
> This change doesn't update the get_phys_addr() code, so our page
> table walking still assumes a simplistic "user or priv?" model for
> the moment.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> This leaves some odd gaps in the TB flags usage. I will circle
> back and clean this up later (including moving the other common
> flags like the singlestep ones to the top of the flags word),
> but I didn't want to bloat this patchseries further.
> ---
>  target-arm/cpu.h           | 113 +++++++++++++++++++++++++++++++++++---------
>  target-arm/helper.c        |   3 +-
>  target-arm/translate-a64.c |   5 +-
>  target-arm/translate.c     |   5 +-
>  target-arm/translate.h     |   3 +-
>  5 files changed, 101 insertions(+), 28 deletions(-)
>
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index 3eb00f4..cf7b9ab 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -98,7 +98,7 @@ typedef uint32_t ARMReadCPFunc(void *opaque, int cp_info,
>
>  struct arm_boot_info;
>
> -#define NB_MMU_MODES 4
> +#define NB_MMU_MODES 7
>
>  /* We currently assume float and double are IEEE single and double
>     precision respectively.
> @@ -1572,13 +1572,92 @@ static inline CPUARMState *cpu_init(const char *cpu_model)
>  #define cpu_signal_handler cpu_arm_signal_handler
>  #define cpu_list arm_cpu_list
>
> -/* MMU modes definitions */
> +/* ARM has the following "translation regimes" (as the ARM ARM calls them):
> + *
> + * If EL3 is 64-bit:
> + *  + NonSecure EL1 & 0 stage 1
> + *  + NonSecure EL1 & 0 stage 2
> + *  + NonSecure EL2
> + *  + Secure EL1 & EL0
> + *  + Secure EL3
> + * If EL3 is 32-bit:
> + *  + NonSecure PL1 & 0 stage 1
> + *  + NonSecure PL1 & 0 stage 2
> + *  + NonSecure PL2
> + *  + Secure PL0 & PL1
> + * (reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)
> + *
> + * For QEMU, an mmu_idx is not quite the same as a translation regime because:
> + *  1. we need to split the "EL1 & 0" regimes into two mmu_idxes, because they
> + *     may differ in access permissions even if the VA->PA map is the same
> + *  2. we want to cache in our TLB the full VA->IPA->PA lookup for a stage 1+2
> + *     translation, which means that we have one mmu_idx that deals with two
> + *     concatenated translation regimes [this sort of combined s1+2 TLB is
> + *     architecturally permitted]
> + *  3. we don't need to allocate an mmu_idx to translations that we won't be
> + *     handling via the TLB. The only way to do a stage 1 translation without
> + *     the immediate stage 2 translation is via the ATS or AT system insns,
> + *     which can be slow-pathed and always do a page table walk.
> + *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
> + *     translation regimes, because they map reasonably well to each other
> + *     and they can't both be active at the same time.
> + * This gives us the following list of mmu_idx values:
> + *
> + * NS EL0 (aka NS PL0) stage 1+2
> + * NS EL1 (aka NS PL1) stage 1+2
> + * NS EL2 (aka NS PL2)
> + * S EL3 (aka S PL1)
> + * S EL0 (aka S PL0)
> + * S EL1 (not used if EL3 is 32 bit)
> + * NS EL0+1 stage 2
> + *
> + * (The last of these is an mmu_idx because we want to be able to use the TLB
> + * for the accesses done as part of a stage 1 page table walk, rather than
> + * having to walk the stage 2 page table over and over.)
> + *
> + * Our enumeration includes at the end some entries which are not "true"
> + * mmu_idx values in that they don't have corresponding TLBs and are only
> + * valid for doing slow path page table walks.
> + *
> + * The constant names here are patterned after the general style of the names
> + * of the AT/ATS operations.
> + * The values used are carefully arranged to make mmu_idx => EL lookup easy.
> + */
> +typedef enum ARMMMUIdx {
> +    ARMMMUIdx_S12NSE0 = 0,
> +    ARMMMUIdx_S12NSE1 = 1,
> +    ARMMMUIdx_S1E2 = 2,
> +    ARMMMUIdx_S1E3 = 3,
> +    ARMMMUIdx_S1SE0 = 4,
> +    ARMMMUIdx_S1SE1 = 5,
> +    ARMMMUIdx_S2NS = 6,
> +    /* Indexes below here don't have TLBs and are used only for AT system
> +     * instructions or for the first stage of an S12 page table walk.
> +     */
> +    ARMMMUIdx_S1NSE0 = 7,
> +    ARMMMUIdx_S1NSE1 = 8,
> +} ARMMMUIdx;
> +
>  #define MMU_MODE0_SUFFIX _user
>  #define MMU_MODE1_SUFFIX _kernel
>  #define MMU_USER_IDX 0
> +
> +/* Return the exception level we're running at if this is our mmu_idx */
> +static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
> +{
> +    assert(mmu_idx < ARMMMUIdx_S2NS);
> +    return mmu_idx & 3;
> +}
> +
> +/* Determine the current mmu_idx to use for normal loads/stores */
>  static inline int cpu_mmu_index (CPUARMState *env)
>  {
> -    return arm_current_el(env);
> +    int el = arm_current_el(env);
> +
> +    if (el < 3 && arm_is_secure_below_el3(env)) {
> +        return ARMMMUIdx_S1SE0 + el;
> +    }
> +    return el;
>  }
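Just to make sure I follow the "carefully arranged to make mmu_idx => EL lookup easy" part: for the first six index values the EL is simply the low two bits, and the secure variants sit at ARMMMUIdx_S1SE0 + el. A standalone sketch I used to convince myself (my own scratch code, not part of the patch):

    /* Sketch: check the mmu_idx <-> EL arrangement described above. */
    #include <assert.h>

    enum { S12NSE0 = 0, S12NSE1 = 1, S1E2 = 2, S1E3 = 3, S1SE0 = 4, S1SE1 = 5 };

    static int idx_to_el(int mmu_idx)
    {
        return mmu_idx & 3;               /* same trick as arm_mmu_idx_to_el() */
    }

    int main(void)
    {
        assert(idx_to_el(S12NSE0) == 0);  /* NS EL0 */
        assert(idx_to_el(S12NSE1) == 1);  /* NS EL1 */
        assert(idx_to_el(S1E2) == 2);     /* NS EL2 */
        assert(idx_to_el(S1E3) == 3);     /* Secure EL3 */
        assert(idx_to_el(S1SE0) == 0);    /* Secure EL0: 4 & 3 == 0 */
        assert(idx_to_el(S1SE1) == 1);    /* Secure EL1: 5 & 3 == 1 */
        return 0;
    }

So the assert(mmu_idx < ARMMMUIdx_S2NS) guard is what keeps the "& 3" shortcut honest for the stage-2 and non-TLB indexes.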

>  /* Return the Exception Level targeted by debug exceptions;
> @@ -1645,9 +1724,13 @@ static inline bool arm_singlestep_active(CPUARMState *env)
>
>  /* Bit usage in the TB flags field: bit 31 indicates whether we are
>   * in 32 or 64 bit mode. The meaning of the other bits depends on that.
> + * We put flags which are shared between 32 and 64 bit mode at the top
> + * of the word, and flags which apply to only one mode at the bottom.
>   */
>  #define ARM_TBFLAG_AARCH64_STATE_SHIFT 31
>  #define ARM_TBFLAG_AARCH64_STATE_MASK  (1U << ARM_TBFLAG_AARCH64_STATE_SHIFT)
> +#define ARM_TBFLAG_MMUIDX_SHIFT 28
> +#define ARM_TBFLAG_MMUIDX_MASK (0x7 << ARM_TBFLAG_MMUIDX_SHIFT)
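The 3-bit MMUIDX field at bits [30:28] sits right under the AARCH64_STATE bit, so all seven "real" index values fit and the field decodes the same way whether the TB is 32-bit or 64-bit. A rough self-contained check of that layout (the SK_* names are local to the sketch, not the QEMU macros):

    /* Sketch: round-trip a 3-bit mmu_idx field packed at bits [30:28]. */
    #include <assert.h>
    #include <stdint.h>

    #define SK_AARCH64_STATE_SHIFT 31
    #define SK_MMUIDX_SHIFT        28
    #define SK_MMUIDX_MASK         (0x7u << SK_MMUIDX_SHIFT)

    int main(void)
    {
        for (uint32_t idx = 0; idx < 7; idx++) {
            /* pack roughly as cpu_get_tb_cpu_state() would ... */
            uint32_t flags = (1u << SK_AARCH64_STATE_SHIFT) | (idx << SK_MMUIDX_SHIFT);
            /* ... and unpack as the translators do via ARM_TBFLAG_MMUIDX() */
            assert(((flags & SK_MMUIDX_MASK) >> SK_MMUIDX_SHIFT) == idx);
        }
        return 0;
    }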

>  /* Bit usage when in AArch32 state: */
>  #define ARM_TBFLAG_THUMB_SHIFT      0
> @@ -1656,8 +1739,6 @@ static inline bool arm_singlestep_active(CPUARMState *env)
>  #define ARM_TBFLAG_VECLEN_MASK      (0x7 << ARM_TBFLAG_VECLEN_SHIFT)
>  #define ARM_TBFLAG_VECSTRIDE_SHIFT  4
>  #define ARM_TBFLAG_VECSTRIDE_MASK   (0x3 << ARM_TBFLAG_VECSTRIDE_SHIFT)
> -#define ARM_TBFLAG_PRIV_SHIFT       6
> -#define ARM_TBFLAG_PRIV_MASK        (1 << ARM_TBFLAG_PRIV_SHIFT)
>  #define ARM_TBFLAG_VFPEN_SHIFT      7
>  #define ARM_TBFLAG_VFPEN_MASK       (1 << ARM_TBFLAG_VFPEN_SHIFT)
>  #define ARM_TBFLAG_CONDEXEC_SHIFT   8
> @@ -1683,8 +1764,6 @@ static inline bool arm_singlestep_active(CPUARMState *env)
>  #define ARM_TBFLAG_NS_MASK          (1 << ARM_TBFLAG_NS_SHIFT)
>
>  /* Bit usage when in AArch64 state */
> -#define ARM_TBFLAG_AA64_EL_SHIFT    0
> -#define ARM_TBFLAG_AA64_EL_MASK     (0x3 << ARM_TBFLAG_AA64_EL_SHIFT)
>  #define ARM_TBFLAG_AA64_FPEN_SHIFT  2
>  #define ARM_TBFLAG_AA64_FPEN_MASK   (1 << ARM_TBFLAG_AA64_FPEN_SHIFT)
>  #define ARM_TBFLAG_AA64_SS_ACTIVE_SHIFT 3
> @@ -1695,14 +1774,14 @@ static inline bool arm_singlestep_active(CPUARMState *env)
>  /* some convenience accessor macros */
>  #define ARM_TBFLAG_AARCH64_STATE(F) \
>      (((F) & ARM_TBFLAG_AARCH64_STATE_MASK) >> ARM_TBFLAG_AARCH64_STATE_SHIFT)
> +#define ARM_TBFLAG_MMUIDX(F) \
> +    (((F) & ARM_TBFLAG_MMUIDX_MASK) >> ARM_TBFLAG_MMUIDX_SHIFT)
>  #define ARM_TBFLAG_THUMB(F) \
>      (((F) & ARM_TBFLAG_THUMB_MASK) >> ARM_TBFLAG_THUMB_SHIFT)
>  #define ARM_TBFLAG_VECLEN(F) \
>      (((F) & ARM_TBFLAG_VECLEN_MASK) >> ARM_TBFLAG_VECLEN_SHIFT)
>  #define ARM_TBFLAG_VECSTRIDE(F) \
>      (((F) & ARM_TBFLAG_VECSTRIDE_MASK) >> ARM_TBFLAG_VECSTRIDE_SHIFT)
> -#define ARM_TBFLAG_PRIV(F) \
> -    (((F) & ARM_TBFLAG_PRIV_MASK) >> ARM_TBFLAG_PRIV_SHIFT)
>  #define ARM_TBFLAG_VFPEN(F) \
>      (((F) & ARM_TBFLAG_VFPEN_MASK) >> ARM_TBFLAG_VFPEN_SHIFT)
>  #define ARM_TBFLAG_CONDEXEC(F) \
> @@ -1717,8 +1796,6 @@ static inline bool arm_singlestep_active(CPUARMState *env)
>      (((F) & ARM_TBFLAG_PSTATE_SS_MASK) >> ARM_TBFLAG_PSTATE_SS_SHIFT)
>  #define ARM_TBFLAG_XSCALE_CPAR(F) \
>      (((F) & ARM_TBFLAG_XSCALE_CPAR_MASK) >> ARM_TBFLAG_XSCALE_CPAR_SHIFT)
> -#define ARM_TBFLAG_AA64_EL(F) \
> -    (((F) & ARM_TBFLAG_AA64_EL_MASK) >> ARM_TBFLAG_AA64_EL_SHIFT)
>  #define ARM_TBFLAG_AA64_FPEN(F) \
>      (((F) & ARM_TBFLAG_AA64_FPEN_MASK) >> ARM_TBFLAG_AA64_FPEN_SHIFT)
>  #define ARM_TBFLAG_AA64_SS_ACTIVE(F) \
> @@ -1742,8 +1819,7 @@ static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>
>      if (is_a64(env)) {
>          *pc = env->pc;
> -        *flags = ARM_TBFLAG_AARCH64_STATE_MASK
> -            | (arm_current_el(env) << ARM_TBFLAG_AA64_EL_SHIFT);
> +        *flags = ARM_TBFLAG_AARCH64_STATE_MASK;
>          if (fpen == 3 || (fpen == 1 && arm_current_el(env) != 0)) {
>              *flags |= ARM_TBFLAG_AA64_FPEN_MASK;
>          }
> @@ -1761,21 +1837,12 @@ static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>              }
>          }
>      } else {
> -        int privmode;
>          *pc = env->regs[15];
>          *flags = (env->thumb << ARM_TBFLAG_THUMB_SHIFT)
>              | (env->vfp.vec_len << ARM_TBFLAG_VECLEN_SHIFT)
>              | (env->vfp.vec_stride << ARM_TBFLAG_VECSTRIDE_SHIFT)
>              | (env->condexec_bits << ARM_TBFLAG_CONDEXEC_SHIFT)
>              | (env->bswap_code << ARM_TBFLAG_BSWAP_CODE_SHIFT);
> -        if (arm_feature(env, ARM_FEATURE_M)) {
> -            privmode = !((env->v7m.exception == 0) && (env->v7m.control & 1));
> -        } else {
> -            privmode = (env->uncached_cpsr & CPSR_M) != ARM_CPU_MODE_USR;
> -        }
> -        if (privmode) {
> -            *flags |= ARM_TBFLAG_PRIV_MASK;
> -        }
>          if (!(access_secure_reg(env))) {
>              *flags |= ARM_TBFLAG_NS_MASK;
>          }
> @@ -1803,6 +1870,8 @@ static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
>                     << ARM_TBFLAG_XSCALE_CPAR_SHIFT);
>      }
>
> +    *flags |= (cpu_mmu_index(env) << ARM_TBFLAG_MMUIDX_SHIFT);
> +

After getting through patch 9, I wonder if the TB NS bit can also be removed, as it is implied by the MMU index.
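For instance (a rough sketch only, using the mmu_idx values defined above, not something this patch adds), the secure/non-secure split could be recovered from the index instead of carrying a separate flag:

    /* Sketch: which of the TLB-backed mmu_idx values name a Secure regime? */
    static bool mmu_idx_is_secure(ARMMMUIdx mmu_idx)
    {
        switch (mmu_idx) {
        case ARMMMUIdx_S1E3:    /* Secure EL3 (Secure PL1 when EL3 is 32-bit) */
        case ARMMMUIdx_S1SE0:   /* Secure EL0 */
        case ARMMMUIdx_S1SE1:   /* Secure EL1 */
            return true;
        default:
            return false;
        }
    }

Whether that is exactly equivalent to the existing NS flag (which follows access_secure_reg()) is the part I'm not sure about, so treat this as a question rather than a request to squash it into this series.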

>      *cs_base = 0;
>  }
>
> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index 1a5e067..06478d8 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -5119,7 +5119,8 @@ int arm_cpu_handle_mmu_fault(CPUState *cs, vaddr address,
>      uint32_t syn;
>      bool same_el = (arm_current_el(env) != 0);
>
> -    is_user = mmu_idx == MMU_USER_IDX;
> +    /* TODO: pass the translation regime to get_phys_addr */
> +    is_user = (arm_mmu_idx_to_el(mmu_idx) == 0);
>      ret = get_phys_addr(env, address, access_type, is_user, &phys_addr, &prot,
>                          &page_size);
>      if (ret == 0) {
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index dac2f63..96f14ff 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -10922,14 +10922,15 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
>      dc->bswap_code = 0;
>      dc->condexec_mask = 0;
>      dc->condexec_cond = 0;
> +    dc->mmu_idx = ARM_TBFLAG_MMUIDX(tb->flags);
> +    dc->current_el = arm_mmu_idx_to_el(dc->mmu_idx);
>  #if !defined(CONFIG_USER_ONLY)
> -    dc->user = (ARM_TBFLAG_AA64_EL(tb->flags) == 0);
> +    dc->user = (dc->current_el == 0);
>  #endif
>      dc->cpacr_fpen = ARM_TBFLAG_AA64_FPEN(tb->flags);
>      dc->vec_len = 0;
>      dc->vec_stride = 0;
>      dc->cp_regs = cpu->cp_regs;
> -    dc->current_el = arm_current_el(env);
>      dc->features = env->features;
>
>      /* Single step state. The code-generation logic here is:
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index bdfcdf1..7163649 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -11032,8 +11032,10 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
>      dc->bswap_code = ARM_TBFLAG_BSWAP_CODE(tb->flags);
>      dc->condexec_mask = (ARM_TBFLAG_CONDEXEC(tb->flags) & 0xf) << 1;
>      dc->condexec_cond = ARM_TBFLAG_CONDEXEC(tb->flags) >> 4;
> +    dc->mmu_idx = ARM_TBFLAG_MMUIDX(tb->flags);
> +    dc->current_el = arm_mmu_idx_to_el(dc->mmu_idx);
>  #if !defined(CONFIG_USER_ONLY)
> -    dc->user = (ARM_TBFLAG_PRIV(tb->flags) == 0);
> +    dc->user = (dc->current_el == 0);
>  #endif
>      dc->ns = ARM_TBFLAG_NS(tb->flags);
>      dc->cpacr_fpen = ARM_TBFLAG_CPACR_FPEN(tb->flags);
> @@ -11042,7 +11044,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
>      dc->vec_stride = ARM_TBFLAG_VECSTRIDE(tb->flags);
>      dc->c15_cpar = ARM_TBFLAG_XSCALE_CPAR(tb->flags);
>      dc->cp_regs = cpu->cp_regs;
> -    dc->current_el = arm_current_el(env);
>      dc->features = env->features;
>
>      /* Single step state. The code-generation logic here is:
> diff --git a/target-arm/translate.h b/target-arm/translate.h
> index f6ee789..a1eb5b5 100644
> --- a/target-arm/translate.h
> +++ b/target-arm/translate.h
> @@ -20,6 +20,7 @@ typedef struct DisasContext {
>  #if !defined(CONFIG_USER_ONLY)
>      int user;
>  #endif
> +    ARMMMUIdx mmu_idx; /* MMU index to use for normal loads/stores */
>      bool ns;        /* Use non-secure CPREG bank on access */
>      bool cpacr_fpen; /* FP enabled via CPACR.FPEN */
>      bool vfp_enabled; /* FP enabled via FPSCR.EN */
> @@ -69,7 +70,7 @@ static inline int arm_dc_feature(DisasContext *dc, int feature)
>
>  static inline int get_mem_index(DisasContext *s)
>  {
> -    return s->current_el;
> +    return s->mmu_idx;
>  }
>
>  /* target-specific extra values for is_jmp */
> --
> 1.9.1

