* [PATCH 0/4] target/arm: Implement S2FWB
@ 2022-05-05 18:39 Peter Maydell
  2022-05-05 18:39 ` [PATCH 1/4] target/arm: Postpone interpretation of stage 2 descriptor attribute bits Peter Maydell
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Peter Maydell @ 2022-05-05 18:39 UTC (permalink / raw)
  To: qemu-arm, qemu-devel

In the original Arm v8 two-stage translation, both stage 1 and stage
2 specify memory attributes (memory type, cacheability,
shareability); these are then combined to produce the overall memory
attributes for the whole stage 1+2 access.

The new FEAT_S2FWB feature allows the guest to enable a different
interpretation of the attribute bits in the stage 2 descriptors.
These bits can now be used to control details of how the stage 1 and
2 attributes should be combined (for instance they can say "always
use the stage 1 attributes" or "ignore the stage 1 attributes and
always be Device memory").

This series implements support for FEAT_S2FWB.  It starts by
postponing interpretation of the attribute bits in a stage 2
descriptor until the point where we need to combine them with the
stage 1 attributes.  It then pulls out the HCR_EL2.FWB=0 specific
code into its own function, so that the support for FWB=1 that we add
in patch 3 slots in neatly.  Finally, patch 4 turns it on for -cpu
max.

I have tested that a Linux nested-guest setup works OK (and that
the guest really does turn on HCR_EL2.FWB), but since we don't
do anything with memory attributes except return them in PAR_EL1
when the guest executes AT instructions, you're unlikely to find
bugs in this unless you explicitly write a bunch of test cases
that set up page tables and run AT instructions to check what
memory attributes we report.

-- PMM

Peter Maydell (4):
  target/arm: Postpone interpretation of stage 2 descriptor attribute bits
  target/arm: Factor out FWB=0 specific part of combine_cacheattrs()
  target/arm: Implement FEAT_S2FWB
  target/arm: Enable FEAT_S2FWB for -cpu max

 docs/system/arm/emulation.rst |   1 +
 target/arm/cpu.h              |   5 +
 target/arm/internals.h        |   7 +-
 target/arm/cpu64.c            |  10 ++
 target/arm/helper.c           | 200 +++++++++++++++++++++++++++-------
 5 files changed, 182 insertions(+), 41 deletions(-)

-- 
2.25.1




* [PATCH 1/4] target/arm: Postpone interpretation of stage 2 descriptor attribute bits
From: Peter Maydell @ 2022-05-05 18:39 UTC (permalink / raw)
  To: qemu-arm, qemu-devel

In the original Arm v8 two-stage translation, both stage 1 and stage
2 specify memory attributes (memory type, cacheability,
shareability); these are then combined to produce the overall memory
attributes for the whole stage 1+2 access.  In QEMU we implement this
by having get_phys_addr() fill in an ARMCacheAttrs struct, and we
convert both the stage 1 and stage 2 attribute bit formats to the
same encoding (an 8-bit attribute value matching the MAIR_EL1 fields,
plus a 2-bit shareability value).
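
As a standalone illustration of that MAIR-format indexing (a sketch
with a made-up helper name, not the QEMU code, which uses extract64()
on env->cp15.mair_el[]): each 3-bit attribute index in a descriptor
selects one byte of the 64-bit MAIR_ELx register.

```c
#include <assert.h>
#include <stdint.h>

/* Return the 8-bit MAIR-format attribute byte selected by attrindx.
 * Hypothetical standalone helper; QEMU does the equivalent with
 * extract64(mair, attrindx * 8, 8). */
uint8_t mair_attr(uint64_t mair, unsigned attrindx)
{
    assert(attrindx <= 7);
    return (uint8_t)(mair >> (attrindx * 8));
}
```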

The new FEAT_S2FWB feature allows the guest to enable a different
interpretation of the attribute bits in the stage 2 descriptors.
These bits can now be used to control details of how the stage 1 and
2 attributes should be combined (for instance they can say "always
use the stage 1 attributes" or "ignore the stage 1 attributes and
always be Device memory").  This means we need to pass the raw bit
information for stage 2 down to the function which combines the stage
1 and stage 2 information.

Add a field to ARMCacheAttrs that indicates whether the attrs field
should be interpreted as MAIR format, or as the raw stage 2 attribute
bits from the descriptor, and store the appropriate values when
filling in cacheattrs.

We only need to interpret the attrs field in a few places:
 * in do_ats_write(), where we know to expect a MAIR value
   (there is no ATS instruction to do a stage-2-only walk)
 * in S1_ptw_translate(), where we want to know whether the
   combined S1 + S2 attributes indicate Device memory that
   should provoke a fault
 * in combine_cacheattrs(), which does the S1 + S2 combining
Update those places accordingly.
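
For reference, the S1_ptw_translate() test this patch rewrites boils
down to the following (a sketch with a made-up name, not the QEMU
code): with attrs now holding the raw stage 2 descriptor bits [5:2],
the walk touched Device memory iff descriptor bits [5:4] (attrs bits
[3:2]) are 0b00, where previously the equivalent test looked at the
MAIR-converted byte ((attrs & 0xf0) == 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* s2_descriptor_attrs is the raw stage 2 descriptor bits [5:2].
 * Device memory types all have descriptor bits [5:4] == 0b00,
 * i.e. bits [3:2] of this value clear. */
bool s2_walk_attrs_are_device(uint8_t s2_descriptor_attrs)
{
    return (s2_descriptor_attrs & 0xc) == 0;
}
```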

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/internals.h |  7 ++++++-
 target/arm/helper.c    | 42 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 255833479d4..b9dd093b5fb 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1141,8 +1141,13 @@ bool pmsav8_mpu_lookup(CPUARMState *env, uint32_t address,
 
 /* Cacheability and shareability attributes for a memory access */
 typedef struct ARMCacheAttrs {
-    unsigned int attrs:8; /* as in the MAIR register encoding */
+    /*
+     * If is_s2_format is true, attrs is the S2 descriptor bits [5:2]
+     * Otherwise, attrs is the same as the MAIR_EL1 8-bit format
+     */
+    unsigned int attrs:8;
     unsigned int shareability:2; /* as in the SH field of the VMSAv8-64 PTEs */
+    bool is_s2_format:1;
 } ARMCacheAttrs;
 
 bool get_phys_addr(CPUARMState *env, target_ulong address,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 5a244c3ed93..5839acc343b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3189,6 +3189,12 @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
     ret = get_phys_addr(env, value, access_type, mmu_idx, &phys_addr, &attrs,
                         &prot, &page_size, &fi, &cacheattrs);
 
+    /*
+     * ATS operations only do S1 or S1+S2 translations, so we never
+     * have to deal with the ARMCacheAttrs format for S2 only.
+     */
+    assert(!cacheattrs.is_s2_format);
+
     if (ret) {
         /*
          * Some kinds of translation fault must cause exceptions rather
@@ -10671,6 +10677,19 @@ static bool get_level1_table_address(CPUARMState *env, ARMMMUIdx mmu_idx,
     return true;
 }
 
+static bool ptw_attrs_are_device(CPUARMState *env, ARMCacheAttrs cacheattrs)
+{
+    /*
+     * For an S1 page table walk, the stage 1 attributes are always
+     * some form of "this is Normal memory". The combined S1+S2
+     * attributes are therefore only Device if stage 2 specifies Device.
+     * With HCR_EL2.FWB == 0 this is when descriptor bits [5:4] are 0b00,
+     * ie when cacheattrs.attrs bits [3:2] are 0b00.
+     */
+    assert(cacheattrs.is_s2_format);
+    return (cacheattrs.attrs & 0xc) == 0;
+}
+
 /* Translate a S1 pagetable walk through S2 if needed.  */
 static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
                                hwaddr addr, bool *is_secure,
@@ -10699,7 +10718,7 @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
             return ~0;
         }
         if ((arm_hcr_el2_eff(env) & HCR_PTW) &&
-            (cacheattrs.attrs & 0xf0) == 0) {
+            ptw_attrs_are_device(env, cacheattrs)) {
             /*
              * PTW set and S1 walk touched S2 Device memory:
              * generate Permission fault.
@@ -11771,12 +11790,14 @@ static bool get_phys_addr_lpae(CPUARMState *env, uint64_t address,
     }
 
     if (mmu_idx == ARMMMUIdx_Stage2 || mmu_idx == ARMMMUIdx_Stage2_S) {
-        cacheattrs->attrs = convert_stage2_attrs(env, extract32(attrs, 0, 4));
+        cacheattrs->is_s2_format = true;
+        cacheattrs->attrs = extract32(attrs, 0, 4);
     } else {
         /* Index into MAIR registers for cache attributes */
         uint8_t attrindx = extract32(attrs, 0, 3);
         uint64_t mair = env->cp15.mair_el[regime_el(env, mmu_idx)];
         assert(attrindx <= 7);
+        cacheattrs->is_s2_format = false;
         cacheattrs->attrs = extract64(mair, attrindx * 8, 8);
     }
 
@@ -12514,14 +12535,22 @@ static uint8_t combine_cacheattr_nibble(uint8_t s1, uint8_t s2)
 /* Combine S1 and S2 cacheability/shareability attributes, per D4.5.4
  * and CombineS1S2Desc()
  *
+ * @env:     CPUARMState
  * @s1:      Attributes from stage 1 walk
  * @s2:      Attributes from stage 2 walk
  */
-static ARMCacheAttrs combine_cacheattrs(ARMCacheAttrs s1, ARMCacheAttrs s2)
+static ARMCacheAttrs combine_cacheattrs(CPUARMState *env,
+                                        ARMCacheAttrs s1, ARMCacheAttrs s2)
 {
     uint8_t s1lo, s2lo, s1hi, s2hi;
     ARMCacheAttrs ret;
     bool tagged = false;
+    uint8_t s2_mair_attrs;
+
+    assert(s2.is_s2_format && !s1.is_s2_format);
+    ret.is_s2_format = false;
+
+    s2_mair_attrs = convert_stage2_attrs(env, s2.attrs);
 
     if (s1.attrs == 0xf0) {
         tagged = true;
@@ -12529,9 +12558,9 @@ static ARMCacheAttrs combine_cacheattrs(ARMCacheAttrs s1, ARMCacheAttrs s2)
     }
 
     s1lo = extract32(s1.attrs, 0, 4);
-    s2lo = extract32(s2.attrs, 0, 4);
+    s2lo = extract32(s2_mair_attrs, 0, 4);
     s1hi = extract32(s1.attrs, 4, 4);
-    s2hi = extract32(s2.attrs, 4, 4);
+    s2hi = extract32(s2_mair_attrs, 4, 4);
 
     /* Combine shareability attributes (table D4-43) */
     if (s1.shareability == 2 || s2.shareability == 2) {
@@ -12685,7 +12714,7 @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
                 }
                 cacheattrs->shareability = 0;
             }
-            *cacheattrs = combine_cacheattrs(*cacheattrs, cacheattrs2);
+            *cacheattrs = combine_cacheattrs(env, *cacheattrs, cacheattrs2);
 
             /* Check if IPA translates to secure or non-secure PA space. */
             if (arm_is_secure_below_el3(env)) {
@@ -12803,6 +12832,7 @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
         /* Fill in cacheattr a-la AArch64.TranslateAddressS1Off. */
         hcr = arm_hcr_el2_eff(env);
         cacheattrs->shareability = 0;
+        cacheattrs->is_s2_format = false;
         if (hcr & HCR_DC) {
             if (hcr & HCR_DCT) {
                 memattr = 0xf0;  /* Tagged, Normal, WB, RWA */
-- 
2.25.1




* [PATCH 2/4] target/arm: Factor out FWB=0 specific part of combine_cacheattrs()
From: Peter Maydell @ 2022-05-05 18:39 UTC (permalink / raw)
  To: qemu-arm, qemu-devel

Factor out the part of combine_cacheattrs() that is specific to
handling HCR_EL2.FWB == 0.  This is the part where we combine the
memory type and cacheability attributes.

The "force Outer Shareable for Device or Normal Inner-NC Outer-NC"
logic remains in combine_cacheattrs() because it holds regardless
(this is the equivalent of the pseudocode EffectiveShareability()
function).
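
The Device half of the combining rule being factored out here can be
sketched like this (made-up name, not the QEMU code; both inputs are
MAIR-format bytes, the stage 2 one after conversion by
convert_stage2_attrs(), and at least one of them is Device):

```c
#include <assert.h>
#include <stdint.h>

/* Combine two MAIR-format attribute bytes when at least one is
 * Device (high nibble 0), per the FWB == 0 precedence rules. */
uint8_t combine_device_attrs(uint8_t s1, uint8_t s2)
{
    uint8_t s1lo = s1 & 0xf, s2lo = s2 & 0xf;

    assert((s1 & 0xf0) == 0 || (s2 & 0xf0) == 0);
    if (s1lo == 0 || s2lo == 0) {
        return 0x0;     /* nGnRnE has precedence over anything */
    } else if (s1lo == 4 || s2lo == 4) {
        return 0x4;     /* non-Reordering beats Reordering: nGnRE */
    } else if (s1lo == 8 || s2lo == 8) {
        return 0x8;     /* non-Gathering beats Gathering: nGRE */
    }
    return 0xc;         /* GRE */
}
```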

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 88 +++++++++++++++++++++++++--------------------
 1 file changed, 50 insertions(+), 38 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 5839acc343b..2828f0dacf3 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -12532,6 +12532,46 @@ static uint8_t combine_cacheattr_nibble(uint8_t s1, uint8_t s2)
     }
 }
 
+/*
+ * Combine the memory type and cacheability attributes of
+ * s1 and s2 for the HCR_EL2.FWB == 0 case, returning the
+ * combined attributes in MAIR_EL1 format.
+ */
+static uint8_t combined_attrs_nofwb(CPUARMState *env,
+                                    ARMCacheAttrs s1, ARMCacheAttrs s2)
+{
+    uint8_t s1lo, s2lo, s1hi, s2hi, s2_mair_attrs, ret_attrs;
+
+    s2_mair_attrs = convert_stage2_attrs(env, s2.attrs);
+
+    s1lo = extract32(s1.attrs, 0, 4);
+    s2lo = extract32(s2_mair_attrs, 0, 4);
+    s1hi = extract32(s1.attrs, 4, 4);
+    s2hi = extract32(s2_mair_attrs, 4, 4);
+
+    /* Combine memory type and cacheability attributes */
+    if (s1hi == 0 || s2hi == 0) {
+        /* Device has precedence over normal */
+        if (s1lo == 0 || s2lo == 0) {
+            /* nGnRnE has precedence over anything */
+            ret_attrs = 0;
+        } else if (s1lo == 4 || s2lo == 4) {
+            /* non-Reordering has precedence over Reordering */
+            ret_attrs = 4;  /* nGnRE */
+        } else if (s1lo == 8 || s2lo == 8) {
+            /* non-Gathering has precedence over Gathering */
+            ret_attrs = 8;  /* nGRE */
+        } else {
+            ret_attrs = 0xc; /* GRE */
+        }
+    } else { /* Normal memory */
+        /* Outer/inner cacheability combine independently */
+        ret_attrs = combine_cacheattr_nibble(s1hi, s2hi) << 4
+                  | combine_cacheattr_nibble(s1lo, s2lo);
+    }
+    return ret_attrs;
+}
+
 /* Combine S1 and S2 cacheability/shareability attributes, per D4.5.4
  * and CombineS1S2Desc()
  *
@@ -12542,26 +12582,17 @@ static uint8_t combine_cacheattr_nibble(uint8_t s1, uint8_t s2)
 static ARMCacheAttrs combine_cacheattrs(CPUARMState *env,
                                         ARMCacheAttrs s1, ARMCacheAttrs s2)
 {
-    uint8_t s1lo, s2lo, s1hi, s2hi;
     ARMCacheAttrs ret;
     bool tagged = false;
-    uint8_t s2_mair_attrs;
 
     assert(s2.is_s2_format && !s1.is_s2_format);
     ret.is_s2_format = false;
 
-    s2_mair_attrs = convert_stage2_attrs(env, s2.attrs);
-
     if (s1.attrs == 0xf0) {
         tagged = true;
         s1.attrs = 0xff;
     }
 
-    s1lo = extract32(s1.attrs, 0, 4);
-    s2lo = extract32(s2_mair_attrs, 0, 4);
-    s1hi = extract32(s1.attrs, 4, 4);
-    s2hi = extract32(s2_mair_attrs, 4, 4);
-
     /* Combine shareability attributes (table D4-43) */
     if (s1.shareability == 2 || s2.shareability == 2) {
         /* if either are outer-shareable, the result is outer-shareable */
@@ -12575,37 +12606,18 @@ static ARMCacheAttrs combine_cacheattrs(CPUARMState *env,
     }
 
     /* Combine memory type and cacheability attributes */
-    if (s1hi == 0 || s2hi == 0) {
-        /* Device has precedence over normal */
-        if (s1lo == 0 || s2lo == 0) {
-            /* nGnRnE has precedence over anything */
-            ret.attrs = 0;
-        } else if (s1lo == 4 || s2lo == 4) {
-            /* non-Reordering has precedence over Reordering */
-            ret.attrs = 4;  /* nGnRE */
-        } else if (s1lo == 8 || s2lo == 8) {
-            /* non-Gathering has precedence over Gathering */
-            ret.attrs = 8;  /* nGRE */
-        } else {
-            ret.attrs = 0xc; /* GRE */
-        }
+    ret.attrs = combined_attrs_nofwb(env, s1, s2);
 
-        /* Any location for which the resultant memory type is any
-         * type of Device memory is always treated as Outer Shareable.
-         */
+    /*
+     * Any location for which the resultant memory type is any
+     * type of Device memory is always treated as Outer Shareable.
+     * Any location for which the resultant memory type is Normal
+     * Inner Non-cacheable, Outer Non-cacheable is always treated
+     * as Outer Shareable.
+     * TODO: FEAT_XS adds another value (0x40) also meaning iNCoNC
+     */
+    if ((ret.attrs & 0xf0) == 0 || ret.attrs == 0x44) {
         ret.shareability = 2;
-    } else { /* Normal memory */
-        /* Outer/inner cacheability combine independently */
-        ret.attrs = combine_cacheattr_nibble(s1hi, s2hi) << 4
-                  | combine_cacheattr_nibble(s1lo, s2lo);
-
-        if (ret.attrs == 0x44) {
-            /* Any location for which the resultant memory type is Normal
-             * Inner Non-cacheable, Outer Non-cacheable is always treated
-             * as Outer Shareable.
-             */
-            ret.shareability = 2;
-        }
     }
 
     /* TODO: CombineS1S2Desc does not consider transient, only WB, RWA. */
-- 
2.25.1




* [PATCH 3/4] target/arm: Implement FEAT_S2FWB
From: Peter Maydell @ 2022-05-05 18:39 UTC (permalink / raw)
  To: qemu-arm, qemu-devel

Implement the handling of FEAT_S2FWB; the meat of this is in the new
combined_attrs_fwb() function which combines S1 and S2 attributes
when HCR_EL2.FWB is set.
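
As a self-contained sketch of that combining (simplified names, not
the QEMU code; the real combined_attrs_fwb() also takes CPUARMState):
s1_attrs is a MAIR-format byte, s2_attrs is the raw stage 2
descriptor bits [5:2], and the result is in MAIR format.

```c
#include <assert.h>
#include <stdint.h>

/* Force a 4-bit cacheability nibble to Write-Back, keeping the
 * allocation/transient hints if the input was cacheable. */
uint8_t force_nibble_wb(uint8_t attr)
{
    if (attr == 0 || attr == 4) {
        /* UNPREDICTABLE or Non-cacheable: WB RW-allocate non-transient */
        return 0xf;
    }
    return attr | 4;
}

uint8_t combine_fwb(uint8_t s1_attrs, uint8_t s2_attrs)
{
    switch (s2_attrs) {
    case 7: /* use the stage 1 attributes unchanged */
        return s1_attrs;
    case 6: /* force Normal Write-Back */
        if ((s1_attrs & 0xf0) == 0) {
            return 0xff; /* S1 was Device: WB RW-allocate non-transient */
        }
        return (uint8_t)((force_nibble_wb(s1_attrs >> 4) << 4) |
                         force_nibble_wb(s1_attrs & 0xf));
    case 5: /* S1 Device wins; otherwise Normal Non-cacheable */
        return (s1_attrs & 0xf0) == 0 ? s1_attrs : 0x44;
    case 0: case 1: case 2: case 3:
        /* ignore S1: Device memory, subtype specified by S2 */
        return (uint8_t)(s2_attrs << 2);
    default: /* reserved encodings: arbitrarily force Device-nGnRnE */
        return 0;
    }
}
```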

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.h    |  5 +++
 target/arm/helper.c | 84 +++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index db8ff044497..dff0f634c38 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -4289,6 +4289,11 @@ static inline bool isar_feature_aa64_st(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, ST) != 0;
 }
 
+static inline bool isar_feature_aa64_fwb(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, FWB) != 0;
+}
+
 static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 2828f0dacf3..fb8d2bf5c9d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5290,6 +5290,9 @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
         if (cpu_isar_feature(aa64_mte, cpu)) {
             valid_mask |= HCR_ATA | HCR_DCT | HCR_TID5;
         }
+        if (cpu_isar_feature(aa64_fwb, cpu)) {
+            valid_mask |= HCR_FWB;
+        }
     }
 
     /* Clear RES0 bits.  */
@@ -5301,8 +5304,10 @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
      * HCR_PTW forbids certain page-table setups
      * HCR_DC disables stage1 and enables stage2 translation
      * HCR_DCT enables tagging on (disabled) stage1 translation
+     * HCR_FWB changes the interpretation of stage2 descriptor bits
      */
-    if ((env->cp15.hcr_el2 ^ value) & (HCR_VM | HCR_PTW | HCR_DC | HCR_DCT)) {
+    if ((env->cp15.hcr_el2 ^ value) &
+        (HCR_VM | HCR_PTW | HCR_DC | HCR_DCT | HCR_FWB)) {
         tlb_flush(CPU(cpu));
     }
     env->cp15.hcr_el2 = value;
@@ -10685,9 +10690,15 @@ static bool ptw_attrs_are_device(CPUARMState *env, ARMCacheAttrs cacheattrs)
      * attributes are therefore only Device if stage 2 specifies Device.
      * With HCR_EL2.FWB == 0 this is when descriptor bits [5:4] are 0b00,
      * ie when cacheattrs.attrs bits [3:2] are 0b00.
+     * With HCR_EL2.FWB == 1 this is when descriptor bit [4] is 0, ie
+     * when cacheattrs.attrs bit [2] is 0.
      */
     assert(cacheattrs.is_s2_format);
-    return (cacheattrs.attrs & 0xc) == 0;
+    if (arm_hcr_el2_eff(env) & HCR_FWB) {
+        return (cacheattrs.attrs & 0x4) == 0;
+    } else {
+        return (cacheattrs.attrs & 0xc) == 0;
+    }
 }
 
 /* Translate a S1 pagetable walk through S2 if needed.  */
@@ -12572,6 +12583,69 @@ static uint8_t combined_attrs_nofwb(CPUARMState *env,
     return ret_attrs;
 }
 
+static uint8_t force_cacheattr_nibble_wb(uint8_t attr)
+{
+    /*
+     * Given the 4 bits specifying the outer or inner cacheability
+     * in MAIR format, return a value specifying Normal Write-Back,
+     * with the allocation and transient hints taken from the input
+     * if the input specified some kind of cacheable attribute.
+     */
+    if (attr == 0 || attr == 4) {
+        /*
+         * 0 == an UNPREDICTABLE encoding
+         * 4 == Non-cacheable
+         * Either way, force Write-Back RW allocate non-transient
+         */
+        return 0xf;
+    }
+    /* Change WriteThrough to WriteBack, keep allocation and transient hints */
+    return attr | 4;
+}
+
+/*
+ * Combine the memory type and cacheability attributes of
+ * s1 and s2 for the HCR_EL2.FWB == 1 case, returning the
+ * combined attributes in MAIR_EL1 format.
+ */
+static uint8_t combined_attrs_fwb(CPUARMState *env,
+                                  ARMCacheAttrs s1, ARMCacheAttrs s2)
+{
+    switch (s2.attrs) {
+    case 7:
+        /* Use stage 1 attributes */
+        return s1.attrs;
+    case 6:
+        /*
+         * Force Normal Write-Back. Note that if S1 is Normal cacheable
+         * then we take the allocation hints from it; otherwise it is
+         * RW allocate, non-transient.
+         */
+        if ((s1.attrs & 0xf0) == 0) {
+            /* S1 is Device */
+            return 0xff;
+        }
+        /* Need to check the Inner and Outer nibbles separately */
+        return force_cacheattr_nibble_wb(s1.attrs & 0xf) |
+            force_cacheattr_nibble_wb(s1.attrs >> 4) << 4;
+    case 5:
+        /* If S1 attrs are Device, use them; otherwise Normal Non-cacheable */
+        if ((s1.attrs & 0xf0) == 0) {
+            return s1.attrs;
+        }
+        return 0x44;
+    case 0 ... 3:
+        /* Force Device, of subtype specified by S2 */
+        return s2.attrs << 2;
+    default:
+        /*
+         * RESERVED values (including RES0 descriptor bit [5] being nonzero);
+         * arbitrarily force Device.
+         */
+        return 0;
+    }
+}
+
 /* Combine S1 and S2 cacheability/shareability attributes, per D4.5.4
  * and CombineS1S2Desc()
  *
@@ -12606,7 +12680,11 @@ static ARMCacheAttrs combine_cacheattrs(CPUARMState *env,
     }
 
     /* Combine memory type and cacheability attributes */
-    ret.attrs = combined_attrs_nofwb(env, s1, s2);
+    if (arm_hcr_el2_eff(env) & HCR_FWB) {
+        ret.attrs = combined_attrs_fwb(env, s1, s2);
+    } else {
+        ret.attrs = combined_attrs_nofwb(env, s1, s2);
+    }
 
     /*
      * Any location for which the resultant memory type is any
-- 
2.25.1




* [PATCH 4/4] target/arm: Enable FEAT_S2FWB for -cpu max
From: Peter Maydell @ 2022-05-05 18:39 UTC (permalink / raw)
  To: qemu-arm, qemu-devel

Enable FEAT_S2FWB for -cpu max. Since FEAT_S2FWB requires that
CLIDR_EL1.{LoUU,LoUIS} are zero, we explicitly squash these (the
inherited CLIDR_EL1 value from the Cortex-A57 has them as 1).
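
Standalone, the squash done below with FIELD_DP32() amounts to
clearing two 3-bit fields (made-up function name; bit positions
LoUIS [23:21] and LoUU [29:27] per the Arm ARM):

```c
#include <assert.h>
#include <stdint.h>

/* Clear CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU, which FEAT_S2FWB
 * requires to read as zero. */
uint32_t squash_clidr_lou(uint32_t clidr)
{
    clidr &= ~(7u << 21);   /* LoUIS = 0 */
    clidr &= ~(7u << 27);   /* LoUU = 0 */
    return clidr;
}
```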

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/emulation.rst |  1 +
 target/arm/cpu64.c            | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index c3bd0676a87..122306a99f1 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -42,6 +42,7 @@ the following architecture extensions:
 - FEAT_PMUv3p4 (PMU Extensions v3.4)
 - FEAT_RDM (Advanced SIMD rounding double multiply accumulate instructions)
 - FEAT_RNG (Random number generator)
+- FEAT_S2FWB (Stage 2 forced Write-Back)
 - FEAT_SB (Speculation Barrier)
 - FEAT_SEL2 (Secure EL2)
 - FEAT_SHA1 (SHA1 instructions)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 2974cbc0d35..ed2831f1f38 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -769,6 +769,15 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, MIDR_EL1, REVISION, 0);
     cpu->midr = t;
 
+    /*
+     * We're going to set FEAT_S2FWB, which mandates that CLIDR_EL1.{LoUU,LoUIS}
+     * are zero.
+     */
+    u = cpu->clidr;
+    u = FIELD_DP32(u, CLIDR_EL1, LOUIS, 0);
+    u = FIELD_DP32(u, CLIDR_EL1, LOUU, 0);
+    cpu->clidr = u;
+
     t = cpu->isar.id_aa64isar0;
     t = FIELD_DP64(t, ID_AA64ISAR0, AES, 2); /* AES + PMULL */
     t = FIELD_DP64(t, ID_AA64ISAR0, SHA1, 1);
@@ -841,6 +850,7 @@ static void aarch64_max_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64MMFR2, VARANGE, 1); /* FEAT_LVA */
     t = FIELD_DP64(t, ID_AA64MMFR2, TTL, 1); /* FEAT_TTL */
     t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2); /* FEAT_BBM at level 2 */
+    t = FIELD_DP64(t, ID_AA64MMFR2, FWB, 1); /* FEAT_S2FWB */
     cpu->isar.id_aa64mmfr2 = t;
 
     t = cpu->isar.id_aa64zfr0;
-- 
2.25.1




* Re: [PATCH 1/4] target/arm: Postpone interpretation of stage 2 descriptor attribute bits
From: Richard Henderson @ 2022-05-06 18:14 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 5/5/22 13:39, Peter Maydell wrote:
> In the original Arm v8 two-stage translation, both stage 1 and stage
> 2 specify memory attributes (memory type, cacheability,
> shareability); these are then combined to produce the overall memory
> attributes for the whole stage 1+2 access.  In QEMU we implement this
> by having get_phys_addr() fill in an ARMCacheAttrs struct, and we
> convert both the stage 1 and stage 2 attribute bit formats to the
> same encoding (an 8-bit attribute value matching the MAIR_EL1 fields,
> plus a 2-bit shareability value).
> 
> The new FEAT_S2FWB feature allows the guest to enable a different
> interpretation of the attribute bits in the stage 2 descriptors.
> These bits can now be used to control details of how the stage 1 and
> 2 attributes should be combined (for instance they can say "always
> use the stage 1 attributes" or "ignore the stage 1 attributes and
> always be Device memory").  This means we need to pass the raw bit
> information for stage 2 down to the function which combines the stage
> 1 and stage 2 information.
> 
> Add a field to ARMCacheAttrs that indicates whether the attrs field
> should be interpreted as MAIR format, or as the raw stage 2 attribute
> bits from the descriptor, and store the appropriate values when
> filling in cacheattrs.
> 
> We only need to interpret the attrs field in a few places:
>   * in do_ats_write(), where we know to expect a MAIR value
>     (there is no ATS instruction to do a stage-2-only walk)
>   * in S1_ptw_translate(), where we want to know whether the
>     combined S1 + S2 attributes indicate Device memory that
>     should provoke a fault
>   * in combine_cacheattrs(), which does the S1 + S2 combining
> Update those places accordingly.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   target/arm/internals.h |  7 ++++++-
>   target/arm/helper.c    | 42 ++++++++++++++++++++++++++++++++++++------
>   2 files changed, 42 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 2/4] target/arm: Factor out FWB=0 specific part of combine_cacheattrs()
From: Richard Henderson @ 2022-05-06 18:16 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 5/5/22 13:39, Peter Maydell wrote:
> Factor out the part of combine_cacheattrs() that is specific to
> handling HCR_EL2.FWB == 0.  This is the part where we combine the
> memory type and cacheability attributes.
> 
> The "force Outer Shareable for Device or Normal Inner-NC Outer-NC"
> logic remains in combine_cacheattrs() because it holds regardless
> (this is the equivalent of the pseudocode EffectiveShareability()
> function).
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   target/arm/helper.c | 88 +++++++++++++++++++++++++--------------------
>   1 file changed, 50 insertions(+), 38 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 3/4] target/arm: Implement FEAT_S2FWB
From: Richard Henderson @ 2022-05-06 18:19 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 5/5/22 13:39, Peter Maydell wrote:
> Implement the handling of FEAT_S2FWB; the meat of this is in the new
> combined_attrs_fwb() function which combines S1 and S2 attributes
> when HCR_EL2.FWB is set.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   target/arm/cpu.h    |  5 +++
>   target/arm/helper.c | 84 +++++++++++++++++++++++++++++++++++++++++++--
>   2 files changed, 86 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [PATCH 4/4] target/arm: Enable FEAT_S2FWB for -cpu max
From: Richard Henderson @ 2022-05-06 18:21 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 5/5/22 13:39, Peter Maydell wrote:
> Enable the FEAT_S2FWB for -cpu max. Since FEAT_S2FWB requires that
> CLIDR_EL1.{LoUU,LoUIS} are zero, we explicitly squash these (the
> inherited CLIDR_EL1 value from the Cortex-A57 has them as 1).
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   docs/system/arm/emulation.rst |  1 +
>   target/arm/cpu64.c            | 10 ++++++++++
>   2 files changed, 11 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


