All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes
@ 2018-10-29 15:53 Richard Henderson
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx Richard Henderson
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Richard Henderson @ 2018-10-29 15:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04181.html
(already upstream) I added a check for ASID changes without realizing
that TTBCR_EL1 has the A1 bit, controlling which register actually
contains the active ASID.

In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04182.html
I suggested a set of mmu_idx to flush when the ASID does change.  In
follow-up, Peter suggested more.

I now choose secure vs non-secure mmu_idx based on which register is being
modified, not the current state of the cpu.  Unless I am mistaken, secure
state can write to the non-secure registers.  Which means that the current
state of the cpu is irrelevant and only the register matters.

Peter suggested flushing S1E3 when changing ttbr0_s.  I can see how this
is overlapped onto the EL3 (Secure Monitor) state, but I cannot see how
the ASID is used from EL3.  The best evidence I can find for this is that
there is no TLBIASID* register that is applicable to flushing EL3; that's
not conclusive proof though.  So while I'm not sure it's necessary, I'm
also not sure it isn't necessary, and so I've included S1E3 in the flush.

I now also use the VMID to conditionally invalidate the stage 2 translation
state.  This shows how I anticipaged @depmap to be used in patch 1.


r~


Richard Henderson (4):
  cputlb: Add tlb_set_asid_for_mmuidx
  target/arm: Install ASIDs for long-form from EL1
  target/arm: Install ASIDs for short-form from EL1
  target/arm: Install ASIDs for EL2

 include/exec/cpu-defs.h |   2 +
 include/exec/exec-all.h |  19 ++++++
 accel/tcg/cputlb.c      |  23 +++++++
 target/arm/helper.c     | 133 ++++++++++++++++++++++++++--------------
 4 files changed, 131 insertions(+), 46 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
@ 2018-10-29 15:53 ` Richard Henderson
  2018-11-15 18:36   ` Peter Maydell
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 2/4] target/arm: Install ASIDs for long-form from EL1 Richard Henderson
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2018-10-29 15:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Although we can't do much with ASIDs except remember them, this
will allow cleanups within target/ that should make things clearer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-defs.h |  2 ++
 include/exec/exec-all.h | 19 +++++++++++++++++++
 accel/tcg/cputlb.c      | 23 +++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 6a60f94a41..8fbfe8c8e2 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -152,6 +152,8 @@ typedef struct CPUTLBDesc {
     target_ulong large_page_mask;
     /* The next index to use in the tlb victim table.  */
     size_t vindex;
+    /* The current ASID for this tlb.  */
+    uint32_t asid;
 } CPUTLBDesc;
 
 /*
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 815e5b1e83..478f488704 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -226,6 +226,21 @@ void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap);
  * depend on when the guests translation ends the TB.
  */
 void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, uint16_t idxmap);
+/**
+ * tlb_set_asid_for_mmuidx:
+ * @cpu: Originating cpu
+ * @asid: Address Space Identifier
+ * @idxmap: bitmap of MMU indicies to set to @asid
+ * @depmap: bitmap of dependent MMU indicies
+ *
+ * Set an ASID for all of @idxmap.  If any previous ASID was different,
+ * then we may flush the mmu idx.  If a flush is required, then also flush
+ * all dependent mmu indicies in @depmap.  This latter is typically used
+ * for secondary page resolution, for implementing virtualization within
+ * the guest.
+ */
+void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
+                             uint16_t idxmap, uint16_t dep_idxmap);
 /**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
@@ -311,6 +326,10 @@ static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
                                                        uint16_t idxmap)
 {
 }
+static inline void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
+                                           uint16_t idxmap, uint16_t depmap)
+{
+}
 #endif
 
 #define CODE_GEN_ALIGN           16 /* must be >= of the size of a icache line */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index af6bd8ccf9..60b3dc2de3 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -360,6 +360,29 @@ void tlb_flush_page_all_cpus_synced(CPUState *src, target_ulong addr)
     tlb_flush_page_by_mmuidx_all_cpus_synced(src, addr, ALL_MMUIDX_BITS);
 }
 
+void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap,
+                             uint16_t depmap)
+{
+    CPUArchState *env = cpu->env_ptr;
+    uint16_t work, to_flush = 0;
+
+    /*
+     * We don't support ASIDs except for trivially.
+     * If there is any change, then we must flush the TLB.
+     */
+    for (work = idxmap; work != 0; work &= work - 1) {
+        int mmu_idx = ctz32(work);
+        if (env->tlb_d[mmu_idx].asid != asid) {
+            env->tlb_d[mmu_idx].asid = asid;
+            to_flush = idxmap;
+        }
+    }
+    if (to_flush) {
+        to_flush |= depmap;
+        tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(to_flush));
+    }
+}
+
 /* update the TLBs so that writes to code in the virtual page 'addr'
    can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 2/4] target/arm: Install ASIDs for long-form from EL1
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx Richard Henderson
@ 2018-10-29 15:53 ` Richard Henderson
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form " Richard Henderson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2018-10-29 15:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

In addition to providing the core with the current ASID, this minimizes
both the number of flushes due to non-changing ASID as well as the set
of mmu_idx that are affected by each flush.

In particular, updates to the secure mode registers flushes only the
relevant secure mode mmu_idx's, and similarly non-secure updates only
affect non-secure mmu_idx's.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 73 +++++++++++++++++++++++++++++----------------
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0ea95b0815..26d6f28793 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2685,6 +2685,36 @@ static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
     REGINFO_SENTINEL
 };
 
+/* Called after a change to any of TTBR*_EL1 or TTBCR_EL1.  */
+static void update_lpae_el1_asid(CPUARMState *env, int secure)
+{
+    CPUState *cs = CPU(arm_env_get_cpu(env));
+    uint64_t ttbr0, ttbr1, ttcr;
+    int asid, idxmask;
+
+    switch (secure) {
+    case ARM_CP_SECSTATE_S:
+        ttbr0 = env->cp15.ttbr0_s;
+        ttbr1 = env->cp15.ttbr1_s;
+        ttcr = env->cp15.tcr_el[3].raw_tcr;
+        /* Note that cp15.ttbr0_s == cp15.ttbr0_el[3], so S1E3 is affected.  */
+        /* ??? Secure EL3 really using the ASID field?  Doesn't make sense.  */
+        idxmask = ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0 | ARMMMUIdxBit_S1E3;
+        break;
+    case ARM_CP_SECSTATE_NS:
+        ttbr0 = env->cp15.ttbr0_ns;
+        ttbr1 = env->cp15.ttbr1_ns;
+        ttcr = env->cp15.tcr_el[1].raw_tcr;
+        idxmask = ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    asid = extract64(ttcr & TTBCR_A1 ? ttbr1 : ttbr0, 48, 16);
+
+    tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
+}
+
 static void vmsa_ttbcr_raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                  uint64_t value)
 {
@@ -2721,15 +2751,11 @@ static void vmsa_ttbcr_raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void vmsa_ttbcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    ARMCPU *cpu = arm_env_get_cpu(env);
-
-    if (arm_feature(env, ARM_FEATURE_LPAE)) {
-        /* With LPAE the TTBCR could result in a change of ASID
-         * via the TTBCR.A1 bit, so do a TLB flush.
-         */
-        tlb_flush(CPU(cpu));
-    }
     vmsa_ttbcr_raw_write(env, ri, value);
+    if (arm_feature(env, ARM_FEATURE_LPAE)) {
+        /* The A1 bit controls which ASID is active.  */
+        update_lpae_el1_asid(env, ri->secure);
+    }
 }
 
 static void vmsa_ttbcr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -2747,24 +2773,19 @@ static void vmsa_ttbcr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
 static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                uint64_t value)
 {
-    ARMCPU *cpu = arm_env_get_cpu(env);
-    TCR *tcr = raw_ptr(env, ri);
-
-    /* For AArch64 the A1 bit could result in a change of ASID, so TLB flush. */
-    tlb_flush(CPU(cpu));
-    tcr->raw_tcr = value;
+    raw_write(env, ri, value);
+    /* The A1 bit controls which ASID is active.  */
+    update_lpae_el1_asid(env, ri->secure);
 }
 
-static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
+static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                uint64_t value)
 {
-    /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
-    if (cpreg_field_is_64bit(ri) &&
-        extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
-        ARMCPU *cpu = arm_env_get_cpu(env);
-        tlb_flush(CPU(cpu));
-    }
     raw_write(env, ri, value);
+    if (cpreg_field_is_64bit(ri)) {
+        /* The LPAE format (64-bit write) contains an ASID field.  */
+        update_lpae_el1_asid(env, ri->secure);
+    }
 }
 
 static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2810,12 +2831,12 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
@@ -3067,12 +3088,12 @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_el1_write, },
     { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_el1_write, },
     REGINFO_SENTINEL
 };
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form from EL1
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx Richard Henderson
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 2/4] target/arm: Install ASIDs for long-form from EL1 Richard Henderson
@ 2018-10-29 15:53 ` Richard Henderson
  2018-11-15 18:52   ` Peter Maydell
  2018-11-16 13:47   ` Peter Maydell
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 4/4] target/arm: Install ASIDs for EL2 Richard Henderson
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Richard Henderson @ 2018-10-29 15:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

This is less complex than the LPAE case, but still we now avoid the
flush in case it is only the PROCID field that is changing.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 26d6f28793..f767467dcf 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -541,17 +541,31 @@ static void fcse_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    ARMCPU *cpu = arm_env_get_cpu(env);
-
-    if (raw_read(env, ri) != value && !arm_feature(env, ARM_FEATURE_PMSA)
-        && !extended_addresses_enabled(env)) {
-        /* For VMSA (when not using the LPAE long descriptor page table
-         * format) this register includes the ASID, so do a TLB flush.
-         * For PMSA it is purely a process ID and no action is needed.
-         */
-        tlb_flush(CPU(cpu));
-    }
     raw_write(env, ri, value);
+
+    /*
+     * For VMSA (when not using the LPAE long descriptor page table format)
+     * this register includes the ASID.  For PMSA it is purely a process ID
+     * and no action is needed.
+     */
+    if (!arm_feature(env, ARM_FEATURE_PMSA) &&
+        !extended_addresses_enabled(env)) {
+        CPUState *cs = CPU(arm_env_get_cpu(env));
+        int asid = extract32(value, 0, 8);
+        int idxmask;
+
+        switch (ri->secure) {
+        case ARM_CP_SECSTATE_S:
+            idxmask = ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+            break;
+        case ARM_CP_SECSTATE_NS:
+            idxmask = ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
+    }
 }
 
 /* IS variants of TLB operations must affect all cores */
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 4/4] target/arm: Install ASIDs for EL2
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
                   ` (2 preceding siblings ...)
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form " Richard Henderson
@ 2018-10-29 15:53 ` Richard Henderson
  2018-11-15 18:38   ` Peter Maydell
  2018-10-30 15:40 ` [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Emilio G. Cota
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2018-10-29 15:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The VMID is the ASID for the 2nd stage page lookup.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index f767467dcf..4b14f2c05b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2805,17 +2805,23 @@ static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                         uint64_t value)
 {
-    ARMCPU *cpu = arm_env_get_cpu(env);
-    CPUState *cs = CPU(cpu);
+    CPUState *cs = CPU(arm_env_get_cpu(env));
+    int vmid;
 
-    /* Accesses to VTTBR may change the VMID so we must flush the TLB.  */
-    if (raw_read(env, ri) != value) {
-        tlb_flush_by_mmuidx(cs,
-                            ARMMMUIdxBit_S12NSE1 |
-                            ARMMMUIdxBit_S12NSE0 |
-                            ARMMMUIdxBit_S2NS);
-        raw_write(env, ri, value);
-    }
+    raw_write(env, ri, value);
+
+    /*
+     * TODO: with ARMv8.1-VMID16, aarch64 must examine VTCR.VS
+     * (re-evaluating with changes to VTCR) then use bits [63:48].
+     */
+    vmid = extract64(value, 48, 8);
+
+    /*
+     * A change in VMID to the stage2 page table (S2NS) invalidates
+     * the combined stage 1&2 tlbs (S12NSE1 and S12NSE0).
+     */
+    tlb_set_asid_for_mmuidx(cs, vmid, ARMMMUIdxBit_S2NS,
+                            ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0);
 }
 
 static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
                   ` (3 preceding siblings ...)
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 4/4] target/arm: Install ASIDs for EL2 Richard Henderson
@ 2018-10-30 15:40 ` Emilio G. Cota
  2018-11-05 16:30 ` Peter Maydell
  2018-11-15 18:25 ` Peter Maydell
  6 siblings, 0 replies; 15+ messages in thread
From: Emilio G. Cota @ 2018-10-30 15:40 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell

On Mon, Oct 29, 2018 at 15:53:35 +0000, Richard Henderson wrote:
> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04181.html
> (already upstream) I added a check for ASID changes without realizing
> that TTBCR_EL1 has the A1 bit, controlling which register actually
> contains the active ASID.
> 
> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04182.html
> I suggested a set of mmu_idx to flush when the ASID does change.  In
> follow-up, Peter suggested more.
> 
> I now choose secure vs non-secure mmu_idx based on which register is being
> modified, not the current state of the cpu.  Unless I am mistaken, secure
> state can write to the non-secure registers.  Which means that the current
> state of the cpu is irrelevant and only the register matters.
> 
> Peter suggested flushing S1E3 when changing ttbr0_s.  I can see how this
> is overlapped onto the EL3 (Secure Monitor) state, but I cannot see how
> the ASID is used from EL3.  The best evidence I can find for this is that
> there is no TLBIASID* register that is applicable to flushing EL3; that's
> not conclusive proof though.  So while I'm not sure it's necessary, I'm
> also not sure it isn't necessary, and so I've included S1E3 in the flush.
> 
> I now also use the VMID to conditionally invalidate the stage 2 translation
> state.  This shows how I anticipaged @depmap to be used in patch 1.

I've tested this series and the one it depends on (tlb-dirty). I had
yesterday a guest running parallel compilation jobs for ~12h with
no issues. So

Tested-by: Emilio G. Cota <cota@braap.org>

for this and the tlb-dirty series.

Richard: the crash I reported on IRC must have been due to unrelated
changes. I was testing some TLB experiments over the weekend so I must
have forgotten to rebuild.

Thanks,

		Emilio

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
                   ` (4 preceding siblings ...)
  2018-10-30 15:40 ` [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Emilio G. Cota
@ 2018-11-05 16:30 ` Peter Maydell
  2018-11-05 17:38   ` Richard Henderson
  2018-11-15 18:25 ` Peter Maydell
  6 siblings, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2018-11-05 16:30 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 29 October 2018 at 15:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04181.html
> (already upstream) I added a check for ASID changes without realizing
> that TTBCR_EL1 has the A1 bit, controlling which register actually
> contains the active ASID.
>
> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04182.html
> I suggested a set of mmu_idx to flush when the ASID does change.  In
> follow-up, Peter suggested more.
>
> I now choose secure vs non-secure mmu_idx based on which register is being
> modified, not the current state of the cpu.  Unless I am mistaken, secure
> state can write to the non-secure registers.  Which means that the current
> state of the cpu is irrelevant and only the register matters.
>
> Peter suggested flushing S1E3 when changing ttbr0_s.  I can see how this
> is overlapped onto the EL3 (Secure Monitor) state, but I cannot see how
> the ASID is used from EL3.  The best evidence I can find for this is that
> there is no TLBIASID* register that is applicable to flushing EL3; that's
> not conclusive proof though.  So while I'm not sure it's necessary, I'm
> also not sure it isn't necessary, and so I've included S1E3 in the flush.
>
> I now also use the VMID to conditionally invalidate the stage 2 translation
> state.  This shows how I anticipaged @depmap to be used in patch 1.

Hi; just a note that I'm deferring this series to 4.0, so may
not get to reviewing it for a bit.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes
  2018-11-05 16:30 ` Peter Maydell
@ 2018-11-05 17:38   ` Richard Henderson
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2018-11-05 17:38 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 11/5/18 4:30 PM, Peter Maydell wrote:
> On 29 October 2018 at 15:53, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04181.html
>> (already upstream) I added a check for ASID changes without realizing
>> that TTBCR_EL1 has the A1 bit, controlling which register actually
>> contains the active ASID.
>>
>> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04182.html
>> I suggested a set of mmu_idx to flush when the ASID does change.  In
>> follow-up, Peter suggested more.
>>
>> I now choose secure vs non-secure mmu_idx based on which register is being
>> modified, not the current state of the cpu.  Unless I am mistaken, secure
>> state can write to the non-secure registers.  Which means that the current
>> state of the cpu is irrelevant and only the register matters.
>>
>> Peter suggested flushing S1E3 when changing ttbr0_s.  I can see how this
>> is overlapped onto the EL3 (Secure Monitor) state, but I cannot see how
>> the ASID is used from EL3.  The best evidence I can find for this is that
>> there is no TLBIASID* register that is applicable to flushing EL3; that's
>> not conclusive proof though.  So while I'm not sure it's necessary, I'm
>> also not sure it isn't necessary, and so I've included S1E3 in the flush.
>>
>> I now also use the VMID to conditionally invalidate the stage 2 translation
>> state.  This shows how I anticipaged @depmap to be used in patch 1.
> 
> Hi; just a note that I'm deferring this series to 4.0, so may
> not get to reviewing it for a bit.

That's fine.  By that time I will have folded it into my v8.1 series, since I
expect to use the asid field as a part of the VHE extension.


r~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes
  2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
                   ` (5 preceding siblings ...)
  2018-11-05 16:30 ` Peter Maydell
@ 2018-11-15 18:25 ` Peter Maydell
  6 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2018-11-15 18:25 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 29 October 2018 at 15:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04181.html
> (already upstream) I added a check for ASID changes without realizing
> that TTBCR_EL1 has the A1 bit, controlling which register actually
> contains the active ASID.
>
> In http://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg04182.html
> I suggested a set of mmu_idx to flush when the ASID does change.  In
> follow-up, Peter suggested more.
>
> I now choose secure vs non-secure mmu_idx based on which register is being
> modified, not the current state of the cpu.  Unless I am mistaken, secure
> state can write to the non-secure registers.  Which means that the current
> state of the cpu is irrelevant and only the register matters.
>
> Peter suggested flushing S1E3 when changing ttbr0_s.  I can see how this
> is overlapped onto the EL3 (Secure Monitor) state, but I cannot see how
> the ASID is used from EL3.  The best evidence I can find for this is that
> there is no TLBIASID* register that is applicable to flushing EL3; that's
> not conclusive proof though.  So while I'm not sure it's necessary, I'm
> also not sure it isn't necessary, and so I've included S1E3 in the flush.

"TLBIASID" flushes EL3, when EL3 is AArch32 and you are in Secure PL1
(ie EL3). From the register spec:
"If executed in Secure state when EL3 is using AArch32, the Secure PL1&0
translation regime."
The spec's "SecurePL1&0 translation regime" is in QEMU ARMMMUIdx_S1E3.
TTBR0(S) contains an ASID if TTBCR(S).EAE is 1.

(If EL3 is AArch64 then there's no TTBR0(S), only TTBR_EL3, which doesn't
have an ASID.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx Richard Henderson
@ 2018-11-15 18:36   ` Peter Maydell
  2018-11-15 18:51     ` Richard Henderson
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2018-11-15 18:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 29 October 2018 at 15:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
> Although we can't do much with ASIDs except remember them, this
> will allow cleanups within target/ that should make things clearer.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  include/exec/cpu-defs.h |  2 ++
>  include/exec/exec-all.h | 19 +++++++++++++++++++
>  accel/tcg/cputlb.c      | 23 +++++++++++++++++++++++
>  3 files changed, 44 insertions(+)
>
> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
> index 6a60f94a41..8fbfe8c8e2 100644
> --- a/include/exec/cpu-defs.h
> +++ b/include/exec/cpu-defs.h
> @@ -152,6 +152,8 @@ typedef struct CPUTLBDesc {
>      target_ulong large_page_mask;
>      /* The next index to use in the tlb victim table.  */
>      size_t vindex;
> +    /* The current ASID for this tlb.  */
> +    uint32_t asid;
>  } CPUTLBDesc;
>
>  /*
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 815e5b1e83..478f488704 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -226,6 +226,21 @@ void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap);
>   * depend on when the guests translation ends the TB.
>   */
>  void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, uint16_t idxmap);
> +/**
> + * tlb_set_asid_for_mmuidx:
> + * @cpu: Originating cpu
> + * @asid: Address Space Identifier
> + * @idxmap: bitmap of MMU indicies to set to @asid

"indices", but it looks like in this header we consistently
use "MMU indexes", so better to stick with that. (Similarly below.)

> + * @depmap: bitmap of dependent MMU indicies
> + *
> + * Set an ASID for all of @idxmap.  If any previous ASID was different,
> + * then we may flush the mmu idx.  If a flush is required, then also flush

presumably "will flush", rather than "may flush" ?

> + * all dependent mmu indicies in @depmap.  This latter is typically used
> + * for secondary page resolution, for implementing virtualization within
> + * the guest.
> + */
> +void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
> +                             uint16_t idxmap, uint16_t dep_idxmap);
>  /**
>   * tlb_set_page_with_attrs:
>   * @cpu: CPU to add this TLB entry for
> @@ -311,6 +326,10 @@ static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
>                                                         uint16_t idxmap)
>  {
>  }
> +static inline void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
> +                                           uint16_t idxmap, uint16_t depmap)
> +{
> +}
>  #endif
>
>  #define CODE_GEN_ALIGN           16 /* must be >= of the size of a icache line */
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index af6bd8ccf9..60b3dc2de3 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -360,6 +360,29 @@ void tlb_flush_page_all_cpus_synced(CPUState *src, target_ulong addr)
>      tlb_flush_page_by_mmuidx_all_cpus_synced(src, addr, ALL_MMUIDX_BITS);
>  }
>
> +void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap,
> +                             uint16_t depmap)
> +{
> +    CPUArchState *env = cpu->env_ptr;
> +    uint16_t work, to_flush = 0;

The other functions that work on the tlb defer their
actual operation to an async-work type function or
do a run-on-cpu if the passed-in CPU is not the current
CPU. Do we need to do that here too?

> +
> +    /*
> +     * We don't support ASIDs except for trivially.
> +     * If there is any change, then we must flush the TLB.
> +     */
> +    for (work = idxmap; work != 0; work &= work - 1) {
> +        int mmu_idx = ctz32(work);
> +        if (env->tlb_d[mmu_idx].asid != asid) {
> +            env->tlb_d[mmu_idx].asid = asid;
> +            to_flush = idxmap;
> +        }
> +    }

So this will flush all the passed in indexes in idxmap
if any one of them was previously the wrong ASID. Is that
necessary, or could we just flush only the ones which
were wrong and not flush any that were already the correct ASID ?

> +    if (to_flush) {
> +        to_flush |= depmap;
> +        tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(to_flush));
> +    }
> +}
> +
>  /* update the TLBs so that writes to code in the virtual page 'addr'
>     can be detected */
>  void tlb_protect_code(ram_addr_t ram_addr)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/arm: Install ASIDs for EL2
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 4/4] target/arm: Install ASIDs for EL2 Richard Henderson
@ 2018-11-15 18:38   ` Peter Maydell
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2018-11-15 18:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 29 October 2018 at 15:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
> The VMID is the ASID for the 2nd stage page lookup.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx
  2018-11-15 18:36   ` Peter Maydell
@ 2018-11-15 18:51     ` Richard Henderson
  2018-11-15 18:56       ` Peter Maydell
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2018-11-15 18:51 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 11/15/18 7:36 PM, Peter Maydell wrote:
> On 29 October 2018 at 15:53, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> Although we can't do much with ASIDs except remember them, this
>> will allow cleanups within target/ that should make things clearer.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  include/exec/cpu-defs.h |  2 ++
>>  include/exec/exec-all.h | 19 +++++++++++++++++++
>>  accel/tcg/cputlb.c      | 23 +++++++++++++++++++++++
>>  3 files changed, 44 insertions(+)
>>
>> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
>> index 6a60f94a41..8fbfe8c8e2 100644
>> --- a/include/exec/cpu-defs.h
>> +++ b/include/exec/cpu-defs.h
>> @@ -152,6 +152,8 @@ typedef struct CPUTLBDesc {
>>      target_ulong large_page_mask;
>>      /* The next index to use in the tlb victim table.  */
>>      size_t vindex;
>> +    /* The current ASID for this tlb.  */
>> +    uint32_t asid;
>>  } CPUTLBDesc;
>>
>>  /*
>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>> index 815e5b1e83..478f488704 100644
>> --- a/include/exec/exec-all.h
>> +++ b/include/exec/exec-all.h
>> @@ -226,6 +226,21 @@ void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap);
>>   * depend on when the guests translation ends the TB.
>>   */
>>  void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, uint16_t idxmap);
>> +/**
>> + * tlb_set_asid_for_mmuidx:
>> + * @cpu: Originating cpu
>> + * @asid: Address Space Identifier
>> + * @idxmap: bitmap of MMU indicies to set to @asid
> 
> "indices", but it looks like in this header we consistently
> use "MMU indexes", so better to stick with that. (Similarly below.)
> 
>> + * @depmap: bitmap of dependent MMU indicies
>> + *
>> + * Set an ASID for all of @idxmap.  If any previous ASID was different,
>> + * then we may flush the mmu idx.  If a flush is required, then also flush
> 
> presumably "will flush", rather than "may flush" ?
> 
>> + * all dependent mmu indicies in @depmap.  This latter is typically used
>> + * for secondary page resolution, for implementing virtualization within
>> + * the guest.
>> + */
>> +void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
>> +                             uint16_t idxmap, uint16_t dep_idxmap);
>>  /**
>>   * tlb_set_page_with_attrs:
>>   * @cpu: CPU to add this TLB entry for
>> @@ -311,6 +326,10 @@ static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
>>                                                         uint16_t idxmap)
>>  {
>>  }
>> +static inline void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
>> +                                           uint16_t idxmap, uint16_t depmap)
>> +{
>> +}
>>  #endif
>>
>>  #define CODE_GEN_ALIGN           16 /* must be >= of the size of a icache line */
>> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
>> index af6bd8ccf9..60b3dc2de3 100644
>> --- a/accel/tcg/cputlb.c
>> +++ b/accel/tcg/cputlb.c
>> @@ -360,6 +360,29 @@ void tlb_flush_page_all_cpus_synced(CPUState *src, target_ulong addr)
>>      tlb_flush_page_by_mmuidx_all_cpus_synced(src, addr, ALL_MMUIDX_BITS);
>>  }
>>
>> +void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap,
>> +                             uint16_t depmap)
>> +{
>> +    CPUArchState *env = cpu->env_ptr;
>> +    uint16_t work, to_flush = 0;
> 
> The other functions that work on the tlb defer their
> actual operation to an async-work type function or
> do a run-on-cpu if the passed-in CPU is not the current
> CPU. Do we need to do that here too?

I don't *think* so.  I would expect an ASID to be set in response to some
cpu-local change, like setting TTBR0_EL1.  Something that cannot be done across
cpus.  I could assert_cpu_is_self, if you like.

What I *should* be adding as well is cross-cpu asid-specific tlb flushing, a-la
TLBI ASID.  That would need the run-on-cpu stuff.

> So this will flush all the passed in indexes in idxmap
> if any one of them was previously the wrong ASID. Is that
> necessary, or could we just flush only the ones which
> were wrong and not flush any that were already the correct ASID ?

It probably wouldn't be necessary.  But I also sort of expect the set of
indexes to always have the same ASID -- e.g. kernel vs user views of the same
address space, e.g. S12NSE0 + S12NSE1.  I don't really know what to do when
this presumed invariant is violated.


r~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form from EL1
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form " Richard Henderson
@ 2018-11-15 18:52   ` Peter Maydell
  2018-11-16 13:47   ` Peter Maydell
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2018-11-15 18:52 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 29 October 2018 at 15:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
> This is less complex than the LPAE case, but still we now avoid the
> flush in case it is only the PROCID field that is changing.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 34 ++++++++++++++++++++++++----------
>  1 file changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 26d6f28793..f767467dcf 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -541,17 +541,31 @@ static void fcse_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
>  static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    ARMCPU *cpu = arm_env_get_cpu(env);
> -
> -    if (raw_read(env, ri) != value && !arm_feature(env, ARM_FEATURE_PMSA)
> -        && !extended_addresses_enabled(env)) {
> -        /* For VMSA (when not using the LPAE long descriptor page table
> -         * format) this register includes the ASID, so do a TLB flush.
> -         * For PMSA it is purely a process ID and no action is needed.
> -         */
> -        tlb_flush(CPU(cpu));
> -    }
>      raw_write(env, ri, value);
> +
> +    /*
> +     * For VMSA (when not using the LPAE long descriptor page table format)
> +     * this register includes the ASID.  For PMSA it is purely a process ID
> +     * and no action is needed.
> +     */
> +    if (!arm_feature(env, ARM_FEATURE_PMSA) &&
> +        !extended_addresses_enabled(env)) {
> +        CPUState *cs = CPU(arm_env_get_cpu(env));
> +        int asid = extract32(value, 0, 8);
> +        int idxmask;
> +
> +        switch (ri->secure) {
> +        case ARM_CP_SECSTATE_S:
> +            idxmask = ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
> +            break;
> +        case ARM_CP_SECSTATE_NS:
> +            idxmask = ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
> +            break;
> +        default:
> +            g_assert_not_reached();
> +        }

If EL3 is AArch32 then changes to CONTEXTIDR(S) need to
invalidate the S1E3 MMU index.

If EL3 is not AArch32 then there is no CONTEXTIDR(S), but
if we are at EL3 then whether CONTEXTIDR applies to
ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0 or to
ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0 is tricky,
because it's up to EL3 to swap registers around depending
on whether it wants to enter EL1 in secure or nonsecure state.
I need to check more deeply about how this works.

NB also that I'm a bit suspicious of the definition of
extended_addresses_enabled(); need to check it.

> +        tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
> +    }
>  }
>
>  /* IS variants of TLB operations must affect all cores */
> --
> 2.17.2

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx
  2018-11-15 18:51     ` Richard Henderson
@ 2018-11-15 18:56       ` Peter Maydell
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2018-11-15 18:56 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 15 November 2018 at 18:51, Richard Henderson
<richard.henderson@linaro.org> wrote:
> On 11/15/18 7:36 PM, Peter Maydell wrote:
>> On 29 October 2018 at 15:53, Richard Henderson
>> <richard.henderson@linaro.org> wrote:
>>> Although we can't do much with ASIDs except remember them, this
>>> will allow cleanups within target/ that should make things clearer.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

>>> +void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap,
>>> +                             uint16_t depmap)
>>> +{
>>> +    CPUArchState *env = cpu->env_ptr;
>>> +    uint16_t work, to_flush = 0;
>>
>> The other functions that work on the tlb defer their
>> actual operation to an async-work type function or
>> do a run-on-cpu if the passed-in CPU is not the current
>> CPU. Do we need to do that here too?
>
> I don't *think* so.  I would expect an ASID to be set in response to some
> cpu-local change, like setting TTBR0_EL1.  Something that cannot be done across
> cpus.  I could assert_cpu_is_self, if you like.

Yes, if the function expects to be called only for the
current CPU we should assert and document this.

> What I *should* be adding as well is cross-cpu asid-specific tlb flushing, a-la
> TLBI ASID.  That would need the run-on-cpu stuff.
>
>> So this will flush all the passed in indexes in idxmap
>> if any one of them was previously the wrong ASID. Is that
>> necessary, or could we just flush only the ones which
>> were wrong and not flush any that were already the correct ASID ?
>
> It probably wouldn't be necessary.  But I also sort of expect the set of
> indexes to always have the same ASID -- e.g. kernel vs user views of the same
> address space, e.g. S12NSE0 + S12NSE1.  I don't really know what to do when
> this presumed invariant is violated.

We should either document the invariant at the cputlb.c level
(and assert it, if not too tedious to do), or say that the cputlb
code doesn't care, and just only flush the indexes which have the
wrong ASID currently, I think.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form from EL1
  2018-10-29 15:53 ` [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form " Richard Henderson
  2018-11-15 18:52   ` Peter Maydell
@ 2018-11-16 13:47   ` Peter Maydell
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2018-11-16 13:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 29 October 2018 at 15:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
> This is less complex than the LPAE case, but still we now avoid the
> flush in case it is only the PROCID field that is changing.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 34 ++++++++++++++++++++++++----------
>  1 file changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 26d6f28793..f767467dcf 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -541,17 +541,31 @@ static void fcse_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
>  static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    ARMCPU *cpu = arm_env_get_cpu(env);
> -
> -    if (raw_read(env, ri) != value && !arm_feature(env, ARM_FEATURE_PMSA)
> -        && !extended_addresses_enabled(env)) {
> -        /* For VMSA (when not using the LPAE long descriptor page table
> -         * format) this register includes the ASID, so do a TLB flush.
> -         * For PMSA it is purely a process ID and no action is needed.
> -         */
> -        tlb_flush(CPU(cpu));
> -    }
>      raw_write(env, ri, value);
> +
> +    /*
> +     * For VMSA (when not using the LPAE long descriptor page table format)
> +     * this register includes the ASID.  For PMSA it is purely a process ID
> +     * and no action is needed.
> +     */

I've now thought about this a bit and read through some of
the relevant bits of the Arm ARM. My updated opinion is below:

General principles first:
 * the set of MMU indexes we need to do a set-asid or flush for
depends on the register being written (and not the context
from which it is being written), ie your remarks to this effect
in the cover letter are correct
 * for some registers the register is used by translations in
exactly one translation regime, and therefore the situation is
simple (eg TTBR(S) is only used by S1E3|S1SE0)
 * for some registers the same register may be used by more than
one translation regime, eg AArch64 CONTEXTIDR_EL1 written from
EL3 may be because we're setting up the translation regime
for Secure EL1&0 (ie S1SE1|S1SE0) or for NS EL1&0 (S12NSE1|S12NSE0),
and we can't tell which at this point.

> +    if (!arm_feature(env, ARM_FEATURE_PMSA) &&
> +        !extended_addresses_enabled(env)) {

The current definition of extended_addresses_enabled(), which
is used only here, is:

static inline bool extended_addresses_enabled(CPUARMState *env)
{
    TCR *tcr = &env->cp15.tcr_el[arm_is_secure(env) ? 3 : 1];
    return arm_el_is_aa64(env, 1) ||
           (arm_feature(env, ARM_FEATURE_LPAE) && (tcr->raw_tcr & TTBCR_EAE));
}

This is bogus both because of that hardcoded '1' argument to
arm_el_is_aa64() and also because it's asking a question about
the current state of the CPU, whereas what we'd like to know
is whether the translation regime affected by the register
which we are changing is using extended addressing.
We also need to make sure that any state we depend upon here
when determining which indexes to flush is either:
 * in the list of things the architecture says can be
   cached in a tlb and so the guest is going to have
   to do tlb maintenance ops if they change it
 * something that causes QEMU to do a tlb flush or asid update
   if it changes

We should fix the condition we're checking, and I think also
just do it inline here and get rid of extended_addresses_enabled(),
which looks like a general-purpose utility function but isn't.
(We've previously managed to get rid of all the other uses
of it, which were generally bugs.)

> +        CPUState *cs = CPU(arm_env_get_cpu(env));
> +        int asid = extract32(value, 0, 8);
> +        int idxmask;
> +
> +        switch (ri->secure) {
> +        case ARM_CP_SECSTATE_S:
> +            idxmask = ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;

This must be TTBR(S), which means EL3 must be AArch32 and we're
in Secure EL3 (aka Secure PL1), and the ASID here affects only
S1E3|SESE0.

> +            break;
> +        case ARM_CP_SECSTATE_NS:
> +            idxmask = ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;

There are two possibilities here:
 * EL3 is AArch32: this must be TTBR(NS), in which case it affects
   S12NSE1|S12NSE0
 * EL3 is AArch64, in which case this is the one and only
   CONTEXTIDR_EL1 (whether we're accessing it from AArch32 or
   AArch64), and it affects either S1SE1|S1SE0 or S12NSE1|S12NSE0.
   If we're executing at EL2 or below then we can know which
   of those two is affected (it will be the current CPU state),
   but we can't set-asid just the one that is affected unless we're
   also willing to arrange to do a set-asid for the other security
   state when we transition into EL3. (Otherwise "be in NS EL1;
   set CONTEXTIDR_EL1; go to EL3; set S bit; go to S EL1" will
   take you to S EL1 with a new CONTEXTIDR value without having
   updated the ASID for its MMU indexes.)
   (If we have no EL3 at all then obviously we need not flush
   the non-existent Secure regime TLBs.)

> +            break;
> +        default:
> +            g_assert_not_reached();
> +        }
> +        tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
> +    }
>  }
>
>  /* IS variants of TLB operations must affect all cores */
> --
> 2.17.2

I haven't reviewed patch 2 but I expect that the above
applies there too, give or take.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-11-16 13:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-29 15:53 [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Richard Henderson
2018-10-29 15:53 ` [Qemu-devel] [PATCH 1/4] cputlb: Add tlb_set_asid_for_mmuidx Richard Henderson
2018-11-15 18:36   ` Peter Maydell
2018-11-15 18:51     ` Richard Henderson
2018-11-15 18:56       ` Peter Maydell
2018-10-29 15:53 ` [Qemu-devel] [PATCH 2/4] target/arm: Install ASIDs for long-form from EL1 Richard Henderson
2018-10-29 15:53 ` [Qemu-devel] [PATCH 3/4] target/arm: Install ASIDs for short-form " Richard Henderson
2018-11-15 18:52   ` Peter Maydell
2018-11-16 13:47   ` Peter Maydell
2018-10-29 15:53 ` [Qemu-devel] [PATCH 4/4] target/arm: Install ASIDs for EL2 Richard Henderson
2018-11-15 18:38   ` Peter Maydell
2018-10-30 15:40 ` [Qemu-devel] [PATCH 0/4] target/arm: Minimize TLB flushing for ASID changes Emilio G. Cota
2018-11-05 16:30 ` Peter Maydell
2018-11-05 17:38   ` Richard Henderson
2018-11-15 18:25 ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.