All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522
@ 2018-11-28 16:49 Julien Grall
  2018-11-28 16:49 ` [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2 Julien Grall
                   ` (7 more replies)
  0 siblings, 8 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

Hi all,

Early version of Cortex-A76 can end-up with corrupt TLBs if they
speculate an AT instruction while the S1/S2 system registers are in an
inconsistent state.

This can happen during guest context switch and when invalidating the
TLBs for other than the current VMID.

The workaround implemented in Xen will:
    - Use an empty stage-2 with a reserved VMID while context
      switching between 2 guests
    - Use an empty stage-2 with the VMID where TLBs need to
      be flushed

Cheers,

Julien Grall (8):
  xen/arm: Only set necessary flags when initializing HCR_EL2
  xen/arm: p2m: Provide an helper to generate the VTTBR
  xen/arm: p2m: Introduce an helper to allocate the root page-table
  xen/arm: domain_build: Don't switch to the guest P2M when copying data
  xen/arm: p2m: Only use isb() when it is necessary
  xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  xen/arm: p2m: Clean-up headers included and order them alphabetically
  DO NOT APPLY Allow testing the new AT speculate workaround code

 docs/misc/arm/silicon-errata.txt |   1 +
 xen/arch/arm/cpuerrata.c         |  16 +++++
 xen/arch/arm/domain.c            |   8 ++-
 xen/arch/arm/domain_build.c      |  13 ----
 xen/arch/arm/p2m.c               | 138 ++++++++++++++++++++++++++++++++-------
 xen/arch/arm/traps.c             |   8 ++-
 xen/include/asm-arm/cpufeature.h |   3 +-
 xen/include/asm-arm/processor.h  |   2 +
 8 files changed, 146 insertions(+), 43 deletions(-)

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 14:33   ` Andrii Anisov
  2019-01-23 23:50   ` Stefano Stabellini
  2018-11-28 16:49 ` [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR Julien Grall
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

Only {A,F,I}MO are necessary to receive interrupts until a guest vCPU is
loaded.

The rest have no effect on Xen and it is better to avoid setting them.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/traps.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 88ffeeb480..1eec966299 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -181,8 +181,12 @@ void init_traps(void)
     WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
                  CPTR_EL2);
 
-    /* Setup hypervisor traps */
-    WRITE_SYSREG(get_default_hcr_flags(), HCR_EL2);
+    /*
+     * Configure HCR_EL2 with the bareminimum to run Xen until a guest
+     * is scheduled. {A,I,F}MO bits are set to allow EL2 receiving
+     * interrupts.
+     */
+    WRITE_SYSREG(HCR_AMO | HCR_FMO | HCR_IMO, HCR_EL2);
     isb();
 }
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
  2018-11-28 16:49 ` [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2 Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 14:34   ` Andrii Anisov
  2019-01-23 23:27   ` Stefano Stabellini
  2018-11-28 16:49 ` [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table Julien Grall
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

A follow-up patch will need to generate the VTTBR in a few places.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 6c76298ebc..8ebf1e8dba 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -47,6 +47,11 @@ static const paddr_t level_masks[] =
 static const uint8_t level_orders[] =
     { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
 
+static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
+{
+    return (mfn_to_maddr(root_mfn) | ((uint64_t)vmid << 48));
+}
+
 /* Unlock the flush and do a P2M TLB flush if necessary */
 void p2m_write_unlock(struct p2m_domain *p2m)
 {
@@ -1147,7 +1152,7 @@ static int p2m_alloc_table(struct domain *d)
 
     p2m->root = page;
 
-    p2m->vttbr = page_to_maddr(p2m->root) | ((uint64_t)p2m->vmid << 48);
+    p2m->vttbr = generate_vttbr(p2m->vmid, page_to_mfn(p2m->root));
 
     /*
      * Make sure that all TLBs corresponding to the new VMID are flushed
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
  2018-11-28 16:49 ` [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2 Julien Grall
  2018-11-28 16:49 ` [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 14:35   ` Andrii Anisov
  2019-01-23 23:29   ` Stefano Stabellini
  2018-11-28 16:49 ` [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data Julien Grall
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

A follow-up patch will require to allocate the root page-table without
having a domain in hand.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8ebf1e8dba..e8bacab9d2 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1136,21 +1136,29 @@ int guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
     return p2m_remove_mapping(d, gfn, (1 << page_order), mfn);
 }
 
-static int p2m_alloc_table(struct domain *d)
+static struct page_info *p2m_allocate_root(void)
 {
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
     struct page_info *page;
     unsigned int i;
 
     page = alloc_domheap_pages(NULL, P2M_ROOT_ORDER, 0);
     if ( page == NULL )
-        return -ENOMEM;
+        return NULL;
 
     /* Clear both first level pages */
     for ( i = 0; i < P2M_ROOT_PAGES; i++ )
         clear_and_clean_page(page + i);
 
-    p2m->root = page;
+    return page;
+}
+
+static int p2m_alloc_table(struct domain *d)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    p2m->root = p2m_allocate_root();
+    if ( !p2m->root )
+        return -ENOMEM;
 
     p2m->vttbr = generate_vttbr(p2m->vmid, page_to_mfn(p2m->root));
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
                   ` (2 preceding siblings ...)
  2018-11-28 16:49 ` [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 15:16   ` Andrii Anisov
  2019-01-24  0:08   ` Stefano Stabellini
  2018-11-28 16:49 ` [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary Julien Grall
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

Until recently, kernel/initrd/dtb were loaded using guest VA and
therefore requiring to restore temporarily the P2M. This reworked in a
series of commits (up to 9292086 "xen/arm: domain_build: Use
copy_to_guest_phys_flush_dcache in dtb_load") to use a guest PA.

This will also help a follow-up patch which will require
p2m_{save,restore}_state to work in pair to workaround an erratum.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/domain_build.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index b0ec3f0b72..ffbf7c6760 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1920,7 +1920,6 @@ static void __init find_gnttab_region(struct domain *d,
 
 static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
 {
-    struct vcpu *saved_current;
     int i, cpu;
     struct vcpu *v = d->vcpu[0];
     struct cpu_user_regs *regs = &v->arch.cpu_info->guest_cpu_user_regs;
@@ -1942,14 +1941,6 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
 #endif
 
     /*
-     * The following loads use the domain's p2m and require current to
-     * be a vcpu of the domain, temporarily switch
-     */
-    saved_current = current;
-    p2m_restore_state(v);
-    set_current(v);
-
-    /*
      * kernel_load will determine the placement of the kernel as well
      * as the initrd & fdt in RAM, so call it first.
      */
@@ -1958,10 +1949,6 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     initrd_load(kinfo);
     dtb_load(kinfo);
 
-    /* Now that we are done restore the original p2m and current. */
-    set_current(saved_current);
-    p2m_restore_state(saved_current);
-
     memset(regs, 0, sizeof(*regs));
 
     regs->pc = (register_t)kinfo->entry;
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
                   ` (3 preceding siblings ...)
  2018-11-28 16:49 ` [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 14:36   ` Andrii Anisov
                     ` (2 more replies)
  2018-11-28 16:49 ` [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522 Julien Grall
                   ` (2 subsequent siblings)
  7 siblings, 3 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

The EL1 translation regime is out-of-context when running at EL2. This
means the processor cannot speculate memory accesses using the registers
associated to that regime.

An isb() is only need if Xen is going to use the translation regime
before returning to the guest (exception returns will synchronized the
context).

Remove unecessary isb() and document the ones left.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index e8bacab9d2..844833c4c3 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -112,22 +112,28 @@ void p2m_restore_state(struct vcpu *n)
         return;
 
     WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
-    isb();
-
     WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
-    isb();
-
     WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
-    isb();
 
     last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
 
     /*
+     * While we are restoring an out-of-context translation regime
+     * we still need to ensure:
+     *  - VTTBR_EL2 is synchronized before flushing the TLBs
+     *  - All registers for EL1 are synchronized before executing an AT
+     *  instructions targeting S1/S2.
+     */
+    isb();
+
+    /*
      * Flush local TLB for the domain to prevent wrong TLB translation
      * when running multiple vCPU of the same domain on a single pCPU.
      */
     if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
+    {
         flush_tlb_local();
+    }
 
     *last_vcpu_ran = n->vcpu_id;
 }
@@ -153,6 +159,7 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
     {
         local_irq_save(flags);
         WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
+        /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
         isb();
     }
 
@@ -161,6 +168,7 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
     if ( ovttbr != READ_SYSREG64(VTTBR_EL2) )
     {
         WRITE_SYSREG64(ovttbr, VTTBR_EL2);
+        /* Ensure VTTBR_EL2 is back in place before continuing. */
         isb();
         local_irq_restore(flags);
     }
@@ -1496,7 +1504,6 @@ static uint32_t __read_mostly vtcr;
 static void setup_virt_paging_one(void *data)
 {
     WRITE_SYSREG32(vtcr, VTCR_EL2);
-    isb();
 }
 
 void __init setup_virt_paging(void)
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
                   ` (4 preceding siblings ...)
  2018-11-28 16:49 ` [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 15:47   ` Andrii Anisov
  2019-01-24  0:52   ` Stefano Stabellini
  2018-11-28 16:49 ` [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
  2018-11-28 16:49 ` [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code Julien Grall
  7 siblings, 2 replies; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

Early version of Cortex-A76 can end-up with corrupt TLBs if they
speculate an AT instruction while the S1/S2 system registers are in an
inconsistent state.

This can happen during guest context switch and when invalidating the
TLBs for other than the current VMID.

The workaround implemented in Xen will:
    - Use an empty stage-2 with a reserved VMID while context switching
    between 2 guests
    - Use an empty stage-2 with the VMID where TLBs need to be flushed

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 docs/misc/arm/silicon-errata.txt |  1 +
 xen/arch/arm/cpuerrata.c         |  6 ++++
 xen/arch/arm/domain.c            |  8 +++--
 xen/arch/arm/p2m.c               | 78 ++++++++++++++++++++++++++++++++++++++--
 xen/include/asm-arm/cpufeature.h |  3 +-
 xen/include/asm-arm/processor.h  |  2 ++
 6 files changed, 93 insertions(+), 5 deletions(-)

diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
index 906bf5fd48..6cd1366f15 100644
--- a/docs/misc/arm/silicon-errata.txt
+++ b/docs/misc/arm/silicon-errata.txt
@@ -48,4 +48,5 @@ stable hypervisors.
 | ARM            | Cortex-A57      | #852523         | N/A                     |
 | ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
 | ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
+| ARM            | Cortex-A76      | #1165522        | N/A                     |
 | ARM            | MMU-500         | #842869         | N/A                     |
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index adf88e7bdc..61c64b9816 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -489,6 +489,12 @@ static const struct arm_cpu_capabilities arm_errata[] = {
         .matches = has_ssbd_mitigation,
     },
 #endif
+    {
+        /* Cortex-A76 r0p0 - r2p0 */
+        .desc = "ARM erratum 116522",
+        .capability = ARM64_WORKAROUND_AT_SPECULATE,
+        MIDR_RANGE(MIDR_CORTEX_A76, 0, 2 << MIDR_VARIANT_SHIFT),
+    },
     {},
 };
 
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 1d926dcb29..3180edd89d 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -181,8 +181,6 @@ static void ctxt_switch_to(struct vcpu *n)
     if ( is_idle_vcpu(n) )
         return;
 
-    p2m_restore_state(n);
-
     vpidr = READ_SYSREG32(MIDR_EL1);
     WRITE_SYSREG32(vpidr, VPIDR_EL2);
     WRITE_SYSREG(n->arch.vmpidr, VMPIDR_EL2);
@@ -235,6 +233,12 @@ static void ctxt_switch_to(struct vcpu *n)
 #endif
     isb();
 
+    /*
+     * ARM64_WORKAROUND_AT_SPECULATE: The P2M should be restored after
+     * the stage-1 MMU sysregs have been restored.
+     */
+    p2m_restore_state(n);
+
     /* Control Registers */
     WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 844833c4c3..0facb66096 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -15,6 +15,7 @@
 #include <asm/event.h>
 #include <asm/hardirq.h>
 #include <asm/page.h>
+#include <asm/alternative.h>
 
 #define MAX_VMID_8_BIT  (1UL << 8)
 #define MAX_VMID_16_BIT (1UL << 16)
@@ -47,6 +48,8 @@ static const paddr_t level_masks[] =
 static const uint8_t level_orders[] =
     { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
 
+static mfn_t __read_mostly empty_root_mfn;
+
 static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
 {
     return (mfn_to_maddr(root_mfn) | ((uint64_t)vmid << 48));
@@ -98,9 +101,25 @@ void dump_p2m_lookup(struct domain *d, paddr_t addr)
                  P2M_ROOT_LEVEL, P2M_ROOT_PAGES);
 }
 
+/*
+ * p2m_save_state and p2m_restore_state works in pair to workaround
+ * ARM64_WORKAROUND_AT_SPECULATE. p2m_save_state will set-up VTTBR to
+ * point to the empty page-tables to stop allocating TLB entries.
+ */
 void p2m_save_state(struct vcpu *p)
 {
     p->arch.sctlr = READ_SYSREG(SCTLR_EL1);
+
+    if ( cpus_have_const_cap(ARM64_WORKAROUND_AT_SPECULATE) )
+    {
+        WRITE_SYSREG64(generate_vttbr(INVALID_VMID, empty_root_mfn), VTTBR_EL2);
+        /*
+         * Ensure VTTBR_EL2 is correctly synchronized so we can restore
+         * the next vCPU context without worrying about AT instruction
+         * speculation.
+         */
+        isb();
+    }
 }
 
 void p2m_restore_state(struct vcpu *n)
@@ -111,10 +130,17 @@ void p2m_restore_state(struct vcpu *n)
     if ( is_idle_vcpu(n) )
         return;
 
-    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
     WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
     WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
 
+    /*
+     * ARM64_WORKAROUND_AT_SPECULATE: VTTBR_EL2 should be restored after all
+     * registers associated to EL1/EL0 translations regime have been
+     * synchronized.
+     */
+    asm volatile(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_AT_SPECULATE));
+    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
+
     last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
 
     /*
@@ -157,8 +183,23 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
     ovttbr = READ_SYSREG64(VTTBR_EL2);
     if ( ovttbr != p2m->vttbr )
     {
+        uint64_t vttbr;
+
         local_irq_save(flags);
-        WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
+
+        /*
+         * ARM64_WORKAROUND_AT_SPECULATE: We need to stop AT to allocate
+         * TLBs entries because the context is partially modified. We
+         * only need the VMID for flushing the TLBs, so we can generate
+         * a new VTTBR with the VMID to flush and the empty root table.
+         */
+        if ( !cpus_have_const_cap(ARM64_WORKAROUND_AT_SPECULATE) )
+            vttbr = p2m->vttbr;
+        else
+            vttbr = generate_vttbr(p2m->vmid, empty_root_mfn);
+
+        WRITE_SYSREG64(vttbr, VTTBR_EL2);
+
         /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
         isb();
     }
@@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
 static void setup_virt_paging_one(void *data)
 {
     WRITE_SYSREG32(vtcr, VTCR_EL2);
+
+    /*
+     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free from
+     * entries related to EL1/EL0 translation regime until a guest vCPU
+     * is running. For that, we need to set-up VTTBR to point to an empty
+     * page-table and turn on stage-2 translation. The TLB entries
+     * associated with EL1/EL0 translation regime will also be flushed in case
+     * an AT instruction was speculated before hand.
+     */
+    if ( cpus_have_cap(ARM64_WORKAROUND_AT_SPECULATE) )
+    {
+        WRITE_SYSREG64(generate_vttbr(INVALID_VMID, empty_root_mfn), VTTBR_EL2);
+        WRITE_SYSREG(READ_SYSREG(HCR_EL2) | HCR_VM, HCR_EL2);
+        isb();
+
+        flush_tlb_all_local();
+    }
 }
 
 void __init setup_virt_paging(void)
@@ -1587,6 +1645,22 @@ void __init setup_virt_paging(void)
     /* It is not allowed to concatenate a level zero root */
     BUG_ON( P2M_ROOT_LEVEL == 0 && P2M_ROOT_ORDER > 0 );
     vtcr = val;
+
+    /*
+     * ARM64_WORKAROUND_AT_SPECULATE requires to allocate root table
+     * with all entries zeroed.
+     */
+    if ( cpus_have_cap(ARM64_WORKAROUND_AT_SPECULATE) )
+    {
+        struct page_info *root;
+
+        root = p2m_allocate_root();
+        if ( !root )
+            panic("Unable to allocate root table for ARM64_WORKAROUND_AT_SPECULATE\n");
+
+        empty_root_mfn = page_to_mfn(root);
+    }
+
     setup_virt_paging_one(NULL);
     smp_call_function(setup_virt_paging_one, NULL, 1);
 }
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index 17de928467..c2c8f3417c 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -45,8 +45,9 @@
 #define ARM_HARDEN_BRANCH_PREDICTOR 7
 #define ARM_SSBD 8
 #define ARM_SMCCC_1_1 9
+#define ARM64_WORKAROUND_AT_SPECULATE 10
 
-#define ARM_NCAPS           10
+#define ARM_NCAPS           11
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 72ddc42778..d03ec6e272 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -52,6 +52,7 @@
 #define ARM_CPU_PART_CORTEX_A72     0xD08
 #define ARM_CPU_PART_CORTEX_A73     0xD09
 #define ARM_CPU_PART_CORTEX_A75     0xD0A
+#define ARM_CPU_PART_CORTEX_A76     0xD0B
 
 #define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
 #define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -61,6 +62,7 @@
 #define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
 #define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
 #define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75)
+#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
 
 /* MPIDR Multiprocessor Affinity Register */
 #define _MPIDR_UP           (30)
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
                   ` (5 preceding siblings ...)
  2018-11-28 16:49 ` [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522 Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 14:44   ` Andrii Anisov
  2018-11-28 16:49 ` [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code Julien Grall
  7 siblings, 1 reply; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

A lot of the headers are not necessary, so remove them. At the same
time, re-order them alphabetically.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 0facb66096..3a92fd0775 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1,21 +1,13 @@
-#include <xen/sched.h>
-#include <xen/lib.h>
-#include <xen/errno.h>
+#include <xen/cpu.h>
 #include <xen/domain_page.h>
-#include <xen/bitops.h>
-#include <xen/vm_event.h>
-#include <xen/monitor.h>
 #include <xen/iocap.h>
-#include <xen/mem_access.h>
-#include <xen/xmalloc.h>
-#include <xen/cpu.h>
-#include <xen/notifier.h>
-#include <public/vm_event.h>
-#include <asm/flushtlb.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+
+#include <asm/alternative.h>
 #include <asm/event.h>
-#include <asm/hardirq.h>
+#include <asm/flushtlb.h>
 #include <asm/page.h>
-#include <asm/alternative.h>
 
 #define MAX_VMID_8_BIT  (1UL << 8)
 #define MAX_VMID_16_BIT (1UL << 16)
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
  2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
                   ` (6 preceding siblings ...)
  2018-11-28 16:49 ` [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
@ 2018-11-28 16:49 ` Julien Grall
  2018-12-21 14:44   ` Andrii Anisov
  7 siblings, 1 reply; 36+ messages in thread
From: Julien Grall @ 2018-11-28 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: andre.przywara, Julien Grall, sstabellini

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/cpuerrata.c | 10 ++++++++++
 xen/arch/arm/p2m.c       |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 61c64b9816..e7278f2899 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -381,6 +381,11 @@ static bool has_ssbd_mitigation(const struct arm_cpu_capabilities *entry)
 }
 #endif
 
+static bool has_at_speculate(const struct arm_cpu_capabilities *entry)
+{
+    return true;
+}
+
 #define MIDR_RANGE(model, min, max)     \
     .matches = is_affected_midr_range,  \
     .midr_model = model,                \
@@ -495,6 +500,11 @@ static const struct arm_cpu_capabilities arm_errata[] = {
         .capability = ARM64_WORKAROUND_AT_SPECULATE,
         MIDR_RANGE(MIDR_CORTEX_A76, 0, 2 << MIDR_VARIANT_SHIFT),
     },
+    {
+        .desc = "AT speculate",
+        .capability = ARM64_WORKAROUND_AT_SPECULATE,
+        .matches = has_at_speculate,
+    },
     {},
 };
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 3a92fd0775..403bfbfcb6 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -122,6 +122,8 @@ void p2m_restore_state(struct vcpu *n)
     if ( is_idle_vcpu(n) )
         return;
 
+    ASSERT(READ_SYSREG64(VTTBR_EL2) == (generate_vttbr(INVALID_VMID, empty_root_mfn)));
+
     WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
     WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2
  2018-11-28 16:49 ` [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2 Julien Grall
@ 2018-12-21 14:33   ` Andrii Anisov
  2019-01-23 23:50   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:33 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:
> Only {A,F,I}MO are necessary to receive interrupts until a guest vCPU is
> loaded.
> 
> The rest have no effect on Xen and it is better to avoid setting them.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>   xen/arch/arm/traps.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR
  2018-11-28 16:49 ` [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR Julien Grall
@ 2018-12-21 14:34   ` Andrii Anisov
  2019-01-23 23:27   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:34 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:
> A follow-up patch will need to generate the VTTBR in a few places.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>   xen/arch/arm/p2m.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table
  2018-11-28 16:49 ` [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table Julien Grall
@ 2018-12-21 14:35   ` Andrii Anisov
  2019-01-23 23:29   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:
> A follow-up patch will require to allocate the root page-table without
> having a domain in hand.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
  2018-11-28 16:49 ` [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary Julien Grall
@ 2018-12-21 14:36   ` Andrii Anisov
  2018-12-21 14:43   ` Andrii Anisov
  2019-01-23 23:42   ` Stefano Stabellini
  2 siblings, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:36 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:
> The EL1 translation regime is out-of-context when running at EL2. This
> means the processor cannot speculate memory accesses using the registers
> associated to that regime.
> 
> An isb() is only need if Xen is going to use the translation regime
> before returning to the guest (exception returns will synchronized the
> context).
> 
> Remove unecessary isb() and document the ones left.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>


Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
  2018-11-28 16:49 ` [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary Julien Grall
  2018-12-21 14:36   ` Andrii Anisov
@ 2018-12-21 14:43   ` Andrii Anisov
  2018-12-21 14:48     ` Julien Grall
  2019-01-23 23:42   ` Stefano Stabellini
  2 siblings, 1 reply; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:43 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:

>       if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
> +    {
>           flush_tlb_local();
> +    }
BTW, missed mentioning that curly braces above are odd by coding style.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically
  2018-11-28 16:49 ` [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
@ 2018-12-21 14:44   ` Andrii Anisov
  2018-12-21 15:33     ` Andrii Anisov
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:44 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:
> A lot of the headers are not necessary, so remove them. At the same
> time, re-order them alphabetically.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
  2018-11-28 16:49 ` [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code Julien Grall
@ 2018-12-21 14:44   ` Andrii Anisov
  2018-12-21 14:47     ` Andrew Cooper
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:44 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini

I guess this one should not be here.


-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
  2018-12-21 14:44   ` Andrii Anisov
@ 2018-12-21 14:47     ` Andrew Cooper
  2018-12-21 14:55       ` Andrii Anisov
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2018-12-21 14:47 UTC (permalink / raw)
  To: Andrii Anisov, Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini

On 21/12/2018 14:44, Andrii Anisov wrote:
> I guess this one should not be here.

Posting patches like this can be useful for people trying to test the
series.

As such, it is worth posting, but the DO NOT APPLY hint is there for
people to realise that is isn't for inclusion generally.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
  2018-12-21 14:43   ` Andrii Anisov
@ 2018-12-21 14:48     ` Julien Grall
  0 siblings, 0 replies; 36+ messages in thread
From: Julien Grall @ 2018-12-21 14:48 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: andre.przywara, sstabellini



On 21/12/2018 14:43, Andrii Anisov wrote:
> 
> 
> On 28.11.18 18:49, Julien Grall wrote:
> 
>>       if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
>> +    {
>>           flush_tlb_local();
>> +    }
> BTW, missed mentioning that curly braces above are odd by coding style.

It is not odd, just unnecessary. I will drop them.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
  2018-12-21 14:47     ` Andrew Cooper
@ 2018-12-21 14:55       ` Andrii Anisov
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 14:55 UTC (permalink / raw)
  To: Andrew Cooper, Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 21.12.18 16:47, Andrew Cooper wrote:
> On 21/12/2018 14:44, Andrii Anisov wrote:
>> I guess this one should not be here.
> 
> Posting patches like this can be useful for people trying to test the
> series.
> 
> As such, it is worth posting, but the DO NOT APPLY hint is there for
> people to realise that is isn't for inclusion generally.

Thank you for the hint.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data
  2018-11-28 16:49 ` [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data Julien Grall
@ 2018-12-21 15:16   ` Andrii Anisov
  2019-01-24  0:08   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 15:16 UTC (permalink / raw)
  To: xen-devel



On 28.11.18 18:49, Julien Grall wrote:
> Until recently, kernel/initrd/dtb were loaded using guest VA and
> therefore requiring to restore temporarily the P2M. This reworked in a
> series of commits (up to 9292086 "xen/arm: domain_build: Use
> copy_to_guest_phys_flush_dcache in dtb_load") to use a guest PA.
> 
> This will also help a follow-up patch which will require
> p2m_{save,restore}_state to work in pair to workaround an erratum.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically
  2018-12-21 14:44   ` Andrii Anisov
@ 2018-12-21 15:33     ` Andrii Anisov
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 15:33 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini

I've missed that this patch is already merged within a different series.
Also "[Xen-devel] [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522" should be rebased.

On 21.12.18 16:44, Andrii Anisov wrote:
> 
> 
> On 28.11.18 18:49, Julien Grall wrote:
>> A lot of the headers are not necessary, so remove them. At the same
>> time, re-order them alphabetically.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
> 

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2018-11-28 16:49 ` [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522 Julien Grall
@ 2018-12-21 15:47   ` Andrii Anisov
  2019-01-24  0:52   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Andrii Anisov @ 2018-12-21 15:47 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: andre.przywara, sstabellini



On 28.11.18 18:49, Julien Grall wrote:
> Early version of Cortex-A76 can end-up with corrupt TLBs if they
> speculate an AT instruction while the S1/S2 system registers are in an
> inconsistent state.
> 
> This can happen during guest context switch and when invalidating the
> TLBs for other than the current VMID.
> 
> The workaround implemented in Xen will:
>      - Use an empty stage-2 with a reserved VMID while context switching
>      between 2 guests
>      - Use an empty stage-2 with the VMID where TLBs need to be flushed
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR
  2018-11-28 16:49 ` [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR Julien Grall
  2018-12-21 14:34   ` Andrii Anisov
@ 2019-01-23 23:27   ` Stefano Stabellini
  2019-01-24 11:06     ` Julien Grall
  1 sibling, 1 reply; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-23 23:27 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini, andre.przywara

On Wed, 28 Nov 2018, Julien Grall wrote:
> A follow-up patch will need to generate the VTTBR in a few places.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/p2m.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 6c76298ebc..8ebf1e8dba 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -47,6 +47,11 @@ static const paddr_t level_masks[] =
>  static const uint8_t level_orders[] =
>      { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
>  
> +static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
> +{
> +    return (mfn_to_maddr(root_mfn) | ((uint64_t)vmid << 48));

Outer brackets are not necessary. Regardless:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> +}
> +
>  /* Unlock the flush and do a P2M TLB flush if necessary */
>  void p2m_write_unlock(struct p2m_domain *p2m)
>  {
> @@ -1147,7 +1152,7 @@ static int p2m_alloc_table(struct domain *d)
>  
>      p2m->root = page;
>  
> -    p2m->vttbr = page_to_maddr(p2m->root) | ((uint64_t)p2m->vmid << 48);
> +    p2m->vttbr = generate_vttbr(p2m->vmid, page_to_mfn(p2m->root));
>  
>      /*
>       * Make sure that all TLBs corresponding to the new VMID are flushed
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table
  2018-11-28 16:49 ` [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table Julien Grall
  2018-12-21 14:35   ` Andrii Anisov
@ 2019-01-23 23:29   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-23 23:29 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini, andre.przywara

On Wed, 28 Nov 2018, Julien Grall wrote:
> A follow-up patch will require to allocate the root page-table without
> having a domain in hand.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>  xen/arch/arm/p2m.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 8ebf1e8dba..e8bacab9d2 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1136,21 +1136,29 @@ int guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
>      return p2m_remove_mapping(d, gfn, (1 << page_order), mfn);
>  }
>  
> -static int p2m_alloc_table(struct domain *d)
> +static struct page_info *p2m_allocate_root(void)
>  {
> -    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>      struct page_info *page;
>      unsigned int i;
>  
>      page = alloc_domheap_pages(NULL, P2M_ROOT_ORDER, 0);
>      if ( page == NULL )
> -        return -ENOMEM;
> +        return NULL;
>  
>      /* Clear both first level pages */
>      for ( i = 0; i < P2M_ROOT_PAGES; i++ )
>          clear_and_clean_page(page + i);
>  
> -    p2m->root = page;
> +    return page;
> +}
> +
> +static int p2m_alloc_table(struct domain *d)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +    p2m->root = p2m_allocate_root();
> +    if ( !p2m->root )
> +        return -ENOMEM;
>  
>      p2m->vttbr = generate_vttbr(p2m->vmid, page_to_mfn(p2m->root));
>  
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
  2018-11-28 16:49 ` [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary Julien Grall
  2018-12-21 14:36   ` Andrii Anisov
  2018-12-21 14:43   ` Andrii Anisov
@ 2019-01-23 23:42   ` Stefano Stabellini
  2019-01-24 11:29     ` Julien Grall
  2 siblings, 1 reply; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-23 23:42 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini, andre.przywara

On Wed, 28 Nov 2018, Julien Grall wrote:
> The EL1 translation regime is out-of-context when running at EL2. This
> means the processor cannot speculate memory accesses using the registers
> associated to that regime.
> 
> An isb() is only need if Xen is going to use the translation regime
                   ^ needed


> before returning to the guest (exception returns will synchronized the
                                                        ^ synchronize

> context).
> 
> Remove unecessary isb() and document the ones left.
         ^ unnecessary



> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/p2m.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index e8bacab9d2..844833c4c3 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -112,22 +112,28 @@ void p2m_restore_state(struct vcpu *n)
>          return;
>  
>      WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> -    isb();
> -
>      WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
> -    isb();
> -
>      WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
> -    isb();
>  
>      last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
>  
>      /*
> +     * While we are restoring an out-of-context translation regime
> +     * we still need to ensure:
> +     *  - VTTBR_EL2 is synchronized before flushing the TLBs
> +     *  - All registers for EL1 are synchronized before executing an AT
> +     *  instructions targeting S1/S2.
> +     */
> +    isb();

This explanation makes sense to me


> +    /*
>       * Flush local TLB for the domain to prevent wrong TLB translation
>       * when running multiple vCPU of the same domain on a single pCPU.
>       */
>      if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
> +    {
>          flush_tlb_local();
> +    }

Spurious change?


>      *last_vcpu_ran = n->vcpu_id;
>  }
> @@ -153,6 +159,7 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>      {
>          local_irq_save(flags);
>          WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> +        /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
>          isb();
>      }
>  
> @@ -161,6 +168,7 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>      if ( ovttbr != READ_SYSREG64(VTTBR_EL2) )
>      {
>          WRITE_SYSREG64(ovttbr, VTTBR_EL2);
> +        /* Ensure VTTBR_EL2 is back in place before continuing. */
>          isb();
>          local_irq_restore(flags);
>      }
> @@ -1496,7 +1504,6 @@ static uint32_t __read_mostly vtcr;
>  static void setup_virt_paging_one(void *data)
>  {
>      WRITE_SYSREG32(vtcr, VTCR_EL2);
> -    isb();
>  }
>  
>  void __init setup_virt_paging(void)
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2
  2018-11-28 16:49 ` [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2 Julien Grall
  2018-12-21 14:33   ` Andrii Anisov
@ 2019-01-23 23:50   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-23 23:50 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini, andre.przywara

On Wed, 28 Nov 2018, Julien Grall wrote:
> Only {A,F,I}MO are necessary to receive interrupts until a guest vCPU is
> loaded.
> 
> The rest have no effect on Xen and it is better to avoid setting them.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/traps.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 88ffeeb480..1eec966299 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -181,8 +181,12 @@ void init_traps(void)
>      WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
>                   CPTR_EL2);
>  
> -    /* Setup hypervisor traps */
> -    WRITE_SYSREG(get_default_hcr_flags(), HCR_EL2);
> +    /*
> +     * Configure HCR_EL2 with the bareminimum to run Xen until a guest
                                     ^ bare minimum

> +     * is scheduled. {A,I,F}MO bits are set to allow EL2 receiving
> +     * interrupts.
> +     */
> +    WRITE_SYSREG(HCR_AMO | HCR_FMO | HCR_IMO, HCR_EL2);
>      isb();
>  }

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data
  2018-11-28 16:49 ` [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data Julien Grall
  2018-12-21 15:16   ` Andrii Anisov
@ 2019-01-24  0:08   ` Stefano Stabellini
  1 sibling, 0 replies; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-24  0:08 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini, andre.przywara

On Wed, 28 Nov 2018, Julien Grall wrote:
> Until recently, kernel/initrd/dtb were loaded using guest VA and
> therefore requiring to restore temporarily the P2M. This reworked in a
                                                          ^ was

> series of commits (up to 9292086 "xen/arm: domain_build: Use
> copy_to_guest_phys_flush_dcache in dtb_load") to use a guest PA.
> 
> This will also help a follow-up patch which will require
> p2m_{save,restore}_state to work in pair to workaround an erratum.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/domain_build.c | 13 -------------
>  1 file changed, 13 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index b0ec3f0b72..ffbf7c6760 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1920,7 +1920,6 @@ static void __init find_gnttab_region(struct domain *d,
>  
>  static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>  {
> -    struct vcpu *saved_current;
>      int i, cpu;
>      struct vcpu *v = d->vcpu[0];
>      struct cpu_user_regs *regs = &v->arch.cpu_info->guest_cpu_user_regs;
> @@ -1942,14 +1941,6 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>  #endif
>  
>      /*
> -     * The following loads use the domain's p2m and require current to
> -     * be a vcpu of the domain, temporarily switch
> -     */
> -    saved_current = current;
> -    p2m_restore_state(v);
> -    set_current(v);
> -
> -    /*
>       * kernel_load will determine the placement of the kernel as well
>       * as the initrd & fdt in RAM, so call it first.
>       */
> @@ -1958,10 +1949,6 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>      initrd_load(kinfo);
>      dtb_load(kinfo);
>  
> -    /* Now that we are done restore the original p2m and current. */
> -    set_current(saved_current);
> -    p2m_restore_state(saved_current);
> -
>      memset(regs, 0, sizeof(*regs));
>  
>      regs->pc = (register_t)kinfo->entry;

Nice cleanup!

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2018-11-28 16:49 ` [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522 Julien Grall
  2018-12-21 15:47   ` Andrii Anisov
@ 2019-01-24  0:52   ` Stefano Stabellini
  2019-01-24 13:17     ` Julien Grall
  1 sibling, 1 reply; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-24  0:52 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini, andre.przywara

On Wed, 28 Nov 2018, Julien Grall wrote:
> Early version of Cortex-A76 can end-up with corrupt TLBs if they
> speculate an AT instruction while the S1/S2 system registers are in an
> inconsistent state.
> 
> This can happen during guest context switch and when invalidating the
> TLBs for other than the current VMID.
> 
> The workaround implemented in Xen will:
>     - Use an empty stage-2 with a reserved VMID while context switching
>     between 2 guests
>     - Use an empty stage-2 with the VMID where TLBs need to be flushed
>
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Thank you for doing this and for setting up a testing environment. I
have a couple of questions on a couple of changes below.


> ---
>  docs/misc/arm/silicon-errata.txt |  1 +
>  xen/arch/arm/cpuerrata.c         |  6 ++++
>  xen/arch/arm/domain.c            |  8 +++--
>  xen/arch/arm/p2m.c               | 78 ++++++++++++++++++++++++++++++++++++++--
>  xen/include/asm-arm/cpufeature.h |  3 +-
>  xen/include/asm-arm/processor.h  |  2 ++
>  6 files changed, 93 insertions(+), 5 deletions(-)
> 
> diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
> index 906bf5fd48..6cd1366f15 100644
> --- a/docs/misc/arm/silicon-errata.txt
> +++ b/docs/misc/arm/silicon-errata.txt
> @@ -48,4 +48,5 @@ stable hypervisors.
>  | ARM            | Cortex-A57      | #852523         | N/A                     |
>  | ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
>  | ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
> +| ARM            | Cortex-A76      | #1165522        | N/A                     |
>  | ARM            | MMU-500         | #842869         | N/A                     |
> diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
> index adf88e7bdc..61c64b9816 100644
> --- a/xen/arch/arm/cpuerrata.c
> +++ b/xen/arch/arm/cpuerrata.c
> @@ -489,6 +489,12 @@ static const struct arm_cpu_capabilities arm_errata[] = {
>          .matches = has_ssbd_mitigation,
>      },
>  #endif
> +    {
> +        /* Cortex-A76 r0p0 - r2p0 */
> +        .desc = "ARM erratum 116522",
> +        .capability = ARM64_WORKAROUND_AT_SPECULATE,
> +        MIDR_RANGE(MIDR_CORTEX_A76, 0, 2 << MIDR_VARIANT_SHIFT),
> +    },
>      {},
>  };
>  
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 1d926dcb29..3180edd89d 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -181,8 +181,6 @@ static void ctxt_switch_to(struct vcpu *n)
>      if ( is_idle_vcpu(n) )
>          return;
>  
> -    p2m_restore_state(n);
> -
>      vpidr = READ_SYSREG32(MIDR_EL1);
>      WRITE_SYSREG32(vpidr, VPIDR_EL2);
>      WRITE_SYSREG(n->arch.vmpidr, VMPIDR_EL2);
> @@ -235,6 +233,12 @@ static void ctxt_switch_to(struct vcpu *n)
>  #endif
>      isb();
>  
> +    /*
> +     * ARM64_WORKAROUND_AT_SPECULATE: The P2M should be restored after
> +     * the stage-1 MMU sysregs have been restored.
> +     */
> +    p2m_restore_state(n);
> +
>      /* Control Registers */
>      WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>  
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 844833c4c3..0facb66096 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -15,6 +15,7 @@
>  #include <asm/event.h>
>  #include <asm/hardirq.h>
>  #include <asm/page.h>
> +#include <asm/alternative.h>
>  
>  #define MAX_VMID_8_BIT  (1UL << 8)
>  #define MAX_VMID_16_BIT (1UL << 16)
> @@ -47,6 +48,8 @@ static const paddr_t level_masks[] =
>  static const uint8_t level_orders[] =
>      { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
>  
> +static mfn_t __read_mostly empty_root_mfn;
> +
>  static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
>  {
>      return (mfn_to_maddr(root_mfn) | ((uint64_t)vmid << 48));
> @@ -98,9 +101,25 @@ void dump_p2m_lookup(struct domain *d, paddr_t addr)
>                   P2M_ROOT_LEVEL, P2M_ROOT_PAGES);
>  }
>  
> +/*
> + * p2m_save_state and p2m_restore_state works in pair to workaround
                                           ^ work


> + * ARM64_WORKAROUND_AT_SPECULATE. p2m_save_state will set-up VTTBR to
> + * point to the empty page-tables to stop allocating TLB entries.
> + */
>  void p2m_save_state(struct vcpu *p)
>  {
>      p->arch.sctlr = READ_SYSREG(SCTLR_EL1);
> +
> +    if ( cpus_have_const_cap(ARM64_WORKAROUND_AT_SPECULATE) )
> +    {
> +        WRITE_SYSREG64(generate_vttbr(INVALID_VMID, empty_root_mfn), VTTBR_EL2);
> +        /*
> +         * Ensure VTTBR_EL2 is correctly synchronized so we can restore
> +         * the next vCPU context without worrying about AT instruction
> +         * speculation.
> +         */
> +        isb();
> +    }
>  }

OK


>  void p2m_restore_state(struct vcpu *n)
> @@ -111,10 +130,17 @@ void p2m_restore_state(struct vcpu *n)
>      if ( is_idle_vcpu(n) )
>          return;
>  
> -    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
>      WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
>      WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
>  
> +    /*
> +     * ARM64_WORKAROUND_AT_SPECULATE: VTTBR_EL2 should be restored after all
> +     * registers associated to EL1/EL0 translations regime have been
> +     * synchronized.
> +     */
> +    asm volatile(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_AT_SPECULATE));

Obviously you have done a lot more thinking about this than me, but
I don't fully understand the need for this barrier: this is not about
ARM64_WORKAROUND_AT_SPECULATE per se, right? Shouldn't the CPU be able
to figure out the right execution speculation path given that the
instructions ordering is correct? I guess it depends on the nature of
the hardware bug.


> +    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> +
>      last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
>  
>      /*
> @@ -157,8 +183,23 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>      ovttbr = READ_SYSREG64(VTTBR_EL2);
>      if ( ovttbr != p2m->vttbr )
>      {
> +        uint64_t vttbr;
> +
>          local_irq_save(flags);
> -        WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> +
> +        /*
> +         * ARM64_WORKAROUND_AT_SPECULATE: We need to stop AT to allocate
> +         * TLBs entries because the context is partially modified. We
> +         * only need the VMID for flushing the TLBs, so we can generate
> +         * a new VTTBR with the VMID to flush and the empty root table.
> +         */
> +        if ( !cpus_have_const_cap(ARM64_WORKAROUND_AT_SPECULATE) )
> +            vttbr = p2m->vttbr;
> +        else
> +            vttbr = generate_vttbr(p2m->vmid, empty_root_mfn);
> +
> +        WRITE_SYSREG64(vttbr, VTTBR_EL2);

Good idea, any reasons not to use generate_vttbr(p2m->vmid,
empty_root_mfn) in the general case? There should be no downsides,
right?


>          /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
>          isb();
>      }
> @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
>  static void setup_virt_paging_one(void *data)
>  {
>      WRITE_SYSREG32(vtcr, VTCR_EL2);
> +
> +    /*
> +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free from
> +     * entries related to EL1/EL0 translation regime until a guest vCPU
> +     * is running. For that, we need to set-up VTTBR to point to an empty
> +     * page-table and turn on stage-2 translation.

I don't understand why this is needed: isn't the lack of HCR_VM (due to
your previous patch) supposed to be sufficient? How can there be
speculation without HCR_VM?

Even if speculation happens without HCR_EL2, why do we need to set it
now? Isn't setting empty_root_mfn enough?


>         The TLB entries
> +     * associated with EL1/EL0 translation regime will also be flushed in case
> +     * an AT instruction was speculated before hand.
> +     */
> +    if ( cpus_have_cap(ARM64_WORKAROUND_AT_SPECULATE) )
> +    {
> +        WRITE_SYSREG64(generate_vttbr(INVALID_VMID, empty_root_mfn), VTTBR_EL2);
> +        WRITE_SYSREG(READ_SYSREG(HCR_EL2) | HCR_VM, HCR_EL2);
> +        isb();
> +
> +        flush_tlb_all_local();
> +    }
>  }
>  
>  void __init setup_virt_paging(void)
> @@ -1587,6 +1645,22 @@ void __init setup_virt_paging(void)
>      /* It is not allowed to concatenate a level zero root */
>      BUG_ON( P2M_ROOT_LEVEL == 0 && P2M_ROOT_ORDER > 0 );
>      vtcr = val;
> +
> +    /*
> +     * ARM64_WORKAROUND_AT_SPECULATE requires to allocate root table
> +     * with all entries zeroed.
> +     */
> +    if ( cpus_have_cap(ARM64_WORKAROUND_AT_SPECULATE) )
> +    {
> +        struct page_info *root;
> +
> +        root = p2m_allocate_root();
> +        if ( !root )
> +            panic("Unable to allocate root table for ARM64_WORKAROUND_AT_SPECULATE\n");
> +
> +        empty_root_mfn = page_to_mfn(root);
> +    }

OK


>      setup_virt_paging_one(NULL);
>      smp_call_function(setup_virt_paging_one, NULL, 1);
>  }
> diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
> index 17de928467..c2c8f3417c 100644
> --- a/xen/include/asm-arm/cpufeature.h
> +++ b/xen/include/asm-arm/cpufeature.h
> @@ -45,8 +45,9 @@
>  #define ARM_HARDEN_BRANCH_PREDICTOR 7
>  #define ARM_SSBD 8
>  #define ARM_SMCCC_1_1 9
> +#define ARM64_WORKAROUND_AT_SPECULATE 10
>  
> -#define ARM_NCAPS           10
> +#define ARM_NCAPS           11
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 72ddc42778..d03ec6e272 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -52,6 +52,7 @@
>  #define ARM_CPU_PART_CORTEX_A72     0xD08
>  #define ARM_CPU_PART_CORTEX_A73     0xD09
>  #define ARM_CPU_PART_CORTEX_A75     0xD0A
> +#define ARM_CPU_PART_CORTEX_A76     0xD0B
>  
>  #define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
>  #define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
> @@ -61,6 +62,7 @@
>  #define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
>  #define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
>  #define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75)
> +#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
>  
>  /* MPIDR Multiprocessor Affinity Register */
>  #define _MPIDR_UP           (30)
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR
  2019-01-23 23:27   ` Stefano Stabellini
@ 2019-01-24 11:06     ` Julien Grall
  0 siblings, 0 replies; 36+ messages in thread
From: Julien Grall @ 2019-01-24 11:06 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, andre.przywara

Hi Stefano,

On 23/01/2019 23:27, Stefano Stabellini wrote:
> On Wed, 28 Nov 2018, Julien Grall wrote:
>> A follow-up patch will need to generate the VTTBR in a few places.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   xen/arch/arm/p2m.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index 6c76298ebc..8ebf1e8dba 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -47,6 +47,11 @@ static const paddr_t level_masks[] =
>>   static const uint8_t level_orders[] =
>>       { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
>>   
>> +static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
>> +{
>> +    return (mfn_to_maddr(root_mfn) | ((uint64_t)vmid << 48));
> 
> Outer brackets are not necessary. Regardless:

I would prefer to keep them here.

> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you!

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
  2019-01-23 23:42   ` Stefano Stabellini
@ 2019-01-24 11:29     ` Julien Grall
  0 siblings, 0 replies; 36+ messages in thread
From: Julien Grall @ 2019-01-24 11:29 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, andre.przywara

Hi Stefano,

On 23/01/2019 23:42, Stefano Stabellini wrote:
> On Wed, 28 Nov 2018, Julien Grall wrote:
>> +    /*
>>        * Flush local TLB for the domain to prevent wrong TLB translation
>>        * when running multiple vCPU of the same domain on a single pCPU.
>>        */
>>       if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
>> +    {
>>           flush_tlb_local();
>> +    }
> 
> Spurious change?

Yes, this was a left-over from a previous try. I will drop them.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2019-01-24  0:52   ` Stefano Stabellini
@ 2019-01-24 13:17     ` Julien Grall
  2019-01-25 21:36       ` Stefano Stabellini
  0 siblings, 1 reply; 36+ messages in thread
From: Julien Grall @ 2019-01-24 13:17 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, James Morse, andre.przywara

(+ James)

Hi Stefano,

@James, please correct me if I am wrong below :).

On 24/01/2019 00:52, Stefano Stabellini wrote:
> On Wed, 28 Nov 2018, Julien Grall wrote:
>>   void p2m_restore_state(struct vcpu *n)
>> @@ -111,10 +130,17 @@ void p2m_restore_state(struct vcpu *n)
>>       if ( is_idle_vcpu(n) )
>>           return;
>>   
>> -    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
>>       WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
>>       WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
>>   
>> +    /*
>> +     * ARM64_WORKAROUND_AT_SPECULATE: VTTBR_EL2 should be restored after all
>> +     * registers associated to EL1/EL0 translations regime have been
>> +     * synchronized.
>> +     */
>> +    asm volatile(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_AT_SPECULATE));
> 
> Obviously you have done a lot more thinking about this than me, but
> I don't fully understand the need for this barrier: this is not about
> ARM64_WORKAROUND_AT_SPECULATE per se, right?

This particular isb() is about ARM64_WORKAROUND_AT_SPECULATE.

> Shouldn't the CPU be able
> to figure out the right execution speculation path given that the
> instructions ordering is correct?

What instructions ordering? Writing a system registers can be re-ordered and you 
need an isb() to ensure the full synchronization before executing at AT 
instruction or flushing the TLBs.

In hardware without the erratum, the registers associated with EL1 translation 
are out-of-context and the AT cannot speculate. The isb() added by patch #5 
ensure the context is synchronized so an AT afterwards would use a consistent 
context. Now...

> I guess it depends on the nature of
> the hardware bug.

... in the context of the errata, you have to imagine what can happen if an AT 
instruction is inserted (via speculation) between each instruction and what 
happen if the system registers are re-ordered.

The key of the erratum is VTTBR_EL2. This is what will stop a speculated AT 
instruction to allocate a TLBs entry because you are not allowed to cache a 
translation that will fault. Without the isb() here, the VTTBR_EL2 may be 
synchronized before the rest of the context, so a speculated AT instruction may 
use an inconsistent state and allocate a TLB entry with an unexpected 
translation against the guest.

So here, we want to ensure the rest of the context is synchronized before 
writing to VTTBR_EL2, hence the isb().

> 
> 
>> +    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
>> +
>>       last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
>>   
>>       /*
>> @@ -157,8 +183,23 @@ static void p2m_force_tlb_flush_sync(struct p2m_domain *p2m)
>>       ovttbr = READ_SYSREG64(VTTBR_EL2);
>>       if ( ovttbr != p2m->vttbr )
>>       {
>> +        uint64_t vttbr;
>> +
>>           local_irq_save(flags);
>> -        WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
>> +
>> +        /*
>> +         * ARM64_WORKAROUND_AT_SPECULATE: We need to stop AT to allocate
>> +         * TLBs entries because the context is partially modified. We
>> +         * only need the VMID for flushing the TLBs, so we can generate
>> +         * a new VTTBR with the VMID to flush and the empty root table.
>> +         */
>> +        if ( !cpus_have_const_cap(ARM64_WORKAROUND_AT_SPECULATE) )
>> +            vttbr = p2m->vttbr;
>> +        else
>> +            vttbr = generate_vttbr(p2m->vmid, empty_root_mfn);
>> +
>> +        WRITE_SYSREG64(vttbr, VTTBR_EL2);
> 
> Good idea, any reasons not to use generate_vttbr(p2m->vmid,
> empty_root_mfn) in the general case? There should be no downsides,
> right?
An empty root means you need to have the root page-tables allocated. In the 
configuration we currently support, the could be either 4K or 8K.

Even if this seems small, this is a downside for platforms that don't require 
such trick.

> 
> 
>>           /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
>>           isb();
>>       }
>> @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
>>   static void setup_virt_paging_one(void *data)
>>   {
>>       WRITE_SYSREG32(vtcr, VTCR_EL2);
>> +
>> +    /*
>> +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free from
>> +     * entries related to EL1/EL0 translation regime until a guest vCPU
>> +     * is running. For that, we need to set-up VTTBR to point to an empty
>> +     * page-table and turn on stage-2 translation.
> 
> I don't understand why this is needed: isn't the lack of HCR_VM (due to
> your previous patch) supposed to be sufficient? How can there be
> speculation without HCR_VM?

HCR_EL2.VM unsets means the stage-2 will not be used for the EL1/EL0 translation 
regime. In the context of the erratum, the AT can still speculate except it will 
not take into account the stage-2. The dependencies on VMID stills applies when 
HCR_EL2.VM is unset, so from my understanding, the entry could get cached to 
whatever is VTTBR_EL2.VMID.

> 
> Even if speculation happens without HCR_EL2, why do we need to set it
> now? Isn't setting empty_root_mfn enough?

The main goal here is to have the TLBs in a known state after the CPU has been 
initialized. After the sequence below, we are sure that the TLBs don't contain 
entries associated to the EL1/EL0 regime and and a speculated AT instruction 
will not be able to allocate more.

Without HCR_EL2.VM set, the stage-2 page-table will not get used. So a 
speculated AT instruction could still allocate an entry in TLB. It is not a 
major issue as it would be against INVALID_VMID, yet it is not a very sane 
situation for the hypervisor.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2019-01-24 13:17     ` Julien Grall
@ 2019-01-25 21:36       ` Stefano Stabellini
  2019-01-27  9:55         ` Julien Grall
  0 siblings, 1 reply; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-25 21:36 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, James Morse, andre.przywara

On Thu, 24 Jan 2019, Julien Grall wrote:
> (+ James)
> 
> Hi Stefano,
> 
> @James, please correct me if I am wrong below :).
> 
> On 24/01/2019 00:52, Stefano Stabellini wrote:
> > On Wed, 28 Nov 2018, Julien Grall wrote:
> > >   void p2m_restore_state(struct vcpu *n)
> > > @@ -111,10 +130,17 @@ void p2m_restore_state(struct vcpu *n)
> > >       if ( is_idle_vcpu(n) )
> > >           return;
> > >   -    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> > >       WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
> > >       WRITE_SYSREG(n->arch.hcr_el2, HCR_EL2);
> > >   +    /*
> > > +     * ARM64_WORKAROUND_AT_SPECULATE: VTTBR_EL2 should be restored after
> > > all
> > > +     * registers associated to EL1/EL0 translations regime have been
> > > +     * synchronized.
> > > +     */
> > > +    asm volatile(ALTERNATIVE("nop", "isb",
> > > ARM64_WORKAROUND_AT_SPECULATE));
> > 
> > Obviously you have done a lot more thinking about this than me, but
> > I don't fully understand the need for this barrier: this is not about
> > ARM64_WORKAROUND_AT_SPECULATE per se, right?
> 
> This particular isb() is about ARM64_WORKAROUND_AT_SPECULATE.
> 
> > Shouldn't the CPU be able
> > to figure out the right execution speculation path given that the
> > instructions ordering is correct?
> 
> What instructions ordering? Writing a system registers can be re-ordered and
> you need an isb() to ensure the full synchronization before executing at AT
> instruction or flushing the TLBs.
> 
> In hardware without the erratum, the registers associated with EL1 translation
> are out-of-context and the AT cannot speculate. The isb() added by patch #5
> ensure the context is synchronized so an AT afterwards would use a consistent
> context. Now...
> 
> > I guess it depends on the nature of
> > the hardware bug.
> 
> ... in the context of the errata, you have to imagine what can happen if an AT
> instruction is inserted (via speculation) between each instruction and what
> happen if the system registers are re-ordered.
> 
> The key of the erratum is VTTBR_EL2. This is what will stop a speculated AT
> instruction to allocate a TLBs entry because you are not allowed to cache a
> translation that will fault. Without the isb() here, the VTTBR_EL2 may be
> synchronized before the rest of the context, so a speculated AT instruction
> may use an inconsistent state and allocate a TLB entry with an unexpected
> translation against the guest.
> 
> So here, we want to ensure the rest of the context is synchronized before
> writing to VTTBR_EL2, hence the isb().

OK. I understand the explanation, thank you.

I just thought that the CPU would be smart enough to only reorder system
registers writes when appropriate, especially when the CPU is also doing
speculation at the same time. Why would it speculate if it knows that it
is reordering sysreg writes that can badly affect the speculation
itself? Let me say that it doesn't sound like a "sane" behavior to me.
But if it behaves this way, it behaves this way...


> > 
> > > +    WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> > > +
> > >       last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
> > >         /*
> > > @@ -157,8 +183,23 @@ static void p2m_force_tlb_flush_sync(struct
> > > p2m_domain *p2m)
> > >       ovttbr = READ_SYSREG64(VTTBR_EL2);
> > >       if ( ovttbr != p2m->vttbr )
> > >       {
> > > +        uint64_t vttbr;
> > > +
> > >           local_irq_save(flags);
> > > -        WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> > > +
> > > +        /*
> > > +         * ARM64_WORKAROUND_AT_SPECULATE: We need to stop AT to allocate
> > > +         * TLBs entries because the context is partially modified. We
> > > +         * only need the VMID for flushing the TLBs, so we can generate
> > > +         * a new VTTBR with the VMID to flush and the empty root table.
> > > +         */
> > > +        if ( !cpus_have_const_cap(ARM64_WORKAROUND_AT_SPECULATE) )
> > > +            vttbr = p2m->vttbr;
> > > +        else
> > > +            vttbr = generate_vttbr(p2m->vmid, empty_root_mfn);
> > > +
> > > +        WRITE_SYSREG64(vttbr, VTTBR_EL2);
> > 
> > Good idea, any reasons not to use generate_vttbr(p2m->vmid,
> > empty_root_mfn) in the general case? There should be no downsides,
> > right?
> An empty root means you need to have the root page-tables allocated. In the
> configuration we currently support, the could be either 4K or 8K.
> 
> Even if this seems small, this is a downside for platforms that don't require
> such trick.

Good point, I was fooled by empty_root_mfn being available to all as a
variable. Let's not go down that route then.


> > 
> > >           /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
> > >           isb();
> > >       }
> > > @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
> > >   static void setup_virt_paging_one(void *data)
> > >   {
> > >       WRITE_SYSREG32(vtcr, VTCR_EL2);
> > > +
> > > +    /*
> > > +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free from
> > > +     * entries related to EL1/EL0 translation regime until a guest vCPU
> > > +     * is running. For that, we need to set-up VTTBR to point to an empty
> > > +     * page-table and turn on stage-2 translation.
> > 
> > I don't understand why this is needed: isn't the lack of HCR_VM (due to
> > your previous patch) supposed to be sufficient? How can there be
> > speculation without HCR_VM?
> 
> HCR_EL2.VM unsets means the stage-2 will not be used for the EL1/EL0
> translation regime. In the context of the erratum, the AT can still speculate
> except it will not take into account the stage-2. The dependencies on VMID
> stills applies when HCR_EL2.VM is unset, so from my understanding, the entry
> could get cached to whatever is VTTBR_EL2.VMID.

Damn! Even if at this point of the boot sequence there is no EL1 / EL0
at all? How can that speculation happen? Shouldn't the first EL1 / EL0
speculation occur after the first leave_hypervisor_tail?


> > Even if speculation happens without HCR_EL2, why do we need to set it
> > now? Isn't setting empty_root_mfn enough?
> 
> The main goal here is to have the TLBs in a known state after the CPU has been
> initialized. After the sequence below, we are sure that the TLBs don't contain
> entries associated to the EL1/EL0 regime and and a speculated AT instruction
> will not be able to allocate more.
> 
> Without HCR_EL2.VM set, the stage-2 page-table will not get used. So a
> speculated AT instruction could still allocate an entry in TLB. It is not a
> major issue as it would be against INVALID_VMID, yet it is not a very sane
> situation for the hypervisor.

I have a question on the tlb flush.  Do we need it because the tlb is
not guaranteed to be clean after boot?

Also, do we need a flush_tlb_all_local()? Would flush_tlb_local be
enough, maybe executed immediately before switching VTTBR_EL2? I guess
it depends on whether the speculation happens on the local VMID only.
If it only speculate on the local VMID, then flush_tlb_all_local()
should suffice?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2019-01-25 21:36       ` Stefano Stabellini
@ 2019-01-27  9:55         ` Julien Grall
  2019-01-28 11:11           ` Julien Grall
  0 siblings, 1 reply; 36+ messages in thread
From: Julien Grall @ 2019-01-27  9:55 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, James Morse, andre.przywara

Hi,

On 1/25/19 9:36 PM, Stefano Stabellini wrote:
> On Thu, 24 Jan 2019, Julien Grall wrote:
>> @James, please correct me if I am wrong below :).
>>
>> On 24/01/2019 00:52, Stefano Stabellini wrote:
>>> On Wed, 28 Nov 2018, Julien Grall wrote:
>> ... in the context of the errata, you have to imagine what can happen if an AT
>> instruction is inserted (via speculation) between each instruction and what
>> happen if the system registers are re-ordered.
>>
>> The key of the erratum is VTTBR_EL2. This is what will stop a speculated AT
>> instruction to allocate a TLBs entry because you are not allowed to cache a
>> translation that will fault. Without the isb() here, the VTTBR_EL2 may be
>> synchronized before the rest of the context, so a speculated AT instruction
>> may use an inconsistent state and allocate a TLB entry with an unexpected
>> translation against the guest.
>>
>> So here, we want to ensure the rest of the context is synchronized before
>> writing to VTTBR_EL2, hence the isb().
> 
> OK. I understand the explanation, thank you.
> 
> I just thought that the CPU would be smart enough to only reorder system
> registers writes when appropriate, especially when the CPU is also doing
> speculation at the same time. Why would it speculate if it knows that it
> is reordering sysreg writes that can badly affect the speculation
> itself? Let me say that it doesn't sound like a "sane" behavior to me.
> But if it behaves this way, it behaves this way...

I hope you are aware we are speaking about an erratum here... Not what 
the Arm Arm allows.

Aside the erratum, a processor is allowed to do whatever it wants if it 
is within the Arm Arm. These registers are described as out-of-context 
and should not be used by speculation in EL2. If you want to use them in 
EL2, you need an isb() before any instruction in EL2 using them 
otherwise you may use an inconsistent context. This is giving enough 
freedom to the processor while the impact in the software is minimal.

[...]

>>>
>>>>            /* Ensure VTTBR_EL2 is synchronized before flushing the TLBs */
>>>>            isb();
>>>>        }
>>>> @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
>>>>    static void setup_virt_paging_one(void *data)
>>>>    {
>>>>        WRITE_SYSREG32(vtcr, VTCR_EL2);
>>>> +
>>>> +    /*
>>>> +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free from
>>>> +     * entries related to EL1/EL0 translation regime until a guest vCPU
>>>> +     * is running. For that, we need to set-up VTTBR to point to an empty
>>>> +     * page-table and turn on stage-2 translation.
>>>
>>> I don't understand why this is needed: isn't the lack of HCR_VM (due to
>>> your previous patch) supposed to be sufficient? How can there be
>>> speculation without HCR_VM?
>>
>> HCR_EL2.VM unsets means the stage-2 will not be used for the EL1/EL0
>> translation regime. In the context of the erratum, the AT can still speculate
>> except it will not take into account the stage-2. The dependencies on VMID
>> stills applies when HCR_EL2.VM is unset, so from my understanding, the entry
>> could get cached to whatever is VTTBR_EL2.VMID.
> 
> Damn! Even if at this point of the boot sequence there is no EL1 / EL0
> at all? How can that speculation happen? Shouldn't the first EL1 / EL0
> speculation occur after the first leave_hypervisor_tail?

How do you know EL1 was not run before hand? Imagine we did a soft 
reboot or kexec Xen...

But the speculation in that context is may be because the processor 
noticed an AT instruction targeting EL1 in the stream.

> 
> 
>>> Even if speculation happens without HCR_EL2, why do we need to set it
>>> now? Isn't setting empty_root_mfn enough?
>>
>> The main goal here is to have the TLBs in a known state after the CPU has been
>> initialized. After the sequence below, we are sure that the TLBs don't contain
>> entries associated to the EL1/EL0 regime and and a speculated AT instruction
>> will not be able to allocate more.
>>
>> Without HCR_EL2.VM set, the stage-2 page-table will not get used. So a
>> speculated AT instruction could still allocate an entry in TLB. It is not a
>> major issue as it would be against INVALID_VMID, yet it is not a very sane
>> situation for the hypervisor.
> 
> I have a question on the tlb flush.  Do we need it because the tlb is
> not guaranteed to be clean after boot?

You don't know the state of the TLBs after boot.

> 
> Also, do we need a flush_tlb_all_local()? Would flush_tlb_local be
> enough, maybe executed immediately before switching VTTBR_EL2? I guess
> it depends on whether the speculation happens on the local VMID only.

Speculation can only happen using system registers. So only on the local 
VMID only.

> If it only speculate on the local VMID, then flush_tlb_all_local()
> should suffice?

We have two VMIDs in play here: whatever was the value in VTTBR_EL2.VMID 
before the function and INVALID_VMID. We would need to flush the former 
and this would require empty root trick because speculation can happen 
as soon as flush ended.

But then, you rely on Xen to only use a single VMID at boot. While this 
is the case today, I can't tell if it will be in the future.

So the flush_tlb_local is the safest.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2019-01-27  9:55         ` Julien Grall
@ 2019-01-28 11:11           ` Julien Grall
  2019-01-29  0:52             ` Stefano Stabellini
  0 siblings, 1 reply; 36+ messages in thread
From: Julien Grall @ 2019-01-28 11:11 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, James Morse, andre.przywara



On 1/27/19 9:55 AM, Julien Grall wrote:
> Hi,
> 
> On 1/25/19 9:36 PM, Stefano Stabellini wrote:
>> On Thu, 24 Jan 2019, Julien Grall wrote:
>>> @James, please correct me if I am wrong below :).
>>>
>>> On 24/01/2019 00:52, Stefano Stabellini wrote:
>>>> On Wed, 28 Nov 2018, Julien Grall wrote:
>>> ... in the context of the errata, you have to imagine what can happen 
>>> if an AT
>>> instruction is inserted (via speculation) between each instruction 
>>> and what
>>> happen if the system registers are re-ordered.
>>>
>>> The key of the erratum is VTTBR_EL2. This is what will stop a 
>>> speculated AT
>>> instruction to allocate a TLBs entry because you are not allowed to 
>>> cache a
>>> translation that will fault. Without the isb() here, the VTTBR_EL2 
>>> may be
>>> synchronized before the rest of the context, so a speculated AT 
>>> instruction
>>> may use an inconsistent state and allocate a TLB entry with an 
>>> unexpected
>>> translation against the guest.
>>>
>>> So here, we want to ensure the rest of the context is synchronized 
>>> before
>>> writing to VTTBR_EL2, hence the isb().
>>
>> OK. I understand the explanation, thank you.
>>
>> I just thought that the CPU would be smart enough to only reorder system
>> registers writes when appropriate, especially when the CPU is also doing
>> speculation at the same time. Why would it speculate if it knows that it
>> is reordering sysreg writes that can badly affect the speculation
>> itself? Let me say that it doesn't sound like a "sane" behavior to me.
>> But if it behaves this way, it behaves this way...
> 
> I hope you are aware we are speaking about an erratum here... Not what 
> the Arm Arm allows.
> 
> Aside the erratum, a processor is allowed to do whatever it wants if it 
> is within the Arm Arm. These registers are described as out-of-context 
> and should not be used by speculation in EL2. If you want to use them in 
> EL2, you need an isb() before any instruction in EL2 using them 
> otherwise you may use an inconsistent context. This is giving enough 
> freedom to the processor while the impact in the software is minimal.
> 
> [...]
> 
>>>>
>>>>>            /* Ensure VTTBR_EL2 is synchronized before flushing the 
>>>>> TLBs */
>>>>>            isb();
>>>>>        }
>>>>> @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
>>>>>    static void setup_virt_paging_one(void *data)
>>>>>    {
>>>>>        WRITE_SYSREG32(vtcr, VTCR_EL2);
>>>>> +
>>>>> +    /*
>>>>> +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs 
>>>>> free from
>>>>> +     * entries related to EL1/EL0 translation regime until a guest 
>>>>> vCPU
>>>>> +     * is running. For that, we need to set-up VTTBR to point to 
>>>>> an empty
>>>>> +     * page-table and turn on stage-2 translation.
>>>>
>>>> I don't understand why this is needed: isn't the lack of HCR_VM (due to
>>>> your previous patch) supposed to be sufficient? How can there be
>>>> speculation without HCR_VM?
>>>
>>> HCR_EL2.VM unsets means the stage-2 will not be used for the EL1/EL0
>>> translation regime. In the context of the erratum, the AT can still 
>>> speculate
>>> except it will not take into account the stage-2. The dependencies on 
>>> VMID
>>> stills applies when HCR_EL2.VM is unset, so from my understanding, 
>>> the entry
>>> could get cached to whatever is VTTBR_EL2.VMID.
>>
>> Damn! Even if at this point of the boot sequence there is no EL1 / EL0
>> at all? How can that speculation happen? Shouldn't the first EL1 / EL0
>> speculation occur after the first leave_hypervisor_tail?
> 
> How do you know EL1 was not run before hand? Imagine we did a soft 
> reboot or kexec Xen...
> 
> But the speculation in that context is may be because the processor 
> noticed an AT instruction targeting EL1 in the stream.
> 
>>
>>
>>>> Even if speculation happens without HCR_EL2, why do we need to set it
>>>> now? Isn't setting empty_root_mfn enough?
>>>
>>> The main goal here is to have the TLBs in a known state after the CPU 
>>> has been
>>> initialized. After the sequence below, we are sure that the TLBs 
>>> don't contain
>>> entries associated to the EL1/EL0 regime and and a speculated AT 
>>> instruction
>>> will not be able to allocate more.
>>>
>>> Without HCR_EL2.VM set, the stage-2 page-table will not get used. So a
>>> speculated AT instruction could still allocate an entry in TLB. It is 
>>> not a
>>> major issue as it would be against INVALID_VMID, yet it is not a very 
>>> sane
>>> situation for the hypervisor.
>>
>> I have a question on the tlb flush.  Do we need it because the tlb is
>> not guaranteed to be clean after boot?
> 
> You don't know the state of the TLBs after boot.
> 
>>
>> Also, do we need a flush_tlb_all_local()? Would flush_tlb_local be
>> enough, maybe executed immediately before switching VTTBR_EL2? I guess
>> it depends on whether the speculation happens on the local VMID only.
> 
> Speculation can only happen using system registers. So only on the local 
> VMID only.
> 
>> If it only speculate on the local VMID, then flush_tlb_all_local()
>> should suffice?
> 
> We have two VMIDs in play here: whatever was the value in VTTBR_EL2.VMID 
> before the function and INVALID_VMID. We would need to flush the former 
> and this would require empty root trick because speculation can happen 
> as soon as flush ended.
> 
> But then, you rely on Xen to only use a single VMID at boot. While this 
> is the case today, I can't tell if it will be in the future.
> 
> So the flush_tlb_local is the safest.

Hmmm, I meant flush_tlb_all_local here.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2019-01-28 11:11           ` Julien Grall
@ 2019-01-29  0:52             ` Stefano Stabellini
  2019-01-29 10:37               ` Julien Grall
  0 siblings, 1 reply; 36+ messages in thread
From: Stefano Stabellini @ 2019-01-29  0:52 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, James Morse, andre.przywara

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6881 bytes --]

On Mon, 28 Jan 2019, Julien Grall wrote:
> On 1/27/19 9:55 AM, Julien Grall wrote:
> > Hi,
> > 
> > On 1/25/19 9:36 PM, Stefano Stabellini wrote:
> > > On Thu, 24 Jan 2019, Julien Grall wrote:
> > > > @James, please correct me if I am wrong below :).
> > > > 
> > > > On 24/01/2019 00:52, Stefano Stabellini wrote:
> > > > > On Wed, 28 Nov 2018, Julien Grall wrote:
> > > > ... in the context of the errata, you have to imagine what can happen if
> > > > an AT
> > > > instruction is inserted (via speculation) between each instruction and
> > > > what
> > > > happen if the system registers are re-ordered.
> > > > 
> > > > The key of the erratum is VTTBR_EL2. This is what will stop a speculated
> > > > AT
> > > > instruction to allocate a TLBs entry because you are not allowed to
> > > > cache a
> > > > translation that will fault. Without the isb() here, the VTTBR_EL2 may
> > > > be
> > > > synchronized before the rest of the context, so a speculated AT
> > > > instruction
> > > > may use an inconsistent state and allocate a TLB entry with an
> > > > unexpected
> > > > translation against the guest.
> > > > 
> > > > So here, we want to ensure the rest of the context is synchronized
> > > > before
> > > > writing to VTTBR_EL2, hence the isb().
> > > 
> > > OK. I understand the explanation, thank you.
> > > 
> > > I just thought that the CPU would be smart enough to only reorder system
> > > registers writes when appropriate, especially when the CPU is also doing
> > > speculation at the same time. Why would it speculate if it knows that it
> > > is reordering sysreg writes that can badly affect the speculation
> > > itself? Let me say that it doesn't sound like a "sane" behavior to me.
> > > But if it behaves this way, it behaves this way...
> > 
> > I hope you are aware we are speaking about an erratum here... Not what the
> > Arm Arm allows.

I know -- we are talking about a specific CPU implementation. That is
why it seems strange to me that a CPU would reorder things that it
should know they cause trouble to speculation. Anyway, no point in
discussing hardware design choices at this stage.


> > Aside the erratum, a processor is allowed to do whatever it wants if it is
> > within the Arm Arm. These registers are described as out-of-context and
> > should not be used by speculation in EL2. If you want to use them in EL2,
> > you need an isb() before any instruction in EL2 using them otherwise you may
> > use an inconsistent context. This is giving enough freedom to the processor
> > while the impact in the software is minimal.
> > 
> > [...]
> > 
> > > > > 
> > > > > >            /* Ensure VTTBR_EL2 is synchronized before flushing the
> > > > > > TLBs */
> > > > > >            isb();
> > > > > >        }
> > > > > > @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
> > > > > >    static void setup_virt_paging_one(void *data)
> > > > > >    {
> > > > > >        WRITE_SYSREG32(vtcr, VTCR_EL2);
> > > > > > +
> > > > > > +    /*
> > > > > > +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free
> > > > > > from
> > > > > > +     * entries related to EL1/EL0 translation regime until a guest
> > > > > > vCPU
> > > > > > +     * is running. For that, we need to set-up VTTBR to point to an
> > > > > > empty
> > > > > > +     * page-table and turn on stage-2 translation.
> > > > > 
> > > > > I don't understand why this is needed: isn't the lack of HCR_VM (due
> > > > > to
> > > > > your previous patch) supposed to be sufficient? How can there be
> > > > > speculation without HCR_VM?
> > > > 
> > > > HCR_EL2.VM unsets means the stage-2 will not be used for the EL1/EL0
> > > > translation regime. In the context of the erratum, the AT can still
> > > > speculate
> > > > except it will not take into account the stage-2. The dependencies on
> > > > VMID
> > > > stills applies when HCR_EL2.VM is unset, so from my understanding, the
> > > > entry
> > > > could get cached to whatever is VTTBR_EL2.VMID.
> > > 
> > > Damn! Even if at this point of the boot sequence there is no EL1 / EL0
> > > at all? How can that speculation happen? Shouldn't the first EL1 / EL0
> > > speculation occur after the first leave_hypervisor_tail?
> > 
> > How do you know EL1 was not run before hand? Imagine we did a soft reboot or
> > kexec Xen...
> > 
> > But the speculation in that context is may be because the processor noticed
> > an AT instruction targeting EL1 in the stream.

This seems to be extremely improbable, borderline impossible to me, but
I can imagine that we might want to be extra-paranoid to make sure all
potential issues are covered.


> > > > > Even if speculation happens without HCR_EL2, why do we need to set it
> > > > > now? Isn't setting empty_root_mfn enough?
> > > > 
> > > > The main goal here is to have the TLBs in a known state after the CPU
> > > > has been
> > > > initialized. After the sequence below, we are sure that the TLBs don't
> > > > contain
> > > > entries associated to the EL1/EL0 regime and and a speculated AT
> > > > instruction
> > > > will not be able to allocate more.
> > > > 
> > > > Without HCR_EL2.VM set, the stage-2 page-table will not get used. So a
> > > > speculated AT instruction could still allocate an entry in TLB. It is
> > > > not a
> > > > major issue as it would be against INVALID_VMID, yet it is not a very
> > > > sane
> > > > situation for the hypervisor.
> > > 
> > > I have a question on the tlb flush.  Do we need it because the tlb is
> > > not guaranteed to be clean after boot?
> > 
> > You don't know the state of the TLBs after boot.
> > 
> > > 
> > > Also, do we need a flush_tlb_all_local()? Would flush_tlb_local be
> > > enough, maybe executed immediately before switching VTTBR_EL2? I guess
> > > it depends on whether the speculation happens on the local VMID only.
> > 
> > Speculation can only happen using system registers. So only on the local
> > VMID only.
> > 
> > > If it only speculate on the local VMID, then flush_tlb_all_local()
> > > should suffice?
> > 
> > We have two VMIDs in play here: whatever was the value in VTTBR_EL2.VMID
> > before the function and INVALID_VMID. We would need to flush the former and
> > this would require empty root trick because speculation can happen as soon
> > as flush ended.
> > 
> > But then, you rely on Xen to only use a single VMID at boot. While this is
> > the case today, I can't tell if it will be in the future.
> > 
> > So the flush_tlb_local is the safest.
> 
> Hmmm, I meant flush_tlb_all_local here.

OK.


Overall, I think we could probably get away without a couple of changes
from this patch, but since it is basically impossible for me to prove
it, I'll give my Reviewed-by. I saw that you resent the series already.
I'll take care of committing it.

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
  2019-01-29  0:52             ` Stefano Stabellini
@ 2019-01-29 10:37               ` Julien Grall
  0 siblings, 0 replies; 36+ messages in thread
From: Julien Grall @ 2019-01-29 10:37 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, James Morse, andre.przywara

Hi,

On 29/01/2019 00:52, Stefano Stabellini wrote:
> On Mon, 28 Jan 2019, Julien Grall wrote:
>> On 1/27/19 9:55 AM, Julien Grall wrote:
>>> Hi,
>>>
>>> On 1/25/19 9:36 PM, Stefano Stabellini wrote:
>>>> On Thu, 24 Jan 2019, Julien Grall wrote:
>>>>> @James, please correct me if I am wrong below :).
>>>>>
>>>>> On 24/01/2019 00:52, Stefano Stabellini wrote:
>>>>>> On Wed, 28 Nov 2018, Julien Grall wrote:
>>>>> ... in the context of the errata, you have to imagine what can happen if
>>>>> an AT
>>>>> instruction is inserted (via speculation) between each instruction and
>>>>> what
>>>>> happen if the system registers are re-ordered.
>>>>>
>>>>> The key of the erratum is VTTBR_EL2. This is what will stop a speculated
>>>>> AT
>>>>> instruction to allocate a TLBs entry because you are not allowed to
>>>>> cache a
>>>>> translation that will fault. Without the isb() here, the VTTBR_EL2 may
>>>>> be
>>>>> synchronized before the rest of the context, so a speculated AT
>>>>> instruction
>>>>> may use an inconsistent state and allocate a TLB entry with an
>>>>> unexpected
>>>>> translation against the guest.
>>>>>
>>>>> So here, we want to ensure the rest of the context is synchronized
>>>>> before
>>>>> writing to VTTBR_EL2, hence the isb().
>>>>
>>>> OK. I understand the explanation, thank you.
>>>>
>>>> I just thought that the CPU would be smart enough to only reorder system
>>>> registers writes when appropriate, especially when the CPU is also doing
>>>> speculation at the same time. Why would it speculate if it knows that it
>>>> is reordering sysreg writes that can badly affect the speculation
>>>> itself? Let me say that it doesn't sound like a "sane" behavior to me.
>>>> But if it behaves this way, it behaves this way...
>>>
>>> I hope you are aware we are speaking about an erratum here... Not what the
>>> Arm Arm allows.
> 
> I know -- we are talking about a specific CPU implementation. That is
> why it seems strange to me that a CPU would reorder things that it
> should know they cause trouble to speculation. Anyway, no point in
> discussing hardware design choices at this stage.

I agree that speculation may not happen as I described in my previous e-mail. 
However, we have to assume that any behavior allowed by the Arm Arm can happen 
unless stated otherwise by the specific processor documentation.

This series is based on the Arm Arm and the erratum description provides in the 
Software Developer Errata Notice for the Cortex-A76 [1], both are available 
publicly.

Regarding re-ordering, the wording in the document does not provide any strong 
evidence the writes to system register cannot be re-ordered. The section 
"conditions" actually suggests the invert (i.e re-ordering is possible).

[...]

>>>>>>
>>>>>>>             /* Ensure VTTBR_EL2 is synchronized before flushing the
>>>>>>> TLBs */
>>>>>>>             isb();
>>>>>>>         }
>>>>>>> @@ -1504,6 +1545,23 @@ static uint32_t __read_mostly vtcr;
>>>>>>>     static void setup_virt_paging_one(void *data)
>>>>>>>     {
>>>>>>>         WRITE_SYSREG32(vtcr, VTCR_EL2);
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * ARM64_WORKAROUND_AT_SPECULATE: We want to keep the TLBs free
>>>>>>> from
>>>>>>> +     * entries related to EL1/EL0 translation regime until a guest
>>>>>>> vCPU
>>>>>>> +     * is running. For that, we need to set-up VTTBR to point to an
>>>>>>> empty
>>>>>>> +     * page-table and turn on stage-2 translation.
>>>>>>
>>>>>> I don't understand why this is needed: isn't the lack of HCR_VM (due
>>>>>> to
>>>>>> your previous patch) supposed to be sufficient? How can there be
>>>>>> speculation without HCR_VM?
>>>>>
>>>>> HCR_EL2.VM unsets means the stage-2 will not be used for the EL1/EL0
>>>>> translation regime. In the context of the erratum, the AT can still
>>>>> speculate
>>>>> except it will not take into account the stage-2. The dependencies on
>>>>> VMID
>>>>> stills applies when HCR_EL2.VM is unset, so from my understanding, the
>>>>> entry
>>>>> could get cached to whatever is VTTBR_EL2.VMID.
>>>>
>>>> Damn! Even if at this point of the boot sequence there is no EL1 / EL0
>>>> at all? How can that speculation happen? Shouldn't the first EL1 / EL0
>>>> speculation occur after the first leave_hypervisor_tail?
>>>
>>> How do you know EL1 was not run before hand? Imagine we did a soft reboot or
>>> kexec Xen...
>>>
>>> But the speculation in that context is may be because the processor noticed
>>> an AT instruction targeting EL1 in the stream.
> 
> This seems to be extremely improbable, borderline impossible to me, but
> I can imagine that we might want to be extra-paranoid to make sure all
> potential issues are covered.

The "Workaround" section of the erratum contains the following wording:

"Note that a workaround is only required if the system software
contains an AT instruction as part of an executable page."

Xen contains AT instruction in an executable page, so speculation can happen and 
we don't know when.

[...]

> Overall, I think we could probably get away without a couple of changes
> from this patch, but since it is basically impossible for me to prove
> it, I'll give my Reviewed-by. I saw that you resent the series already.
> I'll take care of committing it.
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you for the review!

Cheers,

[1] 
https://silver.arm.com/download/Documentation/BX500-DA-10008-r0p0-02rel0/Arm_Cortex_A76_MP052_Software_Developer_Errata_Notice_v11.0.pdf

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-01-29 10:37 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-28 16:49 [PATCH for-4.12 0/8] xen/arm: Workaround for Cortex-A76 erratum 1165522 Julien Grall
2018-11-28 16:49 ` [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2 Julien Grall
2018-12-21 14:33   ` Andrii Anisov
2019-01-23 23:50   ` Stefano Stabellini
2018-11-28 16:49 ` [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR Julien Grall
2018-12-21 14:34   ` Andrii Anisov
2019-01-23 23:27   ` Stefano Stabellini
2019-01-24 11:06     ` Julien Grall
2018-11-28 16:49 ` [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table Julien Grall
2018-12-21 14:35   ` Andrii Anisov
2019-01-23 23:29   ` Stefano Stabellini
2018-11-28 16:49 ` [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data Julien Grall
2018-12-21 15:16   ` Andrii Anisov
2019-01-24  0:08   ` Stefano Stabellini
2018-11-28 16:49 ` [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary Julien Grall
2018-12-21 14:36   ` Andrii Anisov
2018-12-21 14:43   ` Andrii Anisov
2018-12-21 14:48     ` Julien Grall
2019-01-23 23:42   ` Stefano Stabellini
2019-01-24 11:29     ` Julien Grall
2018-11-28 16:49 ` [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522 Julien Grall
2018-12-21 15:47   ` Andrii Anisov
2019-01-24  0:52   ` Stefano Stabellini
2019-01-24 13:17     ` Julien Grall
2019-01-25 21:36       ` Stefano Stabellini
2019-01-27  9:55         ` Julien Grall
2019-01-28 11:11           ` Julien Grall
2019-01-29  0:52             ` Stefano Stabellini
2019-01-29 10:37               ` Julien Grall
2018-11-28 16:49 ` [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
2018-12-21 14:44   ` Andrii Anisov
2018-12-21 15:33     ` Andrii Anisov
2018-11-28 16:49 ` [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code Julien Grall
2018-12-21 14:44   ` Andrii Anisov
2018-12-21 14:47     ` Andrew Cooper
2018-12-21 14:55       ` Andrii Anisov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.