xen-devel.lists.xenproject.org archive mirror
* [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations
@ 2018-12-04 20:26 Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 01/17] xen/arm: Introduce helpers to clear/set flags in HCR_EL2 Julien Grall
                   ` (16 more replies)
  0 siblings, 17 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, Wei Liu, Razvan Cojocaru, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Tamas K Lengyel, Jan Beulich, Roger Pau Monné

Hi all,

This is version 2 of the series to implement set/way. For more details see
patch #16.

A branch with the code is available at:

https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
branch cacheflush/v2

Cheers,

Julien Grall (17):
  xen/arm: Introduce helpers to clear/set flags in HCR_EL2
  xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h
  xen/arm: p2m: Clean-up headers included and order them alphabetically
  xen/arm: p2m: Introduce p2m_is_valid and use it
  xen/arm: p2m: Handle translation fault in get_page_from_gva
  xen/arm: p2m: Introduce a function to resolve translation fault
  xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by
    HCR_EL2.TVM
  xen/arm: vsysreg: Add wrapper to handle sysreg access trapped by
    HCR_EL2.TVM
  xen/arm: Rework p2m_cache_flush to take a range [begin, end)
  xen/arm: p2m: Allow to flush cache on any RAM region
  xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0]
    (valid bit)
  xen/arm: traps: Rework leave_hypervisor_tail
  xen/arm: p2m: Rework p2m_cache_flush_range
  xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush
  xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
  xen/arm: Implement Set/Way operations
  xen/arm: Track page accessed between batch of Set/Way operations

 xen/arch/arm/arm64/vsysreg.c    |  75 +++++++
 xen/arch/arm/domain.c           |  14 ++
 xen/arch/arm/domctl.c           |  14 +-
 xen/arch/arm/mem_access.c       |   6 +-
 xen/arch/arm/p2m.c              | 478 +++++++++++++++++++++++++++++++++++-----
 xen/arch/arm/traps.c            | 123 ++++++-----
 xen/arch/arm/vcpreg.c           | 171 ++++++++++++++
 xen/arch/x86/domain.c           |   4 +
 xen/common/domain.c             |   5 +-
 xen/include/asm-arm/cpregs.h    |   1 +
 xen/include/asm-arm/domain.h    |   8 +
 xen/include/asm-arm/p2m.h       |  36 ++-
 xen/include/asm-arm/processor.h |  18 ++
 xen/include/asm-arm/traps.h     |  24 ++
 xen/include/xen/domain.h        |   2 +
 15 files changed, 856 insertions(+), 123 deletions(-)

-- 
2.11.0



* [PATCH for-4.12 v2 01/17] xen/arm: Introduce helpers to clear/set flags in HCR_EL2
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 02/17] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

A couple of places in the code will need to clear/set flags in HCR_EL2
for a given vCPU and then replicate the change into the hardware.
Introduce helpers for this and replace the open-coded version.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

---

The patch was previously sent separately and reviewed by Stefano.

I haven't found a good place for those new helpers, so they stick to
processor.h for the moment. This requires using macros rather than inline
helpers, given that processor.h is usually the root of all headers.
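
As a rough illustration only (not code from this series), a caller that
wants to trap the virtual memory control registers could use the helpers
like this, assuming the existing HCR_TVM bit definition:

    /* Hypothetical example: toggle HCR_EL2.TVM for the current vCPU. */
    static void example_set_tvm(struct vcpu *v)
    {
        /* Trap NS EL1 writes to the virtual memory control registers. */
        vcpu_hcr_set_flags(v, HCR_TVM);
    }

    static void example_clear_tvm(struct vcpu *v)
    {
        vcpu_hcr_clear_flags(v, HCR_TVM);
    }

Both helpers update v->arch.hcr_el2 and write the new value back to
HCR_EL2, so the change takes effect immediately on the current pCPU.
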
---
 xen/arch/arm/traps.c            |  3 +--
 xen/include/asm-arm/processor.h | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 88ffeeb480..c05a8ad25c 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -681,8 +681,7 @@ static void inject_vabt_exception(struct cpu_user_regs *regs)
         break;
     }
 
-    current->arch.hcr_el2 |= HCR_VA;
-    WRITE_SYSREG(current->arch.hcr_el2, HCR_EL2);
+    vcpu_hcr_set_flags(current, HCR_VA);
 }
 
 /*
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 72ddc42778..cb781751a6 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -490,6 +490,24 @@ register_t get_default_hcr_flags(void);
                                  : : : "memory");                 \
     } while (0)
 
+/*
+ * Clear/Set flags in HCR_EL2 for a given vCPU. It only supports the current
+ * vCPU for now.
+ */
+#define vcpu_hcr_clear_flags(v, flags)              \
+    do {                                            \
+        ASSERT((v) == current);                     \
+        (v)->arch.hcr_el2 &= ~(flags);              \
+        WRITE_SYSREG((v)->arch.hcr_el2, HCR_EL2);   \
+    } while (0)
+
+#define vcpu_hcr_set_flags(v, flags)                \
+    do {                                            \
+        ASSERT((v) == current);                     \
+        (v)->arch.hcr_el2 |= (flags);               \
+        WRITE_SYSREG((v)->arch.hcr_el2, HCR_EL2);   \
+    } while (0)
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_ARM_PROCESSOR_H */
 /*
-- 
2.11.0



* [PATCH for-4.12 v2 02/17] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 01/17] xen/arm: Introduce helpers to clear/set flags in HCR_EL2 Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 03/17] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

GUEST_BUG_ON may be used in other files doing guest emulation.
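
For illustration, a minimal sketch of how an emulation handler outside
traps.c might use it once the definition lives in traps.h (the function
name and the exact check here are made up for the example):

    #include <asm/traps.h>

    static bool example_emulate_write(struct cpu_user_regs *regs,
                                      uint32_t *r, bool read)
    {
        /*
         * The trap is only expected for writes; anything else would be a
         * hardware/spec issue rather than guest-controllable state.
         */
        GUEST_BUG_ON(read);

        return true;
    }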

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---

    The patch was previously sent separately.

    Changes in v2:
        - Add Stefano's acked-by
---
 xen/arch/arm/traps.c        | 24 ------------------------
 xen/include/asm-arm/traps.h | 24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index c05a8ad25c..94fe1a6da7 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -67,30 +67,6 @@ static inline void check_stack_alignment_constraints(void) {
 #endif
 }
 
-/*
- * GUEST_BUG_ON is intended for checking that the guest state has not been
- * corrupted in hardware and/or that the hardware behaves as we
- * believe it should (i.e. that certain traps can only occur when the
- * guest is in a particular mode).
- *
- * The intention is to limit the damage such h/w bugs (or spec
- * misunderstandings) can do by turning them into Denial of Service
- * attacks instead of e.g. information leaks or privilege escalations.
- *
- * GUEST_BUG_ON *MUST* *NOT* be used to check for guest controllable state!
- *
- * Compared with regular BUG_ON it dumps the guest vcpu state instead
- * of Xen's state.
- */
-#define guest_bug_on_failed(p)                          \
-do {                                                    \
-    show_execution_state(guest_cpu_user_regs());        \
-    panic("Guest Bug: %pv: '%s', line %d, file %s\n",   \
-          current, p, __LINE__, __FILE__);              \
-} while (0)
-#define GUEST_BUG_ON(p) \
-    do { if ( unlikely(p) ) guest_bug_on_failed(#p); } while (0)
-
 #ifdef CONFIG_ARM_32
 static int debug_stack_lines = 20;
 #define stack_words_per_line 8
diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
index 6d8a43a691..997c37884e 100644
--- a/xen/include/asm-arm/traps.h
+++ b/xen/include/asm-arm/traps.h
@@ -10,6 +10,30 @@
 # include <asm/arm64/traps.h>
 #endif
 
+/*
+ * GUEST_BUG_ON is intended for checking that the guest state has not been
+ * corrupted in hardware and/or that the hardware behaves as we
+ * believe it should (i.e. that certain traps can only occur when the
+ * guest is in a particular mode).
+ *
+ * The intention is to limit the damage such h/w bugs (or spec
+ * misunderstandings) can do by turning them into Denial of Service
+ * attacks instead of e.g. information leaks or privilege escalations.
+ *
+ * GUEST_BUG_ON *MUST* *NOT* be used to check for guest controllable state!
+ *
+ * Compared with regular BUG_ON it dumps the guest vcpu state instead
+ * of Xen's state.
+ */
+#define guest_bug_on_failed(p)                          \
+do {                                                    \
+    show_execution_state(guest_cpu_user_regs());        \
+    panic("Guest Bug: %pv: '%s', line %d, file %s\n",   \
+          current, p, __LINE__, __FILE__);              \
+} while (0)
+#define GUEST_BUG_ON(p) \
+    do { if ( unlikely(p) ) guest_bug_on_failed(#p); } while (0)
+
 int check_conditional_instr(struct cpu_user_regs *regs, const union hsr hsr);
 
 void advance_pc(struct cpu_user_regs *regs, const union hsr hsr);
-- 
2.11.0



* [PATCH for-4.12 v2 03/17] xen/arm: p2m: Clean-up headers included and order them alphabetically
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 01/17] xen/arm: Introduce helpers to clear/set flags in HCR_EL2 Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 02/17] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 23:47   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

A lot of the headers are not necessary, so remove them. At the same
time, re-order them alphabetically.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 6c76298ebc..81f3107dd2 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1,19 +1,11 @@
-#include <xen/sched.h>
-#include <xen/lib.h>
-#include <xen/errno.h>
+#include <xen/cpu.h>
 #include <xen/domain_page.h>
-#include <xen/bitops.h>
-#include <xen/vm_event.h>
-#include <xen/monitor.h>
 #include <xen/iocap.h>
-#include <xen/mem_access.h>
-#include <xen/xmalloc.h>
-#include <xen/cpu.h>
-#include <xen/notifier.h>
-#include <public/vm_event.h>
-#include <asm/flushtlb.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+
 #include <asm/event.h>
-#include <asm/hardirq.h>
+#include <asm/flushtlb.h>
 #include <asm/page.h>
 
 #define MAX_VMID_8_BIT  (1UL << 8)
-- 
2.11.0



* [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (2 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 03/17] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 23:50   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The LPAE format allows storing information in an entry even with the
valid bit unset. In a follow-up patch, we will take advantage of this
feature to re-purpose the valid bit for generating a translation fault
even if an entry contains valid information.

So we need a different way to know whether an entry contains valid
information. It is possible to use the information held in the p2m_type
for that purpose: all entries containing valid information have a valid
p2m type (i.e. p2m_type != p2m_invalid).

This patch introduces a new helper p2m_is_valid, which implements that
idea, and replaces most of the lpae_is_valid calls with the new helper.
The remaining ones are for TLB handling and entry accounting.

Along with this change, two other changes are required:
    - Generate table entries with a valid p2m type
    - Detect new mappings for proper stats accounting

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Don't open-code p2m_is_superpage
---
 xen/arch/arm/p2m.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 81f3107dd2..47b54c792e 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -212,17 +212,26 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
 }
 
 /*
+ * In the case of the P2M, the valid bit is used for other purpose. Use
+ * the type to check whether an entry is valid.
+ */
+static inline bool p2m_is_valid(lpae_t pte)
+{
+    return pte.p2m.type != p2m_invalid;
+}
+
+/*
  * lpae_is_* helpers don't check whether the valid bit is set in the
  * PTE. Provide our own overlay to check the valid bit.
  */
 static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
 {
-    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
+    return p2m_is_valid(pte) && lpae_is_mapping(pte, level);
 }
 
 static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
 {
-    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
+    return p2m_is_valid(pte) && lpae_is_superpage(pte, level);
 }
 
 #define GUEST_TABLE_MAP_FAILED 0
@@ -256,7 +265,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
 
     entry = *table + offset;
 
-    if ( !lpae_is_valid(*entry) )
+    if ( !p2m_is_valid(*entry) )
     {
         if ( read_only )
             return GUEST_TABLE_MAP_FAILED;
@@ -348,7 +357,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
 
     entry = table[offsets[level]];
 
-    if ( lpae_is_valid(entry) )
+    if ( p2m_is_valid(entry) )
     {
         *t = entry.p2m.type;
 
@@ -536,8 +545,11 @@ static lpae_t page_to_p2m_table(struct page_info *page)
     /*
      * The access value does not matter because the hardware will ignore
      * the permission fields for table entry.
+     *
+     * We use p2m_ram_rw so the entry has a valid type. This is important
+     * for p2m_is_valid() to return true on table entries.
      */
-    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
+    return mfn_to_p2m_entry(page_to_mfn(page), p2m_ram_rw, p2m_access_rwx);
 }
 
 static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
@@ -561,7 +573,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
     struct page_info *page;
     lpae_t *p;
 
-    ASSERT(!lpae_is_valid(*entry));
+    ASSERT(!p2m_is_valid(*entry));
 
     page = alloc_domheap_page(NULL, 0);
     if ( page == NULL )
@@ -618,7 +630,7 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
  */
 static void p2m_put_l3_page(const lpae_t pte)
 {
-    ASSERT(lpae_is_valid(pte));
+    ASSERT(p2m_is_valid(pte));
 
     /*
      * TODO: Handle other p2m types
@@ -646,7 +658,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
     struct page_info *pg;
 
     /* Nothing to do if the entry is invalid. */
-    if ( !lpae_is_valid(entry) )
+    if ( !p2m_is_valid(entry) )
         return;
 
     /* Nothing to do but updating the stats if the entry is a super-page. */
@@ -943,7 +955,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
             else
                 p2m->need_flush = true;
         }
-        else /* new mapping */
+        else if ( !p2m_is_valid(orig_pte) ) /* new mapping */
             p2m->stats.mappings[level]++;
 
         p2m_write_pte(entry, pte, p2m->clean_pte);
@@ -957,7 +969,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
      * Free the entry only if the original pte was valid and the base
      * is different (to avoid freeing when permission is changed).
      */
-    if ( lpae_is_valid(orig_pte) &&
+    if ( p2m_is_valid(orig_pte) &&
          !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
         p2m_free_entry(p2m, orig_pte, level);
 
-- 
2.11.0



* [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (3 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 23:59   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

A follow-up patch will re-purpose the valid bit of LPAE entries to
generate a fault even on entries containing valid information.

This means that translating a guest VA to a guest PA (i.e. IPA) will
fail if the Stage-2 entries used have the valid bit unset. Because of
that, we need to fall back to walking the page-table in software to
check whether the fault was expected.

This patch adds the software page-table walk on all translation faults.
It would be possible in the future to avoid a pointless walk when the
fault reported in PAR_EL1 is not a translation fault.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---

There are a couple of TODOs in the code. They are clean-ups and performance
improvements (e.g. when the fault cannot be handled) that could be deferred
until after the series has been merged.

    Changes in v2:
        - Check stage-2 permission during software lookup
        - Fix typos
---
 xen/arch/arm/p2m.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 59 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 47b54c792e..39680eeb6e 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -6,6 +6,7 @@
 
 #include <asm/event.h>
 #include <asm/flushtlb.h>
+#include <asm/guest_walk.h>
 #include <asm/page.h>
 
 #define MAX_VMID_8_BIT  (1UL << 8)
@@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
     struct page_info *page = NULL;
     paddr_t maddr = 0;
     uint64_t par;
+    mfn_t mfn;
+    p2m_type_t t;
 
     /*
      * XXX: To support a different vCPU, we would need to load the
@@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
     par = gvirt_to_maddr(va, &maddr, flags);
     p2m_read_unlock(p2m);
 
+    /*
+     * gvirt_to_maddr may fail if the entry does not have the valid bit
+     * set. Fallback to the second method:
+     *  1) Translate the VA to IPA using software lookup -> Stage-1 page-table
+     *  may not be accessible because the stage-2 entries may have valid
+     *  bit unset.
+     *  2) Software lookup of the MFN
+     *
+     * Note that when memaccess is enabled, we instead call directly
+     * p2m_mem_access_check_and_get_page(...). Because the function is
+     * a variant of the methods described above, it will be able to
+     * handle entries with valid bit unset.
+     *
+     * TODO: Integrate more nicely memaccess with the rest of the
+     * function.
+     * TODO: Use the fault error in PAR_EL1 to avoid pointless
+     *  translation.
+     */
     if ( par )
     {
+        paddr_t ipa;
+        unsigned int s1_perms;
+
         /*
          * When memaccess is enabled, the translation GVA to MADDR may
          * have failed because of a permission fault.
@@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
         if ( p2m->mem_access_enabled )
             return p2m_mem_access_check_and_get_page(va, flags, v);
 
-        dprintk(XENLOG_G_DEBUG,
-                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx par=%#"PRIx64"\n",
-                v, va, flags, par);
-        return NULL;
+        /*
+         * The software stage-1 table walk can still fail, e.g., if the
+         * GVA is not mapped.
+         */
+        if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
+        {
+            dprintk(XENLOG_G_DEBUG,
+                    "%pv: Failed to walk page-table va %#"PRIvaddr"\n", v, va);
+            return NULL;
+        }
+
+        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
+        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
+            return NULL;
+
+        /*
+         * Check the permissions that are assumed by the caller. For instance
+         * in the case of guestcopy, the caller assumes that the translated
+         * page can be accessed with the requested permissions. If this
+         * is not the case, we should fail.
+         *
+         * Please note that we do not check for the GV2M_EXEC
+         * permission. This is fine because the hardware-based translation
+         * instruction does not test for execute permissions.
+         */
+        if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
+            return NULL;
+
+        if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
+            return NULL;
     }
+    else
+        mfn = maddr_to_mfn(maddr);
 
-    if ( !mfn_valid(maddr_to_mfn(maddr)) )
+    if ( !mfn_valid(mfn) )
     {
         dprintk(XENLOG_G_DEBUG, "%pv: Invalid MFN %#"PRI_mfn"\n",
-                v, mfn_x(maddr_to_mfn(maddr)));
+                v, mfn_x(mfn));
         return NULL;
     }
 
-    page = mfn_to_page(maddr_to_mfn(maddr));
+    page = mfn_to_page(mfn);
     ASSERT(page);
 
     if ( unlikely(!get_page(page, d)) )
-- 
2.11.0



* [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (4 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 22:33   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

Currently, a Stage-2 translation fault can happen because of:
    1) MMIO emulation
    2) Another pCPU modifying the P2M using Break-Before-Make
    3) A guest physical address that is not mapped

A follow-up patch will re-purpose the valid bit in an entry to generate
a translation fault. This will be used to perform an action on each entry
to track pages used during a given period.

When receiving the translation fault, we need to walk the page tables to
find the faulting entry and then toggle the valid bit. We can't use
p2m_lookup() for this purpose as it only tells us that the mapping exists.

So this patch adds a new function to walk the page-tables and update the
entry. This function also handles 2), as it too requires walking the
page-table.

The function is able to cope with both table and block entries having the
valid bit unset. This gives flexibility to the function clearing the valid
bits. To keep the algorithm simple, the fault will be propagated one level
down. This is repeated until a block entry has been reached.

At the moment, no action is taken when reaching a block/page entry other
than setting the valid bit to 1.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Typos
        - Add more comments
        - Skip clearing the valid bit if it was already done
        - Move the prototype to p2m.h
        - Expand commit message
---
 xen/arch/arm/p2m.c        | 142 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c      |  10 ++--
 xen/include/asm-arm/p2m.h |   2 +
 3 files changed, 148 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 39680eeb6e..2706db3e67 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1035,6 +1035,148 @@ int p2m_set_entry(struct p2m_domain *p2m,
     return rc;
 }
 
+/* Invalidate all entries in the table. The p2m should be write locked. */
+static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
+{
+    lpae_t *table;
+    unsigned int i;
+
+    ASSERT(p2m_is_write_locked(p2m));
+
+    table = map_domain_page(mfn);
+
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+    {
+        lpae_t pte = table[i];
+
+        /*
+         * Writing an entry can be expensive because it may involve
+         * cleaning the cache. So avoid updating the entry if the valid
+         * bit is already cleared.
+         */
+        if ( !pte.p2m.valid )
+            continue;
+
+        pte.p2m.valid = 0;
+
+        p2m_write_pte(&table[i], pte, p2m->clean_pte);
+    }
+
+    unmap_domain_page(table);
+
+    p2m->need_flush = true;
+}
+
+/*
+ * Resolve any translation fault due to change in the p2m. This
+ * includes break-before-make and valid bit cleared.
+ */
+bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    unsigned int level = 0;
+    bool resolved = false;
+    lpae_t entry, *table;
+    paddr_t addr = gfn_to_gaddr(gfn);
+
+    /* Convenience aliases */
+    const unsigned int offsets[4] = {
+        zeroeth_table_offset(addr),
+        first_table_offset(addr),
+        second_table_offset(addr),
+        third_table_offset(addr)
+    };
+
+    p2m_write_lock(p2m);
+
+    /* This gfn is higher than the highest the p2m map currently holds */
+    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
+        goto out;
+
+    table = p2m_get_root_pointer(p2m, gfn);
+    /*
+     * The table should always be non-NULL because the gfn is below
+     * p2m->max_mapped_gfn and the root table pages are always present.
+     */
+    BUG_ON(table == NULL);
+
+    /*
+     * Go down the page-tables until an entry has the valid bit unset or
+     * a block/page entry has been hit.
+     */
+    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
+    {
+        int rc;
+
+        entry = table[offsets[level]];
+
+        if ( level == 3 )
+            break;
+
+        /* Stop as soon as we hit an entry with the valid bit unset. */
+        if ( !lpae_is_valid(entry) )
+            break;
+
+        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
+        if ( rc == GUEST_TABLE_MAP_FAILED )
+            goto out_unmap;
+        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
+            break;
+    }
+
+    /*
+     * If the valid bit of the entry is set, it means someone was playing with
+     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
+     */
+    if ( lpae_is_valid(entry) )
+    {
+        resolved = true;
+        goto out_unmap;
+    }
+
+    /*
+     * The valid bit is unset. If the entry is still not valid then the fault
+     * cannot be resolved, exit and report it.
+     */
+    if ( !p2m_is_valid(entry) )
+        goto out_unmap;
+
+    /*
+     * Now we have an entry with valid bit unset, but still valid from
+     * the P2M point of view.
+     *
+     * If an entry is pointing to a table, each entry of the table will
+     * have its valid bit cleared. This allows a function to clear the
+     * full p2m with just a couple of writes. The valid bit will then be
+     * propagated on the fault.
+     * If an entry is pointing to a block/page, no work to do for now.
+     */
+    if ( lpae_is_table(entry, level) )
+        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
+
+    /*
+     * Now that the work on the entry is done, set the valid bit to prevent
+     * another fault on that entry.
+     */
+    resolved = true;
+    entry.p2m.valid = 1;
+
+    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
+
+    /*
+     * No need to flush the TLBs as the modified entry had the valid bit
+     * unset.
+     */
+
+out_unmap:
+    unmap_domain_page(table);
+
+out:
+    p2m_write_unlock(p2m);
+
+    return resolved;
+}
+
 static inline int p2m_insert_mapping(struct domain *d,
                                      gfn_t start_gfn,
                                      unsigned long nr,
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 94fe1a6da7..b00d0b8e1e 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1893,7 +1893,6 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
     vaddr_t gva;
     paddr_t gpa;
     uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
-    mfn_t mfn;
     bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
 
     /*
@@ -1972,12 +1971,11 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
         }
 
         /*
-         * The PT walk may have failed because someone was playing
-         * with the Stage-2 page table. Walk the Stage-2 PT to check
-         * if the entry exists. If it's the case, return to the guest
+         * First check if the translation fault can be resolved by the
+         * P2M subsystem. If that's the case nothing else to do.
          */
-        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
-        if ( !mfn_eq(mfn, INVALID_MFN) )
+        if ( p2m_resolve_translation_fault(current->domain,
+                                           gaddr_to_gfn(gpa)) )
             return;
 
         if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 4fe78d39a5..13f7a27c38 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -226,6 +226,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
                   p2m_type_t t,
                   p2m_access_t a);
 
+bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
+
 /* Clean & invalidate caches corresponding to a region of guest address space */
 int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);
 
-- 
2.11.0



* [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (5 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 22:33   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 08/17] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

A follow-up patch will require emulating some accesses to co-processor
registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes to the
virtual memory control registers are trapped to the hypervisor.

This patch adds the infrastructure to pass through the accesses to the
host registers. For convenience, a set of macros has been added to
generate the different helpers.

Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Add missing include vreg.h
        - Fix up the mask in TVM_REG32_COMBINED
        - Update comments
---
 xen/arch/arm/vcpreg.c        | 149 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/cpregs.h |   1 +
 2 files changed, 150 insertions(+)

diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
index 7b783e4bcc..550c25ec3f 100644
--- a/xen/arch/arm/vcpreg.c
+++ b/xen/arch/arm/vcpreg.c
@@ -23,8 +23,129 @@
 #include <asm/current.h>
 #include <asm/regs.h>
 #include <asm/traps.h>
+#include <asm/vreg.h>
 #include <asm/vtimer.h>
 
+/*
+ * Macros to help generating helpers for registers trapped when
+ * HCR_EL2.TVM is set.
+ *
+ * Note that it only traps NS write access from EL1.
+ *
+ *  - TVM_REG() should not be used outside of the macros. It is there to
+ *    help define TVM_REG32() and TVM_REG64().
+ *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
+ *    generate helpers accessing respectively a 32-bit and a 64-bit
+ *    register. "regname" is the Arm32 name and "xreg" the Arm64 name.
+ *  - TVM_REG32_COMBINED(lowreg, hireg, xreg) is used to generate a pair
+ *    of registers sharing the same Arm64 register, but which are 2
+ *    distinct Arm32 registers. "lowreg" and "hireg" contain the names of
+ *    the Arm32 registers, "xreg" contains the name of the combined
+ *    register on Arm64. The definitions of "lowreg" and "hireg" match the
+ *    Armv8 specification: "lowreg" is an alias to xreg[31:0] and "hireg"
+ *    is an alias to xreg[63:32].
+ *
+ */
+
+/* The name is passed from the upper macro to workaround macro expansion. */
+#define TVM_REG(sz, func, reg...)                                           \
+static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
+{                                                                           \
+    GUEST_BUG_ON(read);                                                     \
+    WRITE_SYSREG##sz(*r, reg);                                              \
+                                                                            \
+    return true;                                                            \
+}
+
+#define TVM_REG32(regname, xreg) TVM_REG(32, vreg_emulate_##regname, xreg)
+#define TVM_REG64(regname, xreg) TVM_REG(64, vreg_emulate_##regname, xreg)
+
+#ifdef CONFIG_ARM_32
+#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
+    /* Use TVM_REG directly to workaround macro expansion. */       \
+    TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
+    TVM_REG(32, vreg_emulate_##hireg, hireg)
+
+#else /* CONFIG_ARM_64 */
+#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                             \
+static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
+                                bool read, bool hi)                         \
+{                                                                           \
+    register_t reg = READ_SYSREG(xreg);                                     \
+                                                                            \
+    GUEST_BUG_ON(read);                                                     \
+    if ( hi ) /* reg[63:32] is AArch32 register hireg */                    \
+    {                                                                       \
+        reg &= GENMASK(31, 0);                                              \
+        reg |= ((uint64_t)*r) << 32;                                        \
+    }                                                                       \
+    else /* reg[31:0] is AArch32 register lowreg. */                        \
+    {                                                                       \
+        reg &= GENMASK(63, 32);                                             \
+        reg |= *r;                                                          \
+    }                                                                       \
+    WRITE_SYSREG(reg, xreg);                                                \
+                                                                            \
+    return true;                                                            \
+}                                                                           \
+                                                                            \
+static bool vreg_emulate_##lowreg(struct cpu_user_regs *regs, uint32_t *r,  \
+                                  bool read)                                \
+{                                                                           \
+    return vreg_emulate_##xreg(regs, r, read, false);                       \
+}                                                                           \
+                                                                            \
+static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
+                                 bool read)                                 \
+{                                                                           \
+    return vreg_emulate_##xreg(regs, r, read, true);                        \
+}
+#endif
+
+/* Defining helpers for emulating co-processor registers. */
+TVM_REG32(SCTLR, SCTLR_EL1)
+/*
+ * AArch32 provides two way to access TTBR* depending on the access
+ * size, whilst AArch64 provides one way.
+ *
+ * When using AArch32, for simplicity, use the same access size as the
+ * guest.
+ */
+#ifdef CONFIG_ARM_32
+TVM_REG32(TTBR0_32, TTBR0_32)
+TVM_REG32(TTBR1_32, TTBR1_32)
+#else
+TVM_REG32(TTBR0_32, TTBR0_EL1)
+TVM_REG32(TTBR1_32, TTBR1_EL1)
+#endif
+TVM_REG64(TTBR0, TTBR0_EL1)
+TVM_REG64(TTBR1, TTBR1_EL1)
+/* AArch32 registers TTBCR and TTBCR2 share AArch64 register TCR_EL1. */
+TVM_REG32_COMBINED(TTBCR, TTBCR2, TCR_EL1)
+TVM_REG32(DACR, DACR32_EL2)
+TVM_REG32(DFSR, ESR_EL1)
+TVM_REG32(IFSR, IFSR32_EL2)
+/* AArch32 registers DFAR and IFAR shares AArch64 register FAR_EL1. */
+TVM_REG32_COMBINED(DFAR, IFAR, FAR_EL1)
+TVM_REG32(ADFSR, AFSR0_EL1)
+TVM_REG32(AIFSR, AFSR1_EL1)
+/* AArch32 registers MAIR0 and MAIR1 share AArch64 register MAIR_EL1. */
+TVM_REG32_COMBINED(MAIR0, MAIR1, MAIR_EL1)
+/* AArch32 registers AMAIR0 and AMAIR1 share AArch64 register AMAIR_EL1. */
+TVM_REG32_COMBINED(AMAIR0, AMAIR1, AMAIR_EL1)
+TVM_REG32(CONTEXTIDR, CONTEXTIDR_EL1)
+
+/* Macro to easily generate cases for co-processor emulation. */
+#define GENERATE_CASE(reg, sz)                                      \
+    case HSR_CPREG##sz(reg):                                        \
+    {                                                               \
+        bool res;                                                   \
+                                                                    \
+        res = vreg_emulate_cp##sz(regs, hsr, vreg_emulate_##reg);   \
+        ASSERT(res);                                                \
+        break;                                                      \
+    }
+
 void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
 {
     const struct hsr_cp32 cp32 = hsr.cp32;
@@ -65,6 +186,31 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
         break;
 
     /*
+     * HCR_EL2.TVM
+     *
+     * ARMv8 (DDI 0487D.a): Table D1-38
+     */
+    GENERATE_CASE(SCTLR, 32)
+    GENERATE_CASE(TTBR0_32, 32)
+    GENERATE_CASE(TTBR1_32, 32)
+    GENERATE_CASE(TTBCR, 32)
+    GENERATE_CASE(TTBCR2, 32)
+    GENERATE_CASE(DACR, 32)
+    GENERATE_CASE(DFSR, 32)
+    GENERATE_CASE(IFSR, 32)
+    GENERATE_CASE(DFAR, 32)
+    GENERATE_CASE(IFAR, 32)
+    GENERATE_CASE(ADFSR, 32)
+    GENERATE_CASE(AIFSR, 32)
+    /* AKA PRRR */
+    GENERATE_CASE(MAIR0, 32)
+    /* AKA NMRR */
+    GENERATE_CASE(MAIR1, 32)
+    GENERATE_CASE(AMAIR0, 32)
+    GENERATE_CASE(AMAIR1, 32)
+    GENERATE_CASE(CONTEXTIDR, 32)
+
+    /*
      * MDCR_EL2.TPM
      *
      * ARMv7 (DDI 0406C.b): B1.14.17
@@ -193,6 +339,9 @@ void do_cp15_64(struct cpu_user_regs *regs, const union hsr hsr)
             return inject_undef_exception(regs, hsr);
         break;
 
+    GENERATE_CASE(TTBR0, 64)
+    GENERATE_CASE(TTBR1, 64)
+
     /*
      * CPTR_EL2.T{0..9,12..13}
      *
diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
index 97a3c6f1c1..8fd344146e 100644
--- a/xen/include/asm-arm/cpregs.h
+++ b/xen/include/asm-arm/cpregs.h
@@ -140,6 +140,7 @@
 
 /* CP15 CR2: Translation Table Base and Control Registers */
 #define TTBCR           p15,0,c2,c0,2   /* Translation Table Base Control Register */
+#define TTBCR2          p15,0,c2,c0,3   /* Translation Table Base Control Register 2 */
 #define TTBR0           p15,0,c2        /* Translation Table Base Reg. 0 */
 #define TTBR1           p15,1,c2        /* Translation Table Base Reg. 1 */
 #define HTTBR           p15,4,c2        /* Hyp. Translation Table Base Register */
-- 
2.11.0



* [PATCH for-4.12 v2 08/17] xen/arm: vsysreg: Add wrapper to handle sysreg access trapped by HCR_EL2.TVM
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (6 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 09/17] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

A follow-up patch will require emulating some accesses to system
registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes to the
virtual memory control registers are trapped to the hypervisor.

This patch adds the infrastructure to pass through the accesses to the
host registers.

Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Add missing include vreg.h
        - Update the documentation reference to the latest one
---
 xen/arch/arm/arm64/vsysreg.c | 58 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 6e60824572..16ac9c344a 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -21,8 +21,49 @@
 #include <asm/current.h>
 #include <asm/regs.h>
 #include <asm/traps.h>
+#include <asm/vreg.h>
 #include <asm/vtimer.h>
 
+/*
+ * Macro to help generating helpers for registers trapped when
+ * HCR_EL2.TVM is set.
+ *
+ * Note that it only traps NS write access from EL1.
+ */
+#define TVM_REG(reg)                                                \
+static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
+                               uint64_t *r, bool read)              \
+{                                                                   \
+    GUEST_BUG_ON(read);                                             \
+    WRITE_SYSREG64(*r, reg);                                        \
+                                                                    \
+    return true;                                                    \
+}
+
+/* Defining helpers for emulating sysreg registers. */
+TVM_REG(SCTLR_EL1)
+TVM_REG(TTBR0_EL1)
+TVM_REG(TTBR1_EL1)
+TVM_REG(TCR_EL1)
+TVM_REG(ESR_EL1)
+TVM_REG(FAR_EL1)
+TVM_REG(AFSR0_EL1)
+TVM_REG(AFSR1_EL1)
+TVM_REG(MAIR_EL1)
+TVM_REG(AMAIR_EL1)
+TVM_REG(CONTEXTIDR_EL1)
+
+/* Macro to easily generate cases for sysreg emulation. */
+#define GENERATE_CASE(reg)                                              \
+    case HSR_SYSREG_##reg:                                              \
+    {                                                                   \
+        bool res;                                                       \
+                                                                        \
+        res = vreg_emulate_sysreg64(regs, hsr, vreg_emulate_##reg);     \
+        ASSERT(res);                                                    \
+        break;                                                          \
+    }
+
 void do_sysreg(struct cpu_user_regs *regs,
                const union hsr hsr)
 {
@@ -44,6 +85,23 @@ void do_sysreg(struct cpu_user_regs *regs,
         break;
 
     /*
+     * HCR_EL2.TVM
+     *
+     * ARMv8 (DDI 0487D.a): Table D1-38
+     */
+    GENERATE_CASE(SCTLR_EL1)
+    GENERATE_CASE(TTBR0_EL1)
+    GENERATE_CASE(TTBR1_EL1)
+    GENERATE_CASE(TCR_EL1)
+    GENERATE_CASE(ESR_EL1)
+    GENERATE_CASE(FAR_EL1)
+    GENERATE_CASE(AFSR0_EL1)
+    GENERATE_CASE(AFSR1_EL1)
+    GENERATE_CASE(MAIR_EL1)
+    GENERATE_CASE(AMAIR_EL1)
+    GENERATE_CASE(CONTEXTIDR_EL1)
+
+    /*
      * MDCR_EL2.TDRA
      *
      * ARMv8 (DDI 0487A.d): D1-1508 Table D1-57
-- 
2.11.0



* [PATCH for-4.12 v2 09/17] xen/arm: Rework p2m_cache_flush to take a range [begin, end)
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (7 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 08/17] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 10/17] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The function will be easier to re-use in a follow-up patch if it takes
only the begin and end of the range.

At the same time, rename the function to reflect the change in the
prototype.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v2:
        - Add Stefano's reviewed-by
---
 xen/arch/arm/domctl.c     | 2 +-
 xen/arch/arm/p2m.c        | 3 +--
 xen/include/asm-arm/p2m.h | 7 +++++--
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 4587c75826..c10f568aad 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -61,7 +61,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
         if ( e < s )
             return -EINVAL;
 
-        return p2m_cache_flush(d, _gfn(s), domctl->u.cacheflush.nr_pfns);
+        return p2m_cache_flush_range(d, _gfn(s), _gfn(e));
     }
     case XEN_DOMCTL_bind_pt_irq:
     {
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2706db3e67..836157292c 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1514,10 +1514,9 @@ int relinquish_p2m_mapping(struct domain *d)
     return rc;
 }
 
-int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr)
+int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    gfn_t end = gfn_add(start, nr);
     gfn_t next_gfn;
     p2m_type_t t;
     unsigned int order;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 13f7a27c38..5858f97e9c 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -228,8 +228,11 @@ int p2m_set_entry(struct p2m_domain *p2m,
 
 bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
 
-/* Clean & invalidate caches corresponding to a region of guest address space */
-int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);
+/*
+ * Clean & invalidate caches corresponding to a region [start,end) of guest
+ * address space.
+ */
+int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
 
 /*
  * Map a region in the guest p2m with a specific p2m type.
-- 
2.11.0



* [PATCH for-4.12 v2 10/17] xen/arm: p2m: Allow to flush cache on any RAM region
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (8 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 09/17] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

Currently, we only allow to flush cache on regions mapped as p2m_ram_{rw,ro}.

There is no real problem in cache flushing any RAM region, such as grants
and foreign mappings. Therefore, relax the check to allow flushing the
cache on any RAM region.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v2:
        - Fix typos
        - Add Stefano's reviewed-by
---
 xen/arch/arm/p2m.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 836157292c..4e0ddbf70b 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1539,7 +1539,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
         next_gfn = gfn_next_boundary(start, order);
 
         /* Skip hole and non-RAM page */
-        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(t) )
+        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
             continue;
 
         /* XXX: Implement preemption */
-- 
2.11.0



* [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (9 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 10/17] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-04 20:35   ` Razvan Cojocaru
  2018-12-04 20:26 ` [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail Julien Grall
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, Tamas K Lengyel, Razvan Cojocaru

With the recent changes, a P2M entry may be populated but not valid. In
some situations, it would be useful to know whether the entry has been
marked available to the guest in order to perform a specific action. So
extend p2m_get_entry to return the value of bit[0] (the valid bit).

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Don't use _valid
---
 xen/arch/arm/mem_access.c |  6 +++---
 xen/arch/arm/p2m.c        | 18 ++++++++++++++----
 xen/include/asm-arm/p2m.h |  3 ++-
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index f911937ccf..db49372a2c 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -71,7 +71,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
          * No setting was found in the Radix tree. Check if the
          * entry exists in the page-tables.
          */
-        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL);
+        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL, NULL);
 
         if ( mfn_eq(mfn, INVALID_MFN) )
             return -ESRCH;
@@ -200,7 +200,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
      * We had a mem_access permission limiting the access, but the page type
      * could also be limiting, so we need to check that as well.
      */
-    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);
+    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, NULL);
     if ( mfn_eq(mfn, INVALID_MFN) )
         goto err;
 
@@ -406,7 +406,7 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
           gfn = gfn_next_boundary(gfn, order) )
     {
         p2m_type_t t;
-        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order);
+        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order, NULL);
 
 
         if ( !mfn_eq(mfn, INVALID_MFN) )
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 4e0ddbf70b..c713226561 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -298,10 +298,14 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
  *
  * If the entry is not present, INVALID_MFN will be returned and the
  * page_order will be set according to the order of the invalid range.
+ *
+ * valid will contain the value of bit[0] (i.e. the valid bit) of the
+ * entry.
  */
 mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
                     p2m_type_t *t, p2m_access_t *a,
-                    unsigned int *page_order)
+                    unsigned int *page_order,
+                    bool *valid)
 {
     paddr_t addr = gfn_to_gaddr(gfn);
     unsigned int level = 0;
@@ -326,6 +330,9 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
 
     *t = p2m_invalid;
 
+    if ( valid )
+        *valid = false;
+
     /* XXX: Check if the mapping is lower than the mapped gfn */
 
     /* This gfn is higher than the highest the p2m map currently holds */
@@ -371,6 +378,9 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
          * to the GFN.
          */
         mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
+
+        if ( valid )
+            *valid = lpae_is_valid(entry);
     }
 
 out_unmap:
@@ -389,7 +399,7 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
     p2m_read_lock(p2m);
-    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL);
+    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL, NULL);
     p2m_read_unlock(p2m);
 
     return mfn;
@@ -1471,7 +1481,7 @@ int relinquish_p2m_mapping(struct domain *d)
     for ( ; gfn_x(start) < gfn_x(end);
           start = gfn_next_boundary(start, order) )
     {
-        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
+        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
 
         count++;
         /*
@@ -1534,7 +1544,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 
     for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
     {
-        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
+        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
 
         next_gfn = gfn_next_boundary(start, order);
 
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 5858f97e9c..7c1d930b1d 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -213,7 +213,8 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t);
  */
 mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
                     p2m_type_t *t, p2m_access_t *a,
-                    unsigned int *page_order);
+                    unsigned int *page_order,
+                    bool *valid);
 
 /*
  * Direct set a p2m entry: only for use by the P2M code.
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (10 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 23:08   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range Julien Grall
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The function leave_hypervisor_tail is called before each return to the
guest vCPU. It has two main purposes:
    1) Process physical CPU work (e.g rescheduling) if required
    2) Prepare the physical CPU to run the guest vCPU

2) will always be done once we have finished processing physical CPU
work. At the moment, it is done as part of the last iteration of 1),
adding some extra indentation to the code.

This can be streamlined by moving 2) out of the loop. At the same time,
1) is moved to a separate function, making the split between the two
steps more obvious.

All these changes will help a follow-up patch where we want to
introduce some vCPU work before returning to the guest vCPU.
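
As an illustration, the reworked function leaves a natural slot for that
per-vCPU work (sketch only, based on the code in this patch; the
check_for_vcpu_work() helper is only added by a later patch in this
series):

    void leave_hypervisor_tail(void)
    {
        local_irq_disable();

        /* Added by the follow-up patch: process per-vCPU work. */
        check_for_vcpu_work();

        /* 1) Process physical CPU work (softirqs, livepatch). */
        check_for_pcpu_work();

        /* 2) Prepare the physical CPU to run the guest vCPU. */
        vgic_sync_to_lrs();
        SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
        if ( needs_ssbd_flip(current) )
            arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
    }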

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Patch added
---
 xen/arch/arm/traps.c | 61 ++++++++++++++++++++++++++++------------------------
 1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index b00d0b8e1e..02665cc7b4 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2241,36 +2241,12 @@ void do_trap_fiq(struct cpu_user_regs *regs)
     gic_interrupt(regs, 1);
 }
 
-void leave_hypervisor_tail(void)
+static void check_for_pcpu_work(void)
 {
-    while (1)
-    {
-        local_irq_disable();
-        if ( !softirq_pending(smp_processor_id()) )
-        {
-            vgic_sync_to_lrs();
-
-            /*
-             * If the SErrors handle option is "DIVERSE", we have to prevent
-             * slipping the hypervisor SError to guest. In this option, before
-             * returning from trap, we have to synchronize SErrors to guarantee
-             * that the pending SError would be caught in hypervisor.
-             *
-             * If option is NOT "DIVERSE", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
-             * will be set to cpu_hwcaps. This means we can use the alternative
-             * to skip synchronizing SErrors for other SErrors handle options.
-             */
-            SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
-
-            /*
-             * The hypervisor runs with the workaround always present.
-             * If the guest wants it disabled, so be it...
-             */
-            if ( needs_ssbd_flip(current) )
-                arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
+    ASSERT(!local_irq_is_enabled());
 
-            return;
-        }
+    while ( softirq_pending(smp_processor_id()) )
+    {
         local_irq_enable();
         do_softirq();
         /*
@@ -2278,9 +2254,38 @@ void leave_hypervisor_tail(void)
          * and we want to patch the hypervisor with almost no stack.
          */
         check_for_livepatch_work();
+        local_irq_disable();
     }
 }
 
+void leave_hypervisor_tail(void)
+{
+    local_irq_disable();
+
+    check_for_pcpu_work();
+
+    vgic_sync_to_lrs();
+
+    /*
+     * If the SErrors handle option is "DIVERSE", we have to prevent
+     * slipping the hypervisor SError to guest. In this option, before
+     * returning from trap, we have to synchronize SErrors to guarantee
+     * that the pending SError would be caught in hypervisor.
+     *
+     * If option is NOT "DIVERSE", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
+     * will be set to cpu_hwcaps. This means we can use the alternative
+     * to skip synchronizing SErrors for other SErrors handle options.
+     */
+    SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
+
+    /*
+     * The hypervisor runs with the workaround always present.
+     * If the guest wants it disabled, so be it...
+     */
+    if ( needs_ssbd_flip(current) )
+        arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (11 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 23:53   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush Julien Grall
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

A follow-up patch will add support for preemption in p2m_cache_flush_range.
Because of the complexity of the 2 loops, it would be necessary to add
preemption to both of them.

This can be avoided by merging the 2 loops together and still keeping
the code fairly simple to read and extend.
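
The merged loop boils down to the following skeleton (sketch only; the
full version is in the diff below):

    while ( gfn_x(start) < gfn_x(end) )
    {
        /* Only look up the p2m once per block (a block can be up to 1GB). */
        if ( gfn_eq(start, next_block_gfn) )
        {
            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
            next_block_gfn = gfn_next_boundary(start, order);
            /* ... skip holes and non-RAM regions here ... */
        }

        flush_page_to_ram(mfn_x(mfn), false);

        start = gfn_add(start, 1);
        mfn = mfn_add(mfn, 1);
    }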

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Patch added
---
 xen/arch/arm/p2m.c | 52 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index c713226561..db22b53bfd 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1527,7 +1527,8 @@ int relinquish_p2m_mapping(struct domain *d)
 int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    gfn_t next_gfn;
+    gfn_t next_block_gfn;
+    mfn_t mfn = INVALID_MFN;
     p2m_type_t t;
     unsigned int order;
 
@@ -1542,24 +1543,45 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
     start = gfn_max(start, p2m->lowest_mapped_gfn);
     end = gfn_min(end, p2m->max_mapped_gfn);
 
-    for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
-    {
-        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
+    next_block_gfn = start;
 
-        next_gfn = gfn_next_boundary(start, order);
-
-        /* Skip hole and non-RAM page */
-        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
-            continue;
-
-        /* XXX: Implement preemption */
-        while ( gfn_x(start) < gfn_x(next_gfn) )
+    while ( gfn_x(start) < gfn_x(end) )
+    {
+        /*
+         * We want to flush page by page as:
+         *  - it may not be possible to map the full block (can be up to 1GB)
+         *    in Xen memory
+         *  - we may want to do fine grain preemption as flushing multiple
+         *    page in one go may take a long time
+         *
+         * As p2m_get_entry is able to return the size of the mapping
+         * in the p2m, it is pointless to execute it for each page.
+         *
+         * We can optimize it by tracking the gfn of the next
+         * block. So we will only call p2m_get_entry for each block (can
+         * be up to 1GB).
+         */
+        if ( gfn_eq(start, next_block_gfn) )
         {
-            flush_page_to_ram(mfn_x(mfn), false);
+            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
+            next_block_gfn = gfn_next_boundary(start, order);
 
-            start = gfn_add(start, 1);
-            mfn = mfn_add(mfn, 1);
+            /*
+             * The following regions can be skipped:
+             *      - Hole
+             *      - non-RAM
+             */
+            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
+            {
+                start = next_block_gfn;
+                continue;
+            }
         }
+
+        flush_page_to_ram(mfn_x(mfn), false);
+
+        start = gfn_add(start, 1);
+        mfn = mfn_add(mfn, 1);
     }
 
     invalidate_icache();
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (12 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 23:13   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range Julien Grall
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

This will make changes in a follow-up patch easier.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Patch added
---
 xen/arch/arm/domctl.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index c10f568aad..20691528a6 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -52,16 +52,16 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
     {
     case XEN_DOMCTL_cacheflush:
     {
-        unsigned long s = domctl->u.cacheflush.start_pfn;
-        unsigned long e = s + domctl->u.cacheflush.nr_pfns;
+        gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
+        gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
 
         if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
             return -EINVAL;
 
-        if ( e < s )
+        if ( gfn_x(e) < gfn_x(s) )
             return -EINVAL;
 
-        return p2m_cache_flush_range(d, _gfn(s), _gfn(e));
+        return p2m_cache_flush_range(d, s, e);
     }
     case XEN_DOMCTL_bind_pt_irq:
     {
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (13 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 23:32   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations Julien Grall
  2018-12-04 20:26 ` [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of " Julien Grall
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

p2m_cache_flush_range does not yet support preemption; this may be an
issue as cleaning the cache can take a long time. While the current
caller (XEN_DOMCTL_cacheflush) does not strictly require preemption, it
will be necessary for a new caller in a follow-up patch.

The preemption implemented is quite simple: a counter is incremented by:
    - 1 for each region skipped
    - 10 for each page requiring a flush

When the counter reaches 512 or above, we check whether preemption is
needed. If not, the counter is reset to 0. If yes, the function stops,
updates start (to allow resuming later on) and returns -ERESTART. This
allows the caller to decide how the preemption will be done.

For now, XEN_DOMCTL_cacheflush will continue to ignore the preemption.
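
A caller that does want to honour the preemption can simply process
pending work and retry. For instance, the vCPU-context caller added
later in this series does roughly the following (sketch; d is the
domain whose P2M is being cleaned):

    gfn_t start = _gfn(0);
    int rc;

    do
    {
        rc = p2m_cache_flush_range(d, &start, _gfn(ULONG_MAX));
        if ( rc == -ERESTART )
            do_softirq(); /* start was updated, so we resume from there */
    } while ( rc == -ERESTART );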

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Patch added
---
 xen/arch/arm/domctl.c     |  8 +++++++-
 xen/arch/arm/p2m.c        | 35 ++++++++++++++++++++++++++++++++---
 xen/include/asm-arm/p2m.h |  4 +++-
 3 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 20691528a6..9da88b8c64 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -54,6 +54,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
     {
         gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
         gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
+        int rc;
 
         if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
             return -EINVAL;
@@ -61,7 +62,12 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
         if ( gfn_x(e) < gfn_x(s) )
             return -EINVAL;
 
-        return p2m_cache_flush_range(d, s, e);
+        /* XXX: Handle preemption */
+        do
+            rc = p2m_cache_flush_range(d, &s, e);
+        while ( rc == -ERESTART );
+
+        return rc;
     }
     case XEN_DOMCTL_bind_pt_irq:
     {
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index db22b53bfd..ca9f0d9ebe 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1524,13 +1524,17 @@ int relinquish_p2m_mapping(struct domain *d)
     return rc;
 }
 
-int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
+int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     gfn_t next_block_gfn;
+    gfn_t start = *pstart;
     mfn_t mfn = INVALID_MFN;
     p2m_type_t t;
     unsigned int order;
+    int rc = 0;
+    /* Counter for preemption */
+    unsigned long count = 0;
 
     /*
      * The operation cache flush will invalidate the RAM assigned to the
@@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 
     while ( gfn_x(start) < gfn_x(end) )
     {
+        /*
+         * Cleaning the cache for the P2M may take a long time. So we
+         * need to be able to preempt. We will arbitrarily preempt every
+         * time count reach 512 or above.
+         *
+         * The count will be incremented by:
+         *  - 1 on region skipped
+         *  - 10 for each page requiring a flush
+         */
+        if ( count >= 512 )
+        {
+            if ( softirq_pending(smp_processor_id()) )
+            {
+                rc = -ERESTART;
+                break;
+            }
+            count = 0;
+        }
+
         /*
          * We want to flush page by page as:
          *  - it may not be possible to map the full block (can be up to 1GB)
@@ -1573,22 +1596,28 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
              */
             if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
             {
+                count++;
                 start = next_block_gfn;
                 continue;
             }
         }
 
+        count += 10;
+
         flush_page_to_ram(mfn_x(mfn), false);
 
         start = gfn_add(start, 1);
         mfn = mfn_add(mfn, 1);
     }
 
-    invalidate_icache();
+    if ( rc != -ERESTART )
+        invalidate_icache();
 
     p2m_read_unlock(p2m);
 
-    return 0;
+    *pstart = start;
+
+    return rc;
 }
 
 mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 7c1d930b1d..a633e27cc9 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -232,8 +232,10 @@ bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
 /*
  * Clean & invalidate caches corresponding to a region [start,end) of guest
  * address space.
+ *
+ * start will get updated if the function is preempted.
  */
-int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
+int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end);
 
 /*
  * Map a region in the guest p2m with a specific p2m type.
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (14 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-06 23:32   ` Stefano Stabellini
  2018-12-04 20:26 ` [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of " Julien Grall
  16 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

Set/Way operations are used to perform maintenance on a given cache.
At the moment, Set/Way operations are not trapped and therefore a guest
OS will directly act on the local cache. However, a vCPU may migrate to
another pCPU in the middle of the operation. This will result in a
cache with stale data (Set/Way operations are not propagated),
potentially causing a crash. This may be the cause of the heisenbug
noticed in Osstest [1].

Furthermore, Set/Way operations are not available on system caches.
This means that an OS, such as 32-bit Linux, relying on those
operations to fully clean the cache before disabling the MMU may break,
because data may still sit in system caches and not in RAM.

For more details about Set/Way, see the talk "The Art of Virtualizing
Cache Maintenance" given at Xen Summit 2018 [2].

In the context of Xen, we need to trap Set/Way operations and emulate
them. From the Arm ARM (B1.14.4 in DDI 0406C.c), Set/Way operations are
difficult to virtualize. So we can assume that a guest OS using them
will suffer the consequences (i.e. slowness) until the developer
removes all usage of Set/Way.

As the software is not allowed to infer the Set/Way to Physical Address
mapping, Xen will need to go through the guest P2M and clean &
invalidate all the entries mapped.

Because Set/Way operations happen in batches (a loop over all the
Sets/Ways of a cache), Xen would need to go through the P2M for every
instruction (see the sketch of such a loop below). This is quite
expensive and would severely impact the guest OS. The implementation
re-uses the KVM policy to limit the number of flushes:
    - If we trap a Set/Way operation, we enable VM trapping (i.e.
      HCR_EL2.TVM) to detect the caches being turned on/off, and do a
      full clean.
    - We clean the caches when they are turned on and off.
    - Once the caches are enabled, we stop trapping VM instructions.

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
[2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
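
For reference, the kind of loop a guest kernel typically uses to clean a
whole cache level by Set/Way looks roughly like the sketch below
(illustrative only, AArch64 syntax; the exact shifts depend on the cache
geometry read from CCSIDR_EL1). Every "dc cisw" in the inner loop now
traps to Xen, which is why the operations arrive in large batches:

    static void guest_flush_dcache_level(unsigned int level,
                                         unsigned int sets,
                                         unsigned int ways,
                                         unsigned int set_shift,
                                         unsigned int way_shift)
    {
        unsigned int set, way;

        for ( way = 0; way < ways; way++ )
            for ( set = 0; set < sets; set++ )
            {
                /* Way in the top bits, then Set, then Level in bits [3:1]. */
                unsigned long val = ((unsigned long)way << way_shift) |
                                    ((unsigned long)set << set_shift) |
                                    (level << 1);

                asm volatile("dc cisw, %0" : : "r" (val) : "memory");
            }
    }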

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Fix emulation for Set/Way cache flush arm64 sysreg
        - Add support for preemption
        - Check cache status on every VM traps in Arm64
        - Remove spurious change
---
 xen/arch/arm/arm64/vsysreg.c | 17 ++++++++
 xen/arch/arm/p2m.c           | 92 ++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c         | 25 +++++++++++-
 xen/arch/arm/vcpreg.c        | 22 +++++++++++
 xen/include/asm-arm/domain.h |  8 ++++
 xen/include/asm-arm/p2m.h    | 20 ++++++++++
 6 files changed, 183 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 16ac9c344a..8a85507d9d 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -34,9 +34,14 @@
 static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
                                uint64_t *r, bool read)              \
 {                                                                   \
+    struct vcpu *v = current;                                       \
+    bool cache_enabled = vcpu_has_cache_enabled(v);                 \
+                                                                    \
     GUEST_BUG_ON(read);                                             \
     WRITE_SYSREG64(*r, reg);                                        \
                                                                     \
+    p2m_toggle_cache(v, cache_enabled);                             \
+                                                                    \
     return true;                                                    \
 }
 
@@ -85,6 +90,18 @@ void do_sysreg(struct cpu_user_regs *regs,
         break;
 
     /*
+     * HCR_EL2.TSW
+     *
+     * ARMv8 (DDI 0487B.b): Table D1-42
+     */
+    case HSR_SYSREG_DCISW:
+    case HSR_SYSREG_DCCSW:
+    case HSR_SYSREG_DCCISW:
+        if ( !hsr.sysreg.read )
+            p2m_set_way_flush(current);
+        break;
+
+    /*
      * HCR_EL2.TVM
      *
      * ARMv8 (DDI 0487D.a): Table D1-38
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index ca9f0d9ebe..8ee6ff7bd7 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -3,6 +3,7 @@
 #include <xen/iocap.h>
 #include <xen/lib.h>
 #include <xen/sched.h>
+#include <xen/softirq.h>
 
 #include <asm/event.h>
 #include <asm/flushtlb.h>
@@ -1620,6 +1621,97 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
     return rc;
 }
 
+/*
+ * Clean & invalidate RAM associated to the guest vCPU.
+ *
+ * The function can only work with the current vCPU and should be called
+ * with IRQ enabled as the vCPU could get preempted.
+ */
+void p2m_flush_vm(struct vcpu *v)
+{
+    int rc;
+    gfn_t start = _gfn(0);
+
+    ASSERT(v == current);
+    ASSERT(local_irq_is_enabled());
+    ASSERT(v->arch.need_flush_to_ram);
+
+    do
+    {
+        rc = p2m_cache_flush_range(v->domain, &start, _gfn(ULONG_MAX));
+        if ( rc == -ERESTART )
+            do_softirq();
+    } while ( rc == -ERESTART );
+
+    if ( rc != 0 )
+        gprintk(XENLOG_WARNING,
+                "P2M has not been correctly cleaned (rc = %d)\n",
+                rc);
+
+    v->arch.need_flush_to_ram = false;
+}
+
+/*
+ * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
+ * easily virtualized).
+ *
+ * Main problems:
+ *  - S/W ops are local to a CPU (not broadcast)
+ *  - We have line migration behind our back (speculation)
+ *  - System caches don't support S/W at all (damn!)
+ *
+ * In the face of the above, the best we can do is to try and convert
+ * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
+ * to PA mapping, it can only use S/W to nuke the whole cache, which is
+ * rather a good thing for us.
+ *
+ * Also, it is only used when turning caches on/off ("The expected
+ * usage of the cache maintenance instructions that operate by set/way
+ * is associated with the powerdown and powerup of caches, if this is
+ * required by the implementation.").
+ *
+ * We use the following policy:
+ *  - If we trap a S/W operation, we enable VM trapping to detect
+ *  caches being turned on/off, and do a full clean.
+ *
+ *  - We flush the caches on both caches being turned on and off.
+ *
+ *  - Once the caches are enabled, we stop trapping VM ops.
+ */
+void p2m_set_way_flush(struct vcpu *v)
+{
+    /* This function can only work with the current vCPU. */
+    ASSERT(v == current);
+
+    if ( !(v->arch.hcr_el2 & HCR_TVM) )
+    {
+        v->arch.need_flush_to_ram = true;
+        vcpu_hcr_set_flags(v, HCR_TVM);
+    }
+}
+
+void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
+{
+    bool now_enabled = vcpu_has_cache_enabled(v);
+
+    /* This function can only work with the current vCPU. */
+    ASSERT(v == current);
+
+    /*
+     * If switching the MMU+caches on, need to invalidate the caches.
+     * If switching it off, need to clean the caches.
+     * Clean + invalidate does the trick always.
+     */
+    if ( was_enabled != now_enabled )
+    {
+        v->arch.need_flush_to_ram = true;
+    }
+
+    /* Caches are now on, stop trapping VM ops (until a S/W op) */
+    if ( now_enabled )
+        vcpu_hcr_clear_flags(v, HCR_TVM);
+}
+
 mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
 {
     return p2m_lookup(d, gfn, NULL);
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 02665cc7b4..221c762ada 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -97,7 +97,7 @@ register_t get_default_hcr_flags(void)
 {
     return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
              (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
-             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
+             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
 }
 
 static enum {
@@ -2258,10 +2258,33 @@ static void check_for_pcpu_work(void)
     }
 }
 
+/*
+ * Process pending work for the vCPU. Any call should be fast or
+ * implement preemption.
+ */
+static void check_for_vcpu_work(void)
+{
+    struct vcpu *v = current;
+
+    if ( likely(!v->arch.need_flush_to_ram) )
+        return;
+
+    /*
+     * Give a chance for the pCPU to process work before handling the vCPU
+     * pending work.
+     */
+    check_for_pcpu_work();
+
+    local_irq_enable();
+    p2m_flush_vm(v);
+    local_irq_disable();
+}
+
 void leave_hypervisor_tail(void)
 {
     local_irq_disable();
 
+    check_for_vcpu_work();
     check_for_pcpu_work();
 
     vgic_sync_to_lrs();
diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
index 550c25ec3f..cdc91cdf5b 100644
--- a/xen/arch/arm/vcpreg.c
+++ b/xen/arch/arm/vcpreg.c
@@ -51,9 +51,14 @@
 #define TVM_REG(sz, func, reg...)                                           \
 static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
 {                                                                           \
+    struct vcpu *v = current;                                               \
+    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
+                                                                            \
     GUEST_BUG_ON(read);                                                     \
     WRITE_SYSREG##sz(*r, reg);                                              \
                                                                             \
+    p2m_toggle_cache(v, cache_enabled);                                     \
+                                                                            \
     return true;                                                            \
 }
 
@@ -71,6 +76,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
 static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
                                 bool read, bool hi)                         \
 {                                                                           \
+    struct vcpu *v = current;                                               \
+    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
     register_t reg = READ_SYSREG(xreg);                                     \
                                                                             \
     GUEST_BUG_ON(read);                                                     \
@@ -86,6 +93,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
     }                                                                       \
     WRITE_SYSREG(reg, xreg);                                                \
                                                                             \
+    p2m_toggle_cache(v, cache_enabled);                                     \
+                                                                            \
     return true;                                                            \
 }                                                                           \
                                                                             \
@@ -186,6 +195,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
         break;
 
     /*
+     * HCR_EL2.TSW
+     *
+     * ARMv7 (DDI 0406C.b): B1.14.6
+     * ARMv8 (DDI 0487B.b): Table D1-42
+     */
+    case HSR_CPREG32(DCISW):
+    case HSR_CPREG32(DCCSW):
+    case HSR_CPREG32(DCCISW):
+        if ( !cp32.read )
+            p2m_set_way_flush(current);
+        break;
+
+    /*
      * HCR_EL2.TVM
      *
      * ARMv8 (DDI 0487D.a): Table D1-38
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 175de44927..f16b973e0d 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -202,6 +202,14 @@ struct arch_vcpu
     struct vtimer phys_timer;
     struct vtimer virt_timer;
     bool   vtimer_initialized;
+
+    /*
+     * The full P2M may require some cleaning (e.g when emulation
+     * set/way). As the action can take a long time, it requires
+     * preemption. So this is deferred until we return to the guest.
+     */
+    bool need_flush_to_ram;
+
 }  __cacheline_aligned;
 
 void vcpu_show_execution_state(struct vcpu *);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a633e27cc9..79abcb5a63 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -6,6 +6,8 @@
 #include <xen/rwlock.h>
 #include <xen/mem_access.h>
 
+#include <asm/current.h>
+
 #define paddr_bits PADDR_BITS
 
 /* Holds the bit size of IPAs in p2m tables.  */
@@ -237,6 +239,12 @@ bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
  */
 int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end);
 
+void p2m_set_way_flush(struct vcpu *v);
+
+void p2m_toggle_cache(struct vcpu *v, bool was_enabled);
+
+void p2m_flush_vm(struct vcpu *v);
+
 /*
  * Map a region in the guest p2m with a specific p2m type.
  * The memory attributes will be derived from the p2m type.
@@ -364,6 +372,18 @@ static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
     return -EOPNOTSUPP;
 }
 
+/*
+ * A vCPU has cache enabled only when the MMU is enabled and data cache
+ * is enabled.
+ */
+static inline bool vcpu_has_cache_enabled(struct vcpu *v)
+{
+    /* Only works with the current vCPU */
+    ASSERT(current == v);
+
+    return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == (SCTLR_C|SCTLR_M);
+}
+
 #endif /* _XEN_P2M_H */
 
 /*
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
                   ` (15 preceding siblings ...)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations Julien Grall
@ 2018-12-04 20:26 ` Julien Grall
  2018-12-05  8:37   ` Jan Beulich
                     ` (2 more replies)
  16 siblings, 3 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-04 20:26 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Roger Pau Monné

At the moment, the implementation of Set/Way operations will go through
all the entries of the guest P2M and flush them. However, this is very
expensive and may render a guest OS using them unusable.

For instance, 32-bit Linux will use Set/Way operations during secondary
CPU bring-up. As the implementation is really expensive, the CPU
bring-up timeout may be hit.

To limit the Set/Way impact, we track which pages of the guest have
been accessed between batches of Set/Way operations. This is done using
bit[0] (aka valid bit) of the P2M entry.

This patch introduces a new per-arch helper to perform actions just
before the guest is first unpaused. This will be used to invalidate the
P2M so that accesses are tracked from the start of the guest.
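
To illustrate how the tracking is expected to work between two batches
(sketch; the wrapper below is made up for illustration, while
p2m_invalidate_root() and p2m_resolve_translation_fault() are the
helpers used by this series):

    /*
     * A guest access to an invalidated entry takes a stage-2 translation
     * fault. Re-validating the entry and letting the guest retry the
     * access means bit[0] ends up set only on pages really touched since
     * the last flush.
     */
    static bool handle_stage2_translation_fault(struct vcpu *v, paddr_t gpa)
    {
        return p2m_resolve_translation_fault(v->domain, gaddr_to_gfn(gpa));
    }

The next batch of Set/Way operations then skips any block whose valid
bit is still clear (see the change to p2m_cache_flush_range below).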

Signed-off-by: Julien Grall <julien.grall@arm.com>

---

While we could spread d->creation_finished all over the code, the
per-arch helper to perform actions just before the guest is first
unpaused can bring a lot of benefit to both architectures. For
instance, on Arm, the flush of the instruction cache could be delayed
until the domain is first run. This would greatly improve the
performance of creating guests.

I am still benchmarking whether having a command line option is worth
it. I will provide numbers as soon as I have them.

Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/arm/domain.c     | 14 ++++++++++++++
 xen/arch/arm/p2m.c        | 30 ++++++++++++++++++++++++++++--
 xen/arch/x86/domain.c     |  4 ++++
 xen/common/domain.c       |  5 ++++-
 xen/include/asm-arm/p2m.h |  2 ++
 xen/include/xen/domain.h  |  2 ++
 6 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 1d926dcb29..41f101746e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -767,6 +767,20 @@ int arch_domain_soft_reset(struct domain *d)
     return -ENOSYS;
 }
 
+void arch_domain_creation_finished(struct domain *d)
+{
+    /*
+     * To avoid flushing the whole guest RAM on the first Set/Way, we
+     * invalidate the P2M to track what has been accessed.
+     *
+     * This is only turned when IOMMU is not used or the page-table are
+     * not shared because bit[0] (e.g valid bit) unset will result
+     * IOMMU fault that could be not fixed-up.
+     */
+    if ( !iommu_use_hap_pt(d) )
+        p2m_invalidate_root(p2m_get_hostp2m(d));
+}
+
 static int is_guest_pv32_psr(uint32_t psr)
 {
     switch (psr & PSR_MODE_MASK)
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8ee6ff7bd7..44ea3580cf 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
 }
 
 /*
+ * Invalidate all entries in the root page-tables. This is
+ * useful to get fault on entry and do an action.
+ */
+void p2m_invalidate_root(struct p2m_domain *p2m)
+{
+    unsigned int i;
+
+    p2m_write_lock(p2m);
+
+    for ( i = 0; i < P2M_ROOT_LEVEL; i++ )
+        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
+
+    p2m_write_unlock(p2m);
+}
+
+/*
  * Resolve any translation fault due to change in the p2m. This
  * includes break-before-make and valid bit cleared.
  */
@@ -1587,15 +1603,18 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
          */
         if ( gfn_eq(start, next_block_gfn) )
         {
-            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
+            bool valid;
+
+            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
             next_block_gfn = gfn_next_boundary(start, order);
 
             /*
              * The following regions can be skipped:
              *      - Hole
              *      - non-RAM
+             *      - block with valid bit (bit[0]) unset
              */
-            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
+            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) || !valid )
             {
                 count++;
                 start = next_block_gfn;
@@ -1629,6 +1648,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
  */
 void p2m_flush_vm(struct vcpu *v)
 {
+    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
     int rc;
     gfn_t start = _gfn(0);
 
@@ -1648,6 +1668,12 @@ void p2m_flush_vm(struct vcpu *v)
                 "P2M has not been correctly cleaned (rc = %d)\n",
                 rc);
 
+    /*
+     * Invalidate the p2m to track which page was modified by the guest
+     * between call of p2m_flush_vm().
+     */
+    p2m_invalidate_root(p2m);
+
     v->arch.need_flush_to_ram = false;
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index b4d59487ad..d28e3f9b15 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -762,6 +762,10 @@ int arch_domain_soft_reset(struct domain *d)
     return ret;
 }
 
+void arch_domain_creation_finished(struct domain *d)
+{
+}
+
 /*
  * These are the masks of CR4 bits (subject to hardware availability) which a
  * PV guest may not legitimiately attempt to modify.
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 78cc5249e8..c623daec56 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1116,8 +1116,11 @@ int domain_unpause_by_systemcontroller(struct domain *d)
      * Creation is considered finished when the controller reference count
      * first drops to 0.
      */
-    if ( new == 0 )
+    if ( new == 0 && !d->creation_finished )
+    {
         d->creation_finished = true;
+        arch_domain_creation_finished(d);
+    }
 
     domain_unpause(d);
 
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 79abcb5a63..01cd3ee4b5 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -231,6 +231,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
 
 bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
 
+void p2m_invalidate_root(struct p2m_domain *p2m);
+
 /*
  * Clean & invalidate caches corresponding to a region [start,end) of guest
  * address space.
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 33e41486cb..d1bfc82f57 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -70,6 +70,8 @@ void arch_domain_unpause(struct domain *d);
 
 int arch_domain_soft_reset(struct domain *d);
 
+void arch_domain_creation_finished(struct domain *d);
+
 void arch_p2m_set_access_required(struct domain *d, bool access_required);
 
 int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-12-04 20:26 ` [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
@ 2018-12-04 20:35   ` Razvan Cojocaru
  2018-12-06 22:32     ` Stefano Stabellini
  2018-12-07 10:17     ` Julien Grall
  0 siblings, 2 replies; 53+ messages in thread
From: Razvan Cojocaru @ 2018-12-04 20:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: sstabellini, Tamas K Lengyel

On 12/4/18 10:26 PM, Julien Grall wrote:
> With the recent changes, a P2M entry may be populated but may as not
> valid. In some situation, it would be useful to know whether the entry

I think you mean to say "may not be valid"?

> has been marked available to guest in order to perform a specific
> action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Other than that,

Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 03/17] xen/arm: p2m: Clean-up headers included and order them alphabetically
  2018-12-04 20:26 ` [PATCH for-4.12 v2 03/17] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
@ 2018-12-04 23:47   ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-04 23:47 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> A lot of the headers are not necessary, so remove them. At the same
> time, re-order them alphabetically.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>  xen/arch/arm/p2m.c | 18 +++++-------------
>  1 file changed, 5 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 6c76298ebc..81f3107dd2 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1,19 +1,11 @@
> -#include <xen/sched.h>
> -#include <xen/lib.h>
> -#include <xen/errno.h>
> +#include <xen/cpu.h>
>  #include <xen/domain_page.h>
> -#include <xen/bitops.h>
> -#include <xen/vm_event.h>
> -#include <xen/monitor.h>
>  #include <xen/iocap.h>
> -#include <xen/mem_access.h>
> -#include <xen/xmalloc.h>
> -#include <xen/cpu.h>
> -#include <xen/notifier.h>
> -#include <public/vm_event.h>
> -#include <asm/flushtlb.h>
> +#include <xen/lib.h>
> +#include <xen/sched.h>
> +
>  #include <asm/event.h>
> -#include <asm/hardirq.h>
> +#include <asm/flushtlb.h>
>  #include <asm/page.h>
>  
>  #define MAX_VMID_8_BIT  (1UL << 8)
> -- 
> 2.11.0
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-12-04 20:26 ` [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
@ 2018-12-04 23:50   ` Stefano Stabellini
  2018-12-05  9:46     ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-04 23:50 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> The LPAE format allows to store information in an entry even with the
> valid bit unset. In a follow-up patch, we will take advantage of this
> feature to re-purpose the valid bit for generating a translation fault
> even if an entry contains valid information.
> 
> So we need a different way to know whether an entry contains valid
> information. It is possible to use the information hold in the p2m_type
> to know for that purpose. Indeed all entries containing valid
> information will have a valid p2m type (i.e p2m_type != p2m_invalid).
> 
> This patch introduces a new helper p2m_is_valid, which implements that
> idea, and replace most of lpae_is_valid call with the new helper. The ones
> remaining are for TLBs handling and entries accounting.
> 
> With the renaming there are 2 others changes required:
>     - Generate table entry with a valid p2m type
>     - Detect new mapping for proper stats accounting
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

(This patch doesn't apply to master, please rebase)


> ---
>     Changes in v2:
>         - Don't open-code p2m_is_superpage
> ---
>  xen/arch/arm/p2m.c | 32 ++++++++++++++++++++++----------
>  1 file changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 81f3107dd2..47b54c792e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -212,17 +212,26 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
>  }
>  
>  /*
> + * In the case of the P2M, the valid bit is used for other purpose. Use
> + * the type to check whether an entry is valid.
> + */
> +static inline bool p2m_is_valid(lpae_t pte)
> +{
> +    return pte.p2m.type != p2m_invalid;
> +}
> +
> +/*
>   * lpae_is_* helpers don't check whether the valid bit is set in the
>   * PTE. Provide our own overlay to check the valid bit.
>   */
>  static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
>  {
> -    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
> +    return p2m_is_valid(pte) && lpae_is_mapping(pte, level);
>  }
>  
>  static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
>  {
> -    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
> +    return p2m_is_valid(pte) && lpae_is_superpage(pte, level);
>  }
>  
>  #define GUEST_TABLE_MAP_FAILED 0
> @@ -256,7 +265,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>  
>      entry = *table + offset;
>  
> -    if ( !lpae_is_valid(*entry) )
> +    if ( !p2m_is_valid(*entry) )
>      {
>          if ( read_only )
>              return GUEST_TABLE_MAP_FAILED;
> @@ -348,7 +357,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>  
>      entry = table[offsets[level]];
>  
> -    if ( lpae_is_valid(entry) )
> +    if ( p2m_is_valid(entry) )
>      {
>          *t = entry.p2m.type;
>  
> @@ -536,8 +545,11 @@ static lpae_t page_to_p2m_table(struct page_info *page)
>      /*
>       * The access value does not matter because the hardware will ignore
>       * the permission fields for table entry.
> +     *
> +     * We use p2m_ram_rw so the entry has a valid type. This is important
> +     * for p2m_is_valid() to return valid on table entries.
>       */
> -    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
> +    return mfn_to_p2m_entry(page_to_mfn(page), p2m_ram_rw, p2m_access_rwx);
>  }
>  
>  static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
> @@ -561,7 +573,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
>      struct page_info *page;
>      lpae_t *p;
>  
> -    ASSERT(!lpae_is_valid(*entry));
> +    ASSERT(!p2m_is_valid(*entry));
>  
>      page = alloc_domheap_page(NULL, 0);
>      if ( page == NULL )
> @@ -618,7 +630,7 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
>   */
>  static void p2m_put_l3_page(const lpae_t pte)
>  {
> -    ASSERT(lpae_is_valid(pte));
> +    ASSERT(p2m_is_valid(pte));
>  
>      /*
>       * TODO: Handle other p2m types
> @@ -646,7 +658,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>      struct page_info *pg;
>  
>      /* Nothing to do if the entry is invalid. */
> -    if ( !lpae_is_valid(entry) )
> +    if ( !p2m_is_valid(entry) )
>          return;
>  
>      /* Nothing to do but updating the stats if the entry is a super-page. */
> @@ -943,7 +955,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>              else
>                  p2m->need_flush = true;
>          }
> -        else /* new mapping */
> +        else if ( !p2m_is_valid(orig_pte) ) /* new mapping */
>              p2m->stats.mappings[level]++;
>  
>          p2m_write_pte(entry, pte, p2m->clean_pte);
> @@ -957,7 +969,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>       * Free the entry only if the original pte was valid and the base
>       * is different (to avoid freeing when permission is changed).
>       */
> -    if ( lpae_is_valid(orig_pte) &&
> +    if ( p2m_is_valid(orig_pte) &&
>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>          p2m_free_entry(p2m, orig_pte, level);
>  
> -- 
> 2.11.0
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-12-04 20:26 ` [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
@ 2018-12-04 23:59   ` Stefano Stabellini
  2018-12-05 10:03     ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-04 23:59 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> A follow-up patch will re-purpose the valid bit of LPAE entries to
> generate fault even on entry containing valid information.
> 
> This means that when translating a guest VA to guest PA (e.g IPA) will
> fail if the Stage-2 entries used have the valid bit unset. Because of
> that, we need to fallback to walk the page-table in software to check
> whether the fault was expected.
> 
> This patch adds the software page-table walk on all the translation
> fault. It would be possible in the future to avoid pointless walk when
> the fault in PAR_EL1 is not a translation fault.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
> 
> There are a couple of TODO in the code. They are clean-up and performance
> improvement (e.g when the fault cannot be handled) that could be delayed after
> the series has been merged.
> 
>     Changes in v2:
>         - Check stage-2 permission during software lookup
>         - Fix typoes
> ---
>  xen/arch/arm/p2m.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 59 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 47b54c792e..39680eeb6e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -6,6 +6,7 @@
>  
>  #include <asm/event.h>
>  #include <asm/flushtlb.h>
> +#include <asm/guest_walk.h>
>  #include <asm/page.h>
>  
>  #define MAX_VMID_8_BIT  (1UL << 8)
> @@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>      struct page_info *page = NULL;
>      paddr_t maddr = 0;
>      uint64_t par;
> +    mfn_t mfn;
> +    p2m_type_t t;
>  
>      /*
>       * XXX: To support a different vCPU, we would need to load the
> @@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>      par = gvirt_to_maddr(va, &maddr, flags);
>      p2m_read_unlock(p2m);
>  
> +    /*
> +     * gvirt_to_maddr may fail if the entry does not have the valid bit
> +     * set. Fallback to the second method:
> +     *  1) Translate the VA to IPA using software lookup -> Stage-1 page-table
> +     *  may not be accessible because the stage-2 entries may have valid
> +     *  bit unset.
> +     *  2) Software lookup of the MFN
> +     *
> +     * Note that when memaccess is enabled, we instead call directly
> +     * p2m_mem_access_check_and_get_page(...). Because the function is a
> +     * a variant of the methods described above, it will be able to
> +     * handle entries with valid bit unset.
> +     *
> +     * TODO: Integrate more nicely memaccess with the rest of the
> +     * function.
> +     * TODO: Use the fault error in PAR_EL1 to avoid pointless
> +     *  translation.
> +     */
>      if ( par )
>      {
> +        paddr_t ipa;
> +        unsigned int s1_perms;
> +
>          /*
>           * When memaccess is enabled, the translation GVA to MADDR may
>           * have failed because of a permission fault.
> @@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>          if ( p2m->mem_access_enabled )
>              return p2m_mem_access_check_and_get_page(va, flags, v);
>  
> -        dprintk(XENLOG_G_DEBUG,
> -                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx par=%#"PRIx64"\n",
> -                v, va, flags, par);
> -        return NULL;
> +        /*
> +         * The software stage-1 table walk can still fail, e.g, if the
> +         * GVA is not mapped.
> +         */
> +        if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
> +        {
> +            dprintk(XENLOG_G_DEBUG,
> +                    "%pv: Failed to walk page-table va %#"PRIvaddr"\n", v, va);
> +            return NULL;
> +        }
> +
> +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
> +        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
> +            return NULL;
> +
> +        /*
> +         * Check permission that are assumed by the caller. For instance
> +         * in case of guestcopy, the caller assumes that the translated
> +         * page can be accessed with the requested permissions. If this
> +         * is not the case, we should fail.
> +         *
> +         * Please note that we do not check for the GV2M_EXEC
> +         * permission. This is fine because the hardware-based translation
> +         * instruction does not test for execute permissions.
> +         */
> +        if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
> +            return NULL;
> +
> +        if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
> +            return NULL;

The patch looks good enough now. One question: is it a requirement that
the page we are trying to translate is of type p2m_ram_*? Could
get_page_from_gva be genuinely called passing a page of a different
kind, such as p2m_mmio_direct_* or p2m_map_foreign? Today, it is not the
case, but I wonder if it is something we want to consider?


>      }
> +    else
> +        mfn = maddr_to_mfn(maddr);
>  
> -    if ( !mfn_valid(maddr_to_mfn(maddr)) )
> +    if ( !mfn_valid(mfn) )
>      {
>          dprintk(XENLOG_G_DEBUG, "%pv: Invalid MFN %#"PRI_mfn"\n",
> -                v, mfn_x(maddr_to_mfn(maddr)));
> +                v, mfn_x(mfn));
>          return NULL;
>      }
>  
> -    page = mfn_to_page(maddr_to_mfn(maddr));
> +    page = mfn_to_page(mfn);
>      ASSERT(page);
>  
>      if ( unlikely(!get_page(page, d)) )
> -- 
> 2.11.0
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-04 20:26 ` [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of " Julien Grall
@ 2018-12-05  8:37   ` Jan Beulich
  2018-12-07 13:24     ` Julien Grall
  2018-12-06 12:21   ` Julien Grall
  2018-12-07 21:43   ` Stefano Stabellini
  2 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2018-12-05  8:37 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	Roger Pau Monne

>>> On 04.12.18 at 21:26, <julien.grall@arm.com> wrote:
> At the moment, the implementation of Set/Way operations will go through
> all the entries of the guest P2M and flush them. However, this is very
> expensive and may render unusable a guest OS using them.
> 
> For instance, Linux 32-bit will use Set/Way operations during secondary
> CPU bring-up. As the implementation is really expensive, it may be possible
> to hit the CPU bring-up timeout.
> 
> To limit the Set/Way impact, we track what pages has been of the guest
> has been accessed between batch of Set/Way operations. This is done
> using bit[0] (aka valid bit) of the P2M entry.
> 
> This patch adds a new per-arch helper is introduced to perform actions just
> before the guest is first unpaused. This will be used to invalidate the
> P2M to track access from the start of the guest.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
> 
> While we can spread d->creation_finished all over the code, the per-arch
> helper to perform actions just before the guest is first unpaused can
> bring a lot of benefit to both architectures. For instance, on Arm, the
> flush of the instruction cache could be delayed until the domain is
> first run. This would greatly improve the performance of creating a guest.

Just the other day we found a potential use on x86 as well
(even if I don't recall anymore what it was), so the
addition is certainly helpful. It might have been nice to split
the introduction of the interface from what you actually want it to
do on Arm, but irrespective of that
Reviewed-by: Jan Beulich <jbeulich@suse.com>
for the non-Arm pieces here.

Jan
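
As a rough illustration of the interface being discussed, the shape could be
along these lines (a sketch only; the hook name, the helper it calls and the
call site are assumptions based on the description above, not taken from the
patch itself):

    /* Per-arch hook, run once just before the guest is first unpaused. */
    void arch_domain_creation_finished(struct domain *d)
    {
        /*
         * On Arm: invalidate the P2M so that every page faults once and
         * accesses can be tracked from the start of the guest.
         * On x86: an empty stub for now.
         */
        p2m_invalidate_root(p2m_get_hostp2m(d));   /* illustrative helper name */
    }

    /* Caller, in common code, keyed off d->creation_finished so the hook
     * runs exactly once per domain. */
    static void domain_run_creation_hook(struct domain *d)
    {
        if ( !d->creation_finished )
        {
            d->creation_finished = true;
            arch_domain_creation_finished(d);
        }
    }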




* Re: [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-12-04 23:50   ` Stefano Stabellini
@ 2018-12-05  9:46     ` Julien Grall
  2018-12-06 22:02       ` Stefano Stabellini
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-05  9:46 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi Stefano,

On 04/12/2018 23:50, Stefano Stabellini wrote:
> On Tue, 4 Dec 2018, Julien Grall wrote:
>> The LPAE format allows storing information in an entry even with the
>> valid bit unset. In a follow-up patch, we will take advantage of this
>> feature to re-purpose the valid bit for generating a translation fault
>> even if an entry contains valid information.
>>
>> So we need a different way to know whether an entry contains valid
>> information. It is possible to use the information held in the p2m_type
>> for that purpose. Indeed, all entries containing valid
>> information will have a valid p2m type (i.e. p2m_type != p2m_invalid).
>>
>> This patch introduces a new helper, p2m_is_valid, which implements that
>> idea, and replaces most of the lpae_is_valid calls with the new helper. The ones
>> remaining are for TLB handling and entry accounting.
>>
>> With the renaming there are 2 other changes required:
>>      - Generate table entries with a valid p2m type
>>      - Detect new mappings for proper stats accounting
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> (This patch doesn't apply to master, please rebase)

Why are you trying to apply it to master? This series (like most of my series) is
based on staging as of the time it was sent. I tried to apply this patch today on
staging and didn't find any issue.

Cheers,

-- 
Julien Grall
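
For reference, the helper discussed in the quoted commit message is essentially
a one-liner; a sketch of the shape it describes (valid information if and only
if the stored p2m type is not p2m_invalid -- the field naming below follows the
existing lpae_t p2m view and is only illustrative):

    static inline bool p2m_is_valid(lpae_t pte)
    {
        /* An entry carries valid information iff it has a valid p2m type,
         * regardless of the state of the hardware valid bit. */
        return pte.p2m.type != p2m_invalid;
    }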


* Re: [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-12-04 23:59   ` Stefano Stabellini
@ 2018-12-05 10:03     ` Julien Grall
  2018-12-06 22:04       ` Stefano Stabellini
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-05 10:03 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel



On 04/12/2018 23:59, Stefano Stabellini wrote:
> On Tue, 4 Dec 2018, Julien Grall wrote:
>> A follow-up patch will re-purpose the valid bit of LPAE entries to
>> generate a fault even on entries containing valid information.
>>
>> This means that translating a guest VA to a guest PA (i.e. IPA) will
>> fail if the Stage-2 entries used have the valid bit unset. Because of
>> that, we need to fall back to walking the page-table in software to check
>> whether the fault was expected.
>>
>> This patch adds the software page-table walk on all translation
>> faults. It would be possible in the future to avoid a pointless walk when
>> the fault in PAR_EL1 is not a translation fault.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>
>> There are a couple of TODO in the code. They are clean-up and performance
>> improvement (e.g when the fault cannot be handled) that could be delayed after
>> the series has been merged.
>>
>>      Changes in v2:
>>          - Check stage-2 permission during software lookup
>>          - Fix typos
>> ---
>>   xen/arch/arm/p2m.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++------
>>   1 file changed, 59 insertions(+), 7 deletions(-)
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index 47b54c792e..39680eeb6e 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -6,6 +6,7 @@
>>   
>>   #include <asm/event.h>
>>   #include <asm/flushtlb.h>
>> +#include <asm/guest_walk.h>
>>   #include <asm/page.h>
>>   
>>   #define MAX_VMID_8_BIT  (1UL << 8)
>> @@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>>       struct page_info *page = NULL;
>>       paddr_t maddr = 0;
>>       uint64_t par;
>> +    mfn_t mfn;
>> +    p2m_type_t t;
>>   
>>       /*
>>        * XXX: To support a different vCPU, we would need to load the
>> @@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>>       par = gvirt_to_maddr(va, &maddr, flags);
>>       p2m_read_unlock(p2m);
>>   
>> +    /*
>> +     * gvirt_to_maddr may fail if the entry does not have the valid bit
>> +     * set. Fallback to the second method:
>> +     *  1) Translate the VA to IPA using software lookup -> Stage-1 page-table
>> +     *  may not be accessible because the stage-2 entries may have valid
>> +     *  bit unset.
>> +     *  2) Software lookup of the MFN
>> +     *
>> +     * Note that when memaccess is enabled, we instead call directly
>> +     * p2m_mem_access_check_and_get_page(...). Because the function is
>> +     * a variant of the methods described above, it will be able to
>> +     * handle entries with valid bit unset.
>> +     *
>> +     * TODO: Integrate more nicely memaccess with the rest of the
>> +     * function.
>> +     * TODO: Use the fault error in PAR_EL1 to avoid pointless
>> +     *  translation.
>> +     */
>>       if ( par )
>>       {
>> +        paddr_t ipa;
>> +        unsigned int s1_perms;
>> +
>>           /*
>>            * When memaccess is enabled, the translation GVA to MADDR may
>>            * have failed because of a permission fault.
>> @@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>>           if ( p2m->mem_access_enabled )
>>               return p2m_mem_access_check_and_get_page(va, flags, v);
>>   
>> -        dprintk(XENLOG_G_DEBUG,
>> -                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx par=%#"PRIx64"\n",
>> -                v, va, flags, par);
>> -        return NULL;
>> +        /*
>> +         * The software stage-1 table walk can still fail, e.g, if the
>> +         * GVA is not mapped.
>> +         */
>> +        if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
>> +        {
>> +            dprintk(XENLOG_G_DEBUG,
>> +                    "%pv: Failed to walk page-table va %#"PRIvaddr"\n", v, va);
>> +            return NULL;
>> +        }
>> +
>> +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
>> +        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
>> +            return NULL;
>> +
>> +        /*
>> +         * Check permissions that are assumed by the caller. For instance
>> +         * in case of guestcopy, the caller assumes that the translated
>> +         * page can be accessed with the requested permissions. If this
>> +         * is not the case, we should fail.
>> +         *
>> +         * Please note that we do not check for the GV2M_EXEC
>> +         * permission. This is fine because the hardware-based translation
>> +         * instruction does not test for execute permissions.
>> +         */
>> +        if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
>> +            return NULL;
>> +
>> +        if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
>> +            return NULL;
> 
> The patch looks good enough now. One question: is it a requirement that
> the page we are trying to translate is of type p2m_ram_*? Could
> get_page_from_gva be genuinely called passing a page of a different
> kind, such as p2m_mmio_direct_* or p2m_map_foreign? Today, it is not the
> case, but I wonder if it is something we want to consider?

This function can only possibly work with p2m_ram_* because of the get_page(...)
below; indeed, the page should belong to the domain.

Effectively, this function will only be used for hypercalls, as it takes a virtual
address. I question the value of allowing a guest to do a hypercall with the
data backed by any memory other than guest RAM. For foreign mappings, this
could potentially end up with a leak.
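
Concretely, the check that enforces this is the get_page() call visible at the
end of the quoted hunk; roughly (a sketch, not the exact code):

    page = mfn_to_page(mfn);

    /*
     * get_page() takes a reference and fails unless the page is owned by d.
     * Foreign mappings (owned by another domain) and direct MMIO (no RAM page
     * owned by the guest) therefore can never be returned by this function.
     */
    if ( unlikely(!get_page(page, d)) )
        return NULL;

    return page;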

Cheers,

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-04 20:26 ` [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of " Julien Grall
  2018-12-05  8:37   ` Jan Beulich
@ 2018-12-06 12:21   ` Julien Grall
  2018-12-07 21:52     ` Stefano Stabellini
  2018-12-07 21:43   ` Stefano Stabellini
  2 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-06 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich,
	Roger Pau Monné

Hi,

On 12/4/18 8:26 PM, Julien Grall wrote:
> At the moment, the implementation of Set/Way operations will go through
> all the entries of the guest P2M and flush them. However, this is very
> expensive and may render a guest OS using them unusable.
> 
> For instance, Linux 32-bit will use Set/Way operations during secondary
> CPU bring-up. As the implementation is really expensive, it may be possible
> to hit the CPU bring-up timeout.
> 
> To limit the Set/Way impact, we track which pages of the guest have been
> accessed between batches of Set/Way operations. This is done using
> bit[0] (aka valid bit) of the P2M entry.
> 
> This patch introduces a new per-arch helper to perform actions just
> before the guest is first unpaused. This will be used to invalidate the
> P2M so that accesses are tracked from the start of the guest.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
> 
> While we can spread d->creation_finished all over the code, the per-arch
> helper to perform actions just before the guest is first unpaused can
> bring a lot of benefit to both architectures. For instance, on Arm, the
> flush of the instruction cache could be delayed until the domain is
> first run. This would greatly improve the performance of creating a guest.
> 
> I am still running the benchmark to see whether having a command line option
> is worth it. I will provide numbers as soon as I have them.

I remembered Stefano suggested looking at the impact on boot time. This
is a bit tricky to do as there are many existing kernel configurations
and not all the mappings may have been touched during boot.

Instead I wrote a tiny guest [1] that zeroes roughly 1GB of memory.
Because the toolstack will always try to allocate with the biggest
mapping, I had to hack the toolstack a bit to be able to test with
different mapping sizes (but not a mix). The guest has only one vCPU with
a dedicated pCPU.
    - 1GB: 0.03% slower when starting with valid bit unset
    - 2MB: 0.04% faster when starting with valid bit unset
    - 4KB: ~3% slower when starting with valid bit unset

The performance impact using 1GB and 2MB mappings is pretty much insignificant
because the number of traps is very limited (resp. 1 and 513). With 4KB
mappings, there is a much more significant drop because there are more traps
(~262700) as the P2M contains more entries.
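
As a sanity check, the trap counts are close to what simple arithmetic on the
number of P2M entries covering the 1GB region predicts (the measured values are
slightly higher, presumably because a few pages outside the zeroed region, e.g.
the guest image itself, also fault):

    1GB with 1GB mappings: 2^30 >> 30 =      1 entry   ->       1 trap
    1GB with 2MB mappings: 2^30 >> 21 =    512 entries ->    ~512 traps (513 measured)
    1GB with 4KB mappings: 2^30 >> 12 = 262144 entries -> ~262144 traps (~262700 measured)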

However, having many 4KB mappings in the P2M is pretty unlikely as the
toolstack will always try to get a bigger mapping. In the real world, you
should only have 4KB mappings when your guest memory is not aligned with
a bigger mapping. If you end up with many 4KB mappings, then you are
already going to have a performance impact in the long run because of the
TLB pressure.

Overall, I would not recommend introducing a command line option until
we figure out a use case where the trap will be a slowdown.

Cheers,

[1]

.text
     b       _start                  /* branch to kernel start, magic */
     .long   0                       /* reserved */
     .quad   0x0                     /* Image load offset from start of RAM */
     .quad   0x0                     /* XXX: Effective Image size */
     .quad   2                       /* kernel flags: LE, 4K page size */
     .quad   0                       /* reserved */
     .quad   0                       /* reserved */
     .quad   0                       /* reserved */
     .byte   0x41                    /* Magic number, "ARM\x64" */
     .byte   0x52
     .byte   0x4d
     .byte   0x64
     .long   0                       /* reserved */

_start:
     isb
     mrs     x0, CNTPCT_EL0
     isb

     adrp    x2, _end
     ldr     x3, =(0x40000000 + (1 << 30))
1:  str     xzr, [x2], #8
     cmp     x2, x3
     b.lo    1b

     isb
     mrs     x1, CNTPCT_EL0
     isb
     hvc     #0xffff
1:  b       1b
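
    /* Post-processing sketch (illustrative, not part of the guest image):
     * the delta between the two CNTPCT_EL0 samples is in generic timer
     * ticks; dividing by the counter frequency (CNTFRQ_EL0) gives time. */
    static uint64_t ticks_to_usecs(uint64_t start, uint64_t end, uint64_t cntfrq)
    {
        return ((end - start) * 1000000ULL) / cntfrq;
    }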

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-12-05  9:46     ` Julien Grall
@ 2018-12-06 22:02       ` Stefano Stabellini
  2018-12-07 10:14         ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 22:02 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini

On Wed, 5 Dec 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 04/12/2018 23:50, Stefano Stabellini wrote:
> > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > The LPAE format allows storing information in an entry even with the
> > > valid bit unset. In a follow-up patch, we will take advantage of this
> > > feature to re-purpose the valid bit for generating a translation fault
> > > even if an entry contains valid information.
> > > 
> > > So we need a different way to know whether an entry contains valid
> > > information. It is possible to use the information held in the p2m_type
> > > for that purpose. Indeed, all entries containing valid
> > > information will have a valid p2m type (i.e. p2m_type != p2m_invalid).
> > > 
> > > This patch introduces a new helper, p2m_is_valid, which implements that
> > > idea, and replaces most of the lpae_is_valid calls with the new helper. The ones
> > > remaining are for TLB handling and entry accounting.
> > > 
> > > With the renaming there are 2 other changes required:
> > >      - Generate table entries with a valid p2m type
> > >      - Detect new mappings for proper stats accounting
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > 
> > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> > 
> > (This patch doesn't apply to master, please rebase)
> 
> Why are you trying to apply it to master? This series (like most of my series) is
> based on staging as of the time it was sent. I tried to apply this patch today on
> staging and didn't find any issue.

No problems then, I thought the series was based on an older tree, but
instead it was one step ahead.


* Re: [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-12-05 10:03     ` Julien Grall
@ 2018-12-06 22:04       ` Stefano Stabellini
  2018-12-07 10:16         ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 22:04 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini

On Wed, 5 Dec 2018, Julien Grall wrote:
> On 04/12/2018 23:59, Stefano Stabellini wrote:
> > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > A follow-up patch will re-purpose the valid bit of LPAE entries to
> > > generate a fault even on entries containing valid information.
> > > 
> > > This means that translating a guest VA to a guest PA (i.e. IPA) will
> > > fail if the Stage-2 entries used have the valid bit unset. Because of
> > > that, we need to fall back to walking the page-table in software to check
> > > whether the fault was expected.
> > > 
> > > This patch adds the software page-table walk on all translation
> > > faults. It would be possible in the future to avoid a pointless walk when
> > > the fault in PAR_EL1 is not a translation fault.
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > 
> > > ---
> > > 
> > > There are a couple of TODO in the code. They are clean-up and performance
> > > improvement (e.g when the fault cannot be handled) that could be delayed
> > > after
> > > the series has been merged.
> > > 
> > >      Changes in v2:
> > >          - Check stage-2 permission during software lookup
> > >          - Fix typos
> > > ---
> > >   xen/arch/arm/p2m.c | 66
> > > ++++++++++++++++++++++++++++++++++++++++++++++++------
> > >   1 file changed, 59 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > index 47b54c792e..39680eeb6e 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -6,6 +6,7 @@
> > >     #include <asm/event.h>
> > >   #include <asm/flushtlb.h>
> > > +#include <asm/guest_walk.h>
> > >   #include <asm/page.h>
> > >     #define MAX_VMID_8_BIT  (1UL << 8)
> > > @@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu *v,
> > > vaddr_t va,
> > >       struct page_info *page = NULL;
> > >       paddr_t maddr = 0;
> > >       uint64_t par;
> > > +    mfn_t mfn;
> > > +    p2m_type_t t;
> > >         /*
> > >        * XXX: To support a different vCPU, we would need to load the
> > > @@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu *v,
> > > vaddr_t va,
> > >       par = gvirt_to_maddr(va, &maddr, flags);
> > >       p2m_read_unlock(p2m);
> > >   +    /*
> > > +     * gvirt_to_maddr may fail if the entry does not have the valid bit
> > > +     * set. Fallback to the second method:
> > > +     *  1) Translate the VA to IPA using software lookup -> Stage-1
> > > page-table
> > > +     *  may not be accessible because the stage-2 entries may have valid
> > > +     *  bit unset.
> > > +     *  2) Software lookup of the MFN
> > > +     *
> > > +     * Note that when memaccess is enabled, we instead call directly
> > > +     * p2m_mem_access_check_and_get_page(...). Because the function is
> > > +     * a variant of the methods described above, it will be able to
> > > +     * handle entries with valid bit unset.
> > > +     *
> > > +     * TODO: Integrate more nicely memaccess with the rest of the
> > > +     * function.
> > > +     * TODO: Use the fault error in PAR_EL1 to avoid pointless
> > > +     *  translation.
> > > +     */
> > >       if ( par )
> > >       {
> > > +        paddr_t ipa;
> > > +        unsigned int s1_perms;
> > > +
> > >           /*
> > >            * When memaccess is enabled, the translation GVA to MADDR may
> > >            * have failed because of a permission fault.
> > > @@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct vcpu
> > > *v, vaddr_t va,
> > >           if ( p2m->mem_access_enabled )
> > >               return p2m_mem_access_check_and_get_page(va, flags, v);
> > >   -        dprintk(XENLOG_G_DEBUG,
> > > -                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx
> > > par=%#"PRIx64"\n",
> > > -                v, va, flags, par);
> > > -        return NULL;
> > > +        /*
> > > +         * The software stage-1 table walk can still fail, e.g, if the
> > > +         * GVA is not mapped.
> > > +         */
> > > +        if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
> > > +        {
> > > +            dprintk(XENLOG_G_DEBUG,
> > > +                    "%pv: Failed to walk page-table va %#"PRIvaddr"\n",
> > > v, va);
> > > +            return NULL;
> > > +        }
> > > +
> > > +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
> > > +        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
> > > +            return NULL;
> > > +
> > > +        /*
> > > +         * Check permissions that are assumed by the caller. For instance
> > > +         * in case of guestcopy, the caller assumes that the translated
> > > +         * page can be accessed with the requested permissions. If this
> > > +         * is not the case, we should fail.
> > > +         *
> > > +         * Please note that we do not check for the GV2M_EXEC
> > > +         * permission. This is fine because the hardware-based
> > > translation
> > > +         * instruction does not test for execute permissions.
> > > +         */
> > > +        if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
> > > +            return NULL;
> > > +
> > > +        if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
> > > +            return NULL;
> > 
> > The patch looks good enough now. One question: is it a requirement that
> > the page we are trying to translate is of type p2m_ram_*? Could
> > get_page_from_gva be genuinely called passing a page of a different
> > kind, such as p2m_mmio_direct_* or p2m_map_foreign? Today, it is not the
> > case, but I wonder if it is something we want to consider?
> 
> This function can only possibly work with p2m_ram_* because of the
> get_page(...) below; indeed, the page should belong to the domain.
> 
> Effectively, this function will only be used for hypercalls, as it takes a virtual
> address. I question the value of allowing a guest to do a hypercall with the
> data backed by any memory other than guest RAM. For foreign mappings,
> this could potentially end up with a leak.

OK.

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


* Re: [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-12-04 20:35   ` Razvan Cojocaru
@ 2018-12-06 22:32     ` Stefano Stabellini
  2018-12-07 10:17     ` Julien Grall
  1 sibling, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 22:32 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: xen-devel, Julien Grall, sstabellini, Tamas K Lengyel

On Tue, 4 Dec 2018, Razvan Cojocaru wrote:
> On 12/4/18 10:26 PM, Julien Grall wrote:
> > With the recent changes, a P2M entry may be populated but may as not
> > valid. In some situation, it would be useful to know whether the entry
> 
> I think you mean to say "may not be valid"?
> 
> > has been marked available to guest in order to perform a specific
> > action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
> > 
> > Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> Other than that,
> 
> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

Same here:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


* Re: [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-12-04 20:26 ` [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
@ 2018-12-06 22:33   ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 22:33 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> Currently a Stage-2 translation fault can happen because of:
>     1) MMIO emulation
>     2) Another pCPU modifying the P2M using Break-Before-Make
>     3) The guest physical address not being mapped
> 
> A follow-up patch will re-purpose the valid bit in an entry to generate a
> translation fault. This would be used to do an action on each entry to
> track pages used for a given period.
> 
> When receiving the translation fault, we need to walk the page-tables
> to find the faulting entry and then toggle the valid bit. We can't use
> p2m_lookup() for this purpose as it only tells us the mapping exists.
> 
> So this patch adds a new function to walk the page-tables and update
> the entry. This function will also handle 2) as it also requires walking
> the page-table.
> 
> The function is able to cope with both table and block entries having the
> valid bit unset. This gives flexibility to the function clearing the
> valid bits. To keep the algorithm simple, the fault will be propagated
> one level down. This will be repeated until a block entry has been
> reached.
> 
> At the moment, there is no action done when reaching a block/page entry
> other than setting the valid bit to 1.

Thanks, this explanation is much better

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v2:
>         - Typos
>         - Add more comment
>         - Skip clearing valid bit if it was already done
>         - Move the prototype in p2m.h
>         - Expand commit message
> ---
>  xen/arch/arm/p2m.c        | 142 ++++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c      |  10 ++--
>  xen/include/asm-arm/p2m.h |   2 +
>  3 files changed, 148 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 39680eeb6e..2706db3e67 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1035,6 +1035,148 @@ int p2m_set_entry(struct p2m_domain *p2m,
>      return rc;
>  }
>  
> +/* Invalidate all entries in the table. The p2m should be write locked. */
> +static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
> +{
> +    lpae_t *table;
> +    unsigned int i;
> +
> +    ASSERT(p2m_is_write_locked(p2m));
> +
> +    table = map_domain_page(mfn);
> +
> +    for ( i = 0; i < LPAE_ENTRIES; i++ )
> +    {
> +        lpae_t pte = table[i];
> +
> +        /*
> +         * Writing an entry can be expensive because it may involve
> +         * cleaning the cache. So avoid updating the entry if the valid
> +         * bit is already cleared.
> +         */
> +        if ( !pte.p2m.valid )
> +            continue;
> +
> +        pte.p2m.valid = 0;
> +
> +        p2m_write_pte(&table[i], pte, p2m->clean_pte);
> +    }
> +
> +    unmap_domain_page(table);
> +
> +    p2m->need_flush = true;
> +}
> +
> +/*
> + * Resolve any translation fault due to change in the p2m. This
> + * includes break-before-make and valid bit cleared.
> + */
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    unsigned int level = 0;
> +    bool resolved = false;
> +    lpae_t entry, *table;
> +    paddr_t addr = gfn_to_gaddr(gfn);
> +
> +    /* Convenience aliases */
> +    const unsigned int offsets[4] = {
> +        zeroeth_table_offset(addr),
> +        first_table_offset(addr),
> +        second_table_offset(addr),
> +        third_table_offset(addr)
> +    };
> +
> +    p2m_write_lock(p2m);
> +
> +    /* This gfn is higher than the highest the p2m map currently holds */
> +    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
> +        goto out;
> +
> +    table = p2m_get_root_pointer(p2m, gfn);
> +    /*
> +     * The table should always be non-NULL because the gfn is below
> +     * p2m->max_mapped_gfn and the root table pages are always present.
> +     */
> +    BUG_ON(table == NULL);
> +
> +    /*
> +     * Go down the page-tables until an entry has the valid bit unset or
> +     * a block/page entry has been hit.
> +     */
> +    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
> +    {
> +        int rc;
> +
> +        entry = table[offsets[level]];
> +
> +        if ( level == 3 )
> +            break;
> +
> +        /* Stop as soon as we hit an entry with the valid bit unset. */
> +        if ( !lpae_is_valid(entry) )
> +            break;
> +
> +        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
> +        if ( rc == GUEST_TABLE_MAP_FAILED )
> +            goto out_unmap;
> +        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
> +            break;
> +    }
> +
> +    /*
> +     * If the valid bit of the entry is set, it means someone was playing with
> +     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
> +     */
> +    if ( lpae_is_valid(entry) )
> +    {
> +        resolved = true;
> +        goto out_unmap;
> +    }
> +
> +    /*
> +     * The valid bit is unset. If the entry is still not valid then the fault
> +     * cannot be resolved, exit and report it.
> +     */
> +    if ( !p2m_is_valid(entry) )
> +        goto out_unmap;
> +
> +    /*
> +     * Now we have an entry with valid bit unset, but still valid from
> +     * the P2M point of view.
> +     *
> +     * If an entry is pointing to a table, each entry of the table will
> +     * have there valid bit cleared. This allows a function to clear the
> +     * full p2m with just a couple of write. The valid bit will then be
> +     * propagated on the fault.
> +     * If an entry is pointing to a block/page, no work to do for now.
> +     */
> +    if ( lpae_is_table(entry, level) )
> +        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
> +
> +    /*
> +     * Now that the work on the entry is done, set the valid bit to prevent
> +     * another fault on that entry.
> +     */
> +    resolved = true;
> +    entry.p2m.valid = 1;
> +
> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
> +
> +    /*
> +     * No need to flush the TLBs as the modified entry had the valid bit
> +     * unset.
> +     */
> +
> +out_unmap:
> +    unmap_domain_page(table);
> +
> +out:
> +    p2m_write_unlock(p2m);
> +
> +    return resolved;
> +}
> +
>  static inline int p2m_insert_mapping(struct domain *d,
>                                       gfn_t start_gfn,
>                                       unsigned long nr,
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 94fe1a6da7..b00d0b8e1e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1893,7 +1893,6 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>      vaddr_t gva;
>      paddr_t gpa;
>      uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
> -    mfn_t mfn;
>      bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>  
>      /*
> @@ -1972,12 +1971,11 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>          }
>  
>          /*
> -         * The PT walk may have failed because someone was playing
> -         * with the Stage-2 page table. Walk the Stage-2 PT to check
> -         * if the entry exists. If it's the case, return to the guest
> +         * First check if the translation fault can be resolved by the
> +         * P2M subsystem. If that's the case nothing else to do.
>           */
> -        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
> -        if ( !mfn_eq(mfn, INVALID_MFN) )
> +        if ( p2m_resolve_translation_fault(current->domain,
> +                                           gaddr_to_gfn(gpa)) )
>              return;
>  
>          if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 4fe78d39a5..13f7a27c38 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -226,6 +226,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
>                    p2m_type_t t,
>                    p2m_access_t a);
>  
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
> +
>  /* Clean & invalidate caches corresponding to a region of guest address space */
>  int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);
>  
> -- 
> 2.11.0
> 
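
As a worked example of the offsets[] array computed at the top of
p2m_resolve_translation_fault() above, using the usual 4KB-granule LPAE split
of 9 bits per level (the concrete address is only illustrative):

    For a guest physical address addr = 0x40000000 (gfn 0x40000):
        zeroeth_table_offset(addr) = (addr >> 39) & 0x1ff = 0
        first_table_offset(addr)   = (addr >> 30) & 0x1ff = 1
        second_table_offset(addr)  = (addr >> 21) & 0x1ff = 0
        third_table_offset(addr)   = (addr >> 12) & 0x1ff = 0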


* Re: [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-12-04 20:26 ` [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
@ 2018-12-06 22:33   ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 22:33 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> A follow-up patch will require emulating some accesses to some
> co-processor registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes
> to the virtual memory control registers will be trapped to the hypervisor.
> 
> This patch adds the infrastructure to pass the accesses through to the host
> registers. For convenience, a set of macros has been added to
> generate the different helpers.
> 
> Note that HCR_EL2.TVM will be set dynamically in a follow-up patch.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>     Changes in v2:
>         - Add missing include vreg.h
>         - Fixup mask TVM_REG32_COMBINED
>         - Update comments
> ---
>  xen/arch/arm/vcpreg.c        | 149 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/cpregs.h |   1 +
>  2 files changed, 150 insertions(+)
> 
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index 7b783e4bcc..550c25ec3f 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -23,8 +23,129 @@
>  #include <asm/current.h>
>  #include <asm/regs.h>
>  #include <asm/traps.h>
> +#include <asm/vreg.h>
>  #include <asm/vtimer.h>
>  
> +/*
> + * Macros to help generating helpers for registers trapped when
> + * HCR_EL2.TVM is set.
> + *
> + * Note that it only traps NS write access from EL1.
> + *
> + *  - TVM_REG() should not be used outside of the macros. It is there to
> + *    help defining TVM_REG32() and TVM_REG64()
> + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
> + *    generate helpers accessing resp. a 32-bit and a 64-bit register.
> + *    "regname" is the Arm32 name and "xreg" the Arm64 name.
> + *  - TVM_REG32_COMBINED(lowreg, hireg, xreg) is used to generate a
> + *    pair of registers sharing the same Arm64 register, but which are 2
> + *    distinct Arm32 registers. "lowreg" and "hireg" contain the names of
> + *    the Arm32 registers, "xreg" contains the name of the combined
> + *    register on Arm64. The definition of "lowreg" and "hireg" matches
> + *    the Armv8 specification; this means "lowreg" is an alias to
> + *    xreg[31:0] and "hireg" is an alias to xreg[63:32].
> + *
> + */
> +
> +/* The name is passed from the upper macro to workaround macro expansion. */
> +#define TVM_REG(sz, func, reg...)                                           \
> +static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
> +{                                                                           \
> +    GUEST_BUG_ON(read);                                                     \
> +    WRITE_SYSREG##sz(*r, reg);                                              \
> +                                                                            \
> +    return true;                                                            \
> +}
> +
> +#define TVM_REG32(regname, xreg) TVM_REG(32, vreg_emulate_##regname, xreg)
> +#define TVM_REG64(regname, xreg) TVM_REG(64, vreg_emulate_##regname, xreg)
> +
> +#ifdef CONFIG_ARM_32
> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
> +    /* Use TVM_REG directly to workaround macro expansion. */       \
> +    TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
> +    TVM_REG(32, vreg_emulate_##hireg, hireg)
> +
> +#else /* CONFIG_ARM_64 */
> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                             \
> +static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
> +                                bool read, bool hi)                         \
> +{                                                                           \
> +    register_t reg = READ_SYSREG(xreg);                                     \
> +                                                                            \
> +    GUEST_BUG_ON(read);                                                     \
> +    if ( hi ) /* reg[63:32] is AArch32 register hireg */                    \
> +    {                                                                       \
> +        reg &= GENMASK(31, 0);                                              \
> +        reg |= ((uint64_t)*r) << 32;                                        \
> +    }                                                                       \
> +    else /* reg[31:0] is AArch32 register lowreg. */                        \
> +    {                                                                       \
> +        reg &= GENMASK(63, 32);                                             \
> +        reg |= *r;                                                          \
> +    }                                                                       \
> +    WRITE_SYSREG(reg, xreg);                                                \
> +                                                                            \
> +    return true;                                                            \
> +}                                                                           \
> +                                                                            \
> +static bool vreg_emulate_##lowreg(struct cpu_user_regs *regs, uint32_t *r,  \
> +                                  bool read)                                \
> +{                                                                           \
> +    return vreg_emulate_##xreg(regs, r, read, false);                       \
> +}                                                                           \
> +                                                                            \
> +static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
> +                                 bool read)                                 \
> +{                                                                           \
> +    return vreg_emulate_##xreg(regs, r, read, true);                        \
> +}
> +#endif
> +
> +/* Defining helpers for emulating co-processor registers. */
> +TVM_REG32(SCTLR, SCTLR_EL1)
> +/*
> + * AArch32 provides two ways to access TTBR* depending on the access
> + * size, whilst AArch64 provides one way.
> + *
> + * When using AArch32, for simplicity, use the same access size as the
> + * guest.
> + */
> +#ifdef CONFIG_ARM_32
> +TVM_REG32(TTBR0_32, TTBR0_32)
> +TVM_REG32(TTBR1_32, TTBR1_32)
> +#else
> +TVM_REG32(TTBR0_32, TTBR0_EL1)
> +TVM_REG32(TTBR1_32, TTBR1_EL1)
> +#endif
> +TVM_REG64(TTBR0, TTBR0_EL1)
> +TVM_REG64(TTBR1, TTBR1_EL1)
> +/* AArch32 registers TTBCR and TTBCR2 share AArch64 register TCR_EL1. */
> +TVM_REG32_COMBINED(TTBCR, TTBCR2, TCR_EL1)
> +TVM_REG32(DACR, DACR32_EL2)
> +TVM_REG32(DFSR, ESR_EL1)
> +TVM_REG32(IFSR, IFSR32_EL2)
> +/* AArch32 registers DFAR and IFAR shares AArch64 register FAR_EL1. */
> +TVM_REG32_COMBINED(DFAR, IFAR, FAR_EL1)
> +TVM_REG32(ADFSR, AFSR0_EL1)
> +TVM_REG32(AIFSR, AFSR1_EL1)
> +/* AArch32 registers MAIR0 and MAIR1 share AArch64 register MAIR_EL1. */
> +TVM_REG32_COMBINED(MAIR0, MAIR1, MAIR_EL1)
> +/* AArch32 registers AMAIR0 and AMAIR1 share AArch64 register AMAIR_EL1. */
> +TVM_REG32_COMBINED(AMAIR0, AMAIR1, AMAIR_EL1)
> +TVM_REG32(CONTEXTIDR, CONTEXTIDR_EL1)
> +
> +/* Macro to easily generate cases for co-processor emulation. */
> +#define GENERATE_CASE(reg, sz)                                      \
> +    case HSR_CPREG##sz(reg):                                        \
> +    {                                                               \
> +        bool res;                                                   \
> +                                                                    \
> +        res = vreg_emulate_cp##sz(regs, hsr, vreg_emulate_##reg);   \
> +        ASSERT(res);                                                \
> +        break;                                                      \
> +    }
> +
>  void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>  {
>      const struct hsr_cp32 cp32 = hsr.cp32;
> @@ -65,6 +186,31 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>          break;
>  
>      /*
> +     * HCR_EL2.TVM
> +     *
> +     * ARMv8 (DDI 0487D.a): Table D1-38
> +     */
> +    GENERATE_CASE(SCTLR, 32)
> +    GENERATE_CASE(TTBR0_32, 32)
> +    GENERATE_CASE(TTBR1_32, 32)
> +    GENERATE_CASE(TTBCR, 32)
> +    GENERATE_CASE(TTBCR2, 32)
> +    GENERATE_CASE(DACR, 32)
> +    GENERATE_CASE(DFSR, 32)
> +    GENERATE_CASE(IFSR, 32)
> +    GENERATE_CASE(DFAR, 32)
> +    GENERATE_CASE(IFAR, 32)
> +    GENERATE_CASE(ADFSR, 32)
> +    GENERATE_CASE(AIFSR, 32)
> +    /* AKA PRRR */
> +    GENERATE_CASE(MAIR0, 32)
> +    /* AKA NMRR */
> +    GENERATE_CASE(MAIR1, 32)
> +    GENERATE_CASE(AMAIR0, 32)
> +    GENERATE_CASE(AMAIR1, 32)
> +    GENERATE_CASE(CONTEXTIDR, 32)
> +
> +    /*
>       * MDCR_EL2.TPM
>       *
>       * ARMv7 (DDI 0406C.b): B1.14.17
> @@ -193,6 +339,9 @@ void do_cp15_64(struct cpu_user_regs *regs, const union hsr hsr)
>              return inject_undef_exception(regs, hsr);
>          break;
>  
> +    GENERATE_CASE(TTBR0, 64)
> +    GENERATE_CASE(TTBR1, 64)
> +
>      /*
>       * CPTR_EL2.T{0..9,12..13}
>       *
> diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
> index 97a3c6f1c1..8fd344146e 100644
> --- a/xen/include/asm-arm/cpregs.h
> +++ b/xen/include/asm-arm/cpregs.h
> @@ -140,6 +140,7 @@
>  
>  /* CP15 CR2: Translation Table Base and Control Registers */
>  #define TTBCR           p15,0,c2,c0,2   /* Translation Table Base Control Register */
> +#define TTBCR2          p15,0,c2,c0,3   /* Translation Table Base Control Register 2 */
>  #define TTBR0           p15,0,c2        /* Translation Table Base Reg. 0 */
>  #define TTBR1           p15,1,c2        /* Translation Table Base Reg. 1 */
>  #define HTTBR           p15,4,c2        /* Hyp. Translation Table Base Register */
> -- 
> 2.11.0
> 
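
For readers unfamiliar with the macro machinery, this is roughly what
TVM_REG32(SCTLR, SCTLR_EL1) expands to (a sketch derived from the TVM_REG()
definition in the patch, not verbatim preprocessor output):

    static bool vreg_emulate_SCTLR(struct cpu_user_regs *regs, uint32_t *r,
                                   bool read)
    {
        /* Only writes are trapped by HCR_EL2.TVM; a read reaching here is
         * unexpected. */
        GUEST_BUG_ON(read);
        /* Forward the guest value to the corresponding host register. */
        WRITE_SYSREG32(*r, SCTLR_EL1);

        return true;
    }

The GENERATE_CASE(SCTLR, 32) invocation then simply dispatches the trapped
CP15 access to this helper via vreg_emulate_cp32().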


* Re: [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail
  2018-12-04 20:26 ` [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail Julien Grall
@ 2018-12-06 23:08   ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 23:08 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> The function leave_hypervisor_tail is called before each return to the
> guest vCPU. It has two main purposes:
>     1) Process physical CPU work (e.g. rescheduling) if required
>     2) Prepare the physical CPU to run the guest vCPU
> 
> 2) will always be done once we have finished processing physical CPU work.
> At the moment, it is done as part of the last iteration of 1), adding
> some extra indentation in the code.
> 
> This could be streamlined by moving 2) out of the loop. At the same
> time, 1) is moved into a separate function, making the structure more obvious.
> 
> All those changes will help a follow-up patch where we would want to
> introduce some vCPU work before returning to the guest vCPU.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/traps.c | 61 ++++++++++++++++++++++++++++------------------------
>  1 file changed, 33 insertions(+), 28 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index b00d0b8e1e..02665cc7b4 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2241,36 +2241,12 @@ void do_trap_fiq(struct cpu_user_regs *regs)
>      gic_interrupt(regs, 1);
>  }
>  
> -void leave_hypervisor_tail(void)
> +static void check_for_pcpu_work(void)
>  {
> -    while (1)
> -    {
> -        local_irq_disable();
> -        if ( !softirq_pending(smp_processor_id()) )
> -        {
> -            vgic_sync_to_lrs();
> -
> -            /*
> -             * If the SErrors handle option is "DIVERSE", we have to prevent
> -             * slipping the hypervisor SError to guest. In this option, before
> -             * returning from trap, we have to synchronize SErrors to guarantee
> -             * that the pending SError would be caught in hypervisor.
> -             *
> -             * If option is NOT "DIVERSE", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
> -             * will be set to cpu_hwcaps. This means we can use the alternative
> -             * to skip synchronizing SErrors for other SErrors handle options.
> -             */
> -            SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
> -
> -            /*
> -             * The hypervisor runs with the workaround always present.
> -             * If the guest wants it disabled, so be it...
> -             */
> -            if ( needs_ssbd_flip(current) )
> -                arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
> +    ASSERT(!local_irq_is_enabled());
>  
> -            return;
> -        }
> +    while ( softirq_pending(smp_processor_id()) )
> +    {
>          local_irq_enable();
>          do_softirq();
>          /*
> @@ -2278,9 +2254,38 @@ void leave_hypervisor_tail(void)
>           * and we want to patch the hypervisor with almost no stack.
>           */
>          check_for_livepatch_work();
> +        local_irq_disable();
>      }
>  }
>  
> +void leave_hypervisor_tail(void)
> +{
> +    local_irq_disable();
> +
> +    check_for_pcpu_work();
> +
> +    vgic_sync_to_lrs();
> +
> +    /*
> +     * If the SErrors handle option is "DIVERSE", we have to prevent
> +     * slipping the hypervisor SError to guest. In this option, before
> +     * returning from trap, we have to synchronize SErrors to guarantee
> +     * that the pending SError would be caught in hypervisor.
> +     *
> +     * If option is NOT "DIVERSE", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
> +     * will be set to cpu_hwcaps. This means we can use the alternative
> +     * to skip synchronizing SErrors for other SErrors handle options.
> +     */
> +    SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
> +
> +    /*
> +     * The hypervisor runs with the workaround always present.
> +     * If the guest wants it disabled, so be it...
> +     */
> +    if ( needs_ssbd_flip(current) )
> +        arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 0, NULL);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> -- 
> 2.11.0
> 


* Re: [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush
  2018-12-04 20:26 ` [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush Julien Grall
@ 2018-12-06 23:13   ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 23:13 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> This will make changes in a follow-up patch easier.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/domctl.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index c10f568aad..20691528a6 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -52,16 +52,16 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>      {
>      case XEN_DOMCTL_cacheflush:
>      {
> -        unsigned long s = domctl->u.cacheflush.start_pfn;
> -        unsigned long e = s + domctl->u.cacheflush.nr_pfns;
> +        gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
> +        gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
>  
>          if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
>              return -EINVAL;
>  
> -        if ( e < s )
> +        if ( gfn_x(e) < gfn_x(s) )
>              return -EINVAL;
>  
> -        return p2m_cache_flush_range(d, _gfn(s), _gfn(e));
> +        return p2m_cache_flush_range(d, s, e);
>      }
>      case XEN_DOMCTL_bind_pt_irq:
>      {
> -- 
> 2.11.0
> 


* Re: [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
  2018-12-04 20:26 ` [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range Julien Grall
@ 2018-12-06 23:32   ` Stefano Stabellini
  2018-12-07 11:15     ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 23:32 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> p2m_cache_flush_range does not yet support preemption; this may be an
> issue as cleaning the cache can take a long time. While the current
> caller (XEN_DOMCTL_cacheflush) does not strictly require preemption, this
> will be necessary for a new caller in a follow-up patch.
> 
> The preemption implemented is quite simple: a counter is incremented by
>     - 1 per region skipped
>     - 10 for each page requiring a flush
> 
> When the counter reaches 512 or above, we will check if preemption is
> needed. If not, the counter will be reset to 0. If yes, the function
> will stop, update start (to allow resuming later on) and return
> -ERESTART. This allows the caller to decide how the preemption will be
> done.
> 
> For now, XEN_DOMCTL_cacheflush will continue to ignore the preemption.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/domctl.c     |  8 +++++++-
>  xen/arch/arm/p2m.c        | 35 ++++++++++++++++++++++++++++++++---
>  xen/include/asm-arm/p2m.h |  4 +++-
>  3 files changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index 20691528a6..9da88b8c64 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -54,6 +54,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>      {
>          gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
>          gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
> +        int rc;

This is unnecessary...


>          if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
>              return -EINVAL;
> @@ -61,7 +62,12 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>          if ( gfn_x(e) < gfn_x(s) )
>              return -EINVAL;
>  
> -        return p2m_cache_flush_range(d, s, e);
> +        /* XXX: Handle preemption */
> +        do
> +            rc = p2m_cache_flush_range(d, &s, e);
> +        while ( rc == -ERESTART );

... you can just do:

  while ( -ERESTART == p2m_cache_flush_range(d, &s, e) )

But given that it is just style, I'll leave it up to you.


> +        return rc;
>      }
>      case XEN_DOMCTL_bind_pt_irq:
>      {
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index db22b53bfd..ca9f0d9ebe 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1524,13 +1524,17 @@ int relinquish_p2m_mapping(struct domain *d)
>      return rc;
>  }
>  
> -int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
> +int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>      gfn_t next_block_gfn;
> +    gfn_t start = *pstart;
>      mfn_t mfn = INVALID_MFN;
>      p2m_type_t t;
>      unsigned int order;
> +    int rc = 0;
> +    /* Counter for preemption */
> +    unsigned long count = 0;
>  
>      /*
>       * The operation cache flush will invalidate the RAM assigned to the
> @@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  
>      while ( gfn_x(start) < gfn_x(end) )
>      {
> +        /*
> +         * Cleaning the cache for the P2M may take a long time. So we
> +         * need to be able to preempt. We will arbitrarily preempt every
> +         * time count reaches 512 or above.
> +         *
> +         * The count will be incremented by:
> +         *  - 1 per region skipped
> +         *  - 10 for each page requiring a flush

Why this choice? A page flush should cost much more than 10x a region
skipped, more like 100x or 1000x. In fact, doing the full loop without
calling flush_page_to_ram should be cheap and fast, right? I would:

- not increase count on region skipped at all
- increase it by 1 on each page requiring a flush
- set the limit lower, if we go with your proposal it would be about 50,
  I am not sure what the limit should be though


> +         */
> +        if ( count >= 512 )
> +        {
> +            if ( softirq_pending(smp_processor_id()) )
> +            {
> +                rc = -ERESTART;
> +                break;
> +            }
> +            count = 0;

No need to set count to 0 here


> +        }
> +
>          /*
>           * We want to flush page by page as:
>           *  - it may not be possible to map the full block (can be up to 1GB)
> @@ -1573,22 +1596,28 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>               */
>              if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>              {
> +                count++;

This is just an iteration doing nothing, I would not increment count.

>                  start = next_block_gfn;
>                  continue;
>              }
>          }
>  
> +        count += 10;

This makes sense, but if we skip the count++ above, we might as well
just count++ here and have a lower limit.


>          flush_page_to_ram(mfn_x(mfn), false);
>  
>          start = gfn_add(start, 1);
>          mfn = mfn_add(mfn, 1);
>      }
>  
> -    invalidate_icache();
> +    if ( rc != -ERESTART )
> +        invalidate_icache();
>  
>      p2m_read_unlock(p2m);
>  
> -    return 0;
> +    *pstart = start;
> +
> +    return rc;
>  }
>  
>  mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 7c1d930b1d..a633e27cc9 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -232,8 +232,10 @@ bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
>  /*
>   * Clean & invalidate caches corresponding to a region [start,end) of guest
>   * address space.
> + *
> + * start will get updated if the function is preempted.
>   */
> -int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
> +int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end);
>  
>  /*
>   * Map a region in the guest p2m with a specific p2m type.
> -- 
> 2.11.0
> 


* Re: [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-04 20:26 ` [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations Julien Grall
@ 2018-12-06 23:32   ` Stefano Stabellini
  2018-12-07 13:22     ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 23:32 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> Set/Way operations are used to perform maintenance on a given cache.
> At the moment, Set/Way operations are not trapped and therefore a guest
> OS will directly act on the local cache. However, a vCPU may migrate to
> another pCPU in the middle of the processor. This will result to have
> cache with stall data (Set/Way are not propagated) potentially causing
> crash. This may be the cause of heisenbug noticed in Osstest [1].
> 
> Furthermore, Set/Way operations are not available on system cache. This
> means that OS, such as Linux 32-bit, relying on those operations to
> fully clean the cache before disabling MMU may break because data may
> sits in system caches and not in RAM.
> 
> For more details about Set/Way, see the talk "The Art of Virtualizing
> Cache Maintenance" given at Xen Summit 2018 [2].
> 
> In the context of Xen, we need to trap Set/Way operations and emulate
> them. From the Arm Arm (B1.14.4 in DDI 046C.c), Set/Way operations are
> difficult to virtualized. So we can assume that a guest OS using them will
> suffer the consequence (i.e slowness) until developer removes all the usage
> of Set/Way.
> 
> As the software is not allowed to infer the Set/Way to Physical Address
> mapping, Xen will need to go through the guest P2M and clean &
> invalidate all the entries mapped.
> 
> Because Set/Way happen in batch (a loop on all Set/Way of a cache), Xen
> would need to go through the P2M for every instructions. This is quite
> expensive and would severely impact the guest OS. The implementation is
> re-using the KVM policy to limit the number of flush:
>     - If we trap a Set/Way operations, we enable VM trapping (i.e
>       HVC_EL2.TVM) to detect cache being turned on/off, and do a full
>     clean.
>     - We clean the caches when turning on and off
>     - Once the caches are enabled, we stop trapping VM instructions
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
> [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v2:
>         - Fix emulation for Set/Way cache flush arm64 sysreg
>         - Add support for preemption
>         - Check cache status on every VM traps in Arm64
>         - Remove spurious change
> ---
>  xen/arch/arm/arm64/vsysreg.c | 17 ++++++++
>  xen/arch/arm/p2m.c           | 92 ++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c         | 25 +++++++++++-
>  xen/arch/arm/vcpreg.c        | 22 +++++++++++
>  xen/include/asm-arm/domain.h |  8 ++++
>  xen/include/asm-arm/p2m.h    | 20 ++++++++++
>  6 files changed, 183 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 16ac9c344a..8a85507d9d 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -34,9 +34,14 @@
>  static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
>                                 uint64_t *r, bool read)              \
>  {                                                                   \
> +    struct vcpu *v = current;                                       \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                 \
> +                                                                    \
>      GUEST_BUG_ON(read);                                             \
>      WRITE_SYSREG64(*r, reg);                                        \
>                                                                      \
> +    p2m_toggle_cache(v, cache_enabled);                             \
> +                                                                    \
>      return true;                                                    \
>  }
>  
> @@ -85,6 +90,18 @@ void do_sysreg(struct cpu_user_regs *regs,
>          break;
>  
>      /*
> +     * HCR_EL2.TSW
> +     *
> +     * ARMv8 (DDI 0487B.b): Table D1-42
> +     */
> +    case HSR_SYSREG_DCISW:
> +    case HSR_SYSREG_DCCSW:
> +    case HSR_SYSREG_DCCISW:
> +        if ( !hsr.sysreg.read )
> +            p2m_set_way_flush(current);
> +        break;
> +
> +    /*
>       * HCR_EL2.TVM
>       *
>       * ARMv8 (DDI 0487D.a): Table D1-38
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index ca9f0d9ebe..8ee6ff7bd7 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -3,6 +3,7 @@
>  #include <xen/iocap.h>
>  #include <xen/lib.h>
>  #include <xen/sched.h>
> +#include <xen/softirq.h>
>  
>  #include <asm/event.h>
>  #include <asm/flushtlb.h>
> @@ -1620,6 +1621,97 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>      return rc;
>  }
>  
> +/*
> + * Clean & invalidate RAM associated to the guest vCPU.
> + *
> + * The function can only work with the current vCPU and should be called
> + * with IRQ enabled as the vCPU could get preempted.
> + */
> +void p2m_flush_vm(struct vcpu *v)
> +{
> +    int rc;
> +    gfn_t start = _gfn(0);
> +
> +    ASSERT(v == current);
> +    ASSERT(local_irq_is_enabled());
> +    ASSERT(v->arch.need_flush_to_ram);
> +
> +    do
> +    {
> +        rc = p2m_cache_flush_range(v->domain, &start, _gfn(ULONG_MAX));
> +        if ( rc == -ERESTART )
> +            do_softirq();

Shouldn't we store somewhere where we left off? Specifically the value
of `start' when rc == -ERESTART? Maybe we could change need_flush_to_ram
to gfn_t and use it to store `start'?


> +    } while ( rc == -ERESTART );
> +
> +    if ( rc != 0 )
> +        gprintk(XENLOG_WARNING,
> +                "P2M has not been correctly cleaned (rc = %d)\n",
> +                rc);
> +
> +    v->arch.need_flush_to_ram = false;
> +}
> +
> +/*
> + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
> + * easily virtualized).
> + *
> + * Main problems:
> + *  - S/W ops are local to a CPU (not broadcast)
> + *  - We have line migration behind our back (speculation)
> + *  - System caches don't support S/W at all (damn!)
> + *
> + * In the face of the above, the best we can do is to try and convert
> + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
> + * to PA mapping, it can only use S/W to nuke the whole cache, which is
> + * rather a good thing for us.
> + *
> + * Also, it is only used when turning caches on/off ("The expected
> + * usage of the cache maintenance instructions that operate by set/way
> + * is associated with the powerdown and powerup of caches, if this is
> + * required by the implementation.").
> + *
> + * We use the following policy:
> + *  - If we trap a S/W operation, we enabled VM trapping to detect
> + *  caches being turned on/off, and do a full clean.
> + *
> + *  - We flush the caches on both caches being turned on and off.
> + *
> + *  - Once the caches are enabled, we stop trapping VM ops.
> + */
> +void p2m_set_way_flush(struct vcpu *v)
> +{
> +    /* This function can only work with the current vCPU. */
> +    ASSERT(v == current);
> +
> +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
> +    {
> +        v->arch.need_flush_to_ram = true;
> +        vcpu_hcr_set_flags(v, HCR_TVM);
> +    }
> +}
> +
> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
> +{
> +    bool now_enabled = vcpu_has_cache_enabled(v);
> +
> +    /* This function can only work with the current vCPU. */
> +    ASSERT(v == current);
> +
> +    /*
> +     * If switching the MMU+caches on, need to invalidate the caches.
> +     * If switching it off, need to clean the caches.
> +     * Clean + invalidate does the trick always.
> +     */
> +    if ( was_enabled != now_enabled )
> +    {
> +        v->arch.need_flush_to_ram = true;
> +    }
> +
> +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
> +    if ( now_enabled )
> +        vcpu_hcr_clear_flags(v, HCR_TVM);
> +}
> +
>  mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
>  {
>      return p2m_lookup(d, gfn, NULL);
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 02665cc7b4..221c762ada 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -97,7 +97,7 @@ register_t get_default_hcr_flags(void)
>  {
>      return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
>               (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
> -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
> +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>  }
>  
>  static enum {
> @@ -2258,10 +2258,33 @@ static void check_for_pcpu_work(void)
>      }
>  }
>  
> +/*
> + * Process pending work for the vCPU. Any call should be fast or
> + * implement preemption.
> + */
> +static void check_for_vcpu_work(void)
> +{
> +    struct vcpu *v = current;
> +
> +    if ( likely(!v->arch.need_flush_to_ram) )
> +        return;
> +
> +    /*
> +     * Give a chance for the pCPU to process work before handling the vCPU
> +     * pending work.
> +     */
> +    check_for_pcpu_work();

This is a bit awkward: we end up calling check_for_pcpu_work() both
before and after check_for_vcpu_work(). In effect we are doing:

  check_for_pcpu_work()
  check_for_vcpu_work()
  check_for_pcpu_work()

Sounds like we should come up with something better but I don't have a
concrete suggestion :-)


> +    local_irq_enable();
> +    p2m_flush_vm(v);
> +    local_irq_disable();
> +}
> +
>  void leave_hypervisor_tail(void)
>  {
>      local_irq_disable();
>  
> +    check_for_vcpu_work();
>      check_for_pcpu_work();
>  
>      vgic_sync_to_lrs();
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index 550c25ec3f..cdc91cdf5b 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -51,9 +51,14 @@
>  #define TVM_REG(sz, func, reg...)                                           \
>  static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>  {                                                                           \
> +    struct vcpu *v = current;                                               \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
> +                                                                            \
>      GUEST_BUG_ON(read);                                                     \
>      WRITE_SYSREG##sz(*r, reg);                                              \
>                                                                              \
> +    p2m_toggle_cache(v, cache_enabled);                                     \
> +                                                                            \
>      return true;                                                            \
>  }
>  
> @@ -71,6 +76,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>  static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>                                  bool read, bool hi)                         \
>  {                                                                           \
> +    struct vcpu *v = current;                                               \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>      register_t reg = READ_SYSREG(xreg);                                     \
>                                                                              \
>      GUEST_BUG_ON(read);                                                     \
> @@ -86,6 +93,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>      }                                                                       \
>      WRITE_SYSREG(reg, xreg);                                                \
>                                                                              \
> +    p2m_toggle_cache(v, cache_enabled);                                     \
> +                                                                            \
>      return true;                                                            \
>  }                                                                           \
>                                                                              \
> @@ -186,6 +195,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>          break;
>  
>      /*
> +     * HCR_EL2.TSW
> +     *
> +     * ARMv7 (DDI 0406C.b): B1.14.6
> +     * ARMv8 (DDI 0487B.b): Table D1-42
> +     */
> +    case HSR_CPREG32(DCISW):
> +    case HSR_CPREG32(DCCSW):
> +    case HSR_CPREG32(DCCISW):
> +        if ( !cp32.read )
> +            p2m_set_way_flush(current);
> +        break;
> +
> +    /*
>       * HCR_EL2.TVM
>       *
>       * ARMv8 (DDI 0487D.a): Table D1-38
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 175de44927..f16b973e0d 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -202,6 +202,14 @@ struct arch_vcpu
>      struct vtimer phys_timer;
>      struct vtimer virt_timer;
>      bool   vtimer_initialized;
> +
> +    /*
> +     * The full P2M may require some cleaning (e.g when emulation
> +     * set/way). As the action can take a long time, it requires
> +     * preemption. So this is deferred until we return to the guest.

The reason for delaying the call to p2m_flush_vm until we return to the
guest is that we need to call do_softirq to check whether we need to be
preempted and we cannot make that call in the middle of the vcpreg.c
handlers, right?


> +     */
> +    bool need_flush_to_ram;
> +
>  }  __cacheline_aligned;
>  
>  void vcpu_show_execution_state(struct vcpu *);
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index a633e27cc9..79abcb5a63 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -6,6 +6,8 @@
>  #include <xen/rwlock.h>
>  #include <xen/mem_access.h>
>  
> +#include <asm/current.h>
>
>  #define paddr_bits PADDR_BITS
>  
>  /* Holds the bit size of IPAs in p2m tables.  */
> @@ -237,6 +239,12 @@ bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
>   */
>  int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end);
>  
> +void p2m_set_way_flush(struct vcpu *v);
> +
> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled);
> +
> +void p2m_flush_vm(struct vcpu *v);
> +
>  /*
>   * Map a region in the guest p2m with a specific p2m type.
>   * The memory attributes will be derived from the p2m type.
> @@ -364,6 +372,18 @@ static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
>      return -EOPNOTSUPP;
>  }
>  
> +/*
> + * A vCPU has cache enabled only when the MMU is enabled and data cache
> + * is enabled.
> + */
> +static inline bool vcpu_has_cache_enabled(struct vcpu *v)
> +{
> +    /* Only works with the current vCPU */
> +    ASSERT(current == v);
> +
> +    return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == (SCTLR_C|SCTLR_M);
> +}
> +
>  #endif /* _XEN_P2M_H */
>  
>  /*

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range
  2018-12-04 20:26 ` [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range Julien Grall
@ 2018-12-06 23:53   ` Stefano Stabellini
  2018-12-07 10:18     ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-06 23:53 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, sstabellini

On Tue, 4 Dec 2018, Julien Grall wrote:
> A follow-up patch will add support for preemption in p2m_cache_flush_range.
> Because of the complexity for the 2 loops, it would be necessary to add
> preemption in both of them.
> 
> This can be avoided by merging the 2 loops together and still keeping
> the code fairly simple to read and extend.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/p2m.c | 52 +++++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 37 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index c713226561..db22b53bfd 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1527,7 +1527,8 @@ int relinquish_p2m_mapping(struct domain *d)
>  int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -    gfn_t next_gfn;
> +    gfn_t next_block_gfn;
> +    mfn_t mfn = INVALID_MFN;
>      p2m_type_t t;
>      unsigned int order;
>  
> @@ -1542,24 +1543,45 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>      start = gfn_max(start, p2m->lowest_mapped_gfn);
>      end = gfn_min(end, p2m->max_mapped_gfn);
>  
> -    for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
> -    {
> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +    next_block_gfn = start;
>  
> -        next_gfn = gfn_next_boundary(start, order);
> -
> -        /* Skip hole and non-RAM page */
> -        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
> -            continue;
> -
> -        /* XXX: Implement preemption */
> -        while ( gfn_x(start) < gfn_x(next_gfn) )
> +    while ( gfn_x(start) < gfn_x(end) )
> +    {
> +        /*
> +         * We want to flush page by page as:
> +         *  - it may not be possible to map the full block (can be up to 1GB)
> +         *    in Xen memory
> +         *  - we may want to do fine grain preemption as flushing multiple
> +         *    page in one go may take a long time
> +         *
> +         * As p2m_get_entry is able to return the size of the mapping
> +         * in the p2m, it is pointless to execute it for each page.
> +         *
> +         * We can optimize it by tracking the gfn of the next
> +         * block. So we will only call p2m_get_entry for each block (can
> +         * be up to 1GB).
> +         */
> +        if ( gfn_eq(start, next_block_gfn) )
>          {
> -            flush_page_to_ram(mfn_x(mfn), false);
> +            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +            next_block_gfn = gfn_next_boundary(start, order);
>  
> -            start = gfn_add(start, 1);
> -            mfn = mfn_add(mfn, 1);
> +            /*
> +             * The following regions can be skipped:
> +             *      - Hole
> +             *      - non-RAM
> +             */

I think this comment is superfluous as the code is already obvious. You
can remove it.

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> +            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
> +            {
> +                start = next_block_gfn;
> +                continue;
> +            }
>          }
> +
> +        flush_page_to_ram(mfn_x(mfn), false);
> +
> +        start = gfn_add(start, 1);
> +        mfn = mfn_add(mfn, 1);
>      }
>  
>      invalidate_icache();
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-12-06 22:02       ` Stefano Stabellini
@ 2018-12-07 10:14         ` Julien Grall
  0 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-07 10:14 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi Stefano

On 06/12/2018 22:02, Stefano Stabellini wrote:
> On Wed, 5 Dec 2018, Julien Grall wrote:
>> On 04/12/2018 23:50, Stefano Stabellini wrote:
>>> On Tue, 4 Dec 2018, Julien Grall wrote:
>>>> The LPAE format allows to store information in an entry even with the
>>>> valid bit unset. In a follow-up patch, we will take advantage of this
>>>> feature to re-purpose the valid bit for generating a translation fault
>>>> even if an entry contains valid information.
>>>>
>>>> So we need a different way to know whether an entry contains valid
>>>> information. It is possible to use the information hold in the p2m_type
>>>> to know for that purpose. Indeed all entries containing valid
>>>> information will have a valid p2m type (i.e p2m_type != p2m_invalid).
>>>>
>>>> This patch introduces a new helper p2m_is_valid, which implements that
>>>> idea, and replace most of lpae_is_valid call with the new helper. The ones
>>>> remaining are for TLBs handling and entries accounting.
>>>>
>>>> With the renaming there are 2 others changes required:
>>>>       - Generate table entry with a valid p2m type
>>>>       - Detect new mapping for proper stats accounting
>>>>
>>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>
>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>
>>> (This patch doesn't apply to master, please rebase)
>>
>> Why are you trying to apply to master? This series (as most of my series) are
>> based on staging at the time it was sent. I tried to apply this patch today on
>> staging and I didn't find any issue.
> 
> No problems then, I thought the series was based on an older tree, but
> instead it was on step ahead.

I tend to base everything on staging because it is sometimes far ahead of master :). I
just realized that I haven't pushed a branch with this series applied. It is now 
pushed:

https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
branch cacheflush/v2

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-12-06 22:04       ` Stefano Stabellini
@ 2018-12-07 10:16         ` Julien Grall
  2018-12-07 16:56           ` Stefano Stabellini
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-07 10:16 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi Stefano,

On 06/12/2018 22:04, Stefano Stabellini wrote:
> On Wed, 5 Dec 2018, Julien Grall wrote:
>> On 04/12/2018 23:59, Stefano Stabellini wrote:
>>> On Tue, 4 Dec 2018, Julien Grall wrote:
>>>> A follow-up patch will re-purpose the valid bit of LPAE entries to
>>>> generate fault even on entry containing valid information.
>>>>
>>>> This means that when translating a guest VA to guest PA (e.g IPA) will
>>>> fail if the Stage-2 entries used have the valid bit unset. Because of
>>>> that, we need to fallback to walk the page-table in software to check
>>>> whether the fault was expected.
>>>>
>>>> This patch adds the software page-table walk on all the translation
>>>> fault. It would be possible in the future to avoid pointless walk when
>>>> the fault in PAR_EL1 is not a translation fault.
>>>>
>>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>>
>>>> ---
>>>>
>>>> There are a couple of TODO in the code. They are clean-up and performance
>>>> improvement (e.g when the fault cannot be handled) that could be delayed
>>>> after
>>>> the series has been merged.
>>>>
>>>>       Changes in v2:
>>>>           - Check stage-2 permission during software lookup
>>>>           - Fix typoes
>>>> ---
>>>>    xen/arch/arm/p2m.c | 66
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++------
>>>>    1 file changed, 59 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>>>> index 47b54c792e..39680eeb6e 100644
>>>> --- a/xen/arch/arm/p2m.c
>>>> +++ b/xen/arch/arm/p2m.c
>>>> @@ -6,6 +6,7 @@
>>>>      #include <asm/event.h>
>>>>    #include <asm/flushtlb.h>
>>>> +#include <asm/guest_walk.h>
>>>>    #include <asm/page.h>
>>>>      #define MAX_VMID_8_BIT  (1UL << 8)
>>>> @@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu *v,
>>>> vaddr_t va,
>>>>        struct page_info *page = NULL;
>>>>        paddr_t maddr = 0;
>>>>        uint64_t par;
>>>> +    mfn_t mfn;
>>>> +    p2m_type_t t;
>>>>          /*
>>>>         * XXX: To support a different vCPU, we would need to load the
>>>> @@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu *v,
>>>> vaddr_t va,
>>>>        par = gvirt_to_maddr(va, &maddr, flags);
>>>>        p2m_read_unlock(p2m);
>>>>    +    /*
>>>> +     * gvirt_to_maddr may fail if the entry does not have the valid bit
>>>> +     * set. Fallback to the second method:
>>>> +     *  1) Translate the VA to IPA using software lookup -> Stage-1
>>>> page-table
>>>> +     *  may not be accessible because the stage-2 entries may have valid
>>>> +     *  bit unset.
>>>> +     *  2) Software lookup of the MFN
>>>> +     *
>>>> +     * Note that when memaccess is enabled, we instead call directly
>>>> +     * p2m_mem_access_check_and_get_page(...). Because the function is a
>>>> +     * a variant of the methods described above, it will be able to
>>>> +     * handle entries with valid bit unset.
>>>> +     *
>>>> +     * TODO: Integrate more nicely memaccess with the rest of the
>>>> +     * function.
>>>> +     * TODO: Use the fault error in PAR_EL1 to avoid pointless
>>>> +     *  translation.
>>>> +     */
>>>>        if ( par )
>>>>        {
>>>> +        paddr_t ipa;
>>>> +        unsigned int s1_perms;
>>>> +
>>>>            /*
>>>>             * When memaccess is enabled, the translation GVA to MADDR may
>>>>             * have failed because of a permission fault.
>>>> @@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct vcpu
>>>> *v, vaddr_t va,
>>>>            if ( p2m->mem_access_enabled )
>>>>                return p2m_mem_access_check_and_get_page(va, flags, v);
>>>>    -        dprintk(XENLOG_G_DEBUG,
>>>> -                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx
>>>> par=%#"PRIx64"\n",
>>>> -                v, va, flags, par);
>>>> -        return NULL;
>>>> +        /*
>>>> +         * The software stage-1 table walk can still fail, e.g, if the
>>>> +         * GVA is not mapped.
>>>> +         */
>>>> +        if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
>>>> +        {
>>>> +            dprintk(XENLOG_G_DEBUG,
>>>> +                    "%pv: Failed to walk page-table va %#"PRIvaddr"\n",
>>>> v, va);
>>>> +            return NULL;
>>>> +        }
>>>> +
>>>> +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
>>>> +        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
>>>> +            return NULL;
>>>> +
>>>> +        /*
>>>> +         * Check permission that are assumed by the caller. For instance
>>>> +         * in case of guestcopy, the caller assumes that the translated
>>>> +         * page can be accessed with the requested permissions. If this
>>>> +         * is not the case, we should fail.
>>>> +         *
>>>> +         * Please note that we do not check for the GV2M_EXEC
>>>> +         * permission. This is fine because the hardware-based
>>>> translation
>>>> +         * instruction does not test for execute permissions.
>>>> +         */
>>>> +        if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
>>>> +            return NULL;
>>>> +
>>>> +        if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
>>>> +            return NULL;
>>>
>>> The patch looks good enough now. One question: is it a requirement that
>>> the page we are trying to translate is of type p2m_ram_*? Could
>>> get_page_from_gva be genuinely called passing a page of a different
>>> kind, such as p2m_mmio_direct_* or p2m_map_foreign? Today, it is not the
>>> case, but I wonder if it is something we want to consider?
>>
>> This function can only possibly work with p2m_ram_* because of the
>> get_page(...) below, indeed the page should belong to the domain.
>>
>> Effectively this function will only be used for hypercall as you use a virtual
>> address. I question the value of allowing a guest to do a hypercall with the
>> data backed in any other memories than guest RAM. For the foreign mapping,
>> this could potentially end up with a leakage.
> 
> OK.

I can probably add a few more dprintk() in the error paths to help a developer 
diagnose the problem if that ever happens. What do you think?
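
Something along these lines on the failing paths, for instance (purely
illustrative):

    if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
    {
        dprintk(XENLOG_G_DEBUG,
                "%pv: gva %#"PRIvaddr" maps non-writable RAM (type %d)\n",
                v, va, t);
        return NULL;
    }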

> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you!

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-12-04 20:35   ` Razvan Cojocaru
  2018-12-06 22:32     ` Stefano Stabellini
@ 2018-12-07 10:17     ` Julien Grall
  1 sibling, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-07 10:17 UTC (permalink / raw)
  To: Razvan Cojocaru, xen-devel; +Cc: sstabellini, Tamas K Lengyel

Hi Razvan,

On 04/12/2018 20:35, Razvan Cojocaru wrote:
> On 12/4/18 10:26 PM, Julien Grall wrote:
>> With the recent changes, a P2M entry may be populated but may as not
>> valid. In some situation, it would be useful to know whether the entry
> 
> I think you mean to say "may not be valid"?

Correct. I will fix it in the next version.

> 
>> has been marked available to guest in order to perform a specific
>> action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> Other than that,
> 
> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

Thank you for the review!

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range
  2018-12-06 23:53   ` Stefano Stabellini
@ 2018-12-07 10:18     ` Julien Grall
  0 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-07 10:18 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi Stefano,

On 06/12/2018 23:53, Stefano Stabellini wrote:
> On Tue, 4 Dec 2018, Julien Grall wrote:
>> A follow-up patch will add support for preemption in p2m_cache_flush_range.
>> Because of the complexity for the 2 loops, it would be necessary to add
>> preemption in both of them.
>>
>> This can be avoided by merging the 2 loops together and still keeping
>> the code fairly simple to read and extend.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>      Changes in v2:
>>          - Patch added
>> ---
>>   xen/arch/arm/p2m.c | 52 +++++++++++++++++++++++++++++++++++++---------------
>>   1 file changed, 37 insertions(+), 15 deletions(-)
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index c713226561..db22b53bfd 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1527,7 +1527,8 @@ int relinquish_p2m_mapping(struct domain *d)
>>   int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>>   {
>>       struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> -    gfn_t next_gfn;
>> +    gfn_t next_block_gfn;
>> +    mfn_t mfn = INVALID_MFN;
>>       p2m_type_t t;
>>       unsigned int order;
>>   
>> @@ -1542,24 +1543,45 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>>       start = gfn_max(start, p2m->lowest_mapped_gfn);
>>       end = gfn_min(end, p2m->max_mapped_gfn);
>>   
>> -    for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
>> -    {
>> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>> +    next_block_gfn = start;
>>   
>> -        next_gfn = gfn_next_boundary(start, order);
>> -
>> -        /* Skip hole and non-RAM page */
>> -        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>> -            continue;
>> -
>> -        /* XXX: Implement preemption */
>> -        while ( gfn_x(start) < gfn_x(next_gfn) )
>> +    while ( gfn_x(start) < gfn_x(end) )
>> +    {
>> +        /*
>> +         * We want to flush page by page as:
>> +         *  - it may not be possible to map the full block (can be up to 1GB)
>> +         *    in Xen memory
>> +         *  - we may want to do fine grain preemption as flushing multiple
>> +         *    page in one go may take a long time
>> +         *
>> +         * As p2m_get_entry is able to return the size of the mapping
>> +         * in the p2m, it is pointless to execute it for each page.
>> +         *
>> +         * We can optimize it by tracking the gfn of the next
>> +         * block. So we will only call p2m_get_entry for each block (can
>> +         * be up to 1GB).
>> +         */
>> +        if ( gfn_eq(start, next_block_gfn) )
>>           {
>> -            flush_page_to_ram(mfn_x(mfn), false);
>> +            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>> +            next_block_gfn = gfn_next_boundary(start, order);
>>   
>> -            start = gfn_add(start, 1);
>> -            mfn = mfn_add(mfn, 1);
>> +            /*
>> +             * The following regions can be skipped:
>> +             *      - Hole
>> +             *      - non-RAM
>> +             */
> 
> I think this comment is superfluous as the code is already obvious. You
> can remove it.

I was over-cautious :). I will drop it.

> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you for the review!

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
  2018-12-06 23:32   ` Stefano Stabellini
@ 2018-12-07 11:15     ` Julien Grall
  2018-12-07 22:11       ` Stefano Stabellini
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-07 11:15 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi Stefano,

On 06/12/2018 23:32, Stefano Stabellini wrote:
> On Tue, 4 Dec 2018, Julien Grall wrote:
>> p2m_cache_flush_range does not yet support preemption, this may be an
>> issue as cleaning the cache can take a long time. While the current
>> caller (XEN_DOMCTL_cacheflush) does not stricly require preemption, this
>> will be necessary for new caller in a follow-up patch.
>>
>> The preemption implemented is quite simple, a counter is incremented by:
>>      - 1 on region skipped
>>      - 10 for each page requiring a flush
>>
>> When the counter reach 512 or above, we will check if preemption is
>> needed. If not, the counter will be reset to 0. If yes, the function
>> will stop, update start (to allow resuming later on) and return
>> -ERESTART. This allows the caller to decide how the preemption will be
>> done.
>>
>> For now, XEN_DOMCTL_cacheflush will continue to ignore the preemption.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>      Changes in v2:
>>          - Patch added
>> ---
>>   xen/arch/arm/domctl.c     |  8 +++++++-
>>   xen/arch/arm/p2m.c        | 35 ++++++++++++++++++++++++++++++++---
>>   xen/include/asm-arm/p2m.h |  4 +++-
>>   3 files changed, 42 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
>> index 20691528a6..9da88b8c64 100644
>> --- a/xen/arch/arm/domctl.c
>> +++ b/xen/arch/arm/domctl.c
>> @@ -54,6 +54,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>>       {
>>           gfn_t s = _gfn(domctl->u.cacheflush.start_pfn);
>>           gfn_t e = gfn_add(s, domctl->u.cacheflush.nr_pfns);
>> +        int rc;
> 
> This is unnecessary...
> 
> 
>>           if ( domctl->u.cacheflush.nr_pfns > (1U<<MAX_ORDER) )
>>               return -EINVAL;
>> @@ -61,7 +62,12 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>>           if ( gfn_x(e) < gfn_x(s) )
>>               return -EINVAL;
>>   
>> -        return p2m_cache_flush_range(d, s, e);
>> +        /* XXX: Handle preemption */
>> +        do
>> +            rc = p2m_cache_flush_range(d, &s, e);
>> +        while ( rc == -ERESTART );
> 
> ... you can just do:
> 
>    while ( -ERESTART == p2m_cache_flush_range(d, &s, e) )
> 
> But given that it is just style, I'll leave it up to you.

I don't much like the idea of having an empty loop body. It is error-prone 
depending on whether you write do {} while ( ... ) or while ( ... );

So I would prefer to stick with a temporary variable.
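
i.e. the two shapes under discussion, side by side (sketch only):

    /* empty loop body, which I find error-prone: */
    while ( p2m_cache_flush_range(d, &s, e) == -ERESTART )
        ;

    /* vs. keeping the temporary: */
    do
        rc = p2m_cache_flush_range(d, &s, e);
    while ( rc == -ERESTART );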

> 
> 
>> +        return rc;
>>       }
>>       case XEN_DOMCTL_bind_pt_irq:
>>       {
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index db22b53bfd..ca9f0d9ebe 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1524,13 +1524,17 @@ int relinquish_p2m_mapping(struct domain *d)
>>       return rc;
>>   }
>>   
>> -int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>> +int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>>   {
>>       struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>       gfn_t next_block_gfn;
>> +    gfn_t start = *pstart;
>>       mfn_t mfn = INVALID_MFN;
>>       p2m_type_t t;
>>       unsigned int order;
>> +    int rc = 0;
>> +    /* Counter for preemption */
>> +    unsigned long count = 0;
>>   
>>       /*
>>        * The operation cache flush will invalidate the RAM assigned to the
>> @@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>>   
>>       while ( gfn_x(start) < gfn_x(end) )
>>       {
>> +       /*
>> +         * Cleaning the cache for the P2M may take a long time. So we
>> +         * need to be able to preempt. We will arbitrarily preempt every
>> +         * time count reach 512 or above.
>> +
>> +         *
>> +         * The count will be incremented by:
>> +         *  - 1 on region skipped
>> +         *  - 10 for each page requiring a flush
> 
> Why this choice? A page flush should cost much more than 10x a region
> skipped, more like 100x or 1000x. In fact, doing the full loop without
> calling flush_page_to_ram should be cheap and fast, right?.

It is cheaper than flushing the page but it still has a cost. You have to walk 
the stage-2 in software, which requires mapping the tables. As not all of the 
memory is mapped in the hypervisor on arm32, this requires a map/unmap 
operation. On arm64 the full memory is mapped so far, so the map/unmap is 
pretty much a NOP.

> I would:
> 
> - not increase count on region skipped at all
> - increase it by 1 on each page requiring a flush
> - set the limit lower, if we go with your proposal it would be about 50,
>    I am not sure what the limit should be though
I don't think you can avoid incrementing count on skipped regions. While one 
lookup is pretty cheap, all the lookups over holes added together may take a 
pretty long time.

Even though stage-2 mappings are handled by the hypervisor, the guest is still 
somewhat in control of them because it can balloon pages in/out. That operation 
may end up shattering stage-2 mappings.

It would be feasible for a guest to shatter 1GB of memory into 4KB stage-2 
mappings and then remove all the entries. The stage-2 would then contain 262144 
holes, resulting in 262144 iterations; so no matter how cheap each one is, the 
time spent without preemption would be significant.

The choice of the numbers 1 vs 10 is pretty much arbitrary. The real question is 
how often we want to check for pending softirqs: the check itself is pretty much 
trivial, yet preempting has a cost. With the current solution, we check for 
preemption every 512 holes or every 51 pages flushed (~204KB flushed).

This sounds OK to me. Feel free to suggest better numbers.
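
To spell out the weights being discussed (the names are only illustrative,
they are not in the patch):

    #define PREEMPT_COST_SKIPPED   1   /* p2m lookup over a hole / non-RAM block */
    #define PREEMPT_COST_FLUSHED  10   /* flush_page_to_ram() for one page */
    #define PREEMPT_CHECK_LIMIT  512   /* i.e. check every 512 holes or
                                        * every ~51 flushes (~204KB) */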

> 
> 
>> +         */
>> +        if ( count >= 512 )
>> +        {
>> +            if ( softirq_pending(smp_processor_id()) )
>> +            {
>> +                rc = -ERESTART;
>> +                break;
>> +            }
>> +            count = 0;
> 
> No need to set count to 0 here

Well, the code would not behave the same here. If you don't reset to 0, you will 
check softirq_pending() on every iteration once count has reached 512.

If you reset it to 0, you avoid checking softirq_pending() again until the next 
time count reaches 512.

Both are actually valid. It is just a matter of whether we assume that a 
softirq will happen soon after reaching 512.
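
In code terms, the two variants are (sketch):

    if ( count >= 512 )
    {
        if ( softirq_pending(smp_processor_id()) )
        {
            rc = -ERESTART;
            break;
        }
        count = 0;  /* with the reset: the next softirq_pending() check
                     * only happens after another 512 units of work */
    }
    /* without the reset, once count has crossed 512 the softirq_pending()
     * check runs on every subsequent iteration */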

> 
> 
>> +        }
>> +
>>           /*
>>            * We want to flush page by page as:
>>            *  - it may not be possible to map the full block (can be up to 1GB)
>> @@ -1573,22 +1596,28 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>>                */
>>               if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>>               {
>> +                count++;
> 
> This is just an iteration doing nothing, I would not increament count.

[...]

> This makes sense, but if we skip the count++ above, we might as well
> just count++ here and have a lower limit.

See above for why I think this can't work.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-06 23:32   ` Stefano Stabellini
@ 2018-12-07 13:22     ` Julien Grall
  2018-12-07 21:29       ` Stefano Stabellini
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-07 13:22 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi Stefano,

On 06/12/2018 23:32, Stefano Stabellini wrote:
> On Tue, 4 Dec 2018, Julien Grall wrote:
>> Set/Way operations are used to perform maintenance on a given cache.
>> At the moment, Set/Way operations are not trapped and therefore a guest
>> OS will directly act on the local cache. However, a vCPU may migrate to
>> another pCPU in the middle of the processor. This will result to have
>> cache with stall data (Set/Way are not propagated) potentially causing
>> crash. This may be the cause of heisenbug noticed in Osstest [1].
>>
>> Furthermore, Set/Way operations are not available on system cache. This
>> means that OS, such as Linux 32-bit, relying on those operations to
>> fully clean the cache before disabling MMU may break because data may
>> sits in system caches and not in RAM.
>>
>> For more details about Set/Way, see the talk "The Art of Virtualizing
>> Cache Maintenance" given at Xen Summit 2018 [2].
>>
>> In the context of Xen, we need to trap Set/Way operations and emulate
>> them. From the Arm Arm (B1.14.4 in DDI 046C.c), Set/Way operations are
>> difficult to virtualized. So we can assume that a guest OS using them will
>> suffer the consequence (i.e slowness) until developer removes all the usage
>> of Set/Way.
>>
>> As the software is not allowed to infer the Set/Way to Physical Address
>> mapping, Xen will need to go through the guest P2M and clean &
>> invalidate all the entries mapped.
>>
>> Because Set/Way happen in batch (a loop on all Set/Way of a cache), Xen
>> would need to go through the P2M for every instructions. This is quite
>> expensive and would severely impact the guest OS. The implementation is
>> re-using the KVM policy to limit the number of flush:
>>      - If we trap a Set/Way operations, we enable VM trapping (i.e
>>        HVC_EL2.TVM) to detect cache being turned on/off, and do a full
>>      clean.
>>      - We clean the caches when turning on and off
>>      - Once the caches are enabled, we stop trapping VM instructions
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
>> [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>      Changes in v2:
>>          - Fix emulation for Set/Way cache flush arm64 sysreg
>>          - Add support for preemption
>>          - Check cache status on every VM traps in Arm64
>>          - Remove spurious change
>> ---
>>   xen/arch/arm/arm64/vsysreg.c | 17 ++++++++
>>   xen/arch/arm/p2m.c           | 92 ++++++++++++++++++++++++++++++++++++++++++++
>>   xen/arch/arm/traps.c         | 25 +++++++++++-
>>   xen/arch/arm/vcpreg.c        | 22 +++++++++++
>>   xen/include/asm-arm/domain.h |  8 ++++
>>   xen/include/asm-arm/p2m.h    | 20 ++++++++++
>>   6 files changed, 183 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
>> index 16ac9c344a..8a85507d9d 100644
>> --- a/xen/arch/arm/arm64/vsysreg.c
>> +++ b/xen/arch/arm/arm64/vsysreg.c
>> @@ -34,9 +34,14 @@
>>   static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
>>                                  uint64_t *r, bool read)              \
>>   {                                                                   \
>> +    struct vcpu *v = current;                                       \
>> +    bool cache_enabled = vcpu_has_cache_enabled(v);                 \
>> +                                                                    \
>>       GUEST_BUG_ON(read);                                             \
>>       WRITE_SYSREG64(*r, reg);                                        \
>>                                                                       \
>> +    p2m_toggle_cache(v, cache_enabled);                             \
>> +                                                                    \
>>       return true;                                                    \
>>   }
>>   
>> @@ -85,6 +90,18 @@ void do_sysreg(struct cpu_user_regs *regs,
>>           break;
>>   
>>       /*
>> +     * HCR_EL2.TSW
>> +     *
>> +     * ARMv8 (DDI 0487B.b): Table D1-42
>> +     */
>> +    case HSR_SYSREG_DCISW:
>> +    case HSR_SYSREG_DCCSW:
>> +    case HSR_SYSREG_DCCISW:
>> +        if ( !hsr.sysreg.read )
>> +            p2m_set_way_flush(current);
>> +        break;
>> +
>> +    /*
>>        * HCR_EL2.TVM
>>        *
>>        * ARMv8 (DDI 0487D.a): Table D1-38
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index ca9f0d9ebe..8ee6ff7bd7 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -3,6 +3,7 @@
>>   #include <xen/iocap.h>
>>   #include <xen/lib.h>
>>   #include <xen/sched.h>
>> +#include <xen/softirq.h>
>>   
>>   #include <asm/event.h>
>>   #include <asm/flushtlb.h>
>> @@ -1620,6 +1621,97 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>>       return rc;
>>   }
>>   
>> +/*
>> + * Clean & invalidate RAM associated to the guest vCPU.
>> + *
>> + * The function can only work with the current vCPU and should be called
>> + * with IRQ enabled as the vCPU could get preempted.
>> + */
>> +void p2m_flush_vm(struct vcpu *v)
>> +{
>> +    int rc;
>> +    gfn_t start = _gfn(0);
>> +
>> +    ASSERT(v == current);
>> +    ASSERT(local_irq_is_enabled());
>> +    ASSERT(v->arch.need_flush_to_ram);
>> +
>> +    do
>> +    {
>> +        rc = p2m_cache_flush_range(v->domain, &start, _gfn(ULONG_MAX));
>> +        if ( rc == -ERESTART )
>> +            do_softirq();
> 
> Shouldn't we store somewhere where we left off? Specifically the value
> of `start' when rc == -ERESTART? Maybe we change need_flush_to_ram to
> gfn_t and used it to store `start'?

It is not necessary on Arm. The hypervisor stack is per-vCPU and we will always 
return to where we were preempted.

> 
> 
>> +    } while ( rc == -ERESTART );
>> +
>> +    if ( rc != 0 )
>> +        gprintk(XENLOG_WARNING,
>> +                "P2M has not been correctly cleaned (rc = %d)\n",
>> +                rc);
>> +
>> +    v->arch.need_flush_to_ram = false;
>> +}
>> +
>> +/*
>> + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
>> + * easily virtualized).
>> + *
>> + * Main problems:
>> + *  - S/W ops are local to a CPU (not broadcast)
>> + *  - We have line migration behind our back (speculation)
>> + *  - System caches don't support S/W at all (damn!)
>> + *
>> + * In the face of the above, the best we can do is to try and convert
>> + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
>> + * to PA mapping, it can only use S/W to nuke the whole cache, which is
>> + * rather a good thing for us.
>> + *
>> + * Also, it is only used when turning caches on/off ("The expected
>> + * usage of the cache maintenance instructions that operate by set/way
>> + * is associated with the powerdown and powerup of caches, if this is
>> + * required by the implementation.").
>> + *
>> + * We use the following policy:
>> + *  - If we trap a S/W operation, we enabled VM trapping to detect
>> + *  caches being turned on/off, and do a full clean.
>> + *
>> + *  - We flush the caches on both caches being turned on and off.
>> + *
>> + *  - Once the caches are enabled, we stop trapping VM ops.
>> + */
>> +void p2m_set_way_flush(struct vcpu *v)
>> +{
>> +    /* This function can only work with the current vCPU. */
>> +    ASSERT(v == current);
>> +
>> +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
>> +    {
>> +        v->arch.need_flush_to_ram = true;
>> +        vcpu_hcr_set_flags(v, HCR_TVM);
>> +    }
>> +}
>> +
>> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
>> +{
>> +    bool now_enabled = vcpu_has_cache_enabled(v);
>> +
>> +    /* This function can only work with the current vCPU. */
>> +    ASSERT(v == current);
>> +
>> +    /*
>> +     * If switching the MMU+caches on, need to invalidate the caches.
>> +     * If switching it off, need to clean the caches.
>> +     * Clean + invalidate does the trick always.
>> +     */
>> +    if ( was_enabled != now_enabled )
>> +    {
>> +        v->arch.need_flush_to_ram = true;
>> +    }
>> +
>> +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
>> +    if ( now_enabled )
>> +        vcpu_hcr_clear_flags(v, HCR_TVM);
>> +}
>> +
>>   mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
>>   {
>>       return p2m_lookup(d, gfn, NULL);
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 02665cc7b4..221c762ada 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -97,7 +97,7 @@ register_t get_default_hcr_flags(void)
>>   {
>>       return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
>>                (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
>> -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
>> +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>>   }
>>   
>>   static enum {
>> @@ -2258,10 +2258,33 @@ static void check_for_pcpu_work(void)
>>       }
>>   }
>>   
>> +/*
>> + * Process pending work for the vCPU. Any call should be fast or
>> + * implement preemption.
>> + */
>> +static void check_for_vcpu_work(void)
>> +{
>> +    struct vcpu *v = current;
>> +
>> +    if ( likely(!v->arch.need_flush_to_ram) )
>> +        return;
>> +
>> +    /*
>> +     * Give a chance for the pCPU to process work before handling the vCPU
>> +     * pending work.
>> +     */
>> +    check_for_pcpu_work();
> 
> This is a bit awkward, basically we need to call check_for_pcpu_work
> before check_for_vcpu_work(), and after it. This is basically what we
> are doing:
> 
>    check_for_pcpu_work()
>    check_for_vcpu_work()
>    check_for_pcpu_work()

Not really, we only do that if there is vCPU work to do (see the 
!v->arch.need_flush_to_ram check). So we call it twice only when there is some 
vCPU work to handle (at the moment only the P2M flush).

We can't avoid the one after check_for_vcpu_work() because a softirq may be 
pending and already signalled (i.e. via an interrupt). Without that check it 
might not be executed before returning to the guest, introducing a long delay. 
That's why we execute the rest of the code with interrupts masked: if 
softirq_pending() returns 0 then we know there are no more softirqs pending to 
handle, and any new one will be signalled via an interrupt that can only come 
up once IRQs are unmasked.

The one before executing the vCPU work can potentially be avoided. The reason I 
added it is that it can take some time before p2m_flush_vm() calls do_softirq(). 
As we do this on the return to the guest, we may already have been executing for 
some time in the hypervisor, so this gives us a chance to preempt if the vCPU 
has consumed its slice.
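
So the flow on the return-to-guest path is roughly (sketch, annotations
mine):

    local_irq_disable();
    check_for_vcpu_work();   /* cheap unless need_flush_to_ram is set; in
                              * that case it first gives the pCPU a chance
                              * to run softirqs, then re-enables IRQs around
                              * p2m_flush_vm() so it can be preempted */
    check_for_pcpu_work();   /* final check with IRQs masked: anything
                              * raised after this point will come in as an
                              * interrupt once IRQs are unmasked in the
                              * guest */
    vgic_sync_to_lrs();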

> 
> Sounds like we should come up with something better but I don't have a
> concrete suggestion :-)
> 
> 
>> +    local_irq_enable();
>> +    p2m_flush_vm(v);
>> +    local_irq_disable();
>> +}
>> +
>>   void leave_hypervisor_tail(void)
>>   {
>>       local_irq_disable();
>>   
>> +    check_for_vcpu_work();
>>       check_for_pcpu_work();
>>   
>>       vgic_sync_to_lrs();
>> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
>> index 550c25ec3f..cdc91cdf5b 100644
>> --- a/xen/arch/arm/vcpreg.c
>> +++ b/xen/arch/arm/vcpreg.c
>> @@ -51,9 +51,14 @@
>>   #define TVM_REG(sz, func, reg...)                                           \
>>   static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>>   {                                                                           \
>> +    struct vcpu *v = current;                                               \
>> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>> +                                                                            \
>>       GUEST_BUG_ON(read);                                                     \
>>       WRITE_SYSREG##sz(*r, reg);                                              \
>>                                                                               \
>> +    p2m_toggle_cache(v, cache_enabled);                                     \
>> +                                                                            \
>>       return true;                                                            \
>>   }
>>   
>> @@ -71,6 +76,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>>   static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>>                                   bool read, bool hi)                         \
>>   {                                                                           \
>> +    struct vcpu *v = current;                                               \
>> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>>       register_t reg = READ_SYSREG(xreg);                                     \
>>                                                                               \
>>       GUEST_BUG_ON(read);                                                     \
>> @@ -86,6 +93,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>>       }                                                                       \
>>       WRITE_SYSREG(reg, xreg);                                                \
>>                                                                               \
>> +    p2m_toggle_cache(v, cache_enabled);                                     \
>> +                                                                            \
>>       return true;                                                            \
>>   }                                                                           \
>>                                                                               \
>> @@ -186,6 +195,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>>           break;
>>   
>>       /*
>> +     * HCR_EL2.TSW
>> +     *
>> +     * ARMv7 (DDI 0406C.b): B1.14.6
>> +     * ARMv8 (DDI 0487B.b): Table D1-42
>> +     */
>> +    case HSR_CPREG32(DCISW):
>> +    case HSR_CPREG32(DCCSW):
>> +    case HSR_CPREG32(DCCISW):
>> +        if ( !cp32.read )
>> +            p2m_set_way_flush(current);
>> +        break;
>> +
>> +    /*
>>        * HCR_EL2.TVM
>>        *
>>        * ARMv8 (DDI 0487D.a): Table D1-38
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 175de44927..f16b973e0d 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -202,6 +202,14 @@ struct arch_vcpu
>>       struct vtimer phys_timer;
>>       struct vtimer virt_timer;
>>       bool   vtimer_initialized;
>> +
>> +    /*
>> +     * The full P2M may require some cleaning (e.g when emulation
>> +     * set/way). As the action can take a long time, it requires
>> +     * preemption. So this is deferred until we return to the guest.
> 
> The reason for delaying the call to p2m_flush_vm until we return to the
> guest is that we need to call do_softirq to check whether we need to be
> preempted and we cannot make that call in the middle of the vcpreg.c
> handlers, right?
We need to make sure that do_softirq() is called without any lock held. With the 
current code, it would technically be possible to call do_softirq() directly in 
the vcpreg.c handlers. But I think that is not entirely future-proof.

So it is better to call do_softirq() with as little stack as possible, as that 
makes it easier to ensure that the callers are not holding any lock.

The infrastructure added should be re-usable for other sorts of work (e.g. ITS, 
ioreq) in the future.

Cheers,

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-05  8:37   ` Jan Beulich
@ 2018-12-07 13:24     ` Julien Grall
  0 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-07 13:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	Roger Pau Monne

Hi Jan,

On 05/12/2018 08:37, Jan Beulich wrote:
>>>> On 04.12.18 at 21:26, <julien.grall@arm.com> wrote:
>> At the moment, the implementation of Set/Way operations will go through
>> all the entries of the guest P2M and flush them. However, this is very
>> expensive and may render unusable a guest OS using them.
>>
>> For instance, Linux 32-bit will use Set/Way operations during secondary
>> CPU bring-up. As the implementation is really expensive, it may be possible
>> to hit the CPU bring-up timeout.
>>
>> To limit the Set/Way impact, we track which pages of the guest have been
>> accessed between batches of Set/Way operations. This is done
>> using bit[0] (aka valid bit) of the P2M entry.
>>
>> This patch introduces a new per-arch helper to perform actions just
>> before the guest is first unpaused. This will be used to invalidate the
>> P2M to track access from the start of the guest.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>
>> While we can spread d->creation_finished all over the code, the per-arch
>> helper to perform actions just before the guest is first unpaused can
>> bring a lot of benefit for both architecture. For instance, on Arm, the
>> flush to the instruction cache could be delayed until the domain is
>> first run. This would improve greatly the performance of creating guest.
> 
> Just the other day we had found a potential use on x86 as well
> (even if I already don't recall anymore what it was), so the
> addition is certainly helpful. It might have been nice to split
> introduction of the interface from what you actually want it to
> do on Arm, but irrespective of that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> for the non-Arm pieces here.

I am expecting the patch to be merged in the next couple of weeks. But I am 
happy to split it if you need it before.

Cheers,

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-12-07 10:16         ` Julien Grall
@ 2018-12-07 16:56           ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-07 16:56 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini

On Fri, 7 Dec 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 06/12/2018 22:04, Stefano Stabellini wrote:
> > On Wed, 5 Dec 2018, Julien Grall wrote:
> > > On 04/12/2018 23:59, Stefano Stabellini wrote:
> > > > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > > > A follow-up patch will re-purpose the valid bit of LPAE entries to
> > > > > generate fault even on entry containing valid information.
> > > > > 
> > > > > This means that when translating a guest VA to guest PA (e.g IPA) will
> > > > > fail if the Stage-2 entries used have the valid bit unset. Because of
> > > > > that, we need to fallback to walk the page-table in software to check
> > > > > whether the fault was expected.
> > > > > 
> > > > > This patch adds the software page-table walk on all the translation
> > > > > fault. It would be possible in the future to avoid pointless walk when
> > > > > the fault in PAR_EL1 is not a translation fault.
> > > > > 
> > > > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > > > 
> > > > > ---
> > > > > 
> > > > > There are a couple of TODO in the code. They are clean-up and
> > > > > performance
> > > > > improvement (e.g when the fault cannot be handled) that could be
> > > > > delayed
> > > > > after
> > > > > the series has been merged.
> > > > > 
> > > > >       Changes in v2:
> > > > >           - Check stage-2 permission during software lookup
> > > > >           - Fix typoes
> > > > > ---
> > > > >    xen/arch/arm/p2m.c | 66
> > > > > ++++++++++++++++++++++++++++++++++++++++++++++++------
> > > > >    1 file changed, 59 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > > > index 47b54c792e..39680eeb6e 100644
> > > > > --- a/xen/arch/arm/p2m.c
> > > > > +++ b/xen/arch/arm/p2m.c
> > > > > @@ -6,6 +6,7 @@
> > > > >      #include <asm/event.h>
> > > > >    #include <asm/flushtlb.h>
> > > > > +#include <asm/guest_walk.h>
> > > > >    #include <asm/page.h>
> > > > >      #define MAX_VMID_8_BIT  (1UL << 8)
> > > > > @@ -1430,6 +1431,8 @@ struct page_info *get_page_from_gva(struct vcpu
> > > > > *v,
> > > > > vaddr_t va,
> > > > >        struct page_info *page = NULL;
> > > > >        paddr_t maddr = 0;
> > > > >        uint64_t par;
> > > > > +    mfn_t mfn;
> > > > > +    p2m_type_t t;
> > > > >          /*
> > > > >         * XXX: To support a different vCPU, we would need to load the
> > > > > @@ -1446,8 +1449,29 @@ struct page_info *get_page_from_gva(struct vcpu
> > > > > *v,
> > > > > vaddr_t va,
> > > > >        par = gvirt_to_maddr(va, &maddr, flags);
> > > > >        p2m_read_unlock(p2m);
> > > > >    +    /*
> > > > > +     * gvirt_to_maddr may fail if the entry does not have the valid
> > > > > bit
> > > > > +     * set. Fallback to the second method:
> > > > > +     *  1) Translate the VA to IPA using software lookup -> Stage-1
> > > > > page-table
> > > > > +     *  may not be accessible because the stage-2 entries may have
> > > > > valid
> > > > > +     *  bit unset.
> > > > > +     *  2) Software lookup of the MFN
> > > > > +     *
> > > > > +     * Note that when memaccess is enabled, we instead call directly
> > > > > +     * p2m_mem_access_check_and_get_page(...). Because the function
> > > > > is a
> > > > > +     * a variant of the methods described above, it will be able to
> > > > > +     * handle entries with valid bit unset.
> > > > > +     *
> > > > > +     * TODO: Integrate more nicely memaccess with the rest of the
> > > > > +     * function.
> > > > > +     * TODO: Use the fault error in PAR_EL1 to avoid pointless
> > > > > +     *  translation.
> > > > > +     */
> > > > >        if ( par )
> > > > >        {
> > > > > +        paddr_t ipa;
> > > > > +        unsigned int s1_perms;
> > > > > +
> > > > >            /*
> > > > >             * When memaccess is enabled, the translation GVA to MADDR
> > > > > may
> > > > >             * have failed because of a permission fault.
> > > > > @@ -1455,20 +1479,48 @@ struct page_info *get_page_from_gva(struct
> > > > > vcpu
> > > > > *v, vaddr_t va,
> > > > >            if ( p2m->mem_access_enabled )
> > > > >                return p2m_mem_access_check_and_get_page(va, flags, v);
> > > > >    -        dprintk(XENLOG_G_DEBUG,
> > > > > -                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr"
> > > > > flags=0x%lx
> > > > > par=%#"PRIx64"\n",
> > > > > -                v, va, flags, par);
> > > > > -        return NULL;
> > > > > +        /*
> > > > > +         * The software stage-1 table walk can still fail, e.g, if
> > > > > the
> > > > > +         * GVA is not mapped.
> > > > > +         */
> > > > > +        if ( !guest_walk_tables(v, va, &ipa, &s1_perms) )
> > > > > +        {
> > > > > +            dprintk(XENLOG_G_DEBUG,
> > > > > +                    "%pv: Failed to walk page-table va
> > > > > %#"PRIvaddr"\n",
> > > > > v, va);
> > > > > +            return NULL;
> > > > > +        }
> > > > > +
> > > > > +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
> > > > > +        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
> > > > > +            return NULL;
> > > > > +
> > > > > +        /*
> > > > > +         * Check permission that are assumed by the caller. For
> > > > > instance
> > > > > +         * in case of guestcopy, the caller assumes that the
> > > > > translated
> > > > > +         * page can be accessed with the requested permissions. If
> > > > > this
> > > > > +         * is not the case, we should fail.
> > > > > +         *
> > > > > +         * Please note that we do not check for the GV2M_EXEC
> > > > > +         * permission. This is fine because the hardware-based
> > > > > translation
> > > > > +         * instruction does not test for execute permissions.
> > > > > +         */
> > > > > +        if ( (flags & GV2M_WRITE) && !(s1_perms & GV2M_WRITE) )
> > > > > +            return NULL;
> > > > > +
> > > > > +        if ( (flags & GV2M_WRITE) && t != p2m_ram_rw )
> > > > > +            return NULL;
> > > > 
> > > > The patch looks good enough now. One question: is it a requirement that
> > > > the page we are trying to translate is of type p2m_ram_*? Could
> > > > get_page_from_gva be genuinely called passing a page of a different
> > > > kind, such as p2m_mmio_direct_* or p2m_map_foreign? Today, it is not the
> > > > case, but I wonder if it is something we want to consider?
> > > 
> > > This function can only possibly work with p2m_ram_* because of the
> > > get_page(...) below, indeed the page should belong to the domain.
> > > 
> > > Effectively this function will only be used for hypercall as you use a
> > > virtual
> > > address. I question the value of allowing a guest to do a hypercall with
> > > the
> > > data backed in any other memories than guest RAM. For the foreign mapping,
> > > this could potentially end up with a leakage.
> > 
> > OK.
> 
> I can probably add a few more dprintk in the error paths to help the developer
> diagnose the problem if that ever happens. What do you think?

Maybe add a short one-line comment on top saying "this function only works
for guest RAM pages (no foreign mappings or MMIO regions)" just to make
it very obvious, but it is not necessary, up to you.
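
Purely as an illustration of those two suggestions (the comment plus the extra
dprintk Julien mentioned), and not the exact code I am asking for, the relevant
part of get_page_from_gva() could end up looking roughly like this; the message
wording is made up:

    /*
     * Translate a guest virtual address and get a reference on the
     * backing page. This only works for guest RAM pages (no foreign
     * mappings, no MMIO regions).
     */
    ...
        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
        if ( mfn_eq(INVALID_MFN, mfn) || !p2m_is_ram(t) )
        {
            /* Hypothetical extra diagnostic in the error path. */
            dprintk(XENLOG_G_DEBUG,
                    "%pv: Unexpected p2m type for va %#"PRIvaddr"\n", v, va);
            return NULL;
        }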


> > 
> > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> Thank you!
> 
> Cheers,
> 
> -- 
> Julien Grall
> 


* Re: [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-07 13:22     ` Julien Grall
@ 2018-12-07 21:29       ` Stefano Stabellini
  2018-12-12 15:33         ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-07 21:29 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, dfaggioli

CC'ing Dario

Dario, please give a look at the preemption question below.


On Fri, 7 Dec 2018, Julien Grall wrote:
> On 06/12/2018 23:32, Stefano Stabellini wrote:
> > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > Set/Way operations are used to perform maintenance on a given cache.
> > > At the moment, Set/Way operations are not trapped and therefore a guest
> > > OS will directly act on the local cache. However, a vCPU may migrate to
> > > another pCPU in the middle of the processor. This will result to have
> > > cache with stall data (Set/Way are not propagated) potentially causing
> > > crash. This may be the cause of heisenbug noticed in Osstest [1].
> > > 
> > > Furthermore, Set/Way operations are not available on system cache. This
> > > means that OS, such as Linux 32-bit, relying on those operations to
> > > fully clean the cache before disabling MMU may break because data may
> > > sits in system caches and not in RAM.
> > > 
> > > For more details about Set/Way, see the talk "The Art of Virtualizing
> > > Cache Maintenance" given at Xen Summit 2018 [2].
> > > 
> > > In the context of Xen, we need to trap Set/Way operations and emulate
> > > them. From the Arm Arm (B1.14.4 in DDI 0406C.c), Set/Way operations are
> > > difficult to virtualize. So we can assume that a guest OS using them will
> > > suffer the consequences (i.e. slowness) until developers remove all the
> > > usage of Set/Way.
> > > 
> > > As the software is not allowed to infer the Set/Way to Physical Address
> > > mapping, Xen will need to go through the guest P2M and clean &
> > > invalidate all the entries mapped.
> > > 
> > > Because Set/Way operations happen in batches (a loop over all Set/Way of a
> > > cache), Xen would need to go through the P2M for every instruction. This is
> > > quite expensive and would severely impact the guest OS. The implementation
> > > re-uses the KVM policy to limit the number of flushes:
> > >      - If we trap a Set/Way operations, we enable VM trapping (i.e
> > >        HVC_EL2.TVM) to detect cache being turned on/off, and do a full
> > >      clean.
> > >      - We clean the caches when turning on and off
> > >      - Once the caches are enabled, we stop trapping VM instructions
> > > 
> > > [1]
> > > https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
> > > [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > 
> > > ---
> > >      Changes in v2:
> > >          - Fix emulation for Set/Way cache flush arm64 sysreg
> > >          - Add support for preemption
> > >          - Check cache status on every VM traps in Arm64
> > >          - Remove spurious change
> > > ---
> > >   xen/arch/arm/arm64/vsysreg.c | 17 ++++++++
> > >   xen/arch/arm/p2m.c           | 92
> > > ++++++++++++++++++++++++++++++++++++++++++++
> > >   xen/arch/arm/traps.c         | 25 +++++++++++-
> > >   xen/arch/arm/vcpreg.c        | 22 +++++++++++
> > >   xen/include/asm-arm/domain.h |  8 ++++
> > >   xen/include/asm-arm/p2m.h    | 20 ++++++++++
> > >   6 files changed, 183 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> > > index 16ac9c344a..8a85507d9d 100644
> > > --- a/xen/arch/arm/arm64/vsysreg.c
> > > +++ b/xen/arch/arm/arm64/vsysreg.c
> > > @@ -34,9 +34,14 @@
> > >   static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
> > >                                  uint64_t *r, bool read)              \
> > >   {                                                                   \
> > > +    struct vcpu *v = current;                                       \
> > > +    bool cache_enabled = vcpu_has_cache_enabled(v);                 \
> > > +                                                                    \
> > >       GUEST_BUG_ON(read);                                             \
> > >       WRITE_SYSREG64(*r, reg);                                        \
> > >                                                                       \
> > > +    p2m_toggle_cache(v, cache_enabled);                             \
> > > +                                                                    \
> > >       return true;                                                    \
> > >   }
> > >   @@ -85,6 +90,18 @@ void do_sysreg(struct cpu_user_regs *regs,
> > >           break;
> > >         /*
> > > +     * HCR_EL2.TSW
> > > +     *
> > > +     * ARMv8 (DDI 0487B.b): Table D1-42
> > > +     */
> > > +    case HSR_SYSREG_DCISW:
> > > +    case HSR_SYSREG_DCCSW:
> > > +    case HSR_SYSREG_DCCISW:
> > > +        if ( !hsr.sysreg.read )
> > > +            p2m_set_way_flush(current);
> > > +        break;
> > > +
> > > +    /*
> > >        * HCR_EL2.TVM
> > >        *
> > >        * ARMv8 (DDI 0487D.a): Table D1-38
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > index ca9f0d9ebe..8ee6ff7bd7 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -3,6 +3,7 @@
> > >   #include <xen/iocap.h>
> > >   #include <xen/lib.h>
> > >   #include <xen/sched.h>
> > > +#include <xen/softirq.h>
> > >     #include <asm/event.h>
> > >   #include <asm/flushtlb.h>
> > > @@ -1620,6 +1621,97 @@ int p2m_cache_flush_range(struct domain *d, gfn_t
> > > *pstart, gfn_t end)
> > >       return rc;
> > >   }
> > >   +/*
> > > + * Clean & invalidate RAM associated to the guest vCPU.
> > > + *
> > > + * The function can only work with the current vCPU and should be called
> > > + * with IRQ enabled as the vCPU could get preempted.
> > > + */
> > > +void p2m_flush_vm(struct vcpu *v)
> > > +{
> > > +    int rc;
> > > +    gfn_t start = _gfn(0);
> > > +
> > > +    ASSERT(v == current);
> > > +    ASSERT(local_irq_is_enabled());
> > > +    ASSERT(v->arch.need_flush_to_ram);
> > > +
> > > +    do
> > > +    {
> > > +        rc = p2m_cache_flush_range(v->domain, &start, _gfn(ULONG_MAX));
> > > +        if ( rc == -ERESTART )
> > > +            do_softirq();
> > 
> > Shouldn't we store somewhere where we left off? Specifically the value
> > of `start' when rc == -ERESTART? Maybe we change need_flush_to_ram to
> > gfn_t and used it to store `start'?
> 
> It is not necessary on Arm. The hypervisor stack is per-vCPU and we will
> always return to where we were preempted.

Ah, right! Even better.


> > > +    } while ( rc == -ERESTART );
> > > +
> > > +    if ( rc != 0 )
> > > +        gprintk(XENLOG_WARNING,
> > > +                "P2M has not been correctly cleaned (rc = %d)\n",
> > > +                rc);
> > > +
> > > +    v->arch.need_flush_to_ram = false;
> > > +}
> > > +
> > > +/*
> > > + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
> > > + * easily virtualized).
> > > + *
> > > + * Main problems:
> > > + *  - S/W ops are local to a CPU (not broadcast)
> > > + *  - We have line migration behind our back (speculation)
> > > + *  - System caches don't support S/W at all (damn!)
> > > + *
> > > + * In the face of the above, the best we can do is to try and convert
> > > + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
> > > + * to PA mapping, it can only use S/W to nuke the whole cache, which is
> > > + * rather a good thing for us.
> > > + *
> > > + * Also, it is only used when turning caches on/off ("The expected
> > > + * usage of the cache maintenance instructions that operate by set/way
> > > + * is associated with the powerdown and powerup of caches, if this is
> > > + * required by the implementation.").
> > > + *
> > > + * We use the following policy:
> > > + *  - If we trap a S/W operation, we enabled VM trapping to detect
> > > + *  caches being turned on/off, and do a full clean.
> > > + *
> > > + *  - We flush the caches on both caches being turned on and off.
> > > + *
> > > + *  - Once the caches are enabled, we stop trapping VM ops.
> > > + */
> > > +void p2m_set_way_flush(struct vcpu *v)
> > > +{
> > > +    /* This function can only work with the current vCPU. */
> > > +    ASSERT(v == current);
> > > +
> > > +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
> > > +    {
> > > +        v->arch.need_flush_to_ram = true;
> > > +        vcpu_hcr_set_flags(v, HCR_TVM);
> > > +    }
> > > +}
> > > +
> > > +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
> > > +{
> > > +    bool now_enabled = vcpu_has_cache_enabled(v);
> > > +
> > > +    /* This function can only work with the current vCPU. */
> > > +    ASSERT(v == current);
> > > +
> > > +    /*
> > > +     * If switching the MMU+caches on, need to invalidate the caches.
> > > +     * If switching it off, need to clean the caches.
> > > +     * Clean + invalidate does the trick always.
> > > +     */
> > > +    if ( was_enabled != now_enabled )
> > > +    {
> > > +        v->arch.need_flush_to_ram = true;
> > > +    }
> > > +
> > > +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
> > > +    if ( now_enabled )
> > > +        vcpu_hcr_clear_flags(v, HCR_TVM);
> > > +}
> > > +
> > >   mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
> > >   {
> > >       return p2m_lookup(d, gfn, NULL);
> > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> > > index 02665cc7b4..221c762ada 100644
> > > --- a/xen/arch/arm/traps.c
> > > +++ b/xen/arch/arm/traps.c
> > > @@ -97,7 +97,7 @@ register_t get_default_hcr_flags(void)
> > >   {
> > >       return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
> > >                (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
> > > -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
> > > +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
> > >   }
> > >     static enum {
> > > @@ -2258,10 +2258,33 @@ static void check_for_pcpu_work(void)
> > >       }
> > >   }
> > >   +/*
> > > + * Process pending work for the vCPU. Any call should be fast or
> > > + * implement preemption.
> > > + */
> > > +static void check_for_vcpu_work(void)
> > > +{
> > > +    struct vcpu *v = current;
> > > +
> > > +    if ( likely(!v->arch.need_flush_to_ram) )
> > > +        return;
> > > +
> > > +    /*
> > > +     * Give a chance for the pCPU to process work before handling the vCPU
> > > +     * pending work.
> > > +     */
> > > +    check_for_pcpu_work();
> > 
> > This is a bit awkward, basically we need to call check_for_pcpu_work
> > before check_for_vcpu_work(), and after it. This is basically what we
> > are doing:
> > 
> >    check_for_pcpu_work()
> >    check_for_vcpu_work()
> >    check_for_pcpu_work()
> 
> Not really, we only do that if we have vCPU work to do (see the check
> !v->arch.need_flush_to_ram). So we only call it twice when we need to do some
> vCPU work (at the moment only the p2m).
> 
> We can't avoid the one after check_for_vcpu_work() because you may have
> softirqs pending and already signalled (i.e. via an interrupt).

I understand this.


> So you may not execute them before returning to the guest, introducing a
> long delay. That's why we execute the rest of the code with interrupts
> masked. If softirq_pending() returns 0 then you know there were no
> more softirqs pending to handle. All the new ones will be signalled via
> an interrupt that can only come up when IRQs are unmasked.
>
> The one before executing vCPU work can potentially be avoided. The reason I
> added it is that it can take some time before p2m_flush_vm() will call softirq. As
> we do this on return to the guest, we may have already been executing for some time
> in the hypervisor. So this gives us a chance to preempt if the vCPU has consumed
> its slice.

This one is difficult to tell whether it is important or if it would be
best avoided.

For Dario: basically we have a long-running operation to perform, and we
thought that the best place for it would be on the path returning to the
guest (leave_hypervisor_tail). The operation can interrupt itself by
checking softirq_pending() once in a while to avoid blocking the pCPU
for too long.

The question is: is it better to check softirq_pending() even before
starting? Or is checking every so often during the operation good enough? Does it
even matter?


 
> > Sounds like we should come up with something better but I don't have a
> > concrete suggestion :-)
> > 
> > 
> > > +    local_irq_enable();
> > > +    p2m_flush_vm(v);
> > > +    local_irq_disable();
> > > +}
> > > +
> > >   void leave_hypervisor_tail(void)
> > >   {
> > >       local_irq_disable();
> > >   +    check_for_vcpu_work();
> > >       check_for_pcpu_work();
> > >         vgic_sync_to_lrs();
> > > diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> > > index 550c25ec3f..cdc91cdf5b 100644
> > > --- a/xen/arch/arm/vcpreg.c
> > > +++ b/xen/arch/arm/vcpreg.c
> > > @@ -51,9 +51,14 @@
> > >   #define TVM_REG(sz, func, reg...)                                          \
> > >   static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)   \
> > >   {                                                                           \
> > > +    struct vcpu *v = current;                                               \
> > > +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
> > > +                                                                            \
> > >       GUEST_BUG_ON(read);                                                     \
> > >       WRITE_SYSREG##sz(*r, reg);                                              \
> > >                                                                               \
> > > +    p2m_toggle_cache(v, cache_enabled);                                     \
> > > +                                                                            \
> > >       return true;                                                            \
> > >   }
> > > 
> > > @@ -71,6 +76,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
> > >   static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
> > >                                   bool read, bool hi)                         \
> > >   {                                                                           \
> > > +    struct vcpu *v = current;                                               \
> > > +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
> > >       register_t reg = READ_SYSREG(xreg);                                     \
> > >                                                                               \
> > >       GUEST_BUG_ON(read);                                                     \
> > > @@ -86,6 +93,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
> > >       }                                                                       \
> > >       WRITE_SYSREG(reg, xreg);                                                \
> > >                                                                               \
> > > +    p2m_toggle_cache(v, cache_enabled);                                     \
> > > +                                                                            \
> > >       return true;                                                            \
> > >   }                                                                           \
> > >                                                                               \
> > > @@ -186,6 +195,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
> > >           break;
> > >         /*
> > > +     * HCR_EL2.TSW
> > > +     *
> > > +     * ARMv7 (DDI 0406C.b): B1.14.6
> > > +     * ARMv8 (DDI 0487B.b): Table D1-42
> > > +     */
> > > +    case HSR_CPREG32(DCISW):
> > > +    case HSR_CPREG32(DCCSW):
> > > +    case HSR_CPREG32(DCCISW):
> > > +        if ( !cp32.read )
> > > +            p2m_set_way_flush(current);
> > > +        break;
> > > +
> > > +    /*
> > >        * HCR_EL2.TVM
> > >        *
> > >        * ARMv8 (DDI 0487D.a): Table D1-38
> > > diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> > > index 175de44927..f16b973e0d 100644
> > > --- a/xen/include/asm-arm/domain.h
> > > +++ b/xen/include/asm-arm/domain.h
> > > @@ -202,6 +202,14 @@ struct arch_vcpu
> > >       struct vtimer phys_timer;
> > >       struct vtimer virt_timer;
> > >       bool   vtimer_initialized;
> > > +
> > > +    /*
> > > +     * The full P2M may require some cleaning (e.g when emulation
> > > +     * set/way). As the action can take a long time, it requires
> > > +     * preemption. So this is deferred until we return to the guest.
> > 
> > The reason for delaying the call to p2m_flush_vm until we return to the
> > guest is that we need to call do_softirq to check whether we need to be
> > preempted and we cannot make that call in the middle of the vcpreg.c
> > handlers, right?
> We need to make sure that do_softirq() is called without any lock held. With the
> current code, it would technically be possible to call do_softirq() directly
> in the vcpreg.c handlers. But I think that is not entirely future-proof.
> 
> So it is better to call do_softirq() with as little stack as possible, as that
> makes it easier to ensure that the callers are not holding any lock.
> 
> The infrastructure added should be re-usable for other sorts of work (e.g. ITS,
> ioreq) in the future.

I think it makes sense and it is easier to think about and to understand
compared to calling do_softirq in the middle of another complex
function. I would ask you to improve a bit the last sentence of this
comment, something like:

"It is deferred until we return to guest, where we can more easily check
for softirqs and preempt the vcpu safely."

It would almost be worth generalizing it even further, introducing a new
tasklet-like concept to schedule long-running work before returning to the
guest. An idea for the future.
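
Just to make that idea a little more concrete, a very rough sketch of what such
a mechanism could look like is below. Every name in it (vcpu_work,
vcpu_defer_work, vcpu_next_work, vcpu_work_done) is invented for illustration
and is not part of this series:

    /* Hypothetical deferred per-vCPU work, run from leave_hypervisor_tail(). */
    struct vcpu_work
    {
        struct list_head list;
        /* Must either be fast or return -ERESTART to be preemptible. */
        int (*fn)(struct vcpu *v, void *data);
        void *data;
    };

    /* Queue work to be executed on the path back to the guest. */
    void vcpu_defer_work(struct vcpu *v, struct vcpu_work *work);

    static void check_for_vcpu_work(void)
    {
        struct vcpu *v = current;
        struct vcpu_work *work;

        /* Runs with IRQs enabled so do_softirq() can preempt the vCPU. */
        while ( (work = vcpu_next_work(v)) != NULL )
        {
            if ( work->fn(v, work->data) == -ERESTART )
                do_softirq();
            else
                vcpu_work_done(v, work);
        }
    }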


* Re: [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-04 20:26 ` [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of " Julien Grall
  2018-12-05  8:37   ` Jan Beulich
  2018-12-06 12:21   ` Julien Grall
@ 2018-12-07 21:43   ` Stefano Stabellini
  2018-12-11 16:22     ` Julien Grall
  2 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-07 21:43 UTC (permalink / raw)
  To: Julien Grall
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel,
	Roger Pau Monné

On Tue, 4 Dec 2018, Julien Grall wrote:
> At the moment, the implementation of Set/Way operations will go through
> all the entries of the guest P2M and flush them. However, this is very
> expensive and may render unusable a guest OS using them.
> 
> For instance, Linux 32-bit will use Set/Way operations during secondary
> CPU bring-up. As the implementation is really expensive, it may be possible
> to hit the CPU bring-up timeout.
> 
> To limit the Set/Way impact, we track which pages of the guest have been
> accessed between batches of Set/Way operations. This is done
> using bit[0] (aka valid bit) of the P2M entry.
> 
> This patch introduces a new per-arch helper to perform actions just
> before the guest is first unpaused. This will be used to invalidate the
> P2M to track access from the start of the guest.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
> 
> While we can spread d->creation_finished all over the code, the per-arch
> helper to perform actions just before the guest is first unpaused can
> bring a lot of benefit for both architecture. For instance, on Arm, the
> flush to the instruction cache could be delayed until the domain is
> first run. This would improve greatly the performance of creating guest.
> 
> I am still doing the benchmark whether having a command line option is
> worth it. I will provide numbers as soon as I have them.
> 
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/arm/domain.c     | 14 ++++++++++++++
>  xen/arch/arm/p2m.c        | 30 ++++++++++++++++++++++++++++--
>  xen/arch/x86/domain.c     |  4 ++++
>  xen/common/domain.c       |  5 ++++-
>  xen/include/asm-arm/p2m.h |  2 ++
>  xen/include/xen/domain.h  |  2 ++
>  6 files changed, 54 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 1d926dcb29..41f101746e 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -767,6 +767,20 @@ int arch_domain_soft_reset(struct domain *d)
>      return -ENOSYS;
>  }
>  
> +void arch_domain_creation_finished(struct domain *d)
> +{
> +    /*
> +     * To avoid flushing the whole guest RAM on the first Set/Way, we
> +     * invalidate the P2M to track what has been accessed.
> +     *
> +     * This is only turned when IOMMU is not used or the page-table are
> +     * not shared because bit[0] (e.g valid bit) unset will result
> +     * IOMMU fault that could be not fixed-up.
> +     */
> +    if ( !iommu_use_hap_pt(d) )
> +        p2m_invalidate_root(p2m_get_hostp2m(d));
> +}
> +
>  static int is_guest_pv32_psr(uint32_t psr)
>  {
>      switch (psr & PSR_MODE_MASK)
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 8ee6ff7bd7..44ea3580cf 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
>  }
>  
>  /*
> + * Invalidate all entries in the root page-tables. This is
> + * useful to get fault on entry and do an action.
> + */
> +void p2m_invalidate_root(struct p2m_domain *p2m)
> +{
> +    unsigned int i;
> +
> +    p2m_write_lock(p2m);
> +
> +    for ( i = 0; i < P2M_ROOT_LEVEL; i++ )
> +        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
> +
> +    p2m_write_unlock(p2m);
> +}
> +
> +/*
>   * Resolve any translation fault due to change in the p2m. This
>   * includes break-before-make and valid bit cleared.
>   */
> @@ -1587,15 +1603,18 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>           */
>          if ( gfn_eq(start, next_block_gfn) )
>          {
> -            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +            bool valid;
> +
> +            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
>              next_block_gfn = gfn_next_boundary(start, order);
>  
>              /*
>               * The following regions can be skipped:
>               *      - Hole
>               *      - non-RAM
> +             *      - block with valid bit (bit[0]) unset
>               */
> -            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
> +            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) || !valid )
>              {
>                  count++;
>                  start = next_block_gfn;
> @@ -1629,6 +1648,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>   */
>  void p2m_flush_vm(struct vcpu *v)
>  {
> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
>      int rc;
>      gfn_t start = _gfn(0);
>  
> @@ -1648,6 +1668,12 @@ void p2m_flush_vm(struct vcpu *v)
>                  "P2M has not been correctly cleaned (rc = %d)\n",
>                  rc);
>  
> +    /*
> +     * Invalidate the p2m to track which page was modified by the guest
> +     * between call of p2m_flush_vm().
> +     */
> +    p2m_invalidate_root(p2m);

Does this mean that we are invalidating the p2m once more than
necessary, when the caches are finally enabled in Linux? Could that be
avoided by passing an additional argument to p2m_flush_vm?
Is the optimization I am suggesting unimportant?


>      v->arch.need_flush_to_ram = false;
>  }
>  
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index b4d59487ad..d28e3f9b15 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -762,6 +762,10 @@ int arch_domain_soft_reset(struct domain *d)
>      return ret;
>  }
>  
> +void arch_domain_creation_finished(struct domain *d)
> +{
> +}
> +
>  /*
>   * These are the masks of CR4 bits (subject to hardware availability) which a
>   * PV guest may not legitimiately attempt to modify.
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 78cc5249e8..c623daec56 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1116,8 +1116,11 @@ int domain_unpause_by_systemcontroller(struct domain *d)
>       * Creation is considered finished when the controller reference count
>       * first drops to 0.
>       */
> -    if ( new == 0 )
> +    if ( new == 0 && !d->creation_finished )
> +    {
>          d->creation_finished = true;
> +        arch_domain_creation_finished(d);
> +    }
>  
>      domain_unpause(d);
>  
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 79abcb5a63..01cd3ee4b5 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -231,6 +231,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
>  
>  bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
>  
> +void p2m_invalidate_root(struct p2m_domain *p2m);
> +
>  /*
>   * Clean & invalidate caches corresponding to a region [start,end) of guest
>   * address space.
> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> index 33e41486cb..d1bfc82f57 100644
> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -70,6 +70,8 @@ void arch_domain_unpause(struct domain *d);
>  
>  int arch_domain_soft_reset(struct domain *d);
>  
> +void arch_domain_creation_finished(struct domain *d);
> +
>  void arch_p2m_set_access_required(struct domain *d, bool access_required);
>  
>  int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
> -- 
> 2.11.0
> 


* Re: [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-06 12:21   ` Julien Grall
@ 2018-12-07 21:52     ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-07 21:52 UTC (permalink / raw)
  To: Julien Grall
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel,
	Roger Pau Monné

On Thu, 6 Dec 2018, Julien Grall wrote:
> Hi,
> 
> On 12/4/18 8:26 PM, Julien Grall wrote:
> > At the moment, the implementation of Set/Way operations will go through
> > all the entries of the guest P2M and flush them. However, this is very
> > expensive and may render unusable a guest OS using them.
> > 
> > For instance, Linux 32-bit will use Set/Way operations during secondary
> > CPU bring-up. As the implementation is really expensive, it may be possible
> > to hit the CPU bring-up timeout.
> > 
> > To limit the Set/Way impact, we track which pages of the guest have been
> > accessed between batches of Set/Way operations. This is done
> > using bit[0] (aka valid bit) of the P2M entry.
> > 
> > This patch introduces a new per-arch helper to perform actions just
> > before the guest is first unpaused. This will be used to invalidate the
> > P2M to track access from the start of the guest.
> > 
> > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > 
> > ---
> > 
> > While we can spread d->creation_finished all over the code, the per-arch
> > helper to perform actions just before the guest is first unpaused can
> > bring a lot of benefit for both architecture. For instance, on Arm, the
> > flush to the instruction cache could be delayed until the domain is
> > first run. This would improve greatly the performance of creating guest.
> > 
> > I am still doing the benchmark whether having a command line option is
> > worth it. I will provide numbers as soon as I have them.
> 
> I remembered Stefano suggested looking at the impact on boot. This is a
> bit tricky to do as there are many existing kernel configurations and all the
> mappings may not have been touched during boot.
> 
> Instead I wrote a tiny guest [1] that will zero roughly 1GB of memory. Because
> the toolstack will always try to allocate with the biggest mapping, I had to
> hack the toolstack a bit to be able to test with different mapping sizes (but
> not a mix). The guest has only one vCPU with a dedicated pCPU.
>     - 1GB: 0.03% slower when starting with valid bit unset
>     - 2MB: 0.04% faster when starting with valid bit unset
>     - 4KB: ~3% slower when starting with valid bit unset
> 
> The performance difference using 1GB and 2MB mappings is pretty much
> insignificant because the number of traps is very limited (resp. 1 and 513).
> With 4KB mappings, there is a much more significant drop because there are more
> traps (~262700) as the P2M contains more entries.
> 
> However, having many 4KB mappings in the P2M is pretty unlikely as the
> toolstack will always try to get bigger mappings. In the real world, you should
> only have 4KB mappings when your guest memory is not aligned with a bigger
> mapping. If you end up having many 4KB mappings, then you are already going
> to have a performance impact in the long run because of the TLB pressure.
> 
> Overall, I would not recommend introducing a command line option until we
> figure out a use case where the trap is a slowdown.

Looking at the numbers, I agree with you. This is OK for now. But we
should still be open to revisiting this issue in the future in case it
becomes a problem (I know of customers wanting to boot the system in
less than a second overall).


* Re: [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
  2018-12-07 11:15     ` Julien Grall
@ 2018-12-07 22:11       ` Stefano Stabellini
  2018-12-11 16:11         ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-07 22:11 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini

On Fri, 7 Dec 2018, Julien Grall wrote:
> > > @@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t
> > > start, gfn_t end)
> > >         while ( gfn_x(start) < gfn_x(end) )
> > >       {
> > > +       /*
> > > +         * Cleaning the cache for the P2M may take a long time. So we
> > > +         * need to be able to preempt. We will arbitrarily preempt every
> > > +         * time count reach 512 or above.
> > > +
> > > +         *
> > > +         * The count will be incremented by:
> > > +         *  - 1 on region skipped
> > > +         *  - 10 for each page requiring a flush
> > 
> > Why this choice? A page flush should cost much more than 10x a region
> > skipped, more like 100x or 1000x. In fact, doing the full loop without
> > calling flush_page_to_ram should be cheap and fast, right?
> 
> It is cheaper than a flush of the page but it still has a cost. You have to
> walk the stage-2 in software, which requires mapping the tables. As not all the
> memory is mapped in the hypervisor on arm32, this will require a map/unmap
> operation. On arm64, so far the full memory is mapped, so the map/unmap is
> pretty much a NOP.

Good point, actually the cost of an "empty" iteration is significantly
different on arm32 and arm64.


> > I would:
> > 
> > - not increase count on region skipped at all
> > - increase it by 1 on each page requiring a flush
> > - set the limit lower, if we go with your proposal it would be about 50,
> >    I am not sure what the limit should be though
> I don't think you can avoid incrementing count on region skipped. While one
> lookup is pretty cheap, all the lookups for holes added together may result in
> a pretty long time.

Thinking of the arm32 case where map/unmap need to be done, you are
right.


> Even if stage-2 mappings are handled by the hypervisor, the guest is still
> somewhat in control of it because it can balloon in/out pages. The operation
> may result in shattering stage-2 mappings.
> 
> It would be feasible for a guest to shatter 1GB of memory into 4KB mappings in
> stage-2 entries and then remove all the entries. This means the stage-2 would
> contain 262144 holes. This would result in 262144 iterations, so no matter
> how cheap each one is, the resulting time spent without preemption is going to
> be quite significant.

OK


> The choice in the numbers 1 vs 10 is pretty much arbitrary. The question is how
> often we want to check for pending softirqs. The check itself is pretty much
> trivial, yet preempting has a cost. With the current solution, we check for
> preemption every 512 holes or 51 pages flushed (~204KB flushed).
> 
> This sounds ok to me. Feel free to suggest better numbers.

One suggestion is that we might want to treat the arm32 case differently
from the arm64 case, given the different cost of mapping/unmapping pages
during the walk. Would it be fair to say that if an arm64 empty
iteration is worth "1" unit of work, then an arm32 empty iteration is
worth at least "2" unit of work? Or more? My gut feeling is that it is
more like:

empty arm64:       1
empty arm32:       5
flush arm32/arm64: 10

Or even:

empty arm64:       1
empty arm32:       10
flush arm32/arm64: 20

and the overall limits for checks could be 512 or 1024 like you
suggested.

But I don't really know, we would need precise benchmarks to have an
idea about what are the best numbers for this. I am not suggesting you
have to do any more benchmarks now, we'll just hand-wave it for now.



> > > +         */
> > > +        if ( count >= 512 )
> > > +        {
> > > +            if ( softirq_pending(smp_processor_id()) )
> > > +            {
> > > +                rc = -ERESTART;
> > > +                break;
> > > +            }
> > > +            count = 0;
> > 
> > No need to set count to 0 here
> 
> Well, the code would not do the same here. If you don't reset to 0, you would
> check softirq_pending() on every iteration once count reaches 512.
> 
> If you reset it to 0, you will avoid checking softirq_pending() until the next
> time count reaches 512.
> 
> Both are actually valid. It is just a matter of whether we assume that
> a softirq will happen soon after reaching 512.

My comment was wrong, the code is right as is, I think we want to check
softirq_pending every 512 iterations.


* Re: [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
  2018-12-07 22:11       ` Stefano Stabellini
@ 2018-12-11 16:11         ` Julien Grall
  0 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-11 16:11 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Hi,

On 07/12/2018 22:11, Stefano Stabellini wrote:
> On Fri, 7 Dec 2018, Julien Grall wrote:
>>>> @@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t
>>>> start, gfn_t end)
>>>>          while ( gfn_x(start) < gfn_x(end) )
>>>>        {
>>>> +       /*
>>>> +         * Cleaning the cache for the P2M may take a long time. So we
>>>> +         * need to be able to preempt. We will arbitrarily preempt every
>>>> +         * time count reach 512 or above.
>>>> +
>>>> +         *
>>>> +         * The count will be incremented by:
>>>> +         *  - 1 on region skipped
>>>> +         *  - 10 for each page requiring a flush
>>>
>>> Why this choice? A page flush should cost much more than 10x a region
>>> skipped, more like 100x or 1000x. In fact, doing the full loop without
>>> calling flush_page_to_ram should be cheap and fast, right?.
>>
>> It is cheaper than a flush of the page but it still has a cost. You have to
>> walk the stage-2 in software that will require to map the tables. As all the
>> memory is not mapped in the hypervisor on arm32 this will require a map/unmap
>> operation. On arm64, so far the full memory is mapped, so the map/unmap is
>> pretty much a NOP.
> 
> Good point, actually the cost of an "empty" iteration is significantly
> different on arm32 and arm64.
> 
> 
>>> I would:
>>>
>>> - not increase count on region skipped at all
>>> - increase it by 1 on each page requiring a flush
>>> - set the limit lower, if we go with your proposal it would be about 50,
>>>     I am not sure what the limit should be though
>> I don't think you can avoid incrementing count on region skipped. While one
>> lookup is pretty cheap, all the lookups for hole added together may result to
>> a pretty long time.
> 
> Thinking of the arm32 case where map/unmap need to be done, you are
> right.
> 
> 
>> Even if stage-2 mappings are handled by the hypervisor, the guest is still
>> somewhat in control of it because it can balloon in/out pages. The operation
>> may result to shatter stage-2 mappings.
>>
>> It would be feasible for a guest to shatter 1GB of memory in 4KB mappings in
>> stage-2 entries and then remove all the entries. This means the stage-2 would
>> contains 262144 holes. This would result to 262144 iterations, so no matter
>> how cheap it is the resulting time spent without preemption is going to be
>> quite important.
> 
> OK
> 
> 
>> The choice in the numbers 1 vs 10 is pretty much random. The question is how
>> often we want to check for pending softirq. The check is pretty much trivial,
>> yet it has a cost to preempt. With the current solution, we check preemption
>> every 512 holes or 51 pages flushed (~204KB flushed).
>>
>> This sounds ok to me. Feel free to suggest better number.
> 
> One suggestion is that we might want to treat the arm32 case differently
> from the arm64 case, given the different cost of mapping/unmapping pages
> during the walk. Would it be fair to say that if an arm64 empty
> iteration is worth "1" unit of work, then an arm32 empty iteration is
> worth at least "2" unit of work? Or more? My gut feeling is that it is
> more like:

I don't want to treat arm32 and arm64 differently. That's asking to open up a 
security hole in Xen if we ever decide to separate the domain heap and the xen 
heap on arm64.

> 
> empty arm64:       1
> empty arm32:       5
> flush arm32/arm64: 10
> 
> Or even:
> 
> empty arm64:       1
> empty arm32:       10
> flush arm32/arm64: 20

Bear in mind that in the flush case on arm32, you also need to map/unmap the 
page. So this is more like 10/30 here.

> 
> and the overall limits for checks could be 512 or 1024 like you
> suggested.

What I can suggest is:
	empty: 1
	flush: 3

Limit: 120
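
For illustration only, with those weights the main loop in
p2m_cache_flush_range() would look roughly like the sketch below; the PREEMPT_*
names are made up and the lookup part of the loop is elided:

    #define PREEMPT_LIMIT       120
    #define PREEMPT_COST_HOLE     1   /* region skipped */
    #define PREEMPT_COST_FLUSH    3   /* page cleaned & invalidated */

    while ( gfn_x(start) < gfn_x(end) )
    {
        if ( count >= PREEMPT_LIMIT )
        {
            if ( softirq_pending(smp_processor_id()) )
            {
                rc = -ERESTART;
                break;
            }
            count = 0;
        }

        /* ... p2m_get_entry() lookup of the current block elided ... */

        /* Hole, non-RAM, or entry with the valid bit unset: skip it. */
        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) || !valid )
        {
            count += PREEMPT_COST_HOLE;
            start = next_block_gfn;
            continue;
        }

        count += PREEMPT_COST_FLUSH;
        flush_page_to_ram(mfn_x(mfn), false);

        /* Move to the next page. */
        start = gfn_add(start, 1);
        mfn = mfn_add(mfn, 1);
    }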

Cheers,

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations
  2018-12-07 21:43   ` Stefano Stabellini
@ 2018-12-11 16:22     ` Julien Grall
  0 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2018-12-11 16:22 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Jan Beulich, xen-devel,
	Roger Pau Monné

Hi Stefano,

On 07/12/2018 21:43, Stefano Stabellini wrote:
> On Tue, 4 Dec 2018, Julien Grall wrote:
>> At the moment, the implementation of Set/Way operations will go through
>> all the entries of the guest P2M and flush them. However, this is very
>> expensive and may render unusable a guest OS using them.
>>
>> For instance, Linux 32-bit will use Set/Way operations during secondary
>> CPU bring-up. As the implementation is really expensive, it may be possible
>> to hit the CPU bring-up timeout.
>>
>> To limit the Set/Way impact, we track which pages of the guest have been
>> accessed between batches of Set/Way operations. This is done
>> using bit[0] (aka valid bit) of the P2M entry.
>>
>> This patch introduces a new per-arch helper to perform actions just
>> before the guest is first unpaused. This will be used to invalidate the
>> P2M to track access from the start of the guest.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>
>> While we can spread d->creation_finished all over the code, the per-arch
>> helper to perform actions just before the guest is first unpaused can
>> bring a lot of benefit for both architecture. For instance, on Arm, the
>> flush to the instruction cache could be delayed until the domain is
>> first run. This would improve greatly the performance of creating guest.
>>
>> I am still doing the benchmark whether having a command line option is
>> worth it. I will provide numbers as soon as I have them.
>>
>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>> Cc: Julien Grall <julien.grall@arm.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Tim Deegan <tim@xen.org>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   xen/arch/arm/domain.c     | 14 ++++++++++++++
>>   xen/arch/arm/p2m.c        | 30 ++++++++++++++++++++++++++++--
>>   xen/arch/x86/domain.c     |  4 ++++
>>   xen/common/domain.c       |  5 ++++-
>>   xen/include/asm-arm/p2m.h |  2 ++
>>   xen/include/xen/domain.h  |  2 ++
>>   6 files changed, 54 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 1d926dcb29..41f101746e 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -767,6 +767,20 @@ int arch_domain_soft_reset(struct domain *d)
>>       return -ENOSYS;
>>   }
>>   
>> +void arch_domain_creation_finished(struct domain *d)
>> +{
>> +    /*
>> +     * To avoid flushing the whole guest RAM on the first Set/Way, we
>> +     * invalidate the P2M to track what has been accessed.
>> +     *
>> +     * This is only turned when IOMMU is not used or the page-table are
>> +     * not shared because bit[0] (e.g valid bit) unset will result
>> +     * IOMMU fault that could be not fixed-up.
>> +     */
>> +    if ( !iommu_use_hap_pt(d) )
>> +        p2m_invalidate_root(p2m_get_hostp2m(d));
>> +}
>> +
>>   static int is_guest_pv32_psr(uint32_t psr)
>>   {
>>       switch (psr & PSR_MODE_MASK)
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index 8ee6ff7bd7..44ea3580cf 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
>>   }
>>   
>>   /*
>> + * Invalidate all entries in the root page-tables. This is
>> + * useful to get fault on entry and do an action.
>> + */
>> +void p2m_invalidate_root(struct p2m_domain *p2m)
>> +{
>> +    unsigned int i;
>> +
>> +    p2m_write_lock(p2m);
>> +
>> +    for ( i = 0; i < P2M_ROOT_LEVEL; i++ )
>> +        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
>> +
>> +    p2m_write_unlock(p2m);
>> +}
>> +
>> +/*
>>    * Resolve any translation fault due to change in the p2m. This
>>    * includes break-before-make and valid bit cleared.
>>    */
>> @@ -1587,15 +1603,18 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>>            */
>>           if ( gfn_eq(start, next_block_gfn) )
>>           {
>> -            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>> +            bool valid;
>> +
>> +            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
>>               next_block_gfn = gfn_next_boundary(start, order);
>>   
>>               /*
>>                * The following regions can be skipped:
>>                *      - Hole
>>                *      - non-RAM
>> +             *      - block with valid bit (bit[0]) unset
>>                */
>> -            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>> +            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) || !valid )
>>               {
>>                   count++;
>>                   start = next_block_gfn;
>> @@ -1629,6 +1648,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
>>    */
>>   void p2m_flush_vm(struct vcpu *v)
>>   {
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
>>       int rc;
>>       gfn_t start = _gfn(0);
>>   
>> @@ -1648,6 +1668,12 @@ void p2m_flush_vm(struct vcpu *v)
>>                   "P2M has not been correctly cleaned (rc = %d)\n",
>>                   rc);
>>   
>> +    /*
>> +     * Invalidate the p2m to track which page was modified by the guest
>> +     * between call of p2m_flush_vm().
>> +     */
>> +    p2m_invalidate_root(p2m);
> 
> Does this mean that we are invalidating the p2m once more than
> necessary, when the caches are finally enabled in Linux? Could that be
> avoided by passing an additional argument to p2m_flush_vm?

I don't think you can know when the guest has finally enabled the cache. A guest is 
free to disable the cache afterwards. This is actually what arm32 does because 
it decompresses itself with the cache enabled and then disables it afterwards.

Cheers,

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-07 21:29       ` Stefano Stabellini
@ 2018-12-12 15:33         ` Julien Grall
  2018-12-12 17:25           ` Stefano Stabellini
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2018-12-12 15:33 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, dfaggioli



On 07/12/2018 21:29, Stefano Stabellini wrote:
> CC'ing Dario
> 
> Dario, please give a look at the preemption question below.
> 
> 
> On Fri, 7 Dec 2018, Julien Grall wrote:
>> On 06/12/2018 23:32, Stefano Stabellini wrote:
>>> On Tue, 4 Dec 2018, Julien Grall wrote:
>> So you may not execute them before returning to the guest, introducing a
>> long delay. That's why we execute the rest of the code with interrupts
>> masked. If softirq_pending() returns 0 then you know there were no
>> more softirqs pending to handle. Any new one will be signaled via
>> an interrupt that can only come up when IRQs are unmasked.
>>
>> The check before executing the vCPU work can potentially be avoided. The
>> reason I added it is that it can take some time before p2m_flush_vm() will
>> check for softirqs. As we do this on return to the guest, we may already have
>> been executing for some time in the hypervisor. So this gives us a chance to
>> preempt if the vCPU has consumed its slice.
> 
> It is difficult to tell whether this one is important or if it would be
> best avoided.
> 
> For Dario: basically we have a long-running operation to perform, and we
> thought that the best place for it would be on the path returning to the
> guest (leave_hypervisor_tail). The operation can interrupt itself by
> checking softirq_pending() once in a while to avoid blocking the pCPU
> for too long.
> 
> The question is: is it better to check softirq_pending() even before
> starting? Or is every so often during the operation good enough? Does it
> even matter?
I am not sure I understand what your concern is here. Checking
softirq_pending() often is not an issue. The issue is when we happen not to
check it. At the moment, I would prefer to be over-cautious until we figure out
whether this is a real problem.

If you are concerned about the performance impact, this is only called when a 
guest is using set/way.
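
To illustrate the pattern, here is a minimal sketch with an invented function
name (the actual code in the series may differ):

    /*
     * Sketch only. Drain pending softirqs with interrupts masked so
     * that, once the loop exits, no new softirq can have been raised
     * before we enter the guest.
     */
    static void check_pending_softirqs(void)
    {
        ASSERT(!local_irq_is_enabled());

        while ( softirq_pending(smp_processor_id()) )
        {
            local_irq_enable();
            do_softirq();
            local_irq_disable();
        }
    }

Calling something like this once before the vCPU work and once after it is
what provides the extra early preemption point discussed above.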

Cheers,

-- 
Julien Grall


* Re: [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-12 15:33         ` Julien Grall
@ 2018-12-12 17:25           ` Stefano Stabellini
  2018-12-12 17:49             ` Dario Faggioli
  0 siblings, 1 reply; 53+ messages in thread
From: Stefano Stabellini @ 2018-12-12 17:25 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, Stefano Stabellini, dfaggioli

On Wed, 12 Dec 2018, Julien Grall wrote:
> On 07/12/2018 21:29, Stefano Stabellini wrote:
> > CC'ing Dario
> > 
> > Dario, please give a look at the preemption question below.
> > 
> > 
> > On Fri, 7 Dec 2018, Julien Grall wrote:
> > > On 06/12/2018 23:32, Stefano Stabellini wrote:
> > > > On Tue, 4 Dec 2018, Julien Grall wrote:
> > > So you may not execute them before returning to the guest, introducing a
> > > long delay. That's why we execute the rest of the code with interrupts
> > > masked. If softirq_pending() returns 0 then you know there were no
> > > more softirqs pending to handle. Any new one will be signaled via
> > > an interrupt that can only come up when IRQs are unmasked.
> > > 
> > > The check before executing the vCPU work can potentially be avoided. The
> > > reason I added it is that it can take some time before p2m_flush_vm()
> > > will check for softirqs. As we do this on return to the guest, we may
> > > already have been executing for some time in the hypervisor. So this
> > > gives us a chance to preempt if the vCPU has consumed its slice.
> > 
> > It is difficult to tell whether this one is important or if it would be
> > best avoided.
> > 
> > For Dario: basically we have a long-running operation to perform, and we
> > thought that the best place for it would be on the path returning to the
> > guest (leave_hypervisor_tail). The operation can interrupt itself by
> > checking softirq_pending() once in a while to avoid blocking the pCPU
> > for too long.
> > 
> > The question is: is it better to check softirq_pending() even before
> > starting? Or is every so often during the operation good enough? Does it
> > even matter?
> I am not sure I understand what your concern is here. Checking
> softirq_pending() often is not an issue. The issue is when we happen not to
> check it. At the moment, I would prefer to be over-cautious until we figure
> out whether this is a real problem.
> 
> If you are concerned about the performance impact, this is only called when a
> guest is using set/way.

Actually, I have no concerns, as I think it should make no difference,
but I just wanted a second opinion.


* Re: [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations
  2018-12-12 17:25           ` Stefano Stabellini
@ 2018-12-12 17:49             ` Dario Faggioli
  0 siblings, 0 replies; 53+ messages in thread
From: Dario Faggioli @ 2018-12-12 17:49 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel


On Wed, 2018-12-12 at 09:25 -0800, Stefano Stabellini wrote:
> On Wed, 12 Dec 2018, Julien Grall wrote:
> > > For Dario: basically we have a long-running operation to perform, and we
> > > thought that the best place for it would be on the path returning to the
> > > guest (leave_hypervisor_tail). The operation can interrupt itself by
> > > checking softirq_pending() once in a while to avoid blocking the pCPU
> > > for too long.
> > > 
> > > The question is: is it better to check softirq_pending() even before
> > > starting? Or is every so often during the operation good enough? Does it
> > > even matter?
> > I am not sure I understand what your concern is here. Checking
> > softirq_pending() often is not an issue. The issue is when we happen not
> > to check it. At the moment, I would prefer to be over-cautious until we
> > figure out whether this is a real problem.
> > 
> > If you are concerned about the performance impact, this is only called
> > when a guest is using set/way.
> 
> Actually, I have no concerns, as I think it should make no difference,
> but I just wanted a second opinion.
>
Yeah, sorry. I saw the email on Monday, but then got distracted.

So, in this case, I personally don't think either solution is much
better (or much worse) than the other one.

In general, what's best may vary on a case-by-case basis (e.g., how
long have we already been non-preemptible when we enter the long-running
operation?).

Therefore, if I wanted to be on the safe side, I think I would check
before entering the loop (or however the long-running op is
implemented).

The performance impact of just one more softirq_pending() check itself
should really be negligible (even considering cache effects).
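
In code terms, the shape I have in mind would be something like the following
hypothetical skeleton (not the real p2m_cache_flush_range() from the series):

    /*
     * Hypothetical skeleton only: check for pending softirqs on the very
     * first iteration and then periodically, returning -ERESTART so the
     * caller can preempt and later resume from *pstart.
     */
    int long_running_op(gfn_t *pstart, gfn_t end)
    {
        unsigned long count = 0;
        gfn_t start = *pstart;

        while ( gfn_x(start) < gfn_x(end) )
        {
            /* !(count % 32) is true on the first pass, i.e. before any work. */
            if ( !(count % 32) && softirq_pending(smp_processor_id()) )
            {
                *pstart = start;    /* record progress for the restart */
                return -ERESTART;
            }

            /* ... process one entry at 'start' ... */

            start = gfn_add(start, 1);
            count++;
        }

        return 0;
    }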

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/



Thread overview: 53+ messages
2018-12-04 20:26 [PATCH for-4.12 v2 00/17] xen/arm: Implement Set/Way operations Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 01/17] xen/arm: Introduce helpers to clear/flags flags in HCR_EL2 Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 02/17] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 03/17] xen/arm: p2m: Clean-up headers included and order them alphabetically Julien Grall
2018-12-04 23:47   ` Stefano Stabellini
2018-12-04 20:26 ` [PATCH for-4.12 v2 04/17] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
2018-12-04 23:50   ` Stefano Stabellini
2018-12-05  9:46     ` Julien Grall
2018-12-06 22:02       ` Stefano Stabellini
2018-12-07 10:14         ` Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 05/17] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
2018-12-04 23:59   ` Stefano Stabellini
2018-12-05 10:03     ` Julien Grall
2018-12-06 22:04       ` Stefano Stabellini
2018-12-07 10:16         ` Julien Grall
2018-12-07 16:56           ` Stefano Stabellini
2018-12-04 20:26 ` [PATCH for-4.12 v2 06/17] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
2018-12-06 22:33   ` Stefano Stabellini
2018-12-04 20:26 ` [PATCH for-4.12 v2 07/17] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
2018-12-06 22:33   ` Stefano Stabellini
2018-12-04 20:26 ` [PATCH for-4.12 v2 08/17] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 09/17] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 10/17] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 11/17] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
2018-12-04 20:35   ` Razvan Cojocaru
2018-12-06 22:32     ` Stefano Stabellini
2018-12-07 10:17     ` Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 12/17] xen/arm: traps: Rework leave_hypervisor_tail Julien Grall
2018-12-06 23:08   ` Stefano Stabellini
2018-12-04 20:26 ` [PATCH for-4.12 v2 13/17] xen/arm: p2m: Rework p2m_cache_flush_range Julien Grall
2018-12-06 23:53   ` Stefano Stabellini
2018-12-07 10:18     ` Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 14/17] xen/arm: domctl: Use typesafe gfn in XEN_DOMCTL_cacheflush Julien Grall
2018-12-06 23:13   ` Stefano Stabellini
2018-12-04 20:26 ` [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range Julien Grall
2018-12-06 23:32   ` Stefano Stabellini
2018-12-07 11:15     ` Julien Grall
2018-12-07 22:11       ` Stefano Stabellini
2018-12-11 16:11         ` Julien Grall
2018-12-04 20:26 ` [PATCH for-4.12 v2 16/17] xen/arm: Implement Set/Way operations Julien Grall
2018-12-06 23:32   ` Stefano Stabellini
2018-12-07 13:22     ` Julien Grall
2018-12-07 21:29       ` Stefano Stabellini
2018-12-12 15:33         ` Julien Grall
2018-12-12 17:25           ` Stefano Stabellini
2018-12-12 17:49             ` Dario Faggioli
2018-12-04 20:26 ` [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of " Julien Grall
2018-12-05  8:37   ` Jan Beulich
2018-12-07 13:24     ` Julien Grall
2018-12-06 12:21   ` Julien Grall
2018-12-07 21:52     ` Stefano Stabellini
2018-12-07 21:43   ` Stefano Stabellini
2018-12-11 16:22     ` Julien Grall
