* [RFC 00/16] xen/arm: Implement Set/Way operations
@ 2018-10-08 18:33 Julien Grall
  2018-10-08 18:33 ` [RFC 01/16] xen/arm: Introduce helpers to clear/set flags in HCR_EL2 Julien Grall
                   ` (16 more replies)
  0 siblings, 17 replies; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

Hi all,

As discussed during Linaro Connect with Stefano, I am sending an early version
of the Set/Way implementation to gather feedback on the approach. For more
details on the issue, see the commit message of patch #15.
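
For context, here is a minimal sketch (not code from this series) of the
kind of Set/Way flush loop a guest kernel typically issues; the encoding
follows the Arm ARM and the helper name dc_cisw is illustrative:

    /* Clean & invalidate one cache level by Set/Way (AArch64 sketch). */
    static inline void dc_cisw(uint64_t val)
    {
        asm volatile("dc cisw, %0" : : "r" (val) : "memory");
    }

    static void flush_level_by_set_way(unsigned int level,      /* 1-based */
                                       unsigned int nr_sets,
                                       unsigned int nr_ways,
                                       unsigned int line_shift, /* log2(line bytes) */
                                       unsigned int way_shift)  /* 32 - log2(nr_ways) */
    {
        unsigned int set, way;

        for ( way = 0; way < nr_ways; way++ )
            for ( set = 0; set < nr_sets; set++ )
                dc_cisw(((uint64_t)way << way_shift) |
                        ((uint64_t)set << line_shift) |
                        ((uint64_t)(level - 1) << 1));
    }

Such a loop only operates on the caches of the CPU it runs on, so it
cannot be virtualized faithfully: the vCPU may migrate between
iterations. This is why the hypervisor has to trap these instructions
and emulate them by cleaning the guest memory by address instead.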

A branch with the code is available and can be cloned from:

https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
branch dev-cacheflush

Cheers,

Julien Grall (16):
  xen/arm: Introduce helpers to clear/set flags in HCR_EL2
  xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry
  xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid
    entry
  xen/arm: guest_walk_tables: Switch the return to bool
  xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h
  xen/arm: p2m: Introduce a helper to generate P2M table entry from a
    page
  xen/arm: p2m: Introduce p2m_is_valid and use it
  xen/arm: p2m: Handle translation fault in get_page_from_gva
  xen/arm: p2m: Introduce a function to resolve translation fault
  xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by
    HCR_EL2.TVM
  xen/arm: vsysreg: Add wrapper to handle sysreg access trapped by
    HCR_EL2.TVM
  xen/arm: Rework p2m_cache_flush to take a range [begin, end)
  xen/arm: p2m: Allow to flush cache on any RAM region
  xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0]
    (valid bit)
  xen/arm: Implement Set/Way operations
  xen/arm: Track page accessed between batch of Set/Way operations

 xen/arch/arm/arm64/vsysreg.c     |  82 ++++++++
 xen/arch/arm/domain.c            |  14 ++
 xen/arch/arm/domain_build.c      |   7 +
 xen/arch/arm/domctl.c            |   2 +-
 xen/arch/arm/guest_walk.c        |  52 ++---
 xen/arch/arm/mem_access.c        |   8 +-
 xen/arch/arm/mm.c                |  12 +-
 xen/arch/arm/p2m.c               | 405 ++++++++++++++++++++++++++++++++++-----
 xen/arch/arm/traps.c             |  36 +---
 xen/arch/arm/vcpreg.c            | 167 ++++++++++++++++
 xen/arch/x86/domain.c            |   4 +
 xen/common/domain.c              |   5 +-
 xen/include/asm-arm/cpregs.h     |   1 +
 xen/include/asm-arm/guest_walk.h |   8 +-
 xen/include/asm-arm/lpae.h       |  14 +-
 xen/include/asm-arm/p2m.h        |  28 ++-
 xen/include/asm-arm/processor.h  |  18 ++
 xen/include/asm-arm/traps.h      |  24 +++
 xen/include/xen/domain.h         |   2 +
 19 files changed, 764 insertions(+), 125 deletions(-)

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [RFC 01/16] xen/arm: Introduce helpers to clear/set flags in HCR_EL2
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-08 18:33 ` [RFC 02/16] xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry Julien Grall
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

A couple of places in the code will need to clear/set flags in HCR_EL2
for a given vCPU and then replicate the change into the hardware.
Introduce helpers for this and replace the open-coded versions.
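
As a usage sketch (hypothetical call site, not part of this patch), a
later patch in the series can then toggle a trap bit for the current
vCPU in a single line:

    /* Start trapping NS EL1 writes to the VM control registers... */
    vcpu_hcr_set_flags(current, HCR_TVM);

    /* ... and stop trapping them again later. */
    vcpu_hcr_clear_flags(current, HCR_TVM);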

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

---

The patch was previously sent separately and reviewed by Stefano.

I haven't found a good place for those new helpers, so they stay in
processor.h for the moment. This requires using macros rather than
inline helpers, given that processor.h is usually at the root of all
headers.
---
 xen/arch/arm/traps.c            |  3 +--
 xen/include/asm-arm/processor.h | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 51d2e42c77..9251ae50b8 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -682,8 +682,7 @@ static void inject_vabt_exception(struct cpu_user_regs *regs)
         break;
     }
 
-    current->arch.hcr_el2 |= HCR_VA;
-    WRITE_SYSREG(current->arch.hcr_el2, HCR_EL2);
+    vcpu_hcr_set_flags(current, HCR_VA);
 }
 
 /*
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 8016cf306f..975c8ff097 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -840,6 +840,24 @@ void abort_guest_exit_end(void);
                                  : : : "memory");                 \
     } while (0)
 
+/*
+ * Clear/Set flags in HCR_EL2 for a given vCPU. It only supports the current
+ * vCPU for now.
+ */
+#define vcpu_hcr_clear_flags(v, flags)              \
+    do {                                            \
+        ASSERT((v) == current);                     \
+        (v)->arch.hcr_el2 &= ~(flags);              \
+        WRITE_SYSREG((v)->arch.hcr_el2, HCR_EL2);   \
+    } while (0)
+
+#define vcpu_hcr_set_flags(v, flags)                \
+    do {                                            \
+        ASSERT((v) == current);                     \
+        (v)->arch.hcr_el2 |= (flags);               \
+        WRITE_SYSREG((v)->arch.hcr_el2, HCR_EL2);   \
+    } while (0)
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_ARM_PROCESSOR_H */
 /*
-- 
2.11.0



* [RFC 02/16] xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
  2018-10-08 18:33 ` [RFC 01/16] xen/arm: Introduce helpers to clear/set flags in HCR_EL2 Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-30  0:07   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 03/16] xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry Julien Grall
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

The new helpers make it easier to read the code by abstracting the way to
set/get an MFN from/to an LPAE entry. The helpers use the "walk" view, as
the base-address bits are common across the different LPAE stages.

At the same time, use the new helpers to replace the various open-coded
places.
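
As an illustration, a typical conversion (mirroring the hunks below)
looks like this:

    /* Before: open-coded, stage-specific field access. */
    mfn = _mfn(entry->p2m.base);

    /* After: stage-agnostic, self-documenting helper. */
    mfn = lpae_get_mfn(*entry);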

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    This patch was originally sent separately.
---
 xen/arch/arm/mm.c          | 10 +++++-----
 xen/arch/arm/p2m.c         | 19 ++++++++++---------
 xen/include/asm-arm/lpae.h |  3 +++
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7a06a33e21..0bc31b1d9b 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -238,7 +238,7 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
 
         /* For next iteration */
         unmap_domain_page(mapping);
-        mapping = map_domain_page(_mfn(pte.walk.base));
+        mapping = map_domain_page(lpae_get_mfn(pte));
     }
 
     unmap_domain_page(mapping);
@@ -323,7 +323,7 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned attr)
 
     ASSERT(!(mfn_to_maddr(mfn) & ~PADDR_MASK));
 
-    e.pt.base = mfn_x(mfn);
+    lpae_set_mfn(e, mfn);
 
     return e;
 }
@@ -490,7 +490,7 @@ mfn_t domain_page_map_to_mfn(const void *ptr)
     ASSERT(slot >= 0 && slot < DOMHEAP_ENTRIES);
     ASSERT(map[slot].pt.avail != 0);
 
-    return _mfn(map[slot].pt.base + offset);
+    return mfn_add(lpae_get_mfn(map[slot]), offset);
 }
 #endif
 
@@ -851,7 +851,7 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
             /* mfn_to_virt is not valid on the 1st 1st mfn, since it
              * is not within the xenheap. */
             first = slot == xenheap_first_first_slot ?
-                xenheap_first_first : __mfn_to_virt(p->pt.base);
+                xenheap_first_first : mfn_to_virt(lpae_get_mfn(*p));
         }
         else if ( xenheap_first_first_slot == -1)
         {
@@ -1007,7 +1007,7 @@ static int create_xen_entries(enum xenmap_operation op,
 
         BUG_ON(!lpae_is_valid(*entry));
 
-        third = __mfn_to_virt(entry->pt.base);
+        third = mfn_to_virt(lpae_get_mfn(*entry));
         entry = &third[third_table_offset(addr)];
 
         switch ( op ) {
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 30cfb01498..f8a2f6459e 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -265,7 +265,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
     if ( lpae_is_mapping(*entry, level) )
         return GUEST_TABLE_SUPER_PAGE;
 
-    mfn = _mfn(entry->p2m.base);
+    mfn = lpae_get_mfn(*entry);
 
     unmap_domain_page(*table);
     *table = map_domain_page(mfn);
@@ -349,7 +349,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
         if ( a )
             *a = p2m_mem_access_radix_get(p2m, gfn);
 
-        mfn = _mfn(entry.p2m.base);
+        mfn = lpae_get_mfn(entry);
         /*
          * The entry may point to a superpage. Find the MFN associated
          * to the GFN.
@@ -519,7 +519,7 @@ static lpae_t mfn_to_p2m_entry(mfn_t mfn, p2m_type_t t, p2m_access_t a)
 
     ASSERT(!(mfn_to_maddr(mfn) & ~PADDR_MASK));
 
-    e.p2m.base = mfn_x(mfn);
+    lpae_set_mfn(e, mfn);
 
     return e;
 }
@@ -621,7 +621,7 @@ static void p2m_put_l3_page(const lpae_t pte)
      */
     if ( p2m_is_foreign(pte.p2m.type) )
     {
-        mfn_t mfn = _mfn(pte.p2m.base);
+        mfn_t mfn = lpae_get_mfn(pte);
 
         ASSERT(mfn_valid(mfn));
         put_page(mfn_to_page(mfn));
@@ -655,7 +655,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
         return;
     }
 
-    table = map_domain_page(_mfn(entry.p2m.base));
+    table = map_domain_page(lpae_get_mfn(entry));
     for ( i = 0; i < LPAE_ENTRIES; i++ )
         p2m_free_entry(p2m, *(table + i), level + 1);
 
@@ -669,7 +669,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
      */
     p2m_tlb_flush_sync(p2m);
 
-    mfn = _mfn(entry.p2m.base);
+    mfn = lpae_get_mfn(entry);
     ASSERT(mfn_valid(mfn));
 
     pg = mfn_to_page(mfn);
@@ -688,7 +688,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
     bool rv = true;
 
     /* Convenience aliases */
-    mfn_t mfn = _mfn(entry->p2m.base);
+    mfn_t mfn = lpae_get_mfn(*entry);
     unsigned int next_level = level + 1;
     unsigned int level_order = level_orders[next_level];
 
@@ -719,7 +719,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
          * the necessary fields. So the correct permission are kept.
          */
         pte = *entry;
-        pte.p2m.base = mfn_x(mfn_add(mfn, i << level_order));
+        lpae_set_mfn(pte, mfn_add(mfn, i << level_order));
 
         /*
          * First and second level pages set p2m.table = 0, but third
@@ -952,7 +952,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
      * Free the entry only if the original pte was valid and the base
      * is different (to avoid freeing when permission is changed).
      */
-    if ( lpae_is_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
+    if ( lpae_is_valid(orig_pte) &&
+         !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
         p2m_free_entry(p2m, orig_pte, level);
 
     if ( need_iommu_pt_sync(p2m->domain) &&
diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index 15595cd35c..17fdc6074f 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -153,6 +153,9 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
     return (level < 3) && lpae_is_mapping(pte, level);
 }
 
+#define lpae_get_mfn(pte)    (_mfn((pte).walk.base))
+#define lpae_set_mfn(pte, mfn)  ((pte).walk.base = mfn_x(mfn))
+
 /*
  * AArch64 supports pages with different sizes (4K, 16K, and 64K). To enable
  * page table walks for various configurations, the following helpers enable
-- 
2.11.0



* [RFC 03/16] xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
  2018-10-08 18:33 ` [RFC 01/16] xen/arm: Introduce helpers to clear/set flags in HCR_EL2 Julien Grall
  2018-10-08 18:33 ` [RFC 02/16] xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-30  0:10   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 04/16] xen/arm: guest_walk_tables: Switch the return to bool Julien Grall
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

Currently, the lpae_is_{table, mapping} helpers will always return false
on entries with the valid bit unset. However, it would be useful to have
them operate on any entry, for instance to store information in advance
but still request a fault.

With that change, the p2m code now provides overlays for *_is_{table,
mapping} that also check the valid bit of the entry.
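
A minimal sketch of the resulting split (hypothetical snippet, assuming
pte is a valid level-3 page entry):

    pte.walk.valid = 0;                /* request a fault on access */

    /* The raw helper still sees the mapping... */
    ASSERT(lpae_is_mapping(pte, 3));
    /* ... while the p2m overlay honours the valid bit. */
    ASSERT(!p2m_is_mapping(pte, 3));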

Signed-off-by: Julien Grall <julien.grall@arm.com>

---

    The patch was previously sent separately.
---
 xen/arch/arm/guest_walk.c  |  2 +-
 xen/arch/arm/mm.c          |  2 +-
 xen/arch/arm/p2m.c         | 22 ++++++++++++++++++----
 xen/include/asm-arm/lpae.h | 11 +++++++----
 4 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
index e3e21bdad3..4a1b4cf2c8 100644
--- a/xen/arch/arm/guest_walk.c
+++ b/xen/arch/arm/guest_walk.c
@@ -566,7 +566,7 @@ static int guest_walk_ld(const struct vcpu *v,
      * PTE is invalid or holds a reserved entry (PTE<1:0> == x0)) or if the PTE
      * maps a memory block at level 3 (PTE<1:0> == 01).
      */
-    if ( !lpae_is_mapping(pte, level) )
+    if ( !lpae_is_valid(pte) || !lpae_is_mapping(pte, level) )
         return -EFAULT;
 
     /* Make sure that the lower bits of the PTE's base address are zero. */
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 0bc31b1d9b..987fcb9162 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -996,7 +996,7 @@ static int create_xen_entries(enum xenmap_operation op,
     for(; addr < addr_end; addr += PAGE_SIZE, mfn = mfn_add(mfn, 1))
     {
         entry = &xen_second[second_linear_offset(addr)];
-        if ( !lpae_is_table(*entry, 2) )
+        if ( !lpae_is_valid(*entry) || !lpae_is_table(*entry, 2) )
         {
             rc = create_xen_table(entry);
             if ( rc < 0 ) {
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index f8a2f6459e..8fffb42889 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -219,6 +219,20 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
         return radix_tree_ptr_to_int(ptr);
 }
 
+/*
+ * lpae_is_* helpers don't check whether the valid bit is set in the
+ * PTE. Provide our own overlay to check the valid bit.
+ */
+static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
+{
+    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
+}
+
+static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
+{
+    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
+}
+
 #define GUEST_TABLE_MAP_FAILED 0
 #define GUEST_TABLE_SUPER_PAGE 1
 #define GUEST_TABLE_NORMAL_PAGE 2
@@ -262,7 +276,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
 
     /* The function p2m_next_level is never called at the 3rd level */
     ASSERT(level < 3);
-    if ( lpae_is_mapping(*entry, level) )
+    if ( p2m_is_mapping(*entry, level) )
         return GUEST_TABLE_SUPER_PAGE;
 
     mfn = lpae_get_mfn(*entry);
@@ -642,7 +656,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
         return;
 
     /* Nothing to do but updating the stats if the entry is a super-page. */
-    if ( lpae_is_superpage(entry, level) )
+    if ( p2m_is_superpage(entry, level) )
     {
         p2m->stats.mappings[level]--;
         return;
@@ -697,7 +711,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
      * a superpage.
      */
     ASSERT(level < target);
-    ASSERT(lpae_is_superpage(*entry, level));
+    ASSERT(p2m_is_superpage(*entry, level));
 
     page = alloc_domheap_page(NULL, 0);
     if ( !page )
@@ -836,7 +850,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
         /* We need to split the original page. */
         lpae_t split_pte = *entry;
 
-        ASSERT(lpae_is_superpage(*entry, level));
+        ASSERT(p2m_is_superpage(*entry, level));
 
         if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets) )
         {
diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index 17fdc6074f..545b0c8f24 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -133,16 +133,19 @@ static inline bool lpae_is_valid(lpae_t pte)
     return pte.walk.valid;
 }
 
+/*
+ * The lpae_is_* helpers don't check the valid bit. This gives the
+ * callers an opportunity to operate on an entry even if it is not
+ * valid, for instance to store information in advance.
+ */
 static inline bool lpae_is_table(lpae_t pte, unsigned int level)
 {
-    return (level < 3) && lpae_is_valid(pte) && pte.walk.table;
+    return (level < 3) && pte.walk.table;
 }
 
 static inline bool lpae_is_mapping(lpae_t pte, unsigned int level)
 {
-    if ( !lpae_is_valid(pte) )
-        return false;
-    else if ( level == 3 )
+    if ( level == 3 )
         return pte.walk.table;
     else
         return !pte.walk.table;
-- 
2.11.0



* [RFC 04/16] xen/arm: guest_walk_tables: Switch the return to bool
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (2 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 03/16] xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-08 18:33 ` [RFC 05/16] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

At the moment, guest_walk_tables can return 0, -EFAULT, or -EINVAL.
The use of the last two is not clearly defined and is inconsistent
across the code. The only current caller does not care about the exact
return value, and its usefulness seems very limited (there is no way to
differentiate between the ~15 error paths).

So switch the return type to bool to simplify the interface and make the
developer's life a bit easier.
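
Call sites then read naturally (sketch matching the mem_access.c hunk
below):

    paddr_t ipa;
    unsigned int perms;

    /* Either the walk succeeded and ipa/perms are filled, or it
     * failed and there is no error code to propagate. */
    if ( !guest_walk_tables(v, gva, &ipa, &perms) )
        return NULL;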

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

---

    This patch was originally sent separately and reviewed by Stefano.
---
 xen/arch/arm/guest_walk.c        | 50 ++++++++++++++++++++--------------------
 xen/arch/arm/mem_access.c        |  2 +-
 xen/include/asm-arm/guest_walk.h |  8 +++----
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
index 4a1b4cf2c8..7db7a7321b 100644
--- a/xen/arch/arm/guest_walk.c
+++ b/xen/arch/arm/guest_walk.c
@@ -28,9 +28,9 @@
  * page table on a different vCPU, the following registers would need to be
  * loaded: TCR_EL1, TTBR0_EL1, TTBR1_EL1, and SCTLR_EL1.
  */
-static int guest_walk_sd(const struct vcpu *v,
-                         vaddr_t gva, paddr_t *ipa,
-                         unsigned int *perms)
+static bool guest_walk_sd(const struct vcpu *v,
+                          vaddr_t gva, paddr_t *ipa,
+                          unsigned int *perms)
 {
     int ret;
     bool disabled = true;
@@ -79,7 +79,7 @@ static int guest_walk_sd(const struct vcpu *v,
     }
 
     if ( disabled )
-        return -EFAULT;
+        return false;
 
     /*
      * The address of the L1 descriptor for the initial lookup has the
@@ -97,12 +97,12 @@ static int guest_walk_sd(const struct vcpu *v,
     /* Access the guest's memory to read only one PTE. */
     ret = access_guest_memory_by_ipa(d, paddr, &pte, sizeof(short_desc_t), false);
     if ( ret )
-        return -EINVAL;
+        return false;
 
     switch ( pte.walk.dt )
     {
     case L1DESC_INVALID:
-        return -EFAULT;
+        return false;
 
     case L1DESC_PAGE_TABLE:
         /*
@@ -122,10 +122,10 @@ static int guest_walk_sd(const struct vcpu *v,
         /* Access the guest's memory to read only one PTE. */
         ret = access_guest_memory_by_ipa(d, paddr, &pte, sizeof(short_desc_t), false);
         if ( ret )
-            return -EINVAL;
+            return false;
 
         if ( pte.walk.dt == L2DESC_INVALID )
-            return -EFAULT;
+            return false;
 
         if ( pte.pg.page ) /* Small page. */
         {
@@ -175,7 +175,7 @@ static int guest_walk_sd(const struct vcpu *v,
             *perms |= GV2M_EXEC;
     }
 
-    return 0;
+    return true;
 }
 
 /*
@@ -355,9 +355,9 @@ static bool check_base_size(unsigned int output_size, uint64_t base)
  * page table on a different vCPU, the following registers would need to be
  * loaded: TCR_EL1, TTBR0_EL1, TTBR1_EL1, and SCTLR_EL1.
  */
-static int guest_walk_ld(const struct vcpu *v,
-                         vaddr_t gva, paddr_t *ipa,
-                         unsigned int *perms)
+static bool guest_walk_ld(const struct vcpu *v,
+                          vaddr_t gva, paddr_t *ipa,
+                          unsigned int *perms)
 {
     int ret;
     bool disabled = true;
@@ -442,7 +442,7 @@ static int guest_walk_ld(const struct vcpu *v,
          */
         if ( (input_size > TCR_EL1_IPS_48_BIT_VAL) ||
              (input_size < TCR_EL1_IPS_MIN_VAL) )
-            return -EFAULT;
+            return false;
     }
     else
     {
@@ -487,7 +487,7 @@ static int guest_walk_ld(const struct vcpu *v,
     }
 
     if ( disabled )
-        return -EFAULT;
+        return false;
 
     /*
      * The starting level is the number of strides (grainsizes[gran] - 3)
@@ -498,12 +498,12 @@ static int guest_walk_ld(const struct vcpu *v,
     /* Get the IPA output_size. */
     ret = get_ipa_output_size(d, tcr, &output_size);
     if ( ret )
-        return -EFAULT;
+        return false;
 
     /* Make sure the base address does not exceed its configured size. */
     ret = check_base_size(output_size, ttbr);
     if ( !ret )
-        return -EFAULT;
+        return false;
 
     /*
      * Compute the base address of the first level translation table that is
@@ -523,12 +523,12 @@ static int guest_walk_ld(const struct vcpu *v,
         /* Access the guest's memory to read only one PTE. */
         ret = access_guest_memory_by_ipa(d, paddr, &pte, sizeof(lpae_t), false);
         if ( ret )
-            return -EFAULT;
+            return false;
 
         /* Make sure the base address does not exceed its configured size. */
         ret = check_base_size(output_size, pfn_to_paddr(pte.walk.base));
         if ( !ret )
-            return -EFAULT;
+            return false;
 
         /*
          * If page granularity is 64K, make sure the address is aligned
@@ -537,7 +537,7 @@ static int guest_walk_ld(const struct vcpu *v,
         if ( (output_size < TCR_EL1_IPS_52_BIT_VAL) &&
              (gran == GRANULE_SIZE_INDEX_64K) &&
              (pte.walk.base & 0xf) )
-            return -EFAULT;
+            return false;
 
         /*
          * Break if one of the following conditions is true:
@@ -567,7 +567,7 @@ static int guest_walk_ld(const struct vcpu *v,
      * maps a memory block at level 3 (PTE<1:0> == 01).
      */
     if ( !lpae_is_valid(pte) || !lpae_is_mapping(pte, level) )
-        return -EFAULT;
+        return false;
 
     /* Make sure that the lower bits of the PTE's base address are zero. */
     mask = GENMASK_ULL(47, grainsizes[gran]);
@@ -583,11 +583,11 @@ static int guest_walk_ld(const struct vcpu *v,
     if ( !pte.pt.xn && !xn_table )
         *perms |= GV2M_EXEC;
 
-    return 0;
+    return true;
 }
 
-int guest_walk_tables(const struct vcpu *v, vaddr_t gva,
-                      paddr_t *ipa, unsigned int *perms)
+bool guest_walk_tables(const struct vcpu *v, vaddr_t gva,
+                       paddr_t *ipa, unsigned int *perms)
 {
     uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
     register_t tcr = READ_SYSREG(TCR_EL1);
@@ -595,7 +595,7 @@ int guest_walk_tables(const struct vcpu *v, vaddr_t gva,
 
     /* We assume that the domain is running on the currently active domain. */
     if ( v != current )
-        return -EFAULT;
+        return false;
 
     /* Allow perms to be NULL. */
     perms = perms ?: &_perms;
@@ -619,7 +619,7 @@ int guest_walk_tables(const struct vcpu *v, vaddr_t gva,
         /* Memory can be accessed without any restrictions. */
         *perms = GV2M_READ|GV2M_WRITE|GV2M_EXEC;
 
-        return 0;
+        return true;
     }
 
     if ( is_32bit_domain(v->domain) && !(tcr & TTBCR_EAE) )
diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index ba4ec780fd..9239bdf323 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -125,7 +125,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
          * The software gva to ipa translation can still fail, e.g., if the gva
          * is not mapped.
          */
-        if ( guest_walk_tables(v, gva, &ipa, &perms) < 0 )
+        if ( !guest_walk_tables(v, gva, &ipa, &perms) )
             return NULL;
 
         /*
diff --git a/xen/include/asm-arm/guest_walk.h b/xen/include/asm-arm/guest_walk.h
index 4ed8476e08..8768ac9894 100644
--- a/xen/include/asm-arm/guest_walk.h
+++ b/xen/include/asm-arm/guest_walk.h
@@ -2,10 +2,10 @@
 #define _XEN_GUEST_WALK_H
 
 /* Walk the guest's page tables in software. */
-int guest_walk_tables(const struct vcpu *v,
-                      vaddr_t gva,
-                      paddr_t *ipa,
-                      unsigned int *perms);
+bool guest_walk_tables(const struct vcpu *v,
+                       vaddr_t gva,
+                       paddr_t *ipa,
+                       unsigned int *perms);
 
 #endif /* _XEN_GUEST_WALK_H */
 
-- 
2.11.0



* [RFC 05/16] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (3 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 04/16] xen/arm: guest_walk_tables: Switch the return to bool Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-30  0:11   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 06/16] xen/arm: p2m: Introduce a helper to generate P2M table entry from a page Julien Grall
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

GUEST_BUG_ON may be used in other files doing guest emulation.
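
For instance, a later patch in this series uses it in emulation code to
reject a situation the hardware should make impossible (sketch):

    /* HCR_EL2.TVM only traps writes; a read trap here would mean the
     * hardware misbehaved. */
    GUEST_BUG_ON(read);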

Signed-off-by: Julien Grall <julien.grall@arm.com>

---

    The patch was previously sent separately.
---
 xen/arch/arm/traps.c        | 24 ------------------------
 xen/include/asm-arm/traps.h | 24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 9251ae50b8..b40798084d 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -68,30 +68,6 @@ static inline void check_stack_alignment_constraints(void) {
 #endif
 }
 
-/*
- * GUEST_BUG_ON is intended for checking that the guest state has not been
- * corrupted in hardware and/or that the hardware behaves as we
- * believe it should (i.e. that certain traps can only occur when the
- * guest is in a particular mode).
- *
- * The intention is to limit the damage such h/w bugs (or spec
- * misunderstandings) can do by turning them into Denial of Service
- * attacks instead of e.g. information leaks or privilege escalations.
- *
- * GUEST_BUG_ON *MUST* *NOT* be used to check for guest controllable state!
- *
- * Compared with regular BUG_ON it dumps the guest vcpu state instead
- * of Xen's state.
- */
-#define guest_bug_on_failed(p)                          \
-do {                                                    \
-    show_execution_state(guest_cpu_user_regs());        \
-    panic("Guest Bug: %pv: '%s', line %d, file %s\n",   \
-          current, p, __LINE__, __FILE__);              \
-} while (0)
-#define GUEST_BUG_ON(p) \
-    do { if ( unlikely(p) ) guest_bug_on_failed(#p); } while (0)
-
 #ifdef CONFIG_ARM_32
 static int debug_stack_lines = 20;
 #define stack_words_per_line 8
diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
index 70b52d1d16..0acf7de67d 100644
--- a/xen/include/asm-arm/traps.h
+++ b/xen/include/asm-arm/traps.h
@@ -9,6 +9,30 @@
 # include <asm/arm64/traps.h>
 #endif
 
+/*
+ * GUEST_BUG_ON is intended for checking that the guest state has not been
+ * corrupted in hardware and/or that the hardware behaves as we
+ * believe it should (i.e. that certain traps can only occur when the
+ * guest is in a particular mode).
+ *
+ * The intention is to limit the damage such h/w bugs (or spec
+ * misunderstandings) can do by turning them into Denial of Service
+ * attacks instead of e.g. information leaks or privilege escalations.
+ *
+ * GUEST_BUG_ON *MUST* *NOT* be used to check for guest controllable state!
+ *
+ * Compared with regular BUG_ON it dumps the guest vcpu state instead
+ * of Xen's state.
+ */
+#define guest_bug_on_failed(p)                          \
+do {                                                    \
+    show_execution_state(guest_cpu_user_regs());        \
+    panic("Guest Bug: %pv: '%s', line %d, file %s\n",   \
+          current, p, __LINE__, __FILE__);              \
+} while (0)
+#define GUEST_BUG_ON(p) \
+    do { if ( unlikely(p) ) guest_bug_on_failed(#p); } while (0)
+
 int check_conditional_instr(struct cpu_user_regs *regs, const union hsr hsr);
 
 void advance_pc(struct cpu_user_regs *regs, const union hsr hsr);
-- 
2.11.0



* [RFC 06/16] xen/arm: p2m: Introduce a helper to generate P2M table entry from a page
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (4 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 05/16] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-30  0:14   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

Generating a P2M table entry requires setting some default values, which
are worth explaining in a comment. At the moment, there are two places
where such entries are created, but only one has a proper comment.

So move the code generating the P2M table entry into a separate helper.
This will be helpful in a follow-up patch to modify the defaults.

At the same time, switch the default access from p2m->default_access to
p2m_access_rwx. This should not matter, as permissions are ignored for
table entries by the hardware.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8fffb42889..6c76298ebc 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -538,6 +538,16 @@ static lpae_t mfn_to_p2m_entry(mfn_t mfn, p2m_type_t t, p2m_access_t a)
     return e;
 }
 
+/* Generate table entry with correct attributes. */
+static lpae_t page_to_p2m_table(struct page_info *page)
+{
+    /*
+     * The access value does not matter because the hardware will ignore
+     * the permission fields for table entry.
+     */
+    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
+}
+
 static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
 {
     write_pte(p, pte);
@@ -558,7 +568,6 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
 {
     struct page_info *page;
     lpae_t *p;
-    lpae_t pte;
 
     ASSERT(!lpae_is_valid(*entry));
 
@@ -576,14 +585,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
 
     unmap_domain_page(p);
 
-    /*
-     * The access value does not matter because the hardware will ignore
-     * the permission fields for table entry.
-     */
-    pte = mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid,
-                           p2m->default_access);
-
-    p2m_write_pte(entry, pte, p2m->clean_pte);
+    p2m_write_pte(entry, page_to_p2m_table(page), p2m->clean_pte);
 
     return 0;
 }
@@ -764,14 +766,11 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
 
     unmap_domain_page(table);
 
-    pte = mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid,
-                           p2m->default_access);
-
     /*
      * Even if we failed, we should install the newly allocated LPAE
      * entry. The caller will be in charge to free the sub-tree.
      */
-    p2m_write_pte(entry, pte, p2m->clean_pte);
+    p2m_write_pte(entry, page_to_p2m_table(page), p2m->clean_pte);
 
     return rv;
 }
-- 
2.11.0



* [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (5 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 06/16] xen/arm: p2m: Introduce a helper to generate P2M table entry from a page Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-30  0:21   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

The LPAE format allows storing information in an entry even with the
valid bit unset. In a follow-up patch, we will take advantage of this
feature to re-purpose the valid bit for generating a translation fault
even if an entry contains valid information.

So we need a different way to know whether an entry contains valid
information. It is possible to use the information held in the p2m_type
for that purpose. Indeed, all entries containing valid information will
have a valid p2m type (i.e. p2m_type != p2m_invalid).

This patch introduces a new helper, p2m_is_valid, which implements that
idea, and replaces most lpae_is_valid calls with the new helper. The
remaining ones are for TLB handling and entry accounting.

With the renaming, there are two other changes required:
    - Generate table entries with a valid p2m type
    - Detect new mappings for proper stats accounting
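
A short sketch of the distinction (hypothetical snippet, assuming pte
carries a real type such as p2m_ram_rw):

    pte.p2m.valid = 0;           /* hardware will now fault on access */

    ASSERT(!lpae_is_valid(pte)); /* hardware view: entry faults       */
    ASSERT(p2m_is_valid(pte));   /* software view: still a mapping    */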

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 6c76298ebc..2a1e7e9be2 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -220,17 +220,26 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
 }
 
 /*
+ * In the case of the P2M, the valid bit is used for another purpose. Use
+ * the type to check whether an entry is valid.
+ */
+static inline bool p2m_is_valid(lpae_t pte)
+{
+    return pte.p2m.type != p2m_invalid;
+}
+
+/*
  * lpae_is_* helpers don't check whether the valid bit is set in the
  * PTE. Provide our own overlay to check the valid bit.
  */
 static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
 {
-    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
+    return p2m_is_valid(pte) && lpae_is_mapping(pte, level);
 }
 
 static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
 {
-    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
+    return p2m_is_valid(pte) && lpae_is_superpage(pte, level);
 }
 
 #define GUEST_TABLE_MAP_FAILED 0
@@ -264,7 +273,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
 
     entry = *table + offset;
 
-    if ( !lpae_is_valid(*entry) )
+    if ( !p2m_is_valid(*entry) )
     {
         if ( read_only )
             return GUEST_TABLE_MAP_FAILED;
@@ -356,7 +365,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
 
     entry = table[offsets[level]];
 
-    if ( lpae_is_valid(entry) )
+    if ( p2m_is_valid(entry) )
     {
         *t = entry.p2m.type;
 
@@ -544,8 +553,11 @@ static lpae_t page_to_p2m_table(struct page_info *page)
     /*
      * The access value does not matter because the hardware will ignore
      * the permission fields for table entry.
+     *
+     * We use p2m_ram_rw so the entry has a valid type. This is important
+     * for p2m_is_valid() to return true on table entries.
      */
-    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
+    return mfn_to_p2m_entry(page_to_mfn(page), p2m_ram_rw, p2m_access_rwx);
 }
 
 static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
@@ -569,7 +581,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
     struct page_info *page;
     lpae_t *p;
 
-    ASSERT(!lpae_is_valid(*entry));
+    ASSERT(!p2m_is_valid(*entry));
 
     page = alloc_domheap_page(NULL, 0);
     if ( page == NULL )
@@ -626,7 +638,7 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
  */
 static void p2m_put_l3_page(const lpae_t pte)
 {
-    ASSERT(lpae_is_valid(pte));
+    ASSERT(p2m_is_valid(pte));
 
     /*
      * TODO: Handle other p2m types
@@ -654,11 +666,11 @@ static void p2m_free_entry(struct p2m_domain *p2m,
     struct page_info *pg;
 
     /* Nothing to do if the entry is invalid. */
-    if ( !lpae_is_valid(entry) )
+    if ( !p2m_is_valid(entry) )
         return;
 
     /* Nothing to do but updating the stats if the entry is a super-page. */
-    if ( p2m_is_superpage(entry, level) )
+    if ( level == 3 && entry.p2m.table )
     {
         p2m->stats.mappings[level]--;
         return;
@@ -951,7 +963,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
             else
                 p2m->need_flush = true;
         }
-        else /* new mapping */
+        else if ( !p2m_is_valid(orig_pte) ) /* new mapping */
             p2m->stats.mappings[level]++;
 
         p2m_write_pte(entry, pte, p2m->clean_pte);
@@ -965,7 +977,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
      * Free the entry only if the original pte was valid and the base
      * is different (to avoid freeing when permission is changed).
      */
-    if ( lpae_is_valid(orig_pte) &&
+    if ( p2m_is_valid(orig_pte) &&
          !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
         p2m_free_entry(p2m, orig_pte, level);
 
-- 
2.11.0



* [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (6 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-30  0:47   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

A follow-up patch will re-purpose the valid bit of LPAE entries to
generate faults even on entries containing valid information.

This means that translating a guest VA to a guest PA (i.e. IPA) will
fail if the Stage-2 entries used have the valid bit unset. Because of
that, we need to fall back to walking the page-tables in software to
check whether the fault was expected.

This patch adds the software page-table walk on all translation faults.
It would be possible in the future to avoid a pointless walk when the
fault reported in PAR_EL1 is not a translation fault.
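
The fallback boils down to two software lookups (a condensed sketch of
the flow added to get_page_from_gva below):

    /* 1) Stage-1 walk in software: GVA -> IPA. */
    if ( !guest_walk_tables(v, va, &ipa, &perms) )
        return NULL;

    /* 2) Stage-2 lookup in software: IPA -> MFN. */
    mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
    if ( mfn_eq(INVALID_MFN, mfn) )
        return NULL;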

Signed-off-by: Julien Grall <julien.grall@arm.com>

---

There are a couple of TODOs in the code. They are clean-ups and
performance improvements (e.g. when the fault cannot be handled) that
could be deferred until after the series has been merged.
---
 xen/arch/arm/p2m.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 58 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2a1e7e9be2..ec956bc151 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -13,6 +13,7 @@
 #include <public/vm_event.h>
 #include <asm/flushtlb.h>
 #include <asm/event.h>
+#include <asm/guest_walk.h>
 #include <asm/hardirq.h>
 #include <asm/page.h>
 
@@ -1438,6 +1439,8 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
     struct page_info *page = NULL;
     paddr_t maddr = 0;
     uint64_t par;
+    mfn_t mfn;
+    p2m_type_t t;
 
     /*
      * XXX: To support a different vCPU, we would need to load the
@@ -1454,8 +1457,30 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
     par = gvirt_to_maddr(va, &maddr, flags);
     p2m_read_unlock(p2m);
 
+    /*
+     * gvirt_to_maddr may fail if the entry does not have the valid bit
+     * set. Fall back to the second method:
+     *  1) Translate the VA to IPA using a software lookup. Note that
+     *     the Stage-1 page-table may not be accessible, because the
+     *     Stage-2 entries may have the valid bit unset.
+     *  2) Software lookup of the MFN.
+     *
+     * Note that when memaccess is enabled, we instead call
+     * p2m_mem_access_check_and_get_page(...) directly. Because that
+     * function is a variant of the methods described above, it will
+     * be able to handle entries with the valid bit unset.
+     *
+     * TODO: Integrate memaccess more nicely with the rest of the
+     * function.
+     * TODO: Use the fault error reported in PAR_EL1 to avoid a
+     * pointless translation when the fault is not a translation
+     * fault.
+     */
     if ( par )
     {
+        paddr_t ipa;
+        unsigned int perms;
+
         /*
          * When memaccess is enabled, the translation GVA to MADDR may
          * have failed because of a permission fault.
@@ -1463,20 +1488,46 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
         if ( p2m->mem_access_enabled )
             return p2m_mem_access_check_and_get_page(va, flags, v);
 
-        dprintk(XENLOG_G_DEBUG,
-                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx par=%#"PRIx64"\n",
-                v, va, flags, par);
-        return NULL;
+        /*
+         * The software stage-1 table walk can still fail, e.g, if the
+         * GVA is not mapped.
+         */
+        if ( !guest_walk_tables(v, va, &ipa, &perms) )
+        {
+            dprintk(XENLOG_G_DEBUG, "%pv: Failed to walk page-table va %#"PRIvaddr"\n", v, va);
+            return NULL;
+        }
+
+        /*
+         * Check permission that are assumed by the caller. For instance
+         * in case of guestcopy, the caller assumes that the translated
+         * page can be accessed with the requested permissions. If this
+         * is not the case, we should fail.
+         *
+         * Please note that we do not check for the GV2M_EXEC
+         * permission. This is fine because the hardware-based translation
+         * instruction does not test for execute permissions.
+         */
+        if ( (flags & GV2M_WRITE) && !(perms & GV2M_WRITE) )
+            return NULL;
+
+        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
+        if ( mfn_eq(INVALID_MFN, mfn) )
+            return NULL;
+
+        /* We consider that RAM is always mapped read-write */
     }
+    else
+        mfn = maddr_to_mfn(maddr);
 
-    if ( !mfn_valid(maddr_to_mfn(maddr)) )
+    if ( !mfn_valid(mfn) )
     {
         dprintk(XENLOG_G_DEBUG, "%pv: Invalid MFN %#"PRI_mfn"\n",
-                v, mfn_x(maddr_to_mfn(maddr)));
+                v, mfn_x(mfn));
         return NULL;
     }
 
-    page = mfn_to_page(maddr_to_mfn(maddr));
+    page = mfn_to_page(mfn);
     ASSERT(page);
 
     if ( unlikely(!get_page(page, d)) )
-- 
2.11.0



* [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (7 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-02 23:27   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

Currently, a Stage-2 translation fault can happen on:
    1) MMIO emulation
    2) The page-tables being updated using Break-Before-Make
    3) A page not mapped

A follow-up patch will re-purpose the valid bit in an entry to generate
a translation fault. This will be used to perform an action on each
entry in order to track pages used over a given period.

A new function is introduced to try to resolve a translation fault. This
covers 2) and the new way to generate faults explained above.

To avoid invalidating all the page-table entries in one go, it is
possible to invalidate only the top-level table and then, on trap,
invalidate the table one level down. This is repeated until a block/page
entry has been reached.

At the moment, no action is taken when reaching a block/page entry,
apart from setting the valid bit to 1.
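
A sketch of the intended lazy-invalidation cycle (hypothetical driver
code; walking over every root page is only added later in the series):

    /* Start of a tracking period: invalidate the top level only. */
    p2m_invalidate_table(p2m, page_to_mfn(p2m->root));

    /* Then, on each Stage-2 translation fault taken by the guest: */
    if ( p2m_resolve_translation_fault(d, gaddr_to_gfn(gpa)) )
        return; /* one level re-validated, the next one invalidated */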

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c   | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c |   7 +--
 2 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index ec956bc151..af445d3313 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1043,6 +1043,133 @@ int p2m_set_entry(struct p2m_domain *p2m,
     return rc;
 }
 
+/* Invalidate all entries in the table. The p2m should be write locked. */
+static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
+{
+    lpae_t *table;
+    unsigned int i;
+
+    ASSERT(p2m_is_write_locked(p2m));
+
+    table = map_domain_page(mfn);
+
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+    {
+        lpae_t pte = table[i];
+
+        pte.p2m.valid = 0;
+
+        p2m_write_pte(&table[i], pte, p2m->clean_pte);
+    }
+
+    unmap_domain_page(table);
+
+    p2m->need_flush = true;
+}
+
+bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    unsigned int level = 0;
+    bool resolved = false;
+    lpae_t entry, *table;
+    paddr_t addr = gfn_to_gaddr(gfn);
+
+    /* Convenience aliases */
+    const unsigned int offsets[4] = {
+        zeroeth_table_offset(addr),
+        first_table_offset(addr),
+        second_table_offset(addr),
+        third_table_offset(addr)
+    };
+
+    p2m_write_lock(p2m);
+
+    /* This gfn is higher than the highest the p2m map currently holds */
+    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
+        goto out;
+
+    table = p2m_get_root_pointer(p2m, gfn);
+    /*
+     * The table should always be non-NULL because the gfn is below
+     * p2m->max_mapped_gfn and the root table pages are always present.
+     */
+    BUG_ON(table == NULL);
+
+    /*
+     * Go down the page-tables until an entry has the valid bit unset or
+     * a block/page entry has been hit.
+     */
+    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
+    {
+        int rc;
+
+        entry = table[offsets[level]];
+
+        if ( level == 3 )
+            break;
+
+        /* Stop as soon as we hit an entry with the valid bit unset. */
+        if ( !lpae_is_valid(entry) )
+            break;
+
+        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
+        if ( rc == GUEST_TABLE_MAP_FAILED )
+            goto out_unmap;
+        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
+            break;
+    }
+
+    /*
+     * If the valid bit of the entry is set, it means someone was playing with
+     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
+     */
+    if ( lpae_is_valid(entry) )
+    {
+        resolved = true;
+        goto out_unmap;
+    }
+
+    /*
+     * The valid bit is unset. If the entry is still not valid then the fault
+     * cannot be resolved, exit and report it.
+     */
+    if ( !p2m_is_valid(entry) )
+        goto out_unmap;
+
+    /*
+     * Now we have an entry with valid bit unset, but still valid from
+     * the P2M point of view.
+     *
+     * For entry pointing to a table, the table will be invalidated.
+     * For entry pointing to a block/page, no work to do for now.
+     */
+    if ( lpae_is_table(entry, level) )
+        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
+
+    /*
+     * Now that the work on the entry is done, set the valid bit to prevent
+     * another fault on that entry.
+     */
+    resolved = true;
+    entry.p2m.valid = 1;
+
+    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
+
+    /*
+     * No need to flush the TLBs as the modified entry had the valid bit
+     * unset.
+     */
+
+out_unmap:
+    unmap_domain_page(table);
+
+out:
+    p2m_write_unlock(p2m);
+
+    return resolved;
+}
+
 static inline int p2m_insert_mapping(struct domain *d,
                                      gfn_t start_gfn,
                                      unsigned long nr,
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index b40798084d..169b57cb6b 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1882,6 +1882,8 @@ static bool try_map_mmio(gfn_t gfn)
     return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
 }
 
+bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
+
 static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
                                        const union hsr hsr)
 {
@@ -1894,7 +1896,6 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
     vaddr_t gva;
     paddr_t gpa;
     uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
-    mfn_t mfn;
     bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
 
     /*
@@ -1977,8 +1978,8 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
          * with the Stage-2 page table. Walk the Stage-2 PT to check
          * if the entry exists. If it's the case, return to the guest
          */
-        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
-        if ( !mfn_eq(mfn, INVALID_MFN) )
+        if ( p2m_resolve_translation_fault(current->domain,
+                                           gaddr_to_gfn(gpa)) )
             return;
 
         if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
-- 
2.11.0



* [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (8 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-05 19:47   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 11/16] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

A follow-up patch will require emulating some accesses to co-processor
registers trapped by HCR_EL2.TVM. When that bit is set, all NS EL1 writes
to the virtual memory control registers will be trapped to the
hypervisor.

This patch adds the infrastructure to pass through those accesses to the
host registers. For convenience, a set of macros has been added to
generate the different helpers.

Note that HCR_EL2.TVM will be set dynamically in a follow-up patch.
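
For reference, here is a sketch of what TVM_REG32(SCTLR, SCTLR_EL1)
expands to (modulo formatting; the comment is added for clarity):

    static bool vreg_emulate_SCTLR(struct cpu_user_regs *regs,
                                   uint32_t *r, bool read)
    {
        /* Only NS EL1 writes are trapped; a read cannot get here. */
        GUEST_BUG_ON(read);
        WRITE_SYSREG32(*r, SCTLR_EL1);

        return true;
    }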

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/vcpreg.c        | 144 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/cpregs.h |   1 +
 2 files changed, 145 insertions(+)

diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
index b04d996fd3..49529b97cd 100644
--- a/xen/arch/arm/vcpreg.c
+++ b/xen/arch/arm/vcpreg.c
@@ -24,6 +24,122 @@
 #include <asm/traps.h>
 #include <asm/vtimer.h>
 
+/*
+ * Macros to help generate helpers for registers trapped when
+ * HCR_EL2.TVM is set.
+ *
+ * Note that it only traps NS write accesses from EL1.
+ *
+ *  - TVM_REG() should not be used outside of the macros. It is there to
+ *    help define TVM_REG32() and TVM_REG64().
+ *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
+ *    generate a helper accessing resp. a 32-bit and a 64-bit register,
+ *    "regname" being the Arm32 name and "xreg" the Arm64 name.
+ *  - TVM_REG32_COMBINED(lowreg, hireg, xreg) is used to generate a pair
+ *    of helpers for two AArch32 registers sharing the same AArch64
+ *    register "xreg". "lowreg" will use xreg[31:0] and "hireg" will use
+ *    xreg[63:32].
+ */
+
+/* The name is passed from the upper macro to workaround macro expansion. */
+#define TVM_REG(sz, func, reg...)                                           \
+static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
+{                                                                           \
+    GUEST_BUG_ON(read);                                                     \
+    WRITE_SYSREG##sz(*r, reg);                                              \
+                                                                            \
+    return true;                                                            \
+}
+
+#define TVM_REG32(regname, xreg) TVM_REG(32, vreg_emulate_##regname, xreg)
+#define TVM_REG64(regname, xreg) TVM_REG(64, vreg_emulate_##regname, xreg)
+
+#ifdef CONFIG_ARM_32
+#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
+    /* Use TVM_REG directly to work around macro expansion. */       \
+    TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
+    TVM_REG(32, vreg_emulate_##hireg, hireg)
+
+#else /* CONFIG_ARM_64 */
+#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                             \
+static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
+                                bool read, bool hi)                         \
+{                                                                           \
+    register_t reg = READ_SYSREG(xreg);                                     \
+                                                                            \
+    GUEST_BUG_ON(read);                                                     \
+    if ( hi ) /* reg[63:32] is AArch32 register hireg */                    \
+    {                                                                       \
+        reg &= GENMASK(31, 0);                                              \
+        reg |= ((uint64_t)*r) << 32;                                        \
+    }                                                                       \
+    else /* reg[31:0] is AArch32 register lowreg. */                        \
+    {                                                                       \
+        reg &= GENMASK(63, 32);                                             \
+        reg |= *r;                                                          \
+    }                                                                       \
+    WRITE_SYSREG(reg, xreg);                                                \
+                                                                            \
+    return true;                                                            \
+}                                                                           \
+                                                                            \
+static bool vreg_emulate_##lowreg(struct cpu_user_regs *regs, uint32_t *r,  \
+                                  bool read)                                \
+{                                                                           \
+    return vreg_emulate_##xreg(regs, r, read, false);                       \
+}                                                                           \
+                                                                            \
+static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
+                                 bool read)                                 \
+{                                                                           \
+    return vreg_emulate_##xreg(regs, r, read, true);                        \
+}
+#endif
+
+/* Defining helpers for emulating co-processor registers. */
+TVM_REG32(SCTLR, SCTLR_EL1)
+/*
+ * AArch32 provides two ways to access TTBR* depending on the access
+ * size, whilst AArch64 provides one way.
+ *
+ * When using AArch32, for simplicity, use the same access size as the
+ * guest.
+ */
+#ifdef CONFIG_ARM_32
+TVM_REG32(TTBR0_32, TTBR0_32)
+TVM_REG32(TTBR1_32, TTBR1_32)
+#else
+TVM_REG32(TTBR0_32, TTBR0_EL1)
+TVM_REG32(TTBR1_32, TTBR1_EL1)
+#endif
+TVM_REG64(TTBR0, TTBR0_EL1)
+TVM_REG64(TTBR1, TTBR1_EL1)
+/* AArch32 registers TTBCR and TTBCR2 share AArch64 register TCR_EL1. */
+TVM_REG32_COMBINED(TTBCR, TTBCR2, TCR_EL1)
+TVM_REG32(DACR, DACR32_EL2)
+TVM_REG32(DFSR, ESR_EL1)
+TVM_REG32(IFSR, IFSR32_EL2)
+/* AArch32 registers DFAR and IFAR shares AArch64 register FAR_EL1. */
+TVM_REG32_COMBINED(DFAR, IFAR, FAR_EL1)
+TVM_REG32(ADFSR, AFSR0_EL1)
+TVM_REG32(AIFSR, AFSR1_EL1)
+/* AArch32 registers MAIR0 and MAIR1 share AArch64 register MAIR_EL1. */
+TVM_REG32_COMBINED(MAIR0, MAIR1, MAIR_EL1)
+/* AArch32 registers AMAIR0 and AMAIR1 share AArch64 register AMAIR_EL1. */
+TVM_REG32_COMBINED(AMAIR0, AMAIR1, AMAIR_EL1)
+TVM_REG32(CONTEXTIDR, CONTEXTIDR_EL1)
+
+/* Macro to easily generate cases for co-processor emulation. */
+#define GENERATE_CASE(reg, sz)                                      \
+    case HSR_CPREG##sz(reg):                                        \
+    {                                                               \
+        bool res;                                                   \
+                                                                    \
+        res = vreg_emulate_cp##sz(regs, hsr, vreg_emulate_##reg);   \
+        ASSERT(res);                                                \
+        break;                                                      \
+    }
+
 void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
 {
     const struct hsr_cp32 cp32 = hsr.cp32;
@@ -64,6 +180,31 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
         break;
 
     /*
+     * HCR_EL2.TVM
+     *
+     * ARMv8 (DDI 0487B.b): Table D1-37
+     */
+    GENERATE_CASE(SCTLR, 32)
+    GENERATE_CASE(TTBR0_32, 32)
+    GENERATE_CASE(TTBR1_32, 32)
+    GENERATE_CASE(TTBCR, 32)
+    GENERATE_CASE(TTBCR2, 32)
+    GENERATE_CASE(DACR, 32)
+    GENERATE_CASE(DFSR, 32)
+    GENERATE_CASE(IFSR, 32)
+    GENERATE_CASE(DFAR, 32)
+    GENERATE_CASE(IFAR, 32)
+    GENERATE_CASE(ADFSR, 32)
+    GENERATE_CASE(AIFSR, 32)
+    /* AKA PRRR */
+    GENERATE_CASE(MAIR0, 32)
+    /* AKA NMRR */
+    GENERATE_CASE(MAIR1, 32)
+    GENERATE_CASE(AMAIR0, 32)
+    GENERATE_CASE(AMAIR1, 32)
+    GENERATE_CASE(CONTEXTIDR, 32)
+
+    /*
      * MDCR_EL2.TPM
      *
      * ARMv7 (DDI 0406C.b): B1.14.17
@@ -192,6 +333,9 @@ void do_cp15_64(struct cpu_user_regs *regs, const union hsr hsr)
             return inject_undef_exception(regs, hsr);
         break;
 
+    GENERATE_CASE(TTBR0, 64)
+    GENERATE_CASE(TTBR1, 64)
+
     /*
      * CPTR_EL2.T{0..9,12..13}
      *
diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
index 07e5791983..f1cbac5e5d 100644
--- a/xen/include/asm-arm/cpregs.h
+++ b/xen/include/asm-arm/cpregs.h
@@ -142,6 +142,7 @@
 
 /* CP15 CR2: Translation Table Base and Control Registers */
 #define TTBCR           p15,0,c2,c0,2   /* Translation Table Base Control Register */
+#define TTBCR2          p15,0,c2,c0,3   /* Translation Table Base Control Register 2 */
 #define TTBR0           p15,0,c2        /* Translation Table Base Reg. 0 */
 #define TTBR1           p15,1,c2        /* Translation Table Base Reg. 1 */
 #define HTTBR           p15,4,c2        /* Hyp. Translation Table Base Register */
-- 
2.11.0



* [RFC 11/16] xen/arm: vsysreg: Add wrapper to handle sysreg access trapped by HCR_EL2.TVM
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (9 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-05 20:42   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 12/16] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

A follow-up patch will require emulating some accesses to system
registers trapped by HCR_EL2.TVM. When the bit is set, all NS EL1 writes
to the virtual memory control registers will be trapped to the hypervisor.

This patch adds the infrastructure to pass through such accesses to the
host registers.

Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
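
As a hand-expanded sketch (illustrative only, not part of the patch),
TVM_REG(TTBR0_EL1) from the diff below generates:

    static bool vreg_emulate_TTBR0_EL1(struct cpu_user_regs *regs,
                                       uint64_t *r, bool read)
    {
        /* Only NS EL1 writes are trapped, so a read is a bug. */
        GUEST_BUG_ON(read);
        WRITE_SYSREG64(*r, TTBR0_EL1);

        return true;
    }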

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/arm64/vsysreg.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 6e60824572..1517879697 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -23,6 +23,46 @@
 #include <asm/traps.h>
 #include <asm/vtimer.h>
 
+/*
+ * Macro to help generate helpers for registers trapped when
+ * HCR_EL2.TVM is set.
+ *
+ * Note that only NS EL1 write accesses are trapped.
+ */
+#define TVM_REG(reg)                                                \
+static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
+                               uint64_t *r, bool read)              \
+{                                                                   \
+    GUEST_BUG_ON(read);                                             \
+    WRITE_SYSREG64(*r, reg);                                        \
+                                                                    \
+    return true;                                                    \
+}
+
+/* Defining helpers for emulating sysreg registers. */
+TVM_REG(SCTLR_EL1)
+TVM_REG(TTBR0_EL1)
+TVM_REG(TTBR1_EL1)
+TVM_REG(TCR_EL1)
+TVM_REG(ESR_EL1)
+TVM_REG(FAR_EL1)
+TVM_REG(AFSR0_EL1)
+TVM_REG(AFSR1_EL1)
+TVM_REG(MAIR_EL1)
+TVM_REG(AMAIR_EL1)
+TVM_REG(CONTEXTIDR_EL1)
+
+/* Macro to easily generate cases for sysreg emulation. */
+#define GENERATE_CASE(reg)                                              \
+    case HSR_SYSREG_##reg:                                              \
+    {                                                                   \
+        bool res;                                                       \
+                                                                        \
+        res = vreg_emulate_sysreg64(regs, hsr, vreg_emulate_##reg);     \
+        ASSERT(res);                                                    \
+        break;                                                          \
+    }
+
 void do_sysreg(struct cpu_user_regs *regs,
                const union hsr hsr)
 {
@@ -44,6 +84,23 @@ void do_sysreg(struct cpu_user_regs *regs,
         break;
 
     /*
+     * HCR_EL2.TVM
+     *
+     * ARMv8 (DDI 0487B.b): Table D1-37
+     */
+    GENERATE_CASE(SCTLR_EL1)
+    GENERATE_CASE(TTBR0_EL1)
+    GENERATE_CASE(TTBR1_EL1)
+    GENERATE_CASE(TCR_EL1)
+    GENERATE_CASE(ESR_EL1)
+    GENERATE_CASE(FAR_EL1)
+    GENERATE_CASE(AFSR0_EL1)
+    GENERATE_CASE(AFSR1_EL1)
+    GENERATE_CASE(MAIR_EL1)
+    GENERATE_CASE(AMAIR_EL1)
+    GENERATE_CASE(CONTEXTIDR_EL1)
+
+    /*
      * MDCR_EL2.TDRA
      *
      * ARMv8 (DDI 0487A.d): D1-1508 Table D1-57
-- 
2.11.0



* [RFC 12/16] xen/arm: Rework p2m_cache_flush to take a range [begin, end)
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (10 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 11/16] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-02 23:38   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 13/16] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

The function will be easier to re-use in a follow-up patch if it takes
only the begin and the end of the range, rather than a start and a
number of pages.

At the same time, rename the function to reflect the change in the
prototype.
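
For illustration, a caller currently passing (start, nr) would be
converted along these lines (hypothetical snippet, variable names only
for illustration):

    /* Before: rc = p2m_cache_flush(d, start, nr); */
    rc = p2m_cache_flush_range(d, start, gfn_add(start, nr));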

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/domctl.c     | 2 +-
 xen/arch/arm/p2m.c        | 3 +--
 xen/include/asm-arm/p2m.h | 7 +++++--
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 4587c75826..c10f568aad 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -61,7 +61,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
         if ( e < s )
             return -EINVAL;
 
-        return p2m_cache_flush(d, _gfn(s), domctl->u.cacheflush.nr_pfns);
+        return p2m_cache_flush_range(d, _gfn(s), _gfn(e));
     }
     case XEN_DOMCTL_bind_pt_irq:
     {
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index af445d3313..8537b7bab1 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1507,10 +1507,9 @@ int relinquish_p2m_mapping(struct domain *d)
     return rc;
 }
 
-int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr)
+int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    gfn_t end = gfn_add(start, nr);
     gfn_t next_gfn;
     p2m_type_t t;
     unsigned int order;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index c03557544a..d7afa2bbe8 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -224,8 +224,11 @@ int p2m_set_entry(struct p2m_domain *p2m,
                   p2m_type_t t,
                   p2m_access_t a);
 
-/* Clean & invalidate caches corresponding to a region of guest address space */
-int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);
+/*
+ * Clean & invalidate caches corresponding to a region [start,end) of guest
+ * address space.
+ */
+int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
 
 /*
  * Map a region in the guest p2m with a specific p2m type.
-- 
2.11.0



* [RFC 13/16] xen/arm: p2m: Allow to flush cache on any RAM region
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (11 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 12/16] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-02 23:40   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

Currently, we only allow flushing the cache on regions mapped as
p2m_ram_{rw,ro}.

There is no real problem with flushing the cache on any RAM region,
such as grants and foreign mappings. Therefore, relax the check to
allow flushing the cache on any RAM region.
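
For reference, the p2m_is_any_ram() predicate used below also covers
grant and foreign mappings; it is already defined in asm-arm/p2m.h
roughly as follows (sketch of the existing definition):

    #define p2m_is_any_ram(_t) (p2m_to_mask(_t) &                   \
                                (P2M_RAM_TYPES | P2M_GRANT_TYPES |  \
                                 p2m_to_mask(p2m_map_foreign)))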

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/p2m.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8537b7bab1..12b459924b 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1532,7 +1532,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
         next_gfn = gfn_next_boundary(start, order);
 
         /* Skip hole and non-RAM page */
-        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(t) )
+        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
             continue;
 
         /* XXX: Implement preemption */
-- 
2.11.0



* [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (12 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 13/16] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-02 23:44   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 15/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

With the recent changes, a P2M entry may be populated but not marked
valid. In some situations, it would be useful to know whether the entry
has been made available to the guest in order to perform a specific
action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
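
A caller interested in the valid bit would use the new parameter along
these lines (hypothetical caller, for illustration only):

    p2m_type_t t;
    bool valid;
    mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, &valid);

    /* Populated but not yet made available to the guest? */
    if ( !mfn_eq(mfn, INVALID_MFN) && !valid )
        /* ... perform the specific action ... */;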

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/mem_access.c |  6 +++---
 xen/arch/arm/p2m.c        | 20 ++++++++++++++++----
 xen/include/asm-arm/p2m.h |  3 ++-
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index 9239bdf323..f434510b2a 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -70,7 +70,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
          * No setting was found in the Radix tree. Check if the
          * entry exists in the page-tables.
          */
-        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL);
+        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL, NULL);
 
         if ( mfn_eq(mfn, INVALID_MFN) )
             return -ESRCH;
@@ -199,7 +199,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
      * We had a mem_access permission limiting the access, but the page type
      * could also be limiting, so we need to check that as well.
      */
-    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);
+    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, NULL);
     if ( mfn_eq(mfn, INVALID_MFN) )
         goto err;
 
@@ -405,7 +405,7 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
           gfn = gfn_next_boundary(gfn, order) )
     {
         p2m_type_t t;
-        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order);
+        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order, NULL);
 
 
         if ( !mfn_eq(mfn, INVALID_MFN) )
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 12b459924b..df6b48d73b 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -306,10 +306,14 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
  *
  * If the entry is not present, INVALID_MFN will be returned and the
  * page_order will be set according to the order of the invalid range.
+ *
+ * valid will contain the value of bit[0] (i.e. the valid bit) of the
+ * entry.
  */
 mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
                     p2m_type_t *t, p2m_access_t *a,
-                    unsigned int *page_order)
+                    unsigned int *page_order,
+                    bool *valid)
 {
     paddr_t addr = gfn_to_gaddr(gfn);
     unsigned int level = 0;
@@ -317,6 +321,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
     int rc;
     mfn_t mfn = INVALID_MFN;
     p2m_type_t _t;
+    bool _valid;
 
     /* Convenience aliases */
     const unsigned int offsets[4] = {
@@ -334,6 +339,10 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
 
     *t = p2m_invalid;
 
+    /* Allow valid to be NULL */
+    valid = valid ?: &_valid;
+    *valid = false;
+
     /* XXX: Check if the mapping is lower than the mapped gfn */
 
     /* This gfn is higher than the highest the p2m map currently holds */
@@ -379,6 +388,9 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
          * to the GFN.
          */
         mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
+
+        /* valid cannot be NULL here, see the check at the top. */
+        *valid = lpae_is_valid(entry);
     }
 
 out_unmap:
@@ -397,7 +409,7 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
     p2m_read_lock(p2m);
-    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL);
+    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL, NULL);
     p2m_read_unlock(p2m);
 
     return mfn;
@@ -1464,7 +1476,7 @@ int relinquish_p2m_mapping(struct domain *d)
     for ( ; gfn_x(start) < gfn_x(end);
           start = gfn_next_boundary(start, order) )
     {
-        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
+        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
 
         count++;
         /*
@@ -1527,7 +1539,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 
     for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
     {
-        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
+        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
 
         next_gfn = gfn_next_boundary(start, order);
 
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index d7afa2bbe8..92213dd1ab 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -211,7 +211,8 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t);
  */
 mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
                     p2m_type_t *t, p2m_access_t *a,
-                    unsigned int *page_order);
+                    unsigned int *page_order,
+                    bool *valid);
 
 /*
  * Direct set a p2m entry: only for use by the P2M code.
-- 
2.11.0



* [RFC 15/16] xen/arm: Implement Set/Way operations
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (13 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-11-05 21:10   ` Stefano Stabellini
  2018-10-08 18:33 ` [RFC 16/16] xen/arm: Track page accessed between batch of " Julien Grall
  2018-11-22 14:21 ` [RFC 00/16] xen/arm: Implement " Julien Grall
  16 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini, andre.przywara

Set/Way operations are used to perform maintenance on a given cache.
At the moment, Set/Way operations are not trapped and therefore a guest
OS will directly act on the local cache. However, a vCPU may migrate to
another pCPU in the middle of the sequence of operations. This will
result in caches with stale data (Set/Way operations are not propagated)
and potentially cause a crash. This may be the cause of the heisenbugs
noticed in Osstest [1].

Furthermore, Set/Way operations do not work on system caches. This
means that an OS, such as 32-bit Linux, relying on those operations to
fully clean the cache before disabling the MMU may break because data
may sit in system caches and not in RAM.

For more details about Set/Way, see the talk "The Art of Virtualizing
Cache Maintenance" given at Xen Summit 2018 [2].

In the context of Xen, we need to trap Set/Way operations and emulate
them. Per the Arm ARM (B1.14.4 in DDI 0406C.c), Set/Way operations are
difficult to virtualize. So we can assume that a guest OS using them will
suffer the consequences (i.e. slowness) until developers remove all usage
of Set/Way.

As the software is not allowed to infer the Set/Way to Physical Address
mapping, Xen will need to go through the guest P2M and clean &
invalidate all the mapped entries.

Because Set/Way operations happen in batches (a loop over all the
Set/Way of a cache), Xen would need to go through the P2M for every
instruction. This is quite expensive and would severely impact the
guest OS. The implementation re-uses the KVM policy to limit the number
of flushes (sketched in pseudo-C below):
    - If we trap a Set/Way operation, we enable VM trapping (i.e.
      HCR_EL2.TVM) to detect the caches being turned on/off, and do a
      full clean.
    - We clean the caches both when they are turned on and off.
    - Once the caches are enabled, we stop trapping VM instructions.
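
In pseudo-C, the policy amounts to the following (sketch only;
on_set_way_trap(), on_vm_reg_write() and flush_guest_p2m() are
placeholder names standing in for the helpers introduced below):

    void on_set_way_trap(struct vcpu *v)
    {
        if ( !(v->arch.hcr_el2 & HCR_TVM) )
        {
            flush_guest_p2m(v);              /* full clean & invalidate */
            vcpu_hcr_set_flags(v, HCR_TVM);  /* catch caches on/off */
        }
    }

    void on_vm_reg_write(struct vcpu *v, bool cache_was_enabled)
    {
        /* Flush on both the on->off and the off->on transition. */
        if ( cache_was_enabled != vcpu_has_cache_enabled(v) )
            flush_guest_p2m(v);

        /* Caches are on again: stop trapping VM registers. */
        if ( vcpu_has_cache_enabled(v) )
            vcpu_hcr_clear_flags(v, HCR_TVM);
    }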

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
[2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/arm64/vsysreg.c | 27 +++++++++++++++++-
 xen/arch/arm/p2m.c           | 68 ++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c         |  2 +-
 xen/arch/arm/vcpreg.c        | 23 +++++++++++++++
 xen/include/asm-arm/p2m.h    | 16 +++++++++++
 5 files changed, 134 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 1517879697..43c6c3e30d 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -40,7 +40,20 @@ static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
 }
 
 /* Defining helpers for emulating sysreg registers. */
-TVM_REG(SCTLR_EL1)
+static bool vreg_emulate_SCTLR_EL1(struct cpu_user_regs *regs, uint64_t *r,
+                                   bool read)
+{
+    struct vcpu *v = current;
+    bool cache_enabled = vcpu_has_cache_enabled(v);
+
+    GUEST_BUG_ON(read);
+    WRITE_SYSREG(*r, SCTLR_EL1);
+
+    p2m_toggle_cache(v, cache_enabled);
+
+    return true;
+}
+
 TVM_REG(TTBR0_EL1)
 TVM_REG(TTBR1_EL1)
 TVM_REG(TCR_EL1)
@@ -84,6 +97,18 @@ void do_sysreg(struct cpu_user_regs *regs,
         break;
 
     /*
+     * HCR_EL2.TSW
+     *
+     * ARMv8 (DDI 0487B.b): Table D1-42
+     */
+    case HSR_SYSREG_DCISW:
+    case HSR_SYSREG_DCCSW:
+    case HSR_SYSREG_DCCISW:
+        if ( !hsr.sysreg.read )
+            p2m_set_way_flush(current);
+        break;
+
+    /*
      * HCR_EL2.TVM
      *
      * ARMv8 (DDI 0487B.b): Table D1-37
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index df6b48d73b..a3d4c563b1 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1564,6 +1564,74 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
     return 0;
 }
 
+static void p2m_flush_vm(struct vcpu *v)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
+
+    /* XXX: Handle preemption */
+    p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
+                          p2m->max_mapped_gfn);
+}
+
+/*
+ * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
+ * easily virtualized).
+ *
+ * Main problems:
+ *  - S/W ops are local to a CPU (not broadcast)
+ *  - We have line migration behind our back (speculation)
+ *  - System caches don't support S/W at all (damn!)
+ *
+ * In the face of the above, the best we can do is to try and convert
+ * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
+ * to PA mapping, it can only use S/W to nuke the whole cache, which is
+ * rather a good thing for us.
+ *
+ * Also, it is only used when turning caches on/off ("The expected
+ * usage of the cache maintenance instructions that operate by set/way
+ * is associated with the powerdown and powerup of caches, if this is
+ * required by the implementation.").
+ *
+ * We use the following policy:
+ *  - If we trap a S/W operation, we enable VM trapping to detect
+ *    caches being turned on/off, and do a full clean.
+ *
+ *  - We flush the caches both when they are turned on and off.
+ *
+ *  - Once the caches are enabled, we stop trapping VM ops.
+ */
+void p2m_set_way_flush(struct vcpu *v)
+{
+    /* This function can only work with the current vCPU. */
+    ASSERT(v == current);
+
+    if ( !(v->arch.hcr_el2 & HCR_TVM) )
+    {
+        p2m_flush_vm(v);
+        vcpu_hcr_set_flags(v, HCR_TVM);
+    }
+}
+
+void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
+{
+    bool now_enabled = vcpu_has_cache_enabled(v);
+
+    /* This function can only work with the current vCPU. */
+    ASSERT(v == current);
+
+    /*
+     * If switching the MMU+caches on, need to invalidate the caches.
+     * If switching it off, need to clean the caches.
+     * Clean + invalidate does the trick always.
+     */
+    if ( was_enabled != now_enabled )
+        p2m_flush_vm(v);
+
+    /* Caches are now on, stop trapping VM ops (until a S/W op) */
+    if ( now_enabled )
+        vcpu_hcr_clear_flags(v, HCR_TVM);
+}
+
 mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
 {
     return p2m_lookup(d, gfn, NULL);
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 169b57cb6b..cdc10eee5a 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -98,7 +98,7 @@ register_t get_default_hcr_flags(void)
 {
     return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
              (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
-             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
+             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
 }
 
 static enum {
diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
index 49529b97cd..dc46d9d0d7 100644
--- a/xen/arch/arm/vcpreg.c
+++ b/xen/arch/arm/vcpreg.c
@@ -45,9 +45,14 @@
 #define TVM_REG(sz, func, reg...)                                           \
 static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
 {                                                                           \
+    struct vcpu *v = current;                                               \
+    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
+                                                                            \
     GUEST_BUG_ON(read);                                                     \
     WRITE_SYSREG##sz(*r, reg);                                              \
                                                                             \
+    p2m_toggle_cache(v, cache_enabled);                                     \
+                                                                            \
     return true;                                                            \
 }
 
@@ -65,6 +70,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
 static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
                                 bool read, bool hi)                         \
 {                                                                           \
+    struct vcpu *v = current;                                               \
+    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
     register_t reg = READ_SYSREG(xreg);                                     \
                                                                             \
     GUEST_BUG_ON(read);                                                     \
@@ -80,6 +87,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
     }                                                                       \
     WRITE_SYSREG(reg, xreg);                                                \
                                                                             \
+    p2m_toggle_cache(v, cache_enabled);                                     \
+                                                                            \
     return true;                                                            \
 }                                                                           \
                                                                             \
@@ -98,6 +107,7 @@ static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
 
 /* Defining helpers for emulating co-processor registers. */
 TVM_REG32(SCTLR, SCTLR_EL1)
+
 /*
  * AArch32 provides two way to access TTBR* depending on the access
  * size, whilst AArch64 provides one way.
@@ -180,6 +190,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
         break;
 
     /*
+     * HCR_EL2.TSW
+     *
+     * ARMv7 (DDI 0406C.b): B1.14.6
+     * ARMv8 (DDI 0487B.b): Table D1-42
+     */
+    case HSR_CPREG32(DCISW):
+    case HSR_CPREG32(DCCSW):
+    case HSR_CPREG32(DCCISW):
+        if ( !cp32.read )
+            p2m_set_way_flush(current);
+        break;
+
+    /*
      * HCR_EL2.TVM
      *
      * ARMv8 (DDI 0487B.b): Table D1-37
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 92213dd1ab..c470f062db 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -231,6 +231,10 @@ int p2m_set_entry(struct p2m_domain *p2m,
  */
 int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
 
+void p2m_set_way_flush(struct vcpu *v);
+
+void p2m_toggle_cache(struct vcpu *v, bool was_enabled);
+
 /*
  * Map a region in the guest p2m with a specific p2m type.
  * The memory attributes will be derived from the p2m type.
@@ -358,6 +362,18 @@ static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
     return -EOPNOTSUPP;
 }
 
+/*
+ * A vCPU has cache enabled only when the MMU is enabled and data cache
+ * is enabled.
+ */
+static inline bool vcpu_has_cache_enabled(struct vcpu *v)
+{
+    /* Only works with the current vCPU */
+    ASSERT(current == v);
+
+    return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == (SCTLR_C|SCTLR_M);
+}
+
 #endif /* _XEN_P2M_H */
 
 /*
-- 
2.11.0



* [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (14 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 15/16] xen/arm: Implement Set/Way operations Julien Grall
@ 2018-10-08 18:33 ` Julien Grall
  2018-10-09  7:04   ` Jan Beulich
  2018-11-05 21:35   ` Stefano Stabellini
  2018-11-22 14:21 ` [RFC 00/16] xen/arm: Implement " Julien Grall
  16 siblings, 2 replies; 62+ messages in thread
From: Julien Grall @ 2018-10-08 18:33 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, andre.przywara, Tim Deegan,
	Julien Grall, Jan Beulich

At the moment, the implementation of Set/Way operations will go through
all the entries of the guest P2M and flush them. However, this is very
expensive and may render a guest OS using them unusable.

For instance, 32-bit Linux will use Set/Way operations during secondary
CPU bring-up. As the implementation is really expensive, it is possible
to hit the CPU bring-up timeout.

To limit the Set/Way impact, we track which pages of the guest have
been accessed between batches of Set/Way operations. This is done
using bit[0] (aka the valid bit) of the P2M entry.

This patch introduces a new per-arch helper to perform actions just
before the guest is first unpaused. This will be used to invalidate the
P2M so accesses are tracked from the start of the guest.
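
The resulting cycle, in a simplified sketch (the names mirror the
helpers in the patch below; this is not the literal code):

    static void p2m_flush_vm_sketch(struct vcpu *v)
    {
        struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);

        /* Only entries with bit[0] set were accessed since the last
         * batch; p2m_cache_flush_range() skips the others. */
        p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
                              p2m->max_mapped_gfn);

        /* Re-invalidate so accesses are tracked for the next batch. */
        p2m_invalidate_root(p2m);
    }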

Signed-off-by: Julien Grall <julien.grall@arm.com>

---

Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/arm/domain.c       | 14 ++++++++++++++
 xen/arch/arm/domain_build.c |  7 +++++++
 xen/arch/arm/p2m.c          | 32 +++++++++++++++++++++++++++++++-
 xen/arch/x86/domain.c       |  4 ++++
 xen/common/domain.c         |  5 ++++-
 xen/include/asm-arm/p2m.h   |  2 ++
 xen/include/xen/domain.h    |  2 ++
 7 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index feebbf5a92..f439f4657a 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -738,6 +738,20 @@ int arch_domain_soft_reset(struct domain *d)
     return -ENOSYS;
 }
 
+void arch_domain_creation_finished(struct domain *d)
+{
+    /*
+     * To avoid flushing the whole guest RAM on the first Set/Way, we
+     * invalidate the P2M to track what has been accessed.
+     *
+     * This is only done when the IOMMU is not used or the page-tables
+     * are not shared, because an entry with bit[0] (i.e. the valid
+     * bit) unset will result in an IOMMU fault that cannot be fixed up.
+     */
+    if ( !iommu_use_hap_pt(d) )
+        p2m_invalidate_root(p2m_get_hostp2m(d));
+}
+
 static int is_guest_pv32_psr(uint32_t psr)
 {
     switch (psr & PSR_MODE_MASK)
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index f552154e93..de96516faa 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2249,6 +2249,13 @@ int __init construct_dom0(struct domain *d)
     v->is_initialised = 1;
     clear_bit(_VPF_down, &v->pause_flags);
 
+    /*
+     * XXX: We probably want a command line option to choose whether to
+     * invalidate the P2M. Invalidating the P2M will not work with the
+     * IOMMU, but skipping it will make the first Set/Way flush costly.
+     */
+    p2m_invalidate_root(p2m_get_hostp2m(d));
+
     return 0;
 }
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index a3d4c563b1..8e0c31d7ac 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
     p2m->need_flush = true;
 }
 
+/*
+ * Invalidate all entries in the root page-tables. This is
+ * useful to fault on access to an entry and then take an action.
+ */
+void p2m_invalidate_root(struct p2m_domain *p2m)
+{
+    unsigned int i;
+
+    p2m_write_lock(p2m);
+
+    for ( i = 0; i < P2M_ROOT_PAGES; i++ )
+        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
+
+    p2m_write_unlock(p2m);
+}
+
 bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
@@ -1539,7 +1555,8 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
 
     for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
     {
-        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
+        bool valid;
+        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
 
         next_gfn = gfn_next_boundary(start, order);
 
@@ -1547,6 +1564,13 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
         if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
             continue;
 
+        /*
+         * A page with the valid bit (bit[0]) unset does not need to
+         * be cleaned.
+         */
+        if ( !valid )
+            continue;
+
         /* XXX: Implement preemption */
         while ( gfn_x(start) < gfn_x(next_gfn) )
         {
@@ -1571,6 +1595,12 @@ static void p2m_flush_vm(struct vcpu *v)
     /* XXX: Handle preemption */
     p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
                           p2m->max_mapped_gfn);
+
+    /*
+     * Invalidate the p2m to track which pages were accessed by the
+     * guest between calls of p2m_flush_vm().
+     */
+    p2m_invalidate_root(p2m);
 }
 
 /*
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 9371efc8c7..2b6d1c01a1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -723,6 +723,10 @@ int arch_domain_soft_reset(struct domain *d)
     return ret;
 }
 
+void arch_domain_creation_finished(struct domain *d)
+{
+}
+
 /*
  * These are the masks of CR4 bits (subject to hardware availability) which a
  * PV guest may not legitimiately attempt to modify.
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 65151e2ac4..b402c635f9 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1100,8 +1100,11 @@ int domain_unpause_by_systemcontroller(struct domain *d)
      * Creation is considered finished when the controller reference count
      * first drops to 0.
      */
-    if ( new == 0 )
+    if ( new == 0 && !d->creation_finished )
+    {
         d->creation_finished = true;
+        arch_domain_creation_finished(d);
+    }
 
     domain_unpause(d);
 
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index c470f062db..2a4652e7f4 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -225,6 +225,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
                   p2m_type_t t,
                   p2m_access_t a);
 
+void p2m_invalidate_root(struct p2m_domain *p2m);
+
 /*
  * Clean & invalidate caches corresponding to a region [start,end) of guest
  * address space.
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 5e393fd7f2..8d95ad4bf1 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -70,6 +70,8 @@ void arch_domain_unpause(struct domain *d);
 
 int arch_domain_soft_reset(struct domain *d);
 
+void arch_domain_creation_finished(struct domain *d);
+
 void arch_p2m_set_access_required(struct domain *d, bool access_required);
 
 int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
-- 
2.11.0



* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-10-08 18:33 ` [RFC 16/16] xen/arm: Track page accessed between batch of " Julien Grall
@ 2018-10-09  7:04   ` Jan Beulich
  2018-10-09 10:16     ` Julien Grall
  2018-11-05 21:35   ` Stefano Stabellini
  1 sibling, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2018-10-09  7:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel

>>> On 08.10.18 at 20:33, <julien.grall@arm.com> wrote:
> At the moment, the implementation of Set/Way operations will go through
> all the entries of the guest P2M and flush them. However, this is very
> expensive and may render a guest OS using them unusable.
> 
> For instance, 32-bit Linux will use Set/Way operations during secondary
> CPU bring-up. As the implementation is really expensive, it is possible
> to hit the CPU bring-up timeout.
> 
> To limit the Set/Way impact, we track which pages of the guest have
> been accessed between batches of Set/Way operations. This is done
> using bit[0] (aka the valid bit) of the P2M entry.
> 
> This patch introduces a new per-arch helper to perform actions just
> before the guest is first unpaused. This will be used to invalidate the
> P2M so accesses are tracked from the start of the guest.

While I'm not opposed to the new arch hook, why don't you create the
p2m entries in their intended state right away? At the very least this
would have the benefit of confining the entire change to Arm code.

Jan




* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-10-09  7:04   ` Jan Beulich
@ 2018-10-09 10:16     ` Julien Grall
  2018-10-09 11:43       ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-09 10:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel

Hi Jan,

On 09/10/2018 08:04, Jan Beulich wrote:
>>>> On 08.10.18 at 20:33, <julien.grall@arm.com> wrote:
>> At the moment, the implementation of Set/Way operations will go through
>> all the entries of the guest P2M and flush them. However, this is very
>> expensive and may render a guest OS using them unusable.
>>
>> For instance, 32-bit Linux will use Set/Way operations during secondary
>> CPU bring-up. As the implementation is really expensive, it is possible
>> to hit the CPU bring-up timeout.
>>
>> To limit the Set/Way impact, we track which pages of the guest have
>> been accessed between batches of Set/Way operations. This is done
>> using bit[0] (aka the valid bit) of the P2M entry.
>>
>> This patch introduces a new per-arch helper to perform actions just
>> before the guest is first unpaused. This will be used to invalidate the
>> P2M so accesses are tracked from the start of the guest.
> 
> While I'm not opposed to the new arch hook, why don't you create the
> p2m entries in their intended state right away? At the very least this
> would have the benefit of confining the entire change to Arm code.

Let me start by saying I think having a hook to perform an action once 
the VM has been fully created is quite useful. For instance, this could 
be used on Arm to limit the invalidation of the icache. At the moment, 
we invalidate the icache for every populate memory hypercall. This is 
quite a waste of cycles.

In this particular circumstance, I would still like to use the hardware 
for walking the page-tables during the domain creation (i.e. when 
copying the binary over). This would not be possible if we created the 
entries with the valid bit unset.

Furthermore, we don't need to create entries with the valid bit unset 
once the guest is running. So we would need to check in the P2M code 
whether the guest is running and whether the IOMMU is enabled.

So overall, I feel this is the best way to keep it simple for Arm and 
open the door to speeding up/streamlining domain creation a bit more.

Cheers,

-- 
Julien Grall


* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-10-09 10:16     ` Julien Grall
@ 2018-10-09 11:43       ` Jan Beulich
  2018-10-09 12:24         ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2018-10-09 11:43 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel

>>> On 09.10.18 at 12:16, <julien.grall@arm.com> wrote:
> On 09/10/2018 08:04, Jan Beulich wrote:
>>>>> On 08.10.18 at 20:33, <julien.grall@arm.com> wrote:
>>> This patch introduces a new per-arch helper to perform actions just
>>> before the guest is first unpaused. This will be used to invalidate the
>>> P2M so accesses are tracked from the start of the guest.
>> 
>> While I'm not opposed to the new arch hook, why don't you create the
>> p2m entries in their intended state right away? At the very least this
>> would have the benefit of confining the entire change to Arm code.
> 
> Let me start by saying I think having a hook to perform an action once 
> the VM has been fully created is quite useful. For instance, this could 
> be used on Arm to limit the invalidation of the icache. At the moment, 
> we invalidate the icache for every populate memory hypercall. This is 
> quite a waste of cycles.

As said - I'm not opposed to such a hook in principle, but I'd like
to understand the reasons (and in particular whether there's an
alternative without introducing such a hook).

> In this particular circumstance, I would still like to use the hardware 
> for walking the page-tables during the domain creation (i.e. when 
> copying the binary over). This would not be possible if we created the 
> entries with the valid bit unset.

This must be something Arm specific, since afaiu we're talking about
arbitrary domain creation here, not just Dom0. On x86 it would be
basically impossible to re-use the page tables created for the guest
to access guest memory from the control domain. (It could be made
work by inserting sub-trees into the control domain's page tables,
but obviously there would be a fair chance of conflict between the
virtual addresses the control domain uses for its own purposes and
the ones where the destination range in the domain being created
sits).

> Furthermore, we don't need to create entries with the valid bit unset 
> once the guest is running. So we would need to check in the P2M code 
> whether the guest is running and whether the IOMMU is enabled.

Well, looking at the patch context of your change, it is quite clear
that this would be pretty easy - simply taking d->creation_finished
into account.

Jan




* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-10-09 11:43       ` Jan Beulich
@ 2018-10-09 12:24         ` Julien Grall
  0 siblings, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-10-09 12:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel

Hi Jan,

On 09/10/2018 12:43, Jan Beulich wrote:
>>>> On 09.10.18 at 12:16, <julien.grall@arm.com> wrote:
>> On 09/10/2018 08:04, Jan Beulich wrote:
>>>>>> On 08.10.18 at 20:33, <julien.grall@arm.com> wrote:
>>>> This patch introduces a new per-arch helper to perform actions just
>>>> before the guest is first unpaused. This will be used to invalidate the
>>>> P2M so accesses are tracked from the start of the guest.
>>>
>>> While I'm not opposed to the new arch hook, why don't you create the
>>> p2m entries in their intended state right away? At the very least this
>>> would have the benefit of confining the entire change to Arm code.
>>
>> Let me start by saying I think having a hook to perform an action once
>> the VM has been fully created is quite useful. For instance, this could
>> be used on Arm to limit the invalidation of the icache. At the moment,
>> we invalidate the icache for every populate memory hypercall. This is
>> quite a waste of cycles.
> 
> As said - I'm not opposed to such a hook in principle, but I'd like
> to understand the reasons (and in particular whether there's an
> alternative without introducing such a hook).
> 
>> In this particular circumstance, I would still like to use the hardware
>> for walking the page-tables during the domain creation (i.e. when
>> copying the binary over). This would not be possible if we created the
>> entries with the valid bit unset.
> 
> This must be something Arm specific, since afaiu we're talking about
> arbitrary domain creation here, not just Dom0. On x86 it would be
> basically impossible to re-use the page tables created for the guest
> to access guest memory from the control domain. (It could be made
> work by inserting sub-trees into the control domain's page tables,
> but obviously there would be a fair chance of conflict between the
> virtual addresses the control domain uses for its own purposes and
> the ones where the destination range in the domain being created
> sits).

Well, we don't share sub-trees on Arm, yet having the valid bit set from 
the start is still useful if you want to use the hardware to translate a 
guest address (e.g. by switching between page-tables). This avoids a 
software lookup.

> 
>> Furthermore, we don't need to create entries with the valid bit unset
>> once the guest is running. So we would need to check in the P2M code
>> whether the guest is running and whether the IOMMU is enabled.
> 
> Well, looking at the patch context of your change, it is quite clear
> that this would be pretty easy - simply taking d->creation_finished
> into account.

I never said I couldn't use d->creation_finished. It is possible to 
spread it everywhere if we want to. But what's the point when it can be 
done in a single place?

Furthermore, as I suggested at the beginning of my previous answer, I 
can see other uses for this new hook. I am quite surprised you don't 
see any benefit for x86 too.

For instance, looking at the memory subsystem, it would be possible to 
defer the TLB flush until the domain actually first runs.

Cheers,

-- 
Julien Grall


* Re: [RFC 02/16] xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry
  2018-10-08 18:33 ` [RFC 02/16] xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry Julien Grall
@ 2018-10-30  0:07   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-10-30  0:07 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> The new helpers make it easier to read the code by abstracting the way to
> set/get an MFN from/to an LPAE entry. The helpers are using "walk" as the
> bits are common across different LPAE stages.
> 
> At the same time, use the new helpers to replace the various open-coding
> place.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     This patch was originally sent separately.
> ---
>  xen/arch/arm/mm.c          | 10 +++++-----
>  xen/arch/arm/p2m.c         | 19 ++++++++++---------
>  xen/include/asm-arm/lpae.h |  3 +++
>  3 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 7a06a33e21..0bc31b1d9b 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -238,7 +238,7 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
>  
>          /* For next iteration */
>          unmap_domain_page(mapping);
> -        mapping = map_domain_page(_mfn(pte.walk.base));
> +        mapping = map_domain_page(lpae_get_mfn(pte));
>      }
>  
>      unmap_domain_page(mapping);
> @@ -323,7 +323,7 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned attr)
>  
>      ASSERT(!(mfn_to_maddr(mfn) & ~PADDR_MASK));
>  
> -    e.pt.base = mfn_x(mfn);
> +    lpae_set_mfn(e, mfn);
>  
>      return e;
>  }
> @@ -490,7 +490,7 @@ mfn_t domain_page_map_to_mfn(const void *ptr)
>      ASSERT(slot >= 0 && slot < DOMHEAP_ENTRIES);
>      ASSERT(map[slot].pt.avail != 0);
>  
> -    return _mfn(map[slot].pt.base + offset);
> +    return mfn_add(lpae_get_mfn(map[slot]), offset);
>  }
>  #endif
>  
> @@ -851,7 +851,7 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
>              /* mfn_to_virt is not valid on the 1st mfn, since it
>               * is not within the xenheap. */
>              first = slot == xenheap_first_first_slot ?
> -                xenheap_first_first : __mfn_to_virt(p->pt.base);
> +                xenheap_first_first : mfn_to_virt(lpae_get_mfn(*p));
>          }
>          else if ( xenheap_first_first_slot == -1)
>          {
> @@ -1007,7 +1007,7 @@ static int create_xen_entries(enum xenmap_operation op,
>  
>          BUG_ON(!lpae_is_valid(*entry));
>  
> -        third = __mfn_to_virt(entry->pt.base);
> +        third = mfn_to_virt(lpae_get_mfn(*entry));
>          entry = &third[third_table_offset(addr)];
>  
>          switch ( op ) {
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 30cfb01498..f8a2f6459e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -265,7 +265,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>      if ( lpae_is_mapping(*entry, level) )
>          return GUEST_TABLE_SUPER_PAGE;
>  
> -    mfn = _mfn(entry->p2m.base);
> +    mfn = lpae_get_mfn(*entry);
>  
>      unmap_domain_page(*table);
>      *table = map_domain_page(mfn);
> @@ -349,7 +349,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>          if ( a )
>              *a = p2m_mem_access_radix_get(p2m, gfn);
>  
> -        mfn = _mfn(entry.p2m.base);
> +        mfn = lpae_get_mfn(entry);
>          /*
>           * The entry may point to a superpage. Find the MFN associated
>           * to the GFN.
> @@ -519,7 +519,7 @@ static lpae_t mfn_to_p2m_entry(mfn_t mfn, p2m_type_t t, p2m_access_t a)
>  
>      ASSERT(!(mfn_to_maddr(mfn) & ~PADDR_MASK));
>  
> -    e.p2m.base = mfn_x(mfn);
> +    lpae_set_mfn(e, mfn);
>  
>      return e;
>  }
> @@ -621,7 +621,7 @@ static void p2m_put_l3_page(const lpae_t pte)
>       */
>      if ( p2m_is_foreign(pte.p2m.type) )
>      {
> -        mfn_t mfn = _mfn(pte.p2m.base);
> +        mfn_t mfn = lpae_get_mfn(pte);
>  
>          ASSERT(mfn_valid(mfn));
>          put_page(mfn_to_page(mfn));
> @@ -655,7 +655,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>          return;
>      }
>  
> -    table = map_domain_page(_mfn(entry.p2m.base));
> +    table = map_domain_page(lpae_get_mfn(entry));
>      for ( i = 0; i < LPAE_ENTRIES; i++ )
>          p2m_free_entry(p2m, *(table + i), level + 1);
>  
> @@ -669,7 +669,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>       */
>      p2m_tlb_flush_sync(p2m);
>  
> -    mfn = _mfn(entry.p2m.base);
> +    mfn = lpae_get_mfn(entry);
>      ASSERT(mfn_valid(mfn));
>  
>      pg = mfn_to_page(mfn);
> @@ -688,7 +688,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
>      bool rv = true;
>  
>      /* Convenience aliases */
> -    mfn_t mfn = _mfn(entry->p2m.base);
> +    mfn_t mfn = lpae_get_mfn(*entry);
>      unsigned int next_level = level + 1;
>      unsigned int level_order = level_orders[next_level];
>  
> @@ -719,7 +719,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
>           * the necessary fields. So the correct permission are kept.
>           */
>          pte = *entry;
> -        pte.p2m.base = mfn_x(mfn_add(mfn, i << level_order));
> +        lpae_set_mfn(pte, mfn_add(mfn, i << level_order));
>  
>          /*
>           * First and second level pages set p2m.table = 0, but third
> @@ -952,7 +952,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>       * Free the entry only if the original pte was valid and the base
>       * is different (to avoid freeing when permission is changed).
>       */
> -    if ( lpae_is_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
> +    if ( lpae_is_valid(orig_pte) &&
> +         !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>          p2m_free_entry(p2m, orig_pte, level);
>  
>      if ( need_iommu_pt_sync(p2m->domain) &&
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index 15595cd35c..17fdc6074f 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -153,6 +153,9 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>      return (level < 3) && lpae_is_mapping(pte, level);
>  }
>  
> +#define lpae_get_mfn(pte)    (_mfn((pte).walk.base))
> +#define lpae_set_mfn(pte, mfn)  ((pte).walk.base = mfn_x(mfn))
> +
>  /*
>   * AArch64 supports pages with different sizes (4K, 16K, and 64K). To enable
>   * page table walks for various configurations, the following helpers enable

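For illustration, here is a minimal standalone sketch of the accessor
pattern the hunks above introduce. The lpae_t layout and mfn_t below are
simplified stand-ins, not the real Xen definitions:

    #include <stdio.h>
    #include <stdint.h>

    /* Simplified stand-ins: the real lpae_t is a union of several
     * bitfield views (walk, pt, p2m) sharing the same base field. */
    typedef struct { uint64_t m; } mfn_t;

    typedef union {
        struct {
            uint64_t valid:1, table:1, pad:10, base:36, rest:16;
        } walk;
    } lpae_t;

    /* The helpers hide which bitfield view carries the frame number, so
     * callers stop open-coding pte.walk.base / pte.pt.base / pte.p2m.base. */
    #define lpae_get_mfn(pte)       ((mfn_t){ (pte).walk.base })
    #define lpae_set_mfn(pte, mfn)  ((pte).walk.base = (mfn).m)

    int main(void)
    {
        lpae_t pte = { .walk = { .valid = 1 } };
        mfn_t mfn = { 0x12345 };

        lpae_set_mfn(pte, mfn);   /* was: pte.pt.base = mfn_x(mfn); */
        printf("base=%#llx\n", (unsigned long long)lpae_get_mfn(pte).m);
        return 0;
    }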

* Re: [RFC 03/16] xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry
  2018-10-08 18:33 ` [RFC 03/16] xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry Julien Grall
@ 2018-10-30  0:10   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-10-30  0:10 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> Currently, the lpae_is_{table, mapping} helpers will always return false on
> entries with the valid bit unset. However, it would be useful to have them
> operate on any entry, for instance to store information in advance while
> still generating a fault.
> 
> With that change, the p2m now provides an overlay for *_is_{table,
> mapping} that checks the valid bit of the entry.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
> 
>     The patch was previously sent separately.
> ---
>  xen/arch/arm/guest_walk.c  |  2 +-
>  xen/arch/arm/mm.c          |  2 +-
>  xen/arch/arm/p2m.c         | 22 ++++++++++++++++++----
>  xen/include/asm-arm/lpae.h | 11 +++++++----
>  4 files changed, 27 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
> index e3e21bdad3..4a1b4cf2c8 100644
> --- a/xen/arch/arm/guest_walk.c
> +++ b/xen/arch/arm/guest_walk.c
> @@ -566,7 +566,7 @@ static int guest_walk_ld(const struct vcpu *v,
>       * PTE is invalid or holds a reserved entry (PTE<1:0> == x0)) or if the PTE
>       * maps a memory block at level 3 (PTE<1:0> == 01).
>       */
> -    if ( !lpae_is_mapping(pte, level) )
> +    if ( !lpae_is_valid(pte) || !lpae_is_mapping(pte, level) )
>          return -EFAULT;
>  
>      /* Make sure that the lower bits of the PTE's base address are zero. */
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 0bc31b1d9b..987fcb9162 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -996,7 +996,7 @@ static int create_xen_entries(enum xenmap_operation op,
>      for(; addr < addr_end; addr += PAGE_SIZE, mfn = mfn_add(mfn, 1))
>      {
>          entry = &xen_second[second_linear_offset(addr)];
> -        if ( !lpae_is_table(*entry, 2) )
> +        if ( !lpae_is_valid(*entry) || !lpae_is_table(*entry, 2) )
>          {
>              rc = create_xen_table(entry);
>              if ( rc < 0 ) {
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index f8a2f6459e..8fffb42889 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -219,6 +219,20 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
>          return radix_tree_ptr_to_int(ptr);
>  }
>  
> +/*
> + * lpae_is_* helpers don't check whether the valid bit is set in the
> + * PTE. Provide our own overlay to check the valid bit.
> + */
> +static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
> +{
> +    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
> +}
> +
> +static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
> +{
> +    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
> +}
> +
>  #define GUEST_TABLE_MAP_FAILED 0
>  #define GUEST_TABLE_SUPER_PAGE 1
>  #define GUEST_TABLE_NORMAL_PAGE 2
> @@ -262,7 +276,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>  
>      /* The function p2m_next_level is never called at the 3rd level */
>      ASSERT(level < 3);
> -    if ( lpae_is_mapping(*entry, level) )
> +    if ( p2m_is_mapping(*entry, level) )
>          return GUEST_TABLE_SUPER_PAGE;
>  
>      mfn = lpae_get_mfn(*entry);
> @@ -642,7 +656,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>          return;
>  
>      /* Nothing to do but updating the stats if the entry is a super-page. */
> -    if ( lpae_is_superpage(entry, level) )
> +    if ( p2m_is_superpage(entry, level) )
>      {
>          p2m->stats.mappings[level]--;
>          return;
> @@ -697,7 +711,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
>       * a superpage.
>       */
>      ASSERT(level < target);
> -    ASSERT(lpae_is_superpage(*entry, level));
> +    ASSERT(p2m_is_superpage(*entry, level));
>  
>      page = alloc_domheap_page(NULL, 0);
>      if ( !page )
> @@ -836,7 +850,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>          /* We need to split the original page. */
>          lpae_t split_pte = *entry;
>  
> -        ASSERT(lpae_is_superpage(*entry, level));
> +        ASSERT(p2m_is_superpage(*entry, level));
>  
>          if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets) )
>          {
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index 17fdc6074f..545b0c8f24 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -133,16 +133,19 @@ static inline bool lpae_is_valid(lpae_t pte)
>      return pte.walk.valid;
>  }
>  
> +/*
> + * lpae_is_* don't check the valid bit. This gives an opportunity for the
> + * callers to operate on entries even if they are not valid. For
> + * instance, to store information in advance.
> + */
>  static inline bool lpae_is_table(lpae_t pte, unsigned int level)
>  {
> -    return (level < 3) && lpae_is_valid(pte) && pte.walk.table;
> +    return (level < 3) && pte.walk.table;
>  }
>  
>  static inline bool lpae_is_mapping(lpae_t pte, unsigned int level)
>  {
> -    if ( !lpae_is_valid(pte) )
> -        return false;
> -    else if ( level == 3 )
> +    if ( level == 3 )
>          return pte.walk.table;
>      else
>          return !pte.walk.table;
> -- 
> 2.11.0
> 

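Condensed, the split that results from this patch looks as follows; the
types here are simplified stand-ins rather than the real Xen definitions:

    #include <stdbool.h>

    /* Simplified stand-in for the LPAE descriptor. */
    typedef union {
        struct { unsigned long valid:1, table:1; } walk;
    } lpae_t;

    static inline bool lpae_is_valid(lpae_t pte) { return pte.walk.valid; }

    /* Raw helper: deliberately ignores the valid bit, so it can classify
     * entries that carry information but are not (yet) valid. */
    static inline bool lpae_is_mapping(lpae_t pte, unsigned int level)
    {
        if ( level == 3 )
            return pte.walk.table;
        else
            return !pte.walk.table;
    }

    /* P2M overlay: callers that do care about validity re-add the check. */
    static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
    {
        return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
    }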

* Re: [RFC 05/16] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h
  2018-10-08 18:33 ` [RFC 05/16] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
@ 2018-10-30  0:11   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-10-30  0:11 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> GUEST_BUG_ON may be used in other files doing guest emulation.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> 
>     The patch was previously sent separately.
> ---
>  xen/arch/arm/traps.c        | 24 ------------------------
>  xen/include/asm-arm/traps.h | 24 ++++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 9251ae50b8..b40798084d 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -68,30 +68,6 @@ static inline void check_stack_alignment_constraints(void) {
>  #endif
>  }
>  
> -/*
> - * GUEST_BUG_ON is intended for checking that the guest state has not been
> - * corrupted in hardware and/or that the hardware behaves as we
> - * believe it should (i.e. that certain traps can only occur when the
> - * guest is in a particular mode).
> - *
> - * The intention is to limit the damage such h/w bugs (or spec
> - * misunderstandings) can do by turning them into Denial of Service
> - * attacks instead of e.g. information leaks or privilege escalations.
> - *
> - * GUEST_BUG_ON *MUST* *NOT* be used to check for guest controllable state!
> - *
> - * Compared with regular BUG_ON it dumps the guest vcpu state instead
> - * of Xen's state.
> - */
> -#define guest_bug_on_failed(p)                          \
> -do {                                                    \
> -    show_execution_state(guest_cpu_user_regs());        \
> -    panic("Guest Bug: %pv: '%s', line %d, file %s\n",   \
> -          current, p, __LINE__, __FILE__);              \
> -} while (0)
> -#define GUEST_BUG_ON(p) \
> -    do { if ( unlikely(p) ) guest_bug_on_failed(#p); } while (0)
> -
>  #ifdef CONFIG_ARM_32
>  static int debug_stack_lines = 20;
>  #define stack_words_per_line 8
> diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
> index 70b52d1d16..0acf7de67d 100644
> --- a/xen/include/asm-arm/traps.h
> +++ b/xen/include/asm-arm/traps.h
> @@ -9,6 +9,30 @@
>  # include <asm/arm64/traps.h>
>  #endif
>  
> +/*
> + * GUEST_BUG_ON is intended for checking that the guest state has not been
> + * corrupted in hardware and/or that the hardware behaves as we
> + * believe it should (i.e. that certain traps can only occur when the
> + * guest is in a particular mode).
> + *
> + * The intention is to limit the damage such h/w bugs (or spec
> + * misunderstandings) can do by turning them into Denial of Service
> + * attacks instead of e.g. information leaks or privilege escalations.
> + *
> + * GUEST_BUG_ON *MUST* *NOT* be used to check for guest controllable state!
> + *
> + * Compared with regular BUG_ON it dumps the guest vcpu state instead
> + * of Xen's state.
> + */
> +#define guest_bug_on_failed(p)                          \
> +do {                                                    \
> +    show_execution_state(guest_cpu_user_regs());        \
> +    panic("Guest Bug: %pv: '%s', line %d, file %s\n",   \
> +          current, p, __LINE__, __FILE__);              \
> +} while (0)
> +#define GUEST_BUG_ON(p) \
> +    do { if ( unlikely(p) ) guest_bug_on_failed(#p); } while (0)
> +
>  int check_conditional_instr(struct cpu_user_regs *regs, const union hsr hsr);
>  
>  void advance_pc(struct cpu_user_regs *regs, const union hsr hsr);
> -- 
> 2.11.0
> 

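As a hypothetical example of what the move enables, any file including
<asm/traps.h> can now assert hardware-guaranteed invariants; the handler
below is made up, only GUEST_BUG_ON and the constant are from the tree:

    #include <asm/traps.h>

    static void vreg_emulate_example(struct cpu_user_regs *regs,
                                     const union hsr hsr)
    {
        /*
         * Suppose the architecture guarantees this handler can only be
         * reached for data aborts from a lower EL. Anything else means
         * broken hardware or a spec misunderstanding, so turn it into a
         * guest panic that dumps the vcpu state.
         */
        GUEST_BUG_ON(hsr.ec != HSR_EC_DATA_ABORT_LOWER_EL);

        /* ... actual emulation would go here ... */
    }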

* Re: [RFC 06/16] xen/arm: p2m: Introduce a helper to generate P2M table entry from a page
  2018-10-08 18:33 ` [RFC 06/16] xen/arm: p2m: Introduce a helper to generate P2M table entry from a page Julien Grall
@ 2018-10-30  0:14   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-10-30  0:14 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> Generating a P2M table entry requires setting some default values which are
> worth explaining in a comment. At the moment, there are 2 places where
> such entries are created but only one has a proper comment.
> 
> So move the code generating a P2M table entry into a separate helper.
> This will be helpful in a follow-up patch to modify the
> defaults.
> 
> At the same time, switch the default access from p2m->default_access to
> p2m_access_rwx. This should not matter as permissions are ignored for
> table entries by the hardware.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>  xen/arch/arm/p2m.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 8fffb42889..6c76298ebc 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -538,6 +538,16 @@ static lpae_t mfn_to_p2m_entry(mfn_t mfn, p2m_type_t t, p2m_access_t a)
>      return e;
>  }
>  
> +/* Generate table entry with correct attributes. */
> +static lpae_t page_to_p2m_table(struct page_info *page)
> +{
> +    /*
> +     * The access value does not matter because the hardware will ignore
> +     * the permission fields for table entry.
> +     */
> +    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
> +}
> +
>  static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
>  {
>      write_pte(p, pte);
> @@ -558,7 +568,6 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
>  {
>      struct page_info *page;
>      lpae_t *p;
> -    lpae_t pte;
>  
>      ASSERT(!lpae_is_valid(*entry));
>  
> @@ -576,14 +585,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
>  
>      unmap_domain_page(p);
>  
> -    /*
> -     * The access value does not matter because the hardware will ignore
> -     * the permission fields for table entry.
> -     */
> -    pte = mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid,
> -                           p2m->default_access);
> -
> -    p2m_write_pte(entry, pte, p2m->clean_pte);
> +    p2m_write_pte(entry, page_to_p2m_table(page), p2m->clean_pte);
>  
>      return 0;
>  }
> @@ -764,14 +766,11 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
>  
>      unmap_domain_page(table);
>  
> -    pte = mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid,
> -                           p2m->default_access);
> -
>      /*
>       * Even if we failed, we should install the newly allocated LPAE
>       * entry. The caller will be in charge to free the sub-tree.
>       */
> -    p2m_write_pte(entry, pte, p2m->clean_pte);
> +    p2m_write_pte(entry, page_to_p2m_table(page), p2m->clean_pte);
>  
>      return rv;
>  }
> -- 
> 2.11.0
> 


* Re: [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-10-08 18:33 ` [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
@ 2018-10-30  0:21   ` Stefano Stabellini
  2018-10-30 11:02     ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-10-30  0:21 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> The LPAE format allows storing information in an entry even with the
> valid bit unset. In a follow-up patch, we will take advantage of this
> feature to re-purpose the valid bit for generating a translation fault
> even if an entry contains valid information.
> 
> So we need a different way to know whether an entry contains valid
> information. It is possible to use the information held in the p2m_type
> for that purpose. Indeed, all entries containing valid
> information will have a valid p2m type (i.e. p2m_type != p2m_invalid).
> 
> This patch introduces a new helper, p2m_is_valid, which implements that
> idea, and replaces most lpae_is_valid calls with the new helper. The ones
> remaining are for TLB handling and entry accounting.
> 
> With the renaming there are 2 other changes required:
>     - Generate table entries with a valid p2m type
>     - Detect new mappings for proper stats accounting
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/p2m.c | 34 +++++++++++++++++++++++-----------
>  1 file changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 6c76298ebc..2a1e7e9be2 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -220,17 +220,26 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
>  }
>  
>  /*
> + * In the case of the P2M, the valid bit is used for other purpose. Use
> + * the type to check whether an entry is valid.
> + */
> +static inline bool p2m_is_valid(lpae_t pte)
> +{
> +    return pte.p2m.type != p2m_invalid;
> +}
> +
> +/*
>   * lpae_is_* helpers don't check whether the valid bit is set in the
>   * PTE. Provide our own overlay to check the valid bit.
>   */
>  static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
>  {
> -    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
> +    return p2m_is_valid(pte) && lpae_is_mapping(pte, level);
>  }
>  
>  static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
>  {
> -    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
> +    return p2m_is_valid(pte) && lpae_is_superpage(pte, level);
>  }
>  
>  #define GUEST_TABLE_MAP_FAILED 0
> @@ -264,7 +273,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>  
>      entry = *table + offset;
>  
> -    if ( !lpae_is_valid(*entry) )
> +    if ( !p2m_is_valid(*entry) )
>      {
>          if ( read_only )
>              return GUEST_TABLE_MAP_FAILED;
> @@ -356,7 +365,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>  
>      entry = table[offsets[level]];
>  
> -    if ( lpae_is_valid(entry) )
> +    if ( p2m_is_valid(entry) )
>      {
>          *t = entry.p2m.type;
>  
> @@ -544,8 +553,11 @@ static lpae_t page_to_p2m_table(struct page_info *page)
>      /*
>       * The access value does not matter because the hardware will ignore
>       * the permission fields for table entry.
> +     *
> +     * We use p2m_ram_rw so the entry has a valid type. This is important
> +     * for p2m_is_valid() to return valid on table entries.
>       */
> -    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
> +    return mfn_to_p2m_entry(page_to_mfn(page), p2m_ram_rw, p2m_access_rwx);
>  }
>  
>  static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
> @@ -569,7 +581,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
>      struct page_info *page;
>      lpae_t *p;
>  
> -    ASSERT(!lpae_is_valid(*entry));
> +    ASSERT(!p2m_is_valid(*entry));
>  
>      page = alloc_domheap_page(NULL, 0);
>      if ( page == NULL )
> @@ -626,7 +638,7 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
>   */
>  static void p2m_put_l3_page(const lpae_t pte)
>  {
> -    ASSERT(lpae_is_valid(pte));
> +    ASSERT(p2m_is_valid(pte));
>  
>      /*
>       * TODO: Handle other p2m types
> @@ -654,11 +666,11 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>      struct page_info *pg;
>  
>      /* Nothing to do if the entry is invalid. */
> -    if ( !lpae_is_valid(entry) )
> +    if ( !p2m_is_valid(entry) )
>          return;
>  
>      /* Nothing to do but updating the stats if the entry is a super-page. */
> -    if ( p2m_is_superpage(entry, level) )
> +    if ( level == 3 && entry.p2m.table )

Why?


>      {
>          p2m->stats.mappings[level]--;
>          return;
> @@ -951,7 +963,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>              else
>                  p2m->need_flush = true;
>          }
> -        else /* new mapping */
> +        else if ( !p2m_is_valid(orig_pte) ) /* new mapping */

There are a couple of lpae_is_valid checks just above this line that you
missed, why haven't you changed them?

If you have a good reason, please explain in a comment and/or commit
message.

>              p2m->stats.mappings[level]++;
>  
>          p2m_write_pte(entry, pte, p2m->clean_pte);
> @@ -965,7 +977,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>       * Free the entry only if the original pte was valid and the base
>       * is different (to avoid freeing when permission is changed).
>       */
> -    if ( lpae_is_valid(orig_pte) &&
> +    if ( p2m_is_valid(orig_pte) &&
>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>          p2m_free_entry(p2m, orig_pte, level);
>  

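The distinction the commit message relies on, an entry that is invalid
for the hardware walker yet valid for the P2M, can be shown with a
standalone sketch (field widths are illustrative, not the real layout):

    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { p2m_invalid = 0, p2m_ram_rw = 1 } p2m_type_t;

    /* bit[0] is what the MMU checks; the type field is software-defined
     * and ignored by the hardware. */
    typedef union {
        struct { unsigned long valid:1, table:1, pad:2, type:4; } p2m;
        struct { unsigned long valid:1; } walk;
    } lpae_t;

    static bool lpae_is_valid(lpae_t pte) { return pte.walk.valid; }
    static bool p2m_is_valid(lpae_t pte) { return pte.p2m.type != p2m_invalid; }

    int main(void)
    {
        /* Entry holding valid information but set up to fault on access. */
        lpae_t pte = { .p2m = { .valid = 0, .type = p2m_ram_rw } };

        printf("hardware valid: %d, p2m valid: %d\n",
               lpae_is_valid(pte), p2m_is_valid(pte)); /* prints 0, 1 */
        return 0;
    }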

* Re: [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-10-08 18:33 ` [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
@ 2018-10-30  0:47   ` Stefano Stabellini
  2018-10-30 11:24     ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-10-30  0:47 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> A follow-up patch will re-purpose the valid bit of LPAE entries to
> generate faults even on entries containing valid information.
> 
> This means that translating a guest VA to a guest PA (i.e. IPA) will
> fail if the Stage-2 entries used have the valid bit unset. Because of
> that, we need to fall back to walking the page-tables in software to check
> whether the fault was expected.
> 
> This patch adds the software page-table walk on all translation
> faults. It would be possible in the future to avoid a pointless walk when
> the fault reported in PAR_EL1 is not a translation fault.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
> 
> There are a couple of TODOs in the code. They are clean-ups and performance
> improvements (e.g. when the fault cannot be handled) that could be addressed
> after the series has been merged.
> ---
>  xen/arch/arm/p2m.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 58 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 2a1e7e9be2..ec956bc151 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -13,6 +13,7 @@
>  #include <public/vm_event.h>
>  #include <asm/flushtlb.h>
>  #include <asm/event.h>
> +#include <asm/guest_walk.h>
>  #include <asm/hardirq.h>
>  #include <asm/page.h>
>  
> @@ -1438,6 +1439,8 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>      struct page_info *page = NULL;
>      paddr_t maddr = 0;
>      uint64_t par;
> +    mfn_t mfn;
> +    p2m_type_t t;
>  
>      /*
>       * XXX: To support a different vCPU, we would need to load the
> @@ -1454,8 +1457,30 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>      par = gvirt_to_maddr(va, &maddr, flags);
>      p2m_read_unlock(p2m);
>  
> +    /*
> +     * gvirt_to_maddr may fail if the entry does not have the valid bit
> +     * set. Fallback
> +     * to the second method:

pointless newline

> +     *  1) Translate the VA to IPA using software lookup -> Stage-1 page-table
> +     *  may not be accessible because the stage-2 entries may have valid
> +     *  bit unset.
> +     *  2) Software lookup of the MFN
> +     *
> +     * Note that when memaccess is enabled, we instead call directly
> +     * p2m_mem_access_check_and_get_page(...). Because the function is
> +     * a variant of the methods described above, it will be able to
> +     * handle entry with valid bit unset.
                 ^ entries

> +     * TODO: Integrate memaccess more nicely with the rest of the
> +     * function.
> +     * TODO: Use the fault error in PAR_EL1 to avoid pointless
> +     *  translation.
> +     */
>      if ( par )
>      {
> +        paddr_t ipa;
> +        unsigned int perms;
> +
>          /*
>           * When memaccess is enabled, the translation GVA to MADDR may
>           * have failed because of a permission fault.
> @@ -1463,20 +1488,46 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>          if ( p2m->mem_access_enabled )
>              return p2m_mem_access_check_and_get_page(va, flags, v);
>  
> -        dprintk(XENLOG_G_DEBUG,
> -                "%pv: gvirt_to_maddr failed va=%#"PRIvaddr" flags=0x%lx par=%#"PRIx64"\n",
> -                v, va, flags, par);
> -        return NULL;
> +        /*
> +         * The software stage-1 table walk can still fail, e.g, if the
> +         * GVA is not mapped.
> +         */
> +        if ( !guest_walk_tables(v, va, &ipa, &perms) )
> +        {
> +            dprintk(XENLOG_G_DEBUG, "%pv: Failed to walk page-table va %#"PRIvaddr"\n", v, va);

Greater than 80 chars line.


> +            return NULL;
> +        }
> +
> +        /*
> +         * Check permission that are assumed by the caller. For instance
> +         * in case of guestcopy, the caller assumes that the translated
> +         * page can be accessed with the requested permissions. If this
> +         * is not the case, we should fail.
> +         *
> +         * Please note that we do not check for the GV2M_EXEC
> +         * permission. This is fine because the hardware-based translation
> +         * instruction does not test for execute permissions.
> +         */
> +        if ( (flags & GV2M_WRITE) && !(perms & GV2M_WRITE) )
> +            return NULL;
> +
> +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
> +        if ( mfn_eq(INVALID_MFN, mfn) )
> +            return NULL;
> +
> +        /* We consider that RAM is always mapped read-write */

Is the RW assumption required? I can think of a case where RAM is mapped
RO at stage2.


>      }
> +    else
> +        mfn = maddr_to_mfn(maddr);
>  
> -    if ( !mfn_valid(maddr_to_mfn(maddr)) )
> +    if ( !mfn_valid(mfn) )
>      {
>          dprintk(XENLOG_G_DEBUG, "%pv: Invalid MFN %#"PRI_mfn"\n",
> -                v, mfn_x(maddr_to_mfn(maddr)));
> +                v, mfn_x(mfn));
>          return NULL;
>      }
>  
> -    page = mfn_to_page(maddr_to_mfn(maddr));
> +    page = mfn_to_page(mfn);
>      ASSERT(page);
>  
>      if ( unlikely(!get_page(page, d)) )
> -- 
> 2.11.0
> 

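For readers skimming the thread, the control flow the hunk establishes,
condensed (locking, debug output and error messages trimmed, names as in
the patch):

    par = gvirt_to_maddr(va, &maddr, flags);        /* 1. hardware lookup */
    if ( par )
    {
        if ( p2m->mem_access_enabled )              /* memaccess variant  */
            return p2m_mem_access_check_and_get_page(va, flags, v);

        if ( !guest_walk_tables(v, va, &ipa, &perms) )  /* 2. s/w stage-1 */
            return NULL;
        if ( (flags & GV2M_WRITE) && !(perms & GV2M_WRITE) )
            return NULL;

        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);     /* 3. s/w stage-2 */
        if ( mfn_eq(INVALID_MFN, mfn) )
            return NULL;
    }
    else
        mfn = maddr_to_mfn(maddr);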

* Re: [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-10-30  0:21   ` Stefano Stabellini
@ 2018-10-30 11:02     ` Julien Grall
  2018-11-02 22:45       ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-10-30 11:02 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi,

On 30/10/2018 00:21, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> The LPAE format allows storing information in an entry even with the
>> valid bit unset. In a follow-up patch, we will take advantage of this
>> feature to re-purpose the valid bit for generating a translation fault
>> even if an entry contains valid information.
>>
>> So we need a different way to know whether an entry contains valid
>> information. It is possible to use the information held in the p2m_type
>> for that purpose. Indeed, all entries containing valid
>> information will have a valid p2m type (i.e. p2m_type != p2m_invalid).
>>
>> This patch introduces a new helper, p2m_is_valid, which implements that
>> idea, and replaces most lpae_is_valid calls with the new helper. The ones
>> remaining are for TLB handling and entry accounting.
>>
>> With the renaming there are 2 other changes required:
>>      - Generate table entries with a valid p2m type
>>      - Detect new mappings for proper stats accounting
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   xen/arch/arm/p2m.c | 34 +++++++++++++++++++++++-----------
>>   1 file changed, 23 insertions(+), 11 deletions(-)
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index 6c76298ebc..2a1e7e9be2 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -220,17 +220,26 @@ static p2m_access_t p2m_mem_access_radix_get(struct p2m_domain *p2m, gfn_t gfn)
>>   }
>>   
>>   /*
>> + * In the case of the P2M, the valid bit is used for other purpose. Use
>> + * the type to check whether an entry is valid.
>> + */
>> +static inline bool p2m_is_valid(lpae_t pte)
>> +{
>> +    return pte.p2m.type != p2m_invalid;
>> +}
>> +
>> +/*
>>    * lpae_is_* helpers don't check whether the valid bit is set in the
>>    * PTE. Provide our own overlay to check the valid bit.
>>    */
>>   static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
>>   {
>> -    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
>> +    return p2m_is_valid(pte) && lpae_is_mapping(pte, level);
>>   }
>>   
>>   static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
>>   {
>> -    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
>> +    return p2m_is_valid(pte) && lpae_is_superpage(pte, level);
>>   }
>>   
>>   #define GUEST_TABLE_MAP_FAILED 0
>> @@ -264,7 +273,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>>   
>>       entry = *table + offset;
>>   
>> -    if ( !lpae_is_valid(*entry) )
>> +    if ( !p2m_is_valid(*entry) )
>>       {
>>           if ( read_only )
>>               return GUEST_TABLE_MAP_FAILED;
>> @@ -356,7 +365,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>   
>>       entry = table[offsets[level]];
>>   
>> -    if ( lpae_is_valid(entry) )
>> +    if ( p2m_is_valid(entry) )
>>       {
>>           *t = entry.p2m.type;
>>   
>> @@ -544,8 +553,11 @@ static lpae_t page_to_p2m_table(struct page_info *page)
>>       /*
>>        * The access value does not matter because the hardware will ignore
>>        * the permission fields for table entry.
>> +     *
>> +     * We use p2m_ram_rw so the entry has a valid type. This is important
>> +     * for p2m_is_valid() to return valid on table entries.
>>        */
>> -    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, p2m_access_rwx);
>> +    return mfn_to_p2m_entry(page_to_mfn(page), p2m_ram_rw, p2m_access_rwx);
>>   }
>>   
>>   static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool clean_pte)
>> @@ -569,7 +581,7 @@ static int p2m_create_table(struct p2m_domain *p2m, lpae_t *entry)
>>       struct page_info *page;
>>       lpae_t *p;
>>   
>> -    ASSERT(!lpae_is_valid(*entry));
>> +    ASSERT(!p2m_is_valid(*entry));
>>   
>>       page = alloc_domheap_page(NULL, 0);
>>       if ( page == NULL )
>> @@ -626,7 +638,7 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
>>    */
>>   static void p2m_put_l3_page(const lpae_t pte)
>>   {
>> -    ASSERT(lpae_is_valid(pte));
>> +    ASSERT(p2m_is_valid(pte));
>>   
>>       /*
>>        * TODO: Handle other p2m types
>> @@ -654,11 +666,11 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>>       struct page_info *pg;
>>   
>>       /* Nothing to do if the entry is invalid. */
>> -    if ( !lpae_is_valid(entry) )
>> +    if ( !p2m_is_valid(entry) )
>>           return;
>>   
>>       /* Nothing to do but updating the stats if the entry is a super-page. */
>> -    if ( p2m_is_superpage(entry, level) )
>> +    if ( level == 3 && entry.p2m.table )
> 
> Why?

Because p2m_is_superpage(...) already contains p2m_is_valid(), we would test
the validity of the p2m twice.

But I guess this is not a big deal, so I can remove it.

> 
> 
>>       {
>>           p2m->stats.mappings[level]--;
>>           return;
>> @@ -951,7 +963,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>>               else
>>                   p2m->need_flush = true;
>>           }
>> -        else /* new mapping */
>> +        else if ( !p2m_is_valid(orig_pte) ) /* new mapping */
> 
> There are a couple of lpae_is_valid checks just above this line that you
> missed, why haven't you changed them?
> 
> If you have a good reason, please explain in a comment and/or commit
> message.

This is already explained in the commit message:

"This patch introduces a new helper p2m_is_valid, which implements that
idea, and replace most of lpae_is_valid call with the new helper. The ones
remaining are for TLBs handling and entries accounting."

I believe that the code has enough existing comments to understand why
lpae_is_valid(...) should be kept. You deal with hardware updates and hence you
should use the valid bit in the LPAE table. This will tell you whether you need
to flush the TLBs.
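
A sketch of the idea (not the actual Xen code):

    /*
     * Whether to flush is driven by the hardware-visible bit of the *old*
     * entry: the TLB can only hold translations the hardware could walk.
     */
    if ( lpae_is_valid(orig_pte) )
        p2m->need_flush = true;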

> 
>>               p2m->stats.mappings[level]++;
>>   
>>           p2m_write_pte(entry, pte, p2m->clean_pte);
>> @@ -965,7 +977,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>>        * Free the entry only if the original pte was valid and the base
>>        * is different (to avoid freeing when permission is changed).
>>        */
>> -    if ( lpae_is_valid(orig_pte) &&
>> +    if ( p2m_is_valid(orig_pte) &&
>>            !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>>           p2m_free_entry(p2m, orig_pte, level);
>>   

Cheers,

-- 
Julien Grall


* Re: [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva
  2018-10-30  0:47   ` Stefano Stabellini
@ 2018-10-30 11:24     ` Julien Grall
  0 siblings, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-10-30 11:24 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel



On 30/10/2018 00:47, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> +            return NULL;
>> +        }
>> +
>> +        /*
>> +         * Check permission that are assumed by the caller. For instance
>> +         * in case of guestcopy, the caller assumes that the translated
>> +         * page can be accessed with the requested permissions. If this
>> +         * is not the case, we should fail.
>> +         *
>> +         * Please note that we do not check for the GV2M_EXEC
>> +         * permission. This is fine because the hardware-based translation
>> +         * instruction does not test for execute permissions.
>> +         */
>> +        if ( (flags & GV2M_WRITE) && !(perms & GV2M_WRITE) )
>> +            return NULL;
>> +
>> +        mfn = p2m_lookup(d, gaddr_to_gfn(ipa), &t);
>> +        if ( mfn_eq(INVALID_MFN, mfn) )
>> +            return NULL;
>> +
>> +        /* We consider that RAM is always mapped read-write */
> 
> Is the RW assumption required? I can think of a case where RAM is mapped
> RO at stage2.

Just laziness for a first implementation. I will see how I can fix it.

> 
> 
>>       }
>> +    else
>> +        mfn = maddr_to_mfn(maddr);
>>   
>> -    if ( !mfn_valid(maddr_to_mfn(maddr)) )
>> +    if ( !mfn_valid(mfn) )
>>       {
>>           dprintk(XENLOG_G_DEBUG, "%pv: Invalid MFN %#"PRI_mfn"\n",
>> -                v, mfn_x(maddr_to_mfn(maddr)));
>> +                v, mfn_x(mfn));
>>           return NULL;
>>       }
>>   
>> -    page = mfn_to_page(maddr_to_mfn(maddr));
>> +    page = mfn_to_page(mfn);
>>       ASSERT(page);
>>   
>>       if ( unlikely(!get_page(page, d)) )
>> -- 
>> 2.11.0
>>

Cheers,

-- 
Julien Grall


* Re: [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it
  2018-10-30 11:02     ` Julien Grall
@ 2018-11-02 22:45       ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-02 22:45 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Tue, 30 Oct 2018, Julien Grall wrote:
> Hi,
> 
> On 30/10/2018 00:21, Stefano Stabellini wrote:
> > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > The LPAE format allows storing information in an entry even with the
> > > valid bit unset. In a follow-up patch, we will take advantage of this
> > > feature to re-purpose the valid bit for generating a translation fault
> > > even if an entry contains valid information.
> > > 
> > > So we need a different way to know whether an entry contains valid
> > > information. It is possible to use the information held in the p2m_type
> > > for that purpose. Indeed, all entries containing valid
> > > information will have a valid p2m type (i.e. p2m_type != p2m_invalid).
> > > 
> > > This patch introduces a new helper, p2m_is_valid, which implements that
> > > idea, and replaces most lpae_is_valid calls with the new helper. The ones
> > > remaining are for TLB handling and entry accounting.
> > > 
> > > With the renaming there are 2 other changes required:
> > >      - Generate table entries with a valid p2m type
> > >      - Detect new mappings for proper stats accounting
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > ---
> > >   xen/arch/arm/p2m.c | 34 +++++++++++++++++++++++-----------
> > >   1 file changed, 23 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > index 6c76298ebc..2a1e7e9be2 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -220,17 +220,26 @@ static p2m_access_t p2m_mem_access_radix_get(struct
> > > p2m_domain *p2m, gfn_t gfn)
> > >   }
> > >     /*
> > > + * In the case of the P2M, the valid bit is used for other purpose. Use
> > > + * the type to check whether an entry is valid.
> > > + */
> > > +static inline bool p2m_is_valid(lpae_t pte)
> > > +{
> > > +    return pte.p2m.type != p2m_invalid;
> > > +}
> > > +
> > > +/*
> > >    * lpae_is_* helpers don't check whether the valid bit is set in the
> > >    * PTE. Provide our own overlay to check the valid bit.
> > >    */
> > >   static inline bool p2m_is_mapping(lpae_t pte, unsigned int level)
> > >   {
> > > -    return lpae_is_valid(pte) && lpae_is_mapping(pte, level);
> > > +    return p2m_is_valid(pte) && lpae_is_mapping(pte, level);
> > >   }
> > >     static inline bool p2m_is_superpage(lpae_t pte, unsigned int level)
> > >   {
> > > -    return lpae_is_valid(pte) && lpae_is_superpage(pte, level);
> > > +    return p2m_is_valid(pte) && lpae_is_superpage(pte, level);
> > >   }
> > >     #define GUEST_TABLE_MAP_FAILED 0
> > > @@ -264,7 +273,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool
> > > read_only,
> > >         entry = *table + offset;
> > >   -    if ( !lpae_is_valid(*entry) )
> > > +    if ( !p2m_is_valid(*entry) )
> > >       {
> > >           if ( read_only )
> > >               return GUEST_TABLE_MAP_FAILED;
> > > @@ -356,7 +365,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
> > >         entry = table[offsets[level]];
> > >   -    if ( lpae_is_valid(entry) )
> > > +    if ( p2m_is_valid(entry) )
> > >       {
> > >           *t = entry.p2m.type;
> > >   @@ -544,8 +553,11 @@ static lpae_t page_to_p2m_table(struct page_info
> > > *page)
> > >       /*
> > >        * The access value does not matter because the hardware will ignore
> > >        * the permission fields for table entry.
> > > +     *
> > > +     * We use p2m_ram_rw so the entry has a valid type. This is important
> > > +     * for p2m_is_valid() to return valid on table entries.
> > >        */
> > > -    return mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid,
> > > p2m_access_rwx);
> > > +    return mfn_to_p2m_entry(page_to_mfn(page), p2m_ram_rw,
> > > p2m_access_rwx);
> > >   }
> > >     static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool
> > > clean_pte)
> > > @@ -569,7 +581,7 @@ static int p2m_create_table(struct p2m_domain *p2m,
> > > lpae_t *entry)
> > >       struct page_info *page;
> > >       lpae_t *p;
> > >   -    ASSERT(!lpae_is_valid(*entry));
> > > +    ASSERT(!p2m_is_valid(*entry));
> > >         page = alloc_domheap_page(NULL, 0);
> > >       if ( page == NULL )
> > > @@ -626,7 +638,7 @@ static int p2m_mem_access_radix_set(struct p2m_domain
> > > *p2m, gfn_t gfn,
> > >    */
> > >   static void p2m_put_l3_page(const lpae_t pte)
> > >   {
> > > -    ASSERT(lpae_is_valid(pte));
> > > +    ASSERT(p2m_is_valid(pte));
> > >         /*
> > >        * TODO: Handle other p2m types
> > > @@ -654,11 +666,11 @@ static void p2m_free_entry(struct p2m_domain *p2m,
> > >       struct page_info *pg;
> > >         /* Nothing to do if the entry is invalid. */
> > > -    if ( !lpae_is_valid(entry) )
> > > +    if ( !p2m_is_valid(entry) )
> > >           return;
> > >         /* Nothing to do but updating the stats if the entry is a
> > > super-page. */
> > > -    if ( p2m_is_superpage(entry, level) )
> > > +    if ( level == 3 && entry.p2m.table )
> > 
> > Why?
> 
> Because p2m_is_superpage(...) already contains p2m_is_valid(), we would test
> the validity of the p2m twice.
> 
> But I guess this is not a big deal, so I can remove it.
> 
> > 
> > 
> > >       {
> > >           p2m->stats.mappings[level]--;
> > >           return;
> > > @@ -951,7 +963,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
> > >               else
> > >                   p2m->need_flush = true;
> > >           }
> > > -        else /* new mapping */
> > > +        else if ( !p2m_is_valid(orig_pte) ) /* new mapping */
> > 
> > There are a couple of lpae_is_valid checks just above this line that you
> > missed, why haven't you changed them?
> > 
> > If you have a good reason, please explain in a comment and/or commit
> > message.
> 
> This is already explained in the commit message:
> 
> "This patch introduces a new helper p2m_is_valid, which implements that
> idea, and replace most of lpae_is_valid call with the new helper. The ones
> remaining are for TLBs handling and entries accounting."
> 
> I believe that the code has enough existing comments to understand why
> lpae_is_valid(...) should be kept. You deal with hardware updates and hence you
> should use the valid bit in the LPAE table. This will tell you whether you
> need to flush the TLBs.

I checked and it is as you wrote.



* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-10-08 18:33 ` [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
@ 2018-11-02 23:27   ` Stefano Stabellini
  2018-11-05 11:55     ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-02 23:27 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> Currently a Stage-2 translation fault could happen:
>     1) MMIO emulation
>     2) When the page-tables is been updated using Break-Before-Make
                              ^ have

>     3) Page not mapped
> 
> A follow-up patch will re-purpose the valid bit in an entry to generate
> translation fault. This would be used to do an action on each entries to
                                                                ^entry

> track page used for a given period.
        ^pages


> 
> A new function is introduced to try to resolve a translation fault. This
> will include 2) and the new way to generate faults explained above.

I can see the code does what you describe, but I don't understand why we
are doing this. What is missing in the commit message is the explanation
of the relationship between the future goal of repurposing the valid bit
and the introduction of a function to handle Break-Before-Make stage2
faults. Does it fix an issue with Break-Before-Make that we currently
have? Or does it become needed due to the repurposing of valid? If so, why?


> To avoid invalidating all the page-table entries in one go, it is
> possible to invalidate the top-level table and then, on trap, invalidate
> the table one level down. This will be repeated until a block/page entry
> has been reached.
> 
> At the moment, no action is taken when reaching a block/page entry
> other than setting the valid bit to 1.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/p2m.c   | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c |   7 +--
>  2 files changed, 131 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index ec956bc151..af445d3313 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1043,6 +1043,133 @@ int p2m_set_entry(struct p2m_domain *p2m,
>      return rc;
>  }
>  
> +/* Invalidate all entries in the table. The p2m should be write locked. */
> +static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
> +{
> +    lpae_t *table;
> +    unsigned int i;
> +
> +    ASSERT(p2m_is_write_locked(p2m));
> +
> +    table = map_domain_page(mfn);
> +
> +    for ( i = 0; i < LPAE_ENTRIES; i++ )
> +    {
> +        lpae_t pte = table[i];
> +
> +        pte.p2m.valid = 0;
> +
> +        p2m_write_pte(&table[i], pte, p2m->clean_pte);
> +    }
> +
> +    unmap_domain_page(table);
> +
> +    p2m->need_flush = true;
> +}
> +
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    unsigned int level = 0;
> +    bool resolved = false;
> +    lpae_t entry, *table;
> +    paddr_t addr = gfn_to_gaddr(gfn);
> +
> +    /* Convenience aliases */
> +    const unsigned int offsets[4] = {
> +        zeroeth_table_offset(addr),
> +        first_table_offset(addr),
> +        second_table_offset(addr),
> +        third_table_offset(addr)
> +    };
> +
> +    p2m_write_lock(p2m);
> +
> +    /* This gfn is higher than the highest the p2m map currently holds */
> +    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
> +        goto out;
> +
> +    table = p2m_get_root_pointer(p2m, gfn);
> +    /*
> +     * The table should always be non-NULL because the gfn is below
> +     * p2m->max_mapped_gfn and the root table pages are always present.
> +     */
> +    BUG_ON(table == NULL);
> +
> +    /*
> +     * Go down the page-tables until an entry has the valid bit unset or
> +     * a block/page entry has been hit.
> +     */
> +    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
> +    {
> +        int rc;
> +
> +        entry = table[offsets[level]];
> +
> +        if ( level == 3 )
> +            break;
> +
> +        /* Stop as soon as we hit an entry with the valid bit unset. */
> +        if ( !lpae_is_valid(entry) )
> +            break;
> +
> +        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
> +        if ( rc == GUEST_TABLE_MAP_FAILED )
> +            goto out_unmap;
> +        else if ( rc != GUEST_TABLE_NORMAL_PAGE )

why not rc == GUEST_TABLE_SUPER_PAGE?

> +            break;
> +    }
> +
> +    /*
> +     * If the valid bit of the entry is set, it means someone was playing with
> +     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
> +     */
> +    if ( lpae_is_valid(entry) )
> +    {
> +        resolved = true;
> +        goto out_unmap;
> +    }
> +
> +    /*
> +     * The valid bit is unset. If the entry is still not valid then the fault
> +     * cannot be resolved, exit and report it.
> +     */
> +    if ( !p2m_is_valid(entry) )
> +        goto out_unmap;
> +
> +    /*
> +     * Now we have an entry with valid bit unset, but still valid from
> +     * the P2M point of view.
> +     *
> +     * For entry pointing to a table, the table will be invalidated.
              ^ entries


> +     * For entry pointing to a block/page, no work to do for now.
              ^ entries


> +     */
> +    if ( lpae_is_table(entry, level) )
> +        p2m_invalidate_table(p2m, lpae_get_mfn(entry));

Maybe because I haven't read the rest of the patches, it is not clear to
me why in the case of an entry pointing to a table we need to invalidate
it, and otherwise set valid to 1.


> +    /*
> +     * Now that the work on the entry is done, set the valid bit to prevent
> +     * another fault on that entry.
> +     */
> +    resolved = true;
> +    entry.p2m.valid = 1;
> +
> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
> +
> +    /*
> +     * No need to flush the TLBs as the modified entry had the valid bit
> +     * unset.
> +     */
> +
> +out_unmap:
> +    unmap_domain_page(table);
> +
> +out:
> +    p2m_write_unlock(p2m);
> +
> +    return resolved;
> +}
> +
>  static inline int p2m_insert_mapping(struct domain *d,
>                                       gfn_t start_gfn,
>                                       unsigned long nr,
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index b40798084d..169b57cb6b 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1882,6 +1882,8 @@ static bool try_map_mmio(gfn_t gfn)
>      return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
>  }
>  
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);

Should be in an header file?


>  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>                                         const union hsr hsr)
>  {
> @@ -1894,7 +1896,6 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>      vaddr_t gva;
>      paddr_t gpa;
>      uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
> -    mfn_t mfn;
>      bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>  
>      /*
> @@ -1977,8 +1978,8 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>           * with the Stage-2 page table. Walk the Stage-2 PT to check
>           * if the entry exists. If it's the case, return to the guest
>           */
> -        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
> -        if ( !mfn_eq(mfn, INVALID_MFN) )
> +        if ( p2m_resolve_translation_fault(current->domain,
> +                                           gaddr_to_gfn(gpa)) )
>              return;
>  
>          if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
> -- 
> 2.11.0
> 

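Condensing the function under discussion (locking and the root-pointer
lookup trimmed), the shape of the resolution logic is:

    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
    {
        entry = table[offsets[level]];
        if ( level == 3 || !lpae_is_valid(entry) )
            break;                      /* leaf reached or valid bit unset */
        /* ... otherwise descend one level via p2m_next_level() ... */
    }

    if ( lpae_is_valid(entry) )         /* someone re-validated it already */
        return true;
    if ( !p2m_is_valid(entry) )         /* genuinely empty: a real fault   */
        return false;

    if ( lpae_is_table(entry, level) )  /* push invalidation one level down */
        p2m_invalidate_table(p2m, lpae_get_mfn(entry));

    entry.p2m.valid = 1;                /* re-validate this level */
    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
    return true;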

* Re: [RFC 12/16] xen/arm: Rework p2m_cache_flush to take a range [begin, end)
  2018-10-08 18:33 ` [RFC 12/16] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
@ 2018-11-02 23:38   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-02 23:38 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> The function will be easier to re-use in a follow-up patch if it takes
> only the begin and the end.
> 
> At the same time, rename the function to reflect the change in the
> prototype.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  xen/arch/arm/domctl.c     | 2 +-
>  xen/arch/arm/p2m.c        | 3 +--
>  xen/include/asm-arm/p2m.h | 7 +++++--
>  3 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index 4587c75826..c10f568aad 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -61,7 +61,7 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>          if ( e < s )
>              return -EINVAL;
>  
> -        return p2m_cache_flush(d, _gfn(s), domctl->u.cacheflush.nr_pfns);
> +        return p2m_cache_flush_range(d, _gfn(s), _gfn(e));
>      }
>      case XEN_DOMCTL_bind_pt_irq:
>      {
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index af445d3313..8537b7bab1 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1507,10 +1507,9 @@ int relinquish_p2m_mapping(struct domain *d)
>      return rc;
>  }
>  
> -int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr)
> +int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -    gfn_t end = gfn_add(start, nr);
>      gfn_t next_gfn;
>      p2m_type_t t;
>      unsigned int order;
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index c03557544a..d7afa2bbe8 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -224,8 +224,11 @@ int p2m_set_entry(struct p2m_domain *p2m,
>                    p2m_type_t t,
>                    p2m_access_t a);
>  
> -/* Clean & invalidate caches corresponding to a region of guest address space */
> -int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);
> +/*
> + * Clean & invalidate caches corresponding to a region [start,end) of guest
> + * address space.
> + */
> +int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
>  
>  /*
>   * Map a region in the guest p2m with a specific p2m type.
> -- 
> 2.11.0
> 

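A minimal usage sketch of the new prototype; the wrapper below is
illustrative, not part of the patch:

    /* Adapt an old-style (start, nr) caller to the half-open [start, end)
     * interface by computing the exclusive end itself. */
    static int flush_nr(struct domain *d, gfn_t start, unsigned long nr)
    {
        return p2m_cache_flush_range(d, start, gfn_add(start, nr));
    }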

* Re: [RFC 13/16] xen/arm: p2m: Allow to flush cache on any RAM region
  2018-10-08 18:33 ` [RFC 13/16] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
@ 2018-11-02 23:40   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-02 23:40 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> Currently, we only allow to flush cache on region mapped as p2m_ram_{rw,ro}.
                                             ^ regions

> There are no real problem to flush cache on any RAM region such as grants
                    ^ problems in cache flushing any RAM regions


> and foreign mapping. Therefore, relax the cache to allow flushing the
                                            ^ check


> cache on any RAM region.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Aside from grammar:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>  xen/arch/arm/p2m.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 8537b7bab1..12b459924b 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1532,7 +1532,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>          next_gfn = gfn_next_boundary(start, order);
>  
>          /* Skip hole and non-RAM page */
> -        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_ram(t) )
> +        if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>              continue;
>  
>          /* XXX: Implement preemption */
> -- 
> 2.11.0
> 


* Re: [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-10-08 18:33 ` [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
@ 2018-11-02 23:44   ` Stefano Stabellini
  2018-11-05 11:56     ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-02 23:44 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> With the recent changes, a P2M entry may be populated but may not be
> valid. In some situations, it would be useful to know whether the entry
> has been marked available to the guest in order to perform a specific
> action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/mem_access.c |  6 +++---
>  xen/arch/arm/p2m.c        | 20 ++++++++++++++++----
>  xen/include/asm-arm/p2m.h |  3 ++-
>  3 files changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> index 9239bdf323..f434510b2a 100644
> --- a/xen/arch/arm/mem_access.c
> +++ b/xen/arch/arm/mem_access.c
> @@ -70,7 +70,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
>           * No setting was found in the Radix tree. Check if the
>           * entry exists in the page-tables.
>           */
> -        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL);
> +        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL, NULL);
>  
>          if ( mfn_eq(mfn, INVALID_MFN) )
>              return -ESRCH;
> @@ -199,7 +199,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>       * We had a mem_access permission limiting the access, but the page type
>       * could also be limiting, so we need to check that as well.
>       */
> -    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);
> +    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, NULL);
>      if ( mfn_eq(mfn, INVALID_MFN) )
>          goto err;
>  
> @@ -405,7 +405,7 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
>            gfn = gfn_next_boundary(gfn, order) )
>      {
>          p2m_type_t t;
> -        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order);
> +        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order, NULL);
>  
>  
>          if ( !mfn_eq(mfn, INVALID_MFN) )
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 12b459924b..df6b48d73b 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -306,10 +306,14 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>   *
>   * If the entry is not present, INVALID_MFN will be returned and the
>   * page_order will be set according to the order of the invalid range.
> + *
> + * valid will contain the value of bit[0] (e.g valid bit) of the
> + * entry.
>   */
>  mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>                      p2m_type_t *t, p2m_access_t *a,
> -                    unsigned int *page_order)
> +                    unsigned int *page_order,
> +                    bool *valid)
>  {
>      paddr_t addr = gfn_to_gaddr(gfn);
>      unsigned int level = 0;
> @@ -317,6 +321,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>      int rc;
>      mfn_t mfn = INVALID_MFN;
>      p2m_type_t _t;
> +    bool _valid;
>  
>      /* Convenience aliases */
>      const unsigned int offsets[4] = {
> @@ -334,6 +339,10 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>  
>      *t = p2m_invalid;
>  
> +    /* Allow valid to be NULL */
> +    valid = valid?: &_valid;
> +    *valid = false;

Why not a simple:

  if ( valid )
    *valid = false;

especially given that you do the same if ( valid ) check below.
In fact, it doesn't look like we need _valid?


>      /* XXX: Check if the mapping is lower than the mapped gfn */
>  
>      /* This gfn is higher than the highest the p2m map currently holds */
> @@ -379,6 +388,9 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>           * to the GFN.
>           */
>          mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
> +
> +        if ( valid )
> +            *valid = lpae_is_valid(entry);
>      }
>  
>  out_unmap:
> @@ -397,7 +409,7 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>  
>      p2m_read_lock(p2m);
> -    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL);
> +    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL, NULL);
>      p2m_read_unlock(p2m);
>  
>      return mfn;
> @@ -1464,7 +1476,7 @@ int relinquish_p2m_mapping(struct domain *d)
>      for ( ; gfn_x(start) < gfn_x(end);
>            start = gfn_next_boundary(start, order) )
>      {
> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
> +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>  
>          count++;
>          /*
> @@ -1527,7 +1539,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  
>      for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
>      {
> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
> +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>  
>          next_gfn = gfn_next_boundary(start, order);
>  
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index d7afa2bbe8..92213dd1ab 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -211,7 +211,8 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t);
>   */
>  mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>                      p2m_type_t *t, p2m_access_t *a,
> -                    unsigned int *page_order);
> +                    unsigned int *page_order,
> +                    bool *valid);
>  
>  /*
>   * Direct set a p2m entry: only for use by the P2M code.
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-11-02 23:27   ` Stefano Stabellini
@ 2018-11-05 11:55     ` Julien Grall
  2018-11-05 17:56       ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-05 11:55 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi,

On 02/11/2018 23:27, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> Currently a Stage-2 translation fault could happen:
>>      1) MMIO emulation
>>      2) When the page-tables is been updated using Break-Before-Make
>                                ^ have
> 
>>      3) Page not mapped
>>
>> A follow-up patch will re-purpose the valid bit in an entry to generate
>> translation fault. This would be used to do an action on each entries to
>                                                                  ^entry
> 
>> track page used for a given period.
>          ^pages
> 
> 
>>
>> A new function is introduced to try to resolve a translation fault. This
>> will include 2) and the new way to generate fault explained above.
> 
> I can see the code does what you describe, but I don't understand why we
> are doing this. What is missing in the commit message is the explanation
> of the relationship between the future goal of repurposing the valid bit
> and the introduction of a function to handle Break-Before-Make stage2
> faults. Does it fix an issue with Break-Before-Make that we currently
> have? Or does it become needed due to the repurposing of valid? If so, why?

This does not fix any issue with BBM. The valid bit adds a 4th reason for a 
translation fault. Both BBM and the valid bit will require walking the page-tables.

For the valid bit, we will need to walk the page-table in order to fix up the 
entry (i.e. set the valid bit). We also can't use p2m_lookup(...) as it only 
tells you the mapping exists; the valid bit may still not be set.

So we need to provide a new helper to walk the page-table and fix up an entry.
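
To illustrate the distinction (a rough sketch; the exact shape of
p2m_is_valid here is an assumption based on this series):

    /* Hardware view: bit[0] of the entry, the bit the MMU honours. */
    static inline bool lpae_is_valid(lpae_t pte)
    {
        return pte.walk.valid;
    }

    /* P2M view: the entry is in use, even if bit[0] has been cleared. */
    static inline bool p2m_is_valid(lpae_t pte)
    {
        return pte.p2m.type != p2m_invalid;
    }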

>> To avoid invalidating all the page-table entries in one go, it is
>> possible to invalidate the top-level table and then, on trap, invalidate
>> the table one level down. This will be repeated until a block/page entry
>> has been reached.
>>
>> At the moment, there is no action done when reaching a block/page entry
>> other than setting the valid bit to 1.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   xen/arch/arm/p2m.c   | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>   xen/arch/arm/traps.c |   7 +--
>>   2 files changed, 131 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index ec956bc151..af445d3313 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1043,6 +1043,133 @@ int p2m_set_entry(struct p2m_domain *p2m,
>>       return rc;
>>   }
>>   
>> +/* Invalidate all entries in the table. The p2m should be write locked. */
>> +static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
>> +{
>> +    lpae_t *table;
>> +    unsigned int i;
>> +
>> +    ASSERT(p2m_is_write_locked(p2m));
>> +
>> +    table = map_domain_page(mfn);
>> +
>> +    for ( i = 0; i < LPAE_ENTRIES; i++ )
>> +    {
>> +        lpae_t pte = table[i];
>> +
>> +        pte.p2m.valid = 0;
>> +
>> +        p2m_write_pte(&table[i], pte, p2m->clean_pte);
>> +    }
>> +
>> +    unmap_domain_page(table);
>> +
>> +    p2m->need_flush = true;
>> +}
>> +
>> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    unsigned int level = 0;
>> +    bool resolved = false;
>> +    lpae_t entry, *table;
>> +    paddr_t addr = gfn_to_gaddr(gfn);
>> +
>> +    /* Convenience aliases */
>> +    const unsigned int offsets[4] = {
>> +        zeroeth_table_offset(addr),
>> +        first_table_offset(addr),
>> +        second_table_offset(addr),
>> +        third_table_offset(addr)
>> +    };
>> +
>> +    p2m_write_lock(p2m);
>> +
>> +    /* This gfn is higher than the highest the p2m map currently holds */
>> +    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
>> +        goto out;
>> +
>> +    table = p2m_get_root_pointer(p2m, gfn);
>> +    /*
>> +     * The table should always be non-NULL because the gfn is below
>> +     * p2m->max_mapped_gfn and the root table pages are always present.
>> +     */
>> +    BUG_ON(table == NULL);
>> +
>> +    /*
>> +     * Go down the page-tables until an entry has the valid bit unset or
>> +     * a block/page entry has been hit.
>> +     */
>> +    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
>> +    {
>> +        int rc;
>> +
>> +        entry = table[offsets[level]];
>> +
>> +        if ( level == 3 )
>> +            break;
>> +
>> +        /* Stop as soon as we hit an entry with the valid bit unset. */
>> +        if ( !lpae_is_valid(entry) )
>> +            break;
>> +
>> +        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
>> +        if ( rc == GUEST_TABLE_MAP_FAILED )
>> +            goto out_unmap;
>> +        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
> 
> why not rc == GUEST_TABLE_SUPER_PAGE?

The logic has been taken from p2m_get_entry(). It makes sense to use != here as 
you only want to continue the loop if you are on a table. So it is clearer why 
you continue.
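
To spell out the three possible outcomes (names as in the code above):

    /*
     * rc == GUEST_TABLE_MAP_FAILED  -> bail out, the fault cannot be resolved
     * rc == GUEST_TABLE_SUPER_PAGE  -> reached a block mapping, stop walking
     * rc == GUEST_TABLE_NORMAL_PAGE -> the entry points to a table, go down
     *                                  one level
     */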

> 
>> +            break;
>> +    }
>> +
>> +    /*
>> +     * If the valid bit of the entry is set, it means someone was playing with
>> +     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
>> +     */
>> +    if ( lpae_is_valid(entry) )
>> +    {
>> +        resolved = true;
>> +        goto out_unmap;
>> +    }
>> +
>> +    /*
>> +     * The valid bit is unset. If the entry is still not valid then the fault
>> +     * cannot be resolved, exit and report it.
>> +     */
>> +    if ( !p2m_is_valid(entry) )
>> +        goto out_unmap;
>> +
>> +    /*
>> +     * Now we have an entry with valid bit unset, but still valid from
>> +     * the P2M point of view.
>> +     *
>> +     * For entry pointing to a table, the table will be invalidated.
>                ^ entries
> 
> 
>> +     * For entry pointing to a block/page, no work to do for now.
>                ^ entries

I am not entirely sure why it should be plural here. We are dealing with only 
one entry.

> 
> 
>> +     */
>> +    if ( lpae_is_table(entry, level) )
>> +        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
> 
> Maybe because I haven't read the rest of the patches, it is not clear to
> me why in the case of an entry pointing to a table we need to invalidate
> it, and otherwise set valid to 1.

This was written in the commit message:

"To avoid invalidating all the page-tables entries in one go. It is
possible to invalidate the top-level table and then on trap invalidate
the table one-level down. This will be repeated until a block/page entry
has been reached."

It is mostly to spread the cost of invalidating the page-tables. With this 
solution, you only need to clear the valid bit of the top-level entries to 
invalidate the full P2M.

On the first access, you will trap, set the valid bit of the first invalid 
entry, and invalidate the next level if necessary.

The access will then be retried. If trapped, the process is repeated until all 
the entries are valid.

It is possible to optimize it, avoiding intermediate traps where possible. But I 
would not bother looking at that for now. Indeed, this will be used to lower 
the cost of set/way cache maintenance emulation. Any guest using those 
operations already knows that a big cost will be incurred.
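
To illustrate, here is the sequence for one address on a 3-level P2M (a
sketch assuming P2M_ROOT_LEVEL is 1, not taken from the code):

    /*
     * invalidate:  clear the valid bit of the root (level 1) entries only
     * 1st access:  trap -> set the valid bit of the level-1 entry, clear
     *              the valid bit of every entry in the level-2 table
     * 2nd access:  trap -> set the valid bit of the level-2 entry, clear
     *              the valid bit of every entry in the level-3 table
     * 3rd access:  trap -> set the valid bit of the level-3 (page) entry
     * 4th access:  no trap, the translation now succeeds
     */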

> 
> 
>> +    /*
>> +     * Now that the work on the entry is done, set the valid bit to prevent
>> +     * another fault on that entry.
>> +     */
>> +    resolved = true;
>> +    entry.p2m.valid = 1;
>> +
>> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
>> +
>> +    /*
>> +     * No need to flush the TLBs as the modified entry had the valid bit
>> +     * unset.
>> +     */
>> +
>> +out_unmap:
>> +    unmap_domain_page(table);
>> +
>> +out:
>> +    p2m_write_unlock(p2m);
>> +
>> +    return resolved;
>> +}
>> +
>>   static inline int p2m_insert_mapping(struct domain *d,
>>                                        gfn_t start_gfn,
>>                                        unsigned long nr,
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index b40798084d..169b57cb6b 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -1882,6 +1882,8 @@ static bool try_map_mmio(gfn_t gfn)
>>       return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
>>   }
>>   
>> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
> 
> Should be in an header file?

Yes. I will move it.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-11-02 23:44   ` Stefano Stabellini
@ 2018-11-05 11:56     ` Julien Grall
  2018-11-05 17:31       ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-05 11:56 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 02/11/2018 23:44, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> With the recent changes, a P2M entry may be populated but may not be
>> valid. In some situations, it would be useful to know whether the entry
>> has been marked available to the guest in order to perform a specific
>> action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   xen/arch/arm/mem_access.c |  6 +++---
>>   xen/arch/arm/p2m.c        | 20 ++++++++++++++++----
>>   xen/include/asm-arm/p2m.h |  3 ++-
>>   3 files changed, 21 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
>> index 9239bdf323..f434510b2a 100644
>> --- a/xen/arch/arm/mem_access.c
>> +++ b/xen/arch/arm/mem_access.c
>> @@ -70,7 +70,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
>>            * No setting was found in the Radix tree. Check if the
>>            * entry exists in the page-tables.
>>            */
>> -        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL);
>> +        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL, NULL);
>>   
>>           if ( mfn_eq(mfn, INVALID_MFN) )
>>               return -ESRCH;
>> @@ -199,7 +199,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>>        * We had a mem_access permission limiting the access, but the page type
>>        * could also be limiting, so we need to check that as well.
>>        */
>> -    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);
>> +    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, NULL);
>>       if ( mfn_eq(mfn, INVALID_MFN) )
>>           goto err;
>>   
>> @@ -405,7 +405,7 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
>>             gfn = gfn_next_boundary(gfn, order) )
>>       {
>>           p2m_type_t t;
>> -        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order);
>> +        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order, NULL);
>>   
>>   
>>           if ( !mfn_eq(mfn, INVALID_MFN) )
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index 12b459924b..df6b48d73b 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -306,10 +306,14 @@ static int p2m_next_level(struct p2m_domain *p2m, bool read_only,
>>    *
>>    * If the entry is not present, INVALID_MFN will be returned and the
>>    * page_order will be set according to the order of the invalid range.
>> + *
>> + * valid will contain the value of bit[0] (e.g valid bit) of the
>> + * entry.
>>    */
>>   mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>                       p2m_type_t *t, p2m_access_t *a,
>> -                    unsigned int *page_order)
>> +                    unsigned int *page_order,
>> +                    bool *valid)
>>   {
>>       paddr_t addr = gfn_to_gaddr(gfn);
>>       unsigned int level = 0;
>> @@ -317,6 +321,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>       int rc;
>>       mfn_t mfn = INVALID_MFN;
>>       p2m_type_t _t;
>> +    bool _valid;
>>   
>>       /* Convenience aliases */
>>       const unsigned int offsets[4] = {
>> @@ -334,6 +339,10 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>   
>>       *t = p2m_invalid;
>>   
>> +    /* Allow valid to be NULL */
>> +    valid = valid?: &_valid;
>> +    *valid = false;
> 
> Why not a simple:
> 
>    if ( valid )
>      *valid = false;
> 
> especially given that you do the same if ( valid ) check below.
> In fact, it doesn't look like we need _valid?

I thought I dropped the if ( valid ) below. I would actually prefer to keep 
_valid and avoid using if ( ... ) everywhere.

This makes the code slightly easier to follow.
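
For reference, 'valid = valid ?: &_valid;' uses the GNU ?: extension and
is just shorthand (a standalone sketch, not code from the patch):

    static void example(bool *valid)    /* hypothetical context */
    {
        bool _valid;

        /* Equivalent to the GNU shorthand 'valid = valid ?: &_valid;'. */
        valid = valid ? valid : &_valid;

        /* From here on, *valid can be written without a NULL check. */
        *valid = false;
    }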

> 
> 
>>       /* XXX: Check if the mapping is lower than the mapped gfn */
>>   
>>       /* This gfn is higher than the highest the p2m map currently holds */
>> @@ -379,6 +388,9 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>            * to the GFN.
>>            */
>>           mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
>> +
>> +        if ( valid )
>> +            *valid = lpae_is_valid(entry);
>>       }
>>   
>>   out_unmap:
>> @@ -397,7 +409,7 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t)
>>       struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>   
>>       p2m_read_lock(p2m);
>> -    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL);
>> +    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL, NULL);
>>       p2m_read_unlock(p2m);
>>   
>>       return mfn;
>> @@ -1464,7 +1476,7 @@ int relinquish_p2m_mapping(struct domain *d)
>>       for ( ; gfn_x(start) < gfn_x(end);
>>             start = gfn_next_boundary(start, order) )
>>       {
>> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
>> +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>>   
>>           count++;
>>           /*
>> @@ -1527,7 +1539,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>>   
>>       for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
>>       {
>> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
>> +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
>>   
>>           next_gfn = gfn_next_boundary(start, order);
>>   
>> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
>> index d7afa2bbe8..92213dd1ab 100644
>> --- a/xen/include/asm-arm/p2m.h
>> +++ b/xen/include/asm-arm/p2m.h
>> @@ -211,7 +211,8 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t);
>>    */
>>   mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>                       p2m_type_t *t, p2m_access_t *a,
>> -                    unsigned int *page_order);
>> +                    unsigned int *page_order,
>> +                    bool *valid);
>>   
>>   /*
>>    * Direct set a p2m entry: only for use by the P2M code.
>> -- 
>> 2.11.0
>>

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-11-05 11:56     ` Julien Grall
@ 2018-11-05 17:31       ` Stefano Stabellini
  2018-11-05 17:32         ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-05 17:31 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Mon, 5 Nov 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 02/11/2018 23:44, Stefano Stabellini wrote:
> > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > With the recent changes, a P2M entry may be populated but may not be
> > > valid. In some situations, it would be useful to know whether the entry
> > > has been marked available to the guest in order to perform a specific
> > > action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > ---
> > >   xen/arch/arm/mem_access.c |  6 +++---
> > >   xen/arch/arm/p2m.c        | 20 ++++++++++++++++----
> > >   xen/include/asm-arm/p2m.h |  3 ++-
> > >   3 files changed, 21 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> > > index 9239bdf323..f434510b2a 100644
> > > --- a/xen/arch/arm/mem_access.c
> > > +++ b/xen/arch/arm/mem_access.c
> > > @@ -70,7 +70,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t
> > > gfn,
> > >            * No setting was found in the Radix tree. Check if the
> > >            * entry exists in the page-tables.
> > >            */
> > > -        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL);
> > > +        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL, NULL);
> > >             if ( mfn_eq(mfn, INVALID_MFN) )
> > >               return -ESRCH;
> > > @@ -199,7 +199,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva,
> > > unsigned long flag,
> > >        * We had a mem_access permission limiting the access, but the page
> > > type
> > >        * could also be limiting, so we need to check that as well.
> > >        */
> > > -    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);
> > > +    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, NULL);
> > >       if ( mfn_eq(mfn, INVALID_MFN) )
> > >           goto err;
> > >   @@ -405,7 +405,7 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn,
> > > uint32_t nr,
> > >             gfn = gfn_next_boundary(gfn, order) )
> > >       {
> > >           p2m_type_t t;
> > > -        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order);
> > > +        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order, NULL);
> > >               if ( !mfn_eq(mfn, INVALID_MFN) )
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > index 12b459924b..df6b48d73b 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -306,10 +306,14 @@ static int p2m_next_level(struct p2m_domain *p2m,
> > > bool read_only,
> > >    *
> > >    * If the entry is not present, INVALID_MFN will be returned and the
> > >    * page_order will be set according to the order of the invalid range.
> > > + *
> > > + * valid will contain the value of bit[0] (e.g valid bit) of the
> > > + * entry.
> > >    */
> > >   mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
> > >                       p2m_type_t *t, p2m_access_t *a,
> > > -                    unsigned int *page_order)
> > > +                    unsigned int *page_order,
> > > +                    bool *valid)
> > >   {
> > >       paddr_t addr = gfn_to_gaddr(gfn);
> > >       unsigned int level = 0;
> > > @@ -317,6 +321,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
> > >       int rc;
> > >       mfn_t mfn = INVALID_MFN;
> > >       p2m_type_t _t;
> > > +    bool _valid;
> > >         /* Convenience aliases */
> > >       const unsigned int offsets[4] = {
> > > @@ -334,6 +339,10 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t
> > > gfn,
> > >         *t = p2m_invalid;
> > >   +    /* Allow valid to be NULL */
> > > +    valid = valid?: &_valid;
> > > +    *valid = false;
> > 
> > Why not a simple:
> > 
> >    if ( valid )
> >      *valid = false;
> > 
> > especially given that you do the same if ( valid ) check below.
> > In fact, it doesn't look like we need _valid?
> 
> I thought I dropped the if ( valid ) below. I would actually prefer to keep
> _valid and avoid using if ( ... ) everywhere.
> 
> This makes the code slightly easier to follow.

_valid is a good trick, but I don't think it is worth doing in this
change: it is easy to follow anyway. Up to you, I am fine either way.


> > 
> > 
> > >       /* XXX: Check if the mapping is lower than the mapped gfn */
> > >         /* This gfn is higher than the highest the p2m map currently holds
> > > */
> > > @@ -379,6 +388,9 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
> > >            * to the GFN.
> > >            */
> > >           mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) -
> > > 1));
> > > +
> > > +        if ( valid )
> > > +            *valid = lpae_is_valid(entry);
> > >       }
> > >     out_unmap:
> > > @@ -397,7 +409,7 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn,
> > > p2m_type_t *t)
> > >       struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > >         p2m_read_lock(p2m);
> > > -    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL);
> > > +    mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL, NULL);
> > >       p2m_read_unlock(p2m);
> > >         return mfn;
> > > @@ -1464,7 +1476,7 @@ int relinquish_p2m_mapping(struct domain *d)
> > >       for ( ; gfn_x(start) < gfn_x(end);
> > >             start = gfn_next_boundary(start, order) )
> > >       {
> > > -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
> > > +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> > >             count++;
> > >           /*
> > > @@ -1527,7 +1539,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t
> > > start, gfn_t end)
> > >         for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
> > >       {
> > > -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order);
> > > +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> > >             next_gfn = gfn_next_boundary(start, order);
> > >   diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> > > index d7afa2bbe8..92213dd1ab 100644
> > > --- a/xen/include/asm-arm/p2m.h
> > > +++ b/xen/include/asm-arm/p2m.h
> > > @@ -211,7 +211,8 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn,
> > > p2m_type_t *t);
> > >    */
> > >   mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
> > >                       p2m_type_t *t, p2m_access_t *a,
> > > -                    unsigned int *page_order);
> > > +                    unsigned int *page_order,
> > > +                    bool *valid);
> > >     /*
> > >    * Direct set a p2m entry: only for use by the P2M code.
> > > -- 
> > > 2.11.0
> > > 
> 
> Cheers,
> 
> -- 
> Julien Grall
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit)
  2018-11-05 17:31       ` Stefano Stabellini
@ 2018-11-05 17:32         ` Julien Grall
  0 siblings, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-11-05 17:32 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 05/11/2018 17:31, Stefano Stabellini wrote:
> On Mon, 5 Nov 2018, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 02/11/2018 23:44, Stefano Stabellini wrote:
>>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>> With the recent changes, a P2M entry may be populated but may not be
>>>> valid. In some situations, it would be useful to know whether the entry
>>>> has been marked available to the guest in order to perform a specific
>>>> action. So extend p2m_get_entry to return the value of bit[0] (valid bit).
>>>>
>>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>> ---
>>>>    xen/arch/arm/mem_access.c |  6 +++---
>>>>    xen/arch/arm/p2m.c        | 20 ++++++++++++++++----
>>>>    xen/include/asm-arm/p2m.h |  3 ++-
>>>>    3 files changed, 21 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
>>>> index 9239bdf323..f434510b2a 100644
>>>> --- a/xen/arch/arm/mem_access.c
>>>> +++ b/xen/arch/arm/mem_access.c
>>>> @@ -70,7 +70,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t
>>>> gfn,
>>>>             * No setting was found in the Radix tree. Check if the
>>>>             * entry exists in the page-tables.
>>>>             */
>>>> -        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL);
>>>> +        mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, NULL, NULL);
>>>>              if ( mfn_eq(mfn, INVALID_MFN) )
>>>>                return -ESRCH;
>>>> @@ -199,7 +199,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva,
>>>> unsigned long flag,
>>>>         * We had a mem_access permission limiting the access, but the page
>>>> type
>>>>         * could also be limiting, so we need to check that as well.
>>>>         */
>>>> -    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL);
>>>> +    mfn = p2m_get_entry(p2m, gfn, &t, NULL, NULL, NULL);
>>>>        if ( mfn_eq(mfn, INVALID_MFN) )
>>>>            goto err;
>>>>    @@ -405,7 +405,7 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn,
>>>> uint32_t nr,
>>>>              gfn = gfn_next_boundary(gfn, order) )
>>>>        {
>>>>            p2m_type_t t;
>>>> -        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order);
>>>> +        mfn_t mfn = p2m_get_entry(p2m, gfn, &t, NULL, &order, NULL);
>>>>                if ( !mfn_eq(mfn, INVALID_MFN) )
>>>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>>>> index 12b459924b..df6b48d73b 100644
>>>> --- a/xen/arch/arm/p2m.c
>>>> +++ b/xen/arch/arm/p2m.c
>>>> @@ -306,10 +306,14 @@ static int p2m_next_level(struct p2m_domain *p2m,
>>>> bool read_only,
>>>>     *
>>>>     * If the entry is not present, INVALID_MFN will be returned and the
>>>>     * page_order will be set according to the order of the invalid range.
>>>> + *
>>>> + * valid will contain the value of bit[0] (e.g valid bit) of the
>>>> + * entry.
>>>>     */
>>>>    mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>>>                        p2m_type_t *t, p2m_access_t *a,
>>>> -                    unsigned int *page_order)
>>>> +                    unsigned int *page_order,
>>>> +                    bool *valid)
>>>>    {
>>>>        paddr_t addr = gfn_to_gaddr(gfn);
>>>>        unsigned int level = 0;
>>>> @@ -317,6 +321,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>>>>        int rc;
>>>>        mfn_t mfn = INVALID_MFN;
>>>>        p2m_type_t _t;
>>>> +    bool _valid;
>>>>          /* Convenience aliases */
>>>>        const unsigned int offsets[4] = {
>>>> @@ -334,6 +339,10 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t
>>>> gfn,
>>>>          *t = p2m_invalid;
>>>>    +    /* Allow valid to be NULL */
>>>> +    valid = valid?: &_valid;
>>>> +    *valid = false;
>>>
>>> Why not a simple:
>>>
>>>     if ( valid )
>>>       *valid = false;
>>>
>>> especially given that you do the same if ( valid ) check below.
>>> In fact, it doesn't look like we need _valid?
>>
>> I thought I dropped the if ( valid ) below. I would actually prefer to keep
>> _valid and avoid using if ( ... ) everywhere.
>>
>> This makes the code slightly easier to follow.
> 
> _valid is a good trick, but I don't think it is worth doing in this
> change: it is easy to follow anyway. Up to you, I am fine either way.

I will use if ( valid ) in the next version.
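
i.e. the prologue would then look something like:

    /* Allow valid to be NULL */
    if ( valid )
        *valid = false;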

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-11-05 11:55     ` Julien Grall
@ 2018-11-05 17:56       ` Stefano Stabellini
  2018-11-06 14:20         ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-05 17:56 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Mon, 5 Nov 2018, Julien Grall wrote:
> On 02/11/2018 23:27, Stefano Stabellini wrote:
> > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > Currently a Stage-2 translation fault could happen:
> > >      1) MMIO emulation
> > >      2) When the page-tables is been updated using Break-Before-Make
> >                                ^ have
> > 
> > >      3) Page not mapped
> > > 
> > > A follow-up patch will re-purpose the valid bit in an entry to generate
> > > translation fault. This would be used to do an action on each entries to
> >                                                                  ^entry
> > 
> > > track page used for a given period.
> >          ^pages
> > 
> > 
> > > 
> > > A new function is introduced to try to resolve a translation fault. This
> > > will include 2) and the new way to generate fault explained above.
> > 
> > I can see the code does what you describe, but I don't understand why we
> > are doing this. What is missing in the commit message is the explanation
> > of the relationship between the future goal of repurposing the valid bit
> > and the introduction of a function to handle Break-Before-Make stage2
> > faults. Does it fix an issue with Break-Before-Make that we currently
> > have? Or does it become needed due to the repurposing of valid? If so, why?
> 
> This does not fix any issue with BBM. The valid bit adds a 4th reason for a
> translation fault. Both BBM and the valid bit will require walking the
> page-tables.
> 
> For the valid bit, we will need to walk the page-table in order to fix up the
> entry (i.e. set the valid bit). We also can't use p2m_lookup(...) as it only
> tells you the mapping exists; the valid bit may still not be set.
> 
> So we need to provide a new helper to walk the page-table and fix up an entry.

OK. Please expand the commit message a bit.


> > > To avoid invalidating all the page-table entries in one go, it is
> > > possible to invalidate the top-level table and then, on trap, invalidate
> > > the table one level down. This will be repeated until a block/page entry
> > > has been reached.
> > > 
> > > At the moment, there is no action done when reaching a block/page entry
> > > other than setting the valid bit to 1.
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > ---
> > >   xen/arch/arm/p2m.c   | 127
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >   xen/arch/arm/traps.c |   7 +--
> > >   2 files changed, 131 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > > index ec956bc151..af445d3313 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -1043,6 +1043,133 @@ int p2m_set_entry(struct p2m_domain *p2m,
> > >       return rc;
> > >   }
> > >   +/* Invalidate all entries in the table. The p2m should be write locked.
> > > */
> > > +static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
> > > +{
> > > +    lpae_t *table;
> > > +    unsigned int i;
> > > +
> > > +    ASSERT(p2m_is_write_locked(p2m));
> > > +
> > > +    table = map_domain_page(mfn);
> > > +
> > > +    for ( i = 0; i < LPAE_ENTRIES; i++ )
> > > +    {
> > > +        lpae_t pte = table[i];
> > > +
> > > +        pte.p2m.valid = 0;
> > > +
> > > +        p2m_write_pte(&table[i], pte, p2m->clean_pte);
> > > +    }
> > > +
> > > +    unmap_domain_page(table);
> > > +
> > > +    p2m->need_flush = true;
> > > +}
> > > +
> > > +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
> > > +{
> > > +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > > +    unsigned int level = 0;
> > > +    bool resolved = false;
> > > +    lpae_t entry, *table;
> > > +    paddr_t addr = gfn_to_gaddr(gfn);
> > > +
> > > +    /* Convenience aliases */
> > > +    const unsigned int offsets[4] = {
> > > +        zeroeth_table_offset(addr),
> > > +        first_table_offset(addr),
> > > +        second_table_offset(addr),
> > > +        third_table_offset(addr)
> > > +    };
> > > +
> > > +    p2m_write_lock(p2m);
> > > +
> > > +    /* This gfn is higher than the highest the p2m map currently holds */
> > > +    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
> > > +        goto out;
> > > +
> > > +    table = p2m_get_root_pointer(p2m, gfn);
> > > +    /*
> > > +     * The table should always be non-NULL because the gfn is below
> > > +     * p2m->max_mapped_gfn and the root table pages are always present.
> > > +     */
> > > +    BUG_ON(table == NULL);
> > > +
> > > +    /*
> > > +     * Go down the page-tables until an entry has the valid bit unset or
> > > +     * a block/page entry has been hit.
> > > +     */
> > > +    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
> > > +    {
> > > +        int rc;
> > > +
> > > +        entry = table[offsets[level]];
> > > +
> > > +        if ( level == 3 )
> > > +            break;
> > > +
> > > +        /* Stop as soon as we hit an entry with the valid bit unset. */
> > > +        if ( !lpae_is_valid(entry) )
> > > +            break;
> > > +
> > > +        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
> > > +        if ( rc == GUEST_TABLE_MAP_FAILED )
> > > +            goto out_unmap;
> > > +        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
> > 
> > why not rc == GUEST_TABLE_SUPER_PAGE?
> 
> The logic has been taken from p2m_get_entry(). It makes sense to use != here
> as you only want to continue the loop if you are on a table. So it is clearer
> why you continue.

OK


> > 
> > > +            break;
> > > +    }
> > > +
> > > +    /*
> > > +     * If the valid bit of the entry is set, it means someone was playing
> > > with
> > > +     * the Stage-2 page table. Nothing to do and mark the fault as
> > > resolved.
> > > +     */
> > > +    if ( lpae_is_valid(entry) )
> > > +    {
> > > +        resolved = true;
> > > +        goto out_unmap;
> > > +    }
> > > +
> > > +    /*
> > > +     * The valid bit is unset. If the entry is still not valid then the
> > > fault
> > > +     * cannot be resolved, exit and report it.
> > > +     */
> > > +    if ( !p2m_is_valid(entry) )
> > > +        goto out_unmap;
> > > +
> > > +    /*
> > > +     * Now we have an entry with valid bit unset, but still valid from
> > > +     * the P2M point of view.
> > > +     *
> > > +     * For entry pointing to a table, the table will be invalidated.
> >                ^ entries
> > 
> > 
> > > +     * For entry pointing to a block/page, no work to do for now.
> >                ^ entries
> 
> I am not entirely sure why it should be plural here. We are dealing with only
> one entry.

I was trying to make the grammar work as a generic sentence. To make it
singular we would have to remove "For":

  If an entry is pointing to a table, the table will be invalidated.
  If an entry is pointing to a block/page, no work to do for now.


> > 
> > > +     */
> > > +    if ( lpae_is_table(entry, level) )
> > > +        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
> > 
> > Maybe because I haven't read the rest of the patches, it is not clear to
> > me why in the case of an entry pointing to a table we need to invalidate
> > it, and otherwise set valid to 1.
> 
> This was written in the commit message:
> 
> "To avoid invalidating all the page-tables entries in one go. It is
> possible to invalidate the top-level table and then on trap invalidate
> the table one-level down. This will be repeated until a block/page entry
> has been reached."
> 
> It is mostly to spread the cost of invalidating the page-tables. With this
> solution, you only need to clear the valid bit of the top-level entries to
> invalidate the full P2M.
> 
> On the first access, you will trap, set the valid bit of the first invalid
> entry, and invalidate the next level if necessary.
> 
> The access will then be retried. If trapped, the process is repeated until all
> the entries are valid.
> 
> It is possible to optimize it, avoiding intermediate traps where possible. But
> I would not bother looking at that for now. Indeed, this will be used to lower
> the cost of set/way cache maintenance emulation. Any guest using those
> operations already knows that a big cost will be incurred.

So instead of walking the page table in Xen, finding all the leaf
(level==3) entries that we need to set !valid, we just set !valid on one of
the higher-level entries. On access, we'll trap in Xen, then set the
higher level entry back to valid but the direct children to !valid. And
we'll cycle again through this until the table entries are valid and the
leaf entry is the only invalid one: at that point we'll only set it to
valid and the whole translation for that address is valid again.

Very inefficient, but very simple to implement in Xen. A good way to
penalize guests that are using instructions they should not be using :-)
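
Back-of-the-envelope, with a 4KB granule (512 entries per table), the
trade-off looks like:

    /*
     * eager: clear the valid bit of every leaf entry up front
     *        (up to 512 * 512 level-3 entries behind each level-1 entry)
     * lazy:  clear the valid bit of the root entries only; the rest is
     *        paid one table per trap, on first access
     */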

All right, please expand on the explanation in the commit message. It is
also worthy of an in-code comment on top of
p2m_resolve_translation_fault.

One more comment below.


> > 
> > > +    /*
> > > +     * Now that the work on the entry is done, set the valid bit to
> > > prevent
> > > +     * another fault on that entry.
> > > +     */
> > > +    resolved = true;
> > > +    entry.p2m.valid = 1;
> > > +
> > > +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
> > > +
> > > +    /*
> > > +     * No need to flush the TLBs as the modified entry had the valid bit
> > > +     * unset.
> > > +     */
> > > +
> > > +out_unmap:
> > > +    unmap_domain_page(table);
> > > +
> > > +out:
> > > +    p2m_write_unlock(p2m);
> > > +
> > > +    return resolved;
> > > +}
> > > +
> > >   static inline int p2m_insert_mapping(struct domain *d,
> > >                                        gfn_t start_gfn,
> > >                                        unsigned long nr,


We probably want to update the comment on top of the call to
p2m_resolve_translation_fault:


> @@ -1977,8 +1978,8 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>           * with the Stage-2 page table. Walk the Stage-2 PT to check
>           * if the entry exists. If it's the case, return to the guest
>           */
> -        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
> -        if ( !mfn_eq(mfn, INVALID_MFN) )
> +        if ( p2m_resolve_translation_fault(current->domain,
> +                                           gaddr_to_gfn(gpa)) )
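
Maybe something along these lines (wording only a suggestion):

         /*
          * First check whether the translation fault can be resolved by
          * the P2M subsystem. If that is the case, return to the guest to
          * retry the access.
          */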


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-10-08 18:33 ` [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
@ 2018-11-05 19:47   ` Stefano Stabellini
  2018-11-05 23:21     ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-05 19:47 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> A follow-up patch will require emulating some accesses to some
> co-processor registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes
> to the virtual memory control registers will be trapped to the hypervisor.
> 
> This patch adds the infrastructure to pass through the accesses to the host
> registers. For convenience, a set of macros has been added to
> generate the different helpers.
> 
> Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/vcpreg.c        | 144 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/cpregs.h |   1 +
>  2 files changed, 145 insertions(+)
> 
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index b04d996fd3..49529b97cd 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -24,6 +24,122 @@
>  #include <asm/traps.h>
>  #include <asm/vtimer.h>
>  
> +/*
> + * Macros to help generating helpers for registers trapped when
> + * HCR_EL2.TVM is set.
> + *
> + * Note that it only traps NS write access from EL1.
> + *
> + *  - TVM_REG() should not be used outside of the macros. It is there to
> + *    help defining TVM_REG32() and TVM_REG64()
> + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
> + *    resp. generate helper accessing 32-bit and 64-bit register. "regname"
> + *    been the Arm32 name and "xreg" the Arm64 name.
         ^ is

Please add that we use the Arm64 reg name to call WRITE_SYSREG in the
Xen source code even on Arm32 in general


> + *  - UPDATE_REG32_COMBINED(lowreg, hireg, xreg) are used to generate a

TVM_REG32_COMBINED


> + *  pair of registers share the same Arm32 registers. "lowreg" and
> + *  "higreg" been resp. the Arm32 name and "xreg" the Arm64 name. "lowreg"
> + *  will use xreg[31:0] and "hireg" will use xreg[63:32].

Please add that xreg is unused in the Arm32 case.
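
For reference, TVM_REG32(SCTLR, SCTLR_EL1) expands to roughly (modulo
whitespace):

    static bool vreg_emulate_SCTLR(struct cpu_user_regs *regs, uint32_t *r,
                                   bool read)
    {
        GUEST_BUG_ON(read);
        WRITE_SYSREG32(*r, SCTLR_EL1);

        return true;
    }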


> + */
> +
> +/* The name is passed from the upper macro to workaround macro expansion. */
> +#define TVM_REG(sz, func, reg...)                                           \
> +static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
> +{                                                                           \
> +    GUEST_BUG_ON(read);                                                     \
> +    WRITE_SYSREG##sz(*r, reg);                                              \
> +                                                                            \
> +    return true;                                                            \
> +}
> +
> +#define TVM_REG32(regname, xreg) TVM_REG(32, vreg_emulate_##regname, xreg)
> +#define TVM_REG64(regname, xreg) TVM_REG(64, vreg_emulate_##regname, xreg)
> +
> +#ifdef CONFIG_ARM_32
> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
> +    /* Use TVM_REG directly to workaround macro expansion. */       \
> +    TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
> +    TVM_REG(32, vreg_emulate_##hireg, hireg)
> +
> +#else /* CONFIG_ARM_64 */
> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                             \
> +static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
> +                                bool read, bool hi)                         \
> +{                                                                           \
> +    register_t reg = READ_SYSREG(xreg);                                     \
> +                                                                            \
> +    GUEST_BUG_ON(read);                                                     \
> +    if ( hi ) /* reg[63:32] is AArch32 register hireg */                    \
> +    {                                                                       \
> +        reg &= GENMASK(31, 0);                                              \

Move GENMASK before the if? It's the same regardless


> +        reg |= ((uint64_t)*r) << 32;                                        \
> +    }                                                                       \
> +    else /* reg[31:0] is AArch32 register lowreg. */                        \
> +    {                                                                       \
> +        reg &= GENMASK(31, 0);                                              \
> +        reg |= *r;                                                          \
> +    }                                                                       \
> +    WRITE_SYSREG(reg, xreg);                                                \
> +                                                                            \
> +    return true;                                                            \
> +}                                                                           \
> +                                                                            \
> +static bool vreg_emulate_##lowreg(struct cpu_user_regs *regs, uint32_t *r,  \
> +                                  bool read)                                \
> +{                                                                           \
> +    return vreg_emulate_##xreg(regs, r, read, false);                       \
> +}                                                                           \
> +                                                                            \
> +static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
> +                                 bool read)                                 \
> +{                                                                           \
> +    return vreg_emulate_##xreg(regs, r, read, true);                        \
> +}
> +#endif
> +
> +/* Defining helpers for emulating co-processor registers. */
> +TVM_REG32(SCTLR, SCTLR_EL1)
> +/*
> + * AArch32 provides two way to access TTBR* depending on the access
> + * size, whilst AArch64 provides one way.
> + *
> + * When using AArch32, for simplicity, use the same access size as the
> + * guest.
> + */
> +#ifdef CONFIG_ARM_32
> +TVM_REG32(TTBR0_32, TTBR0_32)
> +TVM_REG32(TTBR1_32, TTBR1_32)
> +#else
> +TVM_REG32(TTBR0_32, TTBR0_EL1)
> +TVM_REG32(TTBR1_32, TTBR1_EL1)
> +#endif
> +TVM_REG64(TTBR0, TTBR0_EL1)
> +TVM_REG64(TTBR1, TTBR1_EL1)
> +/* AArch32 registers TTBCR and TTBCR2 share AArch64 register TCR_EL1. */
> +TVM_REG32_COMBINED(TTBCR, TTBCR2, TCR_EL1)
> +TVM_REG32(DACR, DACR32_EL2)
> +TVM_REG32(DFSR, ESR_EL1)
> +TVM_REG32(IFSR, IFSR32_EL2)
> +/* AArch32 registers DFAR and IFAR shares AArch64 register FAR_EL1. */
> +TVM_REG32_COMBINED(DFAR, IFAR, FAR_EL1)
> +TVM_REG32(ADFSR, AFSR0_EL1)
> +TVM_REG32(AIFSR, AFSR1_EL1)
> +/* AArch32 registers MAIR0 and MAIR1 share AArch64 register MAIR_EL1. */
> +TVM_REG32_COMBINED(MAIR0, MAIR1, MAIR_EL1)
> +/* AArch32 registers AMAIR0 and AMAIR1 share AArch64 register AMAIR_EL1. */
> +TVM_REG32_COMBINED(AMAIR0, AMAIR1, AMAIR_EL1)
> +TVM_REG32(CONTEXTIDR, CONTEXTIDR_EL1)
> +
> +/* Macro to generate easily case for co-processor emulation. */
> +#define GENERATE_CASE(reg, sz)                                      \
> +    case HSR_CPREG##sz(reg):                                        \
> +    {                                                               \
> +        bool res;                                                   \
> +                                                                    \
> +        res = vreg_emulate_cp##sz(regs, hsr, vreg_emulate_##reg);   \
> +        ASSERT(res);                                                \
> +        break;                                                      \
> +    }
> +
>  void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>  {
>      const struct hsr_cp32 cp32 = hsr.cp32;
> @@ -64,6 +180,31 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>          break;
>  
>      /*
> +     * HCR_EL2.TVM
> +     *
> +     * ARMv8 (DDI 0487B.b): Table D1-37

In 0487D.a it is D1-99


> +     */
> +    GENERATE_CASE(SCTLR, 32)
> +    GENERATE_CASE(TTBR0_32, 32)
> +    GENERATE_CASE(TTBR1_32, 32)
> +    GENERATE_CASE(TTBCR, 32)
> +    GENERATE_CASE(TTBCR2, 32)
> +    GENERATE_CASE(DACR, 32)
> +    GENERATE_CASE(DFSR, 32)
> +    GENERATE_CASE(IFSR, 32)
> +    GENERATE_CASE(DFAR, 32)
> +    GENERATE_CASE(IFAR, 32)
> +    GENERATE_CASE(ADFSR, 32)
> +    GENERATE_CASE(AIFSR, 32)
> +    /* AKA PRRR */
> +    GENERATE_CASE(MAIR0, 32)
> +    /* AKA NMRR */
> +    GENERATE_CASE(MAIR1, 32)
> +    GENERATE_CASE(AMAIR0, 32)
> +    GENERATE_CASE(AMAIR1, 32)
> +    GENERATE_CASE(CONTEXTIDR, 32)
> +
> +    /*
>       * MDCR_EL2.TPM
>       *
>       * ARMv7 (DDI 0406C.b): B1.14.17
> @@ -192,6 +333,9 @@ void do_cp15_64(struct cpu_user_regs *regs, const union hsr hsr)
>              return inject_undef_exception(regs, hsr);
>          break;
>  
> +    GENERATE_CASE(TTBR0, 64)
> +    GENERATE_CASE(TTBR1, 64)
> +
>      /*
>       * CPTR_EL2.T{0..9,12..13}
>       *
> diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
> index 07e5791983..f1cbac5e5d 100644
> --- a/xen/include/asm-arm/cpregs.h
> +++ b/xen/include/asm-arm/cpregs.h
> @@ -142,6 +142,7 @@
>  
>  /* CP15 CR2: Translation Table Base and Control Registers */
>  #define TTBCR           p15,0,c2,c0,2   /* Translation Table Base Control Register */
> +#define TTBCR2          p15,0,c2,c0,3   /* Translation Table Base Control Register 2 */
>  #define TTBR0           p15,0,c2        /* Translation Table Base Reg. 0 */
>  #define TTBR1           p15,1,c2        /* Translation Table Base Reg. 1 */
>  #define HTTBR           p15,4,c2        /* Hyp. Translation Table Base Register */
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [RFC 11/16] xen/arm: vsysreg: Add wrapper to handle sysreg access trapped by HCR_EL2.TVM
  2018-10-08 18:33 ` [RFC 11/16] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
@ 2018-11-05 20:42   ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-05 20:42 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> A follow-up patch will require emulating some accesses to system
> registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes to the
> virtual memory control registers will be trapped to the hypervisor.
> 
> This patch adds the infrastructure to pass through the accesses to the host
> registers.
> 
> Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/arm64/vsysreg.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 57 insertions(+)
> 
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 6e60824572..1517879697 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -23,6 +23,46 @@
>  #include <asm/traps.h>
>  #include <asm/vtimer.h>
>  
> +/*
> + * Macro to help generating helpers for registers trapped when
> + * HCR_EL2.TVM is set.
> + *
> + * Note that it only traps NS write access from EL1.
> + */
> +#define TVM_REG(reg)                                                \
> +static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
> +                               uint64_t *r, bool read)              \
> +{                                                                   \
> +    GUEST_BUG_ON(read);                                             \
> +    WRITE_SYSREG64(*r, reg);                                        \
> +                                                                    \
> +    return true;                                                    \
> +}
> +
> +/* Defining helpers for emulating sysreg registers. */
> +TVM_REG(SCTLR_EL1)
> +TVM_REG(TTBR0_EL1)
> +TVM_REG(TTBR1_EL1)
> +TVM_REG(TCR_EL1)
> +TVM_REG(ESR_EL1)
> +TVM_REG(FAR_EL1)
> +TVM_REG(AFSR0_EL1)
> +TVM_REG(AFSR1_EL1)
> +TVM_REG(MAIR_EL1)
> +TVM_REG(AMAIR_EL1)
> +TVM_REG(CONTEXTIDR_EL1)
> +
> +/* Macro to easily generate cases for co-processor emulation */
> +#define GENERATE_CASE(reg)                                              \
> +    case HSR_SYSREG_##reg:                                              \
> +    {                                                                   \
> +        bool res;                                                       \
> +                                                                        \
> +        res = vreg_emulate_sysreg64(regs, hsr, vreg_emulate_##reg);     \
> +        ASSERT(res);                                                    \
> +        break;                                                          \
> +    }
> +
>  void do_sysreg(struct cpu_user_regs *regs,
>                 const union hsr hsr)
>  {
> @@ -44,6 +84,23 @@ void do_sysreg(struct cpu_user_regs *regs,
>          break;
>  
>      /*
> +     * HCR_EL2.TVM
> +     *
> +     * ARMv8 (DDI 0487B.b): Table D1-37

You might want to provide a more up-to-date reference.
In any case:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> +     */
> +    GENERATE_CASE(SCTLR_EL1)
> +    GENERATE_CASE(TTBR0_EL1)
> +    GENERATE_CASE(TTBR1_EL1)
> +    GENERATE_CASE(TCR_EL1)
> +    GENERATE_CASE(ESR_EL1)
> +    GENERATE_CASE(FAR_EL1)
> +    GENERATE_CASE(AFSR0_EL1)
> +    GENERATE_CASE(AFSR1_EL1)
> +    GENERATE_CASE(MAIR_EL1)
> +    GENERATE_CASE(AMAIR_EL1)
> +    GENERATE_CASE(CONTEXTIDR_EL1)
> +
> +    /*
>       * MDCR_EL2.TDRA
>       *
>       * ARMv8 (DDI 0487A.d): D1-1508 Table D1-57
> -- 
> 2.11.0
> 


* Re: [RFC 15/16] xen/arm: Implement Set/Way operations
  2018-10-08 18:33 ` [RFC 15/16] xen/arm: Implement Set/Way operations Julien Grall
@ 2018-11-05 21:10   ` Stefano Stabellini
  2018-11-05 23:38     ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-05 21:10 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, andre.przywara, xen-devel

On Mon, 8 Oct 2018, Julien Grall wrote:
> Set/Way operations are used to perform maintenance on a given cache.
> At the moment, Set/Way operations are not trapped and therefore a guest
> OS will directly act on the local cache. However, a vCPU may migrate to
> another pCPU in the middle of the process. This will result in caches
> with stale data (Set/Way operations are not propagated), potentially
> causing a crash. This may be the cause of the heisenbugs noticed in
> Osstest [1].
> 
> Furthermore, Set/Way operations are not available on system caches. This
> means that an OS, such as Linux 32-bit, relying on those operations to
> fully clean the cache before disabling the MMU may break because data
> may sit in system caches and not in RAM.
> 
> For more details about Set/Way, see the talk "The Art of Virtualizing
> Cache Maintenance" given at Xen Summit 2018 [2].
> 
> In the context of Xen, we need to trap Set/Way operations and emulate
> them. From the Arm ARM (B1.14.4 in DDI 0406C.c), Set/Way operations are
> difficult to virtualize. So we can assume that a guest OS using them
> will suffer the consequences (i.e. slowness) until developers remove
> all the usage of Set/Way.
> 
> As the software is not allowed to infer the Set/Way to Physical Address
> mapping, Xen will need to go through the guest P2M and clean &
> invalidate all the entries mapped.
> 
> Because Set/Way operations happen in batches (a loop over all the
> Sets/Ways of a cache), Xen would need to go through the P2M for every
> instruction. This is quite expensive and would severely impact the
> guest OS. The implementation re-uses the KVM policy to limit the number
> of flushes:
>     - If we trap a Set/Way operations, we enable VM trapping (i.e
                   ^ remove 'a'

>       HCR_EL2.TVM) to detect cache being turned on/off, and do a full
>     clean.

"do a full clean" straight away, right? May I suggest a rewording of
this item:

- as soon as we trap a set/way operation, we enable VM trapping (i.e.
  HCR_EL2.TVM, it'll allow us to detect the cache being turned on/off),
  then we do a full clean


>     - We clean the caches when turning on and off

"We clean the caches when the guest turns caches on or off"


>     - Once the caches are enabled, we stop trapping VM instructions
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
> [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/arm/arm64/vsysreg.c | 27 +++++++++++++++++-
>  xen/arch/arm/p2m.c           | 68 ++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c         |  2 +-
>  xen/arch/arm/vcpreg.c        | 23 +++++++++++++++
>  xen/include/asm-arm/p2m.h    | 16 +++++++++++
>  5 files changed, 134 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 1517879697..43c6c3e30d 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -40,7 +40,20 @@ static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
>  }
>  
>  /* Defining helpers for emulating sysreg registers. */
> -TVM_REG(SCTLR_EL1)
> +static bool vreg_emulate_SCTLR_EL1(struct cpu_user_regs *regs, uint64_t *r,
> +                                   bool read)
> +{
> +    struct vcpu *v = current;
> +    bool cache_enabled = vcpu_has_cache_enabled(v);
> +
> +    GUEST_BUG_ON(read);
> +    WRITE_SYSREG(*r, SCTLR_EL1);
> +
> +    p2m_toggle_cache(v, cache_enabled);
> +
> +    return true;
> +}
> +
>  TVM_REG(TTBR0_EL1)
>  TVM_REG(TTBR1_EL1)
>  TVM_REG(TCR_EL1)
> @@ -84,6 +97,18 @@ void do_sysreg(struct cpu_user_regs *regs,
>          break;
>  
>      /*
> +     * HCR_EL2.TSW
> +     *
> +     * ARMv8 (DDI 0487B.b): Table D1-42
> +     */
> +    case HSR_SYSREG_DCISW:
> +    case HSR_SYSREG_DCCSW:
> +    case HSR_SYSREG_DCCISW:
> +        if ( hsr.sysreg.read )

Shouldn't it be !hsr.sysreg.read ?


> +            p2m_set_way_flush(current);
> +        break;
> +
> +    /*
>       * HCR_EL2.TVM
>       *
>       * ARMv8 (DDI 0487B.b): Table D1-37
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index df6b48d73b..a3d4c563b1 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1564,6 +1564,74 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>      return 0;
>  }
>  
> +static void p2m_flush_vm(struct vcpu *v)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
> +
> +    /* XXX: Handle preemption */

Yes, good to have this reminder. Maybe add "we'd want to break the
operation down when it takes too long".


> +    p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
> +                          p2m->max_mapped_gfn);
> +}
> +
> +/*
> + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
> + * easily virtualized).
> + *
> + * Main problems:
> + *  - S/W ops are local to a CPU (not broadcast)
> + *  - We have line migration behind our back (speculation)
> + *  - System caches don't support S/W at all (damn!)
> + *
> + * In the face of the above, the best we can do is to try and convert
> + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
> + * to PA mapping, it can only use S/W to nuke the whole cache, which is
> + * rather a good thing for us.
> + *
> + * Also, it is only used when turning caches on/off ("The expected
> + * usage of the cache maintenance instructions that operate by set/way
> + * is associated with the powerdown and powerup of caches, if this is
> + * required by the implementation.").
> + *
> + * We use the following policy:
> + *  - If we trap a S/W operation, we enable VM trapping to detect
> + *  caches being turned on/off, and do a full clean.
> + *
> + *  - We flush the caches on both caches being turned on and off.
> + *
> + *  - Once the caches are enabled, we stop trapping VM ops.
> + */
> +void p2m_set_way_flush(struct vcpu *v)
> +{
> +    /* This function can only work with the current vCPU. */
> +    ASSERT(v == current);

NIT: if it can only operate on current, it makes sense to remove the
struct vcpu* parameter


> +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
> +    {
> +        p2m_flush_vm(v);
> +        vcpu_hcr_set_flags(v, HCR_TVM);
> +    }
> +}
> +
> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
> +{
> +    bool now_enabled = vcpu_has_cache_enabled(v);
> +
> +    /* This function can only work with the current vCPU. */
> +    ASSERT(v == current);

NIT: same about struct vcpu* as parameter when only current can be used


> +    /*
> +     * If switching the MMU+caches on, need to invalidate the caches.
> +     * If switching it off, need to clean the caches.
> +     * Clean + invalidate does the trick always.
> +     */
> +    if ( was_enabled != now_enabled )
> +        p2m_flush_vm(v);
> +
> +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
> +    if ( now_enabled )
> +        vcpu_hcr_clear_flags(v, HCR_TVM);
> +}
> +
>  mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
>  {
>      return p2m_lookup(d, gfn, NULL);
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 169b57cb6b..cdc10eee5a 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -98,7 +98,7 @@ register_t get_default_hcr_flags(void)
>  {
>      return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
>               (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
> -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
> +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>  }
>  
>  static enum {
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index 49529b97cd..dc46d9d0d7 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -45,9 +45,14 @@
>  #define TVM_REG(sz, func, reg...)                                           \
>  static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>  {                                                                           \
> +    struct vcpu *v = current;                                               \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
> +                                                                            \
>      GUEST_BUG_ON(read);                                                     \
>      WRITE_SYSREG##sz(*r, reg);                                              \
>                                                                              \
> +    p2m_toggle_cache(v, cache_enabled);                                     \

This will affect all the registers trapped with TVM. Shouldn't we only
call p2m_toggle_cache when relevant? i.e. when changing SCTLR?
I think it would be better to only modify the SCTLR emulation handler.


>      return true;                                                            \
>  }
>  
> @@ -65,6 +70,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>  static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>                                  bool read, bool hi)                         \
>  {                                                                           \
> +    struct vcpu *v = current;                                               \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>      register_t reg = READ_SYSREG(xreg);                                     \
>                                                                              \
>      GUEST_BUG_ON(read);                                                     \
> @@ -80,6 +87,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>      }                                                                       \
>      WRITE_SYSREG(reg, xreg);                                                \
>                                                                              \
> +    p2m_toggle_cache(v, cache_enabled);                                     \
> +                                                                            \
>      return true;                                                            \
>  }                                                                           \
>                                                                              \
> @@ -98,6 +107,7 @@ static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
>  
>  /* Defining helpers for emulating co-processor registers. */
>  TVM_REG32(SCTLR, SCTLR_EL1)
> +

Spurious change. Should be in a previous patch?

>  /*
>   * AArch32 provides two way to access TTBR* depending on the access
>   * size, whilst AArch64 provides one way.
> @@ -180,6 +190,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>          break;
>  
>      /*
> +     * HCR_EL2.TSW
> +     *
> +     * ARMv7 (DDI 0406C.b): B1.14.6
> +     * ARMv8 (DDI 0487B.b): Table D1-42
> +     */
> +    case HSR_CPREG32(DCISW):
> +    case HSR_CPREG32(DCCSW):
> +    case HSR_CPREG32(DCCISW):
> +        if ( !cp32.read )
> +            p2m_set_way_flush(current);
> +        break;
> +
> +    /*
>       * HCR_EL2.TVM
>       *
>       * ARMv8 (DDI 0487B.b): Table D1-37
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 92213dd1ab..c470f062db 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -231,6 +231,10 @@ int p2m_set_entry(struct p2m_domain *p2m,
>   */
>  int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
>  
> +void p2m_set_way_flush(struct vcpu *v);
> +
> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled);
> +
>  /*
>   * Map a region in the guest p2m with a specific p2m type.
>   * The memory attributes will be derived from the p2m type.
> @@ -358,6 +362,18 @@ static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
>      return -EOPNOTSUPP;
>  }
>  
> +/*
> + * A vCPU has cache enabled only when the MMU is enabled and data cache
> + * is enabled.
> + */
> +static inline bool vcpu_has_cache_enabled(struct vcpu *v)
> +{
> +    /* Only works with the current vCPU */
> +    ASSERT(current == v);

NIT: same about struct vcpu *v as parameter when only current makes
sense


> +    return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == (SCTLR_C|SCTLR_M);
> +}
> +
>  #endif /* _XEN_P2M_H */
>  
>  /*


* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-10-08 18:33 ` [RFC 16/16] xen/arm: Track page accessed between batch of " Julien Grall
  2018-10-09  7:04   ` Jan Beulich
@ 2018-11-05 21:35   ` Stefano Stabellini
  2018-11-05 23:28     ` Julien Grall
  1 sibling, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-05 21:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, sstabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel, Jan Beulich

On Mon, 8 Oct 2018, Julien Grall wrote:
> At the moment, the implementation of Set/Way operations will go through
> all the entries of the guest P2M and flush them. However, this is very
> expensive and may render a guest OS using them unusable.
> 
> For instance, Linux 32-bit will use Set/Way operations during secondary
> CPU bring-up. As the implementation is really expensive, it may be possible
> to hit the CPU bring-up timeout.
> 
> To limit the Set/Way impact, we track which pages of the guest have
> been accessed between batches of Set/Way operations. This is done
> using bit[0] (aka the valid bit) of the P2M entry.

This is going to improve performance of ill-mannered guests at the cost
of hurting performance of well-mannered guests. Is it really a good
trade-off? Should this behavior at least be configurable with a Xen
command line?
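
Just to illustrate, a minimal sketch of such a knob (the option name is
made up; boolean_param() is the existing helper for this kind of thing):

    static bool __read_mostly opt_sw_track = true;
    boolean_param("sw-track", opt_sw_track);  /* name is hypothetical */

    /* ... and in arch_domain_creation_finished(): */
    if ( opt_sw_track && !iommu_use_hap_pt(d) )
        p2m_invalidate_root(p2m_get_hostp2m(d));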


> This patch introduces a new per-arch helper to perform actions just
> before the guest is first unpaused. This will be used to invalidate the
> P2M in order to track accesses from the start of the guest.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
>
> ---
> 
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/arm/domain.c       | 14 ++++++++++++++
>  xen/arch/arm/domain_build.c |  7 +++++++
>  xen/arch/arm/p2m.c          | 32 +++++++++++++++++++++++++++++++-
>  xen/arch/x86/domain.c       |  4 ++++
>  xen/common/domain.c         |  5 ++++-
>  xen/include/asm-arm/p2m.h   |  2 ++
>  xen/include/xen/domain.h    |  2 ++
>  7 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index feebbf5a92..f439f4657a 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -738,6 +738,20 @@ int arch_domain_soft_reset(struct domain *d)
>      return -ENOSYS;
>  }
>  
> +void arch_domain_creation_finished(struct domain *d)
> +{
> +    /*
> +     * To avoid flushing the whole guest RAM on the first Set/Way, we
> +     * invalidate the P2M to track what has been accessed.
> +     *
> +     * This is only done when the IOMMU is not used or the page-tables
> +     * are not shared, because an entry with bit[0] (i.e. the valid bit)
> +     * unset will result in an IOMMU fault that cannot be fixed up.
> +     */
> +    if ( !iommu_use_hap_pt(d) )
> +        p2m_invalidate_root(p2m_get_hostp2m(d));
> +}
> +
>  static int is_guest_pv32_psr(uint32_t psr)
>  {
>      switch (psr & PSR_MODE_MASK)
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f552154e93..de96516faa 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2249,6 +2249,13 @@ int __init construct_dom0(struct domain *d)
>      v->is_initialised = 1;
>      clear_bit(_VPF_down, &v->pause_flags);
>  
> +    /*
> +     * XXX: We probably want a command line option to control whether
> +     * the P2M is invalidated. This is because invalidating the P2M will
> +     * not work with an IOMMU; however, if it is not done there will be
> +     * a performance impact.

Why can't we check on iommu_use_hap_pt(d) like in
arch_domain_creation_finished?

In any case, I agree it is a good idea to introduce a command line
parameter to toggle the p2m_invalidate_root call at domain creation
on/off. There are cases where it should be off even if an IOMMU is
present.

Aside from these two questions, the rest of the patch looks correct.


> +     */
> +    p2m_invalidate_root(p2m_get_hostp2m(d));
>
>      return 0;
>  }
>  
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index a3d4c563b1..8e0c31d7ac 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
>      p2m->need_flush = true;
>  }
>  
> +/*
> + * Invalidate all entries in the root page-tables. This is
> + * useful to get a fault on entry and do an action.
> + */
> +void p2m_invalidate_root(struct p2m_domain *p2m)
> +{
> +    unsigned int i;
> +
> +    p2m_write_lock(p2m);
> +
> +    for ( i = 0; i < P2M_ROOT_LEVEL; i++ )
> +        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
> +
> +    p2m_write_unlock(p2m);
> +}
> +
>  bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> @@ -1539,7 +1555,8 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>  
>      for ( ; gfn_x(start) < gfn_x(end); start = next_gfn )
>      {
> -        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
> +        bool valid;
> +        mfn_t mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
>  
>          next_gfn = gfn_next_boundary(start, order);
>  
> @@ -1547,6 +1564,13 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>          if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
>              continue;
>  
> +        /*
> +         * A page with the valid bit (bit[0]) unset does not need to
> +         * be cleaned.
> +         */
> +        if ( !valid )
> +            continue;
> +
>          /* XXX: Implement preemption */
>          while ( gfn_x(start) < gfn_x(next_gfn) )
>          {
> @@ -1571,6 +1595,12 @@ static void p2m_flush_vm(struct vcpu *v)
>      /* XXX: Handle preemption */
>      p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
>                            p2m->max_mapped_gfn);
> +
> +    /*
> +     * Invalidate the p2m to track which pages were modified by the
> +     * guest between calls of p2m_flush_vm().
> +     */
> +    p2m_invalidate_root(p2m);
>  }
>  
>  /*
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 9371efc8c7..2b6d1c01a1 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -723,6 +723,10 @@ int arch_domain_soft_reset(struct domain *d)
>      return ret;
>  }
>  
> +void arch_domain_creation_finished(struct domain *d)
> +{
> +}
> +
>  /*
>   * These are the masks of CR4 bits (subject to hardware availability) which a
>   * PV guest may not legitimiately attempt to modify.
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 65151e2ac4..b402c635f9 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1100,8 +1100,11 @@ int domain_unpause_by_systemcontroller(struct domain *d)
>       * Creation is considered finished when the controller reference count
>       * first drops to 0.
>       */
> -    if ( new == 0 )
> +    if ( new == 0 && !d->creation_finished )
> +    {
>          d->creation_finished = true;
> +        arch_domain_creation_finished(d);
> +    }
>  
>      domain_unpause(d);
>  
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index c470f062db..2a4652e7f4 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -225,6 +225,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
>                    p2m_type_t t,
>                    p2m_access_t a);
>  
> +void p2m_invalidate_root(struct p2m_domain *p2m);
> +
>  /*
>   * Clean & invalidate caches corresponding to a region [start,end) of guest
>   * address space.
> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> index 5e393fd7f2..8d95ad4bf1 100644
> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -70,6 +70,8 @@ void arch_domain_unpause(struct domain *d);
>  
>  int arch_domain_soft_reset(struct domain *d);
>  
> +void arch_domain_creation_finished(struct domain *d);
> +
>  void arch_p2m_set_access_required(struct domain *d, bool access_required);
>  
>  int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
> -- 
> 2.11.0
> 


* Re: [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-11-05 19:47   ` Stefano Stabellini
@ 2018-11-05 23:21     ` Julien Grall
  2018-11-06 17:36       ` Stefano Stabellini
  2018-12-04 16:24       ` Julien Grall
  0 siblings, 2 replies; 62+ messages in thread
From: Julien Grall @ 2018-11-05 23:21 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 11/5/18 7:47 PM, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> A follow-up patch will require emulating some accesses to some
>> co-processor registers trapped by HCR_EL2.TVM. When set, all NS EL1
>> writes to the virtual memory control registers will be trapped to the
>> hypervisor.
>>
>> This patch adds the infrastructure to pass through such accesses to the
>> host registers. For convenience, a bunch of macros have been added to
>> generate the different helpers.
>>
>> Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   xen/arch/arm/vcpreg.c        | 144 +++++++++++++++++++++++++++++++++++++++++++
>>   xen/include/asm-arm/cpregs.h |   1 +
>>   2 files changed, 145 insertions(+)
>>
>> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
>> index b04d996fd3..49529b97cd 100644
>> --- a/xen/arch/arm/vcpreg.c
>> +++ b/xen/arch/arm/vcpreg.c
>> @@ -24,6 +24,122 @@
>>   #include <asm/traps.h>
>>   #include <asm/vtimer.h>
>>   
>> +/*
>> + * Macros to help generate helpers for registers trapped when
>> + * HCR_EL2.TVM is set.
>> + *
>> + * Note that it only traps NS write access from EL1.
>> + *
>> + *  - TVM_REG() should not be used outside of the macros. It is there to
>> + *    help defining TVM_REG32() and TVM_REG64()
>> + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
>> + *    resp. generate helper accessing 32-bit and 64-bit register. "regname"
>> + *    been the Arm32 name and "xreg" the Arm64 name.
>           ^ is
> 
> Please add that we use the Arm64 reg name to call WRITE_SYSREG in the
> Xen source code even on Arm32 in general

I am not sure I understand this. It is common practice in Xen to use the 
arm64 name when the code is for both architectures. So why would I need 
a specific comment here?

> 
>> + *  - UPDATE_REG32_COMBINED(lowreg, hireg, xreg) are used to generate a
> 
> TVM_REG32_COMBINED
> 
> 
>> + *  pair of registers share the same Arm32 registers. "lowreg" and
>> + *  "higreg" been resp. the Arm32 name and "xreg" the Arm64 name. "lowreg"
>> + *  will use xreg[31:0] and "hireg" will use xreg[63:32].
> 
> Please add that xreg is unused in the Arm32 case.

Why do you think that? xreg is actually used: it will get expanded to 
whatever the co-processor encoding is and caught by the variadic reg... 
parameter in TVM_REG().
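
For instance, a rough sketch of the Arm32 side (this assumes the usual 
cpregs.h aliases of the AArch64 names to the cp15 encodings):

    /* Sketch: on Arm32, cpregs.h provides aliases along the lines of */
    #define SCTLR_EL1 SCTLR                /* SCTLR is the cp15 encoding */

    /* so that TVM_REG32(SCTLR, SCTLR_EL1) expands to: */
    static bool vreg_emulate_SCTLR(struct cpu_user_regs *regs,
                                   uint32_t *r, bool read)
    {
        GUEST_BUG_ON(read);
        WRITE_SYSREG32(*r, SCTLR_EL1);     /* i.e. the cp15 register */
        return true;
    }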

> 
> 
>> + */
>> +
>> +/* The name is passed from the upper macro to workaround macro expansion. */
>> +#define TVM_REG(sz, func, reg...)                                           \
>> +static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>> +{                                                                           \
>> +    GUEST_BUG_ON(read);                                                     \
>> +    WRITE_SYSREG##sz(*r, reg);                                              \
>> +                                                                            \
>> +    return true;                                                            \
>> +}
>> +
>> +#define TVM_REG32(regname, xreg) TVM_REG(32, vreg_emulate_##regname, xreg)
>> +#define TVM_REG64(regname, xreg) TVM_REG(64, vreg_emulate_##regname, xreg)
>> +
>> +#ifdef CONFIG_ARM_32
>> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
>> +    /* Use TVM_REG directly to workaround macro expansion. */       \
>> +    TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
>> +    TVM_REG(32, vreg_emulate_##hireg, hireg)
>> +
>> +#else /* CONFIG_ARM_64 */
>> +#define TVM_REG32_COMBINED(lowreg, hireg, xreg)                             \
>> +static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>> +                                bool read, bool hi)                         \
>> +{                                                                           \
>> +    register_t reg = READ_SYSREG(xreg);                                     \
>> +                                                                            \
>> +    GUEST_BUG_ON(read);                                                     \
>> +    if ( hi ) /* reg[63:32] is AArch32 register hireg */                    \
>> +    {                                                                       \
>> +        reg &= GENMASK(31, 0);                                              \
> 
> Move GENMASK before the if? It's the same regardless

Actually, the second GENMASK is incorrect. It should have been 
GENMASK(63, 32) as we want to update only the lowreg.

So I will fix the mask instead.
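Something like this, i.e. a sketch of the fixed branches (not the exact 
code, the assignment lines are from memory):

    if ( hi ) /* reg[63:32] is AArch32 register hireg */
    {
        reg &= GENMASK(31, 0);             /* keep lowreg in reg[31:0] */
        reg |= (uint64_t)*r << 32;
    }
    else      /* reg[31:0] is AArch32 register lowreg */
    {
        reg &= GENMASK(63, 32);            /* keep hireg in reg[63:32] */
        reg |= *r;
    }
    WRITE_SYSREG(reg, xreg);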

>>       /*
>> +     * HCR_EL2.TVM
>> +     *
>> +     * ARMv8 (DDI 0487B.b): Table D1-37
> 
> In 0487D.a is D1-99

I haven't had the chance to download the latest spec (it was released 
last week). I will update to the new spec.

Cheers,

-- 
Julien Grall


* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-11-05 21:35   ` Stefano Stabellini
@ 2018-11-05 23:28     ` Julien Grall
  2018-11-06 17:43       ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-05 23:28 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Tim Deegan, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, andre.przywara, xen-devel,
	Jan Beulich

Hi Stefano,

On 11/5/18 9:35 PM, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> At the moment, the implementation of Set/Way operations will go through
>> all the entries of the guest P2M and flush them. However, this is very
>> expensive and may render a guest OS using them unusable.
>>
>> For instance, Linux 32-bit will use Set/Way operations during secondary
>> CPU bring-up. As the implementation is really expensive, it may be possible
>> to hit the CPU bring-up timeout.
>>
>> To limit the Set/Way impact, we track which pages of the guest have
>> been accessed between batches of Set/Way operations. This is done
>> using bit[0] (aka the valid bit) of the P2M entry.
> 
> This is going to improve performance of ill-mannered guests at the cost
> of hurting performance of well-mannered guests. Is it really a good
> trade-off? Should this behavior at least be configurable with a Xen
> command line?

Well, the choice is between not being able to boot Linux 32-bit 
anymore or having a slight impact on boot time for all guests.

As you may have noticed, a command line option is suggested below. I 
have not implemented it yet, as we agreed at Connect it would be good 
to start getting feedback on the approach first.

> 
> 
>> This patch introduces a new per-arch helper to perform actions just
>> before the guest is first unpaused. This will be used to invalidate the
>> P2M in order to track accesses from the start of the guest.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> ---
>>
>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>> Cc: Julien Grall <julien.grall@arm.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Tim Deegan <tim@xen.org>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   xen/arch/arm/domain.c       | 14 ++++++++++++++
>>   xen/arch/arm/domain_build.c |  7 +++++++
>>   xen/arch/arm/p2m.c          | 32 +++++++++++++++++++++++++++++++-
>>   xen/arch/x86/domain.c       |  4 ++++
>>   xen/common/domain.c         |  5 ++++-
>>   xen/include/asm-arm/p2m.h   |  2 ++
>>   xen/include/xen/domain.h    |  2 ++
>>   7 files changed, 64 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index feebbf5a92..f439f4657a 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -738,6 +738,20 @@ int arch_domain_soft_reset(struct domain *d)
>>       return -ENOSYS;
>>   }
>>   
>> +void arch_domain_creation_finished(struct domain *d)
>> +{
>> +    /*
>> +     * To avoid flushing the whole guest RAM on the first Set/Way, we
>> +     * invalidate the P2M to track what has been accessed.
>> +     *
>> +     * This is only done when the IOMMU is not used or the page-tables
>> +     * are not shared, because an entry with bit[0] (i.e. the valid bit)
>> +     * unset will result in an IOMMU fault that cannot be fixed up.
>> +     */
>> +    if ( !iommu_use_hap_pt(d) )
>> +        p2m_invalidate_root(p2m_get_hostp2m(d));
>> +}
>> +
>>   static int is_guest_pv32_psr(uint32_t psr)
>>   {
>>       switch (psr & PSR_MODE_MASK)
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index f552154e93..de96516faa 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -2249,6 +2249,13 @@ int __init construct_dom0(struct domain *d)
>>       v->is_initialised = 1;
>>       clear_bit(_VPF_down, &v->pause_flags);
>>   
>> +    /*
>> +     * XXX: We probably want a command line option to control whether
>> +     * the P2M is invalidated. This is because invalidating the P2M will
>> +     * not work with an IOMMU; however, if it is not done there will be
>> +     * a performance impact.
> 
> Why can't we check on iommu_use_hap_pt(d) like in
> arch_domain_creation_finished?
> 
> In any case, I agree it is a good idea to introduce a command line
> parameter to toggle the p2m_invalidate_root call at domain creation
> on/off. There are cases where it should be off even if an IOMMU is
> present.

I actually forgot to remove that code as Dom0 should be covered by the 
change below.

I am not entirely sure I understand your last sentence: this feature is 
turned off when an IOMMU is present. So what is your use case here?

Cheers,

-- 
Julien Grall


* Re: [RFC 15/16] xen/arm: Implement Set/Way operations
  2018-11-05 21:10   ` Stefano Stabellini
@ 2018-11-05 23:38     ` Julien Grall
  2018-11-06 17:31       ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-05 23:38 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 11/5/18 9:10 PM, Stefano Stabellini wrote:
> On Mon, 8 Oct 2018, Julien Grall wrote:
>> Set/Way operations are used to perform maintenance on a given cache.
>> At the moment, Set/Way operations are not trapped and therefore a guest
>> OS will directly act on the local cache. However, a vCPU may migrate to
>> another pCPU in the middle of the process. This will result in caches
>> with stale data (Set/Way operations are not propagated), potentially
>> causing a crash. This may be the cause of the heisenbugs noticed in
>> Osstest [1].
>>
>> Furthermore, Set/Way operations are not available on system caches. This
>> means that an OS, such as Linux 32-bit, relying on those operations to
>> fully clean the cache before disabling the MMU may break because data
>> may sit in system caches and not in RAM.
>>
>> For more details about Set/Way, see the talk "The Art of Virtualizing
>> Cache Maintenance" given at Xen Summit 2018 [2].
>>
>> In the context of Xen, we need to trap Set/Way operations and emulate
>> them. From the Arm ARM (B1.14.4 in DDI 0406C.c), Set/Way operations are
>> difficult to virtualize. So we can assume that a guest OS using them
>> will suffer the consequences (i.e. slowness) until developers remove
>> all the usage of Set/Way.
>>
>> As the software is not allowed to infer the Set/Way to Physical Address
>> mapping, Xen will need to go through the guest P2M and clean &
>> invalidate all the entries mapped.
>>
>> Because Set/Way operations happen in batches (a loop over all the
>> Sets/Ways of a cache), Xen would need to go through the P2M for every
>> instruction. This is quite expensive and would severely impact the
>> guest OS. The implementation re-uses the KVM policy to limit the number
>> of flushes:
>>      - If we trap a Set/Way operations, we enable VM trapping (i.e
>                     ^ remove 'a'
> 
>>        HCR_EL2.TVM) to detect cache being turned on/off, and do a full
>>      clean.
> 
> "do a full clean" straight away, right? May I suggest a rewording of
> this item:
> 
> - as soon as we trap a set/way operation, we enable VM trapping (i.e.
>    HCR_EL2.TVM, it'll allow us to detect the cache being turned on/off),
>    then we do a full clean

Sure.

> 
> 
>>      - We clean the caches when turning on and off
> 
> "We clean the caches when the guest turns caches on or off"

Sure.

> 
> 
>>      - Once the caches are enabled, we stop trapping VM instructions
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
>> [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   xen/arch/arm/arm64/vsysreg.c | 27 +++++++++++++++++-
>>   xen/arch/arm/p2m.c           | 68 ++++++++++++++++++++++++++++++++++++++++++++
>>   xen/arch/arm/traps.c         |  2 +-
>>   xen/arch/arm/vcpreg.c        | 23 +++++++++++++++
>>   xen/include/asm-arm/p2m.h    | 16 +++++++++++
>>   5 files changed, 134 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
>> index 1517879697..43c6c3e30d 100644
>> --- a/xen/arch/arm/arm64/vsysreg.c
>> +++ b/xen/arch/arm/arm64/vsysreg.c
>> @@ -40,7 +40,20 @@ static bool vreg_emulate_##reg(struct cpu_user_regs *regs,          \
>>   }
>>   
>>   /* Defining helpers for emulating sysreg registers. */
>> -TVM_REG(SCTLR_EL1)
>> +static bool vreg_emulate_SCTLR_EL1(struct cpu_user_regs *regs, uint64_t *r,
>> +                                   bool read)
>> +{
>> +    struct vcpu *v = current;
>> +    bool cache_enabled = vcpu_has_cache_enabled(v);
>> +
>> +    GUEST_BUG_ON(read);
>> +    WRITE_SYSREG(*r, SCTLR_EL1);
>> +
>> +    p2m_toggle_cache(v, cache_enabled);
>> +
>> +    return true;
>> +}
>> +
>>   TVM_REG(TTBR0_EL1)
>>   TVM_REG(TTBR1_EL1)
>>   TVM_REG(TCR_EL1)
>> @@ -84,6 +97,18 @@ void do_sysreg(struct cpu_user_regs *regs,
>>           break;
>>   
>>       /*
>> +     * HCR_EL2.TSW
>> +     *
>> +     * ARMv8 (DDI 0487B.b): Table D1-42
>> +     */
>> +    case HSR_SYSREG_DCISW:
>> +    case HSR_SYSREG_DCCSW:
>> +    case HSR_SYSREG_DCCISW:
>> +        if ( hsr.sysreg.read )
> 
> Shouldn't it be !hsr.sysreg.read ?

Hmmm yes.

> 
> 
>> +            p2m_set_way_flush(current);
>> +        break;
>> +
>> +    /*
>>        * HCR_EL2.TVM
>>        *
>>        * ARMv8 (DDI 0487B.b): Table D1-37
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index df6b48d73b..a3d4c563b1 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1564,6 +1564,74 @@ int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end)
>>       return 0;
>>   }
>>   
>> +static void p2m_flush_vm(struct vcpu *v)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
>> +
>> +    /* XXX: Handle preemption */
> 
> Yes, good to have this reminder. Maybe add "we'd want to break the
> operation down when it takes too long".

I am planning to handle this before the series is actually merged. This 
is a massive security hole and easy to exploit.

> 
> 
>> +    p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
>> +                          p2m->max_mapped_gfn);
>> +}
>> +
>> +/*
>> + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
>> + * easily virtualized).
>> + *
>> + * Main problems:
>> + *  - S/W ops are local to a CPU (not broadcast)
>> + *  - We have line migration behind our back (speculation)
>> + *  - System caches don't support S/W at all (damn!)
>> + *
>> + * In the face of the above, the best we can do is to try and convert
>> + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
>> + * to PA mapping, it can only use S/W to nuke the whole cache, which is
>> + * rather a good thing for us.
>> + *
>> + * Also, it is only used when turning caches on/off ("The expected
>> + * usage of the cache maintenance instructions that operate by set/way
>> + * is associated with the powerdown and powerup of caches, if this is
>> + * required by the implementation.").
>> + *
>> + * We use the following policy:
>> + *  - If we trap a S/W operation, we enable VM trapping to detect
>> + *  caches being turned on/off, and do a full clean.
>> + *
>> + *  - We flush the caches on both caches being turned on and off.
>> + *
>> + *  - Once the caches are enabled, we stop trapping VM ops.
>> + */
>> +void p2m_set_way_flush(struct vcpu *v)
>> +{
>> +    /* This function can only work with the current vCPU. */
>> +    ASSERT(v == current);
> 
> NIT: if it can only operate on current, it makes sense to remove the
> struct vcpu* parameter

I prefer to keep struct vcpu *v here. It makes it more straightforward 
to know what the function is working on.

Furthermore, current is actually quite expensive to use in some 
circumstances.

For instance, in the nested case, accesses to TPIDR_EL2 will get 
trapped to the host hypervisor.

So it would be best if we start avoiding current whenever possible.


> 
> 
>> +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
>> +    {
>> +        p2m_flush_vm(v);
>> +        vcpu_hcr_set_flags(v, HCR_TVM);
>> +    }
>> +}
>> +
>> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
>> +{
>> +    bool now_enabled = vcpu_has_cache_enabled(v);
>> +
>> +    /* This function can only work with the current vCPU. */
>> +    ASSERT(v == current);
> 
> NIT: same about struct vcpu* as parameter when only current can be used
> 
> 
>> +    /*
>> +     * If switching the MMU+caches on, need to invalidate the caches.
>> +     * If switching it off, need to clean the caches.
>> +     * Clean + invalidate does the trick always.
>> +     */
>> +    if ( was_enabled != now_enabled )
>> +        p2m_flush_vm(v);
>> +
>> +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
>> +    if ( now_enabled )
>> +        vcpu_hcr_clear_flags(v, HCR_TVM);
>> +}
>> +
>>   mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
>>   {
>>       return p2m_lookup(d, gfn, NULL);
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 169b57cb6b..cdc10eee5a 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -98,7 +98,7 @@ register_t get_default_hcr_flags(void)
>>   {
>>       return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
>>                (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
>> -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
>> +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>>   }
>>   
>>   static enum {
>> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
>> index 49529b97cd..dc46d9d0d7 100644
>> --- a/xen/arch/arm/vcpreg.c
>> +++ b/xen/arch/arm/vcpreg.c
>> @@ -45,9 +45,14 @@
>>   #define TVM_REG(sz, func, reg...)                                           \
>>   static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>>   {                                                                           \
>> +    struct vcpu *v = current;                                               \
>> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>> +                                                                            \
>>       GUEST_BUG_ON(read);                                                     \
>>       WRITE_SYSREG##sz(*r, reg);                                              \
>>                                                                               \
>> +    p2m_toggle_cache(v, cache_enabled);                                     \
> 
> This will affect all the registers trapped with TVM. Shouldn't we only
> call p2m_toggle_cache when relevant? i.e. when changing SCTLR?
> I think it would be better to only modify the SCTLR emulation handler.

This is done on purpose: it increases the chance to disable TVM as soon 
as possible. If you only rely on SCTLR, you may end up trapping a lot 
of registers for a long time.

FWIW, as I already wrote in the commit message, this is based on what 
KVM does.
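
To illustrate the sequence (a sketch; the exact register order is made 
up):

    /*
     * S/W op, caches off (trap)  -> full clean, set HCR_EL2.TVM
     * write TTBR0_EL1 (trap)     -> cache state unchanged, keep TVM
     * write SCTLR_EL1, cache on  -> state toggled: full clean, clear TVM
     *
     * S/W op, caches on (trap)   -> full clean, set HCR_EL2.TVM
     * write TTBR0_EL1 (trap)     -> cache already on: clear TVM straight
     *                               away, no need to wait for an SCTLR
     *                               write
     */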

> 
> 
>>       return true;                                                            \
>>   }
>>   
>> @@ -65,6 +70,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>>   static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>>                                   bool read, bool hi)                         \
>>   {                                                                           \
>> +    struct vcpu *v = current;                                               \
>> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>>       register_t reg = READ_SYSREG(xreg);                                     \
>>                                                                               \
>>       GUEST_BUG_ON(read);                                                     \
>> @@ -80,6 +87,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>>       }                                                                       \
>>       WRITE_SYSREG(reg, xreg);                                                \
>>                                                                               \
>> +    p2m_toggle_cache(v, cache_enabled);                                     \
>> +                                                                            \
>>       return true;                                                            \
>>   }                                                                           \
>>                                                                               \
>> @@ -98,6 +107,7 @@ static bool vreg_emulate_##hireg(struct cpu_user_regs *regs, uint32_t *r,   \
>>   
>>   /* Defining helpers for emulating co-processor registers. */
>>   TVM_REG32(SCTLR, SCTLR_EL1)
>> +
> 
> Spurious change. Should be in a previous patch?

Likely.

> 
>>   /*
>>    * AArch32 provides two way to access TTBR* depending on the access
>>    * size, whilst AArch64 provides one way.
>> @@ -180,6 +190,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>>           break;
>>   
>>       /*
>> +     * HCR_EL2.TSW
>> +     *
>> +     * ARMv7 (DDI 0406C.b): B1.14.6
>> +     * ARMv8 (DDI 0487B.b): Table D1-42
>> +     */
>> +    case HSR_CPREG32(DCISW):
>> +    case HSR_CPREG32(DCCSW):
>> +    case HSR_CPREG32(DCCISW):
>> +        if ( !cp32.read )
>> +            p2m_set_way_flush(current);
>> +        break;
>> +
>> +    /*
>>        * HCR_EL2.TVM
>>        *
>>        * ARMv8 (DDI 0487B.b): Table D1-37
>> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
>> index 92213dd1ab..c470f062db 100644
>> --- a/xen/include/asm-arm/p2m.h
>> +++ b/xen/include/asm-arm/p2m.h
>> @@ -231,6 +231,10 @@ int p2m_set_entry(struct p2m_domain *p2m,
>>    */
>>   int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
>>   
>> +void p2m_set_way_flush(struct vcpu *v);
>> +
>> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled);
>> +
>>   /*
>>    * Map a region in the guest p2m with a specific p2m type.
>>    * The memory attributes will be derived from the p2m type.
>> @@ -358,6 +362,18 @@ static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
>>       return -EOPNOTSUPP;
>>   }
>>   
>> +/*
>> + * A vCPU has cache enabled only when the MMU is enabled and data cache
>> + * is enabled.
>> + */
>> +static inline bool vcpu_has_cache_enabled(struct vcpu *v)
>> +{
>> +    /* Only works with the current vCPU */
>> +    ASSERT(current == v);
> 
> NIT: same about struct vcpu *v as parameter when only current makes
> sense
> 
> 
>> +    return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == (SCTLR_C|SCTLR_M);
>> +}
>> +
>>   #endif /* _XEN_P2M_H */
>>   
>>   /*

Cheers,

-- 
Julien Grall


* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-11-05 17:56       ` Stefano Stabellini
@ 2018-11-06 14:20         ` Julien Grall
  2018-11-12 17:59           ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-06 14:20 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 05/11/2018 17:56, Stefano Stabellini wrote:
> On Mon, 5 Nov 2018, Julien Grall wrote:
>> On 02/11/2018 23:27, Stefano Stabellini wrote:
>>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>> Currently a Stage-2 translation fault could happen:
>>>>       1) MMIO emulation
>>>>       2) When the page-tables is been updated using Break-Before-Make
>>>                                 ^ have
>>>
>>>>       3) Page not mapped
>>>>
>>>> A follow-up patch will re-purpose the valid bit in an entry to generate
>>>> translation fault. This would be used to do an action on each entries to
>>>                                                                   ^entry
>>>
>>>> track page used for a given period.
>>>           ^pages
>>>
>>>
>>>>
>>>> A new function is introduced to try to resolve a translation fault. This
>>>> will include 2) and the new way to generate fault explained above.
>>>
>>> I can see the code does what you describe, but I don't understand why we
>>> are doing this. What is missing in the commit message is the explanation
>>> of the relationship between the future goal of repurposing the valid bit
>>> and the introduction of a function to handle Break-Before-Make stage2
>>> faults. Does it fix an issue with Break-Before-Make that we currently
>>> have? Or it becomes needed due to the repurposing of valid? If so, why?
>>
>> This does not fix any issue with BBM. The valid bit adds a 4th reason
>> for a translation fault. Both BBM and the valid bit require walking
>> the page-tables.
>>
>> For the valid bit, we will need to walk the page-table in order to fix
>> up the entry (i.e. set the valid bit). We also can't use p2m_lookup(...)
>> as it only tells you the mapping exists; the valid bit may still not
>> be set.
>>
>> So we need to provide a new helper to walk the page-table and fix up
>> an entry.
> 
> OK. Please expand a bit the commit message.

Sure.

>>>
>>>> +            break;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * If the valid bit of the entry is set, it means someone was playing
>>>> with
>>>> +     * the Stage-2 page table. Nothing to do and mark the fault as
>>>> resolved.
>>>> +     */
>>>> +    if ( lpae_is_valid(entry) )
>>>> +    {
>>>> +        resolved = true;
>>>> +        goto out_unmap;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * The valid bit is unset. If the entry is still not valid then the
>>>> fault
>>>> +     * cannot be resolved, exit and report it.
>>>> +     */
>>>> +    if ( !p2m_is_valid(entry) )
>>>> +        goto out_unmap;
>>>> +
>>>> +    /*
>>>> +     * Now we have an entry with valid bit unset, but still valid from
>>>> +     * the P2M point of view.
>>>> +     *
>>>> +     * For entry pointing to a table, the table will be invalidated.
>>>                 ^ entries
>>>
>>>
>>>> +     * For entry pointing to a block/page, no work to do for now.
>>>                 ^ entries
>>
>> I am not entirely sure why it should be plural here. We are dealing
>> with only one entry.
> 
> I was trying to make the grammar work as a generic sentence. To make it
> singular we would have to remove "For":
> 
>    If an entry is pointing to a table, the table will be invalidated.
>    If an entry is pointing to a block/page, no work to do for now.

I will use the singular version.

> 
> 
>>>
>>>> +     */
>>>> +    if ( lpae_is_table(entry, level) )
>>>> +        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
>>>
>>> Maybe because I haven't read the rest of the patches, it is not clear to
>>> me why in the case of an entry pointing to a table we need to invalidate
>>> it, and otherwise set valid to 1.
>>
>> This was written in the commit message:
>>
>> "To avoid invalidating all the page-tables entries in one go. It is
>> possible to invalidate the top-level table and then on trap invalidate
>> the table one-level down. This will be repeated until a block/page entry
>> has been reached."
>>
>> It is mostly to spread the cost of invalidating the page-tables. With this
>> solution, you only need to clear the valid bit of the top-level entries to
>> invalidate the full P2M.
>>
>> On the first access, you will trap, set the valid bit of the first
>> "invalid entry", and invalidate the next level if necessary.
>>
>> The access will then be retried. If trapped, the process is repeated until all
>> the entries are valid.
>>
>> It is possible to optimize it, avoiding intermediate traps where
>> possible. But I would not bother looking at that for now. Indeed, this
>> will be used to lower the cost of set/way cache maintenance emulation.
>> Any guest using those operations already expects a big cost.
> 
> So instead of walking the page table in Xen, finding all the leaf
> (level==3) entries that we need to set !valid, we just set !valid one of
> the higher levels entries. On access, we'll trap in Xen, then set the
> higher level entry back to valid but the direct children to !valid. And
> we'll cycle again through this until the table entries are valid and the
> leaf entry is the only invalid one: at that point we'll only set it to
> valid and the whole translation for that address is valid again.

That's correct.

> 
> Very inefficient, but very simple to implement in Xen. A good way to
> penalize guests that are using instructions they should not be using :-)

I am not convinced you will see a major slowdown, as you will quickly go 
through all the levels and end up not trapping anymore after a while. So 
the impact is mostly at boot time.
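
To make the walk concrete, the per-fault fixup is roughly (a sketch 
based on the hunks above; the actual walk and locking are elided):

    /* Walk down to the first entry with the valid bit (bit[0]) clear. */
    if ( lpae_is_valid(entry) )
        return true;               /* already fixed up, just retry */

    if ( !p2m_is_valid(entry) )
        return false;              /* genuinely unmapped, report the fault */

    /* Valid from the P2M point of view but bit[0] clear: fix up this
     * level only. */
    if ( lpae_is_table(entry, level) )
        p2m_invalidate_table(p2m, lpae_get_mfn(entry));

    entry.p2m.valid = 1;
    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);

    return true;                   /* the retried access may fault again,
                                      one level down */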

> 
> All right, please expand on the explanation in the commit message. It is
> also worthy of a in-code comment on top of
> p2m_resolve_translation_fault.

I will expand it.

> 
> One more comment below.
> 
> 
>>>
>>>> +    /*
>>>> +     * Now that the work on the entry is done, set the valid bit to
>>>> prevent
>>>> +     * another fault on that entry.
>>>> +     */
>>>> +    resolved = true;
>>>> +    entry.p2m.valid = 1;
>>>> +
>>>> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
>>>> +
>>>> +    /*
>>>> +     * No need to flush the TLBs as the modified entry had the valid bit
>>>> +     * unset.
>>>> +     */
>>>> +
>>>> +out_unmap:
>>>> +    unmap_domain_page(table);
>>>> +
>>>> +out:
>>>> +    p2m_write_unlock(p2m);
>>>> +
>>>> +    return resolved;
>>>> +}
>>>> +
>>>>    static inline int p2m_insert_mapping(struct domain *d,
>>>>                                         gfn_t start_gfn,
>>>>                                         unsigned long nr,
> 
> 
> We probably want to update the comment on top of the call to
> p2m_resolve_translation_fault:

Whoops. I will fix it.

> 
> 
>> @@ -1977,8 +1978,8 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>>            * with the Stage-2 page table. Walk the Stage-2 PT to check
>>            * if the entry exists. If it's the case, return to the guest
>>            */
>> -        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
>> -        if ( !mfn_eq(mfn, INVALID_MFN) )
>> +        if ( p2m_resolve_translation_fault(current->domain,
>> +                                           gaddr_to_gfn(gpa)) )
> 

Cheers,

-- 
Julien Grall


* Re: [RFC 15/16] xen/arm: Implement Set/Way operations
  2018-11-05 23:38     ` Julien Grall
@ 2018-11-06 17:31       ` Stefano Stabellini
  2018-11-06 17:34         ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-06 17:31 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Mon, 5 Nov 2018, Julien Grall wrote:
> > > +void p2m_set_way_flush(struct vcpu *v)
> > > +{
> > > +    /* This function can only work with the current vCPU. */
> > > +    ASSERT(v == current);
> > 
> > NIT: if it can only operate on current, it makes sense to remove the
> > struct vcpu* parameter
> 
> I prefer to keep struct vcpu *v here. This makes it more straightforward to
> know what the function is working on.
> 
> Furthermore, current is actually quite expensive to use in some circumstances.
> 
> For instance, in the nested case, TPIDR_EL2 will get trapped to the host
> hypervisor.
> 
> So it would be best if we start avoiding current whenever possible.

That's OK


> > 
> > 
> > > +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
> > > +    {
> > > +        p2m_flush_vm(v);
> > > +        vcpu_hcr_set_flags(v, HCR_TVM);
> > > +    }
> > > +}
> > > +
> > > +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
> > > +{
> > > +    bool now_enabled = vcpu_has_cache_enabled(v);
> > > +
> > > +    /* This function can only work with the current vCPU. */
> > > +    ASSERT(v == current);
> > 
> > NIT: same about struct vcpu* as parameter when only current can be used
> > 
> > 
> > > +    /*
> > > +     * If switching the MMU+caches on, need to invalidate the caches.
> > > +     * If switching it off, need to clean the caches.
> > > +     * Clean + invalidate does the trick always.
> > > +     */
> > > +    if ( was_enabled != now_enabled )
> > > +        p2m_flush_vm(v);
> > > +
> > > +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
> > > +    if ( now_enabled )
> > > +        vcpu_hcr_clear_flags(v, HCR_TVM);
> > > +}
> > > +
> > >   mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
> > >   {
> > >       return p2m_lookup(d, gfn, NULL);
> > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> > > index 169b57cb6b..cdc10eee5a 100644
> > > --- a/xen/arch/arm/traps.c
> > > +++ b/xen/arch/arm/traps.c
> > > @@ -98,7 +98,7 @@ register_t get_default_hcr_flags(void)
> > >   {
> > >       return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
> > >                (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
> > > -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
> > > +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
> > >   }
> > >     static enum {
> > > diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> > > index 49529b97cd..dc46d9d0d7 100644
> > > --- a/xen/arch/arm/vcpreg.c
> > > +++ b/xen/arch/arm/vcpreg.c
> > > @@ -45,9 +45,14 @@
> > >   #define TVM_REG(sz, func, reg...)
> > > \
> > >   static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)
> > > \
> > >   {
> > > \
> > > +    struct vcpu *v = current;
> > > \
> > > +    bool cache_enabled = vcpu_has_cache_enabled(v);
> > > \
> > > +
> > > \
> > >       GUEST_BUG_ON(read);
> > > \
> > >       WRITE_SYSREG##sz(*r, reg);
> > > \
> > >                                                                               \
> > > +    p2m_toggle_cache(v, cache_enabled);
> > > \
> > 
> > This will affect all the registers trapped with TVM. Shouldn't we only
> > call p2m_toggle_cache when relevant? i.e. when changing SCTLR?
> > I think it would be better to only modify the SCTLR emulation handler.
> 
> This is done on purpose: it increases the chance of disabling TVM as soon as
> possible. If you only rely on SCTLR, you may end up trapping a lot of
> registers for a long time.
> 
> FWIW, as I already wrote in the commit message, this is based on what KVM
> does.

I missed that. As you explain it, it makes sense. Maybe worth adding an
explicit statement about it: "On ARM64, we call p2m_toggle_cache from
the TVM-trapped register handlers to increase the chances of disabling
TVM as soon as possible."
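
For reference, vcpu_has_cache_enabled() -- used in the hunks above -- is
presumably a check that both SCTLR_EL1.M and SCTLR_EL1.C are set, something
like the sketch below (the bit masks are spelled out locally here rather than
taken from Xen's headers):

    /* SCTLR_EL1.M (MMU enable) is bit 0, SCTLR_EL1.C (D-cache enable) is bit 2. */
    #define SCTLR_M_BIT (1u << 0)
    #define SCTLR_C_BIT (1u << 2)

    static inline bool vcpu_has_cache_enabled(struct vcpu *v)
    {
        const uint32_t mask = SCTLR_M_BIT | SCTLR_C_BIT;

        /* SCTLR_EL1 is read live, so this only works for the current vCPU. */
        ASSERT(v == current);

        return (READ_SYSREG32(SCTLR_EL1) & mask) == mask;
    }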


* Re: [RFC 15/16] xen/arm: Implement Set/Way operations
  2018-11-06 17:31       ` Stefano Stabellini
@ 2018-11-06 17:34         ` Julien Grall
  2018-11-06 17:38           ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-06 17:34 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 06/11/2018 17:31, Stefano Stabellini wrote:
> On Mon, 5 Nov 2018, Julien Grall wrote:
>>> This will affect all the registers trapped with TVM. Shouldn't we only
>>> call p2m_toggle_cache when relevant? i.e. when changing SCTLR?
>>> I think it would be better to only modify the SCTLR emulation handler.
>>
>> This is done on purpose: it increases the chance of disabling TVM as soon as
>> possible. If you only rely on SCTLR, you may end up trapping a lot of
>> registers for a long time.
>>
>> FWIW, as I already wrote in the commit message, this is based on what KVM
>> does.
> 
> I missed that. As you explain it, it makes sense. Maybe worth adding an
> explicit statement about it: "On ARM64, we call p2m_toggle_cache from
> the TVM-trapped register handlers to increase the chances of disabling
> TVM as soon as possible."

I am assuming you meant arm32 here? Looking at the code, it looks like I
implemented the arm64 side differently. But we probably want to call
p2m_toggle_cache in all TVM-trapped register handlers.

This would keep the logic everywhere the same.

Cheers,

-- 
Julien Grall


* Re: [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-11-05 23:21     ` Julien Grall
@ 2018-11-06 17:36       ` Stefano Stabellini
  2018-11-06 17:52         ` Julien Grall
  2018-12-04 16:24       ` Julien Grall
  1 sibling, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-06 17:36 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Mon, 5 Nov 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/5/18 7:47 PM, Stefano Stabellini wrote:
> > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > A follow-up patch will require emulating some accesses to co-processor
> > > registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes to the
> > > virtual memory control registers will be trapped to the hypervisor.
> > > 
> > > This patch adds the infrastructure to pass through the accesses to the
> > > host registers. For convenience, a bunch of macros have been added to
> > > generate the different helpers.
> > > 
> > > Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > ---
> > >   xen/arch/arm/vcpreg.c        | 144
> > > +++++++++++++++++++++++++++++++++++++++++++
> > >   xen/include/asm-arm/cpregs.h |   1 +
> > >   2 files changed, 145 insertions(+)
> > > 
> > > diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> > > index b04d996fd3..49529b97cd 100644
> > > --- a/xen/arch/arm/vcpreg.c
> > > +++ b/xen/arch/arm/vcpreg.c
> > > @@ -24,6 +24,122 @@
> > >   #include <asm/traps.h>
> > >   #include <asm/vtimer.h>
> > >   +/*
> > > + * Macros to help generating helpers for registers trapped when
> > > + * HCR_EL2.TVM is set.
> > > + *
> > > + * Note that it only traps NS write access from EL1.
> > > + *
> > > + *  - TVM_REG() should not be used outside of the macros. It is there to
> > > + *    help defining TVM_REG32() and TVM_REG64()
> > > + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
> > > + *    resp. generate helper accessing 32-bit and 64-bit register.
> > > "regname"
> > > + *    been the Arm32 name and "xreg" the Arm64 name.
> >           ^ is
> > 
> > Please add that we use the Arm64 reg name to call WRITE_SYSREG in the
> > Xen source code even on Arm32 in general
> 
> I am not sure to understand this. It is common use in Xen to use arm64 name
> when code is for both architecture. So why would I need a specific comment
> here?

Yes, that's our convention, but as I was looking through the code, I
couldn't quickly find any place where we wrote the convention down. Is
there one? I thought it would be good to start somewhere, and this could
be as good a place as any, given that it directly affects this code.


> > > + *  - UPDATE_REG32_COMBINED(lowreg, hireg, xreg) are used to generate a
> > 
> > TVM_REG32_COMBINED
> > 
> > 
> > > + *  pair of registers share the same Arm32 registers. "lowreg" and
> > > + *  "higreg" been resp. the Arm32 name and "xreg" the Arm64 name.
> > > "lowreg"
> > > + *  will use xreg[31:0] and "hireg" will use xreg[63:32].
> > 
> > Please add that xreg is unused in the Arm32 case.
> 
> Why do you think that? xreg is actually used. It will get expanded to
> whatever the co-processor encoding is and caught by reg... in TVM_REG().

It is unused in the TVM_REG32_COMBINED case, which is the part of the
comment I was replying to. This is the code:

  #define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
      /* Use TVM_REG directly to workaround macro expansion. */       \
      TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
      TVM_REG(32, vreg_emulate_##hireg, hireg)

xreg is not used?
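
By contrast, in the non-combined variant xreg is what actually gets written.
Assuming an invocation such as TVM_REG32(SCTLR, SCTLR_EL1) (hypothetical, but
in the spirit of the patch), the macro would expand to roughly:

    static bool vreg_emulate_SCTLR(struct cpu_user_regs *regs, uint32_t *r,
                                   bool read)
    {
        struct vcpu *v = current;
        bool cache_enabled = vcpu_has_cache_enabled(v);

        GUEST_BUG_ON(read);
        WRITE_SYSREG32(*r, SCTLR_EL1); /* <- this is where xreg ends up */

        p2m_toggle_cache(v, cache_enabled);

        return true; /* tail of the macro, not visible in the quoted hunk */
    }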


* Re: [RFC 15/16] xen/arm: Implement Set/Way operations
  2018-11-06 17:34         ` Julien Grall
@ 2018-11-06 17:38           ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-06 17:38 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Tue, 6 Nov 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 06/11/2018 17:31, Stefano Stabellini wrote:
> > On Mon, 5 Nov 2018, Julien Grall wrote:
> > > > This will affect all the registers trapped with TVM. Shouldn't we only
> > > > call p2m_toggle_cache when relevant? i.e. when changing SCTLR?
> > > > I think it would be better to only modify the SCTLR emulation handler.
> > > 
> > > This is done on purpose: it increases the chance of disabling TVM as soon
> > > as possible. If you only rely on SCTLR, you may end up trapping a lot of
> > > registers for a long time.
> > > 
> > > FWIW, as I already wrote in the commit message, this is based on what KVM
> > > does.
> > 
> > I missed that. As you explain it, it makes sense. Maybe worth adding an
> > explicit statement about it: "On ARM64, we call p2m_toggle_cache from
> > the TVM-trapped register handlers to increase the chances of disabling
> > TVM as soon as possible."
> 
> I am assuming you meant arm32 here? 

Yes, I did


> Looking at the code, it looks like I
> implemented the arm64 side differently. But we probably want to call
> p2m_toggle_cache in all TVM-trapped register handlers.
> 
> This would keep the logic everywhere the same.

Right, good idea


* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-11-05 23:28     ` Julien Grall
@ 2018-11-06 17:43       ` Stefano Stabellini
  2018-11-06 18:10         ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-06 17:43 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel, Jan Beulich

On Mon, 5 Nov 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/5/18 9:35 PM, Stefano Stabellini wrote:
> > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > At the moment, the implementation of Set/Way operations will go through
> > > all the entries of the guest P2M and flush them. However, this is very
> > > expensive and may render unusable a guest OS using them.
> > > 
> > > For instance, Linux 32-bit will use Set/Way operations during secondary
> > > CPU bring-up. As the implementation is really expensive, it may be
> > > possible
> > > to hit the CPU bring-up timeout.
> > > 
> > > To limit the Set/Way impact, we track which pages of the guest have
> > > been accessed between batches of Set/Way operations. This is done
> > > using bit[0] (aka the valid bit) of the P2M entry.
> > 
> > This is going to improve performance of ill-mannered guests at the cost
> > of hurting performance of well-mannered guests. Is it really a good
> > trade-off? Should this behavior at least be configurable with a Xen
> > command line?
> 
> Well, we have the choice between not being able to boot Linux 32-bit anymore
> or having a slight impact on the boot time of all guests.

Wait -- I thought that with the set/way emulation introduced by patch
#15 we would be able to boot Linux 32-bit already. This patch is a
performance improvement. Or is it actually needed to boot Linux 32-bit?


> As you may have noticed, a command line option has been suggested below. I
> haven't implemented it yet, as we agreed at Connect it would be good to
> start getting feedback on the approach first.

Sure. I was thinking about this -- does it make sense to have different
defaults for 32bit and 64bit guests?

32bit -> default p2m_invalidate_root
64bit -> default not
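
A minimal way to wire such a default up would be a boolean_param() knob
checked in arch_domain_creation_finished() (option name hypothetical; the
per-bitness default could then key off is_32bit_domain()):

    /* Hypothetical Xen command line option: "p2m-sw-track=<bool>". */
    static bool __read_mostly opt_p2m_sw_track = true;
    boolean_param("p2m-sw-track", opt_p2m_sw_track);

    void arch_domain_creation_finished(struct domain *d)
    {
        /* Keep the IOMMU restriction from the patch; add the knob on top. */
        if ( opt_p2m_sw_track && !iommu_use_hap_pt(d) )
            p2m_invalidate_root(p2m_get_hostp2m(d));
    }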


> > 
> > > This patch adds a new per-arch helper to perform actions just before
> > > the guest is first unpaused. This will be used to invalidate the P2M
> > > in order to track accesses from the start of the guest.
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > 
> > > ---
> > > 
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > > Cc: Julien Grall <julien.grall@arm.com>
> > > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > > Cc: Jan Beulich <jbeulich@suse.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > Cc: Tim Deegan <tim@xen.org>
> > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > ---
> > >   xen/arch/arm/domain.c       | 14 ++++++++++++++
> > >   xen/arch/arm/domain_build.c |  7 +++++++
> > >   xen/arch/arm/p2m.c          | 32 +++++++++++++++++++++++++++++++-
> > >   xen/arch/x86/domain.c       |  4 ++++
> > >   xen/common/domain.c         |  5 ++++-
> > >   xen/include/asm-arm/p2m.h   |  2 ++
> > >   xen/include/xen/domain.h    |  2 ++
> > >   7 files changed, 64 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > > index feebbf5a92..f439f4657a 100644
> > > --- a/xen/arch/arm/domain.c
> > > +++ b/xen/arch/arm/domain.c
> > > @@ -738,6 +738,20 @@ int arch_domain_soft_reset(struct domain *d)
> > >       return -ENOSYS;
> > >   }
> > >   +void arch_domain_creation_finished(struct domain *d)
> > > +{
> > > +    /*
> > > +     * To avoid flushing the whole guest RAM on the first Set/Way, we
> > > +     * invalidate the P2M to track what has been accessed.
> > > +     *
> > > +     * This is only done when the IOMMU is not used or the page tables
> > > +     * are not shared, because clearing bit[0] (i.e. the valid bit)
> > > +     * would result in IOMMU faults that cannot be fixed up.
> > > +     */
> > > +    if ( !iommu_use_hap_pt(d) )
> > > +        p2m_invalidate_root(p2m_get_hostp2m(d));
> > > +}
> > > +
> > >   static int is_guest_pv32_psr(uint32_t psr)
> > >   {
> > >       switch (psr & PSR_MODE_MASK)
> > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > > index f552154e93..de96516faa 100644
> > > --- a/xen/arch/arm/domain_build.c
> > > +++ b/xen/arch/arm/domain_build.c
> > > @@ -2249,6 +2249,13 @@ int __init construct_dom0(struct domain *d)
> > >       v->is_initialised = 1;
> > >       clear_bit(_VPF_down, &v->pause_flags);
> > >   +    /*
> > > +     * XXX: We probably want a command line option to control whether
> > > +     * to invalidate the P2M. Invalidating the P2M will not work with
> > > +     * an IOMMU; however, if it is not done there will be an impact.
> > 
> > Why can't we check on iommu_use_hap_pt(d) like in
> > arch_domain_creation_finished?
> > 
> > In any case, I agree it is a good idea to introduce a command line
> > parameter to toggle the p2m_invalidate_root call at domain creation
> > on/off. There are cases where it should be off even if an IOMMU is
> > present.
> 
> I actually forgot to remove that code as Dom0 should be covered by the change
> below.

Makes sense now


> I do not entirely understand your last sentence; this feature is turned off
> when an IOMMU is present. So what is your use case here?
 
My sentence was badly written, sorry. I meant to say that even when an
IOMMU is NOT present, there are cases where we might not want to call
p2m_invalidate_root.


* Re: [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-11-06 17:36       ` Stefano Stabellini
@ 2018-11-06 17:52         ` Julien Grall
  2018-11-06 17:56           ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-06 17:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 06/11/2018 17:36, Stefano Stabellini wrote:
> On Mon, 5 Nov 2018, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 11/5/18 7:47 PM, Stefano Stabellini wrote:
>>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>> A follow-up patch will require emulating some accesses to co-processor
>>>> registers trapped by HCR_EL2.TVM. When set, all NS EL1 writes to the
>>>> virtual memory control registers will be trapped to the hypervisor.
>>>>
>>>> This patch adds the infrastructure to pass through the accesses to the
>>>> host registers. For convenience, a bunch of macros have been added to
>>>> generate the different helpers.
>>>>
>>>> Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
>>>>
>>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>> ---
>>>>    xen/arch/arm/vcpreg.c        | 144
>>>> +++++++++++++++++++++++++++++++++++++++++++
>>>>    xen/include/asm-arm/cpregs.h |   1 +
>>>>    2 files changed, 145 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
>>>> index b04d996fd3..49529b97cd 100644
>>>> --- a/xen/arch/arm/vcpreg.c
>>>> +++ b/xen/arch/arm/vcpreg.c
>>>> @@ -24,6 +24,122 @@
>>>>    #include <asm/traps.h>
>>>>    #include <asm/vtimer.h>
>>>>    +/*
>>>> + * Macros to help generating helpers for registers trapped when
>>>> + * HCR_EL2.TVM is set.
>>>> + *
>>>> + * Note that it only traps NS write access from EL1.
>>>> + *
>>>> + *  - TVM_REG() should not be used outside of the macros. It is there to
>>>> + *    help defining TVM_REG32() and TVM_REG64()
>>>> + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used to
>>>> + *    resp. generate helper accessing 32-bit and 64-bit register.
>>>> "regname"
>>>> + *    been the Arm32 name and "xreg" the Arm64 name.
>>>            ^ is
>>>
>>> Please add that we use the Arm64 reg name to call WRITE_SYSREG in the
>>> Xen source code even on Arm32 in general
>>
>> I am not sure to understand this. It is common use in Xen to use arm64 name
>> when code is for both architecture. So why would I need a specific comment
>> here?
> 
> Yes, that's our convention, but as I was looking through the code, I
> couldn't quickly find any place where we wrote the convention down. Is
> there one? I thought it would be good to start somewhere, and this could
> be as good a place as any, given that it directly affects this code.

include/asm-arm/cpregs.h:

/* Aliases of AArch64 names for use in common code when building for AArch32 */
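
A couple of representative entries from that block, to show the shape (quoted
from memory, so double-check the exact list):

    #ifdef CONFIG_ARM_32
    #define SCTLR_EL1 SCTLR
    #define VBAR_EL1  VBAR
    #endif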

> 
> 
>>>> + *  - UPDATE_REG32_COMBINED(lowreg, hireg, xreg) are used to generate a
>>>
>>> TVM_REG32_COMBINED
>>>
>>>
>>>> + *  pair of registers share the same Arm32 registers. "lowreg" and
>>>> + *  "higreg" been resp. the Arm32 name and "xreg" the Arm64 name.
>>>> "lowreg"
>>>> + *  will use xreg[31:0] and "hireg" will use xreg[63:32].
>>>
>>> Please add that xreg is unused in the Arm32 case.
>>
>> Why do you think that? xreg is actually used. It will get expanded to
>> whatever the co-processor encoding is and caught by reg... in TVM_REG().
> 
> It is unused in the TVM_REG32_COMBINED case, which is the part of the
> comment I was replying to. This is the code:
> 
>    #define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
>        /* Use TVM_REG directly to workaround macro expansion. */       \
>        TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
>        TVM_REG(32, vreg_emulate_##hireg, hireg)
> 
> xreg is not used?

Hrm, it is used in that case. I got confused. How about the following:

TVM_REG32_COMBINED(lowreg, hireg, xreg) is used to generate a pair of
registers sharing the same Arm64 register, but which are 2 distinct Arm32
registers. "lowreg" and "hireg" contain the names of the Arm32 registers,
"xreg" contains the name of the combined register on Arm64. The definitions
of "lowreg" and "hireg" match the Armv8 specification: "lowreg" is an alias
to xreg[31:0] and "hireg" is an alias to xreg[63:32].
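
A usage example, assuming the names follow the Armv8 layout where the Arm32
DFAR and IFAR map to FAR_EL1[31:0] and FAR_EL1[63:32] respectively:

    TVM_REG32_COMBINED(DFAR, IFAR, FAR_EL1)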

Cheers,

-- 
Julien Grall


* Re: [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-11-06 17:52         ` Julien Grall
@ 2018-11-06 17:56           ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-06 17:56 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel

On Tue, 6 Nov 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 06/11/2018 17:36, Stefano Stabellini wrote:
> > On Mon, 5 Nov 2018, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 11/5/18 7:47 PM, Stefano Stabellini wrote:
> > > > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > > > A follow-up patch will require emulating some accesses to
> > > > > co-processor registers trapped by HCR_EL2.TVM. When set, all NS EL1
> > > > > writes to the virtual memory control registers will be trapped to
> > > > > the hypervisor.
> > > > > 
> > > > > This patch adds the infrastructure to pass through the accesses to
> > > > > the host registers. For convenience, a bunch of macros have been
> > > > > added to generate the different helpers.
> > > > > 
> > > > > Note that HCR_EL2.TVM will be set in a follow-up patch dynamically.
> > > > > 
> > > > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > > > ---
> > > > >    xen/arch/arm/vcpreg.c        | 144
> > > > > +++++++++++++++++++++++++++++++++++++++++++
> > > > >    xen/include/asm-arm/cpregs.h |   1 +
> > > > >    2 files changed, 145 insertions(+)
> > > > > 
> > > > > diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> > > > > index b04d996fd3..49529b97cd 100644
> > > > > --- a/xen/arch/arm/vcpreg.c
> > > > > +++ b/xen/arch/arm/vcpreg.c
> > > > > @@ -24,6 +24,122 @@
> > > > >    #include <asm/traps.h>
> > > > >    #include <asm/vtimer.h>
> > > > >    +/*
> > > > > + * Macros to help generating helpers for registers trapped when
> > > > > + * HCR_EL2.TVM is set.
> > > > > + *
> > > > > + * Note that it only traps NS write access from EL1.
> > > > > + *
> > > > > + *  - TVM_REG() should not be used outside of the macros. It is there
> > > > > to
> > > > > + *    help defining TVM_REG32() and TVM_REG64()
> > > > > + *  - TVM_REG32(regname, xreg) and TVM_REG64(regname, xreg) are used
> > > > > to
> > > > > + *    resp. generate helper accessing 32-bit and 64-bit register.
> > > > > "regname"
> > > > > + *    been the Arm32 name and "xreg" the Arm64 name.
> > > >            ^ is
> > > > 
> > > > Please add that we use the Arm64 reg name to call WRITE_SYSREG in the
> > > > Xen source code even on Arm32 in general
> > > 
> > > I am not sure to understand this. It is common use in Xen to use arm64
> > > name
> > > when code is for both architecture. So why would I need a specific comment
> > > here?
> > 
> > Yes, that's our convention, but as I was looking through the code, I
> > couldn't quickly find any place where we wrote the convention down. Is
> > there one? I thought it would be good to start somewhere, and this could
> > be as good a place as any, given that it directly affects this code.
> 
> include/asm-arm/cpregs.h:
> 
> /* Aliases of AArch64 names for use in common code when building for AArch32
> */

Oops X-)
Maybe add a reference to it? Fine either way.


> > 
> > 
> > > > > + *  - UPDATE_REG32_COMBINED(lowreg, hireg, xreg) are used to generate
> > > > > a
> > > > 
> > > > TVM_REG32_COMBINED
> > > > 
> > > > 
> > > > > + *  pair of registers share the same Arm32 registers. "lowreg" and
> > > > > + *  "higreg" been resp. the Arm32 name and "xreg" the Arm64 name.
> > > > > "lowreg"
> > > > > + *  will use xreg[31:0] and "hireg" will use xreg[63:32].
> > > > 
> > > > Please add that xreg is unused in the Arm32 case.
> > > 
> > > Why do you think that? xreg is actually used. It will get expanded to
> > > whatever the co-processor encoding is and caught by reg... in TVM_REG().
> > 
> > It is unused in the TVM_REG32_COMBINED case, which is the part of the
> > comment I was replying to. This is the code:
> > 
> >    #define TVM_REG32_COMBINED(lowreg, hireg, xreg)                     \
> >        /* Use TVM_REG directly to workaround macro expansion. */       \
> >        TVM_REG(32, vreg_emulate_##lowreg, lowreg)                      \
> >        TVM_REG(32, vreg_emulate_##hireg, hireg)
> > 
> > xreg is not used?
> 
> Hrm, it is used in that case. I got confused. How about the following:
> 
> TVM_REG32_COMBINED(lowreg, hireg, xreg) is used to generate a pair of
> registers sharing the same Arm64 register, but which are 2 distinct Arm32
> registers. "lowreg" and "hireg" contain the names of the Arm32 registers,
> "xreg" contains the name of the combined register on Arm64. The definitions
> of "lowreg" and "hireg" match the Armv8 specification: "lowreg" is an alias
> to xreg[31:0] and "hireg" is an alias to xreg[63:32].

Sounds good


* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-11-06 17:43       ` Stefano Stabellini
@ 2018-11-06 18:10         ` Julien Grall
  2018-11-06 18:41           ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-06 18:10 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Tim Deegan, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, andre.przywara, xen-devel,
	Jan Beulich



On 06/11/2018 17:43, Stefano Stabellini wrote:
> On Mon, 5 Nov 2018, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 11/5/18 9:35 PM, Stefano Stabellini wrote:
>>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>> At the moment, the implementation of Set/Way operations will go through
>>>> all the entries of the guest P2M and flush them. However, this is very
>>>> expensive and may render unusable a guest OS using them.
>>>>
>>>> For instance, Linux 32-bit will use Set/Way operations during secondary
>>>> CPU bring-up. As the implementation is really expensive, it may be
>>>> possible
>>>> to hit the CPU bring-up timeout.
>>>>
>>>> To limit the Set/Way impact, we track which pages of the guest have
>>>> been accessed between batches of Set/Way operations. This is done
>>>> using bit[0] (aka the valid bit) of the P2M entry.
>>>
>>> This is going to improve performance of ill-mannered guests at the cost
>>> of hurting performance of well-mannered guests. Is it really a good
>>> trade-off? Should this behavior at least be configurable with a Xen
>>> command line?
>>
>> Well, we have the choice between not being able to boot Linux 32-bit anymore
>> or having a slight impact on the boot time of all guests.
> 
> Wait -- I thought that with the set/way emulation introduced by patch
> #15 we would be able to boot Linux 32-bit already. This patch is a
> performance improvement. Or is it actually needed to boot Linux 32-bit?

The problem is that Linux 32-bit calls set/way a few times during secondary CPU
bring-up. It also has a timeout of 1s to fully boot that CPU. In my testing, I
can easily hit the timeout even with a small amount of memory.

If we don't start tracking the pages from the beginning, then you would need to
clean the full RAM the first time. If you start tracking from the beginning,
you may just have to clean a couple of MB.
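
Concretely, the per-batch work then only has to clean pages whose entry was
seen valid, along the lines of the sketch below (assuming the p2m_get_entry()
extension from patch #14 that reports the valid bit, and a
flush_page_to_ram()-style primitive; preemption and error handling omitted):

    static void p2m_flush_tracked(struct p2m_domain *p2m, gfn_t start, gfn_t end)
    {
        gfn_t gfn = start;

        while ( gfn_x(gfn) < gfn_x(end) )
        {
            unsigned int order;
            bool valid;
            mfn_t mfn = p2m_get_entry(p2m, gfn, NULL, NULL, &order, &valid);

            /*
             * Entries not accessed since the last batch are still invalid:
             * nothing to clean for them.
             */
            if ( !mfn_eq(mfn, INVALID_MFN) && valid )
            {
                unsigned long i;

                for ( i = 0; i < (1UL << order); i++ )
                    flush_page_to_ram(mfn_x(mfn) + i);
            }

            gfn = gfn_add(gfn, 1UL << order);
        }
    }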

> 
> 
>> As you may have noticed, a command line option has been suggested below. I
>> haven't implemented it yet, as we agreed at Connect it would be good to
>> start getting feedback on the approach first.
> 
> Sure. I was thinking about this -- does it make sense to have different
> defaults for 32bit and 64bit guests?
> 
> 32bit -> default p2m_invalidate_root
> 64bit -> default not

The decision is not that easy. While Linux arm64 no longer contains set/way
operations, UEFI still uses them (I checked a couple of months ago).

I haven't done any benchmark yet. That's my next step.

>> I do not entirely understand your last sentence; this feature is turned off
>> when an IOMMU is present. So what is your use case here?
>   
> My sentence was badly written, sorry. I meant to say that even when an
> IOMMU is NOT present, there are cases where we might not want to call
> p2m_invalidate_root.

Before implementing a command line option (or even plumbing that through to the
guest configuration), I want to understand the real impact on boot time for a
"normal" guest.

Cheers,

-- 
Julien Grall


* Re: [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations
  2018-11-06 18:10         ` Julien Grall
@ 2018-11-06 18:41           ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-06 18:41 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andre.przywara,
	xen-devel, Jan Beulich

On Tue, 6 Nov 2018, Julien Grall wrote:
> On 06/11/2018 17:43, Stefano Stabellini wrote:
> > On Mon, 5 Nov 2018, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 11/5/18 9:35 PM, Stefano Stabellini wrote:
> > > > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > > > At the moment, the implementation of Set/Way operations will go
> > > > > through
> > > > > all the entries of the guest P2M and flush them. However, this is very
> > > > > expensive and may render unusable a guest OS using them.
> > > > > 
> > > > > For instance, Linux 32-bit will use Set/Way operations during
> > > > > secondary
> > > > > CPU bring-up. As the implementation is really expensive, it may be
> > > > > possible
> > > > > to hit the CPU bring-up timeout.
> > > > > 
> > > > > To limit the Set/Way impact, we track which pages of the guest have
> > > > > been accessed between batches of Set/Way operations. This is done
> > > > > using bit[0] (aka the valid bit) of the P2M entry.
> > > > 
> > > > This is going to improve performance of ill-mannered guests at the cost
> > > > of hurting performance of well-mannered guests. Is it really a good
> > > > trade-off? Should this behavior at least be configurable with a Xen
> > > > command line?
> > > 
> > > Well, we have the choice between not being able to boot Linux 32-bit
> > > anymore or
> > > having a slight impact on the boot time of all guests.
> > 
> > Wait -- I thought that with the set/way emulation introduced by patch
> > #15 we would be able to boot Linux 32-bit already. This patch is a
> > performance improvement. Or is it actually needed to boot Linux 32-bit?
> 
> The problem is that Linux 32-bit calls set/way a few times during secondary
> CPU bring-up. It also has a timeout of 1s to fully boot that CPU. In my
> testing, I can easily hit the timeout even with a small amount of memory.
 
Damn! It is worse than I thought.


> If we don't start tracking the pages from the beginning, then you would need
> to clean the full RAM the first time. If you start tracking from the
> beginning, you may just have to clean a couple of MB.
>
> > 
> > > As you may have noticed, a command line option has been suggested below.
> > > I haven't implemented it yet, as we agreed at Connect it would be good
> > > to start getting feedback on the approach first.
> > 
> > Sure. I was thinking about this -- does it make sense to have different
> > defaults for 32bit and 64bit guests?
> > 
> > 32bit -> default p2m_invalidate_root
> > 64bit -> default not
> 
> The decision is not that easy. While Linux arm64 no longer contains set/way
> operations, UEFI still uses them (I checked a couple of months ago).
> 
> I haven't done any benchmark yet. That's my next step.

OK, makes sense


> > > I do not entirely understand your last sentence; this feature is turned
> > > off when an IOMMU is present. So what is your use case here?
> >   My sentence was badly written, sorry. I meant to say that even when an
> > IOMMU is NOT present, there are cases where we might not want to call
> > p2m_invalidate_root.
> 
> Before implementing a command line option (or even plumbing that through to
> the guest configuration), I want to understand the real impact on boot time
> for a "normal" guest.

yes, makes sense


* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-11-06 14:20         ` Julien Grall
@ 2018-11-12 17:59           ` Julien Grall
  2018-11-12 23:36             ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-11-12 17:59 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 11/6/18 2:20 PM, Julien Grall wrote:
> On 05/11/2018 17:56, Stefano Stabellini wrote:
>> On Mon, 5 Nov 2018, Julien Grall wrote:
>>> On 02/11/2018 23:27, Stefano Stabellini wrote:
>>>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>>
>>>>> +    /*
>>>>> +     * Now that the work on the entry is done, set the valid bit to
>>>>> prevent
>>>>> +     * another fault on that entry.
>>>>> +     */
>>>>> +    resolved = true;
>>>>> +    entry.p2m.valid = 1;
>>>>> +
>>>>> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
>>>>> +
>>>>> +    /*
>>>>> +     * No need to flush the TLBs as the modified entry had the 
>>>>> valid bit
>>>>> +     * unset.
>>>>> +     */
>>>>> +
>>>>> +out_unmap:
>>>>> +    unmap_domain_page(table);
>>>>> +
>>>>> +out:
>>>>> +    p2m_write_unlock(p2m);
>>>>> +
>>>>> +    return resolved;
>>>>> +}
>>>>> +
>>>>>    static inline int p2m_insert_mapping(struct domain *d,
>>>>>                                         gfn_t start_gfn,
>>>>>                                         unsigned long nr,
>>
>>
>> We probably want to update the comment on top of the call to
>> p2m_resolve_translation_fault:
> 
> Whoops. I will fix it.

Looking at this again. I think the comment on top of the call to 
p2m_resolve_translation_fault still makes sense. Feel free to suggest an 
update of the comment if you think it is not enough.

Cheers,

-- 
Julien Grall


* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-11-12 17:59           ` Julien Grall
@ 2018-11-12 23:36             ` Stefano Stabellini
  2018-12-04 15:35               ` Julien Grall
  0 siblings, 1 reply; 62+ messages in thread
From: Stefano Stabellini @ 2018-11-12 23:36 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel


On Mon, 12 Nov 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/6/18 2:20 PM, Julien Grall wrote:
> > On 05/11/2018 17:56, Stefano Stabellini wrote:
> > > On Mon, 5 Nov 2018, Julien Grall wrote:
> > > > On 02/11/2018 23:27, Stefano Stabellini wrote:
> > > > > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > > > 
> > > > > > +    /*
> > > > > > +     * Now that the work on the entry is done, set the valid bit to
> > > > > > prevent
> > > > > > +     * another fault on that entry.
> > > > > > +     */
> > > > > > +    resolved = true;
> > > > > > +    entry.p2m.valid = 1;
> > > > > > +
> > > > > > +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
> > > > > > +
> > > > > > +    /*
> > > > > > +     * No need to flush the TLBs as the modified entry had the
> > > > > > valid bit
> > > > > > +     * unset.
> > > > > > +     */
> > > > > > +
> > > > > > +out_unmap:
> > > > > > +    unmap_domain_page(table);
> > > > > > +
> > > > > > +out:
> > > > > > +    p2m_write_unlock(p2m);
> > > > > > +
> > > > > > +    return resolved;
> > > > > > +}
> > > > > > +
> > > > > >    static inline int p2m_insert_mapping(struct domain *d,
> > > > > >                                         gfn_t start_gfn,
> > > > > >                                         unsigned long nr,
> > > 
> > > 
> > > We probably want to update the comment on top of the call to
> > > p2m_resolve_translation_fault:
> > 
> > Whoops. I will fix it.
> 
> Looking at this again. I think the comment on top of the call to
> p2m_resolve_translation_fault still makes sense. Feel free to suggest an
> update of the comment if you think it is not enough.

        /*
         * The PT walk may have failed because someone was playing with
         * the Stage-2 page table or because the valid bit was left
         * unset to track memory accesses. In these cases, we want to
         * return to the guest.
         */
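
For context, this comment would sit directly above the call quoted earlier in
the thread, in the translation-fault path of do_trap_stage2_abort_guest()
(reconstruction, surrounding fault handling elided):

        /*
         * The PT walk may have failed because someone was playing with
         * the Stage-2 page table or because the valid bit was left
         * unset to track memory accesses. In these cases, we want to
         * return to the guest.
         */
        if ( p2m_resolve_translation_fault(current->domain,
                                           gaddr_to_gfn(gpa)) )
            return;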


* Re: [RFC 00/16] xen/arm: Implement Set/Way operations
  2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
                   ` (15 preceding siblings ...)
  2018-10-08 18:33 ` [RFC 16/16] xen/arm: Track page accessed between batch of " Julien Grall
@ 2018-11-22 14:21 ` Julien Grall
  16 siblings, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-11-22 14:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Andre Przywara, sstabellini

Hi,

On 10/8/18 7:33 PM, Julien Grall wrote:
> Julien Grall (16):
>    xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry
>    xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid
>      entry
>    xen/arm: guest_walk_tables: Switch the return to bool
>    xen/arm: p2m: Introduce a helper to generate P2M table entry from a
>      page

I have merged the 4 patches above.

Cheers,

-- 
Julien Grall


* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-11-12 23:36             ` Stefano Stabellini
@ 2018-12-04 15:35               ` Julien Grall
  2018-12-04 19:13                 ` Stefano Stabellini
  0 siblings, 1 reply; 62+ messages in thread
From: Julien Grall @ 2018-12-04 15:35 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel

Hi Stefano,

On 11/12/18 11:36 PM, Stefano Stabellini wrote:
> On Mon, 12 Nov 2018, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 11/6/18 2:20 PM, Julien Grall wrote:
>>> On 05/11/2018 17:56, Stefano Stabellini wrote:
>>>> On Mon, 5 Nov 2018, Julien Grall wrote:
>>>>> On 02/11/2018 23:27, Stefano Stabellini wrote:
>>>>>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>>>>
>>>>>>> +    /*
>>>>>>> +     * Now that the work on the entry is done, set the valid bit to
>>>>>>> prevent
>>>>>>> +     * another fault on that entry.
>>>>>>> +     */
>>>>>>> +    resolved = true;
>>>>>>> +    entry.p2m.valid = 1;
>>>>>>> +
>>>>>>> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * No need to flush the TLBs as the modified entry had the
>>>>>>> valid bit
>>>>>>> +     * unset.
>>>>>>> +     */
>>>>>>> +
>>>>>>> +out_unmap:
>>>>>>> +    unmap_domain_page(table);
>>>>>>> +
>>>>>>> +out:
>>>>>>> +    p2m_write_unlock(p2m);
>>>>>>> +
>>>>>>> +    return resolved;
>>>>>>> +}
>>>>>>> +
>>>>>>>     static inline int p2m_insert_mapping(struct domain *d,
>>>>>>>                                          gfn_t start_gfn,
>>>>>>>                                          unsigned long nr,
>>>>
>>>>
>>>> We probably want to update the comment on top of the call to
>>>> p2m_resolve_translation_fault:
>>>
>>> Whoops. I will fix it.
>>
>> Looking at this again. I think the comment on top of the call to
>> p2m_resolve_translation_fault still makes sense. Feel free to suggest an
>> update of the comment if you think it is not enough.
> 
>          /*
>           * The PT walk may have failed because someone was playing with
>           * the Stage-2 page table or because the valid bit was left
>           * unset to track memory accesses. In these cases, we want to
>           * return to the guest.
>           */

Thank you for the suggestion. Thinking a bit more, I would not be
surprised if we decide to expand p2m_resolve_translation_fault in the
future. So I decided to go for a more generic comment to avoid it
becoming stale:

        /*
         * First check if the translation fault can be resolved by the
         * P2M subsystem. If that's the case, nothing else to do.
         */

Cheers,

-- 
Julien Grall


* Re: [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM
  2018-11-05 23:21     ` Julien Grall
  2018-11-06 17:36       ` Stefano Stabellini
@ 2018-12-04 16:24       ` Julien Grall
  1 sibling, 0 replies; 62+ messages in thread
From: Julien Grall @ 2018-12-04 16:24 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: andre.przywara, xen-devel



On 11/5/18 11:21 PM, Julien Grall wrote:
> On 11/5/18 7:47 PM, Stefano Stabellini wrote:
>> On Mon, 8 Oct 2018, Julien Grall wrote:
>>>       /*
>>> +     * HCR_EL2.TVM
>>> +     *
>>> +     * ARMv8 (DDI 0487B.b): Table D1-37
>>
>> In 0487D.a it is D1-99
> 
> I haven't had the chance to download the latest spec (it was released 
> last week). I will update to the new spec.

Actually, the table you point to does not correspond to D1-37 in
version B.b. The matching table there is D1-38.

Cheers,

-- 
Julien Grall


* Re: [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault
  2018-12-04 15:35               ` Julien Grall
@ 2018-12-04 19:13                 ` Stefano Stabellini
  0 siblings, 0 replies; 62+ messages in thread
From: Stefano Stabellini @ 2018-12-04 19:13 UTC (permalink / raw)
  To: Julien Grall; +Cc: Stefano Stabellini, andre.przywara, xen-devel


On Tue, 4 Dec 2018, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/12/18 11:36 PM, Stefano Stabellini wrote:
> > On Mon, 12 Nov 2018, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 11/6/18 2:20 PM, Julien Grall wrote:
> > > > On 05/11/2018 17:56, Stefano Stabellini wrote:
> > > > > On Mon, 5 Nov 2018, Julien Grall wrote:
> > > > > > On 02/11/2018 23:27, Stefano Stabellini wrote:
> > > > > > > On Mon, 8 Oct 2018, Julien Grall wrote:
> > > > > > > 
> > > > > > > > +    /*
> > > > > > > > +     * Now that the work on the entry is done, set the valid
> > > > > > > > bit to
> > > > > > > > prevent
> > > > > > > > +     * another fault on that entry.
> > > > > > > > +     */
> > > > > > > > +    resolved = true;
> > > > > > > > +    entry.p2m.valid = 1;
> > > > > > > > +
> > > > > > > > +    p2m_write_pte(table + offsets[level], entry,
> > > > > > > > p2m->clean_pte);
> > > > > > > > +
> > > > > > > > +    /*
> > > > > > > > +     * No need to flush the TLBs as the modified entry had the
> > > > > > > > valid bit
> > > > > > > > +     * unset.
> > > > > > > > +     */
> > > > > > > > +
> > > > > > > > +out_unmap:
> > > > > > > > +    unmap_domain_page(table);
> > > > > > > > +
> > > > > > > > +out:
> > > > > > > > +    p2m_write_unlock(p2m);
> > > > > > > > +
> > > > > > > > +    return resolved;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >     static inline int p2m_insert_mapping(struct domain *d,
> > > > > > > >                                          gfn_t start_gfn,
> > > > > > > >                                          unsigned long nr,
> > > > > 
> > > > > 
> > > > > We probably want to update the comment on top of the call to
> > > > > p2m_resolve_translation_fault:
> > > > 
> > > > Whoops. I will fix it.
> > > 
> > > Looking at this again. I think the comment on top of the call to
> > > p2m_resolve_translation_fault still makes sense. Feel free to suggest an
> > > update of the comment if you think it is not enough.
> > 
> >          /*
> >           * The PT walk may have failed because someone was playing with
> >           * the Stage-2 page table or because the valid bit was left
> >           * unset to track memory accesses. In these cases, we want to
> >           * return to the guest.
> >           */
> 
> Thank you for the suggestion. Thinking a bit more, I would not be surprised
> if we decide to expand p2m_resolve_translation_fault in the future. So I
> decided to go for a more generic comment to avoid it becoming stale:
> 
>        /*
>         * First check if the translation fault can be resolved by the
>         * P2M subsystem. If that's the case, nothing else to do.

OK


end of thread (newest message: 2018-12-04 19:13 UTC)

Thread overview: 62+ messages
2018-10-08 18:33 [RFC 00/16] xen/arm: Implement Set/Way operations Julien Grall
2018-10-08 18:33 ` [RFC 01/16] xen/arm: Introduce helpers to clear/flags flags in HCR_EL2 Julien Grall
2018-10-08 18:33 ` [RFC 02/16] xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry Julien Grall
2018-10-30  0:07   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 03/16] xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry Julien Grall
2018-10-30  0:10   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 04/16] xen/arm: guest_walk_tables: Switch the return to bool Julien Grall
2018-10-08 18:33 ` [RFC 05/16] xen/arm: traps: Move the implementation of GUEST_BUG_ON in traps.h Julien Grall
2018-10-30  0:11   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 06/16] xen/arm: p2m: Introduce a helper to generate P2M table entry from a page Julien Grall
2018-10-30  0:14   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 07/16] xen/arm: p2m: Introduce p2m_is_valid and use it Julien Grall
2018-10-30  0:21   ` Stefano Stabellini
2018-10-30 11:02     ` Julien Grall
2018-11-02 22:45       ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 08/16] xen/arm: p2m: Handle translation fault in get_page_from_gva Julien Grall
2018-10-30  0:47   ` Stefano Stabellini
2018-10-30 11:24     ` Julien Grall
2018-10-08 18:33 ` [RFC 09/16] xen/arm: p2m: Introduce a function to resolve translation fault Julien Grall
2018-11-02 23:27   ` Stefano Stabellini
2018-11-05 11:55     ` Julien Grall
2018-11-05 17:56       ` Stefano Stabellini
2018-11-06 14:20         ` Julien Grall
2018-11-12 17:59           ` Julien Grall
2018-11-12 23:36             ` Stefano Stabellini
2018-12-04 15:35               ` Julien Grall
2018-12-04 19:13                 ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 10/16] xen/arm: vcpreg: Add wrappers to handle co-proc access trapped by HCR_EL2.TVM Julien Grall
2018-11-05 19:47   ` Stefano Stabellini
2018-11-05 23:21     ` Julien Grall
2018-11-06 17:36       ` Stefano Stabellini
2018-11-06 17:52         ` Julien Grall
2018-11-06 17:56           ` Stefano Stabellini
2018-12-04 16:24       ` Julien Grall
2018-10-08 18:33 ` [RFC 11/16] xen/arm: vsysreg: Add wrapper to handle sysreg " Julien Grall
2018-11-05 20:42   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 12/16] xen/arm: Rework p2m_cache_flush to take a range [begin, end) Julien Grall
2018-11-02 23:38   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 13/16] xen/arm: p2m: Allow to flush cache on any RAM region Julien Grall
2018-11-02 23:40   ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 14/16] xen/arm: p2m: Extend p2m_get_entry to return the value of bit[0] (valid bit) Julien Grall
2018-11-02 23:44   ` Stefano Stabellini
2018-11-05 11:56     ` Julien Grall
2018-11-05 17:31       ` Stefano Stabellini
2018-11-05 17:32         ` Julien Grall
2018-10-08 18:33 ` [RFC 15/16] xen/arm: Implement Set/Way operations Julien Grall
2018-11-05 21:10   ` Stefano Stabellini
2018-11-05 23:38     ` Julien Grall
2018-11-06 17:31       ` Stefano Stabellini
2018-11-06 17:34         ` Julien Grall
2018-11-06 17:38           ` Stefano Stabellini
2018-10-08 18:33 ` [RFC 16/16] xen/arm: Track page accessed between batch of " Julien Grall
2018-10-09  7:04   ` Jan Beulich
2018-10-09 10:16     ` Julien Grall
2018-10-09 11:43       ` Jan Beulich
2018-10-09 12:24         ` Julien Grall
2018-11-05 21:35   ` Stefano Stabellini
2018-11-05 23:28     ` Julien Grall
2018-11-06 17:43       ` Stefano Stabellini
2018-11-06 18:10         ` Julien Grall
2018-11-06 18:41           ` Stefano Stabellini
2018-11-22 14:21 ` [RFC 00/16] xen/arm: Implement " Julien Grall
