xen-devel.lists.xenproject.org archive mirror
* [PATCH v10 00/13] switch to domheap for Xen page tables
@ 2021-04-21 14:15 Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
                   ` (13 more replies)
  0 siblings, 14 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Hongyan Xia <hongyxia@amazon.com>

This series rewrites all the remaining functions and finally makes the
switch from xenheap to domheap for Xen page tables, so that they no
longer need to rely on the direct map, which is a big step towards
removing the direct map.

---
Changed in v10:
- rebase.
- address comments in 01/13, which propagates a change into 02/13.

Changed in v9:
- drop first 2 patches which have been merged in XSA-345.
- adjust code around L3 page locking in mm.c.

Changed in v8:
- address comments in v7.
- rebase

Changed in v7:
- rebase and cleanup.
- address comments in v6.
- add alloc_map_clear_xen_pt() helper to simplify the patches in this
  series.

Changed in v6:
- drop the patches that have already been merged.
- rebase and cleanup.
- rewrite map_pages_to_xen() and modify_xen_mappings() in a way that
  does not require an end_of_loop goto label.

Hongyan Xia (2):
  x86/mm: drop old page table APIs
  x86: switch to use domheap page for page tables

Wei Liu (11):
  x86/mm: rewrite virt_to_xen_l*e
  x86/mm: switch to new APIs in map_pages_to_xen
  x86/mm: switch to new APIs in modify_xen_mappings
  x86_64/mm: introduce pl2e in paging_init
  x86_64/mm: switch to new APIs in paging_init
  x86_64/mm: switch to new APIs in setup_m2p_table
  efi: use new page table APIs in copy_mapping
  efi: switch to new APIs in EFI code
  x86/smpboot: add exit path for clone_mapping()
  x86/smpboot: switch clone_mapping() to new APIs
  x86/mm: drop _new suffix for page table APIs

 xen/arch/x86/efi/runtime.h |  13 +-
 xen/arch/x86/mm.c          | 247 ++++++++++++++++++++++---------------
 xen/arch/x86/setup.c       |   4 +-
 xen/arch/x86/smpboot.c     |  70 +++++++----
 xen/arch/x86/x86_64/mm.c   |  80 +++++++-----
 xen/common/efi/boot.c      |  83 ++++++++-----
 xen/common/efi/efi.h       |   3 +-
 xen/common/efi/runtime.c   |   8 +-
 xen/include/asm-x86/mm.h   |   7 +-
 xen/include/asm-x86/page.h |   5 -
 10 files changed, 314 insertions(+), 206 deletions(-)

-- 
2.23.4




* [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-22 11:54   ` Jan Beulich
  2021-04-21 14:15 ` [PATCH v10 02/13] x86/mm: switch to new APIs in map_pages_to_xen Hongyan Xia
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Rewrite the virt_to_xen_l*e() functions to use the new APIs. Modify
their callers to unmap the pointer returned. Since
alloc_xen_pagetable_new() is almost never useful unless accompanied by
page clearing and a mapping, introduce a helper alloc_mapped_pagetable()
for this sequence.
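
The allocate, map and clear sequence behind the helper can be illustrated
with a small standalone model. This is only a sketch and not Xen code:
the frame pool, the mapping counter and every toy_* name are assumptions
made up for the example, standing in for the domheap allocator and
map_domain_page().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE   4096u
#define TOY_NR_FRAMES   4u
#define TOY_INVALID_MFN UINT32_MAX

/* Hypothetical stand-ins for the frame pool and the mapping cache. */
static unsigned char toy_frames[TOY_NR_FRAMES][TOY_PAGE_SIZE];
static uint32_t toy_next_free;  /* next unallocated frame number */
static int toy_live_mappings;   /* mappings not yet unmapped */

/* Hand out a frame number, or TOY_INVALID_MFN once the pool is exhausted. */
static uint32_t toy_alloc_frame(void)
{
    return toy_next_free < TOY_NR_FRAMES ? toy_next_free++ : TOY_INVALID_MFN;
}

/* Turn a frame number into a usable pointer and count the mapping. */
static void *toy_map(uint32_t mfn)
{
    assert(mfn < TOY_NR_FRAMES);
    toy_live_mappings++;
    return toy_frames[mfn];
}

/* Drop a mapping; NULL is tolerated and ignored. */
static void toy_unmap(const void *p)
{
    if ( p )
        toy_live_mappings--;
}

/*
 * Allocate a frame, map it and zero it. The frame number is reported
 * through pmfn when the caller passes a non-NULL pointer. The caller
 * owns the returned mapping and has to unmap it.
 */
static void *toy_alloc_mapped(uint32_t *pmfn)
{
    uint32_t mfn = toy_alloc_frame();
    void *ret;

    if ( mfn == TOY_INVALID_MFN )
        return NULL;

    if ( pmfn )
        *pmfn = mfn;
    ret = toy_map(mfn);
    memset(ret, 0, TOY_PAGE_SIZE);

    return ret;
}
```

Returning the mapping while passing the frame number out separately
lets a caller drop the mapping early and still install or free the
frame by number later.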

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

---
Changed in v10:
- remove stale include.
- s/alloc_map_clear_xen_pt/alloc_mapped_pagetable/g.
- fix mis-hunks.

Changed in v9:
- use domain_page_map_to_mfn() around the L3 table locking logic.
- remove vmap_to_mfn() changes since we now use xen_map_to_mfn().

Changed in v8:
- s/virtual address/linear address/.
- BUG_ON() on NULL return in vmap_to_mfn().

Changed in v7:
- remove a comment.
- use l1e_get_mfn() instead of converting things back and forth.
- add alloc_map_clear_xen_pt().
- unmap before the next mapping to reduce mapcache pressure.
- use normal unmap calls instead of the macro in error paths because
  unmap can handle NULL now.
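
The difference between a plain NULL-tolerant unmap and an
unmap-and-clear macro can be sketched in isolation. This is a toy model
under stated assumptions, not the Xen implementation: toy_map(),
toy_unmap(), TOY_UNMAP_AND_CLEAR() and toy_walk() are hypothetical names
invented for the example.

```c
#include <assert.h>
#include <stddef.h>

static int toy_live_mappings;   /* outstanding mappings in the toy model */
static int toy_slot;            /* the only "page" this model can map */

static void *toy_map(void)
{
    toy_live_mappings++;
    return &toy_slot;
}

/* NULL-tolerant unmap: safe on pointers that were never mapped. */
static void toy_unmap(const void *p)
{
    if ( !p )
        return;
    assert(toy_live_mappings > 0);
    toy_live_mappings--;
}

/* Unmap and reset the pointer, for code that may look at it again. */
#define TOY_UNMAP_AND_CLEAR(p) do { toy_unmap(p); (p) = NULL; } while ( 0 )

/* Returns 0 on success and -1 on the simulated failure path. */
static int toy_walk(int fail)
{
    int *outer = NULL, *inner = NULL;
    int rc = -1;

    outer = toy_map();
    if ( fail )
        goto out;               /* inner is still NULL here */

    inner = toy_map();
    TOY_UNMAP_AND_CLEAR(inner); /* pointer is reused, so reset it */
    rc = 0;

 out:
    /* Nothing reads the pointers after this, so plain unmap suffices. */
    toy_unmap(inner);
    toy_unmap(outer);
    return rc;
}
```

Because the plain unmap ignores NULL, the exit path needs neither a
conditional nor the pointer reset that the macro performs.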
---
 xen/arch/x86/mm.c        | 102 +++++++++++++++++++++++++++------------
 xen/include/asm-x86/mm.h |   1 +
 2 files changed, 72 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b7a10bbdd401..5944ef19dc50 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4931,8 +4931,28 @@ void free_xen_pagetable_new(mfn_t mfn)
         free_xenheap_page(mfn_to_virt(mfn_x(mfn)));
 }
 
+void *alloc_mapped_pagetable(mfn_t *pmfn)
+{
+    mfn_t mfn = alloc_xen_pagetable_new();
+    void *ret;
+
+    if ( mfn_eq(mfn, INVALID_MFN) )
+        return NULL;
+
+    if ( pmfn )
+        *pmfn = mfn;
+    ret = map_domain_page(mfn);
+    clear_page(ret);
+
+    return ret;
+}
+
 static DEFINE_SPINLOCK(map_pgdir_lock);
 
+/*
+ * For virt_to_xen_lXe() functions, they take a linear address and return a
+ * pointer to Xen's LX entry. Caller needs to unmap the pointer.
+ */
 static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
 {
     l4_pgentry_t *pl4e;
@@ -4941,33 +4961,33 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
     if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
-        l3_pgentry_t *l3t = alloc_xen_pagetable();
+        mfn_t l3mfn;
+        l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn);
 
         if ( !l3t )
             return NULL;
-        clear_page(l3t);
+        UNMAP_DOMAIN_PAGE(l3t);
         if ( locking )
             spin_lock(&map_pgdir_lock);
         if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
         {
-            l4_pgentry_t l4e = l4e_from_paddr(__pa(l3t), __PAGE_HYPERVISOR);
+            l4_pgentry_t l4e = l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
 
             l4e_write(pl4e, l4e);
             efi_update_l4_pgtable(l4_table_offset(v), l4e);
-            l3t = NULL;
+            l3mfn = INVALID_MFN;
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        if ( l3t )
-            free_xen_pagetable(l3t);
+        free_xen_pagetable_new(l3mfn);
     }
 
-    return l4e_to_l3e(*pl4e) + l3_table_offset(v);
+    return map_l3t_from_l4e(*pl4e) + l3_table_offset(v);
 }
 
 static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
 {
-    l3_pgentry_t *pl3e;
+    l3_pgentry_t *pl3e, l3e;
 
     pl3e = virt_to_xen_l3e(v);
     if ( !pl3e )
@@ -4976,31 +4996,37 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
-        l2_pgentry_t *l2t = alloc_xen_pagetable();
+        mfn_t l2mfn;
+        l2_pgentry_t *l2t = alloc_mapped_pagetable(&l2mfn);
 
         if ( !l2t )
+        {
+            unmap_domain_page(pl3e);
             return NULL;
-        clear_page(l2t);
+        }
+        UNMAP_DOMAIN_PAGE(l2t);
         if ( locking )
             spin_lock(&map_pgdir_lock);
         if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
         {
-            l3e_write(pl3e, l3e_from_paddr(__pa(l2t), __PAGE_HYPERVISOR));
-            l2t = NULL;
+            l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+            l2mfn = INVALID_MFN;
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        if ( l2t )
-            free_xen_pagetable(l2t);
+        free_xen_pagetable_new(l2mfn);
     }
 
     BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
-    return l3e_to_l2e(*pl3e) + l2_table_offset(v);
+    l3e = *pl3e;
+    unmap_domain_page(pl3e);
+
+    return map_l2t_from_l3e(l3e) + l2_table_offset(v);
 }
 
 l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 {
-    l2_pgentry_t *pl2e;
+    l2_pgentry_t *pl2e, l2e;
 
     pl2e = virt_to_xen_l2e(v);
     if ( !pl2e )
@@ -5009,26 +5035,32 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
     if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
-        l1_pgentry_t *l1t = alloc_xen_pagetable();
+        mfn_t l1mfn;
+        l1_pgentry_t *l1t = alloc_mapped_pagetable(&l1mfn);
 
         if ( !l1t )
+        {
+            unmap_domain_page(pl2e);
             return NULL;
-        clear_page(l1t);
+        }
+        UNMAP_DOMAIN_PAGE(l1t);
         if ( locking )
             spin_lock(&map_pgdir_lock);
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
-            l2e_write(pl2e, l2e_from_paddr(__pa(l1t), __PAGE_HYPERVISOR));
-            l1t = NULL;
+            l2e_write(pl2e, l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR));
+            l1mfn = INVALID_MFN;
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        if ( l1t )
-            free_xen_pagetable(l1t);
+        free_xen_pagetable_new(l1mfn);
     }
 
     BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
-    return l2e_to_l1e(*pl2e) + l1_table_offset(v);
+    l2e = *pl2e;
+    unmap_domain_page(pl2e);
+
+    return map_l1t_from_l2e(l2e) + l1_table_offset(v);
 }
 
 /* Convert to from superpage-mapping flags for map_pages_to_xen(). */
@@ -5085,7 +5117,7 @@ mfn_t xen_map_to_mfn(unsigned long va)
 
     L3T_INIT(l3page);
     CHECK_MAPPED(pl3e);
-    l3page = virt_to_page(pl3e);
+    l3page = mfn_to_page(domain_page_map_to_mfn(pl3e));
     L3T_LOCK(l3page);
 
     CHECK_MAPPED(l3e_get_flags(*pl3e) & _PAGE_PRESENT);
@@ -5124,7 +5156,8 @@ int map_pages_to_xen(
     unsigned int flags)
 {
     bool locking = system_state > SYS_STATE_boot;
-    l2_pgentry_t *pl2e, ol2e;
+    l3_pgentry_t *pl3e = NULL, ol3e;
+    l2_pgentry_t *pl2e = NULL, ol2e;
     l1_pgentry_t *pl1e, ol1e;
     unsigned int  i;
     int rc = -ENOMEM;
@@ -5148,15 +5181,16 @@ int map_pages_to_xen(
 
     while ( nr_mfns != 0 )
     {
-        l3_pgentry_t *pl3e, ol3e;
-
+        /* Clean up the previous iteration. */
         L3T_UNLOCK(current_l3page);
+        UNMAP_DOMAIN_PAGE(pl3e);
+        UNMAP_DOMAIN_PAGE(pl2e);
 
         pl3e = virt_to_xen_l3e(virt);
         if ( !pl3e )
             goto out;
 
-        current_l3page = virt_to_page(pl3e);
+        current_l3page = mfn_to_page(domain_page_map_to_mfn(pl3e));
         L3T_LOCK(current_l3page);
         ol3e = *pl3e;
 
@@ -5321,6 +5355,8 @@ int map_pages_to_xen(
                 pl1e = virt_to_xen_l1e(virt);
                 if ( pl1e == NULL )
                     goto out;
+
+                UNMAP_DOMAIN_PAGE(pl1e);
             }
             else if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
             {
@@ -5498,6 +5534,8 @@ int map_pages_to_xen(
 
  out:
     L3T_UNLOCK(current_l3page);
+    unmap_domain_page(pl3e);
+    unmap_domain_page(pl2e);
     return rc;
 }
 
@@ -5521,6 +5559,7 @@ int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
 int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 {
     bool locking = system_state > SYS_STATE_boot;
+    l3_pgentry_t *pl3e = NULL;
     l2_pgentry_t *pl2e;
     l1_pgentry_t *pl1e;
     unsigned int  i;
@@ -5539,15 +5578,15 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
     while ( v < e )
     {
-        l3_pgentry_t *pl3e;
-
+        /* Clean up the previous iteration. */
         L3T_UNLOCK(current_l3page);
+        UNMAP_DOMAIN_PAGE(pl3e);
 
         pl3e = virt_to_xen_l3e(v);
         if ( !pl3e )
             goto out;
 
-        current_l3page = virt_to_page(pl3e);
+        current_l3page = mfn_to_page(domain_page_map_to_mfn(pl3e));
         L3T_LOCK(current_l3page);
 
         if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
@@ -5777,6 +5816,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
  out:
     L3T_UNLOCK(current_l3page);
+    unmap_domain_page(pl3e);
     return rc;
 }
 
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 041c158f03f6..111754675cbf 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -577,6 +577,7 @@ void *alloc_xen_pagetable(void);
 void free_xen_pagetable(void *v);
 mfn_t alloc_xen_pagetable_new(void);
 void free_xen_pagetable_new(mfn_t mfn);
+void *alloc_mapped_pagetable(mfn_t *pmfn);
 
 l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
 
-- 
2.23.4




* [PATCH v10 02/13] x86/mm: switch to new APIs in map_pages_to_xen
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-22 12:01   ` Jan Beulich
  2021-04-21 14:15 ` [PATCH v10 03/13] x86/mm: switch to new APIs in modify_xen_mappings Hongyan Xia
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Page tables allocated in map_pages_to_xen() should be mapped and
unmapped now.

Take the opportunity to avoid a potential double map in
map_pages_to_xen() by initialising pl1e to NULL and only mapping it if
it was not mapped earlier.
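
The double-map avoidance can be shown with a reduced model. This is a
sketch under assumptions, not the code in mm.c: the toy_* helpers and
the 'fresh' flag are hypothetical, with 'fresh' standing in for the
branch where an earlier call already hands back a mapped entry.

```c
#include <assert.h>
#include <stddef.h>

static int toy_table[4];
static int toy_map_calls;   /* how many times a mapping was established */
static int toy_live;        /* mappings not yet unmapped */

static int *toy_map_entry(unsigned int idx)
{
    assert(idx < 4);
    toy_map_calls++;
    toy_live++;
    return &toy_table[idx];
}

static void toy_unmap(const void *p)
{
    if ( p )
        toy_live--;
}

/*
 * Write val to entry idx. When 'fresh' is set, an earlier branch has
 * already produced a mapped pointer; otherwise nothing is mapped yet.
 */
static void toy_write_entry(unsigned int idx, int val, int fresh)
{
    int *pe = NULL;             /* start from "not mapped" */

    if ( fresh )
        pe = toy_map_entry(idx);

    if ( !pe )                  /* map only if nothing mapped it yet */
        pe = toy_map_entry(idx);

    *pe = val;
    toy_unmap(pe);
}
```

Each call ends up with exactly one mapping and one unmap, whichever
branch produced the pointer.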

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

---
Changed in v10:
- avoid a potential double map.
- drop Reviewed-by due to this change.
---
 xen/arch/x86/mm.c | 64 ++++++++++++++++++++++++++++-------------------
 1 file changed, 38 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5944ef19dc50..8a68da26f45f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5215,7 +5215,7 @@ int map_pages_to_xen(
                 }
                 else
                 {
-                    l2_pgentry_t *l2t = l3e_to_l2e(ol3e);
+                    l2_pgentry_t *l2t = map_l2t_from_l3e(ol3e);
 
                     for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                     {
@@ -5227,10 +5227,11 @@ int map_pages_to_xen(
                         else
                         {
                             unsigned int j;
-                            const l1_pgentry_t *l1t = l2e_to_l1e(ol2e);
+                            const l1_pgentry_t *l1t = map_l1t_from_l2e(ol2e);
 
                             for ( j = 0; j < L1_PAGETABLE_ENTRIES; j++ )
                                 flush_flags(l1e_get_flags(l1t[j]));
+                            unmap_domain_page(l1t);
                         }
                     }
                     flush_area(virt, flush_flags);
@@ -5239,9 +5240,10 @@ int map_pages_to_xen(
                         ol2e = l2t[i];
                         if ( (l2e_get_flags(ol2e) & _PAGE_PRESENT) &&
                              !(l2e_get_flags(ol2e) & _PAGE_PSE) )
-                            free_xen_pagetable(l2e_to_l1e(ol2e));
+                            free_xen_pagetable_new(l2e_get_mfn(ol2e));
                     }
-                    free_xen_pagetable(l2t);
+                    unmap_domain_page(l2t);
+                    free_xen_pagetable_new(l3e_get_mfn(ol3e));
                 }
             }
 
@@ -5258,6 +5260,7 @@ int map_pages_to_xen(
             unsigned int flush_flags =
                 FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
             l2_pgentry_t *l2t;
+            mfn_t l2mfn;
 
             /* Skip this PTE if there is no change. */
             if ( ((l3e_get_pfn(ol3e) & ~(L2_PAGETABLE_ENTRIES *
@@ -5279,15 +5282,17 @@ int map_pages_to_xen(
                 continue;
             }
 
-            l2t = alloc_xen_pagetable();
-            if ( l2t == NULL )
+            l2mfn = alloc_xen_pagetable_new();
+            if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
+            l2t = map_domain_page(l2mfn);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 l2e_write(l2t + i,
                           l2e_from_pfn(l3e_get_pfn(ol3e) +
                                        (i << PAGETABLE_ORDER),
                                        l3e_get_flags(ol3e)));
+            UNMAP_DOMAIN_PAGE(l2t);
 
             if ( l3e_get_flags(ol3e) & _PAGE_GLOBAL )
                 flush_flags |= FLUSH_TLB_GLOBAL;
@@ -5297,15 +5302,15 @@ int map_pages_to_xen(
             if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
                  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
             {
-                l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(l2t),
-                                                    __PAGE_HYPERVISOR));
-                l2t = NULL;
+                l3e_write_atomic(pl3e,
+                                 l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+                l2mfn = INVALID_MFN;
             }
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
             flush_area(virt, flush_flags);
-            if ( l2t )
-                free_xen_pagetable(l2t);
+
+            free_xen_pagetable_new(l2mfn);
         }
 
         pl2e = virt_to_xen_l2e(virt);
@@ -5333,12 +5338,13 @@ int map_pages_to_xen(
                 }
                 else
                 {
-                    l1_pgentry_t *l1t = l2e_to_l1e(ol2e);
+                    l1_pgentry_t *l1t = map_l1t_from_l2e(ol2e);
 
                     for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                         flush_flags(l1e_get_flags(l1t[i]));
                     flush_area(virt, flush_flags);
-                    free_xen_pagetable(l1t);
+                    unmap_domain_page(l1t);
+                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
                 }
             }
 
@@ -5349,20 +5355,20 @@ int map_pages_to_xen(
         }
         else
         {
+            pl1e = NULL;
             /* Normal page mapping. */
             if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
             {
                 pl1e = virt_to_xen_l1e(virt);
                 if ( pl1e == NULL )
                     goto out;
-
-                UNMAP_DOMAIN_PAGE(pl1e);
             }
             else if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
             {
                 unsigned int flush_flags =
                     FLUSH_TLB | FLUSH_ORDER(PAGETABLE_ORDER);
                 l1_pgentry_t *l1t;
+                mfn_t l1mfn;
 
                 /* Skip this PTE if there is no change. */
                 if ( (((l2e_get_pfn(*pl2e) & ~(L1_PAGETABLE_ENTRIES - 1)) +
@@ -5382,14 +5388,16 @@ int map_pages_to_xen(
                     goto check_l3;
                 }
 
-                l1t = alloc_xen_pagetable();
-                if ( l1t == NULL )
+                l1mfn = alloc_xen_pagetable_new();
+                if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
 
+                l1t = map_domain_page(l1mfn);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     l1e_write(&l1t[i],
                               l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
                                            lNf_to_l1f(l2e_get_flags(*pl2e))));
+                UNMAP_DOMAIN_PAGE(l1t);
 
                 if ( l2e_get_flags(*pl2e) & _PAGE_GLOBAL )
                     flush_flags |= FLUSH_TLB_GLOBAL;
@@ -5399,20 +5407,22 @@ int map_pages_to_xen(
                 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
                      (l2e_get_flags(*pl2e) & _PAGE_PSE) )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(l1t),
+                    l2e_write_atomic(pl2e, l2e_from_mfn(l1mfn,
                                                         __PAGE_HYPERVISOR));
-                    l1t = NULL;
+                    l1mfn = INVALID_MFN;
                 }
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(virt, flush_flags);
-                if ( l1t )
-                    free_xen_pagetable(l1t);
+
+                free_xen_pagetable_new(l1mfn);
             }
 
-            pl1e  = l2e_to_l1e(*pl2e) + l1_table_offset(virt);
+            if ( !pl1e )
+                pl1e  = map_l1t_from_l2e(*pl2e) + l1_table_offset(virt);
             ol1e  = *pl1e;
             l1e_write_atomic(pl1e, l1e_from_mfn(mfn, flags));
+            UNMAP_DOMAIN_PAGE(pl1e);
             if ( (l1e_get_flags(ol1e) & _PAGE_PRESENT) )
             {
                 unsigned int flush_flags = FLUSH_TLB | FLUSH_ORDER(0);
@@ -5456,12 +5466,13 @@ int map_pages_to_xen(
                     goto check_l3;
                 }
 
-                l1t = l2e_to_l1e(ol2e);
+                l1t = map_l1t_from_l2e(ol2e);
                 base_mfn = l1e_get_pfn(l1t[0]) & ~(L1_PAGETABLE_ENTRIES - 1);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     if ( (l1e_get_pfn(l1t[i]) != (base_mfn + i)) ||
                          (l1e_get_flags(l1t[i]) != flags) )
                         break;
+                UNMAP_DOMAIN_PAGE(l1t);
                 if ( i == L1_PAGETABLE_ENTRIES )
                 {
                     l2e_write_atomic(pl2e, l2e_from_pfn(base_mfn,
@@ -5471,7 +5482,7 @@ int map_pages_to_xen(
                     flush_area(virt - PAGE_SIZE,
                                FLUSH_TLB_GLOBAL |
                                FLUSH_ORDER(PAGETABLE_ORDER));
-                    free_xen_pagetable(l2e_to_l1e(ol2e));
+                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
                 }
                 else if ( locking )
                     spin_unlock(&map_pgdir_lock);
@@ -5504,7 +5515,7 @@ int map_pages_to_xen(
                 continue;
             }
 
-            l2t = l3e_to_l2e(ol3e);
+            l2t = map_l2t_from_l3e(ol3e);
             base_mfn = l2e_get_pfn(l2t[0]) & ~(L2_PAGETABLE_ENTRIES *
                                               L1_PAGETABLE_ENTRIES - 1);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
@@ -5512,6 +5523,7 @@ int map_pages_to_xen(
                       (base_mfn + (i << PAGETABLE_ORDER))) ||
                      (l2e_get_flags(l2t[i]) != l1f_to_lNf(flags)) )
                     break;
+            UNMAP_DOMAIN_PAGE(l2t);
             if ( i == L2_PAGETABLE_ENTRIES )
             {
                 l3e_write_atomic(pl3e, l3e_from_pfn(base_mfn,
@@ -5521,7 +5533,7 @@ int map_pages_to_xen(
                 flush_area(virt - PAGE_SIZE,
                            FLUSH_TLB_GLOBAL |
                            FLUSH_ORDER(2*PAGETABLE_ORDER));
-                free_xen_pagetable(l3e_to_l2e(ol3e));
+                free_xen_pagetable_new(l3e_get_mfn(ol3e));
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
-- 
2.23.4




* [PATCH v10 03/13] x86/mm: switch to new APIs in modify_xen_mappings
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 02/13] x86/mm: switch to new APIs in map_pages_to_xen Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-22 13:10   ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 04/13] x86_64/mm: introduce pl2e in paging_init Hongyan Xia
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Page tables allocated in modify_xen_mappings() should be mapped and
unmapped now.

Note that pl2e may now be mapped and unmapped in different iterations,
so we need to add clean-ups for that.
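
The per-iteration clean-up can be sketched with a toy loop. This is an
illustration under assumptions, not the loop in modify_xen_mappings():
the toy_* names, the two pointers and the failure index are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int toy_slots[8];
static int toy_live;    /* current outstanding mappings */
static int toy_peak;    /* highest number of simultaneous mappings seen */

static int *toy_map(unsigned int i)
{
    assert(i < 8);
    if ( ++toy_live > toy_peak )
        toy_peak = toy_live;
    return &toy_slots[i];
}

static void toy_unmap(const void *p)
{
    if ( p )
        toy_live--;
}

#define TOY_UNMAP_AND_CLEAR(p) do { toy_unmap(p); (p) = NULL; } while ( 0 )

/* Visit n (at most 8) slots; fail with -1 at index fail_at, if reached. */
static int toy_walk(unsigned int n, unsigned int fail_at)
{
    int *upper = NULL, *lower = NULL;
    unsigned int i;
    int rc = -1;

    for ( i = 0; i < n; i++ )
    {
        /* Clean up whatever the previous iteration left mapped. */
        TOY_UNMAP_AND_CLEAR(lower);
        TOY_UNMAP_AND_CLEAR(upper);

        upper = toy_map(i);
        if ( i == fail_at )
            goto out;
        if ( i % 2 )
            continue;           /* lower stays NULL in this round */

        lower = toy_map(i);
        *lower = 1;
    }
    rc = 0;

 out:
    /* Both pointers are either NULL or validly mapped at this point. */
    toy_unmap(lower);
    toy_unmap(upper);
    return rc;
}
```

Cleaning up at the top of the loop covers every `continue` in the body
with a single pair of statements, and the exit label handles whatever
the last iteration left mapped.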

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v7:
- use normal unmap in the error path.
---
 xen/arch/x86/mm.c | 57 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 8a68da26f45f..832e654294b4 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5546,6 +5546,7 @@ int map_pages_to_xen(
 
  out:
     L3T_UNLOCK(current_l3page);
+    unmap_domain_page(pl2e);
     unmap_domain_page(pl3e);
     unmap_domain_page(pl2e);
     return rc;
@@ -5572,7 +5573,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 {
     bool locking = system_state > SYS_STATE_boot;
     l3_pgentry_t *pl3e = NULL;
-    l2_pgentry_t *pl2e;
+    l2_pgentry_t *pl2e = NULL;
     l1_pgentry_t *pl1e;
     unsigned int  i;
     unsigned long v = s;
@@ -5592,6 +5593,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     {
         /* Clean up the previous iteration. */
         L3T_UNLOCK(current_l3page);
+        UNMAP_DOMAIN_PAGE(pl2e);
         UNMAP_DOMAIN_PAGE(pl3e);
 
         pl3e = virt_to_xen_l3e(v);
@@ -5614,6 +5616,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
         if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
         {
             l2_pgentry_t *l2t;
+            mfn_t l2mfn;
 
             if ( l2_table_offset(v) == 0 &&
                  l1_table_offset(v) == 0 &&
@@ -5630,35 +5633,38 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             }
 
             /* PAGE1GB: shatter the superpage and fall through. */
-            l2t = alloc_xen_pagetable();
-            if ( !l2t )
+            l2mfn = alloc_xen_pagetable_new();
+            if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
+            l2t = map_domain_page(l2mfn);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 l2e_write(l2t + i,
                           l2e_from_pfn(l3e_get_pfn(*pl3e) +
                                        (i << PAGETABLE_ORDER),
                                        l3e_get_flags(*pl3e)));
+            UNMAP_DOMAIN_PAGE(l2t);
+
             if ( locking )
                 spin_lock(&map_pgdir_lock);
             if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
                  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
             {
-                l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(l2t),
-                                                    __PAGE_HYPERVISOR));
-                l2t = NULL;
+                l3e_write_atomic(pl3e,
+                                 l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+                l2mfn = INVALID_MFN;
             }
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
-            if ( l2t )
-                free_xen_pagetable(l2t);
+
+            free_xen_pagetable_new(l2mfn);
         }
 
         /*
          * The L3 entry has been verified to be present, and we've dealt with
          * 1G pages as well, so the L2 table cannot require allocation.
          */
-        pl2e = l3e_to_l2e(*pl3e) + l2_table_offset(v);
+        pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(v);
 
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
@@ -5686,41 +5692,45 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             else
             {
                 l1_pgentry_t *l1t;
-
                 /* PSE: shatter the superpage and try again. */
-                l1t = alloc_xen_pagetable();
-                if ( !l1t )
+                mfn_t l1mfn = alloc_xen_pagetable_new();
+
+                if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
 
+                l1t = map_domain_page(l1mfn);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     l1e_write(&l1t[i],
                               l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
                                            l2e_get_flags(*pl2e) & ~_PAGE_PSE));
+                UNMAP_DOMAIN_PAGE(l1t);
+
                 if ( locking )
                     spin_lock(&map_pgdir_lock);
                 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
                      (l2e_get_flags(*pl2e) & _PAGE_PSE) )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(l1t),
+                    l2e_write_atomic(pl2e, l2e_from_mfn(l1mfn,
                                                         __PAGE_HYPERVISOR));
-                    l1t = NULL;
+                    l1mfn = INVALID_MFN;
                 }
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                if ( l1t )
-                    free_xen_pagetable(l1t);
+
+                free_xen_pagetable_new(l1mfn);
             }
         }
         else
         {
             l1_pgentry_t nl1e, *l1t;
+            mfn_t l1mfn;
 
             /*
              * Ordinary 4kB mapping: The L2 entry has been verified to be
              * present, and we've dealt with 2M pages as well, so the L1 table
              * cannot require allocation.
              */
-            pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(v);
+            pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(v);
 
             /* Confirm the caller isn't trying to create new mappings. */
             if ( !(l1e_get_flags(*pl1e) & _PAGE_PRESENT) )
@@ -5731,6 +5741,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                                (l1e_get_flags(*pl1e) & ~FLAGS_MASK) | nf);
 
             l1e_write_atomic(pl1e, nl1e);
+            UNMAP_DOMAIN_PAGE(pl1e);
             v += PAGE_SIZE;
 
             /*
@@ -5760,10 +5771,12 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 continue;
             }
 
-            l1t = l2e_to_l1e(*pl2e);
+            l1mfn = l2e_get_mfn(*pl2e);
+            l1t = map_domain_page(l1mfn);
             for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                 if ( l1e_get_intpte(l1t[i]) != 0 )
                     break;
+            UNMAP_DOMAIN_PAGE(l1t);
             if ( i == L1_PAGETABLE_ENTRIES )
             {
                 /* Empty: zap the L2E and free the L1 page. */
@@ -5771,7 +5784,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable(l1t);
+                free_xen_pagetable_new(l1mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5802,11 +5815,13 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
         {
             l2_pgentry_t *l2t;
+            mfn_t l2mfn = l3e_get_mfn(*pl3e);
 
-            l2t = l3e_to_l2e(*pl3e);
+            l2t = map_domain_page(l2mfn);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 if ( l2e_get_intpte(l2t[i]) != 0 )
                     break;
+            UNMAP_DOMAIN_PAGE(l2t);
             if ( i == L2_PAGETABLE_ENTRIES )
             {
                 /* Empty: zap the L3E and free the L2 page. */
@@ -5814,7 +5829,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable(l2t);
+                free_xen_pagetable_new(l2mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
-- 
2.23.4




* [PATCH v10 04/13] x86_64/mm: introduce pl2e in paging_init
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (2 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 03/13] x86/mm: switch to new APIs in modify_xen_mappings Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 05/13] x86_64/mm: switch to new APIs " Hongyan Xia
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

We will soon map and unmap pages in paging_init(). Introduce pl2e so
that we can use l2_ro_mpt to point to the page table itself.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v7:
- reword commit message.
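
Illustration only, not part of the patch: a minimal standalone sketch of
the base-pointer/cursor split described above. Here `base` plays the role
of l2_ro_mpt and `cursor` the role of pl2e. All sketch_* and SKETCH_* names
are hypothetical, and aligned_alloc() merely stands in for page table
allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

typedef uint64_t sketch_entry_t;

/*
 * Write n entries (values 1..n), starting a new page-sized table whenever
 * the cursor is NULL or has run onto a page boundary.  Each table's base is
 * recorded in tables[].  Returns the number of tables used, or -1 on
 * failure.  The caller owns (and eventually frees) every recorded table.
 */
static int sketch_fill(size_t n, sketch_entry_t **tables, size_t max_tables)
{
    sketch_entry_t *base = NULL;   /* role of l2_ro_mpt: start of the table */
    sketch_entry_t *cursor = NULL; /* role of pl2e: next entry to write */
    size_t used = 0;

    assert(tables);

    for ( size_t i = 0; i < n; i++ )
    {
        /* Same shape as the alignment test in the hunks below. */
        if ( !((uintptr_t)cursor & ~SKETCH_PAGE_MASK) )
        {
            if ( used == max_tables )
                return -1;
            base = aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
            if ( !base )
                return -1;
            memset(base, 0, SKETCH_PAGE_SIZE);
            tables[used++] = base;
            cursor = base;
        }
        *cursor++ = i + 1;
    }

    return (int)used;
}
```

Because only the cursor advances, the base pointer stays valid as a handle
on the whole table, which is presumably what a later unmap needs.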
---
 xen/arch/x86/x86_64/mm.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index d7e67311fa69..59049bdf8e83 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -496,7 +496,7 @@ void __init paging_init(void)
     unsigned long i, mpt_size, va;
     unsigned int n, memflags;
     l3_pgentry_t *l3_ro_mpt;
-    l2_pgentry_t *l2_ro_mpt = NULL;
+    l2_pgentry_t *pl2e = NULL, *l2_ro_mpt = NULL;
     struct page_info *l1_pg;
 
     /*
@@ -546,7 +546,7 @@ void __init paging_init(void)
             (L2_PAGETABLE_SHIFT - 3 + PAGE_SHIFT)));
 
         if ( cpu_has_page1gb &&
-             !((unsigned long)l2_ro_mpt & ~PAGE_MASK) &&
+             !((unsigned long)pl2e & ~PAGE_MASK) &&
              (mpt_size >> L3_PAGETABLE_SHIFT) > (i >> PAGETABLE_ORDER) )
         {
             unsigned int k, holes;
@@ -606,7 +606,7 @@ void __init paging_init(void)
             memset((void *)(RDWR_MPT_VIRT_START + (i << L2_PAGETABLE_SHIFT)),
                    0xFF, 1UL << L2_PAGETABLE_SHIFT);
         }
-        if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
+        if ( !((unsigned long)pl2e & ~PAGE_MASK) )
         {
             if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
                 goto nomem;
@@ -614,13 +614,14 @@ void __init paging_init(void)
             l3e_write(&l3_ro_mpt[l3_table_offset(va)],
                       l3e_from_paddr(__pa(l2_ro_mpt),
                                      __PAGE_HYPERVISOR_RO | _PAGE_USER));
+            pl2e = l2_ro_mpt;
             ASSERT(!l2_table_offset(va));
         }
         /* NB. Cannot be GLOBAL: guest user mode should not see it. */
         if ( l1_pg )
-            l2e_write(l2_ro_mpt, l2e_from_page(
+            l2e_write(pl2e, l2e_from_page(
                 l1_pg, /*_PAGE_GLOBAL|*/_PAGE_PSE|_PAGE_USER|_PAGE_PRESENT));
-        l2_ro_mpt++;
+        pl2e++;
     }
 #undef CNT
 #undef MFN
@@ -632,6 +633,7 @@ void __init paging_init(void)
             goto nomem;
         compat_idle_pg_table_l2 = l2_ro_mpt;
         clear_page(l2_ro_mpt);
+        pl2e = l2_ro_mpt;
 
         /* Allocate and map the compatibility mode machine-to-phys table. */
         mpt_size = (mpt_size >> 1) + (1UL << (L2_PAGETABLE_SHIFT - 1));
@@ -649,7 +651,7 @@ void __init paging_init(void)
              sizeof(*compat_machine_to_phys_mapping))
     BUILD_BUG_ON((sizeof(*frame_table) & ~sizeof(*frame_table)) % \
                  sizeof(*compat_machine_to_phys_mapping));
-    for ( i = 0; i < (mpt_size >> L2_PAGETABLE_SHIFT); i++, l2_ro_mpt++ )
+    for ( i = 0; i < (mpt_size >> L2_PAGETABLE_SHIFT); i++, pl2e++ )
     {
         memflags = MEMF_node(phys_to_nid(i <<
             (L2_PAGETABLE_SHIFT - 2 + PAGE_SHIFT)));
@@ -671,7 +673,7 @@ void __init paging_init(void)
                         (i << L2_PAGETABLE_SHIFT)),
                0xFF, 1UL << L2_PAGETABLE_SHIFT);
         /* NB. Cannot be GLOBAL as the ptes get copied into per-VM space. */
-        l2e_write(l2_ro_mpt, l2e_from_page(l1_pg, _PAGE_PSE|_PAGE_PRESENT));
+        l2e_write(pl2e, l2e_from_page(l1_pg, _PAGE_PSE|_PAGE_PRESENT));
     }
 #undef CNT
 #undef MFN
-- 
2.23.4




* [PATCH v10 05/13] x86_64/mm: switch to new APIs in paging_init
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (3 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 04/13] x86_64/mm: introduce pl2e in paging_init Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 06/13] x86_64/mm: switch to new APIs in setup_m2p_table Hongyan Xia
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Map and unmap pages instead of relying on the direct map.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v9:
- remove an unnecessary l3mfn variable.

Changed in v8:
- replace l3/2_ro_mpt_mfn with just mfn since their lifetimes do not
  overlap

Changed in v7:
- use the new alloc_map_clear_xen_pt() helper.
- move the unmap of pl3t up a bit.
- remove the unmaps in the nomem path.
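
Illustration only, not part of the patch: a rough standalone model of what
an "allocate, map and clear" helper does, judging from how
alloc_mapped_pagetable() is used in the hunks below (it returns a mapped
pointer, reports the frame through its argument, and the clear_page() calls
disappear). The frame pool and all sketch_* names are hypothetical, and the
"mapping" is just an array lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_FRAMES 2
#define SKETCH_ENTRIES   512
#define SKETCH_INVALID   UINT32_MAX

static uint64_t sketch_pool[SKETCH_NR_FRAMES][SKETCH_ENTRIES];
static uint32_t sketch_next_free;

/* Hand out the next free frame number, or SKETCH_INVALID when exhausted. */
static uint32_t sketch_alloc_frame(void)
{
    return sketch_next_free < SKETCH_NR_FRAMES ? sketch_next_free++
                                               : SKETCH_INVALID;
}

/* Stand-in for mapping a frame: returns a usable pointer for it. */
static uint64_t *sketch_map(uint32_t frame)
{
    assert(frame < SKETCH_NR_FRAMES);
    return sketch_pool[frame];
}

/*
 * Allocate a frame, map it and clear it.  The frame number is reported
 * through *pframe so the caller can build a table entry from it without
 * translating the pointer back into an address.
 */
static uint64_t *sketch_alloc_mapped(uint32_t *pframe)
{
    uint32_t frame = sketch_alloc_frame();
    uint64_t *table;

    if ( frame == SKETCH_INVALID )
        return NULL;

    table = sketch_map(frame);
    memset(table, 0, sizeof(sketch_pool[0]));
    *pframe = frame;

    return table;
}
```

On failure the sketch leaves *pframe untouched, so the caller only has to
test the returned pointer.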
---
 xen/arch/x86/x86_64/mm.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 59049bdf8e83..3e40d529bbf3 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -498,6 +498,7 @@ void __init paging_init(void)
     l3_pgentry_t *l3_ro_mpt;
     l2_pgentry_t *pl2e = NULL, *l2_ro_mpt = NULL;
     struct page_info *l1_pg;
+    mfn_t mfn;
 
     /*
      * We setup the L3s for 1:1 mapping if host support memory hotplug
@@ -510,22 +511,22 @@ void __init paging_init(void)
         if ( !(l4e_get_flags(idle_pg_table[l4_table_offset(va)]) &
               _PAGE_PRESENT) )
         {
-            l3_pgentry_t *pl3t = alloc_xen_pagetable();
+            l3_pgentry_t *pl3t = alloc_mapped_pagetable(&mfn);
 
             if ( !pl3t )
                 goto nomem;
-            clear_page(pl3t);
+            UNMAP_DOMAIN_PAGE(pl3t);
             l4e_write(&idle_pg_table[l4_table_offset(va)],
-                      l4e_from_paddr(__pa(pl3t), __PAGE_HYPERVISOR_RW));
+                      l4e_from_mfn(mfn, __PAGE_HYPERVISOR_RW));
         }
     }
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    if ( (l3_ro_mpt = alloc_xen_pagetable()) == NULL )
+    l3_ro_mpt = alloc_mapped_pagetable(&mfn);
+    if ( !l3_ro_mpt )
         goto nomem;
-    clear_page(l3_ro_mpt);
     l4e_write(&idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)],
-              l4e_from_paddr(__pa(l3_ro_mpt), __PAGE_HYPERVISOR_RO | _PAGE_USER));
+              l4e_from_mfn(mfn, __PAGE_HYPERVISOR_RO | _PAGE_USER));
 
     /*
      * Allocate and map the machine-to-phys table.
@@ -608,12 +609,14 @@ void __init paging_init(void)
         }
         if ( !((unsigned long)pl2e & ~PAGE_MASK) )
         {
-            if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
+            UNMAP_DOMAIN_PAGE(l2_ro_mpt);
+
+            l2_ro_mpt = alloc_mapped_pagetable(&mfn);
+            if ( !l2_ro_mpt )
                 goto nomem;
-            clear_page(l2_ro_mpt);
+
             l3e_write(&l3_ro_mpt[l3_table_offset(va)],
-                      l3e_from_paddr(__pa(l2_ro_mpt),
-                                     __PAGE_HYPERVISOR_RO | _PAGE_USER));
+                      l3e_from_mfn(mfn, __PAGE_HYPERVISOR_RO | _PAGE_USER));
             pl2e = l2_ro_mpt;
             ASSERT(!l2_table_offset(va));
         }
@@ -625,15 +628,18 @@ void __init paging_init(void)
     }
 #undef CNT
 #undef MFN
+    UNMAP_DOMAIN_PAGE(l2_ro_mpt);
+    UNMAP_DOMAIN_PAGE(l3_ro_mpt);
 
     /* Create user-accessible L2 directory to map the MPT for compat guests. */
     if ( opt_pv32 )
     {
-        if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
+        mfn = alloc_xen_pagetable_new();
+        if ( mfn_eq(mfn, INVALID_MFN) )
             goto nomem;
-        compat_idle_pg_table_l2 = l2_ro_mpt;
-        clear_page(l2_ro_mpt);
-        pl2e = l2_ro_mpt;
+        compat_idle_pg_table_l2 = map_domain_page_global(mfn);
+        clear_page(compat_idle_pg_table_l2);
+        pl2e = compat_idle_pg_table_l2;
 
         /* Allocate and map the compatibility mode machine-to-phys table. */
         mpt_size = (mpt_size >> 1) + (1UL << (L2_PAGETABLE_SHIFT - 1));
-- 
2.23.4




* [PATCH v10 06/13] x86_64/mm: switch to new APIs in setup_m2p_table
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (4 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 05/13] x86_64/mm: switch to new APIs " Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 07/13] efi: use new page table APIs in copy_mapping Hongyan Xia
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Switch setup_m2p_table() to the new page table APIs. While doing so,
avoid repeatedly mapping l2_ro_mpt by keeping it mapped across loop
iterations, and only unmap and remap it when crossing 1G boundaries.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v8:
- re-structure if condition around l2_ro_mpt.
- reword the commit message.

Changed in v7:
- avoid repetitive mapping of l2_ro_mpt.
- edit commit message.
- switch to alloc_map_clear_xen_pt().
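
Illustration only, not part of the patch: a standalone sketch of the
caching idea in the commit message, i.e. keep one L2 table mapped while
walking in 2M steps and drop it only when the address enters a new 1G
region. The boundary test is written from the commit message's wording
rather than copied from the hunk, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_L2_SHIFT 21 /* one L2 entry covers 2 MiB */
#define SKETCH_L3_SHIFT 30 /* one L2 table covers 1 GiB */

static unsigned int sketch_map_calls;
static uint64_t sketch_table[512];

/* Stand-ins for mapping and unmapping an L2 table; only calls are counted. */
static uint64_t *sketch_map_l2(void)
{
    sketch_map_calls++;
    return sketch_table;
}

static void sketch_unmap_l2(uint64_t **p)
{
    *p = NULL;
}

/* Walk [start, end) in 2 MiB steps; return how many mappings were needed. */
static unsigned int sketch_walk(uint64_t start, uint64_t end)
{
    uint64_t *l2 = NULL;

    assert(start <= end);
    sketch_map_calls = 0;

    for ( uint64_t va = start; va < end; va += UINT64_C(1) << SKETCH_L2_SHIFT )
    {
        /* New 1 GiB region: the cached table no longer covers va. */
        if ( !(va & ((UINT64_C(1) << SKETCH_L3_SHIFT) - 1)) )
            sketch_unmap_l2(&l2);

        if ( !l2 )
            l2 = sketch_map_l2();

        /* Index the table by offset instead of advancing the pointer. */
        l2[(va >> SKETCH_L2_SHIFT) & 511] = va;
    }

    sketch_unmap_l2(&l2);

    return sketch_map_calls;
}
```

Walking two full 1 GiB regions costs two mappings in this model, where
mapping on every iteration would cost 1024.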
---
 xen/arch/x86/x86_64/mm.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 3e40d529bbf3..c625075695e0 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -402,7 +402,8 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
 
     ASSERT(l4e_get_flags(idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)])
             & _PAGE_PRESENT);
-    l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]);
+    l3_ro_mpt = map_l3t_from_l4e(
+                    idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]);
 
     smap = (info->spfn & (~((1UL << (L2_PAGETABLE_SHIFT - 3)) -1)));
     emap = ((info->epfn + ((1UL << (L2_PAGETABLE_SHIFT - 3)) - 1 )) &
@@ -420,6 +421,10 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
     i = smap;
     while ( i < emap )
     {
+        if ( (RO_MPT_VIRT_START + i * sizeof(*machine_to_phys_mapping)) &
+             ((1UL << L3_PAGETABLE_SHIFT) - 1) )
+            UNMAP_DOMAIN_PAGE(l2_ro_mpt);
+
         switch ( m2p_mapped(i) )
         {
         case M2P_1G_MAPPED:
@@ -455,32 +460,31 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
 
             ASSERT(!(l3e_get_flags(l3_ro_mpt[l3_table_offset(va)]) &
                   _PAGE_PSE));
-            if ( l3e_get_flags(l3_ro_mpt[l3_table_offset(va)]) &
-              _PAGE_PRESENT )
-                l2_ro_mpt = l3e_to_l2e(l3_ro_mpt[l3_table_offset(va)]) +
-                  l2_table_offset(va);
+            if ( l2_ro_mpt )
+                /* nothing */;
+            else if ( l3e_get_flags(l3_ro_mpt[l3_table_offset(va)]) &
+                      _PAGE_PRESENT )
+                l2_ro_mpt = map_l2t_from_l3e(l3_ro_mpt[l3_table_offset(va)]);
             else
             {
-                l2_ro_mpt = alloc_xen_pagetable();
+                mfn_t l2_ro_mpt_mfn;
+
+                l2_ro_mpt = alloc_mapped_pagetable(&l2_ro_mpt_mfn);
                 if ( !l2_ro_mpt )
                 {
                     ret = -ENOMEM;
                     goto error;
                 }
 
-                clear_page(l2_ro_mpt);
                 l3e_write(&l3_ro_mpt[l3_table_offset(va)],
-                          l3e_from_paddr(__pa(l2_ro_mpt),
-                                         __PAGE_HYPERVISOR_RO | _PAGE_USER));
-                l2_ro_mpt += l2_table_offset(va);
+                          l3e_from_mfn(l2_ro_mpt_mfn,
+                                       __PAGE_HYPERVISOR_RO | _PAGE_USER));
             }
 
             /* NB. Cannot be GLOBAL: guest user mode should not see it. */
-            l2e_write(l2_ro_mpt, l2e_from_mfn(mfn,
+            l2e_write(&l2_ro_mpt[l2_table_offset(va)], l2e_from_mfn(mfn,
                    /*_PAGE_GLOBAL|*/_PAGE_PSE|_PAGE_USER|_PAGE_PRESENT));
         }
-        if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
-            l2_ro_mpt = NULL;
         i += ( 1UL << (L2_PAGETABLE_SHIFT - 3));
     }
 #undef CNT
@@ -488,6 +492,8 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
 
     ret = setup_compat_m2p_table(info);
 error:
+    unmap_domain_page(l2_ro_mpt);
+    unmap_domain_page(l3_ro_mpt);
     return ret;
 }
 
-- 
2.23.4




* [PATCH v10 07/13] efi: use new page table APIs in copy_mapping
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (5 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 06/13] x86_64/mm: switch to new APIs in setup_m2p_table Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 08/13] efi: switch to new APIs in EFI code Hongyan Xia
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel; +Cc: jgrall, Jan Beulich

From: Wei Liu <wei.liu2@citrix.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v8:
- remove redundant commit message.
- unmap l3src based on va instead of mfn.
- re-structure if condition around l3dst.

Changed in v7:
- hoist l3 variables out of the loop to avoid repetitive mappings.
---
 xen/common/efi/boot.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 63e289ab8506..539d86c6e8c2 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -6,6 +6,7 @@
 #include <xen/compile.h>
 #include <xen/ctype.h>
 #include <xen/dmi.h>
+#include <xen/domain_page.h>
 #include <xen/init.h>
 #include <xen/keyhandler.h>
 #include <xen/lib.h>
@@ -1439,29 +1440,42 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
                                                  unsigned long emfn))
 {
     unsigned long next;
+    l3_pgentry_t *l3src = NULL, *l3dst = NULL;
 
     for ( ; mfn < end; mfn = next )
     {
         l4_pgentry_t l4e = efi_l4_pgtable[l4_table_offset(mfn << PAGE_SHIFT)];
-        l3_pgentry_t *l3src, *l3dst;
         unsigned long va = (unsigned long)mfn_to_virt(mfn);
 
+        if ( !(mfn & ((1UL << (L4_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1)) )
+            UNMAP_DOMAIN_PAGE(l3dst);
+        if ( !(va & ((1UL << L4_PAGETABLE_SHIFT) - 1)) )
+            UNMAP_DOMAIN_PAGE(l3src);
         next = mfn + (1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT));
         if ( !is_valid(mfn, min(next, end)) )
             continue;
-        if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
+
+        if ( l3dst )
+            /* nothing */;
+        else if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
         {
-            l3dst = alloc_xen_pagetable();
+            mfn_t l3mfn;
+
+            l3dst = alloc_mapped_pagetable(&l3mfn);
             BUG_ON(!l3dst);
-            clear_page(l3dst);
             efi_l4_pgtable[l4_table_offset(mfn << PAGE_SHIFT)] =
-                l4e_from_paddr(virt_to_maddr(l3dst), __PAGE_HYPERVISOR);
+                l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
         }
         else
-            l3dst = l4e_to_l3e(l4e);
-        l3src = l4e_to_l3e(idle_pg_table[l4_table_offset(va)]);
+            l3dst = map_l3t_from_l4e(l4e);
+
+        if ( !l3src )
+            l3src = map_l3t_from_l4e(idle_pg_table[l4_table_offset(va)]);
         l3dst[l3_table_offset(mfn << PAGE_SHIFT)] = l3src[l3_table_offset(va)];
     }
+
+    unmap_domain_page(l3src);
+    unmap_domain_page(l3dst);
 }
 
 static bool __init ram_range_valid(unsigned long smfn, unsigned long emfn)
-- 
2.23.4




* [PATCH v10 08/13] efi: switch to new APIs in EFI code
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (6 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 07/13] efi: use new page table APIs in copy_mapping Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 09/13] x86/smpboot: add exit path for clone_mapping() Hongyan Xia
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v7:
- add blank line after declaration.
- rename efi_l4_pgtable into efi_l4t.
- pass the mapped efi_l4t to copy_mapping() instead of mapping it again.
- use the alloc_map_clear_xen_pt() API.
- unmap pl3e, pl2e, l1t earlier.
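
Illustration only, not part of the patch: a standalone sketch of the
pattern the runtime.h hunk below follows, where the long-lived global holds
a frame number with an "invalid" sentinel instead of a pointer, and each
update maps the table only for as long as it needs it. The frame array and
all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_FRAMES 4
#define SKETCH_ENTRIES   512
#define SKETCH_INVALID   UINT32_MAX

static uint64_t sketch_frames[SKETCH_NR_FRAMES][SKETCH_ENTRIES];
static uint32_t sketch_root = SKETCH_INVALID; /* role of efi_l4_mfn */
static unsigned int sketch_live;              /* mappings currently held */

static uint64_t *sketch_map(uint32_t frame)
{
    assert(frame < SKETCH_NR_FRAMES);
    sketch_live++;
    return sketch_frames[frame];
}

static void sketch_unmap(const uint64_t *va)
{
    if ( va )
        sketch_live--;
}

/* Update one root entry; do nothing if no root table has been set up. */
static bool sketch_update_root(unsigned int idx, uint64_t val)
{
    uint64_t *root;

    if ( sketch_root == SKETCH_INVALID || idx >= SKETCH_ENTRIES )
        return false;

    root = sketch_map(sketch_root);
    root[idx] = val;
    sketch_unmap(root);

    return true;
}
```

The sentinel comparison takes over from the old NULL-pointer test as the
"not initialised yet" check.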
---
 xen/arch/x86/efi/runtime.h | 13 ++++++---
 xen/common/efi/boot.c      | 55 ++++++++++++++++++++++----------------
 xen/common/efi/efi.h       |  3 ++-
 xen/common/efi/runtime.c   |  8 +++---
 4 files changed, 48 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/efi/runtime.h b/xen/arch/x86/efi/runtime.h
index d9eb8f5c270f..77866c5f2178 100644
--- a/xen/arch/x86/efi/runtime.h
+++ b/xen/arch/x86/efi/runtime.h
@@ -1,12 +1,19 @@
+#include <xen/domain_page.h>
+#include <xen/mm.h>
 #include <asm/atomic.h>
 #include <asm/mc146818rtc.h>
 
 #ifndef COMPAT
-l4_pgentry_t *__read_mostly efi_l4_pgtable;
+mfn_t __read_mostly efi_l4_mfn = INVALID_MFN_INITIALIZER;
 
 void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e)
 {
-    if ( efi_l4_pgtable )
-        l4e_write(efi_l4_pgtable + l4idx, l4e);
+    if ( !mfn_eq(efi_l4_mfn, INVALID_MFN) )
+    {
+        l4_pgentry_t *efi_l4t = map_domain_page(efi_l4_mfn);
+
+        l4e_write(efi_l4t + l4idx, l4e);
+        unmap_domain_page(efi_l4t);
+    }
 }
 #endif
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 539d86c6e8c2..758f9d74d20f 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1437,14 +1437,15 @@ custom_param("efi", parse_efi_param);
 
 static __init void copy_mapping(unsigned long mfn, unsigned long end,
                                 bool (*is_valid)(unsigned long smfn,
-                                                 unsigned long emfn))
+                                                 unsigned long emfn),
+                                l4_pgentry_t *efi_l4t)
 {
     unsigned long next;
     l3_pgentry_t *l3src = NULL, *l3dst = NULL;
 
     for ( ; mfn < end; mfn = next )
     {
-        l4_pgentry_t l4e = efi_l4_pgtable[l4_table_offset(mfn << PAGE_SHIFT)];
+        l4_pgentry_t l4e = efi_l4t[l4_table_offset(mfn << PAGE_SHIFT)];
         unsigned long va = (unsigned long)mfn_to_virt(mfn);
 
         if ( !(mfn & ((1UL << (L4_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1)) )
@@ -1463,7 +1464,7 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
 
             l3dst = alloc_mapped_pagetable(&l3mfn);
             BUG_ON(!l3dst);
-            efi_l4_pgtable[l4_table_offset(mfn << PAGE_SHIFT)] =
+            efi_l4t[l4_table_offset(mfn << PAGE_SHIFT)] =
                 l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
         }
         else
@@ -1496,6 +1497,7 @@ static bool __init rt_range_valid(unsigned long smfn, unsigned long emfn)
 void __init efi_init_memory(void)
 {
     unsigned int i;
+    l4_pgentry_t *efi_l4t;
     struct rt_extra {
         struct rt_extra *next;
         unsigned long smfn, emfn;
@@ -1610,11 +1612,10 @@ void __init efi_init_memory(void)
      * Set up 1:1 page tables for runtime calls. See SetVirtualAddressMap() in
      * efi_exit_boot().
      */
-    efi_l4_pgtable = alloc_xen_pagetable();
-    BUG_ON(!efi_l4_pgtable);
-    clear_page(efi_l4_pgtable);
+    efi_l4t = alloc_mapped_pagetable(&efi_l4_mfn);
+    BUG_ON(!efi_l4t);
 
-    copy_mapping(0, max_page, ram_range_valid);
+    copy_mapping(0, max_page, ram_range_valid, efi_l4t);
 
     /* Insert non-RAM runtime mappings inside the direct map. */
     for ( i = 0; i < efi_memmap_size; i += efi_mdesc_size )
@@ -1630,58 +1631,64 @@ void __init efi_init_memory(void)
             copy_mapping(PFN_DOWN(desc->PhysicalStart),
                          PFN_UP(desc->PhysicalStart +
                                 (desc->NumberOfPages << EFI_PAGE_SHIFT)),
-                         rt_range_valid);
+                         rt_range_valid, efi_l4t);
     }
 
     /* Insert non-RAM runtime mappings outside of the direct map. */
     while ( (extra = extra_head) != NULL )
     {
         unsigned long addr = extra->smfn << PAGE_SHIFT;
-        l4_pgentry_t l4e = efi_l4_pgtable[l4_table_offset(addr)];
+        l4_pgentry_t l4e = efi_l4t[l4_table_offset(addr)];
         l3_pgentry_t *pl3e;
         l2_pgentry_t *pl2e;
         l1_pgentry_t *l1t;
 
         if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
         {
-            pl3e = alloc_xen_pagetable();
+            mfn_t l3mfn;
+
+            pl3e = alloc_mapped_pagetable(&l3mfn);
             BUG_ON(!pl3e);
-            clear_page(pl3e);
-            efi_l4_pgtable[l4_table_offset(addr)] =
-                l4e_from_paddr(virt_to_maddr(pl3e), __PAGE_HYPERVISOR);
+            efi_l4t[l4_table_offset(addr)] =
+                l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
         }
         else
-            pl3e = l4e_to_l3e(l4e);
+            pl3e = map_l3t_from_l4e(l4e);
         pl3e += l3_table_offset(addr);
         if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
         {
-            pl2e = alloc_xen_pagetable();
+            mfn_t l2mfn;
+
+            pl2e = alloc_mapped_pagetable(&l2mfn);
             BUG_ON(!pl2e);
-            clear_page(pl2e);
-            *pl3e = l3e_from_paddr(virt_to_maddr(pl2e), __PAGE_HYPERVISOR);
+            *pl3e = l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR);
         }
         else
         {
             BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
-            pl2e = l3e_to_l2e(*pl3e);
+            pl2e = map_l2t_from_l3e(*pl3e);
         }
+        UNMAP_DOMAIN_PAGE(pl3e);
         pl2e += l2_table_offset(addr);
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
-            l1t = alloc_xen_pagetable();
+            mfn_t l1mfn;
+
+            l1t = alloc_mapped_pagetable(&l1mfn);
             BUG_ON(!l1t);
-            clear_page(l1t);
-            *pl2e = l2e_from_paddr(virt_to_maddr(l1t), __PAGE_HYPERVISOR);
+            *pl2e = l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR);
         }
         else
         {
             BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
-            l1t = l2e_to_l1e(*pl2e);
+            l1t = map_l1t_from_l2e(*pl2e);
         }
+        UNMAP_DOMAIN_PAGE(pl2e);
         for ( i = l1_table_offset(addr);
               i < L1_PAGETABLE_ENTRIES && extra->smfn < extra->emfn;
               ++i, ++extra->smfn )
             l1t[i] = l1e_from_pfn(extra->smfn, extra->prot);
+        UNMAP_DOMAIN_PAGE(l1t);
 
         if ( extra->smfn == extra->emfn )
         {
@@ -1693,6 +1700,8 @@ void __init efi_init_memory(void)
     /* Insert Xen mappings. */
     for ( i = l4_table_offset(HYPERVISOR_VIRT_START);
           i < l4_table_offset(DIRECTMAP_VIRT_END); ++i )
-        efi_l4_pgtable[i] = idle_pg_table[i];
+        efi_l4t[i] = idle_pg_table[i];
+
+    unmap_domain_page(efi_l4t);
 }
 #endif
diff --git a/xen/common/efi/efi.h b/xen/common/efi/efi.h
index 663a8b5000d9..c9aa65d506b1 100644
--- a/xen/common/efi/efi.h
+++ b/xen/common/efi/efi.h
@@ -6,6 +6,7 @@
 #include <efi/eficapsule.h>
 #include <efi/efiapi.h>
 #include <xen/efi.h>
+#include <xen/mm.h>
 #include <xen/spinlock.h>
 #include <asm/page.h>
 
@@ -29,7 +30,7 @@ extern UINTN efi_memmap_size, efi_mdesc_size;
 extern void *efi_memmap;
 
 #ifdef CONFIG_X86
-extern l4_pgentry_t *efi_l4_pgtable;
+extern mfn_t efi_l4_mfn;
 #endif
 
 extern const struct efi_pci_rom *efi_pci_roms;
diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c
index 95367694b5f3..375b94229e13 100644
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -85,7 +85,7 @@ struct efi_rs_state efi_rs_enter(void)
     static const u32 mxcsr = MXCSR_DEFAULT;
     struct efi_rs_state state = { .cr3 = 0 };
 
-    if ( !efi_l4_pgtable )
+    if ( mfn_eq(efi_l4_mfn, INVALID_MFN) )
         return state;
 
     state.cr3 = read_cr3();
@@ -111,7 +111,7 @@ struct efi_rs_state efi_rs_enter(void)
         lgdt(&gdt_desc);
     }
 
-    switch_cr3_cr4(virt_to_maddr(efi_l4_pgtable), read_cr4());
+    switch_cr3_cr4(mfn_to_maddr(efi_l4_mfn), read_cr4());
 
     return state;
 }
@@ -140,9 +140,9 @@ void efi_rs_leave(struct efi_rs_state *state)
 
 bool efi_rs_using_pgtables(void)
 {
-    return efi_l4_pgtable &&
+    return !mfn_eq(efi_l4_mfn, INVALID_MFN) &&
            (smp_processor_id() == efi_rs_on_cpu) &&
-           (read_cr3() == virt_to_maddr(efi_l4_pgtable));
+           (read_cr3() == mfn_to_maddr(efi_l4_mfn));
 }
 
 unsigned long efi_get_time(void)
-- 
2.23.4




* [PATCH v10 09/13] x86/smpboot: add exit path for clone_mapping()
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (7 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 08/13] efi: switch to new APIs in EFI code Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 10/13] x86/smpboot: switch clone_mapping() to new APIs Hongyan Xia
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

We will soon need to clean up page table mappings in the exit path.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v7:
- edit commit message.
- begin with rc = 0 and set it to -ENOMEM ahead of if().
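
Illustration only, not part of the patch: a standalone sketch of the
single-exit shape this patch introduces, with rc starting at 0, being set
to -ENOMEM ahead of each allocation check, and every return funnelled
through one label where cleanup can later be added. sketch_build() and its
fail_at parameter are hypothetical; malloc() stands in for page table
allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * fail_at == 0: nothing to do, leave early with success.
 * fail_at == 1 or 2: simulate failure of the first or second allocation.
 * Any other value: both allocations are attempted for real.
 * *exits counts how often the common exit path ran.
 */
static int sketch_build(int fail_at, unsigned int *exits)
{
    void *l3 = NULL, *l2 = NULL;
    int rc = 0;

    assert(exits);

    if ( fail_at == 0 )
        goto out;

    rc = -ENOMEM;
    l3 = (fail_at == 1) ? NULL : malloc(64);
    if ( !l3 )
        goto out;

    rc = -ENOMEM;
    l2 = (fail_at == 2) ? NULL : malloc(64);
    if ( !l2 )
        goto out;

    rc = 0;
 out:
    /* One place to release everything, whichever path got here. */
    free(l2);
    free(l3);
    ++*exits;

    return rc;
}
```

Initialising the pointers to NULL is what lets the exit path release them
unconditionally.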
---
 xen/arch/x86/smpboot.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 82c1012e892f..e90c4dfa8a88 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -696,6 +696,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     l3_pgentry_t *pl3e;
     l2_pgentry_t *pl2e;
     l1_pgentry_t *pl1e;
+    int rc = 0;
 
     /*
      * Sanity check 'linear'.  We only allow cloning from the Xen virtual
@@ -736,7 +737,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
             pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
             flags = l1e_get_flags(*pl1e);
             if ( !(flags & _PAGE_PRESENT) )
-                return 0;
+                goto out;
             pfn = l1e_get_pfn(*pl1e);
         }
     }
@@ -744,8 +745,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
     {
         pl3e = alloc_xen_pagetable();
+        rc = -ENOMEM;
         if ( !pl3e )
-            return -ENOMEM;
+            goto out;
         clear_page(pl3e);
         l4e_write(&rpt[root_table_offset(linear)],
                   l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR));
@@ -758,8 +760,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
     {
         pl2e = alloc_xen_pagetable();
+        rc = -ENOMEM;
         if ( !pl2e )
-            return -ENOMEM;
+            goto out;
         clear_page(pl2e);
         l3e_write(pl3e, l3e_from_paddr(__pa(pl2e), __PAGE_HYPERVISOR));
     }
@@ -774,8 +777,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
     {
         pl1e = alloc_xen_pagetable();
+        rc = -ENOMEM;
         if ( !pl1e )
-            return -ENOMEM;
+            goto out;
         clear_page(pl1e);
         l2e_write(pl2e, l2e_from_paddr(__pa(pl1e), __PAGE_HYPERVISOR));
     }
@@ -796,7 +800,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     else
         l1e_write(pl1e, l1e_from_pfn(pfn, flags));
 
-    return 0;
+    rc = 0;
+ out:
+    return rc;
 }
 
 DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
-- 
2.23.4




* [PATCH v10 10/13] x86/smpboot: switch clone_mapping() to new APIs
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (8 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 09/13] x86/smpboot: add exit path for clone_mapping() Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 11/13] x86/mm: drop old page table APIs Hongyan Xia
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v10:
- switch to unmap_domain_page() for pl3e in the middle because it is
  guaranteed to be overwritten later.

Changed in v7:
- change patch title
- remove initialiser of pl3e.
- combine the initialisation of pl3e into a single assignment.
- use the new alloc_map_clear() helper.
- use the normal map_domain_page() in the error path.
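
Illustration only, not part of the patch: a standalone sketch of the
distinction the v10 note draws between the two unmap forms. It assumes the
upper-case form unmaps and then resets the pointer to NULL, and that the
plain function ignores a NULL argument; both are modelled here with
hypothetical sketch_* names rather than the real Xen definitions.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int sketch_live; /* mappings currently held */
static int sketch_page;

static void *sketch_map(void)
{
    sketch_live++;
    return &sketch_page;
}

/* NULL is accepted and ignored so exit paths can unmap unconditionally. */
static void sketch_unmap(const void *p)
{
    if ( p )
        sketch_live--;
}

/* Unmap and forget: a later unmap of the same variable becomes a no-op. */
#define SKETCH_UNMAP(p) do { sketch_unmap(p); (p) = NULL; } while ( 0 )

/* Returns the number of mappings still held at the end (0 if balanced). */
static unsigned int sketch_two_phase(void)
{
    void *pl3e = sketch_map();
    void *pl2e = sketch_map();

    SKETCH_UNMAP(pl2e);  /* may not be mapped again: must be reset */
    sketch_unmap(pl3e);  /* always reassigned just below: no reset needed */

    pl3e = sketch_map(); /* second phase reuses the variable */

    /* Common exit path. */
    sketch_unmap(pl2e);  /* NULL here, so this is a no-op */
    sketch_unmap(pl3e);

    return sketch_live;
}
```

Skipping the reset is only safe where the variable is certain to be
overwritten before the exit path can see it.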
---
 xen/arch/x86/smpboot.c | 44 ++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index e90c4dfa8a88..765cf3396051 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -694,8 +694,8 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     unsigned long linear = (unsigned long)ptr, pfn;
     unsigned int flags;
     l3_pgentry_t *pl3e;
-    l2_pgentry_t *pl2e;
-    l1_pgentry_t *pl1e;
+    l2_pgentry_t *pl2e = NULL;
+    l1_pgentry_t *pl1e = NULL;
     int rc = 0;
 
     /*
@@ -710,7 +710,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
          (linear >= XEN_VIRT_END && linear < DIRECTMAP_VIRT_START) )
         return -EINVAL;
 
-    pl3e = l4e_to_l3e(idle_pg_table[root_table_offset(linear)]) +
+    pl3e = map_l3t_from_l4e(idle_pg_table[root_table_offset(linear)]) +
         l3_table_offset(linear);
 
     flags = l3e_get_flags(*pl3e);
@@ -723,7 +723,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     }
     else
     {
-        pl2e = l3e_to_l2e(*pl3e) + l2_table_offset(linear);
+        pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(linear);
         flags = l2e_get_flags(*pl2e);
         ASSERT(flags & _PAGE_PRESENT);
         if ( flags & _PAGE_PSE )
@@ -734,7 +734,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
         }
         else
         {
-            pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
+            pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(linear);
             flags = l1e_get_flags(*pl1e);
             if ( !(flags & _PAGE_PRESENT) )
                 goto out;
@@ -742,51 +742,58 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
         }
     }
 
+    UNMAP_DOMAIN_PAGE(pl1e);
+    UNMAP_DOMAIN_PAGE(pl2e);
+    unmap_domain_page(pl3e);
+
     if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
     {
-        pl3e = alloc_xen_pagetable();
+        mfn_t l3mfn;
+
+        pl3e = alloc_mapped_pagetable(&l3mfn);
         rc = -ENOMEM;
         if ( !pl3e )
             goto out;
-        clear_page(pl3e);
         l4e_write(&rpt[root_table_offset(linear)],
-                  l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR));
+                  l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR));
     }
     else
-        pl3e = l4e_to_l3e(rpt[root_table_offset(linear)]);
+        pl3e = map_l3t_from_l4e(rpt[root_table_offset(linear)]);
 
     pl3e += l3_table_offset(linear);
 
     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
     {
-        pl2e = alloc_xen_pagetable();
+        mfn_t l2mfn;
+
+        pl2e = alloc_mapped_pagetable(&l2mfn);
         rc = -ENOMEM;
         if ( !pl2e )
             goto out;
-        clear_page(pl2e);
-        l3e_write(pl3e, l3e_from_paddr(__pa(pl2e), __PAGE_HYPERVISOR));
+        l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
     }
     else
     {
         ASSERT(!(l3e_get_flags(*pl3e) & _PAGE_PSE));
-        pl2e = l3e_to_l2e(*pl3e);
+        pl2e = map_l2t_from_l3e(*pl3e);
     }
 
     pl2e += l2_table_offset(linear);
 
     if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
     {
-        pl1e = alloc_xen_pagetable();
+        mfn_t l1mfn;
+
+        pl1e = alloc_mapped_pagetable(&l1mfn);
         rc = -ENOMEM;
         if ( !pl1e )
             goto out;
-        clear_page(pl1e);
-        l2e_write(pl2e, l2e_from_paddr(__pa(pl1e), __PAGE_HYPERVISOR));
+        l2e_write(pl2e, l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR));
     }
     else
     {
         ASSERT(!(l2e_get_flags(*pl2e) & _PAGE_PSE));
-        pl1e = l2e_to_l1e(*pl2e);
+        pl1e = map_l1t_from_l2e(*pl2e);
     }
 
     pl1e += l1_table_offset(linear);
@@ -802,6 +809,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 
     rc = 0;
  out:
+    unmap_domain_page(pl1e);
+    unmap_domain_page(pl2e);
+    unmap_domain_page(pl3e);
     return rc;
 }
 
-- 
2.23.4



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v10 11/13] x86/mm: drop old page table APIs
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (9 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 10/13] x86/smpboot: switch clone_mapping() to new APIs Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 12/13] x86: switch to use domheap page for page tables Hongyan Xia
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Hongyan Xia <hongyxia@amazon.com>

Two sets of old APIs, alloc/free_xen_pagetable() and lXe_to_lYe(), are
now dropped to avoid the dependency on the direct map.

Two special cases have not yet been rewritten to use the new APIs and
thus need special treatment:

rpt in smpboot.c cannot use ephemeral mappings yet. The problem is that
rpt is read and written in context switch code, but the mapping
infrastructure is NOT context-switch-safe, meaning we cannot map rpt in
one domain and unmap it in another. Until the mapping infrastructure
supports context switches, rpt has to be globally mapped.

Also, lXe_to_lYe() during Xen image relocation cannot be converted into
map/unmap pairs. We cannot hold on to mappings while the mapping
infrastructure is being relocated! It is enough to remove the direct map
in the second e820 pass, so we still use the direct map (<4GiB) in Xen
relocation (which is during the first e820 pass).
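
As a rough illustration of the rpt restriction (a minimal sketch, not Xen
code): assume each context owns a private table of ephemeral mapping
slots, so a slot handed out in one context means nothing in another. The
names toy_ctx, toy_ctx_map() and toy_ctx_unmap() are hypothetical and
exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical toy model, not Xen code: each context owns a private table
 * of ephemeral mapping slots (an assumption made for illustration).
 */
#define TOY_SLOTS 4

struct toy_ctx {
    bool in_use[TOY_SLOTS];
};

/* Map in the given context; returns a slot index, or -1 if none is free. */
static int toy_ctx_map(struct toy_ctx *ctx)
{
    for ( int i = 0; i < TOY_SLOTS; i++ )
        if ( !ctx->in_use[i] )
        {
            ctx->in_use[i] = true;
            return i;
        }

    return -1;
}

/* Unmap only succeeds in the context that created the mapping. */
static bool toy_ctx_unmap(struct toy_ctx *ctx, int slot)
{
    if ( slot < 0 || slot >= TOY_SLOTS || !ctx->in_use[slot] )
        return false;

    ctx->in_use[slot] = false;
    return true;
}

/* Mapping in context a and unmapping in context b is rejected. */
static void toy_ctx_demo(void)
{
    struct toy_ctx a = { { false } }, b = { { false } };
    int slot = toy_ctx_map(&a);

    assert(slot == 0);
    assert(!toy_ctx_unmap(&b, slot)); /* wrong context */
    assert(toy_ctx_unmap(&a, slot));  /* owning context */
}
```

In this model, a mapping that must stay valid across context switches
cannot live in a per-context table, which is the reason given above for
keeping rpt globally mapped for now.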

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/mm.c          | 14 --------------
 xen/arch/x86/setup.c       |  4 ++--
 xen/arch/x86/smpboot.c     |  4 ++--
 xen/include/asm-x86/mm.h   |  2 --
 xen/include/asm-x86/page.h |  5 -----
 5 files changed, 4 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 832e654294b4..bf86ba3729aa 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4891,20 +4891,6 @@ int mmcfg_intercept_write(
     return X86EMUL_OKAY;
 }
 
-void *alloc_xen_pagetable(void)
-{
-    mfn_t mfn = alloc_xen_pagetable_new();
-
-    return mfn_eq(mfn, INVALID_MFN) ? NULL : mfn_to_virt(mfn_x(mfn));
-}
-
-void free_xen_pagetable(void *v)
-{
-    mfn_t mfn = v ? virt_to_mfn(v) : INVALID_MFN;
-
-    free_xen_pagetable_new(mfn);
-}
-
 /*
  * For these PTE APIs, the caller must follow the alloc-map-unmap-free
  * lifecycle, which means explicitly mapping the PTE pages before accessing
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index a6658d976937..f2dff2ae6a64 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1247,7 +1247,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                     continue;
                 *pl4e = l4e_from_intpte(l4e_get_intpte(*pl4e) +
                                         xen_phys_start);
-                pl3e = l4e_to_l3e(*pl4e);
+                pl3e = __va(l4e_get_paddr(*pl4e));
                 for ( j = 0; j < L3_PAGETABLE_ENTRIES; j++, pl3e++ )
                 {
                     /* Not present, 1GB mapping, or already relocated? */
@@ -1257,7 +1257,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         continue;
                     *pl3e = l3e_from_intpte(l3e_get_intpte(*pl3e) +
                                             xen_phys_start);
-                    pl2e = l3e_to_l2e(*pl3e);
+                    pl2e = __va(l3e_get_paddr(*pl3e));
                     for ( k = 0; k < L2_PAGETABLE_ENTRIES; k++, pl2e++ )
                     {
                         /* Not present, PSE, or already relocated? */
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 765cf3396051..ad878d8aebca 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -830,7 +830,7 @@ static int setup_cpu_root_pgt(unsigned int cpu)
     if ( !opt_xpti_hwdom && !opt_xpti_domu )
         return 0;
 
-    rpt = alloc_xen_pagetable();
+    rpt = alloc_xenheap_page();
     if ( !rpt )
         return -ENOMEM;
 
@@ -933,7 +933,7 @@ static void cleanup_cpu_root_pgt(unsigned int cpu)
         free_xen_pagetable_new(l3mfn);
     }
 
-    free_xen_pagetable(rpt);
+    free_xenheap_page(rpt);
 
     /* Also zap the stub mapping for this CPU. */
     if ( stub_linear )
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 111754675cbf..0a72fa7a26c3 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -573,8 +573,6 @@ int vcpu_destroy_pagetables(struct vcpu *);
 void *do_page_walk(struct vcpu *v, unsigned long addr);
 
 /* Allocator functions for Xen pagetables. */
-void *alloc_xen_pagetable(void);
-void free_xen_pagetable(void *v);
 mfn_t alloc_xen_pagetable_new(void);
 void free_xen_pagetable_new(mfn_t mfn);
 void *alloc_mapped_pagetable(mfn_t *pmfn);
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 4c7f2cb70c69..1d080cffbe84 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -180,11 +180,6 @@ static inline l4_pgentry_t l4e_from_paddr(paddr_t pa, unsigned int flags)
 #define l4e_has_changed(x,y,flags) \
     ( !!(((x).l4 ^ (y).l4) & ((PADDR_MASK&PAGE_MASK)|put_pte_flags(flags))) )
 
-/* Pagetable walking. */
-#define l2e_to_l1e(x)              ((l1_pgentry_t *)__va(l2e_get_paddr(x)))
-#define l3e_to_l2e(x)              ((l2_pgentry_t *)__va(l3e_get_paddr(x)))
-#define l4e_to_l3e(x)              ((l3_pgentry_t *)__va(l4e_get_paddr(x)))
-
 #define map_l1t_from_l2e(x)        (l1_pgentry_t *)map_domain_page(l2e_get_mfn(x))
 #define map_l2t_from_l3e(x)        (l2_pgentry_t *)map_domain_page(l3e_get_mfn(x))
 #define map_l3t_from_l4e(x)        (l3_pgentry_t *)map_domain_page(l4e_get_mfn(x))
-- 
2.23.4



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v10 12/13] x86: switch to use domheap page for page tables
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (10 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 11/13] x86/mm: drop old page table APIs Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-21 14:15 ` [PATCH v10 13/13] x86/mm: drop _new suffix for page table APIs Hongyan Xia
  2021-04-22 16:21 ` [PATCH v10 00/13] switch to domheap for Xen page tables Andrew Cooper
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Hongyan Xia <hongyxia@amazon.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
Changed in v8:
- const qualify pg in alloc_xen_pagetable_new().
---
 xen/arch/x86/mm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index bf86ba3729aa..604f83c3837e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4901,10 +4901,10 @@ mfn_t alloc_xen_pagetable_new(void)
 {
     if ( system_state != SYS_STATE_early_boot )
     {
-        void *ptr = alloc_xenheap_page();
+        const struct page_info *pg = alloc_domheap_page(NULL, 0);
 
-        BUG_ON(!hardware_domain && !ptr);
-        return ptr ? virt_to_mfn(ptr) : INVALID_MFN;
+        BUG_ON(!hardware_domain && !pg);
+        return pg ? page_to_mfn(pg) : INVALID_MFN;
     }
 
     return alloc_boot_pages(1, 1);
@@ -4914,7 +4914,7 @@ mfn_t alloc_xen_pagetable_new(void)
 void free_xen_pagetable_new(mfn_t mfn)
 {
     if ( system_state != SYS_STATE_early_boot && !mfn_eq(mfn, INVALID_MFN) )
-        free_xenheap_page(mfn_to_virt(mfn_x(mfn)));
+        free_domheap_page(mfn_to_page(mfn));
 }
 
 void *alloc_mapped_pagetable(mfn_t *pmfn)
-- 
2.23.4



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v10 13/13] x86/mm: drop _new suffix for page table APIs
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (11 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 12/13] x86: switch to use domheap page for page tables Hongyan Xia
@ 2021-04-21 14:15 ` Hongyan Xia
  2021-04-22 16:21 ` [PATCH v10 00/13] switch to domheap for Xen page tables Andrew Cooper
  13 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-21 14:15 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

From: Wei Liu <wei.liu2@citrix.com>

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/mm.c        | 44 ++++++++++++++++++++--------------------
 xen/arch/x86/smpboot.c   |  6 +++---
 xen/arch/x86/x86_64/mm.c |  2 +-
 xen/include/asm-x86/mm.h |  4 ++--
 4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 604f83c3837e..709686701388 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -369,7 +369,7 @@ void __init arch_init_memory(void)
             ASSERT(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS);
             if ( l4_table_offset(split_va) == l4_table_offset(split_va - 1) )
             {
-                mfn_t l3mfn = alloc_xen_pagetable_new();
+                mfn_t l3mfn = alloc_xen_pagetable();
 
                 if ( !mfn_eq(l3mfn, INVALID_MFN) )
                 {
@@ -4897,7 +4897,7 @@ int mmcfg_intercept_write(
  * them. The caller must check whether the allocation has succeeded, and only
  * pass valid MFNs to map_domain_page().
  */
-mfn_t alloc_xen_pagetable_new(void)
+mfn_t alloc_xen_pagetable(void)
 {
     if ( system_state != SYS_STATE_early_boot )
     {
@@ -4911,7 +4911,7 @@ mfn_t alloc_xen_pagetable_new(void)
 }
 
 /* mfn can be INVALID_MFN */
-void free_xen_pagetable_new(mfn_t mfn)
+void free_xen_pagetable(mfn_t mfn)
 {
     if ( system_state != SYS_STATE_early_boot && !mfn_eq(mfn, INVALID_MFN) )
         free_domheap_page(mfn_to_page(mfn));
@@ -4919,7 +4919,7 @@ void free_xen_pagetable_new(mfn_t mfn)
 
 void *alloc_mapped_pagetable(mfn_t *pmfn)
 {
-    mfn_t mfn = alloc_xen_pagetable_new();
+    mfn_t mfn = alloc_xen_pagetable();
     void *ret;
 
     if ( mfn_eq(mfn, INVALID_MFN) )
@@ -4965,7 +4965,7 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        free_xen_pagetable_new(l3mfn);
+        free_xen_pagetable(l3mfn);
     }
 
     return map_l3t_from_l4e(*pl4e) + l3_table_offset(v);
@@ -5000,7 +5000,7 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        free_xen_pagetable_new(l2mfn);
+        free_xen_pagetable(l2mfn);
     }
 
     BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
@@ -5039,7 +5039,7 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        free_xen_pagetable_new(l1mfn);
+        free_xen_pagetable(l1mfn);
     }
 
     BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
@@ -5226,10 +5226,10 @@ int map_pages_to_xen(
                         ol2e = l2t[i];
                         if ( (l2e_get_flags(ol2e) & _PAGE_PRESENT) &&
                              !(l2e_get_flags(ol2e) & _PAGE_PSE) )
-                            free_xen_pagetable_new(l2e_get_mfn(ol2e));
+                            free_xen_pagetable(l2e_get_mfn(ol2e));
                     }
                     unmap_domain_page(l2t);
-                    free_xen_pagetable_new(l3e_get_mfn(ol3e));
+                    free_xen_pagetable(l3e_get_mfn(ol3e));
                 }
             }
 
@@ -5268,7 +5268,7 @@ int map_pages_to_xen(
                 continue;
             }
 
-            l2mfn = alloc_xen_pagetable_new();
+            l2mfn = alloc_xen_pagetable();
             if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
@@ -5296,7 +5296,7 @@ int map_pages_to_xen(
                 spin_unlock(&map_pgdir_lock);
             flush_area(virt, flush_flags);
 
-            free_xen_pagetable_new(l2mfn);
+            free_xen_pagetable(l2mfn);
         }
 
         pl2e = virt_to_xen_l2e(virt);
@@ -5330,7 +5330,7 @@ int map_pages_to_xen(
                         flush_flags(l1e_get_flags(l1t[i]));
                     flush_area(virt, flush_flags);
                     unmap_domain_page(l1t);
-                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
+                    free_xen_pagetable(l2e_get_mfn(ol2e));
                 }
             }
 
@@ -5374,7 +5374,7 @@ int map_pages_to_xen(
                     goto check_l3;
                 }
 
-                l1mfn = alloc_xen_pagetable_new();
+                l1mfn = alloc_xen_pagetable();
                 if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
 
@@ -5401,7 +5401,7 @@ int map_pages_to_xen(
                     spin_unlock(&map_pgdir_lock);
                 flush_area(virt, flush_flags);
 
-                free_xen_pagetable_new(l1mfn);
+                free_xen_pagetable(l1mfn);
             }
 
             if ( !pl1e )
@@ -5468,7 +5468,7 @@ int map_pages_to_xen(
                     flush_area(virt - PAGE_SIZE,
                                FLUSH_TLB_GLOBAL |
                                FLUSH_ORDER(PAGETABLE_ORDER));
-                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
+                    free_xen_pagetable(l2e_get_mfn(ol2e));
                 }
                 else if ( locking )
                     spin_unlock(&map_pgdir_lock);
@@ -5519,7 +5519,7 @@ int map_pages_to_xen(
                 flush_area(virt - PAGE_SIZE,
                            FLUSH_TLB_GLOBAL |
                            FLUSH_ORDER(2*PAGETABLE_ORDER));
-                free_xen_pagetable_new(l3e_get_mfn(ol3e));
+                free_xen_pagetable(l3e_get_mfn(ol3e));
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5619,7 +5619,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             }
 
             /* PAGE1GB: shatter the superpage and fall through. */
-            l2mfn = alloc_xen_pagetable_new();
+            l2mfn = alloc_xen_pagetable();
             if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
@@ -5643,7 +5643,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
 
-            free_xen_pagetable_new(l2mfn);
+            free_xen_pagetable(l2mfn);
         }
 
         /*
@@ -5679,7 +5679,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             {
                 l1_pgentry_t *l1t;
                 /* PSE: shatter the superpage and try again. */
-                mfn_t l1mfn = alloc_xen_pagetable_new();
+                mfn_t l1mfn = alloc_xen_pagetable();
 
                 if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
@@ -5703,7 +5703,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
 
-                free_xen_pagetable_new(l1mfn);
+                free_xen_pagetable(l1mfn);
             }
         }
         else
@@ -5770,7 +5770,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable_new(l1mfn);
+                free_xen_pagetable(l1mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5815,7 +5815,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable_new(l2mfn);
+                free_xen_pagetable(l2mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index ad878d8aebca..0dce1ae87210 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -922,15 +922,15 @@ static void cleanup_cpu_root_pgt(unsigned int cpu)
                     continue;
 
                 ASSERT(!(l2e_get_flags(l2t[i2]) & _PAGE_PSE));
-                free_xen_pagetable_new(l2e_get_mfn(l2t[i2]));
+                free_xen_pagetable(l2e_get_mfn(l2t[i2]));
             }
 
             unmap_domain_page(l2t);
-            free_xen_pagetable_new(l2mfn);
+            free_xen_pagetable(l2mfn);
         }
 
         unmap_domain_page(l3t);
-        free_xen_pagetable_new(l3mfn);
+        free_xen_pagetable(l3mfn);
     }
 
     free_xenheap_page(rpt);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index c625075695e0..c41ce847b36e 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -640,7 +640,7 @@ void __init paging_init(void)
     /* Create user-accessible L2 directory to map the MPT for compat guests. */
     if ( opt_pv32 )
     {
-        mfn = alloc_xen_pagetable_new();
+        mfn = alloc_xen_pagetable();
         if ( mfn_eq(mfn, INVALID_MFN) )
             goto nomem;
         compat_idle_pg_table_l2 = map_domain_page_global(mfn);
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 0a72fa7a26c3..56d7a71a24a4 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -573,8 +573,8 @@ int vcpu_destroy_pagetables(struct vcpu *);
 void *do_page_walk(struct vcpu *v, unsigned long addr);
 
 /* Allocator functions for Xen pagetables. */
-mfn_t alloc_xen_pagetable_new(void);
-void free_xen_pagetable_new(mfn_t mfn);
+mfn_t alloc_xen_pagetable(void);
+void free_xen_pagetable(mfn_t mfn);
 void *alloc_mapped_pagetable(mfn_t *pmfn);
 
 l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
-- 
2.23.4



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e
  2021-04-21 14:15 ` [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
@ 2021-04-22 11:54   ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2021-04-22 11:54 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: jgrall, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 21.04.2021 16:15, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> Rewrite those functions to use the new APIs. Modify its callers to unmap
> the pointer returned. Since alloc_xen_pagetable_new() is almost never
> useful unless accompanied by page clearing and a mapping, introduce a
> helper alloc_map_clear_xen_pt() for this sequence.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit ...

> @@ -4941,33 +4961,33 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
>      if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
>      {
>          bool locking = system_state > SYS_STATE_boot;
> -        l3_pgentry_t *l3t = alloc_xen_pagetable();
> +        mfn_t l3mfn;
> +        l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn);
>  
>          if ( !l3t )
>              return NULL;
> -        clear_page(l3t);
> +        UNMAP_DOMAIN_PAGE(l3t);

... this immediate unmapping (and then re-mapping below) will imo
want re-doing down the road as well. Even if it's not a severe
performance hit, it's simply odd, at least to me.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v10 02/13] x86/mm: switch to new APIs in map_pages_to_xen
  2021-04-21 14:15 ` [PATCH v10 02/13] x86/mm: switch to new APIs in map_pages_to_xen Hongyan Xia
@ 2021-04-22 12:01   ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2021-04-22 12:01 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: jgrall, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 21.04.2021 16:15, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> Page tables allocated in that function should be mapped and unmapped
> now.
> 
> Take the opportunity to avoid a potential double map in
> map_pages_to_xen() by initialising pl1e to NULL and only map it if it
> was not mapped earlier.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v10 03/13] x86/mm: switch to new APIs in modify_xen_mappings
  2021-04-21 14:15 ` [PATCH v10 03/13] x86/mm: switch to new APIs in modify_xen_mappings Hongyan Xia
@ 2021-04-22 13:10   ` Hongyan Xia
  0 siblings, 0 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-22 13:10 UTC (permalink / raw)
  To: xen-devel
  Cc: jgrall, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

On Wed, 2021-04-21 at 15:15 +0100, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> Page tables allocated in that function should be mapped and unmapped
> now.
> 
> Note that pl2e now may be mapped and unmapped in different iterations, so
> we need to add clean-ups for that.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> ---
> Changed in v7:
> - use normal unmap in the error path.
> ---
>  xen/arch/x86/mm.c | 57 ++++++++++++++++++++++++++++++---------------
> --
>  1 file changed, 36 insertions(+), 21 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 8a68da26f45f..832e654294b4 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5546,6 +5546,7 @@ int map_pages_to_xen(
>  
>   out:
>      L3T_UNLOCK(current_l3page);
> +    unmap_domain_page(pl2e);
>      unmap_domain_page(pl3e);
>      unmap_domain_page(pl2e);
>      return rc;

Something is seriously wrong here. This is obviously a mis-hunk which
should have been at the end of modify_xen_mappings() instead of
map_pages_to_xen(). This time the context is also different so I don't
understand why it did not conflict when I rebased.

Apologies. I noticed this patch has been merged so I will send a fix
right away.
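
For illustration only, a minimal sketch of why releasing the same mapping
twice is a problem, assuming a single reference-counted mapping slot. This
is a toy model, not Xen's mapcache; toy_mapping, toy_map_page(),
toy_unmap_page() and the two exit-path helpers are hypothetical names, and
the NULL tolerance is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not Xen's mapcache: one refcounted slot. */
struct toy_mapping {
    unsigned int refcnt;
};

static struct toy_mapping *toy_map_page(struct toy_mapping *m)
{
    m->refcnt++;
    return m;
}

/*
 * Returns false where a debug build would hit an assertion: releasing a
 * mapping whose reference count is already zero. NULL is tolerated
 * (assumed here) so that exit paths can unmap unconditionally.
 */
static bool toy_unmap_page(struct toy_mapping *m)
{
    if ( m == NULL )
        return true;
    if ( m->refcnt == 0 )
        return false;

    m->refcnt--;
    return true;
}

/* Buggy exit path: the same pointer is released twice. */
static bool toy_exit_path_buggy(struct toy_mapping *pl2e)
{
    bool ok = toy_unmap_page(pl2e);

    return toy_unmap_page(pl2e) && ok;
}

/* Fixed exit path: each mapping is released exactly once. */
static bool toy_exit_path_fixed(struct toy_mapping *pl2e)
{
    return toy_unmap_page(pl2e);
}

static void toy_unmap_demo(void)
{
    struct toy_mapping slot = { 0 };

    assert(toy_exit_path_fixed(toy_map_page(&slot)));
    assert(!toy_exit_path_buggy(toy_map_page(&slot)));
}
```

In the model, the second release in the buggy path finds a zero reference
count, which is the kind of condition a refcount assertion in an unmap
routine is there to catch.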

Hongyan



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v10 00/13] switch to domheap for Xen page tables
  2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
                   ` (12 preceding siblings ...)
  2021-04-21 14:15 ` [PATCH v10 13/13] x86/mm: drop _new suffix for page table APIs Hongyan Xia
@ 2021-04-22 16:21 ` Andrew Cooper
  2021-04-22 16:35   ` Hongyan Xia
  13 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2021-04-22 16:21 UTC (permalink / raw)
  To: Hongyan Xia, xen-devel; +Cc: jgrall, Jan Beulich, Roger Pau Monné, Wei Liu

On 21/04/2021 15:15, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
>
> This series rewrites all the remaining functions and finally makes the
> switch from xenheap to domheap for Xen page tables, so that they no
> longer need to rely on the direct map, which is a big step towards
> removing the direct map.

Staging is broken.  Xen hits an assertion just after dom0 starts.

(XEN) Freed 616kB init memory
mapping kernel into physical memory
about to get started...
(XEN) Assertion 'hashent->refcnt' failed at domain_page.c:204
(XEN) ----[ Xen-4.16-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d040316f80>] unmap_domain_page+0x2af/0x2e0
(XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor (d0v0)
(XEN) rax: 0000000000000000   rbx: ffff831c47bf9040   rcx: ffff831c47c1a000
(XEN) rdx: 0000000000000092   rsi: 0000000000000092   rdi: 0000000000000206
(XEN) rbp: ffff8300a5ca7c88   rsp: ffff8300a5ca7c78   r8:  0000000001c4f2fc
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000092018   r13: 0000000000800163   r14: fff0000000000000
(XEN) r15: 0000000000000001   cr0: 0000000080050033   cr4: 00000000003406e0
(XEN) cr3: 0000001c42008000   cr2: ffffc9000133d000
(XEN) fsb: 0000000000000000   gsb: ffff888266a00000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d040316f80> (unmap_domain_page+0x2af/0x2e0):
(XEN)  14 04 00 00 eb 19 0f 0b <0f> 0b 0f 0b ba 00 00 00 00 48 89 10 48 8b 81 d0
(XEN) Xen stack trace from rsp=ffff8300a5ca7c78:
(XEN)    ffff820040092018 0000000000000000 ffff8300a5ca7d58 ffff82d040327e20
(XEN)    a000000000000000 0000000000000000 ffff82d0405dbd40 008001e300000000
(XEN)    8000000000000000 8000000000000000 00000000000001e3 00000000000001e3
(XEN)    8000000000000000 0000000000000000 8000000000000163 0000000001440000
(XEN)    ffff82e0014b92e0 0000000301c1a000 0000000000000000 ffff820040090800
(XEN)    00000000026c10d8 0000000001c4f2fc 8010001c4240f067 ffff8300a5ca7df0
(XEN)    ffff82c00071c000 0000000000000001 0000000000001000 ffff8300a5ca7df8
(XEN)    ffff8300a5ca7dc8 ffff82d040232c08 ffff8300a5ca7db8 0000000140088078
(XEN)    ffff8300a5ca7df0 0080016300000001 ffffffff00000000 ffff82c00071c000
(XEN)    ffff82d0405b1300 ffff831c47bf9000 ffff82e04d821ae0 00000000026c10d7
(XEN)    ffff831c47c1a000 0000000000000100 ffff8300a5ca7dd8 ffff82d040232cdb
(XEN)    ffff8300a5ca7df8 ffff82d04031718b ffff8300a5ca7df8 00000000026c10d7
(XEN)    ffff8300a5ca7e38 ffff82d040209cb6 ffff831c47c1a018 0000000000000000
(XEN)    ffffffff82003e90 ffff831c47c1a018 ffff831c47bf9000 fffffffffffffff2
(XEN)    ffff8300a5ca7eb8 ffff82d04020a69a ffff82d04038a228 ffff82d04038a21c
(XEN)    00000000026c10d7 0000000000000100 ffff82d04038a228 ffff82d04038a21c
(XEN)    ffff82d04038a228 ffff82d04038a21c ffff82d04038a228 ffff8300a5ca7ef8
(XEN)    ffff831c47bf9000 0000000000000003 0000000000000000 0000000000000000
(XEN)    ffff8300a5ca7ee8 ffff82d040306e14 ffff82d04038a228 ffff831c47bf9000
(XEN)    0000000000000000 0000000000000000 00007cff5a3580e7 ffff82d04038a29d
(XEN) Xen call trace:
(XEN)    [<ffff82d040316f80>] R unmap_domain_page+0x2af/0x2e0
(XEN)    [<ffff82d040327e20>] F map_pages_to_xen+0x101a/0x1166
(XEN)    [<ffff82d040232c08>] F __vmap+0x332/0x3cd
(XEN)    [<ffff82d040232cdb>] F vmap+0x38/0x3a
(XEN)    [<ffff82d04031718b>] F map_domain_page_global+0x46/0x51
(XEN)    [<ffff82d040209cb6>] F map_vcpu_info+0x129/0x2c5
(XEN)    [<ffff82d04020a69a>] F do_vcpu_op+0x1eb/0x681
(XEN)    [<ffff82d040306e14>] F pv_hypercall+0x4e6/0x53d
(XEN)    [<ffff82d04038a29d>] F lstar_enter+0x12d/0x140
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'hashent->refcnt' failed at domain_page.c:204
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

I don't see an obvious candidate for the breakage.  Unless someone can
point one out quickly, I'll revert the lot to unblock staging.

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v10 00/13] switch to domheap for Xen page tables
  2021-04-22 16:21 ` [PATCH v10 00/13] switch to domheap for Xen page tables Andrew Cooper
@ 2021-04-22 16:35   ` Hongyan Xia
  2021-04-22 17:27     ` Julien Grall
  2021-04-22 17:28     ` Andrew Cooper
  0 siblings, 2 replies; 21+ messages in thread
From: Hongyan Xia @ 2021-04-22 16:35 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: jgrall, Jan Beulich, Roger Pau Monné, Wei Liu

Please see my reply to 03/13. Can you apply this diff and check whether
you can still trigger the issue:

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 50229e38d384..84e3ccf47e2a 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5532,7 +5532,6 @@ int map_pages_to_xen(
 
  out:
     L3T_UNLOCK(current_l3page);
-    unmap_domain_page(pl2e);
     unmap_domain_page(pl3e);
     unmap_domain_page(pl2e);
     return rc;
@@ -5830,6 +5829,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
  out:
     L3T_UNLOCK(current_l3page);
     unmap_domain_page(pl3e);
+    unmap_domain_page(pl2e);
     return rc;
 }

Hongyan
 
On Thu, 2021-04-22 at 17:21 +0100, Andrew Cooper wrote:
> On 21/04/2021 15:15, Hongyan Xia wrote:
> > From: Hongyan Xia <hongyxia@amazon.com>
> > 
> > This series rewrites all the remaining functions and finally makes the
> > switch from xenheap to domheap for Xen page tables, so that they no
> > longer need to rely on the direct map, which is a big step towards
> > removing the direct map.
> 
> Staging is broken.  Xen hits an assertion just after dom0 starts.
> 
> (XEN) Freed 616kB init memory
> mapping kernel into physical memory
> about to get started...
> (XEN) Assertion 'hashent->refcnt' failed at domain_page.c:204
> (XEN) ----[ Xen-4.16-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82d040316f80>] unmap_domain_page+0x2af/0x2e0
> (XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor (d0v0)
> (XEN) rax: 0000000000000000   rbx: ffff831c47bf9040   rcx: ffff831c47c1a000
> (XEN) rdx: 0000000000000092   rsi: 0000000000000092   rdi: 0000000000000206
> (XEN) rbp: ffff8300a5ca7c88   rsp: ffff8300a5ca7c78   r8:  0000000001c4f2fc
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
> (XEN) r12: 0000000000092018   r13: 0000000000800163   r14: fff0000000000000
> (XEN) r15: 0000000000000001   cr0: 0000000080050033   cr4: 00000000003406e0
> (XEN) cr3: 0000001c42008000   cr2: ffffc9000133d000
> (XEN) fsb: 0000000000000000   gsb: ffff888266a00000   gss:
> 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen code around <ffff82d040316f80>
> (unmap_domain_page+0x2af/0x2e0):
> (XEN)  14 04 00 00 eb 19 0f 0b <0f> 0b 0f 0b ba 00 00 00 00 48 89 10
> 48
> 8b 81 d0
> (XEN) Xen stack trace from rsp=ffff8300a5ca7c78:
> (XEN)    ffff820040092018 0000000000000000 ffff8300a5ca7d58
> ffff82d040327e20
> (XEN)    a000000000000000 0000000000000000 ffff82d0405dbd40
> 008001e300000000
> (XEN)    8000000000000000 8000000000000000 00000000000001e3
> 00000000000001e3
> (XEN)    8000000000000000 0000000000000000 8000000000000163
> 0000000001440000
> (XEN)    ffff82e0014b92e0 0000000301c1a000 0000000000000000
> ffff820040090800
> (XEN)    00000000026c10d8 0000000001c4f2fc 8010001c4240f067
> ffff8300a5ca7df0
> (XEN)    ffff82c00071c000 0000000000000001 0000000000001000
> ffff8300a5ca7df8
> (XEN)    ffff8300a5ca7dc8 ffff82d040232c08 ffff8300a5ca7db8
> 0000000140088078
> (XEN)    ffff8300a5ca7df0 0080016300000001 ffffffff00000000
> ffff82c00071c000
> (XEN)    ffff82d0405b1300 ffff831c47bf9000 ffff82e04d821ae0
> 00000000026c10d7
> (XEN)    ffff831c47c1a000 0000000000000100 ffff8300a5ca7dd8
> ffff82d040232cdb
> (XEN)    ffff8300a5ca7df8 ffff82d04031718b ffff8300a5ca7df8
> 00000000026c10d7
> (XEN)    ffff8300a5ca7e38 ffff82d040209cb6 ffff831c47c1a018
> 0000000000000000
> (XEN)    ffffffff82003e90 ffff831c47c1a018 ffff831c47bf9000
> fffffffffffffff2
> (XEN)    ffff8300a5ca7eb8 ffff82d04020a69a ffff82d04038a228
> ffff82d04038a21c
> (XEN)    00000000026c10d7 0000000000000100 ffff82d04038a228
> ffff82d04038a21c
> (XEN)    ffff82d04038a228 ffff82d04038a21c ffff82d04038a228
> ffff8300a5ca7ef8
> (XEN)    ffff831c47bf9000 0000000000000003 0000000000000000
> 0000000000000000
> (XEN)    ffff8300a5ca7ee8 ffff82d040306e14 ffff82d04038a228
> ffff831c47bf9000
> (XEN)    0000000000000000 0000000000000000 00007cff5a3580e7
> ffff82d04038a29d
> (XEN) Xen call trace:
> (XEN)    [<ffff82d040316f80>] R unmap_domain_page+0x2af/0x2e0
> (XEN)    [<ffff82d040327e20>] F map_pages_to_xen+0x101a/0x1166
> (XEN)    [<ffff82d040232c08>] F __vmap+0x332/0x3cd
> (XEN)    [<ffff82d040232cdb>] F vmap+0x38/0x3a
> (XEN)    [<ffff82d04031718b>] F map_domain_page_global+0x46/0x51
> (XEN)    [<ffff82d040209cb6>] F map_vcpu_info+0x129/0x2c5
> (XEN)    [<ffff82d04020a69a>] F do_vcpu_op+0x1eb/0x681
> (XEN)    [<ffff82d040306e14>] F pv_hypercall+0x4e6/0x53d
> (XEN)    [<ffff82d04038a29d>] F lstar_enter+0x12d/0x140
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'hashent->refcnt' failed at domain_page.c:204
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> 
> I don't see an obvious candidate for the breakage.  Unless someone can
> point one out quickly, I'll revert the lot to unblock staging.
> 
> ~Andrew




* Re: [PATCH v10 00/13] switch to domheap for Xen page tables
  2021-04-22 16:35   ` Hongyan Xia
@ 2021-04-22 17:27     ` Julien Grall
  2021-04-22 17:28     ` Andrew Cooper
  1 sibling, 0 replies; 21+ messages in thread
From: Julien Grall @ 2021-04-22 17:27 UTC (permalink / raw)
  To: Hongyan Xia, Andrew Cooper, xen-devel
  Cc: jgrall, Jan Beulich, Roger Pau Monné, Wei Liu

Hi Hongyan,

On 22/04/2021 17:35, Hongyan Xia wrote:
> Please see my reply in 03/13. Can you check this diff and see if you
> can still trigger this issue:

I can reproduce the same issue as Andrew. I have applied the patch and 
can confirm that it resolves the problem. Can you send a formal patch?

BTW, feel free to add my Tested-by.

Cheers,

-- 
Julien Grall



* Re: [PATCH v10 00/13] switch to domheap for Xen page tables
  2021-04-22 16:35   ` Hongyan Xia
  2021-04-22 17:27     ` Julien Grall
@ 2021-04-22 17:28     ` Andrew Cooper
  1 sibling, 0 replies; 21+ messages in thread
From: Andrew Cooper @ 2021-04-22 17:28 UTC (permalink / raw)
  To: Hongyan Xia, xen-devel; +Cc: jgrall, Jan Beulich, Roger Pau Monné, Wei Liu

On 22/04/2021 17:35, Hongyan Xia wrote:
> Please see my reply in 03/13. Can you check this diff and see if you
> can still trigger this issue:
>
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 50229e38d384..84e3ccf47e2a 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5532,7 +5532,6 @@ int map_pages_to_xen(
>  
>   out:
>      L3T_UNLOCK(current_l3page);
> -    unmap_domain_page(pl2e);
>      unmap_domain_page(pl3e);
>      unmap_domain_page(pl2e);
>      return rc;
> @@ -5830,6 +5829,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>   out:
>      L3T_UNLOCK(current_l3page);
>      unmap_domain_page(pl3e);
> +    unmap_domain_page(pl2e);
>      return rc;
>  }

Yup - that seems to fix things.

~Andrew



end of thread, other threads:[~2021-04-22 17:28 UTC | newest]

Thread overview: 21+ messages
2021-04-21 14:15 [PATCH v10 00/13] switch to domheap for Xen page tables Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 01/13] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
2021-04-22 11:54   ` Jan Beulich
2021-04-21 14:15 ` [PATCH v10 02/13] x86/mm: switch to new APIs in map_pages_to_xen Hongyan Xia
2021-04-22 12:01   ` Jan Beulich
2021-04-21 14:15 ` [PATCH v10 03/13] x86/mm: switch to new APIs in modify_xen_mappings Hongyan Xia
2021-04-22 13:10   ` Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 04/13] x86_64/mm: introduce pl2e in paging_init Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 05/13] x86_64/mm: switch to new APIs " Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 06/13] x86_64/mm: switch to new APIs in setup_m2p_table Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 07/13] efi: use new page table APIs in copy_mapping Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 08/13] efi: switch to new APIs in EFI code Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 09/13] x86/smpboot: add exit path for clone_mapping() Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 10/13] x86/smpboot: switch clone_mapping() to new APIs Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 11/13] x86/mm: drop old page table APIs Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 12/13] x86: switch to use domheap page for page tables Hongyan Xia
2021-04-21 14:15 ` [PATCH v10 13/13] x86/mm: drop _new suffix for page table APIs Hongyan Xia
2021-04-22 16:21 ` [PATCH v10 00/13] switch to domheap for Xen page tables Andrew Cooper
2021-04-22 16:35   ` Hongyan Xia
2021-04-22 17:27     ` Julien Grall
2021-04-22 17:28     ` Andrew Cooper
