* [PATCH v6 00/15] switch to domheap for Xen page tables
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

This series rewrites all the remaining functions and finally switches
Xen page tables from the xenheap to the domheap, so that they no longer
rely on the direct map. This is a big step towards removing the direct
map.
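
As a rough sketch of the conversion at a typical call site (distilled
from the patches below; all names are the real APIs used in this
series):

    /* Before: xenheap page, reachable via the direct map. */
    l2_pgentry_t *l2t = alloc_xen_pagetable();

    if ( !l2t )
        return -ENOMEM;
    clear_page(l2t);
    /* ... use l2t through the direct map ... */
    free_xen_pagetable(l2t);

    /* After: domheap page, mapped only while in use. */
    mfn_t l2mfn = alloc_xen_pagetable_new();

    if ( mfn_eq(l2mfn, INVALID_MFN) )
        return -ENOMEM;
    l2t = map_domain_page(l2mfn);
    clear_page(l2t);
    /* ... use l2t via the transient mapping ... */
    UNMAP_DOMAIN_PAGE(l2t);
    free_xen_pagetable_new(l2mfn);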

This series depends on the following two mini-series:
https://lists.xenproject.org/archives/html/xen-devel/2020-04/msg00730.html
https://lists.xenproject.org/archives/html/xen-devel/2020-04/msg00940.html

Since v1, 9 patches have already been merged. Apart from the first 2
patches in this series, the rest have not received comments since v1, so
they contain only my own modifications relative to v1.

---
Changed in v6:
- drop the patches that have already been merged.
- rebase and cleanup.
- rewrite map_pages_to_xen() and modify_xen_mappings() in a way that
  does not require an end_of_loop goto label.

Hongyan Xia (2):
  x86/mm: drop old page table APIs
  x86: switch to use domheap page for page tables

Wei Liu (13):
  x86/mm: map_pages_to_xen would better have one exit path
  x86/mm: make sure there is one exit path for modify_xen_mappings
  x86/mm: rewrite virt_to_xen_l*e
  x86/mm: switch to new APIs in map_pages_to_xen
  x86/mm: switch to new APIs in modify_xen_mappings
  x86_64/mm: introduce pl2e in paging_init
  x86_64/mm: switch to new APIs in paging_init
  x86_64/mm: switch to new APIs in setup_m2p_table
  efi: use new page table APIs in copy_mapping
  efi: switch to new APIs in EFI code
  x86/smpboot: clone_mapping should have one exit path
  x86/smpboot: switch pl*e to use new APIs in clone_mapping
  x86/mm: drop _new suffix for page table APIs

 xen/arch/x86/domain_page.c |  11 +-
 xen/arch/x86/efi/runtime.h |  10 +-
 xen/arch/x86/mm.c          | 262 +++++++++++++++++++++++--------------
 xen/arch/x86/setup.c       |   4 +-
 xen/arch/x86/smpboot.c     |  80 +++++++----
 xen/arch/x86/x86_64/mm.c   |  81 +++++++-----
 xen/common/efi/boot.c      |  60 ++++++---
 xen/common/efi/efi.h       |   3 +-
 xen/common/efi/runtime.c   |   8 +-
 xen/common/vmap.c          |   1 +
 xen/include/asm-x86/mm.h   |   6 +-
 xen/include/asm-x86/page.h |  13 +-
 12 files changed, 339 insertions(+), 200 deletions(-)

-- 
2.24.1.AMZN




* [PATCH v6 01/15] x86/mm: map_pages_to_xen would better have one exit path
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

We will soon rewrite the function to map and unmap page tables
dynamically. Since dynamic mappings may be created and torn down in
different iterations of the while loop, we need to lift pl3e out of the
loop.

No functional change.
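
The resulting shape of the function is roughly:

    int map_pages_to_xen(...)
    {
        l3_pgentry_t *pl3e, ol3e;   /* lifted out of the loop */
        int rc = -ENOMEM;

        while ( nr_mfns != 0 )
        {
            pl3e = virt_to_xen_l3e(virt);
            if ( !pl3e )
                goto out;           /* single exit path */
            /* ... */
        }

        rc = 0;
     out:
        return rc;
    }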

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

---
Changed since v4:
- drop the end_of_loop goto label.

Changed since v3:
- remove asserts on rc since rc never gets changed to anything else.
- reword commit message.
---
 xen/arch/x86/mm.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 355c50ff91..0d7f1f1265 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5073,9 +5073,11 @@ int map_pages_to_xen(
     unsigned int flags)
 {
     bool locking = system_state > SYS_STATE_boot;
+    l3_pgentry_t *pl3e, ol3e;
     l2_pgentry_t *pl2e, ol2e;
     l1_pgentry_t *pl1e, ol1e;
     unsigned int  i;
+    int rc = -ENOMEM;
 
 #define flush_flags(oldf) do {                 \
     unsigned int o_ = (oldf);                  \
@@ -5093,10 +5095,11 @@ int map_pages_to_xen(
 
     while ( nr_mfns != 0 )
     {
-        l3_pgentry_t ol3e, *pl3e = virt_to_xen_l3e(virt);
+        pl3e = virt_to_xen_l3e(virt);
 
         if ( !pl3e )
-            return -ENOMEM;
+            goto out;
+
         ol3e = *pl3e;
 
         if ( cpu_has_page1gb &&
@@ -5186,7 +5189,7 @@ int map_pages_to_xen(
 
             l2t = alloc_xen_pagetable();
             if ( l2t == NULL )
-                return -ENOMEM;
+                goto out;
 
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 l2e_write(l2t + i,
@@ -5215,7 +5218,7 @@ int map_pages_to_xen(
 
         pl2e = virt_to_xen_l2e(virt);
         if ( !pl2e )
-            return -ENOMEM;
+            goto out;
 
         if ( ((((virt >> PAGE_SHIFT) | mfn_x(mfn)) &
                ((1u << PAGETABLE_ORDER) - 1)) == 0) &&
@@ -5259,7 +5262,7 @@ int map_pages_to_xen(
             {
                 pl1e = virt_to_xen_l1e(virt);
                 if ( pl1e == NULL )
-                    return -ENOMEM;
+                    goto out;
             }
             else if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
             {
@@ -5287,7 +5290,7 @@ int map_pages_to_xen(
 
                 l1t = alloc_xen_pagetable();
                 if ( l1t == NULL )
-                    return -ENOMEM;
+                    goto out;
 
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     l1e_write(&l1t[i],
@@ -5433,7 +5436,10 @@ int map_pages_to_xen(
 
 #undef flush_flags
 
-    return 0;
+    rc = 0;
+
+ out:
+    return rc;
 }
 
 int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
-- 
2.24.1.AMZN




* [PATCH v6 02/15] x86/mm: make sure there is one exit path for modify_xen_mappings
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

We will soon need to map and unmap page tables dynamically in this
function. Since pl3e may be mapped and unmapped in different
iterations, lift it out of the loop.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

---
Changed since v4:
- drop the end_of_loop goto label.

Changed since v3:
- remove asserts on rc since it never gets changed to anything else.
---
 xen/arch/x86/mm.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 0d7f1f1265..13a34a8d57 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5462,10 +5462,12 @@ int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
 int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 {
     bool locking = system_state > SYS_STATE_boot;
+    l3_pgentry_t *pl3e;
     l2_pgentry_t *pl2e;
     l1_pgentry_t *pl1e;
     unsigned int  i;
     unsigned long v = s;
+    int rc = -ENOMEM;
 
     /* Set of valid PTE bits which may be altered. */
 #define FLAGS_MASK (_PAGE_NX|_PAGE_RW|_PAGE_PRESENT)
@@ -5476,7 +5478,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
     while ( v < e )
     {
-        l3_pgentry_t *pl3e = virt_to_xen_l3e(v);
+        pl3e = virt_to_xen_l3e(v);
 
         if ( !pl3e || !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
         {
@@ -5509,7 +5511,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             /* PAGE1GB: shatter the superpage and fall through. */
             l2t = alloc_xen_pagetable();
             if ( !l2t )
-                return -ENOMEM;
+                goto out;
+
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 l2e_write(l2t + i,
                           l2e_from_pfn(l3e_get_pfn(*pl3e) +
@@ -5566,7 +5569,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 /* PSE: shatter the superpage and try again. */
                 l1t = alloc_xen_pagetable();
                 if ( !l1t )
-                    return -ENOMEM;
+                    goto out;
+
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     l1e_write(&l1t[i],
                               l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
@@ -5699,7 +5703,10 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     flush_area(NULL, FLUSH_TLB_GLOBAL);
 
 #undef FLAGS_MASK
-    return 0;
+    rc = 0;
+
+ out:
+    return rc;
 }
 
 #undef flush_area
-- 
2.24.1.AMZN




* [PATCH v6 03/15] x86/mm: rewrite virt_to_xen_l*e
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Rewrite those functions to use the new APIs. Modify their callers to
unmap the returned pointer.

Note that the change to virt_to_xen_l1e() also requires vmap_to_mfn()
to unmap the page, which in turn requires the domain_page.h header in
vmap.c.
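
The contract for callers is now (sketch):

    l3_pgentry_t *pl3e = virt_to_xen_l3e(v);

    if ( !pl3e )
        return -ENOMEM;
    /* ... read or update *pl3e ... */
    unmap_domain_page(pl3e);    /* caller must unmap */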

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/domain_page.c | 11 +++--
 xen/arch/x86/mm.c          | 85 +++++++++++++++++++++++++++-----------
 xen/common/vmap.c          |  1 +
 xen/include/asm-x86/page.h |  8 +++-
 4 files changed, 76 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index dd32712d2f..3a244bb500 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -333,21 +333,24 @@ void unmap_domain_page_global(const void *ptr)
 mfn_t domain_page_map_to_mfn(const void *ptr)
 {
     unsigned long va = (unsigned long)ptr;
-    const l1_pgentry_t *pl1e;
+    l1_pgentry_t l1e;
 
     if ( va >= DIRECTMAP_VIRT_START )
         return _mfn(virt_to_mfn(ptr));
 
     if ( va >= VMAP_VIRT_START && va < VMAP_VIRT_END )
     {
-        pl1e = virt_to_xen_l1e(va);
+        const l1_pgentry_t *pl1e = virt_to_xen_l1e(va);
+
         BUG_ON(!pl1e);
+        l1e = *pl1e;
+        unmap_domain_page(pl1e);
     }
     else
     {
         ASSERT(va >= MAPCACHE_VIRT_START && va < MAPCACHE_VIRT_END);
-        pl1e = &__linear_l1_table[l1_linear_offset(va)];
+        l1e = __linear_l1_table[l1_linear_offset(va)];
     }
 
-    return l1e_get_mfn(*pl1e);
+    return l1e_get_mfn(l1e);
 }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 13a34a8d57..3328d887c4 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4955,6 +4955,10 @@ void free_xen_pagetable_new(mfn_t mfn)
 
 static DEFINE_SPINLOCK(map_pgdir_lock);
 
+/*
+ * For virt_to_xen_lXe() functions, they take a virtual address and return a
+ * pointer to Xen's LX entry. Caller needs to unmap the pointer.
+ */
 static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
 {
     l4_pgentry_t *pl4e;
@@ -4963,33 +4967,36 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
     if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
-        l3_pgentry_t *l3t = alloc_xen_pagetable();
+        l3_pgentry_t *l3t;
+        mfn_t l3mfn = alloc_xen_pagetable_new();
 
-        if ( !l3t )
+        if ( mfn_eq(l3mfn, INVALID_MFN) )
             return NULL;
+        l3t = map_domain_page(l3mfn);
         clear_page(l3t);
+        UNMAP_DOMAIN_PAGE(l3t);
         if ( locking )
             spin_lock(&map_pgdir_lock);
         if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
         {
-            l4_pgentry_t l4e = l4e_from_paddr(__pa(l3t), __PAGE_HYPERVISOR);
+            l4_pgentry_t l4e = l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
 
             l4e_write(pl4e, l4e);
             efi_update_l4_pgtable(l4_table_offset(v), l4e);
-            l3t = NULL;
+            l3mfn = INVALID_MFN;
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        if ( l3t )
-            free_xen_pagetable(l3t);
+        free_xen_pagetable_new(l3mfn);
     }
 
-    return l4e_to_l3e(*pl4e) + l3_table_offset(v);
+    return map_l3t_from_l4e(*pl4e) + l3_table_offset(v);
 }
 
 static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
 {
     l3_pgentry_t *pl3e;
+    l2_pgentry_t *pl2e;
 
     pl3e = virt_to_xen_l3e(v);
     if ( !pl3e )
@@ -4998,31 +5005,40 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
-        l2_pgentry_t *l2t = alloc_xen_pagetable();
+        l2_pgentry_t *l2t;
+        mfn_t l2mfn = alloc_xen_pagetable_new();
 
-        if ( !l2t )
+        if ( mfn_eq(l2mfn, INVALID_MFN) )
+        {
+            UNMAP_DOMAIN_PAGE(pl3e);
             return NULL;
+        }
+        l2t = map_domain_page(l2mfn);
         clear_page(l2t);
+        UNMAP_DOMAIN_PAGE(l2t);
         if ( locking )
             spin_lock(&map_pgdir_lock);
         if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
         {
-            l3e_write(pl3e, l3e_from_paddr(__pa(l2t), __PAGE_HYPERVISOR));
-            l2t = NULL;
+            l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+            l2mfn = INVALID_MFN;
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        if ( l2t )
-            free_xen_pagetable(l2t);
+        free_xen_pagetable_new(l2mfn);
     }
 
     BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
-    return l3e_to_l2e(*pl3e) + l2_table_offset(v);
+    pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(v);
+    unmap_domain_page(pl3e);
+
+    return pl2e;
 }
 
 l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 {
     l2_pgentry_t *pl2e;
+    l1_pgentry_t *pl1e;
 
     pl2e = virt_to_xen_l2e(v);
     if ( !pl2e )
@@ -5031,26 +5047,34 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
     if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
-        l1_pgentry_t *l1t = alloc_xen_pagetable();
+        l1_pgentry_t *l1t;
+        mfn_t l1mfn = alloc_xen_pagetable_new();
 
-        if ( !l1t )
+        if ( mfn_eq(l1mfn, INVALID_MFN) )
+        {
+            UNMAP_DOMAIN_PAGE(pl2e);
             return NULL;
+        }
+        l1t = map_domain_page(l1mfn);
         clear_page(l1t);
+        UNMAP_DOMAIN_PAGE(l1t);
         if ( locking )
             spin_lock(&map_pgdir_lock);
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
-            l2e_write(pl2e, l2e_from_paddr(__pa(l1t), __PAGE_HYPERVISOR));
-            l1t = NULL;
+            l2e_write(pl2e, l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR));
+            l1mfn = INVALID_MFN;
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        if ( l1t )
-            free_xen_pagetable(l1t);
+        free_xen_pagetable_new(l1mfn);
     }
 
     BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
-    return l2e_to_l1e(*pl2e) + l1_table_offset(v);
+    pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(v);
+    unmap_domain_page(pl2e);
+
+    return pl1e;
 }
 
 /* Convert to from superpage-mapping flags for map_pages_to_xen(). */
@@ -5073,8 +5097,8 @@ int map_pages_to_xen(
     unsigned int flags)
 {
     bool locking = system_state > SYS_STATE_boot;
-    l3_pgentry_t *pl3e, ol3e;
-    l2_pgentry_t *pl2e, ol2e;
+    l3_pgentry_t *pl3e = NULL, ol3e;
+    l2_pgentry_t *pl2e = NULL, ol2e;
     l1_pgentry_t *pl1e, ol1e;
     unsigned int  i;
     int rc = -ENOMEM;
@@ -5095,6 +5119,10 @@ int map_pages_to_xen(
 
     while ( nr_mfns != 0 )
     {
+        /* Clean up mappings mapped in the previous iteration. */
+        UNMAP_DOMAIN_PAGE(pl3e);
+        UNMAP_DOMAIN_PAGE(pl2e);
+
         pl3e = virt_to_xen_l3e(virt);
 
         if ( !pl3e )
@@ -5260,9 +5288,12 @@ int map_pages_to_xen(
             /* Normal page mapping. */
             if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
             {
+                /* This forces the mapping to be populated. */
                 pl1e = virt_to_xen_l1e(virt);
                 if ( pl1e == NULL )
                     goto out;
+
+                UNMAP_DOMAIN_PAGE(pl1e);
             }
             else if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
             {
@@ -5439,6 +5470,8 @@ int map_pages_to_xen(
     rc = 0;
 
  out:
+    UNMAP_DOMAIN_PAGE(pl2e);
+    UNMAP_DOMAIN_PAGE(pl3e);
     return rc;
 }
 
@@ -5462,7 +5495,7 @@ int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
 int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 {
     bool locking = system_state > SYS_STATE_boot;
-    l3_pgentry_t *pl3e;
+    l3_pgentry_t *pl3e = NULL;
     l2_pgentry_t *pl2e;
     l1_pgentry_t *pl1e;
     unsigned int  i;
@@ -5478,6 +5511,9 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
     while ( v < e )
     {
+        /* Clean up mappings mapped in the previous iteration. */
+        UNMAP_DOMAIN_PAGE(pl3e);
+
         pl3e = virt_to_xen_l3e(v);
 
         if ( !pl3e || !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
@@ -5706,6 +5742,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     rc = 0;
 
  out:
+    UNMAP_DOMAIN_PAGE(pl3e);
     return rc;
 }
 
diff --git a/xen/common/vmap.c b/xen/common/vmap.c
index faebc1ddf1..9964ab2096 100644
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -1,6 +1,7 @@
 #ifdef VMAP_VIRT_START
 #include <xen/bitmap.h>
 #include <xen/cache.h>
+#include <xen/domain_page.h>
 #include <xen/init.h>
 #include <xen/mm.h>
 #include <xen/pfn.h>
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 5acf3d3d5a..8f711f4992 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -291,7 +291,13 @@ void copy_page_sse2(void *, const void *);
 #define pfn_to_paddr(pfn)   __pfn_to_paddr(pfn)
 #define paddr_to_pfn(pa)    __paddr_to_pfn(pa)
 #define paddr_to_pdx(pa)    pfn_to_pdx(paddr_to_pfn(pa))
-#define vmap_to_mfn(va)     _mfn(l1e_get_pfn(*virt_to_xen_l1e((unsigned long)(va))))
+
+#define vmap_to_mfn(va) ({                                                  \
+        const l1_pgentry_t *pl1e_ = virt_to_xen_l1e((unsigned long)(va));   \
+        unsigned long pfn_ = l1e_get_pfn(*pl1e_);                           \
+        unmap_domain_page(pl1e_);                                           \
+        _mfn(pfn_); })
+
 #define vmap_to_page(va)    mfn_to_page(vmap_to_mfn(va))
 
 #endif /* !defined(__ASSEMBLY__) */
-- 
2.24.1.AMZN




* [PATCH v6 04/15] x86/mm: switch to new APIs in map_pages_to_xen
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Page tables allocated in that function are now mapped and unmapped
explicitly rather than accessed through the direct map.
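
For example, freeing the L1 table hanging off an L2 entry now splits
into map, walk, unmap, then free by MFN (as in the hunks below):

    l1_pgentry_t *l1t = map_l1t_from_l2e(ol2e);

    for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
        flush_flags(l1e_get_flags(l1t[i]));
    flush_area(virt, flush_flags);
    unmap_domain_page(l1t);
    free_xen_pagetable_new(l2e_get_mfn(ol2e));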

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/mm.c | 60 ++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 3328d887c4..2c14cdd83a 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5151,7 +5151,7 @@ int map_pages_to_xen(
                 }
                 else
                 {
-                    l2_pgentry_t *l2t = l3e_to_l2e(ol3e);
+                    l2_pgentry_t *l2t = map_l2t_from_l3e(ol3e);
 
                     for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                     {
@@ -5163,10 +5163,11 @@ int map_pages_to_xen(
                         else
                         {
                             unsigned int j;
-                            const l1_pgentry_t *l1t = l2e_to_l1e(ol2e);
+                            const l1_pgentry_t *l1t = map_l1t_from_l2e(ol2e);
 
                             for ( j = 0; j < L1_PAGETABLE_ENTRIES; j++ )
                                 flush_flags(l1e_get_flags(l1t[j]));
+                            unmap_domain_page(l1t);
                         }
                     }
                     flush_area(virt, flush_flags);
@@ -5175,9 +5176,10 @@ int map_pages_to_xen(
                         ol2e = l2t[i];
                         if ( (l2e_get_flags(ol2e) & _PAGE_PRESENT) &&
                              !(l2e_get_flags(ol2e) & _PAGE_PSE) )
-                            free_xen_pagetable(l2e_to_l1e(ol2e));
+                            free_xen_pagetable_new(l2e_get_mfn(ol2e));
                     }
-                    free_xen_pagetable(l2t);
+                    unmap_domain_page(l2t);
+                    free_xen_pagetable_new(l3e_get_mfn(ol3e));
                 }
             }
 
@@ -5194,6 +5196,7 @@ int map_pages_to_xen(
             unsigned int flush_flags =
                 FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
             l2_pgentry_t *l2t;
+            mfn_t l2mfn;
 
             /* Skip this PTE if there is no change. */
             if ( ((l3e_get_pfn(ol3e) & ~(L2_PAGETABLE_ENTRIES *
@@ -5215,15 +5218,17 @@ int map_pages_to_xen(
                 continue;
             }
 
-            l2t = alloc_xen_pagetable();
-            if ( l2t == NULL )
+            l2mfn = alloc_xen_pagetable_new();
+            if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
+            l2t = map_domain_page(l2mfn);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 l2e_write(l2t + i,
                           l2e_from_pfn(l3e_get_pfn(ol3e) +
                                        (i << PAGETABLE_ORDER),
                                        l3e_get_flags(ol3e)));
+            UNMAP_DOMAIN_PAGE(l2t);
 
             if ( l3e_get_flags(ol3e) & _PAGE_GLOBAL )
                 flush_flags |= FLUSH_TLB_GLOBAL;
@@ -5233,15 +5238,15 @@ int map_pages_to_xen(
             if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
                  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
             {
-                l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(l2t),
-                                                    __PAGE_HYPERVISOR));
-                l2t = NULL;
+                l3e_write_atomic(pl3e,
+                                 l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+                l2mfn = INVALID_MFN;
             }
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
             flush_area(virt, flush_flags);
-            if ( l2t )
-                free_xen_pagetable(l2t);
+
+            free_xen_pagetable_new(l2mfn);
         }
 
         pl2e = virt_to_xen_l2e(virt);
@@ -5269,12 +5274,13 @@ int map_pages_to_xen(
                 }
                 else
                 {
-                    l1_pgentry_t *l1t = l2e_to_l1e(ol2e);
+                    l1_pgentry_t *l1t = map_l1t_from_l2e(ol2e);
 
                     for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                         flush_flags(l1e_get_flags(l1t[i]));
                     flush_area(virt, flush_flags);
-                    free_xen_pagetable(l1t);
+                    unmap_domain_page(l1t);
+                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
                 }
             }
 
@@ -5300,6 +5306,7 @@ int map_pages_to_xen(
                 unsigned int flush_flags =
                     FLUSH_TLB | FLUSH_ORDER(PAGETABLE_ORDER);
                 l1_pgentry_t *l1t;
+                mfn_t l1mfn;
 
                 /* Skip this PTE if there is no change. */
                 if ( (((l2e_get_pfn(*pl2e) & ~(L1_PAGETABLE_ENTRIES - 1)) +
@@ -5319,14 +5326,16 @@ int map_pages_to_xen(
                     goto check_l3;
                 }
 
-                l1t = alloc_xen_pagetable();
-                if ( l1t == NULL )
+                l1mfn = alloc_xen_pagetable_new();
+                if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
 
+                l1t = map_domain_page(l1mfn);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     l1e_write(&l1t[i],
                               l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
                                            lNf_to_l1f(l2e_get_flags(*pl2e))));
+                UNMAP_DOMAIN_PAGE(l1t);
 
                 if ( l2e_get_flags(*pl2e) & _PAGE_GLOBAL )
                     flush_flags |= FLUSH_TLB_GLOBAL;
@@ -5336,20 +5345,21 @@ int map_pages_to_xen(
                 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
                      (l2e_get_flags(*pl2e) & _PAGE_PSE) )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(l1t),
+                    l2e_write_atomic(pl2e, l2e_from_mfn(l1mfn,
                                                         __PAGE_HYPERVISOR));
-                    l1t = NULL;
+                    l1mfn = INVALID_MFN;
                 }
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(virt, flush_flags);
-                if ( l1t )
-                    free_xen_pagetable(l1t);
+
+                free_xen_pagetable_new(l1mfn);
             }
 
-            pl1e  = l2e_to_l1e(*pl2e) + l1_table_offset(virt);
+            pl1e  = map_l1t_from_l2e(*pl2e) + l1_table_offset(virt);
             ol1e  = *pl1e;
             l1e_write_atomic(pl1e, l1e_from_mfn(mfn, flags));
+            UNMAP_DOMAIN_PAGE(pl1e);
             if ( (l1e_get_flags(ol1e) & _PAGE_PRESENT) )
             {
                 unsigned int flush_flags = FLUSH_TLB | FLUSH_ORDER(0);
@@ -5393,12 +5403,13 @@ int map_pages_to_xen(
                     goto check_l3;
                 }
 
-                l1t = l2e_to_l1e(ol2e);
+                l1t = map_l1t_from_l2e(ol2e);
                 base_mfn = l1e_get_pfn(l1t[0]) & ~(L1_PAGETABLE_ENTRIES - 1);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     if ( (l1e_get_pfn(l1t[i]) != (base_mfn + i)) ||
                          (l1e_get_flags(l1t[i]) != flags) )
                         break;
+                UNMAP_DOMAIN_PAGE(l1t);
                 if ( i == L1_PAGETABLE_ENTRIES )
                 {
                     l2e_write_atomic(pl2e, l2e_from_pfn(base_mfn,
@@ -5408,7 +5419,7 @@ int map_pages_to_xen(
                     flush_area(virt - PAGE_SIZE,
                                FLUSH_TLB_GLOBAL |
                                FLUSH_ORDER(PAGETABLE_ORDER));
-                    free_xen_pagetable(l2e_to_l1e(ol2e));
+                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
                 }
                 else if ( locking )
                     spin_unlock(&map_pgdir_lock);
@@ -5441,7 +5452,7 @@ int map_pages_to_xen(
                 continue;
             }
 
-            l2t = l3e_to_l2e(ol3e);
+            l2t = map_l2t_from_l3e(ol3e);
             base_mfn = l2e_get_pfn(l2t[0]) & ~(L2_PAGETABLE_ENTRIES *
                                               L1_PAGETABLE_ENTRIES - 1);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
@@ -5449,6 +5460,7 @@ int map_pages_to_xen(
                       (base_mfn + (i << PAGETABLE_ORDER))) ||
                      (l2e_get_flags(l2t[i]) != l1f_to_lNf(flags)) )
                     break;
+            UNMAP_DOMAIN_PAGE(l2t);
             if ( i == L2_PAGETABLE_ENTRIES )
             {
                 l3e_write_atomic(pl3e, l3e_from_pfn(base_mfn,
@@ -5458,7 +5470,7 @@ int map_pages_to_xen(
                 flush_area(virt - PAGE_SIZE,
                            FLUSH_TLB_GLOBAL |
                            FLUSH_ORDER(2*PAGETABLE_ORDER));
-                free_xen_pagetable(l3e_to_l2e(ol3e));
+                free_xen_pagetable_new(l3e_get_mfn(ol3e));
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
-- 
2.24.1.AMZN




* [PATCH v6 05/15] x86/mm: switch to new APIs in modify_xen_mappings
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Page tables allocated in that function are now mapped and unmapped
explicitly rather than accessed through the direct map.

Note that pl2e may now be mapped and unmapped in different iterations,
so we need to add clean-ups for that.
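
The clean-ups rely on UNMAP_DOMAIN_PAGE() zeroing the pointer after
unmapping (unlike plain unmap_domain_page()), so unmapping again at the
top of the next iteration, or at the out label, is a safe no-op. A
sketch of the intended semantics (see xen/include/xen/domain_page.h for
the real definition):

    #define UNMAP_DOMAIN_PAGE(p) do { \
        unmap_domain_page(p);         \
        (p) = NULL;                   \
    } while ( false )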

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/mm.c | 57 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 2c14cdd83a..e8c16027d8 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5508,7 +5508,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 {
     bool locking = system_state > SYS_STATE_boot;
     l3_pgentry_t *pl3e = NULL;
-    l2_pgentry_t *pl2e;
+    l2_pgentry_t *pl2e = NULL;
     l1_pgentry_t *pl1e;
     unsigned int  i;
     unsigned long v = s;
@@ -5524,6 +5524,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     while ( v < e )
     {
         /* Clean up mappings mapped in the previous iteration. */
+        UNMAP_DOMAIN_PAGE(pl2e);
         UNMAP_DOMAIN_PAGE(pl3e);
 
         pl3e = virt_to_xen_l3e(v);
@@ -5541,6 +5542,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
         if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
         {
             l2_pgentry_t *l2t;
+            mfn_t l2mfn;
 
             if ( l2_table_offset(v) == 0 &&
                  l1_table_offset(v) == 0 &&
@@ -5557,35 +5559,38 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             }
 
             /* PAGE1GB: shatter the superpage and fall through. */
-            l2t = alloc_xen_pagetable();
-            if ( !l2t )
+            l2mfn = alloc_xen_pagetable_new();
+            if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
+            l2t = map_domain_page(l2mfn);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 l2e_write(l2t + i,
                           l2e_from_pfn(l3e_get_pfn(*pl3e) +
                                        (i << PAGETABLE_ORDER),
                                        l3e_get_flags(*pl3e)));
+            UNMAP_DOMAIN_PAGE(l2t);
+
             if ( locking )
                 spin_lock(&map_pgdir_lock);
             if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
                  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
             {
-                l3e_write_atomic(pl3e, l3e_from_mfn(virt_to_mfn(l2t),
-                                                    __PAGE_HYPERVISOR));
-                l2t = NULL;
+                l3e_write_atomic(pl3e,
+                                 l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+                l2mfn = INVALID_MFN;
             }
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
-            if ( l2t )
-                free_xen_pagetable(l2t);
+
+            free_xen_pagetable_new(l2mfn);
         }
 
         /*
          * The L3 entry has been verified to be present, and we've dealt with
          * 1G pages as well, so the L2 table cannot require allocation.
          */
-        pl2e = l3e_to_l2e(*pl3e) + l2_table_offset(v);
+        pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(v);
 
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
@@ -5613,41 +5618,45 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             else
             {
                 l1_pgentry_t *l1t;
-
                 /* PSE: shatter the superpage and try again. */
-                l1t = alloc_xen_pagetable();
-                if ( !l1t )
+                mfn_t l1mfn = alloc_xen_pagetable_new();
+
+                if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
 
+                l1t = map_domain_page(l1mfn);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                     l1e_write(&l1t[i],
                               l1e_from_pfn(l2e_get_pfn(*pl2e) + i,
                                            l2e_get_flags(*pl2e) & ~_PAGE_PSE));
+                UNMAP_DOMAIN_PAGE(l1t);
+
                 if ( locking )
                     spin_lock(&map_pgdir_lock);
                 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
                      (l2e_get_flags(*pl2e) & _PAGE_PSE) )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_mfn(virt_to_mfn(l1t),
+                    l2e_write_atomic(pl2e, l2e_from_mfn(l1mfn,
                                                         __PAGE_HYPERVISOR));
-                    l1t = NULL;
+                    l1mfn = INVALID_MFN;
                 }
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                if ( l1t )
-                    free_xen_pagetable(l1t);
+
+                free_xen_pagetable_new(l1mfn);
             }
         }
         else
         {
             l1_pgentry_t nl1e, *l1t;
+            mfn_t l1mfn;
 
             /*
              * Ordinary 4kB mapping: The L2 entry has been verified to be
              * present, and we've dealt with 2M pages as well, so the L1 table
              * cannot require allocation.
              */
-            pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(v);
+            pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(v);
 
             /* Confirm the caller isn't trying to create new mappings. */
             if ( !(l1e_get_flags(*pl1e) & _PAGE_PRESENT) )
@@ -5658,6 +5667,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                                (l1e_get_flags(*pl1e) & ~FLAGS_MASK) | nf);
 
             l1e_write_atomic(pl1e, nl1e);
+            UNMAP_DOMAIN_PAGE(pl1e);
             v += PAGE_SIZE;
 
             /*
@@ -5687,10 +5697,12 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 continue;
             }
 
-            l1t = l2e_to_l1e(*pl2e);
+            l1mfn = l2e_get_mfn(*pl2e);
+            l1t = map_domain_page(l1mfn);
             for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                 if ( l1e_get_intpte(l1t[i]) != 0 )
                     break;
+            UNMAP_DOMAIN_PAGE(l1t);
             if ( i == L1_PAGETABLE_ENTRIES )
             {
                 /* Empty: zap the L2E and free the L1 page. */
@@ -5698,7 +5710,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable(l1t);
+                free_xen_pagetable_new(l1mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5729,11 +5741,13 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
         {
             l2_pgentry_t *l2t;
+            mfn_t l2mfn = l3e_get_mfn(*pl3e);
 
-            l2t = l3e_to_l2e(*pl3e);
+            l2t = map_domain_page(l2mfn);
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 if ( l2e_get_intpte(l2t[i]) != 0 )
                     break;
+            UNMAP_DOMAIN_PAGE(l2t);
             if ( i == L2_PAGETABLE_ENTRIES )
             {
                 /* Empty: zap the L3E and free the L2 page. */
@@ -5741,7 +5755,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable(l2t);
+                free_xen_pagetable_new(l2mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5754,6 +5768,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     rc = 0;
 
  out:
+    UNMAP_DOMAIN_PAGE(pl2e);
     UNMAP_DOMAIN_PAGE(pl3e);
     return rc;
 }
-- 
2.24.1.AMZN




* [PATCH v6 06/15] x86_64/mm: introduce pl2e in paging_init
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Introduce pl2e as the iterating pointer so that l2_ro_mpt can be kept
pointing at the start of the page table itself.

No functional change.
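
A sketch of the intended roles of the two pointers (keeping the base
pointer intact is what lets a later patch map and unmap the table):

    if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )  /* table base */
        goto nomem;
    clear_page(l2_ro_mpt);
    pl2e = l2_ro_mpt;          /* cursor advancing entry by entry */
    /* ... */
    l2e_write(pl2e, ...);      /* writes go through the cursor */
    pl2e++;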

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/x86_64/mm.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 6557255494..35f9de4ad4 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -479,7 +479,7 @@ void __init paging_init(void)
     unsigned long i, mpt_size, va;
     unsigned int n, memflags;
     l3_pgentry_t *l3_ro_mpt;
-    l2_pgentry_t *l2_ro_mpt = NULL;
+    l2_pgentry_t *pl2e = NULL, *l2_ro_mpt = NULL;
     struct page_info *l1_pg;
 
     /*
@@ -529,7 +529,7 @@ void __init paging_init(void)
             (L2_PAGETABLE_SHIFT - 3 + PAGE_SHIFT)));
 
         if ( cpu_has_page1gb &&
-             !((unsigned long)l2_ro_mpt & ~PAGE_MASK) &&
+             !((unsigned long)pl2e & ~PAGE_MASK) &&
              (mpt_size >> L3_PAGETABLE_SHIFT) > (i >> PAGETABLE_ORDER) )
         {
             unsigned int k, holes;
@@ -589,7 +589,7 @@ void __init paging_init(void)
             memset((void *)(RDWR_MPT_VIRT_START + (i << L2_PAGETABLE_SHIFT)),
                    0xFF, 1UL << L2_PAGETABLE_SHIFT);
         }
-        if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
+        if ( !((unsigned long)pl2e & ~PAGE_MASK) )
         {
             if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
                 goto nomem;
@@ -597,13 +597,14 @@ void __init paging_init(void)
             l3e_write(&l3_ro_mpt[l3_table_offset(va)],
                       l3e_from_paddr(__pa(l2_ro_mpt),
                                      __PAGE_HYPERVISOR_RO | _PAGE_USER));
+            pl2e = l2_ro_mpt;
             ASSERT(!l2_table_offset(va));
         }
         /* NB. Cannot be GLOBAL: guest user mode should not see it. */
         if ( l1_pg )
-            l2e_write(l2_ro_mpt, l2e_from_page(
+            l2e_write(pl2e, l2e_from_page(
                 l1_pg, /*_PAGE_GLOBAL|*/_PAGE_PSE|_PAGE_USER|_PAGE_PRESENT));
-        l2_ro_mpt++;
+        pl2e++;
     }
 #undef CNT
 #undef MFN
@@ -613,6 +614,7 @@ void __init paging_init(void)
         goto nomem;
     compat_idle_pg_table_l2 = l2_ro_mpt;
     clear_page(l2_ro_mpt);
+    pl2e = l2_ro_mpt;
     /* Allocate and map the compatibility mode machine-to-phys table. */
     mpt_size = (mpt_size >> 1) + (1UL << (L2_PAGETABLE_SHIFT - 1));
     if ( mpt_size > RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START )
@@ -625,7 +627,7 @@ void __init paging_init(void)
              sizeof(*compat_machine_to_phys_mapping))
     BUILD_BUG_ON((sizeof(*frame_table) & ~sizeof(*frame_table)) % \
                  sizeof(*compat_machine_to_phys_mapping));
-    for ( i = 0; i < (mpt_size >> L2_PAGETABLE_SHIFT); i++, l2_ro_mpt++ )
+    for ( i = 0; i < (mpt_size >> L2_PAGETABLE_SHIFT); i++, pl2e++ )
     {
         memflags = MEMF_node(phys_to_nid(i <<
             (L2_PAGETABLE_SHIFT - 2 + PAGE_SHIFT)));
@@ -647,7 +649,7 @@ void __init paging_init(void)
                         (i << L2_PAGETABLE_SHIFT)),
                0xFF, 1UL << L2_PAGETABLE_SHIFT);
         /* NB. Cannot be GLOBAL as the ptes get copied into per-VM space. */
-        l2e_write(l2_ro_mpt, l2e_from_page(l1_pg, _PAGE_PSE|_PAGE_PRESENT));
+        l2e_write(pl2e, l2e_from_page(l1_pg, _PAGE_PSE|_PAGE_PRESENT));
     }
 #undef CNT
 #undef MFN
-- 
2.24.1.AMZN




* [PATCH v6 07/15] x86_64/mm: switch to new APIs in paging_init
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Map and unmap pages instead of relying on the direct map.
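
One detail worth noting: compat_idle_pg_table_l2 is referenced after
boot, so it cannot use a transient mapping; it gets a persistent one
via map_domain_page_global(). A sketch of the two patterns used here:

    /* Transient: unmapped again before leaving the function. */
    l2_ro_mpt = map_domain_page(l2_ro_mpt_mfn);
    /* ... */
    UNMAP_DOMAIN_PAGE(l2_ro_mpt);

    /* Persistent: stays valid for later users of the table. */
    compat_idle_pg_table_l2 = map_domain_page_global(l2_ro_mpt_mfn);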

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/x86_64/mm.c | 43 ++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 35f9de4ad4..3cf699d27b 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -478,9 +478,10 @@ void __init paging_init(void)
 {
     unsigned long i, mpt_size, va;
     unsigned int n, memflags;
-    l3_pgentry_t *l3_ro_mpt;
+    l3_pgentry_t *l3_ro_mpt = NULL;
     l2_pgentry_t *pl2e = NULL, *l2_ro_mpt = NULL;
     struct page_info *l1_pg;
+    mfn_t l3_ro_mpt_mfn, l2_ro_mpt_mfn;
 
     /*
      * We setup the L3s for 1:1 mapping if host support memory hotplug
@@ -493,22 +494,28 @@ void __init paging_init(void)
         if ( !(l4e_get_flags(idle_pg_table[l4_table_offset(va)]) &
               _PAGE_PRESENT) )
         {
-            l3_pgentry_t *pl3t = alloc_xen_pagetable();
+            l3_pgentry_t *pl3t;
+            mfn_t l3mfn = alloc_xen_pagetable_new();
 
-            if ( !pl3t )
+            if ( mfn_eq(l3mfn, INVALID_MFN) )
                 goto nomem;
+
+            pl3t = map_domain_page(l3mfn);
             clear_page(pl3t);
             l4e_write(&idle_pg_table[l4_table_offset(va)],
-                      l4e_from_paddr(__pa(pl3t), __PAGE_HYPERVISOR_RW));
+                      l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR_RW));
+            unmap_domain_page(pl3t);
         }
     }
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    if ( (l3_ro_mpt = alloc_xen_pagetable()) == NULL )
+    l3_ro_mpt_mfn = alloc_xen_pagetable_new();
+    if ( mfn_eq(l3_ro_mpt_mfn, INVALID_MFN) )
         goto nomem;
+    l3_ro_mpt = map_domain_page(l3_ro_mpt_mfn);
     clear_page(l3_ro_mpt);
     l4e_write(&idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)],
-              l4e_from_paddr(__pa(l3_ro_mpt), __PAGE_HYPERVISOR_RO | _PAGE_USER));
+              l4e_from_mfn(l3_ro_mpt_mfn, __PAGE_HYPERVISOR_RO | _PAGE_USER));
 
     /*
      * Allocate and map the machine-to-phys table.
@@ -591,12 +598,17 @@ void __init paging_init(void)
         }
         if ( !((unsigned long)pl2e & ~PAGE_MASK) )
         {
-            if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
+            UNMAP_DOMAIN_PAGE(l2_ro_mpt);
+
+            l2_ro_mpt_mfn = alloc_xen_pagetable_new();
+            if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
                 goto nomem;
+
+            l2_ro_mpt = map_domain_page(l2_ro_mpt_mfn);
             clear_page(l2_ro_mpt);
             l3e_write(&l3_ro_mpt[l3_table_offset(va)],
-                      l3e_from_paddr(__pa(l2_ro_mpt),
-                                     __PAGE_HYPERVISOR_RO | _PAGE_USER));
+                      l3e_from_mfn(l2_ro_mpt_mfn,
+                                   __PAGE_HYPERVISOR_RO | _PAGE_USER));
             pl2e = l2_ro_mpt;
             ASSERT(!l2_table_offset(va));
         }
@@ -608,13 +620,16 @@ void __init paging_init(void)
     }
 #undef CNT
 #undef MFN
+    UNMAP_DOMAIN_PAGE(l2_ro_mpt);
+    UNMAP_DOMAIN_PAGE(l3_ro_mpt);
 
     /* Create user-accessible L2 directory to map the MPT for compat guests. */
-    if ( (l2_ro_mpt = alloc_xen_pagetable()) == NULL )
+    l2_ro_mpt_mfn = alloc_xen_pagetable_new();
+    if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
         goto nomem;
-    compat_idle_pg_table_l2 = l2_ro_mpt;
-    clear_page(l2_ro_mpt);
-    pl2e = l2_ro_mpt;
+    compat_idle_pg_table_l2 = map_domain_page_global(l2_ro_mpt_mfn);
+    clear_page(compat_idle_pg_table_l2);
+    pl2e = compat_idle_pg_table_l2;
     /* Allocate and map the compatibility mode machine-to-phys table. */
     mpt_size = (mpt_size >> 1) + (1UL << (L2_PAGETABLE_SHIFT - 1));
     if ( mpt_size > RDWR_COMPAT_MPT_VIRT_END - RDWR_COMPAT_MPT_VIRT_START )
@@ -662,6 +677,8 @@ void __init paging_init(void)
     return;
 
  nomem:
+    UNMAP_DOMAIN_PAGE(l2_ro_mpt);
+    UNMAP_DOMAIN_PAGE(l3_ro_mpt);
     panic("Not enough memory for m2p table\n");
 }
 
-- 
2.24.1.AMZN




* [PATCH v6 08/15] x86_64/mm: switch to new APIs in setup_m2p_table
From: Hongyan Xia @ 2020-04-24 14:08 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Also reduce the scope of l2_ro_mpt, since it is only used inside the
loop, and remove the two-line l2_ro_mpt check, which serves no purpose.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/x86_64/mm.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 3cf699d27b..12e9dc6eb2 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -379,13 +379,13 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
 {
     unsigned long i, va, smap, emap;
     unsigned int n;
-    l2_pgentry_t *l2_ro_mpt = NULL;
     l3_pgentry_t *l3_ro_mpt = NULL;
     int ret = 0;
 
     ASSERT(l4e_get_flags(idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)])
             & _PAGE_PRESENT);
-    l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]);
+    l3_ro_mpt = map_l3t_from_l4e(
+                    idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]);
 
     smap = (info->spfn & (~((1UL << (L2_PAGETABLE_SHIFT - 3)) -1)));
     emap = ((info->epfn + ((1UL << (L2_PAGETABLE_SHIFT - 3)) - 1 )) &
@@ -424,6 +424,7 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
                 break;
         if ( n < CNT )
         {
+            l2_pgentry_t *l2_ro_mpt;
             mfn_t mfn = alloc_hotadd_mfn(info);
 
             ret = map_pages_to_xen(
@@ -440,30 +441,30 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
                   _PAGE_PSE));
             if ( l3e_get_flags(l3_ro_mpt[l3_table_offset(va)]) &
               _PAGE_PRESENT )
-                l2_ro_mpt = l3e_to_l2e(l3_ro_mpt[l3_table_offset(va)]) +
-                  l2_table_offset(va);
+                l2_ro_mpt = map_l2t_from_l3e(l3_ro_mpt[l3_table_offset(va)]);
             else
             {
-                l2_ro_mpt = alloc_xen_pagetable();
-                if ( !l2_ro_mpt )
+                mfn_t l2_ro_mpt_mfn = alloc_xen_pagetable_new();
+
+                if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
                 {
                     ret = -ENOMEM;
                     goto error;
                 }
 
+                l2_ro_mpt = map_domain_page(l2_ro_mpt_mfn);
                 clear_page(l2_ro_mpt);
                 l3e_write(&l3_ro_mpt[l3_table_offset(va)],
-                          l3e_from_paddr(__pa(l2_ro_mpt),
-                                         __PAGE_HYPERVISOR_RO | _PAGE_USER));
-                l2_ro_mpt += l2_table_offset(va);
+                          l3e_from_mfn(l2_ro_mpt_mfn,
+                                       __PAGE_HYPERVISOR_RO | _PAGE_USER));
             }
+            l2_ro_mpt += l2_table_offset(va);
 
             /* NB. Cannot be GLOBAL: guest user mode should not see it. */
             l2e_write(l2_ro_mpt, l2e_from_mfn(mfn,
                    /*_PAGE_GLOBAL|*/_PAGE_PSE|_PAGE_USER|_PAGE_PRESENT));
+            unmap_domain_page(l2_ro_mpt);
         }
-        if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
-            l2_ro_mpt = NULL;
         i += ( 1UL << (L2_PAGETABLE_SHIFT - 3));
     }
 #undef CNT
@@ -471,6 +472,7 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
 
     ret = setup_compat_m2p_table(info);
 error:
+    UNMAP_DOMAIN_PAGE(l3_ro_mpt);
     return ret;
 }
 
-- 
2.24.1.AMZN




* [PATCH v6 09/15] efi: use new page table APIs in copy_mapping
From: Hongyan Xia @ 2020-04-24 14:09 UTC
  To: xen-devel; +Cc: julien, Jan Beulich

From: Wei Liu <wei.liu2@citrix.com>

After inspection, ARM does not have alloc_xen_pagetable(), so this
function is x86-only, which means it is safe for us to change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/efi/boot.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index a6f84c945a..8e96ef8de2 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1454,16 +1454,20 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
             continue;
         if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
         {
-            l3dst = alloc_xen_pagetable();
-            BUG_ON(!l3dst);
+            mfn_t l3mfn = alloc_xen_pagetable_new();
+
+            BUG_ON(mfn_eq(l3mfn, INVALID_MFN));
+            l3dst = map_domain_page(l3mfn);
             clear_page(l3dst);
             efi_l4_pgtable[l4_table_offset(mfn << PAGE_SHIFT)] =
-                l4e_from_paddr(virt_to_maddr(l3dst), __PAGE_HYPERVISOR);
+                l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
         }
         else
-            l3dst = l4e_to_l3e(l4e);
-        l3src = l4e_to_l3e(idle_pg_table[l4_table_offset(va)]);
+            l3dst = map_l3t_from_l4e(l4e);
+        l3src = map_l3t_from_l4e(idle_pg_table[l4_table_offset(va)]);
         l3dst[l3_table_offset(mfn << PAGE_SHIFT)] = l3src[l3_table_offset(va)];
+        unmap_domain_page(l3src);
+        unmap_domain_page(l3dst);
     }
 }
 
-- 
2.24.1.AMZN




* [PATCH v6 10/15] efi: switch to new APIs in EFI code
From: Hongyan Xia @ 2020-04-24 14:09 UTC
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/efi/runtime.h | 10 +++++++--
 xen/common/efi/boot.c      | 46 ++++++++++++++++++++++++++------------
 xen/common/efi/efi.h       |  3 ++-
 xen/common/efi/runtime.c   |  8 +++----
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/efi/runtime.h b/xen/arch/x86/efi/runtime.h
index d9eb8f5c27..d2483645c6 100644
--- a/xen/arch/x86/efi/runtime.h
+++ b/xen/arch/x86/efi/runtime.h
@@ -1,12 +1,18 @@
+#include <xen/domain_page.h>
+#include <xen/mm.h>
 #include <asm/atomic.h>
 #include <asm/mc146818rtc.h>
 
 #ifndef COMPAT
-l4_pgentry_t *__read_mostly efi_l4_pgtable;
+mfn_t __read_mostly efi_l4_mfn = INVALID_MFN_INITIALIZER;
 
 void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e)
 {
-    if ( efi_l4_pgtable )
+    if ( !mfn_eq(efi_l4_mfn, INVALID_MFN) )
+    {
+        l4_pgentry_t *efi_l4_pgtable = map_domain_page(efi_l4_mfn);
         l4e_write(efi_l4_pgtable + l4idx, l4e);
+        unmap_domain_page(efi_l4_pgtable);
+    }
 }
 #endif
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 8e96ef8de2..715217d2a9 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -6,6 +6,7 @@
 #include <xen/compile.h>
 #include <xen/ctype.h>
 #include <xen/dmi.h>
+#include <xen/domain_page.h>
 #include <xen/init.h>
 #include <xen/keyhandler.h>
 #include <xen/lib.h>
@@ -1442,6 +1443,7 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
                                                  unsigned long emfn))
 {
     unsigned long next;
+    l4_pgentry_t *efi_l4_pgtable = map_domain_page(efi_l4_mfn);
 
     for ( ; mfn < end; mfn = next )
     {
@@ -1469,6 +1471,8 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
         unmap_domain_page(l3src);
         unmap_domain_page(l3dst);
     }
+
+    unmap_domain_page(efi_l4_pgtable);
 }
 
 static bool __init ram_range_valid(unsigned long smfn, unsigned long emfn)
@@ -1489,6 +1493,7 @@ static bool __init rt_range_valid(unsigned long smfn, unsigned long emfn)
 void __init efi_init_memory(void)
 {
     unsigned int i;
+    l4_pgentry_t *efi_l4_pgtable;
     struct rt_extra {
         struct rt_extra *next;
         unsigned long smfn, emfn;
@@ -1603,8 +1608,9 @@ void __init efi_init_memory(void)
      * Set up 1:1 page tables for runtime calls. See SetVirtualAddressMap() in
      * efi_exit_boot().
      */
-    efi_l4_pgtable = alloc_xen_pagetable();
-    BUG_ON(!efi_l4_pgtable);
+    efi_l4_mfn = alloc_xen_pagetable_new();
+    BUG_ON(mfn_eq(efi_l4_mfn, INVALID_MFN));
+    efi_l4_pgtable = map_domain_page(efi_l4_mfn);
     clear_page(efi_l4_pgtable);
 
     copy_mapping(0, max_page, ram_range_valid);
@@ -1637,39 +1643,45 @@ void __init efi_init_memory(void)
 
         if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
         {
-            pl3e = alloc_xen_pagetable();
-            BUG_ON(!pl3e);
+            mfn_t l3mfn = alloc_xen_pagetable_new();
+
+            BUG_ON(mfn_eq(l3mfn, INVALID_MFN));
+            pl3e = map_domain_page(l3mfn);
             clear_page(pl3e);
             efi_l4_pgtable[l4_table_offset(addr)] =
-                l4e_from_paddr(virt_to_maddr(pl3e), __PAGE_HYPERVISOR);
+                l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
         }
         else
-            pl3e = l4e_to_l3e(l4e);
+            pl3e = map_l3t_from_l4e(l4e);
         pl3e += l3_table_offset(addr);
         if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
         {
-            pl2e = alloc_xen_pagetable();
-            BUG_ON(!pl2e);
+            mfn_t l2mfn = alloc_xen_pagetable_new();
+
+            BUG_ON(mfn_eq(l2mfn, INVALID_MFN));
+            pl2e = map_domain_page(l2mfn);
             clear_page(pl2e);
-            *pl3e = l3e_from_paddr(virt_to_maddr(pl2e), __PAGE_HYPERVISOR);
+            *pl3e = l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR);
         }
         else
         {
             BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
-            pl2e = l3e_to_l2e(*pl3e);
+            pl2e = map_l2t_from_l3e(*pl3e);
         }
         pl2e += l2_table_offset(addr);
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
-            l1t = alloc_xen_pagetable();
-            BUG_ON(!l1t);
+            mfn_t l1mfn = alloc_xen_pagetable_new();
+
+            BUG_ON(mfn_eq(l1mfn, INVALID_MFN));
+            l1t = map_domain_page(l1mfn);
             clear_page(l1t);
-            *pl2e = l2e_from_paddr(virt_to_maddr(l1t), __PAGE_HYPERVISOR);
+            *pl2e = l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR);
         }
         else
         {
             BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
-            l1t = l2e_to_l1e(*pl2e);
+            l1t = map_l1t_from_l2e(*pl2e);
         }
         for ( i = l1_table_offset(addr);
               i < L1_PAGETABLE_ENTRIES && extra->smfn < extra->emfn;
@@ -1681,11 +1693,17 @@ void __init efi_init_memory(void)
             extra_head = extra->next;
             xfree(extra);
         }
+
+        unmap_domain_page(l1t);
+        unmap_domain_page(pl2e);
+        unmap_domain_page(pl3e);
     }
 
     /* Insert Xen mappings. */
     for ( i = l4_table_offset(HYPERVISOR_VIRT_START);
           i < l4_table_offset(DIRECTMAP_VIRT_END); ++i )
         efi_l4_pgtable[i] = idle_pg_table[i];
+
+    unmap_domain_page(efi_l4_pgtable);
 }
 #endif
diff --git a/xen/common/efi/efi.h b/xen/common/efi/efi.h
index 2e38d05f3d..e364bae3e0 100644
--- a/xen/common/efi/efi.h
+++ b/xen/common/efi/efi.h
@@ -6,6 +6,7 @@
 #include <efi/eficapsule.h>
 #include <efi/efiapi.h>
 #include <xen/efi.h>
+#include <xen/mm.h>
 #include <xen/spinlock.h>
 #include <asm/page.h>
 
@@ -29,7 +30,7 @@ extern UINTN efi_memmap_size, efi_mdesc_size;
 extern void *efi_memmap;
 
 #ifdef CONFIG_X86
-extern l4_pgentry_t *efi_l4_pgtable;
+extern mfn_t efi_l4_mfn;
 #endif
 
 extern const struct efi_pci_rom *efi_pci_roms;
diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c
index 95367694b5..375b94229e 100644
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -85,7 +85,7 @@ struct efi_rs_state efi_rs_enter(void)
     static const u32 mxcsr = MXCSR_DEFAULT;
     struct efi_rs_state state = { .cr3 = 0 };
 
-    if ( !efi_l4_pgtable )
+    if ( mfn_eq(efi_l4_mfn, INVALID_MFN) )
         return state;
 
     state.cr3 = read_cr3();
@@ -111,7 +111,7 @@ struct efi_rs_state efi_rs_enter(void)
         lgdt(&gdt_desc);
     }
 
-    switch_cr3_cr4(virt_to_maddr(efi_l4_pgtable), read_cr4());
+    switch_cr3_cr4(mfn_to_maddr(efi_l4_mfn), read_cr4());
 
     return state;
 }
@@ -140,9 +140,9 @@ void efi_rs_leave(struct efi_rs_state *state)
 
 bool efi_rs_using_pgtables(void)
 {
-    return efi_l4_pgtable &&
+    return !mfn_eq(efi_l4_mfn, INVALID_MFN) &&
            (smp_processor_id() == efi_rs_on_cpu) &&
-           (read_cr3() == virt_to_maddr(efi_l4_pgtable));
+           (read_cr3() == mfn_to_maddr(efi_l4_mfn));
 }
 
 unsigned long efi_get_time(void)
-- 
2.24.1.AMZN



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 11/15] x86/smpboot: clone_mapping should have one exit path
  2020-04-24 14:08 [PATCH v6 00/15] switch to domheap for Xen page tables Hongyan Xia
                   ` (9 preceding siblings ...)
  2020-04-24 14:09 ` [PATCH v6 10/15] efi: switch to new APIs in EFI code Hongyan Xia
@ 2020-04-24 14:09 ` Hongyan Xia
  2020-04-30 14:59   ` Jan Beulich
  2020-04-24 14:09 ` [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping Hongyan Xia
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Hongyan Xia @ 2020-04-24 14:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

We will soon need to clean up page table mappings in the exit path.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/smpboot.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 275ce7661d..5b0e24f925 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -675,6 +675,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     l3_pgentry_t *pl3e;
     l2_pgentry_t *pl2e;
     l1_pgentry_t *pl1e;
+    int rc = -ENOMEM;
 
     /*
      * Sanity check 'linear'.  We only allow cloning from the Xen virtual
@@ -715,7 +716,10 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
             pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
             flags = l1e_get_flags(*pl1e);
             if ( !(flags & _PAGE_PRESENT) )
-                return 0;
+            {
+                rc = 0;
+                goto out;
+            }
             pfn = l1e_get_pfn(*pl1e);
         }
     }
@@ -724,7 +728,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     {
         pl3e = alloc_xen_pagetable();
         if ( !pl3e )
-            return -ENOMEM;
+            goto out;
         clear_page(pl3e);
         l4e_write(&rpt[root_table_offset(linear)],
                   l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR));
@@ -738,7 +742,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     {
         pl2e = alloc_xen_pagetable();
         if ( !pl2e )
-            return -ENOMEM;
+            goto out;
         clear_page(pl2e);
         l3e_write(pl3e, l3e_from_paddr(__pa(pl2e), __PAGE_HYPERVISOR));
     }
@@ -754,7 +758,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     {
         pl1e = alloc_xen_pagetable();
         if ( !pl1e )
-            return -ENOMEM;
+            goto out;
         clear_page(pl1e);
         l2e_write(pl2e, l2e_from_paddr(__pa(pl1e), __PAGE_HYPERVISOR));
     }
@@ -775,7 +779,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     else
         l1e_write(pl1e, l1e_from_pfn(pfn, flags));
 
-    return 0;
+    rc = 0;
+ out:
+    return rc;
 }
 
 DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
-- 
2.24.1.AMZN



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping
  2020-04-24 14:08 [PATCH v6 00/15] switch to domheap for Xen page tables Hongyan Xia
                   ` (10 preceding siblings ...)
  2020-04-24 14:09 ` [PATCH v6 11/15] x86/smpboot: clone_mapping should have one exit path Hongyan Xia
@ 2020-04-24 14:09 ` Hongyan Xia
  2020-04-30 15:15   ` Jan Beulich
  2020-04-24 14:09 ` [PATCH v6 13/15] x86/mm: drop old page table APIs Hongyan Xia
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Hongyan Xia @ 2020-04-24 14:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/smpboot.c | 54 +++++++++++++++++++++++++++---------------
 1 file changed, 35 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 5b0e24f925..0e0ae56c76 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -672,9 +672,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 {
     unsigned long linear = (unsigned long)ptr, pfn;
     unsigned int flags;
-    l3_pgentry_t *pl3e;
-    l2_pgentry_t *pl2e;
-    l1_pgentry_t *pl1e;
+    l3_pgentry_t *pl3e = NULL;
+    l2_pgentry_t *pl2e = NULL;
+    l1_pgentry_t *pl1e = NULL;
     int rc = -ENOMEM;
 
     /*
@@ -689,8 +689,8 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
          (linear >= XEN_VIRT_END && linear < DIRECTMAP_VIRT_START) )
         return -EINVAL;
 
-    pl3e = l4e_to_l3e(idle_pg_table[root_table_offset(linear)]) +
-        l3_table_offset(linear);
+    pl3e = map_l3t_from_l4e(idle_pg_table[root_table_offset(linear)]);
+    pl3e += l3_table_offset(linear);
 
     flags = l3e_get_flags(*pl3e);
     ASSERT(flags & _PAGE_PRESENT);
@@ -702,7 +702,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
     }
     else
     {
-        pl2e = l3e_to_l2e(*pl3e) + l2_table_offset(linear);
+        pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(linear);
         flags = l2e_get_flags(*pl2e);
         ASSERT(flags & _PAGE_PRESENT);
         if ( flags & _PAGE_PSE )
@@ -713,7 +713,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
         }
         else
         {
-            pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
+            pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(linear);
             flags = l1e_get_flags(*pl1e);
             if ( !(flags & _PAGE_PRESENT) )
             {
@@ -724,48 +724,61 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
         }
     }
 
+    UNMAP_DOMAIN_PAGE(pl1e);
+    UNMAP_DOMAIN_PAGE(pl2e);
+    UNMAP_DOMAIN_PAGE(pl3e);
+
     if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
     {
-        pl3e = alloc_xen_pagetable();
-        if ( !pl3e )
+        mfn_t l3mfn = alloc_xen_pagetable_new();
+
+        if ( mfn_eq(l3mfn, INVALID_MFN) )
             goto out;
+
+        pl3e = map_domain_page(l3mfn);
         clear_page(pl3e);
         l4e_write(&rpt[root_table_offset(linear)],
-                  l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR));
+                  l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR));
     }
     else
-        pl3e = l4e_to_l3e(rpt[root_table_offset(linear)]);
+        pl3e = map_l3t_from_l4e(rpt[root_table_offset(linear)]);
 
     pl3e += l3_table_offset(linear);
 
     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
     {
-        pl2e = alloc_xen_pagetable();
-        if ( !pl2e )
+        mfn_t l2mfn = alloc_xen_pagetable_new();
+
+        if ( mfn_eq(l2mfn, INVALID_MFN) )
             goto out;
+
+        pl2e = map_domain_page(l2mfn);
         clear_page(pl2e);
-        l3e_write(pl3e, l3e_from_paddr(__pa(pl2e), __PAGE_HYPERVISOR));
+        l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
     }
     else
     {
         ASSERT(!(l3e_get_flags(*pl3e) & _PAGE_PSE));
-        pl2e = l3e_to_l2e(*pl3e);
+        pl2e = map_l2t_from_l3e(*pl3e);
     }
 
     pl2e += l2_table_offset(linear);
 
     if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
     {
-        pl1e = alloc_xen_pagetable();
-        if ( !pl1e )
+        mfn_t l1mfn = alloc_xen_pagetable_new();
+
+        if ( mfn_eq(l1mfn, INVALID_MFN) )
             goto out;
+
+        pl1e = map_domain_page(l1mfn);
         clear_page(pl1e);
-        l2e_write(pl2e, l2e_from_paddr(__pa(pl1e), __PAGE_HYPERVISOR));
+        l2e_write(pl2e, l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR));
     }
     else
     {
         ASSERT(!(l2e_get_flags(*pl2e) & _PAGE_PSE));
-        pl1e = l2e_to_l1e(*pl2e);
+        pl1e = map_l1t_from_l2e(*pl2e);
     }
 
     pl1e += l1_table_offset(linear);
@@ -781,6 +794,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 
     rc = 0;
  out:
+    UNMAP_DOMAIN_PAGE(pl1e);
+    UNMAP_DOMAIN_PAGE(pl2e);
+    UNMAP_DOMAIN_PAGE(pl3e);
     return rc;
 }
 
-- 
2.24.1.AMZN



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 13/15] x86/mm: drop old page table APIs
  2020-04-24 14:08 [PATCH v6 00/15] switch to domheap for Xen page tables Hongyan Xia
                   ` (11 preceding siblings ...)
  2020-04-24 14:09 ` [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping Hongyan Xia
@ 2020-04-24 14:09 ` Hongyan Xia
  2020-05-20 10:09   ` Jan Beulich
  2020-04-24 14:09 ` [PATCH v6 14/15] x86: switch to use domheap page for page tables Hongyan Xia
  2020-04-24 14:09 ` [PATCH v6 15/15] x86/mm: drop _new suffix for page table APIs Hongyan Xia
  14 siblings, 1 reply; 29+ messages in thread
From: Hongyan Xia @ 2020-04-24 14:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

Two sets of old APIs, alloc/free_xen_pagetable() and lXe_to_lYe(), are
now dropped to avoid the dependency on the direct map.

There are two special cases which have not yet been rewritten into the
new APIs and thus need special treatment:

rpt in smpboot.c cannot use ephemeral mappings yet. The problem is that
rpt is read and written in context switch code, but the mapping
infrastructure is NOT context-switch-safe, meaning we cannot map rpt in
one domain and unmap in another. Until the mapping infrastructure
supports context switches, rpt has to be globally mapped.

Also, lXe_to_lYe() during Xen image relocation cannot be converted into
map/unmap pairs. We cannot hold on to mappings while the mapping
infrastructure is being relocated! It is enough to remove the direct map
in the second e820 pass, so we still use the direct map (<4GiB) in Xen
relocation (which is during the first e820 pass).

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/mm.c          | 14 --------------
 xen/arch/x86/setup.c       |  4 ++--
 xen/arch/x86/smpboot.c     |  4 ++--
 xen/include/asm-x86/mm.h   |  2 --
 xen/include/asm-x86/page.h |  5 -----
 5 files changed, 4 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index e8c16027d8..749b6f23e5 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4913,20 +4913,6 @@ int mmcfg_intercept_write(
     return X86EMUL_OKAY;
 }
 
-void *alloc_xen_pagetable(void)
-{
-    mfn_t mfn = alloc_xen_pagetable_new();
-
-    return mfn_eq(mfn, INVALID_MFN) ? NULL : mfn_to_virt(mfn_x(mfn));
-}
-
-void free_xen_pagetable(void *v)
-{
-    mfn_t mfn = v ? virt_to_mfn(v) : INVALID_MFN;
-
-    free_xen_pagetable_new(mfn);
-}
-
 /*
  * For these PTE APIs, the caller must follow the alloc-map-unmap-free
  * lifecycle, which means explicitly mapping the PTE pages before accessing
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 885919d5c3..c76fbf80dc 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1114,7 +1114,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                     continue;
                 *pl4e = l4e_from_intpte(l4e_get_intpte(*pl4e) +
                                         xen_phys_start);
-                pl3e = l4e_to_l3e(*pl4e);
+                pl3e = __va(l4e_get_paddr(*pl4e));
                 for ( j = 0; j < L3_PAGETABLE_ENTRIES; j++, pl3e++ )
                 {
                     /* Not present, 1GB mapping, or already relocated? */
@@ -1124,7 +1124,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         continue;
                     *pl3e = l3e_from_intpte(l3e_get_intpte(*pl3e) +
                                             xen_phys_start);
-                    pl2e = l3e_to_l2e(*pl3e);
+                    pl2e = __va(l3e_get_paddr(*pl3e));
                     for ( k = 0; k < L2_PAGETABLE_ENTRIES; k++, pl2e++ )
                     {
                         /* Not present, PSE, or already relocated? */
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 0e0ae56c76..71d61794ec 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -815,7 +815,7 @@ static int setup_cpu_root_pgt(unsigned int cpu)
     if ( !opt_xpti_hwdom && !opt_xpti_domu )
         return 0;
 
-    rpt = alloc_xen_pagetable();
+    rpt = alloc_xenheap_page();
     if ( !rpt )
         return -ENOMEM;
 
@@ -919,7 +919,7 @@ static void cleanup_cpu_root_pgt(unsigned int cpu)
         free_xen_pagetable_new(l3mfn);
     }
 
-    free_xen_pagetable(rpt);
+    free_xenheap_page(rpt);
 
     /* Also zap the stub mapping for this CPU. */
     if ( stub_linear )
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 3d3f9d49ac..cf855b48fd 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -583,8 +583,6 @@ int vcpu_destroy_pagetables(struct vcpu *);
 void *do_page_walk(struct vcpu *v, unsigned long addr);
 
 /* Allocator functions for Xen pagetables. */
-void *alloc_xen_pagetable(void);
-void free_xen_pagetable(void *v);
 mfn_t alloc_xen_pagetable_new(void);
 void free_xen_pagetable_new(mfn_t mfn);
 
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 8f711f4992..2e3056c08e 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -188,11 +188,6 @@ static inline l4_pgentry_t l4e_from_paddr(paddr_t pa, unsigned int flags)
 #define l4e_has_changed(x,y,flags) \
     ( !!(((x).l4 ^ (y).l4) & ((PADDR_MASK&PAGE_MASK)|put_pte_flags(flags))) )
 
-/* Pagetable walking. */
-#define l2e_to_l1e(x)              ((l1_pgentry_t *)__va(l2e_get_paddr(x)))
-#define l3e_to_l2e(x)              ((l2_pgentry_t *)__va(l3e_get_paddr(x)))
-#define l4e_to_l3e(x)              ((l3_pgentry_t *)__va(l4e_get_paddr(x)))
-
 #define map_l1t_from_l2e(x)        (l1_pgentry_t *)map_domain_page(l2e_get_mfn(x))
 #define map_l2t_from_l3e(x)        (l2_pgentry_t *)map_domain_page(l3e_get_mfn(x))
 #define map_l3t_from_l4e(x)        (l3_pgentry_t *)map_domain_page(l4e_get_mfn(x))
-- 
2.24.1.AMZN



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 14/15] x86: switch to use domheap page for page tables
  2020-04-24 14:08 [PATCH v6 00/15] switch to domheap for Xen page tables Hongyan Xia
                   ` (12 preceding siblings ...)
  2020-04-24 14:09 ` [PATCH v6 13/15] x86/mm: drop old page table APIs Hongyan Xia
@ 2020-04-24 14:09 ` Hongyan Xia
  2020-04-24 14:09 ` [PATCH v6 15/15] x86/mm: drop _new suffix for page table APIs Hongyan Xia
  14 siblings, 0 replies; 29+ messages in thread
From: Hongyan Xia @ 2020-04-24 14:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/mm.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 749b6f23e5..7e212cc3e0 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4923,10 +4923,11 @@ mfn_t alloc_xen_pagetable_new(void)
 {
     if ( system_state != SYS_STATE_early_boot )
     {
-        void *ptr = alloc_xenheap_page();
 
-        BUG_ON(!hardware_domain && !ptr);
-        return ptr ? virt_to_mfn(ptr) : INVALID_MFN;
+        struct page_info *pg = alloc_domheap_page(NULL, 0);
+
+        BUG_ON(!hardware_domain && !pg);
+        return pg ? page_to_mfn(pg) : INVALID_MFN;
     }
 
     return alloc_boot_pages(1, 1);
@@ -4936,7 +4937,7 @@ mfn_t alloc_xen_pagetable_new(void)
 void free_xen_pagetable_new(mfn_t mfn)
 {
     if ( system_state != SYS_STATE_early_boot && !mfn_eq(mfn, INVALID_MFN) )
-        free_xenheap_page(mfn_to_virt(mfn_x(mfn)));
+        free_domheap_page(mfn_to_page(mfn));
 }
 
 static DEFINE_SPINLOCK(map_pgdir_lock);
-- 
2.24.1.AMZN



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v6 15/15] x86/mm: drop _new suffix for page table APIs
  2020-04-24 14:08 [PATCH v6 00/15] switch to domheap for Xen page tables Hongyan Xia
                   ` (13 preceding siblings ...)
  2020-04-24 14:09 ` [PATCH v6 14/15] x86: switch to use domheap page for page tables Hongyan Xia
@ 2020-04-24 14:09 ` Hongyan Xia
  2020-05-20  9:56   ` Jan Beulich
  14 siblings, 1 reply; 29+ messages in thread
From: Hongyan Xia @ 2020-04-24 14:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/mm.c        | 48 ++++++++++++++++++++--------------------
 xen/arch/x86/smpboot.c   | 12 +++++-----
 xen/arch/x86/x86_64/mm.c | 10 ++++-----
 xen/common/efi/boot.c    | 10 ++++-----
 xen/include/asm-x86/mm.h |  4 ++--
 5 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7e212cc3e0..a17ae0004a 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -356,7 +356,7 @@ void __init arch_init_memory(void)
             ASSERT(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS);
             if ( l4_table_offset(split_va) == l4_table_offset(split_va - 1) )
             {
-                mfn_t l3mfn = alloc_xen_pagetable_new();
+                mfn_t l3mfn = alloc_xen_pagetable();
 
                 if ( !mfn_eq(l3mfn, INVALID_MFN) )
                 {
@@ -4919,7 +4919,7 @@ int mmcfg_intercept_write(
  * them. The caller must check whether the allocation has succeeded, and only
  * pass valid MFNs to map_domain_page().
  */
-mfn_t alloc_xen_pagetable_new(void)
+mfn_t alloc_xen_pagetable(void)
 {
     if ( system_state != SYS_STATE_early_boot )
     {
@@ -4934,7 +4934,7 @@ mfn_t alloc_xen_pagetable_new(void)
 }
 
 /* mfn can be INVALID_MFN */
-void free_xen_pagetable_new(mfn_t mfn)
+void free_xen_pagetable(mfn_t mfn)
 {
     if ( system_state != SYS_STATE_early_boot && !mfn_eq(mfn, INVALID_MFN) )
         free_domheap_page(mfn_to_page(mfn));
@@ -4955,7 +4955,7 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
     {
         bool locking = system_state > SYS_STATE_boot;
         l3_pgentry_t *l3t;
-        mfn_t l3mfn = alloc_xen_pagetable_new();
+        mfn_t l3mfn = alloc_xen_pagetable();
 
         if ( mfn_eq(l3mfn, INVALID_MFN) )
             return NULL;
@@ -4974,7 +4974,7 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        free_xen_pagetable_new(l3mfn);
+        free_xen_pagetable(l3mfn);
     }
 
     return map_l3t_from_l4e(*pl4e) + l3_table_offset(v);
@@ -4993,7 +4993,7 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
     {
         bool locking = system_state > SYS_STATE_boot;
         l2_pgentry_t *l2t;
-        mfn_t l2mfn = alloc_xen_pagetable_new();
+        mfn_t l2mfn = alloc_xen_pagetable();
 
         if ( mfn_eq(l2mfn, INVALID_MFN) )
         {
@@ -5012,7 +5012,7 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        free_xen_pagetable_new(l2mfn);
+        free_xen_pagetable(l2mfn);
     }
 
     BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
@@ -5035,7 +5035,7 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
     {
         bool locking = system_state > SYS_STATE_boot;
         l1_pgentry_t *l1t;
-        mfn_t l1mfn = alloc_xen_pagetable_new();
+        mfn_t l1mfn = alloc_xen_pagetable();
 
         if ( mfn_eq(l1mfn, INVALID_MFN) )
         {
@@ -5054,7 +5054,7 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
         }
         if ( locking )
             spin_unlock(&map_pgdir_lock);
-        free_xen_pagetable_new(l1mfn);
+        free_xen_pagetable(l1mfn);
     }
 
     BUG_ON(l2e_get_flags(*pl2e) & _PAGE_PSE);
@@ -5163,10 +5163,10 @@ int map_pages_to_xen(
                         ol2e = l2t[i];
                         if ( (l2e_get_flags(ol2e) & _PAGE_PRESENT) &&
                              !(l2e_get_flags(ol2e) & _PAGE_PSE) )
-                            free_xen_pagetable_new(l2e_get_mfn(ol2e));
+                            free_xen_pagetable(l2e_get_mfn(ol2e));
                     }
                     unmap_domain_page(l2t);
-                    free_xen_pagetable_new(l3e_get_mfn(ol3e));
+                    free_xen_pagetable(l3e_get_mfn(ol3e));
                 }
             }
 
@@ -5205,7 +5205,7 @@ int map_pages_to_xen(
                 continue;
             }
 
-            l2mfn = alloc_xen_pagetable_new();
+            l2mfn = alloc_xen_pagetable();
             if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
@@ -5233,7 +5233,7 @@ int map_pages_to_xen(
                 spin_unlock(&map_pgdir_lock);
             flush_area(virt, flush_flags);
 
-            free_xen_pagetable_new(l2mfn);
+            free_xen_pagetable(l2mfn);
         }
 
         pl2e = virt_to_xen_l2e(virt);
@@ -5267,7 +5267,7 @@ int map_pages_to_xen(
                         flush_flags(l1e_get_flags(l1t[i]));
                     flush_area(virt, flush_flags);
                     unmap_domain_page(l1t);
-                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
+                    free_xen_pagetable(l2e_get_mfn(ol2e));
                 }
             }
 
@@ -5313,7 +5313,7 @@ int map_pages_to_xen(
                     goto check_l3;
                 }
 
-                l1mfn = alloc_xen_pagetable_new();
+                l1mfn = alloc_xen_pagetable();
                 if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
 
@@ -5340,7 +5340,7 @@ int map_pages_to_xen(
                     spin_unlock(&map_pgdir_lock);
                 flush_area(virt, flush_flags);
 
-                free_xen_pagetable_new(l1mfn);
+                free_xen_pagetable(l1mfn);
             }
 
             pl1e  = map_l1t_from_l2e(*pl2e) + l1_table_offset(virt);
@@ -5406,7 +5406,7 @@ int map_pages_to_xen(
                     flush_area(virt - PAGE_SIZE,
                                FLUSH_TLB_GLOBAL |
                                FLUSH_ORDER(PAGETABLE_ORDER));
-                    free_xen_pagetable_new(l2e_get_mfn(ol2e));
+                    free_xen_pagetable(l2e_get_mfn(ol2e));
                 }
                 else if ( locking )
                     spin_unlock(&map_pgdir_lock);
@@ -5457,7 +5457,7 @@ int map_pages_to_xen(
                 flush_area(virt - PAGE_SIZE,
                            FLUSH_TLB_GLOBAL |
                            FLUSH_ORDER(2*PAGETABLE_ORDER));
-                free_xen_pagetable_new(l3e_get_mfn(ol3e));
+                free_xen_pagetable(l3e_get_mfn(ol3e));
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5546,7 +5546,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             }
 
             /* PAGE1GB: shatter the superpage and fall through. */
-            l2mfn = alloc_xen_pagetable_new();
+            l2mfn = alloc_xen_pagetable();
             if ( mfn_eq(l2mfn, INVALID_MFN) )
                 goto out;
 
@@ -5570,7 +5570,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
 
-            free_xen_pagetable_new(l2mfn);
+            free_xen_pagetable(l2mfn);
         }
 
         /*
@@ -5606,7 +5606,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             {
                 l1_pgentry_t *l1t;
                 /* PSE: shatter the superpage and try again. */
-                mfn_t l1mfn = alloc_xen_pagetable_new();
+                mfn_t l1mfn = alloc_xen_pagetable();
 
                 if ( mfn_eq(l1mfn, INVALID_MFN) )
                     goto out;
@@ -5630,7 +5630,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
 
-                free_xen_pagetable_new(l1mfn);
+                free_xen_pagetable(l1mfn);
             }
         }
         else
@@ -5697,7 +5697,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable_new(l1mfn);
+                free_xen_pagetable(l1mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
@@ -5742,7 +5742,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
                 flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
-                free_xen_pagetable_new(l2mfn);
+                free_xen_pagetable(l2mfn);
             }
             else if ( locking )
                 spin_unlock(&map_pgdir_lock);
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 71d61794ec..06c8e3ddf0 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -730,7 +730,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 
     if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
     {
-        mfn_t l3mfn = alloc_xen_pagetable_new();
+        mfn_t l3mfn = alloc_xen_pagetable();
 
         if ( mfn_eq(l3mfn, INVALID_MFN) )
             goto out;
@@ -747,7 +747,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 
     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
     {
-        mfn_t l2mfn = alloc_xen_pagetable_new();
+        mfn_t l2mfn = alloc_xen_pagetable();
 
         if ( mfn_eq(l2mfn, INVALID_MFN) )
             goto out;
@@ -766,7 +766,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 
     if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
     {
-        mfn_t l1mfn = alloc_xen_pagetable_new();
+        mfn_t l1mfn = alloc_xen_pagetable();
 
         if ( mfn_eq(l1mfn, INVALID_MFN) )
             goto out;
@@ -908,15 +908,15 @@ static void cleanup_cpu_root_pgt(unsigned int cpu)
                     continue;
 
                 ASSERT(!(l2e_get_flags(l2t[i2]) & _PAGE_PSE));
-                free_xen_pagetable_new(l2e_get_mfn(l2t[i2]));
+                free_xen_pagetable(l2e_get_mfn(l2t[i2]));
             }
 
             unmap_domain_page(l2t);
-            free_xen_pagetable_new(l2mfn);
+            free_xen_pagetable(l2mfn);
         }
 
         unmap_domain_page(l3t);
-        free_xen_pagetable_new(l3mfn);
+        free_xen_pagetable(l3mfn);
     }
 
     free_xenheap_page(rpt);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 12e9dc6eb2..d37f6a3755 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -444,7 +444,7 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
                 l2_ro_mpt = map_l2t_from_l3e(l3_ro_mpt[l3_table_offset(va)]);
             else
             {
-                mfn_t l2_ro_mpt_mfn = alloc_xen_pagetable_new();
+                mfn_t l2_ro_mpt_mfn = alloc_xen_pagetable();
 
                 if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
                 {
@@ -497,7 +497,7 @@ void __init paging_init(void)
               _PAGE_PRESENT) )
         {
             l3_pgentry_t *pl3t;
-            mfn_t l3mfn = alloc_xen_pagetable_new();
+            mfn_t l3mfn = alloc_xen_pagetable();
 
             if ( mfn_eq(l3mfn, INVALID_MFN) )
                 goto nomem;
@@ -511,7 +511,7 @@ void __init paging_init(void)
     }
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    l3_ro_mpt_mfn = alloc_xen_pagetable_new();
+    l3_ro_mpt_mfn = alloc_xen_pagetable();
     if ( mfn_eq(l3_ro_mpt_mfn, INVALID_MFN) )
         goto nomem;
     l3_ro_mpt = map_domain_page(l3_ro_mpt_mfn);
@@ -602,7 +602,7 @@ void __init paging_init(void)
         {
             UNMAP_DOMAIN_PAGE(l2_ro_mpt);
 
-            l2_ro_mpt_mfn = alloc_xen_pagetable_new();
+            l2_ro_mpt_mfn = alloc_xen_pagetable();
             if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
                 goto nomem;
 
@@ -626,7 +626,7 @@ void __init paging_init(void)
     UNMAP_DOMAIN_PAGE(l3_ro_mpt);
 
     /* Create user-accessible L2 directory to map the MPT for compat guests. */
-    l2_ro_mpt_mfn = alloc_xen_pagetable_new();
+    l2_ro_mpt_mfn = alloc_xen_pagetable();
     if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
         goto nomem;
     compat_idle_pg_table_l2 = map_domain_page_global(l2_ro_mpt_mfn);
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 715217d2a9..d70d06084c 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1456,7 +1456,7 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
             continue;
         if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
         {
-            mfn_t l3mfn = alloc_xen_pagetable_new();
+            mfn_t l3mfn = alloc_xen_pagetable();
 
             BUG_ON(mfn_eq(l3mfn, INVALID_MFN));
             l3dst = map_domain_page(l3mfn);
@@ -1608,7 +1608,7 @@ void __init efi_init_memory(void)
      * Set up 1:1 page tables for runtime calls. See SetVirtualAddressMap() in
      * efi_exit_boot().
      */
-    efi_l4_mfn = alloc_xen_pagetable_new();
+    efi_l4_mfn = alloc_xen_pagetable();
     BUG_ON(mfn_eq(efi_l4_mfn, INVALID_MFN));
     efi_l4_pgtable = map_domain_page(efi_l4_mfn);
     clear_page(efi_l4_pgtable);
@@ -1643,7 +1643,7 @@ void __init efi_init_memory(void)
 
         if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
         {
-            mfn_t l3mfn = alloc_xen_pagetable_new();
+            mfn_t l3mfn = alloc_xen_pagetable();
 
             BUG_ON(mfn_eq(l3mfn, INVALID_MFN));
             pl3e = map_domain_page(l3mfn);
@@ -1656,7 +1656,7 @@ void __init efi_init_memory(void)
         pl3e += l3_table_offset(addr);
         if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
         {
-            mfn_t l2mfn = alloc_xen_pagetable_new();
+            mfn_t l2mfn = alloc_xen_pagetable();
 
             BUG_ON(mfn_eq(l2mfn, INVALID_MFN));
             pl2e = map_domain_page(l2mfn);
@@ -1671,7 +1671,7 @@ void __init efi_init_memory(void)
         pl2e += l2_table_offset(addr);
         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
-            mfn_t l1mfn = alloc_xen_pagetable_new();
+            mfn_t l1mfn = alloc_xen_pagetable();
 
             BUG_ON(mfn_eq(l1mfn, INVALID_MFN));
             l1t = map_domain_page(l1mfn);
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index cf855b48fd..ef7a20ac7d 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -583,8 +583,8 @@ int vcpu_destroy_pagetables(struct vcpu *);
 void *do_page_walk(struct vcpu *v, unsigned long addr);
 
 /* Allocator functions for Xen pagetables. */
-mfn_t alloc_xen_pagetable_new(void);
-void free_xen_pagetable_new(mfn_t mfn);
+mfn_t alloc_xen_pagetable(void);
+void free_xen_pagetable(mfn_t mfn);
 
 l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
 
-- 
2.24.1.AMZN



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 09/15] efi: use new page table APIs in copy_mapping
  2020-04-24 14:09 ` [PATCH v6 09/15] efi: use new page table APIs in copy_mapping Hongyan Xia
@ 2020-04-30 12:42   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-04-30 12:42 UTC (permalink / raw)
  To: Hongyan Xia; +Cc: xen-devel, julien

On 24.04.2020 16:09, Hongyan Xia wrote:
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -1454,16 +1454,20 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
>              continue;
>          if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
>          {
> -            l3dst = alloc_xen_pagetable();
> -            BUG_ON(!l3dst);
> +            mfn_t l3mfn = alloc_xen_pagetable_new();
> +
> +            BUG_ON(mfn_eq(l3mfn, INVALID_MFN));
> +            l3dst = map_domain_page(l3mfn);
>              clear_page(l3dst);
>              efi_l4_pgtable[l4_table_offset(mfn << PAGE_SHIFT)] =
> -                l4e_from_paddr(virt_to_maddr(l3dst), __PAGE_HYPERVISOR);
> +                l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR);
>          }
>          else
> -            l3dst = l4e_to_l3e(l4e);
> -        l3src = l4e_to_l3e(idle_pg_table[l4_table_offset(va)]);
> +            l3dst = map_l3t_from_l4e(l4e);
> +        l3src = map_l3t_from_l4e(idle_pg_table[l4_table_offset(va)]);
>          l3dst[l3_table_offset(mfn << PAGE_SHIFT)] = l3src[l3_table_offset(va)];
> +        unmap_domain_page(l3src);
> +        unmap_domain_page(l3dst);
>      }
>  }

This looks very inefficient - you establish and tear down two mappings
per loop iteration, when really you may end up copying all 512 slots
between the same pair of L3 tables.
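
Something along these lines would avoid that - a sketch only, not a
tested patch, which keeps the pair of L3 mappings alive until the
iteration crosses into a different L4 slot (l4idx is a hypothetical
slot tracker):

    l3_pgentry_t *l3src = NULL, *l3dst = NULL;
    unsigned int l4idx = ~0u;

    for ( ; mfn < end; mfn = next )
    {
        unsigned long va = (unsigned long)mfn_to_virt(mfn);

        /* ... compute 'next' and do the validity check as before ... */

        if ( l4_table_offset(mfn << PAGE_SHIFT) != l4idx )
        {
            l4idx = l4_table_offset(mfn << PAGE_SHIFT);
            if ( l3dst )
                UNMAP_DOMAIN_PAGE(l3dst);
            if ( l3src )
                UNMAP_DOMAIN_PAGE(l3src);
            /* ... allocate/clear the destination L3 if absent, as before ... */
            l3dst = map_l3t_from_l4e(efi_l4_pgtable[l4idx]);
            l3src = map_l3t_from_l4e(idle_pg_table[l4_table_offset(va)]);
        }

        l3dst[l3_table_offset(mfn << PAGE_SHIFT)] = l3src[l3_table_offset(va)];
    }

    if ( l3dst )
        UNMAP_DOMAIN_PAGE(l3dst);
    if ( l3src )
        UNMAP_DOMAIN_PAGE(l3src);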

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 10/15] efi: switch to new APIs in EFI code
  2020-04-24 14:09 ` [PATCH v6 10/15] efi: switch to new APIs in EFI code Hongyan Xia
@ 2020-04-30 12:55   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-04-30 12:55 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:09, Hongyan Xia wrote:
> --- a/xen/arch/x86/efi/runtime.h
> +++ b/xen/arch/x86/efi/runtime.h
> @@ -1,12 +1,18 @@
> +#include <xen/domain_page.h>
> +#include <xen/mm.h>
>  #include <asm/atomic.h>
>  #include <asm/mc146818rtc.h>
>  
>  #ifndef COMPAT
> -l4_pgentry_t *__read_mostly efi_l4_pgtable;
> +mfn_t __read_mostly efi_l4_mfn = INVALID_MFN_INITIALIZER;
>  
>  void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e)
>  {
> -    if ( efi_l4_pgtable )
> +    if ( !mfn_eq(efi_l4_mfn, INVALID_MFN) )
> +    {
> +        l4_pgentry_t *efi_l4_pgtable = map_domain_page(efi_l4_mfn);
>          l4e_write(efi_l4_pgtable + l4idx, l4e);

Blank line between declaration(s) and statement(s) please.

Also, while I realize the choice of name of the local variable
is (presumably) to avoid further code churn, I think it isn't
really suitable for a local variable (also elsewhere below).
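
E.g. simply (sketch, hypothetical neutral name):

    l4_pgentry_t *l4t = map_domain_page(efi_l4_mfn);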

> @@ -1489,6 +1493,7 @@ static bool __init rt_range_valid(unsigned long smfn, unsigned long emfn)
>  void __init efi_init_memory(void)
>  {
>      unsigned int i;
> +    l4_pgentry_t *efi_l4_pgtable;
>      struct rt_extra {
>          struct rt_extra *next;
>          unsigned long smfn, emfn;
> @@ -1603,8 +1608,9 @@ void __init efi_init_memory(void)
>       * Set up 1:1 page tables for runtime calls. See SetVirtualAddressMap() in
>       * efi_exit_boot().
>       */
> -    efi_l4_pgtable = alloc_xen_pagetable();
> -    BUG_ON(!efi_l4_pgtable);
> +    efi_l4_mfn = alloc_xen_pagetable_new();
> +    BUG_ON(mfn_eq(efi_l4_mfn, INVALID_MFN));
> +    efi_l4_pgtable = map_domain_page(efi_l4_mfn);
>      clear_page(efi_l4_pgtable);
>  
>      copy_mapping(0, max_page, ram_range_valid);

Why don't you pass the already mapped L4 table into this function,
rather than mapping the same page a 2nd time there?
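
I.e. something like (sketch; prototype change only, with the call
sites adjusted to match):

static __init void copy_mapping(l4_pgentry_t *l4t, unsigned long mfn,
                                unsigned long end,
                                bool (*is_valid)(unsigned long smfn,
                                                 unsigned long emfn));

    /* ... and at the call site: */
    copy_mapping(efi_l4_pgtable, 0, max_page, ram_range_valid);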

> @@ -1681,11 +1693,17 @@ void __init efi_init_memory(void)
>              extra_head = extra->next;
>              xfree(extra);
>          }
> +
> +        unmap_domain_page(l1t);
> +        unmap_domain_page(pl2e);
> +        unmap_domain_page(pl3e);

All three should be pulled further up, each to the earliest
possible place (and then using the uppercase version of the
construct as suitable).

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 11/15] x86/smpboot: clone_mapping should have one exit path
  2020-04-24 14:09 ` [PATCH v6 11/15] x86/smpboot: clone_mapping should have one exit path Hongyan Xia
@ 2020-04-30 14:59   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-04-30 14:59 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:09, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>

As with patches 1 and 2 in this series, and as iirc was mentioned long
ago for those, "should" and the like are misleading here: it's not a
mistake that these functions don't have a single exit path, i.e. this
is not a bug fix. You _want_ them to have a single (main) exit path, so
that your subsequent changes end up simpler.

> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -675,6 +675,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>      l3_pgentry_t *pl3e;
>      l2_pgentry_t *pl2e;
>      l1_pgentry_t *pl1e;
> +    int rc = -ENOMEM;

Would you mind starting out with 0 here, ...

> @@ -715,7 +716,10 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>              pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
>              flags = l1e_get_flags(*pl1e);
>              if ( !(flags & _PAGE_PRESENT) )
> -                return 0;
> +            {
> +                rc = 0;
> +                goto out;
> +            }

... dropping assignment and braces here, and ...

> @@ -724,7 +728,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>      {
>          pl3e = alloc_xen_pagetable();
>          if ( !pl3e )
> -            return -ENOMEM;
> +            goto out;

... setting rc to -ENOMEM ahead of the if() up from here?
This imo not only makes things minimally shorter, but also fits
slightly better with the separate early exit paths which nevertheless
still exist after this patch.
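
I.e. roughly (sketch):

    int rc = 0;

    /* ... */
            if ( !(flags & _PAGE_PRESENT) )
                goto out;                    /* rc is already 0 */
    /* ... */

    rc = -ENOMEM;                            /* ahead of the allocations */
    if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
    {
        pl3e = alloc_xen_pagetable();
        if ( !pl3e )
            goto out;
        /* ... */
    }
    /* ... */

    rc = 0;
 out:
    return rc;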

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping
  2020-04-24 14:09 ` [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping Hongyan Xia
@ 2020-04-30 15:15   ` Jan Beulich
  2020-05-11 10:55     ` Hongyan Xia
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2020-04-30 15:15 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:09, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>

Nit: Why the emphasis on pl*e in the title? Is there anything left
unconverted in the function? IOW how about "switch clone_mapping()
to new page table APIs"?

> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -672,9 +672,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>  {
>      unsigned long linear = (unsigned long)ptr, pfn;
>      unsigned int flags;
> -    l3_pgentry_t *pl3e;
> -    l2_pgentry_t *pl2e;
> -    l1_pgentry_t *pl1e;
> +    l3_pgentry_t *pl3e = NULL;
> +    l2_pgentry_t *pl2e = NULL;
> +    l1_pgentry_t *pl1e = NULL;

The latter two need initializers, yes, but why pl3e?

> @@ -689,8 +689,8 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>           (linear >= XEN_VIRT_END && linear < DIRECTMAP_VIRT_START) )
>          return -EINVAL;
>  
> -    pl3e = l4e_to_l3e(idle_pg_table[root_table_offset(linear)]) +
> -        l3_table_offset(linear);
> +    pl3e = map_l3t_from_l4e(idle_pg_table[root_table_offset(linear)]);
> +    pl3e += l3_table_offset(linear);

By keeping the original style (a single assignment) you'd have
slightly less of a diff, and I think keeping the original style where
it doesn't collide with any of our rules is generally a good idea.
Doing so won't even make you hit the slightly flawed definition of
map_l3t_from_l4e() et al (missing outer parentheses). I notice you
do so ...

> @@ -702,7 +702,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>      }
>      else
>      {
> -        pl2e = l3e_to_l2e(*pl3e) + l2_table_offset(linear);
> +        pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(linear);
>          flags = l2e_get_flags(*pl2e);
>          ASSERT(flags & _PAGE_PRESENT);
>          if ( flags & _PAGE_PSE )
> @@ -713,7 +713,7 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>          }
>          else
>          {
> -            pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
> +            pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(linear);

... in both of these cases.

> @@ -724,48 +724,61 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>          }
>      }
>  
> +    UNMAP_DOMAIN_PAGE(pl1e);
> +    UNMAP_DOMAIN_PAGE(pl2e);
> +    UNMAP_DOMAIN_PAGE(pl3e);
> +
>      if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
>      {
> -        pl3e = alloc_xen_pagetable();
> -        if ( !pl3e )
> +        mfn_t l3mfn = alloc_xen_pagetable_new();
> +
> +        if ( mfn_eq(l3mfn, INVALID_MFN) )
>              goto out;
> +
> +        pl3e = map_domain_page(l3mfn);

Seeing this recur (in other patches too), I wonder whether we wouldn't
be better off making map_domain_page() accept INVALID_MFN and return
NULL in that case. In cases like the one here it would eliminate the
need for several local variables. Of course the downside is that we'd
then have to start checking map_domain_page()'s return value. A middle
ground could be to have

void *alloc_mapped_pagetable(mfn_t *mfn);

allowing to pass in NULL if the MFN is of no interest.
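
One possible shape of such a helper (a sketch only, not a concrete
proposal):

void *alloc_mapped_pagetable(mfn_t *mfn)
{
    mfn_t m = alloc_xen_pagetable_new();
    void *va = NULL;

    if ( !mfn_eq(m, INVALID_MFN) )
        va = map_domain_page(m);

    if ( mfn )
        *mfn = m;

    return va;
}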

> @@ -781,6 +794,9 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>  
>      rc = 0;
>   out:
> +    UNMAP_DOMAIN_PAGE(pl1e);
> +    UNMAP_DOMAIN_PAGE(pl2e);
> +    UNMAP_DOMAIN_PAGE(pl3e);
>      return rc;
>  }

I don't think the writing of NULL into the variables is necessary
here. And if the needed if()-s are of concern, then perhaps we
should consider making unmap_domain_page() finally accept NULL as
input?
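
I.e. (sketch):

void unmap_domain_page(const void *ptr)
{
    if ( !ptr )      /* new: tolerate NULL input */
        return;

    /* ... existing logic ... */
}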

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping
  2020-04-30 15:15   ` Jan Beulich
@ 2020-05-11 10:55     ` Hongyan Xia
  0 siblings, 0 replies; 29+ messages in thread
From: Hongyan Xia @ 2020-05-11 10:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On Thu, 2020-04-30 at 17:15 +0200, Jan Beulich wrote:
> On 24.04.2020 16:09, Hongyan Xia wrote:
> > From: Wei Liu <wei.liu2@citrix.com>
> 
> Nit: Why the emphasis on pl*e in the title? Is there anything left
> unconverted in the function? IOW how about "switch clone_mapping()
> to new page table APIs"?

The title seems stale. Will fix.

> ...
> > @@ -724,48 +724,61 @@ static int clone_mapping(const void *ptr,
> > root_pgentry_t *rpt)
> >          }
> >      }
> >  
> > +    UNMAP_DOMAIN_PAGE(pl1e);
> > +    UNMAP_DOMAIN_PAGE(pl2e);
> > +    UNMAP_DOMAIN_PAGE(pl3e);
> > +
> >      if ( !(root_get_flags(rpt[root_table_offset(linear)]) &
> > _PAGE_PRESENT) )
> >      {
> > -        pl3e = alloc_xen_pagetable();
> > -        if ( !pl3e )
> > +        mfn_t l3mfn = alloc_xen_pagetable_new();
> > +
> > +        if ( mfn_eq(l3mfn, INVALID_MFN) )
> >              goto out;
> > +
> > +        pl3e = map_domain_page(l3mfn);
> 
> Seeing this recur (from other patches) I wonder whether we wouldn't
> better make map_domain_page() accept INVALID_MFN and return NULL in
> this case. In cases like the one here it would eliminate the need
> for several local variables. Of course the downside of this is that
> then we'll have to start checking map_domain_page()'s return value.
> A middle ground could be to have
> 
> void *alloc_mapped_pagetable(mfn_t *mfn);
> 
> allowing to pass in NULL if the MFN is of no interest.

I would say that when the caller requires a new Xen page table
allocation, almost all the time both the mfn and the virt are needed.
Off the top of my head I cannot think of a case where we would pass in
NULL; you almost always need the mfn to write new page table entries.
So the benefit of this is just compressing two calls into one, which I
am not quite sure is worth it.

> > @@ -781,6 +794,9 @@ static int clone_mapping(const void *ptr,
> > root_pgentry_t *rpt)
> >  
> >      rc = 0;
> >   out:
> > +    UNMAP_DOMAIN_PAGE(pl1e);
> > +    UNMAP_DOMAIN_PAGE(pl2e);
> > +    UNMAP_DOMAIN_PAGE(pl3e);
> >      return rc;
> >  }
> 
> I don't think the writing of NULL into the variables is necessary
> here. And if the needed if()-s are of concern, then perhaps we
> should consider making unmap_domain_page() finally accept NULL as
> input?

I usually don't have a problem with this, because a sane compiler
would remove the unnecessary clearing, so I would use the macro version
as much as possible. I am okay with moving the NULL check into unmap()
itself, but note that this also needs changes on the Arm side.

Hongyan



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 03/15] x86/mm: rewrite virt_to_xen_l*e
  2020-04-24 14:08 ` [PATCH v6 03/15] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
@ 2020-05-20  9:27   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-05-20  9:27 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, xen-devel, Roger Pau Monné

On 24.04.2020 16:08, Hongyan Xia wrote:
> @@ -4998,31 +5005,40 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
>      if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
>      {
>          bool locking = system_state > SYS_STATE_boot;
> -        l2_pgentry_t *l2t = alloc_xen_pagetable();
> +        l2_pgentry_t *l2t;
> +        mfn_t l2mfn = alloc_xen_pagetable_new();
>  
> -        if ( !l2t )
> +        if ( mfn_eq(l2mfn, INVALID_MFN) )
> +        {
> +            UNMAP_DOMAIN_PAGE(pl3e);
>              return NULL;
> +        }
> +        l2t = map_domain_page(l2mfn);
>          clear_page(l2t);
> +        UNMAP_DOMAIN_PAGE(l2t);
>          if ( locking )
>              spin_lock(&map_pgdir_lock);
>          if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
>          {
> -            l3e_write(pl3e, l3e_from_paddr(__pa(l2t), __PAGE_HYPERVISOR));
> -            l2t = NULL;
> +            l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
> +            l2mfn = INVALID_MFN;
>          }
>          if ( locking )
>              spin_unlock(&map_pgdir_lock);
> -        if ( l2t )
> -            free_xen_pagetable(l2t);
> +        free_xen_pagetable_new(l2mfn);
>      }
>  
>      BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);
> -    return l3e_to_l2e(*pl3e) + l2_table_offset(v);
> +    pl2e = map_l2t_from_l3e(*pl3e) + l2_table_offset(v);
> +    unmap_domain_page(pl3e);

To avoid undue pressure on the number of active mappings I'd like
to ask that you unmap first, then establish the new mapping.
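
I.e. something like (sketch; l3e is a hypothetical local snapshot of
the entry, taken before the unmap):

    BUG_ON(l3e_get_flags(*pl3e) & _PAGE_PSE);

    {
        l3_pgentry_t l3e = *pl3e;

        UNMAP_DOMAIN_PAGE(pl3e);
        return map_l2t_from_l3e(l3e) + l2_table_offset(v);
    }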

> @@ -5095,6 +5119,10 @@ int map_pages_to_xen(
>  
>      while ( nr_mfns != 0 )
>      {
> +        /* Clean up mappings mapped in the previous iteration. */
> +        UNMAP_DOMAIN_PAGE(pl3e);
> +        UNMAP_DOMAIN_PAGE(pl2e);
> +
>          pl3e = virt_to_xen_l3e(virt);

As you don't add a comment here (and at other call sites of
virt_to_xen_l<N>e()), ...

> @@ -5260,9 +5288,12 @@ int map_pages_to_xen(
>              /* Normal page mapping. */
>              if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
>              {
> +                /* This forces the mapping to be populated. */
>                  pl1e = virt_to_xen_l1e(virt);

... I don't see why one needs adding here.

> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -291,7 +291,13 @@ void copy_page_sse2(void *, const void *);
>  #define pfn_to_paddr(pfn)   __pfn_to_paddr(pfn)
>  #define paddr_to_pfn(pa)    __paddr_to_pfn(pa)
>  #define paddr_to_pdx(pa)    pfn_to_pdx(paddr_to_pfn(pa))
> -#define vmap_to_mfn(va)     _mfn(l1e_get_pfn(*virt_to_xen_l1e((unsigned long)(va))))
> +
> +#define vmap_to_mfn(va) ({                                                  \
> +        const l1_pgentry_t *pl1e_ = virt_to_xen_l1e((unsigned long)(va));   \
> +        unsigned long pfn_ = l1e_get_pfn(*pl1e_);                           \

l1e_get_mfn()?

> +        unmap_domain_page(pl1e_);                                           \
> +        _mfn(pfn_); })
> +
>  #define vmap_to_page(va)    mfn_to_page(vmap_to_mfn(va))
>  
>  #endif /* !defined(__ASSEMBLY__) */
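
To illustrate the l1e_get_mfn() remark above - a sketch of the
adjusted macro:

#define vmap_to_mfn(va) ({                                                  \
        const l1_pgentry_t *pl1e_ = virt_to_xen_l1e((unsigned long)(va));   \
        mfn_t mfn_ = l1e_get_mfn(*pl1e_);                                   \
        unmap_domain_page(pl1e_);                                           \
        mfn_; })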

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 06/15] x86_64/mm: introduce pl2e in paging_init
  2020-04-24 14:08 ` [PATCH v6 06/15] x86_64/mm: introduce pl2e in paging_init Hongyan Xia
@ 2020-05-20  9:35   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-05-20  9:35 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:08, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> Introduce pl2e so that we can use l2_ro_mpt to point to the page table
> itself.
> 
> No functional change.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

In principle I'm fine with the change, as a preparatory one for
the next patch. The description, however, suggests this is a
change for the sake of making a change: introducing a local
variable that's not really needed for anything.

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 07/15] x86_64/mm: switch to new APIs in paging_init
  2020-04-24 14:08 ` [PATCH v6 07/15] x86_64/mm: switch to new APIs " Hongyan Xia
@ 2020-05-20  9:46   ` Jan Beulich
  2020-05-27 15:01     ` Hongyan Xia
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2020-05-20  9:46 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:08, Hongyan Xia wrote:
> @@ -493,22 +494,28 @@ void __init paging_init(void)
>          if ( !(l4e_get_flags(idle_pg_table[l4_table_offset(va)]) &
>                _PAGE_PRESENT) )
>          {
> -            l3_pgentry_t *pl3t = alloc_xen_pagetable();
> +            l3_pgentry_t *pl3t;
> +            mfn_t l3mfn = alloc_xen_pagetable_new();
>  
> -            if ( !pl3t )
> +            if ( mfn_eq(l3mfn, INVALID_MFN) )
>                  goto nomem;
> +
> +            pl3t = map_domain_page(l3mfn);
>              clear_page(pl3t);
>              l4e_write(&idle_pg_table[l4_table_offset(va)],
> -                      l4e_from_paddr(__pa(pl3t), __PAGE_HYPERVISOR_RW));
> +                      l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR_RW));
> +            unmap_domain_page(pl3t);

This can be moved up, and once it is you'll notice that you're
open-coding clear_domain_page(). I wonder whether I didn't spot
the same in other patches of this series.
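
I.e. roughly (sketch) - the local pl3t then isn't needed at all:

            mfn_t l3mfn = alloc_xen_pagetable_new();

            if ( mfn_eq(l3mfn, INVALID_MFN) )
                goto nomem;

            clear_domain_page(l3mfn);
            l4e_write(&idle_pg_table[l4_table_offset(va)],
                      l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR_RW));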

Besides the previously raised point of possibly having an
allocation function that returns a mapping of the page right
away (not needed here) - are there many cases where the allocation
of a new page table isn't accompanied by clearing the page? If
not, should the function perhaps do the clearing itself (and then,
once it has a mapping anyway, it would be even more natural to
return that mapping to users wanting one)?
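
E.g. (sketch, with a hypothetical name):

mfn_t alloc_cleared_pagetable(void)
{
    mfn_t mfn = alloc_xen_pagetable_new();

    if ( !mfn_eq(mfn, INVALID_MFN) )
        clear_domain_page(mfn);

    return mfn;
}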

> @@ -662,6 +677,8 @@ void __init paging_init(void)
>      return;
>  
>   nomem:
> +    UNMAP_DOMAIN_PAGE(l2_ro_mpt);
> +    UNMAP_DOMAIN_PAGE(l3_ro_mpt);
>      panic("Not enough memory for m2p table\n");
>  }

I don't think this is a very useful addition.

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v6 08/15] x86_64/mm: switch to new APIs in setup_m2p_table
  2020-04-24 14:08 ` [PATCH v6 08/15] x86_64/mm: switch to new APIs in setup_m2p_table Hongyan Xia
@ 2020-05-20  9:54   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-05-20  9:54 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:08, Hongyan Xia wrote:
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -379,13 +379,13 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
>  {
>      unsigned long i, va, smap, emap;
>      unsigned int n;
> -    l2_pgentry_t *l2_ro_mpt = NULL;
>      l3_pgentry_t *l3_ro_mpt = NULL;
>      int ret = 0;
>  
>      ASSERT(l4e_get_flags(idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)])
>              & _PAGE_PRESENT);
> -    l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]);
> +    l3_ro_mpt = map_l3t_from_l4e(
> +                    idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]);
>  
>      smap = (info->spfn & (~((1UL << (L2_PAGETABLE_SHIFT - 3)) -1)));
>      emap = ((info->epfn + ((1UL << (L2_PAGETABLE_SHIFT - 3)) - 1 )) &
> @@ -424,6 +424,7 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
>                  break;
>          if ( n < CNT )
>          {
> +            l2_pgentry_t *l2_ro_mpt;
>              mfn_t mfn = alloc_hotadd_mfn(info);
>  
>              ret = map_pages_to_xen(
> @@ -440,30 +441,30 @@ static int setup_m2p_table(struct mem_hotadd_info *info)
>                    _PAGE_PSE));
>              if ( l3e_get_flags(l3_ro_mpt[l3_table_offset(va)]) &
>                _PAGE_PRESENT )
> -                l2_ro_mpt = l3e_to_l2e(l3_ro_mpt[l3_table_offset(va)]) +
> -                  l2_table_offset(va);
> +                l2_ro_mpt = map_l2t_from_l3e(l3_ro_mpt[l3_table_offset(va)]);
>              else
>              {
> -                l2_ro_mpt = alloc_xen_pagetable();
> -                if ( !l2_ro_mpt )
> +                mfn_t l2_ro_mpt_mfn = alloc_xen_pagetable_new();
> +
> +                if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
>                  {
>                      ret = -ENOMEM;
>                      goto error;
>                  }
>  
> +                l2_ro_mpt = map_domain_page(l2_ro_mpt_mfn);
>                  clear_page(l2_ro_mpt);
>                  l3e_write(&l3_ro_mpt[l3_table_offset(va)],
> -                          l3e_from_paddr(__pa(l2_ro_mpt),
> -                                         __PAGE_HYPERVISOR_RO | _PAGE_USER));
> -                l2_ro_mpt += l2_table_offset(va);
> +                          l3e_from_mfn(l2_ro_mpt_mfn,
> +                                       __PAGE_HYPERVISOR_RO | _PAGE_USER));
>              }
> +            l2_ro_mpt += l2_table_offset(va);
>  
>              /* NB. Cannot be GLOBAL: guest user mode should not see it. */
>              l2e_write(l2_ro_mpt, l2e_from_mfn(mfn,
>                     /*_PAGE_GLOBAL|*/_PAGE_PSE|_PAGE_USER|_PAGE_PRESENT));
> +            unmap_domain_page(l2_ro_mpt);
>          }
> -        if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
> -            l2_ro_mpt = NULL;

I think you want to consider retaining these two lines and the wider
scope of l2_ro_mpt, so that the same L2 page isn't mapped and
unmapped over and over on every loop iteration.
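
Roughly like this, as a hypothetical sketch with the unrelated parts
elided:

    l2_pgentry_t *l2_ro_mpt = NULL;   /* function scope, as before */

    /* ... inside the per-chunk loop ... */
    if ( !l2_ro_mpt )
        l2_ro_mpt = map_l2t_from_l3e(l3_ro_mpt[l3_table_offset(va)]);

    /* ... write l2_ro_mpt[l2_table_offset(va)] as today ... */

    /* Drop the mapping only once the last slot of this L2 page is done. */
    if ( l2_table_offset(va) == L2_PAGETABLE_ENTRIES - 1 )
        UNMAP_DOMAIN_PAGE(l2_ro_mpt);  /* also resets the pointer to NULL */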

Jan



* Re: [PATCH v6 15/15] x86/mm: drop _new suffix for page table APIs
  2020-04-24 14:09 ` [PATCH v6 15/15] x86/mm: drop _new suffix for page table APIs Hongyan Xia
@ 2020-05-20  9:56   ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2020-05-20  9:56 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:09, Hongyan Xia wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> No functional change.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH v6 13/15] x86/mm: drop old page table APIs
  2020-04-24 14:09 ` [PATCH v6 13/15] x86/mm: drop old page table APIs Hongyan Xia
@ 2020-05-20 10:09   ` Jan Beulich
  2020-05-27 15:19     ` Hongyan Xia
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2020-05-20 10:09 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On 24.04.2020 16:09, Hongyan Xia wrote:
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -815,7 +815,7 @@ static int setup_cpu_root_pgt(unsigned int cpu)
>      if ( !opt_xpti_hwdom && !opt_xpti_domu )
>          return 0;
>  
> -    rpt = alloc_xen_pagetable();
> +    rpt = alloc_xenheap_page();

So the idea behind not using alloc_domheap_page() +
map_domain_page_global() here is that, in the long run,
alloc_xenheap_page() will resolve to just that? If so, while I'd have
preferred the greater flexibility until then, this is fair enough,
i.e.
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v6 07/15] x86_64/mm: switch to new APIs in paging_init
  2020-05-20  9:46   ` Jan Beulich
@ 2020-05-27 15:01     ` Hongyan Xia
  0 siblings, 0 replies; 29+ messages in thread
From: Hongyan Xia @ 2020-05-27 15:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On Wed, 2020-05-20 at 11:46 +0200, Jan Beulich wrote:
> On 24.04.2020 16:08, Hongyan Xia wrote:
> > @@ -493,22 +494,28 @@ void __init paging_init(void)
> >          if ( !(l4e_get_flags(idle_pg_table[l4_table_offset(va)]) &
> >                _PAGE_PRESENT) )
> >          {
> > -            l3_pgentry_t *pl3t = alloc_xen_pagetable();
> > +            l3_pgentry_t *pl3t;
> > +            mfn_t l3mfn = alloc_xen_pagetable_new();
> >  
> > -            if ( !pl3t )
> > +            if ( mfn_eq(l3mfn, INVALID_MFN) )
> >                  goto nomem;
> > +
> > +            pl3t = map_domain_page(l3mfn);
> >              clear_page(pl3t);
> >              l4e_write(&idle_pg_table[l4_table_offset(va)],
> > -                      l4e_from_paddr(__pa(pl3t), __PAGE_HYPERVISOR_RW));
> > +                      l4e_from_mfn(l3mfn, __PAGE_HYPERVISOR_RW));
> > +            unmap_domain_page(pl3t);
> 
> This can be moved up, and once it is you'll notice that you're
> open-coding clear_domain_page(). I wonder whether the same pattern
> slipped past me in other patches of this series.
> 
> Besides the previously raised point of possibly having an allocation
> function that returns a mapping of the page right away (not needed
> here): are there many cases where allocation of a new page table
> isn't accompanied by clearing the page? If not, should the function
> perhaps do the clearing itself (and then, once it holds a mapping
> anyway, it would be all the more natural to return that mapping to
> callers which want one)?

I grepped through all alloc_xen_pagetable() call sites. Except for
the page-shattering logic in x86/mm.c, where the whole page table
page is written immediately, all other call sites clear the page
right away, so a helper that does the clearing for you is useful. I
also looked at how the VA and the MFN are used after the call: the
MFN is almost always needed while the VA often is not, and if we
bundle the clearing into the alloc() itself, a lot of call sites
don't even need the VA.

Similar to what you suggested before, we can have:

    void *alloc_map_clear_xen_pagetable(mfn_t *mfn)

which needs to be paired with an unmap call, of course.
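
A minimal sketch of that helper, built on the primitives this series
already introduces (hypothetical, not the final code):

    void *alloc_map_clear_xen_pagetable(mfn_t *mfn)
    {
        mfn_t m = alloc_xen_pagetable_new();
        void *ptr;

        if ( mfn_eq(m, INVALID_MFN) )
            return NULL;

        ptr = map_domain_page(m);
        clear_page(ptr);
        *mfn = m;

        /* The caller pairs this with unmap_domain_page(). */
        return ptr;
    }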

> > @@ -662,6 +677,8 @@ void __init paging_init(void)
> >      return;
> >  
> >   nomem:
> > +    UNMAP_DOMAIN_PAGE(l2_ro_mpt);
> > +    UNMAP_DOMAIN_PAGE(l3_ro_mpt);
> >      panic("Not enough memory for m2p table\n");
> >  }
> 
> I don't think this is a very useful addition.

I was trying to avoid leaking further mappings on the panic path, but
panic() does not appear to create any mappings itself, so these can
be removed.

Hongyan




* Re: [PATCH v6 13/15] x86/mm: drop old page table APIs
  2020-05-20 10:09   ` Jan Beulich
@ 2020-05-27 15:19     ` Hongyan Xia
  0 siblings, 0 replies; 29+ messages in thread
From: Hongyan Xia @ 2020-05-27 15:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Roger Pau Monné, julien, Wei Liu, Andrew Cooper

On Wed, 2020-05-20 at 12:09 +0200, Jan Beulich wrote:
> On 24.04.2020 16:09, Hongyan Xia wrote:
> > --- a/xen/arch/x86/smpboot.c
> > +++ b/xen/arch/x86/smpboot.c
> > @@ -815,7 +815,7 @@ static int setup_cpu_root_pgt(unsigned int cpu)
> >      if ( !opt_xpti_hwdom && !opt_xpti_domu )
> >          return 0;
> >  
> > -    rpt = alloc_xen_pagetable();
> > +    rpt = alloc_xenheap_page();
> 
> So the idea behind not using alloc_domheap_page() +
> map_domain_page_global() here is that, in the long run,
> alloc_xenheap_page() will resolve to just that? If so, while I'd
> have preferred the greater flexibility until then, this is fair
> enough, i.e.
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

alloc_xenheap_page() has one advantage: the fast PA->VA lookup, which
the restore_all_guest logic currently relies on. If we change how
restore_all_guest works, or gain fast PA->VA lookup for globally
mapped pages, then the xenheap could probably just be removed.
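
For comparison, the domheap alternative would look roughly like this
(a hypothetical sketch; it gives up the cheap PA->VA lookup above,
and the error path would also need to free the page):

    struct page_info *pg = alloc_domheap_page(NULL, 0);

    rpt = pg ? map_domain_page_global(page_to_mfn(pg)) : NULL;
    if ( !rpt )
        return -ENOMEM;
    clear_page(rpt);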

Hongyan




end of thread

Thread overview: 29+ messages
2020-04-24 14:08 [PATCH v6 00/15] switch to domheap for Xen page tables Hongyan Xia
2020-04-24 14:08 ` [PATCH v6 01/15] x86/mm: map_pages_to_xen would better have one exit path Hongyan Xia
2020-04-24 14:08 ` [PATCH v6 02/15] x86/mm: make sure there is one exit path for modify_xen_mappings Hongyan Xia
2020-04-24 14:08 ` [PATCH v6 03/15] x86/mm: rewrite virt_to_xen_l*e Hongyan Xia
2020-05-20  9:27   ` Jan Beulich
2020-04-24 14:08 ` [PATCH v6 04/15] x86/mm: switch to new APIs in map_pages_to_xen Hongyan Xia
2020-04-24 14:08 ` [PATCH v6 05/15] x86/mm: switch to new APIs in modify_xen_mappings Hongyan Xia
2020-04-24 14:08 ` [PATCH v6 06/15] x86_64/mm: introduce pl2e in paging_init Hongyan Xia
2020-05-20  9:35   ` Jan Beulich
2020-04-24 14:08 ` [PATCH v6 07/15] x86_64/mm: switch to new APIs " Hongyan Xia
2020-05-20  9:46   ` Jan Beulich
2020-05-27 15:01     ` Hongyan Xia
2020-04-24 14:08 ` [PATCH v6 08/15] x86_64/mm: switch to new APIs in setup_m2p_table Hongyan Xia
2020-05-20  9:54   ` Jan Beulich
2020-04-24 14:09 ` [PATCH v6 09/15] efi: use new page table APIs in copy_mapping Hongyan Xia
2020-04-30 12:42   ` Jan Beulich
2020-04-24 14:09 ` [PATCH v6 10/15] efi: switch to new APIs in EFI code Hongyan Xia
2020-04-30 12:55   ` Jan Beulich
2020-04-24 14:09 ` [PATCH v6 11/15] x86/smpboot: clone_mapping should have one exit path Hongyan Xia
2020-04-30 14:59   ` Jan Beulich
2020-04-24 14:09 ` [PATCH v6 12/15] x86/smpboot: switch pl*e to use new APIs in clone_mapping Hongyan Xia
2020-04-30 15:15   ` Jan Beulich
2020-05-11 10:55     ` Hongyan Xia
2020-04-24 14:09 ` [PATCH v6 13/15] x86/mm: drop old page table APIs Hongyan Xia
2020-05-20 10:09   ` Jan Beulich
2020-05-27 15:19     ` Hongyan Xia
2020-04-24 14:09 ` [PATCH v6 14/15] x86: switch to use domheap page for page tables Hongyan Xia
2020-04-24 14:09 ` [PATCH v6 15/15] x86/mm: drop _new suffix for page table APIs Hongyan Xia
2020-05-20  9:56   ` Jan Beulich
