xen-devel.lists.xenproject.org archive mirror
* [PATCH 00/16] Remove the direct map
@ 2020-04-30 20:44 Hongyan Xia
  2020-04-30 20:44 ` [PATCH 01/16] x86/setup: move vm_init() before acpi calls Hongyan Xia
                   ` (16 more replies)
  0 siblings, 17 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Volodymyr Babchuk,
	Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

This series depends on Xen page table domheap conversion:
https://lists.xenproject.org/archives/html/xen-devel/2020-04/msg01374.html.

After breaking the reliance on the direct map to manipulate Xen page
tables, we can now finally remove the direct map altogether.

This series:
- fixes many places that use the direct map incorrectly or wrongly
  assume the presence of an always-mapped direct map.
- includes the early vmap patches for global mappings.
- initialises the mapcache for all domains and disables the fast path
  that uses the direct map for mappings.
- maps and unmaps the xenheap on demand.
- adds a boot command line switch to enable or disable the direct map.

The previous version was an RFC and can be found here:
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02647.html.
It has since been broken into smaller series.

Hongyan Xia (12):
  acpi: vmap pages in acpi_os_alloc_memory
  x86/numa: vmap the pages for memnodemap
  x86/srat: vmap the pages for acpi_slit
  x86: map/unmap pages in restore_all_guests.
  x86/pv: rewrite how building PV dom0 handles domheap mappings
  x86/mapcache: initialise the mapcache for the idle domain
  x86: add a boot option to enable and disable the direct map
  x86/domain_page: remove the fast paths when mfn is not in the
    directmap
  xen/page_alloc: add a path for xenheap when there is no direct map
  x86/setup: leave early boot slightly earlier
  x86/setup: vmap heap nodes when they are outside the direct map
  x86/setup: do not create valid mappings when directmap=no

Wei Liu (4):
  x86/setup: move vm_init() before acpi calls
  x86/pv: domheap pages should be mapped while relocating initrd
  x86: add Persistent Map (PMAP) infrastructure
  x86: lift mapcache variable to the arch level

 docs/misc/xen-command-line.pandoc |  12 +++
 xen/arch/arm/setup.c              |   4 +-
 xen/arch/x86/Makefile             |   1 +
 xen/arch/x86/domain.c             |   4 +-
 xen/arch/x86/domain_page.c        |  53 ++++++++-----
 xen/arch/x86/mm.c                 |   8 +-
 xen/arch/x86/numa.c               |   8 +-
 xen/arch/x86/pmap.c               |  87 +++++++++++++++++++++
 xen/arch/x86/pv/dom0_build.c      |  75 ++++++++++++++----
 xen/arch/x86/setup.c              | 125 +++++++++++++++++++++++++-----
 xen/arch/x86/srat.c               |   3 +-
 xen/arch/x86/x86_64/entry.S       |  27 ++++++-
 xen/common/page_alloc.c           |  85 +++++++++++++++++---
 xen/common/vmap.c                 |  37 +++++++--
 xen/drivers/acpi/osl.c            |   9 ++-
 xen/include/asm-arm/mm.h          |   5 ++
 xen/include/asm-x86/domain.h      |  12 +--
 xen/include/asm-x86/fixmap.h      |   3 +
 xen/include/asm-x86/mm.h          |  17 +++-
 xen/include/asm-x86/pmap.h        |  10 +++
 xen/include/xen/vmap.h            |   5 ++
 21 files changed, 495 insertions(+), 95 deletions(-)
 create mode 100644 xen/arch/x86/pmap.c
 create mode 100644 xen/include/asm-x86/pmap.h

-- 
2.24.1.AMZN




* [PATCH 01/16] x86/setup: move vm_init() before acpi calls
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory Hongyan Xia
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Volodymyr Babchuk,
	Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

After the direct map removal, pages from the boot allocator are not
mapped at all. Although we have map_domain_page, its mappings are
ephemeral and less helpful for anything spanning more than one page, so
we want a mechanism to globally map a range of pages, which is what
vmap is for. Therefore, bring vm_init into the early boot stage.

To allow vmap to be initialised and used in early boot, modify it to
receive pages from the boot allocator during the early boot stage.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Woodhouse <dwmw2@amazon.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/arm/setup.c |  4 ++--
 xen/arch/x86/setup.c | 31 ++++++++++++++++++++-----------
 xen/common/vmap.c    | 37 +++++++++++++++++++++++++++++--------
 3 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 7968cee47d..8f0ac87419 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -822,6 +822,8 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     setup_mm();
 
+    vm_init();
+
     /* Parse the ACPI tables for possible boot-time configuration */
     acpi_boot_table_init();
 
@@ -833,8 +835,6 @@ void __init start_xen(unsigned long boot_phys_offset,
      */
     system_state = SYS_STATE_boot;
 
-    vm_init();
-
     if ( acpi_disabled )
     {
         printk("Booting using Device Tree\n");
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index fc0a6e5fcc..faca8c9758 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -695,6 +695,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     int i, j, e820_warn = 0, bytes = 0;
     bool acpi_boot_table_init_done = false, relocated = false;
     int ret;
+    bool vm_init_done = false;
     struct ns16550_defaults ns16550 = {
         .data_bits = 8,
         .parity    = 'n',
@@ -1301,12 +1302,23 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             continue;
 
         if ( !acpi_boot_table_init_done &&
-             s >= (1ULL << 32) &&
-             !acpi_boot_table_init() )
+             s >= (1ULL << 32) )
         {
-            acpi_boot_table_init_done = true;
-            srat_parse_regions(s);
-            setup_max_pdx(raw_max_page);
+            /*
+             * We only initialise vmap and acpi after going through the bottom
+             * 4GiB, so that we have enough pages in the boot allocator.
+             */
+            if ( !vm_init_done )
+            {
+                vm_init();
+                vm_init_done = true;
+            }
+            if ( !acpi_boot_table_init() )
+            {
+                acpi_boot_table_init_done = true;
+                srat_parse_regions(s);
+                setup_max_pdx(raw_max_page);
+            }
         }
 
         if ( pfn_to_pdx((e - 1) >> PAGE_SHIFT) >= max_pdx )
@@ -1483,6 +1495,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     init_frametable();
 
+    if ( !vm_init_done )
+        vm_init();
+
     if ( !acpi_boot_table_init_done )
         acpi_boot_table_init();
 
@@ -1520,12 +1535,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         end_boot_allocator();
 
     system_state = SYS_STATE_boot;
-    /*
-     * No calls involving ACPI code should go between the setting of
-     * SYS_STATE_boot and vm_init() (or else acpi_os_{,un}map_memory()
-     * will break).
-     */
-    vm_init();
 
     console_init_ring();
     vesa_init();
diff --git a/xen/common/vmap.c b/xen/common/vmap.c
index 9964ab2096..e8533a8a80 100644
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -35,9 +35,20 @@ void __init vm_init_type(enum vmap_region type, void *start, void *end)
 
     for ( i = 0, va = (unsigned long)vm_bitmap(type); i < nr; ++i, va += PAGE_SIZE )
     {
-        struct page_info *pg = alloc_domheap_page(NULL, 0);
+        mfn_t mfn;
+        int rc;
 
-        map_pages_to_xen(va, page_to_mfn(pg), 1, PAGE_HYPERVISOR);
+        if ( system_state == SYS_STATE_early_boot )
+            mfn = alloc_boot_pages(1, 1);
+        else
+        {
+            struct page_info *pg = alloc_domheap_page(NULL, 0);
+
+            BUG_ON(!pg);
+            mfn = page_to_mfn(pg);
+        }
+        rc = map_pages_to_xen(va, mfn, 1, PAGE_HYPERVISOR);
+        BUG_ON(rc);
         clear_page((void *)va);
     }
     bitmap_fill(vm_bitmap(type), vm_low[type]);
@@ -63,7 +74,7 @@ static void *vm_alloc(unsigned int nr, unsigned int align,
     spin_lock(&vm_lock);
     for ( ; ; )
     {
-        struct page_info *pg;
+        mfn_t mfn;
 
         ASSERT(vm_low[t] == vm_top[t] || !test_bit(vm_low[t], vm_bitmap(t)));
         for ( start = vm_low[t]; start < vm_top[t]; )
@@ -98,9 +109,16 @@ static void *vm_alloc(unsigned int nr, unsigned int align,
         if ( vm_top[t] >= vm_end[t] )
             return NULL;
 
-        pg = alloc_domheap_page(NULL, 0);
-        if ( !pg )
-            return NULL;
+        if ( system_state == SYS_STATE_early_boot )
+            mfn = alloc_boot_pages(1, 1);
+        else
+        {
+            struct page_info *pg = alloc_domheap_page(NULL, 0);
+
+            if ( !pg )
+                return NULL;
+            mfn = page_to_mfn(pg);
+        }
 
         spin_lock(&vm_lock);
 
@@ -108,7 +126,7 @@ static void *vm_alloc(unsigned int nr, unsigned int align,
         {
             unsigned long va = (unsigned long)vm_bitmap(t) + vm_top[t] / 8;
 
-            if ( !map_pages_to_xen(va, page_to_mfn(pg), 1, PAGE_HYPERVISOR) )
+            if ( !map_pages_to_xen(va, mfn, 1, PAGE_HYPERVISOR) )
             {
                 clear_page((void *)va);
                 vm_top[t] += PAGE_SIZE * 8;
@@ -118,7 +136,10 @@ static void *vm_alloc(unsigned int nr, unsigned int align,
             }
         }
 
-        free_domheap_page(pg);
+        if ( system_state == SYS_STATE_early_boot )
+            init_boot_pages(mfn_to_maddr(mfn), mfn_to_maddr(mfn) + PAGE_SIZE);
+        else
+            free_domheap_page(mfn_to_page(mfn));
 
         if ( start >= vm_top[t] )
         {
-- 
2.24.1.AMZN




* [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
  2020-04-30 20:44 ` [PATCH 01/16] x86/setup: move vm_init() before acpi calls Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-05-01 12:02   ` Wei Liu
  2020-05-01 21:35   ` Julien Grall
  2020-04-30 20:44 ` [PATCH 03/16] x86/numa: vmap the pages for memnodemap Hongyan Xia
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich

From: Hongyan Xia <hongyxia@amazon.com>

Map the pages in acpi_os_alloc_memory with vmap instead of relying on
the direct map. Also introduce a wrapper around vmap that maps a
contiguous range for boot allocations.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/drivers/acpi/osl.c | 9 ++++++++-
 xen/include/xen/vmap.h | 5 +++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
index 4c8bb7839e..d0762dad4e 100644
--- a/xen/drivers/acpi/osl.c
+++ b/xen/drivers/acpi/osl.c
@@ -219,7 +219,11 @@ void *__init acpi_os_alloc_memory(size_t sz)
 	void *ptr;
 
 	if (system_state == SYS_STATE_early_boot)
-		return mfn_to_virt(mfn_x(alloc_boot_pages(PFN_UP(sz), 1)));
+	{
+		mfn_t mfn = alloc_boot_pages(PFN_UP(sz), 1);
+
+		return vmap_boot_pages(mfn, PFN_UP(sz));
+	}
 
 	ptr = xmalloc_bytes(sz);
 	ASSERT(!ptr || is_xmalloc_memory(ptr));
@@ -244,5 +248,8 @@ void __init acpi_os_free_memory(void *ptr)
 	if (is_xmalloc_memory(ptr))
 		xfree(ptr);
 	else if (ptr && system_state == SYS_STATE_early_boot)
+	{
+		vunmap(ptr);
 		init_boot_pages(__pa(ptr), __pa(ptr) + PAGE_SIZE);
+	}
 }
diff --git a/xen/include/xen/vmap.h b/xen/include/xen/vmap.h
index 369560e620..c70801e195 100644
--- a/xen/include/xen/vmap.h
+++ b/xen/include/xen/vmap.h
@@ -23,6 +23,11 @@ void *vmalloc_xen(size_t size);
 void *vzalloc(size_t size);
 void vfree(void *va);
 
+static inline void *vmap_boot_pages(mfn_t mfn, unsigned int nr_pages)
+{
+    return __vmap(&mfn, nr_pages, 1, 1, PAGE_HYPERVISOR, VMAP_DEFAULT);
+}
+
 void __iomem *ioremap(paddr_t, size_t);
 
 static inline void iounmap(void __iomem *va)
-- 
2.24.1.AMZN




* [PATCH 03/16] x86/numa: vmap the pages for memnodemap
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
  2020-04-30 20:44 ` [PATCH 01/16] x86/setup: move vm_init() before acpi calls Hongyan Xia
  2020-04-30 20:44 ` [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 04/16] x86/srat: vmap the pages for acpi_slit Hongyan Xia
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

This avoids the assumption that there is a direct map and that boot
pages fall inside it.

Clean up the variables so that mfn actually stores a type-safe mfn.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/numa.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index f1066c59c7..51eca3f3fc 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -100,13 +100,13 @@ static int __init populate_memnodemap(const struct node *nodes,
 static int __init allocate_cachealigned_memnodemap(void)
 {
     unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
-    unsigned long mfn = mfn_x(alloc_boot_pages(size, 1));
+    mfn_t mfn = alloc_boot_pages(size, 1);
 
-    memnodemap = mfn_to_virt(mfn);
-    mfn <<= PAGE_SHIFT;
+    memnodemap = vmap_boot_pages(mfn, size);
+    BUG_ON(!memnodemap);
     size <<= PAGE_SHIFT;
     printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
-           mfn, mfn + size);
+           mfn_to_maddr(mfn), mfn_to_maddr(mfn) + size);
     memnodemapsize = size / sizeof(*memnodemap);
 
     return 0;
-- 
2.24.1.AMZN




* [PATCH 04/16] x86/srat: vmap the pages for acpi_slit
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (2 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 03/16] x86/numa: vmap the pages for memnodemap Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-11-30 10:16   ` Jan Beulich
  2020-04-30 20:44 ` [PATCH 05/16] x86: map/unmap pages in restore_all_guests Hongyan Xia
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

This avoids the assumption that boot pages are in the direct map.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/srat.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 506a56d66b..9a84c6c8a8 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -196,7 +196,8 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 		return;
 	}
 	mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
-	acpi_slit = mfn_to_virt(mfn_x(mfn));
+	acpi_slit = vmap_boot_pages(mfn, PFN_UP(slit->header.length));
+	BUG_ON(!acpi_slit);
 	memcpy(acpi_slit, slit, slit->header.length);
 }
 
-- 
2.24.1.AMZN




* [PATCH 05/16] x86: map/unmap pages in restore_all_guests.
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (3 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 04/16] x86/srat: vmap the pages for acpi_slit Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 06/16] x86/pv: domheap pages should be mapped while relocating initrd Hongyan Xia
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

Previously, the code assumed the PV guest cr3 could be accessed via the
direct map. This is no longer true.

Note that we do not map and unmap root_pgt for now since it is still a
xenheap page.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/x86_64/entry.S | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index d55453f3f3..110cd0394f 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -154,7 +154,24 @@ restore_all_guest:
         and   %rsi, %rdi
         and   %r9, %rsi
         add   %rcx, %rdi
-        add   %rcx, %rsi
+
+        /*
+         * Without a direct map, we have to map first before copying. We only
+         * need to map the guest root table but not the per-CPU root_pgt,
+         * because the latter is still a xenheap page.
+         */
+        pushq %r9
+        pushq %rdx
+        pushq %rax
+        pushq %rdi
+        mov   %rsi, %rdi
+        shr   $PAGE_SHIFT, %rdi
+        callq map_domain_page
+        mov   %rax, %rsi
+        popq  %rdi
+        /* Stash the pointer for unmapping later. */
+        pushq %rax
+
         mov   $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx
         mov   root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8
         mov   %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi)
@@ -166,6 +183,14 @@ restore_all_guest:
         sub   $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \
                 ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi
         rep movsq
+
+        /* Unmap the page. */
+        popq  %rdi
+        callq unmap_domain_page
+        popq  %rax
+        popq  %rdx
+        popq  %r9
+
 .Lrag_copy_done:
         mov   %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx)
         movb  $1, STACK_CPUINFO_FIELD(use_pv_cr3)(%rdx)
-- 
2.24.1.AMZN




* [PATCH 06/16] x86/pv: domheap pages should be mapped while relocating initrd
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (4 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 05/16] x86: map/unmap pages in restore_all_guests Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 07/16] x86/pv: rewrite how building PV dom0 handles domheap mappings Hongyan Xia
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

Xen shouldn't use domheap pages as if they were xenheap pages. Map and
unmap them accordingly.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Wei Wang <wawei@amazon.de>
---
 xen/arch/x86/pv/dom0_build.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index 3522eb0114..b052f13462 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -515,18 +515,31 @@ int __init dom0_construct_pv(struct domain *d,
         if ( d->arch.physaddr_bitsize &&
              ((mfn + count - 1) >> (d->arch.physaddr_bitsize - PAGE_SHIFT)) )
         {
+            unsigned long nr_pages;
+            unsigned long len = initrd_len;
+
             order = get_order_from_pages(count);
             page = alloc_domheap_pages(d, order, MEMF_no_scrub);
             if ( !page )
                 panic("Not enough RAM for domain 0 initrd\n");
+
+            nr_pages = 1UL << order;
             for ( count = -count; order--; )
                 if ( count & (1UL << order) )
                 {
                     free_domheap_pages(page, order);
                     page += 1UL << order;
+                    nr_pages -= 1UL << order;
                 }
-            memcpy(page_to_virt(page), mfn_to_virt(initrd->mod_start),
-                   initrd_len);
+
+            for ( i = 0; i < nr_pages; i++, len -= PAGE_SIZE )
+            {
+                void *p = __map_domain_page(page + i);
+                memcpy(p, mfn_to_virt(initrd_mfn + i),
+                       min(len, (unsigned long)PAGE_SIZE));
+                unmap_domain_page(p);
+            }
+
             mpt_alloc = (paddr_t)initrd->mod_start << PAGE_SHIFT;
             init_domheap_pages(mpt_alloc,
                                mpt_alloc + PAGE_ALIGN(initrd_len));
-- 
2.24.1.AMZN




* [PATCH 07/16] x86/pv: rewrite how building PV dom0 handles domheap mappings
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (5 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 06/16] x86/pv: domheap pages should be mapped while relocating initrd Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 08/16] x86: add Persistent Map (PMAP) infrastructure Hongyan Xia
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

Building a PV dom0 allocates pages from the domheap but uses them as if
they were xenheap pages. This is clearly wrong. Fix it by mapping and
unmapping the pages explicitly.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/pv/dom0_build.c | 58 ++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index b052f13462..adaa6afda2 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -309,6 +309,10 @@ int __init dom0_construct_pv(struct domain *d,
     l3_pgentry_t *l3tab = NULL, *l3start = NULL;
     l2_pgentry_t *l2tab = NULL, *l2start = NULL;
     l1_pgentry_t *l1tab = NULL, *l1start = NULL;
+    mfn_t l4start_mfn = INVALID_MFN;
+    mfn_t l3start_mfn = INVALID_MFN;
+    mfn_t l2start_mfn = INVALID_MFN;
+    mfn_t l1start_mfn = INVALID_MFN;
 
     /*
      * This fully describes the memory layout of the initial domain. All
@@ -535,6 +539,7 @@ int __init dom0_construct_pv(struct domain *d,
             for ( i = 0; i < nr_pages; i++, len -= PAGE_SIZE )
             {
                 void *p = __map_domain_page(page + i);
+
                 memcpy(p, mfn_to_virt(initrd_mfn + i),
                        min(len, (unsigned long)PAGE_SIZE));
                 unmap_domain_page(p);
@@ -610,23 +615,32 @@ int __init dom0_construct_pv(struct domain *d,
         v->arch.pv.event_callback_cs    = FLAT_COMPAT_KERNEL_CS;
     }
 
+#define UNMAP_MAP_AND_ADVANCE(mfn_var, virt_var, maddr) \
+do {                                                    \
+    UNMAP_DOMAIN_PAGE(virt_var);                        \
+    mfn_var = maddr_to_mfn(maddr);                      \
+    maddr += PAGE_SIZE;                                 \
+    virt_var = map_domain_page(mfn_var);                \
+} while ( false )
+
     /* WARNING: The new domain must have its 'processor' field filled in! */
     if ( !is_pv_32bit_domain(d) )
     {
         maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l4_page_table;
-        l4start = l4tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
+        UNMAP_MAP_AND_ADVANCE(l4start_mfn, l4start, mpt_alloc);
+        l4tab = l4start;
         clear_page(l4tab);
-        init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)),
-                          d, INVALID_MFN, true);
-        v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
+        init_xen_l4_slots(l4tab, l4start_mfn, d, INVALID_MFN, true);
+        v->arch.guest_table = pagetable_from_mfn(l4start_mfn);
     }
     else
     {
         /* Monitor table already created by switch_compat(). */
-        l4start = l4tab = __va(pagetable_get_paddr(v->arch.guest_table));
+        l4start_mfn = pagetable_get_mfn(v->arch.guest_table);
+        l4start = l4tab = map_domain_page(l4start_mfn);
         /* See public/xen.h on why the following is needed. */
         maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l3_page_table;
-        l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
+        UNMAP_MAP_AND_ADVANCE(l3start_mfn, l3start, mpt_alloc);
     }
 
     l4tab += l4_table_offset(v_start);
@@ -636,14 +650,16 @@ int __init dom0_construct_pv(struct domain *d,
         if ( !((unsigned long)l1tab & (PAGE_SIZE-1)) )
         {
             maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l1_page_table;
-            l1start = l1tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
+            UNMAP_MAP_AND_ADVANCE(l1start_mfn, l1start, mpt_alloc);
+            l1tab = l1start;
             clear_page(l1tab);
             if ( count == 0 )
                 l1tab += l1_table_offset(v_start);
             if ( !((unsigned long)l2tab & (PAGE_SIZE-1)) )
             {
                 maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l2_page_table;
-                l2start = l2tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
+                UNMAP_MAP_AND_ADVANCE(l2start_mfn, l2start, mpt_alloc);
+                l2tab = l2start;
                 clear_page(l2tab);
                 if ( count == 0 )
                     l2tab += l2_table_offset(v_start);
@@ -653,19 +669,19 @@ int __init dom0_construct_pv(struct domain *d,
                     {
                         maddr_to_page(mpt_alloc)->u.inuse.type_info =
                             PGT_l3_page_table;
-                        l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
+                        UNMAP_MAP_AND_ADVANCE(l3start_mfn, l3start, mpt_alloc);
                     }
                     l3tab = l3start;
                     clear_page(l3tab);
                     if ( count == 0 )
                         l3tab += l3_table_offset(v_start);
-                    *l4tab = l4e_from_paddr(__pa(l3start), L4_PROT);
+                    *l4tab = l4e_from_mfn(l3start_mfn, L4_PROT);
                     l4tab++;
                 }
-                *l3tab = l3e_from_paddr(__pa(l2start), L3_PROT);
+                *l3tab = l3e_from_mfn(l2start_mfn, L3_PROT);
                 l3tab++;
             }
-            *l2tab = l2e_from_paddr(__pa(l1start), L2_PROT);
+            *l2tab = l2e_from_mfn(l1start_mfn, L2_PROT);
             l2tab++;
         }
         if ( count < initrd_pfn || count >= initrd_pfn + PFN_UP(initrd_len) )
@@ -692,9 +708,9 @@ int __init dom0_construct_pv(struct domain *d,
             if ( !l3e_get_intpte(*l3tab) )
             {
                 maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l2_page_table;
-                l2tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
-                clear_page(l2tab);
-                *l3tab = l3e_from_paddr(__pa(l2tab), L3_PROT);
+                UNMAP_MAP_AND_ADVANCE(l2start_mfn, l2start, mpt_alloc);
+                clear_page(l2start);
+                *l3tab = l3e_from_mfn(l2start_mfn, L3_PROT);
             }
             if ( i == 3 )
                 l3e_get_page(*l3tab)->u.inuse.type_info |= PGT_pae_xen_l2;
@@ -705,9 +721,17 @@ int __init dom0_construct_pv(struct domain *d,
         unmap_domain_page(l2t);
     }
 
+#undef UNMAP_MAP_AND_ADVANCE
+
+    UNMAP_DOMAIN_PAGE(l1start);
+    UNMAP_DOMAIN_PAGE(l2start);
+    UNMAP_DOMAIN_PAGE(l3start);
+
     /* Pages that are part of page tables must be read only. */
     mark_pv_pt_pages_rdonly(d, l4start, vpt_start, nr_pt_pages);
 
+    UNMAP_DOMAIN_PAGE(l4start);
+
     /* Mask all upcalls... */
     for ( i = 0; i < XEN_LEGACY_MAX_VCPUS; i++ )
         shared_info(d, vcpu_info[i].evtchn_upcall_mask) = 1;
@@ -869,8 +893,12 @@ int __init dom0_construct_pv(struct domain *d,
      * !CONFIG_VIDEO case so the logic here can be simplified.
      */
     if ( pv_shim )
+    {
+        l4start = map_domain_page(l4start_mfn);
         pv_shim_setup_dom(d, l4start, v_start, vxenstore_start, vconsole_start,
                           vphysmap_start, si);
+        UNMAP_DOMAIN_PAGE(l4start);
+    }
 
     if ( is_pv_32bit_domain(d) )
         xlat_start_info(si, pv_shim ? XLAT_start_info_console_domU
-- 
2.24.1.AMZN




* [PATCH 08/16] x86: add Persistent Map (PMAP) infrastructure
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (6 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 07/16] x86/pv: rewrite how building PV dom0 handles domheap mappings Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 09/16] x86: lift mapcache variable to the arch level Hongyan Xia
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

The basic idea is like the Persistent Kernel Map (PKMAP) in Linux: we
pre-populate all the relevant page tables before the system is fully
set up.

This is needed to bootstrap the map_domain_page infrastructure -- we
need some way to map pages in order to set up the mapcache without a
direct map.

This infrastructure is not protected by any locks and therefore can
only be used before smpboot. After smpboot, the mapcache has to be
used instead.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/Makefile        |  1 +
 xen/arch/x86/pmap.c          | 87 ++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/fixmap.h |  3 ++
 xen/include/asm-x86/pmap.h   | 10 +++++
 4 files changed, 101 insertions(+)
 create mode 100644 xen/arch/x86/pmap.c
 create mode 100644 xen/include/asm-x86/pmap.h

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 44137d919b..c8e565867b 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -52,6 +52,7 @@ obj-y += pci.o
 obj-y += percpu.o
 obj-y += physdev.o x86_64/physdev.o
 obj-y += platform_hypercall.o x86_64/platform_hypercall.o
+obj-bin-y += pmap.init.o
 obj-y += psr.o
 obj-y += setup.o
 obj-y += shutdown.o
diff --git a/xen/arch/x86/pmap.c b/xen/arch/x86/pmap.c
new file mode 100644
index 0000000000..44d02ece89
--- /dev/null
+++ b/xen/arch/x86/pmap.c
@@ -0,0 +1,87 @@
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/spinlock.h>
+
+#include <asm/bitops.h>
+#include <asm/fixmap.h>
+#include <asm/flushtlb.h>
+
+/*
+ * Simple mapping infrastructure to map / unmap pages in fixed map.
+ * This is used to set up the page table for mapcache, which is used
+ * by map domain page infrastructure.
+ *
+ * This structure is not protected by any locks, so it must not be used after
+ * smp bring-up.
+ */
+
+/* Bitmap to track which slot is used */
+static unsigned long __initdata inuse;
+
+void *__init pmap_map(mfn_t mfn)
+{
+    unsigned long flags;
+    unsigned int idx;
+    void *linear = NULL;
+    enum fixed_addresses slot;
+    l1_pgentry_t *pl1e;
+
+    BUILD_BUG_ON(BITS_PER_LONG < NUM_FIX_PMAP);
+
+    ASSERT(system_state < SYS_STATE_smp_boot);
+
+    local_irq_save(flags);
+
+    idx = find_first_zero_bit(&inuse, NUM_FIX_PMAP);
+    if ( idx == NUM_FIX_PMAP )
+        panic("Out of PMAP slots\n");
+
+    __set_bit(idx, &inuse);
+
+    slot = idx + FIX_PMAP_BEGIN;
+    ASSERT(slot >= FIX_PMAP_BEGIN && slot <= FIX_PMAP_END);
+
+    linear = fix_to_virt(slot);
+    /*
+     * We cannot use set_fixmap() here. We use PMAP when there is no direct map,
+     * so map_pages_to_xen() called by set_fixmap() needs to map pages on
+     * demand, which then calls pmap_map() again, resulting in a loop. Modify the
+     * PTEs directly instead. The same is true for pmap_unmap().
+     */
+    pl1e = &l1_fixmap[l1_table_offset((unsigned long)linear)];
+    l1e_write_atomic(pl1e, l1e_from_mfn(mfn, PAGE_HYPERVISOR));
+
+    local_irq_restore(flags);
+
+    return linear;
+}
+
+void __init pmap_unmap(void *p)
+{
+    unsigned long flags;
+    unsigned int idx;
+    l1_pgentry_t *pl1e;
+    enum fixed_addresses slot = __virt_to_fix((unsigned long)p);
+
+    ASSERT(system_state < SYS_STATE_smp_boot);
+    ASSERT(slot >= FIX_PMAP_BEGIN && slot <= FIX_PMAP_END);
+
+    idx = slot - FIX_PMAP_BEGIN;
+    local_irq_save(flags);
+
+    __clear_bit(idx, &inuse);
+    pl1e = &l1_fixmap[l1_table_offset((unsigned long)p)];
+    l1e_write_atomic(pl1e, l1e_empty());
+    flush_tlb_one_local(p);
+
+    local_irq_restore(flags);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/fixmap.h b/xen/include/asm-x86/fixmap.h
index 8330097a74..000f3b3375 100644
--- a/xen/include/asm-x86/fixmap.h
+++ b/xen/include/asm-x86/fixmap.h
@@ -24,6 +24,7 @@
 #include <xen/kexec.h>
 #include <asm/apicdef.h>
 #include <asm/msi.h>
+#include <asm/pmap.h>
 #include <acpi/apei.h>
 
 /*
@@ -49,6 +50,8 @@ enum fixed_addresses {
     FIX_XEN_SHARED_INFO,
 #endif /* CONFIG_XEN_GUEST */
     /* Everything else should go further down. */
+    FIX_PMAP_BEGIN,
+    FIX_PMAP_END = FIX_PMAP_BEGIN + NUM_FIX_PMAP - 1,
     FIX_APIC_BASE,
     FIX_IO_APIC_BASE_0,
     FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS-1,
diff --git a/xen/include/asm-x86/pmap.h b/xen/include/asm-x86/pmap.h
new file mode 100644
index 0000000000..790cd71fb3
--- /dev/null
+++ b/xen/include/asm-x86/pmap.h
@@ -0,0 +1,10 @@
+#ifndef __X86_PMAP_H__
+#define __X86_PMAP_H__
+
+/* Large enough for mapping 5 levels of page tables with some headroom */
+#define NUM_FIX_PMAP 8
+
+void *pmap_map(mfn_t mfn);
+void pmap_unmap(void *p);
+
+#endif	/* __X86_PMAP_H__ */
-- 
2.24.1.AMZN




* [PATCH 09/16] x86: lift mapcache variable to the arch level
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (7 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 08/16] x86: add Persistent Map (PMAP) infrastructure Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 10/16] x86/mapcache: initialise the mapcache for the idle domain Hongyan Xia
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Wei Liu <wei.liu2@citrix.com>

It is going to be needed by HVM and the idle domain as well, because
without the direct map, both need a mapcache to map pages.

This only lifts the mapcache variable up. Whether we populate the
mapcache for a domain is unchanged in this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Wei Wang <wawei@amazon.de>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/domain.c        |  4 ++--
 xen/arch/x86/domain_page.c   | 22 ++++++++++------------
 xen/include/asm-x86/domain.h | 12 ++++++------
 3 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a4428190d5..e73f1efe85 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -634,6 +634,8 @@ int arch_domain_create(struct domain *d,
 
     psr_domain_init(d);
 
+    mapcache_domain_init(d);
+
     if ( is_hvm_domain(d) )
     {
         if ( (rc = hvm_domain_initialise(d)) != 0 )
@@ -641,8 +643,6 @@ int arch_domain_create(struct domain *d,
     }
     else if ( is_pv_domain(d) )
     {
-        mapcache_domain_init(d);
-
         if ( (rc = pv_domain_initialise(d)) != 0 )
             goto fail;
     }
diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index 3a244bb500..7b22e7c6ed 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -82,11 +82,11 @@ void *map_domain_page(mfn_t mfn)
 #endif
 
     v = mapcache_current_vcpu();
-    if ( !v || !is_pv_vcpu(v) )
+    if ( !v )
         return mfn_to_virt(mfn_x(mfn));
 
-    dcache = &v->domain->arch.pv.mapcache;
-    vcache = &v->arch.pv.mapcache;
+    dcache = &v->domain->arch.mapcache;
+    vcache = &v->arch.mapcache;
     if ( !dcache->inuse )
         return mfn_to_virt(mfn_x(mfn));
 
@@ -187,14 +187,14 @@ void unmap_domain_page(const void *ptr)
     ASSERT(va >= MAPCACHE_VIRT_START && va < MAPCACHE_VIRT_END);
 
     v = mapcache_current_vcpu();
-    ASSERT(v && is_pv_vcpu(v));
+    ASSERT(v);
 
-    dcache = &v->domain->arch.pv.mapcache;
+    dcache = &v->domain->arch.mapcache;
     ASSERT(dcache->inuse);
 
     idx = PFN_DOWN(va - MAPCACHE_VIRT_START);
     mfn = l1e_get_pfn(MAPCACHE_L1ENT(idx));
-    hashent = &v->arch.pv.mapcache.hash[MAPHASH_HASHFN(mfn)];
+    hashent = &v->arch.mapcache.hash[MAPHASH_HASHFN(mfn)];
 
     local_irq_save(flags);
 
@@ -233,11 +233,9 @@ void unmap_domain_page(const void *ptr)
 
 int mapcache_domain_init(struct domain *d)
 {
-    struct mapcache_domain *dcache = &d->arch.pv.mapcache;
+    struct mapcache_domain *dcache = &d->arch.mapcache;
     unsigned int bitmap_pages;
 
-    ASSERT(is_pv_domain(d));
-
 #ifdef NDEBUG
     if ( !mem_hotplug && max_page <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
         return 0;
@@ -261,12 +259,12 @@ int mapcache_domain_init(struct domain *d)
 int mapcache_vcpu_init(struct vcpu *v)
 {
     struct domain *d = v->domain;
-    struct mapcache_domain *dcache = &d->arch.pv.mapcache;
+    struct mapcache_domain *dcache = &d->arch.mapcache;
     unsigned long i;
     unsigned int ents = d->max_vcpus * MAPCACHE_VCPU_ENTRIES;
     unsigned int nr = PFN_UP(BITS_TO_LONGS(ents) * sizeof(long));
 
-    if ( !is_pv_vcpu(v) || !dcache->inuse )
+    if ( !dcache->inuse )
         return 0;
 
     if ( ents > dcache->entries )
@@ -293,7 +291,7 @@ int mapcache_vcpu_init(struct vcpu *v)
     BUILD_BUG_ON(MAPHASHENT_NOTINUSE < MAPCACHE_ENTRIES);
     for ( i = 0; i < MAPHASH_ENTRIES; i++ )
     {
-        struct vcpu_maphash_entry *hashent = &v->arch.pv.mapcache.hash[i];
+        struct vcpu_maphash_entry *hashent = &v->arch.mapcache.hash[i];
 
         hashent->mfn = ~0UL; /* never valid to map */
         hashent->idx = MAPHASHENT_NOTINUSE;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 5b6d909266..1cee04c0c5 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -271,9 +271,6 @@ struct pv_domain
     /* Mitigate L1TF with shadow/crashing? */
     bool check_l1tf;
 
-    /* map_domain_page() mapping cache. */
-    struct mapcache_domain mapcache;
-
     struct cpuidmasks *cpuidmasks;
 };
 
@@ -306,6 +303,9 @@ struct arch_domain
     uint32_t pci_cf8;
     uint8_t cmos_idx;
 
+    /* map_domain_page() mapping cache. */
+    struct mapcache_domain mapcache;
+
     union {
         struct pv_domain pv;
         struct hvm_domain hvm;
@@ -482,9 +482,6 @@ struct arch_domain
 
 struct pv_vcpu
 {
-    /* map_domain_page() mapping cache. */
-    struct mapcache_vcpu mapcache;
-
     unsigned int vgc_flags;
 
     struct trap_info *trap_ctxt;
@@ -567,6 +564,9 @@ struct arch_vcpu
 #define async_exception_state(t) async_exception_state[(t)-1]
     uint8_t async_exception_mask;
 
+    /* map_domain_page() mapping cache. */
+    struct mapcache_vcpu mapcache;
+
     /* Virtual Machine Extensions */
     union {
         struct pv_vcpu pv;
-- 
2.24.1.AMZN




* [PATCH 10/16] x86/mapcache: initialise the mapcache for the idle domain
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (8 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 09/16] x86: lift mapcache variable to the arch level Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 11/16] x86: add a boot option to enable and disable the direct map Hongyan Xia
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

In order to use the mapcache in the idle domain, we also have to
populate its page tables in the PERDOMAIN region, and we need to move
mapcache_domain_init() earlier in arch_domain_create().

Note that the previous commit, 'x86: lift mapcache variable to the arch
level', already initialises the mapcache for HVM domains. With this
patch, PV, HVM and idle domains all initialise the mapcache.

Signed-off-by: Wei Wang <wawei@amazon.de>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/domain.c | 4 ++--
 xen/arch/x86/mm.c     | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index e73f1efe85..c7e90c50e6 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -539,6 +539,8 @@ int arch_domain_create(struct domain *d,
 
     spin_lock_init(&d->arch.e820_lock);
 
+    mapcache_domain_init(d);
+
     /* Minimal initialisation for the idle domain. */
     if ( unlikely(is_idle_domain(d)) )
     {
@@ -634,8 +636,6 @@ int arch_domain_create(struct domain *d,
 
     psr_domain_init(d);
 
-    mapcache_domain_init(d);
-
     if ( is_hvm_domain(d) )
     {
         if ( (rc = hvm_domain_initialise(d)) != 0 )
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a17ae0004a..b3530d2763 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5828,6 +5828,9 @@ int create_perdomain_mapping(struct domain *d, unsigned long va,
         l3tab = __map_domain_page(pg);
         clear_page(l3tab);
         d->arch.perdomain_l3_pg = pg;
+        if ( is_idle_domain(d) )
+            idle_pg_table[l4_table_offset(PERDOMAIN_VIRT_START)] =
+                l4e_from_page(pg, __PAGE_HYPERVISOR_RW);
         if ( !nr )
         {
             unmap_domain_page(l3tab);
-- 
2.24.1.AMZN




* [PATCH 11/16] x86: add a boot option to enable and disable the direct map
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (9 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 10/16] x86/mapcache: initialise the mapcache for the idle domain Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-05-01  8:43   ` Julien Grall
  2020-05-01 12:11   ` Wei Liu
  2020-04-30 20:44 ` [PATCH 12/16] x86/domain_page: remove the fast paths when mfn is not in the directmap Hongyan Xia
                   ` (5 subsequent siblings)
  16 siblings, 2 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Volodymyr Babchuk,
	Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

Also add a helper function to retrieve it. Change arch_mfn_in_directmap()
to check this option before returning.

This is added as a boot command line option, not a Kconfig. We do not
produce different builds for EC2 so this is not introduced as a
compile-time configuration.
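
A minimal sketch of the guard this adds, with invented stand-in names
(mfn_in_directmap, max_direct_mfn) rather than Xen's actual layout
constants:

```c
#include <assert.h>
#include <stdbool.h>

static bool opt_directmap = true;      /* set by the "directmap" boot option */

static bool arch_has_directmap(void)
{
    return opt_directmap;
}

/*
 * With directmap=no the predicate short-circuits to false before any
 * range comparison, so callers fall back to on-demand mappings.
 */
static bool mfn_in_directmap(unsigned long mfn, unsigned long max_direct_mfn)
{
    if ( !arch_has_directmap() )
        return false;
    return mfn <= max_direct_mfn;
}
```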

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 docs/misc/xen-command-line.pandoc | 12 ++++++++++++
 xen/arch/x86/mm.c                 |  3 +++
 xen/arch/x86/setup.c              |  2 ++
 xen/include/asm-arm/mm.h          |  5 +++++
 xen/include/asm-x86/mm.h          | 17 ++++++++++++++++-
 5 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index ee12b0f53f..7027e3a15c 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -652,6 +652,18 @@ Specify the size of the console debug trace buffer. By specifying `cpu:`
 additionally a trace buffer of the specified size is allocated per cpu.
 The debug trace feature is only enabled in debugging builds of Xen.
 
+### directmap (x86)
+> `= <boolean>`
+
+> Default: `true`
+
+Enable or disable the direct map region in Xen.
+
+By default, Xen creates the direct map region, which maps almost all
+physical memory. Setting this option to `no` removes the direct map,
+mitigating exploits that leak secrets via speculative memory accesses
+into the direct map.
+
 ### dma_bits
 > `= <integer>`
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b3530d2763..64da997764 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -162,6 +162,9 @@ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap_x[L1_PAGETABLE_ENTRIES];
 
+bool __read_mostly opt_directmap = true;
+boolean_param("directmap", opt_directmap);
+
 paddr_t __read_mostly mem_hotplug;
 
 /* Frame table size in pages. */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index faca8c9758..60fc4038be 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1282,6 +1282,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( highmem_start )
         xenheap_max_mfn(PFN_DOWN(highmem_start - 1));
 
+    printk("Booting with directmap %s\n", arch_has_directmap() ? "on" : "off");
+
     /*
      * Walk every RAM region and map it in its entirety (on x86/64, at least)
      * and notify it to the boot allocator.
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 7df91280bc..e6fd934113 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -366,6 +366,11 @@ int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id,
     return -EOPNOTSUPP;
 }
 
+static inline bool arch_has_directmap(void)
+{
+    return true;
+}
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index ef7a20ac7d..7ff99ee8e3 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -454,6 +454,8 @@ int check_descriptor(const struct domain *d, seg_desc_t *desc);
 
 extern paddr_t mem_hotplug;
 
+extern bool opt_directmap;
+
 /******************************************************************************
  * With shadow pagetables, the different kinds of address start
  * to get get confusing.
@@ -637,13 +639,26 @@ extern const char zero_page[];
 /* Build a 32bit PSE page table using 4MB pages. */
 void write_32bit_pse_identmap(uint32_t *l2);
 
+static inline bool arch_has_directmap(void)
+{
+    return opt_directmap;
+}
+
 /*
  * x86 maps part of physical memory via the directmap region.
  * Return whether the input MFN falls in that range.
+ *
+ * When the boot command line sets directmap=no, there is no direct map at
+ * all, so this always returns false.
  */
 static inline bool arch_mfn_in_directmap(unsigned long mfn)
 {
-    unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
+    unsigned long eva;
+
+    if ( !arch_has_directmap() )
+        return false;
+
+    eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
 
     return mfn <= (virt_to_mfn(eva - 1) + 1);
 }
-- 
2.24.1.AMZN




* [PATCH 12/16] x86/domain_page: remove the fast paths when mfn is not in the directmap
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (10 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 11/16] x86: add a boot option to enable and disable the direct map Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map Hongyan Xia
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

When mfn is not in direct map, never use mfn_to_virt for any mappings.

We replace mfn_x(mfn) <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) with
arch_mfn_in_directmap(mfn), because the two checks are equivalent. The
extra comparison in arch_mfn_in_directmap() looks different, but since
DIRECTMAP_VIRT_END is always the higher bound, it makes no difference.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/domain_page.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index 7b22e7c6ed..fc705f056e 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -14,8 +14,10 @@
 #include <xen/sched.h>
 #include <xen/vmap.h>
 #include <asm/current.h>
+#include <asm/fixmap.h>
 #include <asm/flushtlb.h>
 #include <asm/hardirq.h>
+#include <asm/pmap.h>
 #include <asm/setup.h>
 
 static DEFINE_PER_CPU(struct vcpu *, override);
@@ -35,10 +37,11 @@ static inline struct vcpu *mapcache_current_vcpu(void)
     /*
      * When using efi runtime page tables, we have the equivalent of the idle
      * domain's page tables but current may point at another domain's VCPU.
-     * Return NULL as though current is not properly set up yet.
+     * Return the idle domain's vcpu on that core, because the EFI per-domain
+     * region (where the mapcache is) is in sync with the idle domain.
      */
     if ( efi_rs_using_pgtables() )
-        return NULL;
+        return idle_vcpu[smp_processor_id()];
 
     /*
      * If guest_table is NULL, and we are running a paravirtualised guest,
@@ -77,18 +80,24 @@ void *map_domain_page(mfn_t mfn)
     struct vcpu_maphash_entry *hashent;
 
 #ifdef NDEBUG
-    if ( mfn_x(mfn) <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
+    if ( arch_mfn_in_directmap(mfn_x(mfn)) )
         return mfn_to_virt(mfn_x(mfn));
 #endif
 
     v = mapcache_current_vcpu();
-    if ( !v )
-        return mfn_to_virt(mfn_x(mfn));
+    if ( !v || !v->domain->arch.mapcache.inuse )
+    {
+        if ( arch_mfn_in_directmap(mfn_x(mfn)) )
+            return mfn_to_virt(mfn_x(mfn));
+        else
+        {
+            BUG_ON(system_state >= SYS_STATE_smp_boot);
+            return pmap_map(mfn);
+        }
+    }
 
     dcache = &v->domain->arch.mapcache;
     vcache = &v->arch.mapcache;
-    if ( !dcache->inuse )
-        return mfn_to_virt(mfn_x(mfn));
 
     perfc_incr(map_domain_page_count);
 
@@ -184,6 +193,12 @@ void unmap_domain_page(const void *ptr)
     if ( va >= DIRECTMAP_VIRT_START )
         return;
 
+    if ( va >= FIXADDR_START && va < FIXADDR_TOP )
+    {
+        pmap_unmap((void *)ptr);
+        return;
+    }
+
     ASSERT(va >= MAPCACHE_VIRT_START && va < MAPCACHE_VIRT_END);
 
     v = mapcache_current_vcpu();
@@ -237,7 +252,7 @@ int mapcache_domain_init(struct domain *d)
     unsigned int bitmap_pages;
 
 #ifdef NDEBUG
-    if ( !mem_hotplug && max_page <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
+    if ( !mem_hotplug && arch_mfn_in_directmap(max_page) )
         return 0;
 #endif
 
@@ -308,7 +323,7 @@ void *map_domain_page_global(mfn_t mfn)
             local_irq_is_enabled()));
 
 #ifdef NDEBUG
-    if ( mfn_x(mfn) <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
+    if ( arch_mfn_in_directmap(mfn_x(mfn)) )
         return mfn_to_virt(mfn_x(mfn));
 #endif
 
-- 
2.24.1.AMZN




* [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (11 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 12/16] x86/domain_page: remove the fast paths when mfn is not in the directmap Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-05-01  8:50   ` Julien Grall
  2021-04-22 12:31   ` Jan Beulich
  2020-04-30 20:44 ` [PATCH 14/16] x86/setup: leave early boot slightly earlier Hongyan Xia
                   ` (3 subsequent siblings)
  16 siblings, 2 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich

From: Hongyan Xia <hongyxia@amazon.com>

When there is no always-mapped direct map, xenheap allocations need to
be mapped and unmapped on demand.
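
The alloc/free discipline this introduces can be modelled in user space
as follows; toy_alloc_xenheap/toy_free_xenheap and the map/unmap
counters are invented stand-ins for alloc_heap_pages(),
map_pages_to_xen() and friends, not the Xen functions themselves:

```c
#include <assert.h>
#include <stdlib.h>

static int directmap_enabled = 0;   /* models arch_has_directmap() == false */
static int mapped_pages = 0;        /* models the number of live xenheap PTEs */

static int map_range(int nr)    { mapped_pages += nr; return 0; /* success */ }
static void unmap_range(int nr) { mapped_pages -= nr; }

static void *toy_alloc_xenheap(int order)
{
    void *pg = malloc((size_t)1 << (order + 12));

    if ( !pg )
        return NULL;
    /* Without a direct map, the fresh pages must be mapped before use. */
    if ( !directmap_enabled && map_range(1 << order) )
    {
        free(pg);                   /* undo the allocation on mapping failure */
        return NULL;
    }
    return pg;
}

static void toy_free_xenheap(void *v, int order)
{
    if ( !v )
        return;
    /* Tear the mapping down before returning the pages to the allocator. */
    if ( !directmap_enabled )
        unmap_range(1 << order);
    free(v);
}
```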

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/common/page_alloc.c | 45 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 10b7aeca48..1285fc5977 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2143,6 +2143,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
 void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
 {
     struct page_info *pg;
+    void *ret;
 
     ASSERT(!in_irq());
 
@@ -2151,14 +2152,27 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
     if ( unlikely(pg == NULL) )
         return NULL;
 
-    memguard_unguard_range(page_to_virt(pg), 1 << (order + PAGE_SHIFT));
+    ret = page_to_virt(pg);
 
-    return page_to_virt(pg);
+    if ( !arch_has_directmap() &&
+         map_pages_to_xen((unsigned long)ret, page_to_mfn(pg), 1UL << order,
+                          PAGE_HYPERVISOR) )
+    {
+        /* Failed to map xenheap pages. */
+        free_heap_pages(pg, order, false);
+        return NULL;
+    }
+
+    memguard_unguard_range(ret, 1 << (order + PAGE_SHIFT));
+
+    return ret;
 }
 
 
 void free_xenheap_pages(void *v, unsigned int order)
 {
+    unsigned long va = (unsigned long)v & PAGE_MASK;
+
     ASSERT(!in_irq());
 
     if ( v == NULL )
@@ -2166,6 +2180,12 @@ void free_xenheap_pages(void *v, unsigned int order)
 
     memguard_guard_range(v, 1 << (order + PAGE_SHIFT));
 
+    if ( !arch_has_directmap() &&
+         destroy_xen_mappings(va, va + (1UL << (order + PAGE_SHIFT))) )
+        dprintk(XENLOG_WARNING,
+                "Error while destroying xenheap mappings at %p, order %u\n",
+                v, order);
+
     free_heap_pages(virt_to_page(v), order, false);
 }
 
@@ -2189,6 +2209,7 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
 {
     struct page_info *pg;
     unsigned int i;
+    void *ret;
 
     ASSERT(!in_irq());
 
@@ -2201,16 +2222,28 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
     if ( unlikely(pg == NULL) )
         return NULL;
 
+    ret = page_to_virt(pg);
+
+    if ( !arch_has_directmap() &&
+         map_pages_to_xen((unsigned long)ret, page_to_mfn(pg), 1UL << order,
+                          PAGE_HYPERVISOR) )
+    {
+        /* Failed to map xenheap pages. */
+        free_domheap_pages(pg, order);
+        return NULL;
+    }
+
     for ( i = 0; i < (1u << order); i++ )
         pg[i].count_info |= PGC_xen_heap;
 
-    return page_to_virt(pg);
+    return ret;
 }
 
 void free_xenheap_pages(void *v, unsigned int order)
 {
     struct page_info *pg;
     unsigned int i;
+    unsigned long va = (unsigned long)v & PAGE_MASK;
 
     ASSERT(!in_irq());
 
@@ -2222,6 +2255,12 @@ void free_xenheap_pages(void *v, unsigned int order)
     for ( i = 0; i < (1u << order); i++ )
         pg[i].count_info &= ~PGC_xen_heap;
 
+    if ( !arch_has_directmap() &&
+         destroy_xen_mappings(va, va + (1UL << (order + PAGE_SHIFT))) )
+        dprintk(XENLOG_WARNING,
+                "Error while destroying xenheap mappings at %p, order %u\n",
+                v, order);
+
     free_heap_pages(pg, order, true);
 }
 
-- 
2.24.1.AMZN




* [PATCH 14/16] x86/setup: leave early boot slightly earlier
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (12 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 15/16] x86/setup: vmap heap nodes when they are outside the direct map Hongyan Xia
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

When we do not have a direct map, memory for metadata of heap nodes in
init_node_heap() is allocated from xenheap, which needs to be mapped and
unmapped on demand. However, we cannot just take memory from the boot
allocator to create the PTEs while we are passing memory to the heap
allocator.

To solve this race, we leave early boot slightly sooner so that Xen PTE
pages are allocated from the heap instead of the boot allocator. We can
do this because the metadata for the 1st node is statically allocated,
and by the time we need memory to create mappings for the 2nd node, we
already have enough memory in the heap allocator in the 1st node.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/setup.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 60fc4038be..dbb2ac1c8f 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1507,6 +1507,22 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     numa_initmem_init(0, raw_max_page);
 
+    /*
+     * When we do not have a direct map, memory for metadata of heap nodes in
+     * init_node_heap() is allocated from xenheap, which needs to be mapped and
+     * unmapped on demand. However, we cannot just take memory from the boot
+     * allocator to create the PTEs while we are passing memory to the heap
+     * allocator during end_boot_allocator().
+     *
+     * To solve this race, we need to leave early boot before
+     * end_boot_allocator() so that Xen PTE pages are allocated from the heap
+     * instead of the boot allocator. We can do this because the metadata for
+     * the 1st node is statically allocated, and by the time we need memory to
+     * create mappings for the 2nd node, we already have enough memory in the
+     * heap allocator in the 1st node.
+     */
+    system_state = SYS_STATE_boot;
+
     if ( max_page - 1 > virt_to_mfn(HYPERVISOR_VIRT_END - 1) )
     {
         unsigned long limit = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
@@ -1536,8 +1552,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     else
         end_boot_allocator();
 
-    system_state = SYS_STATE_boot;
-
     console_init_ring();
     vesa_init();
 
-- 
2.24.1.AMZN




* [PATCH 15/16] x86/setup: vmap heap nodes when they are outside the direct map
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (13 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 14/16] x86/setup: leave early boot slightly earlier Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-04-30 20:44 ` [PATCH 16/16] x86/setup: do not create valid mappings when directmap=no Hongyan Xia
  2020-05-01 12:07 ` [PATCH 00/16] Remove the direct map Wei Liu
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich

From: Hongyan Xia <hongyxia@amazon.com>

When we do not have a direct map, arch_mfn_in_directmap() will always
return false, thus init_node_heap() will allocate xenheap pages from an
existing node for the metadata of a new node. This means that the
metadata of a new node is in a different node, slowing down heap
allocation.

Since we now have early vmap, vmap the metadata locally in the new node.
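
The hunks below place avail[] at the very end of the 'needed' pages,
right after the mapped heap structure. A toy version of that address
arithmetic (the constants are illustrative, not Xen's):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT        12
#define NR_ZONES          40   /* illustrative */
#define AVAIL_ENTRY_SIZE   8   /* stand-in for sizeof(**avail) */

/*
 * Given the VA where the node's metadata was mapped (directly or via
 * __vmap()) and how many pages were set aside, return where avail[]
 * starts: flush against the end of the mapped region.
 */
static uintptr_t avail_from_heap(uintptr_t heap_va, unsigned int needed_pages)
{
    return heap_va + ((uintptr_t)needed_pages << PAGE_SHIFT)
                   - AVAIL_ENTRY_SIZE * NR_ZONES;
}
```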

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/common/page_alloc.c | 40 ++++++++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 1285fc5977..1e18b45caf 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -584,22 +584,46 @@ static unsigned long init_node_heap(int node, unsigned long mfn,
         needed = 0;
     }
     else if ( *use_tail && nr >= needed &&
-              arch_mfn_in_directmap(mfn + nr) &&
               (!xenheap_bits ||
                !((mfn + nr - 1) >> (xenheap_bits - PAGE_SHIFT))) )
     {
-        _heap[node] = mfn_to_virt(mfn + nr - needed);
-        avail[node] = mfn_to_virt(mfn + nr - 1) +
-                      PAGE_SIZE - sizeof(**avail) * NR_ZONES;
+        if ( arch_mfn_in_directmap(mfn + nr) )
+        {
+            _heap[node] = mfn_to_virt(mfn + nr - needed);
+            avail[node] = mfn_to_virt(mfn + nr - 1) +
+                          PAGE_SIZE - sizeof(**avail) * NR_ZONES;
+        }
+        else
+        {
+            mfn_t needed_start = _mfn(mfn + nr - needed);
+
+            _heap[node] = __vmap(&needed_start, needed, 1, 1, PAGE_HYPERVISOR,
+                                 VMAP_DEFAULT);
+            BUG_ON(!_heap[node]);
+            avail[node] = (void *)(_heap[node]) + (needed << PAGE_SHIFT) -
+                          sizeof(**avail) * NR_ZONES;
+        }
     }
     else if ( nr >= needed &&
-              arch_mfn_in_directmap(mfn + needed) &&
               (!xenheap_bits ||
                !((mfn + needed - 1) >> (xenheap_bits - PAGE_SHIFT))) )
     {
-        _heap[node] = mfn_to_virt(mfn);
-        avail[node] = mfn_to_virt(mfn + needed - 1) +
-                      PAGE_SIZE - sizeof(**avail) * NR_ZONES;
+        if ( arch_mfn_in_directmap(mfn + needed) )
+        {
+            _heap[node] = mfn_to_virt(mfn);
+            avail[node] = mfn_to_virt(mfn + needed - 1) +
+                          PAGE_SIZE - sizeof(**avail) * NR_ZONES;
+        }
+        else
+        {
+            mfn_t needed_start = _mfn(mfn);
+
+            _heap[node] = __vmap(&needed_start, needed, 1, 1, PAGE_HYPERVISOR,
+                                 VMAP_DEFAULT);
+            BUG_ON(!_heap[node]);
+            avail[node] = (void *)(_heap[node]) + (needed << PAGE_SHIFT) -
+                          sizeof(**avail) * NR_ZONES;
+        }
         *use_tail = false;
     }
     else if ( get_order_from_bytes(sizeof(**_heap)) ==
-- 
2.24.1.AMZN




* [PATCH 16/16] x86/setup: do not create valid mappings when directmap=no
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (14 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 15/16] x86/setup: vmap heap nodes when they are outside the direct map Hongyan Xia
@ 2020-04-30 20:44 ` Hongyan Xia
  2020-05-01 12:07 ` [PATCH 00/16] Remove the direct map Wei Liu
  16 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-04-30 20:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, julien, Wei Liu, Jan Beulich, Roger Pau Monné

From: Hongyan Xia <hongyxia@amazon.com>

Create empty mappings in the second e820 pass. Also, destroy existing
direct map mappings created in the first pass.

To make xenheap pages visible in guests, it is necessary to create empty
L3 tables in the direct map even when directmap=no, since guest cr3s
copy the idle domain's L4 entries, which means they will share mappings
in the direct map if we pre-populate the idle domain's L4 entries and L3
tables. A helper is introduced for this.

Also, after the direct map is actually gone, we need to stop updating
the direct map in update_xen_mappings().

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
 xen/arch/x86/mm.c    |  2 +-
 xen/arch/x86/setup.c | 74 +++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 68 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 64da997764..33b7e3a003 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -798,7 +798,7 @@ static int update_xen_mappings(unsigned long mfn, unsigned int cacheattr)
 
     if ( unlikely(alias) && cacheattr )
         err = map_pages_to_xen(xen_va, _mfn(mfn), 1, 0);
-    if ( !err )
+    if ( arch_has_directmap() && !err )
         err = map_pages_to_xen((unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
                      PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));
     if ( unlikely(alias) && !cacheattr && !err )
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index dbb2ac1c8f..13c37f435b 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -684,6 +684,57 @@ static unsigned int __init copy_bios_e820(struct e820entry *map, unsigned int li
 /* How much of the directmap is prebuilt at compile time. */
 #define PREBUILT_MAP_LIMIT (1 << L2_PAGETABLE_SHIFT)
 
+/*
+ * This either populates a valid direct map, or allocates empty L3 tables and
+ * creates the L4 entries for virtual addresses in [start, end) in the
+ * direct map, depending on arch_has_directmap().
+ *
+ * When directmap=no, we still need to populate empty L3 tables in the
+ * direct map region. The reason is that on-demand xenheap mappings are
+ * created in the idle domain's page table but must be seen by
+ * everyone. Since all domains share the direct map L4 entries, they
+ * will share xenheap mappings if we pre-populate the L4 entries and L3
+ * tables in the direct map region for all RAM. We also rely on the fact
+ * that L3 tables are never freed.
+ */
+static void __init populate_directmap(uint64_t pstart, uint64_t pend,
+                                      unsigned int flags)
+{
+    unsigned long vstart = (unsigned long)__va(pstart);
+    unsigned long vend = (unsigned long)__va(pend);
+
+    if ( pstart >= pend )
+        return;
+
+    BUG_ON(vstart < DIRECTMAP_VIRT_START);
+    BUG_ON(vend > DIRECTMAP_VIRT_END);
+
+    if ( arch_has_directmap() )
+        /* Populate valid direct map. */
+        BUG_ON(map_pages_to_xen(vstart, maddr_to_mfn(pstart),
+                                PFN_DOWN(pend - pstart), flags));
+    else
+    {
+        /* Create empty L3 tables. */
+        unsigned long vaddr = vstart & ~((1UL << L4_PAGETABLE_SHIFT) - 1);
+
+        for ( ; vaddr < vend; vaddr += (1UL << L4_PAGETABLE_SHIFT) )
+        {
+            l4_pgentry_t *pl4e = &idle_pg_table[l4_table_offset(vaddr)];
+
+            if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
+            {
+                mfn_t mfn = alloc_boot_pages(1, 1);
+                void *v = map_domain_page(mfn);
+
+                clear_page(v);
+                UNMAP_DOMAIN_PAGE(v);
+                l4e_write(pl4e, l4e_from_mfn(mfn, __PAGE_HYPERVISOR));
+            }
+        }
+    }
+}
+
 void __init noreturn __start_xen(unsigned long mbi_p)
 {
     char *memmap_type = NULL;
@@ -1366,8 +1417,17 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         map_e = min_t(uint64_t, e,
                       ARRAY_SIZE(l2_directmap) << L2_PAGETABLE_SHIFT);
 
-        /* Pass mapped memory to allocator /before/ creating new mappings. */
+        /*
+         * Pass mapped memory to allocator /before/ creating new mappings.
+         * The direct map for the bottom 4GiB has been populated in the first
+         * e820 pass. In the second pass, we make sure those existing mappings
+         * are destroyed when directmap=no.
+         */
         init_boot_pages(s, min(map_s, e));
+        if ( !arch_has_directmap() )
+            destroy_xen_mappings((unsigned long)__va(s),
+                                 (unsigned long)__va(min(map_s, e)));
+
         s = map_s;
         if ( s < map_e )
         {
@@ -1376,6 +1436,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             map_s = (s + mask) & ~mask;
             map_e &= ~mask;
             init_boot_pages(map_s, map_e);
+            if ( !arch_has_directmap() )
+                destroy_xen_mappings((unsigned long)__va(map_s),
+                                     (unsigned long)__va(map_e));
         }
 
         if ( map_s > map_e )
@@ -1389,8 +1452,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
             if ( map_e < end )
             {
-                map_pages_to_xen((unsigned long)__va(map_e), maddr_to_mfn(map_e),
-                                 PFN_DOWN(end - map_e), PAGE_HYPERVISOR);
+                populate_directmap(map_e, end, PAGE_HYPERVISOR);
                 init_boot_pages(map_e, end);
                 map_e = end;
             }
@@ -1399,13 +1461,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         {
             /* This range must not be passed to the boot allocator and
              * must also not be mapped with _PAGE_GLOBAL. */
-            map_pages_to_xen((unsigned long)__va(map_e), maddr_to_mfn(map_e),
-                             PFN_DOWN(e - map_e), __PAGE_HYPERVISOR_RW);
+            populate_directmap(map_e, e, __PAGE_HYPERVISOR_RW);
         }
         if ( s < map_s )
         {
-            map_pages_to_xen((unsigned long)__va(s), maddr_to_mfn(s),
-                             PFN_DOWN(map_s - s), PAGE_HYPERVISOR);
+            populate_directmap(s, map_s, PAGE_HYPERVISOR);
             init_boot_pages(s, map_s);
         }
     }
-- 
2.24.1.AMZN




* Re: [PATCH 11/16] x86: add a boot option to enable and disable the direct map
  2020-04-30 20:44 ` [PATCH 11/16] x86: add a boot option to enable and disable the direct map Hongyan Xia
@ 2020-05-01  8:43   ` Julien Grall
  2020-05-01 12:11   ` Wei Liu
  1 sibling, 0 replies; 39+ messages in thread
From: Julien Grall @ 2020-05-01  8:43 UTC (permalink / raw)
  To: Hongyan Xia, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Volodymyr Babchuk,
	Roger Pau Monné

Hi Hongyan,

On 30/04/2020 21:44, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> Also add a helper function to retrieve it. Change arch_mfn_in_direct_map
> to check this option before returning.
> 
> This is added as a boot command line option, not a Kconfig. We do not
> produce different builds for EC2 so this is not introduced as a
> compile-time configuration.
> 
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> ---
>   docs/misc/xen-command-line.pandoc | 12 ++++++++++++
>   xen/arch/x86/mm.c                 |  3 +++
>   xen/arch/x86/setup.c              |  2 ++
>   xen/include/asm-arm/mm.h          |  5 +++++
>   xen/include/asm-x86/mm.h          | 17 ++++++++++++++++-
>   5 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index ee12b0f53f..7027e3a15c 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -652,6 +652,18 @@ Specify the size of the console debug trace buffer. By specifying `cpu:`
>   additionally a trace buffer of the specified size is allocated per cpu.
>   The debug trace feature is only enabled in debugging builds of Xen.
>   
> +### directmap (x86)
> +> `= <boolean>`
> +
> +> Default: `true`
> +
> +Enable or disable the direct map region in Xen.
> +
> +By default, Xen creates the direct map region which maps physical memory
> +in that region. Setting this to no will remove the direct map, blocking
> +exploits that leak secrets via speculative memory access in the direct
> +map.
> +
>   ### dma_bits
>   > `= <integer>`
>   
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index b3530d2763..64da997764 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -162,6 +162,9 @@ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
>   l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
>       l1_fixmap_x[L1_PAGETABLE_ENTRIES];
>   
> +bool __read_mostly opt_directmap = true;
> +boolean_param("directmap", opt_directmap);
> +
>   paddr_t __read_mostly mem_hotplug;
>   
>   /* Frame table size in pages. */
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index faca8c9758..60fc4038be 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1282,6 +1282,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>       if ( highmem_start )
>           xenheap_max_mfn(PFN_DOWN(highmem_start - 1));
>   
> +    printk("Booting with directmap %s\n", arch_has_directmap() ? "on" : "off");
> +
>       /*
>        * Walk every RAM region and map it in its entirety (on x86/64, at least)
>        * and notify it to the boot allocator.
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index 7df91280bc..e6fd934113 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -366,6 +366,11 @@ int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id,
>       return -EOPNOTSUPP;
>   }
>   
> +static inline bool arch_has_directmap(void)
> +{
> +    return true;

arm32 doesn't have a directmap, so this needs to be false for arm32 and 
true for arm64.

I would also like the implementation of the helper to live close to 
arch_mfn_in_directmap() in asm-arm/arm*/mm.h.

Cheers,

-- 
Julien Grall



* Re: [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2020-04-30 20:44 ` [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map Hongyan Xia
@ 2020-05-01  8:50   ` Julien Grall
  2021-04-22 12:31   ` Jan Beulich
  1 sibling, 0 replies; 39+ messages in thread
From: Julien Grall @ 2020-05-01  8:50 UTC (permalink / raw)
  To: Hongyan Xia, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich

Hi Hongyan,

On 30/04/2020 21:44, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> When there is not an always-mapped direct map, xenheap allocations need
> to be mapped and unmapped on-demand.
> 
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> ---
>   xen/common/page_alloc.c | 45 ++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 42 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 10b7aeca48..1285fc5977 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -2143,6 +2143,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
>   void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
>   {
>       struct page_info *pg;
> +    void *ret;
>   
>       ASSERT(!in_irq());
>   
> @@ -2151,14 +2152,27 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
>       if ( unlikely(pg == NULL) )
>           return NULL;
>   
> -    memguard_unguard_range(page_to_virt(pg), 1 << (order + PAGE_SHIFT));
> +    ret = page_to_virt(pg);
>   
> -    return page_to_virt(pg);
> +    if ( !arch_has_directmap() &&
> +         map_pages_to_xen((unsigned long)ret, page_to_mfn(pg), 1UL << order,
> +                          PAGE_HYPERVISOR) )

The only user (arm32) of split domheap/xenheap has no directmap. So this 
will break arm32 as we don't support superpage shattering (for good 
reasons).

In this configuration, only xenheap pages are always mapped. Domheap 
will be mapped on-demand. So I don't think we need to map/unmap xenheap 
pages at allocation/free.

Cheers,

-- 
Julien Grall



* Re: [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory
  2020-04-30 20:44 ` [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory Hongyan Xia
@ 2020-05-01 12:02   ` Wei Liu
  2020-05-01 12:46     ` Hongyan Xia
  2020-05-01 21:35   ` Julien Grall
  1 sibling, 1 reply; 39+ messages in thread
From: Wei Liu @ 2020-05-01 12:02 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel

On Thu, Apr 30, 2020 at 09:44:11PM +0100, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> Also, introduce a wrapper around vmap that maps a contiguous range for
> boot allocations.
> 
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> ---
>  xen/drivers/acpi/osl.c | 9 ++++++++-
>  xen/include/xen/vmap.h | 5 +++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
> index 4c8bb7839e..d0762dad4e 100644
> --- a/xen/drivers/acpi/osl.c
> +++ b/xen/drivers/acpi/osl.c
> @@ -219,7 +219,11 @@ void *__init acpi_os_alloc_memory(size_t sz)
>  	void *ptr;
>  
>  	if (system_state == SYS_STATE_early_boot)
> -		return mfn_to_virt(mfn_x(alloc_boot_pages(PFN_UP(sz), 1)));
> +	{
> +		mfn_t mfn = alloc_boot_pages(PFN_UP(sz), 1);
> +
> +		return vmap_boot_pages(mfn, PFN_UP(sz));
> +	}
>  
>  	ptr = xmalloc_bytes(sz);
>  	ASSERT(!ptr || is_xmalloc_memory(ptr));
> @@ -244,5 +248,8 @@ void __init acpi_os_free_memory(void *ptr)
>  	if (is_xmalloc_memory(ptr))
>  		xfree(ptr);
>  	else if (ptr && system_state == SYS_STATE_early_boot)
> +	{
> +		vunmap(ptr);
>  		init_boot_pages(__pa(ptr), __pa(ptr) + PAGE_SIZE);
> +	}
>  }
> diff --git a/xen/include/xen/vmap.h b/xen/include/xen/vmap.h
> index 369560e620..c70801e195 100644
> --- a/xen/include/xen/vmap.h
> +++ b/xen/include/xen/vmap.h
> @@ -23,6 +23,11 @@ void *vmalloc_xen(size_t size);
>  void *vzalloc(size_t size);
>  void vfree(void *va);
>  
> +static inline void *vmap_boot_pages(mfn_t mfn, unsigned int nr_pages)

Nothing seems to tie this to boot pages only. Maybe it is better to name
it after what it does, like vmap_mfns?

> +{
> +    return __vmap(&mfn, nr_pages, 1, 1, PAGE_HYPERVISOR, VMAP_DEFAULT);
> +}
> +
>  void __iomem *ioremap(paddr_t, size_t);
>  
>  static inline void iounmap(void __iomem *va)
> -- 
> 2.24.1.AMZN
> 



* Re: [PATCH 00/16] Remove the direct map
  2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
                   ` (15 preceding siblings ...)
  2020-04-30 20:44 ` [PATCH 16/16] x86/setup: do not create valid mappings when directmap=no Hongyan Xia
@ 2020-05-01 12:07 ` Wei Liu
  2020-05-01 13:53   ` Hongyan Xia
  16 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2020-05-01 12:07 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Thu, Apr 30, 2020 at 09:44:09PM +0100, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> This series depends on Xen page table domheap conversion:
> https://lists.xenproject.org/archives/html/xen-devel/2020-04/msg01374.html.
> 
> After breaking the reliance on the direct map to manipulate Xen page
> tables, we can now finally remove the direct map altogether.
> 
> This series:
> - fixes many places that use the direct map incorrectly or assume the
>   presence of an always-mapped direct map in a wrong way.
> - includes the early vmap patches for global mappings.
> - initialises the mapcache for all domains, disables the fast path that
>   uses the direct map for mappings.
> - maps and unmaps xenheap on-demand.
> - adds a boot command line switch to enable or disable the direct map.
> 
> This previous version was in RFC state and can be found here:
> https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02647.html,
> which has since been broken into small series.

OOI have you done any performance measurements?

Seeing that now even guest tables need mapping / unmapping during
restore, I'm curious to know how that would impact performance.

Wei.

> 
> Hongyan Xia (12):
>   acpi: vmap pages in acpi_os_alloc_memory
>   x86/numa: vmap the pages for memnodemap
>   x86/srat: vmap the pages for acpi_slit
>   x86: map/unmap pages in restore_all_guests.
>   x86/pv: rewrite how building PV dom0 handles domheap mappings
>   x86/mapcache: initialise the mapcache for the idle domain
>   x86: add a boot option to enable and disable the direct map
>   x86/domain_page: remove the fast paths when mfn is not in the
>     directmap
>   xen/page_alloc: add a path for xenheap when there is no direct map
>   x86/setup: leave early boot slightly earlier
>   x86/setup: vmap heap nodes when they are outside the direct map
>   x86/setup: do not create valid mappings when directmap=no
> 
> Wei Liu (4):
>   x86/setup: move vm_init() before acpi calls
>   x86/pv: domheap pages should be mapped while relocating initrd
>   x86: add Persistent Map (PMAP) infrastructure
>   x86: lift mapcache variable to the arch level
> 
>  docs/misc/xen-command-line.pandoc |  12 +++
>  xen/arch/arm/setup.c              |   4 +-
>  xen/arch/x86/Makefile             |   1 +
>  xen/arch/x86/domain.c             |   4 +-
>  xen/arch/x86/domain_page.c        |  53 ++++++++-----
>  xen/arch/x86/mm.c                 |   8 +-
>  xen/arch/x86/numa.c               |   8 +-
>  xen/arch/x86/pmap.c               |  87 +++++++++++++++++++++
>  xen/arch/x86/pv/dom0_build.c      |  75 ++++++++++++++----
>  xen/arch/x86/setup.c              | 125 +++++++++++++++++++++++++-----
>  xen/arch/x86/srat.c               |   3 +-
>  xen/arch/x86/x86_64/entry.S       |  27 ++++++-
>  xen/common/page_alloc.c           |  85 +++++++++++++++++---
>  xen/common/vmap.c                 |  37 +++++++--
>  xen/drivers/acpi/osl.c            |   9 ++-
>  xen/include/asm-arm/mm.h          |   5 ++
>  xen/include/asm-x86/domain.h      |  12 +--
>  xen/include/asm-x86/fixmap.h      |   3 +
>  xen/include/asm-x86/mm.h          |  17 +++-
>  xen/include/asm-x86/pmap.h        |  10 +++
>  xen/include/xen/vmap.h            |   5 ++
>  21 files changed, 495 insertions(+), 95 deletions(-)
>  create mode 100644 xen/arch/x86/pmap.c
>  create mode 100644 xen/include/asm-x86/pmap.h
> 
> -- 
> 2.24.1.AMZN
> 



* Re: [PATCH 11/16] x86: add a boot option to enable and disable the direct map
  2020-04-30 20:44 ` [PATCH 11/16] x86: add a boot option to enable and disable the direct map Hongyan Xia
  2020-05-01  8:43   ` Julien Grall
@ 2020-05-01 12:11   ` Wei Liu
  2020-05-01 12:59     ` Hongyan Xia
  1 sibling, 1 reply; 39+ messages in thread
From: Wei Liu @ 2020-05-01 12:11 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Thu, Apr 30, 2020 at 09:44:20PM +0100, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> Also add a helper function to retrieve it. Change arch_mfn_in_direct_map
> to check this option before returning.
> 
> This is added as a boot command line option, not a Kconfig. We do not
> produce different builds for EC2 so this is not introduced as a
> compile-time configuration.

Having a Kconfig will probably allow the compiler to eliminate dead
code.

This is not asking you to do the work, someone can come along and adjust 
arch_has_directmap easily.

Wei.



* Re: [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory
  2020-05-01 12:02   ` Wei Liu
@ 2020-05-01 12:46     ` Hongyan Xia
  0 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-05-01 12:46 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, julien, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel

On Fri, 2020-05-01 at 12:02 +0000, Wei Liu wrote:
> On Thu, Apr 30, 2020 at 09:44:11PM +0100, Hongyan Xia wrote:
> > From: Hongyan Xia <hongyxia@amazon.com>
> > 
> > Also, introduce a wrapper around vmap that maps a contiguous range
> > for
> > boot allocations.
> > 
> > Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> > ---
> >  xen/drivers/acpi/osl.c | 9 ++++++++-
> >  xen/include/xen/vmap.h | 5 +++++
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
> > index 4c8bb7839e..d0762dad4e 100644
> > --- a/xen/drivers/acpi/osl.c
> > +++ b/xen/drivers/acpi/osl.c
> > @@ -219,7 +219,11 @@ void *__init acpi_os_alloc_memory(size_t sz)
> >  	void *ptr;
> >  
> >  	if (system_state == SYS_STATE_early_boot)
> > -		return mfn_to_virt(mfn_x(alloc_boot_pages(PFN_UP(sz),
> > 1)));
> > +	{
> > +		mfn_t mfn = alloc_boot_pages(PFN_UP(sz), 1);
> > +
> > +		return vmap_boot_pages(mfn, PFN_UP(sz));
> > +	}
> >  
> >  	ptr = xmalloc_bytes(sz);
> >  	ASSERT(!ptr || is_xmalloc_memory(ptr));
> > @@ -244,5 +248,8 @@ void __init acpi_os_free_memory(void *ptr)
> >  	if (is_xmalloc_memory(ptr))
> >  		xfree(ptr);
> >  	else if (ptr && system_state == SYS_STATE_early_boot)
> > +	{
> > +		vunmap(ptr);
> >  		init_boot_pages(__pa(ptr), __pa(ptr) + PAGE_SIZE);
> > +	}
> >  }
> > diff --git a/xen/include/xen/vmap.h b/xen/include/xen/vmap.h
> > index 369560e620..c70801e195 100644
> > --- a/xen/include/xen/vmap.h
> > +++ b/xen/include/xen/vmap.h
> > @@ -23,6 +23,11 @@ void *vmalloc_xen(size_t size);
> >  void *vzalloc(size_t size);
> >  void vfree(void *va);
> >  
> > +static inline void *vmap_boot_pages(mfn_t mfn, unsigned int
> > nr_pages)
> 
> Nothing seems to tie this to boot pages only. Maybe it is better to
> name
> it after what it does, like vmap_mfns?

Hmm, indeed nothing so special about *boot* pages. Will change.

Hongyan




* Re: [PATCH 11/16] x86: add a boot option to enable and disable the direct map
  2020-05-01 12:11   ` Wei Liu
@ 2020-05-01 12:59     ` Hongyan Xia
  2020-05-01 13:11       ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Hongyan Xia @ 2020-05-01 12:59 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, julien, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Fri, 2020-05-01 at 12:11 +0000, Wei Liu wrote:
> On Thu, Apr 30, 2020 at 09:44:20PM +0100, Hongyan Xia wrote:
> > From: Hongyan Xia <hongyxia@amazon.com>
> > 
> > Also add a helper function to retrieve it. Change
> > arch_mfn_in_direct_map
> > to check this option before returning.
> > 
> > This is added as a boot command line option, not a Kconfig. We do
> > not
> > produce different builds for EC2 so this is not introduced as a
> > compile-time configuration.
> 
> Having a Kconfig will probably allow the compiler to eliminate dead
> code.
> 
> This is not asking you to do the work, someone can come along and
> adjust 
> arch_has_directmap easily.

My original code added this as a CONFIG option, but I converted it into
a boot-time switch, so I can just dig out history and convert it back.
I wonder if we should get more opinions on this to make a decision.

I would love Xen to have static key support though so that a boot-time
switch costs no run-time performance.

Hongyan




* Re: [PATCH 11/16] x86: add a boot option to enable and disable the direct map
  2020-05-01 12:59     ` Hongyan Xia
@ 2020-05-01 13:11       ` Wei Liu
  2020-05-01 15:59         ` Julien Grall
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2020-05-01 13:11 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Fri, May 01, 2020 at 01:59:24PM +0100, Hongyan Xia wrote:
> On Fri, 2020-05-01 at 12:11 +0000, Wei Liu wrote:
> > On Thu, Apr 30, 2020 at 09:44:20PM +0100, Hongyan Xia wrote:
> > > From: Hongyan Xia <hongyxia@amazon.com>
> > > 
> > > Also add a helper function to retrieve it. Change
> > > arch_mfn_in_direct_map
> > > to check this option before returning.
> > > 
> > > This is added as a boot command line option, not a Kconfig. We do
> > > not
> > > produce different builds for EC2 so this is not introduced as a
> > > compile-time configuration.
> > 
> > Having a Kconfig will probably allow the compiler to eliminate dead
> > code.
> > 
> > This is not asking you to do the work, someone can come along and
> > adjust 
> > arch_has_directmap easily.
> 
> My original code added this as a CONFIG option, but I converted it into
> a boot-time switch, so I can just dig out history and convert it back.
> I wonder if we should get more opinions on this to make a decision.

From my perspective, you as a contributor have done the work to scratch
your own itch, hence I said "not asking you to do the work". I don't
want to turn every comment into a formal ask and eventually lead to
feature creep.

> 
> I would love Xen to have static key support though so that a boot-time
> switch costs no run-time performance.
> 

Yes that would be great.

Wei.

> Hongyan
> 



* Re: [PATCH 00/16] Remove the direct map
  2020-05-01 12:07 ` [PATCH 00/16] Remove the direct map Wei Liu
@ 2020-05-01 13:53   ` Hongyan Xia
  2020-06-02  9:08     ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Hongyan Xia @ 2020-05-01 13:53 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, julien, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Fri, 2020-05-01 at 12:07 +0000, Wei Liu wrote:
> On Thu, Apr 30, 2020 at 09:44:09PM +0100, Hongyan Xia wrote:
> > From: Hongyan Xia <hongyxia@amazon.com>
> > 
> > This series depends on Xen page table domheap conversion:
> > 
https://lists.xenproject.org/archives/html/xen-devel/2020-04/msg01374.html
> > .
> > 
> > After breaking the reliance on the direct map to manipulate Xen
> > page
> > tables, we can now finally remove the direct map altogether.
> > 
> > This series:
> > - fixes many places that use the direct map incorrectly or assume
> > the
> >   presence of an always-mapped direct map in a wrong way.
> > - includes the early vmap patches for global mappings.
> > - initialises the mapcache for all domains, disables the fast path
> > that
> >   uses the direct map for mappings.
> > - maps and unmaps xenheap on-demand.
> > - adds a boot command line switch to enable or disable the direct
> > map.
> > 
> > This previous version was in RFC state and can be found here:
> > 
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02647.html
> > ,
> > which has since been broken into small series.
> 
> OOI have you done any performance measurements?
> 
> Seeing that now even guest tables need mapping / unmapping during
> restore, I'm curious to know how that would impact performance.

I actually have a lot of performance numbers but unfortunately on an
older version of Xen, not staging. I need to evaluate it again before
coming back to you. As you suspected, one strong signal from the
performance results is definitely the impact of walking guest tables.
For EPT, mapping and unmapping 20 times is no fun. This shows up in
micro-benchmarks, although larger benchmarks tend to be fine.

My question is, do we care about hiding EPT? I think it is fine to just
use xenheap pages (or any other form which does the job) so that we go
down from 20 mappings to only 4. I have done this hack with EPT and saw
significantly reduced impact for HVM guests in micro-benchmarks.

Hongyan




* Re: [PATCH 11/16] x86: add a boot option to enable and disable the direct map
  2020-05-01 13:11       ` Wei Liu
@ 2020-05-01 15:59         ` Julien Grall
  0 siblings, 0 replies; 39+ messages in thread
From: Julien Grall @ 2020-05-01 15:59 UTC (permalink / raw)
  To: Wei Liu, Hongyan Xia
  Cc: Stefano Stabellini, Andrew Cooper, Ian Jackson, George Dunlap,
	Jan Beulich, xen-devel, Volodymyr Babchuk, Roger Pau Monné

Hi,

On 01/05/2020 14:11, Wei Liu wrote:
> On Fri, May 01, 2020 at 01:59:24PM +0100, Hongyan Xia wrote:
>> On Fri, 2020-05-01 at 12:11 +0000, Wei Liu wrote:
>>> On Thu, Apr 30, 2020 at 09:44:20PM +0100, Hongyan Xia wrote:
>>>> From: Hongyan Xia <hongyxia@amazon.com>
>>>>
>>>> Also add a helper function to retrieve it. Change
>>>> arch_mfn_in_direct_map
>>>> to check this option before returning.
>>>>
>>>> This is added as a boot command line option, not a Kconfig. We do
>>>> not
>>>> produce different builds for EC2 so this is not introduced as a
>>>> compile-time configuration.
>>>
>>> Having a Kconfig will probably allow the compiler to eliminate dead
>>> code.
>>>
>>> This is not asking you to do the work, someone can come along and
>>> adjust
>>> arch_has_directmap easily.
>>
>> My original code added this as a CONFIG option, but I converted it into
>> a boot-time switch, so I can just dig out history and convert it back.
>> I wonder if we should get more opinions on this to make a decision.
> 
> From my perspective, you as a contributor have done the work to scratch
> your own itch, hence I said "not asking you to do the work". I don't
> want to turn every comment into a formal ask and eventually lead to
> feature creep.
> 
>>
>> I would love Xen to have static key support though so that a boot-time
>> switch costs no run-time performance.
>>
> 
> Yes that would be great.

 From my understanding, static keys are very powerful as you can modify 
the value even at runtime.

On Arm, I wrote a version that I would call static key for the poor. We 
are using alternative to select between 0 and 1 as an immediate value.

#define CHECK_WORKAROUND_HELPER(erratum, feature, arch)         \
static inline bool check_workaround_##erratum(void)             \
{                                                               \
     if ( !IS_ENABLED(arch) )                                    \
         return false;                                           \
     else                                                        \
     {                                                           \
         register_t ret;                                         \
                                                                 \
         asm volatile (ALTERNATIVE("mov %0, #0",                 \
                                   "mov %0, #1",                 \
                                   feature)                      \
                       : "=r" (ret));                            \
                                                                 \
         return unlikely(ret);                                   \
     }                                                           \
}

The generated code will still use a conditional branch, but the 
processor should always be able to predict correctly whether the 
condition is true or not.

The implementation is also very tailored to Arm, as we assume 
workarounds are not enabled by default. But I think this can be improved 
and made more generic.

Cheers,

-- 
Julien Grall



* Re: [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory
  2020-04-30 20:44 ` [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory Hongyan Xia
  2020-05-01 12:02   ` Wei Liu
@ 2020-05-01 21:35   ` Julien Grall
  2020-05-04  8:27     ` Hongyan Xia
  1 sibling, 1 reply; 39+ messages in thread
From: Julien Grall @ 2020-05-01 21:35 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel

Hi,

On Thu, 30 Apr 2020 at 21:44, Hongyan Xia <hx242@xen.org> wrote:
>
> From: Hongyan Xia <hongyxia@amazon.com>
>
> Also, introduce a wrapper around vmap that maps a contiguous range for
> boot allocations.
>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> ---
>  xen/drivers/acpi/osl.c | 9 ++++++++-
>  xen/include/xen/vmap.h | 5 +++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
> index 4c8bb7839e..d0762dad4e 100644
> --- a/xen/drivers/acpi/osl.c
> +++ b/xen/drivers/acpi/osl.c
> @@ -219,7 +219,11 @@ void *__init acpi_os_alloc_memory(size_t sz)
>         void *ptr;
>
>         if (system_state == SYS_STATE_early_boot)
> -               return mfn_to_virt(mfn_x(alloc_boot_pages(PFN_UP(sz), 1)));
> +       {
> +               mfn_t mfn = alloc_boot_pages(PFN_UP(sz), 1);
> +
> +               return vmap_boot_pages(mfn, PFN_UP(sz));
> +       }
>
>         ptr = xmalloc_bytes(sz);
>         ASSERT(!ptr || is_xmalloc_memory(ptr));
> @@ -244,5 +248,8 @@ void __init acpi_os_free_memory(void *ptr)
>         if (is_xmalloc_memory(ptr))
>                 xfree(ptr);
>         else if (ptr && system_state == SYS_STATE_early_boot)
> +       {
> +               vunmap(ptr);
>                 init_boot_pages(__pa(ptr), __pa(ptr) + PAGE_SIZE);

__pa(ptr) can only work on the direct map. Even worse, on Arm it will
fault because there is no mapping.
I think you will want to use vmap_to_mfn() before calling vunmap().

Cheers,


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory
  2020-05-01 21:35   ` Julien Grall
@ 2020-05-04  8:27     ` Hongyan Xia
  0 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2020-05-04  8:27 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel

On Fri, 2020-05-01 at 22:35 +0100, Julien Grall wrote:
> Hi,
> 
> On Thu, 30 Apr 2020 at 21:44, Hongyan Xia <hx242@xen.org> wrote:
> > 
> > From: Hongyan Xia <hongyxia@amazon.com>
> > 
> > Also, introduce a wrapper around vmap that maps a contiguous range
> > for
> > boot allocations.
> > 
> > Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> > ---
> >  xen/drivers/acpi/osl.c | 9 ++++++++-
> >  xen/include/xen/vmap.h | 5 +++++
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
> > index 4c8bb7839e..d0762dad4e 100644
> > --- a/xen/drivers/acpi/osl.c
> > +++ b/xen/drivers/acpi/osl.c
> > @@ -219,7 +219,11 @@ void *__init acpi_os_alloc_memory(size_t sz)
> >         void *ptr;
> > 
> >         if (system_state == SYS_STATE_early_boot)
> > -               return
> > mfn_to_virt(mfn_x(alloc_boot_pages(PFN_UP(sz), 1)));
> > +       {
> > +               mfn_t mfn = alloc_boot_pages(PFN_UP(sz), 1);
> > +
> > +               return vmap_boot_pages(mfn, PFN_UP(sz));
> > +       }
> > 
> >         ptr = xmalloc_bytes(sz);
> >         ASSERT(!ptr || is_xmalloc_memory(ptr));
> > @@ -244,5 +248,8 @@ void __init acpi_os_free_memory(void *ptr)
> >         if (is_xmalloc_memory(ptr))
> >                 xfree(ptr);
> >         else if (ptr && system_state == SYS_STATE_early_boot)
> > +       {
> > +               vunmap(ptr);
> >                 init_boot_pages(__pa(ptr), __pa(ptr) + PAGE_SIZE);
> 
> __pa(ptr) can only work on the direct map. Even worse, on Arm it will
> fault because there is no mapping.
> I think you will want to use vmap_to_mfn() before calling vunmap().

Thanks for spotting this. This is definitely wrong. Will revise.

Hongyan



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 00/16] Remove the direct map
  2020-05-01 13:53   ` Hongyan Xia
@ 2020-06-02  9:08     ` Wei Liu
  2021-04-28 10:14       ` Hongyan Xia
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2020-06-02  9:08 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Stefano Stabellini, julien, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Fri, May 01, 2020 at 02:53:08PM +0100, Hongyan Xia wrote:
> On Fri, 2020-05-01 at 12:07 +0000, Wei Liu wrote:
> > On Thu, Apr 30, 2020 at 09:44:09PM +0100, Hongyan Xia wrote:
> > > From: Hongyan Xia <hongyxia@amazon.com>
> > > 
> > > This series depends on Xen page table domheap conversion:
> > > 
> https://lists.xenproject.org/archives/html/xen-devel/2020-04/msg01374.html
> > > .
> > > 
> > > After breaking the reliance on the direct map to manipulate Xen
> > > page
> > > tables, we can now finally remove the direct map altogether.
> > > 
> > > This series:
> > > - fixes many places that use the direct map incorrectly or assume
> > > the
> > >   presence of an always-mapped direct map in a wrong way.
> > > - includes the early vmap patches for global mappings.
> > > - initialises the mapcache for all domains, disables the fast path
> > > that
> > >   uses the direct map for mappings.
> > > - maps and unmaps xenheap on-demand.
> > > - adds a boot command line switch to enable or disable the direct
> > > map.
> > > 
> > > This previous version was in RFC state and can be found here:
> > > 
> https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02647.html
> > > ,
> > > which has since been broken into small series.
> > 
> > OOI have you done any performance measurements?
> > 
> > Seeing that now even guest tables need mapping / unmapping during
> > restore, I'm curious to know how that would impact performance.
> 
> I actually have a lot of performance numbers but unfortunately on an
> older version of Xen, not staging. I need to evaluate it again before
> coming back to you. As you suspected, one strong signal from the
> performance results is definitely the impact of walking guest tables.
> For EPT, mapping and unmapping 20 times is no fun. This shows up in
> micro-benchmarks, although larger benchmarks tend to be fine.
> 
> My question is, do we care about hiding EPT? I think it is fine to just
> use xenheap pages (or any other form which does the job) so that we go
> down from 20 mappings to only 4. I have done this hack with EPT and saw
> significantly reduced impact for HVM guests in micro-benchmarks.

Not sure about hiding EPT. I will leave this question to Jan and
Andrew...

Wei.

> 
> Hongyan
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 04/16] x86/srat: vmap the pages for acpi_slit
  2020-04-30 20:44 ` [PATCH 04/16] x86/srat: vmap the pages for acpi_slit Hongyan Xia
@ 2020-11-30 10:16   ` Jan Beulich
  2020-11-30 18:11     ` Hongyan Xia
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2020-11-30 10:16 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: julien, Andrew Cooper, Wei Liu, Roger Pau Monné, xen-devel

On 30.04.2020 22:44, Hongyan Xia wrote:
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -196,7 +196,8 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
>  		return;
>  	}
>  	mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
> -	acpi_slit = mfn_to_virt(mfn_x(mfn));
> +	acpi_slit = vmap_boot_pages(mfn, PFN_UP(slit->header.length));
> +	BUG_ON(!acpi_slit);
>  	memcpy(acpi_slit, slit, slit->header.length);
>  }

I'm not sure in how far this series is still to be considered
active / pending; I still have it in my inbox as something to
look at in any event. If it is, then I think the latest by this
patch it becomes clear that we either want to make vmalloc()
boot-allocator capable, or introduce e.g. vmalloc_boot().
Having this recurring pattern including the somewhat odd
vmap_boot_pages() is imo not the best way forward. It would
then also no longer be necessary to allocate contiguous pages,
as none of the users up to here appear to have such a need.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 04/16] x86/srat: vmap the pages for acpi_slit
  2020-11-30 10:16   ` Jan Beulich
@ 2020-11-30 18:11     ` Hongyan Xia
  2020-12-01  7:37       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Hongyan Xia @ 2020-11-30 18:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, Andrew Cooper, Wei Liu, Roger Pau Monné, xen-devel

On Mon, 2020-11-30 at 11:16 +0100, Jan Beulich wrote:
> On 30.04.2020 22:44, Hongyan Xia wrote:
> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -196,7 +196,8 @@ void __init acpi_numa_slit_init(struct
> > acpi_table_slit *slit)
> >  		return;
> >  	}
> >  	mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
> > -	acpi_slit = mfn_to_virt(mfn_x(mfn));
> > +	acpi_slit = vmap_boot_pages(mfn, PFN_UP(slit->header.length));
> > +	BUG_ON(!acpi_slit);
> >  	memcpy(acpi_slit, slit, slit->header.length);
> >  }
> 
> I'm not sure in how far this series is still to be considered
> active / pending; I still have it in my inbox as something to
> look at in any event. If it is, then I think the latest by this
> patch it becomes clear that we either want to make vmalloc()
> boot-allocator capable, or introduce e.g. vmalloc_boot().
> Having this recurring pattern including the somewhat odd
> vmap_boot_pages() is imo not the best way forward. It would
> then also no longer be necessary to allocate contiguous pages,
> as none of the users up to here appear to have such a need.

This series is blocked on the PTE domheap conversion series so I will
definitely come back here after that series is merged.

vmap_boot_pages() (poorly named, there is nothing "boot" about it) is
actually useful in other patches as well, especially when there is no
direct map but we need to map a contiguous range, since
map_domain_page() can only handle a single one. So I would say there
will be a need for this function (maybe call it vmap_contig_pages()?)
even if for this patch a boot-capable vmalloc can do the job.

Hongyan



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 04/16] x86/srat: vmap the pages for acpi_slit
  2020-11-30 18:11     ` Hongyan Xia
@ 2020-12-01  7:37       ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2020-12-01  7:37 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: julien, Andrew Cooper, Wei Liu, Roger Pau Monné, xen-devel

On 30.11.2020 19:11, Hongyan Xia wrote:
> On Mon, 2020-11-30 at 11:16 +0100, Jan Beulich wrote:
>> On 30.04.2020 22:44, Hongyan Xia wrote:
>>> --- a/xen/arch/x86/srat.c
>>> +++ b/xen/arch/x86/srat.c
>>> @@ -196,7 +196,8 @@ void __init acpi_numa_slit_init(struct
>>> acpi_table_slit *slit)
>>>  		return;
>>>  	}
>>>  	mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
>>> -	acpi_slit = mfn_to_virt(mfn_x(mfn));
>>> +	acpi_slit = vmap_boot_pages(mfn, PFN_UP(slit->header.length));
>>> +	BUG_ON(!acpi_slit);
>>>  	memcpy(acpi_slit, slit, slit->header.length);
>>>  }
>>
>> I'm not sure in how far this series is still to be considered
>> active / pending; I still have it in my inbox as something to
>> look at in any event. If it is, then I think the latest by this
>> patch it becomes clear that we either want to make vmalloc()
>> boot-allocator capable, or introduce e.g. vmalloc_boot().
>> Having this recurring pattern including the somewhat odd
>> vmap_boot_pages() is imo not the best way forward. It would
>> then also no longer be necessary to allocate contiguous pages,
>> as none of the users up to here appear to have such a need.
> 
> This series is blocked on the PTE domheap conversion series so I will
> definitely come back here after that series is merged.
> 
> vmap_boot_pages() (poorly named, there is nothing "boot" about it) is
> actually useful in other patches as well, especially when there is no
> direct map but we need to map a contiguous range, since
> map_domain_page() can only handle a single one. So I would say there
> will be a need for this function (maybe call it vmap_contig_pages()?)
> even if for this patch a boot-capable vmalloc can do the job.

Question is in how many cases contiguous allocations are actually
needed. I suspect there aren't many, and hence vmalloc() (or a
boot time clone of it, if need be) may again be the better choice.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2020-04-30 20:44 ` [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map Hongyan Xia
  2020-05-01  8:50   ` Julien Grall
@ 2021-04-22 12:31   ` Jan Beulich
  2021-04-28 11:04     ` Hongyan Xia
  1 sibling, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2021-04-22 12:31 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: julien, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

On 30.04.2020 22:44, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> When there is not an always-mapped direct map, xenheap allocations need
> to be mapped and unmapped on-demand.
> 
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

This series has been left uncommented for far too long - I'm sorry.
While earlier patches here are probably reasonable (but would likely
need re-basing, so I'm not sure whether to try to get to look through
them before that makes much sense), I'd like to spell out that I'm
not really happy with the approach taken here: Simply re-introducing
a direct map entry for individual pages is not the way to go imo.
First and foremost this is rather wasteful in terms of resources (VA
space).

As I don't think we have many cases where code actually depends on
being able to apply __va() (or equivalent) to the address returned
from alloc_xenheap_pages(), I think this should instead involve
vmap(), with the vmap area drastically increased (perhaps taking all
of the space the direct map presently consumes). For any remaining
users of __va() or alike these should perhaps be converted into an
alias / derivation of vmap_to_{mfn,page}() then.

Since the goal of eliminating the direct map is to have unrelated
guests' memory not mapped when running a certain guest, it could
then be further considered to "overmap" what is being requested -
rather than just mapping the single 4k page, the containing 2M or 1G
one could be mapped (provided it all belongs to the running guest),
while unmapping could be deferred until the next context switch to a
different domain (or, if necessary, for 64-bit PV guests until the
next switch to guest user mode). Of course this takes as a prereq a
sufficiently low overhead means to establish whether the larger
containing page of a smaller one is all owned by the same domain.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 00/16] Remove the direct map
  2020-06-02  9:08     ` Wei Liu
@ 2021-04-28 10:14       ` Hongyan Xia
  0 siblings, 0 replies; 39+ messages in thread
From: Hongyan Xia @ 2021-04-28 10:14 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, julien, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, xen-devel, Volodymyr Babchuk,
	Roger Pau Monné

On Tue, 2020-06-02 at 09:08 +0000, Wei Liu wrote:
> On Fri, May 01, 2020 at 02:53:08PM +0100, Hongyan Xia wrote:

[...]

> Not sure about hiding EPT. I will leave this question to Jan and
> Andrew...

Quick update on performance numbers. I have seen noticeable performance
drop if we need to map and unmap EPT. This translates to up to 10%
degradation in networking and database applications.

With EPT always mapped (not considering it as secrets), I have not seen
any noticeable performance drop in any real-world applications. The
only performance drop is from a micro-benchmark that programs HPET in a
tight loop (abusing MMIO page table walks), and the worst case is
around 10% to 15%.

Note that all above results are based on the mapcache rewrite patch

4058e92ce21627731c49b588a95809dc0affd83a.1581015491.git.hongyxia@amazon.com
to fix the mapcache lock contention first.

Hongyan



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2021-04-22 12:31   ` Jan Beulich
@ 2021-04-28 11:04     ` Hongyan Xia
  2021-04-28 11:51       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Hongyan Xia @ 2021-04-28 11:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

On Thu, 2021-04-22 at 14:31 +0200, Jan Beulich wrote:
> On 30.04.2020 22:44, Hongyan Xia wrote:
> > From: Hongyan Xia <hongyxia@amazon.com>
> > 
> > When there is not an always-mapped direct map, xenheap allocations
> > need
> > to be mapped and unmapped on-demand.
> > 
> > Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> 
> This series has been left uncommented for far too long - I'm sorry.
> While earlier patches here are probably reasonable (but would likely
> need re-basing, so I'm not sure whether to try to get to look through
> them before that makes much sense),

No worries. This series depends on the domheap Xen page table
conversion series anyway (which was just fully merged. Thanks.). I will
re-base now since the dependency is resolved.

> As I don't think we have many cases where code actually depends on
> being able to apply __va() (or equivalent) to the address returned
> from alloc_xenheap_pages(), I think this should instead involve
> vmap(), with the vmap area drastically increased (perhaps taking all
> of the space the direct map presently consumes). For any remaining
> users of __va() or alike these should perhaps be converted into an
> alias / derivation of vmap_to_{mfn,page}() then.

That's true, and this was my first implementation (and also Wei's
original proposal) which worked okay. But, several problems got in the
way.

1. Partial unmap. Biggest offender is xmalloc which allocates and could
then free part of it, which means we need to be able to partially unmap
the region. vmap() does not support this.

2. Fast PA->VA. There is currently no way to go from PA to VA in
vmapped pages, unless we somehow repurpose or add new fields in
page_info. Also, VA->PA is possible but very slow now. There is not
much PA->VA in the critical path but see 3.

3. EPT. Mapping and unmapping EPT in HVM hypercalls and MMIO are so
many and so slow that it is probably not possible to keep them as
domheap pages due to the big performance drop after removing the direct
map. If we move them to xenheap pages on vmap, then this depends on 2
for page table walking.

In the end, I could not find a way that met all 3 above without massive
and intrusive changes. If there is a way, it certainly needs a design
document. The "on-demand" direct map solves all the problems without
breaking any APIs and is very easy to understand. We have been using
Xen without the direct map for a while now with this approach with
decent performance (in fact, you cannot tell that this is a Xen without
the direct map by just real-world benchmarks alone).

I too agree that this approach is a little hacky and wastes a big chunk
of virtual address space. It definitely wants some discussion on whether
a better way can be found that solves the problems.

Thanks,
Hongyan



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2021-04-28 11:04     ` Hongyan Xia
@ 2021-04-28 11:51       ` Jan Beulich
  2021-04-28 13:22         ` Hongyan Xia
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2021-04-28 11:51 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: julien, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

On 28.04.2021 13:04, Hongyan Xia wrote:
> On Thu, 2021-04-22 at 14:31 +0200, Jan Beulich wrote:
>> As I don't think we have many cases where code actually depends on
>> being able to apply __va() (or equivalent) to the address returned
>> from alloc_xenheap_pages(), I think this should instead involve
>> vmap(), with the vmap area drastically increased (perhaps taking all
>> of the space the direct map presently consumes). For any remaining
>> users of __va() or alike these should perhaps be converted into an
>> alias / derivation of vmap_to_{mfn,page}() then.
> 
> That's true, and this was my first implementation (and also Wei's
> original proposal) which worked okay. But, several problems got in the
> way.
> 
> 1. Partial unmap. Biggest offender is xmalloc which allocates and could
> then free part of it, which means we need to be able to partially unmap
> the region. vmap() does not support this.

If the direct map went fully away, and hence if Xen heap pages got
vmap()-ed, there's no reason to keep xmalloc() from forwarding to
vmalloc() instead of going this partial-unmap route.

> 2. Fast PA->VA. There is currently no way to go from PA to VA in
> vmapped pages, unless we somehow repurpose or add new fields in
> page_info. Also, VA->PA is possible but very slow now. There is not
> much PA->VA in the critical path but see 3.

There would better not be any PA->VA. Can you point out examples
where it would be hard to avoid using such? I also don't see the
connection to 3 - is EPT code using PA->VA a lot? p2m-ept.c does
not look to have a single use of __va() or ..._to_virt().

> 3. EPT. Mapping and unmapping EPT in HVM hypercalls and MMIO are so
> many and so slow that it is probably not possible to keep them as
> domheap pages due to the big performance drop after removing the direct
> map. If we move them to xenheap pages on vmap, then this depends on 2
> for page table walking.

See my proposal to defer unmapping of the domain's own pages
(and I would consider the p2m pages to be part of the domain's
ones for this purpose). In fact, since the p2m pages come from a
fixed, separate pool I wonder whether the entire pool couldn't
be mapped in e.g. the per-domain VA range.

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2021-04-28 11:51       ` Jan Beulich
@ 2021-04-28 13:22         ` Hongyan Xia
  2021-04-28 13:55           ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Hongyan Xia @ 2021-04-28 13:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

On Wed, 2021-04-28 at 13:51 +0200, Jan Beulich wrote:
> On 28.04.2021 13:04, Hongyan Xia wrote:
> > On Thu, 2021-04-22 at 14:31 +0200, Jan Beulich wrote:
> > > As I don't think we have many cases where code actually depends
> > > on
> > > being able to apply __va() (or equivalent) to the address
> > > returned
> > > from alloc_xenheap_pages(), I think this should instead involve
> > > vmap(), with the vmap area drastically increased (perhaps taking
> > > all
> > > of the space the direct map presently consumes). For any
> > > remaining
> > > users of __va() or alike these should perhaps be converted into
> > > an
> > > alias / derivation of vmap_to_{mfn,page}() then.
> > 
> > That's true, and this was my first implementation (and also Wei's
> > original proposal) which worked okay. But, several problems got in
> > the
> > way.
> > 
> > 1. Partial unmap. Biggest offender is xmalloc which allocates and
> > could
> > then free part of it, which means we need to be able to partially
> > unmap
> > the region. vmap() does not support this.
> 
> If the direct map went fully away, and hence if Xen heap pages got
> vmap()-ed, there's no reason to keep xmalloc() from forwarding to
> vmalloc() instead of going this partial-unmap route.
> 
> > 2. Fast PA->VA. There is currently no way to go from PA to VA in
> > vmapped pages, unless we somehow repurpose or add new fields in
> > page_info. Also, VA->PA is possible but very slow now. There is not
> > much PA->VA in the critical path but see 3.
> 
> There would better not be any PA->VA. Can you point out examples
> where it would be hard to avoid using such? I also don't see the
> connection to 3 - is EPT code using PA->VA a lot? p2m-ept.c does
> not look to have a single use of __va() or ..._to_virt().

p2m does not have any __va(), but my performance results showed that
mapping and unmapping EPT when there is no direct map was incredibly
slow, hence why I moved EPT to xenheap in my local branch, which uses
__va().

> See my proposal to defer unmapping of the domain's own pages
> (and I would consider the p2m pages to be part of the domain's
> ones for this purpose). In fact, since the p2m pages come from a
> fixed, separate pool I wonder whether the entire pool couldn't
> be mapped in e.g. the per-domain VA range.

I thought about that as well, not just EPT but a lot of domain-private
pages can be moved to the per-domain range, and the secrets are hidden
by virtue of cr3 switches when switching to other domains. But still we
have the problem of quickly finding PA->VA (I don't mean __va(), I mean
finding the VA that can access a page table page) for EPT walks.

Mapping in bigger pages should work wonders for pre-partitioned guests
where we know the guest mostly just has contiguous physical memory and
a superpage map probably covers all pages in an HVM 2-level walk. But
for a generic solution where domain memory can be really fragmented
(and context switches can happen a lot on a pCPU), how can we quickly
find PA->VA in EPT walking without some intrusive changes to Xen? Of
course, if we do not allow the HAP pool to change and force the HAP
pool to be physically contiguous, we can just remember the base VA of
its vmapped region for quick PA->VA, but I don't think this is a
generic solution.

Am I missing anything?

Hongyan



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map
  2021-04-28 13:22         ` Hongyan Xia
@ 2021-04-28 13:55           ` Jan Beulich
  0 siblings, 0 replies; 39+ messages in thread
From: Jan Beulich @ 2021-04-28 13:55 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: julien, Andrew Cooper, George Dunlap, Ian Jackson,
	Stefano Stabellini, Wei Liu, xen-devel

On 28.04.2021 15:22, Hongyan Xia wrote:
> On Wed, 2021-04-28 at 13:51 +0200, Jan Beulich wrote:
>> See my proposal to defer unmapping of the domain's own pages
>> (and I would consider the p2m pages to be part of the domain's
>> ones for this purpose). In fact, since the p2m pages come from a
>> fixed, separate pool I wonder whether the entire pool couldn't
>> be mapped in e.g. the per-domain VA range.
> 
> I thought about that as well, not just EPT but a lot of domain-private
> pages can be moved to the per-domain range, and the secrets are hidden
> by virtue of cr3 switches when switching to other domains. But still we
> have the problem of quickly finding PA->VA (I don't mean __va(), I mean
> finding the VA that can access a page table page) for EPT walks.
> 
> Mapping in bigger pages should work wonders for pre-partitioned guests
> where we know the guest mostly just has contiguous physical memory and
> a superpage map probably covers all pages in an HVM 2-level walk. But
> for a generic solution where domain memory can be really fragmented
> (and context switches can happen a lot on a pCPU), how can we quickly
> find PA->VA in EPT walking without some intrusive changes to Xen? Of
> course, if we do not allow the HAP pool to change and force the HAP
> pool to be physically contiguous, we can just remember the base VA of
> its vmapped region for quick PA->VA, but I don't think this is a
> generic solution.

We don't need to make the p2m pool static, but I think we can build on
it changing rarely, if ever. Hence a change may be acceptable even if it
is moderately expensive.

If we made the pool physically contiguous, translation - as you say -
would be easy. But even if we can't find enough physical memory for it
to be contiguous, we can still help ourselves. The intermediate case is
when we can still make it consist of all 2M pages. There translation
may be fast enough even via a brute force array lookup. If we need to
resort to 4k pages, why not maintain the PA->VA association in a radix
tree?

Independent of that I think there are some cycles to be gained by,
already today, not having to map and unmap the root page table for
every access (get, set, etc).

Jan


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2021-04-28 13:55 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-30 20:44 [PATCH 00/16] Remove the direct map Hongyan Xia
2020-04-30 20:44 ` [PATCH 01/16] x86/setup: move vm_init() before acpi calls Hongyan Xia
2020-04-30 20:44 ` [PATCH 02/16] acpi: vmap pages in acpi_os_alloc_memory Hongyan Xia
2020-05-01 12:02   ` Wei Liu
2020-05-01 12:46     ` Hongyan Xia
2020-05-01 21:35   ` Julien Grall
2020-05-04  8:27     ` Hongyan Xia
2020-04-30 20:44 ` [PATCH 03/16] x86/numa: vmap the pages for memnodemap Hongyan Xia
2020-04-30 20:44 ` [PATCH 04/16] x86/srat: vmap the pages for acpi_slit Hongyan Xia
2020-11-30 10:16   ` Jan Beulich
2020-11-30 18:11     ` Hongyan Xia
2020-12-01  7:37       ` Jan Beulich
2020-04-30 20:44 ` [PATCH 05/16] x86: map/unmap pages in restore_all_guests Hongyan Xia
2020-04-30 20:44 ` [PATCH 06/16] x86/pv: domheap pages should be mapped while relocating initrd Hongyan Xia
2020-04-30 20:44 ` [PATCH 07/16] x86/pv: rewrite how building PV dom0 handles domheap mappings Hongyan Xia
2020-04-30 20:44 ` [PATCH 08/16] x86: add Persistent Map (PMAP) infrastructure Hongyan Xia
2020-04-30 20:44 ` [PATCH 09/16] x86: lift mapcache variable to the arch level Hongyan Xia
2020-04-30 20:44 ` [PATCH 10/16] x86/mapcache: initialise the mapcache for the idle domain Hongyan Xia
2020-04-30 20:44 ` [PATCH 11/16] x86: add a boot option to enable and disable the direct map Hongyan Xia
2020-05-01  8:43   ` Julien Grall
2020-05-01 12:11   ` Wei Liu
2020-05-01 12:59     ` Hongyan Xia
2020-05-01 13:11       ` Wei Liu
2020-05-01 15:59         ` Julien Grall
2020-04-30 20:44 ` [PATCH 12/16] x86/domain_page: remove the fast paths when mfn is not in the directmap Hongyan Xia
2020-04-30 20:44 ` [PATCH 13/16] xen/page_alloc: add a path for xenheap when there is no direct map Hongyan Xia
2020-05-01  8:50   ` Julien Grall
2021-04-22 12:31   ` Jan Beulich
2021-04-28 11:04     ` Hongyan Xia
2021-04-28 11:51       ` Jan Beulich
2021-04-28 13:22         ` Hongyan Xia
2021-04-28 13:55           ` Jan Beulich
2020-04-30 20:44 ` [PATCH 14/16] x86/setup: leave early boot slightly earlier Hongyan Xia
2020-04-30 20:44 ` [PATCH 15/16] x86/setup: vmap heap nodes when they are outside the direct map Hongyan Xia
2020-04-30 20:44 ` [PATCH 16/16] x86/setup: do not create valid mappings when directmap=no Hongyan Xia
2020-05-01 12:07 ` [PATCH 00/16] Remove the direct map Wei Liu
2020-05-01 13:53   ` Hongyan Xia
2020-06-02  9:08     ` Wei Liu
2021-04-28 10:14       ` Hongyan Xia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).