* [PATCH v4 00/31] x86: refactor mm.c
@ 2017-08-17 14:44 Wei Liu
  2017-08-17 14:44 ` [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping Wei Liu
                   ` (30 more replies)
  0 siblings, 31 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

This series contains all patches from v3 rebased on top of staging, now that
the security issues blocking v3 have been published.

I also added BUG() in pv_{alloc,free}_page_type stubs.
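
For reference, the stubs now have roughly the shape sketched below (a
minimal illustration only; the signature is an assumption based on
alloc_page_type() in mm.c, and the actual stubs live in the !CONFIG_PV
part of the headers):

    /* Hypothetical sketch of a !CONFIG_PV stub -- reaching it is a bug. */
    static inline int pv_alloc_page_type(struct page_info *page,
                                         unsigned long type, bool preemptible)
    {
        BUG();
        return -EINVAL;
    }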

The last patch, which moved a code comment, has been dropped for now.

The code can be found at:

   https://xenbits.xen.org/git-http/people/liuw/xen.git wip.split-mm-v4

Wei.

Wei Liu (31):
  x86/mm: carve out create_grant_pv_mapping
  x86/mm: carve out replace_grant_pv_mapping
  x86/mm: split HVM grant table code to hvm/grant_table.c
  x86/mm: lift PAGE_CACHE_ATTRS to page.h
  x86/mm: document the return values from get_page_from_l*e
  x86: move pv_emul_is_mem_write to pv/emulate.c
  x86/mm: move and rename guest_get_eff{,kern}_l1e
  x86/mm: export get_page_from_mfn
  x86/mm: rename and move update_intpte
  x86/mm: move {un,}adjust_guest_* to pv/mm.h
  x86/mm: split out writable pagetable emulation code
  x86/mm: split out readonly MMIO emulation code
  x86/mm: remove the unused inclusion of pv/emulate.h
  x86/mm: move and rename guest_{,un}map_l1e
  x86/mm: split out PV grant table code
  x86/mm: split out descriptor table code
  x86/mm: move compat descriptor handling code
  x86/mm: move and rename map_ldt_shadow_page
  x86/mm: factor out pv_arch_init_memory
  x86/mm: move l4 table setup code
  x86/mm: add "pv_" prefix to new_guest_cr3
  x86: add pv_ prefix to {alloc,free}_page_type
  x86/mm: export more get/put page functions
  x86/mm: move and add pv_ prefix to create_pae_xen_mappings
  x86/mm: move disallow_mask variable and macros
  x86/mm: move pv_{alloc,free}_page_type
  x86/mm: move and add pv_ prefix to invalidate_shadow_ldt
  x86/mm: move PV hypercalls to pv/mm-hypercalls.c
  x86/mm: remove the now unused inclusion of pv/mm.h
  x86/mm: use put_page_type_preemptible in put_page_from_l{2,3}e
  x86/mm: move {get,put}_page_from_l{2,3,4}e

 xen/arch/x86/domain.c                 |   14 +-
 xen/arch/x86/hvm/Makefile             |    1 +
 xen/arch/x86/hvm/grant_table.c        |   89 +
 xen/arch/x86/mm.c                     | 4389 ++++-----------------------------
 xen/arch/x86/pv/Makefile              |    6 +
 xen/arch/x86/pv/descriptor-tables.c   |  270 ++
 xen/arch/x86/pv/dom0_build.c          |    3 +-
 xen/arch/x86/pv/domain.c              |    3 +-
 xen/arch/x86/pv/emul-mmio-op.c        |  166 ++
 xen/arch/x86/pv/emul-priv-op.c        |    3 +-
 xen/arch/x86/pv/emul-ptwr-op.c        |  327 +++
 xen/arch/x86/pv/emulate.c             |    7 +
 xen/arch/x86/pv/emulate.h             |    5 +
 xen/arch/x86/pv/grant_table.c         |  398 +++
 xen/arch/x86/pv/mm-hypercalls.c       | 1463 +++++++++++
 xen/arch/x86/pv/mm.c                  | 1034 ++++++++
 xen/arch/x86/pv/mm.h                  |    6 +
 xen/arch/x86/traps.c                  |    5 +-
 xen/arch/x86/x86_64/compat/mm.c       |   39 -
 xen/include/asm-x86/grant_table.h     |   26 +-
 xen/include/asm-x86/hvm/grant_table.h |   61 +
 xen/include/asm-x86/mm.h              |   35 +-
 xen/include/asm-x86/page.h            |    2 +
 xen/include/asm-x86/processor.h       |    5 -
 xen/include/asm-x86/pv/grant_table.h  |   60 +
 xen/include/asm-x86/pv/mm.h           |  187 ++
 xen/include/asm-x86/pv/processor.h    |   42 +
 27 files changed, 4612 insertions(+), 4034 deletions(-)
 create mode 100644 xen/arch/x86/hvm/grant_table.c
 create mode 100644 xen/arch/x86/pv/descriptor-tables.c
 create mode 100644 xen/arch/x86/pv/emul-mmio-op.c
 create mode 100644 xen/arch/x86/pv/emul-ptwr-op.c
 create mode 100644 xen/arch/x86/pv/grant_table.c
 create mode 100644 xen/arch/x86/pv/mm-hypercalls.c
 create mode 100644 xen/arch/x86/pv/mm.c
 create mode 100644 xen/arch/x86/pv/mm.h
 create mode 100644 xen/include/asm-x86/hvm/grant_table.h
 create mode 100644 xen/include/asm-x86/pv/grant_table.h
 create mode 100644 xen/include/asm-x86/pv/mm.h
 create mode 100644 xen/include/asm-x86/pv/processor.h

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-18 10:12   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 02/31] x86/mm: carve out replace_grant_pv_mapping Wei Liu
                   ` (29 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

At the same time, make create_grant_host_mapping an inline function.  This
requires making create_grant_{p2m,pv}_mapping non-static.  Provide
{hvm,pv}/grant_table.h and include the headers where necessary.

The two functions create_grant_{p2m,pv}_mapping will be moved later in
dedicated patches, together with all their helpers.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
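The create_grant_host_mapping() wrapper stays in the header because it is
called from common grant table code; roughly (an illustrative sketch, not
taken verbatim from xen/common/grant_table.c -- the error label is
hypothetical):

    /* Common code is oblivious to the PV/HVM split: the inline wrapper
     * dispatches on the paging mode of the current domain. */
    rc = create_grant_host_mapping(addr, frame, op->flags, cache_flags);
    if ( rc != GNTST_okay )
        goto undo_out; /* hypothetical error label */
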
 xen/arch/x86/mm.c                     | 16 +++++------
 xen/include/asm-x86/grant_table.h     | 16 +++++++++--
 xen/include/asm-x86/hvm/grant_table.h | 53 +++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/pv/grant_table.h  | 52 ++++++++++++++++++++++++++++++++++
 4 files changed, 127 insertions(+), 10 deletions(-)
 create mode 100644 xen/include/asm-x86/hvm/grant_table.h
 create mode 100644 xen/include/asm-x86/pv/grant_table.h

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 31fe8a1472..28bcff2c99 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -123,6 +123,9 @@
 #include <asm/io_apic.h>
 #include <asm/pci.h>
 
+#include <asm/hvm/grant_table.h>
+#include <asm/pv/grant_table.h>
+
 /* Mapping of the fixmap space needed early. */
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
@@ -4014,9 +4017,9 @@ static int destroy_grant_va_mapping(
     return replace_grant_va_mapping(addr, frame, l1e_empty(), v);
 }
 
-static int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
-                                    unsigned int flags,
-                                    unsigned int cache_flags)
+int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                             unsigned int flags,
+                             unsigned int cache_flags)
 {
     p2m_type_t p2mt;
     int rc;
@@ -4037,15 +4040,12 @@ static int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
         return GNTST_okay;
 }
 
-int create_grant_host_mapping(uint64_t addr, unsigned long frame,
-                              unsigned int flags, unsigned int cache_flags)
+int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                            unsigned int flags, unsigned int cache_flags)
 {
     l1_pgentry_t pte;
     uint32_t grant_pte_flags;
 
-    if ( paging_mode_external(current->domain) )
-        return create_grant_p2m_mapping(addr, frame, flags, cache_flags);
-
     grant_pte_flags =
         _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_GNTTAB;
     if ( cpu_has_nx )
diff --git a/xen/include/asm-x86/grant_table.h b/xen/include/asm-x86/grant_table.h
index 1561bdab0d..559ad2f275 100644
--- a/xen/include/asm-x86/grant_table.h
+++ b/xen/include/asm-x86/grant_table.h
@@ -7,14 +7,26 @@
 #ifndef __ASM_GRANT_TABLE_H__
 #define __ASM_GRANT_TABLE_H__
 
+#include <asm/paging.h>
+
+#include <asm/hvm/grant_table.h>
+#include <asm/pv/grant_table.h>
+
 #define INITIAL_NR_GRANT_FRAMES 4
 
 /*
  * Caller must own caller's BIGLOCK, is responsible for flushing the TLB, and
  * must hold a reference to the page.
  */
-int create_grant_host_mapping(uint64_t addr, unsigned long frame,
-			      unsigned int flags, unsigned int cache_flags);
+static inline int create_grant_host_mapping(uint64_t addr, unsigned long frame,
+                                            unsigned int flags,
+                                            unsigned int cache_flags)
+{
+    if ( paging_mode_external(current->domain) )
+        return create_grant_p2m_mapping(addr, frame, flags, cache_flags);
+    return create_grant_pv_mapping(addr, frame, flags, cache_flags);
+}
+
 int replace_grant_host_mapping(
     uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags);
 
diff --git a/xen/include/asm-x86/hvm/grant_table.h b/xen/include/asm-x86/hvm/grant_table.h
new file mode 100644
index 0000000000..83202c219c
--- /dev/null
+++ b/xen/include/asm-x86/hvm/grant_table.h
@@ -0,0 +1,53 @@
+/*
+ * asm-x86/hvm/grant_table.h
+ *
+ * Grant table interfaces for HVM guests
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __X86_HVM_GRANT_TABLE_H__
+#define __X86_HVM_GRANT_TABLE_H__
+
+#ifdef CONFIG_HVM
+
+int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                             unsigned int flags,
+                             unsigned int cache_flags);
+
+#else
+
+#include <public/grant_table.h>
+
+static inline int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                                           unsigned int flags,
+                                           unsigned int cache_flags)
+{
+    return GNTST_general_error;
+}
+
+#endif
+
+#endif /* __X86_HVM_GRANT_TABLE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/pv/grant_table.h b/xen/include/asm-x86/pv/grant_table.h
new file mode 100644
index 0000000000..165ebce22f
--- /dev/null
+++ b/xen/include/asm-x86/pv/grant_table.h
@@ -0,0 +1,52 @@
+/*
+ * asm-x86/pv/grant_table.h
+ *
+ * Grant table interfaces for PV guests
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __X86_PV_GRANT_TABLE_H__
+#define __X86_PV_GRANT_TABLE_H__
+
+#ifdef CONFIG_PV
+
+int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                            unsigned int flags, unsigned int cache_flags);
+
+#else
+
+#include <public/grant_table.h>
+
+static inline int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                                          unsigned int flags,
+                                          unsigned int cache_flags)
+{
+    return GNTST_general_error;
+}
+
+#endif
+
+#endif /* __X86_PV_GRANT_TABLE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 02/31] x86/mm: carve out replace_grant_pv_mapping
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
  2017-08-17 14:44 ` [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-18 10:14   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c Wei Liu
                   ` (28 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

At the same time, make it an inline function. Add declarations of
replace_grant_{p2m,pv}_mapping to the respective header files.

The code movement will be done in later patches.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c                     |  9 +++------
 xen/include/asm-x86/grant_table.h     | 10 ++++++++--
 xen/include/asm-x86/hvm/grant_table.h |  8 ++++++++
 xen/include/asm-x86/pv/grant_table.h  |  8 ++++++++
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 28bcff2c99..d7d04772c5 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4068,7 +4068,7 @@ int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
     return create_grant_va_mapping(addr, pte, current);
 }
 
-static int replace_grant_p2m_mapping(
+int replace_grant_p2m_mapping(
     uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags)
 {
     unsigned long gfn = (unsigned long)(addr >> PAGE_SHIFT);
@@ -4098,8 +4098,8 @@ static int replace_grant_p2m_mapping(
     return GNTST_okay;
 }
 
-int replace_grant_host_mapping(
-    uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags)
+int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                             uint64_t new_addr, unsigned int flags)
 {
     struct vcpu *curr = current;
     l1_pgentry_t *pl1e, ol1e;
@@ -4107,9 +4107,6 @@ int replace_grant_host_mapping(
     struct page_info *l1pg;
     int rc;
 
-    if ( paging_mode_external(current->domain) )
-        return replace_grant_p2m_mapping(addr, frame, new_addr, flags);
-
     if ( flags & GNTMAP_contains_pte )
     {
         if ( !new_addr )
diff --git a/xen/include/asm-x86/grant_table.h b/xen/include/asm-x86/grant_table.h
index 559ad2f275..33b2f88b96 100644
--- a/xen/include/asm-x86/grant_table.h
+++ b/xen/include/asm-x86/grant_table.h
@@ -27,8 +27,14 @@ static inline int create_grant_host_mapping(uint64_t addr, unsigned long frame,
     return create_grant_pv_mapping(addr, frame, flags, cache_flags);
 }
 
-int replace_grant_host_mapping(
-    uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags);
+static inline int replace_grant_host_mapping(uint64_t addr, unsigned long frame,
+                                             uint64_t new_addr,
+                                             unsigned int flags)
+{
+    if ( paging_mode_external(current->domain) )
+        return replace_grant_p2m_mapping(addr, frame, new_addr, flags);
+    return replace_grant_pv_mapping(addr, frame, new_addr, flags);
+}
 
 #define gnttab_create_shared_page(d, t, i)                               \
     do {                                                                 \
diff --git a/xen/include/asm-x86/hvm/grant_table.h b/xen/include/asm-x86/hvm/grant_table.h
index 83202c219c..4b1afa179b 100644
--- a/xen/include/asm-x86/hvm/grant_table.h
+++ b/xen/include/asm-x86/hvm/grant_table.h
@@ -26,6 +26,8 @@
 int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
                              unsigned int flags,
                              unsigned int cache_flags);
+int replace_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                              uint64_t new_addr, unsigned int flags);
 
 #else
 
@@ -38,6 +40,12 @@ static inline int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
     return GNTST_general_error;
 }
 
+static inline int replace_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                                            uint64_t new_addr, unsigned int flags)
+{
+    return GNTST_general_error;
+}
+
 #endif
 
 #endif /* __X86_HVM_GRANT_TABLE_H__ */
diff --git a/xen/include/asm-x86/pv/grant_table.h b/xen/include/asm-x86/pv/grant_table.h
index 165ebce22f..c6474973cd 100644
--- a/xen/include/asm-x86/pv/grant_table.h
+++ b/xen/include/asm-x86/pv/grant_table.h
@@ -25,6 +25,8 @@
 
 int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
                             unsigned int flags, unsigned int cache_flags);
+int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                             uint64_t new_addr, unsigned int flags);
 
 #else
 
@@ -37,6 +39,12 @@ static inline int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
     return GNTST_general_error;
 }
 
+static inline int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                                           uint64_t new_addr, unsigned int flags)
+{
+    return GNTST_general_error;
+}
+
 #endif
 
 #endif /* __X86_PV_GRANT_TABLE_H__ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
  2017-08-17 14:44 ` [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping Wei Liu
  2017-08-17 14:44 ` [PATCH v4 02/31] x86/mm: carve out replace_grant_pv_mapping Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-18 10:16   ` Jan Beulich
  2017-08-18 10:26   ` Andrew Cooper
  2017-08-17 14:44 ` [PATCH v4 04/31] x86/mm: lift PAGE_CACHE_ATTRS to page.h Wei Liu
                   ` (27 subsequent siblings)
  30 siblings, 2 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Code motion only; no functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/hvm/Makefile      |  1 +
 xen/arch/x86/hvm/grant_table.c | 89 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm.c              | 53 -------------------------
 3 files changed, 90 insertions(+), 53 deletions(-)
 create mode 100644 xen/arch/x86/hvm/grant_table.c

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index c394af7364..5bd38f633f 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -6,6 +6,7 @@ obj-y += dm.o
 obj-bin-y += dom0_build.init.o
 obj-y += domain.o
 obj-y += emulate.o
+obj-y += grant_table.o
 obj-y += hpet.o
 obj-y += hvm.o
 obj-y += hypercall.o
diff --git a/xen/arch/x86/hvm/grant_table.c b/xen/arch/x86/hvm/grant_table.c
new file mode 100644
index 0000000000..7503c2c61b
--- /dev/null
+++ b/xen/arch/x86/hvm/grant_table.c
@@ -0,0 +1,89 @@
+/******************************************************************************
+ * arch/x86/hvm/grant_table.c
+ *
+ * Grant table interfaces for HVM guests
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/types.h>
+
+#include <public/grant_table.h>
+
+#include <asm/p2m.h>
+
+int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                             unsigned int flags,
+                             unsigned int cache_flags)
+{
+    p2m_type_t p2mt;
+    int rc;
+
+    if ( cache_flags  || (flags & ~GNTMAP_readonly) != GNTMAP_host_map )
+        return GNTST_general_error;
+
+    if ( flags & GNTMAP_readonly )
+        p2mt = p2m_grant_map_ro;
+    else
+        p2mt = p2m_grant_map_rw;
+    rc = guest_physmap_add_entry(current->domain,
+                                 _gfn(addr >> PAGE_SHIFT),
+                                 _mfn(frame), PAGE_ORDER_4K, p2mt);
+    if ( rc )
+        return GNTST_general_error;
+    else
+        return GNTST_okay;
+}
+
+int replace_grant_p2m_mapping(uint64_t addr, unsigned long frame,
+                              uint64_t new_addr, unsigned int flags)
+{
+    unsigned long gfn = (unsigned long)(addr >> PAGE_SHIFT);
+    p2m_type_t type;
+    mfn_t old_mfn;
+    struct domain *d = current->domain;
+
+    if ( new_addr != 0 || (flags & GNTMAP_contains_pte) )
+        return GNTST_general_error;
+
+    old_mfn = get_gfn(d, gfn, &type);
+    if ( !p2m_is_grant(type) || mfn_x(old_mfn) != frame )
+    {
+        put_gfn(d, gfn);
+        gdprintk(XENLOG_WARNING,
+                 "old mapping invalid (type %d, mfn %" PRI_mfn ", frame %lx)\n",
+                 type, mfn_x(old_mfn), frame);
+        return GNTST_general_error;
+    }
+    if ( guest_physmap_remove_page(d, _gfn(gfn), _mfn(frame), PAGE_ORDER_4K) )
+    {
+        put_gfn(d, gfn);
+        return GNTST_general_error;
+    }
+
+    put_gfn(d, gfn);
+    return GNTST_okay;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d7d04772c5..5c6a7e5638 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4017,29 +4017,6 @@ static int destroy_grant_va_mapping(
     return replace_grant_va_mapping(addr, frame, l1e_empty(), v);
 }
 
-int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
-                             unsigned int flags,
-                             unsigned int cache_flags)
-{
-    p2m_type_t p2mt;
-    int rc;
-
-    if ( cache_flags  || (flags & ~GNTMAP_readonly) != GNTMAP_host_map )
-        return GNTST_general_error;
-
-    if ( flags & GNTMAP_readonly )
-        p2mt = p2m_grant_map_ro;
-    else
-        p2mt = p2m_grant_map_rw;
-    rc = guest_physmap_add_entry(current->domain,
-                                 _gfn(addr >> PAGE_SHIFT),
-                                 _mfn(frame), PAGE_ORDER_4K, p2mt);
-    if ( rc )
-        return GNTST_general_error;
-    else
-        return GNTST_okay;
-}
-
 int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
                             unsigned int flags, unsigned int cache_flags)
 {
@@ -4068,36 +4045,6 @@ int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
     return create_grant_va_mapping(addr, pte, current);
 }
 
-int replace_grant_p2m_mapping(
-    uint64_t addr, unsigned long frame, uint64_t new_addr, unsigned int flags)
-{
-    unsigned long gfn = (unsigned long)(addr >> PAGE_SHIFT);
-    p2m_type_t type;
-    mfn_t old_mfn;
-    struct domain *d = current->domain;
-
-    if ( new_addr != 0 || (flags & GNTMAP_contains_pte) )
-        return GNTST_general_error;
-
-    old_mfn = get_gfn(d, gfn, &type);
-    if ( !p2m_is_grant(type) || mfn_x(old_mfn) != frame )
-    {
-        put_gfn(d, gfn);
-        gdprintk(XENLOG_WARNING,
-                 "old mapping invalid (type %d, mfn %" PRI_mfn ", frame %lx)\n",
-                 type, mfn_x(old_mfn), frame);
-        return GNTST_general_error;
-    }
-    if ( guest_physmap_remove_page(d, _gfn(gfn), _mfn(frame), PAGE_ORDER_4K) )
-    {
-        put_gfn(d, gfn);
-        return GNTST_general_error;
-    }
-
-    put_gfn(d, gfn);
-    return GNTST_okay;
-}
-
 int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
                              uint64_t new_addr, unsigned int flags)
 {
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 04/31] x86/mm: lift PAGE_CACHE_ATTRS to page.h
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (2 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:46   ` Andrew Cooper
  2017-08-17 14:44 ` [PATCH v4 05/31] x86/mm: document the return values from get_page_from_l*e Wei Liu
                   ` (26 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Currently all of its users are within x86/mm.c, but that will change once
we split the PV-specific mm code into another file. Lift the definition to
page.h, alongside the other _PAGE_* constants, in preparation for later
patches.

No functional change. Some spaces are added around "|" while moving.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c          | 2 --
 xen/include/asm-x86/page.h | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5c6a7e5638..64dd520044 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -151,8 +151,6 @@ bool __read_mostly machine_to_phys_mapping_valid;
 
 struct rangeset *__read_mostly mmio_ro_ranges;
 
-#define PAGE_CACHE_ATTRS (_PAGE_PAT|_PAGE_PCD|_PAGE_PWT)
-
 static uint32_t base_disallow_mask;
 /* Global bit is allowed to be set on L1 PTEs. Intended for user mappings. */
 #define L1_DISALLOW_MASK ((base_disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL)
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 263ca5bc3c..d082ba8d42 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -304,6 +304,8 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 #define _PAGE_AVAIL_HIGH (_AC(0x7ff, U) << 12)
 #define _PAGE_NX       (cpu_has_nx ? _PAGE_NX_BIT : 0)
 
+#define PAGE_CACHE_ATTRS (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
+
 /*
  * Debug option: Ensure that granted mappings are not implicitly unmapped.
  * WARNING: This will need to be disabled to run OSes that use the spare PTE
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 05/31] x86/mm: document the return values from get_page_from_l*e
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (3 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 04/31] x86/mm: lift PAGE_CACHE_ATTRS to page.h Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-18 10:24   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c Wei Liu
                   ` (25 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

No functional change; this only adds comments documenting existing
behaviour.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
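A sketch of how a caller is expected to act on these values (illustrative
only; the real logic lives in alloc_l2_table() and friends):

    int rc = get_page_from_l2e(l2e, pfn, d);

    if ( rc < 0 )          /* error code, e.g. -EINVAL */
        return rc;
    else if ( rc > 0 )     /* page not present: no reference was taken */
        ; /* nothing to validate for this entry */
    else                   /* rc == 0: success, reference taken */
        adjust_guest_l2e(l2e, d);
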
 xen/arch/x86/mm.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 64dd520044..5983a56811 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -869,6 +869,12 @@ static int print_mmio_emul_range(unsigned long s, unsigned long e, void *arg)
 }
 #endif
 
+/*
+ * get_page_from_l1e returns:
+ *   0  => success (page not present also counts as such)
+ *  <0  => error code
+ *  >0  => the page flags to be flipped
+ */
 int
 get_page_from_l1e(
     l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner)
@@ -1081,6 +1087,12 @@ get_page_from_l1e(
 
 
 /* NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'. */
+/*
+ * get_page_from_l2e returns:
+ *   1 => page not present
+ *   0 => success
+ *  <0 => error code
+ */
 define_get_linear_pagetable(l2);
 static int
 get_page_from_l2e(
@@ -1111,6 +1123,12 @@ get_page_from_l2e(
 }
 
 
+/*
+ * get_page_from_l3e returns:
+ *   1 => page not present
+ *   0 => success
+ *  <0 => error code
+ */
 define_get_linear_pagetable(l3);
 static int
 get_page_from_l3e(
@@ -1138,6 +1156,12 @@ get_page_from_l3e(
     return rc;
 }
 
+/*
+ * get_page_from_l4e returns:
+ *   1 => page not present
+ *   0 => success
+ *  <0 => error code
+ */
 define_get_linear_pagetable(l4);
 static int
 get_page_from_l4e(
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (4 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 05/31] x86/mm: document the return values from get_page_from_l*e Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:53   ` Andrew Cooper
  2017-08-18 10:08   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 07/31] x86/mm: move and rename guest_get_eff{,kern}_l1e Wei Liu
                   ` (24 subsequent siblings)
  30 siblings, 2 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Export it via pv/emulate.h.  This in turn requires including pv/emulate.h
in x86/mm.c.

The function will be used by different emulation handlers in later
patches.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
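For context, the function serves as the ->validate hook of the emulation
ops, so that only instructions which actually write to memory get
emulated -- a sketch, assuming ptwr_emulate_ops keeps its current shape:

    static const struct x86_emulate_ops ptwr_emulate_ops = {
        .read       = ptwr_emulated_read,
        .insn_fetch = ptwr_emulated_read,
        /* ... */
        .validate   = pv_emul_is_mem_write, /* reject non-write insns */
    };
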
 xen/arch/x86/mm.c         | 9 ++-------
 xen/arch/x86/pv/emulate.c | 7 +++++++
 xen/arch/x86/pv/emulate.h | 3 +++
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5983a56811..e0e655ac31 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -126,6 +126,8 @@
 #include <asm/hvm/grant_table.h>
 #include <asm/pv/grant_table.h>
 
+#include "pv/emulate.h"
+
 /* Mapping of the fixmap space needed early. */
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
@@ -5138,13 +5140,6 @@ static int ptwr_emulated_cmpxchg(
         container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
 }
 
-static int pv_emul_is_mem_write(const struct x86_emulate_state *state,
-                                struct x86_emulate_ctxt *ctxt)
-{
-    return x86_insn_is_mem_write(state, ctxt) ? X86EMUL_OKAY
-                                              : X86EMUL_UNHANDLEABLE;
-}
-
 static const struct x86_emulate_ops ptwr_emulate_ops = {
     .read       = ptwr_emulated_read,
     .insn_fetch = ptwr_emulated_read,
diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c
index 5750c7699b..1c4d6eab28 100644
--- a/xen/arch/x86/pv/emulate.c
+++ b/xen/arch/x86/pv/emulate.c
@@ -87,6 +87,13 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip)
     }
 }
 
+int pv_emul_is_mem_write(const struct x86_emulate_state *state,
+                         struct x86_emulate_ctxt *ctxt)
+{
+    return x86_insn_is_mem_write(state, ctxt) ? X86EMUL_OKAY
+                                              : X86EMUL_UNHANDLEABLE;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/pv/emulate.h b/xen/arch/x86/pv/emulate.h
index b2b1192d48..89abbe010f 100644
--- a/xen/arch/x86/pv/emulate.h
+++ b/xen/arch/x86/pv/emulate.h
@@ -7,4 +7,7 @@ int pv_emul_read_descriptor(unsigned int sel, const struct vcpu *v,
 
 void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip);
 
+int pv_emul_is_mem_write(const struct x86_emulate_state *state,
+                         struct x86_emulate_ctxt *ctxt);
+
 #endif /* __PV_EMULATE_H__ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 07/31] x86/mm: move and rename guest_get_eff{,kern}_l1e
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (5 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-24 14:52   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 08/31] x86/mm: export get_page_from_mfn Wei Liu
                   ` (23 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move them to pv/mm.c and rename them to pv_get_guest_eff_{,kern}_l1e.
Export them via pv/mm.h.

They will be used later in emulation handlers.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
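The read goes through the linear pagetable mapping: __linear_l1_table
aliases the current pagetables, so the l1e mapping a given address sits
at a fixed index -- roughly (a sketch of the macro as defined in
asm-x86/page.h):

    /* Index of the l1e that maps _a within the linear l1 table. */
    #define l1_linear_offset(_a) (((_a) & VADDR_MASK) >> L1_PAGETABLE_SHIFT)
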
 xen/arch/x86/mm.c           | 38 +++----------------------
 xen/arch/x86/pv/Makefile    |  1 +
 xen/arch/x86/pv/mm.c        | 67 +++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/pv/mm.h | 53 +++++++++++++++++++++++++++++++++++
 4 files changed, 125 insertions(+), 34 deletions(-)
 create mode 100644 xen/arch/x86/pv/mm.c
 create mode 100644 xen/include/asm-x86/pv/mm.h

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index e0e655ac31..7eb80ecaa3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -125,6 +125,7 @@
 
 #include <asm/hvm/grant_table.h>
 #include <asm/pv/grant_table.h>
+#include <asm/pv/mm.h>
 
 #include "pv/emulate.h"
 
@@ -554,37 +555,6 @@ static inline void guest_unmap_l1e(void *p)
     unmap_domain_page(p);
 }
 
-/* Read a PV guest's l1e that maps this virtual address. */
-static inline void guest_get_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e)
-{
-    ASSERT(!paging_mode_translate(current->domain));
-    ASSERT(!paging_mode_external(current->domain));
-
-    if ( unlikely(!__addr_ok(addr)) ||
-         __copy_from_user(eff_l1e,
-                          &__linear_l1_table[l1_linear_offset(addr)],
-                          sizeof(l1_pgentry_t)) )
-        *eff_l1e = l1e_empty();
-}
-
-/*
- * Read the guest's l1e that maps this address, from the kernel-mode
- * page tables.
- */
-static inline void guest_get_eff_kern_l1e(struct vcpu *v, unsigned long addr,
-                                          void *eff_l1e)
-{
-    const bool user_mode = !(v->arch.flags & TF_kernel_mode);
-
-    if ( user_mode )
-        toggle_guest_mode(v);
-
-    guest_get_eff_l1e(addr, eff_l1e);
-
-    if ( user_mode )
-        toggle_guest_mode(v);
-}
-
 static inline void page_set_tlbflush_timestamp(struct page_info *page)
 {
     /*
@@ -669,7 +639,7 @@ int map_ldt_shadow_page(unsigned int off)
 
     if ( is_pv_32bit_domain(d) )
         gva = (u32)gva;
-    guest_get_eff_kern_l1e(v, gva, &l1e);
+    pv_get_guest_eff_kern_l1e(v, gva, &l1e);
     if ( unlikely(!(l1e_get_flags(l1e) & _PAGE_PRESENT)) )
         return 0;
 
@@ -5168,7 +5138,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
     int rc;
 
     /* Attempt to read the PTE that maps the VA being accessed. */
-    guest_get_eff_l1e(addr, &pte);
+    pv_get_guest_eff_l1e(addr, &pte);
 
     /* We are looking only for read-only mappings of p.t. pages. */
     if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) ||
@@ -5323,7 +5293,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     int rc;
 
     /* Attempt to read the PTE that maps the VA being accessed. */
-    guest_get_eff_l1e(addr, &pte);
+    pv_get_guest_eff_l1e(addr, &pte);
 
     /* We are looking only for read-only mappings of MMIO pages. */
     if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) )
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 4e15484471..c83aed493b 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -7,6 +7,7 @@ obj-y += emul-priv-op.o
 obj-y += hypercall.o
 obj-y += iret.o
 obj-y += misc-hypercalls.o
+obj-y += mm.o
 obj-y += traps.o
 
 obj-bin-y += dom0_build.init.o
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
new file mode 100644
index 0000000000..aa2ce34145
--- /dev/null
+++ b/xen/arch/x86/pv/mm.c
@@ -0,0 +1,67 @@
+/******************************************************************************
+ * arch/x86/pv/mm.c
+ *
+ * Memory management code for PV guests
+ *
+ * Copyright (c) 2002-2005 K A Fraser
+ * Copyright (c) 2004 Christian Limpach
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/guest_access.h>
+
+#include <asm/pv/mm.h>
+
+
+/* Read a PV guest's l1e that maps this virtual address. */
+void pv_get_guest_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e)
+{
+    ASSERT(!paging_mode_translate(current->domain));
+    ASSERT(!paging_mode_external(current->domain));
+
+    if ( unlikely(!__addr_ok(addr)) ||
+         __copy_from_user(eff_l1e,
+                          &__linear_l1_table[l1_linear_offset(addr)],
+                          sizeof(l1_pgentry_t)) )
+        *eff_l1e = l1e_empty();
+}
+
+/*
+ * Read the guest's l1e that maps this address, from the kernel-mode
+ * page tables.
+ */
+void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
+                               void *eff_l1e)
+{
+    const bool user_mode = !(v->arch.flags & TF_kernel_mode);
+
+    if ( user_mode )
+        toggle_guest_mode(v);
+
+    pv_get_guest_eff_l1e(addr, eff_l1e);
+
+    if ( user_mode )
+        toggle_guest_mode(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
new file mode 100644
index 0000000000..19dbc3b66c
--- /dev/null
+++ b/xen/include/asm-x86/pv/mm.h
@@ -0,0 +1,53 @@
+/*
+ * asm-x86/pv/mm.h
+ *
+ * Memory management interfaces for PV guests
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __X86_PV_MM_H__
+#define __X86_PV_MM_H__
+
+#ifdef CONFIG_PV
+
+void pv_get_guest_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e);
+
+void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
+                               void *eff_l1e);
+
+#else
+
+static inline void pv_get_guest_eff_l1e(unsigned long addr,
+                                        l1_pgentry_t *eff_l1e)
+{}
+
+static inline void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
+                                             void *eff_l1e)
+{}
+
+#endif
+
+#endif /* __X86_PV_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 08/31] x86/mm: export get_page_from_mfn
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (6 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 07/31] x86/mm: move and rename guest_get_eff{,kern}_l1e Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-24 14:55   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 09/31] x86/mm: rename and move update_intpte Wei Liu
                   ` (22 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

It will be used by different files later, so export it via
asm-x86/mm.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
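Callers pair it with put_page() in the usual way -- an illustrative
sketch, not taken verbatim from any existing caller:

    if ( !get_page_from_mfn(mfn, d) )
        return -EINVAL;

    /* ... access the page ... */

    put_page(mfn_to_page(mfn_x(mfn)));
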
 xen/arch/x86/mm.c        | 2 +-
 xen/include/asm-x86/mm.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7eb80ecaa3..d25d314673 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -666,7 +666,7 @@ int map_ldt_shadow_page(unsigned int off)
 }
 
 
-static bool get_page_from_mfn(mfn_t mfn, struct domain *d)
+bool get_page_from_mfn(mfn_t mfn, struct domain *d)
 {
     struct page_info *page = mfn_to_page(mfn_x(mfn));
 
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 2bf3f335ad..c6e1d01c7d 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -344,6 +344,7 @@ int  put_old_guest_table(struct vcpu *);
 int  get_page_from_l1e(
     l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner);
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner);
+bool get_page_from_mfn(mfn_t mfn, struct domain *d);
 
 static inline void put_page_and_type(struct page_info *page)
 {
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 09/31] x86/mm: rename and move update_intpte
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (7 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 08/31] x86/mm: export get_page_from_mfn Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-24 14:59   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 10/31] x86/mm: move {un,}adjust_guest_* to pv/mm.h Wei Liu
                   ` (21 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

That function is only used by code supporting PV guests, so add the pv_
prefix.

Export it via pv/mm.h. Move UPDATE_ENTRY along with it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
Now it is no longer an inline function, but I don't think that matters
much.
---
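For example, a typical invocation in mod_l1_entry() (a sketch; the
argument names are indicative) expands as follows:

    /* UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu, preserve_ad)
     * becomes: */
    pv_update_intpte(&l1e_get_intpte(*pl1e),
                     l1e_get_intpte(ol1e), l1e_get_intpte(nl1e),
                     gl1mfn, pt_vcpu, preserve_ad);
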
 xen/arch/x86/mm.c           | 65 ---------------------------------------------
 xen/arch/x86/pv/mm.c        | 54 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/pv/mm.h | 16 +++++++++++
 3 files changed, 70 insertions(+), 65 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d25d314673..dc4ac5592a 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -133,14 +133,6 @@
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
 
-/*
- * PTE updates can be done with ordinary writes except:
- *  1. Debug builds get extra checking by using CMPXCHG[8B].
- */
-#if !defined(NDEBUG)
-#define PTE_UPDATE_WITH_CMPXCHG
-#endif
-
 paddr_t __read_mostly mem_hotplug;
 
 /* Private domain structs for DOMID_XEN and DOMID_IO. */
@@ -1812,63 +1804,6 @@ void page_unlock(struct page_info *page)
     } while ( (y = cmpxchg(&page->u.inuse.type_info, x, nx)) != x );
 }
 
-/*
- * How to write an entry to the guest pagetables.
- * Returns false for failure (pointer not valid), true for success.
- */
-static inline bool update_intpte(
-    intpte_t *p, intpte_t old, intpte_t new, unsigned long mfn,
-    struct vcpu *v, int preserve_ad)
-{
-    bool rv = true;
-
-#ifndef PTE_UPDATE_WITH_CMPXCHG
-    if ( !preserve_ad )
-    {
-        rv = paging_write_guest_entry(v, p, new, _mfn(mfn));
-    }
-    else
-#endif
-    {
-        intpte_t t = old;
-
-        for ( ; ; )
-        {
-            intpte_t _new = new;
-
-            if ( preserve_ad )
-                _new |= old & (_PAGE_ACCESSED | _PAGE_DIRTY);
-
-            rv = paging_cmpxchg_guest_entry(v, p, &t, _new, _mfn(mfn));
-            if ( unlikely(rv == 0) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "Failed to update %" PRIpte " -> %" PRIpte
-                         ": saw %" PRIpte "\n", old, _new, t);
-                break;
-            }
-
-            if ( t == old )
-                break;
-
-            /* Allowed to change in Accessed/Dirty flags only. */
-            BUG_ON((t ^ old) & ~(intpte_t)(_PAGE_ACCESSED|_PAGE_DIRTY));
-
-            old = t;
-        }
-    }
-    return rv;
-}
-
-/*
- * Macro that wraps the appropriate type-changes around update_intpte().
- * Arguments are: type, ptr, old, new, mfn, vcpu
- */
-#define UPDATE_ENTRY(_t,_p,_o,_n,_m,_v,_ad)                         \
-    update_intpte(&_t ## e_get_intpte(*(_p)),                       \
-                  _t ## e_get_intpte(_o), _t ## e_get_intpte(_n),   \
-                  (_m), (_v), (_ad))
-
 /*
  * PTE flags that a guest may change without re-validating the PTE.
  * All other bits affect translation, caching, or Xen's safety.
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index aa2ce34145..2cb5995e62 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -24,6 +24,13 @@
 
 #include <asm/pv/mm.h>
 
+/*
+ * PTE updates can be done with ordinary writes except:
+ *  1. Debug builds get extra checking by using CMPXCHG[8B].
+ */
+#if !defined(NDEBUG)
+#define PTE_UPDATE_WITH_CMPXCHG
+#endif
 
 /* Read a PV guest's l1e that maps this virtual address. */
 void pv_get_guest_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e)
@@ -56,6 +63,53 @@ void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
         toggle_guest_mode(v);
 }
 
+/*
+ * How to write an entry to the guest pagetables.
+ * Returns false for failure (pointer not valid), true for success.
+ */
+bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
+                      unsigned long mfn, struct vcpu *v, int preserve_ad)
+{
+    bool rv = true;
+
+#ifndef PTE_UPDATE_WITH_CMPXCHG
+    if ( !preserve_ad )
+    {
+        rv = paging_write_guest_entry(v, p, new, _mfn(mfn));
+    }
+    else
+#endif
+    {
+        intpte_t t = old;
+
+        for ( ; ; )
+        {
+            intpte_t _new = new;
+
+            if ( preserve_ad )
+                _new |= old & (_PAGE_ACCESSED | _PAGE_DIRTY);
+
+            rv = paging_cmpxchg_guest_entry(v, p, &t, _new, _mfn(mfn));
+            if ( unlikely(rv == 0) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "Failed to update %" PRIpte " -> %" PRIpte
+                         ": saw %" PRIpte "\n", old, _new, t);
+                break;
+            }
+
+            if ( t == old )
+                break;
+
+            /* Allowed to change in Accessed/Dirty flags only. */
+            BUG_ON((t ^ old) & ~(intpte_t)(_PAGE_ACCESSED|_PAGE_DIRTY));
+
+            old = t;
+        }
+    }
+    return rv;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 19dbc3b66c..72c04c684f 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -28,6 +28,17 @@ void pv_get_guest_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e);
 void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
                                void *eff_l1e);
 
+bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
+                      unsigned long mfn, struct vcpu *v, int preserve_ad);
+/*
+ * Macro that wraps the appropriate type-changes around pv_update_intpte().
+ * Arguments are: type, ptr, old, new, mfn, vcpu
+ */
+#define UPDATE_ENTRY(_t,_p,_o,_n,_m,_v,_ad)                            \
+    pv_update_intpte(&_t ## e_get_intpte(*(_p)),                       \
+                     _t ## e_get_intpte(_o), _t ## e_get_intpte(_n),   \
+                     (_m), (_v), (_ad))
+
 #else
 
 static inline void pv_get_guest_eff_l1e(unsigned long addr,
@@ -38,6 +49,11 @@ static inline void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
                                              void *eff_l1e)
 {}
 
+static inline bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
+                                    unsigned long mfn, struct vcpu *v,
+                                    int preserve_ad)
+{ return false; }
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 10/31] x86/mm: move {un,}adjust_guest_* to pv/mm.h
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (8 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 09/31] x86/mm: rename and move update_intpte Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-24 15:00   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code Wei Liu
                   ` (20 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Those macros will soon be used in different files. They are PV-specific,
so move them to pv/mm.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
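Typical usage is right before a new PTE is installed -- a sketch along
the lines of mod_l1_entry(), with indicative names:

    /* Give a 64-bit PV guest's kernel mappings the flags Xen requires
     * before writing the entry back. */
    adjust_guest_l1e(nl1e, d);
    if ( !UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu, preserve_ad) )
        rc = -EBUSY; /* hypothetical error handling */
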
 xen/arch/x86/mm.c           | 47 ---------------------------------------------
 xen/include/asm-x86/pv/mm.h | 47 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index dc4ac5592a..4ac69b3804 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1151,53 +1151,6 @@ get_page_from_l4e(
     return rc;
 }
 
-#define adjust_guest_l1e(pl1e, d)                                            \
-    do {                                                                     \
-        if ( likely(l1e_get_flags((pl1e)) & _PAGE_PRESENT) &&                \
-             likely(!is_pv_32bit_domain(d)) )                                \
-        {                                                                    \
-            /* _PAGE_GUEST_KERNEL page cannot have the Global bit set. */    \
-            if ( (l1e_get_flags((pl1e)) & (_PAGE_GUEST_KERNEL|_PAGE_GLOBAL)) \
-                 == (_PAGE_GUEST_KERNEL|_PAGE_GLOBAL) )                      \
-                gdprintk(XENLOG_WARNING,                                     \
-                         "Global bit is set to kernel page %lx\n",           \
-                         l1e_get_pfn((pl1e)));                               \
-            if ( !(l1e_get_flags((pl1e)) & _PAGE_USER) )                     \
-                l1e_add_flags((pl1e), (_PAGE_GUEST_KERNEL|_PAGE_USER));      \
-            if ( !(l1e_get_flags((pl1e)) & _PAGE_GUEST_KERNEL) )             \
-                l1e_add_flags((pl1e), (_PAGE_GLOBAL|_PAGE_USER));            \
-        }                                                                    \
-    } while ( 0 )
-
-#define adjust_guest_l2e(pl2e, d)                               \
-    do {                                                        \
-        if ( likely(l2e_get_flags((pl2e)) & _PAGE_PRESENT) &&   \
-             likely(!is_pv_32bit_domain(d)) )                   \
-            l2e_add_flags((pl2e), _PAGE_USER);                  \
-    } while ( 0 )
-
-#define adjust_guest_l3e(pl3e, d)                                   \
-    do {                                                            \
-        if ( likely(l3e_get_flags((pl3e)) & _PAGE_PRESENT) )        \
-            l3e_add_flags((pl3e), likely(!is_pv_32bit_domain(d)) ?  \
-                                         _PAGE_USER :               \
-                                         _PAGE_USER|_PAGE_RW);      \
-    } while ( 0 )
-
-#define adjust_guest_l4e(pl4e, d)                               \
-    do {                                                        \
-        if ( likely(l4e_get_flags((pl4e)) & _PAGE_PRESENT) &&   \
-             likely(!is_pv_32bit_domain(d)) )                   \
-            l4e_add_flags((pl4e), _PAGE_USER);                  \
-    } while ( 0 )
-
-#define unadjust_guest_l3e(pl3e, d)                                         \
-    do {                                                                    \
-        if ( unlikely(is_pv_32bit_domain(d)) &&                             \
-             likely(l3e_get_flags((pl3e)) & _PAGE_PRESENT) )                \
-            l3e_remove_flags((pl3e), _PAGE_USER|_PAGE_RW|_PAGE_ACCESSED);   \
-    } while ( 0 )
-
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 {
     unsigned long     pfn = l1e_get_pfn(l1e);
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 72c04c684f..b3887989b6 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -23,6 +23,53 @@
 
 #ifdef CONFIG_PV
 
+#define adjust_guest_l1e(pl1e, d)                                            \
+    do {                                                                     \
+        if ( likely(l1e_get_flags((pl1e)) & _PAGE_PRESENT) &&                \
+             likely(!is_pv_32bit_domain(d)) )                                \
+        {                                                                    \
+            /* _PAGE_GUEST_KERNEL page cannot have the Global bit set. */    \
+            if ( (l1e_get_flags((pl1e)) & (_PAGE_GUEST_KERNEL|_PAGE_GLOBAL)) \
+                 == (_PAGE_GUEST_KERNEL|_PAGE_GLOBAL) )                      \
+                gdprintk(XENLOG_WARNING,                                     \
+                         "Global bit is set to kernel page %lx\n",           \
+                         l1e_get_pfn((pl1e)));                               \
+            if ( !(l1e_get_flags((pl1e)) & _PAGE_USER) )                     \
+                l1e_add_flags((pl1e), (_PAGE_GUEST_KERNEL|_PAGE_USER));      \
+            if ( !(l1e_get_flags((pl1e)) & _PAGE_GUEST_KERNEL) )             \
+                l1e_add_flags((pl1e), (_PAGE_GLOBAL|_PAGE_USER));            \
+        }                                                                    \
+    } while ( 0 )
+
+#define adjust_guest_l2e(pl2e, d)                               \
+    do {                                                        \
+        if ( likely(l2e_get_flags((pl2e)) & _PAGE_PRESENT) &&   \
+             likely(!is_pv_32bit_domain(d)) )                   \
+            l2e_add_flags((pl2e), _PAGE_USER);                  \
+    } while ( 0 )
+
+#define adjust_guest_l3e(pl3e, d)                                   \
+    do {                                                            \
+        if ( likely(l3e_get_flags((pl3e)) & _PAGE_PRESENT) )        \
+            l3e_add_flags((pl3e), likely(!is_pv_32bit_domain(d)) ?  \
+                                         _PAGE_USER :               \
+                                         _PAGE_USER|_PAGE_RW);      \
+    } while ( 0 )
+
+#define adjust_guest_l4e(pl4e, d)                               \
+    do {                                                        \
+        if ( likely(l4e_get_flags((pl4e)) & _PAGE_PRESENT) &&   \
+             likely(!is_pv_32bit_domain(d)) )                   \
+            l4e_add_flags((pl4e), _PAGE_USER);                  \
+    } while ( 0 )
+
+#define unadjust_guest_l3e(pl3e, d)                                         \
+    do {                                                                    \
+        if ( unlikely(is_pv_32bit_domain(d)) &&                             \
+             likely(l3e_get_flags((pl3e)) & _PAGE_PRESENT) )                \
+            l3e_remove_flags((pl3e), _PAGE_USER|_PAGE_RW|_PAGE_ACCESSED);   \
+    } while ( 0 )
+
 void pv_get_guest_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e);
 
 void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (9 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 10/31] x86/mm: move {un,}adjust_guest_* to pv/mm.h Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-24 15:15   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 12/31] x86/mm: split out readonly MMIO " Wei Liu
                   ` (19 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/emul-ptwr-op.c. Fix coding style issues while
moving the code.

Rename ptwr_emulated_read to pv_emul_ptwr_read and export it via
pv/emulate.h because it is needed by other emulation code.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
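A note on the moved code, for reviewers: the only non-obvious part is
the sub-word widening in ptwr_emulated_update(), which turns a 1/2/4
byte guest write into a full-word update. A minimal userspace sketch of
just that arithmetic (assuming a 64-bit paddr_t; illustrative only, not
part of the patch):

#include <stdint.h>
#include <stdio.h>

typedef uint64_t paddr_t;                   /* assumption: 64-bit build */

/*
 * Merge a 'bytes'-wide write of 'val' at byte 'offset' into the
 * existing full word 'full', as ptwr_emulated_update() does when
 * bytes != sizeof(paddr_t).  Only valid for bytes < sizeof(paddr_t).
 */
static paddr_t widen(paddr_t full, paddr_t val,
                     unsigned int bytes, unsigned int offset)
{
    /* Mask out the bytes the caller is overwriting. */
    full &= ~((((paddr_t)1 << (bytes * 8)) - 1) << (offset * 8));
    /* Truncate the caller's value and shift it into position. */
    val  &= ((paddr_t)1 << (bytes * 8)) - 1;
    val <<= offset * 8;
    /* OR in the untouched surrounding bytes. */
    return val | full;
}

int main(void)
{
    /* A 2-byte write of 0xbeef at byte offset 2 of an existing PTE. */
    paddr_t pte = 0xaabbccdd;

    printf("%#llx\n", (unsigned long long)widen(pte, 0xbeef, 2, 2));
    /* Prints 0xbeefccdd: bytes 2-3 replaced, bytes 0-1 preserved. */
    return 0;
}
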
 xen/arch/x86/mm.c              | 308 +-------------------------------------
 xen/arch/x86/pv/Makefile       |   1 +
 xen/arch/x86/pv/emul-ptwr-op.c | 327 +++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/pv/emulate.h      |   2 +
 4 files changed, 332 insertions(+), 306 deletions(-)
 create mode 100644 xen/arch/x86/pv/emul-ptwr-op.c

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 4ac69b3804..3c0aa52f38 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4786,310 +4786,6 @@ long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 }
 
 
-/*************************
- * Writable Pagetables
- */
-
-struct ptwr_emulate_ctxt {
-    struct x86_emulate_ctxt ctxt;
-    unsigned long cr2;
-    l1_pgentry_t  pte;
-};
-
-static int ptwr_emulated_read(
-    enum x86_segment seg,
-    unsigned long offset,
-    void *p_data,
-    unsigned int bytes,
-    struct x86_emulate_ctxt *ctxt)
-{
-    unsigned int rc = bytes;
-    unsigned long addr = offset;
-
-    if ( !__addr_ok(addr) ||
-         (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
-    {
-        x86_emul_pagefault(0, addr + bytes - rc, ctxt);  /* Read fault. */
-        return X86EMUL_EXCEPTION;
-    }
-
-    return X86EMUL_OKAY;
-}
-
-static int ptwr_emulated_update(
-    unsigned long addr,
-    paddr_t old,
-    paddr_t val,
-    unsigned int bytes,
-    unsigned int do_cmpxchg,
-    struct ptwr_emulate_ctxt *ptwr_ctxt)
-{
-    unsigned long mfn;
-    unsigned long unaligned_addr = addr;
-    struct page_info *page;
-    l1_pgentry_t pte, ol1e, nl1e, *pl1e;
-    struct vcpu *v = current;
-    struct domain *d = v->domain;
-    int ret;
-
-    /* Only allow naturally-aligned stores within the original %cr2 page. */
-    if ( unlikely(((addr ^ ptwr_ctxt->cr2) & PAGE_MASK) ||
-                  (addr & (bytes - 1))) )
-    {
-        gdprintk(XENLOG_WARNING, "bad access (cr2=%lx, addr=%lx, bytes=%u)\n",
-                 ptwr_ctxt->cr2, addr, bytes);
-        return X86EMUL_UNHANDLEABLE;
-    }
-
-    /* Turn a sub-word access into a full-word access. */
-    if ( bytes != sizeof(paddr_t) )
-    {
-        paddr_t      full;
-        unsigned int rc, offset = addr & (sizeof(paddr_t) - 1);
-
-        /* Align address; read full word. */
-        addr &= ~(sizeof(paddr_t) - 1);
-        if ( (rc = copy_from_user(&full, (void *)addr, sizeof(paddr_t))) != 0 )
-        {
-            x86_emul_pagefault(0, /* Read fault. */
-                               addr + sizeof(paddr_t) - rc,
-                               &ptwr_ctxt->ctxt);
-            return X86EMUL_EXCEPTION;
-        }
-        /* Mask out bits provided by caller. */
-        full &= ~((((paddr_t)1 << (bytes * 8)) - 1) << (offset * 8));
-        /* Shift the caller value and OR in the missing bits. */
-        val  &= (((paddr_t)1 << (bytes * 8)) - 1);
-        val <<= (offset) * 8;
-        val  |= full;
-        /* Also fill in missing parts of the cmpxchg old value. */
-        old  &= (((paddr_t)1 << (bytes * 8)) - 1);
-        old <<= (offset) * 8;
-        old  |= full;
-    }
-
-    pte  = ptwr_ctxt->pte;
-    mfn  = l1e_get_pfn(pte);
-    page = mfn_to_page(mfn);
-
-    /* We are looking only for read-only mappings of p.t. pages. */
-    ASSERT((l1e_get_flags(pte) & (_PAGE_RW|_PAGE_PRESENT)) == _PAGE_PRESENT);
-    ASSERT(mfn_valid(_mfn(mfn)));
-    ASSERT((page->u.inuse.type_info & PGT_type_mask) == PGT_l1_page_table);
-    ASSERT((page->u.inuse.type_info & PGT_count_mask) != 0);
-    ASSERT(page_get_owner(page) == d);
-
-    /* Check the new PTE. */
-    nl1e = l1e_from_intpte(val);
-    switch ( ret = get_page_from_l1e(nl1e, d, d) )
-    {
-    default:
-        if ( is_pv_32bit_domain(d) && (bytes == 4) && (unaligned_addr & 4) &&
-             !do_cmpxchg && (l1e_get_flags(nl1e) & _PAGE_PRESENT) )
-        {
-            /*
-             * If this is an upper-half write to a PAE PTE then we assume that
-             * the guest has simply got the two writes the wrong way round. We
-             * zap the PRESENT bit on the assumption that the bottom half will
-             * be written immediately after we return to the guest.
-             */
-            gdprintk(XENLOG_DEBUG, "ptwr_emulate: fixing up invalid PAE PTE %"
-                     PRIpte"\n", l1e_get_intpte(nl1e));
-            l1e_remove_flags(nl1e, _PAGE_PRESENT);
-        }
-        else
-        {
-            gdprintk(XENLOG_WARNING, "could not get_page_from_l1e()\n");
-            return X86EMUL_UNHANDLEABLE;
-        }
-        break;
-    case 0:
-        break;
-    case _PAGE_RW ... _PAGE_RW | PAGE_CACHE_ATTRS:
-        ASSERT(!(ret & ~(_PAGE_RW | PAGE_CACHE_ATTRS)));
-        l1e_flip_flags(nl1e, ret);
-        break;
-    }
-
-    adjust_guest_l1e(nl1e, d);
-
-    /* Checked successfully: do the update (write or cmpxchg). */
-    pl1e = map_domain_page(_mfn(mfn));
-    pl1e = (l1_pgentry_t *)((unsigned long)pl1e + (addr & ~PAGE_MASK));
-    if ( do_cmpxchg )
-    {
-        bool okay;
-        intpte_t t = old;
-
-        ol1e = l1e_from_intpte(old);
-        okay = paging_cmpxchg_guest_entry(v, &l1e_get_intpte(*pl1e),
-                                          &t, l1e_get_intpte(nl1e), _mfn(mfn));
-        okay = (okay && t == old);
-
-        if ( !okay )
-        {
-            unmap_domain_page(pl1e);
-            put_page_from_l1e(nl1e, d);
-            return X86EMUL_RETRY;
-        }
-    }
-    else
-    {
-        ol1e = *pl1e;
-        if ( !UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, mfn, v, 0) )
-            BUG();
-    }
-
-    trace_ptwr_emulation(addr, nl1e);
-
-    unmap_domain_page(pl1e);
-
-    /* Finally, drop the old PTE. */
-    put_page_from_l1e(ol1e, d);
-
-    return X86EMUL_OKAY;
-}
-
-static int ptwr_emulated_write(
-    enum x86_segment seg,
-    unsigned long offset,
-    void *p_data,
-    unsigned int bytes,
-    struct x86_emulate_ctxt *ctxt)
-{
-    paddr_t val = 0;
-
-    if ( (bytes > sizeof(paddr_t)) || (bytes & (bytes - 1)) || !bytes )
-    {
-        gdprintk(XENLOG_WARNING, "bad write size (addr=%lx, bytes=%u)\n",
-                 offset, bytes);
-        return X86EMUL_UNHANDLEABLE;
-    }
-
-    memcpy(&val, p_data, bytes);
-
-    return ptwr_emulated_update(
-        offset, 0, val, bytes, 0,
-        container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
-}
-
-static int ptwr_emulated_cmpxchg(
-    enum x86_segment seg,
-    unsigned long offset,
-    void *p_old,
-    void *p_new,
-    unsigned int bytes,
-    struct x86_emulate_ctxt *ctxt)
-{
-    paddr_t old = 0, new = 0;
-
-    if ( (bytes > sizeof(paddr_t)) || (bytes & (bytes - 1)) )
-    {
-        gdprintk(XENLOG_WARNING, "bad cmpxchg size (addr=%lx, bytes=%u)\n",
-                 offset, bytes);
-        return X86EMUL_UNHANDLEABLE;
-    }
-
-    memcpy(&old, p_old, bytes);
-    memcpy(&new, p_new, bytes);
-
-    return ptwr_emulated_update(
-        offset, old, new, bytes, 1,
-        container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
-}
-
-static const struct x86_emulate_ops ptwr_emulate_ops = {
-    .read       = ptwr_emulated_read,
-    .insn_fetch = ptwr_emulated_read,
-    .write      = ptwr_emulated_write,
-    .cmpxchg    = ptwr_emulated_cmpxchg,
-    .validate   = pv_emul_is_mem_write,
-    .cpuid      = pv_emul_cpuid,
-};
-
-/* Write page fault handler: check if guest is trying to modify a PTE. */
-int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
-                       struct cpu_user_regs *regs)
-{
-    struct domain *d = v->domain;
-    struct page_info *page;
-    l1_pgentry_t      pte;
-    struct ptwr_emulate_ctxt ptwr_ctxt = {
-        .ctxt = {
-            .regs = regs,
-            .vendor = d->arch.cpuid->x86_vendor,
-            .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
-            .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
-            .lma       = !is_pv_32bit_domain(d),
-        },
-    };
-    int rc;
-
-    /* Attempt to read the PTE that maps the VA being accessed. */
-    pv_get_guest_eff_l1e(addr, &pte);
-
-    /* We are looking only for read-only mappings of p.t. pages. */
-    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) ||
-         rangeset_contains_singleton(mmio_ro_ranges, l1e_get_pfn(pte)) ||
-         !get_page_from_mfn(_mfn(l1e_get_pfn(pte)), d) )
-        goto bail;
-
-    page = l1e_get_page(pte);
-    if ( !page_lock(page) )
-    {
-        put_page(page);
-        goto bail;
-    }
-
-    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(page);
-        put_page(page);
-        goto bail;
-    }
-
-    ptwr_ctxt.cr2 = addr;
-    ptwr_ctxt.pte = pte;
-
-    rc = x86_emulate(&ptwr_ctxt.ctxt, &ptwr_emulate_ops);
-
-    page_unlock(page);
-    put_page(page);
-
-    switch ( rc )
-    {
-    case X86EMUL_EXCEPTION:
-        /*
-         * This emulation only covers writes to pagetables which are marked
-         * read-only by Xen.  We tolerate #PF (in case a concurrent pagetable
-         * update has succeeded on a different vcpu).  Anything else is an
-         * emulation bug, or a guest playing with the instruction stream under
-         * Xen's feet.
-         */
-        if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
-             ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
-            pv_inject_event(&ptwr_ctxt.ctxt.event);
-        else
-            gdprintk(XENLOG_WARNING,
-                     "Unexpected event (type %u, vector %#x) from emulation\n",
-                     ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
-
-        /* Fallthrough */
-    case X86EMUL_OKAY:
-
-        if ( ptwr_ctxt.ctxt.retire.singlestep )
-            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
-
-        /* Fallthrough */
-    case X86EMUL_RETRY:
-        perfc_incr(ptwr_emulations);
-        return EXCRET_fault_fixed;
-    }
-
- bail:
-    return 0;
-}
-
 /*************************
  * fault handling for read-only MMIO pages
  */
@@ -5117,7 +4813,7 @@ int mmio_ro_emulated_write(
 
 static const struct x86_emulate_ops mmio_ro_emulate_ops = {
     .read       = x86emul_unhandleable_rw,
-    .insn_fetch = ptwr_emulated_read,
+    .insn_fetch = pv_emul_ptwr_read,
     .write      = mmio_ro_emulated_write,
     .validate   = pv_emul_is_mem_write,
     .cpuid      = pv_emul_cpuid,
@@ -5156,7 +4852,7 @@ int mmcfg_intercept_write(
 
 static const struct x86_emulate_ops mmcfg_intercept_ops = {
     .read       = x86emul_unhandleable_rw,
-    .insn_fetch = ptwr_emulated_read,
+    .insn_fetch = pv_emul_ptwr_read,
     .write      = mmcfg_intercept_write,
     .validate   = pv_emul_is_mem_write,
     .cpuid      = pv_emul_cpuid,
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index c83aed493b..cbd890c5f2 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -4,6 +4,7 @@ obj-y += emulate.o
 obj-y += emul-gate-op.o
 obj-y += emul-inv-op.o
 obj-y += emul-priv-op.o
+obj-y += emul-ptwr-op.o
 obj-y += hypercall.o
 obj-y += iret.o
 obj-y += misc-hypercalls.o
diff --git a/xen/arch/x86/pv/emul-ptwr-op.c b/xen/arch/x86/pv/emul-ptwr-op.c
new file mode 100644
index 0000000000..17d7ee8f41
--- /dev/null
+++ b/xen/arch/x86/pv/emul-ptwr-op.c
@@ -0,0 +1,327 @@
+/******************************************************************************
+ * arch/x86/pv/emul-ptwr-op.c
+ *
+ * Emulate writable pagetable for PV guests
+ *
+ * Copyright (c) 2002-2005 K A Fraser
+ * Copyright (c) 2004 Christian Limpach
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/guest_access.h>
+#include <xen/trace.h>
+
+#include <asm/pv/mm.h>
+
+#include "emulate.h"
+
+/*************************
+ * Writable Pagetables
+ */
+
+struct ptwr_emulate_ctxt {
+    struct x86_emulate_ctxt ctxt;
+    unsigned long cr2;
+    l1_pgentry_t  pte;
+};
+
+int pv_emul_ptwr_read(enum x86_segment seg, unsigned long offset, void *p_data,
+                      unsigned int bytes, struct x86_emulate_ctxt *ctxt)
+{
+    unsigned int rc = bytes;
+    unsigned long addr = offset;
+
+    if ( !__addr_ok(addr) ||
+         (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
+    {
+        x86_emul_pagefault(0, addr + bytes - rc, ctxt);  /* Read fault. */
+        return X86EMUL_EXCEPTION;
+    }
+
+    return X86EMUL_OKAY;
+}
+
+static int ptwr_emulated_update(unsigned long addr, paddr_t old, paddr_t val,
+                                unsigned int bytes, unsigned int do_cmpxchg,
+                                struct ptwr_emulate_ctxt *ptwr_ctxt)
+{
+    unsigned long mfn;
+    unsigned long unaligned_addr = addr;
+    struct page_info *page;
+    l1_pgentry_t pte, ol1e, nl1e, *pl1e;
+    struct vcpu *v = current;
+    struct domain *d = v->domain;
+    int ret;
+
+    /* Only allow naturally-aligned stores within the original %cr2 page. */
+    if ( unlikely(((addr ^ ptwr_ctxt->cr2) & PAGE_MASK) ||
+                  (addr & (bytes - 1))) )
+    {
+        gdprintk(XENLOG_WARNING, "bad access (cr2=%lx, addr=%lx, bytes=%u)\n",
+                 ptwr_ctxt->cr2, addr, bytes);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    /* Turn a sub-word access into a full-word access. */
+    if ( bytes != sizeof(paddr_t) )
+    {
+        paddr_t      full;
+        unsigned int rc, offset = addr & (sizeof(paddr_t) - 1);
+
+        /* Align address; read full word. */
+        addr &= ~(sizeof(paddr_t) - 1);
+        if ( (rc = copy_from_user(&full, (void *)addr, sizeof(paddr_t))) != 0 )
+        {
+            x86_emul_pagefault(0, /* Read fault. */
+                               addr + sizeof(paddr_t) - rc,
+                               &ptwr_ctxt->ctxt);
+            return X86EMUL_EXCEPTION;
+        }
+        /* Mask out bits provided by caller. */
+        full &= ~((((paddr_t)1 << (bytes * 8)) - 1) << (offset * 8));
+        /* Shift the caller value and OR in the missing bits. */
+        val  &= (((paddr_t)1 << (bytes * 8)) - 1);
+        val <<= (offset) * 8;
+        val  |= full;
+        /* Also fill in missing parts of the cmpxchg old value. */
+        old  &= (((paddr_t)1 << (bytes * 8)) - 1);
+        old <<= (offset) * 8;
+        old  |= full;
+    }
+
+    pte  = ptwr_ctxt->pte;
+    mfn  = l1e_get_pfn(pte);
+    page = mfn_to_page(mfn);
+
+    /* We are looking only for read-only mappings of p.t. pages. */
+    ASSERT((l1e_get_flags(pte) & (_PAGE_RW|_PAGE_PRESENT)) == _PAGE_PRESENT);
+    ASSERT(mfn_valid(_mfn(mfn)));
+    ASSERT((page->u.inuse.type_info & PGT_type_mask) == PGT_l1_page_table);
+    ASSERT((page->u.inuse.type_info & PGT_count_mask) != 0);
+    ASSERT(page_get_owner(page) == d);
+
+    /* Check the new PTE. */
+    nl1e = l1e_from_intpte(val);
+    switch ( ret = get_page_from_l1e(nl1e, d, d) )
+    {
+    default:
+        if ( is_pv_32bit_domain(d) && (bytes == 4) && (unaligned_addr & 4) &&
+             !do_cmpxchg && (l1e_get_flags(nl1e) & _PAGE_PRESENT) )
+        {
+            /*
+             * If this is an upper-half write to a PAE PTE then we assume that
+             * the guest has simply got the two writes the wrong way round. We
+             * zap the PRESENT bit on the assumption that the bottom half will
+             * be written immediately after we return to the guest.
+             */
+            gdprintk(XENLOG_DEBUG, "ptwr_emulate: fixing up invalid PAE PTE %"
+                     PRIpte"\n", l1e_get_intpte(nl1e));
+            l1e_remove_flags(nl1e, _PAGE_PRESENT);
+        }
+        else
+        {
+            gdprintk(XENLOG_WARNING, "could not get_page_from_l1e()\n");
+            return X86EMUL_UNHANDLEABLE;
+        }
+        break;
+    case 0:
+        break;
+    case _PAGE_RW ... _PAGE_RW | PAGE_CACHE_ATTRS:
+        ASSERT(!(ret & ~(_PAGE_RW | PAGE_CACHE_ATTRS)));
+        l1e_flip_flags(nl1e, ret);
+        break;
+    }
+
+    adjust_guest_l1e(nl1e, d);
+
+    /* Checked successfully: do the update (write or cmpxchg). */
+    pl1e = map_domain_page(_mfn(mfn));
+    pl1e = (l1_pgentry_t *)((unsigned long)pl1e + (addr & ~PAGE_MASK));
+    if ( do_cmpxchg )
+    {
+        bool okay;
+        intpte_t t = old;
+
+        ol1e = l1e_from_intpte(old);
+        okay = paging_cmpxchg_guest_entry(v, &l1e_get_intpte(*pl1e),
+                                          &t, l1e_get_intpte(nl1e), _mfn(mfn));
+        okay = (okay && t == old);
+
+        if ( !okay )
+        {
+            unmap_domain_page(pl1e);
+            put_page_from_l1e(nl1e, d);
+            return X86EMUL_RETRY;
+        }
+    }
+    else
+    {
+        ol1e = *pl1e;
+        if ( !UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, mfn, v, 0) )
+            BUG();
+    }
+
+    trace_ptwr_emulation(addr, nl1e);
+
+    unmap_domain_page(pl1e);
+
+    /* Finally, drop the old PTE. */
+    put_page_from_l1e(ol1e, d);
+
+    return X86EMUL_OKAY;
+}
+
+static int ptwr_emulated_write(enum x86_segment seg, unsigned long offset,
+                               void *p_data, unsigned int bytes,
+                               struct x86_emulate_ctxt *ctxt)
+{
+    paddr_t val = 0;
+
+    if ( (bytes > sizeof(paddr_t)) || (bytes & (bytes - 1)) || !bytes )
+    {
+        gdprintk(XENLOG_WARNING, "bad write size (addr=%lx, bytes=%u)\n",
+                 offset, bytes);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    memcpy(&val, p_data, bytes);
+
+    return ptwr_emulated_update(
+        offset, 0, val, bytes, 0,
+        container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
+}
+
+static int ptwr_emulated_cmpxchg(enum x86_segment seg, unsigned long offset,
+                                 void *p_old, void *p_new, unsigned int bytes,
+                                 struct x86_emulate_ctxt *ctxt)
+{
+    paddr_t old = 0, new = 0;
+
+    if ( (bytes > sizeof(paddr_t)) || (bytes & (bytes - 1)) )
+    {
+        gdprintk(XENLOG_WARNING, "bad cmpxchg size (addr=%lx, bytes=%u)\n",
+                 offset, bytes);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    memcpy(&old, p_old, bytes);
+    memcpy(&new, p_new, bytes);
+
+    return ptwr_emulated_update(
+        offset, old, new, bytes, 1,
+        container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
+}
+
+static const struct x86_emulate_ops ptwr_emulate_ops = {
+    .read       = pv_emul_ptwr_read,
+    .insn_fetch = pv_emul_ptwr_read,
+    .write      = ptwr_emulated_write,
+    .cmpxchg    = ptwr_emulated_cmpxchg,
+    .validate   = pv_emul_is_mem_write,
+    .cpuid      = pv_emul_cpuid,
+};
+
+/* Write page fault handler: check if guest is trying to modify a PTE. */
+int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
+                       struct cpu_user_regs *regs)
+{
+    struct domain *d = v->domain;
+    struct page_info *page;
+    l1_pgentry_t      pte;
+    struct ptwr_emulate_ctxt ptwr_ctxt = {
+        .ctxt = {
+            .regs = regs,
+            .vendor = d->arch.cpuid->x86_vendor,
+            .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
+            .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
+            .lma       = !is_pv_32bit_domain(d),
+        },
+    };
+    int rc;
+
+    /* Attempt to read the PTE that maps the VA being accessed. */
+    pv_get_guest_eff_l1e(addr, &pte);
+
+    /* We are looking only for read-only mappings of p.t. pages. */
+    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) ||
+         rangeset_contains_singleton(mmio_ro_ranges, l1e_get_pfn(pte)) ||
+         !get_page_from_mfn(_mfn(l1e_get_pfn(pte)), d) )
+        goto bail;
+
+    page = l1e_get_page(pte);
+    if ( !page_lock(page) )
+    {
+        put_page(page);
+        goto bail;
+    }
+
+    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(page);
+        put_page(page);
+        goto bail;
+    }
+
+    ptwr_ctxt.cr2 = addr;
+    ptwr_ctxt.pte = pte;
+
+    rc = x86_emulate(&ptwr_ctxt.ctxt, &ptwr_emulate_ops);
+
+    page_unlock(page);
+    put_page(page);
+
+    switch ( rc )
+    {
+    case X86EMUL_EXCEPTION:
+        /*
+         * This emulation only covers writes to pagetables which are marked
+         * read-only by Xen.  We tolerate #PF (in case a concurrent pagetable
+         * update has succeeded on a different vcpu).  Anything else is an
+         * emulation bug, or a guest playing with the instruction stream under
+         * Xen's feet.
+         */
+        if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+             ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
+            pv_inject_event(&ptwr_ctxt.ctxt.event);
+        else
+            gdprintk(XENLOG_WARNING,
+                     "Unexpected event (type %u, vector %#x) from emulation\n",
+                     ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
+
+        /* Fallthrough */
+    case X86EMUL_OKAY:
+
+        if ( ptwr_ctxt.ctxt.retire.singlestep )
+            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+        /* Fallthrough */
+    case X86EMUL_RETRY:
+        perfc_incr(ptwr_emulations);
+        return EXCRET_fault_fixed;
+    }
+
+ bail:
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/pv/emulate.h b/xen/arch/x86/pv/emulate.h
index 89abbe010f..7fb568adc0 100644
--- a/xen/arch/x86/pv/emulate.h
+++ b/xen/arch/x86/pv/emulate.h
@@ -10,4 +10,6 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip);
 int pv_emul_is_mem_write(const struct x86_emulate_state *state,
                          struct x86_emulate_ctxt *ctxt);
 
+int pv_emul_ptwr_read(enum x86_segment seg, unsigned long offset, void *p_data,
+                      unsigned int bytes, struct x86_emulate_ctxt *ctxt);
 #endif /* __PV_EMULATE_H__ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 12/31] x86/mm: split out readonly MMIO emulation code
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (10 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-24 15:16   ` Jan Beulich
  2017-08-17 14:44 ` [PATCH v4 13/31] x86/mm: remove the unused inclusion of pv/emulate.h Wei Liu
                   ` (18 subsequent siblings)
  30 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/emul-mmio-op.c. Fix coding style issues while
moving.

Note that mmio_ro_emulated_write is needed by both PV and HVM, so it
is left in x86/mm.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
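A note for reviewers: the size/alignment check in
mmcfg_intercept_write() packs three conditions into one expression. A
standalone sketch of just that predicate (illustrative only, not part
of the patch):

#include <stdio.h>

/*
 * Mirrors the check in mmcfg_intercept_write(): reject the access
 * unless bytes is a power of two no wider than 4 and offset is
 * bytes-aligned.  '(bytes | offset) & (bytes - 1)' is non-zero when
 * either bytes is not a power of two or offset is misaligned.
 */
static int bad_access(unsigned long offset, unsigned int bytes)
{
    return ((bytes | offset) & (bytes - 1)) || bytes > 4 || !bytes;
}

int main(void)
{
    printf("%d\n", bad_access(0x0, 4)); /* 0: aligned dword - allowed */
    printf("%d\n", bad_access(0x2, 4)); /* 1: misaligned dword        */
    printf("%d\n", bad_access(0x4, 3)); /* 1: not a power of two      */
    printf("%d\n", bad_access(0x8, 8)); /* 1: wider than 4 bytes      */
    return 0;
}
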
 xen/arch/x86/mm.c              | 129 --------------------------------
 xen/arch/x86/pv/Makefile       |   1 +
 xen/arch/x86/pv/emul-mmio-op.c | 166 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 167 insertions(+), 129 deletions(-)
 create mode 100644 xen/arch/x86/pv/emul-mmio-op.c

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 3c0aa52f38..a42720c8d1 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4785,11 +4785,6 @@ long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     return 0;
 }
 
-
-/*************************
- * fault handling for read-only MMIO pages
- */
-
 int mmio_ro_emulated_write(
     enum x86_segment seg,
     unsigned long offset,
@@ -4811,130 +4806,6 @@ int mmio_ro_emulated_write(
     return X86EMUL_OKAY;
 }
 
-static const struct x86_emulate_ops mmio_ro_emulate_ops = {
-    .read       = x86emul_unhandleable_rw,
-    .insn_fetch = pv_emul_ptwr_read,
-    .write      = mmio_ro_emulated_write,
-    .validate   = pv_emul_is_mem_write,
-    .cpuid      = pv_emul_cpuid,
-};
-
-int mmcfg_intercept_write(
-    enum x86_segment seg,
-    unsigned long offset,
-    void *p_data,
-    unsigned int bytes,
-    struct x86_emulate_ctxt *ctxt)
-{
-    struct mmio_ro_emulate_ctxt *mmio_ctxt = ctxt->data;
-
-    /*
-     * Only allow naturally-aligned stores no wider than 4 bytes to the
-     * original %cr2 address.
-     */
-    if ( ((bytes | offset) & (bytes - 1)) || bytes > 4 || !bytes ||
-         offset != mmio_ctxt->cr2 )
-    {
-        gdprintk(XENLOG_WARNING, "bad write (cr2=%lx, addr=%lx, bytes=%u)\n",
-                mmio_ctxt->cr2, offset, bytes);
-        return X86EMUL_UNHANDLEABLE;
-    }
-
-    offset &= 0xfff;
-    if ( pci_conf_write_intercept(mmio_ctxt->seg, mmio_ctxt->bdf,
-                                  offset, bytes, p_data) >= 0 )
-        pci_mmcfg_write(mmio_ctxt->seg, PCI_BUS(mmio_ctxt->bdf),
-                        PCI_DEVFN2(mmio_ctxt->bdf), offset, bytes,
-                        *(uint32_t *)p_data);
-
-    return X86EMUL_OKAY;
-}
-
-static const struct x86_emulate_ops mmcfg_intercept_ops = {
-    .read       = x86emul_unhandleable_rw,
-    .insn_fetch = pv_emul_ptwr_read,
-    .write      = mmcfg_intercept_write,
-    .validate   = pv_emul_is_mem_write,
-    .cpuid      = pv_emul_cpuid,
-};
-
-/* Check if guest is trying to modify a r/o MMIO page. */
-int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
-                          struct cpu_user_regs *regs)
-{
-    l1_pgentry_t pte;
-    unsigned long mfn;
-    unsigned int addr_size = is_pv_32bit_vcpu(v) ? 32 : BITS_PER_LONG;
-    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = addr };
-    struct x86_emulate_ctxt ctxt = {
-        .regs = regs,
-        .vendor = v->domain->arch.cpuid->x86_vendor,
-        .addr_size = addr_size,
-        .sp_size = addr_size,
-        .lma = !is_pv_32bit_vcpu(v),
-        .data = &mmio_ro_ctxt,
-    };
-    int rc;
-
-    /* Attempt to read the PTE that maps the VA being accessed. */
-    pv_get_guest_eff_l1e(addr, &pte);
-
-    /* We are looking only for read-only mappings of MMIO pages. */
-    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) )
-        return 0;
-
-    mfn = l1e_get_pfn(pte);
-    if ( mfn_valid(_mfn(mfn)) )
-    {
-        struct page_info *page = mfn_to_page(mfn);
-        struct domain *owner = page_get_owner_and_reference(page);
-
-        if ( owner )
-            put_page(page);
-        if ( owner != dom_io )
-            return 0;
-    }
-
-    if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
-        return 0;
-
-    if ( pci_ro_mmcfg_decode(mfn, &mmio_ro_ctxt.seg, &mmio_ro_ctxt.bdf) )
-        rc = x86_emulate(&ctxt, &mmcfg_intercept_ops);
-    else
-        rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
-
-    switch ( rc )
-    {
-    case X86EMUL_EXCEPTION:
-        /*
-         * This emulation only covers writes to MMCFG space or read-only MFNs.
-         * We tolerate #PF (from hitting an adjacent page or a successful
-         * concurrent pagetable update).  Anything else is an emulation bug,
-         * or a guest playing with the instruction stream under Xen's feet.
-         */
-        if ( ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
-             ctxt.event.vector == TRAP_page_fault )
-            pv_inject_event(&ctxt.event);
-        else
-            gdprintk(XENLOG_WARNING,
-                     "Unexpected event (type %u, vector %#x) from emulation\n",
-                     ctxt.event.type, ctxt.event.vector);
-
-        /* Fallthrough */
-    case X86EMUL_OKAY:
-
-        if ( ctxt.retire.singlestep )
-            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
-
-        /* Fallthrough */
-    case X86EMUL_RETRY:
-        perfc_incr(ptwr_emulations);
-        return EXCRET_fault_fixed;
-    }
-
-    return 0;
-}
-
 void *alloc_xen_pagetable(void)
 {
     if ( system_state != SYS_STATE_early_boot )
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index cbd890c5f2..016b1b6e8f 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -3,6 +3,7 @@ obj-y += domain.o
 obj-y += emulate.o
 obj-y += emul-gate-op.o
 obj-y += emul-inv-op.o
+obj-y += emul-mmio-op.o
 obj-y += emul-priv-op.o
 obj-y += emul-ptwr-op.o
 obj-y += hypercall.o
diff --git a/xen/arch/x86/pv/emul-mmio-op.c b/xen/arch/x86/pv/emul-mmio-op.c
new file mode 100644
index 0000000000..ee5c684777
--- /dev/null
+++ b/xen/arch/x86/pv/emul-mmio-op.c
@@ -0,0 +1,166 @@
+/******************************************************************************
+ * arch/x86/pv/emul-mmio-op.c
+ *
+ * Readonly MMIO emulation for PV guests
+ *
+ * Copyright (c) 2002-2005 K A Fraser
+ * Copyright (c) 2004 Christian Limpach
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/rangeset.h>
+#include <xen/sched.h>
+
+#include <asm/domain.h>
+#include <asm/mm.h>
+#include <asm/pci.h>
+#include <asm/pv/mm.h>
+
+#include "emulate.h"
+
+/*************************
+ * fault handling for read-only MMIO pages
+ */
+
+static const struct x86_emulate_ops mmio_ro_emulate_ops = {
+    .read       = x86emul_unhandleable_rw,
+    .insn_fetch = pv_emul_ptwr_read,
+    .write      = mmio_ro_emulated_write,
+    .validate   = pv_emul_is_mem_write,
+    .cpuid      = pv_emul_cpuid,
+};
+
+int mmcfg_intercept_write(enum x86_segment seg, unsigned long offset,
+                          void *p_data, unsigned int bytes,
+                          struct x86_emulate_ctxt *ctxt)
+{
+    struct mmio_ro_emulate_ctxt *mmio_ctxt = ctxt->data;
+
+    /*
+     * Only allow naturally-aligned stores no wider than 4 bytes to the
+     * original %cr2 address.
+     */
+    if ( ((bytes | offset) & (bytes - 1)) || bytes > 4 || !bytes ||
+         offset != mmio_ctxt->cr2 )
+    {
+        gdprintk(XENLOG_WARNING, "bad write (cr2=%lx, addr=%lx, bytes=%u)\n",
+                mmio_ctxt->cr2, offset, bytes);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    offset &= 0xfff;
+    if ( pci_conf_write_intercept(mmio_ctxt->seg, mmio_ctxt->bdf,
+                                  offset, bytes, p_data) >= 0 )
+        pci_mmcfg_write(mmio_ctxt->seg, PCI_BUS(mmio_ctxt->bdf),
+                        PCI_DEVFN2(mmio_ctxt->bdf), offset, bytes,
+                        *(uint32_t *)p_data);
+
+    return X86EMUL_OKAY;
+}
+
+static const struct x86_emulate_ops mmcfg_intercept_ops = {
+    .read       = x86emul_unhandleable_rw,
+    .insn_fetch = pv_emul_ptwr_read,
+    .write      = mmcfg_intercept_write,
+    .validate   = pv_emul_is_mem_write,
+    .cpuid      = pv_emul_cpuid,
+};
+
+/* Check if guest is trying to modify a r/o MMIO page. */
+int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
+                          struct cpu_user_regs *regs)
+{
+    l1_pgentry_t pte;
+    unsigned long mfn;
+    unsigned int addr_size = is_pv_32bit_vcpu(v) ? 32 : BITS_PER_LONG;
+    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = addr };
+    struct x86_emulate_ctxt ctxt = {
+        .regs = regs,
+        .vendor = v->domain->arch.cpuid->x86_vendor,
+        .addr_size = addr_size,
+        .sp_size = addr_size,
+        .lma = !is_pv_32bit_vcpu(v),
+        .data = &mmio_ro_ctxt,
+    };
+    int rc;
+
+    /* Attempt to read the PTE that maps the VA being accessed. */
+    pv_get_guest_eff_l1e(addr, &pte);
+
+    /* We are looking only for read-only mappings of MMIO pages. */
+    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) )
+        return 0;
+
+    mfn = l1e_get_pfn(pte);
+    if ( mfn_valid(_mfn(mfn)) )
+    {
+        struct page_info *page = mfn_to_page(mfn);
+        struct domain *owner = page_get_owner_and_reference(page);
+
+        if ( owner )
+            put_page(page);
+        if ( owner != dom_io )
+            return 0;
+    }
+
+    if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
+        return 0;
+
+    if ( pci_ro_mmcfg_decode(mfn, &mmio_ro_ctxt.seg, &mmio_ro_ctxt.bdf) )
+        rc = x86_emulate(&ctxt, &mmcfg_intercept_ops);
+    else
+        rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
+
+    switch ( rc )
+    {
+    case X86EMUL_EXCEPTION:
+        /*
+         * This emulation only covers writes to MMCFG space or read-only MFNs.
+         * We tolerate #PF (from hitting an adjacent page or a successful
+         * concurrent pagetable update).  Anything else is an emulation bug,
+         * or a guest playing with the instruction stream under Xen's feet.
+         */
+        if ( ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+             ctxt.event.vector == TRAP_page_fault )
+            pv_inject_event(&ctxt.event);
+        else
+            gdprintk(XENLOG_WARNING,
+                     "Unexpected event (type %u, vector %#x) from emulation\n",
+                     ctxt.event.type, ctxt.event.vector);
+
+        /* Fallthrough */
+    case X86EMUL_OKAY:
+
+        if ( ctxt.retire.singlestep )
+            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+        /* Fallthrough */
+    case X86EMUL_RETRY:
+        perfc_incr(ptwr_emulations);
+        return EXCRET_fault_fixed;
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 13/31] x86/mm: remove the unused inclusion of pv/emulate.h
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (11 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 12/31] x86/mm: split out readonly MMIO " Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 14/31] x86/mm: move and rename guest_{,un}map_l1e Wei Liu
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

All emulation code has been moved by now.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a42720c8d1..dd8fa43ef3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -127,8 +127,6 @@
 #include <asm/pv/grant_table.h>
 #include <asm/pv/mm.h>
 
-#include "pv/emulate.h"
-
 /* Mapping of the fixmap space needed early. */
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 14/31] x86/mm: move and rename guest_{,un}map_l1e
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (12 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 13/31] x86/mm: remove the unused inclusion of pv/emulate.h Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 15/31] x86/mm: split out PV grant table code Wei Liu
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move them to pv/mm.c and rename them to pv_{,un}map_guest_l1e.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
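A note for reviewers: the subtle part of pv_map_guest_l1e() is the flag
check on the L2 entry - the L1 table is only reachable when the L2e is
present and does not map a superpage. A standalone sketch of that
predicate (x86 flag values; illustrative only, not part of the patch):

#include <stdio.h>

#define _PAGE_PRESENT 0x001u
#define _PAGE_PSE     0x080u    /* superpage bit */

/*
 * Mirrors the check in pv_map_guest_l1e(): an L1 table can only be
 * mapped and indexed if the L2 entry is present AND not a superpage.
 */
static int l1_table_reachable(unsigned int l2e_flags)
{
    return (l2e_flags & (_PAGE_PRESENT | _PAGE_PSE)) == _PAGE_PRESENT;
}

int main(void)
{
    printf("%d\n", l1_table_reachable(_PAGE_PRESENT));             /* 1 */
    printf("%d\n", l1_table_reachable(_PAGE_PRESENT | _PAGE_PSE)); /* 0 */
    printf("%d\n", l1_table_reachable(0));                         /* 0 */
    return 0;
}
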
 xen/arch/x86/mm.c           | 63 +++++++++++----------------------------------
 xen/arch/x86/pv/mm.c        | 33 ++++++++++++++++++++++++
 xen/include/asm-x86/pv/mm.h |  9 +++++++
 3 files changed, 57 insertions(+), 48 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index dd8fa43ef3..c42dd5f8f5 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -512,39 +512,6 @@ void update_cr3(struct vcpu *v)
     make_cr3(v, cr3_mfn);
 }
 
-/* Get a mapping of a PV guest's l1e for this virtual address. */
-static l1_pgentry_t *guest_map_l1e(unsigned long addr, unsigned long *gl1mfn)
-{
-    l2_pgentry_t l2e;
-
-    ASSERT(!paging_mode_translate(current->domain));
-    ASSERT(!paging_mode_external(current->domain));
-
-    if ( unlikely(!__addr_ok(addr)) )
-        return NULL;
-
-    /* Find this l1e and its enclosing l1mfn in the linear map. */
-    if ( __copy_from_user(&l2e,
-                          &__linear_l2_table[l2_linear_offset(addr)],
-                          sizeof(l2_pgentry_t)) )
-        return NULL;
-
-    /* Check flags that it will be safe to read the l1e. */
-    if ( (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT )
-        return NULL;
-
-    *gl1mfn = l2e_get_pfn(l2e);
-
-    return (l1_pgentry_t *)map_domain_page(_mfn(*gl1mfn)) +
-           l1_table_offset(addr);
-}
-
-/* Pull down the mapping we got from guest_map_l1e(). */
-static inline void guest_unmap_l1e(void *p)
-{
-    unmap_domain_page(p);
-}
-
 static inline void page_set_tlbflush_timestamp(struct page_info *page)
 {
     /*
@@ -3786,7 +3753,7 @@ static int create_grant_va_mapping(
 
     adjust_guest_l1e(nl1e, d);
 
-    pl1e = guest_map_l1e(va, &gl1mfn);
+    pl1e = pv_map_guest_l1e(va, &gl1mfn);
     if ( !pl1e )
     {
         gdprintk(XENLOG_WARNING, "Could not find L1 PTE for address %lx\n", va);
@@ -3795,7 +3762,7 @@ static int create_grant_va_mapping(
 
     if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
     {
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
@@ -3803,7 +3770,7 @@ static int create_grant_va_mapping(
     if ( !page_lock(l1pg) )
     {
         put_page(l1pg);
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
@@ -3811,7 +3778,7 @@ static int create_grant_va_mapping(
     {
         page_unlock(l1pg);
         put_page(l1pg);
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
@@ -3820,7 +3787,7 @@ static int create_grant_va_mapping(
 
     page_unlock(l1pg);
     put_page(l1pg);
-    guest_unmap_l1e(pl1e);
+    pv_unmap_guest_l1e(pl1e);
 
     if ( okay )
         put_page_from_l1e(ol1e, d);
@@ -3836,7 +3803,7 @@ static int replace_grant_va_mapping(
     struct page_info *l1pg;
     int rc = 0;
 
-    pl1e = guest_map_l1e(addr, &gl1mfn);
+    pl1e = pv_map_guest_l1e(addr, &gl1mfn);
     if ( !pl1e )
     {
         gdprintk(XENLOG_WARNING, "Could not find L1 PTE for address %lx\n", addr);
@@ -3887,7 +3854,7 @@ static int replace_grant_va_mapping(
     page_unlock(l1pg);
     put_page(l1pg);
  out:
-    guest_unmap_l1e(pl1e);
+    pv_unmap_guest_l1e(pl1e);
     return rc;
 }
 
@@ -3945,7 +3912,7 @@ int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
     if ( !new_addr )
         return destroy_grant_va_mapping(addr, frame, curr);
 
-    pl1e = guest_map_l1e(new_addr, &gl1mfn);
+    pl1e = pv_map_guest_l1e(new_addr, &gl1mfn);
     if ( !pl1e )
     {
         gdprintk(XENLOG_WARNING,
@@ -3955,7 +3922,7 @@ int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
 
     if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
     {
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
@@ -3963,7 +3930,7 @@ int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
     if ( !page_lock(l1pg) )
     {
         put_page(l1pg);
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
@@ -3971,7 +3938,7 @@ int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
     {
         page_unlock(l1pg);
         put_page(l1pg);
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
@@ -3983,13 +3950,13 @@ int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
         page_unlock(l1pg);
         put_page(l1pg);
         gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", pl1e);
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
         return GNTST_general_error;
     }
 
     page_unlock(l1pg);
     put_page(l1pg);
-    guest_unmap_l1e(pl1e);
+    pv_unmap_guest_l1e(pl1e);
 
     rc = replace_grant_va_mapping(addr, frame, ol1e, curr);
     if ( rc )
@@ -4123,7 +4090,7 @@ static int __do_update_va_mapping(
         return rc;
 
     rc = -EINVAL;
-    pl1e = guest_map_l1e(va, &gl1mfn);
+    pl1e = pv_map_guest_l1e(va, &gl1mfn);
     if ( unlikely(!pl1e || !get_page_from_mfn(_mfn(gl1mfn), d)) )
         goto out;
 
@@ -4148,7 +4115,7 @@ static int __do_update_va_mapping(
 
  out:
     if ( pl1e )
-        guest_unmap_l1e(pl1e);
+        pv_unmap_guest_l1e(pl1e);
 
     switch ( flags & UVMF_FLUSHTYPE_MASK )
     {
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 2cb5995e62..32e73d59df 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -63,6 +63,39 @@ void pv_get_guest_eff_kern_l1e(struct vcpu *v, unsigned long addr,
         toggle_guest_mode(v);
 }
 
+/* Get a mapping of a PV guest's l1e for this virtual address. */
+l1_pgentry_t *pv_map_guest_l1e(unsigned long addr, unsigned long *gl1mfn)
+{
+    l2_pgentry_t l2e;
+
+    ASSERT(!paging_mode_translate(current->domain));
+    ASSERT(!paging_mode_external(current->domain));
+
+    if ( unlikely(!__addr_ok(addr)) )
+        return NULL;
+
+    /* Find this l1e and its enclosing l1mfn in the linear map. */
+    if ( __copy_from_user(&l2e,
+                          &__linear_l2_table[l2_linear_offset(addr)],
+                          sizeof(l2_pgentry_t)) )
+        return NULL;
+
+    /* Check flags that it will be safe to read the l1e. */
+    if ( (l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT )
+        return NULL;
+
+    *gl1mfn = l2e_get_pfn(l2e);
+
+    return (l1_pgentry_t *)map_domain_page(_mfn(*gl1mfn)) +
+           l1_table_offset(addr);
+}
+
+/* Pull down the mapping we got from pv_map_guest_l1e(). */
+void pv_unmap_guest_l1e(void *p)
+{
+    unmap_domain_page(p);
+}
+
 /*
  * How to write an entry to the guest pagetables.
  * Returns false for failure (pointer not valid), true for success.
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index b3887989b6..006156d0e1 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -86,6 +86,9 @@ bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
                      _t ## e_get_intpte(_o), _t ## e_get_intpte(_n),   \
                      (_m), (_v), (_ad))
 
+l1_pgentry_t *pv_map_guest_l1e(unsigned long addr, unsigned long *gl1mfn);
+void pv_unmap_guest_l1e(void *p);
+
 #else
 
 static inline void pv_get_guest_eff_l1e(unsigned long addr,
@@ -101,6 +104,12 @@ static inline bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
                                     int preserve_ad)
 { return false; }
 
+static inline l1_pgentry_t *pv_map_guest_l1e(unsigned long addr,
+                                             unsigned long *gl1mfn)
+{ return NULL; }
+
+static inline void pv_unmap_guest_l1e(void *p) {}
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 15/31] x86/mm: split out PV grant table code
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (13 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 14/31] x86/mm: move and rename guest_{,un}map_l1e Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 16/31] x86/mm: split out descriptor " Wei Liu
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/grant_table.c. Fix some coding style issues while
moving the code.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
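A note for reviewers: both PTE-based helpers split the guest-supplied
address into a frame number and an in-page offset before mapping the
frame. A minimal sketch of that arithmetic (PAGE_SHIFT = 12 as on x86;
illustrative only, not part of the patch):

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

int main(void)
{
    unsigned long pte_addr = 0x12345678UL;

    /* Frame holding the PTE, as in 'gmfn = pte_addr >> PAGE_SHIFT'. */
    unsigned long gmfn = pte_addr >> PAGE_SHIFT;
    /* Byte offset within that frame, added to the mapped page. */
    unsigned long off  = pte_addr & ~PAGE_MASK;

    printf("gmfn=%#lx off=%#lx\n", gmfn, off); /* gmfn=0x12345 off=0x678 */
    return 0;
}
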
 xen/arch/x86/mm.c             | 363 --------------------------------------
 xen/arch/x86/pv/Makefile      |   1 +
 xen/arch/x86/pv/grant_table.c | 398 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 399 insertions(+), 363 deletions(-)
 create mode 100644 xen/arch/x86/pv/grant_table.c

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c42dd5f8f5..63549b987c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3602,369 +3602,6 @@ long do_mmu_update(
     return rc;
 }
 
-
-static int create_grant_pte_mapping(
-    uint64_t pte_addr, l1_pgentry_t nl1e, struct vcpu *v)
-{
-    int rc = GNTST_okay;
-    void *va;
-    unsigned long gmfn, mfn;
-    struct page_info *page;
-    l1_pgentry_t ol1e;
-    struct domain *d = v->domain;
-
-    if ( !IS_ALIGNED(pte_addr, sizeof(nl1e)) )
-        return GNTST_general_error;
-
-    adjust_guest_l1e(nl1e, d);
-
-    gmfn = pte_addr >> PAGE_SHIFT;
-    page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
-
-    if ( unlikely(!page) )
-    {
-        gdprintk(XENLOG_WARNING, "Could not get page for normal update\n");
-        return GNTST_general_error;
-    }
-
-    mfn = page_to_mfn(page);
-    va = map_domain_page(_mfn(mfn));
-    va = (void *)((unsigned long)va + ((unsigned long)pte_addr & ~PAGE_MASK));
-
-    if ( !page_lock(page) )
-    {
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(page);
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    ol1e = *(l1_pgentry_t *)va;
-    if ( !UPDATE_ENTRY(l1, (l1_pgentry_t *)va, ol1e, nl1e, mfn, v, 0) )
-    {
-        page_unlock(page);
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    page_unlock(page);
-
-    put_page_from_l1e(ol1e, d);
-
- failed:
-    unmap_domain_page(va);
-    put_page(page);
-
-    return rc;
-}
-
-static int destroy_grant_pte_mapping(
-    uint64_t addr, unsigned long frame, struct domain *d)
-{
-    int rc = GNTST_okay;
-    void *va;
-    unsigned long gmfn, mfn;
-    struct page_info *page;
-    l1_pgentry_t ol1e;
-
-    /*
-     * addr comes from Xen's active_entry tracking so isn't guest controlled,
-     * but it had still better be PTE-aligned.
-     */
-    if ( !IS_ALIGNED(addr, sizeof(ol1e)) )
-    {
-        ASSERT_UNREACHABLE();
-        return GNTST_general_error;
-    }
-
-    gmfn = addr >> PAGE_SHIFT;
-    page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
-
-    if ( unlikely(!page) )
-    {
-        gdprintk(XENLOG_WARNING, "Could not get page for normal update\n");
-        return GNTST_general_error;
-    }
-
-    mfn = page_to_mfn(page);
-    va = map_domain_page(_mfn(mfn));
-    va = (void *)((unsigned long)va + ((unsigned long)addr & ~PAGE_MASK));
-
-    if ( !page_lock(page) )
-    {
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(page);
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    ol1e = *(l1_pgentry_t *)va;
-
-    /* Check that the virtual address supplied is actually mapped to frame. */
-    if ( unlikely(l1e_get_pfn(ol1e) != frame) )
-    {
-        page_unlock(page);
-        gdprintk(XENLOG_WARNING,
-                 "PTE entry %"PRIpte" for address %"PRIx64" doesn't match frame %lx\n",
-                 l1e_get_intpte(ol1e), addr, frame);
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    /* Delete pagetable entry. */
-    if ( unlikely(!UPDATE_ENTRY(l1,
-                                (l1_pgentry_t *)va, ol1e, l1e_empty(), mfn,
-                                d->vcpu[0] /* Change if we go to per-vcpu shadows. */,
-                                0)) )
-    {
-        page_unlock(page);
-        gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", va);
-        rc = GNTST_general_error;
-        goto failed;
-    }
-
-    page_unlock(page);
-
- failed:
-    unmap_domain_page(va);
-    put_page(page);
-    return rc;
-}
-
-
-static int create_grant_va_mapping(
-    unsigned long va, l1_pgentry_t nl1e, struct vcpu *v)
-{
-    l1_pgentry_t *pl1e, ol1e;
-    struct domain *d = v->domain;
-    unsigned long gl1mfn;
-    struct page_info *l1pg;
-    int okay;
-
-    adjust_guest_l1e(nl1e, d);
-
-    pl1e = pv_map_guest_l1e(va, &gl1mfn);
-    if ( !pl1e )
-    {
-        gdprintk(XENLOG_WARNING, "Could not find L1 PTE for address %lx\n", va);
-        return GNTST_general_error;
-    }
-
-    if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
-    {
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    l1pg = mfn_to_page(gl1mfn);
-    if ( !page_lock(l1pg) )
-    {
-        put_page(l1pg);
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    if ( (l1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(l1pg);
-        put_page(l1pg);
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    ol1e = *pl1e;
-    okay = UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v, 0);
-
-    page_unlock(l1pg);
-    put_page(l1pg);
-    pv_unmap_guest_l1e(pl1e);
-
-    if ( okay )
-        put_page_from_l1e(ol1e, d);
-
-    return okay ? GNTST_okay : GNTST_general_error;
-}
-
-static int replace_grant_va_mapping(
-    unsigned long addr, unsigned long frame, l1_pgentry_t nl1e, struct vcpu *v)
-{
-    l1_pgentry_t *pl1e, ol1e;
-    unsigned long gl1mfn;
-    struct page_info *l1pg;
-    int rc = 0;
-
-    pl1e = pv_map_guest_l1e(addr, &gl1mfn);
-    if ( !pl1e )
-    {
-        gdprintk(XENLOG_WARNING, "Could not find L1 PTE for address %lx\n", addr);
-        return GNTST_general_error;
-    }
-
-    if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
-    {
-        rc = GNTST_general_error;
-        goto out;
-    }
-
-    l1pg = mfn_to_page(gl1mfn);
-    if ( !page_lock(l1pg) )
-    {
-        rc = GNTST_general_error;
-        put_page(l1pg);
-        goto out;
-    }
-
-    if ( (l1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        rc = GNTST_general_error;
-        goto unlock_and_out;
-    }
-
-    ol1e = *pl1e;
-
-    /* Check that the virtual address supplied is actually mapped to frame. */
-    if ( unlikely(l1e_get_pfn(ol1e) != frame) )
-    {
-        gdprintk(XENLOG_WARNING,
-                 "PTE entry %lx for address %lx doesn't match frame %lx\n",
-                 l1e_get_pfn(ol1e), addr, frame);
-        rc = GNTST_general_error;
-        goto unlock_and_out;
-    }
-
-    /* Delete pagetable entry. */
-    if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v, 0)) )
-    {
-        gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", pl1e);
-        rc = GNTST_general_error;
-        goto unlock_and_out;
-    }
-
- unlock_and_out:
-    page_unlock(l1pg);
-    put_page(l1pg);
- out:
-    pv_unmap_guest_l1e(pl1e);
-    return rc;
-}
-
-static int destroy_grant_va_mapping(
-    unsigned long addr, unsigned long frame, struct vcpu *v)
-{
-    return replace_grant_va_mapping(addr, frame, l1e_empty(), v);
-}
-
-int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
-                            unsigned int flags, unsigned int cache_flags)
-{
-    l1_pgentry_t pte;
-    uint32_t grant_pte_flags;
-
-    grant_pte_flags =
-        _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_GNTTAB;
-    if ( cpu_has_nx )
-        grant_pte_flags |= _PAGE_NX_BIT;
-
-    pte = l1e_from_pfn(frame, grant_pte_flags);
-    if ( (flags & GNTMAP_application_map) )
-        l1e_add_flags(pte,_PAGE_USER);
-    if ( !(flags & GNTMAP_readonly) )
-        l1e_add_flags(pte,_PAGE_RW);
-
-    l1e_add_flags(pte,
-                  ((flags >> _GNTMAP_guest_avail0) * _PAGE_AVAIL0)
-                   & _PAGE_AVAIL);
-
-    l1e_add_flags(pte, cacheattr_to_pte_flags(cache_flags >> 5));
-
-    if ( flags & GNTMAP_contains_pte )
-        return create_grant_pte_mapping(addr, pte, current);
-    return create_grant_va_mapping(addr, pte, current);
-}
-
-int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
-                             uint64_t new_addr, unsigned int flags)
-{
-    struct vcpu *curr = current;
-    l1_pgentry_t *pl1e, ol1e;
-    unsigned long gl1mfn;
-    struct page_info *l1pg;
-    int rc;
-
-    if ( flags & GNTMAP_contains_pte )
-    {
-        if ( !new_addr )
-            return destroy_grant_pte_mapping(addr, frame, curr->domain);
-
-        return GNTST_general_error;
-    }
-
-    if ( !new_addr )
-        return destroy_grant_va_mapping(addr, frame, curr);
-
-    pl1e = pv_map_guest_l1e(new_addr, &gl1mfn);
-    if ( !pl1e )
-    {
-        gdprintk(XENLOG_WARNING,
-                 "Could not find L1 PTE for address %"PRIx64"\n", new_addr);
-        return GNTST_general_error;
-    }
-
-    if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
-    {
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    l1pg = mfn_to_page(gl1mfn);
-    if ( !page_lock(l1pg) )
-    {
-        put_page(l1pg);
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    if ( (l1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(l1pg);
-        put_page(l1pg);
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    ol1e = *pl1e;
-
-    if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(),
-                                gl1mfn, curr, 0)) )
-    {
-        page_unlock(l1pg);
-        put_page(l1pg);
-        gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", pl1e);
-        pv_unmap_guest_l1e(pl1e);
-        return GNTST_general_error;
-    }
-
-    page_unlock(l1pg);
-    put_page(l1pg);
-    pv_unmap_guest_l1e(pl1e);
-
-    rc = replace_grant_va_mapping(addr, frame, ol1e, curr);
-    if ( rc )
-        put_page_from_l1e(ol1e, curr->domain);
-
-    return rc;
-}
-
 int donate_page(
     struct domain *d, struct page_info *page, unsigned int memflags)
 {
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 016b1b6e8f..501c766cc2 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -6,6 +6,7 @@ obj-y += emul-inv-op.o
 obj-y += emul-mmio-op.o
 obj-y += emul-priv-op.o
 obj-y += emul-ptwr-op.o
+obj-y += grant_table.o
 obj-y += hypercall.o
 obj-y += iret.o
 obj-y += misc-hypercalls.o
diff --git a/xen/arch/x86/pv/grant_table.c b/xen/arch/x86/pv/grant_table.c
new file mode 100644
index 0000000000..35d045b90d
--- /dev/null
+++ b/xen/arch/x86/pv/grant_table.c
@@ -0,0 +1,398 @@
+/******************************************************************************
+ * arch/x86/pv/grant_table.c
+ *
+ * Grant table interfaces for PV guests
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/types.h>
+
+#include <public/grant_table.h>
+
+#include <asm/p2m.h>
+#include <asm/pv/mm.h>
+
+static int create_grant_pte_mapping(uint64_t pte_addr, l1_pgentry_t nl1e,
+                                    struct vcpu *v)
+{
+    int rc = GNTST_okay;
+    void *va;
+    unsigned long gmfn, mfn;
+    struct page_info *page;
+    l1_pgentry_t ol1e;
+    struct domain *d = v->domain;
+
+    if ( !IS_ALIGNED(pte_addr, sizeof(nl1e)) )
+        return GNTST_general_error;
+
+    adjust_guest_l1e(nl1e, d);
+
+    gmfn = pte_addr >> PAGE_SHIFT;
+    page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
+
+    if ( unlikely(!page) )
+    {
+        gdprintk(XENLOG_WARNING, "Could not get page for normal update\n");
+        return GNTST_general_error;
+    }
+
+    mfn = page_to_mfn(page);
+    va = map_domain_page(_mfn(mfn));
+    va = (void *)((unsigned long)va + ((unsigned long)pte_addr & ~PAGE_MASK));
+
+    if ( !page_lock(page) )
+    {
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(page);
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    ol1e = *(l1_pgentry_t *)va;
+    if ( !UPDATE_ENTRY(l1, (l1_pgentry_t *)va, ol1e, nl1e, mfn, v, 0) )
+    {
+        page_unlock(page);
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    page_unlock(page);
+
+    put_page_from_l1e(ol1e, d);
+
+ failed:
+    unmap_domain_page(va);
+    put_page(page);
+
+    return rc;
+}
+
+static int destroy_grant_pte_mapping(uint64_t addr, unsigned long frame,
+                                     struct domain *d)
+{
+    int rc = GNTST_okay;
+    void *va;
+    unsigned long gmfn, mfn;
+    struct page_info *page;
+    l1_pgentry_t ol1e;
+
+    /*
+     * addr comes from Xen's active_entry tracking so isn't guest controlled,
+     * but it had still better be PTE-aligned.
+     */
+    if ( !IS_ALIGNED(addr, sizeof(ol1e)) )
+    {
+        ASSERT_UNREACHABLE();
+        return GNTST_general_error;
+    }
+
+    gmfn = addr >> PAGE_SHIFT;
+    page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
+
+    if ( unlikely(!page) )
+    {
+        gdprintk(XENLOG_WARNING, "Could not get page for normal update\n");
+        return GNTST_general_error;
+    }
+
+    mfn = page_to_mfn(page);
+    va = map_domain_page(_mfn(mfn));
+    va = (void *)((unsigned long)va + ((unsigned long)addr & ~PAGE_MASK));
+
+    if ( !page_lock(page) )
+    {
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(page);
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    ol1e = *(l1_pgentry_t *)va;
+
+    /* Check that the virtual address supplied is actually mapped to frame. */
+    if ( unlikely(l1e_get_pfn(ol1e) != frame) )
+    {
+        page_unlock(page);
+        gdprintk(XENLOG_WARNING,
+                 "PTE entry %"PRIpte" for address %"PRIx64" doesn't match frame %lx\n",
+                 l1e_get_intpte(ol1e), addr, frame);
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    /* Delete pagetable entry. */
+    if ( unlikely(!UPDATE_ENTRY(l1,
+                                (l1_pgentry_t *)va, ol1e, l1e_empty(), mfn,
+                                d->vcpu[0] /* Change if we go to per-vcpu shadows. */,
+                                0)) )
+    {
+        page_unlock(page);
+        gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", va);
+        rc = GNTST_general_error;
+        goto failed;
+    }
+
+    page_unlock(page);
+
+ failed:
+    unmap_domain_page(va);
+    put_page(page);
+    return rc;
+}
+
+static int create_grant_va_mapping(unsigned long va, l1_pgentry_t nl1e,
+                                   struct vcpu *v)
+{
+    l1_pgentry_t *pl1e, ol1e;
+    struct domain *d = v->domain;
+    unsigned long gl1mfn;
+    struct page_info *l1pg;
+    int okay;
+
+    adjust_guest_l1e(nl1e, d);
+
+    pl1e = pv_map_guest_l1e(va, &gl1mfn);
+    if ( !pl1e )
+    {
+        gdprintk(XENLOG_WARNING, "Could not find L1 PTE for address %lx\n", va);
+        return GNTST_general_error;
+    }
+
+    if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
+    {
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    l1pg = mfn_to_page(gl1mfn);
+    if ( !page_lock(l1pg) )
+    {
+        put_page(l1pg);
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    if ( (l1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(l1pg);
+        put_page(l1pg);
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    ol1e = *pl1e;
+    okay = UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v, 0);
+
+    page_unlock(l1pg);
+    put_page(l1pg);
+    pv_unmap_guest_l1e(pl1e);
+
+    if ( okay )
+        put_page_from_l1e(ol1e, d);
+
+    return okay ? GNTST_okay : GNTST_general_error;
+}
+
+static int replace_grant_va_mapping(unsigned long addr, unsigned long frame,
+                                    l1_pgentry_t nl1e, struct vcpu *v)
+{
+    l1_pgentry_t *pl1e, ol1e;
+    unsigned long gl1mfn;
+    struct page_info *l1pg;
+    int rc = 0;
+
+    pl1e = pv_map_guest_l1e(addr, &gl1mfn);
+    if ( !pl1e )
+    {
+        gdprintk(XENLOG_WARNING, "Could not find L1 PTE for address %lx\n", addr);
+        return GNTST_general_error;
+    }
+
+    if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
+    {
+        rc = GNTST_general_error;
+        goto out;
+    }
+
+    l1pg = mfn_to_page(gl1mfn);
+    if ( !page_lock(l1pg) )
+    {
+        rc = GNTST_general_error;
+        put_page(l1pg);
+        goto out;
+    }
+
+    if ( (l1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        rc = GNTST_general_error;
+        goto unlock_and_out;
+    }
+
+    ol1e = *pl1e;
+
+    /* Check that the virtual address supplied is actually mapped to frame. */
+    if ( unlikely(l1e_get_pfn(ol1e) != frame) )
+    {
+        gdprintk(XENLOG_WARNING,
+                 "PTE entry %lx for address %lx doesn't match frame %lx\n",
+                 l1e_get_pfn(ol1e), addr, frame);
+        rc = GNTST_general_error;
+        goto unlock_and_out;
+    }
+
+    /* Delete pagetable entry. */
+    if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v, 0)) )
+    {
+        gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", pl1e);
+        rc = GNTST_general_error;
+        goto unlock_and_out;
+    }
+
+ unlock_and_out:
+    page_unlock(l1pg);
+    put_page(l1pg);
+ out:
+    pv_unmap_guest_l1e(pl1e);
+    return rc;
+}
+
+static int destroy_grant_va_mapping(unsigned long addr, unsigned long frame,
+                                    struct vcpu *v)
+{
+    return replace_grant_va_mapping(addr, frame, l1e_empty(), v);
+}
+
+int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                            unsigned int flags, unsigned int cache_flags)
+{
+    l1_pgentry_t pte;
+    uint32_t grant_pte_flags;
+
+    grant_pte_flags =
+        _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_GNTTAB;
+    if ( cpu_has_nx )
+        grant_pte_flags |= _PAGE_NX_BIT;
+
+    pte = l1e_from_pfn(frame, grant_pte_flags);
+    if ( flags & GNTMAP_application_map )
+        l1e_add_flags(pte, _PAGE_USER);
+    if ( !(flags & GNTMAP_readonly) )
+        l1e_add_flags(pte, _PAGE_RW);
+
+    l1e_add_flags(pte,
+                  ((flags >> _GNTMAP_guest_avail0) * _PAGE_AVAIL0)
+                   & _PAGE_AVAIL);
+
+    l1e_add_flags(pte, cacheattr_to_pte_flags(cache_flags >> 5));
+
+    if ( flags & GNTMAP_contains_pte )
+        return create_grant_pte_mapping(addr, pte, current);
+    return create_grant_va_mapping(addr, pte, current);
+}
+
+int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
+                             uint64_t new_addr, unsigned int flags)
+{
+    struct vcpu *curr = current;
+    l1_pgentry_t *pl1e, ol1e;
+    unsigned long gl1mfn;
+    struct page_info *l1pg;
+    int rc;
+
+    if ( flags & GNTMAP_contains_pte )
+    {
+        if ( !new_addr )
+            return destroy_grant_pte_mapping(addr, frame, curr->domain);
+
+        return GNTST_general_error;
+    }
+
+    if ( !new_addr )
+        return destroy_grant_va_mapping(addr, frame, curr);
+
+    pl1e = pv_map_guest_l1e(new_addr, &gl1mfn);
+    if ( !pl1e )
+    {
+        gdprintk(XENLOG_WARNING,
+                 "Could not find L1 PTE for address %"PRIx64"\n", new_addr);
+        return GNTST_general_error;
+    }
+
+    if ( !get_page_from_mfn(_mfn(gl1mfn), current->domain) )
+    {
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    l1pg = mfn_to_page(gl1mfn);
+    if ( !page_lock(l1pg) )
+    {
+        put_page(l1pg);
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    if ( (l1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(l1pg);
+        put_page(l1pg);
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    ol1e = *pl1e;
+
+    if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, l1e_empty(),
+                                gl1mfn, curr, 0)) )
+    {
+        page_unlock(l1pg);
+        put_page(l1pg);
+        gdprintk(XENLOG_WARNING, "Cannot delete PTE entry at %p\n", pl1e);
+        pv_unmap_guest_l1e(pl1e);
+        return GNTST_general_error;
+    }
+
+    page_unlock(l1pg);
+    put_page(l1pg);
+    pv_unmap_guest_l1e(pl1e);
+
+    rc = replace_grant_va_mapping(addr, frame, ol1e, curr);
+    if ( rc )
+        put_page_from_l1e(ol1e, curr->domain);
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0



* [PATCH v4 16/31] x86/mm: split out descriptor table code
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (14 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 15/31] x86/mm: split out PV grant table code Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 17/31] x86/mm: move compat descriptor handling code Wei Liu
                   ` (14 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/descriptor-tables.c. Add the "pv_" prefix to
{set,destroy}_gdt. Fix up call sites. Move the declarations to a new
header file. Fix coding style issues while moving the code.
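
For reference, the failure path in pv_set_gdt() uses the count-back
unwind idiom: release exactly the references taken before the failing
iteration, and nothing more. A minimal standalone sketch of that idiom
(every name here is invented for illustration; malloc stands in for
taking a page reference):

    #include <stdio.h>
    #include <stdlib.h>

    #define N 4
    static void *res[N];

    static int acquire_all(void)
    {
        unsigned int i;

        for ( i = 0; i < N; i++ )
        {
            res[i] = (i == 2) ? NULL : malloc(16); /* simulate failure at i == 2 */
            if ( !res[i] )
                goto fail;
        }
        return 0;

     fail:
        while ( i-- > 0 )   /* frees res[1], then res[0]; never touches res[2..] */
            free(res[i]);
        return -1;
    }

    int main(void)
    {
        printf("acquire_all() = %d\n", acquire_all());
        return 0;
    }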

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/domain.c               |  11 ++-
 xen/arch/x86/mm.c                   | 156 ------------------------------
 xen/arch/x86/pv/Makefile            |   1 +
 xen/arch/x86/pv/descriptor-tables.c | 188 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/mm.c     |   6 +-
 xen/include/asm-x86/processor.h     |   5 -
 xen/include/asm-x86/pv/processor.h  |  40 ++++++++
 7 files changed, 239 insertions(+), 168 deletions(-)
 create mode 100644 xen/arch/x86/pv/descriptor-tables.c
 create mode 100644 xen/include/asm-x86/pv/processor.h

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index baaf8151d2..9a25c04f6c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -64,6 +64,7 @@
 #include <compat/vcpu.h>
 #include <asm/psr.h>
 #include <asm/pv/domain.h>
+#include <asm/pv/processor.h>
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
 
@@ -986,7 +987,7 @@ int arch_set_info_guest(
         return rc;
 
     if ( !compat )
-        rc = (int)set_gdt(v, c.nat->gdt_frames, c.nat->gdt_ents);
+        rc = (int)pv_set_gdt(v, c.nat->gdt_frames, c.nat->gdt_ents);
     else
     {
         unsigned long gdt_frames[ARRAY_SIZE(v->arch.pv_vcpu.gdt_frames)];
@@ -996,7 +997,7 @@ int arch_set_info_guest(
             return -EINVAL;
         for ( i = 0; i < n; ++i )
             gdt_frames[i] = c.cmp->gdt_frames[i];
-        rc = (int)set_gdt(v, gdt_frames, c.cmp->gdt_ents);
+        rc = (int)pv_set_gdt(v, gdt_frames, c.cmp->gdt_ents);
     }
     if ( rc != 0 )
         return rc;
@@ -1095,7 +1096,7 @@ int arch_set_info_guest(
     {
         if ( cr3_page )
             put_page(cr3_page);
-        destroy_gdt(v);
+        pv_destroy_gdt(v);
         return rc;
     }
 
@@ -1147,7 +1148,7 @@ int arch_vcpu_reset(struct vcpu *v)
 {
     if ( is_pv_vcpu(v) )
     {
-        destroy_gdt(v);
+        pv_destroy_gdt(v);
         return vcpu_destroy_pagetables(v);
     }
 
@@ -1890,7 +1891,7 @@ int domain_relinquish_resources(struct domain *d)
                  * the LDT as it automatically gets squashed with the guest
                  * mappings.
                  */
-                destroy_gdt(v);
+                pv_destroy_gdt(v);
             }
         }
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 63549b987c..6cbcdabcd2 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3824,162 +3824,6 @@ long do_update_va_mapping_otherdomain(unsigned long va, u64 val64,
 }
 
 
-
-/*************************
- * Descriptor Tables
- */
-
-void destroy_gdt(struct vcpu *v)
-{
-    l1_pgentry_t *pl1e;
-    unsigned int i;
-    unsigned long pfn, zero_pfn = PFN_DOWN(__pa(zero_page));
-
-    v->arch.pv_vcpu.gdt_ents = 0;
-    pl1e = gdt_ldt_ptes(v->domain, v);
-    for ( i = 0; i < FIRST_RESERVED_GDT_PAGE; i++ )
-    {
-        pfn = l1e_get_pfn(pl1e[i]);
-        if ( (l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) && pfn != zero_pfn )
-            put_page_and_type(mfn_to_page(pfn));
-        l1e_write(&pl1e[i], l1e_from_pfn(zero_pfn, __PAGE_HYPERVISOR_RO));
-        v->arch.pv_vcpu.gdt_frames[i] = 0;
-    }
-}
-
-
-long set_gdt(struct vcpu *v,
-             unsigned long *frames,
-             unsigned int entries)
-{
-    struct domain *d = v->domain;
-    l1_pgentry_t *pl1e;
-    /* NB. There are 512 8-byte entries per GDT page. */
-    unsigned int i, nr_pages = (entries + 511) / 512;
-
-    if ( entries > FIRST_RESERVED_GDT_ENTRY )
-        return -EINVAL;
-
-    /* Check the pages in the new GDT. */
-    for ( i = 0; i < nr_pages; i++ )
-    {
-        struct page_info *page;
-
-        page = get_page_from_gfn(d, frames[i], NULL, P2M_ALLOC);
-        if ( !page )
-            goto fail;
-        if ( !get_page_type(page, PGT_seg_desc_page) )
-        {
-            put_page(page);
-            goto fail;
-        }
-        frames[i] = page_to_mfn(page);
-    }
-
-    /* Tear down the old GDT. */
-    destroy_gdt(v);
-
-    /* Install the new GDT. */
-    v->arch.pv_vcpu.gdt_ents = entries;
-    pl1e = gdt_ldt_ptes(d, v);
-    for ( i = 0; i < nr_pages; i++ )
-    {
-        v->arch.pv_vcpu.gdt_frames[i] = frames[i];
-        l1e_write(&pl1e[i], l1e_from_pfn(frames[i], __PAGE_HYPERVISOR_RW));
-    }
-
-    return 0;
-
- fail:
-    while ( i-- > 0 )
-    {
-        put_page_and_type(mfn_to_page(frames[i]));
-    }
-    return -EINVAL;
-}
-
-
-long do_set_gdt(XEN_GUEST_HANDLE_PARAM(xen_ulong_t) frame_list,
-                unsigned int entries)
-{
-    int nr_pages = (entries + 511) / 512;
-    unsigned long frames[16];
-    struct vcpu *curr = current;
-    long ret;
-
-    /* Rechecked in set_gdt, but ensures a sane limit for copy_from_user(). */
-    if ( entries > FIRST_RESERVED_GDT_ENTRY )
-        return -EINVAL;
-
-    if ( copy_from_guest(frames, frame_list, nr_pages) )
-        return -EFAULT;
-
-    domain_lock(curr->domain);
-
-    if ( (ret = set_gdt(curr, frames, entries)) == 0 )
-        flush_tlb_local();
-
-    domain_unlock(curr->domain);
-
-    return ret;
-}
-
-
-long do_update_descriptor(u64 pa, u64 desc)
-{
-    struct domain *dom = current->domain;
-    unsigned long gmfn = pa >> PAGE_SHIFT;
-    unsigned long mfn;
-    unsigned int  offset;
-    struct desc_struct *gdt_pent, d;
-    struct page_info *page;
-    long ret = -EINVAL;
-
-    offset = ((unsigned int)pa & ~PAGE_MASK) / sizeof(struct desc_struct);
-
-    *(u64 *)&d = desc;
-
-    page = get_page_from_gfn(dom, gmfn, NULL, P2M_ALLOC);
-    if ( (((unsigned int)pa % sizeof(struct desc_struct)) != 0) ||
-         !page ||
-         !check_descriptor(dom, &d) )
-    {
-        if ( page )
-            put_page(page);
-        return -EINVAL;
-    }
-    mfn = page_to_mfn(page);
-
-    /* Check if the given frame is in use in an unsafe context. */
-    switch ( page->u.inuse.type_info & PGT_type_mask )
-    {
-    case PGT_seg_desc_page:
-        if ( unlikely(!get_page_type(page, PGT_seg_desc_page)) )
-            goto out;
-        break;
-    default:
-        if ( unlikely(!get_page_type(page, PGT_writable_page)) )
-            goto out;
-        break;
-    }
-
-    paging_mark_dirty(dom, _mfn(mfn));
-
-    /* All is good so make the update. */
-    gdt_pent = map_domain_page(_mfn(mfn));
-    write_atomic((uint64_t *)&gdt_pent[offset], *(uint64_t *)&d);
-    unmap_domain_page(gdt_pent);
-
-    put_page_type(page);
-
-    ret = 0; /* success */
-
- out:
-    put_page(page);
-
-    return ret;
-}
-
 typedef struct e820entry e820entry_t;
 DEFINE_XEN_GUEST_HANDLE(e820entry_t);
 
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 501c766cc2..42e9d3723b 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -1,4 +1,5 @@
 obj-y += callback.o
+obj-y += descriptor-tables.o
 obj-y += domain.o
 obj-y += emulate.o
 obj-y += emul-gate-op.o
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
new file mode 100644
index 0000000000..12dc45b671
--- /dev/null
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -0,0 +1,188 @@
+/******************************************************************************
+ * arch/x86/pv/descriptor-tables.c
+ *
+ * Descriptor table related code
+ *
+ * Copyright (c) 2002-2005 K A Fraser
+ * Copyright (c) 2004 Christian Limpach
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+
+#include <asm/p2m.h>
+#include <asm/pv/processor.h>
+
+/*************************
+ * Descriptor Tables
+ */
+
+void pv_destroy_gdt(struct vcpu *v)
+{
+    l1_pgentry_t *pl1e;
+    unsigned int i;
+    unsigned long pfn, zero_pfn = PFN_DOWN(__pa(zero_page));
+
+    v->arch.pv_vcpu.gdt_ents = 0;
+    pl1e = gdt_ldt_ptes(v->domain, v);
+    for ( i = 0; i < FIRST_RESERVED_GDT_PAGE; i++ )
+    {
+        pfn = l1e_get_pfn(pl1e[i]);
+        if ( (l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) && pfn != zero_pfn )
+            put_page_and_type(mfn_to_page(pfn));
+        l1e_write(&pl1e[i], l1e_from_pfn(zero_pfn, __PAGE_HYPERVISOR_RO));
+        v->arch.pv_vcpu.gdt_frames[i] = 0;
+    }
+}
+
+long pv_set_gdt(struct vcpu *v, unsigned long *frames, unsigned int entries)
+{
+    struct domain *d = v->domain;
+    l1_pgentry_t *pl1e;
+    /* NB. There are 512 8-byte entries per GDT page. */
+    unsigned int i, nr_pages = (entries + 511) / 512;
+
+    if ( entries > FIRST_RESERVED_GDT_ENTRY )
+        return -EINVAL;
+
+    /* Check the pages in the new GDT. */
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        struct page_info *page;
+
+        page = get_page_from_gfn(d, frames[i], NULL, P2M_ALLOC);
+        if ( !page )
+            goto fail;
+        if ( !get_page_type(page, PGT_seg_desc_page) )
+        {
+            put_page(page);
+            goto fail;
+        }
+        frames[i] = page_to_mfn(page);
+    }
+
+    /* Tear down the old GDT. */
+    pv_destroy_gdt(v);
+
+    /* Install the new GDT. */
+    v->arch.pv_vcpu.gdt_ents = entries;
+    pl1e = gdt_ldt_ptes(d, v);
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        v->arch.pv_vcpu.gdt_frames[i] = frames[i];
+        l1e_write(&pl1e[i], l1e_from_pfn(frames[i], __PAGE_HYPERVISOR_RW));
+    }
+
+    return 0;
+
+ fail:
+    while ( i-- > 0 )
+    {
+        put_page_and_type(mfn_to_page(frames[i]));
+    }
+    return -EINVAL;
+}
+
+
+long do_set_gdt(XEN_GUEST_HANDLE_PARAM(xen_ulong_t) frame_list,
+                unsigned int entries)
+{
+    int nr_pages = (entries + 511) / 512;
+    unsigned long frames[16];
+    struct vcpu *curr = current;
+    long ret;
+
+    /* Rechecked in pv_set_gdt, but ensures a sane limit for copy_from_user(). */
+    if ( entries > FIRST_RESERVED_GDT_ENTRY )
+        return -EINVAL;
+
+    if ( copy_from_guest(frames, frame_list, nr_pages) )
+        return -EFAULT;
+
+    domain_lock(curr->domain);
+
+    if ( (ret = pv_set_gdt(curr, frames, entries)) == 0 )
+        flush_tlb_local();
+
+    domain_unlock(curr->domain);
+
+    return ret;
+}
+
+long do_update_descriptor(u64 pa, u64 desc)
+{
+    struct domain *dom = current->domain;
+    unsigned long gmfn = pa >> PAGE_SHIFT;
+    unsigned long mfn;
+    unsigned int  offset;
+    struct desc_struct *gdt_pent, d;
+    struct page_info *page;
+    long ret = -EINVAL;
+
+    offset = ((unsigned int)pa & ~PAGE_MASK) / sizeof(struct desc_struct);
+
+    *(u64 *)&d = desc;
+
+    page = get_page_from_gfn(dom, gmfn, NULL, P2M_ALLOC);
+    if ( (((unsigned int)pa % sizeof(struct desc_struct)) != 0) ||
+         !page ||
+         !check_descriptor(dom, &d) )
+    {
+        if ( page )
+            put_page(page);
+        return -EINVAL;
+    }
+    mfn = page_to_mfn(page);
+
+    /* Check if the given frame is in use in an unsafe context. */
+    switch ( page->u.inuse.type_info & PGT_type_mask )
+    {
+    case PGT_seg_desc_page:
+        if ( unlikely(!get_page_type(page, PGT_seg_desc_page)) )
+            goto out;
+        break;
+    default:
+        if ( unlikely(!get_page_type(page, PGT_writable_page)) )
+            goto out;
+        break;
+    }
+
+    paging_mark_dirty(dom, _mfn(mfn));
+
+    /* All is good so make the update. */
+    gdt_pent = map_domain_page(_mfn(mfn));
+    write_atomic((uint64_t *)&gdt_pent[offset], *(uint64_t *)&d);
+    unmap_domain_page(gdt_pent);
+
+    put_page_type(page);
+
+    ret = 0; /* success */
+
+ out:
+    put_page(page);
+
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
index ef0ff86519..d61fb89c27 100644
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -6,13 +6,15 @@
 #include <asm/mem_paging.h>
 #include <asm/mem_sharing.h>
 
+#include <asm/pv/processor.h>
+
 int compat_set_gdt(XEN_GUEST_HANDLE_PARAM(uint) frame_list, unsigned int entries)
 {
     unsigned int i, nr_pages = (entries + 511) / 512;
     unsigned long frames[16];
     long ret;
 
-    /* Rechecked in set_gdt, but ensures a sane limit for copy_from_user(). */
+    /* Rechecked in pv_set_gdt, but ensures a sane limit for copy_from_user(). */
     if ( entries > FIRST_RESERVED_GDT_ENTRY )
         return -EINVAL;
 
@@ -31,7 +33,7 @@ int compat_set_gdt(XEN_GUEST_HANDLE_PARAM(uint) frame_list, unsigned int entries
 
     domain_lock(current->domain);
 
-    if ( (ret = set_gdt(current, frames, entries)) == 0 )
+    if ( (ret = pv_set_gdt(current, frames, entries)) == 0 )
         flush_tlb_local();
 
     domain_unlock(current->domain);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 4bef698633..747fcbdc75 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -466,11 +466,6 @@ extern void init_int80_direct_trap(struct vcpu *v);
 
 extern void write_ptbase(struct vcpu *v);
 
-void destroy_gdt(struct vcpu *d);
-long set_gdt(struct vcpu *d, 
-             unsigned long *frames, 
-             unsigned int entries);
-
 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
 static always_inline void rep_nop(void)
 {
diff --git a/xen/include/asm-x86/pv/processor.h b/xen/include/asm-x86/pv/processor.h
new file mode 100644
index 0000000000..8ab5773871
--- /dev/null
+++ b/xen/include/asm-x86/pv/processor.h
@@ -0,0 +1,40 @@
+/*
+ * asm-x86/pv/processor.h
+ *
+ * Vcpu interfaces for PV guests
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __X86_PV_PROCESSOR_H__
+#define __X86_PV_PROCESSOR_H__
+
+#ifdef CONFIG_PV
+
+void pv_destroy_gdt(struct vcpu *d);
+long pv_set_gdt(struct vcpu *d, unsigned long *frames, unsigned int entries);
+
+#else
+
+#include <xen/errno.h>
+
+static inline void pv_destroy_gdt(struct vcpu *d) {}
+static inline long pv_set_gdt(struct vcpu *d, unsigned long *frames,
+                              unsigned int entries)
+{ return -EINVAL; }
+
+#endif
+
+#endif /* __X86_PV_PROCESSOR_H__ */
-- 
2.11.0



* [PATCH v4 17/31] x86/mm: move compat descriptor handling code
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (15 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 16/31] x86/mm: split out descriptor " Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 18/31] x86/mm: move and rename map_ldt_shadow_page Wei Liu
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move them alongside the non-compat variants.
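
The wrappers' only real logic is reassembling a 64-bit argument from
the two 32-bit halves a compat guest passes, as compat_update_descriptor
does with pa_lo/pa_hi. A standalone sketch of just that recombination
(combine() is an invented name):

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t combine(uint32_t lo, uint32_t hi)
    {
        return lo | ((uint64_t)hi << 32);   /* widen hi before shifting */
    }

    int main(void)
    {
        assert(combine(0xdeadbeefu, 0xc0ffeeu) == 0x00c0ffeedeadbeefULL);
        printf("ok\n");
        return 0;
    }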

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/descriptor-tables.c | 40 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/mm.c     | 41 -------------------------------------
 2 files changed, 40 insertions(+), 41 deletions(-)

diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index 12dc45b671..a302812774 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -177,6 +177,46 @@ long do_update_descriptor(u64 pa, u64 desc)
     return ret;
 }
 
+int compat_set_gdt(XEN_GUEST_HANDLE_PARAM(uint) frame_list,
+                   unsigned int entries)
+{
+    unsigned int i, nr_pages = (entries + 511) / 512;
+    unsigned long frames[16];
+    long ret;
+
+    /* Rechecked in pv_set_gdt, but ensures a sane limit for copy_from_user(). */
+    if ( entries > FIRST_RESERVED_GDT_ENTRY )
+        return -EINVAL;
+
+    if ( !guest_handle_okay(frame_list, nr_pages) )
+        return -EFAULT;
+
+    for ( i = 0; i < nr_pages; ++i )
+    {
+        unsigned int frame;
+
+        if ( __copy_from_guest(&frame, frame_list, 1) )
+            return -EFAULT;
+        frames[i] = frame;
+        guest_handle_add_offset(frame_list, 1);
+    }
+
+    domain_lock(current->domain);
+
+    if ( (ret = pv_set_gdt(current, frames, entries)) == 0 )
+        flush_tlb_local();
+
+    domain_unlock(current->domain);
+
+    return ret;
+}
+
+int compat_update_descriptor(u32 pa_lo, u32 pa_hi, u32 desc_lo, u32 desc_hi)
+{
+    return do_update_descriptor(pa_lo | ((u64)pa_hi << 32),
+                                desc_lo | ((u64)desc_hi << 32));
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
index d61fb89c27..89a0b27573 100644
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -6,47 +6,6 @@
 #include <asm/mem_paging.h>
 #include <asm/mem_sharing.h>
 
-#include <asm/pv/processor.h>
-
-int compat_set_gdt(XEN_GUEST_HANDLE_PARAM(uint) frame_list, unsigned int entries)
-{
-    unsigned int i, nr_pages = (entries + 511) / 512;
-    unsigned long frames[16];
-    long ret;
-
-    /* Rechecked in pv_set_gdt, but ensures a sane limit for copy_from_user(). */
-    if ( entries > FIRST_RESERVED_GDT_ENTRY )
-        return -EINVAL;
-
-    if ( !guest_handle_okay(frame_list, nr_pages) )
-        return -EFAULT;
-
-    for ( i = 0; i < nr_pages; ++i )
-    {
-        unsigned int frame;
-
-        if ( __copy_from_guest(&frame, frame_list, 1) )
-            return -EFAULT;
-        frames[i] = frame;
-        guest_handle_add_offset(frame_list, 1);
-    }
-
-    domain_lock(current->domain);
-
-    if ( (ret = pv_set_gdt(current, frames, entries)) == 0 )
-        flush_tlb_local();
-
-    domain_unlock(current->domain);
-
-    return ret;
-}
-
-int compat_update_descriptor(u32 pa_lo, u32 pa_hi, u32 desc_lo, u32 desc_hi)
-{
-    return do_update_descriptor(pa_lo | ((u64)pa_hi << 32),
-                                desc_lo | ((u64)desc_hi << 32));
-}
-
 int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct compat_machphys_mfn_list xmml;
-- 
2.11.0



* [PATCH v4 18/31] x86/mm: move and rename map_ldt_shadow_page
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (16 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 17/31] x86/mm: move compat descriptor handling code Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 19/31] x86/mm: factor out pv_arch_init_memory Wei Liu
                   ` (12 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Take the chance to change v to curr and d to currd in the code. Also
change the return type to bool. Fix up all the call sites.
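
The int -> bool change follows the usual pattern: a function that can
only ever report success or failure reads better, and is harder to
misuse, as a bool predicate. A standalone sketch (try_map() is an
invented stand-in, not the real pv_map_ldt_shadow_page()):

    #include <stdbool.h>
    #include <stdio.h>

    static bool try_map(unsigned int off)
    {
        if ( off >= 16 )    /* reject out-of-range offsets */
            return false;
        /* ... take references, install the mapping ... */
        return true;
    }

    int main(void)
    {
        printf("%d %d\n", try_map(3), try_map(99));   /* prints "1 0" */
        return 0;
    }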

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c                   | 43 -------------------------------------
 xen/arch/x86/pv/descriptor-tables.c | 42 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c                |  5 +++--
 xen/include/asm-x86/mm.h            |  2 --
 xen/include/asm-x86/pv/processor.h  |  2 ++
 5 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 6cbcdabcd2..7f175bacc9 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -580,49 +580,6 @@ static int alloc_segdesc_page(struct page_info *page)
     return i == 512 ? 0 : -EINVAL;
 }
 
-
-/* Map shadow page at offset @off. */
-int map_ldt_shadow_page(unsigned int off)
-{
-    struct vcpu *v = current;
-    struct domain *d = v->domain;
-    unsigned long gmfn;
-    struct page_info *page;
-    l1_pgentry_t l1e, nl1e;
-    unsigned long gva = v->arch.pv_vcpu.ldt_base + (off << PAGE_SHIFT);
-    int okay;
-
-    BUG_ON(unlikely(in_irq()));
-
-    if ( is_pv_32bit_domain(d) )
-        gva = (u32)gva;
-    pv_get_guest_eff_kern_l1e(v, gva, &l1e);
-    if ( unlikely(!(l1e_get_flags(l1e) & _PAGE_PRESENT)) )
-        return 0;
-
-    gmfn = l1e_get_pfn(l1e);
-    page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
-    if ( unlikely(!page) )
-        return 0;
-
-    okay = get_page_type(page, PGT_seg_desc_page);
-    if ( unlikely(!okay) )
-    {
-        put_page(page);
-        return 0;
-    }
-
-    nl1e = l1e_from_pfn(page_to_mfn(page), l1e_get_flags(l1e) | _PAGE_RW);
-
-    spin_lock(&v->arch.pv_vcpu.shadow_ldt_lock);
-    l1e_write(&gdt_ldt_ptes(d, v)[off + 16], nl1e);
-    v->arch.pv_vcpu.shadow_ldt_mapcnt++;
-    spin_unlock(&v->arch.pv_vcpu.shadow_ldt_lock);
-
-    return 1;
-}
-
-
 bool get_page_from_mfn(mfn_t mfn, struct domain *d)
 {
     struct page_info *page = mfn_to_page(mfn_x(mfn));
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index a302812774..6ac5c736cf 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -24,6 +24,7 @@
 #include <xen/hypercall.h>
 
 #include <asm/p2m.h>
+#include <asm/pv/mm.h>
 #include <asm/pv/processor.h>
 
 /*************************
@@ -217,6 +218,47 @@ int compat_update_descriptor(u32 pa_lo, u32 pa_hi, u32 desc_lo, u32 desc_hi)
                                 desc_lo | ((u64)desc_hi << 32));
 }
 
+/* Map shadow page at offset @off. */
+bool pv_map_ldt_shadow_page(unsigned int off)
+{
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    unsigned long gmfn;
+    struct page_info *page;
+    l1_pgentry_t l1e, nl1e;
+    unsigned long gva = curr->arch.pv_vcpu.ldt_base + (off << PAGE_SHIFT);
+    int okay;
+
+    BUG_ON(unlikely(in_irq()));
+
+    if ( is_pv_32bit_domain(currd) )
+        gva = (u32)gva;
+    pv_get_guest_eff_kern_l1e(curr, gva, &l1e);
+    if ( unlikely(!(l1e_get_flags(l1e) & _PAGE_PRESENT)) )
+        return false;
+
+    gmfn = l1e_get_pfn(l1e);
+    page = get_page_from_gfn(currd, gmfn, NULL, P2M_ALLOC);
+    if ( unlikely(!page) )
+        return false;
+
+    okay = get_page_type(page, PGT_seg_desc_page);
+    if ( unlikely(!okay) )
+    {
+        put_page(page);
+        return false;
+    }
+
+    nl1e = l1e_from_pfn(page_to_mfn(page), l1e_get_flags(l1e) | _PAGE_RW);
+
+    spin_lock(&curr->arch.pv_vcpu.shadow_ldt_lock);
+    l1e_write(&gdt_ldt_ptes(currd, curr)[off + 16], nl1e);
+    curr->arch.pv_vcpu.shadow_ldt_mapcnt++;
+    spin_unlock(&curr->arch.pv_vcpu.shadow_ldt_lock);
+
+    return true;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index b93b3d1317..dbdcdf62a6 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -77,6 +77,7 @@
 #include <public/arch-x86/cpuid.h>
 #include <asm/cpuid.h>
 #include <xsm/xsm.h>
+#include <asm/pv/processor.h>
 #include <asm/pv/traps.h>
 
 /*
@@ -1100,7 +1101,7 @@ static int handle_gdt_ldt_mapping_fault(unsigned long offset,
     /*
      * If the fault is in another vcpu's area, it cannot be due to
      * a GDT/LDT descriptor load. Thus we can reasonably exit immediately, and
-     * indeed we have to since map_ldt_shadow_page() works correctly only on
+     * indeed we have to since pv_map_ldt_shadow_page() works correctly only on
      * accesses to a vcpu's own area.
      */
     if ( vcpu_area != curr->vcpu_id )
@@ -1112,7 +1113,7 @@ static int handle_gdt_ldt_mapping_fault(unsigned long offset,
     if ( likely(is_ldt_area) )
     {
         /* LDT fault: Copy a mapping from the guest's LDT, if it is valid. */
-        if ( likely(map_ldt_shadow_page(offset >> PAGE_SHIFT)) )
+        if ( likely(pv_map_ldt_shadow_page(offset >> PAGE_SHIFT)) )
         {
             if ( guest_mode(regs) )
                 trace_trap_two_addr(TRC_PV_GDT_LDT_MAPPING_FAULT,
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index c6e1d01c7d..c11fa680bd 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -530,8 +530,6 @@ long subarch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg);
 int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void));
 int compat_subarch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void));
 
-int map_ldt_shadow_page(unsigned int);
-
 #define NIL(type) ((type *)-sizeof(type))
 #define IS_NIL(ptr) (!((uintptr_t)(ptr) + sizeof(*(ptr))))
 
diff --git a/xen/include/asm-x86/pv/processor.h b/xen/include/asm-x86/pv/processor.h
index 8ab5773871..6f9e1afe8a 100644
--- a/xen/include/asm-x86/pv/processor.h
+++ b/xen/include/asm-x86/pv/processor.h
@@ -25,6 +25,7 @@
 
 void pv_destroy_gdt(struct vcpu *d);
 long pv_set_gdt(struct vcpu *d, unsigned long *frames, unsigned int entries);
+bool pv_map_ldt_shadow_page(unsigned int);
 
 #else
 
@@ -34,6 +35,7 @@ static inline void pv_destroy_gdt(struct vcpu *d) {}
 static inline long pv_set_gdt(struct vcpu *d, unsigned long *frames,
                               unsigned int entries)
 { return -EINVAL; }
+static inline bool pv_map_ldt_shadow_page(unsigned int off) { return false; }
 
 #endif
 
-- 
2.11.0



* [PATCH v4 19/31] x86/mm: factor out pv_arch_init_memory
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (17 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 18/31] x86/mm: move and rename map_ldt_shadow_page Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 20/31] x86/mm: move l4 table setup code Wei Liu
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move the split l4 setup code into the new function. The new function
will also gain other PV-specific setup code in a later patch.
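
The shape of the refactor, as a minimal standalone sketch under
invented names: hoist a debug-only block out of the generic init path
into a helper whose body compiles away in release builds.

    #include <stdio.h>

    static void debug_only_setup(void)
    {
    #ifndef NDEBUG
        printf("debug-build setup ran\n");   /* gone with -DNDEBUG */
    #endif
    }

    static void generic_init(void)
    {
        /* ... setup shared by all builds ... */
        debug_only_setup();   /* one call replaces an inline #ifndef block */
    }

    int main(void)
    {
        generic_init();
        return 0;
    }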

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c | 73 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7f175bacc9..fbf402d16f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -236,6 +236,45 @@ static l4_pgentry_t __read_mostly split_l4e;
 #define root_pgt_pv_xen_slots ROOT_PAGETABLE_PV_XEN_SLOTS
 #endif
 
+static void pv_arch_init_memory(void)
+{
+#ifndef NDEBUG
+    unsigned int i;
+
+    if ( highmem_start )
+    {
+        unsigned long split_va = (unsigned long)__va(highmem_start);
+
+        if ( split_va < HYPERVISOR_VIRT_END &&
+             split_va - 1 == (unsigned long)__va(highmem_start - 1) )
+        {
+            root_pgt_pv_xen_slots = l4_table_offset(split_va) -
+                                    ROOT_PAGETABLE_FIRST_XEN_SLOT;
+            ASSERT(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS);
+            if ( l4_table_offset(split_va) == l4_table_offset(split_va - 1) )
+            {
+                l3_pgentry_t *l3tab = alloc_xen_pagetable();
+
+                if ( l3tab )
+                {
+                    const l3_pgentry_t *l3idle =
+                        l4e_to_l3e(idle_pg_table[l4_table_offset(split_va)]);
+
+                    for ( i = 0; i < l3_table_offset(split_va); ++i )
+                        l3tab[i] = l3idle[i];
+                    for ( ; i < L3_PAGETABLE_ENTRIES; ++i )
+                        l3tab[i] = l3e_empty();
+                    split_l4e = l4e_from_pfn(virt_to_mfn(l3tab),
+                                             __PAGE_HYPERVISOR_RW);
+                }
+                else
+                    ++root_pgt_pv_xen_slots;
+            }
+        }
+    }
+#endif
+}
+
 void __init arch_init_memory(void)
 {
     unsigned long i, pfn, rstart_pfn, rend_pfn, iostart_pfn, ioend_pfn;
@@ -330,39 +369,7 @@ void __init arch_init_memory(void)
 
     mem_sharing_init();
 
-#ifndef NDEBUG
-    if ( highmem_start )
-    {
-        unsigned long split_va = (unsigned long)__va(highmem_start);
-
-        if ( split_va < HYPERVISOR_VIRT_END &&
-             split_va - 1 == (unsigned long)__va(highmem_start - 1) )
-        {
-            root_pgt_pv_xen_slots = l4_table_offset(split_va) -
-                                    ROOT_PAGETABLE_FIRST_XEN_SLOT;
-            ASSERT(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS);
-            if ( l4_table_offset(split_va) == l4_table_offset(split_va - 1) )
-            {
-                l3_pgentry_t *l3tab = alloc_xen_pagetable();
-
-                if ( l3tab )
-                {
-                    const l3_pgentry_t *l3idle =
-                        l4e_to_l3e(idle_pg_table[l4_table_offset(split_va)]);
-
-                    for ( i = 0; i < l3_table_offset(split_va); ++i )
-                        l3tab[i] = l3idle[i];
-                    for ( ; i < L3_PAGETABLE_ENTRIES; ++i )
-                        l3tab[i] = l3e_empty();
-                    split_l4e = l4e_from_pfn(virt_to_mfn(l3tab),
-                                             __PAGE_HYPERVISOR_RW);
-                }
-                else
-                    ++root_pgt_pv_xen_slots;
-            }
-        }
-    }
-#endif
+    pv_arch_init_memory();
 }
 
 int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
-- 
2.11.0



* [PATCH v4 20/31] x86/mm: move l4 table setup code
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (18 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 19/31] x86/mm: factor out pv_arch_init_memory Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 21/31] x86/mm: add "pv_" prefix to new_guest_cr3 Wei Liu
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move two functions to pv/mm.c. Add the pv_ prefix to
init_guest_l4_table. Export them via pv/mm.h. Fix up call sites.
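
Moving root_pgt_pv_xen_slots and split_l4e along with the functions
lets them remain file-scope static in pv/mm.c instead of being
exported. The same debug/release split, as a standalone sketch with
invented names:

    #include <stdio.h>

    #ifndef NDEBUG
    static unsigned int nr_slots = 4;   /* adjustable in debug builds */
    #else
    #define nr_slots 4u                 /* folded to a constant otherwise */
    #endif

    static void init_table(void)
    {
        printf("copying %u reserved slots\n", nr_slots);
    }

    int main(void)
    {
        init_table();
        return 0;
    }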

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c            | 69 +-------------------------------------------
 xen/arch/x86/pv/dom0_build.c |  3 +-
 xen/arch/x86/pv/domain.c     |  3 +-
 xen/arch/x86/pv/mm.c         | 68 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/mm.h     |  2 --
 xen/include/asm-x86/pv/mm.h  |  8 +++++
 6 files changed, 81 insertions(+), 72 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fbf402d16f..ec523a4f51 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -228,53 +228,6 @@ void __init init_frametable(void)
     memset(end_pg, -1, (unsigned long)top_pg - (unsigned long)end_pg);
 }
 
-#ifndef NDEBUG
-static unsigned int __read_mostly root_pgt_pv_xen_slots
-    = ROOT_PAGETABLE_PV_XEN_SLOTS;
-static l4_pgentry_t __read_mostly split_l4e;
-#else
-#define root_pgt_pv_xen_slots ROOT_PAGETABLE_PV_XEN_SLOTS
-#endif
-
-static void pv_arch_init_memory(void)
-{
-#ifndef NDEBUG
-    unsigned int i;
-
-    if ( highmem_start )
-    {
-        unsigned long split_va = (unsigned long)__va(highmem_start);
-
-        if ( split_va < HYPERVISOR_VIRT_END &&
-             split_va - 1 == (unsigned long)__va(highmem_start - 1) )
-        {
-            root_pgt_pv_xen_slots = l4_table_offset(split_va) -
-                                    ROOT_PAGETABLE_FIRST_XEN_SLOT;
-            ASSERT(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS);
-            if ( l4_table_offset(split_va) == l4_table_offset(split_va - 1) )
-            {
-                l3_pgentry_t *l3tab = alloc_xen_pagetable();
-
-                if ( l3tab )
-                {
-                    const l3_pgentry_t *l3idle =
-                        l4e_to_l3e(idle_pg_table[l4_table_offset(split_va)]);
-
-                    for ( i = 0; i < l3_table_offset(split_va); ++i )
-                        l3tab[i] = l3idle[i];
-                    for ( ; i < L3_PAGETABLE_ENTRIES; ++i )
-                        l3tab[i] = l3e_empty();
-                    split_l4e = l4e_from_pfn(virt_to_mfn(l3tab),
-                                             __PAGE_HYPERVISOR_RW);
-                }
-                else
-                    ++root_pgt_pv_xen_slots;
-            }
-        }
-    }
-#endif
-}
-
 void __init arch_init_memory(void)
 {
     unsigned long i, pfn, rstart_pfn, rend_pfn, iostart_pfn, ioend_pfn;
@@ -1433,26 +1386,6 @@ static int alloc_l3_table(struct page_info *page)
     return rc > 0 ? 0 : rc;
 }
 
-void init_guest_l4_table(l4_pgentry_t l4tab[], const struct domain *d,
-                         bool zap_ro_mpt)
-{
-    /* Xen private mappings. */
-    memcpy(&l4tab[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           root_pgt_pv_xen_slots * sizeof(l4_pgentry_t));
-#ifndef NDEBUG
-    if ( l4e_get_intpte(split_l4e) )
-        l4tab[ROOT_PAGETABLE_FIRST_XEN_SLOT + root_pgt_pv_xen_slots] =
-            split_l4e;
-#endif
-    l4tab[l4_table_offset(LINEAR_PT_VIRT_START)] =
-        l4e_from_pfn(domain_page_map_to_mfn(l4tab), __PAGE_HYPERVISOR_RW);
-    l4tab[l4_table_offset(PERDOMAIN_VIRT_START)] =
-        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
-    if ( zap_ro_mpt || is_pv_32bit_domain(d) )
-        l4tab[l4_table_offset(RO_MPT_VIRT_START)] = l4e_empty();
-}
-
 bool fill_ro_mpt(unsigned long mfn)
 {
     l4_pgentry_t *l4tab = map_domain_page(_mfn(mfn));
@@ -1527,7 +1460,7 @@ static int alloc_l4_table(struct page_info *page)
 
     if ( rc >= 0 )
     {
-        init_guest_l4_table(pl4e, d, !VM_ASSIST(d, m2p_strict));
+        pv_init_guest_l4_table(pl4e, d, !VM_ASSIST(d, m2p_strict));
         atomic_inc(&d->arch.pv_domain.nr_l4_pages);
         rc = 0;
     }
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index e67ffdd7b8..7e1f8f2ea1 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -18,6 +18,7 @@
 #include <asm/bzimage.h>
 #include <asm/dom0_build.h>
 #include <asm/page.h>
+#include <asm/pv/mm.h>
 #include <asm/setup.h>
 
 /* Allow ring-3 access in long mode as guest cannot use ring 1 ... */
@@ -588,7 +589,7 @@ int __init dom0_construct_pv(struct domain *d,
         l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
     }
     clear_page(l4tab);
-    init_guest_l4_table(l4tab, d, 0);
+    pv_init_guest_l4_table(l4tab, d, 0);
     v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
     if ( is_pv_32bit_domain(d) )
         v->arch.guest_table_user = v->arch.guest_table;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 6cb61f2e14..415d0634a3 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -10,6 +10,7 @@
 #include <xen/sched.h>
 
 #include <asm/pv/domain.h>
+#include <asm/pv/mm.h>
 
 static void noreturn continue_nonidle_domain(struct vcpu *v)
 {
@@ -29,7 +30,7 @@ static int setup_compat_l4(struct vcpu *v)
 
     l4tab = __map_domain_page(pg);
     clear_page(l4tab);
-    init_guest_l4_table(l4tab, v->domain, 1);
+    pv_init_guest_l4_table(l4tab, v->domain, 1);
     unmap_domain_page(l4tab);
 
     /* This page needs to look like a pagetable so that it can be shadowed */
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 32e73d59df..0f4303cef2 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -23,6 +23,7 @@
 #include <xen/guest_access.h>
 
 #include <asm/pv/mm.h>
+#include <asm/setup.h>
 
 /*
  * PTE updates can be done with ordinary writes except:
@@ -32,6 +33,14 @@
 #define PTE_UPDATE_WITH_CMPXCHG
 #endif
 
+#ifndef NDEBUG
+static unsigned int __read_mostly root_pgt_pv_xen_slots
+    = ROOT_PAGETABLE_PV_XEN_SLOTS;
+static l4_pgentry_t __read_mostly split_l4e;
+#else
+#define root_pgt_pv_xen_slots ROOT_PAGETABLE_PV_XEN_SLOTS
+#endif
+
 /* Read a PV guest's l1e that maps this virtual address. */
 void pv_get_guest_eff_l1e(unsigned long addr, l1_pgentry_t *eff_l1e)
 {
@@ -96,6 +105,65 @@ void pv_unmap_guest_l1e(void *p)
     unmap_domain_page(p);
 }
 
+void pv_init_guest_l4_table(l4_pgentry_t l4tab[], const struct domain *d,
+                            bool zap_ro_mpt)
+{
+    /* Xen private mappings. */
+    memcpy(&l4tab[ROOT_PAGETABLE_FIRST_XEN_SLOT],
+           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
+           root_pgt_pv_xen_slots * sizeof(l4_pgentry_t));
+#ifndef NDEBUG
+    if ( l4e_get_intpte(split_l4e) )
+        l4tab[ROOT_PAGETABLE_FIRST_XEN_SLOT + root_pgt_pv_xen_slots] =
+            split_l4e;
+#endif
+    l4tab[l4_table_offset(LINEAR_PT_VIRT_START)] =
+        l4e_from_pfn(domain_page_map_to_mfn(l4tab), __PAGE_HYPERVISOR_RW);
+    l4tab[l4_table_offset(PERDOMAIN_VIRT_START)] =
+        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
+    if ( zap_ro_mpt || is_pv_32bit_domain(d) )
+        l4tab[l4_table_offset(RO_MPT_VIRT_START)] = l4e_empty();
+}
+
+void pv_arch_init_memory(void)
+{
+#ifndef NDEBUG
+    unsigned int i;
+
+    if ( highmem_start )
+    {
+        unsigned long split_va = (unsigned long)__va(highmem_start);
+
+        if ( split_va < HYPERVISOR_VIRT_END &&
+             split_va - 1 == (unsigned long)__va(highmem_start - 1) )
+        {
+            root_pgt_pv_xen_slots = l4_table_offset(split_va) -
+                                    ROOT_PAGETABLE_FIRST_XEN_SLOT;
+            ASSERT(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS);
+            if ( l4_table_offset(split_va) == l4_table_offset(split_va - 1) )
+            {
+                l3_pgentry_t *l3tab = alloc_xen_pagetable();
+
+                if ( l3tab )
+                {
+                    const l3_pgentry_t *l3idle =
+                        l4e_to_l3e(idle_pg_table[l4_table_offset(split_va)]);
+
+                    for ( i = 0; i < l3_table_offset(split_va); ++i )
+                        l3tab[i] = l3idle[i];
+                    for ( ; i < L3_PAGETABLE_ENTRIES; ++i )
+                        l3tab[i] = l3e_empty();
+                    split_l4e = l4e_from_pfn(virt_to_mfn(l3tab),
+                                             __PAGE_HYPERVISOR_RW);
+                }
+                else
+                    ++root_pgt_pv_xen_slots;
+            }
+        }
+    }
+#endif
+}
+
 /*
  * How to write an entry to the guest pagetables.
  * Returns false for failure (pointer not valid), true for success.
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index c11fa680bd..a6352e6fc9 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -305,8 +305,6 @@ static inline void *__page_to_virt(const struct page_info *pg)
 int free_page_type(struct page_info *page, unsigned long type,
                    int preemptible);
 
-void init_guest_l4_table(l4_pgentry_t[], const struct domain *,
-                         bool_t zap_ro_mpt);
 bool_t fill_ro_mpt(unsigned long mfn);
 void zap_ro_mpt(unsigned long mfn);
 
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 006156d0e1..4ecbf50b18 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -89,6 +89,10 @@ bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
 l1_pgentry_t *pv_map_guest_l1e(unsigned long addr, unsigned long *gl1mfn);
 void pv_unmap_guest_l1e(void *p);
 
+void pv_init_guest_l4_table(l4_pgentry_t[], const struct domain *,
+                            bool zap_ro_mpt);
+void pv_arch_init_memory(void);
+
 #else
 
 static inline void pv_get_guest_eff_l1e(unsigned long addr,
@@ -110,6 +114,10 @@ static inline l1_pgentry_t *pv_map_guest_l1e(unsigned long addr,
 
 static inline void pv_unmap_guest_l1e(void *p) {}
 
+static inline void pv_init_guest_l4_table(l4_pgentry_t l4tab[],
+                                          const struct domain *d,
+                                          bool zap_ro_mpt) {}
+static inline void pv_arch_init_memory(void) {}
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0



* [PATCH v4 21/31] x86/mm: add "pv_" prefix to new_guest_cr3
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (19 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 20/31] x86/mm: move l4 table setup code Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 22/31] x86: add pv_ prefix to {alloc, free}_page_type Wei Liu
                   ` (9 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Also take the chance to change d to currd. This function can't be
moved yet; it can only be moved together with other functions.
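
The header gains the usual compiled-out stub so callers need no #ifdef
of their own; with !CONFIG_PV the operation simply fails with -EINVAL,
as the new pv/mm.h stub does. A standalone sketch of the pattern
(CONFIG_FOO and foo_op() are invented names):

    #include <errno.h>
    #include <stdio.h>

    #ifdef CONFIG_FOO
    int foo_op(unsigned long pfn);          /* real version elsewhere */
    #else
    static inline int foo_op(unsigned long pfn)
    {
        return -EINVAL;                     /* feature compiled out */
    }
    #endif

    int main(void)
    {
        printf("foo_op(42) = %d\n", foo_op(42));
        return 0;
    }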

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c              | 19 ++++++++++---------
 xen/arch/x86/pv/emul-priv-op.c |  3 ++-
 xen/include/asm-x86/mm.h       |  1 -
 xen/include/asm-x86/pv/mm.h    |  7 +++++++
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ec523a4f51..928f4330e7 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2469,14 +2469,14 @@ int vcpu_destroy_pagetables(struct vcpu *v)
     return rc != -EINTR ? rc : -ERESTART;
 }
 
-int new_guest_cr3(unsigned long mfn)
+int pv_new_guest_cr3(unsigned long mfn)
 {
     struct vcpu *curr = current;
-    struct domain *d = curr->domain;
+    struct domain *currd = curr->domain;
     int rc;
     unsigned long old_base_mfn;
 
-    if ( is_pv_32bit_domain(d) )
+    if ( is_pv_32bit_domain(currd) )
     {
         unsigned long gt_mfn = pagetable_get_pfn(curr->arch.guest_table);
         l4_pgentry_t *pl4e = map_domain_page(_mfn(gt_mfn));
@@ -2522,9 +2522,10 @@ int new_guest_cr3(unsigned long mfn)
         return 0;
     }
 
-    rc = paging_mode_refcounts(d)
-         ? (get_page_from_mfn(_mfn(mfn), d) ? 0 : -EINVAL)
-         : get_page_and_type_from_mfn(_mfn(mfn), PGT_root_page_table, d, 0, 1);
+    rc = paging_mode_refcounts(currd)
+         ? (get_page_from_mfn(_mfn(mfn), currd) ? 0 : -EINVAL)
+         : get_page_and_type_from_mfn(_mfn(mfn), PGT_root_page_table,
+                                      currd, 0, 1);
     switch ( rc )
     {
     case 0:
@@ -2540,7 +2541,7 @@ int new_guest_cr3(unsigned long mfn)
 
     invalidate_shadow_ldt(curr, 0);
 
-    if ( !VM_ASSIST(d, m2p_strict) && !paging_mode_refcounts(d) )
+    if ( !VM_ASSIST(currd, m2p_strict) && !paging_mode_refcounts(currd) )
         fill_ro_mpt(mfn);
     curr->arch.guest_table = pagetable_from_pfn(mfn);
     update_cr3(curr);
@@ -2551,7 +2552,7 @@ int new_guest_cr3(unsigned long mfn)
     {
         struct page_info *page = mfn_to_page(old_base_mfn);
 
-        if ( paging_mode_refcounts(d) )
+        if ( paging_mode_refcounts(currd) )
             put_page(page);
         else
             switch ( rc = put_page_and_type_preemptible(page) )
@@ -2876,7 +2877,7 @@ long do_mmuext_op(
             else if ( unlikely(paging_mode_translate(currd)) )
                 rc = -EINVAL;
             else
-                rc = new_guest_cr3(op.arg1.mfn);
+                rc = pv_new_guest_cr3(op.arg1.mfn);
             break;
 
         case MMUEXT_NEW_USER_BASEPTR: {
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index d50f51944f..d549c7ce1f 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -32,6 +32,7 @@
 #include <asm/hypercall.h>
 #include <asm/mc146818rtc.h>
 #include <asm/p2m.h>
+#include <asm/pv/mm.h>
 #include <asm/pv/traps.h>
 #include <asm/shared.h>
 #include <asm/traps.h>
@@ -768,7 +769,7 @@ static int priv_op_write_cr(unsigned int reg, unsigned long val,
         page = get_page_from_gfn(currd, gfn, NULL, P2M_ALLOC);
         if ( !page )
             break;
-        rc = new_guest_cr3(page_to_mfn(page));
+        rc = pv_new_guest_cr3(page_to_mfn(page));
         put_page(page);
 
         switch ( rc )
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index a6352e6fc9..521a8b1b7b 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -514,7 +514,6 @@ void audit_domains(void);
 
 #endif
 
-int new_guest_cr3(unsigned long pfn);
 void make_cr3(struct vcpu *v, unsigned long mfn);
 void update_cr3(struct vcpu *v);
 int vcpu_destroy_pagetables(struct vcpu *);
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 4ecbf50b18..648b26d7d0 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -93,8 +93,12 @@ void pv_init_guest_l4_table(l4_pgentry_t[], const struct domain *,
                             bool zap_ro_mpt);
 void pv_arch_init_memory(void);
 
+int pv_new_guest_cr3(unsigned long pfn);
+
 #else
 
+#include <xen/errno.h>
+
 static inline void pv_get_guest_eff_l1e(unsigned long addr,
                                         l1_pgentry_t *eff_l1e)
 {}
@@ -118,6 +122,9 @@ static inline void pv_init_guest_l4_table(l4_pgentry_t[],
                                           const struct domain *,
                                           bool zap_ro_mpt) {}
 static inline void pv_arch_init_memory(void) {}
+
+static inline int pv_new_guest_cr3(unsigned long pfn) { return -EINVAL; }
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0



* [PATCH v4 22/31] x86: add pv_ prefix to {alloc, free}_page_type
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (20 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 21/31] x86/mm: add "pv_" prefix to new_guest_cr3 Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 23/31] x86/mm: export more get/put page functions Wei Liu
                   ` (8 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

These functions are only useful for PV guests. Also change the
preemptible parameter from int to bool.
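
A before/after sketch of a typical call site (trimmed from the
relinquish_memory() hunk below) shows what the change buys:

    /* Before: a bare integer flag, its meaning unclear at the call site. */
    ret = free_page_type(page, x, 1);

    /* After: the pv_ prefix marks it PV-only and the bool conveys intent. */
    ret = pv_free_page_type(page, x, true /* preemptible */);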

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/domain.c    |  2 +-
 xen/arch/x86/mm.c        | 12 ++++++------
 xen/include/asm-x86/mm.h |  4 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 9a25c04f6c..8fae38485c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1806,7 +1806,7 @@ static int relinquish_memory(
             if ( likely(y == x) )
             {
                 /* No need for atomic update of type_info here: noone else updates it. */
-                switch ( ret = free_page_type(page, x, 1) )
+                switch ( ret = pv_free_page_type(page, x, true) )
                 {
                 case 0:
                     break;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 928f4330e7..1fdae6e1e6 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2006,8 +2006,8 @@ static void get_page_light(struct page_info *page)
     while ( unlikely(y != x) );
 }
 
-static int alloc_page_type(struct page_info *page, unsigned long type,
-                           int preemptible)
+static int pv_alloc_page_type(struct page_info *page, unsigned long type,
+                              bool preemptible)
 {
     struct domain *owner = page_get_owner(page);
     int rc;
@@ -2079,8 +2079,8 @@ static int alloc_page_type(struct page_info *page, unsigned long type,
 }
 
 
-int free_page_type(struct page_info *page, unsigned long type,
-                   int preemptible)
+int pv_free_page_type(struct page_info *page, unsigned long type,
+                      bool preemptible)
 {
     struct domain *owner = page_get_owner(page);
     unsigned long gmfn;
@@ -2137,7 +2137,7 @@ int free_page_type(struct page_info *page, unsigned long type,
 static int __put_final_page_type(
     struct page_info *page, unsigned long type, int preemptible)
 {
-    int rc = free_page_type(page, type, preemptible);
+    int rc = pv_free_page_type(page, type, preemptible);
 
     /* No need for atomic update of type_info here: noone else updates it. */
     if ( rc == 0 )
@@ -2353,7 +2353,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
             page->nr_validated_ptes = 0;
             page->partial_pte = 0;
         }
-        rc = alloc_page_type(page, type, preemptible);
+        rc = pv_alloc_page_type(page, type, preemptible);
     }
 
     if ( (x & PGT_partial) && !(nx & PGT_partial) )
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 521a8b1b7b..a5662f327b 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -302,8 +302,8 @@ static inline void *__page_to_virt(const struct page_info *pg)
                     (PAGE_SIZE / (sizeof(*pg) & -sizeof(*pg))));
 }
 
-int free_page_type(struct page_info *page, unsigned long type,
-                   int preemptible);
+int pv_free_page_type(struct page_info *page, unsigned long type,
+                      bool preemptible);
 
 bool_t fill_ro_mpt(unsigned long mfn);
 void zap_ro_mpt(unsigned long mfn);
-- 
2.11.0



* [PATCH v4 23/31] x86/mm: export more get/put page functions
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (21 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 22/31] x86: add pv_ prefix to {alloc, free}_page_type Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 24/31] x86/mm: move and add pv_ prefix to create_pae_xen_mappings Wei Liu
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Export some of the get/put functions so that we can move PV mm code
chunk by chunk.

Once the code movement is done, some of these functions might be made
static again.

Also fix coding style issues and use bool when appropriate.
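
As a sketch of the intended use, callers moved out of mm.c can now
take a typed reference through the exported helper (copied from the
pv_new_guest_cr3() hunk below, with parameter comments added):

    rc = paging_mode_refcounts(currd)
         ? (get_page_from_mfn(_mfn(mfn), currd) ? 0 : -EINVAL)
         : get_page_and_type_from_mfn(_mfn(mfn), PGT_root_page_table,
                                      currd, 0 /* partial */,
                                      true /* preemptible */);
    /* -ERESTART means we were preempted and will be restarted via the
     * usual hypercall continuation machinery. */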

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c        | 40 ++++++++++++++++++++--------------------
 xen/include/asm-x86/mm.h | 17 +++++++++++++++--
 2 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 1fdae6e1e6..fb6485500f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -555,9 +555,8 @@ bool get_page_from_mfn(mfn_t mfn, struct domain *d)
 }
 
 
-static int get_page_and_type_from_mfn(
-    mfn_t mfn, unsigned long type, struct domain *d,
-    int partial, int preemptible)
+int get_page_and_type_from_mfn(mfn_t mfn, unsigned long type, struct domain *d,
+                               int partial, bool preemptible)
 {
     struct page_info *page = mfn_to_page(mfn_x(mfn));
     int rc;
@@ -940,7 +939,7 @@ get_page_from_l1e(
  *  <0 => error code
  */
 define_get_linear_pagetable(l2);
-static int
+int
 get_page_from_l2e(
     l2_pgentry_t l2e, unsigned long pfn, struct domain *d)
 {
@@ -959,7 +958,8 @@ get_page_from_l2e(
 
     if ( !(l2e_get_flags(l2e) & _PAGE_PSE) )
     {
-        rc = get_page_and_type_from_mfn(_mfn(mfn), PGT_l1_page_table, d, 0, 0);
+        rc = get_page_and_type_from_mfn(_mfn(mfn), PGT_l1_page_table, d, 0,
+                                        false);
         if ( unlikely(rc == -EINVAL) && get_l2_linear_pagetable(l2e, pfn, d) )
             rc = 0;
         return rc;
@@ -976,7 +976,7 @@ get_page_from_l2e(
  *  <0 => error code
  */
 define_get_linear_pagetable(l3);
-static int
+int
 get_page_from_l3e(
     l3_pgentry_t l3e, unsigned long pfn, struct domain *d, int partial)
 {
@@ -992,8 +992,8 @@ get_page_from_l3e(
         return -EINVAL;
     }
 
-    rc = get_page_and_type_from_mfn(
-        _mfn(l3e_get_pfn(l3e)), PGT_l2_page_table, d, partial, 1);
+    rc = get_page_and_type_from_mfn(_mfn(l3e_get_pfn(l3e)), PGT_l2_page_table,
+                                    d, partial, true);
     if ( unlikely(rc == -EINVAL) &&
          !is_pv_32bit_domain(d) &&
          get_l3_linear_pagetable(l3e, pfn, d) )
@@ -1009,7 +1009,7 @@ get_page_from_l3e(
  *  <0 => error code
  */
 define_get_linear_pagetable(l4);
-static int
+int
 get_page_from_l4e(
     l4_pgentry_t l4e, unsigned long pfn, struct domain *d, int partial)
 {
@@ -1025,8 +1025,8 @@ get_page_from_l4e(
         return -EINVAL;
     }
 
-    rc = get_page_and_type_from_mfn(
-        _mfn(l4e_get_pfn(l4e)), PGT_l3_page_table, d, partial, 1);
+    rc = get_page_and_type_from_mfn(_mfn(l4e_get_pfn(l4e)), PGT_l3_page_table,
+                                    d, partial, true);
     if ( unlikely(rc == -EINVAL) && get_l4_linear_pagetable(l4e, pfn, d) )
         rc = 0;
 
@@ -1097,7 +1097,7 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
  * NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'.
  * Note also that this automatically deals correctly with linear p.t.'s.
  */
-static int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
+int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
 {
     if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) || (l2e_get_pfn(l2e) == pfn) )
         return 1;
@@ -1117,8 +1117,8 @@ static int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
 
 static int __put_page_type(struct page_info *, int preemptible);
 
-static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
-                             int partial, bool defer)
+int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
+                      bool defer)
 {
     struct page_info *pg;
 
@@ -1155,8 +1155,8 @@ static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
     return put_page_and_type_preemptible(pg);
 }
 
-static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
-                             int partial, bool defer)
+int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
+                      bool defer)
 {
     if ( (l4e_get_flags(l4e) & _PAGE_PRESENT) &&
          (l4e_get_pfn(l4e) != pfn) )
@@ -1340,7 +1340,7 @@ static int alloc_l3_table(struct page_info *page)
             else
                 rc = get_page_and_type_from_mfn(
                     _mfn(l3e_get_pfn(pl3e[i])),
-                    PGT_l2_page_table | PGT_pae_xen_l2, d, partial, 1);
+                    PGT_l2_page_table | PGT_pae_xen_l2, d, partial, true);
         }
         else if ( !is_guest_l3_slot(i) ||
                   (rc = get_page_from_l3e(pl3e[i], pfn, d, partial)) > 0 )
@@ -1992,7 +1992,7 @@ int get_page(struct page_info *page, struct domain *domain)
  *   acquired reference again.
  * Due to get_page() reserving one reference, this call cannot fail.
  */
-static void get_page_light(struct page_info *page)
+void get_page_light(struct page_info *page)
 {
     unsigned long x, nx, y = page->count_info;
 
@@ -2525,7 +2525,7 @@ int pv_new_guest_cr3(unsigned long mfn)
     rc = paging_mode_refcounts(currd)
          ? (get_page_from_mfn(_mfn(mfn), currd) ? 0 : -EINVAL)
          : get_page_and_type_from_mfn(_mfn(mfn), PGT_root_page_table,
-                                      currd, 0, 1);
+                                      currd, 0, true);
     switch ( rc )
     {
     case 0:
@@ -2901,7 +2901,7 @@ long do_mmuext_op(
             if ( op.arg1.mfn != 0 )
             {
                 rc = get_page_and_type_from_mfn(
-                    _mfn(op.arg1.mfn), PGT_root_page_table, currd, 0, 1);
+                    _mfn(op.arg1.mfn), PGT_root_page_table, currd, 0, true);
 
                 if ( unlikely(rc) )
                 {
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index a5662f327b..07d4c06fc3 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -339,10 +339,23 @@ int  get_page_type(struct page_info *page, unsigned long type);
 int  put_page_type_preemptible(struct page_info *page);
 int  get_page_type_preemptible(struct page_info *page, unsigned long type);
 int  put_old_guest_table(struct vcpu *);
-int  get_page_from_l1e(
-    l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner);
+int  get_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner,
+                       struct domain *pg_owner);
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner);
+int get_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn, struct domain *d);
+int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn);
+int get_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, struct domain *d,
+                      int partial);
+int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
+                      bool defer);
+int get_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, struct domain *d,
+                      int partial);
+int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
+                      bool defer);
+void get_page_light(struct page_info *page);
 bool get_page_from_mfn(mfn_t mfn, struct domain *d);
+int get_page_and_type_from_mfn(mfn_t mfn, unsigned long type, struct domain *d,
+                               int partial, bool preemptible);
 
 static inline void put_page_and_type(struct page_info *page)
 {
-- 
2.11.0



* [PATCH v4 24/31] x86/mm: move and add pv_ prefix to create_pae_xen_mappings
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (22 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 23/31] x86/mm: export more get/put page functions Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 25/31] x86/mm: move disallow_mask variable and macros Wei Liu
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

And export it via a local header, because it is going to be used by
several PV-specific files.

Take the chance to change its return type to bool.
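
With a bool return type the call sites read as plain predicates; both
existing callers (see the mm.c hunks below) follow the same shape:

    /* alloc_l3_table(): a guest-triggerable failure, so return -EINVAL. */
    if ( rc >= 0 && !pv_create_pae_xen_mappings(d, pl3e) )
        rc = -EINVAL;

    /* mod_l3_entry(): failure here would be a hypervisor bug. */
    if ( !pv_create_pae_xen_mappings(d, pl3e) )
        BUG();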

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c    | 46 ++++------------------------------------------
 xen/arch/x86/pv/mm.c | 40 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/pv/mm.h |  6 ++++++
 3 files changed, 50 insertions(+), 42 deletions(-)
 create mode 100644 xen/arch/x86/pv/mm.h

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fb6485500f..5bfdfabc5e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -127,6 +127,8 @@
 #include <asm/pv/grant_table.h>
 #include <asm/pv/mm.h>
 
+#include "pv/mm.h"
+
 /* Mapping of the fixmap space needed early. */
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
@@ -1219,46 +1221,6 @@ static int alloc_l1_table(struct page_info *page)
     return ret;
 }
 
-static int create_pae_xen_mappings(struct domain *d, l3_pgentry_t *pl3e)
-{
-    struct page_info *page;
-    l3_pgentry_t     l3e3;
-
-    if ( !is_pv_32bit_domain(d) )
-        return 1;
-
-    pl3e = (l3_pgentry_t *)((unsigned long)pl3e & PAGE_MASK);
-
-    /* 3rd L3 slot contains L2 with Xen-private mappings. It *must* exist. */
-    l3e3 = pl3e[3];
-    if ( !(l3e_get_flags(l3e3) & _PAGE_PRESENT) )
-    {
-        gdprintk(XENLOG_WARNING, "PAE L3 3rd slot is empty\n");
-        return 0;
-    }
-
-    /*
-     * The Xen-private mappings include linear mappings. The L2 thus cannot
-     * be shared by multiple L3 tables. The test here is adequate because:
-     *  1. Cannot appear in slots != 3 because get_page_type() checks the
-     *     PGT_pae_xen_l2 flag, which is asserted iff the L2 appears in slot 3
-     *  2. Cannot appear in another page table's L3:
-     *     a. alloc_l3_table() calls this function and this check will fail
-     *     b. mod_l3_entry() disallows updates to slot 3 in an existing table
-     */
-    page = l3e_get_page(l3e3);
-    BUG_ON(page->u.inuse.type_info & PGT_pinned);
-    BUG_ON((page->u.inuse.type_info & PGT_count_mask) == 0);
-    BUG_ON(!(page->u.inuse.type_info & PGT_pae_xen_l2));
-    if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
-    {
-        gdprintk(XENLOG_WARNING, "PAE L3 3rd slot is shared\n");
-        return 0;
-    }
-
-    return 1;
-}
-
 static int alloc_l2_table(struct page_info *page, unsigned long type,
                           int preemptible)
 {
@@ -1363,7 +1325,7 @@ static int alloc_l3_table(struct page_info *page)
         adjust_guest_l3e(pl3e[i], d);
     }
 
-    if ( rc >= 0 && !create_pae_xen_mappings(d, pl3e) )
+    if ( rc >= 0 && !pv_create_pae_xen_mappings(d, pl3e) )
         rc = -EINVAL;
     if ( rc < 0 && rc != -ERESTART && rc != -EINTR )
     {
@@ -1835,7 +1797,7 @@ static int mod_l3_entry(l3_pgentry_t *pl3e,
     }
 
     if ( likely(rc == 0) )
-        if ( !create_pae_xen_mappings(d, pl3e) )
+        if ( !pv_create_pae_xen_mappings(d, pl3e) )
             BUG();
 
     put_page_from_l3e(ol3e, pfn, 0, 1);
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 0f4303cef2..46e1fcf4e5 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -211,6 +211,46 @@ bool pv_update_intpte(intpte_t *p, intpte_t old, intpte_t new,
     return rv;
 }
 
+bool pv_create_pae_xen_mappings(struct domain *d, l3_pgentry_t *pl3e)
+{
+    struct page_info *page;
+    l3_pgentry_t     l3e3;
+
+    if ( !is_pv_32bit_domain(d) )
+        return true;
+
+    pl3e = (l3_pgentry_t *)((unsigned long)pl3e & PAGE_MASK);
+
+    /* 3rd L3 slot contains L2 with Xen-private mappings. It *must* exist. */
+    l3e3 = pl3e[3];
+    if ( !(l3e_get_flags(l3e3) & _PAGE_PRESENT) )
+    {
+        gdprintk(XENLOG_WARNING, "PAE L3 3rd slot is empty\n");
+        return false;
+    }
+
+    /*
+     * The Xen-private mappings include linear mappings. The L2 thus cannot
+     * be shared by multiple L3 tables. The test here is adequate because:
+     *  1. Cannot appear in slots != 3 because get_page_type() checks the
+     *     PGT_pae_xen_l2 flag, which is asserted iff the L2 appears in slot 3
+     *  2. Cannot appear in another page table's L3:
+     *     a. alloc_l3_table() calls this function and this check will fail
+     *     b. mod_l3_entry() disallows updates to slot 3 in an existing table
+     */
+    page = l3e_get_page(l3e3);
+    BUG_ON(page->u.inuse.type_info & PGT_pinned);
+    BUG_ON((page->u.inuse.type_info & PGT_count_mask) == 0);
+    BUG_ON(!(page->u.inuse.type_info & PGT_pae_xen_l2));
+    if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
+    {
+        gdprintk(XENLOG_WARNING, "PAE L3 3rd slot is shared\n");
+        return false;
+    }
+
+    return true;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/pv/mm.h b/xen/arch/x86/pv/mm.h
new file mode 100644
index 0000000000..bafc2b6116
--- /dev/null
+++ b/xen/arch/x86/pv/mm.h
@@ -0,0 +1,6 @@
+#ifndef __PV_MM_H__
+#define __PV_MM_H__
+
+bool pv_create_pae_xen_mappings(struct domain *d, l3_pgentry_t *pl3e);
+
+#endif /* __PV_MM_H__ */
-- 
2.11.0



* [PATCH v4 25/31] x86/mm: move disallow_mask variable and macros
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (23 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 24/31] x86/mm: move and add pv_ prefix to create_pae_xen_mappings Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 26/31] x86/mm: move pv_{alloc, free}_page_type Wei Liu
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

They will be used by both common mm code and PV mm code in the next
few patches. Note that they might be moved again later if they aren't
needed by common mm code any more.
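
For context, a sketch of how the masks are consumed (simplified from
get_page_from_l1e(), which checks against the entry owner's mask):

    unsigned int l1f = l1e_get_flags(l1e);

    /* Reject any flag bits this domain may not set on an L1 entry. */
    if ( l1f & l1_disallow_mask(d) )
    {
        gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n",
                 l1f & l1_disallow_mask(d));
        return -EINVAL;
    }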

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c        | 19 +------------------
 xen/include/asm-x86/mm.h | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5bfdfabc5e..590e7ae65b 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -146,24 +146,7 @@ bool __read_mostly machine_to_phys_mapping_valid;
 
 struct rangeset *__read_mostly mmio_ro_ranges;
 
-static uint32_t base_disallow_mask;
-/* Global bit is allowed to be set on L1 PTEs. Intended for user mappings. */
-#define L1_DISALLOW_MASK ((base_disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL)
-
-#define L2_DISALLOW_MASK base_disallow_mask
-
-#define l3_disallow_mask(d) (!is_pv_32bit_domain(d) ? \
-                             base_disallow_mask : 0xFFFFF198U)
-
-#define L4_DISALLOW_MASK (base_disallow_mask)
-
-#define l1_disallow_mask(d)                                     \
-    ((d != dom_io) &&                                           \
-     (rangeset_is_empty((d)->iomem_caps) &&                     \
-      rangeset_is_empty((d)->arch.ioport_caps) &&               \
-      !has_arch_pdevs(d) &&                                     \
-      is_pv_domain(d)) ?                                        \
-     L1_DISALLOW_MASK : (L1_DISALLOW_MASK & ~PAGE_CACHE_ATTRS))
+uint32_t base_disallow_mask;
 
 static s8 __read_mostly opt_mmio_relax;
 static void __init parse_mmio_relax(const char *s)
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 07d4c06fc3..6857651db1 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -334,6 +334,25 @@ const unsigned long *get_platform_badpages(unsigned int *array_size);
 int page_lock(struct page_info *page);
 void page_unlock(struct page_info *page);
 
+extern uint32_t base_disallow_mask;
+/* Global bit is allowed to be set on L1 PTEs. Intended for user mappings. */
+#define L1_DISALLOW_MASK ((base_disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL)
+
+#define L2_DISALLOW_MASK base_disallow_mask
+
+#define l3_disallow_mask(d) (!is_pv_32bit_domain(d) ? \
+                             base_disallow_mask : 0xFFFFF198U)
+
+#define L4_DISALLOW_MASK (base_disallow_mask)
+
+#define l1_disallow_mask(d)                                     \
+    ((d != dom_io) &&                                           \
+     (rangeset_is_empty((d)->iomem_caps) &&                     \
+      rangeset_is_empty((d)->arch.ioport_caps) &&               \
+      !has_arch_pdevs(d) &&                                     \
+      is_pv_domain(d)) ?                                        \
+     L1_DISALLOW_MASK : (L1_DISALLOW_MASK & ~PAGE_CACHE_ATTRS))
+
 void put_page_type(struct page_info *page);
 int  get_page_type(struct page_info *page, unsigned long type);
 int  put_page_type_preemptible(struct page_info *page);
-- 
2.11.0



* [PATCH v4 26/31] x86/mm: move pv_{alloc, free}_page_type
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (24 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 25/31] x86/mm: move disallow_mask variable and macros Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 27/31] x86/mm: move and add pv_ prefix to invalidate_shadow_ldt Wei Liu
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move them and the helper functions to pv/mm.c.  Use bool in the moved
code where appropriate.

Put BUG() in the stubs, because the callers would call BUG() or
BUG_ON() anyway.
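
Concretely, the stubs (see the pv/mm.h hunk at the end of this patch)
look like this, with the rationale spelled out as a comment:

    /* Reaching a stub means common code took a PV-only path, which is
     * a hypervisor logic error rather than a guest-triggerable one. */
    static inline int pv_alloc_page_type(struct page_info *page,
                                         unsigned long type,
                                         bool preemptible)
    { BUG(); return -EINVAL; }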

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
v4: add BUG() in stubs
---
 xen/arch/x86/domain.c       |   1 +
 xen/arch/x86/mm.c           | 492 --------------------------------------------
 xen/arch/x86/pv/mm.c        | 491 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/mm.h    |   3 -
 xen/include/asm-x86/pv/mm.h |  14 ++
 5 files changed, 506 insertions(+), 495 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8fae38485c..e79a7de7e4 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -64,6 +64,7 @@
 #include <compat/vcpu.h>
 #include <asm/psr.h>
 #include <asm/pv/domain.h>
+#include <asm/pv/mm.h>
 #include <asm/pv/processor.h>
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 590e7ae65b..204c20d6fd 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -510,21 +510,6 @@ static void invalidate_shadow_ldt(struct vcpu *v, int flush)
 }
 
 
-static int alloc_segdesc_page(struct page_info *page)
-{
-    const struct domain *owner = page_get_owner(page);
-    struct desc_struct *descs = __map_domain_page(page);
-    unsigned i;
-
-    for ( i = 0; i < 512; i++ )
-        if ( unlikely(!check_descriptor(owner, &descs[i])) )
-            break;
-
-    unmap_domain_page(descs);
-
-    return i == 512 ? 0 : -EINVAL;
-}
-
 bool get_page_from_mfn(mfn_t mfn, struct domain *d)
 {
     struct page_info *page = mfn_to_page(mfn_x(mfn));
@@ -539,7 +524,6 @@ bool get_page_from_mfn(mfn_t mfn, struct domain *d)
     return true;
 }
 
-
 int get_page_and_type_from_mfn(mfn_t mfn, unsigned long type, struct domain *d,
                                int partial, bool preemptible)
 {
@@ -1165,172 +1149,6 @@ int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
     return 1;
 }
 
-static int alloc_l1_table(struct page_info *page)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long  pfn = page_to_mfn(page);
-    l1_pgentry_t  *pl1e;
-    unsigned int   i;
-    int            ret = 0;
-
-    pl1e = map_domain_page(_mfn(pfn));
-
-    for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-    {
-        switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
-        {
-        default:
-            goto fail;
-        case 0:
-            break;
-        case _PAGE_RW ... _PAGE_RW | PAGE_CACHE_ATTRS:
-            ASSERT(!(ret & ~(_PAGE_RW | PAGE_CACHE_ATTRS)));
-            l1e_flip_flags(pl1e[i], ret);
-            break;
-        }
-
-        adjust_guest_l1e(pl1e[i], d);
-    }
-
-    unmap_domain_page(pl1e);
-    return 0;
-
- fail:
-    gdprintk(XENLOG_WARNING, "Failure in alloc_l1_table: slot %#x\n", i);
-    while ( i-- > 0 )
-        put_page_from_l1e(pl1e[i], d);
-
-    unmap_domain_page(pl1e);
-    return ret;
-}
-
-static int alloc_l2_table(struct page_info *page, unsigned long type,
-                          int preemptible)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long  pfn = page_to_mfn(page);
-    l2_pgentry_t  *pl2e;
-    unsigned int   i;
-    int            rc = 0;
-
-    pl2e = map_domain_page(_mfn(pfn));
-
-    for ( i = page->nr_validated_ptes; i < L2_PAGETABLE_ENTRIES; i++ )
-    {
-        if ( preemptible && i > page->nr_validated_ptes
-             && hypercall_preempt_check() )
-        {
-            page->nr_validated_ptes = i;
-            rc = -ERESTART;
-            break;
-        }
-
-        if ( !is_guest_l2_slot(d, type, i) ||
-             (rc = get_page_from_l2e(pl2e[i], pfn, d)) > 0 )
-            continue;
-
-        if ( rc < 0 )
-        {
-            gdprintk(XENLOG_WARNING, "Failure in alloc_l2_table: slot %#x\n", i);
-            while ( i-- > 0 )
-                if ( is_guest_l2_slot(d, type, i) )
-                    put_page_from_l2e(pl2e[i], pfn);
-            break;
-        }
-
-        adjust_guest_l2e(pl2e[i], d);
-    }
-
-    if ( rc >= 0 && (type & PGT_pae_xen_l2) )
-    {
-        /* Xen private mappings. */
-        memcpy(&pl2e[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
-               &compat_idle_pg_table_l2[
-                   l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
-               COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*pl2e));
-    }
-
-    unmap_domain_page(pl2e);
-    return rc > 0 ? 0 : rc;
-}
-
-static int alloc_l3_table(struct page_info *page)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long  pfn = page_to_mfn(page);
-    l3_pgentry_t  *pl3e;
-    unsigned int   i;
-    int            rc = 0, partial = page->partial_pte;
-
-    pl3e = map_domain_page(_mfn(pfn));
-
-    /*
-     * PAE guests allocate full pages, but aren't required to initialize
-     * more than the first four entries; when running in compatibility
-     * mode, however, the full page is visible to the MMU, and hence all
-     * 512 entries must be valid/verified, which is most easily achieved
-     * by clearing them out.
-     */
-    if ( is_pv_32bit_domain(d) )
-        memset(pl3e + 4, 0, (L3_PAGETABLE_ENTRIES - 4) * sizeof(*pl3e));
-
-    for ( i = page->nr_validated_ptes; i < L3_PAGETABLE_ENTRIES;
-          i++, partial = 0 )
-    {
-        if ( is_pv_32bit_domain(d) && (i == 3) )
-        {
-            if ( !(l3e_get_flags(pl3e[i]) & _PAGE_PRESENT) ||
-                 (l3e_get_flags(pl3e[i]) & l3_disallow_mask(d)) )
-                rc = -EINVAL;
-            else
-                rc = get_page_and_type_from_mfn(
-                    _mfn(l3e_get_pfn(pl3e[i])),
-                    PGT_l2_page_table | PGT_pae_xen_l2, d, partial, true);
-        }
-        else if ( !is_guest_l3_slot(i) ||
-                  (rc = get_page_from_l3e(pl3e[i], pfn, d, partial)) > 0 )
-            continue;
-
-        if ( rc == -ERESTART )
-        {
-            page->nr_validated_ptes = i;
-            page->partial_pte = partial ?: 1;
-        }
-        else if ( rc == -EINTR && i )
-        {
-            page->nr_validated_ptes = i;
-            page->partial_pte = 0;
-            rc = -ERESTART;
-        }
-        if ( rc < 0 )
-            break;
-
-        adjust_guest_l3e(pl3e[i], d);
-    }
-
-    if ( rc >= 0 && !pv_create_pae_xen_mappings(d, pl3e) )
-        rc = -EINVAL;
-    if ( rc < 0 && rc != -ERESTART && rc != -EINTR )
-    {
-        gdprintk(XENLOG_WARNING, "Failure in alloc_l3_table: slot %#x\n", i);
-        if ( i )
-        {
-            page->nr_validated_ptes = i;
-            page->partial_pte = 0;
-            current->arch.old_guest_table = page;
-        }
-        while ( i-- > 0 )
-        {
-            if ( !is_guest_l3_slot(i) )
-                continue;
-            unadjust_guest_l3e(pl3e[i], d);
-        }
-    }
-
-    unmap_domain_page(pl3e);
-    return rc > 0 ? 0 : rc;
-}
-
 bool fill_ro_mpt(unsigned long mfn)
 {
     l4_pgentry_t *l4tab = map_domain_page(_mfn(mfn));
@@ -1355,188 +1173,6 @@ void zap_ro_mpt(unsigned long mfn)
     unmap_domain_page(l4tab);
 }
 
-static int alloc_l4_table(struct page_info *page)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long  pfn = page_to_mfn(page);
-    l4_pgentry_t  *pl4e = map_domain_page(_mfn(pfn));
-    unsigned int   i;
-    int            rc = 0, partial = page->partial_pte;
-
-    for ( i = page->nr_validated_ptes; i < L4_PAGETABLE_ENTRIES;
-          i++, partial = 0 )
-    {
-        if ( !is_guest_l4_slot(d, i) ||
-             (rc = get_page_from_l4e(pl4e[i], pfn, d, partial)) > 0 )
-            continue;
-
-        if ( rc == -ERESTART )
-        {
-            page->nr_validated_ptes = i;
-            page->partial_pte = partial ?: 1;
-        }
-        else if ( rc < 0 )
-        {
-            if ( rc != -EINTR )
-                gdprintk(XENLOG_WARNING,
-                         "Failure in alloc_l4_table: slot %#x\n", i);
-            if ( i )
-            {
-                page->nr_validated_ptes = i;
-                page->partial_pte = 0;
-                if ( rc == -EINTR )
-                    rc = -ERESTART;
-                else
-                {
-                    if ( current->arch.old_guest_table )
-                        page->nr_validated_ptes++;
-                    current->arch.old_guest_table = page;
-                }
-            }
-        }
-        if ( rc < 0 )
-        {
-            unmap_domain_page(pl4e);
-            return rc;
-        }
-
-        adjust_guest_l4e(pl4e[i], d);
-    }
-
-    if ( rc >= 0 )
-    {
-        pv_init_guest_l4_table(pl4e, d, !VM_ASSIST(d, m2p_strict));
-        atomic_inc(&d->arch.pv_domain.nr_l4_pages);
-        rc = 0;
-    }
-    unmap_domain_page(pl4e);
-
-    return rc;
-}
-
-static void free_l1_table(struct page_info *page)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long pfn = page_to_mfn(page);
-    l1_pgentry_t *pl1e;
-    unsigned int  i;
-
-    pl1e = map_domain_page(_mfn(pfn));
-
-    for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
-        put_page_from_l1e(pl1e[i], d);
-
-    unmap_domain_page(pl1e);
-}
-
-
-static int free_l2_table(struct page_info *page, int preemptible)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long pfn = page_to_mfn(page);
-    l2_pgentry_t *pl2e;
-    unsigned int  i = page->nr_validated_ptes - 1;
-    int err = 0;
-
-    pl2e = map_domain_page(_mfn(pfn));
-
-    ASSERT(page->nr_validated_ptes);
-    do {
-        if ( is_guest_l2_slot(d, page->u.inuse.type_info, i) &&
-             put_page_from_l2e(pl2e[i], pfn) == 0 &&
-             preemptible && i && hypercall_preempt_check() )
-        {
-           page->nr_validated_ptes = i;
-           err = -ERESTART;
-        }
-    } while ( !err && i-- );
-
-    unmap_domain_page(pl2e);
-
-    if ( !err )
-        page->u.inuse.type_info &= ~PGT_pae_xen_l2;
-
-    return err;
-}
-
-static int free_l3_table(struct page_info *page)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long pfn = page_to_mfn(page);
-    l3_pgentry_t *pl3e;
-    int rc = 0, partial = page->partial_pte;
-    unsigned int  i = page->nr_validated_ptes - !partial;
-
-    pl3e = map_domain_page(_mfn(pfn));
-
-    do {
-        if ( is_guest_l3_slot(i) )
-        {
-            rc = put_page_from_l3e(pl3e[i], pfn, partial, 0);
-            if ( rc < 0 )
-                break;
-            partial = 0;
-            if ( rc > 0 )
-                continue;
-            unadjust_guest_l3e(pl3e[i], d);
-        }
-    } while ( i-- );
-
-    unmap_domain_page(pl3e);
-
-    if ( rc == -ERESTART )
-    {
-        page->nr_validated_ptes = i;
-        page->partial_pte = partial ?: -1;
-    }
-    else if ( rc == -EINTR && i < L3_PAGETABLE_ENTRIES - 1 )
-    {
-        page->nr_validated_ptes = i + 1;
-        page->partial_pte = 0;
-        rc = -ERESTART;
-    }
-    return rc > 0 ? 0 : rc;
-}
-
-static int free_l4_table(struct page_info *page)
-{
-    struct domain *d = page_get_owner(page);
-    unsigned long pfn = page_to_mfn(page);
-    l4_pgentry_t *pl4e = map_domain_page(_mfn(pfn));
-    int rc = 0, partial = page->partial_pte;
-    unsigned int  i = page->nr_validated_ptes - !partial;
-
-    do {
-        if ( is_guest_l4_slot(d, i) )
-            rc = put_page_from_l4e(pl4e[i], pfn, partial, 0);
-        if ( rc < 0 )
-            break;
-        partial = 0;
-    } while ( i-- );
-
-    if ( rc == -ERESTART )
-    {
-        page->nr_validated_ptes = i;
-        page->partial_pte = partial ?: -1;
-    }
-    else if ( rc == -EINTR && i < L4_PAGETABLE_ENTRIES - 1 )
-    {
-        page->nr_validated_ptes = i + 1;
-        page->partial_pte = 0;
-        rc = -ERESTART;
-    }
-
-    unmap_domain_page(pl4e);
-
-    if ( rc >= 0 )
-    {
-        atomic_dec(&d->arch.pv_domain.nr_l4_pages);
-        rc = 0;
-    }
-
-    return rc;
-}
-
 int page_lock(struct page_info *page)
 {
     unsigned long x, nx;
@@ -1951,134 +1587,6 @@ void get_page_light(struct page_info *page)
     while ( unlikely(y != x) );
 }
 
-static int pv_alloc_page_type(struct page_info *page, unsigned long type,
-                              bool preemptible)
-{
-    struct domain *owner = page_get_owner(page);
-    int rc;
-
-    /* A page table is dirtied when its type count becomes non-zero. */
-    if ( likely(owner != NULL) )
-        paging_mark_dirty(owner, _mfn(page_to_mfn(page)));
-
-    switch ( type & PGT_type_mask )
-    {
-    case PGT_l1_page_table:
-        rc = alloc_l1_table(page);
-        break;
-    case PGT_l2_page_table:
-        rc = alloc_l2_table(page, type, preemptible);
-        break;
-    case PGT_l3_page_table:
-        ASSERT(preemptible);
-        rc = alloc_l3_table(page);
-        break;
-    case PGT_l4_page_table:
-        ASSERT(preemptible);
-        rc = alloc_l4_table(page);
-        break;
-    case PGT_seg_desc_page:
-        rc = alloc_segdesc_page(page);
-        break;
-    default:
-        printk("Bad type in alloc_page_type %lx t=%" PRtype_info " c=%lx\n",
-               type, page->u.inuse.type_info,
-               page->count_info);
-        rc = -EINVAL;
-        BUG();
-    }
-
-    /* No need for atomic update of type_info here: noone else updates it. */
-    smp_wmb();
-    switch ( rc )
-    {
-    case 0:
-        page->u.inuse.type_info |= PGT_validated;
-        break;
-    case -EINTR:
-        ASSERT((page->u.inuse.type_info &
-                (PGT_count_mask|PGT_validated|PGT_partial)) == 1);
-        page->u.inuse.type_info &= ~PGT_count_mask;
-        break;
-    default:
-        ASSERT(rc < 0);
-        gdprintk(XENLOG_WARNING, "Error while validating mfn %" PRI_mfn
-                 " (pfn %" PRI_pfn ") for type %" PRtype_info
-                 ": caf=%08lx taf=%" PRtype_info "\n",
-                 page_to_mfn(page), get_gpfn_from_mfn(page_to_mfn(page)),
-                 type, page->count_info, page->u.inuse.type_info);
-        if ( page != current->arch.old_guest_table )
-            page->u.inuse.type_info = 0;
-        else
-        {
-            ASSERT((page->u.inuse.type_info &
-                    (PGT_count_mask | PGT_validated)) == 1);
-    case -ERESTART:
-            get_page_light(page);
-            page->u.inuse.type_info |= PGT_partial;
-        }
-        break;
-    }
-
-    return rc;
-}
-
-
-int pv_free_page_type(struct page_info *page, unsigned long type,
-                      bool preemptible)
-{
-    struct domain *owner = page_get_owner(page);
-    unsigned long gmfn;
-    int rc;
-
-    if ( likely(owner != NULL) && unlikely(paging_mode_enabled(owner)) )
-    {
-        /* A page table is dirtied when its type count becomes zero. */
-        paging_mark_dirty(owner, _mfn(page_to_mfn(page)));
-
-        ASSERT(!shadow_mode_refcounts(owner));
-
-        gmfn = mfn_to_gmfn(owner, page_to_mfn(page));
-        ASSERT(VALID_M2P(gmfn));
-        /* Page sharing not supported for shadowed domains */
-        if(!SHARED_M2P(gmfn))
-            shadow_remove_all_shadows(owner, _mfn(gmfn));
-    }
-
-    if ( !(type & PGT_partial) )
-    {
-        page->nr_validated_ptes = 1U << PAGETABLE_ORDER;
-        page->partial_pte = 0;
-    }
-
-    switch ( type & PGT_type_mask )
-    {
-    case PGT_l1_page_table:
-        free_l1_table(page);
-        rc = 0;
-        break;
-    case PGT_l2_page_table:
-        rc = free_l2_table(page, preemptible);
-        break;
-    case PGT_l3_page_table:
-        ASSERT(preemptible);
-        rc = free_l3_table(page);
-        break;
-    case PGT_l4_page_table:
-        ASSERT(preemptible);
-        rc = free_l4_table(page);
-        break;
-    default:
-        gdprintk(XENLOG_WARNING, "type %" PRtype_info " mfn %" PRI_mfn "\n",
-                 type, page_to_mfn(page));
-        rc = -EINVAL;
-        BUG();
-    }
-
-    return rc;
-}
-
-
 static int __put_final_page_type(
     struct page_info *page, unsigned long type, int preemptible)
 {
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 46e1fcf4e5..f0393b9e3c 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -20,10 +20,13 @@
  * along with this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <xen/event.h>
 #include <xen/guest_access.h>
 
+#include <asm/mm.h>
 #include <asm/pv/mm.h>
 #include <asm/setup.h>
+#include <asm/shadow.h>
 
 /*
  * PTE updates can be done with ordinary writes except:
@@ -251,6 +254,494 @@ bool pv_create_pae_xen_mappings(struct domain *d, l3_pgentry_t *pl3e)
     return true;
 }
 
+static int alloc_l1_table(struct page_info *page)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long  pfn = page_to_mfn(page);
+    l1_pgentry_t  *pl1e;
+    unsigned int   i;
+    int            ret = 0;
+
+    pl1e = map_domain_page(_mfn(pfn));
+
+    for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
+    {
+        switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
+        {
+        default:
+            goto fail;
+        case 0:
+            break;
+        case _PAGE_RW ... _PAGE_RW | PAGE_CACHE_ATTRS:
+            ASSERT(!(ret & ~(_PAGE_RW | PAGE_CACHE_ATTRS)));
+            l1e_flip_flags(pl1e[i], ret);
+            break;
+        }
+
+        adjust_guest_l1e(pl1e[i], d);
+    }
+
+    unmap_domain_page(pl1e);
+    return 0;
+
+ fail:
+    gdprintk(XENLOG_WARNING, "Failure in alloc_l1_table: slot %#x\n", i);
+    while ( i-- > 0 )
+        put_page_from_l1e(pl1e[i], d);
+
+    unmap_domain_page(pl1e);
+    return ret;
+}
+
+static int alloc_l2_table(struct page_info *page, unsigned long type,
+                          bool preemptible)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long  pfn = page_to_mfn(page);
+    l2_pgentry_t  *pl2e;
+    unsigned int   i;
+    int            rc = 0;
+
+    pl2e = map_domain_page(_mfn(pfn));
+
+    for ( i = page->nr_validated_ptes; i < L2_PAGETABLE_ENTRIES; i++ )
+    {
+        if ( preemptible && i > page->nr_validated_ptes
+             && hypercall_preempt_check() )
+        {
+            page->nr_validated_ptes = i;
+            rc = -ERESTART;
+            break;
+        }
+
+        if ( !is_guest_l2_slot(d, type, i) ||
+             (rc = get_page_from_l2e(pl2e[i], pfn, d)) > 0 )
+            continue;
+
+        if ( rc < 0 )
+        {
+            gdprintk(XENLOG_WARNING, "Failure in alloc_l2_table: slot %#x\n", i);
+            while ( i-- > 0 )
+                if ( is_guest_l2_slot(d, type, i) )
+                    put_page_from_l2e(pl2e[i], pfn);
+            break;
+        }
+
+        adjust_guest_l2e(pl2e[i], d);
+    }
+
+    if ( rc >= 0 && (type & PGT_pae_xen_l2) )
+    {
+        /* Xen private mappings. */
+        memcpy(&pl2e[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
+               &compat_idle_pg_table_l2[
+                   l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
+               COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*pl2e));
+    }
+
+    unmap_domain_page(pl2e);
+    return rc > 0 ? 0 : rc;
+}
+
+static int alloc_l3_table(struct page_info *page)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long  pfn = page_to_mfn(page);
+    l3_pgentry_t  *pl3e;
+    unsigned int   i;
+    int            rc = 0, partial = page->partial_pte;
+
+    pl3e = map_domain_page(_mfn(pfn));
+
+    /*
+     * PAE guests allocate full pages, but aren't required to initialize
+     * more than the first four entries; when running in compatibility
+     * mode, however, the full page is visible to the MMU, and hence all
+     * 512 entries must be valid/verified, which is most easily achieved
+     * by clearing them out.
+     */
+    if ( is_pv_32bit_domain(d) )
+        memset(pl3e + 4, 0, (L3_PAGETABLE_ENTRIES - 4) * sizeof(*pl3e));
+
+    for ( i = page->nr_validated_ptes; i < L3_PAGETABLE_ENTRIES;
+          i++, partial = 0 )
+    {
+        if ( is_pv_32bit_domain(d) && (i == 3) )
+        {
+            if ( !(l3e_get_flags(pl3e[i]) & _PAGE_PRESENT) ||
+                 (l3e_get_flags(pl3e[i]) & l3_disallow_mask(d)) )
+                rc = -EINVAL;
+            else
+                rc = get_page_and_type_from_mfn(
+                    _mfn(l3e_get_pfn(pl3e[i])),
+                    PGT_l2_page_table | PGT_pae_xen_l2, d, partial, true);
+        }
+        else if ( !is_guest_l3_slot(i) ||
+                  (rc = get_page_from_l3e(pl3e[i], pfn, d, partial)) > 0 )
+            continue;
+
+        if ( rc == -ERESTART )
+        {
+            page->nr_validated_ptes = i;
+            page->partial_pte = partial ?: 1;
+        }
+        else if ( rc == -EINTR && i )
+        {
+            page->nr_validated_ptes = i;
+            page->partial_pte = 0;
+            rc = -ERESTART;
+        }
+        if ( rc < 0 )
+            break;
+
+        adjust_guest_l3e(pl3e[i], d);
+    }
+
+    if ( rc >= 0 && !pv_create_pae_xen_mappings(d, pl3e) )
+        rc = -EINVAL;
+    if ( rc < 0 && rc != -ERESTART && rc != -EINTR )
+    {
+        gdprintk(XENLOG_WARNING, "Failure in alloc_l3_table: slot %#x\n", i);
+        if ( i )
+        {
+            page->nr_validated_ptes = i;
+            page->partial_pte = 0;
+            current->arch.old_guest_table = page;
+        }
+        while ( i-- > 0 )
+        {
+            if ( !is_guest_l3_slot(i) )
+                continue;
+            unadjust_guest_l3e(pl3e[i], d);
+        }
+    }
+
+    unmap_domain_page(pl3e);
+    return rc > 0 ? 0 : rc;
+}
+
+static int alloc_l4_table(struct page_info *page)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long  pfn = page_to_mfn(page);
+    l4_pgentry_t  *pl4e = map_domain_page(_mfn(pfn));
+    unsigned int   i;
+    int            rc = 0, partial = page->partial_pte;
+
+    for ( i = page->nr_validated_ptes; i < L4_PAGETABLE_ENTRIES;
+          i++, partial = 0 )
+    {
+        if ( !is_guest_l4_slot(d, i) ||
+             (rc = get_page_from_l4e(pl4e[i], pfn, d, partial)) > 0 )
+            continue;
+
+        if ( rc == -ERESTART )
+        {
+            page->nr_validated_ptes = i;
+            page->partial_pte = partial ?: 1;
+        }
+        else if ( rc < 0 )
+        {
+            if ( rc != -EINTR )
+                gdprintk(XENLOG_WARNING,
+                         "Failure in alloc_l4_table: slot %#x\n", i);
+            if ( i )
+            {
+                page->nr_validated_ptes = i;
+                page->partial_pte = 0;
+                if ( rc == -EINTR )
+                    rc = -ERESTART;
+                else
+                {
+                    if ( current->arch.old_guest_table )
+                        page->nr_validated_ptes++;
+                    current->arch.old_guest_table = page;
+                }
+            }
+        }
+        if ( rc < 0 )
+        {
+            unmap_domain_page(pl4e);
+            return rc;
+        }
+
+        adjust_guest_l4e(pl4e[i], d);
+    }
+
+    if ( rc >= 0 )
+    {
+        pv_init_guest_l4_table(pl4e, d, !VM_ASSIST(d, m2p_strict));
+        atomic_inc(&d->arch.pv_domain.nr_l4_pages);
+        rc = 0;
+    }
+    unmap_domain_page(pl4e);
+
+    return rc;
+}
+
+static void free_l1_table(struct page_info *page)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long pfn = page_to_mfn(page);
+    l1_pgentry_t *pl1e;
+    unsigned int  i;
+
+    pl1e = map_domain_page(_mfn(pfn));
+
+    for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
+        put_page_from_l1e(pl1e[i], d);
+
+    unmap_domain_page(pl1e);
+}
+
+static int free_l2_table(struct page_info *page, int preemptible)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long pfn = page_to_mfn(page);
+    l2_pgentry_t *pl2e;
+    unsigned int  i = page->nr_validated_ptes - 1;
+    int err = 0;
+
+    pl2e = map_domain_page(_mfn(pfn));
+
+    ASSERT(page->nr_validated_ptes);
+    do {
+        if ( is_guest_l2_slot(d, page->u.inuse.type_info, i) &&
+             put_page_from_l2e(pl2e[i], pfn) == 0 &&
+             preemptible && i && hypercall_preempt_check() )
+        {
+           page->nr_validated_ptes = i;
+           err = -ERESTART;
+        }
+    } while ( !err && i-- );
+
+    unmap_domain_page(pl2e);
+
+    if ( !err )
+        page->u.inuse.type_info &= ~PGT_pae_xen_l2;
+
+    return err;
+}
+
+static int free_l3_table(struct page_info *page)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long pfn = page_to_mfn(page);
+    l3_pgentry_t *pl3e;
+    int rc = 0, partial = page->partial_pte;
+    unsigned int  i = page->nr_validated_ptes - !partial;
+
+    pl3e = map_domain_page(_mfn(pfn));
+
+    do {
+        if ( is_guest_l3_slot(i) )
+        {
+            rc = put_page_from_l3e(pl3e[i], pfn, partial, 0);
+            if ( rc < 0 )
+                break;
+            partial = 0;
+            if ( rc > 0 )
+                continue;
+            unadjust_guest_l3e(pl3e[i], d);
+        }
+    } while ( i-- );
+
+    unmap_domain_page(pl3e);
+
+    if ( rc == -ERESTART )
+    {
+        page->nr_validated_ptes = i;
+        page->partial_pte = partial ?: -1;
+    }
+    else if ( rc == -EINTR && i < L3_PAGETABLE_ENTRIES - 1 )
+    {
+        page->nr_validated_ptes = i + 1;
+        page->partial_pte = 0;
+        rc = -ERESTART;
+    }
+    return rc > 0 ? 0 : rc;
+}
+
+static int free_l4_table(struct page_info *page)
+{
+    struct domain *d = page_get_owner(page);
+    unsigned long pfn = page_to_mfn(page);
+    l4_pgentry_t *pl4e = map_domain_page(_mfn(pfn));
+    int rc = 0, partial = page->partial_pte;
+    unsigned int  i = page->nr_validated_ptes - !partial;
+
+    do {
+        if ( is_guest_l4_slot(d, i) )
+            rc = put_page_from_l4e(pl4e[i], pfn, partial, 0);
+        if ( rc < 0 )
+            break;
+        partial = 0;
+    } while ( i-- );
+
+    if ( rc == -ERESTART )
+    {
+        page->nr_validated_ptes = i;
+        page->partial_pte = partial ?: -1;
+    }
+    else if ( rc == -EINTR && i < L4_PAGETABLE_ENTRIES - 1 )
+    {
+        page->nr_validated_ptes = i + 1;
+        page->partial_pte = 0;
+        rc = -ERESTART;
+    }
+
+    unmap_domain_page(pl4e);
+
+    if ( rc >= 0 )
+    {
+        atomic_dec(&d->arch.pv_domain.nr_l4_pages);
+        rc = 0;
+    }
+
+    return rc;
+}
+
+static int alloc_segdesc_page(struct page_info *page)
+{
+    const struct domain *owner = page_get_owner(page);
+    struct desc_struct *descs = __map_domain_page(page);
+    unsigned i;
+
+    for ( i = 0; i < 512; i++ )
+        if ( unlikely(!check_descriptor(owner, &descs[i])) )
+            break;
+
+    unmap_domain_page(descs);
+
+    return i == 512 ? 0 : -EINVAL;
+}
+
+int pv_alloc_page_type(struct page_info *page, unsigned long type,
+                       bool preemptible)
+{
+    struct domain *owner = page_get_owner(page);
+    int rc;
+
+    /* A page table is dirtied when its type count becomes non-zero. */
+    if ( likely(owner != NULL) )
+        paging_mark_dirty(owner, _mfn(page_to_mfn(page)));
+
+    switch ( type & PGT_type_mask )
+    {
+    case PGT_l1_page_table:
+        rc = alloc_l1_table(page);
+        break;
+    case PGT_l2_page_table:
+        rc = alloc_l2_table(page, type, preemptible);
+        break;
+    case PGT_l3_page_table:
+        ASSERT(preemptible);
+        rc = alloc_l3_table(page);
+        break;
+    case PGT_l4_page_table:
+        ASSERT(preemptible);
+        rc = alloc_l4_table(page);
+        break;
+    case PGT_seg_desc_page:
+        rc = alloc_segdesc_page(page);
+        break;
+    default:
+        printk("Bad type in alloc_page_type %lx t=%" PRtype_info " c=%lx\n",
+               type, page->u.inuse.type_info,
+               page->count_info);
+        rc = -EINVAL;
+        BUG();
+    }
+
+    /* No need for atomic update of type_info here: noone else updates it. */
+    smp_wmb();
+    switch ( rc )
+    {
+    case 0:
+        page->u.inuse.type_info |= PGT_validated;
+        break;
+    case -EINTR:
+        ASSERT((page->u.inuse.type_info &
+                (PGT_count_mask|PGT_validated|PGT_partial)) == 1);
+        page->u.inuse.type_info &= ~PGT_count_mask;
+        break;
+    default:
+        ASSERT(rc < 0);
+        gdprintk(XENLOG_WARNING, "Error while validating mfn %" PRI_mfn
+                 " (pfn %" PRI_pfn ") for type %" PRtype_info
+                 ": caf=%08lx taf=%" PRtype_info "\n",
+                 page_to_mfn(page), get_gpfn_from_mfn(page_to_mfn(page)),
+                 type, page->count_info, page->u.inuse.type_info);
+        if ( page != current->arch.old_guest_table )
+            page->u.inuse.type_info = 0;
+        else
+        {
+            ASSERT((page->u.inuse.type_info &
+                    (PGT_count_mask | PGT_validated)) == 1);
+    case -ERESTART:
+            get_page_light(page);
+            page->u.inuse.type_info |= PGT_partial;
+        }
+        break;
+    }
+
+    return rc;
+}
+
+int pv_free_page_type(struct page_info *page, unsigned long type,
+                      bool preemptible)
+{
+    struct domain *owner = page_get_owner(page);
+    unsigned long gmfn;
+    int rc;
+
+    if ( likely(owner != NULL) && unlikely(paging_mode_enabled(owner)) )
+    {
+        /* A page table is dirtied when its type count becomes zero. */
+        paging_mark_dirty(owner, _mfn(page_to_mfn(page)));
+
+        ASSERT(!shadow_mode_refcounts(owner));
+
+        gmfn = mfn_to_gmfn(owner, page_to_mfn(page));
+        ASSERT(VALID_M2P(gmfn));
+        /* Page sharing not supported for shadowed domains */
+        if(!SHARED_M2P(gmfn))
+            shadow_remove_all_shadows(owner, _mfn(gmfn));
+    }
+
+    if ( !(type & PGT_partial) )
+    {
+        page->nr_validated_ptes = 1U << PAGETABLE_ORDER;
+        page->partial_pte = 0;
+    }
+
+    switch ( type & PGT_type_mask )
+    {
+    case PGT_l1_page_table:
+        free_l1_table(page);
+        rc = 0;
+        break;
+    case PGT_l2_page_table:
+        rc = free_l2_table(page, preemptible);
+        break;
+    case PGT_l3_page_table:
+        ASSERT(preemptible);
+        rc = free_l3_table(page);
+        break;
+    case PGT_l4_page_table:
+        ASSERT(preemptible);
+        rc = free_l4_table(page);
+        break;
+    default:
+        gdprintk(XENLOG_WARNING, "type %" PRtype_info " mfn %" PRI_mfn "\n",
+                 type, page_to_mfn(page));
+        rc = -EINVAL;
+        BUG();
+    }
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 6857651db1..7480341240 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -302,9 +302,6 @@ static inline void *__page_to_virt(const struct page_info *pg)
                     (PAGE_SIZE / (sizeof(*pg) & -sizeof(*pg))));
 }
 
-int pv_free_page_type(struct page_info *page, unsigned long type,
-                      bool preemptible);
-
 bool_t fill_ro_mpt(unsigned long mfn);
 void zap_ro_mpt(unsigned long mfn);
 
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 648b26d7d0..7de5ea2f12 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -95,10 +95,17 @@ void pv_arch_init_memory(void);
 
 int pv_new_guest_cr3(unsigned long pfn);
 
+int pv_alloc_page_type(struct page_info *page, unsigned long type,
+                       bool preemptible);
+int pv_free_page_type(struct page_info *page, unsigned long type,
+                      bool preemptible);
+
 #else
 
 #include <xen/errno.h>
 
+#include <asm/bug.h>
+
 static inline void pv_get_guest_eff_l1e(unsigned long addr,
                                         l1_pgentry_t *eff_l1e)
 {}
@@ -125,6 +132,13 @@ static inline void pv_arch_init_memory(void) {}
 
 static inline int pv_new_guest_cr3(unsigned long pfn) { return -EINVAL; }
 
+static inline int pv_alloc_page_type(struct page_info *page, unsigned long type,
+                                     bool preemptible)
+{ BUG(); return -EINVAL; }
+static inline int pv_free_page_type(struct page_info *page, unsigned long type,
+                                    bool preemptible)
+{ BUG(); return -EINVAL; }
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0




* [PATCH v4 27/31] x86/mm: move and add pv_ prefix to invalidate_shadow_ldt
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (25 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 26/31] x86/mm: move pv_{alloc,free}_page_type Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 28/31] x86/mm: move PV hypercalls to pv/mm-hypercalls.c Wei Liu
                   ` (3 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/mm.c and export it via pv/mm.h. Use bool for the
flush parameter.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c           | 44 ++++----------------------------------------
 xen/arch/x86/pv/mm.c        | 35 +++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/pv/mm.h |  4 ++++
 3 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 204c20d6fd..2de2ec7f1e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -474,42 +474,6 @@ static inline void page_set_tlbflush_timestamp(struct page_info *page)
 const char __section(".bss.page_aligned.const") __aligned(PAGE_SIZE)
     zero_page[PAGE_SIZE];
 
-static void invalidate_shadow_ldt(struct vcpu *v, int flush)
-{
-    l1_pgentry_t *pl1e;
-    unsigned int i;
-    struct page_info *page;
-
-    BUG_ON(unlikely(in_irq()));
-
-    spin_lock(&v->arch.pv_vcpu.shadow_ldt_lock);
-
-    if ( v->arch.pv_vcpu.shadow_ldt_mapcnt == 0 )
-        goto out;
-
-    v->arch.pv_vcpu.shadow_ldt_mapcnt = 0;
-    pl1e = gdt_ldt_ptes(v->domain, v);
-
-    for ( i = 16; i < 32; i++ )
-    {
-        if ( !(l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) )
-            continue;
-        page = l1e_get_page(pl1e[i]);
-        l1e_write(&pl1e[i], l1e_empty());
-        ASSERT_PAGE_IS_TYPE(page, PGT_seg_desc_page);
-        ASSERT_PAGE_IS_DOMAIN(page, v->domain);
-        put_page_and_type(page);
-    }
-
-    /* Rid TLBs of stale mappings (guest mappings and shadow mappings). */
-    if ( flush )
-        flush_tlb_mask(v->vcpu_dirty_cpumask);
-
- out:
-    spin_unlock(&v->arch.pv_vcpu.shadow_ldt_lock);
-}
-
-
 bool get_page_from_mfn(mfn_t mfn, struct domain *d)
 {
     struct page_info *page = mfn_to_page(mfn_x(mfn));
@@ -1055,7 +1019,7 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
              (l1e_owner == pg_owner) )
         {
             for_each_vcpu ( pg_owner, v )
-                invalidate_shadow_ldt(v, 1);
+                pv_invalidate_shadow_ldt(v, true);
         }
         put_page(page);
     }
@@ -1954,7 +1918,7 @@ int pv_new_guest_cr3(unsigned long mfn)
             return rc;
         }
 
-        invalidate_shadow_ldt(curr, 0);
+        pv_invalidate_shadow_ldt(curr, false);
         write_ptbase(curr);
 
         return 0;
@@ -1992,7 +1956,7 @@ int pv_new_guest_cr3(unsigned long mfn)
         return rc;
     }
 
-    invalidate_shadow_ldt(curr, 0);
+    pv_invalidate_shadow_ldt(curr, false);
 
     if ( !VM_ASSIST(currd, m2p_strict) && !paging_mode_refcounts(currd) )
         fill_ro_mpt(mfn);
@@ -2492,7 +2456,7 @@ long do_mmuext_op(
             else if ( (curr->arch.pv_vcpu.ldt_ents != ents) ||
                       (curr->arch.pv_vcpu.ldt_base != ptr) )
             {
-                invalidate_shadow_ldt(curr, 0);
+                pv_invalidate_shadow_ldt(curr, false);
                 flush_tlb_local();
                 curr->arch.pv_vcpu.ldt_base = ptr;
                 curr->arch.pv_vcpu.ldt_ents = ents;
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index f0393b9e3c..19b2ae588e 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -742,6 +742,41 @@ int pv_free_page_type(struct page_info *page, unsigned long type,
     return rc;
 }
 
+void pv_invalidate_shadow_ldt(struct vcpu *v, bool flush)
+{
+    l1_pgentry_t *pl1e;
+    unsigned int i;
+    struct page_info *page;
+
+    BUG_ON(unlikely(in_irq()));
+
+    spin_lock(&v->arch.pv_vcpu.shadow_ldt_lock);
+
+    if ( v->arch.pv_vcpu.shadow_ldt_mapcnt == 0 )
+        goto out;
+
+    v->arch.pv_vcpu.shadow_ldt_mapcnt = 0;
+    pl1e = gdt_ldt_ptes(v->domain, v);
+
+    for ( i = 16; i < 32; i++ )
+    {
+        if ( !(l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) )
+            continue;
+        page = l1e_get_page(pl1e[i]);
+        l1e_write(&pl1e[i], l1e_empty());
+        ASSERT_PAGE_IS_TYPE(page, PGT_seg_desc_page);
+        ASSERT_PAGE_IS_DOMAIN(page, v->domain);
+        put_page_and_type(page);
+    }
+
+    /* Rid TLBs of stale mappings (guest mappings and shadow mappings). */
+    if ( flush )
+        flush_tlb_mask(v->vcpu_dirty_cpumask);
+
+ out:
+    spin_unlock(&v->arch.pv_vcpu.shadow_ldt_lock);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 7de5ea2f12..1e2871fb58 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -100,6 +100,8 @@ int pv_alloc_page_type(struct page_info *page, unsigned long type,
 int pv_free_page_type(struct page_info *page, unsigned long type,
                       bool preemptible);
 
+void pv_invalidate_shadow_ldt(struct vcpu *v, bool flush);
+
 #else
 
 #include <xen/errno.h>
@@ -139,6 +141,8 @@ static inline int pv_free_page_type(struct page_info *page, unsigned long type,
                                     bool preemptible)
 { BUG(); return -EINVAL; }
 
+static inline void pv_invalidate_shadow_ldt(struct vcpu *v, bool flush) {}
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0




* [PATCH v4 28/31] x86/mm: move PV hypercalls to pv/mm-hypercalls.c
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (26 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 27/31] x86/mm: move and add pv_ prefix to invalidate_shadow_ldt Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 29/31] x86/mm: remove the now unused inclusion of pv/mm.h Wei Liu
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Also move pv_new_guest_cr3 there so that we don't have to export
mod_l1_entry.

Fix coding style issues. Change v to curr and d to currd where
appropriate.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
I can't convince git diff to produce a sensible diff for donate_page
and steal_page. Those functions are unchanged.
---
 xen/arch/x86/mm.c               | 1569 ++-------------------------------------
 xen/arch/x86/pv/Makefile        |    1 +
 xen/arch/x86/pv/mm-hypercalls.c | 1463 ++++++++++++++++++++++++++++++++++++
 3 files changed, 1533 insertions(+), 1500 deletions(-)
 create mode 100644 xen/arch/x86/pv/mm-hypercalls.c

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 2de2ec7f1e..54256b6a11 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1164,290 +1164,6 @@ void page_unlock(struct page_info *page)
     } while ( (y = cmpxchg(&page->u.inuse.type_info, x, nx)) != x );
 }
 
-/*
- * PTE flags that a guest may change without re-validating the PTE.
- * All other bits affect translation, caching, or Xen's safety.
- */
-#define FASTPATH_FLAG_WHITELIST                                     \
-    (_PAGE_NX_BIT | _PAGE_AVAIL_HIGH | _PAGE_AVAIL | _PAGE_GLOBAL | \
-     _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_USER)
-
-/* Update the L1 entry at pl1e to new value nl1e. */
-static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
-                        unsigned long gl1mfn, int preserve_ad,
-                        struct vcpu *pt_vcpu, struct domain *pg_dom)
-{
-    l1_pgentry_t ol1e;
-    struct domain *pt_dom = pt_vcpu->domain;
-    int rc = 0;
-
-    if ( unlikely(__copy_from_user(&ol1e, pl1e, sizeof(ol1e)) != 0) )
-        return -EFAULT;
-
-    ASSERT(!paging_mode_refcounts(pt_dom));
-
-    if ( l1e_get_flags(nl1e) & _PAGE_PRESENT )
-    {
-        /* Translate foreign guest addresses. */
-        struct page_info *page = NULL;
-
-        if ( unlikely(l1e_get_flags(nl1e) & l1_disallow_mask(pt_dom)) )
-        {
-            gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n",
-                    l1e_get_flags(nl1e) & l1_disallow_mask(pt_dom));
-            return -EINVAL;
-        }
-
-        if ( paging_mode_translate(pg_dom) )
-        {
-            page = get_page_from_gfn(pg_dom, l1e_get_pfn(nl1e), NULL, P2M_ALLOC);
-            if ( !page )
-                return -EINVAL;
-            nl1e = l1e_from_pfn(page_to_mfn(page), l1e_get_flags(nl1e));
-        }
-
-        /* Fast path for sufficiently-similar mappings. */
-        if ( !l1e_has_changed(ol1e, nl1e, ~FASTPATH_FLAG_WHITELIST) )
-        {
-            adjust_guest_l1e(nl1e, pt_dom);
-            rc = UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu,
-                              preserve_ad);
-            if ( page )
-                put_page(page);
-            return rc ? 0 : -EBUSY;
-        }
-
-        switch ( rc = get_page_from_l1e(nl1e, pt_dom, pg_dom) )
-        {
-        default:
-            if ( page )
-                put_page(page);
-            return rc;
-        case 0:
-            break;
-        case _PAGE_RW ... _PAGE_RW | PAGE_CACHE_ATTRS:
-            ASSERT(!(rc & ~(_PAGE_RW | PAGE_CACHE_ATTRS)));
-            l1e_flip_flags(nl1e, rc);
-            rc = 0;
-            break;
-        }
-        if ( page )
-            put_page(page);
-
-        adjust_guest_l1e(nl1e, pt_dom);
-        if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu,
-                                    preserve_ad)) )
-        {
-            ol1e = nl1e;
-            rc = -EBUSY;
-        }
-    }
-    else if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu,
-                                     preserve_ad)) )
-    {
-        return -EBUSY;
-    }
-
-    put_page_from_l1e(ol1e, pt_dom);
-    return rc;
-}
-
-
-/* Update the L2 entry at pl2e to new value nl2e. pl2e is within frame pfn. */
-static int mod_l2_entry(l2_pgentry_t *pl2e,
-                        l2_pgentry_t nl2e,
-                        unsigned long pfn,
-                        int preserve_ad,
-                        struct vcpu *vcpu)
-{
-    l2_pgentry_t ol2e;
-    struct domain *d = vcpu->domain;
-    struct page_info *l2pg = mfn_to_page(pfn);
-    unsigned long type = l2pg->u.inuse.type_info;
-    int rc = 0;
-
-    if ( unlikely(!is_guest_l2_slot(d, type, pgentry_ptr_to_slot(pl2e))) )
-    {
-        gdprintk(XENLOG_WARNING, "L2 update in Xen-private area, slot %#lx\n",
-                 pgentry_ptr_to_slot(pl2e));
-        return -EPERM;
-    }
-
-    if ( unlikely(__copy_from_user(&ol2e, pl2e, sizeof(ol2e)) != 0) )
-        return -EFAULT;
-
-    if ( l2e_get_flags(nl2e) & _PAGE_PRESENT )
-    {
-        if ( unlikely(l2e_get_flags(nl2e) & L2_DISALLOW_MASK) )
-        {
-            gdprintk(XENLOG_WARNING, "Bad L2 flags %x\n",
-                    l2e_get_flags(nl2e) & L2_DISALLOW_MASK);
-            return -EINVAL;
-        }
-
-        /* Fast path for sufficiently-similar mappings. */
-        if ( !l2e_has_changed(ol2e, nl2e, ~FASTPATH_FLAG_WHITELIST) )
-        {
-            adjust_guest_l2e(nl2e, d);
-            if ( UPDATE_ENTRY(l2, pl2e, ol2e, nl2e, pfn, vcpu, preserve_ad) )
-                return 0;
-            return -EBUSY;
-        }
-
-        if ( unlikely((rc = get_page_from_l2e(nl2e, pfn, d)) < 0) )
-            return rc;
-
-        adjust_guest_l2e(nl2e, d);
-        if ( unlikely(!UPDATE_ENTRY(l2, pl2e, ol2e, nl2e, pfn, vcpu,
-                                    preserve_ad)) )
-        {
-            ol2e = nl2e;
-            rc = -EBUSY;
-        }
-    }
-    else if ( unlikely(!UPDATE_ENTRY(l2, pl2e, ol2e, nl2e, pfn, vcpu,
-                                     preserve_ad)) )
-    {
-        return -EBUSY;
-    }
-
-    put_page_from_l2e(ol2e, pfn);
-    return rc;
-}
-
-/* Update the L3 entry at pl3e to new value nl3e. pl3e is within frame pfn. */
-static int mod_l3_entry(l3_pgentry_t *pl3e,
-                        l3_pgentry_t nl3e,
-                        unsigned long pfn,
-                        int preserve_ad,
-                        struct vcpu *vcpu)
-{
-    l3_pgentry_t ol3e;
-    struct domain *d = vcpu->domain;
-    int rc = 0;
-
-    if ( unlikely(!is_guest_l3_slot(pgentry_ptr_to_slot(pl3e))) )
-    {
-        gdprintk(XENLOG_WARNING, "L3 update in Xen-private area, slot %#lx\n",
-                 pgentry_ptr_to_slot(pl3e));
-        return -EINVAL;
-    }
-
-    /*
-     * Disallow updates to final L3 slot. It contains Xen mappings, and it
-     * would be a pain to ensure they remain continuously valid throughout.
-     */
-    if ( is_pv_32bit_domain(d) && (pgentry_ptr_to_slot(pl3e) >= 3) )
-        return -EINVAL;
-
-    if ( unlikely(__copy_from_user(&ol3e, pl3e, sizeof(ol3e)) != 0) )
-        return -EFAULT;
-
-    if ( l3e_get_flags(nl3e) & _PAGE_PRESENT )
-    {
-        if ( unlikely(l3e_get_flags(nl3e) & l3_disallow_mask(d)) )
-        {
-            gdprintk(XENLOG_WARNING, "Bad L3 flags %x\n",
-                    l3e_get_flags(nl3e) & l3_disallow_mask(d));
-            return -EINVAL;
-        }
-
-        /* Fast path for sufficiently-similar mappings. */
-        if ( !l3e_has_changed(ol3e, nl3e, ~FASTPATH_FLAG_WHITELIST) )
-        {
-            adjust_guest_l3e(nl3e, d);
-            rc = UPDATE_ENTRY(l3, pl3e, ol3e, nl3e, pfn, vcpu, preserve_ad);
-            return rc ? 0 : -EFAULT;
-        }
-
-        rc = get_page_from_l3e(nl3e, pfn, d, 0);
-        if ( unlikely(rc < 0) )
-            return rc;
-        rc = 0;
-
-        adjust_guest_l3e(nl3e, d);
-        if ( unlikely(!UPDATE_ENTRY(l3, pl3e, ol3e, nl3e, pfn, vcpu,
-                                    preserve_ad)) )
-        {
-            ol3e = nl3e;
-            rc = -EFAULT;
-        }
-    }
-    else if ( unlikely(!UPDATE_ENTRY(l3, pl3e, ol3e, nl3e, pfn, vcpu,
-                                     preserve_ad)) )
-    {
-        return -EFAULT;
-    }
-
-    if ( likely(rc == 0) )
-        if ( !pv_create_pae_xen_mappings(d, pl3e) )
-            BUG();
-
-    put_page_from_l3e(ol3e, pfn, 0, 1);
-    return rc;
-}
-
-/* Update the L4 entry at pl4e to new value nl4e. pl4e is within frame pfn. */
-static int mod_l4_entry(l4_pgentry_t *pl4e,
-                        l4_pgentry_t nl4e,
-                        unsigned long pfn,
-                        int preserve_ad,
-                        struct vcpu *vcpu)
-{
-    struct domain *d = vcpu->domain;
-    l4_pgentry_t ol4e;
-    int rc = 0;
-
-    if ( unlikely(!is_guest_l4_slot(d, pgentry_ptr_to_slot(pl4e))) )
-    {
-        gdprintk(XENLOG_WARNING, "L4 update in Xen-private area, slot %#lx\n",
-                 pgentry_ptr_to_slot(pl4e));
-        return -EINVAL;
-    }
-
-    if ( unlikely(__copy_from_user(&ol4e, pl4e, sizeof(ol4e)) != 0) )
-        return -EFAULT;
-
-    if ( l4e_get_flags(nl4e) & _PAGE_PRESENT )
-    {
-        if ( unlikely(l4e_get_flags(nl4e) & L4_DISALLOW_MASK) )
-        {
-            gdprintk(XENLOG_WARNING, "Bad L4 flags %x\n",
-                    l4e_get_flags(nl4e) & L4_DISALLOW_MASK);
-            return -EINVAL;
-        }
-
-        /* Fast path for sufficiently-similar mappings. */
-        if ( !l4e_has_changed(ol4e, nl4e, ~FASTPATH_FLAG_WHITELIST) )
-        {
-            adjust_guest_l4e(nl4e, d);
-            rc = UPDATE_ENTRY(l4, pl4e, ol4e, nl4e, pfn, vcpu, preserve_ad);
-            return rc ? 0 : -EFAULT;
-        }
-
-        rc = get_page_from_l4e(nl4e, pfn, d, 0);
-        if ( unlikely(rc < 0) )
-            return rc;
-        rc = 0;
-
-        adjust_guest_l4e(nl4e, d);
-        if ( unlikely(!UPDATE_ENTRY(l4, pl4e, ol4e, nl4e, pfn, vcpu,
-                                    preserve_ad)) )
-        {
-            ol4e = nl4e;
-            rc = -EFAULT;
-        }
-    }
-    else if ( unlikely(!UPDATE_ENTRY(l4, pl4e, ol4e, nl4e, pfn, vcpu,
-                                     preserve_ad)) )
-    {
-        return -EFAULT;
-    }
-
-    put_page_from_l4e(ol4e, pfn, 0, 1);
-    return rc;
-}
-
 static int cleanup_page_cacheattr(struct page_info *page)
 {
     unsigned int cacheattr =
@@ -1886,1127 +1602,96 @@ int vcpu_destroy_pagetables(struct vcpu *v)
     return rc != -EINTR ? rc : -ERESTART;
 }
 
-int pv_new_guest_cr3(unsigned long mfn)
+int donate_page(
+    struct domain *d, struct page_info *page, unsigned int memflags)
 {
-    struct vcpu *curr = current;
-    struct domain *currd = curr->domain;
-    int rc;
-    unsigned long old_base_mfn;
-
-    if ( is_pv_32bit_domain(currd) )
-    {
-        unsigned long gt_mfn = pagetable_get_pfn(curr->arch.guest_table);
-        l4_pgentry_t *pl4e = map_domain_page(_mfn(gt_mfn));
-
-        rc = mod_l4_entry(pl4e,
-                          l4e_from_pfn(mfn,
-                                       (_PAGE_PRESENT | _PAGE_RW |
-                                        _PAGE_USER | _PAGE_ACCESSED)),
-                          gt_mfn, 0, curr);
-        unmap_domain_page(pl4e);
-        switch ( rc )
-        {
-        case 0:
-            break;
-        case -EINTR:
-        case -ERESTART:
-            return -ERESTART;
-        default:
-            gdprintk(XENLOG_WARNING,
-                     "Error while installing new compat baseptr %" PRI_mfn "\n",
-                     mfn);
-            return rc;
-        }
+    const struct domain *owner = dom_xen;
 
-        pv_invalidate_shadow_ldt(curr, false);
-        write_ptbase(curr);
+    spin_lock(&d->page_alloc_lock);
 
-        return 0;
-    }
+    if ( is_xen_heap_page(page) || ((owner = page_get_owner(page)) != NULL) )
+        goto fail;
 
-    rc = put_old_guest_table(curr);
-    if ( unlikely(rc) )
-        return rc;
+    if ( d->is_dying )
+        goto fail;
 
-    old_base_mfn = pagetable_get_pfn(curr->arch.guest_table);
-    /*
-     * This is particularly important when getting restarted after the
-     * previous attempt got preempted in the put-old-MFN phase.
-     */
-    if ( old_base_mfn == mfn )
-    {
-        write_ptbase(curr);
-        return 0;
-    }
+    if ( page->count_info & ~(PGC_allocated | 1) )
+        goto fail;
 
-    rc = paging_mode_refcounts(currd)
-         ? (get_page_from_mfn(_mfn(mfn), currd) ? 0 : -EINVAL)
-         : get_page_and_type_from_mfn(_mfn(mfn), PGT_root_page_table,
-                                      currd, 0, true);
-    switch ( rc )
+    if ( !(memflags & MEMF_no_refcount) )
     {
-    case 0:
-        break;
-    case -EINTR:
-    case -ERESTART:
-        return -ERESTART;
-    default:
-        gdprintk(XENLOG_WARNING,
-                 "Error while installing new baseptr %" PRI_mfn "\n", mfn);
-        return rc;
+        if ( d->tot_pages >= d->max_pages )
+            goto fail;
+        domain_adjust_tot_pages(d, 1);
     }
 
-    pv_invalidate_shadow_ldt(curr, false);
-
-    if ( !VM_ASSIST(currd, m2p_strict) && !paging_mode_refcounts(currd) )
-        fill_ro_mpt(mfn);
-    curr->arch.guest_table = pagetable_from_pfn(mfn);
-    update_cr3(curr);
-
-    write_ptbase(curr);
-
-    if ( likely(old_base_mfn != 0) )
-    {
-        struct page_info *page = mfn_to_page(old_base_mfn);
+    page->count_info = PGC_allocated | 1;
+    page_set_owner(page, d);
+    page_list_add_tail(page,&d->page_list);
 
-        if ( paging_mode_refcounts(currd) )
-            put_page(page);
-        else
-            switch ( rc = put_page_and_type_preemptible(page) )
-            {
-            case -EINTR:
-                rc = -ERESTART;
-                /* fallthrough */
-            case -ERESTART:
-                curr->arch.old_guest_table = page;
-                break;
-            default:
-                BUG_ON(rc);
-                break;
-            }
-    }
+    spin_unlock(&d->page_alloc_lock);
+    return 0;
 
-    return rc;
+ fail:
+    spin_unlock(&d->page_alloc_lock);
+    gdprintk(XENLOG_WARNING, "Bad donate mfn %" PRI_mfn
+             " to d%d (owner d%d) caf=%08lx taf=%" PRtype_info "\n",
+             page_to_mfn(page), d->domain_id,
+             owner ? owner->domain_id : DOMID_INVALID,
+             page->count_info, page->u.inuse.type_info);
+    return -EINVAL;
 }
 
-static struct domain *get_pg_owner(domid_t domid)
+int steal_page(
+    struct domain *d, struct page_info *page, unsigned int memflags)
 {
-    struct domain *pg_owner = NULL, *curr = current->domain;
+    unsigned long x, y;
+    bool drop_dom_ref = false;
+    const struct domain *owner = dom_xen;
 
-    if ( likely(domid == DOMID_SELF) )
-    {
-        pg_owner = rcu_lock_current_domain();
-        goto out;
-    }
+    if ( paging_mode_external(d) )
+        return -EOPNOTSUPP;
 
-    if ( unlikely(domid == curr->domain_id) )
-    {
-        gdprintk(XENLOG_WARNING, "Cannot specify itself as foreign domain\n");
-        goto out;
-    }
+    spin_lock(&d->page_alloc_lock);
 
-    switch ( domid )
-    {
-    case DOMID_IO:
-        pg_owner = rcu_lock_domain(dom_io);
-        break;
-    case DOMID_XEN:
-        pg_owner = rcu_lock_domain(dom_xen);
-        break;
-    default:
-        if ( (pg_owner = rcu_lock_domain_by_id(domid)) == NULL )
-        {
-            gdprintk(XENLOG_WARNING, "Unknown domain d%d\n", domid);
-            break;
-        }
-        break;
-    }
+    if ( is_xen_heap_page(page) || ((owner = page_get_owner(page)) != d) )
+        goto fail;
 
- out:
-    return pg_owner;
-}
+    /*
+     * We require there is just one reference (PGC_allocated). We temporarily
+     * drop this reference now so that we can safely swizzle the owner.
+     */
+    y = page->count_info;
+    do {
+        x = y;
+        if ( (x & (PGC_count_mask|PGC_allocated)) != (1 | PGC_allocated) )
+            goto fail;
+        y = cmpxchg(&page->count_info, x, x & ~PGC_count_mask);
+    } while ( y != x );
 
-static void put_pg_owner(struct domain *pg_owner)
-{
-    rcu_unlock_domain(pg_owner);
-}
+    /*
+     * With the sole reference dropped temporarily, no-one can update type
+     * information. Type count also needs to be zero in this case, but e.g.
+     * PGT_seg_desc_page may still have PGT_validated set, which we need to
+     * clear before transferring ownership (as validation criteria vary
+     * depending on domain type).
+     */
+    BUG_ON(page->u.inuse.type_info & (PGT_count_mask | PGT_locked |
+                                      PGT_pinned));
+    page->u.inuse.type_info = 0;
 
-static inline int vcpumask_to_pcpumask(
-    struct domain *d, XEN_GUEST_HANDLE_PARAM(const_void) bmap, cpumask_t *pmask)
-{
-    unsigned int vcpu_id, vcpu_bias, offs;
-    unsigned long vmask;
-    struct vcpu *v;
-    bool is_native = !is_pv_32bit_domain(d);
+    /* Swizzle the owner then reinstate the PGC_allocated reference. */
+    page_set_owner(page, NULL);
+    y = page->count_info;
+    do {
+        x = y;
+        BUG_ON((x & (PGC_count_mask|PGC_allocated)) != PGC_allocated);
+    } while ( (y = cmpxchg(&page->count_info, x, x | 1)) != x );
 
-    cpumask_clear(pmask);
-    for ( vmask = 0, offs = 0; ; ++offs )
-    {
-        vcpu_bias = offs * (is_native ? BITS_PER_LONG : 32);
-        if ( vcpu_bias >= d->max_vcpus )
-            return 0;
-
-        if ( unlikely(is_native ?
-                      copy_from_guest_offset(&vmask, bmap, offs, 1) :
-                      copy_from_guest_offset((unsigned int *)&vmask, bmap,
-                                             offs, 1)) )
-        {
-            cpumask_clear(pmask);
-            return -EFAULT;
-        }
-
-        while ( vmask )
-        {
-            vcpu_id = find_first_set_bit(vmask);
-            vmask &= ~(1UL << vcpu_id);
-            vcpu_id += vcpu_bias;
-            if ( (vcpu_id >= d->max_vcpus) )
-                return 0;
-            if ( ((v = d->vcpu[vcpu_id]) != NULL) )
-                cpumask_or(pmask, pmask, v->vcpu_dirty_cpumask);
-        }
-    }
-}
-
-long do_mmuext_op(
-    XEN_GUEST_HANDLE_PARAM(mmuext_op_t) uops,
-    unsigned int count,
-    XEN_GUEST_HANDLE_PARAM(uint) pdone,
-    unsigned int foreigndom)
-{
-    struct mmuext_op op;
-    unsigned long type;
-    unsigned int i, done = 0;
-    struct vcpu *curr = current;
-    struct domain *currd = curr->domain;
-    struct domain *pg_owner;
-    int rc = put_old_guest_table(curr);
-
-    if ( unlikely(rc) )
-    {
-        if ( likely(rc == -ERESTART) )
-            rc = hypercall_create_continuation(
-                     __HYPERVISOR_mmuext_op, "hihi", uops, count, pdone,
-                     foreigndom);
-        return rc;
-    }
-
-    if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
-         likely(guest_handle_is_null(uops)) )
-    {
-        /*
-         * See the curr->arch.old_guest_table related
-         * hypercall_create_continuation() below.
-         */
-        return (int)foreigndom;
-    }
-
-    if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
-    {
-        count &= ~MMU_UPDATE_PREEMPTED;
-        if ( unlikely(!guest_handle_is_null(pdone)) )
-            (void)copy_from_guest(&done, pdone, 1);
-    }
-    else
-        perfc_incr(calls_to_mmuext_op);
-
-    if ( unlikely(!guest_handle_okay(uops, count)) )
-        return -EFAULT;
-
-    if ( (pg_owner = get_pg_owner(foreigndom)) == NULL )
-        return -ESRCH;
-
-    if ( !is_pv_domain(pg_owner) )
-    {
-        put_pg_owner(pg_owner);
-        return -EINVAL;
-    }
-
-    rc = xsm_mmuext_op(XSM_TARGET, currd, pg_owner);
-    if ( rc )
-    {
-        put_pg_owner(pg_owner);
-        return rc;
-    }
-
-    for ( i = 0; i < count; i++ )
-    {
-        if ( curr->arch.old_guest_table || (i && hypercall_preempt_check()) )
-        {
-            rc = -ERESTART;
-            break;
-        }
-
-        if ( unlikely(__copy_from_guest(&op, uops, 1) != 0) )
-        {
-            rc = -EFAULT;
-            break;
-        }
-
-        if ( is_hvm_domain(currd) )
-        {
-            switch ( op.cmd )
-            {
-            case MMUEXT_PIN_L1_TABLE:
-            case MMUEXT_PIN_L2_TABLE:
-            case MMUEXT_PIN_L3_TABLE:
-            case MMUEXT_PIN_L4_TABLE:
-            case MMUEXT_UNPIN_TABLE:
-                break;
-            default:
-                rc = -EOPNOTSUPP;
-                goto done;
-            }
-        }
-
-        rc = 0;
-
-        switch ( op.cmd )
-        {
-            struct page_info *page;
-            p2m_type_t p2mt;
-
-        case MMUEXT_PIN_L1_TABLE:
-            type = PGT_l1_page_table;
-            goto pin_page;
-
-        case MMUEXT_PIN_L2_TABLE:
-            type = PGT_l2_page_table;
-            goto pin_page;
-
-        case MMUEXT_PIN_L3_TABLE:
-            type = PGT_l3_page_table;
-            goto pin_page;
-
-        case MMUEXT_PIN_L4_TABLE:
-            if ( is_pv_32bit_domain(pg_owner) )
-                break;
-            type = PGT_l4_page_table;
-
-        pin_page:
-            /* Ignore pinning of invalid paging levels. */
-            if ( (op.cmd - MMUEXT_PIN_L1_TABLE) > (CONFIG_PAGING_LEVELS - 1) )
-                break;
-
-            if ( paging_mode_refcounts(pg_owner) )
-                break;
-
-            page = get_page_from_gfn(pg_owner, op.arg1.mfn, NULL, P2M_ALLOC);
-            if ( unlikely(!page) )
-            {
-                rc = -EINVAL;
-                break;
-            }
-
-            rc = get_page_type_preemptible(page, type);
-            if ( unlikely(rc) )
-            {
-                if ( rc == -EINTR )
-                    rc = -ERESTART;
-                else if ( rc != -ERESTART )
-                    gdprintk(XENLOG_WARNING,
-                             "Error %d while pinning mfn %" PRI_mfn "\n",
-                            rc, page_to_mfn(page));
-                if ( page != curr->arch.old_guest_table )
-                    put_page(page);
-                break;
-            }
-
-            rc = xsm_memory_pin_page(XSM_HOOK, currd, pg_owner, page);
-            if ( !rc && unlikely(test_and_set_bit(_PGT_pinned,
-                                                  &page->u.inuse.type_info)) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "mfn %" PRI_mfn " already pinned\n", page_to_mfn(page));
-                rc = -EINVAL;
-            }
-
-            if ( unlikely(rc) )
-                goto pin_drop;
-
-            /* A page is dirtied when its pin status is set. */
-            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(page)));
-
-            /* We can race domain destruction (domain_relinquish_resources). */
-            if ( unlikely(pg_owner != currd) )
-            {
-                bool drop_ref;
-
-                spin_lock(&pg_owner->page_alloc_lock);
-                drop_ref = (pg_owner->is_dying &&
-                            test_and_clear_bit(_PGT_pinned,
-                                               &page->u.inuse.type_info));
-                spin_unlock(&pg_owner->page_alloc_lock);
-                if ( drop_ref )
-                {
-        pin_drop:
-                    if ( type == PGT_l1_page_table )
-                        put_page_and_type(page);
-                    else
-                        curr->arch.old_guest_table = page;
-                }
-            }
-            break;
-
-        case MMUEXT_UNPIN_TABLE:
-            if ( paging_mode_refcounts(pg_owner) )
-                break;
-
-            page = get_page_from_gfn(pg_owner, op.arg1.mfn, NULL, P2M_ALLOC);
-            if ( unlikely(!page) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "mfn %" PRI_mfn " bad, or bad owner d%d\n",
-                         op.arg1.mfn, pg_owner->domain_id);
-                rc = -EINVAL;
-                break;
-            }
-
-            if ( !test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
-            {
-                put_page(page);
-                gdprintk(XENLOG_WARNING,
-                         "mfn %" PRI_mfn " not pinned\n", op.arg1.mfn);
-                rc = -EINVAL;
-                break;
-            }
-
-            switch ( rc = put_page_and_type_preemptible(page) )
-            {
-            case -EINTR:
-            case -ERESTART:
-                curr->arch.old_guest_table = page;
-                rc = 0;
-                break;
-            default:
-                BUG_ON(rc);
-                break;
-            }
-            put_page(page);
-
-            /* A page is dirtied when its pin status is cleared. */
-            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(page)));
-            break;
-
-        case MMUEXT_NEW_BASEPTR:
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( unlikely(paging_mode_translate(currd)) )
-                rc = -EINVAL;
-            else
-                rc = pv_new_guest_cr3(op.arg1.mfn);
-            break;
-
-        case MMUEXT_NEW_USER_BASEPTR: {
-            unsigned long old_mfn;
-
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( unlikely(paging_mode_translate(currd)) )
-                rc = -EINVAL;
-            if ( unlikely(rc) )
-                break;
-
-            old_mfn = pagetable_get_pfn(curr->arch.guest_table_user);
-            /*
-             * This is particularly important when getting restarted after the
-             * previous attempt got preempted in the put-old-MFN phase.
-             */
-            if ( old_mfn == op.arg1.mfn )
-                break;
-
-            if ( op.arg1.mfn != 0 )
-            {
-                rc = get_page_and_type_from_mfn(
-                    _mfn(op.arg1.mfn), PGT_root_page_table, currd, 0, true);
-
-                if ( unlikely(rc) )
-                {
-                    if ( rc == -EINTR )
-                        rc = -ERESTART;
-                    else if ( rc != -ERESTART )
-                        gdprintk(XENLOG_WARNING,
-                                 "Error %d installing new mfn %" PRI_mfn "\n",
-                                 rc, op.arg1.mfn);
-                    break;
-                }
-
-                if ( VM_ASSIST(currd, m2p_strict) )
-                    zap_ro_mpt(op.arg1.mfn);
-            }
-
-            curr->arch.guest_table_user = pagetable_from_pfn(op.arg1.mfn);
-
-            if ( old_mfn != 0 )
-            {
-                page = mfn_to_page(old_mfn);
-
-                switch ( rc = put_page_and_type_preemptible(page) )
-                {
-                case -EINTR:
-                    rc = -ERESTART;
-                    /* fallthrough */
-                case -ERESTART:
-                    curr->arch.old_guest_table = page;
-                    break;
-                default:
-                    BUG_ON(rc);
-                    break;
-                }
-            }
-
-            break;
-        }
-
-        case MMUEXT_TLB_FLUSH_LOCAL:
-            if ( likely(currd == pg_owner) )
-                flush_tlb_local();
-            else
-                rc = -EPERM;
-            break;
-
-        case MMUEXT_INVLPG_LOCAL:
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else
-                paging_invlpg(curr, op.arg1.linear_addr);
-            break;
-
-        case MMUEXT_TLB_FLUSH_MULTI:
-        case MMUEXT_INVLPG_MULTI:
-        {
-            cpumask_t *mask = this_cpu(scratch_cpumask);
-
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( unlikely(vcpumask_to_pcpumask(currd,
-                                   guest_handle_to_param(op.arg2.vcpumask,
-                                                         const_void),
-                                   mask)) )
-                rc = -EINVAL;
-            if ( unlikely(rc) )
-                break;
-
-            if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI )
-                flush_tlb_mask(mask);
-            else if ( __addr_ok(op.arg1.linear_addr) )
-                flush_tlb_one_mask(mask, op.arg1.linear_addr);
-            break;
-        }
-
-        case MMUEXT_TLB_FLUSH_ALL:
-            if ( likely(currd == pg_owner) )
-                flush_tlb_mask(currd->domain_dirty_cpumask);
-            else
-                rc = -EPERM;
-            break;
-
-        case MMUEXT_INVLPG_ALL:
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( __addr_ok(op.arg1.linear_addr) )
-                flush_tlb_one_mask(currd->domain_dirty_cpumask,
-                                   op.arg1.linear_addr);
-            break;
-
-        case MMUEXT_FLUSH_CACHE:
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( unlikely(!cache_flush_permitted(currd)) )
-                rc = -EACCES;
-            else
-                wbinvd();
-            break;
-
-        case MMUEXT_FLUSH_CACHE_GLOBAL:
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( likely(cache_flush_permitted(currd)) )
-            {
-                unsigned int cpu;
-                cpumask_t *mask = this_cpu(scratch_cpumask);
-
-                cpumask_clear(mask);
-                for_each_online_cpu(cpu)
-                    if ( !cpumask_intersects(mask,
-                                             per_cpu(cpu_sibling_mask, cpu)) )
-                        __cpumask_set_cpu(cpu, mask);
-                flush_mask(mask, FLUSH_CACHE);
-            }
-            else
-                rc = -EINVAL;
-            break;
-
-        case MMUEXT_SET_LDT:
-        {
-            unsigned int ents = op.arg2.nr_ents;
-            unsigned long ptr = ents ? op.arg1.linear_addr : 0;
-
-            if ( unlikely(currd != pg_owner) )
-                rc = -EPERM;
-            else if ( paging_mode_external(currd) )
-                rc = -EINVAL;
-            else if ( ((ptr & (PAGE_SIZE - 1)) != 0) || !__addr_ok(ptr) ||
-                      (ents > 8192) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "Bad args to SET_LDT: ptr=%lx, ents=%x\n", ptr, ents);
-                rc = -EINVAL;
-            }
-            else if ( (curr->arch.pv_vcpu.ldt_ents != ents) ||
-                      (curr->arch.pv_vcpu.ldt_base != ptr) )
-            {
-                pv_invalidate_shadow_ldt(curr, false);
-                flush_tlb_local();
-                curr->arch.pv_vcpu.ldt_base = ptr;
-                curr->arch.pv_vcpu.ldt_ents = ents;
-                load_LDT(curr);
-            }
-            break;
-        }
-
-        case MMUEXT_CLEAR_PAGE:
-            page = get_page_from_gfn(pg_owner, op.arg1.mfn, &p2mt, P2M_ALLOC);
-            if ( unlikely(p2mt != p2m_ram_rw) && page )
-            {
-                put_page(page);
-                page = NULL;
-            }
-            if ( !page || !get_page_type(page, PGT_writable_page) )
-            {
-                if ( page )
-                    put_page(page);
-                gdprintk(XENLOG_WARNING,
-                         "Error clearing mfn %" PRI_mfn "\n", op.arg1.mfn);
-                rc = -EINVAL;
-                break;
-            }
-
-            /* A page is dirtied when it's being cleared. */
-            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(page)));
-
-            clear_domain_page(_mfn(page_to_mfn(page)));
-
-            put_page_and_type(page);
-            break;
-
-        case MMUEXT_COPY_PAGE:
-        {
-            struct page_info *src_page, *dst_page;
-
-            src_page = get_page_from_gfn(pg_owner, op.arg2.src_mfn, &p2mt,
-                                         P2M_ALLOC);
-            if ( unlikely(p2mt != p2m_ram_rw) && src_page )
-            {
-                put_page(src_page);
-                src_page = NULL;
-            }
-            if ( unlikely(!src_page) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "Error copying from mfn %" PRI_mfn "\n",
-                         op.arg2.src_mfn);
-                rc = -EINVAL;
-                break;
-            }
-
-            dst_page = get_page_from_gfn(pg_owner, op.arg1.mfn, &p2mt,
-                                         P2M_ALLOC);
-            if ( unlikely(p2mt != p2m_ram_rw) && dst_page )
-            {
-                put_page(dst_page);
-                dst_page = NULL;
-            }
-            rc = (dst_page &&
-                  get_page_type(dst_page, PGT_writable_page)) ? 0 : -EINVAL;
-            if ( unlikely(rc) )
-            {
-                put_page(src_page);
-                if ( dst_page )
-                    put_page(dst_page);
-                gdprintk(XENLOG_WARNING,
-                         "Error copying to mfn %" PRI_mfn "\n", op.arg1.mfn);
-                break;
-            }
-
-            /* A page is dirtied when it's being copied to. */
-            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(dst_page)));
-
-            copy_domain_page(_mfn(page_to_mfn(dst_page)),
-                             _mfn(page_to_mfn(src_page)));
-
-            put_page_and_type(dst_page);
-            put_page(src_page);
-            break;
-        }
-
-        case MMUEXT_MARK_SUPER:
-        case MMUEXT_UNMARK_SUPER:
-            rc = -EOPNOTSUPP;
-            break;
-
-        default:
-            rc = -ENOSYS;
-            break;
-        }
-
- done:
-        if ( unlikely(rc) )
-            break;
-
-        guest_handle_add_offset(uops, 1);
-    }
-
-    if ( rc == -ERESTART )
-    {
-        ASSERT(i < count);
-        rc = hypercall_create_continuation(
-            __HYPERVISOR_mmuext_op, "hihi",
-            uops, (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
-    }
-    else if ( curr->arch.old_guest_table )
-    {
-        XEN_GUEST_HANDLE_PARAM(void) null;
-
-        ASSERT(rc || i == count);
-        set_xen_guest_handle(null, NULL);
-        /*
-         * In order to have a way to communicate the final return value to
-         * our continuation, we pass this in place of "foreigndom", building
-         * on the fact that this argument isn't needed anymore.
-         */
-        rc = hypercall_create_continuation(
-                __HYPERVISOR_mmuext_op, "hihi", null,
-                MMU_UPDATE_PREEMPTED, null, rc);
-    }
-
-    put_pg_owner(pg_owner);
-
-    perfc_add(num_mmuext_ops, i);
-
-    /* Add incremental work we have done to the @done output parameter. */
-    if ( unlikely(!guest_handle_is_null(pdone)) )
-    {
-        done += i;
-        copy_to_guest(pdone, &done, 1);
-    }
-
-    return rc;
-}
-
-long do_mmu_update(
-    XEN_GUEST_HANDLE_PARAM(mmu_update_t) ureqs,
-    unsigned int count,
-    XEN_GUEST_HANDLE_PARAM(uint) pdone,
-    unsigned int foreigndom)
-{
-    struct mmu_update req;
-    void *va = NULL;
-    unsigned long gpfn, gmfn, mfn;
-    struct page_info *page;
-    unsigned int cmd, i = 0, done = 0, pt_dom;
-    struct vcpu *curr = current, *v = curr;
-    struct domain *d = v->domain, *pt_owner = d, *pg_owner;
-    mfn_t map_mfn = INVALID_MFN;
-    uint32_t xsm_needed = 0;
-    uint32_t xsm_checked = 0;
-    int rc = put_old_guest_table(curr);
-
-    if ( unlikely(rc) )
-    {
-        if ( likely(rc == -ERESTART) )
-            rc = hypercall_create_continuation(
-                     __HYPERVISOR_mmu_update, "hihi", ureqs, count, pdone,
-                     foreigndom);
-        return rc;
-    }
-
-    if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
-         likely(guest_handle_is_null(ureqs)) )
-    {
-        /*
-         * See the curr->arch.old_guest_table related
-         * hypercall_create_continuation() below.
-         */
-        return (int)foreigndom;
-    }
-
-    if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
-    {
-        count &= ~MMU_UPDATE_PREEMPTED;
-        if ( unlikely(!guest_handle_is_null(pdone)) )
-            (void)copy_from_guest(&done, pdone, 1);
-    }
-    else
-        perfc_incr(calls_to_mmu_update);
-
-    if ( unlikely(!guest_handle_okay(ureqs, count)) )
-        return -EFAULT;
-
-    if ( (pt_dom = foreigndom >> 16) != 0 )
-    {
-        /* Pagetables belong to a foreign domain (PFD). */
-        if ( (pt_owner = rcu_lock_domain_by_id(pt_dom - 1)) == NULL )
-            return -ESRCH;
-
-        if ( pt_owner == d )
-            rcu_unlock_domain(pt_owner);
-        else if ( !pt_owner->vcpu || (v = pt_owner->vcpu[0]) == NULL )
-        {
-            rc = -EINVAL;
-            goto out;
-        }
-    }
-
-    if ( (pg_owner = get_pg_owner((uint16_t)foreigndom)) == NULL )
-    {
-        rc = -ESRCH;
-        goto out;
-    }
-
-    for ( i = 0; i < count; i++ )
-    {
-        if ( curr->arch.old_guest_table || (i && hypercall_preempt_check()) )
-        {
-            rc = -ERESTART;
-            break;
-        }
-
-        if ( unlikely(__copy_from_guest(&req, ureqs, 1) != 0) )
-        {
-            rc = -EFAULT;
-            break;
-        }
-
-        cmd = req.ptr & (sizeof(l1_pgentry_t)-1);
-
-        switch ( cmd )
-        {
-            /*
-             * MMU_NORMAL_PT_UPDATE: Normal update to any level of page table.
-             * MMU_UPDATE_PT_PRESERVE_AD: As above but also preserve (OR)
-             * current A/D bits.
-             */
-        case MMU_NORMAL_PT_UPDATE:
-        case MMU_PT_UPDATE_PRESERVE_AD:
-        {
-            p2m_type_t p2mt;
-
-            rc = -EOPNOTSUPP;
-            if ( unlikely(paging_mode_refcounts(pt_owner)) )
-                break;
-
-            xsm_needed |= XSM_MMU_NORMAL_UPDATE;
-            if ( get_pte_flags(req.val) & _PAGE_PRESENT )
-            {
-                xsm_needed |= XSM_MMU_UPDATE_READ;
-                if ( get_pte_flags(req.val) & _PAGE_RW )
-                    xsm_needed |= XSM_MMU_UPDATE_WRITE;
-            }
-            if ( xsm_needed != xsm_checked )
-            {
-                rc = xsm_mmu_update(XSM_TARGET, d, pt_owner, pg_owner, xsm_needed);
-                if ( rc )
-                    break;
-                xsm_checked = xsm_needed;
-            }
-            rc = -EINVAL;
-
-            req.ptr -= cmd;
-            gmfn = req.ptr >> PAGE_SHIFT;
-            page = get_page_from_gfn(pt_owner, gmfn, &p2mt, P2M_ALLOC);
-
-            if ( p2m_is_paged(p2mt) )
-            {
-                ASSERT(!page);
-                p2m_mem_paging_populate(pg_owner, gmfn);
-                rc = -ENOENT;
-                break;
-            }
-
-            if ( unlikely(!page) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "Could not get page for normal update\n");
-                break;
-            }
-
-            mfn = page_to_mfn(page);
-
-            if ( !mfn_eq(_mfn(mfn), map_mfn) )
-            {
-                if ( va )
-                    unmap_domain_page(va);
-                va = map_domain_page(_mfn(mfn));
-                map_mfn = _mfn(mfn);
-            }
-            va = _p(((unsigned long)va & PAGE_MASK) + (req.ptr & ~PAGE_MASK));
-
-            if ( page_lock(page) )
-            {
-                switch ( page->u.inuse.type_info & PGT_type_mask )
-                {
-                case PGT_l1_page_table:
-                {
-                    l1_pgentry_t l1e = l1e_from_intpte(req.val);
-                    p2m_type_t l1e_p2mt = p2m_ram_rw;
-                    struct page_info *target = NULL;
-                    p2m_query_t q = (l1e_get_flags(l1e) & _PAGE_RW) ?
-                                        P2M_UNSHARE : P2M_ALLOC;
-
-                    if ( paging_mode_translate(pg_owner) )
-                        target = get_page_from_gfn(pg_owner, l1e_get_pfn(l1e),
-                                                   &l1e_p2mt, q);
-
-                    if ( p2m_is_paged(l1e_p2mt) )
-                    {
-                        if ( target )
-                            put_page(target);
-                        p2m_mem_paging_populate(pg_owner, l1e_get_pfn(l1e));
-                        rc = -ENOENT;
-                        break;
-                    }
-                    else if ( p2m_ram_paging_in == l1e_p2mt && !target )
-                    {
-                        rc = -ENOENT;
-                        break;
-                    }
-                    /* If we tried to unshare and failed */
-                    else if ( (q & P2M_UNSHARE) && p2m_is_shared(l1e_p2mt) )
-                    {
-                        /* We could not have obtained a page ref. */
-                        ASSERT(target == NULL);
-                        /* And mem_sharing_notify has already been called. */
-                        rc = -ENOMEM;
-                        break;
-                    }
-
-                    rc = mod_l1_entry(va, l1e, mfn,
-                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v,
-                                      pg_owner);
-                    if ( target )
-                        put_page(target);
-                }
-                break;
-                case PGT_l2_page_table:
-                    rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn,
-                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
-                    break;
-                case PGT_l3_page_table:
-                    rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn,
-                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
-                    break;
-                case PGT_l4_page_table:
-                    rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
-                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
-                break;
-                case PGT_writable_page:
-                    perfc_incr(writable_mmu_updates);
-                    if ( paging_write_guest_entry(v, va, req.val, _mfn(mfn)) )
-                        rc = 0;
-                    break;
-                }
-                page_unlock(page);
-                if ( rc == -EINTR )
-                    rc = -ERESTART;
-            }
-            else if ( get_page_type(page, PGT_writable_page) )
-            {
-                perfc_incr(writable_mmu_updates);
-                if ( paging_write_guest_entry(v, va, req.val, _mfn(mfn)) )
-                    rc = 0;
-                put_page_type(page);
-            }
-
-            put_page(page);
-        }
-        break;
-
-        case MMU_MACHPHYS_UPDATE:
-            if ( unlikely(d != pt_owner) )
-            {
-                rc = -EPERM;
-                break;
-            }
-
-            if ( unlikely(paging_mode_translate(pg_owner)) )
-            {
-                rc = -EINVAL;
-                break;
-            }
-
-            mfn = req.ptr >> PAGE_SHIFT;
-            gpfn = req.val;
-
-            xsm_needed |= XSM_MMU_MACHPHYS_UPDATE;
-            if ( xsm_needed != xsm_checked )
-            {
-                rc = xsm_mmu_update(XSM_TARGET, d, NULL, pg_owner, xsm_needed);
-                if ( rc )
-                    break;
-                xsm_checked = xsm_needed;
-            }
-
-            if ( unlikely(!get_page_from_mfn(_mfn(mfn), pg_owner)) )
-            {
-                gdprintk(XENLOG_WARNING,
-                         "Could not get page for mach->phys update\n");
-                rc = -EINVAL;
-                break;
-            }
-
-            set_gpfn_from_mfn(mfn, gpfn);
-
-            paging_mark_dirty(pg_owner, _mfn(mfn));
-
-            put_page(mfn_to_page(mfn));
-            break;
-
-        default:
-            rc = -ENOSYS;
-            break;
-        }
-
-        if ( unlikely(rc) )
-            break;
-
-        guest_handle_add_offset(ureqs, 1);
-    }
-
-    if ( rc == -ERESTART )
-    {
-        ASSERT(i < count);
-        rc = hypercall_create_continuation(
-            __HYPERVISOR_mmu_update, "hihi",
-            ureqs, (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
-    }
-    else if ( curr->arch.old_guest_table )
-    {
-        XEN_GUEST_HANDLE_PARAM(void) null;
-
-        ASSERT(rc || i == count);
-        set_xen_guest_handle(null, NULL);
-        /*
-         * In order to have a way to communicate the final return value to
-         * our continuation, we pass this in place of "foreigndom", building
-         * on the fact that this argument isn't needed anymore.
-         */
-        rc = hypercall_create_continuation(
-                __HYPERVISOR_mmu_update, "hihi", null,
-                MMU_UPDATE_PREEMPTED, null, rc);
-    }
-
-    put_pg_owner(pg_owner);
-
-    if ( va )
-        unmap_domain_page(va);
-
-    perfc_add(num_page_updates, i);
-
- out:
-    if ( pt_owner != d )
-        rcu_unlock_domain(pt_owner);
-
-    /* Add incremental work we have done to the @done output parameter. */
-    if ( unlikely(!guest_handle_is_null(pdone)) )
-    {
-        done += i;
-        copy_to_guest(pdone, &done, 1);
-    }
-
-    return rc;
-}
-
-int donate_page(
-    struct domain *d, struct page_info *page, unsigned int memflags)
-{
-    const struct domain *owner = dom_xen;
-
-    spin_lock(&d->page_alloc_lock);
-
-    if ( is_xen_heap_page(page) || ((owner = page_get_owner(page)) != NULL) )
-        goto fail;
-
-    if ( d->is_dying )
-        goto fail;
-
-    if ( page->count_info & ~(PGC_allocated | 1) )
-        goto fail;
-
-    if ( !(memflags & MEMF_no_refcount) )
-    {
-        if ( d->tot_pages >= d->max_pages )
-            goto fail;
-        domain_adjust_tot_pages(d, 1);
-    }
-
-    page->count_info = PGC_allocated | 1;
-    page_set_owner(page, d);
-    page_list_add_tail(page,&d->page_list);
-
-    spin_unlock(&d->page_alloc_lock);
-    return 0;
-
- fail:
-    spin_unlock(&d->page_alloc_lock);
-    gdprintk(XENLOG_WARNING, "Bad donate mfn %" PRI_mfn
-             " to d%d (owner d%d) caf=%08lx taf=%" PRtype_info "\n",
-             page_to_mfn(page), d->domain_id,
-             owner ? owner->domain_id : DOMID_INVALID,
-             page->count_info, page->u.inuse.type_info);
-    return -EINVAL;
-}
-
-int steal_page(
-    struct domain *d, struct page_info *page, unsigned int memflags)
-{
-    unsigned long x, y;
-    bool drop_dom_ref = false;
-    const struct domain *owner = dom_xen;
-
-    if ( paging_mode_external(d) )
-        return -EOPNOTSUPP;
-
-    spin_lock(&d->page_alloc_lock);
-
-    if ( is_xen_heap_page(page) || ((owner = page_get_owner(page)) != d) )
-        goto fail;
-
-    /*
-     * We require there is just one reference (PGC_allocated). We temporarily
-     * drop this reference now so that we can safely swizzle the owner.
-     */
-    y = page->count_info;
-    do {
-        x = y;
-        if ( (x & (PGC_count_mask|PGC_allocated)) != (1 | PGC_allocated) )
-            goto fail;
-        y = cmpxchg(&page->count_info, x, x & ~PGC_count_mask);
-    } while ( y != x );
-
-    /*
-     * With the sole reference dropped temporarily, no-one can update type
-     * information. Type count also needs to be zero in this case, but e.g.
-     * PGT_seg_desc_page may still have PGT_validated set, which we need to
-     * clear before transferring ownership (as validation criteria vary
-     * depending on domain type).
-     */
-    BUG_ON(page->u.inuse.type_info & (PGT_count_mask | PGT_locked |
-                                      PGT_pinned));
-    page->u.inuse.type_info = 0;
-
-    /* Swizzle the owner then reinstate the PGC_allocated reference. */
-    page_set_owner(page, NULL);
-    y = page->count_info;
-    do {
-        x = y;
-        BUG_ON((x & (PGC_count_mask|PGC_allocated)) != PGC_allocated);
-    } while ( (y = cmpxchg(&page->count_info, x, x | 1)) != x );
-
-    /* Unlink from original owner. */
-    if ( !(memflags & MEMF_no_refcount) && !domain_adjust_tot_pages(d, -1) )
-        drop_dom_ref = true;
-    page_list_del(page, &d->page_list);
+    /* Unlink from original owner. */
+    if ( !(memflags & MEMF_no_refcount) && !domain_adjust_tot_pages(d, -1) )
+        drop_dom_ref = true;
+    page_list_del(page, &d->page_list);
 
     spin_unlock(&d->page_alloc_lock);
     if ( unlikely(drop_dom_ref) )
@@ -3023,122 +1708,6 @@ int steal_page(
     return -EINVAL;
 }
 
-static int __do_update_va_mapping(
-    unsigned long va, u64 val64, unsigned long flags, struct domain *pg_owner)
-{
-    l1_pgentry_t   val = l1e_from_intpte(val64);
-    struct vcpu   *v   = current;
-    struct domain *d   = v->domain;
-    struct page_info *gl1pg;
-    l1_pgentry_t  *pl1e;
-    unsigned long  bmap_ptr, gl1mfn;
-    cpumask_t     *mask = NULL;
-    int            rc;
-
-    perfc_incr(calls_to_update_va);
-
-    rc = xsm_update_va_mapping(XSM_TARGET, d, pg_owner, val);
-    if ( rc )
-        return rc;
-
-    rc = -EINVAL;
-    pl1e = pv_map_guest_l1e(va, &gl1mfn);
-    if ( unlikely(!pl1e || !get_page_from_mfn(_mfn(gl1mfn), d)) )
-        goto out;
-
-    gl1pg = mfn_to_page(gl1mfn);
-    if ( !page_lock(gl1pg) )
-    {
-        put_page(gl1pg);
-        goto out;
-    }
-
-    if ( (gl1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(gl1pg);
-        put_page(gl1pg);
-        goto out;
-    }
-
-    rc = mod_l1_entry(pl1e, val, gl1mfn, 0, v, pg_owner);
-
-    page_unlock(gl1pg);
-    put_page(gl1pg);
-
- out:
-    if ( pl1e )
-        pv_unmap_guest_l1e(pl1e);
-
-    switch ( flags & UVMF_FLUSHTYPE_MASK )
-    {
-    case UVMF_TLB_FLUSH:
-        switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) )
-        {
-        case UVMF_LOCAL:
-            flush_tlb_local();
-            break;
-        case UVMF_ALL:
-            mask = d->domain_dirty_cpumask;
-            break;
-        default:
-            mask = this_cpu(scratch_cpumask);
-            rc = vcpumask_to_pcpumask(d, const_guest_handle_from_ptr(bmap_ptr,
-                                                                     void),
-                                      mask);
-            break;
-        }
-        if ( mask )
-            flush_tlb_mask(mask);
-        break;
-
-    case UVMF_INVLPG:
-        switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) )
-        {
-        case UVMF_LOCAL:
-            paging_invlpg(v, va);
-            break;
-        case UVMF_ALL:
-            mask = d->domain_dirty_cpumask;
-            break;
-        default:
-            mask = this_cpu(scratch_cpumask);
-            rc = vcpumask_to_pcpumask(d, const_guest_handle_from_ptr(bmap_ptr,
-                                                                     void),
-                                      mask);
-            break;
-        }
-        if ( mask )
-            flush_tlb_one_mask(mask, va);
-        break;
-    }
-
-    return rc;
-}
-
-long do_update_va_mapping(unsigned long va, u64 val64,
-                          unsigned long flags)
-{
-    return __do_update_va_mapping(va, val64, flags, current->domain);
-}
-
-long do_update_va_mapping_otherdomain(unsigned long va, u64 val64,
-                                      unsigned long flags,
-                                      domid_t domid)
-{
-    struct domain *pg_owner;
-    int rc;
-
-    if ( (pg_owner = get_pg_owner(domid)) == NULL )
-        return -ESRCH;
-
-    rc = __do_update_va_mapping(va, val64, flags, pg_owner);
-
-    put_pg_owner(pg_owner);
-
-    return rc;
-}
-
-
 typedef struct e820entry e820entry_t;
 DEFINE_XEN_GUEST_HANDLE(e820entry_t);
 
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 42e9d3723b..219d7d0c63 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -12,6 +12,7 @@ obj-y += hypercall.o
 obj-y += iret.o
 obj-y += misc-hypercalls.o
 obj-y += mm.o
+obj-y += mm-hypercalls.o
 obj-y += traps.o
 
 obj-bin-y += dom0_build.init.o
diff --git a/xen/arch/x86/pv/mm-hypercalls.c b/xen/arch/x86/pv/mm-hypercalls.c
new file mode 100644
index 0000000000..7447633c8a
--- /dev/null
+++ b/xen/arch/x86/pv/mm-hypercalls.c
@@ -0,0 +1,1463 @@
+/******************************************************************************
+ * arch/x86/pv/mm-hypercalls.c
+ *
+ * Memory management hypercalls for PV guests
+ *
+ * Copyright (c) 2002-2005 K A Fraser
+ * Copyright (c) 2004 Christian Limpach
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/event.h>
+#include <xen/guest_access.h>
+
+#include <asm/hypercall.h>
+#include <asm/iocap.h>
+#include <asm/ldt.h>
+#include <asm/mm.h>
+#include <asm/p2m.h>
+#include <asm/pv/mm.h>
+#include <asm/setup.h>
+
+#include <xsm/xsm.h>
+
+#include "mm.h"
+
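+/*
+ * Resolve 'domid' to the domain whose pages are operated on, taking an RCU
+ * reference which the caller must drop via put_pg_owner().  Returns NULL,
+ * with no reference held, on failure.
+ */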
+static struct domain *get_pg_owner(domid_t domid)
+{
+    struct domain *pg_owner = NULL, *currd = current->domain;
+
+    if ( likely(domid == DOMID_SELF) )
+    {
+        pg_owner = rcu_lock_current_domain();
+        goto out;
+    }
+
+    if ( unlikely(domid == currd->domain_id) )
+    {
+        gdprintk(XENLOG_WARNING, "Cannot specify itself as foreign domain\n");
+        goto out;
+    }
+
+    switch ( domid )
+    {
+    case DOMID_IO:
+        pg_owner = rcu_lock_domain(dom_io);
+        break;
+    case DOMID_XEN:
+        pg_owner = rcu_lock_domain(dom_xen);
+        break;
+    default:
+        if ( (pg_owner = rcu_lock_domain_by_id(domid)) == NULL )
+        {
+            gdprintk(XENLOG_WARNING, "Unknown domain d%d\n", domid);
+            break;
+        }
+        break;
+    }
+
+ out:
+    return pg_owner;
+}
+
+static void put_pg_owner(struct domain *pg_owner)
+{
+    rcu_unlock_domain(pg_owner);
+}
+
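+/*
+ * Translate a guest-supplied vCPU bitmap into the set of physical CPUs
+ * those vCPUs may have state on.  A minimal usage sketch, mirroring the
+ * MMUEXT_TLB_FLUSH_MULTI handler below:
+ *
+ *     cpumask_t *mask = this_cpu(scratch_cpumask);
+ *
+ *     if ( !vcpumask_to_pcpumask(currd, bmap, mask) )
+ *         flush_tlb_mask(mask);
+ *
+ * The bitmap is read in BITS_PER_LONG-bit chunks (32-bit chunks for compat
+ * guests); bits at or beyond d->max_vcpus are ignored.
+ */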
+static inline int vcpumask_to_pcpumask(
+    struct domain *d, XEN_GUEST_HANDLE_PARAM(const_void) bmap, cpumask_t *pmask)
+{
+    unsigned int vcpu_id, vcpu_bias, offs;
+    unsigned long vmask;
+    struct vcpu *v;
+    bool is_native = !is_pv_32bit_domain(d);
+
+    cpumask_clear(pmask);
+    for ( vmask = 0, offs = 0; ; ++offs )
+    {
+        vcpu_bias = offs * (is_native ? BITS_PER_LONG : 32);
+        if ( vcpu_bias >= d->max_vcpus )
+            return 0;
+
+        if ( unlikely(is_native ?
+                      copy_from_guest_offset(&vmask, bmap, offs, 1) :
+                      copy_from_guest_offset((unsigned int *)&vmask, bmap,
+                                             offs, 1)) )
+        {
+            cpumask_clear(pmask);
+            return -EFAULT;
+        }
+
+        while ( vmask )
+        {
+            vcpu_id = find_first_set_bit(vmask);
+            vmask &= ~(1UL << vcpu_id);
+            vcpu_id += vcpu_bias;
+            if ( (vcpu_id >= d->max_vcpus) )
+                return 0;
+            if ( ((v = d->vcpu[vcpu_id]) != NULL) )
+                cpumask_or(pmask, pmask, v->vcpu_dirty_cpumask);
+        }
+    }
+}
+
+/*
+ * PTE flags that a guest may change without re-validating the PTE.
+ * All other bits affect translation, caching, or Xen's safety.
+ */
+#define FASTPATH_FLAG_WHITELIST                                     \
+    (_PAGE_NX_BIT | _PAGE_AVAIL_HIGH | _PAGE_AVAIL | _PAGE_GLOBAL | \
+     _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_USER)
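+
+/*
+ * Example: a PTE update that merely toggles _PAGE_ACCESSED/_PAGE_DIRTY on
+ * an otherwise identical mapping satisfies
+ *
+ *     !l1e_has_changed(ol1e, nl1e, ~FASTPATH_FLAG_WHITELIST)
+ *
+ * and hence takes the fast path in the mod_l*_entry() functions below,
+ * skipping the full get_page_from_l*e() re-validation.
+ */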
+
+/* Update the L1 entry at pl1e to new value nl1e. */
+static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
+                        unsigned long gl1mfn, int preserve_ad,
+                        struct vcpu *pt_vcpu, struct domain *pg_dom)
+{
+    l1_pgentry_t ol1e;
+    struct domain *pt_dom = pt_vcpu->domain;
+    int rc = 0;
+
+    if ( unlikely(__copy_from_user(&ol1e, pl1e, sizeof(ol1e)) != 0) )
+        return -EFAULT;
+
+    ASSERT(!paging_mode_refcounts(pt_dom));
+
+    if ( l1e_get_flags(nl1e) & _PAGE_PRESENT )
+    {
+        /* Translate foreign guest addresses. */
+        struct page_info *page = NULL;
+
+        if ( unlikely(l1e_get_flags(nl1e) & l1_disallow_mask(pt_dom)) )
+        {
+            gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n",
+                    l1e_get_flags(nl1e) & l1_disallow_mask(pt_dom));
+            return -EINVAL;
+        }
+
+        if ( paging_mode_translate(pg_dom) )
+        {
+            page = get_page_from_gfn(pg_dom, l1e_get_pfn(nl1e), NULL, P2M_ALLOC);
+            if ( !page )
+                return -EINVAL;
+            nl1e = l1e_from_pfn(page_to_mfn(page), l1e_get_flags(nl1e));
+        }
+
+        /* Fast path for sufficiently-similar mappings. */
+        if ( !l1e_has_changed(ol1e, nl1e, ~FASTPATH_FLAG_WHITELIST) )
+        {
+            adjust_guest_l1e(nl1e, pt_dom);
+            rc = UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu,
+                              preserve_ad);
+            if ( page )
+                put_page(page);
+            return rc ? 0 : -EBUSY;
+        }
+
+        switch ( rc = get_page_from_l1e(nl1e, pt_dom, pg_dom) )
+        {
+        default:
+            if ( page )
+                put_page(page);
+            return rc;
+        case 0:
+            break;
+        case _PAGE_RW ... _PAGE_RW | PAGE_CACHE_ATTRS:
+            ASSERT(!(rc & ~(_PAGE_RW | PAGE_CACHE_ATTRS)));
+            l1e_flip_flags(nl1e, rc);
+            rc = 0;
+            break;
+        }
+        if ( page )
+            put_page(page);
+
+        adjust_guest_l1e(nl1e, pt_dom);
+        if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu,
+                                    preserve_ad)) )
+        {
+            ol1e = nl1e;
+            rc = -EBUSY;
+        }
+    }
+    else if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, pt_vcpu,
+                                     preserve_ad)) )
+    {
+        return -EBUSY;
+    }
+
+    put_page_from_l1e(ol1e, pt_dom);
+    return rc;
+}
+
+/* Update the L2 entry at pl2e to new value nl2e. pl2e is within frame pfn. */
+static int mod_l2_entry(l2_pgentry_t *pl2e, l2_pgentry_t nl2e,
+                        unsigned long pfn, int preserve_ad, struct vcpu *vcpu)
+{
+    l2_pgentry_t ol2e;
+    struct domain *d = vcpu->domain;
+    struct page_info *l2pg = mfn_to_page(pfn);
+    unsigned long type = l2pg->u.inuse.type_info;
+    int rc = 0;
+
+    if ( unlikely(!is_guest_l2_slot(d, type, pgentry_ptr_to_slot(pl2e))) )
+    {
+        gdprintk(XENLOG_WARNING, "L2 update in Xen-private area, slot %#lx\n",
+                 pgentry_ptr_to_slot(pl2e));
+        return -EPERM;
+    }
+
+    if ( unlikely(__copy_from_user(&ol2e, pl2e, sizeof(ol2e)) != 0) )
+        return -EFAULT;
+
+    if ( l2e_get_flags(nl2e) & _PAGE_PRESENT )
+    {
+        if ( unlikely(l2e_get_flags(nl2e) & L2_DISALLOW_MASK) )
+        {
+            gdprintk(XENLOG_WARNING, "Bad L2 flags %x\n",
+                    l2e_get_flags(nl2e) & L2_DISALLOW_MASK);
+            return -EINVAL;
+        }
+
+        /* Fast path for sufficiently-similar mappings. */
+        if ( !l2e_has_changed(ol2e, nl2e, ~FASTPATH_FLAG_WHITELIST) )
+        {
+            adjust_guest_l2e(nl2e, d);
+            if ( UPDATE_ENTRY(l2, pl2e, ol2e, nl2e, pfn, vcpu, preserve_ad) )
+                return 0;
+            return -EBUSY;
+        }
+
+        if ( unlikely((rc = get_page_from_l2e(nl2e, pfn, d)) < 0) )
+            return rc;
+
+        adjust_guest_l2e(nl2e, d);
+        if ( unlikely(!UPDATE_ENTRY(l2, pl2e, ol2e, nl2e, pfn, vcpu,
+                                    preserve_ad)) )
+        {
+            ol2e = nl2e;
+            rc = -EBUSY;
+        }
+    }
+    else if ( unlikely(!UPDATE_ENTRY(l2, pl2e, ol2e, nl2e, pfn, vcpu,
+                                     preserve_ad)) )
+    {
+        return -EBUSY;
+    }
+
+    put_page_from_l2e(ol2e, pfn);
+    return rc;
+}
+
+/* Update the L3 entry at pl3e to new value nl3e. pl3e is within frame pfn. */
+static int mod_l3_entry(l3_pgentry_t *pl3e, l3_pgentry_t nl3e,
+                        unsigned long pfn, int preserve_ad, struct vcpu *vcpu)
+{
+    l3_pgentry_t ol3e;
+    struct domain *d = vcpu->domain;
+    int rc = 0;
+
+    if ( unlikely(!is_guest_l3_slot(pgentry_ptr_to_slot(pl3e))) )
+    {
+        gdprintk(XENLOG_WARNING, "L3 update in Xen-private area, slot %#lx\n",
+                 pgentry_ptr_to_slot(pl3e));
+        return -EINVAL;
+    }
+
+    /*
+     * Disallow updates to final L3 slot. It contains Xen mappings, and it
+     * would be a pain to ensure they remain continuously valid throughout.
+     */
+    if ( is_pv_32bit_domain(d) && (pgentry_ptr_to_slot(pl3e) >= 3) )
+        return -EINVAL;
+
+    if ( unlikely(__copy_from_user(&ol3e, pl3e, sizeof(ol3e)) != 0) )
+        return -EFAULT;
+
+    if ( l3e_get_flags(nl3e) & _PAGE_PRESENT )
+    {
+        if ( unlikely(l3e_get_flags(nl3e) & l3_disallow_mask(d)) )
+        {
+            gdprintk(XENLOG_WARNING, "Bad L3 flags %x\n",
+                    l3e_get_flags(nl3e) & l3_disallow_mask(d));
+            return -EINVAL;
+        }
+
+        /* Fast path for sufficiently-similar mappings. */
+        if ( !l3e_has_changed(ol3e, nl3e, ~FASTPATH_FLAG_WHITELIST) )
+        {
+            adjust_guest_l3e(nl3e, d);
+            rc = UPDATE_ENTRY(l3, pl3e, ol3e, nl3e, pfn, vcpu, preserve_ad);
+            return rc ? 0 : -EFAULT;
+        }
+
+        rc = get_page_from_l3e(nl3e, pfn, d, 0);
+        if ( unlikely(rc < 0) )
+            return rc;
+        rc = 0;
+
+        adjust_guest_l3e(nl3e, d);
+        if ( unlikely(!UPDATE_ENTRY(l3, pl3e, ol3e, nl3e, pfn, vcpu,
+                                    preserve_ad)) )
+        {
+            ol3e = nl3e;
+            rc = -EFAULT;
+        }
+    }
+    else if ( unlikely(!UPDATE_ENTRY(l3, pl3e, ol3e, nl3e, pfn, vcpu,
+                                     preserve_ad)) )
+    {
+        return -EFAULT;
+    }
+
+    if ( likely(rc == 0) )
+        if ( !pv_create_pae_xen_mappings(d, pl3e) )
+            BUG();
+
+    put_page_from_l3e(ol3e, pfn, 0, 1);
+    return rc;
+}
+
+/* Update the L4 entry at pl4e to new value nl4e. pl4e is within frame pfn. */
+static int mod_l4_entry(l4_pgentry_t *pl4e, l4_pgentry_t nl4e,
+                        unsigned long pfn, int preserve_ad, struct vcpu *vcpu)
+{
+    struct domain *d = vcpu->domain;
+    l4_pgentry_t ol4e;
+    int rc = 0;
+
+    if ( unlikely(!is_guest_l4_slot(d, pgentry_ptr_to_slot(pl4e))) )
+    {
+        gdprintk(XENLOG_WARNING, "L4 update in Xen-private area, slot %#lx\n",
+                 pgentry_ptr_to_slot(pl4e));
+        return -EINVAL;
+    }
+
+    if ( unlikely(__copy_from_user(&ol4e, pl4e, sizeof(ol4e)) != 0) )
+        return -EFAULT;
+
+    if ( l4e_get_flags(nl4e) & _PAGE_PRESENT )
+    {
+        if ( unlikely(l4e_get_flags(nl4e) & L4_DISALLOW_MASK) )
+        {
+            gdprintk(XENLOG_WARNING, "Bad L4 flags %x\n",
+                    l4e_get_flags(nl4e) & L4_DISALLOW_MASK);
+            return -EINVAL;
+        }
+
+        /* Fast path for sufficiently-similar mappings. */
+        if ( !l4e_has_changed(ol4e, nl4e, ~FASTPATH_FLAG_WHITELIST) )
+        {
+            adjust_guest_l4e(nl4e, d);
+            rc = UPDATE_ENTRY(l4, pl4e, ol4e, nl4e, pfn, vcpu, preserve_ad);
+            return rc ? 0 : -EFAULT;
+        }
+
+        rc = get_page_from_l4e(nl4e, pfn, d, 0);
+        if ( unlikely(rc < 0) )
+            return rc;
+        rc = 0;
+
+        adjust_guest_l4e(nl4e, d);
+        if ( unlikely(!UPDATE_ENTRY(l4, pl4e, ol4e, nl4e, pfn, vcpu,
+                                    preserve_ad)) )
+        {
+            ol4e = nl4e;
+            rc = -EFAULT;
+        }
+    }
+    else if ( unlikely(!UPDATE_ENTRY(l4, pl4e, ol4e, nl4e, pfn, vcpu,
+                                     preserve_ad)) )
+    {
+        return -EFAULT;
+    }
+
+    put_page_from_l4e(ol4e, pfn, 0, 1);
+    return rc;
+}
+
+int pv_new_guest_cr3(unsigned long mfn)
+{
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    int rc;
+    unsigned long old_base_mfn;
+
+    if ( is_pv_32bit_domain(currd) )
+    {
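+        /*
+         * A 32-bit PV guest's 3-level top level hangs off slot 0 of a
+         * Xen-maintained L4 table, so the base pointer is switched by
+         * updating that L4 entry in place rather than by replacing
+         * guest_table.
+         */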
+        unsigned long gt_mfn = pagetable_get_pfn(curr->arch.guest_table);
+        l4_pgentry_t *pl4e = map_domain_page(_mfn(gt_mfn));
+
+        rc = mod_l4_entry(pl4e,
+                          l4e_from_pfn(mfn,
+                                       (_PAGE_PRESENT | _PAGE_RW |
+                                        _PAGE_USER | _PAGE_ACCESSED)),
+                          gt_mfn, 0, curr);
+        unmap_domain_page(pl4e);
+        switch ( rc )
+        {
+        case 0:
+            break;
+        case -EINTR:
+        case -ERESTART:
+            return -ERESTART;
+        default:
+            gdprintk(XENLOG_WARNING,
+                     "Error while installing new compat baseptr %" PRI_mfn "\n",
+                     mfn);
+            return rc;
+        }
+
+        pv_invalidate_shadow_ldt(curr, false);
+        write_ptbase(curr);
+
+        return 0;
+    }
+
+    rc = put_old_guest_table(curr);
+    if ( unlikely(rc) )
+        return rc;
+
+    old_base_mfn = pagetable_get_pfn(curr->arch.guest_table);
+    /*
+     * This is particularly important when getting restarted after the
+     * previous attempt got preempted in the put-old-MFN phase.
+     */
+    if ( old_base_mfn == mfn )
+    {
+        write_ptbase(curr);
+        return 0;
+    }
+
+    rc = paging_mode_refcounts(currd)
+         ? (get_page_from_mfn(_mfn(mfn), currd) ? 0 : -EINVAL)
+         : get_page_and_type_from_mfn(_mfn(mfn), PGT_root_page_table,
+                                      currd, 0, true);
+    switch ( rc )
+    {
+    case 0:
+        break;
+    case -EINTR:
+    case -ERESTART:
+        return -ERESTART;
+    default:
+        gdprintk(XENLOG_WARNING,
+                 "Error while installing new baseptr %" PRI_mfn "\n", mfn);
+        return rc;
+    }
+
+    pv_invalidate_shadow_ldt(curr, false);
+
+    if ( !VM_ASSIST(currd, m2p_strict) && !paging_mode_refcounts(currd) )
+        fill_ro_mpt(mfn);
+    curr->arch.guest_table = pagetable_from_pfn(mfn);
+    update_cr3(curr);
+
+    write_ptbase(curr);
+
+    if ( likely(old_base_mfn != 0) )
+    {
+        struct page_info *page = mfn_to_page(old_base_mfn);
+
+        if ( paging_mode_refcounts(currd) )
+            put_page(page);
+        else
+            switch ( rc = put_page_and_type_preemptible(page) )
+            {
+            case -EINTR:
+                rc = -ERESTART;
+                /* fallthrough */
+            case -ERESTART:
+                curr->arch.old_guest_table = page;
+                break;
+            default:
+                BUG_ON(rc);
+                break;
+            }
+    }
+
+    return rc;
+}
+
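+/*
+ * A sketch of the guest's side of this hypercall, pinning the L1 table at
+ * machine frame 'mfn' (HYPERVISOR_mmuext_op being the usual guest-side
+ * wrapper; it is not defined in Xen itself):
+ *
+ *     struct mmuext_op op = {
+ *         .cmd = MMUEXT_PIN_L1_TABLE,
+ *         .arg1.mfn = mfn,
+ *     };
+ *     rc = HYPERVISOR_mmuext_op(&op, 1, NULL, DOMID_SELF);
+ *
+ * Batches of ops are processed with preemption checks between iterations;
+ * a preempted batch is resumed via the continuation logic at the bottom.
+ */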
+long do_mmuext_op(XEN_GUEST_HANDLE_PARAM(mmuext_op_t) uops, unsigned int count,
+                  XEN_GUEST_HANDLE_PARAM(uint) pdone,
+                  unsigned int foreigndom)
+{
+    struct mmuext_op op;
+    unsigned long type;
+    unsigned int i, done = 0;
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    struct domain *pg_owner;
+    int rc = put_old_guest_table(curr);
+
+    if ( unlikely(rc) )
+    {
+        if ( likely(rc == -ERESTART) )
+            rc = hypercall_create_continuation(
+                     __HYPERVISOR_mmuext_op, "hihi", uops, count, pdone,
+                     foreigndom);
+        return rc;
+    }
+
+    if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
+         likely(guest_handle_is_null(uops)) )
+    {
+        /*
+         * See the curr->arch.old_guest_table related
+         * hypercall_create_continuation() below.
+         */
+        return (int)foreigndom;
+    }
+
+    if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
+    {
+        count &= ~MMU_UPDATE_PREEMPTED;
+        if ( unlikely(!guest_handle_is_null(pdone)) )
+            (void)copy_from_guest(&done, pdone, 1);
+    }
+    else
+        perfc_incr(calls_to_mmuext_op);
+
+    if ( unlikely(!guest_handle_okay(uops, count)) )
+        return -EFAULT;
+
+    if ( (pg_owner = get_pg_owner(foreigndom)) == NULL )
+        return -ESRCH;
+
+    if ( !is_pv_domain(pg_owner) )
+    {
+        put_pg_owner(pg_owner);
+        return -EINVAL;
+    }
+
+    rc = xsm_mmuext_op(XSM_TARGET, currd, pg_owner);
+    if ( rc )
+    {
+        put_pg_owner(pg_owner);
+        return rc;
+    }
+
+    for ( i = 0; i < count; i++ )
+    {
+        if ( curr->arch.old_guest_table || (i && hypercall_preempt_check()) )
+        {
+            rc = -ERESTART;
+            break;
+        }
+
+        if ( unlikely(__copy_from_guest(&op, uops, 1) != 0) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        if ( is_hvm_domain(currd) )
+        {
+            switch ( op.cmd )
+            {
+            case MMUEXT_PIN_L1_TABLE:
+            case MMUEXT_PIN_L2_TABLE:
+            case MMUEXT_PIN_L3_TABLE:
+            case MMUEXT_PIN_L4_TABLE:
+            case MMUEXT_UNPIN_TABLE:
+                break;
+            default:
+                rc = -EOPNOTSUPP;
+                goto done;
+            }
+        }
+
+        rc = 0;
+
+        switch ( op.cmd )
+        {
+            struct page_info *page;
+            p2m_type_t p2mt;
+
+        case MMUEXT_PIN_L1_TABLE:
+            type = PGT_l1_page_table;
+            goto pin_page;
+
+        case MMUEXT_PIN_L2_TABLE:
+            type = PGT_l2_page_table;
+            goto pin_page;
+
+        case MMUEXT_PIN_L3_TABLE:
+            type = PGT_l3_page_table;
+            goto pin_page;
+
+        case MMUEXT_PIN_L4_TABLE:
+            if ( is_pv_32bit_domain(pg_owner) )
+                break;
+            type = PGT_l4_page_table;
+
+        pin_page:
+            /* Ignore pinning of invalid paging levels. */
+            if ( (op.cmd - MMUEXT_PIN_L1_TABLE) > (CONFIG_PAGING_LEVELS - 1) )
+                break;
+
+            if ( paging_mode_refcounts(pg_owner) )
+                break;
+
+            page = get_page_from_gfn(pg_owner, op.arg1.mfn, NULL, P2M_ALLOC);
+            if ( unlikely(!page) )
+            {
+                rc = -EINVAL;
+                break;
+            }
+
+            rc = get_page_type_preemptible(page, type);
+            if ( unlikely(rc) )
+            {
+                if ( rc == -EINTR )
+                    rc = -ERESTART;
+                else if ( rc != -ERESTART )
+                    gdprintk(XENLOG_WARNING,
+                             "Error %d while pinning mfn %" PRI_mfn "\n",
+                            rc, page_to_mfn(page));
+                if ( page != curr->arch.old_guest_table )
+                    put_page(page);
+                break;
+            }
+
+            rc = xsm_memory_pin_page(XSM_HOOK, currd, pg_owner, page);
+            if ( !rc && unlikely(test_and_set_bit(_PGT_pinned,
+                                                  &page->u.inuse.type_info)) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "mfn %" PRI_mfn " already pinned\n", page_to_mfn(page));
+                rc = -EINVAL;
+            }
+
+            if ( unlikely(rc) )
+                goto pin_drop;
+
+            /* A page is dirtied when its pin status is set. */
+            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(page)));
+
+            /* We can race domain destruction (domain_relinquish_resources). */
+            if ( unlikely(pg_owner != currd) )
+            {
+                bool drop_ref;
+
+                spin_lock(&pg_owner->page_alloc_lock);
+                drop_ref = (pg_owner->is_dying &&
+                            test_and_clear_bit(_PGT_pinned,
+                                               &page->u.inuse.type_info));
+                spin_unlock(&pg_owner->page_alloc_lock);
+                if ( drop_ref )
+                {
+        pin_drop:
+                    if ( type == PGT_l1_page_table )
+                        put_page_and_type(page);
+                    else
+                        curr->arch.old_guest_table = page;
+                }
+            }
+            break;
+
+        case MMUEXT_UNPIN_TABLE:
+            if ( paging_mode_refcounts(pg_owner) )
+                break;
+
+            page = get_page_from_gfn(pg_owner, op.arg1.mfn, NULL, P2M_ALLOC);
+            if ( unlikely(!page) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "mfn %" PRI_mfn " bad, or bad owner d%d\n",
+                         op.arg1.mfn, pg_owner->domain_id);
+                rc = -EINVAL;
+                break;
+            }
+
+            if ( !test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
+            {
+                put_page(page);
+                gdprintk(XENLOG_WARNING,
+                         "mfn %" PRI_mfn " not pinned\n", op.arg1.mfn);
+                rc = -EINVAL;
+                break;
+            }
+
+            switch ( rc = put_page_and_type_preemptible(page) )
+            {
+            case -EINTR:
+            case -ERESTART:
+                curr->arch.old_guest_table = page;
+                rc = 0;
+                break;
+            default:
+                BUG_ON(rc);
+                break;
+            }
+            put_page(page);
+
+            /* A page is dirtied when its pin status is cleared. */
+            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(page)));
+            break;
+
+        case MMUEXT_NEW_BASEPTR:
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( unlikely(paging_mode_translate(currd)) )
+                rc = -EINVAL;
+            else
+                rc = pv_new_guest_cr3(op.arg1.mfn);
+            break;
+
+        case MMUEXT_NEW_USER_BASEPTR: {
+            unsigned long old_mfn;
+
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( unlikely(paging_mode_translate(currd)) )
+                rc = -EINVAL;
+            if ( unlikely(rc) )
+                break;
+
+            old_mfn = pagetable_get_pfn(curr->arch.guest_table_user);
+            /*
+             * This is particularly important when getting restarted after the
+             * previous attempt got preempted in the put-old-MFN phase.
+             */
+            if ( old_mfn == op.arg1.mfn )
+                break;
+
+            if ( op.arg1.mfn != 0 )
+            {
+                rc = get_page_and_type_from_mfn(
+                    _mfn(op.arg1.mfn), PGT_root_page_table, currd, 0, true);
+
+                if ( unlikely(rc) )
+                {
+                    if ( rc == -EINTR )
+                        rc = -ERESTART;
+                    else if ( rc != -ERESTART )
+                        gdprintk(XENLOG_WARNING,
+                                 "Error %d installing new mfn %" PRI_mfn "\n",
+                                 rc, op.arg1.mfn);
+                    break;
+                }
+
+                if ( VM_ASSIST(currd, m2p_strict) )
+                    zap_ro_mpt(op.arg1.mfn);
+            }
+
+            curr->arch.guest_table_user = pagetable_from_pfn(op.arg1.mfn);
+
+            if ( old_mfn != 0 )
+            {
+                page = mfn_to_page(old_mfn);
+
+                switch ( rc = put_page_and_type_preemptible(page) )
+                {
+                case -EINTR:
+                    rc = -ERESTART;
+                    /* fallthrough */
+                case -ERESTART:
+                    curr->arch.old_guest_table = page;
+                    break;
+                default:
+                    BUG_ON(rc);
+                    break;
+                }
+            }
+
+            break;
+        }
+
+        case MMUEXT_TLB_FLUSH_LOCAL:
+            if ( likely(currd == pg_owner) )
+                flush_tlb_local();
+            else
+                rc = -EPERM;
+            break;
+
+        case MMUEXT_INVLPG_LOCAL:
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else
+                paging_invlpg(curr, op.arg1.linear_addr);
+            break;
+
+        case MMUEXT_TLB_FLUSH_MULTI:
+        case MMUEXT_INVLPG_MULTI:
+        {
+            cpumask_t *mask = this_cpu(scratch_cpumask);
+
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( unlikely(vcpumask_to_pcpumask(currd,
+                                   guest_handle_to_param(op.arg2.vcpumask,
+                                                         const_void),
+                                   mask)) )
+                rc = -EINVAL;
+            if ( unlikely(rc) )
+                break;
+
+            if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI )
+                flush_tlb_mask(mask);
+            else if ( __addr_ok(op.arg1.linear_addr) )
+                flush_tlb_one_mask(mask, op.arg1.linear_addr);
+            break;
+        }
+
+        case MMUEXT_TLB_FLUSH_ALL:
+            if ( likely(currd == pg_owner) )
+                flush_tlb_mask(currd->domain_dirty_cpumask);
+            else
+                rc = -EPERM;
+            break;
+
+        case MMUEXT_INVLPG_ALL:
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( __addr_ok(op.arg1.linear_addr) )
+                flush_tlb_one_mask(currd->domain_dirty_cpumask,
+                                   op.arg1.linear_addr);
+            break;
+
+        case MMUEXT_FLUSH_CACHE:
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( unlikely(!cache_flush_permitted(currd)) )
+                rc = -EACCES;
+            else
+                wbinvd();
+            break;
+
+        case MMUEXT_FLUSH_CACHE_GLOBAL:
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( likely(cache_flush_permitted(currd)) )
+            {
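+                /*
+                 * WBINVD acts on the caches of the core executing it, and
+                 * hyperthread siblings share those caches, so one CPU per
+                 * sibling set suffices.
+                 */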
+                unsigned int cpu;
+                cpumask_t *mask = this_cpu(scratch_cpumask);
+
+                cpumask_clear(mask);
+                for_each_online_cpu(cpu)
+                    if ( !cpumask_intersects(mask,
+                                             per_cpu(cpu_sibling_mask, cpu)) )
+                        __cpumask_set_cpu(cpu, mask);
+                flush_mask(mask, FLUSH_CACHE);
+            }
+            else
+                rc = -EINVAL;
+            break;
+
+        case MMUEXT_SET_LDT:
+        {
+            unsigned int ents = op.arg2.nr_ents;
+            unsigned long ptr = ents ? op.arg1.linear_addr : 0;
+
+            if ( unlikely(currd != pg_owner) )
+                rc = -EPERM;
+            else if ( paging_mode_external(currd) )
+                rc = -EINVAL;
+            else if ( ((ptr & (PAGE_SIZE - 1)) != 0) || !__addr_ok(ptr) ||
+                      (ents > 8192) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "Bad args to SET_LDT: ptr=%lx, ents=%x\n", ptr, ents);
+                rc = -EINVAL;
+            }
+            else if ( (curr->arch.pv_vcpu.ldt_ents != ents) ||
+                      (curr->arch.pv_vcpu.ldt_base != ptr) )
+            {
+                pv_invalidate_shadow_ldt(curr, false);
+                flush_tlb_local();
+                curr->arch.pv_vcpu.ldt_base = ptr;
+                curr->arch.pv_vcpu.ldt_ents = ents;
+                load_LDT(curr);
+            }
+            break;
+        }
+
+        case MMUEXT_CLEAR_PAGE:
+            page = get_page_from_gfn(pg_owner, op.arg1.mfn, &p2mt, P2M_ALLOC);
+            if ( unlikely(p2mt != p2m_ram_rw) && page )
+            {
+                put_page(page);
+                page = NULL;
+            }
+            if ( !page || !get_page_type(page, PGT_writable_page) )
+            {
+                if ( page )
+                    put_page(page);
+                gdprintk(XENLOG_WARNING,
+                         "Error clearing mfn %" PRI_mfn "\n", op.arg1.mfn);
+                rc = -EINVAL;
+                break;
+            }
+
+            /* A page is dirtied when it's being cleared. */
+            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(page)));
+
+            clear_domain_page(_mfn(page_to_mfn(page)));
+
+            put_page_and_type(page);
+            break;
+
+        case MMUEXT_COPY_PAGE:
+        {
+            struct page_info *src_page, *dst_page;
+
+            src_page = get_page_from_gfn(pg_owner, op.arg2.src_mfn, &p2mt,
+                                         P2M_ALLOC);
+            if ( unlikely(p2mt != p2m_ram_rw) && src_page )
+            {
+                put_page(src_page);
+                src_page = NULL;
+            }
+            if ( unlikely(!src_page) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "Error copying from mfn %" PRI_mfn "\n",
+                         op.arg2.src_mfn);
+                rc = -EINVAL;
+                break;
+            }
+
+            dst_page = get_page_from_gfn(pg_owner, op.arg1.mfn, &p2mt,
+                                         P2M_ALLOC);
+            if ( unlikely(p2mt != p2m_ram_rw) && dst_page )
+            {
+                put_page(dst_page);
+                dst_page = NULL;
+            }
+            rc = (dst_page &&
+                  get_page_type(dst_page, PGT_writable_page)) ? 0 : -EINVAL;
+            if ( unlikely(rc) )
+            {
+                put_page(src_page);
+                if ( dst_page )
+                    put_page(dst_page);
+                gdprintk(XENLOG_WARNING,
+                         "Error copying to mfn %" PRI_mfn "\n", op.arg1.mfn);
+                break;
+            }
+
+            /* A page is dirtied when it's being copied to. */
+            paging_mark_dirty(pg_owner, _mfn(page_to_mfn(dst_page)));
+
+            copy_domain_page(_mfn(page_to_mfn(dst_page)),
+                             _mfn(page_to_mfn(src_page)));
+
+            put_page_and_type(dst_page);
+            put_page(src_page);
+            break;
+        }
+
+        case MMUEXT_MARK_SUPER:
+        case MMUEXT_UNMARK_SUPER:
+            rc = -EOPNOTSUPP;
+            break;
+
+        default:
+            rc = -ENOSYS;
+            break;
+        }
+
+ done:
+        if ( unlikely(rc) )
+            break;
+
+        guest_handle_add_offset(uops, 1);
+    }
+
+    if ( rc == -ERESTART )
+    {
+        ASSERT(i < count);
+        rc = hypercall_create_continuation(
+            __HYPERVISOR_mmuext_op, "hihi",
+            uops, (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
+    }
+    else if ( curr->arch.old_guest_table )
+    {
+        XEN_GUEST_HANDLE_PARAM(void) null;
+
+        ASSERT(rc || i == count);
+        set_xen_guest_handle(null, NULL);
+        /*
+         * In order to have a way to communicate the final return value to
+         * our continuation, we pass this in place of "foreigndom", building
+         * on the fact that this argument isn't needed anymore.
+         */
+        rc = hypercall_create_continuation(
+                __HYPERVISOR_mmuext_op, "hihi", null,
+                MMU_UPDATE_PREEMPTED, null, rc);
+    }
+
+    put_pg_owner(pg_owner);
+
+    perfc_add(num_mmuext_ops, i);
+
+    /* Add incremental work we have done to the @done output parameter. */
+    if ( unlikely(!guest_handle_is_null(pdone)) )
+    {
+        done += i;
+        copy_to_guest(pdone, &done, 1);
+    }
+
+    return rc;
+}
+
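+/*
+ * Each mmu_update request packs its command into the low bits of 'ptr'
+ * (page-table entries are 8-byte aligned, so those bits are free).  An
+ * illustrative request, where 'pte_maddr' and 'new_pte' stand for the
+ * machine address of the entry and the new PTE value:
+ *
+ *     struct mmu_update req = {
+ *         .ptr = pte_maddr | MMU_PT_UPDATE_PRESERVE_AD,
+ *         .val = new_pte,
+ *     };
+ */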
+long do_mmu_update(XEN_GUEST_HANDLE_PARAM(mmu_update_t) ureqs,
+                   unsigned int count, XEN_GUEST_HANDLE_PARAM(uint) pdone,
+                   unsigned int foreigndom)
+{
+    struct mmu_update req;
+    void *va = NULL;
+    unsigned long gpfn, gmfn, mfn;
+    struct page_info *page;
+    unsigned int cmd, i = 0, done = 0, pt_dom;
+    struct vcpu *curr = current, *v = curr;
+    struct domain *d = v->domain, *pt_owner = d, *pg_owner;
+    mfn_t map_mfn = INVALID_MFN;
+    uint32_t xsm_needed = 0;
+    uint32_t xsm_checked = 0;
+    int rc = put_old_guest_table(curr);
+
+    if ( unlikely(rc) )
+    {
+        if ( likely(rc == -ERESTART) )
+            rc = hypercall_create_continuation(
+                     __HYPERVISOR_mmu_update, "hihi", ureqs, count, pdone,
+                     foreigndom);
+        return rc;
+    }
+
+    if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
+         likely(guest_handle_is_null(ureqs)) )
+    {
+        /*
+         * See the curr->arch.old_guest_table related
+         * hypercall_create_continuation() below.
+         */
+        return (int)foreigndom;
+    }
+
+    if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
+    {
+        count &= ~MMU_UPDATE_PREEMPTED;
+        if ( unlikely(!guest_handle_is_null(pdone)) )
+            (void)copy_from_guest(&done, pdone, 1);
+    }
+    else
+        perfc_incr(calls_to_mmu_update);
+
+    if ( unlikely(!guest_handle_okay(ureqs, count)) )
+        return -EFAULT;
+
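+    /*
+     * 'foreigndom' encodes two domains: the low 16 bits name the owner of
+     * the pages being mapped (FD), while a non-zero high 16 bits names the
+     * owner of the page tables being modified, plus one (PFD).
+     */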
+    if ( (pt_dom = foreigndom >> 16) != 0 )
+    {
+        /* Pagetables belong to a foreign domain (PFD). */
+        if ( (pt_owner = rcu_lock_domain_by_id(pt_dom - 1)) == NULL )
+            return -ESRCH;
+
+        if ( pt_owner == d )
+            rcu_unlock_domain(pt_owner);
+        else if ( !pt_owner->vcpu || (v = pt_owner->vcpu[0]) == NULL )
+        {
+            rc = -EINVAL;
+            goto out;
+        }
+    }
+
+    if ( (pg_owner = get_pg_owner((uint16_t)foreigndom)) == NULL )
+    {
+        rc = -ESRCH;
+        goto out;
+    }
+
+    for ( i = 0; i < count; i++ )
+    {
+        if ( curr->arch.old_guest_table || (i && hypercall_preempt_check()) )
+        {
+            rc = -ERESTART;
+            break;
+        }
+
+        if ( unlikely(__copy_from_guest(&req, ureqs, 1) != 0) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        cmd = req.ptr & (sizeof(l1_pgentry_t)-1);
+
+        switch ( cmd )
+        {
+            /*
+             * MMU_NORMAL_PT_UPDATE: Normal update to any level of page table.
+             * MMU_PT_UPDATE_PRESERVE_AD: As above, but also preserve (OR in)
+             * the current A/D bits.
+             */
+        case MMU_NORMAL_PT_UPDATE:
+        case MMU_PT_UPDATE_PRESERVE_AD:
+        {
+            p2m_type_t p2mt;
+
+            rc = -EOPNOTSUPP;
+            if ( unlikely(paging_mode_refcounts(pt_owner)) )
+                break;
+
+            xsm_needed |= XSM_MMU_NORMAL_UPDATE;
+            if ( get_pte_flags(req.val) & _PAGE_PRESENT )
+            {
+                xsm_needed |= XSM_MMU_UPDATE_READ;
+                if ( get_pte_flags(req.val) & _PAGE_RW )
+                    xsm_needed |= XSM_MMU_UPDATE_WRITE;
+            }
+            if ( xsm_needed != xsm_checked )
+            {
+                rc = xsm_mmu_update(XSM_TARGET, d, pt_owner, pg_owner, xsm_needed);
+                if ( rc )
+                    break;
+                xsm_checked = xsm_needed;
+            }
+            rc = -EINVAL;
+
+            req.ptr -= cmd;
+            gmfn = req.ptr >> PAGE_SHIFT;
+            page = get_page_from_gfn(pt_owner, gmfn, &p2mt, P2M_ALLOC);
+
+            if ( p2m_is_paged(p2mt) )
+            {
+                ASSERT(!page);
+                p2m_mem_paging_populate(pg_owner, gmfn);
+                rc = -ENOENT;
+                break;
+            }
+
+            if ( unlikely(!page) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "Could not get page for normal update\n");
+                break;
+            }
+
+            mfn = page_to_mfn(page);
+
+            if ( !mfn_eq(_mfn(mfn), map_mfn) )
+            {
+                if ( va )
+                    unmap_domain_page(va);
+                va = map_domain_page(_mfn(mfn));
+                map_mfn = _mfn(mfn);
+            }
+            va = _p(((unsigned long)va & PAGE_MASK) + (req.ptr & ~PAGE_MASK));
+
+            if ( page_lock(page) )
+            {
+                switch ( page->u.inuse.type_info & PGT_type_mask )
+                {
+                case PGT_l1_page_table:
+                {
+                    l1_pgentry_t l1e = l1e_from_intpte(req.val);
+                    p2m_type_t l1e_p2mt = p2m_ram_rw;
+                    struct page_info *target = NULL;
+                    p2m_query_t q = (l1e_get_flags(l1e) & _PAGE_RW) ?
+                                        P2M_UNSHARE : P2M_ALLOC;
+
+                    if ( paging_mode_translate(pg_owner) )
+                        target = get_page_from_gfn(pg_owner, l1e_get_pfn(l1e),
+                                                   &l1e_p2mt, q);
+
+                    if ( p2m_is_paged(l1e_p2mt) )
+                    {
+                        if ( target )
+                            put_page(target);
+                        p2m_mem_paging_populate(pg_owner, l1e_get_pfn(l1e));
+                        rc = -ENOENT;
+                        break;
+                    }
+                    else if ( p2m_ram_paging_in == l1e_p2mt && !target )
+                    {
+                        rc = -ENOENT;
+                        break;
+                    }
+                    /* If we tried to unshare and failed */
+                    else if ( (q & P2M_UNSHARE) && p2m_is_shared(l1e_p2mt) )
+                    {
+                        /* We could not have obtained a page ref. */
+                        ASSERT(target == NULL);
+                        /* And mem_sharing_notify has already been called. */
+                        rc = -ENOMEM;
+                        break;
+                    }
+
+                    rc = mod_l1_entry(va, l1e, mfn,
+                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v,
+                                      pg_owner);
+                    if ( target )
+                        put_page(target);
+                }
+                break;
+                case PGT_l2_page_table:
+                    rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn,
+                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
+                    break;
+                case PGT_l3_page_table:
+                    rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn,
+                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
+                    break;
+                case PGT_l4_page_table:
+                    rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
+                                      cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
+                    break;
+                case PGT_writable_page:
+                    perfc_incr(writable_mmu_updates);
+                    if ( paging_write_guest_entry(v, va, req.val, _mfn(mfn)) )
+                        rc = 0;
+                    break;
+                }
+                page_unlock(page);
+                if ( rc == -EINTR )
+                    rc = -ERESTART;
+            }
+            else if ( get_page_type(page, PGT_writable_page) )
+            {
+                perfc_incr(writable_mmu_updates);
+                if ( paging_write_guest_entry(v, va, req.val, _mfn(mfn)) )
+                    rc = 0;
+                put_page_type(page);
+            }
+
+            put_page(page);
+        }
+        break;
+
+        case MMU_MACHPHYS_UPDATE:
+            if ( unlikely(d != pt_owner) )
+            {
+                rc = -EPERM;
+                break;
+            }
+
+            if ( unlikely(paging_mode_translate(pg_owner)) )
+            {
+                rc = -EINVAL;
+                break;
+            }
+
+            mfn = req.ptr >> PAGE_SHIFT;
+            gpfn = req.val;
+
+            xsm_needed |= XSM_MMU_MACHPHYS_UPDATE;
+            if ( xsm_needed != xsm_checked )
+            {
+                rc = xsm_mmu_update(XSM_TARGET, d, NULL, pg_owner, xsm_needed);
+                if ( rc )
+                    break;
+                xsm_checked = xsm_needed;
+            }
+
+            if ( unlikely(!get_page_from_mfn(_mfn(mfn), pg_owner)) )
+            {
+                gdprintk(XENLOG_WARNING,
+                         "Could not get page for mach->phys update\n");
+                rc = -EINVAL;
+                break;
+            }
+
+            set_gpfn_from_mfn(mfn, gpfn);
+
+            paging_mark_dirty(pg_owner, _mfn(mfn));
+
+            put_page(mfn_to_page(mfn));
+            break;
+
+        default:
+            rc = -ENOSYS;
+            break;
+        }
+
+        if ( unlikely(rc) )
+            break;
+
+        guest_handle_add_offset(ureqs, 1);
+    }
+
+    if ( rc == -ERESTART )
+    {
+        ASSERT(i < count);
+        rc = hypercall_create_continuation(
+            __HYPERVISOR_mmu_update, "hihi",
+            ureqs, (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
+    }
+    else if ( curr->arch.old_guest_table )
+    {
+        XEN_GUEST_HANDLE_PARAM(void) null;
+
+        ASSERT(rc || i == count);
+        set_xen_guest_handle(null, NULL);
+        /*
+         * In order to have a way to communicate the final return value to
+         * our continuation, we pass this in place of "foreigndom", building
+         * on the fact that this argument isn't needed anymore.
+         */
+        rc = hypercall_create_continuation(
+                __HYPERVISOR_mmu_update, "hihi", null,
+                MMU_UPDATE_PREEMPTED, null, rc);
+    }
+
+    put_pg_owner(pg_owner);
+
+    if ( va )
+        unmap_domain_page(va);
+
+    perfc_add(num_page_updates, i);
+
+ out:
+    if ( pt_owner != d )
+        rcu_unlock_domain(pt_owner);
+
+    /* Add incremental work we have done to the @done output parameter. */
+    if ( unlikely(!guest_handle_is_null(pdone)) )
+    {
+        done += i;
+        copy_to_guest(pdone, &done, 1);
+    }
+
+    return rc;
+}
+
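+/*
+ * The UVMF_* controls are packed into 'flags': the low bits select the
+ * flush type (UVMF_NONE, UVMF_TLB_FLUSH or UVMF_INVLPG) and the remaining
+ * bits select the scope, i.e. UVMF_LOCAL, UVMF_ALL, or else a guest
+ * pointer to a vCPU bitmap (hence the bmap_ptr extraction below).  For
+ * example, flags == (UVMF_INVLPG | UVMF_LOCAL) invalidates just 'va' on
+ * the calling CPU.
+ */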
+static int __do_update_va_mapping(unsigned long va, u64 val64,
+                                  unsigned long flags,
+                                  struct domain *pg_owner)
+{
+    l1_pgentry_t   val = l1e_from_intpte(val64);
+    struct vcpu   *curr = current;
+    struct domain *currd = curr->domain;
+    struct page_info *gl1pg;
+    l1_pgentry_t  *pl1e;
+    unsigned long  bmap_ptr, gl1mfn;
+    cpumask_t     *mask = NULL;
+    int            rc;
+
+    perfc_incr(calls_to_update_va);
+
+    rc = xsm_update_va_mapping(XSM_TARGET, currd, pg_owner, val);
+    if ( rc )
+        return rc;
+
+    rc = -EINVAL;
+    pl1e = pv_map_guest_l1e(va, &gl1mfn);
+    if ( unlikely(!pl1e || !get_page_from_mfn(_mfn(gl1mfn), currd)) )
+        goto out;
+
+    gl1pg = mfn_to_page(gl1mfn);
+    if ( !page_lock(gl1pg) )
+    {
+        put_page(gl1pg);
+        goto out;
+    }
+
+    if ( (gl1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(gl1pg);
+        put_page(gl1pg);
+        goto out;
+    }
+
+    rc = mod_l1_entry(pl1e, val, gl1mfn, 0, curr, pg_owner);
+
+    page_unlock(gl1pg);
+    put_page(gl1pg);
+
+ out:
+    if ( pl1e )
+        pv_unmap_guest_l1e(pl1e);
+
+    switch ( flags & UVMF_FLUSHTYPE_MASK )
+    {
+    case UVMF_TLB_FLUSH:
+        switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) )
+        {
+        case UVMF_LOCAL:
+            flush_tlb_local();
+            break;
+        case UVMF_ALL:
+            mask = currd->domain_dirty_cpumask;
+            break;
+        default:
+            mask = this_cpu(scratch_cpumask);
+            rc = vcpumask_to_pcpumask(currd,
+                                      const_guest_handle_from_ptr(bmap_ptr, void),
+                                      mask);
+            break;
+        }
+        if ( mask )
+            flush_tlb_mask(mask);
+        break;
+
+    case UVMF_INVLPG:
+        switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) )
+        {
+        case UVMF_LOCAL:
+            paging_invlpg(curr, va);
+            break;
+        case UVMF_ALL:
+            mask = currd->domain_dirty_cpumask;
+            break;
+        default:
+            mask = this_cpu(scratch_cpumask);
+            rc = vcpumask_to_pcpumask(currd,
+                                      const_guest_handle_from_ptr(bmap_ptr, void),
+                                      mask);
+            break;
+        }
+        if ( mask )
+            flush_tlb_one_mask(mask, va);
+        break;
+    }
+
+    return rc;
+}
+
+long do_update_va_mapping(unsigned long va, u64 val64,
+                          unsigned long flags)
+{
+    return __do_update_va_mapping(va, val64, flags, current->domain);
+}
+
+long do_update_va_mapping_otherdomain(unsigned long va, u64 val64,
+                                      unsigned long flags,
+                                      domid_t domid)
+{
+    struct domain *pg_owner;
+    int rc;
+
+    if ( (pg_owner = get_pg_owner(domid)) == NULL )
+        return -ESRCH;
+
+    rc = __do_update_va_mapping(va, val64, flags, pg_owner);
+
+    put_pg_owner(pg_owner);
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 29/31] x86/mm: remove the now unused inclusion of pv/mm.h
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (27 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 28/31] x86/mm: move PV hypercalls to pv/mm-hypercalls.c Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 30/31] x86/mm: use put_page_type_preemptible in put_page_from_l{2, 3}e Wei Liu
  2017-08-17 14:44 ` [PATCH v4 31/31] x86/mm: move {get, put}_page_from_l{2, 3, 4}e Wei Liu
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 54256b6a11..c2a73f123f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -127,8 +127,6 @@
 #include <asm/pv/grant_table.h>
 #include <asm/pv/mm.h>
 
-#include "pv/mm.h"
-
 /* Mapping of the fixmap space needed early. */
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 30/31] x86/mm: use put_page_type_preemptible in put_page_from_l{2, 3}e
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (28 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 29/31] x86/mm: remove the now unused inclusion of pv/mm.h Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  2017-08-17 14:44 ` [PATCH v4 31/31] x86/mm: move {get, put}_page_from_l{2, 3, 4}e Wei Liu
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

No functional change: put_page_type_preemptible() is a thin wrapper around
__put_page_type(..., 1), so switching to it also lets us drop the local
forward declaration of __put_page_type().

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c2a73f123f..70559c687c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1046,8 +1046,6 @@ int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
     return 0;
 }
 
-static int __put_page_type(struct page_info *, int preemptible);
-
 int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
                       bool defer)
 {
@@ -1074,7 +1072,7 @@ int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
     if ( unlikely(partial > 0) )
     {
         ASSERT(!defer);
-        return __put_page_type(pg, 1);
+        return put_page_type_preemptible(pg);
     }
 
     if ( defer )
@@ -1097,7 +1095,7 @@ int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
         if ( unlikely(partial > 0) )
         {
             ASSERT(!defer);
-            return __put_page_type(pg, 1);
+            return put_page_type_preemptible(pg);
         }
 
         if ( defer )
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v4 31/31] x86/mm: move {get, put}_page_from_l{2, 3, 4}e
  2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
                   ` (29 preceding siblings ...)
  2017-08-17 14:44 ` [PATCH v4 30/31] x86/mm: use put_page_type_preemptible in put_page_from_l{2, 3}e Wei Liu
@ 2017-08-17 14:44 ` Wei Liu
  30 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-17 14:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

They are only used by PV code.

Fix coding style issues while moving. Move the declarations to the
PV-specific header file.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c           | 253 --------------------------------------------
 xen/arch/x86/pv/mm.c        | 246 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/mm.h    |  10 --
 xen/include/asm-x86/pv/mm.h |  29 +++++
 4 files changed, 275 insertions(+), 263 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 70559c687c..9750f657ca 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -507,72 +507,6 @@ int get_page_and_type_from_mfn(mfn_t mfn, unsigned long type, struct domain *d,
     return rc;
 }
 
-static void put_data_page(
-    struct page_info *page, int writeable)
-{
-    if ( writeable )
-        put_page_and_type(page);
-    else
-        put_page(page);
-}
-
-/*
- * We allow root tables to map each other (a.k.a. linear page tables). It
- * needs some special care with reference counts and access permissions:
- *  1. The mapping entry must be read-only, or the guest may get write access
- *     to its own PTEs.
- *  2. We must only bump the reference counts for an *already validated*
- *     L2 table, or we can end up in a deadlock in get_page_type() by waiting
- *     on a validation that is required to complete that validation.
- *  3. We only need to increment the reference counts for the mapped page
- *     frame if it is mapped by a different root table. This is sufficient and
- *     also necessary to allow validation of a root table mapping itself.
- */
-#define define_get_linear_pagetable(level)                                  \
-static int                                                                  \
-get_##level##_linear_pagetable(                                             \
-    level##_pgentry_t pde, unsigned long pde_pfn, struct domain *d)         \
-{                                                                           \
-    unsigned long x, y;                                                     \
-    struct page_info *page;                                                 \
-    unsigned long pfn;                                                      \
-                                                                            \
-    if ( (level##e_get_flags(pde) & _PAGE_RW) )                             \
-    {                                                                       \
-        gdprintk(XENLOG_WARNING,                                            \
-                 "Attempt to create linear p.t. with write perms\n");       \
-        return 0;                                                           \
-    }                                                                       \
-                                                                            \
-    if ( (pfn = level##e_get_pfn(pde)) != pde_pfn )                         \
-    {                                                                       \
-        /* Make sure the mapped frame belongs to the correct domain. */     \
-        if ( unlikely(!get_page_from_mfn(_mfn(pfn), d)) )                   \
-            return 0;                                                       \
-                                                                            \
-        /*                                                                  \
-         * Ensure that the mapped frame is an already-validated page table. \
-         * If so, atomically increment the count (checking for overflow).   \
-         */                                                                 \
-        page = mfn_to_page(pfn);                                            \
-        y = page->u.inuse.type_info;                                        \
-        do {                                                                \
-            x = y;                                                          \
-            if ( unlikely((x & PGT_count_mask) == PGT_count_mask) ||        \
-                 unlikely((x & (PGT_type_mask|PGT_validated)) !=            \
-                          (PGT_##level##_page_table|PGT_validated)) )       \
-            {                                                               \
-                put_page(page);                                             \
-                return 0;                                                   \
-            }                                                               \
-        }                                                                   \
-        while ( (y = cmpxchg(&page->u.inuse.type_info, x, x + 1)) != x );   \
-    }                                                                       \
-                                                                            \
-    return 1;                                                               \
-}
-
-
 bool is_iomem_page(mfn_t mfn)
 {
     struct page_info *page;
@@ -862,108 +796,6 @@ get_page_from_l1e(
 }
 
 
-/* NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'. */
-/*
- * get_page_from_l2e returns:
- *   1 => page not present
- *   0 => success
- *  <0 => error code
- */
-define_get_linear_pagetable(l2);
-int
-get_page_from_l2e(
-    l2_pgentry_t l2e, unsigned long pfn, struct domain *d)
-{
-    unsigned long mfn = l2e_get_pfn(l2e);
-    int rc;
-
-    if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) )
-        return 1;
-
-    if ( unlikely((l2e_get_flags(l2e) & L2_DISALLOW_MASK)) )
-    {
-        gdprintk(XENLOG_WARNING, "Bad L2 flags %x\n",
-                 l2e_get_flags(l2e) & L2_DISALLOW_MASK);
-        return -EINVAL;
-    }
-
-    if ( !(l2e_get_flags(l2e) & _PAGE_PSE) )
-    {
-        rc = get_page_and_type_from_mfn(_mfn(mfn), PGT_l1_page_table, d, 0,
-                                        false);
-        if ( unlikely(rc == -EINVAL) && get_l2_linear_pagetable(l2e, pfn, d) )
-            rc = 0;
-        return rc;
-    }
-
-    return -EINVAL;
-}
-
-
-/*
- * get_page_from_l3e returns:
- *   1 => page not present
- *   0 => success
- *  <0 => error code
- */
-define_get_linear_pagetable(l3);
-int
-get_page_from_l3e(
-    l3_pgentry_t l3e, unsigned long pfn, struct domain *d, int partial)
-{
-    int rc;
-
-    if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) )
-        return 1;
-
-    if ( unlikely((l3e_get_flags(l3e) & l3_disallow_mask(d))) )
-    {
-        gdprintk(XENLOG_WARNING, "Bad L3 flags %x\n",
-                 l3e_get_flags(l3e) & l3_disallow_mask(d));
-        return -EINVAL;
-    }
-
-    rc = get_page_and_type_from_mfn(_mfn(l3e_get_pfn(l3e)), PGT_l2_page_table,
-                                    d, partial, true);
-    if ( unlikely(rc == -EINVAL) &&
-         !is_pv_32bit_domain(d) &&
-         get_l3_linear_pagetable(l3e, pfn, d) )
-        rc = 0;
-
-    return rc;
-}
-
-/*
- * get_page_from_l4e returns:
- *   1 => page not present
- *   0 => success
- *  <0 => error code
- */
-define_get_linear_pagetable(l4);
-int
-get_page_from_l4e(
-    l4_pgentry_t l4e, unsigned long pfn, struct domain *d, int partial)
-{
-    int rc;
-
-    if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
-        return 1;
-
-    if ( unlikely((l4e_get_flags(l4e) & L4_DISALLOW_MASK)) )
-    {
-        gdprintk(XENLOG_WARNING, "Bad L4 flags %x\n",
-                 l4e_get_flags(l4e) & L4_DISALLOW_MASK);
-        return -EINVAL;
-    }
-
-    rc = get_page_and_type_from_mfn(_mfn(l4e_get_pfn(l4e)), PGT_l3_page_table,
-                                    d, partial, true);
-    if ( unlikely(rc == -EINVAL) && get_l4_linear_pagetable(l4e, pfn, d) )
-        rc = 0;
-
-    return rc;
-}
-
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 {
     unsigned long     pfn = l1e_get_pfn(l1e);
@@ -1024,91 +856,6 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 }
 
 
-/*
- * NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'.
- * Note also that this automatically deals correctly with linear p.t.'s.
- */
-int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
-{
-    if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) || (l2e_get_pfn(l2e) == pfn) )
-        return 1;
-
-    if ( l2e_get_flags(l2e) & _PAGE_PSE )
-    {
-        struct page_info *page = mfn_to_page(l2e_get_pfn(l2e));
-        unsigned int i;
-
-        for ( i = 0; i < (1u << PAGETABLE_ORDER); i++, page++ )
-            put_page_and_type(page);
-    } else
-        put_page_and_type(l2e_get_page(l2e));
-
-    return 0;
-}
-
-int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
-                      bool defer)
-{
-    struct page_info *pg;
-
-    if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) || (l3e_get_pfn(l3e) == pfn) )
-        return 1;
-
-    if ( unlikely(l3e_get_flags(l3e) & _PAGE_PSE) )
-    {
-        unsigned long mfn = l3e_get_pfn(l3e);
-        int writeable = l3e_get_flags(l3e) & _PAGE_RW;
-
-        ASSERT(!(mfn & ((1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1)));
-        do {
-            put_data_page(mfn_to_page(mfn), writeable);
-        } while ( ++mfn & ((1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1) );
-
-        return 0;
-    }
-
-    pg = l3e_get_page(l3e);
-
-    if ( unlikely(partial > 0) )
-    {
-        ASSERT(!defer);
-        return put_page_type_preemptible(pg);
-    }
-
-    if ( defer )
-    {
-        current->arch.old_guest_table = pg;
-        return 0;
-    }
-
-    return put_page_and_type_preemptible(pg);
-}
-
-int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
-                      bool defer)
-{
-    if ( (l4e_get_flags(l4e) & _PAGE_PRESENT) &&
-         (l4e_get_pfn(l4e) != pfn) )
-    {
-        struct page_info *pg = l4e_get_page(l4e);
-
-        if ( unlikely(partial > 0) )
-        {
-            ASSERT(!defer);
-            return put_page_type_preemptible(pg);
-        }
-
-        if ( defer )
-        {
-            current->arch.old_guest_table = pg;
-            return 0;
-        }
-
-        return put_page_and_type_preemptible(pg);
-    }
-    return 1;
-}
-
 bool fill_ro_mpt(unsigned long mfn)
 {
     l4_pgentry_t *l4tab = map_domain_page(_mfn(mfn));
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 19b2ae588e..ad35808c51 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -777,6 +777,252 @@ void pv_invalidate_shadow_ldt(struct vcpu *v, bool flush)
     spin_unlock(&v->arch.pv_vcpu.shadow_ldt_lock);
 }
 
+/*
+ * We allow root tables to map each other (a.k.a. linear page tables). It
+ * needs some special care with reference counts and access permissions:
+ *  1. The mapping entry must be read-only, or the guest may get write access
+ *     to its own PTEs.
+ *  2. We must only bump the reference counts for an *already validated*
+ *     L2 table, or we can end up in a deadlock in get_page_type() by waiting
+ *     on a validation that is required to complete that validation.
+ *  3. We only need to increment the reference counts for the mapped page
+ *     frame if it is mapped by a different root table. This is sufficient and
+ *     also necessary to allow validation of a root table mapping itself.
+ */
+#define define_get_linear_pagetable(level)                                  \
+static int                                                                  \
+get_##level##_linear_pagetable(                                             \
+    level##_pgentry_t pde, unsigned long pde_pfn, struct domain *d)         \
+{                                                                           \
+    unsigned long x, y;                                                     \
+    struct page_info *page;                                                 \
+    unsigned long pfn;                                                      \
+                                                                            \
+    if ( (level##e_get_flags(pde) & _PAGE_RW) )                             \
+    {                                                                       \
+        gdprintk(XENLOG_WARNING,                                            \
+                 "Attempt to create linear p.t. with write perms\n");       \
+        return 0;                                                           \
+    }                                                                       \
+                                                                            \
+    if ( (pfn = level##e_get_pfn(pde)) != pde_pfn )                         \
+    {                                                                       \
+        /* Make sure the mapped frame belongs to the correct domain. */     \
+        if ( unlikely(!get_page_from_mfn(_mfn(pfn), d)) )                   \
+            return 0;                                                       \
+                                                                            \
+        /*                                                                  \
+         * Ensure that the mapped frame is an already-validated page table. \
+         * If so, atomically increment the count (checking for overflow).   \
+         */                                                                 \
+        page = mfn_to_page(pfn);                                            \
+        y = page->u.inuse.type_info;                                        \
+        do {                                                                \
+            x = y;                                                          \
+            if ( unlikely((x & PGT_count_mask) == PGT_count_mask) ||        \
+                 unlikely((x & (PGT_type_mask|PGT_validated)) !=            \
+                          (PGT_##level##_page_table|PGT_validated)) )       \
+            {                                                               \
+                put_page(page);                                             \
+                return 0;                                                   \
+            }                                                               \
+        }                                                                   \
+        while ( (y = cmpxchg(&page->u.inuse.type_info, x, x + 1)) != x );   \
+    }                                                                       \
+                                                                            \
+    return 1;                                                               \
+}
+
+/* NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'. */
+/*
+ * get_page_from_l2e returns:
+ *   1 => page not present
+ *   0 => success
+ *  <0 => error code
+ */
+define_get_linear_pagetable(l2);
+int get_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn, struct domain *d)
+{
+    unsigned long mfn = l2e_get_pfn(l2e);
+    int rc;
+
+    if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) )
+        return 1;
+
+    if ( unlikely((l2e_get_flags(l2e) & L2_DISALLOW_MASK)) )
+    {
+        gdprintk(XENLOG_WARNING, "Bad L2 flags %x\n",
+                 l2e_get_flags(l2e) & L2_DISALLOW_MASK);
+        return -EINVAL;
+    }
+
+    if ( !(l2e_get_flags(l2e) & _PAGE_PSE) )
+    {
+        rc = get_page_and_type_from_mfn(_mfn(mfn), PGT_l1_page_table, d, 0,
+                                        false);
+        if ( unlikely(rc == -EINVAL) && get_l2_linear_pagetable(l2e, pfn, d) )
+            rc = 0;
+        return rc;
+    }
+
+    return -EINVAL;
+}
+
+/*
+ * get_page_from_l3e returns:
+ *   1 => page not present
+ *   0 => success
+ *  <0 => error code
+ */
+define_get_linear_pagetable(l3);
+int get_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, struct domain *d,
+                      int partial)
+{
+    int rc;
+
+    if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) )
+        return 1;
+
+    if ( unlikely((l3e_get_flags(l3e) & l3_disallow_mask(d))) )
+    {
+        gdprintk(XENLOG_WARNING, "Bad L3 flags %x\n",
+                 l3e_get_flags(l3e) & l3_disallow_mask(d));
+        return -EINVAL;
+    }
+
+    rc = get_page_and_type_from_mfn(_mfn(l3e_get_pfn(l3e)), PGT_l2_page_table,
+                                    d, partial, true);
+    if ( unlikely(rc == -EINVAL) &&
+         !is_pv_32bit_domain(d) &&
+         get_l3_linear_pagetable(l3e, pfn, d) )
+        rc = 0;
+
+    return rc;
+}
+
+/*
+ * get_page_from_l4e returns:
+ *   1 => page not present
+ *   0 => success
+ *  <0 => error code
+ */
+define_get_linear_pagetable(l4);
+int get_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, struct domain *d,
+                      int partial)
+{
+    int rc;
+
+    if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
+        return 1;
+
+    if ( unlikely((l4e_get_flags(l4e) & L4_DISALLOW_MASK)) )
+    {
+        gdprintk(XENLOG_WARNING, "Bad L4 flags %x\n",
+                 l4e_get_flags(l4e) & L4_DISALLOW_MASK);
+        return -EINVAL;
+    }
+
+    rc = get_page_and_type_from_mfn(_mfn(l4e_get_pfn(l4e)), PGT_l3_page_table,
+                                    d, partial, true);
+    if ( unlikely(rc == -EINVAL) && get_l4_linear_pagetable(l4e, pfn, d) )
+        rc = 0;
+
+    return rc;
+}
+
+/*
+ * NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'.
+ * Note also that this automatically deals correctly with linear p.t.'s.
+ */
+int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
+{
+    if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) || (l2e_get_pfn(l2e) == pfn) )
+        return 1;
+
+    if ( l2e_get_flags(l2e) & _PAGE_PSE )
+    {
+        struct page_info *page = mfn_to_page(l2e_get_pfn(l2e));
+        unsigned int i;
+
+        for ( i = 0; i < (1u << PAGETABLE_ORDER); i++, page++ )
+            put_page_and_type(page);
+    } else
+        put_page_and_type(l2e_get_page(l2e));
+
+    return 0;
+}
+
+static void put_data_page(struct page_info *page, bool writeable)
+{
+    if ( writeable )
+        put_page_and_type(page);
+    else
+        put_page(page);
+}
+
+int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
+                      bool defer)
+{
+    struct page_info *pg;
+
+    if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) || (l3e_get_pfn(l3e) == pfn) )
+        return 1;
+
+    if ( unlikely(l3e_get_flags(l3e) & _PAGE_PSE) )
+    {
+        unsigned long mfn = l3e_get_pfn(l3e);
+        int writeable = l3e_get_flags(l3e) & _PAGE_RW;
+
+        ASSERT(!(mfn & ((1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1)));
+        do {
+            put_data_page(mfn_to_page(mfn), writeable);
+        } while ( ++mfn & ((1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1) );
+
+        return 0;
+    }
+
+    pg = l3e_get_page(l3e);
+
+    if ( unlikely(partial > 0) )
+    {
+        ASSERT(!defer);
+        return put_page_type_preemptible(pg);
+    }
+
+    if ( defer )
+    {
+        current->arch.old_guest_table = pg;
+        return 0;
+    }
+
+    return put_page_and_type_preemptible(pg);
+}
+
+int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
+                      bool defer)
+{
+    if ( (l4e_get_flags(l4e) & _PAGE_PRESENT) &&
+         (l4e_get_pfn(l4e) != pfn) )
+    {
+        struct page_info *pg = l4e_get_page(l4e);
+
+        if ( unlikely(partial > 0) )
+        {
+            ASSERT(!defer);
+            return put_page_type_preemptible(pg);
+        }
+
+        if ( defer )
+        {
+            current->arch.old_guest_table = pg;
+            return 0;
+        }
+
+        return put_page_and_type_preemptible(pg);
+    }
+    return 1;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 7480341240..4eeaf709c1 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -358,16 +358,6 @@ int  put_old_guest_table(struct vcpu *);
 int  get_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner,
                        struct domain *pg_owner);
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner);
-int get_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn, struct domain *d);
-int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn);
-int get_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, struct domain *d,
-                      int partial);
-int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
-                      bool defer);
-int get_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, struct domain *d,
-                      int partial);
-int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
-                      bool defer);
 void get_page_light(struct page_info *page);
 bool get_page_from_mfn(mfn_t mfn, struct domain *d);
 int get_page_and_type_from_mfn(mfn_t mfn, unsigned long type, struct domain *d,
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 1e2871fb58..acab7b7c42 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -102,6 +102,17 @@ int pv_free_page_type(struct page_info *page, unsigned long type,
 
 void pv_invalidate_shadow_ldt(struct vcpu *v, bool flush);
 
+int get_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn, struct domain *d);
+int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn);
+int get_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, struct domain *d,
+                      int partial);
+int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn, int partial,
+                      bool defer);
+int get_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, struct domain *d,
+                      int partial);
+int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn, int partial,
+                      bool defer);
+
 #else
 
 #include <xen/errno.h>
@@ -143,6 +154,24 @@ static inline int pv_free_page_type(struct page_info *page, unsigned long type,
 
 static inline void pv_invalidate_shadow_ldt(struct vcpu *v, bool flush) {}
 
+static inline int get_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn,
+                                    struct domain *d)
+{ return -EINVAL; }
+static inline int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
+{ return -EINVAL; }
+static inline int get_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
+                                    struct domain *d, int partial)
+{ return -EINVAL; }
+static inline int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
+                                    int partial, bool defer)
+{ return -EINVAL; }
+static inline int get_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
+                                    struct domain *d, int partial)
+{ return -EINVAL; }
+static inline int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
+                                    int partial, bool defer)
+{ return -EINVAL; }
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0
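
The get/put helpers moved above share the 1 / 0 / <0 return convention
spelled out in their comments: 1 means the entry was not present (no
reference taken), 0 means a reference was acquired, and a negative value
is an error.  As a minimal caller sketch (validate_l2_slot() is a
hypothetical name, not part of the series), a get is paired with its put
like this:

    static int validate_l2_slot(l2_pgentry_t l2e, unsigned long pfn,
                                struct domain *d)
    {
        int rc = get_page_from_l2e(l2e, pfn, d);

        if ( rc == 1 )   /* Not present: no reference was taken. */
            return 0;
        if ( rc < 0 )    /* Bad flags etc.: propagate the error. */
            return rc;

        /*
         * rc == 0: a reference is now held on the mapped frame and must
         * eventually be dropped via put_page_from_l2e().
         */
        return 0;
    }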

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 04/31] x86/mm: lift PAGE_CACHE_ATTRS to page.h
  2017-08-17 14:44 ` [PATCH v4 04/31] x86/mm: lift PAGE_CACHE_ATTRS to page.h Wei Liu
@ 2017-08-17 14:46   ` Andrew Cooper
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Cooper @ 2017-08-17 14:46 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: George Dunlap, Jan Beulich

On 17/08/17 15:44, Wei Liu wrote:
> Currently all the users are within x86/mm.c, but that will change once
> we split the PV-specific mm code into another file.  Lift the
> definition to page.h alongside _PAGE_* in preparation for later
> patches.
>
> No functional change. Add some spaces around "|" while moving.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c
  2017-08-17 14:44 ` [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c Wei Liu
@ 2017-08-17 14:53   ` Andrew Cooper
  2017-08-18 10:08   ` Jan Beulich
  1 sibling, 0 replies; 53+ messages in thread
From: Andrew Cooper @ 2017-08-17 14:53 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: George Dunlap, Jan Beulich

On 17/08/17 15:44, Wei Liu wrote:
> Export it via pv/emulate.h.  This requires including pv/emulate.h in
> x86/mm.c.
>
> Said function will be used by different emulation handlers in later
> patches.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c
  2017-08-17 14:44 ` [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c Wei Liu
  2017-08-17 14:53   ` Andrew Cooper
@ 2017-08-18 10:08   ` Jan Beulich
  2017-08-18 12:08     ` Wei Liu
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2017-08-18 10:08 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> @@ -5138,13 +5140,6 @@ static int ptwr_emulated_cmpxchg(
>          container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
>  }
>  
> -static int pv_emul_is_mem_write(const struct x86_emulate_state *state,
> -                                struct x86_emulate_ctxt *ctxt)
> -{
> -    return x86_insn_is_mem_write(state, ctxt) ? X86EMUL_OKAY
> -                                              : X86EMUL_UNHANDLEABLE;
> -}

If it can't be static anymore, and considering it's just a wrapper
around another function call, would there be anything wrong with
making it an inline function in the header?

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping
  2017-08-17 14:44 ` [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping Wei Liu
@ 2017-08-18 10:12   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-18 10:12 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> At the same time, make create_grant_host_mapping an inline function.
> This requires making create_grant_{hvm,pv}_mapping non-static.
> Provide {hvm,pv}/grant_table.h and include the headers where
> necessary.
> 
> The two functions create_grant_{hvm,pv}_mapping will be moved later in
> a dedicated patch with all their helpers.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

With the function name in the description above corrected (it's
_p2m_ rather than _hvm_)
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 02/31] x86/mm: carve out replace_grant_pv_mapping
  2017-08-17 14:44 ` [PATCH v4 02/31] x86/mm: carve out replace_grant_pv_mapping Wei Liu
@ 2017-08-18 10:14   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-18 10:14 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> At the same time, make it an inline function.  Add declarations of
> replace_grant_{hvm,pv}_mapping to the respective header files.

Same remark on function name.

> @@ -38,6 +40,12 @@ static inline int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
>      return GNTST_general_error;
>  }
>  
> +int replace_grant_p2m_mapping(uint64_t addr, unsigned long frame,
> +                              uint64_t new_addr, unsigned int flags)

static inline

> @@ -37,6 +39,12 @@ static inline int create_grant_pv_mapping(uint64_t addr, unsigned long frame,
>      return GNTST_general_error;
>  }
>  
> +int replace_grant_pv_mapping(uint64_t addr, unsigned long frame,
> +                             uint64_t new_addr, unsigned int flags)

Again.

With these taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
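
For reference, the corrected stub would presumably mirror the existing
create_grant_p2m_mapping() one quoted above, i.e. something along these
lines (a sketch, not the committed code):

    static inline int replace_grant_p2m_mapping(uint64_t addr, unsigned long frame,
                                                uint64_t new_addr, unsigned int flags)
    {
        /* Stub for builds where this support is compiled out (assumed). */
        return GNTST_general_error;
    }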

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c
  2017-08-17 14:44 ` [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c Wei Liu
@ 2017-08-18 10:16   ` Jan Beulich
  2017-08-18 10:26   ` Andrew Cooper
  1 sibling, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-18 10:16 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Looks to be pure code movement, so
Acked-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 05/31] x86/mm: document the return values from get_page_from_l*e
  2017-08-17 14:44 ` [PATCH v4 05/31] x86/mm: document the return values from get_page_from_l*e Wei Liu
@ 2017-08-18 10:24   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-18 10:24 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c
  2017-08-17 14:44 ` [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c Wei Liu
  2017-08-18 10:16   ` Jan Beulich
@ 2017-08-18 10:26   ` Andrew Cooper
  1 sibling, 0 replies; 53+ messages in thread
From: Andrew Cooper @ 2017-08-18 10:26 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: George Dunlap, Jan Beulich

On 17/08/17 15:44, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/hvm/Makefile      |  1 +
>  xen/arch/x86/hvm/grant_table.c | 89 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/mm.c              | 53 -------------------------
>  3 files changed, 90 insertions(+), 53 deletions(-)
>  create mode 100644 xen/arch/x86/hvm/grant_table.c
>
> diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
> index c394af7364..5bd38f633f 100644
> --- a/xen/arch/x86/hvm/Makefile
> +++ b/xen/arch/x86/hvm/Makefile
> @@ -6,6 +6,7 @@ obj-y += dm.o
>  obj-bin-y += dom0_build.init.o
>  obj-y += domain.o
>  obj-y += emulate.o
> +obj-y += grant_table.o
>  obj-y += hpet.o
>  obj-y += hvm.o
>  obj-y += hypercall.o
> diff --git a/xen/arch/x86/hvm/grant_table.c b/xen/arch/x86/hvm/grant_table.c
> new file mode 100644
> index 0000000000..7503c2c61b
> --- /dev/null
> +++ b/xen/arch/x86/hvm/grant_table.c
> @@ -0,0 +1,89 @@
> +/******************************************************************************
> + * arch/x86/hvm/grant_table.c
> + *
> + * Grant table interfaces for HVM guests
> + *
> + * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/types.h>
> +
> +#include <public/grant_table.h>
> +
> +#include <asm/p2m.h>
> +
> +int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
> +                             unsigned int flags,
> +                             unsigned int cache_flags)
> +{
> +    p2m_type_t p2mt;
> +    int rc;
> +
> +    if ( cache_flags  || (flags & ~GNTMAP_readonly) != GNTMAP_host_map )

Mind dropping this double space while moving?

~Andrew

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c
  2017-08-18 10:08   ` Jan Beulich
@ 2017-08-18 12:08     ` Wei Liu
  2017-08-18 12:13       ` Andrew Cooper
  0 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-18 12:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, WeiLiu, Xen-devel

On Fri, Aug 18, 2017 at 04:08:44AM -0600, Jan Beulich wrote:
> >>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> > @@ -5138,13 +5140,6 @@ static int ptwr_emulated_cmpxchg(
> >          container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
> >  }
> >  
> > -static int pv_emul_is_mem_write(const struct x86_emulate_state *state,
> > -                                struct x86_emulate_ctxt *ctxt)
> > -{
> > -    return x86_insn_is_mem_write(state, ctxt) ? X86EMUL_OKAY
> > -                                              : X86EMUL_UNHANDLEABLE;
> > -}
> 
> If it can't be static anymore, and considering it's just a wrapper
> around another function call, would there be anything wrong with
> making it an inline function in the header?

Yes it can be done:

---8<---
From 129ea54249114f97fe66b85672f1710c5f63f604 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Wed, 19 Jul 2017 16:15:48 +0100
Subject: [PATCH] x86: move pv_emul_is_mem_write to pv/emulate.h

Make it a static inline function in pv/emulate.h.  This requires
including pv/emulate.h in x86/mm.c.

Said function will be used by different emulation handlers in later
patches.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm.c         | 9 ++-------
 xen/arch/x86/pv/emulate.h | 9 +++++++++
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5983a56811..e0e655ac31 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -126,6 +126,8 @@
 #include <asm/hvm/grant_table.h>
 #include <asm/pv/grant_table.h>
 
+#include "pv/emulate.h"
+
 /* Mapping of the fixmap space needed early. */
 l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     l1_fixmap[L1_PAGETABLE_ENTRIES];
@@ -5138,13 +5140,6 @@ static int ptwr_emulated_cmpxchg(
         container_of(ctxt, struct ptwr_emulate_ctxt, ctxt));
 }
 
-static int pv_emul_is_mem_write(const struct x86_emulate_state *state,
-                                struct x86_emulate_ctxt *ctxt)
-{
-    return x86_insn_is_mem_write(state, ctxt) ? X86EMUL_OKAY
-                                              : X86EMUL_UNHANDLEABLE;
-}
-
 static const struct x86_emulate_ops ptwr_emulate_ops = {
     .read       = ptwr_emulated_read,
     .insn_fetch = ptwr_emulated_read,
diff --git a/xen/arch/x86/pv/emulate.h b/xen/arch/x86/pv/emulate.h
index b2b1192d48..656c12f62d 100644
--- a/xen/arch/x86/pv/emulate.h
+++ b/xen/arch/x86/pv/emulate.h
@@ -1,10 +1,19 @@
 #ifndef __PV_EMULATE_H__
 #define __PV_EMULATE_H__
 
+#include <asm/x86_emulate.h>
+
 int pv_emul_read_descriptor(unsigned int sel, const struct vcpu *v,
                             unsigned long *base, unsigned long *limit,
                             unsigned int *ar, bool insn_fetch);
 
 void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip);
 
+static inline int pv_emul_is_mem_write(const struct x86_emulate_state *state,
+                                       struct x86_emulate_ctxt *ctxt)
+{
+    return x86_insn_is_mem_write(state, ctxt) ? X86EMUL_OKAY
+                                              : X86EMUL_UNHANDLEABLE;
+}
+
 #endif /* __PV_EMULATE_H__ */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c
  2017-08-18 12:08     ` Wei Liu
@ 2017-08-18 12:13       ` Andrew Cooper
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Cooper @ 2017-08-18 12:13 UTC (permalink / raw)
  To: Wei Liu, Jan Beulich; +Cc: George Dunlap, Xen-devel

On 18/08/17 13:08, Wei Liu wrote:
> On Fri, Aug 18, 2017 at 04:08:44AM -0600, Jan Beulich wrote:
>>
>> If it can't be static anymore, and considering it's just a wrapper
>> around another function call, would there be anything wrong with
>> making it an inline function in the header?
> Yes it can be done:
>
> ---8<---
> From 129ea54249114f97fe66b85672f1710c5f63f604 Mon Sep 17 00:00:00 2001
> From: Wei Liu <wei.liu2@citrix.com>
> Date: Wed, 19 Jul 2017 16:15:48 +0100
> Subject: [PATCH] x86: move pv_emul_is_mem_write to pv/emulate.h
>
> Make it a static inline function in pv/emulate.h.  This requires
> including pv/emulate.h in x86/mm.c.
>
> Said function will be used by different emulation handlers in later
> patches.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 07/31] x86/mm: move and rename guest_get_eff{, kern}_l1e
  2017-08-17 14:44 ` [PATCH v4 07/31] x86/mm: move and rename guest_get_eff{, kern}_l1e Wei Liu
@ 2017-08-24 14:52   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-24 14:52 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> Move them to pv/mm.c and rename them to pv_get_guest_eff_{,kern}_l1e.
> Export them via pv/mm.h.

I think these should remain static inlines. If either is really needed by
more than one C file, it may be best to move it/them to a private
header in x86/pv/. But guest_get_eff_kern_l1e() clearly has just a
single caller.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 08/31] x86/mm: export get_page_from_mfn
  2017-08-17 14:44 ` [PATCH v4 08/31] x86/mm: export get_page_from_mfn Wei Liu
@ 2017-08-24 14:55   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-24 14:55 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> It will be used by different files later, so export it via
> asm-x86/mm.h.

This is a pretty thin wrapper - wouldn't it be better to make it a
static inline?

Jan
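
The static inline form suggested here might look like the following
sketch, assuming the mm.c implementation is the usual
mfn_valid()/get_page() pair:

    static inline bool get_page_from_mfn(mfn_t mfn, struct domain *d)
    {
        struct page_info *page = mfn_to_page(mfn_x(mfn));

        /* Reject invalid frames and frames not owned by d. */
        if ( unlikely(!mfn_valid(mfn)) || unlikely(!get_page(page, d)) )
            return false;

        return true;
    }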

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 09/31] x86/mm: rename and move update_intpte
  2017-08-17 14:44 ` [PATCH v4 09/31] x86/mm: rename and move update_intpte Wei Liu
@ 2017-08-24 14:59   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-24 14:59 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> That function is used only by PV guest support code, so add the pv_
> prefix.

Is any code outside of x86/pv/ going to need access to it? I hope not,
in which case it shouldn't be exposed via include/asm-x86/pv/mm.h,
but via a private header in x86/pv/. That non-private header should
have only declarations of things needed by non-PV/HVM-specific code.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 10/31] x86/mm: move {un, }adjust_guest_* to pv/mm.h
  2017-08-17 14:44 ` [PATCH v4 10/31] x86/mm: move {un, }adjust_guest_* to pv/mm.h Wei Liu
@ 2017-08-24 15:00   ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2017-08-24 15:00 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> Those macros will soon be used in different files.  They are
> PV-specific, so move them to pv/mm.h.

Same comment here regarding where to move them.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code
  2017-08-17 14:44 ` [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code Wei Liu
@ 2017-08-24 15:15   ` Jan Beulich
  2017-08-30 14:07     ` Wei Liu
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2017-08-24 15:15 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> Move the code to pv/emul-ptwr-op.c. Fix coding style issues while
> moving the code.
> 
> Rename ptwr_emulated_read to pv_emul_ptwr_read and export it via
> pv/mm.h because it is needed by other emulation code.

If other emulation code uses it, the rename should also drop the ptwr
infix.  pv_emulated_read() perhaps?

> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/mm.c              | 308 +-------------------------------------
>  xen/arch/x86/pv/Makefile       |   1 +
>  xen/arch/x86/pv/emul-ptwr-op.c | 327 +++++++++++++++++++++++++++++++++++++++++

Would you mind calling this just ptwr.c?

> +/*************************
> + * Writable Pagetables
> + */
> +
> +struct ptwr_emulate_ctxt {
> +    struct x86_emulate_ctxt ctxt;
> +    unsigned long cr2;
> +    l1_pgentry_t  pte;
> +};
>[...]
> +static int ptwr_emulated_update(unsigned long addr, paddr_t old, paddr_t val,
> +                                unsigned int bytes, unsigned int do_cmpxchg,
> +                                struct ptwr_emulate_ctxt *ptwr_ctxt)

I've meanwhile noticed that in prior patches of yours such movement
was needlessly retaining the component prefixes. With you splitting
things into separate files, these aren't really useful anymore - stack
traces will have them disambiguated by being prefixed with their
file names. They merely eat valuable serial line bandwidth / ring
buffer space and clutter the (serial) log. I could accept the structure
tags to stay the way they are, but please shorten the local function
names as much as possible without losing information. That'll likely
mean dropping more than just the ptwr_ prefix.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 12/31] x86/mm: split out readonly MMIO emulation code
  2017-08-17 14:44 ` [PATCH v4 12/31] x86/mm: split out readonly MMIO " Wei Liu
@ 2017-08-24 15:16   ` Jan Beulich
  2017-08-24 15:25     ` Andrew Cooper
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2017-08-24 15:16 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> Move the code to pv/emul-mmio-op.c. Fix coding style issues while
> moving.
> 
> Note that mmio_ro_emulated_write is needed by both PV and HVM, so it
> is left in x86/mm.c.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/mm.c              | 129 --------------------------------
>  xen/arch/x86/pv/Makefile       |   1 +
>  xen/arch/x86/pv/emul-mmio-op.c | 166 +++++++++++++++++++++++++++++++++++++++++

Again I think just mmio.c would do. Other comments on earlier
patches apply here as well.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 12/31] x86/mm: split out readonly MMIO emulation code
  2017-08-24 15:16   ` Jan Beulich
@ 2017-08-24 15:25     ` Andrew Cooper
  2017-08-30 14:35       ` Wei Liu
  0 siblings, 1 reply; 53+ messages in thread
From: Andrew Cooper @ 2017-08-24 15:25 UTC (permalink / raw)
  To: Jan Beulich, WeiLiu; +Cc: George Dunlap, Xen-devel

On 24/08/17 16:16, Jan Beulich wrote:
>>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
>> Move the code to pv/emul-mmio-op.c. Fix coding style issues while
>> moving.
>>
>> Note that mmio_ro_emulated_write is needed by both PV and HVM, so it
>> is left in x86/mm.c.
>>
>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>> ---
>>  xen/arch/x86/mm.c              | 129 --------------------------------
>>  xen/arch/x86/pv/Makefile       |   1 +
>>  xen/arch/x86/pv/emul-mmio-op.c | 166 +++++++++++++++++++++++++++++++++++++++++
> Again I think just mmio.c would do. Other comments on earlier
> patches apply here as well.

I think it would be wise to merge the ptwr and mmio handling.  At the
moment, we invoke a full lookup pte/decode/try-to-emulate cycle twice in
the #PF handler for PV guests before handing the fault back to the guest.

The correct ops and context can be determined by inspecting the l1e
under %cr2 before calling into any emulation code.

Simplifying this logic before moving it would be the better option.

~Andrew
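
A rough sketch of that single-lookup dispatch (select_pf_emul_ops() is a
hypothetical name; it assumes the mmio_ro_ranges rangeset and the
existing ptwr/mmio-ro x86_emulate_ops tables are visible to the caller):

    static const struct x86_emulate_ops *
    select_pf_emul_ops(l1_pgentry_t pte)
    {
        unsigned long mfn = l1e_get_pfn(pte);

        if ( !(l1e_get_flags(pte) & _PAGE_PRESENT) )
            return NULL;                    /* Hand the fault back. */

        /* Read-only MMIO: the mapped frame sits in mmio_ro_ranges. */
        if ( rangeset_contains_singleton(mmio_ro_ranges, mfn) )
            return &mmio_ro_emulate_ops;

        /* Writable pagetable: the mapped frame is an L1 page table. */
        if ( (mfn_to_page(mfn)->u.inuse.type_info & PGT_type_mask) ==
             PGT_l1_page_table )
            return &ptwr_emulate_ops;

        return NULL;
    }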

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code
  2017-08-24 15:15   ` Jan Beulich
@ 2017-08-30 14:07     ` Wei Liu
  2017-08-30 15:23       ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2017-08-30 14:07 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, WeiLiu, Xen-devel

On Thu, Aug 24, 2017 at 09:15:36AM -0600, Jan Beulich wrote:
> >>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> > Move the code to pv/emul-ptwr-op.c. Fix coding style issues while
> > moving the code.
> > 
> > Rename ptwr_emulated_read to pv_emul_ptwr_read and export it via
> > pv/mm.h because it is needed by other emulation code.
> 
> If other emulated code uses it, renaming the function would better
> imply dropping the ptwr infix from it. pv_emulated_read() perhaps?
> 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/arch/x86/mm.c              | 308 +-------------------------------------
> >  xen/arch/x86/pv/Makefile       |   1 +
> >  xen/arch/x86/pv/emul-ptwr-op.c | 327 +++++++++++++++++++++++++++++++++++++++++
> 
> Would you mind calling this just ptwr.c?

That seems to deviate from the naming scheme we decided on, but I don't
think I care.

> 
> > +/*************************
> > + * Writable Pagetables
> > + */
> > +
> > +struct ptwr_emulate_ctxt {
> > +    struct x86_emulate_ctxt ctxt;
> > +    unsigned long cr2;
> > +    l1_pgentry_t  pte;
> > +};
> >[...]
> > +static int ptwr_emulated_update(unsigned long addr, paddr_t old, paddr_t val,
> > +                                unsigned int bytes, unsigned int do_cmpxchg,
> > +                                struct ptwr_emulate_ctxt *ptwr_ctxt)
> 
> I've meanwhile noticed that in prior patches of yours such movement
> was needlessly retaining the component prefixes. With you splitting
> things into separate files, these aren't really useful anymore - stack
> traces will have them disambiguated by being prefixed with their
> file names. They merely eat valuable serial line bandwidth / ring
> buffer space and clutter the (serial) log. I could accept the structure
> tags to stay the way they are, but please shorten the local function
> names as much as possible without losing information. That'll likely
> mean dropping more than just the ptwr_ prefix.
> 

No problem.

Do you want me to change the ones I already moved? If so, I will do it
before we release 4.10.

> Jan
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 12/31] x86/mm: split out readonly MMIO emulation code
  2017-08-24 15:25     ` Andrew Cooper
@ 2017-08-30 14:35       ` Wei Liu
  0 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-30 14:35 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Xen-devel, WeiLiu, Jan Beulich

On Thu, Aug 24, 2017 at 04:25:23PM +0100, Andrew Cooper wrote:
> On 24/08/17 16:16, Jan Beulich wrote:
> >>>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> >> Move the code to pv/emul-mmio-op.c. Fix coding style issues while
> >> moving.
> >>
> >> Note that mmio_ro_emulated_write is needed by both PV and HVM, so it
> >> is left in x86/mm.c.
> >>
> >> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> >> ---
> >>  xen/arch/x86/mm.c              | 129 --------------------------------
> >>  xen/arch/x86/pv/Makefile       |   1 +
> >>  xen/arch/x86/pv/emul-mmio-op.c | 166 +++++++++++++++++++++++++++++++++++++++++
> > Again I think just mmio.c would do. Other comments on earlier
> > patches apply here as well.
> 
> I think it would be wise to merge the ptwr and mmio handling.  At the
> moment, we invoke a full lookup pte/decode/try-to-emulate cycle twice in
> the #PF handler for PV guests before handing the fault back to the guest.
> 
> The correct ops and context can be determined by inspecting the l1e
> under %cr2 before calling into any emulation code.

I will see what I can do. The predicates in traps.c look terribly
complicated.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code
  2017-08-30 14:07     ` Wei Liu
@ 2017-08-30 15:23       ` Jan Beulich
  2017-08-30 15:43         ` Wei Liu
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2017-08-30 15:23 UTC (permalink / raw)
  To: WeiLiu; +Cc: George Dunlap, Andrew Cooper, Xen-devel

>>> On 30.08.17 at 16:07, <wei.liu2@citrix.com> wrote:
> On Thu, Aug 24, 2017 at 09:15:36AM -0600, Jan Beulich wrote:
>> >>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
>> > +/*************************
>> > + * Writable Pagetables
>> > + */
>> > +
>> > +struct ptwr_emulate_ctxt {
>> > +    struct x86_emulate_ctxt ctxt;
>> > +    unsigned long cr2;
>> > +    l1_pgentry_t  pte;
>> > +};
>> >[...]
>> > +static int ptwr_emulated_update(unsigned long addr, paddr_t old, paddr_t val,
>> > +                                unsigned int bytes, unsigned int do_cmpxchg,
>> > +                                struct ptwr_emulate_ctxt *ptwr_ctxt)
>> 
>> I've meanwhile noticed that in prior patches of yours such movement
>> was needlessly retaining the component prefixes. With you splitting
>> things into separate files, these aren't really useful anymore - stack
>> traces will have them disambiguated by being prefixed with their
>> file names. They merely eat valuable serial line bandwidth / ring
>> buffer space and clutter the (serial) log. I could accept the structure
>> tags to stay the way they are, but please shorten the local function
>> names as much as possible without losing information. That'll likely
>> mean dropping more than just the ptwr_ prefix.
> 
> No problem.
> 
> Do you want me to change the ones I already moved? If so, I will do it
> before we release 4.10.

I'd likely be doing it at some point myself, so if you're willing to
do it, I would of course appreciate it.

Jan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code
  2017-08-30 15:23       ` Jan Beulich
@ 2017-08-30 15:43         ` Wei Liu
  0 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2017-08-30 15:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, WeiLiu, Xen-devel

On Wed, Aug 30, 2017 at 09:23:20AM -0600, Jan Beulich wrote:
> >>> On 30.08.17 at 16:07, <wei.liu2@citrix.com> wrote:
> > On Thu, Aug 24, 2017 at 09:15:36AM -0600, Jan Beulich wrote:
> >> >>> On 17.08.17 at 16:44, <wei.liu2@citrix.com> wrote:
> >> > +/*************************
> >> > + * Writable Pagetables
> >> > + */
> >> > +
> >> > +struct ptwr_emulate_ctxt {
> >> > +    struct x86_emulate_ctxt ctxt;
> >> > +    unsigned long cr2;
> >> > +    l1_pgentry_t  pte;
> >> > +};
> >> >[...]
> >> > +static int ptwr_emulated_update(unsigned long addr, paddr_t old, paddr_t val,
> >> > +                                unsigned int bytes, unsigned int do_cmpxchg,
> >> > +                                struct ptwr_emulate_ctxt *ptwr_ctxt)
> >> 
> >> I've meanwhile noticed that in prior patches of yours such movement
> >> was needlessly retaining the component prefixes. With you splitting
> >> things into separate files, these aren't really useful anymore - stack
> >> traces will have them disambiguated by being prefixed with their
> >> file names. They merely eat valuable serial line bandwidth / ring
> >> buffer space and clutter the (serial) log. I could accept the structure
> >> tags to stay the way they are, but please shorten the local function
> >> names as much as possible without losing information. That'll likely
> >> mean dropping more than just the ptwr_ prefix.
> > 
> > No problem.
> > 
> > Do you want me to change the ones I already moved? If so, I will do it
> > before we release 4.10.
> 
> I'd likely be doing it at some point myself, so if you're willing to
> do it, I would of course appreciate it.
> 

Sure, I can do that tomorrow or the day after tomorrow.

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread

Thread overview: 53+ messages
2017-08-17 14:44 [PATCH v4 00/31] x86: refactor mm.c Wei Liu
2017-08-17 14:44 ` [PATCH v4 01/31] x86/mm: carve out create_grant_pv_mapping Wei Liu
2017-08-18 10:12   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 02/31] x86/mm: carve out replace_grant_pv_mapping Wei Liu
2017-08-18 10:14   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 03/31] x86/mm: split HVM grant table code to hvm/grant_table.c Wei Liu
2017-08-18 10:16   ` Jan Beulich
2017-08-18 10:26   ` Andrew Cooper
2017-08-17 14:44 ` [PATCH v4 04/31] x86/mm: lift PAGE_CACHE_ATTRS to page.h Wei Liu
2017-08-17 14:46   ` Andrew Cooper
2017-08-17 14:44 ` [PATCH v4 05/31] x86/mm: document the return values from get_page_from_l*e Wei Liu
2017-08-18 10:24   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 06/31] x86: move pv_emul_is_mem_write to pv/emulate.c Wei Liu
2017-08-17 14:53   ` Andrew Cooper
2017-08-18 10:08   ` Jan Beulich
2017-08-18 12:08     ` Wei Liu
2017-08-18 12:13       ` Andrew Cooper
2017-08-17 14:44 ` [PATCH v4 07/31] x86/mm: move and rename guest_get_eff{, kern}_l1e Wei Liu
2017-08-24 14:52   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 08/31] x86/mm: export get_page_from_mfn Wei Liu
2017-08-24 14:55   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 09/31] x86/mm: rename and move update_intpte Wei Liu
2017-08-24 14:59   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 10/31] x86/mm: move {un, }adjust_guest_* to pv/mm.h Wei Liu
2017-08-24 15:00   ` Jan Beulich
2017-08-17 14:44 ` [PATCH v4 11/31] x86/mm: split out writable pagetable emulation code Wei Liu
2017-08-24 15:15   ` Jan Beulich
2017-08-30 14:07     ` Wei Liu
2017-08-30 15:23       ` Jan Beulich
2017-08-30 15:43         ` Wei Liu
2017-08-17 14:44 ` [PATCH v4 12/31] x86/mm: split out readonly MMIO " Wei Liu
2017-08-24 15:16   ` Jan Beulich
2017-08-24 15:25     ` Andrew Cooper
2017-08-30 14:35       ` Wei Liu
2017-08-17 14:44 ` [PATCH v4 13/31] x86/mm: remove the unused inclusion of pv/emulate.h Wei Liu
2017-08-17 14:44 ` [PATCH v4 14/31] x86/mm: move and rename guest_{, un}map_l1e Wei Liu
2017-08-17 14:44 ` [PATCH v4 15/31] x86/mm: split out PV grant table code Wei Liu
2017-08-17 14:44 ` [PATCH v4 16/31] x86/mm: split out descriptor " Wei Liu
2017-08-17 14:44 ` [PATCH v4 17/31] x86/mm: move compat descriptor handling code Wei Liu
2017-08-17 14:44 ` [PATCH v4 18/31] x86/mm: move and rename map_ldt_shadow_page Wei Liu
2017-08-17 14:44 ` [PATCH v4 19/31] x86/mm: factor out pv_arch_init_memory Wei Liu
2017-08-17 14:44 ` [PATCH v4 20/31] x86/mm: move l4 table setup code Wei Liu
2017-08-17 14:44 ` [PATCH v4 21/31] x86/mm: add "pv_" prefix to new_guest_cr3 Wei Liu
2017-08-17 14:44 ` [PATCH v4 22/31] x86: add pv_ prefix to {alloc, free}_page_type Wei Liu
2017-08-17 14:44 ` [PATCH v4 23/31] x86/mm: export more get/put page functions Wei Liu
2017-08-17 14:44 ` [PATCH v4 24/31] x86/mm: move and add pv_ prefix to create_pae_xen_mappings Wei Liu
2017-08-17 14:44 ` [PATCH v4 25/31] x86/mm: move disallow_mask variable and macros Wei Liu
2017-08-17 14:44 ` [PATCH v4 26/31] x86/mm: move pv_{alloc, free}_page_type Wei Liu
2017-08-17 14:44 ` [PATCH v4 27/31] x86/mm: move and add pv_ prefix to invalidate_shadow_ldt Wei Liu
2017-08-17 14:44 ` [PATCH v4 28/31] x86/mm: move PV hypercalls to pv/mm-hypercalls.c Wei Liu
2017-08-17 14:44 ` [PATCH v4 29/31] x86/mm: remove the now unused inclusion of pv/mm.h Wei Liu
2017-08-17 14:44 ` [PATCH v4 30/31] x86/mm: use put_page_type_preemptible in put_page_from_l{2, 3}e Wei Liu
2017-08-17 14:44 ` [PATCH v4 31/31] x86/mm: move {get, put}_page_from_l{2, 3, 4}e Wei Liu
