* [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes
@ 2019-02-11 17:46 Roger Pau Monne
  2019-02-11 17:46 ` [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address Roger Pau Monne
                   ` (7 more replies)
  0 siblings, 8 replies; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne

Hello,

The following series contains fixes that should be considered for 4.12.

I'm not sure whether patches 5, 6 and 7 should be aimed at 4.12, as they
contain changes to the p2m code that could affect HVM guests. Note that
without those changes a PVH dom0 running on AMD hardware will be unable
to create guests. Overall the patches are a nice cleanup to the handling
of p2m_ioreq_server and p2m_map_foreign types.

The series can also be found at:

git://xenbits.xen.org/people/royger/xen.git fixes-4.12-v2.1

Roger Pau Monne (7):
  dom0/pvh: align allocation and mapping order to start address
  amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO
    entries
  x86/pvh: reorder PVH dom0 iommu initialization
  pvh/dom0: warn when dom0_mem is not set
  x86/mm: split p2m ioreq server pages special handling into helper
  x86/mm: handle foreign mappings in p2m_entry_modify
  npt/shadow: allow getting foreign page table entries

 xen/arch/x86/dom0_build.c           |  10 ++
 xen/arch/x86/hvm/dom0_build.c       |  37 +++++---
 xen/arch/x86/mm/hap/hap.c           |   4 +
 xen/arch/x86/mm/p2m-ept.c           | 137 ++++++----------------------
 xen/arch/x86/mm/p2m-pt.c            |  54 ++++-------
 xen/arch/x86/mm/shadow/common.c     |   4 +
 xen/drivers/passthrough/x86/iommu.c |   7 +-
 xen/include/asm-x86/p2m.h           |  45 +++++++++
 8 files changed, 135 insertions(+), 163 deletions(-)

-- 
2.20.1



* [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-12 10:52   ` Wei Liu
  2019-02-13 15:32   ` Jan Beulich
  2019-02-11 17:46 ` [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries Roger Pau Monne
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

The p2m and iommu mapping code always had the requirement that
addresses and orders must be aligned when populating the p2m or the
iommu page tables.

The PVH dom0 builder didn't take this requirement into account, and could
call into the p2m/iommu mapping helpers with addresses and orders that
are not aligned.

Fix this by making sure the orders passed to the physmap population
helpers are always aligned to the guest address to be populated.
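
As an aside, a minimal standalone sketch of the clamping the patch adds
(not the Xen code itself: __builtin_ctzl stands in for Xen's
find_first_set_bit, and the MAX_ORDER value used is only assumed):

    #include <stdio.h>

    /* Largest order usable for a mapping starting at pfn 'start'. */
    static unsigned int align_order(unsigned long start, unsigned int order)
    {
        unsigned int addr_order = start ? __builtin_ctzl(start)
                                        : 18 /* assumed MAX_ORDER */;

        return order < addr_order ? order : addr_order;
    }

    int main(void)
    {
        /* An odd start pfn can only take 4K (order 0) mappings. */
        printf("%u\n", align_order(0x1003, 9)); /* prints 0 */
        /* A 2M-aligned start pfn can take the full order 9 mapping. */
        printf("%u\n", align_order(0x1400, 9)); /* prints 9 */
        return 0;
    }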

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Without this patch trying to create a PVH dom0 will trigger an assert
on certain hardware depending on the memory map.
---
Changes since v1:
 - Reword commit message.
---
 xen/arch/x86/hvm/dom0_build.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 51cf490811..a571d15c13 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -152,6 +152,8 @@ static int __init pvh_populate_memory_range(struct domain *d,
 
         order = get_order_from_pages(end - start + 1);
         order = min(order ? order - 1 : 0, max_order);
+        /* The order allocated and populated must be aligned to the address. */
+        order = min(order, start ? find_first_set_bit(start) : MAX_ORDER);
         page = alloc_domheap_pages(d, order, dom0_memflags | MEMF_no_scrub);
         if ( page == NULL )
         {
-- 
2.20.1



* [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
  2019-02-11 17:46 ` [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-13 15:53   ` Jan Beulich
  2019-02-11 17:46 ` [PATCH for-4.12 v2 3/7] x86/pvh: reorder PVH dom0 iommu initialization Roger Pau Monne
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Andrew Cooper,
	George Dunlap, Jan Beulich, Roger Pau Monne

The assert was originally added to make sure that higher order
regions (> PAGE_ORDER_4K) could not be used to bypass the
mmio_ro_ranges check performed by p2m_type_to_flags.

This however is already checked in set_mmio_p2m_entry, which makes
sure that higher order mappings don't overlap with mmio_ro_ranges,
thus allowing the creation of high order MMIO mappings safely.

Replace the assert to allow 2M/1G entries to be created for MMIO
regions and add some extra asserts as a replacement to make sure
there's no overlapping with MMIO read-only ranges.

Note that 1G MMIO entries will not be created unless mmio_order is
changed to allow it.
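
To put numbers on the check added below (a tiny standalone sketch, not
Xen code; it only reproduces the arithmetic behind the range handed to
rangeset_overlaps_range):

    #include <stdio.h>

    #define PAGE_SHIFT  12
    #define MB(x)       ((unsigned long)(x) << 20)
    #define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

    int main(void)
    {
        /*
         * A 2M p2m entry starting at mfn covers PFN_DOWN(MB(2)) == 512
         * frames; the new assert requires that span not overlap
         * mmio_ro_ranges before a superpage MMIO mapping is created.
         */
        printf("frames covered by a 2M entry: %lu\n", PFN_DOWN(MB(2)));
        return 0;
    }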

Suggested-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
---
Without this patch trying to create a PVH dom0 will trigger an assert
on certain hardware depending on the memory map.
---
Changes since v1:
 - Fix subject.
 - Replace the assert with a suitable one.
---
 xen/arch/x86/mm/p2m-pt.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 12f92cf1f0..791600f6ba 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -576,7 +576,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         }
 
         ASSERT(p2m_flags_to_type(flags) != p2m_ioreq_server);
-        ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
+        if ( p2mt == p2m_mmio_direct )
+            ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
+                   !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
+                                            mfn_x(mfn) + PFN_DOWN(MB(2))));
+        else if ( p2m_allows_invalid_mfn(p2mt) || p2mt == p2m_invalid ||
+                  p2mt == p2m_mmio_dm )
+            ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
+        else
+            ASSERT(mfn_valid(mfn));
         l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
             ? p2m_l3e_from_pfn(mfn_x(mfn),
                                p2m_type_to_flags(p2m, p2mt, mfn, 2))
@@ -668,7 +676,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         }
 
         ASSERT(p2m_flags_to_type(flags) != p2m_ioreq_server);
-        ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
+        if ( p2mt == p2m_mmio_direct )
+            ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
+                   !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
+                                            mfn_x(mfn) + PFN_DOWN(MB(2))));
+        else if ( p2m_allows_invalid_mfn(p2mt) || p2mt == p2m_invalid ||
+                  p2mt == p2m_mmio_dm )
+            ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
+        else
+            ASSERT(mfn_valid(mfn));
         l2e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
             ? p2m_l2e_from_pfn(mfn_x(mfn),
                                p2m_type_to_flags(p2m, p2mt, mfn, 1))
-- 
2.20.1



* [PATCH for-4.12 v2 3/7] x86/pvh: reorder PVH dom0 iommu initialization
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
  2019-02-11 17:46 ` [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address Roger Pau Monne
  2019-02-11 17:46 ` [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-13 15:58   ` Jan Beulich
  2019-02-11 17:46 ` [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set Roger Pau Monne
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

So that the iommu is initialized before populating the p2m, and
entries added get the corresponding iommu page table entries if
required. This requires splitting the current pvh_setup_p2m into two
different functions: one that crafts the dom0 physmap and sets the
paging allocation, and another that actually populates the p2m with RAM
regions.

Note that this allows removing the special casing done for the low
1MB in hwdom_iommu_map.
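
The resulting flow in dom0_construct_pvh, as implemented by the diff
below (a sketch of the ordering only, with error handling elided):

    pvh_setup_mmcfg(d);       /* MMCFG first, so the iommu code can use it */
    pvh_init_p2m(d);          /* craft physmap, set the paging allocation */
    iommu_hwdom_init(d);      /* may add mappings (e.g. RMRRs) to the p2m */
    rc = pvh_populate_p2m(d); /* finally populate the p2m with RAM regions */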

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v1:
 - New in this version.
---
The previous flow, where the iommu was enabled after having populated
the p2m, was used to work around an issue on an old pre-Haswell Intel
box. I no longer have that box, and when I tested PVH dom0 on it, that
was before the iommu map-reserved fixes.

If hardware needing the iommu page tables to contain RAM regions is
found, I'm happy to work on other solutions, like performing the setup
of the iommu at the start of dom0_construct_pvh but only enabling it
when dom0 p2m is fully populated.
---
 xen/arch/x86/hvm/dom0_build.c       | 35 +++++++++++++++++++----------
 xen/drivers/passthrough/x86/iommu.c |  7 +-----
 2 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index a571d15c13..aa599f09ef 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -409,14 +409,10 @@ static __init void pvh_setup_e820(struct domain *d, unsigned long nr_pages)
     ASSERT(cur_pages == nr_pages);
 }
 
-static int __init pvh_setup_p2m(struct domain *d)
+static void __init pvh_init_p2m(struct domain *d)
 {
-    struct vcpu *v = d->vcpu[0];
     unsigned long nr_pages = dom0_compute_nr_pages(d, NULL, 0);
-    unsigned int i;
-    int rc;
     bool preempted;
-#define MB1_PAGES PFN_DOWN(MB(1))
 
     pvh_setup_e820(d, nr_pages);
     do {
@@ -425,6 +421,14 @@ static int __init pvh_setup_p2m(struct domain *d)
                               &preempted);
         process_pending_softirqs();
     } while ( preempted );
+}
+
+static int __init pvh_populate_p2m(struct domain *d)
+{
+    struct vcpu *v = d->vcpu[0];
+    unsigned int i;
+    int rc;
+#define MB1_PAGES PFN_DOWN(MB(1))
 
     /*
      * Memory below 1MB is identity mapped initially. RAM regions are
@@ -1134,13 +1138,6 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image,
 
     printk(XENLOG_INFO "*** Building a PVH Dom%d ***\n", d->domain_id);
 
-    rc = pvh_setup_p2m(d);
-    if ( rc )
-    {
-        printk("Failed to setup Dom0 physical memory map\n");
-        return rc;
-    }
-
     /*
      * NB: MMCFG initialization needs to be performed before iommu
      * initialization so the iommu code can fetch the MMCFG regions used by the
@@ -1148,8 +1145,22 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image,
      */
     pvh_setup_mmcfg(d);
 
+    /*
+     * Craft dom0 physical memory map and set the paging allocation. This must
+     * be done before the iommu initialization, since iommu initialization code
+     * will likely add mappings required by devices to the p2m (ie: RMRRs).
+     */
+    pvh_init_p2m(d);
+
     iommu_hwdom_init(d);
 
+    rc = pvh_populate_p2m(d);
+    if ( rc )
+    {
+        printk("Failed to setup Dom0 physical memory map\n");
+        return rc;
+    }
+
     rc = pvh_load_kernel(d, image, image_headroom, initrd, bootstrap_map(image),
                          cmdline, &entry, &start_info);
     if ( rc )
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index a88ef9b189..42b1a1bbc3 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -151,12 +151,7 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
      * inclusive mapping additionally maps in every pfn up to 4GB except those
      * that fall in unusable ranges for PV Dom0.
      */
-    if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) ||
-         /*
-          * Ignore any address below 1MB, that's already identity mapped by the
-          * Dom0 builder for HVM.
-          */
-         (!d->domain_id && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) )
+    if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
         return false;
 
     switch ( type = page_get_ram_type(mfn) )
-- 
2.20.1



* [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
                   ` (2 preceding siblings ...)
  2019-02-11 17:46 ` [PATCH for-4.12 v2 3/7] x86/pvh: reorder PVH dom0 iommu initialization Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-12 10:52   ` Wei Liu
  2019-02-13 16:01   ` Jan Beulich
  2019-02-11 17:46 ` [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special handling into helper Roger Pau Monne
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

There have been several reports of the dom0 builder running out of
memory when building a PVH dom0 without having specified a dom0_mem
value. Print a warning message if dom0_mem is not set when booting in
PVH mode.

This is a temporary workaround until accounting for internal memory
required by Xen (ie: paging structures) is improved.
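
For reference, the workaround the warning points at is to size dom0
explicitly on the Xen command line, for example (the exact value is
workload dependent and only illustrative here):

    dom0_mem=4096M,max:4096M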

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
---
Without this patch creating a PVH dom0 without a dom0_mem parameter
can result in the dom0 builder running out of memory, thus leading to a
Xen crash. The added message gives a hint to the user about a possible
fix.
---
Changes since v1:
 - Rewrite the warning message.
 - Check nr_pages in order to figure out if the amount of dom0 memory
   has been set.
---
 xen/arch/x86/dom0_build.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 2b4d9e9ea6..95d60d1163 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -378,8 +378,18 @@ unsigned long __init dom0_compute_nr_pages(
          * maximum of 128MB.
          */
         if ( !nr_pages )
+        {
             nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
                                  : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
+            if ( is_hvm_domain(d) )
+                /*
+                 * Temporary workaround message until internal (paging) memory
+                 * accounting required to build a pvh dom0 is improved.
+                 */
+                printk("WARNING: PVH dom0 without dom0_mem set is still unstable. "
+                       "If you get crashes during boot, try adding a dom0_mem parameter\n");
+        }
+
 
         /* Clamp according to min/max limits and available memory. */
         nr_pages = max(nr_pages, min_pages);
-- 
2.20.1



* [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special handling into helper
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
                   ` (3 preceding siblings ...)
  2019-02-11 17:46 ` [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-13 10:23   ` Paul Durrant
  2019-02-11 17:46 ` [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify Roger Pau Monne
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper,
	Paul Durrant, Jun Nakajima, Roger Pau Monne

So that it can be shared by the ept, npt and shadow code, instead of
duplicating it.

No change in functionality intended.
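
With the helper in place, the NPT and shadow write_p2m_entry paths keep
the ioreq server entry count in sync with a single call, as in the
hap.c hunk below:

    p2m_entry_modify(p2m_get_hostp2m(d),
                     p2m_flags_to_type(l1e_get_flags(new)), /* new type */
                     p2m_flags_to_type(old_flags),          /* old type */
                     level);
    safe_write_pte(p, new);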

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
---
Changes since v1:
 - Remove unused p2mt_old from p2m_pt_set_entry.
---
 xen/arch/x86/mm/hap/hap.c       |  3 ++
 xen/arch/x86/mm/p2m-ept.c       | 55 +++++++++++----------------------
 xen/arch/x86/mm/p2m-pt.c        | 24 --------------
 xen/arch/x86/mm/shadow/common.c |  3 ++
 xen/include/asm-x86/p2m.h       | 32 +++++++++++++++++++
 5 files changed, 56 insertions(+), 61 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3d651b94c3..dc46d5e14f 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -734,6 +734,9 @@ hap_write_p2m_entry(struct domain *d, unsigned long gfn, l1_pgentry_t *p,
             && perms_strictly_increased(old_flags, l1e_get_flags(new)) );
     }
 
+    p2m_entry_modify(p2m_get_hostp2m(d), p2m_flags_to_type(l1e_get_flags(new)),
+                     p2m_flags_to_type(old_flags), level);
+
     safe_write_pte(p, new);
     if ( old_flags & _PAGE_PRESENT )
         flush_tlb_mask(d->dirty_cpumask);
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index bb562607f7..0ece6608cb 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -46,7 +46,8 @@ static inline bool_t is_epte_valid(ept_entry_t *e)
 }
 
 /* returns : 0 for success, -errno otherwise */
-static int atomic_write_ept_entry(ept_entry_t *entryptr, ept_entry_t new,
+static int atomic_write_ept_entry(struct p2m_domain *p2m,
+                                  ept_entry_t *entryptr, ept_entry_t new,
                                   int level)
 {
     int rc;
@@ -89,6 +90,8 @@ static int atomic_write_ept_entry(ept_entry_t *entryptr, ept_entry_t new,
     if ( unlikely(p2m_is_foreign(entryptr->sa_p2mt)) && check_foreign )
         oldmfn = entryptr->mfn;
 
+    p2m_entry_modify(p2m, new.sa_p2mt, entryptr->sa_p2mt, level);
+
     write_atomic(&entryptr->epte, new.epte);
 
     if ( unlikely(oldmfn != mfn_x(INVALID_MFN)) )
@@ -390,7 +393,8 @@ static int ept_next_level(struct p2m_domain *p2m, bool_t read_only,
  * present entries in the given page table, optionally marking the entries
  * also for their subtrees needing P2M type re-calculation.
  */
-static bool_t ept_invalidate_emt(mfn_t mfn, bool_t recalc, int level)
+static bool_t ept_invalidate_emt(struct p2m_domain *p2m, mfn_t mfn,
+                                 bool_t recalc, int level)
 {
     int rc;
     ept_entry_t *epte = map_domain_page(mfn);
@@ -408,7 +412,7 @@ static bool_t ept_invalidate_emt(mfn_t mfn, bool_t recalc, int level)
         e.emt = MTRR_NUM_TYPES;
         if ( recalc )
             e.recalc = 1;
-        rc = atomic_write_ept_entry(&epte[i], e, level);
+        rc = atomic_write_ept_entry(p2m, &epte[i], e, level);
         ASSERT(rc == 0);
         changed = 1;
     }
@@ -459,7 +463,7 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
             rc = -ENOMEM;
             goto out;
         }
-        wrc = atomic_write_ept_entry(&table[index], split_ept_entry, i);
+        wrc = atomic_write_ept_entry(p2m, &table[index], split_ept_entry, i);
         ASSERT(wrc == 0);
 
         for ( ; i > target; --i )
@@ -479,7 +483,7 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
         {
             e.emt = MTRR_NUM_TYPES;
             e.recalc = 1;
-            wrc = atomic_write_ept_entry(&table[index], e, target);
+            wrc = atomic_write_ept_entry(p2m, &table[index], e, target);
             ASSERT(wrc == 0);
             rc = 1;
         }
@@ -549,17 +553,11 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                     nt = p2m_recalc_type(e.recalc, e.sa_p2mt, p2m, gfn + i);
                     if ( nt != e.sa_p2mt )
                     {
-                        if ( e.sa_p2mt == p2m_ioreq_server )
-                        {
-                            ASSERT(p2m->ioreq.entry_count > 0);
-                            p2m->ioreq.entry_count--;
-                        }
-
                         e.sa_p2mt = nt;
                         ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
                     }
                     e.recalc = 0;
-                    wrc = atomic_write_ept_entry(&epte[i], e, level);
+                    wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
                     ASSERT(wrc == 0);
                 }
             }
@@ -595,7 +593,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                 {
                     if ( ept_split_super_page(p2m, &e, level, level - 1) )
                     {
-                        wrc = atomic_write_ept_entry(&epte[i], e, level);
+                        wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
                         ASSERT(wrc == 0);
                         unmap_domain_page(epte);
                         mfn = e.mfn;
@@ -610,7 +608,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                 e.recalc = 0;
                 if ( recalc && p2m_is_changeable(e.sa_p2mt) )
                     ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
-                wrc = atomic_write_ept_entry(&epte[i], e, level);
+                wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
                 ASSERT(wrc == 0);
             }
 
@@ -621,11 +619,11 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
         if ( e.emt == MTRR_NUM_TYPES )
         {
             ASSERT(is_epte_present(&e));
-            ept_invalidate_emt(_mfn(e.mfn), e.recalc, level);
+            ept_invalidate_emt(p2m, _mfn(e.mfn), e.recalc, level);
             smp_wmb();
             e.emt = 0;
             e.recalc = 0;
-            wrc = atomic_write_ept_entry(&epte[i], e, level);
+            wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
             ASSERT(wrc == 0);
             unmap_domain_page(epte);
             rc = 1;
@@ -786,7 +784,7 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
 
         /* now install the newly split ept sub-tree */
         /* NB: please make sure domian is paused and no in-fly VT-d DMA. */
-        rc = atomic_write_ept_entry(ept_entry, split_ept_entry, i);
+        rc = atomic_write_ept_entry(p2m, ept_entry, split_ept_entry, i);
         ASSERT(rc == 0);
 
         /* then move to the level we want to make real changes */
@@ -833,24 +831,7 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         new_entry.suppress_ve = is_epte_valid(&old_entry) ?
                                     old_entry.suppress_ve : 1;
 
-    /*
-     * p2m_ioreq_server is only used for 4K pages, so the
-     * count is only done on ept page table entries.
-     */
-    if ( p2mt == p2m_ioreq_server )
-    {
-        ASSERT(i == 0);
-        p2m->ioreq.entry_count++;
-    }
-
-    if ( ept_entry->sa_p2mt == p2m_ioreq_server )
-    {
-        ASSERT(i == 0);
-        ASSERT(p2m->ioreq.entry_count > 0);
-        p2m->ioreq.entry_count--;
-    }
-
-    rc = atomic_write_ept_entry(ept_entry, new_entry, target);
+    rc = atomic_write_ept_entry(p2m, ept_entry, new_entry, target);
     if ( unlikely(rc) )
         old_entry.epte = 0;
     else
@@ -1070,7 +1051,7 @@ static void ept_change_entry_type_global(struct p2m_domain *p2m,
     if ( !mfn )
         return;
 
-    if ( ept_invalidate_emt(_mfn(mfn), 1, p2m->ept.wl) )
+    if ( ept_invalidate_emt(p2m, _mfn(mfn), 1, p2m->ept.wl) )
         ept_sync_domain(p2m);
 }
 
@@ -1128,7 +1109,7 @@ static void ept_memory_type_changed(struct p2m_domain *p2m)
     if ( !mfn )
         return;
 
-    if ( ept_invalidate_emt(_mfn(mfn), 0, p2m->ept.wl) )
+    if ( ept_invalidate_emt(p2m, _mfn(mfn), 0, p2m->ept.wl) )
         ept_sync_domain(p2m);
 }
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 791600f6ba..fd6386b8fd 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -436,13 +436,6 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
                 flags |= _PAGE_PSE;
             }
 
-            if ( ot == p2m_ioreq_server )
-            {
-                ASSERT(p2m->ioreq.entry_count > 0);
-                ASSERT(level == 0);
-                p2m->ioreq.entry_count--;
-            }
-
             e = l1e_from_pfn(mfn, flags);
             p2m_add_iommu_flags(&e, level,
                                 (nt == p2m_ram_rw)
@@ -608,8 +601,6 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
 
     if ( page_order == PAGE_ORDER_4K )
     {
-        p2m_type_t p2mt_old;
-
         rc = p2m_next_level(p2m, &table, &gfn_remainder, gfn,
                             L2_PAGETABLE_SHIFT - PAGE_SHIFT,
                             L2_PAGETABLE_ENTRIES, 1, 1);
@@ -633,21 +624,6 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         if ( entry_content.l1 != 0 )
             p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
 
-        p2mt_old = p2m_flags_to_type(l1e_get_flags(*p2m_entry));
-
-        /*
-         * p2m_ioreq_server is only used for 4K pages, so
-         * the count is only done for level 1 entries.
-         */
-        if ( p2mt == p2m_ioreq_server )
-            p2m->ioreq.entry_count++;
-
-        if ( p2mt_old == p2m_ioreq_server )
-        {
-            ASSERT(p2m->ioreq.entry_count > 0);
-            p2m->ioreq.entry_count--;
-        }
-
         /* level 1 entry */
         p2m->write_p2m_entry(p2m, gfn, p2m_entry, entry_content, 1);
         /* NB: paging_write_p2m_entry() handles tlb flushes properly */
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index c49aeb5e60..6d8a950054 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -3180,6 +3180,9 @@ shadow_write_p2m_entry(struct domain *d, unsigned long gfn,
     if ( likely(d->arch.paging.shadow.total_pages != 0) )
          sh_unshadow_for_p2m_change(d, gfn, p, new, level);
 
+    p2m_entry_modify(p2m_get_hostp2m(d), p2m_flags_to_type(l1e_get_flags(new)),
+                     p2m_flags_to_type(l1e_get_flags(*p)), level);
+
     /* Update the entry with new content */
     safe_write_pte(p, new);
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 2095076556..834d49d2d4 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -932,6 +932,38 @@ int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
 struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
                                               unsigned int *flags);
 
+static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
+                                    p2m_type_t ot, unsigned int level)
+{
+    if ( level != 1 || nt == ot )
+        return;
+
+    switch ( nt )
+    {
+    case p2m_ioreq_server:
+        /*
+         * p2m_ioreq_server is only used for 4K pages, so
+         * the count is only done for level 1 entries.
+         */
+        p2m->ioreq.entry_count++;
+        break;
+
+    default:
+        break;
+    }
+
+    switch ( ot )
+    {
+    case p2m_ioreq_server:
+        ASSERT(p2m->ioreq.entry_count > 0);
+        p2m->ioreq.entry_count--;
+        break;
+
+    default:
+        break;
+    }
+}
+
 #endif /* _XEN_ASM_X86_P2M_H */
 
 /*
-- 
2.20.1



* [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
                   ` (4 preceding siblings ...)
  2019-02-11 17:46 ` [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special handling into helper Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-14 11:25   ` Jan Beulich
  2019-02-11 17:46 ` [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries Roger Pau Monne
  2019-02-11 19:50 ` [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Sander Eikelenboom
  7 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper,
	Tim Deegan, Jun Nakajima, Roger Pau Monne

So that the specific handling can be removed from
atomic_write_ept_entry and be shared with npt and shadow code.

This commit also removes the check that prevents a non-ept PVH dom0 from
mapping foreign pages.
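
The reference counting rule this moves into the helper boils down to
the following (a condensed sketch of the p2m_entry_modify hunk below):

    /* New entry maps a foreign page: take a reference on it. */
    if ( nt == p2m_map_foreign )
        BUG_ON(!mfn_valid(nfn) ||
               !page_get_owner_and_reference(mfn_to_page(nfn)));

    /* Old entry mapped a foreign page: drop the reference it held. */
    if ( ot == p2m_map_foreign )
    {
        BUG_ON(!mfn_valid(ofn));
        put_page(mfn_to_page(ofn));
    }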

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Tim Deegan <tim@xen.org>
---
Changes since v1:
 - Simplify code since mfn_to_page cannot return NULL.
 - Check if the mfn is valid before getting/dropping the page reference.
 - Use BUG_ON instead of ASSERTs, since getting the reference counting
   wrong is more dangerous than a DoS.
---
 xen/arch/x86/mm/hap/hap.c       |   3 +-
 xen/arch/x86/mm/p2m-ept.c       | 108 +++++++-------------------------
 xen/arch/x86/mm/p2m-pt.c        |   7 ---
 xen/arch/x86/mm/shadow/common.c |   3 +-
 xen/include/asm-x86/p2m.h       |  17 ++++-
 5 files changed, 40 insertions(+), 98 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index dc46d5e14f..4f52639be5 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -735,7 +735,8 @@ hap_write_p2m_entry(struct domain *d, unsigned long gfn, l1_pgentry_t *p,
     }
 
     p2m_entry_modify(p2m_get_hostp2m(d), p2m_flags_to_type(l1e_get_flags(new)),
-                     p2m_flags_to_type(old_flags), level);
+                     p2m_flags_to_type(old_flags), l1e_get_mfn(new),
+                     l1e_get_mfn(*p), level);
 
     safe_write_pte(p, new);
     if ( old_flags & _PAGE_PRESENT )
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 0ece6608cb..2b0c3ab265 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -45,65 +45,13 @@ static inline bool_t is_epte_valid(ept_entry_t *e)
     return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
-/* returns : 0 for success, -errno otherwise */
-static int atomic_write_ept_entry(struct p2m_domain *p2m,
-                                  ept_entry_t *entryptr, ept_entry_t new,
-                                  int level)
+static void atomic_write_ept_entry(struct p2m_domain *p2m,
+                                   ept_entry_t *entryptr, ept_entry_t new,
+                                   int level)
 {
-    int rc;
-    unsigned long oldmfn = mfn_x(INVALID_MFN);
-    bool_t check_foreign = (new.mfn != entryptr->mfn ||
-                            new.sa_p2mt != entryptr->sa_p2mt);
-
-    if ( level )
-    {
-        ASSERT(!is_epte_superpage(&new) || !p2m_is_foreign(new.sa_p2mt));
-        write_atomic(&entryptr->epte, new.epte);
-        return 0;
-    }
-
-    if ( unlikely(p2m_is_foreign(new.sa_p2mt)) )
-    {
-        rc = -EINVAL;
-        if ( !is_epte_present(&new) )
-                goto out;
-
-        if ( check_foreign )
-        {
-            struct domain *fdom;
-
-            if ( !mfn_valid(_mfn(new.mfn)) )
-                goto out;
-
-            rc = -ESRCH;
-            fdom = page_get_owner(mfn_to_page(_mfn(new.mfn)));
-            if ( fdom == NULL )
-                goto out;
-
-            /* get refcount on the page */
-            rc = -EBUSY;
-            if ( !get_page(mfn_to_page(_mfn(new.mfn)), fdom) )
-                goto out;
-        }
-    }
-
-    if ( unlikely(p2m_is_foreign(entryptr->sa_p2mt)) && check_foreign )
-        oldmfn = entryptr->mfn;
-
-    p2m_entry_modify(p2m, new.sa_p2mt, entryptr->sa_p2mt, level);
-
+    p2m_entry_modify(p2m, new.sa_p2mt, entryptr->sa_p2mt, _mfn(new.mfn),
+                     _mfn(entryptr->mfn), level);
     write_atomic(&entryptr->epte, new.epte);
-
-    if ( unlikely(oldmfn != mfn_x(INVALID_MFN)) )
-        put_page(mfn_to_page(_mfn(oldmfn)));
-
-    rc = 0;
-
- out:
-    if ( rc )
-        gdprintk(XENLOG_ERR, "epte o:%"PRIx64" n:%"PRIx64" rc:%d\n",
-                 entryptr->epte, new.epte, rc);
-    return rc;
 }
 
 static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
@@ -396,7 +344,6 @@ static int ept_next_level(struct p2m_domain *p2m, bool_t read_only,
 static bool_t ept_invalidate_emt(struct p2m_domain *p2m, mfn_t mfn,
                                  bool_t recalc, int level)
 {
-    int rc;
     ept_entry_t *epte = map_domain_page(mfn);
     unsigned int i;
     bool_t changed = 0;
@@ -412,8 +359,7 @@ static bool_t ept_invalidate_emt(struct p2m_domain *p2m, mfn_t mfn,
         e.emt = MTRR_NUM_TYPES;
         if ( recalc )
             e.recalc = 1;
-        rc = atomic_write_ept_entry(p2m, &epte[i], e, level);
-        ASSERT(rc == 0);
+        atomic_write_ept_entry(p2m, &epte[i], e, level);
         changed = 1;
     }
 
@@ -437,7 +383,7 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
     ept_entry_t *table;
     unsigned long gfn_remainder = first_gfn;
     unsigned int i, index;
-    int wrc, rc = 0, ret = GUEST_TABLE_MAP_FAILED;
+    int rc = 0, ret = GUEST_TABLE_MAP_FAILED;
 
     table = map_domain_page(pagetable_get_mfn(p2m_get_pagetable(p2m)));
     for ( i = p2m->ept.wl; i > target; --i )
@@ -463,8 +409,7 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
             rc = -ENOMEM;
             goto out;
         }
-        wrc = atomic_write_ept_entry(p2m, &table[index], split_ept_entry, i);
-        ASSERT(wrc == 0);
+        atomic_write_ept_entry(p2m, &table[index], split_ept_entry, i);
 
         for ( ; i > target; --i )
             if ( !ept_next_level(p2m, 1, &table, &gfn_remainder, i) )
@@ -483,8 +428,7 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
         {
             e.emt = MTRR_NUM_TYPES;
             e.recalc = 1;
-            wrc = atomic_write_ept_entry(p2m, &table[index], e, target);
-            ASSERT(wrc == 0);
+            atomic_write_ept_entry(p2m, &table[index], e, target);
             rc = 1;
         }
     }
@@ -513,7 +457,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
     unsigned int level = ept->wl;
     unsigned long mfn = ept->mfn;
     ept_entry_t *epte;
-    int wrc, rc = 0;
+    int rc = 0;
 
     if ( !mfn )
         return 0;
@@ -557,8 +501,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                         ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
                     }
                     e.recalc = 0;
-                    wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
-                    ASSERT(wrc == 0);
+                    atomic_write_ept_entry(p2m, &epte[i], e, level);
                 }
             }
             else
@@ -593,8 +536,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                 {
                     if ( ept_split_super_page(p2m, &e, level, level - 1) )
                     {
-                        wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
-                        ASSERT(wrc == 0);
+                        atomic_write_ept_entry(p2m, &epte[i], e, level);
                         unmap_domain_page(epte);
                         mfn = e.mfn;
                         continue;
@@ -608,8 +550,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                 e.recalc = 0;
                 if ( recalc && p2m_is_changeable(e.sa_p2mt) )
                     ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
-                wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
-                ASSERT(wrc == 0);
+                atomic_write_ept_entry(p2m, &epte[i], e, level);
             }
 
             rc = 1;
@@ -623,8 +564,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
             smp_wmb();
             e.emt = 0;
             e.recalc = 0;
-            wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
-            ASSERT(wrc == 0);
+            atomic_write_ept_entry(p2m, &epte[i], e, level);
             unmap_domain_page(epte);
             rc = 1;
         }
@@ -784,8 +724,7 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
 
         /* now install the newly split ept sub-tree */
         /* NB: please make sure domian is paused and no in-fly VT-d DMA. */
-        rc = atomic_write_ept_entry(p2m, ept_entry, split_ept_entry, i);
-        ASSERT(rc == 0);
+        atomic_write_ept_entry(p2m, ept_entry, split_ept_entry, i);
 
         /* then move to the level we want to make real changes */
         for ( ; i > target; i-- )
@@ -831,18 +770,13 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         new_entry.suppress_ve = is_epte_valid(&old_entry) ?
                                     old_entry.suppress_ve : 1;
 
-    rc = atomic_write_ept_entry(p2m, ept_entry, new_entry, target);
-    if ( unlikely(rc) )
-        old_entry.epte = 0;
-    else
-    {
-        entry_written = 1;
+    atomic_write_ept_entry(p2m, ept_entry, new_entry, target);
+    entry_written = 1;
 
-        if ( p2mt != p2m_invalid &&
-             (gfn + (1UL << order) - 1 > p2m->max_mapped_pfn) )
-            /* Track the highest gfn for which we have ever had a valid mapping */
-            p2m->max_mapped_pfn = gfn + (1UL << order) - 1;
-    }
+    if ( p2mt != p2m_invalid &&
+         (gfn + (1UL << order) - 1 > p2m->max_mapped_pfn) )
+        /* Track the highest gfn for which we have ever had a valid mapping */
+        p2m->max_mapped_pfn = gfn + (1UL << order) - 1;
 
 out:
     if ( needs_sync )
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index fd6386b8fd..aea8110254 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -523,13 +523,6 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         __trace_var(TRC_MEM_SET_P2M_ENTRY, 0, sizeof(t), &t);
     }
 
-    if ( unlikely(p2m_is_foreign(p2mt)) )
-    {
-        /* hvm fixme: foreign types are only supported on ept at present */
-        gdprintk(XENLOG_WARNING, "Unimplemented foreign p2m type.\n");
-        return -EINVAL;
-    }
-
     /* Carry out any eventually pending earlier changes first. */
     rc = do_recalc(p2m, gfn);
     if ( rc < 0 )
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 6d8a950054..01fe81230a 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -3181,7 +3181,8 @@ shadow_write_p2m_entry(struct domain *d, unsigned long gfn,
          sh_unshadow_for_p2m_change(d, gfn, p, new, level);
 
     p2m_entry_modify(p2m_get_hostp2m(d), p2m_flags_to_type(l1e_get_flags(new)),
-                     p2m_flags_to_type(l1e_get_flags(*p)), level);
+                     p2m_flags_to_type(l1e_get_flags(*p)), l1e_get_mfn(new),
+                     l1e_get_mfn(*p), level);
 
     /* Update the entry with new content */
     safe_write_pte(p, new);
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 834d49d2d4..0de27239be 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -933,9 +933,12 @@ struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
                                               unsigned int *flags);
 
 static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
-                                    p2m_type_t ot, unsigned int level)
+                                    p2m_type_t ot, mfn_t nfn, mfn_t ofn,
+                                    unsigned int level)
 {
-    if ( level != 1 || nt == ot )
+    BUG_ON(level > 1 && (nt == p2m_ioreq_server || nt == p2m_map_foreign));
+
+    if ( level != 1 || (nt == ot && mfn_eq(nfn, ofn)) )
         return;
 
     switch ( nt )
@@ -948,6 +951,11 @@ static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
         p2m->ioreq.entry_count++;
         break;
 
+    case p2m_map_foreign:
+        BUG_ON(!mfn_valid(nfn) ||
+               !page_get_owner_and_reference(mfn_to_page(nfn)));
+        break;
+
     default:
         break;
     }
@@ -959,6 +967,11 @@ static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
         p2m->ioreq.entry_count--;
         break;
 
+    case p2m_map_foreign:
+        BUG_ON(!mfn_valid(ofn));
+        put_page(mfn_to_page(ofn));
+        break;
+
     default:
         break;
     }
-- 
2.20.1



* [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
                   ` (5 preceding siblings ...)
  2019-02-11 17:46 ` [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify Roger Pau Monne
@ 2019-02-11 17:46 ` Roger Pau Monne
  2019-02-14 11:38   ` Jan Beulich
  2019-02-11 19:50 ` [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Sander Eikelenboom
  7 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monne @ 2019-02-11 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

The current npt and shadow code to get an entry will always return
INVALID_MFN for foreign entries. Allow returning the entry mfn for
foreign entries, like it's done for grant table entries.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm/p2m-pt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index aea8110254..39ef27943e 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -865,7 +865,8 @@ pod_retry_l1:
     unmap_domain_page(l1e);
 
     ASSERT(mfn_valid(mfn) || !p2m_is_ram(*t) || p2m_is_paging(*t));
-    return (p2m_is_valid(*t) || p2m_is_grant(*t)) ? mfn : INVALID_MFN;
+    return (p2m_is_valid(*t) || p2m_is_grant(*t) || p2m_is_foreign(*t))
+           ? mfn : INVALID_MFN;
 }
 
 static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
-- 
2.20.1



* Re: [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes
  2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
                   ` (6 preceding siblings ...)
  2019-02-11 17:46 ` [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries Roger Pau Monne
@ 2019-02-11 19:50 ` Sander Eikelenboom
  7 siblings, 0 replies; 26+ messages in thread
From: Sander Eikelenboom @ 2019-02-11 19:50 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel

On 11/02/2019 18:46, Roger Pau Monne wrote:
> Hello,
> 
> The following series contains fixes that should be considered for 4.12.
> 
> I'm not sure whether patches 5, 6 and 7 should be aimed at 4.12, as they
> contain changes to the p2m code that could affect HVM guests. Note that
> without those changes a PVH dom0 running on AMD hardware will be unable
> to create guests. Overall the patches are a nice cleanup to the handling
> of p2m_ioreq_server and p2m_map_foreign types.
> 
> The series can also be found at:
> 
> git://xenbits.xen.org/people/royger/xen.git fixes-4.12-v2.1

I have tested this on AMD hardware, both as PVH dom0 and PV dom0
(running both PVH and HVM guests). So FWIW:

Tested-by: Sander Eikelenboom <linux@eikelenboom.it>

> Roger Pau Monne (7):
>   dom0/pvh: align allocation and mapping order to start address
>   amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO
>     entries
>   x86/pvh: reorder PVH dom0 iommu initialization
>   pvh/dom0: warn when dom0_mem is not set
>   x86/mm: split p2m ioreq server pages special handling into helper
>   x86/mm: handle foreign mappings in p2m_entry_modify
>   npt/shadow: allow getting foreign page table entries
> 
>  xen/arch/x86/dom0_build.c           |  10 ++
>  xen/arch/x86/hvm/dom0_build.c       |  37 +++++---
>  xen/arch/x86/mm/hap/hap.c           |   4 +
>  xen/arch/x86/mm/p2m-ept.c           | 137 ++++++----------------------
>  xen/arch/x86/mm/p2m-pt.c            |  54 ++++-------
>  xen/arch/x86/mm/shadow/common.c     |   4 +
>  xen/drivers/passthrough/x86/iommu.c |   7 +-
>  xen/include/asm-x86/p2m.h           |  45 +++++++++
>  8 files changed, 135 insertions(+), 163 deletions(-)
> 



* Re: [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address
  2019-02-11 17:46 ` [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address Roger Pau Monne
@ 2019-02-12 10:52   ` Wei Liu
  2019-02-13 15:32   ` Jan Beulich
  1 sibling, 0 replies; 26+ messages in thread
From: Wei Liu @ 2019-02-12 10:52 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Feb 11, 2019 at 06:46:36PM +0100, Roger Pau Monne wrote:
> The p2m and iommu mapping code always had the requirement that
> addresses and orders must be aligned when populating the p2m or the
> iommu page tables.
> 
> PVH dom0 builder didn't take this requirement into account, and can
> call into the p2m/iommu mapping helpers with addresses and orders that
> are not aligned.
> 
> Fix this by making sure the orders passed to the physmap population
> helpers are always aligned to the guest address to be populated.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set
  2019-02-11 17:46 ` [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set Roger Pau Monne
@ 2019-02-12 10:52   ` Wei Liu
  2019-02-13 16:01   ` Jan Beulich
  1 sibling, 0 replies; 26+ messages in thread
From: Wei Liu @ 2019-02-12 10:52 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Feb 11, 2019 at 06:46:39PM +0100, Roger Pau Monne wrote:
> There have been several reports of the dom0 builder running out of
> memory when building a PVH dom0 without having specified a dom0_mem
> value. Print a warning message if dom0_mem is not set when booting in
> PVH mode.
> 
> This is a temporary workaround until accounting for internal memory
> required by Xen (ie: paging structures) is improved.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special handling into helper
  2019-02-11 17:46 ` [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special handling into helper Roger Pau Monne
@ 2019-02-13 10:23   ` Paul Durrant
  0 siblings, 0 replies; 26+ messages in thread
From: Paul Durrant @ 2019-02-13 10:23 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Jan Beulich, Andrew Cooper, George Dunlap,
	Jun Nakajima, Roger Pau Monne

> -----Original Message-----
> From: Roger Pau Monne [mailto:roger.pau@citrix.com]
> Sent: 11 February 2019 17:47
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; Jun Nakajima
> <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Subject: [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special
> handling into helper
> 
> So that it can be shared by the ept, npt and shadow code, instead of
> duplicating it.
> 
> No change in functionality intended.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

LGTM.

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Paul Durrant <paul.durrant@citrix.com>
> ---
> Changes since v1:
>  - Remove unused p2mt_old from p2m_pt_set_entry.
> ---
>  xen/arch/x86/mm/hap/hap.c       |  3 ++
>  xen/arch/x86/mm/p2m-ept.c       | 55 +++++++++++----------------------
>  xen/arch/x86/mm/p2m-pt.c        | 24 --------------
>  xen/arch/x86/mm/shadow/common.c |  3 ++
>  xen/include/asm-x86/p2m.h       | 32 +++++++++++++++++++
>  5 files changed, 56 insertions(+), 61 deletions(-)
> 
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index 3d651b94c3..dc46d5e14f 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -734,6 +734,9 @@ hap_write_p2m_entry(struct domain *d, unsigned long
> gfn, l1_pgentry_t *p,
>              && perms_strictly_increased(old_flags, l1e_get_flags(new)) );
>      }
> 
> +    p2m_entry_modify(p2m_get_hostp2m(d),
> p2m_flags_to_type(l1e_get_flags(new)),
> +                     p2m_flags_to_type(old_flags), level);
> +
>      safe_write_pte(p, new);
>      if ( old_flags & _PAGE_PRESENT )
>          flush_tlb_mask(d->dirty_cpumask);
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index bb562607f7..0ece6608cb 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -46,7 +46,8 @@ static inline bool_t is_epte_valid(ept_entry_t *e)
>  }
> 
>  /* returns : 0 for success, -errno otherwise */
> -static int atomic_write_ept_entry(ept_entry_t *entryptr, ept_entry_t new,
> +static int atomic_write_ept_entry(struct p2m_domain *p2m,
> +                                  ept_entry_t *entryptr, ept_entry_t new,
>                                    int level)
>  {
>      int rc;
> @@ -89,6 +90,8 @@ static int atomic_write_ept_entry(ept_entry_t *entryptr,
> ept_entry_t new,
>      if ( unlikely(p2m_is_foreign(entryptr->sa_p2mt)) && check_foreign )
>          oldmfn = entryptr->mfn;
> 
> +    p2m_entry_modify(p2m, new.sa_p2mt, entryptr->sa_p2mt, level);
> +
>      write_atomic(&entryptr->epte, new.epte);
> 
>      if ( unlikely(oldmfn != mfn_x(INVALID_MFN)) )
> @@ -390,7 +393,8 @@ static int ept_next_level(struct p2m_domain *p2m,
> bool_t read_only,
>   * present entries in the given page table, optionally marking the
> entries
>   * also for their subtrees needing P2M type re-calculation.
>   */
> -static bool_t ept_invalidate_emt(mfn_t mfn, bool_t recalc, int level)
> +static bool_t ept_invalidate_emt(struct p2m_domain *p2m, mfn_t mfn,
> +                                 bool_t recalc, int level)
>  {
>      int rc;
>      ept_entry_t *epte = map_domain_page(mfn);
> @@ -408,7 +412,7 @@ static bool_t ept_invalidate_emt(mfn_t mfn, bool_t
> recalc, int level)
>          e.emt = MTRR_NUM_TYPES;
>          if ( recalc )
>              e.recalc = 1;
> -        rc = atomic_write_ept_entry(&epte[i], e, level);
> +        rc = atomic_write_ept_entry(p2m, &epte[i], e, level);
>          ASSERT(rc == 0);
>          changed = 1;
>      }
> @@ -459,7 +463,7 @@ static int ept_invalidate_emt_range(struct p2m_domain
> *p2m,
>              rc = -ENOMEM;
>              goto out;
>          }
> -        wrc = atomic_write_ept_entry(&table[index], split_ept_entry, i);
> +        wrc = atomic_write_ept_entry(p2m, &table[index], split_ept_entry,
> i);
>          ASSERT(wrc == 0);
> 
>          for ( ; i > target; --i )
> @@ -479,7 +483,7 @@ static int ept_invalidate_emt_range(struct p2m_domain
> *p2m,
>          {
>              e.emt = MTRR_NUM_TYPES;
>              e.recalc = 1;
> -            wrc = atomic_write_ept_entry(&table[index], e, target);
> +            wrc = atomic_write_ept_entry(p2m, &table[index], e, target);
>              ASSERT(wrc == 0);
>              rc = 1;
>          }
> @@ -549,17 +553,11 @@ static int resolve_misconfig(struct p2m_domain *p2m,
> unsigned long gfn)
>                      nt = p2m_recalc_type(e.recalc, e.sa_p2mt, p2m, gfn +
> i);
>                      if ( nt != e.sa_p2mt )
>                      {
> -                        if ( e.sa_p2mt == p2m_ioreq_server )
> -                        {
> -                            ASSERT(p2m->ioreq.entry_count > 0);
> -                            p2m->ioreq.entry_count--;
> -                        }
> -
>                          e.sa_p2mt = nt;
>                          ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt,
> e.access);
>                      }
>                      e.recalc = 0;
> -                    wrc = atomic_write_ept_entry(&epte[i], e, level);
> +                    wrc = atomic_write_ept_entry(p2m, &epte[i], e,
> level);
>                      ASSERT(wrc == 0);
>                  }
>              }
> @@ -595,7 +593,7 @@ static int resolve_misconfig(struct p2m_domain *p2m,
> unsigned long gfn)
>                  {
>                      if ( ept_split_super_page(p2m, &e, level, level - 1)
> )
>                      {
> -                        wrc = atomic_write_ept_entry(&epte[i], e, level);
> +                        wrc = atomic_write_ept_entry(p2m, &epte[i], e,
> level);
>                          ASSERT(wrc == 0);
>                          unmap_domain_page(epte);
>                          mfn = e.mfn;
> @@ -610,7 +608,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>                  e.recalc = 0;
>                  if ( recalc && p2m_is_changeable(e.sa_p2mt) )
>                      ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
> -                wrc = atomic_write_ept_entry(&epte[i], e, level);
> +                wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
>                  ASSERT(wrc == 0);
>              }
> 
> @@ -621,11 +619,11 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>          if ( e.emt == MTRR_NUM_TYPES )
>          {
>              ASSERT(is_epte_present(&e));
> -            ept_invalidate_emt(_mfn(e.mfn), e.recalc, level);
> +            ept_invalidate_emt(p2m, _mfn(e.mfn), e.recalc, level);
>              smp_wmb();
>              e.emt = 0;
>              e.recalc = 0;
> -            wrc = atomic_write_ept_entry(&epte[i], e, level);
> +            wrc = atomic_write_ept_entry(p2m, &epte[i], e, level);
>              ASSERT(wrc == 0);
>              unmap_domain_page(epte);
>              rc = 1;
> @@ -786,7 +784,7 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
> 
>          /* now install the newly split ept sub-tree */
>          /* NB: please make sure domian is paused and no in-fly VT-d DMA. */
> -        rc = atomic_write_ept_entry(ept_entry, split_ept_entry, i);
> +        rc = atomic_write_ept_entry(p2m, ept_entry, split_ept_entry, i);
>          ASSERT(rc == 0);
> 
>          /* then move to the level we want to make real changes */
> @@ -833,24 +831,7 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
>          new_entry.suppress_ve = is_epte_valid(&old_entry) ?
>                                      old_entry.suppress_ve : 1;
> 
> -    /*
> -     * p2m_ioreq_server is only used for 4K pages, so the
> -     * count is only done on ept page table entries.
> -     */
> -    if ( p2mt == p2m_ioreq_server )
> -    {
> -        ASSERT(i == 0);
> -        p2m->ioreq.entry_count++;
> -    }
> -
> -    if ( ept_entry->sa_p2mt == p2m_ioreq_server )
> -    {
> -        ASSERT(i == 0);
> -        ASSERT(p2m->ioreq.entry_count > 0);
> -        p2m->ioreq.entry_count--;
> -    }
> -
> -    rc = atomic_write_ept_entry(ept_entry, new_entry, target);
> +    rc = atomic_write_ept_entry(p2m, ept_entry, new_entry, target);
>      if ( unlikely(rc) )
>          old_entry.epte = 0;
>      else
> @@ -1070,7 +1051,7 @@ static void ept_change_entry_type_global(struct p2m_domain *p2m,
>      if ( !mfn )
>          return;
> 
> -    if ( ept_invalidate_emt(_mfn(mfn), 1, p2m->ept.wl) )
> +    if ( ept_invalidate_emt(p2m, _mfn(mfn), 1, p2m->ept.wl) )
>          ept_sync_domain(p2m);
>  }
> 
> @@ -1128,7 +1109,7 @@ static void ept_memory_type_changed(struct p2m_domain *p2m)
>      if ( !mfn )
>          return;
> 
> -    if ( ept_invalidate_emt(_mfn(mfn), 0, p2m->ept.wl) )
> +    if ( ept_invalidate_emt(p2m, _mfn(mfn), 0, p2m->ept.wl) )
>          ept_sync_domain(p2m);
>  }
> 
> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index 791600f6ba..fd6386b8fd 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -436,13 +436,6 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
>                  flags |= _PAGE_PSE;
>              }
> 
> -            if ( ot == p2m_ioreq_server )
> -            {
> -                ASSERT(p2m->ioreq.entry_count > 0);
> -                ASSERT(level == 0);
> -                p2m->ioreq.entry_count--;
> -            }
> -
>              e = l1e_from_pfn(mfn, flags);
>              p2m_add_iommu_flags(&e, level,
>                                  (nt == p2m_ram_rw)
> @@ -608,8 +601,6 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
> 
>      if ( page_order == PAGE_ORDER_4K )
>      {
> -        p2m_type_t p2mt_old;
> -
>          rc = p2m_next_level(p2m, &table, &gfn_remainder, gfn,
>                              L2_PAGETABLE_SHIFT - PAGE_SHIFT,
>                              L2_PAGETABLE_ENTRIES, 1, 1);
> @@ -633,21 +624,6 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
>          if ( entry_content.l1 != 0 )
>              p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
> 
> -        p2mt_old = p2m_flags_to_type(l1e_get_flags(*p2m_entry));
> -
> -        /*
> -         * p2m_ioreq_server is only used for 4K pages, so
> -         * the count is only done for level 1 entries.
> -         */
> -        if ( p2mt == p2m_ioreq_server )
> -            p2m->ioreq.entry_count++;
> -
> -        if ( p2mt_old == p2m_ioreq_server )
> -        {
> -            ASSERT(p2m->ioreq.entry_count > 0);
> -            p2m->ioreq.entry_count--;
> -        }
> -
>          /* level 1 entry */
>          p2m->write_p2m_entry(p2m, gfn, p2m_entry, entry_content, 1);
>          /* NB: paging_write_p2m_entry() handles tlb flushes properly */
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index c49aeb5e60..6d8a950054 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -3180,6 +3180,9 @@ shadow_write_p2m_entry(struct domain *d, unsigned long gfn,
>      if ( likely(d->arch.paging.shadow.total_pages != 0) )
>           sh_unshadow_for_p2m_change(d, gfn, p, new, level);
> 
> +    p2m_entry_modify(p2m_get_hostp2m(d), p2m_flags_to_type(l1e_get_flags(new)),
> +                     p2m_flags_to_type(l1e_get_flags(*p)), level);
> +
>      /* Update the entry with new content */
>      safe_write_pte(p, new);
> 
> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
> index 2095076556..834d49d2d4 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -932,6 +932,38 @@ int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
>  struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
>                                                unsigned int *flags);
> 
> +static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
> +                                    p2m_type_t ot, unsigned int level)
> +{
> +    if ( level != 1 || nt == ot )
> +        return;
> +
> +    switch ( nt )
> +    {
> +    case p2m_ioreq_server:
> +        /*
> +         * p2m_ioreq_server is only used for 4K pages, so
> +         * the count is only done for level 1 entries.
> +         */
> +        p2m->ioreq.entry_count++;
> +        break;
> +
> +    default:
> +        break;
> +    }
> +
> +    switch ( ot )
> +    {
> +    case p2m_ioreq_server:
> +        ASSERT(p2m->ioreq.entry_count > 0);
> +        p2m->ioreq.entry_count--;
> +        break;
> +
> +    default:
> +        break;
> +    }
> +}
> +
>  #endif /* _XEN_ASM_X86_P2M_H */
> 
>  /*
> --
> 2.20.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address
  2019-02-11 17:46 ` [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address Roger Pau Monne
  2019-02-12 10:52   ` Wei Liu
@ 2019-02-13 15:32   ` Jan Beulich
  1 sibling, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-02-13 15:32 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> The p2m and iommu mapping code always had the requirement that
> addresses and orders must be aligned when populating the p2m or the
> iommu page tables.
> 
> PVH dom0 builder didn't take this requirement into account, and can
> call into the p2m/iommu mapping helpers with addresses and orders that
> are not aligned.
> 
> Fix this by making sure the orders passed to the physmap population
> helpers are always aligned to the guest address to be populated.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries
  2019-02-11 17:46 ` [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries Roger Pau Monne
@ 2019-02-13 15:53   ` Jan Beulich
  2019-02-14 13:59     ` Roger Pau Monné
  0 siblings, 1 reply; 26+ messages in thread
From: Jan Beulich @ 2019-02-13 15:53 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, Wei Liu, George Dunlap, Andrew Cooper,
	george.dunlap, xen-devel

>>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> The assert was originally added to make sure that higher order
> regions (> PAGE_ORDER_4K) could not be used to bypass the
> mmio_ro_ranges check performed by p2m_type_to_flags.
> 
> This however is already checked in set_mmio_p2m_entry, which makes
> sure that higher order mappings don't overlap with mmio_ro_ranges,
> thus allowing the creation of high order MMIO mappings safely.
> 
> Replace the assert to allow 2M/1G entries to be created for MMIO
> regions and add some extra asserts as a replacement to make sure
> there's no overlapping with MMIO read-only ranges.
> 
> Note that 1G MMIO entries will not be created unless mmio_order is
> changed to allow it.
> 
> Suggested-by: George Dunlap <george.dunlap@citrix.com>

Is this still the case? Iirc the original suggestion was to remove
the assertion altogether?

> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -576,7 +576,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
>          }
>  
>          ASSERT(p2m_flags_to_type(flags) != p2m_ioreq_server);
> -        ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
> +        if ( p2mt == p2m_mmio_direct )
> +            ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
> +                   !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
> +                                            mfn_x(mfn) + PFN_DOWN(MB(2))));
> +        else if ( p2m_allows_invalid_mfn(p2mt) || p2mt == p2m_invalid ||
> +                  p2mt == p2m_mmio_dm )
> +            ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
> +        else
> +            ASSERT(mfn_valid(mfn));
>          l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
>              ? p2m_l3e_from_pfn(mfn_x(mfn),
>                                 p2m_type_to_flags(p2m, p2mt, mfn, 2))
> @@ -668,7 +676,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
>          }
>  
>          ASSERT(p2m_flags_to_type(flags) != p2m_ioreq_server);
> -        ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
> +        if ( p2mt == p2m_mmio_direct )
> +            ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
> +                   !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
> +                                            mfn_x(mfn) + PFN_DOWN(MB(2))));
> +        else if ( p2m_allows_invalid_mfn(p2mt) || p2mt == p2m_invalid ||
> +                  p2mt == p2m_mmio_dm )
> +            ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
> +        else
> +            ASSERT(mfn_valid(mfn));

Seeing this supposedly almost the same (but actually entirely the same,
due to the wrong MB(2) in the first hunk) code I wonder whether this
wouldn't better be put in a helper (macro or function), together with
adjacent assertion in context, immediately ahead of the line you alter.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 3/7] x86/pvh: reorder PVH dom0 iommu initialization
  2019-02-11 17:46 ` [PATCH for-4.12 v2 3/7] x86/pvh: reorder PVH dom0 iommu initialization Roger Pau Monne
@ 2019-02-13 15:58   ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-02-13 15:58 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> So that the iommu is initialized before populating the p2m, and
> entries added get the corresponding iommu page table entries if
> required. This requires splitting the current pvh_setup_p2m into two
> different functions. One that crafts dom0 physmap and sets the paging
> allocation, and another one that actually populates the p2m with RAM
> regions.
> 
> Note that this allows to remove the special casing done for the low
> 1MB in hwdom_iommu_map.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set
  2019-02-11 17:46 ` [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set Roger Pau Monne
  2019-02-12 10:52   ` Wei Liu
@ 2019-02-13 16:01   ` Jan Beulich
  2019-02-13 17:13     ` Roger Pau Monné
  1 sibling, 1 reply; 26+ messages in thread
From: Jan Beulich @ 2019-02-13 16:01 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

>>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> There have been several reports of the dom0 builder running out of
> memory when building a PVH dom0 without having specified a dom0_mem
> value. Print a warning message if dom0_mem is not set when booting in
> PVH mode.
> 
> This is a temporary workaround until accounting for internal memory
> required by Xen (ie: paging structures) is improved.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

I take it that ...

> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -378,8 +378,18 @@ unsigned long __init dom0_compute_nr_pages(
>           * maximum of 128MB.
>           */
>          if ( !nr_pages )
> +        {
>              nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
>                                   : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
> +            if ( is_hvm_domain(d) )
> +                /*
> +                 * Temporary workaround message until internal (paging) memory
> +                 * accounting required to build a pvh dom0 is improved.
> +                 */
> +                printk("WARNING: PVH dom0 without dom0_mem set is still unstable. "
> +                       "If you get crashes during boot, try adding a dom0_mem parameter\n");
> +        }

... you consider it acceptable for the message to be logged twice
in certain cases?

Jan



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set
  2019-02-13 16:01   ` Jan Beulich
@ 2019-02-13 17:13     ` Roger Pau Monné
  2019-02-14  8:09       ` Jan Beulich
  0 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monné @ 2019-02-13 17:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On Wed, Feb 13, 2019 at 09:01:07AM -0700, Jan Beulich wrote:
> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> > There have been several reports of the dom0 builder running out of
> > memory when building a PVH dom0 without having specified a dom0_mem
> > value. Print a warning message if dom0_mem is not set when booting in
> > PVH mode.
> > 
> > This is a temporary workaround until accounting for internal memory
> > required by Xen (ie: paging structures) is improved.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> I take it that ...
> 
> > --- a/xen/arch/x86/dom0_build.c
> > +++ b/xen/arch/x86/dom0_build.c
> > @@ -378,8 +378,18 @@ unsigned long __init dom0_compute_nr_pages(
> >           * maximum of 128MB.
> >           */
> >          if ( !nr_pages )
> > +        {
> >              nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
> >                                   : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
> > +            if ( is_hvm_domain(d) )
> > +                /*
> > +                 * Temporary workaround message until internal (paging) memory
> > +                 * accounting required to build a pvh dom0 is improved.
> > +                 */
> > +                printk("WARNING: PVH dom0 without dom0_mem set is still unstable. "
> > +                       "If you get crashes during boot, try adding a dom0_mem parameter\n");
> > +        }
> 
> ... you consider it acceptable for the message to be logged twice
> in certain cases?

Right, nr_pages could be set to 0 again if there are 2 iterations of
the parent for loop, in which case using a boolean would be better. I
can fix this in the next version.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set
  2019-02-13 17:13     ` Roger Pau Monné
@ 2019-02-14  8:09       ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-02-14  8:09 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

>>> On 13.02.19 at 18:13, <roger.pau@citrix.com> wrote:
> On Wed, Feb 13, 2019 at 09:01:07AM -0700, Jan Beulich wrote:
>> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
>> > There have been several reports of the dom0 builder running out of
>> > memory when building a PVH dom0 without having specified a dom0_mem
>> > value. Print a warning message if dom0_mem is not set when booting in
>> > PVH mode.
>> > 
>> > This is a temporary workaround until accounting for internal memory
>> > required by Xen (ie: paging structures) is improved.
>> > 
>> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> 
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>> 
>> I take it that ...
>> 
>> > --- a/xen/arch/x86/dom0_build.c
>> > +++ b/xen/arch/x86/dom0_build.c
>> > @@ -378,8 +378,18 @@ unsigned long __init dom0_compute_nr_pages(
>> >           * maximum of 128MB.
>> >           */
>> >          if ( !nr_pages )
>> > +        {
>> >              nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
>> >                                   : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
>> > +            if ( is_hvm_domain(d) )
>> > +                /*
>> > +                 * Temporary workaround message until internal (paging) memory
>> > +                 * accounting required to build a pvh dom0 is improved.
>> > +                 */
>> > +                printk("WARNING: PVH dom0 without dom0_mem set is still unstable. "
>> > +                       "If you get crashes during boot, try adding a dom0_mem parameter\n");
>> > +        }
>> 
>> ... you consider it acceptable for the message to be logged twice
>> in certain cases?
> 
> Right, nr_pages could be set to 0 again if there are 2 iterations of
> the parent for loop, in which case using a boolean would be better. I
> can fix this in the next version.

As "a boolean" I think you can use the already present need_paging
one.
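
For illustration, a minimal sketch of how the existing need_paging flag
could gate the message so it is printed at most once - the exact condition
is an assumption rather than the actual next revision, and it relies on the
final pass of the enclosing loop in dom0_compute_nr_pages() running with
need_paging already cleared:

        if ( !nr_pages )
        {
            nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
                                 : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
            /*
             * Warn only on the pass where need_paging has already been
             * cleared, so the message cannot be emitted twice.
             */
            if ( is_hvm_domain(d) && !need_paging )
                printk("WARNING: PVH dom0 without dom0_mem set is still unstable. "
                       "If you get crashes during boot, try adding a dom0_mem parameter\n");
        }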

Jan



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify
  2019-02-11 17:46 ` [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify Roger Pau Monne
@ 2019-02-14 11:25   ` Jan Beulich
  2019-02-14 12:12     ` Roger Pau Monné
  0 siblings, 1 reply; 26+ messages in thread
From: Jan Beulich @ 2019-02-14 11:25 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Kevin Tian, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jun Nakajima, xen-devel

>>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> @@ -948,6 +951,11 @@ static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
>          p2m->ioreq.entry_count++;
>          break;
>  
> +    case p2m_map_foreign:
> +        BUG_ON(!mfn_valid(nfn) ||
> +               !page_get_owner_and_reference(mfn_to_page(nfn)));
> +        break;

Asserting that the passed in MFN is valid is fine. Asserting that a
reference can be got is not, as this sets us up for a DoS in case
of a refcount overflow, or the page having got ballooned out by
its owner. That is, the issue of you folding the two original calls
into one is wider than just the two distinct error codes getting lost
that were previously produced - you can't (currently) report up
any error from this low layer. (And I'm sorry, I should have noticed
this on v1 already.)

Jan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries
  2019-02-11 17:46 ` [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries Roger Pau Monne
@ 2019-02-14 11:38   ` Jan Beulich
  2019-02-14 12:16     ` Roger Pau Monné
  0 siblings, 1 reply; 26+ messages in thread
From: Jan Beulich @ 2019-02-14 11:38 UTC (permalink / raw)
  To: Roger Pau Monne, George Dunlap; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -865,7 +865,8 @@ pod_retry_l1:
>      unmap_domain_page(l1e);
>  
>      ASSERT(mfn_valid(mfn) || !p2m_is_ram(*t) || p2m_is_paging(*t));
> -    return (p2m_is_valid(*t) || p2m_is_grant(*t)) ? mfn : INVALID_MFN;
> +    return (p2m_is_valid(*t) || p2m_is_grant(*t) || p2m_is_foreign(*t))
> +           ? mfn : INVALID_MFN;
>  }

Wouldn't you better alter the ASSERT() as well, using p2m_is_any_ram()
instead of p2m_is_ram() now? Grants should have been included there
before, but omitting foreign ones there before was benign.

I also have an unrelated question here, perhaps more for George:
p2m_is_valid() includes p2m_mmio_dm. There are no valid MFNs
associated with that type iirc. The MFN taken out of the P2M entry,
however, can't possibly be INVALID_MFN (due to bit width
constraints). It would seem better to me though to return
INVALID_MFN for that type. (This would then also apply to the 2M
and 1G cases, which have separate return paths.)
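
Purely to illustrate the idea on the 4K return path of the patch under
review, one conceivable shape - whether excluding p2m_mmio_dm like this is
desirable is exactly the open question raised above, so treat the predicate
as an assumption:

    /* Sketch: don't hand back the (meaningless) MFN stored for p2m_mmio_dm. */
    return ((p2m_is_valid(*t) && *t != p2m_mmio_dm) ||
            p2m_is_grant(*t) || p2m_is_foreign(*t))
           ? mfn : INVALID_MFN;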

Jan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify
  2019-02-14 11:25   ` Jan Beulich
@ 2019-02-14 12:12     ` Roger Pau Monné
  2019-02-14 12:25       ` Jan Beulich
  0 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monné @ 2019-02-14 12:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jun Nakajima, xen-devel

On Thu, Feb 14, 2019 at 04:25:49AM -0700, Jan Beulich wrote:
> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> > @@ -948,6 +951,11 @@ static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
> >          p2m->ioreq.entry_count++;
> >          break;
> >  
> > +    case p2m_map_foreign:
> > +        BUG_ON(!mfn_valid(nfn) ||
> > +               !page_get_owner_and_reference(mfn_to_page(nfn)));
> > +        break;
> 
> Asserting that the passed in MFN is valid is fine. Asserting that a
> reference can be got is not, as this sets us up for a DoS in case
> of a refcount overflow, or the page having got ballooned out by
> its owner. That is, the issue of you folding the two original calls
> into one is wider than just the two distinct error codes getting lost
> that were previously produced - you can't (currently) report up
> any error from this low layer. (And I'm sorry, I should have noticed
> this on v1 already.)

What about using something like:

case p2m_map_foreign:
    BUG_ON(!mfn_valid(nfn));

    if ( !page_get_owner_and_reference(mfn_to_page(nfn)) )
    {
        ASSERT_UNREACHABLE();
        return;
    }

    break;

It's not strictly worse than what's currently done in EPT code, but I
agree it should be improved.

Improving this will mean modifying the write_p2m_entry hook, and all
the callers, together with fixing the handling of the return code from
atomic_write_ept_entry, which is currently only asserted.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries
  2019-02-14 11:38   ` Jan Beulich
@ 2019-02-14 12:16     ` Roger Pau Monné
  2019-02-14 12:30       ` Jan Beulich
  0 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monné @ 2019-02-14 12:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel

On Thu, Feb 14, 2019 at 04:38:54AM -0700, Jan Beulich wrote:
> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> > --- a/xen/arch/x86/mm/p2m-pt.c
> > +++ b/xen/arch/x86/mm/p2m-pt.c
> > @@ -865,7 +865,8 @@ pod_retry_l1:
> >      unmap_domain_page(l1e);
> >  
> >      ASSERT(mfn_valid(mfn) || !p2m_is_ram(*t) || p2m_is_paging(*t));
> > -    return (p2m_is_valid(*t) || p2m_is_grant(*t)) ? mfn : INVALID_MFN;
> > +    return (p2m_is_valid(*t) || p2m_is_grant(*t) || p2m_is_foreign(*t))
> > +           ? mfn : INVALID_MFN;
> >  }
> 
> Wouldn't you better alter the ASSERT() as well, using p2m_is_any_ram()
> instead of p2m_is_ram() now? Grants should have been included there
> before, but omitting foreign ones there before was benign.

Yes, I could use p2m_is_any_ram both in the assert and in the return
condition:

ASSERT(mfn_valid(mfn) || !p2m_is_any_ram(*t) || p2m_is_paging(*t));
return (p2m_is_valid(*t) || p2m_is_any_ram(*t)) ? mfn : INVALID_MFN;

Thanks, Roger.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify
  2019-02-14 12:12     ` Roger Pau Monné
@ 2019-02-14 12:25       ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-02-14 12:25 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Kevin Tian, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jun Nakajima, xen-devel

>>> On 14.02.19 at 13:12, <roger.pau@citrix.com> wrote:
> On Thu, Feb 14, 2019 at 04:25:49AM -0700, Jan Beulich wrote:
>> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
>> > @@ -948,6 +951,11 @@ static inline void p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
>> >          p2m->ioreq.entry_count++;
>> >          break;
>> >  
>> > +    case p2m_map_foreign:
>> > +        BUG_ON(!mfn_valid(nfn) ||
>> > +               !page_get_owner_and_reference(mfn_to_page(nfn)));
>> > +        break;
>> 
>> Asserting that the passed in MFN is valid is fine. Asserting that a
>> reference can be got is not, as this sets us up for a DoS in case
>> of a refcount overflow, or the page having got ballooned out by
>> its owner. That is, the issue of you folding the two original calls
>> into one is wider than just the two distinct error codes getting lost
>> that were previously produced - you can't (currently) report up
>> any error from this low layer. (And I'm sorry, I should have noticed
>> this on v1 already.)
> 
> What about using something like:
> 
> case p2m_map_foreign:
>     BUG_ON(!mfn_valid(nfn));
> 
>     if ( !page_get_owner_and_reference(mfn_to_page(nfn)) )
>     {
>         ASSERT_UNREACHABLE();
>         return;
>     }
> 
>     break;

How would this be any better? In a release build the caller
will now assume all is fine, and the subsequent put_page()
will screw up reference counts.

> It's not strictly worse than what's currently done in EPT code, but I
> agree it should be improved.

The EPT code returns -EBUSY in this case, clearly flagging to
the caller that an error has occurred. Most callers of
atomic_write_ept_entry() indeed ASSERT() right now, but
there's the one crucial one which doesn't, in ept_set_entry().

> Improving this will mean modifying the write_p2m_entry hook, and all
> the callers, together with fixing the handling of the return code from
> atomic_write_ept_entry, which is currently only asserted.

... in most cases.

The alternative is to leave EPT code as it is, and introduce similar
handling into NPT/shadow code.
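
For illustration, a rough sketch of that error-propagating direction -
p2m_entry_modify() returning an error for the caller to unwind instead of
BUG()ing. The signature and error codes are assumptions, and the put_page()
needed when an old foreign entry goes away is omitted for brevity:

static inline int p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
                                   p2m_type_t ot, mfn_t nfn, unsigned int level)
{
    if ( level != 1 || nt == ot )
        return 0;

    switch ( nt )
    {
    case p2m_ioreq_server:
        /* p2m_ioreq_server is only used for 4K pages (level 1 entries). */
        p2m->ioreq.entry_count++;
        break;

    case p2m_map_foreign:
        if ( !mfn_valid(nfn) )
        {
            ASSERT_UNREACHABLE();
            return -EINVAL;
        }
        /* Report failure rather than BUG()ing, mirroring EPT's -EBUSY. */
        if ( !page_get_owner_and_reference(mfn_to_page(nfn)) )
            return -EBUSY;
        break;

    default:
        break;
    }

    switch ( ot )
    {
    case p2m_ioreq_server:
        ASSERT(p2m->ioreq.entry_count > 0);
        p2m->ioreq.entry_count--;
        break;

    default:
        break;
    }

    return 0;
}

The write_p2m_entry hooks and ept_set_entry() would then have to propagate
this return value instead of merely asserting on it.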

Jan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries
  2019-02-14 12:16     ` Roger Pau Monné
@ 2019-02-14 12:30       ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-02-14 12:30 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel

>>> On 14.02.19 at 13:16, <roger.pau@citrix.com> wrote:
> On Thu, Feb 14, 2019 at 04:38:54AM -0700, Jan Beulich wrote:
>> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
>> > --- a/xen/arch/x86/mm/p2m-pt.c
>> > +++ b/xen/arch/x86/mm/p2m-pt.c
>> > @@ -865,7 +865,8 @@ pod_retry_l1:
>> >      unmap_domain_page(l1e);
>> >  
>> >      ASSERT(mfn_valid(mfn) || !p2m_is_ram(*t) || p2m_is_paging(*t));
>> > -    return (p2m_is_valid(*t) || p2m_is_grant(*t)) ? mfn : INVALID_MFN;
>> > +    return (p2m_is_valid(*t) || p2m_is_grant(*t) || p2m_is_foreign(*t))
>> > +           ? mfn : INVALID_MFN;
>> >  }
>> 
>> Wouldn't you better alter the ASSERT() as well, using p2m_is_any_ram()
>> instead of p2m_is_ram() now? Grants should have been included there
>> before, but omitting foreign ones there before was benign.
> 
> Yes, I could use p2m_is_any_ram both in the assert and in the return
> condition:
> 
> ASSERT(mfn_valid(mfn) || !p2m_is_any_ram(*t) || p2m_is_paging(*t));
> return (p2m_is_valid(*t) || p2m_is_any_ram(*t)) ? mfn : INVALID_MFN;

Ah, indeed, it can be used in the return statement as well (and even
helps its legibility).

Jan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries
  2019-02-13 15:53   ` Jan Beulich
@ 2019-02-14 13:59     ` Roger Pau Monné
  2019-02-14 14:48       ` Jan Beulich
  0 siblings, 1 reply; 26+ messages in thread
From: Roger Pau Monné @ 2019-02-14 13:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Wei Liu, George Dunlap, Andrew Cooper,
	george.dunlap, xen-devel

On Wed, Feb 13, 2019 at 08:53:14AM -0700, Jan Beulich wrote:
> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
> > The assert was originally added to make sure that higher order
> > regions (> PAGE_ORDER_4K) could not be used to bypass the
> > mmio_ro_ranges check performed by p2m_type_to_flags.
> > 
> > This however is already checked in set_mmio_p2m_entry, which makes
> > sure that higher order mappings don't overlap with mmio_ro_ranges,
> > thus allowing the creation of high order MMIO mappings safely.
> > 
> > Replace the assert to allow 2M/1G entries to be created for MMIO
> > regions and add some extra asserts as a replacement to make sure
> > there's no overlapping with MMIO read-only ranges.
> > 
> > Note that 1G MMIO entries will not be created unless mmio_order is
> > changed to allow it.
> > 
> > Suggested-by: George Dunlap <george.dunlap@citrix.com>
> 
> Is this still the case? Iirc the original suggestion was to remove
> the assertion altogether?

Right, should I instead add your suggested-by tag?

> > --- a/xen/arch/x86/mm/p2m-pt.c
> > +++ b/xen/arch/x86/mm/p2m-pt.c
> > @@ -576,7 +576,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
> >          }
> >  
> >          ASSERT(p2m_flags_to_type(flags) != p2m_ioreq_server);
> > -        ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
> > +        if ( p2mt == p2m_mmio_direct )
> > +            ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
> > +                   !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
> > +                                            mfn_x(mfn) + PFN_DOWN(MB(2))));
> > +        else if ( p2m_allows_invalid_mfn(p2mt) || p2mt == p2m_invalid ||
> > +                  p2mt == p2m_mmio_dm )
> > +            ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
> > +        else
> > +            ASSERT(mfn_valid(mfn));
> >          l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
> >              ? p2m_l3e_from_pfn(mfn_x(mfn),
> >                                 p2m_type_to_flags(p2m, p2mt, mfn, 2))
> > @@ -668,7 +676,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
> >          }
> >  
> >          ASSERT(p2m_flags_to_type(flags) != p2m_ioreq_server);
> > -        ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
> > +        if ( p2mt == p2m_mmio_direct )
> > +            ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
> > +                   !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
> > +                                            mfn_x(mfn) + PFN_DOWN(MB(2))));
> > +        else if ( p2m_allows_invalid_mfn(p2mt) || p2mt == p2m_invalid ||
> > +                  p2mt == p2m_mmio_dm )
> > +            ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
> > +        else
> > +            ASSERT(mfn_valid(mfn));
> 
> Seeing this supposedly almost the same (but actually entirely the same,
> due to the wrong MB(2) in the first hunk) code I wonder whether this
> wouldn't better be put in a helper (macro or function), together with
> adjacent assertion in context, immediately ahead of the line you alter.

Thanks, I've placed this into a helper function named check_entry,
which is the best name I could come up with.
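
For illustration only, one plausible shape for such a helper - the
signature, and taking the mapping order so the 1G path no longer reuses
MB(2), are assumptions about the next revision rather than its content:

static void check_entry(mfn_t mfn, p2m_type_t new, p2m_type_t old,
                        unsigned int order)
{
    ASSERT(old != p2m_ioreq_server);
    if ( new == p2m_mmio_direct )
        ASSERT(!mfn_eq(mfn, INVALID_MFN) &&
               !rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
                                        mfn_x(mfn) + (1UL << order) - 1));
    else if ( p2m_allows_invalid_mfn(new) || new == p2m_invalid ||
              new == p2m_mmio_dm )
        ASSERT(mfn_valid(mfn) || mfn_eq(mfn, INVALID_MFN));
    else
        ASSERT(mfn_valid(mfn));
}

The two call sites would pass PAGE_ORDER_1G and PAGE_ORDER_2M respectively.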

Roger.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries
  2019-02-14 13:59     ` Roger Pau Monné
@ 2019-02-14 14:48       ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-02-14 14:48 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, Wei Liu, George Dunlap, Andrew Cooper,
	george.dunlap, xen-devel

>>> On 14.02.19 at 14:59, <roger.pau@citrix.com> wrote:
> On Wed, Feb 13, 2019 at 08:53:14AM -0700, Jan Beulich wrote:
>> >>> On 11.02.19 at 18:46, <roger.pau@citrix.com> wrote:
>> > The assert was originally added to make sure that higher order
>> > regions (> PAGE_ORDER_4K) could not be used to bypass the
>> > mmio_ro_ranges check performed by p2m_type_to_flags.
>> > 
>> > This however is already checked in set_mmio_p2m_entry, which makes
>> > sure that higher order mappings don't overlap with mmio_ro_ranges,
>> > thus allowing the creation of high order MMIO mappings safely.
>> > 
>> > Replace the assert to allow 2M/1G entries to be created for MMIO
>> > regions and add some extra asserts as a replacement to make sure
>> > there's no overlapping with MMIO read-only ranges.
>> > 
>> > Note that 1G MMIO entries will not be created unless mmio_order is
>> > changed to allow it.
>> > 
>> > Suggested-by: George Dunlap <george.dunlap@citrix.com>
>> 
>> Is this still the case? Iirc the original suggestion was to remove
>> the assertion altogether?
> 
> Right, should I instead add your suggested-by tag?

Or simply leave it out?

Jan




^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2019-02-14 14:48 UTC | newest]

Thread overview: 26+ messages
2019-02-11 17:46 [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Roger Pau Monne
2019-02-11 17:46 ` [PATCH for-4.12 v2 1/7] dom0/pvh: align allocation and mapping order to start address Roger Pau Monne
2019-02-12 10:52   ` Wei Liu
2019-02-13 15:32   ` Jan Beulich
2019-02-11 17:46 ` [PATCH for-4.12 v2 2/7] amd/npt/shadow: replace assert that prevents creating 2M/1G MMIO entries Roger Pau Monne
2019-02-13 15:53   ` Jan Beulich
2019-02-14 13:59     ` Roger Pau Monné
2019-02-14 14:48       ` Jan Beulich
2019-02-11 17:46 ` [PATCH for-4.12 v2 3/7] x86/pvh: reorder PVH dom0 iommu initialization Roger Pau Monne
2019-02-13 15:58   ` Jan Beulich
2019-02-11 17:46 ` [PATCH for-4.12 v2 4/7] pvh/dom0: warn when dom0_mem is not set Roger Pau Monne
2019-02-12 10:52   ` Wei Liu
2019-02-13 16:01   ` Jan Beulich
2019-02-13 17:13     ` Roger Pau Monné
2019-02-14  8:09       ` Jan Beulich
2019-02-11 17:46 ` [PATCH v2 5/7] x86/mm: split p2m ioreq server pages special handling into helper Roger Pau Monne
2019-02-13 10:23   ` Paul Durrant
2019-02-11 17:46 ` [PATCH v2 6/7] x86/mm: handle foreign mappings in p2m_entry_modify Roger Pau Monne
2019-02-14 11:25   ` Jan Beulich
2019-02-14 12:12     ` Roger Pau Monné
2019-02-14 12:25       ` Jan Beulich
2019-02-11 17:46 ` [PATCH v2 7/7] npt/shadow: allow getting foreign page table entries Roger Pau Monne
2019-02-14 11:38   ` Jan Beulich
2019-02-14 12:16     ` Roger Pau Monné
2019-02-14 12:30       ` Jan Beulich
2019-02-11 19:50 ` [PATCH for-4.12 v2 0/7] pvh/dom0/shadow/amd fixes Sander Eikelenboom
