xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v13 00/16] Fix RMRR (avoid RDM)
@ 2015-07-22 15:44 Ian Jackson
  2015-07-22 15:44 ` [PATCH 01/16] introduce XENMEM_reserved_device_memory_map Ian Jackson
                   ` (18 more replies)
  0 siblings, 19 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Tiejun Chen, Wei Liu, Ian Campbell, Jan Beulich

This is about v13 of the RDM series.  Compared to v11, I have:
 - imported it into git (dropping change notices for all previous version)
 - brought in new versions of 01 (aka "v12a") and 06 from Jan
 - made adjustments as needed by the new 01 from Jan
 - double-checked the acks etc., dropping unjustified acks

git branch here:
  http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
  git://xenbits.xen.org/people/iwj/xen.git
  base.rdm-v13..wip.rdm-v13

*a   01   introduce XENMEM_reserved_device_memory_map
 a   02   xen/vtd: create RMRR mapping
 a   03   xen/passthrough: extend hypercall to support rdm reservation policy
 a   04   xen: enable XENMEM_memory_map in hvm
 a   05   hvmloader: get guest memory map into memory_map[]
*a   06   hvmloader/pci: try to avoid placing BARs in RMRRs
 a   07   hvmloader/e820: construct guest e820 table
* T  08   tools/libxc: Expose new hypercall xc_reserved_device_memory_map
 a   09   tools: extend xc_assign_device() to support rdm reservation policy
 a   10   tools: introduce some new parameters to set rdm policy
* T  11   tools/libxl: detect and avoid conflicts with RDM
 a   12   tools: introduce a new parameter to set a predefined rdm boundary
 a   13   libxl: construct e820 map with RDM information for HVM guest
 a   14   xen/vtd: enable USB device assignment
 a   15   xen/vtd: prevent from assign the device with shared rmrr
 a   16   tools: parse to enable new rdm policy parameters

* = changed in v13, since v11
a = acked enough to commit
T = tools maintainer ack required

Wei, can you please give your formal ack for the two changed patches ?

Tiejun, can you please test this series ?  Hopefully if you say it
still works after this, we can commit it tomorrow.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 01/16] introduce XENMEM_reserved_device_memory_map
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 02/16] xen/vtd: create RMRR mapping Ian Jackson
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Tiejun Chen, Wei Liu, Ian Campbell, Jan Beulich

From: Jan Beulich <jbeulich@suse.com>

This is a prerequisite for punching holes into HVM and PVH guests' P2M
to allow passing through devices that are associated with (on VT-d)
RMRRs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v12a: Move interface structure union member to the end, while moving
     the whole public header block into a __XEN__ / __XEN_TOOLS__
     conditional block.
v12: Restore changes as much as possible to my original version, fixing
     a few issues that got introduced after handing it over. Unionize
     new public memop interface structure to allow for non-PCI to be
     supported later on. Check flags to have all currently undefined
     flags clear. Refine adjustments to xen/pci.h.
---
 xen/common/compat/memory.c           |   65 ++++++++++++++++++++++++++++++++++
 xen/common/memory.c                  |   62 ++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/iommu.c      |   10 ++++++
 xen/drivers/passthrough/vtd/dmar.c   |   27 ++++++++++++++
 xen/drivers/passthrough/vtd/extern.h |    1 +
 xen/drivers/passthrough/vtd/iommu.c  |    1 +
 xen/include/public/memory.h          |   37 ++++++++++++++++++-
 xen/include/xen/iommu.h              |   10 ++++++
 xen/include/xen/pci.h                |    4 +++
 xen/include/xlat.lst                 |    3 +-
 10 files changed, 218 insertions(+), 2 deletions(-)

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index b258138..002948b 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -17,6 +17,42 @@ CHECK_TYPE(domid);
 CHECK_mem_access_op;
 CHECK_vmemrange;
 
+#ifdef HAS_PASSTHROUGH
+struct get_reserved_device_memory {
+    struct compat_reserved_device_memory_map map;
+    unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+                                      u32 id, void *ctxt)
+{
+    struct get_reserved_device_memory *grdm = ctxt;
+    u32 sbdf = PCI_SBDF3(grdm->map.dev.pci.seg, grdm->map.dev.pci.bus,
+                         grdm->map.dev.pci.devfn);
+
+    if ( !(grdm->map.flags & XENMEM_RDM_ALL) && (sbdf != id) )
+        return 0;
+
+    if ( grdm->used_entries < grdm->map.nr_entries )
+    {
+        struct compat_reserved_device_memory rdm = {
+            .start_pfn = start, .nr_pages = nr
+        };
+
+        if ( rdm.start_pfn != start || rdm.nr_pages != nr )
+            return -ERANGE;
+
+        if ( __copy_to_compat_offset(grdm->map.buffer, grdm->used_entries,
+                                     &rdm, 1) )
+            return -EFAULT;
+    }
+
+    ++grdm->used_entries;
+
+    return 1;
+}
+#endif
+
 int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 {
     int split, op = cmd & MEMOP_CMD_MASK;
@@ -303,6 +339,35 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
             break;
         }
 
+#ifdef HAS_PASSTHROUGH
+        case XENMEM_reserved_device_memory_map:
+        {
+            struct get_reserved_device_memory grdm;
+
+            if ( unlikely(start_extent) )
+                return -ENOSYS;
+
+            if ( copy_from_guest(&grdm.map, compat, 1) ||
+                 !compat_handle_okay(grdm.map.buffer, grdm.map.nr_entries) )
+                return -EFAULT;
+
+            if ( grdm.map.flags & ~XENMEM_RDM_ALL )
+                return -EINVAL;
+
+            grdm.used_entries = 0;
+            rc = iommu_get_reserved_device_memory(get_reserved_device_memory,
+                                                  &grdm);
+
+            if ( !rc && grdm.map.nr_entries < grdm.used_entries )
+                rc = -ENOBUFS;
+            grdm.map.nr_entries = grdm.used_entries;
+            if ( __copy_to_guest(compat, &grdm.map, 1) )
+                rc = -EFAULT;
+
+            return rc;
+        }
+#endif
+
         default:
             return compat_arch_memory_op(cmd, compat);
         }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index e5d49d8..61bb94c 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -748,6 +748,39 @@ static int construct_memop_from_reservation(
     return 0;
 }
 
+#ifdef HAS_PASSTHROUGH
+struct get_reserved_device_memory {
+    struct xen_reserved_device_memory_map map;
+    unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+                                      u32 id, void *ctxt)
+{
+    struct get_reserved_device_memory *grdm = ctxt;
+    u32 sbdf = PCI_SBDF3(grdm->map.dev.pci.seg, grdm->map.dev.pci.bus,
+                         grdm->map.dev.pci.devfn);
+
+    if ( !(grdm->map.flags & XENMEM_RDM_ALL) && (sbdf != id) )
+        return 0;
+
+    if ( grdm->used_entries < grdm->map.nr_entries )
+    {
+        struct xen_reserved_device_memory rdm = {
+            .start_pfn = start, .nr_pages = nr
+        };
+
+        if ( __copy_to_guest_offset(grdm->map.buffer, grdm->used_entries,
+                                    &rdm, 1) )
+            return -EFAULT;
+    }
+
+    ++grdm->used_entries;
+
+    return 1;
+}
+#endif
+
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct domain *d;
@@ -1162,6 +1195,35 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+#ifdef HAS_PASSTHROUGH
+    case XENMEM_reserved_device_memory_map:
+    {
+        struct get_reserved_device_memory grdm;
+
+        if ( unlikely(start_extent) )
+            return -ENOSYS;
+
+        if ( copy_from_guest(&grdm.map, arg, 1) ||
+             !guest_handle_okay(grdm.map.buffer, grdm.map.nr_entries) )
+            return -EFAULT;
+
+        if ( grdm.map.flags & ~XENMEM_RDM_ALL )
+            return -EINVAL;
+
+        grdm.used_entries = 0;
+        rc = iommu_get_reserved_device_memory(get_reserved_device_memory,
+                                              &grdm);
+
+        if ( !rc && grdm.map.nr_entries < grdm.used_entries )
+            rc = -ENOBUFS;
+        grdm.map.nr_entries = grdm.used_entries;
+        if ( __copy_to_guest(arg, &grdm.map, 1) )
+            rc = -EFAULT;
+
+        break;
+    }
+#endif
+
     default:
         rc = arch_memory_op(cmd, arg);
         break;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 06cb38f..0b2ef52 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -375,6 +375,16 @@ void iommu_crash_shutdown(void)
     iommu_enabled = iommu_intremap = 0;
 }
 
+int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
+{
+    const struct iommu_ops *ops = iommu_get_ops();
+
+    if ( !iommu_enabled || !ops->get_reserved_device_memory )
+        return 0;
+
+    return ops->get_reserved_device_memory(func, ctxt);
+}
+
 bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)
 {
     const struct hvm_iommu *hd = domain_hvm_iommu(d);
diff --git a/xen/drivers/passthrough/vtd/dmar.c b/xen/drivers/passthrough/vtd/dmar.c
index 2672688..56daac7 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -914,3 +914,30 @@ int platform_supports_x2apic(void)
     unsigned int mask = ACPI_DMAR_INTR_REMAP | ACPI_DMAR_X2APIC_OPT_OUT;
     return cpu_has_x2apic && ((dmar_flags & mask) == ACPI_DMAR_INTR_REMAP);
 }
+
+int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
+{
+    struct acpi_rmrr_unit *rmrr, *rmrr_cur = NULL;
+    unsigned int i;
+    u16 bdf;
+
+    for_each_rmrr_device ( rmrr, bdf, i )
+    {
+        int rc;
+
+        if ( rmrr == rmrr_cur )
+            continue;
+
+        rc = func(PFN_DOWN(rmrr->base_address),
+                  PFN_UP(rmrr->end_address) - PFN_DOWN(rmrr->base_address),
+                  PCI_SBDF2(rmrr->segment, bdf), ctxt);
+
+        if ( unlikely(rc < 0) )
+            return rc;
+
+        if ( rc )
+            rmrr_cur = rmrr;
+    }
+
+    return 0;
+}
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index 5524dba..f9ee9b0 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -75,6 +75,7 @@ int domain_context_mapping_one(struct domain *domain, struct iommu *iommu,
                                u8 bus, u8 devfn, const struct pci_dev *);
 int domain_context_unmap_one(struct domain *domain, struct iommu *iommu,
                              u8 bus, u8 devfn);
+int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt);
 
 unsigned int io_apic_read_remap_rte(unsigned int apic, unsigned int reg);
 void io_apic_write_remap_rte(unsigned int apic,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index a88b5a3..9849d0e 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2490,6 +2490,7 @@ const struct iommu_ops intel_iommu_ops = {
     .crash_shutdown = vtd_crash_shutdown,
     .iotlb_flush = intel_iommu_iotlb_flush,
     .iotlb_flush_all = intel_iommu_iotlb_flush_all,
+    .get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
     .dump_p2m_table = vtd_dump_p2m_table,
 };
 
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 832559a..320de91 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -28,6 +28,7 @@
 #define __XEN_PUBLIC_MEMORY_H__
 
 #include "xen.h"
+#include "physdev.h"
 
 /*
  * Increase or decrease the specified domain's memory reservation. Returns the
@@ -522,6 +523,40 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_sharing_op_t);
  * The zero value is appropiate.
  */
 
+/*
+ * With some legacy devices, certain guest-physical addresses cannot safely
+ * be used for other purposes, e.g. to map guest RAM.  This hypercall
+ * enumerates those regions so the toolstack can avoid using them.
+ */
+#define XENMEM_reserved_device_memory_map   27
+struct xen_reserved_device_memory {
+    xen_pfn_t start_pfn;
+    xen_ulong_t nr_pages;
+};
+typedef struct xen_reserved_device_memory xen_reserved_device_memory_t;
+DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_t);
+
+struct xen_reserved_device_memory_map {
+#define XENMEM_RDM_ALL 1 /* Request all regions (ignore dev union). */
+    /* IN */
+    uint32_t flags;
+    /*
+     * IN/OUT
+     *
+     * Gets set to the required number of entries when too low,
+     * signaled by error code -ERANGE.
+     */
+    unsigned int nr_entries;
+    /* OUT */
+    XEN_GUEST_HANDLE(xen_reserved_device_memory_t) buffer;
+    /* IN */
+    union {
+        struct physdev_pci_device pci;
+    } dev;
+};
+typedef struct xen_reserved_device_memory_map xen_reserved_device_memory_map_t;
+DEFINE_XEN_GUEST_HANDLE(xen_reserved_device_memory_map_t);
+
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
 /*
@@ -573,7 +608,7 @@ struct xen_vnuma_topology_info {
 typedef struct xen_vnuma_topology_info xen_vnuma_topology_info_t;
 DEFINE_XEN_GUEST_HANDLE(xen_vnuma_topology_info_t);
 
-/* Next available subop number is 27 */
+/* Next available subop number is 28 */
 
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
 
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 1d00696..52ed3b7 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -125,6 +125,14 @@ int iommu_do_dt_domctl(struct xen_domctl *, struct domain *,
 
 struct page_info;
 
+/*
+ * Any non-zero value returned from callbacks of this type will cause the
+ * function the callback was handed to terminate its iteration. Assigning
+ * meaning of these non-zero values is left to the top level caller /
+ * callback pair.
+ */
+typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt);
+
 struct iommu_ops {
     int (*init)(struct domain *d);
     void (*hwdom_init)(struct domain *d);
@@ -156,12 +164,14 @@ struct iommu_ops {
     void (*crash_shutdown)(void);
     void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
     void (*iotlb_flush_all)(struct domain *d);
+    int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
 
 void iommu_suspend(void);
 void iommu_resume(void);
 void iommu_crash_shutdown(void);
+int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
 
 void iommu_share_p2m_table(struct domain *d);
 
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 3908146..e85d46f 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -26,6 +26,7 @@
  *  7:3 = slot
  *  2:0 = function
  */
+#define PCI_SEG(sbdf) (((sbdf) >> 16) & 0xffff)
 #define PCI_BUS(bdf)    (((bdf) >> 8) & 0xff)
 #define PCI_SLOT(bdf)   (((bdf) >> 3) & 0x1f)
 #define PCI_FUNC(bdf)   ((bdf) & 0x07)
@@ -33,6 +34,9 @@
 #define PCI_DEVFN2(bdf) ((bdf) & 0xff)
 #define PCI_BDF(b,d,f)  ((((b) & 0xff) << 8) | PCI_DEVFN(d,f))
 #define PCI_BDF2(b,df)  ((((b) & 0xff) << 8) | ((df) & 0xff))
+#define PCI_SBDF(s,b,d,f) ((((s) & 0xffff) << 16) | PCI_BDF(b,d,f))
+#define PCI_SBDF2(s,bdf) ((((s) & 0xffff) << 16) | ((bdf) & 0xffff))
+#define PCI_SBDF3(s,b,df) ((((s) & 0xffff) << 16) | PCI_BDF2(b, df))
 
 struct pci_dev_info {
     bool_t is_extfn;
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 6fb15bf..8cedee7 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -65,9 +65,10 @@
 !	memory_exchange			memory.h
 !	memory_map			memory.h
 !	memory_reservation		memory.h
-?	mem_access_op		memory.h
+?	mem_access_op			memory.h
 !	pod_target			memory.h
 !	remove_from_physmap		memory.h
+!	reserved_device_memory_map	memory.h
 ?	vmemrange			memory.h
 !	vnuma_topology_info		memory.h
 ?	physdev_eoi			physdev.h
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 02/16] xen/vtd: create RMRR mapping
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
  2015-07-22 15:44 ` [PATCH 01/16] introduce XENMEM_reserved_device_memory_map Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy Ian Jackson
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Ian Campbell, George Dunlap, Andrew Cooper,
	Tim Deegan, Jan Beulich, Yang Zhang, Tiejun Chen, Keir Fraser

From: Tiejun Chen <tiejun.chen@intel.com>

RMRR reserved regions must be setup in the pfn space with an identity
mapping to reported mfn. However existing code has problem to setup
correct mapping when VT-d shares EPT page table, so lead to problem
when assigning devices (e.g GPU) with RMRR reported. So instead, this
patch aims to setup identity mapping in p2m layer, regardless of
whether EPT is shared or not. And we still keep creating VT-d table.

And we also need to introduce a pair of helper to create/clear this
sort of identity mapping as follows:

set_identity_p2m_entry():

If the gfn space is unoccupied, we just set the mapping. If space
is already occupied by desired identity mapping, do nothing.
Otherwise, failure is returned.

clear_identity_p2m_entry():

We just define macro to wrapper guest_physmap_remove_page() with
a returning value as necessary.

CC: Tim Deegan <tim@xen.org>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
---
 xen/arch/x86/mm/p2m.c               |   40 +++++++++++++++++++++++++++++++++--
 xen/drivers/passthrough/vtd/iommu.c |    5 ++---
 xen/include/asm-x86/p2m.h           |   13 +++++++++---
 3 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6fe6387..1e763dc 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -584,14 +584,16 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
                          p2m->default_access);
 }
 
-void
+int
 guest_physmap_remove_page(struct domain *d, unsigned long gfn,
                           unsigned long mfn, unsigned int page_order)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
     gfn_lock(p2m, gfn, page_order);
-    p2m_remove_page(p2m, gfn, mfn, page_order);
+    rc = p2m_remove_page(p2m, gfn, mfn, page_order);
     gfn_unlock(p2m, gfn, page_order);
+    return rc;
 }
 
 int
@@ -898,6 +900,40 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
     return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct, access);
 }
 
+int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
+                           p2m_access_t p2ma)
+{
+    p2m_type_t p2mt;
+    p2m_access_t a;
+    mfn_t mfn;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int ret;
+
+    if ( !paging_mode_translate(p2m->domain) )
+        return 0;
+
+    gfn_lock(p2m, gfn, 0);
+
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+
+    if ( p2mt == p2m_invalid || p2mt == p2m_mmio_dm )
+        ret = p2m_set_entry(p2m, gfn, _mfn(gfn), PAGE_ORDER_4K,
+                            p2m_mmio_direct, p2ma);
+    else if ( mfn_x(mfn) == gfn && p2mt == p2m_mmio_direct && a == p2ma )
+        ret = 0;
+    else
+    {
+        ret = -EBUSY;
+        printk(XENLOG_G_WARNING
+               "Cannot setup identity map d%d:%lx,"
+               " gfn already mapped to %lx.\n",
+               d->domain_id, gfn, mfn_x(mfn));
+    }
+
+    gfn_unlock(p2m, gfn, 0);
+    return ret;
+}
+
 /* Returns: 0 for success, -errno for failure */
 int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 9849d0e..5aa482f 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1839,7 +1839,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
 
             while ( base_pfn < end_pfn )
             {
-                if ( intel_iommu_unmap_page(d, base_pfn) )
+                if ( clear_identity_p2m_entry(d, base_pfn, 0) )
                     ret = -ENXIO;
                 base_pfn++;
             }
@@ -1855,8 +1855,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
 
     while ( base_pfn < end_pfn )
     {
-        int err = intel_iommu_map_page(d, base_pfn, base_pfn,
-                                       IOMMUF_readable|IOMMUF_writable);
+        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw);
 
         if ( err )
             return err;
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index b49c09b..190a286 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -503,9 +503,9 @@ static inline int guest_physmap_add_page(struct domain *d,
 }
 
 /* Remove a page from a domain's p2m table */
-void guest_physmap_remove_page(struct domain *d,
-                               unsigned long gfn,
-                               unsigned long mfn, unsigned int page_order);
+int guest_physmap_remove_page(struct domain *d,
+                              unsigned long gfn,
+                              unsigned long mfn, unsigned int page_order);
 
 /* Set a p2m range as populate-on-demand */
 int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
@@ -543,6 +543,13 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
                        p2m_access_t access);
 int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 
+/* Set identity addresses in the p2m table (for pass-through) */
+int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
+                           p2m_access_t p2ma);
+
+#define clear_identity_p2m_entry(d, gfn, page_order) \
+                        guest_physmap_remove_page(d, gfn, gfn, page_order)
+
 /* Add foreign mapping to the guest's p2m table. */
 int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
                     unsigned long gpfn, domid_t foreign_domid);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
  2015-07-22 15:44 ` [PATCH 01/16] introduce XENMEM_reserved_device_memory_map Ian Jackson
  2015-07-22 15:44 ` [PATCH 02/16] xen/vtd: create RMRR mapping Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-23 11:45   ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 04/16] xen: enable XENMEM_memory_map in hvm Ian Jackson
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Ian Campbell, George Dunlap, Andrew Cooper,
	Tim Deegan, Stefano Stabellini, Aravind Gopalakrishnan,
	Jan Beulich, Yang Zhang, Tiejun Chen, Keir Fraser,
	Suravee Suthikulpanit

From: Tiejun Chen <tiejun.chen@intel.com>

This patch extends the existing hypercall to support rdm reservation policy.
We return error or just throw out a warning message depending on whether
the policy is "strict" or "relaxed" when reserving RDM regions in pfn space.
Note in some special cases, e.g. add a device to hwdomain, and remove a
device from user domain, 'relaxed' is fine enough since this is always safe
to hwdomain.

CC: Tim Deegan <tim@xen.org>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/mm/p2m.c                       |    7 +++--
 xen/drivers/passthrough/amd/pci_amd_iommu.c |    3 ++-
 xen/drivers/passthrough/arm/smmu.c          |    2 +-
 xen/drivers/passthrough/device_tree.c       |    3 ++-
 xen/drivers/passthrough/pci.c               |   15 ++++++++---
 xen/drivers/passthrough/vtd/iommu.c         |   37 +++++++++++++++++++++------
 xen/include/asm-x86/p2m.h                   |    2 +-
 xen/include/public/domctl.h                 |    3 +++
 xen/include/xen/iommu.h                     |    2 +-
 9 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 1e763dc..89616b7 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -901,7 +901,7 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
 }
 
 int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
-                           p2m_access_t p2ma)
+                           p2m_access_t p2ma, unsigned int flag)
 {
     p2m_type_t p2mt;
     p2m_access_t a;
@@ -923,7 +923,10 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
         ret = 0;
     else
     {
-        ret = -EBUSY;
+        if ( flag & XEN_DOMCTL_DEV_RDM_RELAXED )
+            ret = 0;
+        else
+            ret = -EBUSY;
         printk(XENLOG_G_WARNING
                "Cannot setup identity map d%d:%lx,"
                " gfn already mapped to %lx.\n",
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index e83bb35..920b35a 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -394,7 +394,8 @@ static int reassign_device(struct domain *source, struct domain *target,
 }
 
 static int amd_iommu_assign_device(struct domain *d, u8 devfn,
-                                   struct pci_dev *pdev)
+                                   struct pci_dev *pdev,
+                                   u32 flag)
 {
     struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
     int bdf = PCI_BDF2(pdev->bus, devfn);
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 6cc4394..9a667e9 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2605,7 +2605,7 @@ static void arm_smmu_destroy_iommu_domain(struct iommu_domain *domain)
 }
 
 static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
-			       struct device *dev)
+			       struct device *dev, u32 flag)
 {
 	struct iommu_domain *domain;
 	struct arm_smmu_xen_domain *xen_domain;
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 5d3842a..7ff79f8 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -52,7 +52,8 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
             goto fail;
     }
 
-    rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev));
+    /* The flag field doesn't matter to DT device. */
+    rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
 
     if ( rc )
         goto fail;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e30be43..c7bbf6e 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1335,7 +1335,7 @@ static int device_assigned(u16 seg, u8 bus, u8 devfn)
     return pdev ? 0 : -EBUSY;
 }
 
-static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
+static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     struct pci_dev *pdev;
@@ -1371,7 +1371,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( (rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev))) )
+    if ( (rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev), flag)) )
         goto done;
 
     for ( ; pdev->phantom_stride; rc = 0 )
@@ -1379,7 +1379,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
         devfn += pdev->phantom_stride;
         if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
             break;
-        rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev));
+        rc = hd->platform_ops->assign_device(d, devfn, pci_to_dev(pdev), flag);
         if ( rc )
             printk(XENLOG_G_WARNING "d%d: assign %04x:%02x:%02x.%u failed (%d)\n",
                    d->domain_id, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
@@ -1496,6 +1496,7 @@ int iommu_do_pci_domctl(
 {
     u16 seg;
     u8 bus, devfn;
+    u32 flag;
     int ret = 0;
     uint32_t machine_sbdf;
 
@@ -1577,9 +1578,15 @@ int iommu_do_pci_domctl(
         seg = machine_sbdf >> 16;
         bus = PCI_BUS(machine_sbdf);
         devfn = PCI_DEVFN2(machine_sbdf);
+        flag = domctl->u.assign_device.flag;
+        if ( flag & ~XEN_DOMCTL_DEV_RDM_RELAXED )
+        {
+            ret = -EINVAL;
+            break;
+        }
 
         ret = device_assigned(seg, bus, devfn) ?:
-              assign_device(d, seg, bus, devfn);
+              assign_device(d, seg, bus, devfn, flag);
         if ( ret == -ERESTART )
             ret = hypercall_create_continuation(__HYPERVISOR_domctl,
                                                 "h", u_domctl);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5aa482f..a2f3a66 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1807,7 +1807,8 @@ static void iommu_set_pgd(struct domain *d)
 }
 
 static int rmrr_identity_mapping(struct domain *d, bool_t map,
-                                 const struct acpi_rmrr_unit *rmrr)
+                                 const struct acpi_rmrr_unit *rmrr,
+                                 u32 flag)
 {
     unsigned long base_pfn = rmrr->base_address >> PAGE_SHIFT_4K;
     unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
@@ -1855,7 +1856,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
 
     while ( base_pfn < end_pfn )
     {
-        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw);
+        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
 
         if ( err )
             return err;
@@ -1898,7 +1899,13 @@ static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)
              PCI_BUS(bdf) == pdev->bus &&
              PCI_DEVFN2(bdf) == devfn )
         {
-            ret = rmrr_identity_mapping(pdev->domain, 1, rmrr);
+            /*
+             * iommu_add_device() is only called for the hardware
+             * domain (see xen/drivers/passthrough/pci.c:pci_add_device()).
+             * Since RMRRs are always reserved in the e820 map for the hardware
+             * domain, there shouldn't be a conflict.
+             */
+            ret = rmrr_identity_mapping(pdev->domain, 1, rmrr, 0);
             if ( ret )
                 dprintk(XENLOG_ERR VTDPREFIX, "d%d: RMRR mapping failed\n",
                         pdev->domain->domain_id);
@@ -1939,7 +1946,11 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
              PCI_DEVFN2(bdf) != devfn )
             continue;
 
-        rmrr_identity_mapping(pdev->domain, 0, rmrr);
+        /*
+         * Any flag is nothing to clear these mappings but here
+         * its always safe and strict to set 0.
+         */
+        rmrr_identity_mapping(pdev->domain, 0, rmrr, 0);
     }
 
     return domain_context_unmap(pdev->domain, devfn, pdev);
@@ -2097,7 +2108,13 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
     spin_lock(&pcidevs_lock);
     for_each_rmrr_device ( rmrr, bdf, i )
     {
-        ret = rmrr_identity_mapping(d, 1, rmrr);
+        /*
+         * Here means we're add a device to the hardware domain.
+         * Since RMRRs are always reserved in the e820 map for the hardware
+         * domain, there shouldn't be a conflict. So its always safe and
+         * strict to set 0.
+         */
+        ret = rmrr_identity_mapping(d, 1, rmrr, 0);
         if ( ret )
             dprintk(XENLOG_ERR VTDPREFIX,
                      "IOMMU: mapping reserved region failed\n");
@@ -2240,7 +2257,11 @@ static int reassign_device_ownership(
                  PCI_BUS(bdf) == pdev->bus &&
                  PCI_DEVFN2(bdf) == devfn )
             {
-                ret = rmrr_identity_mapping(source, 0, rmrr);
+                /*
+                 * Any RMRR flag is always ignored when remove a device,
+                 * but its always safe and strict to set 0.
+                 */
+                ret = rmrr_identity_mapping(source, 0, rmrr, 0);
                 if ( ret != -ENOENT )
                     return ret;
             }
@@ -2264,7 +2285,7 @@ static int reassign_device_ownership(
 }
 
 static int intel_iommu_assign_device(
-    struct domain *d, u8 devfn, struct pci_dev *pdev)
+    struct domain *d, u8 devfn, struct pci_dev *pdev, u32 flag)
 {
     struct acpi_rmrr_unit *rmrr;
     int ret = 0, i;
@@ -2293,7 +2314,7 @@ static int intel_iommu_assign_device(
              PCI_BUS(bdf) == bus &&
              PCI_DEVFN2(bdf) == devfn )
         {
-            ret = rmrr_identity_mapping(d, 1, rmrr);
+            ret = rmrr_identity_mapping(d, 1, rmrr, flag);
             if ( ret )
             {
                 reassign_device_ownership(d, hardware_domain, devfn, pdev);
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 190a286..68da0a9 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -545,7 +545,7 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 
 /* Set identity addresses in the p2m table (for pass-through) */
 int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
-                           p2m_access_t p2ma);
+                           p2m_access_t p2ma, unsigned int flag);
 
 #define clear_identity_p2m_entry(d, gfn, page_order) \
                         guest_physmap_remove_page(d, gfn, gfn, page_order)
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 631935a..675f021 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -471,6 +471,9 @@ struct xen_domctl_assign_device {
             XEN_GUEST_HANDLE_64(char) path; /* path to the device tree node */
         } dt;
     } u;
+    /* IN */
+#define XEN_DOMCTL_DEV_RDM_RELAXED      1
+    uint32_t  flag;   /* flag of assigned device */
 };
 typedef struct xen_domctl_assign_device xen_domctl_assign_device_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_assign_device_t);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 52ed3b7..1124601 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -139,7 +139,7 @@ struct iommu_ops {
     int (*add_device)(u8 devfn, device_t *dev);
     int (*enable_device)(device_t *dev);
     int (*remove_device)(u8 devfn, device_t *dev);
-    int (*assign_device)(struct domain *, u8 devfn, device_t *dev);
+    int (*assign_device)(struct domain *, u8 devfn, device_t *dev, u32 flag);
     int (*reassign_device)(struct domain *s, struct domain *t,
                            u8 devfn, device_t *dev);
 #ifdef HAS_PCI
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/16] xen: enable XENMEM_memory_map in hvm
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (2 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 05/16] hvmloader: get guest memory map into memory_map[] Ian Jackson
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Keir Fraser, Ian Campbell, George Dunlap, Andrew Cooper,
	Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

This patch enables XENMEM_memory_map in hvm. So hvmloader can
use it to setup the e820 mappings.

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/hvm/hvm.c |    2 --
 xen/arch/x86/mm.c      |    6 ------
 2 files changed, 8 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c07e3ef..d860579 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4855,7 +4855,6 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd & MEMOP_CMD_MASK )
     {
-    case XENMEM_memory_map:
     case XENMEM_machine_memory_map:
     case XENMEM_machphys_mapping:
         return -ENOSYS;
@@ -4931,7 +4930,6 @@ static long hvm_memory_op_compat32(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd & MEMOP_CMD_MASK )
     {
-    case XENMEM_memory_map:
     case XENMEM_machine_memory_map:
     case XENMEM_machphys_mapping:
         return -ENOSYS;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 342414f..8c887d8 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4717,12 +4717,6 @@ long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
             return rc;
         }
 
-        if ( is_hvm_domain(d) )
-        {
-            rcu_unlock_domain(d);
-            return -EPERM;
-        }
-
         e820 = xmalloc_array(e820entry_t, fmap.map.nr_entries);
         if ( e820 == NULL )
         {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/16] hvmloader: get guest memory map into memory_map[]
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (3 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 04/16] xen: enable XENMEM_memory_map in hvm Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 06/16] hvmloader/pci: try to avoid placing BARs in RMRRs Ian Jackson
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Andrew Cooper, Ian Jackson, Jan Beulich,
	Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

Now we get this map layout by call XENMEM_memory_map then
save them into one global variable memory_map[]. It should
include lowmem range, rdm range and highmem range. Note
rdm range and highmem range may not exist in some cases.

And here we need to check if any reserved memory conflicts with
[RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END).
This range is used to allocate memory in hvmloder level, and
we would lead hvmloader failed in case of conflict since its
another rare possibility in real world.

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 tools/firmware/hvmloader/e820.c      |   32 ++++++++++++++++++++++++++++++++
 tools/firmware/hvmloader/e820.h      |    7 +++++++
 tools/firmware/hvmloader/hvmloader.c |    2 ++
 tools/firmware/hvmloader/util.c      |   26 ++++++++++++++++++++++++++
 tools/firmware/hvmloader/util.h      |   12 ++++++++++++
 5 files changed, 79 insertions(+)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 2e05e93..7a414ab 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -23,6 +23,38 @@
 #include "config.h"
 #include "util.h"
 
+struct e820map memory_map;
+
+void memory_map_setup(void)
+{
+    unsigned int nr_entries = E820MAX, i;
+    int rc;
+    uint64_t alloc_addr = RESERVED_MEMORY_DYNAMIC_START;
+    uint64_t alloc_size = RESERVED_MEMORY_DYNAMIC_END - alloc_addr;
+
+    rc = get_mem_mapping_layout(memory_map.map, &nr_entries);
+
+    if ( rc || !nr_entries )
+    {
+        printf("Get guest memory maps[%d] failed. (%d)\n", nr_entries, rc);
+        BUG();
+    }
+
+    memory_map.nr_map = nr_entries;
+
+    for ( i = 0; i < nr_entries; i++ )
+    {
+        if ( memory_map.map[i].type == E820_RESERVED &&
+             check_overlap(alloc_addr, alloc_size,
+                           memory_map.map[i].addr, memory_map.map[i].size) )
+        {
+            printf("Fail to setup memory map due to conflict");
+            printf(" on dynamic reserved memory range.\n");
+            BUG();
+        }
+    }
+}
+
 void dump_e820_table(struct e820entry *e820, unsigned int nr)
 {
     uint64_t last_end = 0, start, end;
diff --git a/tools/firmware/hvmloader/e820.h b/tools/firmware/hvmloader/e820.h
index b2ead7f..8b5a9e0 100644
--- a/tools/firmware/hvmloader/e820.h
+++ b/tools/firmware/hvmloader/e820.h
@@ -15,6 +15,13 @@ struct e820entry {
     uint32_t type;
 } __attribute__((packed));
 
+#define E820MAX	128
+
+struct e820map {
+    unsigned int nr_map;
+    struct e820entry map[E820MAX];
+};
+
 #endif /* __HVMLOADER_E820_H__ */
 
 /*
diff --git a/tools/firmware/hvmloader/hvmloader.c b/tools/firmware/hvmloader/hvmloader.c
index 25b7f08..84c588c 100644
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -262,6 +262,8 @@ int main(void)
 
     init_hypercalls();
 
+    memory_map_setup();
+
     xenbus_setup();
 
     bios = detect_bios();
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 80d822f..122e3fa 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -27,6 +27,17 @@
 #include <xen/memory.h>
 #include <xen/sched.h>
 
+/*
+ * Check whether there exists overlap in the specified memory range.
+ * Returns true if exists, else returns false.
+ */
+bool check_overlap(uint64_t start, uint64_t size,
+                   uint64_t reserved_start, uint64_t reserved_size)
+{
+    return (start + size > reserved_start) &&
+            (start < reserved_start + reserved_size);
+}
+
 void wrmsr(uint32_t idx, uint64_t v)
 {
     asm volatile (
@@ -368,6 +379,21 @@ uuid_to_string(char *dest, uint8_t *uuid)
     *p = '\0';
 }
 
+int get_mem_mapping_layout(struct e820entry entries[], uint32_t *max_entries)
+{
+    int rc;
+    struct xen_memory_map memmap = {
+        .nr_entries = *max_entries
+    };
+
+    set_xen_guest_handle(memmap.buffer, entries);
+
+    rc = hypercall_memory_op(XENMEM_memory_map, &memmap);
+    *max_entries = memmap.nr_entries;
+
+    return rc;
+}
+
 void mem_hole_populate_ram(xen_pfn_t mfn, uint32_t nr_mfns)
 {
     static int over_allocated;
diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
index f99c0f19..1100a3b 100644
--- a/tools/firmware/hvmloader/util.h
+++ b/tools/firmware/hvmloader/util.h
@@ -4,8 +4,10 @@
 #include <stdarg.h>
 #include <stdint.h>
 #include <stddef.h>
+#include <stdbool.h>
 #include <xen/xen.h>
 #include <xen/hvm/hvm_info_table.h>
+#include "e820.h"
 
 #define __STR(...) #__VA_ARGS__
 #define STR(...) __STR(__VA_ARGS__)
@@ -222,6 +224,9 @@ int hvm_param_set(uint32_t index, uint64_t value);
 /* Setup PCI bus */
 void pci_setup(void);
 
+/* Setup memory map  */
+void memory_map_setup(void);
+
 /* Prepare the 32bit BIOS */
 uint32_t rombios_highbios_setup(void);
 
@@ -249,6 +254,13 @@ void perform_tests(void);
 
 extern char _start[], _end[];
 
+int get_mem_mapping_layout(struct e820entry entries[],
+                           unsigned int *max_entries);
+
+extern struct e820map memory_map;
+bool check_overlap(uint64_t start, uint64_t size,
+                   uint64_t reserved_start, uint64_t reserved_size);
+
 #endif /* __HVMLOADER_UTIL_H__ */
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/16] hvmloader/pci: try to avoid placing BARs in RMRRs
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (4 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 05/16] hvmloader: get guest memory map into memory_map[] Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 07/16] hvmloader/e820: construct guest e820 table Ian Jackson
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Tiejun Chen, Wei Liu, Ian Campbell, Jan Beulich

From: George Dunlap <george.dunlap@eu.citrix.com>

Try to avoid placing PCI BARs over RMRRs:

- If mmio_hole_size is not specified, and the existing MMIO range has
  RMRRs in it, and there is space to expand the hole in lowmem without
  moving more memory, then make the MMIO hole as large as possible.

- When placing RMRRs, find the next RMRR higher than the current base
  in the lowmem mmio hole.  If it overlaps, skip ahead of it and find
  the next one.

This certainly won't work in all cases, but it should work in a
significant number of cases.  Additionally, users should be able to
work around problems by setting mmio_hole_size larger in the guest
config.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/firmware/hvmloader/pci.c |   64 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 5ff87a7..ff81fba 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -38,6 +38,45 @@ uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;
 
+/* Check if the specified range conflicts with any reserved device memory. */
+static bool check_overlap_all(uint64_t start, uint64_t size)
+{
+    unsigned int i;
+
+    for ( i = 0; i < memory_map.nr_map; i++ )
+    {
+        if ( memory_map.map[i].type == E820_RESERVED &&
+             check_overlap(start, size,
+                           memory_map.map[i].addr,
+                           memory_map.map[i].size) )
+            return true;
+    }
+
+    return false;
+}
+
+/* Find the lowest RMRR ending above base but below 4G. */
+static int find_next_rmrr(uint32_t base)
+{
+    unsigned int i;
+    int next_rmrr = -1;
+    uint64_t end, min_end = 1ULL << 32;
+
+    for ( i = 0; i < memory_map.nr_map ; i++ )
+    {
+        end = memory_map.map[i].addr + memory_map.map[i].size;
+
+        if ( memory_map.map[i].type == E820_RESERVED &&
+             end > base && end <= min_end )
+        {
+            next_rmrr = i;
+            min_end = end;
+        }
+    }
+
+    return next_rmrr;
+}
+
 void pci_setup(void)
 {
     uint8_t is_64bar, using_64bar, bar64_relocate = 0;
@@ -46,6 +85,7 @@ void pci_setup(void)
     uint32_t vga_devfn = 256;
     uint16_t class, vendor_id, device_id;
     unsigned int bar, pin, link, isa_irq;
+    int next_rmrr;
 
     /* Resources assignable to PCI devices via BARs. */
     struct resource {
@@ -299,6 +339,15 @@ void pci_setup(void)
                     || (((pci_mem_start << 1) >> PAGE_SHIFT)
                         >= hvm_info->low_mem_pgend)) )
             pci_mem_start <<= 1;
+
+        /*
+         * Try to accommodate RMRRs in our MMIO region on a best-effort basis.
+         * If we have RMRRs in the range, then make pci_mem_start just after
+         * hvm_info->low_mem_pgend.
+         */
+        if ( pci_mem_start > (hvm_info->low_mem_pgend << PAGE_SHIFT) &&
+             check_overlap_all(pci_mem_start, pci_mem_end-pci_mem_start) )
+            pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
     }
 
     if ( mmio_total > (pci_mem_end - pci_mem_start) )
@@ -352,6 +401,8 @@ void pci_setup(void)
     io_resource.base = 0xc000;
     io_resource.max = 0x10000;
 
+    next_rmrr = find_next_rmrr(pci_mem_start);
+
     /* Assign iomem and ioport resources in descending order of size. */
     for ( i = 0; i < nr_bars; i++ )
     {
@@ -407,6 +458,19 @@ void pci_setup(void)
         }
 
         base = (resource->base  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
+
+        /* If we're using mem_resource, check for RMRR conflicts. */
+        while ( resource == &mem_resource &&
+                next_rmrr >= 0 &&
+                check_overlap(base, bar_sz,
+                              memory_map.map[next_rmrr].addr,
+                              memory_map.map[next_rmrr].size) )
+        {
+            base = memory_map.map[next_rmrr].addr + memory_map.map[next_rmrr].size;
+            base = (base + bar_sz - 1) & ~(bar_sz - 1);
+            next_rmrr = find_next_rmrr(base);
+        }
+
         bar_data |= (uint32_t)base;
         bar_data_upper = (uint32_t)(base >> 32);
         base += bar_sz;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/16] hvmloader/e820: construct guest e820 table
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (5 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 06/16] hvmloader/pci: try to avoid placing BARs in RMRRs Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Ian Jackson
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Andrew Cooper, Ian Jackson, Jan Beulich,
	Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
---
 tools/firmware/hvmloader/e820.c |  109 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 96 insertions(+), 13 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 7a414ab..a6cacdf 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820,
                      unsigned int lowmem_reserved_base,
                      unsigned int bios_image_base)
 {
-    unsigned int nr = 0;
+    unsigned int nr = 0, i, j;
+    uint32_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT;
+    uint32_t add_high_mem = 0;
+    uint64_t high_mem_end = (uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT;
+    uint64_t map_start, map_size, map_end;
 
     if ( !lowmem_reserved_base )
             lowmem_reserved_base = 0xA0000;
@@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820,
     e820[nr].type = E820_RESERVED;
     nr++;
 
-    /* Low RAM goes here. Reserve space for special pages. */
-    BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
-    e820[nr].addr = 0x100000;
-    e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-    e820[nr].type = E820_RAM;
-    nr++;
-
     /*
      * Explicitly reserve space for special pages.
      * This space starts at RESERVED_MEMBASE an extends to cover various
@@ -191,16 +188,102 @@ int build_e820_table(struct e820entry *e820,
         nr++;
     }
 
+    /* Low RAM goes here. Reserve space for special pages. */
+    BUG_ON(low_mem_end < (2u << 20));
 
-    if ( hvm_info->high_mem_pgend )
+    /*
+     * Construct E820 table according to recorded memory map.
+     *
+     * The memory map created by toolstack may include,
+     *
+     * #1. Low memory region
+     *
+     * Low RAM starts at least from 1M to make sure all standard regions
+     * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+     * have enough space.
+     *
+     * #2. Reserved regions if they exist
+     *
+     * #3. High memory region if it exists
+     *
+     * Note we just have one low memory entry and one high mmeory entry if
+     * exists.
+     *
+     * But we may have relocated RAM to allocate sufficient MMIO previously
+     * so low_mem_pgend would be changed over there. And here memory_map[]
+     * records the original low/high memory, so if low_mem_end is less than
+     * the original we need to revise low/high memory range firstly.
+     */
+    for ( i = 0; i < memory_map.nr_map; i++ )
     {
-        e820[nr].addr = ((uint64_t)1 << 32);
-        e820[nr].size =
-            ((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-        e820[nr].type = E820_RAM;
+        map_start = memory_map.map[i].addr;
+        map_size = memory_map.map[i].size;
+        map_end = map_start + map_size;
+
+        /* If we need to adjust lowmem. */
+        if ( memory_map.map[i].type == E820_RAM &&
+             low_mem_end > map_start && low_mem_end < map_end )
+        {
+            add_high_mem = map_end - low_mem_end;
+            memory_map.map[i].size = low_mem_end - map_start;
+            break;
+        }
+    }
+
+    /* If we need to adjust highmem. */
+    if ( add_high_mem )
+    {
+        /* Modify the existing highmem region if it exists. */
+        for ( i = 0; i < memory_map.nr_map; i++ )
+        {
+            map_start = memory_map.map[i].addr;
+            map_size = memory_map.map[i].size;
+            map_end = map_start + map_size;
+
+            if ( memory_map.map[i].type == E820_RAM &&
+                 map_start == ((uint64_t)1 << 32))
+            {
+                memory_map.map[i].size += add_high_mem;
+                break;
+            }
+        }
+
+        /* If there was no highmem region, just create one. */
+        if ( i == memory_map.nr_map )
+        {
+            memory_map.map[i].addr = ((uint64_t)1 << 32);
+            memory_map.map[i].size = add_high_mem;
+            memory_map.map[i].type = E820_RAM;
+            memory_map.nr_map++;
+        }
+
+        /* A sanity check if high memory is broken. */
+        BUG_ON( high_mem_end !=
+                memory_map.map[i].addr + memory_map.map[i].size);
+    }
+
+    /* Now fill e820. */
+    for ( i = 0; i < memory_map.nr_map; i++ )
+    {
+        e820[nr] = memory_map.map[i];
         nr++;
     }
 
+    /* Finally we need to sort all e820 entries. */
+    for ( j = 0; j < nr - 1; j++ )
+    {
+        for ( i = j + 1; i < nr; i++ )
+        {
+            if ( e820[j].addr > e820[i].addr )
+            {
+                struct e820entry tmp = e820[j];
+
+                e820[j] = e820[i];
+                e820[i] = tmp;
+            }
+        }
+    }
+
     return nr;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (6 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 07/16] hvmloader/e820: construct guest e820 table Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:51   ` Wei Liu
  2015-07-22 15:44 ` [PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy Ian Jackson
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

We will introduce the hypercall xc_reserved_device_memory_map
approach to libxc. This helps us get rdm entry info according to
different parameters. If flag == PCI_DEV_RDM_ALL, all entries
should be exposed. Or we just expose that rdm entry specific to
a SBDF.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

---
v13: Mechanical changes to deal with changes to patch 01/
     XENMEM_reserved_device_memory_map.
---
 tools/libxc/include/xenctrl.h |    8 ++++++++
 tools/libxc/xc_domain.c       |   36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index ce9029c..9c5ef8b 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1314,6 +1314,14 @@ int xc_domain_set_memory_map(xc_interface *xch,
 int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries);
+
+int xc_reserved_device_memory_map(xc_interface *xch,
+                                  uint32_t flags,
+                                  uint16_t seg,
+                                  uint8_t bus,
+                                  uint8_t devfn,
+                                  struct xen_reserved_device_memory entries[],
+                                  uint32_t *max_entries);
 #endif
 int xc_domain_set_time_offset(xc_interface *xch,
                               uint32_t domid,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 6db8d13..3151103 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -684,6 +684,42 @@ int xc_domain_set_memory_map(xc_interface *xch,
 
     return rc;
 }
+
+int xc_reserved_device_memory_map(xc_interface *xch,
+                                  uint32_t flags,
+                                  uint16_t seg,
+                                  uint8_t bus,
+                                  uint8_t devfn,
+                                  struct xen_reserved_device_memory entries[],
+                                  uint32_t *max_entries)
+{
+    int rc;
+    struct xen_reserved_device_memory_map xrdmmap = {
+        .flags = flags,
+        .dev.pci.seg = seg,
+        .dev.pci.bus = bus,
+        .dev.pci.devfn = devfn,
+        .nr_entries = *max_entries
+    };
+    DECLARE_HYPERCALL_BOUNCE(entries,
+                             sizeof(struct xen_reserved_device_memory) *
+                             *max_entries, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( xc_hypercall_bounce_pre(xch, entries) )
+        return -1;
+
+    set_xen_guest_handle(xrdmmap.buffer, entries);
+
+    rc = do_memory_op(xch, XENMEM_reserved_device_memory_map,
+                      &xrdmmap, sizeof(xrdmmap));
+
+    xc_hypercall_bounce_post(xch, entries);
+
+    *max_entries = xrdmmap.nr_entries;
+
+    return rc;
+}
+
 int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (7 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 10/16] tools: introduce some new parameters to set rdm policy Ian Jackson
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen, David Scott

From: Tiejun Chen <tiejun.chen@intel.com>

This patch passes rdm reservation policy to xc_assign_device() so the policy
is checked when assigning devices to a VM.

Note this also bring some fallout to python usage of xc_assign_device().

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: David Scott <dave.scott@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenctrl.h       |    3 ++-
 tools/libxc/xc_domain.c             |    9 ++++++++-
 tools/libxl/libxl_pci.c             |    3 ++-
 tools/ocaml/libs/xc/xenctrl_stubs.c |   16 ++++++++++++----
 tools/python/xen/lowlevel/xc/xc.c   |   30 ++++++++++++++++++++----------
 5 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 9c5ef8b..3f2381a 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2067,7 +2067,8 @@ int xc_hvm_destroy_ioreq_server(xc_interface *xch,
 /* HVM guest pass-through */
 int xc_assign_device(xc_interface *xch,
                      uint32_t domid,
-                     uint32_t machine_sbdf);
+                     uint32_t machine_sbdf,
+                     uint32_t flag);
 
 int xc_get_device_group(xc_interface *xch,
                      uint32_t domid,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 3151103..df06314 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1697,7 +1697,8 @@ int xc_domain_setdebugging(xc_interface *xch,
 int xc_assign_device(
     xc_interface *xch,
     uint32_t domid,
-    uint32_t machine_sbdf)
+    uint32_t machine_sbdf,
+    uint32_t flag)
 {
     DECLARE_DOMCTL;
 
@@ -1705,6 +1706,7 @@ int xc_assign_device(
     domctl.domain = domid;
     domctl.u.assign_device.dev = XEN_DOMCTL_DEV_PCI;
     domctl.u.assign_device.u.pci.machine_sbdf = machine_sbdf;
+    domctl.u.assign_device.flag = flag;
 
     return do_domctl(xch, &domctl);
 }
@@ -1792,6 +1794,11 @@ int xc_assign_dt_device(
 
     domctl.u.assign_device.dev = XEN_DOMCTL_DEV_DT;
     domctl.u.assign_device.u.dt.size = size;
+    /*
+     * DT doesn't own any RDM so actually DT has nothing to do
+     * for any flag and here just fix that as 0.
+     */
+    domctl.u.assign_device.flag = 0;
     set_xen_guest_handle(domctl.u.assign_device.u.dt.path, path);
 
     rc = do_domctl(xch, &domctl);
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index e0743f8..632c15e 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -894,6 +894,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
     FILE *f;
     unsigned long long start, end, flags, size;
     int irq, i, rc, hvm = 0;
+    uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
 
     if (type == LIBXL_DOMAIN_TYPE_INVALID)
         return ERROR_FAIL;
@@ -987,7 +988,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
 
 out:
     if (!libxl_is_stubdom(ctx, domid, NULL)) {
-        rc = xc_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev));
+        rc = xc_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev), flag);
         if (rc < 0 && (hvm || errno != ENOSYS)) {
             LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_assign_device failed");
             return ERROR_FAIL;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 64f1137..b7de615 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -1172,12 +1172,17 @@ CAMLprim value stub_xc_domain_test_assign_device(value xch, value domid, value d
 	CAMLreturn(Val_bool(ret == 0));
 }
 
-CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc)
+static int domain_assign_device_rdm_flag_table[] = {
+    XEN_DOMCTL_DEV_RDM_RELAXED,
+};
+
+CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc,
+                                            value rflag)
 {
-	CAMLparam3(xch, domid, desc);
+	CAMLparam4(xch, domid, desc, rflag);
 	int ret;
 	int domain, bus, dev, func;
-	uint32_t sbdf;
+	uint32_t sbdf, flag;
 
 	domain = Int_val(Field(desc, 0));
 	bus = Int_val(Field(desc, 1));
@@ -1185,7 +1190,10 @@ CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc)
 	func = Int_val(Field(desc, 3));
 	sbdf = encode_sbdf(domain, bus, dev, func);
 
-	ret = xc_assign_device(_H(xch), _D(domid), sbdf);
+	ret = Int_val(Field(rflag, 0));
+	flag = domain_assign_device_rdm_flag_table[ret];
+
+	ret = xc_assign_device(_H(xch), _D(domid), sbdf, flag);
 
 	if (ret < 0)
 		failwith_xc(_H(xch));
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index ee3e1d0..c8380d1 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -592,7 +592,8 @@ static int token_value(char *token)
     return strtol(token, NULL, 16);
 }
 
-static int next_bdf(char **str, int *seg, int *bus, int *dev, int *func)
+static int next_bdf(char **str, int *seg, int *bus, int *dev, int *func,
+                    int *flag)
 {
     char *token;
 
@@ -607,8 +608,17 @@ static int next_bdf(char **str, int *seg, int *bus, int *dev, int *func)
     *dev  = token_value(token);
     token = strchr(token, ',') + 1;
     *func  = token_value(token);
-    token = strchr(token, ',');
-    *str = token ? token + 1 : NULL;
+    token = strchr(token, ',') + 1;
+    if ( token ) {
+        *flag = token_value(token);
+        *str = token + 1;
+    }
+    else
+    {
+        /* O means we take "strict" as our default policy. */
+        *flag = 0;
+        *str = NULL;
+    }
 
     return 1;
 }
@@ -620,14 +630,14 @@ static PyObject *pyxc_test_assign_device(XcObject *self,
     uint32_t dom;
     char *pci_str;
     int32_t sbdf = 0;
-    int seg, bus, dev, func;
+    int seg, bus, dev, func, flag;
 
     static char *kwd_list[] = { "domid", "pci", NULL };
     if ( !PyArg_ParseTupleAndKeywords(args, kwds, "is", kwd_list,
                                       &dom, &pci_str) )
         return NULL;
 
-    while ( next_bdf(&pci_str, &seg, &bus, &dev, &func) )
+    while ( next_bdf(&pci_str, &seg, &bus, &dev, &func, &flag) )
     {
         sbdf = seg << 16;
         sbdf |= (bus & 0xff) << 8;
@@ -653,21 +663,21 @@ static PyObject *pyxc_assign_device(XcObject *self,
     uint32_t dom;
     char *pci_str;
     int32_t sbdf = 0;
-    int seg, bus, dev, func;
+    int seg, bus, dev, func, flag;
 
     static char *kwd_list[] = { "domid", "pci", NULL };
     if ( !PyArg_ParseTupleAndKeywords(args, kwds, "is", kwd_list,
                                       &dom, &pci_str) )
         return NULL;
 
-    while ( next_bdf(&pci_str, &seg, &bus, &dev, &func) )
+    while ( next_bdf(&pci_str, &seg, &bus, &dev, &func, &flag) )
     {
         sbdf = seg << 16;
         sbdf |= (bus & 0xff) << 8;
         sbdf |= (dev & 0x1f) << 3;
         sbdf |= (func & 0x7);
 
-        if ( xc_assign_device(self->xc_handle, dom, sbdf) != 0 )
+        if ( xc_assign_device(self->xc_handle, dom, sbdf, flag) != 0 )
         {
             if (errno == ENOSYS)
                 sbdf = -1;
@@ -686,14 +696,14 @@ static PyObject *pyxc_deassign_device(XcObject *self,
     uint32_t dom;
     char *pci_str;
     int32_t sbdf = 0;
-    int seg, bus, dev, func;
+    int seg, bus, dev, func, flag;
 
     static char *kwd_list[] = { "domid", "pci", NULL };
     if ( !PyArg_ParseTupleAndKeywords(args, kwds, "is", kwd_list,
                                       &dom, &pci_str) )
         return NULL;
 
-    while ( next_bdf(&pci_str, &seg, &bus, &dev, &func) )
+    while ( next_bdf(&pci_str, &seg, &bus, &dev, &func, &flag) )
     {
         sbdf = seg << 16;
         sbdf |= (bus & 0xff) << 8;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/16] tools: introduce some new parameters to set rdm policy
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (8 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM Ian Jackson
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

This patch introduces user configurable parameters to specify RDM
resource and according policies,

Global RDM parameter:
    rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
    pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

Global RDM parameter, "strategy", allows user to specify reserved regions
explicitly, Currently, using 'host' to include all reserved regions reported
on this platform which is good to handle hotplug scenario. In the future
this parameter may be further extended to allow specifying random regions,
e.g. even those belonging to another platform as a preparation for live
migration with passthrough devices. By default this isn't set so we don't
check all rdms. Instead, we just check rdm specific to a given device if
you're assigning this kind of device. Note this option is not recommended
unless you can make sure any conflict does exist.

'strict/relaxed' policy decides how to handle conflict when reserving RDM
regions in pfn space. If conflict exists, 'strict' means an immediate error
so VM can't keep running, while 'relaxed' allows moving forward with a
warning message thrown out.

Default per-device RDM policy is same as default global RDM policy as being
'relaxed'. And the per-device policy would override the global policy like
others.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Checked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 docs/man/xl.cfg.pod.5        |   81 ++++++++++++++++++++++++++++++++++++++++++
 docs/misc/vtd.txt            |   24 +++++++++++++
 tools/libxl/libxl_create.c   |    7 ++++
 tools/libxl/libxl_internal.h |    2 ++
 tools/libxl/libxl_pci.c      |    9 +++++
 tools/libxl/libxl_types.idl  |   18 ++++++++++
 6 files changed, 141 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 382f30b..e6e0f70 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -633,6 +633,79 @@ assigned slave device.
 
 =back
 
+=item B<rdm="RDM_RESERVATION_STRING">
+
+(HVM/x86 only) Specifies information about Reserved Device Memory (RDM),
+which is necessary to enable robust device passthrough. One example of RDM
+is reported through ACPI Reserved Memory Region Reporting (RMRR) structure
+on x86 platform.
+
+B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
+
+=over 4
+
+=item B<KEY=VALUE>
+
+Possible B<KEY>s are:
+
+=over 4
+
+=item B<strategy="STRING">
+
+Currently there is only one valid type:
+
+"host" means all reserved device memory on this platform should be checked to
+reserve regions in this VM's guest address space. This global rdm parameter
+allows user to specify reserved regions explicitly, and using "host" includes
+all reserved regions reported on this platform, which is useful when doing
+hotplug.
+
+By default this isn't set so we don't check all rdms. Instead, we just check
+rdm specific to a given device if you're assigning this kind of device. Note
+this option is not recommended unless you can make sure any conflict does exist.
+
+For example, you're trying to set "memory = 2800" to allocate memory to one
+given VM but the platform owns two RDM regions like,
+
+Device A [sbdf_A]: RMRR region_A: base_addr ac6d3000 end_address ac6e6fff
+Device B [sbdf_B]: RMRR region_B: base_addr ad800000 end_address afffffff
+
+In this conflict case,
+
+#1. If B<strategy> is set to "host", for example,
+
+rdm = "strategy=host,policy=strict" or rdm = "strategy=host,policy=relaxed"
+
+It means all conflicts will be handled according to the policy
+introduced by B<policy> as described below.
+
+#2. If B<strategy> is not set at all, but
+
+pci = [ 'sbdf_A, rdm_policy=xxxxx' ]
+
+It means only one conflict of region_A will be handled according to the policy
+introduced by B<rdm_policy="STRING"> as described inside pci options.
+
+=item B<policy="STRING">
+
+Specifies how to deal with conflicts when reserving reserved device
+memory in guest address space.
+
+When that conflict is unsolved,
+
+"strict" means VM can't be created, or the associated device can't be
+attached in the case of hotplug.
+
+"relaxed" allows VM to be created but may cause VM to crash if
+pass-through device accesses RDM. For exampl,e Windows IGD GFX driver
+always accessed RDM regions so it leads to VM crash.
+
+Note this may be overridden by rdm_policy option in PCI device configuration.
+
+=back
+
+=back
+
 =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]>
 
 Specifies the host PCI devices to passthrough to this guest. Each B<PCI_SPEC_STRING>
@@ -695,6 +768,14 @@ dom0 without confirmation.  Please use with care.
 D0-D3hot power management states for the PCI device. False (0) by
 default.
 
+=item B<rdm_policy="STRING">
+
+(HVM/x86 only) This is same as policy option inside the rdm option but
+just specific to a given device. Therefore the default is "relaxed" as
+same as policy option as well.
+
+Note this would override global B<rdm> option.
+
 =back
 
 =back
diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt
index 9af0e99..88b2102 100644
--- a/docs/misc/vtd.txt
+++ b/docs/misc/vtd.txt
@@ -111,6 +111,30 @@ in the config file:
 To override for a specific device:
 	pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
 
+RDM, 'reserved device memory', for PCI Device Passthrough
+---------------------------------------------------------
+
+There are some devices the BIOS controls, for e.g. USB devices to perform
+PS2 emulation. The regions of memory used for these devices are marked
+reserved in the e820 map. When we turn on DMA translation, DMA to those
+regions will fail. Hence BIOS uses RMRR to specify these regions along with
+devices that need to access these regions. OS is expected to setup
+identity mappings for these regions for these devices to access these regions.
+
+While creating a VM we should reserve them in advance, and avoid any conflicts.
+So we introduce user configurable parameters to specify RDM resource and
+according policies,
+
+To enable this globally, add "rdm" in the config file:
+
+    rdm = "strategy=host, policy=relaxed"   (default policy is "relaxed")
+
+Or just for a specific device:
+
+    pci = [ '01:00.0,rdm_policy=relaxed', '03:00.0,rdm_policy=strict' ]
+
+For all the options available to RDM, see xl.cfg(5).
+
 
 Caveat on Conventional PCI Device Passthrough
 ---------------------------------------------
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index a32e3df..7c884c4 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -50,6 +50,12 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc,
     return 0;
 }
 
+void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
+{
+    if (b_info->u.hvm.rdm.policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
+        b_info->u.hvm.rdm.policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+}
+
 int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
@@ -332,6 +338,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 
         libxl_defbool_setdefault(&b_info->u.hvm.gfx_passthru, false);
 
+        libxl__rdm_setdefault(gc, b_info);
         break;
     case LIBXL_DOMAIN_TYPE_PV:
         libxl_defbool_setdefault(&b_info->u.pv.e820_host, false);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3c09668..4bb0f38 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1176,6 +1176,8 @@ _hidden int libxl__device_vtpm_setdefault(libxl__gc *gc, libxl_device_vtpm *vtpm
 _hidden int libxl__device_vfb_setdefault(libxl__gc *gc, libxl_device_vfb *vfb);
 _hidden int libxl__device_vkb_setdefault(libxl__gc *gc, libxl_device_vkb *vkb);
 _hidden int libxl__device_pci_setdefault(libxl__gc *gc, libxl_device_pci *pci);
+_hidden void libxl__rdm_setdefault(libxl__gc *gc,
+                                   libxl_domain_build_info *b_info);
 
 _hidden const char *libxl__device_nic_devname(libxl__gc *gc,
                                               uint32_t domid,
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 632c15e..1ebdce7 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -988,6 +988,12 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
 
 out:
     if (!libxl_is_stubdom(ctx, domid, NULL)) {
+        if (pcidev->rdm_policy == LIBXL_RDM_RESERVE_POLICY_STRICT) {
+            flag &= ~XEN_DOMCTL_DEV_RDM_RELAXED;
+        } else if (pcidev->rdm_policy != LIBXL_RDM_RESERVE_POLICY_RELAXED) {
+            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "unknown rdm check flag.");
+            return ERROR_FAIL;
+        }
         rc = xc_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev), flag);
         if (rc < 0 && (hvm || errno != ENOSYS)) {
             LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_assign_device failed");
@@ -1040,6 +1046,9 @@ static int libxl__device_pci_reset(libxl__gc *gc, unsigned int domain, unsigned
 
 int libxl__device_pci_setdefault(libxl__gc *gc, libxl_device_pci *pci)
 {
+    /* We'd like to force reserve rdm specific to a device by default.*/
+    if (pci->rdm_policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
+        pci->rdm_policy = LIBXL_RDM_RESERVE_POLICY_STRICT;
     return 0;
 }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index bc0c4ef..db9f75a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -77,6 +77,17 @@ libxl_domain_type = Enumeration("domain_type", [
     (2, "PV"),
     ], init_val = "LIBXL_DOMAIN_TYPE_INVALID")
 
+libxl_rdm_reserve_strategy = Enumeration("rdm_reserve_strategy", [
+    (0, "ignore"),
+    (1, "host"),
+    ])
+
+libxl_rdm_reserve_policy = Enumeration("rdm_reserve_policy", [
+    (-1, "invalid"),
+    (0, "strict"),
+    (1, "relaxed"),
+    ], init_val = "LIBXL_RDM_RESERVE_POLICY_INVALID")
+
 libxl_channel_connection = Enumeration("channel_connection", [
     (0, "UNKNOWN"),
     (1, "PTY"),
@@ -387,6 +398,11 @@ libxl_gic_version = Enumeration("gic_version", [
     (0x30, "v3")
     ], init_val = "LIBXL_GIC_VERSION_DEFAULT")
 
+libxl_rdm_reserve = Struct("rdm_reserve", [
+    ("strategy",    libxl_rdm_reserve_strategy),
+    ("policy",      libxl_rdm_reserve_policy),
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),
     ("avail_vcpus",     libxl_bitmap),
@@ -486,6 +502,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        # See libxl_ms_vm_genid_generate()
                                        ("ms_vm_genid",      libxl_ms_vm_genid),
                                        ("serial_list",      libxl_string_list),
+                                       ("rdm", libxl_rdm_reserve),
                                        ])),
                  ("pv", Struct(None, [("kernel", string),
                                       ("slack_memkb", MemKB),
@@ -566,6 +583,7 @@ libxl_device_pci = Struct("device_pci", [
     ("power_mgmt", bool),
     ("permissive", bool),
     ("seize", bool),
+    ("rdm_policy",      libxl_rdm_reserve_policy),
     ])
 
 libxl_device_dtdev = Struct("device_dtdev", [
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (9 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 10/16] tools: introduce some new parameters to set rdm policy Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:53   ` Wei Liu
  2015-07-23  0:52   ` Chen, Tiejun
  2015-07-22 15:44 ` [PATCH 12/16] tools: introduce a new parameter to set a predefined rdm boundary Ian Jackson
                   ` (7 subsequent siblings)
  18 siblings, 2 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

While building a VM, HVM domain builder provides struct hvm_info_table{}
to help hvmloader. Currently it includes two fields to construct guest
e820 table by hvmloader, low_mem_pgend and high_mem_pgend. So we should
check them to fix any conflict with RDM.

RMRR can reside in address space beyond 4G theoretically, but we never
see this in real world. So in order to avoid breaking highmem layout
we don't solve highmem conflict. Note this means highmem rmrr could still
be supported if no conflict.

But in the case of lowmem, RMRR probably scatter the whole RAM space.
Especially multiple RMRR entries would worsen this to lead a complicated
memory layout. And then its hard to extend hvm_info_table{} to work
hvmloader out. So here we're trying to figure out a simple solution to
avoid breaking existing layout. So when a conflict occurs,

    #1. Above a predefined boundary (2G)
        - move lowmem_end below reserved region to solve conflict;

    #2. Below a predefined boundary (2G)
        - Check strict/relaxed policy.
        "strict" policy leads to fail libxl. Note when both policies
        are specified on a given region, 'strict' is always preferred.
        "relaxed" policy issue a warning message and also mask this entry INVALID
        to indicate we shouldn't expose this entry to hvmloader.

Note later we need to provide a parameter to set that predefined boundary
dynamically.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
v13: Mechanical changes to deal with changes to patch 01/
     XENMEM_reserved_device_memory_map.
---
 tools/libxl/libxl_create.c   |    2 +-
 tools/libxl/libxl_dm.c       |  274 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_dom.c      |   17 ++-
 tools/libxl/libxl_internal.h |   14 ++-
 tools/libxl/libxl_types.idl  |    7 ++
 5 files changed, 311 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7c884c4..5b57062 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -407,7 +407,7 @@ int libxl__domain_build(libxl__gc *gc,
 
     switch (info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
-        ret = libxl__build_hvm(gc, domid, info, state);
+        ret = libxl__build_hvm(gc, domid, d_config, state);
         if (ret)
             goto out;
 
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 634b8d2..40b2bba 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -92,6 +92,280 @@ const char *libxl__domain_device_model(libxl__gc *gc,
     return dm;
 }
 
+static int
+libxl__xc_device_get_rdm(libxl__gc *gc,
+                         uint32_t flag,
+                         uint16_t seg,
+                         uint8_t bus,
+                         uint8_t devfn,
+                         unsigned int *nr_entries,
+                         struct xen_reserved_device_memory **xrdm)
+{
+    int rc = 0, r;
+
+    /*
+     * We really can't presume how many entries we can get in advance.
+     */
+    *nr_entries = 0;
+    r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
+                                      NULL, nr_entries);
+    assert(r <= 0);
+    /* "0" means we have no any rdm entry. */
+    if (!r) goto out;
+
+    if (errno != ENOBUFS) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    GCNEW_ARRAY(*xrdm, *nr_entries);
+    r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
+                                      *xrdm, nr_entries);
+    if (r)
+        rc = ERROR_FAIL;
+
+ out:
+    if (rc) {
+        *nr_entries = 0;
+        *xrdm = NULL;
+        LOG(ERROR, "Could not get reserved device memory maps.\n");
+    }
+    return rc;
+}
+
+/*
+ * Check whether there exists rdm hole in the specified memory range.
+ * Returns true if exists, else returns false.
+ */
+static bool overlaps_rdm(uint64_t start, uint64_t memsize,
+                         uint64_t rdm_start, uint64_t rdm_size)
+{
+    return (start + memsize > rdm_start) && (start < rdm_start + rdm_size);
+}
+
+static void
+add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
+              uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
+{
+    d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
+                    (d_config->num_rdms+1) * sizeof(libxl_device_rdm));
+
+    d_config->rdms[d_config->num_rdms].start = rdm_start;
+    d_config->rdms[d_config->num_rdms].size = rdm_size;
+    d_config->rdms[d_config->num_rdms].policy = rdm_policy;
+    d_config->num_rdms++;
+}
+
+/*
+ * Check reported RDM regions and handle potential gfn conflicts according
+ * to user preferred policy.
+ *
+ * RDM can reside in address space beyond 4G theoretically, but we never
+ * see this in real world. So in order to avoid breaking highmem layout
+ * we don't solve highmem conflict. Note this means highmem rmrr could
+ * still be supported if no conflict.
+ *
+ * But in the case of lowmem, RDM probably scatter the whole RAM space.
+ * Especially multiple RDM entries would worsen this to lead a complicated
+ * memory layout. And then its hard to extend hvm_info_table{} to work
+ * hvmloader out. So here we're trying to figure out a simple solution to
+ * avoid breaking existing layout. So when a conflict occurs,
+ *
+ * #1. Above a predefined boundary (default 2G)
+ * - Move lowmem_end below reserved region to solve conflict;
+ *
+ * #2. Below a predefined boundary (default 2G)
+ * - Check strict/relaxed policy.
+ * "strict" policy leads to fail libxl.
+ * "relaxed" policy issue a warning message and also mask this entry
+ * INVALID to indicate we shouldn't expose this entry to hvmloader.
+ * Note when both policies are specified on a given region, the per-device
+ * policy should override the global policy.
+ */
+int libxl__domain_device_construct_rdm(libxl__gc *gc,
+                                       libxl_domain_config *d_config,
+                                       uint64_t rdm_mem_boundary,
+                                       struct xc_hvm_build_args *args)
+{
+    int i, j, conflict, rc;
+    struct xen_reserved_device_memory *xrdm = NULL;
+    uint32_t strategy = d_config->b_info.u.hvm.rdm.strategy;
+    uint16_t seg;
+    uint8_t bus, devfn;
+    uint64_t rdm_start, rdm_size;
+    uint64_t highmem_end = args->highmem_end ? args->highmem_end : (1ull<<32);
+
+    /*
+     * We just want to construct RDM once since RDM is specific to the
+     * given platform, so this shouldn't change again.
+     */
+    if (d_config->num_rdms)
+        return 0;
+
+    /* Might not expose rdm. */
+    if (strategy == LIBXL_RDM_RESERVE_STRATEGY_IGNORE &&
+        !d_config->num_pcidevs)
+        return 0;
+
+    /* Query all RDM entries in this platform */
+    if (strategy == LIBXL_RDM_RESERVE_STRATEGY_HOST) {
+        unsigned int nr_entries;
+
+        /* Collect all rdm info if exist. */
+        rc = libxl__xc_device_get_rdm(gc, XENMEM_RDM_ALL,
+                                      0, 0, 0, &nr_entries, &xrdm);
+        if (rc)
+            goto out;
+        if (!nr_entries)
+            return 0;
+
+        assert(xrdm);
+
+        for (i = 0; i < nr_entries; i++)
+        {
+            add_rdm_entry(gc, d_config,
+                          pfn_to_paddr(xrdm[i].start_pfn),
+                          pfn_to_paddr(xrdm[i].nr_pages),
+                          d_config->b_info.u.hvm.rdm.policy);
+        }
+    }
+
+    /* Query RDM entries per-device */
+    for (i = 0; i < d_config->num_pcidevs; i++) {
+        unsigned int nr_entries;
+        bool new = true;
+
+        seg = d_config->pcidevs[i].domain;
+        bus = d_config->pcidevs[i].bus;
+        devfn = PCI_DEVFN(d_config->pcidevs[i].dev,
+                          d_config->pcidevs[i].func);
+        nr_entries = 0;
+        rc = libxl__xc_device_get_rdm(gc, 0,
+                                      seg, bus, devfn, &nr_entries, &xrdm);
+        if (rc)
+            goto out;
+        /* No RDM to associated with this device. */
+        if (!nr_entries)
+            continue;
+
+        assert(xrdm);
+
+        /*
+         * Need to check whether this entry is already saved in the array.
+         * This could come from two cases:
+         *
+         *   - user may configure to get all RDMs in this platform, which
+         *   is already queried before this point
+         *   - or two assigned devices may share one RDM entry
+         *
+         * Different policies may be configured on the same RDM due to
+         * above two cases. But we don't allow to assign such a group
+         * devies right now so it doesn't come true in our case.
+         */
+        for (j = 0; j < d_config->num_rdms; j++) {
+            if (d_config->rdms[j].start == pfn_to_paddr(xrdm[0].start_pfn))
+            {
+                /*
+                 * So the per-device policy always override the global
+                 * policy in this case.
+                 */
+                d_config->rdms[j].policy = d_config->pcidevs[i].rdm_policy;
+                new = false;
+                break;
+            }
+        }
+
+        if (new) {
+            add_rdm_entry(gc, d_config,
+                          pfn_to_paddr(xrdm[0].start_pfn),
+                          pfn_to_paddr(xrdm[0].nr_pages),
+                          d_config->pcidevs[i].rdm_policy);
+        }
+    }
+
+    /*
+     * Next step is to check and avoid potential conflict between RDM
+     * entries and guest RAM. To avoid intrusive impact to existing
+     * memory layout {lowmem, mmio, highmem} which is passed around
+     * various function blocks, below conflicts are not handled which
+     * are rare and handling them would lead to a more scattered
+     * layout:
+     *  - RDM  in highmem area (>4G)
+     *  - RDM lower than a defined memory boundary (e.g. 2G)
+     * Otherwise for conflicts between boundary and 4G, we'll simply
+     * move lowmem end below reserved region to solve conflict.
+     *
+     * If a conflict is detected on a given RDM entry, an error will
+     * be returned if 'strict' policy is specified. Instead, if
+     * 'relaxed' policy specified, this conflict is treated just as a
+     * warning, but we mark this RDM entry as INVALID to indicate that
+     * this entry shouldn't be exposed to hvmloader.
+     *
+     * Firstly we should check the case of rdm < 4G because we may
+     * need to expand highmem_end.
+     */
+    for (i = 0; i < d_config->num_rdms; i++) {
+        rdm_start = d_config->rdms[i].start;
+        rdm_size = d_config->rdms[i].size;
+        conflict = overlaps_rdm(0, args->lowmem_end, rdm_start, rdm_size);
+
+        if (!conflict)
+            continue;
+
+        /* Just check if RDM > our memory boundary. */
+        if (rdm_start > rdm_mem_boundary) {
+            /*
+             * We will move downwards lowmem_end so we have to expand
+             * highmem_end.
+             */
+            highmem_end += (args->lowmem_end - rdm_start);
+            /* Now move downwards lowmem_end. */
+            args->lowmem_end = rdm_start;
+        }
+    }
+
+    /* Sync highmem_end. */
+    args->highmem_end = highmem_end;
+
+    /*
+     * Finally we can take same policy to check lowmem(< 2G) and
+     * highmem adjusted above.
+     */
+    for (i = 0; i < d_config->num_rdms; i++) {
+        rdm_start = d_config->rdms[i].start;
+        rdm_size = d_config->rdms[i].size;
+        /* Does this entry conflict with lowmem? */
+        conflict = overlaps_rdm(0, args->lowmem_end,
+                                rdm_start, rdm_size);
+        /* Does this entry conflict with highmem? */
+        conflict |= overlaps_rdm((1ULL<<32),
+                                 args->highmem_end - (1ULL<<32),
+                                 rdm_start, rdm_size);
+
+        if (!conflict)
+            continue;
+
+        if (d_config->rdms[i].policy == LIBXL_RDM_RESERVE_POLICY_STRICT) {
+            LOG(ERROR, "RDM conflict at 0x%lx.\n", d_config->rdms[i].start);
+            goto out;
+        } else {
+            LOG(WARN, "Ignoring RDM conflict at 0x%lx.\n",
+                      d_config->rdms[i].start);
+
+            /*
+             * Then mask this INVALID to indicate we shouldn't expose this
+             * to hvmloader.
+             */
+            d_config->rdms[i].policy = LIBXL_RDM_RESERVE_POLICY_INVALID;
+        }
+    }
+
+    return 0;
+
+ out:
+    return ERROR_FAIL;
+}
+
 const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *guest_config)
 {
     const libxl_vnc_info *vnc = NULL;
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index edd7f3f..9af3b21 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -922,13 +922,20 @@ out:
 }
 
 int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
-              libxl_domain_build_info *info,
+              libxl_domain_config *d_config,
               libxl__domain_build_state *state)
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
     struct xc_hvm_build_args args = {};
     int ret, rc = ERROR_FAIL;
     uint64_t mmio_start, lowmem_end, highmem_end;
+    libxl_domain_build_info *const info = &d_config->b_info;
+    /*
+     * Currently we fix this as 2G to guarantee how to handle
+     * our rdm policy. But we'll provide a parameter to set
+     * this dynamically.
+     */
+    uint64_t rdm_mem_boundary = 0x80000000;
 
     memset(&args, 0, sizeof(struct xc_hvm_build_args));
     /* The params from the configuration file are in Mb, which are then
@@ -966,6 +973,14 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     args.highmem_end = highmem_end;
     args.mmio_start = mmio_start;
 
+    rc = libxl__domain_device_construct_rdm(gc, d_config,
+                                            rdm_mem_boundary,
+                                            &args);
+    if (rc) {
+        LOG(ERROR, "checking reserved device memory failed");
+        goto out;
+    }
+
     if (info->num_vnuma_nodes != 0) {
         int i;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4bb0f38..f2466dc 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -144,6 +144,9 @@
 #endif
   /* all of these macros preserve errno (saving and restoring) */
 
+/* Convert pfn to physical address space. */
+#define pfn_to_paddr(x) ((uint64_t)(x) << XC_PAGE_SHIFT)
+
 /* logging */
 _hidden void libxl__logv(libxl_ctx *ctx, xentoollog_level msglevel, int errnoval,
              const char *file /* may be 0 */, int line /* ignored if !file */,
@@ -1079,7 +1082,7 @@ _hidden int libxl__build_post(libxl__gc *gc, uint32_t domid,
 _hidden int libxl__build_pv(libxl__gc *gc, uint32_t domid,
              libxl_domain_build_info *info, libxl__domain_build_state *state);
 _hidden int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
-              libxl_domain_build_info *info,
+              libxl_domain_config *d_config,
               libxl__domain_build_state *state);
 
 _hidden int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t domid,
@@ -1587,6 +1590,15 @@ _hidden int libxl__need_xenpv_qemu(libxl__gc *gc,
         int nr_channels, libxl_device_channel *channels);
 
 /*
+ * This function will fix reserved device memory conflict
+ * according to user's configuration.
+ */
+_hidden int libxl__domain_device_construct_rdm(libxl__gc *gc,
+                                   libxl_domain_config *d_config,
+                                   uint64_t rdm_mem_guard,
+                                   struct xc_hvm_build_args *args);
+
+/*
  * This function will cause the whole libxl process to hang
  * if the device model does not respond.  It is deprecated.
  *
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index db9f75a..157fa59 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -586,6 +586,12 @@ libxl_device_pci = Struct("device_pci", [
     ("rdm_policy",      libxl_rdm_reserve_policy),
     ])
 
+libxl_device_rdm = Struct("device_rdm", [
+    ("start", uint64),
+    ("size", uint64),
+    ("policy", libxl_rdm_reserve_policy),
+    ])
+
 libxl_device_dtdev = Struct("device_dtdev", [
     ("path", string),
     ])
@@ -616,6 +622,7 @@ libxl_domain_config = Struct("domain_config", [
     ("disks", Array(libxl_device_disk, "num_disks")),
     ("nics", Array(libxl_device_nic, "num_nics")),
     ("pcidevs", Array(libxl_device_pci, "num_pcidevs")),
+    ("rdms", Array(libxl_device_rdm, "num_rdms")),
     ("dtdevs", Array(libxl_device_dtdev, "num_dtdevs")),
     ("vfbs", Array(libxl_device_vfb, "num_vfbs")),
     ("vkbs", Array(libxl_device_vkb, "num_vkbs")),
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/16] tools: introduce a new parameter to set a predefined rdm boundary
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (10 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest Ian Jackson
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

Previously we always fix that predefined boundary as 2G to handle
conflict between memory and rdm, but now this predefined boundar
can be changes with the parameter "rdm_mem_boundary" in .cfg file.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Checked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 docs/man/xl.cfg.pod.5       |   22 ++++++++++++++++++++++
 tools/libxl/libxl.h         |    6 ++++++
 tools/libxl/libxl_create.c  |    4 ++++
 tools/libxl/libxl_dom.c     |    8 +-------
 tools/libxl/libxl_types.idl |    1 +
 tools/libxl/xl_cmdimpl.c    |    3 +++
 6 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index e6e0f70..ce7ce85 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -845,6 +845,28 @@ More information about Xen gfx_passthru feature is available
 on the XenVGAPassthrough L<http://wiki.xen.org/wiki/XenVGAPassthrough>
 wiki page.
 
+=item B<rdm_mem_boundary=MBYTES>
+
+Number of megabytes to set a boundary for checking rdm conflict.
+
+When RDM conflicts with RAM, RDM probably scatter the whole RAM space.
+Especially multiple RDM entries would worsen this to lead a complicated
+memory layout. So here we're trying to figure out a simple solution to
+avoid breaking existing layout. So when a conflict occurs,
+
+    #1. Above a predefined boundary
+        - move lowmem_end below reserved region to solve conflict;
+
+    #2. Below a predefined boundary
+        - Check strict/relaxed policy.
+        "strict" policy leads to fail libxl. Note when both policies
+        are specified on a given region, 'strict' is always preferred.
+        "relaxed" policy issue a warning message and also mask this
+        entry INVALID to indicate we shouldn't expose this entry to
+        hvmloader.
+
+Here the default is 2G.
+
 =item B<dtdev=[ "DTDEV_PATH", "DTDEV_PATH", ... ]>
 
 Specifies the host device tree nodes to passthrough to this guest. Each
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5a7308d..927b2d8 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -909,6 +909,12 @@ const char *libxl_defbool_to_string(libxl_defbool b);
 #define LIBXL_TIMER_MODE_DEFAULT -1
 #define LIBXL_MEMKB_DEFAULT ~0ULL
 
+/*
+ * We'd like to set a memory boundary to determine if we need to check
+ * any overlap with reserved device memory.
+ */
+#define LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT (2048 * 1024)
+
 #define LIBXL_MS_VM_GENID_LEN 16
 typedef struct {
     uint8_t bytes[LIBXL_MS_VM_GENID_LEN];
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 5b57062..b27c53a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -54,6 +54,10 @@ void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
 {
     if (b_info->u.hvm.rdm.policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
         b_info->u.hvm.rdm.policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+
+    if (b_info->u.hvm.rdm_mem_boundary_memkb == LIBXL_MEMKB_DEFAULT)
+        b_info->u.hvm.rdm_mem_boundary_memkb =
+                            LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT;
 }
 
 int libxl__domain_build_info_setdefault(libxl__gc *gc,
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 9af3b21..0b7c39d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -930,12 +930,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     int ret, rc = ERROR_FAIL;
     uint64_t mmio_start, lowmem_end, highmem_end;
     libxl_domain_build_info *const info = &d_config->b_info;
-    /*
-     * Currently we fix this as 2G to guarantee how to handle
-     * our rdm policy. But we'll provide a parameter to set
-     * this dynamically.
-     */
-    uint64_t rdm_mem_boundary = 0x80000000;
 
     memset(&args, 0, sizeof(struct xc_hvm_build_args));
     /* The params from the configuration file are in Mb, which are then
@@ -974,7 +968,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     args.mmio_start = mmio_start;
 
     rc = libxl__domain_device_construct_rdm(gc, d_config,
-                                            rdm_mem_boundary,
+                                            info->u.hvm.rdm_mem_boundary_memkb*1024,
                                             &args);
     if (rc) {
         LOG(ERROR, "checking reserved device memory failed");
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 157fa59..9caaf44 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -503,6 +503,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("ms_vm_genid",      libxl_ms_vm_genid),
                                        ("serial_list",      libxl_string_list),
                                        ("rdm", libxl_rdm_reserve),
+                                       ("rdm_mem_boundary_memkb", MemKB),
                                        ])),
                  ("pv", Struct(None, [("kernel", string),
                                       ("slack_memkb", MemKB),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 5c6d1b0..615b78b 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1581,6 +1581,9 @@ static void parse_config_data(const char *config_source,
                     exit(1);
             }
         }
+
+        if (!xlu_cfg_get_long (config, "rdm_mem_boundary", &l, 0))
+            b_info->u.hvm.rdm_mem_boundary_memkb = l * 1024;
         break;
     case LIBXL_DOMAIN_TYPE_PV:
     {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (11 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 12/16] tools: introduce a new parameter to set a predefined rdm boundary Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 14/16] xen/vtd: enable USB device assignment Ian Jackson
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

Here we'll construct a basic guest e820 table via
XENMEM_set_memory_map. This table includes lowmem, highmem
and RDMs if they exist, and hvmloader would need this info
later.

Note this guest e820 table would be same as before if the
platform has no any RDM or we disable RDM (by default).

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Checked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_arch.h |    7 ++++
 tools/libxl/libxl_arm.c  |    8 +++++
 tools/libxl/libxl_dom.c  |    5 +++
 tools/libxl/libxl_x86.c  |   83 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 103 insertions(+)

diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index 9a80d43..bd030b6 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -55,4 +55,11 @@ int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
 _hidden
 int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq);
 
+/* arch specific to construct memory mapping function */
+_hidden
+int libxl__arch_domain_construct_memmap(libxl__gc *gc,
+                                        libxl_domain_config *d_config,
+                                        uint32_t domid,
+                                        struct xc_hvm_build_args *args);
+
 #endif
diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index d306905..42ab6d8 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -969,6 +969,14 @@ int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq)
     return xc_domain_bind_pt_spi_irq(CTX->xch, domid, irq, irq);
 }
 
+int libxl__arch_domain_construct_memmap(libxl__gc *gc,
+                                        libxl_domain_config *d_config,
+                                        uint32_t domid,
+                                        struct xc_hvm_build_args *args)
+{
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 0b7c39d..a76d4b3 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1012,6 +1012,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
+    if (libxl__arch_domain_construct_memmap(gc, d_config, domid, &args)) {
+        LOG(ERROR, "setting domain memory map failed");
+        goto out;
+    }
+
     ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
                                &state->store_mfn, state->console_port,
                                &state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 8cd15ca..b3cf3e2 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -445,6 +445,89 @@ int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq)
 }
 
 /*
+ * Here we're just trying to set these kinds of e820 mappings:
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ * Note: Those stuffs below 1M are still constructed with multiple
+ * e820 entries by hvmloader. At this point we don't change anything.
+ *
+ * #2. RDM region if it exists
+ *
+ * #3. High memory region if it exists
+ *
+ * Note: these regions are not overlapping since we already check
+ * to adjust them. Please refer to libxl__domain_device_construct_rdm().
+ */
+#define GUEST_LOW_MEM_START_DEFAULT 0x100000
+int libxl__arch_domain_construct_memmap(libxl__gc *gc,
+                                        libxl_domain_config *d_config,
+                                        uint32_t domid,
+                                        struct xc_hvm_build_args *args)
+{
+    int rc = 0;
+    unsigned int nr = 0, i;
+    /* We always own at least one lowmem entry. */
+    unsigned int e820_entries = 1;
+    struct e820entry *e820 = NULL;
+    uint64_t highmem_size =
+                    args->highmem_end ? args->highmem_end - (1ull << 32) : 0;
+
+    /* Add all rdm entries. */
+    for (i = 0; i < d_config->num_rdms; i++)
+        if (d_config->rdms[i].policy != LIBXL_RDM_RESERVE_POLICY_INVALID)
+            e820_entries++;
+
+
+    /* If we should have a highmem range. */
+    if (highmem_size)
+        e820_entries++;
+
+    if (e820_entries >= E820MAX) {
+        LOG(ERROR, "Ooops! Too many entries in the memory map!\n");
+        rc = ERROR_INVAL;
+        goto out;
+    }
+
+    e820 = libxl__malloc(gc, sizeof(struct e820entry) * e820_entries);
+
+    /* Low memory */
+    e820[nr].addr = GUEST_LOW_MEM_START_DEFAULT;
+    e820[nr].size = args->lowmem_end - GUEST_LOW_MEM_START_DEFAULT;
+    e820[nr].type = E820_RAM;
+    nr++;
+
+    /* RDM mapping */
+    for (i = 0; i < d_config->num_rdms; i++) {
+        if (d_config->rdms[i].policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
+            continue;
+
+        e820[nr].addr = d_config->rdms[i].start;
+        e820[nr].size = d_config->rdms[i].size;
+        e820[nr].type = E820_RESERVED;
+        nr++;
+    }
+
+    /* High memory */
+    if (highmem_size) {
+        e820[nr].addr = ((uint64_t)1 << 32);
+        e820[nr].size = highmem_size;
+        e820[nr].type = E820_RAM;
+    }
+
+    if (xc_domain_set_memory_map(CTX->xch, domid, e820, e820_entries) != 0) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+out:
+    return rc;
+}
+
+/*
  * Local variables:
  * mode: C
  * c-basic-offset: 4
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 14/16] xen/vtd: enable USB device assignment
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (12 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:44 ` [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr Ian Jackson
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Ian Campbell, George Dunlap, Jan Beulich,
	Yang Zhang, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

USB RMRR may conflict with guest BIOS region. In such case, identity
mapping setup is simply skipped in previous implementation. Now we
can handle this scenario cleanly with new policy mechanism so previous
hack code can be removed now.

CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
---
 xen/drivers/passthrough/vtd/dmar.h  |    1 -
 xen/drivers/passthrough/vtd/iommu.c |   11 ++---------
 xen/drivers/passthrough/vtd/utils.c |    7 -------
 3 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/dmar.h b/xen/drivers/passthrough/vtd/dmar.h
index af1feef..af205f5 100644
--- a/xen/drivers/passthrough/vtd/dmar.h
+++ b/xen/drivers/passthrough/vtd/dmar.h
@@ -129,7 +129,6 @@ do {                                                \
 
 int vtd_hw_check(void);
 void disable_pmr(struct iommu *iommu);
-int is_usb_device(u16 seg, u8 bus, u8 devfn);
 int is_igd_drhd(struct acpi_drhd_unit *drhd);
 
 #endif /* _DMAR_H_ */
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index a2f3a66..8a8d763 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2242,11 +2242,9 @@ static int reassign_device_ownership(
     /*
      * If the device belongs to the hardware domain, and it has RMRR, don't
      * remove it from the hardware domain, because BIOS may use RMRR at
-     * booting time. Also account for the special casing of USB below (in
-     * intel_iommu_assign_device()).
+     * booting time.
      */
-    if ( !is_hardware_domain(source) &&
-         !is_usb_device(pdev->seg, pdev->bus, pdev->devfn) )
+    if ( !is_hardware_domain(source) )
     {
         const struct acpi_rmrr_unit *rmrr;
         u16 bdf;
@@ -2299,13 +2297,8 @@ static int intel_iommu_assign_device(
     if ( ret )
         return ret;
 
-    /* FIXME: Because USB RMRR conflicts with guest bios region,
-     * ignore USB RMRR temporarily.
-     */
     seg = pdev->seg;
     bus = pdev->bus;
-    if ( is_usb_device(seg, bus, pdev->devfn) )
-        return 0;
 
     /* Setup rmrr identity mapping */
     for_each_rmrr_device( rmrr, bdf, i )
diff --git a/xen/drivers/passthrough/vtd/utils.c b/xen/drivers/passthrough/vtd/utils.c
index bd14c02..b8a077f 100644
--- a/xen/drivers/passthrough/vtd/utils.c
+++ b/xen/drivers/passthrough/vtd/utils.c
@@ -29,13 +29,6 @@
 #include "extern.h"
 #include <asm/io_apic.h>
 
-int is_usb_device(u16 seg, u8 bus, u8 devfn)
-{
-    u16 class = pci_conf_read16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-                                PCI_CLASS_DEVICE);
-    return (class == 0xc03);
-}
-
 /* Disable vt-d protected memory registers. */
 void disable_pmr(struct iommu *iommu)
 {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (13 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 14/16] xen/vtd: enable USB device assignment Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-09-03 19:39   ` Tamas K Lengyel
  2015-07-22 15:44 ` [PATCH 16/16] tools: parse to enable new rdm policy parameters Ian Jackson
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Ian Campbell, George Dunlap, Jan Beulich,
	Yang Zhang, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

Currently we're intending to cover this kind of devices
with shared RMRR simply since the case of shared RMRR is
a rare case according to our previous experiences. But
late we can group these devices which shared rmrr, and
then allow all devices within a group to be assigned to
same domain.

CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
---
 xen/drivers/passthrough/vtd/iommu.c |   30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 8a8d763..ce5c295 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2293,13 +2293,37 @@ static int intel_iommu_assign_device(
     if ( list_empty(&acpi_drhd_units) )
         return -ENODEV;
 
+    seg = pdev->seg;
+    bus = pdev->bus;
+    /*
+     * In rare cases one given rmrr is shared by multiple devices but
+     * obviously this would put the security of a system at risk. So
+     * we should prevent from this sort of device assignment.
+     *
+     * TODO: in the future we can introduce group device assignment
+     * interface to make sure devices sharing RMRR are assigned to the
+     * same domain together.
+     */
+    for_each_rmrr_device( rmrr, bdf, i )
+    {
+        if ( rmrr->segment == seg &&
+             PCI_BUS(bdf) == bus &&
+             PCI_DEVFN2(bdf) == devfn &&
+             rmrr->scope.devices_cnt > 1 )
+        {
+            printk(XENLOG_G_ERR VTDPREFIX
+                   " cannot assign %04x:%02x:%02x.%u"
+                   " with shared RMRR at %"PRIx64" for Dom%d.\n",
+                   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+                   rmrr->base_address, d->domain_id);
+            return -EPERM;
+        }
+    }
+
     ret = reassign_device_ownership(hardware_domain, d, devfn, pdev);
     if ( ret )
         return ret;
 
-    seg = pdev->seg;
-    bus = pdev->bus;
-
     /* Setup rmrr identity mapping */
     for_each_rmrr_device( rmrr, bdf, i )
     {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 16/16] tools: parse to enable new rdm policy parameters
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (14 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr Ian Jackson
@ 2015-07-22 15:44 ` Ian Jackson
  2015-07-22 15:51 ` [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tiejun Chen

From: Tiejun Chen <tiejun.chen@intel.com>

This patch parses to enable user configurable parameters to specify
RDM resource and according policies which are defined previously,

Global RDM parameter:
    rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
    pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

Default per-device RDM policy is same as default global RDM policy as being
'relaxed'. And the per-device policy would override the global policy like
others.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxlu_pci.c |   92 +++++++++++++++++++++++++++++++++++++++++++++-
 tools/libxl/libxlutil.h  |    4 ++
 tools/libxl/xl_cmdimpl.c |   13 +++++++
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxlu_pci.c b/tools/libxl/libxlu_pci.c
index 26fb143..026413b 100644
--- a/tools/libxl/libxlu_pci.c
+++ b/tools/libxl/libxlu_pci.c
@@ -42,6 +42,9 @@ static int pcidev_struct_fill(libxl_device_pci *pcidev, unsigned int domain,
 #define STATE_OPTIONS_K 6
 #define STATE_OPTIONS_V 7
 #define STATE_TERMINAL  8
+#define STATE_TYPE      9
+#define STATE_RDM_STRATEGY      10
+#define STATE_RESERVE_POLICY    11
 int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const char *str)
 {
     unsigned state = STATE_DOMAIN;
@@ -143,7 +146,18 @@ int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const char *str
                     pcidev->permissive = atoi(tok);
                 }else if ( !strcmp(optkey, "seize") ) {
                     pcidev->seize = atoi(tok);
-                }else{
+                } else if (!strcmp(optkey, "rdm_policy")) {
+                    if (!strcmp(tok, "strict")) {
+                        pcidev->rdm_policy = LIBXL_RDM_RESERVE_POLICY_STRICT;
+                    } else if (!strcmp(tok, "relaxed")) {
+                        pcidev->rdm_policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+                    } else {
+                        XLU__PCI_ERR(cfg, "%s is not an valid PCI RDM property"
+                                          " policy: 'strict' or 'relaxed'.",
+                                     tok);
+                        goto parse_error;
+                    }
+                } else {
                     XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s", optkey);
                 }
                 tok = ptr + 1;
@@ -167,6 +181,82 @@ parse_error:
     return ERROR_INVAL;
 }
 
+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char *str)
+{
+    unsigned state = STATE_TYPE;
+    char *buf2, *tok, *ptr, *end;
+
+    if (NULL == (buf2 = ptr = strdup(str)))
+        return ERROR_NOMEM;
+
+    for (tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) {
+        switch(state) {
+        case STATE_TYPE:
+            if (*ptr == '=') {
+                state = STATE_RDM_STRATEGY;
+                *ptr = '\0';
+                if (strcmp(tok, "strategy")) {
+                    XLU__PCI_ERR(cfg, "Unknown RDM state option: %s", tok);
+                    goto parse_error;
+                }
+                tok = ptr + 1;
+            }
+            break;
+        case STATE_RDM_STRATEGY:
+            if (*ptr == '\0' || *ptr == ',') {
+                state = STATE_RESERVE_POLICY;
+                *ptr = '\0';
+                if (!strcmp(tok, "host")) {
+                    rdm->strategy = LIBXL_RDM_RESERVE_STRATEGY_HOST;
+                } else {
+                    XLU__PCI_ERR(cfg, "Unknown RDM strategy option: %s", tok);
+                    goto parse_error;
+                }
+                tok = ptr + 1;
+            }
+            break;
+        case STATE_RESERVE_POLICY:
+            if (*ptr == '=') {
+                state = STATE_OPTIONS_V;
+                *ptr = '\0';
+                if (strcmp(tok, "policy")) {
+                    XLU__PCI_ERR(cfg, "Unknown RDM property value: %s", tok);
+                    goto parse_error;
+                }
+                tok = ptr + 1;
+            }
+            break;
+        case STATE_OPTIONS_V:
+            if (*ptr == ',' || *ptr == '\0') {
+                state = STATE_TERMINAL;
+                *ptr = '\0';
+                if (!strcmp(tok, "strict")) {
+                    rdm->policy = LIBXL_RDM_RESERVE_POLICY_STRICT;
+                } else if (!strcmp(tok, "relaxed")) {
+                    rdm->policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+                } else {
+                    XLU__PCI_ERR(cfg, "Unknown RDM property policy value: %s",
+                                 tok);
+                    goto parse_error;
+                }
+                tok = ptr + 1;
+            }
+        default:
+            break;
+        }
+    }
+
+    free(buf2);
+
+    if (tok != ptr || state != STATE_TERMINAL)
+        goto parse_error;
+
+    return 0;
+
+parse_error:
+    return ERROR_INVAL;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h
index 989605a..e81b644 100644
--- a/tools/libxl/libxlutil.h
+++ b/tools/libxl/libxlutil.h
@@ -106,6 +106,10 @@ int xlu_disk_parse(XLU_Config *cfg, int nspecs, const char *const *specs,
  */
 int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const char *str);
 
+/*
+ * RDM parsing
+ */
+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char *str);
 
 /*
  * Vif rate parsing.
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 615b78b..d102439 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1980,6 +1980,14 @@ skip_vfb:
         xlu_cfg_get_defbool(config, "e820_host", &b_info->u.pv.e820_host, 0);
     }
 
+    if (!xlu_cfg_get_string(config, "rdm", &buf, 0)) {
+        libxl_rdm_reserve rdm;
+        if (!xlu_rdm_parse(config, &rdm, buf)) {
+            b_info->u.hvm.rdm.strategy = rdm.strategy;
+            b_info->u.hvm.rdm.policy = rdm.policy;
+        }
+    }
+
     if (!xlu_cfg_get_list (config, "pci", &pcis, 0, 0)) {
         d_config->num_pcidevs = 0;
         d_config->pcidevs = NULL;
@@ -1993,6 +2001,11 @@ skip_vfb:
             pcidev->power_mgmt = pci_power_mgmt;
             pcidev->permissive = pci_permissive;
             pcidev->seize = pci_seize;
+            /*
+             * Like other pci option, the per-device policy always follows
+             * the global policy by default.
+             */
+            pcidev->rdm_policy = b_info->u.hvm.rdm.policy;
             e = xlu_pci_parse_bdf(config, pcidev, buf);
             if (e) {
                 fprintf(stderr,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map
  2015-07-22 15:44 ` [PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Ian Jackson
@ 2015-07-22 15:51   ` Wei Liu
  0 siblings, 0 replies; 46+ messages in thread
From: Wei Liu @ 2015-07-22 15:51 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Wei Liu, Ian Campbell, Stefano Stabellini,
	George Dunlap, Jan Beulich, Tiejun Chen

On Wed, Jul 22, 2015 at 04:44:11PM +0100, Ian Jackson wrote:
> From: Tiejun Chen <tiejun.chen@intel.com>
> 
> We will introduce the hypercall xc_reserved_device_memory_map
> approach to libxc. This helps us get rdm entry info according to
> different parameters. If flag == PCI_DEV_RDM_ALL, all entries
> should be exposed. Or we just expose that rdm entry specific to
> a SBDF.
> 
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> CC: Ian Campbell <ian.campbell@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (15 preceding siblings ...)
  2015-07-22 15:44 ` [PATCH 16/16] tools: parse to enable new rdm policy parameters Ian Jackson
@ 2015-07-22 15:51 ` Ian Jackson
  2015-07-23  2:15 ` Chen, Tiejun
  2015-07-23  7:36 ` Wei Liu
  18 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-22 15:51 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Wei Liu, Ian Campbell, George Dunlap, Jan Beulich,
	Tiejun Chen

Ian Jackson writes ("[PATCH v13 00/16] Fix RMRR (avoid RDM)"):
> Wei, can you please give your formal ack for the two changed patches ?
> 
> Tiejun, can you please test this series ?  Hopefully if you say it
> still works after this, we can commit it tomorrow.

FAOD, I have not executed it, although I have checked that it builds
(on amd64).

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-22 15:44 ` [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM Ian Jackson
@ 2015-07-22 15:53   ` Wei Liu
  2015-07-23 11:05     ` Ian Jackson
  2015-07-23  0:52   ` Chen, Tiejun
  1 sibling, 1 reply; 46+ messages in thread
From: Wei Liu @ 2015-07-22 15:53 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Wei Liu, Ian Campbell, Stefano Stabellini,
	George Dunlap, Jan Beulich, Tiejun Chen

On Wed, Jul 22, 2015 at 04:44:14PM +0100, Ian Jackson wrote:
> From: Tiejun Chen <tiejun.chen@intel.com>
> 
> While building a VM, HVM domain builder provides struct hvm_info_table{}
> to help hvmloader. Currently it includes two fields to construct guest
> e820 table by hvmloader, low_mem_pgend and high_mem_pgend. So we should
> check them to fix any conflict with RDM.
> 
> RMRR can reside in address space beyond 4G theoretically, but we never
> see this in real world. So in order to avoid breaking highmem layout
> we don't solve highmem conflict. Note this means highmem rmrr could still
> be supported if no conflict.
> 
> But in the case of lowmem, RMRR probably scatter the whole RAM space.
> Especially multiple RMRR entries would worsen this to lead a complicated
> memory layout. And then its hard to extend hvm_info_table{} to work
> hvmloader out. So here we're trying to figure out a simple solution to
> avoid breaking existing layout. So when a conflict occurs,
> 
>     #1. Above a predefined boundary (2G)
>         - move lowmem_end below reserved region to solve conflict;
> 
>     #2. Below a predefined boundary (2G)
>         - Check strict/relaxed policy.
>         "strict" policy leads to fail libxl. Note when both policies
>         are specified on a given region, 'strict' is always preferred.
>         "relaxed" policy issue a warning message and also mask this entry INVALID
>         to indicate we shouldn't expose this entry to hvmloader.
> 
> Note later we need to provide a parameter to set that predefined boundary
> dynamically.
> 
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> CC: Ian Campbell <ian.campbell@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Ian, you forgot your S-o-B.

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-22 15:44 ` [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM Ian Jackson
  2015-07-22 15:53   ` Wei Liu
@ 2015-07-23  0:52   ` Chen, Tiejun
  2015-07-23  7:35     ` Wei Liu
  2015-07-23 11:09     ` Ian Jackson
  1 sibling, 2 replies; 46+ messages in thread
From: Chen, Tiejun @ 2015-07-23  0:52 UTC (permalink / raw)
  To: Ian Jackson, xen-devel
  Cc: George Dunlap, Wei Liu, Ian Campbell, Jan Beulich, Stefano Stabellini

Ian,

Thanks for your effort.

A tiny change may be needed but I don't block this.

> +libxl__xc_device_get_rdm(libxl__gc *gc,
> +                         uint32_t flag,

Since now we are sitting on xc_reserved_device_memory_map(, flags, xxx), 
s/flag/flags may be better.

> +                         uint16_t seg,
> +                         uint8_t bus,
> +                         uint8_t devfn,
> +                         unsigned int *nr_entries,
> +                         struct xen_reserved_device_memory **xrdm)
> +{

[snip]

> +    r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,

Ditto.

> +                                      NULL, nr_entries);
> +    assert(r <= 0);
> +    /* "0" means we have no any rdm entry. */
> +    if (!r) goto out;
> +
> +    if (errno != ENOBUFS) {
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    GCNEW_ARRAY(*xrdm, *nr_entries);
> +    r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,

Ditto.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (16 preceding siblings ...)
  2015-07-22 15:51 ` [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
@ 2015-07-23  2:15 ` Chen, Tiejun
  2015-07-23 11:10   ` Ian Jackson
  2015-07-23  7:36 ` Wei Liu
  18 siblings, 1 reply; 46+ messages in thread
From: Chen, Tiejun @ 2015-07-23  2:15 UTC (permalink / raw)
  To: Ian Jackson, xen-devel; +Cc: George Dunlap, Wei Liu, Ian Campbell, Jan Beulich

Ian,

This was supposed to be my responsibility to resend this series so 
appreciate for your extra effort.

> git branch here:
>    http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=summary
>    git://xenbits.xen.org/people/iwj/xen.git
>    base.rdm-v13..wip.rdm-v13
>

I pick all RMRR patches from wip.rdm-v13 and apply them into staging 
branch in my tree.

...

>
> Tiejun, can you please test this series ?  Hopefully if you say it

Yes. I can compile and boot successfully with/without rdm setting.

> still works after this, we can commit it tomorrow.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-23  0:52   ` Chen, Tiejun
@ 2015-07-23  7:35     ` Wei Liu
  2015-07-23  7:51       ` Chen, Tiejun
  2015-07-23 11:09     ` Ian Jackson
  1 sibling, 1 reply; 46+ messages in thread
From: Wei Liu @ 2015-07-23  7:35 UTC (permalink / raw)
  To: Chen, Tiejun
  Cc: xen-devel, Wei Liu, Ian Campbell, Stefano Stabellini,
	George Dunlap, Ian Jackson, Jan Beulich

On Thu, Jul 23, 2015 at 08:52:24AM +0800, Chen, Tiejun wrote:
> Ian,
> 
> Thanks for your effort.
> 
> A tiny change may be needed but I don't block this.
> 
> >+libxl__xc_device_get_rdm(libxl__gc *gc,
> >+                         uint32_t flag,
> 
> Since now we are sitting on xc_reserved_device_memory_map(, flags, xxx),
> s/flag/flags may be better.
> 
> >+                         uint16_t seg,
> >+                         uint8_t bus,
> >+                         uint8_t devfn,
> >+                         unsigned int *nr_entries,
> >+                         struct xen_reserved_device_memory **xrdm)
> >+{
> 
> [snip]
> 
> >+    r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
> 
> Ditto.
> 
> >+                                      NULL, nr_entries);
> >+    assert(r <= 0);
> >+    /* "0" means we have no any rdm entry. */
> >+    if (!r) goto out;
> >+
> >+    if (errno != ENOBUFS) {
> >+        rc = ERROR_FAIL;
> >+        goto out;
> >+    }
> >+
> >+    GCNEW_ARRAY(*xrdm, *nr_entries);
> >+    r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
> 
> Ditto.
> 

These cosmetic changes can be fixed by a follow-up patch.

Wei.

> Thanks
> Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)
  2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
                   ` (17 preceding siblings ...)
  2015-07-23  2:15 ` Chen, Tiejun
@ 2015-07-23  7:36 ` Wei Liu
  18 siblings, 0 replies; 46+ messages in thread
From: Wei Liu @ 2015-07-23  7:36 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Wei Liu, Ian Campbell, George Dunlap, Jan Beulich,
	Tiejun Chen

Forgot to do this yesterday.

With my RM hat on:

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-23  7:35     ` Wei Liu
@ 2015-07-23  7:51       ` Chen, Tiejun
  0 siblings, 0 replies; 46+ messages in thread
From: Chen, Tiejun @ 2015-07-23  7:51 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich

> These cosmetic changes can be fixed by a follow-up patch.
>

If Jackson would like not to fix this directly in his tree, I can post 
this a small patch but we'd better squash this into the predecessor just 
as one commit.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-22 15:53   ` Wei Liu
@ 2015-07-23 11:05     ` Ian Jackson
  0 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 11:05 UTC (permalink / raw)
  To: Wei Liu
  Cc: Kevin Tian, xen-devel, Ian Campbell, Stefano Stabellini,
	George Dunlap, Jan Beulich, Tiejun Chen

Wei Liu writes ("Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM"):
> Ian, you forgot your S-o-B.

So I did, thanks.

> Acked-by: Wei Liu <wei.liu2@citrix.com>

Thanks.

Also, I should mention that have retained the ack from Kevin Tian
(CC'd), since my changes were almost entirely mechanical.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM
  2015-07-23  0:52   ` Chen, Tiejun
  2015-07-23  7:35     ` Wei Liu
@ 2015-07-23 11:09     ` Ian Jackson
  1 sibling, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 11:09 UTC (permalink / raw)
  To: Chen, Tiejun
  Cc: xen-devel, Wei Liu, Ian Campbell, Stefano Stabellini,
	George Dunlap, Jan Beulich

Chen, Tiejun writes ("Re: [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM"):
> Thanks for your effort.
> 
> A tiny change may be needed but I don't block this.

Thanks.  I have applied this change to my tree.  There is no
functional change and I don't feel the need to collect a new ack.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)
  2015-07-23  2:15 ` Chen, Tiejun
@ 2015-07-23 11:10   ` Ian Jackson
  2015-07-23 12:53     ` Ian Jackson
  0 siblings, 1 reply; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 11:10 UTC (permalink / raw)
  To: Chen, Tiejun; +Cc: George Dunlap, xen-devel, Wei Liu, Ian Campbell, Jan Beulich

Chen, Tiejun writes ("Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)"):
> This was supposed to be my responsibility to resend this series so 
> appreciate for your extra effort.

Thanks.

> Yes. I can compile and boot successfully with/without rdm setting.

Great.

I will double check with Jan but I intend to push RSN.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-22 15:44 ` [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy Ian Jackson
@ 2015-07-23 11:45   ` Ian Jackson
  2015-07-23 11:54     ` Jan Beulich
                       ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 11:45 UTC (permalink / raw)
  To: xen-devel, Wei Liu, Tiejun Chen, Jan Beulich, Ian Campbell,
	George Dunlap, Tim Deegan, Keir Fraser, Andrew Cooper,
	Suravee Suthikulpanit, Aravind Gopalakrishnan,
	Stefano Stabellini, Yang Zhang, Kevin Tian

Ian Jackson writes ("[PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy"):
> From: Tiejun Chen <tiejun.chen@intel.com>
> 
> This patch extends the existing hypercall to support rdm reservation policy.
> We return error or just throw out a warning message depending on whether
> the policy is "strict" or "relaxed" when reserving RDM regions in pfn space.
> Note in some special cases, e.g. add a device to hwdomain, and remove a
> device from user domain, 'relaxed' is fine enough since this is always safe
> to hwdomain.

This patch breaks the build on ARM:

gcc -O1 -fno-omit-frame-pointer -marm -g -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -I/local/scratch/ianj/xen.git/xen/include -fno-stack-protector -fno-exceptions -Wnested-externs -msoft-float -mcpu=cortex-a15 -DGCC_HAS_VISIBILITY_ATTRIBUTE -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -pipe -g -D__XEN__ -include /local/scratch/ianj/xen.git/xen/include/xen/config.h -nostdinc -fno-optimize-sibling-calls -DVERBOSE -DHAS_PASSTHROUGH -DHAS_DEVICE_TREE -DHAS_MEM_ACCESS -DHAS_PDX -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER -MMD -MF .smmu.o.d -c smmu.c -o smmu.o
smmu.c: In function 'arm_smmu_reassign_dev':
smmu.c:2712:9: error: too few arguments to function 'arm_smmu_assign_dev'
   ret = arm_smmu_assign_dev(t, devfn, dev);
         ^
smmu.c:2607:12: note: declared here
 static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
            ^
/local/scratch/ianj/xen.git/xen/Rules.mk:168: recipe for target 'smmu.o' failed
make[6]: *** [smmu.o] Error 1

I had a quick look but it's not a simple matter of plumbing through an
additional flags parameter becuase the reassign_device method
apparently doesn't take flags.


Tiejun, please can you send a patch to fix this up.  Please send just
a revised version of this patch.  I think the rest of the series will
rebase just fine on top of it.  (If I'm wrong then we will need to do
something more complex.)

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 11:45   ` Ian Jackson
@ 2015-07-23 11:54     ` Jan Beulich
  2015-07-23 11:54     ` Ian Campbell
  2015-07-23 12:15     ` Chen, Tiejun
  2 siblings, 0 replies; 46+ messages in thread
From: Jan Beulich @ 2015-07-23 11:54 UTC (permalink / raw)
  To: Ian Jackson
  Cc: KevinTian, Wei Liu, Ian Campbell, George Dunlap, Andrew Cooper,
	TimDeegan, xen-devel, Stefano Stabellini, Suravee Suthikulpanit,
	TiejunChen, Yang Zhang, Aravind Gopalakrishnan, Keir Fraser

>>> On 23.07.15 at 13:45, <Ian.Jackson@eu.citrix.com> wrote:
> Ian Jackson writes ("[PATCH 03/16] xen/passthrough: extend hypercall to 
> support rdm reservation policy"):
>> From: Tiejun Chen <tiejun.chen@intel.com>
>> 
>> This patch extends the existing hypercall to support rdm reservation policy.
>> We return error or just throw out a warning message depending on whether
>> the policy is "strict" or "relaxed" when reserving RDM regions in pfn space.
>> Note in some special cases, e.g. add a device to hwdomain, and remove a
>> device from user domain, 'relaxed' is fine enough since this is always safe
>> to hwdomain.
> 
> This patch breaks the build on ARM:
> 
> gcc -O1 -fno-omit-frame-pointer -marm -g -fno-strict-aliasing -std=gnu99 -Wall 
> -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable 
> -Wno-unused-local-typedefs   -I/local/scratch/ianj/xen.git/xen/include 
> -fno-stack-protector -fno-exceptions -Wnested-externs -msoft-float -mcpu=cortex-a15 
> -DGCC_HAS_VISIBILITY_ATTRIBUTE -fno-builtin -fno-common -Werror -Wredundant-decls 
> -Wno-pointer-arith -pipe -g -D__XEN__ -include 
> /local/scratch/ianj/xen.git/xen/include/xen/config.h -nostdinc 
> -fno-optimize-sibling-calls -DVERBOSE -DHAS_PASSTHROUGH -DHAS_DEVICE_TREE 
> -DHAS_MEM_ACCESS -DHAS_PDX -fno-omit-frame-pointer -DCONFIG_FRAME_POINTER -MMD -MF 
> .smmu.o.d -c smmu.c -o smmu.o
> smmu.c: In function 'arm_smmu_reassign_dev':
> smmu.c:2712:9: error: too few arguments to function 'arm_smmu_assign_dev'
>    ret = arm_smmu_assign_dev(t, devfn, dev);
>          ^
> smmu.c:2607:12: note: declared here
>  static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
>             ^
> /local/scratch/ianj/xen.git/xen/Rules.mk:168: recipe for target 'smmu.o' 
> failed
> make[6]: *** [smmu.o] Error 1
> 
> I had a quick look but it's not a simple matter of plumbing through an
> additional flags parameter becuase the reassign_device method
> apparently doesn't take flags.

Considering that the parameter is ignored anyway, I'd suggest
following what was done for various of the rmrr_identity_mapping()
callers - just pass zero.

Jan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 11:45   ` Ian Jackson
  2015-07-23 11:54     ` Jan Beulich
@ 2015-07-23 11:54     ` Ian Campbell
  2015-07-23 12:15     ` Chen, Tiejun
  2 siblings, 0 replies; 46+ messages in thread
From: Ian Campbell @ 2015-07-23 11:54 UTC (permalink / raw)
  To: Ian Jackson, xen-devel, Wei Liu, Tiejun Chen, Jan Beulich,
	George Dunlap, Tim Deegan, Keir Fraser, Andrew Cooper,
	Suravee Suthikulpanit, Aravind Gopalakrishnan,
	Stefano Stabellini, Yang Zhang, Kevin Tian

On Thu, 2015-07-23 at 12:45 +0100, Ian Jackson wrote:
> 
> I had a quick look but it's not a simple matter of plumbing through 
> an additional flags parameter becuase the reassign_device method
> apparently doesn't take flags.

The vtd version (reassign_device_ownership) does:
                /*
                 * Any RMRR flag is always ignored when remove a device,
                 * but its always safe and strict to set 0.
                 */
                ret = rmrr_identity_mapping(source, 0, rmrr, 0);

Where that second 0 there was the flag when it was called from
(intel_iommu_assign_device).

Given that arm_smmu_assign_dev currently ignores the flag, and that ARM
doesn't currently have RDM/RMRR at all I think just passing 0 in
arm_smmu_reassign_dev would be tolerable.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 11:45   ` Ian Jackson
  2015-07-23 11:54     ` Jan Beulich
  2015-07-23 11:54     ` Ian Campbell
@ 2015-07-23 12:15     ` Chen, Tiejun
  2015-07-23 12:19       ` Ian Jackson
  2 siblings, 1 reply; 46+ messages in thread
From: Chen, Tiejun @ 2015-07-23 12:15 UTC (permalink / raw)
  To: Ian Jackson, xen-devel, Wei Liu, Jan Beulich, Ian Campbell,
	George Dunlap, Tim Deegan, Keir Fraser, Andrew Cooper,
	Suravee Suthikulpanit, Aravind Gopalakrishnan,
	Stefano Stabellini, Yang Zhang, Kevin Tian

> Tiejun, please can you send a patch to fix this up.  Please send just
> a revised version of this patch.  I think the rest of the series will
> rebase just fine on top of it.  (If I'm wrong then we will need to do
> something more complex.)

Sorry to this.


     On ARM side the flag field doesn't take any affect.

     Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 9a667e9..b62c8cf 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2709,7 +2709,8 @@ static int arm_smmu_reassign_dev(struct domain *s, 
struct domain *t,
                 return ret;

         if (t) {
-               ret = arm_smmu_assign_dev(t, devfn, dev);
+               /* The flag field doesn't matter to DT device. */
+               ret = arm_smmu_assign_dev(t, devfn, dev, 0);
                 if (ret)
                         return ret;
         }


Thanks
Tiejun

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 12:15     ` Chen, Tiejun
@ 2015-07-23 12:19       ` Ian Jackson
  2015-07-23 12:27         ` Ian Campbell
  0 siblings, 1 reply; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 12:19 UTC (permalink / raw)
  To: Chen, Tiejun
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Tim Deegan, Ian Jackson, Stefano Stabellini,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper, Yang Zhang,
	Keir Fraser, Suravee Suthikulpanit

Chen, Tiejun writes ("Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy"):
> > Tiejun, please can you send a patch to fix this up.  Please send just
> > a revised version of this patch.  I think the rest of the series will
> > rebase just fine on top of it.  (If I'm wrong then we will need to do
> > something more complex.)
> 
> Sorry to this.
> 
> 
>      On ARM side the flag field doesn't take any affect.
> 
>      Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

OK, thanks.  I wasn't sure that just passing 0 was right.

I will take Jan's and Ian's suggestions as acks for your patch, which
looks correct to me to implement them.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 12:19       ` Ian Jackson
@ 2015-07-23 12:27         ` Ian Campbell
  2015-07-23 12:40           ` Chen, Tiejun
  2015-07-23 12:43           ` Ian Jackson
  0 siblings, 2 replies; 46+ messages in thread
From: Ian Campbell @ 2015-07-23 12:27 UTC (permalink / raw)
  To: Ian Jackson, Chen, Tiejun
  Cc: Kevin Tian, xen-devel, Keir Fraser, Suravee Suthikulpanit,
	George Dunlap, Andrew Cooper, Tim Deegan, Stefano Stabellini,
	Aravind Gopalakrishnan, Jan Beulich, Yang Zhang, Wei Liu

On Thu, 2015-07-23 at 13:19 +0100, Ian Jackson wrote:
> Chen, Tiejun writes ("Re: [PATCH 03/16] xen/passthrough: extend 
> hypercall to support rdm reservation policy"):
> > > Tiejun, please can you send a patch to fix this up.  Please send 
> > > just
> > > a revised version of this patch.  I think the rest of the series 
> > > will
> > > rebase just fine on top of it.  (If I'm wrong then we will need 
> > > to do
> > > something more complex.)
> > 
> > Sorry to this.
> > 
> > 
> >      On ARM side the flag field doesn't take any affect.
> > 
> >      Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> 
> OK, thanks.  I wasn't sure that just passing 0 was right.
> 
> I will take Jan's and Ian's suggestions as acks for your patch, which
> looks correct to me to implement them.

Can you avoid the mention of DT in the comment please, since PCI will
eventually go that path. Something like "No flags are defined for ARM"
would suffice I think.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 12:27         ` Ian Campbell
@ 2015-07-23 12:40           ` Chen, Tiejun
  2015-07-23 12:43           ` Ian Jackson
  1 sibling, 0 replies; 46+ messages in thread
From: Chen, Tiejun @ 2015-07-23 12:40 UTC (permalink / raw)
  To: Ian Campbell, Ian Jackson
  Cc: Kevin Tian, xen-devel, Keir Fraser, Suravee Suthikulpanit,
	George Dunlap, Andrew Cooper, Tim Deegan, Stefano Stabellini,
	Aravind Gopalakrishnan, Jan Beulich, Yang Zhang, Wei Liu

> Can you avoid the mention of DT in the comment please, since PCI will
> eventually go that path. Something like "No flags are defined for ARM"
> would suffice I think.

Works for me.

But in some way "0" also represents one valid flag. So what about this ?

No flags are defined for ARM so its always safe to set 0.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy
  2015-07-23 12:27         ` Ian Campbell
  2015-07-23 12:40           ` Chen, Tiejun
@ 2015-07-23 12:43           ` Ian Jackson
  1 sibling, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 12:43 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Kevin Tian, xen-devel, Wei Liu, Suravee Suthikulpanit,
	George Dunlap, Andrew Cooper, Tim Deegan, Stefano Stabellini,
	Aravind Gopalakrishnan, Jan Beulich, Yang Zhang, Chen, Tiejun,
	Keir Fraser

Ian Campbell writes ("Re: [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy"):
> Can you avoid the mention of DT in the comment please, since PCI will
> eventually go that path. Something like "No flags are defined for ARM"
> would suffice I think.

Right.  I have substituted that for the commment.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)
  2015-07-23 11:10   ` Ian Jackson
@ 2015-07-23 12:53     ` Ian Jackson
  0 siblings, 0 replies; 46+ messages in thread
From: Ian Jackson @ 2015-07-23 12:53 UTC (permalink / raw)
  To: Chen, Tiejun, xen-devel, Wei Liu, Jan Beulich, Ian Campbell,
	George Dunlap

Ian Jackson writes ("Re: [PATCH v13 00/16] Fix RMRR (avoid RDM)"):
> I will double check with Jan but I intend to push RSN.

I fixed up the ARM build breakage as discussed and have now pushed
what I have been calling v13a.

Ian.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-07-22 15:44 ` [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr Ian Jackson
@ 2015-09-03 19:39   ` Tamas K Lengyel
  2015-09-04  8:17     ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Tamas K Lengyel @ 2015-09-03 19:39 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Jan Beulich, Yang Zhang, Tiejun Chen


[-- Attachment #1.1: Type: text/plain, Size: 3140 bytes --]

So this change in 4.6 prevents me from passing through devices that have
worked previously with VT-d:

(XEN) [VT-D] cannot assign 0000:00:1a.0 with shared RMRR at ae8a9000 for
Dom30.
(XEN) [VT-D] cannot assign 0000:00:1d.0 with shared RMRR at ae8a9000 for
Dom31.

The devices are the USB 2.0 devices on a DQ67SW motherboard:

00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #2 (rev 04)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #1 (rev 04)

Tamas


On Wed, Jul 22, 2015 at 9:44 AM, Ian Jackson <ian.jackson@eu.citrix.com>
wrote:

> From: Tiejun Chen <tiejun.chen@intel.com>
>
> Currently we're intending to cover this kind of devices
> with shared RMRR simply since the case of shared RMRR is
> a rare case according to our previous experiences. But
> late we can group these devices which shared rmrr, and
> then allow all devices within a group to be assigned to
> same domain.
>
> CC: Yang Zhang <yang.z.zhang@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> Acked-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.c |   30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 8a8d763..ce5c295 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2293,13 +2293,37 @@ static int intel_iommu_assign_device(
>      if ( list_empty(&acpi_drhd_units) )
>          return -ENODEV;
>
> +    seg = pdev->seg;
> +    bus = pdev->bus;
> +    /*
> +     * In rare cases one given rmrr is shared by multiple devices but
> +     * obviously this would put the security of a system at risk. So
> +     * we should prevent from this sort of device assignment.
> +     *
> +     * TODO: in the future we can introduce group device assignment
> +     * interface to make sure devices sharing RMRR are assigned to the
> +     * same domain together.
> +     */
> +    for_each_rmrr_device( rmrr, bdf, i )
> +    {
> +        if ( rmrr->segment == seg &&
> +             PCI_BUS(bdf) == bus &&
> +             PCI_DEVFN2(bdf) == devfn &&
> +             rmrr->scope.devices_cnt > 1 )
> +        {
> +            printk(XENLOG_G_ERR VTDPREFIX
> +                   " cannot assign %04x:%02x:%02x.%u"
> +                   " with shared RMRR at %"PRIx64" for Dom%d.\n",
> +                   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
> +                   rmrr->base_address, d->domain_id);
> +            return -EPERM;
> +        }
> +    }
> +
>      ret = reassign_device_ownership(hardware_domain, d, devfn, pdev);
>      if ( ret )
>          return ret;
>
> -    seg = pdev->seg;
> -    bus = pdev->bus;
> -
>      /* Setup rmrr identity mapping */
>      for_each_rmrr_device( rmrr, bdf, i )
>      {
> --
> 1.7.10.4
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

[-- Attachment #1.2: Type: text/html, Size: 4415 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-03 19:39   ` Tamas K Lengyel
@ 2015-09-04  8:17     ` Jan Beulich
  2015-09-04 21:52       ` Tamas K Lengyel
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2015-09-04  8:17 UTC (permalink / raw)
  To: Tamas K Lengyel, Tiejun Chen
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Ian Jackson, Yang Zhang

>>> On 03.09.15 at 21:39, <tamas.k.lengyel@gmail.com> wrote:
> So this change in 4.6 prevents me from passing through devices that have
> worked previously with VT-d:
> 
> (XEN) [VT-D] cannot assign 0000:00:1a.0 with shared RMRR at ae8a9000 for
> Dom30.
> (XEN) [VT-D] cannot assign 0000:00:1d.0 with shared RMRR at ae8a9000 for
> Dom31.
> 
> The devices are the USB 2.0 devices on a DQ67SW motherboard:
> 
> 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> Family USB Enhanced Host Controller #2 (rev 04)
> 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> Family USB Enhanced Host Controller #1 (rev 04)

Please don't top post. And I'm also puzzled by you sending this to
Ian rather than the author.

Technically - Tiejun, should this perhaps be permitted in relaxed
mode, at least until group assignment gets implemented? (Or
Tamas, do you actually mean to assign these to _different_
guests, considering the log fragment above?)

Jan

> On Wed, Jul 22, 2015 at 9:44 AM, Ian Jackson <ian.jackson@eu.citrix.com>
> wrote:
> 
>> From: Tiejun Chen <tiejun.chen@intel.com>
>>
>> Currently we're intending to cover this kind of devices
>> with shared RMRR simply since the case of shared RMRR is
>> a rare case according to our previous experiences. But
>> late we can group these devices which shared rmrr, and
>> then allow all devices within a group to be assigned to
>> same domain.
>>
>> CC: Yang Zhang <yang.z.zhang@intel.com>
>> CC: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>> Acked-by: Kevin Tian <kevin.tian@intel.com>
>> ---
>>  xen/drivers/passthrough/vtd/iommu.c |   30 +++++++++++++++++++++++++++---
>>  1 file changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/drivers/passthrough/vtd/iommu.c
>> b/xen/drivers/passthrough/vtd/iommu.c
>> index 8a8d763..ce5c295 100644
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2293,13 +2293,37 @@ static int intel_iommu_assign_device(
>>      if ( list_empty(&acpi_drhd_units) )
>>          return -ENODEV;
>>
>> +    seg = pdev->seg;
>> +    bus = pdev->bus;
>> +    /*
>> +     * In rare cases one given rmrr is shared by multiple devices but
>> +     * obviously this would put the security of a system at risk. So
>> +     * we should prevent from this sort of device assignment.
>> +     *
>> +     * TODO: in the future we can introduce group device assignment
>> +     * interface to make sure devices sharing RMRR are assigned to the
>> +     * same domain together.
>> +     */
>> +    for_each_rmrr_device( rmrr, bdf, i )
>> +    {
>> +        if ( rmrr->segment == seg &&
>> +             PCI_BUS(bdf) == bus &&
>> +             PCI_DEVFN2(bdf) == devfn &&
>> +             rmrr->scope.devices_cnt > 1 )
>> +        {
>> +            printk(XENLOG_G_ERR VTDPREFIX
>> +                   " cannot assign %04x:%02x:%02x.%u"
>> +                   " with shared RMRR at %"PRIx64" for Dom%d.\n",
>> +                   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
>> +                   rmrr->base_address, d->domain_id);
>> +            return -EPERM;
>> +        }
>> +    }
>> +
>>      ret = reassign_device_ownership(hardware_domain, d, devfn, pdev);
>>      if ( ret )
>>          return ret;
>>
>> -    seg = pdev->seg;
>> -    bus = pdev->bus;
>> -
>>      /* Setup rmrr identity mapping */
>>      for_each_rmrr_device( rmrr, bdf, i )
>>      {
>> --
>> 1.7.10.4
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org 
>> http://lists.xen.org/xen-devel 
>>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-04  8:17     ` Jan Beulich
@ 2015-09-04 21:52       ` Tamas K Lengyel
  2015-09-06  2:16         ` Chen, Tiejun
  0 siblings, 1 reply; 46+ messages in thread
From: Tamas K Lengyel @ 2015-09-04 21:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Ian Jackson, Yang Zhang, Tiejun Chen


[-- Attachment #1.1: Type: text/plain, Size: 1379 bytes --]

On Fri, Sep 4, 2015 at 2:17 AM, Jan Beulich <JBeulich@suse.com> wrote:

> >>> On 03.09.15 at 21:39, <tamas.k.lengyel@gmail.com> wrote:
> > So this change in 4.6 prevents me from passing through devices that have
> > worked previously with VT-d:
> >
> > (XEN) [VT-D] cannot assign 0000:00:1a.0 with shared RMRR at ae8a9000 for
> > Dom30.
> > (XEN) [VT-D] cannot assign 0000:00:1d.0 with shared RMRR at ae8a9000 for
> > Dom31.
> >
> > The devices are the USB 2.0 devices on a DQ67SW motherboard:
> >
> > 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > Family USB Enhanced Host Controller #2 (rev 04)
> > 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > Family USB Enhanced Host Controller #1 (rev 04)
>
> Please don't top post. And I'm also puzzled by you sending this to
> Ian rather than the author.
>

Hm, I've just hit reply-all to the latest message I've found in the thread.


>
> Technically - Tiejun, should this perhaps be permitted in relaxed
> mode, at least until group assignment gets implemented? (Or
> Tamas, do you actually mean to assign these to _different_
> guests, considering the log fragment above?)
>

No, I actually want to assign them to the same domain. The domain creation
fails with either of those devices specified for passthrough whether they
are to be attached to the same domain or not.

Tamas

[-- Attachment #1.2: Type: text/html, Size: 2083 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-04 21:52       ` Tamas K Lengyel
@ 2015-09-06  2:16         ` Chen, Tiejun
  2015-09-06  3:19           ` Tamas K Lengyel
  0 siblings, 1 reply; 46+ messages in thread
From: Chen, Tiejun @ 2015-09-06  2:16 UTC (permalink / raw)
  To: Tamas K Lengyel, Jan Beulich
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Ian Jackson, Yang Zhang

Sorry for this delay response because I was on vacation.

On 9/5/2015 5:52 AM, Tamas K Lengyel wrote:
> On Fri, Sep 4, 2015 at 2:17 AM, Jan Beulich <JBeulich@suse.com> wrote:
>
>> >>> On 03.09.15 at 21:39, <tamas.k.lengyel@gmail.com> wrote:
>> > So this change in 4.6 prevents me from passing through devices that have
>> > worked previously with VT-d:
>> >
>> > (XEN) [VT-D] cannot assign 0000:00:1a.0 with shared RMRR at ae8a9000 for
>> > Dom30.
>> > (XEN) [VT-D] cannot assign 0000:00:1d.0 with shared RMRR at ae8a9000 for
>> > Dom31.
>> >
>> > The devices are the USB 2.0 devices on a DQ67SW motherboard:
>> >
>> > 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
>> > Family USB Enhanced Host Controller #2 (rev 04)
>> > 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
>> > Family USB Enhanced Host Controller #1 (rev 04)
>>
>> Please don't top post. And I'm also puzzled by you sending this to
>> Ian rather than the author.
>>
>
> Hm, I've just hit reply-all to the latest message I've found in the thread.
>
>
>>
>> Technically - Tiejun, should this perhaps be permitted in relaxed
>> mode, at least until group assignment gets implemented? (Or

I agree. What about this?

diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 836aed5..038776a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2310,12 +2310,16 @@ static int intel_iommu_assign_device(
               PCI_DEVFN2(bdf) == devfn &&
               rmrr->scope.devices_cnt > 1 )
          {
+            u32 relaxed = flag & XEN_DOMCTL_DEV_RDM_RELAXED;
+
              printk(XENLOG_G_ERR VTDPREFIX
-                   " cannot assign %04x:%02x:%02x.%u"
+                   " Currently its %s to assign %04x:%02x:%02x.%u"
                     " with shared RMRR at %"PRIx64" for Dom%d.\n",
+                   relaxed ? "disallowed" : "risky",
                     seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
                     rmrr->base_address, d->domain_id);
-            return -EPERM;
+            if ( !relaxed )
+                return -EPERM;
          }
      }


>> Tamas, do you actually mean to assign these to _different_
>> guests, considering the log fragment above?)
>>
>
> No, I actually want to assign them to the same domain. The domain creation
> fails with either of those devices specified for passthrough whether they
> are to be attached to the same domain or not.
>

Tamas, could you try this in your case?

Thanks
Tiejun

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-06  2:16         ` Chen, Tiejun
@ 2015-09-06  3:19           ` Tamas K Lengyel
  2015-09-06  4:19             ` Chen, Tiejun
  0 siblings, 1 reply; 46+ messages in thread
From: Tamas K Lengyel @ 2015-09-06  3:19 UTC (permalink / raw)
  To: Chen, Tiejun
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Ian Jackson, Jan Beulich, Yang Zhang


[-- Attachment #1.1: Type: text/plain, Size: 1553 bytes --]

>
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 836aed5..038776a 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2310,12 +2310,16 @@ static int intel_iommu_assign_device(
>               PCI_DEVFN2(bdf) == devfn &&
>               rmrr->scope.devices_cnt > 1 )
>          {
> +            u32 relaxed = flag & XEN_DOMCTL_DEV_RDM_RELAXED;
> +
>              printk(XENLOG_G_ERR VTDPREFIX
> -                   " cannot assign %04x:%02x:%02x.%u"
> +                   " Currently its %s to assign %04x:%02x:%02x.%u"
>                     " with shared RMRR at %"PRIx64" for Dom%d.\n",
> +                   relaxed ? "disallowed" : "risky",
>

This debug message is backwards?


>                     seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
>                     rmrr->base_address, d->domain_id);
> -            return -EPERM;
> +            if ( !relaxed )
> +                return -EPERM;
>          }
>      }
>
>
> Tamas, do you actually mean to assign these to _different_
>>> guests, considering the log fragment above?)
>>>
>>>
>> No, I actually want to assign them to the same domain. The domain creation
>> fails with either of those devices specified for passthrough whether they
>> are to be attached to the same domain or not.
>>
>>
> Tamas, could you try this in your case?
>

Took me a while to find the xl config option to set this flag (pci = [
'sbdf, rdm_policy=strict/relaxed' ]) but now it works as expected!

Thanks,
Tamas

[-- Attachment #1.2: Type: text/html, Size: 3039 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-06  3:19           ` Tamas K Lengyel
@ 2015-09-06  4:19             ` Chen, Tiejun
  2015-09-06  4:21               ` Tamas K Lengyel
                                 ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Chen, Tiejun @ 2015-09-06  4:19 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Ian Jackson, Jan Beulich, Yang Zhang

On 9/6/2015 11:19 AM, Tamas K Lengyel wrote:
>>
>> diff --git a/xen/drivers/passthrough/vtd/iommu.c
>> b/xen/drivers/passthrough/vtd/iommu.c
>> index 836aed5..038776a 100644
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2310,12 +2310,16 @@ static int intel_iommu_assign_device(
>>               PCI_DEVFN2(bdf) == devfn &&
>>               rmrr->scope.devices_cnt > 1 )
>>          {
>> +            u32 relaxed = flag & XEN_DOMCTL_DEV_RDM_RELAXED;
>> +
>>              printk(XENLOG_G_ERR VTDPREFIX
>> -                   " cannot assign %04x:%02x:%02x.%u"
>> +                   " Currently its %s to assign %04x:%02x:%02x.%u"
>>                     " with shared RMRR at %"PRIx64" for Dom%d.\n",
>> +                   relaxed ? "disallowed" : "risky",
>>
>
> This debug message is backwards?

Yeah. Its indeed like this, relaxed ? "risky" : "disallowed"

But lets wait Jan's comment to step next.

>
>
>>                     seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
>>                     rmrr->base_address, d->domain_id);
>> -            return -EPERM;
>> +            if ( !relaxed )
>> +                return -EPERM;
>>          }
>>      }
>>
>>
>> Tamas, do you actually mean to assign these to _different_
>>>> guests, considering the log fragment above?)
>>>>
>>>>
>>> No, I actually want to assign them to the same domain. The domain creation
>>> fails with either of those devices specified for passthrough whether they
>>> are to be attached to the same domain or not.
>>>
>>>
>> Tamas, could you try this in your case?
>>
>
> Took me a while to find the xl config option to set this flag (pci = [
> 'sbdf, rdm_policy=strict/relaxed' ]) but now it works as expected!
>

I remember 'relaxed' is a default value so 'rdm_policy' can't be dropped 
here if you like this.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-06  4:19             ` Chen, Tiejun
@ 2015-09-06  4:21               ` Tamas K Lengyel
  2015-09-06 21:27               ` Wei Liu
  2015-09-07  9:45               ` Jan Beulich
  2 siblings, 0 replies; 46+ messages in thread
From: Tamas K Lengyel @ 2015-09-06  4:21 UTC (permalink / raw)
  To: Chen, Tiejun
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Ian Jackson, Jan Beulich, Yang Zhang


[-- Attachment #1.1: Type: text/plain, Size: 367 bytes --]

> Took me a while to find the xl config option to set this flag (pci = [
>> 'sbdf, rdm_policy=strict/relaxed' ]) but now it works as expected!
>>
>>
> I remember 'relaxed' is a default value so 'rdm_policy' can't be dropped
> here if you like this.
>

Without specifically setting rdm_policy=relaxed domain creation failed, so
it's definitely not the default.

Tamas

[-- Attachment #1.2: Type: text/html, Size: 801 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-06  4:19             ` Chen, Tiejun
  2015-09-06  4:21               ` Tamas K Lengyel
@ 2015-09-06 21:27               ` Wei Liu
  2015-09-07  9:45               ` Jan Beulich
  2 siblings, 0 replies; 46+ messages in thread
From: Wei Liu @ 2015-09-06 21:27 UTC (permalink / raw)
  To: Chen, Tiejun
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Tamas K Lengyel, Ian Jackson, Jan Beulich, Yang Zhang

On Sun, Sep 06, 2015 at 12:19:49PM +0800, Chen, Tiejun wrote:
[...]
> >>Tamas, do you actually mean to assign these to _different_
> >>>>guests, considering the log fragment above?)
> >>>>
> >>>>
> >>>No, I actually want to assign them to the same domain. The domain creation
> >>>fails with either of those devices specified for passthrough whether they
> >>>are to be attached to the same domain or not.
> >>>
> >>>
> >>Tamas, could you try this in your case?
> >>
> >
> >Took me a while to find the xl config option to set this flag (pci = [
> >'sbdf, rdm_policy=strict/relaxed' ]) but now it works as expected!
> >
> 
> I remember 'relaxed' is a default value so 'rdm_policy' can't be dropped
> here if you like this.
> 

For a specific PCI device, the default value is strict.

Wei.

> Thanks
> Tiejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr
  2015-09-06  4:19             ` Chen, Tiejun
  2015-09-06  4:21               ` Tamas K Lengyel
  2015-09-06 21:27               ` Wei Liu
@ 2015-09-07  9:45               ` Jan Beulich
  2 siblings, 0 replies; 46+ messages in thread
From: Jan Beulich @ 2015-09-07  9:45 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: Kevin Tian, xen-devel, Wei Liu, Ian Campbell, George Dunlap,
	Tamas K Lengyel, Ian Jackson, Yang Zhang

>>> On 06.09.15 at 06:19, <tiejun.chen@intel.com> wrote:
> On 9/6/2015 11:19 AM, Tamas K Lengyel wrote:
>>> --- a/xen/drivers/passthrough/vtd/iommu.c
>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>>> @@ -2310,12 +2310,16 @@ static int intel_iommu_assign_device(
>>>               PCI_DEVFN2(bdf) == devfn &&
>>>               rmrr->scope.devices_cnt > 1 )
>>>          {
>>> +            u32 relaxed = flag & XEN_DOMCTL_DEV_RDM_RELAXED;
>>> +
>>>              printk(XENLOG_G_ERR VTDPREFIX
>>> -                   " cannot assign %04x:%02x:%02x.%u"
>>> +                   " Currently its %s to assign %04x:%02x:%02x.%u"
>>>                     " with shared RMRR at %"PRIx64" for Dom%d.\n",
>>> +                   relaxed ? "disallowed" : "risky",
>>>
>>
>> This debug message is backwards?
> 
> Yeah. Its indeed like this, relaxed ? "risky" : "disallowed"
> 
> But lets wait Jan's comment to step next.

What kind of comment are you expecting from me?

Jan

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2015-09-07  9:45 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-22 15:44 [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
2015-07-22 15:44 ` [PATCH 01/16] introduce XENMEM_reserved_device_memory_map Ian Jackson
2015-07-22 15:44 ` [PATCH 02/16] xen/vtd: create RMRR mapping Ian Jackson
2015-07-22 15:44 ` [PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy Ian Jackson
2015-07-23 11:45   ` Ian Jackson
2015-07-23 11:54     ` Jan Beulich
2015-07-23 11:54     ` Ian Campbell
2015-07-23 12:15     ` Chen, Tiejun
2015-07-23 12:19       ` Ian Jackson
2015-07-23 12:27         ` Ian Campbell
2015-07-23 12:40           ` Chen, Tiejun
2015-07-23 12:43           ` Ian Jackson
2015-07-22 15:44 ` [PATCH 04/16] xen: enable XENMEM_memory_map in hvm Ian Jackson
2015-07-22 15:44 ` [PATCH 05/16] hvmloader: get guest memory map into memory_map[] Ian Jackson
2015-07-22 15:44 ` [PATCH 06/16] hvmloader/pci: try to avoid placing BARs in RMRRs Ian Jackson
2015-07-22 15:44 ` [PATCH 07/16] hvmloader/e820: construct guest e820 table Ian Jackson
2015-07-22 15:44 ` [PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Ian Jackson
2015-07-22 15:51   ` Wei Liu
2015-07-22 15:44 ` [PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy Ian Jackson
2015-07-22 15:44 ` [PATCH 10/16] tools: introduce some new parameters to set rdm policy Ian Jackson
2015-07-22 15:44 ` [PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM Ian Jackson
2015-07-22 15:53   ` Wei Liu
2015-07-23 11:05     ` Ian Jackson
2015-07-23  0:52   ` Chen, Tiejun
2015-07-23  7:35     ` Wei Liu
2015-07-23  7:51       ` Chen, Tiejun
2015-07-23 11:09     ` Ian Jackson
2015-07-22 15:44 ` [PATCH 12/16] tools: introduce a new parameter to set a predefined rdm boundary Ian Jackson
2015-07-22 15:44 ` [PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest Ian Jackson
2015-07-22 15:44 ` [PATCH 14/16] xen/vtd: enable USB device assignment Ian Jackson
2015-07-22 15:44 ` [PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr Ian Jackson
2015-09-03 19:39   ` Tamas K Lengyel
2015-09-04  8:17     ` Jan Beulich
2015-09-04 21:52       ` Tamas K Lengyel
2015-09-06  2:16         ` Chen, Tiejun
2015-09-06  3:19           ` Tamas K Lengyel
2015-09-06  4:19             ` Chen, Tiejun
2015-09-06  4:21               ` Tamas K Lengyel
2015-09-06 21:27               ` Wei Liu
2015-09-07  9:45               ` Jan Beulich
2015-07-22 15:44 ` [PATCH 16/16] tools: parse to enable new rdm policy parameters Ian Jackson
2015-07-22 15:51 ` [PATCH v13 00/16] Fix RMRR (avoid RDM) Ian Jackson
2015-07-23  2:15 ` Chen, Tiejun
2015-07-23 11:10   ` Ian Jackson
2015-07-23 12:53     ` Ian Jackson
2015-07-23  7:36 ` Wei Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).