All of lore.kernel.org
 help / color / mirror / Atom feed
* [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM
@ 2014-09-10  5:49 Tiejun Chen
  2014-09-10  5:49 ` [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map Tiejun Chen
                   ` (7 more replies)
  0 siblings, 8 replies; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

v6:

* Jan cleanup to regenerate one patch to replace the two original patches
  then rebase the others.
* Refine all patches short/long logs.
* Replace those info with 'reserved device memory maps' and s/RMRR/RDM. 
* Fix some code styles.
* Fix one interation error when we grab e820 info
  This is not valid when j == nr - 1 (last iteration). 
* Cleanup some codes.
* One comment to introduce a new field, nr_reserved_device_memory_map
  libxc/hvm_info_table, is still pending to wait the tools maintainer's
  further comment.

v5:

* Add patch #2 to introduce a global count, acpi_rmrr_unit_entries. 
* Refine hypercall return value to make sure the caller can distinguish
  clearly between "xen filled in N entries" and "xen said you need N
  entries for all information".
* Refine some structures
* Then Rebase 

v4:

* Drop the original patch #1. Instead, we use acpi_rmrr_units to get
  rmrr info directly.
* Refine the hypercall definition to make sure we can use it safely.
* Introduce introduce nr_reserved_device_memory_map in hvm_info_table,
  then we can avoid issue unnecessary hypercall, even we can know
  current RMRR entries to issue hypercall one time.
* Cleanup and rebase

v3:

* Use XENMEM_reserved_device_memory_map to replace XENMEM_RMRR_memory_map
* Then rebase all patches

v2:

* Don't use e820map to define RMRR maps directly to avoid any confusion.
* In patch #3 we introduce construct_rmrr_e820_maps() to check if we can
  insert RMRR maps and then we will sort all e820 entries.
* Clean patch #4
* In patch #5 we reuse check_mmio_hole() to check if current mmio range is
  fine to RMRR maps. If not, we just issue error to notify the user since
  mostly mmio should be configured again.

While we work for supporting RMRR mapping for Windows GFX driver in case
shared table,

http://osdir.com/ml/general/2014-07/msg55347.html
http://osdir.com/ml/general/2014-07/msg55348.html

we realize we should reserve RMRR range to avoid any potential MMIO/RAM
overlap with our discussion so here these preliminary patches are intended
to cover this.

----------------------------------------------------------------
Jan Beulich (1):
      introduce XENMEM_reserved_device_memory_map

Tiejun Chen (6):
      tools/libxc: introduce hypercall for xc_reserved_device_memory_map
      tools/libxc: check if mmio BAR is out of reserved device memory maps
      libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map
      hvmloader: introduce hypercall for xc_reserved_device_memory_map
      hvmloader: check to reserved device memory maps in e820
      xen/vtd: make USB RMRR mapping safe

 tools/firmware/hvmloader/e820.c         | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/firmware/hvmloader/util.c         |  22 ++++++++++++++++++++++
 tools/firmware/hvmloader/util.h         |   3 +++
 tools/libxc/xc_domain.c                 |  29 +++++++++++++++++++++++++++++
 tools/libxc/xc_hvm_build_x86.c          |  82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 tools/libxc/xenctrl.h                   |   4 ++++
 xen/common/compat/memory.c              |  52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/memory.c                     |  49 +++++++++++++++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/iommu.c         |  10 ++++++++++
 xen/drivers/passthrough/vtd/dmar.c      |  17 +++++++++++++++++
 xen/drivers/passthrough/vtd/extern.h    |   1 +
 xen/drivers/passthrough/vtd/iommu.c     |   9 +--------
 xen/include/public/hvm/hvm_info_table.h |   3 +++
 xen/include/public/memory.h             |  24 +++++++++++++++++++++++-
 xen/include/xen/iommu.h                 |   4 ++++
 xen/include/xlat.lst                    |   3 ++-
 16 files changed, 400 insertions(+), 12 deletions(-)

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-10 21:34   ` Tian, Kevin
  2014-09-10  5:49 ` [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

From: Jan Beulich <jbeulich@suse.com>

This is a prerequisite for punching holes into HVM and PVH guests' P2M
to allow passing through devices that are associated with (on VT-d)
RMRRs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 25dc016..01154f6 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -15,6 +15,35 @@ CHECK_TYPE(domid);
 
 CHECK_mem_access_op;
 
+struct get_reserved_device_memory {
+    struct compat_mem_reserved_device_memory_map map;
+    unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start,
+                                      xen_ulong_t nr, void *ctxt)
+{
+    struct get_reserved_device_memory *grdm = ctxt;
+
+    if ( grdm->used_entries < grdm->map.nr_entries )
+    {
+        struct compat_mem_reserved_device_memory rdm = {
+            .start_pfn = start, .nr_pages = nr
+        };
+
+        if ( rdm.start_pfn != start || rdm.nr_pages != nr )
+            return -ERANGE;
+
+        if ( __copy_to_compat_offset(grdm->map.buffer, grdm->used_entries,
+                                     &rdm, 1) )
+            return -EFAULT;
+    }
+
+    ++grdm->used_entries;
+
+    return 0;
+}
+
 int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 {
     int split, op = cmd & MEMOP_CMD_MASK;
@@ -272,6 +301,29 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
             break;
         }
 
+#ifdef HAS_PASSTHROUGH
+        case XENMEM_reserved_device_memory_map:
+        {
+            struct get_reserved_device_memory grdm;
+
+            if ( copy_from_guest(&grdm.map, compat, 1) ||
+                 !compat_handle_okay(grdm.map.buffer, grdm.map.nr_entries) )
+                return -EFAULT;
+
+            grdm.used_entries = 0;
+            rc = iommu_get_reserved_device_memory(get_reserved_device_memory,
+                                                  &grdm);
+
+            if ( !rc && grdm.map.nr_entries < grdm.used_entries )
+                rc = -ENOBUFS;
+            grdm.map.nr_entries = grdm.used_entries;
+            if ( __copy_to_guest(compat, &grdm.map, 1) )
+                rc = -EFAULT;
+
+            return rc;
+        }
+#endif
+
         default:
             return compat_arch_memory_op(cmd, compat);
         }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index c2dd31b..c7efd6b 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -695,6 +695,32 @@ out:
     return rc;
 }
 
+struct get_reserved_device_memory {
+    struct xen_mem_reserved_device_memory_map map;
+    unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start,
+                                      xen_ulong_t nr, void *ctxt)
+{
+    struct get_reserved_device_memory *grdm = ctxt;
+
+    if ( grdm->used_entries < grdm->map.nr_entries )
+    {
+        struct xen_mem_reserved_device_memory rdm = {
+            .start_pfn = start, .nr_pages = nr
+        };
+
+        if ( __copy_to_guest_offset(grdm->map.buffer, grdm->used_entries,
+                                    &rdm, 1) )
+            return -EFAULT;
+    }
+
+    ++grdm->used_entries;
+
+    return 0;
+}
+
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct domain *d;
@@ -969,6 +995,29 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         break;
 
+#ifdef HAS_PASSTHROUGH
+    case XENMEM_reserved_device_memory_map:
+    {
+        struct get_reserved_device_memory grdm;
+
+        if ( copy_from_guest(&grdm.map, arg, 1) ||
+             !guest_handle_okay(grdm.map.buffer, grdm.map.nr_entries) )
+            return -EFAULT;
+
+        grdm.used_entries = 0;
+        rc = iommu_get_reserved_device_memory(get_reserved_device_memory,
+                                              &grdm);
+
+        if ( !rc && grdm.map.nr_entries < grdm.used_entries )
+            rc = -ENOBUFS;
+        grdm.map.nr_entries = grdm.used_entries;
+        if ( __copy_to_guest(arg, &grdm.map, 1) )
+            rc = -EFAULT;
+
+        break;
+    }
+#endif
+
     default:
         rc = arch_memory_op(cmd, arg);
         break;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index cc12735..7c17e8d 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -344,6 +344,16 @@ void iommu_crash_shutdown(void)
     iommu_enabled = iommu_intremap = 0;
 }
 
+int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
+{
+    const struct iommu_ops *ops = iommu_get_ops();
+
+    if ( !iommu_enabled || !ops->get_reserved_device_memory )
+        return 0;
+
+    return ops->get_reserved_device_memory(func, ctxt);
+}
+
 bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)
 {
     const struct hvm_iommu *hd = domain_hvm_iommu(d);
diff --git a/xen/drivers/passthrough/vtd/dmar.c b/xen/drivers/passthrough/vtd/dmar.c
index 1152c3a..141e735 100644
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -893,3 +893,20 @@ int platform_supports_x2apic(void)
     unsigned int mask = ACPI_DMAR_INTR_REMAP | ACPI_DMAR_X2APIC_OPT_OUT;
     return cpu_has_x2apic && ((dmar_flags & mask) == ACPI_DMAR_INTR_REMAP);
 }
+
+int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
+{
+    struct acpi_rmrr_unit *rmrr;
+    int rc = 0;
+
+    list_for_each_entry(rmrr, &acpi_rmrr_units, list)
+    {
+        rc = func(PFN_DOWN(rmrr->base_address),
+                  PFN_UP(rmrr->end_address) - PFN_DOWN(rmrr->base_address),
+                  ctxt);
+        if ( rc )
+            break;
+    }
+
+    return rc;
+}
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index 5524dba..f9ee9b0 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -75,6 +75,7 @@ int domain_context_mapping_one(struct domain *domain, struct iommu *iommu,
                                u8 bus, u8 devfn, const struct pci_dev *);
 int domain_context_unmap_one(struct domain *domain, struct iommu *iommu,
                              u8 bus, u8 devfn);
+int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt);
 
 unsigned int io_apic_read_remap_rte(unsigned int apic, unsigned int reg);
 void io_apic_write_remap_rte(unsigned int apic,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 042b882..d513dba 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2485,6 +2485,7 @@ const struct iommu_ops intel_iommu_ops = {
     .crash_shutdown = vtd_crash_shutdown,
     .iotlb_flush = intel_iommu_iotlb_flush,
     .iotlb_flush_all = intel_iommu_iotlb_flush_all,
+    .get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
     .dump_p2m_table = vtd_dump_p2m_table,
 };
 
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 2c57aa0..c59d42d 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -523,7 +523,29 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_sharing_op_t);
 
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
-/* Next available subop number is 26 */
+/*
+ * For legacy reasons, some devices must be configured with special memory
+ * regions to function correctly.  The guest must avoid using any of these
+ * regions.
+ */
+#define XENMEM_reserved_device_memory_map   26
+struct xen_mem_reserved_device_memory {
+    xen_pfn_t start_pfn;
+    xen_ulong_t nr_pages;
+};
+typedef struct xen_mem_reserved_device_memory xen_mem_reserved_device_memory_t;
+DEFINE_XEN_GUEST_HANDLE(xen_mem_reserved_device_memory_t);
+
+struct xen_mem_reserved_device_memory_map {
+    /* IN/OUT */
+    unsigned int nr_entries;
+    /* OUT */
+    XEN_GUEST_HANDLE(xen_mem_reserved_device_memory_t) buffer;
+};
+typedef struct xen_mem_reserved_device_memory_map xen_mem_reserved_device_memory_map_t;
+DEFINE_XEN_GUEST_HANDLE(xen_mem_reserved_device_memory_map_t);
+
+/* Next available subop number is 27 */
 
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
 
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 8eb764a..409f6f8 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -120,6 +120,8 @@ void iommu_dt_domain_destroy(struct domain *d);
 
 struct page_info;
 
+typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, void *ctxt);
+
 struct iommu_ops {
     int (*init)(struct domain *d);
     void (*hwdom_init)(struct domain *d);
@@ -156,12 +158,14 @@ struct iommu_ops {
     void (*crash_shutdown)(void);
     void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
     void (*iotlb_flush_all)(struct domain *d);
+    int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
 
 void iommu_suspend(void);
 void iommu_resume(void);
 void iommu_crash_shutdown(void);
+int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
 
 void iommu_share_p2m_table(struct domain *d);
 
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9a35dd7..3ec1749 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -60,7 +60,8 @@
 !	memory_exchange			memory.h
 !	memory_map			memory.h
 !	memory_reservation		memory.h
-?	mem_access_op		memory.h
+?	mem_access_op			memory.h
+!	mem_reserved_device_memory_map	memory.h
 !	pod_target			memory.h
 !	remove_from_physmap		memory.h
 ?	physdev_eoi			physdev.h
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
  2014-09-10  5:49 ` [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-11 15:21   ` Jan Beulich
  2014-09-10  5:49 ` [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps Tiejun Chen
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

We will introduce that hypercall xc_reserved_device_memory_map
to libxc.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index c67ac9a..b9374d2 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -649,6 +649,35 @@ int xc_domain_set_memory_map(xc_interface *xch,
 
     return rc;
 }
+
+int xc_reserved_device_memory_map(xc_interface *xch,
+                                  struct xen_mem_reserved_device_memory entries[],
+                                  uint32_t *max_entries)
+{
+    int rc;
+    struct xen_mem_reserved_device_memory_map memmap = {
+        .nr_entries = *max_entries
+    };
+    DECLARE_HYPERCALL_BOUNCE(entries,
+                             sizeof(struct xen_mem_reserved_device_memory) *
+                             *max_entries, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( xc_hypercall_bounce_pre(xch, entries) )
+        return -1;
+
+    set_xen_guest_handle(memmap.buffer, entries);
+
+    rc = do_memory_op(xch, XENMEM_reserved_device_memory_map,
+                      &memmap, sizeof(memmap));
+
+    xc_hypercall_bounce_post(xch, entries);
+
+    if ( errno == ENOBUFS )
+        *max_entries = memmap.nr_entries;
+
+    return rc ? -errno : memmap.nr_entries;
+}
+
 int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries)
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 1c5d0db..aa32c81 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1270,6 +1270,10 @@ int xc_domain_set_memory_map(xc_interface *xch,
 int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries);
+
+int xc_reserved_device_memory_map(xc_interface *xch,
+                                  struct xen_mem_reserved_device_memory entries[],
+                                  uint32_t *max_entries);
 #endif
 int xc_domain_set_time_offset(xc_interface *xch,
                               uint32_t domid,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
  2014-09-10  5:49 ` [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map Tiejun Chen
  2014-09-10  5:49 ` [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-10 21:37   ` Tian, Kevin
  2014-09-11 15:38   ` Jan Beulich
  2014-09-10  5:49 ` [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map Tiejun Chen
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

We need to avoid allocating MMIO BAR conflicting to all reserved device
memory range.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index c81a25b..299e33a 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -239,6 +239,73 @@ static int check_mmio_hole(uint64_t start, uint64_t memsize,
         return 1;
 }
 
+/*
+ * Check whether there exists mmio overplap with the reserved device
+ * memory map
+ */
+static int check_rdm_overlap(xc_interface *xch, uint64_t mmio_start,
+                             uint64_t mmio_size)
+{
+    struct xen_mem_reserved_device_memory *map = NULL;
+    uint64_t rdm_start = 0, rdm_end = 0;
+    unsigned int i = 0;
+    int rc = 0;
+    /* Assume we have one entry if not enough we'll expand.*/
+    uint32_t nr_entries = 1;
+
+    /* We should check if mmio range is out of RDM mapping. */
+    if ( (map = malloc(nr_entries *
+                       sizeof(xen_mem_reserved_device_memory_t))) == NULL )
+    {
+        PERROR("Could not allocate memory for map.");
+        return -1;
+    }
+    rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
+    if ( rc < 0 )
+    {
+        /* DRM doesn't exist. */
+        if ( rc == -ENOENT )
+            rc = 0;
+        else if ( rc == -ENOBUFS)
+        {
+            free(map);
+            /* Now we need more space to map all RDM mappings. */
+            if ( (map = malloc(nr_entries *
+                               sizeof(xen_mem_reserved_device_memory_t))) == NULL )
+            {
+                PERROR("Could not allocate memory for map.");
+                return -1;
+            }
+            rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
+            if ( rc < 0 )
+            {
+                PERROR("Could not get DRM info on domain");
+                free(map);
+                return rc;
+            }
+        }
+        else
+            PERROR("Could not get RDM info on domain");
+    }
+
+    for ( i = 0; i < rc; i++ )
+    {
+        rdm_start = map[i].start_pfn << PAGE_SHIFT;
+        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
+        if ( check_mmio_hole(rdm_start, map[i].nr_pages * PAGE_SIZE,
+                             mmio_start, mmio_size) )
+        {
+            PERROR("MMIO: [%lx]<->[%lx] overlap DRM [%lx]<->[%lx]\n",
+                   mmio_start, (mmio_start + mmio_size), rdm_start, rdm_end);
+            free(map);
+            return -1;
+        }
+    }
+
+    free(map);
+    return rc;
+}
+
 static int setup_guest(xc_interface *xch,
                        uint32_t dom, struct xc_hvm_build_args *args,
                        char *image, unsigned long image_size)
@@ -300,6 +367,10 @@ static int setup_guest(xc_interface *xch,
         goto error_out;
     }
 
+    rc = check_rdm_overlap(xch, mmio_start, mmio_start);
+    if ( rc < 0 )
+        goto error_out;
+
     for ( i = 0; i < nr_pages; i++ )
         page_array[i] = i;
     for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
                   ` (2 preceding siblings ...)
  2014-09-10  5:49 ` [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-10 21:39   ` Tian, Kevin
  2014-09-10  5:49 ` [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

In hvm_info_table this field represents the number of all reserved device
memory maps. It will be convenient to expose such a information to VM.
While building hvm info, libxc is responsible for constructing this number
after check_rdm_overlap().

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index 299e33a..8c61422 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args,
 }
 
 static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-                           uint64_t mmio_start, uint64_t mmio_size)
+                           uint64_t mmio_start, uint64_t mmio_size,
+                           unsigned int num)
 {
     struct hvm_info_table *hvm_info = (struct hvm_info_table *)
         (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
@@ -119,6 +120,9 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
     hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
     hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
 
+    /* Reserved device memory map number. */
+    hvm_info->nr_reserved_device_memory_map = num;
+
     /* Finish with the checksum. */
     for ( i = 0, sum = 0; i < hvm_info->length; i++ )
         sum += ((uint8_t *)hvm_info)[i];
@@ -329,6 +333,7 @@ static int setup_guest(xc_interface *xch,
     int claim_enabled = args->claim_enabled;
     xen_pfn_t special_array[NR_SPECIAL_PAGES];
     xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
+    unsigned int num_reserved = 0;
 
     if ( nr_pages > target_pages )
         pod_mode = XENMEMF_populate_on_demand;
@@ -371,6 +376,8 @@ static int setup_guest(xc_interface *xch,
     if ( rc < 0 )
         goto error_out;
 
+    num_reserved = rc;
+
     for ( i = 0; i < nr_pages; i++ )
         page_array[i] = i;
     for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
@@ -540,7 +547,7 @@ static int setup_guest(xc_interface *xch,
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
         goto error_out;
-    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size);
+    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, num_reserved);
     munmap(hvm_info_page, PAGE_SIZE);
 
     /* Allocate and clear special pages. */
diff --git a/xen/include/public/hvm/hvm_info_table.h b/xen/include/public/hvm/hvm_info_table.h
index 36085fa..bf401d5 100644
--- a/xen/include/public/hvm/hvm_info_table.h
+++ b/xen/include/public/hvm/hvm_info_table.h
@@ -65,6 +65,9 @@ struct hvm_info_table {
      */
     uint32_t    high_mem_pgend;
 
+    /* How many reserved device memory maps does we have? */
+    uint32_t    nr_reserved_device_memory_map;
+
     /* Bitmap of which CPUs are online at boot time. */
     uint8_t     vcpu_online[(HVM_MAX_VCPUS + 7)/8];
 };
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
                   ` (3 preceding siblings ...)
  2014-09-10  5:49 ` [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-10 21:41   ` Tian, Kevin
  2014-09-11 15:45   ` Jan Beulich
  2014-09-10  5:49 ` [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820 Tiejun Chen
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

We will introduce that hypercall xc_reserved_device_memory_map
to hvmloader.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 80d822f..90dbb6e 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -828,6 +828,28 @@ int hpet_exists(unsigned long hpet_base)
     return ((hpet_id >> 16) == 0x8086);
 }
 
+int get_reserved_device_memory_map(struct xen_mem_reserved_device_memory entries[],
+                                   uint32_t max_entries)
+{
+    static int map_done = 0;
+    struct xen_mem_reserved_device_memory_map memmap = {
+        .nr_entries = max_entries
+    };
+
+    if ( map_done )
+        return 0;
+
+    set_xen_guest_handle(memmap.buffer, entries);
+
+    if ( hypercall_memory_op(XENMEM_reserved_device_memory_map,
+                             &memmap) != 0 )
+        BUG();
+
+    map_done = 1;
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
index a70e4aa..cca0d5b 100644
--- a/tools/firmware/hvmloader/util.h
+++ b/tools/firmware/hvmloader/util.h
@@ -241,6 +241,9 @@ int build_e820_table(struct e820entry *e820,
                      unsigned int bios_image_base);
 void dump_e820_table(struct e820entry *e820, unsigned int nr);
 
+#include <xen/memory.h>
+int get_reserved_device_memory_map(struct xen_mem_reserved_device_memory entries[],
+                                   uint32_t max_entries);
 #ifndef NDEBUG
 void perform_tests(void);
 #else
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
                   ` (4 preceding siblings ...)
  2014-09-10  5:49 ` [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-11 15:57   ` Jan Beulich
  2014-09-10  5:49 ` [v6][PATCH 7/7] xen/vtd: make USB RMRR mapping safe Tiejun Chen
  2014-09-10 21:44 ` [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tian, Kevin
  7 siblings, 1 reply; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

We need to check to reserve all reserved device memory maps in e820
to avoid any potential guest memory conflict.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 2e05e93..b4842d9 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -68,12 +68,89 @@ void dump_e820_table(struct e820entry *e820, unsigned int nr)
     }
 }
 
+static unsigned int construct_rdm_e820_maps(unsigned int nr,
+                                            uint32_t nr_map,
+                                            struct xen_mem_reserved_device_memory *map,
+                                            struct e820entry *e820)
+{
+    unsigned int i = 0, j = 0, m = 0, sum_nr = nr + nr_map;
+    uint64_t start, end, rdm_start, rdm_end;
+    unsigned int insert = 0, do_insert = 0;
+
+ do_real_construct:
+    for ( i = 0; i < nr_map; i++ )
+    {
+        rdm_start = map[i].start_pfn << PAGE_SHIFT;
+        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
+
+        for ( j = 0; j < nr - 1; j++ )
+        {
+            end = e820[j].addr + e820[j].size;
+            start = e820[j+1].addr;
+
+            /* Between those existing e820 entries. */
+            if ( (rdm_start > end) && (rdm_end < start) )
+            {
+                if ( do_insert )
+                {
+                    /* Move to free this entry. */
+                    for ( m = sum_nr - 1; m > j; m-- )
+                    {
+                        e820[m].addr = e820[m-1].addr;
+                        e820[m].size = e820[m-1].size;
+                        e820[m].type = e820[m-1].type;
+                    }
+
+                    /* Then fill RMRR into that entry. */
+                    e820[j+1].addr = rdm_start;
+                    e820[j+1].size = rdm_end - rdm_start;
+                    e820[j+1].type = E820_RESERVED;
+                    nr++;
+                }
+                insert++;
+            }
+            /* Already at the end. */
+            else if ( (rdm_start > end) && !start )
+            {
+                if ( do_insert )
+                {
+                    e820[nr].addr = rdm_start;
+                    e820[nr].size = rdm_end - rdm_start;
+                    e820[nr].type = E820_RESERVED;
+                    nr++;
+                }
+                insert++;
+            }
+        }
+    }
+
+    /* Just return if done. */
+    if ( do_insert )
+        return nr;
+
+    /* Fine to construct RDM mappings into e820. */
+    if ( insert == nr_map )
+    {
+        do_insert = 1;
+        goto do_real_construct;
+    }
+    /* Overlap. */
+    else
+    {
+        printf("RDM overlap with those existing e820 entries!\n");
+        printf("So we don't construct RDM mapping in e820!\n");
+    }
+
+    return nr;
+}
 /* Create an E820 table based on memory parameters provided in hvm_info. */
 int build_e820_table(struct e820entry *e820,
                      unsigned int lowmem_reserved_base,
                      unsigned int bios_image_base)
 {
     unsigned int nr = 0;
+    struct xen_mem_reserved_device_memory *map = NULL;
+    int rc;
 
     if ( !lowmem_reserved_base )
             lowmem_reserved_base = 0xA0000;
@@ -169,6 +246,29 @@ int build_e820_table(struct e820entry *e820,
         nr++;
     }
 
+    /* We'd better reserve RDM maps for each VM to avoid potential
+     * memory conflict.
+     */
+    if ( hvm_info->nr_reserved_device_memory_map )
+    {
+        if ( !map )
+            map = mem_alloc(hvm_info->nr_reserved_device_memory_map *
+                            sizeof(struct xen_mem_reserved_device_memory), 0);
+        if ( map )
+        {
+            rc = get_reserved_device_memory_map(map,
+                                                hvm_info->nr_reserved_device_memory_map);
+            if ( !rc )
+            {
+                nr = construct_rdm_e820_maps(nr,
+                                             hvm_info->nr_reserved_device_memory_map,
+                                             map, e820);
+            }
+        }
+        else
+            printf("No space to get device reserved memory maps!\n");
+    }
+
     return nr;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [v6][PATCH 7/7] xen/vtd: make USB RMRR mapping safe
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
                   ` (5 preceding siblings ...)
  2014-09-10  5:49 ` [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820 Tiejun Chen
@ 2014-09-10  5:49 ` Tiejun Chen
  2014-09-18  9:11   ` Jan Beulich
  2014-09-10 21:44 ` [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tian, Kevin
  7 siblings, 1 reply; 37+ messages in thread
From: Tiejun Chen @ 2014-09-10  5:49 UTC (permalink / raw)
  To: JBeulich, ian.campbell, ian.jackson, stefano.stabellini,
	kevin.tian, yang.z.zhang
  Cc: xen-devel

We already reserve all RMRR mappings so USB should be safe with
RMRR.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index d513dba..ea5d343 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2273,16 +2273,8 @@ static int intel_iommu_assign_device(
     if ( ret )
         goto done;
 
-    /* FIXME: Because USB RMRR conflicts with guest bios region,
-     * ignore USB RMRR temporarily.
-     */
     seg = pdev->seg;
     bus = pdev->bus;
-    if ( is_usb_device(seg, bus, pdev->devfn) )
-    {
-        ret = 0;
-        goto done;
-    }
 
     /* Setup rmrr identity mapping */
     for_each_rmrr_device( rmrr, bdf, i )
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map
  2014-09-10  5:49 ` [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map Tiejun Chen
@ 2014-09-10 21:34   ` Tian, Kevin
  0 siblings, 0 replies; 37+ messages in thread
From: Tian, Kevin @ 2014-09-10 21:34 UTC (permalink / raw)
  To: Chen, Tiejun, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

> From: Chen, Tiejun
> Sent: Tuesday, September 09, 2014 10:50 PM
> 
> From: Jan Beulich <jbeulich@suse.com>
> 
> This is a prerequisite for punching holes into HVM and PVH guests' P2M
> to allow passing through devices that are associated with (on VT-d)
> RMRRs.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Did you change anything on top of Jan's original version? If yes, also need your sign-off.

Acked-by: Kevin Tian <kevin.tian@intel.com> for VT-d part.

> 
> diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
> index 25dc016..01154f6 100644
> --- a/xen/common/compat/memory.c
> +++ b/xen/common/compat/memory.c
> @@ -15,6 +15,35 @@ CHECK_TYPE(domid);
> 
>  CHECK_mem_access_op;
> 
> +struct get_reserved_device_memory {
> +    struct compat_mem_reserved_device_memory_map map;
> +    unsigned int used_entries;
> +};
> +
> +static int get_reserved_device_memory(xen_pfn_t start,
> +                                      xen_ulong_t nr, void *ctxt)
> +{
> +    struct get_reserved_device_memory *grdm = ctxt;
> +
> +    if ( grdm->used_entries < grdm->map.nr_entries )
> +    {
> +        struct compat_mem_reserved_device_memory rdm = {
> +            .start_pfn = start, .nr_pages = nr
> +        };
> +
> +        if ( rdm.start_pfn != start || rdm.nr_pages != nr )
> +            return -ERANGE;
> +
> +        if ( __copy_to_compat_offset(grdm->map.buffer,
> grdm->used_entries,
> +                                     &rdm, 1) )
> +            return -EFAULT;
> +    }
> +
> +    ++grdm->used_entries;
> +
> +    return 0;
> +}
> +
>  int compat_memory_op(unsigned int cmd,
> XEN_GUEST_HANDLE_PARAM(void) compat)
>  {
>      int split, op = cmd & MEMOP_CMD_MASK;
> @@ -272,6 +301,29 @@ int compat_memory_op(unsigned int cmd,
> XEN_GUEST_HANDLE_PARAM(void) compat)
>              break;
>          }
> 
> +#ifdef HAS_PASSTHROUGH
> +        case XENMEM_reserved_device_memory_map:
> +        {
> +            struct get_reserved_device_memory grdm;
> +
> +            if ( copy_from_guest(&grdm.map, compat, 1) ||
> +                 !compat_handle_okay(grdm.map.buffer,
> grdm.map.nr_entries) )
> +                return -EFAULT;
> +
> +            grdm.used_entries = 0;
> +            rc =
> iommu_get_reserved_device_memory(get_reserved_device_memory,
> +                                                  &grdm);
> +
> +            if ( !rc && grdm.map.nr_entries < grdm.used_entries )
> +                rc = -ENOBUFS;
> +            grdm.map.nr_entries = grdm.used_entries;
> +            if ( __copy_to_guest(compat, &grdm.map, 1) )
> +                rc = -EFAULT;
> +
> +            return rc;
> +        }
> +#endif
> +
>          default:
>              return compat_arch_memory_op(cmd, compat);
>          }
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index c2dd31b..c7efd6b 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -695,6 +695,32 @@ out:
>      return rc;
>  }
> 
> +struct get_reserved_device_memory {
> +    struct xen_mem_reserved_device_memory_map map;
> +    unsigned int used_entries;
> +};
> +
> +static int get_reserved_device_memory(xen_pfn_t start,
> +                                      xen_ulong_t nr, void *ctxt)
> +{
> +    struct get_reserved_device_memory *grdm = ctxt;
> +
> +    if ( grdm->used_entries < grdm->map.nr_entries )
> +    {
> +        struct xen_mem_reserved_device_memory rdm = {
> +            .start_pfn = start, .nr_pages = nr
> +        };
> +
> +        if ( __copy_to_guest_offset(grdm->map.buffer,
> grdm->used_entries,
> +                                    &rdm, 1) )
> +            return -EFAULT;
> +    }
> +
> +    ++grdm->used_entries;
> +
> +    return 0;
> +}
> +
>  long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void)
> arg)
>  {
>      struct domain *d;
> @@ -969,6 +995,29 @@ long do_memory_op(unsigned long cmd,
> XEN_GUEST_HANDLE_PARAM(void) arg)
> 
>          break;
> 
> +#ifdef HAS_PASSTHROUGH
> +    case XENMEM_reserved_device_memory_map:
> +    {
> +        struct get_reserved_device_memory grdm;
> +
> +        if ( copy_from_guest(&grdm.map, arg, 1) ||
> +             !guest_handle_okay(grdm.map.buffer,
> grdm.map.nr_entries) )
> +            return -EFAULT;
> +
> +        grdm.used_entries = 0;
> +        rc =
> iommu_get_reserved_device_memory(get_reserved_device_memory,
> +                                              &grdm);
> +
> +        if ( !rc && grdm.map.nr_entries < grdm.used_entries )
> +            rc = -ENOBUFS;
> +        grdm.map.nr_entries = grdm.used_entries;
> +        if ( __copy_to_guest(arg, &grdm.map, 1) )
> +            rc = -EFAULT;
> +
> +        break;
> +    }
> +#endif
> +
>      default:
>          rc = arch_memory_op(cmd, arg);
>          break;
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index cc12735..7c17e8d 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -344,6 +344,16 @@ void iommu_crash_shutdown(void)
>      iommu_enabled = iommu_intremap = 0;
>  }
> 
> +int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
> +{
> +    const struct iommu_ops *ops = iommu_get_ops();
> +
> +    if ( !iommu_enabled || !ops->get_reserved_device_memory )
> +        return 0;
> +
> +    return ops->get_reserved_device_memory(func, ctxt);
> +}
> +
>  bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)
>  {
>      const struct hvm_iommu *hd = domain_hvm_iommu(d);
> diff --git a/xen/drivers/passthrough/vtd/dmar.c
> b/xen/drivers/passthrough/vtd/dmar.c
> index 1152c3a..141e735 100644
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -893,3 +893,20 @@ int platform_supports_x2apic(void)
>      unsigned int mask = ACPI_DMAR_INTR_REMAP |
> ACPI_DMAR_X2APIC_OPT_OUT;
>      return cpu_has_x2apic && ((dmar_flags & mask) ==
> ACPI_DMAR_INTR_REMAP);
>  }
> +
> +int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void
> *ctxt)
> +{
> +    struct acpi_rmrr_unit *rmrr;
> +    int rc = 0;
> +
> +    list_for_each_entry(rmrr, &acpi_rmrr_units, list)
> +    {
> +        rc = func(PFN_DOWN(rmrr->base_address),
> +                  PFN_UP(rmrr->end_address) -
> PFN_DOWN(rmrr->base_address),
> +                  ctxt);
> +        if ( rc )
> +            break;
> +    }
> +
> +    return rc;
> +}
> diff --git a/xen/drivers/passthrough/vtd/extern.h
> b/xen/drivers/passthrough/vtd/extern.h
> index 5524dba..f9ee9b0 100644
> --- a/xen/drivers/passthrough/vtd/extern.h
> +++ b/xen/drivers/passthrough/vtd/extern.h
> @@ -75,6 +75,7 @@ int domain_context_mapping_one(struct domain
> *domain, struct iommu *iommu,
>                                 u8 bus, u8 devfn, const struct pci_dev
> *);
>  int domain_context_unmap_one(struct domain *domain, struct iommu
> *iommu,
>                               u8 bus, u8 devfn);
> +int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void
> *ctxt);
> 
>  unsigned int io_apic_read_remap_rte(unsigned int apic, unsigned int reg);
>  void io_apic_write_remap_rte(unsigned int apic,
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 042b882..d513dba 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2485,6 +2485,7 @@ const struct iommu_ops intel_iommu_ops = {
>      .crash_shutdown = vtd_crash_shutdown,
>      .iotlb_flush = intel_iommu_iotlb_flush,
>      .iotlb_flush_all = intel_iommu_iotlb_flush_all,
> +    .get_reserved_device_memory =
> intel_iommu_get_reserved_device_memory,
>      .dump_p2m_table = vtd_dump_p2m_table,
>  };
> 
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 2c57aa0..c59d42d 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -523,7 +523,29 @@
> DEFINE_XEN_GUEST_HANDLE(xen_mem_sharing_op_t);
> 
>  #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
> 
> -/* Next available subop number is 26 */
> +/*
> + * For legacy reasons, some devices must be configured with special memory
> + * regions to function correctly.  The guest must avoid using any of these
> + * regions.
> + */
> +#define XENMEM_reserved_device_memory_map   26
> +struct xen_mem_reserved_device_memory {
> +    xen_pfn_t start_pfn;
> +    xen_ulong_t nr_pages;
> +};
> +typedef struct xen_mem_reserved_device_memory
> xen_mem_reserved_device_memory_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_mem_reserved_device_memory_t);
> +
> +struct xen_mem_reserved_device_memory_map {
> +    /* IN/OUT */
> +    unsigned int nr_entries;
> +    /* OUT */
> +    XEN_GUEST_HANDLE(xen_mem_reserved_device_memory_t) buffer;
> +};
> +typedef struct xen_mem_reserved_device_memory_map
> xen_mem_reserved_device_memory_map_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_mem_reserved_device_memory_map_t);
> +
> +/* Next available subop number is 27 */
> 
>  #endif /* __XEN_PUBLIC_MEMORY_H__ */
> 
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 8eb764a..409f6f8 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -120,6 +120,8 @@ void iommu_dt_domain_destroy(struct domain *d);
> 
>  struct page_info;
> 
> +typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, void *ctxt);
> +
>  struct iommu_ops {
>      int (*init)(struct domain *d);
>      void (*hwdom_init)(struct domain *d);
> @@ -156,12 +158,14 @@ struct iommu_ops {
>      void (*crash_shutdown)(void);
>      void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int
> page_count);
>      void (*iotlb_flush_all)(struct domain *d);
> +    int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
>      void (*dump_p2m_table)(struct domain *d);
>  };
> 
>  void iommu_suspend(void);
>  void iommu_resume(void);
>  void iommu_crash_shutdown(void);
> +int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
> 
>  void iommu_share_p2m_table(struct domain *d);
> 
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index 9a35dd7..3ec1749 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -60,7 +60,8 @@
>  !	memory_exchange			memory.h
>  !	memory_map			memory.h
>  !	memory_reservation		memory.h
> -?	mem_access_op		memory.h
> +?	mem_access_op			memory.h
> +!	mem_reserved_device_memory_map	memory.h
>  !	pod_target			memory.h
>  !	remove_from_physmap		memory.h
>  ?	physdev_eoi			physdev.h
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-10  5:49 ` [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps Tiejun Chen
@ 2014-09-10 21:37   ` Tian, Kevin
  2014-09-11  1:14     ` Chen, Tiejun
  2014-09-11 15:38   ` Jan Beulich
  1 sibling, 1 reply; 37+ messages in thread
From: Tian, Kevin @ 2014-09-10 21:37 UTC (permalink / raw)
  To: Chen, Tiejun, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

> From: Chen, Tiejun
> Sent: Tuesday, September 09, 2014 10:50 PM
> 
> We need to avoid allocating MMIO BAR conflicting to all reserved device
> memory range.

besides checking MMIO BAR confliction, you also need check guest memory
confliction with reserved device memory ranges.

> 
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> 
> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> index c81a25b..299e33a 100644
> --- a/tools/libxc/xc_hvm_build_x86.c
> +++ b/tools/libxc/xc_hvm_build_x86.c
> @@ -239,6 +239,73 @@ static int check_mmio_hole(uint64_t start, uint64_t
> memsize,
>          return 1;
>  }
> 
> +/*
> + * Check whether there exists mmio overplap with the reserved device
> + * memory map
> + */
> +static int check_rdm_overlap(xc_interface *xch, uint64_t mmio_start,
> +                             uint64_t mmio_size)
> +{
> +    struct xen_mem_reserved_device_memory *map = NULL;

If there are multiple callers of this call, better to move the structure out of
this function to avoid multiple calls.

> +    uint64_t rdm_start = 0, rdm_end = 0;
> +    unsigned int i = 0;
> +    int rc = 0;
> +    /* Assume we have one entry if not enough we'll expand.*/
> +    uint32_t nr_entries = 1;
> +
> +    /* We should check if mmio range is out of RDM mapping. */
> +    if ( (map = malloc(nr_entries *
> +                       sizeof(xen_mem_reserved_device_memory_t)))
> == NULL )
> +    {
> +        PERROR("Could not allocate memory for map.");
> +        return -1;
> +    }
> +    rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
> +    if ( rc < 0 )
> +    {
> +        /* DRM doesn't exist. */
> +        if ( rc == -ENOENT )
> +            rc = 0;
> +        else if ( rc == -ENOBUFS)
> +        {
> +            free(map);
> +            /* Now we need more space to map all RDM mappings. */
> +            if ( (map = malloc(nr_entries *
> +
> sizeof(xen_mem_reserved_device_memory_t))) == NULL )
> +            {
> +                PERROR("Could not allocate memory for map.");
> +                return -1;
> +            }
> +            rc = xc_reserved_device_memory_map(xch, map,
> &nr_entries);
> +            if ( rc < 0 )
> +            {
> +                PERROR("Could not get DRM info on domain");
> +                free(map);
> +                return rc;
> +            }
> +        }
> +        else
> +            PERROR("Could not get RDM info on domain");
> +    }
> +
> +    for ( i = 0; i < rc; i++ )
> +    {
> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
> +        if ( check_mmio_hole(rdm_start, map[i].nr_pages * PAGE_SIZE,
> +                             mmio_start, mmio_size) )
> +        {
> +            PERROR("MMIO: [%lx]<->[%lx] overlap DRM [%lx]<->[%lx]\n",
> +                   mmio_start, (mmio_start + mmio_size), rdm_start,
> rdm_end);
> +            free(map);
> +            return -1;
> +        }
> +    }
> +
> +    free(map);
> +    return rc;
> +}
> +
>  static int setup_guest(xc_interface *xch,
>                         uint32_t dom, struct xc_hvm_build_args *args,
>                         char *image, unsigned long image_size)
> @@ -300,6 +367,10 @@ static int setup_guest(xc_interface *xch,
>          goto error_out;
>      }
> 
> +    rc = check_rdm_overlap(xch, mmio_start, mmio_start);

mmio_start -> mmio_size.

> +    if ( rc < 0 )
> +        goto error_out;
> +
>      for ( i = 0; i < nr_pages; i++ )
>          page_array[i] = i;
>      for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map
  2014-09-10  5:49 ` [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map Tiejun Chen
@ 2014-09-10 21:39   ` Tian, Kevin
  2014-09-11  1:16     ` Chen, Tiejun
  0 siblings, 1 reply; 37+ messages in thread
From: Tian, Kevin @ 2014-09-10 21:39 UTC (permalink / raw)
  To: Chen, Tiejun, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

> From: Chen, Tiejun
> Sent: Tuesday, September 09, 2014 10:50 PM
> 
> In hvm_info_table this field represents the number of all reserved device
> memory maps. It will be convenient to expose such a information to VM.
> While building hvm info, libxc is responsible for constructing this number
> after check_rdm_overlap().

Agree with Jan that putting the entry number here looks dirty. If we really
want to go this way, I prefer to put the whole reserved entries so it removes
the call completely from hvmloader. But since it's a dynamic structure, so
it's not a good option here.

> 
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> 
> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> index 299e33a..8c61422 100644
> --- a/tools/libxc/xc_hvm_build_x86.c
> +++ b/tools/libxc/xc_hvm_build_x86.c
> @@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args,
>  }
> 
>  static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
> -                           uint64_t mmio_start, uint64_t mmio_size)
> +                           uint64_t mmio_start, uint64_t mmio_size,
> +                           unsigned int num)
>  {
>      struct hvm_info_table *hvm_info = (struct hvm_info_table *)
>          (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
> @@ -119,6 +120,9 @@ static void build_hvm_info(void *hvm_info_page,
> uint64_t mem_size,
>      hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
>      hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
> 
> +    /* Reserved device memory map number. */
> +    hvm_info->nr_reserved_device_memory_map = num;
> +
>      /* Finish with the checksum. */
>      for ( i = 0, sum = 0; i < hvm_info->length; i++ )
>          sum += ((uint8_t *)hvm_info)[i];
> @@ -329,6 +333,7 @@ static int setup_guest(xc_interface *xch,
>      int claim_enabled = args->claim_enabled;
>      xen_pfn_t special_array[NR_SPECIAL_PAGES];
>      xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
> +    unsigned int num_reserved = 0;
> 
>      if ( nr_pages > target_pages )
>          pod_mode = XENMEMF_populate_on_demand;
> @@ -371,6 +376,8 @@ static int setup_guest(xc_interface *xch,
>      if ( rc < 0 )
>          goto error_out;
> 
> +    num_reserved = rc;
> +
>      for ( i = 0; i < nr_pages; i++ )
>          page_array[i] = i;
>      for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
> @@ -540,7 +547,7 @@ static int setup_guest(xc_interface *xch,
>                xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
>                HVM_INFO_PFN)) == NULL )
>          goto error_out;
> -    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size);
> +    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size,
> num_reserved);
>      munmap(hvm_info_page, PAGE_SIZE);
> 
>      /* Allocate and clear special pages. */
> diff --git a/xen/include/public/hvm/hvm_info_table.h
> b/xen/include/public/hvm/hvm_info_table.h
> index 36085fa..bf401d5 100644
> --- a/xen/include/public/hvm/hvm_info_table.h
> +++ b/xen/include/public/hvm/hvm_info_table.h
> @@ -65,6 +65,9 @@ struct hvm_info_table {
>       */
>      uint32_t    high_mem_pgend;
> 
> +    /* How many reserved device memory maps does we have? */
> +    uint32_t    nr_reserved_device_memory_map;
> +
>      /* Bitmap of which CPUs are online at boot time. */
>      uint8_t     vcpu_online[(HVM_MAX_VCPUS + 7)/8];
>  };
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10  5:49 ` [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
@ 2014-09-10 21:41   ` Tian, Kevin
  2014-09-11  1:32     ` Chen, Tiejun
  2014-09-11  7:52     ` Jan Beulich
  2014-09-11 15:45   ` Jan Beulich
  1 sibling, 2 replies; 37+ messages in thread
From: Tian, Kevin @ 2014-09-10 21:41 UTC (permalink / raw)
  To: Chen, Tiejun, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

> From: Chen, Tiejun
> Sent: Tuesday, September 09, 2014 10:50 PM
> 
> We will introduce that hypercall xc_reserved_device_memory_map
> to hvmloader.
> 
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> 
> diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
> index 80d822f..90dbb6e 100644
> --- a/tools/firmware/hvmloader/util.c
> +++ b/tools/firmware/hvmloader/util.c
> @@ -828,6 +828,28 @@ int hpet_exists(unsigned long hpet_base)
>      return ((hpet_id >> 16) == 0x8086);
>  }
> 
> +int get_reserved_device_memory_map(struct
> xen_mem_reserved_device_memory entries[],
> +                                   uint32_t max_entries)

usually claim as a pointer instead of array as the parameter.

> +{
> +    static int map_done = 0;

don't see much value of the static variable here. Better for the caller
to judge whether multiple calls are required.

> +    struct xen_mem_reserved_device_memory_map memmap = {
> +        .nr_entries = max_entries
> +    };
> +
> +    if ( map_done )
> +        return 0;
> +
> +    set_xen_guest_handle(memmap.buffer, entries);
> +
> +    if ( hypercall_memory_op(XENMEM_reserved_device_memory_map,
> +                             &memmap) != 0 )
> +        BUG();
> +
> +    map_done = 1;
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/tools/firmware/hvmloader/util.h
> b/tools/firmware/hvmloader/util.h
> index a70e4aa..cca0d5b 100644
> --- a/tools/firmware/hvmloader/util.h
> +++ b/tools/firmware/hvmloader/util.h
> @@ -241,6 +241,9 @@ int build_e820_table(struct e820entry *e820,
>                       unsigned int bios_image_base);
>  void dump_e820_table(struct e820entry *e820, unsigned int nr);
> 
> +#include <xen/memory.h>
> +int get_reserved_device_memory_map(struct
> xen_mem_reserved_device_memory entries[],
> +                                   uint32_t max_entries);
>  #ifndef NDEBUG
>  void perform_tests(void);
>  #else
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM
  2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
                   ` (6 preceding siblings ...)
  2014-09-10  5:49 ` [v6][PATCH 7/7] xen/vtd: make USB RMRR mapping safe Tiejun Chen
@ 2014-09-10 21:44 ` Tian, Kevin
  2014-09-11  1:38   ` Chen, Tiejun
  7 siblings, 1 reply; 37+ messages in thread
From: Tian, Kevin @ 2014-09-10 21:44 UTC (permalink / raw)
  To: Chen, Tiejun, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

> From: Chen, Tiejun
> Sent: Tuesday, September 09, 2014 10:50 PM
> 

currently the confliction is detected absolutely. Do we need a way to allow
the confliction if there is no device assigned at all?

> v6:
> 
> * Jan cleanup to regenerate one patch to replace the two original patches
>   then rebase the others.
> * Refine all patches short/long logs.
> * Replace those info with 'reserved device memory maps' and s/RMRR/RDM.
> * Fix some code styles.
> * Fix one interation error when we grab e820 info
>   This is not valid when j == nr - 1 (last iteration).
> * Cleanup some codes.
> * One comment to introduce a new field, nr_reserved_device_memory_map
>   libxc/hvm_info_table, is still pending to wait the tools maintainer's
>   further comment.
> 
> v5:
> 
> * Add patch #2 to introduce a global count, acpi_rmrr_unit_entries.
> * Refine hypercall return value to make sure the caller can distinguish
>   clearly between "xen filled in N entries" and "xen said you need N
>   entries for all information".
> * Refine some structures
> * Then Rebase
> 
> v4:
> 
> * Drop the original patch #1. Instead, we use acpi_rmrr_units to get
>   rmrr info directly.
> * Refine the hypercall definition to make sure we can use it safely.
> * Introduce introduce nr_reserved_device_memory_map in hvm_info_table,
>   then we can avoid issue unnecessary hypercall, even we can know
>   current RMRR entries to issue hypercall one time.
> * Cleanup and rebase
> 
> v3:
> 
> * Use XENMEM_reserved_device_memory_map to replace
> XENMEM_RMRR_memory_map
> * Then rebase all patches
> 
> v2:
> 
> * Don't use e820map to define RMRR maps directly to avoid any confusion.
> * In patch #3 we introduce construct_rmrr_e820_maps() to check if we can
>   insert RMRR maps and then we will sort all e820 entries.
> * Clean patch #4
> * In patch #5 we reuse check_mmio_hole() to check if current mmio range is
>   fine to RMRR maps. If not, we just issue error to notify the user since
>   mostly mmio should be configured again.
> 
> While we work for supporting RMRR mapping for Windows GFX driver in case
> shared table,
> 
> http://osdir.com/ml/general/2014-07/msg55347.html
> http://osdir.com/ml/general/2014-07/msg55348.html
> 
> we realize we should reserve RMRR range to avoid any potential MMIO/RAM
> overlap with our discussion so here these preliminary patches are intended
> to cover this.
> 
> ----------------------------------------------------------------
> Jan Beulich (1):
>       introduce XENMEM_reserved_device_memory_map
> 
> Tiejun Chen (6):
>       tools/libxc: introduce hypercall for xc_reserved_device_memory_map
>       tools/libxc: check if mmio BAR is out of reserved device memory maps
>       libxc/hvm_info_table: introduce a new field
> nr_reserved_device_memory_map
>       hvmloader: introduce hypercall for xc_reserved_device_memory_map
>       hvmloader: check to reserved device memory maps in e820
>       xen/vtd: make USB RMRR mapping safe
> 
>  tools/firmware/hvmloader/e820.c         | 100
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> ++++++++++++++++++++++++++++++++++++
>  tools/firmware/hvmloader/util.c         |  22 ++++++++++++++++++++++
>  tools/firmware/hvmloader/util.h         |   3 +++
>  tools/libxc/xc_domain.c                 |  29
> +++++++++++++++++++++++++++++
>  tools/libxc/xc_hvm_build_x86.c          |  82
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> ++++++++++++++++--
>  tools/libxc/xenctrl.h                   |   4 ++++
>  xen/common/compat/memory.c              |  52
> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  xen/common/memory.c                     |  49
> +++++++++++++++++++++++++++++++++++++++++++++++++
>  xen/drivers/passthrough/iommu.c         |  10 ++++++++++
>  xen/drivers/passthrough/vtd/dmar.c      |  17 +++++++++++++++++
>  xen/drivers/passthrough/vtd/extern.h    |   1 +
>  xen/drivers/passthrough/vtd/iommu.c     |   9 +--------
>  xen/include/public/hvm/hvm_info_table.h |   3 +++
>  xen/include/public/memory.h             |  24
> +++++++++++++++++++++++-
>  xen/include/xen/iommu.h                 |   4 ++++
>  xen/include/xlat.lst                    |   3 ++-
>  16 files changed, 400 insertions(+), 12 deletions(-)
> 
> Thanks
> Tiejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-10 21:37   ` Tian, Kevin
@ 2014-09-11  1:14     ` Chen, Tiejun
  2014-09-11 22:55       ` Tian, Kevin
  0 siblings, 1 reply; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-11  1:14 UTC (permalink / raw)
  To: Tian, Kevin, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

On 2014/9/11 5:37, Tian, Kevin wrote:
>> From: Chen, Tiejun
>> Sent: Tuesday, September 09, 2014 10:50 PM
>>
>> We need to avoid allocating MMIO BAR conflicting to all reserved device
>> memory range.
>
> besides checking MMIO BAR confliction, you also need check guest memory
> confliction with reserved device memory ranges.

Kexin,

In patch #6, hvmloader: check to reserved device memory maps in e820, 
we'll lookup the entire e820 table to check if RMRR is overlapping with 
all recorded ranges in e820, so I think that already cover this case, right?

>
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>>
>> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
>> index c81a25b..299e33a 100644
>> --- a/tools/libxc/xc_hvm_build_x86.c
>> +++ b/tools/libxc/xc_hvm_build_x86.c
>> @@ -239,6 +239,73 @@ static int check_mmio_hole(uint64_t start, uint64_t
>> memsize,
>>           return 1;
>>   }
>>
>> +/*
>> + * Check whether there exists mmio overplap with the reserved device
>> + * memory map
>> + */
>> +static int check_rdm_overlap(xc_interface *xch, uint64_t mmio_start,
>> +                             uint64_t mmio_size)
>> +{
>> +    struct xen_mem_reserved_device_memory *map = NULL;
>
> If there are multiple callers of this call, better to move the structure out of
> this function to avoid multiple calls.

Good point.

+/* Record reserved device memory. */
+static struct xen_mem_reserved_device_memory *xmrdm = NULL;
+


>
>> +    uint64_t rdm_start = 0, rdm_end = 0;
>> +    unsigned int i = 0;
>> +    int rc = 0;
>> +    /* Assume we have one entry if not enough we'll expand.*/
>> +    uint32_t nr_entries = 1;

[snip]

>>       }
>>
>> +    rc = check_rdm_overlap(xch, mmio_start, mmio_start);
>
> mmio_start -> mmio_size.

Fixed and thanks a lot.

Thanks
Tiejun

>
>> +    if ( rc < 0 )
>> +        goto error_out;
>> +
>>       for ( i = 0; i < nr_pages; i++ )
>>           page_array[i] = i;
>>       for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
>> --
>> 1.9.1
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map
  2014-09-10 21:39   ` Tian, Kevin
@ 2014-09-11  1:16     ` Chen, Tiejun
  0 siblings, 0 replies; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-11  1:16 UTC (permalink / raw)
  To: Tian, Kevin, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

On 2014/9/11 5:39, Tian, Kevin wrote:
>> From: Chen, Tiejun
>> Sent: Tuesday, September 09, 2014 10:50 PM
>>
>> In hvm_info_table this field represents the number of all reserved device
>> memory maps. It will be convenient to expose such a information to VM.
>> While building hvm info, libxc is responsible for constructing this number
>> after check_rdm_overlap().
>
> Agree with Jan that putting the entry number here looks dirty. If we really
> want to go this way, I prefer to put the whole reserved entries so it removes
> the call completely from hvmloader. But since it's a dynamic structure, so
> it's not a good option here.

Looks most guys don't refer to introducing such a thing, I will remove this.

Thanks
Tiejun

>
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>>
>> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
>> index 299e33a..8c61422 100644
>> --- a/tools/libxc/xc_hvm_build_x86.c
>> +++ b/tools/libxc/xc_hvm_build_x86.c
>> @@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args,
>>   }
>>
>>   static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
>> -                           uint64_t mmio_start, uint64_t mmio_size)
>> +                           uint64_t mmio_start, uint64_t mmio_size,
>> +                           unsigned int num)
>>   {
>>       struct hvm_info_table *hvm_info = (struct hvm_info_table *)
>>           (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
>> @@ -119,6 +120,9 @@ static void build_hvm_info(void *hvm_info_page,
>> uint64_t mem_size,
>>       hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
>>       hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
>>
>> +    /* Reserved device memory map number. */
>> +    hvm_info->nr_reserved_device_memory_map = num;
>> +
>>       /* Finish with the checksum. */
>>       for ( i = 0, sum = 0; i < hvm_info->length; i++ )
>>           sum += ((uint8_t *)hvm_info)[i];
>> @@ -329,6 +333,7 @@ static int setup_guest(xc_interface *xch,
>>       int claim_enabled = args->claim_enabled;
>>       xen_pfn_t special_array[NR_SPECIAL_PAGES];
>>       xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
>> +    unsigned int num_reserved = 0;
>>
>>       if ( nr_pages > target_pages )
>>           pod_mode = XENMEMF_populate_on_demand;
>> @@ -371,6 +376,8 @@ static int setup_guest(xc_interface *xch,
>>       if ( rc < 0 )
>>           goto error_out;
>>
>> +    num_reserved = rc;
>> +
>>       for ( i = 0; i < nr_pages; i++ )
>>           page_array[i] = i;
>>       for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
>> @@ -540,7 +547,7 @@ static int setup_guest(xc_interface *xch,
>>                 xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
>>                 HVM_INFO_PFN)) == NULL )
>>           goto error_out;
>> -    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size);
>> +    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size,
>> num_reserved);
>>       munmap(hvm_info_page, PAGE_SIZE);
>>
>>       /* Allocate and clear special pages. */
>> diff --git a/xen/include/public/hvm/hvm_info_table.h
>> b/xen/include/public/hvm/hvm_info_table.h
>> index 36085fa..bf401d5 100644
>> --- a/xen/include/public/hvm/hvm_info_table.h
>> +++ b/xen/include/public/hvm/hvm_info_table.h
>> @@ -65,6 +65,9 @@ struct hvm_info_table {
>>        */
>>       uint32_t    high_mem_pgend;
>>
>> +    /* How many reserved device memory maps does we have? */
>> +    uint32_t    nr_reserved_device_memory_map;
>> +
>>       /* Bitmap of which CPUs are online at boot time. */
>>       uint8_t     vcpu_online[(HVM_MAX_VCPUS + 7)/8];
>>   };
>> --
>> 1.9.1
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10 21:41   ` Tian, Kevin
@ 2014-09-11  1:32     ` Chen, Tiejun
  2014-09-11  7:52     ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-11  1:32 UTC (permalink / raw)
  To: Tian, Kevin, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

On 2014/9/11 5:41, Tian, Kevin wrote:
>> From: Chen, Tiejun
>> Sent: Tuesday, September 09, 2014 10:50 PM
>>
>> We will introduce that hypercall xc_reserved_device_memory_map
>> to hvmloader.
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>>
>> diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
>> index 80d822f..90dbb6e 100644
>> --- a/tools/firmware/hvmloader/util.c
>> +++ b/tools/firmware/hvmloader/util.c
>> @@ -828,6 +828,28 @@ int hpet_exists(unsigned long hpet_base)
>>       return ((hpet_id >> 16) == 0x8086);
>>   }
>>
>> +int get_reserved_device_memory_map(struct
>> xen_mem_reserved_device_memory entries[],
>> +                                   uint32_t max_entries)
>
> usually claim as a pointer instead of array as the parameter.

But I see there are some existing similar things like,

tools/libxc/xenctrl.h:

int xc_domain_set_memory_map(xc_interface *xch,
                                uint32_t domid,
                                struct e820entry entries[],
                                uint32_t nr_entries);

int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries);

>
>> +{
>> +    static int map_done = 0;
>
> don't see much value of the static variable here. Better for the caller
> to judge whether multiple calls are required.

Originally this intends to indicate if we call this hypercall in case 
when we already know that appropriate entries number as we introduce 
that new field nr_reserved_device_memory_map in hvm_info_table. If so, 
we just need to call one time.

But looks you guys think we shouldn't live with 
nr_reserved_device_memory_map, so this should be gone.

Thanks
Tiejun

>
>> +    struct xen_mem_reserved_device_memory_map memmap = {
>> +        .nr_entries = max_entries
>> +    };
>> +
>> +    if ( map_done )
>> +        return 0;
>> +
>> +    set_xen_guest_handle(memmap.buffer, entries);
>> +
>> +    if ( hypercall_memory_op(XENMEM_reserved_device_memory_map,
>> +                             &memmap) != 0 )
>> +        BUG();
>> +
>> +    map_done = 1;
>> +
>> +    return 0;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/tools/firmware/hvmloader/util.h
>> b/tools/firmware/hvmloader/util.h
>> index a70e4aa..cca0d5b 100644
>> --- a/tools/firmware/hvmloader/util.h
>> +++ b/tools/firmware/hvmloader/util.h
>> @@ -241,6 +241,9 @@ int build_e820_table(struct e820entry *e820,
>>                        unsigned int bios_image_base);
>>   void dump_e820_table(struct e820entry *e820, unsigned int nr);
>>
>> +#include <xen/memory.h>
>> +int get_reserved_device_memory_map(struct
>> xen_mem_reserved_device_memory entries[],
>> +                                   uint32_t max_entries);
>>   #ifndef NDEBUG
>>   void perform_tests(void);
>>   #else
>> --
>> 1.9.1
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM
  2014-09-10 21:44 ` [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tian, Kevin
@ 2014-09-11  1:38   ` Chen, Tiejun
  2014-09-11  7:48     ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-11  1:38 UTC (permalink / raw)
  To: Tian, Kevin, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

On 2014/9/11 5:44, Tian, Kevin wrote:
>> From: Chen, Tiejun
>> Sent: Tuesday, September 09, 2014 10:50 PM
>>
>
> currently the confliction is detected absolutely. Do we need a way to allow
> the confliction if there is no device assigned at all?

How to handle a hot-plug case when guest already boot? I think it may 
not be worth distinguishing such fine gain, things will be becoming 
complicated.

Thanks
Tiejun

>
>> v6:
>>
>> * Jan cleanup to regenerate one patch to replace the two original patches
>>    then rebase the others.
>> * Refine all patches short/long logs.
>> * Replace those info with 'reserved device memory maps' and s/RMRR/RDM.
>> * Fix some code styles.
>> * Fix one interation error when we grab e820 info
>>    This is not valid when j == nr - 1 (last iteration).
>> * Cleanup some codes.
>> * One comment to introduce a new field, nr_reserved_device_memory_map
>>    libxc/hvm_info_table, is still pending to wait the tools maintainer's
>>    further comment.
>>
>> v5:
>>
>> * Add patch #2 to introduce a global count, acpi_rmrr_unit_entries.
>> * Refine hypercall return value to make sure the caller can distinguish
>>    clearly between "xen filled in N entries" and "xen said you need N
>>    entries for all information".
>> * Refine some structures
>> * Then Rebase
>>
>> v4:
>>
>> * Drop the original patch #1. Instead, we use acpi_rmrr_units to get
>>    rmrr info directly.
>> * Refine the hypercall definition to make sure we can use it safely.
>> * Introduce introduce nr_reserved_device_memory_map in hvm_info_table,
>>    then we can avoid issue unnecessary hypercall, even we can know
>>    current RMRR entries to issue hypercall one time.
>> * Cleanup and rebase
>>
>> v3:
>>
>> * Use XENMEM_reserved_device_memory_map to replace
>> XENMEM_RMRR_memory_map
>> * Then rebase all patches
>>
>> v2:
>>
>> * Don't use e820map to define RMRR maps directly to avoid any confusion.
>> * In patch #3 we introduce construct_rmrr_e820_maps() to check if we can
>>    insert RMRR maps and then we will sort all e820 entries.
>> * Clean patch #4
>> * In patch #5 we reuse check_mmio_hole() to check if current mmio range is
>>    fine to RMRR maps. If not, we just issue error to notify the user since
>>    mostly mmio should be configured again.
>>
>> While we work for supporting RMRR mapping for Windows GFX driver in case
>> shared table,
>>
>> http://osdir.com/ml/general/2014-07/msg55347.html
>> http://osdir.com/ml/general/2014-07/msg55348.html
>>
>> we realize we should reserve RMRR range to avoid any potential MMIO/RAM
>> overlap with our discussion so here these preliminary patches are intended
>> to cover this.
>>
>> ----------------------------------------------------------------
>> Jan Beulich (1):
>>        introduce XENMEM_reserved_device_memory_map
>>
>> Tiejun Chen (6):
>>        tools/libxc: introduce hypercall for xc_reserved_device_memory_map
>>        tools/libxc: check if mmio BAR is out of reserved device memory maps
>>        libxc/hvm_info_table: introduce a new field
>> nr_reserved_device_memory_map
>>        hvmloader: introduce hypercall for xc_reserved_device_memory_map
>>        hvmloader: check to reserved device memory maps in e820
>>        xen/vtd: make USB RMRR mapping safe
>>
>>   tools/firmware/hvmloader/e820.c         | 100
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> ++++++++++++++++++++++++++++++++++++
>>   tools/firmware/hvmloader/util.c         |  22 ++++++++++++++++++++++
>>   tools/firmware/hvmloader/util.h         |   3 +++
>>   tools/libxc/xc_domain.c                 |  29
>> +++++++++++++++++++++++++++++
>>   tools/libxc/xc_hvm_build_x86.c          |  82
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> ++++++++++++++++--
>>   tools/libxc/xenctrl.h                   |   4 ++++
>>   xen/common/compat/memory.c              |  52
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   xen/common/memory.c                     |  49
>> +++++++++++++++++++++++++++++++++++++++++++++++++
>>   xen/drivers/passthrough/iommu.c         |  10 ++++++++++
>>   xen/drivers/passthrough/vtd/dmar.c      |  17 +++++++++++++++++
>>   xen/drivers/passthrough/vtd/extern.h    |   1 +
>>   xen/drivers/passthrough/vtd/iommu.c     |   9 +--------
>>   xen/include/public/hvm/hvm_info_table.h |   3 +++
>>   xen/include/public/memory.h             |  24
>> +++++++++++++++++++++++-
>>   xen/include/xen/iommu.h                 |   4 ++++
>>   xen/include/xlat.lst                    |   3 ++-
>>   16 files changed, 400 insertions(+), 12 deletions(-)
>>
>> Thanks
>> Tiejun
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM
  2014-09-11  1:38   ` Chen, Tiejun
@ 2014-09-11  7:48     ` Jan Beulich
  2014-09-11  9:39       ` Chen, Tiejun
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2014-09-11  7:48 UTC (permalink / raw)
  To: Kevin Tian, Tiejun Chen
  Cc: Yang Z Zhang, xen-devel, ian.jackson, ian.campbell, stefano.stabellini

>>> On 11.09.14 at 03:38, <tiejun.chen@intel.com> wrote:
> On 2014/9/11 5:44, Tian, Kevin wrote:
>>> From: Chen, Tiejun
>>> Sent: Tuesday, September 09, 2014 10:50 PM
>>>
>>
>> currently the confliction is detected absolutely. Do we need a way to allow
>> the confliction if there is no device assigned at all?
> 
> How to handle a hot-plug case when guest already boot? I think it may 
> not be worth distinguishing such fine gain, things will be becoming 
> complicated.

In that case hotplug should fail. I'm very much in agreement with
Kevin that an override should be possible if there's any risk of the
detection done now could cause problems on certain systems (as
pointed out before, I'm mainly concerned about RMRRs being
defined in regions that overlap where the guest's BIOS wants to
be placed).

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10 21:41   ` Tian, Kevin
  2014-09-11  1:32     ` Chen, Tiejun
@ 2014-09-11  7:52     ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-11  7:52 UTC (permalink / raw)
  To: Kevin Tian, Tiejun Chen
  Cc: Yang Z Zhang, xen-devel, ian.jackson, ian.campbell, stefano.stabellini

>>> On 10.09.14 at 23:41, <kevin.tian@intel.com> wrote:
>>  From: Chen, Tiejun
>> Sent: Tuesday, September 09, 2014 10:50 PM
>> 
>> We will introduce that hypercall xc_reserved_device_memory_map
>> to hvmloader.
>> 
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>> 
>> diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
>> index 80d822f..90dbb6e 100644
>> --- a/tools/firmware/hvmloader/util.c
>> +++ b/tools/firmware/hvmloader/util.c
>> @@ -828,6 +828,28 @@ int hpet_exists(unsigned long hpet_base)
>>      return ((hpet_id >> 16) == 0x8086);
>>  }
>> 
>> +int get_reserved_device_memory_map(struct
>> xen_mem_reserved_device_memory entries[],
>> +                                   uint32_t max_entries)
> 
> usually claim as a pointer instead of array as the parameter.

I see nothing wrong with the array syntax; if what is being referred
to in fact is more than one object, I'd actually slightly favor this
syntax. All that keeping in mind that both forms mean _exactly_
the same.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM
  2014-09-11  7:48     ` Jan Beulich
@ 2014-09-11  9:39       ` Chen, Tiejun
  2014-09-11 10:01         ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-11  9:39 UTC (permalink / raw)
  To: Jan Beulich, Kevin Tian
  Cc: Yang Z Zhang, xen-devel, ian.jackson, ian.campbell, stefano.stabellini

On 2014/9/11 15:48, Jan Beulich wrote:
>>>> On 11.09.14 at 03:38, <tiejun.chen@intel.com> wrote:
>> On 2014/9/11 5:44, Tian, Kevin wrote:
>>>> From: Chen, Tiejun
>>>> Sent: Tuesday, September 09, 2014 10:50 PM
>>>>
>>>
>>> currently the confliction is detected absolutely. Do we need a way to allow
>>> the confliction if there is no device assigned at all?
>>
>> How to handle a hot-plug case when guest already boot? I think it may
>> not be worth distinguishing such fine gain, things will be becoming
>> complicated.
>
> In that case hotplug should fail. I'm very much in agreement with
> Kevin that an override should be possible if there's any risk of the
> detection done now could cause problems on certain systems (as
> pointed out before, I'm mainly concerned about RMRRs being
> defined in regions that overlap where the guest's BIOS wants to
> be placed).
>

Looks I have to refactor most codes again :)

I just take a quick look at Xen codes to try figuring out this possible 
way. As I see, either we do assign a device when create VM, or we do 
attach a device to a running VM, eventually we always call 
libxl__device_pci_add() to finish this operation. Furthermore, 
xc_test_assign_device() call such a hypercall, 
XEN_DOMCTL_test_assign_device. Right?

If yes, I guess I can do check if RMRR is overlapping something in two 
cases:

#1: In case of creating a VM, we may use global flag to make sure our 
original codes work indeed. It should be easy and especially I guess we 
can't check any conflict directly here since those memory info, 
RAM/MMIO/others, don't be filled completely, right?

#2: In case of a running VM, we can check if any overlapping exist then 
determine what's next. Often we should drop to attach such a conflicting 
device.

I'm not sure if I understand this path in Xen, so if I'm wrong please 
correct me.

Thanks
Tiejun	

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM
  2014-09-11  9:39       ` Chen, Tiejun
@ 2014-09-11 10:01         ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-11 10:01 UTC (permalink / raw)
  To: Kevin Tian, Tiejun Chen
  Cc: Yang Z Zhang, xen-devel, ian.jackson, ian.campbell, stefano.stabellini

>>> On 11.09.14 at 11:39, <tiejun.chen@intel.com> wrote:
> #1: In case of creating a VM, we may use global flag to make sure our 
> original codes work indeed. It should be easy and especially I guess we 
> can't check any conflict directly here since those memory info, 
> RAM/MMIO/others, don't be filled completely, right?

This part will probably need fully dealing with in the tools, with the
hypervisor only refusing to set up conflicting P2M entries (as iirc
you're already doing).

> #2: In case of a running VM, we can check if any overlapping exist then 
> determine what's next. Often we should drop to attach such a conflicting 
> device.

This one I would guess would benefit from the tools doing the checking
if they can, but perhaps the hypervisor is in a better position to do this.
If I'm right with this, at the tools level it would need to be made certain
that this doesn't end up in a cryptic and/or well hidden error messages,
leaving the operator to guess what went wrong.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10  5:49 ` [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
@ 2014-09-11 15:21   ` Jan Beulich
  2014-09-11 15:23     ` Ian Campbell
                       ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-11 15:21 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
> +    if ( errno == ENOBUFS )
> +        *max_entries = memmap.nr_entries;
> +
> +    return rc ? -errno : memmap.nr_entries;
> +}

Isn't libxc generally aiming at using POSIX style return values, i.e. -1
one error with errno indicating the details? Furthermore you're even
inconsistent with yourself, as you return -1 in an earlier error path.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map
  2014-09-11 15:21   ` Jan Beulich
@ 2014-09-11 15:23     ` Ian Campbell
  2014-09-11 15:55     ` Andrew Cooper
  2014-09-12  2:43     ` Chen, Tiejun
  2 siblings, 0 replies; 37+ messages in thread
From: Ian Campbell @ 2014-09-11 15:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, stefano.stabellini, ian.jackson, xen-devel,
	yang.z.zhang, Tiejun Chen

On Thu, 2014-09-11 at 16:21 +0100, Jan Beulich wrote:
> >>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
> > +    if ( errno == ENOBUFS )
> > +        *max_entries = memmap.nr_entries;
> > +
> > +    return rc ? -errno : memmap.nr_entries;
> > +}
> 
> Isn't libxc generally aiming at using POSIX style return values, i.e. -1
> one error with errno indicating the details?

We are trying to move that way, yes. It's an unholy mess right now but
lets at least try not to add to it!

>  Furthermore you're even
> inconsistent with yourself, as you return -1 in an earlier error path.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-10  5:49 ` [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps Tiejun Chen
  2014-09-10 21:37   ` Tian, Kevin
@ 2014-09-11 15:38   ` Jan Beulich
  2014-09-12  2:56     ` Chen, Tiejun
  2014-09-12  6:19     ` Jan Beulich
  1 sibling, 2 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-11 15:38 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
> +static int check_rdm_overlap(xc_interface *xch, uint64_t mmio_start,
> +                             uint64_t mmio_size)
> +{
> +    struct xen_mem_reserved_device_memory *map = NULL;
> +    uint64_t rdm_start = 0, rdm_end = 0;
> +    unsigned int i = 0;
> +    int rc = 0;
> +    /* Assume we have one entry if not enough we'll expand.*/
> +    uint32_t nr_entries = 1;
> +
> +    /* We should check if mmio range is out of RDM mapping. */
> +    if ( (map = malloc(nr_entries *
> +                       sizeof(xen_mem_reserved_device_memory_t))) == NULL )
> +    {
> +        PERROR("Could not allocate memory for map.");
> +        return -1;
> +    }
> +    rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
> +    if ( rc < 0 )
> +    {
> +        /* DRM doesn't exist. */
> +        if ( rc == -ENOENT )
> +            rc = 0;
> +        else if ( rc == -ENOBUFS)

I think you'd be better off handling both if() levels we're in here
with a single switch().

> +        {
> +            free(map);
> +            /* Now we need more space to map all RDM mappings. */
> +            if ( (map = malloc(nr_entries *
> +                               sizeof(xen_mem_reserved_device_memory_t))) == NULL )
> +            {
> +                PERROR("Could not allocate memory for map.");
> +                return -1;
> +            }
> +            rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
> +            if ( rc < 0 )
> +            {
> +                PERROR("Could not get DRM info on domain");

Please change to RDM here and elsewhere.

> +                free(map);
> +                return rc;
> +            }
> +        }
> +        else
> +            PERROR("Could not get RDM info on domain");

Not returning here means looping for almost 4G iterations below.

> +    }
> +
> +    for ( i = 0; i < rc; i++ )
> +    {
> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
> +        if ( check_mmio_hole(rdm_start, map[i].nr_pages * PAGE_SIZE,

First of all please be consistent: Either always use "<< PAGE_SHIFT"
or always use "* PAGE_SIZE". And then what you're doing here won't
do what you intend to in a 32-bit toolstack (due to the lack of casts).
Plus finally I think using XC_PAGE_* would be the better choice here
even if PAGE_* are #define-d as aliases thereof.

> +                             mmio_start, mmio_size) )

Can you really use check_mmio_hole() here? I'm not certain that
its two one-offs are intentional (to allow for a gap between MMIO
and RAM), but these are certainly harmful to you.

> +        {
> +            PERROR("MMIO: [%lx]<->[%lx] overlap DRM [%lx]<->[%lx]\n",

Please use a more conventional representation here, e.g.

"MMIO [%lx,%lx) overlaps RDM [%lx,%lx)\n"

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map
  2014-09-10  5:49 ` [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
  2014-09-10 21:41   ` Tian, Kevin
@ 2014-09-11 15:45   ` Jan Beulich
  2014-09-12  4:52     ` Chen, Tiejun
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2014-09-11 15:45 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
> We will introduce that hypercall xc_reserved_device_memory_map
> to hvmloader.

Title and text: What does a libxc name do here?

Also I don't think this warrants a separate patch - introduce the
function along with the code needing it.

> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> 
> diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
> index 80d822f..90dbb6e 100644
> --- a/tools/firmware/hvmloader/util.c
> +++ b/tools/firmware/hvmloader/util.c
> @@ -828,6 +828,28 @@ int hpet_exists(unsigned long hpet_base)
>      return ((hpet_id >> 16) == 0x8086);
>  }
>  
> +int get_reserved_device_memory_map(struct xen_mem_reserved_device_memory entries[],
> +                                   uint32_t max_entries)
> +{
> +    static int map_done = 0;
> +    struct xen_mem_reserved_device_memory_map memmap = {
> +        .nr_entries = max_entries
> +    };
> +
> +    if ( map_done )
> +        return 0;
> +
> +    set_xen_guest_handle(memmap.buffer, entries);
> +
> +    if ( hypercall_memory_op(XENMEM_reserved_device_memory_map,
> +                             &memmap) != 0 )
> +        BUG();

I don't think there's any harm in not BUG()ing here - just return the
error to the caller (and let it assume there are no entries). In the
worst case guest startup will subsequently fail (instead of here).

> +
> +    map_done = 1;

The map_done variable if completely bogus here, but I think
someone else already pointed this out.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map
  2014-09-11 15:21   ` Jan Beulich
  2014-09-11 15:23     ` Ian Campbell
@ 2014-09-11 15:55     ` Andrew Cooper
  2014-09-12  2:43     ` Chen, Tiejun
  2 siblings, 0 replies; 37+ messages in thread
From: Andrew Cooper @ 2014-09-11 15:55 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, Jan Beulich, yang.z.zhang

On 11/09/14 16:21, Jan Beulich wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> +    if ( errno == ENOBUFS )
>> +        *max_entries = memmap.nr_entries;
>> +
>> +    return rc ? -errno : memmap.nr_entries;
>> +}
> Isn't libxc generally aiming at using POSIX style return values, i.e. -1
> one error with errno indicating the details? Furthermore you're even
> inconsistent with yourself, as you return -1 in an earlier error path.

Furthermore, this in now the 3rd time I have commented on this problem
specifically, across different versions of this series.

Please address all outstanding comments before reposting.

~Andrew

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820
  2014-09-10  5:49 ` [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820 Tiejun Chen
@ 2014-09-11 15:57   ` Jan Beulich
  2014-09-12  6:08     ` Jan Beulich
  2014-09-12  6:28     ` Chen, Tiejun
  0 siblings, 2 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-11 15:57 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
> +static unsigned int construct_rdm_e820_maps(unsigned int nr,
> +                                            uint32_t nr_map,
> +                                            struct xen_mem_reserved_device_memory *map,
> +                                            struct e820entry *e820)
> +{
> +    unsigned int i = 0, j = 0, m = 0, sum_nr = nr + nr_map;

Some or all of the initializers here are pointless.

> +    uint64_t start, end, rdm_start, rdm_end;
> +    unsigned int insert = 0, do_insert = 0;
> +
> + do_real_construct:
> +    for ( i = 0; i < nr_map; i++ )
> +    {
> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;

Same comment as for an earlier patch.

> +
> +        for ( j = 0; j < nr - 1; j++ )
> +        {
> +            end = e820[j].addr + e820[j].size;
> +            start = e820[j+1].addr;

So now you don't check the last region at all? You just need to make
this assignment conditional upon whether you're in the last iteration.

> +
> +            /* Between those existing e820 entries. */
> +            if ( (rdm_start > end) && (rdm_end < start) )
> +            {
> +                if ( do_insert )
> +                {
> +                    /* Move to free this entry. */
> +                    for ( m = sum_nr - 1; m > j; m-- )
> +                    {
> +                        e820[m].addr = e820[m-1].addr;
> +                        e820[m].size = e820[m-1].size;
> +                        e820[m].type = e820[m-1].type;
> +                    }

You know of the memmove() function, don't you?

> +
> +                    /* Then fill RMRR into that entry. */
> +                    e820[j+1].addr = rdm_start;
> +                    e820[j+1].size = rdm_end - rdm_start;
> +                    e820[j+1].type = E820_RESERVED;
> +                    nr++;
> +                }
> +                insert++;
> +            }
> +            /* Already at the end. */
> +            else if ( (rdm_start > end) && !start )

How would !start represent the end of anything?

> +            {
> +                if ( do_insert )
> +                {
> +                    e820[nr].addr = rdm_start;
> +                    e820[nr].size = rdm_end - rdm_start;
> +                    e820[nr].type = E820_RESERVED;
> +                    nr++;
> +                }
> +                insert++;
> +            }
> +        }
> +    }
> +
> +    /* Just return if done. */
> +    if ( do_insert )
> +        return nr;
> +
> +    /* Fine to construct RDM mappings into e820. */
> +    if ( insert == nr_map )
> +    {
> +        do_insert = 1;
> +        goto do_real_construct;
> +    }
> +    /* Overlap. */
> +    else
> +    {
> +        printf("RDM overlap with those existing e820 entries!\n");
> +        printf("So we don't construct RDM mapping in e820!\n");
> +    }
> +
> +    return nr;
> +}
>  /* Create an E820 table based on memory parameters provided in hvm_info. */

Blank line missing above.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-11  1:14     ` Chen, Tiejun
@ 2014-09-11 22:55       ` Tian, Kevin
  0 siblings, 0 replies; 37+ messages in thread
From: Tian, Kevin @ 2014-09-11 22:55 UTC (permalink / raw)
  To: Chen, Tiejun, JBeulich, ian.campbell, ian.jackson,
	stefano.stabellini, Zhang, Yang Z
  Cc: xen-devel

> From: Chen, Tiejun
> Sent: Wednesday, September 10, 2014 6:15 PM
> 
> On 2014/9/11 5:37, Tian, Kevin wrote:
> >> From: Chen, Tiejun
> >> Sent: Tuesday, September 09, 2014 10:50 PM
> >>
> >> We need to avoid allocating MMIO BAR conflicting to all reserved device
> >> memory range.
> >
> > besides checking MMIO BAR confliction, you also need check guest memory
> > confliction with reserved device memory ranges.
> 
> Kexin,
> 
> In patch #6, hvmloader: check to reserved device memory maps in e820,
> we'll lookup the entire e820 table to check if RMRR is overlapping with
> all recorded ranges in e820, so I think that already cover this case, right?
> 

yes, forgot my comment then. My comment was still based on the assumption
that we'll both detect and fix confliction, so 'fix' means we need do something
before P2M is allocated. But since the decision now is only for detection, yes
patch #6 is enough then.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map
  2014-09-11 15:21   ` Jan Beulich
  2014-09-11 15:23     ` Ian Campbell
  2014-09-11 15:55     ` Andrew Cooper
@ 2014-09-12  2:43     ` Chen, Tiejun
  2014-09-12  6:20       ` Jan Beulich
  2 siblings, 1 reply; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-12  2:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

On 2014/9/11 23:21, Jan Beulich wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> +    if ( errno == ENOBUFS )
>> +        *max_entries = memmap.nr_entries;
>> +
>> +    return rc ? -errno : memmap.nr_entries;
>> +}
>
> Isn't libxc generally aiming at using POSIX style return values, i.e. -1
> one error with errno indicating the details? Furthermore you're even
> inconsistent with yourself, as you return -1 in an earlier error path.
>

tools/libxc/xc_domain.c:

int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries)
{
     int rc;
     struct xen_memory_map memmap = {
         .nr_entries = max_entries
     };
     DECLARE_HYPERCALL_BOUNCE(entries, sizeof(struct e820entry) * 
max_entries,
                              XC_HYPERCALL_BUFFER_BOUNCE_OUT);

     if ( !entries || xc_hypercall_bounce_pre(xch, entries) || 
max_entries <= 1)
         return -1;


     set_xen_guest_handle(memmap.buffer, entries);

     rc = do_memory_op(xch, XENMEM_machine_memory_map, &memmap, 
sizeof(memmap));

     xc_hypercall_bounce_post(xch, entries);

     return rc ? rc : memmap.nr_entries;
}

Here I just change last line as follows:

return rc ? -errno : memmap.nr_entries;

Because if I don't return '-errno', how will the caller know the 
detailed error?
     rc = xc_reserved_device_memory_map(xch, xmrdm, &nr_entries);
     if ( rc < 0 )
     {
         /* DRM doesn't exist. */
         if ( rc == -ENOENT )
             rc = 0;

Or are you saying we should use errno ditectly,

+
+        rc = xc_reserved_device_memory_map(xch, xmrdm, &nr_entries);
+        if (rc < 0 ) {
+            switch (errno) {
+            case ENOENT: /* RDM doesn't exist. */


Right?

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-11 15:38   ` Jan Beulich
@ 2014-09-12  2:56     ` Chen, Tiejun
  2014-09-12  6:19     ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-12  2:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

On 2014/9/11 23:38, Jan Beulich wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> +static int check_rdm_overlap(xc_interface *xch, uint64_t mmio_start,
>> +                             uint64_t mmio_size)
>> +{
>> +    struct xen_mem_reserved_device_memory *map = NULL;
>> +    uint64_t rdm_start = 0, rdm_end = 0;
>> +    unsigned int i = 0;
>> +    int rc = 0;
>> +    /* Assume we have one entry if not enough we'll expand.*/
>> +    uint32_t nr_entries = 1;
>> +
>> +    /* We should check if mmio range is out of RDM mapping. */
>> +    if ( (map = malloc(nr_entries *
>> +                       sizeof(xen_mem_reserved_device_memory_t))) == NULL )
>> +    {
>> +        PERROR("Could not allocate memory for map.");
>> +        return -1;
>> +    }
>> +    rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
>> +    if ( rc < 0 )
>> +    {
>> +        /* DRM doesn't exist. */
>> +        if ( rc == -ENOENT )
>> +            rc = 0;
>> +        else if ( rc == -ENOBUFS)
>
> I think you'd be better off handling both if() levels we're in here
> with a single switch().

Okay.

>
>> +        {
>> +            free(map);
>> +            /* Now we need more space to map all RDM mappings. */
>> +            if ( (map = malloc(nr_entries *
>> +                               sizeof(xen_mem_reserved_device_memory_t))) == NULL )
>> +            {
>> +                PERROR("Could not allocate memory for map.");
>> +                return -1;
>> +            }
>> +            rc = xc_reserved_device_memory_map(xch, map, &nr_entries);
>> +            if ( rc < 0 )
>> +            {
>> +                PERROR("Could not get DRM info on domain");
>
> Please change to RDM here and elsewhere.

Okay.

>
>> +                free(map);
>> +                return rc;
>> +            }
>> +        }
>> +        else
>> +            PERROR("Could not get RDM info on domain");
>
> Not returning here means looping for almost 4G iterations below.
>
>> +    }
>> +
>> +    for ( i = 0; i < rc; i++ )
>> +    {
>> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
>> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
>> +        if ( check_mmio_hole(rdm_start, map[i].nr_pages * PAGE_SIZE,
>
> First of all please be consistent: Either always use "<< PAGE_SHIFT"
> or always use "* PAGE_SIZE". And then what you're doing here won't
> do what you intend to in a 32-bit toolstack (due to the lack of casts).
> Plus finally I think using XC_PAGE_* would be the better choice here
> even if PAGE_* are #define-d as aliases thereof.
>
>> +                             mmio_start, mmio_size) )
>
> Can you really use check_mmio_hole() here? I'm not certain that
> its two one-offs are intentional (to allow for a gap between MMIO
> and RAM), but these are certainly harmful to you.

I don't understand what you mean here. But here we should check if RMRR 
is overlapping the whole MMIO,

/*
  * Check whether there exists mmio hole in the specified memory range.
  * Returns 1 if exists, else returns 0.
  */
static int check_mmio_hole(uint64_t start, uint64_t memsize,
                            uint64_t mmio_start, uint64_t mmio_size)
{
     if ( start + memsize <= mmio_start || start >= mmio_start + mmio_size )
         return 0;
     else
         return 1;
}

>
>> +        {
>> +            PERROR("MMIO: [%lx]<->[%lx] overlap DRM [%lx]<->[%lx]\n",
>
> Please use a more conventional representation here, e.g.
>
> "MMIO [%lx,%lx) overlaps RDM [%lx,%lx)\n"

Okay.

I tried to address your comments but I assume its correct to use 'errno' 
directly as I'm asking in another email:

@@ -54,6 +54,9 @@

  #define VGA_HOLE_SIZE (0x20)

+/* Record reserved device memory. */
+static struct xen_mem_reserved_device_memory *xmrdm = NULL;
+
  static int modules_init(struct xc_hvm_build_args *args,
                          uint64_t vend, struct elf_binary *elf,
                          uint64_t *mstart_out, uint64_t *mend_out)
@@ -239,6 +242,70 @@ static int check_mmio_hole(uint64_t start, uint64_t 
memsize,
          return 1;
  }

+/*
+ * Check whether there exists mmio overplap with the reserved device
+ * memory map
+ */
+static int check_rdm_overlap(xc_interface *xch, uint64_t mmio_start,
+                             uint64_t mmio_size)
+{
+    uint64_t rdm_start = 0, rdm_end = 0;
+    unsigned int i = 0;
+    int rc = 0;
+    /* Assume we have one entry if not enough we'll expand.*/
+    uint32_t nr_entries = 1;
+
+    /* We should check if mmio range is out of RDM mapping. */
+    if (!xmrdm) {
+        if ((xmrdm = malloc(nr_entries *
+                            sizeof(xen_mem_reserved_device_memory_t))) 
== NULL) {
+            PERROR("Could not allocate memory for map.");
+            return -1;
+        }
+
+        rc = xc_reserved_device_memory_map(xch, xmrdm, &nr_entries);
+        if (rc < 0 ) {
+            switch (errno) {
+            case ENOENT: /* RDM doesn't exist. */
+                rc = 0;
+                break;
+            case ENOBUFS: /* Need more space */
+                free(xmrdm);
+                /* Now we need more space to map all RDM mappings. */
+                if ((xmrdm = malloc(nr_entries *
+ 
sizeof(xen_mem_reserved_device_memory_t))) == NULL) {
+                    PERROR("Could not allocate memory for map.");
+                    return -1;
+                }
+                rc = xc_reserved_device_memory_map(xch, xmrdm, 
&nr_entries);
+                if (rc) {
+                    PERROR("Could not get RDM info on domain");
+                    free(xmrdm);
+                    return rc;
+                }
+                break;
+            default: /* Failed to get RDM */
+                PERROR("Could not get RDM info on domain");
+                return rc;
+            }
+        }
+    }
+
+    for (i = 0; i < rc; i++) {
+        rdm_start = xmrdm[i].start_pfn << XC_PAGE_SHIFT;
+        rdm_end = rdm_start + (xmrdm[i].nr_pages << XC_PAGE_SHIFT);
+        if (check_mmio_hole(rdm_start, (xmrdm[i].nr_pages << 
XC_PAGE_SHIFT),
+                            mmio_start, mmio_size) ) {
+            PERROR("MMIO: [%lx,%lx) overlap RDM [%lx,%lx)\n",
+                   mmio_start, (mmio_start + mmio_size), rdm_start, 
rdm_end);
+            free(xmrdm);
+            return -1;
+        }
+    }
+
+    return rc;
+}
+
  static int setup_guest(xc_interface *xch,
                         uint32_t dom, struct xc_hvm_build_args *args,
                         char *image, unsigned long image_size)

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map
  2014-09-11 15:45   ` Jan Beulich
@ 2014-09-12  4:52     ` Chen, Tiejun
  0 siblings, 0 replies; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-12  4:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

On 2014/9/11 23:45, Jan Beulich wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> We will introduce that hypercall xc_reserved_device_memory_map
>> to hvmloader.
>
> Title and text: What does a libxc name do here?

I need to rename this.

>
> Also I don't think this warrants a separate patch - introduce the
> function along with the code needing it.

I will squash the following patch into this.

>
>> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
>>
>> diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
>> index 80d822f..90dbb6e 100644
>> --- a/tools/firmware/hvmloader/util.c
>> +++ b/tools/firmware/hvmloader/util.c
>> @@ -828,6 +828,28 @@ int hpet_exists(unsigned long hpet_base)
>>       return ((hpet_id >> 16) == 0x8086);
>>   }
>>
>> +int get_reserved_device_memory_map(struct xen_mem_reserved_device_memory entries[],
>> +                                   uint32_t max_entries)
>> +{
>> +    static int map_done = 0;
>> +    struct xen_mem_reserved_device_memory_map memmap = {
>> +        .nr_entries = max_entries
>> +    };
>> +
>> +    if ( map_done )
>> +        return 0;
>> +
>> +    set_xen_guest_handle(memmap.buffer, entries);
>> +
>> +    if ( hypercall_memory_op(XENMEM_reserved_device_memory_map,
>> +                             &memmap) != 0 )
>> +        BUG();
>
> I don't think there's any harm in not BUG()ing here - just return the
> error to the caller (and let it assume there are no entries). In the
> worst case guest startup will subsequently fail (instead of here).

Okay so just return rc.

>
>> +
>> +    map_done = 1;
>
> The map_done variable if completely bogus here, but I think
> someone else already pointed this out.
>

Yes. I will remove this.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820
  2014-09-11 15:57   ` Jan Beulich
@ 2014-09-12  6:08     ` Jan Beulich
  2014-09-12  6:28     ` Chen, Tiejun
  1 sibling, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-12  6:08 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 11.09.14 at 17:57, <JBeulich@suse.com> wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> +static unsigned int construct_rdm_e820_maps(unsigned int nr,
>> +                                            uint32_t nr_map,
>> +                                            struct 
> xen_mem_reserved_device_memory *map,
>> +                                            struct e820entry *e820)
>> +{
>> +    unsigned int i = 0, j = 0, m = 0, sum_nr = nr + nr_map;
> 
> Some or all of the initializers here are pointless.
> 
>> +    uint64_t start, end, rdm_start, rdm_end;
>> +    unsigned int insert = 0, do_insert = 0;
>> +
>> + do_real_construct:
>> +    for ( i = 0; i < nr_map; i++ )
>> +    {
>> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
>> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
> 
> Same comment as for an earlier patch.
> 
>> +
>> +        for ( j = 0; j < nr - 1; j++ )
>> +        {
>> +            end = e820[j].addr + e820[j].size;
>> +            start = e820[j+1].addr;
> 
> So now you don't check the last region at all? You just need to make
> this assignment conditional upon whether you're in the last iteration.

Furthermore
- I don't think an overlap with an E820_RESERVED or E820_UNUSABLE
  region would be a problem (you'd just need to extend the former or
  amend the latter)
- you only check here rather than making sure no RAM gets placed
  into these address regions in the first place (or rather than relocating
  the RAM as is done for at least some MMIO/RAM overlaps)

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps
  2014-09-11 15:38   ` Jan Beulich
  2014-09-12  2:56     ` Chen, Tiejun
@ 2014-09-12  6:19     ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-12  6:19 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 11.09.14 at 17:38, <JBeulich@suse.com> wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> +                             mmio_start, mmio_size) )
> 
> Can you really use check_mmio_hole() here? I'm not certain that
> its two one-offs are intentional (to allow for a gap between MMIO
> and RAM), but these are certainly harmful to you.

Oops - I was sure I deleted this after closer looking. Please
disregard this.

However, the while thing here seems to get done in the wrong
place anyway: The (mmio_start,mmio_size) tuple here doesn't
represent any actual MMIO assignments, just the hole below
4G that's being left available for such. Instead you'd need to
avoid the specific regions in hvmloader's pci_setup().

Plus, considering migration, all of this isn't going to help if on
the destination host (in order for subsequent device hotplug to
work) different RMRR regions need to be taken care of.

All in all I continue thinking that we'd be best off disallowing
assignment of devices associated with RMRR regions when the
1:1 mappings needed collide with something already there,
perhaps with a 2-level operator override to allow USB ones
(possibly making the current USB-specific code conditional) or
anything at their own risk.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map
  2014-09-12  2:43     ` Chen, Tiejun
@ 2014-09-12  6:20       ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-12  6:20 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 12.09.14 at 04:43, <tiejun.chen@intel.com> wrote:
> Or are you saying we should use errno ditectly,
> 
> +
> +        rc = xc_reserved_device_memory_map(xch, xmrdm, &nr_entries);
> +        if (rc < 0 ) {
> +            switch (errno) {
> +            case ENOENT: /* RDM doesn't exist. */

Yes, of course.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820
  2014-09-11 15:57   ` Jan Beulich
  2014-09-12  6:08     ` Jan Beulich
@ 2014-09-12  6:28     ` Chen, Tiejun
  2014-09-12  6:44       ` Jan Beulich
  1 sibling, 1 reply; 37+ messages in thread
From: Chen, Tiejun @ 2014-09-12  6:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

On 2014/9/11 23:57, Jan Beulich wrote:
>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>> +static unsigned int construct_rdm_e820_maps(unsigned int nr,
>> +                                            uint32_t nr_map,
>> +                                            struct xen_mem_reserved_device_memory *map,
>> +                                            struct e820entry *e820)
>> +{
>> +    unsigned int i = 0, j = 0, m = 0, sum_nr = nr + nr_map;
>
> Some or all of the initializers here are pointless.

     unsigned int i, j, m, sum_nr = nr + nr_map;
     uint64_t start, end, rdm_start, rdm_end;
     unsigned int insert, do_insert;

>
>> +    uint64_t start, end, rdm_start, rdm_end;
>> +    unsigned int insert = 0, do_insert = 0;
>> +
>> + do_real_construct:
>> +    for ( i = 0; i < nr_map; i++ )
>> +    {
>> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
>> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
>
> Same comment as for an earlier patch.

         rdm_start = map[i].start_pfn << XC_PAGE_SHIFT;
         rdm_end = rdm_start + map[i].nr_pages << XC_PAGE_SHIFT;

But there's no such definition in tools/firmware/, and this is defined 
in xenctl.h. But if I include xenctl.h here, something else is 
redefined. So here I have to define this, XC_PAGE_SHIFTseparately.

>
>> +
>> +        for ( j = 0; j < nr - 1; j++ )
>> +        {
>> +            end = e820[j].addr + e820[j].size;
>> +            start = e820[j+1].addr;
>
> So now you don't check the last region at all? You just need to make
> this assignment conditional upon whether you're in the last iteration.

I remember we discussed this previously.

"
 > And this would skip the last region - I'm not sure that's correct.
 >

No. Here we just want to know where we should insert a entry, so

     for ( j = 0; j < nr - 1; j++ )
     {
         end = e820[j].addr + e820[j].size;
         start = e820[j+1].addr;

For example, nr = 3,

     for ( j = 0; j < 2; j++ )

So
#1,
         end = e820[0].addr + e820[0].size;
         start = e820[1].addr;

#2,
         end = e820[1].addr + e820[1].size;
         start = e820[2].addr;
"

I didn't you have further comments there.

>
>> +
>> +            /* Between those existing e820 entries. */
>> +            if ( (rdm_start > end) && (rdm_end < start) )
>> +            {
>> +                if ( do_insert )
>> +                {
>> +                    /* Move to free this entry. */
>> +                    for ( m = sum_nr - 1; m > j; m-- )
>> +                    {
>> +                        e820[m].addr = e820[m-1].addr;
>> +                        e820[m].size = e820[m-1].size;
>> +                        e820[m].type = e820[m-1].type;
>> +                    }
>
> You know of the memmove() function, don't you?

I will try this.

>
>> +
>> +                    /* Then fill RMRR into that entry. */
>> +                    e820[j+1].addr = rdm_start;
>> +                    e820[j+1].size = rdm_end - rdm_start;
>> +                    e820[j+1].type = E820_RESERVED;
>> +                    nr++;
>> +                }
>> +                insert++;
>> +            }
>> +            /* Already at the end. */
>> +            else if ( (rdm_start > end) && !start )
>
> How would !start represent the end of anything?

end = e820[j].addr + e820[j].size;
start = e820[j+1].addr;

so if end = 0xXXXXXXXX and start = 0, does this mean we already are at 
the end of all valid e820 entries?

>
>> +            {
>> +                if ( do_insert )
>> +                {
>> +                    e820[nr].addr = rdm_start;
>> +                    e820[nr].size = rdm_end - rdm_start;
>> +                    e820[nr].type = E820_RESERVED;
>> +                    nr++;
>> +                }
>> +                insert++;
>> +            }
>> +        }
>> +    }
>> +
>> +    /* Just return if done. */
>> +    if ( do_insert )
>> +        return nr;
>> +
>> +    /* Fine to construct RDM mappings into e820. */
>> +    if ( insert == nr_map )
>> +    {
>> +        do_insert = 1;
>> +        goto do_real_construct;
>> +    }
>> +    /* Overlap. */
>> +    else
>> +    {
>> +        printf("RDM overlap with those existing e820 entries!\n");
>> +        printf("So we don't construct RDM mapping in e820!\n");
>> +    }
>> +
>> +    return nr;
>> +}
>>   /* Create an E820 table based on memory parameters provided in hvm_info. */
>
> Blank line missing above.
>

Will fixed.

Thanks
Tiejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820
  2014-09-12  6:28     ` Chen, Tiejun
@ 2014-09-12  6:44       ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-12  6:44 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 12.09.14 at 08:28, <tiejun.chen@intel.com> wrote:
> On 2014/9/11 23:57, Jan Beulich wrote:
>>>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
>>> + do_real_construct:
>>> +    for ( i = 0; i < nr_map; i++ )
>>> +    {
>>> +        rdm_start = map[i].start_pfn << PAGE_SHIFT;
>>> +        rdm_end = rdm_start + map[i].nr_pages * PAGE_SIZE;
>>
>> Same comment as for an earlier patch.
> 
>          rdm_start = map[i].start_pfn << XC_PAGE_SHIFT;
>          rdm_end = rdm_start + map[i].nr_pages << XC_PAGE_SHIFT;
> 
> But there's no such definition in tools/firmware/, and this is defined 
> in xenctl.h. But if I include xenctl.h here, something else is 
> redefined. So here I have to define this, XC_PAGE_SHIFTseparately.

Iirc the earlier comment was on a libxc change. Obviously using its
definitions in hvmloader is forbidden. Out of the three things that
were pointed out wrong there, two still apply (and I think you should
have the smarts to filter out the one that doesn't rather than me
having to re-explain the same issue here again).

>>> +
>>> +                    /* Then fill RMRR into that entry. */
>>> +                    e820[j+1].addr = rdm_start;
>>> +                    e820[j+1].size = rdm_end - rdm_start;
>>> +                    e820[j+1].type = E820_RESERVED;
>>> +                    nr++;
>>> +                }
>>> +                insert++;
>>> +            }
>>> +            /* Already at the end. */
>>> +            else if ( (rdm_start > end) && !start )
>>
>> How would !start represent the end of anything?
> 
> end = e820[j].addr + e820[j].size;
> start = e820[j+1].addr;
> 
> so if end = 0xXXXXXXXX and start = 0, does this mean we already are at 
> the end of all valid e820 entries?

How would start validly be zero (i.e. other than incidentally due to
reading from an uninitialized entry)?

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [v6][PATCH 7/7] xen/vtd: make USB RMRR mapping safe
  2014-09-10  5:49 ` [v6][PATCH 7/7] xen/vtd: make USB RMRR mapping safe Tiejun Chen
@ 2014-09-18  9:11   ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2014-09-18  9:11 UTC (permalink / raw)
  To: Tiejun Chen
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	xen-devel, yang.z.zhang

>>> On 10.09.14 at 07:49, <tiejun.chen@intel.com> wrote:
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2273,16 +2273,8 @@ static int intel_iommu_assign_device(
>      if ( ret )
>          goto done;
>  
> -    /* FIXME: Because USB RMRR conflicts with guest bios region,
> -     * ignore USB RMRR temporarily.
> -     */
>      seg = pdev->seg;
>      bus = pdev->bus;
> -    if ( is_usb_device(seg, bus, pdev->devfn) )
> -    {
> -        ret = 0;
> -        goto done;
> -    }
>  
>      /* Setup rmrr identity mapping */
>      for_each_rmrr_device( rmrr, bdf, i )

This orphans is_usb_device(), which therefore the patch should
remove at once.

Jan

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2014-09-18  9:11 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-10  5:49 [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tiejun Chen
2014-09-10  5:49 ` [v6][PATCH 1/7] introduce XENMEM_reserved_device_memory_map Tiejun Chen
2014-09-10 21:34   ` Tian, Kevin
2014-09-10  5:49 ` [v6][PATCH 2/7] tools/libxc: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
2014-09-11 15:21   ` Jan Beulich
2014-09-11 15:23     ` Ian Campbell
2014-09-11 15:55     ` Andrew Cooper
2014-09-12  2:43     ` Chen, Tiejun
2014-09-12  6:20       ` Jan Beulich
2014-09-10  5:49 ` [v6][PATCH 3/7] tools/libxc: check if mmio BAR is out of reserved device memory maps Tiejun Chen
2014-09-10 21:37   ` Tian, Kevin
2014-09-11  1:14     ` Chen, Tiejun
2014-09-11 22:55       ` Tian, Kevin
2014-09-11 15:38   ` Jan Beulich
2014-09-12  2:56     ` Chen, Tiejun
2014-09-12  6:19     ` Jan Beulich
2014-09-10  5:49 ` [v6][PATCH 4/7] libxc/hvm_info_table: introduce a new field nr_reserved_device_memory_map Tiejun Chen
2014-09-10 21:39   ` Tian, Kevin
2014-09-11  1:16     ` Chen, Tiejun
2014-09-10  5:49 ` [v6][PATCH 5/7] hvmloader: introduce hypercall for xc_reserved_device_memory_map Tiejun Chen
2014-09-10 21:41   ` Tian, Kevin
2014-09-11  1:32     ` Chen, Tiejun
2014-09-11  7:52     ` Jan Beulich
2014-09-11 15:45   ` Jan Beulich
2014-09-12  4:52     ` Chen, Tiejun
2014-09-10  5:49 ` [v6][PATCH 6/7] hvmloader: check to reserved device memory maps in e820 Tiejun Chen
2014-09-11 15:57   ` Jan Beulich
2014-09-12  6:08     ` Jan Beulich
2014-09-12  6:28     ` Chen, Tiejun
2014-09-12  6:44       ` Jan Beulich
2014-09-10  5:49 ` [v6][PATCH 7/7] xen/vtd: make USB RMRR mapping safe Tiejun Chen
2014-09-18  9:11   ` Jan Beulich
2014-09-10 21:44 ` [v6][PATCH 0/7] xen: reserve RMRR to avoid conflicting MMIO/RAM Tian, Kevin
2014-09-11  1:38   ` Chen, Tiejun
2014-09-11  7:48     ` Jan Beulich
2014-09-11  9:39       ` Chen, Tiejun
2014-09-11 10:01         ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.