* [PATCH v7 00/12] x86: guest resource mapping
@ 2017-09-18 15:31 Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns Paul Durrant
                   ` (11 more replies)
  0 siblings, 12 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich

This series introduces support for direct mapping of guest resources.
The resources are:
 - Grant tables
 - IOREQ server pages

NOTE: This series is based on Juergen Gross's patch "xen: move
XENMAPSPACE_grant_table code into grant_table.c", re-based onto current
master. For convenience the code is also available on a branch at:

http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git;a=shortlog;h=refs/heads/ioreq11

v7:
 - Fixed assertion failure hit during domain destroy.

v6:
 - Responded to missed comments from Roger.

v5:
 - Responded to review comments from Wei.

v4:
 - Responded to further review comments from Roger.

v3:
 - Dropped original patch #1 since it is covered by Juergen's patch.
 - Added new xenforeignmemory cleanup patch (#4).
 - Replaced the patch introducing the ioreq server 'is_default' flag with one
   that changes the ioreq server list into an array (#8).

Paul Durrant (12):
  x86/mm: allow a privileged PV domain to map guest mfns
  x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  tools/libxenforeignmemory: add support for resource mapping
  tools/libxenforeignmemory: reduce xenforeignmemory_restrict code
    footprint
  tools/libxenctrl: use new xenforeignmemory API to seed grant table
  x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn
  x86/hvm/ioreq: use bool rather than bool_t
  x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  x86/hvm/ioreq: simplify code and use consistent naming
  x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  x86/hvm/ioreq: defer mapping gfns until they are actually requested
  x86/hvm/ioreq: add a new mappable resource type...

 tools/include/xen-sys/Linux/privcmd.h              |  11 +
 tools/libs/devicemodel/core.c                      |  18 +-
 tools/libs/devicemodel/include/xendevicemodel.h    |  14 +-
 tools/libs/foreignmemory/Makefile                  |   2 +-
 tools/libs/foreignmemory/core.c                    |  53 ++
 tools/libs/foreignmemory/freebsd.c                 |   7 -
 .../libs/foreignmemory/include/xenforeignmemory.h  |  41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |   5 +
 tools/libs/foreignmemory/linux.c                   |  45 ++
 tools/libs/foreignmemory/minios.c                  |   7 -
 tools/libs/foreignmemory/netbsd.c                  |   7 -
 tools/libs/foreignmemory/private.h                 |  43 +-
 tools/libs/foreignmemory/solaris.c                 |   7 -
 tools/libxc/include/xc_dom.h                       |   8 +-
 tools/libxc/xc_dom_boot.c                          | 114 ++-
 tools/libxc/xc_sr_restore_x86_hvm.c                |  10 +-
 tools/libxc/xc_sr_restore_x86_pv.c                 |   2 +-
 tools/libxl/libxl_dom.c                            |   1 -
 tools/python/xen/lowlevel/xc/xc.c                  |   6 +-
 xen/arch/x86/hvm/dm.c                              |  11 +-
 xen/arch/x86/hvm/hvm.c                             |   8 +-
 xen/arch/x86/hvm/io.c                              |   4 +-
 xen/arch/x86/hvm/ioreq.c                           | 869 +++++++++++----------
 xen/arch/x86/mm.c                                  | 151 +++-
 xen/arch/x86/mm/p2m.c                              |   3 +-
 xen/common/grant_table.c                           |  56 +-
 xen/include/asm-x86/hvm/domain.h                   |  21 +-
 xen/include/asm-x86/hvm/ioreq.h                    |  24 +-
 xen/include/asm-x86/p2m.h                          |   3 +
 xen/include/public/hvm/dm_op.h                     |  46 +-
 xen/include/public/memory.h                        |  41 +-
 xen/include/xen/grant_table.h                      |   1 +
 32 files changed, 1081 insertions(+), 558 deletions(-)

---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-19 12:51   ` Paul Durrant
  2017-09-25 13:02   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources Paul Durrant
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

When a PV domain is mapping guest resources it needs to make the
HYPERVISOR_mmu_update call using DOMID_SELF, rather than the guest
domid, so that the passed-in gmfn values are correctly treated as mfns
rather than gfns present in the guest p2m.
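
For illustration only, a minimal kernel-side sketch of the kind of call
this enables (pte_maddr, the machine address of the mapping domain's PTE,
and resource_mfn are hypothetical variables; the constants come from the
public Xen headers):

    struct mmu_update u = {
        /* Low bits of .ptr carry the command; MMU_NORMAL_PT_UPDATE is 0. */
        .ptr = pte_maddr | MMU_NORMAL_PT_UPDATE,
        .val = (resource_mfn << PAGE_SHIFT) | _PAGE_PRESENT | _PAGE_RW,
    };
    unsigned int done = 0;
    int rc;

    /*
     * DOMID_SELF: the frame in .val is treated as an mfn rather than
     * being translated through a guest p2m, which is what mapping
     * another guest's resources (e.g. grant table frames) requires.
     */
    rc = HYPERVISOR_mmu_update(&u, 1, &done, DOMID_SELF);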

This patch removes a check which currently disallows mapping of a page when
the owner of the page tables matches the domain passed to
HYPERVISOR_mmu_update, but that domain is not the real owner of the page.
The check was introduced by patch d3c6a215ca9 ("x86: Clean up
get_page_from_l1e() to correctly distinguish between owner-of-pte and
owner-of-data-page in all cases") but it's not clear why it was needed.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/mm.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 2e5b15e7a2..cb0189efae 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1024,12 +1024,15 @@ get_page_from_l1e(
                    (real_pg_owner != dom_cow) ) )
     {
         /*
-         * Let privileged domains transfer the right to map their target
-         * domain's pages. This is used to allow stub-domain pvfb export to
-         * dom0, until pvfb supports granted mappings. At that time this
-         * minor hack can go away.
+         * If the real page owner is not the domain specified in the
+         * hypercall then establish that the specified domain has
+         * mapping privilege over the page owner.
+         * This is used to allow stub-domain pvfb export to dom0. It is
+         * also used to allow a privileged PV domain to map mfns using
+         * DOMID_SELF, which is needed for mapping guest resources such as
+         * grant table frames.
          */
-        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
+        if ( (real_pg_owner == NULL) ||
              xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
         {
             gdprintk(XENLOG_WARNING,
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 13:49   ` Jan Beulich
  2017-09-25 14:23   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping Paul Durrant
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

Certain memory resources associated with a guest are not necessarily
present in the guest P2M and so are not necessarily available to be
foreign-mapped by a tools domain unless they are inserted, which risks
shattering a super-page mapping.

This patch adds a new memory op to allow such a resource to be priv-mapped
directly, by either a PV or HVM tools domain: grant table frames.
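
For illustration, a hedged sketch of how a PV tools-domain kernel might
invoke the new op (variable names are illustrative; the structure fields
are those added to xen/include/public/memory.h by this patch):

    xen_pfn_t mfns[4];
    struct xen_mem_acquire_resource xmar = {
        .domid = guest_domid,                 /* guest owning the resource   */
        .type = XENMEM_resource_grant_table,  /* currently the only type     */
        .id = 0,                              /* must be zero for this type  */
        .frame = 0,                           /* first frame of the resource */
        .nr_frames = ARRAY_SIZE(mfns),
    };
    int rc;

    set_xen_guest_handle(xmar.gmfn_list, mfns);
    rc = HYPERVISOR_memory_op(XENMEM_acquire_resource, &xmar);
    /*
     * For a PV caller mfns[] now holds the mfns backing grant table
     * frames 0-3, ready to be mapped via HYPERVISOR_mmu_update with
     * DOMID_SELF (see patch #1). An HVM caller would instead pre-fill
     * mfns[] with the gfns at which the frames should appear in its p2m.
     */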

NOTE: Whilst the new op is not intrinsically specific to the x86 architecture,
      I have no means to test it on an ARM platform and so cannot verify
      that it functions correctly. Hence it is currently only implemented
      for x86.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

v5:
 - Switched __copy_to/from_guest_offset() to copy_to/from_guest_offset().
---
 xen/arch/x86/mm.c             | 111 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/p2m.c         |   3 +-
 xen/common/grant_table.c      |  56 ++++++++++++++-------
 xen/include/asm-x86/p2m.h     |   3 ++
 xen/include/public/memory.h   |  38 ++++++++++++++-
 xen/include/xen/grant_table.h |   1 +
 6 files changed, 191 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index cb0189efae..c8f50f3bb0 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4768,6 +4768,107 @@ int xenmem_add_to_physmap_one(
     return rc;
 }
 
+static int xenmem_acquire_grant_table(struct domain *d,
+                                      unsigned long frame,
+                                      unsigned long nr_frames,
+                                      unsigned long mfn_list[])
+{
+    unsigned int i;
+
+    /*
+     * Iterate through the list backwards so that gnttab_get_frame() is
+     * first called for the highest numbered frame. This means that the
+     * out-of-bounds check will be done on the first iteration and, if
+     * the table needs to grow, it will only grow once.
+     */
+    i = nr_frames;
+    while ( i-- != 0 )
+    {
+        mfn_t mfn = gnttab_get_frame(d, frame + i);
+
+        if ( mfn_eq(mfn, INVALID_MFN) )
+            return -EINVAL;
+
+        mfn_list[i] = mfn_x(mfn);
+    }
+
+    return 0;
+}
+
+static int xenmem_acquire_resource(xen_mem_acquire_resource_t *xmar)
+{
+    struct domain *d, *currd = current->domain;
+    unsigned long *mfn_list;
+    int rc;
+
+    if ( xmar->nr_frames == 0 )
+        return -EINVAL;
+
+    d = rcu_lock_domain_by_any_id(xmar->domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    rc = xsm_domain_memory_map(XSM_TARGET, d);
+    if ( rc )
+        goto out;
+
+    mfn_list = xmalloc_array(unsigned long, xmar->nr_frames);
+
+    rc = -ENOMEM;
+    if ( !mfn_list )
+        goto out;
+
+    switch ( xmar->type )
+    {
+    case XENMEM_resource_grant_table:
+        rc = -EINVAL;
+        if ( xmar->id ) /* must be zero for grant_table */
+            break;
+
+        rc = xenmem_acquire_grant_table(d, xmar->frame, xmar->nr_frames,
+                                        mfn_list);
+        break;
+
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    if ( rc )
+        goto free_and_out;
+
+    if ( !paging_mode_translate(currd) )
+    {
+        if ( copy_to_guest_offset(xmar->gmfn_list, 0, mfn_list,
+                                  xmar->nr_frames) )
+            rc = -EFAULT;
+    }
+    else
+    {
+        unsigned int i;
+
+        for ( i = 0; i < xmar->nr_frames; i++ )
+        {
+            xen_pfn_t gfn;
+
+            rc = -EFAULT;
+            if ( copy_from_guest_offset(&gfn, xmar->gmfn_list, i, 1) )
+                goto free_and_out;
+
+            rc = set_foreign_p2m_entry(currd, gfn, _mfn(mfn_list[i]));
+            if ( rc )
+                goto free_and_out;
+        }
+    }
+
+ free_and_out:
+    xfree(mfn_list);
+
+ out:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
 long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     int rc;
@@ -4990,6 +5091,16 @@ long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         return rc;
     }
 
+    case XENMEM_acquire_resource:
+    {
+        xen_mem_acquire_resource_t xmar;
+
+        if ( copy_from_guest(&xmar, arg, 1) )
+            return -EFAULT;
+
+        return xenmem_acquire_resource(&xmar);
+    }
+
     default:
         return subarch_memory_op(cmd, arg);
     }
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 0b479105b9..d0f8fc249b 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1121,8 +1121,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-static int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
-                                 mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
     return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
                                p2m_get_hostp2m(d)->default_access);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 9a4d335ee0..dfd00a9432 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3607,38 +3607,58 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
 }
 #endif
 
-int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
-                     mfn_t *mfn)
-{
-    int rc = 0;
 
-    grant_write_lock(d->grant_table);
+static mfn_t gnttab_get_frame_locked(struct domain *d, unsigned long idx)
+{
+    struct grant_table *gt = d->grant_table;
+    mfn_t mfn = INVALID_MFN;
 
-    if ( d->grant_table->gt_version == 0 )
-        d->grant_table->gt_version = 1;
+    if ( gt->gt_version == 0 )
+        gt->gt_version = 1;
 
-    if ( d->grant_table->gt_version == 2 &&
+    if ( gt->gt_version == 2 &&
          (idx & XENMAPIDX_grant_table_status) )
     {
         idx &= ~XENMAPIDX_grant_table_status;
-        if ( idx < nr_status_frames(d->grant_table) )
-            *mfn = _mfn(virt_to_mfn(d->grant_table->status[idx]));
-        else
-            rc = -EINVAL;
+        if ( idx < nr_status_frames(gt) )
+            mfn = _mfn(virt_to_mfn(gt->status[idx]));
     }
     else
     {
-        if ( (idx >= nr_grant_frames(d->grant_table)) &&
+        if ( (idx >= nr_grant_frames(gt)) &&
              (idx < max_grant_frames) )
             gnttab_grow_table(d, idx + 1);
 
-        if ( idx < nr_grant_frames(d->grant_table) )
-            *mfn = _mfn(virt_to_mfn(d->grant_table->shared_raw[idx]));
-        else
-            rc = -EINVAL;
+        if ( idx < nr_grant_frames(gt) )
+            mfn = _mfn(virt_to_mfn(gt->shared_raw[idx]));
     }
 
-    gnttab_set_frame_gfn(d, idx, gfn);
+    return mfn;
+}
+
+mfn_t gnttab_get_frame(struct domain *d, unsigned long idx)
+{
+    mfn_t mfn;
+
+    grant_write_lock(d->grant_table);
+    mfn = gnttab_get_frame_locked(d, idx);
+    grant_write_unlock(d->grant_table);
+
+    return mfn;
+}
+
+int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
+                     mfn_t *mfn)
+{
+    int rc = 0;
+
+    grant_write_lock(d->grant_table);
+
+    *mfn = gnttab_get_frame_locked(d, idx);
+    if ( mfn_eq(*mfn, INVALID_MFN) )
+        rc = -EINVAL;
+    else
+        gnttab_set_frame_gfn(d, idx, gfn);
 
     grant_write_unlock(d->grant_table);
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 10cdfc09a9..4eff0458bc 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -613,6 +613,9 @@ void p2m_memory_type_changed(struct domain *d);
 int p2m_is_logdirty_range(struct p2m_domain *, unsigned long start,
                           unsigned long end);
 
+/* Set foreign entry in the p2m table (for priv-mapping) */
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
+
 /* Set mmio addresses in the p2m table (for pass-through) */
 int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
                        unsigned int order, p2m_access_t access);
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 29386df98b..9bf58e7384 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -650,7 +650,43 @@ struct xen_vnuma_topology_info {
 typedef struct xen_vnuma_topology_info xen_vnuma_topology_info_t;
 DEFINE_XEN_GUEST_HANDLE(xen_vnuma_topology_info_t);
 
-/* Next available subop number is 28 */
+#if defined(__XEN__) || defined(__XEN_TOOLS__)
+
+/*
+ * Get the pages for a particular guest resource, so that they can be
+ * mapped directly by a tools domain.
+ */
+#define XENMEM_acquire_resource 28
+struct xen_mem_acquire_resource {
+    /* IN - the domain whose resource is to be mapped */
+    domid_t domid;
+    /* IN - the type of resource (defined below) */
+    uint16_t type;
+
+#define XENMEM_resource_grant_table 0
+
+    /*
+     * IN - a type-specific resource identifier, which must be zero
+     *      unless stated otherwise.
+     */
+    uint32_t id;
+    /* IN - number of (4K) frames of the resource to be mapped */
+    uint32_t nr_frames;
+    /* IN - the index of the initial frame to be mapped */
+    uint64_aligned_t frame;
+    /* IN/OUT - If the tools domain is PV then, upon return, gmfn_list
+     *          will be populated with the MFNs of the resource.
+     *          If the tools domain is HVM then it is expected that, on
+     *          entry, gmfn_list will be populated with a list of GFNs
+     *          that will be mapped to the MFNs of the resource.
+     */
+    XEN_GUEST_HANDLE(xen_pfn_t) gmfn_list;
+};
+typedef struct xen_mem_acquire_resource xen_mem_acquire_resource_t;
+
+#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
+
+/* Next available subop number is 29 */
 
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
 
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 43ec6c4d80..f9e89375bb 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -136,6 +136,7 @@ static inline unsigned int grant_to_status_frames(int grant_frames)
 int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
                             gfn_t *gfn, uint16_t *status);
 
+mfn_t gnttab_get_frame(struct domain *d, unsigned long idx);
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
                      mfn_t *mfn);
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-18 16:16   ` Ian Jackson
  2017-09-18 15:31 ` [PATCH v7 04/12] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint Paul Durrant
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Ian Jackson

A previous patch introduced a new HYPERVISOR_memory_op to acquire guest
resources for direct priv-mapping.

This patch adds new functionality into libxenforeignmemory to make use
of a new privcmd ioctl [1] that uses the new memory op to make such
resources available via mmap(2).

[1] http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce59a05e6712
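
A hedged usage sketch of the new library calls (error handling elided and
'guest_domid' illustrative); it maps the first grant table frame of a
guest and releases it again:

    xenforeignmemory_handle *fmem = xenforeignmemory_open(NULL, 0);
    void *addr = NULL; /* no placement hint */
    xenforeignmemory_resource_handle *fres;

    fres = xenforeignmemory_map_resource(fmem, guest_domid,
                                         XENMEM_resource_grant_table,
                                         0 /* id */, 0 /* frame */,
                                         1 /* nr_frames */, &addr,
                                         PROT_READ | PROT_WRITE, 0);
    if ( fres )
    {
        grant_entry_v1_t *gnttab = addr;

        /* ... read or seed entries directly via gnttab ... */

        xenforeignmemory_unmap_resource(fmem, fres);
    }

    xenforeignmemory_close(fmem);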

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>

v4:
 - Fixed errno and removed single-use label
 - The unmap call now returns a status
 - Use C99 initialization for ioctl struct

v2:
 - Bump minor version up to 3.
---
 tools/include/xen-sys/Linux/privcmd.h              | 11 +++++
 tools/libs/foreignmemory/Makefile                  |  2 +-
 tools/libs/foreignmemory/core.c                    | 53 ++++++++++++++++++++++
 .../libs/foreignmemory/include/xenforeignmemory.h  | 41 +++++++++++++++++
 tools/libs/foreignmemory/libxenforeignmemory.map   |  5 ++
 tools/libs/foreignmemory/linux.c                   | 45 ++++++++++++++++++
 tools/libs/foreignmemory/private.h                 | 31 +++++++++++++
 7 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/privcmd.h b/tools/include/xen-sys/Linux/privcmd.h
index 732ff7c15a..9531b728f9 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -86,6 +86,15 @@ typedef struct privcmd_dm_op {
 	const privcmd_dm_op_buf_t __user *ubufs;
 } privcmd_dm_op_t;
 
+typedef struct privcmd_mmap_resource {
+	domid_t dom;
+	__u32 type;
+	__u32 id;
+	__u32 idx;
+	__u64 num;
+	__u64 addr;
+} privcmd_mmap_resource_t;
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -103,5 +112,7 @@ typedef struct privcmd_dm_op {
 	_IOC(_IOC_NONE, 'P', 5, sizeof(privcmd_dm_op_t))
 #define IOCTL_PRIVCMD_RESTRICT					\
 	_IOC(_IOC_NONE, 'P', 6, sizeof(domid_t))
+#define IOCTL_PRIVCMD_MMAP_RESOURCE				\
+	_IOC(_IOC_NONE, 'P', 7, sizeof(privcmd_mmap_resource_t))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/tools/libs/foreignmemory/Makefile b/tools/libs/foreignmemory/Makefile
index ab7f873f26..5c7f78f61d 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR    = 1
-MINOR    = 2
+MINOR    = 3
 SHLIB_LDFLAGS += -Wl,--version-script=libxenforeignmemory.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index a6897dc561..8d3f9f178f 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -17,6 +17,8 @@
 #include <assert.h>
 #include <errno.h>
 
+#include <sys/mman.h>
+
 #include "private.h"
 
 xenforeignmemory_handle *xenforeignmemory_open(xentoollog_logger *logger,
@@ -120,6 +122,57 @@ int xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
     return osdep_xenforeignmemory_restrict(fmem, domid);
 }
 
+xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long frame, unsigned long nr_frames,
+    void **paddr, int prot, int flags)
+{
+    xenforeignmemory_resource_handle *fres;
+    int rc;
+
+    /* Check flags only contains POSIX defined values */
+    if ( flags & ~(MAP_SHARED | MAP_PRIVATE) )
+    {
+        errno = EINVAL;
+        return NULL;
+    }
+
+    fres = calloc(1, sizeof(*fres));
+    if ( !fres )
+    {
+        errno = ENOMEM;
+        return NULL;
+    }
+
+    fres->domid = domid;
+    fres->type = type;
+    fres->id = id;
+    fres->frame = frame;
+    fres->nr_frames = nr_frames;
+    fres->addr = *paddr;
+    fres->prot = prot;
+    fres->flags = flags;
+
+    rc = osdep_xenforeignmemory_map_resource(fmem, fres);
+    if ( rc )
+    {
+        free(fres);
+        fres = NULL;
+    } else
+        *paddr = fres->addr;
+
+    return fres;
+}
+
+int xenforeignmemory_unmap_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+    int rc = osdep_xenforeignmemory_unmap_resource(fmem, fres);
+
+    free(fres);
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/include/xenforeignmemory.h b/tools/libs/foreignmemory/include/xenforeignmemory.h
index f4814c390f..d594be8df0 100644
--- a/tools/libs/foreignmemory/include/xenforeignmemory.h
+++ b/tools/libs/foreignmemory/include/xenforeignmemory.h
@@ -138,6 +138,47 @@ int xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
 int xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
                               domid_t domid);
 
+typedef struct xenforeignmemory_resource_handle xenforeignmemory_resource_handle;
+
+/**
+ * This function maps a guest resource.
+ *
+ * @parm fmem handle to the open foreignmemory interface
+ * @parm domid the domain id
+ * @parm type the resource type
+ * @parm id the type-specific resource identifier
+ * @parm frame base frame index within the resource
+ * @parm nr_frames number of frames to map
+ * @parm paddr pointer to an address passed through to mmap(2)
+ * @parm prot passed through to mmap(2)
+ * @parm flags POSIX-only flags passed through to mmap(2)
+ * @return pointer to foreignmemory resource handle on success, NULL on
+ *         failure
+ *
+ * *paddr is used, on entry, as a hint address for foreign map placement
+ * (see mmap(2)) so should be set to NULL if no specific placement is
+ * required. On return *paddr contains the address where the resource is
+ * mapped.
+ * As for xenforeignmemory_map2() flags is a set of additional flags
+ * for mmap(2). Not all of the flag combinations are possible due to
+ * implementation details on different platforms.
+ */
+xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+    xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+    unsigned int id, unsigned long frame, unsigned long nr_frames,
+    void **paddr, int prot, int flags);
+
+/**
+ * This function releases a previously acquired resource.
+ *
+ * @parm fmem handle to the open foreignmemory interface
+ * @parm fres handle to the acquired resource
+ *
+ * Returns 0 on success; on failure sets errno and returns -1.
+ */
+int xenforeignmemory_unmap_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
+
 #endif
 
 /*
diff --git a/tools/libs/foreignmemory/libxenforeignmemory.map b/tools/libs/foreignmemory/libxenforeignmemory.map
index 716ecaf15c..d5323c87d9 100644
--- a/tools/libs/foreignmemory/libxenforeignmemory.map
+++ b/tools/libs/foreignmemory/libxenforeignmemory.map
@@ -14,3 +14,8 @@ VERS_1.2 {
 	global:
 		xenforeignmemory_map2;
 } VERS_1.1;
+VERS_1.3 {
+	global:
+		xenforeignmemory_map_resource;
+		xenforeignmemory_unmap_resource;
+} VERS_1.2;
diff --git a/tools/libs/foreignmemory/linux.c b/tools/libs/foreignmemory/linux.c
index 374e45aed5..a6b41b0b7f 100644
--- a/tools/libs/foreignmemory/linux.c
+++ b/tools/libs/foreignmemory/linux.c
@@ -277,6 +277,51 @@ int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
     return ioctl(fmem->fd, IOCTL_PRIVCMD_RESTRICT, &domid);
 }
 
+int osdep_xenforeignmemory_unmap_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+    return munmap(fres->addr, fres->nr_frames << PAGE_SHIFT);
+}
+
+int osdep_xenforeignmemory_map_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+    privcmd_mmap_resource_t mr = {
+        .dom = fres->domid,
+        .type = fres->type,
+        .id = fres->id,
+        .idx = fres->frame,
+        .num = fres->nr_frames,
+    };
+    int rc;
+
+    fres->addr = mmap(fres->addr, fres->nr_frames << PAGE_SHIFT,
+                      fres->prot, fres->flags | MAP_SHARED, fmem->fd, 0);
+    if ( fres->addr == MAP_FAILED )
+        return -1;
+
+    mr.addr = (uintptr_t)fres->addr;
+
+    rc = ioctl(fmem->fd, IOCTL_PRIVCMD_MMAP_RESOURCE, &mr);
+    if ( rc )
+    {
+        int saved_errno;
+
+        if ( errno != ENOTTY )
+            PERROR("ioctl failed");
+        else
+            errno = EOPNOTSUPP;
+
+        saved_errno = errno;
+        (void)osdep_xenforeignmemory_unmap_resource(fmem, fres);
+        errno = saved_errno;
+
+        return -1;
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/private.h b/tools/libs/foreignmemory/private.h
index c5c07cc4c4..80b22bdbfc 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -42,6 +42,37 @@ void *compat_mapforeign_batch(xenforeignmem_handle *fmem, uint32_t dom,
                               xen_pfn_t *arr, int num);
 #endif
 
+struct xenforeignmemory_resource_handle {
+    domid_t domid;
+    unsigned int type;
+    unsigned int id;
+    unsigned long frame;
+    unsigned long nr_frames;
+    void *addr;
+    int prot;
+    int flags;
+};
+
+#ifndef __linux__
+static inline int osdep_xenforeignmemory_map_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+    errno = EOPNOTSUPP;
+    return -1;
+}
+
+static inline int osdep_xenforeignmemory_unmap_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+    return 0;
+}
+#else
+int osdep_xenforeignmemory_map_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
+int osdep_xenforeignmemory_unmap_resource(
+    xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
+#endif
+
 #define PERROR(_f...) \
     xtl_log(fmem->logger, XTL_ERROR, errno, "xenforeignmemory", _f)
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 04/12] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (2 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 05/12] tools/libxenctrl: use new xenforeignmemory API to seed grant table Paul Durrant
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Ian Jackson

By using static inline stubs in private.h for OSes where this functionality
is not implemented, the various duplicate stubs in the OS-specific source
modules can be avoided.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>

v4:
 - Removed extraneous freebsd code.

v3:
 - Patch added in response to review comments.
---
 tools/libs/foreignmemory/freebsd.c |  7 -------
 tools/libs/foreignmemory/minios.c  |  7 -------
 tools/libs/foreignmemory/netbsd.c  |  7 -------
 tools/libs/foreignmemory/private.h | 12 +++++++++---
 tools/libs/foreignmemory/solaris.c |  7 -------
 5 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/tools/libs/foreignmemory/freebsd.c b/tools/libs/foreignmemory/freebsd.c
index dec447485a..6e6bc4b11f 100644
--- a/tools/libs/foreignmemory/freebsd.c
+++ b/tools/libs/foreignmemory/freebsd.c
@@ -95,13 +95,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
     return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-                                    domid_t domid)
-{
-    errno = -EOPNOTSUPP;
-    return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/minios.c b/tools/libs/foreignmemory/minios.c
index 75f340122e..43341ca301 100644
--- a/tools/libs/foreignmemory/minios.c
+++ b/tools/libs/foreignmemory/minios.c
@@ -58,13 +58,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
     return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-                                    domid_t domid)
-{
-    errno = -EOPNOTSUPP;
-    return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/netbsd.c b/tools/libs/foreignmemory/netbsd.c
index 9bf95ef4f0..54a418ebd6 100644
--- a/tools/libs/foreignmemory/netbsd.c
+++ b/tools/libs/foreignmemory/netbsd.c
@@ -100,13 +100,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
     return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-                                    domid_t domid)
-{
-    errno = -EOPNOTSUPP;
-    return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/private.h b/tools/libs/foreignmemory/private.h
index 80b22bdbfc..b5d5f0a354 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -32,9 +32,6 @@ void *osdep_xenforeignmemory_map(xenforeignmemory_handle *fmem,
 int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
                                  void *addr, size_t num);
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-                                    domid_t domid);
-
 #if defined(__NetBSD__) || defined(__sun__)
 /* Strictly compat for those two only only */
 void *compat_mapforeign_batch(xenforeignmem_handle *fmem, uint32_t dom,
@@ -54,6 +51,13 @@ struct xenforeignmemory_resource_handle {
 };
 
 #ifndef __linux__
+static inline int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
+                                                  domid_t domid)
+{
+    errno = EOPNOTSUPP;
+    return -1;
+}
+
 static inline int osdep_xenforeignmemory_map_resource(
     xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
 {
@@ -67,6 +71,8 @@ static inline int osdep_xenforeignmemory_unmap_resource(
     return 0;
 }
 #else
+int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
+                                    domid_t domid);
 int osdep_xenforeignmemory_map_resource(
     xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 int osdep_xenforeignmemory_unmap_resource(
diff --git a/tools/libs/foreignmemory/solaris.c b/tools/libs/foreignmemory/solaris.c
index a33decb4ae..ee8aae4fbd 100644
--- a/tools/libs/foreignmemory/solaris.c
+++ b/tools/libs/foreignmemory/solaris.c
@@ -97,13 +97,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
     return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-                                    domid_t domid)
-{
-    errno = -EOPNOTSUPP;
-    return -1;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 05/12] tools/libxenctrl: use new xenforeignmemory API to seed grant table
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (3 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 04/12] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn Paul Durrant
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Ian Jackson

A previous patch added support for priv-mapping guest resources directly
(rather than having to foreign-map, which requires P2M modification for
HVM guests).

This patch makes use of the new API to seed the guest grant table unless
the underlying infrastructure (i.e. privcmd) doesn't support it, in which
case the old scheme is used.

NOTE: The call to xc_dom_gnttab_hvm_seed() in hvm_build_set_params() was
      actually unnecessary, as the grant table has already been seeded
      by a prior call to xc_dom_gnttab_init() made by libxl__build_dom().

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>

v4:
 - Minor cosmetic fix suggested by Roger.

v3:
 - Introduced xc_dom_set_gnttab_entry() to avoid duplicated code.
---
 tools/libxc/include/xc_dom.h        |   8 +--
 tools/libxc/xc_dom_boot.c           | 114 +++++++++++++++++++++++++-----------
 tools/libxc/xc_sr_restore_x86_hvm.c |  10 ++--
 tools/libxc/xc_sr_restore_x86_pv.c  |   2 +-
 tools/libxl/libxl_dom.c             |   1 -
 tools/python/xen/lowlevel/xc/xc.c   |   6 +-
 6 files changed, 92 insertions(+), 49 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index ce47058c41..d6ca0a8680 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -323,12 +323,8 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, xen_pfn_t pfn,
 int xc_dom_boot_image(struct xc_dom_image *dom);
 int xc_dom_compat_check(struct xc_dom_image *dom);
 int xc_dom_gnttab_init(struct xc_dom_image *dom);
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
-                           xen_pfn_t console_gmfn,
-                           xen_pfn_t xenstore_gmfn,
-                           domid_t console_domid,
-                           domid_t xenstore_domid);
-int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
+int xc_dom_gnttab_seed(xc_interface *xch, domid_t guest_domid,
+                       bool is_hvm,
                        xen_pfn_t console_gmfn,
                        xen_pfn_t xenstore_gmfn,
                        domid_t console_domid,
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index c3b44dd399..dc0a1fdee8 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -280,11 +280,29 @@ static xen_pfn_t xc_dom_gnttab_setup(xc_interface *xch, domid_t domid)
     return gmfn;
 }
 
-int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
-                       xen_pfn_t console_gmfn,
-                       xen_pfn_t xenstore_gmfn,
-                       domid_t console_domid,
-                       domid_t xenstore_domid)
+static void xc_dom_set_gnttab_entry(xc_interface *xch,
+                                    grant_entry_v1_t *gnttab,
+                                    unsigned int idx,
+                                    domid_t guest_domid,
+                                    domid_t backend_domid,
+                                    xen_pfn_t backend_gmfn)
+{
+    if ( guest_domid == backend_domid || backend_gmfn == -1)
+        return;
+
+    xc_dom_printf(xch, "%s: [%u] -> 0x%"PRI_xen_pfn,
+                  __FUNCTION__, idx, backend_gmfn);
+
+    gnttab[idx].flags = GTF_permit_access;
+    gnttab[idx].domid = backend_domid;
+    gnttab[idx].frame = backend_gmfn;
+}
+
+static int compat_gnttab_seed(xc_interface *xch, domid_t domid,
+                              xen_pfn_t console_gmfn,
+                              xen_pfn_t xenstore_gmfn,
+                              domid_t console_domid,
+                              domid_t xenstore_domid)
 {
 
     xen_pfn_t gnttab_gmfn;
@@ -308,18 +326,10 @@ int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
         return -1;
     }
 
-    if ( domid != console_domid  && console_gmfn != -1)
-    {
-        gnttab[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;
-        gnttab[GNTTAB_RESERVED_CONSOLE].domid = console_domid;
-        gnttab[GNTTAB_RESERVED_CONSOLE].frame = console_gmfn;
-    }
-    if ( domid != xenstore_domid && xenstore_gmfn != -1)
-    {
-        gnttab[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
-        gnttab[GNTTAB_RESERVED_XENSTORE].domid = xenstore_domid;
-        gnttab[GNTTAB_RESERVED_XENSTORE].frame = xenstore_gmfn;
-    }
+    xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_CONSOLE,
+                            domid, console_domid, console_gmfn);
+    xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_XENSTORE,
+                            domid, xenstore_domid, xenstore_gmfn);
 
     if ( munmap(gnttab, PAGE_SIZE) == -1 )
     {
@@ -337,11 +347,11 @@ int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
     return 0;
 }
 
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
-                           xen_pfn_t console_gpfn,
-                           xen_pfn_t xenstore_gpfn,
-                           domid_t console_domid,
-                           domid_t xenstore_domid)
+static int compat_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
+                                  xen_pfn_t console_gpfn,
+                                  xen_pfn_t xenstore_gpfn,
+                                  domid_t console_domid,
+                                  domid_t xenstore_domid)
 {
     int rc;
     xen_pfn_t scratch_gpfn;
@@ -380,7 +390,7 @@ int xc_dom_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
         return -1;
     }
 
-    rc = xc_dom_gnttab_seed(xch, domid,
+    rc = compat_gnttab_seed(xch, domid,
                             console_gpfn, xenstore_gpfn,
                             console_domid, xenstore_domid);
     if (rc != 0)
@@ -405,18 +415,56 @@ int xc_dom_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
     return 0;
 }
 
-int xc_dom_gnttab_init(struct xc_dom_image *dom)
+int xc_dom_gnttab_seed(xc_interface *xch, domid_t guest_domid,
+                       bool is_hvm, xen_pfn_t console_gmfn,
+                       xen_pfn_t xenstore_gmfn, domid_t console_domid,
+                       domid_t xenstore_domid)
 {
-    if ( xc_dom_translated(dom) ) {
-        return xc_dom_gnttab_hvm_seed(dom->xch, dom->guest_domid,
-                                      dom->console_pfn, dom->xenstore_pfn,
-                                      dom->console_domid, dom->xenstore_domid);
-    } else {
-        return xc_dom_gnttab_seed(dom->xch, dom->guest_domid,
-                                  xc_dom_p2m(dom, dom->console_pfn),
-                                  xc_dom_p2m(dom, dom->xenstore_pfn),
-                                  dom->console_domid, dom->xenstore_domid);
+    xenforeignmemory_handle* fmem = xch->fmem;
+    xenforeignmemory_resource_handle *fres;
+    void *addr = NULL;
+
+    fres = xenforeignmemory_map_resource(fmem, guest_domid,
+                                         XENMEM_resource_grant_table,
+                                         0, 0, 1,
+                                         &addr, PROT_READ | PROT_WRITE, 0);
+    if ( !fres )
+    {
+        if ( errno == EOPNOTSUPP )
+            return is_hvm ?
+                compat_gnttab_hvm_seed(xch, guest_domid,
+                                       console_gmfn, xenstore_gmfn,
+                                       console_domid, xenstore_domid) :
+                compat_gnttab_seed(xch, guest_domid,
+                                   console_gmfn, xenstore_gmfn,
+                                   console_domid, xenstore_domid);
+
+        xc_dom_panic(xch, XC_INTERNAL_ERROR,
+                     "%s: failed to acquire grant table "
+                     "[errno=%d]\n",
+                     __FUNCTION__, errno);
+        return -1;
     }
+
+    xc_dom_set_gnttab_entry(xch, addr, GNTTAB_RESERVED_CONSOLE,
+                            guest_domid, console_domid, console_gmfn);
+    xc_dom_set_gnttab_entry(xch, addr, GNTTAB_RESERVED_XENSTORE,
+                            guest_domid, xenstore_domid, xenstore_gmfn);
+
+    xenforeignmemory_unmap_resource(fmem, fres);
+
+    return 0;
+}
+
+int xc_dom_gnttab_init(struct xc_dom_image *dom)
+{
+    bool is_hvm = xc_dom_translated(dom);
+    xen_pfn_t console_gmfn = xc_dom_p2m(dom, dom->console_pfn);
+    xen_pfn_t xenstore_gmfn = xc_dom_p2m(dom, dom->xenstore_pfn);
+
+    return xc_dom_gnttab_seed(dom->xch, dom->guest_domid, is_hvm,
+                              console_gmfn, xenstore_gmfn,
+                              dom->console_domid, dom->xenstore_domid);
 }
 
 /*
diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c b/tools/libxc/xc_sr_restore_x86_hvm.c
index 1dca85354a..a5c661da8f 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -207,11 +207,11 @@ static int x86_hvm_stream_complete(struct xc_sr_context *ctx)
         return rc;
     }
 
-    rc = xc_dom_gnttab_hvm_seed(xch, ctx->domid,
-                                ctx->restore.console_gfn,
-                                ctx->restore.xenstore_gfn,
-                                ctx->restore.console_domid,
-                                ctx->restore.xenstore_domid);
+    rc = xc_dom_gnttab_seed(xch, ctx->domid, true,
+                            ctx->restore.console_gfn,
+                            ctx->restore.xenstore_gfn,
+                            ctx->restore.console_domid,
+                            ctx->restore.xenstore_domid);
     if ( rc )
     {
         PERROR("Failed to seed grant table");
diff --git a/tools/libxc/xc_sr_restore_x86_pv.c b/tools/libxc/xc_sr_restore_x86_pv.c
index 50e25c162c..10635d436b 100644
--- a/tools/libxc/xc_sr_restore_x86_pv.c
+++ b/tools/libxc/xc_sr_restore_x86_pv.c
@@ -1104,7 +1104,7 @@ static int x86_pv_stream_complete(struct xc_sr_context *ctx)
     if ( rc )
         return rc;
 
-    rc = xc_dom_gnttab_seed(xch, ctx->domid,
+    rc = xc_dom_gnttab_seed(xch, ctx->domid, false,
                             ctx->restore.console_gfn,
                             ctx->restore.xenstore_gfn,
                             ctx->restore.console_domid,
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index f54fd49a73..0d3e462c12 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -851,7 +851,6 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
     *store_mfn = str_mfn;
     *console_mfn = cons_mfn;
 
-    xc_dom_gnttab_hvm_seed(handle, domid, *console_mfn, *store_mfn, console_domid, store_domid);
     return 0;
 }
 
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index aa9f8e4d9e..583ab52a6f 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -800,9 +800,9 @@ static PyObject *pyxc_gnttab_hvm_seed(XcObject *self,
 				      &console_domid, &xenstore_domid) )
         return NULL;
 
-    if ( xc_dom_gnttab_hvm_seed(self->xc_handle, dom,
-				console_gmfn, xenstore_gmfn,
-				console_domid, xenstore_domid) != 0 )
+    if ( xc_dom_gnttab_seed(self->xc_handle, dom, true,
+                            console_gmfn, xenstore_gmfn,
+                            console_domid, xenstore_domid) != 0 )
         return pyxc_error_to_exception(self->xc_handle);
 
     return Py_None;
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (4 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 05/12] tools/libxenctrl: use new xenforeignmemory API to seed grant table Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 14:29   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 07/12] x86/hvm/ioreq: use bool rather than bool_t Paul Durrant
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

Since ioreq servers are only relevant to HVM guests and all the names in
question unequivocally refer to guest frame numbers, name them all .*gfn
to avoid any confusion.

This patch is purely cosmetic. No semantic or functional change.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libs/devicemodel/core.c                   | 10 ++--
 tools/libs/devicemodel/include/xendevicemodel.h | 12 ++--
 xen/arch/x86/hvm/dm.c                           |  4 +-
 xen/arch/x86/hvm/hvm.c                          |  6 +-
 xen/arch/x86/hvm/ioreq.c                        | 74 ++++++++++++-------------
 xen/include/asm-x86/hvm/domain.h                |  4 +-
 xen/include/asm-x86/hvm/ioreq.h                 |  4 +-
 xen/include/public/hvm/dm_op.h                  | 20 +++----
 8 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
index d7c6476006..fcb260d29b 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -174,7 +174,7 @@ int xendevicemodel_create_ioreq_server(
 
 int xendevicemodel_get_ioreq_server_info(
     xendevicemodel_handle *dmod, domid_t domid, ioservid_t id,
-    xen_pfn_t *ioreq_pfn, xen_pfn_t *bufioreq_pfn,
+    xen_pfn_t *ioreq_gfn, xen_pfn_t *bufioreq_gfn,
     evtchn_port_t *bufioreq_port)
 {
     struct xen_dm_op op;
@@ -192,11 +192,11 @@ int xendevicemodel_get_ioreq_server_info(
     if (rc)
         return rc;
 
-    if (ioreq_pfn)
-        *ioreq_pfn = data->ioreq_pfn;
+    if (ioreq_gfn)
+        *ioreq_gfn = data->ioreq_gfn;
 
-    if (bufioreq_pfn)
-        *bufioreq_pfn = data->bufioreq_pfn;
+    if (bufioreq_gfn)
+        *bufioreq_gfn = data->bufioreq_gfn;
 
     if (bufioreq_port)
         *bufioreq_port = data->bufioreq_port;
diff --git a/tools/libs/devicemodel/include/xendevicemodel.h b/tools/libs/devicemodel/include/xendevicemodel.h
index 580fad2f49..13216db04a 100644
--- a/tools/libs/devicemodel/include/xendevicemodel.h
+++ b/tools/libs/devicemodel/include/xendevicemodel.h
@@ -60,17 +60,17 @@ int xendevicemodel_create_ioreq_server(
  * @parm dmod a handle to an open devicemodel interface.
  * @parm domid the domain id to be serviced
  * @parm id the IOREQ Server id.
- * @parm ioreq_pfn pointer to a xen_pfn_t to receive the synchronous ioreq
- *                  gmfn
- * @parm bufioreq_pfn pointer to a xen_pfn_t to receive the buffered ioreq
- *                    gmfn
+ * @parm ioreq_gfn pointer to a xen_pfn_t to receive the synchronous ioreq
+ *                  gfn
+ * @parm bufioreq_gfn pointer to a xen_pfn_t to receive the buffered ioreq
+ *                    gfn
  * @parm bufioreq_port pointer to a evtchn_port_t to receive the buffered
  *                     ioreq event channel
  * @return 0 on success, -1 on failure.
  */
 int xendevicemodel_get_ioreq_server_info(
     xendevicemodel_handle *dmod, domid_t domid, ioservid_t id,
-    xen_pfn_t *ioreq_pfn, xen_pfn_t *bufioreq_pfn,
+    xen_pfn_t *ioreq_gfn, xen_pfn_t *bufioreq_gfn,
     evtchn_port_t *bufioreq_port);
 
 /**
@@ -168,7 +168,7 @@ int xendevicemodel_destroy_ioreq_server(
  * This function sets IOREQ Server state. An IOREQ Server
  * will not be passed emulation requests until it is in
  * the enabled state.
- * Note that the contents of the ioreq_pfn and bufioreq_pfn are
+ * Note that the contents of the ioreq_gfn and bufioreq_gfn are
  * not meaningful until the IOREQ Server is in the enabled state.
  *
  * @parm dmod a handle to an open devicemodel interface.
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 4cf6deedc7..f7cb883fec 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -426,8 +426,8 @@ static int dm_op(const struct dmop_args *op_args)
             break;
 
         rc = hvm_get_ioreq_server_info(d, data->id,
-                                       &data->ioreq_pfn,
-                                       &data->bufioreq_pfn,
+                                       &data->ioreq_gfn,
+                                       &data->bufioreq_gfn,
                                        &data->bufioreq_port);
         break;
     }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6cb903def5..58b4afa1d1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4185,20 +4185,20 @@ static int hvmop_set_param(
             rc = -EINVAL;
         break;
     case HVM_PARAM_IOREQ_SERVER_PFN:
-        d->arch.hvm_domain.ioreq_gmfn.base = a.value;
+        d->arch.hvm_domain.ioreq_gfn.base = a.value;
         break;
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
     {
         unsigned int i;
 
         if ( a.value == 0 ||
-             a.value > sizeof(d->arch.hvm_domain.ioreq_gmfn.mask) * 8 )
+             a.value > sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8 )
         {
             rc = -EINVAL;
             break;
         }
         for ( i = 0; i < a.value; i++ )
-            set_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
+            set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 
         break;
     }
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 752976d16d..69913cf3cd 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -181,17 +181,17 @@ bool_t handle_hvm_io_completion(struct vcpu *v)
     return 1;
 }
 
-static int hvm_alloc_ioreq_gmfn(struct domain *d, unsigned long *gmfn)
+static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
 {
     unsigned int i;
     int rc;
 
     rc = -ENOMEM;
-    for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gmfn.mask) * 8; i++ )
+    for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
     {
-        if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask) )
+        if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
         {
-            *gmfn = d->arch.hvm_domain.ioreq_gmfn.base + i;
+            *gfn = d->arch.hvm_domain.ioreq_gfn.base + i;
             rc = 0;
             break;
         }
@@ -200,12 +200,12 @@ static int hvm_alloc_ioreq_gmfn(struct domain *d, unsigned long *gmfn)
     return rc;
 }
 
-static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
+static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
 {
-    unsigned int i = gmfn - d->arch.hvm_domain.ioreq_gmfn.base;
+    unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
 
-    if ( gmfn != gfn_x(INVALID_GFN) )
-        set_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
+    if ( gfn != gfn_x(INVALID_GFN) )
+        set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
 static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
@@ -216,7 +216,7 @@ static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
 }
 
 static int hvm_map_ioreq_page(
-    struct hvm_ioreq_server *s, bool_t buf, unsigned long gmfn)
+    struct hvm_ioreq_server *s, bool_t buf, unsigned long gfn)
 {
     struct domain *d = s->domain;
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
@@ -224,7 +224,7 @@ static int hvm_map_ioreq_page(
     void *va;
     int rc;
 
-    if ( (rc = prepare_ring_for_helper(d, gmfn, &page, &va)) )
+    if ( (rc = prepare_ring_for_helper(d, gfn, &page, &va)) )
         return rc;
 
     if ( (iorp->va != NULL) || d->is_dying )
@@ -235,7 +235,7 @@ static int hvm_map_ioreq_page(
 
     iorp->va = va;
     iorp->page = page;
-    iorp->gmfn = gmfn;
+    iorp->gfn = gfn;
 
     return 0;
 }
@@ -264,23 +264,23 @@ bool_t is_ioreq_server_page(struct domain *d, const struct page_info *page)
     return found;
 }
 
-static void hvm_remove_ioreq_gmfn(
+static void hvm_remove_ioreq_gfn(
     struct domain *d, struct hvm_ioreq_page *iorp)
 {
-    if ( guest_physmap_remove_page(d, _gfn(iorp->gmfn),
+    if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
                                    _mfn(page_to_mfn(iorp->page)), 0) )
         domain_crash(d);
     clear_page(iorp->va);
 }
 
-static int hvm_add_ioreq_gmfn(
+static int hvm_add_ioreq_gfn(
     struct domain *d, struct hvm_ioreq_page *iorp)
 {
     int rc;
 
     clear_page(iorp->va);
 
-    rc = guest_physmap_add_page(d, _gfn(iorp->gmfn),
+    rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
                                 _mfn(page_to_mfn(iorp->page)), 0);
     if ( rc == 0 )
         paging_mark_dirty(d, _mfn(page_to_mfn(iorp->page)));
@@ -412,17 +412,17 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
 }
 
 static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
-                                      unsigned long ioreq_pfn,
-                                      unsigned long bufioreq_pfn)
+                                      unsigned long ioreq_gfn,
+                                      unsigned long bufioreq_gfn)
 {
     int rc;
 
-    rc = hvm_map_ioreq_page(s, 0, ioreq_pfn);
+    rc = hvm_map_ioreq_page(s, 0, ioreq_gfn);
     if ( rc )
         return rc;
 
-    if ( bufioreq_pfn != gfn_x(INVALID_GFN) )
-        rc = hvm_map_ioreq_page(s, 1, bufioreq_pfn);
+    if ( bufioreq_gfn != gfn_x(INVALID_GFN) )
+        rc = hvm_map_ioreq_page(s, 1, bufioreq_gfn);
 
     if ( rc )
         hvm_unmap_ioreq_page(s, 0);
@@ -435,8 +435,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
                                         bool_t handle_bufioreq)
 {
     struct domain *d = s->domain;
-    unsigned long ioreq_pfn = gfn_x(INVALID_GFN);
-    unsigned long bufioreq_pfn = gfn_x(INVALID_GFN);
+    unsigned long ioreq_gfn = gfn_x(INVALID_GFN);
+    unsigned long bufioreq_gfn = gfn_x(INVALID_GFN);
     int rc;
 
     if ( is_default )
@@ -451,18 +451,18 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
                    d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN]);
     }
 
-    rc = hvm_alloc_ioreq_gmfn(d, &ioreq_pfn);
+    rc = hvm_alloc_ioreq_gfn(d, &ioreq_gfn);
 
     if ( !rc && handle_bufioreq )
-        rc = hvm_alloc_ioreq_gmfn(d, &bufioreq_pfn);
+        rc = hvm_alloc_ioreq_gfn(d, &bufioreq_gfn);
 
     if ( !rc )
-        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn);
+        rc = hvm_ioreq_server_map_pages(s, ioreq_gfn, bufioreq_gfn);
 
     if ( rc )
     {
-        hvm_free_ioreq_gmfn(d, ioreq_pfn);
-        hvm_free_ioreq_gmfn(d, bufioreq_pfn);
+        hvm_free_ioreq_gfn(d, ioreq_gfn);
+        hvm_free_ioreq_gfn(d, bufioreq_gfn);
     }
 
     return rc;
@@ -482,9 +482,9 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
     if ( !is_default )
     {
         if ( handle_bufioreq )
-            hvm_free_ioreq_gmfn(d, s->bufioreq.gmfn);
+            hvm_free_ioreq_gfn(d, s->bufioreq.gfn);
 
-        hvm_free_ioreq_gmfn(d, s->ioreq.gmfn);
+        hvm_free_ioreq_gfn(d, s->ioreq.gfn);
     }
 }
 
@@ -556,10 +556,10 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
 
     if ( !is_default )
     {
-        hvm_remove_ioreq_gmfn(d, &s->ioreq);
+        hvm_remove_ioreq_gfn(d, &s->ioreq);
 
         if ( handle_bufioreq )
-            hvm_remove_ioreq_gmfn(d, &s->bufioreq);
+            hvm_remove_ioreq_gfn(d, &s->bufioreq);
     }
 
     s->enabled = 1;
@@ -587,9 +587,9 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
     if ( !is_default )
     {
         if ( handle_bufioreq )
-            hvm_add_ioreq_gmfn(d, &s->bufioreq);
+            hvm_add_ioreq_gfn(d, &s->bufioreq);
 
-        hvm_add_ioreq_gmfn(d, &s->ioreq);
+        hvm_add_ioreq_gfn(d, &s->ioreq);
     }
 
     s->enabled = 0;
@@ -776,8 +776,8 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 }
 
 int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                              unsigned long *ioreq_pfn,
-                              unsigned long *bufioreq_pfn,
+                              unsigned long *ioreq_gfn,
+                              unsigned long *bufioreq_gfn,
                               evtchn_port_t *bufioreq_port)
 {
     struct hvm_ioreq_server *s;
@@ -796,11 +796,11 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
         if ( s->id != id )
             continue;
 
-        *ioreq_pfn = s->ioreq.gmfn;
+        *ioreq_gfn = s->ioreq.gfn;
 
         if ( s->bufioreq.va != NULL )
         {
-            *bufioreq_pfn = s->bufioreq.gmfn;
+            *bufioreq_gfn = s->bufioreq.gfn;
             *bufioreq_port = s->bufioreq_evtchn;
         }
 
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index d2899c9bb2..ce536f75ef 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -36,7 +36,7 @@
 #include <public/hvm/dm_op.h>
 
 struct hvm_ioreq_page {
-    unsigned long gmfn;
+    unsigned long gfn;
     struct page_info *page;
     void *va;
 };
@@ -105,7 +105,7 @@ struct hvm_domain {
     struct {
         unsigned long base;
         unsigned long mask;
-    } ioreq_gmfn;
+    } ioreq_gfn;
 
     /* Lock protects all other values in the sub-struct and the default */
     struct {
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index b43667a367..43fbe115dc 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -28,8 +28,8 @@ int hvm_create_ioreq_server(struct domain *d, domid_t domid,
                             ioservid_t *id);
 int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
 int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                              unsigned long *ioreq_pfn,
-                              unsigned long *bufioreq_pfn,
+                              unsigned long *ioreq_gfn,
+                              unsigned long *bufioreq_gfn,
                               evtchn_port_t *bufioreq_port);
 int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint64_t start,
diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
index 2a4c3d938d..6bbab5fca3 100644
--- a/xen/include/public/hvm/dm_op.h
+++ b/xen/include/public/hvm/dm_op.h
@@ -41,9 +41,9 @@
  * A domain supports a single 'legacy' IOREQ Server which is instantiated if
  * parameter...
  *
- * HVM_PARAM_IOREQ_PFN is read (to get the gmfn containing the synchronous
+ * HVM_PARAM_IOREQ_PFN is read (to get the gfn containing the synchronous
  * ioreq structures), or...
- * HVM_PARAM_BUFIOREQ_PFN is read (to get the gmfn containing the buffered
+ * HVM_PARAM_BUFIOREQ_PFN is read (to get the gfn containing the buffered
  * ioreq ring), or...
  * HVM_PARAM_BUFIOREQ_EVTCHN is read (to get the event channel that Xen uses
  * to request buffered I/O emulation).
@@ -81,14 +81,14 @@ struct xen_dm_op_create_ioreq_server {
  *
  * The emulator needs to map the synchronous ioreq structures and buffered
  * ioreq ring (if it exists) that Xen uses to request emulation. These are
- * hosted in the target domain's gmfns <ioreq_pfn> and <bufioreq_pfn>
+ * hosted in the target domain's gmfns <ioreq_gfn> and <bufioreq_gfn>
  * respectively. In addition, if the IOREQ Server is handling buffered
  * emulation requests, the emulator needs to bind to event channel
  * <bufioreq_port> to listen for them. (The event channels used for
  * synchronous emulation requests are specified in the per-CPU ioreq
- * structures in <ioreq_pfn>).
+ * structures in <ioreq_gfn>).
  * If the IOREQ Server is not handling buffered emulation requests then the
- * values handed back in <bufioreq_pfn> and <bufioreq_port> will both be 0.
+ * values handed back in <bufioreq_gfn> and <bufioreq_port> will both be 0.
  */
 #define XEN_DMOP_get_ioreq_server_info 2
 
@@ -98,10 +98,10 @@ struct xen_dm_op_get_ioreq_server_info {
     uint16_t pad;
     /* OUT - buffered ioreq port */
     evtchn_port_t bufioreq_port;
-    /* OUT - sync ioreq pfn */
-    uint64_aligned_t ioreq_pfn;
-    /* OUT - buffered ioreq pfn */
-    uint64_aligned_t bufioreq_pfn;
+    /* OUT - sync ioreq gfn */
+    uint64_aligned_t ioreq_gfn;
+    /* OUT - buffered ioreq gfn */
+    uint64_aligned_t bufioreq_gfn;
 };
 
 /*
@@ -150,7 +150,7 @@ struct xen_dm_op_ioreq_server_range {
  *
  * The IOREQ Server will not be passed any emulation requests until it is
  * in the enabled state.
- * Note that the contents of the ioreq_pfn and bufioreq_fn (see
+ * Note that the contents of the ioreq_gfn and bufioreq_gfn (see
  * XEN_DMOP_get_ioreq_server_info) are not meaningful until the IOREQ Server
  * is in the enabled state.
  */
-- 
2.11.0



* [PATCH v7 07/12] x86/hvm/ioreq: use bool rather than bool_t
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (5 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 14:30   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list Paul Durrant
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

This patch changes use of bool_t to bool in the ioreq server code. It also
fixes an incorrect indentation in a continuation line.

This patch is purely cosmetic. No semantic or functional change.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/dm.c            |   2 +-
 xen/arch/x86/hvm/hvm.c           |   2 +-
 xen/arch/x86/hvm/io.c            |   4 +-
 xen/arch/x86/hvm/ioreq.c         | 100 +++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/domain.h |   6 +--
 xen/include/asm-x86/hvm/ioreq.h  |  14 +++---
 6 files changed, 64 insertions(+), 64 deletions(-)

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index f7cb883fec..87ef4b6ca9 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -409,7 +409,7 @@ static int dm_op(const struct dmop_args *op_args)
         if ( data->pad[0] || data->pad[1] || data->pad[2] )
             break;
 
-        rc = hvm_create_ioreq_server(d, curr_d->domain_id, 0,
+        rc = hvm_create_ioreq_server(d, curr_d->domain_id, false,
                                      data->handle_bufioreq, &data->id);
         break;
     }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 58b4afa1d1..031d07baf0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4361,7 +4361,7 @@ static int hvmop_get_param(
         {
             domid_t domid = d->arch.hvm_domain.params[HVM_PARAM_DM_DOMAIN];
 
-            rc = hvm_create_ioreq_server(d, domid, 1,
+            rc = hvm_create_ioreq_server(d, domid, true,
                                          HVM_IOREQSRV_BUFIOREQ_LEGACY, NULL);
             if ( rc != 0 && rc != -EEXIST )
                 goto out;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index bf41954f59..1ddcaba52e 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -59,7 +59,7 @@ void send_timeoffset_req(unsigned long timeoff)
     if ( timeoff == 0 )
         return;
 
-    if ( hvm_broadcast_ioreq(&p, 1) != 0 )
+    if ( hvm_broadcast_ioreq(&p, true) != 0 )
         gprintk(XENLOG_ERR, "Unsuccessful timeoffset update\n");
 }
 
@@ -73,7 +73,7 @@ void send_invalidate_req(void)
         .data = ~0UL, /* flush all */
     };
 
-    if ( hvm_broadcast_ioreq(&p, 0) != 0 )
+    if ( hvm_broadcast_ioreq(&p, false) != 0 )
         gprintk(XENLOG_ERR, "Unsuccessful map-cache invalidate\n");
 }
 
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 69913cf3cd..f2e0b3f74a 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -43,7 +43,7 @@ static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
     return &p->vcpu_ioreq[v->vcpu_id];
 }
 
-bool_t hvm_io_pending(struct vcpu *v)
+bool hvm_io_pending(struct vcpu *v)
 {
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
@@ -59,11 +59,11 @@ bool_t hvm_io_pending(struct vcpu *v)
                               list_entry )
         {
             if ( sv->vcpu == v && sv->pending )
-                return 1;
+                return true;
         }
     }
 
-    return 0;
+    return false;
 }
 
 static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
@@ -82,10 +82,10 @@ static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
     msix_write_completion(v);
     vcpu_end_shutdown_deferral(v);
 
-    sv->pending = 0;
+    sv->pending = false;
 }
 
-static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
+static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
     while ( sv->pending )
     {
@@ -112,16 +112,16 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
             break;
         default:
             gdprintk(XENLOG_ERR, "Weird HVM iorequest state %u\n", state);
-            sv->pending = 0;
+            sv->pending = false;
             domain_crash(sv->vcpu->domain);
-            return 0; /* bail */
+            return false; /* bail */
         }
     }
 
-    return 1;
+    return true;
 }
 
-bool_t handle_hvm_io_completion(struct vcpu *v)
+bool handle_hvm_io_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
@@ -141,7 +141,7 @@ bool_t handle_hvm_io_completion(struct vcpu *v)
             if ( sv->vcpu == v && sv->pending )
             {
                 if ( !hvm_wait_for_io(sv, get_ioreq(s, v)) )
-                    return 0;
+                    return false;
 
                 break;
             }
@@ -178,7 +178,7 @@ bool_t handle_hvm_io_completion(struct vcpu *v)
         break;
     }
 
-    return 1;
+    return true;
 }
 
 static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
@@ -208,7 +208,7 @@ static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
         set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
+static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool buf)
 {
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
@@ -216,7 +216,7 @@ static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
 }
 
 static int hvm_map_ioreq_page(
-    struct hvm_ioreq_server *s, bool_t buf, unsigned long gfn)
+    struct hvm_ioreq_server *s, bool buf, unsigned long gfn)
 {
     struct domain *d = s->domain;
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
@@ -240,10 +240,10 @@ static int hvm_map_ioreq_page(
     return 0;
 }
 
-bool_t is_ioreq_server_page(struct domain *d, const struct page_info *page)
+bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
     const struct hvm_ioreq_server *s;
-    bool_t found = 0;
+    bool found = false;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
@@ -254,7 +254,7 @@ bool_t is_ioreq_server_page(struct domain *d, const struct page_info *page)
         if ( (s->ioreq.va && s->ioreq.page == page) ||
              (s->bufioreq.va && s->bufioreq.page == page) )
         {
-            found = 1;
+            found = true;
             break;
         }
     }
@@ -302,7 +302,7 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
 }
 
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
-                                     bool_t is_default, struct vcpu *v)
+                                     bool is_default, struct vcpu *v)
 {
     struct hvm_ioreq_vcpu *sv;
     int rc;
@@ -417,22 +417,22 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
 {
     int rc;
 
-    rc = hvm_map_ioreq_page(s, 0, ioreq_gfn);
+    rc = hvm_map_ioreq_page(s, false, ioreq_gfn);
     if ( rc )
         return rc;
 
     if ( bufioreq_gfn != gfn_x(INVALID_GFN) )
-        rc = hvm_map_ioreq_page(s, 1, bufioreq_gfn);
+        rc = hvm_map_ioreq_page(s, true, bufioreq_gfn);
 
     if ( rc )
-        hvm_unmap_ioreq_page(s, 0);
+        hvm_unmap_ioreq_page(s, false);
 
     return rc;
 }
 
 static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
-                                        bool_t is_default,
-                                        bool_t handle_bufioreq)
+                                        bool is_default,
+                                        bool handle_bufioreq)
 {
     struct domain *d = s->domain;
     unsigned long ioreq_gfn = gfn_x(INVALID_GFN);
@@ -469,15 +469,15 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
 }
 
 static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
-                                         bool_t is_default)
+                                         bool is_default)
 {
     struct domain *d = s->domain;
-    bool_t handle_bufioreq = ( s->bufioreq.va != NULL );
+    bool handle_bufioreq = !!s->bufioreq.va;
 
     if ( handle_bufioreq )
-        hvm_unmap_ioreq_page(s, 1);
+        hvm_unmap_ioreq_page(s, true);
 
-    hvm_unmap_ioreq_page(s, 0);
+    hvm_unmap_ioreq_page(s, false);
 
     if ( !is_default )
     {
@@ -489,7 +489,7 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
 }
 
 static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
-                                            bool_t is_default)
+                                            bool is_default)
 {
     unsigned int i;
 
@@ -501,7 +501,7 @@ static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
 }
 
 static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
-                                            bool_t is_default)
+                                            bool is_default)
 {
     unsigned int i;
     int rc;
@@ -537,17 +537,17 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     return 0;
 
  fail:
-    hvm_ioreq_server_free_rangesets(s, 0);
+    hvm_ioreq_server_free_rangesets(s, false);
 
     return rc;
 }
 
 static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
-                                    bool_t is_default)
+                                    bool is_default)
 {
     struct domain *d = s->domain;
     struct hvm_ioreq_vcpu *sv;
-    bool_t handle_bufioreq = ( s->bufioreq.va != NULL );
+    bool handle_bufioreq = !!s->bufioreq.va;
 
     spin_lock(&s->lock);
 
@@ -562,7 +562,7 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
             hvm_remove_ioreq_gfn(d, &s->bufioreq);
     }
 
-    s->enabled = 1;
+    s->enabled = true;
 
     list_for_each_entry ( sv,
                           &s->ioreq_vcpu_list,
@@ -574,10 +574,10 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
 }
 
 static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
-                                    bool_t is_default)
+                                     bool is_default)
 {
     struct domain *d = s->domain;
-    bool_t handle_bufioreq = ( s->bufioreq.va != NULL );
+    bool handle_bufioreq = !!s->bufioreq.va;
 
     spin_lock(&s->lock);
 
@@ -592,7 +592,7 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
         hvm_add_ioreq_gfn(d, &s->ioreq);
     }
 
-    s->enabled = 0;
+    s->enabled = false;
 
  done:
     spin_unlock(&s->lock);
@@ -600,7 +600,7 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
 
 static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
                                  struct domain *d, domid_t domid,
-                                 bool_t is_default, int bufioreq_handling,
+                                 bool is_default, int bufioreq_handling,
                                  ioservid_t id)
 {
     struct vcpu *v;
@@ -619,7 +619,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
         return rc;
 
     if ( bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC )
-        s->bufioreq_atomic = 1;
+        s->bufioreq_atomic = true;
 
     rc = hvm_ioreq_server_setup_pages(
              s, is_default, bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF);
@@ -646,7 +646,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 }
 
 static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s,
-                                    bool_t is_default)
+                                    bool is_default)
 {
     ASSERT(!s->enabled);
     hvm_ioreq_server_remove_all_vcpus(s);
@@ -681,7 +681,7 @@ static ioservid_t next_ioservid(struct domain *d)
 }
 
 int hvm_create_ioreq_server(struct domain *d, domid_t domid,
-                            bool_t is_default, int bufioreq_handling,
+                            bool is_default, int bufioreq_handling,
                             ioservid_t *id)
 {
     struct hvm_ioreq_server *s;
@@ -713,7 +713,7 @@ int hvm_create_ioreq_server(struct domain *d, domid_t domid,
     if ( is_default )
     {
         d->arch.hvm_domain.default_ioreq_server = s;
-        hvm_ioreq_server_enable(s, 1);
+        hvm_ioreq_server_enable(s, true);
     }
 
     if ( id )
@@ -756,11 +756,11 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
         p2m_set_ioreq_server(d, 0, s);
 
-        hvm_ioreq_server_disable(s, 0);
+        hvm_ioreq_server_disable(s, false);
 
         list_del(&s->list_entry);
 
-        hvm_ioreq_server_deinit(s, 0);
+        hvm_ioreq_server_deinit(s, false);
 
         domain_unpause(d);
 
@@ -968,7 +968,7 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
 }
 
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                               bool_t enabled)
+                               bool enabled)
 {
     struct list_head *entry;
     int rc;
@@ -992,9 +992,9 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
         domain_pause(d);
 
         if ( enabled )
-            hvm_ioreq_server_enable(s, 0);
+            hvm_ioreq_server_enable(s, false);
         else
-            hvm_ioreq_server_disable(s, 0);
+            hvm_ioreq_server_disable(s, false);
 
         domain_unpause(d);
 
@@ -1017,7 +1017,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
                           &d->arch.hvm_domain.ioreq_server.list,
                           list_entry )
     {
-        bool_t is_default = (s == d->arch.hvm_domain.default_ioreq_server);
+        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
 
         rc = hvm_ioreq_server_add_vcpu(s, is_default, v);
         if ( rc )
@@ -1066,7 +1066,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
                                &d->arch.hvm_domain.ioreq_server.list,
                                list_entry )
     {
-        bool_t is_default = (s == d->arch.hvm_domain.default_ioreq_server);
+        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
 
         hvm_ioreq_server_disable(s, is_default);
 
@@ -1347,7 +1347,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
 }
 
 int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
-                   bool_t buffered)
+                   bool buffered)
 {
     struct vcpu *curr = current;
     struct domain *d = curr->domain;
@@ -1398,7 +1398,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
             p->state = STATE_IOREQ_READY;
             notify_via_xen_event_channel(d, port);
 
-            sv->pending = 1;
+            sv->pending = true;
             return X86EMUL_RETRY;
         }
     }
@@ -1406,7 +1406,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
     return X86EMUL_UNHANDLEABLE;
 }
 
-unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool_t buffered)
+unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
 {
     struct domain *d = current->domain;
     struct hvm_ioreq_server *s;
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index ce536f75ef..7f128c05ff 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -45,7 +45,7 @@ struct hvm_ioreq_vcpu {
     struct list_head list_entry;
     struct vcpu      *vcpu;
     evtchn_port_t    ioreq_evtchn;
-    bool_t           pending;
+    bool             pending;
 };
 
 #define NR_IO_RANGE_TYPES (XEN_DMOP_IO_RANGE_PCI + 1)
@@ -69,8 +69,8 @@ struct hvm_ioreq_server {
     spinlock_t             bufioreq_lock;
     evtchn_port_t          bufioreq_evtchn;
     struct rangeset        *range[NR_IO_RANGE_TYPES];
-    bool_t                 enabled;
-    bool_t                 bufioreq_atomic;
+    bool                   enabled;
+    bool                   bufioreq_atomic;
 };
 
 /*
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index 43fbe115dc..1829fcf43e 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -19,12 +19,12 @@
 #ifndef __ASM_X86_HVM_IOREQ_H__
 #define __ASM_X86_HVM_IOREQ_H__
 
-bool_t hvm_io_pending(struct vcpu *v);
-bool_t handle_hvm_io_completion(struct vcpu *v);
-bool_t is_ioreq_server_page(struct domain *d, const struct page_info *page);
+bool hvm_io_pending(struct vcpu *v);
+bool handle_hvm_io_completion(struct vcpu *v);
+bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
 
 int hvm_create_ioreq_server(struct domain *d, domid_t domid,
-                            bool_t is_default, int bufioreq_handling,
+                            bool is_default, int bufioreq_handling,
                             ioservid_t *id);
 int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
 int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
@@ -40,7 +40,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
 int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint32_t flags);
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                               bool_t enabled);
+                               bool enabled);
 
 int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
 void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
@@ -51,8 +51,8 @@ int hvm_set_dm_domain(struct domain *d, domid_t domid);
 struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
                                                  ioreq_t *p);
 int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
-                   bool_t buffered);
-unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool_t buffered);
+                   bool buffered);
+unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
 void hvm_ioreq_init(struct domain *d);
 
-- 
2.11.0



* [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (6 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 07/12] x86/hvm/ioreq: use bool rather than bool_t Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 15:17   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 09/12] x86/hvm/ioreq: simplify code and use consistent naming Paul Durrant
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

A subsequent patch will remove the current implicit limitation on creation
of ioreq servers which is due to the allocation of gfns for the ioreq
structures and buffered ioreq ring.

It will therefore be necessary to introduce an explicit limit and, since
this limit should be small, it simplifies the code to maintain an array of
that size rather than using a list.

Also, by reserving an array slot for the default server and populating
array slots early in create, the need to pass an 'is_default' boolean
to sub-functions can be avoided.
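
To illustrate the shape of the change, a minimal standalone sketch of the
array-based scheme follows. The table struct and the lookup()/is_default()
helpers here are simplified stand-ins invented purely for illustration; the
patch itself introduces get_ioreq_server(), set_ioreq_server(), IS_DEFAULT()
and the FOR_EACH_IOREQ_SERVER() macro, shown in the diff below.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NR_IOREQ_SERVERS 8
#define DEFAULT_IOSERVID     0

struct ioreq_server {
    unsigned int dummy;                 /* real contents not needed here */
};

struct ioreq_server_table {
    struct ioreq_server *server[MAX_NR_IOREQ_SERVERS];
    unsigned int count;
};

/* Lookup is a bounds-checked array index; the server id *is* the index. */
static struct ioreq_server *lookup(struct ioreq_server_table *t,
                                   unsigned int id)
{
    return (id < MAX_NR_IOREQ_SERVERS) ? t->server[id] : NULL;
}

/* Slot 0 is reserved for the default server, so helpers can test for it
 * directly instead of being passed an 'is_default' flag. */
static bool is_default(struct ioreq_server_table *t,
                       struct ioreq_server *s)
{
    return s == t->server[DEFAULT_IOSERVID];
}

int main(void)
{
    struct ioreq_server def;
    struct ioreq_server_table t = { .count = 0 };

    t.server[DEFAULT_IOSERVID] = &def;
    t.count++;

    printf("default? %d\n", is_default(&t, lookup(&t, DEFAULT_IOSERVID)));
    printf("slot 1 empty? %d\n", lookup(&t, 1) == NULL);
    return 0;
}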

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

v7:
 - Fixed assertion failure found in testing.

v6:
 - Updated according to comments made by Roger on v4 that I'd missed.

v5:
 - Switched GET/SET_IOREQ_SERVER() macros to get/set_ioreq_server()
   functions to avoid possible double-evaluation issues.

v4:
 - Introduced more helper macros and relocated them to the top of the
   code.

v3:
 - New patch (replacing "move is_default into struct hvm_ioreq_server") in
   response to review comments.
---
 xen/arch/x86/hvm/ioreq.c         | 512 +++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/domain.h |  11 +-
 2 files changed, 261 insertions(+), 262 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index f2e0b3f74a..fe29004e87 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -33,6 +33,32 @@
 
 #include <public/hvm/ioreq.h>
 
+static void set_ioreq_server(struct domain *d, unsigned int id,
+                             struct hvm_ioreq_server *s)
+{
+    ASSERT(id < MAX_NR_IOREQ_SERVERS);
+    ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
+
+    d->arch.hvm_domain.ioreq_server.server[id] = s;
+}
+
+static struct hvm_ioreq_server *get_ioreq_server(struct domain *d,
+                                                 unsigned int id)
+{
+    if ( id >= MAX_NR_IOREQ_SERVERS )
+        return NULL;
+
+    return d->arch.hvm_domain.ioreq_server.server[id];
+}
+
+#define IS_DEFAULT(s) \
+    ((s) == get_ioreq_server((s)->domain, DEFAULT_IOSERVID))
+
+#define FOR_EACH_IOREQ_SERVER(d, id, s) \
+    for ( (id) = 0, (s) = get_ioreq_server((d), (id)); \
+          (id) < MAX_NR_IOREQ_SERVERS; \
+          (s) = get_ioreq_server((d), ++(id)) )
+
 static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
 {
     shared_iopage_t *p = s->ioreq.va;
@@ -47,13 +73,15 @@ bool hvm_io_pending(struct vcpu *v)
 {
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
+    unsigned int id;
 
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
 
+        if ( !s )
+            continue;
+
         list_for_each_entry ( sv,
                               &s->ioreq_vcpu_list,
                               list_entry )
@@ -127,13 +155,15 @@ bool handle_hvm_io_completion(struct vcpu *v)
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct hvm_ioreq_server *s;
     enum hvm_io_completion io_completion;
+    unsigned int id;
 
-      list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
 
+        if ( !s )
+            continue;
+
         list_for_each_entry ( sv,
                               &s->ioreq_vcpu_list,
                               list_entry )
@@ -243,14 +273,16 @@ static int hvm_map_ioreq_page(
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
     const struct hvm_ioreq_server *s;
+    unsigned int id;
     bool found = false;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
+        if ( !s )
+            continue;
+
         if ( (s->ioreq.va && s->ioreq.page == page) ||
              (s->bufioreq.va && s->bufioreq.page == page) )
         {
@@ -301,8 +333,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
     }
 }
 
+
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
-                                     bool is_default, struct vcpu *v)
+                                     struct vcpu *v)
 {
     struct hvm_ioreq_vcpu *sv;
     int rc;
@@ -331,7 +364,7 @@ static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
             goto fail3;
 
         s->bufioreq_evtchn = rc;
-        if ( is_default )
+        if ( IS_DEFAULT(s) )
             d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_EVTCHN] =
                 s->bufioreq_evtchn;
     }
@@ -431,7 +464,6 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
 }
 
 static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
-                                        bool is_default,
                                         bool handle_bufioreq)
 {
     struct domain *d = s->domain;
@@ -439,7 +471,7 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
     unsigned long bufioreq_gfn = gfn_x(INVALID_GFN);
     int rc;
 
-    if ( is_default )
+    if ( IS_DEFAULT(s) )
     {
         /*
          * The default ioreq server must handle buffered ioreqs, for
@@ -468,8 +500,7 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
-                                         bool is_default)
+static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
 {
     struct domain *d = s->domain;
     bool handle_bufioreq = !!s->bufioreq.va;
@@ -479,7 +510,7 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
 
     hvm_unmap_ioreq_page(s, false);
 
-    if ( !is_default )
+    if ( !IS_DEFAULT(s) )
     {
         if ( handle_bufioreq )
             hvm_free_ioreq_gfn(d, s->bufioreq.gfn);
@@ -488,12 +519,11 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
     }
 }
 
-static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
-                                            bool is_default)
+static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
 {
     unsigned int i;
 
-    if ( is_default )
+    if ( IS_DEFAULT(s) )
         return;
 
     for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
@@ -501,19 +531,19 @@ static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
 }
 
 static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
-                                            bool is_default)
+                                            ioservid_t id)
 {
     unsigned int i;
     int rc;
 
-    if ( is_default )
+    if ( IS_DEFAULT(s) )
         goto done;
 
     for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
     {
         char *name;
 
-        rc = asprintf(&name, "ioreq_server %d %s", s->id,
+        rc = asprintf(&name, "ioreq_server %d %s", id,
                       (i == XEN_DMOP_IO_RANGE_PORT) ? "port" :
                       (i == XEN_DMOP_IO_RANGE_MEMORY) ? "memory" :
                       (i == XEN_DMOP_IO_RANGE_PCI) ? "pci" :
@@ -537,13 +567,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     return 0;
 
  fail:
-    hvm_ioreq_server_free_rangesets(s, false);
+    hvm_ioreq_server_free_rangesets(s);
 
     return rc;
 }
 
-static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
-                                    bool is_default)
+static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
 {
     struct domain *d = s->domain;
     struct hvm_ioreq_vcpu *sv;
@@ -554,7 +583,7 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
     if ( s->enabled )
         goto done;
 
-    if ( !is_default )
+    if ( !IS_DEFAULT(s) )
     {
         hvm_remove_ioreq_gfn(d, &s->ioreq);
 
@@ -573,8 +602,7 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s,
     spin_unlock(&s->lock);
 }
 
-static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
-                                     bool is_default)
+static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
 {
     struct domain *d = s->domain;
     bool handle_bufioreq = !!s->bufioreq.va;
@@ -584,7 +612,7 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
     if ( !s->enabled )
         goto done;
 
-    if ( !is_default )
+    if ( !IS_DEFAULT(s) )
     {
         if ( handle_bufioreq )
             hvm_add_ioreq_gfn(d, &s->bufioreq);
@@ -600,13 +628,11 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s,
 
 static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
                                  struct domain *d, domid_t domid,
-                                 bool is_default, int bufioreq_handling,
-                                 ioservid_t id)
+                                 int bufioreq_handling, ioservid_t id)
 {
     struct vcpu *v;
     int rc;
 
-    s->id = id;
     s->domain = d;
     s->domid = domid;
 
@@ -614,7 +640,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     INIT_LIST_HEAD(&s->ioreq_vcpu_list);
     spin_lock_init(&s->bufioreq_lock);
 
-    rc = hvm_ioreq_server_alloc_rangesets(s, is_default);
+    rc = hvm_ioreq_server_alloc_rangesets(s, id);
     if ( rc )
         return rc;
 
@@ -622,13 +648,13 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
         s->bufioreq_atomic = true;
 
     rc = hvm_ioreq_server_setup_pages(
-             s, is_default, bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF);
+             s, bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF);
     if ( rc )
         goto fail_map;
 
     for_each_vcpu ( d, v )
     {
-        rc = hvm_ioreq_server_add_vcpu(s, is_default, v);
+        rc = hvm_ioreq_server_add_vcpu(s, v);
         if ( rc )
             goto fail_add;
     }
@@ -637,47 +663,20 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 
  fail_add:
     hvm_ioreq_server_remove_all_vcpus(s);
-    hvm_ioreq_server_unmap_pages(s, is_default);
+    hvm_ioreq_server_unmap_pages(s);
 
  fail_map:
-    hvm_ioreq_server_free_rangesets(s, is_default);
+    hvm_ioreq_server_free_rangesets(s);
 
     return rc;
 }
 
-static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s,
-                                    bool is_default)
+static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
 {
     ASSERT(!s->enabled);
     hvm_ioreq_server_remove_all_vcpus(s);
-    hvm_ioreq_server_unmap_pages(s, is_default);
-    hvm_ioreq_server_free_rangesets(s, is_default);
-}
-
-static ioservid_t next_ioservid(struct domain *d)
-{
-    struct hvm_ioreq_server *s;
-    ioservid_t id;
-
-    ASSERT(spin_is_locked(&d->arch.hvm_domain.ioreq_server.lock));
-
-    id = d->arch.hvm_domain.ioreq_server.id;
-
- again:
-    id++;
-
-    /* Check for uniqueness */
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
-    {
-        if ( id == s->id )
-            goto again;
-    }
-
-    d->arch.hvm_domain.ioreq_server.id = id;
-
-    return id;
+    hvm_ioreq_server_unmap_pages(s);
+    hvm_ioreq_server_free_rangesets(s);
 }
 
 int hvm_create_ioreq_server(struct domain *d, domid_t domid,
@@ -685,52 +684,66 @@ int hvm_create_ioreq_server(struct domain *d, domid_t domid,
                             ioservid_t *id)
 {
     struct hvm_ioreq_server *s;
+    unsigned int i;
     int rc;
 
     if ( bufioreq_handling > HVM_IOREQSRV_BUFIOREQ_ATOMIC )
         return -EINVAL;
 
-    rc = -ENOMEM;
     s = xzalloc(struct hvm_ioreq_server);
     if ( !s )
-        goto fail1;
+        return -ENOMEM;
 
     domain_pause(d);
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    rc = -EEXIST;
-    if ( is_default && d->arch.hvm_domain.default_ioreq_server != NULL )
-        goto fail2;
-
-    rc = hvm_ioreq_server_init(s, d, domid, is_default, bufioreq_handling,
-                               next_ioservid(d));
-    if ( rc )
-        goto fail3;
-
-    list_add(&s->list_entry,
-             &d->arch.hvm_domain.ioreq_server.list);
-
     if ( is_default )
     {
-        d->arch.hvm_domain.default_ioreq_server = s;
-        hvm_ioreq_server_enable(s, true);
+        i = DEFAULT_IOSERVID;
+
+        rc = -EEXIST;
+        if ( get_ioreq_server(d, i) )
+            goto fail;
+    }
+    else
+    {
+        for ( i = 0; i < MAX_NR_IOREQ_SERVERS; i++ )
+        {
+            if ( i != DEFAULT_IOSERVID && !get_ioreq_server(d, i) )
+                break;
+        }
+
+        rc = -ENOSPC;
+        if ( i >= MAX_NR_IOREQ_SERVERS )
+            goto fail;
     }
 
+    set_ioreq_server(d, i, s);
+
+    rc = hvm_ioreq_server_init(s, d, domid, bufioreq_handling, i);
+    if ( rc )
+        goto fail;
+
+    if ( IS_DEFAULT(s) )
+        hvm_ioreq_server_enable(s);
+
     if ( id )
-        *id = s->id;
+        *id = i;
+
+    d->arch.hvm_domain.ioreq_server.count++;
 
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
     domain_unpause(d);
 
     return 0;
 
- fail3:
- fail2:
+ fail:
+    set_ioreq_server(d, i, NULL);
+
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
     domain_unpause(d);
 
     xfree(s);
- fail1:
     return rc;
 }
 
@@ -741,35 +754,34 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    rc = -ENOENT;
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
-    {
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
+    s = get_ioreq_server(d, id);
 
-        if ( s->id != id )
-            continue;
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
 
-        domain_pause(d);
+    rc = -EPERM;
+    if ( IS_DEFAULT(s) )
+        goto out;
 
-        p2m_set_ioreq_server(d, 0, s);
+    domain_pause(d);
 
-        hvm_ioreq_server_disable(s, false);
+    p2m_set_ioreq_server(d, 0, s);
 
-        list_del(&s->list_entry);
+    hvm_ioreq_server_disable(s);
+    hvm_ioreq_server_deinit(s);
 
-        hvm_ioreq_server_deinit(s, false);
+    domain_unpause(d);
 
-        domain_unpause(d);
+    ASSERT(d->arch.hvm_domain.ioreq_server.count);
+    --d->arch.hvm_domain.ioreq_server.count;
 
-        xfree(s);
+    set_ioreq_server(d, id, NULL);
+    xfree(s);
 
-        rc = 0;
-        break;
-    }
+    rc = 0;
 
+ out:
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     return rc;
@@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    rc = -ENOENT;
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
-    {
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
+    s = get_ioreq_server(d, id);
 
-        if ( s->id != id )
-            continue;
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
 
-        *ioreq_gfn = s->ioreq.gfn;
+    rc = -EOPNOTSUPP;
+    if ( IS_DEFAULT(s) )
+        goto out;
 
-        if ( s->bufioreq.va != NULL )
-        {
-            *bufioreq_gfn = s->bufioreq.gfn;
-            *bufioreq_port = s->bufioreq_evtchn;
-        }
+    *ioreq_gfn = s->ioreq.gfn;
 
-        rc = 0;
-        break;
+    if ( s->bufioreq.va != NULL )
+    {
+        *bufioreq_gfn = s->bufioreq.gfn;
+        *bufioreq_port = s->bufioreq_evtchn;
     }
 
+    rc = 0;
+
+ out:
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     return rc;
@@ -818,48 +828,45 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint64_t end)
 {
     struct hvm_ioreq_server *s;
+    struct rangeset *r;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
+    s = get_ioreq_server(d, id);
+
     rc = -ENOENT;
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
-    {
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
+    if ( !s )
+        goto out;
 
-        if ( s->id == id )
-        {
-            struct rangeset *r;
+    rc = -EOPNOTSUPP;
+    if ( IS_DEFAULT(s) )
+        goto out;
 
-            switch ( type )
-            {
-            case XEN_DMOP_IO_RANGE_PORT:
-            case XEN_DMOP_IO_RANGE_MEMORY:
-            case XEN_DMOP_IO_RANGE_PCI:
-                r = s->range[type];
-                break;
+    switch ( type )
+    {
+    case XEN_DMOP_IO_RANGE_PORT:
+    case XEN_DMOP_IO_RANGE_MEMORY:
+    case XEN_DMOP_IO_RANGE_PCI:
+        r = s->range[type];
+        break;
 
-            default:
-                r = NULL;
-                break;
-            }
+    default:
+        r = NULL;
+        break;
+    }
 
-            rc = -EINVAL;
-            if ( !r )
-                break;
+    rc = -EINVAL;
+    if ( !r )
+        goto out;
 
-            rc = -EEXIST;
-            if ( rangeset_overlaps_range(r, start, end) )
-                break;
+    rc = -EEXIST;
+    if ( rangeset_overlaps_range(r, start, end) )
+        goto out;
 
-            rc = rangeset_add_range(r, start, end);
-            break;
-        }
-    }
+    rc = rangeset_add_range(r, start, end);
 
+ out:
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     return rc;
@@ -870,48 +877,45 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
                                          uint64_t end)
 {
     struct hvm_ioreq_server *s;
+    struct rangeset *r;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
+    s = get_ioreq_server(d, id);
+
     rc = -ENOENT;
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
-    {
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
+    if ( !s )
+        goto out;
 
-        if ( s->id == id )
-        {
-            struct rangeset *r;
+    rc = -EOPNOTSUPP;
+    if ( IS_DEFAULT(s) )
+        goto out;
 
-            switch ( type )
-            {
-            case XEN_DMOP_IO_RANGE_PORT:
-            case XEN_DMOP_IO_RANGE_MEMORY:
-            case XEN_DMOP_IO_RANGE_PCI:
-                r = s->range[type];
-                break;
+    switch ( type )
+    {
+    case XEN_DMOP_IO_RANGE_PORT:
+    case XEN_DMOP_IO_RANGE_MEMORY:
+    case XEN_DMOP_IO_RANGE_PCI:
+        r = s->range[type];
+        break;
 
-            default:
-                r = NULL;
-                break;
-            }
+    default:
+        r = NULL;
+        break;
+    }
 
-            rc = -EINVAL;
-            if ( !r )
-                break;
+    rc = -EINVAL;
+    if ( !r )
+        goto out;
 
-            rc = -ENOENT;
-            if ( !rangeset_contains_range(r, start, end) )
-                break;
+    rc = -ENOENT;
+    if ( !rangeset_contains_range(r, start, end) )
+        goto out;
 
-            rc = rangeset_remove_range(r, start, end);
-            break;
-        }
-    }
+    rc = rangeset_remove_range(r, start, end);
 
+ out:
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     return rc;
@@ -939,20 +943,14 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    rc = -ENOENT;
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
-    {
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
+    s = get_ioreq_server(d, id);
 
-        if ( s->id == id )
-        {
-            rc = p2m_set_ioreq_server(d, flags, s);
-            break;
-        }
-    }
+    if ( !s )
+        rc = -ENOENT;
+    else if ( IS_DEFAULT(s) )
+        rc = -EOPNOTSUPP;
+    else
+        rc = p2m_set_ioreq_server(d, flags, s);
 
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
@@ -970,38 +968,33 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                bool enabled)
 {
-    struct list_head *entry;
+    struct hvm_ioreq_server *s;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    rc = -ENOENT;
-    list_for_each ( entry,
-                    &d->arch.hvm_domain.ioreq_server.list )
-    {
-        struct hvm_ioreq_server *s = list_entry(entry,
-                                                struct hvm_ioreq_server,
-                                                list_entry);
+    s = get_ioreq_server(d, id);
 
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
 
-        if ( s->id != id )
-            continue;
+    rc = -EOPNOTSUPP;
+    if ( IS_DEFAULT(s) )
+        goto out;
 
-        domain_pause(d);
+    domain_pause(d);
 
-        if ( enabled )
-            hvm_ioreq_server_enable(s, false);
-        else
-            hvm_ioreq_server_disable(s, false);
+    if ( enabled )
+        hvm_ioreq_server_enable(s);
+    else
+        hvm_ioreq_server_disable(s);
 
-        domain_unpause(d);
+    domain_unpause(d);
 
-        rc = 0;
-        break;
-    }
+    rc = 0;
 
+ out:
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
     return rc;
 }
@@ -1009,17 +1002,17 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
 int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
 {
     struct hvm_ioreq_server *s;
+    unsigned int id;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
-        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
+        if ( !s )
+            continue;
 
-        rc = hvm_ioreq_server_add_vcpu(s, is_default, v);
+        rc = hvm_ioreq_server_add_vcpu(s, v);
         if ( rc )
             goto fail;
     }
@@ -1029,10 +1022,15 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
     return 0;
 
  fail:
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    while ( id-- != 0 )
+    {
+        s = get_ioreq_server(d, id);
+
+        if ( !s )
+            continue;
+
         hvm_ioreq_server_remove_vcpu(s, v);
+    }
 
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
@@ -1042,43 +1040,45 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
 void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
 {
     struct hvm_ioreq_server *s;
+    unsigned int id;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        if ( !s )
+            continue;
+
         hvm_ioreq_server_remove_vcpu(s, v);
+    }
 
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 }
 
 void hvm_destroy_all_ioreq_servers(struct domain *d)
 {
-    struct hvm_ioreq_server *s, *next;
+    struct hvm_ioreq_server *s;
+    unsigned int id;
 
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     /* No need to domain_pause() as the domain is being torn down */
 
-    list_for_each_entry_safe ( s,
-                               next,
-                               &d->arch.hvm_domain.ioreq_server.list,
-                               list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
-        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
-
-        hvm_ioreq_server_disable(s, is_default);
-
-        if ( is_default )
-            d->arch.hvm_domain.default_ioreq_server = NULL;
+        if ( !s )
+            continue;
 
-        list_del(&s->list_entry);
+        hvm_ioreq_server_disable(s);
+        hvm_ioreq_server_deinit(s);
 
-        hvm_ioreq_server_deinit(s, is_default);
+        ASSERT(d->arch.hvm_domain.ioreq_server.count);
+        --d->arch.hvm_domain.ioreq_server.count;
 
+        set_ioreq_server(d, id, NULL);
         xfree(s);
     }
+    ASSERT(!d->arch.hvm_domain.ioreq_server.count);
 
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 }
@@ -1111,7 +1111,7 @@ int hvm_set_dm_domain(struct domain *d, domid_t domid)
      * still be set and thus, when the server is created, it will have
      * the correct domid.
      */
-    s = d->arch.hvm_domain.default_ioreq_server;
+    s = get_ioreq_server(d, DEFAULT_IOSERVID);
     if ( !s )
         goto done;
 
@@ -1164,12 +1164,13 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
     uint32_t cf8;
     uint8_t type;
     uint64_t addr;
+    unsigned int id;
 
-    if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
+    if ( !d->arch.hvm_domain.ioreq_server.count )
         return NULL;
 
     if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
-        return d->arch.hvm_domain.default_ioreq_server;
+        return get_ioreq_server(d, DEFAULT_IOSERVID);
 
     cf8 = d->arch.hvm_domain.pci_cf8;
 
@@ -1209,16 +1210,11 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
         addr = p->addr;
     }
 
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
+    FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct rangeset *r;
 
-        if ( s == d->arch.hvm_domain.default_ioreq_server )
-            continue;
-
-        if ( !s->enabled )
+        if ( !s || IS_DEFAULT(s) )
             continue;
 
         r = s->range[type];
@@ -1251,7 +1247,7 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
         }
     }
 
-    return d->arch.hvm_domain.default_ioreq_server;
+    return get_ioreq_server(d, DEFAULT_IOSERVID);
 }
 
 static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
@@ -1410,13 +1406,16 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
 {
     struct domain *d = current->domain;
     struct hvm_ioreq_server *s;
-    unsigned int failed = 0;
+    unsigned int id, failed = 0;
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        if ( !s )
+            continue;
 
-    list_for_each_entry ( s,
-                          &d->arch.hvm_domain.ioreq_server.list,
-                          list_entry )
         if ( hvm_send_ioreq(s, p, buffered) == X86EMUL_UNHANDLEABLE )
             failed++;
+    }
 
     return failed;
 }
@@ -1436,7 +1435,6 @@ static int hvm_access_cf8(
 void hvm_ioreq_init(struct domain *d)
 {
     spin_lock_init(&d->arch.hvm_domain.ioreq_server.lock);
-    INIT_LIST_HEAD(&d->arch.hvm_domain.ioreq_server.list);
 
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 }
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 7f128c05ff..01fe8a72d8 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -60,7 +60,6 @@ struct hvm_ioreq_server {
 
     /* Domain id of emulating domain */
     domid_t                domid;
-    ioservid_t             id;
     struct hvm_ioreq_page  ioreq;
     struct list_head       ioreq_vcpu_list;
     struct hvm_ioreq_page  bufioreq;
@@ -100,6 +99,9 @@ struct hvm_pi_ops {
     void (*do_resume)(struct vcpu *v);
 };
 
+#define MAX_NR_IOREQ_SERVERS 8
+#define DEFAULT_IOSERVID 0
+
 struct hvm_domain {
     /* Guest page range used for non-default ioreq servers */
     struct {
@@ -109,11 +111,10 @@ struct hvm_domain {
 
     /* Lock protects all other values in the sub-struct and the default */
     struct {
-        spinlock_t       lock;
-        ioservid_t       id;
-        struct list_head list;
+        spinlock_t              lock;
+        struct hvm_ioreq_server *server[MAX_NR_IOREQ_SERVERS];
+        unsigned int            count;
     } ioreq_server;
-    struct hvm_ioreq_server *default_ioreq_server;
 
     /* Cached CF8 for guest PCI config cycles */
     uint32_t                pci_cf8;
-- 
2.11.0



* [PATCH v7 09/12] x86/hvm/ioreq: simplify code and use consistent naming
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (7 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 15:26   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 10/12] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page Paul Durrant
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

This patch re-works much of the ioreq server initialization and teardown
code:

- The hvm_map/unmap_ioreq_gfn() functions are expanded to call through
  to hvm_alloc/free_ioreq_gfn() rather than expecting them to be called
  separately by outer functions.
- Several functions now test the validity of the hvm_ioreq_page gfn value
  to determine whether they need to act. This means they can safely be called
  for the bufioreq page even when it is not used (see the sketch below).
- hvm_add/remove_ioreq_gfn() simply return in the case of the default
  IOREQ server so callers no longer need to test before calling.
- hvm_ioreq_server_setup_pages() is renamed to hvm_ioreq_server_map_pages()
  to mirror the existing hvm_ioreq_server_unmap_pages().

All of this significantly shortens the code.
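
For illustration, the validity test mentioned in the second bullet amounts to
the following pattern (a sketch of the behaviour using names from the diff
below, not additional code in this patch):

    static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
    {
        struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;

        /* Nothing was mapped for this page, so there is nothing to undo. */
        if ( iorp->gfn == gfn_x(INVALID_GFN) )
            return;

        /* ... tear down the mapping and, for non-default servers,
           return the gfn to the free pool ... */
    }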

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

v3:
 - Rebased on top of 's->is_default' to 'IS_DEFAULT(s)' changes.
 - Minor updates in response to review comments from Roger.
---
 xen/arch/x86/hvm/ioreq.c | 183 ++++++++++++++++++-----------------------------
 1 file changed, 69 insertions(+), 114 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index fe29004e87..5f38d39ce2 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -211,63 +211,75 @@ bool handle_hvm_io_completion(struct vcpu *v)
     return true;
 }
 
-static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
+static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
+    struct domain *d = s->domain;
     unsigned int i;
-    int rc;
 
-    rc = -ENOMEM;
+    ASSERT(!IS_DEFAULT(s));
+
     for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
     {
         if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-        {
-            *gfn = d->arch.hvm_domain.ioreq_gfn.base + i;
-            rc = 0;
-            break;
-        }
+            return d->arch.hvm_domain.ioreq_gfn.base + i;
     }
 
-    return rc;
+    return gfn_x(INVALID_GFN);
 }
 
-static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
+                               unsigned long gfn)
 {
+    struct domain *d = s->domain;
     unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
 
-    if ( gfn != gfn_x(INVALID_GFN) )
-        set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
+    ASSERT(!IS_DEFAULT(s));
+    ASSERT(gfn != gfn_x(INVALID_GFN));
+
+    set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool buf)
+static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
+    if ( iorp->gfn == gfn_x(INVALID_GFN) )
+        return;
+
     destroy_ring_for_helper(&iorp->va, iorp->page);
+    iorp->page = NULL;
+
+    if ( !IS_DEFAULT(s) )
+        hvm_free_ioreq_gfn(s, iorp->gfn);
+
+    iorp->gfn = gfn_x(INVALID_GFN);
 }
 
-static int hvm_map_ioreq_page(
-    struct hvm_ioreq_server *s, bool buf, unsigned long gfn)
+static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
     struct domain *d = s->domain;
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-    struct page_info *page;
-    void *va;
     int rc;
 
-    if ( (rc = prepare_ring_for_helper(d, gfn, &page, &va)) )
-        return rc;
-
-    if ( (iorp->va != NULL) || d->is_dying )
-    {
-        destroy_ring_for_helper(&va, page);
+    if ( d->is_dying )
         return -EINVAL;
-    }
 
-    iorp->va = va;
-    iorp->page = page;
-    iorp->gfn = gfn;
+    if ( IS_DEFAULT(s) )
+        iorp->gfn = buf ?
+                    d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+                    d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+    else
+        iorp->gfn = hvm_alloc_ioreq_gfn(s);
+
+    if ( iorp->gfn == gfn_x(INVALID_GFN) )
+        return -ENOMEM;
 
-    return 0;
+    rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+
+    if ( rc )
+        hvm_unmap_ioreq_gfn(s, buf);
+
+    return rc;
 }
 
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
@@ -283,8 +295,7 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
         if ( !s )
             continue;
 
-        if ( (s->ioreq.va && s->ioreq.page == page) ||
-             (s->bufioreq.va && s->bufioreq.page == page) )
+        if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
         {
             found = true;
             break;
@@ -296,20 +307,30 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
     return found;
 }
 
-static void hvm_remove_ioreq_gfn(
-    struct domain *d, struct hvm_ioreq_page *iorp)
+static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
+
 {
+    struct domain *d = s->domain;
+    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+    if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+        return;
+
     if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
                                    _mfn(page_to_mfn(iorp->page)), 0) )
         domain_crash(d);
     clear_page(iorp->va);
 }
 
-static int hvm_add_ioreq_gfn(
-    struct domain *d, struct hvm_ioreq_page *iorp)
+static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
+    struct domain *d = s->domain;
+    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     int rc;
 
+    if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+        return 0;
+
     clear_page(iorp->va);
 
     rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
@@ -333,7 +354,6 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
     }
 }
 
-
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
                                      struct vcpu *v)
 {
@@ -445,78 +465,25 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
 }
 
 static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
-                                      unsigned long ioreq_gfn,
-                                      unsigned long bufioreq_gfn)
+                                      bool handle_bufioreq)
 {
     int rc;
 
-    rc = hvm_map_ioreq_page(s, false, ioreq_gfn);
-    if ( rc )
-        return rc;
-
-    if ( bufioreq_gfn != gfn_x(INVALID_GFN) )
-        rc = hvm_map_ioreq_page(s, true, bufioreq_gfn);
-
-    if ( rc )
-        hvm_unmap_ioreq_page(s, false);
-
-    return rc;
-}
-
-static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
-                                        bool handle_bufioreq)
-{
-    struct domain *d = s->domain;
-    unsigned long ioreq_gfn = gfn_x(INVALID_GFN);
-    unsigned long bufioreq_gfn = gfn_x(INVALID_GFN);
-    int rc;
-
-    if ( IS_DEFAULT(s) )
-    {
-        /*
-         * The default ioreq server must handle buffered ioreqs, for
-         * backwards compatibility.
-         */
-        ASSERT(handle_bufioreq);
-        return hvm_ioreq_server_map_pages(s,
-                   d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN],
-                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN]);
-    }
-
-    rc = hvm_alloc_ioreq_gfn(d, &ioreq_gfn);
+    rc = hvm_map_ioreq_gfn(s, false);
 
     if ( !rc && handle_bufioreq )
-        rc = hvm_alloc_ioreq_gfn(d, &bufioreq_gfn);
-
-    if ( !rc )
-        rc = hvm_ioreq_server_map_pages(s, ioreq_gfn, bufioreq_gfn);
+        rc = hvm_map_ioreq_gfn(s, true);
 
     if ( rc )
-    {
-        hvm_free_ioreq_gfn(d, ioreq_gfn);
-        hvm_free_ioreq_gfn(d, bufioreq_gfn);
-    }
+        hvm_unmap_ioreq_gfn(s, false);
 
     return rc;
 }
 
 static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
 {
-    struct domain *d = s->domain;
-    bool handle_bufioreq = !!s->bufioreq.va;
-
-    if ( handle_bufioreq )
-        hvm_unmap_ioreq_page(s, true);
-
-    hvm_unmap_ioreq_page(s, false);
-
-    if ( !IS_DEFAULT(s) )
-    {
-        if ( handle_bufioreq )
-            hvm_free_ioreq_gfn(d, s->bufioreq.gfn);
-
-        hvm_free_ioreq_gfn(d, s->ioreq.gfn);
-    }
+    hvm_unmap_ioreq_gfn(s, true);
+    hvm_unmap_ioreq_gfn(s, false);
 }
 
 static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
@@ -574,22 +541,15 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
 
 static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
 {
-    struct domain *d = s->domain;
     struct hvm_ioreq_vcpu *sv;
-    bool handle_bufioreq = !!s->bufioreq.va;
 
     spin_lock(&s->lock);
 
     if ( s->enabled )
         goto done;
 
-    if ( !IS_DEFAULT(s) )
-    {
-        hvm_remove_ioreq_gfn(d, &s->ioreq);
-
-        if ( handle_bufioreq )
-            hvm_remove_ioreq_gfn(d, &s->bufioreq);
-    }
+    hvm_remove_ioreq_gfn(s, false);
+    hvm_remove_ioreq_gfn(s, true);
 
     s->enabled = true;
 
@@ -604,21 +564,13 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
 
 static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
 {
-    struct domain *d = s->domain;
-    bool handle_bufioreq = !!s->bufioreq.va;
-
     spin_lock(&s->lock);
 
     if ( !s->enabled )
         goto done;
 
-    if ( !IS_DEFAULT(s) )
-    {
-        if ( handle_bufioreq )
-            hvm_add_ioreq_gfn(d, &s->bufioreq);
-
-        hvm_add_ioreq_gfn(d, &s->ioreq);
-    }
+    hvm_add_ioreq_gfn(s, true);
+    hvm_add_ioreq_gfn(s, false);
 
     s->enabled = false;
 
@@ -640,6 +592,9 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     INIT_LIST_HEAD(&s->ioreq_vcpu_list);
     spin_lock_init(&s->bufioreq_lock);
 
+    s->ioreq.gfn = gfn_x(INVALID_GFN);
+    s->bufioreq.gfn = gfn_x(INVALID_GFN);
+
     rc = hvm_ioreq_server_alloc_rangesets(s, id);
     if ( rc )
         return rc;
@@ -647,7 +602,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     if ( bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC )
         s->bufioreq_atomic = true;
 
-    rc = hvm_ioreq_server_setup_pages(
+    rc = hvm_ioreq_server_map_pages(
              s, bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF);
     if ( rc )
         goto fail_map;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 10/12] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (8 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 09/12] x86/hvm/ioreq: simplify code and use consistent naming Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 15:27   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requested Paul Durrant
  2017-09-18 15:31 ` [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type Paul Durrant
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

This patch adjusts the ioreq server code to use type-safe gfn_t values
where possible. No functional change.
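
For background (not part of this patch), the type safety comes from wrapping
the raw frame number in a single-member struct, so the compiler rejects
accidental mixing of gfn_t with plain integers. A minimal, standalone
approximation of what Xen's typesafe machinery provides (a sketch only; Xen
generates the real thing via a macro):

    #include <stdbool.h>

    typedef struct { unsigned long gfn; } gfn_t;

    static inline gfn_t _gfn(unsigned long g) { return (gfn_t){ .gfn = g }; }
    static inline unsigned long gfn_x(gfn_t g) { return g.gfn; }
    static inline bool gfn_eq(gfn_t a, gfn_t b) { return gfn_x(a) == gfn_x(b); }

    #define INVALID_GFN _gfn(~0UL)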

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/hvm/ioreq.c         | 44 ++++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/domain.h |  2 +-
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 5f38d39ce2..3e2a3f62ba 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -211,7 +211,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
     return true;
 }
 
-static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
     struct domain *d = s->domain;
     unsigned int i;
@@ -221,20 +221,19 @@ static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
     for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
     {
         if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-            return d->arch.hvm_domain.ioreq_gfn.base + i;
+            return _gfn(d->arch.hvm_domain.ioreq_gfn.base + i);
     }
 
-    return gfn_x(INVALID_GFN);
+    return INVALID_GFN;
 }
 
-static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
-                               unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
 {
     struct domain *d = s->domain;
-    unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
+    unsigned int i = gfn_x(gfn) - d->arch.hvm_domain.ioreq_gfn.base;
 
     ASSERT(!IS_DEFAULT(s));
-    ASSERT(gfn != gfn_x(INVALID_GFN));
+    ASSERT(!gfn_eq(gfn, INVALID_GFN));
 
     set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
@@ -243,7 +242,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-    if ( iorp->gfn == gfn_x(INVALID_GFN) )
+    if ( gfn_eq(iorp->gfn, INVALID_GFN) )
         return;
 
     destroy_ring_for_helper(&iorp->va, iorp->page);
@@ -252,7 +251,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     if ( !IS_DEFAULT(s) )
         hvm_free_ioreq_gfn(s, iorp->gfn);
 
-    iorp->gfn = gfn_x(INVALID_GFN);
+    iorp->gfn = INVALID_GFN;
 }
 
 static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
@@ -265,16 +264,17 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
         return -EINVAL;
 
     if ( IS_DEFAULT(s) )
-        iorp->gfn = buf ?
-                    d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
-                    d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+        iorp->gfn = _gfn(buf ?
+                         d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+                         d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN]);
     else
         iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-    if ( iorp->gfn == gfn_x(INVALID_GFN) )
+    if ( gfn_eq(iorp->gfn, INVALID_GFN) )
         return -ENOMEM;
 
-    rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+    rc = prepare_ring_for_helper(d, gfn_x(iorp->gfn), &iorp->page,
+                                 &iorp->va);
 
     if ( rc )
         hvm_unmap_ioreq_gfn(s, buf);
@@ -313,10 +313,10 @@ static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     struct domain *d = s->domain;
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-    if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+    if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
         return;
 
-    if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
+    if ( guest_physmap_remove_page(d, iorp->gfn,
                                    _mfn(page_to_mfn(iorp->page)), 0) )
         domain_crash(d);
     clear_page(iorp->va);
@@ -328,12 +328,12 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     int rc;
 
-    if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+    if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
         return 0;
 
     clear_page(iorp->va);
 
-    rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
+    rc = guest_physmap_add_page(d, iorp->gfn,
                                 _mfn(page_to_mfn(iorp->page)), 0);
     if ( rc == 0 )
         paging_mark_dirty(d, _mfn(page_to_mfn(iorp->page)));
@@ -592,8 +592,8 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     INIT_LIST_HEAD(&s->ioreq_vcpu_list);
     spin_lock_init(&s->bufioreq_lock);
 
-    s->ioreq.gfn = gfn_x(INVALID_GFN);
-    s->bufioreq.gfn = gfn_x(INVALID_GFN);
+    s->ioreq.gfn = INVALID_GFN;
+    s->bufioreq.gfn = INVALID_GFN;
 
     rc = hvm_ioreq_server_alloc_rangesets(s, id);
     if ( rc )
@@ -762,11 +762,11 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     if ( IS_DEFAULT(s) )
         goto out;
 
-    *ioreq_gfn = s->ioreq.gfn;
+    *ioreq_gfn = gfn_x(s->ioreq.gfn);
 
     if ( s->bufioreq.va != NULL )
     {
-        *bufioreq_gfn = s->bufioreq.gfn;
+        *bufioreq_gfn = gfn_x(s->bufioreq.gfn);
         *bufioreq_port = s->bufioreq_evtchn;
     }
 
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 01fe8a72d8..2be9353e37 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -36,7 +36,7 @@
 #include <public/hvm/dm_op.h>
 
 struct hvm_ioreq_page {
-    unsigned long gfn;
+    gfn_t gfn;
     struct page_info *page;
     void *va;
 };
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requested
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (9 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 10/12] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-25 16:00   ` Jan Beulich
  2017-09-18 15:31 ` [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type Paul Durrant
  11 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Paul Durrant, Jan Beulich

A subsequent patch will introduce a new scheme to allow an emulator to
map ioreq server pages directly from Xen rather than the guest P2M.

This patch lays the groundwork for that change by deferring mapping of
gfns until their values are requested by an emulator. To that end, the
pad field of the xen_dm_op_get_ioreq_server_info structure is re-purposed
as a flags field and a new flag, XEN_DMOP_no_gfns, is defined which modifies
the behaviour of XEN_DMOP_get_ioreq_server_info to allow the caller to avoid
requesting the gfn values.
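
For example, an emulator that intends to obtain the pages by other means can
simply pass NULL for both gfn pointers and the library will set
XEN_DMOP_no_gfns on its behalf. A usage sketch (the dmod handle, domid and
server id are assumed to have been set up by the caller):

    evtchn_port_t bufioreq_port;
    int rc;

    /* NULL gfn pointers: the library sets XEN_DMOP_no_gfns, so Xen will
     * not map the ioreq pages into the guest P2M as a side effect. */
    rc = xendevicemodel_get_ioreq_server_info(dmod, domid, id,
                                              NULL, NULL, &bufioreq_port);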

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>

v3:
 - Updated in response to review comments from Wei and Roger.
 - Added a HANDLE_BUFIOREQ macro to make the code neater.
 - This patch no longer introduces a security vulnerability since there
   is now an explicit limit on the number of ioreq servers that may be
   created for any one domain.
---
 tools/libs/devicemodel/core.c                   |  8 +++++
 tools/libs/devicemodel/include/xendevicemodel.h |  6 ++--
 xen/arch/x86/hvm/dm.c                           |  9 ++++--
 xen/arch/x86/hvm/ioreq.c                        | 41 +++++++++++++------------
 xen/include/asm-x86/hvm/domain.h                |  2 +-
 xen/include/public/hvm/dm_op.h                  | 32 +++++++++++--------
 6 files changed, 59 insertions(+), 39 deletions(-)

diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
index fcb260d29b..28958934bf 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -188,6 +188,14 @@ int xendevicemodel_get_ioreq_server_info(
 
     data->id = id;
 
+    /*
+     * If the caller is not requesting gfn values then instruct the
+     * hypercall not to retrieve them as this may cause them to be
+     * mapped.
+     */
+    if (!ioreq_gfn && !bufioreq_gfn)
+        data->flags |= XEN_DMOP_no_gfns;
+
     rc = xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
     if (rc)
         return rc;
diff --git a/tools/libs/devicemodel/include/xendevicemodel.h b/tools/libs/devicemodel/include/xendevicemodel.h
index 13216db04a..d73a76da35 100644
--- a/tools/libs/devicemodel/include/xendevicemodel.h
+++ b/tools/libs/devicemodel/include/xendevicemodel.h
@@ -61,11 +61,11 @@ int xendevicemodel_create_ioreq_server(
  * @parm domid the domain id to be serviced
  * @parm id the IOREQ Server id.
  * @parm ioreq_gfn pointer to a xen_pfn_t to receive the synchronous ioreq
- *                  gfn
+ *                  gfn. (May be NULL if not required)
  * @parm bufioreq_gfn pointer to a xen_pfn_t to receive the buffered ioreq
- *                    gfn
+ *                    gfn. (May be NULL if not required)
  * @parm bufioreq_port pointer to a evtchn_port_t to receive the buffered
- *                     ioreq event channel
+ *                     ioreq event channel. (May be NULL if not required)
  * @return 0 on success, -1 on failure.
  */
 int xendevicemodel_get_ioreq_server_info(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 87ef4b6ca9..c020f0c99f 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -418,16 +418,19 @@ static int dm_op(const struct dmop_args *op_args)
     {
         struct xen_dm_op_get_ioreq_server_info *data =
             &op.u.get_ioreq_server_info;
+        const uint16_t valid_flags = XEN_DMOP_no_gfns;
 
         const_op = false;
 
         rc = -EINVAL;
-        if ( data->pad )
+        if ( data->flags & ~valid_flags )
             break;
 
         rc = hvm_get_ioreq_server_info(d, data->id,
-                                       &data->ioreq_gfn,
-                                       &data->bufioreq_gfn,
+                                       (data->flags & XEN_DMOP_no_gfns) ?
+                                       NULL : &data->ioreq_gfn,
+                                       (data->flags & XEN_DMOP_no_gfns) ?
+                                       NULL : &data->bufioreq_gfn,
                                        &data->bufioreq_port);
         break;
     }
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 3e2a3f62ba..1fbc81fb15 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -354,6 +354,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
     }
 }
 
+#define HANDLE_BUFIOREQ(s) \
+    (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
                                      struct vcpu *v)
 {
@@ -375,7 +378,7 @@ static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
 
     sv->ioreq_evtchn = rc;
 
-    if ( v->vcpu_id == 0 && s->bufioreq.va != NULL )
+    if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
     {
         struct domain *d = s->domain;
 
@@ -426,7 +429,7 @@ static void hvm_ioreq_server_remove_vcpu(struct hvm_ioreq_server *s,
 
         list_del(&sv->list_entry);
 
-        if ( v->vcpu_id == 0 && s->bufioreq.va != NULL )
+        if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
             free_xen_event_channel(v->domain, s->bufioreq_evtchn);
 
         free_xen_event_channel(v->domain, sv->ioreq_evtchn);
@@ -453,7 +456,7 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
 
         list_del(&sv->list_entry);
 
-        if ( v->vcpu_id == 0 && s->bufioreq.va != NULL )
+        if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
             free_xen_event_channel(v->domain, s->bufioreq_evtchn);
 
         free_xen_event_channel(v->domain, sv->ioreq_evtchn);
@@ -464,14 +467,13 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
-                                      bool handle_bufioreq)
+static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
 {
     int rc;
 
     rc = hvm_map_ioreq_gfn(s, false);
 
-    if ( !rc && handle_bufioreq )
+    if ( !rc && HANDLE_BUFIOREQ(s) )
         rc = hvm_map_ioreq_gfn(s, true);
 
     if ( rc )
@@ -599,13 +601,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     if ( rc )
         return rc;
 
-    if ( bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC )
-        s->bufioreq_atomic = true;
-
-    rc = hvm_ioreq_server_map_pages(
-             s, bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF);
-    if ( rc )
-        goto fail_map;
+    s->bufioreq_handling = bufioreq_handling;
 
     for_each_vcpu ( d, v )
     {
@@ -620,9 +616,6 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     hvm_ioreq_server_remove_all_vcpus(s);
     hvm_ioreq_server_unmap_pages(s);
 
- fail_map:
-    hvm_ioreq_server_free_rangesets(s);
-
     return rc;
 }
 
@@ -762,11 +755,20 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     if ( IS_DEFAULT(s) )
         goto out;
 
+    if ( ioreq_gfn || bufioreq_gfn )
+    {
+        rc = hvm_ioreq_server_map_pages(s);
+        if ( rc )
+            goto out;
+    }
+
     *ioreq_gfn = gfn_x(s->ioreq.gfn);
 
-    if ( s->bufioreq.va != NULL )
+    if ( HANDLE_BUFIOREQ(s) )
     {
-        *bufioreq_gfn = gfn_x(s->bufioreq.gfn);
+        if ( bufioreq_gfn )
+            *bufioreq_gfn = gfn_x(s->bufioreq.gfn);
+
         *bufioreq_port = s->bufioreq_evtchn;
     }
 
@@ -1280,7 +1282,8 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
     pg->ptrs.write_pointer += qw ? 2 : 1;
 
     /* Canonicalize read/write pointers to prevent their overflow. */
-    while ( s->bufioreq_atomic && qw++ < IOREQ_BUFFER_SLOT_NUM &&
+    while ( (s->bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC) &&
+            qw++ < IOREQ_BUFFER_SLOT_NUM &&
             pg->ptrs.read_pointer >= IOREQ_BUFFER_SLOT_NUM )
     {
         union bufioreq_pointers old = pg->ptrs, new;
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 2be9353e37..4491a96350 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -68,8 +68,8 @@ struct hvm_ioreq_server {
     spinlock_t             bufioreq_lock;
     evtchn_port_t          bufioreq_evtchn;
     struct rangeset        *range[NR_IO_RANGE_TYPES];
+    int                    bufioreq_handling;
     bool                   enabled;
-    bool                   bufioreq_atomic;
 };
 
 /*
diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
index 6bbab5fca3..9677bd74e7 100644
--- a/xen/include/public/hvm/dm_op.h
+++ b/xen/include/public/hvm/dm_op.h
@@ -79,28 +79,34 @@ struct xen_dm_op_create_ioreq_server {
  * XEN_DMOP_get_ioreq_server_info: Get all the information necessary to
  *                                 access IOREQ Server <id>.
  *
- * The emulator needs to map the synchronous ioreq structures and buffered
- * ioreq ring (if it exists) that Xen uses to request emulation. These are
- * hosted in the target domain's gmfns <ioreq_gfn> and <bufioreq_gfn>
- * respectively. In addition, if the IOREQ Server is handling buffered
- * emulation requests, the emulator needs to bind to event channel
- * <bufioreq_port> to listen for them. (The event channels used for
- * synchronous emulation requests are specified in the per-CPU ioreq
- * structures in <ioreq_gfn>).
- * If the IOREQ Server is not handling buffered emulation requests then the
- * values handed back in <bufioreq_gfn> and <bufioreq_port> will both be 0.
+ * If the IOREQ Server is handling buffered emulation requests, the
+ * emulator needs to bind to event channel <bufioreq_port> to listen for
+ * them. (The event channels used for synchronous emulation requests are
+ * specified in the per-CPU ioreq structures).
+ * In addition, if the XENMEM_acquire_resource memory op cannot be used,
+ * the emulator will need to map the synchronous ioreq structures and
+ * buffered ioreq ring (if it exists) from guest memory. If <flags> does
+ * not contain XEN_DMOP_no_gfns then these pages will be made available and
+ * the frame numbers passed back in gfns <ioreq_gfn> and <bufioreq_gfn>
+ * respectively. (If the IOREQ Server is not handling buffered emulation
+ * only <ioreq_gfn> will be valid).
  */
 #define XEN_DMOP_get_ioreq_server_info 2
 
 struct xen_dm_op_get_ioreq_server_info {
     /* IN - server id */
     ioservid_t id;
-    uint16_t pad;
+    /* IN - flags */
+    uint16_t flags;
+
+#define _XEN_DMOP_no_gfns 0
+#define XEN_DMOP_no_gfns (1u << _XEN_DMOP_no_gfns)
+
     /* OUT - buffered ioreq port */
     evtchn_port_t bufioreq_port;
-    /* OUT - sync ioreq gfn */
+    /* OUT - sync ioreq gfn (see block comment above) */
     uint64_aligned_t ioreq_gfn;
-    /* OUT - buffered ioreq gfn */
+    /* OUT - buffered ioreq gfn (see block comment above)*/
     uint64_aligned_t bufioreq_gfn;
 };
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
                   ` (10 preceding siblings ...)
  2017-09-18 15:31 ` [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requested Paul Durrant
@ 2017-09-18 15:31 ` Paul Durrant
  2017-09-18 16:18   ` Roger Pau Monné
  2017-09-26 12:58   ` Jan Beulich
  11 siblings, 2 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-18 15:31 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrew Cooper, Ian Jackson, Tim Deegan,
	Paul Durrant, Jan Beulich

... XENMEM_resource_ioreq_server

This patch adds support for a new resource type that can be mapped using
the XENMEM_acquire_resource memory op.

If an emulator makes use of this resource type then, instead of mapping
gfns, the IOREQ server will allocate pages from the heap. These pages
will never be present in the P2M of the guest at any point and so are
not vulnerable to any direct attack by the guest. They are only ever
accessible by Xen and any domain that has mapping privilege over the
guest (which may or may not be limited to the domain running the emulator).

NOTE: Use of the new resource type is not compatible with use of
      XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns flag is
      set.
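
As an illustrative sketch of the emulator side (the .domid and .frame_list
fields and the direct HYPERVISOR_memory_op call are assumptions about the
interface introduced earlier in the series, not taken from the hunks below),
the new resource type would be acquired roughly as follows:

    xen_pfn_t frame_list[2];
    struct xen_mem_acquire_resource xmar = {
        .domid = domid,                       /* target guest */
        .type = XENMEM_resource_ioreq_server,
        .id = ioservid,                       /* ioreq server id */
        .frame = 0,                           /* 0 = bufioreq ring, 1 = ioreq page */
        .nr_frames = 2,
    };
    int rc;

    set_xen_guest_handle(xmar.frame_list, frame_list);
    rc = HYPERVISOR_memory_op(XENMEM_acquire_resource, &xmar);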

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: George Dunlap <George.Dunlap@eu.citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>

v5:
 - Use get_ioreq_server() function rather than indexing array directly.
 - Add more explanation into comments to state that mapping guest frames
   and allocation of pages for ioreq servers are not simultaneously
   permitted.
 - Add a comment into asm/ioreq.h stating the meaning of the index
   value passed to hvm_get_ioreq_server_frame().
---
 xen/arch/x86/hvm/ioreq.c        | 131 +++++++++++++++++++++++++++++++++++++++-
 xen/arch/x86/mm.c               |  27 +++++++++
 xen/include/asm-x86/hvm/ioreq.h |   6 ++
 xen/include/public/hvm/dm_op.h  |   4 ++
 xen/include/public/memory.h     |   3 +
 5 files changed, 170 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 1fbc81fb15..0aacd7d2c2 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -260,6 +260,19 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     int rc;
 
+    if ( iorp->page )
+    {
+        /*
+         * If a page has already been allocated (which will happen on
+         * demand if hvm_get_ioreq_server_frame() is called), then
+         * mapping a guest frame is not permitted.
+         */
+        if ( gfn_eq(iorp->gfn, INVALID_GFN) )
+            return -EPERM;
+
+        return 0;
+    }
+
     if ( d->is_dying )
         return -EINVAL;
 
@@ -282,6 +295,61 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     return rc;
 }
 
+static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+    struct domain *currd = current->domain;
+    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+    if ( iorp->page )
+    {
+        /*
+         * If a guest frame has already been mapped (which may happen
+         * on demand if hvm_get_ioreq_server_info() is called), then
+         * allocating a page is not permitted.
+         */
+        if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
+            return -EPERM;
+
+        return 0;
+    }
+
+    /*
+     * Allocated IOREQ server pages are assigned to the emulating
+     * domain, not the target domain. This is because the emulator is
+     * likely to be destroyed after the target domain has been torn
+     * down, and we must use MEMF_no_refcount otherwise page allocation
+     * could fail if the emulating domain has already reached its
+     * maximum allocation.
+     */
+    iorp->page = alloc_domheap_page(currd, MEMF_no_refcount);
+    if ( !iorp->page )
+        return -ENOMEM;
+
+    iorp->va = __map_domain_page_global(iorp->page);
+    if ( !iorp->va )
+    {
+        iorp->page = NULL;
+        return -ENOMEM;
+    }
+
+    clear_page(iorp->va);
+    return 0;
+}
+
+static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+    if ( !iorp->page )
+        return;
+
+    unmap_domain_page_global(iorp->va);
+    iorp->va = NULL;
+
+    put_page(iorp->page);
+    iorp->page = NULL;
+}
+
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
     const struct hvm_ioreq_server *s;
@@ -488,6 +556,27 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
     hvm_unmap_ioreq_gfn(s, false);
 }
 
+static int hvm_ioreq_server_alloc_pages(struct hvm_ioreq_server *s)
+{
+    int rc;
+
+    rc = hvm_alloc_ioreq_mfn(s, false);
+
+    if ( !rc && (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF) )
+        rc = hvm_alloc_ioreq_mfn(s, true);
+
+    if ( rc )
+        hvm_free_ioreq_mfn(s, false);
+
+    return rc;
+}
+
+static void hvm_ioreq_server_free_pages(struct hvm_ioreq_server *s)
+{
+    hvm_free_ioreq_mfn(s, true);
+    hvm_free_ioreq_mfn(s, false);
+}
+
 static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
 {
     unsigned int i;
@@ -614,7 +703,18 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 
  fail_add:
     hvm_ioreq_server_remove_all_vcpus(s);
+
+    /*
+     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
+     *       hvm_ioreq_server_free_pages() in that order.
+     *       This is because the former will do nothing if the pages
+     *       are not mapped, leaving the page to be freed by the latter.
+     *       However if the pages are mapped then the former will set
+     *       the page_info pointer to NULL, meaning the latter will do
+     *       nothing.
+     */
     hvm_ioreq_server_unmap_pages(s);
+    hvm_ioreq_server_free_pages(s);
 
     return rc;
 }
@@ -624,6 +724,7 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
     ASSERT(!s->enabled);
     hvm_ioreq_server_remove_all_vcpus(s);
     hvm_ioreq_server_unmap_pages(s);
+    hvm_ioreq_server_free_pages(s);
     hvm_ioreq_server_free_rangesets(s);
 }
 
@@ -762,7 +863,8 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
             goto out;
     }
 
-    *ioreq_gfn = gfn_x(s->ioreq.gfn);
+    if ( ioreq_gfn )
+        *ioreq_gfn = gfn_x(s->ioreq.gfn);
 
     if ( HANDLE_BUFIOREQ(s) )
     {
@@ -780,6 +882,33 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     return rc;
 }
 
+mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
+                                 unsigned int idx)
+{
+    struct hvm_ioreq_server *s;
+    mfn_t mfn = INVALID_MFN;
+
+    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )
+        goto out;
+
+    if ( hvm_ioreq_server_alloc_pages(s) )
+        goto out;
+
+    if ( idx == 0 )
+        mfn = _mfn(page_to_mfn(s->bufioreq.page));
+    else if ( idx == 1 )
+        mfn = _mfn(page_to_mfn(s->ioreq.page));
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
+
+    return mfn;
+}
+
 int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint64_t start,
                                      uint64_t end)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c8f50f3bb0..87debbdef3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -122,6 +122,7 @@
 #include <asm/fixmap.h>
 #include <asm/io_apic.h>
 #include <asm/pci.h>
+#include <asm/hvm/ioreq.h>
 
 #include <asm/hvm/grant_table.h>
 #include <asm/pv/grant_table.h>
@@ -4795,6 +4796,27 @@ static int xenmem_acquire_grant_table(struct domain *d,
     return 0;
 }
 
+static int xenmem_acquire_ioreq_server(struct domain *d,
+                                       unsigned int id,
+                                       unsigned long frame,
+                                       unsigned long nr_frames,
+                                       unsigned long mfn_list[])
+{
+    unsigned int i;
+
+    for ( i = 0; i < nr_frames; i++ )
+    {
+        mfn_t mfn = hvm_get_ioreq_server_frame(d, id, frame + i);
+
+        if ( mfn_eq(mfn, INVALID_MFN) )
+            return -EINVAL;
+
+        mfn_list[i] = mfn_x(mfn);
+    }
+
+    return 0;
+}
+
 static int xenmem_acquire_resource(xen_mem_acquire_resource_t *xmar)
 {
     struct domain *d, *currd = current->domain;
@@ -4829,6 +4851,11 @@ static int xenmem_acquire_resource(xen_mem_acquire_resource_t *xmar)
                                         mfn_list);
         break;
 
+    case XENMEM_resource_ioreq_server:
+        rc = xenmem_acquire_ioreq_server(d, xmar->id, xmar->frame,
+                                         xmar->nr_frames, mfn_list);
+        break;
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index 1829fcf43e..46b275f72f 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -31,6 +31,12 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
                               unsigned long *ioreq_gfn,
                               unsigned long *bufioreq_gfn,
                               evtchn_port_t *bufioreq_port);
+/*
+ * Get the mfn of either the buffered or synchronous ioreq frame.
+ * (idx == 0 -> buffered, idx == 1 -> synchronous).
+ */
+mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
+                                 unsigned int idx);
 int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint64_t start,
                                      uint64_t end);
diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
index 9677bd74e7..59b6006910 100644
--- a/xen/include/public/hvm/dm_op.h
+++ b/xen/include/public/hvm/dm_op.h
@@ -90,6 +90,10 @@ struct xen_dm_op_create_ioreq_server {
  * the frame numbers passed back in gfns <ioreq_gfn> and <bufioreq_gfn>
  * respectively. (If the IOREQ Server is not handling buffered emulation
  * only <ioreq_gfn> will be valid).
+ *
+ * NOTE: To access the synchronous ioreq structures and buffered ioreq
+ *       ring, it is preferable to use the XENMEM_acquire_resource memory
+ *       op specifying resource type XENMEM_resource_ioreq_server.
  */
 #define XEN_DMOP_get_ioreq_server_info 2
 
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 9bf58e7384..716941dc0c 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -664,10 +664,13 @@ struct xen_mem_acquire_resource {
     uint16_t type;
 
 #define XENMEM_resource_grant_table 0
+#define XENMEM_resource_ioreq_server 1
 
     /*
      * IN - a type-specific resource identifier, which must be zero
      *      unless stated otherwise.
+     *
+     * type == XENMEM_resource_ioreq_server -> id == ioreq server id
      */
     uint32_t id;
     /* IN - number of (4K) frames of the resource to be mapped */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping
  2017-09-18 15:31 ` [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping Paul Durrant
@ 2017-09-18 16:16   ` Ian Jackson
  2017-09-19  8:19     ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Ian Jackson @ 2017-09-18 16:16 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel

Paul Durrant writes ("[PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping"):
> A previous patch introduced a new HYPERVISOR_memory_op to acquire guest
> resources for direct priv-mapping.
> 
> This patch adds new functionality into libxenforeignmemory to make use
> of a new privcmd ioctl [1] that uses the new memory op to make such
> resources available via mmap(2).
> 
> [1] http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce59a05e6712

This looks plausible to me.

I wonder whether this, particularly for the ioreq server page, will
make it possible to deprivilege earlier than I did in my own series on
Friday.

(With my series, I do the depriv on entering the `running' state,
which is quite late.  It's after reading the migration stream, which
is not ideal.  But it did mean that qemu had already acquired the ioreq
page by then so it worked.  Unless that's just because my Xen was a
bit old?)

Anyway,

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Ian.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-18 15:31 ` [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type Paul Durrant
@ 2017-09-18 16:18   ` Roger Pau Monné
  2017-09-19  8:14     ` Paul Durrant
  2017-09-26 12:58   ` Jan Beulich
  1 sibling, 1 reply; 62+ messages in thread
From: Roger Pau Monné @ 2017-09-18 16:18 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Ian Jackson, Tim Deegan, Jan Beulich,
	Andrew Cooper, xen-devel

On Mon, Sep 18, 2017 at 04:31:26PM +0100, Paul Durrant wrote:
> ... XENMEM_resource_ioreq_server
> 
> This patch adds support for a new resource type that can be mapped using
> the XENMEM_acquire_resource memory op.
> 
> If an emulator makes use of this resource type then, instead of mapping
> gfns, the IOREQ server will allocate pages from the heap. These pages
> will never be present in the P2M of the guest at any point and so are
> not vulnerable to any direct attack by the guest. They are only ever
> accessible by Xen and any domain that has mapping privilege over the
> guest (which may or may not be limited to the domain running the emulator).
> 
> NOTE: Use of the new resource type is not compatible with use of
>       XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns flag is
>       set.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Acked-by: George Dunlap <George.Dunlap@eu.citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Just one nit below.

> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> +                                 unsigned int idx)
> +{
> +    struct hvm_ioreq_server *s;
> +    mfn_t mfn = INVALID_MFN;
> +
> +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    s = get_ioreq_server(d, id);
> +
> +    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )

If you use get_ioreq_server the id >= MAX_NR_IOREQ_SERVERS check is
pointless, get_ioreq_server will already return NULL in that case.
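
i.e. something along the lines of (sketch):

    s = get_ioreq_server(d, id);

    /* An out-of-range id already yields NULL here. */
    if ( !s || IS_DEFAULT(s) )
        goto out;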

Thanks, Roger.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-18 16:18   ` Roger Pau Monné
@ 2017-09-19  8:14     ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-19  8:14 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Andrew Cooper, Tim (Xen.org),
	Jan Beulich, Ian Jackson, xen-devel

> -----Original Message-----
> From: Roger Pau Monne
> Sent: 18 September 2017 17:18
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Stefano Stabellini
> <sstabellini@kernel.org>; Andrew Cooper <Andrew.Cooper3@citrix.com>;
> Ian Jackson <Ian.Jackson@citrix.com>; Tim (Xen.org) <tim@xen.org>; Jan
> Beulich <jbeulich@suse.com>
> Subject: Re: [Xen-devel] [PATCH v7 12/12] x86/hvm/ioreq: add a new
> mappable resource type...
> 
> On Mon, Sep 18, 2017 at 04:31:26PM +0100, Paul Durrant wrote:
> > ... XENMEM_resource_ioreq_server
> >
> > This patch adds support for a new resource type that can be mapped using
> > the XENMEM_acquire_resource memory op.
> >
> > If an emulator makes use of this resource type then, instead of mapping
> > gfns, the IOREQ server will allocate pages from the heap. These pages
> > will never be present in the P2M of the guest at any point and so are
> > not vulnerable to any direct attack by the guest. They are only ever
> > accessible by Xen and any domain that has mapping privilege over the
> > guest (which may or may not be limited to the domain running the
> emulator).
> >
> > NOTE: Use of the new resource type is not compatible with use of
> >       XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns
> flag is
> >       set.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Acked-by: George Dunlap <George.Dunlap@eu.citrix.com>
> > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Just one nit below.
> 
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> > +                                 unsigned int idx)
> > +{
> > +    struct hvm_ioreq_server *s;
> > +    mfn_t mfn = INVALID_MFN;
> > +
> > +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > +    s = get_ioreq_server(d, id);
> > +
> > +    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )
> 
> If you use get_ioreq_server the id >= MAX_NR_IOREQ_SERVERS check is
> pointless, get_ioreq_server will already return NULL in that case.

True. Possibly not worth a v8 in itself, but if there are any more changes needed I'll fix it up.

Thanks,

  Paul

> 
> Thanks, Roger.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping
  2017-09-18 16:16   ` Ian Jackson
@ 2017-09-19  8:19     ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-19  8:19 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel

> -----Original Message-----
> From: Ian Jackson [mailto:ian.jackson@eu.citrix.com]
> Sent: 18 September 2017 17:16
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v7 03/12] tools/libxenforeignmemory: add support for
> resource mapping
> 
> Paul Durrant writes ("[PATCH v7 03/12] tools/libxenforeignmemory: add
> support for resource mapping"):
> > A previous patch introduced a new HYPERVISOR_memory_op to acquire
> guest
> > resources for direct priv-mapping.
> >
> > This patch adds new functionality into libxenforeignmemory to make use
> > of a new privcmd ioctl [1] that uses the new memory op to make such
> > resources available via mmap(2).
> >
> > [1]
> http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce5
> 9a05e6712
> 
> This looks plausible to me.
> 
> I wonder whether this, particularly for the ioreq server page, will
> make it possible to deprivilege earlier than I did in my own series on
> Friday.
> 

It should, eventually. The necessary changes to privcmd would also need to make it into dom0, as well as the changes to QEMU (both of which I have on branches ready to go).

> (With my series, I do the depriv on entering the `running' state,
> which is quite late.  It's after reading the migration stream, which
> is not ideal.  But it did mean that qemu had already aquired the ioreq
> page by then so it worked.  Unless that's just because my Xen was a
> bit old?)

The acquisition of the ioreq pages is towards the end of the hvm init routine, so depriv any time after that (under the old scheme) should be doable.

> 
> Anyway,
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Great. Thanks,

  Paul

> 
> Ian.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-18 15:31 ` [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns Paul Durrant
@ 2017-09-19 12:51   ` Paul Durrant
  2017-09-19 13:05     ` Jan Beulich
  2017-09-25 13:02   ` Jan Beulich
  1 sibling, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-19 12:51 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich, Andrew Cooper; +Cc: xen-devel

Ping?

> -----Original Message-----
> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: 18 September 2017 16:31
> To: xen-devel@lists.xenproject.org
> Cc: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>
> Subject: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map
> guest mfns
> 
> In the case where a PV domain is mapping guest resources then it needs
> make
> the HYPERVISOR_mmu_update call using DOMID_SELF, rather than the guest
> domid, so that the passed in gmfn values are correctly treated as mfns
> rather than gfns present in the guest p2m.
> 
> This patch removes a check which currently disallows mapping of a page
> when
> the owner of the page tables matches the domain passed to
> HYPERVISOR_mmu_update, but that domain is not the real owner of the
> page.
> The check was introduced by patch d3c6a215ca9 ("x86: Clean up
> get_page_from_l1e() to correctly distinguish between owner-of-pte and
> owner-of-data-page in all cases") but it's not clear why it was needed.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

I believe this is now the only patch in the series without a R-b or A-b.

  Paul

> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/mm.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 2e5b15e7a2..cb0189efae 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -1024,12 +1024,15 @@ get_page_from_l1e(
>                     (real_pg_owner != dom_cow) ) )
>      {
>          /*
> -         * Let privileged domains transfer the right to map their target
> -         * domain's pages. This is used to allow stub-domain pvfb export to
> -         * dom0, until pvfb supports granted mappings. At that time this
> -         * minor hack can go away.
> +         * If the real page owner is not the domain specified in the
> +         * hypercall then establish that the specified domain has
> +         * mapping privilege over the page owner.
> +         * This is used to allow stub-domain pvfb export to dom0. It is
> +         * also used to allow a privileged PV domain to map mfns using
> +         * DOMID_SELF, which is needed for mapping guest resources such
> +         * grant table frames.
>           */
> -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> +        if ( (real_pg_owner == NULL) ||
>               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
>          {
>              gdprintk(XENLOG_WARNING,
> --
> 2.11.0



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-19 12:51   ` Paul Durrant
@ 2017-09-19 13:05     ` Jan Beulich
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-19 13:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 19.09.17 at 14:51, <Paul.Durrant@citrix.com> wrote:
> Ping?

Your patch series hasn't been forgotten, but I can't currently predict
when I would be able to look at it (together with the well over 100
other patches sitting in the queue). I can only promise: As soon as
other, often higher priority, work permits me to get there (unless
Andrew manages to get there earlier).

Jan



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-18 15:31 ` [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns Paul Durrant
  2017-09-19 12:51   ` Paul Durrant
@ 2017-09-25 13:02   ` Jan Beulich
  2017-09-25 13:29     ` Andrew Cooper
                       ` (2 more replies)
  1 sibling, 3 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 13:02 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> In the case where a PV domain is mapping guest resources then it needs make
> the HYPERVISOR_mmu_update call using DOMID_SELF, rather than the guest
> domid, so that the passed in gmfn values are correctly treated as mfns
> rather than gfns present in the guest p2m.

Since things are presently working fine, I think the description is not
really accurate. You only require the new behavior if you don't know
the GFN of the page you want to map, and that it has to be
DOMID_SELF that should be passed also doesn't appear to derive
from anything else. To properly judge about the need for this patch
it would help if it was briefly explained why being able to map by GFN
is no longer sufficient, and to re-word the DOMID_SELF part.

The other aspect I don't understand is why this is needed for PV
Dom0, but not for PVH. The answer here can't be "because PVH
Dom0 isn't supported yet", because it eventually will be, and then
there will still be the problem of PVH supposedly having no notion
of MFNs (be their own or foreign guest ones). The answer also
can't be "since it would use XENMAPSPACE_gmfn_foreign", as
that's acting in terms of GFN too.

> This patch removes a check which currently disallows mapping of a page when
> the owner of the page tables matches the domain passed to
> HYPERVISOR_mmu_update, but that domain is not the real owner of the page.
> The check was introduced by patch d3c6a215ca9 ("x86: Clean up
> get_page_from_l1e() to correctly distinguish between owner-of-pte and
> owner-of-data-page in all cases") but it's not clear why it was needed.

I think the goal here simply was to not permit anything that doesn't
really need permitting. Furthermore the check being "introduced"
there was, afaict, replacing the earlier d != curr->domain.

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -1024,12 +1024,15 @@ get_page_from_l1e(
>                     (real_pg_owner != dom_cow) ) )
>      {
>          /*
> -         * Let privileged domains transfer the right to map their target
> -         * domain's pages. This is used to allow stub-domain pvfb export to
> -         * dom0, until pvfb supports granted mappings. At that time this
> -         * minor hack can go away.
> +         * If the real page owner is not the domain specified in the
> +         * hypercall then establish that the specified domain has
> +         * mapping privilege over the page owner.
> +         * This is used to allow stub-domain pvfb export to dom0. It is
> +         * also used to allow a privileged PV domain to map mfns using
> +         * DOMID_SELF, which is needed for mapping guest resources such
> +         * grant table frames.

How do grant table frames come into the picture here? So far
I had assumed only ioreq server pages are in need of this.

>           */
> -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> +        if ( (real_pg_owner == NULL) ||
>               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
>          {
>              gdprintk(XENLOG_WARNING,

I'm concerned of the effect of the change on the code paths
which you're not really interested in: alloc_l1_table(),
ptwr_emulated_update(), and shadow_get_page_from_l1e() all
explicitly pass both domains identical, and are now suddenly able
to do things they weren't supposed to do. A similar concern
applies to __do_update_va_mapping() calling mod_l1_table().

I therefore wonder whether the solution to your problem
wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
to improvement suggestions). This at the same time would
address my concern regarding the misleading DOMID_SELF
passing when really a foreign domain's page is meant.

Jan



* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 13:02   ` Jan Beulich
@ 2017-09-25 13:29     ` Andrew Cooper
  2017-09-25 14:03       ` Jan Beulich
  2017-09-25 14:42     ` Paul Durrant
  2017-09-27 11:18     ` Paul Durrant
  2 siblings, 1 reply; 62+ messages in thread
From: Andrew Cooper @ 2017-09-25 13:29 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant; +Cc: xen-devel

On 25/09/17 14:02, Jan Beulich wrote:
>>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> In the case where a PV domain is mapping guest resources then it needs make
>> the HYPERVISOR_mmu_update call using DOMID_SELF, rather than the guest
>> domid, so that the passed in gmfn values are correctly treated as mfns
>> rather than gfns present in the guest p2m.
> Since things are presently working fine, I think the description is not
> really accurate. You only require the new behavior if you don't know
> the GFN of the page you want to map, and that it has to be
> DOMID_SELF that should be passed also doesn't appear to derive
> from anything else. To properly judge about the need for this patch
> it would help if it was briefly explained why being able to map by GFN
> is no longer sufficient, and to re-word the DOMID_SELF part.

I think there is still confusion as to the purpose here.

For security and scalability reasons, we explicitly want to be able to
create frames which are not part of a guest's p2m.  We still need to map
these frames however.

The frames are referred to in an abstract way by a space id/offset.  To
create mappings of these frames, PV guests pass an array which Xen fills
in with MFNs, while HVM guests pass an array of GFNs which have their
mappings updated.

The problem is trying to map an MFN belonging to a target domain,
because Xen interprets the frame under the target's paging mode.  Passing
DOMID_SELF here is Paul's way of getting Xen to interpret the frame under
current's paging mode.
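
To make the two conventions concrete, a minimal sketch of a caller of the
interface proposed in patch #2 (guest_domid and frames[] are placeholders,
and error handling is omitted):

    xen_pfn_t frames[4];
    struct xen_mem_acquire_resource xmar = {
        .domid     = guest_domid,
        .type      = XENMEM_resource_grant_table,
        .id        = 0,
        .frame     = 0,
        .nr_frames = 4,
    };

    set_xen_guest_handle(xmar.gmfn_list, frames);

    /*
     * HVM/PVH caller: frames[] holds GFNs chosen by the caller, which Xen
     * points at the resource.  PV caller: Xen fills frames[] with the MFNs
     * backing the resource, to be mapped via mmu_update afterwards.
     */
    rc = HYPERVISOR_memory_op(XENMEM_acquire_resource, &xmar);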

Alternative suggestions welcome.

~Andrew


* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-18 15:31 ` [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources Paul Durrant
@ 2017-09-25 13:49   ` Jan Beulich
  2017-09-25 14:53     ` Paul Durrant
  2017-09-25 14:23   ` Jan Beulich
  1 sibling, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 13:49 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> Certain memory resources associated with a guest are not necessarily
> present in the guest P2M and so are not necessarily available to be
> foreign-mapped by a tools domain unless they are inserted, which risks
> shattering a super-page mapping.

For grant tables I can see this as the primary issue, but isn't the
goal of not exposing IOREQ server pages an even more important
aspect, and hence more relevant to mention here?

> NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
>       I have no means to test it on an ARM platform and so cannot verify
>       that it functions correctly. Hence it is currently only implemented
>       for x86.

Which will require things to be moved around later on. May I
instead suggest to put it in common code and simply have a
small #ifdef somewhere causing the operation to fail early for
ARM?

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4768,6 +4768,107 @@ int xenmem_add_to_physmap_one(
>      return rc;
>  }
>  
> +static int xenmem_acquire_grant_table(struct domain *d,

I don't think static functions need a xenmem_ prefix.

> +static int xenmem_acquire_resource(xen_mem_acquire_resource_t *xmar)

const?

> +{
> +    struct domain *d, *currd = current->domain;
> +    unsigned long *mfn_list;
> +    int rc;
> +
> +    if ( xmar->nr_frames == 0 )
> +        return -EINVAL;
> +
> +    d = rcu_lock_domain_by_any_id(xmar->domid);
> +    if ( d == NULL )
> +        return -ESRCH;
> +
> +    rc = xsm_domain_memory_map(XSM_TARGET, d);
> +    if ( rc )
> +        goto out;
> +
> +    mfn_list = xmalloc_array(unsigned long, xmar->nr_frames);

Despite XSA-77 there should not be any new disaggregation related
security risks (here: memory exhaustion and long lasting loops).
Large requests need to be refused or broken up.
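
I.e. presumably something along these lines (MAX_NR_ACQUIRE_FRAMES being a
made-up name for whatever limit gets chosen):

    rc = -E2BIG;
    if ( xmar->nr_frames > MAX_NR_ACQUIRE_FRAMES )
        goto out;

    mfn_list = xmalloc_array(unsigned long, xmar->nr_frames);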

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3607,38 +3607,58 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
>  }
>  #endif
>  
> -int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
> -                     mfn_t *mfn)
> -{
> -    int rc = 0;
>  
> -    grant_write_lock(d->grant_table);
> +static mfn_t gnttab_get_frame_locked(struct domain *d, unsigned long idx)
> +{
> +    struct grant_table *gt = d->grant_table;
> +    mfn_t mfn = INVALID_MFN;
>  
> -    if ( d->grant_table->gt_version == 0 )
> -        d->grant_table->gt_version = 1;
> +    if ( gt->gt_version == 0 )
> +        gt->gt_version = 1;
>  
> -    if ( d->grant_table->gt_version == 2 &&
> +    if ( gt->gt_version == 2 &&
>           (idx & XENMAPIDX_grant_table_status) )
>      {
>          idx &= ~XENMAPIDX_grant_table_status;

I don't see the use of this bit mentioned anywhere in the public
header addition.

> +mfn_t gnttab_get_frame(struct domain *d, unsigned long idx)
> +{
> +    mfn_t mfn;
> +
> +    grant_write_lock(d->grant_table);
> +    mfn = gnttab_get_frame_locked(d, idx);
> +    grant_write_unlock(d->grant_table);

Hmm, a read lock is insufficient here only because of the possible
bumping of the version from 0 to 1 afaict, but I don't really see
why what now becomes gnttab_get_frame_locked() does that in
the first place.

> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -650,7 +650,43 @@ struct xen_vnuma_topology_info {
>  typedef struct xen_vnuma_topology_info xen_vnuma_topology_info_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_vnuma_topology_info_t);
>  
> -/* Next available subop number is 28 */
> +#if defined(__XEN__) || defined(__XEN_TOOLS__)
> +
> +/*
> + * Get the pages for a particular guest resource, so that they can be
> + * mapped directly by a tools domain.
> + */
> +#define XENMEM_acquire_resource 28
> +struct xen_mem_acquire_resource {
> +    /* IN - the domain whose resource is to be mapped */
> +    domid_t domid;
> +    /* IN - the type of resource (defined below) */
> +    uint16_t type;
> +
> +#define XENMEM_resource_grant_table 0
> +
> +    /*
> +     * IN - a type-specific resource identifier, which must be zero
> +     *      unless stated otherwise.
> +     */
> +    uint32_t id;
> +    /* IN - number of (4K) frames of the resource to be mapped */
> +    uint32_t nr_frames;
> +    /* IN - the index of the initial frame to be mapped */
> +    uint64_aligned_t frame;

There are 32 bits of unnamed padding ahead of this field - please
name it and check it's set to zero.
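
E.g. (the field name is only a suggestion):

    /* IN - number of (4K) frames of the resource to be mapped */
    uint32_t nr_frames;
    uint32_t pad;                   /* IN - must be zero */
    /* IN - the index of the initial frame to be mapped */
    uint64_aligned_t frame;

together with a corresponding check in the handler:

    if ( xmar->pad != 0 )
        return -EINVAL;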

> +    /* IN/OUT - If the tools domain is PV then, upon return, gmfn_list
> +     *          will be populated with the MFNs of the resource.
> +     *          If the tools domain is HVM then it is expected that, on

s/tools domain/calling domain/ (twice)?

> +     *          entry, gmfn_list will be populated with a list of GFNs

s/will be/is/ to further emphasize the input vs output difference?

> +     *          that will be mapped to the MFNs of the resource.
> +     */
> +    XEN_GUEST_HANDLE(xen_pfn_t) gmfn_list;

What about a 32-bit x86 tool stack domain as caller here? I
don't think you want to limit it to just the low 16Tb? Nor are
you adding a compat translation layer.

> +};
> +typedef struct xen_mem_acquire_resource xen_mem_acquire_resource_t;
> +
> +#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */

Also please group this with the other similar section, without
introducing a second identical #if. After all we have ...

> +/* Next available subop number is 29 */

... this to eliminate the need to have things numerically sorted in
this file.

Jan


* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 13:29     ` Andrew Cooper
@ 2017-09-25 14:03       ` Jan Beulich
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 14:03 UTC (permalink / raw)
  To: Andrew Cooper, Paul Durrant; +Cc: xen-devel

>>> On 25.09.17 at 15:29, <andrew.cooper3@citrix.com> wrote:
> On 25/09/17 14:02, Jan Beulich wrote:
>>>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>>> In the case where a PV domain is mapping guest resources then it needs make
>>> the HYPERVISOR_mmu_update call using DOMID_SELF, rather than the guest
>>> domid, so that the passed in gmfn values are correctly treated as mfns
>>> rather than gfns present in the guest p2m.
>> Since things are presently working fine, I think the description is not
>> really accurate. You only require the new behavior if you don't know
>> the GFN of the page you want to map, and that it has to be
>> DOMID_SELF that should be passed also doesn't appear to derive
>> from anything else. To properly judge about the need for this patch
>> it would help if it was briefly explained why being able to map by GFN
>> is no longer sufficient, and to re-word the DOMID_SELF part.
> 
> I think there is still confusion as to the purpose here.
> 
> For security and scalability reasons, we explicitly want to be able to
> create frames which are not part of a guests p2m.  We still need to map
> these frames however.
> 
> The frames are referred to in an abstract way by a space id/offset.  To
> create mappings of these frames, PV guests pass an array which Xen fills
> in with MFNs, while HVM guests pass an array of GFNs which have their
> mappings updated.

In the course of reviewing patch 2 I've gained some more
understanding of the intentions. Still it would have been
helpful to have an abstract understanding already before even
looking at patch 1, i.e. presented in the overview mail.

> The problem is trying to map an MFN belonging to a target domain,
> because Xen interprets the frame under the targets paging mode.  Passing
> DOMID_SELF here is Pauls way of getting Xen to interpret the frame under
> current's paging mode.
> 
> Alternative suggestions welcome.

I've given one in the earlier reply.

Jan



* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-18 15:31 ` [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources Paul Durrant
  2017-09-25 13:49   ` Jan Beulich
@ 2017-09-25 14:23   ` Jan Beulich
  2017-09-25 15:00     ` Paul Durrant
  1 sibling, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 14:23 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> Certain memory resources associated with a guest are not necessarily
> present in the guest P2M and so are not necessarily available to be
> foreign-mapped by a tools domain unless they are inserted, which risks
> shattering a super-page mapping.

Btw., I'm additionally having trouble seeing this shattering of a
superpage: For one, xc_core_arch_get_scratch_gpfn() could be
a little less simplistic. And then even with the currently chosen
value (outside of the range of valid GFNs at that point in time)
there shouldn't be a larger page to be shattered, as there should
be no mapping at all at that index. But perhaps I'm just blind and
don't see the obvious ...

Jan



* Re: [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn
  2017-09-18 15:31 ` [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn Paul Durrant
@ 2017-09-25 14:29   ` Jan Beulich
  2017-09-25 14:32     ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 14:29 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> Since ioreq servers are only relevant to HVM guests and all the names in
> question unequivocally refer to guest frame numbers, name them all .*gfn
> to avoid any confusion.
> 
> This patch is purely cosmetic. No semantic or functional change.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

Are there any dependencies here on prior patches in the series
(at the first glance it looks like there might not be)? I.e. could it
go in ahead of the earlier parts?

Jan


* Re: [PATCH v7 07/12] x86/hvm/ioreq: use bool rather than bool_t
  2017-09-18 15:31 ` [PATCH v7 07/12] x86/hvm/ioreq: use bool rather than bool_t Paul Durrant
@ 2017-09-25 14:30   ` Jan Beulich
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 14:30 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> This patch changes use of bool_t to bool in the ioreq server code. It also
> fixes an incorrect indentation in a continuation line.
> 
> This patch is purely cosmetic. No semantic or functional change.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
with the same question as on patch 6.

Jan


* Re: [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn
  2017-09-25 14:29   ` Jan Beulich
@ 2017-09-25 14:32     ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 14:32 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 15:29
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to
> .*gfn
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > Since ioreq servers are only relevant to HVM guests and all the names in
> > question unequivocally refer to guest frame numbers, name them all .*gfn
> > to avoid any confusion.
> >
> > This patch is purely cosmetic. No semantic or functional change.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 

Thanks.

> Are there any dependencies here on prior patches in the series
> (at the first glance it looks like there might not be)? I.e. could it
> go in ahead of the earlier parts?

No, I don't believe there are any dependencies (same for patch #7).

  Paul

> 
> Jan

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 13:02   ` Jan Beulich
  2017-09-25 13:29     ` Andrew Cooper
@ 2017-09-25 14:42     ` Paul Durrant
  2017-09-25 14:49       ` Jan Beulich
  2017-09-27 11:18     ` Paul Durrant
  2 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 14:42 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 14:03
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map
> guest mfns
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > In the case where a PV domain is mapping guest resources then it needs
> make
> > the HYPERVISOR_mmu_update call using DOMID_SELF, rather than the
> guest
> > domid, so that the passed in gmfn values are correctly treated as mfns
> > rather than gfns present in the guest p2m.
> 
> Since things are presently working fine, I think the description is not
> really accurate. You only require the new behavior if you don't know
> the GFN of the page you want to map, and that it has to be
> DOMID_SELF that should be passed also doesn't appear to derive
> from anything else. To properly judge about the need for this patch
> it would help if it was briefly explained why being able to map by GFN
> is no longer sufficient, and to re-word the DOMID_SELF part.

Ok, I can expand the explanation.

> 
> The other aspect I don't understand is why this is needed for PV
> Dom0, but not for PVH. The answer here can't be "because PVH
> Dom0 isn't supported yet", because it eventually will be, and then
> there will still be the problem of PVH supposedly having no notion
> of MFNs (be their own or foreign guest ones). The answer also
> can't be "since it would use XENMAPSPACE_gmfn_foreign", as
> that's acting in terms of GFN too.

The hypercall is PV-only. For a PVH/HVM tools domain, things are handled by doing an add-to-physmap to gfns specified by the tools domain. I have tested both PV and HVM clients of my new memory op.

> 
> > This patch removes a check which currently disallows mapping of a page
> when
> > the owner of the page tables matches the domain passed to
> > HYPERVISOR_mmu_update, but that domain is not the real owner of the
> page.
> > The check was introduced by patch d3c6a215ca9 ("x86: Clean up
> > get_page_from_l1e() to correctly distinguish between owner-of-pte and
> > owner-of-data-page in all cases") but it's not clear why it was needed.
> 
> I think the goal here simply was to not permit anything that doesn't
> really need permitting. Furthermore the check being "introduced"
> there was, afaict, replacing the earlier d != curr->domain.

I'm not entirely sure why that check was there though.

> 
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -1024,12 +1024,15 @@ get_page_from_l1e(
> >                     (real_pg_owner != dom_cow) ) )
> >      {
> >          /*
> > -         * Let privileged domains transfer the right to map their target
> > -         * domain's pages. This is used to allow stub-domain pvfb export to
> > -         * dom0, until pvfb supports granted mappings. At that time this
> > -         * minor hack can go away.
> > +         * If the real page owner is not the domain specified in the
> > +         * hypercall then establish that the specified domain has
> > +         * mapping privilege over the page owner.
> > +         * This is used to allow stub-domain pvfb export to dom0. It is
> > +         * also used to allow a privileged PV domain to map mfns using
> > +         * DOMID_SELF, which is needed for mapping guest resources such
> > +         * grant table frames.
> 
> How do grant table frames come into the picture here? So far
> I had assumed only ioreq server pages are in need of this.
> 

Grant frames required less re-work in other places so I started with them. Nothing to prevent the series from being re-ordered now that it's complete.

> >           */
> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> > +        if ( (real_pg_owner == NULL) ||
> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
> >          {
> >              gdprintk(XENLOG_WARNING,
> 
> I'm concerned of the effect of the change on the code paths
> which you're not really interested in: alloc_l1_table(),
> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
> explicitly pass both domains identical, and are now suddenly able
> to do things they weren't supposed to do. A similar concern
> applies to __do_update_va_mapping() calling mod_l1_table().
> 
> I therefore wonder whether the solution to your problem
> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
> to improvement suggestions). This at the same time would
> address my concern regarding the misleading DOMID_SELF
> passing when really a foreign domain's page is meant.

Ok. I'm not familiar with MMU_FOREIGN_PT_UPDATE so I'd need to investigate. I just need a mechanism for a privileged PV guest to map pages belonging to another guest that don't appear in that guest's P2M. As I said above, it's much simpler if the tools domain is PVH or HVM.

  Paul

> 
> Jan



* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 14:42     ` Paul Durrant
@ 2017-09-25 14:49       ` Jan Beulich
  2017-09-25 14:56         ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 14:49 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 25.09.17 at 16:42, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 25 September 2017 14:03
>> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> The other aspect I don't understand is why this is needed for PV
>> Dom0, but not for PVH. The answer here can't be "because PVH
>> Dom0 isn't supported yet", because it eventually will be, and then
>> there will still be the problem of PVH supposedly having no notion
>> of MFNs (be their own or foreign guest ones). The answer also
>> can't be "since it would use XENMAPSPACE_gmfn_foreign", as
>> that's acting in terms of GFN too.
> 
> The hypercall is PV-only. For a PVH/HVM tools domain things are handled by 
> doing an add-to-physmap to gfns specified by the tools domain. I have tested 
> both PV and HVM clients of my new memory op.

And how is this add-to-physmap any better, superpage-shattering-wise,
than the old mechanism?

>> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
>> > +        if ( (real_pg_owner == NULL) ||
>> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
>> >          {
>> >              gdprintk(XENLOG_WARNING,
>> 
>> I'm concerned of the effect of the change on the code paths
>> which you're not really interested in: alloc_l1_table(),
>> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
>> explicitly pass both domains identical, and are now suddenly able
>> to do things they weren't supposed to do. A similar concern
>> applies to __do_update_va_mapping() calling mod_l1_table().
>> 
>> I therefore wonder whether the solution to your problem
>> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
>> to improvement suggestions). This at the same time would
>> address my concern regarding the misleading DOMID_SELF
>> passing when really a foreign domain's page is meant.
> 
> Ok. I'm not familiar with MMU_FOREIGN_PT_UPDATE so I'd need to investigate. 
> I just need a mechanism for a privileged PV guest to map pages belonging to 
> another guest that don't appear in that guests P2M. As I said above, it's 
> much simpler if the tools domain is PVH or HVM.

Hmm, looks like I wasn't able to express things such that it
becomes clear the MMU_FOREIGN_PT_UPDATE is the proposed
(sub-optimal) name for a new sub-op.

Jan



* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-25 13:49   ` Jan Beulich
@ 2017-09-25 14:53     ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 14:53 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 25 September 2017 14:50
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v7 02/12] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > Certain memory resources associated with a guest are not necessarily
> > present in the guest P2M and so are not necessarily available to be
> > foreign-mapped by a tools domain unless they are inserted, which risks
> > shattering a super-page mapping.
> 
> For grant tables I can see this as the primary issue, but isn't the
> goal of not exposing IOREQ server pages an even more important
> aspect, and hence more relevant to mention here?
> 
> > NOTE: Whilst the new op is not intrinsicly specific to the x86 architecture,
> >       I have no means to test it on an ARM platform and so cannot verify
> >       that it functions correctly. Hence it is currently only implemented
> >       for x86.
> 
> Which will require things to be moved around later on. May I
> instead suggest to put it in common code and simply have a
> small #ifdef somewhere causing the operation to fail early for
> ARM?

Ok. If you prefer that approach then I'll do that.

> 
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -4768,6 +4768,107 @@ int xenmem_add_to_physmap_one(
> >      return rc;
> >  }
> >
> > +static int xenmem_acquire_grant_table(struct domain *d,
> 
> I don't think static functions need a xenmem_ prefix.
> 

Ok.

> > +static int xenmem_acquire_resource(xen_mem_acquire_resource_t
> *xmar)
> 
> const?

Yes.

> 
> > +{
> > +    struct domain *d, *currd = current->domain;
> > +    unsigned long *mfn_list;
> > +    int rc;
> > +
> > +    if ( xmar->nr_frames == 0 )
> > +        return -EINVAL;
> > +
> > +    d = rcu_lock_domain_by_any_id(xmar->domid);
> > +    if ( d == NULL )
> > +        return -ESRCH;
> > +
> > +    rc = xsm_domain_memory_map(XSM_TARGET, d);
> > +    if ( rc )
> > +        goto out;
> > +
> > +    mfn_list = xmalloc_array(unsigned long, xmar->nr_frames);
> 
> Despite XSA-77 there should not be any new disaggregation related
> security risks (here: memory exhaustion and long lasting loops).
> Large requests need to be refused or broken up.

Ok, I'll put an upper limit on frames.

> 
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -3607,38 +3607,58 @@ int mem_sharing_gref_to_gfn(struct
> grant_table *gt, grant_ref_t ref,
> >  }
> >  #endif
> >
> > -int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
> > -                     mfn_t *mfn)
> > -{
> > -    int rc = 0;
> >
> > -    grant_write_lock(d->grant_table);
> > +static mfn_t gnttab_get_frame_locked(struct domain *d, unsigned long
> idx)
> > +{
> > +    struct grant_table *gt = d->grant_table;
> > +    mfn_t mfn = INVALID_MFN;
> >
> > -    if ( d->grant_table->gt_version == 0 )
> > -        d->grant_table->gt_version = 1;
> > +    if ( gt->gt_version == 0 )
> > +        gt->gt_version = 1;
> >
> > -    if ( d->grant_table->gt_version == 2 &&
> > +    if ( gt->gt_version == 2 &&
> >           (idx & XENMAPIDX_grant_table_status) )
> >      {
> >          idx &= ~XENMAPIDX_grant_table_status;
> 
> I don't see the use of this bit mentioned anywhere in the public
> header addition.

Good point, I do need to mention v2 now that Juergen has revived it.

> 
> > +mfn_t gnttab_get_frame(struct domain *d, unsigned long idx)
> > +{
> > +    mfn_t mfn;
> > +
> > +    grant_write_lock(d->grant_table);
> > +    mfn = gnttab_get_frame_locked(d, idx);
> > +    grant_write_unlock(d->grant_table);
> 
> Hmm, a read lock is insufficient here only because of the possible
> bumping of the version from 0 to 1 afaict, but I don't really see
> why what now becomes gnttab_get_frame_locked() does that in
> the first place.

It may be the case that, after Juergen's recent re-arrangements, the version will always be set by this point. I'll check, and drop to an ASSERT on the version and a read lock if I can.
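
Roughly the following, assuming the version really is guaranteed to be non-zero by then (and with the version bump dropped from the locked helper):

    mfn_t gnttab_get_frame(struct domain *d, unsigned long idx)
    {
        mfn_t mfn;

        grant_read_lock(d->grant_table);
        ASSERT(d->grant_table->gt_version != 0);
        mfn = gnttab_get_frame_locked(d, idx);
        grant_read_unlock(d->grant_table);

        return mfn;
    }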

> 
> > --- a/xen/include/public/memory.h
> > +++ b/xen/include/public/memory.h
> > @@ -650,7 +650,43 @@ struct xen_vnuma_topology_info {
> >  typedef struct xen_vnuma_topology_info xen_vnuma_topology_info_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_vnuma_topology_info_t);
> >
> > -/* Next available subop number is 28 */
> > +#if defined(__XEN__) || defined(__XEN_TOOLS__)
> > +
> > +/*
> > + * Get the pages for a particular guest resource, so that they can be
> > + * mapped directly by a tools domain.
> > + */
> > +#define XENMEM_acquire_resource 28
> > +struct xen_mem_acquire_resource {
> > +    /* IN - the domain whose resource is to be mapped */
> > +    domid_t domid;
> > +    /* IN - the type of resource (defined below) */
> > +    uint16_t type;
> > +
> > +#define XENMEM_resource_grant_table 0
> > +
> > +    /*
> > +     * IN - a type-specific resource identifier, which must be zero
> > +     *      unless stated otherwise.
> > +     */
> > +    uint32_t id;
> > +    /* IN - number of (4K) frames of the resource to be mapped */
> > +    uint32_t nr_frames;
> > +    /* IN - the index of the initial frame to be mapped */
> > +    uint64_aligned_t frame;
> 
> There are 32 bits of unnamed padding ahead of this field - please
> name it and check it's set to zero.

Ah yes, for some reason the #define above had made me think things were aligned at this point.

> 
> > +    /* IN/OUT - If the tools domain is PV then, upon return, gmfn_list
> > +     *          will be populated with the MFNs of the resource.
> > +     *          If the tools domain is HVM then it is expected that, on
> 
> s/tools domain/calling domain/ (twice)?

Ok.

> 
> > +     *          entry, gmfn_list will be populated with a list of GFNs
> 
> s/will be/is/ to further emphasize the input vs output difference?

Ok.

> 
> > +     *          that will be mapped to the MFNs of the resource.
> > +     */
> > +    XEN_GUEST_HANDLE(xen_pfn_t) gmfn_list;
> 
> What about a 32-bit x86 tool stack domain as caller here? I
> don't think you want to limit it to just the low 16Tb? Nor are
> you adding a compat translation layer.

Ok. Good point, even though somewhat unlikely.

> 
> > +};
> > +typedef struct xen_mem_acquire_resource
> xen_mem_acquire_resource_t;
> > +
> > +#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
> 
> Also please group this with the other similar section, without
> introducing a second identical #if. After all we have ...
> 
> > +/* Next available subop number is 29 */
> 
> ... this to eliminate the need to have things numerically sorted in
> this file.

Ok.

  Paul

> 
> Jan
> 

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 14:49       ` Jan Beulich
@ 2017-09-25 14:56         ` Paul Durrant
  2017-09-25 15:30           ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 14:56 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 25 September 2017 15:50
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v7 01/12] x86/mm: allow a privileged PV
> domain to map guest mfns
> 
> >>> On 25.09.17 at 16:42, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 25 September 2017 14:03
> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> The other aspect I don't understand is why this is needed for PV
> >> Dom0, but not for PVH. The answer here can't be "because PVH
> >> Dom0 isn't supported yet", because it eventually will be, and then
> >> there will still be the problem of PVH supposedly having no notion
> >> of MFNs (be their own or foreign guest ones). The answer also
> >> can't be "since it would use XENMAPSPACE_gmfn_foreign", as
> >> that's acting in terms of GFN too.
> >
> > The hypercall is PV-only. For a PVH/HVM tools domain things are handled
> by
> > doing an add-to-physmap to gfns specified by the tools domain. I have
> tested
> > both PV and HVM clients of my new memory op.
> 
> And how is this add-to-physmap any better superpage shattering
> wise than the old mechansim?

Because the calling domain can make an intelligent choice about what gfns to use?

> 
> >> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> >> > +        if ( (real_pg_owner == NULL) ||
> >> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
> >> >          {
> >> >              gdprintk(XENLOG_WARNING,
> >>
> >> I'm concerned of the effect of the change on the code paths
> >> which you're not really interested in: alloc_l1_table(),
> >> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
> >> explicitly pass both domains identical, and are now suddenly able
> >> to do things they weren't supposed to do. A similar concern
> >> applies to __do_update_va_mapping() calling mod_l1_table().
> >>
> >> I therefore wonder whether the solution to your problem
> >> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
> >> to improvement suggestions). This at the same time would
> >> address my concern regarding the misleading DOMID_SELF
> >> passing when really a foreign domain's page is meant.
> >
> > Ok. I'm not familiar with MMU_FOREIGN_PT_UPDATE so I'd need to
> investigate.
> > I just need a mechanism for a privileged PV guest to map pages belonging
> to
> > another guest that don't appear in that guests P2M. As I said above, it's
> > much simpler if the tools domain is PVH or HVM.
> 
> Hmm, looks like I wasn't able to express things such that it
> becomes clear the MMU_FOREIGN_PT_UPDATE is the proposed
> (sub-optimal) name for a new sub-op.
> 

Ok. I see what you mean now.

  Paul

> Jan
> 
> 

* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-25 14:23   ` Jan Beulich
@ 2017-09-25 15:00     ` Paul Durrant
  2017-09-26 12:20       ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 15:00 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 15:23
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to
> acquire guest resources
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > Certain memory resources associated with a guest are not necessarily
> > present in the guest P2M and so are not necessarily available to be
> > foreign-mapped by a tools domain unless they are inserted, which risks
> > shattering a super-page mapping.
> 
> Btw., I'm additionally having trouble seeing this shattering of a
> superpage: For one, xc_core_arch_get_scratch_gpfn() could be
> a little less simplistic. And then even with the currently chosen
> value (outside of the range of valid GFNs at that point in time)
> there shouldn't be a larger page to be shattered, as there should
> be no mapping at all at that index. But perhaps I'm just blind and
> don't see the obvious ...

The shattering was Andrew's observation. Andrew, can you comment?

Even if it's not the case, it's suboptimal to have to write-lock and update the guest's P2M twice just to map a page of grants, which I will mention in the commit comment too.

  Paul

> 
> Jan



* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-18 15:31 ` [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list Paul Durrant
@ 2017-09-25 15:17   ` Jan Beulich
  2017-09-26 10:55     ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 15:17 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -33,6 +33,32 @@
>  
>  #include <public/hvm/ioreq.h>
>  
> +static void set_ioreq_server(struct domain *d, unsigned int id,
> +                             struct hvm_ioreq_server *s)
> +{
> +    ASSERT(id < MAX_NR_IOREQ_SERVERS);
> +    ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
> +
> +    d->arch.hvm_domain.ioreq_server.server[id] = s;
> +}
> +
> +static struct hvm_ioreq_server *get_ioreq_server(struct domain *d,

const (for the parameter)?

> +                                                 unsigned int id)
> +{
> +    if ( id >= MAX_NR_IOREQ_SERVERS )
> +        return NULL;
> +
> +    return d->arch.hvm_domain.ioreq_server.server[id];
> +}
> +
> +#define IS_DEFAULT(s) \
> +    ((s) == get_ioreq_server((s)->domain, DEFAULT_IOSERVID))

Is it really useful to go through get_ioreq_server() here?
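
I.e. presumably just:

    #define IS_DEFAULT(s) \
        ((s) == (s)->domain->arch.hvm_domain.ioreq_server.server[DEFAULT_IOSERVID])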

> +#define FOR_EACH_IOREQ_SERVER(d, id, s) \
> +    for ( (id) = 0, (s) = get_ioreq_server((d), (id)); \
> +          (id) < MAX_NR_IOREQ_SERVERS; \
> +          (s) = get_ioreq_server((d), ++(id)) )

There are three instances of stray pairs of parentheses here, each
time when a macro parameter gets passed unaltered to
get_ioreq_server().
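
I.e.:

    #define FOR_EACH_IOREQ_SERVER(d, id, s) \
        for ( (id) = 0, (s) = get_ioreq_server(d, id); \
              (id) < MAX_NR_IOREQ_SERVERS; \
              (s) = get_ioreq_server(d, ++(id)) )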

> @@ -301,8 +333,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
>      }
>  }
>  
> +
>  static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,

Stray addition of a blank line?

> @@ -501,19 +531,19 @@ static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
>  }
>  
>  static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
> -                                            bool is_default)
> +                                            ioservid_t id)
>  {
>      unsigned int i;
>      int rc;
>  
> -    if ( is_default )
> +    if ( IS_DEFAULT(s) )
>          goto done;

Wouldn't comparing the ID be even cheaper in cases like this one?
And perhaps assert that ID and server actually match?
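
E.g. something like:

    if ( id == DEFAULT_IOSERVID )
    {
        ASSERT(IS_DEFAULT(s));
        goto done;
    }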

> @@ -741,35 +754,34 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
>  
>      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>  
> -    rc = -ENOENT;
> -    list_for_each_entry ( s,
> -                          &d->arch.hvm_domain.ioreq_server.list,
> -                          list_entry )
> -    {
> -        if ( s == d->arch.hvm_domain.default_ioreq_server )
> -            continue;
> +    s = get_ioreq_server(d, id);
>  
> -        if ( s->id != id )
> -            continue;
> +    rc = -ENOENT;
> +    if ( !s )
> +        goto out;
>  
> -        domain_pause(d);
> +    rc = -EPERM;
> +    if ( IS_DEFAULT(s) )
> +        goto out;

Here I think it is particularly strange to not use the ID in the check;
this could even be done without holding the lock. Same in other
functions below.

> @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
>  
>      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>  
> -    rc = -ENOENT;
> -    list_for_each_entry ( s,
> -                          &d->arch.hvm_domain.ioreq_server.list,
> -                          list_entry )
> -    {
> -        if ( s == d->arch.hvm_domain.default_ioreq_server )
> -            continue;
> +    s = get_ioreq_server(d, id);
>  
> -        if ( s->id != id )
> -            continue;
> +    rc = -ENOENT;
> +    if ( !s )
> +        goto out;
>  
> -        *ioreq_gfn = s->ioreq.gfn;
> +    rc = -EOPNOTSUPP;
> +    if ( IS_DEFAULT(s) )
> +        goto out;

Why EOPNOTSUPP when it was just the same ENOENT as no
server at all before (same further down)?

>  void hvm_destroy_all_ioreq_servers(struct domain *d)
>  {
> -    struct hvm_ioreq_server *s, *next;
> +    struct hvm_ioreq_server *s;
> +    unsigned int id;
>  
>      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>  
>      /* No need to domain_pause() as the domain is being torn down */
>  
> -    list_for_each_entry_safe ( s,
> -                               next,
> -                               &d->arch.hvm_domain.ioreq_server.list,
> -                               list_entry )
> +    FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
> -        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
> -
> -        hvm_ioreq_server_disable(s, is_default);
> -
> -        if ( is_default )
> -            d->arch.hvm_domain.default_ioreq_server = NULL;
> +        if ( !s )
> +            continue;
>  
> -        list_del(&s->list_entry);
> +        hvm_ioreq_server_disable(s);
> +        hvm_ioreq_server_deinit(s);
>  
> -        hvm_ioreq_server_deinit(s, is_default);
> +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
> +        --d->arch.hvm_domain.ioreq_server.count;

Seeing this - do you actually need the count as a separate field?
I.e. are there performance critical uses of it, where going through
the array would be too expensive? Most of the uses are just
ASSERT()s anyway.
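
If the field were dropped, a count could still be derived where needed,
e.g. (sketch only):

    static unsigned int nr_ioreq_servers(struct domain *d)
    {
        struct hvm_ioreq_server *s;
        unsigned int id, count = 0;

        FOR_EACH_IOREQ_SERVER(d, id, s)
        {
            if ( s )
                count++;
        }

        return count;
    }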

> @@ -1111,7 +1111,7 @@ int hvm_set_dm_domain(struct domain *d, domid_t domid)
>       * still be set and thus, when the server is created, it will have
>       * the correct domid.
>       */
> -    s = d->arch.hvm_domain.default_ioreq_server;
> +    s = get_ioreq_server(d, DEFAULT_IOSERVID);

Similar to above, is it really useful to go through get_ioreq_server()
here (and in other similar cases)?

Jan


* Re: [PATCH v7 09/12] x86/hvm/ioreq: simplify code and use consistent naming
  2017-09-18 15:31 ` [PATCH v7 09/12] x86/hvm/ioreq: simplify code and use consistent naming Paul Durrant
@ 2017-09-25 15:26   ` Jan Beulich
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 15:26 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> This patch re-works much of the ioreq server initialization and teardown
> code:
> 
> - The hvm_map/unmap_ioreq_gfn() functions are expanded to call through
>   to hvm_alloc/free_ioreq_gfn() rather than expecting them to be called
>   separately by outer functions.
> - Several functions now test the validity of the hvm_ioreq_page gfn value
>   to determine whether they need to act. This means can be safely called
>   for the bufioreq page even when it is not used.
> - hvm_add/remove_ioreq_gfn() simply return in the case of the default
>   IOREQ server so callers no longer need to test before calling.
> - hvm_ioreq_server_setup_pages() is renamed to hvm_ioreq_server_map_pages()
>   to mirror the existing hvm_ioreq_server_unmap_pages().
> 
> All of this significantly shortens the code.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
(in case it matters)


* Re: [PATCH v7 10/12] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  2017-09-18 15:31 ` [PATCH v7 10/12] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page Paul Durrant
@ 2017-09-25 15:27   ` Jan Beulich
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 15:27 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> This patch adjusts the ioreq server code to use type-safe gfn_t values
> where possible. No functional change.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Again in case it matters
Acked-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 14:56         ` Paul Durrant
@ 2017-09-25 15:30           ` Jan Beulich
  2017-09-25 15:33             ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 15:30 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 25.09.17 at 16:56, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
>> Beulich
>> Sent: 25 September 2017 15:50
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>> devel@lists.xenproject.org 
>> Subject: Re: [Xen-devel] [PATCH v7 01/12] x86/mm: allow a privileged PV
>> domain to map guest mfns
>> 
>> >>> On 25.09.17 at 16:42, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 25 September 2017 14:03
>> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> >> The other aspect I don't understand is why this is needed for PV
>> >> Dom0, but not for PVH. The answer here can't be "because PVH
>> >> Dom0 isn't supported yet", because it eventually will be, and then
>> >> there will still be the problem of PVH supposedly having no notion
>> >> of MFNs (be their own or foreign guest ones). The answer also
>> >> can't be "since it would use XENMAPSPACE_gmfn_foreign", as
>> >> that's acting in terms of GFN too.
>> >
>> > The hypercall is PV-only. For a PVH/HVM tools domain things are handled
>> by
>> > doing an add-to-physmap to gfns specified by the tools domain. I have
>> tested
>> > both PV and HVM clients of my new memory op.
>> 
>> And how is this add-to-physmap any better superpage shattering
>> wise than the old mechansim?
> 
> Because the calling domain can make an intelligent choice about what gfns to 
> use?

And why was an intelligent choice not possible with the old
mechanism? The calling domain is the tool stack one in both
cases, isn't it?

Jan



* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 15:30           ` Jan Beulich
@ 2017-09-25 15:33             ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 15:33 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 16:31
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: RE: [Xen-devel] [PATCH v7 01/12] x86/mm: allow a privileged PV
> domain to map guest mfns
> 
> >>> On 25.09.17 at 16:56, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Jan
> >> Beulich
> >> Sent: 25 September 2017 15:50
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> >> devel@lists.xenproject.org
> >> Subject: Re: [Xen-devel] [PATCH v7 01/12] x86/mm: allow a privileged PV
> >> domain to map guest mfns
> >>
> >> >>> On 25.09.17 at 16:42, <Paul.Durrant@citrix.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 25 September 2017 14:03
> >> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> >> The other aspect I don't understand is why this is needed for PV
> >> >> Dom0, but not for PVH. The answer here can't be "because PVH
> >> >> Dom0 isn't supported yet", because it eventually will be, and then
> >> >> there will still be the problem of PVH supposedly having no notion
> >> >> of MFNs (be their own or foreign guest ones). The answer also
> >> >> can't be "since it would use XENMAPSPACE_gmfn_foreign", as
> >> >> that's acting in terms of GFN too.
> >> >
> >> > The hypercall is PV-only. For a PVH/HVM tools domain things are
> handled
> >> by
> >> > doing an add-to-physmap to gfns specified by the tools domain. I have
> >> tested
> >> > both PV and HVM clients of my new memory op.
> >>
> >> And how is this add-to-physmap any better superpage shattering
> >> wise than the old mechansim?
> >
> > Because the calling domain can make an intelligent choice about what gfns
> to
> > use?
> 
> And why was an intelligent choice not possible with the old
> mechanism? The calling domain is the tool stack one in both
> cases, isn't it?

True, I suppose.

  Paul

> 
> Jan



* Re: [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requsted
  2017-09-18 15:31 ` [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requsted Paul Durrant
@ 2017-09-25 16:00   ` Jan Beulich
  2017-09-25 16:04     ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-25 16:00 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, xen-devel

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -354,6 +354,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
>      }
>  }
>  
> +#define HANDLE_BUFIOREQ(s) \
> +    (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)

(s)
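
I.e. presumably:

    #define HANDLE_BUFIOREQ(s) \
        ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)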

> @@ -762,11 +755,20 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
>      if ( IS_DEFAULT(s) )
>          goto out;
>  
> +    if ( ioreq_gfn || bufioreq_gfn )

This conditional together with ...

> +    {
> +        rc = hvm_ioreq_server_map_pages(s);
> +        if ( rc )
> +            goto out;
> +    }
> +
>      *ioreq_gfn = gfn_x(s->ioreq.gfn);

... this unconditional dereference is suspicious.
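
I.e. the write presumably wants guarding as well, along the lines of:

    if ( ioreq_gfn || bufioreq_gfn )
    {
        rc = hvm_ioreq_server_map_pages(s);
        if ( rc )
            goto out;
    }

    if ( ioreq_gfn )
        *ioreq_gfn = gfn_x(s->ioreq.gfn);

(and similarly for bufioreq_gfn).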

> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -68,8 +68,8 @@ struct hvm_ioreq_server {
>      spinlock_t             bufioreq_lock;
>      evtchn_port_t          bufioreq_evtchn;
>      struct rangeset        *range[NR_IO_RANGE_TYPES];
> +    int                    bufioreq_handling;

Does this need to be plain int (i.e. signed and 32 bits wide)?

Jan



* Re: [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requsted
  2017-09-25 16:00   ` Jan Beulich
@ 2017-09-25 16:04     ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-25 16:04 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Ian Jackson, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 17:00
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they
> are actually requsted
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -354,6 +354,9 @@ static void hvm_update_ioreq_evtchn(struct
> hvm_ioreq_server *s,
> >      }
> >  }
> >
> > +#define HANDLE_BUFIOREQ(s) \
> > +    (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
> 
> (s)
> 
> > @@ -762,11 +755,20 @@ int hvm_get_ioreq_server_info(struct domain
> *d, ioservid_t id,
> >      if ( IS_DEFAULT(s) )
> >          goto out;
> >
> > +    if ( ioreq_gfn || bufioreq_gfn )
> 
> This conditional together with ...
> 
> > +    {
> > +        rc = hvm_ioreq_server_map_pages(s);
> > +        if ( rc )
> > +            goto out;
> > +    }
> > +
> >      *ioreq_gfn = gfn_x(s->ioreq.gfn);
> 
> ... this unconditional dereference is suspicious.

True, it should be protected.
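Something like this (sketch) would do, mirroring what patch 12 already has
for the same function:

    if ( ioreq_gfn )
        *ioreq_gfn = gfn_x(s->ioreq.gfn);

and similarly for the bufioreq gfn.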

> 
> > --- a/xen/include/asm-x86/hvm/domain.h
> > +++ b/xen/include/asm-x86/hvm/domain.h
> > @@ -68,8 +68,8 @@ struct hvm_ioreq_server {
> >      spinlock_t             bufioreq_lock;
> >      evtchn_port_t          bufioreq_evtchn;
> >      struct rangeset        *range[NR_IO_RANGE_TYPES];
> > +    int                    bufioreq_handling;
> 
> Does this need to be plain int (i.e. signed and 32 bits wide)?

No, I can shrink it.
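E.g. (sketch, assuming the handful of HVM_IOREQSRV_BUFIOREQ_* values fit
comfortably in a byte):

    uint8_t                bufioreq_handling;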

  Paul

> 
> Jan


* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-25 15:17   ` Jan Beulich
@ 2017-09-26 10:55     ` Paul Durrant
  2017-09-26 11:45       ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 10:55 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 16:17
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq
> servers rather than a list
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -33,6 +33,32 @@
> >
> >  #include <public/hvm/ioreq.h>
> >
> > +static void set_ioreq_server(struct domain *d, unsigned int id,
> > +                             struct hvm_ioreq_server *s)
> > +{
> > +    ASSERT(id < MAX_NR_IOREQ_SERVERS);
> > +    ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
> > +
> > +    d->arch.hvm_domain.ioreq_server.server[id] = s;
> > +}
> > +
> > +static struct hvm_ioreq_server *get_ioreq_server(struct domain *d,
> 
> const (for the parameter)?
> 

Ok.
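I.e. (sketch):

    static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
                                                     unsigned int id)
    {
        if ( id >= MAX_NR_IOREQ_SERVERS )
            return NULL;

        return d->arch.hvm_domain.ioreq_server.server[id];
    }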

> > +                                                 unsigned int id)
> > +{
> > +    if ( id >= MAX_NR_IOREQ_SERVERS )
> > +        return NULL;
> > +
> > +    return d->arch.hvm_domain.ioreq_server.server[id];
> > +}
> > +
> > +#define IS_DEFAULT(s) \
> > +    ((s) == get_ioreq_server((s)->domain, DEFAULT_IOSERVID))
> 
> Is it really useful to go through get_ioreq_server() here?
> 

No, I'll change to direct array dereference.
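I.e. roughly:

    #define IS_DEFAULT(s) \
        ((s) == (s)->domain->arch.hvm_domain.ioreq_server.server[DEFAULT_IOSERVID])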

> > +#define FOR_EACH_IOREQ_SERVER(d, id, s) \
> > +    for ( (id) = 0, (s) = get_ioreq_server((d), (id)); \
> > +          (id) < MAX_NR_IOREQ_SERVERS; \
> > +          (s) = get_ioreq_server((d), ++(id)) )
> 
> There are three instances of stray pairs of parentheses here, each
> time when a macro parameter gets passed unaltered to
> get_ioreq_server().

OK.
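I.e. something like:

    #define FOR_EACH_IOREQ_SERVER(d, id, s) \
        for ( (id) = 0, (s) = get_ioreq_server(d, id); \
              (id) < MAX_NR_IOREQ_SERVERS; \
              (s) = get_ioreq_server(d, ++(id)) )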

> 
> > @@ -301,8 +333,9 @@ static void hvm_update_ioreq_evtchn(struct
> hvm_ioreq_server *s,
> >      }
> >  }
> >
> > +
> >  static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
> 
> Stray addition of a blank line?
> 

Yep. I'll get rid of that.

> > @@ -501,19 +531,19 @@ static void
> hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
> >  }
> >
> >  static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
> > -                                            bool is_default)
> > +                                            ioservid_t id)
> >  {
> >      unsigned int i;
> >      int rc;
> >
> > -    if ( is_default )
> > +    if ( IS_DEFAULT(s) )
> >          goto done;
> 
> Wouldn't comparing the ID be even cheaper in cases like this one?
> And perhaps assert that ID and server actually match?

Ok. If id is available I'll use that.
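Roughly (sketch, including the id/server consistency check you suggest):

    ASSERT(s == get_ioreq_server(s->domain, id));

    if ( id == DEFAULT_IOSERVID )
        goto done;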

> 
> > @@ -741,35 +754,34 @@ int hvm_destroy_ioreq_server(struct domain *d,
> ioservid_t id)
> >
> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >
> > -    rc = -ENOENT;
> > -    list_for_each_entry ( s,
> > -                          &d->arch.hvm_domain.ioreq_server.list,
> > -                          list_entry )
> > -    {
> > -        if ( s == d->arch.hvm_domain.default_ioreq_server )
> > -            continue;
> > +    s = get_ioreq_server(d, id);
> >
> > -        if ( s->id != id )
> > -            continue;
> > +    rc = -ENOENT;
> > +    if ( !s )
> > +        goto out;
> >
> > -        domain_pause(d);
> > +    rc = -EPERM;
> > +    if ( IS_DEFAULT(s) )
> > +        goto out;
> 
> Here I think it is particularly strange to not use the ID in the check;
> this could even be done without holding the lock. Same in other
> functions below.
> 
> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct domain
> *d, ioservid_t id,
> >
> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >
> > -    rc = -ENOENT;
> > -    list_for_each_entry ( s,
> > -                          &d->arch.hvm_domain.ioreq_server.list,
> > -                          list_entry )
> > -    {
> > -        if ( s == d->arch.hvm_domain.default_ioreq_server )
> > -            continue;
> > +    s = get_ioreq_server(d, id);
> >
> > -        if ( s->id != id )
> > -            continue;
> > +    rc = -ENOENT;
> > +    if ( !s )
> > +        goto out;
> >
> > -        *ioreq_gfn = s->ioreq.gfn;
> > +    rc = -EOPNOTSUPP;
> > +    if ( IS_DEFAULT(s) )
> > +        goto out;
> 
> Why EOPNOTSUPP when it was just the same ENOENT as no
> server at all before (same further down)?
> 

This was because of comments from Roger. In some cases I think a return of EOPNOTSUPP is more appropriate. Passing the default id is a distinct failure case.

> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
> >  {
> > -    struct hvm_ioreq_server *s, *next;
> > +    struct hvm_ioreq_server *s;
> > +    unsigned int id;
> >
> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >
> >      /* No need to domain_pause() as the domain is being torn down */
> >
> > -    list_for_each_entry_safe ( s,
> > -                               next,
> > -                               &d->arch.hvm_domain.ioreq_server.list,
> > -                               list_entry )
> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
> >      {
> > -        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
> > -
> > -        hvm_ioreq_server_disable(s, is_default);
> > -
> > -        if ( is_default )
> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
> > +        if ( !s )
> > +            continue;
> >
> > -        list_del(&s->list_entry);
> > +        hvm_ioreq_server_disable(s);
> > +        hvm_ioreq_server_deinit(s);
> >
> > -        hvm_ioreq_server_deinit(s, is_default);
> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
> > +        --d->arch.hvm_domain.ioreq_server.count;
> 
> Seeing this - do you actually need the count as a separate field?
> I.e. are there performance critical uses of it, where going through
> the array would be too expensive? Most of the uses are just
> ASSERT()s anyway.

The specific case is in hvm_select_ioreq_server(). If there was no count then the array would have to be searched for the initial test.

> 
> > @@ -1111,7 +1111,7 @@ int hvm_set_dm_domain(struct domain *d,
> domid_t domid)
> >       * still be set and thus, when the server is created, it will have
> >       * the correct domid.
> >       */
> > -    s = d->arch.hvm_domain.default_ioreq_server;
> > +    s = get_ioreq_server(d, DEFAULT_IOSERVID);
> 
> Similar to above, is it really useful to go through get_ioreq_server()
> here (and in other similar cases)?

Perhaps I should re-introduce GET_IOREQ_SERVER() for array lookup without the bounds check and use that instead of the inline func in places such as this.
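E.g. (sketch, usable only where the id is known to be in range):

    #define GET_IOREQ_SERVER(d, id) \
        (d)->arch.hvm_domain.ioreq_server.server[id]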

  Paul

> 
> Jan

* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-26 10:55     ` Paul Durrant
@ 2017-09-26 11:45       ` Jan Beulich
  2017-09-26 12:12         ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-26 11:45 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 26.09.17 at 12:55, <Paul.Durrant@citrix.com> wrote:
>> Sent: 25 September 2017 16:17
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct domain
>> *d, ioservid_t id,
>> >
>> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> >
>> > -    rc = -ENOENT;
>> > -    list_for_each_entry ( s,
>> > -                          &d->arch.hvm_domain.ioreq_server.list,
>> > -                          list_entry )
>> > -    {
>> > -        if ( s == d->arch.hvm_domain.default_ioreq_server )
>> > -            continue;
>> > +    s = get_ioreq_server(d, id);
>> >
>> > -        if ( s->id != id )
>> > -            continue;
>> > +    rc = -ENOENT;
>> > +    if ( !s )
>> > +        goto out;
>> >
>> > -        *ioreq_gfn = s->ioreq.gfn;
>> > +    rc = -EOPNOTSUPP;
>> > +    if ( IS_DEFAULT(s) )
>> > +        goto out;
>> 
>> Why EOPNOTSUPP when it was just the same ENOENT as no
>> server at all before (same further down)?
>> 
> 
> This was because of comments from Roger. In some cases I think a return of 
> EOPNOTSUPP is more appropriate. Passing the default id is a distinct failure 
> case.

And I think the change is fine as long as the commit message makes
clear it's an intentional change.

>> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
>> >  {
>> > -    struct hvm_ioreq_server *s, *next;
>> > +    struct hvm_ioreq_server *s;
>> > +    unsigned int id;
>> >
>> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> >
>> >      /* No need to domain_pause() as the domain is being torn down */
>> >
>> > -    list_for_each_entry_safe ( s,
>> > -                               next,
>> > -                               &d->arch.hvm_domain.ioreq_server.list,
>> > -                               list_entry )
>> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
>> >      {
>> > -        bool is_default = (s == d->arch.hvm_domain.default_ioreq_server);
>> > -
>> > -        hvm_ioreq_server_disable(s, is_default);
>> > -
>> > -        if ( is_default )
>> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
>> > +        if ( !s )
>> > +            continue;
>> >
>> > -        list_del(&s->list_entry);
>> > +        hvm_ioreq_server_disable(s);
>> > +        hvm_ioreq_server_deinit(s);
>> >
>> > -        hvm_ioreq_server_deinit(s, is_default);
>> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
>> > +        --d->arch.hvm_domain.ioreq_server.count;
>> 
>> Seeing this - do you actually need the count as a separate field?
>> I.e. are there performance critical uses of it, where going through
>> the array would be too expensive? Most of the uses are just
>> ASSERT()s anyway.
> 
> The specific case is in hvm_select_ioreq_server(). If there was no count 
> then the array would have to be searched for the initial test.

And is this something that happens frequently, i.e. the
performance of which matters?

Jan


* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-26 11:45       ` Jan Beulich
@ 2017-09-26 12:12         ` Paul Durrant
  2017-09-26 12:38           ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 12:12 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 September 2017 12:45
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: RE: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq
> servers rather than a list
> 
> >>> On 26.09.17 at 12:55, <Paul.Durrant@citrix.com> wrote:
> >> Sent: 25 September 2017 16:17
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct domain
> >> *d, ioservid_t id,
> >> >
> >> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >> >
> >> > -    rc = -ENOENT;
> >> > -    list_for_each_entry ( s,
> >> > -                          &d->arch.hvm_domain.ioreq_server.list,
> >> > -                          list_entry )
> >> > -    {
> >> > -        if ( s == d->arch.hvm_domain.default_ioreq_server )
> >> > -            continue;
> >> > +    s = get_ioreq_server(d, id);
> >> >
> >> > -        if ( s->id != id )
> >> > -            continue;
> >> > +    rc = -ENOENT;
> >> > +    if ( !s )
> >> > +        goto out;
> >> >
> >> > -        *ioreq_gfn = s->ioreq.gfn;
> >> > +    rc = -EOPNOTSUPP;
> >> > +    if ( IS_DEFAULT(s) )
> >> > +        goto out;
> >>
> >> Why EOPNOTSUPP when it was just the same ENOENT as no
> >> server at all before (same further down)?
> >>
> >
> > This was because of comments from Roger. In some cases I think a return
> of
> > EOPNOTSUPP is more appropriate. Passing the default id is a distinct failure
> > case.
> 
> And I think the change is fine as long as the commit message makes
> clear it's an intentional change.
> 
> >> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
> >> >  {
> >> > -    struct hvm_ioreq_server *s, *next;
> >> > +    struct hvm_ioreq_server *s;
> >> > +    unsigned int id;
> >> >
> >> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >> >
> >> >      /* No need to domain_pause() as the domain is being torn down */
> >> >
> >> > -    list_for_each_entry_safe ( s,
> >> > -                               next,
> >> > -                               &d->arch.hvm_domain.ioreq_server.list,
> >> > -                               list_entry )
> >> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
> >> >      {
> >> > -        bool is_default = (s == d-
> >arch.hvm_domain.default_ioreq_server);
> >> > -
> >> > -        hvm_ioreq_server_disable(s, is_default);
> >> > -
> >> > -        if ( is_default )
> >> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
> >> > +        if ( !s )
> >> > +            continue;
> >> >
> >> > -        list_del(&s->list_entry);
> >> > +        hvm_ioreq_server_disable(s);
> >> > +        hvm_ioreq_server_deinit(s);
> >> >
> >> > -        hvm_ioreq_server_deinit(s, is_default);
> >> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
> >> > +        --d->arch.hvm_domain.ioreq_server.count;
> >>
> >> Seeing this - do you actually need the count as a separate field?
> >> I.e. are there performance critical uses of it, where going through
> >> the array would be too expensive? Most of the uses are just
> >> ASSERT()s anyway.
> >
> > The specific case is in hvm_select_ioreq_server(). If there was no count
> > then the array would have to be searched for the initial test.
> 
> And is this something that happens frequently, i.e. the
> performance of which matters?

Yes, this is on the critical emulation path. I.e. it is a per-io call.

  Paul

> 
> Jan


* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-25 15:00     ` Paul Durrant
@ 2017-09-26 12:20       ` Paul Durrant
  2017-09-26 12:35         ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 12:20 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich', Andrew Cooper; +Cc: xen-devel

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 25 September 2017 16:00
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v7 02/12] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 25 September 2017 15:23
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> > devel@lists.xenproject.org
> > Subject: Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to
> > acquire guest resources
> >
> > >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > > Certain memory resources associated with a guest are not necessarily
> > > present in the guest P2M and so are not necessarily available to be
> > > foreign-mapped by a tools domain unless they are inserted, which risks
> > > shattering a super-page mapping.
> >
> > Btw., I'm additionally having trouble seeing this shattering of a
> > superpage: For one, xc_core_arch_get_scratch_gpfn() could be
> > a little less simplistic. And then even with the currently chosen
> > value (outside of the range of valid GFNs at that point in time)
> > there shouldn't be a larger page to be shattered, as there should
> > be no mapping at all at that index. But perhaps I'm just blind and
> > don't see the obvious ...
> 
> The shattering was Andrew's observation. Andrew, can you comment?
> 

Andrew commented verbally on this. It's not actually a shattering as such... The issue, apparently, is that adding the 4k grant table frame into the guest p2m will potentially cause all levels of page table to be created, but removing it again only removes the L1 entry. Thus it is no longer possible to use a superpage for that mapping at any point subsequently.

  Paul

* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-26 12:20       ` Paul Durrant
@ 2017-09-26 12:35         ` Jan Beulich
  2017-09-26 12:49           ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-26 12:35 UTC (permalink / raw)
  To: Andrew Cooper, Paul Durrant; +Cc: xen-devel

>>> On 26.09.17 at 14:20, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>> Paul Durrant
>> Sent: 25 September 2017 16:00
>> To: 'Jan Beulich' <JBeulich@suse.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>> devel@lists.xenproject.org 
>> Subject: Re: [Xen-devel] [PATCH v7 02/12] x86/mm: add
>> HYPERVISOR_memory_op to acquire guest resources
>> 
>> > -----Original Message-----
>> > From: Jan Beulich [mailto:JBeulich@suse.com]
>> > Sent: 25 September 2017 15:23
>> > To: Paul Durrant <Paul.Durrant@citrix.com>
>> > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>> > devel@lists.xenproject.org 
>> > Subject: Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to
>> > acquire guest resources
>> >
>> > >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> > > Certain memory resources associated with a guest are not necessarily
>> > > present in the guest P2M and so are not necessarily available to be
>> > > foreign-mapped by a tools domain unless they are inserted, which risks
>> > > shattering a super-page mapping.
>> >
>> > Btw., I'm additionally having trouble seeing this shattering of a
>> > superpage: For one, xc_core_arch_get_scratch_gpfn() could be
>> > a little less simplistic. And then even with the currently chosen
>> > value (outside of the range of valid GFNs at that point in time)
>> > there shouldn't be a larger page to be shattered, as there should
>> > be no mapping at all at that index. But perhaps I'm just blind and
>> > don't see the obvious ...
>> 
>> The shattering was Andrew's observation. Andrew, can you comment?
>> 
> 
> Andrew commented verbally on this. It's not actually a shattering as such... 
> The issue, apparently, is that adding the 4k grant table frame into the guest 
> p2m will potentially cause creation of all layers of page table but removing 
> it again will only remove the L1 entry. Thus it is no longer possible to use 
> a superpage for that mapping at any point subsequently.

First of all - what would cause a mapping to appear at that slot (or in
a range covering that slot)? And then, while re-combining contiguous
mappings indeed doesn't exist right now, replacing a non-leaf entry
(page table) with a large page is very well supported (see e.g.
ept_set_entry(), which even has a comment to that effect). Hence
I continue to be confused why we need a new mechanism for
seeding the grant tables.

Jan


* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-26 12:12         ` Paul Durrant
@ 2017-09-26 12:38           ` Jan Beulich
  2017-09-26 12:41             ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-26 12:38 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 26.09.17 at 14:12, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 26 September 2017 12:45
>> >>> On 26.09.17 at 12:55, <Paul.Durrant@citrix.com> wrote:
>> >> Sent: 25 September 2017 16:17
>> >> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> >> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct domain
>> >> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
>> >> >  {
>> >> > -    struct hvm_ioreq_server *s, *next;
>> >> > +    struct hvm_ioreq_server *s;
>> >> > +    unsigned int id;
>> >> >
>> >> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> >> >
>> >> >      /* No need to domain_pause() as the domain is being torn down */
>> >> >
>> >> > -    list_for_each_entry_safe ( s,
>> >> > -                               next,
>> >> > -                               &d->arch.hvm_domain.ioreq_server.list,
>> >> > -                               list_entry )
>> >> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
>> >> >      {
>> >> > -        bool is_default = (s == d-
>> >arch.hvm_domain.default_ioreq_server);
>> >> > -
>> >> > -        hvm_ioreq_server_disable(s, is_default);
>> >> > -
>> >> > -        if ( is_default )
>> >> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
>> >> > +        if ( !s )
>> >> > +            continue;
>> >> >
>> >> > -        list_del(&s->list_entry);
>> >> > +        hvm_ioreq_server_disable(s);
>> >> > +        hvm_ioreq_server_deinit(s);
>> >> >
>> >> > -        hvm_ioreq_server_deinit(s, is_default);
>> >> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
>> >> > +        --d->arch.hvm_domain.ioreq_server.count;
>> >>
>> >> Seeing this - do you actually need the count as a separate field?
>> >> I.e. are there performance critical uses of it, where going through
>> >> the array would be too expensive? Most of the uses are just
>> >> ASSERT()s anyway.
>> >
>> > The specific case is in hvm_select_ioreq_server(). If there was no count
>> > then the array would have to be searched for the initial test.
>> 
>> And is this something that happens frequently, i.e. the
>> performance of which matters?
> 
> Yes, this is on the critical emulation path. I.e. it is a per-io call.

That's not answering the question, because you leave out implied
context: is it a performance-critical path to get here with no
ioreq servers registered (i.e. count being zero)? After all, if count is
non-zero, you get past the early-out conditional there anyway.

Jan


* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-26 12:38           ` Jan Beulich
@ 2017-09-26 12:41             ` Paul Durrant
  2017-09-26 13:03               ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 12:41 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 September 2017 13:38
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: RE: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq
> servers rather than a list
> 
> >>> On 26.09.17 at 14:12, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 26 September 2017 12:45
> >> >>> On 26.09.17 at 12:55, <Paul.Durrant@citrix.com> wrote:
> >> >> Sent: 25 September 2017 16:17
> >> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> >> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct
> domain
> >> >> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
> >> >> >  {
> >> >> > -    struct hvm_ioreq_server *s, *next;
> >> >> > +    struct hvm_ioreq_server *s;
> >> >> > +    unsigned int id;
> >> >> >
> >> >> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >> >> >
> >> >> >      /* No need to domain_pause() as the domain is being torn down
> */
> >> >> >
> >> >> > -    list_for_each_entry_safe ( s,
> >> >> > -                               next,
> >> >> > -                               &d->arch.hvm_domain.ioreq_server.list,
> >> >> > -                               list_entry )
> >> >> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
> >> >> >      {
> >> >> > -        bool is_default = (s == d-
> >> >arch.hvm_domain.default_ioreq_server);
> >> >> > -
> >> >> > -        hvm_ioreq_server_disable(s, is_default);
> >> >> > -
> >> >> > -        if ( is_default )
> >> >> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
> >> >> > +        if ( !s )
> >> >> > +            continue;
> >> >> >
> >> >> > -        list_del(&s->list_entry);
> >> >> > +        hvm_ioreq_server_disable(s);
> >> >> > +        hvm_ioreq_server_deinit(s);
> >> >> >
> >> >> > -        hvm_ioreq_server_deinit(s, is_default);
> >> >> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
> >> >> > +        --d->arch.hvm_domain.ioreq_server.count;
> >> >>
> >> >> Seeing this - do you actually need the count as a separate field?
> >> >> I.e. are there performance critical uses of it, where going through
> >> >> the array would be too expensive? Most of the uses are just
> >> >> ASSERT()s anyway.
> >> >
> >> > The specific case is in hvm_select_ioreq_server(). If there was no count
> >> > then the array would have to be searched for the initial test.
> >>
> >> And is this something that happens frequently, i.e. the
> >> performance of which matters?
> >
> > Yes, this is on the critical emulation path. I.e. it is a per-io call.
> 
> That's not answering the question, because you leave out implied
> context: Is it a performance critical path to get here with no
> ioreq server registers (i.e. count being zero)? After all if count is
> non-zero, you get past the early-out conditional there anyway.
> 

Clearly, only emulation that does not hit any IOREQ server would be impacted. Generally you'd hope that the guest would not hit non-existent memory ranges too often but, if it does, then not maintaining a count would mean a small performance regression relative to the code as it stands now.

  Paul

> Jan


* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-26 12:35         ` Jan Beulich
@ 2017-09-26 12:49           ` Paul Durrant
  2017-09-27 11:34             ` Andrew Cooper
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 12:49 UTC (permalink / raw)
  To: 'Jan Beulich', Andrew Cooper; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 September 2017 13:35
> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org
> Subject: RE: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to
> acquire guest resources
> 
> >>> On 26.09.17 at 14:20, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> >> Paul Durrant
> >> Sent: 25 September 2017 16:00
> >> To: 'Jan Beulich' <JBeulich@suse.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> >> devel@lists.xenproject.org
> >> Subject: Re: [Xen-devel] [PATCH v7 02/12] x86/mm: add
> >> HYPERVISOR_memory_op to acquire guest resources
> >>
> >> > -----Original Message-----
> >> > From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > Sent: 25 September 2017 15:23
> >> > To: Paul Durrant <Paul.Durrant@citrix.com>
> >> > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> >> > devel@lists.xenproject.org
> >> > Subject: Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op
> to
> >> > acquire guest resources
> >> >
> >> > >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> > > Certain memory resources associated with a guest are not necessarily
> >> > > present in the guest P2M and so are not necessarily available to be
> >> > > foreign-mapped by a tools domain unless they are inserted, which
> risks
> >> > > shattering a super-page mapping.
> >> >
> >> > Btw., I'm additionally having trouble seeing this shattering of a
> >> > superpage: For one, xc_core_arch_get_scratch_gpfn() could be
> >> > a little less simplistic. And then even with the currently chosen
> >> > value (outside of the range of valid GFNs at that point in time)
> >> > there shouldn't be a larger page to be shattered, as there should
> >> > be no mapping at all at that index. But perhaps I'm just blind and
> >> > don't see the obvious ...
> >>
> >> The shattering was Andrew's observation. Andrew, can you comment?
> >>
> >
> > Andrew commented verbally on this. It's not actually a shattering as such...
> > The issue, apparently, is that adding the 4k grant table frame into the guest
> > p2m will potentially cause creation of all layers of page table but removing
> > it again will only remove the L1 entry. Thus it is no longer possible to use
> > a superpage for that mapping at any point subsequently.
> 
> First of all - what would cause a mapping to appear at that slot (or in
> a range covering that slot). And then, while re-combining contiguous
> mappings indeed doesn't exist right now, replacing a non-leaf entry
> (page table) with a large page is very well supported (see e.g.
> ept_set_entry(), which even has a comment to that effect). Hence
> I continue to be confused why we need a new mechanism for
> seeding the grant tables.

I'll have to defer to Andrew to answer at this point.

  Paul

> 
> Jan


* Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-18 15:31 ` [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type Paul Durrant
  2017-09-18 16:18   ` Roger Pau Monné
@ 2017-09-26 12:58   ` Jan Beulich
  2017-09-26 13:05     ` Paul Durrant
  1 sibling, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-26 12:58 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Andrew Cooper, Tim Deegan, xen-devel, Ian Jackson

>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> @@ -762,7 +863,8 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
>              goto out;
>      }
>  
> -    *ioreq_gfn = gfn_x(s->ioreq.gfn);
> +    if ( ioreq_gfn )
> +        *ioreq_gfn = gfn_x(s->ioreq.gfn);

Ah, this is what actually wants to be in patch 11. Considering what
you say in the description regarding XEN_DMOP_no_gfns, I wonder
whether it wouldn't be better to return "invalid" indicators in
the GFN output fields of the hypercall when the pages haven't
been mapped to a GFN.

> @@ -780,6 +882,33 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
>      return rc;
>  }
>  
> +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> +                                 unsigned int idx)
> +{
> +    struct hvm_ioreq_server *s;
> +    mfn_t mfn = INVALID_MFN;
> +
> +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    s = get_ioreq_server(d, id);
> +
> +    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )
> +        goto out;
> +
> +    if ( hvm_ioreq_server_alloc_pages(s) )
> +        goto out;
> +
> +    if ( idx == 0 )
> +        mfn = _mfn(page_to_mfn(s->bufioreq.page));
> +    else if ( idx == 1 )
> +        mfn = _mfn(page_to_mfn(s->ioreq.page));

else <some sort of error>? Also, with buffered I/O being optional,
wouldn't it be more natural for index 0 to represent the synchronous
page? And with buffered I/O not enabled, aren't you returning
rubbish (NULL translated by page_to_mfn())?

> + out:
> +    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    return mfn;

The unspecific error (INVALID_MFN) here makes me wonder ...

> @@ -4795,6 +4796,27 @@ static int xenmem_acquire_grant_table(struct domain *d,
>      return 0;
>  }
>  
> +static int xenmem_acquire_ioreq_server(struct domain *d,
> +                                       unsigned int id,
> +                                       unsigned long frame,
> +                                       unsigned long nr_frames,
> +                                       unsigned long mfn_list[])
> +{
> +    unsigned int i;
> +
> +    for ( i = 0; i < nr_frames; i++ )
> +    {
> +        mfn_t mfn = hvm_get_ioreq_server_frame(d, id, frame + i);
> +
> +        if ( mfn_eq(mfn, INVALID_MFN) )
> +            return -EINVAL;

... how meaningful EINVAL here is. In particular if page allocation
failed, ENOMEM would certainly be more appropriate (and give the
caller a better idea of what needs to be done).

Jan


* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-26 12:41             ` Paul Durrant
@ 2017-09-26 13:03               ` Jan Beulich
  2017-09-26 13:11                 ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-26 13:03 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 26.09.17 at 14:41, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 26 September 2017 13:38
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>> devel@lists.xenproject.org 
>> Subject: RE: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq
>> servers rather than a list
>> 
>> >>> On 26.09.17 at 14:12, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 26 September 2017 12:45
>> >> >>> On 26.09.17 at 12:55, <Paul.Durrant@citrix.com> wrote:
>> >> >> Sent: 25 September 2017 16:17
>> >> >> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> >> >> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct
>> domain
>> >> >> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
>> >> >> >  {
>> >> >> > -    struct hvm_ioreq_server *s, *next;
>> >> >> > +    struct hvm_ioreq_server *s;
>> >> >> > +    unsigned int id;
>> >> >> >
>> >> >> >      spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> >> >> >
>> >> >> >      /* No need to domain_pause() as the domain is being torn down
>> */
>> >> >> >
>> >> >> > -    list_for_each_entry_safe ( s,
>> >> >> > -                               next,
>> >> >> > -                               &d->arch.hvm_domain.ioreq_server.list,
>> >> >> > -                               list_entry )
>> >> >> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
>> >> >> >      {
>> >> >> > -        bool is_default = (s == d-
>> >> >arch.hvm_domain.default_ioreq_server);
>> >> >> > -
>> >> >> > -        hvm_ioreq_server_disable(s, is_default);
>> >> >> > -
>> >> >> > -        if ( is_default )
>> >> >> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
>> >> >> > +        if ( !s )
>> >> >> > +            continue;
>> >> >> >
>> >> >> > -        list_del(&s->list_entry);
>> >> >> > +        hvm_ioreq_server_disable(s);
>> >> >> > +        hvm_ioreq_server_deinit(s);
>> >> >> >
>> >> >> > -        hvm_ioreq_server_deinit(s, is_default);
>> >> >> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
>> >> >> > +        --d->arch.hvm_domain.ioreq_server.count;
>> >> >>
>> >> >> Seeing this - do you actually need the count as a separate field?
>> >> >> I.e. are there performance critical uses of it, where going through
>> >> >> the array would be too expensive? Most of the uses are just
>> >> >> ASSERT()s anyway.
>> >> >
>> >> > The specific case is in hvm_select_ioreq_server(). If there was no count
>> >> > then the array would have to be searched for the initial test.
>> >>
>> >> And is this something that happens frequently, i.e. the
>> >> performance of which matters?
>> >
>> > Yes, this is on the critical emulation path. I.e. it is a per-io call.
>> 
>> That's not answering the question, because you leave out implied
>> context: Is it a performance critical path to get here with no
>> ioreq server registers (i.e. count being zero)? After all if count is
>> non-zero, you get past the early-out conditional there anyway.
>> 
> 
> Clearly, only emulation that does not hit any IOREQ servers would be 
> impacted. Generally you'd hope that the guest would not hit memory ranges 
> that don't exist too often but, if it does, then by not maintaining a count 
> there would be a small performance regression over the code as it stands now.

By "now" you mean with this patch in place as is, rather than the
current tip of the tree? I don't see a regression compared to
current staging, and I question the need to performance-optimize
a corner case a guest should not have much interest in getting into.
Anyway - I'm not meaning to block the patch because of the
presence of this field (particularly as you're the maintainer of this
code anyway); it merely seems to me that the patch would end up
smaller but no worse in quality if the field wasn't there.

Jan


* Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-26 12:58   ` Jan Beulich
@ 2017-09-26 13:05     ` Paul Durrant
  2017-09-26 13:10       ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 13:05 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Andrew Cooper, Tim (Xen.org), Ian Jackson, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 September 2017 13:59
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource
> type...
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > @@ -762,7 +863,8 @@ int hvm_get_ioreq_server_info(struct domain *d,
> ioservid_t id,
> >              goto out;
> >      }
> >
> > -    *ioreq_gfn = gfn_x(s->ioreq.gfn);
> > +    if ( ioreq_gfn )
> > +        *ioreq_gfn = gfn_x(s->ioreq.gfn);
> 
> Ah, this is what actually wants to be in patch 11. Considering what
> you say in the description regarding the XEN_DMOP_no_gfns I
> wonder whether you wouldn't better return "invalid" indicators in
> the GFN output fields of the hypercall when the pages haven't
> been mapped to a GFN.
> 
> > @@ -780,6 +882,33 @@ int hvm_get_ioreq_server_info(struct domain *d,
> ioservid_t id,
> >      return rc;
> >  }
> >
> > +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> > +                                 unsigned int idx)
> > +{
> > +    struct hvm_ioreq_server *s;
> > +    mfn_t mfn = INVALID_MFN;
> > +
> > +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > +    s = get_ioreq_server(d, id);
> > +
> > +    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )
> > +        goto out;
> > +
> > +    if ( hvm_ioreq_server_alloc_pages(s) )
> > +        goto out;
> > +
> > +    if ( idx == 0 )
> > +        mfn = _mfn(page_to_mfn(s->bufioreq.page));
> > +    else if ( idx == 1 )
> > +        mfn = _mfn(page_to_mfn(s->ioreq.page));
> 
> else <some sort of error>?

I set mfn to INVALID above. Is that not enough?

> Also with buffered I/O being optional,
> wouldn't it be more natural for index 0 representing the synchronous
> page? And with buffered I/O not enabled, aren't you returning
> rubbish (NULL translated by page_to_mfn())?

Good point. I should leave the mfn set to invalid if the buffered page is not there. As for making the buffered page index zero, and putting the synchronous ones afterwards, that was intentional: more synchronous pages will need to be added if we need to support more vCPUs, whereas there should only ever need to be one buffered page.

  Paul

> 
> > + out:
> > +    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > +    return mfn;
> 
> The unspecific error (INVALID_MFN) here makes me wonder ...
> 
> > @@ -4795,6 +4796,27 @@ static int xenmem_acquire_grant_table(struct
> domain *d,
> >      return 0;
> >  }
> >
> > +static int xenmem_acquire_ioreq_server(struct domain *d,
> > +                                       unsigned int id,
> > +                                       unsigned long frame,
> > +                                       unsigned long nr_frames,
> > +                                       unsigned long mfn_list[])
> > +{
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < nr_frames; i++ )
> > +    {
> > +        mfn_t mfn = hvm_get_ioreq_server_frame(d, id, frame + i);
> > +
> > +        if ( mfn_eq(mfn, INVALID_MFN) )
> > +            return -EINVAL;
> 
> ... how meaningful EINVAL here is. In particular if page allocation
> failed, ENOMEM would certainly be more appropriate (and give the
> caller a better idea of what needs to be done).
> 
> Jan


* Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-26 13:05     ` Paul Durrant
@ 2017-09-26 13:10       ` Jan Beulich
  2017-09-26 13:12         ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-26 13:10 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Andrew Cooper, Tim (Xen.org), Ian Jackson, xen-devel

>>> On 26.09.17 at 15:05, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 26 September 2017 13:59
>> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> > @@ -780,6 +882,33 @@ int hvm_get_ioreq_server_info(struct domain *d,
>> ioservid_t id,
>> >      return rc;
>> >  }
>> >
>> > +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
>> > +                                 unsigned int idx)
>> > +{
>> > +    struct hvm_ioreq_server *s;
>> > +    mfn_t mfn = INVALID_MFN;
>> > +
>> > +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> > +
>> > +    s = get_ioreq_server(d, id);
>> > +
>> > +    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )
>> > +        goto out;
>> > +
>> > +    if ( hvm_ioreq_server_alloc_pages(s) )
>> > +        goto out;
>> > +
>> > +    if ( idx == 0 )
>> > +        mfn = _mfn(page_to_mfn(s->bufioreq.page));
>> > +    else if ( idx == 1 )
>> > +        mfn = _mfn(page_to_mfn(s->ioreq.page));
>> 
>> else <some sort of error>?
> 
> I set mfn to INVALID above. Is that not enough?

Together with ...

>> Also with buffered I/O being optional,
>> wouldn't it be more natural for index 0 representing the synchronous
>> page? And with buffered I/O not enabled, aren't you returning
>> rubbish (NULL translated by page_to_mfn())?
> 
> Good point. I should leave the mfn set to invalid if the buffered page is 
> not there.

... this - no, I don't think so. The two cases would be
indistinguishable. An invalid index should be EINVAL or EDOM or
some such.

> As for making it zero, and putting the synchronous ones 
> afterwards, that was intentional because more synchronous pages will need to 
> be added if we needs to support more vCPUs, whereas there should only ever 
> need to be one buffered page.

Ah, I see.

Jan


* Re: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  2017-09-26 13:03               ` Jan Beulich
@ 2017-09-26 13:11                 ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 13:11 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 September 2017 14:04
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: RE: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq
> servers rather than a list
> 
> >>> On 26.09.17 at 14:41, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 26 September 2017 13:38
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> >> devel@lists.xenproject.org
> >> Subject: RE: [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq
> >> servers rather than a list
> >>
> >> >>> On 26.09.17 at 14:12, <Paul.Durrant@citrix.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 26 September 2017 12:45
> >> >> >>> On 26.09.17 at 12:55, <Paul.Durrant@citrix.com> wrote:
> >> >> >> Sent: 25 September 2017 16:17
> >> >> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> >> >> > @@ -785,29 +797,27 @@ int hvm_get_ioreq_server_info(struct
> >> domain
> >> >> >> >  void hvm_destroy_all_ioreq_servers(struct domain *d)
> >> >> >> >  {
> >> >> >> > -    struct hvm_ioreq_server *s, *next;
> >> >> >> > +    struct hvm_ioreq_server *s;
> >> >> >> > +    unsigned int id;
> >> >> >> >
> >> >> >> >      spin_lock_recursive(&d-
> >arch.hvm_domain.ioreq_server.lock);
> >> >> >> >
> >> >> >> >      /* No need to domain_pause() as the domain is being torn
> down
> >> */
> >> >> >> >
> >> >> >> > -    list_for_each_entry_safe ( s,
> >> >> >> > -                               next,
> >> >> >> > -                               &d->arch.hvm_domain.ioreq_server.list,
> >> >> >> > -                               list_entry )
> >> >> >> > +    FOR_EACH_IOREQ_SERVER(d, id, s)
> >> >> >> >      {
> >> >> >> > -        bool is_default = (s == d-
> >> >> >arch.hvm_domain.default_ioreq_server);
> >> >> >> > -
> >> >> >> > -        hvm_ioreq_server_disable(s, is_default);
> >> >> >> > -
> >> >> >> > -        if ( is_default )
> >> >> >> > -            d->arch.hvm_domain.default_ioreq_server = NULL;
> >> >> >> > +        if ( !s )
> >> >> >> > +            continue;
> >> >> >> >
> >> >> >> > -        list_del(&s->list_entry);
> >> >> >> > +        hvm_ioreq_server_disable(s);
> >> >> >> > +        hvm_ioreq_server_deinit(s);
> >> >> >> >
> >> >> >> > -        hvm_ioreq_server_deinit(s, is_default);
> >> >> >> > +        ASSERT(d->arch.hvm_domain.ioreq_server.count);
> >> >> >> > +        --d->arch.hvm_domain.ioreq_server.count;
> >> >> >>
> >> >> >> Seeing this - do you actually need the count as a separate field?
> >> >> >> I.e. are there performance critical uses of it, where going through
> >> >> >> the array would be too expensive? Most of the uses are just
> >> >> >> ASSERT()s anyway.
> >> >> >
> >> >> > The specific case is in hvm_select_ioreq_server(). If there was no
> count
> >> >> > then the array would have to be searched for the initial test.
> >> >>
> >> >> And is this something that happens frequently, i.e. the
> >> >> performance of which matters?
> >> >
> >> > Yes, this is on the critical emulation path. I.e. it is a per-io call.
> >>
> >> That's not answering the question, because you leave out implied
> >> context: Is it a performance critical path to get here with no
> >> ioreq server registers (i.e. count being zero)? After all if count is
> >> non-zero, you get past the early-out conditional there anyway.
> >>
> >
> > Clearly, only emulation that does not hit any IOREQ servers would be
> > impacted. Generally you'd hope that the guest would not hit memory
> ranges
> > that don't exist too often but, if it does, then by not maintaining a count
> > there would be a small performance regression over the code as it stands
> now.
> 
> By "now" you mean with this patch in place as is, rather than the
> current tip of the tree? I don't see a regression compared to
> current staging, and I question the need to performance optimize
> a corner case a guest should not have much interest getting into.

I mean against current master, which has the following test at:

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/hvm/ioreq.c;hb=HEAD#l1168

    if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
        return NULL;

Without a count the array would need to be searched, which is clearly slower than testing the emptiness of a linked list.
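
Without it, the equivalent initial test would be a scan, something like (sketch):

    unsigned int id;

    for ( id = 0; id < MAX_NR_IOREQ_SERVERS; id++ )
        if ( d->arch.hvm_domain.ioreq_server.server[id] )
            break;

    if ( id == MAX_NR_IOREQ_SERVERS )
        return NULL;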

> Anyway - I'm not meaning to block the patch because of the
> presence of this field (the more that you're the maintainer of this
> code anyway), it merely seems to me that the patch would be
> ending up smaller but no worse in quality if the field wasn't there.
> 

Agreed, it's a corner case, so the regression is not of great concern. I'll get rid of the count for now... It can always be put back if it proves to be a real issue.

  Paul

> Jan


* Re: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type...
  2017-09-26 13:10       ` Jan Beulich
@ 2017-09-26 13:12         ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-26 13:12 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Andrew Cooper, Tim (Xen.org), Ian Jackson, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 September 2017 14:11
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: RE: [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource
> type...
> 
> >>> On 26.09.17 at 15:05, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 26 September 2017 13:59
> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> > @@ -780,6 +882,33 @@ int hvm_get_ioreq_server_info(struct domain
> *d,
> >> ioservid_t id,
> >> >      return rc;
> >> >  }
> >> >
> >> > +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> >> > +                                 unsigned int idx)
> >> > +{
> >> > +    struct hvm_ioreq_server *s;
> >> > +    mfn_t mfn = INVALID_MFN;
> >> > +
> >> > +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> >> > +
> >> > +    s = get_ioreq_server(d, id);
> >> > +
> >> > +    if ( id >= MAX_NR_IOREQ_SERVERS || !s || IS_DEFAULT(s) )
> >> > +        goto out;
> >> > +
> >> > +    if ( hvm_ioreq_server_alloc_pages(s) )
> >> > +        goto out;
> >> > +
> >> > +    if ( idx == 0 )
> >> > +        mfn = _mfn(page_to_mfn(s->bufioreq.page));
> >> > +    else if ( idx == 1 )
> >> > +        mfn = _mfn(page_to_mfn(s->ioreq.page));
> >>
> >> else <some sort of error>?
> >
> > I set mfn to INVALID above. Is that not enough?
> 
> Together with ...
> 
> >> Also with buffered I/O being optional,
> >> wouldn't it be more natural for index 0 representing the synchronous
> >> page? And with buffered I/O not enabled, aren't you returning
> >> rubbish (NULL translated by page_to_mfn())?
> >
> > Good point. I should leave the mfn set to invalid if the buffered page is
> > not there.
> 
> ... this - no, I don't think so. The two cases would be
> indistinguishable. An invalid index should be EINVAL or EDOM or
> some such.
> 

Ok, I'll change the function to return an errno to distinguish the cases.
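
Roughly what I have in mind (sketch only, with the mfn handed back via an
out parameter):

    int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
                                   unsigned int idx, mfn_t *mfn)
    {
        struct hvm_ioreq_server *s;
        int rc;

        spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);

        s = get_ioreq_server(d, id);

        rc = -ENOENT;
        if ( !s || IS_DEFAULT(s) )
            goto out;

        rc = hvm_ioreq_server_alloc_pages(s);
        if ( rc )
            goto out;

        switch ( idx )
        {
        case 0: /* buffered ioreq page, only present if enabled */
            rc = -ENOENT;
            if ( !HANDLE_BUFIOREQ(s) )
                break;
            *mfn = _mfn(page_to_mfn(s->bufioreq.page));
            rc = 0;
            break;

        case 1: /* synchronous ioreq page */
            *mfn = _mfn(page_to_mfn(s->ioreq.page));
            rc = 0;
            break;

        default: /* invalid index */
            rc = -EINVAL;
            break;
        }

     out:
        spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);

        return rc;
    }

xenmem_acquire_ioreq_server() can then propagate rc (e.g. -ENOMEM from the allocation) rather than folding everything into -EINVAL, which should also address your earlier point.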

  Paul

> > As for making it zero, and putting the synchronous ones
> > afterwards, that was intentional because more synchronous pages will
> need to
> > be added if we needs to support more vCPUs, whereas there should only
> ever
> > need to be one buffered page.
> 
> Ah, I see.
> 
> Jan


* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-25 13:02   ` Jan Beulich
  2017-09-25 13:29     ` Andrew Cooper
  2017-09-25 14:42     ` Paul Durrant
@ 2017-09-27 11:18     ` Paul Durrant
  2017-09-27 12:46       ` Jan Beulich
  2 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-27 11:18 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 September 2017 14:03
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map
> guest mfns
> 
> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> > In the case where a PV domain is mapping guest resources then it needs
> make
> > the HYPERVISOR_mmu_update call using DOMID_SELF, rather than the
> guest
> > domid, so that the passed in gmfn values are correctly treated as mfns
> > rather than gfns present in the guest p2m.
> 
> Since things are presently working fine, I think the description is not
> really accurate. You only require the new behavior if you don't know
> the GFN of the page you want to map, and that it has to be
> DOMID_SELF that should be passed also doesn't appear to derive
> from anything else. To properly judge about the need for this patch
> it would help if it was briefly explained why being able to map by GFN
> is no longer sufficient, and to re-word the DOMID_SELF part.
> 
> The other aspect I don't understand is why this is needed for PV
> Dom0, but not for PVH. The answer here can't be "because PVH
> Dom0 isn't supported yet", because it eventually will be, and then
> there will still be the problem of PVH supposedly having no notion
> of MFNs (be their own or foreign guest ones). The answer also
> can't be "since it would use XENMAPSPACE_gmfn_foreign", as
> that's acting in terms of GFN too.
> 
> > This patch removes a check which currently disallows mapping of a page
> > when the owner of the page tables matches the domain passed to
> > HYPERVISOR_mmu_update, but that domain is not the real owner of the page.
> > The check was introduced by patch d3c6a215ca9 ("x86: Clean up
> > get_page_from_l1e() to correctly distinguish between owner-of-pte and
> > owner-of-data-page in all cases") but it's not clear why it was needed.
> 
> I think the goal here simply was to not permit anything that doesn't
> really need permitting. Furthermore the check being "introduced"
> there was, afaict, replacing the earlier d != curr->domain.
> 
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -1024,12 +1024,15 @@ get_page_from_l1e(
> >                     (real_pg_owner != dom_cow) ) )
> >      {
> >          /*
> > -         * Let privileged domains transfer the right to map their target
> > -         * domain's pages. This is used to allow stub-domain pvfb export to
> > -         * dom0, until pvfb supports granted mappings. At that time this
> > -         * minor hack can go away.
> > +         * If the real page owner is not the domain specified in the
> > +         * hypercall then establish that the specified domain has
> > +         * mapping privilege over the page owner.
> > +         * This is used to allow stub-domain pvfb export to dom0. It is
> > +         * also used to allow a privileged PV domain to map mfns using
> > +         * DOMID_SELF, which is needed for mapping guest resources such
> > +         * as grant table frames.
> 
> How do grant table frames come into the picture here? So far
> I had assumed only ioreq server pages are in need of this.
> 
> >           */
> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> > +        if ( (real_pg_owner == NULL) ||
> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
> >          {
> >              gdprintk(XENLOG_WARNING,
> 
> I'm concerned of the effect of the change on the code paths
> which you're not really interested in: alloc_l1_table(),
> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
> explicitly pass both domains identical, and are now suddenly able
> to do things they weren't supposed to do. A similar concern
> applies to __do_update_va_mapping() calling mod_l1_table().
> 
> I therefore wonder whether the solution to your problem
> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
> to improvement suggestions). This at the same time would
> address my concern regarding the misleading DOMID_SELF
> passing when really a foreign domain's page is meant.

Looking at this I wonder whether a cleaner solution would be to introduce a new domid: DOMID_ANY. The meaning of this would be along the same sort of lines as DOMID_XEN or DOMID_IO and it would be used in mmu_update to mean 'any page over which the caller has privilege'. Does that sound reasonable?

  Paul

> 
> Jan
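
To make the mechanism being debated concrete, here is a minimal sketch of the
kind of mmu_update call the patch aims to enable, written as a Linux-style PV
guest kernel would issue it. The helper name, the chosen PTE flags, the value
layout and the include paths are illustrative assumptions, not taken from the
series:

#include <xen/interface/xen.h>   /* struct mmu_update, MMU_NORMAL_PT_UPDATE */
#include <asm/xen/hypercall.h>   /* HYPERVISOR_mmu_update() */
#include <asm/pgtable_types.h>   /* _PAGE_PRESENT, _PAGE_RW */

/*
 * pte_maddr: machine address of the L1 entry to rewrite;
 * mfn:       frame of the *foreign* page the caller has mapping privilege over.
 */
static int map_foreign_mfn(uint64_t pte_maddr, unsigned long mfn)
{
    struct mmu_update req;
    int done = 0;

    req.ptr = pte_maddr | MMU_NORMAL_PT_UPDATE;
    req.val = ((uint64_t)mfn << PAGE_SHIFT) | _PAGE_PRESENT | _PAGE_RW;

    /*
     * Passing DOMID_SELF makes the hypervisor treat the frame in req.val as
     * an MFN rather than translating it through a guest p2m; the patch
     * relaxes get_page_from_l1e() so this succeeds when the caller has
     * mapping privilege over the page's real owner.
     */
    return HYPERVISOR_mmu_update(&req, 1, &done, DOMID_SELF);
}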


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-26 12:49           ` Paul Durrant
@ 2017-09-27 11:34             ` Andrew Cooper
  2017-09-27 12:56               ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Andrew Cooper @ 2017-09-27 11:34 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'; +Cc: xen-devel

On 26/09/17 13:49, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 26 September 2017 13:35
>> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
>> <Paul.Durrant@citrix.com>
>> Cc: xen-devel@lists.xenproject.org
>> Subject: RE: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to
>> acquire guest resources
>>
>>>>> On 26.09.17 at 14:20, <Paul.Durrant@citrix.com> wrote:
>>>>  -----Original Message-----
>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>>>> Paul Durrant
>>>> Sent: 25 September 2017 16:00
>>>> To: 'Jan Beulich' <JBeulich@suse.com>
>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>>>> devel@lists.xenproject.org
>>>> Subject: Re: [Xen-devel] [PATCH v7 02/12] x86/mm: add
>>>> HYPERVISOR_memory_op to acquire guest resources
>>>>
>>>>> -----Original Message-----
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>> Sent: 25 September 2017 15:23
>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>>>>> devel@lists.xenproject.org
>>>>> Subject: Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op
>> to
>>>>> acquire guest resources
>>>>>
>>>>>>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>>>>>> Certain memory resources associated with a guest are not necessarily
>>>>>> present in the guest P2M and so are not necessarily available to be
>>>>>> foreign-mapped by a tools domain unless they are inserted, which risks
>>>>>> shattering a super-page mapping.
>>>>> Btw., I'm additionally having trouble seeing this shattering of a
>>>>> superpage: For one, xc_core_arch_get_scratch_gpfn() could be
>>>>> a little less simplistic. And then even with the currently chosen
>>>>> value (outside of the range of valid GFNs at that point in time)
>>>>> there shouldn't be a larger page to be shattered, as there should
>>>>> be no mapping at all at that index. But perhaps I'm just blind and
>>>>> don't see the obvious ...
>>>> The shattering was Andrew's observation. Andrew, can you comment?
>>>>
>>> Andrew commented verbally on this. It's not actually a shattering as such...
>>> The issue, apparently, is that adding the 4k grant table frame into the guest
>>> p2m will potentially cause creation of all layers of page table but removing
>>> it again will only remove the L1 entry. Thus it is no longer possible to use
>>> a superpage for that mapping at any point subsequently.
>> First of all - what would cause a mapping to appear at that slot (or in
>> a range covering that slot).

???

At the moment, the toolstack's *only* way of editing the grant table of
an HVM guest is to add it into the p2m, map the gfn, write two values,
and unmap it.  That is how a 4k mapping gets added, which forces an
allocation or shattering to cause an L1 table to exist.

This is a kludge and is architecturally unclean.
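
(A rough sketch of the sequence described above, at the toolstack level. The
libxc helper names, arguments and header paths here are approximations for
illustration only; the real seeding code in tools/libxc differs in detail.)

#include <sys/mman.h>   /* munmap, PROT_READ, PROT_WRITE */
#include <xenctrl.h>    /* xc_* helpers, XENMAPSPACE_grant_table, GTF_* */

static int seed_grant_table_via_p2m(xc_interface *xch, uint32_t domid,
                                    xen_pfn_t scratch_gfn,
                                    uint32_t console_domid, xen_pfn_t console_gfn,
                                    uint32_t xenstore_domid, xen_pfn_t xenstore_gfn)
{
    grant_entry_v1_t *gnttab;

    /* 1. Insert grant table frame 0 into the guest p2m at a scratch GFN.
     *    This is the step that forces an L1 table to exist at that slot. */
    if ( xc_domain_add_to_physmap(xch, domid, XENMAPSPACE_grant_table,
                                  0, scratch_gfn) )
        return -1;

    /* 2. Foreign-map the GFN and write the two well-known entries. */
    gnttab = xc_map_foreign_range(xch, domid, 4096,
                                  PROT_READ | PROT_WRITE, scratch_gfn);
    if ( !gnttab )
        return -1;

    gnttab[GNTTAB_RESERVED_CONSOLE].domid = console_domid;
    gnttab[GNTTAB_RESERVED_CONSOLE].frame = console_gfn;
    gnttab[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;

    gnttab[GNTTAB_RESERVED_XENSTORE].domid = xenstore_domid;
    gnttab[GNTTAB_RESERVED_XENSTORE].frame = xenstore_gfn;
    gnttab[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;

    munmap(gnttab, 4096);

    /* 3. Take the frame back out of the p2m again. Only the L1 entry goes
     *    away; the intermediate page tables created in step 1 remain. */
    return xc_domain_remove_from_physmap(xch, domid, scratch_gfn);
}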

>>  And then, while re-combining contiguous
>> mappings indeed doesn't exist right now, replacing a non-leaf entry
>> (page table) with a large page is very well supported (see e.g.
>> ept_set_entry(), which even has a comment to that effect).

I don't see anything equivalent in the NPT or IOMMU logic.

>>  Hence
>> I continue to be confused why we need a new mechanism for
>> seeding the grant tables.
> I'll have to defer to Andrew to answer at this point.

Joao's improvements for network transmit require a trusted backend to be
able to map the full grant table.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-27 11:18     ` Paul Durrant
@ 2017-09-27 12:46       ` Jan Beulich
  2017-09-27 12:49         ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-27 12:46 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 27.09.17 at 13:18, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 25 September 2017 14:03
>> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
>> > +        if ( (real_pg_owner == NULL) ||
>> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
>> >          {
>> >              gdprintk(XENLOG_WARNING,
>> 
>> I'm concerned of the effect of the change on the code paths
>> which you're not really interested in: alloc_l1_table(),
>> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
>> explicitly pass both domains identical, and are now suddenly able
>> to do things they weren't supposed to do. A similar concern
>> applies to __do_update_va_mapping() calling mod_l1_table().
>> 
>> I therefore wonder whether the solution to your problem
>> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
>> to improvement suggestions). This at the same time would
>> address my concern regarding the misleading DOMID_SELF
>> passing when really a foreign domain's page is meant.
> 
> Looking at this I wonder whether a cleaner solution would be to introduce a 
> new domid: DOMID_ANY. This meaning of this would be along the same sort of 
> lines as DOMID_XEN or DOMID_IO and would be used in mmu_update to mean 'any 
> page over which the caller has privilege'. Does that sound reasonable?

Not really, no. Even if the caller has privilege over multiple domains,
it should still specify which one it means. Otherwise we may end up
with a page transferring ownership behind its back, and it doing
something to one domain which was meant to be done to another.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-27 12:46       ` Jan Beulich
@ 2017-09-27 12:49         ` Paul Durrant
  2017-09-27 13:31           ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-27 12:49 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 27 September 2017 13:47
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: RE: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map
> guest mfns
> 
> >>> On 27.09.17 at 13:18, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 25 September 2017 14:03
> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> >> > +        if ( (real_pg_owner == NULL) ||
> >> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
> >> >          {
> >> >              gdprintk(XENLOG_WARNING,
> >>
> >> I'm concerned of the effect of the change on the code paths
> >> which you're not really interested in: alloc_l1_table(),
> >> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
> >> explicitly pass both domains identical, and are now suddenly able
> >> to do things they weren't supposed to do. A similar concern
> >> applies to __do_update_va_mapping() calling mod_l1_table().
> >>
> >> I therefore wonder whether the solution to your problem
> >> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
> >> to improvement suggestions). This at the same time would
> >> address my concern regarding the misleading DOMID_SELF
> >> passing when really a foreign domain's page is meant.
> >
> > Looking at this I wonder whether a cleaner solution would be to introduce a
> > new domid: DOMID_ANY. This meaning of this would be along the same
> sort of
> > lines as DOMID_XEN or DOMID_IO and would be used in mmu_update to
> mean 'any
> > page over which the caller has privilege'. Does that sound reasonable?
> 
> Not really, no. Even if the caller has privilege over multiple domains,
> it should still specify which one it means. Otherwise we may end up
> with a page transferring ownership behind its back, and it doing
> something to one domain which was meant to be done to another.
> 

Ok, I'll claim the final cmd value then.

  Paul

> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  2017-09-27 11:34             ` Andrew Cooper
@ 2017-09-27 12:56               ` Jan Beulich
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Beulich @ 2017-09-27 12:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant

>>> On 27.09.17 at 13:34, <andrew.cooper3@citrix.com> wrote:
> On 26/09/17 13:49, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>> Sent: 26 September 2017 13:35
>>> To: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
>>> <Paul.Durrant@citrix.com>
>>> Cc: xen-devel@lists.xenproject.org 
>>> Subject: RE: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to
>>> acquire guest resources
>>>
>>>>>> On 26.09.17 at 14:20, <Paul.Durrant@citrix.com> wrote:
>>>>>  -----Original Message-----
>>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>>>>> Paul Durrant
>>>>> Sent: 25 September 2017 16:00
>>>>> To: 'Jan Beulich' <JBeulich@suse.com>
>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>>>>> devel@lists.xenproject.org 
>>>>> Subject: Re: [Xen-devel] [PATCH v7 02/12] x86/mm: add
>>>>> HYPERVISOR_memory_op to acquire guest resources
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>> Sent: 25 September 2017 15:23
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>>>>>> devel@lists.xenproject.org 
>>>>>> Subject: Re: [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op
>>> to
>>>>>> acquire guest resources
>>>>>>
>>>>>>>>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>>>>>>> Certain memory resources associated with a guest are not necessarily
>>>>>>> present in the guest P2M and so are not necessarily available to be
>>>>>>> foreign-mapped by a tools domain unless they are inserted, which risks
>>>>>>> shattering a super-page mapping.
>>>>>> Btw., I'm additionally having trouble seeing this shattering of a
>>>>>> superpage: For one, xc_core_arch_get_scratch_gpfn() could be
>>>>>> a little less simplistic. And then even with the currently chosen
>>>>>> value (outside of the range of valid GFNs at that point in time)
>>>>>> there shouldn't be a larger page to be shattered, as there should
>>>>>> be no mapping at all at that index. But perhaps I'm just blind and
>>>>>> don't see the obvious ...
>>>>> The shattering was Andrew's observation. Andrew, can you comment?
>>>>>
>>>> Andrew commented verbally on this. It's not actually a shattering as such...
>>>> The issue, apparently, is that adding the 4k grant table frame into the guest
>>>> p2m will potentially cause creation of all layers of page table but removing
>>>> it again will only remove the L1 entry. Thus it is no longer possible to use
>>>> a superpage for that mapping at any point subsequently.
>>> First of all - what would cause a mapping to appear at that slot (or in
>>> a range covering that slot).
> 
> ???
> 
> At the moment, the toolstack's *only* way of editing the grant table of
> an HVM guest is to add it into the p2m, map the gfn, write two values,
> and unmap it.  That is how a 4k mapping gets added, which forces an
> allocation or shattering to cause a L1 table to exist.
> 
> This is a kludge and is architecturally unclean.

Well, if the grant table related parts of the series here were presented
as simply cleaning up a kludge, I'd probably be fine. But so far
it has been claimed that there are other bad effects, besides this
just being (as I would call it) sub-optimal.

>>>  And then, while re-combining contiguous
>>> mappings indeed doesn't exist right now, replacing a non-leaf entry
>>> (page table) with a large page is very well supported (see e.g.
>>> ept_set_entry(), which even has a comment to that effect).
> 
> I don't see anything equivalent in the NPT or IOMMU logic.

Look for intermediate_entry in p2m_pt_set_entry(). In AMD
IOMMU code see iommu_merge_pages(). For VT-d I agree.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-27 12:49         ` Paul Durrant
@ 2017-09-27 13:31           ` Jan Beulich
  2017-09-27 14:22             ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-27 13:31 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 27.09.17 at 14:49, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 27 September 2017 13:47
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
>> devel@lists.xenproject.org 
>> Subject: RE: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map
>> guest mfns
>> 
>> >>> On 27.09.17 at 13:18, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 25 September 2017 14:03
>> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
>> >> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
>> >> > +        if ( (real_pg_owner == NULL) ||
>> >> >               xsm_priv_mapping(XSM_TARGET, pg_owner, real_pg_owner) )
>> >> >          {
>> >> >              gdprintk(XENLOG_WARNING,
>> >>
>> >> I'm concerned of the effect of the change on the code paths
>> >> which you're not really interested in: alloc_l1_table(),
>> >> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
>> >> explicitly pass both domains identical, and are now suddenly able
>> >> to do things they weren't supposed to do. A similar concern
>> >> applies to __do_update_va_mapping() calling mod_l1_table().
>> >>
>> >> I therefore wonder whether the solution to your problem
>> >> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
>> >> to improvement suggestions). This at the same time would
>> >> address my concern regarding the misleading DOMID_SELF
>> >> passing when really a foreign domain's page is meant.
>> >
>> > Looking at this I wonder whether a cleaner solution would be to introduce a
>> > new domid: DOMID_ANY. This meaning of this would be along the same
>> sort of
>> > lines as DOMID_XEN or DOMID_IO and would be used in mmu_update to
>> mean 'any
>> > page over which the caller has privilege'. Does that sound reasonable?
>> 
>> Not really, no. Even if the caller has privilege over multiple domains,
>> it should still specify which one it means. Otherwise we may end up
>> with a page transferring ownership behind its back, and it doing
>> something to one domain which was meant to be done to another.
>> 
> 
> Ok, I'll claim the final cmd value then.

Final? We've got 5 left (for a total of 3 bits) afaict.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-27 13:31           ` Jan Beulich
@ 2017-09-27 14:22             ` Paul Durrant
  2017-09-27 14:42               ` Jan Beulich
  0 siblings, 1 reply; 62+ messages in thread
From: Paul Durrant @ 2017-09-27 14:22 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 27 September 2017 14:31
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: RE: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map
> guest mfns
> 
> >>> On 27.09.17 at 14:49, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 27 September 2017 13:47
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> >> devel@lists.xenproject.org
> >> Subject: RE: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to
> map
> >> guest mfns
> >>
> >> >>> On 27.09.17 at 13:18, <Paul.Durrant@citrix.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 25 September 2017 14:03
> >> >> >>> On 18.09.17 at 17:31, <paul.durrant@citrix.com> wrote:
> >> >> > -        if ( (real_pg_owner == NULL) || (pg_owner == l1e_owner) ||
> >> >> > +        if ( (real_pg_owner == NULL) ||
> >> >> >               xsm_priv_mapping(XSM_TARGET, pg_owner,
> real_pg_owner) )
> >> >> >          {
> >> >> >              gdprintk(XENLOG_WARNING,
> >> >>
> >> >> I'm concerned of the effect of the change on the code paths
> >> >> which you're not really interested in: alloc_l1_table(),
> >> >> ptwr_emulated_update(), and shadow_get_page_from_l1e() all
> >> >> explicitly pass both domains identical, and are now suddenly able
> >> >> to do things they weren't supposed to do. A similar concern
> >> >> applies to __do_update_va_mapping() calling mod_l1_table().
> >> >>
> >> >> I therefore wonder whether the solution to your problem
> >> >> wouldn't rather be MMU_FOREIGN_PT_UPDATE (name subject
> >> >> to improvement suggestions). This at the same time would
> >> >> address my concern regarding the misleading DOMID_SELF
> >> >> passing when really a foreign domain's page is meant.
> >> >
> >> > Looking at this I wonder whether a cleaner solution would be to
> introduce a
> >> > new domid: DOMID_ANY. This meaning of this would be along the same
> >> sort of
> >> > lines as DOMID_XEN or DOMID_IO and would be used in mmu_update
> to
> >> mean 'any
> >> > page over which the caller has privilege'. Does that sound reasonable?
> >>
> >> Not really, no. Even if the caller has privilege over multiple domains,
> >> it should still specify which one it means. Otherwise we may end up
> >> with a page transferring ownership behind its back, and it doing
> >> something to one domain which was meant to be done to another.
> >>
> >
> > Ok, I'll claim the final cmd value then.
> 
> Final? We've got 5 left (for a total of 3 bits) afaict.

Really? Maybe I misread... looks like only 2 bits to me.

  Paul

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-27 14:22             ` Paul Durrant
@ 2017-09-27 14:42               ` Jan Beulich
  2017-09-27 14:47                 ` Paul Durrant
  0 siblings, 1 reply; 62+ messages in thread
From: Jan Beulich @ 2017-09-27 14:42 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 27.09.17 at 16:22, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 27 September 2017 14:31
>> >>> On 27.09.17 at 14:49, <Paul.Durrant@citrix.com> wrote:
>> > Ok, I'll claim the final cmd value then.
>> 
>> Final? We've got 5 left (for a total of 3 bits) afaict.
> 
> Really? Maybe I misread... looks like only 2 bits to me.

Maybe you and I looked in different places. I'm deriving this from

        cmd = req.ptr & (sizeof(l1_pgentry_t)-1);

        switch ( cmd )

in do_mmu_update(). Only 32-bit non-PAE guests would have been
limited to 2 bits.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns
  2017-09-27 14:42               ` Jan Beulich
@ 2017-09-27 14:47                 ` Paul Durrant
  0 siblings, 0 replies; 62+ messages in thread
From: Paul Durrant @ 2017-09-27 14:47 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 27 September 2017 15:42
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-
> devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v7 01/12] x86/mm: allow a privileged PV
> domain to map guest mfns
> 
> >>> On 27.09.17 at 16:22, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 27 September 2017 14:31
> >> >>> On 27.09.17 at 14:49, <Paul.Durrant@citrix.com> wrote:
> >> > Ok, I'll claim the final cmd value then.
> >>
> >> Final? We've got 5 left (for a total of 3 bits) afaict.
> >
> > Really? Maybe I misread... looks like only 2 bits to me.
> 
> Maybe you and I looked in different places. I'm deriving this from
> 
>         cmd = req.ptr & (sizeof(l1_pgentry_t)-1);
> 
>         switch ( cmd )
> 
> in do_mmu_update(). Only 32-bit non-PAE guests would have been
> limited to 2 bits.

Ah, ok. I was going off what it says in the header, where the comments state that [0:1] are command bits and [:2] are address bits, but for 64-bit or PAE guests it makes sense that bit 2 is up for grabs. Anyway I can use 3, which still fits in bits [0:1].

  Paul 
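
For reference, the command encoding being discussed. The first three values
exist in the public headers today; the fourth is purely illustrative -- the
thread only establishes that value 3 is free and still fits in ptr[1:0]:

#define MMU_NORMAL_PT_UPDATE        0 /* checked '*ptr = val'; ptr is the MA of the PTE */
#define MMU_MACHPHYS_UPDATE         1 /* ptr is the MA of the frame whose M2P entry to set */
#define MMU_PT_UPDATE_PRESERVE_AD   2 /* as 0, but preserve the accessed/dirty bits */
#define MMU_PT_UPDATE_NO_TRANSLATE  3 /* hypothetical: treat the frame in 'val' as an MFN */

/*
 * do_mmu_update() recovers the command from the low bits of the PTE pointer:
 *
 *     cmd = req.ptr & (sizeof(l1_pgentry_t) - 1);
 *
 * so with 8-byte PTEs (64-bit or PAE guests) there is room for eight command
 * values, of which only 0-2 are currently defined.
 */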

> 
> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2017-09-27 14:52 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-18 15:31 [PATCH v7 00/12] x86: guest resource mapping Paul Durrant
2017-09-18 15:31 ` [PATCH v7 01/12] x86/mm: allow a privileged PV domain to map guest mfns Paul Durrant
2017-09-19 12:51   ` Paul Durrant
2017-09-19 13:05     ` Jan Beulich
2017-09-25 13:02   ` Jan Beulich
2017-09-25 13:29     ` Andrew Cooper
2017-09-25 14:03       ` Jan Beulich
2017-09-25 14:42     ` Paul Durrant
2017-09-25 14:49       ` Jan Beulich
2017-09-25 14:56         ` Paul Durrant
2017-09-25 15:30           ` Jan Beulich
2017-09-25 15:33             ` Paul Durrant
2017-09-27 11:18     ` Paul Durrant
2017-09-27 12:46       ` Jan Beulich
2017-09-27 12:49         ` Paul Durrant
2017-09-27 13:31           ` Jan Beulich
2017-09-27 14:22             ` Paul Durrant
2017-09-27 14:42               ` Jan Beulich
2017-09-27 14:47                 ` Paul Durrant
2017-09-18 15:31 ` [PATCH v7 02/12] x86/mm: add HYPERVISOR_memory_op to acquire guest resources Paul Durrant
2017-09-25 13:49   ` Jan Beulich
2017-09-25 14:53     ` Paul Durrant
2017-09-25 14:23   ` Jan Beulich
2017-09-25 15:00     ` Paul Durrant
2017-09-26 12:20       ` Paul Durrant
2017-09-26 12:35         ` Jan Beulich
2017-09-26 12:49           ` Paul Durrant
2017-09-27 11:34             ` Andrew Cooper
2017-09-27 12:56               ` Jan Beulich
2017-09-18 15:31 ` [PATCH v7 03/12] tools/libxenforeignmemory: add support for resource mapping Paul Durrant
2017-09-18 16:16   ` Ian Jackson
2017-09-19  8:19     ` Paul Durrant
2017-09-18 15:31 ` [PATCH v7 04/12] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint Paul Durrant
2017-09-18 15:31 ` [PATCH v7 05/12] tools/libxenctrl: use new xenforeignmemory API to seed grant table Paul Durrant
2017-09-18 15:31 ` [PATCH v7 06/12] x86/hvm/ioreq: rename .*pfn and .*gmfn to .*gfn Paul Durrant
2017-09-25 14:29   ` Jan Beulich
2017-09-25 14:32     ` Paul Durrant
2017-09-18 15:31 ` [PATCH v7 07/12] x86/hvm/ioreq: use bool rather than bool_t Paul Durrant
2017-09-25 14:30   ` Jan Beulich
2017-09-18 15:31 ` [PATCH v7 08/12] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list Paul Durrant
2017-09-25 15:17   ` Jan Beulich
2017-09-26 10:55     ` Paul Durrant
2017-09-26 11:45       ` Jan Beulich
2017-09-26 12:12         ` Paul Durrant
2017-09-26 12:38           ` Jan Beulich
2017-09-26 12:41             ` Paul Durrant
2017-09-26 13:03               ` Jan Beulich
2017-09-26 13:11                 ` Paul Durrant
2017-09-18 15:31 ` [PATCH v7 09/12] x86/hvm/ioreq: simplify code and use consistent naming Paul Durrant
2017-09-25 15:26   ` Jan Beulich
2017-09-18 15:31 ` [PATCH v7 10/12] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page Paul Durrant
2017-09-25 15:27   ` Jan Beulich
2017-09-18 15:31 ` [PATCH v7 11/12] x86/hvm/ioreq: defer mapping gfns until they are actually requsted Paul Durrant
2017-09-25 16:00   ` Jan Beulich
2017-09-25 16:04     ` Paul Durrant
2017-09-18 15:31 ` [PATCH v7 12/12] x86/hvm/ioreq: add a new mappable resource type Paul Durrant
2017-09-18 16:18   ` Roger Pau Monné
2017-09-19  8:14     ` Paul Durrant
2017-09-26 12:58   ` Jan Beulich
2017-09-26 13:05     ` Paul Durrant
2017-09-26 13:10       ` Jan Beulich
2017-09-26 13:12         ` Paul Durrant
