xen-devel.lists.xenproject.org archive mirror
* [PATCH v2 0/3] x86/ioreq server: introduce HVMMEM_ioreq_server mem type
@ 2016-03-31 10:53 Yu Zhang
  2016-03-31 10:53 ` [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 82+ messages in thread
From: Yu Zhang @ 2016-03-31 10:53 UTC (permalink / raw)
  To: xen-devel; +Cc: zhiyuan.lv

XenGT leverages the ioreq server to track and forward accesses to GPU
I/O resources, e.g. the PPGTT (per-process graphics translation tables).
Currently, the ioreq server uses a rangeset to track the BDF/PIO/MMIO
ranges to be emulated: to select an ioreq server, the rangeset is
searched to see whether the I/O range is recorded there. However, the
number of RAM pages to be tracked may exceed the upper limit of the
rangeset.

Previously, a solution was proposed to refactor the rangeset and extend
its upper limit. However, after 12 rounds of discussion, we have
decided to drop this approach due to security concerns. This new patch
series instead introduces a new memory type, HVMMEM_ioreq_server, and
adds HVM operations to let an ioreq server claim ownership of RAM
pages with this type. Accesses to a page of this type will be handled
by the specified ioreq server directly.

Changes in v2:
1> Use p2m_change_entry_type_global() to resync the p2m types, instead of
p2m_memory_type_changed() as in the last version, because
p2m_memory_type_changed() only resyncs the memory types in the paging
structures, not the p2m types;
2> Only support HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
for HVMs with HAP enabled;
3> Code corrections according to George Dunlap's comments;
4> Comments added according to George Dunlap's comments.

Yu Zhang (3):
  x86/ioreq server: Add new functions to get/set memory types.
  x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
    an ioreq server

 xen/arch/x86/hvm/emulate.c       | 125 ++++++++++++-
 xen/arch/x86/hvm/hvm.c           | 375 ++++++++++++++++++++++++++-------------
 xen/arch/x86/mm/hap/nested_hap.c |   2 +-
 xen/arch/x86/mm/p2m-ept.c        |  14 +-
 xen/arch/x86/mm/p2m-pt.c         |  25 ++-
 xen/arch/x86/mm/p2m.c            |  82 +++++++++
 xen/arch/x86/mm/shadow/multi.c   |   3 +-
 xen/include/asm-x86/p2m.h        |  38 +++-
 xen/include/public/hvm/hvm_op.h  |  39 +++-
 9 files changed, 554 insertions(+), 149 deletions(-)

-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types.
  2016-03-31 10:53 [PATCH v2 0/3] x86/ioreq server: introduce HVMMEM_ioreq_server mem type Yu Zhang
@ 2016-03-31 10:53 ` Yu Zhang
  2016-04-05 13:57   ` George Dunlap
  2016-04-08 13:25   ` Andrew Cooper
  2016-03-31 10:53 ` [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
  2016-03-31 10:53 ` [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
  2 siblings, 2 replies; 82+ messages in thread
From: Yu Zhang @ 2016-03-31 10:53 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Keir Fraser, zhiyuan.lv, Jan Beulich

For clarity this patch breaks the code to set/get memory types out
of do_hvm_op() into dedicated functions: hvmop_set_mem_type() and
hvmop_get_mem_type(). Also for clarity, the checks for whether a memory
type change is allowed are broken out into a separate function called
by hvmop_set_mem_type().

There is no intentional functional change in this patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c | 284 +++++++++++++++++++++++++++----------------------
 1 file changed, 159 insertions(+), 125 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 80d59ff..f700923 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6547,6 +6547,61 @@ static int do_altp2m_op(
     return rc;
 }
 
+static int hvmop_get_mem_type(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_get_mem_type_t) arg)
+{
+    struct xen_hvm_get_mem_type a;
+    struct domain *d;
+    p2m_type_t t;
+    int rc;
+
+    if ( copy_from_guest(&a, arg, 1) )
+        return -EFAULT;
+
+    d = rcu_lock_domain_by_any_id(a.domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    rc = xsm_hvm_param(XSM_TARGET, d, HVMOP_get_mem_type);
+    if ( rc )
+        goto out;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    /*
+     * Use get_gfn query as we are interested in the current
+     * type, not in allocating or unsharing. That'll happen
+     * on access.
+     */
+    get_gfn_query_unlocked(d, a.pfn, &t);
+    if ( p2m_is_mmio(t) )
+        a.mem_type =  HVMMEM_mmio_dm;
+    else if ( t == p2m_mmio_write_dm )
+        a.mem_type = HVMMEM_mmio_write_dm;
+    else if ( p2m_is_readonly(t) )
+        a.mem_type =  HVMMEM_ram_ro;
+    else if ( p2m_is_ram(t) )
+        a.mem_type =  HVMMEM_ram_rw;
+    else if ( p2m_is_pod(t) )
+        a.mem_type =  HVMMEM_ram_rw;
+    else if ( p2m_is_grant(t) )
+        a.mem_type =  HVMMEM_ram_rw;
+    else
+        a.mem_type =  HVMMEM_mmio_dm;
+
+    rc = -EFAULT;
+    if ( __copy_to_guest(arg, &a, 1) )
+        goto out;
+    rc = 0;
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
 /*
  * Note that this value is effectively part of the ABI, even if we don't need
  * to make it a formal part of it: A guest suspended for migration in the
@@ -6555,6 +6610,105 @@ static int do_altp2m_op(
  */
 #define HVMOP_op_mask 0xff
 
+static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
+{
+    if ( p2m_is_ram(old) ||
+         (p2m_is_hole(old) && new == p2m_mmio_dm) ||
+         (old == p2m_mmio_write_dm && new == p2m_ram_rw) )
+        return 1;
+
+    return 0;
+}
+
+static int hvmop_set_mem_type(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_set_mem_type_t) arg,
+    unsigned long *iter)
+{
+    unsigned long start_iter = *iter;
+    struct xen_hvm_set_mem_type a;
+    struct domain *d;
+    int rc;
+
+    /* Interface types to internal p2m types */
+    static const p2m_type_t memtype[] = {
+        [HVMMEM_ram_rw]  = p2m_ram_rw,
+        [HVMMEM_ram_ro]  = p2m_ram_ro,
+        [HVMMEM_mmio_dm] = p2m_mmio_dm,
+        [HVMMEM_mmio_write_dm] = p2m_mmio_write_dm
+    };
+
+    if ( copy_from_guest(&a, arg, 1) )
+        return -EFAULT;
+
+    rc = rcu_lock_remote_domain_by_id(a.domid, &d);
+    if ( rc != 0 )
+        return rc;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    rc = xsm_hvm_control(XSM_DM_PRIV, d, HVMOP_set_mem_type);
+    if ( rc )
+        goto out;
+
+    rc = -EINVAL;
+    if ( a.nr < start_iter ||
+         ((a.first_pfn + a.nr - 1) < a.first_pfn) ||
+         ((a.first_pfn + a.nr - 1) > domain_get_maximum_gpfn(d)) )
+        goto out;
+
+    if ( a.hvmmem_type >= ARRAY_SIZE(memtype) )
+        goto out;
+
+    while ( a.nr > start_iter )
+    {
+        unsigned long pfn = a.first_pfn + start_iter;
+        p2m_type_t t;
+
+        get_gfn_unshare(d, pfn, &t);
+        if ( p2m_is_paging(t) )
+        {
+            put_gfn(d, pfn);
+            p2m_mem_paging_populate(d, pfn);
+            rc = -EAGAIN;
+            goto out;
+        }
+        if ( p2m_is_shared(t) )
+        {
+            put_gfn(d, pfn);
+            rc = -EAGAIN;
+            goto out;
+        }
+        if ( !hvm_allow_p2m_type_change(t, memtype[a.hvmmem_type]) )
+        {
+            put_gfn(d, pfn);
+            goto out;
+        }
+
+        rc = p2m_change_type_one(d, pfn, t, memtype[a.hvmmem_type]);
+        put_gfn(d, pfn);
+
+        if ( rc )
+            goto out;
+
+        /* Check for continuation if it's not the last iteration */
+        if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
+             hypercall_preempt_check() )
+        {
+            rc = -ERESTART;
+            goto out;
+        }
+    }
+    rc = 0;
+
+ out:
+    rcu_unlock_domain(d);
+    *iter = start_iter;
+
+    return rc;
+}
+
 long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     unsigned long start_iter, mask;
@@ -6744,135 +6898,15 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
     }
 
     case HVMOP_get_mem_type:
-    {
-        struct xen_hvm_get_mem_type a;
-        struct domain *d;
-        p2m_type_t t;
-
-        if ( copy_from_guest(&a, arg, 1) )
-            return -EFAULT;
-
-        d = rcu_lock_domain_by_any_id(a.domid);
-        if ( d == NULL )
-            return -ESRCH;
-
-        rc = xsm_hvm_param(XSM_TARGET, d, op);
-        if ( unlikely(rc) )
-            /* nothing */;
-        else if ( likely(is_hvm_domain(d)) )
-        {
-            /* Use get_gfn query as we are interested in the current 
-             * type, not in allocating or unsharing. That'll happen 
-             * on access. */
-            get_gfn_query_unlocked(d, a.pfn, &t);
-            if ( p2m_is_mmio(t) )
-                a.mem_type =  HVMMEM_mmio_dm;
-            else if ( t == p2m_mmio_write_dm )
-                a.mem_type = HVMMEM_mmio_write_dm;
-            else if ( p2m_is_readonly(t) )
-                a.mem_type =  HVMMEM_ram_ro;
-            else if ( p2m_is_ram(t) )
-                a.mem_type =  HVMMEM_ram_rw;
-            else if ( p2m_is_pod(t) )
-                a.mem_type =  HVMMEM_ram_rw;
-            else if ( p2m_is_grant(t) )
-                a.mem_type =  HVMMEM_ram_rw;
-            else
-                a.mem_type =  HVMMEM_mmio_dm;
-            if ( __copy_to_guest(arg, &a, 1) )
-                rc = -EFAULT;
-        }
-        else
-            rc = -EINVAL;
-
-        rcu_unlock_domain(d);
+        rc = hvmop_get_mem_type(
+            guest_handle_cast(arg, xen_hvm_get_mem_type_t));
         break;
-    }
 
     case HVMOP_set_mem_type:
-    {
-        struct xen_hvm_set_mem_type a;
-        struct domain *d;
-        
-        /* Interface types to internal p2m types */
-        static const p2m_type_t memtype[] = {
-            [HVMMEM_ram_rw]  = p2m_ram_rw,
-            [HVMMEM_ram_ro]  = p2m_ram_ro,
-            [HVMMEM_mmio_dm] = p2m_mmio_dm,
-            [HVMMEM_mmio_write_dm] = p2m_mmio_write_dm
-        };
-
-        if ( copy_from_guest(&a, arg, 1) )
-            return -EFAULT;
-
-        rc = rcu_lock_remote_domain_by_id(a.domid, &d);
-        if ( rc != 0 )
-            return rc;
-
-        rc = -EINVAL;
-        if ( !is_hvm_domain(d) )
-            goto setmemtype_fail;
-
-        rc = xsm_hvm_control(XSM_DM_PRIV, d, op);
-        if ( rc )
-            goto setmemtype_fail;
-
-        rc = -EINVAL;
-        if ( a.nr < start_iter ||
-             ((a.first_pfn + a.nr - 1) < a.first_pfn) ||
-             ((a.first_pfn + a.nr - 1) > domain_get_maximum_gpfn(d)) )
-            goto setmemtype_fail;
-            
-        if ( a.hvmmem_type >= ARRAY_SIZE(memtype) )
-            goto setmemtype_fail;
-
-        while ( a.nr > start_iter )
-        {
-            unsigned long pfn = a.first_pfn + start_iter;
-            p2m_type_t t;
-
-            get_gfn_unshare(d, pfn, &t);
-            if ( p2m_is_paging(t) )
-            {
-                put_gfn(d, pfn);
-                p2m_mem_paging_populate(d, pfn);
-                rc = -EAGAIN;
-                goto setmemtype_fail;
-            }
-            if ( p2m_is_shared(t) )
-            {
-                put_gfn(d, pfn);
-                rc = -EAGAIN;
-                goto setmemtype_fail;
-            }
-            if ( !p2m_is_ram(t) &&
-                 (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
-                 (t != p2m_mmio_write_dm || a.hvmmem_type != HVMMEM_ram_rw) )
-            {
-                put_gfn(d, pfn);
-                goto setmemtype_fail;
-            }
-
-            rc = p2m_change_type_one(d, pfn, t, memtype[a.hvmmem_type]);
-            put_gfn(d, pfn);
-            if ( rc )
-                goto setmemtype_fail;
-
-            /* Check for continuation if it's not the last interation */
-            if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
-                 hypercall_preempt_check() )
-            {
-                rc = -ERESTART;
-                goto setmemtype_fail;
-            }
-        }
-
-        rc = 0;
-
-    setmemtype_fail:
-        rcu_unlock_domain(d);
+        rc = hvmop_set_mem_type(
+            guest_handle_cast(arg, xen_hvm_set_mem_type_t),
+            &start_iter);
         break;
-    }
 
     case HVMOP_pagetable_dying:
     {
-- 
1.9.1



* [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-03-31 10:53 [PATCH v2 0/3] x86/ioreq server: introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-03-31 10:53 ` [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
@ 2016-03-31 10:53 ` Yu Zhang
  2016-04-05 14:38   ` George Dunlap
                     ` (2 more replies)
  2016-03-31 10:53 ` [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
  2 siblings, 3 replies; 82+ messages in thread
From: Yu Zhang @ 2016-03-31 10:53 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, George Dunlap,
	Andrew Cooper, Tim Deegan, Paul Durrant, zhiyuan.lv,
	Jun Nakajima

Previously the p2m type p2m_mmio_write_dm was introduced for write-
protected memory pages whose write operations are supposed to be
forwarded to and emulated by an ioreq server. Yet limitations of the
rangeset restrict the number of guest pages that can be write-protected.

This patch renames the p2m type p2m_mmio_write_dm to p2m_ioreq_server,
meaning this p2m type can be claimed by one ioreq server, instead of
being tracked inside the rangeset of an ioreq server. Follow-up patches
will add the related HVMOP handling code to map/unmap type
p2m_ioreq_server to/from an ioreq server.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/hvm/hvm.c          | 10 +++++-----
 xen/arch/x86/mm/p2m-ept.c       |  2 +-
 xen/arch/x86/mm/p2m-pt.c        |  2 +-
 xen/arch/x86/mm/shadow/multi.c  |  2 +-
 xen/include/asm-x86/p2m.h       |  4 ++--
 xen/include/public/hvm/hvm_op.h |  2 +-
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f700923..bec6a8a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3166,7 +3166,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
      */
     if ( (p2mt == p2m_mmio_dm) || 
          (npfec.write_access &&
-          (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
+          (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
     {
         __put_gfn(p2m, gfn);
         if ( ap2m_active )
@@ -6578,8 +6578,8 @@ static int hvmop_get_mem_type(
     get_gfn_query_unlocked(d, a.pfn, &t);
     if ( p2m_is_mmio(t) )
         a.mem_type =  HVMMEM_mmio_dm;
-    else if ( t == p2m_mmio_write_dm )
-        a.mem_type = HVMMEM_mmio_write_dm;
+    else if ( t == p2m_ioreq_server )
+        a.mem_type = HVMMEM_ioreq_server;
     else if ( p2m_is_readonly(t) )
         a.mem_type =  HVMMEM_ram_ro;
     else if ( p2m_is_ram(t) )
@@ -6614,7 +6614,7 @@ static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
 {
     if ( p2m_is_ram(old) ||
          (p2m_is_hole(old) && new == p2m_mmio_dm) ||
-         (old == p2m_mmio_write_dm && new == p2m_ram_rw) )
+         (old == p2m_ioreq_server && new == p2m_ram_rw) )
         return 1;
 
     return 0;
@@ -6634,7 +6634,7 @@ static int hvmop_set_mem_type(
         [HVMMEM_ram_rw]  = p2m_ram_rw,
         [HVMMEM_ram_ro]  = p2m_ram_ro,
         [HVMMEM_mmio_dm] = p2m_mmio_dm,
-        [HVMMEM_mmio_write_dm] = p2m_mmio_write_dm
+        [HVMMEM_ioreq_server] = p2m_ioreq_server
     };
 
     if ( copy_from_guest(&a, arg, 1) )
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 3cb6868..380ec25 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -171,7 +171,7 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
         case p2m_grant_map_ro:
-        case p2m_mmio_write_dm:
+        case p2m_ioreq_server:
             entry->r = 1;
             entry->w = entry->x = 0;
             entry->a = !!cpu_has_vmx_ept_ad;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3d80612..eabd2e3 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -94,7 +94,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
     default:
         return flags | _PAGE_NX_BIT;
     case p2m_grant_map_ro:
-    case p2m_mmio_write_dm:
+    case p2m_ioreq_server:
         return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
     case p2m_ram_ro:
     case p2m_ram_logdirty:
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index e5c8499..c81302a 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3225,7 +3225,7 @@ static int sh_page_fault(struct vcpu *v,
 
     /* Need to hand off device-model MMIO to the device model */
     if ( p2mt == p2m_mmio_dm
-         || (p2mt == p2m_mmio_write_dm && ft == ft_demand_write) )
+         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
     {
         gpa = guest_walk_to_gpa(&gw);
         goto mmio;
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 5392eb0..ee2ea9c 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -71,7 +71,7 @@ typedef enum {
     p2m_ram_shared = 12,          /* Shared or sharable memory */
     p2m_ram_broken = 13,          /* Broken page, access cause domain crash */
     p2m_map_foreign  = 14,        /* ram pages from foreign domain */
-    p2m_mmio_write_dm = 15,       /* Read-only; writes go to the device model */
+    p2m_ioreq_server = 15,
 } p2m_type_t;
 
 /* Modifiers to the query */
@@ -112,7 +112,7 @@ typedef unsigned int p2m_query_t;
                       | p2m_to_mask(p2m_ram_ro)         \
                       | p2m_to_mask(p2m_grant_map_ro)   \
                       | p2m_to_mask(p2m_ram_shared)     \
-                      | p2m_to_mask(p2m_mmio_write_dm))
+                      | p2m_to_mask(p2m_ioreq_server))
 
 /* Write-discard types, which should discard the write operations */
 #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro)     \
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index 1606185..a1eae52 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -83,7 +83,7 @@ typedef enum {
     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
-    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device model */
+    HVMMEM_ioreq_server,
 } hvmmem_type_t;
 
 /* Following tools-only interfaces may change in future. */
-- 
1.9.1



* [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-03-31 10:53 [PATCH v2 0/3] x86/ioreq server: introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-03-31 10:53 ` [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
  2016-03-31 10:53 ` [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
@ 2016-03-31 10:53 ` Yu Zhang
       [not found]   ` <20160404082556.GC28633@deinos.phlegethon.org>
                     ` (3 more replies)
  2 siblings, 4 replies; 82+ messages in thread
From: Yu Zhang @ 2016-03-31 10:53 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, George Dunlap,
	Andrew Cooper, Tim Deegan, Paul Durrant, zhiyuan.lv, Jan Beulich

A new HVMOP, HVMOP_map_mem_type_to_ioreq_server, is added to let an
ioreq server claim/disclaim responsibility for the handling of guest
pages with p2m type p2m_ioreq_server. Users of this HVMOP can specify
whether the p2m_ioreq_server accesses to be handled are writes, reads,
or both, in a parameter named flags. For now, only one ioreq server is
supported for this p2m type, so once an ioreq server has claimed
ownership, subsequent calls to HVMOP_map_mem_type_to_ioreq_server will
fail. Users can also disclaim ownership of guest RAM pages with this
p2m type by triggering this new HVMOP with the ioreq server id set to
the current owner's and the flags parameter set to 0.

For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
are only supported for HVMs with HAP enabled.

Note that the flags parameter (if not 0) of this HVMOP only indicates
which kinds of memory accesses are to be forwarded to an ioreq server;
it affects the access rights of guest RAM pages, but the two are not
the same. Due to hardware limitations, if only write operations are to
be forwarded, reads will still be performed at full speed, with no
hypervisor intervention. But if reads are to be forwarded to an ioreq
server, writes will inevitably be trapped into the hypervisor as well,
which means a significant performance impact.

Also note that HVMOP_map_mem_type_to_ioreq_server will not change the
p2m type of any guest RAM page until HVMOP_set_mem_type is triggered.
So the normal sequence is for the backend driver to first claim
ownership of guest RAM pages with the p2m_ioreq_server type, and then
set the memory type to p2m_ioreq_server for the specified guest RAM
pages.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/hvm/emulate.c       | 125 +++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/hvm/hvm.c           |  95 +++++++++++++++++++++++++++--
 xen/arch/x86/mm/hap/nested_hap.c |   2 +-
 xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
 xen/arch/x86/mm/p2m-pt.c         |  25 +++++---
 xen/arch/x86/mm/p2m.c            |  82 +++++++++++++++++++++++++
 xen/arch/x86/mm/shadow/multi.c   |   3 +-
 xen/include/asm-x86/p2m.h        |  36 +++++++++--
 xen/include/public/hvm/hvm_op.h  |  37 ++++++++++++
 9 files changed, 395 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ddc8007..77a4793 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -94,11 +94,69 @@ static const struct hvm_io_handler null_handler = {
     .ops = &null_ops
 };
 
+static int mem_read(const struct hvm_io_handler *io_handler,
+                    uint64_t addr,
+                    uint32_t size,
+                    uint64_t *data)
+{
+    struct domain *currd = current->domain;
+    unsigned long gmfn = paddr_to_pfn(addr);
+    unsigned long offset = addr & ~PAGE_MASK;
+    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
+    uint8_t *p;
+
+    if ( !page )
+        return X86EMUL_UNHANDLEABLE;
+
+    p = __map_domain_page(page);
+    p += offset;
+    memcpy(data, p, size);
+
+    unmap_domain_page(p);
+    put_page(page);
+
+    return X86EMUL_OKAY;
+}
+
+static int mem_write(const struct hvm_io_handler *handler,
+                     uint64_t addr,
+                     uint32_t size,
+                     uint64_t data)
+{
+    struct domain *currd = current->domain;
+    unsigned long gmfn = paddr_to_pfn(addr);
+    unsigned long offset = addr & ~PAGE_MASK;
+    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
+    uint8_t *p;
+
+    if ( !page )
+        return X86EMUL_UNHANDLEABLE;
+
+    p = __map_domain_page(page);
+    p += offset;
+    memcpy(p, &data, size);
+
+    unmap_domain_page(p);
+    put_page(page);
+
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops mem_ops = {
+    .read = mem_read,
+    .write = mem_write
+};
+
+static const struct hvm_io_handler mem_handler = {
+    .ops = &mem_ops
+};
+
 static int hvmemul_do_io(
     bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     ioreq_t p = {
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
@@ -140,7 +198,7 @@ static int hvmemul_do_io(
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
-            domain_crash(curr->domain);
+            domain_crash(currd);
 
         if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
@@ -168,13 +226,72 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s;
+        p2m_type_t p2mt;
+
+        if ( is_mmio )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            switch ( p2mt )
+            {
+                case p2m_ioreq_server:
+                {
+                    unsigned long flags;
+
+                    p2m_get_ioreq_server(currd, &flags, &s);
+
+                    if ( !s )
+                        break;
+
+                    if ( (dir == IOREQ_READ &&
+                          !(flags & P2M_IOREQ_HANDLE_READ_ACCESS)) ||
+                         (dir == IOREQ_WRITE &&
+                          !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS)) )
+                        s = NULL;
+
+                    break;
+                }
+                case p2m_ram_rw:
+                    s = NULL;
+                    break;
+
+                default:
+                    s = hvm_select_ioreq_server(currd, &p);
+                    break;
+            }
+        }
+        else
+        {
+            p2mt = p2m_invalid;
+
+            s = hvm_select_ioreq_server(currd, &p);
+        }
 
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            rc = hvm_process_io_intercept(&null_handler, &p);
+            switch ( p2mt )
+            {
+            case p2m_ioreq_server:
+            /*
+             * Race conditions may exist when access to a gfn with
+             * p2m_ioreq_server is intercepted by hypervisor, during
+             * which time p2m type of this gfn is recalculated back
+             * to p2m_ram_rw. mem_handler is used to handle this
+             * corner case.
+             */
+            case p2m_ram_rw:
+                rc = hvm_process_io_intercept(&mem_handler, &p);
+                break;
+
+            default:
+                rc = hvm_process_io_intercept(&null_handler, &p);
+                break;
+            }
+
             vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bec6a8a..ba1de00 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1252,6 +1252,8 @@ static int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
         domain_pause(d);
 
+        p2m_destroy_ioreq_server(d, s);
+
         hvm_ioreq_server_disable(s, 0);
 
         list_del(&s->list_entry);
@@ -1411,6 +1413,47 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
+static int hvm_map_mem_type_to_ioreq_server(struct domain *d,
+                                            ioservid_t id,
+                                            hvmmem_type_t type,
+                                            uint32_t flags)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    /* For now, only HVMMEM_ioreq_server is supported */
+    if ( type != HVMMEM_ioreq_server )
+        return -EINVAL;
+
+    if ( flags & ~(HVMOP_IOREQ_MEM_ACCESS_READ |
+                   HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+        return -EINVAL;
+
+    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
+
+    rc = -ENOENT;
+    list_for_each_entry ( s,
+                          &d->arch.hvm_domain.ioreq_server.list,
+                          list_entry )
+    {
+        if ( s == d->arch.hvm_domain.default_ioreq_server )
+            continue;
+
+        if ( s->id == id )
+        {
+            rc = p2m_set_ioreq_server(d, flags, s);
+            if ( rc == 0 )
+                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
+                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
+
+            break;
+        }
+    }
+
+    spin_unlock(&d->arch.hvm_domain.ioreq_server.lock);
+    return rc;
+}
+
 static int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                       bool_t enabled)
 {
@@ -3164,9 +3207,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
      * If this GFN is emulated MMIO or marked as read-only, pass the fault
      * to the mmio handler.
      */
-    if ( (p2mt == p2m_mmio_dm) || 
-         (npfec.write_access &&
-          (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
+    if ( (p2mt == p2m_mmio_dm) ||
+         (p2mt == p2m_ioreq_server) ||
+         (npfec.write_access && p2m_is_discard_write(p2mt)) )
     {
         __put_gfn(p2m, gfn);
         if ( ap2m_active )
@@ -5979,6 +6022,40 @@ static int hvmop_unmap_io_range_from_ioreq_server(
     return rc;
 }
 
+static int hvmop_map_mem_type_to_ioreq_server(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+{
+    xen_hvm_map_mem_type_to_ioreq_server_t op;
+    struct domain *d;
+    int rc;
+
+    if ( copy_from_guest(&op, uop, 1) )
+        return -EFAULT;
+
+    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
+    if ( rc != 0 )
+        return rc;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    /* For now, only support for HAP enabled hvm */
+    if ( !hap_enabled(d) )
+        goto out;
+
+    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d,
+                              HVMOP_map_mem_type_to_ioreq_server);
+    if ( rc != 0 )
+        goto out;
+
+    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+ out:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
 static int hvmop_set_ioreq_server_state(
     XEN_GUEST_HANDLE_PARAM(xen_hvm_set_ioreq_server_state_t) uop)
 {
@@ -6613,8 +6690,7 @@ static int hvmop_get_mem_type(
 static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
 {
     if ( p2m_is_ram(old) ||
-         (p2m_is_hole(old) && new == p2m_mmio_dm) ||
-         (old == p2m_ioreq_server && new == p2m_ram_rw) )
+         (p2m_is_hole(old) && new == p2m_mmio_dm) )
         return 1;
 
     return 0;
@@ -6648,6 +6724,10 @@ static int hvmop_set_mem_type(
     if ( !is_hvm_domain(d) )
         goto out;
 
+    /* For now, HVMMEM_ioreq_server is only supported for HAP-enabled hvm. */
+    if ( a.hvmmem_type == HVMMEM_ioreq_server && !hap_enabled(d) )
+        goto out;
+
     rc = xsm_hvm_control(XSM_DM_PRIV, d, HVMOP_set_mem_type);
     if ( rc )
         goto out;
@@ -6748,6 +6828,11 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             guest_handle_cast(arg, xen_hvm_io_range_t));
         break;
 
+    case HVMOP_map_mem_type_to_ioreq_server:
+        rc = hvmop_map_mem_type_to_ioreq_server(
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+        break;
+
     case HVMOP_set_ioreq_server_state:
         rc = hvmop_set_ioreq_server_state(
             guest_handle_cast(arg, xen_hvm_set_ioreq_server_state_t));
diff --git a/xen/arch/x86/mm/hap/nested_hap.c b/xen/arch/x86/mm/hap/nested_hap.c
index 9cee5a0..bbb6d85 100644
--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -174,7 +174,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
     if ( *p2mt == p2m_mmio_direct )
         goto direct_mmio_out;
     rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
         goto out;
 
     rc = NESTEDHVM_PAGEFAULT_L0_ERROR;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 380ec25..854e158 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -132,6 +132,19 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->r = entry->w = entry->x = 1;
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
+        case p2m_ioreq_server:
+            entry->r = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_READ_ACCESS);
+            /*
+             * Write access is disabled whenever entry->r is 0, but whether
+             * write accesses are emulated by the hypervisor or forwarded to
+             * an ioreq server depends on the setting of p2m->ioreq.flags.
+             */
+            entry->w = (entry->r &&
+                        !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS));
+            entry->x = entry->r;
+            entry->a = !!cpu_has_vmx_ept_ad;
+            entry->d = entry->w && cpu_has_vmx_ept_ad;
+            break;
         case p2m_mmio_direct:
             entry->r = entry->x = 1;
             entry->w = !rangeset_contains_singleton(mmio_ro_ranges,
@@ -171,7 +184,6 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
         case p2m_grant_map_ro:
-        case p2m_ioreq_server:
             entry->r = 1;
             entry->w = entry->x = 0;
             entry->a = !!cpu_has_vmx_ept_ad;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index eabd2e3..7a0ddb8 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -72,8 +72,8 @@ static const unsigned long pgt[] = {
     PGT_l3_page_table
 };
 
-static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
-                                       unsigned int level)
+static unsigned long p2m_type_to_flags(struct p2m_domain *p2m, p2m_type_t t,
+                                       mfn_t mfn, unsigned int level)
 {
     unsigned long flags;
     /*
@@ -94,8 +94,18 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
     default:
         return flags | _PAGE_NX_BIT;
     case p2m_grant_map_ro:
-    case p2m_ioreq_server:
         return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
+    case p2m_ioreq_server:
+    {
+        flags |= P2M_BASE_FLAGS | _PAGE_RW;
+
+        if ( p2m->ioreq.flags & P2M_IOREQ_HANDLE_READ_ACCESS )
+            return flags & ~(_PAGE_PRESENT | _PAGE_RW);
+        else if ( p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS )
+            return flags & ~_PAGE_RW;
+        else
+            return flags;
+    }
     case p2m_ram_ro:
     case p2m_ram_logdirty:
     case p2m_ram_shared:
@@ -442,7 +452,8 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
                               ? p2m_ram_logdirty : p2m_ram_rw;
             unsigned long mfn = l1e_get_pfn(e);
-            unsigned long flags = p2m_type_to_flags(p2mt, _mfn(mfn), level);
+            unsigned long flags = p2m_type_to_flags(p2m, p2mt,
+                                                    _mfn(mfn), level);
 
             if ( level )
             {
@@ -579,7 +590,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
         l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
             ? l3e_from_pfn(mfn_x(mfn),
-                           p2m_type_to_flags(p2mt, mfn, 2) | _PAGE_PSE)
+                           p2m_type_to_flags(p2m, p2mt, mfn, 2) | _PAGE_PSE)
             : l3e_empty();
         entry_content.l1 = l3e_content.l3;
 
@@ -615,7 +626,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 
         if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
             entry_content = p2m_l1e_from_pfn(mfn_x(mfn),
-                                             p2m_type_to_flags(p2mt, mfn, 0));
+                                         p2m_type_to_flags(p2m, p2mt, mfn, 0));
         else
             entry_content = l1e_empty();
 
@@ -651,7 +662,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
         if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
             l2e_content = l2e_from_pfn(mfn_x(mfn),
-                                       p2m_type_to_flags(p2mt, mfn, 1) |
+                                       p2m_type_to_flags(p2m, p2mt, mfn, 1) |
                                        _PAGE_PSE);
         else
             l2e_content = l2e_empty();
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index b3fce1b..f7d2f60 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -83,6 +83,8 @@ static int p2m_initialise(struct domain *d, struct p2m_domain *p2m)
     else
         p2m_pt_init(p2m);
 
+    spin_lock_init(&p2m->ioreq.lock);
+
     return ret;
 }
 
@@ -289,6 +291,86 @@ void p2m_memory_type_changed(struct domain *d)
     }
 }
 
+int p2m_set_ioreq_server(struct domain *d,
+                         unsigned long flags,
+                         struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
+
+    spin_lock(&p2m->ioreq.lock);
+
+    rc = -EBUSY;
+    if ( (flags != 0) && (p2m->ioreq.server != NULL) )
+        goto out;
+
+    rc = -EINVAL;
+    /* Unmap the ioreq server from the p2m type by passing flags == 0. */
+    if ( (flags == 0) && (p2m->ioreq.server != s) )
+        goto out;
+
+    if ( flags == 0 )
+    {
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+    }
+    else
+    {
+        p2m->ioreq.server = s;
+        p2m->ioreq.flags = flags;
+    }
+
+    /*
+     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
+     * we mark the p2m table to be recalculated, so that gfns which were
+     * previously marked with p2m_ioreq_server can be resynced.
+     */
+    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+
+    rc = 0;
+
+out:
+    spin_unlock(&p2m->ioreq.lock);
+
+    return rc;
+}
+
+void p2m_get_ioreq_server(struct domain *d,
+                          unsigned long *flags,
+                          struct hvm_ioreq_server **s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    spin_lock(&p2m->ioreq.lock);
+
+    *s = p2m->ioreq.server;
+    *flags = p2m->ioreq.flags;
+
+    spin_unlock(&p2m->ioreq.lock);
+}
+
+void p2m_destroy_ioreq_server(struct domain *d,
+                              struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    spin_lock(&p2m->ioreq.lock);
+
+    if ( p2m->ioreq.server == s )
+    {
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+
+        /*
+         * Mark p2m table to be recalculated, so that gfns which were
+         * previously marked with p2m_ioreq_server can be resynced.
+         */
+        p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+    }
+
+    spin_unlock(&p2m->ioreq.lock);
+}
+
 void p2m_enable_hardware_log_dirty(struct domain *d)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index c81302a..2e0d258 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3224,8 +3224,7 @@ static int sh_page_fault(struct vcpu *v,
     }
 
     /* Need to hand off device-model MMIO to the device model */
-    if ( p2mt == p2m_mmio_dm
-         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
+    if ( p2mt == p2m_mmio_dm )
     {
         gpa = guest_walk_to_gpa(&gw);
         goto mmio;
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index ee2ea9c..8f925ac 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -89,7 +89,8 @@ typedef unsigned int p2m_query_t;
                        | p2m_to_mask(p2m_ram_paging_out)      \
                        | p2m_to_mask(p2m_ram_paged)           \
                        | p2m_to_mask(p2m_ram_paging_in)       \
-                       | p2m_to_mask(p2m_ram_shared))
+                       | p2m_to_mask(p2m_ram_shared)          \
+                       | p2m_to_mask(p2m_ioreq_server))
 
 /* Types that represent a physmap hole that is ok to replace with a shared
  * entry */
@@ -111,8 +112,7 @@ typedef unsigned int p2m_query_t;
 #define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty)     \
                       | p2m_to_mask(p2m_ram_ro)         \
                       | p2m_to_mask(p2m_grant_map_ro)   \
-                      | p2m_to_mask(p2m_ram_shared)     \
-                      | p2m_to_mask(p2m_ioreq_server))
+                      | p2m_to_mask(p2m_ram_shared))
 
 /* Write-discard types, which should discard the write operations */
 #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro)     \
@@ -120,7 +120,8 @@ typedef unsigned int p2m_query_t;
 
 /* Types that can be subject to bulk transitions. */
 #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
-                              | p2m_to_mask(p2m_ram_logdirty) )
+                              | p2m_to_mask(p2m_ram_logdirty) \
+                              | p2m_to_mask(p2m_ioreq_server) )
 
 #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
 
@@ -320,6 +321,27 @@ struct p2m_domain {
         struct ept_data ept;
         /* NPT-equivalent structure could be added here. */
     };
+
+    struct {
+        spinlock_t lock;
+        /*
+         * The ioreq server that is responsible for the emulation of
+         * gfns with a specific p2m type (for now, p2m_ioreq_server).
+         * Gfns with p2m_ioreq_server set but no ioreq server mapped
+         * in advance should behave the same as p2m_ram_rw.
+         */
+        struct hvm_ioreq_server *server;
+        /*
+         * flags specifies whether read, write or both operations
+         * are to be emulated by an ioreq server.
+         */
+        unsigned long flags;
+
+#define P2M_IOREQ_HANDLE_WRITE_ACCESS HVMOP_IOREQ_MEM_ACCESS_WRITE
+#define P2M_IOREQ_HANDLE_READ_ACCESS  HVMOP_IOREQ_MEM_ACCESS_READ
+
+    } ioreq;
 };
 
 /* get host p2m table */
@@ -821,6 +843,12 @@ static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
     return flags;
 }
 
+int p2m_set_ioreq_server(struct domain *d, unsigned long flags,
+                         struct hvm_ioreq_server *s);
+void p2m_get_ioreq_server(struct domain *d, unsigned long *flags,
+                         struct hvm_ioreq_server **s);
+void p2m_destroy_ioreq_server(struct domain *d, struct hvm_ioreq_server *s);
+
 #endif /* _XEN_P2M_H */
 
 /*
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index a1eae52..d46f186 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -489,6 +489,43 @@ struct xen_hvm_altp2m_op {
 typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
 
+#if defined(__XEN__) || defined(__XEN_TOOLS__)
+
+/*
+ * HVMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
+ *                                      to a specific memory type <type>
+ *                                      for specific accesses <flags>
+ *
+ * Note that if only write operations are to be forwarded to an ioreq server,
+ * read operations will be performed with no hypervisor intervention. But if
+ * flags indicates that read operations are to be forwarded to an ioreq server,
+ * write operations will inevitably be trapped into the hypervisor as well;
+ * whether they are emulated by the hypervisor or forwarded to the ioreq
+ * server depends on the flags setting. This has a significant performance
+ * impact.
+ */
+#define HVMOP_map_mem_type_to_ioreq_server 26
+struct xen_hvm_map_mem_type_to_ioreq_server {
+    domid_t domid;      /* IN - domain to be serviced */
+    ioservid_t id;      /* IN - ioreq server id */
+    hvmmem_type_t type; /* IN - memory type */
+    uint32_t flags;     /* IN - types of accesses to be forwarded to the
+                           ioreq server. A value of 0 unmaps the
+                           ioreq server. */
+#define _HVMOP_IOREQ_MEM_ACCESS_READ 0
+#define HVMOP_IOREQ_MEM_ACCESS_READ \
+    (1u << _HVMOP_IOREQ_MEM_ACCESS_READ)
+
+#define _HVMOP_IOREQ_MEM_ACCESS_WRITE 1
+#define HVMOP_IOREQ_MEM_ACCESS_WRITE \
+    (1u << _HVMOP_IOREQ_MEM_ACCESS_WRITE)
+};
+typedef struct xen_hvm_map_mem_type_to_ioreq_server
+    xen_hvm_map_mem_type_to_ioreq_server_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_map_mem_type_to_ioreq_server_t);
+
+#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
       [not found]   ` <20160404082556.GC28633@deinos.phlegethon.org>
@ 2016-04-05  6:01     ` Yu, Zhang
  0 siblings, 0 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-05  6:01 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, George Dunlap,
	Andrew Cooper, xen-devel, Paul Durrant, zhiyuan.lv, Jan Beulich

Thanks, Tim.

On 4/4/2016 4:25 PM, Tim Deegan wrote:
> At 18:53 +0800 on 31 Mar (1459450418), Yu Zhang wrote:
>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>> let one ioreq server claim/disclaim its responsibility for the
>> handling of guest pages with p2m type p2m_ioreq_server. Users
>> of this HVMOP can specify whether the p2m_ioreq_server is supposed
>> to handle write accesses or read ones or both in a parameter named
>> flags. For now, we only support one ioreq server for this p2m type,
>> so once an ioreq server has claimed its ownership, subsequent calls
>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>> disclaim the ownership of guest ram pages with this p2m type, by
>> triggering this new HVMOP, with ioreq server id set to the current
>> owner's and flags parameter set to 0.
>>
>> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>> are only supported for HVMs with HAP enabled.
>
> "For now", meaning you intend to make this work on shadow pagetables? :P
> If not, please just say it's HAP-only.
>

Well, I have no clear plan in the near future to support this on shadow
mode, and I'll change the commit message in next version. :)

>> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
>> index c81302a..2e0d258 100644
>> --- a/xen/arch/x86/mm/shadow/multi.c
>> +++ b/xen/arch/x86/mm/shadow/multi.c
>> @@ -3224,8 +3224,7 @@ static int sh_page_fault(struct vcpu *v,
>>       }
>>
>>       /* Need to hand off device-model MMIO to the device model */
>> -    if ( p2mt == p2m_mmio_dm
>> -         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
>> +    if ( p2mt == p2m_mmio_dm )
>>       {
>>           gpa = guest_walk_to_gpa(&gw);
>>           goto mmio;
>
> Acked-by: Tim Deegan <tim@xen.org>
>
>

Regards.
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types.
  2016-03-31 10:53 ` [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
@ 2016-04-05 13:57   ` George Dunlap
  2016-04-05 14:08     ` George Dunlap
  2016-04-08 13:25   ` Andrew Cooper
  1 sibling, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-05 13:57 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Keir Fraser, Andrew Cooper, xen-devel, Paul Durrant, Lv, Zhiyuan,
	Jan Beulich

On Thu, Mar 31, 2016 at 11:53 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> For clarity this patch breaks the code to set/get memory types out
> of do_hvm_op() into dedicated functions: hvmop_set/get_mem_type().
> Also, for clarity, checks for whether a memory type change is allowed
> are broken out into a separate function called by hvmop_set_mem_type().
>
> There is no intentional functional change in this patch.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

> ---
>  xen/arch/x86/hvm/hvm.c | 284 +++++++++++++++++++++++++++----------------------
>  1 file changed, 159 insertions(+), 125 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 80d59ff..f700923 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6547,6 +6547,61 @@ static int do_altp2m_op(
>      return rc;
>  }
>
> +static int hvmop_get_mem_type(
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_get_mem_type_t) arg)
> +{
> +    struct xen_hvm_get_mem_type a;
> +    struct domain *d;
> +    p2m_type_t t;
> +    int rc;
> +
> +    if ( copy_from_guest(&a, arg, 1) )
> +        return -EFAULT;
> +
> +    d = rcu_lock_domain_by_any_id(a.domid);
> +    if ( d == NULL )
> +        return -ESRCH;
> +
> +    rc = xsm_hvm_param(XSM_TARGET, d, HVMOP_get_mem_type);
> +    if ( rc )
> +        goto out;
> +
> +    rc = -EINVAL;
> +    if ( !is_hvm_domain(d) )
> +        goto out;
> +
> +    /*
> +     * Use get_gfn query as we are interested in the current
> +     * type, not in allocating or unsharing. That'll happen
> +     * on access.
> +     */
> +    get_gfn_query_unlocked(d, a.pfn, &t);
> +    if ( p2m_is_mmio(t) )
> +        a.mem_type =  HVMMEM_mmio_dm;
> +    else if ( t == p2m_mmio_write_dm )
> +        a.mem_type = HVMMEM_mmio_write_dm;
> +    else if ( p2m_is_readonly(t) )
> +        a.mem_type =  HVMMEM_ram_ro;
> +    else if ( p2m_is_ram(t) )
> +        a.mem_type =  HVMMEM_ram_rw;
> +    else if ( p2m_is_pod(t) )
> +        a.mem_type =  HVMMEM_ram_rw;
> +    else if ( p2m_is_grant(t) )
> +        a.mem_type =  HVMMEM_ram_rw;
> +    else
> +        a.mem_type =  HVMMEM_mmio_dm;
> +
> +    rc = -EFAULT;
> +    if ( __copy_to_guest(arg, &a, 1) )
> +        goto out;
> +    rc = 0;
> +
> + out:
> +    rcu_unlock_domain(d);
> +
> +    return rc;
> +}
> +
>  /*
>   * Note that this value is effectively part of the ABI, even if we don't need
>   * to make it a formal part of it: A guest suspended for migration in the
> @@ -6555,6 +6610,105 @@ static int do_altp2m_op(
>   */
>  #define HVMOP_op_mask 0xff
>
> +static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
> +{
> +    if ( p2m_is_ram(old) ||
> +         (p2m_is_hole(old) && new == p2m_mmio_dm) ||
> +         (old == p2m_mmio_write_dm && new == p2m_ram_rw) )
> +        return 1;
> +
> +    return 0;
> +}
> +
> +static int hvmop_set_mem_type(
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_set_mem_type_t) arg,
> +    unsigned long *iter)
> +{
> +    unsigned long start_iter = *iter;
> +    struct xen_hvm_set_mem_type a;
> +    struct domain *d;
> +    int rc;
> +
> +    /* Interface types to internal p2m types */
> +    static const p2m_type_t memtype[] = {
> +        [HVMMEM_ram_rw]  = p2m_ram_rw,
> +        [HVMMEM_ram_ro]  = p2m_ram_ro,
> +        [HVMMEM_mmio_dm] = p2m_mmio_dm,
> +        [HVMMEM_mmio_write_dm] = p2m_mmio_write_dm
> +    };
> +
> +    if ( copy_from_guest(&a, arg, 1) )
> +        return -EFAULT;
> +
> +    rc = rcu_lock_remote_domain_by_id(a.domid, &d);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    rc = -EINVAL;
> +    if ( !is_hvm_domain(d) )
> +        goto out;
> +
> +    rc = xsm_hvm_control(XSM_DM_PRIV, d, HVMOP_set_mem_type);
> +    if ( rc )
> +        goto out;
> +
> +    rc = -EINVAL;
> +    if ( a.nr < start_iter ||
> +         ((a.first_pfn + a.nr - 1) < a.first_pfn) ||
> +         ((a.first_pfn + a.nr - 1) > domain_get_maximum_gpfn(d)) )
> +        goto out;
> +
> +    if ( a.hvmmem_type >= ARRAY_SIZE(memtype) )
> +        goto out;
> +
> +    while ( a.nr > start_iter )
> +    {
> +        unsigned long pfn = a.first_pfn + start_iter;
> +        p2m_type_t t;
> +
> +        get_gfn_unshare(d, pfn, &t);
> +        if ( p2m_is_paging(t) )
> +        {
> +            put_gfn(d, pfn);
> +            p2m_mem_paging_populate(d, pfn);
> +            rc = -EAGAIN;
> +            goto out;
> +        }
> +        if ( p2m_is_shared(t) )
> +        {
> +            put_gfn(d, pfn);
> +            rc = -EAGAIN;
> +            goto out;
> +        }
> +        if ( !hvm_allow_p2m_type_change(t, memtype[a.hvmmem_type]) )
> +        {
> +            put_gfn(d, pfn);
> +            goto out;
> +        }
> +
> +        rc = p2m_change_type_one(d, pfn, t, memtype[a.hvmmem_type]);
> +        put_gfn(d, pfn);
> +
> +        if ( rc )
> +            goto out;
> +
> +        /* Check for continuation if it's not the last iteration */
> +        if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
> +             hypercall_preempt_check() )
> +        {
> +            rc = -ERESTART;
> +            goto out;
> +        }
> +    }
> +    rc = 0;
> +
> + out:
> +    rcu_unlock_domain(d);
> +    *iter = start_iter;
> +
> +    return rc;
> +}
> +
>  long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      unsigned long start_iter, mask;
> @@ -6744,135 +6898,15 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>      }
>
>      case HVMOP_get_mem_type:
> -    {
> -        struct xen_hvm_get_mem_type a;
> -        struct domain *d;
> -        p2m_type_t t;
> -
> -        if ( copy_from_guest(&a, arg, 1) )
> -            return -EFAULT;
> -
> -        d = rcu_lock_domain_by_any_id(a.domid);
> -        if ( d == NULL )
> -            return -ESRCH;
> -
> -        rc = xsm_hvm_param(XSM_TARGET, d, op);
> -        if ( unlikely(rc) )
> -            /* nothing */;
> -        else if ( likely(is_hvm_domain(d)) )
> -        {
> -            /* Use get_gfn query as we are interested in the current
> -             * type, not in allocating or unsharing. That'll happen
> -             * on access. */
> -            get_gfn_query_unlocked(d, a.pfn, &t);
> -            if ( p2m_is_mmio(t) )
> -                a.mem_type =  HVMMEM_mmio_dm;
> -            else if ( t == p2m_mmio_write_dm )
> -                a.mem_type = HVMMEM_mmio_write_dm;
> -            else if ( p2m_is_readonly(t) )
> -                a.mem_type =  HVMMEM_ram_ro;
> -            else if ( p2m_is_ram(t) )
> -                a.mem_type =  HVMMEM_ram_rw;
> -            else if ( p2m_is_pod(t) )
> -                a.mem_type =  HVMMEM_ram_rw;
> -            else if ( p2m_is_grant(t) )
> -                a.mem_type =  HVMMEM_ram_rw;
> -            else
> -                a.mem_type =  HVMMEM_mmio_dm;
> -            if ( __copy_to_guest(arg, &a, 1) )
> -                rc = -EFAULT;
> -        }
> -        else
> -            rc = -EINVAL;
> -
> -        rcu_unlock_domain(d);
> +        rc = hvmop_get_mem_type(
> +            guest_handle_cast(arg, xen_hvm_get_mem_type_t));
>          break;
> -    }
>
>      case HVMOP_set_mem_type:
> -    {
> -        struct xen_hvm_set_mem_type a;
> -        struct domain *d;
> -
> -        /* Interface types to internal p2m types */
> -        static const p2m_type_t memtype[] = {
> -            [HVMMEM_ram_rw]  = p2m_ram_rw,
> -            [HVMMEM_ram_ro]  = p2m_ram_ro,
> -            [HVMMEM_mmio_dm] = p2m_mmio_dm,
> -            [HVMMEM_mmio_write_dm] = p2m_mmio_write_dm
> -        };
> -
> -        if ( copy_from_guest(&a, arg, 1) )
> -            return -EFAULT;
> -
> -        rc = rcu_lock_remote_domain_by_id(a.domid, &d);
> -        if ( rc != 0 )
> -            return rc;
> -
> -        rc = -EINVAL;
> -        if ( !is_hvm_domain(d) )
> -            goto setmemtype_fail;
> -
> -        rc = xsm_hvm_control(XSM_DM_PRIV, d, op);
> -        if ( rc )
> -            goto setmemtype_fail;
> -
> -        rc = -EINVAL;
> -        if ( a.nr < start_iter ||
> -             ((a.first_pfn + a.nr - 1) < a.first_pfn) ||
> -             ((a.first_pfn + a.nr - 1) > domain_get_maximum_gpfn(d)) )
> -            goto setmemtype_fail;
> -
> -        if ( a.hvmmem_type >= ARRAY_SIZE(memtype) )
> -            goto setmemtype_fail;
> -
> -        while ( a.nr > start_iter )
> -        {
> -            unsigned long pfn = a.first_pfn + start_iter;
> -            p2m_type_t t;
> -
> -            get_gfn_unshare(d, pfn, &t);
> -            if ( p2m_is_paging(t) )
> -            {
> -                put_gfn(d, pfn);
> -                p2m_mem_paging_populate(d, pfn);
> -                rc = -EAGAIN;
> -                goto setmemtype_fail;
> -            }
> -            if ( p2m_is_shared(t) )
> -            {
> -                put_gfn(d, pfn);
> -                rc = -EAGAIN;
> -                goto setmemtype_fail;
> -            }
> -            if ( !p2m_is_ram(t) &&
> -                 (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
> -                 (t != p2m_mmio_write_dm || a.hvmmem_type != HVMMEM_ram_rw) )
> -            {
> -                put_gfn(d, pfn);
> -                goto setmemtype_fail;
> -            }
> -
> -            rc = p2m_change_type_one(d, pfn, t, memtype[a.hvmmem_type]);
> -            put_gfn(d, pfn);
> -            if ( rc )
> -                goto setmemtype_fail;
> -
> -            /* Check for continuation if it's not the last interation */
> -            if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
> -                 hypercall_preempt_check() )
> -            {
> -                rc = -ERESTART;
> -                goto setmemtype_fail;
> -            }
> -        }
> -
> -        rc = 0;
> -
> -    setmemtype_fail:
> -        rcu_unlock_domain(d);
> +        rc = hvmop_set_mem_type(
> +            guest_handle_cast(arg, xen_hvm_set_mem_type_t),
> +            &start_iter);
>          break;
> -    }
>
>      case HVMOP_pagetable_dying:
>      {
> --
> 1.9.1
>
>



* Re: [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types.
  2016-04-05 13:57   ` George Dunlap
@ 2016-04-05 14:08     ` George Dunlap
  0 siblings, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-05 14:08 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Keir Fraser, Andrew Cooper, xen-devel, Paul Durrant, Lv, Zhiyuan,
	Jan Beulich

On Tue, Apr 5, 2016 at 2:57 PM, George Dunlap <dunlapg@umich.edu> wrote:
> On Thu, Mar 31, 2016 at 11:53 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>> For clarity this patch breaks the code to set/get memory types out
>> of do_hvm_op() into dedicated functions: hvmop_set/get_mem_type().
>> Also, for clarity, checks for whether a memory type change is allowed
>> are broken out into a separate function called by hvmop_set_mem_type().
>>
>> There is no intentional functional change in this patch.
>>
>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
>
>> ---

And for future reference: You should put the changes between versions
in the patches themselves, rather than in the header.  It's probably
easier for both you and the reviewers.

 -George



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-03-31 10:53 ` [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
@ 2016-04-05 14:38   ` George Dunlap
  2016-04-08 13:26   ` Andrew Cooper
  2016-04-08 21:48   ` Jan Beulich
  2 siblings, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-05 14:38 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Andrew Cooper, Tim Deegan,
	xen-devel, Paul Durrant, Lv, Zhiyuan, Jan Beulich

On Thu, Mar 31, 2016 at 11:53 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> Previously p2m type p2m_mmio_write_dm was introduced for write-
> protected memory pages whose write operations are supposed to be
> forwarded to and emulated by an ioreq server. Yet limitations of
> rangeset restrict the number of guest pages to be write-protected.
>
> This patch replaces the p2m type p2m_mmio_write_dm with a new name:
> p2m_ioreq_server, which means this p2m type can be claimed by one
> ioreq server, instead of being tracked inside the rangeset of ioreq
> server. Patches following up will add the related hvmop handling
> code which map/unmap type p2m_ioreq_server to/from an ioreq server.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

Again, this would have been a lot easier to review if you'd said that
the only difference was to make it a straight rename, and move the
type change to the next patch.

Reviewed-by: George Dunlap <george.dunlap@citrix.com>



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-03-31 10:53 ` [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
       [not found]   ` <20160404082556.GC28633@deinos.phlegethon.org>
@ 2016-04-06 17:13   ` George Dunlap
  2016-04-07  7:01     ` Yu, Zhang
  2016-04-08 13:33   ` Andrew Cooper
  2016-04-08 22:28   ` Jan Beulich
  3 siblings, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-06 17:13 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Andrew Cooper, Tim Deegan,
	xen-devel, Paul Durrant, Lv, Zhiyuan, Jun Nakajima

On Thu, Mar 31, 2016 at 11:53 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
> A new HVMOP, HVMOP_map_mem_type_to_ioreq_server, is added to
> let one ioreq server claim/disclaim its responsibility for the
> handling of guest pages with p2m type p2m_ioreq_server. Users
> of this HVMOP can specify whether the p2m_ioreq_server is supposed
> to handle write accesses, read accesses, or both, in a parameter
> named flags. For now, we only support one ioreq server for this p2m
> type, so once an ioreq server has claimed ownership, subsequent calls
> of HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
> disclaim the ownership of guest ram pages with this p2m type, by
> triggering this new HVMOP with the ioreq server id set to the current
> owner's and the flags parameter set to 0.
>
> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
> are only supported for HVMs with HAP enabled.
>
> Note that the flags parameter (if not 0) of this HVMOP only indicates
> which kinds of memory accesses are to be forwarded to an ioreq server;
> it affects the access rights of guest ram pages, but the two are not
> the same. Due to hardware limitations, if only write operations are
> to be forwarded, reads will still be performed at full speed, with
> no hypervisor intervention. But if reads are to be forwarded to
> an ioreq server, writes will inevitably be trapped into the hypervisor
> as well, which means a significant performance impact.
>
> Also note that HVMOP_map_mem_type_to_ioreq_server does not change
> the p2m type of any guest ram page; that only happens when
> HVMOP_set_mem_type is triggered. So the normal sequence is for the
> backend driver to first claim ownership of guest ram pages of the
> p2m_ioreq_server type, and then set the memory type to
> p2m_ioreq_server for the specified guest ram pages.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

And again, review of this patch was significantly delayed because you
didn't provide any description of the changes you made between v1 and
v2 or why.

Overall looks good.  Just a few questions...

> +static int hvmop_map_mem_type_to_ioreq_server(
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
> +{
> +    xen_hvm_map_mem_type_to_ioreq_server_t op;
> +    struct domain *d;
> +    int rc;
> +
> +    if ( copy_from_guest(&op, uop, 1) )
> +        return -EFAULT;
> +
> +    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    rc = -EINVAL;
> +    if ( !is_hvm_domain(d) )
> +        goto out;
> +
> +    /* For now, only support for HAP enabled hvm */
> +    if ( !hap_enabled(d) )
> +        goto out;

So before I suggested that this be restricted to HAP because you were
using p2m_memory_type_changed(), which was only implemented on EPT.
But since then you've switched that code to use
p2m_change_entry_type_global() instead, which is implemented by both;
and you implement the type in p2m_type_to_flags().  Is there any
reason to keep this restriction?

> +    /*
> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
> +     * we mark the p2m table to be recalculated, so that gfns which were
> +     * previously marked with p2m_ioreq_server can be resynced.
> +     */
> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);

This comment doesn't seem to be accurate (or if it is it's a bit
confusing).  Would it be more accurate to say something like the
following:

"Each time we map / unmap in ioreq server to/from p2m_ioreq_server, we
reset all memory currently marked p2m_ioreq_server to p2m_ram_rw."

But of course that raises another question: is there ever any risk
that an ioreq server will change some other ram type (say, p2m_ram_ro)
to p2m_ioreq_server, and then have it changed back to p2m_ram_rw when
it detaches?

>  /* Types that can be subject to bulk transitions. */
>  #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
> -                              | p2m_to_mask(p2m_ram_logdirty) )
> +                              | p2m_to_mask(p2m_ram_logdirty) \
> +                              | p2m_to_mask(p2m_ioreq_server) )

It's probably worth a comment here noting that you can do a mass
change *from* p2m_ioreq_server, but you can't do a mass change *to*
p2m_ioreq_server.  (And doing so would need to change a number of
places in the code where it's assumed that the result of such a
recalculation is either p2m_logdirty or p2m_ram_rw -- e.g.,
> p2m-ept.c:553, p2m-pt.c:452, &c.)

Really getting down to the fine details here -- thanks for all your
work on this.

 -George



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-06 17:13   ` George Dunlap
@ 2016-04-07  7:01     ` Yu, Zhang
       [not found]       ` <CAFLBxZbLp2zWzCzQTaJNWbanQSmTJ57ZyTh0qaD-+YUn8o8pyQ@mail.gmail.com>
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-07  7:01 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Andrew Cooper, Tim Deegan,
	xen-devel, Paul Durrant, Lv, Zhiyuan, Jun Nakajima

Thanks for your advice and good questions. :)

On 4/7/2016 1:13 AM, George Dunlap wrote:
> On Thu, Mar 31, 2016 at 11:53 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>> A new HVMOP, HVMOP_map_mem_type_to_ioreq_server, is added to
>> let one ioreq server claim/disclaim its responsibility for the
>> handling of guest pages with p2m type p2m_ioreq_server. Users
>> of this HVMOP can specify whether the p2m_ioreq_server is supposed
>> to handle write accesses, read accesses, or both, in a parameter
>> named flags. For now, we only support one ioreq server for this p2m
>> type, so once an ioreq server has claimed ownership, subsequent calls
>> of HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>> disclaim the ownership of guest ram pages with this p2m type, by
>> triggering this new HVMOP with the ioreq server id set to the current
>> owner's and the flags parameter set to 0.
>>
>> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>> are only supported for HVMs with HAP enabled.
>>
>> Note that the flags parameter (if not 0) of this HVMOP only indicates
>> which kinds of memory accesses are to be forwarded to an ioreq server;
>> it affects the access rights of guest ram pages, but the two are not
>> the same. Due to hardware limitations, if only write operations are
>> to be forwarded, reads will still be performed at full speed, with
>> no hypervisor intervention. But if reads are to be forwarded to
>> an ioreq server, writes will inevitably be trapped into the hypervisor
>> as well, which means a significant performance impact.
>>
>> Also note that HVMOP_map_mem_type_to_ioreq_server does not change
>> the p2m type of any guest ram page; that only happens when
>> HVMOP_set_mem_type is triggered. So the normal sequence is for the
>> backend driver to first claim ownership of guest ram pages of the
>> p2m_ioreq_server type, and then set the memory type to
>> p2m_ioreq_server for the specified guest ram pages.
>>
>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>
> And again, review of this patch was significantly delayed because you
> didn't provide any description of the changes you made between v1 and
> v2 or why.

Sorry about the inconvenience, will change in next version.

>
> Overall looks good.  Just a few questions...
>
>> +static int hvmop_map_mem_type_to_ioreq_server(
>> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
>> +{
>> +    xen_hvm_map_mem_type_to_ioreq_server_t op;
>> +    struct domain *d;
>> +    int rc;
>> +
>> +    if ( copy_from_guest(&op, uop, 1) )
>> +        return -EFAULT;
>> +
>> +    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
>> +    if ( rc != 0 )
>> +        return rc;
>> +
>> +    rc = -EINVAL;
>> +    if ( !is_hvm_domain(d) )
>> +        goto out;
>> +
>> +    /* For now, only support for HAP enabled hvm */
>> +    if ( !hap_enabled(d) )
>> +        goto out;
>
> So before I suggested that this be restricted to HAP because you were
> using p2m_memory_type_changed(), which was only implemented on EPT.
> But since then you've switched that code to use
> p2m_change_entry_type_global() instead, which is implemented by both;
> and you implement the type in p2m_type_to_flags().  Is there any
> reason to keep this restriction?
>

Yes. And this is a change which was not explained clearly. Sorry.

The reason I chose p2m_change_entry_type_global() instead:
p2m_memory_type_changed() only triggers resynchronization of the EPT
memory types in resolve_misconfig(). Yet it is the p2m type we want
to be recalculated, hence p2m_change_entry_type_global().

The reason I restricted the code to HAP mode:
Well, p2m_change_entry_type_global() is only called by HAP code
like hap_[en|dis]able_log_dirty() etc., which are registered during
hap_domain_init(). Shadow mode uses sh_[en|dis]able_log_dirty(),
which does not use p2m_change_entry_type_global().

Since my intention is to resync the outdated p2m_ioreq_server pages
back to p2m_ram_rw, calling p2m_change_entry_type_global() directly
should be much more convenient (and correct) for me than inventing
another wrapper to cover both HAP and shadow mode (which XenGT does
not use for now).


>> +    /*
>> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
>> +     * we mark the p2m table to be recalculated, so that gfns which were
>> +     * previously marked with p2m_ioreq_server can be resynced.
>> +     */
>> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>
> This comment doesn't seem to be accurate (or if it is it's a bit
> confusing).  Would it be more accurate to say something like the
> following:
>
> "Each time we map / unmap in ioreq server to/from p2m_ioreq_server, we
> reset all memory currently marked p2m_ioreq_server to p2m_ram_rw."
>
Well, I agree this comment is not quite accurate. As you said, the
purpose of calling p2m_change_entry_type_global() here is to "reset
all memory currently marked p2m_ioreq_server to p2m_ram_rw". But the
recalculation is asynchronous. So how about:

"Each time we map/unmap an ioreq server to/from p2m_ioreq_server, we
mark the p2m table to be recalculated, so all memory currently marked
p2m_ioreq_server can be reset to p2m_ram_rw later."?

> But of course that raises another question: is there ever any risk
> that an ioreq server will change some other ram type (say, p2m_ram_ro)
> to p2m_ioreq_server, and then have it changed back to p2m_ram_rw when
> it detaches?
>
Good question. And unfortunately, yes there is. :)
Maybe we should insist that only p2m_ram_rw pages can be changed to
p2m_ioreq_server, and vice versa.

>>   /* Types that can be subject to bulk transitions. */
>>   #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
>> -                              | p2m_to_mask(p2m_ram_logdirty) )
>> +                              | p2m_to_mask(p2m_ram_logdirty) \
>> +                              | p2m_to_mask(p2m_ioreq_server) )
>
> It's probably worth a comment here noting that you can do a mass
> change *from* p2m_ioreq_server, but you can't do a mass change *to*
> p2m_ioreq_server.  (And doing so would need to change a number of
> places in the code where it's assumed that the result of such a
> recalculation is either p2m_logdirty or p2m_ram_rw -- e.g.,
> p2m-ept.c:553, p2m-pt.c:452, &c.)
>
I agree with adding a note here.
But adding extra code in resolve_misconfig()/do_recalc()? Is this
necessary? IIUC, the current code already guarantees there will be no
mass change *to* p2m_ioreq_server.


> Really getting down to the fine details here -- thanks for all your
> work on this.
>
>   -George
>

B.R.
Yu



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
       [not found]       ` <CAFLBxZbLp2zWzCzQTaJNWbanQSmTJ57ZyTh0qaD-+YUn8o8pyQ@mail.gmail.com>
@ 2016-04-08 10:39         ` George Dunlap
       [not found]         ` <5707839F.9060803@linux.intel.com>
  1 sibling, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-08 10:39 UTC (permalink / raw)
  To: Yu, Zhang; +Cc: xen-devel

Oops -- forgot to CC the list...

On Thu, Apr 7, 2016 at 10:55 AM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> On Thu, Apr 7, 2016 at 8:01 AM, Yu, Zhang <yu.c.zhang@linux.intel.com> wrote:
>>>> +    /* For now, only support for HAP enabled hvm */
>>>> +    if ( !hap_enabled(d) )
>>>> +        goto out;
>>>
>>>
>>> So before I suggested that this be restricted to HAP because you were
>>> using p2m_memory_type_changed(), which was only implemented on EPT.
>>> But since then you've switched that code to use
>>> p2m_change_entry_type_global() instead, which is implemented by both;
>>> and you implement the type in p2m_type_to_flags().  Is there any
>>> reason to keep this restriction?
>>>
>>
>> Yes. And this is a change which was not explained clearly. Sorry.
>>
>> The reason I chose p2m_change_entry_type_global() instead:
>> p2m_memory_type_changed() only triggers resynchronization of the EPT
>> memory types in resolve_misconfig(). Yet it is the p2m type we want
>> to be recalculated, hence p2m_change_entry_type_global().
>>
>> The reason I restricted the code to HAP mode:
>> Well, p2m_change_entry_type_global() is only called by HAP code
>> like hap_[en|dis]able_log_dirty() etc., which are registered during
>> hap_domain_init(). Shadow mode uses sh_[en|dis]able_log_dirty(),
>> which does not use p2m_change_entry_type_global().
>
> Oh, right -- yes, there's an ASSERT(hap_enabled()) right at the top of
> p2m_pt_change_entry_type_global().
>
> Yes, if that functionality is not already implemented for shadow,
> there's no need for you to implement it; and restricting it to
> HAP-only is the obvious solution.
>
>>>> +    /*
>>>> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
>>>> +     * we mark the p2m table to be recalculated, so that gfns which were
>>>> +     * previously marked with p2m_ioreq_server can be resynced.
>>>> +     */
>>>> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>>
>>>
>>> This comment doesn't seem to be accurate (or if it is it's a bit
>>> confusing).  Would it be more accurate to say something like the
>>> following:
>>>
>>> "Each time we map / unmap in ioreq server to/from p2m_ioreq_server, we
>>> reset all memory currently marked p2m_ioreq_server to p2m_ram_rw."
>>>
>> Well, I agree this comment is not quite accurate. Like you said in your
>> comment, the purpose here, calling p2m_change_entry_type_global() is to
>> "reset all memory currently marked p2m_ioreq_server to p2m_ram_rw". But
>> the recalculation is asynchronous. So how about:
>>
>> "Each time we map/unmap an ioreq server to/from p2m_ioreq_server, we
>> mark the p2m table to be recalculated, so all memory currently marked
>> p2m_ioreq_server can be reset to p2m_ram_rw later."?
>
> I think we're emphasizing different things -- I'm emphasizing what the
> change will be, and you're emphasizing when the change will happen.
> :-)
>
> I think from the point of view of someone reading this code, it
> doesn't matter when the physical p2m entries get updated; *logically*
> they are updated now, since from this call onward anything that reads
> the p2m table will get the new type.  The fact that we do it lazily is
> an implementation detail -- we could change the function behind the
> scenes to do it all at once, and the semantics would be the same (it
> would just cause a long pause all at once).
>
> If you really want to include when the change will happen, what about this:
>
> "Each time we map / unmap in ioreq server to/from p2m_ioreq_server, we
> reset all memory currently marked p2m_ioreq_server to p2m_ram_rw.
> (This happens lazily as the p2m entries are accessed.)"
>
> BTW, I think this also means that for the interface at the moment, you
> can't change the kinds of accesses you want to intercept very easily
> -- if you want to change from intercepting only writes to intercepting
> both reads and writes, you have to detach from the ioreq_server type
> completely (which will make all your currently-marked ioreq_server
> pages go back to ram_rw), then attach again, and re-mark all the pages
> you were watching.
>
> Which is perhaps not perfect, but I suppose it will do.  It should
> even be possible to change this in the future -- ioreq servers that
> want to change the access mode can try just changing it directly, and
> if they get -EBUSY, do it the hard way.
>
> (Just to be clear, I'm thinking out loud here in the last two
> paragraphs -- no action required unless you feel like it.)
>
>>> But of course that raises another question: is there ever any risk
>>> that an ioreq server will change some other ram type (say, p2m_ram_ro)
>>> to p2m_ioreq_server, and then have it changed back to p2m_ram_rw when
>>> it detaches?
>>>
>> Good question. And unfortunately, yes there is. :)
>> Maybe we should insist that only p2m_ram_rw pages can be changed to
>> p2m_ioreq_server, and vice versa.
>
> Well I think if you only allow p2m_ram_rw pages to be changed to
> p2m_ioreq_server, that should be sufficient.  If that works for you it
> seems like it could be a reasonable approach.
>
>>
>>>>   /* Types that can be subject to bulk transitions. */
>>>>   #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
>>>> -                              | p2m_to_mask(p2m_ram_logdirty) )
>>>> +                              | p2m_to_mask(p2m_ram_logdirty) \
>>>> +                              | p2m_to_mask(p2m_ioreq_server) )
>>>
>>>
>>> It's probably worth a comment here noting that you can do a mass
>>> change *from* p2m_ioreq_server, but you can't do a mass change *to*
>>> p2m_ioreq_server.  (And doing so would need to change a number of
>>> places in the code where it's assumed that the result of such a
>>> recalculation is either p2m_logdirty or p2m_ram_rw -- e.g.,
>>> p2m-ept.c:553, p2m-pt.c:452, &c.)
>>>
>> I agree with adding a note here.
>> But adding extra code in resolve_misconfig()/do_recalc()? Is this
>> necessary? IIUC, the current code already guarantees there will be no
>> mass change *to* p2m_ioreq_server.
>
> Sorry -- in this context, when I said "And doing so would need...", I
> meant, "And also note in your comment that if someone wanted to do a
> mass change to p2m_ioreq_server, they would need to change..."  That
> is, was suggesting you write a comment giving a hint to people in the
> future about the limitations, not add any code yourself. :-)
>
> But maybe it would actually be better to add an ASSERT(nt !=
> p2m_ioreq_server) to p2m_change_entry_type_global(), with a comment
> saying that much of the lazy recalc code assumes being changed into
> only ram_logdirty or ram_rw?
>
> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
> transition -- I assume that live migration is incompatible with this
> functionality?  Is there anything that prevents a live migration from
> being started when there are outstanding p2m_ioreq_server entries?
>
>  -George



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
       [not found]         ` <5707839F.9060803@linux.intel.com>
@ 2016-04-08 11:01           ` George Dunlap
  2016-04-11 11:15             ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-08 11:01 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, xen-devel

On 08/04/16 11:10, Yu, Zhang wrote:
[snip]
> BTW, I noticed your reply has not been CCed to the mailing list, and I
> also wonder if we should raise this last question with the community?

Oops -- that was a mistake on my part.  :-)  I appreciate the
discretion; just so you know in the future, if I'm purposely changing
the CC list (removing xen-devel and/or adding extra people), I'll almost
always say so at the top of the mail.

>> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
>> transition -- I assume that live migration is incompatible with this
>> functionality?  Is there anything that prevents a live migration from
>> being started when there are outstanding p2m_ioreq_server entries?
>>
> 
> Another good question, and the answer is unfortunately yes. :-)
> 
> If live migration happens during the normal emulation process, entries
> marked with p2m_ioreq_server will be changed to p2m_ram_logdirty in
> resolve_misconfig(), and later write operations will change them to
> p2m_ram_rw; thereafter these pages cannot be forwarded to the device model.
> From this point of view, this functionality is incompatible with live
> migration.
> 
> But for XenGT, I think this is acceptable, because, if live migration
> is to be supported in the future, intervention from backend device
> model will be necessary. At that time, we can guarantee from the device
> model side that there's no outdated p2m_ioreq_server entries, hence no
> need to reset the p2m type back to p2m_ram_rw(and do not include
> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I mean
> when an ioreq server is detached from p2m_ioreq_server, or before an
> ioreq server is attached to this type, entries marked with
> p2m_ioreq_server should be regarded as outdated.
> 
> Is this acceptable to you? Any suggestions?

So the question is, as of this series, what happens if someone tries to
initiate a live migration while there are outstanding p2m_ioreq_server
entries?

If the answer is "the ioreq server suddenly loses all control of the
memory", that's something that needs to be changed.

If the answer is, "everything just works", that's perfect.

If the answer is, "Before logdirty mode is set, the ioreq server has the
opportunity to detach, removing the p2m_ioreq_server entries, and
operating without that functionality", that's good too.

If the answer is, "the live migration request fails and the guest
continues to run", that's also acceptable.  If you want this series to
be checked in today (the last day for 4.7), this is probably your best bet.

 -George




^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types.
  2016-03-31 10:53 ` [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
  2016-04-05 13:57   ` George Dunlap
@ 2016-04-08 13:25   ` Andrew Cooper
  1 sibling, 0 replies; 82+ messages in thread
From: Andrew Cooper @ 2016-04-08 13:25 UTC (permalink / raw)
  To: Yu Zhang, xen-devel; +Cc: Paul Durrant, Keir Fraser, zhiyuan.lv, Jan Beulich

On 31/03/16 11:53, Yu Zhang wrote:
> For clarity this patch breaks the code to set/get memory types out
> of do_hvm_op() into dedicated functions: hvmop_set/get_mem_type().
> Also, for clarity, checks for whether a memory type change is allowed
> are broken out into a separate function called by hvmop_set_mem_type().
>
> There is no intentional functional change in this patch.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-03-31 10:53 ` [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
  2016-04-05 14:38   ` George Dunlap
@ 2016-04-08 13:26   ` Andrew Cooper
  2016-04-08 21:48   ` Jan Beulich
  2 siblings, 0 replies; 82+ messages in thread
From: Andrew Cooper @ 2016-04-08 13:26 UTC (permalink / raw)
  To: Yu Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, George Dunlap, Tim Deegan,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

On 31/03/16 11:53, Yu Zhang wrote:
> Previously the p2m type p2m_mmio_write_dm was introduced for write-
> protected memory pages whose write operations are supposed to be
> forwarded to and emulated by an ioreq server. Yet limitations of
> rangeset restrict the number of guest pages that can be
> write-protected.
>
> This patch renames the p2m type p2m_mmio_write_dm to
> p2m_ioreq_server, meaning this p2m type can be claimed by one
> ioreq server, instead of being tracked inside the rangeset of an
> ioreq server. Follow-up patches will add the related hvmop handling
> code which maps/unmaps type p2m_ioreq_server to/from an ioreq server.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-03-31 10:53 ` [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
       [not found]   ` <20160404082556.GC28633@deinos.phlegethon.org>
  2016-04-06 17:13   ` George Dunlap
@ 2016-04-08 13:33   ` Andrew Cooper
  2016-04-11 11:14     ` Yu, Zhang
  2016-04-08 22:28   ` Jan Beulich
  3 siblings, 1 reply; 82+ messages in thread
From: Andrew Cooper @ 2016-04-08 13:33 UTC (permalink / raw)
  To: Yu Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, George Dunlap, Tim Deegan,
	Paul Durrant, zhiyuan.lv, Jan Beulich

On 31/03/16 11:53, Yu Zhang wrote:
> A new HVMOP, HVMOP_map_mem_type_to_ioreq_server, is added to
> let one ioreq server claim/disclaim its responsibility for the
> handling of guest pages with p2m type p2m_ioreq_server. Users
> of this HVMOP can specify whether the p2m_ioreq_server is supposed
> to handle write accesses, read accesses, or both, in a parameter
> named flags. For now, we only support one ioreq server for this p2m
> type, so once an ioreq server has claimed ownership, subsequent calls
> of HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
> disclaim the ownership of guest ram pages with this p2m type, by
> triggering this new HVMOP with the ioreq server id set to the current
> owner's and the flags parameter set to 0.
>
> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
> are only supported for HVMs with HAP enabled.
>
> Note that the flags parameter (if not 0) of this HVMOP only indicates
> which kinds of memory accesses are to be forwarded to an ioreq server;
> it affects the access rights of guest ram pages, but the two are not
> the same. Due to hardware limitations, if only write operations are
> to be forwarded, reads will still be performed at full speed, with
> no hypervisor intervention. But if reads are to be forwarded to
> an ioreq server, writes will inevitably be trapped into the hypervisor
> as well, which means a significant performance impact.
>
> Also note that HVMOP_map_mem_type_to_ioreq_server does not change
> the p2m type of any guest ram page; that only happens when
> HVMOP_set_mem_type is triggered. So the normal sequence is for the
> backend driver to first claim ownership of guest ram pages of the
> p2m_ioreq_server type, and then set the memory type to
> p2m_ioreq_server for the specified guest ram pages.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Tim Deegan <tim@xen.org>
> ---
>  xen/arch/x86/hvm/emulate.c       | 125 +++++++++++++++++++++++++++++++++++++--
>  xen/arch/x86/hvm/hvm.c           |  95 +++++++++++++++++++++++++++--
>  xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>  xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
>  xen/arch/x86/mm/p2m-pt.c         |  25 +++++---
>  xen/arch/x86/mm/p2m.c            |  82 +++++++++++++++++++++++++
>  xen/arch/x86/mm/shadow/multi.c   |   3 +-
>  xen/include/asm-x86/p2m.h        |  36 +++++++++--
>  xen/include/public/hvm/hvm_op.h  |  37 ++++++++++++
>  9 files changed, 395 insertions(+), 24 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index ddc8007..77a4793 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -94,11 +94,69 @@ static const struct hvm_io_handler null_handler = {
>      .ops = &null_ops
>  };
>  
> +static int mem_read(const struct hvm_io_handler *io_handler,
> +                    uint64_t addr,
> +                    uint32_t size,
> +                    uint64_t *data)
> +{
> +    struct domain *currd = current->domain;
> +    unsigned long gmfn = paddr_to_pfn(addr);
> +    unsigned long offset = addr & ~PAGE_MASK;
> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
> +    uint8_t *p;
> +
> +    if ( !page )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    p = __map_domain_page(page);
> +    p += offset;
> +    memcpy(data, p, size);

What happens when offset + size crosses the page boundary?


> diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
> index a1eae52..d46f186 100644
> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -489,6 +489,43 @@ struct xen_hvm_altp2m_op {
>  typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
>  
> +#if defined(__XEN__) || defined(__XEN_TOOLS__)
> +
> +/*
> + * HVMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
> + *                                      to specific memory type <type>
> + *                                      for specific accesses <flags>
> + *
> + * Note that if only write operations are to be forwarded to an ioreq server,
> + * read operations will be performed with no hypervisor intervention. But if
> + * flags indicates that read operations are to be forwarded to an ioreq server,
> + * write operations will inevitably be trapped into hypervisor, whether they
> + * are emulated by hypervisor or forwarded to ioreq server depends on the flags
> + * setting. This situation means significant performance impact.
> + */
> +#define HVMOP_map_mem_type_to_ioreq_server 26
> +struct xen_hvm_map_mem_type_to_ioreq_server {
> +    domid_t domid;      /* IN - domain to be serviced */
> +    ioservid_t id;      /* IN - ioreq server id */
> +    hvmmem_type_t type; /* IN - memory type */

hvmmem_type_t is an enum and doesn't have a fixed width.  It can't be
used in the public API.

You also have some implicit padding holes as a result of the layout.

~Andrew

> +    uint32_t flags;     /* IN - types of accesses to be forwarded to the
> +                           ioreq server. flags with 0 means to unmap the
> +                           ioreq server */
> +#define _HVMOP_IOREQ_MEM_ACCESS_READ 0
> +#define HVMOP_IOREQ_MEM_ACCESS_READ \
> +    (1u << _HVMOP_IOREQ_MEM_ACCESS_READ)
> +
> +#define _HVMOP_IOREQ_MEM_ACCESS_WRITE 1
> +#define HVMOP_IOREQ_MEM_ACCESS_WRITE \
> +    (1u << _HVMOP_IOREQ_MEM_ACCESS_WRITE)
> +};
> +typedef struct xen_hvm_map_mem_type_to_ioreq_server
> +    xen_hvm_map_mem_type_to_ioreq_server_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_map_mem_type_to_ioreq_server_t);
> +
> +#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
> +
> +
>  #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
>  
>  /*


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-03-31 10:53 ` [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
  2016-04-05 14:38   ` George Dunlap
  2016-04-08 13:26   ` Andrew Cooper
@ 2016-04-08 21:48   ` Jan Beulich
  2016-04-18  8:41     ` Paul Durrant
  2 siblings, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-08 21:48 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Tim Deegan, Paul Durrant, Yu Zhang, zhiyuan.lv, Jun Nakajima

>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -83,7 +83,7 @@ typedef enum {
>      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>      HVMMEM_mmio_dm,            /* Reads and write go to the device model */
> -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device model */
> +    HVMMEM_ioreq_server,
>  } hvmmem_type_t;
>  
>  /* Following tools-only interfaces may change in future. */

So there's one problem here, which the comment at the bottom of the
context already hints at: this enum is part of the interface that is
not restricted to the tools (as HVMOP_get_mem_type is usable by guests
themselves), which we cannot change like this. Since the meaning of the
enumerator value doesn't change, I guess we can get away with simply
retaining its old name for a non-up-to-date __XEN_INTERFACE_VERSION__.

Jan



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-03-31 10:53 ` [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
                     ` (2 preceding siblings ...)
  2016-04-08 13:33   ` Andrew Cooper
@ 2016-04-08 22:28   ` Jan Beulich
  2016-04-11 11:14     ` Yu, Zhang
  3 siblings, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-08 22:28 UTC (permalink / raw)
  To: Paul Durrant, Yu Zhang
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Tim Deegan, xen-devel, zhiyuan.lv, Jun Nakajima

>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
> +static int mem_write(const struct hvm_io_handler *handler,
> +                     uint64_t addr,
> +                     uint32_t size,
> +                     uint64_t data)
> +{
> +    struct domain *currd = current->domain;
> +    unsigned long gmfn = paddr_to_pfn(addr);
> +    unsigned long offset = addr & ~PAGE_MASK;
> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
> +    uint8_t *p;
> +
> +    if ( !page )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    p = __map_domain_page(page);
> +    p += offset;
> +    memcpy(p, &data, size);

What if the page is a r/o one? Not having found an ioreq server, I'm
not sure assumptions on the page being writable can validly be made.

> @@ -168,13 +226,72 @@ static int hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        struct hvm_ioreq_server *s;
> +        p2m_type_t p2mt;
> +
> +        if ( is_mmio )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +
> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> +
> +            switch ( p2mt )
> +            {
> +                case p2m_ioreq_server:
> +                {
> +                    unsigned long flags;
> +
> +                    p2m_get_ioreq_server(currd, &flags, &s);

As the function apparently returns no value right now, please avoid
the indirection on both values you're after - one of the two
(presumably s) can be the function's return value.

> +                    if ( !s )
> +                        break;
> +
> +                    if ( (dir == IOREQ_READ &&
> +                          !(flags & P2M_IOREQ_HANDLE_READ_ACCESS)) ||
> +                         (dir == IOREQ_WRITE &&
> +                          !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS)) )

I think this would be easier to read using a conditional expression
with the condition being dir == IOREQ_<one-of-the-two>, just
selecting either of the two possible bit masks.

> +                        s = NULL;
> +
> +                    break;
> +                }
> +                case p2m_ram_rw:

Blank line above here please.

>          /* If there is no suitable backing DM, just ignore accesses */
>          if ( !s )
>          {
> -            rc = hvm_process_io_intercept(&null_handler, &p);
> +            switch ( p2mt )
> +            {
> +            case p2m_ioreq_server:
> +            /*
> +             * Race conditions may exist when access to a gfn with
> +             * p2m_ioreq_server is intercepted by hypervisor, during
> +             * which time p2m type of this gfn is recalculated back
> +             * to p2m_ram_rw. mem_handler is used to handle this
> +             * corner case.
> +             */

Now if there is such a race condition, the race could also be with a
page changing first to ram_rw and then immediately further to e.g.
ram_ro. See the earlier comment about assuming the page to be
writable.

> +            case p2m_ram_rw:
> +                rc = hvm_process_io_intercept(&mem_handler, &p);
> +                break;
> +
> +            default:
> +                rc = hvm_process_io_intercept(&null_handler, &p);

Along with the above, I doubt it is correct to have e.g. ram_ro come
here.

> +static int hvm_map_mem_type_to_ioreq_server(struct domain *d,
> +                                            ioservid_t id,
> +                                            hvmmem_type_t type,
> +                                            uint32_t flags)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    /* For now, only HVMMEM_ioreq_server is supported */
> +    if ( type != HVMMEM_ioreq_server )
> +        return -EINVAL;
> +
> +    if ( flags & ~(HVMOP_IOREQ_MEM_ACCESS_READ |
> +                   HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> +        return -EINVAL;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            rc = p2m_set_ioreq_server(d, flags, s);
> +            if ( rc == 0 )
> +                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");

Why gdprintk()? I don't think the current domain is of much
interest here. What would be of interest is the subject domain.

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -132,6 +132,19 @@ static void ept_p2m_type_to_flags(struct p2m_domain 
> *p2m, ept_entry_t *entry,
>              entry->r = entry->w = entry->x = 1;
>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>              break;
> +        case p2m_ioreq_server:
> +            entry->r = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_READ_ACCESS);
> +	    /*
> +	     * write access right is disabled when entry->r is 0, but whether
> +	     * write accesses are emulated by hypervisor or forwarded to an
> +	     * ioreq server depends on the setting of p2m->ioreq.flags.
> +	     */
> +            entry->w = (entry->r &&
> +                        !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS));
> +            entry->x = entry->r;

Why would we want to allow instruction execution from such pages?
And with all three bits now possibly being clear, aren't we risking the
entries to be mis-treated as not-present ones?

> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -72,8 +72,8 @@ static const unsigned long pgt[] = {
>      PGT_l3_page_table
>  };
>  
> -static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
> -                                       unsigned int level)
> +static unsigned long p2m_type_to_flags(struct p2m_domain *p2m, p2m_type_t t,

const

> +int p2m_set_ioreq_server(struct domain *d,
> +                         unsigned long flags,
> +                         struct hvm_ioreq_server *s)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    int rc;
> +
> +    spin_lock(&p2m->ioreq.lock);
> +
> +    rc = -EBUSY;
> +    if ( (flags != 0) && (p2m->ioreq.server != NULL) )
> +        goto out;
> +
> +    rc = -EINVAL;
> +    /* unmap ioreq server from p2m type by passing flags with 0 */

Comment style (also elsewhere).

> +    if ( (flags == 0) && (p2m->ioreq.server != s) )
> +        goto out;

The two flags checks above are redundant with ...

> +    if ( flags == 0 )
> +    {
> +        p2m->ioreq.server = NULL;
> +        p2m->ioreq.flags = 0;
> +    }
> +    else
> +    {
> +        p2m->ioreq.server = s;
> +        p2m->ioreq.flags = flags;
> +    }

... this - I think the earlier ones should be folded into this.

> +    /*
> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
> +     * we mark the p2m table to be recalculated, so that gfns which were
> +     * previously marked with p2m_ioreq_server can be resynced.
> +     */
> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);

What does "resynced" here mean? I.e. I can see why this is wanted
when unmapping a server, but when mapping a server there shouldn't
be any such pages in the first place.

> +    rc = 0;
> +
> +out:

Labels indented by at least one space please.

> @@ -320,6 +321,27 @@ struct p2m_domain {
>          struct ept_data ept;
>          /* NPT-equivalent structure could be added here. */
>      };
> +
> +    struct {
> +        spinlock_t lock;
> +        /*
> +         * ioreq server who's responsible for the emulation of
> +         * gfns with specific p2m type(for now, p2m_ioreq_server).
> +         * Behaviors of gfns with p2m_ioreq_server set but no
> +         * ioreq server mapped in advance should be the same as
> +         * p2m_ram_rw.
> +         */
> +        struct hvm_ioreq_server *server;
> +        /*
> +         * flags specifies whether read, write or both operations
> +         * are to be emulated by an ioreq server.
> +         */
> +        unsigned long flags;

unsigned int

> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -489,6 +489,43 @@ struct xen_hvm_altp2m_op {
>  typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
>  
> +#if defined(__XEN__) || defined(__XEN_TOOLS__)

Instead of adding yet another such section, couldn't this be added
to an already existing one?

> +struct xen_hvm_map_mem_type_to_ioreq_server {
> +    domid_t domid;      /* IN - domain to be serviced */
> +    ioservid_t id;      /* IN - ioreq server id */
> +    hvmmem_type_t type; /* IN - memory type */

You can't use this type for public interface structure fields - this
must be uintXX_t.

Jan


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-08 13:33   ` Andrew Cooper
@ 2016-04-11 11:14     ` Yu, Zhang
  2016-04-11 12:20       ` Andrew Cooper
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-11 11:14 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, George Dunlap, Tim Deegan,
	Paul Durrant, zhiyuan.lv, Jun Nakajima



On 4/8/2016 9:33 PM, Andrew Cooper wrote:
> On 31/03/16 11:53, Yu Zhang wrote:
>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>> let one ioreq server claim/disclaim its responsibility for the
>> handling of guest pages with p2m type p2m_ioreq_server. Users
>> of this HVMOP can specify whether the p2m_ioreq_server is supposed
>> to handle write accesses or read ones or both in a parameter named
>> flags. For now, we only support one ioreq server for this p2m type,
>> so once an ioreq server has claimed its ownership, subsequent calls
>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>> disclaim the ownership of guest ram pages with this p2m type, by
>> triggering this new HVMOP, with ioreq server id set to the current
>> owner's and flags parameter set to 0.
>>
>> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>> are only supported for HVMs with HAP enabled.
>>
>> Note that flags parameter(if not 0) of this HVMOP only indicates
>> which kind of memory accesses are to be forwarded to an ioreq server,
>> it has impact on the access rights of guest ram pages, but are not
>> the same. Due to hardware limitations, if only write operations are
>> to be forwarded, read ones will be performed at full speed, with
>> no hypervisor intervention. But if read ones are to be forwarded to
>> an ioreq server, writes will inevitably be trapped into hypervisor,
>> which means significant performance impact.
>>
>> Also note that this HVMOP_map_mem_type_to_ioreq_server will not
>> change the p2m type of any guest ram page, until HVMOP_set_mem_type
>> is triggered. So normally the steps should be the backend driver
>> first claims its ownership of guest ram pages with p2m_ioreq_server
>> type, and then sets the memory type to p2m_ioreq_server for specified
>> guest ram pages.
>>
>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Tim Deegan <tim@xen.org>
>> ---
>>   xen/arch/x86/hvm/emulate.c       | 125 +++++++++++++++++++++++++++++++++++++--
>>   xen/arch/x86/hvm/hvm.c           |  95 +++++++++++++++++++++++++++--
>>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>>   xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
>>   xen/arch/x86/mm/p2m-pt.c         |  25 +++++---
>>   xen/arch/x86/mm/p2m.c            |  82 +++++++++++++++++++++++++
>>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>>   xen/include/asm-x86/p2m.h        |  36 +++++++++--
>>   xen/include/public/hvm/hvm_op.h  |  37 ++++++++++++
>>   9 files changed, 395 insertions(+), 24 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>> index ddc8007..77a4793 100644
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -94,11 +94,69 @@ static const struct hvm_io_handler null_handler = {
>>       .ops = &null_ops
>>   };
>>
>> +static int mem_read(const struct hvm_io_handler *io_handler,
>> +                    uint64_t addr,
>> +                    uint32_t size,
>> +                    uint64_t *data)
>> +{
>> +    struct domain *currd = current->domain;
>> +    unsigned long gmfn = paddr_to_pfn(addr);
>> +    unsigned long offset = addr & ~PAGE_MASK;
>> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
>> +    uint8_t *p;
>> +
>> +    if ( !page )
>> +        return X86EMUL_UNHANDLEABLE;
>> +
>> +    p = __map_domain_page(page);
>> +    p += offset;
>> +    memcpy(data, p, size);
>
> What happens when offset + size crosses the page boundary?
>

The 'size' is set in hvmemul_linear_mmio_access() to ensure that offset +
size will not cross the page boundary.

>
>> diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
>> index a1eae52..d46f186 100644
>> --- a/xen/include/public/hvm/hvm_op.h
>> +++ b/xen/include/public/hvm/hvm_op.h
>> @@ -489,6 +489,43 @@ struct xen_hvm_altp2m_op {
>>   typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
>>   DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
>>
>> +#if defined(__XEN__) || defined(__XEN_TOOLS__)
>> +
>> +/*
>> + * HVMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
>> + *                                      to specific memory type <type>
>> + *                                      for specific accesses <flags>
>> + *
>> + * Note that if only write operations are to be forwarded to an ioreq server,
>> + * read operations will be performed with no hypervisor intervention. But if
>> + * flags indicates that read operations are to be forwarded to an ioreq server,
>> + * write operations will inevitably be trapped into hypervisor, whether they
>> + * are emulated by hypervisor or forwarded to ioreq server depends on the flags
>> + * setting. This situation means significant performance impact.
>> + */
>> +#define HVMOP_map_mem_type_to_ioreq_server 26
>> +struct xen_hvm_map_mem_type_to_ioreq_server {
>> +    domid_t domid;      /* IN - domain to be serviced */
>> +    ioservid_t id;      /* IN - ioreq server id */
>> +    hvmmem_type_t type; /* IN - memory type */
>
> hvmmem_type_t is an enum and doesn't have a fixed width.  It can't be
> used in the public API.
>
> You also have some implicit padding holes as a result of the layout.
>
Oh, I guess I should use a uint16_t hvmmem_type (and maybe some padding
if necessary).
Thanks for pointing this out, Andrew. :)

> ~Andrew
>
>> +    uint32_t flags;     /* IN - types of accesses to be forwarded to the
>> +                           ioreq server. flags with 0 means to unmap the
>> +                           ioreq server */
>> +#define _HVMOP_IOREQ_MEM_ACCESS_READ 0
>> +#define HVMOP_IOREQ_MEM_ACCESS_READ \
>> +    (1u << _HVMOP_IOREQ_MEM_ACCESS_READ)
>> +
>> +#define _HVMOP_IOREQ_MEM_ACCESS_WRITE 1
>> +#define HVMOP_IOREQ_MEM_ACCESS_WRITE \
>> +    (1u << _HVMOP_IOREQ_MEM_ACCESS_WRITE)
>> +};
>> +typedef struct xen_hvm_map_mem_type_to_ioreq_server
>> +    xen_hvm_map_mem_type_to_ioreq_server_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_map_mem_type_to_ioreq_server_t);
>> +
>> +#endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
>> +
>> +
>>   #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
>>
>>   /*
>
>

B.R.
Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-08 22:28   ` Jan Beulich
@ 2016-04-11 11:14     ` Yu, Zhang
  2016-04-11 16:31       ` Jan Beulich
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-11 11:14 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Tim Deegan, xen-devel, zhiyuan.lv, Jun Nakajima



On 4/9/2016 6:28 AM, Jan Beulich wrote:
>>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>> +static int mem_write(const struct hvm_io_handler *handler,
>> +                     uint64_t addr,
>> +                     uint32_t size,
>> +                     uint64_t data)
>> +{
>> +    struct domain *currd = current->domain;
>> +    unsigned long gmfn = paddr_to_pfn(addr);
>> +    unsigned long offset = addr & ~PAGE_MASK;
>> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
>> +    uint8_t *p;
>> +
>> +    if ( !page )
>> +        return X86EMUL_UNHANDLEABLE;
>> +
>> +    p = __map_domain_page(page);
>> +    p += offset;
>> +    memcpy(p, &data, size);
>
> What if the page is a r/o one? Not having found an ioreq server, I'm
> not sure assumptions on the page being writable can validly be made.
>
>> @@ -168,13 +226,72 @@ static int hvmemul_do_io(
>>           break;
>>       case X86EMUL_UNHANDLEABLE:
>>       {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        struct hvm_ioreq_server *s;
>> +        p2m_type_t p2mt;
>> +
>> +        if ( is_mmio )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> +
>> +            switch ( p2mt )
>> +            {
>> +                case p2m_ioreq_server:
>> +                {
>> +                    unsigned long flags;
>> +
>> +                    p2m_get_ioreq_server(currd, &flags, &s);
>
> As the function apparently returns no value right now, please avoid
> the indirection on both values you're after - one of the two
> (presumably s) can be the function's return value.
>

Well, the current implementation of p2m_get_ioreq_server() has
spin_lock()/spin_unlock() surrounding the reading of the flags and s,
but I believe we can also use s as the return value.

>> +                    if ( !s )
>> +                        break;
>> +
>> +                    if ( (dir == IOREQ_READ &&
>> +                          !(flags & P2M_IOREQ_HANDLE_READ_ACCESS)) ||
>> +                         (dir == IOREQ_WRITE &&
>> +                          !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS)) )
>
> I think this would be easier to read using a conditional expression
> with the condition being dir == IOREQ_<one-of-the-two>, just
> selecting either of the two possible bit masks.
>
>> +                        s = NULL;
>> +
>> +                    break;
>> +                }
>> +                case p2m_ram_rw:
>
> Blank line above here please.
>
>>           /* If there is no suitable backing DM, just ignore accesses */
>>           if ( !s )
>>           {
>> -            rc = hvm_process_io_intercept(&null_handler, &p);
>> +            switch ( p2mt )
>> +            {
>> +            case p2m_ioreq_server:
>> +            /*
>> +             * Race conditions may exist when access to a gfn with
>> +             * p2m_ioreq_server is intercepted by hypervisor, during
>> +             * which time p2m type of this gfn is recalculated back
>> +             * to p2m_ram_rw. mem_handler is used to handle this
>> +             * corner case.
>> +             */
>
> Now if there is such a race condition, the race could also be with a
> page changing first to ram_rw and then immediately further to e.g.
> ram_ro. See the earlier comment about assuming the page to be
> writable.
>

Thanks, Jan. After rechecking the code, I suppose the race condition
will not happen. In hvmemul_do_io(), get_gfn_query_unlocked() is used
to peek the p2mt for the gfn, but get_gfn_type_access() is called inside
hvm_hap_nested_page_fault(), and this will guarantee no p2m change shall
occur during the emulation.
Is this understanding correct?


>> +            case p2m_ram_rw:
>> +                rc = hvm_process_io_intercept(&mem_handler, &p);
>> +                break;
>> +
>> +            default:
>> +                rc = hvm_process_io_intercept(&null_handler, &p);
>
> Along with the above, I doubt it is correct to have e.g. ram_ro come
> here.
>

So if no race condition happens, there is no need to treat p2m_ram_rw
specially.

>> +static int hvm_map_mem_type_to_ioreq_server(struct domain *d,
>> +                                            ioservid_t id,
>> +                                            hvmmem_type_t type,
>> +                                            uint32_t flags)
>> +{
>> +    struct hvm_ioreq_server *s;
>> +    int rc;
>> +
>> +    /* For now, only HVMMEM_ioreq_server is supported */
>> +    if ( type != HVMMEM_ioreq_server )
>> +        return -EINVAL;
>> +
>> +    if ( flags & ~(HVMOP_IOREQ_MEM_ACCESS_READ |
>> +                   HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +        return -EINVAL;
>> +
>> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
>> +
>> +    rc = -ENOENT;
>> +    list_for_each_entry ( s,
>> +                          &d->arch.hvm_domain.ioreq_server.list,
>> +                          list_entry )
>> +    {
>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>> +            continue;
>> +
>> +        if ( s->id == id )
>> +        {
>> +            rc = p2m_set_ioreq_server(d, flags, s);
>> +            if ( rc == 0 )
>> +                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>
> Why gdprintk()? I don't think the current domain is of much
> interest here. What would be of interest is the subject domain.
>

s->id is not the domain id, but the id of the ioreq server.

>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -132,6 +132,19 @@ static void ept_p2m_type_to_flags(struct p2m_domain
>> *p2m, ept_entry_t *entry,
>>               entry->r = entry->w = entry->x = 1;
>>               entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>               break;
>> +        case p2m_ioreq_server:
>> +            entry->r = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_READ_ACCESS);
>> +	    /*
>> +	     * write access right is disabled when entry->r is 0, but whether
>> +	     * write accesses are emulated by hypervisor or forwarded to an
>> +	     * ioreq server depends on the setting of p2m->ioreq.flags.
>> +	     */
>> +            entry->w = (entry->r &&
>> +                        !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS));
>> +            entry->x = entry->r;
>
> Why would we want to allow instruction execution from such pages?
> And with all three bits now possibly being clear, aren't we risking the
> entries to be mis-treated as not-present ones?
>

Hah. You got me. Thanks! :)
Now I realize it would be difficult if we want to emulate the read
operations for an HVM guest. According to the Intel manual, entry->r is
to be cleared, and so should entry->w be if we do not want an EPT
misconfiguration. And with both read and write permissions forbidden,
entry->x can be set only on processors with the EXECUTE_ONLY capability.
To avoid any entry being mis-treated as a not-present one, we have
several solutions:
a> do not support the read emulation for now - we have no such usage
case;
b> add a check of the p2m type against p2m_ioreq_server in
is_epte_present() - a bit weird to me.
Which one do you prefer? Or any other suggestions?

>> --- a/xen/arch/x86/mm/p2m-pt.c
>> +++ b/xen/arch/x86/mm/p2m-pt.c
>> @@ -72,8 +72,8 @@ static const unsigned long pgt[] = {
>>       PGT_l3_page_table
>>   };
>>
>> -static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
>> -                                       unsigned int level)
>> +static unsigned long p2m_type_to_flags(struct p2m_domain *p2m, p2m_type_t t,
>
> const
>
>> +int p2m_set_ioreq_server(struct domain *d,
>> +                         unsigned long flags,
>> +                         struct hvm_ioreq_server *s)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    int rc;
>> +
>> +    spin_lock(&p2m->ioreq.lock);
>> +
>> +    rc = -EBUSY;
>> +    if ( (flags != 0) && (p2m->ioreq.server != NULL) )
>> +        goto out;
>> +
>> +    rc = -EINVAL;
>> +    /* unmap ioreq server from p2m type by passing flags with 0 */
>
> Comment style (also elsewhere).
>
>> +    if ( (flags == 0) && (p2m->ioreq.server != s) )
>> +        goto out;
>
> The two flags checks above are redundant with ...
>
>> +    if ( flags == 0 )
>> +    {
>> +        p2m->ioreq.server = NULL;
>> +        p2m->ioreq.flags = 0;
>> +    }
>> +    else
>> +    {
>> +        p2m->ioreq.server = s;
>> +        p2m->ioreq.flags = flags;
>> +    }
>
> ... this - I think the earlier ones should be folded into this.
>

Ok. I'll have a try. :)

>> +    /*
>> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
>> +     * we mark the p2m table to be recalculated, so that gfns which were
>> +     * previously marked with p2m_ioreq_server can be resynced.
>> +     */
>> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>
> What does "resynced" here mean? I.e. I can see why this is wanted
> when unmapping a server, but when mapping a server there shouldn't
> be any such pages in the first place.
>

There shouldn't be. But if there are (due to misbehavior on the device
model side), they can be recalculated back to p2m_ram_rw (which is not
as necessary as in the unmapping case).

>> +    rc = 0;
>> +
>> +out:
>
> Labels indented by at least one space please.
>
>> @@ -320,6 +321,27 @@ struct p2m_domain {
>>           struct ept_data ept;
>>           /* NPT-equivalent structure could be added here. */
>>       };
>> +
>> +    struct {
>> +        spinlock_t lock;
>> +        /*
>> +         * ioreq server who's responsible for the emulation of
>> +         * gfns with specific p2m type(for now, p2m_ioreq_server).
>> +         * Behaviors of gfns with p2m_ioreq_server set but no
>> +         * ioreq server mapped in advance should be the same as
>> +         * p2m_ram_rw.
>> +         */
>> +        struct hvm_ioreq_server *server;
>> +        /*
>> +         * flags specifies whether read, write or both operations
>> +         * are to be emulated by an ioreq server.
>> +         */
>> +        unsigned long flags;
>
> unsigned int
>
>> --- a/xen/include/public/hvm/hvm_op.h
>> +++ b/xen/include/public/hvm/hvm_op.h
>> @@ -489,6 +489,43 @@ struct xen_hvm_altp2m_op {
>>   typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
>>   DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
>>
>> +#if defined(__XEN__) || defined(__XEN_TOOLS__)
>
> Instead of adding yet another such section, couldn't this be added
> to an already existing one?
>

Yes.

>> +struct xen_hvm_map_mem_type_to_ioreq_server {
>> +    domid_t domid;      /* IN - domain to be serviced */
>> +    ioservid_t id;      /* IN - ioreq server id */
>> +    hvmmem_type_t type; /* IN - memory type */
>
> You can't use this type for public interface structure fields - this
> must be uintXX_t.
>

Got it.


Thanks
Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-08 11:01           ` George Dunlap
@ 2016-04-11 11:15             ` Yu, Zhang
  2016-04-14 10:45               ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-11 11:15 UTC (permalink / raw)
  To: George Dunlap, George Dunlap, xen-devel



On 4/8/2016 7:01 PM, George Dunlap wrote:
> On 08/04/16 11:10, Yu, Zhang wrote:
> [snip]
>> BTW, I noticed your reply has not be CCed to mailing list, and I also
>> wonder if we should raise this last question in community?
>
> Oops -- that was a mistake on my part.  :-)  I appreciate the
> discretion; just so you know in the future, if I'm purposely changing
> the CC list (removing xen-devel and/or adding extra people), I'll almost
> always say so at the top of the mail.
>
>>> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
>>> transition -- I assume that live migration is incompatible with this
>>> functionality?  Is there anything that prevents a live migration from
>>> being started when there are outstanding p2m_ioreq_server entries?
>>>
>>
>> Another good question, and the answer is unfortunately yes. :-)
>>
>> If live migration happens during the normal emulation process, entries
>> marked with p2m_ioreq_server will be changed to p2m_log_dirty in
>> resolve_misconfig(), and later write operations will change them to
>> p2m_ram_rw, thereafter these pages can not be forwarded to device model.
>>  From this point of view, this functionality is incompatible with live
>> migration.
>>
>> But for XenGT, I think this is acceptable, because, if live migration
>> is to be supported in the future, intervention from backend device
>> model will be necessary. At that time, we can guarantee from the device
>> model side that there's no outdated p2m_ioreq_server entries, hence no
>> need to reset the p2m type back to p2m_ram_rw(and do not include
>> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I mean
>> when an ioreq server is detached from p2m_ioreq_server, or before an
>> ioreq server is attached to this type, entries marked with
>> p2m_ioreq_server should be regarded as outdated.
>>
>> Is this acceptible to you? Any suggestions?
>
> So the question is, as of this series, what happens if someone tries to
> initiate a live migration while there are outstanding p2m_ioreq_server
> entries?
>
> If the answer is "the ioreq server suddenly loses all control of the
> memory", that's something that needs to be changed.
>

Sorry, for this patch series, I'm afraid the above description is the
answer.

Besides, I find it hard to change the current code to support both the
deferred resetting of p2m_ioreq_server and live migration at the same
time. One reason is that a page with p2m_ioreq_server behaves
differently in different situations.

My assumption for XenGT is that, for live migration to work, the device
model should guarantee there are no outstanding p2m_ioreq_server pages
in the hypervisor (so there is no need for the deferred recalculation),
and it is the device model that should be responsible for copying the
write-protected guest pages later.

And another solution I can think of: when unmapping the ioreq server,
we walk the p2m table and reset entries with p2m_ioreq_server back
directly, instead of deferring the reset. This of course has a
performance impact, but since mapping and unmapping an ioreq server is
not a frequent operation, the penalty may be acceptable.
What do you think of this approach?
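The walk-and-reset alternative described above can be modeled in a few
lines. This is a toy sketch only: the p2m is represented as a flat
array of types, and the function name and layout are illustrative, not
Xen's real p2m interface.

```c
#include <stddef.h>

/* Stand-ins for the relevant members of Xen's p2m_type_t. */
typedef enum {
    p2m_ram_rw,
    p2m_ram_logdirty,
    p2m_ioreq_server,
} p2m_type_t;

/*
 * On unmap, walk every entry and reset p2m_ioreq_server back to
 * p2m_ram_rw immediately, instead of deferring the reset via the
 * type-recalculation machinery.  Returns the number of entries reset.
 */
static size_t p2m_reset_ioreq_server(p2m_type_t *p2m, size_t nr_entries)
{
    size_t i, reset = 0;

    for ( i = 0; i < nr_entries; i++ )
        if ( p2m[i] == p2m_ioreq_server )
        {
            p2m[i] = p2m_ram_rw;
            reset++;
        }

    return reset;
}
```

The cost is a full p2m walk at unmap time, which is the performance
trade-off mentioned above.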

> If the answer is, "everything just works", that's perfect.
>
> If the answer is, "Before logdirty mode is set, the ioreq server has the
> opportunity to detach, removing the p2m_ioreq_server entries, and
> operating without that functionality", that's good too.
>
> If the answer is, "the live migration request fails and the guest
> continues to run", that's also acceptable.  If you want this series to
> be checked in today (the last day for 4.7), this is probably your best bet.
>
>   -George
>
>
>

B.R.
Yu



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-11 11:14     ` Yu, Zhang
@ 2016-04-11 12:20       ` Andrew Cooper
  2016-04-11 16:25         ` Jan Beulich
  0 siblings, 1 reply; 82+ messages in thread
From: Andrew Cooper @ 2016-04-11 12:20 UTC (permalink / raw)
  To: Yu, Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, George Dunlap, Tim Deegan,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

On 11/04/16 12:14, Yu, Zhang wrote:
>
>
> On 4/8/2016 9:33 PM, Andrew Cooper wrote:
>> On 31/03/16 11:53, Yu Zhang wrote:
>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>> let one ioreq server claim/disclaim its responsibility for the
>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>> of this HVMOP can specify whether the p2m_ioreq_server is supposed
>>> to handle write accesses or read ones or both in a parameter named
>>> flags. For now, we only support one ioreq server for this p2m type,
>>> so once an ioreq server has claimed its ownership, subsequent calls
>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>> disclaim the ownership of guest ram pages with this p2m type, by
>>> triggering this new HVMOP, with ioreq server id set to the current
>>> owner's and flags parameter set to 0.
>>>
>>> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>> are only supported for HVMs with HAP enabled.
>>>
>>> Note that flags parameter(if not 0) of this HVMOP only indicates
>>> which kind of memory accesses are to be forwarded to an ioreq server,
>>> it has impact on the access rights of guest ram pages, but are not
>>> the same. Due to hardware limitations, if only write operations are
>>> to be forwarded, read ones will be performed at full speed, with
>>> no hypervisor intervention. But if read ones are to be forwarded to
>>> an ioreq server, writes will inevitably be trapped into hypervisor,
>>> which means significant performance impact.
>>>
>>> Also note that this HVMOP_map_mem_type_to_ioreq_server will not
>>> change the p2m type of any guest ram page, until HVMOP_set_mem_type
>>> is triggered. So normally the steps should be the backend driver
>>> first claims its ownership of guest ram pages with p2m_ioreq_server
>>> type, and then sets the memory type to p2m_ioreq_server for specified
>>> guest ram pages.
>>>
>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>> Cc: Keir Fraser <keir@xen.org>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>> Cc: Tim Deegan <tim@xen.org>
>>> ---
>>>   xen/arch/x86/hvm/emulate.c       | 125
>>> +++++++++++++++++++++++++++++++++++++--
>>>   xen/arch/x86/hvm/hvm.c           |  95 +++++++++++++++++++++++++++--
>>>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>>>   xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
>>>   xen/arch/x86/mm/p2m-pt.c         |  25 +++++---
>>>   xen/arch/x86/mm/p2m.c            |  82 +++++++++++++++++++++++++
>>>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>>>   xen/include/asm-x86/p2m.h        |  36 +++++++++--
>>>   xen/include/public/hvm/hvm_op.h  |  37 ++++++++++++
>>>   9 files changed, 395 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>>> index ddc8007..77a4793 100644
>>> --- a/xen/arch/x86/hvm/emulate.c
>>> +++ b/xen/arch/x86/hvm/emulate.c
>>> @@ -94,11 +94,69 @@ static const struct hvm_io_handler null_handler = {
>>>       .ops = &null_ops
>>>   };
>>>
>>> +static int mem_read(const struct hvm_io_handler *io_handler,
>>> +                    uint64_t addr,
>>> +                    uint32_t size,
>>> +                    uint64_t *data)
>>> +{
>>> +    struct domain *currd = current->domain;
>>> +    unsigned long gmfn = paddr_to_pfn(addr);
>>> +    unsigned long offset = addr & ~PAGE_MASK;
>>> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL,
>>> P2M_UNSHARE);
>>> +    uint8_t *p;
>>> +
>>> +    if ( !page )
>>> +        return X86EMUL_UNHANDLEABLE;
>>> +
>>> +    p = __map_domain_page(page);
>>> +    p += offset;
>>> +    memcpy(data, p, size);
>>
>> What happens when offset + size crosses the page boundary?
>>
>
> The 'size' is set in hvmemul_linear_mmio_access(), to insure offset +
> size will not cross the page boundary.

Ok, in which case please

ASSERT(offset + size < PAGE_SIZE)

So it is absolutely clear in this function that the caller is required
to honour this restriction.

~Andrew
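The sizing contract being discussed can be sketched as follows. The
helper name is invented for illustration; the arithmetic mirrors what
hvmemul_linear_mmio_access() is said to do, and the bound uses `<=` per
Jan's follow-up later in the thread, since an access may end exactly at
the page boundary.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))

/*
 * Clamp a chunk of an MMIO access so that it never crosses a page
 * boundary, making the handler's single-page assumption hold.
 */
static uint32_t chunk_size(uint64_t addr, uint32_t bytes)
{
    uint32_t off = addr & ~PAGE_MASK;   /* offset within the page */
    uint32_t chunk = PAGE_SIZE - off;   /* room left in this page */

    if ( bytes < chunk )
        chunk = bytes;

    /* The restriction the handler relies on. */
    assert(off + chunk <= PAGE_SIZE);
    return chunk;
}
```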



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-11 12:20       ` Andrew Cooper
@ 2016-04-11 16:25         ` Jan Beulich
  0 siblings, 0 replies; 82+ messages in thread
From: Jan Beulich @ 2016-04-11 16:25 UTC (permalink / raw)
  To: Andrew Cooper, Zhang Yu, xen-devel
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Tim Deegan, Paul Durrant,
	zhiyuan.lv, JunNakajima

>>> On 11.04.16 at 14:20, <andrew.cooper3@citrix.com> wrote:
> On 11/04/16 12:14, Yu, Zhang wrote:
>>
>>
>> On 4/8/2016 9:33 PM, Andrew Cooper wrote:
>>> On 31/03/16 11:53, Yu Zhang wrote:
>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>> let one ioreq server claim/disclaim its responsibility for the
>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>> of this HVMOP can specify whether the p2m_ioreq_server is supposed
>>>> to handle write accesses or read ones or both in a parameter named
>>>> flags. For now, we only support one ioreq server for this p2m type,
>>>> so once an ioreq server has claimed its ownership, subsequent calls
>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>> disclaim the ownership of guest ram pages with this p2m type, by
>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>> owner's and flags parameter set to 0.
>>>>
>>>> For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>> are only supported for HVMs with HAP enabled.
>>>>
>>>> Note that flags parameter(if not 0) of this HVMOP only indicates
>>>> which kind of memory accesses are to be forwarded to an ioreq server,
>>>> it has impact on the access rights of guest ram pages, but are not
>>>> the same. Due to hardware limitations, if only write operations are
>>>> to be forwarded, read ones will be performed at full speed, with
>>>> no hypervisor intervention. But if read ones are to be forwarded to
>>>> an ioreq server, writes will inevitably be trapped into hypervisor,
>>>> which means significant performance impact.
>>>>
>>>> Also note that this HVMOP_map_mem_type_to_ioreq_server will not
>>>> change the p2m type of any guest ram page, until HVMOP_set_mem_type
>>>> is triggered. So normally the steps should be the backend driver
>>>> first claims its ownership of guest ram pages with p2m_ioreq_server
>>>> type, and then sets the memory type to p2m_ioreq_server for specified
>>>> guest ram pages.
>>>>
>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>> Cc: Keir Fraser <keir@xen.org>
>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>> Cc: Tim Deegan <tim@xen.org>
>>>> ---
>>>>   xen/arch/x86/hvm/emulate.c       | 125
>>>> +++++++++++++++++++++++++++++++++++++--
>>>>   xen/arch/x86/hvm/hvm.c           |  95 +++++++++++++++++++++++++++--
>>>>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>>>>   xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
>>>>   xen/arch/x86/mm/p2m-pt.c         |  25 +++++---
>>>>   xen/arch/x86/mm/p2m.c            |  82 +++++++++++++++++++++++++
>>>>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>>>>   xen/include/asm-x86/p2m.h        |  36 +++++++++--
>>>>   xen/include/public/hvm/hvm_op.h  |  37 ++++++++++++
>>>>   9 files changed, 395 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>>>> index ddc8007..77a4793 100644
>>>> --- a/xen/arch/x86/hvm/emulate.c
>>>> +++ b/xen/arch/x86/hvm/emulate.c
>>>> @@ -94,11 +94,69 @@ static const struct hvm_io_handler null_handler = {
>>>>       .ops = &null_ops
>>>>   };
>>>>
>>>> +static int mem_read(const struct hvm_io_handler *io_handler,
>>>> +                    uint64_t addr,
>>>> +                    uint32_t size,
>>>> +                    uint64_t *data)
>>>> +{
>>>> +    struct domain *currd = current->domain;
>>>> +    unsigned long gmfn = paddr_to_pfn(addr);
>>>> +    unsigned long offset = addr & ~PAGE_MASK;
>>>> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL,
>>>> P2M_UNSHARE);
>>>> +    uint8_t *p;
>>>> +
>>>> +    if ( !page )
>>>> +        return X86EMUL_UNHANDLEABLE;
>>>> +
>>>> +    p = __map_domain_page(page);
>>>> +    p += offset;
>>>> +    memcpy(data, p, size);
>>>
>>> What happens when offset + size crosses the page boundary?
>>>
>>
>> The 'size' is set in hvmemul_linear_mmio_access(), to insure offset +
>> size will not cross the page boundary.
> 
> Ok, in which case please
> 
> ASSERT(offset + size < PAGE_SIZE)

<=

Jan



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-11 11:14     ` Yu, Zhang
@ 2016-04-11 16:31       ` Jan Beulich
  2016-04-12  9:37         ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-11 16:31 UTC (permalink / raw)
  To: Paul Durrant, Zhang Yu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Tim Deegan, xen-devel, zhiyuan.lv, Jun Nakajima

>>> On 11.04.16 at 13:14, <yu.c.zhang@linux.intel.com> wrote:
> On 4/9/2016 6:28 AM, Jan Beulich wrote:
>>>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>>> @@ -168,13 +226,72 @@ static int hvmemul_do_io(
>>>           break;
>>>       case X86EMUL_UNHANDLEABLE:
>>>       {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        struct hvm_ioreq_server *s;
>>> +        p2m_type_t p2mt;
>>> +
>>> +        if ( is_mmio )
>>> +        {
>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>> +
>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>> +
>>> +            switch ( p2mt )
>>> +            {
>>> +                case p2m_ioreq_server:
>>> +                {
>>> +                    unsigned long flags;
>>> +
>>> +                    p2m_get_ioreq_server(currd, &flags, &s);
>>
>> As the function apparently returns no value right now, please avoid
>> the indirection on both values you're after - one of the two
>> (presumably s) can be the function's return value.
> 
> Well, current implementation of p2m_get_ioreq_server() has spin_lock/
> spin_unlock surrounding the reading of flags and the s, but I believe
> we can also use the s as return value.

The use of a lock inside the function has nothing to do with how it
returns values to the caller.

>>>           /* If there is no suitable backing DM, just ignore accesses */
>>>           if ( !s )
>>>           {
>>> -            rc = hvm_process_io_intercept(&null_handler, &p);
>>> +            switch ( p2mt )
>>> +            {
>>> +            case p2m_ioreq_server:
>>> +            /*
>>> +             * Race conditions may exist when access to a gfn with
>>> +             * p2m_ioreq_server is intercepted by hypervisor, during
>>> +             * which time p2m type of this gfn is recalculated back
>>> +             * to p2m_ram_rw. mem_handler is used to handle this
>>> +             * corner case.
>>> +             */
>>
>> Now if there is such a race condition, the race could also be with a
>> page changing first to ram_rw and then immediately further to e.g.
>> ram_ro. See the earlier comment about assuming the page to be
>> writable.
>>
> 
> Thanks, Jan. After rechecking the code, I suppose the race condition
> will not happen. In hvmemul_do_io(), get_gfn_query_unlocked() is used
> to peek the p2mt for the gfn, but get_gfn_type_access() is called inside
> hvm_hap_nested_page_fault(), and this will guarantee no p2m change shall
> occur during the emulation.
> Is this understanding correct?

Ah, yes, I think so. So the comment is misleading.

>>> +static int hvm_map_mem_type_to_ioreq_server(struct domain *d,
>>> +                                            ioservid_t id,
>>> +                                            hvmmem_type_t type,
>>> +                                            uint32_t flags)
>>> +{
>>> +    struct hvm_ioreq_server *s;
>>> +    int rc;
>>> +
>>> +    /* For now, only HVMMEM_ioreq_server is supported */
>>> +    if ( type != HVMMEM_ioreq_server )
>>> +        return -EINVAL;
>>> +
>>> +    if ( flags & ~(HVMOP_IOREQ_MEM_ACCESS_READ |
>>> +                   HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>> +        return -EINVAL;
>>> +
>>> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
>>> +
>>> +    rc = -ENOENT;
>>> +    list_for_each_entry ( s,
>>> +                          &d->arch.hvm_domain.ioreq_server.list,
>>> +                          list_entry )
>>> +    {
>>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>>> +            continue;
>>> +
>>> +        if ( s->id == id )
>>> +        {
>>> +            rc = p2m_set_ioreq_server(d, flags, s);
>>> +            if ( rc == 0 )
>>> +                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>>
>> Why gdprintk()? I don't think the current domain is of much
>> interest here. What would be of interest is the subject domain.
>>
> 
> s->id is not the domain_id, but id of the ioreq server.

That's understood. But gdprintk() itself logs the current domain,
which isn't as useful as the subject one.

>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -132,6 +132,19 @@ static void ept_p2m_type_to_flags(struct p2m_domain
>>> *p2m, ept_entry_t *entry,
>>>               entry->r = entry->w = entry->x = 1;
>>>               entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>>               break;
>>> +        case p2m_ioreq_server:
>>> +            entry->r = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_READ_ACCESS);
>>> +	    /*
>>> +	     * write access right is disabled when entry->r is 0, but whether
>>> +	     * write accesses are emulated by hypervisor or forwarded to an
>>> +	     * ioreq server depends on the setting of p2m->ioreq.flags.
>>> +	     */
>>> +            entry->w = (entry->r &&
>>> +                        !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS));
>>> +            entry->x = entry->r;
>>
>> Why would we want to allow instruction execution from such pages?
>> And with all three bits now possibly being clear, aren't we risking the
>> entries to be mis-treated as not-present ones?
>>
> 
> Hah. You got me. Thanks! :)
> Now I realized it would be difficult if we wanna to emulate the read
> operations for HVM. According to Intel mannual, entry->r is to be
> cleared, so should entry->w if we do not want ept misconfig. And
> with both read and write permissions being forbidden, entry->x can be
> set only on processors with EXECUTE_ONLY capability.
> To avoid any entry to be mis-treated as not-present. We have several
> solutions:
> a> do not support the read emulation for now - we have no such usage
> case;
> b> add the check of p2m_t against p2m_ioreq_server in is_epte_present -
> a bit weird to me.
> Which one do you prefer? or any other suggestions?

That question would also need to be asked to others who had
suggested supporting both. I'd be fine with a, but I also don't view
b as too awkward.
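The permission logic under discussion can be distilled into a
standalone model of the p2m_ioreq_server case quoted from the patch.
The flag names are taken from the patch; the struct is illustrative.
It shows the problem: forwarding reads clears r, and w and x with it,
producing the all-zero encoding normally read as "not present".

```c
#include <stdbool.h>

/* Flag values mirror the names in the patch; actual values assumed. */
#define P2M_IOREQ_HANDLE_WRITE_ACCESS 0x1
#define P2M_IOREQ_HANDLE_READ_ACCESS  0x2

struct ept_flags { bool r, w, x; };

/* Model of the p2m_ioreq_server case in ept_p2m_type_to_flags(). */
static struct ept_flags ioreq_server_flags(unsigned int ioreq_flags)
{
    struct ept_flags f;

    /* Forwarding reads means the entry must trap: r = 0. */
    f.r = !(ioreq_flags & P2M_IOREQ_HANDLE_READ_ACCESS);
    /* w may only be set when r is (no write-only EPT encoding). */
    f.w = f.r && !(ioreq_flags & P2M_IOREQ_HANDLE_WRITE_ACCESS);
    /* Execute tracks read, absent execute-only support. */
    f.x = f.r;

    return f;
}
```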

>>> +    /*
>>> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
>>> +     * we mark the p2m table to be recalculated, so that gfns which were
>>> +     * previously marked with p2m_ioreq_server can be resynced.
>>> +     */
>>> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>
>> What does "resynced" here mean? I.e. I can see why this is wanted
>> when unmapping a server, but when mapping a server there shouldn't
>> be any such pages in the first place.
>>
> 
> There shouldn't be. But if there is(misbehavior from the device model
> side), it can be recalculated back to p2m_ram_rw(which is not quite
> necessary as the unmapping case).

DM misbehavior should not result in such a problem - the hypervisor
should refuse any bad requests.

Jan



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-11 16:31       ` Jan Beulich
@ 2016-04-12  9:37         ` Yu, Zhang
  2016-04-12 15:08           ` Jan Beulich
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-12  9:37 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Tim Deegan, xen-devel, zhiyuan.lv, Jun Nakajima



On 4/12/2016 12:31 AM, Jan Beulich wrote:
>>>> On 11.04.16 at 13:14, <yu.c.zhang@linux.intel.com> wrote:
>> On 4/9/2016 6:28 AM, Jan Beulich wrote:
>>>>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>>>> @@ -168,13 +226,72 @@ static int hvmemul_do_io(
>>>>            break;
>>>>        case X86EMUL_UNHANDLEABLE:
>>>>        {
>>>> -        struct hvm_ioreq_server *s =
>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>> +        struct hvm_ioreq_server *s;
>>>> +        p2m_type_t p2mt;
>>>> +
>>>> +        if ( is_mmio )
>>>> +        {
>>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>>> +
>>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>>> +
>>>> +            switch ( p2mt )
>>>> +            {
>>>> +                case p2m_ioreq_server:
>>>> +                {
>>>> +                    unsigned long flags;
>>>> +
>>>> +                    p2m_get_ioreq_server(currd, &flags, &s);
>>>
>>> As the function apparently returns no value right now, please avoid
>>> the indirection on both values you're after - one of the two
>>> (presumably s) can be the function's return value.
>>
>> Well, current implementation of p2m_get_ioreq_server() has spin_lock/
>> spin_unlock surrounding the reading of flags and the s, but I believe
>> we can also use the s as return value.
>
> The use of a lock inside the function has nothing to do with how it
> returns values to the caller.
>

Agree. I'll use s as return value then.

>>>>            /* If there is no suitable backing DM, just ignore accesses */
>>>>            if ( !s )
>>>>            {
>>>> -            rc = hvm_process_io_intercept(&null_handler, &p);
>>>> +            switch ( p2mt )
>>>> +            {
>>>> +            case p2m_ioreq_server:
>>>> +            /*
>>>> +             * Race conditions may exist when access to a gfn with
>>>> +             * p2m_ioreq_server is intercepted by hypervisor, during
>>>> +             * which time p2m type of this gfn is recalculated back
>>>> +             * to p2m_ram_rw. mem_handler is used to handle this
>>>> +             * corner case.
>>>> +             */
>>>
>>> Now if there is such a race condition, the race could also be with a
>>> page changing first to ram_rw and then immediately further to e.g.
>>> ram_ro. See the earlier comment about assuming the page to be
>>> writable.
>>>
>>
>> Thanks, Jan. After rechecking the code, I suppose the race condition
>> will not happen. In hvmemul_do_io(), get_gfn_query_unlocked() is used
>> to peek the p2mt for the gfn, but get_gfn_type_access() is called inside
>> hvm_hap_nested_page_fault(), and this will guarantee no p2m change shall
>> occur during the emulation.
>> Is this understanding correct?
>
> Ah, yes, I think so. So the comment is misleading.
>

I'll remove the comment, together with the p2m_ram_rw case. Thanks. :)

>>>> +static int hvm_map_mem_type_to_ioreq_server(struct domain *d,
>>>> +                                            ioservid_t id,
>>>> +                                            hvmmem_type_t type,
>>>> +                                            uint32_t flags)
>>>> +{
>>>> +    struct hvm_ioreq_server *s;
>>>> +    int rc;
>>>> +
>>>> +    /* For now, only HVMMEM_ioreq_server is supported */
>>>> +    if ( type != HVMMEM_ioreq_server )
>>>> +        return -EINVAL;
>>>> +
>>>> +    if ( flags & ~(HVMOP_IOREQ_MEM_ACCESS_READ |
>>>> +                   HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>>> +        return -EINVAL;
>>>> +
>>>> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
>>>> +
>>>> +    rc = -ENOENT;
>>>> +    list_for_each_entry ( s,
>>>> +                          &d->arch.hvm_domain.ioreq_server.list,
>>>> +                          list_entry )
>>>> +    {
>>>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>>>> +            continue;
>>>> +
>>>> +        if ( s->id == id )
>>>> +        {
>>>> +            rc = p2m_set_ioreq_server(d, flags, s);
>>>> +            if ( rc == 0 )
>>>> +                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>>>
>>> Why gdprintk()? I don't think the current domain is of much
>>> interest here. What would be of interest is the subject domain.
>>>
>>
>> s->id is not the domain_id, but id of the ioreq server.
>
> That's understood. But gdprintk() itself logs the current domain,
> which isn't as useful as the subject one.
>

Oh, I see. So the correct routine here should be dprintk(), right?

>>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>>> @@ -132,6 +132,19 @@ static void ept_p2m_type_to_flags(struct p2m_domain
>>>> *p2m, ept_entry_t *entry,
>>>>                entry->r = entry->w = entry->x = 1;
>>>>                entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>>>                break;
>>>> +        case p2m_ioreq_server:
>>>> +            entry->r = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_READ_ACCESS);
>>>> +	    /*
>>>> +	     * write access right is disabled when entry->r is 0, but whether
>>>> +	     * write accesses are emulated by hypervisor or forwarded to an
>>>> +	     * ioreq server depends on the setting of p2m->ioreq.flags.
>>>> +	     */
>>>> +            entry->w = (entry->r &&
>>>> +                        !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS));
>>>> +            entry->x = entry->r;
>>>
>>> Why would we want to allow instruction execution from such pages?
>>> And with all three bits now possibly being clear, aren't we risking the
>>> entries to be mis-treated as not-present ones?
>>>
>>
>> Hah. You got me. Thanks! :)
>> Now I realized it would be difficult if we wanna to emulate the read
>> operations for HVM. According to Intel mannual, entry->r is to be
>> cleared, so should entry->w if we do not want ept misconfig. And
>> with both read and write permissions being forbidden, entry->x can be
>> set only on processors with EXECUTE_ONLY capability.
>> To avoid any entry to be mis-treated as not-present. We have several
>> solutions:
>> a> do not support the read emulation for now - we have no such usage
>> case;
>> b> add the check of p2m_t against p2m_ioreq_server in is_epte_present -
>> a bit weird to me.
>> Which one do you prefer? or any other suggestions?
>
> That question would also need to be asked to others who had
> suggested supporting both. I'd be fine with a, but I also don't view
> b as too awkward.
>

According to the Intel manual, an entry is regarded as not present if
bits 0:2 are all 0. So adding a p2m type check in is_epte_present()
means we would change its semantics, if that is acceptable (with no
harm to the hypervisor). I'd prefer option b>.

Does anyone else have suggestions other than b> ?
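Option b> can be sketched as below. The entry layout here is purely
illustrative (Xen keeps the p2m type in ignored bits of the real
ept_entry_t); the point is only that the presence test gains a type
check so an all-permissions-clear p2m_ioreq_server entry still counts
as present.

```c
#include <stdbool.h>
#include <stdint.h>

#define EPT_TYPE_IOREQ_SERVER 1 /* stand-in for p2m_ioreq_server */

/* Toy EPT entry: bits 0:2 carry R/W/X, sa_p2mt the stored p2m type. */
typedef struct {
    uint64_t rwx;         /* bits 0:2 of the entry */
    unsigned int sa_p2mt; /* stored p2m type */
} ept_entry_t;

/*
 * Amended presence test: an entry with all of R/W/X clear is still
 * present if its stored type is p2m_ioreq_server.
 */
static bool is_epte_present(const ept_entry_t *e)
{
    return (e->rwx & 0x7) != 0 || e->sa_p2mt == EPT_TYPE_IOREQ_SERVER;
}
```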


>>>> +    /*
>>>> +     * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
>>>> +     * we mark the p2m table to be recalculated, so that gfns which were
>>>> +     * previously marked with p2m_ioreq_server can be resynced.
>>>> +     */
>>>> +    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>>>
>>> What does "resynced" here mean? I.e. I can see why this is wanted
>>> when unmapping a server, but when mapping a server there shouldn't
>>> be any such pages in the first place.
>>>
>>
>> There shouldn't be. But if there is(misbehavior from the device model
>> side), it can be recalculated back to p2m_ram_rw(which is not quite
>> necessary as the unmapping case).
>
> DM misbehavior should not result in such a problem - the hypervisor
> should refuse any bad requests.
>
OK. I can add code to guarantee, from the hypervisor side, that no
entries are changed to p2m_ioreq_server before the mapping happens.

B.R.
Yu



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-12  9:37         ` Yu, Zhang
@ 2016-04-12 15:08           ` Jan Beulich
  2016-04-14  9:56             ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-12 15:08 UTC (permalink / raw)
  To: paul.durrant, yu.c.zhang
  Cc: kevin.tian, keir, george.dunlap, andrew.cooper3, tim, xen-devel,
	zhiyuan.lv, jun.nakajima

>>> "Yu, Zhang" <yu.c.zhang@linux.intel.com> 04/12/16 11:47 AM >>>
>On 4/12/2016 12:31 AM, Jan Beulich wrote:
>>>>> On 11.04.16 at 13:14, <yu.c.zhang@linux.intel.com> wrote:
>>> On 4/9/2016 6:28 AM, Jan Beulich wrote:
>>>>>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>>>>> +        if ( s->id == id )
>>>>> +        {
>>>>> +            rc = p2m_set_ioreq_server(d, flags, s);
>>>>> +            if ( rc == 0 )
>>>>> +                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>>>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>>>>
>>>> Why gdprintk()? I don't think the current domain is of much
>>>> interest here. What would be of interest is the subject domain.
>>>>
>>>
>>> s->id is not the domain_id, but id of the ioreq server.
>>
>> That's understood. But gdprintk() itself logs the current domain,
>> which isn't as useful as the subject one.
>
>Oh, I see. So the correct routine here should be dprintk(), right?

Yes.

>>>> And with all three bits now possibly being clear, aren't we risking the
>>>> entries to be mis-treated as not-present ones?
>>>
>>> Hah. You got me. Thanks! :)
>>> Now I realized it would be difficult if we wanna to emulate the read
>>> operations for HVM. According to Intel mannual, entry->r is to be
>>> cleared, so should entry->w if we do not want ept misconfig. And
>>> with both read and write permissions being forbidden, entry->x can be
>>> set only on processors with EXECUTE_ONLY capability.
>>> To avoid any entry to be mis-treated as not-present. We have several
>>> solutions:
>>> a> do not support the read emulation for now - we have no such usage
>>> case;
>>> b> add the check of p2m_t against p2m_ioreq_server in is_epte_present -
>>> a bit weird to me.
>>> Which one do you prefer? or any other suggestions?
>>
>> That question would also need to be asked to others who had
>> suggested supporting both. I'd be fine with a, but I also don't view
>> b as too awkward.
>
>According to Intel mannual, an entry is regarded as not present, if
>bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
>will change its semantics, if this is acceptable(with no hurt to
>hypervisor). I'd prefer option b>

Perhaps time for the VMX maintainers to chime in - such a change is acceptable
only if it doesn't result in changed hardware behavior. I can't think of any such off
the top of my head, but this really should be confirmed by the maintainers before
deciding to go such a route.

Jan
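[Editorial note: the semantic difference between option a> and option b> above can be sketched with a simplified model of an EPT entry. The field layout, type values, and function names below are illustrative assumptions, not Xen's actual ept_entry_t or is_epte_present() definitions.]

```c
#include <assert.h>

/* Simplified, illustrative model of an EPT page-table entry: bits 0-2
 * are the R/W/X access bits, and a software-available field records the
 * p2m type of the mapping (Xen keeps the type in ignored bits of the
 * real ept_entry_t; the layout here is an assumption for illustration). */
typedef struct {
    unsigned int r:1, w:1, x:1;
    unsigned int sa_p2mt:6;       /* software-available: p2m type */
} epte_t;

enum { p2m_ram_rw = 0, p2m_ioreq_server = 1 };  /* illustrative values */

/* Option a> keeps the hardware notion of "present": any of R/W/X set. */
static int is_epte_present_a(const epte_t *e)
{
    return e->r | e->w | e->x;
}

/* Option b> additionally treats p2m_ioreq_server entries as present,
 * since read+write emulation forces R and W clear, and X may only be
 * set on processors with the EXECUTE_ONLY capability. */
static int is_epte_present_b(const epte_t *e)
{
    return (e->r | e->w | e->x) || e->sa_p2mt == p2m_ioreq_server;
}
```

With option a>, an entry whose three access bits are all clear is indistinguishable from a not-present one; option b> recovers it via the stored p2m type, at the cost of changing the helper's semantics.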


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-12 15:08           ` Jan Beulich
@ 2016-04-14  9:56             ` Yu, Zhang
  2016-04-19  4:50               ` Tian, Kevin
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-14  9:56 UTC (permalink / raw)
  To: Jan Beulich, paul.durrant, jun.nakajima, kevin.tian
  Cc: keir, george.dunlap, andrew.cooper3, tim, xen-devel, zhiyuan.lv



On 4/12/2016 11:08 PM, Jan Beulich wrote:
>>>> "Yu, Zhang" <yu.c.zhang@linux.intel.com> 04/12/16 11:47 AM >>>
>> On 4/12/2016 12:31 AM, Jan Beulich wrote:
>>>>>> On 11.04.16 at 13:14, <yu.c.zhang@linux.intel.com> wrote:
>>>> On 4/9/2016 6:28 AM, Jan Beulich wrote:
>>>>>>>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>>>>>> +        if ( s->id == id )
>>>>>> +        {
>>>>>> +            rc = p2m_set_ioreq_server(d, flags, s);
>>>>>> +            if ( rc == 0 )
>>>>>> +                gdprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>>>>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>>>>>
>>>>> Why gdprintk()? I don't think the current domain is of much
>>>>> interest here. What would be of interest is the subject domain.
>>>>>
>>>>
>>>> s->id is not the domain_id, but id of the ioreq server.
>>>
>>> That's understood. But gdprintk() itself logs the current domain,
>>> which isn't as useful as the subject one.
>>
>> Oh, I see. So the correct routine here should be dprintk(), right?
>
> Yes.
>
>>>>> And with all three bits now possibly being clear, aren't we risking the
>>>>> entries to be mis-treated as not-present ones?
>>>>
>>>> Hah. You got me. Thanks! :)
>>>> Now I realized it would be difficult if we want to emulate the read
>>>> operations for HVM. According to the Intel manual, entry->r is to be
>>>> cleared, and so should entry->w if we do not want an EPT misconfig. And
>>>> with both read and write permissions forbidden, entry->x can be
>>>> set only on processors with EXECUTE_ONLY capability.
>>>> To avoid any entry being mis-treated as not-present, we have several
>>>> solutions:
>>>> a> do not support the read emulation for now - we have no such usage
>>>> case;
>>>> b> add the check of p2m_t against p2m_ioreq_server in is_epte_present -
>>>> a bit weird to me.
>>>> Which one do you prefer? or any other suggestions?
>>>
>>> That question would also need to be asked to others who had
>>> suggested supporting both. I'd be fine with a, but I also don't view
>>> b as too awkward.
>>
>> According to the Intel manual, an entry is regarded as not present if
>> bits 0:2 are all 0. So adding a p2m type check in is_epte_present() means
>> we will change its semantics. If this is acceptable (with no harm to the
>> hypervisor), I'd prefer option b>
>
> Perhaps time for the VMX maintainers to chime in - such a change is acceptable
> only if it doesn't result in changed hardware behavior. I can't think of any such off
> the top of my head, but this really should be confirmed by the maintainers before
> deciding to go such a route.
>

Thanks, Jan. :)
Jun & Kevin, any suggestions?

Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-11 11:15             ` Yu, Zhang
@ 2016-04-14 10:45               ` Yu, Zhang
  2016-04-18 15:57                 ` Paul Durrant
  2016-04-19  4:37                 ` Tian, Kevin
  0 siblings, 2 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-14 10:45 UTC (permalink / raw)
  To: George Dunlap, Paul Durrant, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim Deegan, Lv, Zhiyuan,
	jun.nakajima

On 4/11/2016 7:15 PM, Yu, Zhang wrote:
>
>
> On 4/8/2016 7:01 PM, George Dunlap wrote:
>> On 08/04/16 11:10, Yu, Zhang wrote:
>> [snip]
>>> BTW, I noticed your reply has not been CCed to the mailing list, and I
>>> also wonder if we should raise this last question in the community?
>>
>> Oops -- that was a mistake on my part.  :-)  I appreciate the
>> discretion; just so you know in the future, if I'm purposely changing
>> the CC list (removing xen-devel and/or adding extra people), I'll almost
>> always say so at the top of the mail.
>>
>>>> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
>>>> transition -- I assume that live migration is incompatible with this
>>>> functionality?  Is there anything that prevents a live migration from
>>>> being started when there are outstanding p2m_ioreq_server entries?
>>>>
>>>
>>> Another good question, and the answer is unfortunately yes. :-)
>>>
>>> If live migration happens during the normal emulation process, entries
>>> marked with p2m_ioreq_server will be changed to p2m_log_dirty in
>>> resolve_misconfig(), and later write operations will change them to
p2m_ram_rw; thereafter these pages cannot be forwarded to the device model.
>>>  From this point of view, this functionality is incompatible with live
>>> migration.
>>>
>>> But for XenGT, I think this is acceptable, because, if live migration
>>> is to be supported in the future, intervention from backend device
>>> model will be necessary. At that time, we can guarantee from the device
>>> model side that there's no outdated p2m_ioreq_server entries, hence no
>>> need to reset the p2m type back to p2m_ram_rw(and do not include
>>> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I mean
>>> when an ioreq server is detached from p2m_ioreq_server, or before an
>>> ioreq server is attached to this type, entries marked with
>>> p2m_ioreq_server should be regarded as outdated.
>>>
>>> Is this acceptable to you? Any suggestions?
>>
>> So the question is, as of this series, what happens if someone tries to
>> initiate a live migration while there are outstanding p2m_ioreq_server
>> entries?
>>
>> If the answer is "the ioreq server suddenly loses all control of the
>> memory", that's something that needs to be changed.
>>
>
> Sorry, for this patch series, I'm afraid the above description is the
> answer.
>
> Besides, I find it's hard to change current code to both support the
> deferred resetting of p2m_ioreq_server and the live migration at the
> same time. One reason is that a page with p2m_ioreq_server behaves
> differently in different situations.
>
> My assumption of XenGT is that, for live migration to work, the device
> model should guarantee there's no outstanding p2m_ioreq_server pages
> in hypervisor(no need to use the deferred recalculation), and it is our
> device model who should be responsible for the copying of the write
> protected guest pages later.
>
> And another solution I can think of: when unmapping the ioreq server,
> we walk the p2m table and reset entries with p2m_ioreq_server back
> directly, instead of deferring the reset. And of course, this means
> performance impact. But since the mapping and unmapping of an ioreq
> server is not a frequent one, the performance penalty may be acceptable.
> What do you think about this approach?
>

George, sorry to bother you. Any comments on the above option? :)

Another choice might be to let live migration fail if there are
outstanding p2m_ioreq_server entries. But I'm not inclined to do
so, because:
1> I'd still like to keep the live migration feature for XenGT.
2> It is not easy to know whether there are outstanding p2m_ioreq_server
entries. I mean, since a p2m type change is not only triggered by
hypercall, keeping a counter of the remaining p2m_ioreq_server entries
would mean a lot of code changes;

Besides, I wonder whether the requirement to reset the p2m_ioreq_server
entries is indispensable; could we let the device model side be responsible
for this? The worst case I can imagine, should the device model fail to do
so, is that operations on a gfn might be delivered to the wrong device
model. I'm not clear what kind of damage this would cause to the
hypervisor or to other VMs.

Do any other maintainers have any suggestions?
Thanks in advance! :)
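[Editorial note: the direct-reset alternative proposed above (walking the p2m and resetting p2m_ioreq_server entries when the ioreq server unmaps, instead of deferring the reset) can be sketched with a toy model. The flat array below only stands in for Xen's real p2m structures; the function name and interface are assumptions for illustration.]

```c
#include <assert.h>
#include <stddef.h>

enum p2m_type { p2m_ram_rw, p2m_ioreq_server };

/* Toy stand-in for the guest p2m: one type per gfn. */
#define NR_GFNS 8

/* On ioreq server unmap, eagerly sweep the table and reset every
 * p2m_ioreq_server entry back to p2m_ram_rw, rather than deferring
 * the reset to a later type recalculation.  Returns the number of
 * entries reset, so the caller could log or sanity-check it. */
static size_t p2m_reset_ioreq_server(enum p2m_type p2m[], size_t nr)
{
    size_t i, reset = 0;

    for ( i = 0; i < nr; i++ )
        if ( p2m[i] == p2m_ioreq_server )
        {
            p2m[i] = p2m_ram_rw;
            reset++;
        }

    return reset;
}
```

The cost is one full walk per unmap, which is the performance impact mentioned above; since mapping and unmapping an ioreq server is infrequent, that may be acceptable.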
>> If the answer is, "everything just works", that's perfect.
>>
>> If the answer is, "Before logdirty mode is set, the ioreq server has the
>> opportunity to detach, removing the p2m_ioreq_server entries, and
>> operating without that functionality", that's good too.
>>
>> If the answer is, "the live migration request fails and the guest
>> continues to run", that's also acceptable.  If you want this series to
>> be checked in today (the last day for 4.7), this is probably your best
>> bet.
>>
>>   -George
>>
>>
>>
>

Regards
Yu


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-08 21:48   ` Jan Beulich
@ 2016-04-18  8:41     ` Paul Durrant
  2016-04-18  9:10       ` George Dunlap
  2016-04-18 16:40       ` Jan Beulich
  0 siblings, 2 replies; 82+ messages in thread
From: Paul Durrant @ 2016-04-18  8:41 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, Yu Zhang, zhiyuan.lv, Jun Nakajima

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 April 2016 22:48
> To: xen-devel@lists.xen.org
> Cc: Andrew Cooper; Paul Durrant; George Dunlap; Jun Nakajima; Kevin Tian;
> zhiyuan.lv@intel.com; Yu Zhang; Keir (Xen.org); Tim (Xen.org)
> Subject: Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm
> to p2m_ioreq_server
> 
> >>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
> > --- a/xen/include/public/hvm/hvm_op.h
> > +++ b/xen/include/public/hvm/hvm_op.h
> > @@ -83,7 +83,7 @@ typedef enum {
> >      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >      HVMMEM_mmio_dm,            /* Reads and write go to the device model
> */
> > -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device
> model */
> > +    HVMMEM_ioreq_server,
> >  } hvmmem_type_t;
> >
> >  /* Following tools-only interfaces may change in future. */
> 
> So there's one problem here, which the comment at the bottom
> of the context already hints at: This enum is part of the not
> tools restricted interface (as HVMOP_get_mem_type is usable
> by guests themselves), which we cannot change like this. Since
> the meaning of the enumerator value doesn't change, I guess
> we can get away with simply retaining its old name for non-up-
> to-date __XEN_INTERFACE_VERSION__.
>

Has the type made it into a release yet? I was assuming we could make the change without any need to play with the interface version, since it has only ever been present in xen-unstable so far.

  Paul

 
> Jan
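[Editorial note: the fallback Jan suggests — retaining the old enumerator name for consumers built against a not-up-to-date __XEN_INTERFACE_VERSION__ — might look roughly as below. The version threshold 0x00040800 and the comment wording are assumptions for illustration, not the actual patch.]

```c
#include <assert.h>

/* Pretend we are a consumer built against an older interface version;
 * the threshold used below is an assumption for illustration. */
#define __XEN_INTERFACE_VERSION__ 0x00040700

typedef enum {
    HVMMEM_ram_rw,             /* Normal read/write guest RAM */
    HVMMEM_ram_ro,             /* Read-only; writes are discarded */
    HVMMEM_mmio_dm,            /* Reads and writes go to the device model */
    HVMMEM_ioreq_server,       /* Pages claimed by an ioreq server */
} hvmmem_type_t;

#if __XEN_INTERFACE_VERSION__ < 0x00040800
/* Retain the old enumerator name, with the same value, for
 * consumers built against a pre-rename interface version. */
#define HVMMEM_mmio_write_dm HVMMEM_ioreq_server
#endif
```

Since the enumerator's value does not change, old and new names stay interchangeable at the ABI level; only the spelling visible to source code differs.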



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18  8:41     ` Paul Durrant
@ 2016-04-18  9:10       ` George Dunlap
  2016-04-18  9:14         ` Wei Liu
  2016-04-18 16:40       ` Jan Beulich
  1 sibling, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-18  9:10 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Keir (Xen.org),
	Jun Nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Yu Zhang, zhiyuan.lv, Jan Beulich, Wei Liu

On Mon, Apr 18, 2016 at 9:41 AM, Paul Durrant <Paul.Durrant@citrix.com> wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 08 April 2016 22:48
>> To: xen-devel@lists.xen.org
>> Cc: Andrew Cooper; Paul Durrant; George Dunlap; Jun Nakajima; Kevin Tian;
>> zhiyuan.lv@intel.com; Yu Zhang; Keir (Xen.org); Tim (Xen.org)
>> Subject: Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm
>> to p2m_ioreq_server
>>
>> >>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>> > --- a/xen/include/public/hvm/hvm_op.h
>> > +++ b/xen/include/public/hvm/hvm_op.h
>> > @@ -83,7 +83,7 @@ typedef enum {
>> >      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>> >      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>> >      HVMMEM_mmio_dm,            /* Reads and write go to the device model
>> */
>> > -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device
>> model */
>> > +    HVMMEM_ioreq_server,
>> >  } hvmmem_type_t;
>> >
>> >  /* Following tools-only interfaces may change in future. */
>>
>> So there's one problem here, which the comment at the bottom
>> of the context already hints at: This enum is part of the not
>> tools restricted interface (as HVMOP_get_mem_type is usable
>> by guests themselves), which we cannot change like this. Since
>> the meaning of the enumerator value doesn't change, I guess
>> we can get away with simply retaining its old name for non-up-
>> to-date __XEN_INTERFACE_VERSION__.
>>
>
> Has the type made it into a release yet? I was assuming we could make the change without any need to play with the interface version, since it has only ever been present in xen-unstable so far.

I think you need a Release Manager ack for that; but if it's the case
that the type has never been seen in public, then I think it should be
able to be renamed (in fact, we should rename it so that we don't have
to deal with backwards compatibility later).

 -George


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18  9:10       ` George Dunlap
@ 2016-04-18  9:14         ` Wei Liu
  2016-04-18  9:45           ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Wei Liu @ 2016-04-18  9:14 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Keir (Xen.org),
	Jun Nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Paul Durrant, Yu Zhang, zhiyuan.lv, Jan Beulich,
	Wei Liu

On Mon, Apr 18, 2016 at 10:10:00AM +0100, George Dunlap wrote:
> On Mon, Apr 18, 2016 at 9:41 AM, Paul Durrant <Paul.Durrant@citrix.com> wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 08 April 2016 22:48
> >> To: xen-devel@lists.xen.org
> >> Cc: Andrew Cooper; Paul Durrant; George Dunlap; Jun Nakajima; Kevin Tian;
> >> zhiyuan.lv@intel.com; Yu Zhang; Keir (Xen.org); Tim (Xen.org)
> >> Subject: Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm
> >> to p2m_ioreq_server
> >>
> >> >>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
> >> > --- a/xen/include/public/hvm/hvm_op.h
> >> > +++ b/xen/include/public/hvm/hvm_op.h
> >> > @@ -83,7 +83,7 @@ typedef enum {
> >> >      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >> >      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >> >      HVMMEM_mmio_dm,            /* Reads and write go to the device model
> >> */
> >> > -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device
> >> model */
> >> > +    HVMMEM_ioreq_server,
> >> >  } hvmmem_type_t;
> >> >
> >> >  /* Following tools-only interfaces may change in future. */
> >>
> >> So there's one problem here, which the comment at the bottom
> >> of the context already hints at: This enum is part of the not
> >> tools restricted interface (as HVMOP_get_mem_type is usable
> >> by guests themselves), which we cannot change like this. Since
> >> the meaning of the enumerator value doesn't change, I guess
> >> we can get away with simply retaining its old name for non-up-
> >> to-date __XEN_INTERFACE_VERSION__.
> >>
> >
> > Has the type made it into a release yet? I was assuming we could make the change without any need to play with the interface version, since it has only ever been present in xen-unstable so far.
> 
> I think you need a Release Manager ack for that; but if it's the case
> that the type has never been seen in public, then I think it should be
> able to be renamed (in fact, we should rename it so that we don't have
> to deal with backwards compatibility later).
> 

If it is not released yet, feel free to change it -- but at this point
you also need to present an argument for why it should be changed. I don't
expect the seddery to be too hard to review.

Wei.

>  -George


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18  9:14         ` Wei Liu
@ 2016-04-18  9:45           ` Paul Durrant
  0 siblings, 0 replies; 82+ messages in thread
From: Paul Durrant @ 2016-04-18  9:45 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Keir (Xen.org),
	Jun Nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Yu Zhang, zhiyuan.lv, Jan Beulich, Wei Liu

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 18 April 2016 10:15
> To: George Dunlap
> Cc: Paul Durrant; Jan Beulich; xen-devel@lists.xen.org; Kevin Tian; Keir
> (Xen.org); Andrew Cooper; Tim (Xen.org); Yu Zhang; zhiyuan.lv@intel.com;
> Jun Nakajima; Wei Liu
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> On Mon, Apr 18, 2016 at 10:10:00AM +0100, George Dunlap wrote:
> > On Mon, Apr 18, 2016 at 9:41 AM, Paul Durrant <Paul.Durrant@citrix.com>
> wrote:
> > >> -----Original Message-----
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: 08 April 2016 22:48
> > >> To: xen-devel@lists.xen.org
> > >> Cc: Andrew Cooper; Paul Durrant; George Dunlap; Jun Nakajima; Kevin
> Tian;
> > >> zhiyuan.lv@intel.com; Yu Zhang; Keir (Xen.org); Tim (Xen.org)
> > >> Subject: Re: [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm
> > >> to p2m_ioreq_server
> > >>
> > >> >>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
> > >> > --- a/xen/include/public/hvm/hvm_op.h
> > >> > +++ b/xen/include/public/hvm/hvm_op.h
> > >> > @@ -83,7 +83,7 @@ typedef enum {
> > >> >      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> > >> >      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> > >> >      HVMMEM_mmio_dm,            /* Reads and write go to the device
> model
> > >> */
> > >> > -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the
> device
> > >> model */
> > >> > +    HVMMEM_ioreq_server,
> > >> >  } hvmmem_type_t;
> > >> >
> > >> >  /* Following tools-only interfaces may change in future. */
> > >>
> > >> So there's one problem here, which the comment at the bottom
> > >> of the context already hints at: This enum is part of the not
> > >> tools restricted interface (as HVMOP_get_mem_type is usable
> > >> by guests themselves), which we cannot change like this. Since
> > >> the meaning of the enumerator value doesn't change, I guess
> > >> we can get away with simply retaining its old name for non-up-
> > >> to-date __XEN_INTERFACE_VERSION__.
> > >>
> > >
> > > Has the type made it into a release yet? I was assuming we could make
> the change without any need to play with the version since it's only ever
> been present in xen-unstable so far.
> >
> > I think you need a Release Manager ack for that; but if it's the case
> > that the type has never been seen in public, then I think it should be
> > able to be renamed (in fact, we should rename it so that we don't have
> > to deal with backwards compatibility later).
> >
> 
> If it is not released yet, feel free to change it -- but at this point
> you also need to present argument on why it should be changed. I don't
> expect the seddery too hard to review.
> 

There was a design doc posted to the list a couple of months back which has the justification. See http://lists.xen.org/archives/html/xen-devel/2016-02/msg03770.html

  Paul

> Wei.
> 
> >  -George


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-14 10:45               ` Yu, Zhang
@ 2016-04-18 15:57                 ` Paul Durrant
  2016-04-19  9:11                   ` Yu, Zhang
  2016-04-19  4:37                 ` Tian, Kevin
  1 sibling, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-18 15:57 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 14 April 2016 11:45
> To: George Dunlap; Paul Durrant; xen-devel@lists.xen.org
> Cc: Jan Beulich; Kevin Tian; Andrew Cooper; Lv, Zhiyuan; Tim (Xen.org);
> jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> On 4/11/2016 7:15 PM, Yu, Zhang wrote:
> >
> >
> > On 4/8/2016 7:01 PM, George Dunlap wrote:
> >> On 08/04/16 11:10, Yu, Zhang wrote:
> >> [snip]
> >>> BTW, I noticed your reply has not been CCed to the mailing list, and I
> >>> also wonder if we should raise this last question in the community?
> >>
> >> Oops -- that was a mistake on my part.  :-)  I appreciate the
> >> discretion; just so you know in the future, if I'm purposely changing
> >> the CC list (removing xen-devel and/or adding extra people), I'll almost
> >> always say so at the top of the mail.
> >>
> >>>> And then of course there's the p2m_ioreq_server ->
> p2m_ram_logdirty
> >>>> transition -- I assume that live migration is incompatible with this
> >>>> functionality?  Is there anything that prevents a live migration from
> >>>> being started when there are outstanding p2m_ioreq_server entries?
> >>>>
> >>>
> >>> Another good question, and the answer is unfortunately yes. :-)
> >>>
> >>> If live migration happens during the normal emulation process, entries
> >>> marked with p2m_ioreq_server will be changed to p2m_log_dirty in
> >>> resolve_misconfig(), and later write operations will change them to
> >>> p2m_ram_rw, thereafter these pages can not be forwarded to device
> model.
> >>>  From this point of view, this functionality is incompatible with live
> >>> migration.
> >>>
> >>> But for XenGT, I think this is acceptable, because, if live migration
> >>> is to be supported in the future, intervention from backend device
> >>> model will be necessary. At that time, we can guarantee from the device
> >>> model side that there's no outdated p2m_ioreq_server entries, hence
> no
> >>> need to reset the p2m type back to p2m_ram_rw(and do not include
> >>> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I
> mean
> >>> when an ioreq server is detached from p2m_ioreq_server, or before an
> >>> ioreq server is attached to this type, entries marked with
> >>> p2m_ioreq_server should be regarded as outdated.
> >>>
> >>> Is this acceptable to you? Any suggestions?
> >>
> >> So the question is, as of this series, what happens if someone tries to
> >> initiate a live migration while there are outstanding p2m_ioreq_server
> >> entries?
> >>
> >> If the answer is "the ioreq server suddenly loses all control of the
> >> memory", that's something that needs to be changed.
> >>
> >
> > Sorry, for this patch series, I'm afraid the above description is the
> > answer.
> >
> > Besides, I find it's hard to change current code to both support the
> > deferred resetting of p2m_ioreq_server and the live migration at the
> > same time. One reason is that a page with p2m_ioreq_server behaves
> > differently in different situations.
> >
> > My assumption of XenGT is that, for live migration to work, the device
> > model should guarantee there's no outstanding p2m_ioreq_server pages
> > in hypervisor(no need to use the deferred recalculation), and it is our
> > device model who should be responsible for the copying of the write
> > protected guest pages later.
> >
> > And another solution I can think of: when unmapping the ioreq server,
> > we walk the p2m table and reset entries with p2m_ioreq_server back
> > directly, instead of deferring the reset. And of course, this means
> > performance impact. But since the mapping and unmapping of an ioreq
> > server is not a frequent one, the performance penalty may be acceptable.
> > What do you think about this approach?
> >
> 
> George, sorry to bother you. Any comments on the above option? :)
> 
> Another choice might be to let live migration fail if there are
> outstanding p2m_ioreq_server entries. But I'm not inclined to do
> so, because:
> 1> I'd still like to keep the live migration feature for XenGT.
> 2> It is not easy to know whether there are outstanding p2m_ioreq_server
> entries. I mean, since a p2m type change is not only triggered by
> hypercall, keeping a counter of the remaining p2m_ioreq_server entries
> would mean a lot of code changes;
> 
> Besides, I wonder whether the requirement to reset the p2m_ioreq_server
> entries is indispensable; could we let the device model side be responsible
> for this? The worst case I can imagine, should the device model fail to do
> so, is that operations on a gfn might be delivered to the wrong device
> model. I'm not clear what kind of damage this would cause to the
> hypervisor or to other VMs.
> 
> Do any other maintainers have any suggestions?

Note that it is a requirement that an ioreq server be disabled before VM suspend. That means ioreq server pages essentially have to go back to ram_rw semantics.

  Paul

> Thanks in advance! :)
> >> If the answer is, "everything just works", that's perfect.
> >>
> >> If the answer is, "Before logdirty mode is set, the ioreq server has the
> >> opportunity to detach, removing the p2m_ioreq_server entries, and
> >> operating without that functionality", that's good too.
> >>
> >> If the answer is, "the live migration request fails and the guest
> >> continues to run", that's also acceptable.  If you want this series to
> >> be checked in today (the last day for 4.7), this is probably your best
> >> bet.
> >>
> >>   -George
> >>
> >>
> >>
> >
> 
> Regards
> Yu

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18  8:41     ` Paul Durrant
  2016-04-18  9:10       ` George Dunlap
@ 2016-04-18 16:40       ` Jan Beulich
  2016-04-18 16:45         ` Paul Durrant
  1 sibling, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-18 16:40 UTC (permalink / raw)
  To: Paul.Durrant
  Cc: kevin.tian, keir, Andrew.Cooper3, tim, George.Dunlap, xen-devel,
	yu.c.zhang, zhiyuan.lv, jun.nakajima

>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 10:41 AM >>>
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 April 2016 22:48
>> >>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
>> > --- a/xen/include/public/hvm/hvm_op.h
>> > +++ b/xen/include/public/hvm/hvm_op.h
>> > @@ -83,7 +83,7 @@ typedef enum {
>> >      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>> >      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>> >      HVMMEM_mmio_dm,            /* Reads and write go to the device model
>> */
>> > -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device
>> model */
>> > +    HVMMEM_ioreq_server,
>> >  } hvmmem_type_t;
>> >
>> >  /* Following tools-only interfaces may change in future. */
>> 
>> So there's one problem here, which the comment at the bottom
>> of the context already hints at: This enum is part of the not
>> tools restricted interface (as HVMOP_get_mem_type is usable
>> by guests themselves), which we cannot change like this. Since
>> the meaning of the enumerator value doesn't change, I guess
>> we can get away with simply retaining its old name for non-up-
>> to-date __XEN_INTERFACE_VERSION__.
>
>Has the type made it into a release yet? I was assuming we could make the change without any need to play with the version since it's only ever been present in xen-unstable so far.

  Oh, I didn't realize this got added only after 4.6. If that was the case, then the
change of course could be done without any conditional. Checking ... No, 4.6.1
has it.

As for any of this going in now - I suppose this would need a freeze exception,
which I think we've meant to avoid as much as possible this time round.

Jan



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18 16:40       ` Jan Beulich
@ 2016-04-18 16:45         ` Paul Durrant
  2016-04-18 16:47           ` Jan Beulich
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-18 16:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, yu.c.zhang, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Jan Beulich [mailto:jbeulich@suse.com]
> Sent: 18 April 2016 17:40
> To: Paul Durrant
> Cc: Andrew Cooper; George Dunlap; jun.nakajima@intel.com; Kevin Tian;
> zhiyuan.lv@intel.com; yu.c.zhang@linux.intel.com; xen-devel@lists.xen.org;
> Keir (Xen.org); Tim (Xen.org)
> Subject: Re: RE: [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> >>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 10:41 AM >>>
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 08 April 2016 22:48
> >> >>> On 31.03.16 at 12:53, <yu.c.zhang@linux.intel.com> wrote:
> >> > --- a/xen/include/public/hvm/hvm_op.h
> >> > +++ b/xen/include/public/hvm/hvm_op.h
> >> > @@ -83,7 +83,7 @@ typedef enum {
> >> >      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >> >      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >> >      HVMMEM_mmio_dm,            /* Reads and write go to the device
> model
> >> */
> >> > -    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device
> >> model */
> >> > +    HVMMEM_ioreq_server,
> >> >  } hvmmem_type_t;
> >> >
> >> >  /* Following tools-only interfaces may change in future. */
> >>
> >> So there's one problem here, which the comment at the bottom
> >> of the context already hints at: This enum is part of the not
> >> tools restricted interface (as HVMOP_get_mem_type is usable
> >> by guests themselves), which we cannot change like this. Since
> >> the meaning of the enumerator value doesn't change, I guess
> >> we can get away with simply retaining its old name for non-up-
> >> to-date __XEN_INTERFACE_VERSION__.
> >
> >Has the type made it into a release yet? I was assuming we could make the
> change without any need to play with the version since it's only ever been
> present in xen-unstable so far.
> 
>   Oh, I didn't realize this got added only after 4.6. If that was the case, then
> the
> change of course could be done without any conditional. Checking ... No,
> 4.6.1
> has it.
> 

Damn. It needs to be ifdef-ed then :-(

> As for any of this going in now - I suppose this would need a freeze
> exception, which I think we've meant to avoid as much as possible this
> time round.
> 

The original patch was posted before the cut-off IIRC so I'm not sure of the policy regarding freeze-exceptions.

  Paul

> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18 16:45         ` Paul Durrant
@ 2016-04-18 16:47           ` Jan Beulich
  2016-04-18 16:58             ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-18 16:47 UTC (permalink / raw)
  To: Paul.Durrant
  Cc: kevin.tian, keir, Andrew.Cooper3, tim, George.Dunlap, xen-devel,
	yu.c.zhang, zhiyuan.lv, jun.nakajima

>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
>The original patch was posted before the cut-off IIRC so I'm not sure
> of the policy regarding freeze-exceptions.

  It was submitted before the feature freeze, yes, but didn't make it in by
the code freeze. So it's my understanding that an exception would be needed.

Jan



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18 16:47           ` Jan Beulich
@ 2016-04-18 16:58             ` Paul Durrant
  2016-04-19 11:02               ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-18 16:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, yu.c.zhang, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 18 April 2016 17:47
> To: Paul Durrant
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
> zhiyuan.lv@intel.com; jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> >>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
> >The original patch was posted before the cut-off IIRC so I'm not sure
> > of the policy regarding freeze-exceptions.
> 
>   It was submitted before the feature freeze, yes, but didn't make it in by
> code freeze. So it's my understanding that an exception would be needed.
> 

Ok. Thanks for the clarification. IMO getting this in is worth the freeze exception... it's a shame p2m_mmio_write_dm made it into 4.6.1. It needs to go before it proliferates any further.

  Paul

> Jan
> 
> 

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-14 10:45               ` Yu, Zhang
  2016-04-18 15:57                 ` Paul Durrant
@ 2016-04-19  4:37                 ` Tian, Kevin
  2016-04-19  9:21                   ` Yu, Zhang
  1 sibling, 1 reply; 82+ messages in thread
From: Tian, Kevin @ 2016-04-19  4:37 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, Paul Durrant, xen-devel
  Cc: Nakajima, Jun, Andrew Cooper, Tim Deegan, Lv, Zhiyuan, Jan Beulich

> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: Thursday, April 14, 2016 6:45 PM
> 
> On 4/11/2016 7:15 PM, Yu, Zhang wrote:
> >
> >
> > On 4/8/2016 7:01 PM, George Dunlap wrote:
> >> On 08/04/16 11:10, Yu, Zhang wrote:
> >> [snip]
> >>> BTW, I noticed your reply has not be CCed to mailing list, and I also
> >>> wonder if we should raise this last question in community?
> >>
> >> Oops -- that was a mistake on my part.  :-)  I appreciate the
> >> discretion; just so you know in the future, if I'm purposely changing
> >> the CC list (removing xen-devel and/or adding extra people), I'll almost
> >> always say so at the top of the mail.
> >>
> >>>> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
> >>>> transition -- I assume that live migration is incompatible with this
> >>>> functionality?  Is there anything that prevents a live migration from
> >>>> being started when there are outstanding p2m_ioreq_server entries?
> >>>>
> >>>
> >>> Another good question, and the answer is unfortunately yes. :-)
> >>>
> >>> If live migration happens during the normal emulation process, entries
> >>> marked with p2m_ioreq_server will be changed to p2m_log_dirty in
> >>> resolve_misconfig(), and later write operations will change them to
> >>> p2m_ram_rw, thereafter these pages can not be forwarded to device model.
> >>>  From this point of view, this functionality is incompatible with live
> >>> migration.
> >>>
> >>> But for XenGT, I think this is acceptable, because, if live migration
> >>> is to be supported in the future, intervention from backend device
> >>> model will be necessary. At that time, we can guarantee from the device
> >>> model side that there's no outdated p2m_ioreq_server entries, hence no
> >>> need to reset the p2m type back to p2m_ram_rw(and do not include
> >>> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I mean
> >>> when an ioreq server is detached from p2m_ioreq_server, or before an
> >>> ioreq server is attached to this type, entries marked with
> >>> p2m_ioreq_server should be regarded as outdated.
> >>>
> >>> Is this acceptible to you? Any suggestions?
> >>
> >> So the question is, as of this series, what happens if someone tries to
> >> initiate a live migration while there are outstanding p2m_ioreq_server
> >> entries?
> >>
> >> If the answer is "the ioreq server suddenly loses all control of the
> >> memory", that's something that needs to be changed.
> >>
> >
> > Sorry, for this patch series, I'm afraid the above description is the
> > answer.
> >
> > Besides, I find it's hard to change current code to both support the
> > deferred resetting of p2m_ioreq_server and the live migration at the
> > same time. One reason is that a page with p2m_ioreq_server behaves
> > differently in different situations.
> >
> > My assumption of XenGT is that, for live migration to work, the device
> > model should guarantee there's no outstanding p2m_ioreq_server pages
> > in hypervisor(no need to use the deferred recalculation), and it is our
> > device model who should be responsible for the copying of the write
> > protected guest pages later.
> >
> > And another solution I can think of: when unmapping the ioreq server,
> > we walk the p2m table and reset entries with p2m_ioreq_server back
> > directly, instead of deferring the reset. And of course, this means
> > performance impact. But since the mapping and unmapping of an ioreq
> > server is not a frequent one, the performance penalty may be acceptable.
> > How do you think about this approach?
> >
> 
> George, sorry to bother you. Any comments on above option? :)
> 
> Another choice might be to let live migration fail if there's
> outstanding p2m_ioreq_server entries. But I'm not quite inclined to do
> so, because:
> 1> I'd still like to keep live migration feature for XenGT.
> 2> Not easy to know if there's outstanding p2m_ioreq_server entries. I
> mean, since p2m type change is not only triggered by hypercall, to keep
> a counter for remaining p2m_ioreq_server entries means a lot code
> changes;
> 
> Besides, I wonder whether the requirement to reset the p2m_ioreq_server
> is indispensable, could we let the device model side to be responsible
> for this? The worst case I can imagine for device model failing to do
> so is that operations of a gfn might be delivered to a wrong device
> model. I'm not clear what kind of damage would this cause to the
> hypervisor or other VM.
> 
> Does any other maintainers have any suggestions?
> Thanks in advance! :)

I'm not sure how the above would work. In the pre-copy phase (where logdirty
is concerned), the device model is still actively serving requests from the
guest, including initiating new write-protection requests. How can you
guarantee the draining of outstanding p2m_ioreq_server entries without
actually freezing the device model (while freezing the device model means
the guest driver might be blocked with random errors)?

Thanks
Kevin

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-14  9:56             ` Yu, Zhang
@ 2016-04-19  4:50               ` Tian, Kevin
  2016-04-19  8:46                 ` Paul Durrant
  2016-04-19  9:15                 ` Yu, Zhang
  0 siblings, 2 replies; 82+ messages in thread
From: Tian, Kevin @ 2016-04-19  4:50 UTC (permalink / raw)
  To: Yu, Zhang, Jan Beulich, paul.durrant, Nakajima, Jun
  Cc: keir, george.dunlap, andrew.cooper3, tim, xen-devel, Lv, Zhiyuan

> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: Thursday, April 14, 2016 5:57 PM
> >>>>> And with all three bits now possibly being clear, aren't we risking the
> >>>>> entries to be mis-treated as not-present ones?
> >>>>
> >>>> Hah. You got me. Thanks! :)
> >>>> Now I realized it would be difficult if we wanna to emulate the read
> >>>> operations for HVM. According to Intel mannual, entry->r is to be
> >>>> cleared, so should entry->w if we do not want ept misconfig. And
> >>>> with both read and write permissions being forbidden, entry->x can be
> >>>> set only on processors with EXECUTE_ONLY capability.
> >>>> To avoid any entry to be mis-treated as not-present. We have several
> >>>> solutions:
> >>>> a> do not support the read emulation for now - we have no such usage
> >>>> case;
> >>>> b> add the check of p2m_t against p2m_ioreq_server in is_epte_present -
> >>>> a bit weird to me.
> >>>> Which one do you prefer? or any other suggestions?
> >>>
> >>> That question would also need to be asked to others who had
> >>> suggested supporting both. I'd be fine with a, but I also don't view
> >>> b as too awkward.
> >>
> >> According to Intel mannual, an entry is regarded as not present, if
> >> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
> >> will change its semantics, if this is acceptable(with no hurt to
> >> hypervisor). I'd prefer option b>
> >
> > Perhaps time for the VMX maintainers to chime in - such a change is acceptable
> > only if it doesn't result in changed hardware behavior. I can't think of any such off
> > the top of my head, but this really should be confirmed by the maintainers before
> > deciding to go such a route.
> >
> 
> Thanks, Jan. :)
> Jun & Kevin, any suggestions?
> 

I'm a bit worried about this change, since it's a fundamental EPT
interface. Can we avoid supporting read emulation now?

Thanks
Kevin
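
[Option b, as debated above, would amount to something like the following self-contained sketch. The struct and field names merely mimic Xen's ept_entry_t for illustration; this is a toy model, not the hypervisor code.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an EPT entry: bits 0-2 are the R/W/X permissions, plus the
 * software-available p2m type that Xen keeps in the entry's ignored bits. */
typedef enum { p2m_ram_rw, p2m_ioreq_server } p2m_type_t;

typedef struct {
    uint64_t epte;       /* bit 0 = read, bit 1 = write, bit 2 = execute */
    p2m_type_t sa_p2mt;  /* software p2m type */
} ept_entry_t;

/* Per the SDM, an entry with bits 0:2 all clear is "not present".  An
 * r/w/x-clear p2m_ioreq_server entry (read emulation without execute-only
 * support) would therefore need special-casing to still count as present -
 * the semantic change Kevin is worried about. */
static bool is_epte_present(const ept_entry_t *e)
{
    return (e->epte & 0x7) != 0 || e->sa_p2mt == p2m_ioreq_server;
}
```

Option a avoids this entirely: write-only emulation keeps the read bit set, so the entry is never at risk of looking not-present.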


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  4:50               ` Tian, Kevin
@ 2016-04-19  8:46                 ` Paul Durrant
  2016-04-19  9:27                   ` Yu, Zhang
  2016-04-19  9:15                 ` Yu, Zhang
  1 sibling, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19  8:46 UTC (permalink / raw)
  To: Kevin Tian, Yu, Zhang, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 19 April 2016 05:51
> To: Yu, Zhang; Jan Beulich; Paul Durrant; Nakajima, Jun
> Cc: Andrew Cooper; George Dunlap; Lv, Zhiyuan; xen-devel@lists.xen.org;
> Keir (Xen.org); Tim (Xen.org)
> Subject: RE: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> > From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> > Sent: Thursday, April 14, 2016 5:57 PM
> > >>>>> And with all three bits now possibly being clear, aren't we risking the
> > >>>>> entries to be mis-treated as not-present ones?
> > >>>>
> > >>>> Hah. You got me. Thanks! :)
> > >>>> Now I realized it would be difficult if we wanna to emulate the read
> > >>>> operations for HVM. According to Intel mannual, entry->r is to be
> > >>>> cleared, so should entry->w if we do not want ept misconfig. And
> > >>>> with both read and write permissions being forbidden, entry->x can
> be
> > >>>> set only on processors with EXECUTE_ONLY capability.
> > >>>> To avoid any entry to be mis-treated as not-present. We have several
> > >>>> solutions:
> > >>>> a> do not support the read emulation for now - we have no such
> usage
> > >>>> case;
> > >>>> b> add the check of p2m_t against p2m_ioreq_server in
> is_epte_present -
> > >>>> a bit weird to me.
> > >>>> Which one do you prefer? or any other suggestions?
> > >>>
> > >>> That question would also need to be asked to others who had
> > >>> suggested supporting both. I'd be fine with a, but I also don't view
> > >>> b as too awkward.
> > >>
> > >> According to Intel mannual, an entry is regarded as not present, if
> > >> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
> > >> will change its semantics, if this is acceptable(with no hurt to
> > >> hypervisor). I'd prefer option b>
> > >
> > > Perhaps time for the VMX maintainers to chime in - such a change is
> acceptable
> > > only if it doesn't result in changed hardware behavior. I can't think of any
> such off
> > > the top of my head, but this really should be confirmed by the
> maintainers before
> > > deciding to go such a route.
> > >
> >
> > Thanks, Jan. :)
> > Jun & Kevin, any suggestions?
> >
> 
> I'm a bit worried about this change, since it's a fundamental EPT
> interface. Can we avoid supporting read emulation now?
> 

I'm happy to drop the implementation of read emulation for the moment to keep things simple, as long as it is kept in the interface.

  Paul

> Thanks
> Kevin


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-18 15:57                 ` Paul Durrant
@ 2016-04-19  9:11                   ` Yu, Zhang
  2016-04-19  9:21                     ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:11 UTC (permalink / raw)
  To: Paul Durrant, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima



On 4/18/2016 11:57 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 14 April 2016 11:45
>> To: George Dunlap; Paul Durrant; xen-devel@lists.xen.org
>> Cc: Jan Beulich; Kevin Tian; Andrew Cooper; Lv, Zhiyuan; Tim (Xen.org);
>> jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>> On 4/11/2016 7:15 PM, Yu, Zhang wrote:
>>>
>>>
>>> On 4/8/2016 7:01 PM, George Dunlap wrote:
>>>> On 08/04/16 11:10, Yu, Zhang wrote:
>>>> [snip]
>>>>> BTW, I noticed your reply has not be CCed to mailing list, and I also
>>>>> wonder if we should raise this last question in community?
>>>>
>>>> Oops -- that was a mistake on my part.  :-)  I appreciate the
>>>> discretion; just so you know in the future, if I'm purposely changing
>>>> the CC list (removing xen-devel and/or adding extra people), I'll almost
>>>> always say so at the top of the mail.
>>>>
>>>>>> And then of course there's the p2m_ioreq_server ->
>> p2m_ram_logdirty
>>>>>> transition -- I assume that live migration is incompatible with this
>>>>>> functionality?  Is there anything that prevents a live migration from
>>>>>> being started when there are outstanding p2m_ioreq_server entries?
>>>>>>
>>>>>
>>>>> Another good question, and the answer is unfortunately yes. :-)
>>>>>
>>>>> If live migration happens during the normal emulation process, entries
>>>>> marked with p2m_ioreq_server will be changed to p2m_log_dirty in
>>>>> resolve_misconfig(), and later write operations will change them to
>>>>> p2m_ram_rw, thereafter these pages can not be forwarded to device
>> model.
>>>>>   From this point of view, this functionality is incompatible with live
>>>>> migration.
>>>>>
>>>>> But for XenGT, I think this is acceptable, because, if live migration
>>>>> is to be supported in the future, intervention from backend device
>>>>> model will be necessary. At that time, we can guarantee from the device
>>>>> model side that there's no outdated p2m_ioreq_server entries, hence
>> no
>>>>> need to reset the p2m type back to p2m_ram_rw(and do not include
>>>>> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I
>> mean
>>>>> when an ioreq server is detached from p2m_ioreq_server, or before an
>>>>> ioreq server is attached to this type, entries marked with
>>>>> p2m_ioreq_server should be regarded as outdated.
>>>>>
>>>>> Is this acceptible to you? Any suggestions?
>>>>
>>>> So the question is, as of this series, what happens if someone tries to
>>>> initiate a live migration while there are outstanding p2m_ioreq_server
>>>> entries?
>>>>
>>>> If the answer is "the ioreq server suddenly loses all control of the
>>>> memory", that's something that needs to be changed.
>>>>
>>>
>>> Sorry, for this patch series, I'm afraid the above description is the
>>> answer.
>>>
>>> Besides, I find it's hard to change current code to both support the
>>> deferred resetting of p2m_ioreq_server and the live migration at the
>>> same time. One reason is that a page with p2m_ioreq_server behaves
>>> differently in different situations.
>>>
>>> My assumption of XenGT is that, for live migration to work, the device
>>> model should guarantee there's no outstanding p2m_ioreq_server pages
>>> in hypervisor(no need to use the deferred recalculation), and it is our
>>> device model who should be responsible for the copying of the write
>>> protected guest pages later.
>>>
>>> And another solution I can think of: when unmapping the ioreq server,
>>> we walk the p2m table and reset entries with p2m_ioreq_server back
>>> directly, instead of deferring the reset. And of course, this means
>>> performance impact. But since the mapping and unmapping of an ioreq
>>> server is not a frequent one, the performance penalty may be acceptable.
>>> How do you think about this approach?
>>>
>>
>> George, sorry to bother you. Any comments on above option? :)
>>
>> Another choice might be to let live migration fail if there's
>> outstanding p2m_ioreq_server entries. But I'm not quite inclined to do
>> so, because:
>> 1> I'd still like to keep live migration feature for XenGT.
>> 2> Not easy to know if there's outstanding p2m_ioreq_server entries. I
>> mean, since p2m type change is not only triggered by hypercall, to keep
>> a counter for remaining p2m_ioreq_server entries means a lot code
>> changes;
>>
>> Besides, I wonder whether the requirement to reset the p2m_ioreq_server
>> is indispensable, could we let the device model side to be responsible
>> for this? The worst case I can imagine for device model failing to do
>> so is that operations of a gfn might be delivered to a wrong device
>> model. I'm not clear what kind of damage would this cause to the
>> hypervisor or other VM.
>>
>> Does any other maintainers have any suggestions?
>
> Note that it is a requirement that an ioreq server be disabled before VM suspend. That means ioreq server pages essentially have to go back to ram_rw semantics.
>
>    Paul
>

OK. So it should be the hypervisor's responsibility to do the resetting.
Now we probably have 2 choices:
1> we reset the p2m type synchronously when ioreq server unmapping
happens, instead of deferring to the misconfig handling part. This
means a performance impact to traverse the p2m table.

Or
2> we just disallow live migration when p2m->ioreq.server is not NULL.
This is not quite accurate, because having p2m->ioreq.server mapped
to p2m_ioreq_server does not necessarily mean there would be such
outstanding entries. To be more accurate, we could add some other rough
check, e.g. both check p2m->ioreq.server against NULL and check whether
hvmop_set_mem_type has ever been triggered for the
p2m_ioreq_server type.

Both choices seem suboptimal to me, and I wonder if we have any
better solutions?

Thanks
Yu
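
[Choice 1 above could be sketched as follows. This is a toy model with a flat array standing in for the real multi-level p2m table; the function name and types are illustrative assumptions, not Xen code.]

```c
#include <assert.h>
#include <stddef.h>

/* Toy p2m: one type per gfn, standing in for the real paging structures. */
typedef enum { p2m_ram_rw, p2m_log_dirty, p2m_ioreq_server } p2m_type_t;

/* Choice 1: when the ioreq server is unmapped, synchronously sweep the
 * table and reset its entries back to ram_rw, instead of deferring the
 * reset to EPT-misconfig handling.  The O(nr_gfns) walk is the performance
 * cost being debated; it is paid only on (infrequent) map/unmap. */
static size_t p2m_reset_ioreq_server(p2m_type_t *p2m, size_t nr_gfns)
{
    size_t reset = 0;

    for (size_t gfn = 0; gfn < nr_gfns; gfn++)
        if (p2m[gfn] == p2m_ioreq_server) {
            p2m[gfn] = p2m_ram_rw;
            reset++;
        }
    return reset;
}
```

After such a sweep there are no outstanding p2m_ioreq_server entries by construction, so logdirty mode never encounters them.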

>> Thanks in advance! :)
>>>> If the answer is, "everything just works", that's perfect.
>>>>
>>>> If the answer is, "Before logdirty mode is set, the ioreq server has the
>>>> opportunity to detach, removing the p2m_ioreq_server entries, and
>>>> operating without that functionality", that's good too.
>>>>
>>>> If the answer is, "the live migration request fails and the guest
>>>> continues to run", that's also acceptable.  If you want this series to
>>>> be checked in today (the last day for 4.7), this is probably your best
>>>> bet.
>>>>
>>>>    -George
>>>>
>>>>
>>>>
>>>
>>
>> Regards
>> Yu

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  4:50               ` Tian, Kevin
  2016-04-19  8:46                 ` Paul Durrant
@ 2016-04-19  9:15                 ` Yu, Zhang
  2016-04-19  9:23                   ` Paul Durrant
  1 sibling, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:15 UTC (permalink / raw)
  To: Tian, Kevin, Jan Beulich, paul.durrant, Nakajima, Jun
  Cc: keir, george.dunlap, andrew.cooper3, tim, xen-devel, Lv, Zhiyuan



On 4/19/2016 12:50 PM, Tian, Kevin wrote:
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: Thursday, April 14, 2016 5:57 PM
>>>>>>> And with all three bits now possibly being clear, aren't we risking the
>>>>>>> entries to be mis-treated as not-present ones?
>>>>>>
>>>>>> Hah. You got me. Thanks! :)
>>>>>> Now I realized it would be difficult if we wanna to emulate the read
>>>>>> operations for HVM. According to Intel mannual, entry->r is to be
>>>>>> cleared, so should entry->w if we do not want ept misconfig. And
>>>>>> with both read and write permissions being forbidden, entry->x can be
>>>>>> set only on processors with EXECUTE_ONLY capability.
>>>>>> To avoid any entry to be mis-treated as not-present. We have several
>>>>>> solutions:
>>>>>> a> do not support the read emulation for now - we have no such usage
>>>>>> case;
>>>>>> b> add the check of p2m_t against p2m_ioreq_server in is_epte_present -
>>>>>> a bit weird to me.
>>>>>> Which one do you prefer? or any other suggestions?
>>>>>
>>>>> That question would also need to be asked to others who had
>>>>> suggested supporting both. I'd be fine with a, but I also don't view
>>>>> b as too awkward.
>>>>
>>>> According to Intel mannual, an entry is regarded as not present, if
>>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
>>>> will change its semantics, if this is acceptable(with no hurt to
>>>> hypervisor). I'd prefer option b>
>>>
>>> Perhaps time for the VMX maintainers to chime in - such a change is acceptable
>>> only if it doesn't result in changed hardware behavior. I can't think of any such off
>>> the top of my head, but this really should be confirmed by the maintainers before
>>> deciding to go such a route.
>>>
>>
>> Thanks, Jan. :)
>> Jun & Kevin, any suggestions?
>>
>
> I'm a bit worried about this change, since it's a fundamental EPT
> interface. Can we avoid supporting read emulation now?
>

Thanks for your reply, Kevin.
For XenGT, either way would be fine. But I do not think changing
is_epte_present() will introduce any bug.

Paul, do you have any preference?

Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:11                   ` Yu, Zhang
@ 2016-04-19  9:21                     ` Paul Durrant
  2016-04-19  9:44                       ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19  9:21 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima

> -----Original Message-----
[snip]
> >> Does any other maintainers have any suggestions?
> >
> > Note that it is a requirement that an ioreq server be disabled before VM
> suspend. That means ioreq server pages essentially have to go back to
> ram_rw semantics.
> >
> >    Paul
> >
> 
> OK. So it should be hypervisor's responsibility to do the resetting.
> Now we probably have 2 choices:
> 1> we reset the p2m type synchronously when ioreq server unmapping
> happens, instead of deferring to the misconfig handling part. This
> means performance impact to traverse the p2m table.
> 

Do we need to reset at all? The p2m type does need to be transferred; it will just be unclaimed on the far end (i.e. the pages are treated as r/w ram) until the emulator starts up there. If that cannot be done without creating yet another p2m type to handle logdirty (which seems a suboptimal way of dealing with it) then I think migration needs to be disallowed, at this stage, on any domain that contains any ioreq_server type pages.

  Paul
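
[Disallowing migration while ioreq_server pages remain could be gated as in the sketch below. All names are hypothetical; real Xen gates logdirty elsewhere, and as noted in the thread, maintaining such a counter in the hypervisor would itself require nontrivial changes, since p2m type changes are not only triggered by hypercall.]

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy domain state: a counter of gfns currently typed p2m_ioreq_server. */
struct toy_domain {
    unsigned long ioreq_server_pages;
    bool logdirty_enabled;
};

/* Refuse to enter logdirty mode (i.e. start migration pre-copy) while any
 * p2m_ioreq_server pages are outstanding, so the ioreq server never
 * silently loses control of the memory mid-migration. */
static int toy_enable_logdirty(struct toy_domain *d)
{
    if (d->ioreq_server_pages != 0)
        return -EBUSY;
    d->logdirty_enabled = true;
    return 0;
}
```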

> Or
> 2> we just disallow live migration when p2m->ioreq.server is not NULL.
> This is not quite accurate, because having p2m->ioreq.server mapped
> to p2m_ioreq_server does not necessarily means there would be such
> outstanding entries. To be more accurate, we can add some other rough
> check, e.g. both check if p2m->ioreq.server against NULL and check if
> the hvmop_set_mem_type has ever been triggered once for the
> p2m_ioreq_server type.
> 
> Both choice seems suboptimal for me. And I wonder if we have any
> better solutions?
> 
> Thanks
> Yu
> 
> >> Thanks in advance! :)
> >>>> If the answer is, "everything just works", that's perfect.
> >>>>
> >>>> If the answer is, "Before logdirty mode is set, the ioreq server has the
> >>>> opportunity to detach, removing the p2m_ioreq_server entries, and
> >>>> operating without that functionality", that's good too.
> >>>>
> >>>> If the answer is, "the live migration request fails and the guest
> >>>> continues to run", that's also acceptable.  If you want this series to
> >>>> be checked in today (the last day for 4.7), this is probably your best
> >>>> bet.
> >>>>
> >>>>    -George
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >> Regards
> >> Yu

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  4:37                 ` Tian, Kevin
@ 2016-04-19  9:21                   ` Yu, Zhang
  0 siblings, 0 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:21 UTC (permalink / raw)
  To: Tian, Kevin, George Dunlap, Paul Durrant, xen-devel
  Cc: Andrew Cooper, Jan Beulich, Tim Deegan, Lv, Zhiyuan, Nakajima, Jun



On 4/19/2016 12:37 PM, Tian, Kevin wrote:
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: Thursday, April 14, 2016 6:45 PM
>>
>> On 4/11/2016 7:15 PM, Yu, Zhang wrote:
>>>
>>>
>>> On 4/8/2016 7:01 PM, George Dunlap wrote:
>>>> On 08/04/16 11:10, Yu, Zhang wrote:
>>>> [snip]
>>>>> BTW, I noticed your reply has not be CCed to mailing list, and I also
>>>>> wonder if we should raise this last question in community?
>>>>
>>>> Oops -- that was a mistake on my part.  :-)  I appreciate the
>>>> discretion; just so you know in the future, if I'm purposely changing
>>>> the CC list (removing xen-devel and/or adding extra people), I'll almost
>>>> always say so at the top of the mail.
>>>>
>>>>>> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
>>>>>> transition -- I assume that live migration is incompatible with this
>>>>>> functionality?  Is there anything that prevents a live migration from
>>>>>> being started when there are outstanding p2m_ioreq_server entries?
>>>>>>
>>>>>
>>>>> Another good question, and the answer is unfortunately yes. :-)
>>>>>
>>>>> If live migration happens during the normal emulation process, entries
>>>>> marked with p2m_ioreq_server will be changed to p2m_log_dirty in
>>>>> resolve_misconfig(), and later write operations will change them to
>>>>> p2m_ram_rw, thereafter these pages can not be forwarded to device model.
>>>>>   From this point of view, this functionality is incompatible with live
>>>>> migration.
>>>>>
>>>>> But for XenGT, I think this is acceptable, because, if live migration
>>>>> is to be supported in the future, intervention from backend device
>>>>> model will be necessary. At that time, we can guarantee from the device
>>>>> model side that there's no outdated p2m_ioreq_server entries, hence no
>>>>> need to reset the p2m type back to p2m_ram_rw(and do not include
>>>>> p2m_ioreq_server in the P2M_CHANGEABLE_TYPES). By "outdated", I mean
>>>>> when an ioreq server is detached from p2m_ioreq_server, or before an
>>>>> ioreq server is attached to this type, entries marked with
>>>>> p2m_ioreq_server should be regarded as outdated.
>>>>>
>>>>> Is this acceptible to you? Any suggestions?
>>>>
>>>> So the question is, as of this series, what happens if someone tries to
>>>> initiate a live migration while there are outstanding p2m_ioreq_server
>>>> entries?
>>>>
>>>> If the answer is "the ioreq server suddenly loses all control of the
>>>> memory", that's something that needs to be changed.
>>>>
>>>
>>> Sorry, for this patch series, I'm afraid the above description is the
>>> answer.
>>>
>>> Besides, I find it's hard to change current code to both support the
>>> deferred resetting of p2m_ioreq_server and the live migration at the
>>> same time. One reason is that a page with p2m_ioreq_server behaves
>>> differently in different situations.
>>>
>>> My assumption of XenGT is that, for live migration to work, the device
>>> model should guarantee there's no outstanding p2m_ioreq_server pages
>>> in hypervisor(no need to use the deferred recalculation), and it is our
>>> device model who should be responsible for the copying of the write
>>> protected guest pages later.
>>>
>>> And another solution I can think of: when unmapping the ioreq server,
>>> we walk the p2m table and reset entries with p2m_ioreq_server back
>>> directly, instead of deferring the reset. And of course, this means
>>> performance impact. But since the mapping and unmapping of an ioreq
>>> server is not a frequent one, the performance penalty may be acceptable.
>>> How do you think about this approach?
>>>
>>
>> George, sorry to bother you. Any comments on above option? :)
>>
>> Another choice might be to let live migration fail if there's
>> outstanding p2m_ioreq_server entries. But I'm not quite inclined to do
>> so, because:
>> 1> I'd still like to keep live migration feature for XenGT.
>> 2> Not easy to know if there's outstanding p2m_ioreq_server entries. I
>> mean, since p2m type change is not only triggered by hypercall, to keep
>> a counter for remaining p2m_ioreq_server entries means a lot code
>> changes;
>>
>> Besides, I wonder whether the requirement to reset the p2m_ioreq_server
>> is indispensable, could we let the device model side to be responsible
>> for this? The worst case I can imagine for device model failing to do
>> so is that operations of a gfn might be delivered to a wrong device
>> model. I'm not clear what kind of damage would this cause to the
>> hypervisor or other VM.
>>
>> Does any other maintainers have any suggestions?
>> Thanks in advance! :)
>
> I'm not sure how above is working. In pre-copy phase (where logdirty
> is concerned), the device model is still actively serving requests from
> guest, including initiating new write-protection requests. How can you
> guarantee draining of outstanding p2m_ioreq_server entries w/o
> actually freezing device model (while freezing device model means guest
> driver might be blocked with random errors)?
>

You are right, and I'm not suggesting clearing the p2m_ioreq_server
entries when live migration happens. My suggestion is that we either
guarantee there are no outstanding p2m_ioreq_server entries right after
the ioreq server is unbound, or do not support live migration for
now.  :)

B.R.
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:15                 ` Yu, Zhang
@ 2016-04-19  9:23                   ` Paul Durrant
  0 siblings, 0 replies; 82+ messages in thread
From: Paul Durrant @ 2016-04-19  9:23 UTC (permalink / raw)
  To: Yu, Zhang, Kevin Tian, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 10:15
> To: Kevin Tian; Jan Beulich; Paul Durrant; Nakajima, Jun
> Cc: Keir (Xen.org); George Dunlap; Andrew Cooper; Tim (Xen.org); xen-
> devel@lists.xen.org; Lv, Zhiyuan
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> 
> 
> On 4/19/2016 12:50 PM, Tian, Kevin wrote:
> >> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >> Sent: Thursday, April 14, 2016 5:57 PM
> >>>>>>> And with all three bits now possibly being clear, aren't we risking
> the
> >>>>>>> entries to be mis-treated as not-present ones?
> >>>>>>
> >>>>>> Hah. You got me. Thanks! :)
> >>>>>> Now I realized it would be difficult if we wanna to emulate the read
> >>>>>> operations for HVM. According to Intel mannual, entry->r is to be
> >>>>>> cleared, so should entry->w if we do not want ept misconfig. And
> >>>>>> with both read and write permissions being forbidden, entry->x can
> be
> >>>>>> set only on processors with EXECUTE_ONLY capability.
> >>>>>> To avoid any entry to be mis-treated as not-present. We have
> several
> >>>>>> solutions:
> >>>>>> a> do not support the read emulation for now - we have no such
> usage
> >>>>>> case;
> >>>>>> b> add the check of p2m_t against p2m_ioreq_server in
> is_epte_present -
> >>>>>> a bit weird to me.
> >>>>>> Which one do you prefer? or any other suggestions?
> >>>>>
> >>>>> That question would also need to be asked to others who had
> >>>>> suggested supporting both. I'd be fine with a, but I also don't view
> >>>>> b as too awkward.
> >>>>
> >>>> According to Intel mannual, an entry is regarded as not present, if
> >>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
> >>>> will change its semantics, if this is acceptable(with no hurt to
> >>>> hypervisor). I'd prefer option b>
> >>>
> >>> Perhaps time for the VMX maintainers to chime in - such a change is
> acceptable
> >>> only if it doesn't result in changed hardware behavior. I can't think of any
> such off
> >>> the top of my head, but this really should be confirmed by the
> maintainers before
> >>> deciding to go such a route.
> >>>
> >>
> >> Thanks, Jan. :)
> >> Jun & Kevin, any suggestions?
> >>
> >
> > I'm a bit worried about this change, since it's a fundamental EPT
> > interface. Can we avoid supporting read emulation now?
> >
> 
> Thanks for your reply, Kevin.
> For XenGT, either way would be fine. But I do not think changing
> is_epte_present will introduce any bug.
> 
> Paul, do you have any preference?
>

Unless the code is nailed down in the next couple of days, I don't think any implementation is going to make it into 4.7, so I'd prefer to go with the simplest option, and not supporting read emulation makes things a lot simpler.

  Paul
 
> Yu

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  8:46                 ` Paul Durrant
@ 2016-04-19  9:27                   ` Yu, Zhang
  2016-04-19  9:40                     ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:27 UTC (permalink / raw)
  To: Paul Durrant, Kevin Tian, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan



On 4/19/2016 4:46 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
>> Sent: 19 April 2016 05:51
>> To: Yu, Zhang; Jan Beulich; Paul Durrant; Nakajima, Jun
>> Cc: Andrew Cooper; George Dunlap; Lv, Zhiyuan; xen-devel@lists.xen.org;
>> Keir (Xen.org); Tim (Xen.org)
>> Subject: RE: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>> Sent: Thursday, April 14, 2016 5:57 PM
>>>>>>>> And with all three bits now possibly being clear, aren't we risking the
>>>>>>>> entries to be mis-treated as not-present ones?
>>>>>>>
>>>>>>> Hah. You got me. Thanks! :)
>>>>>>> Now I realized it would be difficult if we wanna to emulate the read
>>>>>>> operations for HVM. According to Intel mannual, entry->r is to be
>>>>>>> cleared, so should entry->w if we do not want ept misconfig. And
>>>>>>> with both read and write permissions being forbidden, entry->x can
>> be
>>>>>>> set only on processors with EXECUTE_ONLY capability.
>>>>>>> To avoid any entry to be mis-treated as not-present. We have several
>>>>>>> solutions:
>>>>>>> a> do not support the read emulation for now - we have no such
>> usage
>>>>>>> case;
>>>>>>> b> add the check of p2m_t against p2m_ioreq_server in
>> is_epte_present -
>>>>>>> a bit weird to me.
>>>>>>> Which one do you prefer? or any other suggestions?
>>>>>>
>>>>>> That question would also need to be asked to others who had
>>>>>> suggested supporting both. I'd be fine with a, but I also don't view
>>>>>> b as too awkward.
>>>>>
>>>>> According to Intel mannual, an entry is regarded as not present, if
>>>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
>>>>> will change its semantics, if this is acceptable(with no hurt to
>>>>> hypervisor). I'd prefer option b>
>>>>
>>>> Perhaps time for the VMX maintainers to chime in - such a change is
>> acceptable
>>>> only if it doesn't result in changed hardware behavior. I can't think of any
>> such off
>>>> the top of my head, but this really should be confirmed by the
>> maintainers before
>>>> deciding to go such a route.
>>>>
>>>
>>> Thanks, Jan. :)
>>> Jun & Kevin, any suggestions?
>>>
>>
>> I'm a bit worried about this change, since it's a fundamental EPT
>> interface. Can we avoid supporting read emulation now?
>>
>
> I'm happy to drop the implementation of read emulation for the moment to keep things simple, as long as it is kept in the interface.
>

Thanks, Paul.
So what's the benefit of keeping the read in the interface, if we
cannot support its emulation?

It could be easier to change the is_epte_present definition than
to remove the read emulation code, but either way would not be
difficult.

Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:27                   ` Yu, Zhang
@ 2016-04-19  9:40                     ` Paul Durrant
  2016-04-19  9:49                       ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19  9:40 UTC (permalink / raw)
  To: Yu, Zhang, Kevin Tian, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 10:27
> To: Paul Durrant; Kevin Tian; Jan Beulich; Nakajima, Jun
> Cc: Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George Dunlap; xen-
> devel@lists.xen.org; Lv, Zhiyuan
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> 
> 
> On 4/19/2016 4:46 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> >> Sent: 19 April 2016 05:51
> >> To: Yu, Zhang; Jan Beulich; Paul Durrant; Nakajima, Jun
> >> Cc: Andrew Cooper; George Dunlap; Lv, Zhiyuan; xen-
> devel@lists.xen.org;
> >> Keir (Xen.org); Tim (Xen.org)
> >> Subject: RE: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> >> map guest ram with p2m_ioreq_server to an ioreq server
> >>
> >>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >>> Sent: Thursday, April 14, 2016 5:57 PM
> >>>>>>>> And with all three bits now possibly being clear, aren't we risking
> the
> >>>>>>>> entries to be mis-treated as not-present ones?
> >>>>>>>
> >>>>>>> Hah. You got me. Thanks! :)
> >>>>>>> Now I realized it would be difficult if we wanna to emulate the read
> >>>>>>> operations for HVM. According to Intel mannual, entry->r is to be
> >>>>>>> cleared, so should entry->w if we do not want ept misconfig. And
> >>>>>>> with both read and write permissions being forbidden, entry->x
> can
> >> be
> >>>>>>> set only on processors with EXECUTE_ONLY capability.
> >>>>>>> To avoid any entry to be mis-treated as not-present. We have
> several
> >>>>>>> solutions:
> >>>>>>> a> do not support the read emulation for now - we have no such
> >> usage
> >>>>>>> case;
> >>>>>>> b> add the check of p2m_t against p2m_ioreq_server in
> >> is_epte_present -
> >>>>>>> a bit weird to me.
> >>>>>>> Which one do you prefer? or any other suggestions?
> >>>>>>
> >>>>>> That question would also need to be asked to others who had
> >>>>>> suggested supporting both. I'd be fine with a, but I also don't view
> >>>>>> b as too awkward.
> >>>>>
> >>>>> According to Intel mannual, an entry is regarded as not present, if
> >>>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
> >>>>> will change its semantics, if this is acceptable(with no hurt to
> >>>>> hypervisor). I'd prefer option b>
> >>>>
> >>>> Perhaps time for the VMX maintainers to chime in - such a change is
> >> acceptable
> >>>> only if it doesn't result in changed hardware behavior. I can't think of
> any
> >> such off
> >>>> the top of my head, but this really should be confirmed by the
> >> maintainers before
> >>>> deciding to go such a route.
> >>>>
> >>>
> >>> Thanks, Jan. :)
> >>> Jun & Kevin, any suggestions?
> >>>
> >>
> >> I'm a bit worried about this change, since it's a fundamental EPT
> >> interface. Can we avoid supporting read emulation now?
> >>
> >
> > I'm happy to drop the implementation of read emulation for the moment
> to keep things simple, as long as it is kept in the interface.
> >
> 
> Thanks, Paul.
> So what's the benefit to keep the read in the interface, if we can
> not support its emulation?
>

Well, if it's in the interface then support can be added in future. If it's not in the interface then it needs to be added later and that makes things more awkward compatibility-wise.

  Paul
 
> It could be easier to change the is_epte_present definition than
> to remove the read emulation code, but either way would not be
> difficult.
> 
> Yu

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:21                     ` Paul Durrant
@ 2016-04-19  9:44                       ` Yu, Zhang
  2016-04-19 10:05                         ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:44 UTC (permalink / raw)
  To: Paul Durrant, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima



On 4/19/2016 5:21 PM, Paul Durrant wrote:
>> -----Original Message-----
> [snip]
>>>> Does any other maintainers have any suggestions?
>>>
>>> Note that it is a requirement that an ioreq server be disabled before VM
>> suspend. That means ioreq server pages essentially have to go back to
>> ram_rw semantics.
>>>
>>>     Paul
>>>
>>
>> OK. So it should be hypervisor's responsibility to do the resetting.
>> Now we probably have 2 choices:
>> 1> we reset the p2m type synchronously when ioreq server unmapping
>> happens, instead of deferring to the misconfig handling part. This
>> means performance impact to traverse the p2m table.
>>
>
> Do we need to reset at all. The p2m type does need to be transferred, it will just be unclaimed on the far end (i.e. the pages are treated as r/w ram) until the emulator starts up there. If that cannot be done without creating yet another p2m type to handle logdirty (which seems a suboptimal way of dealing with it) then I think migration needs to be disallowed on any domain than contains any ioreq_server type pages at this stage.
>
>    Paul
>

Yes, we need to - either the device model or the hypervisor should
guarantee there are no p2m_ioreq_server pages left after an ioreq server
is unmapped from this type (which is write-protected in such a scenario);
otherwise their emulation might be forwarded to another, unexpected
device model which claims p2m_ioreq_server later.

So I guess approach 2> is your suggestion now.

Besides, previously, Jan also questioned the necessity of resetting the
p2m type when an ioreq server is mapped to p2m_ioreq_server. His
argument is that we should only allow such a p2m transition after an
ioreq server has already been mapped to p2m_ioreq_server. I think his
point also sounds reasonable.

Thanks
Yu

>> Or
>> 2> we just disallow live migration when p2m->ioreq.server is not NULL.
>> This is not quite accurate, because having p2m->ioreq.server mapped
>> to p2m_ioreq_server does not necessarily means there would be such
>> outstanding entries. To be more accurate, we can add some other rough
>> check, e.g. both check if p2m->ioreq.server against NULL and check if
>> the hvmop_set_mem_type has ever been triggered once for the
>> p2m_ioreq_server type.
>>
>> Both choice seems suboptimal for me. And I wonder if we have any
>> better solutions?
>>
>> Thanks
>> Yu
>>
>>>> Thanks in advance! :)
>>>>>> If the answer is, "everything just works", that's perfect.
>>>>>>
>>>>>> If the answer is, "Before logdirty mode is set, the ioreq server has the
>>>>>> opportunity to detach, removing the p2m_ioreq_server entries, and
>>>>>> operating without that functionality", that's good too.
>>>>>>
>>>>>> If the answer is, "the live migration request fails and the guest
>>>>>> continues to run", that's also acceptable.  If you want this series to
>>>>>> be checked in today (the last day for 4.7), this is probably your best
>>>>>> bet.
>>>>>>
>>>>>>     -George
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> Regards
>>>> Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:40                     ` Paul Durrant
@ 2016-04-19  9:49                       ` Yu, Zhang
  2016-04-19 10:01                         ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:49 UTC (permalink / raw)
  To: Paul Durrant, Kevin Tian, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan



On 4/19/2016 5:40 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 19 April 2016 10:27
>> To: Paul Durrant; Kevin Tian; Jan Beulich; Nakajima, Jun
>> Cc: Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George Dunlap; xen-
>> devel@lists.xen.org; Lv, Zhiyuan
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>>
>>
>> On 4/19/2016 4:46 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
>>>> Sent: 19 April 2016 05:51
>>>> To: Yu, Zhang; Jan Beulich; Paul Durrant; Nakajima, Jun
>>>> Cc: Andrew Cooper; George Dunlap; Lv, Zhiyuan; xen-
>> devel@lists.xen.org;
>>>> Keir (Xen.org); Tim (Xen.org)
>>>> Subject: RE: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>>>> map guest ram with p2m_ioreq_server to an ioreq server
>>>>
>>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>>> Sent: Thursday, April 14, 2016 5:57 PM
>>>>>>>>>> And with all three bits now possibly being clear, aren't we risking
>> the
>>>>>>>>>> entries to be mis-treated as not-present ones?
>>>>>>>>>
>>>>>>>>> Hah. You got me. Thanks! :)
>>>>>>>>> Now I realized it would be difficult if we wanna to emulate the read
>>>>>>>>> operations for HVM. According to Intel mannual, entry->r is to be
>>>>>>>>> cleared, so should entry->w if we do not want ept misconfig. And
>>>>>>>>> with both read and write permissions being forbidden, entry->x
>> can
>>>> be
>>>>>>>>> set only on processors with EXECUTE_ONLY capability.
>>>>>>>>> To avoid any entry to be mis-treated as not-present. We have
>> several
>>>>>>>>> solutions:
>>>>>>>>> a> do not support the read emulation for now - we have no such
>>>> usage
>>>>>>>>> case;
>>>>>>>>> b> add the check of p2m_t against p2m_ioreq_server in
>>>> is_epte_present -
>>>>>>>>> a bit weird to me.
>>>>>>>>> Which one do you prefer? or any other suggestions?
>>>>>>>>
>>>>>>>> That question would also need to be asked to others who had
>>>>>>>> suggested supporting both. I'd be fine with a, but I also don't view
>>>>>>>> b as too awkward.
>>>>>>>
>>>>>>> According to Intel mannual, an entry is regarded as not present, if
>>>>>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means we
>>>>>>> will change its semantics, if this is acceptable(with no hurt to
>>>>>>> hypervisor). I'd prefer option b>
>>>>>>
>>>>>> Perhaps time for the VMX maintainers to chime in - such a change is
>>>> acceptable
>>>>>> only if it doesn't result in changed hardware behavior. I can't think of
>> any
>>>> such off
>>>>>> the top of my head, but this really should be confirmed by the
>>>> maintainers before
>>>>>> deciding to go such a route.
>>>>>>
>>>>>
>>>>> Thanks, Jan. :)
>>>>> Jun & Kevin, any suggestions?
>>>>>
>>>>
>>>> I'm a bit worried about this change, since it's a fundamental EPT
>>>> interface. Can we avoid supporting read emulation now?
>>>>
>>>
>>> I'm happy to drop the implementation of read emulation for the moment
>> to keep things simple, as long as it is kept in the interface.
>>>
>>
>> Thanks, Paul.
>> So what's the benefit to keep the read in the interface, if we can
>> not support its emulation?
>>
>
> Well, if it's in the interface then support can be added in future. If it's not in the interface then it needs to be added later and that makes things more awkward compatibility-wise.
>
>    Paul
>

But if we need to support read emulation in the future, we'll have to
clear bits 0:2 in the EPT PTE, which is the same situation we are facing
now. So why do we not support it now?

Sorry for seeming so stubborn, I just want to have some more convincing
explanations. :)

Yu

>> It could be easier to change the is_epte_present definition than
>> to remove the read emulation code, but either way would not be
>> difficult.
>>
>> Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19 10:01                         ` Paul Durrant
@ 2016-04-19  9:54                           ` Yu, Zhang
  0 siblings, 0 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19  9:54 UTC (permalink / raw)
  To: Paul Durrant, Kevin Tian, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan



On 4/19/2016 6:01 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 19 April 2016 10:49
>> To: Paul Durrant; Kevin Tian; Jan Beulich; Nakajima, Jun
>> Cc: Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George Dunlap; xen-
>> devel@lists.xen.org; Lv, Zhiyuan
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>>
>>
>> On 4/19/2016 5:40 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>> Sent: 19 April 2016 10:27
>>>> To: Paul Durrant; Kevin Tian; Jan Beulich; Nakajima, Jun
>>>> Cc: Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George Dunlap; xen-
>>>> devel@lists.xen.org; Lv, Zhiyuan
>>>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>>>> map guest ram with p2m_ioreq_server to an ioreq server
>>>>
>>>>
>>>>
>>>> On 4/19/2016 4:46 PM, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
>>>>>> Sent: 19 April 2016 05:51
>>>>>> To: Yu, Zhang; Jan Beulich; Paul Durrant; Nakajima, Jun
>>>>>> Cc: Andrew Cooper; George Dunlap; Lv, Zhiyuan; xen-
>>>> devel@lists.xen.org;
>>>>>> Keir (Xen.org); Tim (Xen.org)
>>>>>> Subject: RE: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP
>> to
>>>>>> map guest ram with p2m_ioreq_server to an ioreq server
>>>>>>
>>>>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>>>>> Sent: Thursday, April 14, 2016 5:57 PM
>>>>>>>>>>>> And with all three bits now possibly being clear, aren't we
>> risking
>>>> the
>>>>>>>>>>>> entries to be mis-treated as not-present ones?
>>>>>>>>>>>
>>>>>>>>>>> Hah. You got me. Thanks! :)
>>>>>>>>>>> Now I realized it would be difficult if we wanna to emulate the
>> read
>>>>>>>>>>> operations for HVM. According to Intel mannual, entry->r is to
>> be
>>>>>>>>>>> cleared, so should entry->w if we do not want ept misconfig.
>> And
>>>>>>>>>>> with both read and write permissions being forbidden, entry->x
>>>> can
>>>>>> be
>>>>>>>>>>> set only on processors with EXECUTE_ONLY capability.
>>>>>>>>>>> To avoid any entry to be mis-treated as not-present. We have
>>>> several
>>>>>>>>>>> solutions:
>>>>>>>>>>> a> do not support the read emulation for now - we have no such
>>>>>> usage
>>>>>>>>>>> case;
>>>>>>>>>>> b> add the check of p2m_t against p2m_ioreq_server in
>>>>>> is_epte_present -
>>>>>>>>>>> a bit weird to me.
>>>>>>>>>>> Which one do you prefer? or any other suggestions?
>>>>>>>>>>
>>>>>>>>>> That question would also need to be asked to others who had
>>>>>>>>>> suggested supporting both. I'd be fine with a, but I also don't view
>>>>>>>>>> b as too awkward.
>>>>>>>>>
>>>>>>>>> According to Intel mannual, an entry is regarded as not present, if
>>>>>>>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means
>> we
>>>>>>>>> will change its semantics, if this is acceptable(with no hurt to
>>>>>>>>> hypervisor). I'd prefer option b>
>>>>>>>>
>>>>>>>> Perhaps time for the VMX maintainers to chime in - such a change is
>>>>>> acceptable
>>>>>>>> only if it doesn't result in changed hardware behavior. I can't think of
>>>> any
>>>>>> such off
>>>>>>>> the top of my head, but this really should be confirmed by the
>>>>>> maintainers before
>>>>>>>> deciding to go such a route.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks, Jan. :)
>>>>>>> Jun & Kevin, any suggestions?
>>>>>>>
>>>>>>
>>>>>> I'm a bit worried about this change, since it's a fundamental EPT
>>>>>> interface. Can we avoid supporting read emulation now?
>>>>>>
>>>>>
>>>>> I'm happy to drop the implementation of read emulation for the
>> moment
>>>> to keep things simple, as long as it is kept in the interface.
>>>>>
>>>>
>>>> Thanks, Paul.
>>>> So what's the benefit to keep the read in the interface, if we can
>>>> not support its emulation?
>>>>
>>>
>>> Well, if it's in the interface then support can be added in future. If it's not in
>> the interface then it needs to be added later and that makes things more
>> awkward compatibility-wise.
>>>
>>>     Paul
>>>
>>
>> But if we need to support read emulation in the future, we'll have to
>> clear bit0:2 in ept pte, which is the same situation we are facing
>> now. So why do we not support it now?
>>
>> Sorry for seeming so stubborn, I just want to have some more convincing
>> explanations. :)
>>
>
> That's an argument you need to have with Kevin. If he thinks the risk of making the change now is small then go ahead. All I'm saying is that 4.7 is about to go out of the door so whatever happens now needs to be as low risk and as minimally disruptive as possible.
>

Got it. Thank you, Paul. :)

Yu

>    Paul
>
>> Yu
>>
>>>> It could be easier to change the is_epte_present definition than
>>>> to remove the read emulation code, but either way would not be
>>>> difficult.
>>>>
>>>> Yu


* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:49                       ` Yu, Zhang
@ 2016-04-19 10:01                         ` Paul Durrant
  2016-04-19  9:54                           ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19 10:01 UTC (permalink / raw)
  To: Yu, Zhang, Kevin Tian, Jan Beulich, Nakajima, Jun
  Cc: Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Lv, Zhiyuan

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 10:49
> To: Paul Durrant; Kevin Tian; Jan Beulich; Nakajima, Jun
> Cc: Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George Dunlap; xen-
> devel@lists.xen.org; Lv, Zhiyuan
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> 
> 
> On 4/19/2016 5:40 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >> Sent: 19 April 2016 10:27
> >> To: Paul Durrant; Kevin Tian; Jan Beulich; Nakajima, Jun
> >> Cc: Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George Dunlap; xen-
> >> devel@lists.xen.org; Lv, Zhiyuan
> >> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> >> map guest ram with p2m_ioreq_server to an ioreq server
> >>
> >>
> >>
> >> On 4/19/2016 4:46 PM, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> >>>> Sent: 19 April 2016 05:51
> >>>> To: Yu, Zhang; Jan Beulich; Paul Durrant; Nakajima, Jun
> >>>> Cc: Andrew Cooper; George Dunlap; Lv, Zhiyuan; xen-
> >> devel@lists.xen.org;
> >>>> Keir (Xen.org); Tim (Xen.org)
> >>>> Subject: RE: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP
> to
> >>>> map guest ram with p2m_ioreq_server to an ioreq server
> >>>>
> >>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >>>>> Sent: Thursday, April 14, 2016 5:57 PM
> >>>>>>>>>> And with all three bits now possibly being clear, aren't we
> risking
> >> the
> >>>>>>>>>> entries to be mis-treated as not-present ones?
> >>>>>>>>>
> >>>>>>>>> Hah. You got me. Thanks! :)
> >>>>>>>>> Now I realized it would be difficult if we wanna to emulate the
> read
> >>>>>>>>> operations for HVM. According to Intel mannual, entry->r is to
> be
> >>>>>>>>> cleared, so should entry->w if we do not want ept misconfig.
> And
> >>>>>>>>> with both read and write permissions being forbidden, entry->x
> >> can
> >>>> be
> >>>>>>>>> set only on processors with EXECUTE_ONLY capability.
> >>>>>>>>> To avoid any entry to be mis-treated as not-present. We have
> >> several
> >>>>>>>>> solutions:
> >>>>>>>>> a> do not support the read emulation for now - we have no such
> >>>> usage
> >>>>>>>>> case;
> >>>>>>>>> b> add the check of p2m_t against p2m_ioreq_server in
> >>>> is_epte_present -
> >>>>>>>>> a bit weird to me.
> >>>>>>>>> Which one do you prefer? or any other suggestions?
> >>>>>>>>
> >>>>>>>> That question would also need to be asked to others who had
> >>>>>>>> suggested supporting both. I'd be fine with a, but I also don't view
> >>>>>>>> b as too awkward.
> >>>>>>>
> >>>>>>> According to Intel mannual, an entry is regarded as not present, if
> >>>>>>> bit0:2 is 0. So adding a p2m type check in is_epte_present() means
> we
> >>>>>>> will change its semantics, if this is acceptable(with no hurt to
> >>>>>>> hypervisor). I'd prefer option b>
> >>>>>>
> >>>>>> Perhaps time for the VMX maintainers to chime in - such a change is
> >>>> acceptable
> >>>>>> only if it doesn't result in changed hardware behavior. I can't think of
> >> any
> >>>> such off
> >>>>>> the top of my head, but this really should be confirmed by the
> >>>> maintainers before
> >>>>>> deciding to go such a route.
> >>>>>>
> >>>>>
> >>>>> Thanks, Jan. :)
> >>>>> Jun & Kevin, any suggestions?
> >>>>>
> >>>>
> >>>> I'm a bit worried about this change, since it's a fundamental EPT
> >>>> interface. Can we avoid supporting read emulation now?
> >>>>
> >>>
> >>> I'm happy to drop the implementation of read emulation for the
> moment
> >> to keep things simple, as long as it is kept in the interface.
> >>>
> >>
> >> Thanks, Paul.
> >> So what's the benefit to keep the read in the interface, if we can
> >> not support its emulation?
> >>
> >
> > Well, if it's in the interface then support can be added in future. If it's not in
> the interface then it needs to be added later and that makes things more
> awkward compatibility-wise.
> >
> >    Paul
> >
> 
> But if we need to support read emulation in the future, we'll have to
> clear bits 0:2 in the EPT PTE, which is the same situation we are facing
> now. So why not support it now?
> 
> Sorry for seeming so stubborn, I just want to have some more convincing
> explanations. :)
> 

That's an argument you need to have with Kevin. If he thinks the risk of making the change now is small then go ahead. All I'm saying is that 4.7 is about to go out of the door so whatever happens now needs to be as low risk and as minimally disruptive as possible.

  Paul

> Yu
> 
> >> It could be easier to change the is_epte_present definition than
> >> to remove the read emulation code, but either way would not be
> >> difficult.
> >>
> >> Yu
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
> >

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19  9:44                       ` Yu, Zhang
@ 2016-04-19 10:05                         ` Paul Durrant
  2016-04-19 11:17                           ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19 10:05 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 10:44
> To: Paul Durrant; George Dunlap; xen-devel@lists.xen.org
> Cc: Kevin Tian; Jan Beulich; Andrew Cooper; Tim (Xen.org); Lv, Zhiyuan;
> jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> 
> 
> On 4/19/2016 5:21 PM, Paul Durrant wrote:
> >> -----Original Message-----
> > [snip]
> >>>>>> Do any other maintainers have any suggestions?
> >>>
> >>> Note that it is a requirement that an ioreq server be disabled before VM
> >> suspend. That means ioreq server pages essentially have to go back to
> >> ram_rw semantics.
> >>>
> >>>     Paul
> >>>
> >>
> >> OK. So it should be hypervisor's responsibility to do the resetting.
> >> Now we probably have 2 choices:
> >> 1> we reset the p2m type synchronously when ioreq server unmapping
> >> happens, instead of deferring to the misconfig handling part. This
> >> means a performance impact from traversing the p2m table.
> >>
> >
> > Do we need to reset at all? The p2m type does need to be transferred, it
> will just be unclaimed on the far end (i.e. the pages are treated as r/w ram)
> until the emulator starts up there. If that cannot be done without creating
> yet another p2m type to handle logdirty (which seems a suboptimal way of
> dealing with it) then I think migration needs to be disallowed on any domain
> that contains any ioreq_server type pages at this stage.
> >
> >    Paul
> >
> 
> Yes, we need to - either the device model or the hypervisor should guarantee
> there are no p2m_ioreq_server pages left after an ioreq server is unmapped
> from this type (which is write protected in such a scenario), otherwise
> their emulation might be forwarded to other unexpected device models which
> claim the p2m_ioreq_server type later.

That should be for the device model to guarantee IMO. If the 'wrong' emulator claims the ioreq server type then I don't think that's Xen's problem.

> 
> So I guess approach 2> is your suggestion now.
> 
> Besides, previously, Jan also questioned the necessity of resetting the
> p2m type when an ioreq server is mapped to p2m_ioreq_server. His
> argument is that we should only allow such a p2m transition after an
> ioreq server has already been mapped to this p2m_ioreq_server. I think
> his point also sounds reasonable.
> 

I was kind of hoping to avoid that ordering dependency but if it makes things simpler then so be it.

  Paul

> Thanks
> Yu
> 
> >> Or
> >> 2> we just disallow live migration when p2m->ioreq.server is not NULL.
> >> This is not quite accurate, because having p2m->ioreq.server mapped
> >> to p2m_ioreq_server does not necessarily mean there would be such
> >> outstanding entries. To be more accurate, we can add some other rough
> >> check, e.g. both check p2m->ioreq.server against NULL and check if
> >> hvmop_set_mem_type has ever been triggered for the
> >> p2m_ioreq_server type.
> >>
> >> Both choices seem suboptimal to me. And I wonder if we have any
> >> better solutions?
> >>
> >> Thanks
> >> Yu
> >>
> >>>> Thanks in advance! :)
> >>>>>> If the answer is, "everything just works", that's perfect.
> >>>>>>
> >>>>>> If the answer is, "Before logdirty mode is set, the ioreq server has
> the
> >>>>>> opportunity to detach, removing the p2m_ioreq_server entries, and
> >>>>>> operating without that functionality", that's good too.
> >>>>>>
> >>>>>> If the answer is, "the live migration request fails and the guest
> >>>>>> continues to run", that's also acceptable.  If you want this series to
> >>>>>> be checked in today (the last day for 4.7), this is probably your best
> >>>>>> bet.
> >>>>>>
> >>>>>>     -George
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>> Regards
> >>>> Yu
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@lists.xen.org
> >>> http://lists.xen.org/xen-devel
> >>>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-18 16:58             ` Paul Durrant
@ 2016-04-19 11:02               ` Yu, Zhang
  2016-04-19 11:15                 ` Paul Durrant
  2016-04-20 15:02                 ` George Dunlap
  0 siblings, 2 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19 11:02 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, jun.nakajima



On 4/19/2016 12:58 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
>> Beulich
>> Sent: 18 April 2016 17:47
>> To: Paul Durrant
>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
>> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
>> zhiyuan.lv@intel.com; jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>> p2m_mmio_write_dm to p2m_ioreq_server
>>
>>>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
>>> The original patch was posted before the cut-off IIRC so I'm not sure
>>> of the policy regarding freeze-exceptions.
>>
>>    It was submitted before the feature freeze, yes, but didn't make it in by
>> code freeze. So it's my understanding that an exception would be needed.
>>
>
> Ok. Thanks for the clarification. IMO getting this in is worth the freeze exception... it's a shame p2m_mmio_write_dm made it into 4.6.1. It needs to go before it proliferates any further.
>

Thanks, Paul.

So I suppose the only place we need to change for this patch is
the definition of hvmmem_type_t, which should look like this?

typedef enum {
     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
#if __XEN_INTERFACE_VERSION__ >= 0x00040700
     HVMMEM_ioreq_server
#else
     HVMMEM_mmio_write_dm
#endif
} hvmmem_type_t;
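As a quick sanity check on the gating above, the fourth enumerator keeps the same numeric value (3) under either interface version, so callers built against the old HVMMEM_mmio_write_dm name stay ABI-compatible. A minimal standalone sketch, where the #define merely stands in for whatever the build system would set (it is not part of the real header):

```c
#include <assert.h>

/* Stand-in for the build-time interface version; flip it below
 * 0x00040700 to get the legacy enumerator name instead. */
#define __XEN_INTERFACE_VERSION__ 0x00040700

typedef enum {
    HVMMEM_ram_rw,             /* Normal read/write guest RAM */
    HVMMEM_ram_ro,             /* Read-only; writes are discarded */
    HVMMEM_mmio_dm,            /* Reads and writes go to the device model */
#if __XEN_INTERFACE_VERSION__ >= 0x00040700
    HVMMEM_ioreq_server        /* New name, value 3 */
#else
    HVMMEM_mmio_write_dm       /* Legacy name, also value 3 */
#endif
} hvmmem_type_t;
```

Whichever branch the preprocessor takes, only the name changes, not the value.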

Besides, does 4.7 still accept freeze exceptions? It would be great
if we could get an approval for this.

Thanks
Yu

>    Paul
>
>> Jan
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-19 11:02               ` Yu, Zhang
@ 2016-04-19 11:15                 ` Paul Durrant
  2016-04-19 11:38                   ` Yu, Zhang
  2016-04-20 15:02                 ` George Dunlap
  1 sibling, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19 11:15 UTC (permalink / raw)
  To: Yu, Zhang, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 12:03
> To: Paul Durrant; Jan Beulich; Wei Liu
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
> Dunlap; xen-devel@lists.xen.org; zhiyuan.lv@intel.com;
> jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> 
> 
> On 4/19/2016 12:58 AM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Jan
> >> Beulich
> >> Sent: 18 April 2016 17:47
> >> To: Paul Durrant
> >> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
> >> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
> >> zhiyuan.lv@intel.com; jun.nakajima@intel.com
> >> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >> p2m_mmio_write_dm to p2m_ioreq_server
> >>
> >>>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
> >>> The original patch was posted before the cut-off IIRC so I'm not sure
> >>> of the policy regarding freeze-exceptions.
> >>
> >>    It was submitted before the feature freeze, yes, but didn't make it in by
> >> code freeze. So it's my understanding that an exception would be
> needed.
> >>
> >
> > Ok. Thanks for the clarification. IMO getting this in is worth the freeze
> exception... it's a shame p2m_mmio_write_dm made it into 4.6.1. It needs to
> go before it proliferates any further.
> >
> 
> Thanks, Paul.
> 
> So I suppose the only place we need change for this patch is
> for hvmmem_type_t, which should be defined like this?
> 
> typedef enum {
>      HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>      HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>      HVMMEM_mmio_dm,            /* Reads and write go to the device model */
> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>      HVMMEM_ioreq_server
> #else
>      HVMMEM_mmio_write_dm
> #endif
> } hvmmem_type_t;
> 
> Besides, does 4.7 still accept freeze exception? It would be great
> if we can get an approval for this.

I talked to Wei earlier and he is happy to give a freeze exception to this change.

  Paul

> 
> Thanks
> Yu
> 
> >    Paul
> >
> >> Jan
> >>
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xen.org
> >> http://lists.xen.org/xen-devel
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
> >

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19 10:05                         ` Paul Durrant
@ 2016-04-19 11:17                           ` Yu, Zhang
  2016-04-19 11:47                             ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19 11:17 UTC (permalink / raw)
  To: Paul Durrant, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima



On 4/19/2016 6:05 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 19 April 2016 10:44
>> To: Paul Durrant; George Dunlap; xen-devel@lists.xen.org
>> Cc: Kevin Tian; Jan Beulich; Andrew Cooper; Tim (Xen.org); Lv, Zhiyuan;
>> jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>>
>>
>> On 4/19/2016 5:21 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>> [snip]
>>>>>> Do any other maintainers have any suggestions?
>>>>>
>>>>> Note that it is a requirement that an ioreq server be disabled before VM
>>>> suspend. That means ioreq server pages essentially have to go back to
>>>> ram_rw semantics.
>>>>>
>>>>>      Paul
>>>>>
>>>>
>>>> OK. So it should be hypervisor's responsibility to do the resetting.
>>>> Now we probably have 2 choices:
>>>> 1> we reset the p2m type synchronously when ioreq server unmapping
>>>> happens, instead of deferring to the misconfig handling part. This
>>>> means performance impact to traverse the p2m table.
>>>>
>>>
>>> Do we need to reset at all? The p2m type does need to be transferred, it
>> will just be unclaimed on the far end (i.e. the pages are treated as r/w ram)
>> until the emulator starts up there. If that cannot be done without creating
>> yet another p2m type to handle logdirty (which seems a suboptimal way of
>> dealing with it) then I think migration needs to be disallowed on any domain
>> that contains any ioreq_server type pages at this stage.
>>>
>>>     Paul
>>>
>>
>> Yes, we need to - either the device model or the hypervisor should guarantee
>> there are no p2m_ioreq_server pages left after an ioreq server is unmapped
>> from this type (which is write protected in such a scenario), otherwise
>> their emulation might be forwarded to other unexpected device models which
>> claim the p2m_ioreq_server type later.
>
> That should be for the device model to guarantee IMO. If the 'wrong' emulator claims the ioreq server type then I don't think that's Xen's problem.
>

Thanks, Paul.

So what about the VM suspend case you mentioned above? Will that trigger
the unmapping of the ioreq server? Could the device model also take the
role of changing the p2m type back in such a case?

It would be much simpler if the hypervisor side did not need to provide
the p2m resetting logic, and we could support live migration at the same
time then. :)


B.R.
Yu

>>
>> So I guess approach 2> is your suggestion now.
>>
>> Besides, previously, Jan also questioned the necessity of resetting the
>> p2m type when an ioreq server is mapping to the p2m_ioreq_server. His
>> argument is that we should only allow such p2m transition after an
>> ioreq server has already mapped to this p2m_ioreq_server. I think his
>> point sounds also reasonable.
>>
>
> I was kind of hoping to avoid that ordering dependency but if it makes things simpler then so be it.
>
>    Paul
>
>> Thanks
>> Yu
>>
>>>> Or
>>>> 2> we just disallow live migration when p2m->ioreq.server is not NULL.
>>>> This is not quite accurate, because having p2m->ioreq.server mapped
> >>>> to p2m_ioreq_server does not necessarily mean there would be such
>>>> outstanding entries. To be more accurate, we can add some other rough
>>>> check, e.g. both check if p2m->ioreq.server against NULL and check if
>>>> the hvmop_set_mem_type has ever been triggered once for the
>>>> p2m_ioreq_server type.
>>>>
> >>>> Both choices seem suboptimal to me. And I wonder if we have any
>>>> better solutions?
>>>>
>>>> Thanks
>>>> Yu
>>>>
>>>>>> Thanks in advance! :)
>>>>>>>> If the answer is, "everything just works", that's perfect.
>>>>>>>>
>>>>>>>> If the answer is, "Before logdirty mode is set, the ioreq server has
>> the
>>>>>>>> opportunity to detach, removing the p2m_ioreq_server entries, and
>>>>>>>> operating without that functionality", that's good too.
>>>>>>>>
>>>>>>>> If the answer is, "the live migration request fails and the guest
>>>>>>>> continues to run", that's also acceptable.  If you want this series to
>>>>>>>> be checked in today (the last day for 4.7), this is probably your best
>>>>>>>> bet.
>>>>>>>>
>>>>>>>>      -George
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Regards
>>>>>> Yu
>>>>> _______________________________________________
>>>>> Xen-devel mailing list
>>>>> Xen-devel@lists.xen.org
>>>>> http://lists.xen.org/xen-devel
>>>>>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-19 11:15                 ` Paul Durrant
@ 2016-04-19 11:38                   ` Yu, Zhang
  2016-04-19 11:50                     ` Paul Durrant
  2016-04-19 16:51                     ` Jan Beulich
  0 siblings, 2 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19 11:38 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, jun.nakajima



On 4/19/2016 7:15 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 19 April 2016 12:03
>> To: Paul Durrant; Jan Beulich; Wei Liu
>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
>> Dunlap; xen-devel@lists.xen.org; zhiyuan.lv@intel.com;
>> jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>> p2m_mmio_write_dm to p2m_ioreq_server
>>
>>
>>
>> On 4/19/2016 12:58 AM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>> Jan
>>>> Beulich
>>>> Sent: 18 April 2016 17:47
>>>> To: Paul Durrant
>>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
>>>> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
>>>> zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>>
>>>>>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
>>>>> The original patch was posted before the cut-off IIRC so I'm not sure
>>>>> of the policy regarding freeze-exceptions.
>>>>
>>>>     It was submitted before the feature freeze, yes, but didn't make it in by
>>>> code freeze. So it's my understanding that an exception would be
>> needed.
>>>>
>>>
>>> Ok. Thanks for the clarification. IMO getting this in is worth the freeze
>> exception... it's a shame p2m_mmio_write_dm made it into 4.6.1. It needs to
>> go before it proliferates any further.
>>>
>>
>> Thanks, Paul.
>>
>> So I suppose the only place we need change for this patch is
>> for hvmmem_type_t, which should be defined like this?
>>
>> typedef enum {
>>       HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>       HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>       HVMMEM_mmio_dm,            /* Reads and write go to the device model */
>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>       HVMMEM_ioreq_server
>> #else
>>       HVMMEM_mmio_write_dm
>> #endif
>> } hvmmem_type_t;
>>
>> Besides, does 4.7 still accept freeze exception? It would be great
>> if we can get an approval for this.
>
> I talked to Wei earlier and he is happy to give a freeze exception to this change.
>

Great! I'm really obliged. :)
BTW, does some form of application need to be submitted? I'm not
familiar with the procedure.

Yu
>    Paul
>
>>
>> Thanks
>> Yu
>>
>>>     Paul
>>>
>>>> Jan
>>>>
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xen.org
>>>> http://lists.xen.org/xen-devel
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel
>>>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19 11:17                           ` Yu, Zhang
@ 2016-04-19 11:47                             ` Paul Durrant
  2016-04-19 11:59                               ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-19 11:47 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 12:18
> To: Paul Durrant; George Dunlap; xen-devel@lists.xen.org
> Cc: Kevin Tian; Jan Beulich; Andrew Cooper; Tim (Xen.org); Lv, Zhiyuan;
> jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> 
> 
> On 4/19/2016 6:05 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >> Sent: 19 April 2016 10:44
> >> To: Paul Durrant; George Dunlap; xen-devel@lists.xen.org
> >> Cc: Kevin Tian; Jan Beulich; Andrew Cooper; Tim (Xen.org); Lv, Zhiyuan;
> >> jun.nakajima@intel.com
> >> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> >> map guest ram with p2m_ioreq_server to an ioreq server
> >>
> >>
> >>
> >> On 4/19/2016 5:21 PM, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>> [snip]
> >>>>>> Do any other maintainers have any suggestions?
> >>>>>
> >>>>> Note that it is a requirement that an ioreq server be disabled before
> VM
> >>>> suspend. That means ioreq server pages essentially have to go back to
> >>>> ram_rw semantics.
> >>>>>
> >>>>>      Paul
> >>>>>
> >>>>
> >>>> OK. So it should be hypervisor's responsibility to do the resetting.
> >>>> Now we probably have 2 choices:
> >>>> 1> we reset the p2m type synchronously when ioreq server unmapping
> >>>> happens, instead of deferring to the misconfig handling part. This
> >>>> means performance impact to traverse the p2m table.
> >>>>
> >>>
> >>> Do we need to reset at all? The p2m type does need to be transferred, it
> >> will just be unclaimed on the far end (i.e. the pages are treated as r/w
> ram)
> >> until the emulator starts up there. If that cannot be done without creating
> >> yet another p2m type to handle logdirty (which seems a suboptimal way
> of
> >> dealing with it) then I think migration needs to be disallowed on any
> domain
> >> that contains any ioreq_server type pages at this stage.
> >>>
> >>>     Paul
> >>>
> >>
> >> Yes, we need to - either the device model or the hypervisor should guarantee
> >> there are no p2m_ioreq_server pages left after an ioreq server is
> unmapped
> >> from this type (which is write protected in such a scenario), otherwise
> >> their emulation might be forwarded to other unexpected device models
> which
> >> claim the p2m_ioreq_server type later.
> >
> > That should be for the device model to guarantee IMO. If the 'wrong'
> emulator claims the ioreq server type then I don't think that's Xen's problem.
> >
> 
> Thanks, Paul.
> 
> So what about the VM suspend case you mentioned above? Will that trigger
> the unmapping of ioreq server? Could the device model also take the role
> to change the p2m type back in such case?

Yes. The device model has to be told by the toolstack that the VM is suspending; otherwise it can't disable the ioreq server, which is what puts the shared ioreq pages back into the guest p2m. If that's not done then the pages will be leaked.
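The ordering constraint described here can be modelled as a toy state machine: the shared ioreq pages only return to the guest p2m when the device model disables its server, so suspending beforehand would leak them. All names below are illustrative stand-ins, not real Xen or libxl API:

```c
#include <stdbool.h>

/* Toy model of the suspend ordering: the toolstack must have the
 * device model disable its ioreq server before the VM suspends,
 * because disabling is what hands the shared ioreq pages back to
 * the guest p2m. Hypothetical names, not Xen's actual interface. */

static bool server_enabled = true;
static bool pages_in_guest_p2m = false;

static void device_model_disable_ioreq_server(void)
{
    server_enabled = false;
    pages_in_guest_p2m = true;   /* shared pages handed back to the guest */
}

/* True if suspending right now would leak the shared ioreq pages. */
static bool suspend_would_leak(void)
{
    return server_enabled || !pages_in_guest_p2m;
}
```

In this sketch the toolstack's suspend path would call device_model_disable_ioreq_server() first and only proceed once suspend_would_leak() reports false.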

> 
> It would be much simpler if hypervisor side does not need to provide
> the p2m resetting logic, and we can support live migration at the same
> time then. :)
> 

That really should not be the hypervisor's job.

  Paul

> 
> B.R.
> Yu
> 
> >>
> >> So I guess approach 2> is your suggestion now.
> >>
> >> Besides, previously, Jan also questioned the necessity of resetting the
> >> p2m type when an ioreq server is mapping to the p2m_ioreq_server. His
> >> argument is that we should only allow such p2m transition after an
> >> ioreq server has already mapped to this p2m_ioreq_server. I think his
> >> point sounds also reasonable.
> >>
> >
> > I was kind of hoping to avoid that ordering dependency but if it makes
> things simpler then so be it.
> >
> >    Paul
> >
> >> Thanks
> >> Yu
> >>
> >>>> Or
> >>>> 2> we just disallow live migration when p2m->ioreq.server is not NULL.
> >>>> This is not quite accurate, because having p2m->ioreq.server mapped
> >>>> to p2m_ioreq_server does not necessarily mean there would be such
> >>>> outstanding entries. To be more accurate, we can add some other
> rough
> >>>> check, e.g. both check if p2m->ioreq.server against NULL and check if
> >>>> the hvmop_set_mem_type has ever been triggered once for the
> >>>> p2m_ioreq_server type.
> >>>>
> >>>> Both choices seem suboptimal to me. And I wonder if we have any
> >>>> better solutions?
> >>>>
> >>>> Thanks
> >>>> Yu
> >>>>
> >>>>>> Thanks in advance! :)
> >>>>>>>> If the answer is, "everything just works", that's perfect.
> >>>>>>>>
> >>>>>>>> If the answer is, "Before logdirty mode is set, the ioreq server has
> >> the
> >>>>>>>> opportunity to detach, removing the p2m_ioreq_server entries,
> and
> >>>>>>>> operating without that functionality", that's good too.
> >>>>>>>>
> >>>>>>>> If the answer is, "the live migration request fails and the guest
> >>>>>>>> continues to run", that's also acceptable.  If you want this series to
> >>>>>>>> be checked in today (the last day for 4.7), this is probably your
> best
> >>>>>>>> bet.
> >>>>>>>>
> >>>>>>>>      -George
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> Regards
> >>>>>> Yu
> >>>>> _______________________________________________
> >>>>> Xen-devel mailing list
> >>>>> Xen-devel@lists.xen.org
> >>>>> http://lists.xen.org/xen-devel
> >>>>>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-19 11:38                   ` Yu, Zhang
@ 2016-04-19 11:50                     ` Paul Durrant
  2016-04-19 16:51                     ` Jan Beulich
  1 sibling, 0 replies; 82+ messages in thread
From: Paul Durrant @ 2016-04-19 11:50 UTC (permalink / raw)
  To: Yu, Zhang, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 19 April 2016 12:39
> To: Paul Durrant; Jan Beulich; Wei Liu
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
> Dunlap; xen-devel@lists.xen.org; zhiyuan.lv@intel.com;
> jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> 
> 
> On 4/19/2016 7:15 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >> Sent: 19 April 2016 12:03
> >> To: Paul Durrant; Jan Beulich; Wei Liu
> >> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
> >> Dunlap; xen-devel@lists.xen.org; zhiyuan.lv@intel.com;
> >> jun.nakajima@intel.com
> >> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >> p2m_mmio_write_dm to p2m_ioreq_server
> >>
> >>
> >>
> >> On 4/19/2016 12:58 AM, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> Of
> >> Jan
> >>>> Beulich
> >>>> Sent: 18 April 2016 17:47
> >>>> To: Paul Durrant
> >>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
> >>>> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
> >>>> zhiyuan.lv@intel.com; jun.nakajima@intel.com
> >>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >>>> p2m_mmio_write_dm to p2m_ioreq_server
> >>>>
> >>>>>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
> >>>>> The original patch was posted before the cut-off IIRC so I'm not sure
> >>>>> of the policy regarding freeze-exceptions.
> >>>>
> >>>>     It was submitted before the feature freeze, yes, but didn't make it in
> by
> >>>> code freeze. So it's my understanding that an exception would be
> >> needed.
> >>>>
> >>>
> >>> Ok. Thanks for the clarification. IMO getting this in is worth the freeze
> >> exception... it's a shame p2m_mmio_write_dm made it into 4.6.1. It
> needs to
> >> go before it proliferates any further.
> >>>
> >>
> >> Thanks, Paul.
> >>
> >> So I suppose the only place we need change for this patch is
> >> for hvmmem_type_t, which should be defined like this?
> >>
> >> typedef enum {
> >>       HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >>       HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >>       HVMMEM_mmio_dm,            /* Reads and write go to the device model */
> >> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
> >>       HVMMEM_ioreq_server
> >> #else
> >>       HVMMEM_mmio_write_dm
> >> #endif
> >> } hvmmem_type_t;
> >>
> >> Besides, does 4.7 still accept freeze exception? It would be great
> >> if we can get an approval for this.
> >
> > I talked to Wei earlier and he is happy to give a freeze exception to this
> change.
> >
> 
> Great! I'm really obliged. :)
> BTW, does some form of application need to be submitted? I'm not
> familiar with the procedure.

Nor am I. I would expect that you should submit the patch with something like "PATCH for 4.7" in the subject.

  Paul

> 
> Yu
> >    Paul
> >
> >>
> >> Thanks
> >> Yu
> >>
> >>>     Paul
> >>>
> >>>> Jan
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Xen-devel mailing list
> >>>> Xen-devel@lists.xen.org
> >>>> http://lists.xen.org/xen-devel
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@lists.xen.org
> >>> http://lists.xen.org/xen-devel
> >>>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19 11:47                             ` Paul Durrant
@ 2016-04-19 11:59                               ` Yu, Zhang
  2016-04-20 14:50                                 ` George Dunlap
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-19 11:59 UTC (permalink / raw)
  To: Paul Durrant, George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	Lv, Zhiyuan, jun.nakajima



On 4/19/2016 7:47 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 19 April 2016 12:18
>> To: Paul Durrant; George Dunlap; xen-devel@lists.xen.org
>> Cc: Kevin Tian; Jan Beulich; Andrew Cooper; Tim (Xen.org); Lv, Zhiyuan;
>> jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>>
>>
>> On 4/19/2016 6:05 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>> Sent: 19 April 2016 10:44
>>>> To: Paul Durrant; George Dunlap; xen-devel@lists.xen.org
>>>> Cc: Kevin Tian; Jan Beulich; Andrew Cooper; Tim (Xen.org); Lv, Zhiyuan;
>>>> jun.nakajima@intel.com
>>>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>>>> map guest ram with p2m_ioreq_server to an ioreq server
>>>>
>>>>
>>>>
>>>> On 4/19/2016 5:21 PM, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>> [snip]
>>>>>>>> Does any other maintainers have any suggestions?
>>>>>>>
>>>>>>> Note that it is a requirement that an ioreq server be disabled before
>> VM
>>>>>> suspend. That means ioreq server pages essentially have to go back to
>>>>>> ram_rw semantics.
>>>>>>>
>>>>>>>       Paul
>>>>>>>
>>>>>>
>>>>>> OK. So it should be hypervisor's responsibility to do the resetting.
>>>>>> Now we probably have 2 choices:
>>>>>> 1> we reset the p2m type synchronously when ioreq server unmapping
>>>>>> happens, instead of deferring to the misconfig handling part. This
>>>>>> means performance impact to traverse the p2m table.
>>>>>>
>>>>>
> >>>>> Do we need to reset at all? The p2m type does need to be transferred; it
>>>> will just be unclaimed on the far end (i.e. the pages are treated as r/w
>> ram)
>>>> until the emulator starts up there. If that cannot be done without creating
>>>> yet another p2m type to handle logdirty (which seems a suboptimal way
>> of
>>>> dealing with it) then I think migration needs to be disallowed on any
>> domain
>>>> than contains any ioreq_server type pages at this stage.
>>>>>
>>>>>      Paul
>>>>>
>>>>
> >>>> Yes, we need to - either the device model or the hypervisor should
> >>>> guarantee there are no p2m_ioreq_server pages left after an ioreq
> >>>> server is unmapped from this type (which is write-protected in such a
> >>>> scenario); otherwise their emulation might be forwarded to another,
> >>>> unexpected device model that claims the p2m_ioreq_server type later.
>>>
>>> That should be for the device model to guarantee IMO. If the 'wrong'
>> emulator claims the ioreq server type then I don't think that's Xen's problem.
>>>
>>
>> Thanks, Paul.
>>
>> So what about the VM suspend case you mentioned above? Will that trigger
>> the unmapping of ioreq server? Could the device model also take the role
>> to change the p2m type back in such case?
>
> Yes. The device model has to be told by the toolstack that the VM is suspending; otherwise it can't disable the ioreq server, which is what puts the shared ioreq pages back into the guest p2m. If that's not done then the pages will be leaked.
>
>>
>> It would be much simpler if hypervisor side does not need to provide
>> the p2m resetting logic, and we can support live migration at the same
>> time then. :)
>>
>
> That really should not be hypervisor's job.
>
>    Paul
>

Oh. So let's just remove the p2m type recalculation code from this
patch: there's no need to call p2m_change_entry_type_global(), and no
need to worry about the logdirty part.

George, do you think this is acceptable?

BTW, if there's no need to call p2m_change_entry_type_global(), which
is not used in shadow mode, we can keep this p2m type in the shadow
code, right?

Thanks
Yu

>>
>> B.R.
>> Yu
>>
>>>>
>>>> So I guess approach 2> is your suggestion now.
>>>>
>>>> Besides, previously, Jan also questioned the necessity of resetting the
>>>> p2m type when an ioreq server is mapping to the p2m_ioreq_server. His
>>>> argument is that we should only allow such a p2m transition after an
>>>> ioreq server has already mapped to this p2m_ioreq_server. I think his
>>>> point also sounds reasonable.
>>>>
>>>
>>> I was kind of hoping to avoid that ordering dependency but if it makes
>> things simpler then so be it.
>>>
>>>     Paul
>>>
>>>> Thanks
>>>> Yu
>>>>
>>>>>> Or
>>>>>> 2> we just disallow live migration when p2m->ioreq.server is not NULL.
>>>>>> This is not quite accurate, because having p2m->ioreq.server mapped
>>>>>> to p2m_ioreq_server does not necessarily mean there would be such
>>>>>> outstanding entries. To be more accurate, we can add some other
>> rough
>>>>>> check, e.g. both check if p2m->ioreq.server against NULL and check if
>>>>>> the hvmop_set_mem_type has ever been triggered once for the
>>>>>> p2m_ioreq_server type.
>>>>>>
>>>>>> Both choices seem suboptimal to me. And I wonder if we have any
>>>>>> better solutions?
>>>>>>
>>>>>> Thanks
>>>>>> Yu
>>>>>>
>>>>>>>> Thanks in advance! :)
>>>>>>>>>> If the answer is, "everything just works", that's perfect.
>>>>>>>>>>
>>>>>>>>>> If the answer is, "Before logdirty mode is set, the ioreq server has
>>>> the
>>>>>>>>>> opportunity to detach, removing the p2m_ioreq_server entries,
>> and
>>>>>>>>>> operating without that functionality", that's good too.
>>>>>>>>>>
>>>>>>>>>> If the answer is, "the live migration request fails and the guest
>>>>>>>>>> continues to run", that's also acceptable.  If you want this series to
>>>>>>>>>> be checked in today (the last day for 4.7), this is probably your
>> best
>>>>>>>>>> bet.
>>>>>>>>>>
>>>>>>>>>>       -George
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Yu
>>>>>>> _______________________________________________
>>>>>>> Xen-devel mailing list
>>>>>>> Xen-devel@lists.xen.org
>>>>>>> http://lists.xen.org/xen-devel
>>>>>>>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-19 11:38                   ` Yu, Zhang
  2016-04-19 11:50                     ` Paul Durrant
@ 2016-04-19 16:51                     ` Jan Beulich
  2016-04-20 14:59                       ` Wei Liu
  1 sibling, 1 reply; 82+ messages in thread
From: Jan Beulich @ 2016-04-19 16:51 UTC (permalink / raw)
  To: Paul.Durrant, wei.liu2, yu.c.zhang
  Cc: kevin.tian, keir, Andrew.Cooper3, tim, George.Dunlap, xen-devel,
	zhiyuan.lv, jun.nakajima

>>> "Yu, Zhang" <yu.c.zhang@linux.intel.com> 04/19/16 1:46 PM >>>
>On 4/19/2016 7:15 PM, Paul Durrant wrote:
>> I talked to Wei earlier and he is happy to give a freeze exception to this change.
>
>Great! I'm really obliged. :)
>BTW, does some form of application need to be submitted? I'm not
>familiar with the procedure.

To add some clarification (hopefully; Wei, please correct me if I'm wrong): The
exception so far was granted only for this one patch, not the entire series.
Hence it would help if, in the re-submission (with what I think is the single
pending change needed to this particular patch), you'd put this first in the series (or
submit it on its own). And make sure you Cc Wei in addition to any of the
people you're normally expected to Cc.

Jan



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-19 11:59                               ` Yu, Zhang
@ 2016-04-20 14:50                                 ` George Dunlap
  2016-04-20 14:57                                   ` Paul Durrant
  2016-04-21 13:21                                   ` Yu, Zhang
  0 siblings, 2 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-20 14:50 UTC (permalink / raw)
  To: Yu, Zhang
  Cc: Kevin Tian, jun.nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Paul Durrant, Lv, Zhiyuan, Jan Beulich

On Tue, Apr 19, 2016 at 12:59 PM, Yu, Zhang <yu.c.zhang@linux.intel.com> wrote:
>>> So what about the VM suspend case you mentioned above? Will that trigger
>>> the unmapping of ioreq server? Could the device model also take the role
>>> to change the p2m type back in such case?
>>
>>
>> Yes. The device model has to be told by the toolstack that the VM is
>> suspending, otherwise it can't disable the ioreq server which puts the
>> shared ioreq pages back into the guest p2m. If that's not done then the
>> pages will be leaked.
>>
>>>
>>> It would be much simpler if hypervisor side does not need to provide
>>> the p2m resetting logic, and we can support live migration at the same
>>> time then. :)
>>>
>>
>> That really should not be hypervisor's job.
>>
>>    Paul
>>
>
> Oh. So let's just remove the p2m type recalculation code from this
> patch, no need to call p2m_change_entry_type_global, and no need to
> worry about the log dirty part.
>
> George, do you think this acceptable?

Sorry, just to make sure I understand your intentions:

1. The ioreq servers will be responsible for changing all the
p2m_ioreq_server types back to p2m_ram_rw before detaching
2. Xen will refuse to set logdirty mode if there are any attached ioreq servers

The one problem with #1 is what to do if the ioreq server is buggy /
killed, and forgets / is unable to clean up its ioreq_server entries?

There's a certain symmetry with the idea of having the ioreq server
responsible both for mapping and unmapping; and it would be nice to
have this stuff work for shadow mode as well.  But the automatic type
change seems like a more robust option.  (Still open to persuasion on
this.)

On another note: the hard freeze is long past the already-extended
deadline, so I was assuming this was 4.8 material at this point.

 -George


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-20 14:50                                 ` George Dunlap
@ 2016-04-20 14:57                                   ` Paul Durrant
  2016-04-20 15:37                                     ` George Dunlap
  2016-04-21 13:21                                   ` Yu, Zhang
  1 sibling, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-20 14:57 UTC (permalink / raw)
  To: George Dunlap, Yu, Zhang
  Cc: Kevin Tian, jun.nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Lv, Zhiyuan, Jan Beulich

> -----Original Message-----
> From: George Dunlap
> Sent: 20 April 2016 15:50
> To: Yu, Zhang
> Cc: Paul Durrant; xen-devel@lists.xen.org; Kevin Tian; Jan Beulich; Andrew
> Cooper; Tim (Xen.org); Lv, Zhiyuan; jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> On Tue, Apr 19, 2016 at 12:59 PM, Yu, Zhang <yu.c.zhang@linux.intel.com>
> wrote:
> >>> So what about the VM suspend case you mentioned above? Will that
> trigger
> >>> the unmapping of ioreq server? Could the device model also take the
> role
> >>> to change the p2m type back in such case?
> >>
> >>
> >> Yes. The device model has to be told by the toolstack that the VM is
> >> suspending, otherwise it can't disable the ioreq server which puts the
> >> shared ioreq pages back into the guest p2m. If that's not done then the
> >> pages will be leaked.
> >>
> >>>
> >>> It would be much simpler if hypervisor side does not need to provide
> >>> the p2m resetting logic, and we can support live migration at the same
> >>> time then. :)
> >>>
> >>
> >> That really should not be hypervisor's job.
> >>
> >>    Paul
> >>
> >
> > Oh. So let's just remove the p2m type recalculation code from this
> > patch, no need to call p2m_change_entry_type_global, and no need to
> > worry about the log dirty part.
> >
> > George, do you think this acceptable?
> 
> Sorry, just to make sure I understand your intentions:
> 
> 1. The ioreq servers will be responsible for changing all the
> p2m_ioreq_server types back to p2m_ram_rw before detaching
> 2. Xen will refuse to set logdirty mode if there are any attached ioreq servers
> 
> The one problem with #1 is what to do if the ioreq server is buggy /
> killed, and forgets / is unable to clean up its ioreq_server entries?
> 

If we ever want to meaningfully migrate a VM with XenGT enabled then I think it has to be possible to migrate the p2m_ioreq_server page types as-is. I'd expect a new instance of the xengt emulator to claim them on the far end, and any emulator state to be transferred in a similar way to that in which qemu state is transferred, such that vgpu emulation can continue after the migration.
After all, it's not like we magically reset the type of mmio-dm pages on migration. It is just expected that a new instance of QEMU is spawned to take care of them on the far end when the VM is unpaused.

  Paul

> There's a certain symmetry with the idea of having the ioreq server
> responsible both for mapping and unmapping; and it would be nice to
> have this stuff work for shadow mode as well.  But the automatic type
> change seems like a more robust option.  (Still open to persuasion on
> this.)
> 
> On another note: the hard freeze is long past the already-extended
> deadline, so I was assuming this was 4.8 material at this point.
> 
>  -George

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-19 16:51                     ` Jan Beulich
@ 2016-04-20 14:59                       ` Wei Liu
  0 siblings, 0 replies; 82+ messages in thread
From: Wei Liu @ 2016-04-20 14:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, Andrew.Cooper3, tim, George.Dunlap,
	xen-devel, Paul.Durrant, yu.c.zhang, zhiyuan.lv, jun.nakajima,
	keir

On Tue, Apr 19, 2016 at 10:51:14AM -0600, Jan Beulich wrote:
> >>> "Yu, Zhang" <yu.c.zhang@linux.intel.com> 04/19/16 1:46 PM >>>
> >On 4/19/2016 7:15 PM, Paul Durrant wrote:
> >> I talked to Wei earlier and he is happy to give a freeze exception to this change.
> >
> >Great! I'm really obliged. :)
> >BTW, does some form of application need to be submitted? I'm not
> >familiar with the procedure.
> 
> To add some clarification (hopefully; Wei, please correct me if I'm wrong): The
> exception so far was granted only for this one patch, not the entire series.
> Hence it would help if, in the re-submission (with what I think is the single
> pending change needed to this particular patch), you'd put this first in the series (or
> submit it on its own). And make sure you Cc Wei in addition to any of the
> people you're normally expected to Cc.

Hmm... I had the impression that this was the only patch needed and the
enum wasn't released, but I was wrong. I will go over this series in
detail and let you know.

Wei.

> 
> Jan
> 


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-19 11:02               ` Yu, Zhang
  2016-04-19 11:15                 ` Paul Durrant
@ 2016-04-20 15:02                 ` George Dunlap
  2016-04-20 16:30                   ` George Dunlap
  1 sibling, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-20 15:02 UTC (permalink / raw)
  To: Yu, Zhang, Paul Durrant, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

On 19/04/16 12:02, Yu, Zhang wrote:
> 
> 
> On 4/19/2016 12:58 AM, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>>> Jan
>>> Beulich
>>> Sent: 18 April 2016 17:47
>>> To: Paul Durrant
>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
>>> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
>>> zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>
>>>>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
>>>> The original patch was posted before the cut-off IIRC so I'm not sure
>>>> of the policy regarding freeze-exceptions.
>>>
>>>    It was submitted before the feature freeze, yes, but didn't make
>>> it in by
>>> code freeze. So it's my understanding that an exception would be needed.
>>>
>>
>> Ok. Thanks for the clarification. IMO getting this in is worth the
>> freeze exception... it's a shame p2m_mmio_write_dm made it into 4.6.1.
>> It needs to go before it proliferates any further.
>>
> 
> Thanks, Paul.
> 
> So I suppose the only place we need change for this patch is
> for hvmmem_type_t, which should be defined like this?
> 
> typedef enum {
>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>     HVMMEM_ioreq_server
> #else
>     HVMMEM_mmio_write_dm
> #endif
> } hvmmem_type_t;
> 
> Besides, does 4.7 still accept freeze exception? It would be great
> if we can get an approval for this.

Wait, do we *actually* need this?  Is anyone actually using this?

I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
Userspace Working, but you can break it to see if anyone complains first.

 -George



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-20 14:57                                   ` Paul Durrant
@ 2016-04-20 15:37                                     ` George Dunlap
  2016-04-20 16:30                                       ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-20 15:37 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	xen-devel, Yu, Zhang, Lv, Zhiyuan, jun.nakajima

On Wed, Apr 20, 2016 at 3:57 PM, Paul Durrant <Paul.Durrant@citrix.com> wrote:
>> -----Original Message-----
>> From: George Dunlap
>> Sent: 20 April 2016 15:50
>> To: Yu, Zhang
>> Cc: Paul Durrant; xen-devel@lists.xen.org; Kevin Tian; Jan Beulich; Andrew
>> Cooper; Tim (Xen.org); Lv, Zhiyuan; jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>> On Tue, Apr 19, 2016 at 12:59 PM, Yu, Zhang <yu.c.zhang@linux.intel.com>
>> wrote:
>> >>> So what about the VM suspend case you mentioned above? Will that
>> trigger
>> >>> the unmapping of ioreq server? Could the device model also take the
>> role
>> >>> to change the p2m type back in such case?
>> >>
>> >>
>> >> Yes. The device model has to be told by the toolstack that the VM is
>> >> suspending, otherwise it can't disable the ioreq server which puts the
>> >> shared ioreq pages back into the guest p2m. If that's not done then the
>> >> pages will be leaked.
>> >>
>> >>>
>> >>> It would be much simpler if hypervisor side does not need to provide
>> >>> the p2m resetting logic, and we can support live migration at the same
>> >>> time then. :)
>> >>>
>> >>
>> >> That really should not be hypervisor's job.
>> >>
>> >>    Paul
>> >>
>> >
>> > Oh. So let's just remove the p2m type recalculation code from this
>> > patch, no need to call p2m_change_entry_type_global, and no need to
>> > worry about the log dirty part.
>> >
>> > George, do you think this acceptable?
>>
>> Sorry, just to make sure I understand your intentions:
>>
>> 1. The ioreq servers will be responsible for changing all the
>> p2m_ioreq_server types back to p2m_ram_rw before detaching
>> 2. Xen will refuse to set logdirty mode if there are any attached ioreq servers
>>
>> The one problem with #1 is what to do if the ioreq server is buggy /
>> killed, and forgets / is unable to clean up its ioreq_server entries?
>>
>
> If we ever want to meaningfully migrate a VM with XenGT enabled then I think it has to be possible to migrate the p2m_ioreq_server page types as-is. I'd expect a new instance of the xengt emulator to claim them on the far end, and any emulator state to be transferred in a similar way to that in which qemu state is transferred, such that vgpu emulation can continue after the migration.
> After all, it's not like we magically reset the type of mmio-dm pages on migration. It is just expected that a new instance of QEMU is spawned to take care of them on the far end when the VM is unpaused.

I don't see the implication you're trying to make regarding the
attach/detach interface.

In any case, is it really not the case that the device models on the
far side have to re-register which IO ranges they're interested in?
Shouldn't the device model on the far side re-register the pfns it
wants to watch as well?  That again seems like the most robust
solution, rather than assuming that the pfns you expect to have been
marked are already marked.  To do otherwise risks bugs where 1) pages
you think are being watched aren't, and 2) pages you don't care about
are being watched (and you don't think to un-watch them because you
didn't think you'd watched them in the first place).

On a slightly different topic: come to think of it, there's no need to
switch the p2m type for ioreq_server when we turn on logdirty mode; we
just have to call paging_mark_dirty() if it's a write, don't we?

 -George


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 15:02                 ` George Dunlap
@ 2016-04-20 16:30                   ` George Dunlap
  2016-04-20 16:52                     ` Jan Beulich
  0 siblings, 1 reply; 82+ messages in thread
From: George Dunlap @ 2016-04-20 16:30 UTC (permalink / raw)
  To: Yu, Zhang, Paul Durrant, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap <george.dunlap@citrix.com> wrote:
> On 19/04/16 12:02, Yu, Zhang wrote:
>>
>>
>> On 4/19/2016 12:58 AM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
>>>> Jan
>>>> Beulich
>>>> Sent: 18 April 2016 17:47
>>>> To: Paul Durrant
>>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); George
>>>> Dunlap; xen-devel@lists.xen.org; yu.c.zhang@linux.intel.com;
>>>> zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>>
>>>>>>> Paul Durrant <Paul.Durrant@citrix.com> 04/18/16 6:45 PM >>>
>>>>> The original patch was posted before the cut-off IIRC so I'm not sure
>>>>> of the policy regarding freeze-exceptions.
>>>>
>>>>    It was submitted before the feature freeze, yes, but didn't make
>>>> it in by
>>>> code freeze. So it's my understanding that an exception would be needed.
>>>>
>>>
>>> Ok. Thanks for the clarification. IMO getting this in is worth the
>>> freeze exception... it's a shame p2m_mmio_write_dm made it into 4.6.1.
>>> It needs to go before it proliferates any further.
>>>
>>
>> Thanks, Paul.
>>
>> So I suppose the only place we need change for this patch is
>> for hvmmem_type_t, which should be defined like this?
>>
>> typedef enum {
>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>     HVMMEM_ioreq_server
>> #else
>>     HVMMEM_mmio_write_dm
>> #endif
>> } hvmmem_type_t;
>>
>> Besides, does 4.7 still accept freeze exception? It would be great
>> if we can get an approval for this.
>
> Wait, do we *actually* need this?  Is anyone actually using this?
>
> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
> Userspace Working, but you can break it to see if anyone complains first.

Going further than this:

The proposed patch series not only changes the name, it changes the
functionality.  We do not want code to *compile* against 4.7 and then
not *work* against 4.7; and the worst outcome of all is for it to
compile and sort of work, but incorrectly.

Does the ioreq server have a way of asking Xen what version of the ABI
it's providing?  I'm assuming the answer is "no"; in which case code
that is compiled against the 4.6 interface but run on a 4.8 interface
that looks like this will fail in a somewhat unpredictable way.

Given that:

1. When we do check the ioreq server functionality in, what's the
correct way to deal with code that wants to use the old interface, and
what do we do with code compiled against the old interface but running
on the new one?
2. What's the best thing to do for this release?

If it's the case that the only code that uses this is in XenServer,
then I'd say the answer to #1 can be simply, "Don't compile" and
"Don't do that" respectively; and the answer to #2 can be either
"Leave it be" or "Remove the enum from the public interface".

If there are other projects that have started to use this interface,
then we need a better answer to #1 than "Compile but fail in
unpredictable ways".

 -George


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-20 15:37                                     ` George Dunlap
@ 2016-04-20 16:30                                       ` Paul Durrant
  2016-04-20 16:58                                         ` George Dunlap
  2016-04-21 13:28                                         ` Yu, Zhang
  0 siblings, 2 replies; 82+ messages in thread
From: Paul Durrant @ 2016-04-20 16:30 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	xen-devel, Yu, Zhang, Lv, Zhiyuan, jun.nakajima

> -----Original Message-----
> From: George Dunlap
> Sent: 20 April 2016 16:38
> To: Paul Durrant
> Cc: Yu, Zhang; Kevin Tian; jun.nakajima@intel.com; Andrew Cooper; Tim
> (Xen.org); xen-devel@lists.xen.org; Lv, Zhiyuan; Jan Beulich
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> On Wed, Apr 20, 2016 at 3:57 PM, Paul Durrant <Paul.Durrant@citrix.com>
> wrote:
> >> -----Original Message-----
> >> From: George Dunlap
> >> Sent: 20 April 2016 15:50
> >> To: Yu, Zhang
> >> Cc: Paul Durrant; xen-devel@lists.xen.org; Kevin Tian; Jan Beulich;
> Andrew
> >> Cooper; Tim (Xen.org); Lv, Zhiyuan; jun.nakajima@intel.com
> >> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> >> map guest ram with p2m_ioreq_server to an ioreq server
> >>
> >> On Tue, Apr 19, 2016 at 12:59 PM, Yu, Zhang
> <yu.c.zhang@linux.intel.com>
> >> wrote:
> >> >>> So what about the VM suspend case you mentioned above? Will that
> >> trigger
> >> >>> the unmapping of ioreq server? Could the device model also take the
> >> role
> >> >>> to change the p2m type back in such case?
> >> >>
> >> >>
> >> >> Yes. The device model has to be told by the toolstack that the VM is
> >> >> suspending, otherwise it can't disable the ioreq server which puts the
> >> >> shared ioreq pages back into the guest p2m. If that's not done then
> the
> >> >> pages will be leaked.
> >> >>
> >> >>>
> >> >>> It would be much simpler if hypervisor side does not need to provide
> >> >>> the p2m resetting logic, and we can support live migration at the
> same
> >> >>> time then. :)
> >> >>>
> >> >>
> >> >> That really should not be hypervisor's job.
> >> >>
> >> >>    Paul
> >> >>
> >> >
> >> > Oh. So let's just remove the p2m type recalculation code from this
> >> > patch, no need to call p2m_change_entry_type_global, and no need to
> >> > worry about the log dirty part.
> >> >
> >> > George, do you think this acceptable?
> >>
> >> Sorry, just to make sure I understand your intentions:
> >>
> >> 1. The ioreq servers will be responsible for changing all the
> >> p2m_ioreq_server types back to p2m_ram_rw before detaching
> >> 2. Xen will refuse to set logdirty mode if there are any attached ioreq
> servers
> >>
> >> The one problem with #1 is what to do if the ioreq server is buggy /
> >> killed, and forgets / is unable to clean up its ioreq_server entries?
> >>
> >
> > If we ever want to meaningfully migrate a VM with XenGT enabled then I
> think it has to be possible to migrate the p2m_ioreq_server page types as-is.
> I'd expect a new instance of the xengt emulator to claim them on the far and
> any emulator state be transferred in a similar way to which qemu state is
> transferred such that vgpu emulation can be continued after the migration.
> > After all, it's not like we magically reset the type of mmio-dm pages on
> migration. It is just expected that a new instance of QEMU is spawned to
> take care of them on the far end when the VM is unpaused.
> 
> I don't see the implication you're trying to make regarding the
> attach/detach interface.
> 
> In any case, is it really not the case that the devicemodels on the
> far side have to re-register which IO ranges they're interested in?
> Shouldn't the devicemodel on the far side re-register the pfns it
> wants to watch as well?  That again seems like the most robust
> solution, rather than assuming that the pfns you expect to have been
> marked are already marked.  To do otherwise risks bugs where 1) pages
> you think were being watched aren't 2) pages being watched that you
> don't care about (and you don't think to un-watch them because you
> didn't think you'd watched them in the first place).
> 

That sounds a little inefficient when there could be thousands of pages in play. If the p2m types were faithfully reproduced in the receiving domain then the new incarnation of the emulator would just need to claim the type, rather than making thousands of set-mem-type hypercalls as well. I agree that the information would need to be transferred though; otherwise the new incarnation wouldn't be able to reset the page types when it was done with them.

> On a slightly different topic: come to think of it, there's no need to
> switch the p2m type for ioreq_server when we turn on logdirty mode; we
> just have to call paging_mark_dirty() if it's a write, don't we?
> 

That sounds right to me.

  Paul

>  -George

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 16:30                   ` George Dunlap
@ 2016-04-20 16:52                     ` Jan Beulich
  2016-04-20 16:58                       ` Paul Durrant
                                         ` (2 more replies)
  0 siblings, 3 replies; 82+ messages in thread
From: Jan Beulich @ 2016-04-20 16:52 UTC (permalink / raw)
  To: george.dunlap, Paul.Durrant, wei.liu2, yu.c.zhang
  Cc: kevin.tian, keir, Andrew.Cooper3, tim, xen-devel, zhiyuan.lv,
	jun.nakajima

>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
>On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap <george.dunlap@citrix.com> wrote:
>> On 19/04/16 12:02, Yu, Zhang wrote:
>>> So I suppose the only place we need change for this patch is
>>> for hvmmem_type_t, which should be defined like this?
>>>
>>> typedef enum {
>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>>     HVMMEM_ioreq_server
>>> #else
>>>     HVMMEM_mmio_write_dm
>>> #endif
>>> } hvmmem_type_t;
>>>
>>> Besides, does 4.7 still accept freeze exception? It would be great
>>> if we can get an approval for this.
>>
>> Wait, do we *actually* need this?  Is anyone actually using this?
>>
>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
>> Userspace Working, but you can break it to see if anyone complains first.

We don't normally do it like that - we aim at keeping things compatible
right away. I don't know of a case where we would have knowingly broken
compatibility for users of the public headers (leaving aside tool stack only
stuff of course).

>Going further than this:
>
>The proposed patch series not only changes the name, it changes the
>functionality.  We do not want code to *compile* against 4.7 and then
>not *work* against 4.7; and the worst of all is to compile and sort of
>work but do it incorrectly.

I had the impression that the renaming patch was what it is - a renaming
patch, without altering behavior.

>Does the ioreq server have a way of asking Xen what version of the ABI
>it's providing?  I'm assuming the answer is "no"; in which case code
>that is compiled against the 4.6 interface but run on a 4.8 interface
>that looks like this will fail in a somewhat unpredictable way.

The only thing it can do is ask for the Xen version. The ABI version is not
being returned by anything (but perhaps should be).

>Given that:
>
>1. When we do check the ioreq server functionality in, what's the
>correct way to deal with code that wants to use the old interface, and
>what do we do with code compiled against the old interface but running
>on the new one?

For the full series I'm not sure I can really tell. But as said, for the
rename patch alone I thought it was just a rename. And that's what we want
to get
in (see Paul's earlier reply - he wants to see the old name gone, so it won't
be used any further).

>2. What's the best thing to do for this release?

If the entire series (no matter whether to go in now or later) is changing
behavior, then the only choice is to consider the currently used enum
value burnt, and use a fresh one for the new semantics.

>If it's the case that the only code that uses this is in XenServer,
>then I'd say the answer to #1 can be simply, "Don't compile" and
>"Don't do that" respectively; and the answer to #2 can be either
>"Leave it be" or "Remove the enum from the public interface".
>
>If there are other projects that have started to use this interface,
>then we need a better answer to #1 than "Compile but fail in
>unpredictable ways".

How would we know whether there are other users?

Jan



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-20 16:30                                       ` Paul Durrant
@ 2016-04-20 16:58                                         ` George Dunlap
  2016-04-21 13:28                                         ` Yu, Zhang
  1 sibling, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-20 16:58 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, jun.nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Yu, Zhang, Lv, Zhiyuan, Jan Beulich

On Wed, Apr 20, 2016 at 5:30 PM, Paul Durrant <Paul.Durrant@citrix.com> wrote:
>> In any case, is it really not the case that the devicemodels on the
>> far side have to re-register which IO ranges they're interested in?
>> Shouldn't the devicemodel on the far side re-register the pfns it
>> wants to watch as well?  That again seems like the most robust
>> solution, rather than assuming that the pfns you expect to have been
>> marked are already marked.  To do otherwise risks bugs where 1) pages
>> you think were being watched aren't 2) pages being watched that you
>> don't care about (and you don't think to un-watch them because you
>> didn't think you'd watched them in the first place).
>>
>
> That sounds a little inefficient when there could be thousands of pages
> in play. If the p2m types were faithfully reproduced in the receiving
> domain then the new incarnation of the emulator just needs to claim the
> type, rather than making thousands of set-mem-type hypercalls as well. I
> agree that the information would need to be transferred though, otherwise
> the new incarnation wouldn't be able to reset the page types when it was
> done with them.

The emulator could presumably go through and re-shadow the GPTs as
they were used, rather than all at once, right?

I still think that flushing all types on disconnect and having the
ioreq server on the far side re-protect things as it wants to is a
more robust solution.  But ultimately you're probably going to be the
one who's going to have to deal with any bugs related to this, so I'd
leave it between you and Yu Zhang.
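In toy form, that lazy re-protection could look like this (all names are
hypothetical stand-ins, not the real XenGT/ioreq-server code): after the
types are flushed on disconnect, each GPT page is re-marked on first use,
so only pages actually touched cost a hypercall:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_GPT_PAGES 8
static bool is_protected[NR_GPT_PAGES];  /* per-page: already re-marked? */
static unsigned int hypercalls_made;     /* toy cost counter */

/* Stand-in for a set-mem-type style hypercall on a single pfn. */
static void set_mem_type(unsigned long pfn)
{
    is_protected[pfn] = true;
    hypercalls_made++;
}

/* Called whenever the emulator touches a GPT page after reconnect. */
static void on_gpt_use(unsigned long pfn)
{
    if ( !is_protected[pfn] )            /* re-protect lazily, once */
        set_mem_type(pfn);
}
```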

 -George


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 16:52                     ` Jan Beulich
@ 2016-04-20 16:58                       ` Paul Durrant
  2016-04-20 17:06                         ` George Dunlap
  2016-04-20 17:08                       ` George Dunlap
  2016-04-21 12:04                       ` Yu, Zhang
  2 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-20 16:58 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap, Wei Liu, yu.c.zhang
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 20 April 2016 17:53
> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> >>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
> >On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
> <george.dunlap@citrix.com> wrote:
> >> On 19/04/16 12:02, Yu, Zhang wrote:
> >>> So I suppose the only place we need change for this patch is
> >>> for hvmmem_type_t, which should be defined like this?
> >>>
> >>> typedef enum {
> >>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
> model */
> >>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
> >>>     HVMMEM_ioreq_server
> >>> #else
> >>>     HVMMEM_mmio_write_dm
> >>> #endif
> >>> } hvmmem_type_t;
> >>>
> >>> Besides, does 4.7 still accept freeze exception? It would be great
> >>> if we can get an approval for this.
> >>
> >> Wait, do we *actually* need this?  Is anyone actually using this?
> >>
> >> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
> >> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
> >> Userspace Working, but you can break it to see if anyone complains first.
> 
> We don't normally do it like that - we aim at keeping things compatible
> right away. I don't know of a case where we would have knowingly broken
> compatibility for users of the public headers (leaving aside tool stack only
> stuff of course).
> 
> >Going further than this:
> >
> >The proposed patch series not only changes the name, it changes the
> >functionality.  We do not want code to *compile* against 4.7 and then
> >not *work* against 4.7; and the worst of all is to compile and sort of
> >work but do it incorrectly.
> 
> I had the impression that the renaming patch was what it is - a renaming
> patch, without altering behavior.
> 
> >Does the ioreq server have a way of asking Xen what version of the ABI
> >it's providing?  I'm assuming the answer is "no"; in which case code
> >that is compiled against the 4.6 interface but run on a 4.8 interface
> >that looks like this will fail in a somewhat unpredictable way.
> 
> The only thing it can do is ask for the Xen version. The ABI version is not
> being returned by anything (but perhaps should be).
> 
> >Given that:
> >
> >1. When we do check the ioreq server functionality in, what's the
> >correct way to deal with code that wants to use the old interface, and
> >what do we do with code compiled against the old interface but running
> >on the new one?
> 
> For the full series I'm not sure I can really tell. But as said, for the rename
> patch alone I thought it was just a rename. And that's what we want to get
> in (see Paul's earlier reply - he wants to see the old name gone, so it won't
> be used any further).
> 
> >2. What's the best thing to do for this release?
> 
> If the entire series (no matter whether to go in now or later) is changing
> behavior, then the only choice is to consider the currently used enum
> value burnt, and use a fresh one for the new semantics.

It sounds like that would be the best way. If we don't do that then we
have to maintain the write-dm semantics for pages of that type unless the
type is claimed (by using the new hypercall), and that's a bit icky. I
much prefer that pages of the new type are treated as RAM until claimed.

  Paul

> 
> >If it's the case that the only code that uses this is in XenServer,
> >then I'd say the answer to #1 can be simply, "Don't compile" and
> >"Don't do that" respectively; and the answer to #2 can be either
> >"Leave it be" or "Remove the enum from the public interface".
> >
> >If there are other projects that have started to use this interface,
> >then we need a better answer to #1 than "Compile but fail in
> >unpredictable ways".
> 
> How would we know whether there are other users?
> 
> Jan
> 
> 

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 16:58                       ` Paul Durrant
@ 2016-04-20 17:06                         ` George Dunlap
  2016-04-20 17:09                           ` Paul Durrant
  2016-04-21 12:24                           ` Yu, Zhang
  0 siblings, 2 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-20 17:06 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich, Wei Liu, yu.c.zhang
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

On 20/04/16 17:58, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
>> Beulich
>> Sent: 20 April 2016 17:53
>> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>> p2m_mmio_write_dm to p2m_ioreq_server
>>
>>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
>>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
>> <george.dunlap@citrix.com> wrote:
>>>> On 19/04/16 12:02, Yu, Zhang wrote:
>>>>> So I suppose the only place we need change for this patch is
>>>>> for hvmmem_type_t, which should be defined like this?
>>>>>
>>>>> typedef enum {
>>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
>> model */
>>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>>>>     HVMMEM_ioreq_server
>>>>> #else
>>>>>     HVMMEM_mmio_write_dm
>>>>> #endif
>>>>> } hvmmem_type_t;
>>>>>
>>>>> Besides, does 4.7 still accept freeze exception? It would be great
>>>>> if we can get an approval for this.
>>>>
>>>> Wait, do we *actually* need this?  Is anyone actually using this?
>>>>
>>>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
>>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
>>>> Userspace Working, but you can break it to see if anyone complains first.
>>
>> We don't normally do it like that - we aim at keeping things compatible
>> right away. I don't know of a case where we would have knowingly broken
>> compatibility for users of the public headers (leaving aside tool stack only
>> stuff of course).
>>
>>> Going further than this:
>>>
>>> The proposed patch series not only changes the name, it changes the
>>> functionality.  We do not want code to *compile* against 4.7 and then
>>> not *work* against 4.7; and the worst of all is to compile and sort of
>>> work but do it incorrectly.
>>
>> I had the impression that the renaming patch was what it is - a renaming
>> patch, without altering behavior.
>>
>>> Does the ioreq server have a way of asking Xen what version of the ABI
>>> it's providing?  I'm assuming the answer is "no"; in which case code
>>> that is compiled against the 4.6 interface but run on a 4.8 interface
>>> that looks like this will fail in a somewhat unpredictable way.
>>
>> The only thing it can do is ask for the Xen version. The ABI version is not
>> being returned by anything (but perhaps should be).
>>
>>> Given that:
>>>
>>> 1. When we do check the ioreq server functionality in, what's the
>>> correct way to deal with code that wants to use the old interface, and
>>> what do we do with code compiled against the old interface but running
>>> on the new one?
>>
>> For the full series I'm not sure I can really tell. But as said, for the rename
>> patch alone I thought it was just a rename. And that's what we want to get
>> in (see Paul's earlier reply - he wants to see the old name gone, so it won't
>> be used any further).
>>
>>> 2. What's the best thing to do for this release?
>>
>> If the entire series (no matter whether to go in now or later) is changing
>> behavior, then the only choice is to consider the currently used enum
>> value burnt, and use a fresh one for the new semantics.
> 
> It sounds like that would be the best way. If we don't do that then we have to maintain the write-dm semantics for pages of that type unless the type is claimed (by using the new hypercall), and that's a bit icky. I much prefer that pages of the new type are treated as RAM until claimed.

I think the only sensible way to keep the enum is to also keep the
functionality, which would mean using *another* p2m type for ioreq_server.

Given that the functionality isn't going away for 4.7, I don't see an
urgent need to remove the enum; but if Paul does, then a patch renaming
it to HVMMEM_unused would be the way forward then, I guess.  Once the
underlying p2m type goes away, you'll want to return -EINVAL for this
enum value.
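A sketch of what that rejection might look like once the type is gone (the
enum layout matches what this thread converges on, but the function and
check here are illustrative, not actual Xen code):

```c
#include <assert.h>
#include <errno.h>

typedef enum {
    HVMMEM_ram_rw,
    HVMMEM_ram_ro,
    HVMMEM_mmio_dm,
    HVMMEM_unused,        /* formerly HVMMEM_mmio_write_dm; value burnt */
    HVMMEM_ioreq_server   /* fresh value carrying the new semantics */
} hvmmem_type_t;

/* Hypothetical validation in a set-mem-type style handler. */
static int validate_mem_type(hvmmem_type_t t)
{
    if ( t == HVMMEM_unused )
        return -EINVAL;   /* retired slot is no longer accepted */
    return 0;
}
```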

 -George


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 16:52                     ` Jan Beulich
  2016-04-20 16:58                       ` Paul Durrant
@ 2016-04-20 17:08                       ` George Dunlap
  2016-04-21 12:04                       ` Yu, Zhang
  2 siblings, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-20 17:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, Yu, Zhang, Lv, Zhiyuan, Jun Nakajima, Wei Liu

On Wed, Apr 20, 2016 at 5:52 PM, Jan Beulich <jbeulich@suse.com> wrote:
>>If it's the case that the only code that uses this is in XenServer,
>>then I'd say the answer to #1 can be simply, "Don't compile" and
>>"Don't do that" respectively; and the answer to #2 can be either
>>"Leave it be" or "Remove the enum from the public interface".
>>
>>If there are other projects that have started to use this interface,
>>then we need a better answer to #1 than "Compile but fail in
>>unpredictable ways".
>
> How would we know whether there are other users?

Well we'd wait for them to complain... at which point we would have
had a release with an *incompatible* ABI and no way to change it.

So yes, upon consideration, burning the enum value is really the only
option. :-)

 -George


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 17:06                         ` George Dunlap
@ 2016-04-20 17:09                           ` Paul Durrant
  2016-04-21 12:24                           ` Yu, Zhang
  1 sibling, 0 replies; 82+ messages in thread
From: Paul Durrant @ 2016-04-20 17:09 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich, Wei Liu, yu.c.zhang
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@citrix.com]
> Sent: 20 April 2016 18:07
> To: Paul Durrant; Jan Beulich; Wei Liu; yu.c.zhang@linux.intel.com
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> On 20/04/16 17:58, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Jan
> >> Beulich
> >> Sent: 20 April 2016 17:53
> >> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
> >> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> >> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> >> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >> p2m_mmio_write_dm to p2m_ioreq_server
> >>
> >>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
> >>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
> >> <george.dunlap@citrix.com> wrote:
> >>>> On 19/04/16 12:02, Yu, Zhang wrote:
> >>>>> So I suppose the only place we need change for this patch is
> >>>>> for hvmmem_type_t, which should be defined like this?
> >>>>>
> >>>>> typedef enum {
> >>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
> >> model */
> >>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
> >>>>>     HVMMEM_ioreq_server
> >>>>> #else
> >>>>>     HVMMEM_mmio_write_dm
> >>>>> #endif
> >>>>> } hvmmem_type_t;
> >>>>>
> >>>>> Besides, does 4.7 still accept freeze exception? It would be great
> >>>>> if we can get an approval for this.
> >>>>
> >>>> Wait, do we *actually* need this?  Is anyone actually using this?
> >>>>
> >>>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
> >>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
> >>>> Userspace Working, but you can break it to see if anyone complains
> first.
> >>
> >> We don't normally do it like that - we aim at keeping things compatible
> >> right away. I don't know of a case where we would have knowingly
> broken
> >> compatibility for users of the public headers (leaving aside tool stack only
> >> stuff of course).
> >>
> >>> Going further than this:
> >>>
> >>> The proposed patch series not only changes the name, it changes the
> >>> functionality.  We do not want code to *compile* against 4.7 and then
> >>> not *work* against 4.7; and the worst of all is to compile and sort of
> >>> work but do it incorrectly.
> >>
> >> I had the impression that the renaming patch was what it is - a renaming
> >> patch, without altering behavior.
> >>
> >>> Does the ioreq server have a way of asking Xen what version of the ABI
> >>> it's providing?  I'm assuming the answer is "no"; in which case code
> >>> that is compiled against the 4.6 interface but run on a 4.8 interface
> >>> that looks like this will fail in a somewhat unpredictable way.
> >>
> >> The only thing it can do is ask for the Xen version. The ABI version is not
> >> being returned by anything (but perhaps should be).
> >>
> >>> Given that:
> >>>
> >>> 1. When we do check the ioreq server functionality in, what's the
> >>> correct way to deal with code that wants to use the old interface, and
> >>> what do we do with code compiled against the old interface but running
> >>> on the new one?
> >>
> >> For the full series I'm not sure I can really tell. But as said, for the rename
> >> patch alone I thought it was just a rename. And that's what we want to get
> >> in (see Paul's earlier reply - he wants to see the old name gone, so it won't
> >> be used any further).
> >>
> >>> 2. What's the best thing to do for this release?
> >>
> >> If the entire series (no matter whether to go in now or later) is changing
> >> behavior, then the only choice is to consider the currently used enum
> >> value burnt, and use a fresh one for the new semantics.
> >
> > It sounds like that would be the best way. If we don't do that then we have to
> maintain the write-dm semantics for pages of that type unless the type is
> claimed (by using the new hypercall), and that's a bit icky. I much prefer that
> pages of the new type are treated as RAM until claimed.
> 
> I think the only sensible way to keep the enum is to also keep the
> functionality, which would mean using *another* p2m type for ioreq_server.
> 
> Given that the functionality isn't going away for 4.7, I don't see an
> urgent need to remove the enum; but if Paul does, then a patch renaming
> it to HVMMEM_unused would be the way forward then I guess.  Once the
> underlying p2m type goes away, you'll want to return -EINVAL for this
> enum value.
> 

Since the old semantics made it into the wild in 4.6.1 (which I was
unaware of until a couple of days ago), I guess we are going to need some
form of deprecation like this.

  Paul

>  -George

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 16:52                     ` Jan Beulich
  2016-04-20 16:58                       ` Paul Durrant
  2016-04-20 17:08                       ` George Dunlap
@ 2016-04-21 12:04                       ` Yu, Zhang
  2 siblings, 0 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-21 12:04 UTC (permalink / raw)
  To: Jan Beulich, george.dunlap, Paul.Durrant, wei.liu2
  Cc: kevin.tian, keir, Andrew.Cooper3, tim, xen-devel, zhiyuan.lv,
	jun.nakajima

maintainers,

  Thanks for the discussion and sorry for my delayed reply...

On 4/21/2016 12:52 AM, Jan Beulich wrote:
>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap <george.dunlap@citrix.com> wrote:
>>> On 19/04/16 12:02, Yu, Zhang wrote:
>>>> So I suppose the only place we need change for this patch is
>>>> for hvmmem_type_t, which should be defined like this?
>>>>
>>>> typedef enum {
>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>>>     HVMMEM_ioreq_server
>>>> #else
>>>>     HVMMEM_mmio_write_dm
>>>> #endif
>>>> } hvmmem_type_t;
>>>>
>>>> Besides, does 4.7 still accept freeze exception? It would be great
>>>> if we can get an approval for this.
>>>
>>> Wait, do we *actually* need this?  Is anyone actually using this?
>>>
>>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
>>> Userspace Working, but you can break it to see if anyone complains first.
>
> We don't normally do it like that - we aim at keeping things compatible
> right away. I don't know of a case where we would have knowingly broken
> compatibility for users of the public headers (leaving aside tool stack only
> stuff of course).
>
>> Going further than this:
>>
>> The proposed patch series not only changes the name, it changes the
>> functionality.  We do not want code to *compile* against 4.7 and then
>> not *work* against 4.7; and the worst of all is to compile and sort of
>> work but do it incorrectly.
>
> I had the impression that the renaming patch was what it is - a renaming
> patch, without altering behavior.
>
>> Does the ioreq server have a way of asking Xen what version of the ABI
>> it's providing?  I'm assuming the answer is "no"; in which case code
>> that is compiled against the 4.6 interface but run on a 4.8 interface
>> that looks like this will fail in a somewhat unpredictable way.
>
> The only thing it can do is ask for the Xen version. The ABI version is not
> being returned by anything (but perhaps should be).
>
>> Given that:
>>
>> 1. When we do check the ioreq server functionality in, what's the
>> correct way to deal with code that wants to use the old interface, and
>> what do we do with code compiled against the old interface but running
>> on the new one?
>
> For the full series I'm not sure I can really tell. But as said, for the rename
> patch alone I thought it was just a rename. And that's what we want to get
> in (see Paul's earlier reply - he wants to see the old name gone, so it won't
> be used any further).
>
>> 2. What's the best thing to do for this release?
>
> If the entire series (no matter whether to go in now or later) is changing
> behavior, then the only choice is to consider the currently used enum
> value burnt, and use a fresh one for the new semantics.
>

The 3rd patch in this V2 series does change the semantics of this type.

But IIUC, we have reached an agreement in the discussion of patch 3
that, for now, we only support write emulation, and there shall be no
p2m_ioreq_server entries before an ioreq server has claimed ownership of
them. So in my V3 series (nearly finished, not sent out yet), this type
shall have the same semantics as the old one.

Besides, I'm sorry, but I do not quite understand what "consider the
currently used enum value burnt" means. Any example? :)

>> If it's the case that the only code that uses this is in XenServer,
>> then I'd say the answer to #1 can be simply, "Don't compile" and
>> "Don't do that" respectively; and the answer to #2 can be either
>> "Leave it be" or "Remove the enum from the public interface".
>>
>> If there are other projects that have started to use this interface,
>> then we need a better answer to #1 than "Compile but fail in
>> unpredictable ways".
>
> How would we know whether there are other users?
>

B.R.
Yu


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-20 17:06                         ` George Dunlap
  2016-04-20 17:09                           ` Paul Durrant
@ 2016-04-21 12:24                           ` Yu, Zhang
  2016-04-21 13:31                             ` Paul Durrant
  1 sibling, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-21 12:24 UTC (permalink / raw)
  To: George Dunlap, Paul Durrant, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima



On 4/21/2016 1:06 AM, George Dunlap wrote:
> On 20/04/16 17:58, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
>>> Beulich
>>> Sent: 20 April 2016 17:53
>>> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>
>>>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
>>>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
>>> <george.dunlap@citrix.com> wrote:
>>>>> On 19/04/16 12:02, Yu, Zhang wrote:
>>>>>> So I suppose the only place we need change for this patch is
>>>>>> for hvmmem_type_t, which should be defined like this?
>>>>>>
>>>>>> typedef enum {
>>>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
>>> model */
>>>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>>>>>     HVMMEM_ioreq_server
>>>>>> #else
>>>>>>     HVMMEM_mmio_write_dm
>>>>>> #endif
>>>>>> } hvmmem_type_t;
>>>>>>
>>>>>> Besides, does 4.7 still accept freeze exception? It would be great
>>>>>> if we can get an approval for this.
>>>>>
>>>>> Wait, do we *actually* need this?  Is anyone actually using this?
>>>>>
>>>>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery as
>>>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
>>>>> Userspace Working, but you can break it to see if anyone complains first.
>>>
>>> We don't normally do it like that - we aim at keeping things compatible
>>> right away. I don't know of a case where we would have knowingly broken
>>> compatibility for users of the public headers (leaving aside tool stack only
>>> stuff of course).
>>>
>>>> Going further than this:
>>>>
>>>> The proposed patch series not only changes the name, it changes the
>>>> functionality.  We do not want code to *compile* against 4.7 and then
>>>> not *work* against 4.7; and the worst of all is to compile and sort of
>>>> work but do it incorrectly.
>>>
>>> I had the impression that the renaming patch was what it is - a renaming
>>> patch, without altering behavior.
>>>
>>>> Does the ioreq server have a way of asking Xen what version of the ABI
>>>> it's providing?  I'm assuming the answer is "no"; in which case code
>>>> that is compiled against the 4.6 interface but run on a 4.8 interface
>>>> that looks like this will fail in a somewhat unpredictable way.
>>>
>>> The only thing it can do is ask for the Xen version. The ABI version is not
>>> being returned by anything (but perhaps should be).
>>>
>>>> Given that:
>>>>
>>>> 1. When we do check the ioreq server functionality in, what's the
>>>> correct way to deal with code that wants to use the old interface, and
>>>> what do we do with code compiled against the old interface but running
>>>> on the new one?
>>>
>>> For the full series I'm not sure I can really tell.But as said, for the rename
>>> patch alone I thought it is just a rename. And that's what we want to get
>>> in (see Paul's earlier reply - he wants to see the old name gone, so it won't
>>> be used any further).
>>>
>>>> 2. What's the best thing to do for this release?
>>>
>>> If the entire series (no matter whether to go in now or later) is changing
>>> behavior, then the only choice is to consider the currently used enum
>>> value burnt, and use a fresh one for the new semantics.
>>
>> It sounds like that would be the best way. If we don't do that then we have to maintain the write-dm semantics for pages of that type unless the type is claimed (by using the new hypercall), and that's a bit icky. I much prefer that pages of the new type are treated as RAM until claimed.
>
> I think the only sensible way to keep the enum is to also keep the
> functionality, which would mean using *another* p2m type for ioreq_server.
>
> Given that the functionality isn't going away for 4.7, I don't see an
> urgent need to remove the enum; but if Paul does, then a patch renaming
> it to HVMMEM_unused would be the way forward then I guess.  Once the
> underlying p2m type goes away, you'll want to return -EINVAL for this
> enum value.
>

So the enum would be something like this?

typedef enum {
     HVMMEM_ram_rw,        /* Normal read/write guest RAM */
     HVMMEM_ram_ro,        /* Read-only; writes are discarded */
     HVMMEM_mmio_dm,       /* Reads and write go to the device model */
#if __XEN_INTERFACE_VERSION__ < 0x00040700
     HVMMEM_mmio_write_dm, /* Read-only; writes go to the device model */
#else
     HVMMEM_unused,
#endif
     HVMMEM_ioreq_server
} hvmmem_type_t;

Thanks
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-20 14:50                                 ` George Dunlap
  2016-04-20 14:57                                   ` Paul Durrant
@ 2016-04-21 13:21                                   ` Yu, Zhang
  2016-04-22 11:27                                     ` Wei Liu
  1 sibling, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-21 13:21 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	xen-devel, Paul Durrant, Lv, Zhiyuan, jun.nakajima

Thank you, George.

On 4/20/2016 10:50 PM, George Dunlap wrote:
> On Tue, Apr 19, 2016 at 12:59 PM, Yu, Zhang <yu.c.zhang@linux.intel.com> wrote:
>>>> So what about the VM suspend case you mentioned above? Will that trigger
>>>> the unmapping of ioreq server? Could the device model also take the role
>>>> to change the p2m type back in such case?
>>>
>>>
>>> Yes. The device model has to be told by the toolstack that the VM is
>>> suspending, otherwise it can't disable the ioreq server which puts the
>>> shared ioreq pages back into the guest p2m. If that's not done then the
>>> pages will be leaked.
>>>
>>>>
>>>> It would be much simpler if hypervisor side does not need to provide
>>>> the p2m resetting logic, and we can support live migration at the same
>>>> time then. :)
>>>>
>>>
>>> That really should not be hypervisor's job.
>>>
>>>    Paul
>>>
>>
>> Oh. So let's just remove the p2m type recalculation code from this
>> patch, no need to call p2m_change_entry_type_global, and no need to
>> worry about the log dirty part.
>>
>> George, do you think this acceptable?
>
> Sorry, just to make sure I understand your intentions:
>
> 1. The ioreq servers will be responsible for changing all the
> p2m_ioreq_server types back to p2m_ram_rw before detaching

In this version, this is done by calling p2m_change_entry_type_global()
*after* the ioreq server detaches.

But Paul and I have now agreed not to reset the p2m types in the
hypervisor; it should be the device model's responsibility to do so.
Even if a buggy/malicious device model fails to do so, the only party
affected is the HVM being serviced; neither the hypervisor nor other
VMs will be hurt.

> 2. Xen will refuse to set logdirty mode if there are any attached ioreq servers
>

In this version, I included p2m_ioreq_server in P2M_CHANGEABLE_TYPES
so that we could use p2m_change_entry_type_global() to automatically
reset p2m_ioreq_server entries. I later realized, though, that if live
migration happens during normal emulation (i.e. before the ioreq
server is detached), entries with p2m_ioreq_server will be changed to
p2m_log_dirty (and later to p2m_ram_rw on EPT violation), so write
operations will no longer be forwarded to the device model.

I also realized it is impossible to both keep the automatic p2m reset
on ioreq server detachment and support live migration. That is why I
once suggested option #2.

But it is no longer necessary, now that we do not need the p2m
resetting code.

> The one problem with #1 is what to do if the ioreq server is buggy /
> killed, and forgets / is unable to clean up its ioreq_server entries?
>
> There's a certain symmetry with the idea of having the ioreq server
> responsible both for mapping and unmapping; and it would be nice to
> have this stuff work for shadow mode as well.  But the automatic type

About shadow mode: it is not supported in this version, because
p2m_change_entry_type_global() is only supported in HAP mode. I would
still like to keep shadow mode out of the next version as well,
because we have no such usage model, and if there were any bug it
would be hard for us to find.

> change seems like a more robust option.  (Still open to persuasion on
> this.)
>
> On another note: the hard freeze is long past the already-extended
> deadline, so I was assuming this was 4.8 material at this point.
>

It is a pity, and I'm the one to blame. Thank you very much for all
the advice and help. I'll continue to work on this feature, no matter
which version it is eventually accepted into. :)

B.R.
Yu



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-20 16:30                                       ` Paul Durrant
  2016-04-20 16:58                                         ` George Dunlap
@ 2016-04-21 13:28                                         ` Yu, Zhang
  1 sibling, 0 replies; 82+ messages in thread
From: Yu, Zhang @ 2016-04-21 13:28 UTC (permalink / raw)
  To: Paul Durrant, George Dunlap
  Cc: Kevin Tian, jun.nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Lv, Zhiyuan, Jan Beulich



On 4/21/2016 12:30 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: George Dunlap
>> Sent: 20 April 2016 16:38
>> To: Paul Durrant
>> Cc: Yu, Zhang; Kevin Tian; jun.nakajima@intel.com; Andrew Cooper; Tim
>> (Xen.org); xen-devel@lists.xen.org; Lv, Zhiyuan; Jan Beulich
>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server
>>
>> On Wed, Apr 20, 2016 at 3:57 PM, Paul Durrant <Paul.Durrant@citrix.com>
>> wrote:
>>>> -----Original Message-----
>>>> From: George Dunlap
>>>> Sent: 20 April 2016 15:50
>>>> To: Yu, Zhang
>>>> Cc: Paul Durrant; xen-devel@lists.xen.org; Kevin Tian; Jan Beulich;
>> Andrew
>>>> Cooper; Tim (Xen.org); Lv, Zhiyuan; jun.nakajima@intel.com
>>>> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
>>>> map guest ram with p2m_ioreq_server to an ioreq server
>>>>
>>>> On Tue, Apr 19, 2016 at 12:59 PM, Yu, Zhang
>> <yu.c.zhang@linux.intel.com>
>>>> wrote:
>>>>>>> So what about the VM suspend case you mentioned above? Will that
>>>> trigger
>>>>>>> the unmapping of ioreq server? Could the device model also take the
>>>> role
>>>>>>> to change the p2m type back in such case?
>>>>>>
>>>>>>
>>>>>> Yes. The device model has to be told by the toolstack that the VM is
>>>>>> suspending, otherwise it can't disable the ioreq server which puts the
>>>>>> shared ioreq pages back into the guest p2m. If that's not done then
>> the
>>>>>> pages will be leaked.
>>>>>>
>>>>>>>
>>>>>>> It would be much simpler if hypervisor side does not need to provide
>>>>>>> the p2m resetting logic, and we can support live migration at the
>> same
>>>>>>> time then. :)
>>>>>>>
>>>>>>
>>>>>> That really should not be hypervisor's job.
>>>>>>
>>>>>>    Paul
>>>>>>
>>>>>
>>>>> Oh. So let's just remove the p2m type recalculation code from this
>>>>> patch, no need to call p2m_change_entry_type_global, and no need to
>>>>> worry about the log dirty part.
>>>>>
>>>>> George, do you think this acceptable?
>>>>
>>>> Sorry, just to make sure I understand your intentions:
>>>>
>>>> 1. The ioreq servers will be responsible for changing all the
>>>> p2m_ioreq_server types back to p2m_ram_rw before detaching
>>>> 2. Xen will refuse to set logdirty mode if there are any attached ioreq
>> servers
>>>>
>>>> The one problem with #1 is what to do if the ioreq server is buggy /
>>>> killed, and forgets / is unable to clean up its ioreq_server entries?
>>>>
>>>
>>> If we ever want to meaningfully migrate a VM with XenGT enabled then I
>> think it has to be possible to migrate the p2m_ioreq_server page types as-is.
>> I'd expect a new instance of the xengt emulator to claim them on the far and
>> any emulator state be transferred in a similar way to which qemu state is
>> transferred such that vgpu emulation can be continued after the migration.
>>> After all, it's not like we magically reset the type of mmio-dm pages on
>> migration. It is just expected that a new instance of QEMU is spawned to
>> take care of them on the far end when the VM is unpaused.
>>
>> I don't see the implication you're trying to make regarding the
>> attach/detach interface.
>>
>> In any case, is it really not the case that the devicemodels on the
>> far side have to re-register which IO ranges they're interested in?
>> Shouldn't the devicemodel on the far side re-register the pfns it
>> wants to watch as well?  That again seems like the most robust
>> solution, rather than assuming that the pfns you expect to have been
>> marked are already marked.  To do otherwise risks bugs where 1) pages
>> you think were being watched aren't 2) pages being watched that you
>> don't care about (and you don't think to un-watch them because you
>> didn't think you'd watched them in the first place).
>>
>
> That sounds a little inefficient when there could be thousands of pages in play. If the p2m types were faithfully reproduced in the receiving domain then the new incarnation of the emulator just needs to claim the type rather than making thousands of set-mem-type hypercalls as well. I agree that the information would need to be transferred though, otherwise the new incarnation wouldn't be able to reset the page types when it was done with them.
>

I agree. For XenGT live migration, the device model would probably
need to take an active role like this.

>> On a slightly different topic: come to think of it, there's no need to
>> switch the p2m type for ioreq_server when we turn on logdirty mode; we
>> just have to call paging_mark_dirty() if it's a write, don't we?
>>

Yes. Unlike log-dirty pages, which will be changed back to p2m_ram_rw
after the first EPT violation (to keep it simple, Intel PML is not
considered here), pages with p2m_ioreq_server will continue to be
trapped. Since this patch series is only for the basic functionality
of XenGT, I suggest we not include this until we plan to support
XenGT live migration in the future. At that point, some kind of
optimization may be necessary (e.g. how to remove the redundant
paging_mark_dirty() calls for write-protected pages).


>
> That sounds right to me.
>
>   Paul
>
>>  -George

B.R.
Yu



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-21 12:24                           ` Yu, Zhang
@ 2016-04-21 13:31                             ` Paul Durrant
  2016-04-21 13:48                               ` Yu, Zhang
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-21 13:31 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 21 April 2016 13:25
> To: George Dunlap; Paul Durrant; Jan Beulich; Wei Liu
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> 
> 
> On 4/21/2016 1:06 AM, George Dunlap wrote:
> > On 20/04/16 17:58, Paul Durrant wrote:
> >>> -----Original Message-----
> >>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> Of Jan
> >>> Beulich
> >>> Sent: 20 April 2016 17:53
> >>> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
> >>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> >>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> >>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >>> p2m_mmio_write_dm to p2m_ioreq_server
> >>>
> >>>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
> >>>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
> >>> <george.dunlap@citrix.com> wrote:
> >>>>> On 19/04/16 12:02, Yu, Zhang wrote:
> >>>>>> So I suppose the only place we need change for this patch is
> >>>>>> for hvmmem_type_t, which should be defined like this?
> >>>>>>
> >>>>>> typedef enum {
> >>>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >>>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >>>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
> >>> model */
> >>>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
> >>>>>>     HVMMEM_ioreq_server
> >>>>>> #else
> >>>>>>     HVMMEM_mmio_write_dm
> >>>>>> #endif
> >>>>>> } hvmmem_type_t;
> >>>>>>
> >>>>>> Besides, does 4.7 still accept freeze exception? It would be great
> >>>>>> if we can get an approval for this.
> >>>>>
> >>>>> Wait, do we *actually* need this?  Is anyone actually using this?
> >>>>>
> >>>>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery
> as
> >>>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
> >>>>> Userspace Working, but you can break it to see if anyone complains
> first.
> >>>
> >>> We don't normally do it like that - we aim at keeping things compatible
> >>> right away. I don't know of a case where we would have knowingly
> broken
> >>> compatibility for users of the public headers (leaving aside tool stack only
> >>> stuff of course).
> >>>
> >>>> Going further than this:
> >>>>
> >>>> The proposed patch series not only changes the name, it changes the
> >>>> functionality.  We do not want code to *compile* against 4.7 and then
> >>>> not *work* against 4.7; and the worst of all is to compile and sort of
> >>>> work but do it incorrectly.
> >>>
> >>> I had the impression that the renaming patch was what it is - a renaming
> >>> patch, without altering behavior.
> >>>
> >>>> Does the ioreq server have a way of asking Xen what version of the ABI
> >>>> it's providing?  I'm assuming the answer is "no"; in which case code
> >>>> that is compiled against the 4.6 interface but run on a 4.8 interface
> >>>> that looks like this will fail in a somewhat unpredictable way.
> >>>
> >>> The only thing it can do is ask for the Xen version. The ABI version is not
> >>> being returned by anything (but perhaps should be).
> >>>
> >>>> Given that:
> >>>>
> >>>> 1. When we do check the ioreq server functionality in, what's the
> >>>> correct way to deal with code that wants to use the old interface, and
> >>>> what do we do with code compiled against the old interface but
> running
> >>>> on the new one?
> >>>
> >>> For the full series I'm not sure I can really tell.But as said, for the rename
> >>> patch alone I thought it is just a rename. And that's what we want to get
> >>> in (see Paul's earlier reply - he wants to see the old name gone, so it
> won't
> >>> be used any further).
> >>>
> >>>> 2. What's the best thing to do for this release?
> >>>
> >>> If the entire series (no matter whether to go in now or later) is changing
> >>> behavior, then the only choice is to consider the currently used enum
> >>> value burnt, and use a fresh one for the new semantics.
> >>
> >> It sounds like that would be best way. If we don't so that then we have to
> maintain the write-dm semantics for pages of that type unless the type is
> claimed (by using the new hypercall) and that's bit icky. I much prefer that
> pages of the new type are treated as RAM until claimed.
> >
> > I think the only sensible way to keep the enum is to also keep the
> > functionality, which would mean using *another* p2m type for
> ioreq_server.
> >
> > Given that the functionality isn't going away for 4.7, I don't see an
> > urgent need to remove the enum; but if Paul does, then a patch renaming
> > it to HVMMEM_unused would be the way forward then I guess.  Once the
> > underlying p2m type goes away, you'll want to return -EINVAL for this
> > enum value.
> >
> 
> So the enum would be sth. like this?
> 
> typedef enum {
>      HVMMEM_ram_rw,        /* Normal read/write guest RAM */
>      HVMMEM_ram_ro,        /* Read-only; writes are discarded */
>      HVMMEM_mmio_dm,       /* Reads and write go to the device model */
> #if __XEN_INTERFACE_VERSION__ < 0x00040700
>      HVMMEM_mmio_write_dm, /* Read-only; writes go to the device model
> */
> #else
>      HVMMEM_unused,
> #endif
>      HVMMEM_ioreq_server
> } hvmmem_type_t;
> 

I believe that's correct, but presumably there needs to be a change to the hypervisor, since any reference there to HVMMEM_mmio_write_dm (which I think is limited to the get and set mem type code in hvm.c) will now need to map HVMMEM_unused to the old p2m_mmio_write_dm type.

  Paul

> Thanks
> Yu


* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-21 13:31                             ` Paul Durrant
@ 2016-04-21 13:48                               ` Yu, Zhang
  2016-04-21 13:56                                 ` Paul Durrant
  0 siblings, 1 reply; 82+ messages in thread
From: Yu, Zhang @ 2016-04-21 13:48 UTC (permalink / raw)
  To: Paul Durrant, George Dunlap, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima



On 4/21/2016 9:31 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 21 April 2016 13:25
>> To: George Dunlap; Paul Durrant; Jan Beulich; Wei Liu
>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>> p2m_mmio_write_dm to p2m_ioreq_server
>>
>>
>>
>> On 4/21/2016 1:06 AM, George Dunlap wrote:
>>> On 20/04/16 17:58, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
>> Of Jan
>>>>> Beulich
>>>>> Sent: 20 April 2016 17:53
>>>>> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
>>>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>>>>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>>>
>>>>>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM >>>
>>>>>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
>>>>> <george.dunlap@citrix.com> wrote:
>>>>>>> On 19/04/16 12:02, Yu, Zhang wrote:
>>>>>>>> So I suppose the only place we need change for this patch is
>>>>>>>> for hvmmem_type_t, which should be defined like this?
>>>>>>>>
>>>>>>>> typedef enum {
>>>>>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>>>>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>>>>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
>>>>> model */
>>>>>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>>>>>>>     HVMMEM_ioreq_server
>>>>>>>> #else
>>>>>>>>     HVMMEM_mmio_write_dm
>>>>>>>> #endif
>>>>>>>> } hvmmem_type_t;
>>>>>>>>
>>>>>>>> Besides, does 4.7 still accept freeze exception? It would be great
>>>>>>>> if we can get an approval for this.
>>>>>>>
>>>>>>> Wait, do we *actually* need this?  Is anyone actually using this?
>>>>>>>
>>>>>>> I'd say remove it, and if anyone complains, *then* do the #ifdef'ery
>> as
>>>>>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
>>>>>>> Userspace Working, but you can break it to see if anyone complains
>> first.
>>>>>
>>>>> We don't normally do it like that - we aim at keeping things compatible
>>>>> right away. I don't know of a case where we would have knowingly
>> broken
>>>>> compatibility for users of the public headers (leaving aside tool stack only
>>>>> stuff of course).
>>>>>
>>>>>> Going further than this:
>>>>>>
>>>>>> The proposed patch series not only changes the name, it changes the
>>>>>> functionality.  We do not want code to *compile* against 4.7 and then
>>>>>> not *work* against 4.7; and the worst of all is to compile and sort of
>>>>>> work but do it incorrectly.
>>>>>
>>>>> I had the impression that the renaming patch was what it is - a renaming
>>>>> patch, without altering behavior.
>>>>>
>>>>>> Does the ioreq server have a way of asking Xen what version of the ABI
>>>>>> it's providing?  I'm assuming the answer is "no"; in which case code
>>>>>> that is compiled against the 4.6 interface but run on a 4.8 interface
>>>>>> that looks like this will fail in a somewhat unpredictable way.
>>>>>
>>>>> The only thing it can do is ask for the Xen version. The ABI version is not
>>>>> being returned by anything (but perhaps should be).
>>>>>
>>>>>> Given that:
>>>>>>
>>>>>> 1. When we do check the ioreq server functionality in, what's the
>>>>>> correct way to deal with code that wants to use the old interface, and
>>>>>> what do we do with code compiled against the old interface but
>> running
>>>>>> on the new one?
>>>>>
>>>>> For the full series I'm not sure I can really tell.But as said, for the rename
>>>>> patch alone I thought it is just a rename. And that's what we want to get
>>>>> in (see Paul's earlier reply - he wants to see the old name gone, so it
>> won't
>>>>> be used any further).
>>>>>
>>>>>> 2. What's the best thing to do for this release?
>>>>>
>>>>> If the entire series (no matter whether to go in now or later) is changing
>>>>> behavior, then the only choice is to consider the currently used enum
>>>>> value burnt, and use a fresh one for the new semantics.
>>>>
>>>> It sounds like that would be best way. If we don't so that then we have to
>> maintain the write-dm semantics for pages of that type unless the type is
>> claimed (by using the new hypercall) and that's bit icky. I much prefer that
>> pages of the new type are treated as RAM until claimed.
>>>
>>> I think the only sensible way to keep the enum is to also keep the
>>> functionality, which would mean using *another* p2m type for
>> ioreq_server.
>>>
>>> Given that the functionality isn't going away for 4.7, I don't see an
>>> urgent need to remove the enum; but if Paul does, then a patch renaming
>>> it to HVMMEM_unused would be the way forward then I guess.  Once the
>>> underlying p2m type goes away, you'll want to return -EINVAL for this
>>> enum value.
>>>
>>
>> So the enum would be sth. like this?
>>
>> typedef enum {
>>      HVMMEM_ram_rw,        /* Normal read/write guest RAM */
>>      HVMMEM_ram_ro,        /* Read-only; writes are discarded */
>>      HVMMEM_mmio_dm,       /* Reads and write go to the device model */
>> #if __XEN_INTERFACE_VERSION__ < 0x00040700
>>      HVMMEM_mmio_write_dm, /* Read-only; writes go to the device model
>> */
>> #else
>>      HVMMEM_unused,
>> #endif
>>      HVMMEM_ioreq_server
>> } hvmmem_type_t;
>>
>
> I believe that's correct, but presumably there needs to be a change to the hypervisor, since any reference there to HVMMEM_mmio_write_dm (which I think is limited to the get and set mem type code in hvm.c) will now need to map HVMMEM_unused to the old p2m_mmio_write_dm type.
>
Thank you, Paul.

But p2m_mmio_write_dm will not exist any more...
E.g. in hvmop_get_mem_type(), if type 0xf (p2m_ioreq_server) is
returned, we can just return HVMMEM_ioreq_server; no need to worry
about HVMMEM_mmio_write_dm.

Maybe we only need to change the beginning of hvmop_set_mem_type()
to something like this:

/* Interface types to internal p2m types */
static const p2m_type_t memtype[] = {
     [HVMMEM_ram_rw]  = p2m_ram_rw,
     [HVMMEM_ram_ro]  = p2m_ram_ro,
     [HVMMEM_mmio_dm] = p2m_mmio_dm,
     [HVMMEM_unused] = p2m_invalid,  /* this will be rejected later */
     [HVMMEM_ioreq_server] = p2m_ioreq_server
};
Later in the same routine, we would just reject the HVMMEM_unused
type in an if statement (with unlikely()).

>   Paul

B.R.
Yu



* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-21 13:48                               ` Yu, Zhang
@ 2016-04-21 13:56                                 ` Paul Durrant
  2016-04-21 14:09                                   ` George Dunlap
  0 siblings, 1 reply; 82+ messages in thread
From: Paul Durrant @ 2016-04-21 13:56 UTC (permalink / raw)
  To: Yu, Zhang, George Dunlap, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 21 April 2016 14:49
> To: Paul Durrant; George Dunlap; Jan Beulich; Wei Liu
> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> p2m_mmio_write_dm to p2m_ioreq_server
> 
> 
> 
> On 4/21/2016 9:31 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >> Sent: 21 April 2016 13:25
> >> To: George Dunlap; Paul Durrant; Jan Beulich; Wei Liu
> >> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> >> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> >> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >> p2m_mmio_write_dm to p2m_ioreq_server
> >>
> >>
> >>
> >> On 4/21/2016 1:06 AM, George Dunlap wrote:
> >>> On 20/04/16 17:58, Paul Durrant wrote:
> >>>>> -----Original Message-----
> >>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> >> Of Jan
> >>>>> Beulich
> >>>>> Sent: 20 April 2016 17:53
> >>>>> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
> >>>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
> >>>>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
> >>>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
> >>>>> p2m_mmio_write_dm to p2m_ioreq_server
> >>>>>
> >>>>>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM
> >>>
> >>>>>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
> >>>>> <george.dunlap@citrix.com> wrote:
> >>>>>>> On 19/04/16 12:02, Yu, Zhang wrote:
> >>>>>>>> So I suppose the only place we need change for this patch is
> >>>>>>>> for hvmmem_type_t, which should be defined like this?
> >>>>>>>>
> >>>>>>>> typedef enum {
> >>>>>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
> >>>>>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
> >>>>>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
> >>>>> model */
> >>>>>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
> >>>>>>>>     HVMMEM_ioreq_server
> >>>>>>>> #else
> >>>>>>>>     HVMMEM_mmio_write_dm
> >>>>>>>> #endif
> >>>>>>>> } hvmmem_type_t;
> >>>>>>>>
> >>>>>>>> Besides, does 4.7 still accept freeze exception? It would be great
> >>>>>>>> if we can get an approval for this.
> >>>>>>>
> >>>>>>> Wait, do we *actually* need this?  Is anyone actually using this?
> >>>>>>>
> >>>>>>> I'd say remove it, and if anyone complains, *then* do the
> #ifdef'ery
> >> as
> >>>>>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
> >>>>>>> Userspace Working, but you can break it to see if anyone complains
> >> first.
> >>>>>
> >>>>> We don't normally do it like that - we aim at keeping things compatible
> >>>>> right away. I don't know of a case where we would have knowingly
> >> broken
> >>>>> compatibility for users of the public headers (leaving aside tool stack
> only
> >>>>> stuff of course).
> >>>>>
> >>>>>> Going further than this:
> >>>>>>
> >>>>>> The proposed patch series not only changes the name, it changes
> the
> >>>>>> functionality.  We do not want code to *compile* against 4.7 and
> then
> >>>>>> not *work* against 4.7; and the worst of all is to compile and sort of
> >>>>>> work but do it incorrectly.
> >>>>>
> >>>>> I had the impression that the renaming patch was what it is - a
> renaming
> >>>>> patch, without altering behavior.
> >>>>>
> >>>>>> Does the ioreq server have a way of asking Xen what version of the
> ABI
> >>>>>> it's providing?  I'm assuming the answer is "no"; in which case code
> >>>>>> that is compiled against the 4.6 interface but run on a 4.8 interface
> >>>>>> that looks like this will fail in a somewhat unpredictable way.
> >>>>>
> >>>>> The only thing it can do is ask for the Xen version. The ABI version is
> not
> >>>>> being returned by anything (but perhaps should be).
> >>>>>
> >>>>>> Given that:
> >>>>>>
> >>>>>> 1. When we do check the ioreq server functionality in, what's the
> >>>>>> correct way to deal with code that wants to use the old interface,
> and
> >>>>>> what do we do with code compiled against the old interface but
> >> running
> >>>>>> on the new one?
> >>>>>
> >>>>> For the full series I'm not sure I can really tell.But as said, for the
> rename
> >>>>> patch alone I thought it is just a rename. And that's what we want to
> get
> >>>>> in (see Paul's earlier reply - he wants to see the old name gone, so it
> >> won't
> >>>>> be used any further).
> >>>>>
> >>>>>> 2. What's the best thing to do for this release?
> >>>>>
> >>>>> If the entire series (no matter whether to go in now or later) is
> changing
> >>>>> behavior, then the only choice is to consider the currently used enum
> >>>>> value burnt, and use a fresh one for the new semantics.
> >>>>
> >>>> It sounds like that would be best way. If we don't so that then we have
> to
> >> maintain the write-dm semantics for pages of that type unless the type is
> >> claimed (by using the new hypercall) and that's bit icky. I much prefer that
> >> pages of the new type are treated as RAM until claimed.
> >>>
> >>> I think the only sensible way to keep the enum is to also keep the
> >>> functionality, which would mean using *another* p2m type for
> >> ioreq_server.
> >>>
> >>> Given that the functionality isn't going away for 4.7, I don't see an
> >>> urgent need to remove the enum; but if Paul does, then a patch
> renaming
> >>> it to HVMMEM_unused would be the way forward, I guess.  Once
> the
> >>> underlying p2m type goes away, you'll want to return -EINVAL for this
> >>> enum value.
> >>>
> >>
> >> So the enum would be sth. like this?
> >>
> >> typedef enum {
> >>      HVMMEM_ram_rw,        /* Normal read/write guest RAM */
> >>      HVMMEM_ram_ro,        /* Read-only; writes are discarded */
> >>      HVMMEM_mmio_dm,       /* Reads and write go to the device model */
> >> #if __XEN_INTERFACE_VERSION__ < 0x00040700
> >>      HVMMEM_mmio_write_dm, /* Read-only; writes go to the device
> model
> >> */
> >> #else
> >>      HVMMEM_unused,
> >> #endif
> >>      HVMMEM_ioreq_server
> >> } hvmmem_type_t;
> >>
> >
> > I believe that's correct, but presumably there needs to be a change to the
> hypervisor since any reference there to HVMMEM_mmio_write_dm (which I
> think is limited to the get and set mem type code in hvm.c) will now need to
> map HVMMEM_unused to the old p2m_mmio_write_dm type.
> >
> Thank you, Paul.
> 
> But p2m_mmio_write_dm will not exist any more...
> E.g. in hvmop_get_mem_type(), if type 0xf (p2m_ioreq_server) is
> returned, we could just return HVMMEM_ioreq_server. No need to
> worry about the HVMMEM_mmio_write_dm.
> 
> Maybe we only need to change the beginning of hvmop_set_mem_type()
> to sth. like this:
> 
> /* Interface types to internal p2m types */
> static const p2m_type_t memtype[] = {
>      [HVMMEM_ram_rw]  = p2m_ram_rw,
>      [HVMMEM_ram_ro]  = p2m_ram_ro,
>      [HVMMEM_mmio_dm] = p2m_mmio_dm,
>      [HVMMEM_unused] = p2m_invalid,  /* this will be rejected later */
>      [HVMMEM_ioreq_server] = p2m_ioreq_server
> };
> and later in the same routine, just reject the HVMMEM_unused type, in
> an if() statement (using unlikely()).
> 

As long as everyone is in agreement that we can break the functionality that exists in 4.6.1 (and presumably 4.7 now), then that’s ok.

  Paul

> >   Paul
> 
> B.R.
> Yu
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server
  2016-04-21 13:56                                 ` Paul Durrant
@ 2016-04-21 14:09                                   ` George Dunlap
  0 siblings, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-21 14:09 UTC (permalink / raw)
  To: Paul Durrant, Yu, Zhang, Jan Beulich, Wei Liu
  Cc: Kevin Tian, Keir (Xen.org), Andrew Cooper, Tim (Xen.org),
	xen-devel, zhiyuan.lv, jun.nakajima

On 21/04/16 14:56, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 21 April 2016 14:49
>> To: Paul Durrant; George Dunlap; Jan Beulich; Wei Liu
>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>> p2m_mmio_write_dm to p2m_ioreq_server
>>
>>
>>
>> On 4/21/2016 9:31 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>> Sent: 21 April 2016 13:25
>>>> To: George Dunlap; Paul Durrant; Jan Beulich; Wei Liu
>>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>>>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>>
>>>>
>>>>
>>>> On 4/21/2016 1:06 AM, George Dunlap wrote:
>>>>> On 20/04/16 17:58, Paul Durrant wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
>>>> Of Jan
>>>>>>> Beulich
>>>>>>> Sent: 20 April 2016 17:53
>>>>>>> To: George Dunlap; Paul Durrant; Wei Liu; yu.c.zhang@linux.intel.com
>>>>>>> Cc: Kevin Tian; Keir (Xen.org); Andrew Cooper; Tim (Xen.org); xen-
>>>>>>> devel@lists.xen.org; zhiyuan.lv@intel.com; jun.nakajima@intel.com
>>>>>>> Subject: Re: [Xen-devel] [PATCH v2 2/3] x86/ioreq server: Rename
>>>>>>> p2m_mmio_write_dm to p2m_ioreq_server
>>>>>>>
>>>>>>>>>> George Dunlap <george.dunlap@citrix.com> 04/20/16 6:30 PM
>>>>>
>>>>>>>> On Wed, Apr 20, 2016 at 4:02 PM, George Dunlap
>>>>>>> <george.dunlap@citrix.com> wrote:
>>>>>>>>> On 19/04/16 12:02, Yu, Zhang wrote:
>>>>>>>>>> So I suppose the only place we need change for this patch is
>>>>>>>>>> for hvmmem_type_t, which should be defined like this?
>>>>>>>>>>
>>>>>>>>>> typedef enum {
>>>>>>>>>>     HVMMEM_ram_rw,             /* Normal read/write guest RAM */
>>>>>>>>>>     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
>>>>>>>>>>     HVMMEM_mmio_dm,            /* Reads and write go to the device
>>>>>>> model */
>>>>>>>>>> #if __XEN_INTERFACE_VERSION__ >= 0x00040700
>>>>>>>>>>     HVMMEM_ioreq_server
>>>>>>>>>> #else
>>>>>>>>>>     HVMMEM_mmio_write_dm
>>>>>>>>>> #endif
>>>>>>>>>> } hvmmem_type_t;
>>>>>>>>>>
>>>>>>>>>> Besides, does 4.7 still accept freeze exception? It would be great
>>>>>>>>>> if we can get an approval for this.
>>>>>>>>>
>>>>>>>>> Wait, do we *actually* need this?  Is anyone actually using this?
>>>>>>>>>
>>>>>>>>> I'd say remove it, and if anyone complains, *then* do the
>> #ifdef'ery
>>>> as
>>>>>>>>> a bug-fix.  I'm pretty sure that's Linux's policy -- You Must Keep
>>>>>>>>> Userspace Working, but you can break it to see if anyone complains
>>>> first.
>>>>>>>
>>>>>>> We don't normally do it like that - we aim at keeping things compatible
>>>>>>> right away. I don't know of a case where we would have knowingly
>>>> broken
>>>>>>> compatibility for users of the public headers (leaving aside tool stack
>> only
>>>>>>> stuff of course).
>>>>>>>
>>>>>>>> Going further than this:
>>>>>>>>
>>>>>>>> The proposed patch series not only changes the name, it changes
>> the
>>>>>>>> functionality.  We do not want code to *compile* against 4.7 and
>> then
>>>>>>>> not *work* against 4.7; and the worst of all is to compile and sort of
>>>>>>>> work but do it incorrectly.
>>>>>>>
>>>>>>> I had the impression that the renaming patch was what it is - a
>> renaming
>>>>>>> patch, without altering behavior.
>>>>>>>
>>>>>>>> Does the ioreq server have a way of asking Xen what version of the
>> ABI
>>>>>>>> it's providing?  I'm assuming the answer is "no"; in which case code
>>>>>>>> that is compiled against the 4.6 interface but run on a 4.8 interface
>>>>>>>> that looks like this will fail in a somewhat unpredictable way.
>>>>>>>
>>>>>>> The only thing it can do is ask for the Xen version. The ABI version is
>> not
>>>>>>> being returned by anything (but perhaps should be).
>>>>>>>
>>>>>>>> Given that:
>>>>>>>>
>>>>>>>> 1. When we do check the ioreq server functionality in, what's the
>>>>>>>> correct way to deal with code that wants to use the old interface,
>> and
>>>>>>>> what do we do with code compiled against the old interface but
>>>> running
>>>>>>>> on the new one?
>>>>>>>
>>>>>>> For the full series I'm not sure I can really tell. But as said, for the
>> rename
>>>>>>> patch alone I thought it is just a rename. And that's what we want to
>> get
>>>>>>> in (see Paul's earlier reply - he wants to see the old name gone, so it
>>>> won't
>>>>>>> be used any further).
>>>>>>>
>>>>>>>> 2. What's the best thing to do for this release?
>>>>>>>
>>>>>>> If the entire series (no matter whether to go in now or later) is
>> changing
>>>>>>> behavior, then the only choice is to consider the currently used enum
>>>>>>> value burnt, and use a fresh one for the new semantics.
>>>>>>
>>>>>> It sounds like that would be the best way. If we don't do that then we have
>> to
>>>> maintain the write-dm semantics for pages of that type unless the type is
>>>> claimed (by using the new hypercall) and that's a bit icky. I much prefer that
>>>> pages of the new type are treated as RAM until claimed.
>>>>>
>>>>> I think the only sensible way to keep the enum is to also keep the
>>>>> functionality, which would mean using *another* p2m type for
>>>> ioreq_server.
>>>>>
>>>>> Given that the functionality isn't going away for 4.7, I don't see an
>>>>> urgent need to remove the enum; but if Paul does, then a patch
>> renaming
>>>>> it to HVMMEM_unused would be the way forward, I guess.  Once
>> the
>>>>> underlying p2m type goes away, you'll want to return -EINVAL for this
>>>>> enum value.
>>>>>
>>>>
>>>> So the enum would be sth. like this?
>>>>
>>>> typedef enum {
>>>>      HVMMEM_ram_rw,        /* Normal read/write guest RAM */
>>>>      HVMMEM_ram_ro,        /* Read-only; writes are discarded */
>>>>      HVMMEM_mmio_dm,       /* Reads and write go to the device model */
>>>> #if __XEN_INTERFACE_VERSION__ < 0x00040700
>>>>      HVMMEM_mmio_write_dm, /* Read-only; writes go to the device
>> model
>>>> */
>>>> #else
>>>>      HVMMEM_unused,
>>>> #endif
>>>>      HVMMEM_ioreq_server
>>>> } hvmmem_type_t;
>>>>
>>>
>>> I believe that's correct, but presumably there needs to be a change to the
>> hypervisor since any reference there to HVMMEM_mmio_write_dm (which I
>> think is limited to the get and set mem type code in hvm.c) will now need to
>> map HVMMEM_unused to the old p2m_mmio_write_dm type.
>>>
>> Thank you, Paul.
>>
>> But p2m_mmio_write_dm will not exist any more...
>> E.g. in hvmop_get_mem_type(), if type 0xf (p2m_ioreq_server) is
>> returned, we could just return HVMMEM_ioreq_server. No need to
>> worry about the HVMMEM_mmio_write_dm.
>>
>> Maybe we only need to change the beginning of hvmop_set_mem_type()
>> to sth. like this:
>>
>> /* Interface types to internal p2m types */
>> static const p2m_type_t memtype[] = {
>>      [HVMMEM_ram_rw]  = p2m_ram_rw,
>>      [HVMMEM_ram_ro]  = p2m_ram_ro,
>>      [HVMMEM_mmio_dm] = p2m_mmio_dm,
>>      [HVMMEM_unused] = p2m_invalid,  /* this will be rejected later */
>>      [HVMMEM_ioreq_server] = p2m_ioreq_server
>> };
>> and later in the same routine, just reject the HVMMEM_unused type, in
>> an if() statement (using unlikely()).
>>
> 
> As long as everyone is in agreement that we can break the functionality that exists in 4.6.1 (and presumably 4.7 now), then that’s ok.

I think we're all in agreement that we can *remove* functionality as
long as (from Xen's perspective) it fails gracefully.  Which means
that if anyone uses the hvm_mem_type enum value previously allocated to
HVMMEM_mmio_write_dm, Xen returns an error (one way or another).

The exact method for doing so is up for review.

 -George




* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-21 13:21                                   ` Yu, Zhang
@ 2016-04-22 11:27                                     ` Wei Liu
  2016-04-22 11:30                                       ` George Dunlap
  0 siblings, 1 reply; 82+ messages in thread
From: Wei Liu @ 2016-04-22 11:27 UTC (permalink / raw)
  To: Yu, Zhang
  Cc: Kevin Tian, Wei Liu, jun.nakajima, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, Paul Durrant, Lv, Zhiyuan, Jan Beulich

On Thu, Apr 21, 2016 at 09:21:17PM +0800, Yu, Zhang wrote:
[...]
> It is a pity. And I'm the one to be blamed. Thank you very much

Please don't blame yourself. I don't think anyone should be blamed here.

I can only speak for myself, but I feel grateful when other people want
to collaborate with upstream projects.  We just need to realise
collaboration takes time and effort and things don't always come out as
expected.  I've been on both sides, as both contributor and maintainer,
so I understand the frustration.

I want to publicly thank you (and all other engineers at Intel) for your
persistent effort for pushing things upstream and I believe things will
work out in the end.

Wei.



* Re: [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server
  2016-04-22 11:27                                     ` Wei Liu
@ 2016-04-22 11:30                                       ` George Dunlap
  0 siblings, 0 replies; 82+ messages in thread
From: George Dunlap @ 2016-04-22 11:30 UTC (permalink / raw)
  To: Wei Liu, Yu, Zhang
  Cc: Kevin Tian, jun.nakajima, Andrew Cooper, Tim (Xen.org),
	xen-devel, Paul Durrant, Lv, Zhiyuan, Jan Beulich

On 22/04/16 12:27, Wei Liu wrote:
> On Thu, Apr 21, 2016 at 09:21:17PM +0800, Yu, Zhang wrote:
> [...]
>> It is a pity. And I'm the one to be blamed. Thank you very much
> 
> Please don't blame yourself. I don't think anyone should be blamed here.

There are lots of things that might have led to a different outcome:

1. Yu Zhang might have noticed the issues earlier
2. Paul or I might have noticed the issues earlier
3. One of the x86 maintainers might have given a review earlier and
caught the issues
4. Yu Zhang, Paul, or I might have pinged one of the x86 maintainers to
get the review earlier

We all tried, but we all could perhaps have done a bit better.  It is a
shame it missed it by so little, but that's part of the reason we're
shortening the release cycles. :-)

 -George




end of thread, other threads:[~2016-04-22 11:30 UTC | newest]

Thread overview: 82+ messages
2016-03-31 10:53 [PATCH v2 0/3] x86/ioreq server: introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-03-31 10:53 ` [PATCH v2 1/3] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
2016-04-05 13:57   ` George Dunlap
2016-04-05 14:08     ` George Dunlap
2016-04-08 13:25   ` Andrew Cooper
2016-03-31 10:53 ` [PATCH v2 2/3] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
2016-04-05 14:38   ` George Dunlap
2016-04-08 13:26   ` Andrew Cooper
2016-04-08 21:48   ` Jan Beulich
2016-04-18  8:41     ` Paul Durrant
2016-04-18  9:10       ` George Dunlap
2016-04-18  9:14         ` Wei Liu
2016-04-18  9:45           ` Paul Durrant
2016-04-18 16:40       ` Jan Beulich
2016-04-18 16:45         ` Paul Durrant
2016-04-18 16:47           ` Jan Beulich
2016-04-18 16:58             ` Paul Durrant
2016-04-19 11:02               ` Yu, Zhang
2016-04-19 11:15                 ` Paul Durrant
2016-04-19 11:38                   ` Yu, Zhang
2016-04-19 11:50                     ` Paul Durrant
2016-04-19 16:51                     ` Jan Beulich
2016-04-20 14:59                       ` Wei Liu
2016-04-20 15:02                 ` George Dunlap
2016-04-20 16:30                   ` George Dunlap
2016-04-20 16:52                     ` Jan Beulich
2016-04-20 16:58                       ` Paul Durrant
2016-04-20 17:06                         ` George Dunlap
2016-04-20 17:09                           ` Paul Durrant
2016-04-21 12:24                           ` Yu, Zhang
2016-04-21 13:31                             ` Paul Durrant
2016-04-21 13:48                               ` Yu, Zhang
2016-04-21 13:56                                 ` Paul Durrant
2016-04-21 14:09                                   ` George Dunlap
2016-04-20 17:08                       ` George Dunlap
2016-04-21 12:04                       ` Yu, Zhang
2016-03-31 10:53 ` [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
     [not found]   ` <20160404082556.GC28633@deinos.phlegethon.org>
2016-04-05  6:01     ` Yu, Zhang
2016-04-06 17:13   ` George Dunlap
2016-04-07  7:01     ` Yu, Zhang
     [not found]       ` <CAFLBxZbLp2zWzCzQTaJNWbanQSmTJ57ZyTh0qaD-+YUn8o8pyQ@mail.gmail.com>
2016-04-08 10:39         ` George Dunlap
     [not found]         ` <5707839F.9060803@linux.intel.com>
2016-04-08 11:01           ` George Dunlap
2016-04-11 11:15             ` Yu, Zhang
2016-04-14 10:45               ` Yu, Zhang
2016-04-18 15:57                 ` Paul Durrant
2016-04-19  9:11                   ` Yu, Zhang
2016-04-19  9:21                     ` Paul Durrant
2016-04-19  9:44                       ` Yu, Zhang
2016-04-19 10:05                         ` Paul Durrant
2016-04-19 11:17                           ` Yu, Zhang
2016-04-19 11:47                             ` Paul Durrant
2016-04-19 11:59                               ` Yu, Zhang
2016-04-20 14:50                                 ` George Dunlap
2016-04-20 14:57                                   ` Paul Durrant
2016-04-20 15:37                                     ` George Dunlap
2016-04-20 16:30                                       ` Paul Durrant
2016-04-20 16:58                                         ` George Dunlap
2016-04-21 13:28                                         ` Yu, Zhang
2016-04-21 13:21                                   ` Yu, Zhang
2016-04-22 11:27                                     ` Wei Liu
2016-04-22 11:30                                       ` George Dunlap
2016-04-19  4:37                 ` Tian, Kevin
2016-04-19  9:21                   ` Yu, Zhang
2016-04-08 13:33   ` Andrew Cooper
2016-04-11 11:14     ` Yu, Zhang
2016-04-11 12:20       ` Andrew Cooper
2016-04-11 16:25         ` Jan Beulich
2016-04-08 22:28   ` Jan Beulich
2016-04-11 11:14     ` Yu, Zhang
2016-04-11 16:31       ` Jan Beulich
2016-04-12  9:37         ` Yu, Zhang
2016-04-12 15:08           ` Jan Beulich
2016-04-14  9:56             ` Yu, Zhang
2016-04-19  4:50               ` Tian, Kevin
2016-04-19  8:46                 ` Paul Durrant
2016-04-19  9:27                   ` Yu, Zhang
2016-04-19  9:40                     ` Paul Durrant
2016-04-19  9:49                       ` Yu, Zhang
2016-04-19 10:01                         ` Paul Durrant
2016-04-19  9:54                           ` Yu, Zhang
2016-04-19  9:15                 ` Yu, Zhang
2016-04-19  9:23                   ` Paul Durrant
