xen-devel.lists.xenproject.org archive mirror
* [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type.
@ 2016-07-12  9:02 Yu Zhang
  2016-07-12  9:02 ` [PATCH v5 1/4] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Yu Zhang @ 2016-07-12  9:02 UTC (permalink / raw)
  To: xen-devel; +Cc: zhiyuan.lv

XenGT leverages an ioreq server to track and forward accesses to GPU
I/O resources, e.g. the PPGTT (per-process graphic translation tables).
Currently, an ioreq server uses rangesets to track the BDF/PIO/MMIO
ranges to be emulated; to select an ioreq server, the rangesets are
searched to see if the I/O range is recorded. However, the number of
RAM pages to be tracked may exceed the upper limit of a rangeset.

Previously, one solution was proposed to refactor the rangeset and
extend its upper limit. However, after 12 rounds of discussion, we have
decided to drop this approach due to security concerns. This new patch
series instead introduces a new memory type, HVMMEM_ioreq_server, and
adds HVM operations to let an ioreq server claim ownership of RAM pages
with this type. Accesses to a page of this type will be handled by the
specified ioreq server directly.
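
As a rough illustration (not part of this series) of the intended flow
from a device model such as XenGT: claim the new type for the ioreq
server, mark the guest pages to be tracked, and then handle the write
requests forwarded by Xen. The sketch below assumes the existing libxc
wrapper xc_hvm_set_mem_type(); xc_claim_ioreq_mem_type() is a purely
hypothetical stand-in for issuing the mapping HVMOP added in patch 3.

  #include <xenctrl.h>

  /* Usage sketch only; xc_claim_ioreq_mem_type() is hypothetical. */
  static int track_ppgtt_page(xc_interface *xch, domid_t dom,
                              ioservid_t id, uint64_t gfn)
  {
      int rc;

      /* Once per ioreq server: claim write emulation for pages of the
       * new type (stand-in for HVMOP_map_mem_type_to_ioreq_server). */
      rc = xc_claim_ioreq_mem_type(xch, dom, id);
      if ( rc )
          return rc;

      /* Per tracked page: guest writes to it are now forwarded to the
       * claiming ioreq server instead of being handled as plain RAM. */
      return xc_hvm_set_mem_type(xch, dom, HVMMEM_ioreq_server, gfn, 1);
  }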

Yu Zhang (4):
  x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server.
  x86/ioreq server: Add new functions to get/set memory types.
  x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
    an ioreq server.
  x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an
    ioreq server unmaps.

 xen/arch/x86/hvm/emulate.c       |  33 +++-
 xen/arch/x86/hvm/hvm.c           | 395 ++++++++++++++++++++++++++-------------
 xen/arch/x86/hvm/ioreq.c         |  41 ++++
 xen/arch/x86/mm/hap/hap.c        |   9 +
 xen/arch/x86/mm/hap/nested_hap.c |   2 +-
 xen/arch/x86/mm/p2m-ept.c        |  14 +-
 xen/arch/x86/mm/p2m-pt.c         |  30 ++-
 xen/arch/x86/mm/p2m.c            |  77 ++++++++
 xen/arch/x86/mm/shadow/multi.c   |   3 +-
 xen/include/asm-x86/hvm/ioreq.h  |   2 +
 xen/include/asm-x86/p2m.h        |  36 +++-
 xen/include/public/hvm/hvm_op.h  |  34 +++-
 12 files changed, 528 insertions(+), 148 deletions(-)

-- 
1.9.1



* [PATCH v5 1/4] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server.
  2016-07-12  9:02 [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
@ 2016-07-12  9:02 ` Yu Zhang
  2016-07-12  9:02 ` [PATCH v5 2/4] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 32+ messages in thread
From: Yu Zhang @ 2016-07-12  9:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Andrew Cooper,
	Tim Deegan, Paul Durrant, zhiyuan.lv, Jun Nakajima

Previously the p2m type p2m_mmio_write_dm was introduced for write-
protected memory pages whose write operations are supposed to be
forwarded to and emulated by an ioreq server. Yet limitations of the
rangeset restrict the number of guest pages that can be write-protected.

This patch renames the p2m type p2m_mmio_write_dm to p2m_ioreq_server,
to reflect that this p2m type can be claimed by one ioreq server,
instead of being tracked inside the rangeset of an ioreq server. A new
memory type, HVMMEM_ioreq_server, is now used in the HVMOP_set/get_mem_type
interface to set/get this p2m type.

Follow-up patches will add the related HVMOP handling code to map/unmap
the type p2m_ioreq_server to/from an ioreq server. Without those
patches, memory type changes to HVMMEM_ioreq_server are still allowed;
in such cases, p2m_ioreq_server pages are treated the same as pages of
the previous type p2m_mmio_write_dm, and are tracked inside the ioreq
server's rangeset.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Tim Deegan <tim@xen.org>

changes in v5: 
  - Commit message changes - explained why we can allow memory
    type changes to HVMMEM_ioreq_server.
  - Add George's Acked-by.

changes in v4: 
  - According to George's comments, move the HVMMEM_unused part
    into a separate patch (which has already been accepted);
  - Removed George's Reviewed-by because of changes after v3.
  - According to Wei Liu's comments, change the format of the commit
    message.

changes in v3: 
  - According to Jan & George's comments, keep HVMMEM_mmio_write_dm
    for old xen interface versions, and replace it with HVMMEM_unused
    for xen interfaces newer than 4.7.0; For p2m_ioreq_server, a new 
    enum - HVMMEM_ioreq_server is introduced for the get/set mem type
    interfaces;
  - Add George's Reviewed-by and Acked-by from Tim & Andrew.

changes in v2: 
  - According to George Dunlap's comments, only rename the p2m type,
    with no behavior changes.
---
 xen/arch/x86/hvm/hvm.c          | 9 ++++++---
 xen/arch/x86/mm/p2m-ept.c       | 2 +-
 xen/arch/x86/mm/p2m-pt.c        | 2 +-
 xen/arch/x86/mm/shadow/multi.c  | 2 +-
 xen/include/asm-x86/p2m.h       | 4 ++--
 xen/include/public/hvm/hvm_op.h | 5 +++--
 6 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c89ab6e..4b51c57 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1865,7 +1865,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
      */
     if ( (p2mt == p2m_mmio_dm) || 
          (npfec.write_access &&
-          (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
+          (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
     {
         __put_gfn(p2m, gfn);
         if ( ap2m_active )
@@ -5565,6 +5565,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             get_gfn_query_unlocked(d, a.pfn, &t);
             if ( p2m_is_mmio(t) )
                 a.mem_type =  HVMMEM_mmio_dm;
+            else if ( t == p2m_ioreq_server )
+                a.mem_type = HVMMEM_ioreq_server;
             else if ( p2m_is_readonly(t) )
                 a.mem_type =  HVMMEM_ram_ro;
             else if ( p2m_is_ram(t) )
@@ -5595,7 +5597,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             [HVMMEM_ram_rw]  = p2m_ram_rw,
             [HVMMEM_ram_ro]  = p2m_ram_ro,
             [HVMMEM_mmio_dm] = p2m_mmio_dm,
-            [HVMMEM_unused] = p2m_invalid
+            [HVMMEM_unused] = p2m_invalid,
+            [HVMMEM_ioreq_server] = p2m_ioreq_server
         };
 
         if ( copy_from_guest(&a, arg, 1) )
@@ -5644,7 +5647,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             }
             if ( !p2m_is_ram(t) &&
                  (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
-                 (t != p2m_mmio_write_dm || a.hvmmem_type != HVMMEM_ram_rw) )
+                 (t != p2m_ioreq_server || a.hvmmem_type != HVMMEM_ram_rw) )
             {
                 put_gfn(d, pfn);
                 goto setmemtype_fail;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 7166c71..7adc77d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -171,7 +171,7 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
         case p2m_grant_map_ro:
-        case p2m_mmio_write_dm:
+        case p2m_ioreq_server:
             entry->r = 1;
             entry->w = entry->x = 0;
             entry->a = !!cpu_has_vmx_ept_ad;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 4980934..05aaf8f 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -94,7 +94,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
     default:
         return flags | _PAGE_NX_BIT;
     case p2m_grant_map_ro:
-    case p2m_mmio_write_dm:
+    case p2m_ioreq_server:
         return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
     case p2m_ram_ro:
     case p2m_ram_logdirty:
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index dfe59a2..8c4b20e 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3226,7 +3226,7 @@ static int sh_page_fault(struct vcpu *v,
 
     /* Need to hand off device-model MMIO to the device model */
     if ( p2mt == p2m_mmio_dm
-         || (p2mt == p2m_mmio_write_dm && ft == ft_demand_write) )
+         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
     {
         gpa = guest_walk_to_gpa(&gw);
         goto mmio;
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 4ab3574..6785669 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -71,7 +71,7 @@ typedef enum {
     p2m_ram_shared = 12,          /* Shared or sharable memory */
     p2m_ram_broken = 13,          /* Broken page, access cause domain crash */
     p2m_map_foreign  = 14,        /* ram pages from foreign domain */
-    p2m_mmio_write_dm = 15,       /* Read-only; writes go to the device model */
+    p2m_ioreq_server = 15,
 } p2m_type_t;
 
 /* Modifiers to the query */
@@ -112,7 +112,7 @@ typedef unsigned int p2m_query_t;
                       | p2m_to_mask(p2m_ram_ro)         \
                       | p2m_to_mask(p2m_grant_map_ro)   \
                       | p2m_to_mask(p2m_ram_shared)     \
-                      | p2m_to_mask(p2m_mmio_write_dm))
+                      | p2m_to_mask(p2m_ioreq_server))
 
 /* Write-discard types, which should discard the write operations */
 #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro)     \
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index ebb907a..b3e45cf 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -84,11 +84,12 @@ typedef enum {
     HVMMEM_ram_ro,             /* Read-only; writes are discarded */
     HVMMEM_mmio_dm,            /* Reads and write go to the device model */
 #if __XEN_INTERFACE_VERSION__ < 0x00040700
-    HVMMEM_mmio_write_dm       /* Read-only; writes go to the device model */
+    HVMMEM_mmio_write_dm,      /* Read-only; writes go to the device model */
 #else
-    HVMMEM_unused              /* Placeholder; setting memory to this type
+    HVMMEM_unused,             /* Placeholder; setting memory to this type
                                   will fail for code after 4.7.0 */
 #endif
+    HVMMEM_ioreq_server
 } hvmmem_type_t;
 
 /* Following tools-only interfaces may change in future. */
-- 
1.9.1



* [PATCH v5 2/4] x86/ioreq server: Add new functions to get/set memory types.
  2016-07-12  9:02 [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-07-12  9:02 ` [PATCH v5 1/4] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
@ 2016-07-12  9:02 ` Yu Zhang
  2016-07-12  9:02 ` [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 32+ messages in thread
From: Yu Zhang @ 2016-07-12  9:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, zhiyuan.lv, Jan Beulich

For clarity, this patch breaks the code that sets/gets memory types out
of do_hvm_op() into dedicated functions: hvmop_set/get_mem_type().
Likewise, the checks for whether a memory type change is allowed are
broken out into a separate function, called by hvmop_set_mem_type().

There is no intentional functional change in this patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

changes in v4: 
  - According to Wei Liu's comments, change the format of the commit
    message.
  
changes in v3: 
  - Add Andrew's Acked-by and George's Reviewed-by.

changes in v2: 
  - According to George Dunlap's comments, follow the "set rc /
    do something / goto out" pattern in hvmop_get_mem_type().
---
 xen/arch/x86/hvm/hvm.c | 288 +++++++++++++++++++++++++++----------------------
 1 file changed, 161 insertions(+), 127 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4b51c57..4453ec0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5345,6 +5345,61 @@ static int do_altp2m_op(
     return rc;
 }
 
+static int hvmop_get_mem_type(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_get_mem_type_t) arg)
+{
+    struct xen_hvm_get_mem_type a;
+    struct domain *d;
+    p2m_type_t t;
+    int rc;
+
+    if ( copy_from_guest(&a, arg, 1) )
+        return -EFAULT;
+
+    d = rcu_lock_domain_by_any_id(a.domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    rc = xsm_hvm_param(XSM_TARGET, d, HVMOP_get_mem_type);
+    if ( rc )
+        goto out;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    /*
+     * Use get_gfn query as we are interested in the current
+     * type, not in allocating or unsharing. That'll happen
+     * on access.
+     */
+    get_gfn_query_unlocked(d, a.pfn, &t);
+    if ( p2m_is_mmio(t) )
+        a.mem_type =  HVMMEM_mmio_dm;
+    else if ( t == p2m_ioreq_server )
+        a.mem_type = HVMMEM_ioreq_server;
+    else if ( p2m_is_readonly(t) )
+        a.mem_type =  HVMMEM_ram_ro;
+    else if ( p2m_is_ram(t) )
+        a.mem_type =  HVMMEM_ram_rw;
+    else if ( p2m_is_pod(t) )
+        a.mem_type =  HVMMEM_ram_rw;
+    else if ( p2m_is_grant(t) )
+        a.mem_type =  HVMMEM_ram_rw;
+    else
+        a.mem_type =  HVMMEM_mmio_dm;
+
+    rc = -EFAULT;
+    if ( __copy_to_guest(arg, &a, 1) )
+        goto out;
+    rc = 0;
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
 /*
  * Note that this value is effectively part of the ABI, even if we don't need
  * to make it a formal part of it: A guest suspended for migration in the
@@ -5353,6 +5408,107 @@ static int do_altp2m_op(
  */
 #define HVMOP_op_mask 0xff
 
+static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
+{
+    if ( p2m_is_ram(old) ||
+         (p2m_is_hole(old) && new == p2m_mmio_dm) ||
+         (old == p2m_ioreq_server && new == p2m_ram_rw) )
+        return 1;
+
+    return 0;
+}
+
+static int hvmop_set_mem_type(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_set_mem_type_t) arg,
+    unsigned long *iter)
+{
+    unsigned long start_iter = *iter;
+    struct xen_hvm_set_mem_type a;
+    struct domain *d;
+    int rc;
+
+    /* Interface types to internal p2m types */
+    static const p2m_type_t memtype[] = {
+        [HVMMEM_ram_rw]  = p2m_ram_rw,
+        [HVMMEM_ram_ro]  = p2m_ram_ro,
+        [HVMMEM_mmio_dm] = p2m_mmio_dm,
+        [HVMMEM_unused] = p2m_invalid,
+        [HVMMEM_ioreq_server] = p2m_ioreq_server
+    };
+
+    if ( copy_from_guest(&a, arg, 1) )
+        return -EFAULT;
+
+    rc = rcu_lock_remote_domain_by_id(a.domid, &d);
+    if ( rc != 0 )
+        return rc;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    rc = xsm_hvm_control(XSM_DM_PRIV, d, HVMOP_set_mem_type);
+    if ( rc )
+        goto out;
+
+    rc = -EINVAL;
+    if ( a.nr < start_iter ||
+         ((a.first_pfn + a.nr - 1) < a.first_pfn) ||
+         ((a.first_pfn + a.nr - 1) > domain_get_maximum_gpfn(d)) )
+        goto out;
+
+    if ( a.hvmmem_type >= ARRAY_SIZE(memtype) ||
+         unlikely(a.hvmmem_type == HVMMEM_unused) )
+        goto out;
+
+    while ( a.nr > start_iter )
+    {
+        unsigned long pfn = a.first_pfn + start_iter;
+        p2m_type_t t;
+
+        get_gfn_unshare(d, pfn, &t);
+        if ( p2m_is_paging(t) )
+        {
+            put_gfn(d, pfn);
+            p2m_mem_paging_populate(d, pfn);
+            rc = -EAGAIN;
+            goto out;
+        }
+        if ( p2m_is_shared(t) )
+        {
+            put_gfn(d, pfn);
+            rc = -EAGAIN;
+            goto out;
+        }
+        if ( !hvm_allow_p2m_type_change(t, memtype[a.hvmmem_type]) )
+        {
+            put_gfn(d, pfn);
+            goto out;
+        }
+
+        rc = p2m_change_type_one(d, pfn, t, memtype[a.hvmmem_type]);
+        put_gfn(d, pfn);
+
+        if ( rc )
+            goto out;
+
+        /* Check for continuation if it's not the last interation */
+        if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
+             hypercall_preempt_check() )
+        {
+            rc = -ERESTART;
+            goto out;
+        }
+    }
+    rc = 0;
+
+ out:
+    rcu_unlock_domain(d);
+    *iter = start_iter;
+
+    return rc;
+}
+
 long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     unsigned long start_iter, mask;
@@ -5542,137 +5698,15 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
     }
 
     case HVMOP_get_mem_type:
-    {
-        struct xen_hvm_get_mem_type a;
-        struct domain *d;
-        p2m_type_t t;
-
-        if ( copy_from_guest(&a, arg, 1) )
-            return -EFAULT;
-
-        d = rcu_lock_domain_by_any_id(a.domid);
-        if ( d == NULL )
-            return -ESRCH;
-
-        rc = xsm_hvm_param(XSM_TARGET, d, op);
-        if ( unlikely(rc) )
-            /* nothing */;
-        else if ( likely(is_hvm_domain(d)) )
-        {
-            /* Use get_gfn query as we are interested in the current 
-             * type, not in allocating or unsharing. That'll happen 
-             * on access. */
-            get_gfn_query_unlocked(d, a.pfn, &t);
-            if ( p2m_is_mmio(t) )
-                a.mem_type =  HVMMEM_mmio_dm;
-            else if ( t == p2m_ioreq_server )
-                a.mem_type = HVMMEM_ioreq_server;
-            else if ( p2m_is_readonly(t) )
-                a.mem_type =  HVMMEM_ram_ro;
-            else if ( p2m_is_ram(t) )
-                a.mem_type =  HVMMEM_ram_rw;
-            else if ( p2m_is_pod(t) )
-                a.mem_type =  HVMMEM_ram_rw;
-            else if ( p2m_is_grant(t) )
-                a.mem_type =  HVMMEM_ram_rw;
-            else
-                a.mem_type =  HVMMEM_mmio_dm;
-            if ( __copy_to_guest(arg, &a, 1) )
-                rc = -EFAULT;
-        }
-        else
-            rc = -EINVAL;
-
-        rcu_unlock_domain(d);
+        rc = hvmop_get_mem_type(
+            guest_handle_cast(arg, xen_hvm_get_mem_type_t));
         break;
-    }
 
     case HVMOP_set_mem_type:
-    {
-        struct xen_hvm_set_mem_type a;
-        struct domain *d;
-        
-        /* Interface types to internal p2m types */
-        static const p2m_type_t memtype[] = {
-            [HVMMEM_ram_rw]  = p2m_ram_rw,
-            [HVMMEM_ram_ro]  = p2m_ram_ro,
-            [HVMMEM_mmio_dm] = p2m_mmio_dm,
-            [HVMMEM_unused] = p2m_invalid,
-            [HVMMEM_ioreq_server] = p2m_ioreq_server
-        };
-
-        if ( copy_from_guest(&a, arg, 1) )
-            return -EFAULT;
-
-        rc = rcu_lock_remote_domain_by_id(a.domid, &d);
-        if ( rc != 0 )
-            return rc;
-
-        rc = -EINVAL;
-        if ( !is_hvm_domain(d) )
-            goto setmemtype_fail;
-
-        rc = xsm_hvm_control(XSM_DM_PRIV, d, op);
-        if ( rc )
-            goto setmemtype_fail;
-
-        rc = -EINVAL;
-        if ( a.nr < start_iter ||
-             ((a.first_pfn + a.nr - 1) < a.first_pfn) ||
-             ((a.first_pfn + a.nr - 1) > domain_get_maximum_gpfn(d)) )
-            goto setmemtype_fail;
-            
-        if ( a.hvmmem_type >= ARRAY_SIZE(memtype) ||
-             unlikely(a.hvmmem_type == HVMMEM_unused) )
-            goto setmemtype_fail;
-
-        while ( a.nr > start_iter )
-        {
-            unsigned long pfn = a.first_pfn + start_iter;
-            p2m_type_t t;
-
-            get_gfn_unshare(d, pfn, &t);
-            if ( p2m_is_paging(t) )
-            {
-                put_gfn(d, pfn);
-                p2m_mem_paging_populate(d, pfn);
-                rc = -EAGAIN;
-                goto setmemtype_fail;
-            }
-            if ( p2m_is_shared(t) )
-            {
-                put_gfn(d, pfn);
-                rc = -EAGAIN;
-                goto setmemtype_fail;
-            }
-            if ( !p2m_is_ram(t) &&
-                 (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
-                 (t != p2m_ioreq_server || a.hvmmem_type != HVMMEM_ram_rw) )
-            {
-                put_gfn(d, pfn);
-                goto setmemtype_fail;
-            }
-
-            rc = p2m_change_type_one(d, pfn, t, memtype[a.hvmmem_type]);
-            put_gfn(d, pfn);
-            if ( rc )
-                goto setmemtype_fail;
-
-            /* Check for continuation if it's not the last interation */
-            if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
-                 hypercall_preempt_check() )
-            {
-                rc = -ERESTART;
-                goto setmemtype_fail;
-            }
-        }
-
-        rc = 0;
-
-    setmemtype_fail:
-        rcu_unlock_domain(d);
+        rc = hvmop_set_mem_type(
+            guest_handle_cast(arg, xen_hvm_set_mem_type_t),
+            &start_iter);
         break;
-    }
 
     case HVMOP_pagetable_dying:
     {
-- 
1.9.1



* [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-07-12  9:02 [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-07-12  9:02 ` [PATCH v5 1/4] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
  2016-07-12  9:02 ` [PATCH v5 2/4] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
@ 2016-07-12  9:02 ` Yu Zhang
  2016-08-08 15:40   ` Jan Beulich
  2016-07-12  9:02 ` [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
  2016-08-05  2:44 ` [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  4 siblings, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-07-12  9:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Tim Deegan, Paul Durrant, zhiyuan.lv, Jan Beulich

A new HVMOP, HVMOP_map_mem_type_to_ioreq_server, is added to let an
ioreq server claim/disclaim its responsibility for handling guest pages
with p2m type p2m_ioreq_server. Users of this HVMOP can specify which
kind of operation is to be emulated in a parameter named flags.
Currently, this HVMOP only supports the emulation of write operations;
it can be further extended to support the emulation of read operations
if an ioreq server has such a requirement in the future.

For now, only one ioreq server is supported for this p2m type, so once
an ioreq server has claimed its ownership, subsequent calls to
HVMOP_map_mem_type_to_ioreq_server will fail. Users can also disclaim
the ownership of guest RAM pages with p2m_ioreq_server by issuing this
new HVMOP with the ioreq server id set to the current owner's and the
flags parameter set to 0.

Note that both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
are only supported for HVM guests with HAP enabled.

Also note that p2m type changes to p2m_ioreq_server are only allowed
after an ioreq server has claimed ownership of the type.
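
As an illustrative sketch only (this series adds no libxc wrapper, so
issue_hvm_op() below is a hypothetical stand-in for however the caller
issues the hypercall), the claim and disclaim calls would look like:

  #include <xen/hvm/hvm_op.h>

  extern int issue_hvm_op(unsigned long op, void *arg); /* hypothetical */

  /* Claim: guest writes to p2m_ioreq_server pages go to server <id>. */
  static int claim_ioreq_mem_type(domid_t dom, ioservid_t id)
  {
      struct xen_hvm_map_mem_type_to_ioreq_server op = {
          .domid = dom,
          .id    = id,
          .type  = HVMMEM_ioreq_server,
          .flags = XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE,
      };

      return issue_hvm_op(HVMOP_map_mem_type_to_ioreq_server, &op);
  }

  /* Disclaim: flags == 0 releases the ownership again. */
  static int disclaim_ioreq_mem_type(domid_t dom, ioservid_t id)
  {
      struct xen_hvm_map_mem_type_to_ioreq_server op = {
          .domid = dom,
          .id    = id,
          .type  = HVMMEM_ioreq_server,
          .flags = 0,
      };

      return issue_hvm_op(HVMOP_map_mem_type_to_ioreq_server, &op);
  }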

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Tim Deegan <tim@xen.org>

changes in v5:
  - Simplify logic in hvmemul_do_io().
  - Use natural width types instead of fixed width types when possible.
  - Do not grant executable permission for p2m_ioreq_server entries.
  - Clarify comments and commit message.
  - Introduce a separate patch to recalculate the p2m types after
    the ioreq server unmaps the p2m_ioreq_server.

changes in v4:
  - According to Paul's advice, add comments around the definition
    of HVMMEM_iore_server in hvm_op.h.
  - According to Wei Liu's comments, change the format of the commit
    message.

changes in v3:
  - Only support write emulation in this patch;
  - Remove the code to handle race condition in hvmemul_do_io(),
  - No need to reset the p2m type after an ioreq server has disclaimed
    its ownership of p2m_ioreq_server;
  - Only allow p2m type change to p2m_ioreq_server after an ioreq
    server has claimed its ownership of p2m_ioreq_server;
  - Only allow p2m type change to p2m_ioreq_server from pages with type
    p2m_ram_rw, and vice versa;
  - HVMOP_map_mem_type_to_ioreq_server interface change - use uint16,
    instead of enum to specify the memory type;
  - Function prototype change to p2m_get_ioreq_server();
  - Coding style changes;
  - Commit message changes;
  - Add Tim's Acked-by.

changes in v2: 
  - Only support HAP enabled HVMs;
  - Replace p2m_mem_type_changed() with p2m_change_entry_type_global()
    to reset the p2m type, when an ioreq server tries to claim/disclaim
    its ownership of p2m_ioreq_server;
  - Comments changes.
---
 xen/arch/x86/hvm/emulate.c       | 33 ++++++++++++++++--
 xen/arch/x86/hvm/hvm.c           | 66 +++++++++++++++++++++++++++++++++--
 xen/arch/x86/hvm/ioreq.c         | 41 ++++++++++++++++++++++
 xen/arch/x86/mm/hap/nested_hap.c |  2 +-
 xen/arch/x86/mm/p2m-ept.c        |  8 ++++-
 xen/arch/x86/mm/p2m-pt.c         | 20 +++++++----
 xen/arch/x86/mm/p2m.c            | 74 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/shadow/multi.c   |  3 +-
 xen/include/asm-x86/hvm/ioreq.h  |  2 ++
 xen/include/asm-x86/p2m.h        | 29 ++++++++++++++--
 xen/include/public/hvm/hvm_op.h  | 31 ++++++++++++++++-
 11 files changed, 290 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 855af4d..c235a40 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -100,6 +100,7 @@ static int hvmemul_do_io(
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     ioreq_t p = {
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
@@ -141,7 +142,7 @@ static int hvmemul_do_io(
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
-            domain_crash(curr->domain);
+            domain_crash(currd);
 
         if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
@@ -178,8 +179,34 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s;
+
+        if ( is_mmio )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+            p2m_type_t p2mt;
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server )
+            {
+                unsigned int flags;
+
+                if ( dir != IOREQ_WRITE )
+                    s = NULL;
+                else
+                {
+                    s = p2m_get_ioreq_server(currd, &flags);
+
+                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
+                        s = NULL;
+                }
+            }
+            else
+                s = hvm_select_ioreq_server(currd, &p);
+        }
+        else
+            s = hvm_select_ioreq_server(currd, &p);
 
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4453ec0..4d98cc6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5410,9 +5410,14 @@ static int hvmop_get_mem_type(
 
 static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
 {
+    if ( new == p2m_ioreq_server )
+        return old == p2m_ram_rw;
+
+    if ( old == p2m_ioreq_server )
+        return new == p2m_ram_rw;
+
     if ( p2m_is_ram(old) ||
-         (p2m_is_hole(old) && new == p2m_mmio_dm) ||
-         (old == p2m_ioreq_server && new == p2m_ram_rw) )
+         (p2m_is_hole(old) && new == p2m_mmio_dm) )
         return 1;
 
     return 0;
@@ -5447,6 +5452,21 @@ static int hvmop_set_mem_type(
     if ( !is_hvm_domain(d) )
         goto out;
 
+    if ( a.hvmmem_type == HVMMEM_ioreq_server )
+    {
+        unsigned int flags;
+        struct hvm_ioreq_server *s;
+
+        /* HVMMEM_ioreq_server is only supported for HAP enabled hvm. */
+        if ( !hap_enabled(d) )
+            goto out;
+
+        /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */
+        s = p2m_get_ioreq_server(d, &flags);
+        if ( s == NULL )
+            goto out;
+    }
+
     rc = xsm_hvm_control(XSM_DM_PRIV, d, HVMOP_set_mem_type);
     if ( rc )
         goto out;
@@ -5509,6 +5529,43 @@ static int hvmop_set_mem_type(
     return rc;
 }
 
+static int hvmop_map_mem_type_to_ioreq_server(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+{
+    xen_hvm_map_mem_type_to_ioreq_server_t op;
+    struct domain *d;
+    int rc;
+
+    if ( copy_from_guest(&op, uop, 1) )
+        return -EFAULT;
+
+    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
+    if ( rc != 0 )
+        return rc;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    if ( op.pad != 0 )
+        goto out;
+
+    /* Only support for HAP enabled hvm. */
+    if ( !hap_enabled(d) )
+        goto out;
+
+    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d,
+                              HVMOP_map_mem_type_to_ioreq_server);
+    if ( rc != 0 )
+        goto out;
+
+    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+ out:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
 long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     unsigned long start_iter, mask;
@@ -5548,6 +5605,11 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             guest_handle_cast(arg, xen_hvm_io_range_t));
         break;
 
+    case HVMOP_map_mem_type_to_ioreq_server:
+        rc = hvmop_map_mem_type_to_ioreq_server(
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+        break;
+
     case HVMOP_set_ioreq_server_state:
         rc = hvmop_set_ioreq_server_state(
             guest_handle_cast(arg, xen_hvm_set_ioreq_server_state_t));
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 7148ac4..36a2298 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -753,6 +753,8 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
         domain_pause(d);
 
+        p2m_destroy_ioreq_server(d, s);
+
         hvm_ioreq_server_disable(s, 0);
 
         list_del(&s->list_entry);
@@ -914,6 +916,45 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    /* For now, only HVMMEM_ioreq_server is supported. */
+    if ( type != HVMMEM_ioreq_server )
+        return -EINVAL;
+
+    /* For now, only write emulation is supported. */
+    if ( flags & ~(XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+        return -EINVAL;
+
+    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
+
+    rc = -ENOENT;
+    list_for_each_entry ( s,
+                          &d->arch.hvm_domain.ioreq_server.list,
+                          list_entry )
+    {
+        if ( s == d->arch.hvm_domain.default_ioreq_server )
+            continue;
+
+        if ( s->id == id )
+        {
+            rc = p2m_set_ioreq_server(d, flags, s);
+            if ( rc == 0 )
+                dprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
+                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
+
+            break;
+        }
+    }
+
+    spin_unlock(&d->arch.hvm_domain.ioreq_server.lock);
+    return rc;
+}
+
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                bool_t enabled)
 {
diff --git a/xen/arch/x86/mm/hap/nested_hap.c b/xen/arch/x86/mm/hap/nested_hap.c
index d41bb09..aa90a62 100644
--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -174,7 +174,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
     if ( *p2mt == p2m_mmio_direct )
         goto direct_mmio_out;
     rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
         goto out;
 
     rc = NESTEDHVM_PAGEFAULT_L0_ERROR;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 7adc77d..5f06d40 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -132,6 +132,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->r = entry->w = entry->x = 1;
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
+        case p2m_ioreq_server:
+            entry->r = 1;
+            entry->w = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS);
+            entry->x = 0;
+            entry->a = !!cpu_has_vmx_ept_ad;
+            entry->d = entry->w && cpu_has_vmx_ept_ad;
+            break;
         case p2m_mmio_direct:
             entry->r = entry->x = 1;
             entry->w = !rangeset_contains_singleton(mmio_ro_ranges,
@@ -171,7 +178,6 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
         case p2m_grant_map_ro:
-        case p2m_ioreq_server:
             entry->r = 1;
             entry->w = entry->x = 0;
             entry->a = !!cpu_has_vmx_ept_ad;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 05aaf8f..6209e7b 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -72,7 +72,9 @@ static const unsigned long pgt[] = {
     PGT_l3_page_table
 };
 
-static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
+static unsigned long p2m_type_to_flags(const struct p2m_domain *p2m,
+                                       p2m_type_t t,
+                                       mfn_t mfn,
                                        unsigned int level)
 {
     unsigned long flags;
@@ -94,8 +96,13 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
     default:
         return flags | _PAGE_NX_BIT;
     case p2m_grant_map_ro:
-    case p2m_ioreq_server:
         return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
+    case p2m_ioreq_server:
+        flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
+        if ( p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS )
+            return flags & ~_PAGE_RW;
+        else
+            return flags;
     case p2m_ram_ro:
     case p2m_ram_logdirty:
     case p2m_ram_shared:
@@ -442,7 +449,8 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
                               ? p2m_ram_logdirty : p2m_ram_rw;
             unsigned long mfn = l1e_get_pfn(e);
-            unsigned long flags = p2m_type_to_flags(p2mt, _mfn(mfn), level);
+            unsigned long flags = p2m_type_to_flags(p2m, p2mt,
+                                                    _mfn(mfn), level);
 
             if ( level )
             {
@@ -579,7 +587,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
         l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
             ? l3e_from_pfn(mfn_x(mfn),
-                           p2m_type_to_flags(p2mt, mfn, 2) | _PAGE_PSE)
+                           p2m_type_to_flags(p2m, p2mt, mfn, 2) | _PAGE_PSE)
             : l3e_empty();
         entry_content.l1 = l3e_content.l3;
 
@@ -615,7 +623,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 
         if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
             entry_content = p2m_l1e_from_pfn(mfn_x(mfn),
-                                             p2m_type_to_flags(p2mt, mfn, 0));
+                                         p2m_type_to_flags(p2m, p2mt, mfn, 0));
         else
             entry_content = l1e_empty();
 
@@ -651,7 +659,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
         if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
             l2e_content = l2e_from_pfn(mfn_x(mfn),
-                                       p2m_type_to_flags(p2mt, mfn, 1) |
+                                       p2m_type_to_flags(p2m, p2mt, mfn, 1) |
                                        _PAGE_PSE);
         else
             l2e_content = l2e_empty();
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 16733a4..5567181 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -83,6 +83,8 @@ static int p2m_initialise(struct domain *d, struct p2m_domain *p2m)
     else
         p2m_pt_init(p2m);
 
+    spin_lock_init(&p2m->ioreq.lock);
+
     return ret;
 }
 
@@ -289,6 +291,78 @@ void p2m_memory_type_changed(struct domain *d)
     }
 }
 
+int p2m_set_ioreq_server(struct domain *d,
+                         unsigned int flags,
+                         struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
+
+    /*
+     * Use lock to prevent concurrent setting requirements
+     * from multiple ioreq servers.
+     */
+    spin_lock(&p2m->ioreq.lock);
+
+    /* Unmap ioreq server from p2m type by passing flags with 0. */
+    if ( flags == 0 )
+    {
+        rc = -EINVAL;
+        if ( p2m->ioreq.server != s )
+            goto out;
+
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+    }
+    else
+    {
+        rc = -EBUSY;
+        if ( p2m->ioreq.server != NULL )
+            goto out;
+
+        p2m->ioreq.server = s;
+        p2m->ioreq.flags = flags;
+    }
+
+    rc = 0;
+
+ out:
+    spin_unlock(&p2m->ioreq.lock);
+
+    return rc;
+}
+
+struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                              unsigned int *flags)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    struct hvm_ioreq_server *s;
+
+    spin_lock(&p2m->ioreq.lock);
+
+    s = p2m->ioreq.server;
+    *flags = p2m->ioreq.flags;
+
+    spin_unlock(&p2m->ioreq.lock);
+    return s;
+}
+
+void p2m_destroy_ioreq_server(const struct domain *d,
+                              const struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    spin_lock(&p2m->ioreq.lock);
+
+    if ( p2m->ioreq.server == s )
+    {
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+    }
+
+    spin_unlock(&p2m->ioreq.lock);
+}
+
 void p2m_enable_hardware_log_dirty(struct domain *d)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 8c4b20e..2f40816 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3225,8 +3225,7 @@ static int sh_page_fault(struct vcpu *v,
     }
 
     /* Need to hand off device-model MMIO to the device model */
-    if ( p2mt == p2m_mmio_dm
-         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
+    if ( p2mt == p2m_mmio_dm )
     {
         gpa = guest_walk_to_gpa(&gw);
         goto mmio;
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index fbf2c74..b43667a 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -37,6 +37,8 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
 int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
                                          uint32_t type, uint64_t start,
                                          uint64_t end);
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags);
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                bool_t enabled);
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 6785669..0950a91 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -89,7 +89,8 @@ typedef unsigned int p2m_query_t;
                        | p2m_to_mask(p2m_ram_paging_out)      \
                        | p2m_to_mask(p2m_ram_paged)           \
                        | p2m_to_mask(p2m_ram_paging_in)       \
-                       | p2m_to_mask(p2m_ram_shared))
+                       | p2m_to_mask(p2m_ram_shared)          \
+                       | p2m_to_mask(p2m_ioreq_server))
 
 /* Types that represent a physmap hole that is ok to replace with a shared
  * entry */
@@ -111,8 +112,7 @@ typedef unsigned int p2m_query_t;
 #define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty)     \
                       | p2m_to_mask(p2m_ram_ro)         \
                       | p2m_to_mask(p2m_grant_map_ro)   \
-                      | p2m_to_mask(p2m_ram_shared)     \
-                      | p2m_to_mask(p2m_ioreq_server))
+                      | p2m_to_mask(p2m_ram_shared))
 
 /* Write-discard types, which should discard the write operations */
 #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro)     \
@@ -336,6 +336,23 @@ struct p2m_domain {
         struct ept_data ept;
         /* NPT-equivalent structure could be added here. */
     };
+
+    struct {
+        spinlock_t lock;
+        /*
+         * ioreq server who's responsible for the emulation of
+         * gfns with specific p2m type(for now, p2m_ioreq_server).
+         */
+        struct hvm_ioreq_server *server;
+        /*
+         * flags specifies whether read, write or both operations
+         * are to be emulated by an ioreq server.
+         */
+        unsigned int flags;
+
+#define P2M_IOREQ_HANDLE_WRITE_ACCESS XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE
+#define P2M_IOREQ_HANDLE_READ_ACCESS  XEN_HVMOP_IOREQ_MEM_ACCESS_READ
+    } ioreq;
 };
 
 /* get host p2m table */
@@ -842,6 +859,12 @@ static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
     return flags;
 }
 
+int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
+                         struct hvm_ioreq_server *s);
+struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                              unsigned int *flags);
+void p2m_destroy_ioreq_server(const struct domain *d, const struct hvm_ioreq_server *s);
+
 #endif /* _XEN_P2M_H */
 
 /*
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index b3e45cf..d484c5f 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -89,7 +89,9 @@ typedef enum {
     HVMMEM_unused,             /* Placeholder; setting memory to this type
                                   will fail for code after 4.7.0 */
 #endif
-    HVMMEM_ioreq_server
+    HVMMEM_ioreq_server        /* Memory type claimed by an ioreq server; type
+                                  changes to this value are only allowed after
+                                  an ioreq server has claimed its ownership. */
 } hvmmem_type_t;
 
 /* Following tools-only interfaces may change in future. */
@@ -383,6 +385,33 @@ struct xen_hvm_set_ioreq_server_state {
 typedef struct xen_hvm_set_ioreq_server_state xen_hvm_set_ioreq_server_state_t;
 DEFINE_XEN_GUEST_HANDLE(xen_hvm_set_ioreq_server_state_t);
 
+/*
+ * HVMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
+ *                                      to specific memory type <type>
+ *                                      for specific accesses <flags>
+ *
+ * For now, flags only accept the value of XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE,
+ * which means only write operations are to be forwarded to an ioreq server.
+ * Support for the emulation of read operations can be added when an ioreq
+ * server has such requirement in future.
+ */
+#define HVMOP_map_mem_type_to_ioreq_server 26
+struct xen_hvm_map_mem_type_to_ioreq_server {
+    domid_t domid;      /* IN - domain to be serviced */
+    ioservid_t id;      /* IN - ioreq server id */
+    uint16_t type;      /* IN - memory type */
+    uint16_t pad;
+    uint32_t flags;     /* IN - types of accesses to be forwarded to the
+                           ioreq server. flags with 0 means to unmap the
+                           ioreq server */
+
+#define XEN_HVMOP_IOREQ_MEM_ACCESS_READ (1u << 0)
+#define XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE (1u << 1)
+};
+typedef struct xen_hvm_map_mem_type_to_ioreq_server
+    xen_hvm_map_mem_type_to_ioreq_server_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_map_mem_type_to_ioreq_server_t);
+
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
 #if defined(__i386__) || defined(__x86_64__)
-- 
1.9.1



* [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-07-12  9:02 [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
                   ` (2 preceding siblings ...)
  2016-07-12  9:02 ` [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
@ 2016-07-12  9:02 ` Yu Zhang
  2016-08-08 16:29   ` Jan Beulich
  2016-08-16 13:35   ` George Dunlap
  2016-08-05  2:44 ` [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  4 siblings, 2 replies; 32+ messages in thread
From: Yu Zhang @ 2016-07-12  9:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Paul Durrant, zhiyuan.lv, Jan Beulich

This patch resets p2m_ioreq_server entries back to p2m_ram_rw after
an ioreq server has unmapped. The resync is done both asynchronously,
with the current p2m_change_entry_type_global() interface, and
synchronously, by iterating the p2m table. The synchronous resetting
is necessary because we need to guarantee the p2m table is clean
before another ioreq server is mapped. And since sweeping the p2m
table could be time consuming, it is done with hypercall continuation.
The asynchronous approach is also taken so that p2m_ioreq_server
entries can be reset while the HVM guest is scheduled between
hypercall continuations.

This patch also disallows live migration while there are still
outstanding p2m_ioreq_server entries. The core reason is that our
current implementation of p2m_change_entry_type_global() cannot tell
the state of p2m_ioreq_server entries (it cannot decide whether an
entry is to be emulated or to be resynced).
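
A minimal standalone model (a sketch only, not the hypervisor code) of
the entry_count bookkeeping introduced here, and of the log-dirty check
that relies on it:

  #include <stdbool.h>

  enum p2mt { ram_rw, ioreq_server };

  struct p2m {
      unsigned int entry_count;   /* outstanding p2m_ioreq_server entries */
  };

  /* Explicit type change, as done by HVMOP_set_mem_type or by the
   * synchronous sweep / lazy recalc once the ioreq server unmaps. */
  static void change_type(struct p2m *p2m, enum p2mt *entry, enum p2mt new)
  {
      if ( *entry == ram_rw && new == ioreq_server )
          p2m->entry_count++;     /* page handed to the ioreq server */
      else if ( *entry == ioreq_server && new == ram_rw )
          p2m->entry_count--;     /* page handed back to plain RAM */
      *entry = new;
  }

  /* Mirrors hap_enable_log_dirty(): global log-dirty (and hence live
   * migration) is refused while any such entries remain outstanding. */
  static bool may_enable_global_log_dirty(const struct p2m *p2m)
  {
      return p2m->entry_count == 0;
  }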

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/hvm.c    | 52 ++++++++++++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/hap/hap.c |  9 ++++++++
 xen/arch/x86/mm/p2m-ept.c |  6 +++++-
 xen/arch/x86/mm/p2m-pt.c  | 10 +++++++--
 xen/arch/x86/mm/p2m.c     |  3 +++
 xen/include/asm-x86/p2m.h |  5 ++++-
 6 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4d98cc6..e57c8b9 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5485,6 +5485,7 @@ static int hvmop_set_mem_type(
     {
         unsigned long pfn = a.first_pfn + start_iter;
         p2m_type_t t;
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
         get_gfn_unshare(d, pfn, &t);
         if ( p2m_is_paging(t) )
@@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
         if ( rc )
             goto out;
 
+        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
+            p2m->ioreq.entry_count++;
+
+        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
+            p2m->ioreq.entry_count--;
+
         /* Check for continuation if it's not the last interation */
         if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
              hypercall_preempt_check() )
@@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
 }
 
 static int hvmop_map_mem_type_to_ioreq_server(
-    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
+    unsigned long *iter)
 {
     xen_hvm_map_mem_type_to_ioreq_server_t op;
     struct domain *d;
     int rc;
+    unsigned long gfn = *iter;
 
     if ( copy_from_guest(&op, uop, 1) )
         return -EFAULT;
@@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
     if ( rc != 0 )
         goto out;
 
-    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+    if ( gfn == 0 || op.flags != 0 )
+        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+    /*
+     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
+     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
+     */
+    if ( op.flags == 0 && rc == 0 )
+    {
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        while ( gfn <= p2m->max_mapped_pfn )
+        {
+            p2m_type_t t;
+
+            if ( p2m->ioreq.entry_count == 0 )
+                break;
+
+            get_gfn_unshare(d, gfn, &t);
+
+            if ( (t == p2m_ioreq_server) &&
+                 !(p2m_change_type_one(d, gfn, t, p2m_ram_rw)) )
+                p2m->ioreq.entry_count--;
+
+            put_gfn(d, gfn);
+
+            /* Check for continuation if it's not the last iteration. */
+            if ( ++gfn <= p2m->max_mapped_pfn &&
+                 !(gfn & HVMOP_op_mask) &&
+                 hypercall_preempt_check() )
+            {
+                rc = -ERESTART;
+                goto out;
+            }
+        }
+    }
 
  out:
     rcu_unlock_domain(d);
@@ -5578,6 +5622,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     case HVMOP_modified_memory:
     case HVMOP_set_mem_type:
+    case HVMOP_map_mem_type_to_ioreq_server:
         mask = HVMOP_op_mask;
         break;
     }
@@ -5607,7 +5652,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case HVMOP_map_mem_type_to_ioreq_server:
         rc = hvmop_map_mem_type_to_ioreq_server(
-            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t),
+            &start_iter);
         break;
 
     case HVMOP_set_ioreq_server_state:
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 9c2cd49..0442b1d 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -190,6 +190,15 @@ out:
  */
 static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
 {
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    /*
+    * Refuse to turn on global log-dirty mode if
+    * there's outstanding p2m_ioreq_server pages.
+    */
+    if ( log_global && p2m->ioreq.entry_count > 0 )
+        return -EBUSY;
+
     /* turn on PG_log_dirty bit in paging mode */
     paging_lock(d);
     d->arch.paging.mode |= PG_log_dirty;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 5f06d40..5d4d804 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -545,6 +545,9 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                     e.ipat = ipat;
                     if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
                     {
+                         if ( e.sa_p2mt == p2m_ioreq_server )
+                             p2m->ioreq.entry_count--;
+
                          e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
                                      ? p2m_ram_logdirty : p2m_ram_rw;
                          ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
@@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
     if ( is_epte_valid(ept_entry) )
     {
         if ( (recalc || ept_entry->recalc) &&
-             p2m_is_changeable(ept_entry->sa_p2mt) )
+             p2m_is_changeable(ept_entry->sa_p2mt) &&
+             (ept_entry->sa_p2mt != p2m_ioreq_server) )
             *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                       : p2m_ram_rw;
         else
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 6209e7b..7bebfd1 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -439,11 +439,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
          needs_recalc(l1, *pent) )
     {
         l1_pgentry_t e = *pent;
+        p2m_type_t p2mt_old;
 
         if ( !valid_recalc(l1, e) )
             P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
                       p2m->domain->domain_id, gfn, level);
-        if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
+        p2mt_old = p2m_flags_to_type(l1e_get_flags(e));
+        if ( p2m_is_changeable(p2mt_old) )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
@@ -463,6 +465,10 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
                      mfn &= ~(_PAGE_PSE_PAT >> PAGE_SHIFT);
                 flags |= _PAGE_PSE;
             }
+
+            if ( p2mt_old == p2m_ioreq_server )
+                p2m->ioreq.entry_count--;
+
             e = l1e_from_pfn(mfn, flags);
             p2m_add_iommu_flags(&e, level,
                                 (p2mt == p2m_ram_rw)
@@ -729,7 +735,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
                                      struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !recalc || !p2m_is_changeable(t) )
+    if ( !recalc || !p2m_is_changeable(t) || (t == p2m_ioreq_server) )
         return t;
     return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                 : p2m_ram_rw;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 5567181..e1c3e31 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -313,6 +313,9 @@ int p2m_set_ioreq_server(struct domain *d,
 
         p2m->ioreq.server = NULL;
         p2m->ioreq.flags = 0;
+
+        if ( p2m->ioreq.entry_count > 0 )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
     }
     else
     {
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0950a91..8e5b4f5 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -120,7 +120,8 @@ typedef unsigned int p2m_query_t;
 
 /* Types that can be subject to bulk transitions. */
 #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
-                              | p2m_to_mask(p2m_ram_logdirty) )
+                              | p2m_to_mask(p2m_ram_logdirty) \
+                              | p2m_to_mask(p2m_ioreq_server))
 
 #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
 
@@ -352,6 +353,8 @@ struct p2m_domain {
 
 #define P2M_IOREQ_HANDLE_WRITE_ACCESS XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE
 #define P2M_IOREQ_HANDLE_READ_ACCESS  XEN_HVMOP_IOREQ_MEM_ACCESS_READ
+
+        unsigned int entry_count;
     } ioreq;
 };
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type.
  2016-07-12  9:02 [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
                   ` (3 preceding siblings ...)
  2016-07-12  9:02 ` [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
@ 2016-08-05  2:44 ` Yu Zhang
  2016-08-05  6:16   ` Jan Beulich
  2016-08-05 12:46   ` George Dunlap
  4 siblings, 2 replies; 32+ messages in thread
From: Yu Zhang @ 2016-08-05  2:44 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap; +Cc: Lv, Zhiyuan, xen-devel


On 7/12/2016 5:02 PM, Yu Zhang wrote:
> XenGT leverages ioreq server to track and forward the accesses to GPU
> I/O resources, e.g. the PPGTT(per-process graphic translation tables).
> Currently, ioreq server uses rangeset to track the BDF/ PIO/MMIO ranges
> to be emulated. To select an ioreq server, the rangeset is searched to
> see if the I/O range is recorded. However, number of ram pages to be
> tracked may exceed the upper limit of rangeset.
>
> Previously, one solution was proposed to refactor the rangeset, and
> extend its upper limit. However, after 12 rounds discussion, we have
> decided to drop this approach due to security concerns. Now this new
> patch series introduces a new mem type, HVMMEM_ioreq_server, and added
> hvm operations to let one ioreq server to claim its ownership of ram
> pages with this type. Accesses to a page of this type will be handled
> by the specified ioreq server directly.
>
> Yu Zhang (4):
>    x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server.
>    x86/ioreq server: Add new functions to get/set memory types.
>    x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
>      an ioreq server.
>    x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an
>      ioreq server unmaps.
>
>   xen/arch/x86/hvm/emulate.c       |  33 +++-
>   xen/arch/x86/hvm/hvm.c           | 395 ++++++++++++++++++++++++++-------------
>   xen/arch/x86/hvm/ioreq.c         |  41 ++++
>   xen/arch/x86/mm/hap/hap.c        |   9 +
>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>   xen/arch/x86/mm/p2m-ept.c        |  14 +-
>   xen/arch/x86/mm/p2m-pt.c         |  30 ++-
>   xen/arch/x86/mm/p2m.c            |  77 ++++++++
>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>   xen/include/asm-x86/hvm/ioreq.h  |   2 +
>   xen/include/asm-x86/p2m.h        |  36 +++-
>   xen/include/public/hvm/hvm_op.h  |  34 +++-
>   12 files changed, 528 insertions(+), 148 deletions(-)
>

Hi Jan & George, any comments on this version? :)
Sorry if this mail disturbs you, but it has been more than 3 weeks since
the post...

Thanks
Yu




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type.
  2016-08-05  2:44 ` [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
@ 2016-08-05  6:16   ` Jan Beulich
  2016-08-05 12:46   ` George Dunlap
  1 sibling, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2016-08-05  6:16 UTC (permalink / raw)
  To: Yu Zhang; +Cc: xen-devel, George Dunlap, Zhiyuan Lv

>>> On 05.08.16 at 04:44, <yu.c.zhang@linux.intel.com> wrote:

> On 7/12/2016 5:02 PM, Yu Zhang wrote:
>> XenGT leverages ioreq server to track and forward the accesses to GPU
>> I/O resources, e.g. the PPGTT(per-process graphic translation tables).
>> Currently, ioreq server uses rangeset to track the BDF/ PIO/MMIO ranges
>> to be emulated. To select an ioreq server, the rangeset is searched to
>> see if the I/O range is recorded. However, number of ram pages to be
>> tracked may exceed the upper limit of rangeset.
>>
>> Previously, one solution was proposed to refactor the rangeset, and
>> extend its upper limit. However, after 12 rounds discussion, we have
>> decided to drop this approach due to security concerns. Now this new
>> patch series introduces a new mem type, HVMMEM_ioreq_server, and added
>> hvm operations to let one ioreq server to claim its ownership of ram
>> pages with this type. Accesses to a page of this type will be handled
>> by the specified ioreq server directly.
>>
>> Yu Zhang (4):
>>    x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server.
>>    x86/ioreq server: Add new functions to get/set memory types.
>>    x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
>>      an ioreq server.
>>    x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an
>>      ioreq server unmaps.
>>
>>   xen/arch/x86/hvm/emulate.c       |  33 +++-
>>   xen/arch/x86/hvm/hvm.c           | 395 ++++++++++++++++++++++++++-------------
>>   xen/arch/x86/hvm/ioreq.c         |  41 ++++
>>   xen/arch/x86/mm/hap/hap.c        |   9 +
>>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>>   xen/arch/x86/mm/p2m-ept.c        |  14 +-
>>   xen/arch/x86/mm/p2m-pt.c         |  30 ++-
>>   xen/arch/x86/mm/p2m.c            |  77 ++++++++
>>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>>   xen/include/asm-x86/hvm/ioreq.h  |   2 +
>>   xen/include/asm-x86/p2m.h        |  36 +++-
>>   xen/include/public/hvm/hvm_op.h  |  34 +++-
>>   12 files changed, 528 insertions(+), 148 deletions(-)
>>
> 
> Hi Jan & George, any comments on this version? :)
> Sorry if this mail disturbs you, but it has been more than 3 weeks since
> the post...

I've been on vacation up until the end of last week, and as you may
have seen I did push the first two patches yesterday. The other
two aren't forgotten. Otoh I'm not entirely certain it makes sense
to apply the new hvmop one ahead of the hvmctl (or dmop, or
whatever is going to evolve out of this) series, as that would mean
a fair part of the newly added code getting moved elsewhere pretty
soon afterwards.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type.
  2016-08-05  2:44 ` [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-08-05  6:16   ` Jan Beulich
@ 2016-08-05 12:46   ` George Dunlap
  1 sibling, 0 replies; 32+ messages in thread
From: George Dunlap @ 2016-08-05 12:46 UTC (permalink / raw)
  To: Yu Zhang, Jan Beulich; +Cc: Lv, Zhiyuan, xen-devel

On 05/08/16 03:44, Yu Zhang wrote:
> 
> On 7/12/2016 5:02 PM, Yu Zhang wrote:
>> XenGT leverages ioreq server to track and forward the accesses to GPU
>> I/O resources, e.g. the PPGTT(per-process graphic translation tables).
>> Currently, ioreq server uses rangeset to track the BDF/ PIO/MMIO ranges
>> to be emulated. To select an ioreq server, the rangeset is searched to
>> see if the I/O range is recorded. However, number of ram pages to be
>> tracked may exceed the upper limit of rangeset.
>>
>> Previously, one solution was proposed to refactor the rangeset, and
>> extend its upper limit. However, after 12 rounds discussion, we have
>> decided to drop this approach due to security concerns. Now this new
>> patch series introduces a new mem type, HVMMEM_ioreq_server, and added
>> hvm operations to let one ioreq server to claim its ownership of ram
>> pages with this type. Accesses to a page of this type will be handled
>> by the specified ioreq server directly.
>>
>> Yu Zhang (4):
>>    x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server.
>>    x86/ioreq server: Add new functions to get/set memory types.
>>    x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
>>      an ioreq server.
>>    x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an
>>      ioreq server unmaps.
>>
>>   xen/arch/x86/hvm/emulate.c       |  33 +++-
>>   xen/arch/x86/hvm/hvm.c           | 395
>> ++++++++++++++++++++++++++-------------
>>   xen/arch/x86/hvm/ioreq.c         |  41 ++++
>>   xen/arch/x86/mm/hap/hap.c        |   9 +
>>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>>   xen/arch/x86/mm/p2m-ept.c        |  14 +-
>>   xen/arch/x86/mm/p2m-pt.c         |  30 ++-
>>   xen/arch/x86/mm/p2m.c            |  77 ++++++++
>>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>>   xen/include/asm-x86/hvm/ioreq.h  |   2 +
>>   xen/include/asm-x86/p2m.h        |  36 +++-
>>   xen/include/public/hvm/hvm_op.h  |  34 +++-
>>   12 files changed, 528 insertions(+), 148 deletions(-)
>>
> 
> Hi Jan & George, any comments on this version? :)
> Sorry if this mail disturbs you, but it has been more than 3 weeks since
> the post...

Yes, sorry for the delay -- I did take a look at it earlier this week
but of course there was quite a bit of new code, so it's not something I
can just give a once-over and ack. :-)

I should be able to get to it next week.

 -George



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-07-12  9:02 ` [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
@ 2016-08-08 15:40   ` Jan Beulich
  2016-08-09  7:39     ` Yu Zhang
  2016-08-10  8:09     ` Yu Zhang
  0 siblings, 2 replies; 32+ messages in thread
From: Jan Beulich @ 2016-08-08 15:40 UTC (permalink / raw)
  To: Paul Durrant, Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima

>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        struct hvm_ioreq_server *s;
> +
> +        if ( is_mmio )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +            p2m_type_t p2mt;
> +
> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> +
> +            if ( p2mt == p2m_ioreq_server )
> +            {
> +                unsigned int flags;
> +
> +                if ( dir != IOREQ_WRITE )
> +                    s = NULL;
> +                else
> +                {
> +                    s = p2m_get_ioreq_server(currd, &flags);
> +
> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
> +                        s = NULL;
> +                }
> +            }
> +            else
> +                s = hvm_select_ioreq_server(currd, &p);
> +        }
> +        else
> +            s = hvm_select_ioreq_server(currd, &p);

Wouldn't it both be more natural and make the logic even easier
to follow if s got set to NULL up front, all the "else"-s dropped,
and a simple

        if ( !s )
            s = hvm_select_ioreq_server(currd, &p);

be done in the end?
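To illustrate, something along these lines is what I mean (only a
sketch reusing the names from your hunk above, not meant as the final
shape; note that this way a request for which s stays NULL would now
fall through to hvm_select_ioreq_server() rather than being dropped,
which may or may not be what you want):

        struct hvm_ioreq_server *s = NULL;

        if ( is_mmio && dir == IOREQ_WRITE )
        {
            unsigned long gmfn = paddr_to_pfn(addr);
            p2m_type_t p2mt;
            unsigned int flags;

            get_gfn_query_unlocked(currd, gmfn, &p2mt);

            if ( p2mt == p2m_ioreq_server )
            {
                s = p2m_get_ioreq_server(currd, &flags);
                if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
                    s = NULL;
            }
        }

        if ( !s )
            s = hvm_select_ioreq_server(currd, &p);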

> @@ -5447,6 +5452,21 @@ static int hvmop_set_mem_type(
>      if ( !is_hvm_domain(d) )
>          goto out;
>  
> +    if ( a.hvmmem_type == HVMMEM_ioreq_server )
> +    {
> +        unsigned int flags;
> +        struct hvm_ioreq_server *s;
> +
> +        /* HVMMEM_ioreq_server is only supported for HAP enabled hvm. */
> +        if ( !hap_enabled(d) )
> +            goto out;
> +
> +        /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */
> +        s = p2m_get_ioreq_server(d, &flags);
> +        if ( s == NULL )
> +            goto out;

Either drop s as an intermediate variable altogether (preferred), or
constify it properly.

> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                     uint32_t type, uint32_t flags)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    /* For now, only HVMMEM_ioreq_server is supported. */
> +    if ( type != HVMMEM_ioreq_server )
> +        return -EINVAL;
> +
> +    /* For now, only write emulation is supported. */
> +    if ( flags & ~(XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> +        return -EINVAL;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);

This lock did get converted to a recursive one a little while back.

> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            rc = p2m_set_ioreq_server(d, flags, s);
> +            if ( rc == 0 )
> +                dprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");

Is this really useful?

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -132,6 +132,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>              entry->r = entry->w = entry->x = 1;
>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>              break;
> +        case p2m_ioreq_server:
> +            entry->r = 1;
> +            entry->w = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS);
> +            entry->x = 0;
> +            entry->a = !!cpu_has_vmx_ept_ad;
> +            entry->d = entry->w && cpu_has_vmx_ept_ad;

For self-consistency, could this become

            entry->d = entry->w && entry->a;

?

> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3225,8 +3225,7 @@ static int sh_page_fault(struct vcpu *v,
>      }
>  
>      /* Need to hand off device-model MMIO to the device model */
> -    if ( p2mt == p2m_mmio_dm
> -         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
> +    if ( p2mt == p2m_mmio_dm )

Could you remind me again what the code being removed here gets
replaced by, or why it doesn't need any replacement?

> @@ -336,6 +336,23 @@ struct p2m_domain {
>          struct ept_data ept;
>          /* NPT-equivalent structure could be added here. */
>      };
> +
> +    struct {
> +        spinlock_t lock;
> +        /*
> +         * ioreq server who's responsible for the emulation of
> +         * gfns with specific p2m type(for now, p2m_ioreq_server).
> +         */
> +        struct hvm_ioreq_server *server;
> +        /*
> +         * flags specifies whether read, write or both operations
> +         * are to be emulated by an ioreq server.
> +         */
> +        unsigned int flags;
> +
> +#define P2M_IOREQ_HANDLE_WRITE_ACCESS XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE
> +#define P2M_IOREQ_HANDLE_READ_ACCESS  XEN_HVMOP_IOREQ_MEM_ACCESS_READ

I think I did say so on a previous iteration already: I can't see the
value of these two defines, or in fact I can see these being actively
dangerous: The rest of your patch assume that each pair shares
their values (as there's no translation between them, but also no
BUILD_BUG_ON() ensuring they're identical).
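If the aliases were to be kept at all, I'd at least expect a build
time check along these (purely illustrative) lines, so the two sides
cannot silently diverge:

    BUILD_BUG_ON(P2M_IOREQ_HANDLE_WRITE_ACCESS !=
                 XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE);
    BUILD_BUG_ON(P2M_IOREQ_HANDLE_READ_ACCESS !=
                 XEN_HVMOP_IOREQ_MEM_ACCESS_READ);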

> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -89,7 +89,9 @@ typedef enum {
>      HVMMEM_unused,             /* Placeholder; setting memory to this type
>                                    will fail for code after 4.7.0 */
>  #endif
> -    HVMMEM_ioreq_server
> +    HVMMEM_ioreq_server        /* Memory type claimed by an ioreq server; type
> +                                  changes to this value are only allowed after
> +                                  an ioreq server has claimed its ownership. */

Wouldn't it be worth also noting in the comment that only changes
to/from rw are permitted?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-07-12  9:02 ` [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
@ 2016-08-08 16:29   ` Jan Beulich
  2016-08-09  7:39     ` Yu Zhang
  2016-08-16 13:35   ` George Dunlap
  1 sibling, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2016-08-08 16:29 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>          if ( rc )
>              goto out;
>  
> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
> +            p2m->ioreq.entry_count++;
> +
> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
> +            p2m->ioreq.entry_count--;
> +

These (and others below) happen, afaict, outside of any lock, which
can't be right.

> @@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
>  }
>  
>  static int hvmop_map_mem_type_to_ioreq_server(
> -    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
> +    unsigned long *iter)
>  {
>      xen_hvm_map_mem_type_to_ioreq_server_t op;
>      struct domain *d;
>      int rc;
> +    unsigned long gfn = *iter;
>  
>      if ( copy_from_guest(&op, uop, 1) )
>          return -EFAULT;
> @@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
>      if ( rc != 0 )
>          goto out;
>  
> -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
> +    if ( gfn == 0 || op.flags != 0 )
> +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);

Couldn't you omit the right side of the || in the if()?

> +    /*
> +     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
> +     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
> +     */
> +    if ( op.flags == 0 && rc == 0 )
> +    {
> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        while ( gfn <= p2m->max_mapped_pfn )
> +        {
> +            p2m_type_t t;
> +
> +            if ( p2m->ioreq.entry_count == 0 )
> +                break;

Perhaps better to be moved up into the while() expression?
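I.e. (illustrative only):

        while ( gfn <= p2m->max_mapped_pfn && p2m->ioreq.entry_count )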

> +            get_gfn_unshare(d, gfn, &t);
> +
> +            if ( (t == p2m_ioreq_server) &&
> +                 !(p2m_change_type_one(d, gfn, t, p2m_ram_rw)) )

Stray parentheses.

> +                p2m->ioreq.entry_count--;
> +
> +            put_gfn(d, gfn);
> +
> +            /* Check for continuation if it's not the last iteration. */
> +            if ( ++gfn <= p2m->max_mapped_pfn &&
> +                 !(gfn & HVMOP_op_mask) &&
> +                 hypercall_preempt_check() )
> +            {
> +                rc = -ERESTART;
> +                goto out;

I think you need to write gfn back to *iter.
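I.e. roughly (sketch only, on top of your hunk above):

            /* Record progress so the continuation resumes here. */
            if ( ++gfn <= p2m->max_mapped_pfn &&
                 !(gfn & HVMOP_op_mask) &&
                 hypercall_preempt_check() )
            {
                *iter = gfn;
                rc = -ERESTART;
                goto out;
            }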

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-08 15:40   ` Jan Beulich
@ 2016-08-09  7:39     ` Yu Zhang
  2016-08-09  8:11       ` Jan Beulich
  2016-08-10  8:09     ` Yu Zhang
  1 sibling, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-09  7:39 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima



On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>>           break;
>>       case X86EMUL_UNHANDLEABLE:
>>       {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        struct hvm_ioreq_server *s;
>> +
>> +        if ( is_mmio )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +            p2m_type_t p2mt;
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> +
>> +            if ( p2mt == p2m_ioreq_server )
>> +            {
>> +                unsigned int flags;
>> +
>> +                if ( dir != IOREQ_WRITE )
>> +                    s = NULL;
>> +                else
>> +                {
>> +                    s = p2m_get_ioreq_server(currd, &flags);
>> +
>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
>> +                        s = NULL;
>> +                }
>> +            }
>> +            else
>> +                s = hvm_select_ioreq_server(currd, &p);
>> +        }
>> +        else
>> +            s = hvm_select_ioreq_server(currd, &p);
> Wouldn't it both be more natural and make the logic even easier
> to follow if s got set to NULL up front, all the "else"-s dropped,
> and a simple
>
>          if ( !s )
>              s = hvm_select_ioreq_server(currd, &p);
>
> be done in the end?

Thanks, Jan. I'll have a try.

>> @@ -5447,6 +5452,21 @@ static int hvmop_set_mem_type(
>>       if ( !is_hvm_domain(d) )
>>           goto out;
>>   
>> +    if ( a.hvmmem_type == HVMMEM_ioreq_server )
>> +    {
>> +        unsigned int flags;
>> +        struct hvm_ioreq_server *s;
>> +
>> +        /* HVMMEM_ioreq_server is only supported for HAP enabled hvm. */
>> +        if ( !hap_enabled(d) )
>> +            goto out;
>> +
>> +        /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */
>> +        s = p2m_get_ioreq_server(d, &flags);
>> +        if ( s == NULL )
>> +            goto out;
> Either drop s as an intermediate variable altogether (preferred), or
> constify it properly.

Got it.

>> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>> +                                     uint32_t type, uint32_t flags)
>> +{
>> +    struct hvm_ioreq_server *s;
>> +    int rc;
>> +
>> +    /* For now, only HVMMEM_ioreq_server is supported. */
>> +    if ( type != HVMMEM_ioreq_server )
>> +        return -EINVAL;
>> +
>> +    /* For now, only write emulation is supported. */
>> +    if ( flags & ~(XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +        return -EINVAL;
>> +
>> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> This lock did get converted to a recursive one a little while back.

Oh, I see. I should use the spin_lock_recursive()/spin_unlock_recursive()
pair instead. Thanks.
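Just to confirm my understanding, something like this (sketch only):

    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
    ...
    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);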

>> +    rc = -ENOENT;
>> +    list_for_each_entry ( s,
>> +                          &d->arch.hvm_domain.ioreq_server.list,
>> +                          list_entry )
>> +    {
>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>> +            continue;
>> +
>> +        if ( s->id == id )
>> +        {
>> +            rc = p2m_set_ioreq_server(d, flags, s);
>> +            if ( rc == 0 )
>> +                dprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
> Is this really useful?

Sorry, are you referring to this debug message?
I believe it helps, especially when multiple ioreq servers are 
claiming/disclaiming their ownership of
the HVMMEM_ioreq_server. :)

>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -132,6 +132,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>>               entry->r = entry->w = entry->x = 1;
>>               entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>>               break;
>> +        case p2m_ioreq_server:
>> +            entry->r = 1;
>> +            entry->w = !(p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS);
>> +            entry->x = 0;
>> +            entry->a = !!cpu_has_vmx_ept_ad;
>> +            entry->d = entry->w && cpu_has_vmx_ept_ad;
> For self-consistency, could this become
>
>              entry->d = entry->w && entry->a;
>
> ?

Yep. Thanks. :)

>> --- a/xen/arch/x86/mm/shadow/multi.c
>> +++ b/xen/arch/x86/mm/shadow/multi.c
>> @@ -3225,8 +3225,7 @@ static int sh_page_fault(struct vcpu *v,
>>       }
>>   
>>       /* Need to hand off device-model MMIO to the device model */
>> -    if ( p2mt == p2m_mmio_dm
>> -         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
>> +    if ( p2mt == p2m_mmio_dm )
> Could you remind me again what the code being removed here gets
> replaced by, or why it doesn't need any replacement?

HVMMEM_ioreq_server is now only supported for HAP-enabled HVM guests,
so the shadow code does not need to handle this p2m type.

>> @@ -336,6 +336,23 @@ struct p2m_domain {
>>           struct ept_data ept;
>>           /* NPT-equivalent structure could be added here. */
>>       };
>> +
>> +    struct {
>> +        spinlock_t lock;
>> +        /*
>> +         * ioreq server who's responsible for the emulation of
>> +         * gfns with specific p2m type(for now, p2m_ioreq_server).
>> +         */
>> +        struct hvm_ioreq_server *server;
>> +        /*
>> +         * flags specifies whether read, write or both operations
>> +         * are to be emulated by an ioreq server.
>> +         */
>> +        unsigned int flags;
>> +
>> +#define P2M_IOREQ_HANDLE_WRITE_ACCESS XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE
>> +#define P2M_IOREQ_HANDLE_READ_ACCESS  XEN_HVMOP_IOREQ_MEM_ACCESS_READ
> I think I did say so on a previous iteration already: I can't see the
> value of these two defines, or in fact I can see these being actively
> dangerous: The rest of your patch assume that each pair shares
> their values (as there's no translation between them, but also no
> BUILD_BUG_ON() ensuring they're identical).

Oh, I thought it was a coding style issue we were discussing -
previously, there was a #define using the <underscore><uppercase>
pattern, which should be avoided.

For the P2M_IOREQ_HANDLE_WRITE/READ_ACCESS definitions, I agree they
can be removed.

>
>> --- a/xen/include/public/hvm/hvm_op.h
>> +++ b/xen/include/public/hvm/hvm_op.h
>> @@ -89,7 +89,9 @@ typedef enum {
>>       HVMMEM_unused,             /* Placeholder; setting memory to this type
>>                                     will fail for code after 4.7.0 */
>>   #endif
>> -    HVMMEM_ioreq_server
>> +    HVMMEM_ioreq_server        /* Memory type claimed by an ioreq server; type
>> +                                  changes to this value are only allowed after
>> +                                  an ioreq server has claimed its ownership. */
> Wouldn't it be worth also noting in the comment that only changes
> to/from rw are permitted?
>

OK. Will add this comment in next version.

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-08 16:29   ` Jan Beulich
@ 2016-08-09  7:39     ` Yu Zhang
  2016-08-09  8:13       ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-09  7:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima



On 8/9/2016 12:29 AM, Jan Beulich wrote:
>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>           if ( rc )
>>               goto out;
>>   
>> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
>> +            p2m->ioreq.entry_count++;
>> +
>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
>> +            p2m->ioreq.entry_count--;
>> +
> These (and others below) happen, afaict, outside of any lock, which
> can't be right.

How about we make this entry_count an atomic_t and use atomic_inc/dec
instead?
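Something like the following (sketch only, reusing the checks from
this patch, with entry_count changed to atomic_t in struct p2m_domain):

        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
            atomic_inc(&p2m->ioreq.entry_count);

        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
            atomic_dec(&p2m->ioreq.entry_count);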

>
>> @@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
>>   }
>>   
>>   static int hvmop_map_mem_type_to_ioreq_server(
>> -    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
>> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
>> +    unsigned long *iter)
>>   {
>>       xen_hvm_map_mem_type_to_ioreq_server_t op;
>>       struct domain *d;
>>       int rc;
>> +    unsigned long gfn = *iter;
>>   
>>       if ( copy_from_guest(&op, uop, 1) )
>>           return -EFAULT;
>> @@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
>>       if ( rc != 0 )
>>           goto out;
>>   
>> -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
>> +    if ( gfn == 0 || op.flags != 0 )
>> +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
> Couldn't you omit the right side of the || in the if()?

Oh right, it is redundant. Thanks! :-)

>
>> +    /*
>> +     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
>> +     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
>> +     */
>> +    if ( op.flags == 0 && rc == 0 )
>> +    {
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +        while ( gfn <= p2m->max_mapped_pfn )
>> +        {
>> +            p2m_type_t t;
>> +
>> +            if ( p2m->ioreq.entry_count == 0 )
>> +                break;
> Perhaps better to be moved up into the while() expression?

OK. Another thing: maybe we can use _atomic_read() to get the value of
entry_count if the counter is defined as atomic_t?

>> +            get_gfn_unshare(d, gfn, &t);
>> +
>> +            if ( (t == p2m_ioreq_server) &&
>> +                 !(p2m_change_type_one(d, gfn, t, p2m_ram_rw)) )
> Stray parentheses.

Got it. Thanks!

>> +                p2m->ioreq.entry_count--;
>> +
>> +            put_gfn(d, gfn);
>> +
>> +            /* Check for continuation if it's not the last iteration. */
>> +            if ( ++gfn <= p2m->max_mapped_pfn &&
>> +                 !(gfn & HVMOP_op_mask) &&
>> +                 hypercall_preempt_check() )
>> +            {
>> +                rc = -ERESTART;
>> +                goto out;
> I think you need to write gfn back to *iter.

Yes, thanks for pointing out.

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-09  7:39     ` Yu Zhang
@ 2016-08-09  8:11       ` Jan Beulich
  2016-08-09  8:20         ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2016-08-09  8:11 UTC (permalink / raw)
  To: Paul Durrant, Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima

>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
> On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>> +    rc = -ENOENT;
>>> +    list_for_each_entry ( s,
>>> +                          &d->arch.hvm_domain.ioreq_server.list,
>>> +                          list_entry )
>>> +    {
>>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>>> +            continue;
>>> +
>>> +        if ( s->id == id )
>>> +        {
>>> +            rc = p2m_set_ioreq_server(d, flags, s);
>>> +            if ( rc == 0 )
>>> +                dprintk(XENLOG_DEBUG, "%u %s type HVMMEM_ioreq_server.\n",
>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>> Is this really useful?
> 
> Sorry, are you referring to this debug message?
> I believe it helps, especially when multiple ioreq servers are 
> claiming/disclaiming their ownership of
> the HVMMEM_ioreq_server. :)

Well, that's a configuration bug anyway right now, so I'm not really
with you. But in the end it'll be Paul to judge ...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-09  7:39     ` Yu Zhang
@ 2016-08-09  8:13       ` Jan Beulich
  2016-08-09  9:25         ` Yu Zhang
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2016-08-09  8:13 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:

> 
> On 8/9/2016 12:29 AM, Jan Beulich wrote:
>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>>           if ( rc )
>>>               goto out;
>>>   
>>> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
>>> +            p2m->ioreq.entry_count++;
>>> +
>>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
>>> +            p2m->ioreq.entry_count--;
>>> +
>> These (and others below) happen, afaict, outside of any lock, which
>> can't be right.
> 
> How about we make this entry_count as atomic_t and use atomic_inc/dec 
> instead?

That's certainly one of the options, but beware of overflow.

>>> +    /*
>>> +     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
>>> +     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
>>> +     */
>>> +    if ( op.flags == 0 && rc == 0 )
>>> +    {
>>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> +        while ( gfn <= p2m->max_mapped_pfn )
>>> +        {
>>> +            p2m_type_t t;
>>> +
>>> +            if ( p2m->ioreq.entry_count == 0 )
>>> +                break;
>> Perhaps better to be moved up into the while() expression?
> 
> OK. Another thing is maybe we can use _atomic_read() to get value of 
> entry_count if
> the counter is defined as atomic_t?

Well, that's an orthogonal adjustment which solely depends on the
type of the field.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-09  8:11       ` Jan Beulich
@ 2016-08-09  8:20         ` Paul Durrant
  2016-08-09  8:51           ` Yu Zhang
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2016-08-09  8:20 UTC (permalink / raw)
  To: Jan Beulich, Yu Zhang
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, Jun Nakajima

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 August 2016 09:11
> To: Paul Durrant; Yu Zhang
> Cc: Andrew Cooper; George Dunlap; Jun Nakajima; Kevin Tian;
> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
> Subject: Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server.
> 
> >>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
> > On 8/8/2016 11:40 PM, Jan Beulich wrote:
> >>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
> >>> +    rc = -ENOENT;
> >>> +    list_for_each_entry ( s,
> >>> +                          &d->arch.hvm_domain.ioreq_server.list,
> >>> +                          list_entry )
> >>> +    {
> >>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> >>> +            continue;
> >>> +
> >>> +        if ( s->id == id )
> >>> +        {
> >>> +            rc = p2m_set_ioreq_server(d, flags, s);
> >>> +            if ( rc == 0 )
> >>> +                dprintk(XENLOG_DEBUG, "%u %s type
> HVMMEM_ioreq_server.\n",
> >>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
> >> Is this really useful?
> >
> > Sorry, are you referring to this debug message?
> > I believe it helps, especially when multiple ioreq servers are
> > claiming/disclaiming their ownership of
> > the HVMMEM_ioreq_server. :)
> 
> Well, that's a configuration bug anyway right now, so I'm not really
> with you. But in the end it'll be Paul to judge ...
>

Indeed, I don't think the message has a massive amount of value. More useful would be to add a call into keyhandler.c:dump_domains() to display the current claim.
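Something along these lines, perhaps (purely illustrative - the format
and the fields printed are made up here):

    printk("    ioreq server %u claims HVMMEM_ioreq_server (flags %x)\n",
           s->id, p2m->ioreq.flags);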

  Paul
 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-09  8:20         ` Paul Durrant
@ 2016-08-09  8:51           ` Yu Zhang
  2016-08-09  9:07             ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-09  8:51 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, Jun Nakajima



On 8/9/2016 4:20 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 August 2016 09:11
>> To: Paul Durrant; Yu Zhang
>> Cc: Andrew Cooper; George Dunlap; Jun Nakajima; Kevin Tian;
>> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
>> Subject: Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to
>> map guest ram with p2m_ioreq_server to an ioreq server.
>>
>>>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
>>> On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>>> +    rc = -ENOENT;
>>>>> +    list_for_each_entry ( s,
>>>>> +                          &d->arch.hvm_domain.ioreq_server.list,
>>>>> +                          list_entry )
>>>>> +    {
>>>>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>>>>> +            continue;
>>>>> +
>>>>> +        if ( s->id == id )
>>>>> +        {
>>>>> +            rc = p2m_set_ioreq_server(d, flags, s);
>>>>> +            if ( rc == 0 )
>>>>> +                dprintk(XENLOG_DEBUG, "%u %s type
>> HVMMEM_ioreq_server.\n",
>>>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
>>>> Is this really useful?
>>> Sorry, are you referring to this debug message?
>>> I believe it helps, especially when multiple ioreq servers are
>>> claiming/disclaiming their ownership of
>>> the HVMMEM_ioreq_server. :)
>> Well, that's a configuration bug anyway right now, so I'm not really
>> with you. But in the end it'll be Paul to judge ...
>>
> Indeed, I don't think the message has a massive amount of value. More useful would be to add a call into keyhandler.c:dump_domains() to display the current claim.
>

All right. Let's remove this debug message. The debug key handler can be
updated in a separate patch; is this OK? :-)

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-09  8:51           ` Yu Zhang
@ 2016-08-09  9:07             ` Paul Durrant
  0 siblings, 0 replies; 32+ messages in thread
From: Paul Durrant @ 2016-08-09  9:07 UTC (permalink / raw)
  To: Yu Zhang, Jan Beulich
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, Jun Nakajima

> -----Original Message-----
> From: Yu Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 09 August 2016 09:51
> To: Paul Durrant; Jan Beulich
> Cc: Andrew Cooper; George Dunlap; Jun Nakajima; Kevin Tian;
> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
> Subject: Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server.
> 
> 
> 
> On 8/9/2016 4:20 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 09 August 2016 09:11
> >> To: Paul Durrant; Yu Zhang
> >> Cc: Andrew Cooper; George Dunlap; Jun Nakajima; Kevin Tian;
> >> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
> >> Subject: Re: [Xen-devel] [PATCH v5 3/4] x86/ioreq server: Add HVMOP to
> >> map guest ram with p2m_ioreq_server to an ioreq server.
> >>
> >>>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
> >>> On 8/8/2016 11:40 PM, Jan Beulich wrote:
> >>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
> >>>>> +    rc = -ENOENT;
> >>>>> +    list_for_each_entry ( s,
> >>>>> +                          &d->arch.hvm_domain.ioreq_server.list,
> >>>>> +                          list_entry )
> >>>>> +    {
> >>>>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> >>>>> +            continue;
> >>>>> +
> >>>>> +        if ( s->id == id )
> >>>>> +        {
> >>>>> +            rc = p2m_set_ioreq_server(d, flags, s);
> >>>>> +            if ( rc == 0 )
> >>>>> +                dprintk(XENLOG_DEBUG, "%u %s type
> >> HVMMEM_ioreq_server.\n",
> >>>>> +                         s->id, (flags != 0) ? "mapped to" : "unmapped from");
> >>>> Is this really useful?
> >>> Sorry, are you referring to this debug message?
> >>> I believe it helps, especially when multiple ioreq servers are
> >>> claiming/disclaiming their ownership of
> >>> the HVMMEM_ioreq_server. :)
> >> Well, that's a configuration bug anyway right now, so I'm not really
> >> with you. But in the end it'll be Paul to judge ...
> >>
> > Indeed, I don't think the message has a massive amount of value. More useful
> would be to add a call into keyhandler.c:dump_domains() to display the
> current claim.
> >
> 
> All right. Let's remove this debug message. The debug key handler can be
> updated in a separate patch, is this OK? :-)
>

OK with me.

  Paul
 
> Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-09  8:13       ` Jan Beulich
@ 2016-08-09  9:25         ` Yu Zhang
  2016-08-09  9:45           ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-09  9:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima



On 8/9/2016 4:13 PM, Jan Beulich wrote:
>>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
>> On 8/9/2016 12:29 AM, Jan Beulich wrote:
>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>>>            if ( rc )
>>>>                goto out;
>>>>    
>>>> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
>>>> +            p2m->ioreq.entry_count++;
>>>> +
>>>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
>>>> +            p2m->ioreq.entry_count--;
>>>> +
>>> These (and others below) happen, afaict, outside of any lock, which
>>> can't be right.
>> How about we make this entry_count as atomic_t and use atomic_inc/dec
>> instead?
> That's certainly one of the options, but beware of overflow.

Oh, thanks for the reminder, Jan. I just found that atomic_t is defined
as "typedef struct { int counter; } atomic_t;",
which does have overflow issues if entry_count is supposed to be a uint32
or uint64.

Now I have another thought: the entry_count is referenced in 3
different scenarios:
1> the hvmop handlers - hvmop_set_mem_type() and
hvmop_map_mem_type_to_ioreq_server(), which are triggered by the device
model and will not be concurrent. And hvmop_set_mem_type() will permit
the mem type change only when an ioreq server has already been mapped
to this type.

2> the misconfiguration handlers - resolve_misconfig() or do_recalc(),
which are triggered by HVM VM exits, yet these will only happen after
the ioreq server has already been unmapped. This means the accesses to
the entry_count will not be concurrent with the above hvmop handlers.

3> routine hap_enable_log_dirty() - this can be triggered by the tool
stack at any time, but it only reads this entry_count, so a
read_atomic() would work.

Do you think this analysis is reasonable?
If so, we may only use read_atomic() to get the value of entry_count in
hap_enable_log_dirty() and keep the other places as they are.
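For 3>, I mean something along these lines (only a sketch to show the
idea, assuming entry_count stays a plain unsigned int; the exact
placement inside hap_enable_log_dirty() is still to be decided):

    /* Refuse global log-dirty while an ioreq server still owns
     * outstanding p2m_ioreq_server entries (sketch only). */
    if ( read_atomic(&p2m->ioreq.entry_count) )
        return -EBUSY;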

Of course, some explanatory comments at entry_count's definition would
be necessary. :-)

Yu


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-09  9:25         ` Yu Zhang
@ 2016-08-09  9:45           ` Jan Beulich
  2016-08-09 10:21             ` Yu Zhang
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2016-08-09  9:45 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

>>> On 09.08.16 at 11:25, <yu.c.zhang@linux.intel.com> wrote:

> 
> On 8/9/2016 4:13 PM, Jan Beulich wrote:
>>>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
>>> On 8/9/2016 12:29 AM, Jan Beulich wrote:
>>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>>>>            if ( rc )
>>>>>                goto out;
>>>>>    
>>>>> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
>>>>> +            p2m->ioreq.entry_count++;
>>>>> +
>>>>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
>>>>> +            p2m->ioreq.entry_count--;
>>>>> +
>>>> These (and others below) happen, afaict, outside of any lock, which
>>>> can't be right.
>>> How about we make this entry_count as atomic_t and use atomic_inc/dec
>>> instead?
>> That's certainly one of the options, but beware of overflow.
> 
>> Oh, thanks for the reminder, Jan. I just found that atomic_t is defined
>> as "typedef struct { int counter; } atomic_t;",
>> which does have overflow issues if entry_count is supposed to be a uint32
>> or uint64.
> 
> Now I have another thought: the entry_count is referenced in 3 
> different  scenarios:
> 1> the hvmop handlers -  hvmop_set_mem_type() and 
> hvmop_map_mem_type_to_ioreq_server(),
> which  are triggered by device model, and will not be concurrent. And 
> hvmop_set_mem_type() will
> permit the mem type change only when an ioreq server has already been 
> mapped to this type.

You shouldn't rely on a well behaved dm, and that's already
leaving aside the question whether there couldn't even be well
behaved use cases of parallel invocations of this op.

> 2> the misconfiguration handlers - resolve_misconfig() or do_recalc(),  
> which are triggered by HVM's
> vm exit, yet this will only happen after the ioreq server has already 
> been unmapped. This means the
> accesses to the entry_count will not be concurrent with the above hvmop 
> handlers.

This case may be fine, but not for (just) the named reason:
Multiple misconfig invocations can happen at the same time, but
presumably your addition is inside the p2m-locked region (you'd
have to double check that).

> 3> routine hap_enable_log_dirty() - this can be triggered by tool stack 
> at any time, but only by read
> access to this entry_count, so a read_atomic() would work.

A read access may be fine, but only if the value can't get incremented
in a racy way.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-09  9:45           ` Jan Beulich
@ 2016-08-09 10:21             ` Yu Zhang
  0 siblings, 0 replies; 32+ messages in thread
From: Yu Zhang @ 2016-08-09 10:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima



On 8/9/2016 5:45 PM, Jan Beulich wrote:
>>>> On 09.08.16 at 11:25, <yu.c.zhang@linux.intel.com> wrote:
>> On 8/9/2016 4:13 PM, Jan Beulich wrote:
>>>>>> On 09.08.16 at 09:39, <yu.c.zhang@linux.intel.com> wrote:
>>>> On 8/9/2016 12:29 AM, Jan Beulich wrote:
>>>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>>>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>>>>>             if ( rc )
>>>>>>                 goto out;
>>>>>>     
>>>>>> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
>>>>>> +            p2m->ioreq.entry_count++;
>>>>>> +
>>>>>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
>>>>>> +            p2m->ioreq.entry_count--;
>>>>>> +
>>>>> These (and others below) happen, afaict, outside of any lock, which
>>>>> can't be right.
>>>> How about we make this entry_count as atomic_t and use atomic_inc/dec
>>>> instead?
>>> That's certainly one of the options, but beware of overflow.
>> Oh, thanks for your remind, Jan. I just found that atomic_t is defined
>> as  "typedef struct { int counter; } atomic_t;"
>> which do have overflow issues if entry_count is supposed to be a uint32
>> or uint64.
>>
>> Now I have another thought: the entry_count is referenced in 3
>> different  scenarios:
>> 1> the hvmop handlers -  hvmop_set_mem_type() and
>> hvmop_map_mem_type_to_ioreq_server(),
>> which  are triggered by device model, and will not be concurrent. And
>> hvmop_set_mem_type() will
>> permit the mem type change only when an ioreq server has already been
>> mapped to this type.
> You shouldn't rely on a well behaved dm, and that's already
> leaving aside the question whether there couldn't even be well
> behaved use cases of parallel invocations of this op.

Oh. This is a good point. Thanks. :-)

>> 2> the misconfiguration handlers - resolve_misconfig() or do_recalc(),
>> which are triggered by HVM's
>> vm exit, yet this will only happen after the ioreq server has already
>> been unmapped. This means the
>> accesses to the entry_count will not be concurrent with the above hvmop
>> handlers.
> This case may be fine, but not for (just) the named reason:
> Multiple misconfig invocations can happen at the same time, but
> presumably your addition is inside the p2m-locked region (you'd
> have to double check that).

Yes, both are wrapped in the p2m-locked region. So this gives me another
thought: if we put the increment/decrement of entry_count inside
p2m_change_type_one(), which already takes p2m_lock/p2m_unlock, we could
avoid introducing another spinlock and also avoid the overflow concern
that atomic_t would bring.
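I.e. roughly (sketch only; ot/nt being the old/new type arguments of
p2m_change_type_one(), with the adjustment sitting inside its locked
section):

    if ( ot == p2m_ram_rw && nt == p2m_ioreq_server )
        p2m->ioreq.entry_count++;
    else if ( ot == p2m_ioreq_server && nt == p2m_ram_rw )
        p2m->ioreq.entry_count--;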

[snip]

Thanks
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-08 15:40   ` Jan Beulich
  2016-08-09  7:39     ` Yu Zhang
@ 2016-08-10  8:09     ` Yu Zhang
  2016-08-10 10:33       ` Jan Beulich
  1 sibling, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-10  8:09 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima



On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>>           break;
>>       case X86EMUL_UNHANDLEABLE:
>>       {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        struct hvm_ioreq_server *s;
>> +
>> +        if ( is_mmio )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +            p2m_type_t p2mt;
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> +
>> +            if ( p2mt == p2m_ioreq_server )
>> +            {
>> +                unsigned int flags;
>> +
>> +                if ( dir != IOREQ_WRITE )
>> +                    s = NULL;
>> +                else
>> +                {
>> +                    s = p2m_get_ioreq_server(currd, &flags);
>> +
>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
>> +                        s = NULL;
>> +                }
>> +            }
>> +            else
>> +                s = hvm_select_ioreq_server(currd, &p);
>> +        }
>> +        else
>> +            s = hvm_select_ioreq_server(currd, &p);
> Wouldn't it both be more natural and make the logic even easier
> to follow if s got set to NULL up front, all the "else"-s dropped,
> and a simple
>
>          if ( !s )
>              s = hvm_select_ioreq_server(currd, &p);
>
> be done in the end?
>

Sorry, Jan. I tried to simplify the above code, but found the new code
is still not very clean, because in some cases s is supposed to remain
NULL instead of being set by hvm_select_ioreq_server().
To keep the same logic, the simplified code looks like this:

      case X86EMUL_UNHANDLEABLE:
      {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s = NULL;
+        p2m_type_t p2mt = p2m_invalid;
+
+        if ( is_mmio && dir == IOREQ_WRITE )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server )
+            {
+                unsigned int flags;
+
+                s = p2m_get_ioreq_server(currd, &flags);
+                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+                    s = NULL;
+            }
+        }
+
+        if ( !s && p2mt != p2m_ioreq_server )
+            s = hvm_select_ioreq_server(currd, &p);

          /* If there is no suitable backing DM, just ignore accesses */
          if ( !s )

As you can see, the definition of p2mt is moved outside the
if ( is_mmio ) check, and it is tested against p2m_ioreq_server before
we search the ioreq server's rangesets in hvm_select_ioreq_server().
So I am not quite satisfied with this simplification.
Any suggestions?

[snip]

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-10  8:09     ` Yu Zhang
@ 2016-08-10 10:33       ` Jan Beulich
  2016-08-10 10:43         ` Paul Durrant
  2016-08-10 10:43         ` Yu Zhang
  0 siblings, 2 replies; 32+ messages in thread
From: Jan Beulich @ 2016-08-10 10:33 UTC (permalink / raw)
  To: Paul Durrant, Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima

>>> On 10.08.16 at 10:09, <yu.c.zhang@linux.intel.com> wrote:

> 
> On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>>>           break;
>>>       case X86EMUL_UNHANDLEABLE:
>>>       {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        struct hvm_ioreq_server *s;
>>> +
>>> +        if ( is_mmio )
>>> +        {
>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>> +            p2m_type_t p2mt;
>>> +
>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>> +
>>> +            if ( p2mt == p2m_ioreq_server )
>>> +            {
>>> +                unsigned int flags;
>>> +
>>> +                if ( dir != IOREQ_WRITE )
>>> +                    s = NULL;
>>> +                else
>>> +                {
>>> +                    s = p2m_get_ioreq_server(currd, &flags);
>>> +
>>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
>>> +                        s = NULL;
>>> +                }
>>> +            }
>>> +            else
>>> +                s = hvm_select_ioreq_server(currd, &p);
>>> +        }
>>> +        else
>>> +            s = hvm_select_ioreq_server(currd, &p);
>> Wouldn't it both be more natural and make the logic even easier
>> to follow if s got set to NULL up front, all the "else"-s dropped,
>> and a simple
>>
>>          if ( !s )
>>              s = hvm_select_ioreq_server(currd, &p);
>>
>> be done in the end?
>>
> 
> Sorry, Jan. I tried to simplify above code, but found the new code is 
> still not very
> clean,  because in some cases the s is supposed to return NULL instead 
> of to be
> set from the hvm_select_ioreq_server().
> To keep the same logic, the simplified code looks like this:
> 
>       case X86EMUL_UNHANDLEABLE:
>       {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        struct hvm_ioreq_server *s = NULL;
> +        p2m_type_t p2mt = p2m_invalid;
> +
> +        if ( is_mmio && dir == IOREQ_WRITE )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +
> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> +
> +            if ( p2mt == p2m_ioreq_server )
> +            {
> +                unsigned int flags;
> +
> +                s = p2m_get_ioreq_server(currd, &flags);
> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> +                    s = NULL;
> +            }
> +        }
> +
> +        if ( !s && p2mt != p2m_ioreq_server )
> +            s = hvm_select_ioreq_server(currd, &p);
> 
>           /* If there is no suitable backing DM, just ignore accesses */
>           if ( !s )
> 
> As you can see, definition of p2mt is moved outside the if ( is_mmio ) 
> judgement,
> and is checked against p2m_ioreq_server before we search the ioreq 
> server's rangeset
> in hvm_select_ioreq_server(). So I am not quite satisfied with this 
> simplification.
> Any suggestions?

I think it's better than the code was before, but an implicit part of
my suggestion was that I'm not really convinced the
" && p2mt != p2m_ioreq_server" part of your new conditional is
really needed: Would it indeed be wrong to hand such a request
to the "normal" ioreq server, instead of terminating it right away?
(I guess that's a question to you as much as to Paul.)

Jan



* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-10 10:33       ` Jan Beulich
@ 2016-08-10 10:43         ` Paul Durrant
  2016-08-10 12:32           ` Yu Zhang
  2016-08-10 10:43         ` Yu Zhang
  1 sibling, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2016-08-10 10:43 UTC (permalink / raw)
  To: Jan Beulich, Yu Zhang
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, Jun Nakajima

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2016 11:33
> To: Paul Durrant; Yu Zhang
> Cc: Andrew Cooper; George Dunlap; Jun Nakajima; Kevin Tian;
> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
> Subject: Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram
> with p2m_ioreq_server to an ioreq server.
> 
> >>> On 10.08.16 at 10:09, <yu.c.zhang@linux.intel.com> wrote:
> 
> >
> > On 8/8/2016 11:40 PM, Jan Beulich wrote:
> >>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
> >>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
> >>>           break;
> >>>       case X86EMUL_UNHANDLEABLE:
> >>>       {
> >>> -        struct hvm_ioreq_server *s =
> >>> -            hvm_select_ioreq_server(curr->domain, &p);
> >>> +        struct hvm_ioreq_server *s;
> >>> +
> >>> +        if ( is_mmio )
> >>> +        {
> >>> +            unsigned long gmfn = paddr_to_pfn(addr);
> >>> +            p2m_type_t p2mt;
> >>> +
> >>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> >>> +
> >>> +            if ( p2mt == p2m_ioreq_server )
> >>> +            {
> >>> +                unsigned int flags;
> >>> +
> >>> +                if ( dir != IOREQ_WRITE )
> >>> +                    s = NULL;
> >>> +                else
> >>> +                {
> >>> +                    s = p2m_get_ioreq_server(currd, &flags);
> >>> +
> >>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
> >>> +                        s = NULL;
> >>> +                }
> >>> +            }
> >>> +            else
> >>> +                s = hvm_select_ioreq_server(currd, &p);
> >>> +        }
> >>> +        else
> >>> +            s = hvm_select_ioreq_server(currd, &p);
> >> Wouldn't it both be more natural and make the logic even easier
> >> to follow if s got set to NULL up front, all the "else"-s dropped,
> >> and a simple
> >>
> >>          if ( !s )
> >>              s = hvm_select_ioreq_server(currd, &p);
> >>
> >> be done in the end?
> >>
> >
> > Sorry, Jan. I tried to simplify above code, but found the new code is
> > still not very
> > clean,  because in some cases the s is supposed to return NULL instead
> > of to be
> > set from the hvm_select_ioreq_server().
> > To keep the same logic, the simplified code looks like this:
> >
> >       case X86EMUL_UNHANDLEABLE:
> >       {
> > -        struct hvm_ioreq_server *s =
> > -            hvm_select_ioreq_server(curr->domain, &p);
> > +        struct hvm_ioreq_server *s = NULL;
> > +        p2m_type_t p2mt = p2m_invalid;
> > +
> > +        if ( is_mmio && dir == IOREQ_WRITE )
> > +        {
> > +            unsigned long gmfn = paddr_to_pfn(addr);
> > +
> > +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> > +
> > +            if ( p2mt == p2m_ioreq_server )
> > +            {
> > +                unsigned int flags;
> > +
> > +                s = p2m_get_ioreq_server(currd, &flags);
> > +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> > +                    s = NULL;
> > +            }
> > +        }
> > +
> > +        if ( !s && p2mt != p2m_ioreq_server )
> > +            s = hvm_select_ioreq_server(currd, &p);
> >
> >           /* If there is no suitable backing DM, just ignore accesses */
> >           if ( !s )
> >
> > As you can see, definition of p2mt is moved outside the if ( is_mmio )
> > judgement,
> > and is checked against p2m_ioreq_server before we search the ioreq
> > server's rangeset
> > in hvm_select_ioreq_server(). So I am not quite satisfied with this
> > simplification.
> > Any suggestions?
> 
> I think it's better than the code was before, but an implicit part of
> my suggestion was that I'm not really convinced the
> " && p2mt != p2m_ioreq_server" part of your new conditional is
> really needed: Would it indeed be wrong to hand such a request
> to the "normal" ioreq server, instead of terminating it right away?
> (I guess that's a question to you as much as to Paul.)
>

Well, I thought the correct semantics for p2m_ioreq_server without any interception are the same as p2m_ram_rw, in which case, if we find an ioreq server but it does not want to emulate writes, we should complete the I/O by writing to guest RAM. But how are we ever going to hit this code path without some form of race with the EPT configuration?

  Paul
 
> Jan



* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-10 10:33       ` Jan Beulich
  2016-08-10 10:43         ` Paul Durrant
@ 2016-08-10 10:43         ` Yu Zhang
  2016-08-11  8:47           ` Yu Zhang
  1 sibling, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-10 10:43 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima



On 8/10/2016 6:33 PM, Jan Beulich wrote:
>>>> On 10.08.16 at 10:09, <yu.c.zhang@linux.intel.com> wrote:
>> On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>>>>            break;
>>>>        case X86EMUL_UNHANDLEABLE:
>>>>        {
>>>> -        struct hvm_ioreq_server *s =
>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>> +        struct hvm_ioreq_server *s;
>>>> +
>>>> +        if ( is_mmio )
>>>> +        {
>>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>>> +            p2m_type_t p2mt;
>>>> +
>>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>>> +
>>>> +            if ( p2mt == p2m_ioreq_server )
>>>> +            {
>>>> +                unsigned int flags;
>>>> +
>>>> +                if ( dir != IOREQ_WRITE )
>>>> +                    s = NULL;
>>>> +                else
>>>> +                {
>>>> +                    s = p2m_get_ioreq_server(currd, &flags);
>>>> +
>>>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
>>>> +                        s = NULL;
>>>> +                }
>>>> +            }
>>>> +            else
>>>> +                s = hvm_select_ioreq_server(currd, &p);
>>>> +        }
>>>> +        else
>>>> +            s = hvm_select_ioreq_server(currd, &p);
>>> Wouldn't it both be more natural and make the logic even easier
>>> to follow if s got set to NULL up front, all the "else"-s dropped,
>>> and a simple
>>>
>>>           if ( !s )
>>>               s = hvm_select_ioreq_server(currd, &p);
>>>
>>> be done in the end?
>>>
>> Sorry, Jan. I tried to simplify above code, but found the new code is
>> still not very
>> clean,  because in some cases the s is supposed to return NULL instead
>> of to be
>> set from the hvm_select_ioreq_server().
>> To keep the same logic, the simplified code looks like this:
>>
>>        case X86EMUL_UNHANDLEABLE:
>>        {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        struct hvm_ioreq_server *s = NULL;
>> +        p2m_type_t p2mt = p2m_invalid;
>> +
>> +        if ( is_mmio && dir == IOREQ_WRITE )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> +
>> +            if ( p2mt == p2m_ioreq_server )
>> +            {
>> +                unsigned int flags;
>> +
>> +                s = p2m_get_ioreq_server(currd, &flags);
>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +                    s = NULL;
>> +            }
>> +        }
>> +
>> +        if ( !s && p2mt != p2m_ioreq_server )
>> +            s = hvm_select_ioreq_server(currd, &p);
>>
>>            /* If there is no suitable backing DM, just ignore accesses */
>>            if ( !s )
>>
>> As you can see, definition of p2mt is moved outside the if ( is_mmio )
>> judgement,
>> and is checked against p2m_ioreq_server before we search the ioreq
>> server's rangeset
>> in hvm_select_ioreq_server(). So I am not quite satisfied with this
>> simplification.
>> Any suggestions?
> I think it's better than the code was before, but an implicit part of
> my suggestion was that I'm not really convinced the
> " && p2mt != p2m_ioreq_server" part of your new conditional is
> really needed: Would it indeed be wrong to hand such a request
> to the "normal" ioreq server, instead of terminating it right away?
> (I guess that's a question to you as much as to Paul.)
>

Thanks for your reply, Jan.
The " && p2mt != p2m_ioreq_server" condition is just to guarantee that if a
write operation is trapped while, in the same period, the device model has
changed the status of the ioreq server, the access will be discarded.

A second thought: I am now more worried about the " && dir == IOREQ_WRITE"
condition, which we used previously to set s to NULL if the access is not a
write. If the HVM guest uses a read-modify-write instruction on a
write-protected address, it will be treated as both a read and a write
access in ept_handle_violation(). In such a situation, we need to emulate
the read access first (by just returning the value being fetched, either in
the hypervisor or in the device model), instead of discarding the read
access.
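
To illustrate, here is a rough, untested sketch of what "returning the value
being fetched in the hypervisor" could look like - the helper name is made
up, only hvm_copy_from_guest_phys() is existing code:

    /*
     * Sketch only: a p2m_ioreq_server page is ordinary RAM underneath,
     * so the read half of a read-modify-write can be satisfied locally
     * instead of being dropped.
     */
    static int hvmemul_read_p2m_ioreq_page(paddr_t addr, void *buf,
                                           unsigned int size)
    {
        if ( hvm_copy_from_guest_phys(buf, addr, size) != HVMCOPY_okay )
            return X86EMUL_UNHANDLEABLE;

        return X86EMUL_OKAY;
    }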

Thanks
Yu



* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-10 10:43         ` Paul Durrant
@ 2016-08-10 12:32           ` Yu Zhang
  0 siblings, 0 replies; 32+ messages in thread
From: Yu Zhang @ 2016-08-10 12:32 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, Jun Nakajima



On 8/10/2016 6:43 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2016 11:33
>> To: Paul Durrant; Yu Zhang
>> Cc: Andrew Cooper; George Dunlap; Jun Nakajima; Kevin Tian;
>> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
>> Subject: Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram
>> with p2m_ioreq_server to an ioreq server.
>>
>>>>> On 10.08.16 at 10:09, <yu.c.zhang@linux.intel.com> wrote:
>>> On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>>>>>            break;
>>>>>        case X86EMUL_UNHANDLEABLE:
>>>>>        {
>>>>> -        struct hvm_ioreq_server *s =
>>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>>> +        struct hvm_ioreq_server *s;
>>>>> +
>>>>> +        if ( is_mmio )
>>>>> +        {
>>>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>>>> +            p2m_type_t p2mt;
>>>>> +
>>>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>>>> +
>>>>> +            if ( p2mt == p2m_ioreq_server )
>>>>> +            {
>>>>> +                unsigned int flags;
>>>>> +
>>>>> +                if ( dir != IOREQ_WRITE )
>>>>> +                    s = NULL;
>>>>> +                else
>>>>> +                {
>>>>> +                    s = p2m_get_ioreq_server(currd, &flags);
>>>>> +
>>>>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
>>>>> +                        s = NULL;
>>>>> +                }
>>>>> +            }
>>>>> +            else
>>>>> +                s = hvm_select_ioreq_server(currd, &p);
>>>>> +        }
>>>>> +        else
>>>>> +            s = hvm_select_ioreq_server(currd, &p);
>>>> Wouldn't it both be more natural and make the logic even easier
>>>> to follow if s got set to NULL up front, all the "else"-s dropped,
>>>> and a simple
>>>>
>>>>           if ( !s )
>>>>               s = hvm_select_ioreq_server(currd, &p);
>>>>
>>>> be done in the end?
>>>>
>>> Sorry, Jan. I tried to simplify above code, but found the new code is
>>> still not very
>>> clean,  because in some cases the s is supposed to return NULL instead
>>> of to be
>>> set from the hvm_select_ioreq_server().
>>> To keep the same logic, the simplified code looks like this:
>>>
>>>        case X86EMUL_UNHANDLEABLE:
>>>        {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        struct hvm_ioreq_server *s = NULL;
>>> +        p2m_type_t p2mt = p2m_invalid;
>>> +
>>> +        if ( is_mmio && dir == IOREQ_WRITE )
>>> +        {
>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>> +
>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>> +
>>> +            if ( p2mt == p2m_ioreq_server )
>>> +            {
>>> +                unsigned int flags;
>>> +
>>> +                s = p2m_get_ioreq_server(currd, &flags);
>>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>> +                    s = NULL;
>>> +            }
>>> +        }
>>> +
>>> +        if ( !s && p2mt != p2m_ioreq_server )
>>> +            s = hvm_select_ioreq_server(currd, &p);
>>>
>>>            /* If there is no suitable backing DM, just ignore accesses */
>>>            if ( !s )
>>>
>>> As you can see, definition of p2mt is moved outside the if ( is_mmio )
>>> judgement,
>>> and is checked against p2m_ioreq_server before we search the ioreq
>>> server's rangeset
>>> in hvm_select_ioreq_server(). So I am not quite satisfied with this
>>> simplification.
>>> Any suggestions?
>> I think it's better than the code was before, but an implicit part of
>> my suggestion was that I'm not really convinced the
>> " && p2mt != p2m_ioreq_server" part of your new conditional is
>> really needed: Would it indeed be wrong to hand such a request
>> to the "normal" ioreq server, instead of terminating it right away?
>> (I guess that's a question to you as much as to Paul.)
>>
> Well, I thought the correct semantics for p2m_ioreq_server without any interception are the same as p2m_ram_rw. In which case if we find an ioreq server, but it does not want to emulate writes, then we should complete the I/O by writing to guest RAM. But, how are we ever going to hit this code-path without some form of race with EPT configuration?
>

Thanks Paul. I think the p2m type change race condition can be avoided in
hvm_hap_nested_page_fault(), by delaying the call to put_gfn() until after
the ioreq has been delivered.  :)
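
To make that concrete, a very rough sketch (not a patch) of the reordering I
have in mind; variable names are as in hvm_hap_nested_page_fault(), and
everything except the position of __put_gfn() is as in the current code:

    if ( npfec.write_access && p2mt == p2m_ioreq_server )
    {
        if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
            hvm_inject_hw_exception(TRAP_gp_fault, 0);

        __put_gfn(p2m, gfn);  /* only dropped once the ioreq is on its way */
        rc = 1;
        goto out;
    }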

Otherwise we would need to keep the previous mem_handler, and use
get_gfn_type() instead of get_gfn_query_unlocked() in hvmemul_do_io() -
because, without a lock, another race condition can happen when:
1> a p2m_ioreq_server gfn is changed to p2m_ram_rw;
2> we fetch this p2mt, with value p2m_ram_rw, via get_gfn_query_unlocked();
3> the p2m type of this gfn is changed back to p2m_ioreq_server;
4> mem_handler is called instead of delivering the ioreq to the device model.

However, the mem_handler may still be helpful for the read-modify-write case
I mentioned in another mail, so the hypervisor would not need to forward the
read ioreq to the device model.

B.R.
Yu




* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-10 10:43         ` Yu Zhang
@ 2016-08-11  8:47           ` Yu Zhang
  2016-08-11  8:58             ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-11  8:47 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima



On 8/10/2016 6:43 PM, Yu Zhang wrote:
>
>
> On 8/10/2016 6:33 PM, Jan Beulich wrote:
>>>>> On 10.08.16 at 10:09, <yu.c.zhang@linux.intel.com> wrote:
>>> On 8/8/2016 11:40 PM, Jan Beulich wrote:
>>>>>>> On 12.07.16 at 11:02, <yu.c.zhang@linux.intel.com> wrote:
>>>>> @@ -178,8 +179,34 @@ static int hvmemul_do_io(
>>>>>            break;
>>>>>        case X86EMUL_UNHANDLEABLE:
>>>>>        {
>>>>> -        struct hvm_ioreq_server *s =
>>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>>> +        struct hvm_ioreq_server *s;
>>>>> +
>>>>> +        if ( is_mmio )
>>>>> +        {
>>>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>>>> +            p2m_type_t p2mt;
>>>>> +
>>>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>>>> +
>>>>> +            if ( p2mt == p2m_ioreq_server )
>>>>> +            {
>>>>> +                unsigned int flags;
>>>>> +
>>>>> +                if ( dir != IOREQ_WRITE )
>>>>> +                    s = NULL;
>>>>> +                else
>>>>> +                {
>>>>> +                    s = p2m_get_ioreq_server(currd, &flags);
>>>>> +
>>>>> +                    if ( !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
>>>>> +                        s = NULL;
>>>>> +                }
>>>>> +            }
>>>>> +            else
>>>>> +                s = hvm_select_ioreq_server(currd, &p);
>>>>> +        }
>>>>> +        else
>>>>> +            s = hvm_select_ioreq_server(currd, &p);
>>>> Wouldn't it both be more natural and make the logic even easier
>>>> to follow if s got set to NULL up front, all the "else"-s dropped,
>>>> and a simple
>>>>
>>>>           if ( !s )
>>>>               s = hvm_select_ioreq_server(currd, &p);
>>>>
>>>> be done in the end?
>>>>
>>> Sorry, Jan. I tried to simplify above code, but found the new code is
>>> still not very
>>> clean,  because in some cases the s is supposed to return NULL instead
>>> of to be
>>> set from the hvm_select_ioreq_server().
>>> To keep the same logic, the simplified code looks like this:
>>>
>>>        case X86EMUL_UNHANDLEABLE:
>>>        {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        struct hvm_ioreq_server *s = NULL;
>>> +        p2m_type_t p2mt = p2m_invalid;
>>> +
>>> +        if ( is_mmio && dir == IOREQ_WRITE )
>>> +        {
>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>> +
>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>> +
>>> +            if ( p2mt == p2m_ioreq_server )
>>> +            {
>>> +                unsigned int flags;
>>> +
>>> +                s = p2m_get_ioreq_server(currd, &flags);
>>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>> +                    s = NULL;
>>> +            }
>>> +        }
>>> +
>>> +        if ( !s && p2mt != p2m_ioreq_server )
>>> +            s = hvm_select_ioreq_server(currd, &p);
>>>
>>>            /* If there is no suitable backing DM, just ignore 
>>> accesses */
>>>            if ( !s )
>>>
>>> As you can see, definition of p2mt is moved outside the if ( is_mmio )
>>> judgement,
>>> and is checked against p2m_ioreq_server before we search the ioreq
>>> server's rangeset
>>> in hvm_select_ioreq_server(). So I am not quite satisfied with this
>>> simplification.
>>> Any suggestions?
>> I think it's better than the code was before, but an implicit part of
>> my suggestion was that I'm not really convinced the
>> " && p2mt != p2m_ioreq_server" part of your new conditional is
>> really needed: Would it indeed be wrong to hand such a request
>> to the "normal" ioreq server, instead of terminating it right away?
>> (I guess that's a question to you as much as to Paul.)
>>
>
> Thanks for your reply, Jan.
> For " && p2mt != p2m_ioreq_server" condition, it is just to guarantee 
> that if a write
> operation is trapped, and at the same period, device model changed the 
> status of
> ioreq server, it should be discarded.

Hi Paul & Jan, any comments?


> A second thought is, I am now more worried about the " && dir == 
> IOREQ_WRITE"
> condition, which we used previously to set s to NULL if it is not a 
> write operation.
> However, if HVM uses a read-modify-write instruction to operate on a 
> write-protected
> address, it will be treated as both read and write accesses in 
> ept_handle_violation(). In
> such situation, we need to emulate the read access first(by just 
> returning the value being
> fetched either in hypervisor or in device model), instead of 
> discarding the read access.
>

Any suggestions on this guest read-modify-write instruction situation?
Is my description clear? :)

Thanks
Yu



* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-11  8:47           ` Yu Zhang
@ 2016-08-11  8:58             ` Jan Beulich
  2016-08-11  9:19               ` Yu Zhang
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2016-08-11  8:58 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

>>> On 11.08.16 at 10:47, <yu.c.zhang@linux.intel.com> wrote:
> On 8/10/2016 6:43 PM, Yu Zhang wrote:
>> For " && p2mt != p2m_ioreq_server" condition, it is just to guarantee 
>> that if a write
>> operation is trapped, and at the same period, device model changed the 
>> status of
>> ioreq server, it should be discarded.
> 
> Hi Paul & Jan, any comments?

Didn't Paul's "should behave like p2m_ram_rw" reply clarify things
sufficiently?

>> A second thought is, I am now more worried about the " && dir == 
>> IOREQ_WRITE"
>> condition, which we used previously to set s to NULL if it is not a 
>> write operation.
>> However, if HVM uses a read-modify-write instruction to operate on a 
>> write-protected
>> address, it will be treated as both read and write accesses in 
>> ept_handle_violation(). In
>> such situation, we need to emulate the read access first(by just 
>> returning the value being
>> fetched either in hypervisor or in device model), instead of 
>> discarding the read access.
> 
> Any suggestions about this guest read-modify-write instruction situation?
> Is my depiction clear? :)

Well, from your earlier reply I concluded that you'd just go ahead
and put this into patch form, which we'd then look at.

Jan



* Re: [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-08-11  8:58             ` Jan Beulich
@ 2016-08-11  9:19               ` Yu Zhang
  0 siblings, 0 replies; 32+ messages in thread
From: Yu Zhang @ 2016-08-11  9:19 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	zhiyuan.lv, Jun Nakajima



On 8/11/2016 4:58 PM, Jan Beulich wrote:
>>>> On 11.08.16 at 10:47, <yu.c.zhang@linux.intel.com> wrote:
>> On 8/10/2016 6:43 PM, Yu Zhang wrote:
>>> For " && p2mt != p2m_ioreq_server" condition, it is just to guarantee
>>> that if a write
>>> operation is trapped, and at the same period, device model changed the
>>> status of
>>> ioreq server, it should be discarded.
>> Hi Paul & Jan, any comments?
> Didn't Paul's "should behave like p2m_ram_rw" reply clarify things
> sufficiently?

Oh, I may have misunderstood. I thought he was talking about the p2m change
race condition. :)

So please allow me to summarize my next steps for this:
1> To prevent the p2m type change race condition, hvm_hap_nested_page_fault()
needs to be changed so that p2m_unlock() is only triggered after the write
operation has been handled;

2> If a gfn with p2m_ioreq_server is trapped, but the current ioreq server
has been unmapped, it will be treated as p2m_ram_rw;

3> If a gfn with p2m_ioreq_server is trapped, but the dir is IOREQ_READ, it
will be treated as a read-modify-write case.

>>> A second thought is, I am now more worried about the " && dir ==
>>> IOREQ_WRITE"
>>> condition, which we used previously to set s to NULL if it is not a
>>> write operation.
>>> However, if HVM uses a read-modify-write instruction to operate on a
>>> write-protected
>>> address, it will be treated as both read and write accesses in
>>> ept_handle_violation(). In
>>> such situation, we need to emulate the read access first(by just
>>> returning the value being
>>> fetched either in hypervisor or in device model), instead of
>>> discarding the read access.
>> Any suggestions about this guest read-modify-write instruction situation?
>> Is my depiction clear? :)
> Well, from your earlier reply I concluded that you'd just go ahead
> and put this into patch form, which we'd then look at.

OK, thanks. I have given a rough summary in 3> above.

I will have to take several days of annual leave starting this weekend due to
some family urgency, and will be back after Aug 23. I can hardly check the
mailing list during this period; sorry for the inconvenience.  :(

Yu


* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-07-12  9:02 ` [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
  2016-08-08 16:29   ` Jan Beulich
@ 2016-08-16 13:35   ` George Dunlap
  2016-08-16 13:54     ` Jan Beulich
  2016-08-30 12:17     ` Yu Zhang
  1 sibling, 2 replies; 32+ messages in thread
From: George Dunlap @ 2016-08-16 13:35 UTC (permalink / raw)
  To: Yu Zhang, xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Paul Durrant, zhiyuan.lv, Jan Beulich

On 12/07/16 10:02, Yu Zhang wrote:
> This patch resets p2m_ioreq_server entries back to p2m_ram_rw,
> after an ioreq server has unmapped. The resync is done both
> asynchronously with the current p2m_change_entry_type_global()
> interface, and synchronously by iterating the p2m table. The
> synchronous resetting is necessary because we need to guarantee
> the p2m table is clean before another ioreq server is mapped.
> And since the sweeping of p2m table could be time consuming,
> it is done with hypercall continuation. Asynchronous approach
> is also taken so that p2m_ioreq_server entries can also be reset
> when the HVM is scheduled between hypercall continuations.
> 
> This patch also disallows live migration, when there's still any
> outstanding p2m_ioreq_server entry left. The core reason is our
> current implementation of p2m_change_entry_type_global() can not
> tell the state of p2m_ioreq_server entries(can not decide if an
> entry is to be emulated or to be resynced).
> 
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

Thanks for doing this Yu Zhang!  A couple of comments.

> ---
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> ---
>  xen/arch/x86/hvm/hvm.c    | 52 ++++++++++++++++++++++++++++++++++++++++++++---
>  xen/arch/x86/mm/hap/hap.c |  9 ++++++++
>  xen/arch/x86/mm/p2m-ept.c |  6 +++++-
>  xen/arch/x86/mm/p2m-pt.c  | 10 +++++++--
>  xen/arch/x86/mm/p2m.c     |  3 +++
>  xen/include/asm-x86/p2m.h |  5 ++++-
>  6 files changed, 78 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 4d98cc6..e57c8b9 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -5485,6 +5485,7 @@ static int hvmop_set_mem_type(
>      {
>          unsigned long pfn = a.first_pfn + start_iter;
>          p2m_type_t t;
> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>  
>          get_gfn_unshare(d, pfn, &t);
>          if ( p2m_is_paging(t) )
> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>          if ( rc )
>              goto out;
>  
> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
> +            p2m->ioreq.entry_count++;
> +
> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
> +            p2m->ioreq.entry_count--;

Changing these here might make sense if they were *only* changed in the
hvm code; but as you also have to modify this value in the p2m code (in
resolve_misconfig), I think it makes sense to try to do all the counting
in the p2m code.  That would take care of any locking issues as well.

Logically the most sensible place to do it would be
atomic_write_ept_entry(); that would make it basically impossible to
miss a case where we change to or from p2m_ioreq_server.

On the other hand, it would mean adding code to a core path for all p2m
updates.
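
Something like the following hunk, say (just a sketch, not a patch -- it
assumes a struct p2m_domain * is plumbed into that function, which it isn't
today, and it ignores the superpage / foreign-mapping details):

    if ( level == 0 )
    {
        if ( new.sa_p2mt == p2m_ioreq_server &&
             entryptr->sa_p2mt != p2m_ioreq_server )
            p2m->ioreq.entry_count++;
        else if ( entryptr->sa_p2mt == p2m_ioreq_server &&
                  new.sa_p2mt != p2m_ioreq_server )
            p2m->ioreq.entry_count--;
    }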

> +
>          /* Check for continuation if it's not the last interation */
>          if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
>               hypercall_preempt_check() )
> @@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
>  }
>  
>  static int hvmop_map_mem_type_to_ioreq_server(
> -    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
> +    unsigned long *iter)
>  {
>      xen_hvm_map_mem_type_to_ioreq_server_t op;
>      struct domain *d;
>      int rc;
> +    unsigned long gfn = *iter;
>  
>      if ( copy_from_guest(&op, uop, 1) )
>          return -EFAULT;
> @@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
>      if ( rc != 0 )
>          goto out;
>  
> -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
> +    if ( gfn == 0 || op.flags != 0 )
> +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
> +
> +    /*
> +     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
> +     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
> +     */
> +    if ( op.flags == 0 && rc == 0 )
> +    {
> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        while ( gfn <= p2m->max_mapped_pfn )
> +        {
> +            p2m_type_t t;
> +
> +            if ( p2m->ioreq.entry_count == 0 )
> +                break;

Any reason not to make this part of the while() condition?

> +
> +            get_gfn_unshare(d, gfn, &t);

This will completely unshare all pages in a domain below the last
dangling p2m_ioreq_server page.  I don't think unsharing is necessary at
all; if it's shared, it certainly won't be of type p2m_ioreq_server.

Actually -- it seems like ept_get_entry() really should be calling
resolve_misconfig(), the same way that ept_set_entry() does.  In that
case, simply going through and reading all the p2m entries would suffice
to finish off the type change.

Although really, it seems like having a "p2m_finish_type_change()"
function which looked for misconfigured entries and reset them would be
a step closer to the right direction, in that it could be re-used in
other situations where the type change may not have finished.
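
To sketch the shape of what I mean (the name, arguments and lookup primitive
are all illustrative; in particular the lookup would need to see the old,
not-yet-recalculated type, which is part of what would need working out):

    void p2m_finish_type_change(struct domain *d,
                                unsigned long first_gfn,
                                unsigned long last_gfn,
                                p2m_type_t ot, p2m_type_t nt)
    {
        unsigned long gfn;
        p2m_type_t t;

        for ( gfn = first_gfn; gfn <= last_gfn; gfn++ )
        {
            get_gfn_query_unlocked(d, gfn, &t);

            /* p2m_change_type_one() re-checks the type under the gfn lock. */
            if ( t == ot )
                p2m_change_type_one(d, gfn, ot, nt);
        }
    }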

Jan / Yu Zhang, thoughts?

 -George



* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-16 13:35   ` George Dunlap
@ 2016-08-16 13:54     ` Jan Beulich
  2016-08-30 12:17     ` Yu Zhang
  1 sibling, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2016-08-16 13:54 UTC (permalink / raw)
  To: George Dunlap, Yu Zhang, xen-devel
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Paul Durrant,
	zhiyuan.lv, Jun Nakajima

>>> On 16.08.16 at 15:35, <george.dunlap@citrix.com> wrote:
> Although really, it seems like having a "p2m_finish_type_change()"
> function which looked for misconfigured entries and reset them would be
> a step closer to the right direction, in that it could be re-used in
> other situations where the type change may not have finished.

That's a good idea imo.

Jan



* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-16 13:35   ` George Dunlap
  2016-08-16 13:54     ` Jan Beulich
@ 2016-08-30 12:17     ` Yu Zhang
  2016-09-02 11:00       ` Yu Zhang
  1 sibling, 1 reply; 32+ messages in thread
From: Yu Zhang @ 2016-08-30 12:17 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Andrew Cooper,
	Paul Durrant, zhiyuan.lv, Jun Nakajima



On 8/16/2016 9:35 PM, George Dunlap wrote:
> On 12/07/16 10:02, Yu Zhang wrote:
>> This patch resets p2m_ioreq_server entries back to p2m_ram_rw,
>> after an ioreq server has unmapped. The resync is done both
>> asynchronously with the current p2m_change_entry_type_global()
>> interface, and synchronously by iterating the p2m table. The
>> synchronous resetting is necessary because we need to guarantee
>> the p2m table is clean before another ioreq server is mapped.
>> And since the sweeping of p2m table could be time consuming,
>> it is done with hypercall continuation. Asynchronous approach
>> is also taken so that p2m_ioreq_server entries can also be reset
>> when the HVM is scheduled between hypercall continuations.
>>
>> This patch also disallows live migration, when there's still any
>> outstanding p2m_ioreq_server entry left. The core reason is our
>> current implementation of p2m_change_entry_type_global() can not
>> tell the state of p2m_ioreq_server entries(can not decide if an
>> entry is to be emulated or to be resynced).
>>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Thanks for doing this Yu Zhang!  A couple of comments.

Thanks a lot for your advice, George.
And sorry for the delayed reply (I had been on annual leave).

>> ---
>> Cc: Paul Durrant <paul.durrant@citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> ---
>>   xen/arch/x86/hvm/hvm.c    | 52 ++++++++++++++++++++++++++++++++++++++++++++---
>>   xen/arch/x86/mm/hap/hap.c |  9 ++++++++
>>   xen/arch/x86/mm/p2m-ept.c |  6 +++++-
>>   xen/arch/x86/mm/p2m-pt.c  | 10 +++++++--
>>   xen/arch/x86/mm/p2m.c     |  3 +++
>>   xen/include/asm-x86/p2m.h |  5 ++++-
>>   6 files changed, 78 insertions(+), 7 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 4d98cc6..e57c8b9 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -5485,6 +5485,7 @@ static int hvmop_set_mem_type(
>>       {
>>           unsigned long pfn = a.first_pfn + start_iter;
>>           p2m_type_t t;
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>   
>>           get_gfn_unshare(d, pfn, &t);
>>           if ( p2m_is_paging(t) )
>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>           if ( rc )
>>               goto out;
>>   
>> +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
>> +            p2m->ioreq.entry_count++;
>> +
>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
>> +            p2m->ioreq.entry_count--;
> Changing these here might make sense if they were *only* changed in the
> hvm code; but as you also have to modify this value in the p2m code (in
> resolve_misconfig), I think it makes sense to try to do all the counting
> in the p2m code.  That would take care of any locking issues as well.
>
> Logically the most sensible place to do it would be
> atomic_write_ept_entry(); that would make it basically impossible to
> miss a case where we change to of from p2m_ioreq_server.
>
> On the other hand, it would mean adding code to a core path for all p2m
> updates.

Well, atomic_write_ept_entry() is a rather core path, and it is only used for
EPT. Although p2m_ioreq_server is not accepted in the shadow page table code,
I'd still like to support it on AMD SVM - which is also consistent with the
p2m_change_entry_type_global() functionality. :)

I am considering p2m_change_type_one(), which has gfn_lock/unlock() inside.
And since, so far, the p2m type changes to/from p2m_ioreq_server are only
triggered by the hvmop, which leads to p2m_change_type_one(), I believe
maintaining the entry_count there would make sense.

Besides, resolve_misconfig()/do_recalc() still need to decrease the
entry_count for p2m_ioreq_server, but since both routines have the p2m locked
by their caller, that is also fine.

Other places that only read entry_count can use
read_atomic(&p2m->ioreq.entry_count).
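
For instance, the hap.c side of this series could be as simple as (sketch
only):

    /*
     * Refuse to enable log-dirty mode (and hence live migration) while
     * any p2m_ioreq_server entries are still outstanding.
     */
    if ( read_atomic(&p2m->ioreq.entry_count) )
        return -EBUSY;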

Do you think this is acceptable?

>> +
>>           /* Check for continuation if it's not the last interation */
>>           if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
>>                hypercall_preempt_check() )
>> @@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
>>   }
>>   
>>   static int hvmop_map_mem_type_to_ioreq_server(
>> -    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
>> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
>> +    unsigned long *iter)
>>   {
>>       xen_hvm_map_mem_type_to_ioreq_server_t op;
>>       struct domain *d;
>>       int rc;
>> +    unsigned long gfn = *iter;
>>   
>>       if ( copy_from_guest(&op, uop, 1) )
>>           return -EFAULT;
>> @@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
>>       if ( rc != 0 )
>>           goto out;
>>   
>> -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
>> +    if ( gfn == 0 || op.flags != 0 )
>> +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
>> +
>> +    /*
>> +     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
>> +     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
>> +     */
>> +    if ( op.flags == 0 && rc == 0 )
>> +    {
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +        while ( gfn <= p2m->max_mapped_pfn )
>> +        {
>> +            p2m_type_t t;
>> +
>> +            if ( p2m->ioreq.entry_count == 0 )
>> +                break;
> Any reason not to make this part of the while() condition?
>
>> +
>> +            get_gfn_unshare(d, gfn, &t);
> This will completely unshare all pages in a domain below the last
> dangling p2m_ioreq_server page.  I don't think unsharing is necessary at
> all; if it's shared, it certainly won't be of type p2m_ioreq_server.
>
> Actually -- it seems like ept_get_entry() really should be calling
> resolve_misconfig(), the same way that ept_set_entry() does.  In that
> case, simply going through and reading all the p2m entries would suffice
> to finish off the type change.

Well, I have doubts about this - resolve_misconfig() is supposed to change
the p2m table, and should probably be protected by the p2m lock. But
ept_get_entry() may be triggered by routines like get_gfn_query_unlocked().

Besides, although calling resolve_misconfig() would speed up the p2m type
sweep, it also introduces extra cost each time we read the p2m table.

>
> Although really, it seems like having a "p2m_finish_type_change()"
> function which looked for misconfigured entries and reset them would be
> a step closer to the right direction, in that it could be re-used in
> other situations where the type change may not have finished.

Thanks. This sounds like a good idea. I'd like to define the interface of
this routine like this:
  void p2m_finish_type_change(struct domain *d,
                              unsigned long start, unsigned long end,
                              p2m_type_t ot, p2m_type_t nt)
so that it can be triggered by hvmop_map_mem_type_to_ioreq_server() using the
hypercall continuation mechanism. Do you think this is OK?
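
The caller side could then look roughly like this (illustrative only,
reusing the continuation pattern already in this patch; gfn is the *iter
value):

    while ( gfn <= p2m->max_mapped_pfn &&
            read_atomic(&p2m->ioreq.entry_count) != 0 )
    {
        unsigned long last = min(gfn + HVMOP_op_mask, p2m->max_mapped_pfn);

        p2m_finish_type_change(d, gfn, last, p2m_ioreq_server, p2m_ram_rw);
        gfn = last + 1;

        if ( gfn <= p2m->max_mapped_pfn && hypercall_preempt_check() )
        {
            rc = -ERESTART;   /* re-entered later with *iter = gfn */
            break;
        }
    }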

In fact, I have worked out a new version of this patch according to my
understanding, and if you have no more disagreements on the above issues,
I'll send out the new patch soon, after some testing in a XenGT environment. :)

Thanks
Yu


* Re: [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-08-30 12:17     ` Yu Zhang
@ 2016-09-02 11:00       ` Yu Zhang
  0 siblings, 0 replies; 32+ messages in thread
From: Yu Zhang @ 2016-09-02 11:00 UTC (permalink / raw)
  To: George Dunlap, xen-devel



On 8/30/2016 8:17 PM, Yu Zhang wrote:
>
>
> On 8/16/2016 9:35 PM, George Dunlap wrote:
>> On 12/07/16 10:02, Yu Zhang wrote:
>>> This patch resets p2m_ioreq_server entries back to p2m_ram_rw,
>>> after an ioreq server has unmapped. The resync is done both
>>> asynchronously with the current p2m_change_entry_type_global()
>>> interface, and synchronously by iterating the p2m table. The
>>> synchronous resetting is necessary because we need to guarantee
>>> the p2m table is clean before another ioreq server is mapped.
>>> And since the sweeping of p2m table could be time consuming,
>>> it is done with hypercall continuation. Asynchronous approach
>>> is also taken so that p2m_ioreq_server entries can also be reset
>>> when the HVM is scheduled between hypercall continuations.
>>>
>>> This patch also disallows live migration, when there's still any
>>> outstanding p2m_ioreq_server entry left. The core reason is our
>>> current implementation of p2m_change_entry_type_global() can not
>>> tell the state of p2m_ioreq_server entries(can not decide if an
>>> entry is to be emulated or to be resynced).
>>>
>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>> Thanks for doing this Yu Zhang!  A couple of comments.
>
> Thanks a lot for your advice, George.
> And sorry for the delayed reply(had been in annual leave previously).
>
>>> ---
>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>> ---
>>>   xen/arch/x86/hvm/hvm.c    | 52 
>>> ++++++++++++++++++++++++++++++++++++++++++++---
>>>   xen/arch/x86/mm/hap/hap.c |  9 ++++++++
>>>   xen/arch/x86/mm/p2m-ept.c |  6 +++++-
>>>   xen/arch/x86/mm/p2m-pt.c  | 10 +++++++--
>>>   xen/arch/x86/mm/p2m.c     |  3 +++
>>>   xen/include/asm-x86/p2m.h |  5 ++++-
>>>   6 files changed, 78 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>> index 4d98cc6..e57c8b9 100644
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -5485,6 +5485,7 @@ static int hvmop_set_mem_type(
>>>       {
>>>           unsigned long pfn = a.first_pfn + start_iter;
>>>           p2m_type_t t;
>>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>             get_gfn_unshare(d, pfn, &t);
>>>           if ( p2m_is_paging(t) )
>>> @@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
>>>           if ( rc )
>>>               goto out;
>>>   +        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == 
>>> p2m_ioreq_server )
>>> +            p2m->ioreq.entry_count++;
>>> +
>>> +        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == 
>>> p2m_ram_rw )
>>> +            p2m->ioreq.entry_count--;
>> Changing these here might make sense if they were *only* changed in the
>> hvm code; but as you also have to modify this value in the p2m code (in
>> resolve_misconfig), I think it makes sense to try to do all the counting
>> in the p2m code.  That would take care of any locking issues as well.
>>
>> Logically the most sensible place to do it would be
>> atomic_write_ept_entry(); that would make it basically impossible to
>> miss a case where we change to of from p2m_ioreq_server.
>>
>> On the other hand, it would mean adding code to a core path for all p2m
>> updates.
>
> Well, the atomic_write_ept_entry() is a rather core path, and is only 
> for ept. Although p2m_ioreq_server is
> not accepted in shadow page table code, I'd still like to support it 
> in AMD SVM - also consistent with the
> p2m_change_entry_type_global() functionality. :)
>
> I am considering p2m_change_type_one(), which has gfn_lock/unlock() 
> inside. And since previously the p2m
> type changes to/from p2m_ioreq_server are only triggered by hvmop 
> which leads to p2m_change_type_one(),
> I believe calculating the entry_count here would make sense.
>
> Besides, the resolve_misconfig()/do_recalc() still need to decrease 
> the entry_count for p2m_ioreq_server,
> but since both routines have p2m locked by their caller, so it's also OK.
>
> Other places that read entry_count can be protected by 
> read_atomic(&p2m->ioreq.entry_count).
>
> Do you think this acceptable?
>
>>> +
>>>           /* Check for continuation if it's not the last interation */
>>>           if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
>>>                hypercall_preempt_check() )
>>> @@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
>>>   }
>>>     static int hvmop_map_mem_type_to_ioreq_server(
>>> - XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
>>> + XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
>>> +    unsigned long *iter)
>>>   {
>>>       xen_hvm_map_mem_type_to_ioreq_server_t op;
>>>       struct domain *d;
>>>       int rc;
>>> +    unsigned long gfn = *iter;
>>>         if ( copy_from_guest(&op, uop, 1) )
>>>           return -EFAULT;
>>> @@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
>>>       if ( rc != 0 )
>>>           goto out;
>>>   -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, 
>>> op.flags);
>>> +    if ( gfn == 0 || op.flags != 0 )
>>> +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, 
>>> op.flags);
>>> +
>>> +    /*
>>> +     * Iterate p2m table when an ioreq server unmaps from 
>>> p2m_ioreq_server,
>>> +     * and reset the remaining p2m_ioreq_server entries back to 
>>> p2m_ram_rw.
>>> +     */
>>> +    if ( op.flags == 0 && rc == 0 )
>>> +    {
>>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> +        while ( gfn <= p2m->max_mapped_pfn )
>>> +        {
>>> +            p2m_type_t t;
>>> +
>>> +            if ( p2m->ioreq.entry_count == 0 )
>>> +                break;
>> Any reason not to make this part of the while() condition?
>>
>>> +
>>> +            get_gfn_unshare(d, gfn, &t);
>> This will completely unshare all pages in a domain below the last
>> dangling p2m_ioreq_server page.  I don't think unsharing is necessary at
>> all; if it's shared, it certainly won't be of type p2m_ioreq_server.
>>
>> Actually -- it seems like ept_get_entry() really should be calling
>> resolve_misconfig(), the same way that ept_set_entry() does.  In that
>> case, simply going through and reading all the p2m entries would suffice
>> to finish off the type change.
>
> Well, I have doubts about this - resolve_misconfig() is supposed to 
> change p2m table,
> and should probably be protected by p2m lock. But ept_get_entry() may 
> be triggered
> by some routine like get_gfn_query_unlocked().
>
> Besides, although calling resolve_misconfig() will speed up the p2m 
> type sweeping,
> it also introduces more cost each time we peek the p2m table.
>
>>
>> Although really, it seems like having a "p2m_finish_type_change()"
>> function which looked for misconfigured entries and reset them would be
>> a step closer to the right direction, in that it could be re-used in
>> other situations where the type change may not have finished.
>
> Thanks. This sounds a good idea. And I'd like to define the interface 
> of this routine
> like this:
>  void p2m_finish_type_change(struct domain *d,
>                            unsigned long start, unsigned long end,
>                            p2m_type_t ot, p2m_type_t nt)
> So it can be triggered by hvmop_map_mem_type_to_ioreq_server() with 
> hypercall continuation
> method. Do you think this OK?
>
> In fact, I have worked out a new version of this patch according to my 
> understanding, and if you
> have no more disagreement on the above issues, I'll send out the new 
> patch soon after some test
> in XenGT environment. :)
>

Hi George, I have just sent out the v6 patch set; you can have a look at that
version directly when you have time. Thanks! :)

Yu



end of thread

Thread overview: 32+ messages
2016-07-12  9:02 [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-07-12  9:02 ` [PATCH v5 1/4] x86/ioreq server: Rename p2m_mmio_write_dm to p2m_ioreq_server Yu Zhang
2016-07-12  9:02 ` [PATCH v5 2/4] x86/ioreq server: Add new functions to get/set memory types Yu Zhang
2016-07-12  9:02 ` [PATCH v5 3/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2016-08-08 15:40   ` Jan Beulich
2016-08-09  7:39     ` Yu Zhang
2016-08-09  8:11       ` Jan Beulich
2016-08-09  8:20         ` Paul Durrant
2016-08-09  8:51           ` Yu Zhang
2016-08-09  9:07             ` Paul Durrant
2016-08-10  8:09     ` Yu Zhang
2016-08-10 10:33       ` Jan Beulich
2016-08-10 10:43         ` Paul Durrant
2016-08-10 12:32           ` Yu Zhang
2016-08-10 10:43         ` Yu Zhang
2016-08-11  8:47           ` Yu Zhang
2016-08-11  8:58             ` Jan Beulich
2016-08-11  9:19               ` Yu Zhang
2016-07-12  9:02 ` [PATCH v5 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
2016-08-08 16:29   ` Jan Beulich
2016-08-09  7:39     ` Yu Zhang
2016-08-09  8:13       ` Jan Beulich
2016-08-09  9:25         ` Yu Zhang
2016-08-09  9:45           ` Jan Beulich
2016-08-09 10:21             ` Yu Zhang
2016-08-16 13:35   ` George Dunlap
2016-08-16 13:54     ` Jan Beulich
2016-08-30 12:17     ` Yu Zhang
2016-09-02 11:00       ` Yu Zhang
2016-08-05  2:44 ` [PATCH v5 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-08-05  6:16   ` Jan Beulich
2016-08-05 12:46   ` George Dunlap
