* [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type.
@ 2016-09-02 10:47 Yu Zhang
  2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
                   ` (4 more replies)
  0 siblings, 5 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-02 10:47 UTC (permalink / raw)
  To: xen-devel; +Cc: zhiyuan.lv

XenGT leverages the ioreq server to track and forward accesses to GPU
I/O resources, e.g. the PPGTT (per-process graphics translation tables).
Currently, an ioreq server uses rangesets to track the BDF/PIO/MMIO ranges
to be emulated. To select an ioreq server, the rangesets are searched to
see whether the I/O range is recorded. However, the number of RAM pages
to be tracked may exceed the upper limit of a rangeset.

Previously, one solution was proposed to refactor the rangeset and
extend its upper limit. However, after 12 rounds of discussion, we
decided to drop that approach due to security concerns. This new
patch series instead introduces a new memory type, HVMMEM_ioreq_server,
and adds an HVM operation to let an ioreq server claim ownership of ram
pages with this type. Accesses to a page of this type will be handled
by the specified ioreq server directly.
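
For illustration, a device model could drive the new interface roughly as
in the sketch below. This is only a sketch: xen_hvm_op() stands in for
whatever HVMOP hypercall wrapper the toolstack provides (no such helper is
part of this series); the structure, constants and HVMOP_set_mem_type usage
are taken from the public interface added in patch 1.

/*
 * Sketch: claim the p2m_ioreq_server type for an ioreq server and tag one
 * guest page with it.  Uses the public interface from
 * xen/include/public/hvm/hvm_op.h; xen_hvm_op() is a hypothetical
 * hypercall wrapper, not part of this series.
 */
static int claim_and_tag_page(domid_t domid, ioservid_t id, uint64_t gfn)
{
    xen_hvm_map_mem_type_to_ioreq_server_t map = {
        .domid = domid,
        .id    = id,                                 /* ioreq server id */
        .type  = HVMMEM_ioreq_server,
        .flags = XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE,   /* forward writes only */
    };
    xen_hvm_set_mem_type_t set = {
        .domid       = domid,
        .hvmmem_type = HVMMEM_ioreq_server,
        .nr          = 1,
        .first_pfn   = gfn,
    };
    int rc = xen_hvm_op(HVMOP_map_mem_type_to_ioreq_server, &map);

    if ( rc )
        return rc;

    /* Only HVMMEM_ram_rw pages may be changed to HVMMEM_ioreq_server. */
    return xen_hvm_op(HVMOP_set_mem_type, &set);
}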

Yu Zhang (4):
  x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
    an ioreq server.
  x86/ioreq server: Release the p2m lock after mmio is handled.
  x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server
    pages.
  x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an
    ioreq server unmaps.

 xen/arch/x86/hvm/emulate.c       |  71 ++++++++++++++++++++--
 xen/arch/x86/hvm/hvm.c           | 104 ++++++++++++++++++++++++++++++---
 xen/arch/x86/hvm/ioreq.c         |  39 +++++++++++++
 xen/arch/x86/mm/hap/hap.c        |   9 +++
 xen/arch/x86/mm/hap/nested_hap.c |   2 +-
 xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
 xen/arch/x86/mm/p2m-pt.c         |  30 +++++++---
 xen/arch/x86/mm/p2m.c            | 123 +++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/shadow/multi.c   |   3 +-
 xen/include/asm-x86/hvm/ioreq.h  |   2 +
 xen/include/asm-x86/p2m.h        |  35 +++++++++--
 xen/include/public/hvm/hvm_op.h  |  35 ++++++++++-
 12 files changed, 437 insertions(+), 30 deletions(-)

-- 
1.9.1



* [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
@ 2016-09-02 10:47 ` Yu Zhang
  2016-09-05 13:31   ` Jan Beulich
                     ` (2 more replies)
  2016-09-02 10:47 ` [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-02 10:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Tim Deegan, Paul Durrant, zhiyuan.lv, Jan Beulich

A new HVMOP, HVMOP_map_mem_type_to_ioreq_server, is added to
let an ioreq server claim/disclaim its responsibility for the
handling of guest pages with the p2m type p2m_ioreq_server. Users
of this HVMOP can specify which kind of operation is to be
emulated in a parameter named flags. Currently, this HVMOP
only supports the emulation of write operations, and it can be
extended later to support the emulation of read operations should
an ioreq server require it.

For now, we only support one ioreq server for this p2m type, so
once an ioreq server has claimed ownership, subsequent calls to
HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
disclaim ownership of guest ram pages with p2m_ioreq_server by
issuing this HVMOP with the ioreq server id set to the current
owner's and the flags parameter set to 0.
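
For instance, the current owner could disclaim again with something like
the following sketch (xen_hvm_op() is a placeholder for the toolstack's
HVMOP hypercall wrapper, not part of this series):

/* Sketch: disclaim ownership of the p2m_ioreq_server type. */
static int disclaim_ioreq_mem_type(domid_t domid, ioservid_t id)
{
    xen_hvm_map_mem_type_to_ioreq_server_t unmap = {
        .domid = domid,
        .id    = id,                  /* must be the current owner's id */
        .type  = HVMMEM_ioreq_server,
        .flags = 0,                   /* flags == 0 means disclaim */
    };

    return xen_hvm_op(HVMOP_map_mem_type_to_ioreq_server, &unmap);
}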

Note that both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
are only supported for HVM guests with HAP enabled.

Also note that p2m type changes to p2m_ioreq_server are only allowed
after an ioreq server has claimed ownership of the type.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Tim Deegan <tim@xen.org>

changes in v6: 
  - Clarify logic in hvmemul_do_io().
  - Use recursive lock for ioreq server lock.
  - Remove debug print when mapping ioreq server.
  - Clarify code in ept_p2m_type_to_flags() for consistency.
  - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
  - Add comments for HVMMEM_ioreq_server to note only changes
    to/from HVMMEM_ram_rw are permitted.
  - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
    to avoid the race condition where a VM exit happens on a write-
    protected page, only to find the ioreq server has already been
    unmapped.
  - Introduce a separate patch to delay the release of the p2m
    lock to avoid the race condition.
  - Introduce a separate patch to handle read-modify-write
    operations on a write-protected page.

changes in v5: 
  - Simplify logic in hvmemul_do_io().
  - Use natural width types instead of fixed width types when possible.
  - Do not grant executable permission for p2m_ioreq_server entries.
  - Clarify comments and commit message.
  - Introduce a separate patch to recalculate the p2m types after
    the ioreq server unmaps from p2m_ioreq_server.

changes in v4:
  - According to Paul's advice, add comments around the definition
    of HVMMEM_ioreq_server in hvm_op.h.
  - According to Wei Liu's comments, change the format of the commit
    message.

changes in v3:
  - Only support write emulation in this patch;
  - Remove the code to handle race condition in hvmemul_do_io(),
  - No need to reset the p2m type after an ioreq server has disclaimed
    its ownership of p2m_ioreq_server;
  - Only allow p2m type change to p2m_ioreq_server after an ioreq
    server has claimed its ownership of p2m_ioreq_server;
  - Only allow p2m type change to p2m_ioreq_server from pages with type
    p2m_ram_rw, and vice versa;
  - HVMOP_map_mem_type_to_ioreq_server interface change - use uint16,
    instead of enum to specify the memory type;
  - Function prototype change to p2m_get_ioreq_server();
  - Coding style changes;
  - Commit message changes;
  - Add Tim's Acked-by.

changes in v2:
  - Only support HAP enabled HVMs;
  - Replace p2m_mem_type_changed() with p2m_change_entry_type_global()
    to reset the p2m type, when an ioreq server tries to claim/disclaim
    its ownership of p2m_ioreq_server;
  - Comments changes.
---
 xen/arch/x86/hvm/emulate.c       | 26 ++++++++++++--
 xen/arch/x86/hvm/hvm.c           | 64 ++++++++++++++++++++++++++++++++--
 xen/arch/x86/hvm/ioreq.c         | 39 +++++++++++++++++++++
 xen/arch/x86/mm/hap/nested_hap.c |  2 +-
 xen/arch/x86/mm/p2m-ept.c        |  8 ++++-
 xen/arch/x86/mm/p2m-pt.c         | 20 +++++++----
 xen/arch/x86/mm/p2m.c            | 74 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/shadow/multi.c   |  3 +-
 xen/include/asm-x86/hvm/ioreq.h  |  2 ++
 xen/include/asm-x86/p2m.h        | 26 ++++++++++++--
 xen/include/public/hvm/hvm_op.h  | 35 ++++++++++++++++++-
 11 files changed, 280 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c55ad7b..a33346e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -100,6 +100,7 @@ static int hvmemul_do_io(
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     ioreq_t p = {
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
@@ -141,7 +142,7 @@ static int hvmemul_do_io(
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
-            domain_crash(curr->domain);
+            domain_crash(currd);
 
         if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
@@ -178,8 +179,27 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s = NULL;
+        p2m_type_t p2mt = p2m_invalid;
+
+        if ( is_mmio )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
+            {
+                unsigned int flags;
+
+                s = p2m_get_ioreq_server(currd, &flags);
+                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+                    s = NULL;
+            }
+        }
+
+        if ( !s && p2mt != p2m_ioreq_server )
+            s = hvm_select_ioreq_server(currd, &p);
 
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0180f26..e969735 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5408,9 +5408,14 @@ static int hvmop_get_mem_type(
 
 static bool_t hvm_allow_p2m_type_change(p2m_type_t old, p2m_type_t new)
 {
+    if ( new == p2m_ioreq_server )
+        return old == p2m_ram_rw;
+
+    if ( old == p2m_ioreq_server )
+        return new == p2m_ram_rw;
+
     if ( p2m_is_ram(old) ||
-         (p2m_is_hole(old) && new == p2m_mmio_dm) ||
-         (old == p2m_ioreq_server && new == p2m_ram_rw) )
+         (p2m_is_hole(old) && new == p2m_mmio_dm) )
         return 1;
 
     return 0;
@@ -5445,6 +5450,19 @@ static int hvmop_set_mem_type(
     if ( !is_hvm_domain(d) )
         goto out;
 
+    if ( a.hvmmem_type == HVMMEM_ioreq_server )
+    {
+        unsigned int flags;
+
+        /* HVMMEM_ioreq_server is only supported for HAP enabled hvm. */
+        if ( !hap_enabled(d) )
+            goto out;
+
+        /* Do not change to HVMMEM_ioreq_server if no ioreq server mapped. */
+        if ( !p2m_get_ioreq_server(d, &flags) )
+            goto out;
+    }
+
     rc = xsm_hvm_control(XSM_DM_PRIV, d, HVMOP_set_mem_type);
     if ( rc )
         goto out;
@@ -5507,6 +5525,43 @@ static int hvmop_set_mem_type(
     return rc;
 }
 
+static int hvmop_map_mem_type_to_ioreq_server(
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+{
+    xen_hvm_map_mem_type_to_ioreq_server_t op;
+    struct domain *d;
+    int rc;
+
+    if ( copy_from_guest(&op, uop, 1) )
+        return -EFAULT;
+
+    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
+    if ( rc != 0 )
+        return rc;
+
+    rc = -EINVAL;
+    if ( !is_hvm_domain(d) )
+        goto out;
+
+    if ( op.pad != 0 )
+        goto out;
+
+    /* Only support for HAP enabled hvm. */
+    if ( !hap_enabled(d) )
+        goto out;
+
+    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d,
+                              HVMOP_map_mem_type_to_ioreq_server);
+    if ( rc != 0 )
+        goto out;
+
+    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+ out:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
 long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     unsigned long start_iter, mask;
@@ -5546,6 +5601,11 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             guest_handle_cast(arg, xen_hvm_io_range_t));
         break;
 
+    case HVMOP_map_mem_type_to_ioreq_server:
+        rc = hvmop_map_mem_type_to_ioreq_server(
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+        break;
+
     case HVMOP_set_ioreq_server_state:
         rc = hvmop_set_ioreq_server_state(
             guest_handle_cast(arg, xen_hvm_set_ioreq_server_state_t));
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index d2245e2..ac84563 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -753,6 +753,8 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
         domain_pause(d);
 
+        p2m_destroy_ioreq_server(d, s);
+
         hvm_ioreq_server_disable(s, 0);
 
         list_del(&s->list_entry);
@@ -914,6 +916,43 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    /* For now, only HVMMEM_ioreq_server is supported. */
+    if ( type != HVMMEM_ioreq_server )
+        return -EINVAL;
+
+    /* For now, only write emulation is supported. */
+    if ( flags & ~(XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+        return -EINVAL;
+
+    domain_pause(d);
+    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
+
+    rc = -ENOENT;
+    list_for_each_entry ( s,
+                          &d->arch.hvm_domain.ioreq_server.list,
+                          list_entry )
+    {
+        if ( s == d->arch.hvm_domain.default_ioreq_server )
+            continue;
+
+        if ( s->id == id )
+        {
+            rc = p2m_set_ioreq_server(d, flags, s);
+            break;
+        }
+    }
+
+    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
+    domain_unpause(d);
+    return rc;
+}
+
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                bool_t enabled)
 {
diff --git a/xen/arch/x86/mm/hap/nested_hap.c b/xen/arch/x86/mm/hap/nested_hap.c
index d41bb09..aa90a62 100644
--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -174,7 +174,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
     if ( *p2mt == p2m_mmio_direct )
         goto direct_mmio_out;
     rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
         goto out;
 
     rc = NESTEDHVM_PAGEFAULT_L0_ERROR;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 13cab24..700420a 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -132,6 +132,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->r = entry->w = entry->x = 1;
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
+        case p2m_ioreq_server:
+            entry->r = 1;
+            entry->w = !(p2m->ioreq.flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE);
+            entry->x = 0;
+            entry->a = !!cpu_has_vmx_ept_ad;
+            entry->d = entry->w && entry->a;
+            break;
         case p2m_mmio_direct:
             entry->r = entry->x = 1;
             entry->w = !rangeset_contains_singleton(mmio_ro_ranges,
@@ -171,7 +178,6 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
             entry->a = entry->d = !!cpu_has_vmx_ept_ad;
             break;
         case p2m_grant_map_ro:
-        case p2m_ioreq_server:
             entry->r = 1;
             entry->w = entry->x = 0;
             entry->a = !!cpu_has_vmx_ept_ad;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3b025d5..46a56fa 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -72,7 +72,9 @@ static const unsigned long pgt[] = {
     PGT_l3_page_table
 };
 
-static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
+static unsigned long p2m_type_to_flags(const struct p2m_domain *p2m,
+                                       p2m_type_t t,
+                                       mfn_t mfn,
                                        unsigned int level)
 {
     unsigned long flags;
@@ -94,8 +96,13 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
     default:
         return flags | _PAGE_NX_BIT;
     case p2m_grant_map_ro:
-    case p2m_ioreq_server:
         return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
+    case p2m_ioreq_server:
+        flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
+        if ( p2m->ioreq.flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE )
+            return flags & ~_PAGE_RW;
+        else
+            return flags;
     case p2m_ram_ro:
     case p2m_ram_logdirty:
     case p2m_ram_shared:
@@ -442,7 +449,8 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
                               ? p2m_ram_logdirty : p2m_ram_rw;
             unsigned long mfn = l1e_get_pfn(e);
-            unsigned long flags = p2m_type_to_flags(p2mt, _mfn(mfn), level);
+            unsigned long flags = p2m_type_to_flags(p2m, p2mt,
+                                                    _mfn(mfn), level);
 
             if ( level )
             {
@@ -579,7 +587,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
         l3e_content = mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt)
             ? l3e_from_pfn(mfn_x(mfn),
-                           p2m_type_to_flags(p2mt, mfn, 2) | _PAGE_PSE)
+                           p2m_type_to_flags(p2m, p2mt, mfn, 2) | _PAGE_PSE)
             : l3e_empty();
         entry_content.l1 = l3e_content.l3;
 
@@ -615,7 +623,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 
         if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
             entry_content = p2m_l1e_from_pfn(mfn_x(mfn),
-                                             p2m_type_to_flags(p2mt, mfn, 0));
+                                         p2m_type_to_flags(p2m, p2mt, mfn, 0));
         else
             entry_content = l1e_empty();
 
@@ -651,7 +659,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
         if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
             l2e_content = l2e_from_pfn(mfn_x(mfn),
-                                       p2m_type_to_flags(p2mt, mfn, 1) |
+                                       p2m_type_to_flags(p2m, p2mt, mfn, 1) |
                                        _PAGE_PSE);
         else
             l2e_content = l2e_empty();
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 812dbf6..6e4cb1f 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -83,6 +83,8 @@ static int p2m_initialise(struct domain *d, struct p2m_domain *p2m)
     else
         p2m_pt_init(p2m);
 
+    spin_lock_init(&p2m->ioreq.lock);
+
     return ret;
 }
 
@@ -289,6 +291,78 @@ void p2m_memory_type_changed(struct domain *d)
     }
 }
 
+int p2m_set_ioreq_server(struct domain *d,
+                         unsigned int flags,
+                         struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
+
+    /*
+     * Use lock to prevent concurrent setting requirements
+     * from multiple ioreq servers.
+     */
+    spin_lock(&p2m->ioreq.lock);
+
+    /* Unmap ioreq server from p2m type by passing flags with 0. */
+    if ( flags == 0 )
+    {
+        rc = -EINVAL;
+        if ( p2m->ioreq.server != s )
+            goto out;
+
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+    }
+    else
+    {
+        rc = -EBUSY;
+        if ( p2m->ioreq.server != NULL )
+            goto out;
+
+        p2m->ioreq.server = s;
+        p2m->ioreq.flags = flags;
+    }
+
+    rc = 0;
+
+ out:
+    spin_unlock(&p2m->ioreq.lock);
+
+    return rc;
+}
+
+struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                              unsigned int *flags)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    struct hvm_ioreq_server *s;
+
+    spin_lock(&p2m->ioreq.lock);
+
+    s = p2m->ioreq.server;
+    *flags = p2m->ioreq.flags;
+
+    spin_unlock(&p2m->ioreq.lock);
+    return s;
+}
+
+void p2m_destroy_ioreq_server(const struct domain *d,
+                              const struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    spin_lock(&p2m->ioreq.lock);
+
+    if ( p2m->ioreq.server == s )
+    {
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+    }
+
+    spin_unlock(&p2m->ioreq.lock);
+}
+
 void p2m_enable_hardware_log_dirty(struct domain *d)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 833f279..4deab39 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3225,8 +3225,7 @@ static int sh_page_fault(struct vcpu *v,
     }
 
     /* Need to hand off device-model MMIO to the device model */
-    if ( p2mt == p2m_mmio_dm
-         || (p2mt == p2m_ioreq_server && ft == ft_demand_write) )
+    if ( p2mt == p2m_mmio_dm )
     {
         gpa = guest_walk_to_gpa(&gw);
         goto mmio;
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index fbf2c74..b43667a 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -37,6 +37,8 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
 int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
                                          uint32_t type, uint64_t start,
                                          uint64_t end);
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags);
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                bool_t enabled);
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index d5fd546..4924c4b 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -89,7 +89,8 @@ typedef unsigned int p2m_query_t;
                        | p2m_to_mask(p2m_ram_paging_out)      \
                        | p2m_to_mask(p2m_ram_paged)           \
                        | p2m_to_mask(p2m_ram_paging_in)       \
-                       | p2m_to_mask(p2m_ram_shared))
+                       | p2m_to_mask(p2m_ram_shared)          \
+                       | p2m_to_mask(p2m_ioreq_server))
 
 /* Types that represent a physmap hole that is ok to replace with a shared
  * entry */
@@ -111,8 +112,7 @@ typedef unsigned int p2m_query_t;
 #define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty)     \
                       | p2m_to_mask(p2m_ram_ro)         \
                       | p2m_to_mask(p2m_grant_map_ro)   \
-                      | p2m_to_mask(p2m_ram_shared)     \
-                      | p2m_to_mask(p2m_ioreq_server))
+                      | p2m_to_mask(p2m_ram_shared))
 
 /* Write-discard types, which should discard the write operations */
 #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro)     \
@@ -336,6 +336,20 @@ struct p2m_domain {
         struct ept_data ept;
         /* NPT-equivalent structure could be added here. */
     };
+
+    struct {
+        spinlock_t lock;
+        /*
+         * ioreq server who's responsible for the emulation of
+         * gfns with specific p2m type(for now, p2m_ioreq_server).
+         */
+        struct hvm_ioreq_server *server;
+        /*
+         * flags specifies whether read, write or both operations
+         * are to be emulated by an ioreq server.
+         */
+        unsigned int flags;
+    } ioreq;
 };
 
 /* get host p2m table */
@@ -845,6 +859,12 @@ static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
     return flags;
 }
 
+int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
+                         struct hvm_ioreq_server *s);
+struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                              unsigned int *flags);
+void p2m_destroy_ioreq_server(const struct domain *d, const struct hvm_ioreq_server *s);
+
 #endif /* _XEN_P2M_H */
 
 /*
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index b3e45cf..4c19fc5 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -89,7 +89,13 @@ typedef enum {
     HVMMEM_unused,             /* Placeholder; setting memory to this type
                                   will fail for code after 4.7.0 */
 #endif
-    HVMMEM_ioreq_server
+    HVMMEM_ioreq_server        /* Memory type claimed by an ioreq server; type
+                                  changes to this value are only allowed after
+                                  an ioreq server has claimed its ownership.
+                                  Only pages with HVMMEM_ram_rw are allowed to
+                                  change to this type; conversely, pages with
+                                  this type are only allowed to be changed back
+                                  to HVMMEM_ram_rw. */
 } hvmmem_type_t;
 
 /* Following tools-only interfaces may change in future. */
@@ -383,6 +389,33 @@ struct xen_hvm_set_ioreq_server_state {
 typedef struct xen_hvm_set_ioreq_server_state xen_hvm_set_ioreq_server_state_t;
 DEFINE_XEN_GUEST_HANDLE(xen_hvm_set_ioreq_server_state_t);
 
+/*
+ * HVMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
+ *                                      to specific memory type <type>
+ *                                      for specific accesses <flags>
+ *
+ * For now, flags only accept the value of HVMOP_IOREQ_MEM_ACCESS_WRITE,
+ * which means only write operations are to be forwarded to an ioreq server.
+ * Support for the emulation of read operations can be added when an ioreq
+ * server has such requirement in future.
+ */
+#define HVMOP_map_mem_type_to_ioreq_server 26
+struct xen_hvm_map_mem_type_to_ioreq_server {
+    domid_t domid;      /* IN - domain to be serviced */
+    ioservid_t id;      /* IN - ioreq server id */
+    uint16_t type;      /* IN - memory type */
+    uint16_t pad;
+    uint32_t flags;     /* IN - types of accesses to be forwarded to the
+                           ioreq server. flags with 0 means to unmap the
+                           ioreq server */
+
+#define XEN_HVMOP_IOREQ_MEM_ACCESS_READ (1u << 0)
+#define XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE (1u << 1)
+};
+typedef struct xen_hvm_map_mem_type_to_ioreq_server
+    xen_hvm_map_mem_type_to_ioreq_server_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_map_mem_type_to_ioreq_server_t);
+
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
 #if defined(__i386__) || defined(__x86_64__)
-- 
1.9.1



* [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled.
  2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
@ 2016-09-02 10:47 ` Yu Zhang
  2016-09-05 13:49   ` Jan Beulich
       [not found]   ` <57D24782.6010701@linux.intel.com>
  2016-09-02 10:47 ` [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-02 10:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, zhiyuan.lv, Jan Beulich

Routine hvmemul_do_io() may need to peek at the p2m type of a gfn to
select the ioreq server. For example, operations on gfns with the
p2m_ioreq_server type will be delivered to the corresponding ioreq
server, and this requires that the p2m type not be switched back
to p2m_ram_rw during the emulation process. To avoid this race
condition, we delay the release of the p2m lock in
hvm_hap_nested_page_fault() until the MMIO has been handled.

Note: previously in hvm_hap_nested_page_fault(), put_gfn() was moved
before the handling of MMIO due to a deadlock risk between the p2m
lock and the event lock (in commit 77b8dfe). Later, a per-event-channel
lock was introduced in commit de6acb7 to send events, so we no longer
need to worry about that deadlock.
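
With this change, the relevant part of hvm_hap_nested_page_fault() behaves
roughly as sketched below (simplified; out_put_gfn is the existing label
after which __put_gfn() is called):

    if ( (p2mt == p2m_mmio_dm) ||
         (npfec.write_access &&
          (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
    {
        rc = 0;
        if ( unlikely(is_pvh_domain(currd)) )
            goto out_put_gfn;

        if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
            hvm_inject_hw_exception(TRAP_gp_fault, 0);
        rc = 1;
        goto out_put_gfn;  /* gfn reference dropped only after MMIO handling */
    }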

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e969735..9b419d2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1865,18 +1865,14 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
     {
-        __put_gfn(p2m, gfn);
-        if ( ap2m_active )
-            __put_gfn(hostp2m, gfn);
-
         rc = 0;
         if ( unlikely(is_pvh_domain(currd)) )
-            goto out;
+            goto out_put_gfn;
 
         if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
             hvm_inject_hw_exception(TRAP_gp_fault, 0);
         rc = 1;
-        goto out;
+        goto out_put_gfn;
     }
 
     /* Check if the page has been paged out */
-- 
1.9.1



* [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages.
  2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
  2016-09-02 10:47 ` [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
@ 2016-09-02 10:47 ` Yu Zhang
  2016-09-05 14:10   ` Jan Beulich
       [not found]   ` <57D247F6.9010503@linux.intel.com>
  2016-09-02 10:47 ` [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
  2016-09-06 10:57 ` [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  4 siblings, 2 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-02 10:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, zhiyuan.lv, Jan Beulich

In ept_handle_violation(), write violations are also treated as
read violations. So when a VM accesses a write-protected address
with a read-modify-write instruction, the read emulation process
is triggered first.

For p2m_ioreq_server pages, the ioreq server currently only forwards
write operations to the device model. Therefore, when such a page is
accessed by a read-modify-write instruction, the read operation should
be emulated here in the hypervisor. This patch provides such a handler
to copy the data to the buffer.

Note: MMIO ranges with the p2m_mmio_dm type do not need such special
treatment because both reads and writes go to the device model.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index a33346e..f92de48 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -95,6 +95,41 @@ static const struct hvm_io_handler null_handler = {
     .ops = &null_ops
 };
 
+static int mem_read(const struct hvm_io_handler *io_handler,
+                    uint64_t addr,
+                    uint32_t size,
+                    uint64_t *data)
+{
+    struct domain *currd = current->domain;
+    unsigned long gmfn = paddr_to_pfn(addr);
+    unsigned long offset = addr & ~PAGE_MASK;
+    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
+    uint8_t *p;
+
+    ASSERT(offset + size < PAGE_SIZE);
+
+    if ( !page )
+        return X86EMUL_UNHANDLEABLE;
+
+    p = __map_domain_page(page);
+    p += offset;
+    memcpy(data, p, size);
+
+    unmap_domain_page(p);
+    put_page(page);
+
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops mem_ops = {
+    .read = mem_read,
+    .write = null_write
+};
+
+static const struct hvm_io_handler mem_handler = {
+    .ops = &mem_ops
+};
+
 static int hvmemul_do_io(
     bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
@@ -204,7 +239,15 @@ static int hvmemul_do_io(
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            rc = hvm_process_io_intercept(&null_handler, &p);
+            /*
+             * For p2m_ioreq_server pages accessed with read-modify-write
+             * instructions, we provide a read handler to copy the data to
+             * the buffer.
+             */
+            if ( p2mt == p2m_ioreq_server )
+                rc = hvm_process_io_intercept(&mem_handler, &p);
+            else
+                rc = hvm_process_io_intercept(&null_handler, &p);
             vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
-- 
1.9.1



* [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
                   ` (2 preceding siblings ...)
  2016-09-02 10:47 ` [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
@ 2016-09-02 10:47 ` Yu Zhang
  2016-09-05 14:47   ` Jan Beulich
       [not found]   ` <57D24813.2090903@linux.intel.com>
  2016-09-06 10:57 ` [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
  4 siblings, 2 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-02 10:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Paul Durrant, zhiyuan.lv, Jan Beulich

This patch resets p2m_ioreq_server entries back to p2m_ram_rw after
an ioreq server has unmapped. The resync is done both asynchronously,
with the current p2m_change_entry_type_global() interface, and
synchronously, by iterating over the p2m table. The synchronous reset
is necessary because we need to guarantee the p2m table is clean
before another ioreq server is mapped. And since sweeping the p2m
table can be time consuming, it is done with hypercall continuation.
The asynchronous approach is also taken so that p2m_ioreq_server
entries can still be reset while the HVM guest is scheduled between
hypercall continuations.

This patch also disallows live migration while there are still
outstanding p2m_ioreq_server entries. The core reason is that our
current implementation of p2m_change_entry_type_global() cannot
tell the state of p2m_ioreq_server entries (it cannot decide whether
an entry is to be emulated or to be resynced).

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>

changes in v2: 
  - Move the calculation of the ioreq server page entry_count into
    p2m_change_type_one() so that we do not need a separate lock.
    Note: entry_count is also updated in resolve_misconfig()/
    do_recalc(); fortunately, callers of both routines already hold
    the p2m lock.
  - Simplify logic in hvmop_set_mem_type().
  - Introduce routine p2m_finish_type_change() to walk the p2m 
    table and do the p2m reset.
---
 xen/arch/x86/hvm/hvm.c    | 38 +++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/hap/hap.c |  9 +++++++++
 xen/arch/x86/mm/p2m-ept.c |  6 +++++-
 xen/arch/x86/mm/p2m-pt.c  | 10 ++++++++--
 xen/arch/x86/mm/p2m.c     | 49 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/p2m.h |  9 ++++++++-
 6 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9b419d2..200c661 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5522,11 +5522,13 @@ static int hvmop_set_mem_type(
 }
 
 static int hvmop_map_mem_type_to_ioreq_server(
-    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
+    unsigned long *iter)
 {
     xen_hvm_map_mem_type_to_ioreq_server_t op;
     struct domain *d;
     int rc;
+    unsigned long gfn = *iter;
 
     if ( copy_from_guest(&op, uop, 1) )
         return -EFAULT;
@@ -5551,7 +5553,35 @@ static int hvmop_map_mem_type_to_ioreq_server(
     if ( rc != 0 )
         goto out;
 
-    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+    if ( gfn == 0 )
+        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+    /*
+     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
+     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
+     */
+    if ( op.flags == 0 && rc == 0 )
+    {
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        while ( read_atomic(&p2m->ioreq.entry_count) &&
+                gfn <= p2m->max_mapped_pfn )
+        {
+            unsigned long gfn_end = gfn + HVMOP_op_mask;
+
+            p2m_finish_type_change(d, gfn, gfn_end,
+                                   p2m_ioreq_server, p2m_ram_rw);
+
+            /* Check for continuation if it's not the last iteration. */
+            if ( ++gfn_end <= p2m->max_mapped_pfn &&
+                hypercall_preempt_check() )
+            {
+                rc = -ERESTART;
+                *iter = gfn_end;
+                goto out;
+            }
+        }
+    }
 
  out:
     rcu_unlock_domain(d);
@@ -5570,6 +5600,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     case HVMOP_modified_memory:
     case HVMOP_set_mem_type:
+    case HVMOP_map_mem_type_to_ioreq_server:
         mask = HVMOP_op_mask;
         break;
     }
@@ -5599,7 +5630,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case HVMOP_map_mem_type_to_ioreq_server:
         rc = hvmop_map_mem_type_to_ioreq_server(
-            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t),
+            &start_iter);
         break;
 
     case HVMOP_set_ioreq_server_state:
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3218fa2..5df2d62 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -190,6 +190,15 @@ out:
  */
 static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
 {
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    /*
+    * Refuse to turn on global log-dirty mode if
+    * there's outstanding p2m_ioreq_server pages.
+    */
+    if ( log_global && read_atomic(&p2m->ioreq.entry_count) )
+        return -EBUSY;
+
     /* turn on PG_log_dirty bit in paging mode */
     paging_lock(d);
     d->arch.paging.mode |= PG_log_dirty;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 700420a..6679ae7 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -545,6 +545,9 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                     e.ipat = ipat;
                     if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
                     {
+                         if ( e.sa_p2mt == p2m_ioreq_server )
+                             p2m->ioreq.entry_count--;
+
                          e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
                                      ? p2m_ram_logdirty : p2m_ram_rw;
                          ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
@@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
     if ( is_epte_valid(ept_entry) )
     {
         if ( (recalc || ept_entry->recalc) &&
-             p2m_is_changeable(ept_entry->sa_p2mt) )
+             p2m_is_changeable(ept_entry->sa_p2mt) &&
+             (ept_entry->sa_p2mt != p2m_ioreq_server) )
             *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                       : p2m_ram_rw;
         else
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 46a56fa..7f31c0e 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -439,11 +439,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
          needs_recalc(l1, *pent) )
     {
         l1_pgentry_t e = *pent;
+        p2m_type_t p2mt_old;
 
         if ( !valid_recalc(l1, e) )
             P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
                       p2m->domain->domain_id, gfn, level);
-        if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
+        p2mt_old = p2m_flags_to_type(l1e_get_flags(e));
+        if ( p2m_is_changeable(p2mt_old) )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
@@ -463,6 +465,10 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
                      mfn &= ~(_PAGE_PSE_PAT >> PAGE_SHIFT);
                 flags |= _PAGE_PSE;
             }
+
+            if ( p2mt_old == p2m_ioreq_server )
+                p2m->ioreq.entry_count--;
+
             e = l1e_from_pfn(mfn, flags);
             p2m_add_iommu_flags(&e, level,
                                 (p2mt == p2m_ram_rw)
@@ -729,7 +735,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
                                      struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !recalc || !p2m_is_changeable(t) )
+    if ( !recalc || !p2m_is_changeable(t) || (t == p2m_ioreq_server) )
         return t;
     return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                 : p2m_ram_rw;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6e4cb1f..6581a70 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -313,6 +313,9 @@ int p2m_set_ioreq_server(struct domain *d,
 
         p2m->ioreq.server = NULL;
         p2m->ioreq.flags = 0;
+
+        if ( read_atomic(&p2m->ioreq.entry_count) )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
     }
     else
     {
@@ -957,6 +960,23 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
                          p2m->default_access)
          : -EBUSY;
 
+    if ( !rc )
+    {
+        switch ( nt )
+        {
+        case p2m_ram_rw:
+            if ( ot == p2m_ioreq_server )
+                p2m->ioreq.entry_count--;
+            break;
+        case p2m_ioreq_server:
+            if ( ot == p2m_ram_rw )
+                p2m->ioreq.entry_count++;
+            break;
+        default:
+            break;
+        }
+    }
+
     gfn_unlock(p2m, gfn, 0);
 
     return rc;
@@ -1021,6 +1041,35 @@ void p2m_change_type_range(struct domain *d,
     p2m_unlock(p2m);
 }
 
+/* Synchronously modify the p2m type of a range of gfns from ot to nt. */
+void p2m_finish_type_change(struct domain *d,
+                           unsigned long start, unsigned long end,
+                           p2m_type_t ot, p2m_type_t nt)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    p2m_type_t t;
+    unsigned long gfn = start;
+
+    ASSERT(start <= end);
+    ASSERT(ot != nt);
+    ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
+
+    p2m_lock(p2m);
+
+    end = (end > p2m->max_mapped_pfn) ? p2m->max_mapped_pfn : end;
+    while ( gfn <= end )
+    {
+        get_gfn_query_unlocked(d, gfn, &t);
+
+        if ( t == ot )
+            p2m_change_type_one(d, gfn, t, nt);
+
+        gfn++;
+    }
+
+    p2m_unlock(p2m);
+}
+
 /*
  * Returns:
  *    0              for success
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 4924c4b..bf9adca 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -120,7 +120,8 @@ typedef unsigned int p2m_query_t;
 
 /* Types that can be subject to bulk transitions. */
 #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
-                              | p2m_to_mask(p2m_ram_logdirty) )
+                              | p2m_to_mask(p2m_ram_logdirty) \
+                              | p2m_to_mask(p2m_ioreq_server))
 
 #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
 
@@ -349,6 +350,7 @@ struct p2m_domain {
          * are to be emulated by an ioreq server.
          */
         unsigned int flags;
+        unsigned int entry_count;
     } ioreq;
 };
 
@@ -604,6 +606,11 @@ void p2m_change_type_range(struct domain *d,
 int p2m_change_type_one(struct domain *d, unsigned long gfn,
                         p2m_type_t ot, p2m_type_t nt);
 
+/* Synchronously change types across a range of p2m entries (start ... end) */
+void p2m_finish_type_change(struct domain *d,
+                           unsigned long start, unsigned long end,
+                           p2m_type_t ot, p2m_type_t nt);
+
 /* Report a change affecting memory types. */
 void p2m_memory_type_changed(struct domain *d);
 
-- 
1.9.1



* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
@ 2016-09-05 13:31   ` Jan Beulich
  2016-09-05 17:20     ` George Dunlap
  2016-09-09  5:55     ` Yu Zhang
  2016-09-05 17:23   ` George Dunlap
       [not found]   ` <57D24730.2050904@linux.intel.com>
  2 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2016-09-05 13:31 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        struct hvm_ioreq_server *s = NULL;
> +        p2m_type_t p2mt = p2m_invalid;
> +
> +        if ( is_mmio )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +
> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> +
> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
> +            {
> +                unsigned int flags;
> +
> +                s = p2m_get_ioreq_server(currd, &flags);
> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> +                    s = NULL;
> +            }
> +        }
> +
> +        if ( !s && p2mt != p2m_ioreq_server )
> +            s = hvm_select_ioreq_server(currd, &p);

What I recall is that we had agreed on p2m_ioreq_server pages
to be treated as ordinary RAM ones as long as no server can be
found. The type check here contradicts that. Is there a reason?

> +static int hvmop_map_mem_type_to_ioreq_server(
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
> +{
> +    xen_hvm_map_mem_type_to_ioreq_server_t op;
> +    struct domain *d;
> +    int rc;
> +
> +    if ( copy_from_guest(&op, uop, 1) )
> +        return -EFAULT;
> +
> +    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    rc = -EINVAL;
> +    if ( !is_hvm_domain(d) )
> +        goto out;
> +
> +    if ( op.pad != 0 )
> +        goto out;

This, I think, should be done first thing after having copied in the
structure. No need to look up the domain or anything if this is not zero.
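
A sketch of the suggested ordering (checking the padding field right after
the copy, before taking the domain reference):

    if ( copy_from_guest(&op, uop, 1) )
        return -EFAULT;

    if ( op.pad != 0 )
        return -EINVAL;

    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
    if ( rc != 0 )
        return rc;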

> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                     uint32_t type, uint32_t flags)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    /* For now, only HVMMEM_ioreq_server is supported. */
> +    if ( type != HVMMEM_ioreq_server )
> +        return -EINVAL;
> +
> +    /* For now, only write emulation is supported. */
> +    if ( flags & ~(XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> +        return -EINVAL;
> +
> +    domain_pause(d);
> +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            rc = p2m_set_ioreq_server(d, flags, s);
> +            break;
> +        }
> +    }
> +
> +    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> +    domain_unpause(d);
> +    return rc;
> +}

Blank line before final return statement of a function please.

> +int p2m_set_ioreq_server(struct domain *d,
> +                         unsigned int flags,
> +                         struct hvm_ioreq_server *s)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    int rc;
> +
> +    /*
> +     * Use lock to prevent concurrent setting requirements
> +     * from multiple ioreq servers.
> +     */

"Concurrent setting requirements"? DYM "attempts"?

Jan



* Re: [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled.
  2016-09-02 10:47 ` [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
@ 2016-09-05 13:49   ` Jan Beulich
       [not found]   ` <57D24782.6010701@linux.intel.com>
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2016-09-05 13:49 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Andrew Cooper, Paul Durrant, zhiyuan.lv, xen-devel

>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
> Routine hvmemul_do_io() may need to peek the p2m type of a gfn to
> select the ioreq server. For example, operations on gfns with
> p2m_ioreq_server type will be delivered to a corresponding ioreq
> server, and this requires that the p2m type not be switched back
> to p2m_ram_rw during the emulation process. To avoid this race
> condition, we delay the release of p2m lock in hvm_hap_nested_page_fault()
> until mmio is handled.
> 
> Note: previously in hvm_hap_nested_page_fault(), put_gfn() was moved
> before the handling of mmio, due to a deadlock risk between the p2m
> lock and the event lock(in commit 77b8dfe). Later, a per-event channel
> lock was introduced in commit de6acb7, to send events. So we do not
> need to worry about the deadlock issue.
> 
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

However, shouldn't this go _before_ what is now patch 1?

Jan



* Re: [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages.
  2016-09-02 10:47 ` [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
@ 2016-09-05 14:10   ` Jan Beulich
       [not found]   ` <57D247F6.9010503@linux.intel.com>
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2016-09-05 14:10 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Andrew Cooper, Paul Durrant, zhiyuan.lv, xen-devel

>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -95,6 +95,41 @@ static const struct hvm_io_handler null_handler = {
>      .ops = &null_ops
>  };
>  
> +static int mem_read(const struct hvm_io_handler *io_handler,
> +                    uint64_t addr,
> +                    uint32_t size,
> +                    uint64_t *data)
> +{
> +    struct domain *currd = current->domain;
> +    unsigned long gmfn = paddr_to_pfn(addr);
> +    unsigned long offset = addr & ~PAGE_MASK;
> +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
> +    uint8_t *p;

const (and preferably also void)

> +    ASSERT(offset + size < PAGE_SIZE);

Surely <= ?

> +    if ( !page )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    p = __map_domain_page(page);
> +    p += offset;
> +    memcpy(data, p, size);
> +
> +    unmap_domain_page(p);
> +    put_page(page);

But anyway - I think rather than all this open coding you would
better call hvm_copy_from_guest_phys().
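
A minimal sketch of that suggestion, assuming the usual
hvm_copy_from_guest_phys(buf, paddr, size) interface:

static int mem_read(const struct hvm_io_handler *io_handler,
                    uint64_t addr, uint32_t size, uint64_t *data)
{
    /* Let hvm_copy_from_guest_phys() do the gfn lookup, mapping and copy. */
    if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay )
        return X86EMUL_UNHANDLEABLE;

    return X86EMUL_OKAY;
}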

> +static const struct hvm_io_ops mem_ops = {
> +    .read = mem_read,
> +    .write = null_write
> +};
> +
> +static const struct hvm_io_handler mem_handler = {
> +    .ops = &mem_ops
> +};

I think the mem_ prefix for both objects is a bad one, considering
that this isn't suitable for general memory handling.

> @@ -204,7 +239,15 @@ static int hvmemul_do_io(
>          /* If there is no suitable backing DM, just ignore accesses */
>          if ( !s )
>          {
> -            rc = hvm_process_io_intercept(&null_handler, &p);
> +            /*
> +             * For p2m_ioreq_server pages accessed with read-modify-write
> +             * instructions, we provide a read handler to copy the data to
> +             * the buffer.
> +             */
> +            if ( p2mt == p2m_ioreq_server )

Please add unlikely() here, or aid the compiler in avoiding any
branch by ...

> +                rc = hvm_process_io_intercept(&mem_handler, &p);
> +            else
> +                rc = hvm_process_io_intercept(&null_handler, &p);

... using a conditional expression for the first function argument.

And the comment ahead of the if() now also needs adjustment
(perhaps you want to merge the one you add into that one).
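
The conditional-expression variant would look roughly like:

            rc = hvm_process_io_intercept(p2mt == p2m_ioreq_server
                                          ? &mem_handler : &null_handler,
                                          &p);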

Jan



* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-02 10:47 ` [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
@ 2016-09-05 14:47   ` Jan Beulich
       [not found]   ` <57D24813.2090903@linux.intel.com>
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2016-09-05 14:47 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
> @@ -5551,7 +5553,35 @@ static int hvmop_map_mem_type_to_ioreq_server(
>      if ( rc != 0 )
>          goto out;
>  
> -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
> +    if ( gfn == 0 )
> +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
> +
> +    /*
> +     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
> +     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
> +     */
> +    if ( op.flags == 0 && rc == 0 )
> +    {
> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        while ( read_atomic(&p2m->ioreq.entry_count) &&
> +                gfn <= p2m->max_mapped_pfn )
> +        {
> +            unsigned long gfn_end = gfn + HVMOP_op_mask;
> +
> +            p2m_finish_type_change(d, gfn, gfn_end,
> +                                   p2m_ioreq_server, p2m_ram_rw);
> +
> +            /* Check for continuation if it's not the last iteration. */
> +            if ( ++gfn_end <= p2m->max_mapped_pfn &&
> +                hypercall_preempt_check() )
> +            {
> +                rc = -ERESTART;
> +                *iter = gfn_end;
> +                goto out;
> +            }
> +        }

"gfn" doesn't get updated, so if no preemption happens here, the
same GFN range will get acted on a second time (and so on, until
eventually preemption becomes necessary).

Also when you don't really need to use "goto", please try to avoid
it - here you could easily use just "break" instead.
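
I.e. perhaps (sketch, untested):

        while ( read_atomic(&p2m->ioreq.entry_count) &&
                gfn <= p2m->max_mapped_pfn )
        {
            unsigned long gfn_end = gfn + HVMOP_op_mask;

            p2m_finish_type_change(d, gfn, gfn_end,
                                   p2m_ioreq_server, p2m_ram_rw);

            /* Advance to the next range before checking for preemption. */
            gfn = gfn_end + 1;

            /* Check for continuation if it's not the last iteration. */
            if ( gfn <= p2m->max_mapped_pfn &&
                 hypercall_preempt_check() )
            {
                rc = -ERESTART;
                *iter = gfn;
                break;
            }
        }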

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -545,6 +545,9 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>                      e.ipat = ipat;
>                      if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
>                      {
> +                         if ( e.sa_p2mt == p2m_ioreq_server )
> +                             p2m->ioreq.entry_count--;

ISTR that George had asked you to put this accounting elsewhere.

And then I'd really like you to add assertions making sure this
count doesn't underflow.

> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>      if ( is_epte_valid(ept_entry) )
>      {
>          if ( (recalc || ept_entry->recalc) &&
> -             p2m_is_changeable(ept_entry->sa_p2mt) )
> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>              *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
>                                                        : p2m_ram_rw;
>          else

Considering this (and at least one more similar adjustment further
down), is it really appropriate to include p2m_ioreq_server in the
set of "changeable" types?

> +void p2m_finish_type_change(struct domain *d,
> +                           unsigned long start, unsigned long end,
> +                           p2m_type_t ot, p2m_type_t nt)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    p2m_type_t t;
> +    unsigned long gfn = start;
> +
> +    ASSERT(start <= end);
> +    ASSERT(ot != nt);
> +    ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
> +
> +    p2m_lock(p2m);
> +
> +    end = (end > p2m->max_mapped_pfn) ? p2m->max_mapped_pfn : end;

min() or an if() without else.
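
I.e. (both values being unsigned long, Xen's min() would do):

    end = min(end, p2m->max_mapped_pfn);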

> +    while ( gfn <= end )
> +    {
> +        get_gfn_query_unlocked(d, gfn, &t);
> +
> +        if ( t == ot )
> +            p2m_change_type_one(d, gfn, t, nt);
> +
> +        gfn++;
> +    }
> +
> +    p2m_unlock(p2m);
> +}

And then I wonder why p2m_change_type_range() can't be used
instead of adding this new function.

Jan



* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-05 13:31   ` Jan Beulich
@ 2016-09-05 17:20     ` George Dunlap
  2016-09-06  7:58       ` Jan Beulich
  2016-09-09  5:55     ` Yu Zhang
  1 sibling, 1 reply; 40+ messages in thread
From: George Dunlap @ 2016-09-05 17:20 UTC (permalink / raw)
  To: Jan Beulich, Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, zhiyuan.lv, Jun Nakajima

On 05/09/16 14:31, Jan Beulich wrote:
>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>>          break;
>>      case X86EMUL_UNHANDLEABLE:
>>      {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        struct hvm_ioreq_server *s = NULL;
>> +        p2m_type_t p2mt = p2m_invalid;
>> +
>> +        if ( is_mmio )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> +
>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>> +            {
>> +                unsigned int flags;
>> +
>> +                s = p2m_get_ioreq_server(currd, &flags);
>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +                    s = NULL;
>> +            }
>> +        }
>> +
>> +        if ( !s && p2mt != p2m_ioreq_server )
>> +            s = hvm_select_ioreq_server(currd, &p);
> 
> What I recall is that we had agreed on p2m_ioreq_server pages
> to be treated as ordinary RAM ones as long as no server can be
> found. The type check here contradicts that. Is there a reason?

I think it must be a confusion as to what "treated like ordinary RAM
ones" means.  p2m_ram_rw types that get here will go through
hvm_select_ioreq_server(), and (therefore) potentially be treated as
MMIO accesses, which is not how "ordinary RAM" would behave.  If what
you meant was that you want p2m_ioreq_server to behave like p2m_ram_rw
(and be treated as MMIO if it matches an iorange) then yes.  If what you
want is for p2m_ioreq_server to actually act like RAM, then probably
something more needs to happen here.

 -George


* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
  2016-09-05 13:31   ` Jan Beulich
@ 2016-09-05 17:23   ` George Dunlap
       [not found]   ` <57D24730.2050904@linux.intel.com>
  2 siblings, 0 replies; 40+ messages in thread
From: George Dunlap @ 2016-09-05 17:23 UTC (permalink / raw)
  To: Yu Zhang, xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Tim Deegan, Paul Durrant, zhiyuan.lv, Jan Beulich

On 02/09/16 11:47, Yu Zhang wrote:
> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
> let one ioreq server claim/disclaim its responsibility for the
> handling of guest pages with p2m type p2m_ioreq_server. Users
> of this HVMOP can specify which kind of operation is supposed
> to be emulated in a parameter named flags. Currently, this HVMOP
> only support the emulation of write operations. And it can be
> further extended to support the emulation of read ones if an
> ioreq server has such requirement in the future.
> 
> For now, we only support one ioreq server for this p2m type, so
> once an ioreq server has claimed its ownership, subsequent calls
> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
> triggering this new HVMOP, with ioreq server id set to the current
> owner's and flags parameter set to 0.
> 
> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
> are only supported for HVMs with HAP enabled.
> 
> Also note that only after one ioreq server claims its ownership
> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
> be allowed.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Acked-by: Tim Deegan <tim@xen.org>
> ---
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Tim Deegan <tim@xen.org>
> 
> changes in v6: 
>   - Clarify logic in hvmemul_do_io().
>   - Use recursive lock for ioreq server lock.
>   - Remove debug print when mapping ioreq server.
>   - Clarify code in ept_p2m_type_to_flags() for consistency.
>   - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>   - Add comments for HVMMEM_ioreq_server to note only changes
>     to/from HVMMEM_ram_rw are permitted.
>   - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>     to avoid the race condition when a vm exit happens on a write-
>     protected page, just to find the ioreq server has been unmapped
>     already.

Why do we need to do this?  Won't the default case just DTRT if it finds
that the ioreq server has been unmapped?

 -George


* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-05 17:20     ` George Dunlap
@ 2016-09-06  7:58       ` Jan Beulich
  2016-09-06  8:03         ` Paul Durrant
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2016-09-06  7:58 UTC (permalink / raw)
  To: George Dunlap, Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, zhiyuan.lv, JunNakajima

>>> On 05.09.16 at 19:20, <george.dunlap@citrix.com> wrote:
> On 05/09/16 14:31, Jan Beulich wrote:
>>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>>>          break;
>>>      case X86EMUL_UNHANDLEABLE:
>>>      {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        struct hvm_ioreq_server *s = NULL;
>>> +        p2m_type_t p2mt = p2m_invalid;
>>> +
>>> +        if ( is_mmio )
>>> +        {
>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>> +
>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>> +
>>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>>> +            {
>>> +                unsigned int flags;
>>> +
>>> +                s = p2m_get_ioreq_server(currd, &flags);
>>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>> +                    s = NULL;
>>> +            }
>>> +        }
>>> +
>>> +        if ( !s && p2mt != p2m_ioreq_server )
>>> +            s = hvm_select_ioreq_server(currd, &p);
>> 
>> What I recall is that we had agreed on p2m_ioreq_server pages
>> to be treated as ordinary RAM ones as long as no server can be
>> found. The type check here contradicts that. Is there a reason?
> 
> I think it must be a confusion as to what "treated like ordinary RAM
> ones" means.  p2m_ram_rw types that gets here will go through
> hvm_select_ioreq_server(), and (therefore) potentially be treated as
> MMIO accesses, which is not how "ordinary RAM" would behave.  If what
> you meant was that you want p2m_ioreq_server to behave like p2m_ram_rw
> (and be treated as MMIO if it matches an iorange) then yes.  If what you
> want is for p2m_ioreq_server to actually act like RAM, then probably
> something more needs to happen here.

Well, all I'm questioning is the special casing of p2m_ioreq_server in
the if(). That's imo orthogonal to p2m_ram_rw pages not being
supposed to make it here (hence the is_mmio check in the earlier if()
also looks questionable). Perhaps it would already help if there was
a comment explaining what the exact intended behavior here is.

Jan



* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-06  7:58       ` Jan Beulich
@ 2016-09-06  8:03         ` Paul Durrant
  2016-09-06  8:13           ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Paul Durrant @ 2016-09-06  8:03 UTC (permalink / raw)
  To: Jan Beulich, Yu Zhang
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, JunNakajima

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 06 September 2016 08:58
> To: George Dunlap <George.Dunlap@citrix.com>; Yu Zhang
> <yu.c.zhang@linux.intel.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
> JunNakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram
> with p2m_ioreq_server to an ioreq server.
> 
> >>> On 05.09.16 at 19:20, <george.dunlap@citrix.com> wrote:
> > On 05/09/16 14:31, Jan Beulich wrote:
> >>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
> >>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
> >>>          break;
> >>>      case X86EMUL_UNHANDLEABLE:
> >>>      {
> >>> -        struct hvm_ioreq_server *s =
> >>> -            hvm_select_ioreq_server(curr->domain, &p);
> >>> +        struct hvm_ioreq_server *s = NULL;
> >>> +        p2m_type_t p2mt = p2m_invalid;
> >>> +
> >>> +        if ( is_mmio )
> >>> +        {
> >>> +            unsigned long gmfn = paddr_to_pfn(addr);
> >>> +
> >>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> >>> +
> >>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
> >>> +            {
> >>> +                unsigned int flags;
> >>> +
> >>> +                s = p2m_get_ioreq_server(currd, &flags);
> >>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
> >>> +                    s = NULL;
> >>> +            }
> >>> +        }
> >>> +
> >>> +        if ( !s && p2mt != p2m_ioreq_server )
> >>> +            s = hvm_select_ioreq_server(currd, &p);
> >>
> >> What I recall is that we had agreed on p2m_ioreq_server pages to be
> >> treated as ordinary RAM ones as long as no server can be found. The
> >> type check here contradicts that. Is there a reason?
> >
> > I think it must be a confusion as to what "treated like ordinary RAM
> > ones" means.  p2m_ram_rw types that gets here will go through
> > hvm_select_ioreq_server(), and (therefore) potentially be treated as
> > MMIO accesses, which is not how "ordinary RAM" would behave.  If what
> > you meant was that you want p2m_ioreq_server to behave like
> p2m_ram_rw
> > (and be treated as MMIO if it matches an iorange) then yes.  If what
> > you want is for p2m_ioreq_server to actually act like RAM, then
> > probably something more needs to happen here.
> 
> Well, all I'm questioning is the special casing of p2m_ioreq_server in the if().
> That's imo orthogonal to p2m_ram_rw pages not being supposed to make it
> here (hence the is_mmio check in the earlier if() also looks questionable).
> Perhaps it would already help if there was a comment explaining what the
> exact intended behavior here is.
> 

My understanding is that we want accesses that make it here for pages
that are not of type 'ioreq_server' to result in MMIO emulation (i.e.
they hit an emulator's requested ranges, or the access is completed as
unclaimed MMIO by Xen). Accesses that make it here because the page
*is* of type 'ioreq server' should be sent to the emulator that has
claimed the type and, if no emulator does currently have a claim to
the type, be handled as if the access was to r/w RAM.
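
As a rough sketch of that intent (not quite what the patch above does
in all cases):

        if ( p2mt == p2m_ioreq_server )
        {
            /* Claimed type: forward to the ioreq server owning it. */
            s = p2m_get_ioreq_server(currd, &flags);

            /*
             * If no server currently claims the type (s == NULL), the
             * access should be completed as ordinary r/w RAM rather
             * than being dropped.
             */
        }
        else
            /* Any other page type: the normal MMIO selection path. */
            s = hvm_select_ioreq_server(currd, &p);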

  Paul

> Jan



* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-06  8:03         ` Paul Durrant
@ 2016-09-06  8:13           ` Jan Beulich
  2016-09-06 10:00             ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2016-09-06  8:13 UTC (permalink / raw)
  To: Paul Durrant, Yu Zhang
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, JunNakajima

>>> On 06.09.16 at 10:03, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 06 September 2016 08:58
>> To: George Dunlap <George.Dunlap@citrix.com>; Yu Zhang
>> <yu.c.zhang@linux.intel.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
>> <Paul.Durrant@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
>> JunNakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
>> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
>> <tim@xen.org>
>> Subject: Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram
>> with p2m_ioreq_server to an ioreq server.
>> 
>> >>> On 05.09.16 at 19:20, <george.dunlap@citrix.com> wrote:
>> > On 05/09/16 14:31, Jan Beulich wrote:
>> >>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>> >>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>> >>>          break;
>> >>>      case X86EMUL_UNHANDLEABLE:
>> >>>      {
>> >>> -        struct hvm_ioreq_server *s =
>> >>> -            hvm_select_ioreq_server(curr->domain, &p);
>> >>> +        struct hvm_ioreq_server *s = NULL;
>> >>> +        p2m_type_t p2mt = p2m_invalid;
>> >>> +
>> >>> +        if ( is_mmio )
>> >>> +        {
>> >>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> >>> +
>> >>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> >>> +
>> >>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>> >>> +            {
>> >>> +                unsigned int flags;
>> >>> +
>> >>> +                s = p2m_get_ioreq_server(currd, &flags);
>> >>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> >>> +                    s = NULL;
>> >>> +            }
>> >>> +        }
>> >>> +
>> >>> +        if ( !s && p2mt != p2m_ioreq_server )
>> >>> +            s = hvm_select_ioreq_server(currd, &p);
>> >>
>> >> What I recall is that we had agreed on p2m_ioreq_server pages to be
>> >> treated as ordinary RAM ones as long as no server can be found. The
>> >> type check here contradicts that. Is there a reason?
>> >
>> > I think it must be a confusion as to what "treated like ordinary RAM
>> > ones" means.  p2m_ram_rw types that gets here will go through
>> > hvm_select_ioreq_server(), and (therefore) potentially be treated as
>> > MMIO accesses, which is not how "ordinary RAM" would behave.  If what
>> > you meant was that you want p2m_ioreq_server to behave like
>> p2m_ram_rw
>> > (and be treated as MMIO if it matches an iorange) then yes.  If what
>> > you want is for p2m_ioreq_server to actually act like RAM, then
>> > probably something more needs to happen here.
>> 
>> Well, all I'm questioning is the special casing of p2m_ioreq_server in the if().
>> That's imo orthogonal to p2m_ram_rw pages not being supposed to make it
>> here (hence the is_mmio check in the earlier if() also looks questionable).
>> Perhaps it would already help if there was a comment explaining what the
>> exact intended behavior here is.
>> 
> 
> My understanding is that we want accesses that make it here for pages that 
> are not of type 'ioreq_server' to result in MMIO emulation (i.e. they hit an 
> emulator's requested ranges, or the access is completed as unclaimed MMIO by 
> Xen). Accesses that make it here because the page *is* of type 'ioreq server' 
> should be sent to the emulator that has claimed the type and, if no emulator 
> does currently have a claim to the type, be handled as if the access was to 
> r/w RAM.

Makes sense, but it doesn't look like the code above does that, as
- keeping s NULL means ignoring the access,
- finding a server by some means would imply handling the access as I/O
  instead of RAM.

Jan



* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-06  8:13           ` Jan Beulich
@ 2016-09-06 10:00             ` Yu Zhang
  0 siblings, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-06 10:00 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Kevin Tian, Andrew Cooper, Tim (Xen.org),
	George Dunlap, xen-devel, zhiyuan.lv, JunNakajima



On 9/6/2016 4:13 PM, Jan Beulich wrote:
>>>> On 06.09.16 at 10:03, <Paul.Durrant@citrix.com> wrote:
>>>   -----Original Message-----
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>> Sent: 06 September 2016 08:58
>>> To: George Dunlap <George.Dunlap@citrix.com>; Yu Zhang
>>> <yu.c.zhang@linux.intel.com>
>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
>>> <Paul.Durrant@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
>>> JunNakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
>>> zhiyuan.lv@intel.com; xen-devel@lists.xen.org; Tim (Xen.org)
>>> <tim@xen.org>
>>> Subject: Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram
>>> with p2m_ioreq_server to an ioreq server.
>>>
>>>>>> On 05.09.16 at 19:20, <george.dunlap@citrix.com> wrote:
>>>> On 05/09/16 14:31, Jan Beulich wrote:
>>>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>>>>>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>>>>>>           break;
>>>>>>       case X86EMUL_UNHANDLEABLE:
>>>>>>       {
>>>>>> -        struct hvm_ioreq_server *s =
>>>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>>>> +        struct hvm_ioreq_server *s = NULL;
>>>>>> +        p2m_type_t p2mt = p2m_invalid;
>>>>>> +
>>>>>> +        if ( is_mmio )
>>>>>> +        {
>>>>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>>>>> +
>>>>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>>>>> +
>>>>>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>>>>>> +            {
>>>>>> +                unsigned int flags;
>>>>>> +
>>>>>> +                s = p2m_get_ioreq_server(currd, &flags);
>>>>>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>>>>> +                    s = NULL;
>>>>>> +            }
>>>>>> +        }
>>>>>> +
>>>>>> +        if ( !s && p2mt != p2m_ioreq_server )
>>>>>> +            s = hvm_select_ioreq_server(currd, &p);
>>>>> What I recall is that we had agreed on p2m_ioreq_server pages to be
>>>>> treated as ordinary RAM ones as long as no server can be found. The
>>>>> type check here contradicts that. Is there a reason?
>>>> I think it must be a confusion as to what "treated like ordinary RAM
>>>> ones" means.  p2m_ram_rw types that gets here will go through
>>>> hvm_select_ioreq_server(), and (therefore) potentially be treated as
>>>> MMIO accesses, which is not how "ordinary RAM" would behave.  If what
>>>> you meant was that you want p2m_ioreq_server to behave like
>>> p2m_ram_rw
>>>> (and be treated as MMIO if it matches an iorange) then yes.  If what
>>>> you want is for p2m_ioreq_server to actually act like RAM, then
>>>> probably something more needs to happen here.
>>> Well, all I'm questioning is the special casing of p2m_ioreq_server in the if().
>>> That's imo orthogonal to p2m_ram_rw pages not being supposed to make it
>>> here (hence the is_mmio check in the earlier if() also looks questionable).
>>> Perhaps it would already help if there was a comment explaining what the
>>> exact intended behavior here is.
>>>
>> My understanding is that we want accesses that make it here for pages that
>> are not of type 'ioreq_server' to result in MMIO emulation (i.e. they hit an
>> emulator's requested ranges, or the access is completed as unclaimed MMIO by
>> Xen). Accesses that make it here because the page *is* of type 'ioreq server'
>> should be sent to the emulator that has claimed the type and, if no emulator
>> does currently have a claim to the type, be handled as if the access was to
>> r/w RAM.
> Makes sense, but it doesn't look like the code above does that, as
> - keeping s to be NULL means ignoring the access
> - finding a server by some means would imply handling the access as I/O
>    instead of RAM.
>
> Jan

Thanks Paul for helping me explain this. :)
And I'd like to give some clarification:
Accesses to p2m_ioreq_server pages are supposed to be delivered to the
device model selected by p2m_get_ioreq_server(); only accesses to port
I/O and MMIO are supposed to use hvm_select_ioreq_server() to select
the device model.

For p2m_ioreq_server pages, the variable s (the ioreq server) can be
NULL in the following cases:

1> Since we currently only support the emulation of write accesses, we
shall not forward the access to the device model if it is a read.
Therefore p2m_get_ioreq_server() is only called under the
"if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )" restriction.
However, the guest may issue a read-modify-write instruction against
such a page; the read emulation should be handled in the hypervisor
first, and the write access delivered to the device model afterwards.
In that case s is NULL, and the access is handled by the mem_handler
introduced in patch 3/4.

2> Even when p2m_get_ioreq_server() returns a valid ioreq server, it
can be reset to NULL if (flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) is
not true. This part is not indispensable; it is there for future
support of read access emulation. For now, we have code in
hvm_map_mem_type_to_ioreq_server() to guarantee the flag is
XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE.

Note: we do not need to worry about a p2m_ioreq_server page being
written while no ioreq server is bound, because:

1> hvmop_set_mem_type() requires an ioreq server to have been bound
before a page can be set to p2m_ioreq_server;

2> hvmop_map_mem_type_to_ioreq_server() also makes sure all pages with
p2m_ioreq_server are reset back to p2m_ram_rw when it unbinds an ioreq
server (in patch 4/4). So when no ioreq server is bound to
p2m_ioreq_server, there should be no such page left.

Thanks
Yu





* Re: [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type.
  2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
                   ` (3 preceding siblings ...)
  2016-09-02 10:47 ` [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
@ 2016-09-06 10:57 ` Yu Zhang
  4 siblings, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-06 10:57 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, George Dunlap, Paul Durrant; +Cc: zhiyuan.lv



On 9/2/2016 6:47 PM, Yu Zhang wrote:
> XenGT leverages ioreq server to track and forward the accesses to GPU
> I/O resources, e.g. the PPGTT(per-process graphic translation tables).
> Currently, ioreq server uses rangeset to track the BDF/ PIO/MMIO ranges
> to be emulated. To select an ioreq server, the rangeset is searched to
> see if the I/O range is recorded. However, number of ram pages to be
> tracked may exceed the upper limit of rangeset.
>
> Previously, one solution was proposed to refactor the rangeset, and
> extend its upper limit. However, after 12 rounds discussion, we have
> decided to drop this approach due to security concerns. Now this new
> patch series introduces a new mem type, HVMMEM_ioreq_server, and added
> hvm operations to let one ioreq server to claim its ownership of ram
> pages with this type. Accesses to a page of this type will be handled
> by the specified ioreq server directly.
>
> Yu Zhang (4):
>    x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to
>      an ioreq server.
>    x86/ioreq server: Release the p2m lock after mmio is handled.
>    x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server
>      pages.
>    x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an
>      ioreq server unmaps.
>
>   xen/arch/x86/hvm/emulate.c       |  71 ++++++++++++++++++++--
>   xen/arch/x86/hvm/hvm.c           | 104 ++++++++++++++++++++++++++++++---
>   xen/arch/x86/hvm/ioreq.c         |  39 +++++++++++++
>   xen/arch/x86/mm/hap/hap.c        |   9 +++
>   xen/arch/x86/mm/hap/nested_hap.c |   2 +-
>   xen/arch/x86/mm/p2m-ept.c        |  14 ++++-
>   xen/arch/x86/mm/p2m-pt.c         |  30 +++++++---
>   xen/arch/x86/mm/p2m.c            | 123 +++++++++++++++++++++++++++++++++++++++
>   xen/arch/x86/mm/shadow/multi.c   |   3 +-
>   xen/include/asm-x86/hvm/ioreq.h  |   2 +
>   xen/include/asm-x86/p2m.h        |  35 +++++++++--
>   xen/include/public/hvm/hvm_op.h  |  35 ++++++++++-
>   12 files changed, 437 insertions(+), 30 deletions(-)
>
Hi maintainers,

Sorry, I just realized that you had already commented on all these
patches yesterday. Strangely, my Thunderbird failed to receive most
mails from xen-devel yesterday (Sep 5), though it works fine today. I
only saw your replies in the web archive, and cannot reply to them
from there. Even accesses to the web archive fail with connection
errors now and then.

It seems my colleagues here have hit the same issue; I'll contact IT
to see whether we can get this fixed.

Thanks
Yu


* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
       [not found]   ` <57D24730.2050904@linux.intel.com>
@ 2016-09-09  5:51     ` Yu Zhang
  2016-09-21 13:04       ` George Dunlap
  0 siblings, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  5:51 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel



On 9/9/2016 1:22 PM, Yu Zhang wrote:
>
>
> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>> let one ioreq server claim/disclaim its responsibility for the
>> handling of guest pages with p2m type p2m_ioreq_server. Users
>> of this HVMOP can specify which kind of operation is supposed
>> to be emulated in a parameter named flags. Currently, this HVMOP
>> only support the emulation of write operations. And it can be
>> further extended to support the emulation of read ones if an
>> ioreq server has such requirement in the future.
>>
>> For now, we only support one ioreq server for this p2m type, so
>> once an ioreq server has claimed its ownership, subsequent calls
>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>> triggering this new HVMOP, with ioreq server id set to the current
>> owner's and flags parameter set to 0.
>>
>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>> are only supported for HVMs with HAP enabled.
>>
>> Also note that only after one ioreq server claims its ownership
>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>> be allowed.
>>
>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>> Acked-by: Tim Deegan <tim@xen.org>
>> ---
>> Cc: Paul Durrant <paul.durrant@citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Tim Deegan <tim@xen.org>
>>
>> changes in v6:
>>    - Clarify logic in hvmemul_do_io().
>>    - Use recursive lock for ioreq server lock.
>>    - Remove debug print when mapping ioreq server.
>>    - Clarify code in ept_p2m_type_to_flags() for consistency.
>>    - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>    - Add comments for HVMMEM_ioreq_server to note only changes
>>      to/from HVMMEM_ram_rw are permitted.
>>    - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>>      to avoid the race condition when a vm exit happens on a write-
>>      protected page, just to find the ioreq server has been unmapped
>>      already.
>>    - Introduce a seperate patch to delay the release of p2m
>>      lock to avoid the race condition.
>>    - Introduce a seperate patch to handle the read-modify-write
>>      operations on a write protected page.
>>
>
> Why do we need to do this?  Won't the default case just DTRT if it finds
> that the ioreq server has been unmapped?

Well, patch 4 will either mark the remaining p2m_ioreq_server entries
as "recalc" or reset them to p2m_ram_rw directly. So my understanding
is that we do not wish to see an EPT violation due to a
p2m_ioreq_server access after the ioreq server is unmapped. Yet
without this domain_pause/unpause() pair, VM accesses may trigger an
EPT violation during the hvmop hypercall
(hvm_map_mem_type_to_ioreq_server), only to find the ioreq server is
NULL. Then we would have to provide handlers which just do the
copy to/from actions for the VM, which seems awkward to me.

IIUC, with domain_pause/unpause() we can guarantee that the VM will
not trigger such an EPT violation during the unmapping process. After
that, accesses to a p2m_ioreq_server page will only trigger an EPT
misconfiguration. :)

Thanks
Yu


* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-05 13:31   ` Jan Beulich
  2016-09-05 17:20     ` George Dunlap
@ 2016-09-09  5:55     ` Yu Zhang
  2016-09-09  8:09       ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  5:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, Lv, Zhiyuan, Nakajima, Jun



On 9/5/2016 9:31 PM, Jan Beulich wrote:
>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>>           break;
>>       case X86EMUL_UNHANDLEABLE:
>>       {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> +        struct hvm_ioreq_server *s = NULL;
>> +        p2m_type_t p2mt = p2m_invalid;
>> +
>> +        if ( is_mmio )
>> +        {
>> +            unsigned long gmfn = paddr_to_pfn(addr);
>> +
>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>> +
>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>> +            {
>> +                unsigned int flags;
>> +
>> +                s = p2m_get_ioreq_server(currd, &flags);
>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +                    s = NULL;
>> +            }
>> +        }
>> +
>> +        if ( !s && p2mt != p2m_ioreq_server )
>> +            s = hvm_select_ioreq_server(currd, &p);
> What I recall is that we had agreed on p2m_ioreq_server pages
> to be treated as ordinary RAM ones as long as no server can be
> found. The type check here contradicts that. Is there a reason?

Thanks Jan. I gave my explanation in my Sep 6 reply. :)
If s is NULL for a p2m_ioreq_server page, we do not wish to traverse
the rangeset via hvm_select_ioreq_server() again; this is most likely
the read emulation part of a read-modify-write scenario.

>
>> +static int hvmop_map_mem_type_to_ioreq_server(
>> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
>> +{
>> +    xen_hvm_map_mem_type_to_ioreq_server_t op;
>> +    struct domain *d;
>> +    int rc;
>> +
>> +    if ( copy_from_guest(&op, uop, 1) )
>> +        return -EFAULT;
>> +
>> +    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
>> +    if ( rc != 0 )
>> +        return rc;
>> +
>> +    rc = -EINVAL;
>> +    if ( !is_hvm_domain(d) )
>> +        goto out;
>> +
>> +    if ( op.pad != 0 )
>> +        goto out;
> This, I think, should be done first thing after having copied in the
> structure. No need to lookup domain or anything if this is not zero.

Right. Thanks!

>> +int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>> +                                     uint32_t type, uint32_t flags)
>> +{
>> +    struct hvm_ioreq_server *s;
>> +    int rc;
>> +
>> +    /* For now, only HVMMEM_ioreq_server is supported. */
>> +    if ( type != HVMMEM_ioreq_server )
>> +        return -EINVAL;
>> +
>> +    /* For now, only write emulation is supported. */
>> +    if ( flags & ~(XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>> +        return -EINVAL;
>> +
>> +    domain_pause(d);
>> +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> +
>> +    rc = -ENOENT;
>> +    list_for_each_entry ( s,
>> +                          &d->arch.hvm_domain.ioreq_server.list,
>> +                          list_entry )
>> +    {
>> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
>> +            continue;
>> +
>> +        if ( s->id == id )
>> +        {
>> +            rc = p2m_set_ioreq_server(d, flags, s);
>> +            break;
>> +        }
>> +    }
>> +
>> +    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>> +    domain_unpause(d);
>> +    return rc;
>> +}
> Blank line before final return statement of a function please.

Got it.

>
>> +int p2m_set_ioreq_server(struct domain *d,
>> +                         unsigned int flags,
>> +                         struct hvm_ioreq_server *s)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    int rc;
>> +
>> +    /*
>> +     * Use lock to prevent concurrent setting requirements
>> +     * from multiple ioreq serers.
>> +     */
> "Concurrent setting requirements"? DYM "attempts"?

Yep. "attempts" is more accurate.

> Jan
>


* Re: [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled.
       [not found]   ` <57D24782.6010701@linux.intel.com>
@ 2016-09-09  5:56     ` Yu Zhang
  0 siblings, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  5:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Xen-devel, George Dunlap, Lv, Zhiyuan



On 9/9/2016 1:24 PM, Yu Zhang wrote:
>
>
> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>> Routine hvmemul_do_io() may need to peek the p2m type of a gfn to
>> select the ioreq server. For example, operations on gfns with
>> p2m_ioreq_server type will be delivered to a corresponding ioreq
>> server, and this requires that the p2m type not be switched back
>> to p2m_ram_rw during the emulation process. To avoid this race
>> condition, we delay the release of p2m lock in 
>> hvm_hap_nested_page_fault()
>> until mmio is handled.
>>
>> Note: previously in hvm_hap_nested_page_fault(), put_gfn() was moved
>> before the handling of mmio, due to a deadlock risk between the p2m
>> lock and the event lock(in commit 77b8dfe). Later, a per-event channel
>> lock was introduced in commit de6acb7, to send events. So we do not
>> need to worry about the deadlock issue.
>>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>
>
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>
> However, shouldn't this go _before_ what is now patch 1?
>

Yes, this should be patch 1/4. Thanks! :)

Yu


* Re: [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages.
       [not found]   ` <57D247F6.9010503@linux.intel.com>
@ 2016-09-09  6:21     ` Yu Zhang
  2016-09-09  8:12       ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  6:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Xen-devel, George Dunlap, Lv, Zhiyuan



On 9/9/2016 1:26 PM, Yu Zhang wrote:
>
> >>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -95,6 +95,41 @@ static const struct hvm_io_handler null_handler = {
> >      .ops = &null_ops
> >  };
> >
> > +static int mem_read(const struct hvm_io_handler *io_handler,
> > +                    uint64_t addr,
> > +                    uint32_t size,
> > +                    uint64_t *data)
> > +{
> > +    struct domain *currd = current->domain;
> > +    unsigned long gmfn = paddr_to_pfn(addr);
> > +    unsigned long offset = addr & ~PAGE_MASK;
> > +    struct page_info *page = get_page_from_gfn(currd, gmfn, NULL,
> > P2M_UNSHARE);
> > +    uint8_t *p;
>
> const (and preferably also void)
>

Thanks. I think we won't need this variable at all once
hvm_copy_from_guest_phys() is used.

> > +    ASSERT(offset + size < PAGE_SIZE);
>
> Surely <= ?
>

Yes. Thanks.

> > +    if ( !page )
> > +        return X86EMUL_UNHANDLEABLE;
> > +
> > +    p = __map_domain_page(page);
> > +    p += offset;
> > +    memcpy(data, p, size);
> > +
> > +    unmap_domain_page(p);
> > +    put_page(page);
>
> But anyway - I think rather than all this open coding you would
> better call hvm_copy_from_guest_phys().
>

Agree.

> > +static const struct hvm_io_ops mem_ops = {
> > +    .read = mem_read,
> > +    .write = null_write
> > +};
> > +
> > +static const struct hvm_io_handler mem_handler = {
> > +    .ops = &mem_ops
> > +};
>
> I think the mem_ prefix for both objects is a bad one, considering
> that this isn't suitable for general memory handling.

How about ioreq_server_read/ops? It is only for this special p2m type.

>
> > @@ -204,7 +239,15 @@ static int hvmemul_do_io(
> >          /* If there is no suitable backing DM, just ignore accesses */
> >          if ( !s )
> >          {
> > -            rc = hvm_process_io_intercept(&null_handler, &p);
> > +            /*
> > +             * For p2m_ioreq_server pages accessed with 
> read-modify-write
> > +             * instructions, we provide a read handler to copy the 
> data to
> > +             * the buffer.
> > +             */
> > +            if ( p2mt == p2m_ioreq_server )
>
> Please add unlikely() here, or aid the compiler in avoiding any
> branch by ...
>
> > +                rc = hvm_process_io_intercept(&mem_handler, &p);
> > +            else
> > +                rc = hvm_process_io_intercept(&null_handler, &p);
>
> ... using a conditional expression for the first function argument.
>
OK. I prefer to add the unlikely().

> And the comment ahead of the if() now also needs adjustment
> (perhaps you want to merge the one you add into that one).
>

OK. And IIUC, you mean merging it into the original comment above the
"if ( !s )"? Like this:
         /*
          * For p2m_ioreq_server pages accessed with read-modify-write
          * instructions, we provide a read handler to copy the data to
          * the buffer. For other cases, if there is no suitable backing
          * DM, we just ignore accesses.
          */
         if ( !s )

> Jan
>

Thanks
Yu


* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
       [not found]   ` <57D24813.2090903@linux.intel.com>
@ 2016-09-09  7:24     ` Yu Zhang
  2016-09-09  8:20       ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  7:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Xen-devel, George Dunlap, Lv, Zhiyuan



On 9/9/2016 1:26 PM, Yu Zhang wrote:
> >>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
> > @@ -5551,7 +5553,35 @@ static int hvmop_map_mem_type_to_ioreq_server(
> >      if ( rc != 0 )
> >          goto out;
> >
> > -    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, 
> op.flags);
> > +    if ( gfn == 0 )
> > +        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, 
> op.flags);
> > +
> > +    /*
> > +     * Iterate p2m table when an ioreq server unmaps from 
> p2m_ioreq_server,
> > +     * and reset the remaining p2m_ioreq_server entries back to 
> p2m_ram_rw.
> > +     */
> > +    if ( op.flags == 0 && rc == 0 )
> > +    {
> > +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > +
> > +        while ( read_atomic(&p2m->ioreq.entry_count) &&
> > +                gfn <= p2m->max_mapped_pfn )
> > +        {
> > +            unsigned long gfn_end = gfn + HVMOP_op_mask;
> > +
> > +            p2m_finish_type_change(d, gfn, gfn_end,
> > +                                   p2m_ioreq_server, p2m_ram_rw);
> > +
> > +            /* Check for continuation if it's not the last 
> iteration. */
> > +            if ( ++gfn_end <= p2m->max_mapped_pfn &&
> > +                hypercall_preempt_check() )
> > +            {
> > +                rc = -ERESTART;
> > +                *iter = gfn_end;
> > +                goto out;
> > +            }
> > +        }
>
> "gfn" doesn't get updated, so if no preemption happens here, the
> same GFN range will get acted on a second time (and so on, until
> eventually preemption becomes necessary).

Oh, right. Thanks for pointing out. :)

>
> Also when you don't really need to use "goto", please try to avoid
> it - here you could easily use just "break" instead.

OK.

>
> > --- a/xen/arch/x86/mm/p2m-ept.c
> > +++ b/xen/arch/x86/mm/p2m-ept.c
> > @@ -545,6 +545,9 @@ static int resolve_misconfig(struct p2m_domain 
> *p2m,
> > unsigned long gfn)
> >                      e.ipat = ipat;
> >                      if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
> >                      {
> > +                         if ( e.sa_p2mt == p2m_ioreq_server )
> > +                             p2m->ioreq.entry_count--;
>
> ISTR that George had asked you to put this accounting elsewhere.
>

Yes. You have a good memory. :)

George's suggestion was to put the accounting inside
atomic_write_ept_entry(), which is quite a core routine and is
EPT-only. My suggestion is to put it inside p2m_change_type_one() as
well as in resolve_misconfig()/do_recalc(), so that we can support
both Intel EPT and AMD NPT - like p2m_change_entry_type_global() does.

I gave a more detailed explanation in a reply on Aug 30 in the v5
thread. :)

> And then I'd really like you to add assertions making sure this
> count doesn't underflow.

OK.
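For the accounting in resolve_misconfig(), I suppose something along
these lines (just a sketch) is what you mean:

                         if ( e.sa_p2mt == p2m_ioreq_server )
                         {
                             /* Catch any underflow of the unmap accounting. */
                             ASSERT(p2m->ioreq.entry_count > 0);
                             p2m->ioreq.entry_count--;
                         }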

>
> > @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
> >      if ( is_epte_valid(ept_entry) )
> >      {
> >          if ( (recalc || ept_entry->recalc) &&
> > -             p2m_is_changeable(ept_entry->sa_p2mt) )
> > +             p2m_is_changeable(ept_entry->sa_p2mt) &&
> > +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
> >              *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? 
> p2m_ram_logdirty
> >                                                        : p2m_ram_rw;
> >          else
>
> Considering this (and at least one more similar adjustment further
> down), is it really appropriate to include p2m_ioreq_server in the
> set of "changeable" types?

Well, I agree p2m_ioreq_server does behave differently from the
log-dirty type, but removing p2m_ioreq_server from the changeable
types would need more p2m_ioreq_server-specific code in routines like
resolve_misconfig(), do_recalc() and p2m_change_entry_type_global().
I tried that approach and abandoned it later. :)

>
> > +void p2m_finish_type_change(struct domain *d,
> > +                           unsigned long start, unsigned long end,
> > +                           p2m_type_t ot, p2m_type_t nt)
> > +{
> > +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> > +    p2m_type_t t;
> > +    unsigned long gfn = start;
> > +
> > +    ASSERT(start <= end);
> > +    ASSERT(ot != nt);
> > +    ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
> > +
> > +    p2m_lock(p2m);
> > +
> > +    end = (end > p2m->max_mapped_pfn) ? p2m->max_mapped_pfn : end;
>
> min() or an if() without else.

Got it. Thanks

>
> > +    while ( gfn <= end )
> > +    {
> > +        get_gfn_query_unlocked(d, gfn, &t);
> > +
> > +        if ( t == ot )
> > +            p2m_change_type_one(d, gfn, t, nt);
> > +
> > +        gfn++;
> > +    }
> > +
> > +    p2m_unlock(p2m);
> > +}
>
> And then I wonder why p2m_change_type_range() can't be used
> instead of adding this new function.
>

Well, like p2m_change_entry_type_global(), p2m_change_type_range()
also performs the change asynchronously, whereas this routine is
introduced to reset the p2m type synchronously.

> Jan

Thanks
Yu


* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-09  5:55     ` Yu Zhang
@ 2016-09-09  8:09       ` Jan Beulich
  2016-09-09  8:59         ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2016-09-09  8:09 UTC (permalink / raw)
  To: Yu Zhang
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, Zhiyuan Lv, Jun Nakajima

>>> On 09.09.16 at 07:55, <yu.c.zhang@linux.intel.com> wrote:

> 
> On 9/5/2016 9:31 PM, Jan Beulich wrote:
>>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>>>           break;
>>>       case X86EMUL_UNHANDLEABLE:
>>>       {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>> +        struct hvm_ioreq_server *s = NULL;
>>> +        p2m_type_t p2mt = p2m_invalid;
>>> +
>>> +        if ( is_mmio )
>>> +        {
>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>> +
>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>> +
>>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>>> +            {
>>> +                unsigned int flags;
>>> +
>>> +                s = p2m_get_ioreq_server(currd, &flags);
>>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>> +                    s = NULL;
>>> +            }
>>> +        }
>>> +
>>> +        if ( !s && p2mt != p2m_ioreq_server )
>>> +            s = hvm_select_ioreq_server(currd, &p);
>> What I recall is that we had agreed on p2m_ioreq_server pages
>> to be treated as ordinary RAM ones as long as no server can be
>> found. The type check here contradicts that. Is there a reason?
> 
> Thanks Jan. I had given my explaination on Sep 6's reply. :)
> If s is NULL for a p2m_ioreq_server page, we do not wish to traverse the 
> rangeset
> by hvm_select_ioreq_server() again, it may probably be a read emulation 
> process
> for a read-modify-write scenario.

Well, yes, I recall. Yet my request stands: At the very least there
should be a comment here outlining the intended behavior. That'll
make it quite a bit easier, I think, for the reader to figure whether
the code actually matches those intentions (and also help people
needing to touch this code again down the road).

Jan



* Re: [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages.
  2016-09-09  6:21     ` Yu Zhang
@ 2016-09-09  8:12       ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2016-09-09  8:12 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Xen-devel, George Dunlap, Zhiyuan Lv

>>> On 09.09.16 at 08:21, <yu.c.zhang@linux.intel.com> wrote:
> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>> >>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>> > +static const struct hvm_io_ops mem_ops = {
>> > +    .read = mem_read,
>> > +    .write = null_write
>> > +};
>> > +
>> > +static const struct hvm_io_handler mem_handler = {
>> > +    .ops = &mem_ops
>> > +};
>>
>> I think the mem_ prefix for both objects is a bad one, considering
>> that this isn't suitable for general memory handling.
> 
> How about ioreq_server_read/ops? It is only for this special p2m type.

SGTM.

>> And the comment ahead of the if() now also needs adjustment
>> (perhaps you want to merge the one you add into that one).
>>
> 
> OK. And IIUC, you mean merge to the original comments above the "if (!s)"?
> Like this:
>          /*
>           * For p2m_ioreq_server pages accessed with read-modify-write
>           * instructions, we provide a read handler to copy the data to
>           * the buffer. For other cases, if there is no suitable backing
>           * DM, we just ignore accesses.
>           */
>          if ( !s )

Yes, thanks.

Jan



* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09  7:24     ` Yu Zhang
@ 2016-09-09  8:20       ` Jan Beulich
  2016-09-09  9:24         ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2016-09-09  8:20 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Xen-devel, George Dunlap, Zhiyuan Lv

>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>> >>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>> > @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>> >      if ( is_epte_valid(ept_entry) )
>> >      {
>> >          if ( (recalc || ept_entry->recalc) &&
>> > -             p2m_is_changeable(ept_entry->sa_p2mt) )
>> > +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>> > +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>> >              *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? 
>> p2m_ram_logdirty
>> >                                                        : p2m_ram_rw;
>> >          else
>>
>> Considering this (and at least one more similar adjustment further
>> down), is it really appropriate to include p2m_ioreq_server in the
>> set of "changeable" types?
> 
> Well, I agree p2m_ioreq_server do have different behaviors than the
> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
> would need more specific code for the p2m_ioreq_server in routines like
> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
> I've tried this approach and abandoned later. :)

I can see that the other option might require more adjustments, in
which case I guess this variant would be fine if you created another
(well named) inline helper function instead of open coding this in
several places. Of course such dissimilar handling in the various
places p2m_is_changeable() gets used right now will additionally
require good reasoning - after all that predicate exists to express
the similarities between different code paths.
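E.g. something like (name only a placeholder):

/* Placeholder name - pick something suitably descriptive. */
static inline bool_t p2m_check_changeable(p2m_type_t t)
{
    return p2m_is_changeable(t) && t != p2m_ioreq_server;
}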

>> > +    while ( gfn <= end )
>> > +    {
>> > +        get_gfn_query_unlocked(d, gfn, &t);
>> > +
>> > +        if ( t == ot )
>> > +            p2m_change_type_one(d, gfn, t, nt);
>> > +
>> > +        gfn++;
>> > +    }
>> > +
>> > +    p2m_unlock(p2m);
>> > +}
>>
>> And then I wonder why p2m_change_type_range() can't be used
>> instead of adding this new function.
>>
> 
> Well, like p2m_change_entry_type_global() , p2m_change_type_range() also
> does the change asynchronously. And here this routine is introduced
> to synchronously reset the p2m type.

Ah, of course. Silly me.

Jan



* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-09  8:09       ` Jan Beulich
@ 2016-09-09  8:59         ` Yu Zhang
  0 siblings, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  8:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, George Dunlap, Andrew Cooper, Tim Deegan, xen-devel,
	Paul Durrant, Zhiyuan Lv, Jun Nakajima



On 9/9/2016 4:09 PM, Jan Beulich wrote:
>>>> On 09.09.16 at 07:55, <yu.c.zhang@linux.intel.com> wrote:
>> On 9/5/2016 9:31 PM, Jan Beulich wrote:
>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@linux.intel.com> wrote:
>>>> @@ -178,8 +179,27 @@ static int hvmemul_do_io(
>>>>            break;
>>>>        case X86EMUL_UNHANDLEABLE:
>>>>        {
>>>> -        struct hvm_ioreq_server *s =
>>>> -            hvm_select_ioreq_server(curr->domain, &p);
>>>> +        struct hvm_ioreq_server *s = NULL;
>>>> +        p2m_type_t p2mt = p2m_invalid;
>>>> +
>>>> +        if ( is_mmio )
>>>> +        {
>>>> +            unsigned long gmfn = paddr_to_pfn(addr);
>>>> +
>>>> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
>>>> +
>>>> +            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
>>>> +            {
>>>> +                unsigned int flags;
>>>> +
>>>> +                s = p2m_get_ioreq_server(currd, &flags);
>>>> +                if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
>>>> +                    s = NULL;
>>>> +            }
>>>> +        }
>>>> +
>>>> +        if ( !s && p2mt != p2m_ioreq_server )
>>>> +            s = hvm_select_ioreq_server(currd, &p);
>>> What I recall is that we had agreed on p2m_ioreq_server pages
>>> to be treated as ordinary RAM ones as long as no server can be
>>> found. The type check here contradicts that. Is there a reason?
>> Thanks Jan. I had given my explanation in the Sep 6 reply. :)
>> If s is NULL for a p2m_ioreq_server page, we do not wish to traverse the
>> rangeset by hvm_select_ioreq_server() again; it may well be the read
>> emulation pass of a read-modify-write scenario.
> Well, yes, I recall. Yet my request stands: At the very least there
> should be a comment here outlining the intended behavior. That'll
> make it quite a bit easier, I think, for the reader to figure whether
> the code actually matches those intentions (and also help people
> needing to touch this code again down the road).
>

OK. Thanks. I'll add a comment here. :)

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09  8:20       ` Jan Beulich
@ 2016-09-09  9:24         ` Yu Zhang
  2016-09-09  9:44           ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  9:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, George Dunlap, Zhiyuan Lv, Xen-devel



On 9/9/2016 4:20 PM, Jan Beulich wrote:
>>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
>> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>       if ( is_epte_valid(ept_entry) )
>>>>       {
>>>>           if ( (recalc || ept_entry->recalc) &&
>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>>>>               *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
>>> p2m_ram_logdirty
>>>>                                                         : p2m_ram_rw;
>>>>           else
>>> Considering this (and at least one more similar adjustment further
>>> down), is it really appropriate to include p2m_ioreq_server in the
>>> set of "changeable" types?
>> Well, I agree p2m_ioreq_server do have different behaviors than the
>> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
>> would need more specific code for the p2m_ioreq_server in routines like
>> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
>> I've tried this approach and abandoned later. :)
> I can see that the other option might require more adjustments, in
> which case I guess this variant would be fine if you created another
> helper (well named) inline function instead of open coding this in
> several places. Of course such dissimilar handling in the various
> places p2m_is_changeable() gets used right now will additionally
> require good reasoning - after all that predicate exists to express
> the similarities between different code paths.

Thanks Jan.
Well, for the logic of p2m type recalculation, the similarities between
p2m_ioreq_server and the other changeable types exceed their differences.
As to the special cases, how about we use a macro, say p2m_is_ioreq?


>
> Jan
>
>

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09  9:24         ` Yu Zhang
@ 2016-09-09  9:44           ` Jan Beulich
  2016-09-09  9:56             ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2016-09-09  9:44 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Xen-devel, George Dunlap, Zhiyuan Lv

>>> On 09.09.16 at 11:24, <yu.c.zhang@linux.intel.com> wrote:

> 
> On 9/9/2016 4:20 PM, Jan Beulich wrote:
>>>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
>>> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>       if ( is_epte_valid(ept_entry) )
>>>>>       {
>>>>>           if ( (recalc || ept_entry->recalc) &&
>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>>>>>               *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
>>>> p2m_ram_logdirty
>>>>>                                                         : p2m_ram_rw;
>>>>>           else
>>>> Considering this (and at least one more similar adjustment further
>>>> down), is it really appropriate to include p2m_ioreq_server in the
>>>> set of "changeable" types?
>>> Well, I agree p2m_ioreq_server do have different behaviors than the
>>> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
>>> would need more specific code for the p2m_ioreq_server in routines like
>>> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
>>> I've tried this approach and abandoned later. :)
>> I can see that the other option might require more adjustments, in
>> which case I guess this variant would be fine if you created another
>> helper (well named) inline function instead of open coding this in
>> several places. Of course such dissimilar handling in the various
>> places p2m_is_changeable() gets used right now will additionally
>> require good reasoning - after all that predicate exists to express
>> the similarities between different code paths.
> 
> Thanks Jan.
> Well, for the logic of p2m type recalculation, similarities between 
> p2m_ioreq_server
> and other changeable types exceeds their differences. As to the special 
> cases, how
> about we use a macro, i.e. p2m_is_ioreq?

That'd be better than the open coded check, but would still result
in (taking the above example)

             p2m_is_changeable(ept_entry->sa_p2mt) &&
             !p2m_is_ioreq(ept_entry->sa_p2mt) )

? What I'd prefer is a predicate that can be applied here on its own,
without involving && or ||.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09  9:44           ` Jan Beulich
@ 2016-09-09  9:56             ` Yu Zhang
  2016-09-09 10:09               ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-09  9:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Xen-devel, George Dunlap, Zhiyuan Lv



On 9/9/2016 5:44 PM, Jan Beulich wrote:
>>>> On 09.09.16 at 11:24, <yu.c.zhang@linux.intel.com> wrote:
>> On 9/9/2016 4:20 PM, Jan Beulich wrote:
>>>>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
>>>> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>>>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>>        if ( is_epte_valid(ept_entry) )
>>>>>>        {
>>>>>>            if ( (recalc || ept_entry->recalc) &&
>>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>>> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>>> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>>>>>>                *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
>>>>> p2m_ram_logdirty
>>>>>>                                                          : p2m_ram_rw;
>>>>>>            else
>>>>> Considering this (and at least one more similar adjustment further
>>>>> down), is it really appropriate to include p2m_ioreq_server in the
>>>>> set of "changeable" types?
>>>> Well, I agree p2m_ioreq_server do have different behaviors than the
>>>> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
>>>> would need more specific code for the p2m_ioreq_server in routines like
>>>> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
>>>> I've tried this approach and abandoned later. :)
>>> I can see that the other option might require more adjustments, in
>>> which case I guess this variant would be fine if you created another
>>> helper (well named) inline function instead of open coding this in
>>> several places. Of course such dissimilar handling in the various
>>> places p2m_is_changeable() gets used right now will additionally
>>> require good reasoning - after all that predicate exists to express
>>> the similarities between different code paths.
>> Thanks Jan.
>> Well, for the logic of p2m type recalculation, similarities between
>> p2m_ioreq_server
>> and other changeable types exceeds their differences. As to the special
>> cases, how
>> about we use a macro, i.e. p2m_is_ioreq?
> That'd be better than the open coded check, but would still result
> in (taking the above example)
>
>               p2m_is_changeable(ept_entry->sa_p2mt) &&
>               !p2m_is_ioreq(ept_entry->sa_p2mt) )
>
> ? What I'd prefer is a predicate that can be applied here on its own,
> without involving && or ||.
>

OK. I can think of 2 scenarios in which p2m_ioreq_server needs special
handling:

1> In ept_get_entry()/recalc_type(), the p2m type is supposed to be returned
as it is, instead of being changed to p2m_log_dirty. So we can use a macro or
an inline function like p2m_check_changeable(), which combines

              p2m_is_changeable(ept_entry->sa_p2mt) &&
              !p2m_is_ioreq(ept_entry->sa_p2mt)

into a single predicate (see the sketch below).

2> In resolve_misconfig()/do_recalc(), the entry_count gets decremented. We
do not need this new inline function there, because the checks are in a
separate if() statement.

Is this OK? :)
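
To make this concrete, a minimal sketch of what I have in mind for 1>,
reusing the existing p2m_to_mask()/p2m_is_changeable() helpers in
xen/include/asm-x86/p2m.h (the names here are only tentative):

    #define p2m_is_ioreq(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_ioreq_server))

    /*
     * The subset of "changeable" types that recalculation really rewrites
     * to logdirty/ram_rw; p2m_ioreq_server is left as it is.
     */
    static inline bool_t p2m_check_changeable(p2m_type_t t)
    {
        return p2m_is_changeable(t) && !p2m_is_ioreq(t);
    }

The check in ept_get_entry() would then simply become:

        if ( (recalc || ept_entry->recalc) &&
             p2m_check_changeable(ept_entry->sa_p2mt) )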

Thanks
Yu



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09 10:09               ` Jan Beulich
@ 2016-09-09 10:01                 ` Yu Zhang
  2016-09-20  2:57                 ` Yu Zhang
  1 sibling, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-09 10:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Xen-devel, George Dunlap, Zhiyuan Lv



On 9/9/2016 6:09 PM, Jan Beulich wrote:
>>>> On 09.09.16 at 11:56, <yu.c.zhang@linux.intel.com> wrote:
>> On 9/9/2016 5:44 PM, Jan Beulich wrote:
>>>>>> On 09.09.16 at 11:24, <yu.c.zhang@linux.intel.com> wrote:
>>>> On 9/9/2016 4:20 PM, Jan Beulich wrote:
>>>>>>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
>>>>>> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>>>>>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>>>>         if ( is_epte_valid(ept_entry) )
>>>>>>>>         {
>>>>>>>>             if ( (recalc || ept_entry->recalc) &&
>>>>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>>>>> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>>>>> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>>>>>>>>                 *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
>>>>>>> p2m_ram_logdirty
>>>>>>>>                                                           : p2m_ram_rw;
>>>>>>>>             else
>>>>>>> Considering this (and at least one more similar adjustment further
>>>>>>> down), is it really appropriate to include p2m_ioreq_server in the
>>>>>>> set of "changeable" types?
>>>>>> Well, I agree p2m_ioreq_server do have different behaviors than the
>>>>>> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
>>>>>> would need more specific code for the p2m_ioreq_server in routines like
>>>>>> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
>>>>>> I've tried this approach and abandoned later. :)
>>>>> I can see that the other option might require more adjustments, in
>>>>> which case I guess this variant would be fine if you created another
>>>>> helper (well named) inline function instead of open coding this in
>>>>> several places. Of course such dissimilar handling in the various
>>>>> places p2m_is_changeable() gets used right now will additionally
>>>>> require good reasoning - after all that predicate exists to express
>>>>> the similarities between different code paths.
>>>> Thanks Jan.
>>>> Well, for the logic of p2m type recalculation, similarities between
>>>> p2m_ioreq_server
>>>> and other changeable types exceeds their differences. As to the special
>>>> cases, how
>>>> about we use a macro, i.e. p2m_is_ioreq?
>>> That'd be better than the open coded check, but would still result
>>> in (taking the above example)
>>>
>>>                p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>                !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>>
>>> ? What I'd prefer is a predicate that can be applied here on its own,
>>> without involving && or ||.
>>>
>> OK. I can think of 2 scenarios that p2m_ioreq_server needs special handling:
>>
>> 1> In ept_get_entry()/recal_type(), the p2m types are supposed to return
>> as it is, instead of
>> changing to p2m_log_dirty. So we can use a macro or a inline function
>> like p2m_check_changeable(),
>> which combines the
>>
>>                p2m_is_changeable(ept_entry->sa_p2mt) &&
>>                !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>
>> together.
>>
>> 2> In resolve_misconfig()/do_recalc(), the entry_count gets decremented. We
>> do not need this new inline function, because they are in a separate
>> if() statement.
>>
>> Is this OK? :)
> Sounds reasonable. But please give George and others a chance to
> voice their opinions before you go that route.
>
>

Sure, thanks!

Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09  9:56             ` Yu Zhang
@ 2016-09-09 10:09               ` Jan Beulich
  2016-09-09 10:01                 ` Yu Zhang
  2016-09-20  2:57                 ` Yu Zhang
  0 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2016-09-09 10:09 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Xen-devel, George Dunlap, Zhiyuan Lv

>>> On 09.09.16 at 11:56, <yu.c.zhang@linux.intel.com> wrote:

> 
> On 9/9/2016 5:44 PM, Jan Beulich wrote:
>>>>> On 09.09.16 at 11:24, <yu.c.zhang@linux.intel.com> wrote:
>>> On 9/9/2016 4:20 PM, Jan Beulich wrote:
>>>>>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
>>>>> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>>>>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>>> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>>>        if ( is_epte_valid(ept_entry) )
>>>>>>>        {
>>>>>>>            if ( (recalc || ept_entry->recalc) &&
>>>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>>>> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>>>> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>>>>>>>                *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
>>>>>> p2m_ram_logdirty
>>>>>>>                                                          : p2m_ram_rw;
>>>>>>>            else
>>>>>> Considering this (and at least one more similar adjustment further
>>>>>> down), is it really appropriate to include p2m_ioreq_server in the
>>>>>> set of "changeable" types?
>>>>> Well, I agree p2m_ioreq_server do have different behaviors than the
>>>>> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
>>>>> would need more specific code for the p2m_ioreq_server in routines like
>>>>> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
>>>>> I've tried this approach and abandoned later. :)
>>>> I can see that the other option might require more adjustments, in
>>>> which case I guess this variant would be fine if you created another
>>>> helper (well named) inline function instead of open coding this in
>>>> several places. Of course such dissimilar handling in the various
>>>> places p2m_is_changeable() gets used right now will additionally
>>>> require good reasoning - after all that predicate exists to express
>>>> the similarities between different code paths.
>>> Thanks Jan.
>>> Well, for the logic of p2m type recalculation, similarities between
>>> p2m_ioreq_server
>>> and other changeable types exceeds their differences. As to the special
>>> cases, how
>>> about we use a macro, i.e. p2m_is_ioreq?
>> That'd be better than the open coded check, but would still result
>> in (taking the above example)
>>
>>               p2m_is_changeable(ept_entry->sa_p2mt) &&
>>               !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>
>> ? What I'd prefer is a predicate that can be applied here on its own,
>> without involving && or ||.
>>
> 
> OK. I can think of 2 scenarios that p2m_ioreq_server needs special handling:
> 
> 1> In ept_get_entry()/recal_type(), the p2m types are supposed to return 
> as it is, instead of
> changing to p2m_log_dirty. So we can use a macro or a inline function 
> like p2m_check_changeable(),
> which combines the
> 
>               p2m_is_changeable(ept_entry->sa_p2mt) &&
>               !p2m_is_ioreq(ept_entry->sa_p2mt) )
> 
> together.
> 
> 2> In resolve_misconfig()/do_recalc(), the entry_count gets decremented. We
> do not need this new inline function, because they are in a separate 
> if() statement.
> 
> Is this OK? :)

Sounds reasonable. But please give George and others a chance to
voice their opinions before you go that route.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-09 10:09               ` Jan Beulich
  2016-09-09 10:01                 ` Yu Zhang
@ 2016-09-20  2:57                 ` Yu Zhang
  2016-09-22 18:06                   ` George Dunlap
  1 sibling, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-20  2:57 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Zhiyuan Lv, Jan Beulich, Xen-devel



On 9/9/2016 6:09 PM, Jan Beulich wrote:
>>>> On 09.09.16 at 11:56, <yu.c.zhang@linux.intel.com> wrote:
>> On 9/9/2016 5:44 PM, Jan Beulich wrote:
>>>>>> On 09.09.16 at 11:24, <yu.c.zhang@linux.intel.com> wrote:
>>>> On 9/9/2016 4:20 PM, Jan Beulich wrote:
>>>>>>>> On 09.09.16 at 09:24, <yu.c.zhang@linux.intel.com> wrote:
>>>>>> On 9/9/2016 1:26 PM, Yu Zhang wrote:
>>>>>>>>>> On 02.09.16 at 12:47, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> @@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>>>>         if ( is_epte_valid(ept_entry) )
>>>>>>>>         {
>>>>>>>>             if ( (recalc || ept_entry->recalc) &&
>>>>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>>>>> +             p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>>>>> +             (ept_entry->sa_p2mt != p2m_ioreq_server) )
>>>>>>>>                 *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
>>>>>>> p2m_ram_logdirty
>>>>>>>>                                                           : p2m_ram_rw;
>>>>>>>>             else
>>>>>>> Considering this (and at least one more similar adjustment further
>>>>>>> down), is it really appropriate to include p2m_ioreq_server in the
>>>>>>> set of "changeable" types?
>>>>>> Well, I agree p2m_ioreq_server do have different behaviors than the
>>>>>> p2m_log_dirty, but removing p2m_ioreq_server from changeable type
>>>>>> would need more specific code for the p2m_ioreq_server in routines like
>>>>>> resolve_misconfig(), do_recalc() and p2m_change_entry_type_global() etc.
>>>>>> I've tried this approach and abandoned later. :)
>>>>> I can see that the other option might require more adjustments, in
>>>>> which case I guess this variant would be fine if you created another
>>>>> helper (well named) inline function instead of open coding this in
>>>>> several places. Of course such dissimilar handling in the various
>>>>> places p2m_is_changeable() gets used right now will additionally
>>>>> require good reasoning - after all that predicate exists to express
>>>>> the similarities between different code paths.
>>>> Thanks Jan.
>>>> Well, for the logic of p2m type recalculation, similarities between
>>>> p2m_ioreq_server
>>>> and other changeable types exceeds their differences. As to the special
>>>> cases, how
>>>> about we use a macro, i.e. p2m_is_ioreq?
>>> That'd be better than the open coded check, but would still result
>>> in (taking the above example)
>>>
>>>                p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>                !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>>
>>> ? What I'd prefer is a predicate that can be applied here on its own,
>>> without involving && or ||.
>>>
>> OK. I can think of 2 scenarios that p2m_ioreq_server needs special handling:
>>
>> 1> In ept_get_entry()/recal_type(), the p2m types are supposed to return
>> as it is, instead of
>> changing to p2m_log_dirty. So we can use a macro or a inline function
>> like p2m_check_changeable(),
>> which combines the
>>
>>                p2m_is_changeable(ept_entry->sa_p2mt) &&
>>                !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>
>> together.
>>
>> 2> In resolve_misconfig()/do_recalc(), the entry_count gets decremented. We
>> do not need this new inline function, because they are in a separate
>> if() statement.
>>
>> Is this OK? :)
> Sounds reasonable. But please give George and others a chance to
> voice their opinions before you go that route.
>
> Jan

Hi George,

   Any comments on this series? :)

Yu


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-09  5:51     ` Yu Zhang
@ 2016-09-21 13:04       ` George Dunlap
  2016-09-22  9:12         ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: George Dunlap @ 2016-09-21 13:04 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel

On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>> let one ioreq server claim/disclaim its responsibility for the
>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>> of this HVMOP can specify which kind of operation is supposed
>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>> only support the emulation of write operations. And it can be
>>> further extended to support the emulation of read ones if an
>>> ioreq server has such requirement in the future.
>>>
>>> For now, we only support one ioreq server for this p2m type, so
>>> once an ioreq server has claimed its ownership, subsequent calls
>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>> triggering this new HVMOP, with ioreq server id set to the current
>>> owner's and flags parameter set to 0.
>>>
>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>> are only supported for HVMs with HAP enabled.
>>>
>>> Also note that only after one ioreq server claims its ownership
>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>> be allowed.
>>>
>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>> Acked-by: Tim Deegan <tim@xen.org>
>>> ---
>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>> Cc: Tim Deegan <tim@xen.org>
>>>
>>> changes in v6:
>>>    - Clarify logic in hvmemul_do_io().
>>>    - Use recursive lock for ioreq server lock.
>>>    - Remove debug print when mapping ioreq server.
>>>    - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>    - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>    - Add comments for HVMMEM_ioreq_server to note only changes
>>>      to/from HVMMEM_ram_rw are permitted.
>>>    - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>>>      to avoid the race condition when a vm exit happens on a write-
>>>      protected page, just to find the ioreq server has been unmapped
>>>      already.
>>>    - Introduce a seperate patch to delay the release of p2m
>>>      lock to avoid the race condition.
>>>    - Introduce a seperate patch to handle the read-modify-write
>>>      operations on a write protected page.
>>>
>>
>> Why do we need to do this?  Won't the default case just DTRT if it finds
>> that the ioreq server has been unmapped?
>
>
> Well, patch 4 will either mark the remaining p2m_ioreq_server entries as
> "recal" or
> reset to p2m_ram_rw directly. So my understanding is that we do not wish to
> see a ept violation due to a p2m_ioreq_server access after the ioreq server
> is unmapped.
> Yet without this domain_pause/unpause() pair, VM accesses may trigger an ept
> violation
> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to find
> the ioreq
> server is NULL. Then we would have to provide handlers which just do the
> copy to/from
> actions for the VM. This seems awkward to me.

So the race you're worried about is this:

1. Guest fault happens
2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
3. guest  finds no ioreq server present

I think in that case the easiest thing to do would be to simply assume
there was a race and re-execute the instruction.  Is that not possible
for some reason?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-21 13:04       ` George Dunlap
@ 2016-09-22  9:12         ` Yu Zhang
  2016-09-22 11:32           ` George Dunlap
  2016-09-26  6:58           ` Yu Zhang
  0 siblings, 2 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-22  9:12 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel



On 9/21/2016 9:04 PM, George Dunlap wrote:
> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>> let one ioreq server claim/disclaim its responsibility for the
>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>> of this HVMOP can specify which kind of operation is supposed
>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>> only support the emulation of write operations. And it can be
>>>> further extended to support the emulation of read ones if an
>>>> ioreq server has such requirement in the future.
>>>>
>>>> For now, we only support one ioreq server for this p2m type, so
>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>> owner's and flags parameter set to 0.
>>>>
>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>> are only supported for HVMs with HAP enabled.
>>>>
>>>> Also note that only after one ioreq server claims its ownership
>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>> be allowed.
>>>>
>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>> ---
>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>> Cc: Tim Deegan <tim@xen.org>
>>>>
>>>> changes in v6:
>>>>     - Clarify logic in hvmemul_do_io().
>>>>     - Use recursive lock for ioreq server lock.
>>>>     - Remove debug print when mapping ioreq server.
>>>>     - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>     - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>     - Add comments for HVMMEM_ioreq_server to note only changes
>>>>       to/from HVMMEM_ram_rw are permitted.
>>>>     - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>>>>       to avoid the race condition when a vm exit happens on a write-
>>>>       protected page, just to find the ioreq server has been unmapped
>>>>       already.
>>>>     - Introduce a seperate patch to delay the release of p2m
>>>>       lock to avoid the race condition.
>>>>     - Introduce a seperate patch to handle the read-modify-write
>>>>       operations on a write protected page.
>>>>
>>> Why do we need to do this?  Won't the default case just DTRT if it finds
>>> that the ioreq server has been unmapped?
>>
>> Well, patch 4 will either mark the remaining p2m_ioreq_server entries as
>> "recal" or
>> reset to p2m_ram_rw directly. So my understanding is that we do not wish to
>> see a ept violation due to a p2m_ioreq_server access after the ioreq server
>> is unmapped.
>> Yet without this domain_pause/unpause() pair, VM accesses may trigger an ept
>> violation
>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to find
>> the ioreq
>> server is NULL. Then we would have to provide handlers which just do the
>> copy to/from
>> actions for the VM. This seems awkward to me.
> So the race you're worried about is this:
>
> 1. Guest fault happens
> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
> 3. guest  finds no ioreq server present
>
> I think in that case the easiest thing to do would be to simply assume
> there was a race and re-execute the instruction.  Is that not possible
> for some reason?
>
>   -George

Thanks for your reply, George. :)
There are two reasons why I'd like to use the domain_pause/unpause() to avoid
the race condition:

1> Like my previous explanation, in the read-modify-write scenario the ioreq
server will be NULL for the read emulation. But in such a case the hypervisor
will not discard this trap; instead it is supposed to do the copy work for the
read access. So it would be difficult for the hypervisor to decide whether the
ioreq server was detached due to a race condition, or whether the ioreq server
should be NULL because we are emulating the read operation first for a
read-modify-write instruction.

2> I also realized it can avoid a deadlock possibility (sketched below) -
     a> dom0 triggers map_mem_type_to_ioreq_server, which spin locks the
          ioreq_server.lock, and before routine
          hvm_map_mem_type_to_ioreq_server() returns,
     b> the HVM guest triggers an ept violation, and the p2m lock is taken by
          hvm_hap_nested_page_fault();
     c> the hypervisor continues to the I/O handler, which needs the
          ioreq_server.lock to select an ioreq server, but that lock is still
          held by the hypercall handler side;
     d> likewise, map_mem_type_to_ioreq_server will run into the same problem
          when it tries to get the p2m lock for p2m_change_entry_type_global().
    With a domain_pause/unpause(), we could avoid the ept violation during
this period (which I believe won't be long, because this pair only covers the
p2m_change_entry_type_global() call, not the p2m_finish_type_change() call).
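
As a stand-alone illustration of that lock-ordering cycle, here is a minimal
sketch in which plain pthread mutexes stand in for the ioreq server lock and
the p2m lock (only an illustration of the ABBA pattern, not the actual Xen
code or locks):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t ioreq_server_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t p2m_lock          = PTHREAD_MUTEX_INITIALIZER;

    /* a>/d>: hypercall path - ioreq server lock first, then the p2m lock. */
    static void *hypercall_path(void *arg)
    {
        pthread_mutex_lock(&ioreq_server_lock);   /* a> */
        usleep(1000);                             /* widen the race window */
        pthread_mutex_lock(&p2m_lock);            /* d> blocks if b> already ran */
        /* ... the global type change would happen here ... */
        pthread_mutex_unlock(&p2m_lock);
        pthread_mutex_unlock(&ioreq_server_lock);
        return NULL;
    }

    /* b>/c>: fault path - p2m lock first, then the ioreq server lock. */
    static void *fault_path(void *arg)
    {
        pthread_mutex_lock(&p2m_lock);            /* b> */
        usleep(1000);
        pthread_mutex_lock(&ioreq_server_lock);   /* c> blocks if a> already ran */
        /* ... ioreq server selection would happen here ... */
        pthread_mutex_unlock(&ioreq_server_lock);
        pthread_mutex_unlock(&p2m_lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        pthread_create(&t1, NULL, hypercall_path, NULL);
        pthread_create(&t2, NULL, fault_path, NULL);
        /* With unlucky timing both threads block forever: an ABBA deadlock. */
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("no deadlock on this run\n");
        return 0;
    }

Making both paths take the two locks in the same order (or pausing the domain,
as this patch does) removes the cycle.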

Thanks
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-22  9:12         ` Yu Zhang
@ 2016-09-22 11:32           ` George Dunlap
  2016-09-22 16:02             ` Yu Zhang
  2016-09-26  6:58           ` Yu Zhang
  1 sibling, 1 reply; 40+ messages in thread
From: George Dunlap @ 2016-09-22 11:32 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel

[-- Attachment #1: Type: text/plain, Size: 4772 bytes --]

On Thu, Sep 22, 2016 at 10:12 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>
>
> On 9/21/2016 9:04 PM, George Dunlap wrote:
>>
>> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com>
>> wrote:
>>>>
>>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>>>
>>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>>> let one ioreq server claim/disclaim its responsibility for the
>>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>>> of this HVMOP can specify which kind of operation is supposed
>>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>>> only support the emulation of write operations. And it can be
>>>>> further extended to support the emulation of read ones if an
>>>>> ioreq server has such requirement in the future.
>>>>>
>>>>> For now, we only support one ioreq server for this p2m type, so
>>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>>> owner's and flags parameter set to 0.
>>>>>
>>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>>> are only supported for HVMs with HAP enabled.
>>>>>
>>>>> Also note that only after one ioreq server claims its ownership
>>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>>> be allowed.
>>>>>
>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>>> ---
>>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>>> Cc: Tim Deegan <tim@xen.org>
>>>>>
>>>>> changes in v6:
>>>>>     - Clarify logic in hvmemul_do_io().
>>>>>     - Use recursive lock for ioreq server lock.
>>>>>     - Remove debug print when mapping ioreq server.
>>>>>     - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>>     - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>>     - Add comments for HVMMEM_ioreq_server to note only changes
>>>>>       to/from HVMMEM_ram_rw are permitted.
>>>>>     - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>>>>>       to avoid the race condition when a vm exit happens on a write-
>>>>>       protected page, just to find the ioreq server has been unmapped
>>>>>       already.
>>>>>     - Introduce a seperate patch to delay the release of p2m
>>>>>       lock to avoid the race condition.
>>>>>     - Introduce a seperate patch to handle the read-modify-write
>>>>>       operations on a write protected page.
>>>>>
>>>> Why do we need to do this?  Won't the default case just DTRT if it finds
>>>> that the ioreq server has been unmapped?
>>>
>>>
>>> Well, patch 4 will either mark the remaining p2m_ioreq_server entries as
>>> "recal" or
>>> reset to p2m_ram_rw directly. So my understanding is that we do not wish
>>> to
>>> see a ept violation due to a p2m_ioreq_server access after the ioreq
>>> server
>>> is unmapped.
>>> Yet without this domain_pause/unpause() pair, VM accesses may trigger an
>>> ept
>>> violation
>>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to
>>> find
>>> the ioreq
>>> server is NULL. Then we would have to provide handlers which just do the
>>> copy to/from
>>> actions for the VM. This seems awkward to me.
>>
>> So the race you're worried about is this:
>>
>> 1. Guest fault happens
>> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
>> 3. guest  finds no ioreq server present
>>
>> I think in that case the easiest thing to do would be to simply assume
>> there was a race and re-execute the instruction.  Is that not possible
>> for some reason?
>>
>>   -George
>
>
> Thanks for your reply, George. :)
> Two reasons I'd like to use the domain_pause/unpause() to avoid the race
> condition:
>
> 1>  Like my previous explanation, in the read-modify-write scenario, the
> ioreq server will
> be NULL for the read emulation. But in such case, hypervisor will not
> discard this trap, instead
> it is supposed to do the copy work for the read access. So it would be
> difficult for hypervisor
> to decide if the ioreq server was detached due to a race condition, or if
> the ioreq server should
> be a NULL because we are emulating a read operation first for a
> read-modify-write instruction.

Wouldn't a patch like the attached work (applied on top of the whole series)?

 -George

[-- Attachment #2: 0001-x86-hvm-Handle-both-ioreq-detach-races-and-read-modi.patch --]
[-- Type: text/x-diff, Size: 4162 bytes --]

From 4e4b14641cd94c0c9fe64606b329cdbbf9c6a92b Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Thu, 22 Sep 2016 12:30:26 +0100
Subject: [PATCH] x86/hvm: Handle both ioreq detach races and read-modify-write
 instructions

Compile-tested only.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/arch/x86/hvm/emulate.c | 68 ++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 56 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 564c117..120ef86 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -214,40 +214,84 @@ static int hvmemul_do_io(
         break;
     case X86EMUL_UNHANDLEABLE:
     {
+        /* 
+         * Xen isn't emulating the instruction internally, so see if
+         * there's an ioreq server that can handle it.  Rules:
+         *
+         * - PIO and "normal" mmio run through
+         * hvm_select_ioreq_server() to choose the ioreq server by
+         * range.  If no server is found, the access is ignored.
+         *
+         * - p2m_ioreq_server accesses are handled by the current
+         * ioreq_server for the domain, but there are some corner
+         * cases:
+         *
+         *   - If the domain ioreq_server is NULL, assume this is a
+         *   race between unhooking the ioreq server and
+         *   p2m_type_change and re-try the instruction.
+         *
+         *   - If the IOREQ_MEM_ACCESS_WRITE flag is not set, treat it
+         *   like a normal PIO or MMIO that doesn't have an ioreq
+         *   server (i.e., by ignoring it).
+         *
+         *   - If the access is a read, this must be part of a
+         *   read-modify-write instruction; emulate the read so that we have
+         *   it
+         */
+
         struct hvm_ioreq_server *s = NULL;
         p2m_type_t p2mt = p2m_invalid;
-
+        
         if ( is_mmio )
         {
             unsigned long gmfn = paddr_to_pfn(addr);
 
             (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
 
-            if ( p2mt == p2m_ioreq_server && dir == IOREQ_WRITE )
+            if ( p2mt == p2m_ioreq_server )
             {
                 unsigned int flags;
 
                 s = p2m_get_ioreq_server(currd, &flags);
+
+                /* 
+                 * If p2mt is ioreq_server but ioreq_server is NULL,
+                 * we probably lost a race with p2m_type_change; just
+                 * retry the access.
+                 */
+                if ( s == NULL )
+                {
+                    rc = X86EMUL_RETRY;
+                    vio->io_req.state = STATE_IOREQ_NONE;
+                    break;
+                }
+
+                /* 
+                 * This is part of a read-modify-write instruction.
+                 * Emulate the read part so we have the value cached.
+                 */
+                if ( dir == IOREQ_READ )
+                {
+                    rc = hvm_process_io_intercept(&mem_handler, &p);
+                    vio->io_req.state = STATE_IOREQ_NONE;
+                    break;
+                }
+
+                
                 if ( !(flags & XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE) )
+                {
                     s = NULL;
+                }
             }
         }
 
         if ( !s && p2mt != p2m_ioreq_server )
             s = hvm_select_ioreq_server(currd, &p);
 
-        /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            /*
-             * For p2m_ioreq_server pages accessed with read-modify-write
-             * instructions, we provide a read handler to copy the data to
-             * the buffer.
-             */
-            if ( p2mt == p2m_ioreq_server )
-                rc = hvm_process_io_intercept(&mem_handler, &p);
-            else
-                rc = hvm_process_io_intercept(&null_handler, &p);
+            /* If there is no suitable backing DM, just ignore accesses */
+            rc = hvm_process_io_intercept(&null_handler, &p);
             vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
-- 
2.1.4


[-- Attachment #3: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-22 11:32           ` George Dunlap
@ 2016-09-22 16:02             ` Yu Zhang
  2016-09-23 10:35               ` George Dunlap
  0 siblings, 1 reply; 40+ messages in thread
From: Yu Zhang @ 2016-09-22 16:02 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel



On 9/22/2016 7:32 PM, George Dunlap wrote:
> On Thu, Sep 22, 2016 at 10:12 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>>
>> On 9/21/2016 9:04 PM, George Dunlap wrote:
>>> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com>
>>> wrote:
>>>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>>>> let one ioreq server claim/disclaim its responsibility for the
>>>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>>>> of this HVMOP can specify which kind of operation is supposed
>>>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>>>> only support the emulation of write operations. And it can be
>>>>>> further extended to support the emulation of read ones if an
>>>>>> ioreq server has such requirement in the future.
>>>>>>
>>>>>> For now, we only support one ioreq server for this p2m type, so
>>>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>>>> owner's and flags parameter set to 0.
>>>>>>
>>>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>>>> are only supported for HVMs with HAP enabled.
>>>>>>
>>>>>> Also note that only after one ioreq server claims its ownership
>>>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>>>> be allowed.
>>>>>>
>>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>>>> ---
>>>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>>>> Cc: Tim Deegan <tim@xen.org>
>>>>>>
>>>>>> changes in v6:
>>>>>>      - Clarify logic in hvmemul_do_io().
>>>>>>      - Use recursive lock for ioreq server lock.
>>>>>>      - Remove debug print when mapping ioreq server.
>>>>>>      - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>>>      - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>>>      - Add comments for HVMMEM_ioreq_server to note only changes
>>>>>>        to/from HVMMEM_ram_rw are permitted.
>>>>>>      - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server()
>>>>>>        to avoid the race condition when a vm exit happens on a write-
>>>>>>        protected page, just to find the ioreq server has been unmapped
>>>>>>        already.
>>>>>>      - Introduce a seperate patch to delay the release of p2m
>>>>>>        lock to avoid the race condition.
>>>>>>      - Introduce a seperate patch to handle the read-modify-write
>>>>>>        operations on a write protected page.
>>>>>>
>>>>> Why do we need to do this?  Won't the default case just DTRT if it finds
>>>>> that the ioreq server has been unmapped?
>>>>
>>>> Well, patch 4 will either mark the remaining p2m_ioreq_server entries as
>>>> "recal" or
>>>> reset to p2m_ram_rw directly. So my understanding is that we do not wish
>>>> to
>>>> see a ept violation due to a p2m_ioreq_server access after the ioreq
>>>> server
>>>> is unmapped.
>>>> Yet without this domain_pause/unpause() pair, VM accesses may trigger an
>>>> ept
>>>> violation
>>>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to
>>>> find
>>>> the ioreq
>>>> server is NULL. Then we would have to provide handlers which just do the
>>>> copy to/from
>>>> actions for the VM. This seems awkward to me.
>>> So the race you're worried about is this:
>>>
>>> 1. Guest fault happens
>>> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
>>> 3. guest  finds no ioreq server present
>>>
>>> I think in that case the easiest thing to do would be to simply assume
>>> there was a race and re-execute the instruction.  Is that not possible
>>> for some reason?
>>>
>>>    -George
>>
>> Thanks for your reply, George. :)
>> Two reasons I'd like to use the domain_pause/unpause() to avoid the race
>> condition:
>>
>> 1>  Like my previous explanation, in the read-modify-write scenario, the
>> ioreq server will
>> be NULL for the read emulation. But in such case, hypervisor will not
>> discard this trap, instead
>> it is supposed to do the copy work for the read access. So it would be
>> difficult for hypervisor
>> to decide if the ioreq server was detached due to a race condition, or if
>> the ioreq server should
>> be a NULL because we are emulating a read operation first for a
>> read-modify-write instruction.
> Wouldn't a patch like the attached work (applied on top of the whole series)?

Thanks for your patch, George. I think it should work for 1>. But we still
have the deadlock problem. :)

BTW, do you think a domain_pause will cause any new problems?

B.R.
Yu

>   -George
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-20  2:57                 ` Yu Zhang
@ 2016-09-22 18:06                   ` George Dunlap
  2016-09-23  1:31                     ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: George Dunlap @ 2016-09-22 18:06 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Zhiyuan Lv, Jan Beulich, Xen-devel

On Tue, Sep 20, 2016 at 3:57 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>>>>> Well, for the logic of p2m type recalculation, similarities between
>>>>> p2m_ioreq_server
>>>>> and other changeable types exceeds their differences. As to the special
>>>>> cases, how
>>>>> about we use a macro, i.e. p2m_is_ioreq?
>>>>
>>>> That'd be better than the open coded check, but would still result
>>>> in (taking the above example)
>>>>
>>>>                p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>                !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>>>
>>>> ? What I'd prefer is a predicate that can be applied here on its own,
>>>> without involving && or ||.
>>>>
>>> OK. I can think of 2 scenarios that p2m_ioreq_server needs special
>>> handling:
>>>
>>> 1> In ept_get_entry()/recal_type(), the p2m types are supposed to return
>>> as it is, instead of
>>> changing to p2m_log_dirty. So we can use a macro or a inline function
>>> like p2m_check_changeable(),
>>> which combines the
>>>
>>>                p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>                !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>>
>>> together.
>>>
>>> 2> In resolve_misconfig()/do_recalc(), the entry_count gets decremented.
>>> We
>>> do not need this new inline function, because they are in a separate
>>> if() statement.
>>>
>>> Is this OK? :)
>>
>> Sounds reasonable. But please give George and others a chance to
>> voice their opinions before you go that route.
>>
>> Jan
>
>
> Hi George,
>
>   Any comments on this series? :)

Well regarding the question you and Jan have been discussing, of what
to call / how to do the checks for changeable-but-not-ioreq, I don't
really care very much. :-)

I'm sorry it's taking so long to look at this series -- the code
you're trying to modify is already a bit of a tangled mess, and I
think this patch has a ways to go before it's ready.  I do think this
series is important, so I'll be coming back to it first thing Monday.

Regarding the pausing added in this patch -- you listed two reasons in
the first patch for the domain pausing.  The first was detecting
read-modify-write and acting appropriately; I think that can be done
with the patch that I sent you.

The second was the deadlock due to grabbing locks in a different
order.  I'm afraid having such a thing lying around, even if you've
avoided it for now by pausing the domain, is another major trap that
we should try to avoid laying for future developers if we can at all
help it.  The normal thing to do here is to simply have a locking
discipline -- in this case, I don't think it would be too difficult to
work out a locking order that would avoid the deadlock in a more
robust way than pausing and unpausing the domain.

With those two handled, we shouldn't need to pause the domain any more.

Thanks for your work on this -- I'll get back to patch 4/4 next week.

Peace,
 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
  2016-09-22 18:06                   ` George Dunlap
@ 2016-09-23  1:31                     ` Yu Zhang
  0 siblings, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-23  1:31 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Zhiyuan Lv, Jan Beulich, Xen-devel



On 9/23/2016 2:06 AM, George Dunlap wrote:
> On Tue, Sep 20, 2016 at 3:57 AM, Yu Zhang <yu.c.zhang@linux.intel.com> wrote:
>>>>>> Well, for the logic of p2m type recalculation, similarities between
>>>>>> p2m_ioreq_server
>>>>>> and other changeable types exceeds their differences. As to the special
>>>>>> cases, how
>>>>>> about we use a macro, i.e. p2m_is_ioreq?
>>>>> That'd be better than the open coded check, but would still result
>>>>> in (taking the above example)
>>>>>
>>>>>                 p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>>                 !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>>>>
>>>>> ? What I'd prefer is a predicate that can be applied here on its own,
>>>>> without involving && or ||.
>>>>>
>>>> OK. I can think of 2 scenarios that p2m_ioreq_server needs special
>>>> handling:
>>>>
>>>> 1> In ept_get_entry()/recal_type(), the p2m types are supposed to return
>>>> as it is, instead of
>>>> changing to p2m_log_dirty. So we can use a macro or a inline function
>>>> like p2m_check_changeable(),
>>>> which combines the
>>>>
>>>>                 p2m_is_changeable(ept_entry->sa_p2mt) &&
>>>>                 !p2m_is_ioreq(ept_entry->sa_p2mt) )
>>>>
>>>> together.
>>>>
>>>> 2> In resolve_misconfig()/do_recalc(), the entry_count gets decremented.
>>>> We
>>>> do not need this new inline function, because they are in a separate
>>>> if() statement.
>>>>
>>>> Is this OK? :)
>>> Sounds reasonable. But please give George and others a chance to
>>> voice their opinions before you go that route.
>>>
>>> Jan
>>
>> Hi George,
>>
>>    Any comments on this series? :)
> Well regarding the question you and Jan have been discussing, of what
> to call / how to do the checks for changeable-but-not-ioreq, I don't
> really care very much. :-)
>
> I'm sorry it's taking so long to look at this series -- the code
> you're trying to modify is already a bit of a tangled mess, and I
> think this patch has a ways to go before it's ready.  I do think this
> series is important, so I'll be coming back to it first thing Monday.
>
> Regarding the pausing added in this patch -- you listed two reasons in
> the first patch for the domain pausing.  The first was detecting
> read-modify-write and acting appropriately; I think that can be done
> with the patch that I sent you.
>
> The second was the deadlock due to grabbing locks in a different
> order.  I'm afraid having such a thing lying around, even if you've
> avoided it for now by pausing the domain, is another major trap that
> we should try to avoid laying for future developers if we can at all
> help it.  The normal thing to do here is to simply have a locking
> discipline -- in this case, I don't think it would be too difficult to
> work out a locking order that would avoid the deadlock in a more
> robust way than pausing and unpausing the domain.
>
> With those two handled, we shouldn't need to pause the domain any more.

Thank you, George. Hope we can find a more elegant approach. :-)
B.R.
Yu

> Thanks for your work on this -- I'll get back to patch 4/4 next week.
>
> Peace,
>   -George
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-22 16:02             ` Yu Zhang
@ 2016-09-23 10:35               ` George Dunlap
  2016-09-26  6:57                 ` Yu Zhang
  0 siblings, 1 reply; 40+ messages in thread
From: George Dunlap @ 2016-09-23 10:35 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel

On 22/09/16 17:02, Yu Zhang wrote:
> 
> 
> On 9/22/2016 7:32 PM, George Dunlap wrote:
>> On Thu, Sep 22, 2016 at 10:12 AM, Yu Zhang
>> <yu.c.zhang@linux.intel.com> wrote:
>>>
>>> On 9/21/2016 9:04 PM, George Dunlap wrote:
>>>> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com>
>>>> wrote:
>>>>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>>>>> let one ioreq server claim/disclaim its responsibility for the
>>>>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>>>>> of this HVMOP can specify which kind of operation is supposed
>>>>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>>>>> only support the emulation of write operations. And it can be
>>>>>>> further extended to support the emulation of read ones if an
>>>>>>> ioreq server has such requirement in the future.
>>>>>>>
>>>>>>> For now, we only support one ioreq server for this p2m type, so
>>>>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>>>>> owner's and flags parameter set to 0.
>>>>>>>
>>>>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>>>>> are only supported for HVMs with HAP enabled.
>>>>>>>
>>>>>>> Also note that only after one ioreq server claims its ownership
>>>>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>>>>> be allowed.
>>>>>>>
>>>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>>>>> ---
>>>>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>>>>> Cc: Tim Deegan <tim@xen.org>
>>>>>>>
>>>>>>> changes in v6:
>>>>>>>      - Clarify logic in hvmemul_do_io().
>>>>>>>      - Use recursive lock for ioreq server lock.
>>>>>>>      - Remove debug print when mapping ioreq server.
>>>>>>>      - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>>>>      - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>>>>      - Add comments for HVMMEM_ioreq_server to note only changes
>>>>>>>        to/from HVMMEM_ram_rw are permitted.
>>>>>>>      - Add domain_pause/unpause() in
>>>>>>> hvm_map_mem_type_to_ioreq_server()
>>>>>>>        to avoid the race condition when a vm exit happens on a
>>>>>>> write-
>>>>>>>        protected page, just to find the ioreq server has been
>>>>>>> unmapped
>>>>>>>        already.
>>>>>>>      - Introduce a seperate patch to delay the release of p2m
>>>>>>>        lock to avoid the race condition.
>>>>>>>      - Introduce a seperate patch to handle the read-modify-write
>>>>>>>        operations on a write protected page.
>>>>>>>
>>>>>> Why do we need to do this?  Won't the default case just DTRT if it
>>>>>> finds
>>>>>> that the ioreq server has been unmapped?
>>>>>
>>>>> Well, patch 4 will either mark the remaining p2m_ioreq_server
>>>>> entries as
>>>>> "recal" or
>>>>> reset to p2m_ram_rw directly. So my understanding is that we do not
>>>>> wish
>>>>> to
>>>>> see a ept violation due to a p2m_ioreq_server access after the ioreq
>>>>> server
>>>>> is unmapped.
>>>>> Yet without this domain_pause/unpause() pair, VM accesses may
>>>>> trigger an
>>>>> ept
>>>>> violation
>>>>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to
>>>>> find
>>>>> the ioreq
>>>>> server is NULL. Then we would have to provide handlers which just
>>>>> do the
>>>>> copy to/from
>>>>> actions for the VM. This seems awkward to me.
>>>> So the race you're worried about is this:
>>>>
>>>> 1. Guest fault happens
>>>> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
>>>> 3. guest  finds no ioreq server present
>>>>
>>>> I think in that case the easiest thing to do would be to simply assume
>>>> there was a race and re-execute the instruction.  Is that not possible
>>>> for some reason?
>>>>
>>>>    -George
>>>
>>> Thanks for your reply, George. :)
>>> Two reasons I'd like to use the domain_pause/unpause() to avoid the race
>>> condition:
>>>
>>> 1>  Like my previous explanation, in the read-modify-write scenario, the
>>> ioreq server will
>>> be NULL for the read emulation. But in such case, hypervisor will not
>>> discard this trap, instead
>>> it is supposed to do the copy work for the read access. So it would be
>>> difficult for hypervisor
>>> to decide if the ioreq server was detached due to a race condition,
>>> or if
>>> the ioreq server should
>>> be a NULL because we are emulating a read operation first for a
>>> read-modify-write instruction.
>> Wouldn't a patch like the attached work (applied on top of the whole
>> series)?
> 
> Thanks for your patch, George. I think it should work for 1>. But we
> still have the dead lock
> problem.  :)
> 
> BTW, do you think a domain_pause will cause any new problem?

Well, using a "big hammer" like domain_pause in a case like this
usually indicates that there are other issues that aren't being solved
properly -- for instance, latent deadlocks or unhandled race conditions.
:-)  Leaving those issues around in the codebase but "papered over" by
domain_pause is storing up technical debt that future generations will
inherit and need to untangle.  (Particularly as, in this case, there was
no comment *in the code* explaining what problems the domain_pause was
there to solve, so anyone wanting to remove it would have to work all of
that out for themselves.)

domain_pause is relatively expensive (since you have to spin waiting
for all the vcpus to finish running) and completely stops the domain
from handling interrupts or anything else for an arbitrary amount of
time.  As long as it doesn't happen often, the cost shouldn't be a major
issue; and as long as the domain_pause is short (less than 100ms), it
shouldn't cause any more problems than running on a fairly busy system
would.
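
(Roughly speaking -- a simplified sketch from memory, not the exact code
in xen/common/domain.c -- the pause boils down to something like this,
and the expensive part is the synchronous per-vcpu wait:)

void domain_pause(struct domain *d)
{
    struct vcpu *v;

    atomic_inc(&d->pause_count);

    for_each_vcpu ( d, v )
        vcpu_sleep_sync(v);   /* spins until v is descheduled */
}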

On the other hand, if every time we ran into a tricky situation we just
did a domain pause rather than solving the root issue, pretty soon the
domain would be paused several times per second.  The performance would
plummet, and fixing it would be a nightmare because you'd have hundreds
of undocumented issues to try to understand and fix.

So: the domain_pause itself isn't terrible (although it's better to
avoid it if we can); what's more of a problem is the potential issues
that it's hiding.  These issues can add up, so it's important to push
back and ask "why do we need this and can we solve it a different way"
pro-actively, as patches come in, rather than waiting until it becomes
an issue.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-23 10:35               ` George Dunlap
@ 2016-09-26  6:57                 ` Yu Zhang
  0 siblings, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-26  6:57 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel



On 9/23/2016 6:35 PM, George Dunlap wrote:
> On 22/09/16 17:02, Yu Zhang wrote:
>>
>> On 9/22/2016 7:32 PM, George Dunlap wrote:
>>> On Thu, Sep 22, 2016 at 10:12 AM, Yu Zhang
>>> <yu.c.zhang@linux.intel.com> wrote:
>>>> On 9/21/2016 9:04 PM, George Dunlap wrote:
>>>>> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>> wrote:
>>>>>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>>>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>>>>>> let one ioreq server claim/disclaim its responsibility for the
>>>>>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>>>>>> of this HVMOP can specify which kind of operation is supposed
>>>>>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>>>>>> only support the emulation of write operations. And it can be
>>>>>>>> further extended to support the emulation of read ones if an
>>>>>>>> ioreq server has such requirement in the future.
>>>>>>>>
>>>>>>>> For now, we only support one ioreq server for this p2m type, so
>>>>>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>>>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>>>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>>>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>>>>>> owner's and flags parameter set to 0.
>>>>>>>>
>>>>>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>>>>>> are only supported for HVMs with HAP enabled.
>>>>>>>>
>>>>>>>> Also note that only after one ioreq server claims its ownership
>>>>>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>>>>>> be allowed.
>>>>>>>>
>>>>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>>>>>> ---
>>>>>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>>>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>>>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>>>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>>>>>> Cc: Tim Deegan <tim@xen.org>
>>>>>>>>
>>>>>>>> changes in v6:
>>>>>>>>       - Clarify logic in hvmemul_do_io().
>>>>>>>>       - Use recursive lock for ioreq server lock.
>>>>>>>>       - Remove debug print when mapping ioreq server.
>>>>>>>>       - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>>>>>       - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>>>>>       - Add comments for HVMMEM_ioreq_server to note only changes
>>>>>>>>         to/from HVMMEM_ram_rw are permitted.
>>>>>>>>       - Add domain_pause/unpause() in
>>>>>>>> hvm_map_mem_type_to_ioreq_server()
>>>>>>>>         to avoid the race condition when a vm exit happens on a
>>>>>>>> write-
>>>>>>>>         protected page, just to find the ioreq server has been
>>>>>>>> unmapped
>>>>>>>>         already.
>>>>>>>>       - Introduce a seperate patch to delay the release of p2m
>>>>>>>>         lock to avoid the race condition.
>>>>>>>>       - Introduce a seperate patch to handle the read-modify-write
>>>>>>>>         operations on a write protected page.
>>>>>>>>
>>>>>>> Why do we need to do this?  Won't the default case just DTRT if it
>>>>>>> finds
>>>>>>> that the ioreq server has been unmapped?
>>>>>> Well, patch 4 will either mark the remaining p2m_ioreq_server
>>>>>> entries as
>>>>>> "recal" or
>>>>>> reset to p2m_ram_rw directly. So my understanding is that we do not
>>>>>> wish
>>>>>> to
>>>>>> see a ept violation due to a p2m_ioreq_server access after the ioreq
>>>>>> server
>>>>>> is unmapped.
>>>>>> Yet without this domain_pause/unpause() pair, VM accesses may
>>>>>> trigger an
>>>>>> ept
>>>>>> violation
>>>>>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to
>>>>>> find
>>>>>> the ioreq
>>>>>> server is NULL. Then we would have to provide handlers which just
>>>>>> do the
>>>>>> copy to/from
>>>>>> actions for the VM. This seems awkward to me.
>>>>> So the race you're worried about is this:
>>>>>
>>>>> 1. Guest fault happens
>>>>> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
>>>>> 3. guest  finds no ioreq server present
>>>>>
>>>>> I think in that case the easiest thing to do would be to simply assume
>>>>> there was a race and re-execute the instruction.  Is that not possible
>>>>> for some reason?
>>>>>
>>>>>     -George
>>>> Thanks for your reply, George. :)
>>>> Two reasons I'd like to use the domain_pause/unpause() to avoid the race
>>>> condition:
>>>>
>>>> 1>  Like my previous explanation, in the read-modify-write scenario, the
>>>> ioreq server will
>>>> be NULL for the read emulation. But in such case, hypervisor will not
>>>> discard this trap, instead
>>>> it is supposed to do the copy work for the read access. So it would be
>>>> difficult for hypervisor
>>>> to decide if the ioreq server was detached due to a race condition,
>>>> or if
>>>> the ioreq server should
>>>> be a NULL because we are emulating a read operation first for a
>>>> read-modify-write instruction.
>>> Wouldn't a patch like the attached work (applied on top of the whole
>>> series)?
>> Thanks for your patch, George. I think it should work for 1>. But we
>> still have the dead lock
>> problem.  :)
>>
>> BTW, do you think a domain_pause will cause any new problem?
> Well, using a "big hammer" like domain_pause in a case like this
> usually indicates that there are other issues that aren't being solved
> properly -- for instance, latent deadlocks or unhandled race conditions.
> :-)  Leaving those issues around in the codebase but "papered over" by
> domain_pause is storing up technical debt that future generations will
> inherit and need to untangle.  (Particularly as, in this case, there was
> no comment *in the code* explaining what problems the domain_pause was
> there to solve, so anyone wanting to remove it would have to work all of
> that out for themselves.)
>
> domain_pause is relatively expensive (since you have to spin waiting
> for all the vcpus to finish running) and completely stops the domain
> from handling interrupts or anything else for an arbitrary amount of
> time.  As long as it doesn't happen often, the cost shouldn't be a major
> issue; and as long as the domain_pause is short (less than 100ms), it
> shouldn't cause any more problems than running on a fairly busy system
> would.
>
> On the other hand, if every time we ran into a tricky situation we just
> did a domain pause rather than solving the root issue, pretty soon the
> domain would be paused several times per second.  The performance would
> plummet, and fixing it would be a nightmare because you'd have hundreds
> of undocumented issues to try to understand and fix.
>
> So: the domain_pause itself isn't terrible (although it's better to
> avoid it if we can); what's more of a problem is the potential issues
> that it's hiding.  These issues can add up, so it's important to push
> back and ask "why do we need this and can we solve it a different way"
> pro-actively, as patches come in, rather than waiting until it becomes
> an issue.
>
>   -George

Thanks for your thorough explanation on this point. And I agree. :)

I have proposed a solution to this deadlock issue in another mail. The
nested locks are the key to the potential deadlock, and I believe the
nesting is not necessary in this case, so taking these locks
sequentially could be our way out (see the sketch below).
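
A rough illustration only (not the actual patch, and assuming the
p2m->ioreq.lock remains a plain spin lock):

/* deadlock-prone: the p2m lock is taken while p2m->ioreq.lock is held */
spin_lock(&p2m->ioreq.lock);
p2m->ioreq.server = NULL;
p2m->ioreq.flags = 0;
p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw); /* takes p2m lock */
spin_unlock(&p2m->ioreq.lock);

/* sequential: drop p2m->ioreq.lock before touching the p2m */
spin_lock(&p2m->ioreq.lock);
p2m->ioreq.server = NULL;
p2m->ioreq.flags = 0;
spin_unlock(&p2m->ioreq.lock);
p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw); /* p2m lock only */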

And as to domain_pause, I can remove it if we can accept the
possibility that an ept violation happens only to find the ioreq server
has already been unmapped (in which case the operation is simply
discarded).

B.R.
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
  2016-09-22  9:12         ` Yu Zhang
  2016-09-22 11:32           ` George Dunlap
@ 2016-09-26  6:58           ` Yu Zhang
  1 sibling, 0 replies; 40+ messages in thread
From: Yu Zhang @ 2016-09-26  6:58 UTC (permalink / raw)
  To: George Dunlap; +Cc: Paul Durrant, Lv, Zhiyuan, Jan Beulich, Xen-devel



On 9/22/2016 5:12 PM, Yu Zhang wrote:
>
>
> On 9/21/2016 9:04 PM, George Dunlap wrote:
>> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang <yu.c.zhang@linux.intel.com> 
>> wrote:
>>>> On 9/2/2016 6:47 PM, Yu Zhang wrote:
>>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
>>>>> let one ioreq server claim/disclaim its responsibility for the
>>>>> handling of guest pages with p2m type p2m_ioreq_server. Users
>>>>> of this HVMOP can specify which kind of operation is supposed
>>>>> to be emulated in a parameter named flags. Currently, this HVMOP
>>>>> only support the emulation of write operations. And it can be
>>>>> further extended to support the emulation of read ones if an
>>>>> ioreq server has such requirement in the future.
>>>>>
>>>>> For now, we only support one ioreq server for this p2m type, so
>>>>> once an ioreq server has claimed its ownership, subsequent calls
>>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
>>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by
>>>>> triggering this new HVMOP, with ioreq server id set to the current
>>>>> owner's and flags parameter set to 0.
>>>>>
>>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
>>>>> are only supported for HVMs with HAP enabled.
>>>>>
>>>>> Also note that only after one ioreq server claims its ownership
>>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server
>>>>> be allowed.
>>>>>
>>>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>> Acked-by: Tim Deegan <tim@xen.org>
>>>>> ---
>>>>> Cc: Paul Durrant <paul.durrant@citrix.com>
>>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>>>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>>>> Cc: Tim Deegan <tim@xen.org>
>>>>>
>>>>> changes in v6:
>>>>>     - Clarify logic in hvmemul_do_io().
>>>>>     - Use recursive lock for ioreq server lock.
>>>>>     - Remove debug print when mapping ioreq server.
>>>>>     - Clarify code in ept_p2m_type_to_flags() for consistency.
>>>>>     - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS.
>>>>>     - Add comments for HVMMEM_ioreq_server to note only changes
>>>>>       to/from HVMMEM_ram_rw are permitted.
>>>>>     - Add domain_pause/unpause() in 
>>>>> hvm_map_mem_type_to_ioreq_server()
>>>>>       to avoid the race condition when a vm exit happens on a write-
>>>>>       protected page, just to find the ioreq server has been unmapped
>>>>>       already.
>>>>>     - Introduce a seperate patch to delay the release of p2m
>>>>>       lock to avoid the race condition.
>>>>>     - Introduce a seperate patch to handle the read-modify-write
>>>>>       operations on a write protected page.
>>>>>
>>>> Why do we need to do this?  Won't the default case just DTRT if it 
>>>> finds
>>>> that the ioreq server has been unmapped?
>>>
>>> Well, patch 4 will either mark the remaining p2m_ioreq_server 
>>> entries as
>>> "recal" or
>>> reset to p2m_ram_rw directly. So my understanding is that we do not 
>>> wish to
>>> see a ept violation due to a p2m_ioreq_server access after the ioreq 
>>> server
>>> is unmapped.
>>> Yet without this domain_pause/unpause() pair, VM accesses may 
>>> trigger an ept
>>> violation
>>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just 
>>> to find
>>> the ioreq
>>> server is NULL. Then we would have to provide handlers which just do 
>>> the
>>> copy to/from
>>> actions for the VM. This seems awkward to me.
>> So the race you're worried about is this:
>>
>> 1. Guest fault happens
>> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking
>> 3. guest  finds no ioreq server present
>>
>> I think in that case the easiest thing to do would be to simply assume
>> there was a race and re-execute the instruction.  Is that not possible
>> for some reason?
>>
>>   -George
>
> Thanks for your reply, George. :)
> Two reasons I'd like to use the domain_pause/unpause() to avoid the 
> race condition:
>
> 1>  Like my previous explanation, in the read-modify-write scenario, 
> the ioreq server will
> be NULL for the read emulation. But in such case, hypervisor will not 
> discard this trap, instead
> it is supposed to do the copy work for the read access. So it would be 
> difficult for hypervisor
> to decide if the ioreq server was detached due to a race condition, or 
> if the ioreq server should
> be a NULL because we are emulating a read operation first for a 
> read-modify-write instruction.
>
> 2> I also realized it can avoid a deadlock possibility -
>     a> dom0 triggers map_mem_type_to_ioreq_server, which spin locks 
> the ioreq_server.lock,
>          and before routine hvm_map_mem_type_to_ioreq_server() returns,
>     b> The HVM triggers an ept violation, and p2m lock is held by 
> hvm_hap_nested_page_fault();
>     c> hypervisor continues to the I/O handler, which will need the 
> ioreq_server.lock to select
>          an ioreq server, which is being held by the hypercall handler 
> side;
>     d> likewise, the map_mem_type_to_ioreq_server will meet problem 
> when trying to get the
>          p2m lock when it calls p2m_change_entry_type_global().
>    With a domain_pause/unpause(), we could avoid the ept violation 
> during this period(which I
> believe won't be long because this pair only covers the 
> p2m_change_entry_type_global() call, not
> the p2m_finish_type_change() call).

Sorry, the deadlock is actually between the p2m->ioreq.lock and the p2m
lock, which are both taken inside p2m_set_ioreq_server(), called by
hvm_map_mem_type_to_ioreq_server(). The sequence in which the deadlock
happens is similar to the description above (see the sketch after the
list):

a> dom0 triggers map_mem_type_to_ioreq_server to unmap the
      p2m_ioreq_server type;
b> the HVM triggers an ept violation, and the p2m lock is held by
      hvm_hap_nested_page_fault();
c> the hypervisor continues to the I/O handler, which needs the
      p2m->ioreq.lock to select an ioreq server for the write-protected
      address -- but that lock is already held by the hypercall handler
      side;
d> likewise, p2m_set_ioreq_server() runs into trouble when it tries to
      take the p2m lock inside p2m_change_entry_type_global().
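
Put side by side (a simplified trace; the exact helper that grabs the
p2m->ioreq.lock on the fault path is left out):

hypercall path (dom0)                    fault path (HVM vcpu)
---------------------                    ---------------------
hvm_map_mem_type_to_ioreq_server()       hvm_hap_nested_page_fault()
  p2m_set_ioreq_server()                   takes the p2m lock
    takes p2m->ioreq.lock                  I/O handler selects ioreq server
    p2m_change_entry_type_global()           waits on p2m->ioreq.lock ...
      waits on the p2m lock ... deadlock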

Now I believe we can avoid this deadlock by moving the
p2m_change_entry_type_global() call out of p2m_set_ioreq_server() into
its caller, hvm_map_mem_type_to_ioreq_server(), so that there is no
nesting between the p2m->ioreq.lock and the p2m lock. :)

A code snippet:
@@ -944,6 +944,13 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
          if ( s->id == id )
          {
              rc = p2m_set_ioreq_server(d, flags, s);
+            if ( rc == 0 && flags == 0 )
+            {
+                struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+                if ( read_atomic(&p2m->ioreq.entry_count) )
+                    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+            }
              break;
          }
      }
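
Correspondingly, p2m_set_ioreq_server() itself would then only take the
p2m->ioreq.lock. A rough sketch of the idea (the ioreq fields are the
ones used in this series; the rest is illustrative, not the actual
patch):

static int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
                                struct hvm_ioreq_server *s)
{
    struct p2m_domain *p2m = p2m_get_hostp2m(d);
    int rc;

    spin_lock(&p2m->ioreq.lock);

    if ( flags == 0 )
    {
        rc = -EINVAL;
        if ( p2m->ioreq.server != s )
            goto out;

        p2m->ioreq.server = NULL;
        p2m->ioreq.flags = 0;
    }
    else
    {
        rc = -EBUSY;
        if ( p2m->ioreq.server != NULL )
            goto out;

        p2m->ioreq.server = s;
        p2m->ioreq.flags = flags;
    }

    rc = 0;

 out:
    spin_unlock(&p2m->ioreq.lock);

    return rc;
}

No p2m lock is taken here any more; the type-change walk happens
afterwards in the caller, outside the p2m->ioreq.lock.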

Thanks
Yu

>
> Thanks
> Yu
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2016-09-26  6:58 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2016-09-05 13:31   ` Jan Beulich
2016-09-05 17:20     ` George Dunlap
2016-09-06  7:58       ` Jan Beulich
2016-09-06  8:03         ` Paul Durrant
2016-09-06  8:13           ` Jan Beulich
2016-09-06 10:00             ` Yu Zhang
2016-09-09  5:55     ` Yu Zhang
2016-09-09  8:09       ` Jan Beulich
2016-09-09  8:59         ` Yu Zhang
2016-09-05 17:23   ` George Dunlap
     [not found]   ` <57D24730.2050904@linux.intel.com>
2016-09-09  5:51     ` Yu Zhang
2016-09-21 13:04       ` George Dunlap
2016-09-22  9:12         ` Yu Zhang
2016-09-22 11:32           ` George Dunlap
2016-09-22 16:02             ` Yu Zhang
2016-09-23 10:35               ` George Dunlap
2016-09-26  6:57                 ` Yu Zhang
2016-09-26  6:58           ` Yu Zhang
2016-09-02 10:47 ` [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
2016-09-05 13:49   ` Jan Beulich
     [not found]   ` <57D24782.6010701@linux.intel.com>
2016-09-09  5:56     ` Yu Zhang
2016-09-02 10:47 ` [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
2016-09-05 14:10   ` Jan Beulich
     [not found]   ` <57D247F6.9010503@linux.intel.com>
2016-09-09  6:21     ` Yu Zhang
2016-09-09  8:12       ` Jan Beulich
2016-09-02 10:47 ` [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
2016-09-05 14:47   ` Jan Beulich
     [not found]   ` <57D24813.2090903@linux.intel.com>
2016-09-09  7:24     ` Yu Zhang
2016-09-09  8:20       ` Jan Beulich
2016-09-09  9:24         ` Yu Zhang
2016-09-09  9:44           ` Jan Beulich
2016-09-09  9:56             ` Yu Zhang
2016-09-09 10:09               ` Jan Beulich
2016-09-09 10:01                 ` Yu Zhang
2016-09-20  2:57                 ` Yu Zhang
2016-09-22 18:06                   ` George Dunlap
2016-09-23  1:31                     ` Yu Zhang
2016-09-06 10:57 ` [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.