* [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix
@ 2015-06-30 13:05 Paul Durrant
  2015-06-30 13:05 ` [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
                   ` (16 more replies)
  0 siblings, 17 replies; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

This patch series re-works much of the code involved in emulation of port
and memory mapped I/O for HVM guests.

The code has become very convoluted and, at least by inspection, certain
emulations appear to malfunction.

The series is broken down into 16 patches (which are also available in
my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
on the emulation32 branch).

Previous changelog
------------------

v2:
 - Removed bogus assertion from patch #15
 - Re-worked patch #17 after basic testing of back-port onto XenServer

v3:
 - Addressed comments from Jan
 - Re-ordered series to bring a couple of more trivial patches to the
   front
 - Backport to XenServer (4.5) now passing automated tests
 - Tested on unstable with QEMU upstream and trad, with and without
   HAP (to force shadow emulation)

v4:
 - Removed previous patch (make sure translated MMIO reads or
   writes fall within a page) and rebased rest of series.
 - Address Jan's comments on patch #1

Changelog (now per-patch)
-------------------------

0001-x86-hvm-make-sure-emulation-is-retried-if-domain-is-.patch

This is a fix for an issue on staging reported by Don Slutz

0002-x86-hvm-remove-multiple-open-coded-chunking-loops.patch

v5: Addressed further comments from Jan

0003-x86-hvm-change-hvm_mmio_read_t-and-hvm_mmio_write_t-.patch

v5: New patch to tidy up types

0004-x86-hvm-restrict-port-numbers-to-uint16_t-and-sizes-.patch

v5: New patch to tidy up more types

0005-x86-hvm-unify-internal-portio-and-mmio-intercepts.patch

v5: Addressed further comments from Jan and simplified implementation
    by passing ioreq_t to accept() function

0006-x86-hvm-add-length-to-mmio-check-op.patch

v5: Simplified by leaving the mmio_check() implementation alone and
    calling it again to check the last byte if the first-byte check
    passes

0007-x86-hvm-unify-dpci-portio-intercept-with-standard-po.patch

v5: Addressed further comments from Jan

0008-x86-hvm-unify-stdvga-mmio-intercept-with-standard-mm.patch

v5: Fixed semantic problems pointed out by Jan

0009-x86-hvm-limit-reps-to-avoid-the-need-to-handle-retry.patch

v5: Addressed further comments from Jan

0010-x86-hvm-only-call-hvm_io_assist-from-hvm_wait_for_io.patch

v5: Added Jan's acked-by

0011-x86-hvm-split-I-O-completion-handling-from-state-mod.patch

v5: Confirmed call to msix_write_completion() is in the correct place.

0012-x86-hvm-remove-HVMIO_dispatched-I-O-state.patch

v5: Added some extra comments to the commit

0013-x86-hvm-remove-hvm_io_state-enumeration.patch

v5: Added Jan's acked-by

0014-x86-hvm-use-ioreq_t-to-track-in-flight-state.patch

v5: Added missing hunk with call to handle_pio()

0015-x86-hvm-always-re-emulate-I-O-from-a-buffer.patch

v5: Added Jan's acked-by

0016-x86-hvm-track-large-memory-mapped-accesses-by-buffer.patch

v5: Fixed to cache up to three distinct I/O emulations per instruction

Testing
-------

The series has been back-ported to staging-4.5 and then dropped onto the
XenServer (Dundee) patch queue. All automated branch-safety tests pass.

The series as-is has been manually tested with a Windows 7 (32-bit) VM
using upstream QEMU.


* [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-06-30 13:45   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

Commit 2df1aa01 "x86/hvm: remove hvm_io_pending() check in
hvmemul_do_io()" causes a problem during migration: I/O that was
previously caught by the vcpu_start_shutdown_deferral() test in
hvm_send_assist_req() is now treated as completed rather than as
requiring a retry.

This patch fixes the problem by having hvm_send_assist_req() return
X86EMUL_RETRY rather than X86EMUL_OKAY if the
vcpu_start_shutdown_deferral() test fails and then making sure that
the emulation state is reset if the domain is found to be shutting
down.

Reported-by: Don Slutz <don.slutz@gmail.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |    2 +-
 xen/arch/x86/hvm/hvm.c     |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fe5661d..8b60843 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -183,7 +183,7 @@ static int hvmemul_do_io(
         else
         {
             rc = hvm_send_assist_req(s, &p);
-            if ( rc != X86EMUL_RETRY )
+            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
                 vio->io_state = HVMIO_none;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 535d622..f0cf064 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2613,7 +2613,7 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
 
     ASSERT(s);
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return X86EMUL_OKAY;
+        return X86EMUL_RETRY;
 
     list_for_each_entry ( sv,
                           &s->ioreq_vcpu_list,
-- 
1.7.10.4


* [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-06-30 13:05 ` [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 15:37   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.

NOTE: This patch also introduces a change in 'chunking' around a page
      boundary. Previously, for example, an 8-byte access at the last
      byte of a page would be carried out as 8 single-byte accesses.
      It will now be carried out as a single-byte access, followed by
      a 4-byte access, a 2-byte access and then another single-byte
      access.
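
As an illustration only (not part of the patch, and assuming 4k pages
and a 64-bit long), the following stand-alone sketch reproduces the
chunking rule described above; running it for the example access prints
the 1-, 4-, 2- and 1-byte sequence:

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Print the chunk sizes issued for a 'size'-byte access starting at
 * page offset 'off', following the rules used by the new helpers. */
static void show_chunks(unsigned long off, unsigned int size)
{
    while ( size != 0 )
    {
        unsigned int chunk = size;

        /* Never cross a page boundary... */
        if ( off + chunk > PAGE_SIZE )
            chunk = PAGE_SIZE - off;

        /* ...round down to a power of 2... */
        while ( chunk & (chunk - 1) )
            chunk &= chunk - 1;

        /* ...and never exceed sizeof(long). */
        if ( chunk > sizeof(long) )
            chunk = sizeof(long);

        printf("%u-byte access at page offset %#lx\n", chunk, off);

        off = (off + chunk) & (PAGE_SIZE - 1);
        size -= chunk;
    }
}

int main(void)
{
    show_chunks(PAGE_SIZE - 1, 8); /* 1-byte, 4-byte, 2-byte, 1-byte */
    return 0;
}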

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |  223 +++++++++++++++++++++++---------------------
 1 file changed, 116 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8b60843..b67f5db 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -539,6 +539,117 @@ static int hvmemul_virtual_to_linear(
     return X86EMUL_EXCEPTION;
 }
 
+static int hvmemul_phys_mmio_access(
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
+{
+    unsigned long one_rep = 1;
+    unsigned int chunk;
+    int rc;
+
+    /* Accesses must fall within a page */
+    BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);
+
+    /*
+     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
+     * accesses larger than sizeof(long), so choose the highest power
+     * of 2 not exceeding sizeof(long) as the 'chunk' size.
+     */
+    chunk = 1 << (fls(size) - 1);
+    if ( chunk > sizeof (long) )
+        chunk = sizeof (long);
+
+    for ( ;; )
+    {
+        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                    buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        /* Advance to the next chunk */
+        gpa += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        /*
+         * If the chunk now exceeds the remaining size, choose the next
+         * lowest power of 2 that will fit.
+         */
+        while ( chunk > size )
+            chunk >>= 1;
+    }
+
+    return rc;
+}
+
+static int hvmemul_linear_mmio_access(
+    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
+{
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    unsigned long page_off = gla & (PAGE_SIZE - 1);
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
+
+    if ( known_gpfn )
+        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;
+    else
+    {
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    for ( ;; )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        gla += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
+        ASSERT(size < PAGE_SIZE);
+        chunk = size;
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    return rc;
+}
+
+static inline int hvmemul_linear_mmio_read(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
+static inline int hvmemul_linear_mmio_write(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
 static int __hvmemul_read(
     enum x86_segment seg,
     unsigned long offset,
@@ -549,51 +660,19 @@ static int __hvmemul_read(
 {
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /*
-     * We only need to handle sizes actual instruction operands can have. All
-     * such sizes are either powers of 2 or the sum of two powers of 2. Thus
-     * picking as initial chunk size the largest power of 2 not greater than
-     * the total size will always result in only power-of-2 size requests
-     * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-powers-of-2).
-     */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
-
     if ( ((access_type != hvm_access_insn_fetch
            ? vio->mmio_access.read_access
            : vio->mmio_access.insn_fetch)) &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
-    {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            addr += chunk;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
-    }
+        return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
 
     if ( (seg != x86_seg_none) &&
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
@@ -612,30 +691,8 @@ static int __hvmemul_read(
     case HVMCOPY_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
             return X86EMUL_UNHANDLEABLE;
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+
+        return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
@@ -701,43 +758,18 @@ static int hvmemul_write(
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, hvm_access_write, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /* See the respective comment in __hvmemul_read(). */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
 
     if ( vio->mmio_access.write_access &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
-    {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            addr += chunk;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
-    }
+        return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
 
     if ( (seg != x86_seg_none) &&
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
@@ -752,30 +784,7 @@ static int hvmemul_write(
     case HVMCOPY_bad_gva_to_gfn:
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+        return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
-- 
1.7.10.4


* [PATCH v5 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument...
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-06-30 13:05 ` [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
  2015-06-30 13:05 ` [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 15:39   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

...from unsigned long to unsigned int

A 64-bit length is not necessary, 32 bits is enough.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hpet.c                   |    4 ++--
 xen/arch/x86/hvm/vioapic.c                |    4 ++--
 xen/arch/x86/hvm/vlapic.c                 |   18 +++++++++---------
 xen/arch/x86/hvm/vmsi.c                   |    4 ++--
 xen/drivers/passthrough/amd/iommu_guest.c |    8 ++++----
 xen/include/asm-x86/hvm/io.h              |    6 +++---
 6 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 9585ca8..30ac5dd 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -166,7 +166,7 @@ static inline int hpet_check_access_length(
 }
 
 static int hpet_read(
-    struct vcpu *v, unsigned long addr, unsigned long length,
+    struct vcpu *v, unsigned long addr, unsigned int length,
     unsigned long *pval)
 {
     HPETState *h = vcpu_vhpet(v);
@@ -301,7 +301,7 @@ static inline uint64_t hpet_fixup_reg(
 
 static int hpet_write(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long val)
+    unsigned int length, unsigned long val)
 {
     HPETState *h = vcpu_vhpet(v);
     uint64_t old_val, new_val;
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index cbbef9f..5c8d890 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -94,7 +94,7 @@ static uint32_t vioapic_read_indirect(const struct hvm_hw_vioapic *vioapic)
 
 static int vioapic_read(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long *pval)
+    unsigned int length, unsigned long *pval)
 {
     const struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
     uint32_t result;
@@ -215,7 +215,7 @@ static void vioapic_write_indirect(
 
 static int vioapic_write(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long val)
+    unsigned int length, unsigned long val)
 {
     struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 32e649e..0235cb3 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -581,7 +581,7 @@ static uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset)
 
 static int vlapic_read(
     struct vcpu *v, unsigned long address,
-    unsigned long len, unsigned long *pval)
+    unsigned int len, unsigned long *pval)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int offset = address - vlapic_base_address(vlapic);
@@ -611,12 +611,12 @@ static int vlapic_read(
         break;
 
     default:
-        gdprintk(XENLOG_ERR, "Local APIC read with len=%#lx, "
+        gdprintk(XENLOG_ERR, "Local APIC read with len=%#x, "
                  "should be 4 instead.\n", len);
         goto exit_and_crash;
     }
 
-    HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "offset %#x with length %#lx, "
+    HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "offset %#x with length %#x, "
                 "and the result is %#x", offset, len, result);
 
  out:
@@ -624,7 +624,7 @@ static int vlapic_read(
     return X86EMUL_OKAY;
 
  unaligned_exit_and_crash:
-    gdprintk(XENLOG_ERR, "Unaligned LAPIC read len=%#lx at offset=%#x.\n",
+    gdprintk(XENLOG_ERR, "Unaligned LAPIC read len=%#x at offset=%#x.\n",
              len, offset);
  exit_and_crash:
     domain_crash(v->domain);
@@ -819,7 +819,7 @@ static int vlapic_reg_write(struct vcpu *v,
 }
 
 static int vlapic_write(struct vcpu *v, unsigned long address,
-                        unsigned long len, unsigned long val)
+                        unsigned int len, unsigned long val)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int offset = address - vlapic_base_address(vlapic);
@@ -827,7 +827,7 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
 
     if ( offset != APIC_EOI )
         HVM_DBG_LOG(DBG_LEVEL_VLAPIC,
-                    "offset %#x with length %#lx, and value is %#lx",
+                    "offset %#x with length %#x, and value is %#lx",
                     offset, len, val);
 
     /*
@@ -854,11 +854,11 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
             break;
 
         default:
-            gprintk(XENLOG_ERR, "LAPIC write with len %lu\n", len);
+            gprintk(XENLOG_ERR, "LAPIC write with len %u\n", len);
             goto exit_and_crash;
         }
 
-        gdprintk(XENLOG_INFO, "Notice: LAPIC write with len %lu\n", len);
+        gdprintk(XENLOG_INFO, "Notice: LAPIC write with len %u\n", len);
         offset &= ~3;
     }
     else if ( unlikely(offset & 3) )
@@ -867,7 +867,7 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
     return vlapic_reg_write(v, offset, val);
 
  unaligned_exit_and_crash:
-    gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%lu offset=%#x.\n",
+    gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%u offset=%#x.\n",
             len, offset);
  exit_and_crash:
     domain_crash(v->domain);
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index f89233d..6eeae0a 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -202,7 +202,7 @@ static struct msi_desc *msixtbl_addr_to_desc(
 
 static int msixtbl_read(
     struct vcpu *v, unsigned long address,
-    unsigned long len, unsigned long *pval)
+    unsigned int len, unsigned long *pval)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
@@ -258,7 +258,7 @@ out:
 }
 
 static int msixtbl_write(struct vcpu *v, unsigned long address,
-                         unsigned long len, unsigned long val)
+                         unsigned int len, unsigned long val)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index 7b0c102..a832fdb 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -682,7 +682,7 @@ static uint64_t iommu_mmio_read64(struct guest_iommu *iommu,
 }
 
 static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr,
-                                 unsigned long len, unsigned long *pval)
+                                 unsigned int len, unsigned long *pval)
 {
     struct guest_iommu *iommu = vcpu_iommu(v);
     unsigned long offset;
@@ -695,7 +695,7 @@ static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr,
     if ( unlikely((offset & (len - 1 )) || (len > 8)) )
     {
         AMD_IOMMU_DEBUG("iommu mmio read access is not aligned:"
-                        " offset = %lx, len = %lx\n", offset, len);
+                        " offset = %lx, len = %x\n", offset, len);
         return X86EMUL_UNHANDLEABLE;
     }
 
@@ -772,7 +772,7 @@ static void guest_iommu_mmio_write64(struct guest_iommu *iommu,
 }
 
 static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr,
-                                  unsigned long len, unsigned long val)
+                                  unsigned int len, unsigned long val)
 {
     struct guest_iommu *iommu = vcpu_iommu(v);
     unsigned long offset;
@@ -785,7 +785,7 @@ static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr,
     if ( unlikely((offset & (len - 1)) || (len > 8)) )
     {
         AMD_IOMMU_DEBUG("iommu mmio write access is not aligned:"
-                        " offset = %lx, len = %lx\n", offset, len);
+                        " offset = %lx, len = %x\n", offset, len);
         return X86EMUL_UNHANDLEABLE;
     }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index f2aaec5..d1d79dc 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -32,11 +32,11 @@
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
-                               unsigned long length,
+                               unsigned int length,
                                unsigned long *val);
 typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long addr,
-                                unsigned long length,
+                                unsigned int length,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
@@ -111,7 +111,7 @@ static inline void relocate_portio_handler(
 
 static inline void register_buffered_io_handler(
     struct domain *d, unsigned long addr,
-    unsigned long size, mmio_action_t action)
+    unsigned int size, mmio_action_t action)
 {
     register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
 }
-- 
1.7.10.4


* [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (2 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 15:54   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

Building on the previous patch, this patch restricts portio port numbers
to uint16_t in registration/relocate calls and in portio_action_t. It also
changes portio sizes to unsigned int, which then allows the io_handler
size field to be reduced to an unsigned int.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c       |    6 +++---
 xen/arch/x86/hvm/i8254.c     |    8 ++++----
 xen/arch/x86/hvm/intercept.c |    4 ++--
 xen/arch/x86/hvm/pmtimer.c   |    4 ++--
 xen/arch/x86/hvm/rtc.c       |    2 +-
 xen/arch/x86/hvm/stdvga.c    |    2 +-
 xen/arch/x86/hvm/vpic.c      |    4 ++--
 xen/include/asm-x86/hvm/io.h |   20 ++++++++++----------
 8 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f0cf064..2bd7d45 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -559,7 +559,7 @@ static int hvm_add_ioreq_gmfn(
 }
 
 static int hvm_print_line(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct domain *cd = current->domain;
     char c = *val;
@@ -585,7 +585,7 @@ static int hvm_print_line(
 }
 
 static int hvm_access_cf8(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct domain *d = current->domain;
 
@@ -597,7 +597,7 @@ static int hvm_access_cf8(
 }
 
 static int handle_pvh_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct domain *currd = current->domain;
 
diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
index 36a0a53..5a47580 100644
--- a/xen/arch/x86/hvm/i8254.c
+++ b/xen/arch/x86/hvm/i8254.c
@@ -50,9 +50,9 @@
 #define RW_STATE_WORD1 4
 
 static int handle_pit_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 static int handle_speaker_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 
 #define get_guest_time(v) \
    (is_hvm_vcpu(v) ? hvm_get_guest_time(v) : (u64)get_s_time())
@@ -479,7 +479,7 @@ void pit_deinit(struct domain *d)
 
 /* the intercept action for PIT DM retval:0--not handled; 1--handled */  
 static int handle_pit_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct PITState *vpit = vcpu_vpit(current);
 
@@ -522,7 +522,7 @@ static uint32_t speaker_ioport_read(
 }
 
 static int handle_speaker_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, uint32_t bytes, uint32_t *val)
 {
     struct PITState *vpit = vcpu_vpit(current);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index cc44733..52879ff 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -369,7 +369,7 @@ int hvm_io_intercept(ioreq_t *p, int type)
 }
 
 void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
+    struct domain *d, unsigned long addr, unsigned int size,
     void *action, int type)
 {
     struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
@@ -386,7 +386,7 @@ void register_io_handler(
 
 void relocate_io_handler(
     struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type)
+    unsigned int size, int type)
 {
     struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
     int i;
diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index 6ad2797..8b531d9 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -142,7 +142,7 @@ static void pmt_timer_callback(void *opaque)
 
 /* Handle port I/O to the PM1a_STS and PM1a_EN registers */
 static int handle_evt_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
@@ -205,7 +205,7 @@ static int handle_evt_io(
 
 /* Handle port I/O to the TMR_VAL register */
 static int handle_pmt_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
index 3448971..fde120b 100644
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -697,7 +697,7 @@ static uint32_t rtc_ioport_read(RTCState *s, uint32_t addr)
 }
 
 static int handle_rtc_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct RTCState *vrtc = vcpu_vrtc(current);
 
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 13d1029..98308cd 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -173,7 +173,7 @@ static void stdvga_out(uint32_t port, uint32_t bytes, uint32_t val)
 }
 
 static int stdvga_intercept_pio(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
 
diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 8eea061..909b9d0 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -324,7 +324,7 @@ static uint32_t vpic_ioport_read(struct hvm_hw_vpic *vpic, uint32_t addr)
 }
 
 static int vpic_intercept_pic_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_vpic *vpic;
 
@@ -346,7 +346,7 @@ static int vpic_intercept_pic_io(
 }
 
 static int vpic_intercept_elcr_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_vpic *vpic;
     uint32_t data;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index d1d79dc..1ed0c13 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -41,12 +41,12 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
 typedef int (*portio_action_t)(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 typedef int (*mmio_action_t)(ioreq_t *);
 struct io_handler {
     int                 type;
     unsigned long       addr;
-    unsigned long       size;
+    unsigned int        size;
     union {
         portio_action_t portio;
         mmio_action_t   mmio;
@@ -75,11 +75,11 @@ extern const struct hvm_mmio_ops iommu_mmio_ops;
 
 int hvm_io_intercept(ioreq_t *p, int type);
 void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
+    struct domain *d, unsigned long addr, unsigned int size,
     void *action, int type);
 void relocate_io_handler(
     struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type);
+    unsigned int size, int type);
 
 static inline int hvm_portio_intercept(ioreq_t *p)
 {
@@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
 int hvm_buffered_io_send(ioreq_t *p);
 
 static inline void register_portio_handler(
-    struct domain *d, unsigned long addr,
-    unsigned long size, portio_action_t action)
+    struct domain *d, uint16_t port, unsigned int size,
+    portio_action_t action)
 {
-    register_io_handler(d, addr, size, action, HVM_PORTIO);
+    register_io_handler(d, port, size, action, HVM_PORTIO);
 }
 
 static inline void relocate_portio_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size)
+    struct domain *d, uint16_t old_port, uint16_t new_port,
+    unsigned int size)
 {
-    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
+    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
 }
 
 static inline void register_buffered_io_handler(
-- 
1.7.10.4


* [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (3 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 14:52   ` Roger Pau Monné
  2015-07-02 16:29   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 06/16] x86/hvm: add length to mmio check op Paul Durrant
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The implementation of mmio and portio intercepts is unnecessarily different.
This leads to much code duplication. This patch unifies much of the
intercept handling, leaving only distinct handlers for stdvga mmio and dpci
portio. Subsequent patches will unify those handlers.
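
For reference, the shape of the unified handler is as below (a
simplified sketch inferred from the usage in the hunks that follow,
not the exact definition added to asm-x86/hvm/io.h):

/* Simplified sketch only: each registered handler records its ioreq
 * type and either a set of MMIO ops or a port range plus action. */
struct hvm_io_handler {
    unsigned int type;                      /* IOREQ_TYPE_COPY or IOREQ_TYPE_PIO */
    union {
        struct {
            const struct hvm_mmio_ops *ops; /* check/read/write */
        } mmio;
        struct {
            unsigned int start, end;        /* [start, end) port range */
            portio_action_t action;
        } portio;
    };
};

A per-type set of accept/read/write ops then drives a single rep/copy
loop (hvm_process_io_intercept()) for both port I/O and MMIO.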

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c                |   11 +-
 xen/arch/x86/hvm/hpet.c                   |    4 +-
 xen/arch/x86/hvm/hvm.c                    |    7 +-
 xen/arch/x86/hvm/intercept.c              |  486 +++++++++++++----------------
 xen/arch/x86/hvm/stdvga.c                 |   30 +-
 xen/arch/x86/hvm/vioapic.c                |    4 +-
 xen/arch/x86/hvm/vlapic.c                 |    5 +-
 xen/arch/x86/hvm/vmsi.c                   |    7 +-
 xen/drivers/passthrough/amd/iommu_guest.c |   30 +-
 xen/include/asm-x86/hvm/domain.h          |    1 +
 xen/include/asm-x86/hvm/io.h              |  115 +++----
 11 files changed, 334 insertions(+), 366 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index b67f5db..205ca5e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -142,16 +142,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
-    if ( is_mmio )
-    {
-        rc = hvm_mmio_intercept(&p);
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            rc = hvm_buffered_io_intercept(&p);
-    }
-    else
-    {
-        rc = hvm_portio_intercept(&p);
-    }
+    rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 30ac5dd..732504a 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -504,7 +504,7 @@ static int hpet_range(struct vcpu *v, unsigned long addr)
              (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
 }
 
-const struct hvm_mmio_ops hpet_mmio_ops = {
+static const struct hvm_mmio_ops hpet_mmio_ops = {
     .check = hpet_range,
     .read  = hpet_read,
     .write = hpet_write
@@ -659,6 +659,8 @@ void hpet_init(struct domain *d)
         h->hpet.comparator64[i] = ~0ULL;
         h->pt[i].source = PTSRC_isa;
     }
+
+    register_mmio_handler(d, &hpet_mmio_ops);
 }
 
 void hpet_deinit(struct domain *d)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2bd7d45..60f8972 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1465,11 +1465,12 @@ int hvm_domain_initialise(struct domain *d)
         goto fail0;
 
     d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
-    d->arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler);
+    d->arch.hvm_domain.io_handler = xzalloc_array(struct hvm_io_handler,
+                                                  NR_IO_HANDLERS);
     rc = -ENOMEM;
     if ( !d->arch.hvm_domain.params || !d->arch.hvm_domain.io_handler )
         goto fail1;
-    d->arch.hvm_domain.io_handler->num_slot = 0;
+    d->arch.hvm_domain.io_handler_count = 0;
 
     /* Set the default IO Bitmap. */
     if ( is_hardware_domain(d) )
@@ -1506,6 +1507,8 @@ int hvm_domain_initialise(struct domain *d)
 
     rtc_init(d);
 
+    msixtbl_init(d);
+
     register_portio_handler(d, 0xe9, 1, hvm_print_line);
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 52879ff..7d36785 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -32,205 +32,94 @@
 #include <xen/event.h>
 #include <xen/iommu.h>
 
-static const struct hvm_mmio_ops *const
-hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
+static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
+                              const ioreq_t *p)
 {
-    &hpet_mmio_ops,
-    &vlapic_mmio_ops,
-    &vioapic_mmio_ops,
-    &msixtbl_mmio_ops,
-    &iommu_mmio_ops
-};
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
+
+    return handler->mmio.ops->check(current, p->addr);
+}
 
-static int hvm_mmio_access(struct vcpu *v,
-                           ioreq_t *p,
-                           hvm_mmio_read_t read,
-                           hvm_mmio_write_t write)
+static int hvm_mmio_read(const struct hvm_io_handler *handler,
+                         uint64_t addr,
+                         uint32_t size,
+                         uint64_t *data)
 {
-    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
-    unsigned long data;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = read(v, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else /* p->dir == IOREQ_WRITE */
-            rc = write(v, p->addr, p->size, p->data);
-        return rc;
-    }
+    return handler->mmio.ops->read(current, addr, size, data);
+}
 
-    if ( p->dir == IOREQ_READ )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                rc = read(v, p->addr + step * i, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
+static int hvm_mmio_write(const struct hvm_io_handler *handler,
+                          uint64_t addr,
+                          uint32_t size,
+                          uint64_t data)
+{
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
-    }
-    else
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = write(v, p->addr + step * i, p->size, data);
-            if ( rc != X86EMUL_OKAY )
-                break;
-        }
+    return handler->mmio.ops->write(current, addr, size, data);
+}
 
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
-    }
+static const struct hvm_io_ops mmio_ops = {
+    .accept = hvm_mmio_accept,
+    .read = hvm_mmio_read,
+    .write = hvm_mmio_write
+};
 
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+static bool_t hvm_portio_accept(const struct hvm_io_handler *handler,
+                                const ioreq_t *p)
+{
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    return rc;
+    return (p->addr >= handler->portio.start) &&
+           ((p->addr + p->size) <= handler->portio.end);
 }
 
-bool_t hvm_mmio_internal(paddr_t gpa)
+static int hvm_portio_read(const struct hvm_io_handler *handler,
+                           uint64_t addr,
+                           uint32_t size,
+                           uint64_t *data)
 {
-    struct vcpu *curr = current;
-    unsigned int i;
+    uint32_t val = 0;
+    int rc;
+
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
-        if ( hvm_mmio_handlers[i]->check(curr, gpa) )
-            return 1;
+    rc = handler->portio.action(IOREQ_READ, addr, size, &val);
+    *data = val;
 
-    return 0;
+    return rc;
 }
 
-int hvm_mmio_intercept(ioreq_t *p)
+static int hvm_portio_write(const struct hvm_io_handler *handler,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t data)
 {
-    struct vcpu *v = current;
-    int i;
-
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
-    {
-        hvm_mmio_check_t check = hvm_mmio_handlers[i]->check;
+    uint32_t val = data;
 
-        if ( check(v, p->addr) )
-        {
-            if ( unlikely(p->count > 1) &&
-                 !check(v, unlikely(p->df)
-                        ? p->addr - (p->count - 1L) * p->size
-                        : p->addr + (p->count - 1L) * p->size) )
-                p->count = 1;
-
-            return hvm_mmio_access(
-                v, p,
-                hvm_mmio_handlers[i]->read,
-                hvm_mmio_handlers[i]->write);
-        }
-    }
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    return X86EMUL_UNHANDLEABLE;
+    return handler->portio.action(IOREQ_WRITE, addr, size, &val);
 }
 
-static int process_portio_intercept(portio_action_t action, ioreq_t *p)
+static const struct hvm_io_ops portio_ops = {
+    .accept = hvm_portio_accept,
+    .read = hvm_portio_read,
+    .write = hvm_portio_write
+};
+
+static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                                    ioreq_t *p)
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct hvm_io_ops *ops =
+        (p->type == IOREQ_TYPE_COPY) ?
+        &mmio_ops :
+        &portio_ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else
-        {
-            data = p->data;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
-        }
-        return rc;
-    }
+    uint64_t data;
+    uint64_t addr;
 
     if ( p->dir == IOREQ_READ )
     {
@@ -246,31 +135,40 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
             }
             else
             {
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
+                addr = (p->type == IOREQ_TYPE_COPY) ?
+                       p->addr + step * i :
+                       p->addr;
+                rc = ops->read(handler, addr, p->size, &data);
                 if ( rc != X86EMUL_OKAY )
                     break;
             }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
+
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_to_guest_phys(p->data + step * i,
+                                                &data, p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    /* Drop the write as real hardware would. */
+                    continue;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT_UNREACHABLE();
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY)
-                break;
+            else
+                p->data = data;
         }
 
         if ( rc == X86EMUL_RETRY )
@@ -284,29 +182,37 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
     {
         for ( i = 0; i < p->count; i++ )
         {
-            data = 0;
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
+                                                  p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    data = ~0;
+                    break;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT_UNREACHABLE();
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
+            else
+                data = p->data;
+
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                   p->addr + step * i :
+                   p->addr;
+            rc = ops->write(handler, addr, p->size, data);
             if ( rc != X86EMUL_OKAY )
                 break;
         }
@@ -324,78 +230,122 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
     return rc;
 }
 
-/*
- * Check if the request is handled inside xen
- * return value: 0 --not handled; 1 --handled
- */
-int hvm_io_intercept(ioreq_t *p, int type)
+static const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
+{
+    struct domain *curr_d = current->domain;
+    const struct hvm_io_ops *ops =
+        (p->type == IOREQ_TYPE_COPY) ?
+        &mmio_ops :
+        &portio_ops;
+    unsigned int i;
+
+    BUG_ON((p->type != IOREQ_TYPE_PIO) &&
+           (p->type != IOREQ_TYPE_COPY));
+
+    for ( i = 0; i < curr_d->arch.hvm_domain.io_handler_count; i++ )
+    {
+        const struct hvm_io_handler *handler =
+            &curr_d->arch.hvm_domain.io_handler[i];
+
+        if ( handler->type != p->type )
+            continue;
+
+        if ( ops->accept(handler, p) )
+            return handler;
+    }
+
+    return NULL;
+}
+
+int hvm_io_intercept(ioreq_t *p)
 {
-    struct vcpu *v = current;
-    struct hvm_io_handler *handler = v->domain->arch.hvm_domain.io_handler;
-    int i;
-    unsigned long addr, size;
+    const struct hvm_io_handler *handler;
 
-    if ( type == HVM_PORTIO )
+    if ( p->type == IOREQ_TYPE_PIO )
     {
         int rc = dpci_ioport_intercept(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
             return rc;
     }
-
-    for ( i = 0; i < handler->num_slot; i++ )
+    else if ( p->type == IOREQ_TYPE_COPY )
     {
-        if ( type != handler->hdl_list[i].type )
-            continue;
-        addr = handler->hdl_list[i].addr;
-        size = handler->hdl_list[i].size;
-        if ( (p->addr >= addr) &&
-             ((p->addr + p->size) <= (addr + size)) )
-        {
-            if ( type == HVM_PORTIO )
-                return process_portio_intercept(
-                    handler->hdl_list[i].action.portio, p);
+        int rc = stdvga_intercept_mmio(p);
+        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
+            return rc;
+    }
 
-            if ( unlikely(p->count > 1) &&
-                 (unlikely(p->df)
-                  ? p->addr - (p->count - 1L) * p->size < addr
-                  : p->addr + p->count * 1L * p->size - 1 >= addr + size) )
-                p->count = 1;
+    handler = hvm_find_io_handler(p);
 
-            return handler->hdl_list[i].action.mmio(p);
-        }
+    if ( handler == NULL )
+        return X86EMUL_UNHANDLEABLE;
+
+    return hvm_process_io_intercept(handler, p);
+}
+
+static struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
+{
+    unsigned int i = d->arch.hvm_domain.io_handler_count++;
+
+    if ( i == NR_IO_HANDLERS )
+    {
+        domain_crash(d);
+        return NULL;
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    return &d->arch.hvm_domain.io_handler[i];
 }
 
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned int size,
-    void *action, int type)
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int num = handler->num_slot;
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
-    BUG_ON(num >= MAX_IO_HANDLER);
+    handler->type = IOREQ_TYPE_COPY;
+    handler->mmio.ops = ops;
+}
 
-    handler->hdl_list[num].addr = addr;
-    handler->hdl_list[num].size = size;
-    handler->hdl_list[num].action.ptr = action;
-    handler->hdl_list[num].type = type;
-    handler->num_slot++;
+void register_portio_handler(struct domain *d, uint16_t port,
+                             unsigned int size, portio_action_t action)
+{
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
+
+    handler->type = IOREQ_TYPE_PIO;
+    handler->portio.start = port;
+    handler->portio.end = port + size;
+    handler->portio.action = action;
 }
 
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned int size, int type)
+void relocate_portio_handler(struct domain *d, uint16_t old_port,
+                             uint16_t new_port, unsigned int size)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int i;
-
-    for ( i = 0; i < handler->num_slot; i++ )
-        if ( (handler->hdl_list[i].addr == old_addr) &&
-             (handler->hdl_list[i].size == size) &&
-             (handler->hdl_list[i].type == type) )
-            handler->hdl_list[i].addr = new_addr;
+    unsigned int i;
+
+    for ( i = 0; i < d->arch.hvm_domain.io_handler_count; i++ )
+    {
+        struct hvm_io_handler *handler =
+            &d->arch.hvm_domain.io_handler[i];
+
+        if ( handler->type != IOREQ_TYPE_PIO )
+            continue;
+
+        if ( (handler->portio.start == old_port) &&
+             (handler->portio.end == old_port + size) )
+        {
+            handler->portio.start = new_port;
+            handler->portio.end = new_port + size;
+            break;
+        }
+    }
+}
+
+bool_t hvm_mmio_internal(paddr_t gpa)
+{
+    ioreq_t p = {
+        .type = IOREQ_TYPE_COPY,
+        .addr = gpa
+    };
+
+    return (hvm_find_io_handler(&p) != NULL);
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 98308cd..30760fa 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -547,12 +547,27 @@ static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
     return 1;
 }
 
-static int stdvga_intercept_mmio(ioreq_t *p)
+int stdvga_intercept_mmio(ioreq_t *p)
 {
     struct domain *d = current->domain;
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
+    uint64_t start, end, count = p->count, size = p->size;
     int buf = 0, rc;
 
+    if ( p->df )
+    {
+        start = (p->addr - (count - 1) * size);
+        end = p->addr + size;
+    }
+    else
+    {
+        start = p->addr;
+        end = p->addr + count * size;
+    }
+
+    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        return X86EMUL_UNHANDLEABLE;
+
     if ( p->size > 8 )
     {
         gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
@@ -619,9 +634,6 @@ void stdvga_init(struct domain *d)
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
-        /* MMIO. */
-        register_buffered_io_handler(
-            d, VGA_MEM_BASE, VGA_MEM_SIZE, stdvga_intercept_mmio);
     }
 }
 
@@ -638,3 +650,13 @@ void stdvga_deinit(struct domain *d)
         s->vram_page[i] = NULL;
     }
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 5c8d890..9de2ff3 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -250,7 +250,7 @@ static int vioapic_range(struct vcpu *v, unsigned long addr)
              (addr < vioapic->base_address + VIOAPIC_MEM_LENGTH)));
 }
 
-const struct hvm_mmio_ops vioapic_mmio_ops = {
+static const struct hvm_mmio_ops vioapic_mmio_ops = {
     .check = vioapic_range,
     .read = vioapic_read,
     .write = vioapic_write
@@ -456,6 +456,8 @@ int vioapic_init(struct domain *d)
     d->arch.hvm_domain.vioapic->domain = d;
     vioapic_reset(d);
 
+    register_mmio_handler(d, &vioapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 0235cb3..5472f79 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -977,7 +977,7 @@ static int vlapic_range(struct vcpu *v, unsigned long addr)
            (offset < PAGE_SIZE);
 }
 
-const struct hvm_mmio_ops vlapic_mmio_ops = {
+static const struct hvm_mmio_ops vlapic_mmio_ops = {
     .check = vlapic_range,
     .read = vlapic_read,
     .write = vlapic_write
@@ -1443,6 +1443,9 @@ int vlapic_init(struct vcpu *v)
                  vlapic_init_sipi_action,
                  (unsigned long)v);
 
+    if ( v->vcpu_id == 0 )
+        register_mmio_handler(v->domain, &vlapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 6eeae0a..bf6090e 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -344,7 +344,7 @@ static int msixtbl_range(struct vcpu *v, unsigned long addr)
     return !!desc;
 }
 
-const struct hvm_mmio_ops msixtbl_mmio_ops = {
+static const struct hvm_mmio_ops msixtbl_mmio_ops = {
     .check = msixtbl_range,
     .read = msixtbl_read,
     .write = msixtbl_write
@@ -481,6 +481,11 @@ found:
     spin_unlock_irq(&irq_desc->lock);
 }
 
+void msixtbl_init(struct domain *d)
+{
+    register_mmio_handler(d, &msixtbl_mmio_ops);
+}
+
 void msixtbl_pt_cleanup(struct domain *d)
 {
     struct msixtbl_entry *entry, *temp;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index a832fdb..5908bf3 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -868,6 +868,20 @@ static void guest_iommu_reg_init(struct guest_iommu *iommu)
     iommu->reg_ext_feature.hi = upper;
 }
 
+static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
+{
+    struct guest_iommu *iommu = vcpu_iommu(v);
+
+    return iommu && addr >= iommu->mmio_base &&
+           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
+}
+
+static const struct hvm_mmio_ops iommu_mmio_ops = {
+    .check = guest_iommu_mmio_range,
+    .read = guest_iommu_mmio_read,
+    .write = guest_iommu_mmio_write
+};
+
 /* Domain specific initialization */
 int guest_iommu_init(struct domain* d)
 {
@@ -894,6 +908,8 @@ int guest_iommu_init(struct domain* d)
 
     spin_lock_init(&iommu->lock);
 
+    register_mmio_handler(d, &iommu_mmio_ops);
+
     return 0;
 }
 
@@ -910,17 +926,3 @@ void guest_iommu_destroy(struct domain *d)
 
     domain_hvm_iommu(d)->arch.g_iommu = NULL;
 }
-
-static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
-{
-    struct guest_iommu *iommu = vcpu_iommu(v);
-
-    return iommu && addr >= iommu->mmio_base &&
-           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
-}
-
-const struct hvm_mmio_ops iommu_mmio_ops = {
-    .check = guest_iommu_mmio_range,
-    .read = guest_iommu_mmio_read,
-    .write = guest_iommu_mmio_write
-};
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index ad68fcf..b612755 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -94,6 +94,7 @@ struct hvm_domain {
     struct pl_time         pl_time;
 
     struct hvm_io_handler *io_handler;
+    unsigned int          io_handler_count;
 
     /* Lock protects access to irq, vpic and vioapic. */
     spinlock_t             irq_lock;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 1ed0c13..3f6a39a 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -25,10 +25,7 @@
 #include <public/hvm/ioreq.h>
 #include <public/event_channel.h>
 
-#define MAX_IO_HANDLER             16
-
-#define HVM_PORTIO                  0
-#define HVM_BUFFERED_IO             2
+#define NR_IO_HANDLERS 32
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
@@ -40,82 +37,60 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
-typedef int (*portio_action_t)(
-    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
-typedef int (*mmio_action_t)(ioreq_t *);
-struct io_handler {
-    int                 type;
-    unsigned long       addr;
-    unsigned int        size;
-    union {
-        portio_action_t portio;
-        mmio_action_t   mmio;
-        void           *ptr;
-    } action;
-};
-
-struct hvm_io_handler {
-    int     num_slot;
-    struct  io_handler hdl_list[MAX_IO_HANDLER];
-};
-
 struct hvm_mmio_ops {
     hvm_mmio_check_t check;
     hvm_mmio_read_t  read;
     hvm_mmio_write_t write;
 };
 
-extern const struct hvm_mmio_ops hpet_mmio_ops;
-extern const struct hvm_mmio_ops vlapic_mmio_ops;
-extern const struct hvm_mmio_ops vioapic_mmio_ops;
-extern const struct hvm_mmio_ops msixtbl_mmio_ops;
-extern const struct hvm_mmio_ops iommu_mmio_ops;
-
-#define HVM_MMIO_HANDLER_NR 5
+typedef int (*portio_action_t)(
+    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 
-int hvm_io_intercept(ioreq_t *p, int type);
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned int size,
-    void *action, int type);
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned int size, int type);
+struct hvm_io_handler {
+    union {
+        struct {
+            const struct hvm_mmio_ops *ops;
+        } mmio;
+        struct {
+            unsigned int start, end;
+            portio_action_t action;
+        } portio;
+    };
+    uint8_t type;
+};
 
-static inline int hvm_portio_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_PORTIO);
-}
+typedef int (*hvm_io_read_t)(const struct hvm_io_handler *,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t *data);
+typedef int (*hvm_io_write_t)(const struct hvm_io_handler *,
+                              uint64_t addr,
+                              uint32_t size,
+                              uint64_t data);
+typedef bool_t (*hvm_io_accept_t)(const struct hvm_io_handler *,
+                                  const ioreq_t *p);
+struct hvm_io_ops {
+    hvm_io_accept_t accept;
+    hvm_io_read_t   read;
+    hvm_io_write_t  write;
+};
 
-static inline int hvm_buffered_io_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_BUFFERED_IO);
-}
+int hvm_io_intercept(ioreq_t *p);
 
 bool_t hvm_mmio_internal(paddr_t gpa);
-int hvm_mmio_intercept(ioreq_t *p);
-int hvm_buffered_io_send(ioreq_t *p);
 
-static inline void register_portio_handler(
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops);
+
+void register_portio_handler(
     struct domain *d, uint16_t port, unsigned int size,
-    portio_action_t action)
-{
-    register_io_handler(d, port, size, action, HVM_PORTIO);
-}
+    portio_action_t action);
 
-static inline void relocate_portio_handler(
+void relocate_portio_handler(
     struct domain *d, uint16_t old_port, uint16_t new_port,
-    unsigned int size)
-{
-    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
-}
-
-static inline void register_buffered_io_handler(
-    struct domain *d, unsigned long addr,
-    unsigned int size, mmio_action_t action)
-{
-    register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
-}
+    unsigned int size);
 
+int hvm_buffered_io_send(ioreq_t *p);
 void send_timeoffset_req(unsigned long timeoff);
 void send_invalidate_req(void);
 int handle_mmio(void);
@@ -127,6 +102,7 @@ void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
+void msixtbl_init(struct domain *d);
 
 struct hvm_hw_stdvga {
     uint8_t sr_index;
@@ -141,8 +117,19 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
+int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
 #endif /* __ASM_X86_HVM_IO_H__ */
 
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 06/16] x86/hvm: add length to mmio check op
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (4 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 16:37   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

When memory mapped I/O is range checked by internal handlers, the length
of the access should be taken into account.
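
By way of illustration (not part of the patch, and the address below is
made up), the new hvm_mmio_first_byte()/hvm_mmio_last_byte() helpers
bound a repeated access like this:

    /* A 4-rep, 4-byte MMIO access with the direction flag set. */
    ioreq_t p = {
        .type = IOREQ_TYPE_COPY,
        .addr = 0xfee0100c,
        .size = 4,
        .count = 4,
        .df = 1,
    };

    /* first == 0xfee0100c - 3 * 4 == 0xfee01000 */
    paddr_t first = hvm_mmio_first_byte(&p);

    /* last == 0xfee0100c + 4 - 1 == 0xfee0100f */
    paddr_t last = hvm_mmio_last_byte(&p);

Both ends of the range are now presented to the handler's check() op,
rather than just p->addr.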

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/intercept.c |   22 +++++++++++++++++++---
 xen/include/asm-x86/hvm/io.h |   16 ++++++++++++++++
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 7d36785..42050f4 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -35,9 +35,19 @@
 static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
                               const ioreq_t *p)
 {
+    paddr_t first = hvm_mmio_first_byte(p);
+    paddr_t last = hvm_mmio_last_byte(p);
+
     BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-    return handler->mmio.ops->check(current, p->addr);
+    if ( !handler->mmio.ops->check(current, first) )
+        return 0;
+
+    if ( p->size > 1 &&
+         !handler->mmio.ops->check(current, last) )
+        domain_crash(current->domain);
+
+    return 1;
 }
 
 static int hvm_mmio_read(const struct hvm_io_handler *handler,
@@ -112,7 +122,8 @@ static const struct hvm_io_ops portio_ops = {
 static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                                     ioreq_t *p)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops =
         (p->type == IOREQ_TYPE_COPY) ?
         &mmio_ops :
@@ -223,6 +234,9 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 
     if ( i != 0 )
     {
+        if ( rc == X86EMUL_UNHANDLEABLE )
+            domain_crash(curr->domain);
+
         p->count = i;
         rc = X86EMUL_OKAY;
     }
@@ -342,7 +356,9 @@ bool_t hvm_mmio_internal(paddr_t gpa)
 {
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
-        .addr = gpa
+        .addr = gpa,
+        .count = 1,
+        .size = 1
     };
 
     return (hvm_find_io_handler(&p) != NULL);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 3f6a39a..78c9704 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -43,6 +43,22 @@ struct hvm_mmio_ops {
     hvm_mmio_write_t write;
 };
 
+static inline paddr_t hvm_mmio_first_byte(const ioreq_t *p)
+{
+    return p->df ?
+           p->addr - (p->count - 1ul) * p->size :
+           p->addr;
+}
+
+static inline paddr_t hvm_mmio_last_byte(const ioreq_t *p)
+{
+    unsigned long count = p->count;
+
+    return p->df ?
+           p->addr + p->size - 1:
+           p->addr + (count * p->size) - 1;
+}
+
 typedef int (*portio_action_t)(
     int dir, uint16_t port, unsigned int bytes, uint32_t *val);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (5 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 06/16] x86/hvm: add length to mmio check op Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 16:50   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

This patch re-works the dpci portio intercepts so that they can be unified
with standard portio handling, thereby removing a substantial amount of
code duplication.
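
As a rough sketch of the resulting flow (the structure and field names
are taken from the code below; the port numbers are purely
hypothetical): if the guest's I/O ports 0x300-0x307 are mapped to
machine ports 0xc100-0xc107 (g2m_ioport->gport == 0x300, ->np == 8,
->mport == 0xc100), then dpci_portio_accept() matches a 2-byte guest
access to port 0x304, caches the mapping in vio->g2m_ioport, and the
read op forwards it to the machine port:

    unsigned int mport = (0x304 - g2m_ioport->gport) + g2m_ioport->mport;

    *data = inw(mport); /* mport == 0xc104 */

The rep/chunking and guest-memory copying are left to the common
hvm_process_io_intercept() code, which is what allows the dpci-specific
loops to be dropped.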

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c         |    2 +
 xen/arch/x86/hvm/intercept.c   |   22 ++--
 xen/arch/x86/hvm/io.c          |  220 ++++++++++++----------------------------
 xen/include/asm-x86/hvm/io.h   |    5 +
 xen/include/asm-x86/hvm/vcpu.h |    2 +
 xen/include/xen/iommu.h        |    1 -
 6 files changed, 80 insertions(+), 172 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 60f8972..6643e0c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1486,6 +1486,8 @@ int hvm_domain_initialise(struct domain *d)
     else
         d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
 
+    register_dpci_portio_handler(d);
+
     if ( is_pvh_domain(d) )
     {
         register_portio_handler(d, 0, 0x10003, handle_pvh_io);
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 42050f4..a613865 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -124,10 +124,7 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    const struct hvm_io_ops *ops =
-        (p->type == IOREQ_TYPE_COPY) ?
-        &mmio_ops :
-        &portio_ops;
+    const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
     uint64_t addr;
@@ -247,10 +244,6 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 static const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 {
     struct domain *curr_d = current->domain;
-    const struct hvm_io_ops *ops =
-        (p->type == IOREQ_TYPE_COPY) ?
-        &mmio_ops :
-        &portio_ops;
     unsigned int i;
 
     BUG_ON((p->type != IOREQ_TYPE_PIO) &&
@@ -260,6 +253,7 @@ static const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
     {
         const struct hvm_io_handler *handler =
             &curr_d->arch.hvm_domain.io_handler[i];
+        const struct hvm_io_ops *ops = handler->ops;
 
         if ( handler->type != p->type )
             continue;
@@ -275,13 +269,7 @@ int hvm_io_intercept(ioreq_t *p)
 {
     const struct hvm_io_handler *handler;
 
-    if ( p->type == IOREQ_TYPE_PIO )
-    {
-        int rc = dpci_ioport_intercept(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
-    else if ( p->type == IOREQ_TYPE_COPY )
+    if ( p->type == IOREQ_TYPE_COPY )
     {
         int rc = stdvga_intercept_mmio(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
@@ -296,7 +284,7 @@ int hvm_io_intercept(ioreq_t *p)
     return hvm_process_io_intercept(handler, p);
 }
 
-static struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
 {
     unsigned int i = d->arch.hvm_domain.io_handler_count++;
 
@@ -315,6 +303,7 @@ void register_mmio_handler(struct domain *d,
     struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
     handler->type = IOREQ_TYPE_COPY;
+    handler->ops = &mmio_ops;
     handler->mmio.ops = ops;
 }
 
@@ -324,6 +313,7 @@ void register_portio_handler(struct domain *d, uint16_t port,
     struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
     handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &portio_ops;
     handler->portio.start = port;
     handler->portio.end = port + size;
     handler->portio.action = action;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index c0964ec..2c88ddb 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -208,185 +208,95 @@ void hvm_io_assist(ioreq_t *p)
     }
 }
 
-static int dpci_ioport_read(uint32_t mport, ioreq_t *p)
+static bool_t dpci_portio_accept(const struct hvm_io_handler *handler,
+                                 const ioreq_t *p)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data = 0;
+    struct vcpu *curr = current;
+    struct hvm_iommu *hd = domain_hvm_iommu(curr->domain);
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    struct g2m_ioport *g2m_ioport;
+    unsigned int start, end;
 
-    for ( i = 0; i < p->count; i++ )
+    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
     {
-        if ( vio->mmio_retrying )
-        {
-            if ( vio->mmio_large_read_bytes != p->size )
-                return X86EMUL_UNHANDLEABLE;
-            memcpy(&data, vio->mmio_large_read, p->size);
-            vio->mmio_large_read_bytes = 0;
-            vio->mmio_retrying = 0;
-        }
-        else switch ( p->size )
+        start = g2m_ioport->gport;
+        end = start + g2m_ioport->np;
+        if ( (p->addr >= start) && (p->addr + p->size <= end) )
         {
-        case 1:
-            data = inb(mport);
-            break;
-        case 2:
-            data = inw(mport);
-            break;
-        case 4:
-            data = inl(mport);
-            break;
-        default:
-            BUG();
+            vio->g2m_ioport = g2m_ioport;
+            return 1;
         }
-
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-        else
-            p->data = data;
     }
 
-    if ( rc == X86EMUL_RETRY )
-    {
-        vio->mmio_retry = 1;
-        vio->mmio_large_read_bytes = p->size;
-        memcpy(vio->mmio_large_read, &data, p->size);
-    }
-
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
-
-    return rc;
+    return 0;
 }
 
-static int dpci_ioport_write(uint32_t mport, ioreq_t *p)
+static int dpci_portio_read(const struct hvm_io_handler *handler,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t *data)
 {
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    for ( i = 0; i < p->count; i++ )
-    {
-        data = p->data;
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-
-        switch ( p->size )
-        {
-        case 1:
-            outb(data, mport);
-            break;
-        case 2:
-            outw(data, mport);
-            break;
-        case 4:
-            outl(data, mport);
-            break;
-        default:
-            BUG();
-        }
-    }
-
-    if ( rc == X86EMUL_RETRY )
-        current->arch.hvm_vcpu.hvm_io.mmio_retry = 1;
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    if ( i != 0 )
+    switch ( size )
     {
-        p->count = i;
-        rc = X86EMUL_OKAY;
+    case 1:
+        *data = inb(mport);
+        break;
+    case 2:
+        *data = inw(mport);
+        break;
+    case 4:
+        *data = inl(mport);
+        break;
+    default:
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
 }
 
-int dpci_ioport_intercept(ioreq_t *p)
+static int dpci_portio_write(const struct hvm_io_handler *handler,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t data)
 {
-    struct domain *d = current->domain;
-    struct hvm_iommu *hd = domain_hvm_iommu(d);
-    struct g2m_ioport *g2m_ioport;
-    unsigned int mport, gport = p->addr;
-    unsigned int s = 0, e = 0;
-    int rc;
-
-    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
-    {
-        s = g2m_ioport->gport;
-        e = s + g2m_ioport->np;
-        if ( (gport >= s) && (gport < e) )
-            goto found;
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-
- found:
-    mport = (gport - s) + g2m_ioport->mport;
-
-    if ( !ioports_access_permitted(d, mport, mport + p->size - 1) ) 
-    {
-        gdprintk(XENLOG_ERR, "Error: access to gport=%#x denied!\n",
-                 (uint32_t)p->addr);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    switch ( p->dir )
+    switch ( size )
     {
-    case IOREQ_READ:
-        rc = dpci_ioport_read(mport, p);
+    case 1:
+        outb(data, mport);
         break;
-    case IOREQ_WRITE:
-        rc = dpci_ioport_write(mport, p);
+    case 2:
+        outw(data, mport);
+        break;
+    case 4:
+        outl(data, mport);
         break;
     default:
-        gdprintk(XENLOG_ERR, "Error: couldn't handle p->dir = %d", p->dir);
-        rc = X86EMUL_UNHANDLEABLE;
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops dpci_portio_ops = {
+    .accept = dpci_portio_accept,
+    .read = dpci_portio_read,
+    .write = dpci_portio_write
+};
+
+void register_dpci_portio_handler(struct domain *d)
+{
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
+
+    handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &dpci_portio_ops;
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 78c9704..740592f 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -72,6 +72,7 @@ struct hvm_io_handler {
             portio_action_t action;
         } portio;
     };
+    const struct hvm_io_ops *ops;
     uint8_t type;
 };
 
@@ -94,6 +95,7 @@ struct hvm_io_ops {
 int hvm_io_intercept(ioreq_t *p);
 
 bool_t hvm_mmio_internal(paddr_t gpa);
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d);
 
 void register_mmio_handler(struct domain *d,
                            const struct hvm_mmio_ops *ops);
@@ -137,6 +139,9 @@ int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
+
+void register_dpci_portio_handler(struct domain *d);
+
 #endif /* __ASM_X86_HVM_IO_H__ */
 
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 3d8f4dc..b15c1e6 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -77,6 +77,8 @@ struct hvm_vcpu_io {
     bool_t mmio_retry, mmio_retrying;
 
     unsigned long msix_unmask_address;
+
+    const struct g2m_ioport *g2m_ioport;
 };
 
 #define VMCX_EADDR    (~0ULL)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index b30bf41..1d00696 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -93,7 +93,6 @@ void pt_pci_init(void);
 
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
-int dpci_ioport_intercept(ioreq_t *p);
 int pt_irq_create_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 int pt_irq_destroy_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (6 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 16:55   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

It's clear from the following check in hvmemul_rep_movs:

    if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
         (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
        return X86EMUL_UNHANDLEABLE;

that mmio <-> mmio copy is not handled. This means the code in the
stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
hvm_copy_to/from_guest_phys() fails is never going to be executed.

This patch therefore adds a check in hvmemul_do_io_addr() to make sure
mmio <-> mmio is disallowed and then registers standard mmio intercept ops
in stdvga_init().

With this patch all mmio and portio handled within Xen now goes through
process_io_intercept().
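
As a condensed illustration (not a verbatim quote of the code below) of
how the stdvga lock is now managed through the generic dispatcher and
the new complete() op:

    /* ->accept() (stdvga_mem_accept()) takes s->lock on a match... */
    handler = hvm_find_io_handler(p);

    /* ...->read()/->write() then run with the lock held... */
    rc = hvm_process_io_intercept(handler, p);

    /* ...and ->complete() (stdvga_mem_complete()) drops it again. */
    if ( handler->ops->complete != NULL )
        handler->ops->complete(handler);

An access that accept() rejects must release the lock itself before the
request is left to the device model.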

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c   |    9 ++
 xen/arch/x86/hvm/intercept.c |   30 ++++--
 xen/arch/x86/hvm/stdvga.c    |  207 +++++++++++++++---------------------------
 xen/include/asm-x86/hvm/io.h |   10 +-
 4 files changed, 109 insertions(+), 147 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 205ca5e..72603f5 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -266,6 +266,15 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
         return X86EMUL_RETRY;
     }
 
+    /* This code should not be reached if the gmfn is not RAM */
+    if ( p2m_is_mmio(p2mt) )
+    {
+        domain_crash(curr_d);
+
+        put_page(*page);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index a613865..415d342 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -268,20 +268,21 @@ static const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 int hvm_io_intercept(ioreq_t *p)
 {
     const struct hvm_io_handler *handler;
-
-    if ( p->type == IOREQ_TYPE_COPY )
-    {
-        int rc = stdvga_intercept_mmio(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
+    const struct hvm_io_ops *ops;
+    int rc;
 
     handler = hvm_find_io_handler(p);
 
     if ( handler == NULL )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvm_process_io_intercept(handler, p);
+    rc = hvm_process_io_intercept(handler, p);
+
+    ops = handler->ops;
+    if ( ops->complete != NULL )
+        ops->complete(handler);
+
+    return rc;
 }
 
 struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
@@ -344,6 +345,8 @@ void relocate_portio_handler(struct domain *d, uint16_t old_port,
 
 bool_t hvm_mmio_internal(paddr_t gpa)
 {
+    const struct hvm_io_handler *handler;
+    const struct hvm_io_ops *ops;
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
         .addr = gpa,
@@ -351,7 +354,16 @@ bool_t hvm_mmio_internal(paddr_t gpa)
         .size = 1
     };
 
-    return (hvm_find_io_handler(&p) != NULL);
+    handler = hvm_find_io_handler(&p);
+
+    if ( handler == NULL )
+        return 0;
+
+    ops = handler->ops;
+    if ( ops->complete != NULL )
+        ops->complete(handler);
+
+    return 1;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 30760fa..f13983e 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -148,7 +148,7 @@ static int stdvga_outb(uint64_t addr, uint8_t val)
     }
     else if ( prev_stdvga && !s->stdvga )
     {
-        gdprintk(XENLOG_INFO, "leaving stdvga\n");
+        gdprintk(XENLOG_INFO, "leaving stdvga mode\n");
     }
 
     return rc;
@@ -275,7 +275,8 @@ static uint8_t stdvga_mem_readb(uint64_t addr)
     return ret;
 }
 
-static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
+static int stdvga_mem_read(const struct hvm_io_handler *handler,
+                           uint64_t addr, uint32_t size, uint64_t *p_data)
 {
     uint64_t data = 0;
 
@@ -309,11 +310,12 @@ static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size);
         break;
     }
 
-    return data;
+    *p_data = data;
+    return X86EMUL_OKAY;
 }
 
 static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
@@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
     }
 }
 
-static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
+static int stdvga_mem_write(const struct hvm_io_handler *handler,
+                            uint64_t addr, uint32_t size,
+                            uint64_t data)
 {
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
+    ioreq_t p = { .type = IOREQ_TYPE_COPY,
+                  .addr = addr,
+                  .size = size,
+                  .count = 1,
+                  .dir = IOREQ_WRITE,
+                  .data = data,
+    };
+
+    if ( !s->cache )
+        goto done;
+
     /* Intercept mmio write */
     switch ( size )
     {
@@ -457,156 +473,72 @@ static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size);
         break;
     }
-}
-
-static uint32_t read_data;
-
-static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
-{
-    int i;
-    uint64_t addr = p->addr;
-    p2m_type_t p2mt;
-    struct domain *d = current->domain;
-    int step = p->df ? -p->size : p->size;
-
-    if ( p->data_is_ptr )
-    {
-        uint64_t data = p->data, tmp;
 
-        if ( p->dir == IOREQ_READ )
-        {
-            for ( i = 0; i < p->count; i++ ) 
-            {
-                tmp = stdvga_mem_read(addr, p->size);
-                if ( hvm_copy_to_guest_phys(data, &tmp, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                            d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    /*
-                     * The only case we handle is vga_mem <-> vga_mem.
-                     * Anything else disables caching and leaves it to qemu-dm.
-                     */
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    stdvga_mem_write(data, tmp, p->size);
-                }
-                data += step;
-                addr += step;
-            }
-        }
-        else
-        {
-            for ( i = 0; i < p->count; i++ )
-            {
-                if ( hvm_copy_from_guest_phys(&tmp, data, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                        d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    tmp = stdvga_mem_read(data, p->size);
-                }
-                stdvga_mem_write(addr, tmp, p->size);
-                data += step;
-                addr += step;
-            }
-        }
-    }
-    else if ( p->dir == IOREQ_WRITE )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            stdvga_mem_write(addr, p->data, p->size);
-            addr += step;
-        }
-    }
-    else
-    {
-        ASSERT(p->count == 1);
-        p->data = stdvga_mem_read(addr, p->size);
-    }
+ done:
+    if ( hvm_buffered_io_send(&p) )
+        return X86EMUL_OKAY;
 
-    read_data = p->data;
-    return 1;
+    return X86EMUL_UNHANDLEABLE;
 }
 
-int stdvga_intercept_mmio(ioreq_t *p)
+static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
+                                const ioreq_t *p)
 {
-    struct domain *d = current->domain;
-    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
-    uint64_t start, end, count = p->count, size = p->size;
-    int buf = 0, rc;
-
-    if ( p->df )
-    {
-        start = (p->addr - (count - 1) * size);
-        end = p->addr + size;
-    }
-    else
-    {
-        start = p->addr;
-        end = p->addr + count * size;
-    }
-
-    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( p->size > 8 )
-    {
-        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
 
     spin_lock(&s->lock);
 
-    if ( s->stdvga && s->cache )
+    if ( !s->stdvga ||
+         (p->addr < VGA_MEM_BASE) ||
+         ((p->addr + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        goto reject;
+
+    if ( p->dir == IOREQ_WRITE && p->count > 1 )
     {
-        switch ( p->type )
+        /*
+         * We cannot return X86EMUL_UNHANDLEABLE on anything other than the
+         * first cycle of an I/O. So, since we cannot guarantee to always be
+         * able to buffer writes, we have to reject any multi-cycle I/O and,
+         * since we are rejecting an I/O, we must invalidate the cache.
+         * Single-cycle write transactions are accepted even if the cache is
+         * not active since we can assert, when in stdvga mode, that writes
+         * to VRAM have no side effect and thus we can try to buffer them.
+         */
+        if ( s->cache )
         {
-        case IOREQ_TYPE_COPY:
-            buf = mmio_move(s, p);
-            if ( !buf )
-                s->cache = 0;
-            break;
-        default:
-            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
-                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
-                     "isptr:%d dir:%d df:%d\n",
-                     p->type, (int)p->addr, (int)p->data, (int)p->size,
-                     (int)p->count, p->state,
-                     p->data_is_ptr, p->dir, p->df);
+            gdprintk(XENLOG_INFO, "leaving caching mode\n");
             s->cache = 0;
         }
+
+        goto reject;
     }
-    else
-    {
-        buf = (p->dir == IOREQ_WRITE);
-    }
+    else if ( p->dir == IOREQ_READ && !s->cache )
+        goto reject;
 
-    rc = (buf && hvm_buffered_io_send(p));
+    return 1;
 
+ reject:
     spin_unlock(&s->lock);
+    return 0;
+}
 
-    return rc ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
+static void stdvga_mem_complete(const struct hvm_io_handler *handler)
+{
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
+
+    spin_unlock(&s->lock);
 }
 
+static const struct hvm_io_ops stdvga_mem_ops = {
+    .accept = stdvga_mem_accept,
+    .read = stdvga_mem_read,
+    .write = stdvga_mem_write,
+    .complete = stdvga_mem_complete
+};
+
 void stdvga_init(struct domain *d)
 {
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
@@ -630,10 +562,17 @@ void stdvga_init(struct domain *d)
 
     if ( i == ARRAY_SIZE(s->vram_page) )
     {
+        struct hvm_io_handler *handler;
+
         /* Sequencer registers. */
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
+
+        /* VGA memory */
+        handler = hvm_next_io_handler(d);
+        handler->type = IOREQ_TYPE_COPY;
+        handler->ops = &stdvga_mem_ops;
     }
 }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 740592f..1ca0d16 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -86,10 +86,13 @@ typedef int (*hvm_io_write_t)(const struct hvm_io_handler *,
                               uint64_t data);
 typedef bool_t (*hvm_io_accept_t)(const struct hvm_io_handler *,
                                   const ioreq_t *p);
+typedef void (*hvm_io_complete_t)(const struct hvm_io_handler *);
+
 struct hvm_io_ops {
-    hvm_io_accept_t accept;
-    hvm_io_read_t   read;
-    hvm_io_write_t  write;
+    hvm_io_accept_t   accept;
+    hvm_io_read_t     read;
+    hvm_io_write_t    write;
+    hvm_io_complete_t complete;
 };
 
 int hvm_io_intercept(ioreq_t *p);
@@ -135,7 +138,6 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
-int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (7 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-02 17:10   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

By limiting hvmemul_do_io_addr() to reps falling within the page on which
a reference has already been taken, we can guarantee that calls to
hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic (added
by c/s 82ed8716b "fix direct PCI port I/O emulation retry and error
handling") from the intercept code and simplify it significantly.

Normally hvmemul_do_io_addr() will only reference a single page at a time.
It will, however, take an extra page reference for I/O spanning a page
boundary.

It is still important to know, upon returning from x86_emulate(), whether
the number of reps was reduced, so the mmio_retry flag is retained for
that purpose.
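
For example (the numbers here are made up; the clamp itself is in the
hvmemul_do_io_addr() hunk below), a forwards (df == 0) rep of 16 4-byte
accesses whose buffer starts at page offset 0xff8 is limited like this:

    unsigned long reps = 16, size = 4, page_off = 0xff8;
    unsigned long count;

    count = min_t(unsigned long, reps, (PAGE_SIZE - page_off) / size);

    /*
     * count == (0x1000 - 0xff8) / 4 == 2, so only 2 reps are emulated
     * this time round and mmio_retry is set because count < reps; the
     * remainder is picked up when hvm_do_resume() re-invokes
     * handle_mmio().  Had page_off been 0xffe the clamp would yield 0,
     * so a reference is taken on the following page as well and a
     * single page-spanning rep is performed.
     */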

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |   73 +++++++++++++++++++++++++---------------
 xen/arch/x86/hvm/hvm.c         |    4 +++
 xen/arch/x86/hvm/intercept.c   |   57 +++++++------------------------
 xen/include/asm-x86/hvm/vcpu.h |    2 +-
 4 files changed, 63 insertions(+), 73 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 72603f5..83b4e5f 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -51,7 +51,7 @@ static void hvmtrace_io_assist(const ioreq_t *p)
 }
 
 static int hvmemul_do_io(
-    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+    bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
@@ -60,6 +60,7 @@ static int hvmemul_do_io(
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
+        .count = reps,
         .dir = dir,
         .df = df,
         .data = data,
@@ -125,15 +126,6 @@ static int hvmemul_do_io(
         HVMIO_dispatched : HVMIO_awaiting_completion;
     vio->io_size = size;
 
-    /*
-     * When retrying a repeated string instruction, force exit to guest after
-     * completion of the retried iteration to allow handling of interrupts.
-     */
-    if ( vio->mmio_retrying )
-        *reps = 1;
-
-    p.count = *reps;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -147,17 +139,9 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-    case X86EMUL_RETRY:
-        *reps = p.count;
         p.state = STATE_IORESP_READY;
-        if ( !vio->mmio_retry )
-        {
-            hvm_io_assist(&p);
-            vio->io_state = HVMIO_none;
-        }
-        else
-            /* Defer hvm_io_assist() invocation to hvm_do_resume(). */
-            vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+        hvm_io_assist(&p);
+        vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -235,7 +219,7 @@ static int hvmemul_do_io_buffer(
 
     BUG_ON(buffer == NULL);
 
-    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
+    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
                        (uintptr_t)buffer);
     if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
         memset(buffer, 0xff, size);
@@ -287,17 +271,53 @@ static int hvmemul_do_io_addr(
     bool_t is_mmio, paddr_t addr, unsigned long *reps,
     unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
 {
-    struct page_info *ram_page;
+    struct vcpu *v = current;
+    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
+    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
+    struct page_info *ram_page[2];
+    int nr_pages = 0;
+    unsigned long count;
     int rc;
 
-    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
+    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
     if ( rc != X86EMUL_OKAY )
-        return rc;
+        goto out;
 
-    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
+    nr_pages++;
+
+    /* Determine how many reps will fit within this page */
+    count = min_t(unsigned long,
+                  *reps,
+                  df ?
+                  (page_off + size - 1) / size :
+                  (PAGE_SIZE - page_off) / size);
+
+    if ( count == 0 )
+    {
+        /*
+         * This access must span two pages, so grab a reference to
+         * the next page and do a single rep.
+         */
+        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
+                                  &ram_page[nr_pages]);
+        if ( rc != X86EMUL_OKAY )
+            goto out;
+
+        nr_pages++;
+        count = 1;
+    }
+
+    rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
                        ram_gpa);
+    if ( rc == X86EMUL_OKAY )
+    {
+        v->arch.hvm_vcpu.hvm_io.mmio_retry = (count < *reps);
+        *reps = count;
+    }
 
-    hvmemul_release_page(ram_page);
+ out:
+    while ( --nr_pages >= 0 )
+        hvmemul_release_page(ram_page[nr_pages]);
 
     return rc;
 }
@@ -1547,7 +1567,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     }
 
     hvmemul_ctxt->exn_pending = 0;
-    vio->mmio_retrying = vio->mmio_retry;
     vio->mmio_retry = 0;
 
     if ( cpu_has_vmx )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6643e0c..a7de030 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -440,6 +440,7 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 
 void hvm_do_resume(struct vcpu *v)
 {
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
 
@@ -468,6 +469,9 @@ void hvm_do_resume(struct vcpu *v)
         }
     }
 
+    if ( vio->mmio_retry )
+        (void)handle_mmio();
+
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) 
     {
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 415d342..11b6028 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -122,8 +122,6 @@ static const struct hvm_io_ops portio_ops = {
 static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                                     ioreq_t *p)
 {
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
@@ -133,23 +131,12 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
     {
         for ( i = 0; i < p->count; i++ )
         {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                addr = (p->type == IOREQ_TYPE_COPY) ?
-                       p->addr + step * i :
-                       p->addr;
-                rc = ops->read(handler, addr, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                   p->addr + step * i :
+                   p->addr;
+            rc = ops->read(handler, addr, p->size, &data);
+            if ( rc != X86EMUL_OKAY )
+                break;
 
             if ( p->data_is_ptr )
             {
@@ -158,14 +145,12 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     /* Drop the write as real hardware would. */
                     continue;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
@@ -178,13 +163,6 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
             else
                 p->data = data;
         }
-
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
     }
     else /* p->dir == IOREQ_WRITE */
     {
@@ -197,14 +175,12 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     data = ~0;
                     break;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
@@ -224,19 +200,10 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
             if ( rc != X86EMUL_OKAY )
                 break;
         }
-
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
     }
 
-    if ( i != 0 )
-    {
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            domain_crash(curr->domain);
-
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
+        domain_crash(current->domain);
 
     return rc;
 }
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index b15c1e6..2ed0606 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -74,7 +74,7 @@ struct hvm_vcpu_io {
      * For string instruction emulation we need to be able to signal a
      * necessary retry through other than function return codes.
      */
-    bool_t mmio_retry, mmio_retrying;
+    bool_t mmio_retry;
 
     unsigned long msix_unmask_address;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (8 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:03   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser

By removing the calls in hvmemul_do_io() (which is replaced by a single
assignment) and hvm_complete_assist_req() (which is replaced by a call
to hvm_process_io_intercept() with a suitable set of ops),
hvm_io_assist() can be moved into hvm.c and made static (and hence be a
candidate for inlining).

The calls to msix_write_completion() and vcpu_end_shutdown_deferral()
are also made unconditionally because the ioreq state will always be
STATE_IOREQ_NONE at the end of hvm_io_assist(), so the test was
pointless. These calls are only relevant when the emulation has been
handled externally, which is now always the case.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c    |   34 ++++++++++++++++++---
 xen/arch/x86/hvm/hvm.c        |   67 ++++++++++++++++++++++-------------------
 xen/arch/x86/hvm/intercept.c  |    4 +--
 xen/arch/x86/hvm/io.c         |   39 ------------------------
 xen/include/asm-x86/hvm/hvm.h |    1 -
 xen/include/asm-x86/hvm/io.h  |    3 +-
 6 files changed, 70 insertions(+), 78 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 83b4e5f..e17e6ab 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -50,6 +50,32 @@ static void hvmtrace_io_assist(const ioreq_t *p)
     trace_var(event, 0/*!cycles*/, size, buffer);
 }
 
+static int null_read(const struct hvm_io_handler *io_handler,
+                     uint64_t addr,
+                     uint32_t size,
+                     uint64_t *data)
+{
+    *data = ~0ul;
+    return X86EMUL_OKAY;
+}
+
+static int null_write(const struct hvm_io_handler *handler,
+                      uint64_t addr,
+                      uint32_t size,
+                      uint64_t data)
+{
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops null_ops = {
+    .read = null_read,
+    .write = null_write
+};
+
+static const struct hvm_io_handler null_handler = {
+    .ops = &null_ops
+};
+
 static int hvmemul_do_io(
     bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
@@ -139,8 +165,7 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        p.state = STATE_IORESP_READY;
-        hvm_io_assist(&p);
+        vio->io_data = p.data;
         vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
@@ -151,8 +176,9 @@ static int hvmemul_do_io(
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            hvm_complete_assist_req(&p);
-            rc = X86EMUL_OKAY;
+            rc = hvm_process_io_intercept(&null_handler, &p);
+            if ( rc == X86EMUL_OKAY )
+                vio->io_data = p.data;
             vio->io_state = HVMIO_none;
         }
         else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a7de030..d498151 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -411,6 +411,42 @@ bool_t hvm_io_pending(struct vcpu *v)
     return 0;
 }
 
+static void hvm_io_assist(ioreq_t *p)
+{
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    enum hvm_io_state io_state;
+
+    p->state = STATE_IOREQ_NONE;
+
+    io_state = vio->io_state;
+    vio->io_state = HVMIO_none;
+
+    switch ( io_state )
+    {
+    case HVMIO_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        break;
+    case HVMIO_handle_mmio_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        (void)handle_mmio();
+        break;
+    case HVMIO_handle_pio_awaiting_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)p->data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
+        break;
+    default:
+        break;
+    }
+
+    msix_write_completion(curr);
+    vcpu_end_shutdown_deferral(curr);
+}
+
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
     /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
@@ -2667,37 +2703,6 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
     return X86EMUL_UNHANDLEABLE;
 }
 
-void hvm_complete_assist_req(ioreq_t *p)
-{
-    switch ( p->type )
-    {
-    case IOREQ_TYPE_PCI_CONFIG:
-        ASSERT_UNREACHABLE();
-        break;
-    case IOREQ_TYPE_COPY:
-    case IOREQ_TYPE_PIO:
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( !p->data_is_ptr )
-                p->data = ~0ul;
-            else
-            {
-                int i, step = p->df ? -p->size : p->size;
-                uint32_t data = ~0;
-
-                for ( i = 0; i < p->count; i++ )
-                    hvm_copy_to_guest_phys(p->data + step * i, &data,
-                                           p->size);
-            }
-        }
-        /* FALLTHRU */
-    default:
-        p->state = STATE_IORESP_READY;
-        hvm_io_assist(p);
-        break;
-    }
-}
-
 void hvm_broadcast_assist_req(ioreq_t *p)
 {
     struct domain *d = current->domain;
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 11b6028..f84f7c5 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -119,8 +119,8 @@ static const struct hvm_io_ops portio_ops = {
     .write = hvm_portio_write
 };
 
-static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
-                                    ioreq_t *p)
+int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                             ioreq_t *p)
 {
     const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 2c88ddb..fe099d8 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -169,45 +169,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
     return 1;
 }
 
-void hvm_io_assist(ioreq_t *p)
-{
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
-
-    p->state = STATE_IOREQ_NONE;
-
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
-    {
-    case HVMIO_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
-    default:
-        break;
-    }
-
-    if ( p->state == STATE_IOREQ_NONE )
-    {
-        msix_write_completion(curr);
-        vcpu_end_shutdown_deferral(curr);
-    }
-}
-
 static bool_t dpci_portio_accept(const struct hvm_io_handler *handler,
                                  const ioreq_t *p)
 {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 57f9605..9792e15 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -232,7 +232,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
                                                  ioreq_t *p);
 int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *p);
 void hvm_broadcast_assist_req(ioreq_t *p);
-void hvm_complete_assist_req(ioreq_t *p);
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat);
 int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 1ca0d16..87034af 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -95,6 +95,8 @@ struct hvm_io_ops {
     hvm_io_complete_t complete;
 };
 
+int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                             ioreq_t *p);
 int hvm_io_intercept(ioreq_t *p);
 
 bool_t hvm_mmio_internal(paddr_t gpa);
@@ -119,7 +121,6 @@ int handle_mmio_with_translation(unsigned long gva, unsigned long gpfn,
                                  struct npfec);
 int handle_pio(uint16_t port, unsigned int size, int dir);
 void hvm_interrupt_post(struct vcpu *v, int vector, int type);
-void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
-- 
1.7.10.4


* [PATCH v5 11/16] x86/hvm: split I/O completion handling from state model
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (9 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:08   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The state of in-flight I/O and how its completion will be handled are
logically separate, and conflating the two makes the code unnecessarily
confusing.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c            |   44 +++++++++++++++++++++++++++----------
 xen/arch/x86/hvm/io.c             |    6 ++---
 xen/arch/x86/hvm/vmx/realmode.c   |   27 +++++++++++++++++------
 xen/include/asm-x86/hvm/vcpu.h    |   16 +++++++++-----
 xen/include/asm-x86/hvm/vmx/vmx.h |    1 +
 5 files changed, 67 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d498151..0f0730e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
+#include <asm/hvm/vmx/vmx.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -428,17 +429,6 @@ static void hvm_io_assist(ioreq_t *p)
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
         break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
     default:
         break;
     }
@@ -479,6 +469,7 @@ void hvm_do_resume(struct vcpu *v)
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
+    enum hvm_io_completion io_completion;
 
     check_wakeup_from_wait();
 
@@ -505,8 +496,37 @@ void hvm_do_resume(struct vcpu *v)
         }
     }
 
-    if ( vio->mmio_retry )
+    io_completion = vio->io_completion;
+    vio->io_completion = HVMIO_no_completion;
+
+    switch ( io_completion )
+    {
+    case HVMIO_no_completion:
+        break;
+    case HVMIO_mmio_completion:
         (void)handle_mmio();
+        break;
+    case HVMIO_pio_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
+        vio->io_state = HVMIO_none;
+        break;
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
 
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index fe099d8..221d05e 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -92,8 +92,8 @@ int handle_mmio(void)
 
     if ( rc != X86EMUL_RETRY )
         vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion )
-        vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
 
@@ -158,7 +158,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
             return 0;
         /* Completion in hvm_io_assist() with no re-emulation required. */
         ASSERT(dir == IOREQ_READ);
-        vio->io_state = HVMIO_handle_pio_awaiting_completion;
+        vio->io_completion = HVMIO_pio_completion;
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index fe8b4a0..76ff9a5 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -101,15 +101,19 @@ static void realmode_deliver_exception(
     }
 }
 
-static void realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     int rc;
 
     perfc_incr(realmode_emulations);
 
     rc = hvm_emulate_one(hvmemul_ctxt);
 
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_realmode_completion;
+
     if ( rc == X86EMUL_UNHANDLEABLE )
     {
         gdprintk(XENLOG_ERR, "Failed to emulate insn.\n");
@@ -177,9 +181,6 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     hvm_emulate_prepare(&hvmemul_ctxt, regs);
 
-    if ( vio->io_state == HVMIO_completed )
-        realmode_emulate_one(&hvmemul_ctxt);
-
     /* Only deliver interrupts into emulated real mode. */
     if ( !(curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) &&
          (intr_info & INTR_INFO_VALID_MASK) )
@@ -190,8 +191,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     curr->arch.hvm_vmx.vmx_emulate = 1;
     while ( curr->arch.hvm_vmx.vmx_emulate &&
-            !softirq_pending(smp_processor_id()) &&
-            (vio->io_state == HVMIO_none) )
+            !softirq_pending(smp_processor_id()) )
     {
         /*
          * Check for pending interrupts only every 16 instructions, because
@@ -203,7 +203,10 @@ void vmx_realmode(struct cpu_user_regs *regs)
              hvm_local_events_need_delivery(curr) )
             break;
 
-        realmode_emulate_one(&hvmemul_ctxt);
+        vmx_realmode_emulate_one(&hvmemul_ctxt);
+
+        if ( vio->io_state != HVMIO_none || vio->mmio_retry )
+            break;
 
         /* Stop emulating unless our segment state is not safe */
         if ( curr->arch.hvm_vmx.vmx_realmode )
@@ -245,3 +248,13 @@ void vmx_realmode(struct cpu_user_regs *regs)
     if ( intr_info & INTR_INFO_VALID_MASK )
         __vmwrite(VM_ENTRY_INTR_INFO, intr_info);
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 2ed0606..efb373f 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -34,11 +34,16 @@ enum hvm_io_state {
     HVMIO_none = 0,
     HVMIO_dispatched,
     HVMIO_awaiting_completion,
-    HVMIO_handle_mmio_awaiting_completion,
-    HVMIO_handle_pio_awaiting_completion,
     HVMIO_completed
 };
 
+enum hvm_io_completion {
+    HVMIO_no_completion = 0,
+    HVMIO_mmio_completion,
+    HVMIO_pio_completion,
+    HVMIO_realmode_completion
+};
+
 struct hvm_vcpu_asid {
     uint64_t generation;
     uint32_t asid;
@@ -46,9 +51,10 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state   io_state;
-    unsigned long       io_data;
-    int                 io_size;
+    enum hvm_io_state      io_state;
+    unsigned long          io_data;
+    unsigned int           io_size;
+    enum hvm_io_completion io_completion;
 
     /*
      * HVM emulation:
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 35f804a..c5f3d24 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -94,6 +94,7 @@ void vmx_asm_do_vmentry(void);
 void vmx_intr_assist(void);
 void noreturn vmx_do_resume(struct vcpu *);
 void vmx_vlapic_msr_changed(struct vcpu *v);
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt);
 void vmx_realmode(struct cpu_user_regs *regs);
 void vmx_update_debug_state(struct vcpu *v);
 void vmx_update_exception_bitmap(struct vcpu *v);
-- 
1.7.10.4


* [PATCH v5 12/16] x86/hvm: remove HVMIO_dispatched I/O state
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (10 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:12   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

By removing the HVMIO_dispatched state and making all pending emulations
(i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
various code-paths can be simplified.

The completion case for HVMIO_dispatched can also be trivially removed
from hvmemul_do_io() as it was already unreachable. This is because that
state was only ever used for writes or I/O to/from a guest page, and
hvmemul_do_io() is never called to complete such I/O.

NOTE: There is one subtlety in handle_pio()... The only case in which
      handle_pio() got a return code of X86EMUL_RETRY back from
      hvmemul_do_pio_buffer() and found io_state was not
      HVMIO_awaiting_completion was when the domain is shutting down.
      This is because all writes normally yield a return of
      X86EMUL_OKAY and all reads put io_state into
      HVMIO_awaiting_completion. Hence the io_state check there is
      replaced with a check of the is_shutting_down flag on the domain.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c      |   12 +++---------
 xen/arch/x86/hvm/hvm.c          |   12 +++---------
 xen/arch/x86/hvm/io.c           |   14 +++++++-------
 xen/arch/x86/hvm/vmx/realmode.c |    2 +-
 xen/include/asm-x86/hvm/vcpu.h  |   10 +++++++++-
 5 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index e17e6ab..a6c7f66 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -137,20 +137,14 @@ static int hvmemul_do_io(
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
-    case HVMIO_dispatched:
-        /* May have to wait for previous cycle of a multi-write to complete. */
-        if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
-             (addr == (vio->mmio_large_write_pa +
-                       vio->mmio_large_write_bytes)) )
-            return X86EMUL_RETRY;
-        /* fallthrough */
     default:
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
-        HVMIO_dispatched : HVMIO_awaiting_completion;
+    vio->io_state = HVMIO_awaiting_completion;
     vio->io_size = size;
+    vio->io_dir = dir;
+    vio->io_data_is_addr = data_is_addr;
 
     if ( dir == IOREQ_WRITE )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0f0730e..b94a9b6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -416,22 +416,16 @@ static void hvm_io_assist(ioreq_t *p)
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
 
     p->state = STATE_IOREQ_NONE;
 
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
+    if ( hvm_vcpu_io_need_completion(vio) )
     {
-    case HVMIO_awaiting_completion:
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
-        break;
-    default:
-        break;
     }
+    else
+        vio->io_state = HVMIO_none;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 221d05e..3b51d59 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -90,9 +90,7 @@ int handle_mmio(void)
 
     rc = hvm_emulate_one(&ctxt);
 
-    if ( rc != X86EMUL_RETRY )
-        vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+    if ( hvm_vcpu_io_need_completion(vio) || vio->mmio_retry )
         vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
@@ -142,6 +140,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 
     rc = hvmemul_do_pio_buffer(port, size, dir, &data);
 
+    if ( hvm_vcpu_io_need_completion(vio) )
+        vio->io_completion = HVMIO_pio_completion;
+
     switch ( rc )
     {
     case X86EMUL_OKAY:
@@ -154,11 +155,10 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
         }
         break;
     case X86EMUL_RETRY:
-        if ( vio->io_state != HVMIO_awaiting_completion )
+        /* We should not advance RIP/EIP if the domain is shutting down */
+        if ( curr->domain->is_shutting_down )
             return 0;
-        /* Completion in hvm_io_assist() with no re-emulation required. */
-        ASSERT(dir == IOREQ_READ);
-        vio->io_completion = HVMIO_pio_completion;
+
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 76ff9a5..deb53ae 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -111,7 +111,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
     rc = hvm_emulate_one(hvmemul_ctxt);
 
-    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+    if ( hvm_vcpu_io_need_completion(vio) || vio->mmio_retry )
         vio->io_completion = HVMIO_realmode_completion;
 
     if ( rc == X86EMUL_UNHANDLEABLE )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index efb373f..13e7eb9 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -32,7 +32,6 @@
 
 enum hvm_io_state {
     HVMIO_none = 0,
-    HVMIO_dispatched,
     HVMIO_awaiting_completion,
     HVMIO_completed
 };
@@ -55,6 +54,8 @@ struct hvm_vcpu_io {
     unsigned long          io_data;
     unsigned int           io_size;
     enum hvm_io_completion io_completion;
+    uint8_t                io_dir;
+    uint8_t                io_data_is_addr;
 
     /*
      * HVM emulation:
@@ -87,6 +88,13 @@ struct hvm_vcpu_io {
     const struct g2m_ioport *g2m_ioport;
 };
 
+static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
+{
+    return (vio->io_state == HVMIO_awaiting_completion) &&
+           !vio->io_data_is_addr &&
+           (vio->io_dir == IOREQ_READ);
+}
+
 #define VMCX_EADDR    (~0ULL)
 
 struct nestedvcpu {
-- 
1.7.10.4


* [PATCH v5 13/16] x86/hvm: remove hvm_io_state enumeration
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (11 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:13   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser

Emulation request status is already covered by the STATE_IOREQ_XXX values,
so just use those. The mapping is:

HVMIO_none                -> STATE_IOREQ_NONE
HVMIO_awaiting_completion -> STATE_IOREQ_READY
HVMIO_completed           -> STATE_IORESP_READY

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c       |   14 +++++++-------
 xen/arch/x86/hvm/hvm.c           |    6 +++---
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   10 ++--------
 5 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index a6c7f66..4c696d1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -130,10 +130,10 @@ static int hvmemul_do_io(
 
     switch ( vio->io_state )
     {
-    case HVMIO_none:
+    case STATE_IOREQ_NONE:
         break;
-    case HVMIO_completed:
-        vio->io_state = HVMIO_none;
+    case STATE_IORESP_READY:
+        vio->io_state = STATE_IOREQ_NONE;
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
@@ -141,7 +141,7 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = HVMIO_awaiting_completion;
+    vio->io_state = STATE_IOREQ_READY;
     vio->io_size = size;
     vio->io_dir = dir;
     vio->io_data_is_addr = data_is_addr;
@@ -160,7 +160,7 @@ static int hvmemul_do_io(
     {
     case X86EMUL_OKAY:
         vio->io_data = p.data;
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -173,13 +173,13 @@ static int hvmemul_do_io(
             rc = hvm_process_io_intercept(&null_handler, &p);
             if ( rc == X86EMUL_OKAY )
                 vio->io_data = p.data;
-            vio->io_state = HVMIO_none;
+            vio->io_state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
-                vio->io_state = HVMIO_none;
+                vio->io_state = STATE_IOREQ_NONE;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b94a9b6..32b0d32 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
     if ( hvm_vcpu_io_need_completion(vio) )
     {
-        vio->io_state = HVMIO_completed;
+        vio->io_state = STATE_IORESP_READY;
         vio->io_data = p->data;
     }
     else
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
@@ -505,7 +505,7 @@ void hvm_do_resume(struct vcpu *v)
             guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
         else
             memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case HVMIO_realmode_completion:
     {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index be5797a..8b165c6 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != HVMIO_none )
+        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index deb53ae..25533dc 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_state != HVMIO_none || vio->mmio_retry )
+        if ( vio->io_state != STATE_IOREQ_NONE || vio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != HVMIO_none )
+    if ( vio->io_state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 13e7eb9..9ca96d0 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -30,12 +30,6 @@
 #include <asm/hvm/svm/nestedsvm.h>
 #include <asm/mtrr.h>
 
-enum hvm_io_state {
-    HVMIO_none = 0,
-    HVMIO_awaiting_completion,
-    HVMIO_completed
-};
-
 enum hvm_io_completion {
     HVMIO_no_completion = 0,
     HVMIO_mmio_completion,
@@ -50,7 +44,7 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state      io_state;
+    uint8_t                io_state;
     unsigned long          io_data;
     unsigned int           io_size;
     enum hvm_io_completion io_completion;
@@ -90,7 +84,7 @@ struct hvm_vcpu_io {
 
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
-    return (vio->io_state == HVMIO_awaiting_completion) &&
+    return (vio->io_state == STATE_IOREQ_READY) &&
            !vio->io_data_is_addr &&
            (vio->io_dir == IOREQ_READ);
 }
-- 
1.7.10.4


* [PATCH v5 14/16] x86/hvm: use ioreq_t to track in-flight state
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (12 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:15   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

Use an ioreq_t rather than open-coded state, size, dir and data fields
in struct hvm_vcpu_io. This also allows PIO completion to be handled
similarly to MMIO completion, by re-issuing the handle_pio() call.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c       |   35 +++++++++++++++++++++--------------
 xen/arch/x86/hvm/hvm.c           |   13 +++++--------
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   12 ++++--------
 5 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4c696d1..b5fb0da 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -91,6 +91,7 @@ static int hvmemul_do_io(
         .df = df,
         .data = data,
         .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
+        .state = STATE_IOREQ_READY,
     };
     void *p_data = (void *)data;
     int rc;
@@ -128,12 +129,24 @@ static int hvmemul_do_io(
         }
     }
 
-    switch ( vio->io_state )
+    switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
         break;
     case STATE_IORESP_READY:
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
+        p = vio->io_req;
+
+        /* Verify the emulation request has been correctly re-issued */
+        if ( (p.type != is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO) ||
+             (p.addr != addr) ||
+             (p.size != size) ||
+             (p.count != reps) ||
+             (p.dir != dir) ||
+             (p.df != df) ||
+             (p.data_is_ptr != data_is_addr) )
+            domain_crash(curr->domain);
+
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
@@ -141,11 +154,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = STATE_IOREQ_READY;
-    vio->io_size = size;
-    vio->io_dir = dir;
-    vio->io_data_is_addr = data_is_addr;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -154,13 +162,14 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
+    vio->io_req = p;
+
     rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        vio->io_data = p.data;
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -171,15 +180,13 @@ static int hvmemul_do_io(
         if ( !s )
         {
             rc = hvm_process_io_intercept(&null_handler, &p);
-            if ( rc == X86EMUL_OKAY )
-                vio->io_data = p.data;
-            vio->io_state = STATE_IOREQ_NONE;
+            vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
-                vio->io_state = STATE_IOREQ_NONE;
+                vio->io_req.state = STATE_IOREQ_NONE;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
@@ -198,7 +205,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
 
         if ( !data_is_addr )
-            memcpy(p_data, &vio->io_data, size);
+            memcpy(p_data, &p.data, size);
     }
 
     if ( is_mmio && !data_is_addr )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 32b0d32..901f9a8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
     if ( hvm_vcpu_io_need_completion(vio) )
     {
-        vio->io_state = STATE_IORESP_READY;
-        vio->io_data = p->data;
+        vio->io_req.state = STATE_IORESP_READY;
+        vio->io_req.data = p->data;
     }
     else
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
@@ -501,11 +501,8 @@ void hvm_do_resume(struct vcpu *v)
         (void)handle_mmio();
         break;
     case HVMIO_pio_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-        vio->io_state = STATE_IOREQ_NONE;
+        (void)handle_pio(vio->io_req.addr, vio->io_req.size,
+                         vio->io_req.dir);
         break;
     case HVMIO_realmode_completion:
     {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 8b165c6..78667a2 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
+        if ( v->arch.hvm_vcpu.hvm_io.io_req.state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 25533dc..e83a61f 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_state != STATE_IOREQ_NONE || vio->mmio_retry )
+        if ( vio->io_req.state != STATE_IOREQ_NONE || vio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != STATE_IOREQ_NONE )
+    if ( vio->io_req.state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 9ca96d0..7ffa28a 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -44,12 +44,8 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    uint8_t                io_state;
-    unsigned long          io_data;
-    unsigned int           io_size;
+    ioreq_t                io_req;
     enum hvm_io_completion io_completion;
-    uint8_t                io_dir;
-    uint8_t                io_data_is_addr;
 
     /*
      * HVM emulation:
@@ -84,9 +80,9 @@ struct hvm_vcpu_io {
 
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
-    return (vio->io_state == STATE_IOREQ_READY) &&
-           !vio->io_data_is_addr &&
-           (vio->io_dir == IOREQ_READ);
+    return (vio->io_req.state == STATE_IOREQ_READY) &&
+           !vio->io_req.data_is_ptr &&
+           (vio->io_req.dir == IOREQ_READ);
 }
 
 #define VMCX_EADDR    (~0ULL)
-- 
1.7.10.4


* [PATCH v5 15/16] x86/hvm: always re-emulate I/O from a buffer
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (13 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:26   ` Andrew Cooper
  2015-06-30 13:05 ` [PATCH v5 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
  2015-06-30 14:48 ` [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser

If memory mapped I/O is 'chunked' then the I/O must be re-emulated;
otherwise only the first chunk will be processed. This patch makes
sure all I/O from a buffer is re-emulated regardless of whether it
is a read or a write.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h |    3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index b5fb0da..8bb56a2 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -147,7 +147,7 @@ static int hvmemul_do_io(
              (p.data_is_ptr != data_is_addr) )
             domain_crash(curr->domain);
 
-        if ( data_is_addr || dir == IOREQ_WRITE )
+        if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
     default:
@@ -187,7 +187,7 @@ static int hvmemul_do_io(
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
                 vio->io_req.state = STATE_IOREQ_NONE;
-            else if ( data_is_addr || dir == IOREQ_WRITE )
+            else if ( data_is_addr )
                 rc = X86EMUL_OKAY;
         }
         break;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 7ffa28a..506ce89 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -81,8 +81,7 @@ struct hvm_vcpu_io {
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
     return (vio->io_req.state == STATE_IOREQ_READY) &&
-           !vio->io_req.data_is_ptr &&
-           (vio->io_req.dir == IOREQ_READ);
+           !vio->io_req.data_is_ptr;
 }
 
 #define VMCX_EADDR    (~0ULL)
-- 
1.7.10.4


* [PATCH v5 16/16] x86/hvm: track large memory mapped accesses by buffer offset
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (14 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
@ 2015-06-30 13:05 ` Paul Durrant
  2015-07-03 15:26   ` Andrew Cooper
  2015-06-30 14:48 ` [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
  16 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 13:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issue of component I/O, is defeated by accesses that cross a page
boundary because it tracks by physical address. The code is also only
relevant to memory mapped I/O to or from a buffer.

This patch re-factors the code and moves it into hvmemul_phys_mmio_access(),
where it is relevant, and tracks progress by buffer offset rather than by
address. Separate I/O emulations (of which there may be up to three per
instruction) are distinguished by linear address.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |  122 ++++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/vcpu.h |   25 +++++---
 2 files changed, 78 insertions(+), 69 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8bb56a2..d83d21c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -106,29 +106,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-                return X86EMUL_OKAY;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-            {
-                memcpy(p_data, &vio->mmio_large_read[addr - pa],
-                       size);
-                return X86EMUL_OKAY;
-            }
-        }
-    }
-
     switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
@@ -208,33 +185,6 @@ static int hvmemul_do_io(
             memcpy(p_data, &p.data, size);
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_write_pa = addr;
-            if ( addr == (pa + bytes) )
-                vio->mmio_large_write_bytes += size;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_read_pa = addr;
-            if ( (addr == (pa + bytes)) &&
-                 ((bytes + size) <= sizeof(vio->mmio_large_read)) )
-            {
-                memcpy(&vio->mmio_large_read[bytes], p_data, size);
-                vio->mmio_large_read_bytes += size;
-            }
-        }
-    }
-
     return X86EMUL_OKAY;
 }
 
@@ -587,11 +537,12 @@ static int hvmemul_virtual_to_linear(
 }
 
 static int hvmemul_phys_mmio_access(
-    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
+    struct hvm_mmio_cache *cache, paddr_t gpa, unsigned int size, uint8_t dir,
+    uint8_t *buffer, unsigned int off)
 {
     unsigned long one_rep = 1;
     unsigned int chunk;
-    int rc;
+    int rc = X86EMUL_OKAY;
 
     /* Accesses must fall within a page */
     BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);
@@ -607,14 +558,33 @@ static int hvmemul_phys_mmio_access(
 
     for ( ;; )
     {
-        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
-                                    buffer);
-        if ( rc != X86EMUL_OKAY )
-            break;
+        /* Have we already done this chunk? */
+        if ( off < cache->size )
+        {
+            ASSERT((off + chunk) <= cache->size);
+
+            if ( dir == IOREQ_READ )
+                memcpy(&buffer[off], &cache->buffer[off], chunk);
+            else if ( memcmp(&buffer[off], &cache->buffer[off], chunk) != 0 )
+                domain_crash(current->domain);
+        }
+        else
+        {
+            ASSERT(off == cache->size);
+
+            rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                        &buffer[off]);
+            if ( rc != X86EMUL_OKAY )
+                break;
+
+            /* Note that we have now done this chunk. */
+            memcpy(&cache->buffer[off], &buffer[off], chunk);
+            cache->size += chunk;
+        }
 
         /* Advance to the next chunk */
         gpa += chunk;
-        buffer += chunk;
+        off += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -631,13 +601,41 @@ static int hvmemul_phys_mmio_access(
     return rc;
 }
 
+static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
+    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
+{
+    unsigned int i;
+    struct hvm_mmio_cache *cache;
+
+    for ( i = 0; i < vio->mmio_cache_count; i ++ )
+    {
+        cache = &vio->mmio_cache[i];
+
+        if ( gla == cache->gla &&
+             dir == cache->dir )
+            return cache;
+    }
+
+    i = vio->mmio_cache_count++;
+    BUG_ON(i == ARRAY_SIZE(vio->mmio_cache));
+
+    cache = &vio->mmio_cache[i];
+    memset(cache, 0, sizeof (*cache));
+
+    cache->gla = gla;
+    cache->dir = dir;
+
+    return cache;
+}
+
 static int hvmemul_linear_mmio_access(
-    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    unsigned long gla, unsigned int size, uint8_t dir, void *buffer,
     uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
     unsigned long page_off = gla & (PAGE_SIZE - 1);
-    unsigned int chunk;
+    struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(vio, gla, dir);
+    unsigned int chunk, buffer_off = 0;
     paddr_t gpa;
     unsigned long one_rep = 1;
     int rc;
@@ -656,12 +654,12 @@ static int hvmemul_linear_mmio_access(
 
     for ( ;; )
     {
-        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        rc = hvmemul_phys_mmio_access(cache, gpa, chunk, dir, buffer, buffer_off);
         if ( rc != X86EMUL_OKAY )
             break;
 
         gla += chunk;
-        buffer += chunk;
+        buffer_off += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -1609,7 +1607,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
         rc = X86EMUL_RETRY;
     if ( rc != X86EMUL_RETRY )
     {
-        vio->mmio_large_read_bytes = vio->mmio_large_write_bytes = 0;
+        vio->mmio_cache_count = 0;
         vio->mmio_insn_bytes = 0;
     }
     else
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 506ce89..90dea8d 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -42,6 +42,17 @@ struct hvm_vcpu_asid {
     uint32_t asid;
 };
 
+/*
+ * We may read or write up to m256 as a number of device-model
+ * transactions.
+ */
+struct hvm_mmio_cache {
+    unsigned long size;
+    uint8_t buffer[32];
+    unsigned long gla;
+    uint8_t dir;
+};
+
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
     ioreq_t                io_req;
@@ -57,13 +68,13 @@ struct hvm_vcpu_io {
     unsigned long       mmio_gva;
     unsigned long       mmio_gpfn;
 
-    /* We may read up to m256 as a number of device-model transactions. */
-    paddr_t mmio_large_read_pa;
-    uint8_t mmio_large_read[32];
-    unsigned int mmio_large_read_bytes;
-    /* We may write up to m256 as a number of device-model transactions. */
-    unsigned int mmio_large_write_bytes;
-    paddr_t mmio_large_write_pa;
+    /*
+     * We may need to handle up to 3 distinct memory accesses per
+     * instruction.
+     */
+    struct hvm_mmio_cache mmio_cache[3];
+    unsigned int mmio_cache_count;
+
     /* For retries we shouldn't re-fetch the instruction. */
     unsigned int mmio_insn_bytes;
     unsigned char mmio_insn[16];
-- 
1.7.10.4


* Re: [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down
  2015-06-30 13:05 ` [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
@ 2015-06-30 13:45   ` Andrew Cooper
  2015-06-30 16:14     ` Don Slutz
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Cooper @ 2015-06-30 13:45 UTC (permalink / raw)
  To: Paul Durrant, xen-devel

On 30/06/15 14:05, Paul Durrant wrote:
> The addition of commit 2df1aa01 "x86/hvm: remove hvm_io_pending() check
> in hvmemul_do_io()" causes a problem in migration because I/O that was
> caught by the test of vcpu_start_shutdown_deferral() in
> hvm_send_assist_req() is now considered completed rather than requiring
> a retry.
>
> This patch fixes the problem by having hvm_send_assist_req() return
> X86EMUL_RETRY rather than X86EMUL_OKAY if the
> vcpu_start_shutdown_deferral() test fails and then making sure that
> the emulation state is reset if the domain is found to be shutting
> down.
>
> Reported-by: Don Slutz <don.slutz@gmail.com>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Keir Fraser <keir@xen.org>
> Jan Beulich <jbeulich@suse.com>
> Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> ---
>  xen/arch/x86/hvm/emulate.c |    2 +-
>  xen/arch/x86/hvm/hvm.c     |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index fe5661d..8b60843 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -183,7 +183,7 @@ static int hvmemul_do_io(
>          else
>          {
>              rc = hvm_send_assist_req(s, &p);
> -            if ( rc != X86EMUL_RETRY )
> +            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
>                  vio->io_state = HVMIO_none;
>              else if ( data_is_addr || dir == IOREQ_WRITE )
>                  rc = X86EMUL_OKAY;
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 535d622..f0cf064 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2613,7 +2613,7 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
>  
>      ASSERT(s);
>      if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
> -        return X86EMUL_OKAY;
> +        return X86EMUL_RETRY;
>  
>      list_for_each_entry ( sv,
>                            &s->ioreq_vcpu_list,


* Re: [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix
  2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (15 preceding siblings ...)
  2015-06-30 13:05 ` [PATCH v5 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
@ 2015-06-30 14:48 ` Fabio Fantoni
  2015-07-07 11:19   ` Fabio Fantoni
  16 siblings, 1 reply; 46+ messages in thread
From: Fabio Fantoni @ 2015-06-30 14:48 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Jim Fehlig, Jan Beulich

Il 30/06/2015 15:05, Paul Durrant ha scritto:
> This patch series re-works much of the code involved in emulation of port
> and memory mapped I/O for HVM guests.
>
> The code has become very convoluted and, at least by inspection, certain
> emulations will apparently malfunction.
>
> The series is broken down into 16 patches (which are also available in
> my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
> on the emulation32 branch).
>
> Previous changelog
> ------------------
>
> v2:
>   - Removed bogus assertion from patch #15
>   - Re-worked patch #17 after basic testing of back-port onto XenServer
>
> v3:
>   - Addressed comments from Jan
>   - Re-ordered series to bring a couple of more trivial patches to the
>     front
>   - Backport to XenServer (4.5) now passing automated tests
>   - Tested on unstable with QEMU upstream and trad, with and without
>     HAP (to force shadow emulation)
>
> v4:
>   - Removed previous patch (make sure translated MMIO reads or
>     writes fall within a page) and rebased rest of series.
>   - Address Jan's comments on patch #1
>
> Changelog (now per-patch)
> -------------------------
>
> 0001-x86-hvm-make-sure-emulation-is-retried-if-domain-is-.patch
>
> This is a fix for an issue on staging reported by Don Slutz
>
> 0002-x86-hvm-remove-multiple-open-coded-chunking-loops.patch
>
> v5: Addressed further comments from Jan
>
> 0003-x86-hvm-change-hvm_mmio_read_t-and-hvm_mmio_write_t-.patch
>
> v5: New patch to tidy up types
>
> 0004-x86-hvm-restrict-port-numbers-to-uint16_t-and-sizes-.patch
>
> v5: New patch to tidy up more types
>
> 0005-x86-hvm-unify-internal-portio-and-mmio-intercepts.patch
>
> v5: Addressed further comments from Jan and simplified implementation
>      by passing ioreq_t to accept() function
>
> 0006-x86-hvm-add-length-to-mmio-check-op.patch
>
> v5: Simplified by leaving mmio_check() implementation alone and
>      calling to check last byte if first-byte check passes
>
> 0007-x86-hvm-unify-dpci-portio-intercept-with-standard-po.patch
>
> v5: Addressed further comments from Jan
>
> 0008-x86-hvm-unify-stdvga-mmio-intercept-with-standard-mm.patch
>
> v5: Fixed semantic problems pointed out by Jan
>
> 0009-x86-hvm-limit-reps-to-avoid-the-need-to-handle-retry.patch
>
> v5: Addressed further comments from Jan
>
> 0010-x86-hvm-only-call-hvm_io_assist-from-hvm_wait_for_io.patch
>
> v5: Added Jan's acked-by
>
> 0011-x86-hvm-split-I-O-completion-handling-from-state-mod.patch
>
> v5: Confirmed call to msix_write_completion() is in the correct place.
>
> 0012-x86-hvm-remove-HVMIO_dispatched-I-O-state.patch
>
> v5: Added some extra comments to the commit
>
> 0013-x86-hvm-remove-hvm_io_state-enumeration.patch
>
> v5: Added Jan's acked-by
>
> 0014-x86-hvm-use-ioreq_t-to-track-in-flight-state.patch
>
> v5: Added missing hunk with call to handle_pio()
>
> 0015-x86-hvm-always-re-emulate-I-O-from-a-buffer.patch
>
> v5: Added Jan's acked-by
>
> 0016-x86-hvm-track-large-memory-mapped-accesses-by-buffer.patch
>
> v5: Fixed to cache up three distict I/O emulations per instruction
>
> Testing
> -------
>
> The series was been back-ported to staging-4.5 and then dropped onto the
> XenServer (Dundee) patch queue. All automated branch-safety tests pass.
>
> The series as-is has been manually tested with a Windows 7 (32-bit) VM
> using upstream QEMU.
>
>

Thanks for your work.
I did some quick tests and found no regressions; however, qxl is still
not working on my Linux domUs and I'm unable to debug it.
@Jim Fehlig: could you check whether qxl still works, at least on SUSE
dom0/domU, after this series please?

Can someone tell me how to debug the qxl problem, now that qemu no longer
crashes but stays at 100% cpu with nothing relevant in the dom0 logs
(if there are any)?

Thanks for any reply and sorry for my bad english.


* Re: [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down
  2015-06-30 13:45   ` Andrew Cooper
@ 2015-06-30 16:14     ` Don Slutz
  2015-06-30 16:29       ` Paul Durrant
  0 siblings, 1 reply; 46+ messages in thread
From: Don Slutz @ 2015-06-30 16:14 UTC (permalink / raw)
  To: Andrew Cooper, Paul Durrant, xen-devel

On 06/30/15 09:45, Andrew Cooper wrote:
> On 30/06/15 14:05, Paul Durrant wrote:
>> The addition of commit 2df1aa01 "x86/hvm: remove hvm_io_pending() check
>> in hvmemul_do_io()" causes a problem in migration because I/O that was
>> caught by the test of vcpu_start_shutdown_deferral() in
>> hvm_send_assist_req() is now considered completed rather than requiring
>> a retry.
>>
>> This patch fixes the problem by having hvm_send_assist_req() return
>> X86EMUL_RETRY rather than X86EMUL_OKAY if the
>> vcpu_start_shutdown_deferral() test fails and then making sure that
>> the emulation state is reset if the domain is found to be shutting
>> down.
>>
>> Reported-by: Don Slutz <don.slutz@gmail.com>
>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> Keir Fraser <keir@xen.org>
>> Jan Beulich <jbeulich@suse.com>
>> Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
>> ---
>>   xen/arch/x86/hvm/emulate.c |    2 +-
>>   xen/arch/x86/hvm/hvm.c     |    2 +-
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>> index fe5661d..8b60843 100644
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -183,7 +183,7 @@ static int hvmemul_do_io(
>>           else
>>           {
>>               rc = hvm_send_assist_req(s, &p);
>> -            if ( rc != X86EMUL_RETRY )
>> +            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )

Having spent more time thinking about this, a safer test is
"!curr->defer_shutdown".

The reasons are as follows:
1) This uses what vcpu_start_shutdown_deferral() returns, but that is
   only valid if vcpu_start_shutdown_deferral() is called, which is the
   case now.
2) The checking of "curr->domain->is_shutting_down" without also
   checking for "curr->defer_shutdown" may open a timing window.
3) No use of "spin_lock(&d->shutdown_lock)".


>>                   vio->io_state = HVMIO_none;
>>               else if ( data_is_addr || dir == IOREQ_WRITE )
>>                   rc = X86EMUL_OKAY;
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 535d622..f0cf064 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2613,7 +2613,7 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
>>   
>>       ASSERT(s);
>>       if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
>> -        return X86EMUL_OKAY;
>> +        return X86EMUL_RETRY;

I will also note that this change makes hvm_send_assist_req() have only 
2 return codes:

X86EMUL_RETRY
X86EMUL_UNHANDLEABLE

and so it is not clear why its return should not be a bool.

And there are now 2 reasons why X86EMUL_RETRY is returned, which require 
the caller to redo a test.
Granted, X86EMUL_OKAY is bad.  But I wonder if it would be better to add 
an enum for hvm_send_assist_req() to return instead of overloading the
usage of X86EMUL_UNHANDLEABLE and X86EMUL_RETRY with yet more meanings.
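Something along these lines, purely as a sketch (all names made up, not a
concrete proposal):

    enum hvm_send_status {
        HVM_SEND_ISSUED,    /* request handed off to an ioreq server */
        HVM_SEND_RETRY,     /* transient condition, caller should retry */
        HVM_SEND_ABANDON,   /* e.g. domain shutting down; drop the I/O */
        HVM_SEND_ERROR,     /* no server / unroutable */
    };

    enum hvm_send_status hvm_send_assist_req(struct hvm_ioreq_server *s,
                                             ioreq_t *proto_p);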

    -Don Slutz

>>   
>>       list_for_each_entry ( sv,
>>                             &s->ioreq_vcpu_list,
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down
  2015-06-30 16:14     ` Don Slutz
@ 2015-06-30 16:29       ` Paul Durrant
  0 siblings, 0 replies; 46+ messages in thread
From: Paul Durrant @ 2015-06-30 16:29 UTC (permalink / raw)
  To: Don Slutz, Andrew Cooper, xen-devel

> -----Original Message-----
> From: Don Slutz [mailto:don.slutz@gmail.com]
> Sent: 30 June 2015 17:14
> To: Andrew Cooper; Paul Durrant; xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v5 01/16] x86/hvm: make sure emulation is
> retried if domain is shutting down
> 
> On 06/30/15 09:45, Andrew Cooper wrote:
> > On 30/06/15 14:05, Paul Durrant wrote:
> >> The addition of commit 2df1aa01 "x86/hvm: remove hvm_io_pending()
> check
> >> in hvmemul_do_io()" causes a problem in migration because I/O that was
> >> caught by the test of vcpu_start_shutdown_deferral() in
> >> hvm_send_assist_req() is now considered completed rather than
> requiring
> >> a retry.
> >>
> >> This patch fixes the problem by having hvm_send_assist_req() return
> >> X86EMUL_RETRY rather than X86EMUL_OKAY if the
> >> vcpu_start_shutdown_deferral() test fails and then making sure that
> >> the emulation state is reset if the domain is found to be shutting
> >> down.
> >>
> >> Reported-by: Don Slutz <don.slutz@gmail.com>
> >> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> Keir Fraser <keir@xen.org>
> >> Jan Beulich <jbeulich@suse.com>
> >> Andrew Cooper <andrew.cooper3@citrix.com>
> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >
> >> ---
> >>   xen/arch/x86/hvm/emulate.c |    2 +-
> >>   xen/arch/x86/hvm/hvm.c     |    2 +-
> >>   2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> >> index fe5661d..8b60843 100644
> >> --- a/xen/arch/x86/hvm/emulate.c
> >> +++ b/xen/arch/x86/hvm/emulate.c
> >> @@ -183,7 +183,7 @@ static int hvmemul_do_io(
> >>           else
> >>           {
> >>               rc = hvm_send_assist_req(s, &p);
> >> -            if ( rc != X86EMUL_RETRY )
> >> +            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
> 
> Having spent more time thinking about this, a safer test is
> "!curr->defer_shutdown".
> 
> The reasons are as follows:
> 1) This uses what vcpu_start_shutdown_deferral() returns, but that is
>    only valid if vcpu_start_shutdown_deferral() is called, which is the
>    case now.
> 2) The checking of "curr->domain->is_shutting_down" without also
>    checking for "curr->defer_shutdown" may open a timing window.
> 3) No use of "spin_lock(&d->shutdown_lock)".

I don't think we need to lock. The is_shutting_down flag from this code's PoV will only ever go from 0 -> 1 so if shutdown deferral failed I think it is perfectly safe to check without taking the spinlock.

What we essentially want to know in handle_pio() is whether the I/O has been issued or whether the whole thing needs to be retried. Another option would be to change the io_state to something like IOREQ_INPROGRESS (or whatever it is called) when the I/O is actually issued and then test for that. Maybe hvm_send_assist_req() should just become a void function, with the return code set on the basis of the io_state after it has been called.
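Roughly this shape, purely for illustration (the state name and the exact
rc mapping are made up; none of this is in the patch):

    hvm_send_assist_req(s, &p);               /* hypothetically now void */

    if ( vio->io_state == HVMIO_in_progress ) /* set only if really issued */
        rc = X86EMUL_RETRY;
    else
        rc = X86EMUL_OKAY;                    /* nothing was issued */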

Anyway, all that is more code churn and the aim of this patch was a minimal diff to fix the problem... which I think it does.

  Paul

> 
> 
> >>                   vio->io_state = HVMIO_none;
> >>               else if ( data_is_addr || dir == IOREQ_WRITE )
> >>                   rc = X86EMUL_OKAY;
> >> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >> index 535d622..f0cf064 100644
> >> --- a/xen/arch/x86/hvm/hvm.c
> >> +++ b/xen/arch/x86/hvm/hvm.c
> >> @@ -2613,7 +2613,7 @@ int hvm_send_assist_req(struct
> hvm_ioreq_server *s, ioreq_t *proto_p)
> >>
> >>       ASSERT(s);
> >>       if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
> >> -        return X86EMUL_OKAY;
> >> +        return X86EMUL_RETRY;
> 
> I will also note that this change makes hvm_send_assist_req() have only
> 2 return codes:
> 
> X86EMUL_RETRY
> X86EMUL_UNHANDLEABLE
> 
> and so it is not clear why its return should not be a bool.
> 
> And there are now 2 reasons why X86EMUL_RETRY is returned, which
> require
> the caller to redo a test.
> Granted, X86EMUL_OKAY is bad.  But I wonder if it would be better to add
> an enum for
> hvm_send_assist_req() to return instead of overloading the usage of
> X86EMUL_UNHANDLEABLE
> and X86EMUL_RETRY with yet more meanings.
> 
>     -Don Slutz
> 
> >>
> >>       list_for_each_entry ( sv,
> >>                             &s->ioreq_vcpu_list,
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-06-30 13:05 ` [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
@ 2015-07-02 14:52   ` Roger Pau Monné
  2015-07-02 15:02     ` Paul Durrant
  2015-07-02 16:29   ` Andrew Cooper
  1 sibling, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2015-07-02 14:52 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich

On 30/06/15 at 15.05, Paul Durrant wrote:
[...]
+void msixtbl_init(struct domain *d)
> +{
> +    register_mmio_handler(d, &msixtbl_mmio_ops);

Since you are adding an initialization function to vmsi I think
msixtbl_list and msixtbl_list_lock should also be initialized here
instead of hvm_domain_initialise.
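I.e. something along these lines (untested sketch, just moving the two
initialisations out of hvm_domain_initialise()):

    void msixtbl_init(struct domain *d)
    {
        INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
        spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);

        register_mmio_handler(d, &msixtbl_mmio_ops);
    }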

Roger.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-07-02 14:52   ` Roger Pau Monné
@ 2015-07-02 15:02     ` Paul Durrant
  2015-07-02 15:12       ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-07-02 15:02 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Andrew Cooper, Keir (Xen.org), Jan Beulich

> -----Original Message-----
> From: Roger Pau Monné [mailto:roger.pau@citrix.com]
> Sent: 02 July 2015 15:52
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Andrew Cooper; Keir (Xen.org); Jan Beulich
> Subject: Re: [Xen-devel] [PATCH v5 05/16] x86/hvm: unify internal portio and
> mmio intercepts
> 
> On 30/06/15 at 15.05, Paul Durrant wrote:
> [...]
> +void msixtbl_init(struct domain *d)
> > +{
> > +    register_mmio_handler(d, &msixtbl_mmio_ops);
> 
> Since you are adding an initialization function to vmsi I think
> msixtbl_list and msixtbl_list_lock should also be initialized here
> instead of hvm_domain_initialise.
> 

Yes, that sounds quite reasonable. Perhaps as a follow-up patch though, unless I need to post v6 of the series for some other reason.

  Paul

> Roger.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-07-02 15:02     ` Paul Durrant
@ 2015-07-02 15:12       ` Roger Pau Monné
  2015-07-02 15:12         ` Paul Durrant
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2015-07-02 15:12 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Andrew Cooper, Keir (Xen.org), Jan Beulich

On 02/07/15 at 17.02, Paul Durrant wrote:
>> -----Original Message-----
>> From: Roger Pau Monné [mailto:roger.pau@citrix.com]
>> Sent: 02 July 2015 15:52
>> To: Paul Durrant; xen-devel@lists.xenproject.org
>> Cc: Andrew Cooper; Keir (Xen.org); Jan Beulich
>> Subject: Re: [Xen-devel] [PATCH v5 05/16] x86/hvm: unify internal portio and
>> mmio intercepts
>>
>> On 30/06/15 at 15.05, Paul Durrant wrote:
>> [...]
>> +void msixtbl_init(struct domain *d)
>>> +{
>>> +    register_mmio_handler(d, &msixtbl_mmio_ops);
>>
>> Since you are adding an initialization function to vmsi I think
>> msixtbl_list and msixtbl_list_lock should also be initialized here
>> instead of hvm_domain_initialise.
>>
> 
> Yes, that sounds quite reasonable. Perhaps as a follow-up patch though, unless I need to post v6 of the series for some other reason.

That's fine; if not, I can also add it as a pre-patch to my HVMlite
stuff, as it's trivial.

Roger.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-07-02 15:12       ` Roger Pau Monné
@ 2015-07-02 15:12         ` Paul Durrant
  0 siblings, 0 replies; 46+ messages in thread
From: Paul Durrant @ 2015-07-02 15:12 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Andrew Cooper, Keir (Xen.org), Jan Beulich

> -----Original Message-----
> From: Roger Pau Monné [mailto:roger.pau@citrix.com]
> Sent: 02 July 2015 16:12
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Andrew Cooper; Keir (Xen.org); Jan Beulich
> Subject: Re: [Xen-devel] [PATCH v5 05/16] x86/hvm: unify internal portio and
> mmio intercepts
> 
> On 02/07/15 at 17.02, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Roger Pau Monné [mailto:roger.pau@citrix.com]
> >> Sent: 02 July 2015 15:52
> >> To: Paul Durrant; xen-devel@lists.xenproject.org
> >> Cc: Andrew Cooper; Keir (Xen.org); Jan Beulich
> >> Subject: Re: [Xen-devel] [PATCH v5 05/16] x86/hvm: unify internal portio
> and
> >> mmio intercepts
> >>
> >> On 30/06/15 at 15.05, Paul Durrant wrote:
> >> [...]
> >> +void msixtbl_init(struct domain *d)
> >>> +{
> >>> +    register_mmio_handler(d, &msixtbl_mmio_ops);
> >>
> >> Since you are adding an initialization function to vmsi I think
> >> msixtbl_list and msixtbl_list_lock should also be initialized here
> >> instead of hvm_domain_initialise.
> >>
> >
> > Yes, that sounds quite reasonable. Perhaps as a follow-up patch though,
> unless I need to post v6 of the series for some other reason.
> 
> That's fine, if not I can also add it as a pre-patch to my HVMlite
> stuff, it's trivial.
> 

Sure. Thanks,

  Paul

> Roger.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-06-30 13:05 ` [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-07-02 15:37   ` Andrew Cooper
  2015-07-02 15:55     ` Paul Durrant
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 15:37 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> ...in hvmemul_read/write()
>
> Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
> to reduce code duplication.
>
> NOTE: This patch also introduces a change in 'chunking' around a page
>       boundary. Previously (for example) an 8 byte access at the last
>       byte of a page would get carried out as 8 single-byte accesses.
>       It will now be carried out as a single-byte access, followed by
>       a 4-byte access, a 2-byte access and then another single-byte
>       access.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/hvm/emulate.c |  223 +++++++++++++++++++++++---------------------
>  1 file changed, 116 insertions(+), 107 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 8b60843..b67f5db 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -539,6 +539,117 @@ static int hvmemul_virtual_to_linear(
>      return X86EMUL_EXCEPTION;
>  }
>  
> +static int hvmemul_phys_mmio_access(
> +    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
> +{
> +    unsigned long one_rep = 1;
> +    unsigned int chunk;
> +    int rc;
> +
> +    /* Accesses must fall within a page */

Full stop.

> +    BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);

~PAGE_MASK as opposed to (PAGE_SIZE - 1)

> +
> +    /*
> +     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
> +     * accesses larger than sizeof(long), so choose the highest power
> +     * of 2 not exceeding sizeof(long) as the 'chunk' size.
> +     */
> +    chunk = 1 << (fls(size) - 1);

Depending on size, chunk can become undefined (shifting by 31 or -1) or
zero (shifting by 32).

How about

if ( size > sizeof(long) )
    chunk = sizeof(long);
else
    chunk = 1U << (fls(size) - 1);

?

> +    if ( chunk > sizeof (long) )
> +        chunk = sizeof (long);
> +
> +    for ( ;; )
> +    {
> +        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
> +                                    buffer);
> +        if ( rc != X86EMUL_OKAY )
> +            break;
> +
> +        /* Advance to the next chunk */

Full stop.

> +        gpa += chunk;
> +        buffer += chunk;
> +        size -= chunk;
> +
> +        if ( size == 0 )
> +            break;
> +
> +        /*
> +         * If the chunk now exceeds the remaining size, choose the next
> +         * lowest power of 2 that will fit.
> +         */
> +        while ( chunk > size )
> +            chunk >>= 1;
> +    }
> +
> +    return rc;
> +}
> +
> +static int hvmemul_linear_mmio_access(
> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
> +{
> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> +    unsigned long page_off = gla & (PAGE_SIZE - 1);

Can be int as opposed to long, and "offset" appears to be the prevailing
name.  Also, ~PAGE_MASK.

> +    unsigned int chunk;
> +    paddr_t gpa;
> +    unsigned long one_rep = 1;
> +    int rc;
> +
> +    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
> +
> +    if ( known_gpfn )
> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;
> +    else
> +    {
> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                    hvmemul_ctxt);
> +        if ( rc != X86EMUL_OKAY )
> +            return rc;
> +    }
> +
> +    for ( ;; )
> +    {
> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> +        if ( rc != X86EMUL_OKAY )
> +            break;
> +
> +        gla += chunk;
> +        buffer += chunk;
> +        size -= chunk;
> +
> +        if ( size == 0 )
> +            break;
> +
> +        ASSERT((gla & (PAGE_SIZE - 1)) == 0);

~PAGE_MASK.

> +        ASSERT(size < PAGE_SIZE);

Nothing I can see here prevents size being greater than PAGE_SIZE. 
chunk strictly will be, but size -= chunk can still leave size greater
than a page.

~Andrew

> +        chunk = size;
> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                    hvmemul_ctxt);
> +        if ( rc != X86EMUL_OKAY )
> +            return rc;
> +    }
> +
> +    return rc;
> +}
> +

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument...
  2015-06-30 13:05 ` [PATCH v5 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
@ 2015-07-02 15:39   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 15:39 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> ...from unsigned long to unsigned int
>
> A 64-bit length is not necessary, 32 bits is enough.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-06-30 13:05 ` [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
@ 2015-07-02 15:54   ` Andrew Cooper
  2015-07-02 15:56     ` Paul Durrant
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 15:54 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index d1d79dc..1ed0c13 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -41,12 +41,12 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
>  typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
>  
>  typedef int (*portio_action_t)(
> -    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
> +    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
>  typedef int (*mmio_action_t)(ioreq_t *);
>  struct io_handler {
>      int                 type;
>      unsigned long       addr;
> -    unsigned long       size;
> +    unsigned int        size;

You want to reorder size above addr to fill a 4 byte hole and avoid
introducing a second.
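I.e. something like (sketch; remaining members elided):

    struct io_handler {
        int                 type;
        unsigned int        size;   /* fills the hole after 'type' */
        unsigned long       addr;
        /* ... */
    };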

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-02 15:37   ` Andrew Cooper
@ 2015-07-02 15:55     ` Paul Durrant
  2015-07-02 16:03       ` Andrew Cooper
  0 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-07-02 15:55 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Keir (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 02 July 2015 16:38
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Keir (Xen.org); Jan Beulich
> Subject: Re: [PATCH v5 02/16] x86/hvm: remove multiple open coded
> 'chunking' loops
> 
> On 30/06/15 14:05, Paul Durrant wrote:
> > ...in hvmemul_read/write()
> >
> > Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access()
> functions
> > to reduce code duplication.
> >
> > NOTE: This patch also introduces a change in 'chunking' around a page
> >       boundary. Previously (for example) an 8 byte access at the last
> >       byte of a page would get carried out as 8 single-byte accesses.
> >       It will now be carried out as a single-byte access, followed by
> >       a 4-byte access, a 2-byte access and then another single-byte
> >       access.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Cc: Keir Fraser <keir@xen.org>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > ---
> >  xen/arch/x86/hvm/emulate.c |  223 +++++++++++++++++++++++----------
> -----------
> >  1 file changed, 116 insertions(+), 107 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> > index 8b60843..b67f5db 100644
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -539,6 +539,117 @@ static int hvmemul_virtual_to_linear(
> >      return X86EMUL_EXCEPTION;
> >  }
> >
> > +static int hvmemul_phys_mmio_access(
> > +    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
> > +{
> > +    unsigned long one_rep = 1;
> > +    unsigned int chunk;
> > +    int rc;
> > +
> > +    /* Accesses must fall within a page */
> 
> Full stop.
> 
> > +    BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);
> 
> ~PAGE_MASK as opposed to (PAGE_SIZE - 1)
> 
> > +
> > +    /*
> > +     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
> > +     * accesses larger than sizeof(long), so choose the highest power
> > +     * of 2 not exceeding sizeof(long) as the 'chunk' size.
> > +     */
> > +    chunk = 1 << (fls(size) - 1);
> 
> Depending on size, chunk can become undefined (shifting by 31 or -1) or
> zero (shifting by 32).
> 
> How about
> 
> if ( size > sizeof(long) )
>     chunk = sizeof(long);
> else
>     chunk = 1U << (fls(size) - 1);
> 

fls(size) - 1 can't be more than 31 (since size is an unsigned int). I can assert size != 0. So would

chunk = 1u << (fls(size) - 1);

be ok? (I.e. I just missed the 'u' suffix before).

> ?
> 
> > +    if ( chunk > sizeof (long) )
> > +        chunk = sizeof (long);
> > +
> > +    for ( ;; )
> > +    {
> > +        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
> > +                                    buffer);
> > +        if ( rc != X86EMUL_OKAY )
> > +            break;
> > +
> > +        /* Advance to the next chunk */
> 
> Full stop.
> 
> > +        gpa += chunk;
> > +        buffer += chunk;
> > +        size -= chunk;
> > +
> > +        if ( size == 0 )
> > +            break;
> > +
> > +        /*
> > +         * If the chunk now exceeds the remaining size, choose the next
> > +         * lowest power of 2 that will fit.
> > +         */
> > +        while ( chunk > size )
> > +            chunk >>= 1;
> > +    }
> > +
> > +    return rc;
> > +}
> > +
> > +static int hvmemul_linear_mmio_access(
> > +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> > +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
> known_gpfn)
> > +{
> > +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> > +    unsigned long page_off = gla & (PAGE_SIZE - 1);
> 
> Can be int as opposed to long, and "offset" appears to be the prevailing
> name.  Also, ~PAGE_MASK.

Ok.

> 
> > +    unsigned int chunk;
> > +    paddr_t gpa;
> > +    unsigned long one_rep = 1;
> > +    int rc;
> > +
> > +    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
> > +
> > +    if ( known_gpfn )
> > +        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;
> > +    else
> > +    {
> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> > +                                    hvmemul_ctxt);
> > +        if ( rc != X86EMUL_OKAY )
> > +            return rc;
> > +    }
> > +
> > +    for ( ;; )
> > +    {
> > +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> > +        if ( rc != X86EMUL_OKAY )
> > +            break;
> > +
> > +        gla += chunk;
> > +        buffer += chunk;
> > +        size -= chunk;
> > +
> > +        if ( size == 0 )
> > +            break;
> > +
> > +        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
> 
> ~PAGE_MASK.
> 
> > +        ASSERT(size < PAGE_SIZE);
> 
> Nothing I can see here prevents size being greater than PAGE_SIZE.
> chunk strictly will be, but size -= chunk can still leave size greater
> than a page.
> 

Ok, I'll allow for size >= PAGE_SIZE.

  Paul

> ~Andrew
> 
> > +        chunk = size;
> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> > +                                    hvmemul_ctxt);
> > +        if ( rc != X86EMUL_OKAY )
> > +            return rc;
> > +    }
> > +
> > +    return rc;
> > +}
> > +

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-07-02 15:54   ` Andrew Cooper
@ 2015-07-02 15:56     ` Paul Durrant
  0 siblings, 0 replies; 46+ messages in thread
From: Paul Durrant @ 2015-07-02 15:56 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Keir (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 02 July 2015 16:55
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Keir (Xen.org); Jan Beulich
> Subject: Re: [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t
> and sizes to unsigned int
> 
> On 30/06/15 14:05, Paul Durrant wrote:
> > diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-
> x86/hvm/io.h
> > index d1d79dc..1ed0c13 100644
> > --- a/xen/include/asm-x86/hvm/io.h
> > +++ b/xen/include/asm-x86/hvm/io.h
> > @@ -41,12 +41,12 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
> >  typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
> >
> >  typedef int (*portio_action_t)(
> > -    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
> > +    int dir, uint16_t port, unsigned int bytes, uint32_t *val);
> >  typedef int (*mmio_action_t)(ioreq_t *);
> >  struct io_handler {
> >      int                 type;
> >      unsigned long       addr;
> > -    unsigned long       size;
> > +    unsigned int        size;
> 
> You want to reorder size above addr to fill a 4 byte hole and avoid
> introducing a second.
> 

Sure.

  Paul

> Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-02 15:55     ` Paul Durrant
@ 2015-07-02 16:03       ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 16:03 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir (Xen.org), Jan Beulich

On 02/07/15 16:55, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 02 July 2015 16:38
>> To: Paul Durrant; xen-devel@lists.xenproject.org
>> Cc: Keir (Xen.org); Jan Beulich
>> Subject: Re: [PATCH v5 02/16] x86/hvm: remove multiple open coded
>> 'chunking' loops
>>
>> On 30/06/15 14:05, Paul Durrant wrote:
>>> ...in hvmemul_read/write()
>>>
>>> Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access()
>> functions
>>> to reduce code duplication.
>>>
>>> NOTE: This patch also introduces a change in 'chunking' around a page
>>>       boundary. Previously (for example) an 8 byte access at the last
>>>       byte of a page would get carried out as 8 single-byte accesses.
>>>       It will now be carried out as a single-byte access, followed by
>>>       a 4-byte access, a 2-byte access and then another single-byte
>>>       access.
>>>
>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>> Cc: Keir Fraser <keir@xen.org>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>>  xen/arch/x86/hvm/emulate.c |  223 +++++++++++++++++++++++----------
>> -----------
>>>  1 file changed, 116 insertions(+), 107 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>>> index 8b60843..b67f5db 100644
>>> --- a/xen/arch/x86/hvm/emulate.c
>>> +++ b/xen/arch/x86/hvm/emulate.c
>>> @@ -539,6 +539,117 @@ static int hvmemul_virtual_to_linear(
>>>      return X86EMUL_EXCEPTION;
>>>  }
>>>
>>> +static int hvmemul_phys_mmio_access(
>>> +    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
>>> +{
>>> +    unsigned long one_rep = 1;
>>> +    unsigned int chunk;
>>> +    int rc;
>>> +
>>> +    /* Accesses must fall within a page */
>> Full stop.
>>
>>> +    BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);
>> ~PAGE_MASK as opposed to (PAGE_SIZE - 1)
>>
>>> +
>>> +    /*
>>> +     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
>>> +     * accesses larger than sizeof(long), so choose the highest power
>>> +     * of 2 not exceeding sizeof(long) as the 'chunk' size.
>>> +     */
>>> +    chunk = 1 << (fls(size) - 1);
>> Depending on size, chunk can become undefined (shifting by 31 or -1) or
>> zero (shifting by 32).
>>
>> How about
>>
>> if ( size > sizeof(long) )
>>     chunk = sizeof(long);
>> else
>>     chunk = 1U << (fls(size) - 1);
>>
> fls(size) - 1 can't be more than 31 (since size is an unsigned int). I can assert size != 0. So would

Ah yes - very true (until someone goes and changes size to an unsigned long)

Size being 0 is probably the kind of thing which needs checking at the
very top of the emulation calltree and assumed thereafter.  Might also
want an upper sanity check as well.
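E.g. (sketch only; the exact placement and the upper bound are judgement
calls):

    if ( size == 0 || size > PAGE_SIZE )
        return X86EMUL_UNHANDLEABLE;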

>
> chunk = 1u << (fls(size) - 1);
>
> be ok? (I.e. I just missed the 'u' suffix before).

Yeah.

~Andrew

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts
  2015-06-30 13:05 ` [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
  2015-07-02 14:52   ` Roger Pau Monné
@ 2015-07-02 16:29   ` Andrew Cooper
  1 sibling, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 16:29 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> -bool_t hvm_mmio_internal(paddr_t gpa)
> +static int hvm_portio_read(const struct hvm_io_handler *handler,
> +                           uint64_t addr,
> +                           uint32_t size,
> +                           uint64_t *data)
>  {
> -    struct vcpu *curr = current;
> -    unsigned int i;
> +    uint32_t val = 0;

Default to ~0U to match real hardware.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 06/16] x86/hvm: add length to mmio check op
  2015-06-30 13:05 ` [PATCH v5 06/16] x86/hvm: add length to mmio check op Paul Durrant
@ 2015-07-02 16:37   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 16:37 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> When memory mapped I/O is range checked by internal handlers, the length
> of the access should be taken into account.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/hvm/intercept.c |   22 +++++++++++++++++++---
>  xen/include/asm-x86/hvm/io.h |   16 ++++++++++++++++
>  2 files changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
> index 7d36785..42050f4 100644
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -35,9 +35,19 @@
>  static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
>                                const ioreq_t *p)
>  {
> +    paddr_t first = hvm_mmio_first_byte(p);
> +    paddr_t last = hvm_mmio_last_byte(p);
> +
>      BUG_ON(handler->type != IOREQ_TYPE_COPY);
>  
> -    return handler->mmio.ops->check(current, p->addr);
> +    if ( !handler->mmio.ops->check(current, first) )
> +        return 0;
> +

I would put a comment here about an IO access straddling an MMIO handler
boundary, so that someone investigating this domain crash gets some clue
as to why.
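Something like this, perhaps (wording is only a suggestion):

    /*
     * If the access is not wholly covered by this handler then it
     * straddles the edge of the emulated region.  There is no sensible
     * way to handle that, so crash the domain.
     */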

> +    if ( p->size > 1 &&
> +         !handler->mmio.ops->check(current, last) )
> +        domain_crash(current->domain);
> +
> +    return 1;
>  }
>  
>  static int hvm_mmio_read(const struct hvm_io_handler *handler,
> @@ -112,7 +122,8 @@ static const struct hvm_io_ops portio_ops = {
>  static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>                                      ioreq_t *p)
>  {
> -    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> +    struct vcpu *curr = current;
> +    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
>      const struct hvm_io_ops *ops =
>          (p->type == IOREQ_TYPE_COPY) ?
>          &mmio_ops :
> @@ -223,6 +234,9 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>  
>      if ( i != 0 )
>      {
> +        if ( rc == X86EMUL_UNHANDLEABLE )
> +            domain_crash(curr->domain);
> +
>          p->count = i;
>          rc = X86EMUL_OKAY;
>      }
> @@ -342,7 +356,9 @@ bool_t hvm_mmio_internal(paddr_t gpa)
>  {
>      ioreq_t p = {
>          .type = IOREQ_TYPE_COPY,
> -        .addr = gpa
> +        .addr = gpa,

As a general note, many compilers (gcc included) permit a trailing comma
as the final token before the }, which avoids a diff like this one when a
subsequent member is added.
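E.g. (hypothetical example, not the actual hunk):

    ioreq_t p = {
        .type = IOREQ_TYPE_COPY,
        .addr = gpa,
        .size = size,  /* trailing commas keep later additions to one-line diffs */
    };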

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept
  2015-06-30 13:05 ` [PATCH v5 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
@ 2015-07-02 16:50   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 16:50 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> This patch re-works the dpci portio intercepts so that they can be unified
> with standard portio handling thereby removing a substantial amount of
> code duplication.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/hvm/hvm.c         |    2 +
>  xen/arch/x86/hvm/intercept.c   |   22 ++--
>  xen/arch/x86/hvm/io.c          |  220 ++++++++++++----------------------------
>  xen/include/asm-x86/hvm/io.h   |    5 +
>  xen/include/asm-x86/hvm/vcpu.h |    2 +
>  xen/include/xen/iommu.h        |    1 -
>  6 files changed, 80 insertions(+), 172 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 60f8972..6643e0c 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1486,6 +1486,8 @@ int hvm_domain_initialise(struct domain *d)
>      else
>          d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
>  
> +    register_dpci_portio_handler(d);
> +
>      if ( is_pvh_domain(d) )
>      {
>          register_portio_handler(d, 0, 0x10003, handle_pvh_io);
> diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
> index 42050f4..a613865 100644
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -124,10 +124,7 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>  {
>      struct vcpu *curr = current;
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> -    const struct hvm_io_ops *ops =
> -        (p->type == IOREQ_TYPE_COPY) ?
> -        &mmio_ops :
> -        &portio_ops;
> +    const struct hvm_io_ops *ops = handler->ops;
>      int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
>      uint64_t data;
>      uint64_t addr;
> @@ -247,10 +244,6 @@ static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>  static const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
>  {
>      struct domain *curr_d = current->domain;
> -    const struct hvm_io_ops *ops =
> -        (p->type == IOREQ_TYPE_COPY) ?
> -        &mmio_ops :
> -        &portio_ops;
>      unsigned int i;
>  
>      BUG_ON((p->type != IOREQ_TYPE_PIO) &&
> @@ -260,6 +253,7 @@ static const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
>      {
>          const struct hvm_io_handler *handler =
>              &curr_d->arch.hvm_domain.io_handler[i];
> +        const struct hvm_io_ops *ops = handler->ops;
>  
>          if ( handler->type != p->type )
>              continue;
> @@ -275,13 +269,7 @@ int hvm_io_intercept(ioreq_t *p)
>  {
>      const struct hvm_io_handler *handler;
>  
> -    if ( p->type == IOREQ_TYPE_PIO )
> -    {
> -        int rc = dpci_ioport_intercept(p);
> -        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
> -            return rc;
> -    }
> -    else if ( p->type == IOREQ_TYPE_COPY )
> +    if ( p->type == IOREQ_TYPE_COPY )
>      {
>          int rc = stdvga_intercept_mmio(p);
>          if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
> @@ -296,7 +284,7 @@ int hvm_io_intercept(ioreq_t *p)
>      return hvm_process_io_intercept(handler, p);
>  }
>  
> -static struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
> +struct hvm_io_handler *hvm_next_io_handler(struct domain *d)

Instead of introducing this as a static in patch 5, then modifying it
here to export it, introduce it directly as an exported function in 5.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-06-30 13:05 ` [PATCH v5 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-07-02 16:55   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 16:55 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> @@ -275,7 +275,8 @@ static uint8_t stdvga_mem_readb(uint64_t addr)
>      return ret;
>  }
>  
> -static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
> +static int stdvga_mem_read(const struct hvm_io_handler *handler,
> +                           uint64_t addr, uint32_t size, uint64_t *p_data)
>  {
>      uint64_t data = 0;

This initialiser wants to default to ~0ULL.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-06-30 13:05 ` [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
@ 2015-07-02 17:10   ` Andrew Cooper
  2015-07-02 17:14     ` Paul Durrant
  0 siblings, 1 reply; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 17:10 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> @@ -235,7 +219,7 @@ static int hvmemul_do_io_buffer(
>  
>      BUG_ON(buffer == NULL);
>  
> -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
> +    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
>                         (uintptr_t)buffer);
>      if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
>          memset(buffer, 0xff, size);
> @@ -287,17 +271,53 @@ static int hvmemul_do_io_addr(
>      bool_t is_mmio, paddr_t addr, unsigned long *reps,
>      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
>  {
> -    struct page_info *ram_page;
> +    struct vcpu *v = current;

curr.

> +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);

ram_gfn.

> +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);

offset and ~PAGE_MASK.

> +    struct page_info *ram_page[2];
> +    int nr_pages = 0;

unsigned int.

> +    unsigned long count;
>      int rc;
>  
> -    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
> +    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
>      if ( rc != X86EMUL_OKAY )
> -        return rc;
> +        goto out;
>  
> -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
> +    nr_pages++;
> +
> > +    /* Determine how many reps will fit within this page */
> +    count = min_t(unsigned long,
> +                  *reps,
> +                  df ?
> +                  (page_off + size - 1) / size :
> +                  (PAGE_SIZE - page_off) / size);
> +
> +    if ( count == 0 )
> +    {
> +        /*
> +         * This access must span two pages, so grab a reference to
> +         * the next page and do a single rep.
> +         */
> +        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
> +                                  &ram_page[nr_pages]);

All guest-based ways to trigger an IO spanning a page boundary will be
based on linear address.  If a guest has paging enabled, this movement
to an adjacent physical is not valid.  A new pagetable walk will be
required to determine the correct second page.
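To illustrate with made-up numbers: with paging enabled, adjacent linear
pages need not map to adjacent frames, e.g.

    gla 0x7f00fff8 - 0x7f00ffff  ->  gfn 0x1234
    gla 0x7f010000 - 0x7f010007  ->  gfn 0x0abc   (need not be 0x1235)

so ram_gfn +/- 1 is only correct if something else guarantees physical
contiguity.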

~Andrew

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-02 17:10   ` Andrew Cooper
@ 2015-07-02 17:14     ` Paul Durrant
  2015-07-02 17:31       ` Andrew Cooper
  0 siblings, 1 reply; 46+ messages in thread
From: Paul Durrant @ 2015-07-02 17:14 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Keir (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 02 July 2015 18:11
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Keir (Xen.org); Jan Beulich
> Subject: Re: [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to
> handle retry
> 
> On 30/06/15 14:05, Paul Durrant wrote:
> > @@ -235,7 +219,7 @@ static int hvmemul_do_io_buffer(
> >
> >      BUG_ON(buffer == NULL);
> >
> > -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
> > +    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
> >                         (uintptr_t)buffer);
> >      if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
> >          memset(buffer, 0xff, size);
> > @@ -287,17 +271,53 @@ static int hvmemul_do_io_addr(
> >      bool_t is_mmio, paddr_t addr, unsigned long *reps,
> >      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
> >  {
> > -    struct page_info *ram_page;
> > +    struct vcpu *v = current;
> 
> curr.
> 
> > +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
> 
> ram_gfn.
> 
> > +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
> 
> offset and ~PAGE_MASK.
> 
> > +    struct page_info *ram_page[2];
> > +    int nr_pages = 0;
> 
> unsigned int.
> 
> > +    unsigned long count;
> >      int rc;
> >
> > -    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
> > +    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
> >      if ( rc != X86EMUL_OKAY )
> > -        return rc;
> > +        goto out;
> >
> > -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
> > +    nr_pages++;
> > +
> > +    /* Determine how many reps will fit within this page */
> > +    count = min_t(unsigned long,
> > +                  *reps,
> > +                  df ?
> > +                  (page_off + size - 1) / size :
> > +                  (PAGE_SIZE - page_off) / size);
> > +
> > +    if ( count == 0 )
> > +    {
> > +        /*
> > +         * This access must span two pages, so grab a reference to
> > +         * the next page and do a single rep.
> > +         */
> > +        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
> > +                                  &ram_page[nr_pages]);
> 
> All guest-based ways to trigger an IO spanning a page boundary will be
> based on linear address.  If a guest has paging enabled, this movement
> to an adjacent physical is not valid.  A new pagetable walk will be
> required to determine the correct second page.

I don't think that is true. hvmemul_linear_to_phys() will break at non-contiguous boundaries.

  Paul

> 
> ~Andrew

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-02 17:14     ` Paul Durrant
@ 2015-07-02 17:31       ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-02 17:31 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir (Xen.org), Jan Beulich

On 02/07/15 18:14, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 02 July 2015 18:11
>> To: Paul Durrant; xen-devel@lists.xenproject.org
>> Cc: Keir (Xen.org); Jan Beulich
>> Subject: Re: [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to
>> handle retry
>>
>> On 30/06/15 14:05, Paul Durrant wrote:
>>> @@ -235,7 +219,7 @@ static int hvmemul_do_io_buffer(
>>>
>>>      BUG_ON(buffer == NULL);
>>>
>>> -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
>>> +    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
>>>                         (uintptr_t)buffer);
>>>      if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
>>>          memset(buffer, 0xff, size);
>>> @@ -287,17 +271,53 @@ static int hvmemul_do_io_addr(
>>>      bool_t is_mmio, paddr_t addr, unsigned long *reps,
>>>      unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
>>>  {
>>> -    struct page_info *ram_page;
>>> +    struct vcpu *v = current;
>> curr.
>>
>>> +    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
>> ram_gfn.
>>
>>> +    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
>> offset and ~PAGE_MASK.
>>
>>> +    struct page_info *ram_page[2];
>>> +    int nr_pages = 0;
>> unsigned int.
>>
>>> +    unsigned long count;
>>>      int rc;
>>>
>>> -    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
>>> +    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
>>>      if ( rc != X86EMUL_OKAY )
>>> -        return rc;
>>> +        goto out;
>>>
>>> -    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
>>> +    nr_pages++;
>>> +
>>> +    /* Determine how many reps will fit within this page */
>>> +    count = min_t(unsigned long,
>>> +                  *reps,
>>> +                  df ?
>>> +                  (page_off + size - 1) / size :
>>> +                  (PAGE_SIZE - page_off) / size);
>>> +
>>> +    if ( count == 0 )
>>> +    {
>>> +        /*
>>> +         * This access must span two pages, so grab a reference to
>>> +         * the next page and do a single rep.
>>> +         */
>>> +        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
>>> +                                  &ram_page[nr_pages]);
>> All guest-based ways to trigger an IO spanning a page boundary will be
>> based on linear address.  If a guest has paging enabled, this movement
>> to an adjacent physical is not valid.  A new pagetable walk will be
>> required to determine the correct second page.
> I don't think that is true. hvmemul_linear_to_phys() will break at non-contiguous boundaries.

Hmm - it looks like it will bail with unhandleable on a straddled access
across a non-contiguous boundary.  In which case a comment confirming
the safety of the +/- 1 will be useful to the next person who follows
the same track of logic as I did.
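Something like (suggested wording only):

    /*
     * The access cannot cross a discontiguous guest-physical mapping
     * here: hvmemul_linear_to_phys() fails such accesses with
     * X86EMUL_UNHANDLEABLE, so the adjacent gfn is known to back the
     * remainder of this access.
     */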

~Andrew

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
  2015-06-30 13:05 ` [PATCH v5 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
@ 2015-07-03 15:03   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:03 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser

On 30/06/15 14:05, Paul Durrant wrote:
> diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
> index 11b6028..f84f7c5 100644
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -119,8 +119,8 @@ static const struct hvm_io_ops portio_ops = {
>      .write = hvm_portio_write
>  };
>  
> -static int hvm_process_io_intercept(const struct hvm_io_handler *handler,
> -                                    ioreq_t *p)
> +int hvm_process_io_intercept(const struct hvm_io_handler *handler,
> +                             ioreq_t *p)

Again, I would suggest introducing this as a global function straight
away, to reduce the eventual churn.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 11/16] x86/hvm: split I/O completion handling from state model
  2015-06-30 13:05 ` [PATCH v5 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
@ 2015-07-03 15:08   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:08 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
> index 2ed0606..efb373f 100644
> --- a/xen/include/asm-x86/hvm/vcpu.h
> +++ b/xen/include/asm-x86/hvm/vcpu.h
> @@ -34,11 +34,16 @@ enum hvm_io_state {
>      HVMIO_none = 0,
>      HVMIO_dispatched,
>      HVMIO_awaiting_completion,
> -    HVMIO_handle_mmio_awaiting_completion,
> -    HVMIO_handle_pio_awaiting_completion,
>      HVMIO_completed
>  };
>  
> +enum hvm_io_completion {
> +    HVMIO_no_completion = 0,

No need to default to 0 here.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 12/16] x86/hvm: remove HVMIO_dispatched I/O state
  2015-06-30 13:05 ` [PATCH v5 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
@ 2015-07-03 15:12   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:12 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> By removing the HVMIO_dispatched state and making all pending emulations
> (i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
> various code-paths can be simplified.
>
> The completion case for HVMIO_dispatched can also be trivally removed
> from hvmemul_do_io() as it was already unreachable. This is because that
> state was only ever used for writes or I/O to/from a guest page and
> hvmemul_do_io() is never called to complete such I/O.
>
> NOTE: There is one subtlety in handle_pio()... The only case when
>       handle_pio() got a return code of X86EMUL_RETRY back from
>       hvmemul_do_pio_buffer() and found io_state was not
>       HVMIO_awaiting_completion was in the case where the domain is
>       shutting down. This is because all writes normally yield a return
>       of HVMEMUL_OKAY and all reads put io_state into
>       HVMIO_awaiting_completion. Hence the io_state check there is
>       replaced with a check of the is_shutting_down flag on the domain.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 13/16] x86/hvm: remove hvm_io_state enumeration
  2015-06-30 13:05 ` [PATCH v5 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
@ 2015-07-03 15:13   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:13 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser

On 30/06/15 14:05, Paul Durrant wrote:
> Emulation request status is already covered by STATE_IOREQ_XXX values so
> just use those. The mapping is:
>
> HVMIO_none                -> STATE_IOREQ_NONE
> HVMIO_awaiting_completion -> STATE_IOREQ_READY
> HVMIO_completed           -> STATE_IORESP_READY
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 14/16] x86/hvm: use ioreq_t to track in-flight state
  2015-06-30 13:05 ` [PATCH v5 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
@ 2015-07-03 15:15   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:15 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> Use an ioreq_t rather than open coded state, size, dir and data fields
> in struct hvm_vcpu_io. This also allows PIO completion to be handled
> similarly to MMIO completion by re-issuing the handle_pio() call.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 15/16] x86/hvm: always re-emulate I/O from a buffer
  2015-06-30 13:05 ` [PATCH v5 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
@ 2015-07-03 15:26   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:26 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser

On 30/06/15 14:05, Paul Durrant wrote:
> If memory mapped I/O is 'chunked' then the I/O must be re-emulated,
> otherwise only the first chunk will be processed. This patch makes
> sure all I/O from a buffer is re-emulated regardless of whether it
> is a read or a write.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 16/16] x86/hvm: track large memory mapped accesses by buffer offset
  2015-06-30 13:05 ` [PATCH v5 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
@ 2015-07-03 15:26   ` Andrew Cooper
  0 siblings, 0 replies; 46+ messages in thread
From: Andrew Cooper @ 2015-07-03 15:26 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Keir Fraser, Jan Beulich

On 30/06/15 14:05, Paul Durrant wrote:
> The code in hvmemul_do_io() that tracks large reads or writes, to avoid
> re-issue of component I/O, is defeated by accesses across a page boundary
> because it uses physical address. The code is also only relevant to memory
> mapped I/O to or from a buffer.
>
> This patch re-factors the code and moves it into hvmemul_phys_mmio_access()
> where it is relevant and tracks using buffer offset rather than address.
> Separate I/O emulations (of which there may be up to three per instruction)
> are distinguished by linear address.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix
  2015-06-30 14:48 ` [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
@ 2015-07-07 11:19   ` Fabio Fantoni
  0 siblings, 0 replies; 46+ messages in thread
From: Fabio Fantoni @ 2015-07-07 11:19 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Jim Fehlig, Jan Beulich


On 30/06/2015 16:48, Fabio Fantoni wrote:
> On 30/06/2015 15:05, Paul Durrant wrote:
>> This patch series re-works much of the code involved in emulation of 
>> port
>> and memory mapped I/O for HVM guests.
>>
>> The code has become very convoluted and, at least by inspection, certain
>> emulations will apparently malfunction.
>>
>> The series is broken down into 16 patches (which are also available in
>> my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
>> on the emulation32 branch).
>>
>> Previous changelog
>> ------------------
>>
>> v2:
>>   - Removed bogus assertion from patch #15
>>   - Re-worked patch #17 after basic testing of back-port onto XenServer
>>
>> v3:
>>   - Addressed comments from Jan
>>   - Re-ordered series to bring a couple of more trivial patches to the
>>     front
>>   - Backport to XenServer (4.5) now passing automated tests
>>   - Tested on unstable with QEMU upstream and trad, with and without
>>     HAP (to force shadow emulation)
>>
>> v4:
>>   - Removed previous patch (make sure translated MMIO reads or
>>     writes fall within a page) and rebased rest of series.
>>   - Address Jan's comments on patch #1
>>
>> Changelog (now per-patch)
>> -------------------------
>>
>> 0001-x86-hvm-make-sure-emulation-is-retried-if-domain-is-.patch
>>
>> This is a fix for an issue on staging reported by Don Slutz
>>
>> 0002-x86-hvm-remove-multiple-open-coded-chunking-loops.patch
>>
>> v5: Addressed further comments from Jan
>>
>> 0003-x86-hvm-change-hvm_mmio_read_t-and-hvm_mmio_write_t-.patch
>>
>> v5: New patch to tidy up types
>>
>> 0004-x86-hvm-restrict-port-numbers-to-uint16_t-and-sizes-.patch
>>
>> v5: New patch to tidy up more types
>>
>> 0005-x86-hvm-unify-internal-portio-and-mmio-intercepts.patch
>>
>> v5: Addressed further comments from Jan and simplified implementation
>>      by passing ioreq_t to accept() function
>>
>> 0006-x86-hvm-add-length-to-mmio-check-op.patch
>>
>> v5: Simplified by leaving mmio_check() implementation alone and
>>      calling to check last byte if first-byte check passes
>>
>> 0007-x86-hvm-unify-dpci-portio-intercept-with-standard-po.patch
>>
>> v5: Addressed further comments from Jan
>>
>> 0008-x86-hvm-unify-stdvga-mmio-intercept-with-standard-mm.patch
>>
>> v5: Fixed semantic problems pointed out by Jan
>>
>> 0009-x86-hvm-limit-reps-to-avoid-the-need-to-handle-retry.patch
>>
>> v5: Addressed further comments from Jan
>>
>> 0010-x86-hvm-only-call-hvm_io_assist-from-hvm_wait_for_io.patch
>>
>> v5: Added Jan's acked-by
>>
>> 0011-x86-hvm-split-I-O-completion-handling-from-state-mod.patch
>>
>> v5: Confirmed call to msix_write_completion() is in the correct place.
>>
>> 0012-x86-hvm-remove-HVMIO_dispatched-I-O-state.patch
>>
>> v5: Added some extra comments to the commit
>>
>> 0013-x86-hvm-remove-hvm_io_state-enumeration.patch
>>
>> v5: Added Jan's acked-by
>>
>> 0014-x86-hvm-use-ioreq_t-to-track-in-flight-state.patch
>>
>> v5: Added missing hunk with call to handle_pio()
>>
>> 0015-x86-hvm-always-re-emulate-I-O-from-a-buffer.patch
>>
>> v5: Added Jan's acked-by
>>
>> 0016-x86-hvm-track-large-memory-mapped-accesses-by-buffer.patch
>>
>> v5: Fixed to cache up three distict I/O emulations per instruction
>>
>> Testing
>> -------
>>
>> The series was been back-ported to staging-4.5 and then dropped onto the
>> XenServer (Dundee) patch queue. All automated branch-safety tests pass.
>>
>> The series as-is has been manually tested with a Windows 7 (32-bit) VM
>> using upstream QEMU.
>>
>>
>
> Thanks for your work.
> I did some very quick tests and found no regressions, but on my linux
> domUs qxl is still not working and I'm unable to debug it.
> @Jim Fehlig: can you check whether qxl still works, at least on a suse
> dom0/domU, after this series please?
>
> Can someone tell me how to debug the qxl problem, now that qemu no
> longer crashes but stays at 100% cpu, with nothing about it in the dom0
> logs (if there are any), please?
>
> Thanks for any reply, and sorry for my bad English.

I don't have much knowledge of x86 emulation, but I'm trying desperately
to find the cause of these problems, which persist despite this series of
patches.
In the latest xengt xen patches I saw this one: "vgt: add support of
emulating SSE2 instruction MOVD"
https://github.com/01org/XenGT-Preview-xen/commit/f2bad31f80f698a452c37cb39841da8e4f69350f
That xengt tree is still based on xen 4.5, so its x86_emulate.c is
different, but I noticed some strange things while looking at it and
comparing it with upstream staging...

xengt adds this:
> case 0x7e: /* movd xmm,mm/mm32 */
This still seems to be missing in upstream staging; should it be added?
Should I try the patch on the latest upstream xen, or does it need other
changes?
There is also this:
> ea.bytes = (b == 0x7e ? 4: 16);
An SSE2 instruction with a 4-byte operand size seems strange to me; is
that right?
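
For what it's worth, here is how I read that size choice. This is only an
illustrative sketch in the style of x86_emulate.c, not the actual xengt or
upstream code; the names b and ea.bytes are taken from the snippet above,
and the assumption that the 16-byte path belongs to a movdqa/movdqu-style
opcode sharing the same case is mine:

    /*
     * Illustration only -- not the real emulator code.
     * With a 66 prefix, opcode 0x7e is movd xmm,r/m32: it transfers only
     * the low 32 bits of the xmm register, so a 4-byte operand size is
     * expected even though it is an SSE2 instruction.  A full-register
     * store such as movdqa/movdqu xmm,m128 would account for the 16.
     */
    switch ( b )
    {
    case 0x7e: /* movd xmm,r/m32 (66 prefix) */
        ea.bytes = 4;
        break;
    case 0x7f: /* movdqa/movdqu xmm,m128 */
        ea.bytes = 16;
        break;
    }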


In upstream I saw this:
> case 0xe7: /* movntq mm,m64 */
>                 /* {,v}movntdq xmm,m128 */
>                 /* vmovntdq ymm,m256 */
>          fail_if(ea.type != OP_MEM);
>          fail_if(vex.pfx == vex_f3);
>          /* fall through */

In the last qemu crash I had with qxl on linux, I got this:
> #0  __memset_sse2 () at ../sysdeps/x86_64/multiarch/../memset.S:908
> Last instruction:
> => 0x7ffff3713f7b <__memset_sse2+2363>:    movntdq %xmm0,(%rdi)

Is it possible that the "fail_if(ea.type != OP_MEM);" check, or the other
one, makes a difference here?
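
As far as I understand it, fail_if() only makes the emulator bail out and
report the instruction as unhandleable when the condition is true; the
following is a rough paraphrase from memory, not the exact macro in
x86_emulate.c:

    /* Rough paraphrase of the fail_if() idiom; not the exact definition. */
    #define fail_if(p)                                      \
    do {                                                    \
        rc = (p) ? X86EMUL_UNHANDLEABLE : X86EMUL_OKAY;     \
        if ( rc != X86EMUL_OKAY )                           \
            goto done;                                      \
    } while ( 0 )

So if one of those checks fired for the movntdq store, I would expect the
emulation to be rejected outright rather than to spin, but I may be
misreading it.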



After applying this patch series qemu no longer crashes, but it stays at
100% cpu and the domU is unusable; all I can do is xl destroy. I find
nothing in the logs and I'm unable to debug it.
Is there something I can do to debug this?

I also took a quick look at the suse kernel patches
(https://github.com/openSUSE/kernel-source/tree/SLE12-SP1), where qxl is
also working on linux domUs (other things already seem similar, based on
what Jim Fehlig told me), but I didn't find a possible fix/workaround to
try. Can someone point me to patches I should try, please?

Any help in finding a workaround or fix to apply upstream, so that linux
domUs work with qxl in every case, is appreciated.



^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2015-07-07 11:19 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-30 13:05 [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Paul Durrant
2015-06-30 13:05 ` [PATCH v5 01/16] x86/hvm: make sure emulation is retried if domain is shutting down Paul Durrant
2015-06-30 13:45   ` Andrew Cooper
2015-06-30 16:14     ` Don Slutz
2015-06-30 16:29       ` Paul Durrant
2015-06-30 13:05 ` [PATCH v5 02/16] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
2015-07-02 15:37   ` Andrew Cooper
2015-07-02 15:55     ` Paul Durrant
2015-07-02 16:03       ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 03/16] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
2015-07-02 15:39   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 04/16] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
2015-07-02 15:54   ` Andrew Cooper
2015-07-02 15:56     ` Paul Durrant
2015-06-30 13:05 ` [PATCH v5 05/16] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
2015-07-02 14:52   ` Roger Pau Monné
2015-07-02 15:02     ` Paul Durrant
2015-07-02 15:12       ` Roger Pau Monné
2015-07-02 15:12         ` Paul Durrant
2015-07-02 16:29   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 06/16] x86/hvm: add length to mmio check op Paul Durrant
2015-07-02 16:37   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 07/16] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
2015-07-02 16:50   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 08/16] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
2015-07-02 16:55   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 09/16] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
2015-07-02 17:10   ` Andrew Cooper
2015-07-02 17:14     ` Paul Durrant
2015-07-02 17:31       ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 10/16] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
2015-07-03 15:03   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 11/16] x86/hvm: split I/O completion handling from state model Paul Durrant
2015-07-03 15:08   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 12/16] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
2015-07-03 15:12   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 13/16] x86/hvm: remove hvm_io_state enumeration Paul Durrant
2015-07-03 15:13   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 14/16] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
2015-07-03 15:15   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 15/16] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
2015-07-03 15:26   ` Andrew Cooper
2015-06-30 13:05 ` [PATCH v5 16/16] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
2015-07-03 15:26   ` Andrew Cooper
2015-06-30 14:48 ` [PATCH v5 00/16] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
2015-07-07 11:19   ` Fabio Fantoni
