* [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix
@ 2015-07-09 13:10 Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
                   ` (15 more replies)
  0 siblings, 16 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

This patch series re-works much of the code involved in emulation of port
and memory mapped I/O for HVM guests.

The code has become very convoluted and, at least by inspection, certain
emulations appear to malfunction.

The series is broken down into 15 patches (which are also available in
my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
on the emulation34 branch).

Previous changelog
------------------

v4:
 - Removed previous patch (make sure translated MMIO reads or
   writes fall within a page) and rebased rest of series.
 - Address Jan's comments on patch #1

v3:
 - Addressed comments from Jan
 - Re-ordered series to bring a couple of more trivial patches to the
   front
 - Backport to XenServer (4.5) now passing automated tests
 - Tested on unstable with QEMU upstream and trad, with and without
   HAP (to force shadow emulation)

v2:
 - Removed bogus assertion from patch #15
 - Re-worked patch #17 after basic testing of back-port onto XenServer

Subsequent changes are logged in the individual patch files (thanks
to David Vrabel for that).

Testing
-------

v6 of the series has been back-ported to staging-4.5 and then dropped
onto the XenServer (Dundee) patch queue. All automated branch-safety tests
pass.

v7 has only been compile-tested, since the changes were largely cosmetic. It
will be back-ported in the near future.


* [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 15:13   ` Jan Beulich
  2015-07-09 13:10 ` [PATCH v7 02/15] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.

NOTE: This patch also introduces a change in 'chunking' around a page
      boundary. Previously (for example) an 8 byte access at the last
      byte of a page would get carried out as 8 single-byte accesses.
      It will now be carried out as a single-byte access, followed by
      a 4-byte access, a 2-byte access and then another single-byte
      access.
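
As an illustration (this sketch is not part of the patch), the combined
effect of the two new functions can be modelled in a few lines of
standalone C: hvmemul_linear_mmio_access() clips the first chunk to the
page boundary, and hvmemul_phys_mmio_access() splits whatever remains
into descending powers of 2:

  #include <stdio.h>

  #define PAGE_SIZE 4096u

  /* Highest power of 2 not exceeding n (n != 0); models 1u << (fls(n) - 1). */
  static unsigned int pow2_floor(unsigned int n)
  {
      while ( n & (n - 1) )
          n &= n - 1;
      return n;
  }

  int main(void)
  {
      unsigned int offset = 0xfff, size = 8; /* the example from the NOTE */

      while ( size != 0 )
      {
          unsigned int chunk = PAGE_SIZE - offset; /* clip to the page... */

          if ( chunk > size )
              chunk = size;
          chunk = pow2_floor(chunk); /* ...then round down to a power of 2 */

          printf("%u byte(s) at page offset %#x\n", chunk, offset);
          offset = (offset + chunk) & (PAGE_SIZE - 1);
          size -= chunk;
      }

      return 0; /* prints: 1 at 0xfff, then 4, 2 and 1 on the next page */
  }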

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- hvmemul_linear_mmio_access() left alone (i.e. no change)
- Added Andrew's reviewed-by

v6:
- Addressed comments from Andrew

v5:
- Addressed comments from Jan
---
 xen/arch/x86/hvm/emulate.c |  223 +++++++++++++++++++++++---------------------
 1 file changed, 116 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8b60843..b823d84 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -539,6 +539,117 @@ static int hvmemul_virtual_to_linear(
     return X86EMUL_EXCEPTION;
 }
 
+static int hvmemul_phys_mmio_access(
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
+{
+    unsigned long one_rep = 1;
+    unsigned int chunk;
+    int rc;
+
+    /* Accesses must fall within a page. */
+    BUG_ON((gpa & ~PAGE_MASK) + size > PAGE_SIZE);
+
+    /*
+     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
+     * accesses larger than sizeof(long), so choose the highest power
+     * of 2 not exceeding sizeof(long) as the 'chunk' size.
+     */
+    ASSERT(size != 0);
+    chunk = 1u << (fls(size) - 1);
+    if ( chunk > sizeof (long) )
+        chunk = sizeof (long);
+
+    for ( ;; )
+    {
+        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                    buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        /* Advance to the next chunk. */
+        gpa += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        /*
+         * If the chunk now exceeds the remaining size, choose the next
+         * lowest power of 2 that will fit.
+         */
+        while ( chunk > size )
+            chunk >>= 1;
+    }
+
+    return rc;
+}
+
+static int hvmemul_linear_mmio_access(
+    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
+{
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    unsigned long offset = gla & ~PAGE_MASK;
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
+
+    if ( known_gpfn )
+        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
+    else
+    {
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    for ( ;; )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        gla += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        ASSERT((gla & ~PAGE_MASK) == 0);
+        chunk = min_t(unsigned int, size, PAGE_SIZE);
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    return rc;
+}
+
+static inline int hvmemul_linear_mmio_read(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
+static inline int hvmemul_linear_mmio_write(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
 static int __hvmemul_read(
     enum x86_segment seg,
     unsigned long offset,
@@ -549,51 +660,19 @@ static int __hvmemul_read(
 {
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /*
-     * We only need to handle sizes actual instruction operands can have. All
-     * such sizes are either powers of 2 or the sum of two powers of 2. Thus
-     * picking as initial chunk size the largest power of 2 not greater than
-     * the total size will always result in only power-of-2 size requests
-     * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-powers-of-2).
-     */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
-
     if ( ((access_type != hvm_access_insn_fetch
            ? vio->mmio_access.read_access
            : vio->mmio_access.insn_fetch)) &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
-    {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            addr += chunk;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
-    }
+        return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
 
     if ( (seg != x86_seg_none) &&
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
@@ -612,30 +691,8 @@ static int __hvmemul_read(
     case HVMCOPY_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
             return X86EMUL_UNHANDLEABLE;
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+
+        return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
@@ -701,43 +758,18 @@ static int hvmemul_write(
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, hvm_access_write, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /* See the respective comment in __hvmemul_read(). */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
 
     if ( vio->mmio_access.write_access &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
-    {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            addr += chunk;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
-    }
+        return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 1);
 
     if ( (seg != x86_seg_none) &&
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
@@ -752,30 +784,7 @@ static int hvmemul_write(
     case HVMCOPY_bad_gva_to_gfn:
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+        return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
-- 
1.7.10.4


* [PATCH v7 02/15] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument...
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 03/15] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

...from unsigned long to unsigned int

A 64-bit length is not necessary; such accesses are at most 8 bytes, so
32 bits is more than enough.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- No change

v6:
- Added Andrew's reviewed-by

v5:
- New patch to tidy up types
---
 xen/arch/x86/hvm/hpet.c                   |    4 ++--
 xen/arch/x86/hvm/vioapic.c                |    4 ++--
 xen/arch/x86/hvm/vlapic.c                 |   18 +++++++++---------
 xen/arch/x86/hvm/vmsi.c                   |    4 ++--
 xen/drivers/passthrough/amd/iommu_guest.c |    8 ++++----
 xen/include/asm-x86/hvm/io.h              |    6 +++---
 6 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 9585ca8..30ac5dd 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -166,7 +166,7 @@ static inline int hpet_check_access_length(
 }
 
 static int hpet_read(
-    struct vcpu *v, unsigned long addr, unsigned long length,
+    struct vcpu *v, unsigned long addr, unsigned int length,
     unsigned long *pval)
 {
     HPETState *h = vcpu_vhpet(v);
@@ -301,7 +301,7 @@ static inline uint64_t hpet_fixup_reg(
 
 static int hpet_write(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long val)
+    unsigned int length, unsigned long val)
 {
     HPETState *h = vcpu_vhpet(v);
     uint64_t old_val, new_val;
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index cbbef9f..5c8d890 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -94,7 +94,7 @@ static uint32_t vioapic_read_indirect(const struct hvm_hw_vioapic *vioapic)
 
 static int vioapic_read(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long *pval)
+    unsigned int length, unsigned long *pval)
 {
     const struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
     uint32_t result;
@@ -215,7 +215,7 @@ static void vioapic_write_indirect(
 
 static int vioapic_write(
     struct vcpu *v, unsigned long addr,
-    unsigned long length, unsigned long val)
+    unsigned int length, unsigned long val)
 {
     struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 32e649e..0235cb3 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -581,7 +581,7 @@ static uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset)
 
 static int vlapic_read(
     struct vcpu *v, unsigned long address,
-    unsigned long len, unsigned long *pval)
+    unsigned int len, unsigned long *pval)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int offset = address - vlapic_base_address(vlapic);
@@ -611,12 +611,12 @@ static int vlapic_read(
         break;
 
     default:
-        gdprintk(XENLOG_ERR, "Local APIC read with len=%#lx, "
+        gdprintk(XENLOG_ERR, "Local APIC read with len=%#x, "
                  "should be 4 instead.\n", len);
         goto exit_and_crash;
     }
 
-    HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "offset %#x with length %#lx, "
+    HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "offset %#x with length %#x, "
                 "and the result is %#x", offset, len, result);
 
  out:
@@ -624,7 +624,7 @@ static int vlapic_read(
     return X86EMUL_OKAY;
 
  unaligned_exit_and_crash:
-    gdprintk(XENLOG_ERR, "Unaligned LAPIC read len=%#lx at offset=%#x.\n",
+    gdprintk(XENLOG_ERR, "Unaligned LAPIC read len=%#x at offset=%#x.\n",
              len, offset);
  exit_and_crash:
     domain_crash(v->domain);
@@ -819,7 +819,7 @@ static int vlapic_reg_write(struct vcpu *v,
 }
 
 static int vlapic_write(struct vcpu *v, unsigned long address,
-                        unsigned long len, unsigned long val)
+                        unsigned int len, unsigned long val)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int offset = address - vlapic_base_address(vlapic);
@@ -827,7 +827,7 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
 
     if ( offset != APIC_EOI )
         HVM_DBG_LOG(DBG_LEVEL_VLAPIC,
-                    "offset %#x with length %#lx, and value is %#lx",
+                    "offset %#x with length %#x, and value is %#lx",
                     offset, len, val);
 
     /*
@@ -854,11 +854,11 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
             break;
 
         default:
-            gprintk(XENLOG_ERR, "LAPIC write with len %lu\n", len);
+            gprintk(XENLOG_ERR, "LAPIC write with len %u\n", len);
             goto exit_and_crash;
         }
 
-        gdprintk(XENLOG_INFO, "Notice: LAPIC write with len %lu\n", len);
+        gdprintk(XENLOG_INFO, "Notice: LAPIC write with len %u\n", len);
         offset &= ~3;
     }
     else if ( unlikely(offset & 3) )
@@ -867,7 +867,7 @@ static int vlapic_write(struct vcpu *v, unsigned long address,
     return vlapic_reg_write(v, offset, val);
 
  unaligned_exit_and_crash:
-    gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%lu offset=%#x.\n",
+    gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%u offset=%#x.\n",
             len, offset);
  exit_and_crash:
     domain_crash(v->domain);
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index f89233d..6eeae0a 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -202,7 +202,7 @@ static struct msi_desc *msixtbl_addr_to_desc(
 
 static int msixtbl_read(
     struct vcpu *v, unsigned long address,
-    unsigned long len, unsigned long *pval)
+    unsigned int len, unsigned long *pval)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
@@ -258,7 +258,7 @@ out:
 }
 
 static int msixtbl_write(struct vcpu *v, unsigned long address,
-                         unsigned long len, unsigned long val)
+                         unsigned int len, unsigned long val)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index 7b0c102..a832fdb 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -682,7 +682,7 @@ static uint64_t iommu_mmio_read64(struct guest_iommu *iommu,
 }
 
 static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr,
-                                 unsigned long len, unsigned long *pval)
+                                 unsigned int len, unsigned long *pval)
 {
     struct guest_iommu *iommu = vcpu_iommu(v);
     unsigned long offset;
@@ -695,7 +695,7 @@ static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr,
     if ( unlikely((offset & (len - 1 )) || (len > 8)) )
     {
         AMD_IOMMU_DEBUG("iommu mmio read access is not aligned:"
-                        " offset = %lx, len = %lx\n", offset, len);
+                        " offset = %lx, len = %x\n", offset, len);
         return X86EMUL_UNHANDLEABLE;
     }
 
@@ -772,7 +772,7 @@ static void guest_iommu_mmio_write64(struct guest_iommu *iommu,
 }
 
 static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr,
-                                  unsigned long len, unsigned long val)
+                                  unsigned int len, unsigned long val)
 {
     struct guest_iommu *iommu = vcpu_iommu(v);
     unsigned long offset;
@@ -785,7 +785,7 @@ static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr,
     if ( unlikely((offset & (len - 1)) || (len > 8)) )
     {
         AMD_IOMMU_DEBUG("iommu mmio write access is not aligned:"
-                        " offset = %lx, len = %lx\n", offset, len);
+                        " offset = %lx, len = %x\n", offset, len);
         return X86EMUL_UNHANDLEABLE;
     }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index f2aaec5..d1d79dc 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -32,11 +32,11 @@
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
-                               unsigned long length,
+                               unsigned int length,
                                unsigned long *val);
 typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long addr,
-                                unsigned long length,
+                                unsigned int length,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
@@ -111,7 +111,7 @@ static inline void relocate_portio_handler(
 
 static inline void register_buffered_io_handler(
     struct domain *d, unsigned long addr,
-    unsigned long size, mmio_action_t action)
+    unsigned int size, mmio_action_t action)
 {
     register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
 }
-- 
1.7.10.4


* [PATCH v7 03/15] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 02/15] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 15:24   ` Jan Beulich
  2015-07-09 13:10 ` [PATCH v7 04/15] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

Building on the previous patch, this patch restricts portio port numbers
to uint16_t in registration/relocation calls. In portio_action_t, though,
the port number is changed to unsigned int to avoid the compiler generating
16-bit operations unnecessarily. The patch also changes I/O sizes to
unsigned int, which then allows the io_handler size field to be reduced to
an unsigned int.
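
For clarity, the resulting split, pulled out of the hunks below
(post-patch declarations, trimmed):

  typedef int (*portio_action_t)(
      int dir, unsigned int port, unsigned int bytes, uint32_t *val);

  static inline void register_portio_handler(
      struct domain *d, uint16_t port, unsigned int size,
      portio_action_t action)
  {
      register_io_handler(d, port, size, action, HVM_PORTIO);
  }

Registration narrows the port to its architectural 16-bit width, while
the callback keeps full-width parameters so the compiler can use plain
32-bit arithmetic rather than operand-size-prefixed 16-bit instructions.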

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Change port type in portio_action_t to unsigned int as requested
  by Jan

v6:
- Added Andrew's reviewed-by

v5:
- New patch to tidy up more types
---
 xen/arch/x86/hvm/hvm.c       |    6 +++---
 xen/arch/x86/hvm/i8254.c     |    8 ++++----
 xen/arch/x86/hvm/intercept.c |    4 ++--
 xen/arch/x86/hvm/pmtimer.c   |    4 ++--
 xen/arch/x86/hvm/rtc.c       |    2 +-
 xen/arch/x86/hvm/stdvga.c    |    2 +-
 xen/arch/x86/hvm/vpic.c      |    4 ++--
 xen/include/asm-x86/hvm/io.h |   20 ++++++++++----------
 8 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 554e239..126882d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -559,7 +559,7 @@ static int hvm_add_ioreq_gmfn(
 }
 
 static int hvm_print_line(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct domain *cd = current->domain;
     char c = *val;
@@ -585,7 +585,7 @@ static int hvm_print_line(
 }
 
 static int hvm_access_cf8(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct domain *d = current->domain;
 
@@ -597,7 +597,7 @@ static int hvm_access_cf8(
 }
 
 static int handle_pvh_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct domain *currd = current->domain;
 
diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
index 36a0a53..8a93c88 100644
--- a/xen/arch/x86/hvm/i8254.c
+++ b/xen/arch/x86/hvm/i8254.c
@@ -50,9 +50,9 @@
 #define RW_STATE_WORD1 4
 
 static int handle_pit_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val);
 static int handle_speaker_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val);
 
 #define get_guest_time(v) \
    (is_hvm_vcpu(v) ? hvm_get_guest_time(v) : (u64)get_s_time())
@@ -479,7 +479,7 @@ void pit_deinit(struct domain *d)
 
 /* the intercept action for PIT DM retval:0--not handled; 1--handled */  
 static int handle_pit_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct PITState *vpit = vcpu_vpit(current);
 
@@ -522,7 +522,7 @@ static uint32_t speaker_ioport_read(
 }
 
 static int handle_speaker_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct PITState *vpit = vcpu_vpit(current);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index cc44733..52879ff 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -369,7 +369,7 @@ int hvm_io_intercept(ioreq_t *p, int type)
 }
 
 void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
+    struct domain *d, unsigned long addr, unsigned int size,
     void *action, int type)
 {
     struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
@@ -386,7 +386,7 @@ void register_io_handler(
 
 void relocate_io_handler(
     struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type)
+    unsigned int size, int type)
 {
     struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
     int i;
diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index 6ad2797..34fc062 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -142,7 +142,7 @@ static void pmt_timer_callback(void *opaque)
 
 /* Handle port I/O to the PM1a_STS and PM1a_EN registers */
 static int handle_evt_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
@@ -205,7 +205,7 @@ static int handle_evt_io(
 
 /* Handle port I/O to the TMR_VAL register */
 static int handle_pmt_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
index 3448971..a9efeaf 100644
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -697,7 +697,7 @@ static uint32_t rtc_ioport_read(RTCState *s, uint32_t addr)
 }
 
 static int handle_rtc_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct RTCState *vrtc = vcpu_vrtc(current);
 
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 13d1029..0e18b76 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -173,7 +173,7 @@ static void stdvga_out(uint32_t port, uint32_t bytes, uint32_t val)
 }
 
 static int stdvga_intercept_pio(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
 
diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 8eea061..7c2edc8 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -324,7 +324,7 @@ static uint32_t vpic_ioport_read(struct hvm_hw_vpic *vpic, uint32_t addr)
 }
 
 static int vpic_intercept_pic_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_vpic *vpic;
 
@@ -346,7 +346,7 @@ static int vpic_intercept_pic_io(
 }
 
 static int vpic_intercept_elcr_io(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
     struct hvm_hw_vpic *vpic;
     uint32_t data;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index d1d79dc..ef76e69 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -41,12 +41,12 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
 typedef int (*portio_action_t)(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+    int dir, unsigned int port, unsigned int bytes, uint32_t *val);
 typedef int (*mmio_action_t)(ioreq_t *);
 struct io_handler {
     int                 type;
+    unsigned int        size;
     unsigned long       addr;
-    unsigned long       size;
     union {
         portio_action_t portio;
         mmio_action_t   mmio;
@@ -75,11 +75,11 @@ extern const struct hvm_mmio_ops iommu_mmio_ops;
 
 int hvm_io_intercept(ioreq_t *p, int type);
 void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
+    struct domain *d, unsigned long addr, unsigned int size,
     void *action, int type);
 void relocate_io_handler(
     struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type);
+    unsigned int size, int type);
 
 static inline int hvm_portio_intercept(ioreq_t *p)
 {
@@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
 int hvm_buffered_io_send(ioreq_t *p);
 
 static inline void register_portio_handler(
-    struct domain *d, unsigned long addr,
-    unsigned long size, portio_action_t action)
+    struct domain *d, uint16_t port, unsigned int size,
+    portio_action_t action)
 {
-    register_io_handler(d, addr, size, action, HVM_PORTIO);
+    register_io_handler(d, port, size, action, HVM_PORTIO);
 }
 
 static inline void relocate_portio_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size)
+    struct domain *d, uint16_t old_port, uint16_t new_port,
+    unsigned int size)
 {
-    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
+    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
 }
 
 static inline void register_buffered_io_handler(
-- 
1.7.10.4


* [PATCH v7 04/15] x86/hvm: unify internal portio and mmio intercepts
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (2 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 03/15] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 05/15] x86/hvm: add length to mmio check op Paul Durrant
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The implementations of the mmio and portio intercepts are unnecessarily
different, which leads to much code duplication. This patch unifies much of the
intercept handling, leaving only distinct handlers for stdvga mmio and dpci
portio. Subsequent patches will unify those handlers.
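
The unified dispatch hinges on a single handler type; the following is a
sketch of its shape, assumed from its uses in the hunks below (the full
io.h hunk is not reproduced here):

  struct hvm_io_handler {
      union {
          struct {
              const struct hvm_mmio_ops *ops;
          } mmio;
          struct {
              unsigned int port, size;
              portio_action_t action;
          } portio;
      };
      int type; /* IOREQ_TYPE_COPY (mmio) or IOREQ_TYPE_PIO (portio) */
  };

hvm_find_io_handler() matches an ioreq_t against this table via a
per-type accept() op, and hvm_process_io_intercept() then funnels both
kinds of access through common read()/write() ops.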

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Change start/end fields in portio handler to port/size so that
  relocation only has to modify a single field
- Addressed further comments from Jan

v6:
- Added Andrew's reviewed-by and made the modification requested
  by Roger

v5:
- Addressed further comments from Jan and simplified implementation
  by passing ioreq_t to accept() function
---
 xen/arch/x86/hvm/emulate.c                |   11 +-
 xen/arch/x86/hvm/hpet.c                   |    4 +-
 xen/arch/x86/hvm/hvm.c                    |    9 +-
 xen/arch/x86/hvm/intercept.c              |  475 +++++++++++++----------------
 xen/arch/x86/hvm/stdvga.c                 |   32 +-
 xen/arch/x86/hvm/vioapic.c                |    4 +-
 xen/arch/x86/hvm/vlapic.c                 |    5 +-
 xen/arch/x86/hvm/vmsi.c                   |   10 +-
 xen/drivers/passthrough/amd/iommu_guest.c |   30 +-
 xen/include/asm-x86/hvm/domain.h          |    1 +
 xen/include/asm-x86/hvm/io.h              |  118 ++++---
 11 files changed, 331 insertions(+), 368 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index b823d84..20bf372 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -142,16 +142,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
-    if ( is_mmio )
-    {
-        rc = hvm_mmio_intercept(&p);
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            rc = hvm_buffered_io_intercept(&p);
-    }
-    else
-    {
-        rc = hvm_portio_intercept(&p);
-    }
+    rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 30ac5dd..732504a 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -504,7 +504,7 @@ static int hpet_range(struct vcpu *v, unsigned long addr)
              (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
 }
 
-const struct hvm_mmio_ops hpet_mmio_ops = {
+static const struct hvm_mmio_ops hpet_mmio_ops = {
     .check = hpet_range,
     .read  = hpet_read,
     .write = hpet_write
@@ -659,6 +659,8 @@ void hpet_init(struct domain *d)
         h->hpet.comparator64[i] = ~0ULL;
         h->pt[i].source = PTSRC_isa;
     }
+
+    register_mmio_handler(d, &hpet_mmio_ops);
 }
 
 void hpet_deinit(struct domain *d)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 126882d..1fd5efc 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1455,9 +1455,6 @@ int hvm_domain_initialise(struct domain *d)
     spin_lock_init(&d->arch.hvm_domain.irq_lock);
     spin_lock_init(&d->arch.hvm_domain.uc_lock);
 
-    INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
-    spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);
-
     hvm_init_cacheattr_region_list(d);
 
     rc = paging_enable(d, PG_refcounts|PG_translate|PG_external);
@@ -1465,11 +1462,11 @@ int hvm_domain_initialise(struct domain *d)
         goto fail0;
 
     d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
-    d->arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler);
+    d->arch.hvm_domain.io_handler = xzalloc_array(struct hvm_io_handler,
+                                                  NR_IO_HANDLERS);
     rc = -ENOMEM;
     if ( !d->arch.hvm_domain.params || !d->arch.hvm_domain.io_handler )
         goto fail1;
-    d->arch.hvm_domain.io_handler->num_slot = 0;
 
     /* Set the default IO Bitmap. */
     if ( is_hardware_domain(d) )
@@ -1506,6 +1503,8 @@ int hvm_domain_initialise(struct domain *d)
 
     rtc_init(d);
 
+    msixtbl_init(d);
+
     register_portio_handler(d, 0xe9, 1, hvm_print_line);
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 52879ff..a44baa1 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -32,205 +32,86 @@
 #include <xen/event.h>
 #include <xen/iommu.h>
 
-static const struct hvm_mmio_ops *const
-hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
+static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
+                              const ioreq_t *p)
 {
-    &hpet_mmio_ops,
-    &vlapic_mmio_ops,
-    &vioapic_mmio_ops,
-    &msixtbl_mmio_ops,
-    &iommu_mmio_ops
-};
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
+
+    return handler->mmio.ops->check(current, p->addr);
+}
 
-static int hvm_mmio_access(struct vcpu *v,
-                           ioreq_t *p,
-                           hvm_mmio_read_t read,
-                           hvm_mmio_write_t write)
+static int hvm_mmio_read(const struct hvm_io_handler *handler,
+                         uint64_t addr, uint32_t size, uint64_t *data)
 {
-    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
-    unsigned long data;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = read(v, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else /* p->dir == IOREQ_WRITE */
-            rc = write(v, p->addr, p->size, p->data);
-        return rc;
-    }
+    return handler->mmio.ops->read(current, addr, size, data);
+}
 
-    if ( p->dir == IOREQ_READ )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                rc = read(v, p->addr + step * i, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
+static int hvm_mmio_write(const struct hvm_io_handler *handler,
+                          uint64_t addr, uint32_t size, uint64_t data)
+{
+    BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
-    }
-    else
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = write(v, p->addr + step * i, p->size, data);
-            if ( rc != X86EMUL_OKAY )
-                break;
-        }
+    return handler->mmio.ops->write(current, addr, size, data);
+}
 
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
-    }
+static const struct hvm_io_ops mmio_ops = {
+    .accept = hvm_mmio_accept,
+    .read = hvm_mmio_read,
+    .write = hvm_mmio_write
+};
 
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+static bool_t hvm_portio_accept(const struct hvm_io_handler *handler,
+                                const ioreq_t *p)
+{
+    unsigned int start = handler->portio.port;
+    unsigned int end = start + handler->portio.size;
 
-    return rc;
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
+
+    return (p->addr >= start) && ((p->addr + p->size) <= end);
 }
 
-bool_t hvm_mmio_internal(paddr_t gpa)
+static int hvm_portio_read(const struct hvm_io_handler *handler,
+                           uint64_t addr, uint32_t size, uint64_t *data)
 {
-    struct vcpu *curr = current;
-    unsigned int i;
+    uint32_t val = ~0u;
+    int rc;
+
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
-        if ( hvm_mmio_handlers[i]->check(curr, gpa) )
-            return 1;
+    rc = handler->portio.action(IOREQ_READ, addr, size, &val);
+    *data = val;
 
-    return 0;
+    return rc;
 }
 
-int hvm_mmio_intercept(ioreq_t *p)
+static int hvm_portio_write(const struct hvm_io_handler *handler,
+                            uint64_t addr, uint32_t size, uint64_t data)
 {
-    struct vcpu *v = current;
-    int i;
-
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
-    {
-        hvm_mmio_check_t check = hvm_mmio_handlers[i]->check;
+    uint32_t val = data;
 
-        if ( check(v, p->addr) )
-        {
-            if ( unlikely(p->count > 1) &&
-                 !check(v, unlikely(p->df)
-                        ? p->addr - (p->count - 1L) * p->size
-                        : p->addr + (p->count - 1L) * p->size) )
-                p->count = 1;
-
-            return hvm_mmio_access(
-                v, p,
-                hvm_mmio_handlers[i]->read,
-                hvm_mmio_handlers[i]->write);
-        }
-    }
+    BUG_ON(handler->type != IOREQ_TYPE_PIO);
 
-    return X86EMUL_UNHANDLEABLE;
+    return handler->portio.action(IOREQ_WRITE, addr, size, &val);
 }
 
-static int process_portio_intercept(portio_action_t action, ioreq_t *p)
+static const struct hvm_io_ops portio_ops = {
+    .accept = hvm_portio_accept,
+    .read = hvm_portio_read,
+    .write = hvm_portio_write
+};
+
+int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                             ioreq_t *p)
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct hvm_io_ops *ops = (p->type == IOREQ_TYPE_COPY) ?
+                                   &mmio_ops : &portio_ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else
-        {
-            data = p->data;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
-        }
-        return rc;
-    }
+    uint64_t data;
+    uint64_t addr;
 
     if ( p->dir == IOREQ_READ )
     {
@@ -246,31 +127,40 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
             }
             else
             {
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
+                addr = (p->type == IOREQ_TYPE_COPY) ?
+                       p->addr + step * i :
+                       p->addr;
+                rc = ops->read(handler, addr, p->size, &data);
                 if ( rc != X86EMUL_OKAY )
                     break;
             }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
+
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_to_guest_phys(p->data + step * i,
+                                                &data, p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    /* Drop the write as real hardware would. */
+                    continue;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT_UNREACHABLE();
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY)
-                break;
+            else
+                p->data = data;
         }
 
         if ( rc == X86EMUL_RETRY )
@@ -284,29 +174,37 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
     {
         for ( i = 0; i < p->count; i++ )
         {
-            data = 0;
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
+                                                  p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    data = ~0;
+                    break;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT_UNREACHABLE();
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
+            else
+                data = p->data;
+
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                   p->addr + step * i :
+                   p->addr;
+            rc = ops->write(handler, addr, p->size, data);
             if ( rc != X86EMUL_OKAY )
                 break;
         }
@@ -324,78 +222,119 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
     return rc;
 }
 
-/*
- * Check if the request is handled inside xen
- * return value: 0 --not handled; 1 --handled
- */
-int hvm_io_intercept(ioreq_t *p, int type)
+const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
+{
+    struct domain *curr_d = current->domain;
+    const struct hvm_io_ops *ops = (p->type == IOREQ_TYPE_COPY) ?
+                                   &mmio_ops : &portio_ops;
+    unsigned int i;
+
+    BUG_ON((p->type != IOREQ_TYPE_PIO) &&
+           (p->type != IOREQ_TYPE_COPY));
+
+    for ( i = 0; i < curr_d->arch.hvm_domain.io_handler_count; i++ )
+    {
+        const struct hvm_io_handler *handler =
+            &curr_d->arch.hvm_domain.io_handler[i];
+
+        if ( handler->type != p->type )
+            continue;
+
+        if ( ops->accept(handler, p) )
+            return handler;
+    }
+
+    return NULL;
+}
+
+int hvm_io_intercept(ioreq_t *p)
 {
-    struct vcpu *v = current;
-    struct hvm_io_handler *handler = v->domain->arch.hvm_domain.io_handler;
-    int i;
-    unsigned long addr, size;
+    const struct hvm_io_handler *handler;
 
-    if ( type == HVM_PORTIO )
+    if ( p->type == IOREQ_TYPE_PIO )
     {
         int rc = dpci_ioport_intercept(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
             return rc;
     }
-
-    for ( i = 0; i < handler->num_slot; i++ )
+    else if ( p->type == IOREQ_TYPE_COPY )
     {
-        if ( type != handler->hdl_list[i].type )
-            continue;
-        addr = handler->hdl_list[i].addr;
-        size = handler->hdl_list[i].size;
-        if ( (p->addr >= addr) &&
-             ((p->addr + p->size) <= (addr + size)) )
-        {
-            if ( type == HVM_PORTIO )
-                return process_portio_intercept(
-                    handler->hdl_list[i].action.portio, p);
+        int rc = stdvga_intercept_mmio(p);
+        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
+            return rc;
+    }
 
-            if ( unlikely(p->count > 1) &&
-                 (unlikely(p->df)
-                  ? p->addr - (p->count - 1L) * p->size < addr
-                  : p->addr + p->count * 1L * p->size - 1 >= addr + size) )
-                p->count = 1;
+    handler = hvm_find_io_handler(p);
 
-            return handler->hdl_list[i].action.mmio(p);
-        }
+    if ( handler == NULL )
+        return X86EMUL_UNHANDLEABLE;
+
+    return hvm_process_io_intercept(handler, p);
+}
+
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
+{
+    unsigned int i = d->arch.hvm_domain.io_handler_count++;
+
+    if ( i == NR_IO_HANDLERS )
+    {
+        domain_crash(d);
+        return NULL;
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    return &d->arch.hvm_domain.io_handler[i];
 }
 
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned int size,
-    void *action, int type)
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int num = handler->num_slot;
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
-    BUG_ON(num >= MAX_IO_HANDLER);
+    handler->type = IOREQ_TYPE_COPY;
+    handler->mmio.ops = ops;
+}
 
-    handler->hdl_list[num].addr = addr;
-    handler->hdl_list[num].size = size;
-    handler->hdl_list[num].action.ptr = action;
-    handler->hdl_list[num].type = type;
-    handler->num_slot++;
+void register_portio_handler(struct domain *d, uint16_t port,
+                             unsigned int size, portio_action_t action)
+{
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
+
+    handler->type = IOREQ_TYPE_PIO;
+    handler->portio.port = port;
+    handler->portio.size = size;
+    handler->portio.action = action;
 }
 
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned int size, int type)
+void relocate_portio_handler(struct domain *d, uint16_t old_port,
+                             uint16_t new_port, unsigned int size)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int i;
-
-    for ( i = 0; i < handler->num_slot; i++ )
-        if ( (handler->hdl_list[i].addr == old_addr) &&
-             (handler->hdl_list[i].size == size) &&
-             (handler->hdl_list[i].type == type) )
-            handler->hdl_list[i].addr = new_addr;
+    unsigned int i;
+
+    for ( i = 0; i < d->arch.hvm_domain.io_handler_count; i++ )
+    {
+        struct hvm_io_handler *handler =
+            &d->arch.hvm_domain.io_handler[i];
+
+        if ( handler->type != IOREQ_TYPE_PIO )
+            continue;
+
+        if ( (handler->portio.port == old_port) &&
+             (handler->portio.size == size) )
+        {
+            handler->portio.port = new_port;
+            break;
+        }
+    }
+}
+
+bool_t hvm_mmio_internal(paddr_t gpa)
+{
+    ioreq_t p = {
+        .type = IOREQ_TYPE_COPY,
+        .addr = gpa
+    };
+
+    return hvm_find_io_handler(&p) != NULL;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 0e18b76..e6dfdb7 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -547,13 +547,28 @@ static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
     return 1;
 }
 
-static int stdvga_intercept_mmio(ioreq_t *p)
+int stdvga_intercept_mmio(ioreq_t *p)
 {
     struct domain *d = current->domain;
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
+    uint64_t start, end, addr = p->addr, count = p->count, size = p->size;
     int buf = 0, rc;
 
-    if ( p->size > 8 )
+    if ( unlikely(p->df) )
+    {
+        start = (addr - (count - 1) * size);
+        end = addr + size;
+    }
+    else
+    {
+        start = addr;
+        end = addr + count * size;
+    }
+
+    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        return X86EMUL_UNHANDLEABLE;
+
+    if ( size > 8 )
     {
         gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
         return X86EMUL_UNHANDLEABLE;
@@ -619,9 +634,6 @@ void stdvga_init(struct domain *d)
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
-        /* MMIO. */
-        register_buffered_io_handler(
-            d, VGA_MEM_BASE, VGA_MEM_SIZE, stdvga_intercept_mmio);
     }
 }
 
@@ -638,3 +650,13 @@ void stdvga_deinit(struct domain *d)
         s->vram_page[i] = NULL;
     }
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 5c8d890..9de2ff3 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -250,7 +250,7 @@ static int vioapic_range(struct vcpu *v, unsigned long addr)
              (addr < vioapic->base_address + VIOAPIC_MEM_LENGTH)));
 }
 
-const struct hvm_mmio_ops vioapic_mmio_ops = {
+static const struct hvm_mmio_ops vioapic_mmio_ops = {
     .check = vioapic_range,
     .read = vioapic_read,
     .write = vioapic_write
@@ -456,6 +456,8 @@ int vioapic_init(struct domain *d)
     d->arch.hvm_domain.vioapic->domain = d;
     vioapic_reset(d);
 
+    register_mmio_handler(d, &vioapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 0235cb3..5472f79 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -977,7 +977,7 @@ static int vlapic_range(struct vcpu *v, unsigned long addr)
            (offset < PAGE_SIZE);
 }
 
-const struct hvm_mmio_ops vlapic_mmio_ops = {
+static const struct hvm_mmio_ops vlapic_mmio_ops = {
     .check = vlapic_range,
     .read = vlapic_read,
     .write = vlapic_write
@@ -1443,6 +1443,9 @@ int vlapic_init(struct vcpu *v)
                  vlapic_init_sipi_action,
                  (unsigned long)v);
 
+    if ( v->vcpu_id == 0 )
+        register_mmio_handler(v->domain, &vlapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 6eeae0a..66545a4 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -344,7 +344,7 @@ static int msixtbl_range(struct vcpu *v, unsigned long addr)
     return !!desc;
 }
 
-const struct hvm_mmio_ops msixtbl_mmio_ops = {
+static const struct hvm_mmio_ops msixtbl_mmio_ops = {
     .check = msixtbl_range,
     .read = msixtbl_read,
     .write = msixtbl_write
@@ -481,6 +481,14 @@ found:
     spin_unlock_irq(&irq_desc->lock);
 }
 
+void msixtbl_init(struct domain *d)
+{
+    INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
+    spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);
+
+    register_mmio_handler(d, &msixtbl_mmio_ops);
+}
+
 void msixtbl_pt_cleanup(struct domain *d)
 {
     struct msixtbl_entry *entry, *temp;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index a832fdb..5908bf3 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -868,6 +868,20 @@ static void guest_iommu_reg_init(struct guest_iommu *iommu)
     iommu->reg_ext_feature.hi = upper;
 }
 
+static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
+{
+    struct guest_iommu *iommu = vcpu_iommu(v);
+
+    return iommu && addr >= iommu->mmio_base &&
+           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
+}
+
+static const struct hvm_mmio_ops iommu_mmio_ops = {
+    .check = guest_iommu_mmio_range,
+    .read = guest_iommu_mmio_read,
+    .write = guest_iommu_mmio_write
+};
+
 /* Domain specific initialization */
 int guest_iommu_init(struct domain* d)
 {
@@ -894,6 +908,8 @@ int guest_iommu_init(struct domain* d)
 
     spin_lock_init(&iommu->lock);
 
+    register_mmio_handler(d, &iommu_mmio_ops);
+
     return 0;
 }
 
@@ -910,17 +926,3 @@ void guest_iommu_destroy(struct domain *d)
 
     domain_hvm_iommu(d)->arch.g_iommu = NULL;
 }
-
-static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
-{
-    struct guest_iommu *iommu = vcpu_iommu(v);
-
-    return iommu && addr >= iommu->mmio_base &&
-           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
-}
-
-const struct hvm_mmio_ops iommu_mmio_ops = {
-    .check = guest_iommu_mmio_range,
-    .read = guest_iommu_mmio_read,
-    .write = guest_iommu_mmio_write
-};
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index ad68fcf..b612755 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -94,6 +94,7 @@ struct hvm_domain {
     struct pl_time         pl_time;
 
     struct hvm_io_handler *io_handler;
+    unsigned int          io_handler_count;
 
     /* Lock protects access to irq, vpic and vioapic. */
     spinlock_t             irq_lock;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index ef76e69..1a37243 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -25,10 +25,7 @@
 #include <public/hvm/ioreq.h>
 #include <public/event_channel.h>
 
-#define MAX_IO_HANDLER             16
-
-#define HVM_PORTIO                  0
-#define HVM_BUFFERED_IO             2
+#define NR_IO_HANDLERS 32
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
@@ -40,82 +37,67 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
+struct hvm_mmio_ops {
+    hvm_mmio_check_t check;
+    hvm_mmio_read_t  read;
+    hvm_mmio_write_t write;
+};
+
 typedef int (*portio_action_t)(
     int dir, unsigned int port, unsigned int bytes, uint32_t *val);
-typedef int (*mmio_action_t)(ioreq_t *);
-struct io_handler {
-    int                 type;
-    unsigned int        size;
-    unsigned long       addr;
-    union {
-        portio_action_t portio;
-        mmio_action_t   mmio;
-        void           *ptr;
-    } action;
-};
 
 struct hvm_io_handler {
-    int     num_slot;
-    struct  io_handler hdl_list[MAX_IO_HANDLER];
+    union {
+        struct {
+            const struct hvm_mmio_ops *ops;
+        } mmio;
+        struct {
+            unsigned int port, size;
+            portio_action_t action;
+        } portio;
+    };
+    uint8_t type;
 };
 
-struct hvm_mmio_ops {
-    hvm_mmio_check_t check;
-    hvm_mmio_read_t  read;
-    hvm_mmio_write_t write;
+typedef int (*hvm_io_read_t)(const struct hvm_io_handler *,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t *data);
+typedef int (*hvm_io_write_t)(const struct hvm_io_handler *,
+                              uint64_t addr,
+                              uint32_t size,
+                              uint64_t data);
+typedef bool_t (*hvm_io_accept_t)(const struct hvm_io_handler *,
+                                  const ioreq_t *p);
+struct hvm_io_ops {
+    hvm_io_accept_t accept;
+    hvm_io_read_t   read;
+    hvm_io_write_t  write;
 };
 
-extern const struct hvm_mmio_ops hpet_mmio_ops;
-extern const struct hvm_mmio_ops vlapic_mmio_ops;
-extern const struct hvm_mmio_ops vioapic_mmio_ops;
-extern const struct hvm_mmio_ops msixtbl_mmio_ops;
-extern const struct hvm_mmio_ops iommu_mmio_ops;
+int hvm_process_io_intercept(const struct hvm_io_handler *handler,
+                             ioreq_t *p);
 
-#define HVM_MMIO_HANDLER_NR 5
+const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p);
 
-int hvm_io_intercept(ioreq_t *p, int type);
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned int size,
-    void *action, int type);
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned int size, int type);
+int hvm_io_intercept(ioreq_t *p);
 
-static inline int hvm_portio_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_PORTIO);
-}
-
-static inline int hvm_buffered_io_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_BUFFERED_IO);
-}
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d);
 
 bool_t hvm_mmio_internal(paddr_t gpa);
-int hvm_mmio_intercept(ioreq_t *p);
-int hvm_buffered_io_send(ioreq_t *p);
 
-static inline void register_portio_handler(
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops);
+
+void register_portio_handler(
     struct domain *d, uint16_t port, unsigned int size,
-    portio_action_t action)
-{
-    register_io_handler(d, port, size, action, HVM_PORTIO);
-}
+    portio_action_t action);
 
-static inline void relocate_portio_handler(
+void relocate_portio_handler(
     struct domain *d, uint16_t old_port, uint16_t new_port,
-    unsigned int size)
-{
-    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
-}
-
-static inline void register_buffered_io_handler(
-    struct domain *d, unsigned long addr,
-    unsigned int size, mmio_action_t action)
-{
-    register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
-}
+    unsigned int size);
 
+int hvm_buffered_io_send(ioreq_t *p);
 void send_timeoffset_req(unsigned long timeoff);
 void send_invalidate_req(void);
 int handle_mmio(void);
@@ -127,6 +109,7 @@ void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
+void msixtbl_init(struct domain *d);
 
 struct hvm_hw_stdvga {
     uint8_t sr_index;
@@ -141,8 +124,19 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
+int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
 #endif /* __ASM_X86_HVM_IO_H__ */
 
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 05/15] x86/hvm: add length to mmio check op
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (3 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 04/15] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 06/15] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

When memory-mapped I/O is range-checked by internal handlers, the length
of the access should be taken into account.
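
For illustration only (this snippet is not part of the patch): a minimal
user-space sketch of the first/last-byte calculation the patch adds. The
struct below is a stand-in for the few ioreq_t fields involved, not the
real type. Checking both ends matters because an access that starts
inside a handler's range can still spill past its end.

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t paddr_t;

    /* Stand-in for the ioreq_t fields used by the range check. */
    struct ioreq {
        paddr_t  addr;   /* address of the first cycle */
        uint32_t size;   /* bytes per cycle */
        uint32_t count;  /* number of cycles (reps) */
        uint8_t  df;     /* direction flag: 1 => descending */
    };

    static paddr_t first_byte(const struct ioreq *p)
    {
        return p->df ? p->addr - (p->count - 1ul) * p->size : p->addr;
    }

    static paddr_t last_byte(const struct ioreq *p)
    {
        return p->df ? p->addr + p->size - 1
                     : p->addr + (paddr_t)p->count * p->size - 1;
    }

    int main(void)
    {
        /* 4 ascending cycles of 2 bytes starting at 0x1000. */
        struct ioreq p = { .addr = 0x1000, .size = 2, .count = 4, .df = 0 };

        /* Prints first=0x1000 last=0x1007: both must pass the check. */
        printf("first=%#lx last=%#lx\n",
               (unsigned long)first_byte(&p), (unsigned long)last_byte(&p));
        return 0;
    }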

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- No change

v6:
- Added Andrew's reviewed-by

v5:
- Simplified by leaving mmio_check() implementation alone and
  calling it again to check the last byte if the first-byte check passes
---
 xen/arch/x86/hvm/intercept.c |   23 ++++++++++++++++++++---
 xen/include/asm-x86/hvm/io.h |   16 ++++++++++++++++
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index a44baa1..f9535ea 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -35,9 +35,20 @@
 static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
                               const ioreq_t *p)
 {
+    paddr_t first = hvm_mmio_first_byte(p);
+    paddr_t last = hvm_mmio_last_byte(p);
+
     BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-    return handler->mmio.ops->check(current, p->addr);
+    if ( !handler->mmio.ops->check(current, first) )
+        return 0;
+
+    /* Make sure the handler will accept the whole access */
+    if ( p->size > 1 &&
+         !handler->mmio.ops->check(current, last) )
+        domain_crash(current->domain);
+
+    return 1;
 }
 
 static int hvm_mmio_read(const struct hvm_io_handler *handler,
@@ -106,7 +117,8 @@ static const struct hvm_io_ops portio_ops = {
 int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                              ioreq_t *p)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops = (p->type == IOREQ_TYPE_COPY) ?
                                    &mmio_ops : &portio_ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
@@ -215,6 +227,9 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 
     if ( i != 0 )
     {
+        if ( rc == X86EMUL_UNHANDLEABLE )
+            domain_crash(curr->domain);
+
         p->count = i;
         rc = X86EMUL_OKAY;
     }
@@ -331,7 +346,9 @@ bool_t hvm_mmio_internal(paddr_t gpa)
 {
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
-        .addr = gpa
+        .addr = gpa,
+        .count = 1,
+        .size = 1,
     };
 
     return hvm_find_io_handler(&p) != NULL;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 1a37243..a01502a 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -43,6 +43,22 @@ struct hvm_mmio_ops {
     hvm_mmio_write_t write;
 };
 
+static inline paddr_t hvm_mmio_first_byte(const ioreq_t *p)
+{
+    return p->df ?
+           p->addr - (p->count - 1ul) * p->size :
+           p->addr;
+}
+
+static inline paddr_t hvm_mmio_last_byte(const ioreq_t *p)
+{
+    unsigned long count = p->count;
+
+    return p->df ?
+           p->addr + p->size - 1 :
+           p->addr + (count * p->size) - 1;
+}
+
 typedef int (*portio_action_t)(
     int dir, unsigned int port, unsigned int bytes, uint32_t *val);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 06/15] x86/hvm: unify dpci portio intercept with standard portio intercept
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (4 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 05/15] x86/hvm: add length to mmio check op Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

This patch re-works the dpci portio intercepts so that they can be unified
with standard portio handling thereby removing a substantial amount of
code duplication.
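
For illustration only (not part of the patch): the core of the unified
dpci accept/read/write ops is a guest-port to machine-port rebase. A
standalone sketch, where struct g2m_map is a hypothetical stand-in for
Xen's g2m_ioport entry:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical stand-in for one g2m_ioport mapping entry. */
    struct g2m_map {
        unsigned int gport; /* first guest port */
        unsigned int mport; /* first machine port */
        unsigned int np;    /* number of ports in the window */
    };

    /* Accept iff [addr, addr + size) lies inside the guest window. */
    static int accepts(const struct g2m_map *m, unsigned int addr,
                       unsigned int size)
    {
        return addr >= m->gport && addr + size <= m->gport + m->np;
    }

    int main(void)
    {
        struct g2m_map m = { .gport = 0xc000, .mport = 0xe000, .np = 8 };
        unsigned int addr = 0xc004, size = 2;

        if ( accepts(&m, addr, size) )
        {
            /* Same offset within the window, rebased onto machine
             * ports; prints mport=0xe004. */
            printf("mport=%#x\n", (addr - m.gport) + m.mport);
        }
        return 0;
    }

The accept op stashes the matched entry so the subsequent read/write
ops can reuse it; that is what the new g2m_ioport field in struct
hvm_vcpu_io is for.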

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Cosmetic changes requested by Jan

v6:
- Added Andrew's reviewed-by

v5:
- Addressed further comments from Jan
---
 xen/arch/x86/hvm/hvm.c         |    2 +
 xen/arch/x86/hvm/intercept.c   |   16 +--
 xen/arch/x86/hvm/io.c          |  220 ++++++++++++----------------------------
 xen/include/asm-x86/hvm/io.h   |    4 +
 xen/include/asm-x86/hvm/vcpu.h |    2 +
 xen/include/xen/iommu.h        |    1 -
 6 files changed, 78 insertions(+), 167 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 1fd5efc..e0fca45 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1482,6 +1482,8 @@ int hvm_domain_initialise(struct domain *d)
     else
         d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
 
+    register_dpci_portio_handler(d);
+
     if ( is_pvh_domain(d) )
     {
         register_portio_handler(d, 0, 0x10003, handle_pvh_io);
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index f9535ea..a1c4f29 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -119,8 +119,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    const struct hvm_io_ops *ops = (p->type == IOREQ_TYPE_COPY) ?
-                                   &mmio_ops : &portio_ops;
+    const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
     uint64_t addr;
@@ -240,8 +239,6 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
 const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 {
     struct domain *curr_d = current->domain;
-    const struct hvm_io_ops *ops = (p->type == IOREQ_TYPE_COPY) ?
-                                   &mmio_ops : &portio_ops;
     unsigned int i;
 
     BUG_ON((p->type != IOREQ_TYPE_PIO) &&
@@ -251,6 +248,7 @@ const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
     {
         const struct hvm_io_handler *handler =
             &curr_d->arch.hvm_domain.io_handler[i];
+        const struct hvm_io_ops *ops = handler->ops;
 
         if ( handler->type != p->type )
             continue;
@@ -266,13 +264,7 @@ int hvm_io_intercept(ioreq_t *p)
 {
     const struct hvm_io_handler *handler;
 
-    if ( p->type == IOREQ_TYPE_PIO )
-    {
-        int rc = dpci_ioport_intercept(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
-    else if ( p->type == IOREQ_TYPE_COPY )
+    if ( p->type == IOREQ_TYPE_COPY )
     {
         int rc = stdvga_intercept_mmio(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
@@ -306,6 +298,7 @@ void register_mmio_handler(struct domain *d,
     struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
     handler->type = IOREQ_TYPE_COPY;
+    handler->ops = &mmio_ops;
     handler->mmio.ops = ops;
 }
 
@@ -315,6 +308,7 @@ void register_portio_handler(struct domain *d, uint16_t port,
     struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
     handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &portio_ops;
     handler->portio.port = port;
     handler->portio.size = size;
     handler->portio.action = action;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index c0964ec..2c88ddb 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -208,185 +208,95 @@ void hvm_io_assist(ioreq_t *p)
     }
 }
 
-static int dpci_ioport_read(uint32_t mport, ioreq_t *p)
+static bool_t dpci_portio_accept(const struct hvm_io_handler *handler,
+                                 const ioreq_t *p)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data = 0;
+    struct vcpu *curr = current;
+    struct hvm_iommu *hd = domain_hvm_iommu(curr->domain);
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    struct g2m_ioport *g2m_ioport;
+    unsigned int start, end;
 
-    for ( i = 0; i < p->count; i++ )
+    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
     {
-        if ( vio->mmio_retrying )
-        {
-            if ( vio->mmio_large_read_bytes != p->size )
-                return X86EMUL_UNHANDLEABLE;
-            memcpy(&data, vio->mmio_large_read, p->size);
-            vio->mmio_large_read_bytes = 0;
-            vio->mmio_retrying = 0;
-        }
-        else switch ( p->size )
+        start = g2m_ioport->gport;
+        end = start + g2m_ioport->np;
+        if ( (p->addr >= start) && (p->addr + p->size <= end) )
         {
-        case 1:
-            data = inb(mport);
-            break;
-        case 2:
-            data = inw(mport);
-            break;
-        case 4:
-            data = inl(mport);
-            break;
-        default:
-            BUG();
+            vio->g2m_ioport = g2m_ioport;
+            return 1;
         }
-
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-        else
-            p->data = data;
     }
 
-    if ( rc == X86EMUL_RETRY )
-    {
-        vio->mmio_retry = 1;
-        vio->mmio_large_read_bytes = p->size;
-        memcpy(vio->mmio_large_read, &data, p->size);
-    }
-
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
-
-    return rc;
+    return 0;
 }
 
-static int dpci_ioport_write(uint32_t mport, ioreq_t *p)
+static int dpci_portio_read(const struct hvm_io_handler *handler,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t *data)
 {
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    for ( i = 0; i < p->count; i++ )
-    {
-        data = p->data;
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-
-        switch ( p->size )
-        {
-        case 1:
-            outb(data, mport);
-            break;
-        case 2:
-            outw(data, mport);
-            break;
-        case 4:
-            outl(data, mport);
-            break;
-        default:
-            BUG();
-        }
-    }
-
-    if ( rc == X86EMUL_RETRY )
-        current->arch.hvm_vcpu.hvm_io.mmio_retry = 1;
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    if ( i != 0 )
+    switch ( size )
     {
-        p->count = i;
-        rc = X86EMUL_OKAY;
+    case 1:
+        *data = inb(mport);
+        break;
+    case 2:
+        *data = inw(mport);
+        break;
+    case 4:
+        *data = inl(mport);
+        break;
+    default:
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
 }
 
-int dpci_ioport_intercept(ioreq_t *p)
+static int dpci_portio_write(const struct hvm_io_handler *handler,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t data)
 {
-    struct domain *d = current->domain;
-    struct hvm_iommu *hd = domain_hvm_iommu(d);
-    struct g2m_ioport *g2m_ioport;
-    unsigned int mport, gport = p->addr;
-    unsigned int s = 0, e = 0;
-    int rc;
-
-    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
-    {
-        s = g2m_ioport->gport;
-        e = s + g2m_ioport->np;
-        if ( (gport >= s) && (gport < e) )
-            goto found;
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-
- found:
-    mport = (gport - s) + g2m_ioport->mport;
-
-    if ( !ioports_access_permitted(d, mport, mport + p->size - 1) ) 
-    {
-        gdprintk(XENLOG_ERR, "Error: access to gport=%#x denied!\n",
-                 (uint32_t)p->addr);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    switch ( p->dir )
+    switch ( size )
     {
-    case IOREQ_READ:
-        rc = dpci_ioport_read(mport, p);
+    case 1:
+        outb(data, mport);
         break;
-    case IOREQ_WRITE:
-        rc = dpci_ioport_write(mport, p);
+    case 2:
+        outw(data, mport);
+        break;
+    case 4:
+        outl(data, mport);
         break;
     default:
-        gdprintk(XENLOG_ERR, "Error: couldn't handle p->dir = %d", p->dir);
-        rc = X86EMUL_UNHANDLEABLE;
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops dpci_portio_ops = {
+    .accept = dpci_portio_accept,
+    .read = dpci_portio_read,
+    .write = dpci_portio_write
+};
+
+void register_dpci_portio_handler(struct domain *d)
+{
+    struct hvm_io_handler *handler = hvm_next_io_handler(d);
+
+    handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &dpci_portio_ops;
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index a01502a..1288005 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -72,6 +72,7 @@ struct hvm_io_handler {
             portio_action_t action;
         } portio;
     };
+    const struct hvm_io_ops *ops;
     uint8_t type;
 };
 
@@ -144,6 +145,9 @@ int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
+
+void register_dpci_portio_handler(struct domain *d);
+
 #endif /* __ASM_X86_HVM_IO_H__ */
 
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 0faf60d..4ed285f 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -77,6 +77,8 @@ struct hvm_vcpu_io {
     bool_t mmio_retry, mmio_retrying;
 
     unsigned long msix_unmask_address;
+
+    const struct g2m_ioport *g2m_ioport;
 };
 
 #define VMCX_EADDR    (~0ULL)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index b30bf41..1d00696 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -93,7 +93,6 @@ void pt_pci_init(void);
 
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
-int dpci_ioport_intercept(ioreq_t *p);
 int pt_irq_create_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 int pt_irq_destroy_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (5 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 06/15] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 15:33   ` Jan Beulich
  2015-07-09 13:10 ` [PATCH v7 08/15] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

It's clear from the following check in hvmemul_rep_movs:

    if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
         (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
        return X86EMUL_UNHANDLEABLE;

that mmio <-> mmio copy is not handled. This means the code in the
stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
hvm_copy_to/from_guest_phys() fails is never going to be executed.

This patch therefore adds a check in hvmemul_acquire_page() (used by
hvmemul_do_io_addr()) to make sure mmio <-> mmio is disallowed and then
registers standard mmio intercept ops in stdvga_init().

With this patch, all mmio and portio handled within Xen now goes through
hvm_process_io_intercept().
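
Also worth spelling out is the new accept/complete contract that makes
this unification safe: on success, stdvga_mem_accept() deliberately
returns with s->lock still held, and stdvga_mem_complete() is the
matching unlock (which is why hvm_io_intercept() and hvm_mmio_internal()
now call ops->complete when present). A toy user-space model of that
contract, assuming pthreads; none of these names exist in Xen:

    #include <pthread.h>
    #include <stdio.h>

    /* accept_io() returns with the lock held on success; complete_io()
     * is the matching unlock, called once the I/O has been processed. */
    static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
    static int in_stdvga_mode = 1;

    static int accept_io(void)
    {
        pthread_mutex_lock(&state_lock);
        if ( !in_stdvga_mode )
        {
            pthread_mutex_unlock(&state_lock); /* reject: drop the lock */
            return 0;
        }
        return 1; /* accept: lock intentionally still held */
    }

    static void complete_io(void)
    {
        pthread_mutex_unlock(&state_lock);
    }

    int main(void)
    {
        if ( accept_io() )
        {
            puts("handling I/O with device state locked");
            complete_io();
        }
        return 0;
    }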

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Use hvm_mmio_{first,last}_byte in stdvga_mem_accept for correctness
- Add comments requested by Jan

v6:
- Added Andrew's reviewed-by

v5:
- Fixed semantic problems pointed out by Jan
---
 xen/arch/x86/hvm/emulate.c   |    9 ++
 xen/arch/x86/hvm/intercept.c |   30 ++++--
 xen/arch/x86/hvm/stdvga.c    |  211 +++++++++++++++---------------------------
 xen/include/asm-x86/hvm/io.h |   10 +-
 4 files changed, 112 insertions(+), 148 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 20bf372..1afd626 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -266,6 +266,15 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
         return X86EMUL_RETRY;
     }
 
+    /* This code should not be reached if the gmfn is not RAM */
+    if ( p2m_is_mmio(p2mt) )
+    {
+        domain_crash(curr_d);
+
+        put_page(*page);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index a1c4f29..bb45293 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -263,20 +263,21 @@ const struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 int hvm_io_intercept(ioreq_t *p)
 {
     const struct hvm_io_handler *handler;
-
-    if ( p->type == IOREQ_TYPE_COPY )
-    {
-        int rc = stdvga_intercept_mmio(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
+    const struct hvm_io_ops *ops;
+    int rc;
 
     handler = hvm_find_io_handler(p);
 
     if ( handler == NULL )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvm_process_io_intercept(handler, p);
+    rc = hvm_process_io_intercept(handler, p);
+
+    ops = handler->ops;
+    if ( ops->complete != NULL )
+        ops->complete(handler);
+
+    return rc;
 }
 
 struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
@@ -338,6 +339,8 @@ void relocate_portio_handler(struct domain *d, uint16_t old_port,
 
 bool_t hvm_mmio_internal(paddr_t gpa)
 {
+    const struct hvm_io_handler *handler;
+    const struct hvm_io_ops *ops;
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
         .addr = gpa,
@@ -345,7 +348,16 @@ bool_t hvm_mmio_internal(paddr_t gpa)
         .size = 1,
     };
 
-    return hvm_find_io_handler(&p) != NULL;
+    handler = hvm_find_io_handler(&p);
+
+    if ( handler == NULL )
+        return 0;
+
+    ops = handler->ops;
+    if ( ops->complete != NULL )
+        ops->complete(handler);
+
+    return 1;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index e6dfdb7..30a46f5 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -148,7 +148,7 @@ static int stdvga_outb(uint64_t addr, uint8_t val)
     }
     else if ( prev_stdvga && !s->stdvga )
     {
-        gdprintk(XENLOG_INFO, "leaving stdvga\n");
+        gdprintk(XENLOG_INFO, "leaving stdvga mode\n");
     }
 
     return rc;
@@ -275,9 +275,10 @@ static uint8_t stdvga_mem_readb(uint64_t addr)
     return ret;
 }
 
-static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
+static int stdvga_mem_read(const struct hvm_io_handler *handler,
+                           uint64_t addr, uint32_t size, uint64_t *p_data)
 {
-    uint64_t data = 0;
+    uint64_t data = ~0ul;
 
     switch ( size )
     {
@@ -309,11 +310,12 @@ static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size);
         break;
     }
 
-    return data;
+    *p_data = data;
+    return X86EMUL_OKAY;
 }
 
 static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
@@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
     }
 }
 
-static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
+static int stdvga_mem_write(const struct hvm_io_handler *handler,
+                            uint64_t addr, uint32_t size,
+                            uint64_t data)
 {
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
+    ioreq_t p = { .type = IOREQ_TYPE_COPY,
+                  .addr = addr,
+                  .size = size,
+                  .count = 1,
+                  .dir = IOREQ_WRITE,
+                  .data = data,
+                };
+
+    if ( !s->cache )
+        goto done;
+
     /* Intercept mmio write */
     switch ( size )
     {
@@ -457,156 +473,74 @@ static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size);
         break;
     }
-}
-
-static uint32_t read_data;
-
-static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
-{
-    int i;
-    uint64_t addr = p->addr;
-    p2m_type_t p2mt;
-    struct domain *d = current->domain;
-    int step = p->df ? -p->size : p->size;
-
-    if ( p->data_is_ptr )
-    {
-        uint64_t data = p->data, tmp;
 
-        if ( p->dir == IOREQ_READ )
-        {
-            for ( i = 0; i < p->count; i++ ) 
-            {
-                tmp = stdvga_mem_read(addr, p->size);
-                if ( hvm_copy_to_guest_phys(data, &tmp, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                            d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    /*
-                     * The only case we handle is vga_mem <-> vga_mem.
-                     * Anything else disables caching and leaves it to qemu-dm.
-                     */
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    stdvga_mem_write(data, tmp, p->size);
-                }
-                data += step;
-                addr += step;
-            }
-        }
-        else
-        {
-            for ( i = 0; i < p->count; i++ )
-            {
-                if ( hvm_copy_from_guest_phys(&tmp, data, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                        d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    tmp = stdvga_mem_read(data, p->size);
-                }
-                stdvga_mem_write(addr, tmp, p->size);
-                data += step;
-                addr += step;
-            }
-        }
-    }
-    else if ( p->dir == IOREQ_WRITE )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            stdvga_mem_write(addr, p->data, p->size);
-            addr += step;
-        }
-    }
-    else
-    {
-        ASSERT(p->count == 1);
-        p->data = stdvga_mem_read(addr, p->size);
-    }
+ done:
+    if ( hvm_buffered_io_send(&p) )
+        return X86EMUL_OKAY;
 
-    read_data = p->data;
-    return 1;
+    return X86EMUL_UNHANDLEABLE;
 }
 
-int stdvga_intercept_mmio(ioreq_t *p)
+static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
+                                const ioreq_t *p)
 {
-    struct domain *d = current->domain;
-    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
-    uint64_t start, end, addr = p->addr, count = p->count, size = p->size;
-    int buf = 0, rc;
-
-    if ( unlikely(p->df) )
-    {
-        start = (addr - (count - 1) * size);
-        end = addr + size;
-    }
-    else
-    {
-        start = addr;
-        end = addr + count * size;
-    }
-
-    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( size > 8 )
-    {
-        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
 
     spin_lock(&s->lock);
 
-    if ( s->stdvga && s->cache )
+    if ( !s->stdvga ||
+         (hvm_mmio_first_byte(p) < VGA_MEM_BASE) ||
+         (hvm_mmio_last_byte(p) >= (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        goto reject;
+
+    if ( p->dir == IOREQ_WRITE && p->count > 1 )
     {
-        switch ( p->type )
+        /*
+         * We cannot return X86EMUL_UNHANDLEABLE on anything other than the
+         * first cycle of an I/O. So, since we cannot guarantee to always be
+         * able to send buffered writes, we have to reject any multi-cycle
+         * I/O and, since we are rejecting an I/O, we must invalidate the
+         * cache.
+         * Single-cycle write transactions are accepted even if the cache is
+         * not active since we can assert, when in stdvga mode, that writes
+         * to VRAM have no side effect and thus we can try to buffer them.
+         */
+        if ( s->cache )
         {
-        case IOREQ_TYPE_COPY:
-            buf = mmio_move(s, p);
-            if ( !buf )
-                s->cache = 0;
-            break;
-        default:
-            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
-                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
-                     "isptr:%d dir:%d df:%d\n",
-                     p->type, (int)p->addr, (int)p->data, (int)p->size,
-                     (int)p->count, p->state,
-                     p->data_is_ptr, p->dir, p->df);
+            gdprintk(XENLOG_INFO, "leaving caching mode\n");
             s->cache = 0;
         }
+
+        goto reject;
     }
-    else
-    {
-        buf = (p->dir == IOREQ_WRITE);
-    }
+    else if ( p->dir == IOREQ_READ && !s->cache )
+        goto reject;
 
-    rc = (buf && hvm_buffered_io_send(p));
+    /* s->lock intentionally held */
+    return 1;
 
+ reject:
     spin_unlock(&s->lock);
+    return 0;
+}
 
-    return rc ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
+static void stdvga_mem_complete(const struct hvm_io_handler *handler)
+{
+    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
+
+    spin_unlock(&s->lock);
 }
 
+static const struct hvm_io_ops stdvga_mem_ops = {
+    .accept = stdvga_mem_accept,
+    .read = stdvga_mem_read,
+    .write = stdvga_mem_write,
+    .complete = stdvga_mem_complete
+};
+
 void stdvga_init(struct domain *d)
 {
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
@@ -630,10 +564,17 @@ void stdvga_init(struct domain *d)
 
     if ( i == ARRAY_SIZE(s->vram_page) )
     {
+        struct hvm_io_handler *handler;
+
         /* Sequencer registers. */
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
+
+        /* VGA memory */
+        handler = hvm_next_io_handler(d);
+        handler->type = IOREQ_TYPE_COPY;
+        handler->ops = &stdvga_mem_ops;
     }
 }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 1288005..d9e2447 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -86,10 +86,13 @@ typedef int (*hvm_io_write_t)(const struct hvm_io_handler *,
                               uint64_t data);
 typedef bool_t (*hvm_io_accept_t)(const struct hvm_io_handler *,
                                   const ioreq_t *p);
+typedef void (*hvm_io_complete_t)(const struct hvm_io_handler *);
+
 struct hvm_io_ops {
-    hvm_io_accept_t accept;
-    hvm_io_read_t   read;
-    hvm_io_write_t  write;
+    hvm_io_accept_t   accept;
+    hvm_io_read_t     read;
+    hvm_io_write_t    write;
+    hvm_io_complete_t complete;
 };
 
 int hvm_process_io_intercept(const struct hvm_io_handler *handler,
@@ -141,7 +144,6 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
-int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 08/15] x86/hvm: limit reps to avoid the need to handle retry
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (6 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 09/15] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

By limiting hvmemul_do_io_addr() to reps falling within the page on which
a reference has already been taken, we can guarantee that calls to
hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic (added
by c/s 82ed8716b "fix direct PCI port I/O emulation retry and error
handling") from the intercept code and simplify it significantly.

Normally hvmemul_do_io_addr() will only reference a single page at a time.
It will, however, take an extra page reference for I/O spanning a page
boundary.

It is still important to know, upon returning from x86_emulate(), whether
the number of reps was reduced, so the mmio_retry flag is retained for that
purpose.
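
For illustration only (not part of the patch): a standalone sketch of
the rep-clamping calculation, using the same expressions as the diff
below; the examples exercise the ascending (df == 0) case.

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096ul
    #define PAGE_MASK (~(PAGE_SIZE - 1))

    /* How many cycles of 'size' bytes stay within the page containing
     * 'gpa', clamped to the requested rep count. */
    static unsigned long reps_in_page(uint64_t gpa, unsigned int size,
                                      unsigned long reps, int df)
    {
        unsigned long off = gpa & (PAGE_SIZE - 1);
        unsigned long fit = df ? ((off + size - 1) & ~PAGE_MASK) / size
                               : (PAGE_SIZE - off) / size;

        return fit < reps ? fit : reps;
    }

    int main(void)
    {
        /* 8 ascending 4-byte cycles starting 12 bytes before page end:
         * only 3 fit, so the emulator retries for the remaining 5. */
        printf("%lu\n", reps_in_page(0xff4, 4, 8, 0)); /* prints 3 */

        /* A 4-byte access at the last byte of a page fits 0 times: the
         * caller then also references the next page and does 1 rep. */
        printf("%lu\n", reps_in_page(0xfff, 4, 1, 0)); /* prints 0 */
        return 0;
    }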

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Fixed flaw in calculation pointed out by Jan
- Make nr_pages unsigned as requested by Jan
- Added Andrew's reviewed-by

v6:
- Added comment requested by Andrew

v5:
- Addressed further comments from Jan
---
 xen/arch/x86/hvm/emulate.c     |   76 ++++++++++++++++++++++++++--------------
 xen/arch/x86/hvm/hvm.c         |    4 +++
 xen/arch/x86/hvm/intercept.c   |   57 +++++++-----------------------
 xen/include/asm-x86/hvm/vcpu.h |    2 +-
 4 files changed, 66 insertions(+), 73 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 1afd626..53ab3d3 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -51,7 +51,7 @@ static void hvmtrace_io_assist(const ioreq_t *p)
 }
 
 static int hvmemul_do_io(
-    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+    bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
@@ -60,6 +60,7 @@ static int hvmemul_do_io(
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
+        .count = reps,
         .dir = dir,
         .df = df,
         .data = data,
@@ -125,15 +126,6 @@ static int hvmemul_do_io(
         HVMIO_dispatched : HVMIO_awaiting_completion;
     vio->io_size = size;
 
-    /*
-     * When retrying a repeated string instruction, force exit to guest after
-     * completion of the retried iteration to allow handling of interrupts.
-     */
-    if ( vio->mmio_retrying )
-        *reps = 1;
-
-    p.count = *reps;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -147,17 +139,9 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-    case X86EMUL_RETRY:
-        *reps = p.count;
         p.state = STATE_IORESP_READY;
-        if ( !vio->mmio_retry )
-        {
-            hvm_io_assist(&p);
-            vio->io_state = HVMIO_none;
-        }
-        else
-            /* Defer hvm_io_assist() invocation to hvm_do_resume(). */
-            vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+        hvm_io_assist(&p);
+        vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -235,7 +219,7 @@ static int hvmemul_do_io_buffer(
 
     BUG_ON(buffer == NULL);
 
-    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
+    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
                        (uintptr_t)buffer);
     if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
         memset(buffer, 0xff, size);
@@ -287,17 +271,56 @@ static int hvmemul_do_io_addr(
     bool_t is_mmio, paddr_t addr, unsigned long *reps,
     unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
 {
-    struct page_info *ram_page;
+    struct vcpu *v = current;
+    unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
+    unsigned int page_off = ram_gpa & (PAGE_SIZE - 1);
+    struct page_info *ram_page[2];
+    unsigned int nr_pages = 0;
+    unsigned long count;
     int rc;
 
-    rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
+    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
     if ( rc != X86EMUL_OKAY )
-        return rc;
+        goto out;
 
-    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
+    nr_pages++;
+
+    /* Determine how many reps will fit within this page */
+    count = min_t(unsigned long,
+                  *reps,
+                  df ?
+                  ((page_off + size - 1) & ~PAGE_MASK) / size :
+                  (PAGE_SIZE - page_off) / size);
+
+    if ( count == 0 )
+    {
+        /*
+         * This access must span two pages, so grab a reference to
+         * the next page and do a single rep.
+         * It is safe to assume multiple pages are physically
+         * contiguous at this point as hvmemul_linear_to_phys() will
+         * ensure this is the case.
+         */
+        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
+                                  &ram_page[nr_pages]);
+        if ( rc != X86EMUL_OKAY )
+            goto out;
+
+        nr_pages++;
+        count = 1;
+    }
+
+    rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
                        ram_gpa);
+    if ( rc == X86EMUL_OKAY )
+    {
+        v->arch.hvm_vcpu.hvm_io.mmio_retry = (count < *reps);
+        *reps = count;
+    }
 
-    hvmemul_release_page(ram_page);
+ out:
+    while ( nr_pages )
+        hvmemul_release_page(ram_page[--nr_pages]);
 
     return rc;
 }
@@ -1547,7 +1570,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     }
 
     hvmemul_ctxt->exn_pending = 0;
-    vio->mmio_retrying = vio->mmio_retry;
     vio->mmio_retry = 0;
 
     if ( cpu_has_vmx )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e0fca45..9bdc1d6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -440,6 +440,7 @@ static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 
 void hvm_do_resume(struct vcpu *v)
 {
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
 
@@ -468,6 +469,9 @@ void hvm_do_resume(struct vcpu *v)
         }
     }
 
+    if ( vio->mmio_retry )
+        handle_mmio();
+
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) 
     {
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index bb45293..1a21541 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -117,8 +117,6 @@ static const struct hvm_io_ops portio_ops = {
 int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                              ioreq_t *p)
 {
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
@@ -128,23 +126,12 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
     {
         for ( i = 0; i < p->count; i++ )
         {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                addr = (p->type == IOREQ_TYPE_COPY) ?
-                       p->addr + step * i :
-                       p->addr;
-                rc = ops->read(handler, addr, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                   p->addr + step * i :
+                   p->addr;
+            rc = ops->read(handler, addr, p->size, &data);
+            if ( rc != X86EMUL_OKAY )
+                break;
 
             if ( p->data_is_ptr )
             {
@@ -153,14 +140,12 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     /* Drop the write as real hardware would. */
                     continue;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
@@ -173,13 +158,6 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
             else
                 p->data = data;
         }
-
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
     }
     else /* p->dir == IOREQ_WRITE */
     {
@@ -192,14 +170,12 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     data = ~0;
                     break;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
@@ -219,19 +195,10 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
             if ( rc != X86EMUL_OKAY )
                 break;
         }
-
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
     }
 
-    if ( i != 0 )
-    {
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            domain_crash(curr->domain);
-
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
+        domain_crash(current->domain);
 
     return rc;
 }
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 4ed285f..bf6c7ab 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -74,7 +74,7 @@ struct hvm_vcpu_io {
      * For string instruction emulation we need to be able to signal a
      * necessary retry through other than function return codes.
      */
-    bool_t mmio_retry, mmio_retrying;
+    bool_t mmio_retry;
 
     unsigned long msix_unmask_address;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 09/15] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (7 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 08/15] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 10/15] x86/hvm: split I/O completion handling from state model Paul Durrant
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser

By removing the call in hvmemul_do_io() (replaced by a single
assignment) and the one in hvm_complete_assist_req() (replaced by a
call to hvm_process_io_intercept() with a suitable set of ops),
hvm_io_assist() can be moved into hvm.c and made static (and hence
becomes a candidate for inlining).

The calls to msix_write_completion() and vcpu_end_shutdown_deferral()
are also made unconditional because the ioreq state will always be
STATE_IOREQ_NONE at the end of hvm_io_assist(), so the test was
pointless. These calls are also only relevant when the emulation has
been handled externally, which is now always the case.
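
For illustration only (not part of the patch): a standalone model of
the null handler's semantics. When no device model backs an access,
reads return all-ones (as a floating bus would) and writes are silently
dropped; as in the patch, the full 64-bit value is set and the caller
consumes only 'size' bytes of it.

    #include <stdint.h>
    #include <stdio.h>

    static int null_read(uint64_t addr, uint32_t size, uint64_t *data)
    {
        *data = ~0ul; /* reads see all-ones */
        return 0;
    }

    static int null_write(uint64_t addr, uint32_t size, uint64_t data)
    {
        return 0; /* writes are silently discarded */
    }

    int main(void)
    {
        uint64_t data;

        null_read(0xfed00000, 4, &data);
        printf("read -> %#lx\n", (unsigned long)data); /* all ones */
        null_write(0xfed00000, 4, 0x1234);             /* discarded */
        return 0;
    }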

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- No change

v6:
- Added Andrew's reviewed-by

v5:
- Added Jan's acked-by
---
 xen/arch/x86/hvm/emulate.c    |   34 ++++++++++++++++++---
 xen/arch/x86/hvm/hvm.c        |   67 ++++++++++++++++++++++-------------------
 xen/arch/x86/hvm/io.c         |   39 ------------------------
 xen/include/asm-x86/hvm/hvm.h |    1 -
 xen/include/asm-x86/hvm/io.h  |    1 -
 5 files changed, 66 insertions(+), 76 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 53ab3d3..54b9430 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -50,6 +50,32 @@ static void hvmtrace_io_assist(const ioreq_t *p)
     trace_var(event, 0/*!cycles*/, size, buffer);
 }
 
+static int null_read(const struct hvm_io_handler *io_handler,
+                     uint64_t addr,
+                     uint32_t size,
+                     uint64_t *data)
+{
+    *data = ~0ul;
+    return X86EMUL_OKAY;
+}
+
+static int null_write(const struct hvm_io_handler *handler,
+                      uint64_t addr,
+                      uint32_t size,
+                      uint64_t data)
+{
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops null_ops = {
+    .read = null_read,
+    .write = null_write
+};
+
+static const struct hvm_io_handler null_handler = {
+    .ops = &null_ops
+};
+
 static int hvmemul_do_io(
     bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
@@ -139,8 +165,7 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        p.state = STATE_IORESP_READY;
-        hvm_io_assist(&p);
+        vio->io_data = p.data;
         vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
@@ -151,8 +176,9 @@ static int hvmemul_do_io(
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            hvm_complete_assist_req(&p);
-            rc = X86EMUL_OKAY;
+            rc = hvm_process_io_intercept(&null_handler, &p);
+            if ( rc == X86EMUL_OKAY )
+                vio->io_data = p.data;
             vio->io_state = HVMIO_none;
         }
         else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9bdc1d6..7358acf 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -411,6 +411,42 @@ bool_t hvm_io_pending(struct vcpu *v)
     return 0;
 }
 
+static void hvm_io_assist(ioreq_t *p)
+{
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    enum hvm_io_state io_state;
+
+    p->state = STATE_IOREQ_NONE;
+
+    io_state = vio->io_state;
+    vio->io_state = HVMIO_none;
+
+    switch ( io_state )
+    {
+    case HVMIO_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        break;
+    case HVMIO_handle_mmio_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        (void)handle_mmio();
+        break;
+    case HVMIO_handle_pio_awaiting_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)p->data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
+        break;
+    default:
+        break;
+    }
+
+    msix_write_completion(curr);
+    vcpu_end_shutdown_deferral(curr);
+}
+
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
     /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
@@ -2663,37 +2699,6 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
     return X86EMUL_UNHANDLEABLE;
 }
 
-void hvm_complete_assist_req(ioreq_t *p)
-{
-    switch ( p->type )
-    {
-    case IOREQ_TYPE_PCI_CONFIG:
-        ASSERT_UNREACHABLE();
-        break;
-    case IOREQ_TYPE_COPY:
-    case IOREQ_TYPE_PIO:
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( !p->data_is_ptr )
-                p->data = ~0ul;
-            else
-            {
-                int i, step = p->df ? -p->size : p->size;
-                uint32_t data = ~0;
-
-                for ( i = 0; i < p->count; i++ )
-                    hvm_copy_to_guest_phys(p->data + step * i, &data,
-                                           p->size);
-            }
-        }
-        /* FALLTHRU */
-    default:
-        p->state = STATE_IORESP_READY;
-        hvm_io_assist(p);
-        break;
-    }
-}
-
 void hvm_broadcast_assist_req(ioreq_t *p)
 {
     struct domain *d = current->domain;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 2c88ddb..fe099d8 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -169,45 +169,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
     return 1;
 }
 
-void hvm_io_assist(ioreq_t *p)
-{
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
-
-    p->state = STATE_IOREQ_NONE;
-
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
-    {
-    case HVMIO_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
-    default:
-        break;
-    }
-
-    if ( p->state == STATE_IOREQ_NONE )
-    {
-        msix_write_completion(curr);
-        vcpu_end_shutdown_deferral(curr);
-    }
-}
-
 static bool_t dpci_portio_accept(const struct hvm_io_handler *handler,
                                  const ioreq_t *p)
 {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 1d1fd35..efb8e7d 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -227,7 +227,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
                                                  ioreq_t *p);
 int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *p);
 void hvm_broadcast_assist_req(ioreq_t *p);
-void hvm_complete_assist_req(ioreq_t *p);
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat);
 int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index d9e2447..508ec52 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -125,7 +125,6 @@ int handle_mmio_with_translation(unsigned long gva, unsigned long gpfn,
                                  struct npfec);
 int handle_pio(uint16_t port, unsigned int size, int dir);
 void hvm_interrupt_post(struct vcpu *v, int vector, int type);
-void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 10/15] x86/hvm: split I/O completion handling from state model
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (8 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 09/15] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 11/15] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The state of in-flight I/O and how its completion will be handled are
logically separate, and conflating the two makes the code unnecessarily
confusing.
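
In outline, the resulting completion dispatch in hvm_do_resume() looks like
this (a condensed sketch of the hunk below, not a verbatim excerpt):

    io_completion = vio->io_completion;
    vio->io_completion = HVMIO_no_completion;

    switch ( io_completion )
    {
    case HVMIO_mmio_completion:
        handle_mmio();              /* re-enter MMIO emulation */
        break;
    case HVMIO_pio_completion:
        /* copy vio->io_data into rax, zero-extending 4-byte results */
        break;
    case HVMIO_realmode_completion:
        /* re-run one instruction of real-mode emulation */
        break;
    default:
        break;
    }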

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Modified struct field placement as requested by Jan

v6:
- Added Andrew's reviewed-by

v5:
- Confirmed call to msix_write_completion() is in the correct place.
---
 xen/arch/x86/hvm/hvm.c            |   44 +++++++++++++++++++++++++++----------
 xen/arch/x86/hvm/io.c             |    6 ++---
 xen/arch/x86/hvm/vmx/realmode.c   |   27 +++++++++++++++++------
 xen/include/asm-x86/hvm/vcpu.h    |   16 +++++++++-----
 xen/include/asm-x86/hvm/vmx/vmx.h |    1 +
 5 files changed, 67 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7358acf..093a710 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
+#include <asm/hvm/vmx/vmx.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -428,17 +429,6 @@ static void hvm_io_assist(ioreq_t *p)
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
         break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
     default:
         break;
     }
@@ -479,6 +469,7 @@ void hvm_do_resume(struct vcpu *v)
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
+    enum hvm_io_completion io_completion;
 
     check_wakeup_from_wait();
 
@@ -505,8 +496,37 @@ void hvm_do_resume(struct vcpu *v)
         }
     }
 
-    if ( vio->mmio_retry )
+    io_completion = vio->io_completion;
+    vio->io_completion = HVMIO_no_completion;
+
+    switch ( io_completion )
+    {
+    case HVMIO_no_completion:
+        break;
+    case HVMIO_mmio_completion:
         handle_mmio();
+        break;
+    case HVMIO_pio_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
+        vio->io_state = HVMIO_none;
+        break;
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
 
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index fe099d8..221d05e 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -92,8 +92,8 @@ int handle_mmio(void)
 
     if ( rc != X86EMUL_RETRY )
         vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion )
-        vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
 
@@ -158,7 +158,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
             return 0;
         /* Completion in hvm_io_assist() with no re-emulation required. */
         ASSERT(dir == IOREQ_READ);
-        vio->io_state = HVMIO_handle_pio_awaiting_completion;
+        vio->io_completion = HVMIO_pio_completion;
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index fe8b4a0..76ff9a5 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -101,15 +101,19 @@ static void realmode_deliver_exception(
     }
 }
 
-static void realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     int rc;
 
     perfc_incr(realmode_emulations);
 
     rc = hvm_emulate_one(hvmemul_ctxt);
 
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_realmode_completion;
+
     if ( rc == X86EMUL_UNHANDLEABLE )
     {
         gdprintk(XENLOG_ERR, "Failed to emulate insn.\n");
@@ -177,9 +181,6 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     hvm_emulate_prepare(&hvmemul_ctxt, regs);
 
-    if ( vio->io_state == HVMIO_completed )
-        realmode_emulate_one(&hvmemul_ctxt);
-
     /* Only deliver interrupts into emulated real mode. */
     if ( !(curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) &&
          (intr_info & INTR_INFO_VALID_MASK) )
@@ -190,8 +191,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     curr->arch.hvm_vmx.vmx_emulate = 1;
     while ( curr->arch.hvm_vmx.vmx_emulate &&
-            !softirq_pending(smp_processor_id()) &&
-            (vio->io_state == HVMIO_none) )
+            !softirq_pending(smp_processor_id()) )
     {
         /*
          * Check for pending interrupts only every 16 instructions, because
@@ -203,7 +203,10 @@ void vmx_realmode(struct cpu_user_regs *regs)
              hvm_local_events_need_delivery(curr) )
             break;
 
-        realmode_emulate_one(&hvmemul_ctxt);
+        vmx_realmode_emulate_one(&hvmemul_ctxt);
+
+        if ( vio->io_state != HVMIO_none || vio->mmio_retry )
+            break;
 
         /* Stop emulating unless our segment state is not safe */
         if ( curr->arch.hvm_vmx.vmx_realmode )
@@ -245,3 +248,13 @@ void vmx_realmode(struct cpu_user_regs *regs)
     if ( intr_info & INTR_INFO_VALID_MASK )
         __vmwrite(VM_ENTRY_INTR_INFO, intr_info);
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index bf6c7ab..5564fba 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -34,11 +34,16 @@ enum hvm_io_state {
     HVMIO_none = 0,
     HVMIO_dispatched,
     HVMIO_awaiting_completion,
-    HVMIO_handle_mmio_awaiting_completion,
-    HVMIO_handle_pio_awaiting_completion,
     HVMIO_completed
 };
 
+enum hvm_io_completion {
+    HVMIO_no_completion,
+    HVMIO_mmio_completion,
+    HVMIO_pio_completion,
+    HVMIO_realmode_completion
+};
+
 struct hvm_vcpu_asid {
     uint64_t generation;
     uint32_t asid;
@@ -46,9 +51,10 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state   io_state;
-    unsigned long       io_data;
-    int                 io_size;
+    enum hvm_io_state      io_state;
+    enum hvm_io_completion io_completion;
+    unsigned long          io_data;
+    unsigned int           io_size;
 
     /*
      * HVM emulation:
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 35f804a..c5f3d24 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -94,6 +94,7 @@ void vmx_asm_do_vmentry(void);
 void vmx_intr_assist(void);
 void noreturn vmx_do_resume(struct vcpu *);
 void vmx_vlapic_msr_changed(struct vcpu *v);
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt);
 void vmx_realmode(struct cpu_user_regs *regs);
 void vmx_update_debug_state(struct vcpu *v);
 void vmx_update_exception_bitmap(struct vcpu *v);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 11/15] x86/hvm: remove HVMIO_dispatched I/O state
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (9 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 10/15] x86/hvm: split I/O completion handling from state model Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 12/15] x86/hvm: remove hvm_io_state enumeration Paul Durrant
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

By removing the HVMIO_dispatched state and making all pending emulations
(i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
various code-paths can be simplified.

The completion case for HVMIO_dispatched can also be trivially removed
from hvmemul_do_io() as it was already unreachable. This is because that
state was only ever used for writes or I/O to/from a guest page and
hvmemul_do_io() is never called to complete such I/O.

NOTE: There is one subtlety in handle_pio()... The only case when
      handle_pio() got a return code of X86EMUL_RETRY back from
      hvmemul_do_pio_buffer() and found io_state was not
      HVMIO_awaiting_completion was in the case where the domain is
      shutting down. This is because all writes normally yield a return
      of X86EMUL_OKAY and all reads put io_state into
      HVMIO_awaiting_completion. Hence the io_state check there is
      replaced with a check of the is_shutting_down flag on the domain.
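
In outline, the simplification rests on a single predicate; the sketch below
matches the helper introduced in the hunks that follow:

    /* Completion work is only needed for an in-flight request that
     * returns data directly, i.e. a non-buffered read. */
    static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
    {
        return (vio->io_state == HVMIO_awaiting_completion) &&
               !vio->io_data_is_addr &&
               (vio->io_dir == IOREQ_READ);
    }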

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Modified struct field placement as knock-on from previous patch

v6:
- Added Andrew's reviewed-by

v5:
- Added some extra comments to the commit
---
 xen/arch/x86/hvm/emulate.c      |   12 +++---------
 xen/arch/x86/hvm/hvm.c          |   12 +++---------
 xen/arch/x86/hvm/io.c           |   14 +++++++-------
 xen/arch/x86/hvm/vmx/realmode.c |    2 +-
 xen/include/asm-x86/hvm/vcpu.h  |   10 +++++++++-
 5 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 54b9430..73cce4e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -137,20 +137,14 @@ static int hvmemul_do_io(
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
-    case HVMIO_dispatched:
-        /* May have to wait for previous cycle of a multi-write to complete. */
-        if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
-             (addr == (vio->mmio_large_write_pa +
-                       vio->mmio_large_write_bytes)) )
-            return X86EMUL_RETRY;
-        /* fallthrough */
     default:
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
-        HVMIO_dispatched : HVMIO_awaiting_completion;
+    vio->io_state = HVMIO_awaiting_completion;
     vio->io_size = size;
+    vio->io_dir = dir;
+    vio->io_data_is_addr = data_is_addr;
 
     if ( dir == IOREQ_WRITE )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 093a710..8e487d4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -416,22 +416,16 @@ static void hvm_io_assist(ioreq_t *p)
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
 
     p->state = STATE_IOREQ_NONE;
 
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
+    if ( hvm_vcpu_io_need_completion(vio) )
     {
-    case HVMIO_awaiting_completion:
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
-        break;
-    default:
-        break;
     }
+    else
+        vio->io_state = HVMIO_none;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 221d05e..3b51d59 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -90,9 +90,7 @@ int handle_mmio(void)
 
     rc = hvm_emulate_one(&ctxt);
 
-    if ( rc != X86EMUL_RETRY )
-        vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+    if ( hvm_vcpu_io_need_completion(vio) || vio->mmio_retry )
         vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
@@ -142,6 +140,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 
     rc = hvmemul_do_pio_buffer(port, size, dir, &data);
 
+    if ( hvm_vcpu_io_need_completion(vio) )
+        vio->io_completion = HVMIO_pio_completion;
+
     switch ( rc )
     {
     case X86EMUL_OKAY:
@@ -154,11 +155,10 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
         }
         break;
     case X86EMUL_RETRY:
-        if ( vio->io_state != HVMIO_awaiting_completion )
+        /* We should not advance RIP/EIP if the domain is shutting down */
+        if ( curr->domain->is_shutting_down )
             return 0;
-        /* Completion in hvm_io_assist() with no re-emulation required. */
-        ASSERT(dir == IOREQ_READ);
-        vio->io_completion = HVMIO_pio_completion;
+
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 76ff9a5..deb53ae 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -111,7 +111,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
     rc = hvm_emulate_one(hvmemul_ctxt);
 
-    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+    if ( hvm_vcpu_io_need_completion(vio) || vio->mmio_retry )
         vio->io_completion = HVMIO_realmode_completion;
 
     if ( rc == X86EMUL_UNHANDLEABLE )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 5564fba..6bb9c12 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -32,7 +32,6 @@
 
 enum hvm_io_state {
     HVMIO_none = 0,
-    HVMIO_dispatched,
     HVMIO_awaiting_completion,
     HVMIO_completed
 };
@@ -55,6 +54,8 @@ struct hvm_vcpu_io {
     enum hvm_io_completion io_completion;
     unsigned long          io_data;
     unsigned int           io_size;
+    uint8_t                io_dir;
+    uint8_t                io_data_is_addr;
 
     /*
      * HVM emulation:
@@ -87,6 +88,13 @@ struct hvm_vcpu_io {
     const struct g2m_ioport *g2m_ioport;
 };
 
+static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
+{
+    return (vio->io_state == HVMIO_awaiting_completion) &&
+           !vio->io_data_is_addr &&
+           (vio->io_dir == IOREQ_READ);
+}
+
 #define VMCX_EADDR    (~0ULL)
 
 struct nestedvcpu {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 12/15] x86/hvm: remove hvm_io_state enumeration
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (10 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 11/15] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 13/15] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser

Emulation request status is already covered by the STATE_IOREQ_XXX values, so
just use those. The mapping is:

HVMIO_none                -> STATE_IOREQ_NONE
HVMIO_awaiting_completion -> STATE_IOREQ_READY
HVMIO_completed           -> STATE_IORESP_READY
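
Put another way, the request lifecycle becomes (an illustrative sketch of the
transitions, paraphrased from the hunks below):

    vio->io_state = STATE_IOREQ_NONE;   /* idle */
    vio->io_state = STATE_IOREQ_READY;  /* request issued in hvmemul_do_io() */
    vio->io_state = STATE_IORESP_READY; /* response recorded in hvm_io_assist() */
    vio->io_state = STATE_IOREQ_NONE;   /* consumed on re-entry to hvmemul_do_io() */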

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Modified struct field placement as knock-on from previous patch

v6:
- Added Andrew's reviewed-by

v5:
- Added Jan's acked-by
---
 xen/arch/x86/hvm/emulate.c       |   14 +++++++-------
 xen/arch/x86/hvm/hvm.c           |    6 +++---
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   10 ++--------
 5 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 73cce4e..9d9219e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -130,10 +130,10 @@ static int hvmemul_do_io(
 
     switch ( vio->io_state )
     {
-    case HVMIO_none:
+    case STATE_IOREQ_NONE:
         break;
-    case HVMIO_completed:
-        vio->io_state = HVMIO_none;
+    case STATE_IORESP_READY:
+        vio->io_state = STATE_IOREQ_NONE;
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
@@ -141,7 +141,7 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = HVMIO_awaiting_completion;
+    vio->io_state = STATE_IOREQ_READY;
     vio->io_size = size;
     vio->io_dir = dir;
     vio->io_data_is_addr = data_is_addr;
@@ -160,7 +160,7 @@ static int hvmemul_do_io(
     {
     case X86EMUL_OKAY:
         vio->io_data = p.data;
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -173,13 +173,13 @@ static int hvmemul_do_io(
             rc = hvm_process_io_intercept(&null_handler, &p);
             if ( rc == X86EMUL_OKAY )
                 vio->io_data = p.data;
-            vio->io_state = HVMIO_none;
+            vio->io_state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
-                vio->io_state = HVMIO_none;
+                vio->io_state = STATE_IOREQ_NONE;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8e487d4..3be17f9 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
     if ( hvm_vcpu_io_need_completion(vio) )
     {
-        vio->io_state = HVMIO_completed;
+        vio->io_state = STATE_IORESP_READY;
         vio->io_data = p->data;
     }
     else
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
@@ -505,7 +505,7 @@ void hvm_do_resume(struct vcpu *v)
             guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
         else
             memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case HVMIO_realmode_completion:
     {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index ad8afb4..2243873 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1221,7 +1221,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != HVMIO_none )
+        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index deb53ae..25533dc 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_state != HVMIO_none || vio->mmio_retry )
+        if ( vio->io_state != STATE_IOREQ_NONE || vio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != HVMIO_none )
+    if ( vio->io_state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 6bb9c12..5c9faf2 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -30,12 +30,6 @@
 #include <asm/hvm/svm/nestedsvm.h>
 #include <asm/mtrr.h>
 
-enum hvm_io_state {
-    HVMIO_none = 0,
-    HVMIO_awaiting_completion,
-    HVMIO_completed
-};
-
 enum hvm_io_completion {
     HVMIO_no_completion,
     HVMIO_mmio_completion,
@@ -50,10 +44,10 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state      io_state;
     enum hvm_io_completion io_completion;
     unsigned long          io_data;
     unsigned int           io_size;
+    uint8_t                io_state;
     uint8_t                io_dir;
     uint8_t                io_data_is_addr;
 
@@ -90,7 +84,7 @@ struct hvm_vcpu_io {
 
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
-    return (vio->io_state == HVMIO_awaiting_completion) &&
+    return (vio->io_state == STATE_IOREQ_READY) &&
            !vio->io_data_is_addr &&
            (vio->io_dir == IOREQ_READ);
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 13/15] x86/hvm: use ioreq_t to track in-flight state
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (11 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 12/15] x86/hvm: remove hvm_io_state enumeration Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 14/15] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

Use an ioreq_t rather than open coded state, size, dir and data fields
in struct hvm_vcpu_io. This also allows PIO completion to be handled
similarly to MMIO completion by re-issuing the handle_pio() call.
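
Condensed from the hunks below, the PIO completion path reduces to re-issuing
the original call from the saved request (not a verbatim excerpt):

    case HVMIO_pio_completion:
        (void)handle_pio(vio->io_req.addr, vio->io_req.size,
                         vio->io_req.dir);
        break;

with hvmemul_do_io() verifying on re-entry that the re-issued emulation
matches the saved ioreq_t (type, addr, size, count, dir, df, data_is_ptr)
and crashing the domain on any mismatch.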

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Modified struct field placement as knock-on from previous patch

v6:
- Added Andrew's reviewed-by

v5:
- Added missing hunk with call to handle_pio()
---
 xen/arch/x86/hvm/emulate.c       |   35 +++++++++++++++++++++--------------
 xen/arch/x86/hvm/hvm.c           |   13 +++++--------
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   12 ++++--------
 5 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 9d9219e..a2db826 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -91,6 +91,7 @@ static int hvmemul_do_io(
         .df = df,
         .data = data,
         .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
+        .state = STATE_IOREQ_READY,
     };
     void *p_data = (void *)data;
     int rc;
@@ -128,12 +129,24 @@ static int hvmemul_do_io(
         }
     }
 
-    switch ( vio->io_state )
+    switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
         break;
     case STATE_IORESP_READY:
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
+        p = vio->io_req;
+
+        /* Verify the emulation request has been correctly re-issued */
+        if ( (p.type != (is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO)) ||
+             (p.addr != addr) ||
+             (p.size != size) ||
+             (p.count != reps) ||
+             (p.dir != dir) ||
+             (p.df != df) ||
+             (p.data_is_ptr != data_is_addr) )
+            domain_crash(curr->domain);
+
         if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
@@ -141,11 +154,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = STATE_IOREQ_READY;
-    vio->io_size = size;
-    vio->io_dir = dir;
-    vio->io_data_is_addr = data_is_addr;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -154,13 +162,14 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
+    vio->io_req = p;
+
     rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        vio->io_data = p.data;
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -171,15 +180,13 @@ static int hvmemul_do_io(
         if ( !s )
         {
             rc = hvm_process_io_intercept(&null_handler, &p);
-            if ( rc == X86EMUL_OKAY )
-                vio->io_data = p.data;
-            vio->io_state = STATE_IOREQ_NONE;
+            vio->io_req.state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
-                vio->io_state = STATE_IOREQ_NONE;
+                vio->io_req.state = STATE_IOREQ_NONE;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
@@ -198,7 +205,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
 
         if ( !data_is_addr )
-            memcpy(p_data, &vio->io_data, size);
+            memcpy(p_data, &p.data, size);
     }
 
     if ( is_mmio && !data_is_addr )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3be17f9..31ae4d4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
     if ( hvm_vcpu_io_need_completion(vio) )
     {
-        vio->io_state = STATE_IORESP_READY;
-        vio->io_data = p->data;
+        vio->io_req.state = STATE_IORESP_READY;
+        vio->io_req.data = p->data;
     }
     else
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
 
     msix_write_completion(curr);
     vcpu_end_shutdown_deferral(curr);
@@ -501,11 +501,8 @@ void hvm_do_resume(struct vcpu *v)
         handle_mmio();
         break;
     case HVMIO_pio_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-        vio->io_state = STATE_IOREQ_NONE;
+        (void)handle_pio(vio->io_req.addr, vio->io_req.size,
+                         vio->io_req.dir);
         break;
     case HVMIO_realmode_completion:
     {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 2243873..2653bc1 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1221,7 +1221,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
+        if ( v->arch.hvm_vcpu.hvm_io.io_req.state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 25533dc..e83a61f 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_state != STATE_IOREQ_NONE || vio->mmio_retry )
+        if ( vio->io_req.state != STATE_IOREQ_NONE || vio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != STATE_IOREQ_NONE )
+    if ( vio->io_req.state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 5c9faf2..01cbfe5 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -45,11 +45,7 @@ struct hvm_vcpu_asid {
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
     enum hvm_io_completion io_completion;
-    unsigned long          io_data;
-    unsigned int           io_size;
-    uint8_t                io_state;
-    uint8_t                io_dir;
-    uint8_t                io_data_is_addr;
+    ioreq_t                io_req;
 
     /*
      * HVM emulation:
@@ -84,9 +80,9 @@ struct hvm_vcpu_io {
 
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
-    return (vio->io_state == STATE_IOREQ_READY) &&
-           !vio->io_data_is_addr &&
-           (vio->io_dir == IOREQ_READ);
+    return (vio->io_req.state == STATE_IOREQ_READY) &&
+           !vio->io_req.data_is_ptr &&
+           (vio->io_req.dir == IOREQ_READ);
 }
 
 #define VMCX_EADDR    (~0ULL)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 14/15] x86/hvm: always re-emulate I/O from a buffer
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (12 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 13/15] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 13:10 ` [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
  2015-07-10  9:27 ` [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 istruction in attachment Fabio Fantoni
  15 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser

If memory-mapped I/O is 'chunked' then the I/O must be re-emulated,
otherwise only the first chunk will be processed. This patch makes
sure all I/O from a buffer is re-emulated regardless of whether it
is a read or a write.
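
An illustrative scenario (hypothetical instruction and address, not taken
from the patch):

    /*
     * mov %rax,(%rdi)  ; an 8-byte MMIO write at gpa 0xfe000004
     *
     * hvmemul_phys_mmio_access() splits this into two 4-byte chunks.
     * If the first chunk is forwarded to a device model, emulation
     * returns X86EMUL_RETRY; without re-emulation the second chunk
     * would never be issued. Hence completion (and thus re-emulation)
     * is now required for any buffered access, write or read:
     */
    return (vio->io_req.state == STATE_IOREQ_READY) &&
           !vio->io_req.data_is_ptr;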

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- No change

v6:
- Added Andrew's reviewed-by

v5:
- Added Jan's acked-by
---
 xen/arch/x86/hvm/emulate.c     |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h |    3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index a2db826..d7ee096 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -147,7 +147,7 @@ static int hvmemul_do_io(
              (p.data_is_ptr != data_is_addr) )
             domain_crash(curr->domain);
 
-        if ( data_is_addr || dir == IOREQ_WRITE )
+        if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
     default:
@@ -187,7 +187,7 @@ static int hvmemul_do_io(
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
                 vio->io_req.state = STATE_IOREQ_NONE;
-            else if ( data_is_addr || dir == IOREQ_WRITE )
+            else if ( data_is_addr )
                 rc = X86EMUL_OKAY;
         }
         break;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 01cbfe5..13ff54f 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -81,8 +81,7 @@ struct hvm_vcpu_io {
 static inline bool_t hvm_vcpu_io_need_completion(const struct hvm_vcpu_io *vio)
 {
     return (vio->io_req.state == STATE_IOREQ_READY) &&
-           !vio->io_req.data_is_ptr &&
-           (vio->io_req.dir == IOREQ_READ);
+           !vio->io_req.data_is_ptr;
 }
 
 #define VMCX_EADDR    (~0ULL)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (13 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 14/15] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
@ 2015-07-09 13:10 ` Paul Durrant
  2015-07-09 15:46   ` Jan Beulich
  2015-07-10  9:27 ` [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 istruction in attachment Fabio Fantoni
  15 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 13:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issue of component I/O, is defeated by accesses across a page boundary
because it uses the physical address. The code is also only relevant to
memory-mapped I/O to or from a buffer.

This patch re-factors the code and moves it into hvmemul_phys_mmio_access()
where it is relevant, and tracks progress by buffer offset rather than by address.
Separate I/O emulations (of which there may be up to three per instruction)
are distinguished by linear address.
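
A worked illustration (hypothetical addresses, not from the patch): a 16-byte
write to linear address 0xb7fffff8 spans a page boundary, so it is issued as
two physical chunks at buffer offsets 0 and 8. If the second chunk needs a
device-model round trip the whole instruction is re-emulated, and on re-entry

    struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(vio, gla, dir);

returns the entry keyed by the same linear address and direction, whose
buffer already holds the first 8 bytes, so only the outstanding chunk is
re-issued.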

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---

v7:
- Added comment requested by Jan
- Changed BUG_ON() to domain_crash()

v6:
- Added Andrew's reviewed-by

v5:
- Fixed to cache up to three distinct I/O emulations per instruction
---
 xen/arch/x86/hvm/emulate.c     |  130 +++++++++++++++++++++-------------------
 xen/include/asm-x86/hvm/vcpu.h |   25 +++++---
 2 files changed, 86 insertions(+), 69 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d7ee096..f4d61be 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -106,29 +106,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-                return X86EMUL_OKAY;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-            {
-                memcpy(p_data, &vio->mmio_large_read[addr - pa],
-                       size);
-                return X86EMUL_OKAY;
-            }
-        }
-    }
-
     switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
@@ -208,33 +185,6 @@ static int hvmemul_do_io(
             memcpy(p_data, &p.data, size);
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_write_pa = addr;
-            if ( addr == (pa + bytes) )
-                vio->mmio_large_write_bytes += size;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_read_pa = addr;
-            if ( (addr == (pa + bytes)) &&
-                 ((bytes + size) <= sizeof(vio->mmio_large_read)) )
-            {
-                memcpy(&vio->mmio_large_read[bytes], p_data, size);
-                vio->mmio_large_read_bytes += size;
-            }
-        }
-    }
-
     return X86EMUL_OKAY;
 }
 
@@ -590,11 +540,12 @@ static int hvmemul_virtual_to_linear(
 }
 
 static int hvmemul_phys_mmio_access(
-    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer)
+    struct hvm_mmio_cache *cache, paddr_t gpa, unsigned int size, uint8_t dir,
+    uint8_t *buffer, unsigned int offset)
 {
     unsigned long one_rep = 1;
     unsigned int chunk;
-    int rc;
+    int rc = X86EMUL_OKAY;
 
     /* Accesses must fall within a page. */
     BUG_ON((gpa & ~PAGE_MASK) + size > PAGE_SIZE);
@@ -611,14 +562,33 @@ static int hvmemul_phys_mmio_access(
 
     for ( ;; )
     {
-        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
-                                    buffer);
-        if ( rc != X86EMUL_OKAY )
-            break;
+        /* Have we already done this chunk? */
+        if ( offset < cache->size )
+        {
+            ASSERT((offset + chunk) <= cache->size);
+
+            if ( dir == IOREQ_READ )
+                memcpy(&buffer[offset], &cache->buffer[offset], chunk);
+            else if ( memcmp(&buffer[offset], &cache->buffer[offset], chunk) != 0 )
+                domain_crash(current->domain);
+        }
+        else
+        {
+            ASSERT(offset == cache->size);
+
+            rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                        &buffer[offset]);
+            if ( rc != X86EMUL_OKAY )
+                break;
+
+            /* Note that we have now done this chunk. */
+            memcpy(&cache->buffer[offset], &buffer[offset], chunk);
+            cache->size += chunk;
+        }
 
         /* Advance to the next chunk. */
         gpa += chunk;
-        buffer += chunk;
+        offset += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -635,13 +605,49 @@ static int hvmemul_phys_mmio_access(
     return rc;
 }
 
+/*
+ * Multi-cycle MMIO handling is based upon the assumption that emulation
+ * of the same instruction will not access the same MMIO region more
+ * than once. Hence we can deal with re-emulation (for secondary or
+ * subsequent cycles) by looking up the result or previous I/O in a
+ * cache indexed by linear MMIO address.
+ */
+static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
+    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
+{
+    unsigned int i;
+    struct hvm_mmio_cache *cache;
+
+    for ( i = 0; i < vio->mmio_cache_count; i ++ )
+    {
+        cache = &vio->mmio_cache[i];
+
+        if ( gla == cache->gla &&
+             dir == cache->dir )
+            return cache;
+    }
+
+    i = vio->mmio_cache_count++;
+    if( i == ARRAY_SIZE(vio->mmio_cache) )
+        domain_crash(current->domain);
+
+    cache = &vio->mmio_cache[i];
+    memset(cache, 0, sizeof (*cache));
+
+    cache->gla = gla;
+    cache->dir = dir;
+
+    return cache;
+}
+
 static int hvmemul_linear_mmio_access(
-    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    unsigned long gla, unsigned int size, uint8_t dir, void *buffer,
     uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
     unsigned long offset = gla & ~PAGE_MASK;
-    unsigned int chunk;
+    struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(vio, gla, dir);
+    unsigned int chunk, buffer_offset = 0;
     paddr_t gpa;
     unsigned long one_rep = 1;
     int rc;
@@ -660,12 +666,12 @@ static int hvmemul_linear_mmio_access(
 
     for ( ;; )
     {
-        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        rc = hvmemul_phys_mmio_access(cache, gpa, chunk, dir, buffer, buffer_offset);
         if ( rc != X86EMUL_OKAY )
             break;
 
         gla += chunk;
-        buffer += chunk;
+        buffer_offset += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -1612,7 +1618,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
         rc = X86EMUL_RETRY;
     if ( rc != X86EMUL_RETRY )
     {
-        vio->mmio_large_read_bytes = vio->mmio_large_write_bytes = 0;
+        vio->mmio_cache_count = 0;
         vio->mmio_insn_bytes = 0;
     }
     else
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 13ff54f..6ee693f 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -42,6 +42,17 @@ struct hvm_vcpu_asid {
     uint32_t asid;
 };
 
+/*
+ * We may read or write up to m256 as a number of device-model
+ * transactions.
+ */
+struct hvm_mmio_cache {
+    unsigned long gla;
+    unsigned int size;
+    uint8_t dir;
+    uint8_t buffer[32];
+};
+
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
     enum hvm_io_completion io_completion;
@@ -57,13 +68,13 @@ struct hvm_vcpu_io {
     unsigned long       mmio_gva;
     unsigned long       mmio_gpfn;
 
-    /* We may read up to m256 as a number of device-model transactions. */
-    paddr_t mmio_large_read_pa;
-    uint8_t mmio_large_read[32];
-    unsigned int mmio_large_read_bytes;
-    /* We may write up to m256 as a number of device-model transactions. */
-    unsigned int mmio_large_write_bytes;
-    paddr_t mmio_large_write_pa;
+    /*
+     * We may need to handle up to 3 distinct memory accesses per
+     * instruction.
+     */
+    struct hvm_mmio_cache mmio_cache[3];
+    unsigned int mmio_cache_count;
+
     /* For retries we shouldn't re-fetch the instruction. */
     unsigned int mmio_insn_bytes;
     unsigned char mmio_insn[16];
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09 13:10 ` [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-07-09 15:13   ` Jan Beulich
  2015-07-09 16:16     ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 15:13 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir Fraser, xen-devel

>>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> +static int hvmemul_linear_mmio_access(
> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
> +{
> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> +    unsigned long offset = gla & ~PAGE_MASK;
> +    unsigned int chunk;
> +    paddr_t gpa;
> +    unsigned long one_rep = 1;
> +    int rc;
> +
> +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
> +
> +    if ( known_gpfn )
> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
> +    else
> +    {
> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                    hvmemul_ctxt);
> +        if ( rc != X86EMUL_OKAY )
> +            return rc;
> +    }
> +
> +    for ( ;; )
> +    {
> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> +        if ( rc != X86EMUL_OKAY )
> +            break;
> +
> +        gla += chunk;
> +        buffer += chunk;
> +        size -= chunk;
> +
> +        if ( size == 0 )
> +            break;
> +
> +        ASSERT((gla & ~PAGE_MASK) == 0);

While I don't mean for you to re-submit another time, I'd still like to
get my question answered: Does this really matter for the code
below?

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 03/15] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int
  2015-07-09 13:10 ` [PATCH v7 03/15] x86/hvm: restrict port numbers to uint16_t and sizes to unsigned int Paul Durrant
@ 2015-07-09 15:24   ` Jan Beulich
  2015-07-09 16:10     ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 15:24 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir Fraser, xen-devel

>>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> Building on the previous patch, this patch restricts portio port numbers
> to uint16_t in registration/relocate calls. In portio_action_t the port
> number is changed to unsigned int though to avoid the compiler generating
> 16-bit operations unnecessarily. The patch also changes I/O sizes to
> unsigned int which then allows the io_handler size field to reduce to
> an unsigned int.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> 
> v7:
> - Change port type in portio_action_t to unsigned int as requested
>   by Jan

Yet the title and description were left in place, and ...

> @@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
>  int hvm_buffered_io_send(ioreq_t *p);
>  
>  static inline void register_portio_handler(
> -    struct domain *d, unsigned long addr,
> -    unsigned long size, portio_action_t action)
> +    struct domain *d, uint16_t port, unsigned int size,
> +    portio_action_t action)
>  {
> -    register_io_handler(d, addr, size, action, HVM_PORTIO);
> +    register_io_handler(d, port, size, action, HVM_PORTIO);
>  }
>  
>  static inline void relocate_portio_handler(
> -    struct domain *d, unsigned long old_addr, unsigned long new_addr,
> -    unsigned long size)
> +    struct domain *d, uint16_t old_port, uint16_t new_port,
> +    unsigned int size)
>  {
> -    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
> +    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
>  }

... these still use uint16_t. I'm pretty sure I gave my comment in a
way indicating that this should generally change, perhaps just at
the example of portio_action_t.
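
Presumably the generalization being asked for is along these lines (an
extrapolation from the comment above, not code from the thread):

    static inline void register_portio_handler(
        struct domain *d, unsigned int port, unsigned int size,
        portio_action_t action)
    {
        register_io_handler(d, port, size, action, HVM_PORTIO);
    }

with relocate_portio_handler() changed the same way.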

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09 13:10 ` [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-07-09 15:33   ` Jan Beulich
  2015-07-09 16:12     ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 15:33 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir Fraser, xen-devel

>>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> @@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
>      }
>  }
>  
> -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
> +static int stdvga_mem_write(const struct hvm_io_handler *handler,
> +                            uint64_t addr, uint32_t size,
> +                            uint64_t data)
>  {
> +    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
> +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> +                  .addr = addr,
> +                  .size = size,
> +                  .count = 1,
> +                  .dir = IOREQ_WRITE,
> +                  .data = data,
> +                };

Indentation (still - I know I pointed this out on v6, just perhaps at
another example). See e.g. the context of the earlier change to the
beginning of hvm_mmio_internal() in this patch for how this should
look.

> -int stdvga_intercept_mmio(ioreq_t *p)
> +static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
> +                                const ioreq_t *p)
>  {
> -    struct domain *d = current->domain;
> -    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
> -    uint64_t start, end, addr = p->addr, count = p->count, size = p->size;
> -    int buf = 0, rc;
> -
> -    if ( unlikely(p->df) )
> -    {
> -        start = (addr - (count - 1) * size);
> -        end = addr + size;
> -    }
> -    else
> -    {
> -        start = addr;
> -        end = addr + count * size;
> -    }
> -
> -    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> -        return X86EMUL_UNHANDLEABLE;
> -
> -    if ( size > 8 )
> -    {
> -        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
> -        return X86EMUL_UNHANDLEABLE;
> -    }
> +    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
>  
>      spin_lock(&s->lock);
>  
> -    if ( s->stdvga && s->cache )
> +    if ( !s->stdvga ||
> +         (hvm_mmio_first_byte(p) < VGA_MEM_BASE) ||
> +         (hvm_mmio_last_byte(p) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )

If "last" means what is says, you need >= here.

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-09 13:10 ` [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
@ 2015-07-09 15:46   ` Jan Beulich
  2015-07-09 16:05     ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 15:46 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir Fraser, xen-devel

>>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> @@ -635,13 +605,49 @@ static int hvmemul_phys_mmio_access(
>      return rc;
>  }
>  
> +/*
> + * Multi-cycle MMIO handling is based upon the assumption that emulation
> + * of the same instruction will not access the same MMIO region more
> + * than once. Hence we can deal with re-emulation (for secondary or
> + * subsequent cycles) by looking up the result or previous I/O in a
> + * cache indexed by linear MMIO address.
> + */
> +static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
> +    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
> +{
> +    unsigned int i;
> +    struct hvm_mmio_cache *cache;
> +
> +    for ( i = 0; i < vio->mmio_cache_count; i ++ )
> +    {
> +        cache = &vio->mmio_cache[i];
> +
> +        if ( gla == cache->gla &&
> +             dir == cache->dir )
> +            return cache;
> +    }
> +
> +    i = vio->mmio_cache_count++;
> +    if( i == ARRAY_SIZE(vio->mmio_cache) )
> +        domain_crash(current->domain);

But you mustn't continue here, or at least force i into range so you
don't corrupt other data.

And while doing that please also add the missing space on the
if() line.
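
One possible shape for the fix (a sketch only; an assumption about what v8
might do, not code from this thread):

    i = vio->mmio_cache_count;
    if ( i == ARRAY_SIZE(vio->mmio_cache) )
    {
        domain_crash(current->domain);
        return NULL; /* callers must then tolerate a NULL cache */
    }
    vio->mmio_cache_count++;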

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset
  2015-07-09 15:46   ` Jan Beulich
@ 2015-07-09 16:05     ` Paul Durrant
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 16:46
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: Re: [PATCH v7 15/15] x86/hvm: track large memory mapped
> accesses by buffer offset
> 
> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> > @@ -635,13 +605,49 @@ static int hvmemul_phys_mmio_access(
> >      return rc;
> >  }
> >
> > +/*
> > + * Multi-cycle MMIO handling is based upon the assumption that
> emulation
> > + * of the same instruction will not access the same MMIO region more
> > + * than once. Hence we can deal with re-emulation (for secondary or
> > + * subsequent cycles) by looking up the result or previous I/O in a
> > + * cache indexed by linear MMIO address.
> > + */
> > +static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
> > +    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir)
> > +{
> > +    unsigned int i;
> > +    struct hvm_mmio_cache *cache;
> > +
> > +    for ( i = 0; i < vio->mmio_cache_count; i ++ )
> > +    {
> > +        cache = &vio->mmio_cache[i];
> > +
> > +        if ( gla == cache->gla &&
> > +             dir == cache->dir )
> > +            return cache;
> > +    }
> > +
> > +    i = vio->mmio_cache_count++;
> > +    if( i == ARRAY_SIZE(vio->mmio_cache) )
> > +        domain_crash(current->domain);
> 
> But you mustn't continue here, or at least force i into range so you
> don't corrupt other data.
> 
> And while doing that please also add the missing space on the
> if() line.
> 

Sorry, I was forgetting domain_crash() is not immediate. I'll send v8.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t and sizes to unsigned int
  2015-07-09 15:24   ` Jan Beulich
@ 2015-07-09 16:10     ` Paul Durrant
  2015-07-09 16:20       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 16:24
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: Re: [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t
> and sizes to unsigned int
> 
> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> > Building on the previous patch, this patch restricts portio port numbers
> > to uint16_t in registration/relocate calls. In portio_action_t the port
> > number is changed to unsigned int though to avoid the compiler generating
> > 16-bit operations unnecessarily. The patch also changes I/O sizes to
> > unsigned int which then allows the io_handler size field to reduce to
> > an unsigned int.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Cc: Keir Fraser <keir@xen.org>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > ---
> >
> > v7:
> > - Change port type in portio_action_t to unsigned int as requested
> >   by Jan
> 
> Yet title and description were left in place, and ...

The title remains. The description was modified:

" In portio_action_t the port number is change to unsigned int though to avoid the compiler generating 16-bit operations unnecessarily."

> 
> > @@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
> >  int hvm_buffered_io_send(ioreq_t *p);
> >
> >  static inline void register_portio_handler(
> > -    struct domain *d, unsigned long addr,
> > -    unsigned long size, portio_action_t action)
> > +    struct domain *d, uint16_t port, unsigned int size,
> > +    portio_action_t action)
> >  {
> > -    register_io_handler(d, addr, size, action, HVM_PORTIO);
> > +    register_io_handler(d, port, size, action, HVM_PORTIO);
> >  }
> >
> >  static inline void relocate_portio_handler(
> > -    struct domain *d, unsigned long old_addr, unsigned long new_addr,
> > -    unsigned long size)
> > +    struct domain *d, uint16_t old_port, uint16_t new_port,
> > +    unsigned int size)
> >  {
> > -    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
> > +    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
> >  }
> 
> ... these still use uint16_t. I'm pretty sure I gave my comment in a
> way indicating that this should generally change, perhaps just at
> the example of portio_action_t.

Why? Do we not want to restrict to uint16_t at the interface level?

  Paul

> 
> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09 15:33   ` Jan Beulich
@ 2015-07-09 16:12     ` Paul Durrant
  2015-07-09 16:21       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 16:33
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: Re: [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with
> standard mmio intercept
> 
> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> > @@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr,
> uint32_t val)
> >      }
> >  }
> >
> > -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
> > +static int stdvga_mem_write(const struct hvm_io_handler *handler,
> > +                            uint64_t addr, uint32_t size,
> > +                            uint64_t data)
> >  {
> > +    struct hvm_hw_stdvga *s = &current->domain-
> >arch.hvm_domain.stdvga;
> > +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> > +                  .addr = addr,
> > +                  .size = size,
> > +                  .count = 1,
> > +                  .dir = IOREQ_WRITE,
> > +                  .data = data,
> > +                };
> 
> Indentation (still - I know I pointed this out on v6, just perhaps at
> another example). See e.g. the context of the earlier change to the
> beginning of hvm_mmio_internal() in this patch for how this should
> look like.

It has to be something my emacs is doing then; I can't see any brokenness.

> 
> > -int stdvga_intercept_mmio(ioreq_t *p)
> > +static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
> > +                                const ioreq_t *p)
> >  {
> > -    struct domain *d = current->domain;
> > -    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
> > -    uint64_t start, end, addr = p->addr, count = p->count, size = p->size;
> > -    int buf = 0, rc;
> > -
> > -    if ( unlikely(p->df) )
> > -    {
> > -        start = (addr - (count - 1) * size);
> > -        end = addr + size;
> > -    }
> > -    else
> > -    {
> > -        start = addr;
> > -        end = addr + count * size;
> > -    }
> > -
> > -    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE +
> VGA_MEM_SIZE)) )
> > -        return X86EMUL_UNHANDLEABLE;
> > -
> > -    if ( size > 8 )
> > -    {
> > -        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
> > -        return X86EMUL_UNHANDLEABLE;
> > -    }
> > +    struct hvm_hw_stdvga *s = &current->domain-
> >arch.hvm_domain.stdvga;
> >
> >      spin_lock(&s->lock);
> >
> > -    if ( s->stdvga && s->cache )
> > +    if ( !s->stdvga ||
> > +         (hvm_mmio_first_byte(p) < VGA_MEM_BASE) ||
> > +         (hvm_mmio_last_byte(p) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
> 
> If "last" means what is says, you need >= here.
> 

True.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09 15:13   ` Jan Beulich
@ 2015-07-09 16:16     ` Paul Durrant
  2015-07-09 16:24       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 16:13
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded
> 'chunking' loops
> 
> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> > +static int hvmemul_linear_mmio_access(
> > +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> > +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
> known_gpfn)
> > +{
> > +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> > +    unsigned long offset = gla & ~PAGE_MASK;
> > +    unsigned int chunk;
> > +    paddr_t gpa;
> > +    unsigned long one_rep = 1;
> > +    int rc;
> > +
> > +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
> > +
> > +    if ( known_gpfn )
> > +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
> > +    else
> > +    {
> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> > +                                    hvmemul_ctxt);
> > +        if ( rc != X86EMUL_OKAY )
> > +            return rc;
> > +    }
> > +
> > +    for ( ;; )
> > +    {
> > +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> > +        if ( rc != X86EMUL_OKAY )
> > +            break;
> > +
> > +        gla += chunk;
> > +        buffer += chunk;
> > +        size -= chunk;
> > +
> > +        if ( size == 0 )
> > +            break;
> > +
> > +        ASSERT((gla & ~PAGE_MASK) == 0);
> 
> While I don't mean you to re-submit another time, I'd still like to
> get my question answered: Does this really matter for the code
> below?
> 

No, it doesn't, but if it's not true then an incorrect chunk size was chosen above.
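For reference, what presumably follows the ASSERT() in the loop (a
sketch reconstructed from the quoted context, not the patch itself) is
choosing a fresh chunk and re-translating for the next page, so a
non-zero page offset here would mean the first chunk was mis-sized:

    chunk = min_t(unsigned int, size, PAGE_SIZE);
    rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
                                hvmemul_ctxt);
    if ( rc != X86EMUL_OKAY )
        return rc;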

  Paul

> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t and sizes to unsigned int
  2015-07-09 16:10     ` Paul Durrant
@ 2015-07-09 16:20       ` Jan Beulich
  2015-07-09 16:23         ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 16:20 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir (Xen.org), xen-devel

>>> On 09.07.15 at 18:10, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 16:24
>> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
>> > Building on the previous patch, this patch restricts portio port numbers
>> > to uint16_t in registration/relocate calls. In portio_action_t the port
>> > number is changed to unsigned int though to avoid the compiler generating
>> > 16-bit operations unnecessarily. The patch also changes I/O sizes to
>> > unsigned int which then allows the io_handler size field to reduce to
>> > an unsigned int.
>> >
>> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> > Cc: Keir Fraser <keir@xen.org>
>> > Cc: Jan Beulich <jbeulich@suse.com>
>> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> > ---
>> >
>> > v7:
>> > - Change port type in portio_action_t to unsigned int as requested
>> >   by Jan
>> 
>> Yet title and description were left in place, and ...
> 
> The title remains. The description was modified:
> 
> " In portio_action_t the port number is change to unsigned int though to 
> avoid the compiler generating 16-bit operations unnecessarily."

Maybe I'm just being confused by the two "and"-s in the title,
which I assumed can't both be meant to be "and", or else you'd
have written "port numbers, uint16_t, and sizes". Plus it seems
unclear what "restrict a uint16_t to unsigned int" actually means.

>> > @@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
>> >  int hvm_buffered_io_send(ioreq_t *p);
>> >
>> >  static inline void register_portio_handler(
>> > -    struct domain *d, unsigned long addr,
>> > -    unsigned long size, portio_action_t action)
>> > +    struct domain *d, uint16_t port, unsigned int size,
>> > +    portio_action_t action)
>> >  {
>> > -    register_io_handler(d, addr, size, action, HVM_PORTIO);
>> > +    register_io_handler(d, port, size, action, HVM_PORTIO);
>> >  }
>> >
>> >  static inline void relocate_portio_handler(
>> > -    struct domain *d, unsigned long old_addr, unsigned long new_addr,
>> > -    unsigned long size)
>> > +    struct domain *d, uint16_t old_port, uint16_t new_port,
>> > +    unsigned int size)
>> >  {
>> > -    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
>> > +    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
>> >  }
>> 
>> ... these still use uint16_t. I'm pretty sure I gave my comment in a
>> way indicating that this should generally change, perhaps just at
>> the example of portio_action_t.
> 
> Why? Do we not want to restrict to uint16_t at the interface level?

Quite the inverse - why would we want to? This just makes the
calling sequence less efficient (due to the needed operand size
overrides), and hides mistakes in callers that fail to truncate
properly when reading guest registers.
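In other words, presumably something like this (a sketch of the shape
being asked for, not the actual follow-up patch):

    static inline void register_portio_handler(
        struct domain *d, unsigned int port, unsigned int size,
        portio_action_t action)
    {
        register_io_handler(d, port, size, action, HVM_PORTIO);
    }

with relocate_portio_handler() treated the same way, so that any caller
failing to truncate a guest register stays visible to the handler.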

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09 16:12     ` Paul Durrant
@ 2015-07-09 16:21       ` Jan Beulich
  2015-07-09 16:24         ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 16:21 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir (Xen.org), xen-devel

>>> On 09.07.15 at 18:12, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 16:33
>> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
>> > @@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr,
>> uint32_t val)
>> >      }
>> >  }
>> >
>> > -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
>> > +static int stdvga_mem_write(const struct hvm_io_handler *handler,
>> > +                            uint64_t addr, uint32_t size,
>> > +                            uint64_t data)
>> >  {
>> > +    struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
>> > +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
>> > +                  .addr = addr,
>> > +                  .size = size,
>> > +                  .count = 1,
>> > +                  .dir = IOREQ_WRITE,
>> > +                  .data = data,
>> > +                };
>> 
>> Indentation (still - I know I pointed this out on v6, just perhaps at
>> another example). See e.g. the context of the earlier change to the
>> beginning of hvm_mmio_internal() in this patch for how this should
>> look like.
> 
> It has to be something my emacs is doing then; I can't see any brokenness.

Indentation simply is too deep. See the good example I pointed
you at.
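Presumably something along these lines (a sketch of the usual Xen
layout; the hvm_mmio_internal() context referred to above is the
authoritative example):

    ioreq_t p = {
        .type = IOREQ_TYPE_COPY,
        .addr = addr,
        .size = size,
        .count = 1,
        .dir = IOREQ_WRITE,
        .data = data,
    };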

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t and sizes to unsigned int
  2015-07-09 16:20       ` Jan Beulich
@ 2015-07-09 16:23         ` Paul Durrant
  2015-07-09 16:31           ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 17:20
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: RE: [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t
> and sizes to unsigned int
> 
> >>> On 09.07.15 at 18:10, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 09 July 2015 16:24
> >> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> >> > Building on the previous patch, this patch restricts portio port numbers
> >> > to uint16_t in registration/relocate calls. In portio_action_t the port
> >> > number is changed to unsigned int though to avoid the compiler
> generating
> >> > 16-bit operations unnecessarily. The patch also changes I/O sizes to
> >> > unsigned int which then allows the io_handler size field to reduce to
> >> > an unsigned int.
> >> >
> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> > Cc: Keir Fraser <keir@xen.org>
> >> > Cc: Jan Beulich <jbeulich@suse.com>
> >> > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> > ---
> >> >
> >> > v7:
> >> > - Change port type in portio_action_t to unsigned int as requested
> >> >   by Jan
> >>
> >> Yet title and description were left in place, and ...
> >
> > The title remains. The description was modified:
> >
> > " In portio_action_t the port number is change to unsigned int though to
> > avoid the compiler generating 16-bit operations unnecessarily."
> 
> Maybe I'm just being confused by the two "and"-s in the title,
> which I assumed can't both be meant to be "and", or else you'd
> have written "port numbers, uint16_t, and sizes". Plus it seems
> unclear what "restrict a uint16_t to unsigned int" actually means.
> 
> >> > @@ -96,17 +96,17 @@ int hvm_mmio_intercept(ioreq_t *p);
> >> >  int hvm_buffered_io_send(ioreq_t *p);
> >> >
> >> >  static inline void register_portio_handler(
> >> > -    struct domain *d, unsigned long addr,
> >> > -    unsigned long size, portio_action_t action)
> >> > +    struct domain *d, uint16_t port, unsigned int size,
> >> > +    portio_action_t action)
> >> >  {
> >> > -    register_io_handler(d, addr, size, action, HVM_PORTIO);
> >> > +    register_io_handler(d, port, size, action, HVM_PORTIO);
> >> >  }
> >> >
> >> >  static inline void relocate_portio_handler(
> >> > -    struct domain *d, unsigned long old_addr, unsigned long new_addr,
> >> > -    unsigned long size)
> >> > +    struct domain *d, uint16_t old_port, uint16_t new_port,
> >> > +    unsigned int size)
> >> >  {
> >> > -    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
> >> > +    relocate_io_handler(d, old_port, new_port, size, HVM_PORTIO);
> >> >  }
> >>
> >> ... these still use uint16_t. I'm pretty sure I gave my comment in a
> >> way indicating that this should generally change, perhaps just at
> >> the example of portio_action_t.
> >
> > Why? Do we not want to restrict to uint16_t at the interface level?
> 
> Quite the inverse - why would we want to? This just makes the
> calling sequence less efficient (due to the needed operand size
> overrides), and hides mistakes in callers that fail to truncate
> properly when reading guest registers.
> 

I would have thought that the compiler would point out the overflow if a caller tried to pass >64k in the port number, but I'll get rid of the remaining uint16_ts.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-07-09 16:21       ` Jan Beulich
@ 2015-07-09 16:24         ` Paul Durrant
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 17:22
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: RE: [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with
> standard mmio intercept
> 
> >>> On 09.07.15 at 18:12, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 09 July 2015 16:33
> >> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> >> > @@ -424,8 +426,22 @@ static void stdvga_mem_writeb(uint64_t addr,
> >> uint32_t val)
> >> >      }
> >> >  }
> >> >
> >> > -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t
> size)
> >> > +static int stdvga_mem_write(const struct hvm_io_handler *handler,
> >> > +                            uint64_t addr, uint32_t size,
> >> > +                            uint64_t data)
> >> >  {
> >> > +    struct hvm_hw_stdvga *s = &current->domain-
> >arch.hvm_domain.stdvga;
> >> > +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> >> > +                  .addr = addr,
> >> > +                  .size = size,
> >> > +                  .count = 1,
> >> > +                  .dir = IOREQ_WRITE,
> >> > +                  .data = data,
> >> > +                };
> >>
> >> Indentation (still - I know I pointed this out on v6, just perhaps at
> >> another example). See e.g. the context of the earlier change to the
> >> beginning of hvm_mmio_internal() in this patch for how this should
> >> look like.
> >
> > It has to be something my emacs is doing then; I can't see any brokenness.
> 
> Indentation simply is too deep. See the good example I pointed
> you at.
> 

Yes, I'll look... and maybe see what it looks like with vi too.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09 16:16     ` Paul Durrant
@ 2015-07-09 16:24       ` Jan Beulich
  2015-07-09 16:27         ` Paul Durrant
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 16:24 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir (Xen.org), xen-devel

>>> On 09.07.15 at 18:16, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 09 July 2015 16:13
>> To: Paul Durrant
>> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
>> Subject: Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded
>> 'chunking' loops
>> 
>> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
>> > +static int hvmemul_linear_mmio_access(
>> > +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
>> > +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
>> known_gpfn)
>> > +{
>> > +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
>> > +    unsigned long offset = gla & ~PAGE_MASK;
>> > +    unsigned int chunk;
>> > +    paddr_t gpa;
>> > +    unsigned long one_rep = 1;
>> > +    int rc;
>> > +
>> > +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
>> > +
>> > +    if ( known_gpfn )
>> > +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
>> > +    else
>> > +    {
>> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
>> > +                                    hvmemul_ctxt);
>> > +        if ( rc != X86EMUL_OKAY )
>> > +            return rc;
>> > +    }
>> > +
>> > +    for ( ;; )
>> > +    {
>> > +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
>> > +        if ( rc != X86EMUL_OKAY )
>> > +            break;
>> > +
>> > +        gla += chunk;
>> > +        buffer += chunk;
>> > +        size -= chunk;
>> > +
>> > +        if ( size == 0 )
>> > +            break;
>> > +
>> > +        ASSERT((gla & ~PAGE_MASK) == 0);
>> 
>> While I don't mean you to re-submit another time, I'd still like to
>> get my question answered: Does this really matter for the code
>> below?
> 
> No, it doesn't, but if it's not true then an incorrect chunk size was chosen 
> above.

I suspected as much. It's then just slightly less odd than having

    x = 0;
    ASSERT(x == 0);

I guess I'll strip the ASSERT() when applying.

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops
  2015-07-09 16:24       ` Jan Beulich
@ 2015-07-09 16:27         ` Paul Durrant
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-09 16:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 09 July 2015 17:24
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> Subject: RE: [PATCH v7 01/15] x86/hvm: remove multiple open coded
> 'chunking' loops
> 
> >>> On 09.07.15 at 18:16, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 09 July 2015 16:13
> >> To: Paul Durrant
> >> Cc: xen-devel@lists.xen.org; Keir (Xen.org)
> >> Subject: Re: [PATCH v7 01/15] x86/hvm: remove multiple open coded
> >> 'chunking' loops
> >>
> >> >>> On 09.07.15 at 15:10, <paul.durrant@citrix.com> wrote:
> >> > +static int hvmemul_linear_mmio_access(
> >> > +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> >> > +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
> >> known_gpfn)
> >> > +{
> >> > +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> >> > +    unsigned long offset = gla & ~PAGE_MASK;
> >> > +    unsigned int chunk;
> >> > +    paddr_t gpa;
> >> > +    unsigned long one_rep = 1;
> >> > +    int rc;
> >> > +
> >> > +    chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
> >> > +
> >> > +    if ( known_gpfn )
> >> > +        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
> >> > +    else
> >> > +    {
> >> > +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> >> > +                                    hvmemul_ctxt);
> >> > +        if ( rc != X86EMUL_OKAY )
> >> > +            return rc;
> >> > +    }
> >> > +
> >> > +    for ( ;; )
> >> > +    {
> >> > +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
> >> > +        if ( rc != X86EMUL_OKAY )
> >> > +            break;
> >> > +
> >> > +        gla += chunk;
> >> > +        buffer += chunk;
> >> > +        size -= chunk;
> >> > +
> >> > +        if ( size == 0 )
> >> > +            break;
> >> > +
> >> > +        ASSERT((gla & ~PAGE_MASK) == 0);
> >>
> >> While I don't mean you to re-submit another time, I'd still like to
> >> get my question answered: Does this really matter for the code
> >> below?
> >
> > No, it doesn't, but if it's not true then an incorrect chunk size was chosen
> > above.
> 
> I suspected as much. It's then just slightly less odd than having
> 
>     x = 0;
>     ASSERT(x == 0);
> 

Well, not quite since the lines aren't adjacent in this case.

> I guess I'll strip the ASSERT() when applying.
> 

Ok. That's fine.

  Paul

> Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t and sizes to unsigned int
  2015-07-09 16:23         ` Paul Durrant
@ 2015-07-09 16:31           ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-09 16:31 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Keir (Xen.org), xen-devel

>>> On 09.07.15 at 18:23, <Paul.Durrant@citrix.com> wrote:
> I would have thought that the compiler would point out the overflow if a 
> caller tried to pass >64k in the port number,

That would at best happen when passing literal numbers.
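A hypothetical illustration (not Xen code) of why variables slip
through where literals might not:

    static void reg(uint16_t port);

    reg(0x10004); /* literal: a compiler may diagnose the overflow */
    reg(addr);    /* variable: silently truncated, no diagnostic   */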

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10 11:00             ` Jan Beulich
@ 2015-07-09 19:32               ` Zhi Wang
  2015-07-10 11:46                 ` Jan Beulich
  2015-07-10 11:49               ` Fabio Fantoni
  1 sibling, 1 reply; 42+ messages in thread
From: Zhi Wang @ 2015-07-09 19:32 UTC (permalink / raw)
  To: Jan Beulich, Fabio Fantoni; +Cc: Andrew Cooper, Paul Durrant, xen-devel

Hi Gurus:
     We found that the MOVD instruction is used by some Windows drivers 
while developing XenGT, and we also found this one:

(XEN) MMIO emulation failed: d7v1 64bit @ 0010:fffff8000294e273 -> 66 0f e7 00 48 83 c0 10 45 3b cb 73 f0 45 85 c9

If anyone can help us upstream this one into the instruction emulator, 
plus the MOVD instruction emulation, that would be nice. :)
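For reference, the first faulting instruction in that log decodes as
follows (my decoding, worth double-checking):

    66 0f e7 00    movntdq %xmm0,(%rax)   /* 0f e7 with 66 prefix = MOVNTDQ */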

Thanks,
Zhi.

On 07/10/15 19:00, Jan Beulich wrote:
>>>> On 10.07.15 at 12:51, <fabio.fantoni@m2r.biz> wrote:
>> On 10/07/2015 12:20, Jan Beulich wrote:
>>> Other than MOVD, MOVAPS is already being supported by the
>>> insn emulator.
>>>
>> Then why do you think MOVAPS fails in my test?
>
> No idea. That's what Paul asked you to narrow down.
>
>> SSE2 was introduced in CPUs 11 years ago, so I think it would be useful
>> to add support for all the missing instructions to avoid more cases like
>> this (and it seems to improve performance too, comparing the same task
>> executed with/without SSE2).
>
> The reason we didn't add any more of the SSE insns so far is that
> we don't expect them to be used for accessing MMIO. Once we
> learn they're needed, we'll add emulation for them, but you
> realize this is a significant task?
>
> Jan
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (14 preceding siblings ...)
  2015-07-09 13:10 ` [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
@ 2015-07-10  9:27 ` Fabio Fantoni
  2015-07-10  9:31   ` Paul Durrant
  15 siblings, 1 reply; 42+ messages in thread
From: Fabio Fantoni @ 2015-07-10  9:27 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Andrew Cooper, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 2357 bytes --]

On 09/07/2015 15:10, Paul Durrant wrote:
> This patch series re-works much of the code involved in emulation of port
> and memory mapped I/O for HVM guests.
>
> The code has become very convoluted and, at least by inspection, certain
> emulations will apparently malfunction.
>
> The series is broken down into 15 patches (which are also available in
> my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
> on the emulation34 branch).

Yesterday I retried with this version, and it seems you fixed something 
that makes it at least possible to debug in the domU.

I took gdb data of the X crash inside a Sid HVM domU:
#0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>, 
src_stride=<optimized out>, dst_stride=<optimized out>, 
src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0, 
dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>, 
imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773

Last instruction:
=> 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)

Full log in attachment.

If you need more information/tests, tell me and I'll post them.

Thanks for any reply, and sorry for my bad English.

>
> Previous changelog
> ------------------
>
> v4:
>   - Removed previous patch (make sure translated MMIO reads or
>     writes fall within a page) and rebased rest of series.
>   - Address Jan's comments on patch #1
>
> v3:
>   - Addressed comments from Jan
>   - Re-ordered series to bring a couple of more trivial patches to the
>     front
>   - Backport to XenServer (4.5) now passing automated tests
>   - Tested on unstable with QEMU upstream and trad, with and without
>     HAP (to force shadow emulation)
>
> v2:
>   - Removed bogus assertion from patch #15
>   - Re-worked patch #17 after basic testing of back-port onto XenServer
>
> Subsequent changes are logged in the individual patch files (thanks
> to David Vrabel for that).
>
> Testing
> -------
>
> v6 of the series has been back-ported to staging-4.5 and then dropped
> onto the XenServer (Dundee) patch queue. All automated branch-safety tests
> pass.
>
> v7 has just been compile tested since changes were largely cosmetic. It
> will be back-ported in the near future.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


[-- Attachment #2: X-gdb.log --]
[-- Type: text/plain, Size: 11379 bytes --]

Full backtrace:
#0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>, src_stride=<optimized out>, dst_stride=<optimized out>, src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0, dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>, imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
        w = 4160
        s = 0x7f702985e530 "\300\300\300"
        d = 0x7f70226d3fd0 "\300\300\300"
        src_bytes = 0x7f702985d130 "\277\277\277\261\277\277\277\260\277\277\277\257\277\277\277\256\277\277\277\257\277\277\277\261\300\300\300\261\300\300\300\260\277\277\277\261\277\277\277\261\277\277\277\260\277\277\277\257\277\277\277\262\277\277\277\261\277\277\277\262\300\300\300\261\300\300\300\261\300\300\300\261\277\277\277\257\277\277\277\261\277\277\277\261\277\277\277\261\277\277\277\260\300\300\300\262\277\277\277\261\277\277\277\260\277\277\277\260\277\277\277\260\277\277\277\261\277\277\277\261\277\277\277\261\277\277\277\261\277\277\277\261\277\277\277\260\277\277\277\261\277\277\277\261\277\277\277\261\277\277\277\260\277\277\277\261\277\277\277\261\277\277\277\261\277\277\277\260\277\277\277\262\277\277\277\261\277\277\277\261\277\277\277\260\277\277\277\261\277\277\277
 \261\277\277\277\261\277\277\277\261"...
        dst_bytes = <optimized out>
        byte_width = 4096
#1  0x00007f70360efa39 in sse2_blt (height=768, width=1024, dest_y=0, dest_x=<optimized out>, src_y=<optimized out>, src_x=<optimized out>, dst_bpp=<optimized out>, src_bpp=<optimized out>, dst_stride=<optimized out>, src_stride=<optimized out>, dst_bits=<optimized out>, src_bits=<optimized out>, imp=<optimized out>) at ../../pixman/pixman-sse2.c:4822
No locals.
#2  sse2_composite_copy_area (imp=<optimized out>, info=<optimized out>) at ../../pixman/pixman-sse2.c:4815
        op = <optimized out>
        src_image = <optimized out>
        mask_image = <optimized out>
        dest_image = <optimized out>
        src_x = <optimized out>
        src_y = <optimized out>
        mask_x = <optimized out>
        mask_y = <optimized out>
        dest_x = <optimized out>
        dest_y = 0
        width = 1024
        height = 768
#3  0x00007f7036067711 in pixman_image_composite32 (op=op@entry=PIXMAN_OP_SRC, src=<optimized out>, mask=mask@entry=0x0, dest=<optimized out>, src_x=src_x@entry=0, src_y=src_y@entry=0, mask_x=0, mask_y=0, dest_x=0, dest_y=0, width=1024, height=768) at ../../pixman/pixman.c:707
        src_format = <optimized out>
        mask_format = 0
        dest_format = PIXMAN_a8r8g8b8
        region = {extents = {x1 = 0, y1 = 0, x2 = 1024, y2 = 768}, data = 0x0}
        extents = {x1 = 0, y1 = 0, x2 = 1024, y2 = 768}
        imp = 0x7f7038e848a0
        func = 0x7f70360ef9d0 <sse2_composite_copy_area>
        info = {op = PIXMAN_OP_SRC, src_image = 0x7f7039267e90, mask_image = 0x0, dest_image = 0x7f70392603d0, src_x = 0, src_y = 0, mask_x = 0, mask_y = 0, dest_x = 0, dest_y = 0, width = 1024, height = 768, src_flags = 42420863, mask_flags = 8194, dest_flags = 34032255}
        pbox = 0x7fff9ba9e060
        n = 0
#4  0x00007f70360677e3 in pixman_image_composite (op=op@entry=PIXMAN_OP_SRC, src=<optimized out>, mask=mask@entry=0x0, dest=<optimized out>, src_x=src_x@entry=0, src_y=src_y@entry=0, mask_x=<optimized out>, mask_y=<optimized out>, dest_x=<optimized out>, dest_y=<optimized out>, width=<optimized out>, height=<optimized out>) at ../../pixman/pixman.c:730
No locals.
#5  0x00007f7030e61ca8 in download_box_no_update (surface=0x7f7037402f88, surface=0x7f7037402f88, y2=<optimized out>, x2=<optimized out>, y1=0, x1=0) at ../../src/qxl_surface.c:133
No locals.
#6  qxl_download_box (surface=0x7f7037402f88, x1=0, y1=0, x2=<optimized out>, y2=<optimized out>) at ../../src/qxl_surface.c:150
No locals.
#7  0x00007f7030e61daf in qxl_surface_prepare_access (surface=0x7f7037402f88, pixmap=0x7f70392c3fa0, region=0x7fff9ba9e190, access=<optimized out>) at ../../src/qxl_surface.c:183
        n_boxes = <optimized out>
        boxes = 0x7fff9ba9e190
        pScreen = 0x7f7038ea9e50
        pScrn = <optimized out>
        new = {extents = {x1 = 0, y1 = 0, x2 = 1024, y2 = 768}, data = 0x0}
#8  0x00007f7030e6cd6d in uxa_prepare_access (pDrawable=0x7f70392c3fa0, region=0x7fff9ba9e200, region@entry=0x0, access=access@entry=UXA_ACCESS_RO) at ../../../src/uxa/uxa.c:172
        pScreen = <optimized out>
        xoff = 0
        yoff = 0
        pPixmap = 0x7f70392c3fa0
        box = {x1 = 0, y1 = 0, x2 = <optimized out>, y2 = <optimized out>}
        region_rec = {extents = {x1 = 0, y1 = 0, x2 = 1024, y2 = 768}, data = 0x0}
        result = 1
#9  0x00007f7030e6d0f0 in uxa_prepare_access_window (pWin=0x7f7038f2ef70) at ../../../src/uxa/uxa.c:310
No locals.
#10 0x00007f7030e6d187 in uxa_change_window_attributes (pWin=0x7f7038f2ef70, mask=1) at ../../../src/uxa/uxa.c:343
        ret = <optimized out>
        need_access = 1
#11 0x00007f70371346fd in ?? ()
No symbol table info available.
#12 0x00007f70370d5ef4 in ChangeWindowAttributes ()
No symbol table info available.
#13 0x00007f70370a22f6 in ?? ()
No symbol table info available.
#14 0x00007f70370a8117 in ?? ()
No symbol table info available.
#15 0x00007f70370ac29b in ?? ()
No symbol table info available.
#16 0x00007f7034f70b45 in __libc_start_main (main=0x7f7037096660, argc=7, argv=0x7fff9ba9e588, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff9ba9e578) at libc-start.c:287
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 1261194766413414073, 140119936427621, 140735804990848, 0, 0, -1261114930715214151, -1197746277989925191}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7f7037213d90 <__libc_csu_init>, 0x7fff9ba9e588}, data = {prev = 0x0, cleanup = 0x0, canceltype = 924925328}}}
        not_first_call = <optimized out>
#17 0x00007f703709668e in _start ()
No symbol table info available.


Registers:
rax            0x7f70221d3010	140119585402896
rbx            0x7f702985d130	140119709700400
rcx            0x7f702985e530	140119709705520
rdx            0x1000	4096
rsi            0x7f70226d4010	140119590649872
rdi            0x7f702985e530	140119709705520
rbp            0x1400	0x1400
rsp            0x7fff9ba9df88	0x7fff9ba9df88
r8             0x1000	4096
r9             0x7f70221d3010	140119585402896
r10            0x300	768
r11            0x1000	4096
r12            0xffffffffffffec00	-5120
r13            0x0	0
r14            0x0	0
r15            0x0	0
rip            0x7f70360ef8eb	0x7f70360ef8eb <sse2_blt+459>
eflags         0x13202	[ IF #12 #13 RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0


Current instructions:
=> 0x7f70360ef8eb <sse2_blt+459>:	movaps %xmm0,-0x10(%rsi)
   0x7f70360ef8ef <sse2_blt+463>:	cmp    $0x3f,%r8d
   0x7f70360ef8f3 <sse2_blt+467>:	jg     0x7f70360ef8c0 <sse2_blt+416>
   0x7f70360ef8f5 <sse2_blt+469>:	sub    $0x40,%edx
   0x7f70360ef8f8 <sse2_blt+472>:	mov    %edx,%esi
   0x7f70360ef8fa <sse2_blt+474>:	shr    $0x6,%esi
   0x7f70360ef8fd <sse2_blt+477>:	mov    %esi,%ecx
   0x7f70360ef8ff <sse2_blt+479>:	shl    $0x6,%esi
   0x7f70360ef902 <sse2_blt+482>:	add    $0x1,%rcx
   0x7f70360ef906 <sse2_blt+486>:	sub    %esi,%edx
   0x7f70360ef908 <sse2_blt+488>:	shl    $0x6,%rcx
   0x7f70360ef90c <sse2_blt+492>:	add    %rcx,%rdi
   0x7f70360ef90f <sse2_blt+495>:	add    %rcx,%rax
   0x7f70360ef912 <sse2_blt+498>:	jmpq   0x7f70360ef7cf <sse2_blt+175>
   0x7f70360ef917 <sse2_blt+503>:	lea    (%rdx,%rdx,1),%eax
   0x7f70360ef91a <sse2_blt+506>:	movslq %r9d,%r9


Threads backtrace:

Thread 3 (Thread 0x7f7023f72700 (LWP 2260)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7027bfa463 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#2  0x00007f7027bf9cc7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#3  0x00007f70347160a4 in start_thread (arg=0x7f7023f72700) at pthread_create.c:309
#4  0x00007f703503507d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7f7024773700 (LWP 2259)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f7027bfa463 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#2  0x00007f7027bf9cc7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#3  0x00007f70347160a4 in start_thread (arg=0x7f7024773700) at pthread_create.c:309
#4  0x00007f703503507d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f7037451980 (LWP 2255)):
#0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>, src_stride=<optimized out>, dst_stride=<optimized out>, src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0, dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>, imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
#1  0x00007f70360efa39 in sse2_blt (height=768, width=1024, dest_y=0, dest_x=<optimized out>, src_y=<optimized out>, src_x=<optimized out>, dst_bpp=<optimized out>, src_bpp=<optimized out>, dst_stride=<optimized out>, src_stride=<optimized out>, dst_bits=<optimized out>, src_bits=<optimized out>, imp=<optimized out>) at ../../pixman/pixman-sse2.c:4822
#2  sse2_composite_copy_area (imp=<optimized out>, info=<optimized out>) at ../../pixman/pixman-sse2.c:4815
#3  0x00007f7036067711 in pixman_image_composite32 (op=op@entry=PIXMAN_OP_SRC, src=<optimized out>, mask=mask@entry=0x0, dest=<optimized out>, src_x=src_x@entry=0, src_y=src_y@entry=0, mask_x=0, mask_y=0, dest_x=0, dest_y=0, width=1024, height=768) at ../../pixman/pixman.c:707
#4  0x00007f70360677e3 in pixman_image_composite (op=op@entry=PIXMAN_OP_SRC, src=<optimized out>, mask=mask@entry=0x0, dest=<optimized out>, src_x=src_x@entry=0, src_y=src_y@entry=0, mask_x=<optimized out>, mask_y=<optimized out>, dest_x=<optimized out>, dest_y=<optimized out>, width=<optimized out>, height=<optimized out>) at ../../pixman/pixman.c:730
#5  0x00007f7030e61ca8 in download_box_no_update (surface=0x7f7037402f88, surface=0x7f7037402f88, y2=<optimized out>, x2=<optimized out>, y1=0, x1=0) at ../../src/qxl_surface.c:133
#6  qxl_download_box (surface=0x7f7037402f88, x1=0, y1=0, x2=<optimized out>, y2=<optimized out>) at ../../src/qxl_surface.c:150
#7  0x00007f7030e61daf in qxl_surface_prepare_access (surface=0x7f7037402f88, pixmap=0x7f70392c3fa0, region=0x7fff9ba9e190, access=<optimized out>) at ../../src/qxl_surface.c:183
#8  0x00007f7030e6cd6d in uxa_prepare_access (pDrawable=0x7f70392c3fa0, region=0x7fff9ba9e200, region@entry=0x0, access=access@entry=UXA_ACCESS_RO) at ../../../src/uxa/uxa.c:172
#9  0x00007f7030e6d0f0 in uxa_prepare_access_window (pWin=0x7f7038f2ef70) at ../../../src/uxa/uxa.c:310
#10 0x00007f7030e6d187 in uxa_change_window_attributes (pWin=0x7f7038f2ef70, mask=1) at ../../../src/uxa/uxa.c:343
#11 0x00007f70371346fd in ?? ()
#12 0x00007f70370d5ef4 in ChangeWindowAttributes ()
#13 0x00007f70370a22f6 in ?? ()
#14 0x00007f70370a8117 in ?? ()
#15 0x00007f70370ac29b in ?? ()
#16 0x00007f7034f70b45 in __libc_start_main (main=0x7f7037096660, argc=7, argv=0x7fff9ba9e588, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff9ba9e578) at libc-start.c:287
#17 0x00007f703709668e in _start ()

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10  9:27 ` [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment Fabio Fantoni
@ 2015-07-10  9:31   ` Paul Durrant
  2015-07-10  9:54     ` Fabio Fantoni
  0 siblings, 1 reply; 42+ messages in thread
From: Paul Durrant @ 2015-07-10  9:31 UTC (permalink / raw)
  To: Fabio Fantoni, xen-devel; +Cc: Andrew Cooper, Jan Beulich

> -----Original Message-----
> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
> Sent: 10 July 2015 10:28
> To: Paul Durrant; xen-devel@lists.xen.org
> Cc: Jan Beulich; Andrew Cooper
> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation cleanup
> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
> attachment
> 
> On 09/07/2015 15:10, Paul Durrant wrote:
> > This patch series re-works much of the code involved in emulation of port
> > and memory mapped I/O for HVM guests.
> >
> > The code has become very convoluted and, at least by inspection, certain
> > emulations will apparently malfunction.
> >
> > The series is broken down into 15 patches (which are also available in
> > my xenbits repo:
> http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
> > on the emulation34 branch).
> 
> Yesterday I retried with this version, and it seems you fixed something
> that makes it at least possible to debug in the domU.
> 
> I took gdb data of the X crash inside a Sid HVM domU:
> #0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>,
> src_stride=<optimized out>, dst_stride=<optimized out>,
> src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0,
> dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>,
> imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
> 
> Last instruction:
> => 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)
> 
> Full log in attachment.
> 
> If you need more information/tests, tell me and I'll post them.
> 

I imagine you got a GP fault due to handle_mmio() returning X86EMUL_UNHANDLEABLE, but that's only a guess. I suggest you try to instrument Xen a little to find out why.
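For example, logging the return code at the point where emulation gives
up would narrow it down. A sketch only; the exact placement and the
context names (ctxt, insn_buf_eip) are assumptions:

    rc = hvm_emulate_one(&ctxt);
    if ( rc != X86EMUL_OKAY )
        gdprintk(XENLOG_WARNING, "emulation rc %d for insn @ %lx\n",
                 rc, ctxt.insn_buf_eip);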

  Paul

> Thanks for any reply, and sorry for my bad English.
> 
> >
> > Previous changelog
> > ------------------
> >
> > v4:
> >   - Removed previous patch (make sure translated MMIO reads or
> >     writes fall within a page) and rebased rest of series.
> >   - Address Jan's comments on patch #1
> >
> > v3:
> >   - Addressed comments from Jan
> >   - Re-ordered series to bring a couple of more trivial patches to the
> >     front
> >   - Backport to XenServer (4.5) now passing automated tests
> >   - Tested on unstable with QEMU upstream and trad, with and without
> >     HAP (to force shadow emulation)
> >
> > v2:
> >   - Removed bogus assertion from patch #15
> >   - Re-worked patch #17 after basic testing of back-port onto XenServer
> >
> > Subsequent changes are logged in the individual patch files (thanks
> > to David Vrabel for that).
> >
> > Testing
> > -------
> >
> > v6 of the series has been back-ported to staging-4.5 and then dropped
> > onto the XenServer (Dundee) patch queue. All automated branch-safety
> tests
> > pass.
> >
> > v7 has just been compile tested since changes were largely cosmetic. It
> > will be back-ported in the near future.
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10  9:27 ` [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment Fabio Fantoni
@ 2015-07-10  9:54     ` Fabio Fantoni
  2015-07-10 10:09       ` Fabio Fantoni
  0 siblings, 1 reply; 42+ messages in thread
From: Fabio Fantoni @ 2015-07-10  9:54 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Andrew Cooper, Jan Beulich

On 10/07/2015 11:31, Paul Durrant wrote:
>> -----Original Message-----
>> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
>> Sent: 10 July 2015 10:28
>> To: Paul Durrant; xen-devel@lists.xen.org
>> Cc: Jan Beulich; Andrew Cooper
>> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation cleanup
>> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
>> attachment
>>
>> On 09/07/2015 15:10, Paul Durrant wrote:
>>> This patch series re-works much of the code involved in emulation of port
>>> and memory mapped I/O for HVM guests.
>>>
>>> The code has become very convoluted and, at least by inspection, certain
>>> emulations will apparently malfunction.
>>>
>>> The series is broken down into 15 patches (which are also available in
>>> my xenbits repo:
>> http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
>>> on the emulation34 branch).
>> Yesterday I retried with this version, and it seems you fixed something
>> that makes it at least possible to debug in the domU.
>>
>> I took gdb data of the X crash inside a Sid HVM domU:
>> #0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>,
>> src_stride=<optimized out>, dst_stride=<optimized out>,
>> src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0,
>> dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>,
>> imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
>>
>> Last instruction:
>> => 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)
>>
>> Full log in attachment.
>>
>> If you need more information/tests, tell me and I'll post them.
>>
> I imagine you got a GP fault due to handle_mmio() returning X86EMUL_UNHANDLEABLE, but that's only a guess.
> I suggest you try to instrument Xen a little to find out why.
Thanks for the reply; sorry, but I don't understand exactly what I 
should do. Can you give more detail, please?
>
>    Paul
>
>> Thanks for any reply, and sorry for my bad English.
>>
>>> Previous changelog
>>> ------------------
>>>
>>> v4:
>>>    - Removed previous patch (make sure translated MMIO reads or
>>>      writes fall within a page) and rebased rest of series.
>>>    - Address Jan's comments on patch #1
>>>
>>> v3:
>>>    - Addressed comments from Jan
>>>    - Re-ordered series to bring a couple of more trivial patches to the
>>>      front
>>>    - Backport to XenServer (4.5) now passing automated tests
>>>    - Tested on unstable with QEMU upstream and trad, with and without
>>>      HAP (to force shadow emulation)
>>>
>>> v2:
>>>    - Removed bogus assertion from patch #15
>>>    - Re-worked patch #17 after basic testing of back-port onto XenServer
>>>
>>> Subsequent changes are logged in the individual patch files (thanks
>>> to David Vrabel for that).
>>>
>>> Testing
>>> -------
>>>
>>> v6 of the series has been back-ported to staging-4.5 and then dropped
>>> onto the XenServer (Dundee) patch queue. All automated branch-safety
>> tests
>>> pass.
>>>
>>> v7 has just been compile tested since changes were largely cosmetic. It
>>> will be back-ported in the near future.
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10  9:54     ` Fabio Fantoni
@ 2015-07-10 10:09       ` Fabio Fantoni
  2015-07-10 10:13         ` Paul Durrant
  2015-07-10 10:20         ` Jan Beulich
  0 siblings, 2 replies; 42+ messages in thread
From: Fabio Fantoni @ 2015-07-10 10:09 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Andrew Cooper, Jan Beulich

On 10/07/2015 11:54, Fabio Fantoni wrote:
> On 10/07/2015 11:31, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
>>> Sent: 10 July 2015 10:28
>>> To: Paul Durrant; xen-devel@lists.xen.org
>>> Cc: Jan Beulich; Andrew Cooper
>>> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation 
>>> cleanup
>>> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
>>> attachment
>>>
>>> On 09/07/2015 15:10, Paul Durrant wrote:
>>>> This patch series re-works much of the code involved in emulation 
>>>> of port
>>>> and memory mapped I/O for HVM guests.
>>>>
>>>> The code has become very convoluted and, at least by inspection, 
>>>> certain
>>>> emulations will apparently malfunction.
>>>>
>>>> The series is broken down into 15 patches (which are also available in
>>>> my xenbits repo:
>>> http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
>>>> on the emulation34 branch).
>>> Yesterday I retried with this version, and it seems you fixed
>>> something that makes it at least possible to debug in the domU.
>>>
>>> I took gdb data of the X crash inside a Sid HVM domU:
>>> #0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>,
>>> src_stride=<optimized out>, dst_stride=<optimized out>,
>>> src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0,
>>> dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>,
>>> imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
>>>
>>> Last instruction:
>>> => 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)
>>>
>>> Full log in attachment.
>>>
>>> If you need more information/tests, tell me and I'll post them.
>>>
>> I imagine you got a GP fault due to handle_mmio() returning 
>> X86EMUL_UNHANDLEABLE, but that's only a guess.
>> I suggest you try to instrument Xen a little to find out why.
> Thanks for the reply; sorry, but I don't understand exactly what I 
> should do. Can you give more detail, please?

I took a look at xen/arch/x86/x86_emulate/x86_emulate.c, and the 
instruction looks like the other one found by the XenGT developers:
https://github.com/01org/XenGT-Preview-xen/commit/f2bad31f80f698a452c37cb39841da8e4f69350f
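For comparison, the instruction from this X crash should encode as
follows (my encoding, worth double-checking):

    0f 29 46 f0    movaps %xmm0,-0x10(%rsi)   /* 0f 29 = MOVAPS store form */

i.e. a different opcode from the 0f e7 (MOVNTDQ) that commit adds.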

>>
>>    Paul
>>
>>> Thanks for any reply, and sorry for my bad English.
>>>
>>>> Previous changelog
>>>> ------------------
>>>>
>>>> v4:
>>>>    - Removed previous patch (make sure translated MMIO reads or
>>>>      writes fall within a page) and rebased rest of series.
>>>>    - Address Jan's comments on patch #1
>>>>
>>>> v3:
>>>>    - Addressed comments from Jan
>>>>    - Re-ordered series to bring a couple of more trivial patches to 
>>>> the
>>>>      front
>>>>    - Backport to XenServer (4.5) now passing automated tests
>>>>    - Tested on unstable with QEMU upstream and trad, with and without
>>>>      HAP (to force shadow emulation)
>>>>
>>>> v2:
>>>>    - Removed bogus assertion from patch #15
>>>>    - Re-worked patch #17 after basic testing of back-port onto 
>>>> XenServer
>>>>
>>>> Subsequent changes are logged in the individual patch files (thanks
>>>> to David Vrabel for that).
>>>>
>>>> Testing
>>>> -------
>>>>
>>>> v6 of the series has been back-ported to staging-4.5 and then dropped
>>>> onto the XenServer (Dundee) patch queue. All automated branch-safety
>>> tests
>>>> pass.
>>>>
>>>> v7 has just been compile tested since changes were largely 
>>>> cosmetic. It
>>>> will be back-ported in the near future.
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xen.org
>>>> http://lists.xen.org/xen-devel
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10 10:09       ` Fabio Fantoni
@ 2015-07-10 10:13         ` Paul Durrant
  2015-07-10 10:20         ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2015-07-10 10:13 UTC (permalink / raw)
  To: Fabio Fantoni, xen-devel; +Cc: Andrew Cooper, Jan Beulich

> -----Original Message-----
> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
> Sent: 10 July 2015 11:10
> To: Paul Durrant; xen-devel@lists.xen.org
> Cc: Jan Beulich; Andrew Cooper
> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation cleanup
> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
> attachment
> 
> On 10/07/2015 11:54, Fabio Fantoni wrote:
> > On 10/07/2015 11:31, Paul Durrant wrote:
> >>> -----Original Message-----
> >>> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
> >>> Sent: 10 July 2015 10:28
> >>> To: Paul Durrant; xen-devel@lists.xen.org
> >>> Cc: Jan Beulich; Andrew Cooper
> >>> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation
> >>> cleanup
> >>> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
> >>> attachment
> >>>
> >>> On 09/07/2015 15:10, Paul Durrant wrote:
> >>>> This patch series re-works much of the code involved in emulation
> >>>> of port
> >>>> and memory mapped I/O for HVM guests.
> >>>>
> >>>> The code has become very convoluted and, at least by inspection,
> >>>> certain
> >>>> emulations will apparently malfunction.
> >>>>
> >>>> The series is broken down into 15 patches (which are also available in
> >>>> my xenbits repo:
> >>> http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
> >>>> on the emulation34 branch).
> >>> Yesterday I retried with this version, and it seems you fixed
> >>> something that makes it at least possible to debug in the domU.
> >>>
> >>> I took gdb data of the X crash inside a Sid HVM domU:
> >>> #0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>,
> >>> src_stride=<optimized out>, dst_stride=<optimized out>,
> >>> src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0,
> >>> dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized
> out>,
> >>> imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
> >>>
> >>> Last instruction:
> >>> => 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)
> >>>
> >>> Full log in attachment.
> >>>
> >>> If you need more information/tests, tell me and I'll post them.
> >>>
> >> I imagine you got a GP fault due to handle_mmio() returning
> >> X86EMUL_UNHANDLEABLE, but that's only a guess.
> >> I suggest you try to instrument Xen a little to find out why.
> > Thanks for the reply; sorry, but I don't understand exactly what I
> > should do. Can you give more detail, please?
> 
> I took a look at xen/arch/x86/x86_emulate/x86_emulate.c and the
> instruction seems like this other one found by the XenGT developers:
> https://github.com/01org/XenGT-Preview-
> xen/commit/f2bad31f80f698a452c37cb39841da8e4f69350f
> 

Sounds like that patch is worth a try then.

  Paul
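
For anyone wanting to act on the suggestion to "instrument Xen a
little", a minimal sketch of what that could look like, assuming the
Xen 4.5-era hvmemul_do_io() in xen/arch/x86/hvm/emulate.c (the exact
spot worth logging depends on where the failure actually originates):

    /* Illustrative only: log the point where emulation gives up.
     * 'rc', 'addr', 'dir' and 'size' are existing hvmemul_do_io()
     * locals/parameters in this era of the code. */
    if ( rc == X86EMUL_UNHANDLEABLE )
        gdprintk(XENLOG_WARNING,
                 "I/O emulation gave up: addr=%"PRIpaddr" dir=%d size=%d\n",
                 addr, dir, size);

Rebuilding Xen with such a printk and reproducing the crash should show
on the console which access the emulator is rejecting.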


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10 10:09       ` Fabio Fantoni
  2015-07-10 10:13         ` Paul Durrant
@ 2015-07-10 10:20         ` Jan Beulich
  2015-07-10 10:51           ` Fabio Fantoni
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-10 10:20 UTC (permalink / raw)
  To: Paul Durrant, Fabio Fantoni; +Cc: Andrew Cooper, xen-devel

>>> On 10.07.15 at 12:09, <fabio.fantoni@m2r.biz> wrote:
>> On 10/07/2015 11:54, Fabio Fantoni wrote:
>>> On 10/07/2015 11:31, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
>>>> Sent: 10 July 2015 10:28
>>>> To: Paul Durrant; xen-devel@lists.xen.org 
>>>> Cc: Jan Beulich; Andrew Cooper
>>>> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation 
>>>> cleanup
>>>> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
>>>> attachment
>>>>
>>>> On 09/07/2015 15:10, Paul Durrant wrote:
>>>>> This patch series re-works much of the code involved in emulation 
>>>>> of port
>>>>> and memory mapped I/O for HVM guests.
>>>>>
>>>>> The code has become very convoluted and, at least by inspection, 
>>>>> certain
>>>>> emulations will apparently malfunction.
>>>>>
>>>>> The series is broken down into 15 patches (which are also available in
>>>>> my xenbits repo:
>>>> http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git 
>>>>> on the emulation34 branch).
>>>> Yesterday I retried with this version and it seems that you fixed
>>>> something that makes it at least possible to debug in the domU.
>>>>
>>>> I took gdb data of the X crash inside a Sid HVM domU:
>>>> #0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>,
>>>> src_stride=<optimized out>, dst_stride=<optimized out>,
>>>> src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0,
>>>> dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>,
>>>> imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
>>>>
>>>> Latest instruction:
>>>> => 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)
>>>>
>>>> Full log in attachment.
>>>>
>>>> If you need more information/tests, tell me and I'll post them.
>>>>
>>> I imagine you got a GP fault due to handle_mmio() returning 
>>> X86EMUL_UNHANDLEABLE, but that's only a guess.
>>> I suggest you try to instrument Xen a little to find out why.
>> Thanks for the reply; sorry, but I don't understand exactly what I
>> must do. Can you give more detail please?
> 
> I took a look at xen/arch/x86/x86_emulate/x86_emulate.c and the
> instruction seems like this other one found by the XenGT developers:
> https://github.com/01org/XenGT-Preview-xen/commit/f2bad31f80f698a452c37cb39841 
> da8e4f69350f

Unlike MOVD, MOVAPS is already supported by the
insn emulator.

Jan
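
One x86 detail worth keeping in mind while narrowing this down: movaps
raises #GP(0) itself when its memory operand is not 16-byte aligned, so
a fault on movaps %xmm0,-0x10(%rsi) does not by itself prove the
emulator lacks the instruction. A sketch of the rule (the helper name
is illustrative, not from the Xen sources):

    #include <stdbool.h>
    #include <stdint.h>

    /* movaps requires a 16-byte-aligned memory operand; hardware -- and
     * any emulator -- must raise #GP(0) otherwise.  'addr' stands for
     * the effective address, e.g. rsi - 0x10 in the quoted backtrace. */
    static inline bool movaps_operand_aligned(uint64_t addr)
    {
        return (addr & 0xf) == 0;
    }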

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10 10:20         ` Jan Beulich
@ 2015-07-10 10:51           ` Fabio Fantoni
  2015-07-10 11:00             ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Fabio Fantoni @ 2015-07-10 10:51 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant; +Cc: Andrew Cooper, zhi.a.wang, xen-devel

On 10/07/2015 12:20, Jan Beulich wrote:
>>>> On 10.07.15 at 12:09, <fabio.fantoni@m2r.biz> wrote:
>> On 10/07/2015 11:54, Fabio Fantoni wrote:
>>> On 10/07/2015 11:31, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
>>>>> Sent: 10 July 2015 10:28
>>>>> To: Paul Durrant; xen-devel@lists.xen.org
>>>>> Cc: Jan Beulich; Andrew Cooper
>>>>> Subject: Re: [Xen-devel] [PATCH v7 00/15] x86/hvm: I/O emulation
>>>>> cleanup
> >>>>> and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in
>>>>> attachment
>>>>>
>>>>> On 09/07/2015 15:10, Paul Durrant wrote:
>>>>>> This patch series re-works much of the code involved in emulation
>>>>>> of port
>>>>>> and memory mapped I/O for HVM guests.
>>>>>>
>>>>>> The code has become very convoluted and, at least by inspection,
>>>>>> certain
>>>>>> emulations will apparently malfunction.
>>>>>>
>>>>>> The series is broken down into 15 patches (which are also available in
>>>>>> my xenbits repo:
>>>>> http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
>>>>>> on the emulation34 branch).
>>>>> Yesterday I retried with this version and it seems that you fixed
>>>>> something that makes it at least possible to debug in the domU.
>>>>>
>>>>> I took gdb data of the X crash inside a Sid HVM domU:
>>>>> #0  sse2_blt (src_bits=<optimized out>, dst_bits=<optimized out>,
>>>>> src_stride=<optimized out>, dst_stride=<optimized out>,
>>>>> src_bpp=<optimized out>, src_x=<optimized out>, src_y=0, dest_x=0,
>>>>> dest_y=0, width=1024, height=<optimized out>, dst_bpp=<optimized out>,
>>>>> imp=<optimized out>) at ../../pixman/pixman-sse2.c:4773
>>>>>
>>>>> Latest instruction:
>>>>> => 0x7f70360ef8eb <sse2_blt+459>:    movaps %xmm0,-0x10(%rsi)
>>>>>
>>>>> Full log in attachment.
>>>>>
>>>>> If you need more information/tests, tell me and I'll post them.
>>>>>
>>>> I imagine you got a GP fault due to handle_mmio() returning
>>>> X86EMUL_UNHANDLEABLE, but that's only a guess.
>>>> I suggest you try to instrument Xen a little to find out why.
>>> Thanks for the reply; sorry, but I don't understand exactly what I
>>> must do. Can you give more detail please?
>> I took a look at xen/arch/x86/x86_emulate/x86_emulate.c and the
>> instruction seems like this other one found by the XenGT developers:
>> https://github.com/01org/XenGT-Preview-xen/commit/f2bad31f80f698a452c37cb39841
>> da8e4f69350f
> Unlike MOVD, MOVAPS is already supported by the
> insn emulator.
>
> Jan
>
Thanks for your reply.
Then why do you think MOVAPS fails in my test? If you need it, here is
the full gdb output:
http://lists.xen.org/archives/html/xen-devel/2015-07/txtpTIgy5CpzU.txt
If you need more information/tests, tell me and I'll post them.
SSE2 was introduced in CPUs 11 years ago, so I think it would be useful
to add support for all the missing instructions to avoid more cases like
this one (and it seems to improve performance too, comparing the same
task executed with/without SSE2).
I'm going to try updated xen staging with the MOVD support patch added;
any other useful information for further tests/debugging is appreciated.

Thanks for any reply, and sorry for my bad English.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10 10:51           ` Fabio Fantoni
@ 2015-07-10 11:00             ` Jan Beulich
  2015-07-09 19:32               ` Zhi Wang
  2015-07-10 11:49               ` Fabio Fantoni
  0 siblings, 2 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-10 11:00 UTC (permalink / raw)
  To: Fabio Fantoni; +Cc: Andrew Cooper, Paul Durrant, zhi.a.wang, xen-devel

>>> On 10.07.15 at 12:51, <fabio.fantoni@m2r.biz> wrote:
> On 10/07/2015 12:20, Jan Beulich wrote:
>> Unlike MOVD, MOVAPS is already supported by the
>> insn emulator.
>>
> Then why do you think MOVAPS fails in my test?

No idea. That's what Paul asked you to narrow down.

> SSE2 was introduced in CPUs 11 years ago, so I think it would be useful
> to add support for all the missing instructions to avoid more cases like
> this one (and it seems to improve performance too, comparing the same
> task executed with/without SSE2).

The reason we haven't added any more of the SSE insns so far is that
we don't expect them to be used for accessing MMIO. Once we learn
they're needed, we'll add emulation for them, but you realize this
is a significant task?

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-09 19:32               ` Zhi Wang
@ 2015-07-10 11:46                 ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-10 11:46 UTC (permalink / raw)
  To: Zhi Wang; +Cc: Fabio Fantoni, Andrew Cooper, Paul Durrant, xen-devel

>>> On 09.07.15 at 21:32, <zhi.a.wang@intel.com> wrote:
>      We found that the MOVD instruction is used by some Windows
> drivers while developing XenGT, and we also found this one:
> 
> (XEN) MMIO emulation failed: d7v1 64bit @ 0010:fffff8000294e273 ->
> 66 0f e7 00 48 83 c0 10 45 3b cb 73 f0 45 85 c9
> 
> If anyone can help us upstream this one into the instruction
> emulator, plus the MOVD instruction emulation, that would be nice. :)

The thread you hijacked already references a patch to that effect;
I don't recall it having been submitted formally, hence it's not clear
what "help us to upstream" is really supposed to mean here.

Jan
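
For reference, the opcode bytes in the quoted log hand-decode as
follows (worth confirming with a disassembler):

    66 0f e7 00    movntdq %xmm0,(%rax)    <- the unhandled store
    48 83 c0 10    add     $0x10,%rax
    45 3b cb       cmp     %r11d,%r9d
    73 f0          jae     <loop top>
    45 85 c9       test    %r9d,%r9d

That is, the instruction the emulator rejected here is movntdq, a
non-temporal SSE2 store, distinct from both the movd and movaps cases
discussed above.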

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment
  2015-07-10 11:00             ` Jan Beulich
  2015-07-09 19:32               ` Zhi Wang
@ 2015-07-10 11:49               ` Fabio Fantoni
  1 sibling, 0 replies; 42+ messages in thread
From: Fabio Fantoni @ 2015-07-10 11:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Paul Durrant, zhi.a.wang, xen-devel

On 10/07/2015 13:00, Jan Beulich wrote:
>>>> On 10.07.15 at 12:51, <fabio.fantoni@m2r.biz> wrote:
>> Il 10/07/2015 12:20, Jan Beulich ha scritto:
>>> Other than MOVD, MOVAPS is already being supported by the
>>> insn emulator.
>>>
>> Then why do you think MOVAPS fails in my test?
> No idea. That's what Paul asked you to narrow down.
Can you be more specific about instrumenting Xen? I don't know what
to do, other than using gdb on the domU Xorg. Any hint would be
appreciated.
>> SSE2 was introduced in CPUs 11 years ago, so I think it would be useful
>> to add support for all the missing instructions to avoid more cases like
>> this one (and it seems to improve performance too, comparing the same
>> task executed with/without SSE2).
> The reason we haven't added any more of the SSE insns so far is that
> we don't expect them to be used for accessing MMIO. Once we learn
> they're needed, we'll add emulation for them, but you realize this
> is a significant task?
>
> Jan
>
Sorry for this; since the end of 2011 I have been trying desperately to
improve graphics performance for recent desktop domUs: recent Windows
(>XP), recent Linux DEs (GNOME 3, Unity, etc.), and bigger resolutions
(7-8 years ago the standard was 1024x768; now it's around 1920x1080).
AFAIK this seems necessary for what I'm trying to achieve (and for
anyone else who wants to do the same, since I post all my patches); it
also seems needed for XenGT (in development), and there may be others.

A temporary solution, at least to have QXL fully working on any Linux
domU, could be to find the workaround/fix that makes it work in SUSE
and apply it upstream.
From my mail of some days ago:
> I also took a quick look at the SUSE kernel patches
> (https://github.com/openSUSE/kernel-source/tree/SLE12-SP1), where QXL
> is also working on Linux domUs (other things seem already similar,
> based on what Jim Fehlig told me), but I didn't find a possible
> fix/workaround to try. Can someone tell me about possible patches I
> should try, please?

Thanks for any reply.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2015-07-10 11:49 UTC | newest]

Thread overview: 42+ messages
2015-07-09 13:10 [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix Paul Durrant
2015-07-09 13:10 ` [PATCH v7 01/15] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
2015-07-09 15:13   ` Jan Beulich
2015-07-09 16:16     ` Paul Durrant
2015-07-09 16:24       ` Jan Beulich
2015-07-09 16:27         ` Paul Durrant
2015-07-09 13:10 ` [PATCH v7 02/15] x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument Paul Durrant
2015-07-09 13:10 ` [PATCH v7 03/15] x86/hvm: restrict port numbers and uint16_t and sizes to unsigned int Paul Durrant
2015-07-09 15:24   ` Jan Beulich
2015-07-09 16:10     ` Paul Durrant
2015-07-09 16:20       ` Jan Beulich
2015-07-09 16:23         ` Paul Durrant
2015-07-09 16:31           ` Jan Beulich
2015-07-09 13:10 ` [PATCH v7 04/15] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
2015-07-09 13:10 ` [PATCH v7 05/15] x86/hvm: add length to mmio check op Paul Durrant
2015-07-09 13:10 ` [PATCH v7 06/15] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
2015-07-09 13:10 ` [PATCH v7 07/15] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
2015-07-09 15:33   ` Jan Beulich
2015-07-09 16:12     ` Paul Durrant
2015-07-09 16:21       ` Jan Beulich
2015-07-09 16:24         ` Paul Durrant
2015-07-09 13:10 ` [PATCH v7 08/15] x86/hvm: limit reps to avoid the need to handle retry Paul Durrant
2015-07-09 13:10 ` [PATCH v7 09/15] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
2015-07-09 13:10 ` [PATCH v7 10/15] x86/hvm: split I/O completion handling from state model Paul Durrant
2015-07-09 13:10 ` [PATCH v7 11/15] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
2015-07-09 13:10 ` [PATCH v7 12/15] x86/hvm: remove hvm_io_state enumeration Paul Durrant
2015-07-09 13:10 ` [PATCH v7 13/15] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
2015-07-09 13:10 ` [PATCH v7 14/15] x86/hvm: always re-emulate I/O from a buffer Paul Durrant
2015-07-09 13:10 ` [PATCH v7 15/15] x86/hvm: track large memory mapped accesses by buffer offset Paul Durrant
2015-07-09 15:46   ` Jan Beulich
2015-07-09 16:05     ` Paul Durrant
2015-07-10  9:27 ` [PATCH v7 00/15] x86/hvm: I/O emulation cleanup and fix | Full Backtrace of domU's X crash caused by SSE2 instruction in attachment Fabio Fantoni
2015-07-10  9:31   ` Paul Durrant
2015-07-10  9:54     ` Fabio Fantoni
2015-07-10 10:09       ` Fabio Fantoni
2015-07-10 10:13         ` Paul Durrant
2015-07-10 10:20         ` Jan Beulich
2015-07-10 10:51           ` Fabio Fantoni
2015-07-10 11:00             ` Jan Beulich
2015-07-09 19:32               ` Zhi Wang
2015-07-10 11:46                 ` Jan Beulich
2015-07-10 11:49               ` Fabio Fantoni
