xen-devel.lists.xenproject.org archive mirror
* [PATCH 0/3] x86/HVM: adjustments to internal device emulation
@ 2016-03-24 11:21 Jan Beulich
  2016-03-24 11:28 ` [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Jan Beulich @ 2016-03-24 11:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser

The patches are really independent, and the latter two are just
cleaning up code to make it easier to see all relevant uses of
X86EMUL_UNHANDLEABLE (dealing with which is the primary subject
of the first one).
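
For reference, the X86EMUL_* return value convention involved here is
roughly the following (a sketch of the definitions in
xen/arch/x86/x86_emulate/x86_emulate.h; see that header for the
authoritative values and comments):

    /* Success. */
    #define X86EMUL_OKAY           0
    /* Unhandleable access or emulation. */
    #define X86EMUL_UNHANDLEABLE   1
    /* Exception raised and requires delivery. */
    #define X86EMUL_EXCEPTION      2
    /* Retry the emulation for some reason; no state modified. */
    #define X86EMUL_RETRY          3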

1: HVM: fix forwarding of internally cached requests
2: vLAPIC: vlapic_reg_write() can't fail
3: HVM: terminate writes to PM_TMR port

Signed-off-by: Jan Beulich <jbeulich@suse.com>



* [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
  2016-03-24 11:21 [PATCH 0/3] x86/HVM: adjustments to internal device emulation Jan Beulich
@ 2016-03-24 11:28 ` Jan Beulich
  2016-03-24 11:49   ` Paul Durrant
  2016-03-24 11:29 ` [PATCH 2/3] x86/vLAPIC: vlapic_reg_write() can't fail Jan Beulich
  2016-03-24 11:30 ` [PATCH 3/3] x86/HVM: terminate writes to PM_TMR port Jan Beulich
  2 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2016-03-24 11:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser, Paul Durrant, Chang Jianzhong


Forwarding entire batches to the device model when an individual
iteration of them got rejected by internal device emulation handlers
with X86EMUL_UNHANDLEABLE is wrong: The device model would then handle
all iterations, without the internal handler getting to see any past
the one it returned failure for. This causes misbehavior in at least
the MSI-X and VGA code, which want to see all such requests for
internal tracking/caching purposes. But note that this does not apply
to buffered I/O requests.

This in turn means that the condition in hvm_process_io_intercept() for
when to crash the domain was wrong: Since X86EMUL_UNHANDLEABLE can
validly be returned by the individual device handlers, we mustn't
blindly crash the domain if such a failure occurs on anything other
than the initial iteration. Instead we need to distinguish
hvm_copy_*_guest_phys() failures from device-specific ones; the former
always need to be fatal to the domain (i.e. also on the first
iteration), since otherwise we would again end up forwarding a request
to qemu which the internal handler didn't get to see.

Also commit 4faffc41d ("x86/hvm: limit reps to avoid the need to handle
retry") went too far in removing code from hvm_process_io_intercept():
When there were successfully handled iterations, the function should
continue to return success with a clipped repeat count.
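
In simplified terms, the intercept loop is thus made to follow this
pattern (a sketch only; handle_one() and COPY_FAILED are illustrative
stand-ins, not actual Xen identifiers):

    unsigned long i;
    int rc = X86EMUL_OKAY;

    for ( i = 0; i < p->count; i++ )
    {
        rc = handle_one(handler, p, i);  /* per-iteration read/write */
        if ( rc == COPY_FAILED )         /* hvm_copy_*_guest_phys() error */
            rc = -1;                     /* always fatal, see below */
        if ( rc != X86EMUL_OKAY )
            break;
    }

    if ( unlikely(rc < 0) )
        domain_crash(current->domain);   /* guest memory inaccessible */
    else if ( i )
    {
        p->count = i;                    /* clip to the handled iterations */
        rc = X86EMUL_OKAY;
    }
    else if ( rc == X86EMUL_UNHANDLEABLE )
        p->count = 1;                    /* forward a single iteration only */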

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Chang Jianzhong <changjzh@gmail.com>
---
I assume this also addresses the issue which
http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg03189.html 
attempted to deal with in a not really acceptable way.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -95,7 +95,7 @@ static const struct hvm_io_handler null_
 };
 
 static int hvmemul_do_io(
-    bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
+    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
@@ -104,7 +104,7 @@ static int hvmemul_do_io(
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
-        .count = reps,
+        .count = *reps,
         .dir = dir,
         .df = df,
         .data = data,
@@ -136,7 +136,7 @@ static int hvmemul_do_io(
         if ( (p.type != is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO) ||
              (p.addr != addr) ||
              (p.size != size) ||
-             (p.count != reps) ||
+             (p.count != *reps) ||
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
@@ -214,7 +214,7 @@ static int hvmemul_do_io_buffer(
 
     BUG_ON(buffer == NULL);
 
-    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
+    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
                        (uintptr_t)buffer);
     if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
         memset(buffer, 0xff, size);
@@ -305,13 +305,13 @@ static int hvmemul_do_io_addr(
         count = 1;
     }
 
-    rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
+    rc = hvmemul_do_io(is_mmio, addr, &count, size, dir, df, 1,
                        ram_gpa);
+
     if ( rc == X86EMUL_OKAY )
-    {
         v->arch.hvm_vcpu.hvm_io.mmio_retry = (count < *reps);
-        *reps = count;
-    }
+
+    *reps = count;
 
  out:
     while ( nr_pages )
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -148,7 +148,7 @@ int hvm_process_io_intercept(const struc
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
-                    rc = X86EMUL_UNHANDLEABLE;
+                    rc = -1; /* != any X86EMUL_* value. */
                     break;
                 }
                 if ( rc != X86EMUL_OKAY )
@@ -178,7 +178,7 @@ int hvm_process_io_intercept(const struc
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
-                    rc = X86EMUL_UNHANDLEABLE;
+                    rc = -1; /* != any X86EMUL_* value. */
                     break;
                 }
                 if ( rc != X86EMUL_OKAY )
@@ -196,8 +196,22 @@ int hvm_process_io_intercept(const struc
         }
     }
 
-    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
+    if ( unlikely(rc < 0) )
         domain_crash(current->domain);
+    else if ( i )
+    {
+        p->count = i;
+        rc = X86EMUL_OKAY;
+    }
+    else if ( rc == X86EMUL_UNHANDLEABLE )
+    {
+        /*
+         * Don't forward entire batches to the device model: This would
+         * prevent the internal handlers from seeing subsequent iterations of
+         * the request.
+         */
+        p->count = 1;
+    }
 
     return rc;
 }




* [PATCH 2/3] x86/vLAPIC: vlapic_reg_write() can't fail
  2016-03-24 11:21 [PATCH 0/3] x86/HVM: adjustments to internal device emulation Jan Beulich
  2016-03-24 11:28 ` [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests Jan Beulich
@ 2016-03-24 11:29 ` Jan Beulich
  2016-03-24 14:03   ` Andrew Cooper
  2016-03-24 11:30 ` [PATCH 3/3] x86/HVM: terminate writes to PM_TMR port Jan Beulich
  2 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2016-03-24 11:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser


It only ever returns X86EMUL_OKAY, so to make this more obvious change
the function return type to void. Re-structure vlapic_apicv_write() at
once to have only a single path leading to vlapic_reg_write().
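
In caller terms the change amounts to this pattern (sketch):

    /* Before: propagating a value which can only ever be X86EMUL_OKAY. */
    return vlapic_reg_write(v, offset, val);

    /* After: the function is void; success is stated explicitly. */
    vlapic_reg_write(v, offset, val);
    return X86EMUL_OKAY;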

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -681,11 +681,10 @@ static void vlapic_tdt_pt_cb(struct vcpu
     vcpu_vlapic(v)->hw.tdt_msr = 0;
 }
 
-static int vlapic_reg_write(struct vcpu *v,
-                            unsigned int offset, uint32_t val)
+static void vlapic_reg_write(struct vcpu *v,
+                             unsigned int offset, uint32_t val)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
-    int rc = X86EMUL_OKAY;
 
     memset(&vlapic->loaded, 0, sizeof(vlapic->loaded));
 
@@ -815,14 +814,7 @@ static int vlapic_reg_write(struct vcpu
         HVM_DBG_LOG(DBG_LEVEL_VLAPIC_TIMER, "timer divisor is %#x",
                     vlapic->hw.timer_divisor);
         break;
-
-    default:
-        break;
     }
-    if (rc == X86EMUL_UNHANDLEABLE)
-        gdprintk(XENLOG_DEBUG,
-                "Local APIC Write wrong to register %#x\n", offset);
-    return rc;
 }
 
 static int vlapic_write(struct vcpu *v, unsigned long address,
@@ -871,7 +863,9 @@ static int vlapic_write(struct vcpu *v,
     else if ( unlikely(offset & 3) )
         goto unaligned_exit_and_crash;
 
-    return vlapic_reg_write(v, offset, val);
+    vlapic_reg_write(v, offset, val);
+
+    return X86EMUL_OKAY;
 
  unaligned_exit_and_crash:
     gprintk(XENLOG_ERR, "Unaligned LAPIC write: len=%u offset=%#x.\n",
@@ -886,14 +880,18 @@ int vlapic_apicv_write(struct vcpu *v, u
     struct vlapic *vlapic = vcpu_vlapic(v);
     uint32_t val = vlapic_get_reg(vlapic, offset);
 
-    if ( !vlapic_x2apic_mode(vlapic) )
-        return vlapic_reg_write(v, offset, val);
+    if ( vlapic_x2apic_mode(vlapic) )
+    {
+        if ( offset != APIC_SELF_IPI )
+            return X86EMUL_UNHANDLEABLE;
 
-    if ( offset != APIC_SELF_IPI )
-        return X86EMUL_UNHANDLEABLE;
+        offset = APIC_ICR;
+        val = APIC_DEST_SELF | (val & APIC_VECTOR_MASK);
+    }
+
+    vlapic_reg_write(v, offset, val);
 
-    return vlapic_reg_write(v, APIC_ICR,
-                            APIC_DEST_SELF | (val & APIC_VECTOR_MASK));
+    return X86EMUL_OKAY;
 }
 
 int hvm_x2apic_msr_write(struct vcpu *v, unsigned int msr, uint64_t msr_content)
@@ -971,7 +969,9 @@ int hvm_x2apic_msr_write(struct vcpu *v,
             return X86EMUL_UNHANDLEABLE;
     }
 
-    return vlapic_reg_write(v, offset, msr_content);
+    vlapic_reg_write(v, offset, msr_content);
+
+    return X86EMUL_OKAY;
 }
 
 static int vlapic_range(struct vcpu *v, unsigned long addr)





* [PATCH 3/3] x86/HVM: terminate writes to PM_TMR port
  2016-03-24 11:21 [PATCH 0/3] x86/HVM: adjustments to internal device emulation Jan Beulich
  2016-03-24 11:28 ` [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests Jan Beulich
  2016-03-24 11:29 ` [PATCH 2/3] x86/vLAPIC: vlapic_reg_write() can't fail Jan Beulich
@ 2016-03-24 11:30 ` Jan Beulich
  2016-03-24 14:08   ` Andrew Cooper
  2 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2016-03-24 11:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser


There's no point in forwarding these to the device model.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
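
For context, the read path below uses a trylock-or-wait pattern. A
sketch of both sides follows; the updater side is an assumption about
how pmt_update_time() gets invoked elsewhere (e.g. from the timer
callback) and is illustrative only:

    /* Updater: recomputes tmr_val while holding the lock. */
    spin_lock(&s->lock);
    pmt_update_time(s);
    spin_unlock(&s->lock);

    /* Reader (the PIO handler): */
    if ( spin_trylock(&s->lock) )
    {
        pmt_update_time(s);              /* we own the lock: do the update */
        *val = s->pm.tmr_val;
        spin_unlock(&s->lock);
    }
    else
    {
        spin_barrier(&s->lock);          /* wait for the current holder */
        *val = read_atomic(&s->pm.tmr_val);  /* single, non-torn read */
    }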

--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -217,36 +217,30 @@ static int handle_pmt_io(
     struct vcpu *v = current;
     PMTState *s = &v->domain->arch.hvm_domain.pl_time->vpmt;
 
-    if ( bytes != 4 )
+    if ( bytes != 4 || dir != IOREQ_READ )
     {
         gdprintk(XENLOG_WARNING, "HVM_PMT bad access\n");
         *val = ~0;
-        return X86EMUL_OKAY;
     }
-    
-    if ( dir == IOREQ_READ )
+    else if ( spin_trylock(&s->lock) )
     {
-        if ( spin_trylock(&s->lock) )
-        {
-            /* We hold the lock: update timer value and return it. */
-            pmt_update_time(s);
-            *val = s->pm.tmr_val;
-            spin_unlock(&s->lock);
-        }
-        else
-        {
-            /*
-             * Someone else is updating the timer: rather than do the work
-             * again ourselves, wait for them to finish and then steal their
-             * updated value with a lock-free atomic read.
-             */
-            spin_barrier(&s->lock);
-            *val = *(volatile uint32_t *)&s->pm.tmr_val;
-        }
-        return X86EMUL_OKAY;
+        /* We hold the lock: update timer value and return it. */
+        pmt_update_time(s);
+        *val = s->pm.tmr_val;
+        spin_unlock(&s->lock);
+    }
+    else
+    {
+        /*
+         * Someone else is updating the timer: rather than do the work
+         * again ourselves, wait for them to finish and then steal their
+         * updated value with a lock-free atomic read.
+         */
+        spin_barrier(&s->lock);
+        *val = read_atomic(&s->pm.tmr_val);
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    return X86EMUL_OKAY;
 }
 
 static int pmtimer_save(struct domain *d, hvm_domain_context_t *h)






* Re: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
  2016-03-24 11:28 ` [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests Jan Beulich
@ 2016-03-24 11:49   ` Paul Durrant
  2016-03-24 12:01     ` Jan Beulich
  2016-03-24 12:52     ` Jan Beulich
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Durrant @ 2016-03-24 11:49 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Keir (Xen.org), Chang Jianzhong

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 24 March 2016 11:29
> To: xen-devel
> Cc: Andrew Cooper; Paul Durrant; Chang Jianzhong; Keir (Xen.org)
> Subject: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
> 
> Forwarding entire batches to the device model when an individual
> iteration of them got rejected by internal device emulation handlers
> with X86EMUL_UNHANDLEABLE is wrong: The device model would then handle
> all iterations, without the internal handler getting to see any past
> the one it returned failure for. This causes misbehavior in at least
> the MSI-X and VGA code, which want to see all such requests for
> internal tracking/caching purposes. But note that this does not apply
> to buffered I/O requests.
> 
> This in turn means that the condition in hvm_process_io_intercept() for
> when to crash the domain was wrong: Since X86EMUL_UNHANDLEABLE can
> validly be returned by the individual device handlers, we mustn't
> blindly crash the domain if such a failure occurs on anything other
> than the initial iteration. Instead we need to distinguish
> hvm_copy_*_guest_phys() failures from device-specific ones; the former
> always need to be fatal to the domain (i.e. also on the first
> iteration), since otherwise we would again end up forwarding a request
> to qemu which the internal handler didn't get to see.
> 
> Also commit 4faffc41d ("x86/hvm: limit reps to avoid the need to handle
> retry") went too far in removing code from hvm_process_io_intercept():
> When there were successfully handled iterations, the function should
> continue to return success with a clipped repeat count.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Cc: Chang Jianzhong <changjzh@gmail.com>
> ---
> I assume this also addresses the issue which
> http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg03189.html
> attempted to deal with in a not really acceptable way.
> 
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -95,7 +95,7 @@ static const struct hvm_io_handler null_
>  };
> 
>  static int hvmemul_do_io(
> -    bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
> +    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
>      uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
>  {
>      struct vcpu *curr = current;
> @@ -104,7 +104,7 @@ static int hvmemul_do_io(
>          .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
>          .addr = addr,
>          .size = size,
> -        .count = reps,
> +        .count = *reps,
>          .dir = dir,
>          .df = df,
>          .data = data,
> @@ -136,7 +136,7 @@ static int hvmemul_do_io(
>          if ( (p.type != is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO) ||
>               (p.addr != addr) ||
>               (p.size != size) ||
> -             (p.count != reps) ||
> +             (p.count != *reps) ||
>               (p.dir != dir) ||
>               (p.df != df) ||
>               (p.data_is_ptr != data_is_addr) )
> @@ -214,7 +214,7 @@ static int hvmemul_do_io_buffer(
> 
>      BUG_ON(buffer == NULL);
> 
> -    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
> +    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
>                         (uintptr_t)buffer);
>      if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
>          memset(buffer, 0xff, size);
> @@ -305,13 +305,13 @@ static int hvmemul_do_io_addr(
>          count = 1;
>      }
> 
> -    rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
> +    rc = hvmemul_do_io(is_mmio, addr, &count, size, dir, df, 1,
>                         ram_gpa);
> +
>      if ( rc == X86EMUL_OKAY )
> -    {
>          v->arch.hvm_vcpu.hvm_io.mmio_retry = (count < *reps);
> -        *reps = count;
> -    }
> +
> +    *reps = count;
> 
>   out:
>      while ( nr_pages )
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -148,7 +148,7 @@ int hvm_process_io_intercept(const struc
>                      ASSERT_UNREACHABLE();
>                      /* fall through */
>                  default:
> -                    rc = X86EMUL_UNHANDLEABLE;
> +                    rc = -1; /* != any X86EMUL_* value. */

Rather than the need for magic values, couldn't you just goto a domain_crash label at the tail of the function?
Also, since domain_crash() isn't synchronous, I think you should replace the -1 value with some valid X86EMUL_ value before returning from the function.
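
E.g. (a sketch of the suggestion only, not a tested change):

    if ( unlikely(rc < 0) )
    {
        domain_crash(current->domain);
        /* domain_crash() returns; callers must still see a valid code. */
        rc = X86EMUL_UNHANDLEABLE;
    }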

>                      break;
>                  }
>                  if ( rc != X86EMUL_OKAY )
> @@ -178,7 +178,7 @@ int hvm_process_io_intercept(const struc
>                      ASSERT_UNREACHABLE();
>                      /* fall through */
>                  default:
> -                    rc = X86EMUL_UNHANDLEABLE;
> +                    rc = -1; /* != any X86EMUL_* value. */
>                      break;
>                  }
>                  if ( rc != X86EMUL_OKAY )
> @@ -196,8 +196,22 @@ int hvm_process_io_intercept(const struc
>          }
>      }
> 
> -    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
> +    if ( unlikely(rc < 0) )
>          domain_crash(current->domain);
> +    else if ( i )
> +    {
> +        p->count = i;
> +        rc = X86EMUL_OKAY;
> +    }
> +    else if ( rc == X86EMUL_UNHANDLEABLE )
> +    {
> +        /*
> +         * Don't forward entire batches to the device model: This would
> +         * prevent the internal handlers from seeing subsequent iterations of
> +         * the request.
> +         */
> +        p->count = 1;

I guess this is ok. If stdvga is not caching then the accept function would have failed so you won't get here, and if it sends the buffered ioreq then you still don't get here because it returns X86EMUL_OKAY.

  Paul

> +    }
> 
>      return rc;
>  }
> 



* Re: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
  2016-03-24 11:49   ` Paul Durrant
@ 2016-03-24 12:01     ` Jan Beulich
  2016-03-24 12:11       ` Paul Durrant
  2016-03-24 12:52     ` Jan Beulich
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2016-03-24 12:01 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir (Xen.org), Chang Jianzhong, xen-devel

>>> On 24.03.16 at 12:49, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 24 March 2016 11:29
>> --- a/xen/arch/x86/hvm/intercept.c
>> +++ b/xen/arch/x86/hvm/intercept.c
>> @@ -148,7 +148,7 @@ int hvm_process_io_intercept(const struc
>>                      ASSERT_UNREACHABLE();
>>                      /* fall through */
>>                  default:
>> -                    rc = X86EMUL_UNHANDLEABLE;
>> +                    rc = -1; /* != any X86EMUL_* value. */
> 
> Rather than the need for magic values, couldn't you just goto a domain_crash 
> label at the tail of the function?
> Also, since domain_crash() isn't synchronous, I think you should replace the 
> -1 value with some valid X86EMUL_ value before returning from the function.

Good point, but I guess I'd rather move the domain_crash() right here
and retain the X86EMUL_UNHANDLEABLE then. What do you think?
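
I.e. roughly the following (sketch only, modulo the exact case labels):

                default:
                    ASSERT_UNREACHABLE();
                    /* fall through */
                case HVMCOPY_bad_gfn_to_mfn:
                    domain_crash(current->domain);
                    rc = X86EMUL_UNHANDLEABLE;
                    break;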

Jan



* Re: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
  2016-03-24 12:01     ` Jan Beulich
@ 2016-03-24 12:11       ` Paul Durrant
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Durrant @ 2016-03-24 12:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Keir (Xen.org), Chang Jianzhong, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 24 March 2016 12:02
> To: Paul Durrant
> Cc: Andrew Cooper; Chang Jianzhong; xen-devel; Keir (Xen.org)
> Subject: RE: [PATCH 1/3] x86/HVM: fix forwarding of internally cached
> requests
> 
> >>> On 24.03.16 at 12:49, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 24 March 2016 11:29
> >> --- a/xen/arch/x86/hvm/intercept.c
> >> +++ b/xen/arch/x86/hvm/intercept.c
> >> @@ -148,7 +148,7 @@ int hvm_process_io_intercept(const struc
> >>                      ASSERT_UNREACHABLE();
> >>                      /* fall through */
> >>                  default:
> >> -                    rc = X86EMUL_UNHANDLEABLE;
> >> +                    rc = -1; /* != any X86EMUL_* value. */
> >
> > Rather than the need for magic values, couldn't you just goto a
> > domain_crash label at the tail of the function?
> > Also, since domain_crash() isn't synchronous, I think you should replace
> > the -1 value with some valid X86EMUL_ value before returning from the
> > function.
> 
> Good point, but I guess I'd rather move the domain_crash() right here
> and retain the X86EMUL_UNHANDLEABLE then. What do you think?
> 

Yes, that's probably better since IIRC domain_crash() will spit out the line number it's invoked from.
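
(domain_crash() being roughly the following; a sketch from memory, see
xen/include/xen/sched.h for the real definition:

    #define domain_crash(d) do {                                        \
        printk("domain_crash called from %s:%d\n", __FILE__, __LINE__); \
        __domain_crash(d);                                              \
    } while (0)
)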

  Paul

> Jan



* Re: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
  2016-03-24 11:49   ` Paul Durrant
  2016-03-24 12:01     ` Jan Beulich
@ 2016-03-24 12:52     ` Jan Beulich
  2016-03-24 12:58       ` Paul Durrant
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2016-03-24 12:52 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Keir (Xen.org), Chang Jianzhong, xen-devel

>>> On 24.03.16 at 12:49, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 24 March 2016 11:29
>> @@ -196,8 +196,22 @@ int hvm_process_io_intercept(const struc
>>          }
>>      }
>> 
>> -    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
>> +    if ( unlikely(rc < 0) )
>>          domain_crash(current->domain);
>> +    else if ( i )
>> +    {
>> +        p->count = i;
>> +        rc = X86EMUL_OKAY;
>> +    }
>> +    else if ( rc == X86EMUL_UNHANDLEABLE )
>> +    {
>> +        /*
>> +         * Don't forward entire batches to the device model: This would
>> +         * prevent the internal handlers from seeing subsequent iterations of
>> +         * the request.
>> +         */
>> +        p->count = 1;
> 
> I guess this is ok. If stdvga is not caching then the accept function would 
> have failed so you won't get here, and if it sends the buffered ioreq then you
> still don't get here because it returns X86EMUL_OKAY.

Good that you thought of this - I had forgotten that stdvga's
MMIO handling now takes this same code path rather than a
fully separate one. I guess I'll steal some of the wording above
for the v2 commit message.

Jan



* Re: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
  2016-03-24 12:52     ` Jan Beulich
@ 2016-03-24 12:58       ` Paul Durrant
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Durrant @ 2016-03-24 12:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Keir (Xen.org), Chang Jianzhong, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 24 March 2016 12:52
> To: Paul Durrant
> Cc: Andrew Cooper; Chang Jianzhong; xen-devel; Keir (Xen.org)
> Subject: RE: [PATCH 1/3] x86/HVM: fix forwarding of internally cached
> requests
> 
> >>> On 24.03.16 at 12:49, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 24 March 2016 11:29
> >> @@ -196,8 +196,22 @@ int hvm_process_io_intercept(const struc
> >>          }
> >>      }
> >>
> >> -    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
> >> +    if ( unlikely(rc < 0) )
> >>          domain_crash(current->domain);
> >> +    else if ( i )
> >> +    {
> >> +        p->count = i;
> >> +        rc = X86EMUL_OKAY;
> >> +    }
> >> +    else if ( rc == X86EMUL_UNHANDLEABLE )
> >> +    {
> >> +        /*
> >> +         * Don't forward entire batches to the device model: This would
> >> +         * prevent the internal handlers from seeing subsequent iterations of
> >> +         * the request.
> >> +         */
> >> +        p->count = 1;
> >
> > I guess this is ok. If stdvga is not caching then the accept function would
> > have failed so you won't get here, and if it sends the buffered ioreq then
> > you still don't get here because it returns X86EMUL_OKAY.
> 
> Good that you thought of this - I had forgotten that stdvga's
> MMIO handling now takes this same code path rather than a
> fully separate one.

Yes, that's why I was a little worried before I saw the code :-)

> I guess I'll steal some of the wording above
> for the v2 commit message.
> 

Sure.

  Paul

> Jan



* Re: [PATCH 2/3] x86/vLAPIC: vlapic_reg_write() can't fail
  2016-03-24 11:29 ` [PATCH 2/3] x86/vLAPIC: vlapic_reg_write() can't fail Jan Beulich
@ 2016-03-24 14:03   ` Andrew Cooper
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2016-03-24 14:03 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Keir Fraser

On 24/03/16 11:29, Jan Beulich wrote:
> It only ever returns X86EMUL_OKAY, so to make this more obvious change
> the function return type to void. Re-structure vlapic_apicv_write() at
> once to have only a single path leading to vlapic_reg_write().
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH 3/3] x86/HVM: terminate writes to PM_TMR port
  2016-03-24 11:30 ` [PATCH 3/3] x86/HVM: terminate writes to PM_TMR port Jan Beulich
@ 2016-03-24 14:08   ` Andrew Cooper
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2016-03-24 14:08 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Keir Fraser

On 24/03/16 11:30, Jan Beulich wrote:
> There's no point in forwarding these to the device model.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


Thread overview: 11 messages
2016-03-24 11:21 [PATCH 0/3] x86/HVM: adjustments to internal device emulation Jan Beulich
2016-03-24 11:28 ` [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests Jan Beulich
2016-03-24 11:49   ` Paul Durrant
2016-03-24 12:01     ` Jan Beulich
2016-03-24 12:11       ` Paul Durrant
2016-03-24 12:52     ` Jan Beulich
2016-03-24 12:58       ` Paul Durrant
2016-03-24 11:29 ` [PATCH 2/3] x86/vLAPIC: vlapic_reg_write() can't fail Jan Beulich
2016-03-24 14:03   ` Andrew Cooper
2016-03-24 11:30 ` [PATCH 3/3] x86/HVM: terminate writes to PM_TMR port Jan Beulich
2016-03-24 14:08   ` Andrew Cooper
