[PATCH] evtchn: clean last_vcpu_id on EVTCHNOP_reset to avoid crash

From: Vitaly Kuznetsov @ 2014-08-08 14:22 UTC
  To: xen-devel; +Cc: Andrew Jones, David Vrabel

When EVTCHNOP_reset is performed, __evtchn_close() does not clear the
last_vcpu_id attribute of the event channels it closes. If last_vcpu_id != 0
for a particular event channel, and that channel is then used for event
delivery (to another vcpu) before EVTCHNOP_init_control has been done for
vcpu == last_vcpu_id, the following crash is observed:

 ...
 (XEN) Xen call trace:
 (XEN)    [<ffff82d080127785>] _spin_lock_irqsave+0x5/0x70
 (XEN)    [<ffff82d0801097db>] evtchn_fifo_set_pending+0xdb/0x370
 (XEN)    [<ffff82d080107146>] evtchn_send+0xd6/0x160
 (XEN)    [<ffff82d080107df9>] do_event_channel_op+0x6a9/0x16c0
 (XEN)    [<ffff82d0801ce800>] vmx_intr_assist+0x30/0x480
 (XEN)    [<ffff82d080219e99>] syscall_enter+0xa9/0xae

This happens because lock_old_queue() does not check that the VCPU's control
block exists, and after EVTCHNOP_reset all control blocks have been cleaned up.

I suggest we fix the issue in two places: reset last_vcpu_id to 0 in
__evtchn_close(), and add an appropriate check to lock_old_queue(), since a
lost event is much better than a hypervisor crash.
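
To illustrate the two fixes, here is a minimal standalone model in plain C.
It is only a sketch and not the Xen code: the struct layouts are heavily
simplified (for instance, the vcpu array holds structs rather than pointers),
and the names fifo_queue, fifo_ctrl, lookup_old_queue(), close_channel() and
reset_domain() are hypothetical stand-ins invented for this example.

```c
#include <stddef.h>

#define MAX_VCPUS      4
#define NUM_PRIORITIES 16

/* Hypothetical stand-in for struct evtchn_fifo_queue (lock omitted). */
struct fifo_queue { int dummy; };

/* Hypothetical stand-in for the per-vCPU FIFO control block. */
struct fifo_ctrl { struct fifo_queue queue[NUM_PRIORITIES]; };

/* evtchn_fifo stays NULL until the control block has been set up. */
struct vcpu { struct fifo_ctrl *evtchn_fifo; };

struct evtchn {
    unsigned int last_vcpu_id;   /* vCPU the channel was last queued on */
    unsigned int last_priority;  /* priority it was last queued with */
};

struct domain { struct vcpu vcpu[MAX_VCPUS]; };

/*
 * Models the check added to lock_old_queue(): report a missing control
 * block with NULL instead of computing a queue address from a NULL base.
 */
static struct fifo_queue *lookup_old_queue(struct domain *d,
                                           const struct evtchn *chn)
{
    struct vcpu *v = &d->vcpu[chn->last_vcpu_id];

    if ( !v->evtchn_fifo )
        return NULL;

    return &v->evtchn_fifo->queue[chn->last_priority];
}

/* Models the change to __evtchn_close(): drop the stale vCPU reference. */
static void close_channel(struct evtchn *chn)
{
    chn->last_vcpu_id = 0;
}

/* Models the effect of a reset: every vCPU loses its control block. */
static void reset_domain(struct domain *d)
{
    for ( size_t i = 0; i < MAX_VCPUS; i++ )
        d->vcpu[i].evtchn_fifo = NULL;
}
```

In this model, a channel with last_vcpu_id == 2 resolves to vcpu2's queue
while the control block exists and to NULL after reset_domain(), so a caller
can drop the event. Once close_channel() has run and vcpu0 has a control
block again, the lookup resolves against vcpu0.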

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 xen/common/event_channel.c | 3 +++
 xen/common/event_fifo.c    | 9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index a7becae..67b9d53 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -578,6 +578,9 @@ static long __evtchn_close(struct domain *d1, int port1)
     chn1->state          = ECS_FREE;
     chn1->notify_vcpu_id = 0;
 
+    /* Reset last_vcpu_id to vcpu0 as control block can be freed */
+    chn1->last_vcpu_id = 0;
+
     xsm_evtchn_close_post(chn1);
 
  out:
diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
index 51b4ff6..e4bef80 100644
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -61,6 +61,15 @@ static struct evtchn_fifo_queue *lock_old_queue(const struct domain *d,
     for ( try = 0; try < 3; try++ )
     {
         v = d->vcpu[evtchn->last_vcpu_id];
+
+        if ( !v->evtchn_fifo )
+        {
+            gdprintk(XENLOG_ERR,
+                     "domain %d vcpu %d has no control block!\n",
+                     d->domain_id, v->vcpu_id);
+            return NULL;
+        }
+
         old_q = &v->evtchn_fifo->queue[evtchn->last_priority];
 
         spin_lock_irqsave(&old_q->lock, *flags);
-- 
1.9.3
