All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] A few event channel-related fixes
@ 2015-04-29 21:10 Boris Ostrovsky
  2015-04-29 21:10 ` [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming Boris Ostrovsky
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: xen-devel, linux-kernel, annie.li, boris.ostrovsky

Fixes for issues that we discovered during live migration when source and
target systems have different event channel assignments for guests.

Changes in v2:
* Don't use IRQ_MOVE_PCNTXT, bind channels to VCPUs explicitly in rebind_evtchn_irq()
* Use xenstore_domain_type to determine whether to install resume notifier

Boris Ostrovsky (4):
  xen/events: Clear cpu_evtchn_mask before resuming
  xen/xenbus: Update xenbus event channel on resume
  xen/console: Update console event channel on resume
  xen/events: Set irq_info->evtchn before binding the channel to CPU in
    __startup_pirq()

 drivers/tty/hvc/hvc_xen.c         |   18 +++++++++++++++++-
 drivers/xen/events/events_2l.c    |   10 ++++++++++
 drivers/xen/events/events_base.c  |   15 ++++++++++++---
 drivers/xen/xenbus/xenbus_probe.c |   29 +++++++++++++++++++++++++++++
 4 files changed, 68 insertions(+), 4 deletions(-)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
  2015-04-29 21:10 ` [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-05-01 10:46   ` David Vrabel
  2015-05-01 10:46   ` [Xen-devel] " David Vrabel
  2015-04-29 21:10 ` [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume Boris Ostrovsky
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: xen-devel, linux-kernel, annie.li, boris.ostrovsky

When a guest is resumed, the hypervisor may change event channel
assignments. If this happens and the guest uses 2-level events it
is possible for the interrupt to be claimed by wrong VCPU since
cpu_evtchn_mask bits may be stale. This can happen even though
evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
is passed in is not necessarily the original one (from pre-migration
times) but instead is freshly allocated during resume and so any
information about which CPU the channel was bound to is lost.

Thus we should clear the mask during resume.

We also need to make sure that bits for xenstore and console channels
are set when these two subsystems are resumed. While rebind_evtchn_irq()
(which is invoked for both of them on a resume) calls irq_set_affinity(),
the latter will in fact postpone setting affinity until handling the
interrupt. But because cpu_evtchn_mask will have bits for these two
cleared we won't be able to take the interrupt.

With that in mind, we need to bind those two channels explicitly in
rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
pass through generic irq affinity code later, in case something needs
to be updated there as well.

(Also replace cpumask_of(0) with cpumask_of(info->cpu) in
rebind_evtchn_irq(): it should be set to zero in preceding
xen_irq_info_evtchn_setup().)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Annie Li <annie.li@oracle.com>
---

Changes in v2:
* Don't use IRQ_MOVE_PCNTXT, bind channels to VCPUs explicitly in rebind_evtchn_irq()

 drivers/xen/events/events_2l.c   |   10 ++++++++++
 drivers/xen/events/events_base.c |   13 +++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c
index 5db43fc..7dd4631 100644
--- a/drivers/xen/events/events_2l.c
+++ b/drivers/xen/events/events_2l.c
@@ -345,6 +345,15 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static void evtchn_2l_resume(void)
+{
+	int i;
+
+	for_each_online_cpu(i)
+		memset(per_cpu(cpu_evtchn_mask, i), 0, sizeof(xen_ulong_t) *
+				EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD);
+}
+
 static const struct evtchn_ops evtchn_ops_2l = {
 	.max_channels      = evtchn_2l_max_channels,
 	.nr_channels       = evtchn_2l_max_channels,
@@ -356,6 +365,7 @@ static const struct evtchn_ops evtchn_ops_2l = {
 	.mask              = evtchn_2l_mask,
 	.unmask            = evtchn_2l_unmask,
 	.handle_events     = evtchn_2l_handle_events,
+	.resume	           = evtchn_2l_resume,
 };
 
 void __init xen_evtchn_2l_init(void)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 70fba97..26f372a 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1259,6 +1259,7 @@ EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall);
 void rebind_evtchn_irq(int evtchn, int irq)
 {
 	struct irq_info *info = info_for_irq(irq);
+	struct evtchn_bind_vcpu bind_vcpu;
 
 	if (WARN_ON(!info))
 		return;
@@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
 
 	mutex_unlock(&irq_mapping_update_lock);
 
-	/* new event channels are always bound to cpu 0 */
-	irq_set_affinity(irq, cpumask_of(0));
+	bind_vcpu.port = evtchn;
+	bind_vcpu.vcpu = info->cpu;
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
+		bind_evtchn_to_cpu(evtchn, info->cpu);
+	else
+		pr_warn("Failed binding port %d to cpu %d\n",
+			evtchn, info->cpu);
+
+	/* This will be deferred until interrupt is processed */
+	irq_set_affinity(irq, cpumask_of(info->cpu));
 
 	/* Unmask the event channel. */
 	enable_irq(irq);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-04-29 21:10 ` Boris Ostrovsky
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: boris.ostrovsky, xen-devel, annie.li, linux-kernel

When a guest is resumed, the hypervisor may change event channel
assignments. If this happens and the guest uses 2-level events it
is possible for the interrupt to be claimed by wrong VCPU since
cpu_evtchn_mask bits may be stale. This can happen even though
evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
is passed in is not necessarily the original one (from pre-migration
times) but instead is freshly allocated during resume and so any
information about which CPU the channel was bound to is lost.

Thus we should clear the mask during resume.

We also need to make sure that bits for xenstore and console channels
are set when these two subsystems are resumed. While rebind_evtchn_irq()
(which is invoked for both of them on a resume) calls irq_set_affinity(),
the latter will in fact postpone setting affinity until handling the
interrupt. But because cpu_evtchn_mask will have bits for these two
cleared we won't be able to take the interrupt.

With that in mind, we need to bind those two channels explicitly in
rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
pass through generic irq affinity code later, in case something needs
to be updated there as well.

(Also replace cpumask_of(0) with cpumask_of(info->cpu) in
rebind_evtchn_irq(): it should be set to zero in preceding
xen_irq_info_evtchn_setup().)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Annie Li <annie.li@oracle.com>
---

Changes in v2:
* Don't use IRQ_MOVE_PCNTXT, bind channels to VCPUs explicitly in rebind_evtchn_irq()

 drivers/xen/events/events_2l.c   |   10 ++++++++++
 drivers/xen/events/events_base.c |   13 +++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c
index 5db43fc..7dd4631 100644
--- a/drivers/xen/events/events_2l.c
+++ b/drivers/xen/events/events_2l.c
@@ -345,6 +345,15 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static void evtchn_2l_resume(void)
+{
+	int i;
+
+	for_each_online_cpu(i)
+		memset(per_cpu(cpu_evtchn_mask, i), 0, sizeof(xen_ulong_t) *
+				EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD);
+}
+
 static const struct evtchn_ops evtchn_ops_2l = {
 	.max_channels      = evtchn_2l_max_channels,
 	.nr_channels       = evtchn_2l_max_channels,
@@ -356,6 +365,7 @@ static const struct evtchn_ops evtchn_ops_2l = {
 	.mask              = evtchn_2l_mask,
 	.unmask            = evtchn_2l_unmask,
 	.handle_events     = evtchn_2l_handle_events,
+	.resume	           = evtchn_2l_resume,
 };
 
 void __init xen_evtchn_2l_init(void)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 70fba97..26f372a 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1259,6 +1259,7 @@ EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall);
 void rebind_evtchn_irq(int evtchn, int irq)
 {
 	struct irq_info *info = info_for_irq(irq);
+	struct evtchn_bind_vcpu bind_vcpu;
 
 	if (WARN_ON(!info))
 		return;
@@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
 
 	mutex_unlock(&irq_mapping_update_lock);
 
-	/* new event channels are always bound to cpu 0 */
-	irq_set_affinity(irq, cpumask_of(0));
+	bind_vcpu.port = evtchn;
+	bind_vcpu.vcpu = info->cpu;
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
+		bind_evtchn_to_cpu(evtchn, info->cpu);
+	else
+		pr_warn("Failed binding port %d to cpu %d\n",
+			evtchn, info->cpu);
+
+	/* This will be deferred until interrupt is processed */
+	irq_set_affinity(irq, cpumask_of(info->cpu));
 
 	/* Unmask the event channel. */
 	enable_irq(irq);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
  2015-04-29 21:10 ` [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming Boris Ostrovsky
  2015-04-29 21:10 ` Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-05-01 10:46   ` David Vrabel
  2015-05-01 10:46   ` [Xen-devel] " David Vrabel
  2015-04-29 21:10 ` Boris Ostrovsky
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: xen-devel, linux-kernel, annie.li, boris.ostrovsky

After a resume the hypervisor/tools may change xenbus event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v2:
* Use xenstore_domain_type to determine whether to install resume notifier

 drivers/xen/xenbus/xenbus_probe.c |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 564b315..5390a67 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -57,6 +57,7 @@
 #include <xen/xen.h>
 #include <xen/xenbus.h>
 #include <xen/events.h>
+#include <xen/xen-ops.h>
 #include <xen/page.h>
 
 #include <xen/hvm.h>
@@ -735,6 +736,30 @@ static int __init xenstored_local_init(void)
 	return err;
 }
 
+static int xenbus_resume_cb(struct notifier_block *nb,
+			    unsigned long action, void *data)
+{
+	int err = 0;
+
+	if (xen_hvm_domain()) {
+		uint64_t v;
+
+		err = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v);
+		if (!err && v)
+			xen_store_evtchn = v;
+		else
+			pr_warn("Cannot update xenstore event channel: %d\n",
+				err);
+	} else
+		xen_store_evtchn = xen_start_info->store_evtchn;
+
+	return err;
+}
+
+static struct notifier_block xenbus_resume_nb = {
+	.notifier_call = xenbus_resume_cb,
+};
+
 static int __init xenbus_init(void)
 {
 	int err = 0;
@@ -793,6 +818,10 @@ static int __init xenbus_init(void)
 		goto out_error;
 	}
 
+	if ((xen_store_domain_type != XS_LOCAL) &&
+	    (xen_store_domain_type != XS_UNKNOWN))
+		xen_resume_notifier_register(&xenbus_resume_nb);
+
 #ifdef CONFIG_XEN_COMPAT_XENFS
 	/*
 	 * Create xenfs mountpoint in /proc for compatibility with
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
                   ` (2 preceding siblings ...)
  2015-04-29 21:10 ` [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-04-29 21:10 ` [PATCH v2 3/4] xen/console: Update console " Boris Ostrovsky
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: boris.ostrovsky, xen-devel, annie.li, linux-kernel

After a resume the hypervisor/tools may change xenbus event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Changes in v2:
* Use xenstore_domain_type to determine whether to install resume notifier

 drivers/xen/xenbus/xenbus_probe.c |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 564b315..5390a67 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -57,6 +57,7 @@
 #include <xen/xen.h>
 #include <xen/xenbus.h>
 #include <xen/events.h>
+#include <xen/xen-ops.h>
 #include <xen/page.h>
 
 #include <xen/hvm.h>
@@ -735,6 +736,30 @@ static int __init xenstored_local_init(void)
 	return err;
 }
 
+static int xenbus_resume_cb(struct notifier_block *nb,
+			    unsigned long action, void *data)
+{
+	int err = 0;
+
+	if (xen_hvm_domain()) {
+		uint64_t v;
+
+		err = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v);
+		if (!err && v)
+			xen_store_evtchn = v;
+		else
+			pr_warn("Cannot update xenstore event channel: %d\n",
+				err);
+	} else
+		xen_store_evtchn = xen_start_info->store_evtchn;
+
+	return err;
+}
+
+static struct notifier_block xenbus_resume_nb = {
+	.notifier_call = xenbus_resume_cb,
+};
+
 static int __init xenbus_init(void)
 {
 	int err = 0;
@@ -793,6 +818,10 @@ static int __init xenbus_init(void)
 		goto out_error;
 	}
 
+	if ((xen_store_domain_type != XS_LOCAL) &&
+	    (xen_store_domain_type != XS_UNKNOWN))
+		xen_resume_notifier_register(&xenbus_resume_nb);
+
 #ifdef CONFIG_XEN_COMPAT_XENFS
 	/*
 	 * Create xenfs mountpoint in /proc for compatibility with
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/4] xen/console: Update console event channel on resume
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
                   ` (3 preceding siblings ...)
  2015-04-29 21:10 ` Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-04-29 21:10 ` Boris Ostrovsky
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: xen-devel, linux-kernel, annie.li, boris.ostrovsky

After a resume the hypervisor/tools may change console event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/tty/hvc/hvc_xen.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index f1e5742..5bab1c6 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -299,11 +299,27 @@ static int xen_initial_domain_console_init(void)
 	return 0;
 }
 
+static void xen_console_update_evtchn(struct xencons_info *info)
+{
+	if (xen_hvm_domain()) {
+		uint64_t v;
+		int err;
+
+		err = hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
+		if (!err && v)
+			info->evtchn = v;
+	} else
+		info->evtchn = xen_start_info->console.domU.evtchn;
+}
+
 void xen_console_resume(void)
 {
 	struct xencons_info *info = vtermno_to_xencons(HVC_COOKIE);
-	if (info != NULL && info->irq)
+	if (info != NULL && info->irq) {
+		if (!xen_initial_domain())
+			xen_console_update_evtchn(info);
 		rebind_evtchn_irq(info->evtchn, info->irq);
+	}
 }
 
 static void xencons_disconnect_backend(struct xencons_info *info)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/4] xen/console: Update console event channel on resume
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
                   ` (4 preceding siblings ...)
  2015-04-29 21:10 ` [PATCH v2 3/4] xen/console: Update console " Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-04-29 21:10 ` [PATCH v2 4/4] xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq() Boris Ostrovsky
  2015-04-29 21:10 ` Boris Ostrovsky
  7 siblings, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: boris.ostrovsky, xen-devel, annie.li, linux-kernel

After a resume the hypervisor/tools may change console event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/tty/hvc/hvc_xen.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index f1e5742..5bab1c6 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -299,11 +299,27 @@ static int xen_initial_domain_console_init(void)
 	return 0;
 }
 
+static void xen_console_update_evtchn(struct xencons_info *info)
+{
+	if (xen_hvm_domain()) {
+		uint64_t v;
+		int err;
+
+		err = hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
+		if (!err && v)
+			info->evtchn = v;
+	} else
+		info->evtchn = xen_start_info->console.domU.evtchn;
+}
+
 void xen_console_resume(void)
 {
 	struct xencons_info *info = vtermno_to_xencons(HVC_COOKIE);
-	if (info != NULL && info->irq)
+	if (info != NULL && info->irq) {
+		if (!xen_initial_domain())
+			xen_console_update_evtchn(info);
 		rebind_evtchn_irq(info->evtchn, info->irq);
+	}
 }
 
 static void xencons_disconnect_backend(struct xencons_info *info)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/4] xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
                   ` (5 preceding siblings ...)
  2015-04-29 21:10 ` Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  2015-04-29 21:10 ` Boris Ostrovsky
  7 siblings, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: xen-devel, linux-kernel, annie.li, boris.ostrovsky

.. because bind_evtchn_to_cpu(evtchn, cpu) will map evtchn to
'info' and pass 'info' down to xen_evtchn_port_bind_to_cpu().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: Annie Li <annie.li@oracle.com>
---
 drivers/xen/events/events_base.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 26f372a..044cf3d 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -529,8 +529,8 @@ static unsigned int __startup_pirq(unsigned int irq)
 	if (rc)
 		goto err;
 
-	bind_evtchn_to_cpu(evtchn, 0);
 	info->evtchn = evtchn;
+	bind_evtchn_to_cpu(evtchn, 0);
 
 	rc = xen_evtchn_port_setup(info);
 	if (rc)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/4] xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
  2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
                   ` (6 preceding siblings ...)
  2015-04-29 21:10 ` [PATCH v2 4/4] xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq() Boris Ostrovsky
@ 2015-04-29 21:10 ` Boris Ostrovsky
  7 siblings, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-04-29 21:10 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk
  Cc: boris.ostrovsky, xen-devel, annie.li, linux-kernel

.. because bind_evtchn_to_cpu(evtchn, cpu) will map evtchn to
'info' and pass 'info' down to xen_evtchn_port_bind_to_cpu().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: Annie Li <annie.li@oracle.com>
---
 drivers/xen/events/events_base.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 26f372a..044cf3d 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -529,8 +529,8 @@ static unsigned int __startup_pirq(unsigned int irq)
 	if (rc)
 		goto err;
 
-	bind_evtchn_to_cpu(evtchn, 0);
 	info->evtchn = evtchn;
+	bind_evtchn_to_cpu(evtchn, 0);
 
 	rc = xen_evtchn_port_setup(info);
 	if (rc)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume
  2015-04-29 21:10 ` [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume Boris Ostrovsky
  2015-05-01 10:46   ` David Vrabel
@ 2015-05-01 10:46   ` David Vrabel
  1 sibling, 0 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-01 10:46 UTC (permalink / raw)
  To: Boris Ostrovsky, david.vrabel, konrad.wilk
  Cc: xen-devel, annie.li, linux-kernel

On 29/04/15 22:10, Boris Ostrovsky wrote:
> After a resume the hypervisor/tools may change xenbus event
> channel number. We should re-query it.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume
  2015-04-29 21:10 ` [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume Boris Ostrovsky
@ 2015-05-01 10:46   ` David Vrabel
  2015-05-01 10:46   ` [Xen-devel] " David Vrabel
  1 sibling, 0 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-01 10:46 UTC (permalink / raw)
  To: Boris Ostrovsky, david.vrabel, konrad.wilk
  Cc: xen-devel, annie.li, linux-kernel

On 29/04/15 22:10, Boris Ostrovsky wrote:
> After a resume the hypervisor/tools may change xenbus event
> channel number. We should re-query it.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-04-29 21:10 ` Boris Ostrovsky
  2015-05-01 10:46   ` David Vrabel
@ 2015-05-01 10:46   ` David Vrabel
  2015-05-01 13:39     ` Boris Ostrovsky
  2015-05-01 13:39     ` [Xen-devel] " Boris Ostrovsky
  1 sibling, 2 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-01 10:46 UTC (permalink / raw)
  To: Boris Ostrovsky, david.vrabel, konrad.wilk
  Cc: xen-devel, annie.li, linux-kernel

On 29/04/15 22:10, Boris Ostrovsky wrote:
> When a guest is resumed, the hypervisor may change event channel
> assignments. If this happens and the guest uses 2-level events it
> is possible for the interrupt to be claimed by wrong VCPU since
> cpu_evtchn_mask bits may be stale. This can happen even though
> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
> is passed in is not necessarily the original one (from pre-migration
> times) but instead is freshly allocated during resume and so any
> information about which CPU the channel was bound to is lost.
> 
> Thus we should clear the mask during resume.
> 
> We also need to make sure that bits for xenstore and console channels
> are set when these two subsystems are resumed. While rebind_evtchn_irq()
> (which is invoked for both of them on a resume) calls irq_set_affinity(),
> the latter will in fact postpone setting affinity until handling the
> interrupt. But because cpu_evtchn_mask will have bits for these two
> cleared we won't be able to take the interrupt.
> 
> With that in mind, we need to bind those two channels explicitly in
> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
> pass through generic irq affinity code later, in case something needs
> to be updated there as well.
> 
> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
> rebind_evtchn_irq(): it should be set to zero in preceding
> xen_irq_info_evtchn_setup().)
[...]
> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>  
>  	mutex_unlock(&irq_mapping_update_lock);
>  
> -	/* new event channels are always bound to cpu 0 */
> -	irq_set_affinity(irq, cpumask_of(0));
> +	bind_vcpu.port = evtchn;
> +	bind_vcpu.vcpu = info->cpu;
> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
> +		bind_evtchn_to_cpu(evtchn, info->cpu);

Isn't the hypercall is unnecessary since this is a new event channel
it's already bound to VCPU 0 and info->cpu == 0?

I think only the bind_evtchn_to_cpu() call is needed here.

If you agree I can remove the hypercall and apply this series.

> +	else
> +		pr_warn("Failed binding port %d to cpu %d\n",
> +			evtchn, info->cpu);
> +
> +	/* This will be deferred until interrupt is processed */
> +	irq_set_affinity(irq, cpumask_of(info->cpu));

David


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-04-29 21:10 ` Boris Ostrovsky
@ 2015-05-01 10:46   ` David Vrabel
  2015-05-01 10:46   ` [Xen-devel] " David Vrabel
  1 sibling, 0 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-01 10:46 UTC (permalink / raw)
  To: Boris Ostrovsky, david.vrabel, konrad.wilk
  Cc: xen-devel, annie.li, linux-kernel

On 29/04/15 22:10, Boris Ostrovsky wrote:
> When a guest is resumed, the hypervisor may change event channel
> assignments. If this happens and the guest uses 2-level events it
> is possible for the interrupt to be claimed by wrong VCPU since
> cpu_evtchn_mask bits may be stale. This can happen even though
> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
> is passed in is not necessarily the original one (from pre-migration
> times) but instead is freshly allocated during resume and so any
> information about which CPU the channel was bound to is lost.
> 
> Thus we should clear the mask during resume.
> 
> We also need to make sure that bits for xenstore and console channels
> are set when these two subsystems are resumed. While rebind_evtchn_irq()
> (which is invoked for both of them on a resume) calls irq_set_affinity(),
> the latter will in fact postpone setting affinity until handling the
> interrupt. But because cpu_evtchn_mask will have bits for these two
> cleared we won't be able to take the interrupt.
> 
> With that in mind, we need to bind those two channels explicitly in
> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
> pass through generic irq affinity code later, in case something needs
> to be updated there as well.
> 
> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
> rebind_evtchn_irq(): it should be set to zero in preceding
> xen_irq_info_evtchn_setup().)
[...]
> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>  
>  	mutex_unlock(&irq_mapping_update_lock);
>  
> -	/* new event channels are always bound to cpu 0 */
> -	irq_set_affinity(irq, cpumask_of(0));
> +	bind_vcpu.port = evtchn;
> +	bind_vcpu.vcpu = info->cpu;
> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
> +		bind_evtchn_to_cpu(evtchn, info->cpu);

Isn't the hypercall is unnecessary since this is a new event channel
it's already bound to VCPU 0 and info->cpu == 0?

I think only the bind_evtchn_to_cpu() call is needed here.

If you agree I can remove the hypercall and apply this series.

> +	else
> +		pr_warn("Failed binding port %d to cpu %d\n",
> +			evtchn, info->cpu);
> +
> +	/* This will be deferred until interrupt is processed */
> +	irq_set_affinity(irq, cpumask_of(info->cpu));

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-05-01 10:46   ` [Xen-devel] " David Vrabel
  2015-05-01 13:39     ` Boris Ostrovsky
@ 2015-05-01 13:39     ` Boris Ostrovsky
  2015-05-01 15:25       ` David Vrabel
  2015-05-01 15:25       ` [Xen-devel] " David Vrabel
  1 sibling, 2 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-05-01 13:39 UTC (permalink / raw)
  To: David Vrabel, konrad.wilk; +Cc: xen-devel, annie.li, linux-kernel

On 05/01/2015 06:46 AM, David Vrabel wrote:
> On 29/04/15 22:10, Boris Ostrovsky wrote:
>> When a guest is resumed, the hypervisor may change event channel
>> assignments. If this happens and the guest uses 2-level events it
>> is possible for the interrupt to be claimed by wrong VCPU since
>> cpu_evtchn_mask bits may be stale. This can happen even though
>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>> is passed in is not necessarily the original one (from pre-migration
>> times) but instead is freshly allocated during resume and so any
>> information about which CPU the channel was bound to is lost.
>>
>> Thus we should clear the mask during resume.
>>
>> We also need to make sure that bits for xenstore and console channels
>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>> (which is invoked for both of them on a resume) calls irq_set_affinity(),
>> the latter will in fact postpone setting affinity until handling the
>> interrupt. But because cpu_evtchn_mask will have bits for these two
>> cleared we won't be able to take the interrupt.
>>
>> With that in mind, we need to bind those two channels explicitly in
>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>> pass through generic irq affinity code later, in case something needs
>> to be updated there as well.
>>
>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>> rebind_evtchn_irq(): it should be set to zero in preceding
>> xen_irq_info_evtchn_setup().)
> [...]
>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>   
>>   	mutex_unlock(&irq_mapping_update_lock);
>>   
>> -	/* new event channels are always bound to cpu 0 */
>> -	irq_set_affinity(irq, cpumask_of(0));
>> +	bind_vcpu.port = evtchn;
>> +	bind_vcpu.vcpu = info->cpu;
>> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
>> +		bind_evtchn_to_cpu(evtchn, info->cpu);
> Isn't the hypercall is unnecessary since this is a new event channel
> it's already bound to VCPU 0 and info->cpu == 0?
>
> I think only the bind_evtchn_to_cpu() call is needed here.


True. However, I added the hypercall here to make the routine 
independent of what happens in other parts (hypervisor binding new 
channels to cpu0, xen_irq_info_evtchn_setup() initializing to zero, 
etc.). This way, if either of these two change in the future (unlikely, 
but possible) this routine will still work as expected.

That's why I also replaced cpumask_of(0) with cpumask_of(info->cpu) in 
irq_set_affinity() call.

-boris


>
> If you agree I can remove the hypercall and apply this series.
>
>> +	else
>> +		pr_warn("Failed binding port %d to cpu %d\n",
>> +			evtchn, info->cpu);
>> +
>> +	/* This will be deferred until interrupt is processed */
>> +	irq_set_affinity(irq, cpumask_of(info->cpu));
> David
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-05-01 10:46   ` [Xen-devel] " David Vrabel
@ 2015-05-01 13:39     ` Boris Ostrovsky
  2015-05-01 13:39     ` [Xen-devel] " Boris Ostrovsky
  1 sibling, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-05-01 13:39 UTC (permalink / raw)
  To: David Vrabel, konrad.wilk; +Cc: xen-devel, annie.li, linux-kernel

On 05/01/2015 06:46 AM, David Vrabel wrote:
> On 29/04/15 22:10, Boris Ostrovsky wrote:
>> When a guest is resumed, the hypervisor may change event channel
>> assignments. If this happens and the guest uses 2-level events it
>> is possible for the interrupt to be claimed by wrong VCPU since
>> cpu_evtchn_mask bits may be stale. This can happen even though
>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>> is passed in is not necessarily the original one (from pre-migration
>> times) but instead is freshly allocated during resume and so any
>> information about which CPU the channel was bound to is lost.
>>
>> Thus we should clear the mask during resume.
>>
>> We also need to make sure that bits for xenstore and console channels
>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>> (which is invoked for both of them on a resume) calls irq_set_affinity(),
>> the latter will in fact postpone setting affinity until handling the
>> interrupt. But because cpu_evtchn_mask will have bits for these two
>> cleared we won't be able to take the interrupt.
>>
>> With that in mind, we need to bind those two channels explicitly in
>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>> pass through generic irq affinity code later, in case something needs
>> to be updated there as well.
>>
>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>> rebind_evtchn_irq(): it should be set to zero in preceding
>> xen_irq_info_evtchn_setup().)
> [...]
>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>   
>>   	mutex_unlock(&irq_mapping_update_lock);
>>   
>> -	/* new event channels are always bound to cpu 0 */
>> -	irq_set_affinity(irq, cpumask_of(0));
>> +	bind_vcpu.port = evtchn;
>> +	bind_vcpu.vcpu = info->cpu;
>> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
>> +		bind_evtchn_to_cpu(evtchn, info->cpu);
> Isn't the hypercall is unnecessary since this is a new event channel
> it's already bound to VCPU 0 and info->cpu == 0?
>
> I think only the bind_evtchn_to_cpu() call is needed here.


True. However, I added the hypercall here to make the routine 
independent of what happens in other parts (hypervisor binding new 
channels to cpu0, xen_irq_info_evtchn_setup() initializing to zero, 
etc.). This way, if either of these two change in the future (unlikely, 
but possible) this routine will still work as expected.

That's why I also replaced cpumask_of(0) with cpumask_of(info->cpu) in 
irq_set_affinity() call.

-boris


>
> If you agree I can remove the hypercall and apply this series.
>
>> +	else
>> +		pr_warn("Failed binding port %d to cpu %d\n",
>> +			evtchn, info->cpu);
>> +
>> +	/* This will be deferred until interrupt is processed */
>> +	irq_set_affinity(irq, cpumask_of(info->cpu));
> David
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-05-01 13:39     ` [Xen-devel] " Boris Ostrovsky
  2015-05-01 15:25       ` David Vrabel
@ 2015-05-01 15:25       ` David Vrabel
  2015-05-01 16:03         ` Boris Ostrovsky
  2015-05-01 16:03         ` [Xen-devel] " Boris Ostrovsky
  1 sibling, 2 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-01 15:25 UTC (permalink / raw)
  To: Boris Ostrovsky, David Vrabel, konrad.wilk
  Cc: xen-devel, annie.li, linux-kernel

On 01/05/15 14:39, Boris Ostrovsky wrote:
> On 05/01/2015 06:46 AM, David Vrabel wrote:
>> On 29/04/15 22:10, Boris Ostrovsky wrote:
>>> When a guest is resumed, the hypervisor may change event channel
>>> assignments. If this happens and the guest uses 2-level events it
>>> is possible for the interrupt to be claimed by wrong VCPU since
>>> cpu_evtchn_mask bits may be stale. This can happen even though
>>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>>> is passed in is not necessarily the original one (from pre-migration
>>> times) but instead is freshly allocated during resume and so any
>>> information about which CPU the channel was bound to is lost.
>>>
>>> Thus we should clear the mask during resume.
>>>
>>> We also need to make sure that bits for xenstore and console channels
>>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>>> (which is invoked for both of them on a resume) calls
>>> irq_set_affinity(),
>>> the latter will in fact postpone setting affinity until handling the
>>> interrupt. But because cpu_evtchn_mask will have bits for these two
>>> cleared we won't be able to take the interrupt.
>>>
>>> With that in mind, we need to bind those two channels explicitly in
>>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>>> pass through generic irq affinity code later, in case something needs
>>> to be updated there as well.
>>>
>>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>>> rebind_evtchn_irq(): it should be set to zero in preceding
>>> xen_irq_info_evtchn_setup().)
>> [...]
>>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>>         mutex_unlock(&irq_mapping_update_lock);
>>>   -    /* new event channels are always bound to cpu 0 */
>>> -    irq_set_affinity(irq, cpumask_of(0));
>>> +    bind_vcpu.port = evtchn;
>>> +    bind_vcpu.vcpu = info->cpu;
>>> +    if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu)
>>> == 0)
>>> +        bind_evtchn_to_cpu(evtchn, info->cpu);
>> Isn't the hypercall is unnecessary since this is a new event channel
>> it's already bound to VCPU 0 and info->cpu == 0?
>>
>> I think only the bind_evtchn_to_cpu() call is needed here.
> 
> 
> True. However, I added the hypercall here to make the routine
> independent of what happens in other parts (hypervisor binding new
> channels to cpu0, xen_irq_info_evtchn_setup() initializing to zero,
> etc.). This way, if either of these two change in the future (unlikely,
> but possible) this routine will still work as expected.

New event channels being bound to VCPU0 is part of the ABI and cannot
change, so I've taken the hypercall and its dodgy looking error path out.

I've applied this series to for-linus-4.1b but before I push, do any of
these need to be tagged for stable?

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-05-01 13:39     ` [Xen-devel] " Boris Ostrovsky
@ 2015-05-01 15:25       ` David Vrabel
  2015-05-01 15:25       ` [Xen-devel] " David Vrabel
  1 sibling, 0 replies; 19+ messages in thread
From: David Vrabel @ 2015-05-01 15:25 UTC (permalink / raw)
  To: Boris Ostrovsky, David Vrabel, konrad.wilk
  Cc: xen-devel, annie.li, linux-kernel

On 01/05/15 14:39, Boris Ostrovsky wrote:
> On 05/01/2015 06:46 AM, David Vrabel wrote:
>> On 29/04/15 22:10, Boris Ostrovsky wrote:
>>> When a guest is resumed, the hypervisor may change event channel
>>> assignments. If this happens and the guest uses 2-level events it
>>> is possible for the interrupt to be claimed by wrong VCPU since
>>> cpu_evtchn_mask bits may be stale. This can happen even though
>>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>>> is passed in is not necessarily the original one (from pre-migration
>>> times) but instead is freshly allocated during resume and so any
>>> information about which CPU the channel was bound to is lost.
>>>
>>> Thus we should clear the mask during resume.
>>>
>>> We also need to make sure that bits for xenstore and console channels
>>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>>> (which is invoked for both of them on a resume) calls
>>> irq_set_affinity(),
>>> the latter will in fact postpone setting affinity until handling the
>>> interrupt. But because cpu_evtchn_mask will have bits for these two
>>> cleared we won't be able to take the interrupt.
>>>
>>> With that in mind, we need to bind those two channels explicitly in
>>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>>> pass through generic irq affinity code later, in case something needs
>>> to be updated there as well.
>>>
>>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>>> rebind_evtchn_irq(): it should be set to zero in preceding
>>> xen_irq_info_evtchn_setup().)
>> [...]
>>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>>         mutex_unlock(&irq_mapping_update_lock);
>>>   -    /* new event channels are always bound to cpu 0 */
>>> -    irq_set_affinity(irq, cpumask_of(0));
>>> +    bind_vcpu.port = evtchn;
>>> +    bind_vcpu.vcpu = info->cpu;
>>> +    if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu)
>>> == 0)
>>> +        bind_evtchn_to_cpu(evtchn, info->cpu);
>> Isn't the hypercall is unnecessary since this is a new event channel
>> it's already bound to VCPU 0 and info->cpu == 0?
>>
>> I think only the bind_evtchn_to_cpu() call is needed here.
> 
> 
> True. However, I added the hypercall here to make the routine
> independent of what happens in other parts (hypervisor binding new
> channels to cpu0, xen_irq_info_evtchn_setup() initializing to zero,
> etc.). This way, if either of these two change in the future (unlikely,
> but possible) this routine will still work as expected.

New event channels being bound to VCPU0 is part of the ABI and cannot
change, so I've taken the hypercall and its dodgy looking error path out.

I've applied this series to for-linus-4.1b but before I push, do any of
these need to be tagged for stable?

David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-05-01 15:25       ` [Xen-devel] " David Vrabel
  2015-05-01 16:03         ` Boris Ostrovsky
@ 2015-05-01 16:03         ` Boris Ostrovsky
  1 sibling, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-05-01 16:03 UTC (permalink / raw)
  To: David Vrabel, konrad.wilk; +Cc: xen-devel, annie.li, linux-kernel

On 05/01/2015 11:25 AM, David Vrabel wrote:
> On 01/05/15 14:39, Boris Ostrovsky wrote:
>> On 05/01/2015 06:46 AM, David Vrabel wrote:
>>> On 29/04/15 22:10, Boris Ostrovsky wrote:
>>>> When a guest is resumed, the hypervisor may change event channel
>>>> assignments. If this happens and the guest uses 2-level events it
>>>> is possible for the interrupt to be claimed by wrong VCPU since
>>>> cpu_evtchn_mask bits may be stale. This can happen even though
>>>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>>>> is passed in is not necessarily the original one (from pre-migration
>>>> times) but instead is freshly allocated during resume and so any
>>>> information about which CPU the channel was bound to is lost.
>>>>
>>>> Thus we should clear the mask during resume.
>>>>
>>>> We also need to make sure that bits for xenstore and console channels
>>>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>>>> (which is invoked for both of them on a resume) calls
>>>> irq_set_affinity(),
>>>> the latter will in fact postpone setting affinity until handling the
>>>> interrupt. But because cpu_evtchn_mask will have bits for these two
>>>> cleared we won't be able to take the interrupt.
>>>>
>>>> With that in mind, we need to bind those two channels explicitly in
>>>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>>>> pass through generic irq affinity code later, in case something needs
>>>> to be updated there as well.
>>>>
>>>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>>>> rebind_evtchn_irq(): it should be set to zero in preceding
>>>> xen_irq_info_evtchn_setup().)
>>> [...]
>>>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>>>          mutex_unlock(&irq_mapping_update_lock);
>>>>    -    /* new event channels are always bound to cpu 0 */
>>>> -    irq_set_affinity(irq, cpumask_of(0));
>>>> +    bind_vcpu.port = evtchn;
>>>> +    bind_vcpu.vcpu = info->cpu;
>>>> +    if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu)
>>>> == 0)
>>>> +        bind_evtchn_to_cpu(evtchn, info->cpu);
>>> Isn't the hypercall is unnecessary since this is a new event channel
>>> it's already bound to VCPU 0 and info->cpu == 0?
>>>
>>> I think only the bind_evtchn_to_cpu() call is needed here.
>>
>> True. However, I added the hypercall here to make the routine
>> independent of what happens in other parts (hypervisor binding new
>> channels to cpu0, xen_irq_info_evtchn_setup() initializing to zero,
>> etc.). This way, if either of these two change in the future (unlikely,
>> but possible) this routine will still work as expected.
> New event channels being bound to VCPU0 is part of the ABI and cannot
> change, so I've taken the hypercall and its dodgy looking error path out.
>
> I've applied this series to for-linus-4.1b but before I push, do any of
> these need to be tagged for stable?

Yes, all of them need to. The first one can probably be easily 
backported starting from whenever FIFO events went in (3.14?) but the 
rest could go back all the way to 3.2.

-boris


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming
  2015-05-01 15:25       ` [Xen-devel] " David Vrabel
@ 2015-05-01 16:03         ` Boris Ostrovsky
  2015-05-01 16:03         ` [Xen-devel] " Boris Ostrovsky
  1 sibling, 0 replies; 19+ messages in thread
From: Boris Ostrovsky @ 2015-05-01 16:03 UTC (permalink / raw)
  To: David Vrabel, konrad.wilk; +Cc: xen-devel, annie.li, linux-kernel

On 05/01/2015 11:25 AM, David Vrabel wrote:
> On 01/05/15 14:39, Boris Ostrovsky wrote:
>> On 05/01/2015 06:46 AM, David Vrabel wrote:
>>> On 29/04/15 22:10, Boris Ostrovsky wrote:
>>>> When a guest is resumed, the hypervisor may change event channel
>>>> assignments. If this happens and the guest uses 2-level events it
>>>> is possible for the interrupt to be claimed by wrong VCPU since
>>>> cpu_evtchn_mask bits may be stale. This can happen even though
>>>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>>>> is passed in is not necessarily the original one (from pre-migration
>>>> times) but instead is freshly allocated during resume and so any
>>>> information about which CPU the channel was bound to is lost.
>>>>
>>>> Thus we should clear the mask during resume.
>>>>
>>>> We also need to make sure that bits for xenstore and console channels
>>>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>>>> (which is invoked for both of them on a resume) calls
>>>> irq_set_affinity(),
>>>> the latter will in fact postpone setting affinity until handling the
>>>> interrupt. But because cpu_evtchn_mask will have bits for these two
>>>> cleared we won't be able to take the interrupt.
>>>>
>>>> With that in mind, we need to bind those two channels explicitly in
>>>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>>>> pass through generic irq affinity code later, in case something needs
>>>> to be updated there as well.
>>>>
>>>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>>>> rebind_evtchn_irq(): it should be set to zero in preceding
>>>> xen_irq_info_evtchn_setup().)
>>> [...]
>>>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>>>          mutex_unlock(&irq_mapping_update_lock);
>>>>    -    /* new event channels are always bound to cpu 0 */
>>>> -    irq_set_affinity(irq, cpumask_of(0));
>>>> +    bind_vcpu.port = evtchn;
>>>> +    bind_vcpu.vcpu = info->cpu;
>>>> +    if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu)
>>>> == 0)
>>>> +        bind_evtchn_to_cpu(evtchn, info->cpu);
>>> Isn't the hypercall is unnecessary since this is a new event channel
>>> it's already bound to VCPU 0 and info->cpu == 0?
>>>
>>> I think only the bind_evtchn_to_cpu() call is needed here.
>>
>> True. However, I added the hypercall here to make the routine
>> independent of what happens in other parts (hypervisor binding new
>> channels to cpu0, xen_irq_info_evtchn_setup() initializing to zero,
>> etc.). This way, if either of these two change in the future (unlikely,
>> but possible) this routine will still work as expected.
> New event channels being bound to VCPU0 is part of the ABI and cannot
> change, so I've taken the hypercall and its dodgy looking error path out.
>
> I've applied this series to for-linus-4.1b but before I push, do any of
> these need to be tagged for stable?

Yes, all of them need to. The first one can probably be easily 
backported starting from whenever FIFO events went in (3.14?) but the 
rest could go back all the way to 3.2.

-boris

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-05-01 16:05 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-29 21:10 [PATCH v2 0/4] A few event channel-related fixes Boris Ostrovsky
2015-04-29 21:10 ` [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask before resuming Boris Ostrovsky
2015-04-29 21:10 ` Boris Ostrovsky
2015-05-01 10:46   ` David Vrabel
2015-05-01 10:46   ` [Xen-devel] " David Vrabel
2015-05-01 13:39     ` Boris Ostrovsky
2015-05-01 13:39     ` [Xen-devel] " Boris Ostrovsky
2015-05-01 15:25       ` David Vrabel
2015-05-01 15:25       ` [Xen-devel] " David Vrabel
2015-05-01 16:03         ` Boris Ostrovsky
2015-05-01 16:03         ` [Xen-devel] " Boris Ostrovsky
2015-04-29 21:10 ` [PATCH v2 2/4] xen/xenbus: Update xenbus event channel on resume Boris Ostrovsky
2015-05-01 10:46   ` David Vrabel
2015-05-01 10:46   ` [Xen-devel] " David Vrabel
2015-04-29 21:10 ` Boris Ostrovsky
2015-04-29 21:10 ` [PATCH v2 3/4] xen/console: Update console " Boris Ostrovsky
2015-04-29 21:10 ` Boris Ostrovsky
2015-04-29 21:10 ` [PATCH v2 4/4] xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq() Boris Ostrovsky
2015-04-29 21:10 ` Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.