* [PATCHv2 0/4] Linux: fix some (more) event handling bugs
@ 2013-08-15 12:21 David Vrabel
  2013-08-15 12:21 ` [PATCH 1/4] x86/xen: disable preemption when enabling local irqs David Vrabel
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: David Vrabel @ 2013-08-15 12:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel

This series fixes a critical bug where a newly bound interdomain event
could be lost, plus a couple of other issues I found when looking
through the code.

Patches 1, 3 and 4 should be considered for stable.

Changes in v2:
- drop the memory barrier change.  The existing compiler barriers are fine.
- fix two bugs in the binding of events to VCPUs.

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] x86/xen: disable preemption when enabling local irqs
  2013-08-15 12:21 [PATCHv2 0/4] Linux: fix some (more) event handling bugs David Vrabel
@ 2013-08-15 12:21 ` David Vrabel
  2013-08-15 12:21 ` [PATCH 2/4] xen/events: document behaviour when scanning the start word for events David Vrabel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: David Vrabel @ 2013-08-15 12:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel

From: David Vrabel <david.vrabel@citrix.com>

If CONFIG_PREEMPT is enabled then xen_irq_enable() (and
xen_restore_fl()) could be preempted and rescheduled on a different
VCPU in between the clear of the mask and the check for pending
events.  This may result in events being lost as the upcall will check
for pending events on the wrong VCPU.

Fix this by disabling preemption around the unmask and check for
events.
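
To make the race concrete, here is a minimal standalone model
(hypothetical types and helpers, not the kernel's): if the pending
check can run against a different VCPU than the one whose mask was
cleared, a pending event is never noticed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct vcpu_info; illustrative only. */
struct vcpu_info {
	bool evtchn_upcall_mask;
	bool evtchn_upcall_pending;
};

/*
 * Model of xen_irq_enable(): clear the mask on 'unmask_vcpu', then
 * check for pending events on 'check_vcpu'.  With preemption enabled
 * these can be different VCPUs (the task migrated in between), and a
 * pending event on unmask_vcpu is then never acted upon.
 */
bool irq_enable_sees_event(struct vcpu_info *unmask_vcpu,
			   struct vcpu_info *check_vcpu)
{
	unmask_vcpu->evtchn_upcall_mask = false;
	return check_vcpu->evtchn_upcall_pending; /* would force the upcall */
}
```

With preemption disabled, both steps necessarily use the same VCPU and
the pending event is seen.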

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/irq.c |   25 ++++++++++++-------------
 1 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 01a4dc0..0da7f86 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -47,23 +47,18 @@ static void xen_restore_fl(unsigned long flags)
 	/* convert from IF type flag */
 	flags = !(flags & X86_EFLAGS_IF);
 
-	/* There's a one instruction preempt window here.  We need to
-	   make sure we're don't switch CPUs between getting the vcpu
-	   pointer and updating the mask. */
+	/* See xen_irq_enable() for why preemption must be disabled. */
 	preempt_disable();
 	vcpu = this_cpu_read(xen_vcpu);
 	vcpu->evtchn_upcall_mask = flags;
-	preempt_enable_no_resched();
-
-	/* Doesn't matter if we get preempted here, because any
-	   pending event will get dealt with anyway. */
 
 	if (flags == 0) {
-		preempt_check_resched();
 		barrier(); /* unmask then check (avoid races) */
 		if (unlikely(vcpu->evtchn_upcall_pending))
 			xen_force_evtchn_callback();
-	}
+		preempt_enable();
+	} else
+		preempt_enable_no_resched();
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_restore_fl);
 
@@ -82,10 +77,12 @@ static void xen_irq_enable(void)
 {
 	struct vcpu_info *vcpu;
 
-	/* We don't need to worry about being preempted here, since
-	   either a) interrupts are disabled, so no preemption, or b)
-	   the caller is confused and is trying to re-enable interrupts
-	   on an indeterminate processor. */
+	/*
+	 * We may be preempted as soon as vcpu->evtchn_upcall_mask is
+	 * cleared, so disable preemption to ensure we check for
+	 * events on the VCPU we are still running on.
+	 */
+	preempt_disable();
 
 	vcpu = this_cpu_read(xen_vcpu);
 	vcpu->evtchn_upcall_mask = 0;
@@ -96,6 +93,8 @@ static void xen_irq_enable(void)
 	barrier(); /* unmask then check (avoid races) */
 	if (unlikely(vcpu->evtchn_upcall_pending))
 		xen_force_evtchn_callback();
+
+	preempt_enable();
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
 
-- 
1.7.2.5


* [PATCH 2/4] xen/events: document behaviour when scanning the start word for events
  2013-08-15 12:21 [PATCHv2 0/4] Linux: fix some (more) event handling bugs David Vrabel
  2013-08-15 12:21 ` [PATCH 1/4] x86/xen: disable preemption when enabling local irqs David Vrabel
@ 2013-08-15 12:21 ` David Vrabel
  2013-08-15 14:10   ` Boris Ostrovsky
  2013-08-15 12:21 ` [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events David Vrabel
  2013-08-15 12:21 ` [PATCH 4/4] xen/events: mask events when changing their VCPU binding David Vrabel
  3 siblings, 1 reply; 11+ messages in thread
From: David Vrabel @ 2013-08-15 12:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel

From: David Vrabel <david.vrabel@citrix.com>

The original comment on the scanning of the start word on the 2nd pass
did not reflect the actual behaviour (the code was incorrectly masking
bit_idx instead of the pending word itself).

The documented behaviour is not actually required since if events were
pending in the MSBs, they would be immediately scanned anyway as we go
through the loop again.

Update the documentation to reflect this (instead of trying to change
the behaviour).
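
A sketch of the scan order being documented (simplified to a single
32-bit word; the names and widths here are illustrative, not the
driver's):

```c
#include <assert.h>

#define WORD_BITS 32

/*
 * Count events found in 'pending', scanning from 'bit_idx' upward.
 *
 * 1st pass over the start word: bit_idx = start_bit_idx, so only the
 * upper bits are scanned.  2nd pass: bit_idx = 0, so the whole word is
 * scanned; an event that became pending again in the already-scanned
 * lower bits would be picked up on the next loop anyway, so there is
 * no need to mask them off.
 */
int scan_from(unsigned int pending, unsigned int bit_idx)
{
	int found = 0;

	for (; bit_idx < WORD_BITS; bit_idx++)
		if (pending & (1u << bit_idx))
			found++;
	return found;
}
```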

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/events.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index a58ac43..f866f50 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -1379,14 +1379,21 @@ static void __xen_evtchn_do_upcall(void)
 
 			pending_bits = active_evtchns(cpu, s, word_idx);
 			bit_idx = 0; /* usually scan entire word from start */
+			/*
+			 * We scan the starting word in two parts.
+			 *
+			 * 1st time: start in the middle, scanning the
+			 * MSBs.
+			 *
+			 * 2nd time: scan the whole word (not just the
+			 * parts skipped in the first pass) -- if an
+			 * event in the previously scanned bits is
+			 * pending again it would just be scanned on
+			 * the next loop anyway.
+			 */
 			if (word_idx == start_word_idx) {
-				/* We scan the starting word in two parts */
 				if (i == 0)
-					/* 1st time: start in the middle */
 					bit_idx = start_bit_idx;
-				else
-					/* 2nd time: mask bits done already */
-					bit_idx &= (1UL << start_bit_idx) - 1;
 			}
 
 			do {
-- 
1.7.2.5


* [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events
  2013-08-15 12:21 [PATCHv2 0/4] Linux: fix some (more) event handling bugs David Vrabel
  2013-08-15 12:21 ` [PATCH 1/4] x86/xen: disable preemption when enabling local irqs David Vrabel
  2013-08-15 12:21 ` [PATCH 2/4] xen/events: document behaviour when scanning the start word for events David Vrabel
@ 2013-08-15 12:21 ` David Vrabel
  2013-08-15 12:44   ` Jan Beulich
  2013-08-15 12:21 ` [PATCH 4/4] xen/events: mask events when changing their VCPU binding David Vrabel
  3 siblings, 1 reply; 11+ messages in thread
From: David Vrabel @ 2013-08-15 12:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel

From: David Vrabel <david.vrabel@citrix.com>

The sizeof() argument in init_evtchn_cpu_bindings() is incorrect,
resulting in only the first 64 (or 32 in 32-bit guests) ports having
their bindings initialized to VCPU 0.

In most cases this does not cause a problem as request_irq() will set
the irq affinity, which will set the correct local per-cpu mask.
However, if request_irq() is called on a VCPU other than 0, there
is a window between the unmasking of the event and the affinity being
set where an event may be lost because it is not locally unmasked on
any VCPU. If request_irq() is called on VCPU 0 then local irqs are
disabled during the window and the race does not occur.

Fix this by initializing all NR_EVENT_CHANNELS bits in the local
per-cpu masks.
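
The sizeof() mistake is easy to reproduce standalone (the
NR_EVENT_CHANNELS value and names here are illustrative, without the
kernel's per-cpu machinery):

```c
#include <assert.h>
#include <string.h>

#define NR_EVENT_CHANNELS 4096

unsigned long cpu_evtchn_mask[NR_EVENT_CHANNELS / (8 * sizeof(unsigned long))];

/* Buggy: *p is a single unsigned long, so only one word is set. */
size_t buggy_init(void)
{
	unsigned long *p = cpu_evtchn_mask;

	memset(p, 0xff, sizeof(*p));
	return sizeof(*p);		/* bytes actually initialized */
}

/* Fixed: NR_EVENT_CHANNELS bits is NR_EVENT_CHANNELS/8 bytes. */
size_t fixed_init(void)
{
	memset(cpu_evtchn_mask, 0xff, NR_EVENT_CHANNELS / 8);
	return NR_EVENT_CHANNELS / 8;
}
```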

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/events.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index f866f50..fb88e32 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -348,7 +348,7 @@ static void init_evtchn_cpu_bindings(void)
 
 	for_each_possible_cpu(i)
 		memset(per_cpu(cpu_evtchn_mask, i),
-		       (i == 0) ? ~0 : 0, sizeof(*per_cpu(cpu_evtchn_mask, i)));
+		       (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8);
 }
 
 static inline void clear_evtchn(int port)
-- 
1.7.2.5


* [PATCH 4/4] xen/events: mask events when changing their VCPU binding
  2013-08-15 12:21 [PATCHv2 0/4] Linux: fix some (more) event handling bugs David Vrabel
                   ` (2 preceding siblings ...)
  2013-08-15 12:21 ` [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events David Vrabel
@ 2013-08-15 12:21 ` David Vrabel
  2013-08-15 12:48   ` Jan Beulich
  3 siblings, 1 reply; 11+ messages in thread
From: David Vrabel @ 2013-08-15 12:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel

From: David Vrabel <david.vrabel@citrix.com>

When an event is being bound to a VCPU there is a window between the
EVTCHNOP_bind_vcpu call and the adjustment of the local per-cpu masks
where an event may be lost.  The hypervisor upcalls the new VCPU but
the kernel thinks the event is still bound to the old VCPU and
ignores it.

There is even a problem when the event is being bound to the same VCPU,
as there is a small window between the clear_bit() and set_bit() calls
in bind_evtchn_to_cpu().  When scanning for pending events, the kernel
may read the bit when it is momentarily clear and ignore the event.

Avoid this by masking the event during the whole bind operation.
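
The resulting mask-around-rebind pattern, sketched standalone (plain
bit operations stand in for the kernel's sync_test_and_set_bit() and
unmask_evtchn(); single-word mask, illustrative only):

```c
#include <assert.h>
#include <stdbool.h>

unsigned long evtchn_mask;	/* one word of the shared mask, toy model */

/* Mask the port; return whether it was already masked. */
bool test_and_set_mask(int port)
{
	bool was_masked = evtchn_mask & (1UL << port);

	evtchn_mask |= 1UL << port;
	return was_masked;
}

void unmask(int port)
{
	evtchn_mask &= ~(1UL << port);
}

/*
 * Rebind with the event masked so it cannot be delivered on an
 * unexpected VCPU, then restore the previous mask state: only unmask
 * if we were the ones who masked it.
 */
void rebind(int port)
{
	bool masked = test_and_set_mask(port);

	/* EVTCHNOP_bind_vcpu + bind_evtchn_to_cpu() would go here. */

	if (!masked)
		unmask(port);
}
```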

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/events.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index fb88e32..d404c7d 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -1500,8 +1500,10 @@ void rebind_evtchn_irq(int evtchn, int irq)
 /* Rebind an evtchn so that it gets delivered to a specific cpu */
 static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 {
+	struct shared_info *s = HYPERVISOR_shared_info;
 	struct evtchn_bind_vcpu bind_vcpu;
 	int evtchn = evtchn_from_irq(irq);
+	int masked;
 
 	if (!VALID_EVTCHN(evtchn))
 		return -1;
@@ -1518,6 +1520,12 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 	bind_vcpu.vcpu = tcpu;
 
 	/*
+	 * Mask the event while changing the VCPU binding to prevent
+	 * it being delivered on an unexpected VCPU.
+	 */
+	masked = sync_test_and_set_bit(evtchn, BM(s->evtchn_mask));
+
+	/*
 	 * If this fails, it usually just indicates that we're dealing with a
 	 * virq or IPI channel, which don't actually need to be rebound. Ignore
 	 * it, but don't do the xenlinux-level rebind in that case.
@@ -1525,6 +1533,9 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
 		bind_evtchn_to_cpu(evtchn, tcpu);
 
+	if (!masked)
+		unmask_evtchn(evtchn);
+
 	return 0;
 }
 
-- 
1.7.2.5


* Re: [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events
  2013-08-15 12:21 ` [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events David Vrabel
@ 2013-08-15 12:44   ` Jan Beulich
  2013-08-15 12:49     ` David Vrabel
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-15 12:44 UTC (permalink / raw)
  To: David Vrabel; +Cc: Boris Ostrovsky, xen-devel

>>> On 15.08.13 at 14:21, David Vrabel <david.vrabel@citrix.com> wrote:
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -348,7 +348,7 @@ static void init_evtchn_cpu_bindings(void)
>  
>  	for_each_possible_cpu(i)
>  		memset(per_cpu(cpu_evtchn_mask, i),
> -		       (i == 0) ? ~0 : 0, sizeof(*per_cpu(cpu_evtchn_mask, i)));
> +		       (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8);

I think simply dropping the bogus * would have been the better fix.

Jan


* Re: [PATCH 4/4] xen/events: mask events when changing their VCPU binding
  2013-08-15 12:21 ` [PATCH 4/4] xen/events: mask events when changing their VCPU binding David Vrabel
@ 2013-08-15 12:48   ` Jan Beulich
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2013-08-15 12:48 UTC (permalink / raw)
  To: David Vrabel; +Cc: Boris Ostrovsky, xen-devel

>>> On 15.08.13 at 14:21, David Vrabel <david.vrabel@citrix.com> wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> When an event is being bound to a VCPU there is a window between the
> EVTCHNOP_bind_vcpu call and the adjustment of the local per-cpu masks
> where an event may be lost.  The hypervisor upcalls the new VCPU but
> the kernel thinks the event is still bound to the old VCPU and
> ignores it.
> 
> There is even a problem when the event is being bound to the same VCPU,
> as there is a small window between the clear_bit() and set_bit() calls
> in bind_evtchn_to_cpu().  When scanning for pending events, the kernel
> may read the bit when it is momentarily clear and ignore the event.
> 
> Avoid this by masking the event during the whole bind operation.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Interesting that this got lost at some point during the porting from
the XenoLinux tree.

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


* Re: [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events
  2013-08-15 12:44   ` Jan Beulich
@ 2013-08-15 12:49     ` David Vrabel
  2013-08-15 13:06       ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: David Vrabel @ 2013-08-15 12:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, xen-devel

On 15/08/13 13:44, Jan Beulich wrote:
>>>> On 15.08.13 at 14:21, David Vrabel <david.vrabel@citrix.com> wrote:
>> --- a/drivers/xen/events.c
>> +++ b/drivers/xen/events.c
>> @@ -348,7 +348,7 @@ static void init_evtchn_cpu_bindings(void)
>>  
>>  	for_each_possible_cpu(i)
>>  		memset(per_cpu(cpu_evtchn_mask, i),
>> -		       (i == 0) ? ~0 : 0, sizeof(*per_cpu(cpu_evtchn_mask, i)));
>> +		       (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8);
> 
> I think simply dropping the bogus * would have been the better fix.

I disagree.  Using NR_EVENT_CHANNELS/8 makes it clearer that this is
initializing all bits.

David


* Re: [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events
  2013-08-15 12:49     ` David Vrabel
@ 2013-08-15 13:06       ` Jan Beulich
  2013-08-19 17:10         ` David Vrabel
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-15 13:06 UTC (permalink / raw)
  To: David Vrabel; +Cc: Boris Ostrovsky, xen-devel

>>> On 15.08.13 at 14:49, David Vrabel <david.vrabel@citrix.com> wrote:
> On 15/08/13 13:44, Jan Beulich wrote:
>>>>> On 15.08.13 at 14:21, David Vrabel <david.vrabel@citrix.com> wrote:
>>> --- a/drivers/xen/events.c
>>> +++ b/drivers/xen/events.c
>>> @@ -348,7 +348,7 @@ static void init_evtchn_cpu_bindings(void)
>>>  
>>>  	for_each_possible_cpu(i)
>>>  		memset(per_cpu(cpu_evtchn_mask, i),
>>> -		       (i == 0) ? ~0 : 0, sizeof(*per_cpu(cpu_evtchn_mask, i)));
>>> +		       (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8);
>> 
>> I think simply dropping the bogus * would have been the better fix.
> 
> I disagree.  Using NR_EVENT_CHANNELS/8 makes it clearer that this is
> initializing all bits.

My preference for using sizeof() wherever possible is mainly down to
the fact that if someone changes the array size, (s)he'd otherwise
have to go through, identify, and change all the corresponding loop
(or whatever) boundaries.

Jan


* Re: [PATCH 2/4] xen/events: document behaviour when scanning the start word for events
  2013-08-15 12:21 ` [PATCH 2/4] xen/events: document behaviour when scanning the start word for events David Vrabel
@ 2013-08-15 14:10   ` Boris Ostrovsky
  0 siblings, 0 replies; 11+ messages in thread
From: Boris Ostrovsky @ 2013-08-15 14:10 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

On 08/15/2013 08:21 AM, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> The original comment on the scanning of the start word on the 2nd pass
> did not reflect the actual behaviour (the code was incorrectly masking
> bit_idx instead of the pending word itself).
>
> The documented behaviour is not actually required since if events were
> pending in the MSBs, they would be immediately scanned anyway as we go
> through the loop again.
>
> Update the documentation to reflect this (instead of trying to change
> the behaviour).
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>   drivers/xen/events.c |   17 ++++++++++++-----
>   1 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index a58ac43..f866f50 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -1379,14 +1379,21 @@ static void __xen_evtchn_do_upcall(void)
>   
>   			pending_bits = active_evtchns(cpu, s, word_idx);
>   			bit_idx = 0; /* usually scan entire word from start */
> +			/*
> +			 * We scan the starting word in two parts.
> +			 *
> +			 * 1st time: start in the middle, scanning the
> +			 * MSBs.

I think MSB stands for most significant *byte*; 'msb' is for *bit*.
Or just say 'upper bits'.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

> +			 *
> +			 * 2nd time: scan the whole word (not just the
> +			 * parts skipped in the first pass) -- if an
> +			 * event in the previously scanned bits is
> +			 * pending again it would just be scanned on
> +			 * the next loop anyway.
> +			 */
>   			if (word_idx == start_word_idx) {
> -				/* We scan the starting word in two parts */
>   				if (i == 0)
> -					/* 1st time: start in the middle */
>   					bit_idx = start_bit_idx;
> -				else
> -					/* 2nd time: mask bits done already */
> -					bit_idx &= (1UL << start_bit_idx) - 1;
>   			}
>   
>   			do {


* Re: [PATCH 3/4] xen/events: initialize local per-cpu mask for all possible events
  2013-08-15 13:06       ` Jan Beulich
@ 2013-08-19 17:10         ` David Vrabel
  0 siblings, 0 replies; 11+ messages in thread
From: David Vrabel @ 2013-08-19 17:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, xen-devel

On 15/08/13 14:06, Jan Beulich wrote:
>>>> On 15.08.13 at 14:49, David Vrabel <david.vrabel@citrix.com> wrote:
>> On 15/08/13 13:44, Jan Beulich wrote:
>>>>>> On 15.08.13 at 14:21, David Vrabel <david.vrabel@citrix.com> wrote:
>>>> --- a/drivers/xen/events.c
>>>> +++ b/drivers/xen/events.c
>>>> @@ -348,7 +348,7 @@ static void init_evtchn_cpu_bindings(void)
>>>>  
>>>>  	for_each_possible_cpu(i)
>>>>  		memset(per_cpu(cpu_evtchn_mask, i),
>>>> -		       (i == 0) ? ~0 : 0, sizeof(*per_cpu(cpu_evtchn_mask, i)));
>>>> +		       (i == 0) ? ~0 : 0, NR_EVENT_CHANNELS/8);
>>>
>>> I think simply dropping the bogus * would have been the better fix.
>>
>> I disagree.  Using NR_EVENT_CHANNELS/8 makes it clearer that this is
>> initializing all bits.
> 
> My preference of using sizeof() wherever possible is mainly attributed
> to the fact that if someone changes the array size, (s)he'd have to go
> through and identify and change all the corresponding loop or
> whatever boundaries.

Normally I would agree, but in this case a) this array won't change
its size, and b) someone was already confused by the original code and
copied it across incorrectly.

David

