* Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
@ 2021-06-22 10:24 Julien Grall
  2021-06-22 11:04 ` Juergen Gross
  0 siblings, 1 reply; 6+ messages in thread
From: Julien Grall @ 2021-06-22 10:24 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, linux-kernel, mheyne

Hi Juergen,

As discussed on IRC yesterday, we noticed a couple of splats in 5.13-rc6
(and stable 5.4) in the evtchn driver:

[    7.581000] ------------[ cut here ]------------
[    7.581899] Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
[    7.583401] WARNING: CPU: 0 PID: 467 at /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 evtchn_interrupt+0xd5/0x100
[    7.585583] Modules linked in:
[    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not tainted 5.13.0-rc6 #240
[    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
[    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 be 1d 00 00 00 e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
[    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
[    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 0000000000000027
[    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: ffff88817fe18ad8
[    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 0000000000000001
[    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 0000000000000000
[    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: ffff888109b36f80
[    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) knlGS:0000000000000000
[    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 0000000000050660
[    7.604993] Call Trace:
[    7.605501]  <IRQ>
[    7.605929]  __handle_irq_event_percpu+0x4c/0x330
[    7.606817]  handle_irq_event_percpu+0x32/0xa0
[    7.607670]  handle_irq_event+0x3a/0x60
[    7.608416]  handle_edge_irq+0x9b/0x1f0
[    7.609154]  generic_handle_irq+0x4f/0x60
[    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
[    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
[    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
[    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
[    7.613439]  </IRQ>
[    7.613882]  exc_xen_hypervisor_callback+0x8/0x10

This is quite similar to the problem I reported a few months ago (see
[1]), but this time it is happening with FIFO rather than 2L.

I haven't been able to reproduce it reliably so far. But looking at the
code, I think I have found another potential race after commit

commit b6622798bc50b625a1e62f82c7190df40c1f5b21
Author: Juergen Gross <jgross@suse.com>
Date:   Sat Mar 6 17:18:33 2021 +0100
     xen/events: avoid handling the same event on two cpus at the same time
     When changing the cpu affinity of an event it can happen today that
     (with some unlucky timing) the same event will be handled on the old
     and the new cpu at the same time.
     Avoid that by adding an "event active" flag to the per-event data and
     call the handler only if this flag isn't set.
     Cc: stable@vger.kernel.org
     Reported-by: Julien Grall <julien@xen.org>
     Signed-off-by: Juergen Gross <jgross@suse.com>
     Reviewed-by: Julien Grall <jgrall@amazon.com>
     Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

The evtchn driver uses the lateeoi handlers, so the code to ack an event
looks like:

/* lateeoi_ack_dynirq() -> event_handler_exit() */
do_mask(..., EVT_MASK_REASON_EOI_PENDING);
smp_store_release(&info->is_active, 0);
clear_evtchn(info->evtchn);

The code to handle an incoming interrupt looks like:

clear_link(...);
if (evtchn_fifo_is_pending(port) && !evtchn_fifo_is_masked(port)) {
    if (xchg_acquire(&info->is_active, 1))
        return;
    generic_handle_irq(irq);
}

After changing the affinity, an interrupt may still be received once on
the previous vCPU. So, I think the following can happen:

vCPU0                             | vCPU1
                                  |
  Receive event                   |
                                  | change affinity to vCPU1
  clear_link()                    |
                                  |
            /* The interrupt is re-raised */
                                  | receive event
                                  |
                                  | /* The interrupt is not masked */
  info->is_active = 1             |
  do_mask(...)                    |
  info->is_active = 0             |
                                  | info->is_active = 1
  clear_evtchn(...)               |
                                  | do_mask(...)
                                  | info->is_active = 0
                                  | clear_evtchn(...)

Does this look plausible to you?

Cheers,

[1] https://www.spinics.net/lists/kernel/msg3771782.html



-- 
Julien Grall


* Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
  2021-06-22 10:24 Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc Julien Grall
@ 2021-06-22 11:04 ` Juergen Gross
  2021-06-22 12:21   ` Julien Grall
  0 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2021-06-22 11:04 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, linux-kernel, mheyne



On 22.06.21 12:24, Julien Grall wrote:
> Hi Juergen,
> 
> As discussed on IRC yesterday, we noticed a couple of splat in 5.13-rc6 

> (and stable 5.4) in the evtchn driver:
> 
> [    7.581000] ------------[ cut here ]------------
> [    7.581899] Interrupt for port 19, but apparently not 
enabled; 
> per-user 000000004af23acc
> [    7.583401] WARNING: CPU: 0 PID: 467 at 
> /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 
> evtchn_interrupt+0xd5/0x100
> [    7.585583] Modules linked in:
> [    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not tainted 
> 5.13.0-rc6 #240
> [    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
> [    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 
be 1d 00 00 00 
> e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 

> ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
> [    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
> [    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 
> 0000000000000027
> [    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: 
> ffff88817fe18ad8
> [    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 
> 0000000000000001
> [    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 
> 0000000000000000
> [    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: 
> ffff888109b36f80
> [    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) 
> knlGS:0000000000000000
> [    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 
> 0000000000050660
> [    7.604993] Call Trace:
> [    7.605501]  <IRQ>
> [    7.605929]  __handle_irq_event_percpu+0x4c/0x330
> [    7.606817]  handle_irq_event_percpu+0x32/0xa0
> [    7.607670]  handle_irq_event+0x3a/0x60
> [    7.608416]  handle_edge_irq+0x9b/0x1f0
> [    7.609154]  generic_handle_irq+0x4f/0x60
> [    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
> [    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
> [    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
> [    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
> [    7.613439]  </IRQ>
> [    7.613882]  exc_xen_hypervisor_callback+0x8/0x10
> 
> This is quite similar to the problem I reported a few months ago (see 
> [1]) but this time this is happening with fifo rather than 2L.
> 
> I haven't been able to reproduced it reliably so far. But looking at the 
> code, I think I have found another potential race after commit
> 
> commit b6622798bc50b625a1e62f82c7190df40c1f5b21
> Author: Juergen Gross <jgross@suse.com>
> Date:   Sat Mar 6 17:18:33 2021 +0100
>     xen/events: avoid handling the same event on two cpus at the same time
>     When changing the cpu affinity of an event it can happen today that
>     (with some unlucky timing) the same event will be handled 
on the old
>     and the new cpu at the same time.
>     Avoid that by adding an "event active" flag to the per-event data and
>     call the handler only if this flag isn't set.
>     Cc: stable@vger.kernel.org
>     Reported-by: Julien Grall <julien@xen.org>
>     Signed-off-by: Juergen Gross <jgross@suse.com>
>     Reviewed-by: Julien Grall <jgrall@amazon.com>
>     Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
>     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> The evtchn driver will use the lateeoi handlers. So the code to ack 
> looks like:
> 
> do_mask(..., EVT_MASK_REASON_EOI_PENDING)
> smp_store_release(&info->is_active, 0);
> clear_evtchn(info->evtchn);
> 
> The code to handle an interrupts look like:
> 
> clear_link(...)
> if ( evtchn_fifo_is_pending(port) && !evtchn_fifo_is_mask()) {
>    if (xchg_acquire(&info->is_active, 1)
>      return;
>    generic_handle_irq();
> }
> 
> After changing the affinity, an interrupt may be received once on the 
> previous vCPU. So, I think the following can happen:
> 
> vCPU0                             | vCPU1
>                    |
>   Receive event              |
>                    | change affinity to vCPU1
>   clear_link()              |
>                        |
>                 /* The interrupt is re-raised */
>                    | receive event
>                      |
>                    | /* The interrupt is not masked */
>   info->is_active = 1          |
>   do_mask(...)              |
>   info->is_active = 0          |
>                    | info->is_active = 1
>   clear_evtchn(...)               |
>                                   | do_mask(...)
>                                   | info->is_active = 0
>                    | clear_evtchn(...)
> 
> Does this look plausible to you?

Yes, it does.

Thanks for the analysis.

So I guess for lateeoi events we need to clear is_active only in
xen_irq_lateeoi()? At first glance this should fix the issue.
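
Roughly something like this (untested sketch, only the EOI side shown;
xen_irq_lateeoi_locked() is what xen_irq_lateeoi() ends up calling):

static void xen_irq_lateeoi_locked(struct irq_info *info, bool spurious)
{
        ...
        info->eoi_time = 0;
        /* Only now mark the event as no longer being handled. */
        smp_store_release(&info->is_active, 0);
        do_unmask(info, EVT_MASK_REASON_EOI_PENDING);
}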

What do you think?


Juergen



* Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
  2021-06-22 11:04 ` Juergen Gross
@ 2021-06-22 12:21   ` Julien Grall
  2021-06-22 12:23     ` Juergen Gross
  2021-06-22 15:14     ` Juergen Gross
  0 siblings, 2 replies; 6+ messages in thread
From: Julien Grall @ 2021-06-22 12:21 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, linux-kernel, mheyne

Hi Juergen,

On 22/06/2021 13:04, Juergen Gross wrote:
> On 22.06.21 12:24, Julien Grall wrote:
>> Hi Juergen,
>>
>> As discussed on IRC yesterday, we noticed a couple of splat in 5.13-rc6 
> 
>> (and stable 5.4) in the evtchn driver:
>>
>> [    7.581000] ------------[ cut here ]------------
>> [    7.581899] Interrupt for port 19, but apparently not 
> enabled;
>> per-user 000000004af23acc
>> [    7.583401] WARNING: CPU: 0 PID: 467 at 
>> /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 
>> evtchn_interrupt+0xd5/0x100
>> [    7.585583] Modules linked in:
>> [    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not tainted 
>> 5.13.0-rc6 #240
>> [    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
>> BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>> [    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
>> [    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 
> be 1d 00 00 00
>> e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 
> 
>> ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
>> [    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
>> [    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 
>> 0000000000000027
>> [    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: 
>> ffff88817fe18ad8
>> [    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 
>> 0000000000000001
>> [    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 
>> 0000000000000000
>> [    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: 
>> ffff888109b36f80
>> [    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) 
>> knlGS:0000000000000000
>> [    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 
>> 0000000000050660
>> [    7.604993] Call Trace:
>> [    7.605501]  <IRQ>
>> [    7.605929]  __handle_irq_event_percpu+0x4c/0x330
>> [    7.606817]  handle_irq_event_percpu+0x32/0xa0
>> [    7.607670]  handle_irq_event+0x3a/0x60
>> [    7.608416]  handle_edge_irq+0x9b/0x1f0
>> [    7.609154]  generic_handle_irq+0x4f/0x60
>> [    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
>> [    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
>> [    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
>> [    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
>> [    7.613439]  </IRQ>
>> [    7.613882]  exc_xen_hypervisor_callback+0x8/0x10
>>
>> This is quite similar to the problem I reported a few months ago (see 
>> [1]) but this time this is happening with fifo rather than 2L.
>>
>> I haven't been able to reproduced it reliably so far. But looking at 
>> the code, I think I have found another potential race after commit
>>
>> commit b6622798bc50b625a1e62f82c7190df40c1f5b21
>> Author: Juergen Gross <jgross@suse.com>
>> Date:   Sat Mar 6 17:18:33 2021 +0100
>>     xen/events: avoid handling the same event on two cpus at the same 
>> time
>>     When changing the cpu affinity of an event it can happen today that
>>     (with some unlucky timing) the same event will be handled 
> on the old
>>     and the new cpu at the same time.
>>     Avoid that by adding an "event active" flag to the per-event data and
>>     call the handler only if this flag isn't set.
>>     Cc: stable@vger.kernel.org
>>     Reported-by: Julien Grall <julien@xen.org>
>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>     Reviewed-by: Julien Grall <jgrall@amazon.com>
>>     Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
>>     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>
>> The evtchn driver will use the lateeoi handlers. So the code to ack 
>> looks like:
>>
>> do_mask(..., EVT_MASK_REASON_EOI_PENDING)
>> smp_store_release(&info->is_active, 0);
>> clear_evtchn(info->evtchn);
>>
>> The code to handle an interrupts look like:
>>
>> clear_link(...)
>> if ( evtchn_fifo_is_pending(port) && !evtchn_fifo_is_mask()) {
>>    if (xchg_acquire(&info->is_active, 1)
>>      return;
>>    generic_handle_irq();
>> }
>>
>> After changing the affinity, an interrupt may be received once on the 
>> previous vCPU. So, I think the following can happen:
>>
>> vCPU0                             | vCPU1
>>                    |
>>   Receive event              |
>>                    | change affinity to vCPU1
>>   clear_link()              |
>>                        |
>>                 /* The interrupt is re-raised */
>>                    | receive event
>>                      |
>>                    | /* The interrupt is not masked */
>>   info->is_active = 1          |
>>   do_mask(...)              |
>>   info->is_active = 0          |
>>                    | info->is_active = 1
>>   clear_evtchn(...)               |
>>                                   | do_mask(...)
>>                                   | info->is_active = 0
>>                    | clear_evtchn(...)
>>
>> Does this look plausible to you?
> 
> Yes, it does.
> 
> Thanks for the analysis.
> 
> So I guess for lateeoi events we need to clear is_active only in
> xen_irq_lateeoi()? At a first glance this should fix the issue.

It should work and would be quite neat. But I believe clear_evtchn()
would have to stay in the ack helper to avoid losing interrupts.
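
Something along these lines for the ack side (rough sketch, not tested):

static void lateeoi_ack_dynirq(struct irq_data *data)
{
        ...
        do_mask(info, EVT_MASK_REASON_EOI_PENDING);
        /* Keep clearing the event here so a re-raised event is not lost... */
        clear_evtchn(evtchn);
        /* ...while is_active stays set until xen_irq_lateeoi() runs. */
}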

Cheers,

-- 
Julien Grall


* Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
  2021-06-22 12:21   ` Julien Grall
@ 2021-06-22 12:23     ` Juergen Gross
  2021-06-22 15:14     ` Juergen Gross
  1 sibling, 0 replies; 6+ messages in thread
From: Juergen Gross @ 2021-06-22 12:23 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, linux-kernel, mheyne



On 22.06.21 14:21, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/06/2021 13:04, Juergen Gross wrote:
>> On 22.06.21 12:24, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> As discussed on IRC yesterday, we noticed a couple of splat in 5.13-rc6 
>>
>>> (and stable 5.4) in the evtchn driver:
>>>
>>> [    7.581000] ------------[ cut here ]------------
>>> [    7.581899] Interrupt for port 19, but apparently not 
>> enabled;
>>> per-user 000000004af23acc
>>> [    7.583401] WARNING: CPU: 0 PID: 467 at 
>>> /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 
>>> evtchn_interrupt+0xd5/0x100
>>> [    7.585583] Modules linked in:
>>> [    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not 
tainted 
>>> 5.13.0-rc6 #240
>>> [    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
>>> BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>>> [    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
>>> [    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 
>> be 1d 00 00 00
>>> e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 
>>
>>> ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
>>> [    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
>>> [    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 
>>> 0000000000000027
>>> [    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: 
>>> ffff88817fe18ad8
>>> [    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 
>>> 0000000000000001
>>> [    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 
>>> 0000000000000000
>>> [    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: 
>>> ffff888109b36f80
>>> [    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) 
>>> knlGS:0000000000000000
>>> [    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0:0000000080050033
>>> [    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 
>>> 0000000000050660
>>> [    7.604993] Call Trace:
>>> [    7.605501]  <IRQ>
>>> [    7.605929]  __handle_irq_event_percpu+0x4c/0x330
>>> [    7.606817]  handle_irq_event_percpu+0x32/0xa0
>>> [    7.607670]  handle_irq_event+0x3a/0x60
>>> [    7.608416]  handle_edge_irq+0x9b/0x1f0
>>> [    7.609154]  generic_handle_irq+0x4f/0x60
>>> [    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
>>> [    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
>>> [    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
>>> [    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
>>> [    7.613439]  </IRQ>
>>> [    7.613882]  exc_xen_hypervisor_callback+0x8/0x10
>>>
>>> This is quite similar to the problem I reported a few months ago (see 

>>> [1]) but this time this is happening with fifo rather than 2L.
>>>
>>> I haven't been able to reproduced it reliably so far. But looking at 
>>> the code, I think I have found another potential race after commit
>>>
>>> commit b6622798bc50b625a1e62f82c7190df40c1f5b21
>>> Author: Juergen Gross <jgross@suse.com>
>>> Date:   Sat Mar 6 17:18:33 2021 +0100
>>>     xen/events: avoid handling the same event on two cpusat the same 
>>> time
>>>     When changing the cpu affinity of an event it can happen today that
>>>     (with some unlucky timing) the same event will be handled 
>> on the old
>>>     and the new cpu at the same time.
>>>     Avoid that by adding an "event active" flag to the 
per-event data 
>>> and
>>>     call the handler only if this flag isn't set.
>>>     Cc: stable@vger.kernel.org
>>>     Reported-by: Julien Grall <julien@xen.org>
>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>>     Reviewed-by: Julien Grall <jgrall@amazon.com>
>>>     Link: 
>>> https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
>>>     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>
>>> The evtchn driver will use the lateeoi handlers. So the code to ack 
>>> looks like:
>>>
>>> do_mask(..., EVT_MASK_REASON_EOI_PENDING)
>>> smp_store_release(&info->is_active, 0);
>>> clear_evtchn(info->evtchn);
>>>
>>> The code to handle an interrupts look like:
>>>
>>> clear_link(...)
>>> if ( evtchn_fifo_is_pending(port) && !evtchn_fifo_is_mask()) {
>>>    if (xchg_acquire(&info->is_active, 1)
>>>      return;
>>>    generic_handle_irq();
>>> }
>>>
>>> After changing the affinity, an interrupt may be received once on the 

>>> previous vCPU. So, I think the following can happen:
>>>
>>> vCPU0                             | vCPU1
>>>                    |
>>>   Receive event              |
>>>                    | change affinity to vCPU1
>>>   clear_link()              |
>>>                        |
>>>                 /* The interrupt is re-raised */
>>>                    | receive event
>>>                      |
>>>                    | /* The interrupt is not masked */
>>>   info->is_active = 1          |
>>>   do_mask(...)              |
>>>   info->is_active = 0          |
>>>                    | info->is_active = 1
>>>   clear_evtchn(...)               |
>>>                                   | do_mask(...)
>>>                                   | info->is_active = 0
>>>                    | clear_evtchn(...)
>>>
>>> Does this look plausible to you?
>>
>> Yes, it does.
>>
>> Thanks for the analysis.
>>
>> So I guess for lateeoi events we need to clear is_active only in
>> xen_irq_lateeoi()? At a first glance this should fix the issue.
> 
> It should work and would be quite neat. But, I believe clear_evtchn() 
> would have to stick in the ack helper to avoid losing interrupts.

I agree.

Preparing a patch ...


Juergen




* Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
  2021-06-22 12:21   ` Julien Grall
  2021-06-22 12:23     ` Juergen Gross
@ 2021-06-22 15:14     ` Juergen Gross
  2021-06-23 12:25       ` Julien Grall
  1 sibling, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2021-06-22 15:14 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, linux-kernel, mheyne



On 22.06.21 14:21, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/06/2021 13:04, Juergen Gross wrote:
>> On 22.06.21 12:24, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> As discussed on IRC yesterday, we noticed a couple of splat in 5.13-rc6 
>>
>>> (and stable 5.4) in the evtchn driver:
>>>
>>> [    7.581000] ------------[ cut here ]------------
>>> [    7.581899] Interrupt for port 19, but apparently not 
>> enabled;
>>> per-user 000000004af23acc
>>> [    7.583401] WARNING: CPU: 0 PID: 467 at 
>>> /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 
>>> evtchn_interrupt+0xd5/0x100
>>> [    7.585583] Modules linked in:
>>> [    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not 
tainted 
>>> 5.13.0-rc6 #240
>>> [    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
>>> BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>>> [    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
>>> [    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 
>> be 1d 00 00 00
>>> e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 
>>
>>> ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
>>> [    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
>>> [    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 
>>> 0000000000000027
>>> [    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: 
>>> ffff88817fe18ad8
>>> [    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 
>>> 0000000000000001
>>> [    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 
>>> 0000000000000000
>>> [    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: 
>>> ffff888109b36f80
>>> [    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) 
>>> knlGS:0000000000000000
>>> [    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0:0000000080050033
>>> [    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 
>>> 0000000000050660
>>> [    7.604993] Call Trace:
>>> [    7.605501]  <IRQ>
>>> [    7.605929]  __handle_irq_event_percpu+0x4c/0x330
>>> [    7.606817]  handle_irq_event_percpu+0x32/0xa0
>>> [    7.607670]  handle_irq_event+0x3a/0x60
>>> [    7.608416]  handle_edge_irq+0x9b/0x1f0
>>> [    7.609154]  generic_handle_irq+0x4f/0x60
>>> [    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
>>> [    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
>>> [    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
>>> [    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
>>> [    7.613439]  </IRQ>
>>> [    7.613882]  exc_xen_hypervisor_callback+0x8/0x10
>>>
>>> This is quite similar to the problem I reported a few months ago (see 

>>> [1]) but this time this is happening with fifo rather than 2L.
>>>
>>> I haven't been able to reproduced it reliably so far. But looking at 
>>> the code, I think I have found another potential race after commit
>>>
>>> commit b6622798bc50b625a1e62f82c7190df40c1f5b21
>>> Author: Juergen Gross <jgross@suse.com>
>>> Date:   Sat Mar 6 17:18:33 2021 +0100
>>>     xen/events: avoid handling the same event on two cpusat the same 
>>> time
>>>     When changing the cpu affinity of an event it can happen today that
>>>     (with some unlucky timing) the same event will be handled 
>> on the old
>>>     and the new cpu at the same time.
>>>     Avoid that by adding an "event active" flag to the 
per-event data 
>>> and
>>>     call the handler only if this flag isn't set.
>>>     Cc: stable@vger.kernel.org
>>>     Reported-by: Julien Grall <julien@xen.org>
>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>>     Reviewed-by: Julien Grall <jgrall@amazon.com>
>>>     Link: 
>>> https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
>>>     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>
>>> The evtchn driver will use the lateeoi handlers. So the code to ack 
>>> looks like:
>>>
>>> do_mask(..., EVT_MASK_REASON_EOI_PENDING)
>>> smp_store_release(&info->is_active, 0);
>>> clear_evtchn(info->evtchn);
>>>
>>> The code to handle an interrupts look like:
>>>
>>> clear_link(...)
>>> if ( evtchn_fifo_is_pending(port) && !evtchn_fifo_is_mask()) {
>>>    if (xchg_acquire(&info->is_active, 1)
>>>      return;
>>>    generic_handle_irq();
>>> }
>>>
>>> After changing the affinity, an interrupt may be received once on the 

>>> previous vCPU. So, I think the following can happen:
>>>
>>> vCPU0                             | vCPU1
>>>                    |
>>>   Receive event              |
>>>                    | change affinity to vCPU1
>>>   clear_link()              |
>>>                        |
>>>                 /* The interrupt is re-raised */
>>>                    | receive event
>>>                      |
>>>                    | /* The interrupt is not masked */
>>>   info->is_active = 1          |
>>>   do_mask(...)              |
>>>   info->is_active = 0          |
>>>                    | info->is_active = 1
>>>   clear_evtchn(...)               |
>>>                                   | do_mask(...)
>>>                                   | info->is_active = 0
>>>                    | clear_evtchn(...)
>>>
>>> Does this look plausible to you?
>>
>> Yes, it does.
>>
>> Thanks for the analysis.
>>
>> So I guess for lateeoi events we need to clear is_active only in
>> xen_irq_lateeoi()? At a first glance this should fix the issue.
> 
> It should work and would be quite neat. But, I believe clear_evtchn() 
> would have to stick in the ack helper to avoid losing interrupts.
> 

Could you try the attached patch, please? Only compile tested.


Juergen

[-- Attachment #1.1.2: 0001-xen-events-reset-active-flag-for-lateeoi-events-late.patch --]
[-- Type: text/x-patch, Size: 1588 bytes --]

From 593e81cfb5d4a15e5e420111565e7f6019da9b72 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: linux-kernel@vger.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel@lists.xenproject.org
Date: Tue, 22 Jun 2021 17:05:03 +0200
Subject: [PATCH] xen/events: reset active flag for lateeoi events later

In order to avoid a race condition for user events when changing the
cpu affinity, reset the active flag only when EOI-ing the event.

This is working fine as all user events are lateeoi events.

Reported-by: Julien Grall <julien@xen.org>
Fixes: b6622798bc50b62 ("xen/events: avoid handling the same event on two cpus at the same time")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/xen/events/events_base.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 7bbfd58958bc..26836546e1cd 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -642,6 +642,7 @@ static void xen_irq_lateeoi_locked(struct irq_info *info, bool spurious)
 	}
 
 	info->eoi_time = 0;
+	smp_store_release(&info->is_active, 0);
 	do_unmask(info, EVT_MASK_REASON_EOI_PENDING);
 }
 
@@ -1883,7 +1884,7 @@ static void lateeoi_ack_dynirq(struct irq_data *data)
 
 	if (VALID_EVTCHN(evtchn)) {
 		do_mask(info, EVT_MASK_REASON_EOI_PENDING);
-		event_handler_exit(info);
+		clear_evtchn(evtchn);
 	}
 }
 
-- 
2.26.2




* Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
  2021-06-22 15:14     ` Juergen Gross
@ 2021-06-23 12:25       ` Julien Grall
  0 siblings, 0 replies; 6+ messages in thread
From: Julien Grall @ 2021-06-23 12:25 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, linux-kernel, mheyne

Hi Juergen,

On 22/06/2021 17:14, Juergen Gross wrote:
> On 22.06.21 14:21, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 22/06/2021 13:04, Juergen Gross wrote:
>>> On 22.06.21 12:24, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> As discussed on IRC yesterday, we noticed a couple of splat in 5.13-rc6 
>>>
>>>> (and stable 5.4) in the evtchn driver:
>>>>
>>>> [    7.581000] ------------[ cut here ]------------
>>>> [    7.581899] Interrupt for port 19, but apparently not 
>>> enabled;
>>>> per-user 000000004af23acc
>>>> [    7.583401] WARNING: CPU: 0 PID: 467 at 
>>>> /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 
>>>> evtchn_interrupt+0xd5/0x100
>>>> [    7.585583] Modules linked in:
>>>> [    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not 
> tainted
>>>> 5.13.0-rc6 #240
>>>> [    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
>>>> BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>>>> [    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
>>>> [    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 
>>> be 1d 00 00 00
>>>> e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 
>>>
>>>> ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
>>>> [    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
>>>> [    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 
>>>> 0000000000000027
>>>> [    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: 
>>>> ffff88817fe18ad8
>>>> [    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 
>>>> 0000000000000001
>>>> [    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 
>>>> 0000000000000000
>>>> [    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: 
>>>> ffff888109b36f80
>>>> [    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) 
>>>> knlGS:0000000000000000
>>>> [    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0:0000000080050033
>>>> [    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 
>>>> 0000000000050660
>>>> [    7.604993] Call Trace:
>>>> [    7.605501]  <IRQ>
>>>> [    7.605929]  __handle_irq_event_percpu+0x4c/0x330
>>>> [    7.606817]  handle_irq_event_percpu+0x32/0xa0
>>>> [    7.607670]  handle_irq_event+0x3a/0x60
>>>> [    7.608416]  handle_edge_irq+0x9b/0x1f0
>>>> [    7.609154]  generic_handle_irq+0x4f/0x60
>>>> [    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
>>>> [    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
>>>> [    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
>>>> [    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
>>>> [    7.613439]  </IRQ>
>>>> [    7.613882]  exc_xen_hypervisor_callback+0x8/0x10
>>>>
>>>> This is quite similar to the problem I reported a few months ago (see 
> 
>>>> [1]) but this time this is happening with fifo rather than 2L.
>>>>
>>>> I haven't been able to reproduced it reliably so far. But looking at 
>>>> the code, I think I have found another potential race after commit
>>>>
>>>> commit b6622798bc50b625a1e62f82c7190df40c1f5b21
>>>> Author: Juergen Gross <jgross@suse.com>
>>>> Date:   Sat Mar 6 17:18:33 2021 +0100
>>>>     xen/events: avoid handling the same event on two cpusat the same 
>>>> time
>>>>     When changing the cpu affinity of an event it can happen today that
>>>>     (with some unlucky timing) the same event will be handled 
>>> on the old
>>>>     and the new cpu at the same time.
>>>>     Avoid that by adding an "event active" flag to the 
> per-event data
>>>> and
>>>>     call the handler only if this flag isn't set.
>>>>     Cc: stable@vger.kernel.org
>>>>     Reported-by: Julien Grall <julien@xen.org>
>>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>     Reviewed-by: Julien Grall <jgrall@amazon.com>
>>>>     Link: 
>>>> https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
>>>>     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>
>>>> The evtchn driver will use the lateeoi handlers. So the code to ack 
>>>> looks like:
>>>>
>>>> do_mask(..., EVT_MASK_REASON_EOI_PENDING)
>>>> smp_store_release(&info->is_active, 0);
>>>> clear_evtchn(info->evtchn);
>>>>
>>>> The code to handle an interrupts look like:
>>>>
>>>> clear_link(...)
>>>> if ( evtchn_fifo_is_pending(port) && !evtchn_fifo_is_mask()) {
>>>>    if (xchg_acquire(&info->is_active, 1)
>>>>      return;
>>>>    generic_handle_irq();
>>>> }
>>>>
>>>> After changing the affinity, an interrupt may be received once on the 
> 
>>>> previous vCPU. So, I think the following can happen:
>>>>
>>>> vCPU0                             | vCPU1
>>>>                    |
>>>>   Receive event              |
>>>>                    | change affinity to vCPU1
>>>>   clear_link()              |
>>>>                        |
>>>>                 /* The interrupt is re-raised */
>>>>                    | receive event
>>>>                      |
>>>>                    | /* The interrupt is not masked */
>>>>   info->is_active = 1          |
>>>>   do_mask(...)              |
>>>>   info->is_active = 0          |
>>>>                    | info->is_active = 1
>>>>   clear_evtchn(...)               |
>>>>                                   | do_mask(...)
>>>>                                   | info->is_active = 0
>>>>                    | clear_evtchn(...)
>>>>
>>>> Does this look plausible to you?
>>>
>>> Yes, it does.
>>>
>>> Thanks for the analysis.
>>>
>>> So I guess for lateeoi events we need to clear is_active only in
>>> xen_irq_lateeoi()? At a first glance this should fix the issue.
>>
>> It should work and would be quite neat. But, I believe clear_evtchn() 
>> would have to stick in the ack helper to avoid losing interrupts.
>>
> 
> Could you try the attached patch, please? Only compile tested.

Thanks for the patch! I have also found a reproducer on Linux 5.13, so it
was easier to confirm the patch works.

The reproducer continuously changes the affinity of the interrupt under
high interrupt load. After a few seconds I can see dozens of WARN splats.

Regarding the patch itself, a few suggestions:
   1) It is not entirely obvious from the code why ack_mask_dynirq() is
not modified. My understanding is that we are assuming
xen_irq_lateeoi_locked() will not be called in this case. I would
suggest spelling this out clearly in the commit message.
   2) I would suggest adding a comment in the code explaining why
event_handler_exit() is not used. It is probably worth also adding one
in event_handler_exit() so one knows that it no longer covers all the paths.
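
For instance, something along these lines (wording is only a suggestion):

        /* In lateeoi_ack_dynirq(): */
        do_mask(info, EVT_MASK_REASON_EOI_PENDING);
        /*
         * Don't use event_handler_exit() here: is_active has to stay set
         * until xen_irq_lateeoi() so the event cannot be handled on
         * another vCPU before the EOI.
         */
        clear_evtchn(evtchn);

        /* In event_handler_exit(): */
        /*
         * Note: this doesn't cover the lateeoi ack path, which clears
         * is_active in xen_irq_lateeoi_locked() instead.
         */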

Cheers,

-- 
Julien Grall

