Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc

From: Juergen Gross <jgross@suse.com>
To: Julien Grall <julien@xen.org>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	linux-kernel@vger.kernel.org, mheyne@amazon.de
Subject: Re: Interrupt for port 19, but apparently not enabled; per-user 000000004af23acc
Date: Tue, 22 Jun 2021 17:14:49 +0200	[thread overview]
Message-ID: <5d88a82e-d237-7803-7b50-897e857f2fbd@suse.com> (raw)
In-Reply-To: <9a08bbf2-ba6a-6e49-3bcb-bfe2beb32b99@xen.org>


[-- Attachment #1.1.1: Type: text/plain, Size: 6185 bytes --]

On 22.06.21 14:21, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/06/2021 13:04, Juergen Gross wrote:
>> On 22.06.21 12:24, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> As discussed on IRC yesterday, we noticed a couple of splat in 5.13-rc6 
>>
>>> (and stable 5.4) in the evtchn driver:
>>>
>>> [    7.581000] ------------[ cut here ]------------
>>> [    7.581899] Interrupt for port 19, but apparently not 
>> enabled;
>>> per-user 000000004af23acc
>>> [    7.583401] WARNING: CPU: 0 PID: 467 at 
>>> /home/ANT.AMAZON.COM/jgrall/works/oss/linux/drivers/xen/evtchn.c:169 
>>> evtchn_interrupt+0xd5/0x100
>>> [    7.585583] Modules linked in:
>>> [    7.586188] CPU: 0 PID: 467 Comm: xenstore-read Not 
tainted 
>>> 5.13.0-rc6 #240
>>> [    7.587462] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
>>> BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>>> [    7.589462] RIP: e030:evtchn_interrupt+0xd5/0x100
>>> [    7.590361] Code: 48 8d bb d8 01 00 00 ba 01 00 00 00 
>> be 1d 00 00 00
>>> e8 5f 72 c4 ff eb b2 8b 75 20 48 89 da 48 c7 c7 a8 03 5f 82 e8 6b 2d 96 
>>
>>> ff <0f> 0b e9 4d ff ff ff 41 0f b6 f4 48 c7 c7 80 da a2 82 e8 f0
>>> [    7.593662] RSP: e02b:ffffc90040003e60 EFLAGS: 00010082
>>> [    7.594636] RAX: 0000000000000000 RBX: ffff888102328c00 RCX: 
>>> 0000000000000027
>>> [    7.595924] RDX: 0000000000000000 RSI: ffff88817fe18ad0 RDI: 
>>> ffff88817fe18ad8
>>> [    7.597216] RBP: ffff888108ef8140 R08: 0000000000000000 R09: 
>>> 0000000000000001
>>> [    7.598522] R10: 0000000000000000 R11: 7075727265746e49 R12: 
>>> 0000000000000000
>>> [    7.599810] R13: ffffc90040003ec4 R14: ffff8881001b8000 R15: 
>>> ffff888109b36f80
>>> [    7.601113] FS:  0000000000000000(0000) GS:ffff88817fe00000(0000) 
>>> knlGS:0000000000000000
>>> [    7.602570] CS:  10000e030 DS: 0000 ES: 0000 CR0:0000000080050033
>>> [    7.603700] CR2: 00007f15b390e368 CR3: 000000010bb04000 CR4: 
>>> 0000000000050660
>>> [    7.604993] Call Trace:
>>> [    7.605501]  <IRQ>
>>> [    7.605929]  __handle_irq_event_percpu+0x4c/0x330
>>> [    7.606817]  handle_irq_event_percpu+0x32/0xa0
>>> [    7.607670]  handle_irq_event+0x3a/0x60
>>> [    7.608416]  handle_edge_irq+0x9b/0x1f0
>>> [    7.609154]  generic_handle_irq+0x4f/0x60
>>> [    7.609918]  __evtchn_fifo_handle_events+0x195/0x3a0
>>> [    7.610864]  __xen_evtchn_do_upcall+0x66/0xb0
>>> [    7.611693]  __xen_pv_evtchn_do_upcall+0x1d/0x30
>>> [    7.612582]  xen_pv_evtchn_do_upcall+0x9d/0xc0
>>> [    7.613439]  </IRQ>
>>> [    7.613882]  exc_xen_hypervisor_callback+0x8/0x10
>>>
>>> This is quite similar to the problem I reported a few months ago (see 

>>> [1]) but this time this is happening with fifo rather than 2L.
>>>
>>> I haven't been able to reproduced it reliably so far. But looking at 
>>> the code, I think I have found another potential race after commit
>>>
>>> commit b6622798bc50b625a1e62f82c7190df40c1f5b21
>>> Author: Juergen Gross <jgross@suse.com>
>>> Date:   Sat Mar 6 17:18:33 2021 +0100
>>>     xen/events: avoid handling the same event on two cpusat the same 
>>> time
>>>     When changing the cpu affinity of an event it can happen today that
>>>     (with some unlucky timing) the same event will be handled 
>> on the old
>>>     and the new cpu at the same time.
>>>     Avoid that by adding an "event active" flag to the 
per-event data 
>>> and
>>>     call the handler only if this flag isn't set.
>>>     Cc: stable@vger.kernel.org
>>>     Reported-by: Julien Grall <julien@xen.org>
>>>     Signed-off-by: Juergen Gross <jgross@suse.com>
>>>     Reviewed-by: Julien Grall <jgrall@amazon.com>
>>>     Link: 
>>> https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
>>>     Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>
>>> The evtchn driver will use the lateeoi handlers. So the code to ack 
>>> looks like:
>>>
>>> do_mask(..., EVT_MASK_REASON_EOI_PENDING)
>>> smp_store_release(&info->is_active, 0);
>>> clear_evtchn(info->evtchn);
>>>
>>> The code to handle an interrupts look like:
>>>
>>> clear_link(...)
>>> if ( evtchn_fifo_is_pending(port) && !evtchn_fifo_is_mask()) {
>>>    if (xchg_acquire(&info->is_active, 1)
>>>      return;
>>>    generic_handle_irq();
>>> }
>>>
>>> After changing the affinity, an interrupt may be received once on the 

>>> previous vCPU. So, I think the following can happen:
>>>
>>> vCPU0                             | vCPU1
>>>                    |
>>>   Receive event              |
>>>                    | change affinity to vCPU1
>>>   clear_link()              |
>>>                        |
>>>                 /* The interrupt is re-raised */
>>>                    | receive event
>>>                      |
>>>                    | /* The interrupt is not masked */
>>>   info->is_active = 1          |
>>>   do_mask(...)              |
>>>   info->is_active = 0          |
>>>                    | info->is_active = 1
>>>   clear_evtchn(...)               |
>>>                                   | do_mask(...)
>>>                                   | info->is_active = 0
>>>                    | clear_evtchn(...)
>>>
>>> Does this look plausible to you?
>>
>> Yes, it does.
>>
>> Thanks for the analysis.
>>
>> So I guess for lateeoi events we need to clear is_active only in
>> xen_irq_lateeoi()? At a first glance this should fix the issue.
> 
> It should work and would be quite neat. But, I believe clear_evtchn() 
> would have to stick in the ack helper to avoid losing interrupts.
> 

Could you try the attached patch, please? Only compile tested.


Juergen

[-- Attachment #1.1.2: 0001-xen-events-reset-active-flag-for-lateeoi-events-late.patch --]
[-- Type: text/x-patch, Size: 1588 bytes --]

From 593e81cfb5d4a15e5e420111565e7f6019da9b72 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: linux-kernel@vger.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel@lists.xenproject.org
Date: Tue, 22 Jun 2021 17:05:03 +0200
Subject: [PATCH] xen/events: reset active flag for lateeoi events later

In order to avoid a race condition for user events when changing
cpu affinity reset the active flag only when EOI-ing the event.

This is working fine as all user events are lateeoi events.

Reported-by: Julien Grall <julien@xen.org>
Fixes: b6622798bc50b62 ("xen/events: avoid handling the same event on two 
cpus at the same time")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/xen/events/events_base.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 7bbfd58958bc..26836546e1cd 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -642,6 +642,7 @@ static void xen_irq_lateeoi_locked(struct irq_info *info, bool spurious)
 	}
 
 	info->eoi_time = 0;
+	smp_store_release(&info->is_active, 0);
 	do_unmask(info, EVT_MASK_REASON_EOI_PENDING);
 }
 
@@ -1883,7 +1884,7 @@ static void lateeoi_ack_dynirq(struct irq_data *data)
 
 	if (VALID_EVTCHN(evtchn)) {
 		do_mask(info, EVT_MASK_REASON_EOI_PENDING);
-		event_handler_exit(info);
+		clear_evtchn(evtchn);
 	}
 }
 
-- 
2.26.2


[-- Attachment #1.1.3: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]