linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs
@ 2019-07-12  8:20 Alexey Kardashevskiy
  2019-07-12  8:29 ` Benjamin Herrenschmidt
  2019-07-14 19:14 ` Daniel Henrique Barboza
  0 siblings, 2 replies; 10+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-12  8:20 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Alistair Popple, Greg Kurz,
	Nicholas Piggin, Paul Mackerras, Cédric Le Goater,
	David Gibson

There is a race between releasing an irq on one cpu and fetching it
from XIVE on another cpu as there does not seem to be any locking between
these, probably because xive_irq_chip::irq_shutdown() is supposed to
remove the irq from all queues in the system which it does not do.

As a result, when such released irq appears in a queue, we take it
from the queue but we do not change the current priority on that cpu and
since there is no handler for the irq, EOI is never called and the cpu
current priority remains elevated (7 vs. 0xff==unmasked). If another irq
is assigned to the same cpu, then that device stops working until irq
is moved to another cpu or the device is reset.

This checks if irq is still registered, if not, it assumes no valid irq
was fetched from the loop and if there is none left, it continues to
the irq==0 case (not visible in this patch) and sets priority to 0xff
which is basically unmasking. This calls irq_to_desc() on a hot path now
which is a radix tree lookup; hopefully this won't be noticeable as
that tree is quite small.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

Found it on P9 system with:
- a host with 8 cpus online
- a boot disk on ahci with its msix on cpu#0
- a guest with 2xGPUs + 6xNVLink + 4 cpus
- GPU#0 from the guest is bound to the same cpu#0.

Killing a guest killed ahci and therefore the host because of the race.
Note that VFIO masks interrupts first and only then resets the device.

Alternatives:

1. Fix xive_irq_chip::irq_shutdown() to walk through all cpu queues and
drop deregistered irqs.

2. Exploit chip->irq_get_irqchip_state function from
62e0468650c30f0298 "genirq: Add optional hardware synchronization for shutdown".

Both require deep XIVE knowledge which I do not have.
---
 arch/powerpc/sysdev/xive/common.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 082c7e1c20f0..65742e280337 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -148,8 +148,12 @@ static u32 xive_scan_interrupts(struct xive_cpu *xc, bool just_peek)
 		irq = xive_read_eq(&xc->queue[prio], just_peek);
 
 		/* Found something ? That's it */
-		if (irq)
-			break;
+		if (irq) {
+			/* Another CPU may have shut this irq down, check it */
+			if (irq_to_desc(irq))
+				break;
+			irq = 0;
+		}
 
 		/* Clear pending bits */
 		xc->pending_prio &= ~(1 << prio);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-07-15  2:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-12  8:20 [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs Alexey Kardashevskiy
2019-07-12  8:29 ` Benjamin Herrenschmidt
2019-07-12  9:37   ` Alexey Kardashevskiy
2019-07-12 23:47     ` Benjamin Herrenschmidt
2019-07-13  8:53       ` Alexey Kardashevskiy
2019-07-14  1:31         ` Benjamin Herrenschmidt
2019-07-14 19:44           ` Cédric Le Goater
2019-07-14 22:48             ` Benjamin Herrenschmidt
2019-07-14 19:14 ` Daniel Henrique Barboza
2019-07-15  2:27   ` Alexey Kardashevskiy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).