linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Daniel Henrique Barboza <danielhb413@gmail.com>,
	linuxppc-dev@lists.ozlabs.org
Cc: "Alistair Popple" <alistair@popple.id.au>,
	"Greg Kurz" <groug@kaod.org>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Paul Mackerras" <paulus@samba.org>,
	"Cédric Le Goater" <clg@kaod.org>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs
Date: Mon, 15 Jul 2019 12:27:30 +1000	[thread overview]
Message-ID: <816f26ea-d868-781f-9189-6c0d4e7257cf@ozlabs.ru> (raw)
In-Reply-To: <e500148e-93f3-5a63-cdc2-b48428c51ee8@gmail.com>



On 15/07/2019 05:14, Daniel Henrique Barboza wrote:
> This patch fixed an issue I was experiencing with virsh start/destroy
> of guests with mlx5 and GPU passthrough in a Power 9 server. I
> believe it's a similar situation which Alexey described in the post
> commit msg.
> 
> 
> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>


Thanks for testing! Unfortunately, this is not the right fix as it only 
reduces the race window but does not eliminate it, we need something 
better than this, i.e. do not backport this anywhere please :) Thanks,


> 
> 
> On 7/12/19 5:20 AM, Alexey Kardashevskiy wrote:
>> There is a race between releasing an irq on one cpu and fetching it
>> from XIVE on another cpu as there does not seem to be any locking between
>> these, probably because xive_irq_chip::irq_shutdown() is supposed to
>> remove the irq from all queues in the system which it does not do.
>>
>> As a result, when such released irq appears in a queue, we take it
>> from the queue but we do not change the current priority on that cpu and
>> since there is no handler for the irq, EOI is never called and the cpu
>> current priority remains elevated (7 vs. 0xff==unmasked). If another irq
>> is assigned to the same cpu, then that device stops working until irq
>> is moved to another cpu or the device is reset.
>>
>> This checks if irq is still registered, if not, it assumes no valid irq
>> was fetched from the loop and if there is none left, it continues to
>> the irq==0 case (not visible in this patch) and sets priority to 0xff
>> which is basically unmasking. This calls irq_to_desc() on a hot path now
>> which is a radix tree lookup; hopefully this won't be noticeable as
>> that tree is quite small.
>>
>> Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
>> ---
>>
>> Found it on P9 system with:
>> - a host with 8 cpus online
>> - a boot disk on ahci with its msix on cpu#0
>> - a guest with 2xGPUs + 6xNVLink + 4 cpus
>> - GPU#0 from the guest is bound to the same cpu#0.
>>
>> Killing a guest killed ahci and therefore the host because of the race.
>> Note that VFIO masks interrupts first and only then resets the device.
>>
>> Alternatives:
>>
>> 1. Fix xive_irq_chip::irq_shutdown() to walk through all cpu queues and
>> drop deregistered irqs.
>>
>> 2. Exploit chip->irq_get_irqchip_state function from
>> 62e0468650c30f0298 "genirq: Add optional hardware synchronization for shutdown".
>>
>> Both require deep XIVE knowledge which I do not have.
>> ---
>>   arch/powerpc/sysdev/xive/common.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>> index 082c7e1c20f0..65742e280337 100644
>> --- a/arch/powerpc/sysdev/xive/common.c
>> +++ b/arch/powerpc/sysdev/xive/common.c
>> @@ -148,8 +148,12 @@ static u32 xive_scan_interrupts(struct xive_cpu *xc, bool just_peek)
>>   		irq = xive_read_eq(&xc->queue[prio], just_peek);
>>   
>>   		/* Found something ? That's it */
>> -		if (irq)
>> -			break;
>> +		if (irq) {
>> +			/* Another CPU may have shut this irq down, check it */
>> +			if (irq_to_desc(irq))
>> +				break;
>> +			irq = 0;
>> +		}
>>   
>>   		/* Clear pending bits */
>>   		xc->pending_prio &= ~(1 << prio);
> 

-- 
Alexey

      reply	other threads:[~2019-07-15  2:31 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-12  8:20 [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs Alexey Kardashevskiy
2019-07-12  8:29 ` Benjamin Herrenschmidt
2019-07-12  9:37   ` Alexey Kardashevskiy
2019-07-12 23:47     ` Benjamin Herrenschmidt
2019-07-13  8:53       ` Alexey Kardashevskiy
2019-07-14  1:31         ` Benjamin Herrenschmidt
2019-07-14 19:44           ` Cédric Le Goater
2019-07-14 22:48             ` Benjamin Herrenschmidt
2019-07-14 19:14 ` Daniel Henrique Barboza
2019-07-15  2:27   ` Alexey Kardashevskiy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=816f26ea-d868-781f-9189-6c0d4e7257cf@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=alistair@popple.id.au \
    --cc=clg@kaod.org \
    --cc=danielhb413@gmail.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=groug@kaod.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=npiggin@gmail.com \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).