linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH kernel v2] powerpc/xive: Drop deregistered irqs
@ 2019-07-15  7:11 Alexey Kardashevskiy
  2019-07-16  8:59 ` Cédric Le Goater
  0 siblings, 1 reply; 5+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-15  7:11 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Alistair Popple, Daniel Henrique Barboza,
	Greg Kurz, Nicholas Piggin, Cédric Le Goater,
	Paul Mackerras, David Gibson

There is a race between releasing an irq on one cpu and fetching it
from XIVE on another cpu as there does not seem to be any locking between
these, probably because xive_irq_chip::irq_shutdown() is supposed to
remove the irq from all queues in the system which it does not do.

As a result, when such released irq appears in a queue, we take it
from the queue but we do not change the current priority on that cpu and
since there is no handler for the irq, EOI is never called and the cpu
current priority remains elevated (7 vs. 0xff==unmasked). If another irq
is assigned to the same cpu, then that device stops working until irq
is moved to another cpu or the device is reset.

This adds a new ppc_md.orphan_irq callback which is called if no irq
descriptor is found. The XIVE implementation drops the current priority
to 0xff which effectively unmasks interrupts in a current CPU.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* added ppc_md.orphan_irq

---

Found it on P9 system with:
- a host with 8 cpus online
- a boot disk on ahci with its msix on cpu#0
- a guest with 2xGPUs + 6xNVLink + 4 cpus
- GPU#0 from the guest is bound to the same cpu#0.

Killing a guest killed ahci and therefore the host because of the race.
Note that VFIO masks interrupts first and only then resets the device.
---
 arch/powerpc/include/asm/machdep.h |  3 +++
 arch/powerpc/kernel/irq.c          |  9 ++++++---
 arch/powerpc/sysdev/xive/common.c  | 10 ++++++++++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index c43d6eca9edd..6cc14e28e89a 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -59,6 +59,9 @@ struct machdep_calls {
 	/* Return an irq, or 0 to indicate there are none pending. */
 	unsigned int	(*get_irq)(void);
 
+	/* Drops irq if it does not have a valid descriptor */
+	void		(*orphan_irq)(unsigned int irq);
+
 	/* PCI stuff */
 	/* Called after allocating resources */
 	void		(*pcibios_fixup)(void);
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index bc68c53af67c..b4e06d05bdba 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -632,10 +632,13 @@ void __do_irq(struct pt_regs *regs)
 	may_hard_irq_enable();
 
 	/* And finally process it */
-	if (unlikely(!irq))
+	if (unlikely(!irq)) {
 		__this_cpu_inc(irq_stat.spurious_irqs);
-	else
-		generic_handle_irq(irq);
+	} else if (generic_handle_irq(irq)) {
+		if (ppc_md.orphan_irq)
+			ppc_md.orphan_irq(irq);
+		__this_cpu_inc(irq_stat.spurious_irqs);
+	}
 
 	trace_irq_exit(regs);
 
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 082c7e1c20f0..b4054091999a 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -283,6 +283,15 @@ static unsigned int xive_get_irq(void)
 	return irq;
 }
 
+static void xive_orphan_irq(unsigned int irq)
+{
+	struct xive_cpu *xc = __this_cpu_read(xive_cpu);
+
+	xc->cppr = 0xff;
+	out_8(xive_tima + xive_tima_offset + TM_CPPR, 0xff);
+	DBG_VERBOSE("orphan_irq: irq %d, adjusting CPPR to 0xff\n", irq);
+}
+
 /*
  * After EOI'ing an interrupt, we need to re-check the queue
  * to see if another interrupt is pending since multiple
@@ -1419,6 +1428,7 @@ bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 o
 	xive_irq_priority = max_prio;
 
 	ppc_md.get_irq = xive_get_irq;
+	ppc_md.orphan_irq = xive_orphan_irq;
 	__xive_enabled = true;
 
 	pr_devel("Initializing host..\n");
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH kernel v2] powerpc/xive: Drop deregistered irqs
  2019-07-15  7:11 [PATCH kernel v2] powerpc/xive: Drop deregistered irqs Alexey Kardashevskiy
@ 2019-07-16  8:59 ` Cédric Le Goater
  2019-07-16  9:10   ` Alexey Kardashevskiy
  0 siblings, 1 reply; 5+ messages in thread
From: Cédric Le Goater @ 2019-07-16  8:59 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev
  Cc: Alistair Popple, Daniel Henrique Barboza, Greg Kurz,
	Nicholas Piggin, Paul Mackerras, David Gibson

On 15/07/2019 09:11, Alexey Kardashevskiy wrote:
> There is a race between releasing an irq on one cpu and fetching it
> from XIVE on another cpu as there does not seem to be any locking between
> these, probably because xive_irq_chip::irq_shutdown() is supposed to
> remove the irq from all queues in the system which it does not do.
> 
> As a result, when such released irq appears in a queue, we take it
> from the queue but we do not change the current priority on that cpu and
> since there is no handler for the irq, EOI is never called and the cpu
> current priority remains elevated (7 vs. 0xff==unmasked). If another irq
> is assigned to the same cpu, then that device stops working until irq
> is moved to another cpu or the device is reset.
> 
> This adds a new ppc_md.orphan_irq callback which is called if no irq
> descriptor is found. The XIVE implementation drops the current priority
> to 0xff which effectively unmasks interrupts in a current CPU.


The test on generic_handle_irq() catches interrupt events that
were served on a target CPU while the source interrupt was being
shutdown on another CPU.

The orphan_irq() handler restores the CPPR in such cases. 

This looks OK to me. I would have added some more comments in the 
code. 

Reviewed-by: Cédric Le Goater <clg@kaod.org>


And adding to the list of future cleanups : a 'set_cppr' helper.

Thanks,

C.


> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v2:
> * added ppc_md.orphan_irq
> 
> ---
> 
> Found it on P9 system with:
> - a host with 8 cpus online
> - a boot disk on ahci with its msix on cpu#0
> - a guest with 2xGPUs + 6xNVLink + 4 cpus
> - GPU#0 from the guest is bound to the same cpu#0.
> 
> Killing a guest killed ahci and therefore the host because of the race.
> Note that VFIO masks interrupts first and only then resets the device.
> ---
>  arch/powerpc/include/asm/machdep.h |  3 +++
>  arch/powerpc/kernel/irq.c          |  9 ++++++---
>  arch/powerpc/sysdev/xive/common.c  | 10 ++++++++++
>  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index c43d6eca9edd..6cc14e28e89a 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -59,6 +59,9 @@ struct machdep_calls {
>  	/* Return an irq, or 0 to indicate there are none pending. */
>  	unsigned int	(*get_irq)(void);
>  
> +	/* Drops irq if it does not have a valid descriptor */
> +	void		(*orphan_irq)(unsigned int irq);
> +
>  	/* PCI stuff */
>  	/* Called after allocating resources */
>  	void		(*pcibios_fixup)(void);
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index bc68c53af67c..b4e06d05bdba 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -632,10 +632,13 @@ void __do_irq(struct pt_regs *regs)
>  	may_hard_irq_enable();
>  
>  	/* And finally process it */
> -	if (unlikely(!irq))
> +	if (unlikely(!irq)) {
>  		__this_cpu_inc(irq_stat.spurious_irqs);
> -	else
> -		generic_handle_irq(irq);
> +	} else if (generic_handle_irq(irq)) {
> +		if (ppc_md.orphan_irq)
> +			ppc_md.orphan_irq(irq);
> +		__this_cpu_inc(irq_stat.spurious_irqs);
> +	}
>  
>  	trace_irq_exit(regs);
>  
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 082c7e1c20f0..b4054091999a 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -283,6 +283,15 @@ static unsigned int xive_get_irq(void)
>  	return irq;
>  }
>  
> +static void xive_orphan_irq(unsigned int irq)
> +{
> +	struct xive_cpu *xc = __this_cpu_read(xive_cpu);
> +
> +	xc->cppr = 0xff;
> +	out_8(xive_tima + xive_tima_offset + TM_CPPR, 0xff);
> +	DBG_VERBOSE("orphan_irq: irq %d, adjusting CPPR to 0xff\n", irq);
> +}
> +
>  /*
>   * After EOI'ing an interrupt, we need to re-check the queue
>   * to see if another interrupt is pending since multiple
> @@ -1419,6 +1428,7 @@ bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 o
>  	xive_irq_priority = max_prio;
>  
>  	ppc_md.get_irq = xive_get_irq;
> +	ppc_md.orphan_irq = xive_orphan_irq;
>  	__xive_enabled = true;
>  
>  	pr_devel("Initializing host..\n");
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH kernel v2] powerpc/xive: Drop deregistered irqs
  2019-07-16  8:59 ` Cédric Le Goater
@ 2019-07-16  9:10   ` Alexey Kardashevskiy
  2019-07-16  9:14     ` Cédric Le Goater
  0 siblings, 1 reply; 5+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-16  9:10 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev
  Cc: Alistair Popple, Daniel Henrique Barboza, Greg Kurz,
	Nicholas Piggin, Paul Mackerras, David Gibson



On 16/07/2019 18:59, Cédric Le Goater wrote:
> On 15/07/2019 09:11, Alexey Kardashevskiy wrote:
>> There is a race between releasing an irq on one cpu and fetching it
>> from XIVE on another cpu as there does not seem to be any locking between
>> these, probably because xive_irq_chip::irq_shutdown() is supposed to
>> remove the irq from all queues in the system which it does not do.
>>
>> As a result, when such released irq appears in a queue, we take it
>> from the queue but we do not change the current priority on that cpu and
>> since there is no handler for the irq, EOI is never called and the cpu
>> current priority remains elevated (7 vs. 0xff==unmasked). If another irq
>> is assigned to the same cpu, then that device stops working until irq
>> is moved to another cpu or the device is reset.
>>
>> This adds a new ppc_md.orphan_irq callback which is called if no irq
>> descriptor is found. The XIVE implementation drops the current priority
>> to 0xff which effectively unmasks interrupts in a current CPU.
> 
> 
> The test on generic_handle_irq() catches interrupt events that
> were served on a target CPU while the source interrupt was being
> shutdown on another CPU.
> 
> The orphan_irq() handler restores the CPPR in such cases.
> 
> This looks OK to me. I would have added some more comments in the
> code.


Which and where? Thanks,


> Reviewed-by: Cédric Le Goater <clg@kaod.org>
> 
> 
> And adding to the list of future cleanups : a 'set_cppr' helper.
> 
> Thanks,
> 
> C.
> 
> 
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> Changes:
>> v2:
>> * added ppc_md.orphan_irq
>>
>> ---
>>
>> Found it on P9 system with:
>> - a host with 8 cpus online
>> - a boot disk on ahci with its msix on cpu#0
>> - a guest with 2xGPUs + 6xNVLink + 4 cpus
>> - GPU#0 from the guest is bound to the same cpu#0.
>>
>> Killing a guest killed ahci and therefore the host because of the race.
>> Note that VFIO masks interrupts first and only then resets the device.
>> ---
>>   arch/powerpc/include/asm/machdep.h |  3 +++
>>   arch/powerpc/kernel/irq.c          |  9 ++++++---
>>   arch/powerpc/sysdev/xive/common.c  | 10 ++++++++++
>>   3 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
>> index c43d6eca9edd..6cc14e28e89a 100644
>> --- a/arch/powerpc/include/asm/machdep.h
>> +++ b/arch/powerpc/include/asm/machdep.h
>> @@ -59,6 +59,9 @@ struct machdep_calls {
>>   	/* Return an irq, or 0 to indicate there are none pending. */
>>   	unsigned int	(*get_irq)(void);
>>   
>> +	/* Drops irq if it does not have a valid descriptor */
>> +	void		(*orphan_irq)(unsigned int irq);
>> +
>>   	/* PCI stuff */
>>   	/* Called after allocating resources */
>>   	void		(*pcibios_fixup)(void);
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index bc68c53af67c..b4e06d05bdba 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -632,10 +632,13 @@ void __do_irq(struct pt_regs *regs)
>>   	may_hard_irq_enable();
>>   
>>   	/* And finally process it */
>> -	if (unlikely(!irq))
>> +	if (unlikely(!irq)) {
>>   		__this_cpu_inc(irq_stat.spurious_irqs);
>> -	else
>> -		generic_handle_irq(irq);
>> +	} else if (generic_handle_irq(irq)) {
>> +		if (ppc_md.orphan_irq)
>> +			ppc_md.orphan_irq(irq);
>> +		__this_cpu_inc(irq_stat.spurious_irqs);
>> +	}
>>   
>>   	trace_irq_exit(regs);
>>   
>> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>> index 082c7e1c20f0..b4054091999a 100644
>> --- a/arch/powerpc/sysdev/xive/common.c
>> +++ b/arch/powerpc/sysdev/xive/common.c
>> @@ -283,6 +283,15 @@ static unsigned int xive_get_irq(void)
>>   	return irq;
>>   }
>>   
>> +static void xive_orphan_irq(unsigned int irq)
>> +{
>> +	struct xive_cpu *xc = __this_cpu_read(xive_cpu);
>> +
>> +	xc->cppr = 0xff;
>> +	out_8(xive_tima + xive_tima_offset + TM_CPPR, 0xff);
>> +	DBG_VERBOSE("orphan_irq: irq %d, adjusting CPPR to 0xff\n", irq);
>> +}
>> +
>>   /*
>>    * After EOI'ing an interrupt, we need to re-check the queue
>>    * to see if another interrupt is pending since multiple
>> @@ -1419,6 +1428,7 @@ bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 o
>>   	xive_irq_priority = max_prio;
>>   
>>   	ppc_md.get_irq = xive_get_irq;
>> +	ppc_md.orphan_irq = xive_orphan_irq;
>>   	__xive_enabled = true;
>>   
>>   	pr_devel("Initializing host..\n");
>>
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH kernel v2] powerpc/xive: Drop deregistered irqs
  2019-07-16  9:10   ` Alexey Kardashevskiy
@ 2019-07-16  9:14     ` Cédric Le Goater
  2019-07-16 11:54       ` Michael Ellerman
  0 siblings, 1 reply; 5+ messages in thread
From: Cédric Le Goater @ 2019-07-16  9:14 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev
  Cc: Alistair Popple, Daniel Henrique Barboza, Greg Kurz,
	Nicholas Piggin, Paul Mackerras, David Gibson

On 16/07/2019 11:10, Alexey Kardashevskiy wrote:
> 
> 
> On 16/07/2019 18:59, Cédric Le Goater wrote:
>> On 15/07/2019 09:11, Alexey Kardashevskiy wrote:
>>> There is a race between releasing an irq on one cpu and fetching it
>>> from XIVE on another cpu as there does not seem to be any locking between
>>> these, probably because xive_irq_chip::irq_shutdown() is supposed to
>>> remove the irq from all queues in the system which it does not do.
>>>
>>> As a result, when such released irq appears in a queue, we take it
>>> from the queue but we do not change the current priority on that cpu and
>>> since there is no handler for the irq, EOI is never called and the cpu
>>> current priority remains elevated (7 vs. 0xff==unmasked). If another irq
>>> is assigned to the same cpu, then that device stops working until irq
>>> is moved to another cpu or the device is reset.
>>>
>>> This adds a new ppc_md.orphan_irq callback which is called if no irq
>>> descriptor is found. The XIVE implementation drops the current priority
>>> to 0xff which effectively unmasks interrupts in a current CPU.
>>
>>
>> The test on generic_handle_irq() catches interrupt events that
>> were served on a target CPU while the source interrupt was being
>> shutdown on another CPU.
>>
>> The orphan_irq() handler restores the CPPR in such cases.
>>
>> This looks OK to me. I would have added some more comments in the
>> code.
> 
> 
> Which and where? Thanks,

Above xive_orphan_irq() explaining the complete problem that we are 
addressing. XIVE is not super obvious when looking at the code ...


C.
 
> 
>> Reviewed-by: Cédric Le Goater <clg@kaod.org>
>>
>>
>> And adding to the list of future cleanups : a 'set_cppr' helper.
>>
>> Thanks,
>>
>> C.
>>
>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>> Changes:
>>> v2:
>>> * added ppc_md.orphan_irq
>>>
>>> ---
>>>
>>> Found it on P9 system with:
>>> - a host with 8 cpus online
>>> - a boot disk on ahci with its msix on cpu#0
>>> - a guest with 2xGPUs + 6xNVLink + 4 cpus
>>> - GPU#0 from the guest is bound to the same cpu#0.
>>>
>>> Killing a guest killed ahci and therefore the host because of the race.
>>> Note that VFIO masks interrupts first and only then resets the device.
>>> ---
>>>   arch/powerpc/include/asm/machdep.h |  3 +++
>>>   arch/powerpc/kernel/irq.c          |  9 ++++++---
>>>   arch/powerpc/sysdev/xive/common.c  | 10 ++++++++++
>>>   3 files changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
>>> index c43d6eca9edd..6cc14e28e89a 100644
>>> --- a/arch/powerpc/include/asm/machdep.h
>>> +++ b/arch/powerpc/include/asm/machdep.h
>>> @@ -59,6 +59,9 @@ struct machdep_calls {
>>>       /* Return an irq, or 0 to indicate there are none pending. */
>>>       unsigned int    (*get_irq)(void);
>>>   +    /* Drops irq if it does not have a valid descriptor */
>>> +    void        (*orphan_irq)(unsigned int irq);
>>> +
>>>       /* PCI stuff */
>>>       /* Called after allocating resources */
>>>       void        (*pcibios_fixup)(void);
>>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>>> index bc68c53af67c..b4e06d05bdba 100644
>>> --- a/arch/powerpc/kernel/irq.c
>>> +++ b/arch/powerpc/kernel/irq.c
>>> @@ -632,10 +632,13 @@ void __do_irq(struct pt_regs *regs)
>>>       may_hard_irq_enable();
>>>         /* And finally process it */
>>> -    if (unlikely(!irq))
>>> +    if (unlikely(!irq)) {
>>>           __this_cpu_inc(irq_stat.spurious_irqs);
>>> -    else
>>> -        generic_handle_irq(irq);
>>> +    } else if (generic_handle_irq(irq)) {
>>> +        if (ppc_md.orphan_irq)
>>> +            ppc_md.orphan_irq(irq);
>>> +        __this_cpu_inc(irq_stat.spurious_irqs);
>>> +    }
>>>         trace_irq_exit(regs);
>>>   diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>>> index 082c7e1c20f0..b4054091999a 100644
>>> --- a/arch/powerpc/sysdev/xive/common.c
>>> +++ b/arch/powerpc/sysdev/xive/common.c
>>> @@ -283,6 +283,15 @@ static unsigned int xive_get_irq(void)
>>>       return irq;
>>>   }
>>>   +static void xive_orphan_irq(unsigned int irq)
>>> +{
>>> +    struct xive_cpu *xc = __this_cpu_read(xive_cpu);
>>> +
>>> +    xc->cppr = 0xff;
>>> +    out_8(xive_tima + xive_tima_offset + TM_CPPR, 0xff);
>>> +    DBG_VERBOSE("orphan_irq: irq %d, adjusting CPPR to 0xff\n", irq);
>>> +}
>>> +
>>>   /*
>>>    * After EOI'ing an interrupt, we need to re-check the queue
>>>    * to see if another interrupt is pending since multiple
>>> @@ -1419,6 +1428,7 @@ bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 o
>>>       xive_irq_priority = max_prio;
>>>         ppc_md.get_irq = xive_get_irq;
>>> +    ppc_md.orphan_irq = xive_orphan_irq;
>>>       __xive_enabled = true;
>>>         pr_devel("Initializing host..\n");
>>>
>>
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH kernel v2] powerpc/xive: Drop deregistered irqs
  2019-07-16  9:14     ` Cédric Le Goater
@ 2019-07-16 11:54       ` Michael Ellerman
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2019-07-16 11:54 UTC (permalink / raw)
  To: Cédric Le Goater, Alexey Kardashevskiy, linuxppc-dev
  Cc: Alistair Popple, Daniel Henrique Barboza, Greg Kurz,
	Nicholas Piggin, Paul Mackerras, David Gibson

Cédric Le Goater <clg@kaod.org> writes:
> On 16/07/2019 11:10, Alexey Kardashevskiy wrote:
>> On 16/07/2019 18:59, Cédric Le Goater wrote:
>>> On 15/07/2019 09:11, Alexey Kardashevskiy wrote:
>>>> There is a race between releasing an irq on one cpu and fetching it
>>>> from XIVE on another cpu as there does not seem to be any locking between
>>>> these, probably because xive_irq_chip::irq_shutdown() is supposed to
>>>> remove the irq from all queues in the system which it does not do.
>>>>
>>>> As a result, when such released irq appears in a queue, we take it
>>>> from the queue but we do not change the current priority on that cpu and
>>>> since there is no handler for the irq, EOI is never called and the cpu
>>>> current priority remains elevated (7 vs. 0xff==unmasked). If another irq
>>>> is assigned to the same cpu, then that device stops working until irq
>>>> is moved to another cpu or the device is reset.
>>>>
>>>> This adds a new ppc_md.orphan_irq callback which is called if no irq
>>>> descriptor is found. The XIVE implementation drops the current priority
>>>> to 0xff which effectively unmasks interrupts in a current CPU.
>>>
>>>
>>> The test on generic_handle_irq() catches interrupt events that
>>> were served on a target CPU while the source interrupt was being
>>> shutdown on another CPU.
>>>
>>> The orphan_irq() handler restores the CPPR in such cases.
>>>
>>> This looks OK to me. I would have added some more comments in the
>>> code.
>> 
>> Which and where? Thanks,
>
> Above xive_orphan_irq() explaining the complete problem that we are 
> addressing. XIVE is not super obvious when looking at the code ...

Yes adding a comment would be good, thanks.

This will also need a Fixes: tag.

cheers

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-07-16 12:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-15  7:11 [PATCH kernel v2] powerpc/xive: Drop deregistered irqs Alexey Kardashevskiy
2019-07-16  8:59 ` Cédric Le Goater
2019-07-16  9:10   ` Alexey Kardashevskiy
2019-07-16  9:14     ` Cédric Le Goater
2019-07-16 11:54       ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).