linux-kernel.vger.kernel.org archive mirror
* [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
@ 2012-02-06  8:14 Lothar Waßmann
  2012-02-06 10:42 ` Lars-Peter Clausen
  2012-02-07  9:03 ` Yong Zhang
  0 siblings, 2 replies; 12+ messages in thread
From: Lothar Waßmann @ 2012-02-06  8:14 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arm-kernel, Thomas Gleixner

Hi,

I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
but did not get any response there. So resending to a wider audience
with improved subject line:

there is a race condition in the threaded IRQ handler code for oneshot
interrupts that may lead to disabling an IRQ indefinitely. IRQs are
masked before calling the hard-irq handler and are unmasked only after
the soft-irq handler has been run. Thus if the hard-irq handler
returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
will not be called, the interrupt will remain masked forever.

This can happen due to a short pulse on the interrupt line, that
triggers the interrupt logic, but goes undetected by the hard-irq
handler. The problem can be reproduced with the TSC2007 touch
controller driver that uses ONESHOT interrupts.

The problem arises also with interrupt controllers that latch a level
triggered IRQ until it is acknowledged (like the i.MX28 does).
In this case the IRQ status bit will remain asserted after the
soft-irq finishes and retrigger the interrupt while the interrupt line
is already deasserted.
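
A minimal user-space sketch of the sequence described above (not kernel
code). The names sim_desc, sim_level_flow and SIM_IRQ_* are made up for
illustration; only the bit values are meant to mirror irqreturn_t. It
assumes, as a simplification, that a woken thread runs to completion
immediately:

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values chosen to mirror the kernel's irqreturn_t. */
enum sim_irqreturn {
	SIM_IRQ_NONE        = 0,
	SIM_IRQ_HANDLED     = 1 << 0,
	SIM_IRQ_WAKE_THREAD = 1 << 1,
};

/* Hypothetical, heavily reduced stand-in for struct irq_desc. */
struct sim_desc {
	bool masked;   /* line is masked at the interrupt controller */
	bool oneshot;  /* handler was requested with IRQF_ONESHOT */
};

/* Models the level flow without any fix applied. */
static void sim_level_flow(struct sim_desc *d, enum sim_irqreturn primary_ret)
{
	d->masked = true;		/* mask_ack_irq() on entry */

	if (primary_ret & SIM_IRQ_WAKE_THREAD) {
		/* Simplification: the thread runs at once and unmasks. */
		d->masked = false;
		return;
	}

	if (!d->oneshot)
		d->masked = false;	/* unmask_irq() in the flow handler */

	/* oneshot and no thread woken: nothing unmasks the line */
}
```

Calling sim_level_flow() on a oneshot descriptor with SIM_IRQ_HANDLED
leaves masked set, which is the stuck state reported here.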

The following patch would solve the problem, but I'm not sure whether
it's the Right Thing(TM) to do. Especially wrt. shared interrupts.

diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 470d08c..93beadb 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -146,6 +146,11 @@ handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
 			/* Fall through to add to randomness */
 		case IRQ_HANDLED:
 			random |= action->flags;
+			/* unmask the IRQ that has been left masked
+			 * due to race condition
+			 */
+			if (res == IRQ_HANDLED && (action->flags & IRQF_ONESHOT))
+				unmask_irq(desc);
 			break;
 
 		default:

Best regards,

Lothar Wassmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info@karo-electronics.de
___________________________________________________________



* Re: [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
  2012-02-06  8:14 [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever Lothar Waßmann
@ 2012-02-06 10:42 ` Lars-Peter Clausen
  2012-02-07  9:03 ` Yong Zhang
  1 sibling, 0 replies; 12+ messages in thread
From: Lars-Peter Clausen @ 2012-02-06 10:42 UTC (permalink / raw)
  To: Lothar Waßmann; +Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner

On 02/06/2012 09:14 AM, Lothar Waßmann wrote:
> Hi,
> 
> I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
> but did not get any response there. So resending to a wider audience
> with improved subject line:
> 
> there is a race condition in the threaded IRQ handler code for oneshot
> interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> masked before calling the hard-irq handler and are unmasked only after
> the soft-irq handler has been run. Thus if the hard-irq handler
> returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
> will not be called, the interrupt will remain masked forever.
> 
> This can happen due to a short pulse on the interrupt line, that
> triggers the interrupt logic, but goes undetected by the hard-irq
> handler. The problem can be reproduced with the TSC2007 touch
> controller driver that uses ONESHOT interrupts.
> 
> The problem arises also with interrupt controllers that latch a level
> triggered IRQ until it is acknowledged (like the i.MX28 does).
> In this case the IRQ status bit will remain asserted after the
> soft-irq finishes and retrigger the interrupt while the interrupt line
> is already deasserted.
> 
> The following patch would solve the problem, but I'm not sure whether
> it's the Right Thing(TM) to do. Especially wrt. shared interrupts.
> 
> diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
> index 470d08c..93beadb 100644
> --- a/kernel/irq/handle.c
> +++ b/kernel/irq/handle.c
> @@ -146,6 +146,11 @@ handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
>  			/* Fall through to add to randomness */
>  		case IRQ_HANDLED:
>  			random |= action->flags;
> +			/* unmask the IRQ that has been left masked
> +			 * due to race condition
> +			 */
> +			if (res == IRQ_HANDLED && (action->flags & IRQF_ONESHOT))
> +				unmask_irq(desc);
>  			break;
>  
>  		default:

I think a better fix is to check the return value of handle_irq_event() in
handle_level_irq() and, if the IRQ_WAKE_THREAD bit is not set, unmask the irq.

The same should probably also be done for handle_fasteoi_irq.



* Re: [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
  2012-02-06  8:14 [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever Lothar Waßmann
  2012-02-06 10:42 ` Lars-Peter Clausen
@ 2012-02-07  9:03 ` Yong Zhang
  2012-02-07 10:01   ` Lothar Waßmann
  1 sibling, 1 reply; 12+ messages in thread
From: Yong Zhang @ 2012-02-07  9:03 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner

On Mon, Feb 06, 2012 at 09:14:47AM +0100, Lothar Waßmann wrote:
> Hi,
> 
> I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
> but did not get any response there. So resending to a wider audience
> with improved subject line:
> 
> there is a race condition in the threaded IRQ handler code for oneshot
> interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> masked before calling the hard-irq handler and are unmasked only after
> the soft-irq handler has been run. Thus if the hard-irq handler
> returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
> will not be called, the interrupt will remain masked forever.
> 
> This can happen due to a short pulse on the interrupt line, that
> triggers the interrupt logic, but goes undetected by the hard-irq
> handler. The problem can be reproduced with the TSC2007 touch
> controller driver that uses ONESHOT interrupts.

Isn't it the responsibility of the driver (say TSC2007)?

In this case, TSC2007 should return IRQ_WAKE_THREAD IMHO.

Thanks,
Yong


> 
> The problem arises also with interrupt controllers that latch a level
> triggered IRQ until it is acknowledged (like the i.MX28 does).
> In this case the IRQ status bit will remain asserted after the
> soft-irq finishes and retrigger the interrupt while the interrupt line
> is already deasserted.
> 
> The following patch would solve the problem, but I'm not sure whether
> it's the Right Thing(TM) to do. Especially wrt. shared interrupts.
> 
> diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
> index 470d08c..93beadb 100644
> --- a/kernel/irq/handle.c
> +++ b/kernel/irq/handle.c
> @@ -146,6 +146,11 @@ handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
>  			/* Fall through to add to randomness */
>  		case IRQ_HANDLED:
>  			random |= action->flags;
> +			/* unmask the IRQ that has been left masked
> +			 * due to race condition
> +			 */
> +			if (res == IRQ_HANDLED && (action->flags & IRQF_ONESHOT))
> +				unmask_irq(desc);
>  			break;
>  
>  		default:
> 
> Best regards,
> 
> Lothar Wassmann

-- 
Only stand for myself


* Re: [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
  2012-02-07  9:03 ` Yong Zhang
@ 2012-02-07 10:01   ` Lothar Waßmann
  2012-02-07 12:34     ` Yong Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Lothar Waßmann @ 2012-02-07 10:01 UTC (permalink / raw)
  To: Yong Zhang; +Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner

Hi,

> On Mon, Feb 06, 2012 at 09:14:47AM +0100, Lothar Waßmann wrote:
> > Hi,
> > 
> > I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
> > but did not get any response there. So resending to a wider audience
> > with improved subject line:
> > 
> > there is a race condition in the threaded IRQ handler code for oneshot
> > interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> > masked before calling the hard-irq handler and are unmasked only after
> > the soft-irq handler has been run. Thus if the hard-irq handler
> > returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
> > will not be called, the interrupt will remain masked forever.
> > 
> > This can happen due to a short pulse on the interrupt line, that
> > triggers the interrupt logic, but goes undetected by the hard-irq
> > handler. The problem can be reproduced with the TSC2007 touch
> > controller driver that uses ONESHOT interrupts.
> 
> Isn't it the responsibility of the driver (say TSC2007)?
> 
> In this case, TSC2007 should return IRQ_WAKE_THREAD IMHO.
> 
That would mean it has to return IRQ_WAKE_THREAD unconditionally,
making the return code useless.
It would also cause an extra, useless pass through the softirq
handler.


Lothar Waßmann

* Re: [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
  2012-02-07 10:01   ` Lothar Waßmann
@ 2012-02-07 12:34     ` Yong Zhang
  2012-02-07 12:52       ` Lothar Waßmann
  0 siblings, 1 reply; 12+ messages in thread
From: Yong Zhang @ 2012-02-07 12:34 UTC (permalink / raw)
  To: Lothar Waßmann; +Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner

On Tue, Feb 07, 2012 at 11:01:06AM +0100, Lothar Waßmann wrote:
> Hi,
> 
> > On Mon, Feb 06, 2012 at 09:14:47AM +0100, Lothar Waßmann wrote:
> > > Hi,
> > > 
> > > I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
> > > but did not get any response there. So resending to a wider audience
> > > with improved subject line:
> > > 
> > > there is a race condition in the threaded IRQ handler code for oneshot
> > > interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> > > masked before calling the hard-irq handler and are unmasked only after
> > > the soft-irq handler has been run. Thus if the hard-irq handler
> > > returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
> > > will not be called, the interrupt will remain masked forever.
> > > 
> > > This can happen due to a short pulse on the interrupt line, that
> > > triggers the interrupt logic, but goes undetected by the hard-irq
> > > handler. The problem can be reproduced with the TSC2007 touch
> > > controller driver that uses ONESHOT interrupts.
> > 
> > Isn't it the responsibility of the driver (say TSC2007)?
> > 
> > In this case, TSC2007 should return IRQ_WAKE_THREAD IMHO.
> > 
> That would mean it had to return IRQ_WAKE_THREAD unconditionally
> making the return code useless.
> And it would cause an extra useless loop through the softirq
> handler.

Yeah, it's the default behavior when we introduce 'threadirqs',
and it's safe.

You know, in your patch unmask_irq() is called locklessly, and
that will introduce another race.

Thanks,
Yong


* Re: [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
  2012-02-07 12:34     ` Yong Zhang
@ 2012-02-07 12:52       ` Lothar Waßmann
  2012-02-07 13:07         ` Lars-Peter Clausen
  0 siblings, 1 reply; 12+ messages in thread
From: Lothar Waßmann @ 2012-02-07 12:52 UTC (permalink / raw)
  To: Yong Zhang; +Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner

Hi,

Yong Zhang writes:
> On Tue, Feb 07, 2012 at 11:01:06AM +0100, Lothar Waßmann wrote:
> > Hi,
> > 
> > > On Mon, Feb 06, 2012 at 09:14:47AM +0100, Lothar Waßmann wrote:
> > > > Hi,
> > > > 
> > > > I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
> > > > but did not get any response there. So resending to a wider audience
> > > > with improved subject line:
> > > > 
> > > > there is a race condition in the threaded IRQ handler code for oneshot
> > > > interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> > > > masked before calling the hard-irq handler and are unmasked only after
> > > > the soft-irq handler has been run. Thus if the hard-irq handler
> > > > returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
> > > > will not be called, the interrupt will remain masked forever.
> > > > 
> > > > This can happen due to a short pulse on the interrupt line, that
> > > > triggers the interrupt logic, but goes undetected by the hard-irq
> > > > handler. The problem can be reproduced with the TSC2007 touch
> > > > controller driver that uses ONESHOT interrupts.
> > > 
> > > Isn't it the responsibility of the driver (say TSC2007)?
> > > 
> > > In this case, TSC2007 should return IRQ_WAKE_THREAD IMHO.
> > > 
> > That would mean it had to return IRQ_WAKE_THREAD unconditionally
> > making the return code useless.
> > And it would cause an extra useless loop through the softirq
> > handler.
> 
> Yeah, it's the default behavior when we introduce 'threadirqs',
> and it's safe.
> 
So, the correct solution would be to remove the check for
IRQ_WAKE_THREAD in handle_irq_event_percpu() and always invoke the
softirq handler?
Note that this problem is not specific to the TSC2007 driver, but may
occur with any hardware.

Or maybe do the unmasking in handle_level_irq(), based on the return
value of handle_irq_event(), as proposed by Lars-Peter Clausen in
<4F2FAE93.5020205@metafoo.de>?
Like this:
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index f7c543a..fbf68c7 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -343,6 +343,8 @@ EXPORT_SYMBOL_GPL(handle_simple_irq);
 void
 handle_level_irq(unsigned int irq, struct irq_desc *desc)
 {
+	int ret;
+
 	raw_spin_lock(&desc->lock);
 	mask_ack_irq(desc);
 
@@ -360,10 +362,12 @@ handle_level_irq(unsigned int irq, struct irq_desc *desc)
 	if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data)))
 		goto out_unlock;
 
-	handle_irq_event(desc);
+	ret = handle_irq_event(desc);
 
-	if (!irqd_irq_disabled(&desc->irq_data) && !(desc->istate & IRQS_ONESHOT))
+	if (!irqd_irq_disabled(&desc->irq_data) &&
+		(!(desc->istate & IRQS_ONESHOT) || ret != IRQ_WAKE_THREAD))
 		unmask_irq(desc);
+
 out_unlock:
 	raw_spin_unlock(&desc->lock);
 }


Lothar Waßmann

* Re: [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever
  2012-02-07 12:52       ` Lothar Waßmann
@ 2012-02-07 13:07         ` Lars-Peter Clausen
  2012-02-07 13:38           ` [PATCH] genirq: Fix race condition in ONESHOT irq handler Lothar Waßmann
  0 siblings, 1 reply; 12+ messages in thread
From: Lars-Peter Clausen @ 2012-02-07 13:07 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: Yong Zhang, Thomas Gleixner, linux-kernel, linux-arm-kernel

On 02/07/2012 01:52 PM, Lothar Waßmann wrote:
> Hi,
> 
> Yong Zhang writes:
>> On Tue, Feb 07, 2012 at 11:01:06AM +0100, Lothar Waßmann wrote:
>>> Hi,
>>>
>>>> On Mon, Feb 06, 2012 at 09:14:47AM +0100, Lothar Waßmann wrote:
>>>>> Hi,
>>>>>
>>>>> I already sent this to <linux-kernel@vger.kernel.org> on Feb. 1, 2012
>>>>> but did not get any response there. So resending to a wider audience
>>>>> with improved subject line:
>>>>>
>>>>> there is a race condition in the threaded IRQ handler code for oneshot
>>>>> interrupts that may lead to disabling an IRQ indefinitely. IRQs are
>>>>> masked before calling the hard-irq handler and are unmasked only after
>>>>> the soft-irq handler has been run. Thus if the hard-irq handler
>>>>> returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
>>>>> will not be called, the interrupt will remain masked forever.
>>>>>
>>>>> This can happen due to a short pulse on the interrupt line, that
>>>>> triggers the interrupt logic, but goes undetected by the hard-irq
>>>>> handler. The problem can be reproduced with the TSC2007 touch
>>>>> controller driver that uses ONESHOT interrupts.
>>>>
>>>> Isn't it the responsibility of the driver (say TSC2007)?
>>>>
>>>> In this case, TSC2007 should return IRQ_WAKE_THREAD IMHO.
>>>>
>>> That would mean it had to return IRQ_WAKE_THREAD unconditionally
>>> making the return code useless.
>>> And it would cause an extra useless loop through the softirq
>>> handler.
>>
>> Yeah, it's the default behavior when we introduce 'threadirqs',
>> and it's safe.
>>
> So, the correct solution would be to remove the check for
> IRQ_WAKE_THREAD in handle_irq_event_percpu() and always invoke the
> softirq handler?
> Note that this problem is not specific to the TSC2007 driver, but may
> occur with any hardware.
> 
> Or maybe do the unmasking in handle_irq_event() as proposed by
> Lars-Peter Clausen in <4F2FAE93.5020205@metafoo.de>?
> Like that:
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index f7c543a..fbf68c7 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -343,6 +343,8 @@ EXPORT_SYMBOL_GPL(handle_simple_irq);
>  void
>  handle_level_irq(unsigned int irq, struct irq_desc *desc)
>  {
> +	int ret;

This should be irqreturn_t

> +
>  	raw_spin_lock(&desc->lock);
>  	mask_ack_irq(desc);
>  
> @@ -360,10 +362,12 @@ handle_level_irq(unsigned int irq, struct irq_desc *desc)
>  	if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data)))
>  		goto out_unlock;
>  
> -	handle_irq_event(desc);
> +	ret = handle_irq_event(desc);
>  
> -	if (!irqd_irq_disabled(&desc->irq_data) && !(desc->istate & IRQS_ONESHOT))
> +	if (!irqd_irq_disabled(&desc->irq_data) &&
> +		(!(desc->istate & IRQS_ONESHOT) || ret != IRQ_WAKE_THREAD))

As I said, check for the bit, not for the value. This will ensure that it
also works with shared interrupts. So something like this:

		!((desc->istate & IRQS_ONESHOT) && (ret & IRQ_WAKE_THREAD)))

>  		unmask_irq(desc);
> +
>  out_unlock:
>  	raw_spin_unlock(&desc->lock);
>  }
> 
> 
> Lothar Waßmann
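
A small user-space sketch of why the bit test matters on a shared line.
It assumes the core ORs the return values of all handlers on the line
into one result, as handle_irq_event_percpu() appears to do. The SHR_*
names and helper functions are hypothetical and only mirror the
irqreturn_t bit layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit values chosen to mirror the kernel's irqreturn_t. */
#define SHR_IRQ_NONE		0u
#define SHR_IRQ_HANDLED		(1u << 0)
#define SHR_IRQ_WAKE_THREAD	(1u << 1)

/* OR together the results of every handler sharing the line. */
static unsigned int shr_combine(const unsigned int *rets, size_t n)
{
	unsigned int retval = SHR_IRQ_NONE;

	for (size_t i = 0; i < n; i++)
		retval |= rets[i];
	return retval;
}

/* Value comparison: misses the wakeup once another bit is also set. */
static bool shr_woke_by_value(unsigned int ret)
{
	return ret == SHR_IRQ_WAKE_THREAD;
}

/* Bit test: still sees the wakeup in a combined result. */
static bool shr_woke_by_bit(unsigned int ret)
{
	return (ret & SHR_IRQ_WAKE_THREAD) != 0;
}
```

With one handler returning SHR_IRQ_HANDLED and another returning
SHR_IRQ_WAKE_THREAD, the combined value has both bits set, so only the
bit test reports that a thread was woken.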



* [PATCH] genirq: Fix race condition in ONESHOT irq handler
  2012-02-07 13:07         ` Lars-Peter Clausen
@ 2012-02-07 13:38           ` Lothar Waßmann
  2012-02-07 17:03             ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Lothar Waßmann @ 2012-02-07 13:38 UTC (permalink / raw)
  To: linux-kernel, linux-kernel
  Cc: Lothar Waßmann, Thomas Gleixner, Lars-Peter Clausen,
	Yong Zhang, linux-arm-kernel

There is a race condition in the threaded IRQ handler code for oneshot
interrupts that may lead to disabling an IRQ indefinitely. IRQs are
masked before calling the hard-irq handler and are unmasked only after
the soft-irq handler has been run. Thus if the hard-irq handler
returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
will not be called, the interrupt will remain masked forever.

This can happen due to a short pulse on the interrupt line, that
triggers the interrupt logic, but goes undetected by the hard-irq
handler. The problem can be reproduced with the TSC2007 touch
controller driver that uses ONESHOT interrupts.

The problem arises also with interrupt controllers that latch a level
triggered IRQ until it is acknowledged (like the i.MX28 does).
In this case the IRQ status bit will remain asserted after the
soft-irq finishes and retrigger the interrupt while the interrupt line
is already deasserted.

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
 kernel/irq/chip.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index f7c543a..74fdef9 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -343,6 +343,8 @@ EXPORT_SYMBOL_GPL(handle_simple_irq);
 void
 handle_level_irq(unsigned int irq, struct irq_desc *desc)
 {
+	irqreturn_t ret;
+
 	raw_spin_lock(&desc->lock);
 	mask_ack_irq(desc);
 
@@ -360,10 +362,13 @@ handle_level_irq(unsigned int irq, struct irq_desc *desc)
 	if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data)))
 		goto out_unlock;
 
-	handle_irq_event(desc);
+	ret = handle_irq_event(desc);
 
-	if (!irqd_irq_disabled(&desc->irq_data) && !(desc->istate & IRQS_ONESHOT))
+	if (!irqd_irq_disabled(&desc->irq_data) &&
+			(!(desc->istate & IRQS_ONESHOT) ||
+				!(ret & IRQ_WAKE_THREAD)))
 		unmask_irq(desc);
+
 out_unlock:
 	raw_spin_unlock(&desc->lock);
 }
-- 
1.7.2.5



* Re: [PATCH] genirq: Fix race condition in ONESHOT irq handler
  2012-02-07 13:38           ` [PATCH] genirq: Fix race condition in ONESHOT irq handler Lothar Waßmann
@ 2012-02-07 17:03             ` Thomas Gleixner
  2012-02-08  6:05               ` Lothar Waßmann
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2012-02-07 17:03 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: linux-kernel, Lars-Peter Clausen, Yong Zhang, linux-arm-kernel


On Tue, 7 Feb 2012, Lothar Waßmann wrote:

> There is a race condition in the threaded IRQ handler code for oneshot
> interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> masked before calling the hard-irq handler and are unmasked only after
> the soft-irq handler has been run. Thus if the hard-irq handler
> returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq

Well, oneshot mode interrupts always had the semantics that the
threaded handler needs to run unconditionally. In fact the oneshot
mode was implemented to handle hardware which cannot do anything in
hard interrupt context to avoid the ugliness of a primary handler
calling disable_irq_nosync().

So it looks like driver developers decided that the oneshot mode might
be interesting with a primary handler as well. I can see the reason
why the tsc2007 driver uses it, but that does not make it a bug in the
core code in the first place.

Though we should handle it, and the problem arises not only with the
IRQ_HANDLED return code but also with IRQ_NONE.

> will not be called, the interrupt will remain masked forever.
> 
> This can happen due to a short pulse on the interrupt line, that
> triggers the interrupt logic, but goes undetected by the hard-irq
> handler. The problem can be reproduced with the TSC2007 touch
> controller driver that uses ONESHOT interrupts.

It should not return IRQ_HANDLED in that case, as the real thing is a
spurious interrupt.
 
> The problem arises also with interrupt controllers that latch a level
> triggered IRQ until it is acknowledged (like the i.MX28 does).
> In this case the IRQ status bit will remain asserted after the
> soft-irq finishes and retrigger the interrupt while the interrupt line
> is already deasserted.

This does not make sense. We acknowledge interrupts via mask_ack_irq()
right on entry of handle_level_irq(). So either the interrupt
controller is completely hosed or this explanation is bogus.

> Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
> ---
>  kernel/irq/chip.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index f7c543a..74fdef9 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -343,6 +343,8 @@ EXPORT_SYMBOL_GPL(handle_simple_irq);
>  void
>  handle_level_irq(unsigned int irq, struct irq_desc *desc)
>  {
> +	irqreturn_t ret;
> +
>  	raw_spin_lock(&desc->lock);
>  	mask_ack_irq(desc);
>  
> @@ -360,10 +362,13 @@ handle_level_irq(unsigned int irq, struct irq_desc *desc)
>  	if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data)))
>  		goto out_unlock;
>  
> -	handle_irq_event(desc);
> +	ret = handle_irq_event(desc);
>  
> -	if (!irqd_irq_disabled(&desc->irq_data) && !(desc->istate & IRQS_ONESHOT))
> +	if (!irqd_irq_disabled(&desc->irq_data) &&
> +			(!(desc->istate & IRQS_ONESHOT) ||
> +				!(ret & IRQ_WAKE_THREAD)))

Hmm, that looks ugly and it misses the same fixup for
handle_fasteoi_irq() including proper comments.

The following patch should address both cases.

Thanks,

	tglx

===================================================================
--- linux-3.2.orig/kernel/irq/chip.c
+++ linux-3.2/kernel/irq/chip.c
@@ -330,6 +330,24 @@ out_unlock:
 }
 EXPORT_SYMBOL_GPL(handle_simple_irq);
 
+/*
+ * Called unconditionally from handle_level_irq() and only for oneshot
+ * interrupts from handle_fasteoi_irq()
+ */
+static void cond_unmask_irq(struct irq_desc *desc)
+{
+	/*
+	 * We need to unmask in the following cases:
+	 * - Standard level irq (IRQF_ONESHOT is not set)
+	 * - Oneshot irq which did not wake the thread (caused by a
+	 *   spurious interrupt or a primary handler handling it
+	 *   completely).
+	 */
+	if (!irqd_irq_disabled(&desc->irq_data) &&
+	    irqd_irq_masked(&desc->irq_data) && !desc->threads_oneshot)
+		unmask_irq(desc);
+}
+
 /**
  *	handle_level_irq - Level type irq handler
  *	@irq:	the interrupt number
@@ -362,8 +380,8 @@ handle_level_irq(unsigned int irq, struc
 
 	handle_irq_event(desc);
 
-	if (!irqd_irq_disabled(&desc->irq_data) && !(desc->istate & IRQS_ONESHOT))
-		unmask_irq(desc);
+	cond_unmask_irq(desc);
+
 out_unlock:
 	raw_spin_unlock(&desc->lock);
 }
@@ -417,6 +435,9 @@ handle_fasteoi_irq(unsigned int irq, str
 	preflow_handler(desc);
 	handle_irq_event(desc);
 
+	if (desc->istate & IRQS_ONESHOT)
+		cond_unmask_irq(desc);
+
 out_eoi:
 	desc->irq_data.chip->irq_eoi(&desc->irq_data);
 out_unlock:
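
The behaviour of the patch above can be modelled in user space roughly
as follows. This is a sketch under assumptions, not the kernel code:
struct cu_desc and the cu_* functions are hypothetical, and a single
bit in threads_oneshot stands in for the per-action thread mask that
the core is assumed to set when it wakes a oneshot thread:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the relevant irq_desc state. */
struct cu_desc {
	bool disabled;			/* irqd_irq_disabled() */
	bool masked;			/* irqd_irq_masked() */
	unsigned long threads_oneshot;	/* oneshot threads still pending */
};

/* Same condition as cond_unmask_irq() in the patch. */
static void cu_cond_unmask(struct cu_desc *d)
{
	if (!d->disabled && d->masked && !d->threads_oneshot)
		d->masked = false;	/* unmask_irq() */
}

/* Model of handle_level_irq() with the patch applied. */
static void cu_handle_level(struct cu_desc *d, bool oneshot_thread_woken)
{
	d->masked = true;		/* mask_ack_irq() */
	if (d->disabled)
		return;			/* goto out_unlock */

	if (oneshot_thread_woken)
		d->threads_oneshot |= 1UL;	/* done while waking the thread */

	cu_cond_unmask(d);
}

/* Model of the oneshot thread finishing its work. */
static void cu_thread_done(struct cu_desc *d)
{
	d->threads_oneshot &= ~1UL;
	cu_cond_unmask(d);
}
```

In this model a spurious or fully handled interrupt is unmasked right
away, while a woken oneshot thread keeps the line masked until
cu_thread_done() runs.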


* Re: [PATCH] genirq: Fix race condition in ONESHOT irq handler
  2012-02-07 17:03             ` Thomas Gleixner
@ 2012-02-08  6:05               ` Lothar Waßmann
  2012-02-08 10:38                 ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Lothar Waßmann @ 2012-02-08  6:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Lars-Peter Clausen, Yong Zhang, linux-arm-kernel

Hi,

Thomas Gleixner writes:
> On Tue, 7 Feb 2012, Lothar Waßmann wrote:
> 
> > There is a race condition in the threaded IRQ handler code for oneshot
> > interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> > masked before calling the hard-irq handler and are unmasked only after
> > the soft-irq handler has been run. Thus if the hard-irq handler
> > returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
> 
> Well, oneshot mode interrupts always had the semantics that the
> threaded handler needs to run unconditionally. In fact the oneshot
> mode was implemented to handle hardware which cannot do anything in
> hard interrupt context to avoid the ugliness of a primary handler
> calling disable_irq_nosync().
> 
> So it looks like driver developers decided that the oneshot mode might
> be interesting with a primary handler as well. I can see the reason
> why the tsc2007 driver uses it, but that does not make it a bug in the
> core code in the first place.
> 
Then maybe the core code should not check the return value
of the primary handler for IRQ_WAKE_THREAD but call the secondary
handler unconditionally for ONESHOT interrupts.
Or it should be at least documented somewhere that primary handlers
have to return IRQ_WAKE_THREAD in any case for oneshot interrupts.

> > The problem arises also with interrupt controllers that latch a level
> > triggered IRQ until it is acknowledged (like the i.MX28 does).
> > In this case the IRQ status bit will remain asserted after the
> > soft-irq finishes and retrigger the interrupt while the interrupt line
> > is already deasserted.
> 
> This does not make sense. We acknowledge interrupts via mask_ack_irq()
> right on entry of handle_level_irq(). So either the interrupt
> 
That's right. But at that point the IRQ line is still asserted and
since it is a level IRQ this will not actually clear the interrupt
status bit. Normally the IRQ status bit would self-clear when the IRQ
line is being deasserted (in this case by removing the finger from the
touch panel). But the i.MX28 leaves the IRQ status bit set and it
takes another write to the IRQ status register to remove the bogus IRQ
status.

> controller is completely hosed or this explanation is bogus.
>
The first is the case.


Lothar Waßmann

* Re: [PATCH] genirq: Fix race condition in ONESHOT irq handler
  2012-02-08  6:05               ` Lothar Waßmann
@ 2012-02-08 10:38                 ` Thomas Gleixner
  2012-02-09  8:40                   ` Lothar Waßmann
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2012-02-08 10:38 UTC (permalink / raw)
  To: Lothar Waßmann
  Cc: linux-kernel, Lars-Peter Clausen, Yong Zhang, linux-arm-kernel


On Wed, 8 Feb 2012, Lothar Waßmann wrote:
> > So it looks like driver developers decided that the oneshot mode might
> > be interesting with a primary handler as well. I can see the reason
> > why the tsc2007 driver uses it, but that does not make it a bug in the
> > core code in the first place.
> > 
> Then maybe the core code should not check the return value
> of the primary handler for IRQ_WAKE_THREAD but call the secondary
> handler unconditionally for ONESHOT interrupts.
> Or it should be at least documented somewhere that primary handlers
> have to return IRQ_WAKE_THREAD in any case for oneshot interrupts.

Well, you know how good we are with documentation :)
 
> > > The problem arises also with interrupt controllers that latch a level
> > > triggered IRQ until it is acknowledged (like the i.MX28 does).
> > > In this case the IRQ status bit will remain asserted after the
> > > soft-irq finishes and retrigger the interrupt while the interrupt line
> > > is already deasserted.
> > 
> > This does not make sense. We acknowledge interrupts via mask_ack_irq()
> > right on entry of handle_level_irq(). So either the interrupt
> > 
> That's right. But at that point the IRQ line is still asserted and
> since it is a level IRQ this will not actually clear the interrupt
> status bit. Normally the IRQ status bit would self-clear when the IRQ
> line is being deasserted (in this case by removing the finger from the
> touch panel). But the i.MX28 leaves the IRQ status bit set and it
> takes another write to the IRQ status register to remove the bogus IRQ
> status.

So the question is whether the imx irq chip implementation should
write to the status register on unmask for level type irqs to avoid
spurious interrupts being generated in the first place. This is not
only an optimization for threaded interrupts, afaict this spurious
effect should happen with non threaded interrupts as well.
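
A rough user-space model of the latching behaviour described above, and of clearing the stale status on unmask, might look like the following. It is a sketch under assumptions: the `sim_*` names, the three-word register layout and the `clear_stale` flag are hypothetical and do not correspond to the real i.MX28 registers or to any irq chip callback in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical controller that latches level-IRQ status bits. */
struct sim_irqchip {
	uint32_t line;   /* current level of each input line */
	uint32_t status; /* latched status bits */
	uint32_t mask;   /* 1 = masked */
};

/* Drive an input line; a falling line does NOT clear the status bit. */
static void sim_set_line(struct sim_irqchip *chip, unsigned int irq, bool level)
{
	assert(irq < 32);
	if (level) {
		chip->line |= UINT32_C(1) << irq;
		chip->status |= UINT32_C(1) << irq;  /* asserted line sets status */
	} else {
		chip->line &= ~(UINT32_C(1) << irq); /* status stays latched */
	}
}

/* Status write: clears the bit unless the line is still asserted. */
static void sim_ack(struct sim_irqchip *chip, unsigned int irq)
{
	uint32_t bit;

	assert(irq < 32);
	bit = UINT32_C(1) << irq;
	chip->status = (chip->status & ~bit) | (chip->line & bit);
}

/* What happens on handler entry in this model: mask, then acknowledge. */
static void sim_mask_ack(struct sim_irqchip *chip, unsigned int irq)
{
	assert(irq < 32);
	chip->mask |= UINT32_C(1) << irq;
	sim_ack(chip, irq);
}

/* Unmask; with clear_stale set, write the status register first. */
static void sim_unmask(struct sim_irqchip *chip, unsigned int irq,
		       bool clear_stale)
{
	assert(irq < 32);
	if (clear_stale)
		sim_ack(chip, irq);
	chip->mask &= ~(UINT32_C(1) << irq);
}

/* An interrupt is delivered when its status bit is set and unmasked. */
static bool sim_pending(const struct sim_irqchip *chip, unsigned int irq)
{
	assert(irq < 32);
	return (chip->status & ~chip->mask & (UINT32_C(1) << irq)) != 0;
}
```

In this model, asserting a line, calling `sim_mask_ack()`, deasserting the line and then unmasking with `clear_stale` false leaves the interrupt pending although the line is low. With `clear_stale` true it is not pending, and an interrupt whose line is still asserted at unmask time stays pending, so nothing real is dropped.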

Did my patch work for you?

Thanks,

	tglx


* Re: [PATCH] genirq: Fix race condition in ONESHOT irq handler
  2012-02-08 10:38                 ` Thomas Gleixner
@ 2012-02-09  8:40                   ` Lothar Waßmann
  0 siblings, 0 replies; 12+ messages in thread
From: Lothar Waßmann @ 2012-02-09  8:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Lars-Peter Clausen, Yong Zhang, linux-arm-kernel

Hi,

Thomas Gleixner writes:
> On Wed, 8 Feb 2012, Lothar Waßmann wrote:
> > > > The problem arises also with interrupt controllers that latch a level
> > > > triggered IRQ until it is acknowledged (like the i.MX28 does).
> > > > In this case the IRQ status bit will remain asserted after the
> > > > soft-irq finishes and retrigger the interrupt while the interrupt line
> > > > is already deasserted.
> > > 
> > > This does not make sense. We acknowledge interrupts via mask_ack_irq()
> > > right on entry of handle_level_irq(). So either the interrupt
> > > 
> > That's right. But at that point the IRQ line is still asserted and
> > since it is a level IRQ this will not actually clear the interrupt
> > status bit. Normally the IRQ status bit would self-clear when the IRQ
> > line is being deasserted (in this case by removing the finger from the
> > touch panel). But the i.MX28 leaves the IRQ status bit set and it
> > takes another write to the IRQ status register to remove the bogus IRQ
> > status.
> 
> So the question is whether the imx irq chip implementation should
> write to the status register on unmask for level type irqs to avoid
> spurious interrupts being generated in the first place. This is not
> only an optimization for threaded interrupts, afaict this spurious
> effect should happen with non threaded interrupts as well.
> 
> Did my patch work for you?
> 
Sorry, I couldn't test it earlier. Yes, it works.

Lothar Waßmann


end of thread, last message 2012-02-09  8:40 UTC

Thread overview: 12+ messages
2012-02-06  8:14 [BUG] genirq: Race condition in ONESHOT IRQ handler disabling IRQ forever Lothar Waßmann
2012-02-06 10:42 ` Lars-Peter Clausen
2012-02-07  9:03 ` Yong Zhang
2012-02-07 10:01   ` Lothar Waßmann
2012-02-07 12:34     ` Yong Zhang
2012-02-07 12:52       ` Lothar Waßmann
2012-02-07 13:07         ` Lars-Peter Clausen
2012-02-07 13:38           ` [PATCH] genirq: Fix race condition in ONESHOT irq handler Lothar Waßmann
2012-02-07 17:03             ` Thomas Gleixner
2012-02-08  6:05               ` Lothar Waßmann
2012-02-08 10:38                 ` Thomas Gleixner
2012-02-09  8:40                   ` Lothar Waßmann
