* [PATCH v2 3.6] ath9k: fix interrupt storms on queued hardware reset
@ 2012-08-08 14:25 Felix Fietkau
  2012-08-08 14:43 ` Rajkumar Manoharan
  0 siblings, 1 reply; 5+ messages in thread
From: Felix Fietkau @ 2012-08-08 14:25 UTC (permalink / raw)
  To: linux-wireless; +Cc: linville, mcgrof, rmanohar, c_manoha

commit b74713d04effbacd3d126ce94cec18742187b6ce
"ath9k: Handle fatal interrupts properly" introduced a race condition, where
IRQs are being left enabled, however the irq handler returns IRQ_HANDLED
while the reset is still queued without addressing the IRQ cause.
This leads to an IRQ storm that prevents the system from even getting to
the reset code.

Fix this by disabling IRQs in the handler without touching intr_ref_cnt.

Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Cc: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
---
 drivers/net/wireless/ath/ath9k/mac.c  |   18 ++++++++++++------
 drivers/net/wireless/ath/ath9k/mac.h  |    1 +
 drivers/net/wireless/ath/ath9k/main.c |    4 +++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 7990cd5..b42be91 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -773,15 +773,10 @@ bool ath9k_hw_intrpend(struct ath_hw *ah)
 }
 EXPORT_SYMBOL(ath9k_hw_intrpend);
 
-void ath9k_hw_disable_interrupts(struct ath_hw *ah)
+void ath9k_hw_kill_interrupts(struct ath_hw *ah)
 {
 	struct ath_common *common = ath9k_hw_common(ah);
 
-	if (!(ah->imask & ATH9K_INT_GLOBAL))
-		atomic_set(&ah->intr_ref_cnt, -1);
-	else
-		atomic_dec(&ah->intr_ref_cnt);
-
 	ath_dbg(common, INTERRUPT, "disable IER\n");
 	REG_WRITE(ah, AR_IER, AR_IER_DISABLE);
 	(void) REG_READ(ah, AR_IER);
@@ -793,6 +788,17 @@ void ath9k_hw_disable_interrupts(struct ath_hw *ah)
 		(void) REG_READ(ah, AR_INTR_SYNC_ENABLE);
 	}
 }
+EXPORT_SYMBOL(ath9k_hw_kill_interrupts);
+
+void ath9k_hw_disable_interrupts(struct ath_hw *ah)
+{
+	if (!(ah->imask & ATH9K_INT_GLOBAL))
+		atomic_set(&ah->intr_ref_cnt, -1);
+	else
+		atomic_dec(&ah->intr_ref_cnt);
+
+	ath9k_hw_kill_interrupts(ah);
+}
 EXPORT_SYMBOL(ath9k_hw_disable_interrupts);
 
 void ath9k_hw_enable_interrupts(struct ath_hw *ah)
diff --git a/drivers/net/wireless/ath/ath9k/mac.h b/drivers/net/wireless/ath/ath9k/mac.h
index 0eba36d..4a745e6 100644
--- a/drivers/net/wireless/ath/ath9k/mac.h
+++ b/drivers/net/wireless/ath/ath9k/mac.h
@@ -738,6 +738,7 @@ bool ath9k_hw_intrpend(struct ath_hw *ah);
 void ath9k_hw_set_interrupts(struct ath_hw *ah);
 void ath9k_hw_enable_interrupts(struct ath_hw *ah);
 void ath9k_hw_disable_interrupts(struct ath_hw *ah);
+void ath9k_hw_kill_interrupts(struct ath_hw *ah);
 
 void ar9002_hw_attach_mac_ops(struct ath_hw *ah);
 
diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 6049d8b..a22df74 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -462,8 +462,10 @@ irqreturn_t ath_isr(int irq, void *dev)
 	if (!ath9k_hw_intrpend(ah))
 		return IRQ_NONE;
 
-	if(test_bit(SC_OP_HW_RESET, &sc->sc_flags))
+	if (test_bit(SC_OP_HW_RESET, &sc->sc_flags)) {
+		ath9k_hw_kill_interrupts(ah);
 		return IRQ_HANDLED;
+	}
 
 	/*
 	 * Figure out the reason(s) for the interrupt.  Note
-- 
1.7.9.6 (Apple Git-31.1)
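
The reason the ISR path masks interrupts without touching intr_ref_cnt can be sketched with a small user-space model. This is only an illustration under stated assumptions, not driver code: `struct fake_hw` and every `fake_*` function are hypothetical stand-ins, the ATH9K_INT_GLOBAL branch is left out, and the enable path is assumed to unmask only when the counter returns to zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ath_hw: only the fields the sketch needs. */
struct fake_hw {
	atomic_int intr_ref_cnt;	/* 0 means interrupts may be enabled */
	bool ier_on;			/* models the AR_IER enable bit */
};

/* Mask at the hardware level only; the refcount is left alone. */
static void fake_kill_interrupts(struct fake_hw *hw)
{
	hw->ier_on = false;
}

/* Refcounted disable: drop the counter, then mask. */
static void fake_disable_interrupts(struct fake_hw *hw)
{
	atomic_fetch_sub(&hw->intr_ref_cnt, 1);
	fake_kill_interrupts(hw);
}

/* Refcounted enable (assumed behaviour): unmask only when the counter is back at 0. */
static void fake_enable_interrupts(struct fake_hw *hw)
{
	if (atomic_fetch_add(&hw->intr_ref_cnt, 1) + 1 != 0)
		return;
	hw->ier_on = true;
}

/*
 * An interrupt arrives while a reset is queued, then the reset work runs
 * its usual disable/enable pair. Returns true if interrupts end up enabled.
 */
bool fake_reset_after_isr(bool isr_uses_kill)
{
	struct fake_hw hw = { .ier_on = true };

	atomic_init(&hw.intr_ref_cnt, 0);

	if (isr_uses_kill)
		fake_kill_interrupts(&hw);	/* what the patched ISR does */
	else
		fake_disable_interrupts(&hw);	/* unpaired refcounted disable */
	assert(!hw.ier_on);			/* the storm stops either way */

	fake_disable_interrupts(&hw);		/* reset work: disable ... */
	fake_enable_interrupts(&hw);		/* ... then re-enable */
	return hw.ier_on;
}
```

In this model, fake_reset_after_isr(true) leaves interrupts enabled after the reset, while fake_reset_after_isr(false) leaves them masked because the extra decrement from the ISR is never balanced.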



* Re: [PATCH v2 3.6] ath9k: fix interrupt storms on queued hardware reset
  2012-08-08 14:25 [PATCH v2 3.6] ath9k: fix interrupt storms on queued hardware reset Felix Fietkau
@ 2012-08-08 14:43 ` Rajkumar Manoharan
  2012-08-08 15:00   ` Felix Fietkau
  2012-08-08 15:25   ` Sujith Manoharan
  0 siblings, 2 replies; 5+ messages in thread
From: Rajkumar Manoharan @ 2012-08-08 14:43 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, linville, rodrigue, c_manoha

On Wed, Aug 08, 2012 at 04:25:03PM +0200, Felix Fietkau wrote:
> commit b74713d04effbacd3d126ce94cec18742187b6ce
> "ath9k: Handle fatal interrupts properly" introduced a race condition, where
> IRQs are being left enabled, however the irq handler returns IRQ_HANDLED
> while the reset is still queued without addressing the IRQ cause.
> This leads to an IRQ storm that prevents the system from even getting to
> the reset code.
> 
> Fix this by disabling IRQs in the handler without touching intr_ref_cnt.
>
It is safer not to re-enable interrupts on FATAL errors rather than enabling
it and then checking it on irq for bailing out. It would be better if you kill
the interrupts on processing fatal interrupts.

-Rajkumar


* Re: [PATCH v2 3.6] ath9k: fix interrupt storms on queued hardware reset
  2012-08-08 14:43 ` Rajkumar Manoharan
@ 2012-08-08 15:00   ` Felix Fietkau
  2012-08-08 15:20     ` Rajkumar Manoharan
  2012-08-08 15:25   ` Sujith Manoharan
  1 sibling, 1 reply; 5+ messages in thread
From: Felix Fietkau @ 2012-08-08 15:00 UTC (permalink / raw)
  To: Rajkumar Manoharan; +Cc: linux-wireless, linville, rodrigue, c_manoha

On 2012-08-08 4:43 PM, Rajkumar Manoharan wrote:
> On Wed, Aug 08, 2012 at 04:25:03PM +0200, Felix Fietkau wrote:
>> commit b74713d04effbacd3d126ce94cec18742187b6ce
>> "ath9k: Handle fatal interrupts properly" introduced a race condition, where
>> IRQs are being left enabled, however the irq handler returns IRQ_HANDLED
>> while the reset is still queued without addressing the IRQ cause.
>> This leads to an IRQ storm that prevents the system from even getting to
>> the reset code.
>> 
>> Fix this by disabling IRQs in the handler without touching intr_ref_cnt.
>>
> It is safer not to re-enable interrupts on FATAL errors rather than enabling
> it and then checking it on irq for bailing out. It would be better if you kill
> the interrupts on processing fatal interrupts.
A fatal interrupt isn't the only place where this race shows up.
Anything that queues a reset is affected, so skipping the interrupt
enable in the IRQ handler is not enough (aside from the fact that it
would mess up irq disable refcounting).

Also, how is it safer? It's not like the interrupt handler does any real
processing before running into that check.

- Felix


* Re: [PATCH v2 3.6] ath9k: fix interrupt storms on queued hardware reset
  2012-08-08 15:00   ` Felix Fietkau
@ 2012-08-08 15:20     ` Rajkumar Manoharan
  0 siblings, 0 replies; 5+ messages in thread
From: Rajkumar Manoharan @ 2012-08-08 15:20 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, linville, rodrigue, c_manoha

On Wed, Aug 08, 2012 at 05:00:39PM +0200, Felix Fietkau wrote:
> On 2012-08-08 4:43 PM, Rajkumar Manoharan wrote:
> > On Wed, Aug 08, 2012 at 04:25:03PM +0200, Felix Fietkau wrote:
> >> commit b74713d04effbacd3d126ce94cec18742187b6ce
> >> "ath9k: Handle fatal interrupts properly" introduced a race condition, where
> >> IRQs are being left enabled, however the irq handler returns IRQ_HANDLED
> >> while the reset is still queued without addressing the IRQ cause.
> >> This leads to an IRQ storm that prevents the system from even getting to
> >> the reset code.
> >> 
> >> Fix this by disabling IRQs in the handler without touching intr_ref_cnt.
> >>
> > It is safer not to re-enable interrupts on FATAL errors rather than enabling
> > it and then checking it on irq for bailing out. It would be better if you kill
> > the interrupts on processing fatal interrupts.
> A fatal interrupt isn't the only place where this race shows up.
> Anything that queues a reset is affected, so skipping the interrupt
> enable in the IRQ handler is not enough (aside from the fact that it
> would mess up irq disable refcounting).
> 
> Also, how is it safer? It's not like the interrupt handler does any real
> processing before running into that check.
> 
Agreed. I was confused by the subject of the mentioned commit. Sorry for the noise.

-Rajkumar


* Re: [PATCH v2 3.6] ath9k: fix interrupt storms on queued hardware reset
  2012-08-08 14:43 ` Rajkumar Manoharan
  2012-08-08 15:00   ` Felix Fietkau
@ 2012-08-08 15:25   ` Sujith Manoharan
  1 sibling, 0 replies; 5+ messages in thread
From: Sujith Manoharan @ 2012-08-08 15:25 UTC (permalink / raw)
  To: Rajkumar Manoharan
  Cc: Felix Fietkau, linux-wireless, linville, rodrigue, c_manoha

Rajkumar Manoharan wrote:
> It is safer not to re-enable interrupts on FATAL errors rather than enabling
> it and then checking it on irq for bailing out. It would be better if you kill
> the interrupts on processing fatal interrupts.

I am not sure I understand.

The original issue was the race between reset-work and the ISR which resulted in
frequent disconnects when a BB-WATCHDOG interrupt occurred or TX hung, which was
fixed by introducing the SC_OP_HW_RESET flag. Later, the work_pending() race was
fixed. Still, this is a race that can happen and I think fixing it by bypassing
the ref-count maintenance and disabling interrupts is okay.

Sujith
