linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] ath9k: interrupt fixes on queue reset
@ 2021-09-14 19:25 Linus Lüssing
  2021-09-14 19:25 ` [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs Linus Lüssing
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Linus Lüssing @ 2021-09-14 19:25 UTC (permalink / raw)
  To: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel

Hi,

The following are two patches for ath9k to fix a potential interrupt
storm (PATCH 2/3) and to fix potentially resetting the wifi chip while
its interrupts were accidentally reenabled (PATCH 3/3).

PATCH 1/3 adds the possibility to trigger the ath9k queue reset through
the ath9k reset file in debugfs. Which was helpful to reproduce and debug
this issue and might help for future debugging.

PATCH 2/3 and PATCH 3/3 should be applicable for stable.

Regards, Linus


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs
  2021-09-14 19:25 [PATCH 0/3] ath9k: interrupt fixes on queue reset Linus Lüssing
@ 2021-09-14 19:25 ` Linus Lüssing
  2021-10-05 14:27   ` Kalle Valo
  2021-09-14 19:25 ` [PATCH 2/3] ath9k: Fix potential interrupt storm on queue reset Linus Lüssing
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Linus Lüssing @ 2021-09-14 19:25 UTC (permalink / raw)
  To: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel, Linus Lüssing

From: Linus Lüssing <ll@simonwunderlich.de>

Sometimes, in yet unknown cases the wifi chip stops working. To allow a
watchdog in userspace to easily and quickly reset the wifi chip, add the
according functionality to userspace. A reset can then be triggered
via:

  $ echo 1 > /sys/kernel/debug/ieee80211/phy0/ath9k/reset

The number of user resets can further be tracked in the row "User reset"
in the same file.

So far people usually used "iw scan" to fix ath9k chip hangs from
userspace. Which triggers the ath9k_queue_reset(), too. The reset file
however has the advantage of less overhead, which makes debugging bugs
within ath9k_queue_reset() easier.

Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
---
 drivers/net/wireless/ath/ath9k/debug.c | 57 ++++++++++++++++++++++++--
 drivers/net/wireless/ath/ath9k/debug.h |  1 +
 2 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c
index 4c81b1d7f417..fb7a2952d0ce 100644
--- a/drivers/net/wireless/ath/ath9k/debug.c
+++ b/drivers/net/wireless/ath/ath9k/debug.c
@@ -749,9 +749,9 @@ static int read_file_misc(struct seq_file *file, void *data)
 
 static int read_file_reset(struct seq_file *file, void *data)
 {
-	struct ieee80211_hw *hw = dev_get_drvdata(file->private);
-	struct ath_softc *sc = hw->priv;
+	struct ath_softc *sc = file->private;
 	static const char * const reset_cause[__RESET_TYPE_MAX] = {
+		[RESET_TYPE_USER] = "User reset",
 		[RESET_TYPE_BB_HANG] = "Baseband Hang",
 		[RESET_TYPE_BB_WATCHDOG] = "Baseband Watchdog",
 		[RESET_TYPE_FATAL_INT] = "Fatal HW Error",
@@ -779,6 +779,55 @@ static int read_file_reset(struct seq_file *file, void *data)
 	return 0;
 }
 
+static int open_file_reset(struct inode *inode, struct file *f)
+{
+	return single_open(f, read_file_reset, inode->i_private);
+}
+
+static ssize_t write_file_reset(struct file *file,
+				const char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct ath_softc *sc = file_inode(file)->i_private;
+	struct ath_hw *ah = sc->sc_ah;
+	struct ath_common *common = ath9k_hw_common(ah);
+	unsigned long val;
+	char buf[32];
+	ssize_t len;
+
+	len = min(count, sizeof(buf) - 1);
+	if (copy_from_user(buf, user_buf, len))
+		return -EFAULT;
+
+	buf[len] = '\0';
+	if (kstrtoul(buf, 0, &val))
+		return -EINVAL;
+
+	if (val != 1)
+		return -EINVAL;
+
+	/* avoid rearming hw_reset_work on shutdown */
+	mutex_lock(&sc->mutex);
+	if (test_bit(ATH_OP_INVALID, &common->op_flags)) {
+		mutex_unlock(&sc->mutex);
+		return -EBUSY;
+	}
+
+	ath9k_queue_reset(sc, RESET_TYPE_USER);
+	mutex_unlock(&sc->mutex);
+
+	return count;
+}
+
+static const struct file_operations fops_reset = {
+	.read = seq_read,
+	.write = write_file_reset,
+	.open = open_file_reset,
+	.owner = THIS_MODULE,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
 void ath_debug_stat_tx(struct ath_softc *sc, struct ath_buf *bf,
 		       struct ath_tx_status *ts, struct ath_txq *txq,
 		       unsigned int flags)
@@ -1393,8 +1442,8 @@ int ath9k_init_debug(struct ath_hw *ah)
 				    read_file_queues);
 	debugfs_create_devm_seqfile(sc->dev, "misc", sc->debug.debugfs_phy,
 				    read_file_misc);
-	debugfs_create_devm_seqfile(sc->dev, "reset", sc->debug.debugfs_phy,
-				    read_file_reset);
+	debugfs_create_file("reset", 0600, sc->debug.debugfs_phy,
+			    sc, &fops_reset);
 
 	ath9k_cmn_debug_recv(sc->debug.debugfs_phy, &sc->debug.stats.rxstats);
 	ath9k_cmn_debug_phy_err(sc->debug.debugfs_phy, &sc->debug.stats.rxstats);
diff --git a/drivers/net/wireless/ath/ath9k/debug.h b/drivers/net/wireless/ath/ath9k/debug.h
index 33826aa13687..389459c04d14 100644
--- a/drivers/net/wireless/ath/ath9k/debug.h
+++ b/drivers/net/wireless/ath/ath9k/debug.h
@@ -39,6 +39,7 @@ struct fft_sample_tlv;
 #endif
 
 enum ath_reset_type {
+	RESET_TYPE_USER,
 	RESET_TYPE_BB_HANG,
 	RESET_TYPE_BB_WATCHDOG,
 	RESET_TYPE_FATAL_INT,
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] ath9k: Fix potential interrupt storm on queue reset
  2021-09-14 19:25 [PATCH 0/3] ath9k: interrupt fixes on queue reset Linus Lüssing
  2021-09-14 19:25 ` [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs Linus Lüssing
@ 2021-09-14 19:25 ` Linus Lüssing
  2021-09-14 19:25 ` [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset Linus Lüssing
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Linus Lüssing @ 2021-09-14 19:25 UTC (permalink / raw)
  To: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel, Linus Lüssing,
	Linus Lüssing

From: Linus Lüssing <ll@simonwunderlich.de>

In tests with two Lima boards from 8devices (QCA4531 based) on OpenWrt
19.07 we could force a silent restart of a device with no serial
output when we were sending a high amount of UDP traffic (iperf3 at 80
MBit/s in both directions from external hosts, saturating the wifi and
causing a load of about 4.5 to 6) and were then triggering an
ath9k_queue_reset().

Further debugging showed that the restart was caused by the ath79
watchdog. With disabled watchdog we could observe that the device was
constantly going into ath_isr() interrupt handler and was returning
early after the ATH_OP_HW_RESET flag test, without clearing any
interrupts. Even though ath9k_queue_reset() calls
ath9k_hw_kill_interrupts().

With JTAG we could observe the following race condition:

1) ath9k_queue_reset()
   ...
   -> ath9k_hw_kill_interrupts()
   -> set_bit(ATH_OP_HW_RESET, &common->op_flags);
   ...
   <- returns

      2) ath9k_tasklet()
         ...
         -> ath9k_hw_resume_interrupts()
         ...
         <- returns

                 3) loops around:
                    ...
                    handle_int()
                    -> ath_isr()
                       ...
                       -> if (test_bit(ATH_OP_HW_RESET,
                                       &common->op_flags))
                            return IRQ_HANDLED;

                    x) ath_reset_internal():
                       => never reached <=

And in ath_isr() we would typically see the following interrupts /
interrupt causes:

* status: 0x00111030 or 0x00110030
* async_cause: 2 (AR_INTR_MAC_IPQ)
* sync_cause: 0

So the ath9k_tasklet() reenables the ath9k interrupts
through ath9k_hw_resume_interrupts() which ath9k_queue_reset() had just
disabled. And ath_isr() then keeps firing because it returns IRQ_HANDLED
without actually clearing the interrupt.

To fix this IRQ storm also clear/disable the interrupts again when we
are in reset state.

Cc: Sven Eckelmann <sven@narfation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Linus Lüssing <linus.luessing@c0d3.blue>
Fixes: 872b5d814f99 ("ath9k: do not access hardware on IRQs during reset")
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
---
 drivers/net/wireless/ath/ath9k/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 139831539da3..98090e40e1cf 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -533,8 +533,10 @@ irqreturn_t ath_isr(int irq, void *dev)
 	ath9k_debug_sync_cause(sc, sync_cause);
 	status &= ah->imask;	/* discard unasked-for bits */
 
-	if (test_bit(ATH_OP_HW_RESET, &common->op_flags))
+	if (test_bit(ATH_OP_HW_RESET, &common->op_flags)) {
+		ath9k_hw_kill_interrupts(sc->sc_ah);
 		return IRQ_HANDLED;
+	}
 
 	/*
 	 * If there are no status bits set, then this interrupt was not
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset
  2021-09-14 19:25 [PATCH 0/3] ath9k: interrupt fixes on queue reset Linus Lüssing
  2021-09-14 19:25 ` [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs Linus Lüssing
  2021-09-14 19:25 ` [PATCH 2/3] ath9k: Fix potential interrupt storm on queue reset Linus Lüssing
@ 2021-09-14 19:25 ` Linus Lüssing
  2021-09-15  9:48   ` Felix Fietkau
  2021-09-14 19:53 ` [PATCH 0/3] ath9k: interrupt fixes on queue reset Toke Høiland-Jørgensen
  2021-10-05 14:12 ` Linus Lüssing
  4 siblings, 1 reply; 11+ messages in thread
From: Linus Lüssing @ 2021-09-14 19:25 UTC (permalink / raw)
  To: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel, Linus Lüssing,
	Linus Lüssing

From: Linus Lüssing <ll@simonwunderlich.de>

There is a small risk of the ath9k hw interrupts being reenabled in the
following way:

1) ath_reset_internal()
   ...
   -> disable_irq()
      ...
      <- returns

                      2) ath9k_tasklet()
                         ...
                         -> ath9k_hw_resume_interrupts()
                         ...

1) ath_reset_internal() continued:
   -> tasklet_disable(&sc->intr_tq); (= ath9k_tasklet() off)

By first disabling the ath9k interrupt there is a small window
afterwards which allows ath9k hw interrupts being reenabled through
the ath9k_tasklet() before we disable this tasklet in
ath_reset_internal(). Leading to having the ath9k hw interrupts enabled
during the reset, which we should avoid.

Fixing this by first disabling all ath9k tasklets. And only after
they are not running anymore also disabling the overall ath9k interrupt.

Either ath9k_queue_reset()->ath9k_kill_hw_interrupts() or
ath_reset_internal()->disable_irq()->ath_isr()->ath9k_kill_hw_interrupts()
should then have ensured that no ath9k hw interrupts are running during
the actual ath9k reset.

We could reproduce this issue with two Lima boards from 8devices
(QCA4531) on OpenWrt 19.07 while sending UDP traffic between the two and
triggering an ath9k_queue_reset() and with added msleep()s between
disable_irq() and tasklet_disable() in ath_reset_internal().

Cc: Sven Eckelmann <sven@narfation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Linus Lüssing <linus.luessing@c0d3.blue>
Fixes: e3f31175a3ee ("ath9k: fix race condition in irq processing during hardware reset")
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
---
 drivers/net/wireless/ath/ath9k/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 98090e40e1cf..b9f9a8ae3b56 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -292,9 +292,9 @@ static int ath_reset_internal(struct ath_softc *sc, struct ath9k_channel *hchan)
 
 	__ath_cancel_work(sc);
 
-	disable_irq(sc->irq);
 	tasklet_disable(&sc->intr_tq);
 	tasklet_disable(&sc->bcon_tasklet);
+	disable_irq(sc->irq);
 	spin_lock_bh(&sc->sc_pcu_lock);
 
 	if (!sc->cur_chan->offchannel) {
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] ath9k: interrupt fixes on queue reset
  2021-09-14 19:25 [PATCH 0/3] ath9k: interrupt fixes on queue reset Linus Lüssing
                   ` (2 preceding siblings ...)
  2021-09-14 19:25 ` [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset Linus Lüssing
@ 2021-09-14 19:53 ` Toke Høiland-Jørgensen
  2021-09-15  9:23   ` Linus Lüssing
  2021-10-05 14:12 ` Linus Lüssing
  4 siblings, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-09-14 19:53 UTC (permalink / raw)
  To: Linus Lüssing, Kalle Valo, Felix Fietkau, Sujith Manoharan,
	ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel

Linus Lüssing <linus.luessing@c0d3.blue> writes:

> Hi,
>
> The following are two patches for ath9k to fix a potential interrupt
> storm (PATCH 2/3) and to fix potentially resetting the wifi chip while
> its interrupts were accidentally reenabled (PATCH 3/3).

Uhh, interesting - nice debugging work! What's the user-level symptom of
this? I.e., when this triggers does the device just appear to hang, or
does it cause reboots, or?

-Toke


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] ath9k: interrupt fixes on queue reset
  2021-09-14 19:53 ` [PATCH 0/3] ath9k: interrupt fixes on queue reset Toke Høiland-Jørgensen
@ 2021-09-15  9:23   ` Linus Lüssing
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Lüssing @ 2021-09-15  9:23 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel,
	linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel

Hi Toke,

On Tue, Sep 14, 2021 at 09:53:34PM +0200, Toke Høiland-Jørgensen wrote:
> Linus Lüssing <linus.luessing@c0d3.blue> writes:
> 
> > Hi,
> >
> > The following are two patches for ath9k to fix a potential interrupt
> > storm (PATCH 2/3) and to fix potentially resetting the wifi chip while
> > its interrupts were accidentally reenabled (PATCH 3/3).
> 
> Uhh, interesting - nice debugging work! What's the user-level symptom of
> this? I.e., when this triggers does the device just appear to hang, or
> does it cause reboots, or?
> 
> -Toke
> 

For PATCH 2/3 the user-level symptom was that the system would
hang for a few seconds and would then silently reboot without any notice
on the serial console. And after disabling CONFIG_ATH79_WDT the
system would "hang" indefinitely without any notice on the console
without a reboot (while JTAG/gdb showed that it was entering ath_isr()
again and again and wasn't doing anything else).

For PATCH 3/3 I don't have a specific user-level symptom. But from
looking at the git history it seemed to me that the ath9k hw
interrupts (AR_IER, AR_INTR_ASYNC_ENABLE and AR_INTR_ASYNC_ENABLE off)
should be disabled during a reset:

  4668cce527ac ath9k: disable the tasklet before taking the PCU lock
  eaf04a691566 ath9k: Disable beacon tasklet during reset
  872b5d814f99 ath9k: do not access hardware on IRQs during reset
  e3f31175a3ee ath9k: fix race condition in irq processing during hardware reset

Maybe someone else on these lists might know what issues this can
cause exactly?

Regards, Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset
  2021-09-14 19:25 ` [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset Linus Lüssing
@ 2021-09-15  9:48   ` Felix Fietkau
  2021-09-15 19:18     ` Linus Lüssing
  0 siblings, 1 reply; 11+ messages in thread
From: Felix Fietkau @ 2021-09-15  9:48 UTC (permalink / raw)
  To: Linus Lüssing, Kalle Valo, Sujith Manoharan, ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel, Linus Lüssing


On 2021-09-14 21:25, Linus Lüssing wrote:
> From: Linus Lüssing <ll@simonwunderlich.de>
> 
> There is a small risk of the ath9k hw interrupts being reenabled in the
> following way:
> 
> 1) ath_reset_internal()
>    ...
>    -> disable_irq()
>       ...
>       <- returns
> 
>                       2) ath9k_tasklet()
>                          ...
>                          -> ath9k_hw_resume_interrupts()
>                          ...
> 
> 1) ath_reset_internal() continued:
>    -> tasklet_disable(&sc->intr_tq); (= ath9k_tasklet() off)
> 
> By first disabling the ath9k interrupt there is a small window
> afterwards which allows ath9k hw interrupts being reenabled through
> the ath9k_tasklet() before we disable this tasklet in
> ath_reset_internal(). Leading to having the ath9k hw interrupts enabled
> during the reset, which we should avoid.
I don't see a way in which interrupts can be re-enabled through the
tasklet. disable_irq disables the entire PCI IRQ (not through ath9k hw
registers), and they will only be re-enabled by the corresponding
enable_irq call.

- Felix

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset
  2021-09-15  9:48   ` Felix Fietkau
@ 2021-09-15 19:18     ` Linus Lüssing
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Lüssing @ 2021-09-15 19:18 UTC (permalink / raw)
  To: Felix Fietkau
  Cc: Kalle Valo, Sujith Manoharan, ath9k-devel, linux-wireless,
	David S . Miller, Jakub Kicinski, John W . Linville,
	Felix Fietkau, Simon Wunderlich, Sven Eckelmann, netdev,
	linux-kernel, Linus Lüssing

On Wed, Sep 15, 2021 at 11:48:55AM +0200, Felix Fietkau wrote:
> 
> On 2021-09-14 21:25, Linus Lüssing wrote:
> > From: Linus Lüssing <ll@simonwunderlich.de>
> > 
> > There is a small risk of the ath9k hw interrupts being reenabled in the
> > following way:
> > 
> > 1) ath_reset_internal()
> >    ...
> >    -> disable_irq()
> >       ...
> >       <- returns
> > 
> >                       2) ath9k_tasklet()
> >                          ...
> >                          -> ath9k_hw_resume_interrupts()
> >                          ...
> > 
> > 1) ath_reset_internal() continued:
> >    -> tasklet_disable(&sc->intr_tq); (= ath9k_tasklet() off)
> > 
> > By first disabling the ath9k interrupt there is a small window
> > afterwards which allows ath9k hw interrupts being reenabled through
> > the ath9k_tasklet() before we disable this tasklet in
> > ath_reset_internal(). Leading to having the ath9k hw interrupts enabled
> > during the reset, which we should avoid.
> I don't see a way in which interrupts can be re-enabled through the
> tasklet. disable_irq disables the entire PCI IRQ (not through ath9k hw
> registers), and they will only be re-enabled by the corresponding
> enable_irq call.

Ah, okay, then I think I misunderstood the previous fixes to the
ath9k interrupt shutdown during reset here. I had only tested the
following diff and assumed that it were not okay to have the ath9k
hw interrupt registers enabled within the spinlock'd section:

```
@@ -299,11 +299,23 @@ static int ath_reset_internal(struct ath_softc *sc, struct ath9k_channel *hchan)
 
        __ath_cancel_work(sc);
 
        disable_irq(sc->irq);
+       for (r = 0; r < 200; r++) {
+               msleep(5);
+
+               if (REG_READ(ah, AR_INTR_SYNC_CAUSE) ||
+                   REG_READ(ah, AR_INTR_ASYNC_CAUSE)) {
+                       break;
+               }
+       }
        tasklet_disable(&sc->intr_tq);
        tasklet_disable(&sc->bcon_tasklet);
        spin_lock_bh(&sc->sc_pcu_lock);
 
+       if (REG_READ(ah, AR_INTR_SYNC_CAUSE) ||
+           REG_READ(ah, AR_INTR_ASYNC_CAUSE))
+               ATH_DBG_WARN(1, "%s: interrupts are enabled", __func__);
+
        if (!sc->cur_chan->offchannel) {
                fastcc = false;
                caldata = &sc->cur_chan->caldata;
```

And yes, the general ath9k interrupt should still be disabled. Didn't
test for that but seems like it.


(And now I noticed that actually
 ath_reset_internal()
 -> ath_prepare_reset()
   -> ath9k_hw_disable_interrupts()
     -> ath9k_hw_kill_interrupts()
 disables the ath9k hw interrupt registers before the
 ath_reset_internal()->ath9k_hw_reset() call anyway.)


What would you prefer, should this patch just be dropped or should
I resend it without the "Fixes:" line as a cosmetic change? (e.g. to
have the order in reverse to reenablement at the end of
ath_reset_internal(), to avoid confusion in the future?)

Regards, Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] ath9k: interrupt fixes on queue reset
  2021-09-14 19:25 [PATCH 0/3] ath9k: interrupt fixes on queue reset Linus Lüssing
                   ` (3 preceding siblings ...)
  2021-09-14 19:53 ` [PATCH 0/3] ath9k: interrupt fixes on queue reset Toke Høiland-Jørgensen
@ 2021-10-05 14:12 ` Linus Lüssing
  2021-10-05 14:24   ` Kalle Valo
  4 siblings, 1 reply; 11+ messages in thread
From: Linus Lüssing @ 2021-10-05 14:12 UTC (permalink / raw)
  To: Kalle Valo, Felix Fietkau, Sujith Manoharan, ath9k-devel
  Cc: linux-wireless, David S . Miller, Jakub Kicinski,
	John W . Linville, Felix Fietkau, Simon Wunderlich,
	Sven Eckelmann, netdev, linux-kernel

On Tue, Sep 14, 2021 at 09:25:12PM +0200, Linus Lüssing wrote:
> Hi,
> 
> The following are two patches for ath9k to fix a potential interrupt
> storm (PATCH 2/3) and to fix potentially resetting the wifi chip while
> its interrupts were accidentally reenabled (PATCH 3/3).
> 
> PATCH 1/3 adds the possibility to trigger the ath9k queue reset through
> the ath9k reset file in debugfs. Which was helpful to reproduce and debug
> this issue and might help for future debugging.
> 
> PATCH 2/3 and PATCH 3/3 should be applicable for stable.
> 
> Regards, Linus
> 

I've marked PATCH 3/3 as "rejected" in Patchwork now due to
Felix's legitimate remarks. For patches 1/3 and and 2/3 I'd
still like to see them merged upstream if there is no objection
to those.

Regars, Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] ath9k: interrupt fixes on queue reset
  2021-10-05 14:12 ` Linus Lüssing
@ 2021-10-05 14:24   ` Kalle Valo
  0 siblings, 0 replies; 11+ messages in thread
From: Kalle Valo @ 2021-10-05 14:24 UTC (permalink / raw)
  To: Linus Lüssing
  Cc: Felix Fietkau, ath9k-devel, linux-wireless, David S . Miller,
	Jakub Kicinski, John W . Linville, Felix Fietkau,
	Simon Wunderlich, Sven Eckelmann, netdev, linux-kernel

Linus Lüssing <linus.luessing@c0d3.blue> writes:

> On Tue, Sep 14, 2021 at 09:25:12PM +0200, Linus Lüssing wrote:
>> Hi,
>> 
>> The following are two patches for ath9k to fix a potential interrupt
>> storm (PATCH 2/3) and to fix potentially resetting the wifi chip while
>> its interrupts were accidentally reenabled (PATCH 3/3).
>> 
>> PATCH 1/3 adds the possibility to trigger the ath9k queue reset through
>> the ath9k reset file in debugfs. Which was helpful to reproduce and debug
>> this issue and might help for future debugging.
>> 
>> PATCH 2/3 and PATCH 3/3 should be applicable for stable.
>> 
>> Regards, Linus
>> 
>
> I've marked PATCH 3/3 as "rejected" in Patchwork now due to
> Felix's legitimate remarks.

BTW I prefer to mark patches as rejected myself in patchwork so that I
know what's happening (patchwork is lacking in this respect as it
doesn't notify me if there are any changes in patches). But good that
you mentioned this via email so I didn't need to wonder what happened.

> For patches 1/3 and and 2/3 I'd still like to see them merged upstream
> if there is no objection to those.

Thanks, I was about to ask what I should do with this patchset.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs
  2021-09-14 19:25 ` [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs Linus Lüssing
@ 2021-10-05 14:27   ` Kalle Valo
  0 siblings, 0 replies; 11+ messages in thread
From: Kalle Valo @ 2021-10-05 14:27 UTC (permalink / raw)
  To: Linus Lüssing
  Cc: Felix Fietkau, Sujith Manoharan, ath9k-devel, linux-wireless,
	David S . Miller, Jakub Kicinski, John W . Linville,
	Felix Fietkau, Simon Wunderlich, Sven Eckelmann, netdev,
	linux-kernel, Linus Lüssing

Linus Lüssing <linus.luessing@c0d3.blue> wrote:

> Sometimes, in yet unknown cases the wifi chip stops working. To allow a
> watchdog in userspace to easily and quickly reset the wifi chip, add the
> according functionality to userspace. A reset can then be triggered
> via:
> 
>   $ echo 1 > /sys/kernel/debug/ieee80211/phy0/ath9k/reset
> 
> The number of user resets can further be tracked in the row "User reset"
> in the same file.
> 
> So far people usually used "iw scan" to fix ath9k chip hangs from
> userspace. Which triggers the ath9k_queue_reset(), too. The reset file
> however has the advantage of less overhead, which makes debugging bugs
> within ath9k_queue_reset() easier.
> 
> Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

2 patches applied to ath-next branch of ath.git, thanks.

053f9852b95e ath9k: add option to reset the wifi chip via debugfs
4925642d5412 ath9k: Fix potential interrupt storm on queue reset

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20210914192515.9273-2-linus.luessing@c0d3.blue/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-10-05 14:27 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-14 19:25 [PATCH 0/3] ath9k: interrupt fixes on queue reset Linus Lüssing
2021-09-14 19:25 ` [PATCH 1/3] ath9k: add option to reset the wifi chip via debugfs Linus Lüssing
2021-10-05 14:27   ` Kalle Valo
2021-09-14 19:25 ` [PATCH 2/3] ath9k: Fix potential interrupt storm on queue reset Linus Lüssing
2021-09-14 19:25 ` [PATCH 3/3] ath9k: Fix potential hw interrupt resume during reset Linus Lüssing
2021-09-15  9:48   ` Felix Fietkau
2021-09-15 19:18     ` Linus Lüssing
2021-09-14 19:53 ` [PATCH 0/3] ath9k: interrupt fixes on queue reset Toke Høiland-Jørgensen
2021-09-15  9:23   ` Linus Lüssing
2021-10-05 14:12 ` Linus Lüssing
2021-10-05 14:24   ` Kalle Valo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).