linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/2] mwifiex: Work around firmware bugs on 88W8897 chip
@ 2021-10-11 13:32 Jonas Dreßler
  2021-10-11 13:32 ` [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer Jonas Dreßler
  2021-10-11 13:32 ` [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt Jonas Dreßler
  0 siblings, 2 replies; 4+ messages in thread
From: Jonas Dreßler @ 2021-10-11 13:32 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: Jonas Dreßler, Tsuchiya Yuto, linux-wireless, netdev,
	linux-kernel, linux-pci, Maximilian Luz, Andy Shevchenko,
	Bjorn Helgaas, Pali Rohár, Heiner Kallweit, Johannes Berg,
	Brian Norris, David Laight

This is the third revision of this patch series. Here are v1 and v2:
v1: https://lore.kernel.org/linux-wireless/20210830123704.221494-1-verdre@v0yd.nl/
v2: https://lore.kernel.org/linux-wireless/20210914114813.15404-1-verdre@v0yd.nl/

Changes between v2 and v3:
 - Use consistent terminology (PCIe, USB)
 - Read a generic register (PCI_VENDOR_ID) in the first patch, since it's not
   the actual readback that fixes the crash. I decided against using usleep()
   because reading a register has proven to work on lots of devices for a few
   months now, and usleep() only appears to work when a certain duration is used.
 - Use read_poll_timeout() for the wakeup patch

Jonas Dreßler (2):
  mwifiex: Read a PCI register after writing the TX ring write pointer
  mwifiex: Try waking the firmware until we get an interrupt

 drivers/net/wireless/marvell/mwifiex/pcie.c | 36 ++++++++++++++++++---
 1 file changed, 31 insertions(+), 5 deletions(-)

-- 
2.31.1



* [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer
  2021-10-11 13:32 [PATCH v3 0/2] mwifiex: Work around firmware bugs on 88W8897 chip Jonas Dreßler
@ 2021-10-11 13:32 ` Jonas Dreßler
  2021-10-18 12:30   ` Kalle Valo
  2021-10-11 13:32 ` [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt Jonas Dreßler
  1 sibling, 1 reply; 4+ messages in thread
From: Jonas Dreßler @ 2021-10-11 13:32 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: Jonas Dreßler, Tsuchiya Yuto, linux-wireless, netdev,
	linux-kernel, linux-pci, Maximilian Luz, Andy Shevchenko,
	Bjorn Helgaas, Pali Rohár, Heiner Kallweit, Johannes Berg,
	Brian Norris, David Laight, stable

On the 88W8897 PCIe+USB card the firmware randomly crashes after setting
the TX ring write pointer. The issue is present in the latest firmware
version 15.68.19.p21 of the PCIe+USB card.

Those firmware crashes can be worked around by reading any PCI register
of the card after setting that register, so read the PCI_VENDOR_ID
register here. This probably works because the read keeps the bus from
entering an ASPM state for a bit longer, and entering that state is
what causes the card's firmware to crash.

This fixes a bug where during RX/TX traffic and with ASPM L1 substates
enabled (the specific substates where the issue happens appear to be
platform dependent), the firmware crashes and eventually a command
timeout appears in the logs.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index c6ccce426b49..641fa539de1f 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -1490,6 +1490,14 @@ mwifiex_pcie_send_data(struct mwifiex_adapter *adapter, struct sk_buff *skb,
 			ret = -1;
 			goto done_unmap;
 		}
+
+		/* The firmware (latest version 15.68.19.p21) of the 88W8897 PCIe+USB card
+		 * seems to crash randomly after setting the TX ring write pointer when
+		 * ASPM powersaving is enabled. A workaround seems to be keeping the bus
+		 * busy by reading a random register afterwards.
+		 */
+		mwifiex_read_reg(adapter, PCI_VENDOR_ID, &rx_val);
+
 		if ((mwifiex_pcie_txbd_not_full(card)) &&
 		    tx_param->next_pkt_len) {
 			/* have more packets and TxBD still can hold more */
-- 
2.31.1



* [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt
  2021-10-11 13:32 [PATCH v3 0/2] mwifiex: Work around firmware bugs on 88W8897 chip Jonas Dreßler
  2021-10-11 13:32 ` [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer Jonas Dreßler
@ 2021-10-11 13:32 ` Jonas Dreßler
  1 sibling, 0 replies; 4+ messages in thread
From: Jonas Dreßler @ 2021-10-11 13:32 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: Jonas Dreßler, Tsuchiya Yuto, linux-wireless, netdev,
	linux-kernel, linux-pci, Maximilian Luz, Andy Shevchenko,
	Bjorn Helgaas, Pali Rohár, Heiner Kallweit, Johannes Berg,
	Brian Norris, David Laight, stable

It seems that the PCIe+USB firmware (latest version 15.68.19.p21) of the
88W8897 card sometimes ignores or misses when we try to wake it up by
writing to the firmware status register. This leads to the firmware
wakeup timeout expiring and the driver resetting the card because we
assume the firmware has hung up or crashed.

Turns out that the firmware actually didn't hang up, but simply "missed"
our wakeup request and didn't send us an interrupt with an AWAKE event.

Writing to the firmware status register again after a short timeout
usually makes the firmware wake up as expected, so add a small retry
loop to mwifiex_pm_wakeup_card() that looks at the interrupt status to
check whether the card woke up.

The number of tries and the timeout lengths were determined
experimentally: The firmware usually takes about 500 us to wake up
after we write to the status register. In some cases where the
firmware is very busy (for example while doing a Bluetooth scan) it
might even miss our requests for multiple milliseconds, which is why
after 15 tries the waiting time gets increased to 10 ms. The maximum
number of tries it took to wake the firmware when testing this was
around 20, so a maximum of 50 tries should give us plenty of safety
margin.
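
The retry schedule described above can be sketched in plain userspace C.
This is only a simplified model, not the driver code: the two try counts
are the ones used in the patch, but struct fake_fw, fake_fw_wake(),
wake_with_retries() and the two *_INTERVAL_US names are hypothetical and
exist only for this illustration. The model counts tries and adds up the
waiting time instead of sleeping, whereas read_poll_timeout() in the
patch is bounded by elapsed time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Try counts as defined in the patch. */
#define N_WAKEUP_TRIES_SHORT_INTERVAL 15
#define N_WAKEUP_TRIES_LONG_INTERVAL 35

/* Hypothetical names for the two intervals used in the patch. */
#define SHORT_INTERVAL_US 500
#define LONG_INTERVAL_US 10000

/* Hypothetical stand-in for the firmware: it misses the first
 * `ignored` wakeup requests and reacts to the next one.
 */
struct fake_fw {
	int ignored;	/* wakeup requests the firmware will miss */
	int requests;	/* wakeup requests issued so far */
};

/* Issue one wakeup request; returns true once the firmware reacts. */
static bool fake_fw_wake(struct fake_fw *fw)
{
	fw->requests++;
	return fw->requests > fw->ignored;
}

/* Two-phase retry: up to 15 tries spaced 500 us apart, then up to
 * 35 tries spaced 10 ms apart. Returns the simulated waiting time in
 * microseconds, or -1 if the firmware never woke up.
 */
long wake_with_retries(struct fake_fw *fw)
{
	long waited_us = 0;
	int i;

	assert(fw != NULL);

	for (i = 0; i < N_WAKEUP_TRIES_SHORT_INTERVAL; i++) {
		if (fake_fw_wake(fw))
			return waited_us;
		waited_us += SHORT_INTERVAL_US;
	}

	for (i = 0; i < N_WAKEUP_TRIES_LONG_INTERVAL; i++) {
		if (fake_fw_wake(fw))
			return waited_us;
		waited_us += LONG_INTERVAL_US;
	}

	return -1;
}
```

In this model, a firmware that misses 20 requests wakes up on the 21st
try after 15 * 500 us + 5 * 10 ms = 57.5 ms of waiting. A firmware that
misses all 50 requests makes the function give up after a total budget
of 15 * 500 us + 35 * 10 ms = 357.5 ms.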

Here's a reproducer for those firmware wakeup failures I've found:

1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on)
2) Connect to any wifi network (makes firmware go into wifi powersaving
mode, not deep sleep)
3) Make sure bluetooth is turned off (to ensure the firmware actually
enters powersave mode and doesn't keep the radio active doing bluetooth
stuff)
4) To confirm that wifi powersaving is entered, ping a device on the LAN;
ping times should be a few ms higher than without powersaving
5) Run "while true; do iwconfig; sleep 0.0001; done", this wakes and
suspends the firmware extremely often
6) Wait until things explode, for me it consistently takes <5 minutes

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 28 +++++++++++++++++----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 641fa539de1f..c3f5583ea70d 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -17,6 +17,7 @@
  * this warranty disclaimer.
  */
 
+#include <linux/iopoll.h>
 #include <linux/firmware.h>
 
 #include "decl.h"
@@ -647,11 +648,15 @@ static void mwifiex_delay_for_sleep_cookie(struct mwifiex_adapter *adapter,
 			    "max count reached while accessing sleep cookie\n");
 }
 
+#define N_WAKEUP_TRIES_SHORT_INTERVAL 15
+#define N_WAKEUP_TRIES_LONG_INTERVAL 35
+
 /* This function wakes up the card by reading fw_status register. */
 static int mwifiex_pm_wakeup_card(struct mwifiex_adapter *adapter)
 {
 	struct pcie_service_card *card = adapter->card;
 	const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
+	int retval;
 
 	mwifiex_dbg(adapter, EVENT,
 		    "event: Wakeup device...\n");
@@ -659,11 +664,24 @@ static int mwifiex_pm_wakeup_card(struct mwifiex_adapter *adapter)
 	if (reg->sleep_cookie)
 		mwifiex_pcie_dev_wakeup_delay(adapter);
 
-	/* Accessing fw_status register will wakeup device */
-	if (mwifiex_write_reg(adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
-		mwifiex_dbg(adapter, ERROR,
-			    "Writing fw_status register failed\n");
-		return -1;
+	/* The 88W8897 PCIe+USB firmware (latest version 15.68.19.p21) sometimes
+	 * appears to ignore or miss our wakeup request, so we continue trying
+	 * until we receive an interrupt from the card.
+	 */
+	if (read_poll_timeout(mwifiex_write_reg, retval,
+			      READ_ONCE(adapter->int_status) != 0,
+			      500, 500 * N_WAKEUP_TRIES_SHORT_INTERVAL,
+			      false,
+			      adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
+		if (read_poll_timeout(mwifiex_write_reg, retval,
+				      READ_ONCE(adapter->int_status) != 0,
+				      10000, 10000 * N_WAKEUP_TRIES_LONG_INTERVAL,
+				      false,
+				      adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
+			mwifiex_dbg(adapter, ERROR,
+				    "Firmware didn't wake up\n");
+			return -EIO;
+		}
 	}
 
 	if (reg->sleep_cookie) {
-- 
2.31.1



* Re: [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer
  2021-10-11 13:32 ` [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer Jonas Dreßler
@ 2021-10-18 12:30   ` Kalle Valo
  0 siblings, 0 replies; 4+ messages in thread
From: Kalle Valo @ 2021-10-18 12:30 UTC (permalink / raw)
  To: Jonas Dreßler
  Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, David S. Miller,
	Jakub Kicinski, Jonas Dreßler, Tsuchiya Yuto,
	linux-wireless, netdev, linux-kernel, linux-pci, Maximilian Luz,
	Andy Shevchenko, Bjorn Helgaas, Pali Rohár, Heiner Kallweit,
	Johannes Berg, Brian Norris, David Laight, stable

Jonas Dreßler <verdre@v0yd.nl> wrote:

> On the 88W8897 PCIe+USB card the firmware randomly crashes after setting
> the TX ring write pointer. The issue is present in the latest firmware
> version 15.68.19.p21 of the PCIe+USB card.
> 
> Those firmware crashes can be worked around by reading any PCI register
> of the card after setting that register, so read the PCI_VENDOR_ID
> register here. The reason this works is probably because we keep the bus
> from entering an ASPM state for a bit longer, because that's what causes
> the cards firmware to crash.
> 
> This fixes a bug where during RX/TX traffic and with ASPM L1 substates
> enabled (the specific substates where the issue happens appear to be
> platform dependent), the firmware crashes and eventually a command
> timeout appears in the logs.
> 
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
> Cc: stable@vger.kernel.org
> Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>

2 patches applied to wireless-drivers-next.git, thanks.

e5f4eb8223aa mwifiex: Read a PCI register after writing the TX ring write pointer
8e3e59c31fea mwifiex: Try waking the firmware until we get an interrupt

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20211011133224.15561-2-verdre@v0yd.nl/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


