* [PATCH v3 0/2] mwifiex: Work around firmware bugs on 88W8897 chip
@ 2021-10-11 13:32 Jonas Dreßler
  2021-10-11 13:32 ` [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer Jonas Dreßler
  2021-10-11 13:32 ` [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt Jonas Dreßler
  0 siblings, 2 replies; 5+ messages in thread
From: Jonas Dreßler @ 2021-10-11 13:32 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: Jonas Dreßler, Tsuchiya Yuto, linux-wireless, netdev,
	linux-kernel, linux-pci, Maximilian Luz, Andy Shevchenko,
	Bjorn Helgaas, Pali Rohár, Heiner Kallweit, Johannes Berg,
	Brian Norris, David Laight

This is the third revision of this patch series. Here are v1 and v2:
v1: https://lore.kernel.org/linux-wireless/20210830123704.221494-1-verdre@v0yd.nl/
v2: https://lore.kernel.org/linux-wireless/20210914114813.15404-1-verdre@v0yd.nl/

Changes between v2 and v3:
 - Use consistent terminology (PCIe, USB)
 - Read a generic register (PCI_VENDOR_ID) in the first patch, since it's not
   the actual readback that fixes the crash. I decided against using usleep()
   because reading a register has proven to work on lots of devices for a few
   months now, and usleep() only appears to work when a certain duration is used.
 - Use read_poll_timeout() for wakeup patch

Jonas Dreßler (2):
  mwifiex: Read a PCI register after writing the TX ring write pointer
  mwifiex: Try waking the firmware until we get an interrupt

 drivers/net/wireless/marvell/mwifiex/pcie.c | 36 ++++++++++++++++++---
 1 file changed, 31 insertions(+), 5 deletions(-)

-- 
2.31.1



* [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer
  2021-10-11 13:32 [PATCH v3 0/2] mwifiex: Work around firmware bugs on 88W8897 chip Jonas Dreßler
@ 2021-10-11 13:32 ` Jonas Dreßler
  2021-10-18 12:30   ` Kalle Valo
  2021-10-11 13:32 ` [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt Jonas Dreßler
  1 sibling, 1 reply; 5+ messages in thread
From: Jonas Dreßler @ 2021-10-11 13:32 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: Jonas Dreßler, Tsuchiya Yuto, linux-wireless, netdev,
	linux-kernel, linux-pci, Maximilian Luz, Andy Shevchenko,
	Bjorn Helgaas, Pali Rohár, Heiner Kallweit, Johannes Berg,
	Brian Norris, David Laight, stable

On the 88W8897 PCIe+USB card the firmware randomly crashes after setting
the TX ring write pointer. The issue is present in the latest firmware
version 15.68.19.p21 of the PCIe+USB card.

Those firmware crashes can be worked around by reading any PCI register
of the card after setting the write pointer, so read the PCI_VENDOR_ID
register here. This probably works because the read keeps the bus from
entering an ASPM state for a bit longer, and entering that state appears
to be what causes the card's firmware to crash.

This fixes a bug where during RX/TX traffic and with ASPM L1 substates
enabled (the specific substates where the issue happens appear to be
platform dependent), the firmware crashes and eventually a command
timeout appears in the logs.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index c6ccce426b49..641fa539de1f 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -1490,6 +1490,14 @@ mwifiex_pcie_send_data(struct mwifiex_adapter *adapter, struct sk_buff *skb,
 			ret = -1;
 			goto done_unmap;
 		}
+
+		/* The firmware (latest version 15.68.19.p21) of the 88W8897 PCIe+USB card
+		 * seems to crash randomly after setting the TX ring write pointer when
+		 * ASPM powersaving is enabled. A workaround seems to be keeping the bus
+		 * busy by reading a random register afterwards.
+		 */
+		mwifiex_read_reg(adapter, PCI_VENDOR_ID, &rx_val);
+
 		if ((mwifiex_pcie_txbd_not_full(card)) &&
 		    tx_param->next_pkt_len) {
 			/* have more packets and TxBD still can hold more */
-- 
2.31.1
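
The order of operations in the patch above can be modelled in user space.
This is only a sketch under stated assumptions: struct fake_card,
fake_write_reg(), fake_read_reg(), publish_tx_wrptr() and the
FAKE_REG_TX_WRPTR offset are hypothetical names that do not exist in
mwifiex. The only things taken from the patch are the sequence (write the
TX ring write pointer, then do a dummy read) and the offset 0x00, which is
the value of PCI_VENDOR_ID.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_NUM_REGS      64
#define FAKE_REG_VENDOR_ID 0x00 /* same offset as PCI_VENDOR_ID */
#define FAKE_REG_TX_WRPTR  0x10 /* made-up offset, for illustration only */

/* Hypothetical stand-in for the card's register window. A real driver
 * would access a mapped BAR with ioread32()/iowrite32() instead.
 */
struct fake_card {
	uint32_t regs[FAKE_NUM_REGS]; /* indexed by byte offset / 4 */
	unsigned int reads;           /* read accesses seen on the "bus" */
	unsigned int writes;          /* write accesses seen on the "bus" */
};

void fake_write_reg(struct fake_card *card, uint32_t off, uint32_t val)
{
	assert(off % 4 == 0 && off / 4 < FAKE_NUM_REGS);
	card->regs[off / 4] = val;
	card->writes++;
}

uint32_t fake_read_reg(struct fake_card *card, uint32_t off)
{
	assert(off % 4 == 0 && off / 4 < FAKE_NUM_REGS);
	card->reads++;
	return card->regs[off / 4];
}

/* Publish a new TX write pointer, then issue a dummy read. The value
 * that comes back is deliberately ignored; only the bus access matters.
 */
void publish_tx_wrptr(struct fake_card *card, uint32_t wrptr)
{
	fake_write_reg(card, FAKE_REG_TX_WRPTR, wrptr);
	(void)fake_read_reg(card, FAKE_REG_VENDOR_ID);
}
```

Every call to publish_tx_wrptr() produces exactly one write followed by
one read, which is the access pattern the commit message describes as
keeping the link busy for a little longer.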



* [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt
  2021-10-11 13:32 [PATCH v3 0/2] mwifiex: Work around firmware bugs on 88W8897 chip Jonas Dreßler
  2021-10-11 13:32 ` [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer Jonas Dreßler
@ 2021-10-11 13:32 ` Jonas Dreßler
  2021-10-26  2:44   ` kernel test robot
  1 sibling, 1 reply; 5+ messages in thread
From: Jonas Dreßler @ 2021-10-11 13:32 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: Jonas Dreßler, Tsuchiya Yuto, linux-wireless, netdev,
	linux-kernel, linux-pci, Maximilian Luz, Andy Shevchenko,
	Bjorn Helgaas, Pali Rohár, Heiner Kallweit, Johannes Berg,
	Brian Norris, David Laight, stable

It seems that the PCIe+USB firmware (latest version 15.68.19.p21) of the
88W8897 card sometimes ignores or misses when we try to wake it up by
writing to the firmware status register. This leads to the firmware
wakeup timeout expiring and the driver resetting the card because we
assume the firmware has hung up or crashed.

Turns out that the firmware actually didn't hang up, but simply "missed"
our wakeup request and didn't send us an interrupt with an AWAKE event.

Writing to the firmware status register again after a short timeout
usually makes the firmware wake up as expected, so add a small retry
loop to mwifiex_pm_wakeup_card() that looks at the interrupt status to
check whether the card woke up.

The number of tries and timeout lengths for this were determined
experimentally: The firmware usually takes about 500 us to wake up
after we write to the status register. In some cases where the
firmware is very busy (for example while doing a Bluetooth scan) it
might even miss our requests for multiple milliseconds, which is why
after 15 tries the waiting time gets increased to 10 ms. The maximum
number of tries it took to wake the firmware when testing this was
around 20, so a maximum of 50 tries should give us plenty of
safety margin.

Here's a reproducer for those firmware wakeup failures I've found:

1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on)
2) Connect to any wifi network (makes the firmware go into wifi
   powersaving mode, not deep sleep)
3) Make sure Bluetooth is turned off (to ensure the firmware actually
   enters powersave mode and doesn't keep the radio active doing
   Bluetooth stuff)
4) To confirm that wifi powersaving is entered, ping a device on the LAN;
   ping times should be a few ms higher than without powersaving
5) Run "while true; do iwconfig; sleep 0.0001; done"; this wakes and
   suspends the firmware extremely often
6) Wait until things explode; for me it consistently takes <5 minutes

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 28 +++++++++++++++++----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 641fa539de1f..c3f5583ea70d 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -17,6 +17,7 @@
  * this warranty disclaimer.
  */
 
+#include <linux/iopoll.h>
 #include <linux/firmware.h>
 
 #include "decl.h"
@@ -647,11 +648,15 @@ static void mwifiex_delay_for_sleep_cookie(struct mwifiex_adapter *adapter,
 			    "max count reached while accessing sleep cookie\n");
 }
 
+#define N_WAKEUP_TRIES_SHORT_INTERVAL 15
+#define N_WAKEUP_TRIES_LONG_INTERVAL 35
+
 /* This function wakes up the card by reading fw_status register. */
 static int mwifiex_pm_wakeup_card(struct mwifiex_adapter *adapter)
 {
 	struct pcie_service_card *card = adapter->card;
 	const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
+	int retval;
 
 	mwifiex_dbg(adapter, EVENT,
 		    "event: Wakeup device...\n");
@@ -659,11 +664,24 @@ static int mwifiex_pm_wakeup_card(struct mwifiex_adapter *adapter)
 	if (reg->sleep_cookie)
 		mwifiex_pcie_dev_wakeup_delay(adapter);
 
-	/* Accessing fw_status register will wakeup device */
-	if (mwifiex_write_reg(adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
-		mwifiex_dbg(adapter, ERROR,
-			    "Writing fw_status register failed\n");
-		return -1;
+	/* The 88W8897 PCIe+USB firmware (latest version 15.68.19.p21) sometimes
+	 * appears to ignore or miss our wakeup request, so we continue trying
+	 * until we receive an interrupt from the card.
+	 */
+	if (read_poll_timeout(mwifiex_write_reg, retval,
+			      READ_ONCE(adapter->int_status) != 0,
+			      500, 500 * N_WAKEUP_TRIES_SHORT_INTERVAL,
+			      false,
+			      adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
+		if (read_poll_timeout(mwifiex_write_reg, retval,
+				      READ_ONCE(adapter->int_status) != 0,
+				      10000, 10000 * N_WAKEUP_TRIES_LONG_INTERVAL,
+				      false,
+				      adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
+			mwifiex_dbg(adapter, ERROR,
+				    "Firmware didn't wake up\n");
+			return -EIO;
+		}
 	}
 
 	if (reg->sleep_cookie) {
-- 
2.31.1
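
The two-phase retry in the patch above can be approximated in user space
to see how the numbers add up. This is a simplified model, not the
driver's logic: struct fake_fw, fake_wakeup_write(), poll_wakeup() and
wakeup_card() are hypothetical, and the model counts tries, whereas
read_poll_timeout() is bounded by elapsed time, so the real number of
tries per phase is only approximately 15 and 35. With the values from the
patch, the worst-case sleep budget is 15 * 500 us + 35 * 10 ms = 357.5 ms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_WAKEUP_TRIES_SHORT_INTERVAL 15
#define N_WAKEUP_TRIES_LONG_INTERVAL  35

/* Hypothetical firmware model: it raises its interrupt only after it
 * has seen `wake_after` wakeup writes.
 */
struct fake_fw {
	unsigned int wake_after;  /* writes needed before it responds */
	unsigned int writes_seen; /* wakeup writes issued so far */
	uint64_t elapsed_us;      /* simulated time spent sleeping */
	bool int_status;          /* set once the wakeup interrupt fired */
};

void fake_wakeup_write(struct fake_fw *fw)
{
	fw->writes_seen++;
	if (fw->writes_seen >= fw->wake_after)
		fw->int_status = true;
}

/* Issue up to `tries` wakeup writes with `interval_us` between them.
 * Returns true as soon as the interrupt flag is observed.
 */
bool poll_wakeup(struct fake_fw *fw, unsigned int tries, uint64_t interval_us)
{
	assert(tries > 0);
	for (unsigned int i = 0; i < tries; i++) {
		fake_wakeup_write(fw);
		if (fw->int_status)
			return true;
		fw->elapsed_us += interval_us; /* stand-in for sleeping */
	}
	return false;
}

/* Phase 1: short 500 us interval. Phase 2: long 10 ms interval. */
int wakeup_card(struct fake_fw *fw)
{
	if (poll_wakeup(fw, N_WAKEUP_TRIES_SHORT_INTERVAL, 500))
		return 0;
	if (poll_wakeup(fw, N_WAKEUP_TRIES_LONG_INTERVAL, 10000))
		return 0;
	return -1; /* the patch returns -EIO at this point */
}
```

In this model a firmware that needs 20 writes, roughly the worst case
reported in the commit message, wakes up in the second phase after
47.5 ms of simulated sleeping, well inside the 357.5 ms budget.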



* Re: [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer
  2021-10-11 13:32 ` [PATCH v3 1/2] mwifiex: Read a PCI register after writing the TX ring write pointer Jonas Dreßler
@ 2021-10-18 12:30   ` Kalle Valo
  0 siblings, 0 replies; 5+ messages in thread
From: Kalle Valo @ 2021-10-18 12:30 UTC (permalink / raw)
  To: Jonas Dreßler
  Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, David S. Miller,
	Jakub Kicinski, Jonas Dreßler, Tsuchiya Yuto,
	linux-wireless, netdev, linux-kernel, linux-pci, Maximilian Luz,
	Andy Shevchenko, Bjorn Helgaas, Pali Rohár, Heiner Kallweit,
	Johannes Berg, Brian Norris, David Laight, stable

Jonas Dreßler <verdre@v0yd.nl> wrote:

> On the 88W8897 PCIe+USB card the firmware randomly crashes after setting
> the TX ring write pointer. The issue is present in the latest firmware
> version 15.68.19.p21 of the PCIe+USB card.
> 
> Those firmware crashes can be worked around by reading any PCI register
> of the card after setting the write pointer, so read the PCI_VENDOR_ID
> register here. This probably works because the read keeps the bus from
> entering an ASPM state for a bit longer, and entering that state appears
> to be what causes the card's firmware to crash.
> 
> This fixes a bug where during RX/TX traffic and with ASPM L1 substates
> enabled (the specific substates where the issue happens appear to be
> platform dependent), the firmware crashes and eventually a command
> timeout appears in the logs.
> 
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
> Cc: stable@vger.kernel.org
> Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>

2 patches applied to wireless-drivers-next.git, thanks.

e5f4eb8223aa mwifiex: Read a PCI register after writing the TX ring write pointer
8e3e59c31fea mwifiex: Try waking the firmware until we get an interrupt

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20211011133224.15561-2-verdre@v0yd.nl/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



* Re: [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt
  2021-10-11 13:32 ` [PATCH v3 2/2] mwifiex: Try waking the firmware until we get an interrupt Jonas Dreßler
@ 2021-10-26  2:44   ` kernel test robot
  0 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2021-10-26  2:44 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3697 bytes --]

Hi Jonas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvalo-wireless-drivers-next/master]
[also build test WARNING on v5.15-rc7]
[cannot apply to next-20211025]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jonas-Dre-ler/mwifiex-Work-around-firmware-bugs-on-88W8897-chip/20211011-213355
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git master
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/e0e25bbda88f5c6c729414fb18ede64aa80d4032
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jonas-Dre-ler/mwifiex-Work-around-firmware-bugs-on-88W8897-chip/20211011-213355
        git checkout e0e25bbda88f5c6c729414fb18ede64aa80d4032
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/net/wireless/marvell/mwifiex/pcie.c: In function 'mwifiex_pm_wakeup_card':
>> drivers/net/wireless/marvell/mwifiex/pcie.c:659:13: warning: variable 'retval' set but not used [-Wunused-but-set-variable]
     659 |         int retval;
         |             ^~~~~~


vim +/retval +659 drivers/net/wireless/marvell/mwifiex/pcie.c

   653	
   654	/* This function wakes up the card by reading fw_status register. */
   655	static int mwifiex_pm_wakeup_card(struct mwifiex_adapter *adapter)
   656	{
   657		struct pcie_service_card *card = adapter->card;
   658		const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
 > 659		int retval;
   660	
   661		mwifiex_dbg(adapter, EVENT,
   662			    "event: Wakeup device...\n");
   663	
   664		if (reg->sleep_cookie)
   665			mwifiex_pcie_dev_wakeup_delay(adapter);
   666	
   667		/* The 88W8897 PCIe+USB firmware (latest version 15.68.19.p21) sometimes
   668		 * appears to ignore or miss our wakeup request, so we continue trying
   669		 * until we receive an interrupt from the card.
   670		 */
   671		if (read_poll_timeout(mwifiex_write_reg, retval,
   672				      READ_ONCE(adapter->int_status) != 0,
   673				      500, 500 * N_WAKEUP_TRIES_SHORT_INTERVAL,
   674				      false,
   675				      adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
   676			if (read_poll_timeout(mwifiex_write_reg, retval,
   677					      READ_ONCE(adapter->int_status) != 0,
   678					      10000, 10000 * N_WAKEUP_TRIES_LONG_INTERVAL,
   679					      false,
   680					      adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
   681				mwifiex_dbg(adapter, ERROR,
   682					    "Firmware didn't wake up\n");
   683				return -EIO;
   684			}
   685		}
   686	
   687		if (reg->sleep_cookie) {
   688			mwifiex_pcie_dev_wakeup_delay(adapter);
   689			mwifiex_dbg(adapter, INFO,
   690				    "PCIE wakeup: Setting PS_STATE_AWAKE\n");
   691			adapter->ps_state = PS_STATE_AWAKE;
   692		}
   693	
   694		return 0;
   695	}
   696	
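
The warning above is raised because retval is assigned by the polling
macro but never read afterwards. One possible way to address it, shown
here as a user-space sketch and not necessarily what was merged, is to
actually consume the value by stopping the loop when the register write
itself fails. struct fake_dev, fake_write_reg() and poll_until_awake()
are hypothetical, and the return codes -1 (write failed) and -2 (never
woke up) are arbitrary placeholders.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model used only for this illustration. */
struct fake_dev {
	bool write_fails;         /* simulate a failing register write */
	unsigned int wake_after;  /* writes needed before int_status is set */
	unsigned int writes_seen; /* successful writes so far */
	bool int_status;          /* set once the device has "woken up" */
};

/* Returns 0 on success, -1 if the simulated write fails. */
int fake_write_reg(struct fake_dev *dev)
{
	if (dev->write_fails)
		return -1;
	dev->writes_seen++;
	if (dev->writes_seen >= dev->wake_after)
		dev->int_status = true;
	return 0;
}

/* Retry the wakeup write up to `max_tries` times. Because retval is
 * read in the error check, the compiler no longer sees a variable that
 * is set but never used.
 */
int poll_until_awake(struct fake_dev *dev, unsigned int max_tries)
{
	assert(max_tries > 0);
	for (unsigned int i = 0; i < max_tries; i++) {
		int retval = fake_write_reg(dev);

		if (retval)
			return retval; /* write error: give up early */
		if (dev->int_status)
			return 0;      /* device reported it is awake */
	}
	return -2;                     /* ran out of tries */
}
```

A side effect of this variant is that a failing register write is
reported immediately instead of being retried until the timeout expires.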

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 55810 bytes --]
