All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mwifiex: pcie: add enable_device_dump module parameter
@ 2020-10-28 14:26 Tsuchiya Yuto
  2020-10-29  0:12 ` Brian Norris
  0 siblings, 1 reply; 5+ messages in thread
From: Tsuchiya Yuto @ 2020-10-28 14:26 UTC (permalink / raw)
  To: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski
  Cc: linux-wireless, netdev, linux-kernel, Maximilian Luz,
	Andy Shevchenko, verdre, Tsuchiya Yuto

The devicve_dump may take a little bit long time and users may want to
disable the dump for daily usage.

This commit adds a new module parameter enable_device_dump and disables
the device_dump by default.

Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 6a10ff0377a24..8254e06fb22ce 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -33,6 +33,11 @@
 
 static struct mwifiex_if_ops pcie_ops;
 
+static bool enable_device_dump;
+module_param(enable_device_dump, bool, 0644);
+MODULE_PARM_DESC(enable_device_dump,
+		 "enable device_dump (default: disabled)");
+
 static const struct mwifiex_pcie_card_reg mwifiex_reg_8766 = {
 	.cmd_addr_lo = PCIE_SCRATCH_0_REG,
 	.cmd_addr_hi = PCIE_SCRATCH_1_REG,
@@ -2938,6 +2943,12 @@ static void mwifiex_pcie_fw_dump(struct mwifiex_adapter *adapter)
 
 static void mwifiex_pcie_device_dump_work(struct mwifiex_adapter *adapter)
 {
+	if (!enable_device_dump) {
+		mwifiex_dbg(adapter, MSG,
+			    "device_dump is disabled by module parameter\n");
+		return;
+	}
+
 	adapter->devdump_data = vzalloc(MWIFIEX_FW_DUMP_SIZE);
 	if (!adapter->devdump_data) {
 		mwifiex_dbg(adapter, ERROR,
-- 
2.29.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter
  2020-10-28 14:26 [PATCH] mwifiex: pcie: add enable_device_dump module parameter Tsuchiya Yuto
@ 2020-10-29  0:12 ` Brian Norris
  2020-10-30 10:30   ` Tsuchiya Yuto
  0 siblings, 1 reply; 5+ messages in thread
From: Brian Norris @ 2020-10-29  0:12 UTC (permalink / raw)
  To: Tsuchiya Yuto
  Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski, linux-wireless,
	<netdev@vger.kernel.org>,
	Linux Kernel, Maximilian Luz, Andy Shevchenko, verdre

On Wed, Oct 28, 2020 at 3:58 PM Tsuchiya Yuto <kitakar@gmail.com> wrote:
>
> The devicve_dump may take a little bit long time and users may want to
> disable the dump for daily usage.
>
> This commit adds a new module parameter enable_device_dump and disables
> the device_dump by default.

As with one of your other patches, please don't change the defaults
and hide them under a module parameter. If you're adding a module
parameter, leave the default behavior alone.

This also seems like something that might be nicer as a user-space
knob in generic form (similar to "/sys/class/devcoredump/disabled",
except on a per-device basis, and fed back to the driver so it doesn't
waste time generating such dumps), but I suppose I can see why a
module parameter (so you can just stick your configuration in
/etc/modprobe.d/) might be easier to deal with in some cases.

Brian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter
  2020-10-29  0:12 ` Brian Norris
@ 2020-10-30 10:30   ` Tsuchiya Yuto
  2020-11-20 20:53     ` Brian Norris
  0 siblings, 1 reply; 5+ messages in thread
From: Tsuchiya Yuto @ 2020-10-30 10:30 UTC (permalink / raw)
  To: Brian Norris
  Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski, linux-wireless,
	<netdev@vger.kernel.org>,
	Linux Kernel, Maximilian Luz, Andy Shevchenko, verdre

On Wed, 2020-10-28 at 17:12 -0700, Brian Norris wrote:
> On Wed, Oct 28, 2020 at 3:58 PM Tsuchiya Yuto <kitakar@gmail.com> wrote:
> > 
> > The devicve_dump may take a little bit long time and users may want to
> > disable the dump for daily usage.
> > 
> > This commit adds a new module parameter enable_device_dump and disables
> > the device_dump by default.
> 
> As with one of your other patches, please don't change the defaults
> and hide them under a module parameter. If you're adding a module
> parameter, leave the default behavior alone.

Thanks for the review!

I mentioned about power_save stability on the other patches. But I should
have added this fact into the commit message of this patch that even if
we disable that power_save, the firmware crashes a lot on some specific
devices. Really a lot.

For example, as far as I know Surface Pro 5 needs ASPM L1.2 substate
disabled to avoid the firmware crash. Disabling it is still acceptable.
On the other hand, Surface 3 needs L1 ASPM state disabled. This is not
acceptable because this breaks S0ix. Anyway, handling ASPM should be done
in firmware I think.

So, the context of why I sent this patch is the next. We can't fix the
fw crash itself, so, we decided to just let it crash and reset by itself
(with the other fw reset quirks I sent). In this way, the time it does
device_dump is really annoying if fw crashes so often.

Let me know if splitting this patch like this works. 1) The first patch
is to add this module parameter but don't change the default behavior.
2) The second patch is to change the parameter value depending on the
DMI matching or something so that it doesn't break the existing users.

But what I want to say here as well is that, if the firmware can be fixed,
we don't need a patch like this.

> This also seems like something that might be nicer as a user-space
> knob in generic form (similar to "/sys/class/devcoredump/disabled",
> except on a per-device basis, and fed back to the driver so it doesn't
> waste time generating such dumps), but I suppose I can see why a
> module parameter (so you can just stick your configuration in
> /etc/modprobe.d/) might be easier to deal with in some cases.

Agreed.

> Brian



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter
  2020-10-30 10:30   ` Tsuchiya Yuto
@ 2020-11-20 20:53     ` Brian Norris
  2020-11-26 18:31       ` Tsuchiya Yuto
  0 siblings, 1 reply; 5+ messages in thread
From: Brian Norris @ 2020-11-20 20:53 UTC (permalink / raw)
  To: Tsuchiya Yuto
  Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski, linux-wireless,
	<netdev@vger.kernel.org>,
	Linux Kernel, Maximilian Luz, Andy Shevchenko, verdre

(Sorry if anything's a bit slow here. I don't really have time to
write out full proposals myself.)

On Fri, Oct 30, 2020 at 3:30 AM Tsuchiya Yuto <kitakar@gmail.com> wrote:
> Let me know if splitting this patch like this works. 1) The first patch
> is to add this module parameter but don't change the default behavior.

That *could* be OK with me, although it's already been said that there
are many people who dislike extra module parameters. I also don't see
why this needs to be a module parameter at all. If you do #2 right,
you don't really need this, as there are already several standard ways
of doing this (e.g., via Kconfig, or via nl80211 on a per-device
basis).

> 2) The second patch is to change the parameter value depending on the
> DMI matching or something so that it doesn't break the existing users.

Point 2 sounds good, and this is the key point. Note that you can do
point 2 without making it a module parameter. Just keep a flag in the
driver-private structures.

> But what I want to say here as well is that, if the firmware can be fixed,
> we don't need a patch like this.

Sure. That's also where we don't necessarily need more ways to control
this from user space (e.g., module parameters), but just better
detection of currently broken systems (in the driver). And if firmware
ever gets fixed, we can undo the "broken device" detection.

Brian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter
  2020-11-20 20:53     ` Brian Norris
@ 2020-11-26 18:31       ` Tsuchiya Yuto
  0 siblings, 0 replies; 5+ messages in thread
From: Tsuchiya Yuto @ 2020-11-26 18:31 UTC (permalink / raw)
  To: Brian Norris
  Cc: Amitkumar Karwar, Ganapathi Bhat, Xinming Hu, Kalle Valo,
	David S. Miller, Jakub Kicinski, linux-wireless,
	<netdev@vger.kernel.org>,
	Linux Kernel, Maximilian Luz, Andy Shevchenko, verdre

On Fri, 2020-11-20 at 12:53 -0800, Brian Norris wrote:
> (Sorry if anything's a bit slow here. I don't really have time to
> write out full proposals myself.)

Don’t worry, I (and other owners) am already glad to hear from upstream
devs, including you :)

(and I'm also sorry for the late reply from me. It was difficult to find
spare time these days.)

> On Fri, Oct 30, 2020 at 3:30 AM Tsuchiya Yuto <kitakar@gmail.com> wrote:
> > Let me know if splitting this patch like this works. 1) The first patch
> > is to add this module parameter but don't change the default behavior.
> 
> That *could* be OK with me, although it's already been said that there
> are many people who dislike extra module parameters. I also don't see
> why this needs to be a module parameter at all. If you do #2 right,
> you don't really need this, as there are already several standard ways
> of doing this (e.g., via Kconfig, or via nl80211 on a per-device
> basis).
> 
> > 2) The second patch is to change the parameter value depending on the
> > DMI matching or something so that it doesn't break the existing users.
> 
> Point 2 sounds good, and this is the key point. Note that you can do
> point 2 without making it a module parameter. Just keep a flag in the
> driver-private structures.

I chose to also provide the module parameter because I thought it would
be a little bit complicated for users to re-enable this dump feature if
I chose not to provide this parameter.

If I don't provide the parameter but still want to allow users to
re-enable this dump feature, we can provide a way to change the bit flags
of quirks (via a new debugfs entry or another module parameter). That
said, is there a way to easily change the quirk flags only for this dump
feature?

For example, assume that the following three quirks are enabled by default:

/* quirks */
#define QUIRK_FW_RST_D3COLD	BIT(0)
#define QUIRK_NO_DEVICE_DUMP	BIT(1)
#define QUIRK_FOO		BIT(2)

.driver_data = (void *)(QUIRK_FW_RST_D3COLD |
			QUIRK_NO_DEVICE_DUMP |
			QUIRK_FOO),

card->quirks = (uintptr_t)dmi_id->driver_data;

/* and assume that card->quirks can be changed by users by a file named
 * "quirks" under debugfs.
 */

So, the card->quirks will be "7" in decimal by default. Then, if users
want to re-enable the dump feature, as far as I know, users need to know
the default value "7", then need to know that device_dump is controlled
by bit 1. Finally, users can set the new quirk flags "5" in decimal (101
in binary).

echo 5 > /sys/kernel/debug/mwifiex/mlan0/quirks

I'm glad if there is another nice way to control only the device_dump
quirk flag, if we only provide a way to change quirk flags.

But at the same time I also think that if someone dare to want to
re-enable this feature, maybe the person won't feel it's complicated haha.
So, I'll drop the device_dump module parameter and switch to use the quirk
framework, adding a way to change the quirk flags value by users.

That said, we may even drop disabling the dump. Take a look at my comment
I wrote below.

> > But what I want to say here as well is that, if the firmware can be fixed,
> > we don't need a patch like this.
> 
> Sure. That's also where we don't necessarily need more ways to control
> this from user space (e.g., module parameters), but just better
> detection of currently broken systems (in the driver). And if firmware
> ever gets fixed, we can undo the "broken device" detection.

There are two types of firmware crashes we observes:
1) cmd_timedout (other than ps_mode-related)
2) Firmware wakeup failed (ps_mode-related)

The #2 is observed when we enabled ps_mode. The #1 is observed for the
other causes. And hopefully, verdre (in Cc) found a "fix" [1] for the
#1 fw crash. We are trying the fix now.

The pull req (still WIP) basically addresses fw crashing by using
"non-posted" register writes and uninterruptible wait queue for PCI
operations in the kernel driver side.

We still don't have any ideas to "fix" the #2 fw crash, but now we don't
see the #1 crash anymore so far.

I'd like to see where the attempt goes before I start working on this
patch here again.

Thank you, everyone.

[1] https://github.com/linux-surface/kernel/pull/70
    [WIP] Properly fix wifi firmware crashes by jonas2515 · Pull Request #70 · linux-surface/kernel

> Brian



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-26 18:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-28 14:26 [PATCH] mwifiex: pcie: add enable_device_dump module parameter Tsuchiya Yuto
2020-10-29  0:12 ` Brian Norris
2020-10-30 10:30   ` Tsuchiya Yuto
2020-11-20 20:53     ` Brian Norris
2020-11-26 18:31       ` Tsuchiya Yuto

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.