All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler
@ 2017-12-13 11:27 Xinming Hu
  2018-01-08 17:38 ` Kalle Valo
  2018-01-08 18:11 ` [PATCH] " Brian Norris
  0 siblings, 2 replies; 5+ messages in thread
From: Xinming Hu @ 2017-12-13 11:27 UTC (permalink / raw)
  To: Linux Wireless
  Cc: Kalle Valo, Brian Norris, Dmitry Torokhov, rajatja, Zhiyuan Yang,
	Tim Song, Cathy Luo, James Cao, Ganapathi Bhat, Ellie Reeves,
	Xinming Hu

The last command used to shutdown firmware might be timeout,
and trigger firmware dump in asynchronous pcie/sdio work.

The remove/shutdown handler will continue free core data
structure private/adapter, which might be dereferenced in
pcie/sdio work, finally crash the kernel.

Sync and Cancel pcie/sdio work, could be a fix for above
cornel case. In this way, the last command timeout could
be handled properly.

Signed-off-by: Xinming Hu <huxm@marvell.com>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
 drivers/net/wireless/marvell/mwifiex/sdio.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index f666cb2..23209c5 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
 		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
 	}
 
+	cancel_work_sync(&card->work);
+
 	mwifiex_remove_card(adapter);
 }
 
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index a828801..2488587 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -399,6 +399,8 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
 		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
 	}
 
+	cancel_work_sync(&card->work);
+
 	mwifiex_remove_card(adapter);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: mwifiex: cancel pcie/sdio work in remove/shutdown handler
  2017-12-13 11:27 [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler Xinming Hu
@ 2018-01-08 17:38 ` Kalle Valo
  2018-01-08 18:11 ` [PATCH] " Brian Norris
  1 sibling, 0 replies; 5+ messages in thread
From: Kalle Valo @ 2018-01-08 17:38 UTC (permalink / raw)
  To: Xinming Hu
  Cc: Linux Wireless, Brian Norris, Dmitry Torokhov, rajatja,
	Zhiyuan Yang, Tim Song, Cathy Luo, James Cao, Ganapathi Bhat,
	Ellie Reeves, Xinming Hu

Xinming Hu <huxm@marvell.com> wrote:

> The last command used to shutdown firmware might be timeout,
> and trigger firmware dump in asynchronous pcie/sdio work.
> 
> The remove/shutdown handler will continue free core data
> structure private/adapter, which might be dereferenced in
> pcie/sdio work, finally crash the kernel.
> 
> Sync and Cancel pcie/sdio work, could be a fix for above
> cornel case. In this way, the last command timeout could
> be handled properly.
> 
> Signed-off-by: Xinming Hu <huxm@marvell.com>

Patch applied to wireless-drivers-next.git, thanks.

b713bbf1471b mwifiex: cancel pcie/sdio work in remove/shutdown handler

-- 
https://patchwork.kernel.org/patch/10109773/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler
  2017-12-13 11:27 [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler Xinming Hu
  2018-01-08 17:38 ` Kalle Valo
@ 2018-01-08 18:11 ` Brian Norris
  2018-01-09  7:39   ` Kalle Valo
  1 sibling, 1 reply; 5+ messages in thread
From: Brian Norris @ 2018-01-08 18:11 UTC (permalink / raw)
  To: Xinming Hu
  Cc: Linux Wireless, Kalle Valo, Dmitry Torokhov, rajatja,
	Zhiyuan Yang, Tim Song, Cathy Luo, James Cao, Ganapathi Bhat,
	Ellie Reeves

Hi,

On Wed, Dec 13, 2017 at 07:27:53PM +0800, Xinming Hu wrote:
> The last command used to shutdown firmware might be timeout,
> and trigger firmware dump in asynchronous pcie/sdio work.
> 
> The remove/shutdown handler will continue free core data
> structure private/adapter, which might be dereferenced in
> pcie/sdio work, finally crash the kernel.
> 
> Sync and Cancel pcie/sdio work, could be a fix for above
> cornel case. In this way, the last command timeout could

s/cornel/corner/

> be handled properly.
> 
> Signed-off-by: Xinming Hu <huxm@marvell.com>
> ---
>  drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
>  drivers/net/wireless/marvell/mwifiex/sdio.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index f666cb2..23209c5 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
>  		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
>  	}
>  
> +	cancel_work_sync(&card->work);
> +

Just FYI, this "fix" is not a real fix. It will likely paper over some
of your bugs (where, e.g., the FW shutdown command times out in the
previous couple of lines), but this highlights the fact that there are
other races that could trigger the same behavior. You're not fixing
those.

For example, what if somebody initiates a scan or other nl80211 command
between the above line and mwifiex_remove_card()? That command could
potentially time out too.

The proper fix would be to institute some kind of mutual exclusion
(locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
that they can't occur at the same time.

Unfortunately, I only paid attention to this after Kalle already applied
this patch. Personally, I'd prefer this patch not get applied, since
it's a bad solution to an obvious problem, which instead leaves a subtle
problem that perhaps no one will bother fixing.

Brian

>  	mwifiex_remove_card(adapter);
>  }
>  
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index a828801..2488587 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -399,6 +399,8 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
>  		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
>  	}
>  
> +	cancel_work_sync(&card->work);
> +
>  	mwifiex_remove_card(adapter);
>  }
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler
  2018-01-08 18:11 ` [PATCH] " Brian Norris
@ 2018-01-09  7:39   ` Kalle Valo
  0 siblings, 0 replies; 5+ messages in thread
From: Kalle Valo @ 2018-01-09  7:39 UTC (permalink / raw)
  To: Brian Norris
  Cc: Xinming Hu, Linux Wireless, Dmitry Torokhov, rajatja,
	Zhiyuan Yang, Tim Song, Cathy Luo, James Cao, Ganapathi Bhat,
	Ellie Reeves

Brian Norris <briannorris@chromium.org> writes:

>> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
>> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
>> @@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
>>  		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
>>  	}
>>  
>> +	cancel_work_sync(&card->work);
>> +
>
> Just FYI, this "fix" is not a real fix. It will likely paper over some
> of your bugs (where, e.g., the FW shutdown command times out in the
> previous couple of lines), but this highlights the fact that there are
> other races that could trigger the same behavior. You're not fixing
> those.
>
> For example, what if somebody initiates a scan or other nl80211 command
> between the above line and mwifiex_remove_card()? That command could
> potentially time out too.
>
> The proper fix would be to institute some kind of mutual exclusion
> (locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
> that they can't occur at the same time.
>
> Unfortunately, I only paid attention to this after Kalle already applied
> this patch. Personally, I'd prefer this patch not get applied, since
> it's a bad solution to an obvious problem, which instead leaves a subtle
> problem that perhaps no one will bother fixing.

I can revert it, that's not a problem. Can I use the text below as
explanation for the revert?

----------------------------------------------------------------------
Brian Norris <briannorris@chromium.org> says:

Just FYI, this "fix" is not a real fix. It will likely paper over some
of your bugs (where, e.g., the FW shutdown command times out in the
previous couple of lines), but this highlights the fact that there are
other races that could trigger the same behavior. You're not fixing
those.

For example, what if somebody initiates a scan or other nl80211 command
between the above line and mwifiex_remove_card()? That command could
potentially time out too.

The proper fix would be to institute some kind of mutual exclusion
(locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
that they can't occur at the same time.

----------------------------------------------------------------------

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler
  2018-01-12 19:06   ` Brian Norris
@ 2018-01-13  9:54     ` Kalle Valo
  0 siblings, 0 replies; 5+ messages in thread
From: Kalle Valo @ 2018-01-13  9:54 UTC (permalink / raw)
  To: Brian Norris
  Cc: Xinming Hu, Linux Wireless, Dmitry Torokhov, rajatja,
	Zhiyuan Yang, Tim Song, Cathy Luo, James Cao, Ganapathi Bhat,
	Ellie Reeves, Christoph Hellwig

Brian Norris <briannorris@chromium.org> writes:

> On Thu, Jan 11, 2018 at 06:25:09PM -0800, Brian Norris wrote:
>> Anyway, I'll do my own testing and then submit my patch properly.
>
> OK, so I definitely confirmed: if your patch does anything, it
> introduces a new deadlock possibility. Just trigger a Wifi timeout or
> reset from within remove(), and you'll see the work event get stuck in
> pci_reset_function(), while remove() gets stuck at cancel_work_sync().
>
> I also confirmed that my patch resolves this problem.
>
> I'll send the revert + my patch now.

Great, thanks. I didn't had a chance to do the revert yet but I'll now
apply your revert instead.

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-01-13  9:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-13 11:27 [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler Xinming Hu
2018-01-08 17:38 ` Kalle Valo
2018-01-08 18:11 ` [PATCH] " Brian Norris
2018-01-09  7:39   ` Kalle Valo
2018-01-10 12:30 Re: " Xinming Hu
2018-01-12  2:25 ` Brian Norris
2018-01-12 19:06   ` Brian Norris
2018-01-13  9:54     ` Kalle Valo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.