linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler"
@ 2018-01-12 21:08 Brian Norris
  2018-01-12 21:08 ` [PATCH 2/2] mwifiex: resolve reset vs. remove()/shutdown() deadlocks Brian Norris
  2018-01-16 16:01 ` [1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler" Kalle Valo
  0 siblings, 2 replies; 3+ messages in thread
From: Brian Norris @ 2018-01-12 21:08 UTC (permalink / raw)
  To: Ganapathi Bhat, Nishant Sarmukadam, Kalle Valo, Amitkumar Karwar
  Cc: linux-kernel, linux-wireless, Ellie Reeves, Christoph Hellwig,
	Dmitry Torokhov, Zhiyuan Yang, Tim Song, Cathy Luo, James Cao,
	Brian Norris, Signed-off-by : Xinming Hu

This reverts commit b713bbf1471b56b572ce26bd02b81a85c2b007f4.

The "fix" in question does not actually fix all related problems, and it
also introduces new deadlock possibilities. Since commit b014e96d1abb
("PCI: Protect pci_error_handlers->reset_notify() usage with
device_lock()"), the race in question is actually resolved (PCIe reset
cannot happen at the same time as remove()). Instead, this "fix" just
introduces a deadlock where mwifiex_pcie_card_reset_work() is waiting on
device_lock, which is held by PCIe device remove(), which is waiting
on...mwifiex_pcie_card_reset_work().

The proper thing to do is just to fix the deadlock. Patch for this will
come separately.

Cc: Signed-off-by: Xinming Hu <huxm@marvell.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 2 --
 drivers/net/wireless/marvell/mwifiex/sdio.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 23209c5cab05..f666cb2ea7e0 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -310,8 +310,6 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
 		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
 	}
 
-	cancel_work_sync(&card->work);
-
 	mwifiex_remove_card(adapter);
 }
 
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index 248858723753..a82880132af4 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -399,8 +399,6 @@ mwifiex_sdio_remove(struct sdio_func *func)
 		mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
 	}
 
-	cancel_work_sync(&card->work);
-
 	mwifiex_remove_card(adapter);
 }
 
-- 
2.16.0.rc1.238.g530d649a79-goog

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] mwifiex: resolve reset vs. remove()/shutdown() deadlocks
  2018-01-12 21:08 [PATCH 1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler" Brian Norris
@ 2018-01-12 21:08 ` Brian Norris
  2018-01-16 16:01 ` [1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler" Kalle Valo
  1 sibling, 0 replies; 3+ messages in thread
From: Brian Norris @ 2018-01-12 21:08 UTC (permalink / raw)
  To: Ganapathi Bhat, Nishant Sarmukadam, Kalle Valo, Amitkumar Karwar
  Cc: linux-kernel, linux-wireless, Ellie Reeves, Christoph Hellwig,
	Dmitry Torokhov, Zhiyuan Yang, Tim Song, Cathy Luo, James Cao,
	Brian Norris, stable, Xinming Hu

Commit b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify()
usage with device_lock()") resolves races between driver reset and
removal, but it introduces some new deadlock problems. If we see a
timeout while we've already started suspending, removing, or shutting
down the driver, we might see:

(a) a worker thread, running mwifiex_pcie_work() ->
    mwifiex_pcie_card_reset_work() -> pci_reset_function()
(b) a removal thread, running mwifiex_pcie_remove() ->
    mwifiex_free_adapter() -> mwifiex_unregister() ->
    mwifiex_cleanup_pcie() -> cancel_work_sync(&card->work)

Unfortunately, mwifiex_pcie_remove() already holds the device lock that
pci_reset_function() is now requesting, and so we see a deadlock.

It's necessary to cancel and synchronize our outstanding work before
tearing down the driver, so we can't have this work wait indefinitely
for the lock.

It's reasonable to only "try" to reset here, since this will mostly
happen for cases where it's already difficult to reset the firmware
anyway (e.g., while we're suspending or powering off the system). And if
reset *really* needs to happen, we can always try again later.

Fixes: b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()")
Cc: <stable@vger.kernel.org>
Cc: Xinming Hu <huxm@marvell.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index f666cb2ea7e0..97a6199692ab 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -2786,7 +2786,10 @@ static void mwifiex_pcie_card_reset_work(struct mwifiex_adapter *adapter)
 {
 	struct pcie_service_card *card = adapter->card;
 
-	pci_reset_function(card->dev);
+	/* We can't afford to wait here; remove() might be waiting on us. If we
+	 * can't grab the device lock, maybe we'll get another chance later.
+	 */
+	pci_try_reset_function(card->dev);
 }
 
 static void mwifiex_pcie_work(struct work_struct *work)
-- 
2.16.0.rc1.238.g530d649a79-goog

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler"
  2018-01-12 21:08 [PATCH 1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler" Brian Norris
  2018-01-12 21:08 ` [PATCH 2/2] mwifiex: resolve reset vs. remove()/shutdown() deadlocks Brian Norris
@ 2018-01-16 16:01 ` Kalle Valo
  1 sibling, 0 replies; 3+ messages in thread
From: Kalle Valo @ 2018-01-16 16:01 UTC (permalink / raw)
  To: Brian Norris
  Cc: Ganapathi Bhat, Nishant Sarmukadam, Amitkumar Karwar,
	linux-kernel, linux-wireless, Ellie Reeves, Christoph Hellwig,
	Dmitry Torokhov, Zhiyuan Yang, Tim Song, Cathy Luo, James Cao,
	Brian Norris, Signed-off-by : Xinming Hu

Brian Norris <briannorris@chromium.org> wrote:

> This reverts commit b713bbf1471b56b572ce26bd02b81a85c2b007f4.
> 
> The "fix" in question does not actually fix all related problems, and it
> also introduces new deadlock possibilities. Since commit b014e96d1abb
> ("PCI: Protect pci_error_handlers->reset_notify() usage with
> device_lock()"), the race in question is actually resolved (PCIe reset
> cannot happen at the same time as remove()). Instead, this "fix" just
> introduces a deadlock where mwifiex_pcie_card_reset_work() is waiting on
> device_lock, which is held by PCIe device remove(), which is waiting
> on...mwifiex_pcie_card_reset_work().
> 
> The proper thing to do is just to fix the deadlock. Patch for this will
> come separately.
> 
> Cc: Signed-off-by: Xinming Hu <huxm@marvell.com>
> Signed-off-by: Brian Norris <briannorris@chromium.org>

2 patches applied to wireless-drivers-next.git, thanks.

7e34c0d2f631 Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler"
a64e7a79dd60 mwifiex: resolve reset vs. remove()/shutdown() deadlocks

-- 
https://patchwork.kernel.org/patch/10161659/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-01-16 16:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-12 21:08 [PATCH 1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler" Brian Norris
2018-01-12 21:08 ` [PATCH 2/2] mwifiex: resolve reset vs. remove()/shutdown() deadlocks Brian Norris
2018-01-16 16:01 ` [1/2] Revert "mwifiex: cancel pcie/sdio work in remove/shutdown handler" Kalle Valo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).