All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] USB: Quiesce interrupts across pm freeze
@ 2022-04-21 17:39 Evan Green
  2022-04-21 17:39 ` [PATCH v3 1/2] USB: core: Disable remote wakeup for freeze/quiesce Evan Green
  2022-04-21 17:39 ` [PATCH v3 2/2] USB: hcd-pci: Fully suspend across freeze/thaw cycle Evan Green
  0 siblings, 2 replies; 3+ messages in thread
From: Evan Green @ 2022-04-21 17:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mathias Nyman, Alan Stern, Rajat Jain, Oliver Neukum,
	Thomas Gleixner, Evan Green, Bjorn Helgaas, Rafael J. Wysocki,
	Razvan Heghedus, Wei Ming Chen, Youngjin Jang, linux-kernel,
	linux-usb

The documentation for the freeze() method says that it "should quiesce
the device so that it doesn't generate IRQs or DMA". The unspoken
consequence of not doing this is that MSIs aimed at non-boot CPUs may
get fully lost if they're sent during the period where the target CPU is
offline.

The current behavior of the USB subsystem still allows interrupts to
come in after freeze, both in terms of remote wakeups and HC events
related to things like root plug port activity. This can get controllers
like XHCI, which is very sensitive to lost interrupts, in a wedged
state. This series attempts to fully quiesce interrupts coming from USB
across in a freeze or quiescent state.

These patches are grouped together because they serve a united purpose,
but are actually independent. They could be merged or reverted
individually.

You may be able to reproduce this issue on your own machine via the
following:
1. Disable runtime PM on your XHCI controller
2. Aim your XHCI IRQ at a non-boot CPU (replace 174): echo 2 >
   /proc/irq/174/smp_affinity
3. Attempt to hibernate (no need to actually go all the way down).

I run 2 and 3 in a loop, and can usually hit a hang or dead XHCI
controller within 1-2 iterations. I happened to notice this on an
Alderlake system where runtime PM is accidentally disabled for one of
the XHCI controllers. Some more discussion and debugging can be found at
[1].

[1] https://lore.kernel.org/linux-pci/CAE=gft4a-QL82iFJE_xRQ3JrMmz-KZKWREtz=MghhjFbJeK=8A@mail.gmail.com/T/#u

Changes in v3:
 - Fix comment formatting and line wrap

Changes in v2:
 - Introduced the patch to always disable remote wakeups
 - Removed the change to freeze_noirq/thaw_noirq

Evan Green (2):
  USB: core: Disable remote wakeup for freeze/quiesce
  USB: hcd-pci: Fully suspend across freeze/thaw cycle

 drivers/usb/core/driver.c  | 25 +++++++++++++------------
 drivers/usb/core/hcd-pci.c |  4 ++--
 2 files changed, 15 insertions(+), 14 deletions(-)

-- 
2.31.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v3 1/2] USB: core: Disable remote wakeup for freeze/quiesce
  2022-04-21 17:39 [PATCH v3 0/2] USB: Quiesce interrupts across pm freeze Evan Green
@ 2022-04-21 17:39 ` Evan Green
  2022-04-21 17:39 ` [PATCH v3 2/2] USB: hcd-pci: Fully suspend across freeze/thaw cycle Evan Green
  1 sibling, 0 replies; 3+ messages in thread
From: Evan Green @ 2022-04-21 17:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mathias Nyman, Alan Stern, Rajat Jain, Oliver Neukum,
	Thomas Gleixner, Evan Green, Razvan Heghedus, Wei Ming Chen,
	linux-kernel, linux-usb

The PM_EVENT_FREEZE and PM_EVENT_QUIESCE messages should cause the
device to stop generating interrupts. USB core was previously allowing
devices that were already runtime suspended to keep remote wakeup
enabled if they had gone down that way. This violates the contract with
pm, and can potentially cause MSI interrupts to be lost.

Change that so that if a device is runtime suspended with remote wakeups
enabled, it will be resumed to ensure remote wakeup is always disabled
across a freeze.

Signed-off-by: Evan Green <evgreen@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>

---

Changes in v3:
 - Fix comment formatting and line wrap

Changes in v2:
 - Introduced the patch to always disable remote wakeups

 drivers/usb/core/driver.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 355ed33a21792b..b87452e228353d 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -1533,22 +1533,23 @@ static void choose_wakeup(struct usb_device *udev, pm_message_t msg)
 {
 	int	w;
 
-	/* Remote wakeup is needed only when we actually go to sleep.
-	 * For things like FREEZE and QUIESCE, if the device is already
-	 * autosuspended then its current wakeup setting is okay.
+	/*
+	 * For FREEZE/QUIESCE, disable remote wakeups so no interrupts get
+	 * generated.
 	 */
 	if (msg.event == PM_EVENT_FREEZE || msg.event == PM_EVENT_QUIESCE) {
-		if (udev->state != USB_STATE_SUSPENDED)
-			udev->do_remote_wakeup = 0;
-		return;
-	}
+		w = 0;
 
-	/* Enable remote wakeup if it is allowed, even if no interface drivers
-	 * actually want it.
-	 */
-	w = device_may_wakeup(&udev->dev);
+	} else {
+		/*
+		 * Enable remote wakeup if it is allowed, even if no interface
+		 * drivers actually want it.
+		 */
+		w = device_may_wakeup(&udev->dev);
+	}
 
-	/* If the device is autosuspended with the wrong wakeup setting,
+	/*
+	 * If the device is autosuspended with the wrong wakeup setting,
 	 * autoresume now so the setting can be changed.
 	 */
 	if (udev->state == USB_STATE_SUSPENDED && w != udev->do_remote_wakeup)
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH v3 2/2] USB: hcd-pci: Fully suspend across freeze/thaw cycle
  2022-04-21 17:39 [PATCH v3 0/2] USB: Quiesce interrupts across pm freeze Evan Green
  2022-04-21 17:39 ` [PATCH v3 1/2] USB: core: Disable remote wakeup for freeze/quiesce Evan Green
@ 2022-04-21 17:39 ` Evan Green
  1 sibling, 0 replies; 3+ messages in thread
From: Evan Green @ 2022-04-21 17:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mathias Nyman, Alan Stern, Rajat Jain, Oliver Neukum,
	Thomas Gleixner, Evan Green, Bjorn Helgaas, Rafael J. Wysocki,
	Youngjin Jang, linux-kernel, linux-usb

The documentation for the freeze() method says that it "should quiesce
the device so that it doesn't generate IRQs or DMA". The unspoken
consequence of not doing this is that MSIs aimed at non-boot CPUs may
get fully lost if they're sent during the period where the target CPU is
offline.

The current callbacks for USB HCD do not fully quiesce interrupts,
specifically on XHCI. Change to use the full suspend/resume flow for
freeze/thaw to ensure interrupts are fully quiesced. This fixes issues
where USB devices fail to thaw during hibernation because XHCI misses
its interrupt and cannot recover.

Signed-off-by: Evan Green <evgreen@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>

---

(no changes since v2)

Changes in v2:
 - Removed the change to freeze_noirq/thaw_noirq

 drivers/usb/core/hcd-pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index 8176bc81a635d6..ae5e6d572376be 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -616,10 +616,10 @@ const struct dev_pm_ops usb_hcd_pci_pm_ops = {
 	.suspend_noirq	= hcd_pci_suspend_noirq,
 	.resume_noirq	= hcd_pci_resume_noirq,
 	.resume		= hcd_pci_resume,
-	.freeze		= check_root_hub_suspended,
+	.freeze		= hcd_pci_suspend,
 	.freeze_noirq	= check_root_hub_suspended,
 	.thaw_noirq	= NULL,
-	.thaw		= NULL,
+	.thaw		= hcd_pci_resume,
 	.poweroff	= hcd_pci_suspend,
 	.poweroff_noirq	= hcd_pci_suspend_noirq,
 	.restore_noirq	= hcd_pci_resume_noirq,
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-04-21 17:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-21 17:39 [PATCH v3 0/2] USB: Quiesce interrupts across pm freeze Evan Green
2022-04-21 17:39 ` [PATCH v3 1/2] USB: core: Disable remote wakeup for freeze/quiesce Evan Green
2022-04-21 17:39 ` [PATCH v3 2/2] USB: hcd-pci: Fully suspend across freeze/thaw cycle Evan Green

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.