* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
@ 2020-12-28  4:05 Bjorn Helgaas
  2020-12-28  4:41 ` Kenneth R. Crudup
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2020-12-28  4:05 UTC (permalink / raw)
  To: Kenneth R. Crudup; +Cc: Vidya Sagar, linux-pci, linux-kernel

> From: Kenneth R. Crudup <kenny@panix.com>
> 
> I've been running Linus' master branch on my laptop (Dell XPS 13
> 2-in-1).  With this commit in place, after resuming from hibernate
> my machine is essentially useless, with a torrent of disk I/O errors
> on my NVMe device (at least, and possibly other devices affected)
> until a reboot.
> 
> I do use tlp to set the PCIe ASPM to "performance" on AC and
> "powersupersave" on battery.

Thanks a lot for the report, and sorry for the breakage.

4257f7e008ea restores PCI_L1SS_CTL1, then PCI_L1SS_CTL2.  I think it
should do those in the reverse order, since the Enable bits are in
PCI_L1SS_CTL1.  It also restores L1SS state (potentially enabling
L1.x) before we restore the PCIe Capability (potentially enabling ASPM
as a whole).  Those probably should also be in the other order.
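
To make the CTL1/CTL2 ordering concern concrete, here is a toy model in plain C. It is only a sketch, not kernel code: the toy_* names and struct toy_l1ss are hypothetical, and the "stale CTL2" flag is an assumption about what could go wrong, not a confirmed root cause. The enable mask value matches PCI_L1SS_CTL1_L1SS_MASK in pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Enable bits in L1SS Control 1 (same value as PCI_L1SS_CTL1_L1SS_MASK). */
#define TOY_L1SS_CTL1_ENABLES 0x0000000fu

/* Hypothetical stand-in for one device's L1SS registers; not a kernel type. */
struct toy_l1ss {
	uint32_t ctl1;
	uint32_t ctl2;
	uint32_t wanted_ctl2;   /* value CTL2 should hold before enabling */
	bool enabled_too_early; /* enables were written while CTL2 was stale */
};

static void toy_write_ctl2(struct toy_l1ss *d, uint32_t val)
{
	assert(d);
	d->ctl2 = val;
}

static void toy_write_ctl1(struct toy_l1ss *d, uint32_t val)
{
	assert(d);
	/* Flag the case where enables are written while CTL2 is still stale. */
	if ((val & TOY_L1SS_CTL1_ENABLES) && d->ctl2 != d->wanted_ctl2)
		d->enabled_too_early = true;
	d->ctl1 = val;
}

/* Order described above for 4257f7e008ea: CTL1 first, then CTL2. */
static void toy_restore_ctl1_first(struct toy_l1ss *d, uint32_t ctl1,
				   uint32_t ctl2)
{
	d->wanted_ctl2 = ctl2;
	toy_write_ctl1(d, ctl1);
	toy_write_ctl2(d, ctl2);
}

/* Order the debug patch tries: CTL2 first, then CTL1. */
static void toy_restore_ctl2_first(struct toy_l1ss *d, uint32_t ctl1,
				   uint32_t ctl2)
{
	d->wanted_ctl2 = ctl2;
	toy_write_ctl2(d, ctl2);
	toy_write_ctl1(d, ctl1);
}
```

In this model only the CTL1-first order ever sets enabled_too_early, which is the situation the reordering is meant to avoid.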

If it's convenient, can you try the patch below?  If the debug patch
doesn't help:

  - Are you seeing the hibernate/resume problem when on AC or on
    battery?

  - What if you don't use tlp?  Does hibernate/resume work fine then?
    If tlp makes a difference, can you collect "sudo lspci -vv" output
    with and without using tlp (before hibernate)?

  - If you revert 4257f7e008ea, does hibernate/resume work fine?  Both
    with the tlp tweak and without?

  - Collect "sudo lspci -vv" output before hibernate and (if possible)
    after resume when you see the problem.
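
When comparing the two lspci dumps, the L1 PM Substates enable bits in L1SS Control 1 are one obvious thing to check. The following is a small, hedged sketch in plain C for decoding and comparing two raw CTL1 values. The toy_* names are hypothetical, the bit values are the ones defined for PCI_L1SS_CTL1 in pci_regs.h, and the output format only resembles what lspci prints.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* L1SS Control 1 enable bits (values as in include/uapi/linux/pci_regs.h). */
#define TOY_PCIPM_L1_2 0x00000001u
#define TOY_PCIPM_L1_1 0x00000002u
#define TOY_ASPM_L1_2  0x00000004u
#define TOY_ASPM_L1_1  0x00000008u
#define TOY_ENABLES    0x0000000fu

/* Format the four enable bits with +/- markers; returns snprintf()'s result. */
static int toy_format_l1ss_enables(uint32_t ctl1, char *buf, size_t len)
{
	assert(buf && len > 0);
	return snprintf(buf, len,
			"PCI-PM_L1.2%c PCI-PM_L1.1%c ASPM_L1.2%c ASPM_L1.1%c",
			(ctl1 & TOY_PCIPM_L1_2) ? '+' : '-',
			(ctl1 & TOY_PCIPM_L1_1) ? '+' : '-',
			(ctl1 & TOY_ASPM_L1_2) ? '+' : '-',
			(ctl1 & TOY_ASPM_L1_1) ? '+' : '-');
}

/* Mask of enable bits that differ between two CTL1 snapshots. */
static uint32_t toy_changed_l1ss_enables(uint32_t before, uint32_t after)
{
	return (before ^ after) & TOY_ENABLES;
}
```

A nonzero result from toy_changed_l1ss_enables() for the same device before hibernate and after resume would suggest that the substate enables were not restored as saved.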

I guess tlp only uses /sys/module/pcie_aspm/parameters/policy, so it
sets the same ASPM policy system-wide.
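
For reference, that sysfs file lists all policies on one line and marks the active one with square brackets, e.g. "default performance [powersave] powersupersave". Below is a minimal sketch in plain C for pulling out the active name, so it can be recorded before hibernate and after resume. The function name toy_active_aspm_policy is hypothetical, and reading the line from the file is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) policy name from a line such as
 * "default performance [powersave] powersupersave" into out.
 * Returns 0 on success, -1 if there is no bracketed name or it does not fit.
 */
static int toy_active_aspm_policy(const char *line, char *out, size_t outlen)
{
	const char *open, *close;
	size_t n;

	assert(line && out);

	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open, ']');
	if (!close)
		return -1;

	/* Length of the text strictly between the brackets. */
	n = (size_t)(close - open - 1);
	if (n == 0 || n >= outlen)
		return -1;

	memcpy(out, open + 1, n);
	out[n] = '\0';
	return 0;
}
```

A caller would typically fgets() one line from /sys/module/pcie_aspm/parameters/policy and pass it to this helper.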

Bjorn

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b9fecc25d213..6598b5cd3154 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1665,9 +1665,8 @@ void pci_restore_state(struct pci_dev *dev)
 	 * LTR itself (in the PCIe capability).
 	 */
 	pci_restore_ltr_state(dev);
-	pci_restore_aspm_l1ss_state(dev);
-
 	pci_restore_pcie_state(dev);
+	pci_restore_aspm_l1ss_state(dev);
 	pci_restore_pasid_state(dev);
 	pci_restore_pri_state(dev);
 	pci_restore_ats_state(dev);
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index a08e7d6dc248..c4a99274b4bb 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -752,8 +752,8 @@ void pci_save_aspm_l1ss_state(struct pci_dev *dev)
 		return;
 
 	cap = (u32 *)&save_state->cap.data[0];
-	pci_read_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, cap++);
 	pci_read_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL2, cap++);
+	pci_read_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, cap++);
 }
 
 void pci_restore_aspm_l1ss_state(struct pci_dev *dev)
@@ -774,8 +774,8 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev)
 		return;
 
 	cap = (u32 *)&save_state->cap.data[0];
-	pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++);
 	pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL2, *cap++);
+	pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++);
 }
 
 static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2020-12-28  4:05 Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures Bjorn Helgaas
@ 2020-12-28  4:41 ` Kenneth R. Crudup
  2020-12-28  5:43 ` Kenneth R. Crudup
  2021-01-22 20:11 ` Kenneth R. Crudup
  2 siblings, 0 replies; 9+ messages in thread
From: Kenneth R. Crudup @ 2020-12-28  4:41 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-pci, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 835 bytes --]


On Sun, 27 Dec 2020, Bjorn Helgaas wrote:

> If it's convenient, can you try the patch below?

Will do!

Also:

>   - Are you seeing the hibernate/resume problem when on AC or on
>     battery?

Um, I forget :) but want to say "both". I'll try both ways and let you know.

>   - If you revert 4257f7e008ea, does hibernate/resume work fine?  Both
>     with the tlp tweak and without?

Yeah, but TBH there were two other PM regressions in this -rc cycle, so
you guys are in good company :)

>   - Collect "sudo lspci -vv" output before hibernate and (if possible)
>     after resume when you see the problem.

See attached.

> I guess tlp only uses /sys/module/pcie_aspm/parameters/policy, so it
> sets the same ASPM policy system-wide.

Yeah.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA

[-- Attachment #2: Type: text/plain, Size: 14038 bytes --]

00:00.0 Host bridge [0600]: Intel Corporation Device [8086:8a12] (rev 03)
	Subsystem: Dell Device [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IOMMU group 0
	Capabilities: [e0] Vendor Specific Information: Len=10 <?>
	Kernel driver in use: icl_uncore

00:02.0 VGA compatible controller [0300]: Intel Corporation Iris Plus Graphics G7 [8086:8a52] (rev 07) (prog-if 00 [VGA controller])
	DeviceName: To Be Filled by O.E.M.
	Subsystem: Dell Iris Plus Graphics G7 [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IRQ 201, IOMMU group 1
	Memory at 603d000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 4000000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 4000 [size=64]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [100] Process Address Space ID (PASID)
	Capabilities: [200] Address Translation Service (ATS)
	Capabilities: [300] Page Request Interface (PRI)
	Kernel driver in use: i915
	Kernel modules: i915

00:04.0 Signal processing controller [1180]: Intel Corporation Device [8086:8a03] (rev 03)
	Subsystem: Dell Device [1028:08b0]
	Flags: fast devsel, IRQ 16, IOMMU group 2
	Memory at 603eba0000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 3
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>
	Kernel driver in use: proc_thermal

00:05.0 Multimedia controller [0480]: Intel Corporation Device [8086:8a19] (rev 03)
	Subsystem: Dell Device [1028:08b0]
	Flags: fast devsel, IRQ 255, IOMMU group 3
	Memory at 603c000000 (64-bit, non-prefetchable) [disabled] [size=16M]
	Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [ac] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [d0] Power Management version 3

00:07.0 PCI bridge [0604]: Intel Corporation Ice Lake Thunderbolt 3 PCI Express Root Port #0 [8086:8a1d] (rev 03) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 155, IOMMU group 4
	Bus: primary=00, secondary=01, subordinate=2d, sec-latency=0
	I/O behind bridge: 00005000-00006fff [size=8K]
	Memory behind bridge: 7e000000-8a1fffff [size=194M]
	Prefetchable memory behind bridge: 0000006000000000-000000601bffffff [size=448M]
	Capabilities: [40] Express Root Port (Slot+), MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [90] Subsystem: Device [0000:0000]
	Capabilities: [a0] Power Management version 3
	Capabilities: [100] Null
	Capabilities: [220] Access Control Services
	Capabilities: [a00] Downstream Port Containment
	Kernel driver in use: pcieport

00:07.2 PCI bridge [0604]: Intel Corporation Ice Lake Thunderbolt 3 PCI Express Root Port #2 [8086:8a21] (rev 03) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 156, IOMMU group 5
	Bus: primary=00, secondary=2e, subordinate=58, sec-latency=0
	I/O behind bridge: 00007000-00007fff [size=4K]
	Memory behind bridge: 70000000-7c1fffff [size=194M]
	Prefetchable memory behind bridge: 0000006020000000-000000603bffffff [size=448M]
	Capabilities: [40] Express Root Port (Slot+), MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [90] Subsystem: Device [0000:0000]
	Capabilities: [a0] Power Management version 3
	Capabilities: [100] Null
	Capabilities: [220] Access Control Services
	Capabilities: [a00] Downstream Port Containment
	Kernel driver in use: pcieport

00:0d.0 USB controller [0c03]: Intel Corporation Ice Lake Thunderbolt 3 USB Controller [8086:8a13] (rev 03) (prog-if 30 [XHCI])
	Flags: bus master, medium devsel, latency 0, IRQ 171, IOMMU group 6
	Memory at 603eb90000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [70] Power Management version 2
	Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
	Capabilities: [90] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci

00:0d.2 System peripheral [0880]: Intel Corporation Ice Lake Thunderbolt 3 NHI #0 [8086:8a17] (rev 03)
	Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 6
	Memory at 603eb40000 (64-bit, non-prefetchable) [size=256K]
	Memory at 603ebc3000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable+ Count=16 Masked-
	Kernel driver in use: thunderbolt

00:0d.3 System peripheral [0880]: Intel Corporation Ice Lake Thunderbolt 3 NHI #1 [8086:8a0d] (rev 03)
	Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 6
	Memory at 603eb00000 (64-bit, non-prefetchable) [size=256K]
	Memory at 603ebc2000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [a0] MSI-X: Enable+ Count=16 Masked-
	Kernel driver in use: thunderbolt

00:12.0 Serial controller [0700]: Intel Corporation Device [8086:34fc] (rev 30) (prog-if 00 [8250])
	Subsystem: Dell Device [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 7
	Memory at 603ebba000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: intel_ish_ipc

00:14.0 USB controller [0c03]: Intel Corporation Ice Lake-LP USB 3.1 xHCI Host Controller [8086:34ed] (rev 30) (prog-if 30 [XHCI])
	Subsystem: Dell Ice Lake-LP USB 3.1 xHCI Host Controller [1028:08b0]
	Flags: medium devsel, IRQ 188, IOMMU group 8
	Memory at 603eb80000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [70] Power Management version 2
	Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
	Capabilities: [90] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci

00:14.2 RAM memory [0500]: Intel Corporation Device [8086:34ef] (rev 30)
	Flags: fast devsel, IOMMU group 8
	Memory at 603ebb8000 (64-bit, non-prefetchable) [disabled] [size=8K]
	Memory at 603ebc1000 (64-bit, non-prefetchable) [disabled] [size=4K]
	Capabilities: [80] Power Management version 3

00:14.3 Network controller [0280]: Intel Corporation Killer Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW) [8086:34f0] (rev 30)
	Subsystem: Bigfoot Networks, Inc. Killer Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW) [1a56:1651]
	Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 9
	Memory at 603ebb4000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [c8] Power Management version 3
	Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
	Capabilities: [100] Latency Tolerance Reporting
	Capabilities: [164] Vendor Specific Information: ID=0010 Rev=0 Len=014 <?>
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi

00:15.0 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP Serial IO I2C Controller #0 [8086:34e8] (rev 30)
	Subsystem: Dell Ice Lake-LP Serial IO I2C Controller [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 10
	Memory at 4010000000 (64-bit, non-prefetchable) [virtual] [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: intel-lpss

00:15.1 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP Serial IO I2C Controller #1 [8086:34e9] (rev 30)
	Subsystem: Dell Ice Lake-LP Serial IO I2C Controller [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IRQ 17, IOMMU group 10
	Memory at 4010001000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: intel-lpss

00:15.3 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP Serial IO I2C Controller #3 [8086:34eb] (rev 30)
	Subsystem: Dell Ice Lake-LP Serial IO I2C Controller [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 10
	Memory at 4010002000 (64-bit, non-prefetchable) [virtual] [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: intel-lpss

00:16.0 Communication controller [0780]: Intel Corporation Management Engine Interface [8086:34e0] (rev 30)
	Subsystem: Dell Management Engine Interface [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IRQ 159, IOMMU group 11
	Memory at 603ebbd000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: [50] Power Management version 3
	Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [a4] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: mei_me

00:1d.0 PCI bridge [0604]: Intel Corporation Ice Lake-LP PCI Express Root Port #9 [8086:34b0] (rev 30) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 157, IOMMU group 12
	Bus: primary=00, secondary=59, subordinate=59, sec-latency=0
	I/O behind bridge: [disabled]
	Memory behind bridge: 8ac00000-8acfffff [size=1M]
	Prefetchable memory behind bridge: [disabled]
	Capabilities: [40] Express Root Port (Slot+), MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [90] Subsystem: Dell Ice Lake-LP PCI Express Root Port [1028:08b0]
	Capabilities: [a0] Power Management version 3
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [220] Access Control Services
	Capabilities: [150] Precision Time Measurement
	Capabilities: [200] L1 PM Substates
	Capabilities: [a30] Secondary PCI Express
	Capabilities: [a00] Downstream Port Containment
	Kernel driver in use: pcieport

00:1d.7 PCI bridge [0604]: Intel Corporation Device [8086:34b7] (rev 30) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 158, IOMMU group 13
	Bus: primary=00, secondary=5a, subordinate=5a, sec-latency=0
	I/O behind bridge: 00003000-00003fff [size=4K]
	Memory behind bridge: 8a200000-8abfffff [size=10M]
	Prefetchable memory behind bridge: 000000603e000000-000000603e9fffff [size=10M]
	Capabilities: [40] Express Root Port (Slot+), MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [90] Subsystem: Dell Device [1028:08b0]
	Capabilities: [a0] Power Management version 3
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [220] Access Control Services
	Capabilities: [150] Precision Time Measurement
	Capabilities: [200] L1 PM Substates
	Capabilities: [a30] Secondary PCI Express
	Capabilities: [a00] Downstream Port Containment
	Kernel driver in use: pcieport

00:1f.0 ISA bridge [0601]: Intel Corporation Ice Lake-LP LPC Controller [8086:3482] (rev 30)
	Subsystem: Dell Ice Lake-LP LPC Controller [1028:08b0]
	Flags: bus master, fast devsel, latency 0, IOMMU group 14

00:1f.3 Audio device [0403]: Intel Corporation Smart Sound Technology Audio Controller [8086:34c8] (rev 30) (prog-if 80)
	Subsystem: Dell Smart Sound Technology Audio Controller [1028:08b0]
	Flags: bus master, fast devsel, latency 64, IRQ 170, IOMMU group 14
	Memory at 603ebb0000 (64-bit, non-prefetchable) [size=16K]
	Memory at 603ea00000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: [50] Power Management version 3
	Capabilities: [80] Vendor Specific Information: Len=14 <?>
	Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

00:1f.4 SMBus [0c05]: Intel Corporation Ice Lake-LP SMBus Controller [8086:34a3] (rev 30)
	Subsystem: Dell Ice Lake-LP SMBus Controller [1028:08b0]
	Flags: medium devsel, IRQ 16, IOMMU group 14
	Memory at 603ebbc000 (64-bit, non-prefetchable) [size=256]
	I/O ports at efa0 [size=32]
	Kernel driver in use: i801_smbus

00:1f.5 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP SPI Controller [8086:34a4] (rev 30)
	Subsystem: Dell Ice Lake-LP SPI Controller [1028:08b0]
	Flags: fast devsel, IOMMU group 14
	Memory at 6f800000 (32-bit, non-prefetchable) [size=4K]
	Kernel modules: intel_spi_pci

59:00.0 Non-Volatile memory controller [0108]: KIOXIA Corporation Device [1e0f:0001] (prog-if 02 [NVM Express])
	Subsystem: KIOXIA Corporation Device [1e0f:0001]
	Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 19
	Memory at 8ac00000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Express Endpoint, MSI 00
	Capabilities: [80] Power Management version 3
	Capabilities: [90] MSI: Enable- Count=1/32 Maskable+ 64bit+
	Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Virtual Channel
	Capabilities: [260] Latency Tolerance Reporting
	Capabilities: [300] Secondary PCI Express
	Capabilities: [400] L1 PM Substates
	Kernel driver in use: nvme

5a:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a] (rev 01)
	Subsystem: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a]
	Physical Slot: 15
	Flags: bus master, fast devsel, latency 0, IRQ 161, IOMMU group 20
	Memory at 8a200000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [b0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-00-00-01-00-4c-e0-00
	Capabilities: [158] Latency Tolerance Reporting
	Capabilities: [160] L1 PM Substates
	Kernel driver in use: rtsx_pci



* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2020-12-28  4:05 Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures Bjorn Helgaas
  2020-12-28  4:41 ` Kenneth R. Crudup
@ 2020-12-28  5:43 ` Kenneth R. Crudup
  2020-12-28  6:30   ` Kenneth R. Crudup
  2021-01-22 20:11 ` Kenneth R. Crudup
  2 siblings, 1 reply; 9+ messages in thread
From: Kenneth R. Crudup @ 2020-12-28  5:43 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-pci, linux-kernel


OK, got more info:

On Sun, 27 Dec 2020, Bjorn Helgaas wrote:

> If it's convenient, can you try the patch below?  If the debug patch
> doesn't help:

>   - Are you seeing the hibernate/resume problem when on AC or on
>     battery?

OK, so:

- on TLP, before your patch, it panic()s on AC, but not on battery
- on TLP, with your patch, it panic()s on battery, but NOT on AC
- if TLP is masked, it still panic()s, but I forget if AC or battery
- even if I mask TLP, with your commit in place it panic()s

>   - If you revert 4257f7e008ea, does hibernate/resume work fine?  Both
>     with the tlp tweak and without?

Definitely with the revert everything works. I'll try your patch after the
revert and see if anything changes.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2020-12-28  5:43 ` Kenneth R. Crudup
@ 2020-12-28  6:30   ` Kenneth R. Crudup
  2020-12-30  6:55     ` Vidya Sagar
  0 siblings, 1 reply; 9+ messages in thread
From: Kenneth R. Crudup @ 2020-12-28  6:30 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-pci, linux-kernel


On Sun, 27 Dec 2020, Kenneth R. Crudup wrote:

> I'll try your patch after the revert and see if anything changes.

I just realized today's patch makes no sense if the commit is reverted, so never mind.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2020-12-28  6:30   ` Kenneth R. Crudup
@ 2020-12-30  6:55     ` Vidya Sagar
  0 siblings, 0 replies; 9+ messages in thread
From: Vidya Sagar @ 2020-12-30  6:55 UTC (permalink / raw)
  To: Kenneth R. Crudup, Bjorn Helgaas; +Cc: linux-pci, linux-kernel

Ideally, Bjorn's patch should have worked.
Could you please collect 'sudo lspci -vv' output (please don't forget the
sudo) with Bjorn's patch applied, both before and after hibernate?
Also, is it right to say that with the policy set to "performance" there is
no issue during hibernate/resume?

- Vidya Sagar

On 12/28/2020 12:00 PM, Kenneth R. Crudup wrote:
> On Sun, 27 Dec 2020, Kenneth R. Crudup wrote:
> 
>> I'll try your patch after the revert and see if anything changes.
> 
> I just realized today's patch makes no sense if the commit is reverted, so never mind.
> 
>          -Kenny
> 
> --
> Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA
> 


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2020-12-28  4:05 Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures Bjorn Helgaas
  2020-12-28  4:41 ` Kenneth R. Crudup
  2020-12-28  5:43 ` Kenneth R. Crudup
@ 2021-01-22 20:11 ` Kenneth R. Crudup
  2021-01-27 15:50   ` Bjorn Helgaas
  2 siblings, 1 reply; 9+ messages in thread
From: Kenneth R. Crudup @ 2021-01-22 20:11 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-pci, linux-kernel


> > From: Kenneth R. Crudup <kenny@panix.com>
> > I've been running Linus' master branch on my laptop (Dell XPS 13
> > 2-in-1).  With this commit in place, after resuming from hibernate
> > my machine is essentially useless, with a torrent of disk I/O errors
> > on my NVMe device (at least, and possibly other devices affected)
> > until a reboot.
> >
> > I do use tlp to set the PCIe ASPM to "performance" on AC and
> > "powersupersave" on battery.

On Sun, 27 Dec 2020, Bjorn Helgaas wrote:

> Thanks a lot for the report, and sorry for the breakage.
> 4257f7e008ea restores PCI_L1SS_CTL1, then PCI_L1SS_CTL2.  I think it
> should do those in the reverse order, since the Enable bits are in
> PCI_L1SS_CTL1.  It also restores L1SS state (potentially enabling
> L1.x) before we restore the PCIe Capability (potentially enabling ASPM
> as a whole).  Those probably should also be in the other order.

Any new news on this? Disabling "tlp" (which just shifts the problem around
on my machine) shouldn't be a solution for this issue.

I'd thought it might have been tied to some of the PM regressions of the last
week of December, but all of those have been fixed and this one still remains.

Thanks,

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2021-01-22 20:11 ` Kenneth R. Crudup
@ 2021-01-27 15:50   ` Bjorn Helgaas
  2021-01-27 16:00     ` Kenneth R. Crudup
  2021-01-27 16:03     ` Kenneth R. Crudup
  0 siblings, 2 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2021-01-27 15:50 UTC (permalink / raw)
  To: Kenneth R. Crudup; +Cc: Vidya Sagar, linux-pci, linux-kernel

On Fri, Jan 22, 2021 at 12:11:08PM -0800, Kenneth R. Crudup wrote:
> > > From: Kenneth R. Crudup <kenny@panix.com>
> > > I've been running Linus' master branch on my laptop (Dell XPS 13
> > > 2-in-1).  With this commit in place, after resuming from hibernate
> > > my machine is essentially useless, with a torrent of disk I/O errors
> > > on my NVMe device (at least, and possibly other devices affected)
> > > until a reboot.
> > >
> > > I do use tlp to set the PCIe ASPM to "performance" on AC and
> > > "powersupersave" on battery.
> 
> On Sun, 27 Dec 2020, Bjorn Helgaas wrote:
> 
> > Thanks a lot for the report, and sorry for the breakage.
> > 4257f7e008ea restores PCI_L1SS_CTL1, then PCI_L1SS_CTL2.  I think it
> > should do those in the reverse order, since the Enable bits are in
> > PCI_L1SS_CTL1.  It also restores L1SS state (potentially enabling
> > L1.x) before we restore the PCIe Capability (potentially enabling ASPM
> > as a whole).  Those probably should also be in the other order.
> 
> Any new news on this? Disabling "tlp" (which just shifts the problem around
> on my machine) shouldn't be a solution for this issue.

Agreed; disabling "tlp" is a workaround but not a solution.

> I'd thought it might have been tied to some of the PM regressions of the last
> week of December, but all of those have been fixed and this one still remains.

I haven't seen anything yet and haven't had a chance to look into it
more myself.

We're at v5.11-rc5 already, so I guess we'll have to think about
reverting 4257f7e008ea ("PCI/ASPM: Save/restore L1SS Capability for
suspend/resume") before v5.11-final unless we can make some progress.

That would mean ASPM L1 substate configuration would be lost by a
suspend/resume, so we'd give up some power saving.  But that's better
than the regression you're seeing.

I'll tentatively queue up a revert on for-linus pending progress on a
better fix.  For some reason I can't find your initial report of the
regression.  The first thing I can find is this:

https://lore.kernel.org/linux-pci/20201228040513.GA611645@bjorn-Precision-5520/

Do you have a URL for your initial report that I could include in the
revert commit log?

Bjorn


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2021-01-27 15:50   ` Bjorn Helgaas
@ 2021-01-27 16:00     ` Kenneth R. Crudup
  2021-01-27 16:03     ` Kenneth R. Crudup
  1 sibling, 0 replies; 9+ messages in thread
From: Kenneth R. Crudup @ 2021-01-27 16:00 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-pci, linux-kernel


On Wed, 27 Jan 2021, Bjorn Helgaas wrote:

> Do you have a URL for your initial report that I could include in the
> revert commit log?

I don't, as I'd emailed the committers first and that was then CCed to the
mailing list, but here's what I'd sent:

----
Date: Fri, 25 Dec 2020 16:38:56
From: Kenneth R. Crudup <kenny@panix.com>
To: vidyas@nvidia.com
Cc: bhelgaas@google.com
Subject: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume
    failures

I've been running Linus' master branch on my laptop (Dell XPS 13 2-in-1). With
this commit in place, after resuming from hibernate my machine is essentially
useless, with a torrent of disk I/O errors on my NVMe device (at least, and
possibly other devices affected) until a reboot.

I do use tlp to set the PCIe ASPM to "performance" on AC and "powersupersave"
on battery.

Let me know if you need more information.

        -Kenny

----

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA


* Re: Commit 4257f7e0 ("PCI/ASPM: Save/restore L1SS Capability for suspend/resume") causing hibernate resume failures
  2021-01-27 15:50   ` Bjorn Helgaas
  2021-01-27 16:00     ` Kenneth R. Crudup
@ 2021-01-27 16:03     ` Kenneth R. Crudup
  1 sibling, 0 replies; 9+ messages in thread
From: Kenneth R. Crudup @ 2021-01-27 16:03 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-pci, linux-kernel


On Wed, 27 Jan 2021, Bjorn Helgaas wrote:

> > Any new news on this? Disabling "tlp" (which just shifts the problem around
> > on my machine) shouldn't be a solution for this issue.
>
> Agreed; disabling "tlp" is a workaround but not a solution.

Actually, disabling "tlp" doesn't fix the issue. I tested this and, IIRC,
not using tlp doesn't prevent the failure; it just shifts it from breaking
on hibernate cycles to breaking on suspend/resume cycles.

	-Kenny

-- 
Kenneth R. Crudup  Sr. SW Engineer, Scott County Consulting, Orange County CA
