* Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend [not found] <bug-216877-41252@https.bugzilla.kernel.org/> @ 2023-01-02 17:15 ` Bjorn Helgaas 2023-01-04 0:30 ` Bjorn Helgaas 0 siblings, 1 reply; 6+ messages in thread From: Bjorn Helgaas @ 2023-01-02 17:15 UTC (permalink / raw) To: Vidya Sagar; +Cc: Thomas Witt, linux-kernel, linux-pci Thank you very much for your report and all the work of bisecting, Thomas! If you have a chance, can you collect the "sudo lspci -vv" output (the one attached doesn't include the ASPM info, probably because lspci wasn't run as root), and also a complete dmesg log from before the suspend? On Mon, Jan 02, 2023 at 02:02:51PM +0000, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=216877 > > Bug ID: 216877 > Summary: Regression in PCI powermanagement breaks resume after > suspend > Kernel Version: 6.0.0-rc1 > > Created attachment 303512 > --> https://bugzilla.kernel.org/attachment.cgi?id=303512&action=edit > output of git bisect log > > After commit 5e85eba6f50dc288c22083a7e213152bcc4b8208 "PCI/ASPM: > Refactor L1 PM Substates Control Register programming" my Laptop > does not resume PCI devices back from suspend. > > My Laptop is a Tuxedo Infinitybook S 14 v5, as far as I can tell > they use a Clevo L140CU Mainboard. > > The main symptom is: > iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible > nvme 0000:03:00.0: Unable to change power state from D3hot to D0, device inaccessible > > after that, the level of interaction I still have with the laptop > varies, but It cannot run dmesg and it cannot do a clean reboot. The > issue occurs on every suspend/resume cycle. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend 2023-01-02 17:15 ` [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend Bjorn Helgaas @ 2023-01-04 0:30 ` Bjorn Helgaas 0 siblings, 0 replies; 6+ messages in thread From: Bjorn Helgaas @ 2023-01-04 0:30 UTC (permalink / raw) To: Vidya Sagar; +Cc: Thomas Witt, linux-kernel, linux-pci On Mon, Jan 02, 2023 at 11:15:16AM -0600, Bjorn Helgaas wrote: > On Mon, Jan 02, 2023 at 02:02:51PM +0000, bugzilla-daemon@kernel.org wrote: > > https://bugzilla.kernel.org/show_bug.cgi?id=216877 > > > > Bug ID: 216877 > > Summary: Regression in PCI powermanagement breaks resume after > > suspend > > Kernel Version: 6.0.0-rc1 > > > > Created attachment 303512 > > --> https://bugzilla.kernel.org/attachment.cgi?id=303512&action=edit > > output of git bisect log > > > > After commit 5e85eba6f50dc288c22083a7e213152bcc4b8208 "PCI/ASPM: > > Refactor L1 PM Substates Control Register programming" my Laptop > > does not resume PCI devices back from suspend. Thomas, could you try the debug patch below on top of v6.2-rc1? Prior to 5e85eba6f50d, aspm_calc_l1ss_info() did these config writes: if (enables) a child clear L1SS_CTL1 PCIPM_L1_2 ASPM_L1_2 # disable L1.2 b parent clear L1SS_CTL1 PCIPM_L1_2 ASPM_L1_2 # disable L1.2 /* T_POWER_ON in both */ c parent write L1SS_CTL2 T_POWER_ON d child write L1SS_CTL2 T_POWER_ON /* Common_Mode_Restore_Time in parent (rsvd in child) */ e parent write L1SS_CTL1 CM_RESTORE_TIME | LTR_L12_TH # clear CM_REST_TIME /* LTR_L1.2_THRESHOLD in both */ f parent write L1SS_CTL1 CM_RESTORE_TIME | LTR_L12_TH # clear LTR_L2_TH g child write L1SS_CTL1 CM_RESTORE_TIME | LTR_L12_TH # CM_REST_TIME rsvd? if (enables) h parent write L1SS_CTL1 PCIPM_L1_2 ASPM_L1_2 i child write L1SS_CTL1 PCIPM_L1_2 ASPM_L1_2 After 5e85eba6f50d, we do these: A parent write L1SS_CTL2 T_POWER_ON B parent write L1SS_CTL1 (CM_RESTORE_TIME | LTR_L12_TH) & ~(PCIPM_L1_2 ASPM_L1_2) if (enable) C parent write L1SS_CTL1 PCIPM_L1_2 ASPM_L1_2 D child write L1SS_CTL2 T_POWER_ON E child write L1SS_CTL1 (CM_RESTORE_TIME | LTR_L12_TH) & ~(PCIPM_L1_2 ASPM_L1_2) if (enable) F child write L1SS_CTL1 PCIPM_L1_2 ASPM_L1_2 Notes: - Before 5e85eba6f50d, we disable L1.2 at (a,b) before writing T_POWER_ON, CM_RESTORE_TIME, and LTR_L12_TH at (c,d,e,f,g). After 5e85eba6f50d, we write T_POWER_ON, CM_RESTORE_TIME, and LTR_L12_TH at (A,B) without disabling L1.2. Sec 5.5.4 suggests this may be a problem. - Even before 5e85eba6f50d, we write CM_REST_TIME for the child at (g), which is reserved per spec. - Before 5e85eba6f50d, the write at (e) inserts the new CMRT value, but ORs in the LTR_L12_TH values without clearing what was there before. The write at (f) inserts LTR_L12_TH correctly, so the result is probably correct, but it's messy. I think 5e85eba6f50d does this better. commit 98248ea220c8 ("debug https://bugzilla.kernel.org/show_bug.cgi?id=216877") Author: Bjorn Helgaas <bhelgaas@google.com> Date: Tue Jan 3 18:15:11 2023 -0600 debug https://bugzilla.kernel.org/show_bug.cgi?id=216877 diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 53a1fa306e1e..398c817858ac 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -552,6 +552,11 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link, ctl2 == pctl2 && ctl2 == cctl2) return; + pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + pctl1 &= ~(PCI_L1SS_CTL1_CM_RESTORE_TIME | PCI_L1SS_CTL1_LTR_L12_TH_VALUE | PCI_L1SS_CTL1_LTR_L12_TH_SCALE); ^ permalink raw reply related [flat|nested] 6+ messages in thread
[parent not found: <ba98ad17-53b0-1f76-70f7-10d37d279f9d@witt.link>]
* Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend [not found] <ba98ad17-53b0-1f76-70f7-10d37d279f9d@witt.link> @ 2023-01-04 15:02 ` Bjorn Helgaas 2023-01-04 15:37 ` Thomas Witt 0 siblings, 1 reply; 6+ messages in thread From: Bjorn Helgaas @ 2023-01-04 15:02 UTC (permalink / raw) To: Thomas Witt; +Cc: Vidya Sagar, Thomas Witt, linux-kernel, linux-pci On Wed, Jan 04, 2023 at 09:44:25AM +0100, Thomas Witt wrote: > On 04/01/2023 01:30, Bjorn Helgaas wrote: > > On Mon, Jan 02, 2023 at 11:15:16AM -0600, Bjorn Helgaas wrote: > > > On Mon, Jan 02, 2023 at 02:02:51PM +0000, bugzilla-daemon@kernel.org wrote: > > > > https://bugzilla.kernel.org/show_bug.cgi?id=216877 > > > > > > > > Bug ID: 216877 > > > > Summary: Regression in PCI powermanagement breaks resume after > > > > suspend > > > > Kernel Version: 6.0.0-rc1 BTW, if the bisect is correct, I think the regression actually is in v6.1-rc1, where 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") appeared. > > > > Created attachment 303512 > > > > --> https://bugzilla.kernel.org/attachment.cgi?id=303512&action=edit > > > > output of git bisect log > > > > > > > > After commit 5e85eba6f50dc288c22083a7e213152bcc4b8208 "PCI/ASPM: > > > > Refactor L1 PM Substates Control Register programming" my Laptop > > > > does not resume PCI devices back from suspend. > > > > Thomas, could you try the debug patch below on top of v6.2-rc1? > > Thank you for that patch Bjorn, but as far as I can see it does not change > anything. Thanks for testing it. Maybe Vidya will have more ideas. The patch below (based on v6.2-rc1) would revert 5e85eba6f50d and 4ff116d0d5fd. If 5e85eba6f50d is the culprit, it should fix the regression. It would also potentially break L1 substates after resume, so we'd like to avoid reverting it if possible. But the "Unable to change power state from D3hot to D0, device inaccessible" symptom suggests that the device is still in D3, which would be more like a wakeup issue than an ASPM issue. Your bisect log said 3e347969a577 ("PCI/PM: Reduce D3hot delay with usleep_range()") was "good", but it would be worth double-checking, e.g., see if reverting it from v6.2-rc1 makes any difference. Bjorn commit 61de2691d549 ("Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"") parent 1b929c02afd3 Author: Bjorn Helgaas <bhelgaas@google.com> Date: Wed Jan 4 08:38:53 2023 -0600 Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming" This reverts 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") and 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"). diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index fba95486caaf..5641786bd020 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1665,7 +1665,6 @@ int pci_save_state(struct pci_dev *dev) return i; pci_save_ltr_state(dev); - pci_save_aspm_l1ss_state(dev); pci_save_dpc_state(dev); pci_save_aer_state(dev); pci_save_ptm_state(dev); @@ -1772,7 +1771,6 @@ void pci_restore_state(struct pci_dev *dev) * LTR itself (in the PCIe capability). */ pci_restore_ltr_state(dev); - pci_restore_aspm_l1ss_state(dev); pci_restore_pcie_state(dev); pci_restore_pasid_state(dev); @@ -3465,11 +3463,6 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev) if (error) pci_err(dev, "unable to allocate suspend buffer for LTR\n"); - error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS, - 2 * sizeof(u32)); - if (error) - pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n"); - pci_allocate_vc_save_buffers(dev); } diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 9ed3b5550043..9049d07d3aae 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -566,14 +566,10 @@ bool pcie_wait_for_link(struct pci_dev *pdev, bool active); void pcie_aspm_init_link_state(struct pci_dev *pdev); void pcie_aspm_exit_link_state(struct pci_dev *pdev); void pcie_aspm_powersave_config_link(struct pci_dev *pdev); -void pci_save_aspm_l1ss_state(struct pci_dev *dev); -void pci_restore_aspm_l1ss_state(struct pci_dev *dev); #else static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { } -static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { } -static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { } #endif #ifdef CONFIG_PCIE_ECRC diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 53a1fa306e1e..4b4184563a92 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -470,31 +470,6 @@ static void pci_clear_and_set_dword(struct pci_dev *pdev, int pos, pci_write_config_dword(pdev, pos, val); } -static void aspm_program_l1ss(struct pci_dev *dev, u32 ctl1, u32 ctl2) -{ - u16 l1ss = dev->l1ss; - u32 l1_2_enable; - - /* - * Per PCIe r6.0, sec 5.5.4, T_POWER_ON in PCI_L1SS_CTL2 must be - * programmed prior to setting the L1.2 enable bits in PCI_L1SS_CTL1. - */ - pci_write_config_dword(dev, l1ss + PCI_L1SS_CTL2, ctl2); - - /* - * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD in - * PCI_L1SS_CTL1 must be programmed *before* setting the L1.2 - * enable bits, even though they're all in PCI_L1SS_CTL1. - */ - l1_2_enable = ctl1 & PCI_L1SS_CTL1_L1_2_MASK; - ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK; - - pci_write_config_dword(dev, l1ss + PCI_L1SS_CTL1, ctl1); - if (l1_2_enable) - pci_write_config_dword(dev, l1ss + PCI_L1SS_CTL1, - ctl1 | l1_2_enable); -} - /* Calculate L1.2 PM substate timing parameters */ static void aspm_calc_l1ss_info(struct pcie_link_state *link, u32 parent_l1ss_cap, u32 child_l1ss_cap) @@ -504,6 +479,7 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link, u32 t_common_mode, t_power_on, l1_2_threshold, scale, value; u32 ctl1 = 0, ctl2 = 0; u32 pctl1, pctl2, cctl1, cctl2; + u32 pl1_2_enables, cl1_2_enables; if (!(link->aspm_support & ASPM_STATE_L1_2_MASK)) return; @@ -552,21 +528,39 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link, ctl2 == pctl2 && ctl2 == cctl2) return; - pctl1 &= ~(PCI_L1SS_CTL1_CM_RESTORE_TIME | - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | - PCI_L1SS_CTL1_LTR_L12_TH_SCALE); - pctl1 |= (ctl1 & (PCI_L1SS_CTL1_CM_RESTORE_TIME | - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | - PCI_L1SS_CTL1_LTR_L12_TH_SCALE)); - aspm_program_l1ss(parent, pctl1, ctl2); + /* Disable L1.2 while updating. See PCIe r5.0, sec 5.5.4, 7.8.3.3 */ + pl1_2_enables = pctl1 & PCI_L1SS_CTL1_L1_2_MASK; + cl1_2_enables = cctl1 & PCI_L1SS_CTL1_L1_2_MASK; - cctl1 &= ~(PCI_L1SS_CTL1_CM_RESTORE_TIME | - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | - PCI_L1SS_CTL1_LTR_L12_TH_SCALE); - cctl1 |= (ctl1 & (PCI_L1SS_CTL1_CM_RESTORE_TIME | - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | - PCI_L1SS_CTL1_LTR_L12_TH_SCALE)); - aspm_program_l1ss(child, cctl1, ctl2); + if (pl1_2_enables || cl1_2_enables) { + pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + } + + /* Program T_POWER_ON times in both ports */ + pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, ctl2); + pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, ctl2); + + /* Program Common_Mode_Restore_Time in upstream device */ + pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1); + + /* Program LTR_L1.2_THRESHOLD time in both ports */ + pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_LTR_L12_TH_VALUE | + PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); + pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_LTR_L12_TH_VALUE | + PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); + + if (pl1_2_enables || cl1_2_enables) { + pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 0, + pl1_2_enables); + pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 0, + cl1_2_enables); + } } static void aspm_l1ss_init(struct pcie_link_state *link) @@ -757,43 +751,6 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state) PCI_L1SS_CTL1_L1SS_MASK, val); } -void pci_save_aspm_l1ss_state(struct pci_dev *dev) -{ - struct pci_cap_saved_state *save_state; - u16 l1ss = dev->l1ss; - u32 *cap; - - if (!l1ss) - return; - - save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_L1SS); - if (!save_state) - return; - - cap = (u32 *)&save_state->cap.data[0]; - pci_read_config_dword(dev, l1ss + PCI_L1SS_CTL2, cap++); - pci_read_config_dword(dev, l1ss + PCI_L1SS_CTL1, cap++); -} - -void pci_restore_aspm_l1ss_state(struct pci_dev *dev) -{ - struct pci_cap_saved_state *save_state; - u32 *cap, ctl1, ctl2; - u16 l1ss = dev->l1ss; - - if (!l1ss) - return; - - save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_L1SS); - if (!save_state) - return; - - cap = (u32 *)&save_state->cap.data[0]; - ctl2 = *cap++; - ctl1 = *cap; - aspm_program_l1ss(dev, ctl1, ctl2); -} - static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) { pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL, ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend 2023-01-04 15:02 ` Bjorn Helgaas @ 2023-01-04 15:37 ` Thomas Witt 2023-01-26 19:24 ` Thomas Witt 0 siblings, 1 reply; 6+ messages in thread From: Thomas Witt @ 2023-01-04 15:37 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-kernel, linux-pci On 04/01/2023 16:02, Bjorn Helgaas wrote: > On Wed, Jan 04, 2023 at 09:44:25AM +0100, Thomas Witt wrote: >> On 04/01/2023 01:30, Bjorn Helgaas wrote: >>> On Mon, Jan 02, 2023 at 11:15:16AM -0600, Bjorn Helgaas wrote: >>>> On Mon, Jan 02, 2023 at 02:02:51PM +0000, bugzilla-daemon@kernel.org wrote: >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=216877 >>>>> >>>>> Bug ID: 216877 >>>>> Summary: Regression in PCI powermanagement breaks resume after >>>>> suspend >>>>> Kernel Version: 6.0.0-rc1 > > BTW, if the bisect is correct, I think the regression actually is in > v6.1-rc1, where 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates > Control Register programming") appeared. > >>>>> Created attachment 303512 >>>>> --> https://bugzilla.kernel.org/attachment.cgi?id=303512&action=edit >>>>> output of git bisect log >>>>> >>>>> After commit 5e85eba6f50dc288c22083a7e213152bcc4b8208 "PCI/ASPM: >>>>> Refactor L1 PM Substates Control Register programming" my Laptop >>>>> does not resume PCI devices back from suspend. >>> >>> Thomas, could you try the debug patch below on top of v6.2-rc1? >> >> Thank you for that patch Bjorn, but as far as I can see it does not change >> anything. > > Thanks for testing it. Maybe Vidya will have more ideas. The patch > below (based on v6.2-rc1) would revert 5e85eba6f50d and 4ff116d0d5fd. > If 5e85eba6f50d is the culprit, it should fix the regression. It > would also potentially break L1 substates after resume, so we'd like > to avoid reverting it if possible. > > But the "Unable to change power state from D3hot to D0, device > inaccessible" symptom suggests that the device is still in D3, which > would be more like a wakeup issue than an ASPM issue. > > Your bisect log said 3e347969a577 ("PCI/PM: Reduce D3hot delay with > usleep_range()") was "good", but it would be worth double-checking, > e.g., see if reverting it from v6.2-rc1 makes any difference. > > Bjorn > > commit 61de2691d549 ("Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"") > parent 1b929c02afd3 > Author: Bjorn Helgaas <bhelgaas@google.com> > Date: Wed Jan 4 08:38:53 2023 -0600 > > Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming" With this patch on top of 6.2-rc1 suspend/resume works and my PCI devices come back online. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend 2023-01-04 15:37 ` Thomas Witt @ 2023-01-26 19:24 ` Thomas Witt 2023-02-02 20:49 ` Bjorn Helgaas 0 siblings, 1 reply; 6+ messages in thread From: Thomas Witt @ 2023-01-26 19:24 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: Vidya Sagar, linux-kernel, linux-pci On 04/01/2023 16:37, Thomas Witt wrote: > On 04/01/2023 16:02, Bjorn Helgaas wrote: >> Thanks for testing it. Maybe Vidya will have more ideas. The patch >> below (based on v6.2-rc1) would revert 5e85eba6f50d and 4ff116d0d5fd. >> If 5e85eba6f50d is the culprit, it should fix the regression. It >> would also potentially break L1 substates after resume, so we'd like >> to avoid reverting it if possible. >> >> But the "Unable to change power state from D3hot to D0, device >> inaccessible" symptom suggests that the device is still in D3, which >> would be more like a wakeup issue than an ASPM issue. >> >> Your bisect log said 3e347969a577 ("PCI/PM: Reduce D3hot delay with >> usleep_range()") was "good", but it would be worth double-checking, >> e.g., see if reverting it from v6.2-rc1 makes any difference. >> >> Bjorn >> >> commit 61de2691d549 ("Revert "PCI/ASPM: Refactor L1 PM Substates >> Control Register programming"") >> parent 1b929c02afd3 >> Author: Bjorn Helgaas <bhelgaas@google.com> >> Date: Wed Jan 4 08:38:53 2023 -0600 >> >> Revert "PCI/ASPM: Refactor L1 PM Substates Control Register >> programming" > > With this patch on top of 6.2-rc1 suspend/resume works and my PCI > devices come back online. > Hello Bjorn, hello Vidya, do you have an Idea what went wrong in that commit to cause my PCI devices to not return from D3? BR Thomas ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend 2023-01-26 19:24 ` Thomas Witt @ 2023-02-02 20:49 ` Bjorn Helgaas 0 siblings, 0 replies; 6+ messages in thread From: Bjorn Helgaas @ 2023-02-02 20:49 UTC (permalink / raw) To: Thomas Witt; +Cc: Vidya Sagar, linux-kernel, linux-pci On Thu, Jan 26, 2023 at 08:24:23PM +0100, Thomas Witt wrote: > On 04/01/2023 16:37, Thomas Witt wrote: > > On 04/01/2023 16:02, Bjorn Helgaas wrote: > > > Thanks for testing it. Maybe Vidya will have more ideas. The patch > > > below (based on v6.2-rc1) would revert 5e85eba6f50d and 4ff116d0d5fd. > > > If 5e85eba6f50d is the culprit, it should fix the regression. It > > > would also potentially break L1 substates after resume, so we'd like > > > to avoid reverting it if possible. > > > > > > But the "Unable to change power state from D3hot to D0, device > > > inaccessible" symptom suggests that the device is still in D3, which > > > would be more like a wakeup issue than an ASPM issue. > > > > > > Your bisect log said 3e347969a577 ("PCI/PM: Reduce D3hot delay with > > > usleep_range()") was "good", but it would be worth double-checking, > > > e.g., see if reverting it from v6.2-rc1 makes any difference. > > > > > > Bjorn > > > > > > commit 61de2691d549 ("Revert "PCI/ASPM: Refactor L1 PM Substates > > > Control Register programming"") > > > parent 1b929c02afd3 > > > Author: Bjorn Helgaas <bhelgaas@google.com> > > > Date: Wed Jan 4 08:38:53 2023 -0600 > > > > > > Revert "PCI/ASPM: Refactor L1 PM Substates Control Register > > > programming" > > > > With this patch on top of 6.2-rc1 suspend/resume works and my PCI > > devices come back online. > > do you have an Idea what went wrong in that commit to cause my PCI devices > to not return from D3? I do not have any ideas yet. I suspect maybe we should plan to revert 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") and 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") for now. It's odd that they both appeared in v6.1, and we haven't seen a flood of reports about this, so maybe there's something unusual about your machine, Thomas. I'll post a revert, and we can do that for v6.2 if we don't come up with any other ideas. Bjorn ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-02-02 20:50 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <bug-216877-41252@https.bugzilla.kernel.org/> 2023-01-02 17:15 ` [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend Bjorn Helgaas 2023-01-04 0:30 ` Bjorn Helgaas [not found] <ba98ad17-53b0-1f76-70f7-10d37d279f9d@witt.link> 2023-01-04 15:02 ` Bjorn Helgaas 2023-01-04 15:37 ` Thomas Witt 2023-01-26 19:24 ` Thomas Witt 2023-02-02 20:49 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).