All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Vidya Sagar <vidyas@nvidia.com>
Cc: Thomas Witt <kernel@witt.link>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend
Date: Tue, 3 Jan 2023 18:30:46 -0600	[thread overview]
Message-ID: <20230104003046.GA1032131@bhelgaas> (raw)
In-Reply-To: <20230102171516.GA783946@bhelgaas>

On Mon, Jan 02, 2023 at 11:15:16AM -0600, Bjorn Helgaas wrote:
> On Mon, Jan 02, 2023 at 02:02:51PM +0000, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=216877
> > 
> >             Bug ID: 216877
> >            Summary: Regression in PCI powermanagement breaks resume after
> >                     suspend
> >     Kernel Version: 6.0.0-rc1
> > 
> > Created attachment 303512
> >   --> https://bugzilla.kernel.org/attachment.cgi?id=303512&action=edit
> > output of git bisect log
> > 
> > After commit 5e85eba6f50dc288c22083a7e213152bcc4b8208 "PCI/ASPM:
> > Refactor L1 PM Substates Control Register programming" my Laptop
> > does not resume PCI devices back from suspend.

Thomas, could you try the debug patch below on top of v6.2-rc1?

Prior to 5e85eba6f50d, aspm_calc_l1ss_info() did these config writes:

    if (enables)
 a    child  clear L1SS_CTL1  PCIPM_L1_2 ASPM_L1_2	# disable L1.2
 b    parent clear L1SS_CTL1  PCIPM_L1_2 ASPM_L1_2	# disable L1.2
    /* T_POWER_ON in both */
 c  parent write L1SS_CTL2  T_POWER_ON
 d  child  write L1SS_CTL2  T_POWER_ON
    /* Common_Mode_Restore_Time in parent (rsvd in child) */
 e  parent write L1SS_CTL1  CM_RESTORE_TIME | LTR_L12_TH # clear CM_REST_TIME 
    /* LTR_L1.2_THRESHOLD in both */
 f  parent write L1SS_CTL1  CM_RESTORE_TIME | LTR_L12_TH # clear LTR_L2_TH
 g  child  write L1SS_CTL1  CM_RESTORE_TIME | LTR_L12_TH # CM_REST_TIME rsvd?
    if (enables)
 h    parent write L1SS_CTL1  PCIPM_L1_2 ASPM_L1_2
 i    child  write L1SS_CTL1  PCIPM_L1_2 ASPM_L1_2

After 5e85eba6f50d, we do these:

 A  parent write L1SS_CTL2  T_POWER_ON
 B  parent write L1SS_CTL1  (CM_RESTORE_TIME | LTR_L12_TH) & ~(PCIPM_L1_2 ASPM_L1_2)
    if (enable)
 C    parent write L1SS_CTL1  PCIPM_L1_2 ASPM_L1_2

 D  child  write L1SS_CTL2  T_POWER_ON
 E  child  write L1SS_CTL1  (CM_RESTORE_TIME | LTR_L12_TH) & ~(PCIPM_L1_2 ASPM_L1_2)
    if (enable)
 F    child  write L1SS_CTL1  PCIPM_L1_2 ASPM_L1_2

Notes:

  - Before 5e85eba6f50d, we disable L1.2 at (a,b) before writing
    T_POWER_ON, CM_RESTORE_TIME, and LTR_L12_TH at (c,d,e,f,g).

    After 5e85eba6f50d, we write T_POWER_ON, CM_RESTORE_TIME, and
    LTR_L12_TH at (A,B) without disabling L1.2.  Sec 5.5.4 suggests
    this may be a problem.

  - Even before 5e85eba6f50d, we write CM_REST_TIME for the child at
    (g), which is reserved per spec.

  - Before 5e85eba6f50d, the write at (e) inserts the new CMRT value,
    but ORs in the LTR_L12_TH values without clearing what was there
    before.  The write at (f) inserts LTR_L12_TH correctly, so the
    result is probably correct, but it's messy.  I think 5e85eba6f50d
    does this better.


commit 98248ea220c8 ("debug https://bugzilla.kernel.org/show_bug.cgi?id=216877")
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Jan 3 18:15:11 2023 -0600

    debug https://bugzilla.kernel.org/show_bug.cgi?id=216877
    

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 53a1fa306e1e..398c817858ac 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -552,6 +552,11 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link,
 	    ctl2 == pctl2 && ctl2 == cctl2)
 		return;
 
+	pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
+				PCI_L1SS_CTL1_L1_2_MASK, 0);
+	pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
+				PCI_L1SS_CTL1_L1_2_MASK, 0);
+
 	pctl1 &= ~(PCI_L1SS_CTL1_CM_RESTORE_TIME |
 		   PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
 		   PCI_L1SS_CTL1_LTR_L12_TH_SCALE);

  reply	other threads:[~2023-01-04  0:31 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-216877-41252@https.bugzilla.kernel.org/>
2023-01-02 17:15 ` [Bug 216877] New: Regression in PCI powermanagement breaks resume after suspend Bjorn Helgaas
2023-01-04  0:30   ` Bjorn Helgaas [this message]
2023-01-04  8:44     ` Thomas Witt
2023-01-04 15:02       ` Bjorn Helgaas
2023-01-04 15:37         ` Thomas Witt
2023-01-26 19:24           ` Thomas Witt
2023-02-02 20:49             ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230104003046.GA1032131@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=kernel@witt.link \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=vidyas@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.