linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
@ 2024-04-24 11:02 Jian-Hong Pan
  2024-04-27  0:03 ` David E. Box
  0 siblings, 1 reply; 8+ messages in thread
From: Jian-Hong Pan @ 2024-04-24 11:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Johan Hovold, David Box, Ilpo Järvinen,
	Kuppuswamy Sathyanarayanan, Mika Westerberg, Damien Le Moal,
	Nirmal Patel, Jonathan Derrick, linux-pci, linux-kernel,
	Jian-Hong Pan

Currently, when enable link's L1.2 features with __pci_enable_link_state(),
it configs the link directly without ensuring related L1.2 parameters, such
as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been
programmed.

This leads the link's L1.2 between PCIe Root Port and child device gets
wrong configs when a caller tries to enabled it.

Here is a failed example on ASUS B1400CEAE with enabled VMD:

10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe Controller (rev 01) (prog-if 00 [Normal decode])
    ...
    Capabilities: [200 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
        	  PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
        	   T_CommonMode=45us LTR1.2_Threshold=101376ns
        L1SubCtl2: T_PwrOn=50us

10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01) (prog-if 02 [NVM Express])
    ...
    Capabilities: [900 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
                  PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
                   T_CommonMode=0us LTR1.2_Threshold=0ns
        L1SubCtl2: T_PwrOn=10us

According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe
Root Port and the child NVMe, they should be programmed with the same
LTR1.2_Threshold value. However, they have different values in this case.

Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before
enable L1.2 bits of L1 PM Substates Control Register in
__pci_enable_link_state().

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
---
v2:
- Prepare the PCIe LTR parameters before enable L1 Substates

v3:
- Only enable supported features for the L1 Substates part

v4:
- Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS

v5:
- Fix typo and commit message
- Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
  aspm_get_l1ss_cap()"

 drivers/pci/pcie/aspm.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index c55ac11faa73..553327dee991 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
 static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
 {
 	struct pcie_link_state *link = pcie_aspm_get_link(pdev);
+	struct pci_dev *child = link->downstream, *parent = link->pdev;
+	u32 parent_l1ss_cap, child_l1ss_cap;
 
 	if (!link)
 		return -EINVAL;
@@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
 		link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1;
 	if (state & PCIE_LINK_STATE_L1_2_PCIPM)
 		link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1;
+	/*
+	 * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and
+	 * LTR_L1.2_THRESHOLD are programmed properly before enable bits for
+	 * L1.2, per PCIe r6.0, sec 5.5.4.
+	 */
+	if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
+		parent_l1ss_cap = aspm_get_l1ss_cap(parent);
+		child_l1ss_cap = aspm_get_l1ss_cap(child);
+		aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
+	}
 	pcie_config_aspm_link(link, policy_to_aspm_state(link));
 
 	link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-04-24 11:02 [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state Jian-Hong Pan
@ 2024-04-27  0:03 ` David E. Box
  2024-04-30  7:46   ` Jian-Hong Pan
  0 siblings, 1 reply; 8+ messages in thread
From: David E. Box @ 2024-04-27  0:03 UTC (permalink / raw)
  To: Jian-Hong Pan, Bjorn Helgaas
  Cc: Johan Hovold, Ilpo Järvinen, Kuppuswamy Sathyanarayanan,
	Mika Westerberg, Damien Le Moal, Nirmal Patel, Jonathan Derrick,
	linux-pci, linux-kernel

Hi Jian-Hong,

On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> Currently, when enable link's L1.2 features with __pci_enable_link_state(),
> it configs the link directly without ensuring related L1.2 parameters, such
> as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been
> programmed.
> 
> This leads the link's L1.2 between PCIe Root Port and child device gets
> wrong configs when a caller tries to enabled it.
> 
> Here is a failed example on ASUS B1400CEAE with enabled VMD:
> 
> 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe
> Controller (rev 01) (prog-if 00 [Normal decode])
>     ...
>     Capabilities: [200 v1] L1 PM Substates
>         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> L1_PM_Substates+
>                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
>         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>                    T_CommonMode=45us LTR1.2_Threshold=101376ns
>         L1SubCtl2: T_PwrOn=50us
> 
> 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe
> SSD (rev 01) (prog-if 02 [NVM Express])
>     ...
>     Capabilities: [900 v1] L1 PM Substates
>         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> L1_PM_Substates+
>                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>                    T_CommonMode=0us LTR1.2_Threshold=0ns
>         L1SubCtl2: T_PwrOn=10us
> 
> According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe
> Root Port and the child NVMe, they should be programmed with the same
> LTR1.2_Threshold value. However, they have different values in this case.
> 
> Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before
> enable L1.2 bits of L1 PM Substates Control Register in
> __pci_enable_link_state().
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> ---
> v2:
> - Prepare the PCIe LTR parameters before enable L1 Substates
> 
> v3:
> - Only enable supported features for the L1 Substates part
> 
> v4:
> - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS
> 
> v5:
> - Fix typo and commit message
> - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
>   aspm_get_l1ss_cap()"
> 
>  drivers/pci/pcie/aspm.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index c55ac11faa73..553327dee991 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
>  static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool
> locked)
>  {
>         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> +       struct pci_dev *child = link->downstream, *parent = link->pdev;
> +       u32 parent_l1ss_cap, child_l1ss_cap;
>  
>         if (!link)
>                 return -EINVAL;
> @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev
> *pdev, int state, bool locked)
>                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1;
>         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
>                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1;
> +       /*
> +        * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and
> +        * LTR_L1.2_THRESHOLD are programmed properly before enable bits for
> +        * L1.2, per PCIe r6.0, sec 5.5.4.
> +        */
> +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {

This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. 'state' should
not even matter. The timings should always be calculated and programmed as long
as L1_2 is capable. That way the timings are ready even if L1_2 isn't being
enabled now (in case the user enables it later).

David

> +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> +               aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> +       }
>         pcie_config_aspm_link(link, policy_to_aspm_state(link));
>  
>         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-04-27  0:03 ` David E. Box
@ 2024-04-30  7:46   ` Jian-Hong Pan
  2024-04-30 18:26     ` David E. Box
  0 siblings, 1 reply; 8+ messages in thread
From: Jian-Hong Pan @ 2024-04-30  7:46 UTC (permalink / raw)
  To: david.e.box
  Cc: Bjorn Helgaas, Johan Hovold, Ilpo Järvinen,
	Kuppuswamy Sathyanarayanan, Mika Westerberg, Damien Le Moal,
	Nirmal Patel, Jonathan Derrick, linux-pci, linux-kernel

David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
>
> Hi Jian-Hong,
>
> On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > Currently, when enable link's L1.2 features with __pci_enable_link_state(),
> > it configs the link directly without ensuring related L1.2 parameters, such
> > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been
> > programmed.
> >
> > This leads the link's L1.2 between PCIe Root Port and child device gets
> > wrong configs when a caller tries to enabled it.
> >
> > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> >
> > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe
> > Controller (rev 01) (prog-if 00 [Normal decode])
> >     ...
> >     Capabilities: [200 v1] L1 PM Substates
> >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > L1_PM_Substates+
> >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> >         L1SubCtl2: T_PwrOn=50us
> >
> > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe
> > SSD (rev 01) (prog-if 02 [NVM Express])
> >     ...
> >     Capabilities: [900 v1] L1 PM Substates
> >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > L1_PM_Substates+
> >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> >         L1SubCtl2: T_PwrOn=10us
> >
> > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe
> > Root Port and the child NVMe, they should be programmed with the same
> > LTR1.2_Threshold value. However, they have different values in this case.
> >
> > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before
> > enable L1.2 bits of L1 PM Substates Control Register in
> > __pci_enable_link_state().
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > ---
> > v2:
> > - Prepare the PCIe LTR parameters before enable L1 Substates
> >
> > v3:
> > - Only enable supported features for the L1 Substates part
> >
> > v4:
> > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS
> >
> > v5:
> > - Fix typo and commit message
> > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> >   aspm_get_l1ss_cap()"
> >
> >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index c55ac11faa73..553327dee991 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> >  static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool
> > locked)
> >  {
> >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > +       struct pci_dev *child = link->downstream, *parent = link->pdev;
> > +       u32 parent_l1ss_cap, child_l1ss_cap;
> >
> >         if (!link)
> >                 return -EINVAL;
> > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev
> > *pdev, int state, bool locked)
> >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1;
> >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1;
> > +       /*
> > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and
> > +        * LTR_L1.2_THRESHOLD are programmed properly before enable bits for
> > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > +        */
> > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
>
> This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.

Thanks for your review, but I notice some description in PCIe spec,
5.5.4 L1 PM Substates Configuration:
"Prior to setting either or both of the enable bits for L1.2, the
values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
fields) must be programmed." => I think this includes both "ASPM L1.2
Enable" and "PCI-PM L1.2 Enable" bits.

Jain-Hong Pan

> 'state' should not even matter.
> The timings should always be calculated and programmed as long
> as L1_2 is capable. That way the timings are ready even if L1_2 isn't being
> enabled now (in case the user enables it later).
>
> David
>
> > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > +               aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> > +       }
> >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> >
> >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-04-30  7:46   ` Jian-Hong Pan
@ 2024-04-30 18:26     ` David E. Box
  2024-05-03  9:45       ` Jian-Hong Pan
  0 siblings, 1 reply; 8+ messages in thread
From: David E. Box @ 2024-04-30 18:26 UTC (permalink / raw)
  To: Jian-Hong Pan
  Cc: Bjorn Helgaas, Johan Hovold, Ilpo Järvinen,
	Kuppuswamy Sathyanarayanan, Mika Westerberg, Damien Le Moal,
	Nirmal Patel, Jonathan Derrick, linux-pci, linux-kernel

On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote:
> David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
> > 
> > Hi Jian-Hong,
> > 
> > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > > Currently, when enable link's L1.2 features with
> > > __pci_enable_link_state(),
> > > it configs the link directly without ensuring related L1.2 parameters,
> > > such
> > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been
> > > programmed.
> > > 
> > > This leads the link's L1.2 between PCIe Root Port and child device gets
> > > wrong configs when a caller tries to enabled it.
> > > 
> > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > > 
> > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe
> > > Controller (rev 01) (prog-if 00 [Normal decode])
> > >     ...
> > >     Capabilities: [200 v1] L1 PM Substates
> > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > L1_PM_Substates+
> > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > >         L1SubCtl2: T_PwrOn=50us
> > > 
> > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550
> > > NVMe
> > > SSD (rev 01) (prog-if 02 [NVM Express])
> > >     ...
> > >     Capabilities: [900 v1] L1 PM Substates
> > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > L1_PM_Substates+
> > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > >         L1SubCtl2: T_PwrOn=10us
> > > 
> > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe
> > > Root Port and the child NVMe, they should be programmed with the same
> > > LTR1.2_Threshold value. However, they have different values in this case.
> > > 
> > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before
> > > enable L1.2 bits of L1 PM Substates Control Register in
> > > __pci_enable_link_state().
> > > 
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > ---
> > > v2:
> > > - Prepare the PCIe LTR parameters before enable L1 Substates
> > > 
> > > v3:
> > > - Only enable supported features for the L1 Substates part
> > > 
> > > v4:
> > > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS
> > > 
> > > v5:
> > > - Fix typo and commit message
> > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > >   aspm_get_l1ss_cap()"
> > > 
> > >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index c55ac11faa73..553327dee991 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool
> > > locked)
> > >  {
> > >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > +       struct pci_dev *child = link->downstream, *parent = link->pdev;
> > > +       u32 parent_l1ss_cap, child_l1ss_cap;
> > > 
> > >         if (!link)
> > >                 return -EINVAL;
> > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev
> > > *pdev, int state, bool locked)
> > >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM |
> > > ASPM_STATE_L1;
> > >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> > >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM |
> > > ASPM_STATE_L1;
> > > +       /*
> > > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON
> > > and
> > > +        * LTR_L1.2_THRESHOLD are programmed properly before enable bits
> > > for
> > > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > > +        */
> > > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
> > 
> > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.
> 
> Thanks for your review, but I notice some description in PCIe spec,
> 5.5.4 L1 PM Substates Configuration:
> "Prior to setting either or both of the enable bits for L1.2, the
> values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
> Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
> fields) must be programmed." => I think this includes both "ASPM L1.2
> Enable" and "PCI-PM L1.2 Enable" bits.

That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, I
see no harm in including PCI-PM L1.2 in that check. This is what the code
already does in aspm_l1ss_init().

The issue is the mixed used of two different types of flags that don't have the
same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the
caller API to the pci_<enabled/disable>_link_state() functions. The ASPM_STATE
flags are used internally to aspm.c to track all states and their meaningful
combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-PM
L1.2. You should not do bit operations between them.

Also, you should not require that the timings be calculated only if L1_2 is
enabled. You should calculate the timings as long as it's capable. This is also
what aspm_l1ss_init() does.

The confusion might be over the fact that you are having
__pci_enable_link_state() call aspm_calc_l12_info(). This should have been
handled during initialization of the link in aspm_l1ss_init() and I'm not sure
why it didn't. Maybe it's because, for VMD, ASPM default state would have
started out all disabled and this somehow led to aspm_l1ss_init() not getting
called. But looking through the code I don't see it. It would be great if you
can confirm why they weren't calculated before.

David

> 
> Jain-Hong Pan
> 
> > 'state' should not even matter.
> > The timings should always be calculated and programmed as long
> > as L1_2 is capable. That way the timings are ready even if L1_2 isn't being
> > enabled now (in case the user enables it later).
> > 
> > David
> > 
> > > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > +               aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> > > +       }
> > >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > > 
> > >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-04-30 18:26     ` David E. Box
@ 2024-05-03  9:45       ` Jian-Hong Pan
  2024-05-03 19:15         ` David E. Box
  0 siblings, 1 reply; 8+ messages in thread
From: Jian-Hong Pan @ 2024-05-03  9:45 UTC (permalink / raw)
  To: david.e.box
  Cc: Bjorn Helgaas, Johan Hovold, Ilpo Järvinen,
	Kuppuswamy Sathyanarayanan, Mika Westerberg, Damien Le Moal,
	Nirmal Patel, Jonathan Derrick, linux-pci, linux-kernel

David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道:
>
> On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote:
> > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
> > >
> > > Hi Jian-Hong,
> > >
> > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > > > Currently, when enable link's L1.2 features with
> > > > __pci_enable_link_state(),
> > > > it configs the link directly without ensuring related L1.2 parameters,
> > > > such
> > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been
> > > > programmed.
> > > >
> > > > This leads the link's L1.2 between PCIe Root Port and child device gets
> > > > wrong configs when a caller tries to enabled it.
> > > >
> > > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > > >
> > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe
> > > > Controller (rev 01) (prog-if 00 [Normal decode])
> > > >     ...
> > > >     Capabilities: [200 v1] L1 PM Substates
> > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > > L1_PM_Substates+
> > > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > > >         L1SubCtl2: T_PwrOn=50us
> > > >
> > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550
> > > > NVMe
> > > > SSD (rev 01) (prog-if 02 [NVM Express])
> > > >     ...
> > > >     Capabilities: [900 v1] L1 PM Substates
> > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > L1_PM_Substates+
> > > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > > >         L1SubCtl2: T_PwrOn=10us
> > > >
> > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe
> > > > Root Port and the child NVMe, they should be programmed with the same
> > > > LTR1.2_Threshold value. However, they have different values in this case.
> > > >
> > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before
> > > > enable L1.2 bits of L1 PM Substates Control Register in
> > > > __pci_enable_link_state().
> > > >
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > ---
> > > > v2:
> > > > - Prepare the PCIe LTR parameters before enable L1 Substates
> > > >
> > > > v3:
> > > > - Only enable supported features for the L1 Substates part
> > > >
> > > > v4:
> > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS
> > > >
> > > > v5:
> > > > - Fix typo and commit message
> > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > > >   aspm_get_l1ss_cap()"
> > > >
> > > >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> > > >  1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > index c55ac11faa73..553327dee991 100644
> > > > --- a/drivers/pci/pcie/aspm.c
> > > > +++ b/drivers/pci/pcie/aspm.c
> > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool
> > > > locked)
> > > >  {
> > > >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > > +       struct pci_dev *child = link->downstream, *parent = link->pdev;
> > > > +       u32 parent_l1ss_cap, child_l1ss_cap;
> > > >
> > > >         if (!link)
> > > >                 return -EINVAL;
> > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev
> > > > *pdev, int state, bool locked)
> > > >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM |
> > > > ASPM_STATE_L1;
> > > >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> > > >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM |
> > > > ASPM_STATE_L1;
> > > > +       /*
> > > > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON
> > > > and
> > > > +        * LTR_L1.2_THRESHOLD are programmed properly before enable bits
> > > > for
> > > > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > > > +        */
> > > > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
> > >
> > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.
> >
> > Thanks for your review, but I notice some description in PCIe spec,
> > 5.5.4 L1 PM Substates Configuration:
> > "Prior to setting either or both of the enable bits for L1.2, the
> > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
> > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
> > fields) must be programmed." => I think this includes both "ASPM L1.2
> > Enable" and "PCI-PM L1.2 Enable" bits.
>
> That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, I
> see no harm in including PCI-PM L1.2 in that check. This is what the code
> already does in aspm_l1ss_init().
>
> The issue is the mixed used of two different types of flags that don't have the
> same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the
> caller API to the pci_<enabled/disable>_link_state() functions. The ASPM_STATE
> flags are used internally to aspm.c to track all states and their meaningful
> combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-PM
> L1.2. You should not do bit operations between them.
>
> Also, you should not require that the timings be calculated only if L1_2 is
> enabled. You should calculate the timings as long as it's capable. This is also
> what aspm_l1ss_init() does.
>
> The confusion might be over the fact that you are having
> __pci_enable_link_state() call aspm_calc_l12_info(). This should have been
> handled during initialization of the link in aspm_l1ss_init() and I'm not sure
> why it didn't. Maybe it's because, for VMD, ASPM default state would have
> started out all disabled and this somehow led to aspm_l1ss_init() not getting
> called. But looking through the code I don't see it. It would be great if you
> can confirm why they weren't calculated before.

I debug it again.  If I delete the pci_reset_bus() in vmd controller like:

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 87b7856f375a..39bfda4350bf 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
unsigned long features)
        pci_scan_child_bus(vmd->bus);
        vmd_domain_reset(vmd);

-       /* When Intel VMD is enabled, the OS does not discover the Root Ports
-        * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies
-        * a reset to the parent of the PCI device supplied as argument. This
-        * is why we pass a child device, so the reset can be triggered at
-        * the Intel bridge level and propagated to all the children in the
-        * hierarchy.
-        */
-       list_for_each_entry(child, &vmd->bus->children, node) {
-               if (!list_empty(&child->devices)) {
-                       dev = list_first_entry(&child->devices,
-                                              struct pci_dev, bus_list);
-                       ret = pci_reset_bus(dev);
-                       if (ret)
-                               pci_warn(dev, "can't reset device: %d\n", ret);
-
-                       break;
-               }
-       }
-
        pci_assign_unassigned_bus_resources(vmd->bus);

        pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);

Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's
LTR1.2_Threshold are configured as 101376ns:

10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
decode])
...
  Capabilities: [200 v1] L1 PM Substates
  L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
    PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
  L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
     T_CommonMode=45us LTR1.2_Threshold=101376ns
  L1SubCtl2: T_PwrOn=50us

10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
...
  Capabilities: [900 v1] L1 PM Substates
  L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
    PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
  L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
     T_CommonMode=0us LTR1.2_Threshold=101376ns
  L1SubCtl2: T_PwrOn=50us

Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable
PCI PM's L1 substates".  Both PCI bridge and the NVMe's PCI-PM_L1.2 is
enabled and LTR1.2_Threshold is configured as 101376ns.

10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
decode])
...
  Capabilities: [200 v1] L1 PM Substates
  L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
    PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
  L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
     T_CommonMode=45us LTR1.2_Threshold=101376ns
  L1SubCtl2: T_PwrOn=50us

10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
...
  Capabilities: [900 v1] L1 PM Substates
  L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
    PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
  L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
     T_CommonMode=0us LTR1.2_Threshold=101376ns
  L1SubCtl2: T_PwrOn=50us

I do not know VMD very much.  However, from the test result, it looks
like LTR1.2_Threshold has been configured properly originally.  But,
LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'.

Jian-Hong Pan



> >
> > Jain-Hong Pan
> >
> > > 'state' should not even matter.
> > > The timings should always be calculated and programmed as long
> > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't being
> > > enabled now (in case the user enables it later).
> > >
> > > David
> > >
> > > > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > > +               aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> > > > +       }
> > > >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > > >
> > > >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > >
>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-05-03  9:45       ` Jian-Hong Pan
@ 2024-05-03 19:15         ` David E. Box
  2024-05-03 22:28           ` Bjorn Helgaas
  0 siblings, 1 reply; 8+ messages in thread
From: David E. Box @ 2024-05-03 19:15 UTC (permalink / raw)
  To: Jian-Hong Pan, Nirmal Patel, Bjorn Helgaas
  Cc: Johan Hovold, Ilpo Järvinen, Kuppuswamy Sathyanarayanan,
	Mika Westerberg, Damien Le Moal, Jonathan Derrick, linux-pci,
	linux-kernel

+Nirmal,

Thanks Jian-Hong. This is a good find.

On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote:
> David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道:
> > 
> > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote:
> > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
> > > > 
> > > > Hi Jian-Hong,
> > > > 
> > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > > > > Currently, when enable link's L1.2 features with
> > > > > __pci_enable_link_state(),
> > > > > it configs the link directly without ensuring related L1.2 parameters,
> > > > > such
> > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have
> > > > > been
> > > > > programmed.
> > > > > 
> > > > > This leads the link's L1.2 between PCIe Root Port and child device
> > > > > gets
> > > > > wrong configs when a caller tries to enabled it.
> > > > > 
> > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > > > > 
> > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor
> > > > > PCIe
> > > > > Controller (rev 01) (prog-if 00 [Normal decode])
> > > > >     ...
> > > > >     Capabilities: [200 v1] L1 PM Substates
> > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > > > L1_PM_Substates+
> > > > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > > > >         L1SubCtl2: T_PwrOn=50us
> > > > > 
> > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue
> > > > > SN550
> > > > > NVMe
> > > > > SSD (rev 01) (prog-if 02 [NVM Express])
> > > > >     ...
> > > > >     Capabilities: [900 v1] L1 PM Substates
> > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > L1_PM_Substates+
> > > > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > >         L1SubCtl2: T_PwrOn=10us
> > > > > 
> > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the
> > > > > PCIe
> > > > > Root Port and the child NVMe, they should be programmed with the same
> > > > > LTR1.2_Threshold value. However, they have different values in this
> > > > > case.
> > > > > 
> > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly
> > > > > before
> > > > > enable L1.2 bits of L1 PM Substates Control Register in
> > > > > __pci_enable_link_state().
> > > > > 
> > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > > ---
> > > > > v2:
> > > > > - Prepare the PCIe LTR parameters before enable L1 Substates
> > > > > 
> > > > > v3:
> > > > > - Only enable supported features for the L1 Substates part
> > > > > 
> > > > > v4:
> > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole
> > > > > L1SS
> > > > > 
> > > > > v5:
> > > > > - Fix typo and commit message
> > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > > > >   aspm_get_l1ss_cap()"
> > > > > 
> > > > >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> > > > >  1 file changed, 12 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > index c55ac11faa73..553327dee991 100644
> > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > > > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state,
> > > > > bool
> > > > > locked)
> > > > >  {
> > > > >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > > > +       struct pci_dev *child = link->downstream, *parent = link-
> > > > > >pdev;
> > > > > +       u32 parent_l1ss_cap, child_l1ss_cap;
> > > > > 
> > > > >         if (!link)
> > > > >                 return -EINVAL;
> > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct
> > > > > pci_dev
> > > > > *pdev, int state, bool locked)
> > > > >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM |
> > > > > ASPM_STATE_L1;
> > > > >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> > > > >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM |
> > > > > ASPM_STATE_L1;
> > > > > +       /*
> > > > > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times,
> > > > > T_POWER_ON
> > > > > and
> > > > > +        * LTR_L1.2_THRESHOLD are programmed properly before enable
> > > > > bits
> > > > > for
> > > > > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > > > > +        */
> > > > > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
> > > > 
> > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.
> > > 
> > > Thanks for your review, but I notice some description in PCIe spec,
> > > 5.5.4 L1 PM Substates Configuration:
> > > "Prior to setting either or both of the enable bits for L1.2, the
> > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
> > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
> > > fields) must be programmed." => I think this includes both "ASPM L1.2
> > > Enable" and "PCI-PM L1.2 Enable" bits.
> > 
> > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here,
> > I
> > see no harm in including PCI-PM L1.2 in that check. This is what the code
> > already does in aspm_l1ss_init().
> > 
> > The issue is the mixed used of two different types of flags that don't have
> > the
> > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the
> > caller API to the pci_<enabled/disable>_link_state() functions. The
> > ASPM_STATE
> > flags are used internally to aspm.c to track all states and their meaningful
> > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-
> > PM
> > L1.2. You should not do bit operations between them.
> > 
> > Also, you should not require that the timings be calculated only if L1_2 is
> > enabled. You should calculate the timings as long as it's capable. This is
> > also
> > what aspm_l1ss_init() does.
> > 
> > The confusion might be over the fact that you are having
> > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been
> > handled during initialization of the link in aspm_l1ss_init() and I'm not
> > sure
> > why it didn't. Maybe it's because, for VMD, ASPM default state would have
> > started out all disabled and this somehow led to aspm_l1ss_init() not
> > getting
> > called. But looking through the code I don't see it. It would be great if
> > you
> > can confirm why they weren't calculated before.
> 
> I debug it again.  If I delete the pci_reset_bus() in vmd controller like:
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 87b7856f375a..39bfda4350bf 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> unsigned long features)
>         pci_scan_child_bus(vmd->bus);
>         vmd_domain_reset(vmd);
> 
> -       /* When Intel VMD is enabled, the OS does not discover the Root Ports
> -        * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies
> -        * a reset to the parent of the PCI device supplied as argument. This
> -        * is why we pass a child device, so the reset can be triggered at
> -        * the Intel bridge level and propagated to all the children in the
> -        * hierarchy.
> -        */
> -       list_for_each_entry(child, &vmd->bus->children, node) {
> -               if (!list_empty(&child->devices)) {
> -                       dev = list_first_entry(&child->devices,
> -                                              struct pci_dev, bus_list);
> -                       ret = pci_reset_bus(dev);

Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus
reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings
which would have been done above in pci_scan_child_bus(). Jian-Hong discovered
that without the above reset code, the timings are correct.

This patch recalculates the timings during the call to pci_enable_link_state()
which is called during pci_bus_walk() below. Originally I thought the
recalculation might have been needed by all callers to pci_enabled_link_state()
since it changes the default BIOS configuration. But it looks like the reset is
the cause and only the VMD driver would need the recalculation as a result. I
don't see qcom doing a reset.

Jian-Hong, given this (and assuming the reset is needed) I would not call
aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the
whole ASPM initialization right after the resets are done, maybe by calling
pci_scan_child_bus() again. What do you think Bjorn?

David

> -                       if (ret)
> -                               pci_warn(dev, "can't reset device: %d\n",
> ret);
> -
> -                       break;
> -               }
> -       }
> -
>         pci_assign_unassigned_bus_resources(vmd->bus);
> 
>         pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);
> 
> Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's
> LTR1.2_Threshold are configured as 101376ns:
> 
> 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> decode])
> ...
>   Capabilities: [200 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
>   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=45us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> ...
>   Capabilities: [900 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
>     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=0us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable
> PCI PM's L1 substates".  Both PCI bridge and the NVMe's PCI-PM_L1.2 is
> enabled and LTR1.2_Threshold is configured as 101376ns.
> 
> 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> decode])
> ...
>   Capabilities: [200 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
>   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=45us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> ...
>   Capabilities: [900 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
>     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=0us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> I do not know VMD very much.  However, from the test result, it looks
> like LTR1.2_Threshold has been configured properly originally.  But,
> LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'.
> 
> Jian-Hong Pan
> 
> 
> 
> > > 
> > > Jain-Hong Pan
> > > 
> > > > 'state' should not even matter.
> > > > The timings should always be calculated and programmed as long
> > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't
> > > > being
> > > > enabled now (in case the user enables it later).
> > > > 
> > > > David
> > > > 
> > > > > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > > > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > > > +               aspm_calc_l12_info(link, parent_l1ss_cap,
> > > > > child_l1ss_cap);
> > > > > +       }
> > > > >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > > > > 
> > > > >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > > > 
> > 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-05-03 19:15         ` David E. Box
@ 2024-05-03 22:28           ` Bjorn Helgaas
  2024-05-13 10:40             ` Jian-Hong Pan
  0 siblings, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2024-05-03 22:28 UTC (permalink / raw)
  To: David E. Box
  Cc: Jian-Hong Pan, Nirmal Patel, Johan Hovold, Ilpo Järvinen,
	Kuppuswamy Sathyanarayanan, Mika Westerberg, Damien Le Moal,
	Jonathan Derrick, linux-pci, linux-kernel, Francisco Munoz

[+cc Francisco]

On Fri, May 03, 2024 at 12:15:49PM -0700, David E. Box wrote:
> On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote:
> > David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道:
> > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote:
> > > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
> > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > > > > > Currently, when enable link's L1.2 features with
> > > > > > __pci_enable_link_state(),
> > > > > > it configs the link directly without ensuring related L1.2 parameters,
> > > > > > such
> > > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have
> > > > > > been
> > > > > > programmed.
> > > > > > 
> > > > > > This leads the link's L1.2 between PCIe Root Port and child device
> > > > > > gets
> > > > > > wrong configs when a caller tries to enabled it.
> > > > > > 
> > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > > > > > 
> > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor
> > > > > > PCIe
> > > > > > Controller (rev 01) (prog-if 00 [Normal decode])
> > > > > >     ...
> > > > > >     Capabilities: [200 v1] L1 PM Substates
> > > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > > > > L1_PM_Substates+
> > > > > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > > > > >         L1SubCtl2: T_PwrOn=50us
> > > > > > 
> > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue
> > > > > > SN550
> > > > > > NVMe
> > > > > > SSD (rev 01) (prog-if 02 [NVM Express])
> > > > > >     ...
> > > > > >     Capabilities: [900 v1] L1 PM Substates
> > > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > > L1_PM_Substates+
> > > > > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > > >         L1SubCtl2: T_PwrOn=10us
> > > > > > 
> > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the
> > > > > > PCIe
> > > > > > Root Port and the child NVMe, they should be programmed with the same
> > > > > > LTR1.2_Threshold value. However, they have different values in this
> > > > > > case.
> > > > > > 
> > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly
> > > > > > before
> > > > > > enable L1.2 bits of L1 PM Substates Control Register in
> > > > > > __pci_enable_link_state().
> > > > > > 
> > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > > > ---
> > > > > > v2:
> > > > > > - Prepare the PCIe LTR parameters before enable L1 Substates
> > > > > > 
> > > > > > v3:
> > > > > > - Only enable supported features for the L1 Substates part
> > > > > > 
> > > > > > v4:
> > > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole
> > > > > > L1SS
> > > > > > 
> > > > > > v5:
> > > > > > - Fix typo and commit message
> > > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > > > > >   aspm_get_l1ss_cap()"
> > > > > > 
> > > > > >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> > > > > >  1 file changed, 12 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > > index c55ac11faa73..553327dee991 100644
> > > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > > > > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state,
> > > > > > bool
> > > > > > locked)
> > > > > >  {
> > > > > >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > > > > +       struct pci_dev *child = link->downstream, *parent = link-
> > > > > > >pdev;
> > > > > > +       u32 parent_l1ss_cap, child_l1ss_cap;
> > > > > > 
> > > > > >         if (!link)
> > > > > >                 return -EINVAL;
> > > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct
> > > > > > pci_dev
> > > > > > *pdev, int state, bool locked)
> > > > > >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM |
> > > > > > ASPM_STATE_L1;
> > > > > >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> > > > > >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM |
> > > > > > ASPM_STATE_L1;
> > > > > > +       /*
> > > > > > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times,
> > > > > > T_POWER_ON
> > > > > > and
> > > > > > +        * LTR_L1.2_THRESHOLD are programmed properly before enable
> > > > > > bits
> > > > > > for
> > > > > > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > > > > > +        */
> > > > > > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
> > > > > 
> > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.

FWIW, Ilpo has removed the ASPM_STATE flags, so eventually this would
have to be updated to apply on the current pci/aspm branch.  We're at
rc6 already, so likely this will end up being v6.11 material so you'll
be able to rebase on v6.10-rc1 when it comes out.

> > > > Thanks for your review, but I notice some description in PCIe spec,
> > > > 5.5.4 L1 PM Substates Configuration:
> > > > "Prior to setting either or both of the enable bits for L1.2, the
> > > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
> > > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
> > > > fields) must be programmed." => I think this includes both "ASPM L1.2
> > > > Enable" and "PCI-PM L1.2 Enable" bits.
> > > 
> > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here,
> > > I
> > > see no harm in including PCI-PM L1.2 in that check. This is what the code
> > > already does in aspm_l1ss_init().
> > > 
> > > The issue is the mixed used of two different types of flags that don't have
> > > the
> > > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the
> > > caller API to the pci_<enabled/disable>_link_state() functions. The
> > > ASPM_STATE
> > > flags are used internally to aspm.c to track all states and their meaningful
> > > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-
> > > PM
> > > L1.2. You should not do bit operations between them.
> > > 
> > > Also, you should not require that the timings be calculated only if L1_2 is
> > > enabled. You should calculate the timings as long as it's capable. This is
> > > also
> > > what aspm_l1ss_init() does.
> > > 
> > > The confusion might be over the fact that you are having
> > > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been
> > > handled during initialization of the link in aspm_l1ss_init() and I'm not
> > > sure
> > > why it didn't. Maybe it's because, for VMD, ASPM default state would have
> > > started out all disabled and this somehow led to aspm_l1ss_init() not
> > > getting
> > > called. But looking through the code I don't see it. It would be great if
> > > you
> > > can confirm why they weren't calculated before.
> > 
> > I debug it again.  If I delete the pci_reset_bus() in vmd controller like:
> > 
> > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > index 87b7856f375a..39bfda4350bf 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > unsigned long features)
> >         pci_scan_child_bus(vmd->bus);
> >         vmd_domain_reset(vmd);
> > 
> > -       /* When Intel VMD is enabled, the OS does not discover the Root Ports
> > -        * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies
> > -        * a reset to the parent of the PCI device supplied as argument. This
> > -        * is why we pass a child device, so the reset can be triggered at
> > -        * the Intel bridge level and propagated to all the children in the
> > -        * hierarchy.
> > -        */
> > -       list_for_each_entry(child, &vmd->bus->children, node) {
> > -               if (!list_empty(&child->devices)) {
> > -                       dev = list_first_entry(&child->devices,
> > -                                              struct pci_dev, bus_list);
> > -                       ret = pci_reset_bus(dev);
> 
> Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus
> reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings
> which would have been done above in pci_scan_child_bus(). Jian-Hong discovered
> that without the above reset code, the timings are correct.

I don't understand that comment either.  If we don't enumerate the
Root Ports below VMD, it sounds like something is wrong, and reset
doesn't seem like the right fix.

The reset was added by 0a584655ef89 ("PCI: vmd: Fix secondary bus
reset for Intel bridges") for guest reboots.  Maybe Francisco can shed
more light on it.

> This patch recalculates the timings during the call to pci_enable_link_state()
> which is called during pci_bus_walk() below. Originally I thought the
> recalculation might have been needed by all callers to pci_enabled_link_state()
> since it changes the default BIOS configuration. But it looks like the reset is
> the cause and only the VMD driver would need the recalculation as a result. I
> don't see qcom doing a reset.
> 
> Jian-Hong, given this (and assuming the reset is needed) I would not call
> aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the
> whole ASPM initialization right after the resets are done, maybe by calling
> pci_scan_child_bus() again. What do you think Bjorn?

I would expect pci_reset_bus() to save and restore config space, but
if we don't enumerate all the devices correctly, I suppose we wouldn't
do that for devices we don't know about.

> > -                       if (ret)
> > -                               pci_warn(dev, "can't reset device: %d\n",
> > ret);
> > -
> > -                       break;
> > -               }
> > -       }
> > -
> >         pci_assign_unassigned_bus_resources(vmd->bus);
> > 
> >         pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);
> > 
> > Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's
> > LTR1.2_Threshold are configured as 101376ns:
> > 
> > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> > decode])
> > ...
> >   Capabilities: [200 v1] L1 PM Substates
> >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> >     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> >   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >      T_CommonMode=45us LTR1.2_Threshold=101376ns
> >   L1SubCtl2: T_PwrOn=50us
> > 
> > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > ...
> >   Capabilities: [900 v1] L1 PM Substates
> >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> >     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> >   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >      T_CommonMode=0us LTR1.2_Threshold=101376ns
> >   L1SubCtl2: T_PwrOn=50us
> > 
> > Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable
> > PCI PM's L1 substates".  Both PCI bridge and the NVMe's PCI-PM_L1.2 is
> > enabled and LTR1.2_Threshold is configured as 101376ns.
> > 
> > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> > decode])
> > ...
> >   Capabilities: [200 v1] L1 PM Substates
> >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> >     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> >   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >      T_CommonMode=45us LTR1.2_Threshold=101376ns
> >   L1SubCtl2: T_PwrOn=50us
> > 
> > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > ...
> >   Capabilities: [900 v1] L1 PM Substates
> >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> >     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> >   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >      T_CommonMode=0us LTR1.2_Threshold=101376ns
> >   L1SubCtl2: T_PwrOn=50us
> > 
> > I do not know VMD very much.  However, from the test result, it looks
> > like LTR1.2_Threshold has been configured properly originally.  But,
> > LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'.
> > ...
> > > > > 'state' should not even matter.
> > > > > The timings should always be calculated and programmed as long
> > > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't
> > > > > being
> > > > > enabled now (in case the user enables it later).
> > > > > 
> > > > > > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > > > > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > > > > +               aspm_calc_l12_info(link, parent_l1ss_cap,
> > > > > > child_l1ss_cap);
> > > > > > +       }
> > > > > >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > > > > > 
> > > > > >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > > > > 
> > > 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state
  2024-05-03 22:28           ` Bjorn Helgaas
@ 2024-05-13 10:40             ` Jian-Hong Pan
  0 siblings, 0 replies; 8+ messages in thread
From: Jian-Hong Pan @ 2024-05-13 10:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: David E. Box, Nirmal Patel, Johan Hovold, Ilpo Järvinen,
	Kuppuswamy Sathyanarayanan, Mika Westerberg, Damien Le Moal,
	Jonathan Derrick, linux-pci, linux-kernel, Francisco Munoz

Bjorn Helgaas <helgaas@kernel.org> 於 2024年5月4日 週六 上午6:28寫道:
>
> [+cc Francisco]
>
> On Fri, May 03, 2024 at 12:15:49PM -0700, David E. Box wrote:
> > On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote:
> > > David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道:
> > > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote:
> > > > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
> > > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > > > > > > Currently, when enable link's L1.2 features with
> > > > > > > __pci_enable_link_state(),
> > > > > > > it configs the link directly without ensuring related L1.2 parameters,
> > > > > > > such
> > > > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have
> > > > > > > been
> > > > > > > programmed.
> > > > > > >
> > > > > > > This leads the link's L1.2 between PCIe Root Port and child device
> > > > > > > gets
> > > > > > > wrong configs when a caller tries to enabled it.
> > > > > > >
> > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > > > > > >
> > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor
> > > > > > > PCIe
> > > > > > > Controller (rev 01) (prog-if 00 [Normal decode])
> > > > > > >     ...
> > > > > > >     Capabilities: [200 v1] L1 PM Substates
> > > > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > > > > > L1_PM_Substates+
> > > > > > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > > > > > >         L1SubCtl2: T_PwrOn=50us
> > > > > > >
> > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue
> > > > > > > SN550
> > > > > > > NVMe
> > > > > > > SSD (rev 01) (prog-if 02 [NVM Express])
> > > > > > >     ...
> > > > > > >     Capabilities: [900 v1] L1 PM Substates
> > > > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > > > L1_PM_Substates+
> > > > > > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > > > >         L1SubCtl2: T_PwrOn=10us
> > > > > > >
> > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the
> > > > > > > PCIe
> > > > > > > Root Port and the child NVMe, they should be programmed with the same
> > > > > > > LTR1.2_Threshold value. However, they have different values in this
> > > > > > > case.
> > > > > > >
> > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly
> > > > > > > before
> > > > > > > enable L1.2 bits of L1 PM Substates Control Register in
> > > > > > > __pci_enable_link_state().
> > > > > > >
> > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > > > > ---
> > > > > > > v2:
> > > > > > > - Prepare the PCIe LTR parameters before enable L1 Substates
> > > > > > >
> > > > > > > v3:
> > > > > > > - Only enable supported features for the L1 Substates part
> > > > > > >
> > > > > > > v4:
> > > > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole
> > > > > > > L1SS
> > > > > > >
> > > > > > > v5:
> > > > > > > - Fix typo and commit message
> > > > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > > > > > >   aspm_get_l1ss_cap()"
> > > > > > >
> > > > > > >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> > > > > > >  1 file changed, 12 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > > > index c55ac11faa73..553327dee991 100644
> > > > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > > > > > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state,
> > > > > > > bool
> > > > > > > locked)
> > > > > > >  {
> > > > > > >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > > > > > +       struct pci_dev *child = link->downstream, *parent = link-
> > > > > > > >pdev;
> > > > > > > +       u32 parent_l1ss_cap, child_l1ss_cap;
> > > > > > >
> > > > > > >         if (!link)
> > > > > > >                 return -EINVAL;
> > > > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct
> > > > > > > pci_dev
> > > > > > > *pdev, int state, bool locked)
> > > > > > >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM |
> > > > > > > ASPM_STATE_L1;
> > > > > > >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> > > > > > >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM |
> > > > > > > ASPM_STATE_L1;
> > > > > > > +       /*
> > > > > > > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times,
> > > > > > > T_POWER_ON
> > > > > > > and
> > > > > > > +        * LTR_L1.2_THRESHOLD are programmed properly before enable
> > > > > > > bits
> > > > > > > for
> > > > > > > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > > > > > > +        */
> > > > > > > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
> > > > > >
> > > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.
>
> FWIW, Ilpo has removed the ASPM_STATE flags, so eventually this would
> have to be updated to apply on the current pci/aspm branch.  We're at
> rc6 already, so likely this will end up being v6.11 material so you'll
> be able to rebase on v6.10-rc1 when it comes out.
>
> > > > > Thanks for your review, but I notice some description in PCIe spec,
> > > > > 5.5.4 L1 PM Substates Configuration:
> > > > > "Prior to setting either or both of the enable bits for L1.2, the
> > > > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
> > > > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
> > > > > fields) must be programmed." => I think this includes both "ASPM L1.2
> > > > > Enable" and "PCI-PM L1.2 Enable" bits.
> > > >
> > > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here,
> > > > I
> > > > see no harm in including PCI-PM L1.2 in that check. This is what the code
> > > > already does in aspm_l1ss_init().
> > > >
> > > > The issue is the mixed used of two different types of flags that don't have
> > > > the
> > > > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the
> > > > caller API to the pci_<enabled/disable>_link_state() functions. The
> > > > ASPM_STATE
> > > > flags are used internally to aspm.c to track all states and their meaningful
> > > > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-
> > > > PM
> > > > L1.2. You should not do bit operations between them.
> > > >
> > > > Also, you should not require that the timings be calculated only if L1_2 is
> > > > enabled. You should calculate the timings as long as it's capable. This is
> > > > also
> > > > what aspm_l1ss_init() does.
> > > >
> > > > The confusion might be over the fact that you are having
> > > > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been
> > > > handled during initialization of the link in aspm_l1ss_init() and I'm not
> > > > sure
> > > > why it didn't. Maybe it's because, for VMD, ASPM default state would have
> > > > started out all disabled and this somehow led to aspm_l1ss_init() not
> > > > getting
> > > > called. But looking through the code I don't see it. It would be great if
> > > > you
> > > > can confirm why they weren't calculated before.
> > >
> > > I debug it again.  If I delete the pci_reset_bus() in vmd controller like:
> > >
> > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > index 87b7856f375a..39bfda4350bf 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > > unsigned long features)
> > >         pci_scan_child_bus(vmd->bus);
> > >         vmd_domain_reset(vmd);
> > >
> > > -       /* When Intel VMD is enabled, the OS does not discover the Root Ports
> > > -        * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies
> > > -        * a reset to the parent of the PCI device supplied as argument. This
> > > -        * is why we pass a child device, so the reset can be triggered at
> > > -        * the Intel bridge level and propagated to all the children in the
> > > -        * hierarchy.
> > > -        */
> > > -       list_for_each_entry(child, &vmd->bus->children, node) {
> > > -               if (!list_empty(&child->devices)) {
> > > -                       dev = list_first_entry(&child->devices,
> > > -                                              struct pci_dev, bus_list);
> > > -                       ret = pci_reset_bus(dev);
> >
> > Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus
> > reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings
> > which would have been done above in pci_scan_child_bus(). Jian-Hong discovered
> > that without the above reset code, the timings are correct.
>
> I don't understand that comment either.  If we don't enumerate the
> Root Ports below VMD, it sounds like something is wrong, and reset
> doesn't seem like the right fix.
>
> The reset was added by 0a584655ef89 ("PCI: vmd: Fix secondary bus
> reset for Intel bridges") for guest reboots.  Maybe Francisco can shed
> more light on it.
>
> > This patch recalculates the timings during the call to pci_enable_link_state()
> > which is called during pci_bus_walk() below. Originally I thought the
> > recalculation might have been needed by all callers to pci_enabled_link_state()
> > since it changes the default BIOS configuration. But it looks like the reset is
> > the cause and only the VMD driver would need the recalculation as a result. I
> > don't see qcom doing a reset.
> >
> > Jian-Hong, given this (and assuming the reset is needed) I would not call
> > aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the
> > whole ASPM initialization right after the resets are done, maybe by calling
> > pci_scan_child_bus() again. What do you think Bjorn?
>
> I would expect pci_reset_bus() to save and restore config space, but
> if we don't enumerate all the devices correctly, I suppose we wouldn't
> do that for devices we don't know about.

According to the discussion above, the misconfiguration of ASPM L1.2
timings comes from pci_reset_bus() which is added by another issue
related to VT-d pass-through:
* 6aab5622296b ("PCI: vmd: Clean up domain before enumeration")
* 0a584655ef89 ("PCI: vmd: Fix secondary bus reset for Intel bridges")
However, I do not quite understand the scenario.

Should I drop patches "PCI/ASPM: Introduce aspm_get_l1ss_cap()" and
"PCI/ASPM: Fix L1.2 parameters when enable link state", then only send
"PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates"
and "PCI/ASPM: Add notes about enabling PCI-PM L1SS to
pci_enable_link_state(_locked)" first?

Jian-Hong Pan

> > > -                       if (ret)
> > > -                               pci_warn(dev, "can't reset device: %d\n",
> > > ret);
> > > -
> > > -                       break;
> > > -               }
> > > -       }
> > > -
> > >         pci_assign_unassigned_bus_resources(vmd->bus);
> > >
> > >         pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);
> > >
> > > Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's
> > > LTR1.2_Threshold are configured as 101376ns:
> > >
> > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> > > decode])
> > > ...
> > >   Capabilities: [200 v1] L1 PM Substates
> > >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > >     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > >   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >      T_CommonMode=45us LTR1.2_Threshold=101376ns
> > >   L1SubCtl2: T_PwrOn=50us
> > >
> > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > > ...
> > >   Capabilities: [900 v1] L1 PM Substates
> > >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > >     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > >   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >      T_CommonMode=0us LTR1.2_Threshold=101376ns
> > >   L1SubCtl2: T_PwrOn=50us
> > >
> > > Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable
> > > PCI PM's L1 substates".  Both PCI bridge and the NVMe's PCI-PM_L1.2 is
> > > enabled and LTR1.2_Threshold is configured as 101376ns.
> > >
> > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> > > decode])
> > > ...
> > >   Capabilities: [200 v1] L1 PM Substates
> > >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > >     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > >   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >      T_CommonMode=45us LTR1.2_Threshold=101376ns
> > >   L1SubCtl2: T_PwrOn=50us
> > >
> > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > > ...
> > >   Capabilities: [900 v1] L1 PM Substates
> > >   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > >     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > >   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >      T_CommonMode=0us LTR1.2_Threshold=101376ns
> > >   L1SubCtl2: T_PwrOn=50us
> > >
> > > I do not know VMD very much.  However, from the test result, it looks
> > > like LTR1.2_Threshold has been configured properly originally.  But,
> > > LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'.
> > > ...
> > > > > > 'state' should not even matter.
> > > > > > The timings should always be calculated and programmed as long
> > > > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't
> > > > > > being
> > > > > > enabled now (in case the user enables it later).
> > > > > >
> > > > > > > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > > > > > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > > > > > +               aspm_calc_l12_info(link, parent_l1ss_cap,
> > > > > > > child_l1ss_cap);
> > > > > > > +       }
> > > > > > >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > > > > > >
> > > > > > >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > > > > >
> > > >
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-05-13 10:41 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-24 11:02 [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state Jian-Hong Pan
2024-04-27  0:03 ` David E. Box
2024-04-30  7:46   ` Jian-Hong Pan
2024-04-30 18:26     ` David E. Box
2024-05-03  9:45       ` Jian-Hong Pan
2024-05-03 19:15         ` David E. Box
2024-05-03 22:28           ` Bjorn Helgaas
2024-05-13 10:40             ` Jian-Hong Pan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).