linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state
@ 2021-01-28 15:52 Victor Ding
  2021-03-11 17:34 ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Victor Ding @ 2021-01-28 15:52 UTC (permalink / raw)
  To: Bjorn Helgaas, Ulf Hansson, Adrian Hunter
  Cc: Ben Chuang, linux-pci, linux-kernel, linux-mmc, Victor Ding,
	Bjorn Helgaas, Chris Packham, Kai-Heng Feng, Mika Westerberg,
	Saheed O. Bolarinwa, Vidya Sagar, Xiongfeng Wang

Certain PCIe devices (e.g. GL9750) have high penalties (e.g. high Port
T_POWER_ON) when exiting L1 but enter L1 aggressively. As a result,
such devices enter and exit L1 frequently during pci_save_state and
pci_restore_state; eventually causing poor suspend/resume performance.

Based on the observation that PCI accesses dominance pci_save_state/
pci_restore_state plus these accesses are fairly close to each other, the
actual time the device could stay in low power states is negligible.
Therefore, the little power-saving benefit from ASPM during suspend/resume
does not overweight the performance degradation caused by high L1 exit
penalties.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211187
Signed-off-by: Victor Ding <victording@google.com>

---

Changes in v2:
- Updated commit message to remove unnecessary information
- Fixed a bug reading wrong register in pcie_save_aspm_control
- Updated to reuse existing pcie_config_aspm_dev where possible
- Fixed goto label style

 drivers/pci/pci.c       | 18 +++++++++++++++---
 drivers/pci/pci.h       |  6 ++++++
 drivers/pci/pcie/aspm.c | 27 +++++++++++++++++++++++++++
 include/linux/pci.h     |  1 +
 4 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 32011b7b4c04..9ea88953f90b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1542,6 +1542,10 @@ static void pci_restore_ltr_state(struct pci_dev *dev)
 int pci_save_state(struct pci_dev *dev)
 {
 	int i;
+
+	pcie_save_aspm_control(dev);
+	pcie_disable_aspm(dev);
+
 	/* XXX: 100% dword access ok here? */
 	for (i = 0; i < 16; i++) {
 		pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
@@ -1552,18 +1556,22 @@ int pci_save_state(struct pci_dev *dev)
 
 	i = pci_save_pcie_state(dev);
 	if (i != 0)
-		return i;
+		goto exit;
 
 	i = pci_save_pcix_state(dev);
 	if (i != 0)
-		return i;
+		goto exit;
 
 	pci_save_ltr_state(dev);
 	pci_save_aspm_l1ss_state(dev);
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
 	pci_save_ptm_state(dev);
-	return pci_save_vc_state(dev);
+	i = pci_save_vc_state(dev);
+
+exit:
+	pcie_restore_aspm_control(dev);
+	return i;
 }
 EXPORT_SYMBOL(pci_save_state);
 
@@ -1661,6 +1669,8 @@ void pci_restore_state(struct pci_dev *dev)
 	if (!dev->state_saved)
 		return;
 
+	pcie_disable_aspm(dev);
+
 	/*
 	 * Restore max latencies (in the LTR capability) before enabling
 	 * LTR itself (in the PCIe capability).
@@ -1689,6 +1699,8 @@ void pci_restore_state(struct pci_dev *dev)
 	pci_enable_acs(dev);
 	pci_restore_iov_state(dev);
 
+	pcie_restore_aspm_control(dev);
+
 	dev->state_saved = false;
 }
 EXPORT_SYMBOL(pci_restore_state);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index a81459159f6d..e074a0cbe73c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -584,6 +584,9 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev);
 void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
 void pci_save_aspm_l1ss_state(struct pci_dev *dev);
 void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
+void pcie_save_aspm_control(struct pci_dev *dev);
+void pcie_restore_aspm_control(struct pci_dev *dev);
+void pcie_disable_aspm(struct pci_dev *pdev);
 #else
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
@@ -591,6 +594,9 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
 static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
 static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { }
 static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { }
+static inline void pcie_save_aspm_control(struct pci_dev *dev) { }
+static inline void pcie_restore_aspm_control(struct pci_dev *dev) { }
+static inline void pcie_disable_aspm(struct pci_dev *pdev) { }
 #endif
 
 #ifdef CONFIG_PCIE_ECRC
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index a08e7d6dc248..e1e97db32e8b 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -784,6 +784,33 @@ static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
 					   PCI_EXP_LNKCTL_ASPMC, val);
 }
 
+void pcie_disable_aspm(struct pci_dev *pdev)
+{
+	if (!pci_is_pcie(pdev))
+		return;
+
+	pcie_config_aspm_dev(pdev, 0);
+}
+
+void pcie_save_aspm_control(struct pci_dev *pdev)
+{
+	u16 lnkctl;
+
+	if (!pci_is_pcie(pdev))
+		return;
+
+	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl);
+	pdev->saved_aspm_ctl = lnkctl & PCI_EXP_LNKCTL_ASPMC;
+}
+
+void pcie_restore_aspm_control(struct pci_dev *pdev)
+{
+	if (!pci_is_pcie(pdev))
+		return;
+
+	pcie_config_aspm_dev(pdev, pdev->saved_aspm_ctl);
+}
+
 static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
 {
 	u32 upstream = 0, dwstream = 0;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b32126d26997..a21bfd6e3f89 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -387,6 +387,7 @@ struct pci_dev {
 	unsigned int	ltr_path:1;	/* Latency Tolerance Reporting
 					   supported from root to here */
 	u16		l1ss;		/* L1SS Capability pointer */
+	u16		saved_aspm_ctl; /* ASPM Control saved at suspend time */
 #endif
 	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
 
-- 
2.30.0.280.ga3ce27912f-goog


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state
  2021-01-28 15:52 [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state Victor Ding
@ 2021-03-11 17:34 ` Bjorn Helgaas
  2021-07-26 22:03   ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2021-03-11 17:34 UTC (permalink / raw)
  To: Victor Ding
  Cc: Ulf Hansson, Adrian Hunter, Ben Chuang, linux-pci, linux-kernel,
	linux-mmc, Bjorn Helgaas, Chris Packham, Kai-Heng Feng,
	Mika Westerberg, Saheed O. Bolarinwa, Vidya Sagar,
	Xiongfeng Wang

On Thu, Jan 28, 2021 at 03:52:42PM +0000, Victor Ding wrote:
> Certain PCIe devices (e.g. GL9750) have high penalties (e.g. high Port
> T_POWER_ON) when exiting L1 but enter L1 aggressively. As a result,
> such devices enter and exit L1 frequently during pci_save_state and
> pci_restore_state; eventually causing poor suspend/resume performance.
> 
> Based on the observation that PCI accesses dominance pci_save_state/
> pci_restore_state plus these accesses are fairly close to each other, the
> actual time the device could stay in low power states is negligible.
> Therefore, the little power-saving benefit from ASPM during suspend/resume
> does not overweight the performance degradation caused by high L1 exit
> penalties.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211187

Thanks for this!

This device can tolerate unlimited delay for L1 exit (DevCtl Endpoint
L1 Acceptable Latency is unlimited) and it makes no guarantees about
how fast it can exit L1 (LnkCap L1 Exit Latency is also unlimited), so
I think there's basically no restriction on when it can enter ASPM
L1.0.

I have a hard time interpreting the L1.2 entry conditions in PCIe
r5.0, sec 5.5.1, but I can believe it enters L1.2 aggressively since
the device says it can tolerate any latencies.

If L1.2 exit takes 3100us, it could do ~60 L1 exits in 200ms.  I guess
config accesses and code execution can account for some of that, but
still seems like a lot of L1 entries/exits during suspend.  I wouldn't
think we would touch the device that much and that intermittently.

> Signed-off-by: Victor Ding <victording@google.com>
> 
> ---
> 
> Changes in v2:
> - Updated commit message to remove unnecessary information
> - Fixed a bug reading wrong register in pcie_save_aspm_control
> - Updated to reuse existing pcie_config_aspm_dev where possible
> - Fixed goto label style
> 
>  drivers/pci/pci.c       | 18 +++++++++++++++---
>  drivers/pci/pci.h       |  6 ++++++
>  drivers/pci/pcie/aspm.c | 27 +++++++++++++++++++++++++++
>  include/linux/pci.h     |  1 +
>  4 files changed, 49 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 32011b7b4c04..9ea88953f90b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1542,6 +1542,10 @@ static void pci_restore_ltr_state(struct pci_dev *dev)
>  int pci_save_state(struct pci_dev *dev)
>  {
>  	int i;
> +
> +	pcie_save_aspm_control(dev);
> +	pcie_disable_aspm(dev);

If I understand this patch correctly, it basically does this:

    pci_save_state
  +   pcie_save_aspm_control
  +   pcie_disable_aspm
      <save state>
  +   pcie_restore_aspm_control

The <save state> part is just a bunch of config reads with very little
other code execution.  I'm really surprised that there's enough time
between config reads for the link to go to L1.  I guess you've
verified that this does speed up suspend significantly, but this just
doesn't make sense to me.

In the bugzilla you say the GL9750 can go to L1.2 after ~4us of
inactivity.  That's enough time for a lot of code execution.  We must
be missing something.  There's so much PCI traffic during save/restore
that it should be easy to match up the analyzer trace with the code.
Can you get any insight into what's going on that way?

>  	/* XXX: 100% dword access ok here? */
>  	for (i = 0; i < 16; i++) {
>  		pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
> @@ -1552,18 +1556,22 @@ int pci_save_state(struct pci_dev *dev)
>  
>  	i = pci_save_pcie_state(dev);
>  	if (i != 0)
> -		return i;
> +		goto exit;
>  
>  	i = pci_save_pcix_state(dev);
>  	if (i != 0)
> -		return i;
> +		goto exit;
>  
>  	pci_save_ltr_state(dev);
>  	pci_save_aspm_l1ss_state(dev);
>  	pci_save_dpc_state(dev);
>  	pci_save_aer_state(dev);
>  	pci_save_ptm_state(dev);
> -	return pci_save_vc_state(dev);
> +	i = pci_save_vc_state(dev);
> +
> +exit:
> +	pcie_restore_aspm_control(dev);
> +	return i;
>  }
>  EXPORT_SYMBOL(pci_save_state);
>  
> @@ -1661,6 +1669,8 @@ void pci_restore_state(struct pci_dev *dev)
>  	if (!dev->state_saved)
>  		return;
>  
> +	pcie_disable_aspm(dev);
> +
>  	/*
>  	 * Restore max latencies (in the LTR capability) before enabling
>  	 * LTR itself (in the PCIe capability).
> @@ -1689,6 +1699,8 @@ void pci_restore_state(struct pci_dev *dev)
>  	pci_enable_acs(dev);
>  	pci_restore_iov_state(dev);
>  
> +	pcie_restore_aspm_control(dev);
> +
>  	dev->state_saved = false;
>  }
>  EXPORT_SYMBOL(pci_restore_state);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index a81459159f6d..e074a0cbe73c 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -584,6 +584,9 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev);
>  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
>  void pci_save_aspm_l1ss_state(struct pci_dev *dev);
>  void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
> +void pcie_save_aspm_control(struct pci_dev *dev);
> +void pcie_restore_aspm_control(struct pci_dev *dev);
> +void pcie_disable_aspm(struct pci_dev *pdev);
>  #else
>  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
> @@ -591,6 +594,9 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
>  static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { }
>  static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { }
> +static inline void pcie_save_aspm_control(struct pci_dev *dev) { }
> +static inline void pcie_restore_aspm_control(struct pci_dev *dev) { }
> +static inline void pcie_disable_aspm(struct pci_dev *pdev) { }
>  #endif
>  
>  #ifdef CONFIG_PCIE_ECRC
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index a08e7d6dc248..e1e97db32e8b 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -784,6 +784,33 @@ static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
>  					   PCI_EXP_LNKCTL_ASPMC, val);
>  }
>  
> +void pcie_disable_aspm(struct pci_dev *pdev)
> +{
> +	if (!pci_is_pcie(pdev))
> +		return;
> +
> +	pcie_config_aspm_dev(pdev, 0);
> +}
> +
> +void pcie_save_aspm_control(struct pci_dev *pdev)
> +{
> +	u16 lnkctl;
> +
> +	if (!pci_is_pcie(pdev))
> +		return;
> +
> +	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl);
> +	pdev->saved_aspm_ctl = lnkctl & PCI_EXP_LNKCTL_ASPMC;
> +}
> +
> +void pcie_restore_aspm_control(struct pci_dev *pdev)
> +{
> +	if (!pci_is_pcie(pdev))
> +		return;
> +
> +	pcie_config_aspm_dev(pdev, pdev->saved_aspm_ctl);
> +}
> +
>  static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
>  {
>  	u32 upstream = 0, dwstream = 0;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b32126d26997..a21bfd6e3f89 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -387,6 +387,7 @@ struct pci_dev {
>  	unsigned int	ltr_path:1;	/* Latency Tolerance Reporting
>  					   supported from root to here */
>  	u16		l1ss;		/* L1SS Capability pointer */
> +	u16		saved_aspm_ctl; /* ASPM Control saved at suspend time */
>  #endif
>  	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
>  
> -- 
> 2.30.0.280.ga3ce27912f-goog
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state
  2021-03-11 17:34 ` Bjorn Helgaas
@ 2021-07-26 22:03   ` Bjorn Helgaas
  2022-02-09  8:03     ` Victor Ding
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2021-07-26 22:03 UTC (permalink / raw)
  To: Victor Ding
  Cc: Ulf Hansson, Adrian Hunter, Ben Chuang, linux-pci, linux-kernel,
	linux-mmc, Bjorn Helgaas, Chris Packham, Kai-Heng Feng,
	Mika Westerberg, Saheed O. Bolarinwa, Vidya Sagar,
	Xiongfeng Wang

On Thu, Mar 11, 2021 at 11:34:33AM -0600, Bjorn Helgaas wrote:
> On Thu, Jan 28, 2021 at 03:52:42PM +0000, Victor Ding wrote:
> > Certain PCIe devices (e.g. GL9750) have high penalties (e.g. high Port
> > T_POWER_ON) when exiting L1 but enter L1 aggressively. As a result,
> > such devices enter and exit L1 frequently during pci_save_state and
> > pci_restore_state; eventually causing poor suspend/resume performance.
> > 
> > Based on the observation that PCI accesses dominance pci_save_state/
> > pci_restore_state plus these accesses are fairly close to each other, the
> > actual time the device could stay in low power states is negligible.
> > Therefore, the little power-saving benefit from ASPM during suspend/resume
> > does not overweight the performance degradation caused by high L1 exit
> > penalties.
> > 
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211187
> 
> Thanks for this!
> 
> This device can tolerate unlimited delay for L1 exit (DevCtl Endpoint
> L1 Acceptable Latency is unlimited) and it makes no guarantees about
> how fast it can exit L1 (LnkCap L1 Exit Latency is also unlimited), so
> I think there's basically no restriction on when it can enter ASPM
> L1.0.
> 
> I have a hard time interpreting the L1.2 entry conditions in PCIe
> r5.0, sec 5.5.1, but I can believe it enters L1.2 aggressively since
> the device says it can tolerate any latencies.
> 
> If L1.2 exit takes 3100us, it could do ~60 L1 exits in 200ms.  I guess
> config accesses and code execution can account for some of that, but
> still seems like a lot of L1 entries/exits during suspend.  I wouldn't
> think we would touch the device that much and that intermittently.
> 
> > Signed-off-by: Victor Ding <victording@google.com>
> > 
> > ---
> > 
> > Changes in v2:
> > - Updated commit message to remove unnecessary information
> > - Fixed a bug reading wrong register in pcie_save_aspm_control
> > - Updated to reuse existing pcie_config_aspm_dev where possible
> > - Fixed goto label style
> > 
> >  drivers/pci/pci.c       | 18 +++++++++++++++---
> >  drivers/pci/pci.h       |  6 ++++++
> >  drivers/pci/pcie/aspm.c | 27 +++++++++++++++++++++++++++
> >  include/linux/pci.h     |  1 +
> >  4 files changed, 49 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 32011b7b4c04..9ea88953f90b 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -1542,6 +1542,10 @@ static void pci_restore_ltr_state(struct pci_dev *dev)
> >  int pci_save_state(struct pci_dev *dev)
> >  {
> >  	int i;
> > +
> > +	pcie_save_aspm_control(dev);
> > +	pcie_disable_aspm(dev);
> 
> If I understand this patch correctly, it basically does this:
> 
>     pci_save_state
>   +   pcie_save_aspm_control
>   +   pcie_disable_aspm
>       <save state>
>   +   pcie_restore_aspm_control
> 
> The <save state> part is just a bunch of config reads with very little
> other code execution.  I'm really surprised that there's enough time
> between config reads for the link to go to L1.  I guess you've
> verified that this does speed up suspend significantly, but this just
> doesn't make sense to me.
> 
> In the bugzilla you say the GL9750 can go to L1.2 after ~4us of
> inactivity.  That's enough time for a lot of code execution.  We must
> be missing something.  There's so much PCI traffic during save/restore
> that it should be easy to match up the analyzer trace with the code.
> Can you get any insight into what's going on that way?

I'm dropping this for now, pending a better understanding of what's
going on.

> >  	/* XXX: 100% dword access ok here? */
> >  	for (i = 0; i < 16; i++) {
> >  		pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
> > @@ -1552,18 +1556,22 @@ int pci_save_state(struct pci_dev *dev)
> >  
> >  	i = pci_save_pcie_state(dev);
> >  	if (i != 0)
> > -		return i;
> > +		goto exit;
> >  
> >  	i = pci_save_pcix_state(dev);
> >  	if (i != 0)
> > -		return i;
> > +		goto exit;
> >  
> >  	pci_save_ltr_state(dev);
> >  	pci_save_aspm_l1ss_state(dev);
> >  	pci_save_dpc_state(dev);
> >  	pci_save_aer_state(dev);
> >  	pci_save_ptm_state(dev);
> > -	return pci_save_vc_state(dev);
> > +	i = pci_save_vc_state(dev);
> > +
> > +exit:
> > +	pcie_restore_aspm_control(dev);
> > +	return i;
> >  }
> >  EXPORT_SYMBOL(pci_save_state);
> >  
> > @@ -1661,6 +1669,8 @@ void pci_restore_state(struct pci_dev *dev)
> >  	if (!dev->state_saved)
> >  		return;
> >  
> > +	pcie_disable_aspm(dev);
> > +
> >  	/*
> >  	 * Restore max latencies (in the LTR capability) before enabling
> >  	 * LTR itself (in the PCIe capability).
> > @@ -1689,6 +1699,8 @@ void pci_restore_state(struct pci_dev *dev)
> >  	pci_enable_acs(dev);
> >  	pci_restore_iov_state(dev);
> >  
> > +	pcie_restore_aspm_control(dev);
> > +
> >  	dev->state_saved = false;
> >  }
> >  EXPORT_SYMBOL(pci_restore_state);
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index a81459159f6d..e074a0cbe73c 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -584,6 +584,9 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev);
> >  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> >  void pci_save_aspm_l1ss_state(struct pci_dev *dev);
> >  void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
> > +void pcie_save_aspm_control(struct pci_dev *dev);
> > +void pcie_restore_aspm_control(struct pci_dev *dev);
> > +void pcie_disable_aspm(struct pci_dev *pdev);
> >  #else
> >  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
> >  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
> > @@ -591,6 +594,9 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
> >  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> >  static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { }
> >  static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { }
> > +static inline void pcie_save_aspm_control(struct pci_dev *dev) { }
> > +static inline void pcie_restore_aspm_control(struct pci_dev *dev) { }
> > +static inline void pcie_disable_aspm(struct pci_dev *pdev) { }
> >  #endif
> >  
> >  #ifdef CONFIG_PCIE_ECRC
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index a08e7d6dc248..e1e97db32e8b 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -784,6 +784,33 @@ static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
> >  					   PCI_EXP_LNKCTL_ASPMC, val);
> >  }
> >  
> > +void pcie_disable_aspm(struct pci_dev *pdev)
> > +{
> > +	if (!pci_is_pcie(pdev))
> > +		return;
> > +
> > +	pcie_config_aspm_dev(pdev, 0);
> > +}
> > +
> > +void pcie_save_aspm_control(struct pci_dev *pdev)
> > +{
> > +	u16 lnkctl;
> > +
> > +	if (!pci_is_pcie(pdev))
> > +		return;
> > +
> > +	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl);
> > +	pdev->saved_aspm_ctl = lnkctl & PCI_EXP_LNKCTL_ASPMC;
> > +}
> > +
> > +void pcie_restore_aspm_control(struct pci_dev *pdev)
> > +{
> > +	if (!pci_is_pcie(pdev))
> > +		return;
> > +
> > +	pcie_config_aspm_dev(pdev, pdev->saved_aspm_ctl);
> > +}
> > +
> >  static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
> >  {
> >  	u32 upstream = 0, dwstream = 0;
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index b32126d26997..a21bfd6e3f89 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -387,6 +387,7 @@ struct pci_dev {
> >  	unsigned int	ltr_path:1;	/* Latency Tolerance Reporting
> >  					   supported from root to here */
> >  	u16		l1ss;		/* L1SS Capability pointer */
> > +	u16		saved_aspm_ctl; /* ASPM Control saved at suspend time */
> >  #endif
> >  	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
> >  
> > -- 
> > 2.30.0.280.ga3ce27912f-goog
> > 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state
  2021-07-26 22:03   ` Bjorn Helgaas
@ 2022-02-09  8:03     ` Victor Ding
  2022-07-18 16:21       ` Dmytro Maluka
  0 siblings, 1 reply; 6+ messages in thread
From: Victor Ding @ 2022-02-09  8:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Ulf Hansson, Adrian Hunter, Ben Chuang, linux-pci, LKML,
	linux-mmc, Bjorn Helgaas, Chris Packham, Kai-Heng Feng,
	Mika Westerberg, Saheed O. Bolarinwa, Vidya Sagar,
	Xiongfeng Wang

I sincerely apologize that I missed your two emails last year. Please find my
comments embedded below.

On Tue, Jul 27, 2021 at 8:03 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Mar 11, 2021 at 11:34:33AM -0600, Bjorn Helgaas wrote:
> > On Thu, Jan 28, 2021 at 03:52:42PM +0000, Victor Ding wrote:
> > > Certain PCIe devices (e.g. GL9750) have high penalties (e.g. high Port
> > > T_POWER_ON) when exiting L1 but enter L1 aggressively. As a result,
> > > such devices enter and exit L1 frequently during pci_save_state and
> > > pci_restore_state; eventually causing poor suspend/resume performance.
> > >
> > > Based on the observation that PCI accesses dominance pci_save_state/
> > > pci_restore_state plus these accesses are fairly close to each other, the
> > > actual time the device could stay in low power states is negligible.
> > > Therefore, the little power-saving benefit from ASPM during suspend/resume
> > > does not overweight the performance degradation caused by high L1 exit
> > > penalties.
> > >
> > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211187
> >
> > Thanks for this!
> >
> > This device can tolerate unlimited delay for L1 exit (DevCtl Endpoint
> > L1 Acceptable Latency is unlimited) and it makes no guarantees about
> > how fast it can exit L1 (LnkCap L1 Exit Latency is also unlimited), so
> > I think there's basically no restriction on when it can enter ASPM
> > L1.0.
> >
> > I have a hard time interpreting the L1.2 entry conditions in PCIe
> > r5.0, sec 5.5.1, but I can believe it enters L1.2 aggressively since
> > the device says it can tolerate any latencies.
> >
> > If L1.2 exit takes 3100us, it could do ~60 L1 exits in 200ms.  I guess
> > config accesses and code execution can account for some of that, but
> > still seems like a lot of L1 entries/exits during suspend.  I wouldn't
> > think we would touch the device that much and that intermittently.
> >
> > > Signed-off-by: Victor Ding <victording@google.com>
> > >
> > > ---
> > >
> > > Changes in v2:
> > > - Updated commit message to remove unnecessary information
> > > - Fixed a bug reading wrong register in pcie_save_aspm_control
> > > - Updated to reuse existing pcie_config_aspm_dev where possible
> > > - Fixed goto label style
> > >
> > >  drivers/pci/pci.c       | 18 +++++++++++++++---
> > >  drivers/pci/pci.h       |  6 ++++++
> > >  drivers/pci/pcie/aspm.c | 27 +++++++++++++++++++++++++++
> > >  include/linux/pci.h     |  1 +
> > >  4 files changed, 49 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 32011b7b4c04..9ea88953f90b 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -1542,6 +1542,10 @@ static void pci_restore_ltr_state(struct pci_dev *dev)
> > >  int pci_save_state(struct pci_dev *dev)
> > >  {
> > >     int i;
> > > +
> > > +   pcie_save_aspm_control(dev);
> > > +   pcie_disable_aspm(dev);
> >
> > If I understand this patch correctly, it basically does this:
> >
> >     pci_save_state
> >   +   pcie_save_aspm_control
> >   +   pcie_disable_aspm
> >       <save state>
> >   +   pcie_restore_aspm_control
> >
> > The <save state> part is just a bunch of config reads with very little
> > other code execution.  I'm really surprised that there's enough time
> > between config reads for the link to go to L1.  I guess you've
> > verified that this does speed up suspend significantly, but this just
> > doesn't make sense to me.
> >
> > In the bugzilla you say the GL9750 can go to L1.2 after ~4us of
> > inactivity.  That's enough time for a lot of code execution.  We must
> > be missing something.  There's so much PCI traffic during save/restore
> > that it should be easy to match up the analyzer trace with the code.
> > Can you get any insight into what's going on that way?

Unfortunately I did not and still do not have access to the required analyzers.
However, I received an analyzer trace screenshot confirming GL9750 indeed
went idle for ~4.2us. (I've attached it to the bugzilla).

I agree that 4us is more than sufficient for a lot of code execution, especially
these PCI config space accesses for the same device appear to be fairly
close to each other. However, as device resumes occur in parallel and a
global mutex is required for each access, these PCI config space accesses
from multiple devices are serialized arbitrarily causing accesses from the
same device far apart from each other.

The hypothesis came from my observation that PCI config space accesses
from different devices in the slow runs were always interleaved, while they
were mostly grouped together in the fast runs. To further prove the hypothesis,
I added a global lock when save/restore PCI state, forcing the related PCI
config space accesses from the same device grouped together, I always got
fast runs.

>
> I'm dropping this for now, pending a better understanding of what's
> going on.
>
> > >     /* XXX: 100% dword access ok here? */
> > >     for (i = 0; i < 16; i++) {
> > >             pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
> > > @@ -1552,18 +1556,22 @@ int pci_save_state(struct pci_dev *dev)
> > >
> > >     i = pci_save_pcie_state(dev);
> > >     if (i != 0)
> > > -           return i;
> > > +           goto exit;
> > >
> > >     i = pci_save_pcix_state(dev);
> > >     if (i != 0)
> > > -           return i;
> > > +           goto exit;
> > >
> > >     pci_save_ltr_state(dev);
> > >     pci_save_aspm_l1ss_state(dev);
> > >     pci_save_dpc_state(dev);
> > >     pci_save_aer_state(dev);
> > >     pci_save_ptm_state(dev);
> > > -   return pci_save_vc_state(dev);
> > > +   i = pci_save_vc_state(dev);
> > > +
> > > +exit:
> > > +   pcie_restore_aspm_control(dev);
> > > +   return i;
> > >  }
> > >  EXPORT_SYMBOL(pci_save_state);
> > >
> > > @@ -1661,6 +1669,8 @@ void pci_restore_state(struct pci_dev *dev)
> > >     if (!dev->state_saved)
> > >             return;
> > >
> > > +   pcie_disable_aspm(dev);
> > > +
> > >     /*
> > >      * Restore max latencies (in the LTR capability) before enabling
> > >      * LTR itself (in the PCIe capability).
> > > @@ -1689,6 +1699,8 @@ void pci_restore_state(struct pci_dev *dev)
> > >     pci_enable_acs(dev);
> > >     pci_restore_iov_state(dev);
> > >
> > > +   pcie_restore_aspm_control(dev);
> > > +
> > >     dev->state_saved = false;
> > >  }
> > >  EXPORT_SYMBOL(pci_restore_state);
> > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > index a81459159f6d..e074a0cbe73c 100644
> > > --- a/drivers/pci/pci.h
> > > +++ b/drivers/pci/pci.h
> > > @@ -584,6 +584,9 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev);
> > >  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> > >  void pci_save_aspm_l1ss_state(struct pci_dev *dev);
> > >  void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
> > > +void pcie_save_aspm_control(struct pci_dev *dev);
> > > +void pcie_restore_aspm_control(struct pci_dev *dev);
> > > +void pcie_disable_aspm(struct pci_dev *pdev);
> > >  #else
> > >  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
> > >  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
> > > @@ -591,6 +594,9 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
> > >  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> > >  static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { }
> > >  static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { }
> > > +static inline void pcie_save_aspm_control(struct pci_dev *dev) { }
> > > +static inline void pcie_restore_aspm_control(struct pci_dev *dev) { }
> > > +static inline void pcie_disable_aspm(struct pci_dev *pdev) { }
> > >  #endif
> > >
> > >  #ifdef CONFIG_PCIE_ECRC
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index a08e7d6dc248..e1e97db32e8b 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -784,6 +784,33 @@ static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
> > >                                        PCI_EXP_LNKCTL_ASPMC, val);
> > >  }
> > >
> > > +void pcie_disable_aspm(struct pci_dev *pdev)
> > > +{
> > > +   if (!pci_is_pcie(pdev))
> > > +           return;
> > > +
> > > +   pcie_config_aspm_dev(pdev, 0);
> > > +}
> > > +
> > > +void pcie_save_aspm_control(struct pci_dev *pdev)
> > > +{
> > > +   u16 lnkctl;
> > > +
> > > +   if (!pci_is_pcie(pdev))
> > > +           return;
> > > +
> > > +   pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl);
> > > +   pdev->saved_aspm_ctl = lnkctl & PCI_EXP_LNKCTL_ASPMC;
> > > +}
> > > +
> > > +void pcie_restore_aspm_control(struct pci_dev *pdev)
> > > +{
> > > +   if (!pci_is_pcie(pdev))
> > > +           return;
> > > +
> > > +   pcie_config_aspm_dev(pdev, pdev->saved_aspm_ctl);
> > > +}
> > > +
> > >  static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
> > >  {
> > >     u32 upstream = 0, dwstream = 0;
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index b32126d26997..a21bfd6e3f89 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -387,6 +387,7 @@ struct pci_dev {
> > >     unsigned int    ltr_path:1;     /* Latency Tolerance Reporting
> > >                                        supported from root to here */
> > >     u16             l1ss;           /* L1SS Capability pointer */
> > > +   u16             saved_aspm_ctl; /* ASPM Control saved at suspend time */
> > >  #endif
> > >     unsigned int    eetlp_prefix_path:1;    /* End-to-End TLP Prefix */
> > >
> > > --
> > > 2.30.0.280.ga3ce27912f-goog
> > >

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state
  2022-02-09  8:03     ` Victor Ding
@ 2022-07-18 16:21       ` Dmytro Maluka
  2022-07-19 19:11         ` Dmytro Maluka
  0 siblings, 1 reply; 6+ messages in thread
From: Dmytro Maluka @ 2022-07-18 16:21 UTC (permalink / raw)
  To: Victor Ding, Bjorn Helgaas
  Cc: Ulf Hansson, Adrian Hunter, Ben Chuang, linux-pci, LKML,
	linux-mmc, Bjorn Helgaas, Chris Packham, Kai-Heng Feng,
	Mika Westerberg, Saheed O. Bolarinwa, Vidya Sagar,
	Xiongfeng Wang, Thomas Gleixner, Grzegorz Jaszczyk,
	Tomasz Nowicki, Zide Chen

Hi Victor and Bjorn,

On 2/9/22 09:03, Victor Ding wrote:
> I sincerely apologize that I missed your two emails last year. Please find my
> comments embedded below.
> 
> On Tue, Jul 27, 2021 at 8:03 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>> On Thu, Mar 11, 2021 at 11:34:33AM -0600, Bjorn Helgaas wrote:
>>> On Thu, Jan 28, 2021 at 03:52:42PM +0000, Victor Ding wrote:
>>>> Certain PCIe devices (e.g. GL9750) have high penalties (e.g. high Port
>>>> T_POWER_ON) when exiting L1 but enter L1 aggressively. As a result,
>>>> such devices enter and exit L1 frequently during pci_save_state and
>>>> pci_restore_state; eventually causing poor suspend/resume performance.
>>>>
>>>> Based on the observation that PCI accesses dominance pci_save_state/
>>>> pci_restore_state plus these accesses are fairly close to each other, the
>>>> actual time the device could stay in low power states is negligible.
>>>> Therefore, the little power-saving benefit from ASPM during suspend/resume
>>>> does not overweight the performance degradation caused by high L1 exit
>>>> penalties.
>>>>
>>>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211187
>>>
>>> Thanks for this!
>>>
>>> This device can tolerate unlimited delay for L1 exit (DevCtl Endpoint
>>> L1 Acceptable Latency is unlimited) and it makes no guarantees about
>>> how fast it can exit L1 (LnkCap L1 Exit Latency is also unlimited), so
>>> I think there's basically no restriction on when it can enter ASPM
>>> L1.0.
>>>
>>> I have a hard time interpreting the L1.2 entry conditions in PCIe
>>> r5.0, sec 5.5.1, but I can believe it enters L1.2 aggressively since
>>> the device says it can tolerate any latencies.
>>>
>>> If L1.2 exit takes 3100us, it could do ~60 L1 exits in 200ms.  I guess
>>> config accesses and code execution can account for some of that, but
>>> still seems like a lot of L1 entries/exits during suspend.  I wouldn't
>>> think we would touch the device that much and that intermittently.
>>>
>>>> Signed-off-by: Victor Ding <victording@google.com>
>>>>
>>>> ---
>>>>
>>>> Changes in v2:
>>>> - Updated commit message to remove unnecessary information
>>>> - Fixed a bug reading wrong register in pcie_save_aspm_control
>>>> - Updated to reuse existing pcie_config_aspm_dev where possible
>>>> - Fixed goto label style
>>>>
>>>>  drivers/pci/pci.c       | 18 +++++++++++++++---
>>>>  drivers/pci/pci.h       |  6 ++++++
>>>>  drivers/pci/pcie/aspm.c | 27 +++++++++++++++++++++++++++
>>>>  include/linux/pci.h     |  1 +
>>>>  4 files changed, 49 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>> index 32011b7b4c04..9ea88953f90b 100644
>>>> --- a/drivers/pci/pci.c
>>>> +++ b/drivers/pci/pci.c
>>>> @@ -1542,6 +1542,10 @@ static void pci_restore_ltr_state(struct pci_dev *dev)
>>>>  int pci_save_state(struct pci_dev *dev)
>>>>  {
>>>>     int i;
>>>> +
>>>> +   pcie_save_aspm_control(dev);
>>>> +   pcie_disable_aspm(dev);
>>>
>>> If I understand this patch correctly, it basically does this:
>>>
>>>     pci_save_state
>>>   +   pcie_save_aspm_control
>>>   +   pcie_disable_aspm
>>>       <save state>
>>>   +   pcie_restore_aspm_control
>>>
>>> The <save state> part is just a bunch of config reads with very little
>>> other code execution.  I'm really surprised that there's enough time
>>> between config reads for the link to go to L1.  I guess you've
>>> verified that this does speed up suspend significantly, but this just
>>> doesn't make sense to me.
>>>
>>> In the bugzilla you say the GL9750 can go to L1.2 after ~4us of
>>> inactivity.  That's enough time for a lot of code execution.  We must
>>> be missing something.  There's so much PCI traffic during save/restore
>>> that it should be easy to match up the analyzer trace with the code.
>>> Can you get any insight into what's going on that way?
> 
> Unfortunately I did not and still do not have access to the required analyzers.
> However, I received an analyzer trace screenshot confirming GL9750 indeed
> went idle for ~4.2us. (I've attached it to the bugzilla).
> 
> I agree that 4us is more than sufficient for a lot of code execution, especially
> these PCI config space accesses for the same device appear to be fairly
> close to each other. However, as device resumes occur in parallel and a
> global mutex is required for each access, these PCI config space accesses
> from multiple devices are serialized arbitrarily causing accesses from the
> same device far apart from each other.
> 
> The hypothesis came from my observation that PCI config space accesses
> from different devices in the slow runs were always interleaved, while they
> were mostly grouped together in the fast runs. To further prove the hypothesis,
> I added a global lock when save/restore PCI state, forcing the related PCI
> config space accesses from the same device grouped together, I always got
> fast runs.

I also happen to be working with a platform with GL9750 and I can
confirm Victor's observations and conclusion: PCI config accesses in
pci_save_state()/pci_restore_state() are so far apart in time because
they are arbitrarily serialized with PCI config accesses for other
devices during parallel suspend/resume, since they use the global
pci_config_lock in pci_conf1_read()/pci_conf1_write().

By the way, there is an easier way to confirm this conclusion: instead
of adding a global lock around PCI state save/restore as Victor did,
we can just disable parallel suspend/resume via
"echo 0 >/sys/power/pm_async".

So it seems we understand what's going on. Can we move forward with
merging Victor's patch?

While we're at it, I'm also wondering why for the basic PCI config (the
first 256 bytes) Linux on x86 always uses the legacy 0xCF8/0xCFC method
instead of MMCFG, even if MMCFG is available. The legacy method is
inherently non-atomic and does require the global lock, while the MMCFG
method generally doesn't, so using MMCFG would significantly speed up
PCI config accesses in high-contention scenarios like the parallel
suspend/resume.

I've tried the below change which forces using MMCFG for the first 256
bytes, and indeed, it makes suspend/resume of individual PCI devices
with pm_async=1 almost as fast as with pm_async=0. In particular, it
fixes the problem with slow GL9750 suspend/resume even without Victor's
patch.

--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -40,20 +40,20 @@ const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
 int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
                                                int reg, int len, u32 *val)
 {
-       if (domain == 0 && reg < 256 && raw_pci_ops)
-               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
        if (raw_pci_ext_ops)
                return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+       if (domain == 0 && reg < 256 && raw_pci_ops)
+               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
        return -EINVAL;
 }
 
 int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
                                                int reg, int len, u32 val)
 {
-       if (domain == 0 && reg < 256 && raw_pci_ops)
-               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
        if (raw_pci_ext_ops)
                return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+       if (domain == 0 && reg < 256 && raw_pci_ops)
+               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
        return -EINVAL;
 }
 

Sounds good if I submit a patch like this? (I'm not suggesting it
instead of Victor's patch, rather as a separate improvement.)

Thanks,
Dmytro

> 
>>
>> I'm dropping this for now, pending a better understanding of what's
>> going on.
>>
>>>>     /* XXX: 100% dword access ok here? */
>>>>     for (i = 0; i < 16; i++) {
>>>>             pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
>>>> @@ -1552,18 +1556,22 @@ int pci_save_state(struct pci_dev *dev)
>>>>
>>>>     i = pci_save_pcie_state(dev);
>>>>     if (i != 0)
>>>> -           return i;
>>>> +           goto exit;
>>>>
>>>>     i = pci_save_pcix_state(dev);
>>>>     if (i != 0)
>>>> -           return i;
>>>> +           goto exit;
>>>>
>>>>     pci_save_ltr_state(dev);
>>>>     pci_save_aspm_l1ss_state(dev);
>>>>     pci_save_dpc_state(dev);
>>>>     pci_save_aer_state(dev);
>>>>     pci_save_ptm_state(dev);
>>>> -   return pci_save_vc_state(dev);
>>>> +   i = pci_save_vc_state(dev);
>>>> +
>>>> +exit:
>>>> +   pcie_restore_aspm_control(dev);
>>>> +   return i;
>>>>  }
>>>>  EXPORT_SYMBOL(pci_save_state);
>>>>
>>>> @@ -1661,6 +1669,8 @@ void pci_restore_state(struct pci_dev *dev)
>>>>     if (!dev->state_saved)
>>>>             return;
>>>>
>>>> +   pcie_disable_aspm(dev);
>>>> +
>>>>     /*
>>>>      * Restore max latencies (in the LTR capability) before enabling
>>>>      * LTR itself (in the PCIe capability).
>>>> @@ -1689,6 +1699,8 @@ void pci_restore_state(struct pci_dev *dev)
>>>>     pci_enable_acs(dev);
>>>>     pci_restore_iov_state(dev);
>>>>
>>>> +   pcie_restore_aspm_control(dev);
>>>> +
>>>>     dev->state_saved = false;
>>>>  }
>>>>  EXPORT_SYMBOL(pci_restore_state);
>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>> index a81459159f6d..e074a0cbe73c 100644
>>>> --- a/drivers/pci/pci.h
>>>> +++ b/drivers/pci/pci.h
>>>> @@ -584,6 +584,9 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev);
>>>>  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
>>>>  void pci_save_aspm_l1ss_state(struct pci_dev *dev);
>>>>  void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
>>>> +void pcie_save_aspm_control(struct pci_dev *dev);
>>>> +void pcie_restore_aspm_control(struct pci_dev *dev);
>>>> +void pcie_disable_aspm(struct pci_dev *pdev);
>>>>  #else
>>>>  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
>>>>  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
>>>> @@ -591,6 +594,9 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
>>>>  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
>>>>  static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { }
>>>>  static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { }
>>>> +static inline void pcie_save_aspm_control(struct pci_dev *dev) { }
>>>> +static inline void pcie_restore_aspm_control(struct pci_dev *dev) { }
>>>> +static inline void pcie_disable_aspm(struct pci_dev *pdev) { }
>>>>  #endif
>>>>
>>>>  #ifdef CONFIG_PCIE_ECRC
>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>> index a08e7d6dc248..e1e97db32e8b 100644
>>>> --- a/drivers/pci/pcie/aspm.c
>>>> +++ b/drivers/pci/pcie/aspm.c
>>>> @@ -784,6 +784,33 @@ static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
>>>>                                        PCI_EXP_LNKCTL_ASPMC, val);
>>>>  }
>>>>
>>>> +void pcie_disable_aspm(struct pci_dev *pdev)
>>>> +{
>>>> +   if (!pci_is_pcie(pdev))
>>>> +           return;
>>>> +
>>>> +   pcie_config_aspm_dev(pdev, 0);
>>>> +}
>>>> +
>>>> +void pcie_save_aspm_control(struct pci_dev *pdev)
>>>> +{
>>>> +   u16 lnkctl;
>>>> +
>>>> +   if (!pci_is_pcie(pdev))
>>>> +           return;
>>>> +
>>>> +   pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl);
>>>> +   pdev->saved_aspm_ctl = lnkctl & PCI_EXP_LNKCTL_ASPMC;
>>>> +}
>>>> +
>>>> +void pcie_restore_aspm_control(struct pci_dev *pdev)
>>>> +{
>>>> +   if (!pci_is_pcie(pdev))
>>>> +           return;
>>>> +
>>>> +   pcie_config_aspm_dev(pdev, pdev->saved_aspm_ctl);
>>>> +}
>>>> +
>>>>  static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
>>>>  {
>>>>     u32 upstream = 0, dwstream = 0;
>>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>>> index b32126d26997..a21bfd6e3f89 100644
>>>> --- a/include/linux/pci.h
>>>> +++ b/include/linux/pci.h
>>>> @@ -387,6 +387,7 @@ struct pci_dev {
>>>>     unsigned int    ltr_path:1;     /* Latency Tolerance Reporting
>>>>                                        supported from root to here */
>>>>     u16             l1ss;           /* L1SS Capability pointer */
>>>> +   u16             saved_aspm_ctl; /* ASPM Control saved at suspend time */
>>>>  #endif
>>>>     unsigned int    eetlp_prefix_path:1;    /* End-to-End TLP Prefix */
>>>>
>>>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state
  2022-07-18 16:21       ` Dmytro Maluka
@ 2022-07-19 19:11         ` Dmytro Maluka
  0 siblings, 0 replies; 6+ messages in thread
From: Dmytro Maluka @ 2022-07-19 19:11 UTC (permalink / raw)
  To: Victor Ding, Bjorn Helgaas
  Cc: Ulf Hansson, Adrian Hunter, Ben Chuang, linux-pci, LKML,
	linux-mmc, Bjorn Helgaas, Chris Packham, Kai-Heng Feng,
	Mika Westerberg, Saheed O. Bolarinwa, Vidya Sagar,
	Xiongfeng Wang, Thomas Gleixner, Grzegorz Jaszczyk,
	Tomasz Nowicki, Zide Chen

On 7/18/22 18:21, Dmytro Maluka wrote:
> While we're at it, I'm also wondering why for the basic PCI config (the
> first 256 bytes) Linux on x86 always uses the legacy 0xCF8/0xCFC method
> instead of MMCFG, even if MMCFG is available. The legacy method is
> inherently non-atomic and does require the global lock, while the MMCFG
> method generally doesn't, so using MMCFG would significantly speed up
> PCI config accesses in high-contention scenarios like the parallel
> suspend/resume.
> 
> I've tried the below change which forces using MMCFG for the first 256
> bytes, and indeed, it makes suspend/resume of individual PCI devices
> with pm_async=1 almost as fast as with pm_async=0. In particular, it
> fixes the problem with slow GL9750 suspend/resume even without Victor's
> patch.
> 
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -40,20 +40,20 @@ const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
>  int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                 int reg, int len, u32 *val)
>  {
> -       if (domain == 0 && reg < 256 && raw_pci_ops)
> -               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
>         if (raw_pci_ext_ops)
>                 return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
> +       if (domain == 0 && reg < 256 && raw_pci_ops)
> +               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
>         return -EINVAL;
>  }
>  
>  int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                 int reg, int len, u32 val)
>  {
> -       if (domain == 0 && reg < 256 && raw_pci_ops)
> -               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
>         if (raw_pci_ext_ops)
>                 return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
> +       if (domain == 0 && reg < 256 && raw_pci_ops)
> +               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
>         return -EINVAL;
>  }
>  
> 
> Sounds good if I submit a patch like this? (I'm not suggesting it
> instead of Victor's patch, rather as a separate improvement.)

Ok, I found that a similar change was already suggested in the past by
Thomas [1] and got rejected by Linus [2].

Linus' arguments sound reasonable, and I understand that back then the
only known case of an issue with PCI config lock contention was with
Intel PMU counter registers which are in the extended config space
anyway. But now we know another case of such a contention, concerning
the basic config space too, namely: suspending or resuming many PCI
devices in parallel during system suspend/resume.

I've checked that on my box using MMCFG instead of Type 1 (i.e. using my
above patch) reduces the total suspend or resume time by 15-20 ms on
average. (I also had Victor's patch applied all the time, i.e. the ASPM
L1 exit latency issue was already resolved, so my test was about the PCI
lock contention in general.) So, not exactly a major improvement, yet
not exactly a negligible one. Maybe it's worth optimizing, maybe not.

Anyway, that's a bit of digression. Let's focus primarily on Victor's
ASPM patch.

[1]
https://lore.kernel.org/all/tip-b5b0f00c760b6e9673ab79b88ede2f3c7a039f74@git.kernel.org/

[2]
https://lore.kernel.org/all/CA+55aFwi0tkdugfqNEz6M28RXM2jx6WpaDF4nfA=doUVdZgUNQ@mail.gmail.com/

Thanks,
Dmytro

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-07-19 19:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-28 15:52 [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state Victor Ding
2021-03-11 17:34 ` Bjorn Helgaas
2021-07-26 22:03   ` Bjorn Helgaas
2022-02-09  8:03     ` Victor Ding
2022-07-18 16:21       ` Dmytro Maluka
2022-07-19 19:11         ` Dmytro Maluka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).