* [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
From: Robert Hancock @ 2020-07-22  2:18 UTC
  To: linux-pci, linux-kernel; +Cc: Bjorn Helgaas, Robert Hancock, stable

Recently ASPM handling was changed to no longer disable ASPM on all
PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
devices don't seem to function properly with ASPM enabled, as they
cause the parent PCIe root port to report repeated AER timeout errors.
In addition to flooding the kernel log, this also causes the machine
to wake up immediately after suspend is initiated.

Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
Cc: stable@vger.kernel.org
Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
---
 drivers/pci/quirks.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 812bfc32ecb8..e5713114f2ab 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2330,6 +2330,19 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
 
+static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
+{
+	pci_info(dev, "Disabling ASPM L0s/L1\n");
+	pci_disable_link_state(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
+}
+
+/*
+ * ASM1083/1085 PCIe-PCI bridge devices cause AER timeout errors on the
+ * upstream PCIe root port when ASPM is enabled. At least L0s mode is affected,
+ * disable both L0s and L1 for now to be safe.
+ */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
+
 /*
  * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
  * Link bit cleared after starting the link retrain process to allow this
-- 
2.26.2
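
For readers unfamiliar with the quirk mechanism, the following is a
minimal user-space sketch of the idea behind the DECLARE_PCI_FIXUP_FINAL
entry in the patch: a table of (vendor, device) pairs, each with a hook
that runs when a matching device is found. It is not the kernel
implementation. The struct names, run_fixups(), and the flag values are
hypothetical and exist only for illustration; 0x1b21 is the ASMedia
vendor ID and 0x1080 is the device ID taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define VENDOR_ID_ASMEDIA 0x1b21 /* ASMedia PCI vendor ID */

/* Hypothetical flags standing in for PCIE_LINK_STATE_L0S / _L1. */
#define LINK_STATE_L0S 0x1u
#define LINK_STATE_L1  0x2u

/* Hypothetical stand-in for struct pci_dev. */
struct fake_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int aspm_disabled; /* link states the quirk turned off */
};

typedef void (*fixup_fn)(struct fake_dev *dev);

struct fixup {
	uint16_t vendor;
	uint16_t device;
	fixup_fn hook;
};

/* Mirrors the intent of the quirk in the patch: forbid both L0s and L1. */
static void quirk_disable_aspm_l0s_l1(struct fake_dev *dev)
{
	dev->aspm_disabled |= LINK_STATE_L0S | LINK_STATE_L1;
}

static const struct fixup fixups[] = {
	{ VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1 },
};

/* Run every hook whose vendor/device pair matches the device exactly. */
static void run_fixups(struct fake_dev *dev)
{
	for (size_t i = 0; i < sizeof(fixups) / sizeof(fixups[0]); i++) {
		if (fixups[i].vendor == dev->vendor &&
		    fixups[i].device == dev->device)
			fixups[i].hook(dev);
	}
}
```

The real fixup tables also support wildcard IDs and class matching,
which this sketch leaves out to keep the matching rule easy to see.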



* Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
From: Bjorn Helgaas @ 2020-07-22 17:40 UTC
  To: Robert Hancock
  Cc: linux-pci, linux-kernel, Bjorn Helgaas, stable, Puranjay Mohan

[+cc Puranjay]

On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> Recently ASPM handling was changed to no longer disable ASPM on all
> PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> devices don't seem to function properly with ASPM enabled, as they
> cause the parent PCIe root port to report repeated AER timeout errors.
> In addition to flooding the kernel log, this also causes the machine
> to wake up immediately after suspend is initiated.

Hi Robert, thanks a lot for the report of this problem
(https://lore.kernel.org/r/CADLC3L1R2hssRjxHJv9yhdN_7-hGw58rXSfNp-FraZh0Tw+gRw@mail.gmail.com
and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).

I'm pretty sure Linux ASPM support is missing some things.  This
problem might be a hardware problem where a quirk is the right
solution, but it could also be that it's a result of a Linux defect
that we should fix.

Could you collect the dmesg log and "sudo lspci -vvxxxx" output
somewhere (maybe a bugzilla.kernel.org issue)?  I want to figure out
whether L1 PM substates are enabled on this link, and whether
that's configured correctly.


* Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
From: Robert Hancock @ 2020-07-23  0:46 UTC
  To: Bjorn Helgaas
  Cc: linux-pci, linux-kernel, Bjorn Helgaas, stable, Puranjay Mohan

On Wed, Jul 22, 2020 at 11:40 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Puranjay]
>
> On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> > Recently ASPM handling was changed to no longer disable ASPM on all
> > PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> > devices don't seem to function properly with ASPM enabled, as they
> > cause the parent PCIe root port to report repeated AER timeout errors.
> > In addition to flooding the kernel log, this also causes the machine
> > to wake up immediately after suspend is initiated.
>
> Hi Robert, thanks a lot for the report of this problem
> (https://lore.kernel.org/r/CADLC3L1R2hssRjxHJv9yhdN_7-hGw58rXSfNp-FraZh0Tw+gRw@mail.gmail.com
> and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).
>
> I'm pretty sure Linux ASPM support is missing some things.  This
> problem might be a hardware problem where a quirk is the right
> solution, but it could also be that it's a result of a Linux defect
> that we should fix.
>
> Could you collect the dmesg log and "sudo lspci -vvxxxx" output
> somewhere (maybe a bugzilla.kernel.org issue)?  I want to figure out
> whether L1 PM substates are enabled on this link, and whether
> that's configured correctly.

Created a Bugzilla entry and added dmesg and lspci output:
https://bugzilla.kernel.org/show_bug.cgi?id=208667

As I noted in that report, I subsequently found this page on ASMedia's
site: https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114
which indicates this ASM1083 device has "No PCIe ASPM support". It's
not clear why this problem isn't occurring on Windows, however: either
it is not enabling ASPM, somehow it doesn't cause issues with the PCIe
link, or it is causing issues and just doesn't notify the user in any
way. I can try to check whether this bridge device ends up with ASPM
enabled under Windows 10.


* Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
From: Bjorn Helgaas @ 2020-07-23  1:04 UTC
  To: Robert Hancock
  Cc: linux-pci, linux-kernel, Bjorn Helgaas, stable, Puranjay Mohan

On Wed, Jul 22, 2020 at 06:46:06PM -0600, Robert Hancock wrote:
> On Wed, Jul 22, 2020 at 11:40 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> > > Recently ASPM handling was changed to no longer disable ASPM on all
> > > PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> > > devices don't seem to function properly with ASPM enabled, as they
> > > cause the parent PCIe root port to report repeated AER timeout errors.
> > > In addition to flooding the kernel log, this also causes the machine
> > > to wake up immediately after suspend is initiated.
> >
> > Hi Robert, thanks a lot for the report of this problem
> > (https://lore.kernel.org/r/CADLC3L1R2hssRjxHJv9yhdN_7-hGw58rXSfNp-FraZh0Tw+gRw@mail.gmail.com
> > and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).
> >
> > I'm pretty sure Linux ASPM support is missing some things.  This
> > problem might be a hardware problem where a quirk is the right
> > solution, but it could also be that it's a result of a Linux defect
> > that we should fix.
> >
> > Could you collect the dmesg log and "sudo lspci -vvxxxx" output
> > somewhere (maybe a bugzilla.kernel.org issue)?  I want to figure out
> > whether L1 PM substates are enabled on this link, and whether
> > that's configured correctly.
> 
> Created a Bugzilla entry and added dmesg and lspci output:
> https://bugzilla.kernel.org/show_bug.cgi?id=208667
> 
> As I noted in that report, I subsequently found this page on ASMedia's
> site: https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114
> which indicates this ASM1083 device has "No PCIe ASPM support".

How nice.  According to your lspci, the device itself claims to
support ASPM:

  02:00.0 ... ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge
    LnkCap: ... ASPM L0s L1 ...

but the web page claims otherwise.  That would mean the device is
defective for claiming something that's not true.  Or possibly those
capability bits can be set by BIOS.
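
For reference, the "ASPM L0s L1" text in the LnkCap line above reflects
the ASPM Support field, bits 11:10 of the Link Capabilities register.
Here is a minimal user-space sketch of that decoding. It is not kernel
or lspci code: the macro and function names are hypothetical, and only
the bit positions follow the PCIe register layout.

```c
#include <stdint.h>

/* ASPM Support field of the Link Capabilities register. */
#define LNKCAP_ASPM_MASK 0x00000c00u /* bits 11:10 */
#define LNKCAP_ASPM_L0S  0x00000400u /* bit 10: L0s supported */
#define LNKCAP_ASPM_L1   0x00000800u /* bit 11: L1 supported */

/* Return a short name for the ASPM states a raw LnkCap value advertises. */
static const char *lnkcap_aspm_support(uint32_t lnkcap)
{
	switch (lnkcap & LNKCAP_ASPM_MASK) {
	case LNKCAP_ASPM_L0S:
		return "L0s";
	case LNKCAP_ASPM_L1:
		return "L1";
	case LNKCAP_ASPM_L0S | LNKCAP_ASPM_L1:
		return "L0s L1";
	default:
		return "not supported";
	}
}
```

A device that sets both bits, as this bridge does, is telling software
that enabling either state is safe, which is why the advertised value
matters more to the OS than the vendor's web page.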

> It's not clear why this problem isn't occurring on Windows, however:
> either it is not enabling ASPM, somehow it doesn't cause issues with
> the PCIe link, or it is causing issues and just doesn't notify the
> user in any way. I can try to check whether this bridge device ends
> up with ASPM enabled under Windows 10.

If Windows *does* manage to enable ASPM, that would be interesting.  I
don't know whether Windows has a similar quirk mechanism.  I suppose
they must have *some* way to work around defective devices.

Bjorn


* Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
From: Robert Hancock @ 2020-07-23  1:41 UTC
  To: Bjorn Helgaas
  Cc: linux-pci, linux-kernel, Bjorn Helgaas, stable, Puranjay Mohan

On Wed, Jul 22, 2020 at 7:04 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Jul 22, 2020 at 06:46:06PM -0600, Robert Hancock wrote:
> > On Wed, Jul 22, 2020 at 11:40 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> > > > Recently ASPM handling was changed to no longer disable ASPM on all
> > > > PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> > > > devices don't seem to function properly with ASPM enabled, as they
> > > > cause the parent PCIe root port to report repeated AER timeout errors.
> > > > In addition to flooding the kernel log, this also causes the machine
> > > > to wake up immediately after suspend is initiated.
> > >
> > > Hi Robert, thanks a lot for the report of this problem
> > > (https://lore.kernel.org/r/CADLC3L1R2hssRjxHJv9yhdN_7-hGw58rXSfNp-FraZh0Tw+gRw@mail.gmail.com
> > > and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).
> > >
> > > I'm pretty sure Linux ASPM support is missing some things.  This
> > > problem might be a hardware problem where a quirk is the right
> > > solution, but it could also be that it's a result of a Linux defect
> > > that we should fix.
> > >
> > > Could you collect the dmesg log and "sudo lspci -vvxxxx" output
> > > somewhere (maybe a bugzilla.kernel.org issue)?  I want to figure out
> > > whether L1 PM substates are enabled on this link, and whether
> > > that's configured correctly.
> >
> > Created a Bugzilla entry and added dmesg and lspci output:
> > https://bugzilla.kernel.org/show_bug.cgi?id=208667
> >
> > As I noted in that report, I subsequently found this page on ASMedia's
> > site: https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114
> > which indicates this ASM1083 device has "No PCIe ASPM support".
>
> How nice.  According to your lspci, the device itself claims to
> support ASPM:
>
>   02:00.0 ... ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge
>     LnkCap: ... ASPM L0s L1 ...
>
> but the web page claims otherwise.  That would mean the device is
> defective for claiming something that's not true.  Or possibly those
> capability bits can be set by BIOS.
>
> > It's not clear why this problem isn't occurring on Windows, however:
> > either it is not enabling ASPM, somehow it doesn't cause issues with
> > the PCIe link, or it is causing issues and just doesn't notify the
> > user in any way. I can try to check whether this bridge device ends
> > up with ASPM enabled under Windows 10.
>
> If Windows *does* manage to enable ASPM, that would be interesting.  I
> don't know whether Windows has a similar quirk mechanism.  I suppose
> they must have *some* way to work around defective devices.

As I posted in the Bugzilla report, based on lspci output it appears
Windows does have ASPM L0s enabled for this bridge. However, it
appears to have the exact same problem: there are correctable PCIe
error entries showing up in the Windows system event log against the
root port the bridge is connected to. So I am thinking this hardware
is just broken with ASPM enabled.
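
For reference, whether a state is actually enabled (as opposed to merely
advertised) is read from the ASPM Control field, bits 1:0 of the Link
Control register, which lspci prints on its LnkCtl line. Here is a
minimal user-space sketch of that check. The macro and function names
are hypothetical; only the bit positions follow the PCIe register
layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASPM Control field of the Link Control register. */
#define LNKCTL_ASPM_MASK 0x0003u /* bits 1:0 */
#define LNKCTL_ASPM_L0S  0x0001u /* bit 0: L0s entry enabled */
#define LNKCTL_ASPM_L1   0x0002u /* bit 1: L1 entry enabled */

/* True if the raw LnkCtl value has L0s entry enabled. */
static bool aspm_l0s_enabled(uint16_t lnkctl)
{
	return (lnkctl & LNKCTL_ASPM_L0S) != 0;
}

/* True if the raw LnkCtl value has L1 entry enabled. */
static bool aspm_l1_enabled(uint16_t lnkctl)
{
	return (lnkctl & LNKCTL_ASPM_L1) != 0;
}

/* True if neither ASPM state is enabled on this end of the link. */
static bool aspm_disabled(uint16_t lnkctl)
{
	return (lnkctl & LNKCTL_ASPM_MASK) == 0;
}
```

A LnkCtl value with only bit 0 set corresponds to the "L0s enabled"
state described above for Windows.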


* Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge
From: Bjorn Helgaas @ 2020-07-29 23:43 UTC
  To: Robert Hancock; +Cc: linux-pci, linux-kernel, Bjorn Helgaas, stable

On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> Recently ASPM handling was changed to no longer disable ASPM on all
> PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> devices don't seem to function properly with ASPM enabled, as they
> cause the parent PCIe root port to report repeated AER timeout errors.
> In addition to flooding the kernel log, this also causes the machine
> to wake up immediately after suspend is initiated.
> 
> Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
> Cc: stable@vger.kernel.org
> Signed-off-by: Robert Hancock <hancockrwd@gmail.com>

I applied this to for-linus for v5.8, since 66ff14e59e8a was merged
for v5.8.  Thanks very much for finding, debugging, and fixing this!

66ff14e59e8a wasn't marked for stable, so if it *was* backported to
stable kernels, I assume whatever process backported it will also
catch this because of the Fixes: tag.

