All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ian Kumlien <ian.kumlien@gmail.com>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Derrick, Jonathan" <jonathan.derrick@intel.com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Rob Herring <robh@kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] PCI: vmd: Enable ASPM for mobile platforms
Date: Fri, 9 Oct 2020 16:34:30 +0200	[thread overview]
Message-ID: <CAA85sZszpwU4MNKzb=W3MaK8YHwfuEmdXkc03TJkDtOD6a3L0A@mail.gmail.com> (raw)
In-Reply-To: <2C086FFC-28F5-42BB-933E-0C1C4FDC738B@canonical.com>

On Thu, Oct 8, 2020 at 6:19 AM Kai-Heng Feng
<kai.heng.feng@canonical.com> wrote:
>
>
>
> > On Oct 7, 2020, at 21:30, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Wed, Oct 07, 2020 at 12:26:19PM +0800, Kai-Heng Feng wrote:
> >>> On Oct 6, 2020, at 03:19, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>> On Tue, Oct 06, 2020 at 02:40:32AM +0800, Kai-Heng Feng wrote:
> >>>>> On Oct 3, 2020, at 06:18, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>> On Wed, Sep 30, 2020 at 04:24:54PM +0800, Kai-Heng Feng wrote:
> >
> > ...
> >>>> I wonder whether other devices that add PCIe domain have the same
> >>>> behavior?  Maybe it's not a special case at all...
> >>>
> >>> What other devices are these?
> >>
> >> Controllers which add PCIe domain.
> >
> > I was looking for specific examples, not just a restatement of what
> > you said before.  I'm just curious because there are a lot of
> > controllers I'm not familiar with, and I can't think of an example.
> >
> >>>> I understand the end goal is to keep consistency for the entire ASPM
> >>>> logic. However I can't think of any possible solution right now.
> >>>>
> >>>>> - If we built with CONFIG_PCIEASPM_POWERSAVE=y, would that solve the
> >>>>>  SoC power state problem?
> >>>>
> >>>> Yes.
> >>>>
> >>>>> - What issues would CONFIG_PCIEASPM_POWERSAVE=y introduce?
> >>>>
> >>>> This will break many systems, at least for the 1st Gen Ryzen
> >>>> desktops and laptops.
> >>>>
> >>>> All PCIe ASPM are not enabled by BIOS, and those systems immediately
> >>>> freeze once ASPM is enabled.
> >>>
> >>> That indicates a defect in the Linux ASPM code.  We should fix that.
> >>> It should be safe to use CONFIG_PCIEASPM_POWERSAVE=y on every system.
> >>
> >> On those systems ASPM are also not enabled on Windows. So I think
> >> ASPM are disabled for a reason.
> >
> > If the platform knows ASPM needs to be disabled, it should be using
> > ACPI_FADT_NO_ASPM or _OSC to prevent the OS from using it.  And if
> > CONFIG_PCIEASPM_POWERSAVE=y means Linux enables ASPM when it
> > shouldn't, that's a Linux bug that we need to fix.
>
> Yes that's a bug which fixed by Ian's new patch.
>
> >
> >>> Are there bug reports for these? The info we would need to start with
> >>> includes "lspci -vv" and dmesg log (with CONFIG_PCIEASPM_DEFAULT=y).
> >>> If a console log with CONFIG_PCIEASPM_POWERSAVE=y is available, that
> >>> might be interesting, too.  We'll likely need to add some
> >>> instrumentation and do some experimentation, but in principle, this
> >>> should be fixable.
> >>
> >> Doing this is asking users to use hardware settings that ODM/OEM
> >> never tested, and I think the risk is really high.
> >
> > What?  That's not what I said at all.  I'm asking for information
> > about these hangs so we can fix them.  I'm not suggesting that you
> > should switch to CONFIG_PCIEASPM_POWERSAVE=y for the distro.
>
> Ah, I thought your suggestion is switching to CONFIG_PCIEASPM_POWERSAVE=y, because I sense you want to use that to cover the VMD ASPM this patch tries to solve.
>
> Do we have a conclusion how to enable VMD ASPM with CONFIG_PCIEASPM_DEFAULT=y?
>
> >
> > Let's back up.  You said:
> >
> >  CONFIG_PCIEASPM_POWERSAVE=y ... will break many systems, at least
> >  for the 1st Gen Ryzen desktops and laptops.
> >
> >  All PCIe ASPM are not enabled by BIOS, and those systems immediately
> >  freeze once ASPM is enabled.
> >
> > These system hangs might be caused by (1) some hardware issue that
> > causes a hang when ASPM is enabled even if it is configured correctly
> > or (2) Linux configuring ASPM incorrectly.
>
> It's (2) here.
>
> >
> > For case (1), the platform should be using ACPI_FADT_NO_ASPM or _OSC
> > to prevent the OS from enabling ASPM.  Linux should pay attention to
> > that even when CONFIG_PCIEASPM_POWERSAVE=y.
> >
> > If the platform *should* use these mechanisms but doesn't, the
> > solution is a quirk, not the folklore that "we can't use
> > CONFIG_PCIEASPM_POWERSAVE=y because it breaks some systems."
>
> The platform in question doesn't prevent OS from enabling ASPM.
>
> >
> > For case (2), we should fix Linux so it configures ASPM correctly.
> >
> > We cannot use the build-time CONFIG_PCIEASPM settings to avoid these
> > hangs.  We need to fix the Linux run-time code so the system operates
> > correctly no matter what CONFIG_PCIEASPM setting is used.
> >
> > We have sysfs knobs to control ASPM (see 72ea91afbfb0 ("PCI/ASPM: Add
> > sysfs attributes for controlling ASPM link states")).  They can do the
> > same thing at run-time as CONFIG_PCIEASPM_POWERSAVE=y does at
> > build-time.  If those knobs cause hangs on 1st Gen Ryzen systems, we
> > need to fix that.
>
> Ian's patch solves the issue, at least for the systems I have.

Could you add:
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 15d64832a988..cd9f2101f9a2 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -482,7 +482,12 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)
                        latency = max_t(u32, link->latency_up.l1,
link->latency_dw.l1);
                        l1_max_latency = max_t(u32, latency, l1_max_latency);
                        if (l1_max_latency + l1_switch_latency > acceptable->l1)
+                       {
+                               pci_info(endpoint, "L1 latency
exceeded - path: %i - max: %i\n", l1_switch_latency, l1_max_latency);
+                               pci_info(link->pdev, "Upstream device
- %i\n", link->latency_up.l1);
+                               pci_info(link->downstream, "Downstream
device - %i\n", link->latency_dw.l1);
                                link->aspm_capable &= ~ASPM_STATE_L1;
+                       }

                        l1_switch_latency += 1000;
                }

So we can see what device triggers what links to be disabled?

I think your use-case is much more important than mine - mine fixes
something as a side effect

Also, please send me the lspci -vvv output as well as lspci -PP -s
<device id> for the device id:s mentioned in dmesg with the patch
above applied ;)

> Kai-Heng
>
> >
> > Bjorn
>

      reply	other threads:[~2020-10-09 14:34 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-30  8:24 [PATCH 1/2] PCI/ASPM: Add helper to enable ASPM link Kai-Heng Feng
2020-09-30  8:24 ` [PATCH 2/2] PCI: vmd: Enable ASPM for mobile platforms Kai-Heng Feng
2020-10-02 22:18   ` Bjorn Helgaas
2020-10-05 18:40     ` Kai-Heng Feng
2020-10-05 19:19       ` Bjorn Helgaas
2020-10-07  4:26         ` Kai-Heng Feng
2020-10-07  9:28           ` Ian Kumlien
2020-10-07 13:30           ` Bjorn Helgaas
2020-10-07 13:44             ` Ian Kumlien
2020-10-08  4:19             ` Kai-Heng Feng
2020-10-09 14:34               ` Ian Kumlien [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAA85sZszpwU4MNKzb=W3MaK8YHwfuEmdXkc03TJkDtOD6a3L0A@mail.gmail.com' \
    --to=ian.kumlien@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=jonathan.derrick@intel.com \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=robh@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.