* [bugzilla-daemon@bugzilla.kernel.org: [Bug 204413] New: "PCI: Add missing link delays" causes regression on resume from suspend to ram]
From: Bjorn Helgaas @ 2019-08-02 22:08 UTC
  To: Mika Westerberg; +Cc: linux-pci, Rafael J. Wysocki, Matthias Andree

[+cc Mika, Rafael, linux-pci]

Hi Matthias,

Thanks a lot for this report!

Mika, this bisected to upstream c2bf1fc212f7 ("PCI: Add missing link
delays required by the PCIe spec").

Matthias, would you mind opening a separate report for the spurious
PME issue you mentioned with 5.2.5?  Seems like we should try to
figure that one out, too.


----- Forwarded message from bugzilla-daemon@bugzilla.kernel.org -----

Date: Fri, 02 Aug 2019 16:26:45 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: bugzilla.pci@gmail.com
Subject: [Bug 204413] New: "PCI: Add missing link delays" causes regression on resume from suspend to ram
Message-ID: <bug-204413-193951@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=204413

            Bug ID: 204413
           Summary: "PCI: Add missing link delays" causes regression on
                    resume from suspend to ram
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.1.20, 5.2.5, 5.3-rc1?
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: PCI
          Assignee: drivers_pci@kernel-bugs.osdl.org
          Reporter: matthias.andree@gmx.de
        Regression: Yes

Description of problem:
Vanilla 5.1.20 on x86_64 fails to wake from suspend (STR);
Fedora and vanilla 5.1.19 and prior were fine.
5.2.5 (from Fedora's-200.fc30) also fails, but in a different way (spurious PME
interrupts on PCIe).

How reproducible:
always

Steps to Reproduce:
1. boot Fedora 30 and log into GNOME desktop
2. click pause symbol to suspend the computer to RAM, wait until suspended
3. press key on keyboard, or power button

Actual results:
The computer tries to wake up and the HDD LED blinks a bit, but the console does
not wake. Another computer on the network cannot ping the waking computer.

"sync" hangs in D state (uninterruptible sleep) for long periods of time.

Expected results:
computer wakes up properly and continues to use its devices.

Additional info:
PM tracing was enabled, the next boot returned
[    0.827930] PM:   hash matches drivers/base/power/main.c:1021

It appears that suspend to disk still works.

Computer has an NVIDIA GeForce 1060 PCIe graphics board, but 5.1.19 and prior
would suspend properly, and the 5.1.20 and 5.2.5 suspend issues also occur if
nvidia kernel modules are renamed out of the way and nouveau remains blocked,
so it's not an nvidia driver issue.

I have "git bisect"ed this on the vanilla stable kernel, on the
stable/linux-5.1.y branch (because my starting points 5.1.19 and 5.1.20 are
both there).
The failure-inducing commit on the branch is
3c795a8e3481e4dec071b5956e7177e816f6e7f1 (see below), which got picked from
master's c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 (merged through
cf2d213e49fdf47e4c10dc629a3659e0026a54b8, v5.3-rc1~167)
and also got picked to stable/linux-5.2.y as
5817d78eba34f6c86f5462ae2c5212f80a013357 (v5.2.3~291).

Sasha Levin's signoff is only on the stable branches, not on master.

------------------------------------------------------------
commit 3c795a8e3481e4dec071b5956e7177e816f6e7f1 (refs/bisect/bad)
Author: Mika Westerberg <mika.westerberg@linux.intel.com>  2019-06-12 12:57:38
Committer: Greg Kroah-Hartman <gregkh@linuxfoundation.org>  2019-07-26 09:12:37
Parent: 70cc29dba925b8a99a4917c2b5fa6702d0d496d1 (bpf: fix callees pruning
callers)
Child:  a98c15177f72ae3c0a736bb324e66c279bf94899 (net: netsec: initialize tx
ring on ndo_open)
Branch: remotes/stable/linux-5.1.y
Follows: v5.1.19
Precedes: v5.1.20

    PCI: Add missing link delays required by the PCIe spec

    [ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]

    Currently Linux does not follow PCIe spec regarding the required delays
    after reset. A concrete example is a Thunderbolt add-in-card that
    consists of a PCIe switch and two PCIe endpoints:

      +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                      +-01.0-[04-36]-- DS hotplug port
                                      +-02.0-[37]----00.0 xHCI controller
                                      \-04.0-[38-6b]-- DS hotplug port

    The root port (1b.0) and the PCIe switch downstream ports are all PCIe
    gen3 so they support 8GT/s link speeds.

    We wait for the PCIe hierarchy to enter D3cold (runtime):

      pcieport 0000:00:1b.0: power state changed by ACPI to D3cold

    When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the
    PCIe switch is put to reset and its power is re-applied. This means that
    we must follow the rules in PCIe 4.0 section 6.6.1.
[...]
    Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 drivers/pci/pci.c               | 29 +++++++++++++++++++----------
 drivers/pci/pci.h               |  1 +
 drivers/pci/pcie/portdrv_core.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+), 10 deletions(-)
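
The reset-delay rule that the commit message refers to (PCIe 4.0 section
6.6.1) can be sketched roughly as below. This is a hypothetical illustration
in Python, not the kernel's implementation: the function name, constants, and
parameters are assumptions made for this example, and the rule is paraphrased
from the spec as commonly summarized (100 ms minimum wait, counted from the
end of link training for ports faster than 5 GT/s).

```python
# Hypothetical sketch, not kernel code: all names here are illustrative.
SPEED_THRESHOLD_GTS = 5.0  # ports faster than this are treated differently
MIN_WAIT_MS = 100          # minimum wait before the first Configuration Request


def earliest_config_request_ms(max_link_speed_gts, link_training_done_ms):
    """Return the earliest time, in ms after reset exit, at which software
    may send a Configuration Request to the device below a downstream port."""
    if max_link_speed_gts > SPEED_THRESHOLD_GTS:
        # Fast ports: the 100 ms count starts once link training completes.
        return link_training_done_ms + MIN_WAIT_MS
    # Ports limited to 5 GT/s or less: 100 ms counted from the end of reset.
    return MIN_WAIT_MS
```

For the gen3 (8 GT/s) ports in the example topology, assuming link training
finished 30 ms after reset, earliest_config_request_ms(8.0, 30) gives 130.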

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

----- End forwarded message -----


* Re: [bugzilla-daemon@bugzilla.kernel.org: [Bug 204413] New: "PCI: Add missing link delays" causes regression on resume from suspend to ram]
From: Matthias Andree @ 2019-08-02 23:11 UTC
  To: Bjorn Helgaas, Mika Westerberg; +Cc: linux-pci, Rafael J. Wysocki

On 03.08.19 at 00:08, Bjorn Helgaas wrote:
> [+cc Mika, Rafael, linux-pci]
>
> Hi Matthias,
>
> Thanks a lot for this report!
>
> Mika, this bisected to upstream c2bf1fc212f7 ("PCI: Add missing link
> delays required by the PCIe spec").
>
> Matthias, would you mind opening a separate report for the spurious
> PME issue you mentioned with 5.2.5?  Seems like we should try to
> figure that one out, too.
>
>
> ----- Forwarded message from bugzilla-daemon@bugzilla.kernel.org -----
>
> Date: Fri, 02 Aug 2019 16:26:45 +0000
> From: bugzilla-daemon@bugzilla.kernel.org
> To: bugzilla.pci@gmail.com
> Subject: [Bug 204413] New: "PCI: Add missing link delays" causes regression on resume from suspend to ram
> Message-ID: <bug-204413-193951@https.bugzilla.kernel.org/>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=204413
>
Hi Bjorn,

thanks for the prompt reaction and interest in the bug. I am loath to
file a different report /now/ because it seems that on my hardware, all
three of Linux 5.1.20, 5.2.5, and 5.3-rc2 mess up I/O (disk/net) in the
same way (which is the big thing). The only difference is that 5.1.20
remains silent while 5.2.5 and 5.3-rc2 log PME IRQ storms from the AMD PCI
bridge at 0000:00:01.3, which might just be another consequence.

Let's see if you/Mika can figure out what the AMD B350/X370 chipsets
need WRT PCI link delays/handling when waking up from STR. I'll then
happily check whether that fixes the IRQ storm, or else file that 2nd
bug report.

Regards,
Matthias


