From: Stanislav Spassov <stanspas@amazon.com>
To: <linux-pci@vger.kernel.org>
Cc: "Stanislav Spassov" <stanspas@amazon.de>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Jan H . Schönherr" <jschoenh@amazon.de>,
"Jonathan Corbet" <corbet@lwn.net>,
"Ashok Raj" <ashok.raj@intel.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Sinan Kaya" <okaya@kernel.org>,
"Rajat Jain" <rajatja@google.com>
Subject: [PATCH v4 0/3] Improve PCI device post-reset readiness polling
Date: Sat, 7 Mar 2020 18:20:41 +0100
Message-ID: <20200307172044.29645-1-stanspas@amazon.com>
From: Stanislav Spassov <stanspas@amazon.de>
The first version of this patch series can be found here:
https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com
The goal of this patch series is to solve an issue where pci_dev_wait
can cause system crashes. After a reset, a hung device may keep
responding with CRS completions indefinitely. If CRS Software Visibility
is enabled on the Root Port, attempting to read any register other than
PCI_VENDOR_ID will cause the Root Port to autonomously retry the request
without reporting back to the CPU core. Unless the number of retries or
the amount of time spent retrying is limited by platform-specific means,
this scenario leads to low-level platform timeouts (such as a TOR
Timeout), which can easily escalate to a crash.
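To illustrate the safe polling pattern described above, here is a minimal
user-space sketch, not the code from the patches. It assumes CRS Software
Visibility is enabled, in which case a Vendor ID read that receives a CRS
completion returns the reserved value 0x0001. The names fake_dev,
fake_read_id() and poll_vendor_id() are hypothetical, and the simulated clock
stands in for real sleeping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Vendor ID value reported for a CRS completion when CRS SV is enabled. */
#define CRS_VENDOR_ID 0x0001u

/* True if a 32-bit read at offset 0 indicates a CRS completion. */
static bool is_crs_vendor_id(uint32_t id_dword)
{
	return (id_dword & 0xffffu) == CRS_VENDOR_ID;
}

/* Hypothetical simulated device: answers with CRS a fixed number of times. */
struct fake_dev {
	int crs_reads_left;	/* remaining reads that complete with CRS */
	uint32_t id;		/* Device ID << 16 | Vendor ID once ready */
};

static uint32_t fake_read_id(struct fake_dev *dev)
{
	if (dev->crs_reads_left > 0) {
		dev->crs_reads_left--;
		return 0xffff0001u;
	}
	return dev->id;
}

/*
 * Poll only the Vendor ID, with exponential backoff, so that software
 * (not the Root Port) decides when to give up.
 * Returns the simulated milliseconds waited, or -1 on timeout.
 */
static int poll_vendor_id(struct fake_dev *dev, int timeout_ms)
{
	int delay_ms = 1;
	int waited_ms = 0;

	while (is_crs_vendor_id(fake_read_id(dev))) {
		if (waited_ms >= timeout_ms)
			return -1;	/* hung device: bail out */
		waited_ms += delay_ms;	/* real code would sleep here */
		delay_ms *= 2;
	}
	return waited_ms;
}
```

Because every iteration reads only the Vendor ID, a device that keeps
returning CRS costs at most the software timeout instead of stalling the
request inside the Root Port.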
Feedback on v1 inspired many additional improvements to the device reset
code paths, as well as reductions in post-reset delays. These improvements
were published as part of v2 (v3 contains only small build fixes).
It looks like there is immediate demand specifically for the CRS work,
so I am once again reducing the series to just that. The rest will be
posted as a separate patch series that will likely require more time and
iterations to stabilize.
Changes since v3:
- In pci_dev_wait(), added "timeout -= waited" to account for the time spent
  polling PCI_VENDOR_ID before falling back to polling PCI_COMMAND if
  device readiness could not be positively established via CRS (i.e.,
  if we stopped receiving CRS completions but did not receive a valid
  vendor ID due to dealing with an SR-IOV VF, or due to a different error).
- Simplified the commit message of "PCI: Add CRS handling to pci_dev_wait()"
to avoid confusion as to when Root Ports will autonomously retry requests
that resulted in CRS completions.
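The "timeout -= waited" accounting from the first item can be sketched as
follows. This is a simplified user-space model under stated assumptions, not
the actual pci_dev_wait() implementation: struct sim, poll_crs(),
poll_command() and dev_wait() are hypothetical names, and time is simulated in
1 ms steps.

```c
#include <stdbool.h>

/* Hypothetical simulated timeline for one device after reset. */
struct sim {
	int now_ms;		/* simulated current time */
	int crs_until_ms;	/* device answers with CRS before this time */
	int ready_at_ms;	/* PCI_COMMAND reads as all ones before this */
};

/* Phase 1: poll PCI_VENDOR_ID while CRS is seen; returns ms spent. */
static int poll_crs(struct sim *s, int timeout_ms)
{
	int waited = 0;

	while (s->now_ms < s->crs_until_ms && waited < timeout_ms) {
		s->now_ms++;
		waited++;
	}
	return waited;
}

/* Phase 2: poll PCI_COMMAND until it stops reading as all ones. */
static bool poll_command(struct sim *s, int timeout_ms)
{
	int waited = 0;

	while (s->now_ms < s->ready_at_ms) {
		if (waited >= timeout_ms)
			return false;
		s->now_ms++;
		waited++;
	}
	return true;
}

/* Both phases share one budget, so the total wait never exceeds it. */
static bool dev_wait(struct sim *s, int timeout_ms)
{
	int waited = poll_crs(s, timeout_ms);

	if (s->now_ms < s->crs_until_ms)
		return false;		/* still CRS when the budget ran out */

	timeout_ms -= waited;		/* charge phase 1 against the budget */
	return poll_command(s, timeout_ms);
}
```

Without the subtraction, the fallback phase would start with the full timeout
again, and the worst-case wait would be roughly twice the caller's budget.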
Stanislav Spassov (3):
PCI: Refactor polling loop out of pci_dev_wait
PCI: Cache CRS Software Visibility in struct pci_dev
PCI: Add CRS handling to pci_dev_wait()
drivers/pci/pci.c | 109 +++++++++++++++++++++++++++++++++++---------
drivers/pci/probe.c | 8 +++-
include/linux/pci.h | 3 ++
3 files changed, 98 insertions(+), 22 deletions(-)
base-commit: bb6d3fb354c5ee8d6bde2d576eb7220ea09862b9
--
2.25.1