* [PATCH v3 00/17] Improve PCI device post-reset readiness polling
@ 2020-03-03 13:28 Stanislav Spassov
From: Stanislav Spassov @ 2020-03-03 13:28 UTC
  To: linux-pci
  Cc: Stanislav Spassov, linux-acpi, Bjorn Helgaas, Thomas Gleixner,
	Andrew Morton, Jan H . Schönherr, Jonathan Corbet,
	Ashok Raj, Alex Williamson, Sinan Kaya, Rajat Jain,
	kbuild test robot

From: Stanislav Spassov <stanspas@amazon.de>

The first version of this patch series can be found here:
https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com

Originally (v1), this patch series aimed to only solve an issue where
pci_dev_wait can cause system crashes. After a reset, a hung device may
keep responding with CRS completions indefinitely. If CRS Software
Visibility is enabled on the Root Port, attempting to read any register
other than PCI_VENDOR_ID will cause the Root Port to autonomously retry
the request without reporting back to the CPU core. Unless the number of
retries or the amount of time spent retrying is limited by
platform-specific means, this scenario leads to low-level platform
timeouts (such as a TOR Timeout), which easily escalate to a crash.
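
To make the failure mode concrete, here is a minimal user-space sketch of
a bounded readiness poll. It is not code from this series: struct fake_dev,
fake_read_vendor_id() and poll_until_ready() are hypothetical names, and
time is simulated with a counter instead of sleeping. It relies only on the
fact that, with CRS Software Visibility enabled, a Vendor ID read that
completes with CRS returns 0x0001, so software can keep polling that one
register and give up after a deadline.

```c
#include <assert.h>
#include <stdint.h>

/* Vendor ID value seen by software while the device answers with CRS
 * and CRS Software Visibility is enabled on the Root Port. */
#define CRS_VENDOR_ID 0x0001u

/* Hypothetical stand-in for a device under reset (illustration only). */
struct fake_dev {
	uint32_t ready_after_ms;	/* simulated time at which CRS stops */
	uint16_t vendor_id;		/* real Vendor ID once ready */
};

/* Simulated config read of the Vendor ID register at time now_ms. */
static uint16_t fake_read_vendor_id(const struct fake_dev *dev,
				    uint32_t now_ms)
{
	return now_ms >= dev->ready_after_ms ? dev->vendor_id : CRS_VENDOR_ID;
}

/*
 * Poll only the Vendor ID register, doubling the delay after each
 * attempt, and stop once timeout_ms of simulated time has passed.
 * Returns the simulated time waited in ms, or -1 on timeout.
 */
static int poll_until_ready(const struct fake_dev *dev, uint32_t timeout_ms)
{
	uint32_t waited_ms = 0;
	uint32_t delay_ms = 1;

	assert(timeout_ms > 0);

	while (fake_read_vendor_id(dev, waited_ms) == CRS_VENDOR_ID) {
		if (waited_ms >= timeout_ms)
			return -1;	/* hung device: give up, do not spin */
		waited_ms += delay_ms;	/* real code would sleep here */
		delay_ms *= 2;
	}

	return (int)waited_ms;
}
```

The point of the sketch is the explicit deadline: a device that never stops
returning CRS makes the poll fail cleanly, instead of leaving the Root Port
to retry a read of some other register with no bound visible to software.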

The feedback on the first version of this patch series inspired a
deeper dive into the PCI Firmware Spec (_DSM functions 8 and 9),
which revealed several different types of delays that can be overridden
on a per-device basis to avoid waiting too long on devices that are
known to come back quickly after reset. The kernel already stores such
overrides for some, but not all of the delays.

While adding the infrastructure to allow overriding delays, I discovered
and addressed several inconsistencies between what the PCIe
Base Specification says and what the code does, and came up with more
improvements all around device resets and readiness polling.

This patch series now paves the way for Readiness Time Reporting capability
support, and touches upon (in comments) some changes that would be
required for supporting Readiness Notifications.

[Compared to v2, v3 fixes build failures on i386 and arm/arm64:
Reported-by: kbuild test robot <lkp@intel.com>
- (int)(value_us / 1000) does not link for a u64 value_us on these
  32-bit builds:
  undefined reference to `__udivdi3'
  Change: use '(int)value_us / 1000' to match pre-existing code.
  This narrows value_us to int before dividing, so it seems it would
  be susceptible to overflow/truncation?
- I had failed to replace all mentions of PCI_PM_D3COLD_WAIT after
  renaming that constant to PCI_RESET_DELAY.]
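
The truncation concern above can be illustrated with a small user-space
sketch. The function names are hypothetical and this is not code from the
series. It contrasts casting before dividing (the expression quoted in the
changelog) with dividing in 64 bits and narrowing afterwards. Converting an
out-of-range u64 to int is implementation-defined in C, so the sketch only
shows that the two forms can disagree, not which value results.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Cast first, then divide: the cast binds tighter than '/', so value_us
 * is narrowed to int before the division. No 64-bit division is emitted,
 * but inputs above INT_MAX are mangled by the narrowing.
 */
static int us_to_ms_cast_first(uint64_t value_us)
{
	return (int)value_us / 1000;
}

/*
 * Divide first, then narrow: the division is done in 64 bits, which is
 * what pulls in __udivdi3 on some 32-bit kernel builds. The result is
 * only narrowed once it is known to fit in an int.
 */
static int us_to_ms_divide_first(uint64_t value_us)
{
	uint64_t ms = value_us / 1000;

	assert(ms <= INT_MAX);
	return (int)ms;
}
```

For 5,000,000 us both helpers return 5000. For 5,000,000,000 us the
divide-first form returns 5000000, while the cast-first form returns
something else on common compilers. In kernel code, div_u64() from
<linux/math64.h> is the usual way to do a 64-by-32-bit division without
the libgcc helper; whether it fits here is left open.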

Stanislav Spassov (17):
  PCI: Fall back to slot/bus reset if softer methods timeout
  PCI: Remove unused PCI_PM_BUS_WAIT
  PCI: Use pci_bridge_wait_for_secondary_bus after SBR
  PCI: Do not override delay for D0->D3hot transition
  PCI: Fix handling of _DSM 8 (avoiding reset delays)
  PCI: Fix us->ms conversion in pci_acpi_optimize_delay
  PCI: Clean up and document PM/reset delays
  PCI: Add more delay overrides to struct pci_dev
  PCI: Generalize pci_bus_max_d3cold_delay to pci_bus_max_delay
  PCI: Use correct delay in pci_bridge_wait_for_secondary_bus
  PCI: Refactor pci_dev_wait to remove timeout parameter
  PCI: Refactor pci_dev_wait to take pci_init_event
  PCI: Cache CRS Software Visibiliy in struct pci_dev
  PCI: Introduce per-device reset_ready_poll override
  PCI: Refactor polling loop out of pci_dev_wait
  PCI: Add CRS handling to pci_dev_wait()
  PCI: Lower PCIE_RESET_READY_POLL_MS from 1m to 1s

 Documentation/power/pci.rst           |   4 +-
 arch/x86/pci/intel_mid_pci.c          |   2 +-
 drivers/hid/intel-ish-hid/ipc/ipc.c   |   2 +-
 drivers/mfd/intel-lpss-pci.c          |   2 +-
 drivers/net/ethernet/marvell/sky2.c   |   2 +-
 drivers/pci/controller/pci-aardvark.c |   2 +-
 drivers/pci/controller/pci-mvebu.c    |   2 +-
 drivers/pci/iov.c                     |   4 +-
 drivers/pci/pci-acpi.c                | 106 ++++++++----
 drivers/pci/pci-driver.c              |   4 +-
 drivers/pci/pci.c                     | 233 ++++++++++++++++++--------
 drivers/pci/pci.h                     |  81 ++++++++-
 drivers/pci/probe.c                   |  10 +-
 drivers/pci/quirks.c                  |   9 +-
 include/linux/pci-acpi.h              |   8 +-
 include/linux/pci.h                   |  45 ++++-
 16 files changed, 390 insertions(+), 126 deletions(-)


base-commit: bb6d3fb354c5ee8d6bde2d576eb7220ea09862b9
-- 
2.25.1









Thread overview: 24+ messages
2020-03-03 13:28 [PATCH v3 00/17] Improve PCI device post-reset readiness polling Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 01/17] PCI: Fall back to slot/bus reset if softer methods timeout Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 02/17] PCI: Remove unused PCI_PM_BUS_WAIT Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 03/17] PCI: Use pci_bridge_wait_for_secondary_bus after SBR Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 04/17] PCI: Do not override delay for D0->D3hot transition Stanislav Spassov
2020-03-03 18:57   ` Rafael J. Wysocki
2020-03-07 10:58     ` Spassov, Stanislav
2020-03-03 13:28 ` [PATCH v3 05/17] PCI: Fix handling of _DSM 8 (avoiding reset delays) Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 06/17] PCI: Fix us->ms conversion in pci_acpi_optimize_delay Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 07/17] PCI: Clean up and document PM/reset delays Stanislav Spassov
2020-03-03 19:03   ` Rafael J. Wysocki
2020-03-07 11:30     ` Spassov, Stanislav
2020-03-03 13:28 ` [PATCH v3 08/17] PCI: Add more delay overrides to struct pci_dev Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 09/17] PCI: Generalize pci_bus_max_d3cold_delay to pci_bus_max_delay Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 10/17] PCI: Use correct delay in pci_bridge_wait_for_secondary_bus Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 11/17] PCI: Refactor pci_dev_wait to remove timeout parameter Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 12/17] PCI: Refactor pci_dev_wait to take pci_init_event Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 13/17] PCI: Cache CRS Software Visibiliy in struct pci_dev Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 14/17] PCI: Introduce per-device reset_ready_poll override Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 15/17] PCI: Refactor polling loop out of pci_dev_wait Stanislav Spassov
2020-03-03 13:28 ` [PATCH v3 16/17] PCI: Add CRS handling to pci_dev_wait() Stanislav Spassov
2020-03-05 17:56   ` Raj, Ashok
2020-03-06 18:07     ` Spassov, Stanislav
2020-03-03 13:28 ` [PATCH v3 17/17] PCI: Lower PCIE_RESET_READY_POLL_MS from 1m to 1s Stanislav Spassov
