[RFC PATCH 0/2] PCI/AER: handling for RCiEPs

From: Jonathan Cameron @ 2020-05-21 17:31 UTC
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linuxarm, Lorenzo Pieralisi, Jonathan Cameron

This RFC adds minimal AER handling for Root Complex integrated End Points
(RCiEPs).  These report their errors via a Root Complex Event Collector
(RCEC).  Note that this series does not provide a driver for said RCEC
because we do not need to do anything to it on a Hardware-Reduced ACPI
platform such as the ARM server we wish to support.

My assumption is that anyone needing support will need to enumerate the
association between the RCEC and RCiEPs, setting the rcec pointer added
to struct pci_dev.  If an alternate mechanism is preferred let me know.
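To illustrate the association described above, here is a minimal userspace
sketch.  It is not code from the patches: the mock_* types and
mock_error_reporter() are hypothetical stand-ins, and the only element taken
from the series is the idea of an rcec pointer on the device.  An RCiEP has
no Root Port above it, so the lookup uses the rcec pointer instead of
walking up the hierarchy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real struct pci_dev has many more fields. */
enum mock_pcie_type {
	MOCK_ENDPOINT,
	MOCK_ROOT_PORT,
	MOCK_RC_END,	/* Root Complex integrated End Point */
	MOCK_RC_EC,	/* Root Complex Event Collector */
};

struct mock_pci_dev {
	enum mock_pcie_type type;
	struct mock_pci_dev *upstream;	/* bridge above us, NULL on the root bus */
	struct mock_pci_dev *rcec;	/* only meaningful for MOCK_RC_END */
};

/*
 * Find the device that reports errors on behalf of @dev.
 * For an RCiEP this is whatever the enumeration code stored in ->rcec,
 * which may be NULL if nothing set it (for example when firmware handles
 * the RCEC itself).  For anything else, walk up to the Root Port.
 */
static struct mock_pci_dev *mock_error_reporter(struct mock_pci_dev *dev)
{
	if (dev->type == MOCK_RC_END)
		return dev->rcec;

	while (dev && dev->type != MOCK_ROOT_PORT)
		dev = dev->upstream;
	return dev;
}
```

A NULL result for an RCiEP is treated here as "no OS-visible collector",
which is an assumption of the sketch rather than something the series
defines.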

Open questions are mainly in the patch 2 description.  In particular, a
number of the normal reset actions make little sense for an RCiEP (slot
reset?) so I'm unclear whether we should just call them all anyway or not.

Patch 1 avoids a reset of a register on the root port in a firmware first
flow.  That reset can occur in the normal EP flow as well.  It probably
shouldn't, but the likely effects are minor (as firmware should have reset
the register already).
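As a rough illustration of the behaviour patch 1 is after, the userspace
sketch below skips the status clear when firmware owns error handling.
Everything named mock_* is hypothetical, and the real change in aer.c may
be structured differently.  The one hardware fact relied on is that the
error bits in the PCIe Device Status register (bits 3:0) are write 1 to
clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Error-detected bits of the Device Status register (bits 3:0, RW1C). */
#define MOCK_DEVSTA_ERR_BITS 0x000f

struct mock_port {
	uint16_t devsta;	/* modelled Device Status register */
	bool firmware_first;	/* true if firmware owns error handling */
};

/* Model a config write to an RW1C register: written 1 bits are cleared. */
static void mock_write_devsta(struct mock_port *p, uint16_t val)
{
	p->devsta &= (uint16_t)~(val & MOCK_DEVSTA_ERR_BITS);
}

/*
 * Clear the latched error bits unless firmware handles errors first,
 * in which case the register is left untouched and -1 is returned.
 */
static int mock_clear_device_status(struct mock_port *p)
{
	if (p->firmware_first)
		return -1;

	mock_write_devsta(p, p->devsta);
	return 0;
}
```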

All comments welcome.  NB. We only care about the Hardware-Reduced
firmware first case so I'm more than happy to rip out the hints of
explicit RCEC support if people would prefer - I just put them in
for the RFC to show how that just possibly 'might' work.

There are other places that I suspect would need to take the RCEC case
into account that I have not addressed here.  Whilst we do have real
hardware RCiEPs, testing here was done with Qemu to allow comparison
of the flows for RCiEPs and EPs that were otherwise identical.
It is also easier to add whatever error injection is needed than on
real hardware.

Only the Hardware-Reduced ACPI case has been tested as we would need
to add a bunch more stuff to Qemu to test the alternative forms
of firmware first or kernel first handling (which we don't care about :)

Jonathan Cameron (2):
  PCI/AER: Do not reset the device status if doing firmware first
    handling.
  PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware
    first

 drivers/pci/pcie/aer.c |  3 +++
 drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h    |  1 +
 3 files changed, 65 insertions(+)

-- 
2.19.1
