linux-acpi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] PCI/AER: handling for RCiEPs
@ 2020-05-21 17:31 Jonathan Cameron
  2020-05-21 17:31 ` [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling Jonathan Cameron
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-05-21 17:31 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linuxarm, Lorenzo Pieralisi, Jonathan Cameron

This RFC adds minimal AER handling for Root Complex integrated End Points
(RCiEPs).   These report their errors via a Root Complex Event Collector
(RCEC).  Note that this series does not provide a driver for said RCEC
because we do not need to do anything to it on a Hardware-Reduced ACPI
platform such as the ARM server we wish to support.

My assumption is that anyone needing support will need to enumerate the
association between the RCEC and RCiEPs, setting the rcec pointer added
to struct pci_dev.  If an alternate mechanism is preferred let me know.

Open questions are mainly in patch 2 description.  In particular a
number of the normal reset actions make little sense for an RCiEP (slot
reset?) so I'm unclear whether we should just call them all anyway or not.

Patch 1 avoids a reset of a register on the root port in a firmware first
flow.  It can occur for normal EP flow as well. It probably shouldn't,
but likely effects are minor (as firmware should have reset the register
already).

All comments welcome.  NB. We only care about the Hardware-Reduced
firmware first case so I'm more than happy to rip out he hints of
explicit RCEC support if people would prefer - I just put them in
for the RFC to show how that just possibly 'might' work.

There are other places that I suspect would need to take the RCEC case
into account that I have not addressed here.  Whilst we do have real
hardware RCiEPs, testing here was done with Qemu to allow comparison
of the flows for RCiEPs and EPs that were otherwise identical.
It is also easier to add whatever error injection is needed than on
real hardware.

Only the reduced hardware ACPI case has been tested as we would need
to add a bunch more stuff to Qemu to test the alternative forms
of firmware first of kernel first handling (which we don't care about :)

Jonathan Cameron (2):
  PCI/AER: Do not reset the device status if doing firmware first
    handling.
  PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware
    first

 drivers/pci/pcie/aer.c |  3 +++
 drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h    |  1 +
 3 files changed, 65 insertions(+)

-- 
2.19.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling.
  2020-05-21 17:31 [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
@ 2020-05-21 17:31 ` Jonathan Cameron
  2020-06-16 17:47   ` Bjorn Helgaas
  2020-05-21 17:31 ` [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first Jonathan Cameron
  2020-06-16 10:47 ` [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
  2 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2020-05-21 17:31 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linuxarm, Lorenzo Pieralisi, Jonathan Cameron

pci_aer_clear_device_status() currently resets the device status even when
firmware first handling is going on.  In particular it resets it on the
root port.

This has been discussed previously
https://lore.kernel.org/patchwork/patch/427375/.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/pci/pcie/aer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index f4274d301235..43e78b97ace6 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -373,6 +373,9 @@ void pci_aer_clear_device_status(struct pci_dev *dev)
 {
 	u16 sta;
 
+	if (pcie_aer_get_firmware_first(dev))
+		return;
+
 	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
 	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
 }
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-05-21 17:31 [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
  2020-05-21 17:31 ` [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling Jonathan Cameron
@ 2020-05-21 17:31 ` Jonathan Cameron
  2020-06-16 10:47 ` [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
  2 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-05-21 17:31 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linuxarm, Lorenzo Pieralisi, Jonathan Cameron

Note this provides complete support for our usecase on an ARM server using
Hardware Reduced ACPI and adds appropriate place for an RCEC driver to hook
if someone else cares to write one, either for firmware first handling on
non Hardware Reduced ACPI or for kernel first AER handling.

For Root Complex integrated End Points (RCiEPs) there is no root port to
discover and hence we cannot walk the bus from the root port to do
appropriate resets.

The PCI specification provides Root Complex Event Collectors to deal with
this circumstance.  These are peer RCiEPs that provide (amongst other
things) collection + interrupt facilities for AER reporting for a set of
RCiEPs in the same root complex.

In the case of a Hardware Reduced ACPI platform, the AER errors are
reported via a GHESv2 path using CPER records as defined in the UEFI
specification.  These are intended to provide complete information and
appropriate hand shake in a fashion that does not require a specific form
of error reporting hardware.  This is contrast to AER handling via the
various HEST entries for PCI Root Port and PCI Device etc where we do
require direct access to the RCEC.

As such my interpretation of the spec is that a Reduced Hardware ACPI
platform should not access the RCEC from the OS at all during AER handling,
and in fact is welcome to use non standard hardware interfaces to provide
the equivalent functionality in any fashion it wishes (as all hidden beind
the firmware).

Hence I am making the provision of an RCEC optional.

The aim of the rest of the code was to replicate the actions that would
have occurred if this had been an EP below a root port. Some of them make
absolutely no sense, but I hope this RFC can start a discussion on what
we should be doing under these circumstances.

It probably makes sense to pull this new block of code out to a separate
function but for the RFC I've left it in place to keep it next to the
existing path.

It appears that the current kernel first code does not support detecting
the multiple error bits being set in the root port error status register.
This seems like a limitation both the normal EP / Root Port case and
for RCiEPs.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h    |  1 +
 2 files changed, 62 insertions(+)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 14bb8f54723e..d34be4483f73 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
 	struct pci_bus *bus;
 
+	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
+		struct pci_dev *rcec = dev->rcec;
+		/* Not clear this makes any sense - we can't reset link anyway...*/
+		if (state == pci_channel_io_frozen) {
+			report_frozen_detected(dev, &status);
+			pci_err(dev, "io is frozen and cannot reset link\n");
+			goto failed;
+		} else {
+			report_normal_detected(dev, &status);
+		}
+
+		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
+			status = PCI_ERS_RESULT_RECOVERED;
+			pci_dbg(dev, "broadcast mmio_enabled message\n");
+			report_mmio_enabled(dev, &status);
+		}
+
+		if (status == PCI_ERS_RESULT_NEED_RESET) {
+			/* No actual slot reset possible */
+			status = PCI_ERS_RESULT_RECOVERED;
+			pci_dbg(dev, "broadcast slot_reset message\n");
+			report_slot_reset(dev, &status);
+		}
+
+		if (status != PCI_ERS_RESULT_RECOVERED)
+			goto failed;
+
+		report_resume(dev, &status);
+
+		/*
+		 * These two should be called on the RCEC  - but in case
+		 * of firmware first they should be no-ops. Given that
+		 * in a reduced hardware ACPI system, it is possible there
+		 * is no standard compliant RCEC at all.
+		 *
+		 * Add some sort of check on what type of HEST entries we have?
+		 */
+		if (rcec) {
+			/*
+			 * Unlike the upstream port case for an EP, we have not
+			 * issued a reset on all device the RCEC handles, so
+			 * perhaps we should be more careful about resetting
+			 * the status registers on the RCEC?
+			 *
+			 * In particular we may need provide a means to handle
+			 * the multiple error bits being set in PCI_ERR_ROOT_STATUS
+			 */
+			pci_aer_clear_device_status(rcec);
+			pci_aer_clear_nonfatal_status(rcec);
+			/*
+			 * Non RCiEP case uses the downstream port above the device
+			 * for this message.
+			 */
+			pci_info(rcec, "device recovery successful\n");
+		} else {
+			pci_info(dev, "device recovery successful\n");
+		}
+
+		return status;
+	}
+
 	/*
 	 * Error recovery runs on all subordinates of the first downstream port.
 	 * If the downstream port detected the error, it is cleared at the end.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 83ce1cdf5676..cb21dfe05f8c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -298,6 +298,7 @@ struct pci_dev {
 	struct list_head bus_list;	/* Node in per-bus list */
 	struct pci_bus	*bus;		/* Bus this device is on */
 	struct pci_bus	*subordinate;	/* Bus this device bridges to */
+	struct pci_dev	*rcec;		/* Root Complex Event Collector used */
 
 	void		*sysdata;	/* Hook for sys-specific extension */
 	struct proc_dir_entry *procent;	/* Device entry in /proc/bus/pci */
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/2] PCI/AER: handling for RCiEPs
  2020-05-21 17:31 [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
  2020-05-21 17:31 ` [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling Jonathan Cameron
  2020-05-21 17:31 ` [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first Jonathan Cameron
@ 2020-06-16 10:47 ` Jonathan Cameron
  2 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-16 10:47 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci; +Cc: linux-acpi, linuxarm, Lorenzo Pieralisi

Hi All,

Now the merge window is closed, I'd appreciate any comments on this series
from both ACPI and PCI related people.

Thanks,

Jonathan


On Fri, 22 May 2020 01:31:32 +0800
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> This RFC adds minimal AER handling for Root Complex integrated End Points
> (RCiEPs).   These report their errors via a Root Complex Event Collector
> (RCEC).  Note that this series does not provide a driver for said RCEC
> because we do not need to do anything to it on a Hardware-Reduced ACPI
> platform such as the ARM server we wish to support.
> 
> My assumption is that anyone needing support will need to enumerate the
> association between the RCEC and RCiEPs, setting the rcec pointer added
> to struct pci_dev.  If an alternate mechanism is preferred let me know.
> 
> Open questions are mainly in patch 2 description.  In particular a
> number of the normal reset actions make little sense for an RCiEP (slot
> reset?) so I'm unclear whether we should just call them all anyway or not.
> 
> Patch 1 avoids a reset of a register on the root port in a firmware first
> flow.  It can occur for normal EP flow as well. It probably shouldn't,
> but likely effects are minor (as firmware should have reset the register
> already).
> 
> All comments welcome.  NB. We only care about the Hardware-Reduced
> firmware first case so I'm more than happy to rip out he hints of
> explicit RCEC support if people would prefer - I just put them in
> for the RFC to show how that just possibly 'might' work.
> 
> There are other places that I suspect would need to take the RCEC case
> into account that I have not addressed here.  Whilst we do have real
> hardware RCiEPs, testing here was done with Qemu to allow comparison
> of the flows for RCiEPs and EPs that were otherwise identical.
> It is also easier to add whatever error injection is needed than on
> real hardware.
> 
> Only the reduced hardware ACPI case has been tested as we would need
> to add a bunch more stuff to Qemu to test the alternative forms
> of firmware first of kernel first handling (which we don't care about :)
> 
> Jonathan Cameron (2):
>   PCI/AER: Do not reset the device status if doing firmware first
>     handling.
>   PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware
>     first
> 
>  drivers/pci/pcie/aer.c |  3 +++
>  drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h    |  1 +
>  3 files changed, 65 insertions(+)
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling.
  2020-05-21 17:31 ` [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling Jonathan Cameron
@ 2020-06-16 17:47   ` Bjorn Helgaas
  2020-06-16 18:00     ` Kuppuswamy, Sathyanarayanan
  2020-06-17  9:18     ` Jonathan Cameron
  0 siblings, 2 replies; 19+ messages in thread
From: Bjorn Helgaas @ 2020-06-16 17:47 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan

[+cc Sathy]

On Fri, May 22, 2020 at 01:31:33AM +0800, Jonathan Cameron wrote:
> pci_aer_clear_device_status() currently resets the device status even when
> firmware first handling is going on.  In particular it resets it on the
> root port.
>
> This has been discussed previously
> https://lore.kernel.org/patchwork/patch/427375/.

I don't think this reference is really pertinent, is it?  That patch
to b2c8881da764 changes pci_cleanup_aer_uncorrect_error_status() so it
doesn't clear PCI_ERR_UNCOR_STATUS in "firmware-first" mode.

But your patch only affects PCI_EXP_DEVSTA, not PCI_ERR_UNCOR_STATUS.

> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/pci/pcie/aer.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index f4274d301235..43e78b97ace6 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -373,6 +373,9 @@ void pci_aer_clear_device_status(struct pci_dev *dev)
>  {
>  	u16 sta;
>  
> +	if (pcie_aer_get_firmware_first(dev))
> +		return;

This needs to be adjusted because pcie_aer_get_firmware_first() no
longer exists after 708b20003624 ("PCI/AER: Remove HEST/FIRMWARE_FIRST
parsing for AER ownership").

This will use the _OSC AER ownership bit to gate clearing of the
status bits in the PCIe capability (not the AER capability).

I think that's the right thing to do, but it's certainly not obvious
from the _OSC description in the PCI Firmware Spec r3.2.  I think we
need a pointer to the ECN that clarifies this, i.e., sec 4.5.1 of:

  System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
  2020, affecting PCI Firmware Specification, Rev. 3.2
  https://members.pcisig.com/wg/PCI-SIG/document/14076

>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>  }
> -- 
> 2.19.1
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling.
  2020-06-16 17:47   ` Bjorn Helgaas
@ 2020-06-16 18:00     ` Kuppuswamy, Sathyanarayanan
  2020-06-17  9:31       ` Jonathan Cameron
  2020-06-17  9:18     ` Jonathan Cameron
  1 sibling, 1 reply; 19+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-06-16 18:00 UTC (permalink / raw)
  To: Bjorn Helgaas, Jonathan Cameron
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm, Lorenzo Pieralisi

Hi Jonathan,

On 6/16/20 10:47 AM, Bjorn Helgaas wrote:
> [+cc Sathy]
> 
> On Fri, May 22, 2020 at 01:31:33AM +0800, Jonathan Cameron wrote:
>> pci_aer_clear_device_status() currently resets the device status even when
>> firmware first handling is going on.  In particular it resets it on the
>> root port.
>>
>> This has been discussed previously
>> https://lore.kernel.org/patchwork/patch/427375/.
pci_aer_clear_device_status() is only used by handle_error_source(). And
I don't think handle_error_source() is called in FF mode. Can you
give more details on this issue ?
> 
> I don't think this reference is really pertinent, is it?  That patch
> to b2c8881da764 changes pci_cleanup_aer_uncorrect_error_status() so it
> doesn't clear PCI_ERR_UNCOR_STATUS in "firmware-first" mode.
> 
> But your patch only affects PCI_EXP_DEVSTA, not PCI_ERR_UNCOR_STATUS.
> 
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> ---
>>   drivers/pci/pcie/aer.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index f4274d301235..43e78b97ace6 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -373,6 +373,9 @@ void pci_aer_clear_device_status(struct pci_dev *dev)
>>   {
>>   	u16 sta;
>>   
>> +	if (pcie_aer_get_firmware_first(dev))
>> +		return;
> 
> This needs to be adjusted because pcie_aer_get_firmware_first() no
> longer exists after 708b20003624 ("PCI/AER: Remove HEST/FIRMWARE_FIRST
> parsing for AER ownership").
> 
> This will use the _OSC AER ownership bit to gate clearing of the
> status bits in the PCIe capability (not the AER capability).
> 
> I think that's the right thing to do, but it's certainly not obvious
> from the _OSC description in the PCI Firmware Spec r3.2.  I think we
> need a pointer to the ECN that clarifies this, i.e., sec 4.5.1 of:
> 
>    System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
>    2020, affecting PCI Firmware Specification, Rev. 3.2
>    https://members.pcisig.com/wg/PCI-SIG/document/14076
> 
>>   	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>>   	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>>   }
>> -- 
>> 2.19.1
>>

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling.
  2020-06-16 17:47   ` Bjorn Helgaas
  2020-06-16 18:00     ` Kuppuswamy, Sathyanarayanan
@ 2020-06-17  9:18     ` Jonathan Cameron
  1 sibling, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-17  9:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan

On Tue, 16 Jun 2020 12:47:31 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+cc Sathy]
> 
> On Fri, May 22, 2020 at 01:31:33AM +0800, Jonathan Cameron wrote:
> > pci_aer_clear_device_status() currently resets the device status even when
> > firmware first handling is going on.  In particular it resets it on the
> > root port.
> >
> > This has been discussed previously
> > https://lore.kernel.org/patchwork/patch/427375/.  
> 
> I don't think this reference is really pertinent, is it?  That patch
> to b2c8881da764 changes pci_cleanup_aer_uncorrect_error_status() so it
> doesn't clear PCI_ERR_UNCOR_STATUS in "firmware-first" mode.
> 
> But your patch only affects PCI_EXP_DEVSTA, not PCI_ERR_UNCOR_STATUS.

I'll be honest I've mostly forgotten my reasoning behind including that
reference.  Might have been as simple as I got lost in the renames.

I'll drop the reference.

> 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  drivers/pci/pcie/aer.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index f4274d301235..43e78b97ace6 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -373,6 +373,9 @@ void pci_aer_clear_device_status(struct pci_dev *dev)
> >  {
> >  	u16 sta;
> >  
> > +	if (pcie_aer_get_firmware_first(dev))
> > +		return;  
> 
> This needs to be adjusted because pcie_aer_get_firmware_first() no
> longer exists after 708b20003624 ("PCI/AER: Remove HEST/FIRMWARE_FIRST
> parsing for AER ownership").
> 
> This will use the _OSC AER ownership bit to gate clearing of the
> status bits in the PCIe capability (not the AER capability).
> 
> I think that's the right thing to do, but it's certainly not obvious
> from the _OSC description in the PCI Firmware Spec r3.2.  I think we
> need a pointer to the ECN that clarifies this, i.e., sec 4.5.1 of:
> 
>   System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
>   2020, affecting PCI Firmware Specification, Rev. 3.2
>   https://members.pcisig.com/wg/PCI-SIG/document/14076

Thanks. I'll add that (though can't check the document currently
for reasons you can probably figure out *sigh*)

Note this patch is rather tangential to patch 2 which is the one
I really need feedback on.  Whilst this appeared to be
wrong it is 'mostly harmless'.

> 
> >  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
> >  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
> >  }
> > -- 
> > 2.19.1
> >   



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling.
  2020-06-16 18:00     ` Kuppuswamy, Sathyanarayanan
@ 2020-06-17  9:31       ` Jonathan Cameron
  2020-06-17 20:57         ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-17  9:31 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan
  Cc: Bjorn Helgaas, Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi

On Tue, 16 Jun 2020 11:00:32 -0700
"Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:

> Hi Jonathan,
> 
> On 6/16/20 10:47 AM, Bjorn Helgaas wrote:
> > [+cc Sathy]
> > 
> > On Fri, May 22, 2020 at 01:31:33AM +0800, Jonathan Cameron wrote:  
> >> pci_aer_clear_device_status() currently resets the device status even when
> >> firmware first handling is going on.  In particular it resets it on the
> >> root port.
> >>
> >> This has been discussed previously
> >> https://lore.kernel.org/patchwork/patch/427375/.  
> pci_aer_clear_device_status() is only used by handle_error_source(). And
> I don't think handle_error_source() is called in FF mode. Can you
> give more details on this issue ?

It's called in pcie_do_recovery

https://elixir.bootlin.com/linux/latest/source/drivers/pci/pcie/err.c#L200

Which is called from both handle_error_source and aer_recover_work_func.

indirectly called from ghes_handle_aer / ghes_do_proc

This particular flow will only happen (I think) on hardware reduced ACPI systems.

Jonathan

> > 
> > I don't think this reference is really pertinent, is it?  That patch
> > to b2c8881da764 changes pci_cleanup_aer_uncorrect_error_status() so it
> > doesn't clear PCI_ERR_UNCOR_STATUS in "firmware-first" mode.
> > 
> > But your patch only affects PCI_EXP_DEVSTA, not PCI_ERR_UNCOR_STATUS.
> >   
> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> ---
> >>   drivers/pci/pcie/aer.c | 3 +++
> >>   1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> >> index f4274d301235..43e78b97ace6 100644
> >> --- a/drivers/pci/pcie/aer.c
> >> +++ b/drivers/pci/pcie/aer.c
> >> @@ -373,6 +373,9 @@ void pci_aer_clear_device_status(struct pci_dev *dev)
> >>   {
> >>   	u16 sta;
> >>   
> >> +	if (pcie_aer_get_firmware_first(dev))
> >> +		return;  
> > 
> > This needs to be adjusted because pcie_aer_get_firmware_first() no
> > longer exists after 708b20003624 ("PCI/AER: Remove HEST/FIRMWARE_FIRST
> > parsing for AER ownership").
> > 
> > This will use the _OSC AER ownership bit to gate clearing of the
> > status bits in the PCIe capability (not the AER capability).
> > 
> > I think that's the right thing to do, but it's certainly not obvious
> > from the _OSC description in the PCI Firmware Spec r3.2.  I think we
> > need a pointer to the ECN that clarifies this, i.e., sec 4.5.1 of:
> > 
> >    System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
> >    2020, affecting PCI Firmware Specification, Rev. 3.2
> >    https://members.pcisig.com/wg/PCI-SIG/document/14076
> >   
> >>   	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
> >>   	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
> >>   }
> >> -- 
> >> 2.19.1
> >>  
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling.
  2020-06-17  9:31       ` Jonathan Cameron
@ 2020-06-17 20:57         ` Kuppuswamy, Sathyanarayanan
  0 siblings, 0 replies; 19+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-06-17 20:57 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Bjorn Helgaas, Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi

Hi,

On 6/17/20 2:31 AM, Jonathan Cameron wrote:
> On Tue, 16 Jun 2020 11:00:32 -0700
> "Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> 
>> Hi Jonathan,
>>
>> On 6/16/20 10:47 AM, Bjorn Helgaas wrote:
>>> [+cc Sathy]
>>>
>>> On Fri, May 22, 2020 at 01:31:33AM +0800, Jonathan Cameron wrote:
>>>> pci_aer_clear_device_status() currently resets the device status even when
>>>> firmware first handling is going on.  In particular it resets it on the
>>>> root port.
>>>>
>>>> This has been discussed previously
>>>> https://lore.kernel.org/patchwork/patch/427375/.
>> pci_aer_clear_device_status() is only used by handle_error_source(). And
>> I don't think handle_error_source() is called in FF mode. Can you
>> give more details on this issue ?
> 
> It's called in pcie_do_recovery
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/pci/pcie/err.c#L200
> 
> Which is called from both handle_error_source and aer_recover_work_func.
> 
> indirectly called from ghes_handle_aer / ghes_do_proc
> 
> This particular flow will only happen (I think) on hardware reduced ACPI systems.
Ok. Makes sense.
> 
> Jonathan
> 
>>>
>>> I don't think this reference is really pertinent, is it?  That patch
>>> to b2c8881da764 changes pci_cleanup_aer_uncorrect_error_status() so it
>>> doesn't clear PCI_ERR_UNCOR_STATUS in "firmware-first" mode.
>>>
>>> But your patch only affects PCI_EXP_DEVSTA, not PCI_ERR_UNCOR_STATUS.
>>>    
>>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>> ---
>>>>    drivers/pci/pcie/aer.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>> index f4274d301235..43e78b97ace6 100644
>>>> --- a/drivers/pci/pcie/aer.c
>>>> +++ b/drivers/pci/pcie/aer.c
>>>> @@ -373,6 +373,9 @@ void pci_aer_clear_device_status(struct pci_dev *dev)
>>>>    {
>>>>    	u16 sta;
>>>>    
>>>> +	if (pcie_aer_get_firmware_first(dev))
use if (!pcie_aer_is_native(dev))
>>>> +		return;
>>>
>>> This needs to be adjusted because pcie_aer_get_firmware_first() no
>>> longer exists after 708b20003624 ("PCI/AER: Remove HEST/FIRMWARE_FIRST
>>> parsing for AER ownership").
>>>
>>> This will use the _OSC AER ownership bit to gate clearing of the
>>> status bits in the PCIe capability (not the AER capability).
>>>
>>> I think that's the right thing to do, but it's certainly not obvious
>>> from the _OSC description in the PCI Firmware Spec r3.2.  I think we
>>> need a pointer to the ECN that clarifies this, i.e., sec 4.5.1 of:
>>>
>>>     System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
>>>     2020, affecting PCI Firmware Specification, Rev. 3.2
>>>     https://members.pcisig.com/wg/PCI-SIG/document/14076
>>>    
>>>>    	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>>>>    	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>>>>    }
>>>> -- 
>>>> 2.19.1
>>>>   
>>
> 
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-18 16:20       ` Bjorn Helgaas
@ 2020-06-18 16:38         ` Jonathan Cameron
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-18 16:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Kuppuswamy, Sathyanarayanan, sean.v.kelley, Bjorn Helgaas,
	linux-pci, linux-acpi, linuxarm, Lorenzo Pieralisi

On Thu, 18 Jun 2020 11:20:04 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Thu, Jun 18, 2020 at 09:48:29AM +0100, Jonathan Cameron wrote:
> > On Wed, 17 Jun 2020 11:25:55 -0700
> > "Kuppuswamy, Sathyanarayanan"
> > <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:  
> > > On 6/17/20 10:36 AM, Sean V Kelley wrote:  
> > > > On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote:    
> > > >> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron
> > > >> wrote:    
> 
> > > IIUC, we are trying to solve multiple issues here.
> > > 
> > > 1. Error detection and recovery support for RCiEPs and RCEC.
> > > 2. Firmware first exception for case 1.
> > > 3. AEPI based handling for case 1 (I think this is the case
> > > Jonathan is trying to handle)  
> > 
> > I'm not sure it separates that cleanly but the flow I'm interested
> > in is firmware first with errors reported using APEI / GHESv2 etc.
> > 
> > In particular without an RCEC, as it should (I think) play no part
> > in that path.  One of the main aims of me bringing this forwards at
> > this stage is to establish whether I need to get our hardware teams
> > to put an RCEC in place for future hardware or not.  Right now we
> > have some work arounds in place as we can reroute some of these
> > errors directly to device interrupts.
> > 
> > We haven't been able to come up with a reason why we need an RCEC
> > given our approach to error handling.  
> 
> If you only want to use ACPI APEI and you never need native AER
> support, you may not need an RCEC.  But note that an RCEC *is*
> required if you want PME messages or Function Readiness Status (FRS)
> for RCiEPs.

Understood.

> 
> > > >>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > > >>> index 14bb8f54723e..d34be4483f73 100644
> > > >>> --- a/drivers/pci/pcie/err.c
> > > >>> +++ b/drivers/pci/pcie/err.c
> > > >>> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct
> > > >>> pci_dev *dev,
> > > >>>   	pci_ers_result_t status =
> > > >>> PCI_ERS_RESULT_CAN_RECOVER; struct pci_bus *bus;    
> > > I am curious what bus (dev->subordinate) does RCEC and RCiEP
> > > belongs to ?  
> > 
> > I'm not quite sure what you are asking so...
> > 
> > They are effectively the same as root ports, and so sit on a bus
> > specified via ACPI. In our case IORT. The root complex includes a
> > bunch of bus numbers on which these devices can be discovered.
> > 0, 74, 78, 7a, 7b, 7c, 80, b4, ba, bb, b8, bc on the 2 socket
> > machine I have to hand.  Those buses have a mix of root ports, and
> > RCiEPs on them.  
> 
> I don't know much about IORT, but it's not the way we discover these
> devices.
> 
> RCiEPs and RCECs should be discovered the same way as Root Ports: a
> PNP0A03 device in the ACPI namespace describes a PCI host bridge, and
> the _CRS method of that PNP0A03 device contains a bus number range.
> The base of that range is the root "bus" on which Root Ports, RCiEPs,
> RCECs, and possibly some Conventional PCI devices exist.
> 
> The machine Jonathan mentioned above should have 12 PNP0A03 devices,
> one for each of the root buses he listed.

You are of course correct. That particular bit of info is available in
both places, but enumeration comes the DSDT entries you mention.

> 
> > As to subordinate (i.e. where they bridge to) I don't think it
> > would have any meaning.   For reference I checked one of our RCiEPs
> > and it's set to 0 on the config space.  
> 
> RCiEPs and RCECs are type 0 devices, so they're not bridges.  They
> have no secondary bus.

Thanks

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-18  8:48     ` Jonathan Cameron
  2020-06-18 16:04       ` Jonathan Cameron
@ 2020-06-18 16:20       ` Bjorn Helgaas
  2020-06-18 16:38         ` Jonathan Cameron
  1 sibling, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2020-06-18 16:20 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Kuppuswamy, Sathyanarayanan, sean.v.kelley, Bjorn Helgaas,
	linux-pci, linux-acpi, linuxarm, Lorenzo Pieralisi

On Thu, Jun 18, 2020 at 09:48:29AM +0100, Jonathan Cameron wrote:
> On Wed, 17 Jun 2020 11:25:55 -0700
> "Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> > On 6/17/20 10:36 AM, Sean V Kelley wrote:
> > > On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote:  
> > >> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:  

> > IIUC, we are trying to solve multiple issues here.
> > 
> > 1. Error detection and recovery support for RCiEPs and RCEC.
> > 2. Firmware first exception for case 1.
> > 3. AEPI based handling for case 1 (I think this is the case Jonathan is
> > trying to handle)
> 
> I'm not sure it separates that cleanly but the flow I'm interested
> in is firmware first with errors reported using APEI / GHESv2 etc.
> 
> In particular without an RCEC, as it should (I think) play no part
> in that path.  One of the main aims of me bringing this forwards at
> this stage is to establish whether I need to get our hardware teams
> to put an RCEC in place for future hardware or not.  Right now we have
> some work arounds in place as we can reroute some of these errors directly
> to device interrupts.
> 
> We haven't been able to come up with a reason why we need an RCEC given
> our approach to error handling.

If you only want to use ACPI APEI and you never need native AER
support, you may not need an RCEC.  But note that an RCEC *is*
required if you want PME messages or Function Readiness Status (FRS)
for RCiEPs.

> > >>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > >>> index 14bb8f54723e..d34be4483f73 100644
> > >>> --- a/drivers/pci/pcie/err.c
> > >>> +++ b/drivers/pci/pcie/err.c
> > >>> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct
> > >>> pci_dev *dev,
> > >>>   	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> > >>>   	struct pci_bus *bus;  
> > I am curious what bus (dev->subordinate) does RCEC and RCiEP belongs to 
> > ?
> 
> I'm not quite sure what you are asking so...
> 
> They are effectively the same as root ports, and so sit on a bus specified
> via ACPI. In our case IORT. The root complex includes a bunch of bus
> numbers on which these devices can be discovered.
> 0, 74, 78, 7a, 7b, 7c, 80, b4, ba, bb, b8, bc on the 2 socket machine I have
> to hand.  Those buses have a mix of root ports, and RCiEPs on them.

I don't know much about IORT, but it's not the way we discover these
devices.

RCiEPs and RCECs should be discovered the same way as Root Ports: a
PNP0A03 device in the ACPI namespace describes a PCI host bridge, and
the _CRS method of that PNP0A03 device contains a bus number range.
The base of that range is the root "bus" on which Root Ports, RCiEPs,
RCECs, and possibly some Conventional PCI devices exist.

The machine Jonathan mentioned above should have 12 PNP0A03 devices,
one for each of the root buses he listed.

> As to subordinate (i.e. where they bridge to) I don't think it would have
> any meaning.   For reference I checked one of our RCiEPs and it's set
> to 0 on the config space.

RCiEPs and RCECs are type 0 devices, so they're not bridges.  They
have no secondary bus.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-18  8:48     ` Jonathan Cameron
@ 2020-06-18 16:04       ` Jonathan Cameron
  2020-06-18 16:20       ` Bjorn Helgaas
  1 sibling, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-18 16:04 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan
  Cc: Lorenzo Pieralisi, linux-pci, linuxarm, linux-acpi,
	Bjorn Helgaas, Bjorn Helgaas, sean.v.kelley

On Thu, 18 Jun 2020 09:48:29 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 17 Jun 2020 11:25:55 -0700
> "Kuppuswamy, Sathyanarayanan"
> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> 
> > On 6/17/20 10:36 AM, Sean V Kelley wrote:  
> > > On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote:    
> > >> Bcc:
> > >> Subject: Re: [PATCH 2/2] PCI/AER: Add partial initial support for
> > >> RCiEPs
> > >>   using RCEC or firmware first
> > >> Reply-To:
> > >> In-Reply-To:
> > >> <20200521173134.2456773-3-Jonathan.Cameron@huawei.com>
> > >>
> > >> [+cc Sathy, Sean]
> > >>
> > >> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron
> > >> wrote:    
> > >>> Note this provides complete support for our usecase on an ARM
> > >>> server using
> > >>> Hardware Reduced ACPI and adds appropriate place for an RCEC
> > >>> driver to hook
> > >>> if someone else cares to write one, either for firmware first
> > >>> handling on
> > >>> non Hardware Reduced ACPI or for kernel first AER handling.    
> > >>
> > >> This provides complete support?  I'm really confused, since this
> > >> relies on dev->rcec, which is never set.  And I don't see
> > >> anything about hooks for RCEC drivers.
> > >>    
> > >>> For Root Complex integrated End Points (RCiEPs) there is no root
> > >>> port to
> > >>> discover and hence we cannot walk the bus from the root port to
> > >>> do appropriate resets.
> > >>>
> > >>> The PCI specification provides Root Complex Event Collectors to
> > >>> deal with
> > >>> this circumstance.  These are peer RCiEPs that provide (amongst
> > >>> other
> > >>> things) collection + interrupt facilities for AER reporting for
> > >>> a set of
> > >>> RCiEPs in the same root complex.
> > >>>
> > >>> In the case of a Hardware Reduced ACPI platform, the AER errors
> > >>> are reported via a GHESv2 path using CPER records as defined in
> > >>> the UEFI
> > >>> specification.  These are intended to provide complete
> > >>> information and
> > >>> appropriate hand shake in a fashion that does not require a
> > >>> specific form
> > >>> of error reporting hardware.  This is contrast to AER handling
> > >>> via the
> > >>> various HEST entries for PCI Root Port and PCI Device etc where
> > >>> we do
> > >>> require direct access to the RCEC.    
> > >>
> > >> Can you include pointers to relevant spec sections for these
> > >> differences between hardware-reduced and other platforms?
> > >>
> > >> This patch doesn't seem to depend on anything about ACPI, APEI,
> > >> firmware-first, or hardware-reduced platforms.
> > >>    
> > >>> As such my interpretation of the spec is that a Reduced Hardware
> > >>> ACPI
> > >>> platform should not access the RCEC from the OS at all during
> > >>> AER handling,
> > >>> and in fact is welcome to use non standard hardware interfaces
> > >>> to provide
> > >>> the equivalent functionality in any fashion it wishes (as all
> > >>> hidden beind
> > >>> the firmware).    
> > > 
> > > 
> > > I'm not sure what you mean by Hardware Reduced ACPI platform, but
> > > you seem to be implying that in this case your hardware lacks
> > > RCECs and so are using firmware specific handling.
> > > 
> > > In 1.3.2.3 (Root Complex Integrated Endpoint rules)
> > > 
> > > If an RCiEP is associated with an optional Root Complex Event
> > > Collector it must signal PME and error conditions through a Root
> > > Complex Event Collector.
> > > 
> > > If the RCEC is not supported/present then the expectation prior
> > > to PCIe 5 is that the RCiEP will use the same mechanism as PCI
> > > systems.  However, if the RCiEP asserts say an error signal there
> > > is no Root Port and the OS has no way of knowing what interrupt
> > > the error is conntected to.  Linux doesn't have support for that
> > > and this was discussed prior here:
> > > 
> > > https://lore.kernel.org/lkml/20190709134538.GA35486@google.com/
> > > 
> > > In such a case, are you then implying that the _OSC method is not
> > > granting control of PCIe Native Power Management Events to the OS
> > > and so are falling back to your defined ACPI mechanism on your
> > > platform?
> > > 
> > > I'm currently working on adding support for RCECs in AER that
> > > would make use of the extended capabilities for identifying the
> > > assocated RCeIPs for purposes of the PME and error condition
> > > signaling.    
> > 
> > IIUC, we are trying to solve multiple issues here.
> > 
> > 1. Error detection and recovery support for RCiEPs and RCEC.
> > 2. Firmware first exception for case 1.
> > 3. AEPI based handling for case 1 (I think this is the case
> > Jonathan is trying to handle)  
> 
> I'm not sure it separates that cleanly but the flow I'm interested
> in is firmware first with errors reported using APEI / GHESv2 etc.
> 
> In particular without an RCEC, as it should (I think) play no part
> in that path.  One of the main aims of me bringing this forwards at
> this stage is to establish whether I need to get our hardware teams
> to put an RCEC in place for future hardware or not.  Right now we have
> some work arounds in place as we can reroute some of these errors
> directly to device interrupts.
> 
> We haven't been able to come up with a reason why we need an RCEC
> given our approach to error handling.
> 
> > 
> > For adding support for case 1,
> > 
> > 1. I think we need to first make the AER driver RCEC aware.
> > 2. Once AER driver is modified to receive IRQs for RCEC error
> >     events, then we can modify pcie_do_recovery() to handle
> >     recovery for RCiEPs and RCEC.
> > 
> > I recommend adding support for basic case first and then add
> > exceptions for Firmware First and AEPI based support.  
> 
> Sounds good as long as progress is timely.
> 
> As both Bjorn and yourself have suggested, we can perhaps
> make pci_walk_bus work when it is passed an RCiEP directly.
> 
> If that an be done, the case I care about should work with very
> minimal changes.
> 
> I'll take a look at how cleanly that can be done.
> 
> >   
> > > 
> > > Thanks,
> > > 
> > > Sean
> > > 
> > >     
> > >>
> > >> A pointer to the spec you're interpreting would be helpful here,
> > >> too.
> > >>
> > >> s/Reduced Hardware/Hardware-Reduced/ to match terminology in spec
> > >> (I'm
> > >> looking at ACPI v6.3, sec 4.1).  Also below in code comments.
> > >>
> > >> s/beind/behind/
> > >>    
> > >>> Hence I am making the provision of an RCEC optional.
> > >>>
> > >>> The aim of the rest of the code was to replicate the actions
> > >>> that would
> > >>> have occurred if this had been an EP below a root port. Some of
> > >>> them make
> > >>> absolutely no sense, but I hope this RFC can start a discussion
> > >>> on what
> > >>> we should be doing under these circumstances.
> > >>>
> > >>> It probably makes sense to pull this new block of code out to a
> > >>> separate
> > >>> function but for the RFC I've left it in place to keep it next
> > >>> to the
> > >>> existing path.    
> > >>
> > >> OK, my comment is: I really hope we don't need a separate path.
> > >> If we
> > >> need a test or two for RCiEPs, that's fine.  But two paths sounds
> > >> like
> > >> a nightmare to maintain.
> > >>    
> > >>> It appears that the current kernel first code does not support
> > >>> detecting
> > >>> the multiple error bits being set in the root port error status
> > >>> register.
> > >>> This seems like a limitation both the normal EP / Root Port case
> > >>> and
> > >>> for RCiEPs.    
> > >>
> > >> Is this paragraph supposed to be a bug report?  It doesn't seem
> > >> to say
> > >> anything about what *this* patch does.  Maybe this should be
> > >> part of the commit log for a separate patch?
> > >>    
> > >>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > >>> ---
> > >>>   drivers/pci/pcie/err.c | 61
> > >>> ++++++++++++++++++++++++++++++++++++++++++
> > >>>   include/linux/pci.h    |  1 +
> > >>>   2 files changed, 62 insertions(+)
> > >>>
> > >>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > >>> index 14bb8f54723e..d34be4483f73 100644
> > >>> --- a/drivers/pci/pcie/err.c
> > >>> +++ b/drivers/pci/pcie/err.c
> > >>> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct
> > >>> pci_dev *dev,
> > >>>   	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> > >>>   	struct pci_bus *bus;    
> > I am curious what bus (dev->subordinate) does RCEC and RCiEP
> > belongs to ?  
> 
> I'm not quite sure what you are asking so...
> 
> They are effectively the same as root ports, and so sit on a bus
> specified via ACPI. In our case IORT. The root complex includes a
> bunch of bus numbers on which these devices can be discovered.
> 0, 74, 78, 7a, 7b, 7c, 80, b4, ba, bb, b8, bc on the 2 socket machine
> I have to hand.  Those buses have a mix of root ports, and RCiEPs on
> them.
> 
> As to subordinate (i.e. where they bridge to) I don't think it would
> have any meaning.   For reference I checked one of our RCiEPs and
> it's set to 0 on the config space.
> 
> 
> > Does all RCiEPs are in same bus ?  
> 
> No, though for any given RCiEP the associated RCEC would have to be
> on the same bus. So in the above example we would need quite a few
> of them.

One thing to note is we could minimize the changes to the code by
getting the bus on which a device resides directly rather than
bouncing via the root port and back again.

Something like
-       if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
-             pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
-               dev = dev->bus->self;
-       bus = dev->subordinate;
-
+       if ((pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+            pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
+               bus = dev->subordinate;
+       else
+               bus = dev->bus;
+       if (bus->self)
+               dev = bus->self;

(with some special handling to avoid calling root port specific
parts on the RCiEP)

We could check pci_is_root_bus for that last check and it might be
slightly more informative.

But then an AER error on an RCiEP resets everything on the same bus
within the Root Complex.  

I don't think we want to do this as the same logic about resetting the
PCIe bus and everything below doesn't really apply.

Hence I'll try the pci_walk_bus variant as previously suggested.

Jonathan

> 
> > >>>   
> > >>> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {    
> > Also, instead of creating a new path for RCiEPs, I recommend fixing
> > the pci_walk_bus() part. That will reduce code duplication.  
> > >>> +		struct pci_dev *rcec = dev->rcec;
> > >>> +		/* Not clear this makes any sense - we can't
> > >>> reset link anyway...*/
> > >>> +		if (state == pci_channel_io_frozen) {
> > >>> +			report_frozen_detected(dev, &status);
> > >>> +			pci_err(dev, "io is frozen and cannot
> > >>> reset link\n");
> > >>> +			goto failed;
> > >>> +		} else {
> > >>> +			report_normal_detected(dev, &status);
> > >>> +		}    
> > >>
> > >> I don't understand where you're going with this.  I think you're
> > >> adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
> > >> there's no link leading to them, but we should still be able to
> > >> reset the RCiEP (not the RCEC) via FLR, if it supports that.
> > >>
> > >> And all the driver callbacks should be for the RCiEP, not the
> > >> RCEC, shouldn't they?  I really hope we can avoid duplicating
> > >> this whole path.  It will be hard to keep the two paths in sync.
> > >>    
> > >>> +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> > >>> +			status = PCI_ERS_RESULT_RECOVERED;
> > >>> +			pci_dbg(dev, "broadcast mmio_enabled
> > >>> message\n");
> > >>> +			report_mmio_enabled(dev, &status);
> > >>> +		}
> > >>> +
> > >>> +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> > >>> +			/* No actual slot reset possible */
> > >>> +			status = PCI_ERS_RESULT_RECOVERED;
> > >>> +			pci_dbg(dev, "broadcast slot_reset
> > >>> message\n");
> > >>> +			report_slot_reset(dev, &status);
> > >>> +		}
> > >>> +
> > >>> +		if (status != PCI_ERS_RESULT_RECOVERED)
> > >>> +			goto failed;
> > >>> +
> > >>> +		report_resume(dev, &status);
> > >>> +
> > >>> +		/*
> > >>> +		 * These two should be called on the RCEC  -
> > >>> but in case
> > >>> +		 * of firmware first they should be no-ops.
> > >>> Given that
> > >>> +		 * in a reduced hardware ACPI system, it is
> > >>> possible there
> > >>> +		 * is no standard compliant RCEC at all.
> > >>> +		 *
> > >>> +		 * Add some sort of check on what type of HEST
> > >>> entries we have?
> > >>> +		 */
> > >>> +		if (rcec) {
> > >>> +			/*
> > >>> +			 * Unlike the upstream port case for
> > >>> an EP, we have not
> > >>> +			 * issued a reset on all device the
> > >>> RCEC handles, so
> > >>> +			 * perhaps we should be more careful
> > >>> about resetting
> > >>> +			 * the status registers on the RCEC?
> > >>> +			 *
> > >>> +			 * In particular we may need provide a
> > >>> means to handle
> > >>> +			 * the multiple error bits being set in
> > >>> PCI_ERR_ROOT_STATUS
> > >>> +			 */
> > >>> +			pci_aer_clear_device_status(rcec);
> > >>> +			pci_aer_clear_nonfatal_status(rcec);
> > >>> +			/*
> > >>> +			 * Non RCiEP case uses the downstream
> > >>> port above the device
> > >>> +			 * for this message.
> > >>> +			 */
> > >>> +			pci_info(rcec, "device recovery
> > >>> successful\n");
> > >>> +		} else {
> > >>> +			pci_info(dev, "device recovery
> > >>> successful\n");
> > >>> +		}
> > >>> +
> > >>> +		return status;
> > >>> +	}
> > >>> +
> > >>>   	/*
> > >>>   	 * Error recovery runs on all subordinates of the
> > >>> first downstream port.
> > >>>   	 * If the downstream port detected the error, it is
> > >>> cleared at the end.
> > >>> diff --git a/include/linux/pci.h b/include/linux/pci.h
> > >>> index 83ce1cdf5676..cb21dfe05f8c 100644
> > >>> --- a/include/linux/pci.h
> > >>> +++ b/include/linux/pci.h
> > >>> @@ -298,6 +298,7 @@ struct pci_dev {
> > >>>   	struct list_head bus_list;	/* Node in per-bus
> > >>> list */ struct pci_bus	*bus;		/* Bus this
> > >>> device is on */ struct pci_bus	*subordinate;	/*
> > >>> Bus this device bridges to */
> > >>> +	struct pci_dev	*rcec;		/* Root
> > >>> Complex Event Collector used */    
> > >>
> > >> Nothing ever sets this, so I guess the critical connection
> > >> between RCiEP and RCEC is missing?  Each patch needs to make
> > >> sense on its own,
> > >> so the patch that adds this struct member should also add
> > >> something that sets it and uses it.
> > >>    
> > >>>   	void		*sysdata;	/* Hook for
> > >>> sys-specific extension */
> > >>>   	struct proc_dir_entry *procent;	/* Device
> > >>> entry in /proc/bus/pci */
> > >>> -- 
> > >>> 2.19.1
> > >>>    
> > >     
> >   
> 
> 
> _______________________________________________
> Linuxarm mailing list
> Linuxarm@huawei.com
> http://hulk.huawei.com/mailman/listinfo/linuxarm


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-17 17:26   ` Bjorn Helgaas
@ 2020-06-18  9:32     ` Jonathan Cameron
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-18  9:32 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan, Sean V Kelley

On Wed, 17 Jun 2020 12:26:10 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Wed, Jun 17, 2020 at 12:40:36PM +0100, Jonathan Cameron wrote:
> > On Tue, 16 Jun 2020 14:24:41 -0500
> > Bjorn Helgaas <helgaas@kernel.org> wrote:
> > Hi Bjorn,
> > 
> > Thanks for looking at this.  I got a bit carried away in this description
> > with trying to show that we don't need an RCEC or software support for
> > one and that has made the description rather confused. Sorry about that!
> > 
> > The complete lack of any consistent diagram of the various options in any
> > of the related specs has lead me down more than one dead end trying to work
> > this out.  Looking at the specs today I'm increasingly convinced the
> > question of Hardware-Reduced vs normal doesn't matter.
> > 
> > Let me try proposing a brief replacement description:
> > 
> > "Root Complex integrate End Points (RCiEPs) are not found below a
> > root port.  (PCI Express Base Specification 4.0 1.3.2.3, 7.1 for
> > topology) As such the error handling needs to perform actions on
> > only the device, rather than walking the bus as is done for
> > conventional EPs.
> > 
> >  In firmware-first error handling there is no need to directly
> >  access the Root Complex Event Collector (RCEC) as the firmware is
> >  responsible for all actions touching it.  The implementation of the
> >  RCEC device may not be compliant with the PCIe Spec as the OS does
> >  not access it during error handling.  (RCEC defined in PCI Express
> >  Base Specification 4.0 1.3.4)
> > 
> >  (Can drop next bit if we drop the code) If kernel-first error
> >  handling is in use, handling AER errors for RCiEPs requires access
> >  to the RCEC they are associated with. Support for this is not
> >  included in this RFC due to a lack of available test platform." 
> >   
> > > On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:  
> > > > Note this provides complete support for our usecase on an ARM
> > > > server using Hardware Reduced ACPI and adds appropriate place
> > > > for an RCEC driver to hook if someone else cares to write one,
> > > > either for firmware first handling on non Hardware Reduced ACPI
> > > > or for kernel first AER handling.    
> > > 
> > > This provides complete support?  I'm really confused, since this
> > > relies on dev->rcec, which is never set.  And I don't see anything
> > > about hooks for RCEC drivers.  
> > 
> > In our configuration we only support firmware first.  For that we
> > don't need dev->rcec to be set.
> > 
> > For our case, the OS should not in any way touch the RCEC (in fact,
> > as far as I can tell, it doesn't actually need to exist - and for
> > some of our platforms it doesn't.  An impdef bit of hardware can do
> > the same job.)
> > 
> > The information that could be read from the RCEC is provided in a
> > CPER record via GHESv2.  Confirmation that the OS has done
> > everything it needs to with the error is done via a handshake in the
> > GHESv2 Error Status Block.
> > 
> > Hence for the particular corner case we care about this code
> > provides everything necessary.  The stubs of RCEC support are there
> > just to indicate how it 'might' fit with a model where the RCEC is
> > needed to get information about the error etc.  I'm more than happy
> > to drop them and perhaps put in a comment to put anyone needing them
> > on the right track.
> > 
> > I put this statement around 'fully support in our case' here to
> > indicate that the 'partial initial support' in the title is actually
> > sufficient for some systems.
> >   
> > > > For Root Complex integrated End Points (RCiEPs) there is no root
> > > > port to discover and hence we cannot walk the bus from the root
> > > > port to do appropriate resets.
> > > > 
> > > > The PCI specification provides Root Complex Event Collectors to
> > > > deal with this circumstance.  These are peer RCiEPs that provide
> > > > (amongst other things) collection + interrupt facilities for AER
> > > > reporting for a set of RCiEPs in the same root complex.
> > > > 
> > > > In the case of a Hardware Reduced ACPI platform, the AER errors
> > > > are reported via a GHESv2 path using CPER records as defined in
> > > > the UEFI specification.  These are intended to provide complete
> > > > information and appropriate hand shake in a fashion that does
> > > > not require a specific form of error reporting hardware.  This
> > > > is contrast to AER handling via the various HEST entries for PCI
> > > > Root Port and PCI Device etc where we do require direct access
> > > > to the RCEC.    
> > > 
> > > Can you include pointers to relevant spec sections for these
> > > differences between hardware-reduced and other platforms?  
> > 
> > As mentioned above, I think I went down a dead end on this
> > description.  I think the lack of need for an RCEC in firmware first
> > handling is equally valid in all Firmware first cases.
> > 
> > I can have a go at highlighting relevant spec entries, though its
> > scattered across the ACPI spec and UEFI spec.  Focusing just on the
> > elements relevant to RAS handling...
> > 
> > The very brief version is that in Hardware Reduced ACPI all error
> > information is gathered via a management processor (or firmware
> > doing the same job) and presented as a descriptive record. There is
> > also a generic handshake to acknowledge the error without needing
> > anything hardware specific.
> > 
> > My confusion lay around the non Hardware-Reduced case.  I'm not
> > totally clear on what happens in that path and don't have any
> > hardware to look at.  So my assumption was that it used the HEST
> > entries for PCIe Root Port etc to identify where to find the error.
> > I now 'think' that isn't true and it uses GHES records. If anyone
> > can point me to a reference for this flow that would be great.
> > 
> > HEST can also include GHESv2 entries as defined in ACPI 6.3, section
> > 18.3.2.8 There error flow is the same for all GHESv2 error types:
> > 
> > "These are the steps the OS must take once detecting an error from a
> > particular GHESv2 error source:
> > •OSPM detects error (via interrupt/exception or polling the block status)
> > •OSPM copies the error status block
> > •OSPM clears the block status field of the error status block
> > •OSPM acknowledges the error via Read Ack register. For example:
> > —OSPM reads the Read Ack register  X
> > —OSPM writes  (( X & ReadAckPreserve) | ReadAckWrite)"
> > 
> > Referring back to the GHES description in ACPI 6.3 18.3.2.7
> > We have a Generic Error Status Block which has a bunch of
> > Generic Error Data Entries (18-392). Those contain CPER
> > records.
> > 
> > CPER record types are defined in the UEFI spec, appendix N.
> > These are identified by GUID and there is one for PCIE errors.
> > (table 54)  Definition of that is in N2.7.
> > It includes the source, plus root port / bridge address and
> > (potentially) a copy of the PCIe Advanced Error Reporting
> > Extended Capability Structure
> > 
> > Everything you might otherwise read from the AER registers should
> > be present in this record.  The basic aim being that you shouldn't
> > need to read those PCIe registers directly (and may not be able
> > to).
> >   
> > > This patch doesn't seem to depend on anything about ACPI, APEI,
> > > firmware-first, or hardware-reduced platforms.  
> > 
> > The only thing it really depends on is whether an RCEC is present.
> > It is possible to have a valid platform that doesn't need one.
> > The reference to firmware-first etc are about establishing that it
> > is optional.
> >   
> > > > As such my interpretation of the spec is that a Reduced Hardware
> > > > ACPI platform should not access the RCEC from the OS at all
> > > > during AER handling, and in fact is welcome to use non standard
> > > > hardware interfaces to provide the equivalent functionality in
> > > > any fashion it wishes (as all hidden beind the firmware).    
> > > 
> > > A pointer to the spec you're interpreting would be helpful here,
> > > too.  
> > 
> > Same info as above.  Good info on what firmware first flow actually
> > means is hard to come by.  I have docs on our flows, but can't find
> > any on a typical x86 machine.  
> 
> "Firmware-first" is only mentioned in the ACPI spec, IIRC.  Last I
> looked I could not find any statement about what the OS should do if
> the FIRMWARE_FIRST bit is set, so I don't think Linux should look at
> it.
> 
> If your platform uses firmware-first, I assume Linux learns about
> errors via APEI, and we shouldn't need any changes except to deal with
> an RCiEP instead of the tree rooted at the device reporting an error.
> 
> We should be able to make a smart way to do this.  pci_walk_bus()
> currently takes a *bus* and deals with all the devices on that bus.
> But we should be able to make something similar that takes a *device*
> and deals with the device and any descendents.

Sounds like a good solution though I'm not totally clear how we do this
without causing confusion.

pci_walk_bus_below perhaps?  If we pass that the root port, it will
walk the bus as normal.  If we pass it a device with no bus below it
acts on the device itself?

There will still need to be a few tweaks around the actions on the
port but we can just make those dependent on dev actually being a 
port.

For the kernel-first case those actions will (I think) need to be
performed on the RCEC.  For now I'm happy to ignore that case.


> 
> > > > Hence I am making the provision of an RCEC optional.
> > > >
> > > > The aim of the rest of the code was to replicate the actions that would
> > > > have occurred if this had been an EP below a root port. Some of them make
> > > > absolutely no sense, but I hope this RFC can start a discussion on what
> > > > we should be doing under these circumstances.
> > > > 
> > > > It probably makes sense to pull this new block of code out to a separate
> > > > function but for the RFC I've left it in place to keep it next to the
> > > > existing path.    
> > > 
> > > OK, my comment is: I really hope we don't need a separate path.  If we
> > > need a test or two for RCiEPs, that's fine.  But two paths sounds like
> > > a nightmare to maintain.  
> > 
> > You can't walk the bus for RCiEPs so its going to be inherently different.
> > We could do it as a series of special cases though so it's obvious what
> > is going on. Would you prefer that?  
> 
> Yes.  An error at X affects X and the subtree below X.  If X happens
> to be an RCiEP, the subtree is empty.  That doesn't seem like a huge
> difference.

Agreed they are similar. An error in existing code at X affects the subtree
below the downstream port above X (or X if X is a downstream port).
As we are dealing with point 2 point links, that means X and its peer functions
plus anything below them.
There is also a different set of affects on the downstream port.

Here we don't have an downstream port above Y, or a subtree below Y, so
Y is kind of a magic subtree of 1.

If there is an RCEC that we need do things on, that will have be a special
case but only affects tiny bit of code at the end. 

> 
> > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > ---
> > > >  drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
> > > >  include/linux/pci.h    |  1 +
> > > >  2 files changed, 62 insertions(+)
> > > > 
> > > > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > > > index 14bb8f54723e..d34be4483f73 100644
> > > > --- a/drivers/pci/pcie/err.c
> > > > +++ b/drivers/pci/pcie/err.c
> > > > @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> > > >  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> > > >  	struct pci_bus *bus;
> > > >  
> > > > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> > > > +		struct pci_dev *rcec = dev->rcec;
> > > > +		/* Not clear this makes any sense - we can't reset link anyway...*/
> > > > +		if (state == pci_channel_io_frozen) {
> > > > +			report_frozen_detected(dev, &status);
> > > > +			pci_err(dev, "io is frozen and cannot reset link\n");
> > > > +			goto failed;
> > > > +		} else {
> > > > +			report_normal_detected(dev, &status);
> > > > +		}    
> > > 
> > > I don't understand where you're going with this.  I think you're
> > > adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
> > > there's no link leading to them, but we should still be able to reset
> > > the RCiEP (not the RCEC) via FLR, if it supports that.  
> > 
> > Agreed.  This code is operating on the RCiEP not the rcec. Only the
> > block below under the if (rcec) check touches that.   
> 
> Right, sorry, I misread this.
> 
> > It might help to think of this as walking a bus of one element. Hence
> > we are calling directly on the RCiEP rather than the bus walks in
> > the normal path.
> >   
> > > 
> > > And all the driver callbacks should be for the RCiEP, not the RCEC,
> > > shouldn't they?  I really hope we can avoid duplicating this whole
> > > path.  It will be hard to keep the two paths in sync.  
> > 
> > Yes, and they are unless I'm missing something. Except for the one
> > block below, which mirrors the actions taken on the root port in the
> > normal path.
> >   
> > >   
> > > > +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> > > > +			status = PCI_ERS_RESULT_RECOVERED;
> > > > +			pci_dbg(dev, "broadcast mmio_enabled message\n");
> > > > +			report_mmio_enabled(dev, &status);
> > > > +		}
> > > > +
> > > > +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> > > > +			/* No actual slot reset possible */
> > > > +			status = PCI_ERS_RESULT_RECOVERED;
> > > > +			pci_dbg(dev, "broadcast slot_reset message\n");
> > > > +			report_slot_reset(dev, &status);
> > > > +		}
> > > > +
> > > > +		if (status != PCI_ERS_RESULT_RECOVERED)
> > > > +			goto failed;
> > > > +
> > > > +		report_resume(dev, &status);
> > > > +
> > > > +		/*
> > > > +		 * These two should be called on the RCEC  - but in case
> > > > +		 * of firmware first they should be no-ops. Given that
> > > > +		 * in a reduced hardware ACPI system, it is possible there
> > > > +		 * is no standard compliant RCEC at all.
> > > > +		 *
> > > > +		 * Add some sort of check on what type of HEST entries we have?
> > > > +		 */
> > > > +		if (rcec) {  
> > 
> > This is the only bit that related to the RCEC.
> >   
> > > > +			/*
> > > > +			 * Unlike the upstream port case for an EP, we have not
> > > > +			 * issued a reset on all device the RCEC handles, so
> > > > +			 * perhaps we should be more careful about resetting
> > > > +			 * the status registers on the RCEC?
> > > > +			 *
> > > > +			 * In particular we may need provide a means to handle
> > > > +			 * the multiple error bits being set in PCI_ERR_ROOT_STATUS
> > > > +			 */
> > > > +			pci_aer_clear_device_status(rcec);
> > > > +			pci_aer_clear_nonfatal_status(rcec);
> > > > +			/*
> > > > +			 * Non RCiEP case uses the downstream port above the device
> > > > +			 * for this message.
> > > > +			 */
> > > > +			pci_info(rcec, "device recovery successful\n");
> > > > +		} else {
> > > > +			pci_info(dev, "device recovery successful\n");
> > > > +		}
> > > > +
> > > > +		return status;
> > > > +	}
> > > > +
> > > >  	/*
> > > >  	 * Error recovery runs on all subordinates of the first downstream port.
> > > >  	 * If the downstream port detected the error, it is cleared at the end.
> > > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > > index 83ce1cdf5676..cb21dfe05f8c 100644
> > > > --- a/include/linux/pci.h
> > > > +++ b/include/linux/pci.h
> > > > @@ -298,6 +298,7 @@ struct pci_dev {
> > > >  	struct list_head bus_list;	/* Node in per-bus list */
> > > >  	struct pci_bus	*bus;		/* Bus this device is on */
> > > >  	struct pci_bus	*subordinate;	/* Bus this device bridges to */
> > > > +	struct pci_dev	*rcec;		/* Root Complex Event Collector used */   
> > >
> > > Nothing ever sets this, so I guess the critical connection between
> > > RCiEP and RCEC is missing?  Each patch needs to make sense on its own,
> > > so the patch that adds this struct member should also add something
> > > that sets it and uses it.  
> > 
> > I'm happy to drop this. It's here only to try to make the point that the
> > infra-structure would be needed in the non Firmware-First case.
> > 
> > Intent was to illustrate that what I was defining for firmware first
> > would also work for kernel-first flows assuming someone actually put in place
> > infrastructure to hook up the RCEC here.
> > 
> > I was rather hoping someone would jump up and say 'I've got one of those!'.  
> 
> I think they're coming.  But I don't think they're relevant for the
> firmware-first situation you're trying to solve.

I'd certainly like to separate the two as makes my life easier. Sounds like
there is work in progress on this anyway from the other branch of this thread.
(not surprising given the requirements of certain future specs :)

Thanks

Jonathan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-17 18:25   ` Kuppuswamy, Sathyanarayanan
@ 2020-06-18  8:48     ` Jonathan Cameron
  2020-06-18 16:04       ` Jonathan Cameron
  2020-06-18 16:20       ` Bjorn Helgaas
  0 siblings, 2 replies; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-18  8:48 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan
  Cc: sean.v.kelley, Bjorn Helgaas, Bjorn Helgaas, linux-pci,
	linux-acpi, linuxarm, Lorenzo Pieralisi

On Wed, 17 Jun 2020 11:25:55 -0700
"Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:

> On 6/17/20 10:36 AM, Sean V Kelley wrote:
> > On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote:  
> >> Bcc:
> >> Subject: Re: [PATCH 2/2] PCI/AER: Add partial initial support for
> >> RCiEPs
> >>   using RCEC or firmware first
> >> Reply-To:
> >> In-Reply-To: <20200521173134.2456773-3-Jonathan.Cameron@huawei.com>
> >>
> >> [+cc Sathy, Sean]
> >>
> >> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:  
> >>> Note this provides complete support for our usecase on an ARM
> >>> server using
> >>> Hardware Reduced ACPI and adds appropriate place for an RCEC driver
> >>> to hook
> >>> if someone else cares to write one, either for firmware first
> >>> handling on
> >>> non Hardware Reduced ACPI or for kernel first AER handling.  
> >>
> >> This provides complete support?  I'm really confused, since this
> >> relies on dev->rcec, which is never set.  And I don't see anything
> >> about hooks for RCEC drivers.
> >>  
> >>> For Root Complex integrated End Points (RCiEPs) there is no root
> >>> port to
> >>> discover and hence we cannot walk the bus from the root port to do
> >>> appropriate resets.
> >>>
> >>> The PCI specification provides Root Complex Event Collectors to
> >>> deal with
> >>> this circumstance.  These are peer RCiEPs that provide (amongst
> >>> other
> >>> things) collection + interrupt facilities for AER reporting for a
> >>> set of
> >>> RCiEPs in the same root complex.
> >>>
> >>> In the case of a Hardware Reduced ACPI platform, the AER errors are
> >>> reported via a GHESv2 path using CPER records as defined in the
> >>> UEFI
> >>> specification.  These are intended to provide complete information
> >>> and
> >>> appropriate hand shake in a fashion that does not require a
> >>> specific form
> >>> of error reporting hardware.  This is contrast to AER handling via
> >>> the
> >>> various HEST entries for PCI Root Port and PCI Device etc where we
> >>> do
> >>> require direct access to the RCEC.  
> >>
> >> Can you include pointers to relevant spec sections for these
> >> differences between hardware-reduced and other platforms?
> >>
> >> This patch doesn't seem to depend on anything about ACPI, APEI,
> >> firmware-first, or hardware-reduced platforms.
> >>  
> >>> As such my interpretation of the spec is that a Reduced Hardware
> >>> ACPI
> >>> platform should not access the RCEC from the OS at all during AER
> >>> handling,
> >>> and in fact is welcome to use non standard hardware interfaces to
> >>> provide
> >>> the equivalent functionality in any fashion it wishes (as all
> >>> hidden beind
> >>> the firmware).  
> > 
> > 
> > I'm not sure what you mean by Hardware Reduced ACPI platform, but you
> > seem to be implying that in this case your hardware lacks RCECs and so
> > are using firmware specific handling.
> > 
> > In 1.3.2.3 (Root Complex Integrated Endpoint rules)
> > 
> > If an RCiEP is associated with an optional Root Complex Event Collector
> > it must signal PME and error conditions through a Root Complex Event
> > Collector.
> > 
> > If the RCEC is not supported/present then the expectation prior to PCIe
> > 5 is that the RCiEP will use the same mechanism as PCI
> > systems.  However, if the RCiEP asserts say an error signal there is no
> > Root Port and the OS has no way of knowing what interrupt the error is
> > conntected to.  Linux doesn't have support for that and this was
> > discussed prior here:
> > 
> > https://lore.kernel.org/lkml/20190709134538.GA35486@google.com/
> > 
> > In such a case, are you then implying that the _OSC method is not
> > granting control of PCIe Native Power Management Events to the OS and
> > so are falling back to your defined ACPI mechanism on your platform?
> > 
> > I'm currently working on adding support for RCECs in AER that would
> > make use of the extended capabilities for identifying the assocated
> > RCeIPs for purposes of the PME and error condition signaling.  
> 
> IIUC, we are trying to solve multiple issues here.
> 
> 1. Error detection and recovery support for RCiEPs and RCEC.
> 2. Firmware first exception for case 1.
> 3. AEPI based handling for case 1 (I think this is the case Jonathan is
> trying to handle)

I'm not sure it separates that cleanly but the flow I'm interested
in is firmware first with errors reported using APEI / GHESv2 etc.

In particular without an RCEC, as it should (I think) play no part
in that path.  One of the main aims of me bringing this forwards at
this stage is to establish whether I need to get our hardware teams
to put an RCEC in place for future hardware or not.  Right now we have
some work arounds in place as we can reroute some of these errors directly
to device interrupts.

We haven't been able to come up with a reason why we need an RCEC given
our approach to error handling.

> 
> For adding support for case 1,
> 
> 1. I think we need to first make the AER driver RCEC aware.
> 2. Once AER driver is modified to receive IRQs for RCEC error
>     events, then we can modify pcie_do_recovery() to handle
>     recovery for RCiEPs and RCEC.
> 
> I recommend adding support for basic case first and then add exceptions
> for Firmware First and AEPI based support.

Sounds good as long as progress is timely.

As both Bjorn and yourself have suggested, we can perhaps
make pci_walk_bus work when it is passed an RCiEP directly.

If that an be done, the case I care about should work with very
minimal changes.

I'll take a look at how cleanly that can be done.

> 
> > 
> > Thanks,
> > 
> > Sean
> > 
> >   
> >>
> >> A pointer to the spec you're interpreting would be helpful here, too.
> >>
> >> s/Reduced Hardware/Hardware-Reduced/ to match terminology in spec
> >> (I'm
> >> looking at ACPI v6.3, sec 4.1).  Also below in code comments.
> >>
> >> s/beind/behind/
> >>  
> >>> Hence I am making the provision of an RCEC optional.
> >>>
> >>> The aim of the rest of the code was to replicate the actions that
> >>> would
> >>> have occurred if this had been an EP below a root port. Some of
> >>> them make
> >>> absolutely no sense, but I hope this RFC can start a discussion on
> >>> what
> >>> we should be doing under these circumstances.
> >>>
> >>> It probably makes sense to pull this new block of code out to a
> >>> separate
> >>> function but for the RFC I've left it in place to keep it next to
> >>> the
> >>> existing path.  
> >>
> >> OK, my comment is: I really hope we don't need a separate path.  If
> >> we
> >> need a test or two for RCiEPs, that's fine.  But two paths sounds
> >> like
> >> a nightmare to maintain.
> >>  
> >>> It appears that the current kernel first code does not support
> >>> detecting
> >>> the multiple error bits being set in the root port error status
> >>> register.
> >>> This seems like a limitation both the normal EP / Root Port case
> >>> and
> >>> for RCiEPs.  
> >>
> >> Is this paragraph supposed to be a bug report?  It doesn't seem to
> >> say
> >> anything about what *this* patch does.  Maybe this should be part of
> >> the commit log for a separate patch?
> >>  
> >>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>> ---
> >>>   drivers/pci/pcie/err.c | 61
> >>> ++++++++++++++++++++++++++++++++++++++++++
> >>>   include/linux/pci.h    |  1 +
> >>>   2 files changed, 62 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> >>> index 14bb8f54723e..d34be4483f73 100644
> >>> --- a/drivers/pci/pcie/err.c
> >>> +++ b/drivers/pci/pcie/err.c
> >>> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct
> >>> pci_dev *dev,
> >>>   	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> >>>   	struct pci_bus *bus;  
> I am curious what bus (dev->subordinate) does RCEC and RCiEP belongs to 
> ?

I'm not quite sure what you are asking so...

They are effectively the same as root ports, and so sit on a bus specified
via ACPI. In our case IORT. The root complex includes a bunch of bus
numbers on which these devices can be discovered.
0, 74, 78, 7a, 7b, 7c, 80, b4, ba, bb, b8, bc on the 2 socket machine I have
to hand.  Those buses have a mix of root ports, and RCiEPs on them.

As to subordinate (i.e. where they bridge to) I don't think it would have
any meaning.   For reference I checked one of our RCiEPs and it's set
to 0 on the config space.


> Does all RCiEPs are in same bus ?

No, though for any given RCiEP the associated RCEC would have to be
on the same bus. So in the above example we would need quite a few
of them.

> >>>   
> >>> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {  
> Also, instead of creating a new path for RCiEPs, I recommend fixing
> the pci_walk_bus() part. That will reduce code duplication.
> >>> +		struct pci_dev *rcec = dev->rcec;
> >>> +		/* Not clear this makes any sense - we can't reset link
> >>> anyway...*/
> >>> +		if (state == pci_channel_io_frozen) {
> >>> +			report_frozen_detected(dev, &status);
> >>> +			pci_err(dev, "io is frozen and cannot reset
> >>> link\n");
> >>> +			goto failed;
> >>> +		} else {
> >>> +			report_normal_detected(dev, &status);
> >>> +		}  
> >>
> >> I don't understand where you're going with this.  I think you're
> >> adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
> >> there's no link leading to them, but we should still be able to reset
> >> the RCiEP (not the RCEC) via FLR, if it supports that.
> >>
> >> And all the driver callbacks should be for the RCiEP, not the RCEC,
> >> shouldn't they?  I really hope we can avoid duplicating this whole
> >> path.  It will be hard to keep the two paths in sync.
> >>  
> >>> +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> >>> +			status = PCI_ERS_RESULT_RECOVERED;
> >>> +			pci_dbg(dev, "broadcast mmio_enabled
> >>> message\n");
> >>> +			report_mmio_enabled(dev, &status);
> >>> +		}
> >>> +
> >>> +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> >>> +			/* No actual slot reset possible */
> >>> +			status = PCI_ERS_RESULT_RECOVERED;
> >>> +			pci_dbg(dev, "broadcast slot_reset message\n");
> >>> +			report_slot_reset(dev, &status);
> >>> +		}
> >>> +
> >>> +		if (status != PCI_ERS_RESULT_RECOVERED)
> >>> +			goto failed;
> >>> +
> >>> +		report_resume(dev, &status);
> >>> +
> >>> +		/*
> >>> +		 * These two should be called on the RCEC  - but in
> >>> case
> >>> +		 * of firmware first they should be no-ops. Given that
> >>> +		 * in a reduced hardware ACPI system, it is possible
> >>> there
> >>> +		 * is no standard compliant RCEC at all.
> >>> +		 *
> >>> +		 * Add some sort of check on what type of HEST entries
> >>> we have?
> >>> +		 */
> >>> +		if (rcec) {
> >>> +			/*
> >>> +			 * Unlike the upstream port case for an EP, we
> >>> have not
> >>> +			 * issued a reset on all device the RCEC
> >>> handles, so
> >>> +			 * perhaps we should be more careful about
> >>> resetting
> >>> +			 * the status registers on the RCEC?
> >>> +			 *
> >>> +			 * In particular we may need provide a means to
> >>> handle
> >>> +			 * the multiple error bits being set in
> >>> PCI_ERR_ROOT_STATUS
> >>> +			 */
> >>> +			pci_aer_clear_device_status(rcec);
> >>> +			pci_aer_clear_nonfatal_status(rcec);
> >>> +			/*
> >>> +			 * Non RCiEP case uses the downstream port
> >>> above the device
> >>> +			 * for this message.
> >>> +			 */
> >>> +			pci_info(rcec, "device recovery successful\n");
> >>> +		} else {
> >>> +			pci_info(dev, "device recovery successful\n");
> >>> +		}
> >>> +
> >>> +		return status;
> >>> +	}
> >>> +
> >>>   	/*
> >>>   	 * Error recovery runs on all subordinates of the first
> >>> downstream port.
> >>>   	 * If the downstream port detected the error, it is cleared at
> >>> the end.
> >>> diff --git a/include/linux/pci.h b/include/linux/pci.h
> >>> index 83ce1cdf5676..cb21dfe05f8c 100644
> >>> --- a/include/linux/pci.h
> >>> +++ b/include/linux/pci.h
> >>> @@ -298,6 +298,7 @@ struct pci_dev {
> >>>   	struct list_head bus_list;	/* Node in per-bus list */
> >>>   	struct pci_bus	*bus;		/* Bus this device is on */
> >>>   	struct pci_bus	*subordinate;	/* Bus this device bridges
> >>> to */
> >>> +	struct pci_dev	*rcec;		/* Root Complex Event
> >>> Collector used */  
> >>
> >> Nothing ever sets this, so I guess the critical connection between
> >> RCiEP and RCEC is missing?  Each patch needs to make sense on its
> >> own,
> >> so the patch that adds this struct member should also add something
> >> that sets it and uses it.
> >>  
> >>>   	void		*sysdata;	/* Hook for sys-specific
> >>> extension */
> >>>   	struct proc_dir_entry *procent;	/* Device entry in
> >>> /proc/bus/pci */
> >>> -- 
> >>> 2.19.1
> >>>  
> >   
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-17 17:36 ` Sean V Kelley
@ 2020-06-17 18:25   ` Kuppuswamy, Sathyanarayanan
  2020-06-18  8:48     ` Jonathan Cameron
  0 siblings, 1 reply; 19+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-06-17 18:25 UTC (permalink / raw)
  To: sean.v.kelley, Bjorn Helgaas, Jonathan Cameron
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm, Lorenzo Pieralisi



On 6/17/20 10:36 AM, Sean V Kelley wrote:
> On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote:
>> Bcc:
>> Subject: Re: [PATCH 2/2] PCI/AER: Add partial initial support for
>> RCiEPs
>>   using RCEC or firmware first
>> Reply-To:
>> In-Reply-To: <20200521173134.2456773-3-Jonathan.Cameron@huawei.com>
>>
>> [+cc Sathy, Sean]
>>
>> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:
>>> Note this provides complete support for our usecase on an ARM
>>> server using
>>> Hardware Reduced ACPI and adds appropriate place for an RCEC driver
>>> to hook
>>> if someone else cares to write one, either for firmware first
>>> handling on
>>> non Hardware Reduced ACPI or for kernel first AER handling.
>>
>> This provides complete support?  I'm really confused, since this
>> relies on dev->rcec, which is never set.  And I don't see anything
>> about hooks for RCEC drivers.
>>
>>> For Root Complex integrated End Points (RCiEPs) there is no root
>>> port to
>>> discover and hence we cannot walk the bus from the root port to do
>>> appropriate resets.
>>>
>>> The PCI specification provides Root Complex Event Collectors to
>>> deal with
>>> this circumstance.  These are peer RCiEPs that provide (amongst
>>> other
>>> things) collection + interrupt facilities for AER reporting for a
>>> set of
>>> RCiEPs in the same root complex.
>>>
>>> In the case of a Hardware Reduced ACPI platform, the AER errors are
>>> reported via a GHESv2 path using CPER records as defined in the
>>> UEFI
>>> specification.  These are intended to provide complete information
>>> and
>>> appropriate hand shake in a fashion that does not require a
>>> specific form
>>> of error reporting hardware.  This is contrast to AER handling via
>>> the
>>> various HEST entries for PCI Root Port and PCI Device etc where we
>>> do
>>> require direct access to the RCEC.
>>
>> Can you include pointers to relevant spec sections for these
>> differences between hardware-reduced and other platforms?
>>
>> This patch doesn't seem to depend on anything about ACPI, APEI,
>> firmware-first, or hardware-reduced platforms.
>>
>>> As such my interpretation of the spec is that a Reduced Hardware
>>> ACPI
>>> platform should not access the RCEC from the OS at all during AER
>>> handling,
>>> and in fact is welcome to use non standard hardware interfaces to
>>> provide
>>> the equivalent functionality in any fashion it wishes (as all
>>> hidden beind
>>> the firmware).
> 
> 
> I'm not sure what you mean by Hardware Reduced ACPI platform, but you
> seem to be implying that in this case your hardware lacks RCECs and so
> are using firmware specific handling.
> 
> In 1.3.2.3 (Root Complex Integrated Endpoint rules)
> 
> If an RCiEP is associated with an optional Root Complex Event Collector
> it must signal PME and error conditions through a Root Complex Event
> Collector.
> 
> If the RCEC is not supported/present then the expectation prior to PCIe
> 5 is that the RCiEP will use the same mechanism as PCI
> systems.  However, if the RCiEP asserts say an error signal there is no
> Root Port and the OS has no way of knowing what interrupt the error is
> conntected to.  Linux doesn't have support for that and this was
> discussed prior here:
> 
> https://lore.kernel.org/lkml/20190709134538.GA35486@google.com/
> 
> In such a case, are you then implying that the _OSC method is not
> granting control of PCIe Native Power Management Events to the OS and
> so are falling back to your defined ACPI mechanism on your platform?
> 
> I'm currently working on adding support for RCECs in AER that would
> make use of the extended capabilities for identifying the assocated
> RCeIPs for purposes of the PME and error condition signaling.

IIUC, we are trying to solve multiple issues here.

1. Error detection and recovery support for RCiEPs and RCEC.
2. Firmware first exception for case 1.
3. AEPI based handling for case 1 (I think this is the case Jonathan is
trying to handle)

For adding support for case 1,

1. I think we need to first make the AER driver RCEC aware.
2. Once AER driver is modified to receive IRQs for RCEC error
    events, then we can modify pcie_do_recovery() to handle
    recovery for RCiEPs and RCEC.

I recommend adding support for basic case first and then add exceptions
for Firmware First and AEPI based support.

> 
> Thanks,
> 
> Sean
> 
> 
>>
>> A pointer to the spec you're interpreting would be helpful here, too.
>>
>> s/Reduced Hardware/Hardware-Reduced/ to match terminology in spec
>> (I'm
>> looking at ACPI v6.3, sec 4.1).  Also below in code comments.
>>
>> s/beind/behind/
>>
>>> Hence I am making the provision of an RCEC optional.
>>>
>>> The aim of the rest of the code was to replicate the actions that
>>> would
>>> have occurred if this had been an EP below a root port. Some of
>>> them make
>>> absolutely no sense, but I hope this RFC can start a discussion on
>>> what
>>> we should be doing under these circumstances.
>>>
>>> It probably makes sense to pull this new block of code out to a
>>> separate
>>> function but for the RFC I've left it in place to keep it next to
>>> the
>>> existing path.
>>
>> OK, my comment is: I really hope we don't need a separate path.  If
>> we
>> need a test or two for RCiEPs, that's fine.  But two paths sounds
>> like
>> a nightmare to maintain.
>>
>>> It appears that the current kernel first code does not support
>>> detecting
>>> the multiple error bits being set in the root port error status
>>> register.
>>> This seems like a limitation both the normal EP / Root Port case
>>> and
>>> for RCiEPs.
>>
>> Is this paragraph supposed to be a bug report?  It doesn't seem to
>> say
>> anything about what *this* patch does.  Maybe this should be part of
>> the commit log for a separate patch?
>>
>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>> ---
>>>   drivers/pci/pcie/err.c | 61
>>> ++++++++++++++++++++++++++++++++++++++++++
>>>   include/linux/pci.h    |  1 +
>>>   2 files changed, 62 insertions(+)
>>>
>>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>>> index 14bb8f54723e..d34be4483f73 100644
>>> --- a/drivers/pci/pcie/err.c
>>> +++ b/drivers/pci/pcie/err.c
>>> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct
>>> pci_dev *dev,
>>>   	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
>>>   	struct pci_bus *bus;
I am curious what bus (dev->subordinate) does RCEC and RCiEP belongs to 
? Does all RCiEPs are in same bus ?
>>>   
>>> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
Also, instead of creating a new path for RCiEPs, I recommend fixing
the pci_walk_bus() part. That will reduce code duplication.
>>> +		struct pci_dev *rcec = dev->rcec;
>>> +		/* Not clear this makes any sense - we can't reset link
>>> anyway...*/
>>> +		if (state == pci_channel_io_frozen) {
>>> +			report_frozen_detected(dev, &status);
>>> +			pci_err(dev, "io is frozen and cannot reset
>>> link\n");
>>> +			goto failed;
>>> +		} else {
>>> +			report_normal_detected(dev, &status);
>>> +		}
>>
>> I don't understand where you're going with this.  I think you're
>> adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
>> there's no link leading to them, but we should still be able to reset
>> the RCiEP (not the RCEC) via FLR, if it supports that.
>>
>> And all the driver callbacks should be for the RCiEP, not the RCEC,
>> shouldn't they?  I really hope we can avoid duplicating this whole
>> path.  It will be hard to keep the two paths in sync.
>>
>>> +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>>> +			status = PCI_ERS_RESULT_RECOVERED;
>>> +			pci_dbg(dev, "broadcast mmio_enabled
>>> message\n");
>>> +			report_mmio_enabled(dev, &status);
>>> +		}
>>> +
>>> +		if (status == PCI_ERS_RESULT_NEED_RESET) {
>>> +			/* No actual slot reset possible */
>>> +			status = PCI_ERS_RESULT_RECOVERED;
>>> +			pci_dbg(dev, "broadcast slot_reset message\n");
>>> +			report_slot_reset(dev, &status);
>>> +		}
>>> +
>>> +		if (status != PCI_ERS_RESULT_RECOVERED)
>>> +			goto failed;
>>> +
>>> +		report_resume(dev, &status);
>>> +
>>> +		/*
>>> +		 * These two should be called on the RCEC  - but in
>>> case
>>> +		 * of firmware first they should be no-ops. Given that
>>> +		 * in a reduced hardware ACPI system, it is possible
>>> there
>>> +		 * is no standard compliant RCEC at all.
>>> +		 *
>>> +		 * Add some sort of check on what type of HEST entries
>>> we have?
>>> +		 */
>>> +		if (rcec) {
>>> +			/*
>>> +			 * Unlike the upstream port case for an EP, we
>>> have not
>>> +			 * issued a reset on all device the RCEC
>>> handles, so
>>> +			 * perhaps we should be more careful about
>>> resetting
>>> +			 * the status registers on the RCEC?
>>> +			 *
>>> +			 * In particular we may need provide a means to
>>> handle
>>> +			 * the multiple error bits being set in
>>> PCI_ERR_ROOT_STATUS
>>> +			 */
>>> +			pci_aer_clear_device_status(rcec);
>>> +			pci_aer_clear_nonfatal_status(rcec);
>>> +			/*
>>> +			 * Non RCiEP case uses the downstream port
>>> above the device
>>> +			 * for this message.
>>> +			 */
>>> +			pci_info(rcec, "device recovery successful\n");
>>> +		} else {
>>> +			pci_info(dev, "device recovery successful\n");
>>> +		}
>>> +
>>> +		return status;
>>> +	}
>>> +
>>>   	/*
>>>   	 * Error recovery runs on all subordinates of the first
>>> downstream port.
>>>   	 * If the downstream port detected the error, it is cleared at
>>> the end.
>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>> index 83ce1cdf5676..cb21dfe05f8c 100644
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -298,6 +298,7 @@ struct pci_dev {
>>>   	struct list_head bus_list;	/* Node in per-bus list */
>>>   	struct pci_bus	*bus;		/* Bus this device is on */
>>>   	struct pci_bus	*subordinate;	/* Bus this device bridges
>>> to */
>>> +	struct pci_dev	*rcec;		/* Root Complex Event
>>> Collector used */
>>
>> Nothing ever sets this, so I guess the critical connection between
>> RCiEP and RCEC is missing?  Each patch needs to make sense on its
>> own,
>> so the patch that adds this struct member should also add something
>> that sets it and uses it.
>>
>>>   	void		*sysdata;	/* Hook for sys-specific
>>> extension */
>>>   	struct proc_dir_entry *procent;	/* Device entry in
>>> /proc/bus/pci */
>>> -- 
>>> 2.19.1
>>>
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-16 19:24 [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first Bjorn Helgaas
  2020-06-17 11:40 ` Jonathan Cameron
@ 2020-06-17 17:36 ` Sean V Kelley
  2020-06-17 18:25   ` Kuppuswamy, Sathyanarayanan
  1 sibling, 1 reply; 19+ messages in thread
From: Sean V Kelley @ 2020-06-17 17:36 UTC (permalink / raw)
  To: Bjorn Helgaas, Jonathan Cameron
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan

On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote:
> Bcc: 
> Subject: Re: [PATCH 2/2] PCI/AER: Add partial initial support for
> RCiEPs
>  using RCEC or firmware first
> Reply-To: 
> In-Reply-To: <20200521173134.2456773-3-Jonathan.Cameron@huawei.com>
> 
> [+cc Sathy, Sean]
> 
> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:
> > Note this provides complete support for our usecase on an ARM
> > server using
> > Hardware Reduced ACPI and adds appropriate place for an RCEC driver
> > to hook
> > if someone else cares to write one, either for firmware first
> > handling on
> > non Hardware Reduced ACPI or for kernel first AER handling.
> 
> This provides complete support?  I'm really confused, since this
> relies on dev->rcec, which is never set.  And I don't see anything
> about hooks for RCEC drivers.
> 
> > For Root Complex integrated End Points (RCiEPs) there is no root
> > port to
> > discover and hence we cannot walk the bus from the root port to do
> > appropriate resets.
> > 
> > The PCI specification provides Root Complex Event Collectors to
> > deal with
> > this circumstance.  These are peer RCiEPs that provide (amongst
> > other
> > things) collection + interrupt facilities for AER reporting for a
> > set of
> > RCiEPs in the same root complex.
> > 
> > In the case of a Hardware Reduced ACPI platform, the AER errors are
> > reported via a GHESv2 path using CPER records as defined in the
> > UEFI
> > specification.  These are intended to provide complete information
> > and
> > appropriate hand shake in a fashion that does not require a
> > specific form
> > of error reporting hardware.  This is contrast to AER handling via
> > the
> > various HEST entries for PCI Root Port and PCI Device etc where we
> > do
> > require direct access to the RCEC.
> 
> Can you include pointers to relevant spec sections for these
> differences between hardware-reduced and other platforms?
> 
> This patch doesn't seem to depend on anything about ACPI, APEI,
> firmware-first, or hardware-reduced platforms.
> 
> > As such my interpretation of the spec is that a Reduced Hardware
> > ACPI
> > platform should not access the RCEC from the OS at all during AER
> > handling,
> > and in fact is welcome to use non standard hardware interfaces to
> > provide
> > the equivalent functionality in any fashion it wishes (as all
> > hidden beind
> > the firmware).


I'm not sure what you mean by Hardware Reduced ACPI platform, but you
seem to be implying that in this case your hardware lacks RCECs and so
are using firmware specific handling.

In 1.3.2.3 (Root Complex Integrated Endpoint rules)

If an RCiEP is associated with an optional Root Complex Event Collector
it must signal PME and error conditions through a Root Complex Event
Collector.

If the RCEC is not supported/present then the expectation prior to PCIe
5 is that the RCiEP will use the same mechanism as PCI
systems.  However, if the RCiEP asserts say an error signal there is no
Root Port and the OS has no way of knowing what interrupt the error is
conntected to.  Linux doesn't have support for that and this was
discussed prior here:

https://lore.kernel.org/lkml/20190709134538.GA35486@google.com/

In such a case, are you then implying that the _OSC method is not
granting control of PCIe Native Power Management Events to the OS and
so are falling back to your defined ACPI mechanism on your platform?

I'm currently working on adding support for RCECs in AER that would
make use of the extended capabilities for identifying the assocated
RCeIPs for purposes of the PME and error condition signaling.

Thanks,

Sean


> 
> A pointer to the spec you're interpreting would be helpful here, too.
> 
> s/Reduced Hardware/Hardware-Reduced/ to match terminology in spec
> (I'm
> looking at ACPI v6.3, sec 4.1).  Also below in code comments.
> 
> s/beind/behind/
> 
> > Hence I am making the provision of an RCEC optional.
> > 
> > The aim of the rest of the code was to replicate the actions that
> > would
> > have occurred if this had been an EP below a root port. Some of
> > them make
> > absolutely no sense, but I hope this RFC can start a discussion on
> > what
> > we should be doing under these circumstances.
> > 
> > It probably makes sense to pull this new block of code out to a
> > separate
> > function but for the RFC I've left it in place to keep it next to
> > the
> > existing path.
> 
> OK, my comment is: I really hope we don't need a separate path.  If
> we
> need a test or two for RCiEPs, that's fine.  But two paths sounds
> like
> a nightmare to maintain.
> 
> > It appears that the current kernel first code does not support
> > detecting
> > the multiple error bits being set in the root port error status
> > register.
> > This seems like a limitation both the normal EP / Root Port case
> > and
> > for RCiEPs.
> 
> Is this paragraph supposed to be a bug report?  It doesn't seem to
> say
> anything about what *this* patch does.  Maybe this should be part of
> the commit log for a separate patch?
> 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  drivers/pci/pcie/err.c | 61
> > ++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/pci.h    |  1 +
> >  2 files changed, 62 insertions(+)
> > 
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > index 14bb8f54723e..d34be4483f73 100644
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct
> > pci_dev *dev,
> >  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> >  	struct pci_bus *bus;
> >  
> > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> > +		struct pci_dev *rcec = dev->rcec;
> > +		/* Not clear this makes any sense - we can't reset link
> > anyway...*/
> > +		if (state == pci_channel_io_frozen) {
> > +			report_frozen_detected(dev, &status);
> > +			pci_err(dev, "io is frozen and cannot reset
> > link\n");
> > +			goto failed;
> > +		} else {
> > +			report_normal_detected(dev, &status);
> > +		}
> 
> I don't understand where you're going with this.  I think you're
> adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
> there's no link leading to them, but we should still be able to reset
> the RCiEP (not the RCEC) via FLR, if it supports that.
> 
> And all the driver callbacks should be for the RCiEP, not the RCEC,
> shouldn't they?  I really hope we can avoid duplicating this whole
> path.  It will be hard to keep the two paths in sync.
> 
> > +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> > +			status = PCI_ERS_RESULT_RECOVERED;
> > +			pci_dbg(dev, "broadcast mmio_enabled
> > message\n");
> > +			report_mmio_enabled(dev, &status);
> > +		}
> > +
> > +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> > +			/* No actual slot reset possible */
> > +			status = PCI_ERS_RESULT_RECOVERED;
> > +			pci_dbg(dev, "broadcast slot_reset message\n");
> > +			report_slot_reset(dev, &status);
> > +		}
> > +
> > +		if (status != PCI_ERS_RESULT_RECOVERED)
> > +			goto failed;
> > +
> > +		report_resume(dev, &status);
> > +
> > +		/*
> > +		 * These two should be called on the RCEC  - but in
> > case
> > +		 * of firmware first they should be no-ops. Given that
> > +		 * in a reduced hardware ACPI system, it is possible
> > there
> > +		 * is no standard compliant RCEC at all.
> > +		 *
> > +		 * Add some sort of check on what type of HEST entries
> > we have?
> > +		 */
> > +		if (rcec) {
> > +			/*
> > +			 * Unlike the upstream port case for an EP, we
> > have not
> > +			 * issued a reset on all device the RCEC
> > handles, so
> > +			 * perhaps we should be more careful about
> > resetting
> > +			 * the status registers on the RCEC?
> > +			 *
> > +			 * In particular we may need provide a means to
> > handle
> > +			 * the multiple error bits being set in
> > PCI_ERR_ROOT_STATUS
> > +			 */
> > +			pci_aer_clear_device_status(rcec);
> > +			pci_aer_clear_nonfatal_status(rcec);
> > +			/*
> > +			 * Non RCiEP case uses the downstream port
> > above the device
> > +			 * for this message.
> > +			 */
> > +			pci_info(rcec, "device recovery successful\n");
> > +		} else {
> > +			pci_info(dev, "device recovery successful\n");
> > +		}
> > +
> > +		return status;
> > +	}
> > +
> >  	/*
> >  	 * Error recovery runs on all subordinates of the first
> > downstream port.
> >  	 * If the downstream port detected the error, it is cleared at
> > the end.
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 83ce1cdf5676..cb21dfe05f8c 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -298,6 +298,7 @@ struct pci_dev {
> >  	struct list_head bus_list;	/* Node in per-bus list */
> >  	struct pci_bus	*bus;		/* Bus this device is on */
> >  	struct pci_bus	*subordinate;	/* Bus this device bridges
> > to */
> > +	struct pci_dev	*rcec;		/* Root Complex Event
> > Collector used */
> 
> Nothing ever sets this, so I guess the critical connection between
> RCiEP and RCEC is missing?  Each patch needs to make sense on its
> own,
> so the patch that adds this struct member should also add something
> that sets it and uses it.
> 
> >  	void		*sysdata;	/* Hook for sys-specific
> > extension */
> >  	struct proc_dir_entry *procent;	/* Device entry in
> > /proc/bus/pci */
> > -- 
> > 2.19.1
> > 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-17 11:40 ` Jonathan Cameron
@ 2020-06-17 17:26   ` Bjorn Helgaas
  2020-06-18  9:32     ` Jonathan Cameron
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2020-06-17 17:26 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan, Sean V Kelley

On Wed, Jun 17, 2020 at 12:40:36PM +0100, Jonathan Cameron wrote:
> On Tue, 16 Jun 2020 14:24:41 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
> Hi Bjorn,
> 
> Thanks for looking at this.  I got a bit carried away in this description
> with trying to show that we don't need an RCEC or software support for
> one and that has made the description rather confused. Sorry about that!
> 
> The complete lack of any consistent diagram of the various options in any
> of the related specs has lead me down more than one dead end trying to work
> this out.  Looking at the specs today I'm increasingly convinced the
> question of Hardware-Reduced vs normal doesn't matter.
> 
> Let me try proposing a brief replacement description:
> 
> "Root Complex integrate End Points (RCiEPs) are not found below a
> root port.  (PCI Express Base Specification 4.0 1.3.2.3, 7.1 for
> topology) As such the error handling needs to perform actions on
> only the device, rather than walking the bus as is done for
> conventional EPs.
> 
>  In firmware-first error handling there is no need to directly
>  access the Root Complex Event Collector (RCEC) as the firmware is
>  responsible for all actions touching it.  The implementation of the
>  RCEC device may not be compliant with the PCIe Spec as the OS does
>  not access it during error handling.  (RCEC defined in PCI Express
>  Base Specification 4.0 1.3.4)
> 
>  (Can drop next bit if we drop the code) If kernel-first error
>  handling is in use, handling AER errors for RCiEPs requires access
>  to the RCEC they are associated with. Support for this is not
>  included in this RFC due to a lack of available test platform." 
> 
> > On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:
> > > Note this provides complete support for our usecase on an ARM
> > > server using Hardware Reduced ACPI and adds appropriate place
> > > for an RCEC driver to hook if someone else cares to write one,
> > > either for firmware first handling on non Hardware Reduced ACPI
> > > or for kernel first AER handling.  
> > 
> > This provides complete support?  I'm really confused, since this
> > relies on dev->rcec, which is never set.  And I don't see anything
> > about hooks for RCEC drivers.
> 
> In our configuration we only support firmware first.  For that we
> don't need dev->rcec to be set.
> 
> For our case, the OS should not in any way touch the RCEC (in fact,
> as far as I can tell, it doesn't actually need to exist - and for
> some of our platforms it doesn't.  An impdef bit of hardware can do
> the same job.)
> 
> The information that could be read from the RCEC is provided in a
> CPER record via GHESv2.  Confirmation that the OS has done
> everything it needs to with the error is done via a handshake in the
> GHESv2 Error Status Block.
> 
> Hence for the particular corner case we care about this code
> provides everything necessary.  The stubs of RCEC support are there
> just to indicate how it 'might' fit with a model where the RCEC is
> needed to get information about the error etc.  I'm more than happy
> to drop them and perhaps put in a comment to put anyone needing them
> on the right track.
> 
> I put this statement around 'fully support in our case' here to
> indicate that the 'partial initial support' in the title is actually
> sufficient for some systems.
> 
> > > For Root Complex integrated End Points (RCiEPs) there is no root
> > > port to discover and hence we cannot walk the bus from the root
> > > port to do appropriate resets.
> > > 
> > > The PCI specification provides Root Complex Event Collectors to
> > > deal with this circumstance.  These are peer RCiEPs that provide
> > > (amongst other things) collection + interrupt facilities for AER
> > > reporting for a set of RCiEPs in the same root complex.
> > > 
> > > In the case of a Hardware Reduced ACPI platform, the AER errors
> > > are reported via a GHESv2 path using CPER records as defined in
> > > the UEFI specification.  These are intended to provide complete
> > > information and appropriate hand shake in a fashion that does
> > > not require a specific form of error reporting hardware.  This
> > > is contrast to AER handling via the various HEST entries for PCI
> > > Root Port and PCI Device etc where we do require direct access
> > > to the RCEC.  
> > 
> > Can you include pointers to relevant spec sections for these
> > differences between hardware-reduced and other platforms?
> 
> As mentioned above, I think I went down a dead end on this
> description.  I think the lack of need for an RCEC in firmware first
> handling is equally valid in all Firmware first cases.
> 
> I can have a go at highlighting relevant spec entries, though its
> scattered across the ACPI spec and UEFI spec.  Focusing just on the
> elements relevant to RAS handling...
> 
> The very brief version is that in Hardware Reduced ACPI all error
> information is gathered via a management processor (or firmware
> doing the same job) and presented as a descriptive record. There is
> also a generic handshake to acknowledge the error without needing
> anything hardware specific.
> 
> My confusion lay around the non Hardware-Reduced case.  I'm not
> totally clear on what happens in that path and don't have any
> hardware to look at.  So my assumption was that it used the HEST
> entries for PCIe Root Port etc to identify where to find the error.
> I now 'think' that isn't true and it uses GHES records. If anyone
> can point me to a reference for this flow that would be great.
> 
> HEST can also include GHESv2 entries as defined in ACPI 6.3, section
> 18.3.2.8 There error flow is the same for all GHESv2 error types:
> 
> "These are the steps the OS must take once detecting an error from a
> particular GHESv2 error source:
> •OSPM detects error (via interrupt/exception or polling the block status)
> •OSPM copies the error status block
> •OSPM clears the block status field of the error status block
> •OSPM acknowledges the error via Read Ack register. For example:
> —OSPM reads the Read Ack register  X
> —OSPM writes  (( X & ReadAckPreserve) | ReadAckWrite)"
> 
> Referring back to the GHES description in ACPI 6.3 18.3.2.7
> We have a Generic Error Status Block which has a bunch of
> Generic Error Data Entries (18-392). Those contain CPER
> records.
> 
> CPER record types are defined in the UEFI spec, appendix N.
> These are identified by GUID and there is one for PCIE errors.
> (table 54)  Definition of that is in N2.7.
> It includes the source, plus root port / bridge address and
> (potentially) a copy of the PCIe Advanced Error Reporting
> Extended Capability Structure
> 
> Everything you might otherwise read from the AER registers should
> be present in this record.  The basic aim being that you shouldn't
> need to read those PCIe registers directly (and may not be able
> to).
> 
> > This patch doesn't seem to depend on anything about ACPI, APEI,
> > firmware-first, or hardware-reduced platforms.
> 
> The only thing it really depends on is whether an RCEC is present.
> It is possible to have a valid platform that doesn't need one.
> The reference to firmware-first etc are about establishing that it
> is optional.
> 
> > > As such my interpretation of the spec is that a Reduced Hardware
> > > ACPI platform should not access the RCEC from the OS at all
> > > during AER handling, and in fact is welcome to use non standard
> > > hardware interfaces to provide the equivalent functionality in
> > > any fashion it wishes (as all hidden beind the firmware).  
> > 
> > A pointer to the spec you're interpreting would be helpful here,
> > too.
> 
> Same info as above.  Good info on what firmware first flow actually
> means is hard to come by.  I have docs on our flows, but can't find
> any on a typical x86 machine.

"Firmware-first" is only mentioned in the ACPI spec, IIRC.  Last I
looked I could not find any statement about what the OS should do if
the FIRMWARE_FIRST bit is set, so I don't think Linux should look at
it.

If your platform uses firmware-first, I assume Linux learns about
errors via APEI, and we shouldn't need any changes except to deal with
an RCiEP instead of the tree rooted at the device reporting an error.

We should be able to make a smart way to do this.  pci_walk_bus()
currently takes a *bus* and deals with all the devices on that bus.
But we should be able to make something similar that takes a *device*
and deals with the device and any descendents.

> > > Hence I am making the provision of an RCEC optional.
> > >
> > > The aim of the rest of the code was to replicate the actions that would
> > > have occurred if this had been an EP below a root port. Some of them make
> > > absolutely no sense, but I hope this RFC can start a discussion on what
> > > we should be doing under these circumstances.
> > > 
> > > It probably makes sense to pull this new block of code out to a separate
> > > function but for the RFC I've left it in place to keep it next to the
> > > existing path.  
> > 
> > OK, my comment is: I really hope we don't need a separate path.  If we
> > need a test or two for RCiEPs, that's fine.  But two paths sounds like
> > a nightmare to maintain.
> 
> You can't walk the bus for RCiEPs so its going to be inherently different.
> We could do it as a series of special cases though so it's obvious what
> is going on. Would you prefer that?

Yes.  An error at X affects X and the subtree below X.  If X happens
to be an RCiEP, the subtree is empty.  That doesn't seem like a huge
difference.

> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > >  drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
> > >  include/linux/pci.h    |  1 +
> > >  2 files changed, 62 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > > index 14bb8f54723e..d34be4483f73 100644
> > > --- a/drivers/pci/pcie/err.c
> > > +++ b/drivers/pci/pcie/err.c
> > > @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> > >  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> > >  	struct pci_bus *bus;
> > >  
> > > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> > > +		struct pci_dev *rcec = dev->rcec;
> > > +		/* Not clear this makes any sense - we can't reset link anyway...*/
> > > +		if (state == pci_channel_io_frozen) {
> > > +			report_frozen_detected(dev, &status);
> > > +			pci_err(dev, "io is frozen and cannot reset link\n");
> > > +			goto failed;
> > > +		} else {
> > > +			report_normal_detected(dev, &status);
> > > +		}  
> > 
> > I don't understand where you're going with this.  I think you're
> > adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
> > there's no link leading to them, but we should still be able to reset
> > the RCiEP (not the RCEC) via FLR, if it supports that.
> 
> Agreed.  This code is operating on the RCiEP not the rcec. Only the
> block below under the if (rcec) check touches that. 

Right, sorry, I misread this.

> It might help to think of this as walking a bus of one element. Hence
> we are calling directly on the RCiEP rather than the bus walks in
> the normal path.
> 
> > 
> > And all the driver callbacks should be for the RCiEP, not the RCEC,
> > shouldn't they?  I really hope we can avoid duplicating this whole
> > path.  It will be hard to keep the two paths in sync.
> 
> Yes, and they are unless I'm missing something. Except for the one
> block below, which mirrors the actions taken on the root port in the
> normal path.
> 
> > 
> > > +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> > > +			status = PCI_ERS_RESULT_RECOVERED;
> > > +			pci_dbg(dev, "broadcast mmio_enabled message\n");
> > > +			report_mmio_enabled(dev, &status);
> > > +		}
> > > +
> > > +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> > > +			/* No actual slot reset possible */
> > > +			status = PCI_ERS_RESULT_RECOVERED;
> > > +			pci_dbg(dev, "broadcast slot_reset message\n");
> > > +			report_slot_reset(dev, &status);
> > > +		}
> > > +
> > > +		if (status != PCI_ERS_RESULT_RECOVERED)
> > > +			goto failed;
> > > +
> > > +		report_resume(dev, &status);
> > > +
> > > +		/*
> > > +		 * These two should be called on the RCEC  - but in case
> > > +		 * of firmware first they should be no-ops. Given that
> > > +		 * in a reduced hardware ACPI system, it is possible there
> > > +		 * is no standard compliant RCEC at all.
> > > +		 *
> > > +		 * Add some sort of check on what type of HEST entries we have?
> > > +		 */
> > > +		if (rcec) {
> 
> This is the only bit that related to the RCEC.
> 
> > > +			/*
> > > +			 * Unlike the upstream port case for an EP, we have not
> > > +			 * issued a reset on all device the RCEC handles, so
> > > +			 * perhaps we should be more careful about resetting
> > > +			 * the status registers on the RCEC?
> > > +			 *
> > > +			 * In particular we may need provide a means to handle
> > > +			 * the multiple error bits being set in PCI_ERR_ROOT_STATUS
> > > +			 */
> > > +			pci_aer_clear_device_status(rcec);
> > > +			pci_aer_clear_nonfatal_status(rcec);
> > > +			/*
> > > +			 * Non RCiEP case uses the downstream port above the device
> > > +			 * for this message.
> > > +			 */
> > > +			pci_info(rcec, "device recovery successful\n");
> > > +		} else {
> > > +			pci_info(dev, "device recovery successful\n");
> > > +		}
> > > +
> > > +		return status;
> > > +	}
> > > +
> > >  	/*
> > >  	 * Error recovery runs on all subordinates of the first downstream port.
> > >  	 * If the downstream port detected the error, it is cleared at the end.
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 83ce1cdf5676..cb21dfe05f8c 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -298,6 +298,7 @@ struct pci_dev {
> > >  	struct list_head bus_list;	/* Node in per-bus list */
> > >  	struct pci_bus	*bus;		/* Bus this device is on */
> > >  	struct pci_bus	*subordinate;	/* Bus this device bridges to */
> > > +	struct pci_dev	*rcec;		/* Root Complex Event Collector used */ 
> >
> > Nothing ever sets this, so I guess the critical connection between
> > RCiEP and RCEC is missing?  Each patch needs to make sense on its own,
> > so the patch that adds this struct member should also add something
> > that sets it and uses it.
> 
> I'm happy to drop this. It's here only to try to make the point that the
> infra-structure would be needed in the non Firmware-First case.
> 
> Intent was to illustrate that what I was defining for firmware first
> would also work for kernel-first flows assuming someone actually put in place
> infrastructure to hook up the RCEC here.
> 
> I was rather hoping someone would jump up and say 'I've got one of those!'.

I think they're coming.  But I don't think they're relevant for the
firmware-first situation you're trying to solve.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
  2020-06-16 19:24 [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first Bjorn Helgaas
@ 2020-06-17 11:40 ` Jonathan Cameron
  2020-06-17 17:26   ` Bjorn Helgaas
  2020-06-17 17:36 ` Sean V Kelley
  1 sibling, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2020-06-17 11:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan, Sean V Kelley

On Tue, 16 Jun 2020 14:24:41 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> Bcc: 
> Subject: Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs
>  using RCEC or firmware first
> Reply-To: 
> In-Reply-To: <20200521173134.2456773-3-Jonathan.Cameron@huawei.com>
> 
> [+cc Sathy, Sean]
> 
Hi Bjorn,

Thanks for looking at this.  I got a bit carried away in this description
with trying to show that we don't need an RCEC or software support for
one and that has made the description rather confused. Sorry about that!

The complete lack of any consistent diagram of the various options in any
of the related specs has lead me down more than one dead end trying to work
this out.  Looking at the specs today I'm increasingly convinced the
question of Hardware-Reduced vs normal doesn't matter.


Let me try proposing a brief replacement description:

"Root Complex integrate End Points (RCiEPs) are not found below a root port.
 (PCI Express Base Specification 4.0 1.3.2.3, 7.1 for topology)
 As such the error handling needs to perform actions on only the device, rather
 than walking the bus as is done for conventional EPs.

 In firmware-first error handling there is no need to directly access the
 Root Complex Event Collector (RCEC) as the firmware is responsible for all
 actions touching it.  The implementation of the RCEC device may not be compliant
 with the PCIe Spec as the OS does not access it during error handling.
 (RCEC defined in PCI Express Base Specification 4.0 1.3.4)

 (Can drop next bit if we drop the code)
 If kernel-first error handling is in use, handling AER errors for RCiEPs
 requires access to the RCEC they are associated with. Support for this
 is not included in this RFC due to a lack of available test platform." 

> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:
> > Note this provides complete support for our usecase on an ARM server using
> > Hardware Reduced ACPI and adds appropriate place for an RCEC driver to hook
> > if someone else cares to write one, either for firmware first handling on
> > non Hardware Reduced ACPI or for kernel first AER handling.  
> 
> This provides complete support?  I'm really confused, since this
> relies on dev->rcec, which is never set.  And I don't see anything
> about hooks for RCEC drivers.

In our configuration we only support firmware first.
For that we don't need dev->rcec to be set.

For our case, the OS should not in any way touch the RCEC (in fact, as far as I can
tell, it doesn't actually need to exist - and for some of our platforms it doesn't.
An impdef bit of hardware can do the same job.)

The information that could be read from the RCEC is provided
in a CPER record via GHESv2.  Confirmation that the OS has done everything
it needs to with the error is done via a handshake in the GHESv2 Error Status
Block.

Hence for the particular corner case we care about this code provides everything
necessary.  The stubs of RCEC support are there just to indicate how it
'might' fit with a model where the RCEC is needed to get information about
the error etc.  I'm more than happy to drop them and perhaps put in a comment
to put anyone needing them on the right track.

I put this statement around 'fully support in our case' here to indicate
that the 'partial initial support' in the title is actually sufficient
for some systems.

> 
> > For Root Complex integrated End Points (RCiEPs) there is no root port to
> > discover and hence we cannot walk the bus from the root port to do
> > appropriate resets.
> > 
> > The PCI specification provides Root Complex Event Collectors to deal with
> > this circumstance.  These are peer RCiEPs that provide (amongst other
> > things) collection + interrupt facilities for AER reporting for a set of
> > RCiEPs in the same root complex.
> > 
> > In the case of a Hardware Reduced ACPI platform, the AER errors are
> > reported via a GHESv2 path using CPER records as defined in the UEFI
> > specification.  These are intended to provide complete information and
> > appropriate hand shake in a fashion that does not require a specific form
> > of error reporting hardware.  This is contrast to AER handling via the
> > various HEST entries for PCI Root Port and PCI Device etc where we do
> > require direct access to the RCEC.  
> 
> Can you include pointers to relevant spec sections for these
> differences between hardware-reduced and other platforms?

As mentioned above, I think I went down a dead end on this description.
I think the lack of need for an RCEC in firmware first handling is equally
valid in all Firmware first cases.

I can have a go at highlighting relevant spec entries, though its scattered
across the ACPI spec and UEFI spec.  Focusing just on the elements relevant
to RAS handling...

The very brief version is that in Hardware Reduced ACPI all error
information is gathered via a management processor (or firmware doing
the same job) and presented as a descriptive record. There is also
a generic handshake to acknowledge the error without needing anything
hardware specific.

My confusion lay around the non Hardware-Reduced case.  I'm not totally clear
on what happens in that path and don't have any hardware to look at.
So my assumption was that it used the HEST entries for PCIe Root Port
etc to identify where to find the error.  I now 'think' that isn't true
and it uses GHES records. If anyone can point me to a reference for this
flow that would be great.

HEST can also include GHESv2 entries as defined in ACPI 6.3, section 18.3.2.8
There error flow is the same for all GHESv2 error types:

"These are the steps the OS must take once detecting an error from a particular
GHESv2 error source:
•OSPM detects error (via interrupt/exception or polling the block status)
•OSPM copies the error status block
•OSPM clears the block status field of the error status block
•OSPM acknowledges the error via Read Ack register. For example:
—OSPM reads the Read Ack register  X
—OSPM writes  (( X & ReadAckPreserve) | ReadAckWrite)"

Referring back to the GHES description in ACPI 6.3 18.3.2.7
We have a Generic Error Status Block which has a bunch of
Generic Error Data Entries (18-392). Those contain CPER
records.

CPER record types are defined in the UEFI spec, appendix N.
These are identified by GUID and there is one for PCIE errors.
(table 54)  Definition of that is in N2.7.
It includes the source, plus root port / bridge address and
(potentially) a copy of the PCIe Advanced Error Reporting
Extended Capability Structure

Everything you might otherwise read from the AER registers should
be present in this record.  The basic aim being that you shouldn't
need to read those PCIe registers directly (and may not be able
to).



> 
> This patch doesn't seem to depend on anything about ACPI, APEI,
> firmware-first, or hardware-reduced platforms.

The only thing it really depends on is whether an RCEC is present.
It is possible to have a valid platform that doesn't need one.
The reference to firmware-first etc are about establishing that it
is optional.

> 
> > As such my interpretation of the spec is that a Reduced Hardware ACPI
> > platform should not access the RCEC from the OS at all during AER handling,
> > and in fact is welcome to use non standard hardware interfaces to provide
> > the equivalent functionality in any fashion it wishes (as all hidden beind
> > the firmware).  
> 
> A pointer to the spec you're interpreting would be helpful here, too.

Same info as above.  Good info on what firmware first flow actually means
is hard to come by.  I have docs on our flows, but can't find any on
a typical x86 machine.

> 
> s/Reduced Hardware/Hardware-Reduced/ to match terminology in spec (I'm
> looking at ACPI v6.3, sec 4.1).  Also below in code comments.
> 
> s/beind/behind/

Will fix.

> 
> > Hence I am making the provision of an RCEC optional.
> >
> > The aim of the rest of the code was to replicate the actions that would
> > have occurred if this had been an EP below a root port. Some of them make
> > absolutely no sense, but I hope this RFC can start a discussion on what
> > we should be doing under these circumstances.
> > 
> > It probably makes sense to pull this new block of code out to a separate
> > function but for the RFC I've left it in place to keep it next to the
> > existing path.  
> 
> OK, my comment is: I really hope we don't need a separate path.  If we
> need a test or two for RCiEPs, that's fine.  But two paths sounds like
> a nightmare to maintain.

You can't walk the bus for RCiEPs so its going to be inherently different.
We could do it as a series of special cases though so it's obvious what
is going on. Would you prefer that?

> 
> > It appears that the current kernel first code does not support detecting
> > the multiple error bits being set in the root port error status register.
> > This seems like a limitation both the normal EP / Root Port case and
> > for RCiEPs.  
> 
> Is this paragraph supposed to be a bug report?  It doesn't seem to say
> anything about what *this* patch does.  Maybe this should be part of
> the commit log for a separate patch?

Side note on something we are not trying to support here for anyone who
comes across it! I shouldn't have left that in.  I'll drop it.

> 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/pci.h    |  1 +
> >  2 files changed, 62 insertions(+)
> > 
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > index 14bb8f54723e..d34be4483f73 100644
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> >  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> >  	struct pci_bus *bus;
> >  
> > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> > +		struct pci_dev *rcec = dev->rcec;
> > +		/* Not clear this makes any sense - we can't reset link anyway...*/
> > +		if (state == pci_channel_io_frozen) {
> > +			report_frozen_detected(dev, &status);
> > +			pci_err(dev, "io is frozen and cannot reset link\n");
> > +			goto failed;
> > +		} else {
> > +			report_normal_detected(dev, &status);
> > +		}  
> 
> I don't understand where you're going with this.  I think you're
> adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
> there's no link leading to them, but we should still be able to reset
> the RCiEP (not the RCEC) via FLR, if it supports that.

Agreed.  This code is operating on the RCiEP not the rcec. Only the
block below under the if (rcec) check touches that. 

It might help to think of this as walking a bus of one element. Hence
we are calling directly on the RCiEP rather than the bus walks in
the normal path.

> 
> And all the driver callbacks should be for the RCiEP, not the RCEC,
> shouldn't they?  I really hope we can avoid duplicating this whole
> path.  It will be hard to keep the two paths in sync.

Yes, and they are unless I'm missing something. Except for the one
block below, which mirrors the actions taken on the root port in the
normal path.

> 
> > +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> > +			status = PCI_ERS_RESULT_RECOVERED;
> > +			pci_dbg(dev, "broadcast mmio_enabled message\n");
> > +			report_mmio_enabled(dev, &status);
> > +		}
> > +
> > +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> > +			/* No actual slot reset possible */
> > +			status = PCI_ERS_RESULT_RECOVERED;
> > +			pci_dbg(dev, "broadcast slot_reset message\n");
> > +			report_slot_reset(dev, &status);
> > +		}
> > +
> > +		if (status != PCI_ERS_RESULT_RECOVERED)
> > +			goto failed;
> > +
> > +		report_resume(dev, &status);
> > +
> > +		/*
> > +		 * These two should be called on the RCEC  - but in case
> > +		 * of firmware first they should be no-ops. Given that
> > +		 * in a reduced hardware ACPI system, it is possible there
> > +		 * is no standard compliant RCEC at all.
> > +		 *
> > +		 * Add some sort of check on what type of HEST entries we have?
> > +		 */
> > +		if (rcec) {

This is the only bit that related to the RCEC.

> > +			/*
> > +			 * Unlike the upstream port case for an EP, we have not
> > +			 * issued a reset on all device the RCEC handles, so
> > +			 * perhaps we should be more careful about resetting
> > +			 * the status registers on the RCEC?
> > +			 *
> > +			 * In particular we may need provide a means to handle
> > +			 * the multiple error bits being set in PCI_ERR_ROOT_STATUS
> > +			 */
> > +			pci_aer_clear_device_status(rcec);
> > +			pci_aer_clear_nonfatal_status(rcec);
> > +			/*
> > +			 * Non RCiEP case uses the downstream port above the device
> > +			 * for this message.
> > +			 */
> > +			pci_info(rcec, "device recovery successful\n");
> > +		} else {
> > +			pci_info(dev, "device recovery successful\n");
> > +		}
> > +
> > +		return status;
> > +	}
> > +
> >  	/*
> >  	 * Error recovery runs on all subordinates of the first downstream port.
> >  	 * If the downstream port detected the error, it is cleared at the end.
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 83ce1cdf5676..cb21dfe05f8c 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -298,6 +298,7 @@ struct pci_dev {
> >  	struct list_head bus_list;	/* Node in per-bus list */
> >  	struct pci_bus	*bus;		/* Bus this device is on */
> >  	struct pci_bus	*subordinate;	/* Bus this device bridges to */
> > +	struct pci_dev	*rcec;		/* Root Complex Event Collector used */  
> 
> Nothing ever sets this, so I guess the critical connection between
> RCiEP and RCEC is missing?  Each patch needs to make sense on its own,
> so the patch that adds this struct member should also add something
> that sets it and uses it.

I'm happy to drop this. It's here only to try to make the point that the
infra-structure would be needed in the non Firmware-First case.

Intent was to illustrate that what I was defining for firmware first
would also work for kernel-first flows assuming someone actually put in place
infrastructure to hook up the RCEC here.

I was rather hoping someone would jump up and say 'I've got one of those!'.

Thanks again,

Jonathan


> 
> >  	void		*sysdata;	/* Hook for sys-specific extension */
> >  	struct proc_dir_entry *procent;	/* Device entry in /proc/bus/pci */
> > -- 
> > 2.19.1
> >   



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first
@ 2020-06-16 19:24 Bjorn Helgaas
  2020-06-17 11:40 ` Jonathan Cameron
  2020-06-17 17:36 ` Sean V Kelley
  0 siblings, 2 replies; 19+ messages in thread
From: Bjorn Helgaas @ 2020-06-16 19:24 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linuxarm,
	Lorenzo Pieralisi, Kuppuswamy Sathyanarayanan, Sean V Kelley

Bcc: 
Subject: Re: [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs
 using RCEC or firmware first
Reply-To: 
In-Reply-To: <20200521173134.2456773-3-Jonathan.Cameron@huawei.com>

[+cc Sathy, Sean]

On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron wrote:
> Note this provides complete support for our usecase on an ARM server using
> Hardware Reduced ACPI and adds appropriate place for an RCEC driver to hook
> if someone else cares to write one, either for firmware first handling on
> non Hardware Reduced ACPI or for kernel first AER handling.

This provides complete support?  I'm really confused, since this
relies on dev->rcec, which is never set.  And I don't see anything
about hooks for RCEC drivers.

> For Root Complex integrated End Points (RCiEPs) there is no root port to
> discover and hence we cannot walk the bus from the root port to do
> appropriate resets.
> 
> The PCI specification provides Root Complex Event Collectors to deal with
> this circumstance.  These are peer RCiEPs that provide (amongst other
> things) collection + interrupt facilities for AER reporting for a set of
> RCiEPs in the same root complex.
> 
> In the case of a Hardware Reduced ACPI platform, the AER errors are
> reported via a GHESv2 path using CPER records as defined in the UEFI
> specification.  These are intended to provide complete information and
> appropriate hand shake in a fashion that does not require a specific form
> of error reporting hardware.  This is contrast to AER handling via the
> various HEST entries for PCI Root Port and PCI Device etc where we do
> require direct access to the RCEC.

Can you include pointers to relevant spec sections for these
differences between hardware-reduced and other platforms?

This patch doesn't seem to depend on anything about ACPI, APEI,
firmware-first, or hardware-reduced platforms.

> As such my interpretation of the spec is that a Reduced Hardware ACPI
> platform should not access the RCEC from the OS at all during AER handling,
> and in fact is welcome to use non standard hardware interfaces to provide
> the equivalent functionality in any fashion it wishes (as all hidden beind
> the firmware).

A pointer to the spec you're interpreting would be helpful here, too.

s/Reduced Hardware/Hardware-Reduced/ to match terminology in spec (I'm
looking at ACPI v6.3, sec 4.1).  Also below in code comments.

s/beind/behind/

> Hence I am making the provision of an RCEC optional.
>
> The aim of the rest of the code was to replicate the actions that would
> have occurred if this had been an EP below a root port. Some of them make
> absolutely no sense, but I hope this RFC can start a discussion on what
> we should be doing under these circumstances.
> 
> It probably makes sense to pull this new block of code out to a separate
> function but for the RFC I've left it in place to keep it next to the
> existing path.

OK, my comment is: I really hope we don't need a separate path.  If we
need a test or two for RCiEPs, that's fine.  But two paths sounds like
a nightmare to maintain.

> It appears that the current kernel first code does not support detecting
> the multiple error bits being set in the root port error status register.
> This seems like a limitation both the normal EP / Root Port case and
> for RCiEPs.

Is this paragraph supposed to be a bug report?  It doesn't seem to say
anything about what *this* patch does.  Maybe this should be part of
the commit log for a separate patch?

> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/pci/pcie/err.c | 61 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h    |  1 +
>  2 files changed, 62 insertions(+)
> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 14bb8f54723e..d34be4483f73 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
>  	struct pci_bus *bus;
>  
> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> +		struct pci_dev *rcec = dev->rcec;
> +		/* Not clear this makes any sense - we can't reset link anyway...*/
> +		if (state == pci_channel_io_frozen) {
> +			report_frozen_detected(dev, &status);
> +			pci_err(dev, "io is frozen and cannot reset link\n");
> +			goto failed;
> +		} else {
> +			report_normal_detected(dev, &status);
> +		}

I don't understand where you're going with this.  I think you're
adding recovery for RCiEPs (PCI_EXP_TYPE_RC_END).  It's true that
there's no link leading to them, but we should still be able to reset
the RCiEP (not the RCEC) via FLR, if it supports that.

And all the driver callbacks should be for the RCiEP, not the RCEC,
shouldn't they?  I really hope we can avoid duplicating this whole
path.  It will be hard to keep the two paths in sync.

> +		if (status == PCI_ERS_RESULT_CAN_RECOVER) {
> +			status = PCI_ERS_RESULT_RECOVERED;
> +			pci_dbg(dev, "broadcast mmio_enabled message\n");
> +			report_mmio_enabled(dev, &status);
> +		}
> +
> +		if (status == PCI_ERS_RESULT_NEED_RESET) {
> +			/* No actual slot reset possible */
> +			status = PCI_ERS_RESULT_RECOVERED;
> +			pci_dbg(dev, "broadcast slot_reset message\n");
> +			report_slot_reset(dev, &status);
> +		}
> +
> +		if (status != PCI_ERS_RESULT_RECOVERED)
> +			goto failed;
> +
> +		report_resume(dev, &status);
> +
> +		/*
> +		 * These two should be called on the RCEC  - but in case
> +		 * of firmware first they should be no-ops. Given that
> +		 * in a reduced hardware ACPI system, it is possible there
> +		 * is no standard compliant RCEC at all.
> +		 *
> +		 * Add some sort of check on what type of HEST entries we have?
> +		 */
> +		if (rcec) {
> +			/*
> +			 * Unlike the upstream port case for an EP, we have not
> +			 * issued a reset on all device the RCEC handles, so
> +			 * perhaps we should be more careful about resetting
> +			 * the status registers on the RCEC?
> +			 *
> +			 * In particular we may need provide a means to handle
> +			 * the multiple error bits being set in PCI_ERR_ROOT_STATUS
> +			 */
> +			pci_aer_clear_device_status(rcec);
> +			pci_aer_clear_nonfatal_status(rcec);
> +			/*
> +			 * Non RCiEP case uses the downstream port above the device
> +			 * for this message.
> +			 */
> +			pci_info(rcec, "device recovery successful\n");
> +		} else {
> +			pci_info(dev, "device recovery successful\n");
> +		}
> +
> +		return status;
> +	}
> +
>  	/*
>  	 * Error recovery runs on all subordinates of the first downstream port.
>  	 * If the downstream port detected the error, it is cleared at the end.
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 83ce1cdf5676..cb21dfe05f8c 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -298,6 +298,7 @@ struct pci_dev {
>  	struct list_head bus_list;	/* Node in per-bus list */
>  	struct pci_bus	*bus;		/* Bus this device is on */
>  	struct pci_bus	*subordinate;	/* Bus this device bridges to */
> +	struct pci_dev	*rcec;		/* Root Complex Event Collector used */

Nothing ever sets this, so I guess the critical connection between
RCiEP and RCEC is missing?  Each patch needs to make sense on its own,
so the patch that adds this struct member should also add something
that sets it and uses it.

>  	void		*sysdata;	/* Hook for sys-specific extension */
>  	struct proc_dir_entry *procent;	/* Device entry in /proc/bus/pci */
> -- 
> 2.19.1
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-06-18 16:39 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-21 17:31 [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
2020-05-21 17:31 ` [PATCH 1/2] PCI/AER: Do not reset the device status if doing firmware first handling Jonathan Cameron
2020-06-16 17:47   ` Bjorn Helgaas
2020-06-16 18:00     ` Kuppuswamy, Sathyanarayanan
2020-06-17  9:31       ` Jonathan Cameron
2020-06-17 20:57         ` Kuppuswamy, Sathyanarayanan
2020-06-17  9:18     ` Jonathan Cameron
2020-05-21 17:31 ` [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first Jonathan Cameron
2020-06-16 10:47 ` [RFC PATCH 0/2] PCI/AER: handling for RCiEPs Jonathan Cameron
2020-06-16 19:24 [PATCH 2/2] PCI/AER: Add partial initial support for RCiEPs using RCEC or firmware first Bjorn Helgaas
2020-06-17 11:40 ` Jonathan Cameron
2020-06-17 17:26   ` Bjorn Helgaas
2020-06-18  9:32     ` Jonathan Cameron
2020-06-17 17:36 ` Sean V Kelley
2020-06-17 18:25   ` Kuppuswamy, Sathyanarayanan
2020-06-18  8:48     ` Jonathan Cameron
2020-06-18 16:04       ` Jonathan Cameron
2020-06-18 16:20       ` Bjorn Helgaas
2020-06-18 16:38         ` Jonathan Cameron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).