* [PATCH v3 0/2] Fully enable AER
@ 2022-01-19  9:21 Stefan Roese
  2022-01-19  9:21 ` [PATCH v3 1/2] PCI/portdrv: Don't disable AER reporting in get_port_device_capability() Stefan Roese
  2022-01-19  9:22 ` [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it Stefan Roese
  0 siblings, 2 replies; 13+ messages in thread
From: Stefan Roese @ 2022-01-19  9:21 UTC (permalink / raw)
  To: linux-pci
  Cc: Rafael J . Wysocki, Bjorn Helgaas, Pali Rohár,
	Bharat Kumar Gogada, Michal Simek, Yao Hongbo, Naveen Naidu

While working on AER support on a ZynqMP based system, which has some
PCIe Devices connected via a PCIe switch, I noticed problems with AER
enabling in the Device Control registers of all PCIe devices but the
Root Port. In fact, only the Root Port has AER enabled right now. This
patch set fixes this problem by first fixing the AER enabling in the
interconnected PCIe switches between the Root Port and the PCIe
devices and, in a 2nd patch, also enabling AER in the PCIe Endpoints.

Please note that these changes are quite invasive, as with these
patches applied, AER now will be enabled in the Device Control
registers of all available PCIe Endpoints, which currently is not the
case.

Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Pali Rohár <pali@kernel.org>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
Cc: Naveen Naidu <naveennaidu479@gmail.com>

Stefan Roese (2):
  PCI/portdrv: Don't disable AER reporting in
    get_port_device_capability()
  PCI/AER: Enable AER on all PCIe devices supporting it

 drivers/pci/pcie/aer.c          | 4 ++++
 drivers/pci/pcie/portdrv_core.c | 9 +--------
 2 files changed, 5 insertions(+), 8 deletions(-)

-- 
2.34.1



* [PATCH v3 1/2] PCI/portdrv: Don't disable AER reporting in get_port_device_capability()
  2022-01-19  9:21 [PATCH v3 0/2] Fully enable AER Stefan Roese
@ 2022-01-19  9:21 ` Stefan Roese
  2022-01-19  9:22 ` [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it Stefan Roese
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Roese @ 2022-01-19  9:21 UTC (permalink / raw)
  To: linux-pci
  Cc: Pali Rohár, Rafael J . Wysocki, Bjorn Helgaas,
	Bharat Kumar Gogada, Michal Simek, Yao Hongbo, Naveen Naidu

Testing has shown that AER reporting is currently disabled in the
DevCtl registers of all non-Root-Port PCIe devices on systems using
pcie_ports_native || host->native_aer, which practically disables AER
completely on such systems. This is due to the fact that with commit
2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port
initialization"), a call to pci_disable_pcie_error_reporting() was
added *after* the PCIe AER setup was completed for the PCIe device
tree.

Here is a longer analysis of the current status of AER enabling /
disabling upon bootup, provided by Bjorn:

  pcie_portdrv_probe
    pcie_port_device_register
      get_port_device_capability
        pci_disable_pcie_error_reporting
          clear CERE NFERE FERE URRE               # <-- disable for RP USP DSP
      pcie_device_init
        device_register                            # new AER service device
          aer_probe
            aer_enable_rootport                    # RP only
              set_downstream_devices_error_reporting
                set_device_error_reporting         # self (RP)
                  if (RP || USP || DSP)
                    pci_enable_pcie_error_reporting
                      set CERE NFERE FERE URRE     # <-- enable for RP
                pci_walk_bus
                  set_device_error_reporting
                    if (RP || USP || DSP)
                      pci_enable_pcie_error_reporting
                        set CERE NFERE FERE URRE   # <-- enable for USP DSP

In a typical Root Port -> Endpoint hierarchy, the above:
  - Disables Error Reporting for the Root Port,
  - Enables Error Reporting for the Root Port,
  - Does NOT enable Error Reporting for the Endpoint because it is not
    a Root Port or Switch Port.

In a deeper Root Port -> Upstream Switch Port -> Downstream Switch
Port -> Endpoint hierarchy:
  - Disables Error Reporting for the Root Port,
  - Enables Error Reporting for the Root Port,
  - Enables Error Reporting for both Switch Ports,
  - Does NOT enable Error Reporting for the Endpoint because it is not
    a Root Port or Switch Port,
  - Disables Error Reporting for the Switch Ports when
    pcie_portdrv_probe() claims them.  AER does not re-enable it
    because these are not Root Ports.

This patch now removes this call to pci_disable_pcie_error_reporting()
from get_port_device_capability(), leaving the already enabled AER
configuration intact. With this change, AER is enabled in the Root Port
and the PCIe switch upstream and downstream ports. Only the PCIe
Endpoints don't have AER enabled yet. A follow-up patch will take
care of this Endpoint enabling.
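
As an illustration only, the ordering described above can be replayed in
a small user-space C model. This is a sketch, not kernel code: struct
model_dev, model_boot() and aer_reporting_enabled() are hypothetical
names, and the model assumes that the ports are probed in top-down
order. The DevCtl bit values are those of the CERE, NFERE, FERE and URRE
bits named in the call tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device Control error-reporting enable bits. */
#define DEVCTL_CERE  0x0001u /* Correctable Error Reporting Enable */
#define DEVCTL_NFERE 0x0002u /* Non-Fatal Error Reporting Enable */
#define DEVCTL_FERE  0x0004u /* Fatal Error Reporting Enable */
#define DEVCTL_URRE  0x0008u /* Unsupported Request Reporting Enable */
#define DEVCTL_AER_FLAGS \
	(DEVCTL_CERE | DEVCTL_NFERE | DEVCTL_FERE | DEVCTL_URRE)

enum port_type { ROOT_PORT, UPSTREAM_PORT, DOWNSTREAM_PORT, ENDPOINT };

/* Hypothetical stand-in for struct pci_dev: only what the model needs. */
struct model_dev {
	enum port_type type;
	uint16_t devctl;
};

static bool is_port(const struct model_dev *d)
{
	return d->type != ENDPOINT;
}

/*
 * Replay the boot-time sequence on a hierarchy ordered from the Root Port
 * (index 0) down to the Endpoint. 'disable_in_probe' selects the old
 * behaviour, where the port driver clears the bits when it claims a port.
 */
static void model_boot(struct model_dev *devs, size_t n, bool disable_in_probe)
{
	assert(n > 0 && devs[0].type == ROOT_PORT);

	for (size_t i = 0; i < n; i++) {
		if (!is_port(&devs[i]))
			continue; /* the port driver only binds to ports */

		/* get_port_device_capability() */
		if (disable_in_probe)
			devs[i].devctl &= (uint16_t)~DEVCTL_AER_FLAGS;

		/* aer_probe(): Root Port only, then walk everything below it */
		if (devs[i].type == ROOT_PORT)
			for (size_t j = 0; j < n; j++)
				if (is_port(&devs[j]))
					devs[j].devctl |= DEVCTL_AER_FLAGS;
	}
}

static bool aer_reporting_enabled(const struct model_dev *d)
{
	return (d->devctl & DEVCTL_AER_FLAGS) == DEVCTL_AER_FLAGS;
}
```

For a Root Port -> Upstream Port -> Downstream Port -> Endpoint array,
model_boot(devs, 4, true) leaves only the Root Port enabled, while
model_boot(devs, 4, false) leaves all three ports enabled. The Endpoint
stays disabled in both cases, which is what the follow-up patch
addresses.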

Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
Signed-off-by: Stefan Roese <sr@denx.de>
Reviewed-by: Pali Rohár <pali@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Pali Rohár <pali@kernel.org>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
Cc: Naveen Naidu <naveennaidu479@gmail.com>
---
v3:
- Added RB tag from Pali

v2:
- Enhance commit message as suggested by Bjorn

 drivers/pci/pcie/portdrv_core.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index f81c7be4d7d8..27b990cedb4c 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -244,15 +244,8 @@ static int get_port_device_capability(struct pci_dev *dev)
 
 #ifdef CONFIG_PCIEAER
 	if (dev->aer_cap && pci_aer_available() &&
-	    (pcie_ports_native || host->native_aer)) {
+	    (pcie_ports_native || host->native_aer))
 		services |= PCIE_PORT_SERVICE_AER;
-
-		/*
-		 * Disable AER on this port in case it's been enabled by the
-		 * BIOS (the AER service driver will enable it when necessary).
-		 */
-		pci_disable_pcie_error_reporting(dev);
-	}
 #endif
 
 	/* Root Ports and Root Complex Event Collectors may generate PMEs */
-- 
2.34.1



* [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19  9:21 [PATCH v3 0/2] Fully enable AER Stefan Roese
  2022-01-19  9:21 ` [PATCH v3 1/2] PCI/portdrv: Don't disable AER reporting in get_port_device_capability() Stefan Roese
@ 2022-01-19  9:22 ` Stefan Roese
  2022-01-19 10:37   ` Pali Rohár
  2022-01-19 18:25   ` Keith Busch
  1 sibling, 2 replies; 13+ messages in thread
From: Stefan Roese @ 2022-01-19  9:22 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Pali Rohár, Bharat Kumar Gogada,
	Michal Simek, Yao Hongbo, Naveen Naidu

With this change, AER is now enabled on all PCIe devices, also when the
PCIe device is hot-plugged.

Please note that this change is quite invasive, as with this patch
applied, AER now will be enabled in the Device Control registers of all
available PCIe Endpoints, which currently is not the case.

When "pci=noaer" is selected, AER stays disabled of course.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Pali Rohár <pali@kernel.org>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
Cc: Naveen Naidu <naveennaidu479@gmail.com>
---
v3:
- New patch, replacing the "old" 2/2 patch
  Now enabling of AER for each PCIe device is done in pci_aer_init(),
  which also makes sure that AER is enabled in each PCIe device even when
  it's hot-plugged.

 drivers/pci/pcie/aer.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9fa1f97e5b27..01a25e4a5168 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -387,6 +387,10 @@ void pci_aer_init(struct pci_dev *dev)
 	pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
 
 	pci_aer_clear_status(dev);
+
+	/* Enable AER if requested */
+	if (pci_aer_available())
+		pci_enable_pcie_error_reporting(dev);
 }
 
 void pci_aer_exit(struct pci_dev *dev)
-- 
2.34.1



* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19  9:22 ` [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it Stefan Roese
@ 2022-01-19 10:37   ` Pali Rohár
  2022-01-20  7:31     ` Stefan Roese
  2022-01-19 18:25   ` Keith Busch
  1 sibling, 1 reply; 13+ messages in thread
From: Pali Rohár @ 2022-01-19 10:37 UTC (permalink / raw)
  To: Stefan Roese
  Cc: linux-pci, Bjorn Helgaas, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On Wednesday 19 January 2022 10:22:00 Stefan Roese wrote:
> With this change, AER is now enabled on all PCIe devices, also when the
> PCIe device is hot-plugged.
> 
> Please note that this change is quite invasive, as with this patch
> applied, AER now will be enabled in the Device Control registers of all
> available PCIe Endpoints, which currently is not the case.
> 
> When "pci=noaer" is selected, AER stays disabled of course.

Hello Stefan! I was thinking more about this change and I'm not sure
what happens if an AER-capable PCIe device is hotplugged into some PCIe
switch connected in a PCIe hierarchy where the Root Port is not
AER-capable (e.g. the current Linux implementation of pci-aardvark.c and
pci-mvebu.c). My feeling is that in this case AER should not be enabled,
as there is nobody who can deliver the AER interrupt to the OS. But I
really do not know what is expected from the kernel AER driver, so let's
wait for Bjorn's reply.

And since you raised this issue with hotplugging, another thing for
follow-up changes in the future is calling the pcie_set_ecrc_checking()
function to align the ECRC state of a newly hotplugged device with the
"pci=ecrc=..." cmdline option. Currently this is done only in the
function set_device_error_reporting().

> Signed-off-by: Stefan Roese <sr@denx.de>
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Pali Rohár <pali@kernel.org>
> Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> Cc: Michal Simek <michal.simek@xilinx.com>
> Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
> Cc: Naveen Naidu <naveennaidu479@gmail.com>
> ---
> v3:
> - New patch, replacing the "old" 2/2 patch
>   Now enabling of AER for each PCIe device is done in pci_aer_init(),
>   which also makes sure that AER is enabled in each PCIe device even when
>   it's hot-plugged.
> 
>  drivers/pci/pcie/aer.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 9fa1f97e5b27..01a25e4a5168 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -387,6 +387,10 @@ void pci_aer_init(struct pci_dev *dev)
>  	pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
>  
>  	pci_aer_clear_status(dev);
> +
> +	/* Enable AER if requested */
> +	if (pci_aer_available())
> +		pci_enable_pcie_error_reporting(dev);
>  }
>  
>  void pci_aer_exit(struct pci_dev *dev)
> -- 
> 2.34.1
> 


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19  9:22 ` [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it Stefan Roese
  2022-01-19 10:37   ` Pali Rohár
@ 2022-01-19 18:25   ` Keith Busch
  2022-01-19 21:00     ` Bjorn Helgaas
  1 sibling, 1 reply; 13+ messages in thread
From: Keith Busch @ 2022-01-19 18:25 UTC (permalink / raw)
  To: Stefan Roese
  Cc: linux-pci, Bjorn Helgaas, Pali Rohár, Bharat Kumar Gogada,
	Michal Simek, Yao Hongbo, Naveen Naidu

On Wed, Jan 19, 2022 at 10:22:00AM +0100, Stefan Roese wrote:
> @@ -387,6 +387,10 @@ void pci_aer_init(struct pci_dev *dev)
>  	pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
>  
>  	pci_aer_clear_status(dev);
> +
> +	/* Enable AER if requested */
> +	if (pci_aer_available())
> +		pci_enable_pcie_error_reporting(dev);
>  }

Hasn't it always been the device specific driver's responsibility to
call this function?


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19 18:25   ` Keith Busch
@ 2022-01-19 21:00     ` Bjorn Helgaas
  2022-01-19 21:18       ` Keith Busch
  0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2022-01-19 21:00 UTC (permalink / raw)
  To: Keith Busch
  Cc: Stefan Roese, linux-pci, Pali Rohár, Bharat Kumar Gogada,
	Michal Simek, Yao Hongbo, Naveen Naidu

On Wed, Jan 19, 2022 at 10:25:50AM -0800, Keith Busch wrote:
> On Wed, Jan 19, 2022 at 10:22:00AM +0100, Stefan Roese wrote:
> > @@ -387,6 +387,10 @@ void pci_aer_init(struct pci_dev *dev)
> >  	pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
> >  
> >  	pci_aer_clear_status(dev);
> > +
> > +	/* Enable AER if requested */
> > +	if (pci_aer_available())
> > +		pci_enable_pcie_error_reporting(dev);
> >  }
> 
> Hasn't it always been the device specific driver's responsibility to
> call this function?

So far it has been done by the driver, because the PCI core doesn't do
it.  But is there a reason it should be done by the driver?  It
doesn't seem necessarily device-specific.

Bjorn


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19 21:00     ` Bjorn Helgaas
@ 2022-01-19 21:18       ` Keith Busch
  2022-01-20  7:32         ` Stefan Roese
  0 siblings, 1 reply; 13+ messages in thread
From: Keith Busch @ 2022-01-19 21:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Stefan Roese, linux-pci, Pali Rohár, Bharat Kumar Gogada,
	Michal Simek, Yao Hongbo, Naveen Naidu

On Wed, Jan 19, 2022 at 03:00:02PM -0600, Bjorn Helgaas wrote:
> On Wed, Jan 19, 2022 at 10:25:50AM -0800, Keith Busch wrote:
> > On Wed, Jan 19, 2022 at 10:22:00AM +0100, Stefan Roese wrote:
> > > @@ -387,6 +387,10 @@ void pci_aer_init(struct pci_dev *dev)
> > >  	pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
> > >  
> > >  	pci_aer_clear_status(dev);
> > > +
> > > +	/* Enable AER if requested */
> > > +	if (pci_aer_available())
> > > +		pci_enable_pcie_error_reporting(dev);
> > >  }
> > 
> > Hasn't it always been the device specific driver's responsibility to
> > call this function?
> 
> So far it has been done by the driver, because the PCI core doesn't do
> it.  But is there a reason it should be done by the driver?  It
> doesn't seem necessarily device-specific.

I was thinking the device driver knows if it provides .err_handler
callbacks in order to respond to AER handling, so it would know if it is
ready for its device to enable error reporting. But I guess it doesn't
really matter if the driver provides callbacks anyway.


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19 10:37   ` Pali Rohár
@ 2022-01-20  7:31     ` Stefan Roese
  2022-01-20 13:23       ` Pali Rohár
  2022-01-20 15:46       ` Bjorn Helgaas
  0 siblings, 2 replies; 13+ messages in thread
From: Stefan Roese @ 2022-01-20  7:31 UTC (permalink / raw)
  To: Pali Rohár
  Cc: linux-pci, Bjorn Helgaas, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On 1/19/22 11:37, Pali Rohár wrote:
> On Wednesday 19 January 2022 10:22:00 Stefan Roese wrote:
>> With this change, AER is now enabled on all PCIe devices, also when the
>> PCIe device is hot-plugged.
>>
>> Please note that this change is quite invasive, as with this patch
>> applied, AER now will be enabled in the Device Control registers of all
>> available PCIe Endpoints, which currently is not the case.
>>
>> When "pci=noaer" is selected, AER stays disabled of course.
> 
> Hello Stefan! I was thinking more about this change and I'm not sure
> what happens if AER-capable PCIe device is hotplugged into some PCIe
> switch connected in the PCIe hierarchy where Root Port is not
> AER-capable (e.g. current linux implementation of pci-aardvark.c and
> pci-mvebu.c). My feeling is that in this case AER should not be enabled
> as there is nobody who can deliver AER interrupt to the OS. But I really
> do not know what is supposed from kernel AER driver, so lets wait for
> Bjorn reply.

But what happens right now, when a device driver like the NVMe driver
calls pci_enable_pcie_error_reporting()? There is also no checking
whether the connected Root Port or some switch / bridge in-between
supports AER. IIUTC, this is identical to what this patch does:
enable AER in the device, and if the upstream infrastructure does not
support AER, then the AER event will just not be received by the
kernel. This is most likely not worse than not enabling AER at all
on this device. Or am I missing something?

> And when you opened this issue with hotplugging, another thing for
> followup changes in future is calling pcie_set_ecrc_checking() function
> to align ECRC state of newly hotplugged device with "pci=ecrc=..."
> cmdline option. As currently it is done only at that function
> set_device_error_reporting().

Agreed, this is another area to look into. Not sure if it's okay to
address this, once this patch-set has been accepted (if it will be).

Thanks,
Stefan


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-19 21:18       ` Keith Busch
@ 2022-01-20  7:32         ` Stefan Roese
  0 siblings, 0 replies; 13+ messages in thread
From: Stefan Roese @ 2022-01-20  7:32 UTC (permalink / raw)
  To: Keith Busch, Bjorn Helgaas
  Cc: linux-pci, Pali Rohár, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On 1/19/22 22:18, Keith Busch wrote:
> On Wed, Jan 19, 2022 at 03:00:02PM -0600, Bjorn Helgaas wrote:
>> On Wed, Jan 19, 2022 at 10:25:50AM -0800, Keith Busch wrote:
>>> On Wed, Jan 19, 2022 at 10:22:00AM +0100, Stefan Roese wrote:
>>>> @@ -387,6 +387,10 @@ void pci_aer_init(struct pci_dev *dev)
>>>>   	pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
>>>>   
>>>>   	pci_aer_clear_status(dev);
>>>> +
>>>> +	/* Enable AER if requested */
>>>> +	if (pci_aer_available())
>>>> +		pci_enable_pcie_error_reporting(dev);
>>>>   }
>>>
>>> Hasn't it always been the device specific driver's responsibility to
>>> call this function?
>>
>> So far it has been done by the driver, because the PCI core doesn't do
>> it.  But is there a reason it should be done by the driver?  It
>> doesn't seem necessarily device-specific.
> 
> I was thinking the device driver knows if it provides .err_handler
> callbacks in order to respond to AER handling, so it would know if it is
> ready for its device to enable error reporting. But I guess it doesn't
> really matter if the driver provides callbacks anyway.

That's my understanding as well.

Thanks,
Stefan


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-20  7:31     ` Stefan Roese
@ 2022-01-20 13:23       ` Pali Rohár
  2022-01-20 15:46       ` Bjorn Helgaas
  1 sibling, 0 replies; 13+ messages in thread
From: Pali Rohár @ 2022-01-20 13:23 UTC (permalink / raw)
  To: Stefan Roese
  Cc: linux-pci, Bjorn Helgaas, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On Thursday 20 January 2022 08:31:31 Stefan Roese wrote:
> On 1/19/22 11:37, Pali Rohár wrote:
> > On Wednesday 19 January 2022 10:22:00 Stefan Roese wrote:
> > > With this change, AER is now enabled on all PCIe devices, also when the
> > > PCIe device is hot-plugged.
> > > 
> > > Please note that this change is quite invasive, as with this patch
> > > applied, AER now will be enabled in the Device Control registers of all
> > > available PCIe Endpoints, which currently is not the case.
> > > 
> > > When "pci=noaer" is selected, AER stays disabled of course.
> > 
> > Hello Stefan! I was thinking more about this change and I'm not sure
> > what happens if AER-capable PCIe device is hotplugged into some PCIe
> > switch connected in the PCIe hierarchy where Root Port is not
> > AER-capable (e.g. current linux implementation of pci-aardvark.c and
> > pci-mvebu.c). My feeling is that in this case AER should not be enabled
> > as there is nobody who can deliver AER interrupt to the OS. But I really
> > do not know what is supposed from kernel AER driver, so lets wait for
> > Bjorn reply.
> 
> But what happens right now, when a device driver like the NVMe driver
> calls pci_enable_pcie_error_reporting() ? There is also no checking,
> if the connected Root Port or some switch / bridge in-between supports
> AER or not. IIUTC, this is identical to what this patch here does.
> Enable AER in the device and if the upstream infrastructure does not
> support AER, then the AER event will just not be received by the
> Kernel. Which is most likely not worse than not enabling AER at all
> on this device. Or am I missing something?

You are right!

It seems that the AER code has a lot of candidates for follow-up fixes/cleanups...

> > And when you opened this issue with hotplugging, another thing for
> > followup changes in future is calling pcie_set_ecrc_checking() function
> > to align ECRC state of newly hotplugged device with "pci=ecrc=..."
> > cmdline option. As currently it is done only at that function
> > set_device_error_reporting().
> 
> Agreed, this is another area to look into. Not sure if it's okay to
> address this, once this patch-set has been accepted (if it will be).
> 
> Thanks,
> Stefan


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-20  7:31     ` Stefan Roese
  2022-01-20 13:23       ` Pali Rohár
@ 2022-01-20 15:46       ` Bjorn Helgaas
  2022-01-20 16:59         ` Stefan Roese
  1 sibling, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2022-01-20 15:46 UTC (permalink / raw)
  To: Stefan Roese
  Cc: Pali Rohár, linux-pci, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On Thu, Jan 20, 2022 at 08:31:31AM +0100, Stefan Roese wrote:
> On 1/19/22 11:37, Pali Rohár wrote:

> > And when you opened this issue with hotplugging, another thing for
> > followup changes in future is calling pcie_set_ecrc_checking() function
> > to align ECRC state of newly hotplugged device with "pci=ecrc=..."
> > cmdline option. As currently it is done only at that function
> > set_device_error_reporting().
> 
> Agreed, this is another area to look into. Not sure if it's okay to
> address this, once this patch-set has been accepted (if it will be).

ECRC might be something that could be peeled off first to reduce the
complexity of AER itself.

The ECRC capability and enable bits are in the AER Capability, so I
think it should be moved to pci_aer_init() so it happens for every
device as we enumerate it.

As far as I can tell, there is no requirement that every device in the
path support ECRC, so it can be enabled independently for each device.
I think devices that don't support ECRC checking must handle TLPs with
ECRC without error.

Per Table 6-5, ECRC check failures result in a device logging the
prefix/header of the TLP and sending ERR_NONFATAL or ERR_COR.  I think
this is useful regardless of whether AER interrupts are enabled
because error information is logged where the ECRC failure was
detected.
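
A minimal sketch of the per-device ECRC enabling discussed here,
modelled on a plain 32-bit register value rather than on real config
space accesses. The helper names ecrc_enable() and ecrc_disable() are
hypothetical; the bit positions are those of the Advanced Error
Capabilities and Control register in the AER Capability.

```c
#include <assert.h>
#include <stdint.h>

/* Advanced Error Capabilities and Control register (AER cap + 0x18). */
#define ERR_CAP_ECRC_GENC 0x00000020u /* ECRC Generation Capable */
#define ERR_CAP_ECRC_GENE 0x00000040u /* ECRC Generation Enable */
#define ERR_CAP_ECRC_CHKC 0x00000080u /* ECRC Check Capable */
#define ERR_CAP_ECRC_CHKE 0x00000100u /* ECRC Check Enable */

/*
 * Hypothetical helper: return the register value with ECRC generation
 * and checking enabled, each only if the device advertises support.
 */
static uint32_t ecrc_enable(uint32_t err_cap)
{
	uint32_t out = err_cap;

	if (out & ERR_CAP_ECRC_GENC)
		out |= ERR_CAP_ECRC_GENE;
	if (out & ERR_CAP_ECRC_CHKC)
		out |= ERR_CAP_ECRC_CHKE;

	/* The capability bits are read-only; the model must not change them. */
	assert(((out ^ err_cap) &
		(ERR_CAP_ECRC_GENC | ERR_CAP_ECRC_CHKC)) == 0);
	return out;
}

/* Hypothetical helper: clear both enable bits, leave everything else. */
static uint32_t ecrc_disable(uint32_t err_cap)
{
	return err_cap & ~(ERR_CAP_ECRC_GENE | ERR_CAP_ECRC_CHKE);
}
```

Because each enable bit is only set when the matching capability bit is
present, a helper of this shape could be applied to every device during
enumeration without looking at the rest of the path.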

Bjorn


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-20 15:46       ` Bjorn Helgaas
@ 2022-01-20 16:59         ` Stefan Roese
  2022-01-20 17:54           ` Bjorn Helgaas
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Roese @ 2022-01-20 16:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Pali Rohár, linux-pci, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On 1/20/22 16:46, Bjorn Helgaas wrote:
> On Thu, Jan 20, 2022 at 08:31:31AM +0100, Stefan Roese wrote:
>> On 1/19/22 11:37, Pali Rohár wrote:
> 
>>> And when you opened this issue with hotplugging, another thing for
>>> followup changes in future is calling pcie_set_ecrc_checking() function
>>> to align ECRC state of newly hotplugged device with "pci=ecrc=..."
>>> cmdline option. As currently it is done only at that function
>>> set_device_error_reporting().
>>
>> Agreed, this is another area to look into. Not sure if it's okay to
>> address this, once this patch-set has been accepted (if it will be).
> 
> ECRC might be something that could be peeled off first to reduce the
> complexity of AER itself.
> 
> The ECRC capability and enable bits are in the AER Capability, so I
> think it should be moved to pci_aer_init() so it happens for every
> device as we enumerate it.

Just so that there is no misunderstanding: are you thinking about
something like this?

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9fa1f97e5b27..5585fefc4d0e 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -387,6 +387,9 @@ void pci_aer_init(struct pci_dev *dev)
         pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);

         pci_aer_clear_status(dev);
+
+       /* Enable ECRC checking if enabled and configured */
+       pcie_set_ecrc_checking(dev);
  }

  void pci_aer_exit(struct pci_dev *dev)
@@ -1223,9 +1226,6 @@ static int set_device_error_reporting(struct pci_dev *dev, void *data)
                         pci_disable_pcie_error_reporting(dev);
         }

-       if (enable)
-               pcie_set_ecrc_checking(dev);
-
         return 0;
  }

Perhaps as patch 1/3 in this patch series? Or as some completely
separate patch?

Thanks,
Stefan

> As far as I can tell, there is no requirement that every device in the
> path support ECRC, so it can be enabled independently for each device.
> I think devices that don't support ECRC checking must handle TLPs with
> ECRC without error.
> 
> Per Table 6-5, ECRC check failures result in a device logging the
> prefix/header of the TLP and sending ERR_NONFATAL or ERR_COR.  I think
> this is useful regardless of whether AER interrupts are enabled
> because error information is logged where the ECRC failure was
> detected.
> 
> Bjorn
> 

Best regards,
Stefan Roese

-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de


* Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it
  2022-01-20 16:59         ` Stefan Roese
@ 2022-01-20 17:54           ` Bjorn Helgaas
  0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2022-01-20 17:54 UTC (permalink / raw)
  To: Stefan Roese
  Cc: Pali Rohár, linux-pci, Bharat Kumar Gogada, Michal Simek,
	Yao Hongbo, Naveen Naidu

On Thu, Jan 20, 2022 at 05:59:22PM +0100, Stefan Roese wrote:
> On 1/20/22 16:46, Bjorn Helgaas wrote:
> > On Thu, Jan 20, 2022 at 08:31:31AM +0100, Stefan Roese wrote:
> > > On 1/19/22 11:37, Pali Rohár wrote:
> > 
> > > > And when you opened this issue with hotplugging, another thing for
> > > > followup changes in future is calling pcie_set_ecrc_checking() function
> > > > to align ECRC state of newly hotplugged device with "pci=ecrc=..."
> > > > cmdline option. As currently it is done only at that function
> > > > set_device_error_reporting().
> > > 
> > > Agreed, this is another area to look into. Not sure if it's okay to
> > > address this, once this patch-set has been accepted (if it will be).
> > 
> > ECRC might be something that could be peeled off first to reduce the
> > complexity of AER itself.
> > 
> > The ECRC capability and enable bits are in the AER Capability, so I
> > think it should be moved to pci_aer_init() so it happens for every
> > device as we enumerate it.
> 
> Just that there is no misunderstanding: You are thinking about something
> like this:
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 9fa1f97e5b27..5585fefc4d0e 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -387,6 +387,9 @@ void pci_aer_init(struct pci_dev *dev)
>         pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);
> 
>         pci_aer_clear_status(dev);
> +
> +       /* Enable ECRC checking if enabled and configured */
> +       pcie_set_ecrc_checking(dev);
>  }
> 
>  void pci_aer_exit(struct pci_dev *dev)
> @@ -1223,9 +1226,6 @@ static int set_device_error_reporting(struct pci_dev *dev, void *data)
>                         pci_disable_pcie_error_reporting(dev);
>         }
> 
> -       if (enable)
> -               pcie_set_ecrc_checking(dev);
> -
>         return 0;
>  }
> 
> Perhaps as patch 1/3 in this patch series? Or as some completely
> separate patch?

Yes.  Probably as 1/3, since subsequent patches may depend on this
one, or at least may not apply cleanly without this one.

Bjorn
