All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] PCI/AER: Enable reporting for all ports
@ 2018-10-11 15:26 Bjorn Helgaas
  2018-10-11 15:26 ` [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2018-10-11 15:26 UTC (permalink / raw)
  To: Jon Derrick
  Cc: Dongdong Liu, Keith Busch, Sinan Kaya, Oza Pawandeep,
	Matthew Wilcox, Lukas Wunner, Christoph Hellwig, Mika Westerberg,
	linux-pci, linux-kernel

This is another attempt to fix the AER error reporting issue reported
by Jon.  I've compiled this on x86, but can't test it myself.

If we can get this tested, I'd like to include this for v4.20.


v3: This post
    Fix the problem that v2 didn't enable error reporting on Root
    Ports, as pointed out by Dongdong

v2: https://lore.kernel.org/linux-pci/20181009231915.GC5906@bhelgaas-glaptop.roam.corp.google.com
    My attempt to move the fix to the AER service driver

v1: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@intel.com
    Jon's initial posting

---

Bjorn Helgaas (1):
      PCI/AER: Enable reporting for ports enumerated after AER driver registration


 drivers/pci/pcie/aer.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration
  2018-10-11 15:26 [PATCH v3] PCI/AER: Enable reporting for all ports Bjorn Helgaas
@ 2018-10-11 15:26 ` Bjorn Helgaas
  2018-10-11 15:57   ` Keith Busch
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2018-10-11 15:26 UTC (permalink / raw)
  To: Jon Derrick
  Cc: Dongdong Liu, Keith Busch, Sinan Kaya, Oza Pawandeep,
	Matthew Wilcox, Lukas Wunner, Christoph Hellwig, Mika Westerberg,
	linux-pci, linux-kernel

From: Bjorn Helgaas <bhelgaas@google.com>

Previously we enabled AER error reporting only for Switch Ports that were
enumerated prior to registering the AER service driver.  Switch Ports
enumerated after AER driver registration were left with error reporting
disabled.

A common order, which works correctly, is that we enumerate devices before
registering portdrv and the AER driver:

  - Enumerate all the devices at boot-time

  - Register portdrv and bind it to all Root Ports and Switch Ports, which
    disables error reporting for these Ports

  - Register AER service driver and bind it to all Root Ports, which
    enables error reporting for the Root Ports and any Switch Ports below
    them

But if we enumerate devices *after* registering portdrv and the AER driver,
e.g., if a host bridge driver is loaded as a module, error reporting is not
enabled correctly:

  - Register portdrv and AER driver (this happens at boot-time)

  - Enumerate a Root Port

  - Bind portdrv to Root Port, disabling its error reporting

  - Bind AER service driver to Root Port, enabling error reporting for it
    and its children (there are no children, since we haven't enumerated
    them yet)

  - Enumerate Switch Port below the Root Port

  - Bind portdrv to Switch Port, disabling its error reporting

  - AER service driver doesn't bind to Switch Ports, so error reporting
    remains disabled

Hot-adding a Switch fails similarly: error reporting is enabled correctly
for the Root Port, but when the Switch is enumerated, the AER service
driver doesn't claim it, so there's nothing to enable error reporting for
the Switch Ports.

Change the AER service driver so it binds to *all* PCIe Ports, including
Switch Upstream and Downstream Ports.  Enable AER error reporting for all
these Ports, but not for any children.

Link: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@intel.com
Based-on-patch-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/aer.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 90b53abf621d..c40c6607849b 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1316,12 +1316,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
 	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
 	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
 
-	/*
-	 * Enable error reporting for the root port device and downstream port
-	 * devices.
-	 */
-	set_downstream_devices_error_reporting(pdev, true);
-
 	/* Enable Root Port's interrupt in response to error messages */
 	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, &reg32);
 	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
@@ -1378,10 +1372,17 @@ static void aer_remove(struct pcie_device *dev)
  */
 static int aer_probe(struct pcie_device *dev)
 {
+	struct pci_dev *pdev = dev->port;
+	int type = pci_pcie_type(pdev);
 	int status;
 	struct aer_rpc *rpc;
 	struct device *device = &dev->device;
 
+	if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) {
+		pci_enable_pcie_error_reporting(pdev);
+		return 0;
+	}
+
 	rpc = devm_kzalloc(device, sizeof(struct aer_rpc), GFP_KERNEL);
 	if (!rpc) {
 		dev_printk(KERN_DEBUG, device, "alloc AER rpc failed\n");
@@ -1399,6 +1400,7 @@ static int aer_probe(struct pcie_device *dev)
 	}
 
 	aer_enable_rootport(rpc);
+	pci_enable_pcie_error_reporting(pdev);
 	dev_info(device, "AER enabled with IRQ %d\n", dev->irq);
 	return 0;
 }
@@ -1439,7 +1441,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 
 static struct pcie_port_service_driver aerdriver = {
 	.name		= "aer",
-	.port_type	= PCI_EXP_TYPE_ROOT_PORT,
+	.port_type	= PCIE_ANY_PORT,
 	.service	= PCIE_PORT_SERVICE_AER,
 
 	.probe		= aer_probe,


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration
  2018-10-11 15:26 ` [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration Bjorn Helgaas
@ 2018-10-11 15:57   ` Keith Busch
  2018-10-12  8:16     ` Dongdong Liu
  2018-10-18 20:49     ` Bjorn Helgaas
  0 siblings, 2 replies; 6+ messages in thread
From: Keith Busch @ 2018-10-11 15:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Derrick, Jonathan, Dongdong Liu, Sinan Kaya, Oza Pawandeep,
	Matthew Wilcox, Lukas Wunner, Christoph Hellwig, Mika Westerberg,
	linux-pci, linux-kernel

On Thu, Oct 11, 2018 at 08:26:18AM -0700, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
> 
> Previously we enabled AER error reporting only for Switch Ports that were
> enumerated prior to registering the AER service driver.  Switch Ports
> enumerated after AER driver registration were left with error reporting
> disabled.
> 
> A common order, which works correctly, is that we enumerate devices before
> registering portdrv and the AER driver:
> 
>   - Enumerate all the devices at boot-time
> 
>   - Register portdrv and bind it to all Root Ports and Switch Ports, which
>     disables error reporting for these Ports
> 
>   - Register AER service driver and bind it to all Root Ports, which
>     enables error reporting for the Root Ports and any Switch Ports below
>     them
> 
> But if we enumerate devices *after* registering portdrv and the AER driver,
> e.g., if a host bridge driver is loaded as a module, error reporting is not
> enabled correctly:
> 
>   - Register portdrv and AER driver (this happens at boot-time)
> 
>   - Enumerate a Root Port
> 
>   - Bind portdrv to Root Port, disabling its error reporting
> 
>   - Bind AER service driver to Root Port, enabling error reporting for it
>     and its children (there are no children, since we haven't enumerated
>     them yet)
> 
>   - Enumerate Switch Port below the Root Port
> 
>   - Bind portdrv to Switch Port, disabling its error reporting
> 
>   - AER service driver doesn't bind to Switch Ports, so error reporting
>     remains disabled
> 
> Hot-adding a Switch fails similarly: error reporting is enabled correctly
> for the Root Port, but when the Switch is enumerated, the AER service
> driver doesn't claim it, so there's nothing to enable error reporting for
> the Switch Ports.
> 
> Change the AER service driver so it binds to *all* PCIe Ports, including
> Switch Upstream and Downstream Ports.  Enable AER error reporting for all
> these Ports, but not for any children.
> 
> Link: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@intel.com
> Based-on-patch-by: Jon Derrick <jonathan.derrick@intel.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/pcie/aer.c |   16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 90b53abf621d..c40c6607849b 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1316,12 +1316,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
>  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
>  	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
>  
> -	/*
> -	 * Enable error reporting for the root port device and downstream port
> -	 * devices.
> -	 */
> -	set_downstream_devices_error_reporting(pdev, true);
> -
>  	/* Enable Root Port's interrupt in response to error messages */
>  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, &reg32);
>  	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> @@ -1378,10 +1372,17 @@ static void aer_remove(struct pcie_device *dev)
>   */
>  static int aer_probe(struct pcie_device *dev)
>  {
> +	struct pci_dev *pdev = dev->port;
> +	int type = pci_pcie_type(pdev);
>  	int status;
>  	struct aer_rpc *rpc;
>  	struct device *device = &dev->device;
>  
> +	if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) {
> +		pci_enable_pcie_error_reporting(pdev);
> +		return 0;
> +	}

I think we need to either return an error in this case so that the
pcie_device won't be eligable for the .remove() callback, or add a
similiar type check in aer_remove().

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration
  2018-10-11 15:57   ` Keith Busch
@ 2018-10-12  8:16     ` Dongdong Liu
  2018-10-18 20:52       ` Bjorn Helgaas
  2018-10-18 20:49     ` Bjorn Helgaas
  1 sibling, 1 reply; 6+ messages in thread
From: Dongdong Liu @ 2018-10-12  8:16 UTC (permalink / raw)
  To: Keith Busch, Bjorn Helgaas
  Cc: Derrick, Jonathan, Sinan Kaya, Oza Pawandeep, Matthew Wilcox,
	Lukas Wunner, Christoph Hellwig, Mika Westerberg, linux-pci,
	linux-kernel



在 2018/10/11 23:57, Keith Busch 写道:
> On Thu, Oct 11, 2018 at 08:26:18AM -0700, Bjorn Helgaas wrote:
>> From: Bjorn Helgaas <bhelgaas@google.com>
>>
>> Previously we enabled AER error reporting only for Switch Ports that were
>> enumerated prior to registering the AER service driver.  Switch Ports
>> enumerated after AER driver registration were left with error reporting
>> disabled.
>>
>> A common order, which works correctly, is that we enumerate devices before
>> registering portdrv and the AER driver:
>>
>>   - Enumerate all the devices at boot-time
>>
>>   - Register portdrv and bind it to all Root Ports and Switch Ports, which
>>     disables error reporting for these Ports
>>
>>   - Register AER service driver and bind it to all Root Ports, which
>>     enables error reporting for the Root Ports and any Switch Ports below
>>     them
>>
>> But if we enumerate devices *after* registering portdrv and the AER driver,
>> e.g., if a host bridge driver is loaded as a module, error reporting is not
>> enabled correctly:
>>
>>   - Register portdrv and AER driver (this happens at boot-time)
>>
>>   - Enumerate a Root Port
>>
>>   - Bind portdrv to Root Port, disabling its error reporting
>>
>>   - Bind AER service driver to Root Port, enabling error reporting for it
>>     and its children (there are no children, since we haven't enumerated
>>     them yet)
>>
>>   - Enumerate Switch Port below the Root Port
>>
>>   - Bind portdrv to Switch Port, disabling its error reporting
>>
>>   - AER service driver doesn't bind to Switch Ports, so error reporting
>>     remains disabled
>>
>> Hot-adding a Switch fails similarly: error reporting is enabled correctly
>> for the Root Port, but when the Switch is enumerated, the AER service
>> driver doesn't claim it, so there's nothing to enable error reporting for
>> the Switch Ports.
>>
>> Change the AER service driver so it binds to *all* PCIe Ports, including
>> Switch Upstream and Downstream Ports.  Enable AER error reporting for all
>> these Ports, but not for any children.
>>
>> Link: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@intel.com
>> Based-on-patch-by: Jon Derrick <jonathan.derrick@intel.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> ---
>>  drivers/pci/pcie/aer.c |   16 +++++++++-------
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index 90b53abf621d..c40c6607849b 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1316,12 +1316,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
>>  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
>>  	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
>>
>> -	/*
>> -	 * Enable error reporting for the root port device and downstream port
>> -	 * devices.
>> -	 */
>> -	set_downstream_devices_error_reporting(pdev, true);
>> -
>>  	/* Enable Root Port's interrupt in response to error messages */
>>  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, &reg32);
>>  	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
>> @@ -1378,10 +1372,17 @@ static void aer_remove(struct pcie_device *dev)
>>   */
>>  static int aer_probe(struct pcie_device *dev)
>>  {
>> +	struct pci_dev *pdev = dev->port;
>> +	int type = pci_pcie_type(pdev);
>>  	int status;
>>  	struct aer_rpc *rpc;
>>  	struct device *device = &dev->device;
>>
>> +	if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) {
>> +		pci_enable_pcie_error_reporting(pdev);
>> +		return 0;
>> +	}
>
> I think we need to either return an error in this case so that the
> pcie_device won't be eligable for the .remove() callback, or add a
> similiar type check in aer_remove().
>

It seems aer_root_reset() also will be called for downstream port(err.c driver->reset_link(dev)),
but aer_root_reset is only for root port.

static struct pcie_port_service_driver aerdriver = {
	.name		= "aer",
	.port_type	= PCIE_ANY_PORT,
	.service	= PCIE_PORT_SERVICE_AER,

	.probe		= aer_probe,
	.remove		= aer_remove,
	.reset_link	= aer_root_reset,
};

Thanks,
Dongdong
> .
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration
  2018-10-11 15:57   ` Keith Busch
  2018-10-12  8:16     ` Dongdong Liu
@ 2018-10-18 20:49     ` Bjorn Helgaas
  1 sibling, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2018-10-18 20:49 UTC (permalink / raw)
  To: Keith Busch
  Cc: Derrick, Jonathan, Dongdong Liu, Sinan Kaya, Oza Pawandeep,
	Matthew Wilcox, Lukas Wunner, Christoph Hellwig, Mika Westerberg,
	linux-pci, linux-kernel

On Thu, Oct 11, 2018 at 09:57:16AM -0600, Keith Busch wrote:
> On Thu, Oct 11, 2018 at 08:26:18AM -0700, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <bhelgaas@google.com>
> > 
> > Previously we enabled AER error reporting only for Switch Ports that were
> > enumerated prior to registering the AER service driver.  Switch Ports
> > enumerated after AER driver registration were left with error reporting
> > disabled.
> > 
> > A common order, which works correctly, is that we enumerate devices before
> > registering portdrv and the AER driver:
> > 
> >   - Enumerate all the devices at boot-time
> > 
> >   - Register portdrv and bind it to all Root Ports and Switch Ports, which
> >     disables error reporting for these Ports
> > 
> >   - Register AER service driver and bind it to all Root Ports, which
> >     enables error reporting for the Root Ports and any Switch Ports below
> >     them
> > 
> > But if we enumerate devices *after* registering portdrv and the AER driver,
> > e.g., if a host bridge driver is loaded as a module, error reporting is not
> > enabled correctly:
> > 
> >   - Register portdrv and AER driver (this happens at boot-time)
> > 
> >   - Enumerate a Root Port
> > 
> >   - Bind portdrv to Root Port, disabling its error reporting
> > 
> >   - Bind AER service driver to Root Port, enabling error reporting for it
> >     and its children (there are no children, since we haven't enumerated
> >     them yet)
> > 
> >   - Enumerate Switch Port below the Root Port
> > 
> >   - Bind portdrv to Switch Port, disabling its error reporting
> > 
> >   - AER service driver doesn't bind to Switch Ports, so error reporting
> >     remains disabled
> > 
> > Hot-adding a Switch fails similarly: error reporting is enabled correctly
> > for the Root Port, but when the Switch is enumerated, the AER service
> > driver doesn't claim it, so there's nothing to enable error reporting for
> > the Switch Ports.
> > 
> > Change the AER service driver so it binds to *all* PCIe Ports, including
> > Switch Upstream and Downstream Ports.  Enable AER error reporting for all
> > these Ports, but not for any children.
> > 
> > Link: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@intel.com
> > Based-on-patch-by: Jon Derrick <jonathan.derrick@intel.com>
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  drivers/pci/pcie/aer.c |   16 +++++++++-------
> >  1 file changed, 9 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 90b53abf621d..c40c6607849b 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1316,12 +1316,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
> >  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
> >  	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
> >  
> > -	/*
> > -	 * Enable error reporting for the root port device and downstream port
> > -	 * devices.
> > -	 */
> > -	set_downstream_devices_error_reporting(pdev, true);
> > -
> >  	/* Enable Root Port's interrupt in response to error messages */
> >  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, &reg32);
> >  	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> > @@ -1378,10 +1372,17 @@ static void aer_remove(struct pcie_device *dev)
> >   */
> >  static int aer_probe(struct pcie_device *dev)
> >  {
> > +	struct pci_dev *pdev = dev->port;
> > +	int type = pci_pcie_type(pdev);
> >  	int status;
> >  	struct aer_rpc *rpc;
> >  	struct device *device = &dev->device;
> >  
> > +	if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) {
> > +		pci_enable_pcie_error_reporting(pdev);
> > +		return 0;
> > +	}
> 
> I think we need to either return an error in this case so that the
> pcie_device won't be eligable for the .remove() callback, or add a
> similiar type check in aer_remove().

Indeed, thanks!  I think a check in aer_remove() seems nicer.  It doesn't
seem right to return an error here, since everything is working correctly.

Bjorn

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration
  2018-10-12  8:16     ` Dongdong Liu
@ 2018-10-18 20:52       ` Bjorn Helgaas
  0 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2018-10-18 20:52 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: Keith Busch, Derrick, Jonathan, Sinan Kaya, Oza Pawandeep,
	Matthew Wilcox, Lukas Wunner, Christoph Hellwig, Mika Westerberg,
	linux-pci, linux-kernel

On Fri, Oct 12, 2018 at 04:16:04PM +0800, Dongdong Liu wrote:
> 在 2018/10/11 23:57, Keith Busch 写道:
> > On Thu, Oct 11, 2018 at 08:26:18AM -0700, Bjorn Helgaas wrote:
> > > From: Bjorn Helgaas <bhelgaas@google.com>
> > > 
> > > Previously we enabled AER error reporting only for Switch Ports that were
> > > enumerated prior to registering the AER service driver.  Switch Ports
> > > enumerated after AER driver registration were left with error reporting
> > > disabled.
> > > 
> > > A common order, which works correctly, is that we enumerate devices before
> > > registering portdrv and the AER driver:
> > > 
> > >   - Enumerate all the devices at boot-time
> > > 
> > >   - Register portdrv and bind it to all Root Ports and Switch Ports, which
> > >     disables error reporting for these Ports
> > > 
> > >   - Register AER service driver and bind it to all Root Ports, which
> > >     enables error reporting for the Root Ports and any Switch Ports below
> > >     them
> > > 
> > > But if we enumerate devices *after* registering portdrv and the AER driver,
> > > e.g., if a host bridge driver is loaded as a module, error reporting is not
> > > enabled correctly:
> > > 
> > >   - Register portdrv and AER driver (this happens at boot-time)
> > > 
> > >   - Enumerate a Root Port
> > > 
> > >   - Bind portdrv to Root Port, disabling its error reporting
> > > 
> > >   - Bind AER service driver to Root Port, enabling error reporting for it
> > >     and its children (there are no children, since we haven't enumerated
> > >     them yet)
> > > 
> > >   - Enumerate Switch Port below the Root Port
> > > 
> > >   - Bind portdrv to Switch Port, disabling its error reporting
> > > 
> > >   - AER service driver doesn't bind to Switch Ports, so error reporting
> > >     remains disabled
> > > 
> > > Hot-adding a Switch fails similarly: error reporting is enabled correctly
> > > for the Root Port, but when the Switch is enumerated, the AER service
> > > driver doesn't claim it, so there's nothing to enable error reporting for
> > > the Switch Ports.
> > > 
> > > Change the AER service driver so it binds to *all* PCIe Ports, including
> > > Switch Upstream and Downstream Ports.  Enable AER error reporting for all
> > > these Ports, but not for any children.
> > > 
> > > Link: https://lore.kernel.org/linux-pci/1536085989-2956-1-git-send-email-jonathan.derrick@intel.com
> > > Based-on-patch-by: Jon Derrick <jonathan.derrick@intel.com>
> > > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > > ---
> > >  drivers/pci/pcie/aer.c |   16 +++++++++-------
> > >  1 file changed, 9 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > index 90b53abf621d..c40c6607849b 100644
> > > --- a/drivers/pci/pcie/aer.c
> > > +++ b/drivers/pci/pcie/aer.c
> > > @@ -1316,12 +1316,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
> > >  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
> > >  	pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
> > > 
> > > -	/*
> > > -	 * Enable error reporting for the root port device and downstream port
> > > -	 * devices.
> > > -	 */
> > > -	set_downstream_devices_error_reporting(pdev, true);
> > > -
> > >  	/* Enable Root Port's interrupt in response to error messages */
> > >  	pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, &reg32);
> > >  	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
> > > @@ -1378,10 +1372,17 @@ static void aer_remove(struct pcie_device *dev)
> > >   */
> > >  static int aer_probe(struct pcie_device *dev)
> > >  {
> > > +	struct pci_dev *pdev = dev->port;
> > > +	int type = pci_pcie_type(pdev);
> > >  	int status;
> > >  	struct aer_rpc *rpc;
> > >  	struct device *device = &dev->device;
> > > 
> > > +	if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) {
> > > +		pci_enable_pcie_error_reporting(pdev);
> > > +		return 0;
> > > +	}
> > 
> > I think we need to either return an error in this case so that the
> > pcie_device won't be eligable for the .remove() callback, or add a
> > similiar type check in aer_remove().
> 
> It seems aer_root_reset() also will be called for downstream port(err.c driver->reset_link(dev)),
> but aer_root_reset is only for root port.

Also right, thanks!

I think it will do the right thing if we make aer_root_reset() check
the port type and only disable/restore the Root Port interrupt
settings and status when called for a Root Port.  The
pci_bus_error_reset() it does is exactly what default_reset_link()
does for devices that aren't claimed by the AER driver.

Bjorn

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-18 20:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-11 15:26 [PATCH v3] PCI/AER: Enable reporting for all ports Bjorn Helgaas
2018-10-11 15:26 ` [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration Bjorn Helgaas
2018-10-11 15:57   ` Keith Busch
2018-10-12  8:16     ` Dongdong Liu
2018-10-18 20:52       ` Bjorn Helgaas
2018-10-18 20:49     ` Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.