linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers
@ 2012-11-10  7:41 Pandarathil, Vijaymohan R
  2012-11-15  0:51 ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Pandarathil, Vijaymohan R @ 2012-11-10  7:41 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-kernel, linux-pci, linasvepstas, Myron Stowe

When an error is detected on a PCIe device which does not have an
AER-aware driver, prevent AER infrastructure from reporting
successful error recovery.

This is because the report_error_detected() function that gets
called in the first phase of recovery process allows forward
progress even when the driver for the device does not have AER
capabilities. It seems that all callbacks (in pci_error_handlers
structure) registered by drivers that gets called during error
recovery are not mandatory. So the intention of the infrastructure
design seems to be to allow forward progress even when a specific
callback has not been registered by a driver. However, if error
handler structure itself has not been registered, it doesn't make
sense to allow forward progress.

As a result of the current design, in the case of a single device
having an AER-unaware driver or in the case of any function in a
multi-function card having an AER-unaware driver, a successful
recovery is reported.

Typical scenario this happens is when a PCI device is detached
from a KVM host and the pci-stub driver on the host claims the
device. The pci-stub driver does not have error handling capabilities
but the AER infrastructure still reports that the device recovered
successfully.

The changes proposed here leaves the device in an unrecovered state
if the driver for the device or for any function in a multi-function
card does not have error handler structure registered. This reflects
the true state of the device and prevents any partial recovery (or no
recovery at all) reported as successful.

Please also see comments from Linas Vepstas at the following link
http://www.spinics.net/lists/linux-pci/msg18572.html

Reviewed-by: Linas Vepstas <linasvepstas <at> gmail.com>
Reviewed-by: Myron Stowe <mstowe <at> redhat.com>
Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at> hp.com>

---

drivers/pci/pcie/aer/aerdrv_core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..030b229 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void *data)
 
 	dev->error_state = result_data->state;
 
+	if ((!dev->driver || !dev->driver->err_handler) &&
+		!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+		dev_info(&dev->dev, "AER: Error detected but no driver has claimed this device or the driver is AER-unaware\n");
+		result_data->result = PCI_ERS_RESULT_NONE;
+		return 1;
+	}
 	if (!dev->driver ||
 		!dev->driver->err_handler ||
 		!dev->driver->err_handler->error_detected) {

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers
  2012-11-10  7:41 [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers Pandarathil, Vijaymohan R
@ 2012-11-15  0:51 ` Bjorn Helgaas
  2012-11-15  1:35   ` Linas Vepstas
  2012-11-15  7:09   ` Pandarathil, Vijaymohan R
  0 siblings, 2 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2012-11-15  0:51 UTC (permalink / raw)
  To: Pandarathil, Vijaymohan R
  Cc: linux-kernel, linux-pci, linasvepstas, Myron Stowe, Lance Ortiz,
	Huang Ying, Hidetoshi Seto, Andrew Patterson, Zhang Yanmin

[+cc Lance, Huang, Hidetoshi, Andrew, Zhang]

On Sat, Nov 10, 2012 at 07:41:04AM +0000, Pandarathil, Vijaymohan R wrote:
> When an error is detected on a PCIe device which does not have an
> AER-aware driver, prevent AER infrastructure from reporting
> successful error recovery.
> 
> This is because the report_error_detected() function that gets
> called in the first phase of recovery process allows forward
> progress even when the driver for the device does not have AER
> capabilities. It seems that all callbacks (in pci_error_handlers
> structure) registered by drivers that gets called during error
> recovery are not mandatory. So the intention of the infrastructure
> design seems to be to allow forward progress even when a specific
> callback has not been registered by a driver. However, if error
> handler structure itself has not been registered, it doesn't make
> sense to allow forward progress.
> 
> As a result of the current design, in the case of a single device
> having an AER-unaware driver or in the case of any function in a
> multi-function card having an AER-unaware driver, a successful
> recovery is reported.
> 
> Typical scenario this happens is when a PCI device is detached
> from a KVM host and the pci-stub driver on the host claims the
> device. The pci-stub driver does not have error handling capabilities
> but the AER infrastructure still reports that the device recovered
> successfully.
> 
> The changes proposed here leaves the device in an unrecovered state
> if the driver for the device or for any function in a multi-function
> card does not have error handler structure registered. This reflects
> the true state of the device and prevents any partial recovery (or no
> recovery at all) reported as successful.
> 
> Please also see comments from Linas Vepstas at the following link
> http://www.spinics.net/lists/linux-pci/msg18572.html
> 
> Reviewed-by: Linas Vepstas <linasvepstas <at> gmail.com>
> Reviewed-by: Myron Stowe <mstowe <at> redhat.com>
> Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at> hp.com>
> 
> ---
> 
> drivers/pci/pcie/aer/aerdrv_core.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 06bad96..030b229 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void *data)
>  
>  	dev->error_state = result_data->state;
>  
> +	if ((!dev->driver || !dev->driver->err_handler) &&
> +		!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> +		dev_info(&dev->dev, "AER: Error detected but no driver has claimed this device or the driver is AER-unaware\n");
> +		result_data->result = PCI_ERS_RESULT_NONE;
> +		return 1;

This doesn't seem right because returning 1 here causes pci_walk_bus()
to terminate, which means we won't set dev->error_state or notify
drivers for any devices we haven't visited yet.

> +	}
>  	if (!dev->driver ||
>  		!dev->driver->err_handler ||
>  		!dev->driver->err_handler->error_detected) {

If none of the drivers in the affected hierarchy supports error handling,
I think the call tree looks like this:

    do_recovery                                 # uncorrectable only
        broadcast_error_message(dev, ..., report_error_detected)
            result_data.result = CAN_RECOVER            
            pci_walk_bus(..., report_error_detected)
                report_error_detected           # (each dev in subtree)
                    return 0                    # no change to result
            return result_data.result
        broadcast_error_message(dev, ..., report_mmio_enabled)
            result_data.result = PCI_ERS_RESULT_RECOVERED       
            pci_walk_bus(..., report_mmio_enabled)
                report_mmio_enabled             # (each dev in subtree)
                    return 0                    # no change to result
        dev_info("recovery successful")

Specifically, there are no error_handler functions, so we never change
result_data.result, and the default is that we treat the error as
"recovered successfully."  That seems broken.  An uncorrectable error
is by definition recoverable only by device-specific software, i.e.,
the driver.  We didn't call any drivers, so we can't have recovered
anything.

What do you think of the following alternative?  I don't know why you
checked for bridge devices in your patch, so I don't know whether
that's important here or not.

diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..a109c68 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -231,11 +231,11 @@ static int report_error_detected(struct pci_dev *dev, void *data)
 				   dev->driver ?
 				   "no AER-aware driver" : "no driver");
 		}
-		return 0;
+		vote = PCI_ERS_RESULT_DISCONNECT;
+	} else {
+		err_handler = dev->driver->err_handler;
+		vote = err_handler->error_detected(dev, result_data->state);
 	}
-
-	err_handler = dev->driver->err_handler;
-	vote = err_handler->error_detected(dev, result_data->state);
 	result_data->result = merge_result(result_data->result, vote);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers
  2012-11-15  0:51 ` Bjorn Helgaas
@ 2012-11-15  1:35   ` Linas Vepstas
  2012-11-15  7:09   ` Pandarathil, Vijaymohan R
  1 sibling, 0 replies; 6+ messages in thread
From: Linas Vepstas @ 2012-11-15  1:35 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Pandarathil, Vijaymohan R, linux-kernel, linux-pci, Myron Stowe,
	Lance Ortiz, Huang Ying, Hidetoshi Seto, Andrew Patterson,
	Zhang Yanmin

Yes, what you describe appears to be the correct semantics; this would
then be the more correct patch.

Read-the-email-but-didn't-try-to-test-by: Linas Vepstas <linasvepstas
<at> gmail.com>

-- Linas

On 14 November 2012 18:51, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
>
> On Sat, Nov 10, 2012 at 07:41:04AM +0000, Pandarathil, Vijaymohan R wrote:
> > When an error is detected on a PCIe device which does not have an
> > AER-aware driver, prevent AER infrastructure from reporting
> > successful error recovery.
> >
> > This is because the report_error_detected() function that gets
> > called in the first phase of recovery process allows forward
> > progress even when the driver for the device does not have AER
> > capabilities. It seems that all callbacks (in pci_error_handlers
> > structure) registered by drivers that gets called during error
> > recovery are not mandatory. So the intention of the infrastructure
> > design seems to be to allow forward progress even when a specific
> > callback has not been registered by a driver. However, if error
> > handler structure itself has not been registered, it doesn't make
> > sense to allow forward progress.
> >
> > As a result of the current design, in the case of a single device
> > having an AER-unaware driver or in the case of any function in a
> > multi-function card having an AER-unaware driver, a successful
> > recovery is reported.
> >
> > Typical scenario this happens is when a PCI device is detached
> > from a KVM host and the pci-stub driver on the host claims the
> > device. The pci-stub driver does not have error handling capabilities
> > but the AER infrastructure still reports that the device recovered
> > successfully.
> >
> > The changes proposed here leaves the device in an unrecovered state
> > if the driver for the device or for any function in a multi-function
> > card does not have error handler structure registered. This reflects
> > the true state of the device and prevents any partial recovery (or no
> > recovery at all) reported as successful.
> >
> > Please also see comments from Linas Vepstas at the following link
> > http://www.spinics.net/lists/linux-pci/msg18572.html
> >
> > Reviewed-by: Linas Vepstas <linasvepstas <at> gmail.com>
> > Reviewed-by: Myron Stowe <mstowe <at> redhat.com>
> > Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at> hp.com>
> >
> > ---
> >
> > drivers/pci/pcie/aer/aerdrv_core.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 06bad96..030b229 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void *data)
> >
> >       dev->error_state = result_data->state;
> >
> > +     if ((!dev->driver || !dev->driver->err_handler) &&
> > +             !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> > +             dev_info(&dev->dev, "AER: Error detected but no driver has claimed this device or the driver is AER-unaware\n");
> > +             result_data->result = PCI_ERS_RESULT_NONE;
> > +             return 1;
>
> This doesn't seem right because returning 1 here causes pci_walk_bus()
> to terminate, which means we won't set dev->error_state or notify
> drivers for any devices we haven't visited yet.
>
> > +     }
> >       if (!dev->driver ||
> >               !dev->driver->err_handler ||
> >               !dev->driver->err_handler->error_detected) {
>
> If none of the drivers in the affected hierarchy supports error handling,
> I think the call tree looks like this:
>
>     do_recovery                                 # uncorrectable only
>         broadcast_error_message(dev, ..., report_error_detected)
>             result_data.result = CAN_RECOVER
>             pci_walk_bus(..., report_error_detected)
>                 report_error_detected           # (each dev in subtree)
>                     return 0                    # no change to result
>             return result_data.result
>         broadcast_error_message(dev, ..., report_mmio_enabled)
>             result_data.result = PCI_ERS_RESULT_RECOVERED
>             pci_walk_bus(..., report_mmio_enabled)
>                 report_mmio_enabled             # (each dev in subtree)
>                     return 0                    # no change to result
>         dev_info("recovery successful")
>
> Specifically, there are no error_handler functions, so we never change
> result_data.result, and the default is that we treat the error as
> "recovered successfully."  That seems broken.  An uncorrectable error
> is by definition recoverable only by device-specific software, i.e.,
> the driver.  We didn't call any drivers, so we can't have recovered
> anything.
>
> What do you think of the following alternative?  I don't know why you
> checked for bridge devices in your patch, so I don't know whether
> that's important here or not.
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 06bad96..a109c68 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -231,11 +231,11 @@ static int report_error_detected(struct pci_dev *dev, void *data)
>                                    dev->driver ?
>                                    "no AER-aware driver" : "no driver");
>                 }
> -               return 0;
> +               vote = PCI_ERS_RESULT_DISCONNECT;
> +       } else {
> +               err_handler = dev->driver->err_handler;
> +               vote = err_handler->error_detected(dev, result_data->state);
>         }
> -
> -       err_handler = dev->driver->err_handler;
> -       vote = err_handler->error_detected(dev, result_data->state);
>         result_data->result = merge_result(result_data->result, vote);
>         return 0;
>  }

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers
  2012-11-15  0:51 ` Bjorn Helgaas
  2012-11-15  1:35   ` Linas Vepstas
@ 2012-11-15  7:09   ` Pandarathil, Vijaymohan R
  2012-11-16  1:08     ` Bjorn Helgaas
  1 sibling, 1 reply; 6+ messages in thread
From: Pandarathil, Vijaymohan R @ 2012-11-15  7:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, linux-pci, linasvepstas, Myron Stowe, Ortiz,
	Lance E, Huang Ying, Hidetoshi Seto, Patterson,
	Andrew D (LeftHand Networks),
	Zhang Yanmin

Thanks for the comments. See my response below.

> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Wednesday, November 14, 2012 4:51 PM
> To: Pandarathil, Vijaymohan R
> Cc: linux-kernel@vger.kernel.org; linux-pci@vger.kernel.org;
> linasvepstas@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
> recovery for devices with AER-unaware drivers
> 
> [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
> 
> On Sat, Nov 10, 2012 at 07:41:04AM +0000, Pandarathil, Vijaymohan R wrote:
> > When an error is detected on a PCIe device which does not have an
> > AER-aware driver, prevent AER infrastructure from reporting
> > successful error recovery.
> >
> > This is because the report_error_detected() function that gets
> > called in the first phase of recovery process allows forward
> > progress even when the driver for the device does not have AER
> > capabilities. It seems that all callbacks (in pci_error_handlers
> > structure) registered by drivers that gets called during error
> > recovery are not mandatory. So the intention of the infrastructure
> > design seems to be to allow forward progress even when a specific
> > callback has not been registered by a driver. However, if error
> > handler structure itself has not been registered, it doesn't make
> > sense to allow forward progress.
> >
> > As a result of the current design, in the case of a single device
> > having an AER-unaware driver or in the case of any function in a
> > multi-function card having an AER-unaware driver, a successful
> > recovery is reported.
> >
> > Typical scenario this happens is when a PCI device is detached
> > from a KVM host and the pci-stub driver on the host claims the
> > device. The pci-stub driver does not have error handling capabilities
> > but the AER infrastructure still reports that the device recovered
> > successfully.
> >
> > The changes proposed here leaves the device in an unrecovered state
> > if the driver for the device or for any function in a multi-function
> > card does not have error handler structure registered. This reflects
> > the true state of the device and prevents any partial recovery (or no
> > recovery at all) reported as successful.
> >
> > Please also see comments from Linas Vepstas at the following link
> > http://www.spinics.net/lists/linux-pci/msg18572.html
> >
> > Reviewed-by: Linas Vepstas <linasvepstas <at> gmail.com>
> > Reviewed-by: Myron Stowe <mstowe <at> redhat.com>
> > Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at>
> hp.com>
> >
> > ---
> >
> > drivers/pci/pcie/aer/aerdrv_core.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 06bad96..030b229 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
> *dev, void *data)
> >
> >  	dev->error_state = result_data->state;
> >
> > +	if ((!dev->driver || !dev->driver->err_handler) &&
> > +		!(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> > +		dev_info(&dev->dev, "AER: Error detected but no driver has
> claimed this device or the driver is AER-unaware\n");
> > +		result_data->result = PCI_ERS_RESULT_NONE;
> > +		return 1;
> 
> This doesn't seem right because returning 1 here causes pci_walk_bus()
> to terminate, which means we won't set dev->error_state or notify
> drivers for any devices we haven't visited yet.
> 
> > +	}
> >  	if (!dev->driver ||
> >  		!dev->driver->err_handler ||
> >  		!dev->driver->err_handler->error_detected) {
> 
> If none of the drivers in the affected hierarchy supports error handling,
> I think the call tree looks like this:
> 
>     do_recovery                                 # uncorrectable only
>         broadcast_error_message(dev, ..., report_error_detected)
>             result_data.result = CAN_RECOVER
>             pci_walk_bus(..., report_error_detected)
>                 report_error_detected           # (each dev in subtree)
>                     return 0                    # no change to result
>             return result_data.result
>         broadcast_error_message(dev, ..., report_mmio_enabled)
>             result_data.result = PCI_ERS_RESULT_RECOVERED
>             pci_walk_bus(..., report_mmio_enabled)
>                 report_mmio_enabled             # (each dev in subtree)
>                     return 0                    # no change to result
>         dev_info("recovery successful")
> 
> Specifically, there are no error_handler functions, so we never change
> result_data.result, and the default is that we treat the error as
> "recovered successfully."  That seems broken.  An uncorrectable error
> is by definition recoverable only by device-specific software, i.e.,
> the driver.  We didn't call any drivers, so we can't have recovered
> anything.
> 
> What do you think of the following alternative?  I don't know why you
> checked for bridge devices in your patch, so I don't know whether
> that's important here or not.
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> b/drivers/pci/pcie/aer/aerdrv_core.c
> index 06bad96..a109c68 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -231,11 +231,11 @@ static int report_error_detected(struct pci_dev *dev,
> void *data)
>  				   dev->driver ?
>  				   "no AER-aware driver" : "no driver");
>  		}
> -		return 0;
> +		vote = PCI_ERS_RESULT_DISCONNECT;
> +	} else {
> +		err_handler = dev->driver->err_handler;
> +		vote = err_handler->error_detected(dev, result_data->state);
>  	}
> -
> -	err_handler = dev->driver->err_handler;
> -	vote = err_handler->error_detected(dev, result_data->state);
>  	result_data->result = merge_result(result_data->result, vote);
>  	return 0;
>  }

This would definitely set the error_state for all devices correctly. However, with the 
current implementation of merge_result(), won't we still end up reporting successful 
recovery ? The following case statement in merge_result() can set back the result 
from PCI_ERS_RESULT_DISCONNECT to PCI_ERS_RESULT_NEED_RESET for a subsequent device 
on the bus which returned PCI_ERS_RESULT_NEED_RESET from its error_detected callback . 

merge_result()
...
        case PCI_ERS_RESULT_DISCONNECT:
                if (new == PCI_ERS_RESULT_NEED_RESET)
                        orig = new;
                break;

This would mean do_recovery() proceeds along to the next broadcast_message and 
ultimately report success. Right ? Let me know if I am missing something.

I looked at a few options and the following looked more promising. Thoughts ?

Introduce a new pci_ers_result enum PCI_ERS_RESULT_NO_AER_DRIVER and make changes as follows.

diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 94a7598..149ba79 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -87,6 +87,9 @@ struct aer_broadcast_data {
 static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
 		enum pci_ers_result new)
 {
+	if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
+		return new;
+
 	if (new == PCI_ERS_RESULT_NONE)
 		return orig;
 
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 06bad96..729580a 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -231,11 +231,12 @@ static int report_error_detected(struct pci_dev *dev, void *data)
 				   dev->driver ?
 				   "no AER-aware driver" : "no driver");
 		}
-		return 0;
+		vote = PCI_ERS_RESULT_NO_AER_DRIVER;
+	} else {
+		err_handler = dev->driver->err_handler;
+		vote = err_handler->error_detected(dev, result_data->state);
 	}
 
-	err_handler = dev->driver->err_handler;
-	vote = err_handler->error_detected(dev, result_data->state);
 	result_data->result = merge_result(result_data->result, vote);
 	return 0;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ee21795..fb7e869 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -538,6 +538,9 @@ enum pci_ers_result {
 
 	/* Device driver is fully recovered and operational */
 	PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
+
+	/* No AER capabilities registered for the driver */
+	PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
 };
 
 /* PCI bus error event callbacks */


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers
  2012-11-15  7:09   ` Pandarathil, Vijaymohan R
@ 2012-11-16  1:08     ` Bjorn Helgaas
  2012-11-16  4:26       ` Pandarathil, Vijaymohan R
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2012-11-16  1:08 UTC (permalink / raw)
  To: Pandarathil, Vijaymohan R
  Cc: linux-kernel, linux-pci, linasvepstas, Myron Stowe, Ortiz,
	Lance E, Huang Ying, Hidetoshi Seto, Patterson,
	Andrew D (LeftHand Networks),
	Zhang Yanmin

On Thu, Nov 15, 2012 at 12:09 AM, Pandarathil, Vijaymohan R
<vijaymohan.pandarathil@hp.com> wrote:
> Thanks for the comments. See my response below.
>
>> -----Original Message-----
>> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
>> Sent: Wednesday, November 14, 2012 4:51 PM
>> To: Pandarathil, Vijaymohan R
>> Cc: linux-kernel@vger.kernel.org; linux-pci@vger.kernel.org;
>> linasvepstas@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
>> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
>> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
>> recovery for devices with AER-unaware drivers
>>
>> [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
>>
>> On Sat, Nov 10, 2012 at 07:41:04AM +0000, Pandarathil, Vijaymohan R wrote:
>> > When an error is detected on a PCIe device which does not have an
>> > AER-aware driver, prevent AER infrastructure from reporting
>> > successful error recovery.
>> >
>> > This is because the report_error_detected() function that gets
>> > called in the first phase of recovery process allows forward
>> > progress even when the driver for the device does not have AER
>> > capabilities. It seems that all callbacks (in pci_error_handlers
>> > structure) registered by drivers that gets called during error
>> > recovery are not mandatory. So the intention of the infrastructure
>> > design seems to be to allow forward progress even when a specific
>> > callback has not been registered by a driver. However, if error
>> > handler structure itself has not been registered, it doesn't make
>> > sense to allow forward progress.
>> >
>> > As a result of the current design, in the case of a single device
>> > having an AER-unaware driver or in the case of any function in a
>> > multi-function card having an AER-unaware driver, a successful
>> > recovery is reported.
>> >
>> > Typical scenario this happens is when a PCI device is detached
>> > from a KVM host and the pci-stub driver on the host claims the
>> > device. The pci-stub driver does not have error handling capabilities
>> > but the AER infrastructure still reports that the device recovered
>> > successfully.
>> >
>> > The changes proposed here leaves the device in an unrecovered state
>> > if the driver for the device or for any function in a multi-function
>> > card does not have error handler structure registered. This reflects
>> > the true state of the device and prevents any partial recovery (or no
>> > recovery at all) reported as successful.
>> >
>> > Please also see comments from Linas Vepstas at the following link
>> > http://www.spinics.net/lists/linux-pci/msg18572.html
>> >
>> > Reviewed-by: Linas Vepstas <linasvepstas <at> gmail.com>
>> > Reviewed-by: Myron Stowe <mstowe <at> redhat.com>
>> > Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at>
>> hp.com>
>> >
>> > ---
>> >
>> > drivers/pci/pcie/aer/aerdrv_core.c | 6 ++++++
>> >  1 file changed, 6 insertions(+)
>> >
>> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
>> b/drivers/pci/pcie/aer/aerdrv_core.c
>> > index 06bad96..030b229 100644
>> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
>> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
>> > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
>> *dev, void *data)
>> >
>> >     dev->error_state = result_data->state;
>> >
>> > +   if ((!dev->driver || !dev->driver->err_handler) &&
>> > +           !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
>> > +           dev_info(&dev->dev, "AER: Error detected but no driver has
>> claimed this device or the driver is AER-unaware\n");
>> > +           result_data->result = PCI_ERS_RESULT_NONE;
>> > +           return 1;
>>
>> This doesn't seem right because returning 1 here causes pci_walk_bus()
>> to terminate, which means we won't set dev->error_state or notify
>> drivers for any devices we haven't visited yet.
>>
>> > +   }
>> >     if (!dev->driver ||
>> >             !dev->driver->err_handler ||
>> >             !dev->driver->err_handler->error_detected) {
>>
>> If none of the drivers in the affected hierarchy supports error handling,
>> I think the call tree looks like this:
>>
>>     do_recovery                                 # uncorrectable only
>>         broadcast_error_message(dev, ..., report_error_detected)
>>             result_data.result = CAN_RECOVER
>>             pci_walk_bus(..., report_error_detected)
>>                 report_error_detected           # (each dev in subtree)
>>                     return 0                    # no change to result
>>             return result_data.result
>>         broadcast_error_message(dev, ..., report_mmio_enabled)
>>             result_data.result = PCI_ERS_RESULT_RECOVERED
>>             pci_walk_bus(..., report_mmio_enabled)
>>                 report_mmio_enabled             # (each dev in subtree)
>>                     return 0                    # no change to result
>>         dev_info("recovery successful")
>>
>> Specifically, there are no error_handler functions, so we never change
>> result_data.result, and the default is that we treat the error as
>> "recovered successfully."  That seems broken.  An uncorrectable error
>> is by definition recoverable only by device-specific software, i.e.,
>> the driver.  We didn't call any drivers, so we can't have recovered
>> anything.
>>
>> What do you think of the following alternative?  I don't know why you
>> checked for bridge devices in your patch, so I don't know whether
>> that's important here or not.
>>
>> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
>> b/drivers/pci/pcie/aer/aerdrv_core.c
>> index 06bad96..a109c68 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_core.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
>> @@ -231,11 +231,11 @@ static int report_error_detected(struct pci_dev *dev,
>> void *data)
>>                                  dev->driver ?
>>                                  "no AER-aware driver" : "no driver");
>>               }
>> -             return 0;
>> +             vote = PCI_ERS_RESULT_DISCONNECT;
>> +     } else {
>> +             err_handler = dev->driver->err_handler;
>> +             vote = err_handler->error_detected(dev, result_data->state);
>>       }
>> -
>> -     err_handler = dev->driver->err_handler;
>> -     vote = err_handler->error_detected(dev, result_data->state);
>>       result_data->result = merge_result(result_data->result, vote);
>>       return 0;
>>  }
>
> This would definitely set the error_state for all devices correctly. However, with the
> current implementation of merge_result(), won't we still end up reporting successful
> recovery ? The following case statement in merge_result() can set back the result
> from PCI_ERS_RESULT_DISCONNECT to PCI_ERS_RESULT_NEED_RESET for a subsequent device
> on the bus which returned PCI_ERS_RESULT_NEED_RESET from its error_detected callback .
>
> merge_result()
> ...
>         case PCI_ERS_RESULT_DISCONNECT:
>                 if (new == PCI_ERS_RESULT_NEED_RESET)
>                         orig = new;
>                 break;
>
> This would mean do_recovery() proceeds along to the next broadcast_message and
> ultimately report success. Right ? Let me know if I am missing something.

Yes, I think you're right.  I only looked at the case where none of
the devices in the subtree had drivers.

> I looked at a few options and the following looked more promising. Thoughts ?
>
> Introduce a new pci_ers_result enum PCI_ERS_RESULT_NO_AER_DRIVER and make changes as follows.

I don't really like this new enum because it's not really obvious how
it's different from PCI_ERS_RESULT_NONE and, more importantly, it
makes merge_result() even more of a kludge than it already is.

It's hard to write a nice simple description of this algorithm
(visiting all the devices in a subtree, collecting error handler
results, and merging them).  It would be a lot easier to describe if
merge_result() could be written simply as "max()", but I'm not sure
the pci_ers_result codes could be ordered in a way that would make
that possible.  And the desired semantics might make it impossible,
too.

I think the intent of your patch is that if there's any device in the
subtree that lacks an .error_detected() method, we do not call
.mmio_enabled() or .slot_reset() or .resume() for *any* device in the
subtree.  Right?

So maybe this is the best we can do, and it certainly seems better
than what we have now.  Can you repost this as a fresh v2 patch?

> diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
> index 94a7598..149ba79 100644
> --- a/drivers/pci/pcie/aer/aerdrv.h
> +++ b/drivers/pci/pcie/aer/aerdrv.h
> @@ -87,6 +87,9 @@ struct aer_broadcast_data {
>  static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
>                 enum pci_ers_result new)
>  {
> +       if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
> +               return new;

BTW, if you keep this, please just use "return
PCI_ERS_RESULT_NO_AER_DRIVER" rather than "return new" since we *know*
what we're returning.  I think there's another instance of this in
merge_result() that you could fix, too.

> +
>         if (new == PCI_ERS_RESULT_NONE)
>                 return orig;
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 06bad96..729580a 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -231,11 +231,12 @@ static int report_error_detected(struct pci_dev *dev, void *data)
>                                    dev->driver ?
>                                    "no AER-aware driver" : "no driver");
>                 }
> -               return 0;
> +               vote = PCI_ERS_RESULT_NO_AER_DRIVER;
> +       } else {
> +               err_handler = dev->driver->err_handler;
> +               vote = err_handler->error_detected(dev, result_data->state);
>         }
>
> -       err_handler = dev->driver->err_handler;
> -       vote = err_handler->error_detected(dev, result_data->state);
>         result_data->result = merge_result(result_data->result, vote);
>         return 0;
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index ee21795..fb7e869 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -538,6 +538,9 @@ enum pci_ers_result {
>
>         /* Device driver is fully recovered and operational */
>         PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
> +
> +       /* No AER capabilities registered for the driver */
> +       PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
>  };
>
>  /* PCI bus error event callbacks */
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers
  2012-11-16  1:08     ` Bjorn Helgaas
@ 2012-11-16  4:26       ` Pandarathil, Vijaymohan R
  0 siblings, 0 replies; 6+ messages in thread
From: Pandarathil, Vijaymohan R @ 2012-11-16  4:26 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, linux-pci, linasvepstas, Myron Stowe, Ortiz,
	Lance E, Huang Ying, Hidetoshi Seto, Patterson,
	Andrew D (LeftHand Networks),
	Zhang Yanmin



> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Thursday, November 15, 2012 5:09 PM
> To: Pandarathil, Vijaymohan R
> Cc: linux-kernel@vger.kernel.org; linux-pci@vger.kernel.org;
> linasvepstas@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying; Hidetoshi
> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
> recovery for devices with AER-unaware drivers
> 
> On Thu, Nov 15, 2012 at 12:09 AM, Pandarathil, Vijaymohan R
> <vijaymohan.pandarathil@hp.com> wrote:
> > Thanks for the comments. See my response below.
> >
> >> -----Original Message-----
> >> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> >> Sent: Wednesday, November 14, 2012 4:51 PM
> >> To: Pandarathil, Vijaymohan R
> >> Cc: linux-kernel@vger.kernel.org; linux-pci@vger.kernel.org;
> >> linasvepstas@gmail.com; Myron Stowe; Ortiz, Lance E; Huang Ying;
> Hidetoshi
> >> Seto; Patterson, Andrew D (LeftHand Networks); Zhang Yanmin
> >> Subject: Re: [ PATCH RESEND ] PCI-AER: Do not report successful error
> >> recovery for devices with AER-unaware drivers
> >>
> >> [+cc Lance, Huang, Hidetoshi, Andrew, Zhang]
> >>
> >> On Sat, Nov 10, 2012 at 07:41:04AM +0000, Pandarathil, Vijaymohan R
> wrote:
> >> > When an error is detected on a PCIe device which does not have an
> >> > AER-aware driver, prevent AER infrastructure from reporting
> >> > successful error recovery.
> >> >
> >> > This is because the report_error_detected() function that gets
> >> > called in the first phase of recovery process allows forward
> >> > progress even when the driver for the device does not have AER
> >> > capabilities. It seems that all callbacks (in pci_error_handlers
> >> > structure) registered by drivers that gets called during error
> >> > recovery are not mandatory. So the intention of the infrastructure
> >> > design seems to be to allow forward progress even when a specific
> >> > callback has not been registered by a driver. However, if error
> >> > handler structure itself has not been registered, it doesn't make
> >> > sense to allow forward progress.
> >> >
> >> > As a result of the current design, in the case of a single device
> >> > having an AER-unaware driver or in the case of any function in a
> >> > multi-function card having an AER-unaware driver, a successful
> >> > recovery is reported.
> >> >
> >> > Typical scenario this happens is when a PCI device is detached
> >> > from a KVM host and the pci-stub driver on the host claims the
> >> > device. The pci-stub driver does not have error handling capabilities
> >> > but the AER infrastructure still reports that the device recovered
> >> > successfully.
> >> >
> >> > The changes proposed here leaves the device in an unrecovered state
> >> > if the driver for the device or for any function in a multi-function
> >> > card does not have error handler structure registered. This reflects
> >> > the true state of the device and prevents any partial recovery (or no
> >> > recovery at all) reported as successful.
> >> >
> >> > Please also see comments from Linas Vepstas at the following link
> >> > http://www.spinics.net/lists/linux-pci/msg18572.html
> >> >
> >> > Reviewed-by: Linas Vepstas <linasvepstas <at> gmail.com>
> >> > Reviewed-by: Myron Stowe <mstowe <at> redhat.com>
> >> > Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at>
> >> hp.com>
> >> >
> >> > ---
> >> >
> >> > drivers/pci/pcie/aer/aerdrv_core.c | 6 ++++++
> >> >  1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> >> b/drivers/pci/pcie/aer/aerdrv_core.c
> >> > index 06bad96..030b229 100644
> >> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> >> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> >> > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev
> >> *dev, void *data)
> >> >
> >> >     dev->error_state = result_data->state;
> >> >
> >> > +   if ((!dev->driver || !dev->driver->err_handler) &&
> >> > +           !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
> >> > +           dev_info(&dev->dev, "AER: Error detected but no driver has
> >> claimed this device or the driver is AER-unaware\n");
> >> > +           result_data->result = PCI_ERS_RESULT_NONE;
> >> > +           return 1;
> >>
> >> This doesn't seem right because returning 1 here causes pci_walk_bus()
> >> to terminate, which means we won't set dev->error_state or notify
> >> drivers for any devices we haven't visited yet.
> >>
> >> > +   }
> >> >     if (!dev->driver ||
> >> >             !dev->driver->err_handler ||
> >> >             !dev->driver->err_handler->error_detected) {
> >>
> >> If none of the drivers in the affected hierarchy supports error
> handling,
> >> I think the call tree looks like this:
> >>
> >>     do_recovery                                 # uncorrectable only
> >>         broadcast_error_message(dev, ..., report_error_detected)
> >>             result_data.result = CAN_RECOVER
> >>             pci_walk_bus(..., report_error_detected)
> >>                 report_error_detected           # (each dev in subtree)
> >>                     return 0                    # no change to result
> >>             return result_data.result
> >>         broadcast_error_message(dev, ..., report_mmio_enabled)
> >>             result_data.result = PCI_ERS_RESULT_RECOVERED
> >>             pci_walk_bus(..., report_mmio_enabled)
> >>                 report_mmio_enabled             # (each dev in subtree)
> >>                     return 0                    # no change to result
> >>         dev_info("recovery successful")
> >>
> >> Specifically, there are no error_handler functions, so we never change
> >> result_data.result, and the default is that we treat the error as
> >> "recovered successfully."  That seems broken.  An uncorrectable error
> >> is by definition recoverable only by device-specific software, i.e.,
> >> the driver.  We didn't call any drivers, so we can't have recovered
> >> anything.
> >>
> >> What do you think of the following alternative?  I don't know why you
> >> checked for bridge devices in your patch, so I don't know whether
> >> that's important here or not.
> >>
> >> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> >> b/drivers/pci/pcie/aer/aerdrv_core.c
> >> index 06bad96..a109c68 100644
> >> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> >> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> >> @@ -231,11 +231,11 @@ static int report_error_detected(struct pci_dev
> *dev,
> >> void *data)
> >>                                  dev->driver ?
> >>                                  "no AER-aware driver" : "no driver");
> >>               }
> >> -             return 0;
> >> +             vote = PCI_ERS_RESULT_DISCONNECT;
> >> +     } else {
> >> +             err_handler = dev->driver->err_handler;
> >> +             vote = err_handler->error_detected(dev, result_data-
> >state);
> >>       }
> >> -
> >> -     err_handler = dev->driver->err_handler;
> >> -     vote = err_handler->error_detected(dev, result_data->state);
> >>       result_data->result = merge_result(result_data->result, vote);
> >>       return 0;
> >>  }
> >
> > This would definitely set the error_state for all devices correctly.
> However, with the
> > current implementation of merge_result(), won't we still end up reporting
> successful
> > recovery ? The following case statement in merge_result() can set back
> the result
> > from PCI_ERS_RESULT_DISCONNECT to PCI_ERS_RESULT_NEED_RESET for a
> subsequent device
> > on the bus which returned PCI_ERS_RESULT_NEED_RESET from its
> error_detected callback .
> >
> > merge_result()
> > ...
> >         case PCI_ERS_RESULT_DISCONNECT:
> >                 if (new == PCI_ERS_RESULT_NEED_RESET)
> >                         orig = new;
> >                 break;
> >
> > This would mean do_recovery() proceeds along to the next
> broadcast_message and
> > ultimately report success. Right ? Let me know if I am missing something.
> 
> Yes, I think you're right.  I only looked at the case where none of
> the devices in the subtree had drivers.
> 
> > I looked at a few options and the following looked more promising.
> Thoughts ?
> >
> > Introduce a new pci_ers_result enum PCI_ERS_RESULT_NO_AER_DRIVER and make
> changes as follows.
> 
> I don't really like this new enum because it's not really obvious how
> it's different from PCI_ERS_RESULT_NONE and, more importantly, it
> makes merge_result() even more of a kludge than it already is.
> 
> It's hard to write a nice simple description of this algorithm
> (visiting all the devices in a subtree, collecting error handler
> results, and merging them).  It would be a lot easier to describe if
> merge_result() could be written simply as "max()", but I'm not sure
> the pci_ers_result codes could be ordered in a way that would make
> that possible.  And the desired semantics might make it impossible,
> too.
> 
> I think the intent of your patch is that if there's any device in the
> subtree that lacks an .error_detected() method, we do not call
> .mmio_enabled() or .slot_reset() or .resume() for *any* device in the
> subtree.  Right?

Right. 

BTW, I did look at usage of PCI_ERS_RESULT_NONE in all places.
It seemed to me that the intended semantics of PCI_ERS_RESULT_NONE may 
not have been followed everywhere and I did not want to take a risk of 
breaking something else at this time. I will see if I can cleanup 
merge_result() as a separate patch in future.

> 
> So maybe this is the best we can do, and it certainly seems better
> than what we have now.  Can you repost this as a fresh v2 patch?

Okay. I will post a v2 patch.

> 
> > diff --git a/drivers/pci/pcie/aer/aerdrv.h
> b/drivers/pci/pcie/aer/aerdrv.h
> > index 94a7598..149ba79 100644
> > --- a/drivers/pci/pcie/aer/aerdrv.h
> > +++ b/drivers/pci/pcie/aer/aerdrv.h
> > @@ -87,6 +87,9 @@ struct aer_broadcast_data {
> >  static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
> >                 enum pci_ers_result new)
> >  {
> > +       if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
> > +               return new;
> 
> BTW, if you keep this, please just use "return
> PCI_ERS_RESULT_NO_AER_DRIVER" rather than "return new" since we *know*
> what we're returning.  I think there's another instance of this in
> merge_result() that you could fix, too.

Will do.

> 
> > +
> >         if (new == PCI_ERS_RESULT_NONE)
> >                 return orig;
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c
> b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 06bad96..729580a 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -231,11 +231,12 @@ static int report_error_detected(struct pci_dev
> *dev, void *data)
> >                                    dev->driver ?
> >                                    "no AER-aware driver" : "no driver");
> >                 }
> > -               return 0;
> > +               vote = PCI_ERS_RESULT_NO_AER_DRIVER;
> > +       } else {
> > +               err_handler = dev->driver->err_handler;
> > +               vote = err_handler->error_detected(dev, result_data-
> >state);
> >         }
> >
> > -       err_handler = dev->driver->err_handler;
> > -       vote = err_handler->error_detected(dev, result_data->state);
> >         result_data->result = merge_result(result_data->result, vote);
> >         return 0;
> >  }
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index ee21795..fb7e869 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -538,6 +538,9 @@ enum pci_ers_result {
> >
> >         /* Device driver is fully recovered and operational */
> >         PCI_ERS_RESULT_RECOVERED = (__force pci_ers_result_t) 5,
> > +
> > +       /* No AER capabilities registered for the driver */
> > +       PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
> >  };
> >
> >  /* PCI bus error event callbacks */
> >

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-11-16  4:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-10  7:41 [ PATCH RESEND ] PCI-AER: Do not report successful error recovery for devices with AER-unaware drivers Pandarathil, Vijaymohan R
2012-11-15  0:51 ` Bjorn Helgaas
2012-11-15  1:35   ` Linas Vepstas
2012-11-15  7:09   ` Pandarathil, Vijaymohan R
2012-11-16  1:08     ` Bjorn Helgaas
2012-11-16  4:26       ` Pandarathil, Vijaymohan R

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).