Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
* [RESEND PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
@ 2020-10-10 22:16 Hedi Berriche
  2020-10-10 22:16 ` [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link() Hedi Berriche
  0 siblings, 1 reply; 4+ messages in thread
From: Hedi Berriche @ 2020-10-10 22:16 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy
  Cc: linux-kernel, linux-pci, Hedi Berriche, Russ Anderson,
	Bjorn Helgaas, Ashok Raj, Joerg Roedel, stable

This is a resend of v3 as the the original, sent over 6 hours ago, is yet
to make it to LKML.

- Changes since v2:

 * set status to PCI_ERS_RESULT_RECOVERED, in case of successful link
   reset, if and only if the initial value of error status is
   PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER.

- Changes since v1:

 * changed the commit message to clarify what broke post commit 6d2c89441571
 * dropped the misnomer post_reset_status variable in favour of a more natural
   approach that relies on a boolean to keep track of the outcome of reset_link()

After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful
reset which breaks error recovery by breaking driver (re)initialisation.

Cc: Russ Anderson <rja@hpe.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Joerg Roedel <jroedel@suse.com>

Cc: stable@kernel.org # v5.7+

---
Hedi Berriche (1):
  PCI/ERR: don't clobber status after reset_link()

 drivers/pci/pcie/err.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link()
  2020-10-10 22:16 [RESEND PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") Hedi Berriche
@ 2020-10-10 22:16 ` Hedi Berriche
  2020-10-11 17:56   ` Sinan Kaya
  0 siblings, 1 reply; 4+ messages in thread
From: Hedi Berriche @ 2020-10-10 22:16 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy
  Cc: linux-kernel, linux-pci, Hedi Berriche, Russ Anderson,
	Bjorn Helgaas, Ashok Raj, Joerg Roedel, stable

Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
broke pcie_do_recovery(): updating status after reset_link() has the ill
side effect of causing recovery to fail if the error status is
PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
code will *never* run in the case of a successful reset_link()

   177         if (status == PCI_ERS_RESULT_CAN_RECOVER) {
   ...
   181         }

   183         if (status == PCI_ERS_RESULT_NEED_RESET) {
   ...
   192         }

For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
calling ->slot_reset() (because we skip report_slot_reset()) thus
breaking driver (re)initialisation.

Don't clobber status with the return value of reset_link(); set status
to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER.

Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
Signed-off-by: Hedi Berriche <hedi.berriche@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Joerg Roedel <jroedel@suse.com>

Cc: stable@kernel.org # v5.7+
---
 drivers/pci/pcie/err.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index c543f419d8f9..2730826cfd8a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	pci_dbg(dev, "broadcast error_detected message\n");
 	if (state == pci_channel_io_frozen) {
 		pci_walk_bus(bus, report_frozen_detected, &status);
-		status = reset_link(dev);
-		if (status != PCI_ERS_RESULT_RECOVERED) {
+		if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
 			pci_warn(dev, "link reset failed\n");
 			goto failed;
+		} else {
+			if (status == PCI_ERS_RESULT_DISCONNECT ||
+			    status == PCI_ERS_RESULT_NO_AER_DRIVER)
+				status = PCI_ERS_RESULT_RECOVERED;
 		}
 	} else {
 		pci_walk_bus(bus, report_normal_detected, &status);
-- 
2.28.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link()
  2020-10-10 22:16 ` [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link() Hedi Berriche
@ 2020-10-11 17:56   ` Sinan Kaya
  2020-10-15 15:32     ` Hedi Berriche
  0 siblings, 1 reply; 4+ messages in thread
From: Sinan Kaya @ 2020-10-11 17:56 UTC (permalink / raw)
  To: Hedi Berriche, sathyanarayanan.kuppuswamy
  Cc: linux-kernel, linux-pci, Russ Anderson, Bjorn Helgaas, Ashok Raj,
	Joerg Roedel, stable

On 10/10/2020 6:16 PM, Hedi Berriche wrote:
> Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
> broke pcie_do_recovery(): updating status after reset_link() has the ill
> side effect of causing recovery to fail if the error status is
> PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
> code will *never* run in the case of a successful reset_link()
> 
>    177         if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>    ...
>    181         }
> 
>    183         if (status == PCI_ERS_RESULT_NEED_RESET) {
>    ...
>    192         }
> 
> For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
> calling ->slot_reset() (because we skip report_slot_reset()) thus
> breaking driver (re)initialisation.
> 
> Don't clobber status with the return value of reset_link(); set status
> to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
> only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
> or PCI_ERS_RESULT_NO_AER_DRIVER.
> 
> Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
> Signed-off-by: Hedi Berriche <hedi.berriche@hpe.com>
> Cc: Russ Anderson <rja@hpe.com>
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Joerg Roedel <jroedel@suse.com>
> 
> Cc: stable@kernel.org # v5.7+
> ---
>  drivers/pci/pcie/err.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index c543f419d8f9..2730826cfd8a 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>  	pci_dbg(dev, "broadcast error_detected message\n");
>  	if (state == pci_channel_io_frozen) {
>  		pci_walk_bus(bus, report_frozen_detected, &status);
> -		status = reset_link(dev);
> -		if (status != PCI_ERS_RESULT_RECOVERED) {
> +		if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
>  			pci_warn(dev, "link reset failed\n");
>  			goto failed;
> +		} else {
> +			if (status == PCI_ERS_RESULT_DISCONNECT ||
> +			    status == PCI_ERS_RESULT_NO_AER_DRIVER)
> +				status = PCI_ERS_RESULT_RECOVERED;
>  		}
>  	} else {
>  		pci_walk_bus(bus, report_normal_detected, &status);
> 

Reviewed-by: Sinan Kaya <okaya@kernel.org>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link()
  2020-10-11 17:56   ` Sinan Kaya
@ 2020-10-15 15:32     ` Hedi Berriche
  0 siblings, 0 replies; 4+ messages in thread
From: Hedi Berriche @ 2020-10-15 15:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: sathyanarayanan.kuppuswamy, linux-kernel, linux-pci,
	Russ Anderson, Bjorn Helgaas, Ashok Raj, Joerg Roedel, stable

On Sun, Oct 11, 2020 at 18:56 Sinan Kaya wrote:
>On 10/10/2020 6:16 PM, Hedi Berriche wrote:
>> Commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
>> broke pcie_do_recovery(): updating status after reset_link() has the ill
>> side effect of causing recovery to fail if the error status is
>> PCI_ERS_RESULT_CAN_RECOVER or PCI_ERS_RESULT_NEED_RESET as the following
>> code will *never* run in the case of a successful reset_link()
>>
>>    177         if (status == PCI_ERS_RESULT_CAN_RECOVER) {
>>    ...
>>    181         }
>>
>>    183         if (status == PCI_ERS_RESULT_NEED_RESET) {
>>    ...
>>    192         }
>>
>> For instance in the case of PCI_ERS_RESULT_NEED_RESET we end up not
>> calling ->slot_reset() (because we skip report_slot_reset()) thus
>> breaking driver (re)initialisation.
>>
>> Don't clobber status with the return value of reset_link(); set status
>> to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and
>> only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
>> or PCI_ERS_RESULT_NO_AER_DRIVER.
>>
>> Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
>> Signed-off-by: Hedi Berriche <hedi.berriche@hpe.com>
>> Cc: Russ Anderson <rja@hpe.com>
>> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Joerg Roedel <jroedel@suse.com>
>>
>> Cc: stable@kernel.org # v5.7+
>> ---
>>  drivers/pci/pcie/err.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>> index c543f419d8f9..2730826cfd8a 100644
>> --- a/drivers/pci/pcie/err.c
>> +++ b/drivers/pci/pcie/err.c
>> @@ -165,10 +165,13 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>>  	pci_dbg(dev, "broadcast error_detected message\n");
>>  	if (state == pci_channel_io_frozen) {
>>  		pci_walk_bus(bus, report_frozen_detected, &status);
>> -		status = reset_link(dev);
>> -		if (status != PCI_ERS_RESULT_RECOVERED) {
>> +		if (reset_link(dev) != PCI_ERS_RESULT_RECOVERED) {
>>  			pci_warn(dev, "link reset failed\n");
>>  			goto failed;
>> +		} else {
>> +			if (status == PCI_ERS_RESULT_DISCONNECT ||
>> +			    status == PCI_ERS_RESULT_NO_AER_DRIVER)
>> +				status = PCI_ERS_RESULT_RECOVERED;
>>  		}
>>  	} else {
>>  		pci_walk_bus(bus, report_normal_detected, &status);
>>
>
>Reviewed-by: Sinan Kaya <okaya@kernel.org>

Thanks for the Reviewed-by, Sinan.

Folks,

Any further comments or is this ripe for acceptance?

Cheers,
Hedi.
-- 
Be careful of reading health books, you might die of a misprint.
	-- Mark Twain

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-10 22:16 [RESEND PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") Hedi Berriche
2020-10-10 22:16 ` [RESEND PATCH v3 1/1] PCI/ERR: don't clobber status after reset_link() Hedi Berriche
2020-10-11 17:56   ` Sinan Kaya
2020-10-15 15:32     ` Hedi Berriche

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git