linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] PCI: Rework error reporting with PCIe failed link retraining
@ 2024-02-10  1:43 Maciej W. Rozycki
  2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Maciej W. Rozycki @ 2024-02-10  1:43 UTC (permalink / raw)
  To: Ilpo Järvinen, Bjorn Helgaas
  Cc: Mika Westerberg, linux-pci, linux-kernel

Hi,

 This patch series addresses issues observed by Ilpo as reported here: 
<https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/>, 
one with excessive delays happening when `pcie_failed_link_retrain' is 
called, but link retraining has not been actually attempted, and another 
one with an API misuse caused by a merge mistake.

 See individual change description for further details; 1/2 supersedes: 
<https://patchwork.kernel.org/project/linux-pci/patch/20240202134108.4096-1-ilpo.jarvinen@linux.intel.com/>, 
and 2/2 supersedes: 
<https://patchwork.kernel.org/project/linux-pci/patch/20240208132205.4550-1-ilpo.jarvinen@linux.intel.com/>.

 Unfortunately I cannot verify the changes anymore beyond just checking 
that the system `pcie_failed_link_retrain' was intended for still boots, 
because something happened that makes the problematic link not to work at 
all.

 The system was up for 88 days and the link continued working as I was 
logged in over a serial line wired through a PCIe serial option card 
further downstream and I communicated over the line just fine to log out 
in preparation for a reboot.  After reboot the link did not respond and 
after several further attempts, including reboots and power cycles, the 
link still does not respond, LBMS is never set and I couldn't ever observe 
LT being set either.  This affects U-Boot too, as previously it reported:

PCIE-0: Link up (Gen1-x8, Bus0)
PCI Autoconfig: 02.03.00: Downstream link non-functional
PCI Autoconfig: 02.03.00: Retrying with speed restricted to 2.5GT/s...
PCI Autoconfig: 02.03.00: Succeeded!

and now it only reports:

PCIE-0: Link up (Gen1-x8, Bus0)

 Interestingly enough the system had its mainboard replaced those 3 months 
ago to deal with an unrelated problem, and with the new mainboard in place 
I already had issues with the option cards downstream from the PCIe switch 
immediately wired to 02.03.0.  I had to rewire and reseat the adapter and 
cards several times before it started working reliably.  Maybe something 
has happened to the adapter board with the PCIe switch that caused it to 
stop working, hopefully permanently.  Perhaps it has something to do with 
the power supply connection, which is via an FDC/Berg connector, not my 
favourite one.

 I have four such adapter boards total, so I can try and see if I am able 
to revive the original one or use a replacement one, but it won't happen 
right away, as I have the system installed in a remote lab ~1000mi/1600km 
away from me.  I'll try to bring the system back to fully working order at 
the next opportunity, but it is inconvenient to me to travel there right 
now just to address this problem, so it'll be a couple of weeks and likely 
more before I am able to say something.  I hope it's not the new mainboard 
(PCIe devices in the other slots work just fine).

 Hopefully I'll be able fix it one way or another and will be able to 
report on LBMS behaviour too, that is whether it retriggers with every 
link training iteration or not.

 Meanwhile the patches are hopefully obvious enough to apply.

  Maciej

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] PCI: Correct error reporting with PCIe failed link retraining
  2024-02-10  1:43 [PATCH 0/2] PCI: Rework error reporting with PCIe failed link retraining Maciej W. Rozycki
@ 2024-02-10  1:43 ` Maciej W. Rozycki
  2024-02-12 12:01   ` Ilpo Järvinen
                     ` (2 more replies)
  2024-02-10  1:43 ` [PATCH 2/2] PCI: Use an error code " Maciej W. Rozycki
  2024-02-26 12:55 ` [PATCH 0/2] PCI: Rework error reporting " Maciej W. Rozycki
  2 siblings, 3 replies; 10+ messages in thread
From: Maciej W. Rozycki @ 2024-02-10  1:43 UTC (permalink / raw)
  To: Ilpo Järvinen, Bjorn Helgaas
  Cc: Mika Westerberg, linux-pci, linux-kernel

Only return successful completion status from `pcie_failed_link_retrain' 
if retraining has actually been done, preventing excessive delays from 
being triggered at call sites in a hope that communication will finally 
be established with the downstream device where in fact nothing has been 
done about the link in question that would justify such a hope.

Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # v6.5+
---
 drivers/pci/quirks.c |   18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

linux-pcie-failed-link-retrain-status-fix.diff
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -74,7 +74,8 @@
  * firmware may have already arranged and lift it with ports that already
  * report their data link being up.
  *
- * Return TRUE if the link has been successfully retrained, otherwise FALSE.
+ * Return TRUE if the link has been successfully retrained, otherwise FALSE,
+ * also when retraining was not needed in the first place.
  */
 bool pcie_failed_link_retrain(struct pci_dev *dev)
 {
@@ -83,10 +84,11 @@ bool pcie_failed_link_retrain(struct pci
 		{}
 	};
 	u16 lnksta, lnkctl2;
+	bool ret = false;
 
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
 	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
-		return false;
+		return ret;
 
 	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
 	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
@@ -98,9 +100,10 @@ bool pcie_failed_link_retrain(struct pci
 		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
 		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
 
-		if (pcie_retrain_link(dev, false)) {
+		ret = pcie_retrain_link(dev, false) == 0;
+		if (!ret) {
 			pci_info(dev, "retraining failed\n");
-			return false;
+			return ret;
 		}
 
 		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
@@ -117,13 +120,14 @@ bool pcie_failed_link_retrain(struct pci
 		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
 		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
 
-		if (pcie_retrain_link(dev, false)) {
+		ret = pcie_retrain_link(dev, false) == 0;
+		if (!ret) {
 			pci_info(dev, "retraining failed\n");
-			return false;
+			return ret;
 		}
 	}
 
-	return true;
+	return ret;
 }
 
 static ktime_t fixup_debug_start(struct pci_dev *dev,

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/2] PCI: Use an error code with PCIe failed link retraining
  2024-02-10  1:43 [PATCH 0/2] PCI: Rework error reporting with PCIe failed link retraining Maciej W. Rozycki
  2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
@ 2024-02-10  1:43 ` Maciej W. Rozycki
  2024-02-12 12:11   ` Ilpo Järvinen
  2024-02-26 12:55 ` [PATCH 0/2] PCI: Rework error reporting " Maciej W. Rozycki
  2 siblings, 1 reply; 10+ messages in thread
From: Maciej W. Rozycki @ 2024-02-10  1:43 UTC (permalink / raw)
  To: Ilpo Järvinen, Bjorn Helgaas
  Cc: Mika Westerberg, linux-pci, linux-kernel

Given how the call place in `pcie_wait_for_link_delay' got structured 
now, and that `pcie_retrain_link' returns a potentially useful error 
code, convert `pcie_failed_link_retrain' to return an error code rather 
than a boolean status, fixing handling at the call site mentioned.  
Update the other call site accordingly.

Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: stable@vger.kernel.org # v6.5+
---
 drivers/pci/pci.c    |    2 +-
 drivers/pci/pci.h    |    6 +++---
 drivers/pci/quirks.c |   14 +++++++-------
 3 files changed, 11 insertions(+), 11 deletions(-)

linux-pcie-failed-link-retrain-status-int.diff
Index: linux-macro/drivers/pci/pci.c
===================================================================
--- linux-macro.orig/drivers/pci/pci.c
+++ linux-macro/drivers/pci/pci.c
@@ -1263,7 +1263,7 @@ static int pci_dev_wait(struct pci_dev *
 		if (delay > PCI_RESET_WAIT) {
 			if (retrain) {
 				retrain = false;
-				if (pcie_failed_link_retrain(bridge)) {
+				if (pcie_failed_link_retrain(bridge) == 0) {
 					delay = 1;
 					continue;
 				}
Index: linux-macro/drivers/pci/pci.h
===================================================================
--- linux-macro.orig/drivers/pci/pci.h
+++ linux-macro/drivers/pci/pci.h
@@ -540,7 +540,7 @@ void pci_acs_init(struct pci_dev *dev);
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
 int pci_dev_specific_enable_acs(struct pci_dev *dev);
 int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
-bool pcie_failed_link_retrain(struct pci_dev *dev);
+int pcie_failed_link_retrain(struct pci_dev *dev);
 #else
 static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
 					       u16 acs_flags)
@@ -555,9 +555,9 @@ static inline int pci_dev_specific_disab
 {
 	return -ENOTTY;
 }
-static inline bool pcie_failed_link_retrain(struct pci_dev *dev)
+static inline int pcie_failed_link_retrain(struct pci_dev *dev)
 {
-	return false;
+	return -ENOTTY;
 }
 #endif
 
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -74,17 +74,17 @@
  * firmware may have already arranged and lift it with ports that already
  * report their data link being up.
  *
- * Return TRUE if the link has been successfully retrained, otherwise FALSE,
+ * Return 0 if the link has been successfully retrained, otherwise an error,
  * also when retraining was not needed in the first place.
  */
-bool pcie_failed_link_retrain(struct pci_dev *dev)
+int pcie_failed_link_retrain(struct pci_dev *dev)
 {
 	static const struct pci_device_id ids[] = {
 		{ PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
 		{}
 	};
 	u16 lnksta, lnkctl2;
-	bool ret = false;
+	int ret = -ENOTTY;
 
 	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
 	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
@@ -100,8 +100,8 @@ bool pcie_failed_link_retrain(struct pci
 		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
 		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
 
-		ret = pcie_retrain_link(dev, false) == 0;
-		if (!ret) {
+		ret = pcie_retrain_link(dev, false);
+		if (ret) {
 			pci_info(dev, "retraining failed\n");
 			return ret;
 		}
@@ -120,8 +120,8 @@ bool pcie_failed_link_retrain(struct pci
 		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
 		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
 
-		ret = pcie_retrain_link(dev, false) == 0;
-		if (!ret) {
+		ret = pcie_retrain_link(dev, false);
+		if (ret) {
 			pci_info(dev, "retraining failed\n");
 			return ret;
 		}

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] PCI: Correct error reporting with PCIe failed link retraining
  2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
@ 2024-02-12 12:01   ` Ilpo Järvinen
  2024-03-14  7:19   ` Pengfei Xu
  2024-04-24 22:13   ` Bjorn Helgaas
  2 siblings, 0 replies; 10+ messages in thread
From: Ilpo Järvinen @ 2024-02-12 12:01 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Bjorn Helgaas, Mika Westerberg, linux-pci, LKML

[-- Attachment #1: Type: text/plain, Size: 3242 bytes --]

On Sat, 10 Feb 2024, Maciej W. Rozycki wrote:

> Only return successful completion status from `pcie_failed_link_retrain' 
> if retraining has actually been done, preventing excessive delays from 
> being triggered at call sites in a hope that communication will finally 
> be established with the downstream device where in fact nothing has been 
> done about the link in question that would justify such a hope.
> 
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: stable@vger.kernel.org # v6.5+

Thanks.

The original thread might be useful for context if somebody has to look at 
this change later on from the history, so:

Link: https://lore.kernel.org/linux-pci/20240129112710.2852-2-ilpo.jarvinen@linux.intel.com/T/#u

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

-- 
 i.

> ---
>  drivers/pci/quirks.c |   18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> linux-pcie-failed-link-retrain-status-fix.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -74,7 +74,8 @@
>   * firmware may have already arranged and lift it with ports that already
>   * report their data link being up.
>   *
> - * Return TRUE if the link has been successfully retrained, otherwise FALSE.
> + * Return TRUE if the link has been successfully retrained, otherwise FALSE,
> + * also when retraining was not needed in the first place.
>   */
>  bool pcie_failed_link_retrain(struct pci_dev *dev)
>  {
> @@ -83,10 +84,11 @@ bool pcie_failed_link_retrain(struct pci
>  		{}
>  	};
>  	u16 lnksta, lnkctl2;
> +	bool ret = false;
>  
>  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
>  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> -		return false;
> +		return ret;
>  
>  	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
>  	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> @@ -98,9 +100,10 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		if (pcie_retrain_link(dev, false)) {
> +		ret = pcie_retrain_link(dev, false) == 0;
> +		if (!ret) {
>  			pci_info(dev, "retraining failed\n");
> -			return false;
> +			return ret;
>  		}
>  
>  		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> @@ -117,13 +120,14 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		if (pcie_retrain_link(dev, false)) {
> +		ret = pcie_retrain_link(dev, false) == 0;
> +		if (!ret) {
>  			pci_info(dev, "retraining failed\n");
> -			return false;
> +			return ret;
>  		}
>  	}
>  
> -	return true;
> +	return ret;
>  }
>  
>  static ktime_t fixup_debug_start(struct pci_dev *dev,
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] PCI: Use an error code with PCIe failed link retraining
  2024-02-10  1:43 ` [PATCH 2/2] PCI: Use an error code " Maciej W. Rozycki
@ 2024-02-12 12:11   ` Ilpo Järvinen
  0 siblings, 0 replies; 10+ messages in thread
From: Ilpo Järvinen @ 2024-02-12 12:11 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Bjorn Helgaas, Mika Westerberg, linux-pci, LKML

[-- Attachment #1: Type: text/plain, Size: 4327 bytes --]

On Sat, 10 Feb 2024, Maciej W. Rozycki wrote:

> Given how the call place in `pcie_wait_for_link_delay' got structured 
> now, and that `pcie_retrain_link' returns a potentially useful error 
> code, convert `pcie_failed_link_retrain' to return an error code rather 
> than a boolean status, fixing handling at the call site mentioned.  
> Update the other call site accordingly.
> 
> Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
> Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: stable@vger.kernel.org # v6.5+
> ---
>  drivers/pci/pci.c    |    2 +-
>  drivers/pci/pci.h    |    6 +++---
>  drivers/pci/quirks.c |   14 +++++++-------
>  3 files changed, 11 insertions(+), 11 deletions(-)
> 
> linux-pcie-failed-link-retrain-status-int.diff
> Index: linux-macro/drivers/pci/pci.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/pci.c
> +++ linux-macro/drivers/pci/pci.c
> @@ -1263,7 +1263,7 @@ static int pci_dev_wait(struct pci_dev *
>  		if (delay > PCI_RESET_WAIT) {
>  			if (retrain) {
>  				retrain = false;
> -				if (pcie_failed_link_retrain(bridge)) {
> +				if (pcie_failed_link_retrain(bridge) == 0) {
>  					delay = 1;
>  					continue;
>  				}
> Index: linux-macro/drivers/pci/pci.h
> ===================================================================
> --- linux-macro.orig/drivers/pci/pci.h
> +++ linux-macro/drivers/pci/pci.h
> @@ -540,7 +540,7 @@ void pci_acs_init(struct pci_dev *dev);
>  int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
>  int pci_dev_specific_enable_acs(struct pci_dev *dev);
>  int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
> -bool pcie_failed_link_retrain(struct pci_dev *dev);
> +int pcie_failed_link_retrain(struct pci_dev *dev);
>  #else
>  static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
>  					       u16 acs_flags)
> @@ -555,9 +555,9 @@ static inline int pci_dev_specific_disab
>  {
>  	return -ENOTTY;
>  }
> -static inline bool pcie_failed_link_retrain(struct pci_dev *dev)
> +static inline int pcie_failed_link_retrain(struct pci_dev *dev)
>  {
> -	return false;
> +	return -ENOTTY;
>  }
>  #endif
>  
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -74,17 +74,17 @@
>   * firmware may have already arranged and lift it with ports that already
>   * report their data link being up.
>   *
> - * Return TRUE if the link has been successfully retrained, otherwise FALSE,
> + * Return 0 if the link has been successfully retrained, otherwise an error,
>   * also when retraining was not needed in the first place.
>   */
> -bool pcie_failed_link_retrain(struct pci_dev *dev)
> +int pcie_failed_link_retrain(struct pci_dev *dev)
>  {
>  	static const struct pci_device_id ids[] = {
>  		{ PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
>  		{}
>  	};
>  	u16 lnksta, lnkctl2;
> -	bool ret = false;
> +	int ret = -ENOTTY;
>  
>  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
>  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> @@ -100,8 +100,8 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		ret = pcie_retrain_link(dev, false) == 0;
> -		if (!ret) {
> +		ret = pcie_retrain_link(dev, false);
> +		if (ret) {
>  			pci_info(dev, "retraining failed\n");
>  			return ret;
>  		}
> @@ -120,8 +120,8 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		ret = pcie_retrain_link(dev, false) == 0;
> -		if (!ret) {
> +		ret = pcie_retrain_link(dev, false);
> +		if (ret) {
>  			pci_info(dev, "retraining failed\n");
>  			return ret;
>  		}

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

I closed my patch from patchwork that is not need after this change.

-- 
 i.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] PCI: Rework error reporting with PCIe failed link retraining
  2024-02-10  1:43 [PATCH 0/2] PCI: Rework error reporting with PCIe failed link retraining Maciej W. Rozycki
  2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
  2024-02-10  1:43 ` [PATCH 2/2] PCI: Use an error code " Maciej W. Rozycki
@ 2024-02-26 12:55 ` Maciej W. Rozycki
  2 siblings, 0 replies; 10+ messages in thread
From: Maciej W. Rozycki @ 2024-02-26 12:55 UTC (permalink / raw)
  To: Ilpo Järvinen, Bjorn Helgaas
  Cc: Mika Westerberg, linux-pci, linux-kernel

On Sat, 10 Feb 2024, Maciej W. Rozycki wrote:

> Unfortunately I cannot verify the changes anymore beyond just checking 
> that the system `pcie_failed_link_retrain' was intended for still boots, 
> because something happened that makes the problematic link not to work at 
> all.
[...]
>  Hopefully I'll be able fix it one way or another and will be able to 
> report on LBMS behaviour too, that is whether it retriggers with every 
> link training iteration or not.

 As mentioned in the other thread, I have now fixed the hardware failure, 
and I'll try to collect information on LBMS reporting sometime this week.  
I've brought additional hardware too in case I couldn't address the issue, 
so I may be able to find out more.  I'll report on my findings once I have 
them.

  Maciej

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] PCI: Correct error reporting with PCIe failed link retraining
  2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
  2024-02-12 12:01   ` Ilpo Järvinen
@ 2024-03-14  7:19   ` Pengfei Xu
  2024-03-14 11:27     ` Ilpo Järvinen
  2024-04-24 22:13   ` Bjorn Helgaas
  2 siblings, 1 reply; 10+ messages in thread
From: Pengfei Xu @ 2024-03-14  7:19 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Ilpo Järvinen, Bjorn Helgaas, Mika Westerberg, linux-pci,
	linux-kernel

Hi Maciej W. Rozycki,

Greeting!

We tested intel internal Linux next kernel by syzkaller fuzz testing and met
"CPU soft lockup in __run_timers" problem in guest.

Bisect and find that the following patch is the first bad commit.
Check the commit content is same as following patch.

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240313_174939___run_timers
Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/repro.c
Syzkaller reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/repro.prog
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/bisect_info.log
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/0a53950322bc80aeebf56f5a9d38c847186a082a_dmesg.log
bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240313_174939___run_timers/bzImage_v6.8_problem.tar.gz

"
[   34.599994] hrtimer: interrupt took 63789 ns
[  102.443957] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 3343956725 wd_nsec: 3343956627
[  393.043326] watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [repro:2653]
[  393.043704] Modules linked in:
[  393.043856] irq event stamp: 17486443
[  393.044019] hardirqs last  enabled at (17486442): [<ffffffff855b0ebe>] irqentry_exit+0x3e/0xa0
[  393.044462] hardirqs last disabled at (17486443): [<ffffffff855aed94>] sysvec_apic_timer_interrupt+0x14/0xc0
[  393.044925] softirqs last  enabled at (14336234): [<ffffffff8126c828>] __irq_exit_rcu+0xa8/0x110
[  393.045332] softirqs last disabled at (14336237): [<ffffffff8126c828>] __irq_exit_rcu+0xa8/0x110
[  393.045740] CPU: 1 PID: 2653 Comm: repro Not tainted 6.8.0-0a53950322bc+ #1
[  393.046062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[  393.046618] RIP: 0010:preempt_count_sub+0x6c/0x160
[  393.046856] Code: ed 07 85 c9 75 1b 65 8b 05 d9 7b d2 7e 89 c2 81 e2 ff ff ff 7f 39 da 7c 1b 81 fb fe 00 00 00 76 6b f7 db 65 01 1d bc 7b d2 7e <48> 8b 5d f8 c9 c3 cc cc cc cc e8 d5 36 9d 01 85 c0 74 ed 48 c7 c0
[  393.047696] RSP: 0018:ffff88806cd09da0 EFLAGS: 00000213
[  393.047950] RAX: 0000000000000102 RBX: 00000000ffffffff RCX: 0000000000000000
[  393.048281] RDX: 0000000000000102 RSI: 0000000000000102 RDI: 0000000000000001
[  393.048611] RBP: ffff88806cd09da8 R08: 0000000000000001 R09: fffffbfff11caa46
[  393.048943] R10: ffffffff88e55237 R11: 0000000000000001 R12: ffff88806cd32a00
[  393.049277] R13: ffffffff81516e90 R14: dffffc0000000000 R15: ffff88806cd32a00
[  393.049611] FS:  0000000000000000(0000) GS:ffff88806cd00000(0000) knlGS:0000000000000000
[  393.049987] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  393.050261] CR2: 00007febf5bfa838 CR3: 000000000bda4006 CR4: 0000000000770ef0
[  393.050596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  393.050928] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[  393.051262] PKRU: 55555554
[  393.051396] Call Trace:
[  393.051521]  <IRQ>
[  393.051630]  ? show_regs+0xa9/0xc0
[  393.051805]  ? watchdog_timer_fn+0x531/0x6b0
[  393.052019]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  393.052255]  ? __hrtimer_run_queues+0x1a4/0xc00
[  393.052505]  ? __pfx___hrtimer_run_queues+0x10/0x10
[  393.052770]  ? hrtimer_interrupt+0x324/0x7a0
[  393.052994]  ? __sysvec_apic_timer_interrupt+0x105/0x3c0
[  393.053254]  ? sysvec_apic_timer_interrupt+0x4b/0xc0
[  393.053495]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[  393.053757]  ? __pfx_clocksource_watchdog+0x10/0x10
[  393.054002]  ? preempt_count_sub+0x6c/0x160
[  393.054213]  _raw_spin_unlock_irq+0x3c/0x60
[  393.054423]  __run_timers.part.0+0x6c9/0xa40
[  393.054647]  ? __pfx___run_timers.part.0+0x10/0x10
[  393.054883]  ? sysvec_irq_work+0x8b/0x100
[  393.055087]  ? asm_sysvec_irq_work+0x1f/0x30
[  393.055314]  run_timer_softirq+0xbc/0x1c0
[  393.055517]  __do_softirq+0x1cb/0x84a
[  393.055709]  __irq_exit_rcu+0xa8/0x110
[  393.055899]  irq_exit_rcu+0x12/0x30
[  393.056078]  sysvec_apic_timer_interrupt+0xa5/0xc0
[  393.056314]  </IRQ>
[  393.056422]  <TASK>
[  393.056532]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
[  393.056781] RIP: 0010:__schedule+0xbe/0x30c0
[  393.056993] Code: c0 e8 e6 a8 fd ff 48 98 48 8d 3c c5 a0 dc 75 86 48 89 fa 48 c1 ea 03 42 80 3c 2a 00 0f 85 81 28 00 00 4c 03 24 c5 a0 dc 75 86 <49> 8d 84 24 58 0a 00 00 48 89 c2 48 89 85 e0 fe ff ff 48 b8 00 00
[  393.057847] RSP: 0018:ffff888024f3f6e8 EFLAGS: 00000282
[  393.058101] RAX: 0000000000000001 RBX: 0000000000046380 RCX: 0000000000000001
[  393.058437] RDX: 1ffffffff0cebb95 RSI: 0000000000000001 RDI: ffffffff8675dca8
[  393.058771] RBP: ffff888024f3f820 R08: 0000000000000001 R09: 0000000000000002
[  393.059107] R10: 0000000000406000 R11: 0000000000000001 R12: ffff88806cd46380
[  393.059442] R13: dffffc0000000000 R14: ffffffff855da674 R15: 0000000000406000
[  393.059779]  ? __cond_resched+0x24/0x30
[  393.059982]  ? __schedule+0x9a/0x30c0
[  393.060173]  ? lockdep_hardirqs_on+0x8a/0x110
[  393.060408]  ? trace_hardirqs_on+0x26/0x120
[  393.060637]  ? __pfx___schedule+0x10/0x10
[  393.060836]  ? __this_cpu_preempt_check+0x21/0x30
[  393.061068]  ? lock_release+0x417/0x7e0
[  393.061267]  ? __this_cpu_preempt_check+0x21/0x30
[  393.061497]  ? lock_is_held_type+0xf0/0x150
[  393.061705]  ? __cond_resched+0x24/0x30
[  393.061895]  preempt_schedule_common+0x4a/0xd0
[  393.062118]  __cond_resched+0x24/0x30
[  393.062302]  unmap_page_range+0xab4/0x3690
[  393.062530]  ? __pfx_unmap_page_range+0x10/0x10
[  393.062752]  ? __this_cpu_preempt_check+0x21/0x30
[  393.062989]  ? uprobe_munmap+0xb0/0x590
[  393.063186]  unmap_single_vma+0x1ac/0x2d0
[  393.063390]  unmap_vmas+0x210/0x470
[  393.063572]  ? __pfx_unmap_vmas+0x10/0x10
[  393.063769]  ? __pfx_lock_release+0x10/0x10
[  393.063977]  ? lock_release+0x417/0x7e0
[  393.064167]  ? __pfx_folio_batch_move_lru+0x10/0x10
[  393.064410]  ? __pfx_lock_release+0x10/0x10
[  393.064624]  ? mlock_drain_local+0x281/0x4b0
[  393.064844]  exit_mmap+0x19b/0xac0
[  393.065018]  ? mark_lock.part.0+0xf3/0x17a0
[  393.065233]  ? __pfx_exit_mmap+0x10/0x10
[  393.065427]  ? __kasan_check_write+0x18/0x20
[  393.065644]  ? __pfx___mutex_unlock_slowpath+0x10/0x10
[  393.065909]  ? mutex_unlock+0x16/0x20
[  393.066096]  __mmput+0xde/0x3e0
[  393.066256]  mmput+0x74/0x90
[  393.066407]  do_exit+0xa59/0x28c0
[  393.066580]  ? lock_release+0x417/0x7e0
[  393.066778]  ? __pfx_lock_release+0x10/0x10
[  393.066989]  ? __pfx_do_exit+0x10/0x10
[  393.067179]  ? __this_cpu_preempt_check+0x21/0x30
[  393.067410]  ? _raw_spin_unlock_irq+0x2c/0x60
[  393.067630]  ? lockdep_hardirqs_on+0x8a/0x110
[  393.067845]  ? _raw_spin_unlock_irq+0x2c/0x60
[  393.068064]  ? trace_hardirqs_on+0x26/0x120
[  393.068287]  do_group_exit+0xe5/0x2c0
[  393.068489]  __x64_sys_exit_group+0x4d/0x60
[  393.068707]  do_syscall_64+0x73/0x150
[  393.068894]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  393.069140] RIP: 0033:0x7febf5b18a4d
[  393.069319] Code: Unable to access opcode bytes at 0x7febf5b18a23.
"

---

If you don't need the following environment to reproduce the problem or if you
already have one reproduced environment, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
  // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
  // You could change the bzImage_xxx as you want
  // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install
---

Best Regards,
Thanks!

On 2024-02-10 at 01:43:50 +0000, Maciej W. Rozycki wrote:
> Only return successful completion status from `pcie_failed_link_retrain' 
> if retraining has actually been done, preventing excessive delays from 
> being triggered at call sites in a hope that communication will finally 
> be established with the downstream device where in fact nothing has been 
> done about the link in question that would justify such a hope.
> 
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: stable@vger.kernel.org # v6.5+
> ---
>  drivers/pci/quirks.c |   18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> linux-pcie-failed-link-retrain-status-fix.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -74,7 +74,8 @@
>   * firmware may have already arranged and lift it with ports that already
>   * report their data link being up.
>   *
> - * Return TRUE if the link has been successfully retrained, otherwise FALSE.
> + * Return TRUE if the link has been successfully retrained, otherwise FALSE,
> + * also when retraining was not needed in the first place.
>   */
>  bool pcie_failed_link_retrain(struct pci_dev *dev)
>  {
> @@ -83,10 +84,11 @@ bool pcie_failed_link_retrain(struct pci
>  		{}
>  	};
>  	u16 lnksta, lnkctl2;
> +	bool ret = false;
>  
>  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
>  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> -		return false;
> +		return ret;
>  
>  	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
>  	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> @@ -98,9 +100,10 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		if (pcie_retrain_link(dev, false)) {
> +		ret = pcie_retrain_link(dev, false) == 0;
> +		if (!ret) {
>  			pci_info(dev, "retraining failed\n");
> -			return false;
> +			return ret;
>  		}
>  
>  		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> @@ -117,13 +120,14 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		if (pcie_retrain_link(dev, false)) {
> +		ret = pcie_retrain_link(dev, false) == 0;
> +		if (!ret) {
>  			pci_info(dev, "retraining failed\n");
> -			return false;
> +			return ret;
>  		}
>  	}
>  
> -	return true;
> +	return ret;
>  }
>  
>  static ktime_t fixup_debug_start(struct pci_dev *dev,

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] PCI: Correct error reporting with PCIe failed link retraining
  2024-03-14  7:19   ` Pengfei Xu
@ 2024-03-14 11:27     ` Ilpo Järvinen
  2024-03-15  3:10       ` Pengfei Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Ilpo Järvinen @ 2024-03-14 11:27 UTC (permalink / raw)
  To: Pengfei Xu
  Cc: Maciej W. Rozycki, Bjorn Helgaas, Mika Westerberg, linux-pci, LKML

[-- Attachment #1: Type: text/plain, Size: 12731 bytes --]

On Thu, 14 Mar 2024, Pengfei Xu wrote:

> Hi Maciej W. Rozycki,
> 
> Greeting!
> 
> We tested intel internal Linux next kernel by syzkaller fuzz testing and met
> "CPU soft lockup in __run_timers" problem in guest.
> 
> Bisect and find that the following patch is the first bad commit.

Hi,

I'm skeptical of this bisect result (perhaps some of the kernels marked 
good are actually bad because the problem does not always trigger?).

Did you check what is the result if you put that patch on top of 6.8?

-- 
 i.

> Check the commit content is same as following patch.
>
> All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240313_174939___run_timers
> Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/repro.c
> Syzkaller reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/repro.prog
> Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/bisect_info.log
> Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/0a53950322bc80aeebf56f5a9d38c847186a082a_dmesg.log
> bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240313_174939___run_timers/bzImage_v6.8_problem.tar.gz
> 
> "
> [   34.599994] hrtimer: interrupt took 63789 ns
> [  102.443957] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 3343956725 wd_nsec: 3343956627
> [  393.043326] watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [repro:2653]
> [  393.043704] Modules linked in:
> [  393.043856] irq event stamp: 17486443
> [  393.044019] hardirqs last  enabled at (17486442): [<ffffffff855b0ebe>] irqentry_exit+0x3e/0xa0
> [  393.044462] hardirqs last disabled at (17486443): [<ffffffff855aed94>] sysvec_apic_timer_interrupt+0x14/0xc0
> [  393.044925] softirqs last  enabled at (14336234): [<ffffffff8126c828>] __irq_exit_rcu+0xa8/0x110
> [  393.045332] softirqs last disabled at (14336237): [<ffffffff8126c828>] __irq_exit_rcu+0xa8/0x110
> [  393.045740] CPU: 1 PID: 2653 Comm: repro Not tainted 6.8.0-0a53950322bc+ #1
> [  393.046062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [  393.046618] RIP: 0010:preempt_count_sub+0x6c/0x160
> [  393.046856] Code: ed 07 85 c9 75 1b 65 8b 05 d9 7b d2 7e 89 c2 81 e2 ff ff ff 7f 39 da 7c 1b 81 fb fe 00 00 00 76 6b f7 db 65 01 1d bc 7b d2 7e <48> 8b 5d f8 c9 c3 cc cc cc cc e8 d5 36 9d 01 85 c0 74 ed 48 c7 c0
> [  393.047696] RSP: 0018:ffff88806cd09da0 EFLAGS: 00000213
> [  393.047950] RAX: 0000000000000102 RBX: 00000000ffffffff RCX: 0000000000000000
> [  393.048281] RDX: 0000000000000102 RSI: 0000000000000102 RDI: 0000000000000001
> [  393.048611] RBP: ffff88806cd09da8 R08: 0000000000000001 R09: fffffbfff11caa46
> [  393.048943] R10: ffffffff88e55237 R11: 0000000000000001 R12: ffff88806cd32a00
> [  393.049277] R13: ffffffff81516e90 R14: dffffc0000000000 R15: ffff88806cd32a00
> [  393.049611] FS:  0000000000000000(0000) GS:ffff88806cd00000(0000) knlGS:0000000000000000
> [  393.049987] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  393.050261] CR2: 00007febf5bfa838 CR3: 000000000bda4006 CR4: 0000000000770ef0
> [  393.050596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  393.050928] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> [  393.051262] PKRU: 55555554
> [  393.051396] Call Trace:
> [  393.051521]  <IRQ>
> [  393.051630]  ? show_regs+0xa9/0xc0
> [  393.051805]  ? watchdog_timer_fn+0x531/0x6b0
> [  393.052019]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [  393.052255]  ? __hrtimer_run_queues+0x1a4/0xc00
> [  393.052505]  ? __pfx___hrtimer_run_queues+0x10/0x10
> [  393.052770]  ? hrtimer_interrupt+0x324/0x7a0
> [  393.052994]  ? __sysvec_apic_timer_interrupt+0x105/0x3c0
> [  393.053254]  ? sysvec_apic_timer_interrupt+0x4b/0xc0
> [  393.053495]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
> [  393.053757]  ? __pfx_clocksource_watchdog+0x10/0x10
> [  393.054002]  ? preempt_count_sub+0x6c/0x160
> [  393.054213]  _raw_spin_unlock_irq+0x3c/0x60
> [  393.054423]  __run_timers.part.0+0x6c9/0xa40
> [  393.054647]  ? __pfx___run_timers.part.0+0x10/0x10
> [  393.054883]  ? sysvec_irq_work+0x8b/0x100
> [  393.055087]  ? asm_sysvec_irq_work+0x1f/0x30
> [  393.055314]  run_timer_softirq+0xbc/0x1c0
> [  393.055517]  __do_softirq+0x1cb/0x84a
> [  393.055709]  __irq_exit_rcu+0xa8/0x110
> [  393.055899]  irq_exit_rcu+0x12/0x30
> [  393.056078]  sysvec_apic_timer_interrupt+0xa5/0xc0
> [  393.056314]  </IRQ>
> [  393.056422]  <TASK>
> [  393.056532]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
> [  393.056781] RIP: 0010:__schedule+0xbe/0x30c0
> [  393.056993] Code: c0 e8 e6 a8 fd ff 48 98 48 8d 3c c5 a0 dc 75 86 48 89 fa 48 c1 ea 03 42 80 3c 2a 00 0f 85 81 28 00 00 4c 03 24 c5 a0 dc 75 86 <49> 8d 84 24 58 0a 00 00 48 89 c2 48 89 85 e0 fe ff ff 48 b8 00 00
> [  393.057847] RSP: 0018:ffff888024f3f6e8 EFLAGS: 00000282
> [  393.058101] RAX: 0000000000000001 RBX: 0000000000046380 RCX: 0000000000000001
> [  393.058437] RDX: 1ffffffff0cebb95 RSI: 0000000000000001 RDI: ffffffff8675dca8
> [  393.058771] RBP: ffff888024f3f820 R08: 0000000000000001 R09: 0000000000000002
> [  393.059107] R10: 0000000000406000 R11: 0000000000000001 R12: ffff88806cd46380
> [  393.059442] R13: dffffc0000000000 R14: ffffffff855da674 R15: 0000000000406000
> [  393.059779]  ? __cond_resched+0x24/0x30
> [  393.059982]  ? __schedule+0x9a/0x30c0
> [  393.060173]  ? lockdep_hardirqs_on+0x8a/0x110
> [  393.060408]  ? trace_hardirqs_on+0x26/0x120
> [  393.060637]  ? __pfx___schedule+0x10/0x10
> [  393.060836]  ? __this_cpu_preempt_check+0x21/0x30
> [  393.061068]  ? lock_release+0x417/0x7e0
> [  393.061267]  ? __this_cpu_preempt_check+0x21/0x30
> [  393.061497]  ? lock_is_held_type+0xf0/0x150
> [  393.061705]  ? __cond_resched+0x24/0x30
> [  393.061895]  preempt_schedule_common+0x4a/0xd0
> [  393.062118]  __cond_resched+0x24/0x30
> [  393.062302]  unmap_page_range+0xab4/0x3690
> [  393.062530]  ? __pfx_unmap_page_range+0x10/0x10
> [  393.062752]  ? __this_cpu_preempt_check+0x21/0x30
> [  393.062989]  ? uprobe_munmap+0xb0/0x590
> [  393.063186]  unmap_single_vma+0x1ac/0x2d0
> [  393.063390]  unmap_vmas+0x210/0x470
> [  393.063572]  ? __pfx_unmap_vmas+0x10/0x10
> [  393.063769]  ? __pfx_lock_release+0x10/0x10
> [  393.063977]  ? lock_release+0x417/0x7e0
> [  393.064167]  ? __pfx_folio_batch_move_lru+0x10/0x10
> [  393.064410]  ? __pfx_lock_release+0x10/0x10
> [  393.064624]  ? mlock_drain_local+0x281/0x4b0
> [  393.064844]  exit_mmap+0x19b/0xac0
> [  393.065018]  ? mark_lock.part.0+0xf3/0x17a0
> [  393.065233]  ? __pfx_exit_mmap+0x10/0x10
> [  393.065427]  ? __kasan_check_write+0x18/0x20
> [  393.065644]  ? __pfx___mutex_unlock_slowpath+0x10/0x10
> [  393.065909]  ? mutex_unlock+0x16/0x20
> [  393.066096]  __mmput+0xde/0x3e0
> [  393.066256]  mmput+0x74/0x90
> [  393.066407]  do_exit+0xa59/0x28c0
> [  393.066580]  ? lock_release+0x417/0x7e0
> [  393.066778]  ? __pfx_lock_release+0x10/0x10
> [  393.066989]  ? __pfx_do_exit+0x10/0x10
> [  393.067179]  ? __this_cpu_preempt_check+0x21/0x30
> [  393.067410]  ? _raw_spin_unlock_irq+0x2c/0x60
> [  393.067630]  ? lockdep_hardirqs_on+0x8a/0x110
> [  393.067845]  ? _raw_spin_unlock_irq+0x2c/0x60
> [  393.068064]  ? trace_hardirqs_on+0x26/0x120
> [  393.068287]  do_group_exit+0xe5/0x2c0
> [  393.068489]  __x64_sys_exit_group+0x4d/0x60
> [  393.068707]  do_syscall_64+0x73/0x150
> [  393.068894]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [  393.069140] RIP: 0033:0x7febf5b18a4d
> [  393.069319] Code: Unable to access opcode bytes at 0x7febf5b18a23.
> "
> 
> ---
> 
> If you don't need the following environment to reproduce the problem or if you
> already have one reproduced environment, please ignore the following information.
> 
> How to reproduce:
> git clone https://gitlab.com/xupengfe/repro_vm_env.git
> cd repro_vm_env
> tar -xvf repro_vm_env.tar.gz
> cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
>   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
>   // You could change the bzImage_xxx as you want
>   // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
> You could use below command to log in, there is no password for root.
> ssh -p 10023 root@localhost
> 
> After login vm(virtual machine) successfully, you could transfer reproduced
> binary to the vm by below way, and reproduce the problem in vm:
> gcc -pthread -o repro repro.c
> scp -P 10023 repro root@localhost:/root/
> 
> Get the bzImage for target kernel:
> Please use target kconfig and copy it to kernel_src/.config
> make olddefconfig
> make -jx bzImage           //x should equal or less than cpu num your pc has
> 
> Fill the bzImage file into above start3.sh to load the target kernel in vm.
> 
> 
> Tips:
> If you already have qemu-system-x86_64, please ignore below info.
> If you want to install qemu v7.1.0 version:
> git clone https://github.com/qemu/qemu.git
> cd qemu
> git checkout -f v7.1.0
> mkdir build
> cd build
> yum install -y ninja-build.x86_64
> yum -y install libslirp-devel.x86_64
> ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
> make
> make install
> ---
> 
> Best Regards,
> Thanks!
> 
> On 2024-02-10 at 01:43:50 +0000, Maciej W. Rozycki wrote:
> > Only return successful completion status from `pcie_failed_link_retrain' 
> > if retraining has actually been done, preventing excessive delays from 
> > being triggered at call sites in a hope that communication will finally 
> > be established with the downstream device where in fact nothing has been 
> > done about the link in question that would justify such a hope.
> > 
> > Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> > Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
> > Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> > Cc: stable@vger.kernel.org # v6.5+
> > ---
> >  drivers/pci/quirks.c |   18 +++++++++++-------
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> > 
> > linux-pcie-failed-link-retrain-status-fix.diff
> > Index: linux-macro/drivers/pci/quirks.c
> > ===================================================================
> > --- linux-macro.orig/drivers/pci/quirks.c
> > +++ linux-macro/drivers/pci/quirks.c
> > @@ -74,7 +74,8 @@
> >   * firmware may have already arranged and lift it with ports that already
> >   * report their data link being up.
> >   *
> > - * Return TRUE if the link has been successfully retrained, otherwise FALSE.
> > + * Return TRUE if the link has been successfully retrained, otherwise FALSE,
> > + * also when retraining was not needed in the first place.
> >   */
> >  bool pcie_failed_link_retrain(struct pci_dev *dev)
> >  {
> > @@ -83,10 +84,11 @@ bool pcie_failed_link_retrain(struct pci
> >  		{}
> >  	};
> >  	u16 lnksta, lnkctl2;
> > +	bool ret = false;
> >  
> >  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
> >  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> > -		return false;
> > +		return ret;
> >  
> >  	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
> >  	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> > @@ -98,9 +100,10 @@ bool pcie_failed_link_retrain(struct pci
> >  		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> >  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
> >  
> > -		if (pcie_retrain_link(dev, false)) {
> > +		ret = pcie_retrain_link(dev, false) == 0;
> > +		if (!ret) {
> >  			pci_info(dev, "retraining failed\n");
> > -			return false;
> > +			return ret;
> >  		}
> >  
> >  		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> > @@ -117,13 +120,14 @@ bool pcie_failed_link_retrain(struct pci
> >  		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
> >  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
> >  
> > -		if (pcie_retrain_link(dev, false)) {
> > +		ret = pcie_retrain_link(dev, false) == 0;
> > +		if (!ret) {
> >  			pci_info(dev, "retraining failed\n");
> > -			return false;
> > +			return ret;
> >  		}
> >  	}
> >  
> > -	return true;
> > +	return ret;
> >  }
> >  
> >  static ktime_t fixup_debug_start(struct pci_dev *dev,


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] PCI: Correct error reporting with PCIe failed link retraining
  2024-03-14 11:27     ` Ilpo Järvinen
@ 2024-03-15  3:10       ` Pengfei Xu
  0 siblings, 0 replies; 10+ messages in thread
From: Pengfei Xu @ 2024-03-15  3:10 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Maciej W. Rozycki, Bjorn Helgaas, Mika Westerberg, linux-pci, LKML

Hi Ilpo Järvinen,

On 2024-03-14 at 19:27:36 +0800, Ilpo Järvinen wrote:
> On Thu, 14 Mar 2024, Pengfei Xu wrote:
> 
> > Hi Maciej W. Rozycki,
> > 
> > Greeting!
> > 
> > We tested intel internal Linux next kernel by syzkaller fuzz testing and met
> > "CPU soft lockup in __run_timers" problem in guest.
> > 
> > Bisect and find that the following patch is the first bad commit.
> 
> Hi,
> 
> I'm skeptical of this bisect result (perhaps some of the kernels marked 
> good are actually bad because the problem does not always trigger?).
> 
> Did you check what is the result if you put that patch on top of 6.8?

  Thanks for suggestion!
  I tried this commit on top of v6.8, CPU softlockup could be reproduced in
  23020s(6h23min), but not related to "__run_timers".
  And tried again v6.8 mainline kernel, it reproduced CPU softlockup in
  176s, but not related to "__run_timers".

  So, this issue triggered by "perf and ioctl" syscalls could lead to CPU
  softlockup with varying reasons.

  This issue cannot be reliably reproduced, resulting in incorrect
  bisection results.

  Please ignore the report specific to this commit.
  Sorry for inconvenience.

  Best Regards!
  Thanks!

> 
> -- 
>  i.
> 
> > Check the commit content is same as following patch.
> >
> > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240313_174939___run_timers
> > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/repro.c
> > Syzkaller reproduced steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/repro.prog
> > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/bisect_info.log
> > Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240313_174939___run_timers/0a53950322bc80aeebf56f5a9d38c847186a082a_dmesg.log
> > bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240313_174939___run_timers/bzImage_v6.8_problem.tar.gz
> > 
> > "
> > [   34.599994] hrtimer: interrupt took 63789 ns
> > [  102.443957] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 3343956725 wd_nsec: 3343956627
> > [  393.043326] watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [repro:2653]
> > [  393.043704] Modules linked in:
> > [  393.043856] irq event stamp: 17486443
> > [  393.044019] hardirqs last  enabled at (17486442): [<ffffffff855b0ebe>] irqentry_exit+0x3e/0xa0
> > [  393.044462] hardirqs last disabled at (17486443): [<ffffffff855aed94>] sysvec_apic_timer_interrupt+0x14/0xc0
> > [  393.044925] softirqs last  enabled at (14336234): [<ffffffff8126c828>] __irq_exit_rcu+0xa8/0x110
> > [  393.045332] softirqs last disabled at (14336237): [<ffffffff8126c828>] __irq_exit_rcu+0xa8/0x110
> > [  393.045740] CPU: 1 PID: 2653 Comm: repro Not tainted 6.8.0-0a53950322bc+ #1
> > [  393.046062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [  393.046618] RIP: 0010:preempt_count_sub+0x6c/0x160
> > [  393.046856] Code: ed 07 85 c9 75 1b 65 8b 05 d9 7b d2 7e 89 c2 81 e2 ff ff ff 7f 39 da 7c 1b 81 fb fe 00 00 00 76 6b f7 db 65 01 1d bc 7b d2 7e <48> 8b 5d f8 c9 c3 cc cc cc cc e8 d5 36 9d 01 85 c0 74 ed 48 c7 c0
> > [  393.047696] RSP: 0018:ffff88806cd09da0 EFLAGS: 00000213
> > [  393.047950] RAX: 0000000000000102 RBX: 00000000ffffffff RCX: 0000000000000000
> > [  393.048281] RDX: 0000000000000102 RSI: 0000000000000102 RDI: 0000000000000001
> > [  393.048611] RBP: ffff88806cd09da8 R08: 0000000000000001 R09: fffffbfff11caa46
> > [  393.048943] R10: ffffffff88e55237 R11: 0000000000000001 R12: ffff88806cd32a00
> > [  393.049277] R13: ffffffff81516e90 R14: dffffc0000000000 R15: ffff88806cd32a00
> > [  393.049611] FS:  0000000000000000(0000) GS:ffff88806cd00000(0000) knlGS:0000000000000000
> > [  393.049987] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  393.050261] CR2: 00007febf5bfa838 CR3: 000000000bda4006 CR4: 0000000000770ef0
> > [  393.050596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  393.050928] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> > [  393.051262] PKRU: 55555554
> > [  393.051396] Call Trace:
> > [  393.051521]  <IRQ>
> > [  393.051630]  ? show_regs+0xa9/0xc0
> > [  393.051805]  ? watchdog_timer_fn+0x531/0x6b0
> > [  393.052019]  ? __pfx_watchdog_timer_fn+0x10/0x10
> > [  393.052255]  ? __hrtimer_run_queues+0x1a4/0xc00
> > [  393.052505]  ? __pfx___hrtimer_run_queues+0x10/0x10
> > [  393.052770]  ? hrtimer_interrupt+0x324/0x7a0
> > [  393.052994]  ? __sysvec_apic_timer_interrupt+0x105/0x3c0
> > [  393.053254]  ? sysvec_apic_timer_interrupt+0x4b/0xc0
> > [  393.053495]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
> > [  393.053757]  ? __pfx_clocksource_watchdog+0x10/0x10
> > [  393.054002]  ? preempt_count_sub+0x6c/0x160
> > [  393.054213]  _raw_spin_unlock_irq+0x3c/0x60
> > [  393.054423]  __run_timers.part.0+0x6c9/0xa40
> > [  393.054647]  ? __pfx___run_timers.part.0+0x10/0x10
> > [  393.054883]  ? sysvec_irq_work+0x8b/0x100
> > [  393.055087]  ? asm_sysvec_irq_work+0x1f/0x30
> > [  393.055314]  run_timer_softirq+0xbc/0x1c0
> > [  393.055517]  __do_softirq+0x1cb/0x84a
> > [  393.055709]  __irq_exit_rcu+0xa8/0x110
> > [  393.055899]  irq_exit_rcu+0x12/0x30
> > [  393.056078]  sysvec_apic_timer_interrupt+0xa5/0xc0
> > [  393.056314]  </IRQ>
> > [  393.056422]  <TASK>
> > [  393.056532]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
> > [  393.056781] RIP: 0010:__schedule+0xbe/0x30c0
> > [  393.056993] Code: c0 e8 e6 a8 fd ff 48 98 48 8d 3c c5 a0 dc 75 86 48 89 fa 48 c1 ea 03 42 80 3c 2a 00 0f 85 81 28 00 00 4c 03 24 c5 a0 dc 75 86 <49> 8d 84 24 58 0a 00 00 48 89 c2 48 89 85 e0 fe ff ff 48 b8 00 00
> > [  393.057847] RSP: 0018:ffff888024f3f6e8 EFLAGS: 00000282
> > [  393.058101] RAX: 0000000000000001 RBX: 0000000000046380 RCX: 0000000000000001
> > [  393.058437] RDX: 1ffffffff0cebb95 RSI: 0000000000000001 RDI: ffffffff8675dca8
> > [  393.058771] RBP: ffff888024f3f820 R08: 0000000000000001 R09: 0000000000000002
> > [  393.059107] R10: 0000000000406000 R11: 0000000000000001 R12: ffff88806cd46380
> > [  393.059442] R13: dffffc0000000000 R14: ffffffff855da674 R15: 0000000000406000
> > [  393.059779]  ? __cond_resched+0x24/0x30
> > [  393.059982]  ? __schedule+0x9a/0x30c0
> > [  393.060173]  ? lockdep_hardirqs_on+0x8a/0x110
> > [  393.060408]  ? trace_hardirqs_on+0x26/0x120
> > [  393.060637]  ? __pfx___schedule+0x10/0x10
> > [  393.060836]  ? __this_cpu_preempt_check+0x21/0x30
> > [  393.061068]  ? lock_release+0x417/0x7e0
> > [  393.061267]  ? __this_cpu_preempt_check+0x21/0x30
> > [  393.061497]  ? lock_is_held_type+0xf0/0x150
> > [  393.061705]  ? __cond_resched+0x24/0x30
> > [  393.061895]  preempt_schedule_common+0x4a/0xd0
> > [  393.062118]  __cond_resched+0x24/0x30
> > [  393.062302]  unmap_page_range+0xab4/0x3690
> > [  393.062530]  ? __pfx_unmap_page_range+0x10/0x10
> > [  393.062752]  ? __this_cpu_preempt_check+0x21/0x30
> > [  393.062989]  ? uprobe_munmap+0xb0/0x590
> > [  393.063186]  unmap_single_vma+0x1ac/0x2d0
> > [  393.063390]  unmap_vmas+0x210/0x470
> > [  393.063572]  ? __pfx_unmap_vmas+0x10/0x10
> > [  393.063769]  ? __pfx_lock_release+0x10/0x10
> > [  393.063977]  ? lock_release+0x417/0x7e0
> > [  393.064167]  ? __pfx_folio_batch_move_lru+0x10/0x10
> > [  393.064410]  ? __pfx_lock_release+0x10/0x10
> > [  393.064624]  ? mlock_drain_local+0x281/0x4b0
> > [  393.064844]  exit_mmap+0x19b/0xac0
> > [  393.065018]  ? mark_lock.part.0+0xf3/0x17a0
> > [  393.065233]  ? __pfx_exit_mmap+0x10/0x10
> > [  393.065427]  ? __kasan_check_write+0x18/0x20
> > [  393.065644]  ? __pfx___mutex_unlock_slowpath+0x10/0x10
> > [  393.065909]  ? mutex_unlock+0x16/0x20
> > [  393.066096]  __mmput+0xde/0x3e0
> > [  393.066256]  mmput+0x74/0x90
> > [  393.066407]  do_exit+0xa59/0x28c0
> > [  393.066580]  ? lock_release+0x417/0x7e0
> > [  393.066778]  ? __pfx_lock_release+0x10/0x10
> > [  393.066989]  ? __pfx_do_exit+0x10/0x10
> > [  393.067179]  ? __this_cpu_preempt_check+0x21/0x30
> > [  393.067410]  ? _raw_spin_unlock_irq+0x2c/0x60
> > [  393.067630]  ? lockdep_hardirqs_on+0x8a/0x110
> > [  393.067845]  ? _raw_spin_unlock_irq+0x2c/0x60
> > [  393.068064]  ? trace_hardirqs_on+0x26/0x120
> > [  393.068287]  do_group_exit+0xe5/0x2c0
> > [  393.068489]  __x64_sys_exit_group+0x4d/0x60
> > [  393.068707]  do_syscall_64+0x73/0x150
> > [  393.068894]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> > [  393.069140] RIP: 0033:0x7febf5b18a4d
> > [  393.069319] Code: Unable to access opcode bytes at 0x7febf5b18a23.
> > "
> > 
> > ---
> > 
> > If you don't need the following environment to reproduce the problem or if you
> > already have one reproduced environment, please ignore the following information.
> > 
> > How to reproduce:
> > git clone https://gitlab.com/xupengfe/repro_vm_env.git
> > cd repro_vm_env
> > tar -xvf repro_vm_env.tar.gz
> > cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
> >   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
> >   // You could change the bzImage_xxx as you want
> >   // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
> > You could use below command to log in, there is no password for root.
> > ssh -p 10023 root@localhost
> > 
> > After login vm(virtual machine) successfully, you could transfer reproduced
> > binary to the vm by below way, and reproduce the problem in vm:
> > gcc -pthread -o repro repro.c
> > scp -P 10023 repro root@localhost:/root/
> > 
> > Get the bzImage for target kernel:
> > Please use target kconfig and copy it to kernel_src/.config
> > make olddefconfig
> > make -jx bzImage           //x should equal or less than cpu num your pc has
> > 
> > Fill the bzImage file into above start3.sh to load the target kernel in vm.
> > 
> > 
> > Tips:
> > If you already have qemu-system-x86_64, please ignore below info.
> > If you want to install qemu v7.1.0 version:
> > git clone https://github.com/qemu/qemu.git
> > cd qemu
> > git checkout -f v7.1.0
> > mkdir build
> > cd build
> > yum install -y ninja-build.x86_64
> > yum -y install libslirp-devel.x86_64
> > ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
> > make
> > make install
> > ---
> > 
> > Best Regards,
> > Thanks!
> > 
> > On 2024-02-10 at 01:43:50 +0000, Maciej W. Rozycki wrote:
> > > Only return successful completion status from `pcie_failed_link_retrain' 
> > > if retraining has actually been done, preventing excessive delays from 
> > > being triggered at call sites in a hope that communication will finally 
> > > be established with the downstream device where in fact nothing has been 
> > > done about the link in question that would justify such a hope.
> > > 
> > > Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> > > Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > > Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
> > > Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> > > Cc: stable@vger.kernel.org # v6.5+
> > > ---
> > >  drivers/pci/quirks.c |   18 +++++++++++-------
> > >  1 file changed, 11 insertions(+), 7 deletions(-)
> > > 
> > > linux-pcie-failed-link-retrain-status-fix.diff
> > > Index: linux-macro/drivers/pci/quirks.c
> > > ===================================================================
> > > --- linux-macro.orig/drivers/pci/quirks.c
> > > +++ linux-macro/drivers/pci/quirks.c
> > > @@ -74,7 +74,8 @@
> > >   * firmware may have already arranged and lift it with ports that already
> > >   * report their data link being up.
> > >   *
> > > - * Return TRUE if the link has been successfully retrained, otherwise FALSE.
> > > + * Return TRUE if the link has been successfully retrained, otherwise FALSE,
> > > + * also when retraining was not needed in the first place.
> > >   */
> > >  bool pcie_failed_link_retrain(struct pci_dev *dev)
> > >  {
> > > @@ -83,10 +84,11 @@ bool pcie_failed_link_retrain(struct pci
> > >  		{}
> > >  	};
> > >  	u16 lnksta, lnkctl2;
> > > +	bool ret = false;
> > >  
> > >  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
> > >  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> > > -		return false;
> > > +		return ret;
> > >  
> > >  	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
> > >  	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> > > @@ -98,9 +100,10 @@ bool pcie_failed_link_retrain(struct pci
> > >  		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> > >  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
> > >  
> > > -		if (pcie_retrain_link(dev, false)) {
> > > +		ret = pcie_retrain_link(dev, false) == 0;
> > > +		if (!ret) {
> > >  			pci_info(dev, "retraining failed\n");
> > > -			return false;
> > > +			return ret;
> > >  		}
> > >  
> > >  		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> > > @@ -117,13 +120,14 @@ bool pcie_failed_link_retrain(struct pci
> > >  		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
> > >  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
> > >  
> > > -		if (pcie_retrain_link(dev, false)) {
> > > +		ret = pcie_retrain_link(dev, false) == 0;
> > > +		if (!ret) {
> > >  			pci_info(dev, "retraining failed\n");
> > > -			return false;
> > > +			return ret;
> > >  		}
> > >  	}
> > >  
> > > -	return true;
> > > +	return ret;
> > >  }
> > >  
> > >  static ktime_t fixup_debug_start(struct pci_dev *dev,
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] PCI: Correct error reporting with PCIe failed link retraining
  2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
  2024-02-12 12:01   ` Ilpo Järvinen
  2024-03-14  7:19   ` Pengfei Xu
@ 2024-04-24 22:13   ` Bjorn Helgaas
  2 siblings, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2024-04-24 22:13 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Ilpo Järvinen, Bjorn Helgaas, Mika Westerberg, linux-pci,
	linux-kernel

On Sat, Feb 10, 2024 at 01:43:50AM +0000, Maciej W. Rozycki wrote:
> Only return successful completion status from `pcie_failed_link_retrain' 
> if retraining has actually been done, preventing excessive delays from 
> being triggered at call sites in a hope that communication will finally 
> be established with the downstream device where in fact nothing has been 
> done about the link in question that would justify such a hope.
> 
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: stable@vger.kernel.org # v6.5+
> ---
>  drivers/pci/quirks.c |   18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> linux-pcie-failed-link-retrain-status-fix.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -74,7 +74,8 @@
>   * firmware may have already arranged and lift it with ports that already
>   * report their data link being up.
>   *
> - * Return TRUE if the link has been successfully retrained, otherwise FALSE.
> + * Return TRUE if the link has been successfully retrained, otherwise FALSE,
> + * also when retraining was not needed in the first place.

Can you recast this?  I think it's slightly unclear what is returned
when retraining is not needed.  I *think* you return FALSE when
retraining is not needed.  Maybe this?

  Return TRUE if the link has been successfully retrained.  Return
  FALSE if retraining was not needed or we attempted a retrain and it
  failed.

>   */
>  bool pcie_failed_link_retrain(struct pci_dev *dev)
>  {
> @@ -83,10 +84,11 @@ bool pcie_failed_link_retrain(struct pci
>  		{}
>  	};
>  	u16 lnksta, lnkctl2;
> +	bool ret = false;
>  
>  	if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
>  	    !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> -		return false;
> +		return ret;

We know the value here, so IMO it's easier to read if we return
"false" instead of "ret".

>  	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
>  	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> @@ -98,9 +100,10 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		if (pcie_retrain_link(dev, false)) {
> +		ret = pcie_retrain_link(dev, false) == 0;
> +		if (!ret) {
>  			pci_info(dev, "retraining failed\n");
> -			return false;
> +			return ret;

Same here.  We're here because !ret was true, so ret must be false.
I guess in the next patch you want to return the pcie_retrain_link()
return value up the chain, so it will make sense there.

>  		}
>  
>  		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> @@ -117,13 +120,14 @@ bool pcie_failed_link_retrain(struct pci
>  		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
>  		pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
>  
> -		if (pcie_retrain_link(dev, false)) {
> +		ret = pcie_retrain_link(dev, false) == 0;
> +		if (!ret) {
>  			pci_info(dev, "retraining failed\n");
> -			return false;
> +			return ret;
>  		}
>  	}
>  
> -	return true;
> +	return ret;

It gets awfully subtle by the time we get here.  I guess we could set
a "retrain_attempted" flag above and do this:

  if (retrain_attempted)
    return true;

  return false;

But I dunno if it's any better.  I understand the need for a change
like this, but the whole idea of returning failure (false) for a
retrain failure and also for a "no retrain needed" is a little
mind-bending.

>  }
>  
>  static ktime_t fixup_debug_start(struct pci_dev *dev,

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-04-24 22:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-10  1:43 [PATCH 0/2] PCI: Rework error reporting with PCIe failed link retraining Maciej W. Rozycki
2024-02-10  1:43 ` [PATCH 1/2] PCI: Correct " Maciej W. Rozycki
2024-02-12 12:01   ` Ilpo Järvinen
2024-03-14  7:19   ` Pengfei Xu
2024-03-14 11:27     ` Ilpo Järvinen
2024-03-15  3:10       ` Pengfei Xu
2024-04-24 22:13   ` Bjorn Helgaas
2024-02-10  1:43 ` [PATCH 2/2] PCI: Use an error code " Maciej W. Rozycki
2024-02-12 12:11   ` Ilpo Järvinen
2024-02-26 12:55 ` [PATCH 0/2] PCI: Rework error reporting " Maciej W. Rozycki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).