linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*()
@ 2020-07-31 11:02 Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas, Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, Arnd Bergmann, Greg Kroah-Hartman,
	David S. Miller, Kalle Valo, Jakub Kicinski, Rafael J. Wysocki,
	Len Brown, Russell Currey, Sam Bobroff, Oliver O'Halloran
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-kernel-mentees,
	linux-pci, linux-kernel, linux-rdma, QCA ath9k Development,
	linux-wireless, netdev, linux-acpi, linuxppc-dev

v4 CHANGES:
- Drop uses of pcie_capability_read_*() return value. This related to
  [1] which is pointing towards making the accessors return void.
- Remove patches found to be unnecessary
- Reword some commit messages

v3 CHANGES:
- Split previous PATCH 6/13 into two : PATCH 6/14 and PATCH 7/14
- Fix commit message of PATCH 5/14
- Update Patch numbering and Commit messages
- Add 'Acked by Greg KH' to PATCH 2/14
- Add PATCH version

v2 CHANGES:
- Fix missing comma, causing the email cc error
- Fix typos and numbering errors in commit messages
- Add commit message to 13/13
- Add two more patches: PATCH 3/13 and PATCH 4/13

MERGING:
- Patch 6/12 depends on Patch 5/12. However Patch 5/12 has no dependency.
  Please, merge PATCH 6/12 only after Patch 5/12.
- Patch 12/12 depends on all preceding patches. Please merge Patch 12/12
  only after other patches in this series have been merged.
- All other patches have no dependencies besides those mentioned above and
  can be merge individually.

PATCH 5/12:
Set the default case in the switch-statement to set status
to "Power On".

PATCH 1/12 to 11/12:
Use the value read by pcie_capability_read_*() to determine success or
failure. This is done by checking if it is ~0, while maintaining the
functions' behaviour. This ensures that the changes in PATCH 12/12 does
not introduce any bug.

PATCH 12/12:
There are several reasons why a PCI capability read may fail whether the
device is present or not. If this happens, pcie_capability_read_*() will
return -EINVAL/PCIBIOS_BAD_REGISTER_NUMBER or PCIBIOS_DEVICE_NOT_FOUND
and *val is set to 0.

This behaviour if further ensured by this code inside
pcie_capability_read_*()

 ret = pci_read_config_dword(dev, pci_pcie_cap(dev) + pos, val);
 /*
  * Reset *val to 0 if pci_read_config_dword() fails, it may
  * have been written as 0xFFFFFFFF if hardware error happens
  * during pci_read_config_dword().
  */
 if (ret)
	 *val = 0;
 return ret;

a) Since all pci_generic_config_read() does is read a register value,
it may return success after reading a ~0 which *may* have been fabricated
by the PCI host bridge due to a read timeout. Hence pci_read_config_*() 
will return success with a fabricated ~0 in *val, indicating a problem.
In this case, the assumed behaviour of  pcie_capability_read_*() will be
wrong. To avoid error slipping through, more checks are necessary.

b) pci_read_config_*() will return PCIBIOS_DEVICE_NOT_FOUND only if 
dev->error_state = pci_channel_io_perm_failure (i.e. 
pci_dev_is_disconnected()) or if pci_generic_config_read() can't find the
device. In both cases *val is initially set to ~0 but as shown in the code
above pcie_capability_read_*() resets it back to 0. Even with this effort,
drivers still have to perform validation checks more so if 0 is a valid
value.

Most drivers only consider the case (b) and in some cases, there is the 
expectation that on timeout *val has a fabricated value of ~0, which *may*
not always be true as explained in (a).

In any case, checks need to be done to validate the value read and maybe
confirm which error has occurred. It is better left to the drivers to do.

Check the return value of pcie_capability_read_dword() to ensure success
and avoid bug as a result of Patch 14/14.
Remove the reset of *val to 0 when pci_read_config_*() fails.

[1] https://lore.kernel.org/linux-pci/20200714234625.GA428442@bjorn-Precision-5520/


Saheed O. Bolarinwa (12):
  IB/hfi1: Check if pcie_capability_read_*() reads ~0
  misc: rtsx: Check if pcie_capability_read_*() reads ~0
  ath9k: Check if pcie_capability_read_*() reads ~0
  iwlegacy: Check if pcie_capability_read_*() reads ~0
  PCI: pciehp: Set "Power On" as the default get_power_status
  PCI: pciehp: Check if pcie_capability_read_*() reads ~0
  PCI/ACPI: Check if pcie_capability_read_*() reads ~0
  PCI: Check if pcie_capability_read_*() reads ~0
  PCI/PM: Check if pcie_capability_read_*() reads ~0
  PCI/AER: Check if pcie_capability_read_*() reads ~0
  PCI/ASPM: Check if pcie_capability_read_*() reads ~0
  PCI: Remove '*val = 0' from pcie_capability_read_*()

 drivers/infiniband/hw/hfi1/aspm.c            |  6 ++--
 drivers/misc/cardreader/rts5227.c            |  2 +-
 drivers/misc/cardreader/rts5249.c            |  2 +-
 drivers/misc/cardreader/rts5260.c            |  2 +-
 drivers/misc/cardreader/rts5261.c            |  2 +-
 drivers/net/wireless/ath/ath9k/pci.c         |  3 +-
 drivers/net/wireless/intel/iwlegacy/common.c |  2 +-
 drivers/pci/access.c                         | 14 --------
 drivers/pci/hotplug/pciehp_hpc.c             | 13 +++++---
 drivers/pci/pci-acpi.c                       |  4 +--
 drivers/pci/pci.c                            | 34 ++++++++++++++------
 drivers/pci/pcie/aer.c                       |  2 +-
 drivers/pci/pcie/aspm.c                      | 10 +++---
 drivers/pci/probe.c                          | 12 +++----
 14 files changed, 56 insertions(+), 52 deletions(-)

-- 
2.18.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  2020-07-31 13:55   ` Bjorn Helgaas
  2020-07-31 11:02 ` [PATCH v4 02/12] misc: rtsx: " Saheed O. Bolarinwa
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas, Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-kernel-mentees,
	linux-pci, linux-kernel, linux-rdma

On failure pcie_capability_read_dword() sets it's last parameter,
val to 0. In this case dn and up will be 0, so aspm_hw_l1_supported()
will return false.
However, with Patch 12/12, it is possible that val is set to ~0 on
failure. This would introduce a bug because (x & x) == (~0 & x). So
with dn and up being 0x02, a true value is return when the read has
actually failed.

Since, the value ~0 is invalid here,

Reset dn and up to 0 when a value of ~0 is read into them, this
ensures false is returned on failure in this case.

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---

 drivers/infiniband/hw/hfi1/aspm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/aspm.c b/drivers/infiniband/hw/hfi1/aspm.c
index a3c53be4072c..9605b2145d19 100644
--- a/drivers/infiniband/hw/hfi1/aspm.c
+++ b/drivers/infiniband/hw/hfi1/aspm.c
@@ -33,13 +33,13 @@ static bool aspm_hw_l1_supported(struct hfi1_devdata *dd)
 		return false;
 
 	pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &dn);
-	dn = ASPM_L1_SUPPORTED(dn);
+	dn = (dn == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(dn);
 
 	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &up);
-	up = ASPM_L1_SUPPORTED(up);
+	up = (up == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(up);
 
 	/* ASPM works on A-step but is reported as not supported */
-	return (!!dn || is_ax(dd)) && !!up;
+	return (dn || is_ax(dd)) && up;
 }
 
 /* Set L1 entrance latency for slower entry to L1 */
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 02/12] misc: rtsx: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 03/12] ath9k: " Saheed O. Bolarinwa
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas, Arnd Bergmann, Greg Kroah-Hartman
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-kernel-mentees,
	linux-pci, linux-kernel

On failure pcie_capability_read_word() sets it's last parameter, val
to 0. In which case (val & PCI_EXP_DEVCTL2_LTR_EN) evaluates to 0.
However, with Patch 12/12, it is possible that val is set to ~0 on
failure. This would introduce a bug because (x & x) == (~0 & x).

Since ~0 is an invalid value here,

Add an extra check for ~0 to the if condition to ensure success.

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---
 drivers/misc/cardreader/rts5227.c | 2 +-
 drivers/misc/cardreader/rts5249.c | 2 +-
 drivers/misc/cardreader/rts5260.c | 2 +-
 drivers/misc/cardreader/rts5261.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/cardreader/rts5227.c b/drivers/misc/cardreader/rts5227.c
index 3a9467aaa435..cab816639df1 100644
--- a/drivers/misc/cardreader/rts5227.c
+++ b/drivers/misc/cardreader/rts5227.c
@@ -106,7 +106,7 @@ static int rts5227_extra_init_hw(struct rtsx_pcr *pcr)
 	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, OLT_LED_CTL, 0x0F, 0x02);
 	/* Configure LTR */
 	pcie_capability_read_word(pcr->pci, PCI_EXP_DEVCTL2, &cap);
-	if (cap & PCI_EXP_DEVCTL2_LTR_EN)
+	if ((cap != (u16)~0) && (cap & PCI_EXP_DEVCTL2_LTR_EN))
 		rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LTR_CTL, 0xFF, 0xA3);
 	/* Configure OBFF */
 	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, OBFF_CFG, 0x03, 0x03);
diff --git a/drivers/misc/cardreader/rts5249.c b/drivers/misc/cardreader/rts5249.c
index 6c6c9e95a29f..4382ac753fda 100644
--- a/drivers/misc/cardreader/rts5249.c
+++ b/drivers/misc/cardreader/rts5249.c
@@ -119,7 +119,7 @@ static void rts5249_init_from_cfg(struct rtsx_pcr *pcr)
 		u16 val;
 
 		pcie_capability_read_word(pcr->pci, PCI_EXP_DEVCTL2, &val);
-		if (val & PCI_EXP_DEVCTL2_LTR_EN) {
+		if ((val != (u16)~0) && (val & PCI_EXP_DEVCTL2_LTR_EN)) {
 			option->ltr_enabled = true;
 			option->ltr_active = true;
 			rtsx_set_ltr_latency(pcr, option->ltr_active_latency);
diff --git a/drivers/misc/cardreader/rts5260.c b/drivers/misc/cardreader/rts5260.c
index 7a9dbb778e84..3d52c86aaaa3 100644
--- a/drivers/misc/cardreader/rts5260.c
+++ b/drivers/misc/cardreader/rts5260.c
@@ -519,7 +519,7 @@ static void rts5260_init_from_cfg(struct rtsx_pcr *pcr)
 		u16 val;
 
 		pcie_capability_read_word(pcr->pci, PCI_EXP_DEVCTL2, &val);
-		if (val & PCI_EXP_DEVCTL2_LTR_EN) {
+		if ((val != (u16)~0) && (val & PCI_EXP_DEVCTL2_LTR_EN)) {
 			option->ltr_enabled = true;
 			option->ltr_active = true;
 			rtsx_set_ltr_latency(pcr, option->ltr_active_latency);
diff --git a/drivers/misc/cardreader/rts5261.c b/drivers/misc/cardreader/rts5261.c
index 195822ec858e..7e1188740194 100644
--- a/drivers/misc/cardreader/rts5261.c
+++ b/drivers/misc/cardreader/rts5261.c
@@ -440,7 +440,7 @@ static void rts5261_init_from_cfg(struct rtsx_pcr *pcr)
 		u16 val;
 
 		pcie_capability_read_word(pcr->pci, PCI_EXP_DEVCTL2, &val);
-		if (val & PCI_EXP_DEVCTL2_LTR_EN) {
+		if ((val != (u16)~0) && (val & PCI_EXP_DEVCTL2_LTR_EN)) {
 			option->ltr_enabled = true;
 			option->ltr_active = true;
 			rtsx_set_ltr_latency(pcr, option->ltr_active_latency);
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 03/12] ath9k: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 02/12] misc: rtsx: " Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 04/12] iwlegacy: " Saheed O. Bolarinwa
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas, David S. Miller, Kalle Valo, Jakub Kicinski
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-kernel-mentees,
	linux-pci, linux-kernel, QCA ath9k Development, linux-wireless,
	netdev

On failure pcie_capability_read_dword() sets it's last parameter, val
to 0. However, with Patch 12/12, it is possible that val is set to ~0
on failure. This would introduce a bug because (x & x) == (~0 & x).

Since ~0 is an invalid value here,

Add an extra check for ~0 to the if condition to ensure success.

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---
 drivers/net/wireless/ath/ath9k/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c
index f3461b193c7a..f02b243befef 100644
--- a/drivers/net/wireless/ath/ath9k/pci.c
+++ b/drivers/net/wireless/ath/ath9k/pci.c
@@ -867,7 +867,8 @@ static void ath_pci_aspm_init(struct ath_common *common)
 		pci_read_config_dword(pdev, 0x70c, &ah->config.aspm_l1_fix);
 
 	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &aspm);
-	if (aspm & (PCI_EXP_LNKCTL_ASPM_L0S | PCI_EXP_LNKCTL_ASPM_L1)) {
+	if ((aspm != (u16)~0) &&
+	    (aspm & (PCI_EXP_LNKCTL_ASPM_L0S | PCI_EXP_LNKCTL_ASPM_L1))) {
 		ah->aspm_enabled = true;
 		/* Initialize PCIe PM and SERDES registers. */
 		ath9k_hw_configpcipowersave(ah, false);
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 04/12] iwlegacy: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
                   ` (2 preceding siblings ...)
  2020-07-31 11:02 ` [PATCH v4 03/12] ath9k: " Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 05/12] PCI: pciehp: Set "Power On" as the default get_power_status Saheed O. Bolarinwa
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas, Kalle Valo, David S. Miller, Jakub Kicinski
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-kernel-mentees,
	linux-pci, linux-kernel, Stanislaw Gruszka, linux-wireless,
	netdev

On failure pcie_capability_read_dword() sets it's last parameter, val
to 0. However, with Patch 12/12, it is possible that val is set to ~0
on failure. This would introduce a bug because (x & x) == (~0 & x).

Since ~0 is an invalid value here,

Add an extra check for ~0 to the if condition to ensure success.

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---
 drivers/net/wireless/intel/iwlegacy/common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlegacy/common.c b/drivers/net/wireless/intel/iwlegacy/common.c
index 348c17ce72f5..659027563260 100644
--- a/drivers/net/wireless/intel/iwlegacy/common.c
+++ b/drivers/net/wireless/intel/iwlegacy/common.c
@@ -4287,7 +4287,7 @@ il_apm_init(struct il_priv *il)
 	 */
 	if (il->cfg->set_l0s) {
 		pcie_capability_read_word(il->pci_dev, PCI_EXP_LNKCTL, &lctl);
-		if (lctl & PCI_EXP_LNKCTL_ASPM_L1) {
+		if ((lctl != (u16)~0) && (lctl & PCI_EXP_LNKCTL_ASPM_L1)) {
 			/* L1-ASPM enabled; disable(!) L0S  */
 			il_set_bit(il, CSR_GIO_REG,
 				   CSR_GIO_REG_VAL_L0S_ENABLED);
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 05/12] PCI: pciehp: Set "Power On" as the default get_power_status
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
                   ` (3 preceding siblings ...)
  2020-07-31 11:02 ` [PATCH v4 04/12] iwlegacy: " Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 06/12] PCI: pciehp: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 07/12] PCI/ACPI: " Saheed O. Bolarinwa
  6 siblings, 0 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-pci,
	linux-kernel-mentees, linux-kernel

The default case of the switch statement is redundant since
PCI_EXP_SLTCTL_PCC is only a single bit.

Set the default case in the switch-statement to set status
to "Power On"

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---
 drivers/pci/hotplug/pciehp_hpc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 53433b37e181..b89c9ee4a3b5 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -400,14 +400,12 @@ void pciehp_get_power_status(struct controller *ctrl, u8 *status)
 		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, slot_ctrl);
 
 	switch (slot_ctrl & PCI_EXP_SLTCTL_PCC) {
-	case PCI_EXP_SLTCTL_PWR_ON:
-		*status = 1;	/* On */
-		break;
 	case PCI_EXP_SLTCTL_PWR_OFF:
 		*status = 0;	/* Off */
 		break;
+	case PCI_EXP_SLTCTL_PWR_ON:
 	default:
-		*status = 0xFF;
+		*status = 1;	/* On */
 		break;
 	}
 }
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 06/12] PCI: pciehp: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
                   ` (4 preceding siblings ...)
  2020-07-31 11:02 ` [PATCH v4 05/12] PCI: pciehp: Set "Power On" as the default get_power_status Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  2020-07-31 11:02 ` [PATCH v4 07/12] PCI/ACPI: " Saheed O. Bolarinwa
  6 siblings, 0 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-pci,
	linux-kernel-mentees, linux-kernel

On failure pcie_capability_read_word() sets it's last parameter, val
to 0. However, with Patch 12/12, it is possible that val is set to
~0 on failure. This introduces a bug because (x & x) == (~0 & x).

Since ~0 is an invalid value here,

pciehp_get_power_status():
Add an extra check for ~0 on the value read. If found, set status
to 'Power On' and return.

pcie_wait_for_presence():
Add an extra check for no ~0 to the exit condition of the loop

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---
 drivers/pci/hotplug/pciehp_hpc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index b89c9ee4a3b5..39305aabc3a2 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -278,7 +278,7 @@ static void pcie_wait_for_presence(struct pci_dev *pdev)
 
 	do {
 		pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
-		if (slot_status & PCI_EXP_SLTSTA_PDS)
+		if ((slot_status != (u16)~0) && (slot_status & PCI_EXP_SLTSTA_PDS))
 			return;
 		msleep(10);
 		timeout -= 10;
@@ -399,6 +399,11 @@ void pciehp_get_power_status(struct controller *ctrl, u8 *status)
 	ctrl_dbg(ctrl, "%s: SLOTCTRL %x value read %x\n", __func__,
 		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, slot_ctrl);
 
+	if (slot_ctrl == (u16)~0) {
+		*status = 1;    /* On */
+		return;
+	}
+
 	switch (slot_ctrl & PCI_EXP_SLTCTL_PCC) {
 	case PCI_EXP_SLTCTL_PWR_OFF:
 		*status = 0;	/* Off */
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 07/12] PCI/ACPI: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
                   ` (5 preceding siblings ...)
  2020-07-31 11:02 ` [PATCH v4 06/12] PCI: pciehp: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
@ 2020-07-31 11:02 ` Saheed O. Bolarinwa
  6 siblings, 0 replies; 10+ messages in thread
From: Saheed O. Bolarinwa @ 2020-07-31 11:02 UTC (permalink / raw)
  To: helgaas, Rafael J. Wysocki, Len Brown
  Cc: Saheed O. Bolarinwa, bjorn, skhan, linux-pci,
	linux-kernel-mentees, linux-kernel, linux-acpi

On failure pcie_capability_read_*() sets it's last parameter, val
to 0. However, with Patch 12/12, it is possible that val is set
to ~0 on failure. This would introduces a bug because
(x & x) == (~0 & x).

Since ~0 is an invalid value in here,

Add extra check for ~0 in the if condition to ensure success or
failure.

Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
---
 drivers/pci/pci-acpi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 7224b1e5f2a8..873b005947e4 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -253,7 +253,7 @@ static bool pcie_root_rcb_set(struct pci_dev *dev)
 		return false;
 
 	pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl);
-	if (lnkctl & PCI_EXP_LNKCTL_RCB)
+	if ((lnkctl != (u16)~0) && (lnkctl & PCI_EXP_LNKCTL_RCB))
 		return true;
 
 	return false;
@@ -797,7 +797,7 @@ bool pciehp_is_native(struct pci_dev *bridge)
 		return false;
 
 	pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
-	if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+	if ((slot_cap == (u32)~0) || !(slot_cap & PCI_EXP_SLTCAP_HPC))
 		return false;
 
 	if (pcie_ports_native)
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0
  2020-07-31 11:02 ` [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
@ 2020-07-31 13:55   ` Bjorn Helgaas
  2020-08-03 11:46     ` Ian Kumlien
  0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Helgaas @ 2020-07-31 13:55 UTC (permalink / raw)
  To: Saheed O. Bolarinwa
  Cc: Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, bjorn, skhan, linux-kernel-mentees, linux-pci,
	linux-kernel, linux-rdma, Michael J. Ruhl, Ashutosh Dixit,
	Ian Kumlien, Puranjay Mohan

[+cc Michael, Ashutosh, Ian, Puranjay]

On Fri, Jul 31, 2020 at 01:02:29PM +0200, Saheed O. Bolarinwa wrote:
> On failure pcie_capability_read_dword() sets it's last parameter,
> val to 0. In this case dn and up will be 0, so aspm_hw_l1_supported()
> will return false.
> However, with Patch 12/12, it is possible that val is set to ~0 on
> failure. This would introduce a bug because (x & x) == (~0 & x). So
> with dn and up being 0x02, a true value is return when the read has
> actually failed.
> 
> Since, the value ~0 is invalid here,
> 
> Reset dn and up to 0 when a value of ~0 is read into them, this
> ensures false is returned on failure in this case.
> 
> Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
> Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
> ---
> 
>  drivers/infiniband/hw/hfi1/aspm.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/hfi1/aspm.c b/drivers/infiniband/hw/hfi1/aspm.c
> index a3c53be4072c..9605b2145d19 100644
> --- a/drivers/infiniband/hw/hfi1/aspm.c
> +++ b/drivers/infiniband/hw/hfi1/aspm.c
> @@ -33,13 +33,13 @@ static bool aspm_hw_l1_supported(struct hfi1_devdata *dd)
>  		return false;
>  
>  	pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &dn);
> -	dn = ASPM_L1_SUPPORTED(dn);
> +	dn = (dn == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(dn);
>  
>  	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &up);
> -	up = ASPM_L1_SUPPORTED(up);
> +	up = (up == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(up);

I don't want to change this.  The driver shouldn't be mucking with
ASPM at all.  The PCI core should take care of this automatically.  If
it doesn't, we need to fix the core.

If the driver needs to disable ASPM to work around device errata or
something, the core has an interface for that.  But the driver should
not override the system-wide policy for managing ASPM.

Ah, some archaeology finds affa48de8417 ("staging/rdma/hfi1: Add
support for enabling/disabling PCIe ASPM"), which says:

  hfi1 HW has a high PCIe ASPM L1 exit latency and also advertises an
  acceptable latency less than actual ASPM latencies.

That suggests that either there is a device defect, e.g., advertising
incorrect ASPM latencies, or a PCI core defect, e.g., incorrectly
enabling ASPM when the path exit latency exceeds that hfi1 can
tolerate.

Coincidentally, Ian recently debugged a problem in how the PCI core
computes exit latencies over a path [1].

Can anybody supply details about the hfi1 ASPM parameters, e.g., the
output of "sudo lspci -vv"?  Any details about the configuration where
the problem occurs?  Is there a switch in the path?

[1] https://lore.kernel.org/r/20200727213045.2117855-1-ian.kumlien@gmail.com

>  	/* ASPM works on A-step but is reported as not supported */
> -	return (!!dn || is_ax(dd)) && !!up;
> +	return (dn || is_ax(dd)) && up;
>  }
>  
>  /* Set L1 entrance latency for slower entry to L1 */
> -- 
> 2.18.4
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0
  2020-07-31 13:55   ` Bjorn Helgaas
@ 2020-08-03 11:46     ` Ian Kumlien
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Kumlien @ 2020-08-03 11:46 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Saheed O. Bolarinwa, Mike Marciniszyn, Dennis Dalessandro,
	Doug Ledford, Jason Gunthorpe, bjorn, skhan,
	linux-kernel-mentees, linux-pci, linux-kernel, linux-rdma,
	Michael J. Ruhl, Ashutosh Dixit, Puranjay Mohan

[-- Attachment #1: Type: text/plain, Size: 6809 bytes --]

On Fri, Jul 31, 2020 at 3:55 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Michael, Ashutosh, Ian, Puranjay]
>
> On Fri, Jul 31, 2020 at 01:02:29PM +0200, Saheed O. Bolarinwa wrote:
> > On failure pcie_capability_read_dword() sets it's last parameter,
> > val to 0. In this case dn and up will be 0, so aspm_hw_l1_supported()
> > will return false.
> > However, with Patch 12/12, it is possible that val is set to ~0 on
> > failure. This would introduce a bug because (x & x) == (~0 & x). So
> > with dn and up being 0x02, a true value is return when the read has
> > actually failed.
> >
> > Since, the value ~0 is invalid here,
> >
> > Reset dn and up to 0 when a value of ~0 is read into them, this
> > ensures false is returned on failure in this case.
> >
> > Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
> > Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com>
> > ---
> >
> >  drivers/infiniband/hw/hfi1/aspm.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/hfi1/aspm.c b/drivers/infiniband/hw/hfi1/aspm.c
> > index a3c53be4072c..9605b2145d19 100644
> > --- a/drivers/infiniband/hw/hfi1/aspm.c
> > +++ b/drivers/infiniband/hw/hfi1/aspm.c
> > @@ -33,13 +33,13 @@ static bool aspm_hw_l1_supported(struct hfi1_devdata *dd)
> >               return false;
> >
> >       pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &dn);
> > -     dn = ASPM_L1_SUPPORTED(dn);
> > +     dn = (dn == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(dn);
> >
> >       pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &up);
> > -     up = ASPM_L1_SUPPORTED(up);
> > +     up = (up == (u32)~0) ? 0 : ASPM_L1_SUPPORTED(up);
>
> I don't want to change this.  The driver shouldn't be mucking with
> ASPM at all.  The PCI core should take care of this automatically.  If
> it doesn't, we need to fix the core.
>
> If the driver needs to disable ASPM to work around device errata or
> something, the core has an interface for that.  But the driver should
> not override the system-wide policy for managing ASPM.
>
> Ah, some archaeology finds affa48de8417 ("staging/rdma/hfi1: Add
> support for enabling/disabling PCIe ASPM"), which says:
>
>   hfi1 HW has a high PCIe ASPM L1 exit latency and also advertises an
>   acceptable latency less than actual ASPM latencies.
>
> That suggests that either there is a device defect, e.g., advertising
> incorrect ASPM latencies, or a PCI core defect, e.g., incorrectly
> enabling ASPM when the path exit latency exceeds that hfi1 can
> tolerate.
>
> Coincidentally, Ian recently debugged a problem in how the PCI core
> computes exit latencies over a path [1].
>
> Can anybody supply details about the hfi1 ASPM parameters, e.g., the
> output of "sudo lspci -vv"?  Any details about the configuration where
> the problem occurs?  Is there a switch in the path?
>
> [1] https://lore.kernel.org/r/20200727213045.2117855-1-ian.kumlien@gmail.com
>
> >       /* ASPM works on A-step but is reported as not supported */
> > -     return (!!dn || is_ax(dd)) && !!up;
> > +     return (dn || is_ax(dd)) && up;
> >  }
> >
> >  /* Set L1 entrance latency for slower entry to L1 */
> > --
> > 2.18.4
> >

My experience with pcie is very limited, but the more I look at things
the more I get worried...

Anyway, I have made some changes, could you try the attached patch and
see if it makes a difference?

Changes:
L0s and L1 should only apply to links that actually has it enabled,
don't store or increase values if they don't.
Work on L0s as well, currently it clobbers since I'm not certain about
upstream/downstream distinctions.

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index b17e5ffd31b1..0d93ae065f73 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,

 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-       u32 latency, l1_switch_latency = 0;
+       u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+               l0s_max_latency = 0;
        struct aspm_latency *acceptable;
        struct pcie_link_state *link;

@@ -447,15 +448,24 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)
        acceptable = &link->acceptable[PCI_FUNC(endpoint->devfn)];

        while (link) {
-               /* Check upstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-                   (link->latency_up.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_UP;
-
-               /* Check downstream direction L0s latency */
-               if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-                   (link->latency_dw.l0s > acceptable->l0s))
-                       link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+               if (link->aspm_capable & ASPM_STATE_L0S) {
+                       u32 l0s_up = 0, l0s_dw = 0;
+
+                       /* Check upstream direction L0s latency */
+                       if (link->aspm_capable & ASPM_STATE_L0S_UP)
+                               l0s_up = link->latency_up.l0s;
+
+                       /* Check downstream direction L0s latency */
+                       if (link->aspm_capable & ASPM_STATE_L0S_DW)
+                               l0s_dw = link->latency_dw.l0s;
+
+                       l0s_max_latency += max_t(u32, l0s_up, l0s_dw);
+
+                       /* If the latency exceeds, disable both */
+                       if (l0s_max_latency > acceptable->l0s)
+                               link->aspm_capable &= ~ASPM_STATE_L0S;
+               }
+
                /*
                 * Check L1 latency.
                 * Every switch on the path to root complex need 1
@@ -469,11 +479,13 @@ static void pcie_aspm_check_latency(struct
pci_dev *endpoint)
                 * L1 exit latencies advertised by a device include L1
                 * substate latencies (and hence do not do any check).
                 */
-               latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
-               if ((link->aspm_capable & ASPM_STATE_L1) &&
-                   (latency + l1_switch_latency > acceptable->l1))
-                       link->aspm_capable &= ~ASPM_STATE_L1;
-               l1_switch_latency += 1000;
+               if (link->aspm_capable & ASPM_STATE_L1) {
+                       latency = max_t(u32, link->latency_up.l1,
link->latency_dw.l1);
+                       l1_max_latency = max_t(u32, latency, l1_max_latency);
+                       if (l1_max_latency + l1_switch_latency > acceptable->l1)
+                               link->aspm_capable &= ~ASPM_STATE_L1;
+                       l1_switch_latency += 1000;
+               }

                link = link->parent;
        }

[-- Attachment #2: 0001-Use-maximum-latency-when-determining-L1-L0s-ASPM.patch --]
[-- Type: text/x-patch, Size: 3431 bytes --]

From 971735bd32d7a8cb7cd1a8d4316fc2a2e192f8e2 Mon Sep 17 00:00:00 2001
From: Ian Kumlien <ian.kumlien@gmail.com>
Date: Sun, 26 Jul 2020 16:01:15 +0200
Subject: [PATCH] Use maximum latency when determining L1/L0s ASPM

Currently we check the maximum latency of upstream and downstream
per link, not the maximum for the path

This would work if all links have the same latency, but:
endpoint -> c -> b -> a -> root  (in the order we walk the path)

If c or b has the higest latency, it will not register

Fix this by maintaining the maximum latency value for the path

Also, L0s seems to be a cumulative maximum over the path, fix this as
well

This change fixes a regression introduced (but not caused) by:
66ff14e59e8a (PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges)

Signed-off-by: Ian Kumlien <ian.kumlien@gmail.com>
---
 drivers/pci/pcie/aspm.c | 42 ++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index b17e5ffd31b1..0d93ae065f73 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -434,7 +434,8 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
 
 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-	u32 latency, l1_switch_latency = 0;
+	u32 latency, l1_max_latency = 0, l1_switch_latency = 0,
+		l0s_max_latency = 0;
 	struct aspm_latency *acceptable;
 	struct pcie_link_state *link;
 
@@ -447,15 +448,24 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 	acceptable = &link->acceptable[PCI_FUNC(endpoint->devfn)];
 
 	while (link) {
-		/* Check upstream direction L0s latency */
-		if ((link->aspm_capable & ASPM_STATE_L0S_UP) &&
-		    (link->latency_up.l0s > acceptable->l0s))
-			link->aspm_capable &= ~ASPM_STATE_L0S_UP;
-
-		/* Check downstream direction L0s latency */
-		if ((link->aspm_capable & ASPM_STATE_L0S_DW) &&
-		    (link->latency_dw.l0s > acceptable->l0s))
-			link->aspm_capable &= ~ASPM_STATE_L0S_DW;
+		if (link->aspm_capable & ASPM_STATE_L0S) {
+			u32 l0s_up = 0, l0s_dw = 0;
+
+			/* Check upstream direction L0s latency */
+			if (link->aspm_capable & ASPM_STATE_L0S_UP)
+				l0s_up = link->latency_up.l0s;
+
+			/* Check downstream direction L0s latency */
+			if (link->aspm_capable & ASPM_STATE_L0S_DW)
+				l0s_dw = link->latency_dw.l0s;
+
+			l0s_max_latency += max_t(u32, l0s_up, l0s_dw);
+
+			/* If the latency exceeds, disable both */
+			if (l0s_max_latency > acceptable->l0s)
+				link->aspm_capable &= ~ASPM_STATE_L0S;
+		}
+
 		/*
 		 * Check L1 latency.
 		 * Every switch on the path to root complex need 1
@@ -469,11 +479,13 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 		 * L1 exit latencies advertised by a device include L1
 		 * substate latencies (and hence do not do any check).
 		 */
-		latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
-		if ((link->aspm_capable & ASPM_STATE_L1) &&
-		    (latency + l1_switch_latency > acceptable->l1))
-			link->aspm_capable &= ~ASPM_STATE_L1;
-		l1_switch_latency += 1000;
+		if (link->aspm_capable & ASPM_STATE_L1) {
+			latency = max_t(u32, link->latency_up.l1, link->latency_dw.l1);
+			l1_max_latency = max_t(u32, latency, l1_max_latency);
+			if (l1_max_latency + l1_switch_latency > acceptable->l1)
+				link->aspm_capable &= ~ASPM_STATE_L1;
+			l1_switch_latency += 1000;
+		}
 
 		link = link->parent;
 	}
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-08-03 11:46 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-31 11:02 [PATCH v4 00/12] PCI: Remove '*val = 0' from pcie_capability_read_*() Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 01/12] IB/hfi1: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
2020-07-31 13:55   ` Bjorn Helgaas
2020-08-03 11:46     ` Ian Kumlien
2020-07-31 11:02 ` [PATCH v4 02/12] misc: rtsx: " Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 03/12] ath9k: " Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 04/12] iwlegacy: " Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 05/12] PCI: pciehp: Set "Power On" as the default get_power_status Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 06/12] PCI: pciehp: Check if pcie_capability_read_*() reads ~0 Saheed O. Bolarinwa
2020-07-31 11:02 ` [PATCH v4 07/12] PCI/ACPI: " Saheed O. Bolarinwa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).