All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v11 0/4] PCI: Use CRS Software Visibility to wait for device to become ready
@ 2017-08-18 21:31 ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:31 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

This is another rev of Sinan's series to fix an FLR issue seen with an
Intel 750 NVMe drive.  It seems that the 750 requires more time than we
currently allow after the FLR to become ready for use.

The biggest remaining issue is that I *think* this does not fix the issue
on systems where CRS Software Visibility is not supported or disabled.

Differences from v10:
  - Move 0x0001 test inside pci_bus_wait_crs()
  - Fix pre-existing "valid response before timeout" bug
  - Add message if a device becomes ready after we printed "waiting"
  - Reorder messages so we print "giving up after X" instead of
    "not ready after X; giving up after X"

---

Bjorn Helgaas (1):
      PCI: Don't ignore valid response before CRS timeout

Sinan Kaya (3):
      PCI: Factor out pci_bus_wait_crs()
      PCI: Handle CRS ("device not ready") returned by device after FLR
      PCI: Warn periodically while waiting for device to become ready


 drivers/pci/pci.c   |   19 ++++++++++++++--
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   61 +++++++++++++++++++++++++++++++++------------------
 3 files changed, 57 insertions(+), 24 deletions(-)


Incremental diff from Sinan's v10 posting:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c853551bc8d1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3824,22 +3824,18 @@ static void pci_flr_wait(struct pci_dev *dev)
 	bool ret;
 
 	/*
-	 * Don't touch the HW before waiting 100ms. HW has to finish
-	 * within 100ms according to PCI Express Base Specification
-	 * Revision 3.1 Section 6.6.2: Function-Level Reset (FLR).
+	 * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+	 * 100ms, but even after that, it may respond to config requests
+	 * with CRS status if it requires more time.
 	 */
 	msleep(100);
 
-	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID,
-				      &id))
+	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
 		return;
 
-	/* See if we can find a device via CRS first. */
-	if ((id & 0xffff) == 0x0001) {
-		ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
-		if (ret)
-			return;
-	}
+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;
 
 	do {
 		msleep(100);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1bbe85190183..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,8 +235,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
-		      int crs_timeout);
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f4cf7d0e25e..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,42 +1824,50 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 {
 	int delay = 1;
 
+	if ((*l & 0xffff) != 0x0001)
+		return true;	/* not a CRS completion */
+
+	if (!timeout)
+		return false;	/* CRS, but caller doesn't want to wait */
+
 	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
+	 * We got the reserved Vendor ID that indicates a completion with
+	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * valid Vendor ID or we time out.
 	 */
-	do {
+	while ((*l & 0xffff) == 0x0001) {
 		msleep(delay);
 		delay *= 2;
-		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+
+		if (delay > timeout) {
+			pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 			return false;
+		}
 
 		if (delay >= 1000)
-			pr_info("pci %04x:%02x:%02x.%d: not responding since %dms still polling\n",
+			pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
 				pci_domain_nr(bus), bus->number,
-				PCI_SLOT(devfn), PCI_FUNC(devfn), delay);
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 
-		/* Card hasn't responded in 60 seconds?  Must be stuck. */
-		if (delay > crs_timeout) {
-			pr_warn("pci %04x:%02x:%02x.%d: not responding %dms timeout reached\n",
-			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
-			       PCI_FUNC(devfn), crs_timeout);
+		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 			return false;
-		}
-	} while ((*l & 0xffff) == 0x0001);
+	}
 
+	if (delay >= 1000)
+		pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+			pci_domain_nr(bus), bus->number,
+			PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 	return true;
 }
-EXPORT_SYMBOL(pci_bus_wait_crs);
 
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+				int timeout)
 {
 	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 		return false;
@@ -1869,20 +1877,7 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 	    *l == 0x0000ffff || *l == 0xffff0000)
 		return false;
 
-	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
-	 */
-	if ((*l & 0xffff) == 0x0001) {
-		if (!crs_timeout)
-			return false;
-
-		return pci_bus_wait_crs(bus, devfn, l, crs_timeout);
-	}
-
-	return true;
+	return pci_bus_wait_crs(bus, devfn, l, timeout);
 }
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 0/4] PCI: Use CRS Software Visibility to wait for device to become ready
@ 2017-08-18 21:31 ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:31 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

This is another rev of Sinan's series to fix an FLR issue seen with an
Intel 750 NVMe drive.  It seems that the 750 requires more time than we
currently allow after the FLR to become ready for use.

The biggest remaining issue is that I *think* this does not fix the issue
on systems where CRS Software Visibility is not supported or disabled.

Differences from v10:
  - Move 0x0001 test inside pci_bus_wait_crs()
  - Fix pre-existing "valid response before timeout" bug
  - Add message if a device becomes ready after we printed "waiting"
  - Reorder messages so we print "giving up after X" instead of
    "not ready after X; giving up after X"

---

Bjorn Helgaas (1):
      PCI: Don't ignore valid response before CRS timeout

Sinan Kaya (3):
      PCI: Factor out pci_bus_wait_crs()
      PCI: Handle CRS ("device not ready") returned by device after FLR
      PCI: Warn periodically while waiting for device to become ready


 drivers/pci/pci.c   |   19 ++++++++++++++--
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   61 +++++++++++++++++++++++++++++++++------------------
 3 files changed, 57 insertions(+), 24 deletions(-)


Incremental diff from Sinan's v10 posting:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c853551bc8d1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3824,22 +3824,18 @@ static void pci_flr_wait(struct pci_dev *dev)
 	bool ret;
 
 	/*
-	 * Don't touch the HW before waiting 100ms. HW has to finish
-	 * within 100ms according to PCI Express Base Specification
-	 * Revision 3.1 Section 6.6.2: Function-Level Reset (FLR).
+	 * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+	 * 100ms, but even after that, it may respond to config requests
+	 * with CRS status if it requires more time.
 	 */
 	msleep(100);
 
-	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID,
-				      &id))
+	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
 		return;
 
-	/* See if we can find a device via CRS first. */
-	if ((id & 0xffff) == 0x0001) {
-		ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
-		if (ret)
-			return;
-	}
+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;
 
 	do {
 		msleep(100);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1bbe85190183..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,8 +235,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
-		      int crs_timeout);
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f4cf7d0e25e..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,42 +1824,50 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 {
 	int delay = 1;
 
+	if ((*l & 0xffff) != 0x0001)
+		return true;	/* not a CRS completion */
+
+	if (!timeout)
+		return false;	/* CRS, but caller doesn't want to wait */
+
 	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
+	 * We got the reserved Vendor ID that indicates a completion with
+	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * valid Vendor ID or we time out.
 	 */
-	do {
+	while ((*l & 0xffff) == 0x0001) {
 		msleep(delay);
 		delay *= 2;
-		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+
+		if (delay > timeout) {
+			pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 			return false;
+		}
 
 		if (delay >= 1000)
-			pr_info("pci %04x:%02x:%02x.%d: not responding since %dms still polling\n",
+			pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
 				pci_domain_nr(bus), bus->number,
-				PCI_SLOT(devfn), PCI_FUNC(devfn), delay);
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 
-		/* Card hasn't responded in 60 seconds?  Must be stuck. */
-		if (delay > crs_timeout) {
-			pr_warn("pci %04x:%02x:%02x.%d: not responding %dms timeout reached\n",
-			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
-			       PCI_FUNC(devfn), crs_timeout);
+		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 			return false;
-		}
-	} while ((*l & 0xffff) == 0x0001);
+	}
 
+	if (delay >= 1000)
+		pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+			pci_domain_nr(bus), bus->number,
+			PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 	return true;
 }
-EXPORT_SYMBOL(pci_bus_wait_crs);
 
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+				int timeout)
 {
 	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 		return false;
@@ -1869,20 +1877,7 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 	    *l == 0x0000ffff || *l == 0xffff0000)
 		return false;
 
-	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
-	 */
-	if ((*l & 0xffff) == 0x0001) {
-		if (!crs_timeout)
-			return false;
-
-		return pci_bus_wait_crs(bus, devfn, l, crs_timeout);
-	}
-
-	return true;
+	return pci_bus_wait_crs(bus, devfn, l, timeout);
 }
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 0/4] PCI: Use CRS Software Visibility to wait for device to become ready
@ 2017-08-18 21:31 ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:31 UTC (permalink / raw)
  To: linux-arm-kernel

This is another rev of Sinan's series to fix an FLR issue seen with an
Intel 750 NVMe drive.  It seems that the 750 requires more time than we
currently allow after the FLR to become ready for use.

The biggest remaining issue is that I *think* this does not fix the issue
on systems where CRS Software Visibility is not supported or disabled.

Differences from v10:
  - Move 0x0001 test inside pci_bus_wait_crs()
  - Fix pre-existing "valid response before timeout" bug
  - Add message if a device becomes ready after we printed "waiting"
  - Reorder messages so we print "giving up after X" instead of
    "not ready after X; giving up after X"

---

Bjorn Helgaas (1):
      PCI: Don't ignore valid response before CRS timeout

Sinan Kaya (3):
      PCI: Factor out pci_bus_wait_crs()
      PCI: Handle CRS ("device not ready") returned by device after FLR
      PCI: Warn periodically while waiting for device to become ready


 drivers/pci/pci.c   |   19 ++++++++++++++--
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   61 +++++++++++++++++++++++++++++++++------------------
 3 files changed, 57 insertions(+), 24 deletions(-)


Incremental diff from Sinan's v10 posting:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c853551bc8d1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3824,22 +3824,18 @@ static void pci_flr_wait(struct pci_dev *dev)
 	bool ret;
 
 	/*
-	 * Don't touch the HW before waiting 100ms. HW has to finish
-	 * within 100ms according to PCI Express Base Specification
-	 * Revision 3.1 Section 6.6.2: Function-Level Reset (FLR).
+	 * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+	 * 100ms, but even after that, it may respond to config requests
+	 * with CRS status if it requires more time.
 	 */
 	msleep(100);
 
-	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID,
-				      &id))
+	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
 		return;
 
-	/* See if we can find a device via CRS first. */
-	if ((id & 0xffff) == 0x0001) {
-		ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
-		if (ret)
-			return;
-	}
+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;
 
 	do {
 		msleep(100);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1bbe85190183..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,8 +235,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
-		      int crs_timeout);
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f4cf7d0e25e..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,42 +1824,50 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 {
 	int delay = 1;
 
+	if ((*l & 0xffff) != 0x0001)
+		return true;	/* not a CRS completion */
+
+	if (!timeout)
+		return false;	/* CRS, but caller doesn't want to wait */
+
 	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
+	 * We got the reserved Vendor ID that indicates a completion with
+	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * valid Vendor ID or we time out.
 	 */
-	do {
+	while ((*l & 0xffff) == 0x0001) {
 		msleep(delay);
 		delay *= 2;
-		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+
+		if (delay > timeout) {
+			pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 			return false;
+		}
 
 		if (delay >= 1000)
-			pr_info("pci %04x:%02x:%02x.%d: not responding since %dms still polling\n",
+			pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
 				pci_domain_nr(bus), bus->number,
-				PCI_SLOT(devfn), PCI_FUNC(devfn), delay);
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 
-		/* Card hasn't responded in 60 seconds?  Must be stuck. */
-		if (delay > crs_timeout) {
-			pr_warn("pci %04x:%02x:%02x.%d: not responding %dms timeout reached\n",
-			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
-			       PCI_FUNC(devfn), crs_timeout);
+		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 			return false;
-		}
-	} while ((*l & 0xffff) == 0x0001);
+	}
 
+	if (delay >= 1000)
+		pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+			pci_domain_nr(bus), bus->number,
+			PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 	return true;
 }
-EXPORT_SYMBOL(pci_bus_wait_crs);
 
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+				int timeout)
 {
 	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 		return false;
@@ -1869,20 +1877,7 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 	    *l == 0x0000ffff || *l == 0xffff0000)
 		return false;
 
-	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
-	 */
-	if ((*l & 0xffff) == 0x0001) {
-		if (!crs_timeout)
-			return false;
-
-		return pci_bus_wait_crs(bus, devfn, l, crs_timeout);
-	}
-
-	return true;
+	return pci_bus_wait_crs(bus, devfn, l, timeout);
 }
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
  2017-08-18 21:31 ` Bjorn Helgaas
  (?)
@ 2017-08-18 21:32   ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

While waiting for a device to become ready (i.e., to return a non-CRS
completion to a read of its Vendor ID), if we got a valid response to the
very last read before timing out, we printed a warning and gave up on the
device even though it was actually ready.

For a typical 60s timeout, we wait about 65s (it's not exact because of the
exponential backoff), but we treated devices that became ready between 33s
and 65s as though they failed.

Move the Device ID read later so we check whether the device is ready
immediately, before checking for a timeout.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/probe.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c31310db0404..08ea844ac4ba 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 
 		msleep(delay);
 		delay *= 2;
-		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
-			return false;
-		/* Card hasn't responded in 60 seconds?  Must be stuck. */
+
 		if (delay > crs_timeout) {
 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
 			       PCI_FUNC(devfn));
 			return false;
 		}
+
+		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+			return false;
 	}
 
 	return true;

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

While waiting for a device to become ready (i.e., to return a non-CRS
completion to a read of its Vendor ID), if we got a valid response to the
very last read before timing out, we printed a warning and gave up on the
device even though it was actually ready.

For a typical 60s timeout, we wait about 65s (it's not exact because of the
exponential backoff), but we treated devices that became ready between 33s
and 65s as though they failed.

Move the Device ID read later so we check whether the device is ready
immediately, before checking for a timeout.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/probe.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c31310db0404..08ea844ac4ba 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 
 		msleep(delay);
 		delay *= 2;
-		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
-			return false;
-		/* Card hasn't responded in 60 seconds?  Must be stuck. */
+
 		if (delay > crs_timeout) {
 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
 			       PCI_FUNC(devfn));
 			return false;
 		}
+
+		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+			return false;
 	}
 
 	return true;


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: linux-arm-kernel

While waiting for a device to become ready (i.e., to return a non-CRS
completion to a read of its Vendor ID), if we got a valid response to the
very last read before timing out, we printed a warning and gave up on the
device even though it was actually ready.

For a typical 60s timeout, we wait about 65s (it's not exact because of the
exponential backoff), but we treated devices that became ready between 33s
and 65s as though they failed.

Move the Device ID read later so we check whether the device is ready
immediately, before checking for a timeout.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/probe.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c31310db0404..08ea844ac4ba 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 
 		msleep(delay);
 		delay *= 2;
-		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
-			return false;
-		/* Card hasn't responded in 60 seconds?  Must be stuck. */
+
 		if (delay > crs_timeout) {
 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
 			       PCI_FUNC(devfn));
 			return false;
 		}
+
+		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+			return false;
 	}
 
 	return true;

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-18 21:31 ` Bjorn Helgaas
  (?)
@ 2017-08-18 21:32   ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

Configuration Request Retry Status (CRS) was previously hidden inside
pci_bus_read_dev_vendor_id().  We want to add support for CRS in other
situations, such as waiting for a device to become ready after a Function
Level Reset.

Move CRS handling into pci_bus_wait_crs() so it can be called from other
places.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: move CRS reserved Vendor ID test into pci_bus_wait_crs(), remove
EXPORT_SYMBOL]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   39 +++++++++++++++++++++++----------------
 2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 22e061738c6f..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,6 +235,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 08ea844ac4ba..342a86640c6b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,33 +1824,26 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 {
 	int delay = 1;
 
-	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
-		return false;
+	if ((*l & 0xffff) != 0x0001)
+		return true;	/* not a CRS completion */
 
-	/* some broken boards return 0 or ~0 if a slot is empty: */
-	if (*l == 0xffffffff || *l == 0x00000000 ||
-	    *l == 0x0000ffff || *l == 0xffff0000)
-		return false;
+	if (!timeout)
+		return false;	/* CRS, but caller doesn't want to wait */
 
 	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
+	 * We got the reserved Vendor ID that indicates a completion with
+	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * valid Vendor ID or we time out.
 	 */
 	while ((*l & 0xffff) == 0x0001) {
-		if (!crs_timeout)
-			return false;
-
 		msleep(delay);
 		delay *= 2;
 
-		if (delay > crs_timeout) {
+		if (delay > timeout) {
 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
 			       PCI_FUNC(devfn));
@@ -1863,6 +1856,20 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 
 	return true;
 }
+
+bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+				int timeout)
+{
+	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+		return false;
+
+	/* some broken boards return 0 or ~0 if a slot is empty: */
+	if (*l == 0xffffffff || *l == 0x00000000 ||
+	    *l == 0x0000ffff || *l == 0xffff0000)
+		return false;
+
+	return pci_bus_wait_crs(bus, devfn, l, timeout);
+}
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 
 /*

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

Configuration Request Retry Status (CRS) was previously hidden inside
pci_bus_read_dev_vendor_id().  We want to add support for CRS in other
situations, such as waiting for a device to become ready after a Function
Level Reset.

Move CRS handling into pci_bus_wait_crs() so it can be called from other
places.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: move CRS reserved Vendor ID test into pci_bus_wait_crs(), remove
EXPORT_SYMBOL]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   39 +++++++++++++++++++++++----------------
 2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 22e061738c6f..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,6 +235,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 08ea844ac4ba..342a86640c6b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,33 +1824,26 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 {
 	int delay = 1;
 
-	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
-		return false;
+	if ((*l & 0xffff) != 0x0001)
+		return true;	/* not a CRS completion */
 
-	/* some broken boards return 0 or ~0 if a slot is empty: */
-	if (*l == 0xffffffff || *l == 0x00000000 ||
-	    *l == 0x0000ffff || *l == 0xffff0000)
-		return false;
+	if (!timeout)
+		return false;	/* CRS, but caller doesn't want to wait */
 
 	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
+	 * We got the reserved Vendor ID that indicates a completion with
+	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * valid Vendor ID or we time out.
 	 */
 	while ((*l & 0xffff) == 0x0001) {
-		if (!crs_timeout)
-			return false;
-
 		msleep(delay);
 		delay *= 2;
 
-		if (delay > crs_timeout) {
+		if (delay > timeout) {
 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
 			       PCI_FUNC(devfn));
@@ -1863,6 +1856,20 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 
 	return true;
 }
+
+bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+				int timeout)
+{
+	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+		return false;
+
+	/* some broken boards return 0 or ~0 if a slot is empty: */
+	if (*l == 0xffffffff || *l == 0x00000000 ||
+	    *l == 0x0000ffff || *l == 0xffff0000)
+		return false;
+
+	return pci_bus_wait_crs(bus, devfn, l, timeout);
+}
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 
 /*


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

Configuration Request Retry Status (CRS) was previously hidden inside
pci_bus_read_dev_vendor_id().  We want to add support for CRS in other
situations, such as waiting for a device to become ready after a Function
Level Reset.

Move CRS handling into pci_bus_wait_crs() so it can be called from other
places.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: move CRS reserved Vendor ID test into pci_bus_wait_crs(), remove
EXPORT_SYMBOL]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.h   |    1 +
 drivers/pci/probe.c |   39 +++++++++++++++++++++++----------------
 2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 22e061738c6f..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,6 +235,7 @@ enum pci_bar_type {
 	pci_bar_mem64,		/* A 64-bit memory BAR */
 };
 
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 08ea844ac4ba..342a86640c6b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,33 +1824,26 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
-				int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 {
 	int delay = 1;
 
-	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
-		return false;
+	if ((*l & 0xffff) != 0x0001)
+		return true;	/* not a CRS completion */
 
-	/* some broken boards return 0 or ~0 if a slot is empty: */
-	if (*l == 0xffffffff || *l == 0x00000000 ||
-	    *l == 0x0000ffff || *l == 0xffff0000)
-		return false;
+	if (!timeout)
+		return false;	/* CRS, but caller doesn't want to wait */
 
 	/*
-	 * Configuration Request Retry Status.  Some root ports return the
-	 * actual device ID instead of the synthetic ID (0xFFFF) required
-	 * by the PCIe spec.  Ignore the device ID and only check for
-	 * (vendor id == 1).
+	 * We got the reserved Vendor ID that indicates a completion with
+	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * valid Vendor ID or we time out.
 	 */
 	while ((*l & 0xffff) == 0x0001) {
-		if (!crs_timeout)
-			return false;
-
 		msleep(delay);
 		delay *= 2;
 
-		if (delay > crs_timeout) {
+		if (delay > timeout) {
 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
 			       PCI_FUNC(devfn));
@@ -1863,6 +1856,20 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 
 	return true;
 }
+
+bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+				int timeout)
+{
+	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+		return false;
+
+	/* some broken boards return 0 or ~0 if a slot is empty: */
+	if (*l == 0xffffffff || *l == 0x00000000 ||
+	    *l == 0x0000ffff || *l == 0xffff0000)
+		return false;
+
+	return pci_bus_wait_crs(bus, devfn, l, timeout);
+}
 EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);
 
 /*

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 3/4] PCI: Handle CRS ("device not ready") returned by device after FLR
  2017-08-18 21:31 ` Bjorn Helgaas
  (?)
@ 2017-08-18 21:32   ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

XXX I think this needs to somehow use the same timeout for the PCI_COMMAND
loop as for the CRS part, so this works on machines without CRS Software
Visibility. -- bhelgaas

Sporadic reset issues have been observed with Intel 750 NVMe drive while
assigning the physical function to the guest machine.  The sequence of
events observed is as follows:

  - perform a Function Level Reset (FLR)
  - sleep up to 1000ms total
  - read ~0 from PCI_COMMAND
  - warn that the device didn't return from FLR
  - touch the device before it's ready
  - device drops config writes when we restore register settings
  - incomplete register restore leaves device in inconsistent state
  - device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests.  See PCIe r3.1, sec 2.3.1 and 6.6.2.

After an FLR, read the Vendor ID and use pci_bus_wait_crs() to wait for a
value that indicates the device is ready.

If pci_bus_wait_crs() fails, i.e., the device has responded with CRS status
for at least the timeout interval, fall back to the old behavior of reading
the Command register until it is not ~0.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: changelog, adjust for Vendor ID test being inside
pci_bus_wait_crs(), drop PCI_COMMAND tweaks]
Not-Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index af0cc3456dc1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3821,17 +3821,32 @@ static void pci_flr_wait(struct pci_dev *dev)
 {
 	int i = 0;
 	u32 id;
+	bool ret;
+
+	/*
+	 * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+	 * 100ms, but even after that, it may respond to config requests
+	 * with CRS status if it requires more time.
+	 */
+	msleep(100);
+
+	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
+		return;
+
+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;
 
 	do {
 		msleep(100);
 		pci_read_config_dword(dev, PCI_COMMAND, &id);
-	} while (i++ < 10 && id == ~0);
+	} while (i++ < 9 && id == ~0);
 
 	if (id == ~0)
 		dev_warn(&dev->dev, "Failed to return from FLR\n");
 	else if (i > 1)
 		dev_info(&dev->dev, "Required additional %dms to return from FLR\n",
-			 (i - 1) * 100);
+			 i * 100);
 }
 
 /**

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 3/4] PCI: Handle CRS ("device not ready") returned by device after FLR
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

XXX I think this needs to somehow use the same timeout for the PCI_COMMAND
loop as for the CRS part, so this works on machines without CRS Software
Visibility. -- bhelgaas

Sporadic reset issues have been observed with Intel 750 NVMe drive while
assigning the physical function to the guest machine.  The sequence of
events observed is as follows:

  - perform a Function Level Reset (FLR)
  - sleep up to 1000ms total
  - read ~0 from PCI_COMMAND
  - warn that the device didn't return from FLR
  - touch the device before it's ready
  - device drops config writes when we restore register settings
  - incomplete register restore leaves device in inconsistent state
  - device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests.  See PCIe r3.1, sec 2.3.1 and 6.6.2.

After an FLR, read the Vendor ID and use pci_bus_wait_crs() to wait for a
value that indicates the device is ready.

If pci_bus_wait_crs() fails, i.e., the device has responded with CRS status
for at least the timeout interval, fall back to the old behavior of reading
the Command register until it is not ~0.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: changelog, adjust for Vendor ID test being inside
pci_bus_wait_crs(), drop PCI_COMMAND tweaks]
Not-Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index af0cc3456dc1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3821,17 +3821,32 @@ static void pci_flr_wait(struct pci_dev *dev)
 {
 	int i = 0;
 	u32 id;
+	bool ret;
+
+	/*
+	 * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+	 * 100ms, but even after that, it may respond to config requests
+	 * with CRS status if it requires more time.
+	 */
+	msleep(100);
+
+	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
+		return;
+
+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;
 
 	do {
 		msleep(100);
 		pci_read_config_dword(dev, PCI_COMMAND, &id);
-	} while (i++ < 10 && id == ~0);
+	} while (i++ < 9 && id == ~0);
 
 	if (id == ~0)
 		dev_warn(&dev->dev, "Failed to return from FLR\n");
 	else if (i > 1)
 		dev_info(&dev->dev, "Required additional %dms to return from FLR\n",
-			 (i - 1) * 100);
+			 i * 100);
 }
 
 /**


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 3/4] PCI: Handle CRS ("device not ready") returned by device after FLR
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

XXX I think this needs to somehow use the same timeout for the PCI_COMMAND
loop as for the CRS part, so this works on machines without CRS Software
Visibility. -- bhelgaas

Sporadic reset issues have been observed with Intel 750 NVMe drive while
assigning the physical function to the guest machine.  The sequence of
events observed is as follows:

  - perform a Function Level Reset (FLR)
  - sleep up to 1000ms total
  - read ~0 from PCI_COMMAND
  - warn that the device didn't return from FLR
  - touch the device before it's ready
  - device drops config writes when we restore register settings
  - incomplete register restore leaves device in inconsistent state
  - device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests.  See PCIe r3.1, sec 2.3.1 and 6.6.2.

After an FLR, read the Vendor ID and use pci_bus_wait_crs() to wait for a
value that indicates the device is ready.

If pci_bus_wait_crs() fails, i.e., the device has responded with CRS status
for at least the timeout interval, fall back to the old behavior of reading
the Command register until it is not ~0.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: changelog, adjust for Vendor ID test being inside
pci_bus_wait_crs(), drop PCI_COMMAND tweaks]
Not-Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index af0cc3456dc1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3821,17 +3821,32 @@ static void pci_flr_wait(struct pci_dev *dev)
 {
 	int i = 0;
 	u32 id;
+	bool ret;
+
+	/*
+	 * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+	 * 100ms, but even after that, it may respond to config requests
+	 * with CRS status if it requires more time.
+	 */
+	msleep(100);
+
+	if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
+		return;
+
+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;
 
 	do {
 		msleep(100);
 		pci_read_config_dword(dev, PCI_COMMAND, &id);
-	} while (i++ < 10 && id == ~0);
+	} while (i++ < 9 && id == ~0);
 
 	if (id == ~0)
 		dev_warn(&dev->dev, "Failed to return from FLR\n");
 	else if (i > 1)
 		dev_info(&dev->dev, "Required additional %dms to return from FLR\n",
-			 (i - 1) * 100);
+			 i * 100);
 }
 
 /**

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 4/4] PCI: Warn periodically while waiting for device to become ready
  2017-08-18 21:31 ` Bjorn Helgaas
@ 2017-08-18 21:32   ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

Add a print statement in pci_bus_wait_crs() so that user observes the
progress of device polling instead of silently waiting for timeout to be
reached.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: check for timeout first so we don't print "waiting, giving up",
always print time we've slept (not the actual timeout, print a "ready"
message if we've printed a "waiting" message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/probe.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 342a86640c6b..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1844,16 +1844,25 @@ bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 		delay *= 2;
 
 		if (delay > timeout) {
-			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
-			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
-			       PCI_FUNC(devfn));
+			pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 			return false;
 		}
 
+		if (delay >= 1000)
+			pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
+
 		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 			return false;
 	}
 
+	if (delay >= 1000)
+		pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+			pci_domain_nr(bus), bus->number,
+			PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 	return true;
 }
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v11 4/4] PCI: Warn periodically while waiting for device to become ready
@ 2017-08-18 21:32   ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-18 21:32 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sinan Kaya <okaya@codeaurora.org>

Add a print statement in pci_bus_wait_crs() so that user observes the
progress of device polling instead of silently waiting for timeout to be
reached.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: check for timeout first so we don't print "waiting, giving up",
always print time we've slept (not the actual timeout, print a "ready"
message if we've printed a "waiting" message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/probe.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 342a86640c6b..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1844,16 +1844,25 @@ bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
 		delay *= 2;
 
 		if (delay > timeout) {
-			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
-			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
-			       PCI_FUNC(devfn));
+			pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 			return false;
 		}
 
+		if (delay >= 1000)
+			pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
+				pci_domain_nr(bus), bus->number,
+				PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
+
 		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 			return false;
 	}
 
+	if (delay >= 1000)
+		pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+			pci_domain_nr(bus), bus->number,
+			PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
 	return true;
 }
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-18 21:32   ` Bjorn Helgaas
  (?)
@ 2017-08-21 13:53     ` Sinan Kaya
  -1 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 13:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> +	if ((*l & 0xffff) != 0x0001)
> +		return true;	/* not a CRS completion */
>  

This version certainly looks cleaner. However, it breaks pci_flr_wait().

If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
function returns true. pci_flr_wait() prematurely bails out from here.


pci_flr_wait()
{

+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;

}

We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
function. 

That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
function that would check for the magic 0x0001 value and return true. Otherwise, false. 

pci_bus_read_dev_vendor_id() would do this

pci_bus_read_dev_vendor_id()
{
	...
	if (pci_bus_crs_visibility_supported())
		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

	return true
}

Similar pattern for pci_flr_wait().


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 13:53     ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 13:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> +	if ((*l & 0xffff) != 0x0001)
> +		return true;	/* not a CRS completion */
>  

This version certainly looks cleaner. However, it breaks pci_flr_wait().

If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
function returns true. pci_flr_wait() prematurely bails out from here.


pci_flr_wait()
{

+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;

}

We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
function. 

That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
function that would check for the magic 0x0001 value and return true. Otherwise, false. 

pci_bus_read_dev_vendor_id() would do this

pci_bus_read_dev_vendor_id()
{
	...
	if (pci_bus_crs_visibility_supported())
		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

	return true
}

Similar pattern for pci_flr_wait().


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 13:53     ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 13:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> +	if ((*l & 0xffff) != 0x0001)
> +		return true;	/* not a CRS completion */
>  

This version certainly looks cleaner. However, it breaks pci_flr_wait().

If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
function returns true. pci_flr_wait() prematurely bails out from here.


pci_flr_wait()
{

+	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+	if (ret)
+		return;

}

We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
function. 

That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
function that would check for the magic 0x0001 value and return true. Otherwise, false. 

pci_bus_read_dev_vendor_id() would do this

pci_bus_read_dev_vendor_id()
{
	...
	if (pci_bus_crs_visibility_supported())
		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

	return true
}

Similar pattern for pci_flr_wait().


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
  2017-08-18 21:32   ` Bjorn Helgaas
  (?)
@ 2017-08-21 14:02     ` Sinan Kaya
  -1 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 14:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> While waiting for a device to become ready (i.e., to return a non-CRS
> completion to a read of its Vendor ID), if we got a valid response to the
> very last read before timing out, we printed a warning and gave up on the
> device even though it was actually ready.
> 
> For a typical 60s timeout, we wait about 65s (it's not exact because of the
> exponential backoff), but we treated devices that became ready between 33s
> and 65s as though they failed.
> 
> Move the Device ID read later so we check whether the device is ready
> immediately, before checking for a timeout.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/probe.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c31310db0404..08ea844ac4ba 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
>  
>  		msleep(delay);
>  		delay *= 2;
> -		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> -			return false;
> -		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> +

There is still a problem here. We'll wait some time above and return without checking if
we actually found the card or not.

>  		if (delay > crs_timeout) {
>  			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
>  			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
>  			       PCI_FUNC(devfn));
>  			return false;
>  		}
> +
> +		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> +			return false;
>  	}
>  
>  	return true;
> 
> 

Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
by copy paste.

	while ((*l & 0xffff) == 0x0001) {
		if (!crs_timeout)
			return false;
 
		/* Card hasn't responded in 60 seconds?  Must be stuck. */
		if (delay > crs_timeout) {
			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
			       PCI_FUNC(devfn));
			return false;
		} 
		msleep(delay);
		delay *= 2;
		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
			return false; 
	}

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
@ 2017-08-21 14:02     ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 14:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, linux-arm-kernel

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> While waiting for a device to become ready (i.e., to return a non-CRS
> completion to a read of its Vendor ID), if we got a valid response to the
> very last read before timing out, we printed a warning and gave up on the
> device even though it was actually ready.
> 
> For a typical 60s timeout, we wait about 65s (it's not exact because of the
> exponential backoff), but we treated devices that became ready between 33s
> and 65s as though they failed.
> 
> Move the Device ID read later so we check whether the device is ready
> immediately, before checking for a timeout.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/probe.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c31310db0404..08ea844ac4ba 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
>  
>  		msleep(delay);
>  		delay *= 2;
> -		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> -			return false;
> -		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> +

There is still a problem here. We'll wait some time above and return without checking if
we actually found the card or not.

>  		if (delay > crs_timeout) {
>  			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
>  			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
>  			       PCI_FUNC(devfn));
>  			return false;
>  		}
> +
> +		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> +			return false;
>  	}
>  
>  	return true;
> 
> 

Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
by copy paste.

	while ((*l & 0xffff) == 0x0001) {
		if (!crs_timeout)
			return false;
 
		/* Card hasn't responded in 60 seconds?  Must be stuck. */
		if (delay > crs_timeout) {
			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
			       PCI_FUNC(devfn));
			return false;
		} 
		msleep(delay);
		delay *= 2;
		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
			return false; 
	}

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
@ 2017-08-21 14:02     ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 14:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> While waiting for a device to become ready (i.e., to return a non-CRS
> completion to a read of its Vendor ID), if we got a valid response to the
> very last read before timing out, we printed a warning and gave up on the
> device even though it was actually ready.
> 
> For a typical 60s timeout, we wait about 65s (it's not exact because of the
> exponential backoff), but we treated devices that became ready between 33s
> and 65s as though they failed.
> 
> Move the Device ID read later so we check whether the device is ready
> immediately, before checking for a timeout.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/probe.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c31310db0404..08ea844ac4ba 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
>  
>  		msleep(delay);
>  		delay *= 2;
> -		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> -			return false;
> -		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> +

There is still a problem here. We'll wait some time above and return without checking if
we actually found the card or not.

>  		if (delay > crs_timeout) {
>  			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
>  			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
>  			       PCI_FUNC(devfn));
>  			return false;
>  		}
> +
> +		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> +			return false;
>  	}
>  
>  	return true;
> 
> 

Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
by copy paste.

	while ((*l & 0xffff) == 0x0001) {
		if (!crs_timeout)
			return false;
 
		/* Card hasn't responded in 60 seconds?  Must be stuck. */
		if (delay > crs_timeout) {
			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
			       PCI_FUNC(devfn));
			return false;
		} 
		msleep(delay);
		delay *= 2;
		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
			return false; 
	}

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
  2017-08-21 14:02     ` Sinan Kaya
  (?)
@ 2017-08-21 17:44       ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 17:44 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On Mon, Aug 21, 2017 at 10:02:55AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > While waiting for a device to become ready (i.e., to return a non-CRS
> > completion to a read of its Vendor ID), if we got a valid response to the
> > very last read before timing out, we printed a warning and gave up on the
> > device even though it was actually ready.
> > 
> > For a typical 60s timeout, we wait about 65s (it's not exact because of the
> > exponential backoff), but we treated devices that became ready between 33s
> > and 65s as though they failed.
> > 
> > Move the Device ID read later so we check whether the device is ready
> > immediately, before checking for a timeout.
> > 
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  drivers/pci/probe.c |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index c31310db0404..08ea844ac4ba 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> >  
> >  		msleep(delay);
> >  		delay *= 2;
> > -		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > -			return false;
> > -		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> > +
> 
> There is still a problem here. We'll wait some time above and return without checking if
> we actually found the card or not.
> 
> >  		if (delay > crs_timeout) {
> >  			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> >  			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> >  			       PCI_FUNC(devfn));
> >  			return false;
> >  		}
> > +
> > +		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > +			return false;
> >  	}
> >  
> >  	return true;
> > 
> > 
> 
> Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
> by copy paste.

Yep, this is much better, thanks!

> 	while ((*l & 0xffff) == 0x0001) {
> 		if (!crs_timeout)
> 			return false;
>  
> 		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> 		if (delay > crs_timeout) {
> 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> 			       PCI_FUNC(devfn));
> 			return false;
> 		} 
> 		msleep(delay);
> 		delay *= 2;
> 		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> 			return false; 
> 	}
> 
> -- 
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
@ 2017-08-21 17:44       ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 17:44 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, Bjorn Helgaas, linux-arm-kernel

On Mon, Aug 21, 2017 at 10:02:55AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > While waiting for a device to become ready (i.e., to return a non-CRS
> > completion to a read of its Vendor ID), if we got a valid response to the
> > very last read before timing out, we printed a warning and gave up on the
> > device even though it was actually ready.
> > 
> > For a typical 60s timeout, we wait about 65s (it's not exact because of the
> > exponential backoff), but we treated devices that became ready between 33s
> > and 65s as though they failed.
> > 
> > Move the Device ID read later so we check whether the device is ready
> > immediately, before checking for a timeout.
> > 
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  drivers/pci/probe.c |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index c31310db0404..08ea844ac4ba 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> >  
> >  		msleep(delay);
> >  		delay *= 2;
> > -		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > -			return false;
> > -		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> > +
> 
> There is still a problem here. We'll wait some time above and return without checking if
> we actually found the card or not.
> 
> >  		if (delay > crs_timeout) {
> >  			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> >  			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> >  			       PCI_FUNC(devfn));
> >  			return false;
> >  		}
> > +
> > +		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > +			return false;
> >  	}
> >  
> >  	return true;
> > 
> > 
> 
> Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
> by copy paste.

Yep, this is much better, thanks!

> 	while ((*l & 0xffff) == 0x0001) {
> 		if (!crs_timeout)
> 			return false;
>  
> 		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> 		if (delay > crs_timeout) {
> 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> 			       PCI_FUNC(devfn));
> 			return false;
> 		} 
> 		msleep(delay);
> 		delay *= 2;
> 		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> 			return false; 
> 	}
> 
> -- 
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout
@ 2017-08-21 17:44       ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 17:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 21, 2017 at 10:02:55AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > While waiting for a device to become ready (i.e., to return a non-CRS
> > completion to a read of its Vendor ID), if we got a valid response to the
> > very last read before timing out, we printed a warning and gave up on the
> > device even though it was actually ready.
> > 
> > For a typical 60s timeout, we wait about 65s (it's not exact because of the
> > exponential backoff), but we treated devices that became ready between 33s
> > and 65s as though they failed.
> > 
> > Move the Device ID read later so we check whether the device is ready
> > immediately, before checking for a timeout.
> > 
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  drivers/pci/probe.c |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index c31310db0404..08ea844ac4ba 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> >  
> >  		msleep(delay);
> >  		delay *= 2;
> > -		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > -			return false;
> > -		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> > +
> 
> There is still a problem here. We'll wait some time above and return without checking if
> we actually found the card or not.
> 
> >  		if (delay > crs_timeout) {
> >  			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> >  			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> >  			       PCI_FUNC(devfn));
> >  			return false;
> >  		}
> > +
> > +		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > +			return false;
> >  	}
> >  
> >  	return true;
> > 
> > 
> 
> Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
> by copy paste.

Yep, this is much better, thanks!

> 	while ((*l & 0xffff) == 0x0001) {
> 		if (!crs_timeout)
> 			return false;
>  
> 		/* Card hasn't responded in 60 seconds?  Must be stuck. */
> 		if (delay > crs_timeout) {
> 			printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> 			       pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> 			       PCI_FUNC(devfn));
> 			return false;
> 		} 
> 		msleep(delay);
> 		delay *= 2;
> 		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> 			return false; 
> 	}
> 
> -- 
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-21 13:53     ` Sinan Kaya
  (?)
@ 2017-08-21 19:18       ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 19:18 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On Mon, Aug 21, 2017 at 09:53:56AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > +	if ((*l & 0xffff) != 0x0001)
> > +		return true;	/* not a CRS completion */
> >  
> 
> This version certainly looks cleaner. However, it breaks pci_flr_wait().
> 
> If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
> function returns true. pci_flr_wait() prematurely bails out from here.
> 
> 
> pci_flr_wait()
> {
> 
> +	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> +	if (ret)
> +		return;
> 
> }
> 
> We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
> function. 
> 
> That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
> function that would check for the magic 0x0001 value and return true. Otherwise, false. 
> 
> pci_bus_read_dev_vendor_id() would do this
> 
> pci_bus_read_dev_vendor_id()
> {
> 	...
> 	if (pci_bus_crs_visibility_supported())
> 		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> 
> 	return true
> }
> 
> Similar pattern for pci_flr_wait().

I think that makes sense.  We'd want to check for CRS SV being
enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
in the root port's pci_dev, and check it with something like what
pcie_root_rcb_set() does?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 19:18       ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 19:18 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, Bjorn Helgaas, linux-arm-kernel

On Mon, Aug 21, 2017 at 09:53:56AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > +	if ((*l & 0xffff) != 0x0001)
> > +		return true;	/* not a CRS completion */
> >  
> 
> This version certainly looks cleaner. However, it breaks pci_flr_wait().
> 
> If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
> function returns true. pci_flr_wait() prematurely bails out from here.
> 
> 
> pci_flr_wait()
> {
> 
> +	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> +	if (ret)
> +		return;
> 
> }
> 
> We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
> function. 
> 
> That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
> function that would check for the magic 0x0001 value and return true. Otherwise, false. 
> 
> pci_bus_read_dev_vendor_id() would do this
> 
> pci_bus_read_dev_vendor_id()
> {
> 	...
> 	if (pci_bus_crs_visibility_supported())
> 		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> 
> 	return true
> }
> 
> Similar pattern for pci_flr_wait().

I think that makes sense.  We'd want to check for CRS SV being
enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
in the root port's pci_dev, and check it with something like what
pcie_root_rcb_set() does?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 19:18       ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 19:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 21, 2017 at 09:53:56AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > +	if ((*l & 0xffff) != 0x0001)
> > +		return true;	/* not a CRS completion */
> >  
> 
> This version certainly looks cleaner. However, it breaks pci_flr_wait().
> 
> If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
> function returns true. pci_flr_wait() prematurely bails out from here.
> 
> 
> pci_flr_wait()
> {
> 
> +	ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> +	if (ret)
> +		return;
> 
> }
> 
> We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
> function. 
> 
> That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
> function that would check for the magic 0x0001 value and return true. Otherwise, false. 
> 
> pci_bus_read_dev_vendor_id() would do this
> 
> pci_bus_read_dev_vendor_id()
> {
> 	...
> 	if (pci_bus_crs_visibility_supported())
> 		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> 
> 	return true
> }
> 
> Similar pattern for pci_flr_wait().

I think that makes sense.  We'd want to check for CRS SV being
enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
in the root port's pci_dev, and check it with something like what
pcie_root_rcb_set() does?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-21 19:18       ` Bjorn Helgaas
@ 2017-08-21 19:37         ` Sinan Kaya
  -1 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 19:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> pci_bus_read_dev_vendor_id()
>> {
>> 	...
>> 	if (pci_bus_crs_visibility_supported())
>> 		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>>
>> 	return true
>> }
>>
>> Similar pattern for pci_flr_wait().

Sorry for the poor choice of function name. 

I was thinking of something like this.

bool pci_bus_crs_pending(u32 l)
{
	return (l & 0xFFFF) == 0x0001
}

if (pci_bus_crs_pending(id))
	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

> I think that makes sense.  We'd want to check for CRS SV being
> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> in the root port's pci_dev, and check it with something like what
> pcie_root_rcb_set() does?
> 

You can observe CRS under the following conditions

1. root port <-> endpoint 
2. bridge <-> endpoint 
3. root port<->bridge

I was relying on the fact that we are reading 0x001 as an indication that
this device detected CRS. Maybe, this is too indirect.

If we also want to capture the capability, I think the right thing is to
check the parent capability.

bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
{
	if (device type(bridge) == root port)
		return read(root_crs_register_reg);

	if (device type(bridge) == switch)
		return read(switch_crs_register);

	return false;
}

bool pci_bus_crs_pending(struct pci_dev *dev, u32 l)
{
	if !pci_bus_crs_vis_supported(dev->parent)
		return false;

	return (l & 0xFFFF) == 0x0001;
}

I'll prototype this. Let me know if you have concerns.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 19:37         ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 19:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> pci_bus_read_dev_vendor_id()
>> {
>> 	...
>> 	if (pci_bus_crs_visibility_supported())
>> 		return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>>
>> 	return true
>> }
>>
>> Similar pattern for pci_flr_wait().

Sorry for the poor choice of function name. 

I was thinking of something like this.

bool pci_bus_crs_pending(u32 l)
{
	return (l & 0xFFFF) == 0x0001
}

if (pci_bus_crs_pending(id))
	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

> I think that makes sense.  We'd want to check for CRS SV being
> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> in the root port's pci_dev, and check it with something like what
> pcie_root_rcb_set() does?
> 

You can observe CRS under the following conditions

1. root port <-> endpoint 
2. bridge <-> endpoint 
3. root port<->bridge

I was relying on the fact that we are reading 0x001 as an indication that
this device detected CRS. Maybe, this is too indirect.

If we also want to capture the capability, I think the right thing is to
check the parent capability.

bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
{
	if (device type(bridge) == root port)
		return read(root_crs_register_reg);

	if (device type(bridge) == switch)
		return read(switch_crs_register);

	return false;
}

bool pci_bus_crs_pending(struct pci_dev *dev, u32 l)
{
	if !pci_bus_crs_vis_supported(dev->parent)
		return false;

	return (l & 0xFFFF) == 0x0001;
}

I'll prototype this. Let me know if you have concerns.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-21 19:37         ` Sinan Kaya
  (?)
@ 2017-08-21 20:23           ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 20:23 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> ...
> if (pci_bus_crs_pending(id))
> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> 
> > I think that makes sense.  We'd want to check for CRS SV being
> > enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> > pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> > in the root port's pci_dev, and check it with something like what
> > pcie_root_rcb_set() does?
> > 
> 
> You can observe CRS under the following conditions
> 
> 1. root port <-> endpoint 
> 2. bridge <-> endpoint 
> 3. root port<->bridge
> 
> I was relying on the fact that we are reading 0x001 as an indication that
> this device detected CRS. Maybe, this is too indirect.
> 
> If we also want to capture the capability, I think the right thing is to
> check the parent capability.
> 
> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> {
> 	if (device type(bridge) == root port)
> 		return read(root_crs_register_reg);
> 
> 	if (device type(bridge) == switch)
> 		return read(switch_crs_register);

I don't understand this part.  AFAIK, CRS SV is only a feature of root
ports.  The capability and enable bits are in the Root Capabilities
and Root Control registers.

It's certainly true that a device below a switch can respond with a
CRS completion, but the switch is not the requester, and my
understanding is that it would not take any action on the completion
other than passing it upstream.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 20:23           ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 20:23 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, Bjorn Helgaas, linux-arm-kernel

On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> ...
> if (pci_bus_crs_pending(id))
> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> 
> > I think that makes sense.  We'd want to check for CRS SV being
> > enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> > pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> > in the root port's pci_dev, and check it with something like what
> > pcie_root_rcb_set() does?
> > 
> 
> You can observe CRS under the following conditions
> 
> 1. root port <-> endpoint 
> 2. bridge <-> endpoint 
> 3. root port<->bridge
> 
> I was relying on the fact that we are reading 0x001 as an indication that
> this device detected CRS. Maybe, this is too indirect.
> 
> If we also want to capture the capability, I think the right thing is to
> check the parent capability.
> 
> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> {
> 	if (device type(bridge) == root port)
> 		return read(root_crs_register_reg);
> 
> 	if (device type(bridge) == switch)
> 		return read(switch_crs_register);

I don't understand this part.  AFAIK, CRS SV is only a feature of root
ports.  The capability and enable bits are in the Root Capabilities
and Root Control registers.

It's certainly true that a device below a switch can respond with a
CRS completion, but the switch is not the requester, and my
understanding is that it would not take any action on the completion
other than passing it upstream.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 20:23           ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 20:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> ...
> if (pci_bus_crs_pending(id))
> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> 
> > I think that makes sense.  We'd want to check for CRS SV being
> > enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> > pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> > in the root port's pci_dev, and check it with something like what
> > pcie_root_rcb_set() does?
> > 
> 
> You can observe CRS under the following conditions
> 
> 1. root port <-> endpoint 
> 2. bridge <-> endpoint 
> 3. root port<->bridge
> 
> I was relying on the fact that we are reading 0x001 as an indication that
> this device detected CRS. Maybe, this is too indirect.
> 
> If we also want to capture the capability, I think the right thing is to
> check the parent capability.
> 
> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> {
> 	if (device type(bridge) == root port)
> 		return read(root_crs_register_reg);
> 
> 	if (device type(bridge) == switch)
> 		return read(switch_crs_register);

I don't understand this part.  AFAIK, CRS SV is only a feature of root
ports.  The capability and enable bits are in the Root Capabilities
and Root Control registers.

It's certainly true that a device below a switch can respond with a
CRS completion, but the switch is not the requester, and my
understanding is that it would not take any action on the completion
other than passing it upstream.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-21 20:23           ` Bjorn Helgaas
@ 2017-08-21 20:32             ` Sinan Kaya
  -1 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 20:32 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
>> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> ...
>> if (pci_bus_crs_pending(id))
>> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>>
>>> I think that makes sense.  We'd want to check for CRS SV being
>>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
>>> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
>>> in the root port's pci_dev, and check it with something like what
>>> pcie_root_rcb_set() does?
>>>
>>
>> You can observe CRS under the following conditions
>>
>> 1. root port <-> endpoint 
>> 2. bridge <-> endpoint 
>> 3. root port<->bridge
>>
>> I was relying on the fact that we are reading 0x001 as an indication that
>> this device detected CRS. Maybe, this is too indirect.
>>
>> If we also want to capture the capability, I think the right thing is to
>> check the parent capability.
>>
>> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
>> {
>> 	if (device type(bridge) == root port)
>> 		return read(root_crs_register_reg);
>>
>> 	if (device type(bridge) == switch)
>> 		return read(switch_crs_register);
> 
> I don't understand this part.  AFAIK, CRS SV is only a feature of root
> ports.  The capability and enable bits are in the Root Capabilities
> and Root Control registers.
> 

No question about it.

> It's certainly true that a device below a switch can respond with a
> CRS completion, but the switch is not the requester, and my
> understanding is that it would not take any action on the completion
> other than passing it upstream.
> 

I saw some bridge references in the spec for CRS. I was going to do
some research for it. You answered my question. I was curious how this
would impact the behavior.

"Bridge Configuration Retry Enable – When Set, this bit enables PCI Express
to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
response to Configuration Requests that target devices below the bridge. 
Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
further details."

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 20:32             ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-21 20:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
>> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> ...
>> if (pci_bus_crs_pending(id))
>> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>>
>>> I think that makes sense.  We'd want to check for CRS SV being
>>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
>>> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
>>> in the root port's pci_dev, and check it with something like what
>>> pcie_root_rcb_set() does?
>>>
>>
>> You can observe CRS under the following conditions
>>
>> 1. root port <-> endpoint 
>> 2. bridge <-> endpoint 
>> 3. root port<->bridge
>>
>> I was relying on the fact that we are reading 0x001 as an indication that
>> this device detected CRS. Maybe, this is too indirect.
>>
>> If we also want to capture the capability, I think the right thing is to
>> check the parent capability.
>>
>> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
>> {
>> 	if (device type(bridge) == root port)
>> 		return read(root_crs_register_reg);
>>
>> 	if (device type(bridge) == switch)
>> 		return read(switch_crs_register);
> 
> I don't understand this part.  AFAIK, CRS SV is only a feature of root
> ports.  The capability and enable bits are in the Root Capabilities
> and Root Control registers.
> 

No question about it.

> It's certainly true that a device below a switch can respond with a
> CRS completion, but the switch is not the requester, and my
> understanding is that it would not take any action on the completion
> other than passing it upstream.
> 

I saw some bridge references in the spec for CRS. I was going to do
some research for it. You answered my question. I was curious how this
would impact the behavior.

"Bridge Configuration Retry Enable ? When Set, this bit enables PCI Express
to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
response to Configuration Requests that target devices below the bridge. 
Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
further details."

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-21 20:32             ` Sinan Kaya
  (?)
@ 2017-08-21 21:09               ` Bjorn Helgaas
  -1 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 21:09 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On Mon, Aug 21, 2017 at 04:32:26PM -0400, Sinan Kaya wrote:
> On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> > On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> >> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> >> ...
> >> if (pci_bus_crs_pending(id))
> >> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> >>
> >>> I think that makes sense.  We'd want to check for CRS SV being
> >>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> >>> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> >>> in the root port's pci_dev, and check it with something like what
> >>> pcie_root_rcb_set() does?
> >>>
> >>
> >> You can observe CRS under the following conditions
> >>
> >> 1. root port <-> endpoint 
> >> 2. bridge <-> endpoint 
> >> 3. root port<->bridge
> >>
> >> I was relying on the fact that we are reading 0x001 as an indication that
> >> this device detected CRS. Maybe, this is too indirect.
> >>
> >> If we also want to capture the capability, I think the right thing is to
> >> check the parent capability.
> >>
> >> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> >> {
> >> 	if (device type(bridge) == root port)
> >> 		return read(root_crs_register_reg);
> >>
> >> 	if (device type(bridge) == switch)
> >> 		return read(switch_crs_register);
> > 
> > I don't understand this part.  AFAIK, CRS SV is only a feature of root
> > ports.  The capability and enable bits are in the Root Capabilities
> > and Root Control registers.
> > 
> 
> No question about it.
> 
> > It's certainly true that a device below a switch can respond with a
> > CRS completion, but the switch is not the requester, and my
> > understanding is that it would not take any action on the completion
> > other than passing it upstream.
> > 
> 
> I saw some bridge references in the spec for CRS. I was going to do
> some research for it. You answered my question. I was curious how this
> would impact the behavior.
> 
> "Bridge Configuration Retry Enable – When Set, this bit enables PCI Express
> to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
> response to Configuration Requests that target devices below the bridge. 
> Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
> further details."

(The above is from PCIe r3.1, sec 7.8.4, Device Control Register, and
also discussed in sec 4.3 of the PCIe-to-PCI/PCI-X bridge spec.)

In any event, the Bridge Configuration Retry Enable only determines
whether the bridge ever returns a CRS completion.  The bridge itself
never converts a CRS completion into the 0x0001 vendor ID.  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 21:09               ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 21:09 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, Timur Tabi, linux-kernel, Alex Williamson,
	linux-arm-msm, Bjorn Helgaas, linux-arm-kernel

T24gTW9uLCBBdWcgMjEsIDIwMTcgYXQgMDQ6MzI6MjZQTSAtMDQwMCwgU2luYW4gS2F5YSB3cm90
ZToKPiBPbiA4LzIxLzIwMTcgNDoyMyBQTSwgQmpvcm4gSGVsZ2FhcyB3cm90ZToKPiA+IE9uIE1v
biwgQXVnIDIxLCAyMDE3IGF0IDAzOjM3OjA2UE0gLTA0MDAsIFNpbmFuIEtheWEgd3JvdGU6Cj4g
Pj4gT24gOC8yMS8yMDE3IDM6MTggUE0sIEJqb3JuIEhlbGdhYXMgd3JvdGU6Cj4gPj4gLi4uCj4g
Pj4gaWYgKHBjaV9idXNfY3JzX3BlbmRpbmcoaWQpKQo+ID4+IAlyZXR1cm4gcGNpX2J1c193YWl0
X2NycyhkZXYtPmJ1cywgZGV2LT5kZXZmbiwgJmlkLCA2MDAwMCk7Cj4gPj4KPiA+Pj4gSSB0aGlu
ayB0aGF0IG1ha2VzIHNlbnNlLiAgV2UnZCB3YW50IHRvIGNoZWNrIGZvciBDUlMgU1YgYmVpbmcK
PiA+Pj4gZW5hYmxlZCwgZS5nLiwgbWF5YmUgcmVhZCBQQ0lfRVhQX1JUQ1RMX0NSU1NWRSBiYWNr
IGluCj4gPj4+IHBjaV9lbmFibGVfY3JzKCkgYW5kIGNhY2hlIGl0IHNvbWV3aGVyZS4gIE1heWJl
IGEgY3JzX3N2X2VuYWJsZWQgYml0Cj4gPj4+IGluIHRoZSByb290IHBvcnQncyBwY2lfZGV2LCBh
bmQgY2hlY2sgaXQgd2l0aCBzb21ldGhpbmcgbGlrZSB3aGF0Cj4gPj4+IHBjaWVfcm9vdF9yY2Jf
c2V0KCkgZG9lcz8KPiA+Pj4KPiA+Pgo+ID4+IFlvdSBjYW4gb2JzZXJ2ZSBDUlMgdW5kZXIgdGhl
IGZvbGxvd2luZyBjb25kaXRpb25zCj4gPj4KPiA+PiAxLiByb290IHBvcnQgPC0+IGVuZHBvaW50
IAo+ID4+IDIuIGJyaWRnZSA8LT4gZW5kcG9pbnQgCj4gPj4gMy4gcm9vdCBwb3J0PC0+YnJpZGdl
Cj4gPj4KPiA+PiBJIHdhcyByZWx5aW5nIG9uIHRoZSBmYWN0IHRoYXQgd2UgYXJlIHJlYWRpbmcg
MHgwMDEgYXMgYW4gaW5kaWNhdGlvbiB0aGF0Cj4gPj4gdGhpcyBkZXZpY2UgZGV0ZWN0ZWQgQ1JT
LiBNYXliZSwgdGhpcyBpcyB0b28gaW5kaXJlY3QuCj4gPj4KPiA+PiBJZiB3ZSBhbHNvIHdhbnQg
dG8gY2FwdHVyZSB0aGUgY2FwYWJpbGl0eSwgSSB0aGluayB0aGUgcmlnaHQgdGhpbmcgaXMgdG8K
PiA+PiBjaGVjayB0aGUgcGFyZW50IGNhcGFiaWxpdHkuCj4gPj4KPiA+PiBib29sIHBjaV9idXNf
Y3JzX3Zpc19zdXBwb3J0ZWQoc3RydWN0IHBjaV9kZXYgKmJyaWRnZSkKPiA+PiB7Cj4gPj4gCWlm
IChkZXZpY2UgdHlwZShicmlkZ2UpID09IHJvb3QgcG9ydCkKPiA+PiAJCXJldHVybiByZWFkKHJv
b3RfY3JzX3JlZ2lzdGVyX3JlZyk7Cj4gPj4KPiA+PiAJaWYgKGRldmljZSB0eXBlKGJyaWRnZSkg
PT0gc3dpdGNoKQo+ID4+IAkJcmV0dXJuIHJlYWQoc3dpdGNoX2Nyc19yZWdpc3Rlcik7Cj4gPiAK
PiA+IEkgZG9uJ3QgdW5kZXJzdGFuZCB0aGlzIHBhcnQuICBBRkFJSywgQ1JTIFNWIGlzIG9ubHkg
YSBmZWF0dXJlIG9mIHJvb3QKPiA+IHBvcnRzLiAgVGhlIGNhcGFiaWxpdHkgYW5kIGVuYWJsZSBi
aXRzIGFyZSBpbiB0aGUgUm9vdCBDYXBhYmlsaXRpZXMKPiA+IGFuZCBSb290IENvbnRyb2wgcmVn
aXN0ZXJzLgo+ID4gCj4gCj4gTm8gcXVlc3Rpb24gYWJvdXQgaXQuCj4gCj4gPiBJdCdzIGNlcnRh
aW5seSB0cnVlIHRoYXQgYSBkZXZpY2UgYmVsb3cgYSBzd2l0Y2ggY2FuIHJlc3BvbmQgd2l0aCBh
Cj4gPiBDUlMgY29tcGxldGlvbiwgYnV0IHRoZSBzd2l0Y2ggaXMgbm90IHRoZSByZXF1ZXN0ZXIs
IGFuZCBteQo+ID4gdW5kZXJzdGFuZGluZyBpcyB0aGF0IGl0IHdvdWxkIG5vdCB0YWtlIGFueSBh
Y3Rpb24gb24gdGhlIGNvbXBsZXRpb24KPiA+IG90aGVyIHRoYW4gcGFzc2luZyBpdCB1cHN0cmVh
bS4KPiA+IAo+IAo+IEkgc2F3IHNvbWUgYnJpZGdlIHJlZmVyZW5jZXMgaW4gdGhlIHNwZWMgZm9y
IENSUy4gSSB3YXMgZ29pbmcgdG8gZG8KPiBzb21lIHJlc2VhcmNoIGZvciBpdC4gWW91IGFuc3dl
cmVkIG15IHF1ZXN0aW9uLiBJIHdhcyBjdXJpb3VzIGhvdyB0aGlzCj4gd291bGQgaW1wYWN0IHRo
ZSBiZWhhdmlvci4KPiAKPiAiQnJpZGdlIENvbmZpZ3VyYXRpb24gUmV0cnkgRW5hYmxlIOKAkyBX
aGVuIFNldCwgdGhpcyBiaXQgZW5hYmxlcyBQQ0kgRXhwcmVzcwo+IHRvIFBDSS9QQ0ktWCBicmlk
Z2VzIHRvIHJldHVybiBDb25maWd1cmF0aW9uIFJlcXVlc3QgUmV0cnkgU3RhdHVzIChDUlMpIGlu
Cj4gcmVzcG9uc2UgdG8gQ29uZmlndXJhdGlvbiBSZXF1ZXN0cyB0aGF0IHRhcmdldCBkZXZpY2Vz
IGJlbG93IHRoZSBicmlkZ2UuIAo+IFJlZmVyIHRvIHRoZSBQQ0kgRXhwcmVzcyB0byBQQ0kvUENJ
LVggQnJpZGdlIFNwZWNpZmljYXRpb24sIFJldmlzaW9uIDEuMCBmb3IKPiBmdXJ0aGVyIGRldGFp
bHMuIgoKKFRoZSBhYm92ZSBpcyBmcm9tIFBDSWUgcjMuMSwgc2VjIDcuOC40LCBEZXZpY2UgQ29u
dHJvbCBSZWdpc3RlciwgYW5kCmFsc28gZGlzY3Vzc2VkIGluIHNlYyA0LjMgb2YgdGhlIFBDSWUt
dG8tUENJL1BDSS1YIGJyaWRnZSBzcGVjLikKCkluIGFueSBldmVudCwgdGhlIEJyaWRnZSBDb25m
aWd1cmF0aW9uIFJldHJ5IEVuYWJsZSBvbmx5IGRldGVybWluZXMKd2hldGhlciB0aGUgYnJpZGdl
IGV2ZXIgcmV0dXJucyBhIENSUyBjb21wbGV0aW9uLiAgVGhlIGJyaWRnZSBpdHNlbGYKbmV2ZXIg
Y29udmVydHMgYSBDUlMgY29tcGxldGlvbiBpbnRvIHRoZSAweDAwMDEgdmVuZG9yIElELiAgCgpf
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwpsaW51eC1hcm0t
a2VybmVsIG1haWxpbmcgbGlzdApsaW51eC1hcm0ta2VybmVsQGxpc3RzLmluZnJhZGVhZC5vcmcK
aHR0cDovL2xpc3RzLmluZnJhZGVhZC5vcmcvbWFpbG1hbi9saXN0aW5mby9saW51eC1hcm0ta2Vy
bmVsCg==

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-21 21:09               ` Bjorn Helgaas
  0 siblings, 0 replies; 38+ messages in thread
From: Bjorn Helgaas @ 2017-08-21 21:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 21, 2017 at 04:32:26PM -0400, Sinan Kaya wrote:
> On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> > On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> >> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> >> ...
> >> if (pci_bus_crs_pending(id))
> >> 	return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> >>
> >>> I think that makes sense.  We'd want to check for CRS SV being
> >>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> >>> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> >>> in the root port's pci_dev, and check it with something like what
> >>> pcie_root_rcb_set() does?
> >>>
> >>
> >> You can observe CRS under the following conditions
> >>
> >> 1. root port <-> endpoint 
> >> 2. bridge <-> endpoint 
> >> 3. root port<->bridge
> >>
> >> I was relying on the fact that we are reading 0x001 as an indication that
> >> this device detected CRS. Maybe, this is too indirect.
> >>
> >> If we also want to capture the capability, I think the right thing is to
> >> check the parent capability.
> >>
> >> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> >> {
> >> 	if (device type(bridge) == root port)
> >> 		return read(root_crs_register_reg);
> >>
> >> 	if (device type(bridge) == switch)
> >> 		return read(switch_crs_register);
> > 
> > I don't understand this part.  AFAIK, CRS SV is only a feature of root
> > ports.  The capability and enable bits are in the Root Capabilities
> > and Root Control registers.
> > 
> 
> No question about it.
> 
> > It's certainly true that a device below a switch can respond with a
> > CRS completion, but the switch is not the requester, and my
> > understanding is that it would not take any action on the completion
> > other than passing it upstream.
> > 
> 
> I saw some bridge references in the spec for CRS. I was going to do
> some research for it. You answered my question. I was curious how this
> would impact the behavior.
> 
> "Bridge Configuration Retry Enable ? When Set, this bit enables PCI Express
> to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
> response to Configuration Requests that target devices below the bridge. 
> Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
> further details."

(The above is from PCIe r3.1, sec 7.8.4, Device Control Register, and
also discussed in sec 4.3 of the PCIe-to-PCI/PCI-X bridge spec.)

In any event, the Bridge Configuration Retry Enable only determines
whether the bridge ever returns a CRS completion.  The bridge itself
never converts a CRS completion into the 0x0001 vendor ID.  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
  2017-08-21 19:37         ` Sinan Kaya
@ 2017-08-23  4:40           ` Sinan Kaya
  -1 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-23  4:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, Timur Tabi, linux-kernel,
	Alex Williamson, linux-arm-msm, linux-arm-kernel

On 8/21/2017 3:37 PM, Sinan Kaya wrote:
> bool pci_bus_crs_pending(struct pci_dev *dev, u32 l)
> {
> 	if !pci_bus_crs_vis_supported(dev->parent)
> 		return false;

Apparently, I can't do this. By the time, we come to here from vendor id read
function, the topology has not been set up yet. 

I'm getting an exception:

[    6.067392] [<ffff00000846c878>] pci_bus_crs_visibility_pending+0x4/0x7c
[    6.074085] [<ffff00000846cccc>] pci_scan_single_device+0x40/0xb4
[    6.080170] [<ffff00000846cd90>] pci_scan_slot+0x50/0xe8
[    6.085474] [<ffff00000846dc14>] pci_scan_child_bus+0x30/0x108
[    6.091300] [<ffff0000084bab94>] acpi_pci_root_create+0x184/0x1f0
[    6.097388] [<ffff000008091dc8>] pci_acpi_scan_root+0x188/0x1d4
[    6.103298] [<ffff0000084ba7a8>] acpi_pci_root_add+0x38c/0x44c
[    6.109125] [<ffff0000084b4d94>] acpi_bus_attach+0xe0/0x1ac
[    6.114689] [<ffff0000084b4e08>] acpi_bus_attach+0x154/0x1ac
[    6.120340] [<ffff0000084b4e08>] acpi_bus_attach+0x154/0x1ac
[    6.125991] [<ffff0000084b6608>] acpi_bus_scan+0x60/0x70
[    6.131297] [<ffff0000091c87f8>] acpi_scan_init+0xd8/0x228
[    6.136774] [<ffff0000091c84e0>] acpi_init+0x2d4/0x328
[    6.141905] [<ffff0000091a0c88>] do_one_initcall+0x80/0x108
[    6.147469] [<ffff0000091a0e98>] kernel_init_freeable+0x188/0x228
[    6.153556] [<ffff000008c9bcbc>] kernel_init+0x10/0xfc
[    6.158687] [<ffff000008082ec0>] ret_from_fork+0x10/0x50



> 
> 	return (l & 0xFFFF) == 0x0001;
> }


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()
@ 2017-08-23  4:40           ` Sinan Kaya
  0 siblings, 0 replies; 38+ messages in thread
From: Sinan Kaya @ 2017-08-23  4:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 8/21/2017 3:37 PM, Sinan Kaya wrote:
> bool pci_bus_crs_pending(struct pci_dev *dev, u32 l)
> {
> 	if !pci_bus_crs_vis_supported(dev->parent)
> 		return false;

Apparently, I can't do this. By the time, we come to here from vendor id read
function, the topology has not been set up yet. 

I'm getting an exception:

[    6.067392] [<ffff00000846c878>] pci_bus_crs_visibility_pending+0x4/0x7c
[    6.074085] [<ffff00000846cccc>] pci_scan_single_device+0x40/0xb4
[    6.080170] [<ffff00000846cd90>] pci_scan_slot+0x50/0xe8
[    6.085474] [<ffff00000846dc14>] pci_scan_child_bus+0x30/0x108
[    6.091300] [<ffff0000084bab94>] acpi_pci_root_create+0x184/0x1f0
[    6.097388] [<ffff000008091dc8>] pci_acpi_scan_root+0x188/0x1d4
[    6.103298] [<ffff0000084ba7a8>] acpi_pci_root_add+0x38c/0x44c
[    6.109125] [<ffff0000084b4d94>] acpi_bus_attach+0xe0/0x1ac
[    6.114689] [<ffff0000084b4e08>] acpi_bus_attach+0x154/0x1ac
[    6.120340] [<ffff0000084b4e08>] acpi_bus_attach+0x154/0x1ac
[    6.125991] [<ffff0000084b6608>] acpi_bus_scan+0x60/0x70
[    6.131297] [<ffff0000091c87f8>] acpi_scan_init+0xd8/0x228
[    6.136774] [<ffff0000091c84e0>] acpi_init+0x2d4/0x328
[    6.141905] [<ffff0000091a0c88>] do_one_initcall+0x80/0x108
[    6.147469] [<ffff0000091a0e98>] kernel_init_freeable+0x188/0x228
[    6.153556] [<ffff000008c9bcbc>] kernel_init+0x10/0xfc
[    6.158687] [<ffff000008082ec0>] ret_from_fork+0x10/0x50



> 
> 	return (l & 0xFFFF) == 0x0001;
> }


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2017-08-23  4:40 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-18 21:31 [PATCH v11 0/4] PCI: Use CRS Software Visibility to wait for device to become ready Bjorn Helgaas
2017-08-18 21:31 ` Bjorn Helgaas
2017-08-18 21:31 ` Bjorn Helgaas
2017-08-18 21:32 ` [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas
2017-08-21 14:02   ` Sinan Kaya
2017-08-21 14:02     ` Sinan Kaya
2017-08-21 14:02     ` Sinan Kaya
2017-08-21 17:44     ` Bjorn Helgaas
2017-08-21 17:44       ` Bjorn Helgaas
2017-08-21 17:44       ` Bjorn Helgaas
2017-08-18 21:32 ` [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs() Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas
2017-08-21 13:53   ` Sinan Kaya
2017-08-21 13:53     ` Sinan Kaya
2017-08-21 13:53     ` Sinan Kaya
2017-08-21 19:18     ` Bjorn Helgaas
2017-08-21 19:18       ` Bjorn Helgaas
2017-08-21 19:18       ` Bjorn Helgaas
2017-08-21 19:37       ` Sinan Kaya
2017-08-21 19:37         ` Sinan Kaya
2017-08-21 20:23         ` Bjorn Helgaas
2017-08-21 20:23           ` Bjorn Helgaas
2017-08-21 20:23           ` Bjorn Helgaas
2017-08-21 20:32           ` Sinan Kaya
2017-08-21 20:32             ` Sinan Kaya
2017-08-21 21:09             ` Bjorn Helgaas
2017-08-21 21:09               ` Bjorn Helgaas
2017-08-21 21:09               ` Bjorn Helgaas
2017-08-23  4:40         ` Sinan Kaya
2017-08-23  4:40           ` Sinan Kaya
2017-08-18 21:32 ` [PATCH v11 3/4] PCI: Handle CRS ("device not ready") returned by device after FLR Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas
2017-08-18 21:32 ` [PATCH v11 4/4] PCI: Warn periodically while waiting for device to become ready Bjorn Helgaas
2017-08-18 21:32   ` Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.