All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] PCI-e Max Payload Size configuration
@ 2015-07-29 22:18 Keith Busch
  2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw)
  To: linux-pci, linux-kernel; +Cc: Bjorn Helgass, Dave Jiang, Keith Busch

This patch series removes arch, driver, and hot-plug specific calls to
configure a PCI-e device's MPS setting. The device is instead configured
in the generic pci-core layer when the device is initially added, and
before it is bound to a driver.

The default policy is also changed from "do nothing" to update the
down stream port to match the upstream port if it is capable.

Dave Jiang (2):
  QIB: Removing usage of pcie_set_mps()
  PCIE: Remove symbol export for pcie_set_mps()

Keith Busch (1):
  pci: Default MPS tuning to match upstream port

 arch/arm/kernel/bios32.c             |   12 ------------
 arch/powerpc/kernel/pci-common.c     |    7 -------
 arch/tile/kernel/pci_gx.c            |    4 ----
 arch/x86/pci/acpi.c                  |    9 ---------
 drivers/infiniband/hw/qib/qib_pcie.c |   27 +--------------------------
 drivers/pci/bus.c                    |    4 ++++
 drivers/pci/hotplug/acpiphp_glue.c   |    1 -
 drivers/pci/hotplug/pciehp_pci.c     |    1 -
 drivers/pci/hotplug/shpchp_pci.c     |    1 -
 drivers/pci/pci.c                    |    1 -
 drivers/pci/probe.c                  |   22 ++++++++++++++++------
 include/linux/pci.h                  |    2 --
 12 files changed, 21 insertions(+), 70 deletions(-)

-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] pci: Default MPS tuning, match upstream port
  2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch
@ 2015-07-29 22:18 ` Keith Busch
  2015-08-17 22:28   ` Bjorn Helgaas
  2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch
  2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch
  2 siblings, 1 reply; 16+ messages in thread
From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw)
  To: linux-pci, linux-kernel
  Cc: Bjorn Helgass, Dave Jiang, Keith Busch, Austin Bolen,
	Myron Stowe, Jon Mason

A hot plugged PCI-e device max payload size (MPS) defaults to 0 for
128bytes. The device is not usable if the upstream port is configured
to a higher setting.

Bus configuration was previously done by arch specific and hot plug code
after the root port or bridge was scanned, and default behavior logged a
warning without changing the setting if there was a problem. This patch
unifies tuning with a default policy that affects only misconfigured
downstream devices, and preserves existing end result if a non-default
bus tuning is used (ex: pci=pcie_bus_safe).

The new pcie tuning will check the device's MPS against the parent bridge
when it is initially added to the pci subsystem, prior to attaching
to a driver. If MPS is mismatched, the downstream port is set to the
parent bridge's if capable. This is safe to change here since the device
being configured is not bound to a driver and does not affect previously
configured devices in the domain hierarchy.

A device incapable of matching the upstream bridge will log a
warning message and not bind to a driver. This is the safe option since
proper MPS configuration must consider the entire hierarchy between
communicating end points, and we can't safely modify a root port's
subtree MPS settings while it is in use.

Since the bus is configured everytime a bridge is scanned, this
potentially creates unnecessary re-walking of an already configured
sub-tree, but the code only executes during root port initialization,
hot adding a switch, or explicit request to rescan.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Austin Bolen <Austin_Bolen@Dell.com>
Cc: Myron Stowe <mstowe@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Bjorn Helgaas <bhelgaas@google.com>
---
 arch/arm/kernel/bios32.c           |   12 ------------
 arch/powerpc/kernel/pci-common.c   |    7 -------
 arch/tile/kernel/pci_gx.c          |    4 ----
 arch/x86/pci/acpi.c                |    9 ---------
 drivers/pci/bus.c                  |    4 ++++
 drivers/pci/hotplug/acpiphp_glue.c |    1 -
 drivers/pci/hotplug/pciehp_pci.c   |    1 -
 drivers/pci/hotplug/shpchp_pci.c   |    1 -
 drivers/pci/probe.c                |   27 ++++++++++++++++++++++-----
 include/linux/pci.h                |    2 --
 10 files changed, 26 insertions(+), 42 deletions(-)

diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index fcbbbb1..4fff58e 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -537,18 +537,6 @@ void pci_common_init_dev(struct device *parent, struct hw_pci *hw)
 		 */
 		pci_bus_add_devices(bus);
 	}
-
-	list_for_each_entry(sys, &head, node) {
-		struct pci_bus *bus = sys->bus;
-
-		/* Configure PCI Express settings */
-		if (bus && !pci_has_flag(PCI_PROBE_ONLY)) {
-			struct pci_bus *child;
-
-			list_for_each_entry(child, &bus->children, node)
-				pcie_bus_configure_settings(child);
-		}
-	}
 }
 
 #ifndef CONFIG_PCI_HOST_ITE8152
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index b9de34d..7f27ffe 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1682,13 +1682,6 @@ void pcibios_scan_phb(struct pci_controller *hose)
 	 */
 	if (ppc_md.pcibios_fixup_phb)
 		ppc_md.pcibios_fixup_phb(hose);
-
-	/* Configure PCI Express settings */
-	if (bus && !pci_has_flag(PCI_PROBE_ONLY)) {
-		struct pci_bus *child;
-		list_for_each_entry(child, &bus->children, node)
-			pcie_bus_configure_settings(child);
-	}
 }
 EXPORT_SYMBOL_GPL(pcibios_scan_phb);
 
diff --git a/arch/tile/kernel/pci_gx.c b/arch/tile/kernel/pci_gx.c
index b1df847..67492fb 100644
--- a/arch/tile/kernel/pci_gx.c
+++ b/arch/tile/kernel/pci_gx.c
@@ -599,10 +599,6 @@ static void fixup_read_and_payload_sizes(struct pci_controller *controller)
 	__gxio_mmio_write32(trio_context->mmio_base_mac + reg_offset,
 			    rc_dev_cap.word);
 
-	/* Configure PCI Express MPS setting. */
-	list_for_each_entry(child, &root_bus->children, node)
-		pcie_bus_configure_settings(child);
-
 	/*
 	 * Set the mac_config register in trio based on the MPS/MRS of the link.
 	 */
diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index ff99117..ab5977a 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -478,15 +478,6 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
 		}
 	}
 
-	/* After the PCI-E bus has been walked and all devices discovered,
-	 * configure any settings of the fabric that might be necessary.
-	 */
-	if (bus) {
-		struct pci_bus *child;
-		list_for_each_entry(child, &bus->children, node)
-			pcie_bus_configure_settings(child);
-	}
-
 	if (bus && node != NUMA_NO_NODE)
 		dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node);
 
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 6fbd3f2..8f8428a 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -277,6 +277,10 @@ void pci_bus_add_device(struct pci_dev *dev)
 {
 	int retval;
 
+	if (dev->bus->self &&
+			pcie_get_mps(dev) != pcie_get_mps(dev->bus->self))
+		return;
+
 	/*
 	 * Can not put in pci_device_add yet because resources
 	 * are not assigned yet for some devices.
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index ff53856..cd98649 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -509,7 +509,6 @@ static void enable_slot(struct acpiphp_slot *slot)
 	__pci_bus_assign_resources(bus, &add_list, NULL);
 
 	acpiphp_sanitize_bus(bus);
-	pcie_bus_configure_settings(bus);
 	acpiphp_set_acpi_region(slot);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 9e69403..93bc7f6 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -65,7 +65,6 @@ int pciehp_configure_device(struct slot *p_slot)
 			pci_hp_add_bridge(dev);
 
 	pci_assign_unassigned_bridge_resources(bridge);
-	pcie_bus_configure_settings(parent);
 	pci_bus_add_devices(parent);
 
  out:
diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
index f8cd3a2..ca3dc3c 100644
--- a/drivers/pci/hotplug/shpchp_pci.c
+++ b/drivers/pci/hotplug/shpchp_pci.c
@@ -69,7 +69,6 @@ int shpchp_configure_device(struct slot *p_slot)
 	}
 
 	pci_assign_unassigned_bridge_resources(bridge);
-	pcie_bus_configure_settings(parent);
 	pci_bus_add_devices(parent);
 
  out:
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cefd636..b469298 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -37,6 +37,9 @@ struct pci_domain_busn_res {
 	int domain_nr;
 };
 
+static void pcie_bus_detect_mps(struct pci_dev *dev);
+static void pcie_bus_configure_settings(struct pci_bus *bus);
+
 static struct resource *get_pci_domain_busn_res(int domain_nr)
 {
 	struct pci_domain_busn_res *r;
@@ -929,6 +932,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
 		(is_cardbus ? "PCI CardBus %04x:%02x" : "PCI Bus %04x:%02x"),
 		pci_domain_nr(bus), child->number);
 
+	pcie_bus_configure_settings(bus);
+
 	/* Has only triggered on CardBus, fixup is in yenta_socket */
 	while (bus->parent) {
 		if ((child->busn_res.end > bus->busn_res.end) ||
@@ -1589,6 +1594,9 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 	dev->match_driver = false;
 	ret = device_add(&dev->dev);
 	WARN_ON(ret < 0);
+
+	if (pci_is_pcie(dev))
+		pcie_bus_detect_mps(dev);
 }
 
 struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn)
@@ -1802,9 +1810,17 @@ static void pcie_bus_detect_mps(struct pci_dev *dev)
 	mps = pcie_get_mps(dev);
 	p_mps = pcie_get_mps(bridge);
 
-	if (mps != p_mps)
-		dev_warn(&dev->dev, "Max Payload Size %d, but upstream %s set to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
-			 mps, pci_name(bridge), p_mps);
+	if (mps != p_mps) {
+		if (128 << dev->pcie_mpss < p_mps) {
+			dev_warn(&dev->dev,
+				"Max Payload Size Supported %d, but upstream %s set to %d. If problems are experienced, try running with pci=pcie_bus_safe\n",
+				128 << dev->pcie_mpss, pci_name(bridge), p_mps);
+			return;
+		}
+		pcie_write_mps(dev, p_mps);
+		dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n",
+			pcie_get_mps(dev), 128 << dev->pcie_mpss, mps);
+	}
 }
 
 static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
@@ -1836,7 +1852,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
  * parents then children fashion.  If this changes, then this code will not
  * work as designed.
  */
-void pcie_bus_configure_settings(struct pci_bus *bus)
+static void pcie_bus_configure_settings(struct pci_bus *bus)
 {
 	u8 smpss = 0;
 
@@ -1863,7 +1879,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus)
 	pcie_bus_configure_set(bus->self, &smpss);
 	pci_walk_bus(bus, pcie_bus_configure_set, &smpss);
 }
-EXPORT_SYMBOL_GPL(pcie_bus_configure_settings);
 
 unsigned int pci_scan_child_bus(struct pci_bus *bus)
 {
@@ -1895,6 +1910,8 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)
 				max = pci_scan_bridge(bus, dev, max, pass);
 		}
 
+	pcie_bus_configure_settings(bus);
+
 	/*
 	 * We've scanned the bus and so we know all about what's on
 	 * the other side of any bridges that may be on this bus plus
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8a0321a..e1df5f9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -735,8 +735,6 @@ struct pci_driver {
 /* these external functions are only available when PCI support is enabled */
 #ifdef CONFIG_PCI
 
-void pcie_bus_configure_settings(struct pci_bus *bus);
-
 enum pcie_bus_config_types {
 	PCIE_BUS_TUNE_OFF,
 	PCIE_BUS_SAFE,
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
  2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch
  2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch
@ 2015-07-29 22:18 ` Keith Busch
  2015-08-17 22:30   ` Bjorn Helgaas
  2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch
  2 siblings, 1 reply; 16+ messages in thread
From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw)
  To: linux-pci, linux-kernel; +Cc: Bjorn Helgass, Dave Jiang

From: Dave Jiang <dave.jiang@intel.com>

This is in perperation of un-exporting the pcie_set_mps() function
symbol. A driver should not be changing the MPS as that is the
responsibility of the PCI subsystem.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/infiniband/hw/qib/qib_pcie.c |   27 +--------------------------
 1 file changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_pcie.c b/drivers/infiniband/hw/qib/qib_pcie.c
index 4758a38..b8a2dcd 100644
--- a/drivers/infiniband/hw/qib/qib_pcie.c
+++ b/drivers/infiniband/hw/qib/qib_pcie.c
@@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct qib_devdata *dd)
  */
 static int qib_pcie_caps;
 module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO);
-MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), ReadReq (4..7)");
+MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)");
 
 static void qib_tune_pcie_caps(struct qib_devdata *dd)
 {
 	struct pci_dev *parent;
-	u16 rc_mpss, rc_mps, ep_mpss, ep_mps;
 	u16 rc_mrrs, ep_mrrs, max_mrrs;
 
 	/* Find out supported and configured values for parent (root) */
@@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct qib_devdata *dd)
 	if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev))
 		return;
 
-	rc_mpss = parent->pcie_mpss;
-	rc_mps = ffs(pcie_get_mps(parent)) - 8;
-	/* Find out supported and configured values for endpoint (us) */
-	ep_mpss = dd->pcidev->pcie_mpss;
-	ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8;
-
-	/* Find max payload supported by root, endpoint */
-	if (rc_mpss > ep_mpss)
-		rc_mpss = ep_mpss;
-
-	/* If Supported greater than limit in module param, limit it */
-	if (rc_mpss > (qib_pcie_caps & 7))
-		rc_mpss = qib_pcie_caps & 7;
-	/* If less than (allowed, supported), bump root payload */
-	if (rc_mpss > rc_mps) {
-		rc_mps = rc_mpss;
-		pcie_set_mps(parent, 128 << rc_mps);
-	}
-	/* If less than (allowed, supported), bump endpoint payload */
-	if (rc_mpss > ep_mps) {
-		ep_mps = rc_mpss;
-		pcie_set_mps(dd->pcidev, 128 << ep_mps);
-	}
-
 	/*
 	 * Now the Read Request size.
 	 * No field for max supported, but PCIe spec limits it to 4096,
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps()
  2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch
  2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch
  2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch
@ 2015-07-29 22:18 ` Keith Busch
  2015-08-17 22:31   ` Bjorn Helgaas
  2 siblings, 1 reply; 16+ messages in thread
From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw)
  To: linux-pci, linux-kernel; +Cc: Bjorn Helgass, Dave Jiang

From: Dave Jiang <dave.jiang@intel.com>

The setting of PCIe MPS should be left to the PCI subsystem and not
the driver. An ill configured MPS by the driver could cause the device
to not function or unstablize the entire system. Removing the exported
symbol.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/pci/pci.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0008c95..92349ee 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4121,7 +4121,6 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
 	return pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
 						  PCI_EXP_DEVCTL_PAYLOAD, v);
 }
-EXPORT_SYMBOL(pcie_set_mps);
 
 /**
  * pcie_get_minimum_link - determine minimum link settings of a PCI device
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] pci: Default MPS tuning, match upstream port
  2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch
@ 2015-08-17 22:28   ` Bjorn Helgaas
  2015-08-17 23:03     ` Keith Busch
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2015-08-17 22:28 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, linux-kernel, Dave Jiang, Austin Bolen, Myron Stowe,
	Jon Mason

On Wed, Jul 29, 2015 at 04:18:53PM -0600, Keith Busch wrote:
> A hot plugged PCI-e device max payload size (MPS) defaults to 0 for
> 128bytes. The device is not usable if the upstream port is configured
> to a higher setting.
> 
> Bus configuration was previously done by arch specific and hot plug code
> after the root port or bridge was scanned, and default behavior logged a
> warning without changing the setting if there was a problem. This patch
> unifies tuning with a default policy that affects only misconfigured
> downstream devices, and preserves existing end result if a non-default
> bus tuning is used (ex: pci=pcie_bus_safe).
> 
> The new pcie tuning will check the device's MPS against the parent bridge
> when it is initially added to the pci subsystem, prior to attaching
> to a driver. If MPS is mismatched, the downstream port is set to the
> parent bridge's if capable. 

This is a little confusing (at least, *I'm* confused).  You're talking
about setting the MPS configuration of the "downstream port," but I
think you are talking about either a Switch Upstream Port or an
Endpoint.  Such a port is indeed *downstream* from the parent bridge,
but referring to it as a "downstream port" risks confusing it with the
parent bridge, which is either a Switch Downstream Port or a Root
Port.

> This is safe to change here since the device
> being configured is not bound to a driver and does not affect previously
> configured devices in the domain hierarchy.
> 
> A device incapable of matching the upstream bridge will log a
> warning message and not bind to a driver. This is the safe option since
> proper MPS configuration must consider the entire hierarchy between
> communicating end points, and we can't safely modify a root port's
> subtree MPS settings while it is in use.
> 
> Since the bus is configured everytime a bridge is scanned, this
> potentially creates unnecessary re-walking of an already configured
> sub-tree, but the code only executes during root port initialization,
> hot adding a switch, or explicit request to rescan.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Austin Bolen <Austin_Bolen@Dell.com>
> Cc: Myron Stowe <mstowe@redhat.com>
> Cc: Jon Mason <jdmason@kudzu.us>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  arch/arm/kernel/bios32.c           |   12 ------------
>  arch/powerpc/kernel/pci-common.c   |    7 -------
>  arch/tile/kernel/pci_gx.c          |    4 ----
>  arch/x86/pci/acpi.c                |    9 ---------
>  drivers/pci/bus.c                  |    4 ++++
>  drivers/pci/hotplug/acpiphp_glue.c |    1 -
>  drivers/pci/hotplug/pciehp_pci.c   |    1 -
>  drivers/pci/hotplug/shpchp_pci.c   |    1 -
>  drivers/pci/probe.c                |   27 ++++++++++++++++++++++-----
>  include/linux/pci.h                |    2 --
>  10 files changed, 26 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
> index fcbbbb1..4fff58e 100644
> --- a/arch/arm/kernel/bios32.c
> +++ b/arch/arm/kernel/bios32.c
> @@ -537,18 +537,6 @@ void pci_common_init_dev(struct device *parent, struct hw_pci *hw)
>  		 */
>  		pci_bus_add_devices(bus);
>  	}
> -
> -	list_for_each_entry(sys, &head, node) {
> -		struct pci_bus *bus = sys->bus;
> -
> -		/* Configure PCI Express settings */
> -		if (bus && !pci_has_flag(PCI_PROBE_ONLY)) {
> -			struct pci_bus *child;
> -
> -			list_for_each_entry(child, &bus->children, node)
> -				pcie_bus_configure_settings(child);
> -		}
> -	}

I think nothing terrible happens if we call
pcie_bus_configure_settings() more than once (we might get some
duplicate messages, but I don't consider that terrible).  If that's
true, maybe we could split the removal of these calls into a separate
patch.  It doesn't need to be a separate patch for every arch, but I
think getting the "redundant code removal" out of this patch will make
the interesting changes more obvious.

>  }
>  
>  #ifndef CONFIG_PCI_HOST_ITE8152
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> index b9de34d..7f27ffe 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -1682,13 +1682,6 @@ void pcibios_scan_phb(struct pci_controller *hose)
>  	 */
>  	if (ppc_md.pcibios_fixup_phb)
>  		ppc_md.pcibios_fixup_phb(hose);
> -
> -	/* Configure PCI Express settings */
> -	if (bus && !pci_has_flag(PCI_PROBE_ONLY)) {
> -		struct pci_bus *child;
> -		list_for_each_entry(child, &bus->children, node)
> -			pcie_bus_configure_settings(child);
> -	}
>  }
>  EXPORT_SYMBOL_GPL(pcibios_scan_phb);
>  
> diff --git a/arch/tile/kernel/pci_gx.c b/arch/tile/kernel/pci_gx.c
> index b1df847..67492fb 100644
> --- a/arch/tile/kernel/pci_gx.c
> +++ b/arch/tile/kernel/pci_gx.c
> @@ -599,10 +599,6 @@ static void fixup_read_and_payload_sizes(struct pci_controller *controller)
>  	__gxio_mmio_write32(trio_context->mmio_base_mac + reg_offset,
>  			    rc_dev_cap.word);
>  
> -	/* Configure PCI Express MPS setting. */
> -	list_for_each_entry(child, &root_bus->children, node)
> -		pcie_bus_configure_settings(child);
> -
>  	/*
>  	 * Set the mac_config register in trio based on the MPS/MRS of the link.
>  	 */
> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index ff99117..ab5977a 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -478,15 +478,6 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
>  		}
>  	}
>  
> -	/* After the PCI-E bus has been walked and all devices discovered,
> -	 * configure any settings of the fabric that might be necessary.
> -	 */
> -	if (bus) {
> -		struct pci_bus *child;
> -		list_for_each_entry(child, &bus->children, node)
> -			pcie_bus_configure_settings(child);
> -	}
> -
>  	if (bus && node != NUMA_NO_NODE)
>  		dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node);
>  
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 6fbd3f2..8f8428a 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -277,6 +277,10 @@ void pci_bus_add_device(struct pci_dev *dev)
>  {
>  	int retval;
>  
> +	if (dev->bus->self &&
> +			pcie_get_mps(dev) != pcie_get_mps(dev->bus->self))
> +		return;

This part looks like it could be in its own separate patch that
basically enforces the "if we can't safely configure this device's MPS
setting, prevent drivers from using the device" idea.  What happens in
this case?  Does the device still appear in sysfs and lspci, even if
we can't configure its MPS safely?  This seems like good place for a
dev_warn().

>  	/*
>  	 * Can not put in pci_device_add yet because resources
>  	 * are not assigned yet for some devices.
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index ff53856..cd98649 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -509,7 +509,6 @@ static void enable_slot(struct acpiphp_slot *slot)
>  	__pci_bus_assign_resources(bus, &add_list, NULL);
>  
>  	acpiphp_sanitize_bus(bus);
> -	pcie_bus_configure_settings(bus);
>  	acpiphp_set_acpi_region(slot);
>  
>  	list_for_each_entry(dev, &bus->devices, bus_list) {
> diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
> index 9e69403..93bc7f6 100644
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -65,7 +65,6 @@ int pciehp_configure_device(struct slot *p_slot)
>  			pci_hp_add_bridge(dev);
>  
>  	pci_assign_unassigned_bridge_resources(bridge);
> -	pcie_bus_configure_settings(parent);
>  	pci_bus_add_devices(parent);
>  
>   out:
> diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
> index f8cd3a2..ca3dc3c 100644
> --- a/drivers/pci/hotplug/shpchp_pci.c
> +++ b/drivers/pci/hotplug/shpchp_pci.c
> @@ -69,7 +69,6 @@ int shpchp_configure_device(struct slot *p_slot)
>  	}
>  
>  	pci_assign_unassigned_bridge_resources(bridge);
> -	pcie_bus_configure_settings(parent);
>  	pci_bus_add_devices(parent);
>  
>   out:
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index cefd636..b469298 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -37,6 +37,9 @@ struct pci_domain_busn_res {
>  	int domain_nr;
>  };
>  
> +static void pcie_bus_detect_mps(struct pci_dev *dev);
> +static void pcie_bus_configure_settings(struct pci_bus *bus);
> +
>  static struct resource *get_pci_domain_busn_res(int domain_nr)
>  {
>  	struct pci_domain_busn_res *r;
> @@ -929,6 +932,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
>  		(is_cardbus ? "PCI CardBus %04x:%02x" : "PCI Bus %04x:%02x"),
>  		pci_domain_nr(bus), child->number);
>  
> +	pcie_bus_configure_settings(bus);
> +
>  	/* Has only triggered on CardBus, fixup is in yenta_socket */
>  	while (bus->parent) {
>  		if ((child->busn_res.end > bus->busn_res.end) ||
> @@ -1589,6 +1594,9 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  	dev->match_driver = false;
>  	ret = device_add(&dev->dev);
>  	WARN_ON(ret < 0);
> +
> +	if (pci_is_pcie(dev))
> +		pcie_bus_detect_mps(dev);
>  }
>  
>  struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn)
> @@ -1802,9 +1810,17 @@ static void pcie_bus_detect_mps(struct pci_dev *dev)
>  	mps = pcie_get_mps(dev);
>  	p_mps = pcie_get_mps(bridge);
>  
> -	if (mps != p_mps)
> -		dev_warn(&dev->dev, "Max Payload Size %d, but upstream %s set to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> -			 mps, pci_name(bridge), p_mps);
> +	if (mps != p_mps) {
> +		if (128 << dev->pcie_mpss < p_mps) {
> +			dev_warn(&dev->dev,
> +				"Max Payload Size Supported %d, but upstream %s set to %d. If problems are experienced, try running with pci=pcie_bus_safe\n",
> +				128 << dev->pcie_mpss, pci_name(bridge), p_mps);
> +			return;

Isn't this the same case where the above change to
pci_bus_add_device() will now decline to add the device?  So I think
there are really two policy changes:

  1) If an device can be configured to match the upstream bridge's MPS
  setting, do it, and

  2) Don't add a device when its MPS setting doesn't match the
  upstream bridge

I'd like those to be separate patches.

> +		}
> +		pcie_write_mps(dev, p_mps);
> +		dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n",
> +			pcie_get_mps(dev), 128 << dev->pcie_mpss, mps);
> +	}

I assume this hunk is related to the policy change (from "do nothing"
to "update the downstream port").  Can you split that policy change
into its own separate minimal patch?  Yes, I'm looking for minimal and
specific bisect targets :)

I think this will be clearer if you write it as:

  if (mps == p_mps)
    return;

  if (128 << dev->pcie_mpss < p_mps) {
    dev_warn(...);
    return;
  }

  pcie_write_mps(...);
  dev_info(...);

Now that this actively updates the MPS setting, it's starting to grow
beyond the original "pcie_bus_detect_mps()" function name.  

>  }
>  
>  static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
> @@ -1836,7 +1852,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
>   * parents then children fashion.  If this changes, then this code will not
>   * work as designed.
>   */
> -void pcie_bus_configure_settings(struct pci_bus *bus)
> +static void pcie_bus_configure_settings(struct pci_bus *bus)
>  {
>  	u8 smpss = 0;
>  
> @@ -1863,7 +1879,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus)
>  	pcie_bus_configure_set(bus->self, &smpss);
>  	pci_walk_bus(bus, pcie_bus_configure_set, &smpss);
>  }
> -EXPORT_SYMBOL_GPL(pcie_bus_configure_settings);
>  
>  unsigned int pci_scan_child_bus(struct pci_bus *bus)
>  {
> @@ -1895,6 +1910,8 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus)
>  				max = pci_scan_bridge(bus, dev, max, pass);
>  		}
>  
> +	pcie_bus_configure_settings(bus);
> +
>  	/*
>  	 * We've scanned the bus and so we know all about what's on
>  	 * the other side of any bridges that may be on this bus plus
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 8a0321a..e1df5f9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -735,8 +735,6 @@ struct pci_driver {
>  /* these external functions are only available when PCI support is enabled */
>  #ifdef CONFIG_PCI
>  
> -void pcie_bus_configure_settings(struct pci_bus *bus);
> -
>  enum pcie_bus_config_types {
>  	PCIE_BUS_TUNE_OFF,
>  	PCIE_BUS_SAFE,
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
  2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch
@ 2015-08-17 22:30   ` Bjorn Helgaas
  2015-08-17 22:50       ` Jiang, Dave
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2015-08-17 22:30 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, linux-kernel, Dave Jiang

[+cc Mike, linux-rdma]

On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote:
> From: Dave Jiang <dave.jiang@intel.com>
> 
> This is in perperation of un-exporting the pcie_set_mps() function
> symbol. A driver should not be changing the MPS as that is the
> responsibility of the PCI subsystem.

Please explain the implications of removing this code.  Does this affect
performance of the device?  If so, how do we get that performance back?

I also cc'd the QIB maintainers for you:

  QIB DRIVER
  M:      Mike Marciniszyn <infinipath@intel.com>
  L:      linux-rdma@vger.kernel.org
  F:      drivers/infiniband/hw/qib/

> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/infiniband/hw/qib/qib_pcie.c |   27 +--------------------------
>  1 file changed, 1 insertion(+), 26 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/qib/qib_pcie.c b/drivers/infiniband/hw/qib/qib_pcie.c
> index 4758a38..b8a2dcd 100644
> --- a/drivers/infiniband/hw/qib/qib_pcie.c
> +++ b/drivers/infiniband/hw/qib/qib_pcie.c
> @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct qib_devdata *dd)
>   */
>  static int qib_pcie_caps;
>  module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO);
> -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), ReadReq (4..7)");
> +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)");
>  
>  static void qib_tune_pcie_caps(struct qib_devdata *dd)
>  {
>  	struct pci_dev *parent;
> -	u16 rc_mpss, rc_mps, ep_mpss, ep_mps;
>  	u16 rc_mrrs, ep_mrrs, max_mrrs;
>  
>  	/* Find out supported and configured values for parent (root) */
> @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct qib_devdata *dd)
>  	if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev))
>  		return;
>  
> -	rc_mpss = parent->pcie_mpss;
> -	rc_mps = ffs(pcie_get_mps(parent)) - 8;
> -	/* Find out supported and configured values for endpoint (us) */
> -	ep_mpss = dd->pcidev->pcie_mpss;
> -	ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8;
> -
> -	/* Find max payload supported by root, endpoint */
> -	if (rc_mpss > ep_mpss)
> -		rc_mpss = ep_mpss;
> -
> -	/* If Supported greater than limit in module param, limit it */
> -	if (rc_mpss > (qib_pcie_caps & 7))
> -		rc_mpss = qib_pcie_caps & 7;
> -	/* If less than (allowed, supported), bump root payload */
> -	if (rc_mpss > rc_mps) {
> -		rc_mps = rc_mpss;
> -		pcie_set_mps(parent, 128 << rc_mps);
> -	}
> -	/* If less than (allowed, supported), bump endpoint payload */
> -	if (rc_mpss > ep_mps) {
> -		ep_mps = rc_mpss;
> -		pcie_set_mps(dd->pcidev, 128 << ep_mps);
> -	}
> -
>  	/*
>  	 * Now the Read Request size.
>  	 * No field for max supported, but PCIe spec limits it to 4096,
> -- 
> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps()
  2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch
@ 2015-08-17 22:31   ` Bjorn Helgaas
  2015-08-17 23:32       ` Jiang, Dave
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2015-08-17 22:31 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, linux-kernel, Dave Jiang

On Wed, Jul 29, 2015 at 04:18:55PM -0600, Keith Busch wrote:
> From: Dave Jiang <dave.jiang@intel.com>
> 
> The setting of PCIe MPS should be left to the PCI subsystem and not
> the driver. An ill configured MPS by the driver could cause the device
> to not function or unstablize the entire system. Removing the exported
> symbol.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/pci/pci.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 0008c95..92349ee 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4121,7 +4121,6 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
>  	return pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
>  						  PCI_EXP_DEVCTL_PAYLOAD, v);
>  }
> -EXPORT_SYMBOL(pcie_set_mps);

I think the pcie_set_mps() declaration could be moved from
include/linux/pci.h to drivers/pci/pci.h, couldn't it?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
  2015-08-17 22:30   ` Bjorn Helgaas
  2015-08-17 22:50       ` Jiang, Dave
@ 2015-08-17 22:50       ` Jiang, Dave
  0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Dave @ 2015-08-17 22:50 UTC (permalink / raw)
  To: Busch, Keith, bhelgaas; +Cc: linux-kernel, linux-pci, linux-rdma, infinipath

On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote:
> [+cc Mike, linux-rdma]
> 
> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote:
> > From: Dave Jiang <dave.jiang@intel.com>
> > 
> > This is in perperation of un-exporting the pcie_set_mps() function
> > symbol. A driver should not be changing the MPS as that is the
> > responsibility of the PCI subsystem.
> 
> Please explain the implications of removing this code.  Does this 
> affect
> performance of the device?  If so, how do we get that performance 
> back?

Honestly I don't know. But at the same time I think the driver
shouldn't be touching the MPS at all. Shouldn't that be left to the
PCIe subsystem and rely on the PCIe subsystem to set this to a sane
value? 

> 
> I also cc'd the QIB maintainers for you:
> 
>   QIB DRIVER
>   M:      Mike Marciniszyn <infinipath@intel.com>
>   L:      linux-rdma@vger.kernel.org
>   F:      drivers/infiniband/hw/qib/
> 
> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> > ---
> >  drivers/infiniband/hw/qib/qib_pcie.c |   27 +---------------------
> > -----
> >  1 file changed, 1 insertion(+), 26 deletions(-)
> > 
> > diff --git a/drivers/infiniband/hw/qib/qib_pcie.c 
> > b/drivers/infiniband/hw/qib/qib_pcie.c
> > index 4758a38..b8a2dcd 100644
> > --- a/drivers/infiniband/hw/qib/qib_pcie.c
> > +++ b/drivers/infiniband/hw/qib/qib_pcie.c
> > @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct 
> > qib_devdata *dd)
> >   */
> >  static int qib_pcie_caps;
> >  module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO);
> > -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), 
> > ReadReq (4..7)");
> > +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)");
> >  
> >  static void qib_tune_pcie_caps(struct qib_devdata *dd)
> >  {
> >  	struct pci_dev *parent;
> > -	u16 rc_mpss, rc_mps, ep_mpss, ep_mps;
> >  	u16 rc_mrrs, ep_mrrs, max_mrrs;
> >  
> >  	/* Find out supported and configured values for parent 
> > (root) */
> > @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct 
> > qib_devdata *dd)
> >  	if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev))
> >  		return;
> >  
> > -	rc_mpss = parent->pcie_mpss;
> > -	rc_mps = ffs(pcie_get_mps(parent)) - 8;
> > -	/* Find out supported and configured values for endpoint 
> > (us) */
> > -	ep_mpss = dd->pcidev->pcie_mpss;
> > -	ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8;
> > -
> > -	/* Find max payload supported by root, endpoint */
> > -	if (rc_mpss > ep_mpss)
> > -		rc_mpss = ep_mpss;
> > -
> > -	/* If Supported greater than limit in module param, limit 
> > it */
> > -	if (rc_mpss > (qib_pcie_caps & 7))
> > -		rc_mpss = qib_pcie_caps & 7;
> > -	/* If less than (allowed, supported), bump root payload */
> > -	if (rc_mpss > rc_mps) {
> > -		rc_mps = rc_mpss;
> > -		pcie_set_mps(parent, 128 << rc_mps);
> > -	}
> > -	/* If less than (allowed, supported), bump endpoint 
> > payload */
> > -	if (rc_mpss > ep_mps) {
> > -		ep_mps = rc_mpss;
> > -		pcie_set_mps(dd->pcidev, 128 << ep_mps);
> > -	}
> > -
> >  	/*
> >  	 * Now the Read Request size.
> >  	 * No field for max supported, but PCIe spec limits it to 
> > 4096,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
@ 2015-08-17 22:50       ` Jiang, Dave
  0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Dave @ 2015-08-17 22:50 UTC (permalink / raw)
  To: Busch, Keith, bhelgaas; +Cc: linux-kernel, linux-pci, linux-rdma, infinipath

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3317 bytes --]

On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote:
> [+cc Mike, linux-rdma]
> 
> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote:
> > From: Dave Jiang <dave.jiang@intel.com>
> > 
> > This is in perperation of un-exporting the pcie_set_mps() function
> > symbol. A driver should not be changing the MPS as that is the
> > responsibility of the PCI subsystem.
> 
> Please explain the implications of removing this code.  Does this 
> affect
> performance of the device?  If so, how do we get that performance 
> back?

Honestly I don't know. But at the same time I think the driver
shouldn't be touching the MPS at all. Shouldn't that be left to the
PCIe subsystem and rely on the PCIe subsystem to set this to a sane
value? 

> 
> I also cc'd the QIB maintainers for you:
> 
>   QIB DRIVER
>   M:      Mike Marciniszyn <infinipath@intel.com>
>   L:      linux-rdma@vger.kernel.org
>   F:      drivers/infiniband/hw/qib/
> 
> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> > ---
> >  drivers/infiniband/hw/qib/qib_pcie.c |   27 +---------------------
> > -----
> >  1 file changed, 1 insertion(+), 26 deletions(-)
> > 
> > diff --git a/drivers/infiniband/hw/qib/qib_pcie.c 
> > b/drivers/infiniband/hw/qib/qib_pcie.c
> > index 4758a38..b8a2dcd 100644
> > --- a/drivers/infiniband/hw/qib/qib_pcie.c
> > +++ b/drivers/infiniband/hw/qib/qib_pcie.c
> > @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct 
> > qib_devdata *dd)
> >   */
> >  static int qib_pcie_caps;
> >  module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO);
> > -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), 
> > ReadReq (4..7)");
> > +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)");
> >  
> >  static void qib_tune_pcie_caps(struct qib_devdata *dd)
> >  {
> >  	struct pci_dev *parent;
> > -	u16 rc_mpss, rc_mps, ep_mpss, ep_mps;
> >  	u16 rc_mrrs, ep_mrrs, max_mrrs;
> >  
> >  	/* Find out supported and configured values for parent 
> > (root) */
> > @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct 
> > qib_devdata *dd)
> >  	if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev))
> >  		return;
> >  
> > -	rc_mpss = parent->pcie_mpss;
> > -	rc_mps = ffs(pcie_get_mps(parent)) - 8;
> > -	/* Find out supported and configured values for endpoint 
> > (us) */
> > -	ep_mpss = dd->pcidev->pcie_mpss;
> > -	ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8;
> > -
> > -	/* Find max payload supported by root, endpoint */
> > -	if (rc_mpss > ep_mpss)
> > -		rc_mpss = ep_mpss;
> > -
> > -	/* If Supported greater than limit in module param, limit 
> > it */
> > -	if (rc_mpss > (qib_pcie_caps & 7))
> > -		rc_mpss = qib_pcie_caps & 7;
> > -	/* If less than (allowed, supported), bump root payload */
> > -	if (rc_mpss > rc_mps) {
> > -		rc_mps = rc_mpss;
> > -		pcie_set_mps(parent, 128 << rc_mps);
> > -	}
> > -	/* If less than (allowed, supported), bump endpoint 
> > payload */
> > -	if (rc_mpss > ep_mps) {
> > -		ep_mps = rc_mpss;
> > -		pcie_set_mps(dd->pcidev, 128 << ep_mps);
> > -	}
> > -
> >  	/*
> >  	 * Now the Read Request size.
> >  	 * No field for max supported, but PCIe spec limits it to 
> > 4096,ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
@ 2015-08-17 22:50       ` Jiang, Dave
  0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Dave @ 2015-08-17 22:50 UTC (permalink / raw)
  To: Busch, Keith, bhelgaas; +Cc: linux-kernel, linux-pci, linux-rdma, infinipath

T24gTW9uLCAyMDE1LTA4LTE3IGF0IDE3OjMwIC0wNTAwLCBCam9ybiBIZWxnYWFzIHdyb3RlOg0K
PiBbK2NjIE1pa2UsIGxpbnV4LXJkbWFdDQo+IA0KPiBPbiBXZWQsIEp1bCAyOSwgMjAxNSBhdCAw
NDoxODo1NFBNIC0wNjAwLCBLZWl0aCBCdXNjaCB3cm90ZToNCj4gPiBGcm9tOiBEYXZlIEppYW5n
IDxkYXZlLmppYW5nQGludGVsLmNvbT4NCj4gPiANCj4gPiBUaGlzIGlzIGluIHBlcnBlcmF0aW9u
IG9mIHVuLWV4cG9ydGluZyB0aGUgcGNpZV9zZXRfbXBzKCkgZnVuY3Rpb24NCj4gPiBzeW1ib2wu
IEEgZHJpdmVyIHNob3VsZCBub3QgYmUgY2hhbmdpbmcgdGhlIE1QUyBhcyB0aGF0IGlzIHRoZQ0K
PiA+IHJlc3BvbnNpYmlsaXR5IG9mIHRoZSBQQ0kgc3Vic3lzdGVtLg0KPiANCj4gUGxlYXNlIGV4
cGxhaW4gdGhlIGltcGxpY2F0aW9ucyBvZiByZW1vdmluZyB0aGlzIGNvZGUuICBEb2VzIHRoaXMg
DQo+IGFmZmVjdA0KPiBwZXJmb3JtYW5jZSBvZiB0aGUgZGV2aWNlPyAgSWYgc28sIGhvdyBkbyB3
ZSBnZXQgdGhhdCBwZXJmb3JtYW5jZSANCj4gYmFjaz8NCg0KSG9uZXN0bHkgSSBkb24ndCBrbm93
LiBCdXQgYXQgdGhlIHNhbWUgdGltZSBJIHRoaW5rIHRoZSBkcml2ZXINCnNob3VsZG4ndCBiZSB0
b3VjaGluZyB0aGUgTVBTIGF0IGFsbC4gU2hvdWxkbid0IHRoYXQgYmUgbGVmdCB0byB0aGUNClBD
SWUgc3Vic3lzdGVtIGFuZCByZWx5IG9uIHRoZSBQQ0llIHN1YnN5c3RlbSB0byBzZXQgdGhpcyB0
byBhIHNhbmUNCnZhbHVlPyANCg0KPiANCj4gSSBhbHNvIGNjJ2QgdGhlIFFJQiBtYWludGFpbmVy
cyBmb3IgeW91Og0KPiANCj4gICBRSUIgRFJJVkVSDQo+ICAgTTogICAgICBNaWtlIE1hcmNpbmlz
enluIDxpbmZpbmlwYXRoQGludGVsLmNvbT4NCj4gICBMOiAgICAgIGxpbnV4LXJkbWFAdmdlci5r
ZXJuZWwub3JnDQo+ICAgRjogICAgICBkcml2ZXJzL2luZmluaWJhbmQvaHcvcWliLw0KPiANCj4g
PiBTaWduZWQtb2ZmLWJ5OiBEYXZlIEppYW5nIDxkYXZlLmppYW5nQGludGVsLmNvbT4NCj4gPiAt
LS0NCj4gPiAgZHJpdmVycy9pbmZpbmliYW5kL2h3L3FpYi9xaWJfcGNpZS5jIHwgICAyNyArLS0t
LS0tLS0tLS0tLS0tLS0tLS0tDQo+ID4gLS0tLS0NCj4gPiAgMSBmaWxlIGNoYW5nZWQsIDEgaW5z
ZXJ0aW9uKCspLCAyNiBkZWxldGlvbnMoLSkNCj4gPiANCj4gPiBkaWZmIC0tZ2l0IGEvZHJpdmVy
cy9pbmZpbmliYW5kL2h3L3FpYi9xaWJfcGNpZS5jIA0KPiA+IGIvZHJpdmVycy9pbmZpbmliYW5k
L2h3L3FpYi9xaWJfcGNpZS5jDQo+ID4gaW5kZXggNDc1OGEzOC4uYjhhMmRjZCAxMDA2NDQNCj4g
PiAtLS0gYS9kcml2ZXJzL2luZmluaWJhbmQvaHcvcWliL3FpYl9wY2llLmMNCj4gPiArKysgYi9k
cml2ZXJzL2luZmluaWJhbmQvaHcvcWliL3FpYl9wY2llLmMNCj4gPiBAQCAtNTU3LDEyICs1NTcs
MTEgQEAgc3RhdGljIHZvaWQgcWliX3R1bmVfcGNpZV9jb2FsZXNjZShzdHJ1Y3QgDQo+ID4gcWli
X2RldmRhdGEgKmRkKQ0KPiA+ICAgKi8NCj4gPiAgc3RhdGljIGludCBxaWJfcGNpZV9jYXBzOw0K
PiA+ICBtb2R1bGVfcGFyYW1fbmFtZWQocGNpZV9jYXBzLCBxaWJfcGNpZV9jYXBzLCBpbnQsIFNf
SVJVR08pOw0KPiA+IC1NT0RVTEVfUEFSTV9ERVNDKHBjaWVfY2FwcywgIk1heCBQQ0llIHR1bmlu
ZzogUGF5bG9hZCAoMC4uMyksIA0KPiA+IFJlYWRSZXEgKDQuLjcpIik7DQo+ID4gK01PRFVMRV9Q
QVJNX0RFU0MocGNpZV9jYXBzLCAiTWF4IFBDSWUgdHVuaW5nOiBSZWFkUmVxICg0Li43KSIpOw0K
PiA+ICANCj4gPiAgc3RhdGljIHZvaWQgcWliX3R1bmVfcGNpZV9jYXBzKHN0cnVjdCBxaWJfZGV2
ZGF0YSAqZGQpDQo+ID4gIHsNCj4gPiAgCXN0cnVjdCBwY2lfZGV2ICpwYXJlbnQ7DQo+ID4gLQl1
MTYgcmNfbXBzcywgcmNfbXBzLCBlcF9tcHNzLCBlcF9tcHM7DQo+ID4gIAl1MTYgcmNfbXJycywg
ZXBfbXJycywgbWF4X21ycnM7DQo+ID4gIA0KPiA+ICAJLyogRmluZCBvdXQgc3VwcG9ydGVkIGFu
ZCBjb25maWd1cmVkIHZhbHVlcyBmb3IgcGFyZW50IA0KPiA+IChyb290KSAqLw0KPiA+IEBAIC01
NzUsMzAgKzU3NCw2IEBAIHN0YXRpYyB2b2lkIHFpYl90dW5lX3BjaWVfY2FwcyhzdHJ1Y3QgDQo+
ID4gcWliX2RldmRhdGEgKmRkKQ0KPiA+ICAJaWYgKCFwY2lfaXNfcGNpZShwYXJlbnQpIHx8ICFw
Y2lfaXNfcGNpZShkZC0+cGNpZGV2KSkNCj4gPiAgCQlyZXR1cm47DQo+ID4gIA0KPiA+IC0JcmNf
bXBzcyA9IHBhcmVudC0+cGNpZV9tcHNzOw0KPiA+IC0JcmNfbXBzID0gZmZzKHBjaWVfZ2V0X21w
cyhwYXJlbnQpKSAtIDg7DQo+ID4gLQkvKiBGaW5kIG91dCBzdXBwb3J0ZWQgYW5kIGNvbmZpZ3Vy
ZWQgdmFsdWVzIGZvciBlbmRwb2ludCANCj4gPiAodXMpICovDQo+ID4gLQllcF9tcHNzID0gZGQt
PnBjaWRldi0+cGNpZV9tcHNzOw0KPiA+IC0JZXBfbXBzID0gZmZzKHBjaWVfZ2V0X21wcyhkZC0+
cGNpZGV2KSkgLSA4Ow0KPiA+IC0NCj4gPiAtCS8qIEZpbmQgbWF4IHBheWxvYWQgc3VwcG9ydGVk
IGJ5IHJvb3QsIGVuZHBvaW50ICovDQo+ID4gLQlpZiAocmNfbXBzcyA+IGVwX21wc3MpDQo+ID4g
LQkJcmNfbXBzcyA9IGVwX21wc3M7DQo+ID4gLQ0KPiA+IC0JLyogSWYgU3VwcG9ydGVkIGdyZWF0
ZXIgdGhhbiBsaW1pdCBpbiBtb2R1bGUgcGFyYW0sIGxpbWl0IA0KPiA+IGl0ICovDQo+ID4gLQlp
ZiAocmNfbXBzcyA+IChxaWJfcGNpZV9jYXBzICYgNykpDQo+ID4gLQkJcmNfbXBzcyA9IHFpYl9w
Y2llX2NhcHMgJiA3Ow0KPiA+IC0JLyogSWYgbGVzcyB0aGFuIChhbGxvd2VkLCBzdXBwb3J0ZWQp
LCBidW1wIHJvb3QgcGF5bG9hZCAqLw0KPiA+IC0JaWYgKHJjX21wc3MgPiByY19tcHMpIHsNCj4g
PiAtCQlyY19tcHMgPSByY19tcHNzOw0KPiA+IC0JCXBjaWVfc2V0X21wcyhwYXJlbnQsIDEyOCA8
PCByY19tcHMpOw0KPiA+IC0JfQ0KPiA+IC0JLyogSWYgbGVzcyB0aGFuIChhbGxvd2VkLCBzdXBw
b3J0ZWQpLCBidW1wIGVuZHBvaW50IA0KPiA+IHBheWxvYWQgKi8NCj4gPiAtCWlmIChyY19tcHNz
ID4gZXBfbXBzKSB7DQo+ID4gLQkJZXBfbXBzID0gcmNfbXBzczsNCj4gPiAtCQlwY2llX3NldF9t
cHMoZGQtPnBjaWRldiwgMTI4IDw8IGVwX21wcyk7DQo+ID4gLQl9DQo+ID4gLQ0KPiA+ICAJLyoN
Cj4gPiAgCSAqIE5vdyB0aGUgUmVhZCBSZXF1ZXN0IHNpemUuDQo+ID4gIAkgKiBObyBmaWVsZCBm
b3IgbWF4IHN1cHBvcnRlZCwgYnV0IFBDSWUgc3BlYyBsaW1pdHMgaXQgdG8gDQo+ID4gNDA5Niw=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] pci: Default MPS tuning, match upstream port
  2015-08-17 22:28   ` Bjorn Helgaas
@ 2015-08-17 23:03     ` Keith Busch
  0 siblings, 0 replies; 16+ messages in thread
From: Keith Busch @ 2015-08-17 23:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, linux-pci, linux-kernel, Dave Jiang, Austin Bolen,
	Myron Stowe, Jon Mason

On Mon, 17 Aug 2015, Bjorn Helgaas wrote:
> On Wed, Jul 29, 2015 at 04:18:53PM -0600, Keith Busch wrote:
>> The new pcie tuning will check the device's MPS against the parent bridge
>> when it is initially added to the pci subsystem, prior to attaching
>> to a driver. If MPS is mismatched, the downstream port is set to the
>> parent bridge's if capable.
>
> This is a little confusing (at least, *I'm* confused).  You're talking
> about setting the MPS configuration of the "downstream port," but I
> think you are talking about either a Switch Upstream Port or an
> Endpoint.  Such a port is indeed *downstream* from the parent bridge,
> but referring to it as a "downstream port" risks confusing it with the
> parent bridge, which is either a Switch Downstream Port or a Root
> Port.

Yes, the wording is confusing, yet you've managed to describe my
intention, though. :)

>>  {
>>  	int retval;
>>
>> +	if (dev->bus->self &&
>> +			pcie_get_mps(dev) != pcie_get_mps(dev->bus->self))
>> +		return;
>
> This part looks like it could be in its own separate patch that
> basically enforces the "if we can't safely configure this device's MPS
> setting, prevent drivers from using the device" idea.  What happens in
> this case?  Does the device still appear in sysfs and lspci, even if
> we can't configure its MPS safely?  This seems like good place for a
> dev_warn().

It will remain in the pci topology and all the associated sysfs
artifacts and lspic capabilities will be there. You could retune the
whole topology from user space with "setpci" and do a rescan if you were
so inclined, but that could cause problems if transactions are in-flight
while re-tuning.

A dev_warn will be produced when the device is initially scanned as you
noted below, but another at this point might be useful to
make it clear a driver will not be bound.

The above is broken, though, for PCI device's that are not PCI-e, so
need to fix that at the very least if we will enforce this.

>> -	if (mps != p_mps)
>> -		dev_warn(&dev->dev, "Max Payload Size %d, but upstream %s set to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
>> -			 mps, pci_name(bridge), p_mps);
>> +	if (mps != p_mps) {
>> +		if (128 << dev->pcie_mpss < p_mps) {
>> +			dev_warn(&dev->dev,
>> +				"Max Payload Size Supported %d, but upstream %s set to %d. If problems are experienced, try running with pci=pcie_bus_safe\n",
>> +				128 << dev->pcie_mpss, pci_name(bridge), p_mps);
>> +			return;
>
> Isn't this the same case where the above change to
> pci_bus_add_device() will now decline to add the device?  So I think
> there are really two policy changes:
>
>  1) If an device can be configured to match the upstream bridge's MPS
>  setting, do it, and
>
>  2) Don't add a device when its MPS setting doesn't match the
>  upstream bridge
>
> I'd like those to be separate patches.

Ok.

>> +		}
>> +		pcie_write_mps(dev, p_mps);
>> +		dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n",
>> +			pcie_get_mps(dev), 128 << dev->pcie_mpss, mps);
>> +	}
>
> I assume this hunk is related to the policy change (from "do nothing"
> to "update the downstream port").  Can you split that policy change
> into its own separate minimal patch?  Yes, I'm looking for minimal and
> specific bisect targets :)
>
> I think this will be clearer if you write it as:
>
>  if (mps == p_mps)
>    return;
>
>  if (128 << dev->pcie_mpss < p_mps) {
>    dev_warn(...);
>    return;
>  }
>
>  pcie_write_mps(...);
>  dev_info(...);
>
> Now that this actively updates the MPS setting, it's starting to grow
> beyond the original "pcie_bus_detect_mps()" function name.

Agreed that's a cleaner flow.


Thanks for all the feedback. Will work on a revision this week.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps()
  2015-08-17 22:31   ` Bjorn Helgaas
@ 2015-08-17 23:32       ` Jiang, Dave
  0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Dave @ 2015-08-17 23:32 UTC (permalink / raw)
  To: Busch, Keith, bhelgaas; +Cc: linux-kernel, linux-pci

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1219 bytes --]

On Mon, 2015-08-17 at 17:31 -0500, Bjorn Helgaas wrote:
> On Wed, Jul 29, 2015 at 04:18:55PM -0600, Keith Busch wrote:
> > From: Dave Jiang <dave.jiang@intel.com>
> > 
> > The setting of PCIe MPS should be left to the PCI subsystem and not
> > the driver. An ill configured MPS by the driver could cause the 
> > device
> > to not function or unstablize the entire system. Removing the 
> > exported
> > symbol.
> > 
> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> > ---
> >  drivers/pci/pci.c |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 0008c95..92349ee 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -4121,7 +4121,6 @@ int pcie_set_mps(struct pci_dev *dev, int 
> > mps)
> >  	return pcie_capability_clear_and_set_word(dev, 
> > PCI_EXP_DEVCTL,
> >  						 
> >  PCI_EXP_DEVCTL_PAYLOAD, v);
> >  }
> > -EXPORT_SYMBOL(pcie_set_mps);
> 
> I think the pcie_set_mps() declaration could be moved from
> include/linux/pci.h to drivers/pci/pci.h, couldn't it?

Ok.ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps()
@ 2015-08-17 23:32       ` Jiang, Dave
  0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Dave @ 2015-08-17 23:32 UTC (permalink / raw)
  To: Busch, Keith, bhelgaas; +Cc: linux-kernel, linux-pci

T24gTW9uLCAyMDE1LTA4LTE3IGF0IDE3OjMxIC0wNTAwLCBCam9ybiBIZWxnYWFzIHdyb3RlOg0K
PiBPbiBXZWQsIEp1bCAyOSwgMjAxNSBhdCAwNDoxODo1NVBNIC0wNjAwLCBLZWl0aCBCdXNjaCB3
cm90ZToNCj4gPiBGcm9tOiBEYXZlIEppYW5nIDxkYXZlLmppYW5nQGludGVsLmNvbT4NCj4gPiAN
Cj4gPiBUaGUgc2V0dGluZyBvZiBQQ0llIE1QUyBzaG91bGQgYmUgbGVmdCB0byB0aGUgUENJIHN1
YnN5c3RlbSBhbmQgbm90DQo+ID4gdGhlIGRyaXZlci4gQW4gaWxsIGNvbmZpZ3VyZWQgTVBTIGJ5
IHRoZSBkcml2ZXIgY291bGQgY2F1c2UgdGhlIA0KPiA+IGRldmljZQ0KPiA+IHRvIG5vdCBmdW5j
dGlvbiBvciB1bnN0YWJsaXplIHRoZSBlbnRpcmUgc3lzdGVtLiBSZW1vdmluZyB0aGUgDQo+ID4g
ZXhwb3J0ZWQNCj4gPiBzeW1ib2wuDQo+ID4gDQo+ID4gU2lnbmVkLW9mZi1ieTogRGF2ZSBKaWFu
ZyA8ZGF2ZS5qaWFuZ0BpbnRlbC5jb20+DQo+ID4gLS0tDQo+ID4gIGRyaXZlcnMvcGNpL3BjaS5j
IHwgICAgMSAtDQo+ID4gIDEgZmlsZSBjaGFuZ2VkLCAxIGRlbGV0aW9uKC0pDQo+ID4gDQo+ID4g
ZGlmZiAtLWdpdCBhL2RyaXZlcnMvcGNpL3BjaS5jIGIvZHJpdmVycy9wY2kvcGNpLmMNCj4gPiBp
bmRleCAwMDA4Yzk1Li45MjM0OWVlIDEwMDY0NA0KPiA+IC0tLSBhL2RyaXZlcnMvcGNpL3BjaS5j
DQo+ID4gKysrIGIvZHJpdmVycy9wY2kvcGNpLmMNCj4gPiBAQCAtNDEyMSw3ICs0MTIxLDYgQEAg
aW50IHBjaWVfc2V0X21wcyhzdHJ1Y3QgcGNpX2RldiAqZGV2LCBpbnQgDQo+ID4gbXBzKQ0KPiA+
ICAJcmV0dXJuIHBjaWVfY2FwYWJpbGl0eV9jbGVhcl9hbmRfc2V0X3dvcmQoZGV2LCANCj4gPiBQ
Q0lfRVhQX0RFVkNUTCwNCj4gPiAgCQkJCQkJIA0KPiA+ICBQQ0lfRVhQX0RFVkNUTF9QQVlMT0FE
LCB2KTsNCj4gPiAgfQ0KPiA+IC1FWFBPUlRfU1lNQk9MKHBjaWVfc2V0X21wcyk7DQo+IA0KPiBJ
IHRoaW5rIHRoZSBwY2llX3NldF9tcHMoKSBkZWNsYXJhdGlvbiBjb3VsZCBiZSBtb3ZlZCBmcm9t
DQo+IGluY2x1ZGUvbGludXgvcGNpLmggdG8gZHJpdmVycy9wY2kvcGNpLmgsIGNvdWxkbid0IGl0
Pw0KDQpPay4=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
  2015-08-17 22:50       ` Jiang, Dave
  (?)
  (?)
@ 2015-08-18  0:06       ` Bjorn Helgaas
  2015-08-18  1:49           ` Jason Gunthorpe
  -1 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2015-08-18  0:06 UTC (permalink / raw)
  To: Jiang, Dave; +Cc: Busch, Keith, linux-kernel, linux-pci, linux-rdma, infinipath

On Mon, Aug 17, 2015 at 5:50 PM, Jiang, Dave <dave.jiang@intel.com> wrote:
> On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote:
>> [+cc Mike, linux-rdma]
>>
>> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote:
>> > From: Dave Jiang <dave.jiang@intel.com>
>> >
>> > This is in perperation of un-exporting the pcie_set_mps() function
>> > symbol. A driver should not be changing the MPS as that is the
>> > responsibility of the PCI subsystem.
>>
>> Please explain the implications of removing this code.  Does this
>> affect
>> performance of the device?  If so, how do we get that performance
>> back?
>
> Honestly I don't know. But at the same time I think the driver
> shouldn't be touching the MPS at all. Shouldn't that be left to the
> PCIe subsystem and rely on the PCIe subsystem to set this to a sane
> value?

Yes, I think in principle the PCI core should own this, but I also
don't want to introduce a performance regression, so I think we need
to understand whether there's a problem, and if there is, fix it.

>> I also cc'd the QIB maintainers for you:
>>
>>   QIB DRIVER
>>   M:      Mike Marciniszyn <infinipath@intel.com>
>>   L:      linux-rdma@vger.kernel.org
>>   F:      drivers/infiniband/hw/qib/
>>
>> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> > ---
>> >  drivers/infiniband/hw/qib/qib_pcie.c |   27 +---------------------
>> > -----
>> >  1 file changed, 1 insertion(+), 26 deletions(-)
>> >
>> > diff --git a/drivers/infiniband/hw/qib/qib_pcie.c
>> > b/drivers/infiniband/hw/qib/qib_pcie.c
>> > index 4758a38..b8a2dcd 100644
>> > --- a/drivers/infiniband/hw/qib/qib_pcie.c
>> > +++ b/drivers/infiniband/hw/qib/qib_pcie.c
>> > @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct
>> > qib_devdata *dd)
>> >   */
>> >  static int qib_pcie_caps;
>> >  module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO);
>> > -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3),
>> > ReadReq (4..7)");
>> > +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)");
>> >
>> >  static void qib_tune_pcie_caps(struct qib_devdata *dd)
>> >  {
>> >     struct pci_dev *parent;
>> > -   u16 rc_mpss, rc_mps, ep_mpss, ep_mps;
>> >     u16 rc_mrrs, ep_mrrs, max_mrrs;
>> >
>> >     /* Find out supported and configured values for parent
>> > (root) */
>> > @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct
>> > qib_devdata *dd)
>> >     if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev))
>> >             return;
>> >
>> > -   rc_mpss = parent->pcie_mpss;
>> > -   rc_mps = ffs(pcie_get_mps(parent)) - 8;
>> > -   /* Find out supported and configured values for endpoint
>> > (us) */
>> > -   ep_mpss = dd->pcidev->pcie_mpss;
>> > -   ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8;
>> > -
>> > -   /* Find max payload supported by root, endpoint */
>> > -   if (rc_mpss > ep_mpss)
>> > -           rc_mpss = ep_mpss;
>> > -
>> > -   /* If Supported greater than limit in module param, limit
>> > it */
>> > -   if (rc_mpss > (qib_pcie_caps & 7))
>> > -           rc_mpss = qib_pcie_caps & 7;
>> > -   /* If less than (allowed, supported), bump root payload */
>> > -   if (rc_mpss > rc_mps) {
>> > -           rc_mps = rc_mpss;
>> > -           pcie_set_mps(parent, 128 << rc_mps);
>> > -   }
>> > -   /* If less than (allowed, supported), bump endpoint
>> > payload */
>> > -   if (rc_mpss > ep_mps) {
>> > -           ep_mps = rc_mpss;
>> > -           pcie_set_mps(dd->pcidev, 128 << ep_mps);
>> > -   }
>> > -
>> >     /*
>> >      * Now the Read Request size.
>> >      * No field for max supported, but PCIe spec limits it to
>> > 4096,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
  2015-08-18  0:06       ` Bjorn Helgaas
@ 2015-08-18  1:49           ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2015-08-18  1:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiang, Dave, Busch, Keith, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-pci-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, infinipath

On Mon, Aug 17, 2015 at 07:06:10PM -0500, Bjorn Helgaas wrote:
> On Mon, Aug 17, 2015 at 5:50 PM, Jiang, Dave <dave.jiang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> > On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote:
> >> [+cc Mike, linux-rdma]
> >>
> >> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote:
> >> > From: Dave Jiang <dave.jiang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >> >
> >> > This is in perperation of un-exporting the pcie_set_mps() function
> >> > symbol. A driver should not be changing the MPS as that is the
> >> > responsibility of the PCI subsystem.
> >>
> >> Please explain the implications of removing this code.  Does this
> >> affect
> >> performance of the device?  If so, how do we get that performance
> >> back?
> >
> > Honestly I don't know. But at the same time I think the driver
> > shouldn't be touching the MPS at all. Shouldn't that be left to the
> > PCIe subsystem and rely on the PCIe subsystem to set this to a sane
> > value?
> 
> Yes, I think in principle the PCI core should own this, but I also
> don't want to introduce a performance regression, so I think we need
> to understand whether there's a problem, and if there is, fix it.

Making sure Mike is CC'd directly..

Mike: I see this has been cut and pasted to HFI1 too, I would be
disappointed if HFI needs it as well. :(

FWIW, I totally agree with the above. MPS/MRS and related have to do
with root port capability, switches in the path and any platform
bugs..

I'm not sure why a driver would ever want to mess with this, and since
this code doesn't walk the bus toward the root port, it is technically
wrong, right? I also find it strange that qib_pcie_caps defaults to
zero which means the 'tuning' reduces the payload size to the minimums
(??)

> >> >     /*
> >> >      * Now the Read Request size.
> >> >      * No field for max supported, but PCIe spec limits it to
> >> > 4096,

.. it has been a bit since I looked at this, but IIRC, MRRS is also
something that should not be touched by a driver?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps()
@ 2015-08-18  1:49           ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2015-08-18  1:49 UTC (permalink / raw)
  To: Mike Marciniszyn, Bjorn Helgaas
  Cc: Jiang, Dave, Busch, Keith, linux-kernel, linux-pci, linux-rdma,
	infinipath

On Mon, Aug 17, 2015 at 07:06:10PM -0500, Bjorn Helgaas wrote:
> On Mon, Aug 17, 2015 at 5:50 PM, Jiang, Dave <dave.jiang@intel.com> wrote:
> > On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote:
> >> [+cc Mike, linux-rdma]
> >>
> >> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote:
> >> > From: Dave Jiang <dave.jiang@intel.com>
> >> >
> >> > This is in perperation of un-exporting the pcie_set_mps() function
> >> > symbol. A driver should not be changing the MPS as that is the
> >> > responsibility of the PCI subsystem.
> >>
> >> Please explain the implications of removing this code.  Does this
> >> affect
> >> performance of the device?  If so, how do we get that performance
> >> back?
> >
> > Honestly I don't know. But at the same time I think the driver
> > shouldn't be touching the MPS at all. Shouldn't that be left to the
> > PCIe subsystem and rely on the PCIe subsystem to set this to a sane
> > value?
> 
> Yes, I think in principle the PCI core should own this, but I also
> don't want to introduce a performance regression, so I think we need
> to understand whether there's a problem, and if there is, fix it.

Making sure Mike is CC'd directly..

Mike: I see this has been cut and pasted to HFI1 too, I would be
disappointed if HFI needs it as well. :(

FWIW, I totally agree with the above. MPS/MRS and related have to do
with root port capability, switches in the path and any platform
bugs..

I'm not sure why a driver would ever want to mess with this, and since
this code doesn't walk the bus toward the root port, it is technically
wrong, right? I also find it strange that qib_pcie_caps defaults to
zero which means the 'tuning' reduces the payload size to the minimums
(??)

> >> >     /*
> >> >      * Now the Read Request size.
> >> >      * No field for max supported, but PCIe spec limits it to
> >> > 4096,

.. it has been a bit since I looked at this, but IIRC, MRRS is also
something that should not be touched by a driver?

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-08-18  1:49 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch
2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch
2015-08-17 22:28   ` Bjorn Helgaas
2015-08-17 23:03     ` Keith Busch
2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch
2015-08-17 22:30   ` Bjorn Helgaas
2015-08-17 22:50     ` Jiang, Dave
2015-08-17 22:50       ` Jiang, Dave
2015-08-17 22:50       ` Jiang, Dave
2015-08-18  0:06       ` Bjorn Helgaas
2015-08-18  1:49         ` Jason Gunthorpe
2015-08-18  1:49           ` Jason Gunthorpe
2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch
2015-08-17 22:31   ` Bjorn Helgaas
2015-08-17 23:32     ` Jiang, Dave
2015-08-17 23:32       ` Jiang, Dave

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.