All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] PCI: MPS patches
@ 2012-10-12  5:35 Jon Mason
  2012-10-12  5:35 ` [PATCH 1/3] PCI: correct static code analysis tool issue Jon Mason
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jon Mason @ 2012-10-12  5:35 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci

Hello Bjorn,
Patch 1/3 is a resend that appears to have not been included in your
  tree (http://permalink.gmane.org/gmane.linux.kernel.pci/16627)
Patch 2/3 is a comment error that Don Dutile noticed in the original 
  submission of the previous patch
Patch 3/3 is a new patch that provides a more sane default for the MPS
  detection.  While I still believe we should have the MPS 
  detection/modification "on" by default, this provides a good default.
  And, hopefully, it will help find issues with MPS-hardware via users
  self-selecting to test it.

Thanks,
Jon


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] PCI: correct static code analysis tool issue
  2012-10-12  5:35 [PATCH 0/3] PCI: MPS patches Jon Mason
@ 2012-10-12  5:35 ` Jon Mason
  2012-10-25  7:29   ` Yijing Wang
  2012-10-12  5:35 ` [PATCH 2/3] PCI: Fix comment syntax Jon Mason
  2012-10-12  5:35 ` [PATCH 3/3] PCI: Add new default PCI-E MPS bus state Jon Mason
  2 siblings, 1 reply; 9+ messages in thread
From: Jon Mason @ 2012-10-12  5:35 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Jon Mason

Coverity noticed that pcie_bus_configure_settings() can call
pcie_bus_configure_set() when "smpss" is uninitialized.  The smpss is
used to make the bus and all child devices have a uniform value.
Setting the smpss is moot for PCIE_BUS_PERFORMANCE, since it will be
determined dynamically based on the max size of the parent device and
the child device.  To avoid the erroneous detection of this issue, set
the smpss to 0.  Also, add a case statement to make clearer the possible
use cases and add a nice comment describing what is being done in the
PCIE_BUS_PERFORMANCE case.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
---
 drivers/pci/probe.c |   25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 9f8a6b7..6b672c1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1570,21 +1570,36 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 	if (!pci_is_pcie(bus->self))
 		return;
 
-	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
-		return;
-
+	switch (pcie_bus_config) {
+	/* The smpss is used to make the bus and all child devices have
+	 * a uniform value.  Setting the smpss is moot for
+	 * "Performance", since it will be determined dynamically based
+	 * on the max size of the parent device and the child device.
+	 * However, not setting the smpss can result in static code
+	 * analysis tools erroniously reporting an issue.  To avoid
+	 * this, set the smpss to 0.
+	 */
+	case PCIE_BUS_PERFORMANCE:
 	/* FIXME - Peer to peer DMA is possible, though the endpoint would need
 	 * to be aware to the MPS of the destination.  To work around this,
 	 * simply force the MPS of the entire system to the smallest possible.
 	 */
-	if (pcie_bus_config == PCIE_BUS_PEER2PEER)
+	case PCIE_BUS_PEER2PEER:
 		smpss = 0;
+		break;
 
-	if (pcie_bus_config == PCIE_BUS_SAFE) {
+	case PCIE_BUS_SAFE:
 		smpss = mpss;
 
 		pcie_find_smpss(bus->self, &smpss);
 		pci_walk_bus(bus, pcie_find_smpss, &smpss);
+		break;
+
+	case PCIE_BUS_TUNE_OFF:
+	default:
+		return;
+	}
+
 	}
 
 	pcie_bus_configure_set(bus->self, &smpss);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] PCI: Fix comment syntax
  2012-10-12  5:35 [PATCH 0/3] PCI: MPS patches Jon Mason
  2012-10-12  5:35 ` [PATCH 1/3] PCI: correct static code analysis tool issue Jon Mason
@ 2012-10-12  5:35 ` Jon Mason
  2012-10-12  5:35 ` [PATCH 3/3] PCI: Add new default PCI-E MPS bus state Jon Mason
  2 siblings, 0 replies; 9+ messages in thread
From: Jon Mason @ 2012-10-12  5:35 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Jon Mason

Correct minor wording issue in MPS Peer-to-peer comment.

Noticed by Don Dutile

Signed-off-by: Jon Mason <jdmason@kudzu.us>
---
 drivers/pci/probe.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6b672c1..5a18652 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1581,7 +1581,7 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 	 */
 	case PCIE_BUS_PERFORMANCE:
 	/* FIXME - Peer to peer DMA is possible, though the endpoint would need
-	 * to be aware to the MPS of the destination.  To work around this,
+	 * to be aware of the MPS of the destination.  To work around this,
 	 * simply force the MPS of the entire system to the smallest possible.
 	 */
 	case PCIE_BUS_PEER2PEER:
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] PCI: Add new default PCI-E MPS bus state
  2012-10-12  5:35 [PATCH 0/3] PCI: MPS patches Jon Mason
  2012-10-12  5:35 ` [PATCH 1/3] PCI: correct static code analysis tool issue Jon Mason
  2012-10-12  5:35 ` [PATCH 2/3] PCI: Fix comment syntax Jon Mason
@ 2012-10-12  5:35 ` Jon Mason
  2012-10-15  2:32   ` Yijing Wang
  2 siblings, 1 reply; 9+ messages in thread
From: Jon Mason @ 2012-10-12  5:35 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Jon Mason, Yijing Wang

Add a new default state for the PCI-E MPS determination, PCIE_BUS_WARN.
This state notifies the user that a suboptimal configuration is
occurring (most likely due to incorrect configuration in the BIOS), and
provides them the boot parameter to enable this error to be corrected.
This provides the users an ability to detect an issue with MPS, without
forcing them to correct the issue (which may cause system hangs).  This
is a much more sane default for the distros.

Also, add debug output to show the default device MPS and MPSS.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
CC: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/pci.c    |    2 +-
 drivers/pci/probe.c  |   10 ++++++++++
 drivers/pci/quirks.c |    3 ++-
 include/linux/pci.h  |    1 +
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ab4bf5a..1723c81 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -78,7 +78,7 @@ unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
 unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
 unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
 
-enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_TUNE_OFF;
+enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_WARN;
 
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 5a18652..64bb393 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1448,6 +1448,9 @@ static int pcie_find_smpss(struct pci_dev *dev, void *data)
 	if (!pci_is_pcie(dev))
 		return 0;
 
+	dev_dbg(&dev->dev, "Device MPS %d and MPSS %d\n",
+		pcie_get_mps(dev), 128 << dev->pcie_mpss);
+
 	/* For PCIE hotplug enabled slots not connected directly to a
 	 * PCI-E root port, there can be problems when hotplugging
 	 * devices.  This is due to the possibility of hotplugging a
@@ -1588,6 +1591,7 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 		smpss = 0;
 		break;
 
+	case PCIE_BUS_WARN:
 	case PCIE_BUS_SAFE:
 		smpss = mpss;
 
@@ -1600,6 +1604,12 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 		return;
 	}
 
+	if (pcie_bus_config == PCIE_BUS_WARN) {
+		if (smpss != mpss)
+			dev_info(&bus->dev, "Non-optimal PCI-E Bus MPS value of %d being used instead of %d.\n"
+				 "Please use the pci=pcie_bus_safe boot parameter for better performance.\n",
+				 128 << smpss, 128 << mpss);
+		return;
 	}
 
 	pcie_bus_configure_set(bus->self, &smpss);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 5155317..e4eede0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
 	int err;
 	u16 rcc;
 
-	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
+	if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
+	    pcie_bus_config == PCIE_BUS_WARN)
 		return;
 
 	/* Intel errata specifies bits to change but does not say what they are.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5faa831..410eaf9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -662,6 +662,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);
 
 enum pcie_bus_config_types {
 	PCIE_BUS_TUNE_OFF,
+	PCIE_BUS_WARN,
 	PCIE_BUS_SAFE,
 	PCIE_BUS_PERFORMANCE,
 	PCIE_BUS_PEER2PEER,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] PCI: Add new default PCI-E MPS bus state
  2012-10-12  5:35 ` [PATCH 3/3] PCI: Add new default PCI-E MPS bus state Jon Mason
@ 2012-10-15  2:32   ` Yijing Wang
  2012-10-23 22:57     ` Jon Mason
  0 siblings, 1 reply; 9+ messages in thread
From: Yijing Wang @ 2012-10-15  2:32 UTC (permalink / raw)
  To: Jon Mason; +Cc: Bjorn Helgaas, linux-pci

On 2012/10/12 13:35, Jon Mason wrote:
> Add a new default state for the PCI-E MPS determination, PCIE_BUS_WARN.
> This state notifies the user that a suboptimal configuration is
> occurring (most likely due to incorrect configuration in the BIOS), and
> provides them the boot parameter to enable this error to be corrected.
> This provides the users an ability to detect an issue with MPS, without
> forcing them to correct the issue (which may cause system hangs).  This
> is a much more sane default for the distros.
> 
> Also, add debug output to show the default device MPS and MPSS.
> 
> Signed-off-by: Jon Mason <jdmason@kudzu.us>
> CC: Yijing Wang <wangyijing@huawei.com>
> ---
>  drivers/pci/pci.c    |    2 +-
>  drivers/pci/probe.c  |   10 ++++++++++
>  drivers/pci/quirks.c |    3 ++-
>  include/linux/pci.h  |    1 +
>  4 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ab4bf5a..1723c81 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -78,7 +78,7 @@ unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
>  unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
>  unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
>  
> -enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_TUNE_OFF;
> +enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_WARN;
>  
>  /*
>   * The default CLS is used if arch didn't set CLS explicitly and not
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 5a18652..64bb393 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1448,6 +1448,9 @@ static int pcie_find_smpss(struct pci_dev *dev, void *data)
>  	if (!pci_is_pcie(dev))
>  		return 0;
>  
> +	dev_dbg(&dev->dev, "Device MPS %d and MPSS %d\n",
> +		pcie_get_mps(dev), 128 << dev->pcie_mpss);
> +

What about dev_prinkt(KERN_DEBUG, .....), so users can get this information from dmesg directly?

>  	/* For PCIE hotplug enabled slots not connected directly to a
>  	 * PCI-E root port, there can be problems when hotplugging
>  	 * devices.  This is due to the possibility of hotplugging a
> @@ -1588,6 +1591,7 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>  		smpss = 0;
>  		break;
>  
> +	case PCIE_BUS_WARN:
>  	case PCIE_BUS_SAFE:
>  		smpss = mpss;
>  
> @@ -1600,6 +1604,12 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>  		return;
>  	}
>  
> +	if (pcie_bus_config == PCIE_BUS_WARN) {
> +		if (smpss != mpss)
> +			dev_info(&bus->dev, "Non-optimal PCI-E Bus MPS value of %d being used instead of %d.\n"
> +				 "Please use the pci=pcie_bus_safe boot parameter for better performance.\n",
> +				 128 << smpss, 128 << mpss);

test log in my ia64 machine(support pciehp)

 |           +-07.0-[0000:46-4f]----00.0-[0000:47-4f]--+-04.0-[0000:48-49]----00.0-[0000:49]--
 |           |                                         +-08.0-[0000:4a]--
 |           |                                         +-09.0-[0000:4b]--+-00.0  Intel Corporation 82576 Gigabit Network Connection
 |           |                                         |                 \-00.1  Intel Corporation 82576 Gigabit Network Connection


hot add log:
pci 0000:4b:00.0: [8086:10c9] type 00 class 0x020000
pci 0000:4b:00.0: reg 10: [mem 0x00000000-0x0001ffff]
pci 0000:4b:00.0: reg 14: [mem 0x00000000-0x0001ffff]
pci 0000:4b:00.0: reg 18: [io  0x0000-0x001f]
pci 0000:4b:00.0: reg 1c: [mem 0x00000000-0x00003fff]
pci 0000:4b:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
pci 0000:4b:00.0: PME# supported from D0 D3hot D3cold
pci 0000:4b:00.1: [8086:10c9] type 00 class 0x020000
pci 0000:4b:00.1: reg 10: [mem 0x00000000-0x0001ffff]
pci 0000:4b:00.1: reg 14: [mem 0x00000000-0x0001ffff]
pci 0000:4b:00.1: reg 18: [io  0x0000-0x001f]
pci 0000:4b:00.1: reg 1c: [mem 0x00000000-0x00003fff]
pci 0000:4b:00.1: reg 30: [mem 0x00000000-0x0001ffff pref]
pci 0000:4b:00.1: PME# supported from D0 D3hot D3cold
pcieport 0000:47:09.0: bridge window [mem 0x00100000-0x001fffff pref] to [bus 4b] add_size 200000
pcieport 0000:47:09.0: res[9]=[mem 0x00100000-0x001fffff pref] get_res_add_size add_size 200000
pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x300000)
pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x100000)
pci 0000:4b:00.0: BAR 0: assigned [mem 0x80000000-0x8001ffff]
pci 0000:4b:00.0: BAR 1: assigned [mem 0x80020000-0x8003ffff]
pci 0000:4b:00.0: BAR 6: assigned [mem 0x80040000-0x8005ffff pref]
pci 0000:4b:00.1: BAR 0: assigned [mem 0x80060000-0x8007ffff]
pci 0000:4b:00.1: BAR 1: assigned [mem 0x80080000-0x8009ffff]
pci 0000:4b:00.1: BAR 6: assigned [mem 0x800a0000-0x800bffff pref]
pci 0000:4b:00.0: BAR 3: assigned [mem 0x800c0000-0x800c3fff]
pci 0000:4b:00.1: BAR 3: assigned [mem 0x800c4000-0x800c7fff]
pci 0000:4b:00.0: BAR 2: assigned [io  0x9000-0x901f]
pci 0000:4b:00.1: BAR 2: assigned [io  0x9020-0x903f]
pcieport 0000:47:09.0: PCI bridge to [bus 4b]
pcieport 0000:47:09.0:   bridge window [io  0x9000-0x9fff]
pcieport 0000:47:09.0:   bridge window [mem 0x80000000-0x800fffff]
PCI: No. 2 try to assign unassigned res
pcieport 0000:47:09.0: bridge window [mem 0x00100000-0x001fffff 64bit pref] to [bus 4b] add_size 200000
pcieport 0000:47:09.0: res[9]=[mem 0x00100000-0x001fffff 64bit pref] get_res_add_size add_size 200000
pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x300000)
pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x100000)
pcieport 0000:47:09.0: PCI bridge to [bus 4b]
pcieport 0000:47:09.0:   bridge window [io  0x9000-0x9fff]
pcieport 0000:47:09.0:   bridge window [mem 0x80000000-0x800fffff]
pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024.
Please use the pci=pcie_bus_safe boot parameter for better performance.
pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024.
Please use the pci=pcie_bus_safe boot parameter for better performance.

I think above log has some confusion.
pcieport 0000:47:09.0 device mps current is 256 and mpss is 1024;
the newly hot added igb device 0000:4b:00.0 and 0000:4b:00.1 mps are 128 and mpss is 512;
1、"pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024."  should ".......instead of 256(bridge mps)"?
2、“use the pci=pcie_bus_safe boot parameter for better performance”  should "....use pci=pcie_bus_safe for safe"?
3、above logs is  duplicate.

igb 0000:4b:00.0: enabling device (0100 -> 0102)
igb 0000:4b:00.0: Intel(R) Gigabit Ethernet Network Connection
igb 0000:4b:00.0: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:ff
igb 0000:4b:00.0: eth4: PBA No: FFFFFF-0FF
igb 0000:4b:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
igb 0000:4b:00.1: enabling device (0100 -> 0102)
GSI 63 (level, low) -> CPU 27 (0x3300) vector 137
igb 0000:4b:00.1: Intel(R) Gigabit Ethernet Network Connection
igb 0000:4b:00.1: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:fe
igb 0000:4b:00.1: eth4: PBA No: FFFFFF-0FF
igb 0000:4b:00.1: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)


Thanks!
Yijing


> +		return;
>  	}
>  
>  	pcie_bus_configure_set(bus->self, &smpss);
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 5155317..e4eede0 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
>  	int err;
>  	u16 rcc;
>  
> -	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
> +	if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
> +	    pcie_bus_config == PCIE_BUS_WARN)
>  		return;
>  
>  	/* Intel errata specifies bits to change but does not say what they are.
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 5faa831..410eaf9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -662,6 +662,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);
>  
>  enum pcie_bus_config_types {
>  	PCIE_BUS_TUNE_OFF,
> +	PCIE_BUS_WARN,
>  	PCIE_BUS_SAFE,
>  	PCIE_BUS_PERFORMANCE,
>  	PCIE_BUS_PEER2PEER,
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] PCI: Add new default PCI-E MPS bus state
  2012-10-15  2:32   ` Yijing Wang
@ 2012-10-23 22:57     ` Jon Mason
  2012-10-25  9:02       ` Yijing Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Jon Mason @ 2012-10-23 22:57 UTC (permalink / raw)
  To: Yijing Wang; +Cc: Bjorn Helgaas, linux-pci

On Sun, Oct 14, 2012 at 7:32 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> On 2012/10/12 13:35, Jon Mason wrote:
>> Add a new default state for the PCI-E MPS determination, PCIE_BUS_WARN.
>> This state notifies the user that a suboptimal configuration is
>> occurring (most likely due to incorrect configuration in the BIOS), and
>> provides them the boot parameter to enable this error to be corrected.
>> This provides the users an ability to detect an issue with MPS, without
>> forcing them to correct the issue (which may cause system hangs).  This
>> is a much more sane default for the distros.
>>
>> Also, add debug output to show the default device MPS and MPSS.
>>
>> Signed-off-by: Jon Mason <jdmason@kudzu.us>
>> CC: Yijing Wang <wangyijing@huawei.com>
>> ---
>>  drivers/pci/pci.c    |    2 +-
>>  drivers/pci/probe.c  |   10 ++++++++++
>>  drivers/pci/quirks.c |    3 ++-
>>  include/linux/pci.h  |    1 +
>>  4 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index ab4bf5a..1723c81 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -78,7 +78,7 @@ unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
>>  unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
>>  unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
>>
>> -enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_TUNE_OFF;
>> +enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_WARN;
>>
>>  /*
>>   * The default CLS is used if arch didn't set CLS explicitly and not
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index 5a18652..64bb393 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -1448,6 +1448,9 @@ static int pcie_find_smpss(struct pci_dev *dev, void *data)
>>       if (!pci_is_pcie(dev))
>>               return 0;
>>
>> +     dev_dbg(&dev->dev, "Device MPS %d and MPSS %d\n",
>> +             pcie_get_mps(dev), 128 << dev->pcie_mpss);
>> +
>
> What about dev_prinkt(KERN_DEBUG, .....), so users can get this information from dmesg directly?
>
>>       /* For PCIE hotplug enabled slots not connected directly to a
>>        * PCI-E root port, there can be problems when hotplugging
>>        * devices.  This is due to the possibility of hotplugging a
>> @@ -1588,6 +1591,7 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>>               smpss = 0;
>>               break;
>>
>> +     case PCIE_BUS_WARN:
>>       case PCIE_BUS_SAFE:
>>               smpss = mpss;
>>
>> @@ -1600,6 +1604,12 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>>               return;
>>       }
>>
>> +     if (pcie_bus_config == PCIE_BUS_WARN) {
>> +             if (smpss != mpss)
>> +                     dev_info(&bus->dev, "Non-optimal PCI-E Bus MPS value of %d being used instead of %d.\n"
>> +                              "Please use the pci=pcie_bus_safe boot parameter for better performance.\n",
>> +                              128 << smpss, 128 << mpss);
>
> test log in my ia64 machine(support pciehp)
>
>  |           +-07.0-[0000:46-4f]----00.0-[0000:47-4f]--+-04.0-[0000:48-49]----00.0-[0000:49]--
>  |           |                                         +-08.0-[0000:4a]--
>  |           |                                         +-09.0-[0000:4b]--+-00.0  Intel Corporation 82576 Gigabit Network Connection
>  |           |                                         |                 \-00.1  Intel Corporation 82576 Gigabit Network Connection
>
>
> hot add log:
> pci 0000:4b:00.0: [8086:10c9] type 00 class 0x020000
> pci 0000:4b:00.0: reg 10: [mem 0x00000000-0x0001ffff]
> pci 0000:4b:00.0: reg 14: [mem 0x00000000-0x0001ffff]
> pci 0000:4b:00.0: reg 18: [io  0x0000-0x001f]
> pci 0000:4b:00.0: reg 1c: [mem 0x00000000-0x00003fff]
> pci 0000:4b:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
> pci 0000:4b:00.0: PME# supported from D0 D3hot D3cold
> pci 0000:4b:00.1: [8086:10c9] type 00 class 0x020000
> pci 0000:4b:00.1: reg 10: [mem 0x00000000-0x0001ffff]
> pci 0000:4b:00.1: reg 14: [mem 0x00000000-0x0001ffff]
> pci 0000:4b:00.1: reg 18: [io  0x0000-0x001f]
> pci 0000:4b:00.1: reg 1c: [mem 0x00000000-0x00003fff]
> pci 0000:4b:00.1: reg 30: [mem 0x00000000-0x0001ffff pref]
> pci 0000:4b:00.1: PME# supported from D0 D3hot D3cold
> pcieport 0000:47:09.0: bridge window [mem 0x00100000-0x001fffff pref] to [bus 4b] add_size 200000
> pcieport 0000:47:09.0: res[9]=[mem 0x00100000-0x001fffff pref] get_res_add_size add_size 200000
> pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x300000)
> pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x100000)
> pci 0000:4b:00.0: BAR 0: assigned [mem 0x80000000-0x8001ffff]
> pci 0000:4b:00.0: BAR 1: assigned [mem 0x80020000-0x8003ffff]
> pci 0000:4b:00.0: BAR 6: assigned [mem 0x80040000-0x8005ffff pref]
> pci 0000:4b:00.1: BAR 0: assigned [mem 0x80060000-0x8007ffff]
> pci 0000:4b:00.1: BAR 1: assigned [mem 0x80080000-0x8009ffff]
> pci 0000:4b:00.1: BAR 6: assigned [mem 0x800a0000-0x800bffff pref]
> pci 0000:4b:00.0: BAR 3: assigned [mem 0x800c0000-0x800c3fff]
> pci 0000:4b:00.1: BAR 3: assigned [mem 0x800c4000-0x800c7fff]
> pci 0000:4b:00.0: BAR 2: assigned [io  0x9000-0x901f]
> pci 0000:4b:00.1: BAR 2: assigned [io  0x9020-0x903f]
> pcieport 0000:47:09.0: PCI bridge to [bus 4b]
> pcieport 0000:47:09.0:   bridge window [io  0x9000-0x9fff]
> pcieport 0000:47:09.0:   bridge window [mem 0x80000000-0x800fffff]
> PCI: No. 2 try to assign unassigned res
> pcieport 0000:47:09.0: bridge window [mem 0x00100000-0x001fffff 64bit pref] to [bus 4b] add_size 200000
> pcieport 0000:47:09.0: res[9]=[mem 0x00100000-0x001fffff 64bit pref] get_res_add_size add_size 200000
> pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x300000)
> pcieport 0000:47:09.0: BAR 9: can't assign mem pref (size 0x100000)
> pcieport 0000:47:09.0: PCI bridge to [bus 4b]
> pcieport 0000:47:09.0:   bridge window [io  0x9000-0x9fff]
> pcieport 0000:47:09.0:   bridge window [mem 0x80000000-0x800fffff]
> pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024.
> Please use the pci=pcie_bus_safe boot parameter for better performance.
> pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024.
> Please use the pci=pcie_bus_safe boot parameter for better performance.
>
> I think above log has some confusion.
> pcieport 0000:47:09.0 device mps current is 256 and mpss is 1024;
> the newly hot added igb device 0000:4b:00.0 and 0000:4b:00.1 mps are 128 and mpss is 512;
> 1、"pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024."  should ".......instead of 256(bridge mps)"?

The print out shows a bug in the code (which I will push in a second
version of the patch shortly), but having 128 instead of 256 is a
separate issue.  That is an existing limitation due to the hotplug
slot not being connected to the root port.  See the comment on line
1454.  Since PCIE_BUS_SAFE/PERFORMANCE is not being used, it is not
ratcheting down the MPS on the bus like it should (per the comment).

> 2、“use the pci=pcie_bus_safe boot parameter for better performance”  should "....use pci=pcie_bus_safe for safe"?

The reason for the user to add the boot parameter is to get better
performance than they would by default.  For theoretically best
performance, they would want to use "pcie_bus_performance".  With this
in mind, I believe "better" is the correct language.

> 3、above logs is  duplicate.

The duplicate log would only be caused by pcie_bus_configure_settings
being called twice.  Is this only being seen on hotplug devices or on
every device?

>
> igb 0000:4b:00.0: enabling device (0100 -> 0102)
> igb 0000:4b:00.0: Intel(R) Gigabit Ethernet Network Connection
> igb 0000:4b:00.0: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:ff
> igb 0000:4b:00.0: eth4: PBA No: FFFFFF-0FF
> igb 0000:4b:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
> igb 0000:4b:00.1: enabling device (0100 -> 0102)
> GSI 63 (level, low) -> CPU 27 (0x3300) vector 137
> igb 0000:4b:00.1: Intel(R) Gigabit Ethernet Network Connection
> igb 0000:4b:00.1: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:fe
> igb 0000:4b:00.1: eth4: PBA No: FFFFFF-0FF
> igb 0000:4b:00.1: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
>
>
> Thanks!
> Yijing
>
>
>> +             return;
>>       }
>>
>>       pcie_bus_configure_set(bus->self, &smpss);
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 5155317..e4eede0 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
>>       int err;
>>       u16 rcc;
>>
>> -     if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
>> +     if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
>> +         pcie_bus_config == PCIE_BUS_WARN)
>>               return;
>>
>>       /* Intel errata specifies bits to change but does not say what they are.
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 5faa831..410eaf9 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -662,6 +662,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);
>>
>>  enum pcie_bus_config_types {
>>       PCIE_BUS_TUNE_OFF,
>> +     PCIE_BUS_WARN,
>>       PCIE_BUS_SAFE,
>>       PCIE_BUS_PERFORMANCE,
>>       PCIE_BUS_PEER2PEER,
>>
>
>
> --
> Thanks!
> Yijing
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] PCI: correct static code analysis tool issue
  2012-10-12  5:35 ` [PATCH 1/3] PCI: correct static code analysis tool issue Jon Mason
@ 2012-10-25  7:29   ` Yijing Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Yijing Wang @ 2012-10-25  7:29 UTC (permalink / raw)
  To: Jon Mason; +Cc: Bjorn Helgaas, linux-pci

On 2012/10/12 13:35, Jon Mason wrote:
> Coverity noticed that pcie_bus_configure_settings() can call
> pcie_bus_configure_set() when "smpss" is uninitialized.  The smpss is
> used to make the bus and all child devices have a uniform value.
> Setting the smpss is moot for PCIE_BUS_PERFORMANCE, since it will be
> determined dynamically based on the max size of the parent device and
> the child device.  To avoid the erroneous detection of this issue, set
> the smpss to 0.  Also, add a case statement to make clearer the possible
> use cases and add a nice comment describing what is being done in the
> PCIE_BUS_PERFORMANCE case.
> 
> Signed-off-by: Jon Mason <jdmason@kudzu.us>
> ---
>  drivers/pci/probe.c |   25 ++++++++++++++++++++-----
>  1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 9f8a6b7..6b672c1 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1570,21 +1570,36 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>  	if (!pci_is_pcie(bus->self))
>  		return;
>  
> -	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
> -		return;
> -
> +	switch (pcie_bus_config) {
> +	/* The smpss is used to make the bus and all child devices have
> +	 * a uniform value.  Setting the smpss is moot for
> +	 * "Performance", since it will be determined dynamically based
> +	 * on the max size of the parent device and the child device.
> +	 * However, not setting the smpss can result in static code
> +	 * analysis tools erroniously reporting an issue.  To avoid
> +	 * this, set the smpss to 0.
> +	 */
> +	case PCIE_BUS_PERFORMANCE:
>  	/* FIXME - Peer to peer DMA is possible, though the endpoint would need
>  	 * to be aware to the MPS of the destination.  To work around this,
>  	 * simply force the MPS of the entire system to the smallest possible.
>  	 */
> -	if (pcie_bus_config == PCIE_BUS_PEER2PEER)
> +	case PCIE_BUS_PEER2PEER:
>  		smpss = 0;
> +		break;
>  
> -	if (pcie_bus_config == PCIE_BUS_SAFE) {
> +	case PCIE_BUS_SAFE:
>  		smpss = mpss;
>  
>  		pcie_find_smpss(bus->self, &smpss);
>  		pci_walk_bus(bus, pcie_find_smpss, &smpss);
> +		break;
> +
> +	case PCIE_BUS_TUNE_OFF:
> +	default:
> +		return;
> +	}
> +
>  	}

Remove bracket }

>  
>  	pcie_bus_configure_set(bus->self, &smpss);
> 


-- 
Thanks!
Yijing


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] PCI: Add new default PCI-E MPS bus state
  2012-10-23 22:57     ` Jon Mason
@ 2012-10-25  9:02       ` Yijing Wang
  2013-01-08  0:19         ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Yijing Wang @ 2012-10-25  9:02 UTC (permalink / raw)
  To: Jon Mason; +Cc: Bjorn Helgaas, linux-pci

Hi Jon,
   I think we can do a little more about inconsistent mps problem found in boot path(BIOS configure bug for mps)
or after hotplug device.As shown in your comment on line 1451, it's unsafe to modifying the MPS on the running devices.
Since bus child devices is not running when pcie_bus_configure_settings be called. So we can try to configure child device's
mps according to the mps of bus.
there are two situations:
1、now current running bus mps is larger than child devices mpss can support;
2、now cuurent running bus mps is smaller than child devices mpss can support;

We cannot modifying the MPS for 1, it's not safe; but we can show message to user to add boot parameter pci=pcie_bus_safe.
The second situation we can modify all child devices mps as the mps value of running bus, it's safe and can correct mps problem automatically
for users.

I wrote a draft patch about this solution, if there is some thing wrong with my understanding, let me know.

Thanks very much!
Yijing


diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5485883..59036a8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -78,7 +78,7 @@ unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
 unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
 unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;

-enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_TUNE_OFF;
+enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_AUTO;

 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e8b7d5e..6cbfbe1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1467,6 +1467,19 @@ static int pcie_find_smpss(struct pci_dev *dev, void *data)
 	return 0;
 }

+static int pcie_find_min_mpss(struct pci_dev *dev, void *data)
+{
+	u8 *mpss = data;
+
+	if (!pci_is_pcie(dev))
+		return 0;
+
+	if (*mpss > dev->pcie_mpss)
+		*mpss = dev->pcie_mpss;
+
+	return 0;
+}
+
 static void pcie_write_mps(struct pci_dev *dev, int mps)
 {
 	int rc;
@@ -1560,6 +1573,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
 void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 {
 	u8 smpss;
+	u8 mps = pcie_get_mps(bus->self) >> 8;

 	if (!pci_is_pcie(bus->self))
 		return;
@@ -1581,7 +1595,19 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 	case PCIE_BUS_PEER2PEER:
 		smpss = 0;
 		break;
-
+	case PCIE_BUS_AUTO:
+		smpss = bus->self->pcie_mpss;
+		pci_walk_bus(bus, pcie_find_min_mpss, &smpss);
+		if (mps > smpss) {
+			dev_info(&bus->dev,
+				"Current mps %d used in bus 0x%02x is larger than children devices mpss %d support\n"
+				"Please use the pci=pcie_bus_safe boot parameter for safe\n",
+				128 << mpss, bus->number, 128 << smpss);
+			return;
+		}
+		else
+			pci_walk_bus(bus, pcie_bus_configure_set, &mps);
+		return;
 	case PCIE_BUS_SAFE:
 		smpss = mpss;

@@ -1594,8 +1620,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
 		return;
 	}

-	}
-
 	pcie_bus_configure_set(bus->self, &smpss);
 	pci_walk_bus(bus, pcie_bus_configure_set, &smpss);
 }
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 7a451ff..539b7e4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
 	int err;
 	u16 rcc;

-	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
+	if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
+		pcie_bus_config == PCIE_AUTO_AUTO)
 		return;

 	/* Intel errata specifies bits to change but does not say what they are.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index be1de01..84c4ab1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -661,6 +661,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);

 enum pcie_bus_config_types {
 	PCIE_BUS_TUNE_OFF,
+	PCIE_BUS_AUTO,
 	PCIE_BUS_SAFE,
 	PCIE_BUS_PERFORMANCE,
 	PCIE_BUS_PEER2PEER,
-- 
1.7.1


On 2012/10/24 6:57, Jon Mason wrote:
>> pcieport 0000:47:09.0 device mps current is 256 and mpss is 1024;
>> the newly hot added igb device 0000:4b:00.0 and 0000:4b:00.1 mps are 128 and mpss is 512;
>> 1、"pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024."  should ".......instead of 256(bridge mps)"?
> 
> The print out shows a bug in the code (which I will push in a second
> version of the patch shortly), but having 128 instead of 256 is a
> separate issue.  That is an existing limitation due to the hotplug
> slot not being connected to the root port.  See the comment on line
> 1454.  Since PCIE_BUS_SAFE/PERFORMANCE is not being used, it is not
> ratcheting down the MPS on the bus like it should (per the comment).
> 
>> 2、“use the pci=pcie_bus_safe boot parameter for better performance”  should "....use pci=pcie_bus_safe for safe"?
> 
> The reason for the user to add the boot parameter is to get better
> performance than they would by default.  For theoretically best
> performance, they would want to use "pcie_bus_performance".  With this
> in mind, I believe "better" is the correct language.
> 
>> 3、above logs is  duplicate.
> 
> The duplicate log would only be caused by pcie_bus_configure_settings
> being called twice.  Is this only being seen on hotplug devices or on
> every device?
> 
>>
>> igb 0000:4b:00.0: enabling device (0100 -> 0102)
>> igb 0000:4b:00.0: Intel(R) Gigabit Ethernet Network Connection
>> igb 0000:4b:00.0: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:ff
>> igb 0000:4b:00.0: eth4: PBA No: FFFFFF-0FF
>> igb 0000:4b:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
>> igb 0000:4b:00.1: enabling device (0100 -> 0102)
>> GSI 63 (level, low) -> CPU 27 (0x3300) vector 137
>> igb 0000:4b:00.1: Intel(R) Gigabit Ethernet Network Connection
>> igb 0000:4b:00.1: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:fe
>> igb 0000:4b:00.1: eth4: PBA No: FFFFFF-0FF
>> igb 0000:4b:00.1: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
>>
>>
>> Thanks!
>> Yijing
>>
>>
>>> +             return;
>>>       }
>>>
>>>       pcie_bus_configure_set(bus->self, &smpss);
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 5155317..e4eede0 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
>>>       int err;
>>>       u16 rcc;
>>>
>>> -     if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
>>> +     if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
>>> +         pcie_bus_config == PCIE_BUS_WARN)
>>>               return;
>>>
>>>       /* Intel errata specifies bits to change but does not say what they are.
>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>> index 5faa831..410eaf9 100644
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -662,6 +662,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);
>>>
>>>  enum pcie_bus_config_types {
>>>       PCIE_BUS_TUNE_OFF,
>>> +     PCIE_BUS_WARN,
>>>       PCIE_BUS_SAFE,
>>>       PCIE_BUS_PERFORMANCE,
>>>       PCIE_BUS_PEER2PEER,
>>>
>>
>>
>> --
>> Thanks!
>> Yijing
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> .
> 


-- 
Thanks!
Yijing


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] PCI: Add new default PCI-E MPS bus state
  2012-10-25  9:02       ` Yijing Wang
@ 2013-01-08  0:19         ` Bjorn Helgaas
  0 siblings, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2013-01-08  0:19 UTC (permalink / raw)
  To: Jon Mason; +Cc: linux-pci, Yijing Wang

Jon,

Sorry for neglecting this until it's now ancient history, but you
mentioned earlier that "The print out shows a bug in the code (which I
will push in a second version of the patch shortly)."

I don't see a second version; did I miss it?  If we still need a fix
here, can you re-post the three patches in this series?

Bjorn

On Thu, Oct 25, 2012 at 3:02 AM, Yijing Wang <wangyijing@huawei.com> wrote:
> Hi Jon,
>    I think we can do a little more about inconsistent mps problem found in boot path(BIOS configure bug for mps)
> or after hotplug device.As shown in your comment on line 1451, it's unsafe to modifying the MPS on the running devices.
> Since bus child devices is not running when pcie_bus_configure_settings be called. So we can try to configure child device's
> mps according to the mps of bus.
> there are two situations:
> 1、now current running bus mps is larger than child devices mpss can support;
> 2、now cuurent running bus mps is smaller than child devices mpss can support;
>
> We cannot modifying the MPS for 1, it's not safe; but we can show message to user to add boot parameter pci=pcie_bus_safe.
> The second situation we can modify all child devices mps as the mps value of running bus, it's safe and can correct mps problem automatically
> for users.
>
> I wrote a draft patch about this solution, if there is some thing wrong with my understanding, let me know.
>
> Thanks very much!
> Yijing
>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 5485883..59036a8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -78,7 +78,7 @@ unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
>  unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
>  unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
>
> -enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_TUNE_OFF;
> +enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_AUTO;
>
>  /*
>   * The default CLS is used if arch didn't set CLS explicitly and not
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index e8b7d5e..6cbfbe1 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1467,6 +1467,19 @@ static int pcie_find_smpss(struct pci_dev *dev, void *data)
>         return 0;
>  }
>
> +static int pcie_find_min_mpss(struct pci_dev *dev, void *data)
> +{
> +       u8 *mpss = data;
> +
> +       if (!pci_is_pcie(dev))
> +               return 0;
> +
> +       if (*mpss > dev->pcie_mpss)
> +               *mpss = dev->pcie_mpss;
> +
> +       return 0;
> +}
> +
>  static void pcie_write_mps(struct pci_dev *dev, int mps)
>  {
>         int rc;
> @@ -1560,6 +1573,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
>  void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>  {
>         u8 smpss;
> +       u8 mps = pcie_get_mps(bus->self) >> 8;
>
>         if (!pci_is_pcie(bus->self))
>                 return;
> @@ -1581,7 +1595,19 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>         case PCIE_BUS_PEER2PEER:
>                 smpss = 0;
>                 break;
> -
> +       case PCIE_BUS_AUTO:
> +               smpss = bus->self->pcie_mpss;
> +               pci_walk_bus(bus, pcie_find_min_mpss, &smpss);
> +               if (mps > smpss) {
> +                       dev_info(&bus->dev,
> +                               "Current mps %d used in bus 0x%02x is larger than children devices mpss %d support\n"
> +                               "Please use the pci=pcie_bus_safe boot parameter for safe\n",
> +                               128 << mpss, bus->number, 128 << smpss);
> +                       return;
> +               }
> +               else
> +                       pci_walk_bus(bus, pcie_bus_configure_set, &mps);
> +               return;
>         case PCIE_BUS_SAFE:
>                 smpss = mpss;
>
> @@ -1594,8 +1620,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus, u8 mpss)
>                 return;
>         }
>
> -       }
> -
>         pcie_bus_configure_set(bus->self, &smpss);
>         pci_walk_bus(bus, pcie_bus_configure_set, &smpss);
>  }
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 7a451ff..539b7e4 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
>         int err;
>         u16 rcc;
>
> -       if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
> +       if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
> +               pcie_bus_config == PCIE_AUTO_AUTO)
>                 return;
>
>         /* Intel errata specifies bits to change but does not say what they are.
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index be1de01..84c4ab1 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -661,6 +661,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);
>
>  enum pcie_bus_config_types {
>         PCIE_BUS_TUNE_OFF,
> +       PCIE_BUS_AUTO,
>         PCIE_BUS_SAFE,
>         PCIE_BUS_PERFORMANCE,
>         PCIE_BUS_PEER2PEER,
> --
> 1.7.1
>
>
> On 2012/10/24 6:57, Jon Mason wrote:
>>> pcieport 0000:47:09.0 device mps current is 256 and mpss is 1024;
>>> the newly hot added igb device 0000:4b:00.0 and 0000:4b:00.1 mps are 128 and mpss is 512;
>>> 1、"pci_bus 0000:4b: Non-optimal PCI-E Bus MPS value of 128 being used instead of 1024."  should ".......instead of 256(bridge mps)"?
>>
>> The print out shows a bug in the code (which I will push in a second
>> version of the patch shortly), but having 128 instead of 256 is a
>> separate issue.  That is an existing limitation due to the hotplug
>> slot not being connected to the root port.  See the comment on line
>> 1454.  Since PCIE_BUS_SAFE/PERFORMANCE is not being used, it is not
>> ratcheting down the MPS on the bus like it should (per the comment).
>>
>>> 2、“use the pci=pcie_bus_safe boot parameter for better performance”  should "....use pci=pcie_bus_safe for safe"?
>>
>> The reason for the user to add the boot parameter is to get better
>> performance than they would by default.  For theoretically best
>> performance, they would want to use "pcie_bus_performance".  With this
>> in mind, I believe "better" is the correct language.
>>
>>> 3、above logs is  duplicate.
>>
>> The duplicate log would only be caused by pcie_bus_configure_settings
>> being called twice.  Is this only being seen on hotplug devices or on
>> every device?
>>
>>>
>>> igb 0000:4b:00.0: enabling device (0100 -> 0102)
>>> igb 0000:4b:00.0: Intel(R) Gigabit Ethernet Network Connection
>>> igb 0000:4b:00.0: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:ff
>>> igb 0000:4b:00.0: eth4: PBA No: FFFFFF-0FF
>>> igb 0000:4b:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
>>> igb 0000:4b:00.1: enabling device (0100 -> 0102)
>>> GSI 63 (level, low) -> CPU 27 (0x3300) vector 137
>>> igb 0000:4b:00.1: Intel(R) Gigabit Ethernet Network Connection
>>> igb 0000:4b:00.1: eth4: (PCIe:2.5Gb/s:Width x4) 00:0e:0c:ff:ff:fe
>>> igb 0000:4b:00.1: eth4: PBA No: FFFFFF-0FF
>>> igb 0000:4b:00.1: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
>>>
>>>
>>> Thanks!
>>> Yijing
>>>
>>>
>>>> +             return;
>>>>       }
>>>>
>>>>       pcie_bus_configure_set(bus->self, &smpss);
>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>> index 5155317..e4eede0 100644
>>>> --- a/drivers/pci/quirks.c
>>>> +++ b/drivers/pci/quirks.c
>>>> @@ -2787,7 +2787,8 @@ static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
>>>>       int err;
>>>>       u16 rcc;
>>>>
>>>> -     if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
>>>> +     if (pcie_bus_config == PCIE_BUS_TUNE_OFF ||
>>>> +         pcie_bus_config == PCIE_BUS_WARN)
>>>>               return;
>>>>
>>>>       /* Intel errata specifies bits to change but does not say what they are.
>>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>>> index 5faa831..410eaf9 100644
>>>> --- a/include/linux/pci.h
>>>> +++ b/include/linux/pci.h
>>>> @@ -662,6 +662,7 @@ extern void pcie_bus_configure_settings(struct pci_bus *bus, u8 smpss);
>>>>
>>>>  enum pcie_bus_config_types {
>>>>       PCIE_BUS_TUNE_OFF,
>>>> +     PCIE_BUS_WARN,
>>>>       PCIE_BUS_SAFE,
>>>>       PCIE_BUS_PERFORMANCE,
>>>>       PCIE_BUS_PEER2PEER,
>>>>
>>>
>>>
>>> --
>>> Thanks!
>>> Yijing
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> .
>>
>
>
> --
> Thanks!
> Yijing
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-01-08  0:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-12  5:35 [PATCH 0/3] PCI: MPS patches Jon Mason
2012-10-12  5:35 ` [PATCH 1/3] PCI: correct static code analysis tool issue Jon Mason
2012-10-25  7:29   ` Yijing Wang
2012-10-12  5:35 ` [PATCH 2/3] PCI: Fix comment syntax Jon Mason
2012-10-12  5:35 ` [PATCH 3/3] PCI: Add new default PCI-E MPS bus state Jon Mason
2012-10-15  2:32   ` Yijing Wang
2012-10-23 22:57     ` Jon Mason
2012-10-25  9:02       ` Yijing Wang
2013-01-08  0:19         ` Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.