linux-pci.vger.kernel.org archive mirror
* [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
@ 2018-07-18 18:51 Myron Stowe
  2018-07-24 15:47 ` Jon Mason
  2018-08-01 14:05 ` Bjorn Helgaas
  0 siblings, 2 replies; 7+ messages in thread
From: Myron Stowe @ 2018-07-18 18:51 UTC (permalink / raw)
  To: bhelgaas, linux-pci; +Cc: keith.busch, jdmason

In commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge"), we made
sure every device's MPS setting matches its upstream bridge, making it more
likely that a hot-added device will work in a system with an optimized MPS
configuration.

Recently I've started encountering systems where the endpoint device's MPSS
capability is less than its root port's current MPS value, thus the
endpoint is not capable of matching its upstream bridge's MPS setting (see:
bugzilla via "Link:" below).  This leaves the system vulnerable - the
upstream root port could respond with larger sized TLPs than the endpoint
can handle, and the endpoint will consider them to be 'Malformed'.

One could use the "pci=pcie_bus_safe" kernel parameter to resolve the
issue, but it both forces the user to supply a kernel parameter to get
the system to function reliably, and may end up limiting MPS settings
of other, unrelated sub-topologies which could benefit from maintaining
their larger values.

This patch augments Keith's approach to include tuning down a root port's
MPS setting when its hot-added endpoint device is not capable of matching
it.  The tuning down, so that both the root port and endpoint match, is
limited to root ports with downstream endpoint device sub-topologies.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
---
 drivers/pci/probe.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac91b6f..2987bd9 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1670,7 +1670,7 @@ int pci_setup_device(struct pci_dev *dev)
 static void pci_configure_mps(struct pci_dev *dev)
 {
 	struct pci_dev *bridge = pci_upstream_bridge(dev);
-	int mps, p_mps, rc;
+	int mps, mpss, p_mps, rc;
 
 	if (!pci_is_pcie(dev) || !bridge || !pci_is_pcie(bridge))
 		return;
@@ -1694,6 +1694,14 @@ static void pci_configure_mps(struct pci_dev *dev)
 	if (pcie_bus_config != PCIE_BUS_DEFAULT)
 		return;
 
+	mpss = 128 << dev->pcie_mpss;
+	if (mpss < p_mps && pci_pcie_type(bridge) == PCI_EXP_TYPE_ROOT_PORT) {
+		pcie_set_mps(bridge, mpss);
+		pci_info(dev, "Upstream bridge's Max Payload Size set to %d (was %d, max %d)\n",
+			 mpss, p_mps, 128 << bridge->pcie_mpss);
+		p_mps = pcie_get_mps(bridge);
+	}
+
 	rc = pcie_set_mps(dev, p_mps);
 	if (rc) {
 		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
@@ -1702,7 +1710,7 @@ static void pci_configure_mps(struct pci_dev *dev)
 	}
 
 	pci_info(dev, "Max Payload Size set to %d (was %d, max %d)\n",
-		 p_mps, mps, 128 << dev->pcie_mpss);
+		 p_mps, mps, mpss);
 }
 
 static struct hpp_type0 pci_default_type0 = {


* Re: [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
  2018-07-18 18:51 [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary Myron Stowe
@ 2018-07-24 15:47 ` Jon Mason
  2018-08-01 14:05 ` Bjorn Helgaas
  1 sibling, 0 replies; 7+ messages in thread
From: Jon Mason @ 2018-07-24 15:47 UTC (permalink / raw)
  To: Myron Stowe; +Cc: bhelgaas, linux-pci, keith.busch

On Wed, Jul 18, 2018 at 12:51:58PM -0600, Myron Stowe wrote:
> In commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge"), we made
> sure every device's MPS setting matches its upstream bridge, making it more
> likely that a hot-added device will work in a system with an optimized MPS
> configuration.
> 
> Recently I've started encountering systems where the endpoint device's MPSS
> capability is less than its root port's current MPS value, thus the
> endpoint is not capable of matching its upstream bridge's MPS setting (see:
> bugzilla via "Link:" below).  This leaves the system vulnerable - the
> upstream root port could respond with larger sized TLPs than the endpoint
> can handle, and the endpoint will consider them to be 'Malformed'.
>
> One could use the "pci=pcie_bus_safe" kernel parameter to resolve the
> issue, but it both forces the user to supply a kernel parameter to get
> the system to function reliably, and may end up limiting MPS settings
> of other, unrelated sub-topologies which could benefit from maintaining
> their larger values.
> 
> This patch augments Keith's approach to include tuning down a root port's
> MPS setting when its hot-added endpoint device is not capable of matching
> it.  The tuning down, so that both the root port and endpoint match, is
> limited to root ports with downstream endpoint device sub-topologies.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Jon Mason <jdmason@kudzu.us>

Looks good to me
Acked-by: Jon Mason <jdmason@kudzu.us>

> Cc: Sinan Kaya <okaya@kernel.org>
> Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
> ---
>  drivers/pci/probe.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index ac91b6f..2987bd9 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1670,7 +1670,7 @@ int pci_setup_device(struct pci_dev *dev)
>  static void pci_configure_mps(struct pci_dev *dev)
>  {
>  	struct pci_dev *bridge = pci_upstream_bridge(dev);
> -	int mps, p_mps, rc;
> +	int mps, mpss, p_mps, rc;
>  
>  	if (!pci_is_pcie(dev) || !bridge || !pci_is_pcie(bridge))
>  		return;
> @@ -1694,6 +1694,14 @@ static void pci_configure_mps(struct pci_dev *dev)
>  	if (pcie_bus_config != PCIE_BUS_DEFAULT)
>  		return;
>  
> +	mpss = 128 << dev->pcie_mpss;
> +	if (mpss < p_mps && pci_pcie_type(bridge) == PCI_EXP_TYPE_ROOT_PORT) {
> +		pcie_set_mps(bridge, mpss);
> +		pci_info(dev, "Upstream bridge's Max Payload Size set to %d (was %d, max %d)\n",
> +			 mpss, p_mps, 128 << bridge->pcie_mpss);
> +		p_mps = pcie_get_mps(bridge);
> +	}
> +
>  	rc = pcie_set_mps(dev, p_mps);
>  	if (rc) {
>  		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> @@ -1702,7 +1710,7 @@ static void pci_configure_mps(struct pci_dev *dev)
>  	}
>  
>  	pci_info(dev, "Max Payload Size set to %d (was %d, max %d)\n",
> -		 p_mps, mps, 128 << dev->pcie_mpss);
> +		 p_mps, mps, mpss);
>  }
>  
>  static struct hpp_type0 pci_default_type0 = {
> 


* Re: [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
  2018-07-18 18:51 [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary Myron Stowe
  2018-07-24 15:47 ` Jon Mason
@ 2018-08-01 14:05 ` Bjorn Helgaas
  2018-08-10 10:04   ` Dongdong Liu
  1 sibling, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2018-08-01 14:05 UTC (permalink / raw)
  To: Myron Stowe; +Cc: bhelgaas, linux-pci, keith.busch, jdmason

On Wed, Jul 18, 2018 at 12:51:58PM -0600, Myron Stowe wrote:
> In commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge"), we made
> sure every device's MPS setting matches its upstream bridge, making it more
> likely that a hot-added device will work in a system with an optimized MPS
> configuration.
> 
> Recently I've started encountering systems where the endpoint device's MPSS
> capability is less than its root port's current MPS value, thus the
> endpoint is not capable of matching its upstream bridge's MPS setting (see:
> bugzilla via "Link:" below).  This leaves the system vulnerable - the
> upstream root port could respond with larger sized TLPs than the endpoint
> can handle, and the endpoint will consider them to be 'Malformed'.
>
> One could use the "pci=pcie_bus_safe" kernel parameter to resolve the
> issue, but it both forces the user to supply a kernel parameter to get
> the system to function reliably, and may end up limiting MPS settings
> of other, unrelated sub-topologies which could benefit from maintaining
> their larger values.
> 
> This patch augments Keith's approach to include tuning down a root port's
> MPS setting when its hot-added endpoint device is not capable of matching
> it.  The tuning down, so that both the root port and endpoint match, is
> limited to root ports with downstream endpoint device sub-topologies.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Jon Mason <jdmason@kudzu.us>
> Cc: Sinan Kaya <okaya@kernel.org>
> Signed-off-by: Myron Stowe <myron.stowe@redhat.com>

Applied to pci/enumeration for v4.19, thanks!

> ---
>  drivers/pci/probe.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index ac91b6f..2987bd9 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1670,7 +1670,7 @@ int pci_setup_device(struct pci_dev *dev)
>  static void pci_configure_mps(struct pci_dev *dev)
>  {
>  	struct pci_dev *bridge = pci_upstream_bridge(dev);
> -	int mps, p_mps, rc;
> +	int mps, mpss, p_mps, rc;
>  
>  	if (!pci_is_pcie(dev) || !bridge || !pci_is_pcie(bridge))
>  		return;
> @@ -1694,6 +1694,14 @@ static void pci_configure_mps(struct pci_dev *dev)
>  	if (pcie_bus_config != PCIE_BUS_DEFAULT)
>  		return;
>  
> +	mpss = 128 << dev->pcie_mpss;
> +	if (mpss < p_mps && pci_pcie_type(bridge) == PCI_EXP_TYPE_ROOT_PORT) {
> +		pcie_set_mps(bridge, mpss);
> +		pci_info(dev, "Upstream bridge's Max Payload Size set to %d (was %d, max %d)\n",
> +			 mpss, p_mps, 128 << bridge->pcie_mpss);
> +		p_mps = pcie_get_mps(bridge);
> +	}
> +
>  	rc = pcie_set_mps(dev, p_mps);
>  	if (rc) {
>  		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> @@ -1702,7 +1710,7 @@ static void pci_configure_mps(struct pci_dev *dev)
>  	}
>  
>  	pci_info(dev, "Max Payload Size set to %d (was %d, max %d)\n",
> -		 p_mps, mps, 128 << dev->pcie_mpss);
> +		 p_mps, mps, mpss);
>  }
>  
>  static struct hpp_type0 pci_default_type0 = {
> 


* Re: [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
  2018-08-01 14:05 ` Bjorn Helgaas
@ 2018-08-10 10:04   ` Dongdong Liu
  2018-08-10 17:28     ` Bjorn Helgaas
  2018-08-10 21:33     ` Myron Stowe
  0 siblings, 2 replies; 7+ messages in thread
From: Dongdong Liu @ 2018-08-10 10:04 UTC (permalink / raw)
  To: Bjorn Helgaas, Myron Stowe; +Cc: bhelgaas, linux-pci, keith.busch, jdmason

Hi Bjorn, Myron,

I found a bug after applying the patch.

The topology is shown below. An 82599 network card with two functions is
connected to the RP.
  +-[0000:80]-+-00.0-[81]--+-00.0  Device 8086:10fb
  |           |            \-00.1  Device 8086:10fb

1. Use "lspci -s BDF -vvv" to get each device's MPSS, MPS and MRRS values.
RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
   PF1(81:00.1): MPSS=512 MPS=512 MRRS=512

2. Enable SR-IOV.
echo 1 > /sys/devices/pci0000\:80/0000\:80\:00.0/0000\:81\:00.0/sriov_numvfs
RP(80:00.0):     MPSS=512 MPS=128 MRRS=512
                              ^^^
EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
                              ^^^
   PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
                              ^^^
   VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
                              ^^^
The 82599's PFs (MPSS 512) and VF (MPSS 128) report different MPSS values,
so the RP's MPS is lowered to 128 when the VF is added. The RP (MPS 128)
then reports a Malformed TLP when PF0/PF1 issues a memory write with a
512-byte payload.

The 82599 network card works fine without the patch.
Without the patch, the MPSS, MPS and MRRS values are as follows:

RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
                              ^^^
EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
                              ^^^
   PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
                              ^^^
   VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
                              ^^^
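
The sequence reported above can be reproduced in a small user-space model.
This is only a sketch under stated assumptions: model_dev,
model_configure_mps() and model_is_malformed() are hypothetical names, not
kernel interfaces. The model keeps just the comparison that the patch adds,
plus the rule that a receiver rejects a TLP whose payload exceeds its own MPS.

```c
/* Hypothetical stand-in for the few per-device values that matter here. */
struct model_dev {
	int mpss;	/* Max_Payload_Size Supported, in bytes */
	int mps;	/* current Max_Payload_Size, in bytes */
};

/*
 * Models the branch added by the patch: if the newly added device cannot
 * match the Root Port's MPS, lower the Root Port's MPS to the device's MPSS,
 * then make the device match the Root Port.
 */
static void model_configure_mps(struct model_dev *dev,
				struct model_dev *root_port)
{
	if (dev->mpss < root_port->mps)
		root_port->mps = dev->mpss;
	dev->mps = root_port->mps;
}

/* A receiver treats a TLP whose payload exceeds its MPS as Malformed. */
static int model_is_malformed(const struct model_dev *receiver,
			      int payload_bytes)
{
	return payload_bytes > receiver->mps;
}
```

Feeding the model the values from step 2 (RP 512/512, PF0 512/512, VF0
128/128) and configuring VF0 leaves the RP at MPS 128 while PF0 still sends
512-byte payloads, which is the mismatch reported above.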

Thanks,
Dongdong



* Re: [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
  2018-08-10 10:04   ` Dongdong Liu
@ 2018-08-10 17:28     ` Bjorn Helgaas
  2018-08-10 21:33     ` Myron Stowe
  1 sibling, 0 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2018-08-10 17:28 UTC (permalink / raw)
  To: Dongdong Liu; +Cc: Myron Stowe, bhelgaas, linux-pci, keith.busch, jdmason

On Fri, Aug 10, 2018 at 06:04:39PM +0800, Dongdong Liu wrote:
> Hi Bjorn, Myron,
>
> I found a bug after applying the patch.
>
> The topology is shown below. An 82599 network card with two functions is
> connected to the RP.
>   +-[0000:80]-+-00.0-[81]--+-00.0  Device 8086:10fb
>   |           |            \-00.1  Device 8086:10fb
>
> 1. Use "lspci -s BDF -vvv" to get each device's MPSS, MPS and MRRS values.
> RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>
> 2. Enable SR-IOV.
> echo 1 > /sys/devices/pci0000\:80/0000\:80\:00.0/0000\:81\:00.0/sriov_numvfs
> RP(80:00.0):     MPSS=512 MPS=128 MRRS=512
>                               ^^^
> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
>                               ^^^
> The 82599's PFs (MPSS 512) and VF (MPSS 128) report different MPSS values,
> so the RP's MPS is lowered to 128 when the VF is added. The RP (MPS 128)
> then reports a Malformed TLP when PF0/PF1 issues a memory write with a
> 512-byte payload.
>
> The 82599 network card works fine without the patch.
> Without the patch, the MPSS, MPS and MRRS values are as follows:
>
> RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
>                               ^^^
> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
>                               ^^^

OK, thanks a lot for testing this out.

I'll drop this change for now until we figure out what's going on.



* Re: [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
  2018-08-10 10:04   ` Dongdong Liu
  2018-08-10 17:28     ` Bjorn Helgaas
@ 2018-08-10 21:33     ` Myron Stowe
  2018-08-11  3:47       ` Dongdong Liu
  1 sibling, 1 reply; 7+ messages in thread
From: Myron Stowe @ 2018-08-10 21:33 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: Bjorn Helgaas, Myron Stowe, bhelgaas, linux-pci, keith.busch, jdmason

On Fri, 10 Aug 2018 18:04:39 +0800
Dongdong Liu <liudongdong3@huawei.com> wrote:

> Hi Bjorn, Myron,
>
> I found a bug after applying the patch.
>
> The topology is shown below. An 82599 network card with two functions is
> connected to the RP.
>   +-[0000:80]-+-00.0-[81]--+-00.0  Device 8086:10fb
>   |           |            \-00.1  Device 8086:10fb
>
> 1. Use "lspci -s BDF -vvv" to get each device's MPSS, MPS and MRRS values.
> RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>
> 2. Enable SR-IOV.
> echo 1 > /sys/devices/pci0000\:80/0000\:80\:00.0/0000\:81\:00.0/sriov_numvfs
> RP(80:00.0):     MPSS=512 MPS=128 MRRS=512
>                               ^^^
> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
>                               ^^^
> The 82599's PFs (MPSS 512) and VF (MPSS 128) report different MPSS values,
> so the RP's MPS is lowered to 128 when the VF is added. The RP (MPS 128)
> then reports a Malformed TLP when PF0/PF1 issues a memory write with a
> 512-byte payload.
>
> The 82599 network card works fine without the patch.
> Without the patch, the MPSS, MPS and MRRS values are as follows:
>
> RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
>                               ^^^
> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>                               ^^^
>    VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
>                               ^^^

Hi Dongdong,

Thanks for the testing and noticing a problem with the patch,
especially before it was incorporated upstream!


Looking into the PCI Express Base spec (4.0 r1.0), section 9.3.5.3
concerning the "Device Capabilities Register", it indicates "PF and VF
functionality is defined in Section 7.5.3.3 except where noted in
Table 9-15".  Table 9-15 doesn't specifically mention anything with
respect to MPSS which would make one _think_ that its respective VF's
bits are valid.

However, section 9.3.5.4, concerning the "Device Control Register",
does specifically show both Max_Payload_Size (MPS) and
Max_Read_Request_Size (MRRS) to be 'RsvdP' for VFs in Table 9-16
[1].  Just prior to the table it states:
  "PF and VF functionality is defined in Section 7.5.3.4 except where
   noted in Table 9-16. For VF fields marked RsvdP, the PF setting
   applies to the VF."

All of which implies that with respect to MPSS, MPS, and MRRS values,
we should _not_ be paying any attention to the VF's fields, but
rather only to the PF's.  Only looking at the PF's fields also
_logically_ makes sense as it is the sole physical interface to the
PCIe bus.


As for the patch, it looks like we need an additional check for whether
the device is a virtual function ('dev->is_virtfn'), bailing out early
if it is.
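
A rough user-space sketch of that early bail-out follows. The only name
taken from the discussion is the 'is_virtfn' flag; sketch_dev and
sketch_configure_mps() are hypothetical, and the placement of the check
(before any MPS comparison) is an assumption, not the final kernel change.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few struct pci_dev fields that matter here. */
struct sketch_dev {
	bool is_virtfn;	/* true for an SR-IOV virtual function */
	int mpss;	/* Max_Payload_Size Supported, in bytes */
	int mps;	/* current Max_Payload_Size, in bytes */
};

/*
 * Model of the patched logic with the proposed VF check added.
 * Returns true if the Root Port's MPS was lowered.
 */
static bool sketch_configure_mps(struct sketch_dev *dev,
				 struct sketch_dev *root_port)
{
	bool lowered = false;

	/* Assumed placement: skip VFs before looking at their MPSS or MPS. */
	if (dev->is_virtfn)
		return false;

	if (dev->mpss < root_port->mps) {
		root_port->mps = dev->mpss;
		lowered = true;
	}
	dev->mps = root_port->mps;
	return lowered;
}
```

With this check, adding a VF that reports MPSS 128 below a Root Port at
MPS 512 leaves the Root Port untouched, while a physical endpoint with a
smaller MPSS still causes the Root Port to be tuned down.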


[1] Per section 7.4, "Configuration Register Types", 'RsvdP' fields are:
      "Reserved for future RW implementations.  Register bits are
       read-only and must return zero when read. Software must preserve
       the value read for writes to bits."
    which accounts for the MPS and MRRS values being read as '0', and
    thus subsequently interpreted as '128'.

    Which brings up a tangential question: Should 'lspci' interpret,
    and output, 'RsvdP' fields of the Device Control Register
    corresponding to VFs?
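
To illustrate why a zero field shows up as '128', here is a minimal decoding
sketch. The bit positions follow the Device Control register layout
(Max_Payload_Size in bits 7:5, Max_Read_Request_Size in bits 14:12), which
correspond to the kernel's PCI_EXP_DEVCTL_PAYLOAD and PCI_EXP_DEVCTL_READRQ
masks; the macro and helper names below are hypothetical.

```c
#include <stdint.h>

#define SKETCH_DEVCTL_PAYLOAD	0x00e0u	/* Max_Payload_Size, bits 7:5 */
#define SKETCH_DEVCTL_READRQ	0x7000u	/* Max_Read_Request_Size, bits 14:12 */

/* Decode the MPS field of a Device Control value into bytes. */
static unsigned int sketch_devctl_mps_bytes(uint16_t devctl)
{
	return 128u << ((devctl & SKETCH_DEVCTL_PAYLOAD) >> 5);
}

/* Decode the MRRS field of a Device Control value into bytes. */
static unsigned int sketch_devctl_mrrs_bytes(uint16_t devctl)
{
	return 128u << ((devctl & SKETCH_DEVCTL_READRQ) >> 12);
}
```

Both fields are encoded as 128 << n, so a VF whose RsvdP fields read back as
zero decodes to 128 bytes regardless of the PF's actual setting.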

Myron

>
> Thanks,
> Dongdong
On 2018/8/1 22:05, Bjorn Helgaas wrote:
>
snip O<


* Re: [PATCH] PCI: Match Root Port's MPS to endpoint's MPSS when necessary
  2018-08-10 21:33     ` Myron Stowe
@ 2018-08-11  3:47       ` Dongdong Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Dongdong Liu @ 2018-08-11  3:47 UTC (permalink / raw)
  To: Myron Stowe
  Cc: Bjorn Helgaas, Myron Stowe, bhelgaas, linux-pci, keith.busch, jdmason

Hi Myron

On 2018/8/11 5:33, Myron Stowe wrote:
> On Fri, 10 Aug 2018 18:04:39 +0800
> Dongdong Liu <liudongdong3@huawei.com> wrote:
>
>> Hi Bjorn, Myron,
>>
>> I found a bug after applying the patch.
>>
>> The topology is shown below. An 82599 network card with two functions is
>> connected to the RP.
>>   +-[0000:80]-+-00.0-[81]--+-00.0  Device 8086:10fb
>>   |           |            \-00.1  Device 8086:10fb
>>
>> 1. Use "lspci -s BDF -vvv" to get each device's MPSS, MPS and MRRS values.
>> RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
>> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>>
>> 2. Enable SR-IOV.
>> echo 1 > /sys/devices/pci0000\:80/0000\:80\:00.0/0000\:81\:00.0/sriov_numvfs
>> RP(80:00.0):     MPSS=512 MPS=128 MRRS=512
>>                               ^^^
>> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>>                               ^^^
>>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>>                               ^^^
>>    VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
>>                               ^^^
>> The 82599's PFs (MPSS 512) and VF (MPSS 128) report different MPSS values,
>> so the RP's MPS is lowered to 128 when the VF is added. The RP (MPS 128)
>> then reports a Malformed TLP when PF0/PF1 issues a memory write with a
>> 512-byte payload.
>>
>> The 82599 network card works fine without the patch.
>> Without the patch, the MPSS, MPS and MRRS values are as follows:
>>
>> RP(80:00.0):     MPSS=512 MPS=512 MRRS=512
>>                               ^^^
>> EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512
>>                               ^^^
>>    PF1(81:00.1): MPSS=512 MPS=512 MRRS=512
>>                               ^^^
>>    VF0(81:10.0): MPSS=128 MPS=128 MRRS=128
>>                               ^^^
>
> Hi Dongdong,
>
> Thanks for the testing and noticing a problem with the patch,
> especially before it was incorporated upstream!
>
>
> Looking into the PCI Express Base spec (4.0 r1.0), section 9.3.5.3
> concerning the "Device Capabilities Register", it indicates "PF and VF
> functionality is defined in Section 7.5.3.3 except where noted in
> Table 9-15".  Table 9-15 doesn't specifically mention anything with
> respect to MPSS which would make one _think_ that its respective VF's
> bits are valid.
Yes, it is very easy to misunderstand, especially since section 7.5.3.3
says of Max_Payload_Size Supported:
  The Functions of a Multi-Function Device are permitted to report
  different values for this field.
>
> However, section 9.3.5.4, concerning the "Device Control Register",
> does specifically show both Max_Payload_Size (MPS) and
> Max_Read_Request_Size (MRRS) to be 'RsvdP' for VFs in Table 9-16
> [1].  Just prior to the table it states:
>   "PF and VF functionality is defined in Section 7.5.3.4 except where
>    noted in Table 9-16. For VF fields marked RsvdP, the PF setting
>    applies to the VF."
>
> All of which implies that with respect to MPSS, MPS, and MRRS values,
> we should _not_ be paying any attention to the VF's fields, but
> rather only to the PF's.  Only looking at the PF's fields also
> _logically_ makes sense as it is the sole physical interface to the
> PCIe bus.
Thanks for clarifying this.
>
>
> As for the patch, it looks like we need an additional check for whether
> the device is a virtual function ('dev->is_virtfn'), bailing out early
> if it is.

Yes, that will be ok.
Thanks,
Dongdong

>
>
> [1] Per section 7.4, "Configuration Register Types", 'RsvdP' fields are:
>       "Reserved for future RW implementations.  Register bits are
>        read-only and must return zero when read. Software must preserve
>        the value read for writes to bits."
>     which accounts for the MPS and MRRS values being read as '0', and
>     thus subsequently interpreted as '128'.
>
>     Which brings up a tangential question: Should 'lspci' interpret,
>     and output, 'RsvdP' fields of the Device Control Register
>     corresponding to VFs?
>
> Myron
>
>>
>> Thanks,
>> Dongdong
> On 2018/8/1 22:05, Bjorn Helgaas wrote:
>>
> snip O<
>
> .
>

