linux-kernel.vger.kernel.org archive mirror
* [PATCH V3 0/2] PCI: add CRS support after hot reset and FLR
@ 2016-10-03  5:36 Sinan Kaya
  2016-10-03  5:36 ` [PATCH V3 1/2] PCI: add CRS support to error handling path Sinan Kaya
  2016-10-03  5:37 ` [PATCH V3 2/2] PCI: handle CRS returned by device after FLR Sinan Kaya
  0 siblings, 2 replies; 8+ messages in thread
From: Sinan Kaya @ 2016-10-03  5:36 UTC (permalink / raw)
  To: linux-pci, timur, cov, alex.williamson, vikrams
  Cc: Lorenzo.Pieralisi, linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	linux-kernel

The PCIE spec allows an endpoint device to extend the initialization time
beyond 1 second by issuing Configuration Request Retry Status (CRS) for a
vendor ID read request.

This basically means "I'm busy now, please call me back later".

There are two moving parts to CRS support from the SW perspective. One part
is to determine if CRS is supported or not. The second part is to set the
CRS visibility register.

During probe, the Linux kernel takes care of both parts in the
pci_enable_crs function. The kernel also honors a returned CRS in the
pci_bus_read_dev_vendor_id function, if supported: the function polls for
up to a specified amount of time while the endpoint keeps returning CRS.
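
The polling behaviour described above can be sketched in user space. This is only an illustration and not the kernel's implementation: struct mock_dev, mock_read_vendor_id() and poll_vendor_id() are hypothetical names, the doubling delay is an assumption about how such a poll might back off, and the sleep is replaced by a counter so the example runs anywhere. The 0xFFFF0001 pattern is the value discussed later in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Dword seen on a Vendor ID read while the device answers with CRS and
 * CRS software visibility is enabled: Vendor ID 0x0001, Device ID 0xFFFF. */
#define CRS_PATTERN 0xFFFF0001u

/* Hypothetical mock device: answers CRS for the first crs_reads reads. */
struct mock_dev {
	unsigned int crs_reads; /* Vendor ID reads that still return CRS */
	uint32_t id;            /* Device ID << 16 | Vendor ID once ready */
};

static uint32_t mock_read_vendor_id(struct mock_dev *dev)
{
	assert(dev != NULL);
	if (dev->crs_reads > 0) {
		dev->crs_reads--;
		return CRS_PATTERN;
	}
	return dev->id;
}

/*
 * Poll the Vendor ID with a doubling delay (an assumption of this sketch)
 * until the CRS pattern goes away or the next delay would exceed
 * timeout_ms. Instead of sleeping, the accumulated delay is reported in
 * *waited_ms.
 */
static bool poll_vendor_id(struct mock_dev *dev, uint32_t *id,
			   unsigned int timeout_ms, unsigned int *waited_ms)
{
	unsigned int delay = 1;

	*waited_ms = 0;
	*id = mock_read_vendor_id(dev);

	while (*id == CRS_PATTERN) {
		if (delay > timeout_ms)
			return false;    /* device still busy: give up */
		*waited_ms += delay;     /* a kernel caller would sleep here */
		delay *= 2;
		*id = mock_read_vendor_id(dev);
	}

	/* All ones means nothing answered the read at all. */
	return *id != 0xFFFFFFFFu;
}
```

With a mock device that answers CRS three times, the sketch accumulates 1 + 2 + 4 = 7 ms of simulated delay before the real ID is returned.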

The PCIe spec also allows CRS to be issued during cold, warm, hot and FLR
resets.

A hot reset is initiated by starting a secondary bus reset. A bus/device
restore follows the reset. This series adds a vendor ID read to the device
restore function to validate that the device is accessible before the saved
register contents are written back. If the device issues CRS, the code may
poll for up to 60 seconds.

An endpoint is allowed to issue CRS following an FLR request to indicate
that it is not ready to accept new requests. Change the polling mechanism
in the FLR wait function to read the vendor ID instead of the
command/status register. A CRS indication is only given when the address
being read is the vendor ID.

v3:
* dropped the parent_bus_reset and IB/hfi1 changes, as both only work
when there is a single device on the bus and the reset targets that
device.
* dropped the AER changes, as the AER driver broadcasts the error to the
endpoint device driver, which eventually causes the endpoint driver to be
reprobed after a fatal error.
* moved the vendor ID read into the pci_dev_restore function, as this is
the first attempt to contact the endpoint after a reset.

v2:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1233472.html
* IB/hfi1 via pci_reset_bridge_secondary_bus
* PCI/AER via pci_reset_bridge_secondary_bus
* PCI: dev_reset via parent bus reset
* use walk_bus for vendor id reads since the lock is no longer held.

v1:
http://www.spinics.net/lists/linux-pci/msg53596.html

* initial implementation

Sinan Kaya (2):
  PCI: add CRS support to error handling path
  PCI: handle CRS returned by device after FLR

 drivers/pci/pci.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

-- 
1.9.1


* [PATCH V3 1/2] PCI: add CRS support to error handling path
  2016-10-03  5:36 [PATCH V3 0/2] PCI: add CRS support after hot reset and FLR Sinan Kaya
@ 2016-10-03  5:36 ` Sinan Kaya
  2016-11-10 18:39   ` Sinan Kaya
  2016-10-03  5:37 ` [PATCH V3 2/2] PCI: handle CRS returned by device after FLR Sinan Kaya
  1 sibling, 1 reply; 8+ messages in thread
From: Sinan Kaya @ 2016-10-03  5:36 UTC (permalink / raw)
  To: linux-pci, timur, cov, alex.williamson, vikrams
  Cc: Lorenzo.Pieralisi, linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	linux-kernel

The PCIE spec allows an endpoint device to extend the initialization time
beyond 1 second by issuing Configuration Request Retry Status (CRS) for a
vendor ID read request.

This basically means "I'm busy now, please call me back later".

There are two moving parts to CRS support from the SW perspective. One part
is to determine if CRS is supported or not. The second part is to set the
CRS visibility register.

During probe, the Linux kernel takes care of both parts in the
pci_enable_crs function. The kernel also honors a returned CRS in the
pci_bus_read_dev_vendor_id function, if supported: the function polls for
up to a specified amount of time while the endpoint keeps returning CRS.

The PCIe spec also allows CRS to be issued during cold, warm, hot and FLR
resets.

A hot reset is initiated by starting a secondary bus reset. A bus/device
restore follows the reset. This patch adds a vendor ID read to the device
restore function to validate that the device is accessible before the saved
register contents are written back. If the device issues CRS, the code may
poll for up to 60 seconds.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/pci/pci.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aab9d51..c8749b9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4020,6 +4020,12 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
 
 static void pci_dev_restore(struct pci_dev *dev)
 {
+	u32 l;
+
+	/* see if the device is accessible first */
+	if (!pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 60 * 1000))
+		return;
+
 	pci_restore_state(dev);
 	pci_reset_notify(dev, false);
 }
-- 
1.9.1


* [PATCH V3 2/2] PCI: handle CRS returned by device after FLR
  2016-10-03  5:36 [PATCH V3 0/2] PCI: add CRS support after hot reset and FLR Sinan Kaya
  2016-10-03  5:36 ` [PATCH V3 1/2] PCI: add CRS support to error handling path Sinan Kaya
@ 2016-10-03  5:37 ` Sinan Kaya
  2017-02-21 17:04   ` Sinan Kaya
  1 sibling, 1 reply; 8+ messages in thread
From: Sinan Kaya @ 2016-10-03  5:37 UTC (permalink / raw)
  To: linux-pci, timur, cov, alex.williamson, vikrams
  Cc: Lorenzo.Pieralisi, linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	linux-kernel

An endpoint is allowed to issue CRS following an FLR request to indicate
that it is not ready to accept new requests. Change the polling mechanism
in the FLR wait function to read the vendor ID instead of the
command/status register. A CRS indication is only given when the address
being read is the vendor ID.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/pci/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c8749b9..7580b00 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3725,7 +3725,8 @@ static void pci_flr_wait(struct pci_dev *dev)
 
 	do {
 		msleep(100);
-		pci_read_config_dword(dev, PCI_COMMAND, &id);
+		pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &id,
+					   60 * 1000);
 	} while (i++ < 10 && id == ~0);
 
 	if (id == ~0)
-- 
1.9.1


* Re: [PATCH V3 1/2] PCI: add CRS support to error handling path
  2016-10-03  5:36 ` [PATCH V3 1/2] PCI: add CRS support to error handling path Sinan Kaya
@ 2016-11-10 18:39   ` Sinan Kaya
  0 siblings, 0 replies; 8+ messages in thread
From: Sinan Kaya @ 2016-11-10 18:39 UTC (permalink / raw)
  To: linux-pci, timur, cov, alex.williamson, vikrams
  Cc: Lorenzo.Pieralisi, linux-arm-msm, linux-arm-kernel, linux-kernel

On 10/3/2016 1:36 AM, Sinan Kaya wrote:
> The PCIE spec allows an endpoint device to extend the initialization time
> beyond 1 second by issuing Configuration Request Retry Status (CRS) for a
> vendor ID read request.
> 
> This basically means "I'm busy now, please call me back later".
> 
> There are two moving parts to CRS support from the SW perspective. One part
> is to determine if CRS is supported or not. The second part is to set the
> CRS visibility register.
> 
> As part of the probe, the Linux kernel sets the above two conditions in
> pci_enable_crs function. The kernel is also honoring the returned CRS in
> pci_bus_read_dev_vendor_id function if supported. The function will poll up
> to specified amount of time while endpoint is returning CRS response.
> 
> The PCIe spec also allows CRS to be issued during cold, warm, hot and FLR
> resets.
> 
> The hot reset is initiated by starting a secondary bus reset. A bus/device
> restore follows the reset.  This patch is adding vendor ID read into dev
> restore function to validate that the device is accessible before writing
> the register contents. If the device issues CRS, the code might poll up
> to 60 seconds.
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/pci/pci.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index aab9d51..c8749b9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4020,6 +4020,12 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
>  
>  static void pci_dev_restore(struct pci_dev *dev)
>  {
> +	u32 l;
> +
> +	/* see if the device is accessible first */
> +	if (!pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &l, 60 * 1000))
> +		return;
> +
>  	pci_restore_state(dev);
>  	pci_reset_notify(dev, false);
>  }
> 

Any feedback on this direction?


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH V3 2/2] PCI: handle CRS returned by device after FLR
  2016-10-03  5:37 ` [PATCH V3 2/2] PCI: handle CRS returned by device after FLR Sinan Kaya
@ 2017-02-21 17:04   ` Sinan Kaya
  2017-02-21 20:51     ` Alex Williamson
  0 siblings, 1 reply; 8+ messages in thread
From: Sinan Kaya @ 2017-02-21 17:04 UTC (permalink / raw)
  To: linux-pci, alex.williamson
  Cc: timur, cov, vikrams, Lorenzo.Pieralisi, linux-arm-msm,
	linux-arm-kernel, linux-kernel

Hi Alex,

I'm coming back to work on this. 

On 10/3/2016 1:37 AM, Sinan Kaya wrote:
> An endpoint is allowed to issue CRS following an FLR request to indicate
> that it is not ready to accept new requests. Changing the polling mechanism
> in FLR wait function to go read the vendor ID instead of the command/status
> register. A CRS indication will only be given if the address to be read is
> vendor ID.
> 
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/pci/pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index c8749b9..7580b00 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3725,7 +3725,8 @@ static void pci_flr_wait(struct pci_dev *dev)
>  
>  	do {
>  		msleep(100);
> -		pci_read_config_dword(dev, PCI_COMMAND, &id);

Your comment here puzzled me. 

https://patchwork.kernel.org/patch/8331851/

"Self nak on this one, didn't account for VFs not implementing the first
dword.  Thanks,"

I'm trying to add Configuration Request Retry Status (CRS) support to FLR
with this patch. 

Basically, the root port will return 0xFFFF0001 only when a config read
request is sent to the vendor ID register and CRS visibility is set.
The SW needs to poll until this special read ID disappears. See the
implementation note on Configuration Request Retry Status in PCIE
specification for details.

pci_bus_read_dev_vendor_id implements this loop for us.

You are saying that there are VFs that do not implement vendor ID register.
Can you give some history on this?


> +		pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &id,
> +					   60 * 1000);
>  	} while (i++ < 10 && id == ~0);
>  
>  	if (id == ~0)
> 

Sinan

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH V3 2/2] PCI: handle CRS returned by device after FLR
  2017-02-21 17:04   ` Sinan Kaya
@ 2017-02-21 20:51     ` Alex Williamson
  2017-02-22  2:37       ` Sinan Kaya
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2017-02-21 20:51 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-pci, timur, cov, vikrams, Lorenzo.Pieralisi, linux-arm-msm,
	linux-arm-kernel, linux-kernel

On Tue, 21 Feb 2017 12:04:24 -0500
Sinan Kaya <okaya@codeaurora.org> wrote:

> Hi Alex,
> 
> I'm coming back to work on this. 
> 
> On 10/3/2016 1:37 AM, Sinan Kaya wrote:
> > An endpoint is allowed to issue CRS following an FLR request to indicate
> > that it is not ready to accept new requests. Changing the polling mechanism
> > in FLR wait function to go read the vendor ID instead of the command/status
> > register. A CRS indication will only be given if the address to be read is
> > vendor ID.
> > 
> > Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> > ---
> >  drivers/pci/pci.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index c8749b9..7580b00 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3725,7 +3725,8 @@ static void pci_flr_wait(struct pci_dev *dev)
> >  
> >  	do {
> >  		msleep(100);
> > -		pci_read_config_dword(dev, PCI_COMMAND, &id);  
> 
> Your comment here puzzled me. 
> 
> https://patchwork.kernel.org/patch/8331851/
> 
> "Self nak on this one, didn't account for VFs not implementing the first
> dword.  Thanks,"
> 
> I'm trying to add Configuration Request Retry Status (CRS) support to FLR
> with this patch. 
> 
> Basically, the root port will return 0xFFFF0001 only when a config read
> request is sent to the vendor ID register and CRS visibility is set.
> The SW needs to poll until this special read ID disappears. See the
> implementation note on Configuration Request Retry Status in PCIE
> specification for details.
> 
> pci_bus_read_dev_vendor_id implements this loop for us.
> 
> You are saying that there are VFs that do not implement vendor ID register.
> Can you give some history on this?

SR-IOV spec rev 1.1, 3.4.1.1 & 3.4.1.2, Vendor ID and Device ID fields
for the VF return 0xFFFF when read.  The "Virtualization Intermediary"
is supposed to use the vendor ID from the PF and the device ID defined
in the PF SR-IOV capability.

> 
> > +		pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &id,
> > +					   60 * 1000);
> >  	} while (i++ < 10 && id == ~0);

pci_bus_read_dev_vendor_id() seems like it will return false with an
id value of ~0 for a functional VF, so this loop will spin longer than
necessary and report an invalid error.  Patch 1/2 from this series
would cause pci_dev_restore() to be a no-op on VFs.  Thanks,
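
The effect described above can be illustrated with a small user-space mock. This is a sketch under stated assumptions, not kernel code: struct mock_fn, mock_cfg_read() and the three helpers are hypothetical names, and the only behaviour taken from the thread is that a VF returns 0xFFFF for both the Vendor ID and Device ID fields while its command/status dword reads normally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_VENDOR_ID 0x00 /* dword 0: Device ID << 16 | Vendor ID */
#define CFG_COMMAND   0x04 /* dword 1: Status << 16 | Command */

/* Hypothetical mock of one PCI function's config space. */
struct mock_fn {
	bool is_vf;          /* SR-IOV virtual function? */
	bool responding;     /* has the function come back from reset? */
	uint32_t id;         /* dword 0 as a PF would report it */
	uint32_t cmd_status; /* dword 1 */
};

static uint32_t mock_cfg_read(const struct mock_fn *fn, int offset)
{
	assert(fn != NULL);
	if (!fn->responding)
		return 0xFFFFFFFFu;            /* nothing answers yet */
	if (offset == CFG_VENDOR_ID)           /* VFs read all ones here */
		return fn->is_vf ? 0xFFFFFFFFu : fn->id;
	if (offset == CFG_COMMAND)
		return fn->cmd_status;
	return 0;
}

/* Readiness check based on dword 0, as in the proposed patches. */
static bool ready_by_vendor_id(const struct mock_fn *fn)
{
	return mock_cfg_read(fn, CFG_VENDOR_ID) != 0xFFFFFFFFu;
}

/* Readiness check based on the command/status dword, as before patch 2/2. */
static bool ready_by_command(const struct mock_fn *fn)
{
	return mock_cfg_read(fn, CFG_COMMAND) != 0xFFFFFFFFu;
}

/* Restore guarded the way patch 1/2 guards it; true if the restore ran. */
static bool guarded_restore(const struct mock_fn *fn)
{
	if (!ready_by_vendor_id(fn))
		return false; /* early return: state is never restored */
	return true;
}
```

For a responding VF in this mock, the vendor ID check reports "not ready" and the guarded restore is skipped, while the command register check reports "ready".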

Alex


* Re: [PATCH V3 2/2] PCI: handle CRS returned by device after FLR
  2017-02-21 20:51     ` Alex Williamson
@ 2017-02-22  2:37       ` Sinan Kaya
  2017-03-01 17:31         ` Sinan Kaya
  0 siblings, 1 reply; 8+ messages in thread
From: Sinan Kaya @ 2017-02-22  2:37 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-pci, timur, cov, vikrams, Lorenzo.Pieralisi, linux-arm-msm,
	linux-arm-kernel, linux-kernel

On 2/21/2017 3:51 PM, Alex Williamson wrote:
> On Tue, 21 Feb 2017 12:04:24 -0500
> Sinan Kaya <okaya@codeaurora.org> wrote:
> 
>> Hi Alex,
>>
>> I'm coming back to work on this. 
>>
>> On 10/3/2016 1:37 AM, Sinan Kaya wrote:
>>> An endpoint is allowed to issue CRS following an FLR request to indicate
>>> that it is not ready to accept new requests. Changing the polling mechanism
>>> in FLR wait function to go read the vendor ID instead of the command/status
>>> register. A CRS indication will only be given if the address to be read is
>>> vendor ID.
>>>
>>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>>> ---
>>>  drivers/pci/pci.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>> index c8749b9..7580b00 100644
>>> --- a/drivers/pci/pci.c
>>> +++ b/drivers/pci/pci.c
>>> @@ -3725,7 +3725,8 @@ static void pci_flr_wait(struct pci_dev *dev)
>>>  
>>>  	do {
>>>  		msleep(100);
>>> -		pci_read_config_dword(dev, PCI_COMMAND, &id);  
>>
>> Your comment here puzzled me. 
>>
>> https://patchwork.kernel.org/patch/8331851/
>>
>> "Self nak on this one, didn't account for VFs not implementing the first
>> dword.  Thanks,"
>>
>> I'm trying to add Configuration Request Retry Status (CRS) support to FLR
>> with this patch. 
>>
>> Basically, the root port will return 0xFFFF0001 only when a config read
>> request is sent to the vendor ID register and CRS visibility is set.
>> The SW needs to poll until this special read ID disappears. See the
>> implementation note on Configuration Request Retry Status in PCIE
>> specification for details.
>>
>> pci_bus_read_dev_vendor_id implements this loop for us.
>>
>> You are saying that there are VFs that do not implement vendor ID register.
>> Can you give some history on this?
> 
> SR-IOV spec rev 1.1, 3.4.1.1 & 3.4.1.2, Vendor ID and Device ID fields
> for the VF return 0xFFFF when read.  The "Virtualization Intermediary"
> is supposed to use the vendor ID from the PF and the device ID defined
> in the PF SR-IOV capability.


Interesting. Since lspci was showing the correct vendor ID and device ID, I
assumed that they were coming from offset 0.

Maybe the right thing is to figure out whether this is a virtual function
or not. If it is a physical function, check for CRS first, before reading
the command register in the existing loop.
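
A rough user-space sketch of that idea follows. It is not a proposed patch: struct mock_fn, read_vendor_id(), read_command() and flr_wait_sketch() are hypothetical names, the is_vf flag stands in for however the kernel would identify a VF, and the sleep between polls is omitted so the example runs anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CRS_PATTERN 0xFFFF0001u
#define ALL_ONES    0xFFFFFFFFu

/* Hypothetical mock of a function recovering from FLR. */
struct mock_fn {
	bool is_vf;              /* SR-IOV virtual function? */
	unsigned int crs_reads;  /* PF only: Vendor ID reads still returning CRS */
	unsigned int dead_reads; /* command reads still returning all ones */
	uint32_t id;             /* dword 0 once ready (PF only) */
	uint32_t cmd_status;     /* dword 1 once ready */
};

static uint32_t read_vendor_id(struct mock_fn *fn)
{
	assert(fn != NULL);
	if (fn->is_vf)
		return ALL_ONES; /* VFs do not implement dword 0 */
	if (fn->crs_reads > 0) {
		fn->crs_reads--;
		return CRS_PATTERN;
	}
	return fn->id;
}

static uint32_t read_command(struct mock_fn *fn)
{
	assert(fn != NULL);
	if (fn->dead_reads > 0) {
		fn->dead_reads--;
		return ALL_ONES;
	}
	return fn->cmd_status;
}

/*
 * Poll up to max_polls times. For a PF, look for CRS on the Vendor ID
 * first; for a VF, skip that step and rely on the command register only.
 * Returns true if the function came back; *polls reports the polls used.
 */
static bool flr_wait_sketch(struct mock_fn *fn, unsigned int max_polls,
			    unsigned int *polls)
{
	for (*polls = 1; *polls <= max_polls; (*polls)++) {
		/* a kernel caller would sleep between polls here */
		if (!fn->is_vf && read_vendor_id(fn) == CRS_PATTERN)
			continue; /* PF still busy: poll again */
		if (read_command(fn) != ALL_ONES)
			return true;
	}
	*polls = max_polls;
	return false;
}
```

In this mock, a PF that answers CRS twice is reported ready on the third poll, and a VF is judged purely by its command register.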


> 
>>
>>> +		pci_bus_read_dev_vendor_id(dev->bus, dev->devfn, &id,
>>> +					   60 * 1000);
>>>  	} while (i++ < 10 && id == ~0);
> 
> pci_bus_read_dev_vendor_id() seems like it will return false with an
> id value of ~0 for a functional VF, so this loop will spin longer than
> necessary and report an invalid error.  Patch 1/2 from this series
> would cause pci_dev_restore() to be a no-op on VFs.  Thanks,
> 
> Alex
> 
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH V3 2/2] PCI: handle CRS returned by device after FLR
  2017-02-22  2:37       ` Sinan Kaya
@ 2017-03-01 17:31         ` Sinan Kaya
  0 siblings, 0 replies; 8+ messages in thread
From: Sinan Kaya @ 2017-03-01 17:31 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-pci, timur, cov, vikrams, Lorenzo.Pieralisi, linux-arm-msm,
	linux-arm-kernel, linux-kernel

On 2/21/2017 9:37 PM, Sinan Kaya wrote:
>> SR-IOV spec rev 1.1, 3.4.1.1 & 3.4.1.2, Vendor ID and Device ID fields
>> for the VF return 0xFFFF when read.  The "Virtualization Intermediary"
>> is supposed to use the vendor ID from the PF and the device ID defined
>> in the PF SR-IOV capability.
> 
> Interesting. Since lspci was showing the correct vendor id and device id, I
> assumed that it is coming from offset 0.
> 
> Maybe, the right thing is to figure out if this is a virtual function or not.
> If it is a physical function, check the CRS first before reading the command
> register in the existing loop.
> 
> 

I went back and read the spec for CRS in the FLR case one more time. We are
required to wait up to 1 second if the device is sending CRS. The current
code already seems to handle this delay.
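
As a quick check of that claim, the control flow of the unpatched loop shown in the diff context can be replayed in user space. This is only a sketch: flr_wait_budget_ms() is a hypothetical helper, the counter is assumed to start at 0 (the initialization is not visible in the quoted hunk), and the sleep is replaced by a running total.

```c
#include <assert.h>
#include <stdint.h>

#define FLR_POLL_MS 100u

/*
 * Replays the control flow of
 *     do { msleep(100); read; } while (i++ < 10 && id == ~0);
 * for a device whose reads return all ones for the first dead_reads
 * reads, and returns the total number of milliseconds slept.
 */
static unsigned int flr_wait_budget_ms(unsigned int dead_reads)
{
	unsigned int slept_ms = 0;
	uint32_t id;
	int i = 0; /* assumption: the counter starts at 0 */

	do {
		slept_ms += FLR_POLL_MS;   /* stands in for msleep(100) */
		if (dead_reads > 0) {
			dead_reads--;
			id = ~0u;          /* device not answering yet */
		} else {
			id = 0x00100006u;  /* arbitrary valid register value */
		}
	} while (i++ < 10 && id == ~0u);

	/* The body runs at most 11 times under the assumption above. */
	assert(slept_ms <= 11 * FLR_POLL_MS);
	return slept_ms;
}
```

Under those assumptions the loop sleeps 100 ms for a device that answers immediately and at most 1100 ms for one that never answers, which covers the 1 second mentioned above.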



-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

