* PCI CRS Support
@ 2016-08-24 15:56 Sinan Kaya
2016-08-24 19:10 ` Bjorn Helgaas
0 siblings, 1 reply; 5+ messages in thread
From: Sinan Kaya @ 2016-08-24 15:56 UTC (permalink / raw)
To: Linux PCI, Bjorn Helgaas, Vikram Sethi
Hi Bjorn,
I see that the kernel has support for Configuration Request Retry Status (CRS) visibility
support and it gets discovered and enabled as part of the probe function.
Let's assume a system with CRS capability and have its visibility set as above.
I do not see any code in the failure/reset path to support the CRS requests
returned by the endpoint.
An endpoint is allowed to return CRS after several reset types. I'm pasting the part of
the spec for you at 2.3.1 Request Handling Rules of 3.1 spec.
"For Configuration Requests only, following reset it is possible for a device to terminate the request
but indicate that it is temporarily unable to process the Request, but will be able to process the Request
in the future – in this case, the Configuration Request Retry 10 Status (CRS) Completion Status is used
(see Section 6.6). Valid reset conditions after which a device is permitted to return CRS are:
- Cold, Warm, and Hot Resets
- FLRs
- A reset initiated in response to a D3hot to D0uninitialized device state transition."
I have identified the following functions that have problems for warm and hot resets.
Some callers of pci_reset_bridge_secondary_bus such as pciehp_reset_slot, aer_root_reset.
Other higher level callers such as pci_bus_reset, pci_try_reset_bus and their callers from VFIO.
All these places are impacted by a CRS call. They do the secondary bus reset but do not wait for the
endpoint to respond. Waiting for 1 second is not a guarantee that the endpoint will start responding
immediately. A CRS capable OS needs to interpret the incoming CRS response and poll longer
since CRS visibility is et.
All of this was warm and hot reset.
I also see another problem in the FLR path too. There is some best effort wait up to 1 second in
pci_flr_wait.
Where do we go from here? I was thinking of putting something deep down into the reset secondary
bus function but I'm afraid it will break things especially when we wait up to 60 seconds.
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PCI CRS Support
2016-08-24 15:56 PCI CRS Support Sinan Kaya
@ 2016-08-24 19:10 ` Bjorn Helgaas
2016-08-24 19:28 ` Sinan Kaya
0 siblings, 1 reply; 5+ messages in thread
From: Bjorn Helgaas @ 2016-08-24 19:10 UTC (permalink / raw)
To: Sinan Kaya; +Cc: Linux PCI, Bjorn Helgaas, Vikram Sethi
Hi Sinan,
On Wed, Aug 24, 2016 at 11:56:18AM -0400, Sinan Kaya wrote:
> Hi Bjorn,
> I see that the kernel has support for Configuration Request Retry Status (CRS) visibility
> support and it gets discovered and enabled as part of the probe function.
>
> Let's assume a system with CRS capability and have its visibility set as above.
> I do not see any code in the failure/reset path to support the CRS requests
> returned by the endpoint.
>
> An endpoint is allowed to return CRS after several reset types. I'm pasting the part of
> the spec for you at 2.3.1 Request Handling Rules of 3.1 spec.
>
> "For Configuration Requests only, following reset it is possible for a device to terminate the request
> but indicate that it is temporarily unable to process the Request, but will be able to process the Request
> in the future – in this case, the Configuration Request Retry 10 Status (CRS) Completion Status is used
> (see Section 6.6). Valid reset conditions after which a device is permitted to return CRS are:
>
> - Cold, Warm, and Hot Resets
> - FLRs
> - A reset initiated in response to a D3hot to D0uninitialized device state transition."
>
> I have identified the following functions that have problems for warm and hot resets.
>
> Some callers of pci_reset_bridge_secondary_bus such as pciehp_reset_slot, aer_root_reset.
> Other higher level callers such as pci_bus_reset, pci_try_reset_bus and their callers from VFIO.
> All these places are impacted by a CRS call. They do the secondary bus reset but do not wait for the
> endpoint to respond. Waiting for 1 second is not a guarantee that the endpoint will start responding
> immediately. A CRS capable OS needs to interpret the incoming CRS response and poll longer
> since CRS visibility is et.
>
> All of this was warm and hot reset.
>
> I also see another problem in the FLR path too. There is some best effort wait up to 1 second in
> pci_flr_wait.
>
> Where do we go from here? I was thinking of putting something deep down into the reset secondary
> bus function but I'm afraid it will break things especially when we wait up to 60 seconds.
I agree CRS handling after reset is probably all broken.
I hate the fact that we reset devices without re-enumerating them. We
have no assurance that the device is the same after reset (it could
have loaded new firmware and been completely reconfigured).
I don't have any good suggestions for you, so if you have some ideas
and want to fix it, please go ahead.
Bjorn
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PCI CRS Support
2016-08-24 19:10 ` Bjorn Helgaas
@ 2016-08-24 19:28 ` Sinan Kaya
2016-08-25 8:44 ` Lorenzo Pieralisi
0 siblings, 1 reply; 5+ messages in thread
From: Sinan Kaya @ 2016-08-24 19:28 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Linux PCI, Bjorn Helgaas, Vikram Sethi
On 8/24/2016 3:10 PM, Bjorn Helgaas wrote:
>> Where do we go from here? I was thinking of putting something deep down into the reset secondary
>> > bus function but I'm afraid it will break things especially when we wait up to 60 seconds.
> I agree CRS handling after reset is probably all broken.
>
> I hate the fact that we reset devices without re-enumerating them. We
> have no assurance that the device is the same after reset (it could
> have loaded new firmware and been completely reconfigured).
>
> I don't have any good suggestions for you, so if you have some ideas
> and want to fix it, please go ahead.
I think I'll make a list of paths that reach to secondary bus reset function
and try to keep CRS loop as close as possible to the caller.
I'll focus on FLR later. I won't be heart-broken if somebody took a stab at the
FLR.
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PCI CRS Support
2016-08-24 19:28 ` Sinan Kaya
@ 2016-08-25 8:44 ` Lorenzo Pieralisi
2016-08-28 17:57 ` Sinan Kaya
0 siblings, 1 reply; 5+ messages in thread
From: Lorenzo Pieralisi @ 2016-08-25 8:44 UTC (permalink / raw)
To: Sinan Kaya
Cc: Bjorn Helgaas, Linux PCI, Bjorn Helgaas, Vikram Sethi, alex.williamson
Hi Sinan,
[+Alex]
On Wed, Aug 24, 2016 at 03:28:51PM -0400, Sinan Kaya wrote:
> On 8/24/2016 3:10 PM, Bjorn Helgaas wrote:
> >> Where do we go from here? I was thinking of putting something deep down into the reset secondary
> >> > bus function but I'm afraid it will break things especially when we wait up to 60 seconds.
> > I agree CRS handling after reset is probably all broken.
> >
> > I hate the fact that we reset devices without re-enumerating them. We
> > have no assurance that the device is the same after reset (it could
> > have loaded new firmware and been completely reconfigured).
> >
> > I don't have any good suggestions for you, so if you have some ideas
> > and want to fix it, please go ahead.
>
> I think I'll make a list of paths that reach to secondary bus reset
> function and try to keep CRS loop as close as possible to the caller.
>
> I'll focus on FLR later. I won't be heart-broken if somebody took a
> stab at the FLR.
Side note, taking advantage of this query: I hope this is a problem
solved by end of October, if it is not I would suggest adding a track to
tackle it at LPC16 PCI uconf, I added Alex since I know he is interested
in the topic too, it makes sense to debate a solution with all
interested people in one room.
Thanks,
Lorenzo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PCI CRS Support
2016-08-25 8:44 ` Lorenzo Pieralisi
@ 2016-08-28 17:57 ` Sinan Kaya
0 siblings, 0 replies; 5+ messages in thread
From: Sinan Kaya @ 2016-08-28 17:57 UTC (permalink / raw)
To: Lorenzo Pieralisi, Bjorn Helgaas
Cc: Linux PCI, Bjorn Helgaas, Vikram Sethi, alex.williamson
Hi Bjorn, Lorenzo;
On 8/25/2016 4:44 AM, Lorenzo Pieralisi wrote:
> Hi Sinan,
>
> [+Alex]
>
> On Wed, Aug 24, 2016 at 03:28:51PM -0400, Sinan Kaya wrote:
>> On 8/24/2016 3:10 PM, Bjorn Helgaas wrote:
>>>> Where do we go from here? I was thinking of putting something deep down into the reset secondary
>>>>> bus function but I'm afraid it will break things especially when we wait up to 60 seconds.
>>> I agree CRS handling after reset is probably all broken.
>>>
>>> I hate the fact that we reset devices without re-enumerating them. We
>>> have no assurance that the device is the same after reset (it could
>>> have loaded new firmware and been completely reconfigured).
>>>
>>> I don't have any good suggestions for you, so if you have some ideas
>>> and want to fix it, please go ahead.
>>
>> I think I'll make a list of paths that reach to secondary bus reset
>> function and try to keep CRS loop as close as possible to the caller.
>>
>> I'll focus on FLR later. I won't be heart-broken if somebody took a
>> stab at the FLR.
>
> Side note, taking advantage of this query: I hope this is a problem
> solved by end of October, if it is not I would suggest adding a track to
> tackle it at LPC16 PCI uconf, I added Alex since I know he is interested
> in the topic too, it makes sense to debate a solution with all
> interested people in one room.
>
Let's see where this goes.
Here is a list of all the callers I found. Based on what I see, the code
assumes that registers are accessible immediately after pci_reset_bridge_secondary_bus
call.
pci_reset_bridge_secondary_bus also sleeps for 1 seconds. We can maybe assume
that it is OK to sleep additional potential 60 seconds.
How do you feel about running CRS against all immediate children
of this bridge in pci_reset_secondary_bus function by calling pci_bus_read_dev_vendor_id
for every single child?
Something like:
+ u32 l;
+ list_for_each_entry(dev, &bus->devices, bus_list)
+ pci_bus_read_dev_vendor_id(bus, dev->devfn, &l, 60*1000);
trigger_sbr (infiniband/hw/hfi1/pcie.c)
pci_reset_bridge_secondary_bus (export)
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
__pci_reset_function
pci_dev_reset
__pci_dev_reset
pci_dev_reset_slot_function
pci_reset_hotplug_slot
reset_slot (drivers/pci/hotplug/pciehp_core.c)
pciehp_reset_slot (drivers/pci/hotplug/pciehp_hpc.c)
pci_reset_bridge_secondary_bus (export)
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
pcie_capability_write_word
pci_parent_bus_reset
pci_reset_bridge_secondary_bus
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
pci_try_reset_function
__pci_dev_reset
pci_dev_restore
pcistub_put_pci_dev
__pci_reset_function_locked
__pci_dev_reset
pci_restore_state
pcistub_init_device
__pci_reset_function_locked
__pci_dev_reset
pci_restore_state
handle_error_source
do_recovery
reset_link (drivers/pci/pcie/aer/aerdrv_core.c)
default_reset_link
pci_reset_bridge_secondary_bus (export)
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
driver->reset_link
aer_root_reset
pci_reset_bridge_secondary_bus (export)
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
pci_read_config_dword
vfio_pci_ioctl (VFIO_DEVICE_PCI_HOT_RESET)
pci_try_reset_bus
pci_reset_bridge_secondary_bus
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
vfio_pci_disable
vfio_pci_try_bus_reset
pci_try_reset_bus
pci_reset_bridge_secondary_bus
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
pci_probe_reset_bus (probe =1)
pci_bus_reset
pci_reset_bridge_secondary_bus
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
vfio_pci_ioctl ( VFIO_DEVICE_GET_PCI_HOT_RESET_INFO)
pci_probe_reset_bus (probe =1)
pci_bus_reset
pci_reset_bridge_secondary_bus
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
pci_reset_bus (no caller)
pci_probe_reset_bus (probe =1)
pci_bus_reset
pci_reset_bridge_secondary_bus
pcibios_reset_secondary_bus (weak)
pci_reset_secondary_bus
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-08-28 17:57 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-24 15:56 PCI CRS Support Sinan Kaya
2016-08-24 19:10 ` Bjorn Helgaas
2016-08-24 19:28 ` Sinan Kaya
2016-08-25 8:44 ` Lorenzo Pieralisi
2016-08-28 17:57 ` Sinan Kaya
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.