* Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? @ 2018-07-26 22:38 Alex G. 2018-07-26 23:00 ` Rajat Jain 2018-07-27 7:18 ` Lukas Wunner 0 siblings, 2 replies; 22+ messages in thread From: Alex G. @ 2018-07-26 22:38 UTC (permalink / raw) To: Keith Busch; +Cc: linux-pci, Bolen, Austin, alex_gagniuc@dellteam.com Hi, I was under the impression that a DLLSC or PDSC would trigger the PCI_DEV_DISCONNECTED bit to be set, blocking any further config access. Although I'm not seeing that happen after a "Slot(xxx): Link Down" event. I suspect my understanding is then wrong. Or maybe PCI_DEV_DISCONNECTED means something else. In the latter case, does it not make sense to have a separate bit to say "Don't touch this device"? Alex ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-26 22:38 Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? Alex G. @ 2018-07-26 23:00 ` Rajat Jain 2018-07-27 0:04 ` Alex_Gagniuc 2018-07-27 7:18 ` Lukas Wunner 1 sibling, 1 reply; 22+ messages in thread From: Rajat Jain @ 2018-07-26 23:00 UTC (permalink / raw) To: Alexandru Gagniuc; +Cc: Busch, Keith, linux-pci, austin_bolen, Alex_Gagniuc On Thu, Jul 26, 2018 at 3:38 PM Alex G. <mr.nuke.me@gmail.com> wrote: > > Hi, > > I was under the impression that a DLLSC or PDSC would trigger the > PCI_DEV_DISCONNECTED bit to be set, blocking any further config access. This sounds like an good idea to me. Caveat below. > > Although I'm not seeing that happen after a "Slot(xxx): Link Down" > event. It should eventually happen when the pciehp takes a note and finally processes the event in pciehp_unconfigure_device(). I suspect that the reason this was put towards the end of the event processing was to be absolutely be sure that the device removal won't be cancelled (Since the bit is not being cleared anywhere, if somehow the device removal is not processed, it may result in ghost devices that are permanently "unreadable"). Thanks, Rajat > I suspect my understanding is then wrong. Or maybe > PCI_DEV_DISCONNECTED means something else. > > In the latter case, does it not make sense to have a separate bit to say > "Don't touch this device"? > > Alex ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-26 23:00 ` Rajat Jain @ 2018-07-27 0:04 ` Alex_Gagniuc 0 siblings, 0 replies; 22+ messages in thread From: Alex_Gagniuc @ 2018-07-27 0:04 UTC (permalink / raw) To: rajatja, mr.nuke.me; +Cc: keith.busch, linux-pci, Austin.Bolen On 07/26/2018 06:01 PM, Rajat Jain wrote:=0A= > On Thu, Jul 26, 2018 at 3:38 PM Alex G. <mr.nuke.me@gmail.com> wrote:=0A= >>=0A= >> Hi,=0A= >>=0A= >> I was under the impression that a DLLSC or PDSC would trigger the=0A= >> PCI_DEV_DISCONNECTED bit to be set, blocking any further config access.= =0A= > =0A= > This sounds like an good idea to me. Caveat below.=0A= =0A= =0A= >>=0A= >> Although I'm not seeing that happen after a "Slot(xxx): Link Down"=0A= >> event.=0A= > =0A= > It should eventually happen when the pciehp takes a note and finally=0A= > processes the event in pciehp_unconfigure_device(). I suspect that the=0A= > reason this was put towards the end of the event processing was to be=0A= > absolutely be sure that the device removal won't be cancelled (Since=0A= > the bit is not being cleared anywhere, if somehow the device removal=0A= > is not processed, it may result in ghost devices that are permanently=0A= > "unreadable").=0A= =0A= I think that's something that could be fixed by clearing the =0A= DISCONNECTED bit on a link up.=0A= What I'm confused about here is, I've put a dump_stack() in =0A= pci_dev_set_disconnected(), and I'm not hitting it. I see the "Link =0A= Down" message, from which I would expect the subordinate to get =0A= disconnected eventually. I would expect this to get hit when removing a =0A= device:=0A= =0A= [ 1256.290283] pciehp 0000:b0:05.0:pcie204: Slot(179): Link Down=0A= [ 1256.300043] pciehp 0000:b0:05.0:pcie204: Slot(179): Card not present=0A= [no other messages related to b0:05.0 or its children]=0A= =0A= =0A= Alex=0A= =0A= > Thanks,=0A= > =0A= > Rajat=0A= > =0A= >> I suspect my understanding is then wrong. Or maybe=0A= >> PCI_DEV_DISCONNECTED means something else.=0A= >>=0A= >> In the latter case, does it not make sense to have a separate bit to say= =0A= >> "Don't touch this device"?=0A= >>=0A= >> Alex=0A= > =0A= =0A= ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-26 22:38 Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? Alex G. 2018-07-26 23:00 ` Rajat Jain @ 2018-07-27 7:18 ` Lukas Wunner 2018-07-27 15:52 ` Alex_Gagniuc 1 sibling, 1 reply; 22+ messages in thread From: Lukas Wunner @ 2018-07-27 7:18 UTC (permalink / raw) To: Alex G.; +Cc: Keith Busch, linux-pci, Bolen, Austin, alex_gagniuc@dellteam.com On Thu, Jul 26, 2018 at 05:38:50PM -0500, Alex G. wrote: > I was under the impression that a DLLSC or PDSC would trigger the > PCI_DEV_DISCONNECTED bit to be set, blocking any further config access. Only if the Presence Detect State bit in the Slot Status register is not set. I think the idea was that if the card is still in the slot, its driver can be unbound orderly because the device is still accessible. So there's no need to set PCI_DEV_DISCONNECTED. However if there is a already a new card in the slot, PCI_DEV_DISCONNECTED will erroneously not be set. Also, if card removal was triggered by the link going down but the card is still in the slot, PCI_DEV_DISCONNECTED will also erroneously not be set even though it's inaccessible. The only situations when PCI_DEV_DISCONNECTED should not be set is if the card is being removed by sysfs or by the user pressing the Attention Button. Anything else is a surprise removal. What we need to do is pass down a flag to pciehp_unconfigure_device() to indicate whether one of those two situations is at hand or not, and PCI_DEV_DISCONNECTED should be set depending on that flag. Are you testing Bjorn's pci/hotplug branch or something based on 4.18 or earlier? There is a major event handling rework queued on that branch, so you may want to test with that. > In the latter case, does it not make sense to have a separate bit to say > "Don't touch this device"? Hm, what would you need it for? Thanks, Lukas ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 7:18 ` Lukas Wunner @ 2018-07-27 15:52 ` Alex_Gagniuc 2018-07-27 17:05 ` Lukas Wunner 0 siblings, 1 reply; 22+ messages in thread From: Alex_Gagniuc @ 2018-07-27 15:52 UTC (permalink / raw) To: lukas, mr.nuke.me Cc: keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant Hi Lukas,=0A= =0A= +CC some more Dell folks.=0A= =0A= On 07/27/2018 02:18 AM, Lukas Wunner wrote:=0A= > On Thu, Jul 26, 2018 at 05:38:50PM -0500, Alex G. wrote:=0A= >> I was under the impression that a DLLSC or PDSC would trigger the=0A= >> PCI_DEV_DISCONNECTED bit to be set, blocking any further config access.= =0A= > =0A= > Only if the Presence Detect State bit in the Slot Status register is=0A= > not set.=0A= > =0A= > I think the idea was that if the card is still in the slot, its driver ca= n=0A= > be unbound orderly because the device is still accessible. So there's no= =0A= > need to set PCI_DEV_DISCONNECTED.=0A= =0A= This sounds counter to what I had intuited. I mentally associate =0A= DISCONNECTED with "the link is down". I don't understand the idea of a =0A= driver doing an orderly removal when the link is down. Device registers =0A= wouldn't be accessible in that case.=0A= =0A= > However if there is a already a new card in the slot, PCI_DEV_DISCONNECTE= D=0A= > will erroneously not be set. Also, if card removal was triggered by the= =0A= > link going down but the card is still in the slot, PCI_DEV_DISCONNECTED= =0A= > will also erroneously not be set even though it's inaccessible.=0A= > =0A= > The only situations when PCI_DEV_DISCONNECTED should not be set is if the= =0A= > card is being removed by sysfs or by the user pressing the Attention Butt= on.=0A= > Anything else is a surprise removal. What we need to do is pass down a= =0A= > flag to pciehp_unconfigure_device() to indicate whether one of those two= =0A= > situations is at hand or not, and PCI_DEV_DISCONNECTED should be set=0A= > depending on that flag.=0A= =0A= That makes a little more sense. Is someone working on this?=0A= =0A= > Are you testing Bjorn's pci/hotplug branch or something based on 4.18 or= =0A= > earlier? There is a major event handling rework queued on that branch,= =0A= > so you may want to test with that.=0A= =0A= Been using 4.18-rc6. I'll take a look at Bjorn's latest and greatest. =0A= Thank you for the heads-up.=0A= =0A= >> In the latter case, does it not make sense to have a separate bit to say= =0A= >> "Don't touch this device"?=0A= > =0A= > Hm, what would you need it for?=0A= =0A= Idiotic firmware bugs and unrefined device removal mechanisms. For =0A= example, on removal, we shouldn't be touching MSI registers on a device =0A= whose link is down. Firmware-first (FFS) systems tend to blow up, crash, = =0A= and burn when that happens.=0A= =0A= Another example is NVME subsystem resets (NSSR). NSSR resets the PCIe =0A= link, and the PD state stays the same. Once teardown gets to masking the = =0A= MSI-X bits, it sends an MMIO to the downed link.=0A= =0A= I want to resolve such crashes by guarding teardown paths against =0A= touching registers on the disconnected device -- yes, I know it's a =0A= band-aid and not a cure.=0A= =0A= Alex=0A= =0A= > Thanks,=0A= > =0A= > Lukas=0A= > =0A= ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 15:52 ` Alex_Gagniuc @ 2018-07-27 17:05 ` Lukas Wunner 2018-07-27 17:51 ` Alex_Gagniuc 0 siblings, 1 reply; 22+ messages in thread From: Lukas Wunner @ 2018-07-27 17:05 UTC (permalink / raw) To: Alex_Gagniuc Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On Fri, Jul 27, 2018 at 03:52:04PM +0000, Alex_Gagniuc@Dellteam.com wrote: > On 07/27/2018 02:18 AM, Lukas Wunner wrote: > > On Thu, Jul 26, 2018 at 05:38:50PM -0500, Alex G. wrote: > >> I was under the impression that a DLLSC or PDSC would trigger the > >> PCI_DEV_DISCONNECTED bit to be set, blocking any further config access. > > > > Only if the Presence Detect State bit in the Slot Status register is > > not set. > > > > I think the idea was that if the card is still in the slot, its driver can > > be unbound orderly because the device is still accessible. So there's no > > need to set PCI_DEV_DISCONNECTED. > > This sounds counter to what I had intuited. I mentally associate > DISCONNECTED with "the link is down". I don't understand the idea of a > driver doing an orderly removal when the link is down. Device registers > wouldn't be accessible in that case. That's basically what I meant, sorry for not being clear. > > However if there is a already a new card in the slot, PCI_DEV_DISCONNECTED > > will erroneously not be set. Also, if card removal was triggered by the > > link going down but the card is still in the slot, PCI_DEV_DISCONNECTED > > will also erroneously not be set even though it's inaccessible. > > > > The only situations when PCI_DEV_DISCONNECTED should not be set is if the > > card is being removed by sysfs or by the user pressing the Attention Button. > > Anything else is a surprise removal. What we need to do is pass down a > > flag to pciehp_unconfigure_device() to indicate whether one of those two > > situations is at hand or not, and PCI_DEV_DISCONNECTED should be set > > depending on that flag. > > That makes a little more sense. Is someone working on this? Me, but not for 4.19, we're too late in the cycle, I'm going to post a small number of fixup patches for Bjorn's pci/hotplug branch shortly, to be included in 4.19, and that's it. I posted a patch as part of the series that's now on Bjorn's pci/hotplug branch which touched PCI_DEV_DISCONNECTED code in pciehp_pci.c, but had to withdraw that particular patch: https://patchwork.ozlabs.org/patch/930403/ The first problem with that patch was that I hadn't fully understood yet when to set PCI_DEV_DISCONNECTED and when not to set it. The second problem was that it fixed a deadlock on unplug in a way that wasn't generic enough. The same deadlock can occur in other situations. The real fix is to unbind the driver lockless in pci_stop_dev(). That's non-trivial to achieve, but doable. I don't think there's a way around that to fix the problem: https://www.spinics.net/lists/linux-pci/msg73652.html There's (still) a lot that can be improved in pciehp, I hope to keep working on that as time permits. Thanks, Lukas ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 17:05 ` Lukas Wunner @ 2018-07-27 17:51 ` Alex_Gagniuc 2018-07-27 18:17 ` Sinan Kaya 2018-07-28 18:31 ` Lukas Wunner 0 siblings, 2 replies; 22+ messages in thread From: Alex_Gagniuc @ 2018-07-27 17:51 UTC (permalink / raw) To: lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On 07/27/2018 12:05 PM, Lukas Wunner wrote:=0A= > On Fri, Jul 27, 2018 at 03:52:04PM +0000, Alex_Gagniuc@Dellteam.com wrote= :=0A= >> On 07/27/2018 02:18 AM, Lukas Wunner wrote:=0A= >>> On Thu, Jul 26, 2018 at 05:38:50PM -0500, Alex G. wrote:=0A= >>>> I was under the impression that a DLLSC or PDSC would trigger the=0A= >>>> PCI_DEV_DISCONNECTED bit to be set, blocking any further config access= .=0A= >>>=0A= >>> Only if the Presence Detect State bit in the Slot Status register is=0A= >>> not set.=0A= >>>=0A= >>> I think the idea was that if the card is still in the slot, its driver = can=0A= >>> be unbound orderly because the device is still accessible. So there's = no=0A= >>> need to set PCI_DEV_DISCONNECTED.=0A= >>=0A= >> This sounds counter to what I had intuited. I mentally associate=0A= >> DISCONNECTED with "the link is down". I don't understand the idea of a= =0A= >> driver doing an orderly removal when the link is down. Device registers= =0A= >> wouldn't be accessible in that case.=0A= > =0A= > That's basically what I meant, sorry for not being clear.=0A= > =0A= >>> However if there is a already a new card in the slot, PCI_DEV_DISCONNEC= TED=0A= >>> will erroneously not be set. Also, if card removal was triggered by th= e=0A= >>> link going down but the card is still in the slot, PCI_DEV_DISCONNECTED= =0A= >>> will also erroneously not be set even though it's inaccessible.=0A= >>>=0A= >>> The only situations when PCI_DEV_DISCONNECTED should not be set is if t= he=0A= >>> card is being removed by sysfs or by the user pressing the Attention Bu= tton.=0A= >>> Anything else is a surprise removal. What we need to do is pass down a= =0A= >>> flag to pciehp_unconfigure_device() to indicate whether one of those tw= o=0A= >>> situations is at hand or not, and PCI_DEV_DISCONNECTED should be set=0A= >>> depending on that flag.=0A= >>=0A= >> That makes a little more sense. Is someone working on this?=0A= > =0A= > Me, but not for 4.19, we're too late in the cycle, I'm going to post=0A= > a small number of fixup patches for Bjorn's pci/hotplug branch shortly,= =0A= > to be included in 4.19, and that's it.=0A= > =0A= > I posted a patch as part of the series that's now on Bjorn's pci/hotplug= =0A= > branch which touched PCI_DEV_DISCONNECTED code in pciehp_pci.c, but had= =0A= > to withdraw that particular patch:=0A= > https://patchwork.ozlabs.org/patch/930403/=0A= > =0A= > The first problem with that patch was that I hadn't fully understood=0A= > yet when to set PCI_DEV_DISCONNECTED and when not to set it.=0A= =0A= I think PCI_DEV_DISCONNECTED is a documentation issue above all else. =0A= The history I was given is that drivers would take a very long time to =0A= tear down a device. Config space IO to an nonexistent device took a long = =0A= while to time out. Performance was one motivation -- and was not documented= .=0A= =0A= > The second problem was that it fixed a deadlock on unplug in a way=0A= > that wasn't generic enough. The same deadlock can occur in other=0A= > situations. The real fix is to unbind the driver lockless in=0A= > pci_stop_dev(). That's non-trivial to achieve, but doable.=0A= > I don't think there's a way around that to fix the problem:=0A= > https://www.spinics.net/lists/linux-pci/msg73652.html=0A= > =0A= > There's (still) a lot that can be improved in pciehp, I hope to=0A= > keep working on that as time permits.=0A= =0A= Thanks for all the info. The fix that I was settling on is (pasted) =0A= below. Though that seems to conflict a bit with what you are trying to =0A= do. Now I'm a little conflicted If I should try to submit the below or not.= =0A= =0A= Alex=0A= =0A= diff --git a/drivers/pci/hotplug/pciehp_pci.c =0A= b/drivers/pci/hotplug/pciehp_pci.c=0A= index 3f518dea856d..5cf02639fa0b 100644=0A= --- a/drivers/pci/hotplug/pciehp_pci.c=0A= +++ b/drivers/pci/hotplug/pciehp_pci.c=0A= @@ -74,6 +74,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)=0A= ctrl_dbg(ctrl, "%s: domain:bus:dev =3D %04x:%02x:00\n",=0A= __func__, pci_domain_nr(parent), parent->number);=0A= pciehp_get_adapter_status(p_slot, &presence);=0A= + presence =3D presence && pciehp_check_link_active(ctrl);=0A= =0A= pci_lock_rescan_remove();=0A= =0A= =0A= =0A= > Thanks,=0A= > =0A= > Lukas=0A= > =0A= =0A= ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 17:51 ` Alex_Gagniuc @ 2018-07-27 18:17 ` Sinan Kaya 2018-07-27 18:23 ` Alex_Gagniuc 2018-07-28 18:31 ` Lukas Wunner 1 sibling, 1 reply; 22+ messages in thread From: Sinan Kaya @ 2018-07-27 18:17 UTC (permalink / raw) To: Alex_Gagniuc, lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On 7/27/2018 10:51 AM, Alex_Gagniuc@Dellteam.com wrote: >> The first problem with that patch was that I hadn't fully understood >> yet when to set PCI_DEV_DISCONNECTED and when not to set it. > I think PCI_DEV_DISCONNECTED is a documentation issue above all else. > The history I was given is that drivers would take a very long time to > tear down a device. Config space IO to an nonexistent device took a long > while to time out. Performance was one motivation -- and was not documented. > Completion timeouts are typically in the orders of 50ms. If you have a lot of outstanding non-posted requests, it will take time for SW to flush all requests. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 18:17 ` Sinan Kaya @ 2018-07-27 18:23 ` Alex_Gagniuc 2018-07-27 18:34 ` Sinan Kaya 0 siblings, 1 reply; 22+ messages in thread From: Alex_Gagniuc @ 2018-07-27 18:23 UTC (permalink / raw) To: okaya, lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On 07/27/2018 01:18 PM, Sinan Kaya wrote:=0A= > On 7/27/2018 10:51 AM, Alex_Gagniuc@Dellteam.com wrote:=0A= >>> The first problem with that patch was that I hadn't fully understood=0A= >>> yet when to set PCI_DEV_DISCONNECTED and when not to set it.=0A= >> I think PCI_DEV_DISCONNECTED is a documentation issue above all else.=0A= >> The history I was given is that drivers would take a very long time to= =0A= >> tear down a device. Config space IO to an nonexistent device took a long= =0A= >> while to time out. Performance was one motivation -- and was not documen= ted.=0A= >>=0A= > =0A= > Completion timeouts are typically in the orders of 50ms. If you have a=0A= > lot of outstanding non-posted requests, it will take time for SW to=0A= > flush all requests.=0A= =0A= That's why we check pci_dev_is_disconnected() in pci/access.c=0A= But if the DISCONNECTED bit doesn't get set, it makes me wonder how =0A= solid things currently are.=0A= =0A= Alex=0A= ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 18:23 ` Alex_Gagniuc @ 2018-07-27 18:34 ` Sinan Kaya 0 siblings, 0 replies; 22+ messages in thread From: Sinan Kaya @ 2018-07-27 18:34 UTC (permalink / raw) To: Alex_Gagniuc, lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On 7/27/2018 11:23 AM, Alex_Gagniuc@Dellteam.com wrote: >> Completion timeouts are typically in the orders of 50ms. If you have a >> lot of outstanding non-posted requests, it will take time for SW to >> flush all requests. > That's why we check pci_dev_is_disconnected() in pci/access.c > But if the DISCONNECTED bit doesn't get set, it makes me wonder how > solid things currently are. I don't see any problem by setting this bit as long as the link is down regardless of the drive presence state and clearing it on link recovery. Is there a problem with this approach? I'm catching up with this thread. pcie_do_fatal_recovery() doesn't query presence as an example. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-27 17:51 ` Alex_Gagniuc 2018-07-27 18:17 ` Sinan Kaya @ 2018-07-28 18:31 ` Lukas Wunner 2018-07-29 0:26 ` Sinan Kaya ` (2 more replies) 1 sibling, 3 replies; 22+ messages in thread From: Lukas Wunner @ 2018-07-28 18:31 UTC (permalink / raw) To: Alex_Gagniuc Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, Sinan Kaya On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@Dellteam.com wrote: > I think PCI_DEV_DISCONNECTED is a documentation issue above all else. > The history I was given is that drivers would take a very long time to > tear down a device. Config space IO to an nonexistent device took a long > while to time out. Performance was one motivation -- and was not documented. Often it is possible for the driver to detect surprise removal by checking if mmio reads return "all ones". But in some cases that's a valid value to read from mmio and then this approach won't work. Also, checking every mmio read may negatively impact performance. Finally, if the card was quickly swapped and the link to the new card is already up, you may be accessing that new card. (mmio accesses may then still return all ones if the BARs are blank, but at least config space accesses should work.) Once it has been determined that the device has been surprise removed, that fact should be cached somewhere to short-circuit any further device accesses. PCI_DEV_DISCONNECTED can act as such a cache. > Thanks for all the info. The fix that I was settling on is (pasted) > below. Though that seems to conflict a bit with what you are trying to > do. Now I'm a little conflicted If I should try to submit the below or not. > > --- a/drivers/pci/hotplug/pciehp_pci.c > +++ b/drivers/pci/hotplug/pciehp_pci.c > @@ -74,6 +74,7 @@ int pciehp_unconfigure_device(struct slot *p_slot) > ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n", > __func__, pci_domain_nr(parent), parent->number); > pciehp_get_adapter_status(p_slot, &presence); > + presence = presence && pciehp_check_link_active(ctrl); That approach won't work if the card was quickly swapped and the link to the new card is already up when pciehp_unconfigure_device() runs. FWIW, the below is what I had in mind (on top of Bjorn's pci/hotplug branch). Does this work for you? -- >8 -- Subject: [PATCH] PCI: pciehp: Differentiate between surprise and safe removal When removing PCI devices below a hotplug bridge, pciehp marks them as disconnected if the card is no longer present in the slot or it quiesces them if the card is still present (by disabling INTx interrupts, bus mastering and SERR# reporting). To detect whether the card is still present, pciehp checks the Presence Detect State bit in the Slot Status register. The problem with this approach is that even if the card is present, the link to it may be down, and it that case it would be better to mark the devices as disconnected instead of trying to quiesce them. Moreover, if the card in the slot was quickly replaced by another one, the Presence Detect State bit would be set, yet trying to quiesce the new card's devices would be wrong and the correct thing to do is to mark the previous card's devices as disconnected. Instead of looking at the Presence Detect State bit, it is better to differentiate whether the card was surprise removed versus safely removed (via sysfs or an Attention Button press). On surprise removal, the devices should be marked as disconnected, whereas on safe removal it is correct to quiesce the devices. The knowledge whether a surprise removal or a safe removal is at hand does exist further up in the call stack: A surprise removal is initiated by pciehp_handle_presence_or_link_change(), a safe removal by pciehp_handle_disable_request(). Pass that information down to pciehp_unconfigure_device() and use it in lieu of the Presence Detect State bit. While there, add kernel-doc to pciehp_unconfigure_device() and pciehp_configure_device(). Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Alexandru Gagniuc <mr.nuke.me@gmail.com> --- drivers/pci/hotplug/pciehp.h | 2 +- drivers/pci/hotplug/pciehp_ctrl.c | 22 +++++++++++++--------- drivers/pci/hotplug/pciehp_pci.c | 23 ++++++++++++++++++++--- 3 files changed, 34 insertions(+), 13 deletions(-) diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h index 652c46d9b215..bce29ae769dd 100644 --- a/drivers/pci/hotplug/pciehp.h +++ b/drivers/pci/hotplug/pciehp.h @@ -195,7 +195,7 @@ void pciehp_handle_button_press(struct slot *slot); void pciehp_handle_disable_request(struct slot *slot); void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events); int pciehp_configure_device(struct slot *p_slot); -void pciehp_unconfigure_device(struct slot *p_slot); +void pciehp_unconfigure_device(struct slot *p_slot, bool presence); void pciehp_queue_pushbutton_work(struct work_struct *work); struct controller *pcie_init(struct pcie_device *dev); int pcie_init_notification(struct controller *ctrl); diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c index d7d55160b5f8..8836648e145f 100644 --- a/drivers/pci/hotplug/pciehp_ctrl.c +++ b/drivers/pci/hotplug/pciehp_ctrl.c @@ -40,6 +40,9 @@ hotplug controller logic */ +#define SAFE_REMOVAL true +#define SURPRISE_REMOVAL false + static void set_slot_off(struct controller *ctrl, struct slot *pslot) { /* turn off slot, turn on Amber LED, turn off Green LED if supported*/ @@ -115,12 +118,13 @@ static int board_added(struct slot *p_slot) /** * remove_board - Turns off slot and LEDs * @p_slot: slot where board is being removed + * @safe_removal: whether the board is safely removed (versus surprise removed) */ -static void remove_board(struct slot *p_slot) +static void remove_board(struct slot *p_slot, bool safe_removal) { struct controller *ctrl = p_slot->ctrl; - pciehp_unconfigure_device(p_slot); + pciehp_unconfigure_device(p_slot, safe_removal); if (POWER_CTRL(ctrl)) { pciehp_power_off_slot(p_slot); @@ -138,7 +142,7 @@ static void remove_board(struct slot *p_slot) } static int pciehp_enable_slot(struct slot *slot); -static int pciehp_disable_slot(struct slot *slot); +static int pciehp_disable_slot(struct slot *slot, bool safe_removal); void pciehp_request(struct controller *ctrl, int action) { @@ -230,7 +234,7 @@ void pciehp_handle_disable_request(struct slot *slot) slot->state = POWEROFF_STATE; mutex_unlock(&slot->lock); - ctrl->request_result = pciehp_disable_slot(slot); + ctrl->request_result = pciehp_disable_slot(slot, SAFE_REMOVAL); } void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events) @@ -257,7 +261,7 @@ void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events) if (events & PCI_EXP_SLTSTA_PDC) ctrl_info(ctrl, "Slot(%s): Card not present\n", slot_name(slot)); - pciehp_disable_slot(slot); + pciehp_disable_slot(slot, SURPRISE_REMOVAL); break; default: mutex_unlock(&slot->lock); @@ -343,7 +347,7 @@ static int pciehp_enable_slot(struct slot *slot) return ret; } -static int __pciehp_disable_slot(struct slot *p_slot) +static int __pciehp_disable_slot(struct slot *p_slot, bool safe_removal) { u8 getstatus = 0; struct controller *ctrl = p_slot->ctrl; @@ -357,17 +361,17 @@ static int __pciehp_disable_slot(struct slot *p_slot) } } - remove_board(p_slot); + remove_board(p_slot, safe_removal); return 0; } -static int pciehp_disable_slot(struct slot *slot) +static int pciehp_disable_slot(struct slot *slot, bool safe_removal) { struct controller *ctrl = slot->ctrl; int ret; pm_runtime_get_sync(&ctrl->pcie->port->dev); - ret = __pciehp_disable_slot(slot); + ret = __pciehp_disable_slot(slot, safe_removal); pm_runtime_put(&ctrl->pcie->port->dev); mutex_lock(&slot->lock); diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c index ec3f065bb1c0..079aac163484 100644 --- a/drivers/pci/hotplug/pciehp_pci.c +++ b/drivers/pci/hotplug/pciehp_pci.c @@ -34,6 +34,14 @@ #include "../pci.h" #include "pciehp.h" +/** + * pciehp_configure_device() - enumerate PCI devices below a hotplug bridge + * @p_slot: PCIe hotplug slot + * + * Enumerate PCI devices below a hotplug bridge and add them to the system. + * Return 0 on success, %-EEXIST if the devices are already enumerated or + * %-ENODEV if enumeration failed. + */ int pciehp_configure_device(struct slot *p_slot) { struct pci_dev *dev; @@ -76,9 +84,19 @@ int pciehp_configure_device(struct slot *p_slot) return ret; } -void pciehp_unconfigure_device(struct slot *p_slot) +/** + * pciehp_unconfigure_device() - remove PCI devices below a hotplug bridge + * @p_slot: PCIe hotplug slot + * @presence: whether the card is still present in the slot; + * true for safe removal via sysfs or an Attention Button press, + * false for surprise removal + * + * Unbind PCI devices below a hotplug bridge from their drivers and remove + * them from the system. Safely removed devices are quiesced. Surprise + * removed devices are marked as such to prevent further accesses. + */ +void pciehp_unconfigure_device(struct slot *p_slot, bool presence) { - u8 presence = 0; struct pci_dev *dev, *temp; struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate; u16 command; @@ -86,7 +104,6 @@ void pciehp_unconfigure_device(struct slot *p_slot) ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n", __func__, pci_domain_nr(parent), parent->number); - pciehp_get_adapter_status(p_slot, &presence); pci_lock_rescan_remove(); -- 2.18.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-28 18:31 ` Lukas Wunner @ 2018-07-29 0:26 ` Sinan Kaya 2018-07-29 12:09 ` Lukas Wunner 2018-07-30 13:28 ` David Laight 2018-07-30 21:38 ` Alex_Gagniuc 2 siblings, 1 reply; 22+ messages in thread From: Sinan Kaya @ 2018-07-29 0:26 UTC (permalink / raw) To: Lukas Wunner, Alex_Gagniuc Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On 7/28/2018 11:31 AM, Lukas Wunner wrote: > The knowledge whether a surprise removal or a safe removal is at hand > does exist further up in the call stack: A surprise removal is > initiated by pciehp_handle_presence_or_link_change(), a safe removal by > pciehp_handle_disable_request(). Can you also check if platform supports surprise link down error reporting (Link Capabilities Register) and reports a surprise link down event in AER Uncorrectable Error Status Register for the hotplug code to make it more reliable? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-29 0:26 ` Sinan Kaya @ 2018-07-29 12:09 ` Lukas Wunner 2018-07-29 16:59 ` Sinan Kaya 0 siblings, 1 reply; 22+ messages in thread From: Lukas Wunner @ 2018-07-29 12:09 UTC (permalink / raw) To: Sinan Kaya Cc: Alex_Gagniuc, mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On Sat, Jul 28, 2018 at 05:26:57PM -0700, Sinan Kaya wrote: > On 7/28/2018 11:31 AM, Lukas Wunner wrote: > >The knowledge whether a surprise removal or a safe removal is at hand > >does exist further up in the call stack: A surprise removal is > >initiated by pciehp_handle_presence_or_link_change(), a safe removal by > >pciehp_handle_disable_request(). > > Can you also check if platform supports surprise link down error > reporting (Link Capabilities Register) and reports a surprise link > down event in AER Uncorrectable Error Status Register for the > hotplug code to make it more reliable? We read the Link Capabilities register in pcie_init() to determine if Data Link Layer Link Active Reporting is supported. (That's a feature added in the PCIe r1.1 Base Spec. Old devices that strictly adhere to PCIe r1.0 don't support it.) We could likewise cache the Surprise Down Error Reporting Capable bit in struct controller. But I don't quite understand yet how and when you want it to be used by pciehp? If the link goes down, pciehp doesn't care whether that's caused by a fatal error or removal by the user. It seems correct to me to also remove devices on a fatal error, after all they're no longer accessible until the error is cleared (IIUC). Do you agree or disagree? Thanks, Lukas ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-29 12:09 ` Lukas Wunner @ 2018-07-29 16:59 ` Sinan Kaya 0 siblings, 0 replies; 22+ messages in thread From: Sinan Kaya @ 2018-07-29 16:59 UTC (permalink / raw) To: Lukas Wunner Cc: Alex_Gagniuc, mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant On 7/29/18, Lukas Wunner <lukas@wunner.de> wrote: > On Sat, Jul 28, 2018 at 05:26:57PM -0700, Sinan Kaya wrote: >> On 7/28/2018 11:31 AM, Lukas Wunner wrote: >> >The knowledge whether a surprise removal or a safe removal is at hand >> >does exist further up in the call stack: A surprise removal is >> >initiated by pciehp_handle_presence_or_link_change(), a safe removal by >> >pciehp_handle_disable_request(). >> >> Can you also check if platform supports surprise link down error >> reporting (Link Capabilities Register) and reports a surprise link >> down event in AER Uncorrectable Error Status Register for the >> hotplug code to make it more reliable? > > We read the Link Capabilities register in pcie_init() to determine if > Data Link Layer Link Active Reporting is supported. (That's a feature > added in the PCIe r1.1 Base Spec. Old devices that strictly adhere to > PCIe r1.0 don't support it.) > > We could likewise cache the Surprise Down Error Reporting Capable bit > in struct controller. But I don't quite understand yet how and when > you want it to be used by pciehp? If the link goes down, pciehp doesn't > care whether that's caused by a fatal error or removal by the user. > It seems correct to me to also remove devices on a fatal error, after all > they're no longer accessible until the error is cleared (IIUC). > Do you agree or disagree? Yes, we have to remove the devices for both. However, I don't think pciehp is the right place for fatal error link events. I am trying to separate these two execution paths. If we fail, someone will have to take the challenge and unify link code for both. Right now, two threads are trying to do the same thing in parallel. Why surprise link down check... Data link layer can change due to signal integrity issues. Spec defines surprise link down bit to separate device removal >From other link quality issues. That's why, i was suggesting to check it if a pcie device supports It. > > Thanks, > > Lukas > ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-28 18:31 ` Lukas Wunner 2018-07-29 0:26 ` Sinan Kaya @ 2018-07-30 13:28 ` David Laight 2018-07-30 13:54 ` Lukas Wunner 2018-07-30 21:38 ` Alex_Gagniuc 2 siblings, 1 reply; 22+ messages in thread From: David Laight @ 2018-07-30 13:28 UTC (permalink / raw) To: 'Lukas Wunner', Alex_Gagniuc Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, Sinan Kaya From: Lukas Wunner > Sent: 28 July 2018 19:32 ... > Finally, if the card was quickly swapped and the link to the new > card is already up, you may be accessing that new card. (mmio > accesses may then still return all ones if the BARs are blank, > but at least config space accesses should work.) On my i7-7700 system that no longer works (at least with some cards). If I take the PCIe link down completely (reset the FPGA on the card) it doesn't recover (loops through detect active/quiet and a third state I can't quite remember). ISTR that it recovers from the link going down when I short out the PCIe data lines. It worked fine on a XEON E5-2609 system - I did it a lot when updating the fpga image. Can anyone else verify whether this works on other systems? Or whether the kernel (or BIOS) needs to (re-)initialise some register to make link recovery work. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-30 13:28 ` David Laight @ 2018-07-30 13:54 ` Lukas Wunner 2018-07-30 16:06 ` David Laight 0 siblings, 1 reply; 22+ messages in thread From: Lukas Wunner @ 2018-07-30 13:54 UTC (permalink / raw) To: David Laight Cc: Alex_Gagniuc, mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, Sinan Kaya On Mon, Jul 30, 2018 at 01:28:14PM +0000, David Laight wrote: > From: Lukas Wunner > > Sent: 28 July 2018 19:32 > ... > > Finally, if the card was quickly swapped and the link to the new > > card is already up, you may be accessing that new card. (mmio > > accesses may then still return all ones if the BARs are blank, > > but at least config space accesses should work.) > > On my i7-7700 system that no longer works (at least with some cards). > If I take the PCIe link down completely (reset the FPGA on the card) > it doesn't recover (loops through detect active/quiet and a third > state I can't quite remember). > > ISTR that it recovers from the link going down when I short out > the PCIe data lines. > > It worked fine on a XEON E5-2609 system - I did it a lot when > updating the fpga image. > > Can anyone else verify whether this works on other systems? > Or whether the kernel (or BIOS) needs to (re-)initialise > some register to make link recovery work. Huh? Can you be a bit more specific what exactly no longer works and which branch or kernel version introduced the regression? Thanks, Lukas ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-30 13:54 ` Lukas Wunner @ 2018-07-30 16:06 ` David Laight 0 siblings, 0 replies; 22+ messages in thread From: David Laight @ 2018-07-30 16:06 UTC (permalink / raw) To: 'Lukas Wunner' Cc: Alex_Gagniuc, mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, Sinan Kaya From: Lukas Wunner > Sent: 30 July 2018 14:54 > > On Mon, Jul 30, 2018 at 01:28:14PM +0000, David Laight wrote: > > From: Lukas Wunner > > > Sent: 28 July 2018 19:32 > > ... > > > Finally, if the card was quickly swapped and the link to the new > > > card is already up, you may be accessing that new card. (mmio > > > accesses may then still return all ones if the BARs are blank, > > > but at least config space accesses should work.) > > > > On my i7-7700 system that no longer works (at least with some cards). > > If I take the PCIe link down completely (reset the FPGA on the card) > > it doesn't recover (loops through detect active/quiet and a third > > state I can't quite remember). > > > > ISTR that it recovers from the link going down when I short out > > the PCIe data lines. > > > > It worked fine on a XEON E5-2609 system - I did it a lot when > > updating the fpga image. > > > > Can anyone else verify whether this works on other systems? > > Or whether the kernel (or BIOS) needs to (re-)initialise > > some register to make link recovery work. > > Huh? Can you be a bit more specific what exactly no longer works > and which branch or kernel version introduced the regression? I've just rerun the test on the failing system. I believe it is related to the CPU/BIOS version, not the kernel. What I'm actually doing is: 1) Boot the system and load the PCIe drivers for a card we make. 2) echo 1 >/sys/devices/pci..../remove 3) Completely reset the Altera(Intel) fpga at the far end of the PCIe link. I now expect the link to recover, on the XEON E5-2609 it does (with a 4.15-rc6 kernel) but on the i7-7700 it does not (and hasn't for much older kernels). I also don't think it makes any difference whether the PCIe slot is directly connected to the cpu or off the companion chip. We don't have a PCIe analyser, but the fpga traces ltssm state transitions to an internal memory buffer which we can read using a serial link when the PCIe link is down. After the fpga reset I get: clocks: abs delta status: 2 +4G2 Detect Quiet, set l2_exit, set hotrst_exit, set dlup_exit status: 3 +1 Polling Compliance, clear l2_exit, clear hotrst_exit, clear dlup_exit status: 6 +3 Polling Compliance, link speed 1, set link2 de-emphasis level status: 75 +111 Polling Compliance, set link data link active status: 76 +1 Polling Active status: 289 +531 Polling Active, set pld_clk_inuse status: 16e3da +1M4 Detect Quiet status: 22558b +M75 Detect Active status: 22b1a3 +23k Polling Active Repeats forever at the same rate. status: 2dd8a0c7 +23k Polling Active status: 2def8429 +1M5 Detect Quiet status: 2dfaf5da +M75 Detect Active status: 2dfb51f3 +23k Polling Active status: 2e123555 +1M5 Detect Quiet status: 2e1da706 +M75 Detect Active status: 2e1e031f +23k Polling Active status: 2e34e681 +1M5 Detect Quiet status: 2e405832 +M75 Detect Active status: 2e40b44b +23k Polling Active Until I do a 'reboot' when it all recovers status: 2e48c9d9 +M52 Polling Active, set avalon bus reset status: 2e48c9dd +4 Polling Active, clear avalon bus reset status: 2e48c9de +1 Detect Quiet status: 2e48c9df +1 Detect Quiet, clear pld_clk_inuse status: 2e48c9e5 +6 Detect Quiet, set pld_clk_inuse status: 2e48c9e6 +1 Detect Quiet, clear pld_clk_inuse status: 2e48c9e8 +2 Detect Quiet, set pld_clk_inuse status: 2e48c9e9 +1 Detect Active status: 2e48c9eb +2 Detect Active, clear link data link active status: 2e48c9f3 +8 Detect Active, set link data link active The trace suppresses repeated 'Detect active' 'Detect quiet' traces because they happen for a considerable period during a reboot'. time: Thu Jan 1 01:10:33 1970 status: 32fd4f1f +78M Detect Quiet, active<=>quiet bounces 102 status: 3308c162 +M75 Detect Active status: 33091d54 +23k Polling Active status: 33170b7f +M91 Polling Configuration status: 33170bc7 +72 Config Link width start, link width 8 status: 33170be7 +32 Config Link accept status: 33171c1d +4k1 Config Lane num wait status: 33171c2d +16 Config Lane num accept status: 33171c61 +52 Config Complete status: 33171c67 +6 Config Complete, link width 1 status: 33171caa +67 Config Idle status: 33171cba +16 L0 I'm not sure what 'Polling active' means. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-28 18:31 ` Lukas Wunner 2018-07-29 0:26 ` Sinan Kaya 2018-07-30 13:28 ` David Laight @ 2018-07-30 21:38 ` Alex_Gagniuc 2018-07-31 9:28 ` Lukas Wunner 2 siblings, 1 reply; 22+ messages in thread From: Alex_Gagniuc @ 2018-07-30 21:38 UTC (permalink / raw) To: lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, okaya On 07/28/2018 01:31 PM, Lukas Wunner wrote:=0A= > On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@Dellteam.com wrote= :=0A= >> I think PCI_DEV_DISCONNECTED is a documentation issue above all else.=0A= >> The history I was given is that drivers would take a very long time to= =0A= >> tear down a device. Config space IO to an nonexistent device took a long= =0A= >> while to time out. Performance was one motivation -- and was not documen= ted.=0A= > =0A= > Often it is possible for the driver to detect surprise removal by=0A= > checking if mmio reads return "all ones". But in some cases that's=0A= > a valid value to read from mmio and then this approach won't work.=0A= > Also, checking every mmio read may negatively impact performance.=0A= =0A= A colleague and me beat that dead horse to the afterdeath. Consensus was = =0A= that the return value is less reliable than a coin toss (of a two-heads =0A= coin).=0A= =0A= > Finally, if the card was quickly swapped and the link to the new=0A= > card is already up, you may be accessing that new card. (mmio=0A= > accesses may then still return all ones if the BARs are blank,=0A= > but at least config space accesses should work.)=0A= =0A= I'm really not worried about this cases. Assuming that someone goes to =0A= the effort to swap thing this fast, the IRQ would print an "already =0A= getting powered down" message, or something of the sorts. It would be =0A= plainly obvious to even the most casual dmesg observer.=0A= =0A= From what I can tell, the hotplug code is well armed to handle such =0A= races and resolve them automatically.=0A= =0A= > Once it has been determined that the device has been surprise removed,=0A= > that fact should be cached somewhere to short-circuit any further device= =0A= > accesses. PCI_DEV_DISCONNECTED can act as such a cache.=0A= =0A= I do insist that you spell SURPRISE!!! this way. :)=0A= =0A= >> Thanks for all the info. The fix that I was settling on is (pasted)=0A= >> below. Though that seems to conflict a bit with what you are trying to= =0A= >> do. Now I'm a little conflicted If I should try to submit the below or n= ot.=0A= >>=0A= >> --- a/drivers/pci/hotplug/pciehp_pci.c=0A= >> +++ b/drivers/pci/hotplug/pciehp_pci.c=0A= >> @@ -74,6 +74,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)=0A= >> ctrl_dbg(ctrl, "%s: domain:bus:dev =3D %04x:%02x:00\n",=0A= >> __func__, pci_domain_nr(parent), parent->number);=0A= >> pciehp_get_adapter_status(p_slot, &presence);=0A= >> + presence =3D presence && pciehp_check_link_active(ctrl);=0A= > =0A= > That approach won't work if the card was quickly swapped and the link=0A= > to the new card is already up when pciehp_unconfigure_device() runs.=0A= =0A= You're right. We should use the PD/link status gathered at the time of =0A= the interrupt.=0A= =0A= > FWIW, the below is what I had in mind (on top of Bjorn's pci/hotplug=0A= > branch). Does this work for you?=0A= =0A= This, and another patch (you have been CC'd) solve my problem of =0A= crashing during surprise removal. Thanks!=0A= =0A= Alex=0A= =0A= > -- >8 --=0A= > Subject: [PATCH] PCI: pciehp: Differentiate between surprise and safe rem= oval=0A= > =0A= > When removing PCI devices below a hotplug bridge, pciehp marks them as=0A= > disconnected if the card is no longer present in the slot or it quiesces= =0A= > them if the card is still present (by disabling INTx interrupts, bus=0A= > mastering and SERR# reporting).=0A= > =0A= > To detect whether the card is still present, pciehp checks the Presence= =0A= > Detect State bit in the Slot Status register. The problem with this=0A= > approach is that even if the card is present, the link to it may be=0A= > down, and it that case it would be better to mark the devices as=0A= > disconnected instead of trying to quiesce them. Moreover, if the card=0A= > in the slot was quickly replaced by another one, the Presence Detect=0A= > State bit would be set, yet trying to quiesce the new card's devices=0A= > would be wrong and the correct thing to do is to mark the previous=0A= > card's devices as disconnected.=0A= > =0A= > Instead of looking at the Presence Detect State bit, it is better to=0A= > differentiate whether the card was surprise removed versus safely=0A= > removed (via sysfs or an Attention Button press). On surprise removal,= =0A= > the devices should be marked as disconnected, whereas on safe removal it= =0A= > is correct to quiesce the devices.=0A= > =0A= > The knowledge whether a surprise removal or a safe removal is at hand=0A= > does exist further up in the call stack: A surprise removal is=0A= > initiated by pciehp_handle_presence_or_link_change(), a safe removal by= =0A= > pciehp_handle_disable_request().=0A= > =0A= > Pass that information down to pciehp_unconfigure_device() and use it in= =0A= > lieu of the Presence Detect State bit. While there, add kernel-doc to=0A= > pciehp_unconfigure_device() and pciehp_configure_device().=0A= > =0A= > Signed-off-by: Lukas Wunner <lukas@wunner.de>=0A= > Cc: Alexandru Gagniuc <mr.nuke.me@gmail.com>=0A= > ---=0A= > drivers/pci/hotplug/pciehp.h | 2 +-=0A= > drivers/pci/hotplug/pciehp_ctrl.c | 22 +++++++++++++---------=0A= > drivers/pci/hotplug/pciehp_pci.c | 23 ++++++++++++++++++++---=0A= > 3 files changed, 34 insertions(+), 13 deletions(-)=0A= > =0A= > diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h= =0A= > index 652c46d9b215..bce29ae769dd 100644=0A= > --- a/drivers/pci/hotplug/pciehp.h=0A= > +++ b/drivers/pci/hotplug/pciehp.h=0A= > @@ -195,7 +195,7 @@ void pciehp_handle_button_press(struct slot *slot);= =0A= > void pciehp_handle_disable_request(struct slot *slot);=0A= > void pciehp_handle_presence_or_link_change(struct slot *slot, u32 event= s);=0A= > int pciehp_configure_device(struct slot *p_slot);=0A= > -void pciehp_unconfigure_device(struct slot *p_slot);=0A= > +void pciehp_unconfigure_device(struct slot *p_slot, bool presence);=0A= > void pciehp_queue_pushbutton_work(struct work_struct *work);=0A= > struct controller *pcie_init(struct pcie_device *dev);=0A= > int pcie_init_notification(struct controller *ctrl);=0A= > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pcie= hp_ctrl.c=0A= > index d7d55160b5f8..8836648e145f 100644=0A= > --- a/drivers/pci/hotplug/pciehp_ctrl.c=0A= > +++ b/drivers/pci/hotplug/pciehp_ctrl.c=0A= > @@ -40,6 +40,9 @@=0A= > hotplug controller logic=0A= > */=0A= > =0A= > +#define SAFE_REMOVAL true=0A= > +#define SURPRISE_REMOVAL false=0A= > +=0A= > static void set_slot_off(struct controller *ctrl, struct slot *pslot)= =0A= > {=0A= > /* turn off slot, turn on Amber LED, turn off Green LED if supported*/= =0A= > @@ -115,12 +118,13 @@ static int board_added(struct slot *p_slot)=0A= > /**=0A= > * remove_board - Turns off slot and LEDs=0A= > * @p_slot: slot where board is being removed=0A= > + * @safe_removal: whether the board is safely removed (versus surprise r= emoved)=0A= > */=0A= > -static void remove_board(struct slot *p_slot)=0A= > +static void remove_board(struct slot *p_slot, bool safe_removal)=0A= > {=0A= > struct controller *ctrl =3D p_slot->ctrl;=0A= > =0A= > - pciehp_unconfigure_device(p_slot);=0A= > + pciehp_unconfigure_device(p_slot, safe_removal);=0A= > =0A= > if (POWER_CTRL(ctrl)) {=0A= > pciehp_power_off_slot(p_slot);=0A= > @@ -138,7 +142,7 @@ static void remove_board(struct slot *p_slot)=0A= > }=0A= > =0A= > static int pciehp_enable_slot(struct slot *slot);=0A= > -static int pciehp_disable_slot(struct slot *slot);=0A= > +static int pciehp_disable_slot(struct slot *slot, bool safe_removal);=0A= > =0A= > void pciehp_request(struct controller *ctrl, int action)=0A= > {=0A= > @@ -230,7 +234,7 @@ void pciehp_handle_disable_request(struct slot *slot)= =0A= > slot->state =3D POWEROFF_STATE;=0A= > mutex_unlock(&slot->lock);=0A= > =0A= > - ctrl->request_result =3D pciehp_disable_slot(slot);=0A= > + ctrl->request_result =3D pciehp_disable_slot(slot, SAFE_REMOVAL);=0A= > }=0A= > =0A= > void pciehp_handle_presence_or_link_change(struct slot *slot, u32 event= s)=0A= > @@ -257,7 +261,7 @@ void pciehp_handle_presence_or_link_change(struct slo= t *slot, u32 events)=0A= > if (events & PCI_EXP_SLTSTA_PDC)=0A= > ctrl_info(ctrl, "Slot(%s): Card not present\n",=0A= > slot_name(slot));=0A= > - pciehp_disable_slot(slot);=0A= > + pciehp_disable_slot(slot, SURPRISE_REMOVAL);=0A= > break;=0A= > default:=0A= > mutex_unlock(&slot->lock);=0A= > @@ -343,7 +347,7 @@ static int pciehp_enable_slot(struct slot *slot)=0A= > return ret;=0A= > }=0A= > =0A= > -static int __pciehp_disable_slot(struct slot *p_slot)=0A= > +static int __pciehp_disable_slot(struct slot *p_slot, bool safe_removal)= =0A= > {=0A= > u8 getstatus =3D 0;=0A= > struct controller *ctrl =3D p_slot->ctrl;=0A= > @@ -357,17 +361,17 @@ static int __pciehp_disable_slot(struct slot *p_slo= t)=0A= > }=0A= > }=0A= > =0A= > - remove_board(p_slot);=0A= > + remove_board(p_slot, safe_removal);=0A= > return 0;=0A= > }=0A= > =0A= > -static int pciehp_disable_slot(struct slot *slot)=0A= > +static int pciehp_disable_slot(struct slot *slot, bool safe_removal)=0A= > {=0A= > struct controller *ctrl =3D slot->ctrl;=0A= > int ret;=0A= > =0A= > pm_runtime_get_sync(&ctrl->pcie->port->dev);=0A= > - ret =3D __pciehp_disable_slot(slot);=0A= > + ret =3D __pciehp_disable_slot(slot, safe_removal);=0A= > pm_runtime_put(&ctrl->pcie->port->dev);=0A= > =0A= > mutex_lock(&slot->lock);=0A= > diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pcieh= p_pci.c=0A= > index ec3f065bb1c0..079aac163484 100644=0A= > --- a/drivers/pci/hotplug/pciehp_pci.c=0A= > +++ b/drivers/pci/hotplug/pciehp_pci.c=0A= > @@ -34,6 +34,14 @@=0A= > #include "../pci.h"=0A= > #include "pciehp.h"=0A= > =0A= > +/**=0A= > + * pciehp_configure_device() - enumerate PCI devices below a hotplug bri= dge=0A= > + * @p_slot: PCIe hotplug slot=0A= > + *=0A= > + * Enumerate PCI devices below a hotplug bridge and add them to the syst= em.=0A= > + * Return 0 on success, %-EEXIST if the devices are already enumerated o= r=0A= > + * %-ENODEV if enumeration failed.=0A= > + */=0A= > int pciehp_configure_device(struct slot *p_slot)=0A= > {=0A= > struct pci_dev *dev;=0A= > @@ -76,9 +84,19 @@ int pciehp_configure_device(struct slot *p_slot)=0A= > return ret;=0A= > }=0A= > =0A= > -void pciehp_unconfigure_device(struct slot *p_slot)=0A= > +/**=0A= > + * pciehp_unconfigure_device() - remove PCI devices below a hotplug brid= ge=0A= > + * @p_slot: PCIe hotplug slot=0A= > + * @presence: whether the card is still present in the slot;=0A= > + * true for safe removal via sysfs or an Attention Button press,=0A= > + * false for surprise removal=0A= > + *=0A= > + * Unbind PCI devices below a hotplug bridge from their drivers and remo= ve=0A= > + * them from the system. Safely removed devices are quiesced. Surprise= =0A= > + * removed devices are marked as such to prevent further accesses.=0A= > + */=0A= > +void pciehp_unconfigure_device(struct slot *p_slot, bool presence)=0A= > {=0A= > - u8 presence =3D 0;=0A= > struct pci_dev *dev, *temp;=0A= > struct pci_bus *parent =3D p_slot->ctrl->pcie->port->subordinate;=0A= > u16 command;=0A= > @@ -86,7 +104,6 @@ void pciehp_unconfigure_device(struct slot *p_slot)=0A= > =0A= > ctrl_dbg(ctrl, "%s: domain:bus:dev =3D %04x:%02x:00\n",=0A= > __func__, pci_domain_nr(parent), parent->number);=0A= > - pciehp_get_adapter_status(p_slot, &presence);=0A= > =0A= > pci_lock_rescan_remove();=0A= > =0A= > =0A= =0A= ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-30 21:38 ` Alex_Gagniuc @ 2018-07-31 9:28 ` Lukas Wunner 2018-07-31 16:35 ` Alex_Gagniuc 0 siblings, 1 reply; 22+ messages in thread From: Lukas Wunner @ 2018-07-31 9:28 UTC (permalink / raw) To: Alex_Gagniuc Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, okaya On Mon, Jul 30, 2018 at 09:38:04PM +0000, Alex_Gagniuc@Dellteam.com wrote: > On 07/28/2018 01:31 PM, Lukas Wunner wrote: > > On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@Dellteam.com wrote: > > > I think PCI_DEV_DISCONNECTED is a documentation issue above all else. > > > The history I was given is that drivers would take a very long time to > > > tear down a device. Config space IO to an nonexistent device took a long > > > while to time out. Performance was one motivation -- and was not > > > documented. > > > > Often it is possible for the driver to detect surprise removal by > > checking if mmio reads return "all ones". But in some cases that's > > a valid value to read from mmio and then this approach won't work. > > Also, checking every mmio read may negatively impact performance. > > A colleague and me beat that dead horse to the afterdeath. Consensus was > that the return value is less reliable than a coin toss (of a two-heads > coin). Can you elaborate why? Because the "official" stance is that checking every read where "all ones" is an invalid value is the proper way to detect unplugged devices. (Official as in, voiced by Greg KH and Bjorn.) In that sense, PCI_DEV_DISCONNECTED is sort of an unloved child. See this thread: https://www.spinics.net/lists/linux-acpi/msg81445.html > > FWIW, the below is what I had in mind (on top of Bjorn's pci/hotplug > > branch). Does this work for you? > > This, and another patch (you have been CC'd) solve my problem of > crashing during surprise removal. Thanks! Ok thanks, I submitted the patch this morning with your Tested-by. Unfortunately I forgot to cc all your Dell colleagues, sorry. Lukas ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-31 9:28 ` Lukas Wunner @ 2018-07-31 16:35 ` Alex_Gagniuc 2018-08-01 8:58 ` David Laight 0 siblings, 1 reply; 22+ messages in thread From: Alex_Gagniuc @ 2018-07-31 16:35 UTC (permalink / raw) To: lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, okaya On 07/31/2018 04:29 AM, Lukas Wunner wrote:=0A= > On Mon, Jul 30, 2018 at 09:38:04PM +0000, Alex_Gagniuc@Dellteam.com wrote= :=0A= >> On 07/28/2018 01:31 PM, Lukas Wunner wrote:=0A= >>> On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@Dellteam.com wro= te:=0A= >>>> I think PCI_DEV_DISCONNECTED is a documentation issue above all else.= =0A= >>>> The history I was given is that drivers would take a very long time to= =0A= >>>> tear down a device. Config space IO to an nonexistent device took a lo= ng=0A= >>>> while to time out. Performance was one motivation -- and was not=0A= >>>> documented.=0A= >>>=0A= >>> Often it is possible for the driver to detect surprise removal by=0A= >>> checking if mmio reads return "all ones". But in some cases that's=0A= >>> a valid value to read from mmio and then this approach won't work.=0A= >>> Also, checking every mmio read may negatively impact performance.=0A= >>=0A= >> A colleague and me beat that dead horse to the afterdeath. Consensus was= =0A= >> that the return value is less reliable than a coin toss (of a two-heads= =0A= >> coin).=0A= > =0A= > Can you elaborate why? Because the "official" stance is that checking=0A= > every read where "all ones" is an invalid value is the proper way to=0A= > detect unplugged devices. (Official as in, voiced by Greg KH and Bjorn.)= =0A= > In that sense, PCI_DEV_DISCONNECTED is sort of an unloved child.=0A= =0A= All ones is not necessarily invalid. The bug surface is every single =0A= config read. This approach doesn't even cover config writes -- config =0A= writes are non-posted requests too in PCIe.=0A= "Build it, and they will come". That means that drivers would expect =0A= -ENODEV when a device is gone. If we have that infrastructure, more =0A= drivers will start using it over time, and it's something that can also =0A= be used by generic parts of the PCI code. That also means you need a =0A= generic mechanism to determine a device is bye-bye, and that's what =0A= PCI_DEV_DISCONNECTED gives you.=0A= =0A= > See this thread:=0A= > https://www.spinics.net/lists/linux-acpi/msg81445.html=0A= =0A= The discussion is based on the wrong assumptions that reads are =0A= symmetrical wrt to a device being present or not. However, completion =0A= timeouts and unsupported requests blow that assumption right out of the =0A= water. Best case scenario, you just waste a little more time waiting for = =0A= hardware IO. More common is to end up crashing the machine.=0A= =0A= Greg's ideas work in a perfect world where PCI and PCIe are equivalent =0A= at every level. And in that case, PCI_DEV_DISCONNECTED would have been =0A= pure, 100% genuine Redmond bloatware. We wouldn't have gotten complaints = =0A= from Facebook and other industry players that it takes too damn long to =0A= remove a device. We probably also wouldn't be seeing machines crash on =0A= PCIe removal.=0A= =0A= Fun fact: Before PCI_DEV_DISCONNECTED, you could physically swap a =0A= device before the the teardown path was done with the previous device. =0A= Figuring out what problems that caused is left as an exercise to the reader= .=0A= =0A= >>> FWIW, the below is what I had in mind (on top of Bjorn's pci/hotplug=0A= >>> branch). Does this work for you?=0A= >>=0A= >> This, and another patch (you have been CC'd) solve my problem of=0A= >> crashing during surprise removal. Thanks!=0A= > =0A= > Ok thanks, I submitted the patch this morning with your Tested-by.=0A= =0A= Sweet. Thanks!=0A= =0A= > Unfortunately I forgot to cc all your Dell colleagues, sorry.=0A= =0A= They'll live. They already noticed it and sent me emails about it.=0A= =0A= Alex=0A= =0A= > Lukas=0A= > =0A= =0A= ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-07-31 16:35 ` Alex_Gagniuc @ 2018-08-01 8:58 ` David Laight 2018-08-01 19:06 ` Alex_Gagniuc 0 siblings, 1 reply; 22+ messages in thread From: David Laight @ 2018-08-01 8:58 UTC (permalink / raw) To: 'Alex_Gagniuc@Dellteam.com', lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, okaya From: Alex_Gagniuc@Dellteam.com > Sent: 31 July 2018 17:36 > > On 07/31/2018 04:29 AM, Lukas Wunner wrote: > > On Mon, Jul 30, 2018 at 09:38:04PM +0000, Alex_Gagniuc@Dellteam.com wrote: > >> On 07/28/2018 01:31 PM, Lukas Wunner wrote: > >>> On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@Dellteam.com wrote: > >>>> I think PCI_DEV_DISCONNECTED is a documentation issue above all else. > >>>> The history I was given is that drivers would take a very long time to > >>>> tear down a device. Config space IO to an nonexistent device took a long > >>>> while to time out. Performance was one motivation -- and was not > >>>> documented. > >>> > >>> Often it is possible for the driver to detect surprise removal by > >>> checking if mmio reads return "all ones". But in some cases that's > >>> a valid value to read from mmio and then this approach won't work. > >>> Also, checking every mmio read may negatively impact performance. > >> > >> A colleague and me beat that dead horse to the afterdeath. Consensus was > >> that the return value is less reliable than a coin toss (of a two-heads > >> coin). Something cheap-ish to find out whether a -1 was caused by a card removal might be sensible - Especially if it can be done without a config space read. Clearly you can't check anything BEFORE doing the read. And reading the pci-id from config space isn't entirely useful. If the card has reset itself (and the link recovered) then you need to read a BAR register and check it is setup. More interestingly a read request that is inside the bridge's address window but outside any BAR (fairly easy to setup if the target has a large BAR and a small one) will also timeout (and return -1) even though there is no failure of the link. If the target supports AER the information about the failed cycle ends up in the target's AER registers - even if the host bridge doesn't support AER (or it is being ignored). So it might be useful being able to read the AER registers even when no AER interrupt (or other notification) actually happens. I've not managed to get linux to pick up AER interrupts even on systems where the hardware clearly supports them (at least on some slots). I suspect the BIOS is carefully disabling them because of reports of message logs being spammed with AER errors. We also have one system (possibly a Dell 740) where any failure of a PCIe link leads to an NMI and a kernel crash! Not entirely useful in a server model that is supposed to have resilience against various errors. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? 2018-08-01 8:58 ` David Laight @ 2018-08-01 19:06 ` Alex_Gagniuc 0 siblings, 0 replies; 22+ messages in thread From: Alex_Gagniuc @ 2018-08-01 19:06 UTC (permalink / raw) To: David.Laight, lukas Cc: mr.nuke.me, keith.busch, linux-pci, Austin.Bolen, Stuart.Hayes, Narendra.K, Christopher.Arzola, David.Chalfant, okaya On 08/01/2018 03:57 AM, David Laight wrote:=0A= > From: Alex_Gagniuc@Dellteam.com=0A= >> Sent: 31 July 2018 17:36=0A= >>=0A= >> On 07/31/2018 04:29 AM, Lukas Wunner wrote:=0A= >>> On Mon, Jul 30, 2018 at 09:38:04PM +0000, Alex_Gagniuc@Dellteam.com wro= te:=0A= >>>> On 07/28/2018 01:31 PM, Lukas Wunner wrote:=0A= >>>>> On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@Dellteam.com w= rote:=0A= >>>>>> I think PCI_DEV_DISCONNECTED is a documentation issue above all else= .=0A= >>>>>> The history I was given is that drivers would take a very long time = to=0A= >>>>>> tear down a device. Config space IO to an nonexistent device took a = long=0A= >>>>>> while to time out. Performance was one motivation -- and was not=0A= >>>>>> documented.=0A= >>>>>=0A= >>>>> Often it is possible for the driver to detect surprise removal by=0A= >>>>> checking if mmio reads return "all ones". But in some cases that's= =0A= >>>>> a valid value to read from mmio and then this approach won't work.=0A= >>>>> Also, checking every mmio read may negatively impact performance.=0A= >>>>=0A= >>>> A colleague and me beat that dead horse to the afterdeath. Consensus w= as=0A= >>>> that the return value is less reliable than a coin toss (of a two-head= s=0A= >>>> coin).=0A= > =0A= > Something cheap-ish to find out whether a -1 was caused by a card=0A= > removal might be sensible - Especially if it can be done without=0A= > a config space read.=0A= > Clearly you can't check anything BEFORE doing the read.=0A= > And reading the pci-id from config space isn't entirely useful.=0A= > If the card has reset itself (and the link recovered) then you=0A= > need to read a BAR register and check it is setup.=0A= > =0A= > More interestingly a read request that is inside the bridge's address=0A= > window but outside any BAR (fairly easy to setup if the target has=0A= > a large BAR and a small one) will also timeout (and return -1) even=0A= > though there is no failure of the link.=0A= > =0A= > If the target supports AER the information about the failed cycle=0A= > ends up in the target's AER registers - even if the host bridge=0A= > doesn't support AER (or it is being ignored).=0A= > So it might be useful being able to read the AER registers even when=0A= > no AER interrupt (or other notification) actually happens.=0A= =0A= There are a number of ways to know a device is kaput. Information from =0A= AER and DPC has proven to be the most reliable. So much, that for the =0A= problems I am trying to solve, this information is necessary and sufficient= .=0A= =0A= > I've not managed to get linux to pick up AER interrupts even on=0A= > systems where the hardware clearly supports them (at least on=0A= > some slots). I suspect the BIOS is carefully disabling them=0A= > because of reports of message logs being spammed with AER errors.=0A= =0A= I suspect you've hit an FFS bug.=0A= =0A= > We also have one system (possibly a Dell 740)=0A= =0A= Not sure we make a "possibly 740" model. Let me ask around.=0A= =0A= > where any failure of a PCIe link leads to an NMI and a kernel crash!=0A= =0A= The kernel crash is a linux bug. I've worked on that extensively in the =0A= past. We tried to fix it [1]. Unfortunately, due to an unprofessional =0A= maintainer and months of spinning in circles, word came that our =0A= resources are better spent elsewhere. Feel free to pick up where we left = =0A= off.=0A= =0A= > Not entirely useful in a server model that is supposed to have=0A= > resilience against various errors.=0A= =0A= You're preaching to the choir. The architecture and features are driven =0A= by customer demand. A lot of those "features" -- I haven't asked what =0A= they are -- are easily implemented with FFS. If you have a problem with =0A= FFS in particular -- and I do realize a lot of the specs around FFS are =0A= poorly written and not well thought out -- then it's marketing, sales =0A= and corporate that should know.=0A= =0A= Here's the thing. I think the FW's job is to do the absolute minimal =0A= initialization to pass on control to the OS. uboot/linux stacks execute =0A= this beautifully. But customers want features, that very often, OS =0A= vendors are hesitant or outright refusing to implement. The only =0A= remaining place to implement them is the platform.=0A= =0A= It sucks, but it's how things are. Anyway, the patches at [1] should =0A= solve your system crashing issue.=0A= =0A= Alex=0A= =0A= =0A= [1] https://lore.kernel.org/patchwork/patch/908811/=0A= =0A= ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2018-08-01 20:53 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-26 22:38 Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? Alex G. 2018-07-26 23:00 ` Rajat Jain 2018-07-27 0:04 ` Alex_Gagniuc 2018-07-27 7:18 ` Lukas Wunner 2018-07-27 15:52 ` Alex_Gagniuc 2018-07-27 17:05 ` Lukas Wunner 2018-07-27 17:51 ` Alex_Gagniuc 2018-07-27 18:17 ` Sinan Kaya 2018-07-27 18:23 ` Alex_Gagniuc 2018-07-27 18:34 ` Sinan Kaya 2018-07-28 18:31 ` Lukas Wunner 2018-07-29 0:26 ` Sinan Kaya 2018-07-29 12:09 ` Lukas Wunner 2018-07-29 16:59 ` Sinan Kaya 2018-07-30 13:28 ` David Laight 2018-07-30 13:54 ` Lukas Wunner 2018-07-30 16:06 ` David Laight 2018-07-30 21:38 ` Alex_Gagniuc 2018-07-31 9:28 ` Lukas Wunner 2018-07-31 16:35 ` Alex_Gagniuc 2018-08-01 8:58 ` David Laight 2018-08-01 19:06 ` Alex_Gagniuc
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.