All of lore.kernel.org
 help / color / mirror / Atom feed
From: <Alex_Gagniuc@Dellteam.com>
To: <lukas@wunner.de>
Cc: <mr.nuke.me@gmail.com>, <keith.busch@intel.com>,
	<linux-pci@vger.kernel.org>, <Austin.Bolen@dell.com>,
	<Stuart.Hayes@dell.com>, <Narendra.K@dell.com>,
	<Christopher.Arzola@dell.com>, <David.Chalfant@dell.com>
Subject: Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit?
Date: Fri, 27 Jul 2018 17:51:04 +0000	[thread overview]
Message-ID: <99604d46a7554eb38ee6c1579c53d835@ausx13mps321.AMER.DELL.COM> (raw)
In-Reply-To: 20180727170543.GA5326@wunner.de

On 07/27/2018 12:05 PM, Lukas Wunner wrote:=0A=
> On Fri, Jul 27, 2018 at 03:52:04PM +0000, Alex_Gagniuc@Dellteam.com wrote=
:=0A=
>> On 07/27/2018 02:18 AM, Lukas Wunner wrote:=0A=
>>> On Thu, Jul 26, 2018 at 05:38:50PM -0500, Alex G. wrote:=0A=
>>>> I was under the impression that a DLLSC or PDSC would trigger the=0A=
>>>> PCI_DEV_DISCONNECTED bit to be set, blocking any further config access=
.=0A=
>>>=0A=
>>> Only if the Presence Detect State bit in the Slot Status register is=0A=
>>> not set.=0A=
>>>=0A=
>>> I think the idea was that if the card is still in the slot, its driver =
can=0A=
>>> be unbound orderly because the device is still accessible.  So there's =
no=0A=
>>> need to set PCI_DEV_DISCONNECTED.=0A=
>>=0A=
>> This sounds counter to what I had intuited. I mentally associate=0A=
>> DISCONNECTED with "the link is down". I don't understand the idea of a=
=0A=
>> driver doing an orderly removal when the link is down. Device registers=
=0A=
>> wouldn't be accessible in that case.=0A=
> =0A=
> That's basically what I meant, sorry for not being clear.=0A=
> =0A=
>>> However if there is a already a new card in the slot, PCI_DEV_DISCONNEC=
TED=0A=
>>> will erroneously not be set.  Also, if card removal was triggered by th=
e=0A=
>>> link going down but the card is still in the slot, PCI_DEV_DISCONNECTED=
=0A=
>>> will also erroneously not be set even though it's inaccessible.=0A=
>>>=0A=
>>> The only situations when PCI_DEV_DISCONNECTED should not be set is if t=
he=0A=
>>> card is being removed by sysfs or by the user pressing the Attention Bu=
tton.=0A=
>>> Anything else is a surprise removal.  What we need to do is pass down a=
=0A=
>>> flag to pciehp_unconfigure_device() to indicate whether one of those tw=
o=0A=
>>> situations is at hand or not, and PCI_DEV_DISCONNECTED should be set=0A=
>>> depending on that flag.=0A=
>>=0A=
>> That makes a little more sense. Is someone working on this?=0A=
> =0A=
> Me, but not for 4.19, we're too late in the cycle, I'm going to post=0A=
> a small number of fixup patches for Bjorn's pci/hotplug branch shortly,=
=0A=
> to be included in 4.19, and that's it.=0A=
> =0A=
> I posted a patch as part of the series that's now on Bjorn's pci/hotplug=
=0A=
> branch which touched PCI_DEV_DISCONNECTED code in pciehp_pci.c, but had=
=0A=
> to withdraw that particular patch:=0A=
> https://patchwork.ozlabs.org/patch/930403/=0A=
> =0A=
> The first problem with that patch was that I hadn't fully understood=0A=
> yet when to set PCI_DEV_DISCONNECTED and when not to set it.=0A=
=0A=
I think PCI_DEV_DISCONNECTED is a documentation issue above all else. =0A=
The history I was given is that drivers would take a very long time to =0A=
tear down a device. Config space IO to an nonexistent device took a long =
=0A=
while to time out. Performance was one motivation -- and was not documented=
.=0A=
=0A=
> The second problem was that it fixed a deadlock on unplug in a way=0A=
> that wasn't generic enough.  The same deadlock can occur in other=0A=
> situations.  The real fix is to unbind the driver lockless in=0A=
> pci_stop_dev().  That's non-trivial to achieve, but doable.=0A=
> I don't think there's a way around that to fix the problem:=0A=
> https://www.spinics.net/lists/linux-pci/msg73652.html=0A=
> =0A=
> There's (still) a lot that can be improved in pciehp, I hope to=0A=
> keep working on that as time permits.=0A=
=0A=
Thanks for all the info. The fix that I was settling on is (pasted) =0A=
below. Though that seems to conflict a bit with what you are trying to =0A=
do. Now I'm a little conflicted If I should try to submit the below or not.=
=0A=
=0A=
Alex=0A=
=0A=
diff --git a/drivers/pci/hotplug/pciehp_pci.c =0A=
b/drivers/pci/hotplug/pciehp_pci.c=0A=
index 3f518dea856d..5cf02639fa0b 100644=0A=
--- a/drivers/pci/hotplug/pciehp_pci.c=0A=
+++ b/drivers/pci/hotplug/pciehp_pci.c=0A=
@@ -74,6 +74,7 @@ int pciehp_unconfigure_device(struct slot *p_slot)=0A=
         ctrl_dbg(ctrl, "%s: domain:bus:dev =3D %04x:%02x:00\n",=0A=
                  __func__, pci_domain_nr(parent), parent->number);=0A=
         pciehp_get_adapter_status(p_slot, &presence);=0A=
+       presence =3D presence && pciehp_check_link_active(ctrl);=0A=
=0A=
         pci_lock_rescan_remove();=0A=
=0A=
=0A=
=0A=
> Thanks,=0A=
> =0A=
> Lukas=0A=
> =0A=
=0A=

  reply	other threads:[~2018-07-27 19:14 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-26 22:38 Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit? Alex G.
2018-07-26 23:00 ` Rajat Jain
2018-07-27  0:04   ` Alex_Gagniuc
2018-07-27  7:18 ` Lukas Wunner
2018-07-27 15:52   ` Alex_Gagniuc
2018-07-27 17:05     ` Lukas Wunner
2018-07-27 17:51       ` Alex_Gagniuc [this message]
2018-07-27 18:17         ` Sinan Kaya
2018-07-27 18:23           ` Alex_Gagniuc
2018-07-27 18:34             ` Sinan Kaya
2018-07-28 18:31         ` Lukas Wunner
2018-07-29  0:26           ` Sinan Kaya
2018-07-29 12:09             ` Lukas Wunner
2018-07-29 16:59               ` Sinan Kaya
2018-07-30 13:28           ` David Laight
2018-07-30 13:54             ` Lukas Wunner
2018-07-30 16:06               ` David Laight
2018-07-30 21:38           ` Alex_Gagniuc
2018-07-31  9:28             ` Lukas Wunner
2018-07-31 16:35               ` Alex_Gagniuc
2018-08-01  8:58                 ` David Laight
2018-08-01 19:06                   ` Alex_Gagniuc

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99604d46a7554eb38ee6c1579c53d835@ausx13mps321.AMER.DELL.COM \
    --to=alex_gagniuc@dellteam.com \
    --cc=Austin.Bolen@dell.com \
    --cc=Christopher.Arzola@dell.com \
    --cc=David.Chalfant@dell.com \
    --cc=Narendra.K@dell.com \
    --cc=Stuart.Hayes@dell.com \
    --cc=keith.busch@intel.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=mr.nuke.me@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.