During the safe removal procedure, a Data Link Layer State Changed event
may occur after pciehp_power_off_slot(), and it is handled when the slot is
already set to OFF_STATE. This results in re-enabling the device and makes
it impossible to actually safely remove it.

Add a check for a Presence Detect Changed bit to filter out this interrupt.

It is still possible to re-enable the device if it remains in the slot
after pressing the Attention Button by pressing it again.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
Cc: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/hotplug/pciehp_ctrl.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 3f3df4c29f6e..941f77b71df4 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -251,6 +251,14 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 		return;
 	}
 
+	/* Handle the DLLSC from the slot which just has been turned off */
+	if ((events & PCI_EXP_SLTSTA_DLLSC) &&
+	    !(events & PCI_EXP_SLTSTA_PDC) &&
+	    present && !link_active) {
+		mutex_unlock(&ctrl->state_lock);
+		return;
+	}
+
 	switch (ctrl->state) {
 	case BLINKINGON_STATE:
 		cancel_delayed_work(&ctrl->button_work);
-- 
2.20.1

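For readers following the thread, the filter condition in the patch above
can be tried outside the kernel. The following is only a userspace sketch:
is_stale_dllsc() is a hypothetical helper that does not exist in pciehp,
and the SLTSTA_* macros are local stand-ins assumed to carry the same bit
values as PCI_EXP_SLTSTA_PDC and PCI_EXP_SLTSTA_DLLSC in pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the Slot Status bits (assumed to match pci_regs.h). */
#define SLTSTA_PDC	0x0008u	/* Presence Detect Changed */
#define SLTSTA_DLLSC	0x0100u	/* Data Link Layer State Changed */

/*
 * Hypothetical helper mirroring the condition added by the patch:
 * a link change arrived without a presence change, the card is still
 * in the slot, and the link is down. That combination is what a slot
 * that has just been powered off looks like.
 */
static inline bool is_stale_dllsc(uint32_t events, bool present,
				  bool link_active)
{
	return (events & SLTSTA_DLLSC) &&
	       !(events & SLTSTA_PDC) &&
	       present && !link_active;
}
```

With these definitions, a lone DLLSC on an occupied slot with the link down
is filtered, while the same event accompanied by PDC, or seen with the link
up, is passed on to the state machine.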
On Fri, Mar 01, 2019 at 07:12:51PM +0300, Sergey Miroshnichenko wrote:
> During the safe removal procedure, a Data Link Layer State Changed event
> may occur after pciehp_power_off_slot(), and it is handled when the slot is
> already set to OFF_STATE. This results in re-enabling the device and makes
> it impossible to actually safely remove it.
>
> Add a check for a Presence Detect Changed bit to filter out this interrupt.
>
> It is still possible to re-enable the device if it remains in the slot
> after pressing the Attention Button by pressing it again.
>
> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> ---
> drivers/pci/hotplug/pciehp_ctrl.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 3f3df4c29f6e..941f77b71df4 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -251,6 +251,14 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		return;
>  	}
>  
> +	/* Handle the DLLSC from the slot which just has been turned off */
> +	if ((events & PCI_EXP_SLTSTA_DLLSC) &&
> +	    !(events & PCI_EXP_SLTSTA_PDC) &&
> +	    present && !link_active) {
> +		mutex_unlock(&ctrl->state_lock);
> +		return;
> +	}
> +
>  	switch (ctrl->state) {
>  	case BLINKINGON_STATE:
>  		cancel_delayed_work(&ctrl->button_work);

Hm, I'd instead prefer amending remove_board() like this:

 	if (POWER_CTRL(ctrl)) {
 		pciehp_power_off_slot(ctrl);
 
 		/*
 		 * After turning power off, we must wait for at least 1 second
 		 * before taking any action that relies on power having been
 		 * removed from the slot/adapter.
 		 */
 		msleep(1000);
 
+		/* ignore link or presence changes caused by power off */
+		atomic_and(~(PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_PDC),
+			   &ctrl->pending_events);
 	}

Would this work for you?

Thanks,

Lukas
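
The masking step proposed above can be illustrated in userspace as well.
This is a sketch under stated assumptions, not the pciehp code: it swaps
the kernel's atomic_and() on ctrl->pending_events for C11
atomic_fetch_and() on a plain atomic_uint, discard_power_off_events() is a
hypothetical name, and the SLTSTA_* values are assumed to match the
PCI_EXP_SLTSTA_* definitions in pci_regs.h.

```c
#include <stdatomic.h>

/* Local stand-ins for the Slot Status bits (assumed to match pci_regs.h). */
#define SLTSTA_ABP	0x0001u	/* Attention Button Pressed */
#define SLTSTA_PDC	0x0008u	/* Presence Detect Changed */
#define SLTSTA_DLLSC	0x0100u	/* Data Link Layer State Changed */

/*
 * Hypothetical helper: atomically clear any link or presence change
 * that was queued while the slot was being powered off, leaving every
 * other pending event untouched. Returns the remaining pending bits.
 */
static inline unsigned int discard_power_off_events(atomic_uint *pending)
{
	atomic_fetch_and(pending, ~(SLTSTA_DLLSC | SLTSTA_PDC));
	return atomic_load(pending);
}
```

If the pending mask holds ABP, PDC and DLLSC before the call, only ABP
remains afterwards, so a later button press is still handled while the
events caused by the power-off itself are dropped.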

On 3/5/19 2:01 PM, Lukas Wunner wrote:
> On Fri, Mar 01, 2019 at 07:12:51PM +0300, Sergey Miroshnichenko wrote:
>> During the safe removal procedure, a Data Link Layer State Changed event
>> may occur after pciehp_power_off_slot(), and it is handled when the slot is
>> already set to OFF_STATE. This results in re-enabling the device and makes
>> it impossible to actually safely remove it.
>>
>> Add a check for a Presence Detect Changed bit to filter out this interrupt.
>>
>> It is still possible to re-enable the device if it remains in the slot
>> after pressing the Attention Button by pressing it again.
>>
>> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
>> Cc: Lukas Wunner <lukas@wunner.de>
>> ---
>>  drivers/pci/hotplug/pciehp_ctrl.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
>> index 3f3df4c29f6e..941f77b71df4 100644
>> --- a/drivers/pci/hotplug/pciehp_ctrl.c
>> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
>> @@ -251,6 +251,14 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>>  		return;
>>  	}
>>  
>> +	/* Handle the DLLSC from the slot which just has been turned off */
>> +	if ((events & PCI_EXP_SLTSTA_DLLSC) &&
>> +	    !(events & PCI_EXP_SLTSTA_PDC) &&
>> +	    present && !link_active) {
>> +		mutex_unlock(&ctrl->state_lock);
>> +		return;
>> +	}
>> +
>>  	switch (ctrl->state) {
>>  	case BLINKINGON_STATE:
>>  		cancel_delayed_work(&ctrl->button_work);
>
> Hm, I'd instead prefer amending remove_board() like this:
>
>  	if (POWER_CTRL(ctrl)) {
>  		pciehp_power_off_slot(ctrl);
>
>  		/*
>  		 * After turning power off, we must wait for at least 1 second
>  		 * before taking any action that relies on power having been
>  		 * removed from the slot/adapter.
>  		 */
>  		msleep(1000);
>
> +		/* ignore link or presence changes caused by power off */
> +		atomic_and(~(PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_PDC),
> +			   &ctrl->pending_events);
>  	}
>
> Would this work for you?

Thank you for the review and for the proposed change! This discards exactly
the event which caused the issue - I've just checked it on our hardware, and
will resend it as a v2.

Best regards,
Serge

>
> Thanks,
>
> Lukas
>