From: Lukas Wunner <lukas@wunner.de>
To: "Hoyer, David" <David.Hoyer@netapp.com>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Keith Busch <kbusch@kernel.org>
Subject: Re: Kernel hangs when powering up/down drive using sysfs
Date: Mon, 16 Mar 2020 19:19:59 +0100
Message-ID: <20200316181959.wpzi4hkoyzpghwpw@wunner.de>
In-Reply-To: <DM5PR06MB313235E97731D97AB813F65D92FB0@DM5PR06MB3132.namprd06.prod.outlook.com>

On Sat, Mar 14, 2020 at 02:19:44PM +0000, Hoyer, David wrote:
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -637,6 +637,8 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>  	events = atomic_xchg(&ctrl->pending_events, 0);
>  	if (!events) {
>  		pci_config_pm_runtime_put(pdev);
> +		ctrl->ist_running = false;
> +		wake_up(&ctrl->requester);
>  		return IRQ_NONE;
>  	}

Thanks, David, for the report, and sorry for the breakage.

The above LGTM; please submit it as a proper patch and
feel free to add my Reviewed-by. Please add the same
two lines before the "return ret" a little further up
in the function, since that early exit also returns without
clearing ist_running.

If it's too cumbersome for you to submit a proper patch,
I can do it for you.

> We've instrumented the code and we do see that pciehp_ist() runs
> twice, once exiting with IRQ_HANDLED and then again with IRQ_NONE.
> We believe that is due to timing differences. Adding debug in
> here changes the timings enough that the hang goes away, so we are
> having trouble proving this 100% at the moment. But just based on
> code inspection, if pciehp_ist() exits with the IRQ_NONE case, then
> nothing will ever set ist_running=false until a subsequent hotplug
> event happens that causes the IRQ_HANDLED case to run. (We were
> able to prove that will cause things to "unhang" and progress at
> that point - if you're hung and you remove a drive, the slot status
> change will then unstick things.)
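
The hang described above can be illustrated with a small userspace
model. This is only a sketch, not the driver code: struct model_ctrl,
model_ist(), model_requester_blocked() and model_hangs() are
hypothetical names, and the model assumes that the IRQ thread sets
ist_running on entry and that the sysfs requester waits until no
events are pending and ist_running is false.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two controller fields discussed here. */
struct model_ctrl {
	unsigned int pending_events;
	bool ist_running;
};

enum model_ret { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

/*
 * Models one run of the IRQ thread. With with_fix set, the early
 * "no events" exit also clears ist_running, as in the quoted diff.
 */
static enum model_ret model_ist(struct model_ctrl *c, bool with_fix)
{
	unsigned int events;

	c->ist_running = true;		/* assumption: set on entry */
	events = c->pending_events;	/* models atomic_xchg(..., 0) */
	c->pending_events = 0;

	if (!events) {
		if (with_fix)
			c->ist_running = false;
		return MODEL_IRQ_NONE;	/* without the fix, flag stays set */
	}

	/* ... event handling would happen here ... */
	c->ist_running = false;
	return MODEL_IRQ_HANDLED;
}

/* Assumed wait condition of the sysfs requester, inverted. */
static bool model_requester_blocked(const struct model_ctrl *c)
{
	return c->pending_events != 0 || c->ist_running;
}

/* One event, then a second run with nothing pending. */
static bool model_hangs(bool with_fix)
{
	struct model_ctrl c = { .pending_events = 1, .ist_running = false };

	model_ist(&c, with_fix);	/* first run: MODEL_IRQ_HANDLED */
	model_ist(&c, with_fix);	/* second run: MODEL_IRQ_NONE */
	return model_requester_blocked(&c);
}
```

In this model, model_hangs(false) leaves the requester blocked after
the second run, while model_hangs(true) does not. It also matches the
"unstick" observation: another event followed by a handled run clears
the flag again.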

The question is why pciehp_ist() runs once more. Most likely
another event is signaled from the slot. Try adding a
printk() at the top of pciehp_ist() which emits ctrl->pending_events
to understand what's going on.

Thanks,

Lukas