linux-pci.vger.kernel.org archive mirror
From: "Hoyer, David" <David.Hoyer@netapp.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Keith Busch <kbusch@kernel.org>
Subject: RE: Kernel hangs when powering up/down drive using sysfs
Date: Mon, 16 Mar 2020 18:25:53 +0000
Message-ID: <DM5PR06MB31328A7B4E1A95A8C5E5E3E092F90@DM5PR06MB3132.namprd06.prod.outlook.com>
In-Reply-To: <20200316181959.wpzi4hkoyzpghwpw@wunner.de>

We were not sure about the return a few lines further up, so we did not add the two lines there.
I will try what you suggested to better understand why we are getting the extra interrupt.

I am not familiar with submitting a "proper patch", so I would appreciate it if you could submit it.

-----Original Message-----
From: Lukas Wunner <lukas@wunner.de> 
Sent: Monday, March 16, 2020 1:20 PM
To: Hoyer, David <David.Hoyer@netapp.com>
Cc: linux-pci@vger.kernel.org; Keith Busch <kbusch@kernel.org>
Subject: Re: Kernel hangs when powering up/down drive using sysfs

On Sat, Mar 14, 2020 at 02:19:44PM +0000, Hoyer, David wrote:
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -637,6 +637,8 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>         events = atomic_xchg(&ctrl->pending_events, 0);
>         if (!events) {
>                 pci_config_pm_runtime_put(pdev);
> +               ctrl->ist_running = false;
> +               wake_up(&ctrl->requester);
>                 return IRQ_NONE;
>        }

Thanks David for the report and sorry for the breakage.

The above LGTM, please submit it as a proper patch and feel free to add my Reviewed-by.  Please add the same two lines before the "return ret" a little further up in the function.
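The effect of the two added lines can be sketched with a small userspace model. This is not the kernel code: `struct ctrl_model`, `ist_model()`, `requester_can_proceed()` and the `wakeups` counter are hypothetical names, and the model assumes that the handler sets ist_running on entry and that a sysfs requester waits until pending_events is zero and ist_running is false.

```c
#include <stdbool.h>

/* Userspace model only; names below are illustrative, not kernel API. */
enum irqreturn_model { IRQ_NONE, IRQ_HANDLED };

struct ctrl_model {
    unsigned int pending_events; /* stands in for the atomic pending_events */
    bool ist_running;            /* assumed to be set when the handler starts */
    int wakeups;                 /* counts modelled wake_up(&ctrl->requester) calls */
};

/*
 * Modelled IRQ thread. With fix_early_return == false the "no events"
 * exit leaves ist_running set, which is the behaviour described in the
 * report. With fix_early_return == true it clears the flag and wakes
 * the requester, as the two added lines do.
 */
static enum irqreturn_model ist_model(struct ctrl_model *c, bool fix_early_return)
{
    unsigned int events;

    c->ist_running = true;

    /* models events = atomic_xchg(&ctrl->pending_events, 0) */
    events = c->pending_events;
    c->pending_events = 0;
    if (!events) {
        if (fix_early_return) {
            c->ist_running = false;
            c->wakeups++;
        }
        return IRQ_NONE;
    }

    /* event handling elided */

    c->ist_running = false;
    c->wakeups++;
    return IRQ_HANDLED;
}

/* Assumed wait condition of a sysfs requester: handler idle, nothing pending. */
static bool requester_can_proceed(const struct ctrl_model *c)
{
    return !c->pending_events && !c->ist_running;
}
```

In this model, a call with no pending events and fix_early_return == false leaves requester_can_proceed() false until a later call handles a real event, which matches the "unstick on drive removal" observation in the report. The same reasoning applies to any other early return in the function.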

If it's too cumbersome for you to submit a proper patch I can do it for you.


> We've instrumented the code and we do see that pciehp_ist() runs 
> twice, once exiting with IRQ_HANDLED and then again with IRQ_NONE.
> We believe that is due to the timing differences.  Adding debug in 
> here changes the timings enough that the hang goes away, so we are 
> having trouble proving this 100% at the moment.  But just based on 
> code inspection, if pciehp_ist() exits with the IRQ_NONE case, then 
> nothing will ever set ist_running=false until a subsequent hotplug 
> event happens that causes the IRQ_HANDLED case to run.  (We were able 
> to prove that will cause things to "unhang" and progress at that point 
> - if you're hung and you remove a drive, the slot status change will 
> then unstick things.)

The question is why pciehp_ist() runs once more.  Most likely another event is signaled from the slot.  Try adding a printk() at the top of pciehp_ist() which emits ctrl->pending_events to understand what's going on.
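When reading such debug output, it can help to decode the raw value into event names. The following is a userspace sketch, not part of the driver: `describe_events()` is a hypothetical helper, and it assumes the low bits of pending_events carry Slot Status bits with the values defined in include/uapi/linux/pci_regs.h. Driver-internal flags, if any are set, are ignored here.

```c
#include <stdio.h>
#include <string.h>

/* Slot Status bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTSTA_ABP   0x0001 /* Attention Button Pressed */
#define PCI_EXP_SLTSTA_PFD   0x0002 /* Power Fault Detected */
#define PCI_EXP_SLTSTA_MRLSC 0x0004 /* MRL Sensor Changed */
#define PCI_EXP_SLTSTA_PDC   0x0008 /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_CC    0x0010 /* Command Completed */
#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */

/*
 * Hypothetical helper: write a "|"-separated list of the known event
 * names set in 'events' into buf, or "none" if no known bit is set.
 * Output is truncated at a name boundary if buf is too small.
 */
static const char *describe_events(unsigned int events, char *buf, size_t len)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } names[] = {
        { PCI_EXP_SLTSTA_ABP,   "ABP"   },
        { PCI_EXP_SLTSTA_PFD,   "PFD"   },
        { PCI_EXP_SLTSTA_MRLSC, "MRLSC" },
        { PCI_EXP_SLTSTA_PDC,   "PDC"   },
        { PCI_EXP_SLTSTA_CC,    "CC"    },
        { PCI_EXP_SLTSTA_DLLSC, "DLLSC" },
    };
    size_t used = 0;
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        size_t need;

        if (!(events & names[i].bit))
            continue;
        /* separator (if any) + name + terminating NUL must fit */
        need = strlen(names[i].name) + (used ? 1 : 0);
        if (need >= len - used)
            break;
        used += (size_t)snprintf(buf + used, len - used, "%s%s",
                                 used ? "|" : "", names[i].name);
    }

    if (used == 0)
        snprintf(buf, len, "none");
    return buf;
}
```

For example, a logged value of 0x108 would decode to "PDC|DLLSC" under these assumptions, and a value of 0 to "none", the case in which the handler returns IRQ_NONE.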

Thanks,

Lukas


Thread overview: 13+ messages
2020-03-14 14:19 Kernel hangs when powering up/down drive using sysfs Hoyer, David
2020-03-16 16:15 ` Keith Busch
2020-03-16 18:10   ` Lukas Wunner
2020-03-16 18:42     ` Keith Busch
2020-03-18 11:53       ` Lukas Wunner
2020-03-16 18:19 ` Lukas Wunner
2020-03-16 18:25   ` Hoyer, David [this message]
2020-03-16 21:35     ` Hoyer, David
2020-03-18 11:49     ` Lukas Wunner
2020-03-18 14:06       ` Hoyer, David
2020-03-18 11:33 ` [PATCH] PCI: pciehp: Fix indefinite wait on sysfs requests Lukas Wunner
2020-03-18 16:43   ` Keith Busch
2020-03-28 20:25   ` Bjorn Helgaas
