linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: "Liu, Chuansheng" <chuansheng.liu@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Martin Steigerwald <Martin@lichtvoll.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [REGRESSION] 3.7-rc3+git hard lockup on CPU after inserting/removing USB stick
Date: Tue, 13 Nov 2012 10:52:37 -0800	[thread overview]
Message-ID: <20121113185237.GA26995@kroah.com> (raw)
In-Reply-To: <27240C0AC20F114CBF8149A2696CBE4A1C3887@SHSMSX101.ccr.corp.intel.com>

On Tue, Nov 13, 2012 at 12:47:31AM +0000, Liu, Chuansheng wrote:
> 
> 
> > -----Original Message-----
> > From: Thomas Gleixner [mailto:tglx@linutronix.de]
> > Sent: Tuesday, November 13, 2012 3:32 AM
> > To: Martin Steigerwald
> > Cc: Liu, Chuansheng; linux-kernel@vger.kernel.org; Ingo Molnar; Greg
> > Kroah-Hartman
> > Subject: Re: [REGRESSION] 3.7-rc3+git hard lockup on CPU after
> > inserting/removing USB stick
> > 
> > On Mon, 12 Nov 2012, Martin Steigerwald wrote:
> > > Am Sonntag, 11. November 2012 schrieb Liu, Chuansheng:
> > > > > The first bad commit is:
> > > > >
> > > > > commit 73d4066055e0e2830533041f4b91df8e6e5976ff
> > > > > Author: Chuansheng Liu <chuansheng.liu@intel.com>
> > > > > Date:   Tue Sep 11 16:00:30 2012 +0800
> > > > >
> > > > >     USB/host: Cleanup unneccessary irq disable code
> > > > >
> > > > >     Because the IRQF_DISABLED as the flag is now a NOOP and has
> > been
> > > > >     deprecated and in hardirq context the interrupt is disabled.
> > > > >
> > > > >     so in usb/host code:
> > > > >     Removing the usage of flag IRQF_DISABLED;
> > > > >     Removing the calling local_irq save/restore actions in irq
> > > > >     handler usb_hcd_irq();
> > > > >
> > > > >     Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
> > > > >     Acked-by: Alan Stern <stern@rowland.harvard.edu>
> > > > >     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > >
> > > > >
> > > > > But:
> > > > >
> > > > > This ony happens with threadirqs option!
> > > > >
> > > > > When I remove threadirqs from kernel command line and reboot with this
> > > > > last bisect kernel USB sticks work.
> > > > >
> > > > > That may explain why nobody else has seen this.
> > > > >
> > > > > So I will try a 3.7-rc4 now, but without threadirqs enabled.
> > > > >
> > > > Thanks your pointing out, the USB HCD irq handler is designed to
> > > > execute in irq handler with irq disabled.  When threadirqs is in
> > > > commandline, it will be executed in thread context with local irq
> > > > enabling, which causes this hardlockup.
> > 
> > No. The problem is caused by the commit above. USB with threaded
> > interrupt handlers worked perfectly fine in the past.
> The reason is that removing local_irq_save/restore() in function usb_hcd_irq().
> The hard lockup analysis is:
> CPU3
> Usb_hcd_irq() -->
>   ehci_irq -->
> spin_lock (&ehci->lock);
> ...
> if (status == ~(u32) 0) {
> 
> At this time, one hrtimer is coming:
> ehci_hrtimer_func() -->
> spin_lock_irqsave(&ehci->lock, flags);
> 
> Due to the spin_lock has been hold before, it causes the deadlock.
> 
> The dmesg is as below:
> [  155.010424]  [<ffffffff814027b0>] ? _raw_spin_lock_irqsave+0x3f/0x48
> [  155.010649]  [<ffffffff814027b0>] ? _raw_spin_lock_irqsave+0x3f/0x48
> [  155.010884]  [<ffffffff814027b0>] ? _raw_spin_lock_irqsave+0x3f/0x48
> [  155.011104]  <<EOE>>  <IRQ>  [<ffffffffa00c6836>] ehci_hrtimer_func+0x28/0xb5 [ehci_hcd]
> [  155.011446]  [<ffffffff8105dc7d>] ? __remove_hrtimer+0x31/0x8b
> [  155.011661]  [<ffffffffa00c680e>] ? end_free_itds+0x108/0x108 [ehci_hcd]
> [  155.011911]  [<ffffffff8105e116>] __run_hrtimer+0xb9/0x176
> [  155.012105]  [<ffffffff8105e804>] hrtimer_interrupt+0xcb/0x1b4
> [  155.012311]  [<ffffffff810a4b6d>] ? irq_thread_fn+0x35/0x35
> [  155.012509]  [<ffffffff8102abc2>] smp_apic_timer_interrupt+0x71/0x84
> [  155.012742]  [<ffffffff81407b4a>] apic_timer_interrupt+0x6a/0x70
> [  155.012950]  <EOI>  [<ffffffffa00c9087>] ? ehci_irq+0x39/0x267 [ehci_hcd]
> [  155.013230]  [<ffffffff81401830>] ? __schedule+0x57f/0x5b2
> [  155.013424]  [<ffffffff810a4b6d>] ? irq_thread_fn+0x35/0x35
> [  155.013645]  [<ffffffffa003befc>] usb_hcd_irq+0x20/0x2f [usbcore]
> [  155.013874]  [<ffffffff810a4b8d>] irq_forced_thread_fn+0x20/0x3e
> 
> > 
> > > > --- a/drivers/usb/core/hcd.c
> > > > +++ b/drivers/usb/core/hcd.c
> > > > @@ -2349,7 +2349,7 @@ static int usb_hcd_request_irqs(struct usb_hcd
> > *hcd,
> > > >         if (hcd->driver->irq) {
> > > >                 snprintf(hcd->irq_descr, sizeof(hcd->irq_descr),
> > "%s:usb%d",
> > > >                                 hcd->driver->description,
> > hcd->self.busnum);
> > > > -               retval = request_irq(irqnum, &usb_hcd_irq, irqflags,
> > > > +               retval = request_irq(irqnum, &usb_hcd_irq,
> > irqflags|IRQF_NO_THREAD,
> > > >                                 hcd->irq_descr, hcd);
> > 
> > NAK. This is exactly the wrong thing to do.
> > 
> > We want to be able to run that code in an handler thread. So you
> As Martin's experience:
> "I think that thread IRQs for USB based interrupts might not be a good
> from another experience."
> Maybe it shows something.
> 
> > removed the local_irq_save/restore() in the driver code and with
> > forced threaded irqs this breaks. Now setting IRQF_NO_THREAD is just
> > working around the problem that the above commit broke it.
> > 
> > There is no hard requirement to run USB interrupts in hard interrupt
> > context. I'd rather see the above commit reverted and then a proper
> > analysis done why removing local_irq_save/restore() breaks forced
> > threaded interrupt handlers.
> As you said, we can revert the patch directly, or submit a new patch to just add
> local_irq_save/restore() back.

Let me revert this for now, as it's obviously causing problems.

thanks,

greg k-h

      reply	other threads:[~2012-11-13 18:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-07 14:01 [REGRESSION] 3.7-rc3+git hard lockup on CPU after inserting/removing USB stick Martin Steigerwald
2012-11-07 14:38 ` Greg Kroah-Hartman
2012-11-07 18:52   ` Martin Steigerwald
2012-11-10 16:34     ` Martin Steigerwald
2012-11-10 16:51       ` Martin Steigerwald
2012-11-11  0:53       ` Liu, Chuansheng
2012-11-12 14:27         ` Martin Steigerwald
2012-11-12 19:31           ` Thomas Gleixner
2012-11-13  0:47             ` Liu, Chuansheng
2012-11-13 18:52               ` Greg Kroah-Hartman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121113185237.GA26995@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=Martin@lichtvoll.de \
    --cc=chuansheng.liu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).