linux-kernel.vger.kernel.org archive mirror
From: Colin Cross <ccross@android.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zoran Markovic <zoran.markovic@linaro.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Linux PM list <linux-pm@vger.kernel.org>,
	Benoit Goby <benoit@android.com>,
	Android Kernel Team <kernel-team@android.com>,
	Todd Poynor <toddpoynor@google.com>, San Mehat <san@google.com>,
	John Stultz <john.stultz@linaro.org>, Pavel Machek <pavel@ucw.cz>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Len Brown <len.brown@intel.com>
Subject: Re: [RFC PATCH] drivers: power: Add watchdog timer to catch drivers which lockup during suspend.
Date: Tue, 30 Apr 2013 21:39:27 -0700
Message-ID: <CAMbhsRT2eCKCuF1PbYY-NhGLCocE+809eZEkQYZWh42EB+SwBg@mail.gmail.com>
In-Reply-To: <20130501041731.GA24128@kroah.com>

On Tue, Apr 30, 2013 at 9:17 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Tue, Apr 30, 2013 at 08:36:21PM -0700, Colin Cross wrote:
>> On Tue, Apr 30, 2013 at 4:30 PM, Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>> > On Tue, Apr 30, 2013 at 03:28:33PM -0700, Zoran Markovic wrote:
>> >> From: Benoit Goby <benoit@android.com>
>> >>
>> >> Below is a patch from android kernel that detects a driver suspend
>> >> lockup and captures dump in the kernel log. Please review and provide
>> >> comments.
>> >
>> > There's this really cool thing called a watchdog driver that does stuff
>> > like this :)
>>
>> If the watchdog driver worked in this case this patch wouldn't exist.
>
> Great, let's fix the watchdog timer then :)
>
> What's wrong with it?
>
>> >> Rather than hard-lock the kernel, dump the suspend thread stack and
>> >> BUG() when a driver takes too long to suspend.  The timeout is set to
>> >> 12 seconds to be longer than the usbhid 10 second timeout.
>> >>
>> >> Exclude from the watchdog the time spent waiting for children that
>> >> are resumed asynchronously and time every device, whether or not they
>> >> resumed synchronously.
>> >
>> > No, don't add a driver-core-only timer, use the existing watchdog timers
>> > if you are worried about the kernel locking up.
>>
>> The watchdog timers are useless here.  For one, they generally stop
>> when their driver suspend op is called, so you may not even have one
>> running when you lock up.
>
> But you can fix that, right?

Ah, you're talking about the lockup detectors, and not drivers/watchdog.

The hardlockup detector can tell you if timer interrupts are not
firing, which is unaffected by this patch since the timer wouldn't
fire anyway.  The softlockup detector could eventually tell you that
tasks were not being scheduled, but not why.  Even panic on softlockup
will only get you the stack trace of the current task, which will be
the locked up task if it is spinning, but is likely to be the idle
task if the suspend task is blocked on a wait_event.  This patch will
give the stack trace of the suspend operation that is blocked, even if
it is an asynchronous suspend callback.
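
To make the difference concrete, here is a rough userspace sketch of the
per-operation idea: arm a timer that knows which device it is timing
before calling the suspend op, and disarm it when the op returns.  This
is not the patch itself.  The names (op_watchdog, wd_set, wd_clear,
run_suspend_op) are hypothetical, a pthread stands in for the kernel
timer, and it only reports the device name, where the real patch dumps
the stack of the blocked suspend operation and calls BUG().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical userspace stand-in for a per-device suspend watchdog. */
struct op_watchdog {
	const char *dev_name;      /* which "driver" is being timed */
	unsigned int timeout_ms;
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int done;                  /* set when the op returns */
	int fired;                 /* set when the deadline passes first */
};

static void *wd_thread(void *arg)
{
	struct op_watchdog *wd = arg;
	struct timespec deadline;
	int rc = 0;

	/* Absolute deadline on the condvar's default clock. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += wd->timeout_ms / 1000;
	deadline.tv_nsec += (long)(wd->timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&wd->lock);
	while (!wd->done && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&wd->cond, &wd->lock, &deadline);
	if (!wd->done) {
		/* The report names the stuck device, not just "something hung". */
		wd->fired = 1;
		fprintf(stderr, "**** suspend timeout in %s ****\n", wd->dev_name);
	}
	pthread_mutex_unlock(&wd->lock);
	return NULL;
}

/* Arm the watchdog just before the operation starts. */
static void wd_set(struct op_watchdog *wd, const char *dev_name,
		   unsigned int timeout_ms)
{
	int rc;

	wd->dev_name = dev_name;
	wd->timeout_ms = timeout_ms;
	wd->done = 0;
	wd->fired = 0;
	pthread_mutex_init(&wd->lock, NULL);
	pthread_cond_init(&wd->cond, NULL);
	rc = pthread_create(&wd->thread, NULL, wd_thread, wd);
	assert(rc == 0);
	(void)rc;
}

/* Disarm after the operation returns; returns 1 if the timeout fired. */
static int wd_clear(struct op_watchdog *wd)
{
	pthread_mutex_lock(&wd->lock);
	wd->done = 1;
	pthread_cond_signal(&wd->cond);
	pthread_mutex_unlock(&wd->lock);
	pthread_join(wd->thread, NULL);
	pthread_mutex_destroy(&wd->lock);
	pthread_cond_destroy(&wd->cond);
	return wd->fired;
}

/* Simulate one device's suspend op that takes op_ms to complete. */
static int run_suspend_op(const char *dev_name, unsigned int op_ms,
			  unsigned int timeout_ms)
{
	struct op_watchdog wd;
	struct timespec busy = { op_ms / 1000, (long)(op_ms % 1000) * 1000000L };

	wd_set(&wd, dev_name, timeout_ms);
	nanosleep(&busy, NULL);    /* stands in for the driver's suspend callback */
	return wd_clear(&wd);
}
```

Calling run_suspend_op("slow-dev", 300, 50) reports slow-dev by name and
returns 1, while a call whose op finishes inside its timeout returns 0
without printing anything.  Because the watchdog is armed per operation
rather than per CPU, it does not matter whether the CPU is spinning or
sitting in the idle task while the op is blocked.
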

>> More importantly, the purpose of this patch is to tell you which
>> driver locked up and hopefully why, and the watchdog driver will
>> usually result in a silent reset.
>
> I thought it was an option as to what the watchdog does when it
> triggers.
>
>> This patch will cause a stack trace of the driver suspend op that is
>> blocking suspend progress, even if that call does not happen in the
>> suspend thread.
>
> But who can see this, the machine is now dead.

I'm not sure what might still be working in this situation on x86, but
on ARM the machine is dead anyway.  Some random subset of drivers is
suspended, so you probably have no hardware watchdog, no console, no
video.  kexec on panic, kgdb on panic, console messages saved in
pstore, or JTAG are the only options I know of.  This patch is very
useful in conjunction with pstore console.

Thread overview: 20+ messages
2013-04-30 22:28 [RFC PATCH] drivers: power: Add watchdog timer to catch drivers which lockup during suspend Zoran Markovic
2013-04-30 22:28 ` [RFC PATCH] power: Add option to log time spent in suspend Zoran Markovic
2013-05-01  0:29   ` Pavel Machek
2013-05-01  3:29     ` Colin Cross
2013-05-02 12:27       ` Pavel Machek
2013-05-02 18:29         ` Colin Cross
2013-05-02 18:58           ` John Stultz
2013-05-02 19:11             ` Colin Cross
2013-04-30 23:30 ` [RFC PATCH] drivers: power: Add watchdog timer to catch drivers which lockup during suspend Greg Kroah-Hartman
2013-05-01  3:36   ` Colin Cross
2013-05-01  4:17     ` Greg Kroah-Hartman
2013-05-01  4:39       ` Colin Cross [this message]
     [not found]         ` <CAK7N6voYXxJKWDwSj5T9Y2fKK+Y5JqN9Wm8Qoffi9N7nRnsYhw@mail.gmail.com>
2013-05-01  5:14           ` Colin Cross
2013-05-01  0:30 ` Pavel Machek
2013-05-01  3:39   ` Colin Cross
2013-05-01 10:56     ` Pavel Machek
2013-05-01 16:10       ` Colin Cross
2013-05-01 16:24         ` Greg Kroah-Hartman
2013-05-02 12:30         ` Pavel Machek
2013-05-02 18:25           ` Colin Cross
