From: Andi Kleen <andi@firstfloor.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>,
Radoslaw Szkodzinski <lkml@astralstorm.puszkin.org>,
Arjan van de Ven <arjan@infradead.org>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks
Date: Mon, 3 Dec 2007 13:13:57 +0100
Message-ID: <20071203121357.GB2986@one.firstfloor.org>
In-Reply-To: <20071203115900.GB8432@elte.hu>
On Mon, Dec 03, 2007 at 12:59:00PM +0100, Ingo Molnar wrote:
> no. (that's why i added the '(or a kill -9)' qualification above - if
> NFS is mounted noninterruptible then standard signals (such as Ctrl-C)
> should not have an interrupting effect.)
NFS is already interruptible with umount -f (I use that all the time...),
but softlockup won't know that and will throw the warning anyway.
> your syslet snide comment aside (which is quite incomprehensible - a
For the record, I have no problem in principle with syslets; I just
consider them roughly equivalent in end result to an explicit retry-based
AIO implementation.
> retry based asynchronous IO model is clearly inferior even if it were
> implemented everywhere), i do think that most if not all of these
> supposedly "difficult to fix" codepaths are just on the backburner out
> of lack of a clear blame vector.
Hmm. -ENOPARSE. Can you please clarify?
>
> "audit thousands of callsites in 8 million lines of code first" is a
> nice euphemism for hiding from the blame forever. We had 10 years for it
OK, so your approach is "let's warn about it and hope
it will go away"?
> and it didn't happen. As we've seen it again and again, getting a
> non-fatal reminder in the dmesg about the suckage is quite efficient at
It's not universal suckage I would say, but sometimes unavoidable
conditions. Of course it would be better to have all of these be TASK_KILLABLE,
but fixing that everywhere in the kernel will probably be a long-term
project. I'm not arguing against that; I just think forcing it through
backtraces before that work has even started is probably not the right
strategy.
> getting people to fix crappy solutions, and gives users an exact blame
> point of where to start. That will create pressure to fix these
> problems.
After impacting the user base -- many of these conditions are infrequent
enough that we will likely only see them in real production. Throwing
warnings for lots of known cases is probably ok for a -mm kernel
(where users expect things like that), but not for a "release" (be it a
Linus release or any kind of end user distribution) imho.
I don't think there is a real alternative to a code audit first
(and someone doing all the work of fixing these first).
>
> > > I think you are somehow confusing two issues: this patch in no way
> > > declares that "long waits are bad" - if the user _chooses_ to wait
> > > for
> >
> > Throwing a backtrace is the kernel's way to declare something as bad.
> > The only clearer ways to do that that I know of would be BUG or panic().
>
> there are various levels of declaring something bad, and you are quite
> wrong to suggest that a BUG() would be the only recourse.
I didn't write that, please reread my sentence.
But we seem to agree that a backtrace is something "declared bad" anyways,
which was my point.
>
> > > way to stop_ are quite likely bad".
> >
> > The user will just see the backtraces and think the kernel has
> > crashed.
>
> i've just changed the message to:
>
> INFO: task keventd/5 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
That's better, but the backtrace is still there, isn't it?
Anyway, I think I could live with a one-line warning (if it's
seriously rate limited etc.) and a sysctl to enable the backtraces,
off by default. Or, if you prefer, always record
the backtrace in a buffer and make it available somewhere in /proc
or /sys or /debug. Would that work for you?
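To make the first option concrete, here is a minimal userspace sketch (not
kernel code, and not taken from any patch in this thread). The names
report_hung_task, hung_task_backtrace_enabled and hung_task_warnings_left
are hypothetical stand-ins for a reporting hook and two sysctls, and the
"rate limit" is just a simple budget of remaining reports:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical knobs standing in for sysctls; names are illustrative only. */
bool hung_task_backtrace_enabled = false; /* backtraces off by default */
int hung_task_warnings_left = 10;         /* crude rate limit: report budget */

/*
 * Report a task that has been blocked for too long.
 * Returns true if a warning line was emitted, false if the report
 * was suppressed because the budget is used up.
 */
bool report_hung_task(FILE *out, const char *comm, int pid,
                      unsigned long blocked_secs)
{
    if (hung_task_warnings_left <= 0)
        return false;
    hung_task_warnings_left--;

    /* The one-line warning is always printed while budget remains. */
    fprintf(out, "INFO: task %s/%d blocked for more than %lu seconds.\n",
            comm, pid, blocked_secs);

    /* The backtrace is opt-in; a placeholder line stands in for it here. */
    if (hung_task_backtrace_enabled)
        fprintf(out, "(backtrace would be dumped here)\n");

    return true;
}
```

With the defaults above, a call such as report_hung_task(stderr, "keventd", 5, 120)
prints only the one-line warning; the backtrace placeholder appears only
after hung_task_backtrace_enabled has been set to true.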
-Andi