All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Petr Mladek <pmladek@suse.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: NMI watchdog dump does not print on hard lockup
Date: Tue, 17 Oct 2017 16:50:15 +0900	[thread overview]
Message-ID: <20171017075015.GA6915@jagdpanzerIV> (raw)
In-Reply-To: <20171016101547.27b4aa11@gandalf.local.home>

On (10/16/17 10:15), Steven Rostedt wrote:
[..]
> > just "brainstorming" it... with some silly ideas.
> > 
> > pushing the data from NMI panic might look like we are replacing one
> > deadlock scenario with another deadlock scenario. some of the console
> > drivers are soooo complex internally. so I have been thinking about...
> > may be we can extend struct console and add ->write_on_panic() and that
> > handler must be as lockless as possible; so lockless that calling it
> > from anything that is not panic() is a severe bug.
> 
> This may not be a bad idea. And make it so it can't be called unless we
> are in panic mode (or at least "oops in progress").

right.

we used to have that zap_locks() function, which used to re-init printk()
internal locks on panic (printk recursion while in panic, to be exact):
logbuf spin_lock and console_sem. I wasn't to fond of this function, it
was missing the fact that on panic every printk() is a direct printk (at
least we have such expectation), IOW, it involves
	console_unlock()->call_console_drivers()

so punching printk()'s locks and leaving console drivers' locks intact
was not fair. at all. so, to improve the situation, I removed zap_locks().
/* kidding */


we have sort of re-entrant printk() now. but not completely re-entrant,
because console drivers are not re-entrant. so we can do

a) add ->zap_locks() callback to console drivers

   each console (which wants to be useful) can re-init its locks there, we
   will call it from panic() only. but, given how complex some of the
   consoles, I'd much rather prefer

b) add ->write_on_panic() callback to console drivers

   and do a barely legal print out there


I don't expect/want/push for/etc every console driver to implement
->write_on_panic() callback, just several most commonly used ones.
basically, the ones that you and PeterZ are using.


we also can split our flush_on_panic() and factor out the most
important part of console_unlock(). the first flush_on_panic(), let's
call it flush_on_panic_immediately() or whatever we name it, can push
messages only to those console drivers that have ->write_on_panic()
enabled. and it must call factored out part of console_unlock(). we
don't want flush_on_panic_immediately() to attempt up() the console
semaphore, because this can deadlock. so that factored out __console_unlock()
won't care about console_sem at all.


the second flush_on_panic() can push the data to all registered and
enabled consoles. this has chances to deadlock, but we can be less
nervous about it [given that there was at least one console with
->write_on_panic()].


> If oops_in_progress is set, and the console has a "write_on_panic"
> handler, then just call that.

yes. I don't like oops_in_progress variable, but some flag is
definitely needed.

> Heck, if it doesn't have one, and early_printk is defined, then perhaps
> that should be the default "write_on_panic" output?

yes, early_printk is a good addition. my systems have
"# CONFIG_EARLY_PRINTK is not set".

	-ss

  reply	other threads:[~2017-10-17  7:50 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-12 16:16 NMI watchdog dump does not print on hard lockup Steven Rostedt
2017-10-12 19:26 ` Peter Zijlstra
2017-10-13 11:14 ` Petr Mladek
2017-10-13 13:18   ` Steven Rostedt
2017-10-13 19:12     ` Linus Torvalds
2017-10-16 11:12       ` Petr Mladek
2017-10-16 13:13         ` Sergey Senozhatsky
2017-10-16 14:15           ` Steven Rostedt
2017-10-17  7:50             ` Sergey Senozhatsky [this message]
2018-10-23  6:49             ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171017075015.GA6915@jagdpanzerIV \
    --to=sergey.senozhatsky.work@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.