linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Kosina <jkosina@suse.cz>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	Petr Mladek <pmladek@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Dave Anderson <anderson@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Kay Sievers <kay@vrfy.org>, Michal Hocko <mhocko@suse.cz>,
	Jan Kara <jack@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 00/11] printk: safe printing in NMI context
Date: Wed, 18 Jun 2014 13:03:05 +0200 (CEST)	[thread overview]
Message-ID: <alpine.LNX.2.00.1406181245230.2303@pobox.suse.cz> (raw)
In-Reply-To: <CA+55aFzseOqF-EpKMvwKpfhBJZQSLqKpJ3shzVee9s0+mvyCuA@mail.gmail.com>

On Tue, 10 Jun 2014, Linus Torvalds wrote:

> > Lets be crazy and Cc Linus on that.
> 
> Quite frankly, I hate seeing something like this:
> 
>  kernel/printk/printk.c              | 1218 +++++++++++++++++++++++++----------
> 
> for something that is stupid and broken. Printing from NMI context
> isn't really supposed to work, and we all *know* it's not supposed to
> work.
> 
> I'd much rather disallow it, and if there is one or two places that
> really want to print a warning and know that they are in NMI context,
> have a special workaround just for them, with something that does
> *not* try to make printk in general work any better.
> 
> Dammit, NMI context is special. I absolutely refuse to buy into the
> broken concept that we should make more stuff work in NMI context.
> Hell no, we should *not* try to make more crap work in NMI. NMI people
> should be careful.
> 
> Make a trivial "printk_nmi()" wrapper that tries to do a trylock on
> logbuf_lock, and *maybe* the existing sequence of
> 
>         if (console_trylock_for_printk())
>                 console_unlock();
> 
> then works for actually triggering the printout. But the wrapper
> should be 15 lines of code for "if possible, try to print things", and
> *not* a thousand lines of changes.

Alright, so this went silent again without any real progress. Is everyone 
hoping this gets sorted out on kernel summit, or ... ?

Let me sum up the current situation:

- both RCU stall detector and 'echo l > sysrq-trigger' can (and we've 
  seen it happening for real) cause a complete, undebuggable, silent hang 
  of machine (deadlock in NMI context)

- before 7ff9554bb578 and friends, this was trivial to fix more or less 
  exactly the way Linus is proposing above. We've been carrying the 
  fix in our kernels for a while already [1]. With printk() having got  
  overly complicated recently, the "in principle trivial" fix turns out  
  into crazy mess due to handling of all the indexes, sequence numbers, 
  etc.

- printk() from NMI is actually useful in rare cases (such as inter-CPU 
  stack dumping)

- using lockless buffers that trace_printk() is using has its own 
  problems, as described by Petr elsewhere in this thread


I find it rather outrageous that fixing *real bugs* (leading to hangs) 
becomes impossible due to printk() being too complex. It's very 
unfortunate that the same level of pushback didn't happen when new 
features (that actually *made* it so complicated) have been pushed; that 
would be much more valuable an appropriate.

I believe Jan Kara is in the same situation with his softlockup fixes for 
printk. Those are real problems, as they are bringing machines down, and 
after two years, still not fixed, because "printk() code is scary enough 
as-is"

[1] http://kernel.suse.com/cgit/kernel/commit/?h=SLE11-SP3&id=8d62ae68ff61d77ae3c4899f05dbd9c9742b14c9

-- 
Jiri Kosina
SUSE Labs

  parent reply	other threads:[~2014-06-18 11:03 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-09  9:10 [RFC PATCH 00/11] printk: safe printing in NMI context Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 01/11] printk: rename struct printk_log to printk_msg Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 02/11] printk: allow to handle more log buffers Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 03/11] printk: rename "logbuf_lock" to "main_logbuf_lock" Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 04/11] printk: add NMI ring and cont buffers Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 05/11] printk: allow to modify NMI log buffer size using boot parameter Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 06/11] printk: NMI safe printk Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 07/11] printk: right ordering of the cont buffers from NMI context Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 08/11] printk: try hard to print Oops message in " Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 09/11] printk: merge and flush NMI buffer predictably via IRQ work Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 10/11] printk: survive rotation of sequence numbers Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 11/11] printk: avoid staling when merging NMI log buffer Petr Mladek
2014-05-28 22:02 ` [RFC PATCH 00/11] printk: safe printing in NMI context Jiri Kosina
2014-05-29  0:09   ` Frederic Weisbecker
2014-05-29  8:09     ` Jiri Kosina
2014-06-10 16:46       ` Frederic Weisbecker
2014-06-10 16:57         ` Linus Torvalds
2014-06-10 17:32           ` Jiri Kosina
2014-06-11  9:01             ` Petr Mládek
2014-06-18 11:03           ` Jiri Kosina [this message]
2014-06-18 14:36             ` Paul E. McKenney
2014-06-18 14:41               ` Jiri Kosina
2014-06-18 14:44                 ` Paul E. McKenney
2014-06-18 14:53                   ` Jiri Kosina
2014-06-18 15:07                     ` Paul E. McKenney
     [not found]               ` <CA+55aFwPgDC6gSEPfu3i-pA4f0ZbsTSvykxzX4sXMeLbdXuKrw@mail.gmail.com>
2014-06-18 16:21                 ` Paul E. McKenney
2014-06-18 16:38                   ` Steven Rostedt
2014-06-18 16:43                     ` Paul E. McKenney
2014-06-18 20:36                   ` Jiri Kosina
2014-06-18 21:07                     ` Paul E. McKenney
2014-06-18 21:12                       ` Jiri Kosina
2014-06-18 21:20                         ` Paul E. McKenney
2014-06-18 21:32                           ` Jiri Kosina
2014-06-18 21:37                             ` Paul E. McKenney
2014-06-18 23:20                         ` Steven Rostedt
2014-05-30  8:13     ` Jan Kara
2014-05-30 10:10       ` Jiri Kosina
2014-06-10 16:49       ` Frederic Weisbecker
2014-06-12 11:50     ` Petr Mládek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LNX.2.00.1406181245230.2303@pobox.suse.cz \
    --to=jkosina@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=anderson@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jack@suse.cz \
    --cc=kay@vrfy.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.cz \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pmladek@suse.cz \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).