[RFC PATCH 00/11] printk: safe printing in NMI context

* [RFC PATCH 00/11] printk: safe printing in NMI context
@ 2014-05-09  9:10 Petr Mladek
  2014-05-09  9:10 ` [RFC PATCH 01/11] printk: rename struct printk_log to printk_msg Petr Mladek
                   ` (11 more replies)
  0 siblings, 12 replies; 39+ messages in thread
From: Petr Mladek @ 2014-05-09  9:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Frederic Weisbecker, Steven Rostedt, Dave Anderson,
	Paul E. McKenney, Kay Sievers, Jiri Kosina, Michal Hocko,
	Jan Kara, linux-kernel, Petr Mladek

printk() cannot be used safely in NMI context because it uses internal locks
and thus could cause a deadlock. Unfortunately there are circumstances when
calling printk from NMI is very useful. For example, all WARN.*(in_nmi())
would be much more helpful if they didn't lockup the machine.

Another example would be arch_trigger_all_cpu_backtrace for x86 which uses NMI
to dump traces on all CPU (either triggered by sysrq+l or from RCU stall
detector).

This patch set solves the problem by using trylock rather than spin_lock.
If the lock can not be taken, it uses NMI specific buffers to temporary
store the message.

Patches 1-5 are preparation steps to be able to handle more ring buffers
using the same functions.

Patch 6 adds the main logic to handle the NMI messages a safe way.

Patches 7-11 improve various aspects of the NMI messages handling.

It was a long way until I got stable working solution that would fulfill
the most important needs:

    + do not lock up the machine
    + prefer last messages if logging is too slow
    + pass the messages to the console as soon as possible
    + keep the order of messages, especially parts of continuous lines
    + reduce interlacing messages from normal and NMI context

The current solution works pretty well. There are still some corner cases
where the continuous messages are split. I still want to look at it but
I do not want to hide the work any longer. I look forward to hear your
opinion and hints.

Note that the first two patches modifies API that is used by some external
tools, e.g. crash, makedumpfile. The tools would need to get updated.
The API was changed to make the solution cleaner.

I added Paul E. McKenney into CC because there are used memory barriers
in the 6th patch.

The patch set is based on kernel-next. The last commit is a42b108e06bb28348
(Add linux-next specific files for 20140507).

It can be applied also on the Linus' tree if you apply the recent
patches that touch kernel/printk/printk.c

Petr Mladek (11):
  printk: rename struct printk_log to printk_msg
  printk: allow to handle more log buffers
  printk: rename "logbuf_lock" to "main_logbuf_lock"
  printk: add NMI ring and cont buffers
  printk: allow to modify NMI log buffer size using boot parameter
  printk: NMI safe printk
  printk: right ordering of the cont buffers from NMI context
  printk: try hard to print Oops message in NMI context
  printk: merge and flush NMI buffer predictably via IRQ work
  printk: survive rotation of sequence numbers
  printk: avoid staling when merging NMI log buffer

 Documentation/kernel-parameters.txt |   19 +-
 kernel/printk/printk.c              | 1218 +++++++++++++++++++++++++----------
 2 files changed, 878 insertions(+), 359 deletions(-)

-- 
1.8.4

^ permalink raw reply	[flat|nested] 39+ messages in thread