linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9] printk: Cleanups and softlockup avoidance
@ 2013-12-23 20:39 Jan Kara
From: Jan Kara @ 2013-12-23 20:39 UTC
  To: Andrew Morton
  Cc: pmladek, Steven Rostedt, Frederic Weisbecker, LKML, Jan Kara

  Hello,

  this is another installment of the printk softlockup saga. Let me first
recap the problem:

Currently, console_unlock() prints messages from the kernel printk buffer to
the console for as long as the buffer is non-empty. When a serial console is
attached, printing is slow, so other CPUs in the system have plenty of time
to append new messages to the buffer while one CPU is printing. That CPU can
therefore spend an unbounded amount of time printing in console_unlock().
This is especially serious since vprintk_emit() calls console_unlock()
with interrupts disabled.
    
In practice, users have observed that a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered), resulting in RCU stalls (the CPU doing the printing doesn't
reach a quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served, so other CPUs spin waiting for the printing
CPU to process them), and eventually machine death (as messages from stalls
and lockups are appended to the printk buffer faster than we are able to
print them). So these machines are unable to boot with a serial console
attached. Also, during artificial stress testing, a SATA disk disappears
from the system because its interrupts aren't served for too long.
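
To make the "unbounded" part concrete, here is a toy userspace model (not
kernel code) of how long a single CPU stays in the drain loop. The function
name and the rates are assumptions for illustration only: 960 chars/s
corresponds to a 9600 baud line with 8N1 framing (10 bits per character),
and the append rate is made up.

```c
#include <math.h>

/*
 * Toy model, not kernel code: seconds one CPU spends draining the log
 * buffer when it starts with backlog_chars characters queued, the console
 * emits drain_cps characters per second, and other CPUs append
 * append_cps characters per second in the meantime.
 *
 * Returns INFINITY when the producers keep up with the console, i.e. the
 * printing CPU never gets to leave the loop.
 */
double drain_time_seconds(double backlog_chars, double drain_cps,
			  double append_cps)
{
	if (append_cps >= drain_cps)
		return INFINITY;
	return backlog_chars / (drain_cps - append_cps);
}
```

With a 9600 character backlog and a 960 chars/s console, the model gives
10 seconds when nobody else prints, 20 seconds when other CPUs append at
480 chars/s, and no bound at all once they append at 960 chars/s or more.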
---

Since my previous attempts to fix softlockups in printk under heavy load met
some resistance, I've decided to try a different approach - do not let a
CPU out of the console_unlock() loop until there's someone else to take over
the printing.

This patch set implements that idea. It is organized as follows:

The first three patches are cleanups of the block layer and an improvement of
smp_call_function_single() to use lockless lists.  These patches are already
queued in the block tree, so they are here only for completeness.

Patches 4-5 implement __smp_call_function_any(), which sends an IPI to any CPU
from a given cpumask using a caller-provided csd structure.

Patches 6-8 are the printk cleanup patches I have already posted. They make
sense on their own, so even if patch 9 is considered too problematic or in
need of more work, please consider merging these three.

Patch 9 implements the handover of console_sem when a CPU has printed more
than printk.offload_chars characters and another CPU is waiting in
console_trylock_for_printk(). It also sends an IPI to some other CPU to come
and take over printing if no printk has been called for a long time.
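
The decision described for patch 9 can be sketched roughly as follows. This
is a simplified userspace illustration, not the code from the patch: the
struct, the enum and next_action() are hypothetical names, and the "no
printk has been called for a long time" condition is collapsed into a plain
"nobody is waiting" flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for illustration; not the structures used in the patch. */
struct printer_state {
	size_t printed_chars;	/* characters this CPU has printed so far */
	size_t offload_chars;	/* threshold, modeled on printk.offload_chars */
	bool waiter_present;	/* another CPU is waiting to take the console */
};

enum printer_action {
	KEEP_PRINTING,	/* at or below the threshold: carry on */
	HAND_OVER,	/* over the threshold and a waiter exists */
	REQUEST_HELP,	/* over the threshold, nobody waiting: ask another
			 * CPU to come (an IPI in the patch) and keep
			 * printing until it arrives */
};

/* Decide what the printing CPU should do next. */
enum printer_action next_action(const struct printer_state *s)
{
	if (s->printed_chars <= s->offload_chars)
		return KEEP_PRINTING;
	return s->waiter_present ? HAND_OVER : REQUEST_HELP;
}
```

Note that in the REQUEST_HELP case the printing CPU does not leave the loop;
it only stops once some other CPU has actually taken over.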

What do you guys think?

						Merry Christmas ;)
								Honza



Thread overview: 29+ messages
2013-12-23 20:39 [PATCH 0/9] printk: Cleanups and softlockup avoidance Jan Kara
2013-12-23 20:39 ` [PATCH 1/9] block: Stop abusing csd.list for fifo_time Jan Kara
2014-02-01 16:48   ` Frederic Weisbecker
2014-02-03 14:48     ` Jan Kara
2014-02-03 17:02       ` Frederic Weisbecker
2013-12-23 20:39 ` [PATCH 2/9] block: Stop abusing rq->csd.list in blk-softirq Jan Kara
2014-01-30 12:39   ` Frederic Weisbecker
2014-01-30 15:45     ` Jan Kara
2014-01-30 17:01       ` Frederic Weisbecker
2014-01-30 22:12         ` Jan Kara
2014-01-31 15:08           ` Frederic Weisbecker
2013-12-23 20:39 ` [PATCH 3/9] kernel: use lockless list for smp_call_function_single() Jan Kara
2014-01-07 16:21   ` Frederic Weisbecker
2013-12-23 20:39 ` [PATCH 4/9] smp: Teach __smp_call_function_single() to check for offline cpus Jan Kara
2014-01-03  0:47   ` Steven Rostedt
2013-12-23 20:39 ` [PATCH 5/9] smp: Provide __smp_call_function_any() Jan Kara
2014-01-03  0:51   ` Steven Rostedt
2013-12-23 20:39 ` [PATCH 6/9] printk: Release lockbuf_lock before calling console_trylock_for_printk() Jan Kara
2014-01-03  1:53   ` Steven Rostedt
2014-01-03  7:49     ` Jan Kara
2013-12-23 20:39 ` [PATCH 7/9] printk: Enable interrupts " Jan Kara
2013-12-23 20:39 ` [PATCH 8/9] printk: Remove separate printk_sched buffers and use printk buf instead Jan Kara
2013-12-23 20:39 ` [PATCH 9/9] printk: Hand over printing to console if printing too long Jan Kara
2014-01-05  7:57   ` Andrew Morton
2014-01-06  9:46     ` Jan Kara
2014-01-13  7:28       ` Jan Kara
2014-01-15 22:23   ` Andrew Morton
2014-01-16 15:52     ` Jan Kara
2013-12-23 20:39 ` [PATCH 10/10] printk: debug: Slow down printing to 9600 bauds Jan Kara
