linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: printk: preempt_disable with long time resulting in softlockup/RCU stall issues
       [not found] <CABiPGEe=5tO5ffopExaHWV=fCb98vD5gHirt6vgXciA3hDrOfg@mail.gmail.com>
@ 2015-07-08  8:13 ` Petr Mladek
       [not found]   ` <CABiPGEcBUiBfJYAREbiM0LDQM+iNeOzKmC_oCaL0Zh_humH23Q@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Petr Mladek @ 2015-07-08  8:13 UTC (permalink / raw)
  To: Gavin Hu
  Cc: linux-kernel, Alex Elder, Luis R. Rodriguez,
	Steven Rostedt (Red Hat),
	Joe Perches, Peter Hurley, Tejun Heo, cxie4, cldu, xjian, fswu,
	Jan Kara

On Wed 2015-07-08 15:20:01, Gavin Hu wrote:
> Hi,
>      You know, a simple pirntk like printk("hello world!\n") via serial
> console may consume 21 seconds  above, because any printk is obligated to
> pump out all the characters in the log buffer
> with preempt disable until it is empty (vprintk_emit() and
> console_unlock()), not just "hello world!" itself.
> 
> And in the end, long time preempt disable may result in the softlock/RCU
> stall issues. Some systems are configured to trigger a kernel panic when
> meet softlock/ RCU stall.  So heavy printk may result in kernel panic.
> Usually, the solution is to remove related pritnks or limit the printk
> times. So sometimes, the low-level developers have to handle various heavy
> pirntk issues.
> 
> Why don't make the printk mechanism more flexible and configurable, for
> example, just pump out 1000 messages per printk calling,  to avoid  preempt
> disable with a long time?
> And we can disable or enable this feature by a proc interface. Be default,
> this feature is disabled. So it doesn't influence boot-time message printk.
> 
> Anyway, we maybe can give the low-level developers more options.
> 
> Any comments are welcome!

I add Jack into CC because he made several attempts to address this
issue in the past. For example,  see
https://lkml.org/lkml/2013/2/26/743

Jack's approach was similar but also tried to schedule further
writing to the console. With your approach, the console might stay
far behind the reality for a long time.

IMHO, we also want to make sure that all messages go out
when the system is panicking.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: printk: preempt_disable with long time resulting in softlockup/RCU stall issues
       [not found]   ` <CABiPGEcBUiBfJYAREbiM0LDQM+iNeOzKmC_oCaL0Zh_humH23Q@mail.gmail.com>
@ 2015-07-08 17:04     ` Luis R. Rodriguez
  2015-07-08 17:35       ` Joe Perches
  2015-07-09 14:10       ` Jan Kara
  0 siblings, 2 replies; 6+ messages in thread
From: Luis R. Rodriguez @ 2015-07-08 17:04 UTC (permalink / raw)
  To: Gavin Hu
  Cc: Petr Mladek, linux-kernel, Alex Elder, Steven Rostedt (Red Hat),
	Joe Perches, Peter Hurley, Tejun Heo, cxie4, cldu, xjian, fswu,
	Jan Kara

On Wed, Jul 08, 2015 at 05:08:35PM +0800, Gavin Hu wrote:
> Hi,
> 
> Yes. We should disable the printk_limit feature when panic to avoid missing
> messages.

Sounds like you have been looking into it and have a good idea of what you
want to do, why not try it and send some RFC patches ?

While at it, then we could consider doing different things depending on the
message type. KERN_EMERG would disable preemption, whereas KERN_INFO may not be
so critical to require it.

 Luis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: printk: preempt_disable with long time resulting in softlockup/RCU stall issues
  2015-07-08 17:04     ` Luis R. Rodriguez
@ 2015-07-08 17:35       ` Joe Perches
  2015-07-08 18:00         ` Steven Rostedt
  2015-07-09 14:10       ` Jan Kara
  1 sibling, 1 reply; 6+ messages in thread
From: Joe Perches @ 2015-07-08 17:35 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Gavin Hu, Petr Mladek, linux-kernel, Alex Elder,
	Steven Rostedt (Red Hat),
	Peter Hurley, Tejun Heo, cxie4, cldu, xjian, fswu, Jan Kara

On Wed, 2015-07-08 at 19:04 +0200, Luis R. Rodriguez wrote:
> On Wed, Jul 08, 2015 at 05:08:35PM +0800, Gavin Hu wrote:
> > Hi,
> > 
> > Yes. We should disable the printk_limit feature when panic to avoid missing
> > messages.
> 
> Sounds like you have been looking into it and have a good idea of what you
> want to do, why not try it and send some RFC patches ?
> 
> While at it, then we could consider doing different things depending on the
> message type. KERN_EMERG would disable preemption, whereas KERN_INFO may not be
> so critical to require it.

That might be a bit difficult to implement with complete correctness
given KERN_EMERG use and continuation lines.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: printk: preempt_disable with long time resulting in softlockup/RCU stall issues
  2015-07-08 17:35       ` Joe Perches
@ 2015-07-08 18:00         ` Steven Rostedt
  2015-07-09  7:56           ` Petr Mladek
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2015-07-08 18:00 UTC (permalink / raw)
  To: Joe Perches
  Cc: Luis R. Rodriguez, Gavin Hu, Petr Mladek, linux-kernel,
	Alex Elder, Peter Hurley, Tejun Heo, cxie4, cldu, xjian, fswu,
	Jan Kara

On Wed, 08 Jul 2015 10:35:29 -0700
Joe Perches <joe@perches.com> wrote:

> On Wed, 2015-07-08 at 19:04 +0200, Luis R. Rodriguez wrote:
> > On Wed, Jul 08, 2015 at 05:08:35PM +0800, Gavin Hu wrote:
> > > Hi,
> > > 
> > > Yes. We should disable the printk_limit feature when panic to avoid missing
> > > messages.
> > 
> > Sounds like you have been looking into it and have a good idea of what you
> > want to do, why not try it and send some RFC patches ?
> > 
> > While at it, then we could consider doing different things depending on the
> > message type. KERN_EMERG would disable preemption, whereas KERN_INFO may not be
> > so critical to require it.
> 
> That might be a bit difficult to implement with complete correctness
> given KERN_EMERG use and continuation lines.
> 

Or just have the current context determine what to do. If printk() was
called with preemption or interrupts disabled, it flushes the full
buffer before returning, otherwise it allows the writes to console be
preempted.

That would be great for -rt, as currently we don't even allow printk()
to write out if called from a interrupt disabled context, unless the
console it writes to is marked as preempt safe (serial console is
unfortunately not).

-- Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: printk: preempt_disable with long time resulting in softlockup/RCU stall issues
  2015-07-08 18:00         ` Steven Rostedt
@ 2015-07-09  7:56           ` Petr Mladek
  0 siblings, 0 replies; 6+ messages in thread
From: Petr Mladek @ 2015-07-09  7:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Joe Perches, Luis R. Rodriguez, Gavin Hu, linux-kernel,
	Alex Elder, Peter Hurley, Tejun Heo, cxie4, cldu, xjian, fswu,
	Jan Kara

On Wed 2015-07-08 14:00:17, Steven Rostedt wrote:
> On Wed, 08 Jul 2015 10:35:29 -0700
> Joe Perches <joe@perches.com> wrote:
> 
> > On Wed, 2015-07-08 at 19:04 +0200, Luis R. Rodriguez wrote:
> > > On Wed, Jul 08, 2015 at 05:08:35PM +0800, Gavin Hu wrote:
> > > > Hi,
> > > > 
> > > > Yes. We should disable the printk_limit feature when panic to avoid missing
> > > > messages.
> > > 
> > > Sounds like you have been looking into it and have a good idea of what you
> > > want to do, why not try it and send some RFC patches ?
> > > 
> > > While at it, then we could consider doing different things depending on the
> > > message type. KERN_EMERG would disable preemption, whereas KERN_INFO may not be
> > > so critical to require it.
> > 
> > That might be a bit difficult to implement with complete correctness
> > given KERN_EMERG use and continuation lines.
> > 
> 
> Or just have the current context determine what to do. If printk() was
> called with preemption or interrupts disabled, it flushes the full
> buffer before returning, otherwise it allows the writes to console be
> preempted.

I am afraid that any variant of this approach will make the original
problem even worse. There will be more messages waiting and it will
increase the chance of the softlockup.

I guess that people call printk() with disabled interrupts or
preemption with the naive hope that it will be fast (just store
the message in the ring buffer and come back).

I think that we really need to break the console output when it
takes too long and schedule it somewhere else.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: printk: preempt_disable with long time resulting in softlockup/RCU stall issues
  2015-07-08 17:04     ` Luis R. Rodriguez
  2015-07-08 17:35       ` Joe Perches
@ 2015-07-09 14:10       ` Jan Kara
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Kara @ 2015-07-09 14:10 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Gavin Hu, Petr Mladek, linux-kernel, Alex Elder,
	Steven Rostedt (Red Hat),
	Joe Perches, Peter Hurley, Tejun Heo, cxie4, cldu, xjian, fswu,
	Jan Kara

On Wed 08-07-15 19:04:02, Luis R. Rodriguez wrote:
> On Wed, Jul 08, 2015 at 05:08:35PM +0800, Gavin Hu wrote:
> > Hi,
> > 
> > Yes. We should disable the printk_limit feature when panic to avoid missing
> > messages.
> 
> Sounds like you have been looking into it and have a good idea of what you
> want to do, why not try it and send some RFC patches ?

Well, there are patches which try to fix this problem. Petr has referenced
them (at least one posting of them) and we do carry them in SLES kernels
(since without them some big machines aren't able to boot with serial
console attached). The biggest obstacle for them to get merged was that:

a) Andrew Morton didn't like the additional complexity in the printk code.

b) Alan Cox was objecting that the problem isn't in the printk code but
rather in serial console drivers which are too slow to print and thus we
should instead implement some buffering there.

If people care, I can refresh my memory where we ended, refresh the patches
and repost them. Probably we should deal with this one way or another...

> While at it, then we could consider doing different things depending on the
> message type. KERN_EMERG would disable preemption, whereas KERN_INFO may not be
> so critical to require it.

I don't think keying on message type is a good idea. You want to keep
message ordering in the first place so KERN_EMERG message would have to
print everything before it in the printk ring buffer but with other levels
being async this may take even longer than before... So I don't think we
should do that unless really necessary. Printk code currently keys on
'oops_in_progress' to do some special hacks making messages more likely to
appear on console and my patches use the same to switch to a sync mode
where we write out everything we have regardless of how long it takes (at
that point we are doomed anyway so some additional softlockup won't make
matters much worse).

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-07-09 14:10 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CABiPGEe=5tO5ffopExaHWV=fCb98vD5gHirt6vgXciA3hDrOfg@mail.gmail.com>
2015-07-08  8:13 ` printk: preempt_disable with long time resulting in softlockup/RCU stall issues Petr Mladek
     [not found]   ` <CABiPGEcBUiBfJYAREbiM0LDQM+iNeOzKmC_oCaL0Zh_humH23Q@mail.gmail.com>
2015-07-08 17:04     ` Luis R. Rodriguez
2015-07-08 17:35       ` Joe Perches
2015-07-08 18:00         ` Steven Rostedt
2015-07-09  7:56           ` Petr Mladek
2015-07-09 14:10       ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).