All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Machek <pavel@ucw.cz>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Ye Xiaolong <xiaolong.ye@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Petr Mladek <pmladek@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jslaby@suse.com>, Len Brown <len.brown@intel.com>,
	linux-kernel@vger.kernel.org, lkp@01.org
Subject: Re: [printk]  fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage
Date: Fri, 7 Apr 2017 10:14:49 +0200	[thread overview]
Message-ID: <20170407081449.GA12859@amd> (raw)
In-Reply-To: <20170407074634.GB1091@jagdpanzerIV.localdomain>

[-- Attachment #1: Type: text/plain, Size: 3102 bytes --]

On Fri 2017-04-07 16:46:34, Sergey Senozhatsky wrote:
> On (04/07/17 09:15), Pavel Machek wrote:
> > On Fri 2017-04-07 13:44:40, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > On (04/06/17 19:33), Pavel Machek wrote:
> > > > > This patch set gives up part of the printk() reliability for bounded
> > > > > latency (at least unless we detect we are really in trouble) which is IMHO
> > > > > a good trade-off for lots of users (and others can just turn this feature
> > > > > off).
> > > > 
> > > > If they can ever realize they were bitten by this feature.
> > > > 
> > > > Can we go for different tradeoff?
> > > > 
> > > > In console_unlock(), if you detect too much work, print "Too many
> > > > messages to print, %d bytes delayed" and wake up kernel thread.
> > > 
> > > "too many messages" is undefined. console_unlock() can be called from
> > > IRQ handler or with preemtion disabled, or under spin_lock, or under
> > > RCU read lock, etc. etc. By the time we decide to wake up printk_kthread
> > > from console_unlock() it may be already too late.
> > 
> > So lets define "too many messages" as 240 characters. We know printk
> > worked rather well for us for more than 20 years. Kernel code is used
> > to printk taking few miliseconds.
> 
> serial console can be quite slow. and port->lock, that is acquired by
> console_unlock()->call_console_drivers()->write(), is also accessible
> by serial driver's IRQ handler, and this lock may be busy long
> enough -- as long as that IRQ handler transmits/receives chars. but
> that's not the point.

Well. This is what we had for 20 years.

> [..]
> > Yeah? So you know modified printk() does not work, that's why
> > "emergency mode" exists. Unfortunately, you can't rely on fact that
> > you can detect half-crashed machines by printk levels. You usually
> > can't.
> 
> I'm not happy with those printk_emergency_begin()/end(), sure. but that's
> the reality -- every single solution that would offload printing duty implies
> that there will be cases when offloading would not be possible. either
> PENDING_PRINTK_IPI to other CPUs, or irq_work(PENDING_OUTPUT) on a local CPU,
> or anything else (um... what it is?... softirq? tasklet? print one logbuf
> entry from every IRQ handler? dunno, anything else?). There will be cases
> when we won't be able to expect that something will take over and finish
> printing for us. Well, may be I'm missing some other solution that would
> offload printing, eliminating lockup conditions, and at the same time work
> in 100% of the cases.

I don't have magic solution in my sleeve. You made a good case that
spending 30 seconds in printk() is a bad idea. I agree with that. Your
solution is to introduce printk_emergency_begin()/end(). I don't agree
there.

I believe "spend at most 2 seconds in printk(), then print a warning
and offload" is a solution closer to what we had before.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Pavel Machek <pavel@ucw.cz>
To: lkp@lists.01.org
Subject: Re: [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage
Date: Fri, 07 Apr 2017 10:14:49 +0200	[thread overview]
Message-ID: <20170407081449.GA12859@amd> (raw)
In-Reply-To: <20170407074634.GB1091@jagdpanzerIV.localdomain>

[-- Attachment #1: Type: text/plain, Size: 3102 bytes --]

On Fri 2017-04-07 16:46:34, Sergey Senozhatsky wrote:
> On (04/07/17 09:15), Pavel Machek wrote:
> > On Fri 2017-04-07 13:44:40, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > On (04/06/17 19:33), Pavel Machek wrote:
> > > > > This patch set gives up part of the printk() reliability for bounded
> > > > > latency (at least unless we detect we are really in trouble) which is IMHO
> > > > > a good trade-off for lots of users (and others can just turn this feature
> > > > > off).
> > > > 
> > > > If they can ever realize they were bitten by this feature.
> > > > 
> > > > Can we go for different tradeoff?
> > > > 
> > > > In console_unlock(), if you detect too much work, print "Too many
> > > > messages to print, %d bytes delayed" and wake up kernel thread.
> > > 
> > > "too many messages" is undefined. console_unlock() can be called from
> > > IRQ handler or with preemtion disabled, or under spin_lock, or under
> > > RCU read lock, etc. etc. By the time we decide to wake up printk_kthread
> > > from console_unlock() it may be already too late.
> > 
> > So lets define "too many messages" as 240 characters. We know printk
> > worked rather well for us for more than 20 years. Kernel code is used
> > to printk taking few miliseconds.
> 
> serial console can be quite slow. and port->lock, that is acquired by
> console_unlock()->call_console_drivers()->write(), is also accessible
> by serial driver's IRQ handler, and this lock may be busy long
> enough -- as long as that IRQ handler transmits/receives chars. but
> that's not the point.

Well. This is what we had for 20 years.

> [..]
> > Yeah? So you know modified printk() does not work, that's why
> > "emergency mode" exists. Unfortunately, you can't rely on fact that
> > you can detect half-crashed machines by printk levels. You usually
> > can't.
> 
> I'm not happy with those printk_emergency_begin()/end(), sure. but that's
> the reality -- every single solution that would offload printing duty implies
> that there will be cases when offloading would not be possible. either
> PENDING_PRINTK_IPI to other CPUs, or irq_work(PENDING_OUTPUT) on a local CPU,
> or anything else (um... what it is?... softirq? tasklet? print one logbuf
> entry from every IRQ handler? dunno, anything else?). There will be cases
> when we won't be able to expect that something will take over and finish
> printing for us. Well, may be I'm missing some other solution that would
> offload printing, eliminating lockup conditions, and at the same time work
> in 100% of the cases.

I don't have magic solution in my sleeve. You made a good case that
spending 30 seconds in printk() is a bad idea. I agree with that. Your
solution is to introduce printk_emergency_begin()/end(). I don't agree
there.

I believe "spend at most 2 seconds in printk(), then print a warning
and offload" is a solution closer to what we had before.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

  reply	other threads:[~2017-04-07  8:14 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-29  9:25 [RFC][PATCHv2 0/8] printk: introduce printing kernel thread Sergey Senozhatsky
2017-03-29  9:25 ` [RFC][PATCHv2 1/8] printk: move printk_pending out of per-cpu Sergey Senozhatsky
2017-03-31 13:09   ` Petr Mladek
2017-03-31 13:33     ` Peter Zijlstra
2017-04-03 11:23       ` Sergey Senozhatsky
2017-04-03 12:43         ` Petr Mladek
2017-03-29  9:25 ` [RFC][PATCHv2 2/8] printk: introduce printing kernel thread Sergey Senozhatsky
2017-04-04  9:01   ` Petr Mladek
2017-04-04  9:36     ` Sergey Senozhatsky
2017-04-06 17:14   ` Pavel Machek
2017-04-07  5:12     ` Sergey Senozhatsky
2017-04-07  7:21       ` Pavel Machek
2017-04-07  8:15         ` Sergey Senozhatsky
2017-04-07 12:06           ` Pavel Machek
2017-03-29  9:25 ` [RFC][PATCHv2 3/8] printk: offload printing from wake_up_klogd_work_func() Sergey Senozhatsky
2017-03-31 14:56   ` Petr Mladek
2017-04-04 16:15     ` Sergey Senozhatsky
2017-03-29  9:25 ` [RFC][PATCHv2 4/8] pm: switch to printk.emergency mode in unsafe places Sergey Senozhatsky
2017-03-31 15:06   ` Petr Mladek
2017-04-06 17:20   ` Pavel Machek
2017-04-09 10:59     ` Andreas Mohr
2017-04-10 12:20       ` Petr Mladek
2017-04-10 14:38         ` Sergey Senozhatsky
2017-03-29  9:25 ` [RFC][PATCHv2 5/8] sysrq: " Sergey Senozhatsky
2017-03-31 15:37   ` Petr Mladek
2017-04-01  0:04     ` Sergey Senozhatsky
2017-03-29  9:25 ` [RFC][PATCHv2 6/8] kexec: " Sergey Senozhatsky
2017-03-31 15:39   ` Petr Mladek
2017-03-29  9:25 ` [RFC][PATCHv2 7/8] printk: add printk emergency_mode parameter Sergey Senozhatsky
2017-04-03 15:29   ` Petr Mladek
2017-04-04  8:29     ` Sergey Senozhatsky
2017-03-29  9:25 ` [RFC][PATCHv2 8/8] printk: enable printk offloading Sergey Senozhatsky
2017-03-30 21:38   ` [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage kernel test robot
2017-03-30 21:38     ` kernel test robot
2017-03-31  2:35     ` Sergey Senozhatsky
2017-03-31  2:35       ` Sergey Senozhatsky
2017-03-31  4:04       ` Sergey Senozhatsky
2017-03-31  4:04         ` Sergey Senozhatsky
2017-03-31  6:39         ` Ye Xiaolong
2017-03-31  6:39           ` Ye Xiaolong
2017-03-31 14:47           ` Sergey Senozhatsky
2017-03-31 14:47             ` Sergey Senozhatsky
2017-03-31 15:28             ` Eric W. Biederman
2017-03-31 15:28               ` Eric W. Biederman
2017-04-03  9:31               ` Jan Kara
2017-04-03  9:31                 ` Jan Kara
2017-04-03 10:06                 ` Petr Mladek
2017-04-03 10:06                   ` Petr Mladek
2017-04-06 17:33                 ` Pavel Machek
2017-04-06 17:33                   ` Pavel Machek
2017-04-07  4:44                   ` Sergey Senozhatsky
2017-04-07  4:44                     ` Sergey Senozhatsky
2017-04-07  7:15                     ` Pavel Machek
2017-04-07  7:15                       ` Pavel Machek
2017-04-07  7:46                       ` Sergey Senozhatsky
2017-04-07  7:46                         ` Sergey Senozhatsky
2017-04-07  8:14                         ` Pavel Machek [this message]
2017-04-07  8:14                           ` Pavel Machek
2017-04-07 12:10                           ` Sergey Senozhatsky
2017-04-07 12:10                             ` Sergey Senozhatsky
2017-04-07 12:44                             ` Pavel Machek
2017-04-07 12:44                               ` Pavel Machek
2017-04-07 14:40                               ` Steven Rostedt
2017-04-07 14:40                                 ` Steven Rostedt
2017-05-08  6:37                                 ` Sergey Senozhatsky
2017-05-08  6:37                                   ` Sergey Senozhatsky
2017-05-17 13:13                                   ` Petr Mladek
2017-05-17 13:13                                     ` Petr Mladek
2017-04-07 15:13                               ` Sergey Senozhatsky
2017-04-07 15:13                                 ` Sergey Senozhatsky
2017-04-07 15:23                                 ` Peter Zijlstra
2017-04-07 15:23                                   ` Peter Zijlstra
2017-04-07 15:40                                   ` Sergey Senozhatsky
2017-04-07 15:40                                     ` Sergey Senozhatsky
2017-04-09 18:21                                     ` Eric W. Biederman
2017-04-09 18:21                                       ` Eric W. Biederman
2017-04-10  4:46                                       ` Sergey Senozhatsky
2017-04-10  4:46                                         ` Sergey Senozhatsky
2017-04-09 10:12                                 ` Pavel Machek
2017-04-09 10:12                                   ` Pavel Machek
2017-04-10  4:53                                   ` Sergey Senozhatsky
2017-04-10  4:53                                     ` Sergey Senozhatsky
2017-04-10 11:54                                     ` Petr Mladek
2017-04-10 11:54                                       ` Petr Mladek
2017-04-10 15:08                                       ` Sergey Senozhatsky
2017-04-10 15:08                                         ` Sergey Senozhatsky
2017-04-10 18:48                                     ` Pavel Machek
2017-04-10 18:48                                       ` Pavel Machek
2017-04-11  1:46                                       ` Sergey Senozhatsky
2017-04-11  1:46                                         ` Sergey Senozhatsky
2017-04-11 16:19                                         ` Sergey Senozhatsky
2017-04-12 18:43                                           ` Pavel Machek
2017-04-13  4:34                                             ` Sergey Senozhatsky
2017-04-13  5:50                                           ` Sergey Senozhatsky
2017-04-13  8:19                                             ` Sergey Senozhatsky
2017-04-13 14:03                                           ` Petr Mladek
2017-04-14  4:42                                             ` Sergey Senozhatsky
2017-04-07 14:29                           ` Steven Rostedt
2017-04-07 14:29                             ` Steven Rostedt
2017-04-09  9:57                             ` Pavel Machek
2017-04-09  9:57                               ` Pavel Machek
2017-04-03 10:51               ` Sergey Senozhatsky
2017-04-03 10:51                 ` Sergey Senozhatsky
2017-04-05  7:29           ` Ye Xiaolong
2017-04-05  7:29             ` Ye Xiaolong
2017-04-05  8:40             ` Sergey Senozhatsky
2017-04-05  8:40               ` Sergey Senozhatsky
2017-04-03 15:42   ` [RFC][PATCHv2 8/8] printk: enable printk offloading Petr Mladek
2017-04-04 13:20     ` Sergey Senozhatsky
2017-04-02  4:13 [printk] fbc14616f4 BUG: kernel reboot-without-warning in test stage Fengguang Wu
2017-04-03  2:14 ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170407081449.GA12859@amd \
    --to=pavel@ucw.cz \
    --cc=akpm@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jack@suse.cz \
    --cc=jslaby@suse.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rjw@rjwysocki.net \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=xiaolong.ye@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.