linux-kernel.vger.kernel.org archive mirror
* [PATCH] printk: Add CONFIG_CONSOLE_LOGLEVEL_PANIC
@ 2021-06-22 14:33 Dmitry Safonov
  2021-06-25  9:13 ` Petr Mladek
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Safonov @ 2021-06-22 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Andrew Morton, John Ogness,
	Petr Mladek, Sergey Senozhatsky, Steven Rostedt

console_verbose() increases console loglevel to CONSOLE_LOGLEVEL_MOTORMOUTH,
which provides more information to debug a panic/oops.

Unfortunately, in Arista we maintain some DUTs (Device Under Test) that
are configured to have 9600 baud rate. While verbose console messages
have their value to post-analyze crashes, on such setup they:
- may prevent panic/oops messages being printed
- take too long to flush on console resulting in watchdog reboot

In all our setups we use kdump which saves dmesg buffer after panic,
so in reality those extra messages on console provide no additional value,
but rather add risk of not getting to __crash_kexec().

Provide CONFIG_CONSOLE_LOGLEVEL_PANIC, which allows to choose how
verbose the kernel must be on oops/panic.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 include/linux/printk.h |  4 ++--
 lib/Kconfig.debug      | 13 +++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index fe7eb2351610..5a65a719f917 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -76,8 +76,8 @@ static inline void console_silent(void)
 
 static inline void console_verbose(void)
 {
-	if (console_loglevel)
-		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
+	if (console_loglevel && (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0))
+		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
 }
 
 /* strlen("ratelimit") + 1 */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 678c13967580..0c12cafd9d8b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -61,6 +61,19 @@ config CONSOLE_LOGLEVEL_QUIET
 	  will be used as the loglevel. IOW passing "quiet" will be the
 	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
 
+config CONSOLE_LOGLEVEL_PANIC
+	int "panic console loglevel (1-15)"
+	range 0 15
+	default "15"
+	help
+	  loglevel to use in kernel panic or oopses.
+
+	  Usually in order to provide more debug information on console upon
+	  panic, one wants to see everything being printed (loglevel = 15).
+	  With an exception to setups with low baudrate on serial console,
+	  keeping this value high is a good choice.
+	  0 value is to keep the loglevel during panic/oops unchanged.
+
 config MESSAGE_LOGLEVEL_DEFAULT
 	int "Default message log level (1-7)"
 	range 1 7
-- 
2.31.1



* Re: [PATCH] printk: Add CONFIG_CONSOLE_LOGLEVEL_PANIC
  2021-06-22 14:33 [PATCH] printk: Add CONFIG_CONSOLE_LOGLEVEL_PANIC Dmitry Safonov
@ 2021-06-25  9:13 ` Petr Mladek
  2021-06-25 12:17   ` Dmitry Safonov
  0 siblings, 1 reply; 5+ messages in thread
From: Petr Mladek @ 2021-06-25  9:13 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Andrew Morton, John Ogness,
	Sergey Senozhatsky, Steven Rostedt

On Tue 2021-06-22 15:33:50, Dmitry Safonov wrote:
> console_verbose() increases console loglevel to CONSOLE_LOGLEVEL_MOTORMOUTH,
> which provides more information to debug a panic/oops.
> 
> Unfortunately, in Arista we maintain some DUTs (Device Under Test) that
> are configured to have 9600 baud rate. While verbose console messages
> have their value to post-analyze crashes, on such setup they:
> - may prevent panic/oops messages being printed
> - take too long to flush on console resulting in watchdog reboot
> 
> In all our setups we use kdump which saves dmesg buffer after panic,
> so in reality those extra messages on console provide no additional value,
> but rather add risk of not getting to __crash_kexec().

It makes sense.

> Provide CONFIG_CONSOLE_LOGLEVEL_PANIC, which allows to choose how
> verbose the kernel must be on oops/panic.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Ogness <john.ogness@linutronix.de>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  include/linux/printk.h |  4 ++--
>  lib/Kconfig.debug      | 13 +++++++++++++
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index fe7eb2351610..5a65a719f917 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -76,8 +76,8 @@ static inline void console_silent(void)
>  
>  static inline void console_verbose(void)
>  {
> -	if (console_loglevel)
> -		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
> +	if (console_loglevel && (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0))
> +		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;

console_verbose() is called also in some other situations.
For example, check_hung_task(), oops_begin(), debug_locks_off().
These do not always lead to panic.

At minimum, the name is misleading. It should be something
like CONFIG_CONSOLE_LOGLEVEL_VERBOSE.

But the question is whether we really want to limit the loglevel
also in the non-panic scenarios. IMHO, it is a bad idea.

A better solution would be to introduce console_verbose_panic()
and use it only when it is really going to panic. The function
should also use the lower value only when crash dump is really
successfully enabled.
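
A rough sketch of that idea, written as a userspace model rather than
kernel code. The crash_dump_armed flag and the value 4 used for
CONFIG_CONSOLE_LOGLEVEL_PANIC are assumptions for illustration only; in
the kernel the check might be something like kexec_crash_loaded(), but
how it would be wired up is left open here.

```c
#include <stdbool.h>

#define CONSOLE_LOGLEVEL_MOTORMOUTH	15
/* Assumption: illustrative value standing in for the proposed Kconfig option. */
#define CONFIG_CONSOLE_LOGLEVEL_PANIC	4

/* Userspace stand-in for the kernel's console_loglevel. */
static int console_loglevel = 7;

/* Hypothetical flag: true when a crash kernel is loaded and ready. */
static bool crash_dump_armed;

/* Existing behaviour: go fully verbose unless the console is silenced. */
static void console_verbose(void)
{
	if (console_loglevel)
		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
}

/* Proposed helper: called only on paths that really end in panic(). */
static void console_verbose_panic(void)
{
	if (!console_loglevel)
		return;

	if (crash_dump_armed)
		/* The dump will carry the full log; keep the console short. */
		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
	else
		/* No dump available: the console is the only evidence. */
		console_verbose();
}
```

Other callers such as oops_begin() would keep using console_verbose()
unchanged in this model.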


>  }
>  
>  /* strlen("ratelimit") + 1 */
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 678c13967580..0c12cafd9d8b 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -61,6 +61,19 @@ config CONSOLE_LOGLEVEL_QUIET
>  	  will be used as the loglevel. IOW passing "quiet" will be the
>  	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
>  
> +config CONSOLE_LOGLEVEL_PANIC
> +	int "panic console loglevel (1-15)"

The range is 1-15 here.

> +	range 0 15

But it is 0-15 here. If you use "range 1 15" you should not need the
check (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0) in the code.

> +	default "15"
> +	help
> +	  loglevel to use in kernel panic or oopses.
> +
> +	  Usually in order to provide more debug information on console upon
> +	  panic, one wants to see everything being printed (loglevel = 15).
> +	  With an exception to setups with low baudrate on serial console,
> +	  keeping this value high is a good choice.
> +	  0 value is to keep the loglevel during panic/oops unchanged.

The trick with 0 value just makes things more complicated. The default
"15" does the same job and should be good enough. The hard-coded
default is good enough for the other CONSOLE_LOGLEVEL_* settings.

Best Regards,
Petr


* Re: [PATCH] printk: Add CONFIG_CONSOLE_LOGLEVEL_PANIC
  2021-06-25  9:13 ` Petr Mladek
@ 2021-06-25 12:17   ` Dmitry Safonov
  2021-06-28 12:43     ` Petr Mladek
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Safonov @ 2021-06-25 12:17 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, Dmitry Safonov, Andrew Morton, John Ogness,
	Sergey Senozhatsky, Steven Rostedt

Hi Petr, thanks for looking into this,

On 6/25/21 10:13 AM, Petr Mladek wrote:
> On Tue 2021-06-22 15:33:50, Dmitry Safonov wrote:
[..]
>> @@ -76,8 +76,8 @@ static inline void console_silent(void)
>>  
>>  static inline void console_verbose(void)
>>  {
>> -	if (console_loglevel)
>> -		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
>> +	if (console_loglevel && (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0))
>> +		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
> 
> console_verbose() is called also in some other situations.
> For example, check_hung_task(), oops_begin(), debug_locks_off().
> These do not always lead to panic.
>
> At minimum, the name is misleading. It should be something
> like CONFIG_CONSOLE_LOGLEVEL_VERBOSE.
>
> But the question is whether we really want to limit the loglevel
> also in the non-panic scenarios. IMHO, it is a bad idea.
>
> A better solution would be to introduce console_verbose_panic()
> and use it only when it is really going to panic. The function
> should also use the lower value only when crash dump is really
> successfully enabled.

Hmm, check_hung_task() calls it only if it's going to panic().
debug_locks_off() AFAICS is called only when there is something bad with
either lockdep itself or locks: they may get freed
[print_freed_lock_bug()] or lock is held on return to userspace
[lockdep_sys_exit()] and so on - when lockdep has to turn off. Arguably,
the situations are somewhat close to panic.
MCE calls it also just before panic.

So, the only one left is oops_begin().
I'm not sure what to do about it.
What do you think, should console_verbose() be called only under
panic_on_oops? Or should there be console_unverbose() to return the
loglevel in oops_end()? [that seems quite a bit ugly, considering that
there're already places that temporarily save loglevel and adding another
one is ugh]

Renaming console_verbose() to console_verbose_on_panic() or something
sounds good to me - I didn't do it only to keep the patch short.

>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 678c13967580..0c12cafd9d8b 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -61,6 +61,19 @@ config CONSOLE_LOGLEVEL_QUIET
>>  	  will be used as the loglevel. IOW passing "quiet" will be the
>>  	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
>>  
>> +config CONSOLE_LOGLEVEL_PANIC
>> +	int "panic console loglevel (1-15)"
> 
> The range is 1-15 here.
> 
>> +	range 0 15
> 
> But it is 0-15 here. If you use "range 1 15" you should not need the
> check (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0) in the code.
> 
>> +	default "15"
>> +	help
>> +	  loglevel to use in kernel panic or oopses.
>> +
>> +	  Usually in order to provide more debug information on console upon
>> +	  panic, one wants to see everything being printed (loglevel = 15).
>> +	  With an exception to setups with low baudrate on serial console,
>> +	  keeping this value high is a good choice.
>> +	  0 value is to keep the loglevel during panic/oops unchanged.
> 
> The trick with 0 value just makes things more complicated. The default
> "15" does the same job and should be good enough. The hard-coded
> default is good enough for the other CONSOLE_LOGLEVEL_* settings.

Well, "0" is kinda reverse to "15" - it doesn't change loglevel at all.
Actually, the original purpose of the patch is to have "0" :-)
I thought 0-15 would be better than just off or on to MOTORMOUTH.

Now, looking at it again, I think what may be even better:

: if (console_loglevel && (CONFIG_CONSOLE_LOGLEVEL_PANIC > console_loglevel))
:       console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;

that way I can get rid of "0".
What do you think?

Thanks,
          Dmitry


* Re: [PATCH] printk: Add CONFIG_CONSOLE_LOGLEVEL_PANIC
  2021-06-25 12:17   ` Dmitry Safonov
@ 2021-06-28 12:43     ` Petr Mladek
  2021-06-28 17:26       ` Dmitry Safonov
  0 siblings, 1 reply; 5+ messages in thread
From: Petr Mladek @ 2021-06-28 12:43 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Andrew Morton, John Ogness,
	Sergey Senozhatsky, Steven Rostedt

On Fri 2021-06-25 13:17:43, Dmitry Safonov wrote:
> Hi Petr, thanks for looking into this,
> 
> On 6/25/21 10:13 AM, Petr Mladek wrote:
> > On Tue 2021-06-22 15:33:50, Dmitry Safonov wrote:
> [..]
> >> @@ -76,8 +76,8 @@ static inline void console_silent(void)
> >>  
> >>  static inline void console_verbose(void)
> >>  {
> >> -	if (console_loglevel)
> >> -		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
> >> +	if (console_loglevel && (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0))
> >> +		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
> > 
> > console_verbose() is called also in some other situations.
> > For example, check_hung_task(), oops_begin(), debug_locks_off().
> > These do not always lead to panic.
> >
> > At minimum, the name is misleading. It should be something
> > like CONFIG_CONSOLE_LOGLEVEL_VERBOSE.
> 
> Hmm, check_hung_task() calls it only if it's going to panic().

Yup.

> debug_locks_off() AFAICS is called only when there is something bad with
> either lockdep itself or locks: they may get freed
> [print_freed_lock_bug()] or lock is held on return to userspace
> [lockdep_sys_exit()] and so on - when lockdep has to turn off. Arguably,
> the situations are somewhat close to panic.

"Somewhat close to panic" is not enough. The important thing is to
see the messages in these critical situations. One way is the verbose
console. The other way is the crash dump. And the crashdump is
generated only when panic() is really called.

BTW: It might actually be better to handle this using a command line
option instead of a build option. The build option prevents debugging
the problem when crash dump fails for some reason. It is always
much easier to remove a command line option than to rebuild and install
another kernel.
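
To illustrate the command line approach, here is a minimal userspace
model. Nothing in it is kernel API: the flag name is only a placeholder,
and cmdline_has_option() and parse_cmdline() are hypothetical helpers. In
the kernel the flag would more likely be registered via __setup() or
early_param().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CONSOLE_LOGLEVEL_MOTORMOUTH	15

/* Userspace stand-in for the kernel's console_loglevel. */
static int console_loglevel = 7;

/* Placeholder flag; set when the option is present on the command line. */
static bool no_console_verbose_panic;

/* Return true if the whitespace-separated token opt appears in cmdline. */
static bool cmdline_has_option(const char *cmdline, const char *opt)
{
	size_t optlen = strlen(opt);
	const char *p = cmdline;

	if (optlen == 0)
		return false;

	while (*p) {
		size_t len;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of this token */
		if (len == optlen && strncmp(p, opt, optlen) == 0)
			return true;
		p += len;
	}
	return false;
}

/* Hypothetical boot-time hook: record whether the option was given. */
static void parse_cmdline(const char *cmdline)
{
	no_console_verbose_panic =
		cmdline_has_option(cmdline, "no_console_verbose_panic");
}

static void console_verbose(void)
{
	if (console_loglevel)
		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
}

/* Panic path: stay at the current level when the option is set. */
static void console_verbose_panic(void)
{
	if (!no_console_verbose_panic)
		console_verbose();
}
```

Dropping the option from the boot line restores the verbose behaviour
without a rebuild, which is the point made above.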

> MCE calls it also just before panic.

yup.

> So, the only left is oops_begin().
> I'm not sure what to do about it.
> What do you think, should console_verbose() be called only under
> panic_on_oops?

No. IMHO, it is even more important to call console_verbose()
when panic_on_oops is disabled. Oops means that there is a high
risk that the system might crash. And the worst thing is a silent
crash (no message, no crashdump).


> Or should there be console_unverbose() to return the
> loglevel in oops_end()? [that seems quite a bit ugly, considering that
> there're already places that temporarily save loglevel and adding another
> one is ugh]

Yeah, the temporary console_loglevel changes are ugly. They might be
racy if two processes manipulate the loglevel in parallel.
We should keep them to a minimum or, better, remove them altogether.


> Renaming console_verbose() to console_verbose_on_panic() or something
> sounds good to me - I didn't do it only to keep the patch short.

Yup, it looks like the most reasonable approach to me.


> >> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> >> index 678c13967580..0c12cafd9d8b 100644
> >> --- a/lib/Kconfig.debug
> >> +++ b/lib/Kconfig.debug
> >> @@ -61,6 +61,19 @@ config CONSOLE_LOGLEVEL_QUIET
> >>  	  will be used as the loglevel. IOW passing "quiet" will be the
> >>  	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
> >>  
> >> +config CONSOLE_LOGLEVEL_PANIC
> >> +	int "panic console loglevel (1-15)"
> > 
> > The range is 1-15 here.
> > 
> >> +	range 0 15
> > 
> > But it is 0-15 here. If you use "range 1 15" you should not need the
> > check (CONFIG_CONSOLE_LOGLEVEL_PANIC > 0) in the code.
> > 
> >> +	default "15"
> >> +	help
> >> +	  loglevel to use in kernel panic or oopses.
> >> +
> >> +	  Usually in order to provide more debug information on console upon
> >> +	  panic, one wants to see everything being printed (loglevel = 15).
> >> +	  With an exception to setups with low baudrate on serial console,
> >> +	  keeping this value high is a good choice.
> >> +	  0 value is to keep the loglevel during panic/oops unchanged.
> > 
> > The trick with 0 value just makes things more complicated. The default
> > "15" does the same job and should be good enough. The hard-coded
> > default is good enough for the other CONSOLE_LOGLEVEL_* settings.
> 
> Well, "0" is kinda reverse to "15" - it doesn't change loglevel at all.
> Actually, the original purpose of the patch is to have "0" :-)

Is it enough to keep the current level during panic()? It might be
easier to introduce a commandline option, for example, no_console_verbose_panic.
It would do:

static inline void console_verbose_panic(void)
{
	if (!no_console_verbose_panic)
		console_verbose();
}

It is clear what it does. On the other hand, the logic with particular
loglevels is not clear. 3 different proposals have already been mentioned
in this thread:

	if (console_loglevel &&
	    (CONFIG_CONSOLE_LOGLEVEL_PANIC > console_loglevel)) {
		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
	}

vs.

	if (console_loglevel)
		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;

vs.

	if (console_loglevel && CONFIG_CONSOLE_LOGLEVEL_PANIC)
		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;


Just imagine that you are a distributor, developer or admin:

   What value would you choose for CONFIG_CONSOLE_LOGLEVEL_PANIC?
   What console loglevel will be used at the end?

The answer depends on the implemented algorithm, console_loglevel,
and CONFIG_CONSOLE_LOGLEVEL_PANIC.

The answer would be much easier if "no_console_verbose_panic" is
used instead.
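
To make the difference concrete, the three variants can be modelled as
pure functions in userspace. This is only a sketch: cur stands for
console_loglevel, cfg for CONFIG_CONSOLE_LOGLEVEL_PANIC, and the function
names are made up for this comparison.

```c
/* Variant 1: only ever raise the level, never lower it. */
static int raise_only(int cur, int cfg)
{
	if (cur && cfg > cur)
		return cfg;
	return cur;
}

/* Variant 2: always use the configured level unless the console is silent. */
static int always_set(int cur, int cfg)
{
	if (cur)
		return cfg;
	return cur;
}

/* Variant 3: as variant 2, but a configured 0 means "leave unchanged". */
static int set_unless_zero(int cur, int cfg)
{
	if (cur && cfg)
		return cfg;
	return cur;
}
```

With cur = 7 and cfg = 4 the results are 7, 4 and 4. With cur = 7 and
cfg = 0 they are 7, 0 and 7, so the second variant would silence the
console entirely. The same config value ends up meaning three different
things.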

Best Regards,
Petr


* Re: [PATCH] printk: Add CONFIG_CONSOLE_LOGLEVEL_PANIC
  2021-06-28 12:43     ` Petr Mladek
@ 2021-06-28 17:26       ` Dmitry Safonov
  0 siblings, 0 replies; 5+ messages in thread
From: Dmitry Safonov @ 2021-06-28 17:26 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, Dmitry Safonov, Andrew Morton, John Ogness,
	Sergey Senozhatsky, Steven Rostedt

On 6/28/21 1:43 PM, Petr Mladek wrote:
[..]
> Is it enough to keep the current level during panic()?

Yes.

> It might be
> easier to introduce a commandline option, for example, no_console_verbose_panic.
> It would do:
> 
> static inline void console_verbose_panic(void)
> {
> 	if (!no_console_verbose_panic)
> 		console_verbose();
> }
> 
> It is clear what it does. On the other hand, the logic with particular
> loglevels is not clear. 3 different proposals have already been mentioned
> in this thread:
> 
> 	if (console_loglevel &&
> 	    (CONFIG_CONSOLE_LOGLEVEL_PANIC > console_loglevel)) {
> 		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
> 	}
> 
> vs.
> 
> 	if (console_loglevel)
> 		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
> 
> vs.
> 
> 	if (console_loglevel && CONFIG_CONSOLE_LOGLEVEL_PANIC)
> 		console_loglevel = CONFIG_CONSOLE_LOGLEVEL_PANIC;
> 
> 
> Just imagine that you are a distributor, developer or admin:
> 
>    What value you would choose for CONFIG_CONSOLE_LOGLEVEL_PANIC?
>    What console loglevel will be used at the end?
> 
> The answer depends on the implemented algorithm, console_loglevel,
> and CONFIG_CONSOLE_LOGLEVEL_PANIC.
> 
> The answer would be much easier if "no_console_verbose_panic" is
> used instead.

Thanks for your replies, Petr, I'll send v2 with the function rename
patch and a patch to introduce this boot option, after the merge window
closes. I appreciate your inputs :-)

Thanks,
          Dmitry
