All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
To: Petr Mladek <pmladek@suse.com>
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	kernel@gpiccoli.net, senozhatsky@chromium.org,
	rostedt@goodmis.org, john.ogness@linutronix.de,
	feng.tang@intel.com, kexec@lists.infradead.org,
	dyoung@redhat.com, keescook@chromium.org, anton@enomsg.org,
	ccross@android.com, tony.luck@intel.com
Subject: Re: [PATCH] panic: Move panic_print before kmsg dumpers
Date: Tue, 4 Jan 2022 09:35:42 -0300	[thread overview]
Message-ID: <4c3b3e5c-f037-8103-d10d-bd665316d426@igalia.com> (raw)
In-Reply-To: <YdQzU0KkAVq2BWpk@alley>

Hi Petr, thanks again for the very comprehensive response.
Comments inline below:


On 04/01/2022 08:45, Petr Mladek wrote:
>[...] 
> To make it clear. panic_printk_sys_info() might overwrite the oldest
> messages stored on the log buffer. This patch moves
> panic_print_sys_info() before kmsg_dump(). It means that even
> kmsg_dump() would not see the overwritten oldest messages.

Oh sure, I understand that, and that's fine - we can increase a lot the
size of the log buffer through "log_buf_len" (and it's OK to lose some
early entries in our case).
My first understanding of your message was that "panic_print_sys_info"
could "eat" recent messages in the log buffer due to the lack of flush -
and that would be potentially bad.


> 
> Good question. I am familiar only with the console problems and there
> are many problems. Serial consoles might be slow. All consoles use
> internal locks that might cause deadlocks. More complicated consoles
> are even more risky.
> 
> My experience is that kexec mostly works when there is enough reserved
> space for booting the crash kernel. And it can be tested before
> the crash.
> 
> So, I personally think that console_flush_on_panic() is more risky
> and should not get called before kexec().
> 
> Also kexec might be simply disabled when it does not work. But
> console_flush_on_panic() could not be disabled if we call it
> too early. We might make it configurable. But it would be
> to complicated for users to get it right.

OK, got it! So let's not mess with the flush machinery, seems a bit
dangerous...


> 
> I think that this is the best solution after all.
> 
> Well, I see one more problem. CONSOLE_REPLAY_ALL replays all messages
> on the console. This flag should be handled when it is now. I mean
> after kmsg_dump(), kexec(), console_flush_on_panic().
> 
> It is pity that PANIC_PRINT_ALL_PRINTK_MSG is in "panic_print" flags.
> It makes sense only for consoles. All the other flags make sense
> also for kmsg_dump().
> 
> A solution would be to add a new kernel parameter that would
> obsolete PANIC_PRINT_ALL_PRINTK_MSG. The parameter can be
> called panic_console_replay or so.
> 

Hmm..makes sense. I've thought a bit about this option, it's kinda odd
compared to "print memory status, print tasks"...it would fit into
another parameter, as you proposed (and I'd be happy to do it as part of
this series). But..with that said, I understand we have quite a big
number of parameters nowadays, and panic_print having this
"CONSOLE_REPLAY_ALL" is kind of a kernel API to userspace, so to avoid
any polemics (either due to "oh no, another parameter" or due to "can't
remove that, break userspace"), what if we split panic_print_sys_info
into an upper and lower parts? I can do that using a function parameter,
like this:

<...>
panic_print_sys_info(0);
kmsg_dump();
<... stuff ...>
panic_print_sys_info(1);

So, "panic_print_sys_info(1)" is the current call, like before this
patch - it would just do the "CONSOLE_REPLAY_ALL", whereas
"panic_print_sys_info(0)" do the rest, before the kmsg dumpers. What do
you think?

If you prefer the new kernel parameter than my idea, I'm all for that as
well - it's your choice.

Thanks again for the good discussion.
Cheers,


Guilherme

WARNING: multiple messages have this Message-ID (diff)
From: Guilherme G. Piccoli <gpiccoli@igalia.com>
To: kexec@lists.infradead.org
Subject: [PATCH] panic: Move panic_print before kmsg dumpers
Date: Tue, 4 Jan 2022 09:35:42 -0300	[thread overview]
Message-ID: <4c3b3e5c-f037-8103-d10d-bd665316d426@igalia.com> (raw)
In-Reply-To: <YdQzU0KkAVq2BWpk@alley>

Hi Petr, thanks again for the very comprehensive response.
Comments inline below:


On 04/01/2022 08:45, Petr Mladek wrote:
>[...] 
> To make it clear. panic_printk_sys_info() might overwrite the oldest
> messages stored on the log buffer. This patch moves
> panic_print_sys_info() before kmsg_dump(). It means that even
> kmsg_dump() would not see the overwritten oldest messages.

Oh sure, I understand that, and that's fine - we can increase a lot the
size of the log buffer through "log_buf_len" (and it's OK to lose some
early entries in our case).
My first understanding of your message was that "panic_print_sys_info"
could "eat" recent messages in the log buffer due to the lack of flush -
and that would be potentially bad.


> 
> Good question. I am familiar only with the console problems and there
> are many problems. Serial consoles might be slow. All consoles use
> internal locks that might cause deadlocks. More complicated consoles
> are even more risky.
> 
> My experience is that kexec mostly works when there is enough reserved
> space for booting the crash kernel. And it can be tested before
> the crash.
> 
> So, I personally think that console_flush_on_panic() is more risky
> and should not get called before kexec().
> 
> Also kexec might be simply disabled when it does not work. But
> console_flush_on_panic() could not be disabled if we call it
> too early. We might make it configurable. But it would be
> to complicated for users to get it right.

OK, got it! So let's not mess with the flush machinery, seems a bit
dangerous...


> 
> I think that this is the best solution after all.
> 
> Well, I see one more problem. CONSOLE_REPLAY_ALL replays all messages
> on the console. This flag should be handled when it is now. I mean
> after kmsg_dump(), kexec(), console_flush_on_panic().
> 
> It is pity that PANIC_PRINT_ALL_PRINTK_MSG is in "panic_print" flags.
> It makes sense only for consoles. All the other flags make sense
> also for kmsg_dump().
> 
> A solution would be to add a new kernel parameter that would
> obsolete PANIC_PRINT_ALL_PRINTK_MSG. The parameter can be
> called panic_console_replay or so.
> 

Hmm..makes sense. I've thought a bit about this option, it's kinda odd
compared to "print memory status, print tasks"...it would fit into
another parameter, as you proposed (and I'd be happy to do it as part of
this series). But..with that said, I understand we have quite a big
number of parameters nowadays, and panic_print having this
"CONSOLE_REPLAY_ALL" is kind of a kernel API to userspace, so to avoid
any polemics (either due to "oh no, another parameter" or due to "can't
remove that, break userspace"), what if we split panic_print_sys_info
into an upper and lower parts? I can do that using a function parameter,
like this:

<...>
panic_print_sys_info(0);
kmsg_dump();
<... stuff ...>
panic_print_sys_info(1);

So, "panic_print_sys_info(1)" is the current call, like before this
patch - it would just do the "CONSOLE_REPLAY_ALL", whereas
"panic_print_sys_info(0)" do the rest, before the kmsg dumpers. What do
you think?

If you prefer the new kernel parameter than my idea, I'm all for that as
well - it's your choice.

Thanks again for the good discussion.
Cheers,


Guilherme


  reply	other threads:[~2022-01-04 12:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-30 16:18 [PATCH] panic: Move panic_print before kmsg dumpers Guilherme G. Piccoli
2021-12-30 16:18 ` Guilherme G. Piccoli
2022-01-03 14:58 ` Petr Mladek
2022-01-03 14:58   ` Petr Mladek
2022-01-03 16:30   ` Guilherme G. Piccoli
2022-01-03 16:30     ` Guilherme G. Piccoli
2022-01-04 11:45     ` Petr Mladek
2022-01-04 11:45       ` Petr Mladek
2022-01-04 12:35       ` Guilherme G. Piccoli [this message]
2022-01-04 12:35         ` Guilherme G. Piccoli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c3b3e5c-f037-8103-d10d-bd665316d426@igalia.com \
    --to=gpiccoli@igalia.com \
    --cc=akpm@linux-foundation.org \
    --cc=anton@enomsg.org \
    --cc=ccross@android.com \
    --cc=dyoung@redhat.com \
    --cc=feng.tang@intel.com \
    --cc=john.ogness@linutronix.de \
    --cc=keescook@chromium.org \
    --cc=kernel@gpiccoli.net \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.