From: Petr Mladek <pmladek@suse.com>
To: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
dyoung@redhat.com, linux-doc@vger.kernel.org, bhe@redhat.com,
vgoyal@redhat.com, stern@rowland.harvard.edu,
akpm@linux-foundation.org, andriy.shevchenko@linux.intel.com,
corbet@lwn.net, halves@canonical.com, kernel@gpiccoli.net,
Will Deacon <will@kernel.org>, Kees Cook <keescook@chromium.org>,
Steven Rostedt <rostedt@goodmis.org>,
Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
John Ogness <john.ogness@linutronix.de>,
"Paul E. McKenney" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Juergen Gross <jgross@suse.com>
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter
Date: Thu, 20 Jan 2022 16:14:33 +0100
Message-ID: <Yel8WQiBn/HNQN83@alley>
In-Reply-To: <20220108153451.195121-1-gpiccoli@igalia.com>
Adding some more people to Cc. Some of them modified this logic in
the past. Others are familiar with interesting areas where the panic
notifiers are used.
On Sat 2022-01-08 12:34:51, Guilherme G. Piccoli wrote:
> The kernel notifier infrastructure allows function callbacks to be
> added in multiple lists, which are then called in the proper time,
> like in a reboot or panic event. The panic_notifier_list specifically
> contains the callbacks that are executed during a panic event. As any
> other notifier list, the panic one has no filtering and all functions
> previously registered are executed.
>
> The kdump infrastructure, on the other hand, enables users to set
> a crash kernel that is kexec'ed in a panic event, and vmcore/logs
> are collected in such crash kernel. When kdump is set, by default
> the panic notifiers are ignored - the kexec jumps to the crash kernel
> before the list is checked and callbacks executed.
>
> There are some cases though in which kdump users might want to
> allow panic notifier callbacks to execute _before_ the kexec to
> the crash kernel, for a variety of reasons - for example, users
> may think kexec is very prone to fail and want to give a chance
> to kmsg dumpers to run (and save logs using pstore),
Yes, this seems to be the original intention of the
"crash_kexec_post_notifiers" option, see the commit
f06e5153f4ae2e2f3b0300f ("kernel/panic.c: add
"crash_kexec_post_notifiers" option for kdump after panic_notifiers").
> some panic notifier is required to properly quiesce some hardware
> that must be used to the crash kernel.
Do you have an example, please? The above-mentioned commit
says that "crash_kexec_post_notifiers" actually increases the risk
of kdump failure.
Note that kmsg_dump() is called after the notifiers only because
some of them print more information, see the commit
6723734cdff15211bb78a ("panic: call panic handlers before kmsg_dump").
The notifiers might still increase the chance that kmsg_dump() will
never be called.
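The ordering discussed above can be modelled in plain userspace C. This is
only a sketch, assuming panic() attempts crash_kexec either before or after
the notifiers and kmsg_dump() depending on "crash_kexec_post_notifiers", and
that a successful kexec never returns. The names model_panic_order() and
record(), and the one-letter trace, are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace: 'K' = crash_kexec attempt, 'N' = notifiers, 'D' = kmsg_dump. */
static char trace[16];
static size_t trace_len;

static void record(char step)
{
	if (trace_len + 1 < sizeof(trace)) {
		trace[trace_len++] = step;
		trace[trace_len] = '\0';
	}
}

/*
 * Models only the order of the steps, not what they do.
 * kdump_loaded == true means the kexec succeeds and never returns,
 * so nothing after 'K' is recorded.
 */
const char *model_panic_order(bool post_notifiers, bool kdump_loaded)
{
	trace_len = 0;
	trace[0] = '\0';

	if (!post_notifiers) {
		record('K');
		if (kdump_loaded)
			return trace;	/* jumped to the crash kernel */
	}

	record('N');	/* a broken notifier here can prevent everything below */
	record('D');

	if (post_notifiers)
		record('K');

	return trace;
}
```

With the default setting and a loaded crash kernel the trace is just "K". With
the option enabled it becomes "NDK", which shows why every notifier then sits
on the path that kdump depends on.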
> But there's a problem: currently it's an "all-or-nothing" situation,
> the kdump user choice is either to execute all panic notifiers or
> none of them. Given that panic notifiers may increase the risk of a
> kdump failure, this is a tough decision and may affect the debug of
> hard to reproduce bugs, if for some reason the user choice is to
> enable panic notifiers, but kdump then fails.
>
> So, this patch aims to ease this decision: we hereby introduce a filter
> for the panic notifier list, in which users may select specifically
> which callbacks they wish to run, allowing a safer kdump. The allowlist
> should be provided using the parameter "panic_notifier_filter=a,b,..."
> where a, b are valid callback names. Invalid symbols are discarded.
I am afraid that this is an almost unusable solution:
  + requires deep knowledge of what each notifier does
  + might need debugging to find which notifier causes problems
  + the list might need to be updated when new notifiers are added
  + function names are an implementation detail and might change
  + requires kallsyms
It is only a workaround for the real problem. The problem is that
"panic_notifier_list" is used for many purposes that break
each other.
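To illustrate the point about function names, here is a minimal userspace
sketch of a name-based allowlist. It is not the code from the patch; the
helper name_allowed() and the renamed symbol "rcu_panic_notify" are
hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true when 'name' exactly matches one entry of the
 * comma-separated 'filter' string (hypothetical helper).
 */
bool name_allowed(const char *filter, const char *name)
{
	size_t name_len = strlen(name);
	const char *p = filter;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len == name_len && strncmp(p, name, len) == 0)
			return true;
		if (!end)
			break;
		p = end + 1;
	}
	return false;
}
```

An allowlist of "hung_task_panic,rcu_panic" matches rcu_panic today. If the
callback were ever renamed, for example to the made-up "rcu_panic_notify", the
same command line would silently stop selecting it.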
I checked some notifiers and found a few groups:
  + disable watchdogs:
      + hung_task_panic()
      + rcu_panic()
  + dump information:
      + kernel_offset_notifier()
      + trace_panic_handler() (duplicate of panic_print=0x10)
  + inform hypervisor:
      + xen_panic_event()
      + pvpanic_panic_notify()
      + hyperv_panic_event()
  + misc cleanup / flush / blinking:
      + panic_event() in ipmi_msghandler.c
      + panic_happened() in heartbeat.c
      + led_trigger_panic_notifier()
IMHO, the right solution is to split the callbacks into 2 or more
notifier lists. Then we might rework panic() to do:
void panic(void)
{
	[...]

	/* stop watchdogs + extra info */
	atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
	atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
	panic_print_sys_info();

	/* crash_kexec + kmsg_dump in configurable order */
	if (!_crash_kexec_post_kmsg_dump) {
		__crash_kexec(NULL);
		smp_send_stop();
	} else {
		crash_smp_send_stop();
	}

	kmsg_dump();

	if (_crash_kexec_post_kmsg_dump)
		__crash_kexec(NULL);

	/* infinite loop or reboot */
	atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
	atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);

	console_flush_on_panic(CONSOLE_FLUSH_PENDING);

	if (panic_timeout >= 0) {
		timeout();
		emergency_restart();
	}

	for (i = 0; ; i += PANIC_TIMER_STEP) {
		if (i >= i_next) {
			i += panic_blink(state ^= 1);
			i_next = i + 3600 / PANIC_BLINK_SPD;
		}
		mdelay(PANIC_TIMER_STEP);
	}
}
Two notifier lists might be enough in the above scenario. I would call
them:
	panic_pre_dump_notifier_list
	panic_post_dump_notifier_list
It is a real solution that will help everyone. It is more complicated now
but it will make things much easier in the long term. And it might be done
step by step:
1. introduce the two notifier lists
2. convert all users: one by one
3. remove the original notifier list when there is no user
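The two-list idea can be sketched in userspace C as follows. This is only an
illustration under stated assumptions: struct notifier_list,
notifier_register(), notifier_call_chain(), the fake_*() helpers and the
sample callbacks are all hypothetical stand-ins, not the kernel notifier API,
and the one-letter trace exists only to make the ordering visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CALLBACKS 8

typedef void (*panic_cb)(const char *msg);

/* Hypothetical stand-in for a kernel notifier chain. */
struct notifier_list {
	panic_cb cbs[MAX_CALLBACKS];
	size_t count;
};

struct notifier_list pre_dump_list;	/* watchdogs, extra info */
struct notifier_list post_dump_list;	/* hypervisor, cleanup, blinking */

int notifier_register(struct notifier_list *list, panic_cb cb)
{
	if (list->count >= MAX_CALLBACKS)
		return -1;
	list->cbs[list->count++] = cb;
	return 0;
}

void notifier_call_chain(const struct notifier_list *list, const char *msg)
{
	for (size_t i = 0; i < list->count; i++)
		list->cbs[i](msg);
}

/* Trace of executed steps, one letter per step. */
static char order[32];

static void note(const char *tag)
{
	strncat(order, tag, sizeof(order) - strlen(order) - 1);
}

void stop_watchdog(const char *msg)   { (void)msg; note("W"); }
void tell_hypervisor(const char *msg) { (void)msg; note("H"); }

static void fake_kmsg_dump(void) { note("D"); }

/* Returns nonzero when a crash kernel is loaded, i.e. the kexec "succeeds". */
static int fake_crash_kexec(int loaded) { note("K"); return loaded; }

const char *model_panic(int kexec_post_kmsg_dump, int kdump_loaded)
{
	order[0] = '\0';

	/* The pre-dump list always runs, so it must stay small and safe. */
	notifier_call_chain(&pre_dump_list, "panic");

	if (!kexec_post_kmsg_dump && fake_crash_kexec(kdump_loaded))
		return order;

	fake_kmsg_dump();

	if (kexec_post_kmsg_dump && fake_crash_kexec(kdump_loaded))
		return order;

	/* Reached only when no crash kernel took over. */
	notifier_call_chain(&post_dump_list, "panic");
	return order;
}
```

In this model the risky callbacks live in the post-dump list and never run
ahead of a successful kexec, so no per-callback filtering is needed.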
Best Regards,
Petr