linux-kernel.vger.kernel.org archive mirror
* [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
@ 2020-09-18  3:25 Dave Young
  2020-09-19  0:47 ` Andrew Morton
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Young @ 2020-09-18  3:25 UTC (permalink / raw)
  To: Andrew Morton, bhe, Eric Biederman, linux-kernel, kexec

crash_kexec_post_notifiers makes the kernel run the various panic
notifiers before booting the kdump kernel. This increases the risk of
kdump failure, as is well documented in kernel-parameters.txt. We do not
suggest enabling it together with kdump unless the user is really sure
about it. Nor should distributions enable it by default without users
being aware of it.

But unfortunately it is enabled by default in systemd. See the
discussion in the systemd issue below; we could not convince systemd to
change it:
https://github.com/systemd/systemd/issues/16661

Actually we have got reports of kdump kernel hangs on both s390x
and powerpcle caused by the systemd change. Some x86 cases could have
the same cause (although there it is enabled by Hyper-V code rather
than systemd, which needs to be addressed separately).

Thus, to avoid the automatic enablement, just remove the write
permission on the parameter in sysfs.

Signed-off-by: Dave Young <dyoung@redhat.com>
---
 kernel/panic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index aef8872ba843..bea44fc4eb3b 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -695,7 +695,7 @@ core_param(panic, panic_timeout, int, 0644);
 core_param(panic_print, panic_print, ulong, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
 core_param(panic_on_warn, panic_on_warn, int, 0644);
-core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
+core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0444);
 
 static int __init oops_setup(char *s)
 {
-- 
2.26.2



* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-18  3:25 [PATCH] Only allow to set crash_kexec_post_notifiers on boot time Dave Young
@ 2020-09-19  0:47 ` Andrew Morton
  2020-09-19  7:26   ` Dave Young
  2020-09-21 20:18   ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 20+ messages in thread
From: Andrew Morton @ 2020-09-19  0:47 UTC (permalink / raw)
  To: Dave Young; +Cc: bhe, Eric Biederman, linux-kernel, kexec

On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:

> crash_kexec_post_notifiers enables running various panic notifier
> before kdump kernel booting. This increases risks of kdump failure.
> It is well documented in kernel-parameters.txt. We do not suggest
> people to enable it together with kdump unless he/she is really sure.
> This is also not suggested to be enabled by default when users are
> not aware in distributions.
> 
> But unfortunately it is enabled by default in systemd, see below
> discussions in a systemd report, we can not convince systemd to change
> it:
> https://github.com/systemd/systemd/issues/16661
> 
> Actually we have got reports about kdump kernel hangs in both s390x
> and powerpcle cases caused by the systemd change,  also some x86 cases
> could also be caused by the same (although that is in Hyper-V code
> instead of systemd, that need to be addressed separately).
> 
> Thus to avoid the auto enablement here just disable the param writable
> permission in sysfs.
> 

Well.  I don't think this is at all a desirable way of resolving a
disagreement with the systemd developers.

At the above github address I'm seeing "ryncsn added a commit to
ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
enable crash_kexec_post_notifiers by default".  So didn't that address
the issue?


* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-19  0:47 ` Andrew Morton
@ 2020-09-19  7:26   ` Dave Young
  2020-09-21 20:18   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 20+ messages in thread
From: Dave Young @ 2020-09-19  7:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bhe, Eric Biederman, linux-kernel, kexec

On 09/18/20 at 05:47pm, Andrew Morton wrote:
> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> 
> > crash_kexec_post_notifiers enables running various panic notifier
> > before kdump kernel booting. This increases risks of kdump failure.
> > It is well documented in kernel-parameters.txt. We do not suggest
> > people to enable it together with kdump unless he/she is really sure.
> > This is also not suggested to be enabled by default when users are
> > not aware in distributions.
> > 
> > But unfortunately it is enabled by default in systemd, see below
> > discussions in a systemd report, we can not convince systemd to change
> > it:
> > https://github.com/systemd/systemd/issues/16661
> > 
> > Actually we have got reports about kdump kernel hangs in both s390x
> > and powerpcle cases caused by the systemd change,  also some x86 cases
> > could also be caused by the same (although that is in Hyper-V code
> > instead of systemd, that need to be addressed separately).
> > 
> > Thus to avoid the auto enablement here just disable the param writable
> > permission in sysfs.
> > 
> 
> Well.  I don't think this is at all a desirable way of resolving a
> disagreement with the systemd developers
> 
> At the above github address I'm seeing "ryncsn added a commit to
> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> enable crash_kexec_post_notifiers by default".  So didn't that address
> the issue?
> 

I hope that commit can be merged in systemd, but we are really not
optimistic about that. The discussion there is clear, but we have not
got a response since Aug 6.

BTW, Kairui sent the systemd pull request 15 days ago; the new update added
some comments.

Thanks
Dave 



* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-19  0:47 ` Andrew Morton
  2020-09-19  7:26   ` Dave Young
@ 2020-09-21 20:18   ` Konrad Rzeszutek Wilk
  2020-09-22  1:45     ` Eric W. Biederman
                       ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2020-09-21 20:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Young, bhe, Eric Biederman, linux-kernel, kexec,
	Eric DeVolder, Boris Ostrovsky

On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> 
> > crash_kexec_post_notifiers enables running various panic notifier
> > before kdump kernel booting. This increases risks of kdump failure.
> > It is well documented in kernel-parameters.txt. We do not suggest
> > people to enable it together with kdump unless he/she is really sure.
> > This is also not suggested to be enabled by default when users are
> > not aware in distributions.
> > 
> > But unfortunately it is enabled by default in systemd, see below
> > discussions in a systemd report, we can not convince systemd to change
> > it:
> > https://github.com/systemd/systemd/issues/16661
> > 
> > Actually we have got reports about kdump kernel hangs in both s390x
> > and powerpcle cases caused by the systemd change,  also some x86 cases
> > could also be caused by the same (although that is in Hyper-V code
> > instead of systemd, that need to be addressed separately).

Perhaps it may be better to fix the issues on s390x and PowerPC as well?

> > 
> > Thus to avoid the auto enablement here just disable the param writable
> > permission in sysfs.
> > 
> 
> Well.  I don't think this is at all a desirable way of resolving a
> disagreement with the systemd developers
> 
> At the above github address I'm seeing "ryncsn added a commit to
> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> enable crash_kexec_post_notifiers by default".  So didn't that address
> the issue?

It does in systemd, but there is a strong interest in making this on by default.


* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-21 20:18   ` Konrad Rzeszutek Wilk
@ 2020-09-22  1:45     ` Eric W. Biederman
  2020-09-23  2:43       ` Dave Young
  2020-09-22 10:58     ` Philipp Rudo
  2020-09-23  2:25     ` Dave Young
  2 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2020-09-22  1:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andrew Morton, Dave Young, bhe, linux-kernel, kexec,
	Eric DeVolder, Boris Ostrovsky

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:

> On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
>> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
>> 
>> > crash_kexec_post_notifiers enables running various panic notifier
>> > before kdump kernel booting. This increases risks of kdump failure.
>> > It is well documented in kernel-parameters.txt. We do not suggest
>> > people to enable it together with kdump unless he/she is really sure.
>> > This is also not suggested to be enabled by default when users are
>> > not aware in distributions.
>> > 
>> > But unfortunately it is enabled by default in systemd, see below
>> > discussions in a systemd report, we can not convince systemd to change
>> > it:
>> > https://github.com/systemd/systemd/issues/16661
>> > 
>> > Actually we have got reports about kdump kernel hangs in both s390x
>> > and powerpcle cases caused by the systemd change,  also some x86 cases
>> > could also be caused by the same (although that is in Hyper-V code
>> > instead of systemd, that need to be addressed separately).
>
> Perhaps it may be better to fix the issus on s390x and PowerPC as well?
>
>> > 
>> > Thus to avoid the auto enablement here just disable the param writable
>> > permission in sysfs.
>> > 
>> 
>> Well.  I don't think this is at all a desirable way of resolving a
>> disagreement with the systemd developers
>> 
>> At the above github address I'm seeing "ryncsn added a commit to
>> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
>> enable crash_kexec_post_notifiers by default".  So didn't that address
>> the issue?
>
> It does in systemd, but there is a strong interest in making this on
> by default.

There is also a strong interest in removing this code entirely from the
kernel.

This failure is a case in point.

I think I am at my I told you so point.  This is what all of the testing
over all the years has said.  Leaving functionality to the peculiarities
of firmware when you don't have to, and can actually control what is
going on doesn't work.

Eric




* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-21 20:18   ` Konrad Rzeszutek Wilk
  2020-09-22  1:45     ` Eric W. Biederman
@ 2020-09-22 10:58     ` Philipp Rudo
  2020-09-22 14:50       ` boris.ostrovsky
  2020-09-23  2:25     ` Dave Young
  2 siblings, 1 reply; 20+ messages in thread
From: Philipp Rudo @ 2020-09-22 10:58 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andrew Morton, bhe, kexec, linux-kernel, Eric Biederman,
	Boris Ostrovsky, Eric DeVolder, Dave Young

Hi Konrad,


On Mon, 21 Sep 2020 16:18:12 -0400
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> > On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> >   
> > > crash_kexec_post_notifiers enables running various panic notifier
> > > before kdump kernel booting. This increases risks of kdump failure.
> > > It is well documented in kernel-parameters.txt. We do not suggest
> > > people to enable it together with kdump unless he/she is really sure.
> > > This is also not suggested to be enabled by default when users are
> > > not aware in distributions.
> > > 
> > > But unfortunately it is enabled by default in systemd, see below
> > > discussions in a systemd report, we can not convince systemd to change
> > > it:
> > > https://github.com/systemd/systemd/issues/16661
> > > 
> > > Actually we have got reports about kdump kernel hangs in both s390x
> > > and powerpcle cases caused by the systemd change,  also some x86 cases
> > > could also be caused by the same (although that is in Hyper-V code
> > > instead of systemd, that need to be addressed separately).  
> 
> Perhaps it may be better to fix the issus on s390x and PowerPC as well?

There's little s390 can fix. We use the panic_notifier_list to start
other dumpers in case kdump isn't configured or has failed. This behavior was
introduced in 2006, long before crash_kexec_post_notifiers was introduced. So I
suggest that crash_kexec_post_notifiers be fixed instead.

> > > 
> > > Thus to avoid the auto enablement here just disable the param writable
> > > permission in sysfs.
> > >   
> > 
> > Well.  I don't think this is at all a desirable way of resolving a
> > disagreement with the systemd developers
> > 
> > At the above github address I'm seeing "ryncsn added a commit to
> > ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> > enable crash_kexec_post_notifiers by default".  So didn't that address
> > the issue?  
> 
> It does in systemd, but there is a strong interest in making this on by default.

AFAIK pstore requires UEFI to work. So what's the point of enabling it on non-UEFI
systems?

Thanks
Philipp


* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-22 10:58     ` Philipp Rudo
@ 2020-09-22 14:50       ` boris.ostrovsky
  2020-09-22 17:04         ` Guilherme G. Piccoli
  0 siblings, 1 reply; 20+ messages in thread
From: boris.ostrovsky @ 2020-09-22 14:50 UTC (permalink / raw)
  To: Philipp Rudo, Konrad Rzeszutek Wilk
  Cc: Andrew Morton, bhe, kexec, linux-kernel, Eric Biederman,
	Eric DeVolder, Dave Young


On 9/22/20 6:58 AM, Philipp Rudo wrote:
>
> AFAIK pstore requires UEFI to work. So what's the point to enable it on non-UEFI
> systems?


I don't think UEFI is required; ERST can specify its own backend. And that, in fact, can be quite useful in virtualization scenarios (especially in cases of direct boot, when there is no OVMF).


-boris



* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-22 14:50       ` boris.ostrovsky
@ 2020-09-22 17:04         ` Guilherme G. Piccoli
  0 siblings, 0 replies; 20+ messages in thread
From: Guilherme G. Piccoli @ 2020-09-22 17:04 UTC (permalink / raw)
  To: boris.ostrovsky
  Cc: Philipp Rudo, Konrad Rzeszutek Wilk, Baoquan He,
	kexec mailing list, LKML, Eric Biederman, Andrew Morton,
	Eric DeVolder, Dave Young

On Tue, Sep 22, 2020 at 11:53 AM <boris.ostrovsky@oracle.com> wrote:
>
>
> On 9/22/20 6:58 AM, Philipp Rudo wrote:
> >
> > AFAIK pstore requires UEFI to work. So what's the point to enable it on non-UEFI
> > systems?
>
>
> I don't think UEFI is required, ERST can specify its own backend. And that, in fact, can be quite useful in virtualization scenarios (especially in cases of direct boot, when there is no OVMF)
>
>
> -boris

There is ramoops backend too - I was able to collect a dmesg in a
cloud provider using that!


* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-21 20:18   ` Konrad Rzeszutek Wilk
  2020-09-22  1:45     ` Eric W. Biederman
  2020-09-22 10:58     ` Philipp Rudo
@ 2020-09-23  2:25     ` Dave Young
  2 siblings, 0 replies; 20+ messages in thread
From: Dave Young @ 2020-09-23  2:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andrew Morton, bhe, Eric Biederman, linux-kernel, kexec,
	Eric DeVolder, Boris Ostrovsky

On 09/21/20 at 04:18pm, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> > On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> > 
> > > crash_kexec_post_notifiers enables running various panic notifier
> > > before kdump kernel booting. This increases risks of kdump failure.
> > > It is well documented in kernel-parameters.txt. We do not suggest
> > > people to enable it together with kdump unless he/she is really sure.
> > > This is also not suggested to be enabled by default when users are
> > > not aware in distributions.
> > > 
> > > But unfortunately it is enabled by default in systemd, see below
> > > discussions in a systemd report, we can not convince systemd to change
> > > it:
> > > https://github.com/systemd/systemd/issues/16661
> > > 
> > > Actually we have got reports about kdump kernel hangs in both s390x
> > > and powerpcle cases caused by the systemd change,  also some x86 cases
> > > could also be caused by the same (although that is in Hyper-V code
> > > instead of systemd, that need to be addressed separately).
> 
> Perhaps it may be better to fix the issus on s390x and PowerPC as well?
> 
> > > 
> > > Thus to avoid the auto enablement here just disable the param writable
> > > permission in sysfs.
> > > 
> > 
> > Well.  I don't think this is at all a desirable way of resolving a
> > disagreement with the systemd developers
> > 
> > At the above github address I'm seeing "ryncsn added a commit to
> > ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> > enable crash_kexec_post_notifiers by default".  So didn't that address
> > the issue?
> 
> It does in systemd, but there is a strong interest in making this on by default.

I understand there could be such interest, but we have to keep in mind
that any extra work done after a system crash can make kdump unreliable.

I do not object to people using pstore, but I do object to enabling the
notifiers by default.

BTW, crash notifiers are not limited to pstore; there are quite a lot of
other pieces, like LED triggers etc.

Thanks
Dave



* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-22  1:45     ` Eric W. Biederman
@ 2020-09-23  2:43       ` Dave Young
  2020-09-23 15:48         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Young @ 2020-09-23  2:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Konrad Rzeszutek Wilk, Andrew Morton, bhe, linux-kernel, kexec,
	Eric DeVolder, Boris Ostrovsky, Tianyu Lani, Michael Kelley,
	Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

+ more people who may care about this param 
On 09/21/20 at 08:45pm, Eric W. Biederman wrote:
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> 
> > On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> >> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> >> 
> >> > crash_kexec_post_notifiers enables running various panic notifier
> >> > before kdump kernel booting. This increases risks of kdump failure.
> >> > It is well documented in kernel-parameters.txt. We do not suggest
> >> > people to enable it together with kdump unless he/she is really sure.
> >> > This is also not suggested to be enabled by default when users are
> >> > not aware in distributions.
> >> > 
> >> > But unfortunately it is enabled by default in systemd, see below
> >> > discussions in a systemd report, we can not convince systemd to change
> >> > it:
> >> > https://github.com/systemd/systemd/issues/16661
> >> > 
> >> > Actually we have got reports about kdump kernel hangs in both s390x
> >> > and powerpcle cases caused by the systemd change,  also some x86 cases
> >> > could also be caused by the same (although that is in Hyper-V code
> >> > instead of systemd, that need to be addressed separately).
> >
> > Perhaps it may be better to fix the issus on s390x and PowerPC as well?
> >
> >> > 
> >> > Thus to avoid the auto enablement here just disable the param writable
> >> > permission in sysfs.
> >> > 
> >> 
> >> Well.  I don't think this is at all a desirable way of resolving a
> >> disagreement with the systemd developers
> >> 
> >> At the above github address I'm seeing "ryncsn added a commit to
> >> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> >> enable crash_kexec_post_notifiers by default".  So didn't that address
> >> the issue?
> >
> > It does in systemd, but there is a strong interest in making this on
> > by default.
> 
> There is also a strong interest in removing this code entirely from the
> kernel.

Added the Hyper-V people and the people who created the param; it was
introduced by the commit below. I also want to remove it if possible, so
let's see what people think, but the least we should do is disable the
automatic setting in both systemd and the kernel:

    commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45
    Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    Date:   Fri Jun 6 14:37:07 2014 -0700
    
        kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers
    
        Add a "crash_kexec_post_notifiers" boot option to run kdump after
        running panic_notifiers and dump kmsg.  This can help rare situations
        where kdump fails because of unstable crashed kernel or hardware failure
        (memory corruption on critical data/code), or the 2nd kernel is already
        broken by the 1st kernel (it's a broken behavior, but who can guarantee
        that the "crashed" kernel works correctly?).
    
        Usage: add "crash_kexec_post_notifiers" to kernel boot option.
    
        Note that this actually increases risks of the failure of kdump.  This
        option should be set only if you worry about the rare case of kdump
        failure rather than increasing the chance of success.

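The ordering described in that commit message can be sketched as a small
userspace model. This is a simplification for illustration, not the code in
kernel/panic.c: the type and function names are made up for this example, and
the model only assumes that a successful jump into the kdump kernel never
returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Steps of the panic path that matter for this discussion. */
enum panic_step { STEP_KDUMP_KEXEC, STEP_PANIC_NOTIFIERS, STEP_KMSG_DUMP };

#define MAX_STEPS 3

struct panic_trace {
    enum panic_step steps[MAX_STEPS];
    size_t len;
};

static void record(struct panic_trace *t, enum panic_step s)
{
    assert(t->len < MAX_STEPS);
    t->steps[t->len++] = s;
}

/* Returns the steps the crashed kernel executes, in order. */
static struct panic_trace model_panic(bool crash_kernel_loaded,
                                      bool post_notifiers)
{
    struct panic_trace t = { .len = 0 };

    if (crash_kernel_loaded && !post_notifiers) {
        /* Default: jump to the kdump kernel first; nothing else runs. */
        record(&t, STEP_KDUMP_KEXEC);
        return t;
    }

    /* Notifiers and the kmsg dump run in the crashed kernel... */
    record(&t, STEP_PANIC_NOTIFIERS);
    record(&t, STEP_KMSG_DUMP);

    /* ...and only then is kdump attempted, if a crash kernel is loaded. */
    if (crash_kernel_loaded)
        record(&t, STEP_KDUMP_KEXEC);
    return t;
}
```

With the option set, every notifier has to complete in an already crashed
kernel before kdump gets its chance, which is where the extra risk comes from.
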
> 
> This failure is a case in point.
> 
> I think I am at my I told you so point.  This is what all of the testing
> over all the years has said.  Leaving functionality to the peculiarities
> of firmware when you don't have to, and can actually control what is
> going on doesn't work.
> 
> Eric
> 
> 

Thanks
Dave



* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-23  2:43       ` Dave Young
@ 2020-09-23 15:48         ` Konrad Rzeszutek Wilk
  2020-09-24 16:15           ` Michael Kelley
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2020-09-23 15:48 UTC (permalink / raw)
  To: Dave Young
  Cc: Eric W. Biederman, Andrew Morton, bhe, linux-kernel, kexec,
	Eric DeVolder, Boris Ostrovsky, Tianyu Lani, Michael Kelley,
	Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

On Wed, Sep 23, 2020 at 10:43:29AM +0800, Dave Young wrote:
> + more people who may care about this param 

Paarty time!!

(See below, didn't snip any comments)
> On 09/21/20 at 08:45pm, Eric W. Biederman wrote:
> > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> > 
> > > On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> > >> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> > >> 
> > >> > crash_kexec_post_notifiers enables running various panic notifier
> > >> > before kdump kernel booting. This increases risks of kdump failure.
> > >> > It is well documented in kernel-parameters.txt. We do not suggest
> > >> > people to enable it together with kdump unless he/she is really sure.
> > >> > This is also not suggested to be enabled by default when users are
> > >> > not aware in distributions.
> > >> > 
> > >> > But unfortunately it is enabled by default in systemd, see below
> > >> > discussions in a systemd report, we can not convince systemd to change
> > >> > it:
> > >> > https://github.com/systemd/systemd/issues/16661
> > >> > 
> > >> > Actually we have got reports about kdump kernel hangs in both s390x
> > >> > and powerpcle cases caused by the systemd change,  also some x86 cases
> > >> > could also be caused by the same (although that is in Hyper-V code
> > >> > instead of systemd, that need to be addressed separately).
> > >
> > > Perhaps it may be better to fix the issus on s390x and PowerPC as well?
> > >
> > >> > 
> > >> > Thus to avoid the auto enablement here just disable the param writable
> > >> > permission in sysfs.
> > >> > 
> > >> 
> > >> Well.  I don't think this is at all a desirable way of resolving a
> > >> disagreement with the systemd developers
> > >> 
> > >> At the above github address I'm seeing "ryncsn added a commit to
> > >> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> > >> enable crash_kexec_post_notifiers by default".  So didn't that address
> > >> the issue?
> > >
> > > It does in systemd, but there is a strong interest in making this on
> > > by default.
> > 
> > There is also a strong interest in removing this code entirely from the
> > kernel.
> 
> Added Hyper-V people and people who created the param, it is below
> commit, I also want to remove it if possible, let's see how people
> think, but the least way should be to disable the auto setting in both systemd
> and kernel:
> 
>     commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45
>     Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>     Date:   Fri Jun 6 14:37:07 2014 -0700
>     
>         kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers
>     
>         Add a "crash_kexec_post_notifiers" boot option to run kdump after
>         running panic_notifiers and dump kmsg.  This can help rare situations
>         where kdump fails because of unstable crashed kernel or hardware failure
>         (memory corruption on critical data/code), or the 2nd kernel is already
>         broken by the 1st kernel (it's a broken behavior, but who can guarantee
>         that the "crashed" kernel works correctly?).
>     
>         Usage: add "crash_kexec_post_notifiers" to kernel boot option.
>     
>         Note that this actually increases risks of the failure of kdump.  This
>         option should be set only if you worry about the rare case of kdump
>         failure rather than increasing the chance of success.


If this is such a risky knob that it leads to bugs folks back away from
with disgust on their faces, then perhaps the only way to go about
this is to limit the exposure to known working situations on firmware
that we can control?

That is, enable only a subset of post notifiers, which determine whether
they are OK to run when the conditions are blessed?

I think that would spare you from having to deal with unsavory
bugs that end up on your plate, and aren't fun because there is no
way of fixing them, while at the same time allowing multiple ways to save the crash?

Please don't take away something that is quite useful in the field. Can we
hammer out something that will remove your pain points?
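
A rough userspace sketch of that idea follows, purely as an illustration.
Nothing here is an existing kernel interface: struct demo_notifier, the
safe_before_kdump callback and the demo entries are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor; not an existing kernel structure. */
struct demo_notifier {
    const char *name;
    bool (*safe_before_kdump)(void);  /* notifier's own "am I blessed?" check */
    void (*run)(void);
};

/* Runs only the notifiers that declare themselves safe; returns the count. */
static size_t run_blessed_notifiers(const struct demo_notifier *list, size_t n)
{
    size_t ran = 0;

    for (size_t i = 0; i < n; i++) {
        assert(list[i].run != NULL);
        if (list[i].safe_before_kdump != NULL && list[i].safe_before_kdump()) {
            list[i].run();
            ran++;
        }
    }
    return ran;
}

/* Demo entries: one notifier that opts in, one that does not. */
static int demo_calls;
static bool always_safe(void) { return true; }
static bool never_safe(void) { return false; }
static void demo_run(void) { demo_calls++; }

static const struct demo_notifier demo_list[] = {
    { "host-report",  always_safe, demo_run },
    { "firmware-log", never_safe,  demo_run },
};
```

Here run_blessed_notifiers(demo_list, 2) runs only the first entry, so anything
that cannot vouch for itself is skipped before kdump.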
> 
> > 
> > This failure is a case in point.
> > 
> > I think I am at my I told you so point.  This is what all of the testing
> > over all the years has said.  Leaving functionality to the peculiarities
> > of firmware when you don't have to, and can actually control what is
> > going on doesn't work.
> > 
> > Eric
> > 
> > 
> 
> Thanks
> Dave
> 


* RE: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-23 15:48         ` Konrad Rzeszutek Wilk
@ 2020-09-24 16:15           ` Michael Kelley
  2020-09-24 16:25             ` Eric W. Biederman
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Kelley @ 2020-09-24 16:15 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Dave Young
  Cc: Eric W. Biederman, Andrew Morton, bhe, linux-kernel, kexec,
	Eric DeVolder, Boris Ostrovsky, Tianyu Lan, Wei Liu,
	Masami Hiramatsu, HATAYAMA Daisuke

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Sent: Wednesday, September 23, 2020 8:48 AM
> 
> On Wed, Sep 23, 2020 at 10:43:29AM +0800, Dave Young wrote:
> > + more people who may care about this param
> 
> Paarty time!!
> 
> (See below, didn't snip any comments)
> > On 09/21/20 at 08:45pm, Eric W. Biederman wrote:
> > > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> > >
> > > > On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> > > >> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> > > >>
> > > >> > crash_kexec_post_notifiers enables running various panic notifier
> > > >> > before kdump kernel booting. This increases risks of kdump failure.
> > > >> > It is well documented in kernel-parameters.txt. We do not suggest
> > > >> > people to enable it together with kdump unless he/she is really sure.
> > > >> > This is also not suggested to be enabled by default when users are
> > > >> > not aware in distributions.
> > > >> >
> > > >> > But unfortunately it is enabled by default in systemd, see below
> > > >> > discussions in a systemd report, we can not convince systemd to change
> > > >> > it:
> > > >> >
> > > > >> > https://github.com/systemd/systemd/issues/16661
> > > >> >
> > > >> > Actually we have got reports about kdump kernel hangs in both s390x
> > > >> > and powerpcle cases caused by the systemd change,  also some x86 cases
> > > >> > could also be caused by the same (although that is in Hyper-V code
> > > >> > instead of systemd, that need to be addressed separately).
> > > >
> > > > Perhaps it may be better to fix the issus on s390x and PowerPC as well?
> > > >
> > > >> >
> > > >> > Thus to avoid the auto enablement here just disable the param writable
> > > >> > permission in sysfs.
> > > >> >
> > > >>
> > > >> Well.  I don't think this is at all a desirable way of resolving a
> > > >> disagreement with the systemd developers
> > > >>
> > > >> At the above github address I'm seeing "ryncsn added a commit to
> > > >> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> > > >> enable crash_kexec_post_notifiers by default".  So didn't that address
> > > >> the issue?
> > > >
> > > > It does in systemd, but there is a strong interest in making this on
> > > > by default.
> > >
> > > There is also a strong interest in removing this code entirely from the
> > > kernel.
> >
> > Added Hyper-V people and people who created the param, it is below
> > commit, I also want to remove it if possible, let's see how people
> > think, but the least way should be to disable the auto setting in both systemd
> > and kernel:

Hyper-V uses a notifier to inform the host system that a Linux VM has
panic'ed.  Informing the host is particularly important in a public cloud
such as Azure so that the cloud software can alert the customer, and can
track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
entirely by the customer and how he configures the VM, and we want
the host to be informed either way.

Michael

> >
> >     commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45
> >     Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> >     Date:   Fri Jun 6 14:37:07 2014 -0700
> >
> >         kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers
> >
> >         Add a "crash_kexec_post_notifiers" boot option to run kdump after
> >         running panic_notifiers and dump kmsg.  This can help rare situations
> >         where kdump fails because of unstable crashed kernel or hardware failure
> >         (memory corruption on critical data/code), or the 2nd kernel is already
> >         broken by the 1st kernel (it's a broken behavior, but who can guarantee
> >         that the "crashed" kernel works correctly?).
> >
> >         Usage: add "crash_kexec_post_notifiers" to kernel boot option.
> >
> >         Note that this actually increases risks of the failure of kdump.  This
> >         option should be set only if you worry about the rare case of kdump
> >         failure rather than increasing the chance of success.
> 
> 
> If this is such a risky knob, one that leads to bugs folks back away
> from with disgust on their faces - then perhaps the only way to go about
> this is to limit the exposure to known working situations on firmware
> that we can control?
> 
> That is enable only a subset of post notifiers which determine if they
> are OK running if the conditions are blessed?
> 
> I think that would satisfy the conditions where you have to deal with unsavory
> bugs that end up on your plate - and aren't fun because there is no
> way of fixing them - while at the same time allowing multiple ways to save the crash?
> 
> Please don't take away something that is quite useful in the field. Can we
> hammer out something that will remove your pain points?
> >
> > >
> > > This failure is a case in point.
> > >
> > > I think I am at my I told you so point.  This is what all of the testing
> > > over all the years has said.  Leaving functionality to the peculiarities
> > > of firmware when you don't have to, and can actually control what is
> > > going on doesn't work.
> > >
> > > Eric
> > >
> > >
> >
> > Thanks
> > Dave
> >

* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-24 16:15           ` Michael Kelley
@ 2020-09-24 16:25             ` Eric W. Biederman
  2020-09-24 16:43               ` Michael Kelley
  0 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2020-09-24 16:25 UTC (permalink / raw)
  To: Michael Kelley
  Cc: Konrad Rzeszutek Wilk, Dave Young, Andrew Morton, bhe,
	linux-kernel, kexec, Eric DeVolder, Boris Ostrovsky, Tianyu Lan,
	Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

Michael Kelley <mikelley@microsoft.com> writes:

> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Sent: Wednesday, September 23, 2020 8:48 AM
>> 
>> On Wed, Sep 23, 2020 at 10:43:29AM +0800, Dave Young wrote:
>> > + more people who may care about this param
>> 
>> Paarty time!!
>> 
>> (See below, didn't snip any comments)
>> > On 09/21/20 at 08:45pm, Eric W. Biederman wrote:
>> > > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
>> > >
>> > > > On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
>> > > >> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
>> > > >>
>> > > >> > crash_kexec_post_notifiers enables running various panic notifier
>> > > >> > before kdump kernel booting. This increases risks of kdump failure.
>> > > >> > It is well documented in kernel-parameters.txt. We do not suggest
>> > > >> > people to enable it together with kdump unless he/she is really sure.
>> > > >> > This is also not suggested to be enabled by default when users are
>> > > >> > not aware in distributions.
>> > > >> >
>> > > >> > But unfortunately it is enabled by default in systemd, see below
>> > > >> > discussions in a systemd report, we can not convince systemd to change
>> > > >> > it:
>> > > >> >
>> > > >> > https://github.com/systemd/systemd/issues/16661
>> > > >> >
>> > > >> > Actually we have got reports about kdump kernel hangs in both s390x
>> > > >> > and powerpcle cases caused by the systemd change,  also some x86 cases
>> > > >> > could also be caused by the same (although that is in Hyper-V code
>> > > >> > instead of systemd, that need to be addressed separately).
>> > > >
>> > > > Perhaps it may be better to fix the issues on s390x and PowerPC as well?
>> > > >
>> > > >> >
>> > > >> > Thus to avoid the auto enablement here just disable the param writable
>> > > >> > permission in sysfs.
>> > > >> >
>> > > >>
>> > > >> Well.  I don't think this is at all a desirable way of resolving a
>> > > >> disagreement with the systemd developers
>> > > >>
>> > > >> At the above github address I'm seeing "ryncsn added a commit to
>> > > >> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
>> > > >> enable crash_kexec_post_notifiers by default".  So didn't that address
>> > > >> the issue?
>> > > >
>> > > > It does in systemd, but there is a strong interest in making this on
>> > > > by default.
>> > >
>> > > There is also a strong interest in removing this code entirely from the
>> > > kernel.
>> >
>> > Added Hyper-V people and people who created the param, it is below
>> > commit, I also want to remove it if possible, let's see how people
>> > think, but the least way should be to disable the auto setting in both systemd
>> > and kernel:
>
> Hyper-V uses a notifier to inform the host system that a Linux VM has
> panic'ed.  Informing the host is particularly important in a public cloud
> such as Azure so that the cloud software can alert the customer, and can
> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
> entirely by the customer and how he configures the VM, and we want
> the host to be informed either way.

Why?

Why does the host care?
Especially if the VM continues executing into a kdump kernel?

Further, as I have mentioned every time something like this has come up,
a call on the kexec-on-panic code path should be a direct call (which can
be audited), not something hidden in a notifier call chain (which cannot).
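
To make the auditing point concrete, here is a minimal userspace sketch
(not kernel code). It models a priority-ordered notifier chain next to a
direct-call path. All names below (struct notifier, chain_register,
chain_call, notify_host, flush_log, panic_path_direct) are hypothetical and
only illustrate the contrast: with a chain, what runs depends on who
registered at runtime; with direct calls, the steps are visible at the
call site.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified model of one entry in a priority-ordered notifier chain. */
struct notifier {
	int (*call)(void *data);	/* callback supplied by the registrant */
	int priority;			/* higher priority runs first */
	struct notifier *next;
};

/* Insert n so that the list stays sorted by descending priority. */
void chain_register(struct notifier **head, struct notifier *n)
{
	assert(n != NULL && n->call != NULL);
	while (*head != NULL && (*head)->priority >= n->priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

/* Run every registered callback in list order; return how many ran. */
int chain_call(struct notifier *head, void *data)
{
	int ran = 0;

	for (; head != NULL; head = head->next) {
		head->call(data);
		ran++;
	}
	return ran;
}

/* Append one tag character to a caller-provided, large-enough buffer. */
static int append_tag(void *data, char tag)
{
	char *buf = data;
	size_t len = strlen(buf);

	buf[len] = tag;
	buf[len + 1] = '\0';
	return 0;
}

/* Hypothetical callbacks: 'H' = host notified, 'L' = log flushed. */
int notify_host(void *data) { return append_tag(data, 'H'); }
int flush_log(void *data)   { return append_tag(data, 'L'); }

/* Direct-call variant: the full set of steps is readable right here. */
int panic_path_direct(void *data)
{
	notify_host(data);
	flush_log(data);
	return 2;
}
```

In the chain variant, the order and the set of callbacks are decided by
whoever called chain_register() and with which priority, so reading the
call site alone does not tell you what will run. In the direct variant it
does.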

Eric

* RE: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-24 16:25             ` Eric W. Biederman
@ 2020-09-24 16:43               ` Michael Kelley
  2020-09-24 17:16                 ` boris.ostrovsky
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Kelley @ 2020-09-24 16:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Konrad Rzeszutek Wilk, Dave Young, Andrew Morton, bhe,
	linux-kernel, kexec, Eric DeVolder, Boris Ostrovsky, Tianyu Lan,
	Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM
> 
> Michael Kelley <mikelley@microsoft.com> writes:
> 
> >> >
> >> > Added Hyper-V people and people who created the param, it is below
> >> > commit, I also want to remove it if possible, let's see how people
> >> > think, but the least way should be to disable the auto setting in both systemd
> >> > and kernel:
> >
> > Hyper-V uses a notifier to inform the host system that a Linux VM has
> > panic'ed.  Informing the host is particularly important in a public cloud
> > such as Azure so that the cloud software can alert the customer, and can
> > track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
> > entirely by the customer and how he configures the VM, and we want
> > the host to be informed either way.
> 
> Why?
> 
> Why does the host care?
> Especially if the VM continues executing into a kdump kernel?

The host itself doesn't care.  But the host is a convenient out-of-band
channel for recording that a panic has occurred and to collect basic data
about the panic.  This out-of-band channel is then used to notify the end
customer that his VM has panic'ed.  Sure, the customer should be running
his own monitoring software, but customers don't always do what they
should.  Equally important, the out-of-band channel allows the cloud
infrastructure software to notice trends, such as that the rate of Linux
panics has increased, and that perhaps there is a cloud problem that
should be investigated.

> 
> Further like I have mentioned everytime something like this has come up
> a call on the kexec on panic code path should be a direct call (That can
> be audited) not something hidden in a notifier call chain (which can not).
> 

The use case I describe has no particular requirement that it be
implemented via the notifier call chain.  If there's a better way to run
some out-of-band notification code on all Linux panics regardless of
whether a kdump is taken, we're open to such an alternative.

Michael

* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-24 16:43               ` Michael Kelley
@ 2020-09-24 17:16                 ` boris.ostrovsky
  2020-09-25  3:05                   ` Dave Young
  0 siblings, 1 reply; 20+ messages in thread
From: boris.ostrovsky @ 2020-09-24 17:16 UTC (permalink / raw)
  To: Michael Kelley, Eric W. Biederman
  Cc: Konrad Rzeszutek Wilk, Dave Young, Andrew Morton, bhe,
	linux-kernel, kexec, Eric DeVolder, Tianyu Lan, Wei Liu,
	Masami Hiramatsu, HATAYAMA Daisuke


On 9/24/20 12:43 PM, Michael Kelley wrote:
> From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM
>> Michael Kelley <mikelley@microsoft.com> writes:
>>
>>>>> Added Hyper-V people and people who created the param, it is below
>>>>> commit, I also want to remove it if possible, let's see how people
>>>>> think, but the least way should be to disable the auto setting in both systemd
>>>>> and kernel:
>>> Hyper-V uses a notifier to inform the host system that a Linux VM has
>>> panic'ed.  Informing the host is particularly important in a public cloud
>>> such as Azure so that the cloud software can alert the customer, and can
>>> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
>>> entirely by the customer and how he configures the VM, and we want
>>> the host to be informed either way.
>> Why?
>>
>> Why does the host care?
>> Especially if the VM continues executing into a kdump kernel?
> The host itself doesn't care.  But the host is a convenient out-of-band
> channel for recording that a panic has occurred and to collect basic data
> about the panic.  This out-of-band channel is then used to notify the end
> customer that his VM has panic'ed.  Sure, the customer should be running
> his own monitoring software, but customers don't always do what they
> should.  Equally important, the out-of-band channel allows the cloud
> infrastructure software to notice trends, such as that the rate of Linux
> panics has increased, and that perhaps there is a cloud problem that
> should be investigated.


In many cases (especially in cloud environments) your dump device is remote
(e.g. iSCSI) and kdump sometimes (often?) gets stuck because of connectivity
issues (which could be the cause of the panic in the first place). So it is
quite desirable to inform the infrastructure that the VM is on its way out
without waiting for kdump to complete.


>
>> Further like I have mentioned everytime something like this has come up
>> a call on the kexec on panic code path should be a direct call (That can
>> be audited) not something hidden in a notifier call chain (which can not).
>>

By the way, we already have a direct call from panic() to kmsg_dump(), which is indirectly controlled by crash_kexec_post_notifiers, and it would be preferable to be able to call it before kdump as well.


-boris


> The use case I describe has no particular requirement that it be
> implemented via the notifier call chain.  If there's a better way to run
> some out-of-band notification code on all Linux panics regardless of
> whether a kdump is taken, we're open to such an alternative.

* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-24 17:16                 ` boris.ostrovsky
@ 2020-09-25  3:05                   ` Dave Young
  2020-09-25 14:56                     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Young @ 2020-09-25  3:05 UTC (permalink / raw)
  To: boris.ostrovsky
  Cc: Michael Kelley, Eric W. Biederman, Konrad Rzeszutek Wilk,
	Andrew Morton, bhe, linux-kernel, kexec, Eric DeVolder,
	Tianyu Lan, Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

Hi,

On 09/24/20 at 01:16pm, boris.ostrovsky@oracle.com wrote:
> 
> On 9/24/20 12:43 PM, Michael Kelley wrote:
> > From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM
> >> Michael Kelley <mikelley@microsoft.com> writes:
> >>
> >>>>> Added Hyper-V people and people who created the param, it is below
> >>>>> commit, I also want to remove it if possible, let's see how people
> >>>>> think, but the least way should be to disable the auto setting in both systemd
> >>>>> and kernel:
> >>> Hyper-V uses a notifier to inform the host system that a Linux VM has
> >>> panic'ed.  Informing the host is particularly important in a public cloud
> >>> such as Azure so that the cloud software can alert the customer, and can
> >>> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
> >>> entirely by the customer and how he configures the VM, and we want
> >>> the host to be informed either way.
> >> Why?
> >>
> >> Why does the host care?
> >> Especially if the VM continues executing into a kdump kernel?
> > The host itself doesn't care.  But the host is a convenient out-of-band
> > channel for recording that a panic has occurred and to collect basic data
> > about the panic.  This out-of-band channel is then used to notify the end
> > customer that his VM has panic'ed.  Sure, the customer should be running
> > his own monitoring software, but customers don't always do what they
> > should.  Equally important, the out-of-band channel allows the cloud
> > infrastructure software to notice trends, such as that the rate of Linux
> > panics has increased, and that perhaps there is a cloud problem that
> > should be investigated.
> 
> 
> In many cases (especially in cloud environment) your dump device is remote (e.g. iscsi) and kdump sometimes (often?) gets stuck because of connectivity issues (which could be cause of the panic in the first place). So it is quite desirable to inform the infrastructure that the VM is on its way out without waiting for kdump to complete.

That can probably be done in the kdump kernel if it is really needed, say by
informing the host that a panic happened and a kdump kernel is running.

But I still think setting crash_kexec_post_notifiers by default is bad.

> 
> 
> >
> >> Further like I have mentioned everytime something like this has come up
> >> a call on the kexec on panic code path should be a direct call (That can
> >> be audited) not something hidden in a notifier call chain (which can not).
> >>
> 
> We btw already have a direct call from panic() to kmsg_dump() which is indirectly controlled by crash_kexec_post_notifiers, and it would also be preferable to be able to call it before kdump as well.

Right, that is the same thing we are talking about.

Thanks
Dave


* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-25  3:05                   ` Dave Young
@ 2020-09-25 14:56                     ` Konrad Rzeszutek Wilk
  2020-09-27  2:51                       ` Dave Young
  2020-09-29 13:36                       ` Philipp Rudo
  0 siblings, 2 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2020-09-25 14:56 UTC (permalink / raw)
  To: Dave Young
  Cc: boris.ostrovsky, Michael Kelley, Eric W. Biederman,
	Andrew Morton, bhe, linux-kernel, kexec, Eric DeVolder,
	Tianyu Lan, Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

On Fri, Sep 25, 2020 at 11:05:58AM +0800, Dave Young wrote:
> Hi,
> 
> On 09/24/20 at 01:16pm, boris.ostrovsky@oracle.com wrote:
> > 
> > On 9/24/20 12:43 PM, Michael Kelley wrote:
> > > From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM
> > >> Michael Kelley <mikelley@microsoft.com> writes:
> > >>
> > >>>>> Added Hyper-V people and people who created the param, it is below
> > >>>>> commit, I also want to remove it if possible, let's see how people
> > >>>>> think, but the least way should be to disable the auto setting in both systemd
> > >>>>> and kernel:
> > >>> Hyper-V uses a notifier to inform the host system that a Linux VM has
> > >>> panic'ed.  Informing the host is particularly important in a public cloud
> > >>> such as Azure so that the cloud software can alert the customer, and can
> > >>> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
> > >>> entirely by the customer and how he configures the VM, and we want
> > >>> the host to be informed either way.
> > >> Why?
> > >>
> > >> Why does the host care?
> > >> Especially if the VM continues executing into a kdump kernel?
> > > The host itself doesn't care.  But the host is a convenient out-of-band
> > > channel for recording that a panic has occurred and to collect basic data
> > > about the panic.  This out-of-band channel is then used to notify the end
> > > customer that his VM has panic'ed.  Sure, the customer should be running
> > > his own monitoring software, but customers don't always do what they
> > > should.  Equally important, the out-of-band channel allows the cloud
> > > infrastructure software to notice trends, such as that the rate of Linux
> > > panics has increased, and that perhaps there is a cloud problem that
> > > should be investigated.
> > 
> > 
> > In many cases (especially in cloud environment) your dump device is remote (e.g. iscsi) and kdump sometimes (often?) gets stuck because of connectivity issues (which could be cause of the panic in the first place). So it is quite desirable to inform the infrastructure that the VM is on its way out without waiting for kdump to complete.
> 
> That can probably be done in kdump kernel if it is really needed.  Say
> informing host that panic happened and a kdump kernel is runnning.

If the kdump kernel gets to that point. Sometimes (sadly) it ends up being
misconfigured and chokes - hence having multiple ways to emit
the crash information before running the kdump kernel is a life-saver.

> 
> But I think to set crash_kexec_post_notifiers by default is still bad. 

Because of the way it is run today, I presume? If there were some
safe/unsafe policy, that should work, right? I would think that the
safe ones that work properly all the time are:

 - Hyper-V CRASH_MSRs
 - KVM PVPANIC_[PANIC,CRASHLOAD] push button knob
 - pstore EFI variables
 - Dumping in memory

And then some that depend on firmware version (aka BIOS, and vendor) are:
 - ACPI ERST

And then the unsafe:
 - s390, PowerPC (I don't actually know what they are but that
    was Dave's primary motivator).
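
As a rough illustration of such a policy, here is a minimal userspace
sketch in C. It is not an existing kernel interface: the enum, the struct
and both functions are hypothetical names invented for this example, and
the class assigned to any given notifier is exactly what would need to be
agreed on. The idea is only that each notifier carries a safety class and
that the pre-kdump pass runs the safe ones, plus the firmware-dependent
ones when the firmware is known to be good.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical safety classes, following the three groups listed above. */
enum notifier_safety {
	NOTIFIER_SAFE,		/* believed to work in any panic state */
	NOTIFIER_FW_DEPENDENT,	/* works only on known-good firmware */
	NOTIFIER_UNSAFE,	/* must not run before kdump */
};

/* Hypothetical descriptor: a notifier name plus its assigned class. */
struct tagged_notifier {
	const char *name;
	enum notifier_safety safety;
};

/* Decide whether a notifier of this class may run before kdump. */
bool may_run_before_kdump(enum notifier_safety safety, bool firmware_trusted)
{
	switch (safety) {
	case NOTIFIER_SAFE:
		return true;
	case NOTIFIER_FW_DEPENDENT:
		return firmware_trusted;
	case NOTIFIER_UNSAFE:
		return false;
	}
	return false;	/* unknown class: be conservative */
}

/*
 * Copy the names of the notifiers allowed to run before kdump into out[],
 * preserving their order. Returns the number of names written.
 */
size_t select_pre_kdump(const struct tagged_notifier *all, size_t count,
			bool firmware_trusted,
			const char **out, size_t out_cap)
{
	size_t n = 0;

	for (size_t i = 0; i < count; i++) {
		if (!may_run_before_kdump(all[i].safety, firmware_trusted))
			continue;
		assert(n < out_cap);
		out[n++] = all[i].name;
	}
	return n;
}
```

With a list containing one notifier of each class, only the safe one is
selected when firmware_trusted is false, and the firmware-dependent one is
added when it is true. The unsafe one never runs before kdump.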

> 
> > 
> > 
> > >
> > >> Further like I have mentioned everytime something like this has come up
> > >> a call on the kexec on panic code path should be a direct call (That can
> > >> be audited) not something hidden in a notifier call chain (which can not).
> > >>
> > 
> > We btw already have a direct call from panic() to kmsg_dump() which is indirectly controlled by crash_kexec_post_notifiers, and it would also be preferable to be able to call it before kdump as well.
> 
> Right, that is the same thing we are talking about.
> 
> Thanks
> Dave
> 

* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-25 14:56                     ` Konrad Rzeszutek Wilk
@ 2020-09-27  2:51                       ` Dave Young
  2020-09-29 13:36                       ` Philipp Rudo
  1 sibling, 0 replies; 20+ messages in thread
From: Dave Young @ 2020-09-27  2:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: boris.ostrovsky, Michael Kelley, Eric W. Biederman,
	Andrew Morton, bhe, linux-kernel, kexec, Eric DeVolder,
	Tianyu Lan, Wei Liu, Masami Hiramatsu, HATAYAMA Daisuke

Hi,

On 09/25/20 at 10:56am, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 25, 2020 at 11:05:58AM +0800, Dave Young wrote:
> > Hi,
> > 
> > On 09/24/20 at 01:16pm, boris.ostrovsky@oracle.com wrote:
> > > 
> > > On 9/24/20 12:43 PM, Michael Kelley wrote:
> > > > From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM
> > > >> Michael Kelley <mikelley@microsoft.com> writes:
> > > >>
> > > >>>>> Added Hyper-V people and people who created the param, it is below
> > > >>>>> commit, I also want to remove it if possible, let's see how people
> > > >>>>> think, but the least way should be to disable the auto setting in both systemd
> > > >>>>> and kernel:
> > > >>> Hyper-V uses a notifier to inform the host system that a Linux VM has
> > > >>> panic'ed.  Informing the host is particularly important in a public cloud
> > > >>> such as Azure so that the cloud software can alert the customer, and can
> > > >>> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
> > > >>> entirely by the customer and how he configures the VM, and we want
> > > >>> the host to be informed either way.
> > > >> Why?
> > > >>
> > > >> Why does the host care?
> > > >> Especially if the VM continues executing into a kdump kernel?
> > > > The host itself doesn't care.  But the host is a convenient out-of-band
> > > > channel for recording that a panic has occurred and to collect basic data
> > > > about the panic.  This out-of-band channel is then used to notify the end
> > > > customer that his VM has panic'ed.  Sure, the customer should be running
> > > > his own monitoring software, but customers don't always do what they
> > > > should.  Equally important, the out-of-band channel allows the cloud
> > > > infrastructure software to notice trends, such as that the rate of Linux
> > > > panics has increased, and that perhaps there is a cloud problem that
> > > > should be investigated.
> > > 
> > > 
> > > In many cases (especially in cloud environment) your dump device is remote (e.g. iscsi) and kdump sometimes (often?) gets stuck because of connectivity issues (which could be cause of the panic in the first place). So it is quite desirable to inform the infrastructure that the VM is on its way out without waiting for kdump to complete.
> > 
> > That can probably be done in kdump kernel if it is really needed.  Say
> > informing host that panic happened and a kdump kernel is runnning.
> 
> If kdump kernel gets to that point. Sometimes (sadly) it ends up being
> misconfigured and it chokes up - and hence having multiple ways to emit
> the crash information before running kdump kernel is a life-saver.

If it is done in the kernel boot phase, before PID 1 comes up, then things
should be good enough, specifically for the kdump kernel of KVM/Hyper-V guests.

> 
> > 
> > But I think to set crash_kexec_post_notifiers by default is still bad. 
> 
> Because of the way it is run today I presume? If there was some
> safe/unsafe policy that should work right? I would think that the
> safe ones that work properly all the time are:
> 
>  - HyperV CRASH_MSRs,
>  - KVM PVPANIC_[PANIC,CRASHLOAD] push button knob,
>  - pstore EFI variables
>  - Dumping in memory,
> 
> And then some that depend on firmware version (aka BIOS, and vendor) are:
>  - ACPI ERST,
> 
> And then the unsafe:
>  - s390, PowerPC (I don't actually know what they are but that
>     was Dave's primary motivator).

As I said, we also got reports of kdump kernel hangs on Hyper-V with
crash_kexec_post_notifiers enabled.

EFI pstore also depends on the EFI runtime, which lives in firmware, and we
cannot ensure it works well after a panic has happened.  The same goes for
other pstore backends; we prefer not to run them before kdump.  But as I said,
I'm not saying they are not useful; people can use them if they choose to.

As for the virtual machine panic events, maybe it is OK to add some other
hooks instead of the notifiers.  But frankly I still feel it is better to do
it in the kdump kernel boot path, since kdump works well for virt in our
experience.

> 
> > 
> > > 
> > > 
> > > >
> > > >> Further like I have mentioned everytime something like this has come up
> > > >> a call on the kexec on panic code path should be a direct call (That can
> > > >> be audited) not something hidden in a notifier call chain (which can not).
> > > >>
> > > 
> > > We btw already have a direct call from panic() to kmsg_dump() which is indirectly controlled by crash_kexec_post_notifiers, and it would also be preferable to be able to call it before kdump as well.
> > 
> > Right, that is the same thing we are talking about.
> > 
> > Thanks
> > Dave
> > 
> 

Thanks
Dave


* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-25 14:56                     ` Konrad Rzeszutek Wilk
  2020-09-27  2:51                       ` Dave Young
@ 2020-09-29 13:36                       ` Philipp Rudo
  2020-09-29 19:10                         ` boris.ostrovsky
  1 sibling, 1 reply; 20+ messages in thread
From: Philipp Rudo @ 2020-09-29 13:36 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Dave Young, Wei Liu, Tianyu Lan, bhe, kexec, linux-kernel,
	Michael Kelley, HATAYAMA Daisuke, Eric W. Biederman,
	Masami Hiramatsu, boris.ostrovsky, Eric DeVolder, Andrew Morton

Hi,

On Fri, 25 Sep 2020 10:56:25 -0400
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> On Fri, Sep 25, 2020 at 11:05:58AM +0800, Dave Young wrote:
> > Hi,
> > 
> > On 09/24/20 at 01:16pm, boris.ostrovsky@oracle.com wrote:  
> > > 
> > > On 9/24/20 12:43 PM, Michael Kelley wrote:  
> > > > From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM  
> > > >> Michael Kelley <mikelley@microsoft.com> writes:
> > > >>  
> > > >>>>> Added Hyper-V people and people who created the param, it is below
> > > >>>>> commit, I also want to remove it if possible, let's see how people
> > > >>>>> think, but the least way should be to disable the auto setting in both systemd
> > > >>>>> and kernel:  
> > > >>> Hyper-V uses a notifier to inform the host system that a Linux VM has
> > > >>> panic'ed.  Informing the host is particularly important in a public cloud
> > > >>> such as Azure so that the cloud software can alert the customer, and can
> > > >>> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
> > > >>> entirely by the customer and how he configures the VM, and we want
> > > >>> the host to be informed either way.  
> > > >> Why?
> > > >>
> > > >> Why does the host care?
> > > >> Especially if the VM continues executing into a kdump kernel?  
> > > > The host itself doesn't care.  But the host is a convenient out-of-band
> > > > channel for recording that a panic has occurred and to collect basic data
> > > > about the panic.  This out-of-band channel is then used to notify the end
> > > > customer that his VM has panic'ed.  Sure, the customer should be running
> > > > his own monitoring software, but customers don't always do what they
> > > > should.  Equally important, the out-of-band channel allows the cloud
> > > > infrastructure software to notice trends, such as that the rate of Linux
> > > > panics has increased, and that perhaps there is a cloud problem that
> > > > should be investigated.  
> > > 
> > > 
> > > In many cases (especially in cloud environment) your dump device is remote (e.g. iscsi) and kdump sometimes (often?) gets stuck because of connectivity issues (which could be cause of the panic in the first place). So it is quite desirable to inform the infrastructure that the VM is on its way out without waiting for kdump to complete.  
> > 
> > That can probably be done in kdump kernel if it is really needed.  Say
> > informing host that panic happened and a kdump kernel is runnning.  
> 
> If kdump kernel gets to that point. Sometimes (sadly) it ends up being
> misconfigured and it chokes up - and hence having multiple ways to emit
> the crash information before running kdump kernel is a life-saver.
> 
> > 
> > But I think to set crash_kexec_post_notifiers by default is still bad.   
> 
> Because of the way it is run today I presume? If there was some
> safe/unsafe policy that should work right? I would think that the
> safe ones that work properly all the time are:
> 
>  - HyperV CRASH_MSRs,
>  - KVM PVPANIC_[PANIC,CRASHLOAD] push button knob,
>  - pstore EFI variables
>  - Dumping in memory,
> 
> And then some that depend on firmware version (aka BIOS, and vendor) are:
>  - ACPI ERST,
> 
> And then the unsafe:
>  - s390, PowerPC (I don't actually know what they are but that
>     was Dave's primary motivator).

That won't work on s390. Let me emphasize that the problems on s390 are not the
notifiers themselves but the fact that they are called before crash_kexec.

On s390 we have multiple dump methods besides kdump. We use a panic notifier to
trigger these dump methods from the panicking kernel. The problem is that these
dump methods are less powerful than kdump so we only want to use them as
fallback, i.e. only use them when either kdump wasn't configured or loading of
the crash kernel failed for whatever reason. That's why (plus historic reasons)
our notifier stops the machine when it is called and none of the methods is
configured, which means that the second crash_kexec is never reached.
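
The ordering problem can be shown with a small userspace model in C. This
is a sketch and not the code in kernel/panic.c: the struct, the enum and
the function names are hypothetical, and the flow is simplified to three
steps recorded as letters in a trace ('K' = crash_kexec attempt,
'N' = panic notifiers, 'D' = kmsg dump). The notifier_stops flag stands
for a notifier that halts the machine, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum panic_outcome {
	OUTCOME_KDUMP,			/* jumped into the crash kernel */
	OUTCOME_STOPPED_IN_NOTIFIER,	/* a notifier halted the machine */
	OUTCOME_NO_DUMP,		/* reached the end without kexec */
};

struct panic_model {
	bool post_notifiers;		/* models crash_kexec_post_notifiers */
	bool crash_kernel_loaded;	/* a kdump kernel is available */
	bool notifier_stops;		/* a notifier stops the machine */
	char trace[16];			/* steps in the order they ran */
};

/* Append one step letter to the trace. */
static void record(struct panic_model *m, char step)
{
	size_t len = strlen(m->trace);

	assert(len + 1 < sizeof(m->trace));
	m->trace[len] = step;
	m->trace[len + 1] = '\0';
}

/* Attempt the kexec; it only succeeds if a crash kernel was loaded. */
static bool model_crash_kexec(struct panic_model *m)
{
	record(m, 'K');
	return m->crash_kernel_loaded;
}

/* Run the notifiers; returns true if one of them stopped the machine. */
static bool model_notifiers(struct panic_model *m)
{
	record(m, 'N');
	return m->notifier_stops;
}

enum panic_outcome model_panic(struct panic_model *m)
{
	m->trace[0] = '\0';

	/* Default order: try kdump first, notifiers are only a fallback. */
	if (!m->post_notifiers && model_crash_kexec(m))
		return OUTCOME_KDUMP;

	if (model_notifiers(m))
		return OUTCOME_STOPPED_IN_NOTIFIER;

	record(m, 'D');

	/* With the option set, kdump is attempted only after the notifiers. */
	if (m->post_notifiers && model_crash_kexec(m))
		return OUTCOME_KDUMP;

	return OUTCOME_NO_DUMP;
}
```

With post_notifiers unset and a crash kernel loaded, the model returns
OUTCOME_KDUMP and the notifiers never run. With post_notifiers set and the
same stopping notifier, it returns OUTCOME_STOPPED_IN_NOTIFIER even though
a crash kernel is loaded, which is the failure mode described for s390.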

Long story short, the problem on s390 is caused by the two hunks in
kernel/panic.c:panic from f06e5153f4ae ("kernel/panic.c: add
"crash_kexec_post_notifiers" option for kdump after panic_notifers").

Besides the problems on s390 I support Dave and think that setting
crash_kexec_post_notifiers by default is wrong. We should keep in mind that
we are in a panic situation. This means that the kernel is in a state where it
doesn't trust itself anymore. So we should keep the code that is run to the
bare minimum as we cannot rely on it to work properly.

Thanks
Philipp

> 
> >   
> > > 
> > >   
> > > >  
> > > >> Further like I have mentioned everytime something like this has come up
> > > >> a call on the kexec on panic code path should be a direct call (That can
> > > >> be audited) not something hidden in a notifier call chain (which can not).
> > > >>  
> > > 
> > > We btw already have a direct call from panic() to kmsg_dump() which is indirectly controlled by crash_kexec_post_notifiers, and it would also be preferable to be able to call it before kdump as well.  
> > 
> > Right, that is the same thing we are talking about.
> > 
> > Thanks
> > Dave
> >   
> 

* Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
  2020-09-29 13:36                       ` Philipp Rudo
@ 2020-09-29 19:10                         ` boris.ostrovsky
  0 siblings, 0 replies; 20+ messages in thread
From: boris.ostrovsky @ 2020-09-29 19:10 UTC (permalink / raw)
  To: Philipp Rudo, Konrad Rzeszutek Wilk
  Cc: Dave Young, Wei Liu, Tianyu Lan, bhe, kexec, linux-kernel,
	Michael Kelley, HATAYAMA Daisuke, Eric W. Biederman,
	Masami Hiramatsu, Eric DeVolder, Andrew Morton, lennart

+Lennart


On 9/29/20 9:36 AM, Philipp Rudo wrote:
> Hi,
>
> On Fri, 25 Sep 2020 10:56:25 -0400
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
>> On Fri, Sep 25, 2020 at 11:05:58AM +0800, Dave Young wrote:
>>> Hi,
>>>
>>> On 09/24/20 at 01:16pm, boris.ostrovsky@oracle.com wrote:  
>>>> On 9/24/20 12:43 PM, Michael Kelley wrote:  
>>>>> From: Eric W. Biederman <ebiederm@xmission.com> Sent: Thursday, September 24, 2020 9:26 AM  
>>>>>> Michael Kelley <mikelley@microsoft.com> writes:
>>>>>>  
>>>>>>>>> Added Hyper-V people and the people who created the param (it is the
>>>>>>>>> commit below). I would also like to remove it if possible; let's see
>>>>>>>>> what people think. But the least we should do is disable the auto
>>>>>>>>> setting in both systemd and the kernel:  
>>>>>>> Hyper-V uses a notifier to inform the host system that a Linux VM has
>>>>>>> panic'ed.  Informing the host is particularly important in a public cloud
>>>>>>> such as Azure so that the cloud software can alert the customer, and can
>>>>>>> track cloud-wide reliability statistics.   Whether a kdump is taken is controlled
>>>>>>> entirely by the customer and how he configures the VM, and we want
>>>>>>> the host to be informed either way.  
>>>>>> Why?
>>>>>>
>>>>>> Why does the host care?
>>>>>> Especially if the VM continues executing into a kdump kernel?  
>>>>> The host itself doesn't care.  But the host is a convenient out-of-band
>>>>> channel for recording that a panic has occurred and to collect basic data
>>>>> about the panic.  This out-of-band channel is then used to notify the end
>>>>> customer that his VM has panic'ed.  Sure, the customer should be running
>>>>> his own monitoring software, but customers don't always do what they
>>>>> should.  Equally important, the out-of-band channel allows the cloud
>>>>> infrastructure software to notice trends, such as that the rate of Linux
>>>>> panics has increased, and that perhaps there is a cloud problem that
>>>>> should be investigated.  
>>>>
>>>> In many cases (especially in a cloud environment) your dump device is remote (e.g. iSCSI) and kdump sometimes (often?) gets stuck because of connectivity issues (which could be the cause of the panic in the first place). So it is quite desirable to inform the infrastructure that the VM is on its way out without waiting for kdump to complete.  
>>> That can probably be done in the kdump kernel if it is really needed, say,
>>> by informing the host that a panic happened and a kdump kernel is running.  
>> If kdump kernel gets to that point. Sometimes (sadly) it ends up being
>> misconfigured and it chokes up - and hence having multiple ways to emit
>> the crash information before running kdump kernel is a life-saver.
>>
>>> But I think setting crash_kexec_post_notifiers by default is still bad.   
>> Because of the way it is run today, I presume? If there were some
>> safe/unsafe policy, that should work, right? I would think that the
>> safe ones that work properly all the time are:
>>
>>  - Hyper-V CRASH_MSRs,
>>  - KVM PVPANIC_[PANIC,CRASHLOAD] push button knob,
>>  - pstore EFI variables,
>>  - dumping in memory.
>>
>> And then some that depend on the firmware version (aka BIOS, and vendor) are:
>>  - ACPI ERST.
>>
>> And then the unsafe:
>>  - s390, PowerPC (I don't actually know what they are, but that
>>     was Dave's primary motivator).
> That won't work on s390. Let me emphasize that the problems on s390 are not the
> notifiers themselves but the fact that they are called before crash_kexec.
>
> On s390 we have multiple dump methods besides kdump. We use a panic notifier to
> trigger these dump methods from the panicking kernel. The problem is that these
> dump methods are less powerful than kdump so we only want to use them as
> fallback, i.e. only use them when either kdump wasn't configured or loading of
> the crash kernel failed for whatever reason. That's why (plus historic reasons)
> our notifier stops the machine when it is called and none of the methods is
> configured, which means that the second crash_kexec is never reached.
>
> Long story short, the problem on s390 is caused by the two hunks in
> kernel/panic.c:panic from f06e5153f4ae ("kernel/panic.c: add
> "crash_kexec_post_notifiers" option for kdump after panic_notifers").
>
> Besides the problems on s390 I support Dave and think that setting
> crash_kexec_post_notifiers by default is wrong. We should keep in mind that
> we are in a panic situation. This means that the kernel is in a state where it
> doesn't trust itself anymore. So we should keep the code that is run to the
> bare minimum as we cannot rely on it to work properly.
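
As a rough illustration of the ordering described in the quoted text above, the following user-space C sketch models the two places where panic() may call crash_kexec. It is not the kernel code: model_panic(), the panic_outcome enum, and the three boolean inputs are hypothetical names invented for this sketch, and the whole notifier chain is reduced to a single flag saying whether some notifier (such as the s390 one) stops the machine.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the modelled panic path (not kernel symbols). */
enum panic_outcome {
    OUTCOME_KDUMP_BOOTED,        /* crash_kexec jumped into the crash kernel */
    OUTCOME_STOPPED_IN_NOTIFIER, /* a panic notifier stopped the machine */
    OUTCOME_NO_DUMP,             /* neither happened; panic() carries on */
};

/*
 * Simplified model of the ordering in kernel/panic.c:panic().
 *
 * post_notifiers:         value of crash_kexec_post_notifiers
 * kdump_loaded:           a crash kernel is loaded, so crash_kexec succeeds
 * notifier_stops_machine: assumption standing in for a notifier that never
 *                         returns, as described for s390
 */
enum panic_outcome model_panic(bool post_notifiers, bool kdump_loaded,
                               bool notifier_stops_machine)
{
    /* First hunk: by default, crash_kexec runs before any notifier. */
    if (!post_notifiers && kdump_loaded)
        return OUTCOME_KDUMP_BOOTED;

    /* The panic notifier chain runs here. */
    if (notifier_stops_machine)
        return OUTCOME_STOPPED_IN_NOTIFIER;

    /* Second hunk: with the parameter set, crash_kexec runs only now. */
    if (post_notifiers && kdump_loaded)
        return OUTCOME_KDUMP_BOOTED;

    return OUTCOME_NO_DUMP;
}
```

In this model, model_panic(true, true, true) ends in OUTCOME_STOPPED_IN_NOTIFIER even though a crash kernel is loaded, while model_panic(false, true, true) reaches kdump first. With the parameter off and no crash kernel loaded, the notifier still acts as the fallback.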

There is a pending pull request to revert the notifiers' default in systemd:
https://github.com/systemd/systemd/pull/16950

If this change goes through, then Dave's patch will be unnecessary.

-boris

>
> Thanks
> Philipp
>
>>>>>> Further, as I have mentioned every time something like this has come up,
>>>>>> a call on the kexec on panic code path should be a direct call (that can
>>>>>> be audited), not something hidden in a notifier call chain (which cannot).
>>>>>>  
>>>> We btw already have a direct call from panic() to kmsg_dump() which is indirectly controlled by crash_kexec_post_notifiers, and it would also be preferable to be able to call it before kdump as well.  
>>> Right, that is the same thing we are talking about.
>>>
>>> Thanks
>>> Dave
>>>   


end of thread, other threads:[~2020-09-29 19:11 UTC | newest]

Thread overview: 20+ messages
2020-09-18  3:25 [PATCH] Only allow to set crash_kexec_post_notifiers on boot time Dave Young
2020-09-19  0:47 ` Andrew Morton
2020-09-19  7:26   ` Dave Young
2020-09-21 20:18   ` Konrad Rzeszutek Wilk
2020-09-22  1:45     ` Eric W. Biederman
2020-09-23  2:43       ` Dave Young
2020-09-23 15:48         ` Konrad Rzeszutek Wilk
2020-09-24 16:15           ` Michael Kelley
2020-09-24 16:25             ` Eric W. Biederman
2020-09-24 16:43               ` Michael Kelley
2020-09-24 17:16                 ` boris.ostrovsky
2020-09-25  3:05                   ` Dave Young
2020-09-25 14:56                     ` Konrad Rzeszutek Wilk
2020-09-27  2:51                       ` Dave Young
2020-09-29 13:36                       ` Philipp Rudo
2020-09-29 19:10                         ` boris.ostrovsky
2020-09-22 10:58     ` Philipp Rudo
2020-09-22 14:50       ` boris.ostrovsky
2020-09-22 17:04         ` Guilherme G. Piccoli
2020-09-23  2:25     ` Dave Young
