From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    bhe@redhat.com, linux-kernel@vger.kernel.org,
    kexec@lists.infradead.org,
    Eric DeVolder <eric.devolder@oracle.com>,
    Boris Ostrovsky <boris.ostrovsky@oracle.com>,
    Tianyu Lan <Tianyu.Lan@microsoft.com>,
    Michael Kelley <mikelley@microsoft.com>,
    Wei Liu <wei.liu@kernel.org>,
    Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
    HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Subject: Re: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
Date: Wed, 23 Sep 2020 11:48:25 -0400
Message-ID: <20200923154825.GC7635@char.us.oracle.com>
In-Reply-To: <20200923024329.GB3642@dhcp-128-65.nay.redhat.com>

On Wed, Sep 23, 2020 at 10:43:29AM +0800, Dave Young wrote:
> + more people who may care about this param

Paarty time!! (See below, didn't snip any comments)

> On 09/21/20 at 08:45pm, Eric W. Biederman wrote:
> > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> >
> > > On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> > >> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> > >>
> > >> > crash_kexec_post_notifiers enables running various panic notifier
> > >> > before kdump kernel booting. This increases risks of kdump failure.
> > >> > It is well documented in kernel-parameters.txt. We do not suggest
> > >> > people to enable it together with kdump unless he/she is really sure.
> > >> > This is also not suggested to be enabled by default when users are
> > >> > not aware in distributions.
> > >> >
> > >> > But unfortunately it is enabled by default in systemd, see below
> > >> > discussions in a systemd report, we can not convince systemd to change
> > >> > it:
> > >> > https://github.com/systemd/systemd/issues/16661
> > >> >
> > >> > Actually we have got reports about kdump kernel hangs in both s390x
> > >> > and powerpcle cases caused by the systemd change, also some x86 cases
> > >> > could also be caused by the same (although that is in Hyper-V code
> > >> > instead of systemd, that need to be addressed separately).
> > >
> > > Perhaps it may be better to fix the issus on s390x and PowerPC as well?
> > >
> > >> >
> > >> > Thus to avoid the auto enablement here just disable the param writable
> > >> > permission in sysfs.
> > >> >
> > >>
> > >> Well. I don't think this is at all a desirable way of resolving a
> > >> disagreement with the systemd developers
> > >>
> > >> At the above github address I'm seeing "ryncsn added a commit to
> > >> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> > >> enable crash_kexec_post_notifiers by default". So didn't that address
> > >> the issue?
> > >
> > > It does in systemd, but there is a strong interest in making this on
> > > by default.
> >
> > There is also a strong interest in removing this code entirely from the
> > kernel.
>
> Added Hyper-V people and people who created the param, it is below
> commit, I also want to remove it if possible, let's see how people
> think, but the least way should be to disable the auto setting in both systemd
> and kernel:
>
> commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45
> Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Date:   Fri Jun 6 14:37:07 2014 -0700
>
>     kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers
>
>     Add a "crash_kexec_post_notifiers" boot option to run kdump after
>     running panic_notifiers and dump kmsg. This can help rare situations
>     where kdump fails because of unstable crashed kernel or hardware failure
>     (memory corruption on critical data/code), or the 2nd kernel is already
>     broken by the 1st kernel (it's a broken behavior, but who can guarantee
>     that the "crashed" kernel works correctly?).
>
>     Usage: add "crash_kexec_post_notifiers" to kernel boot option.
>
>     Note that this actually increases risks of the failure of kdump. This
>     option should be set only if you worry about the rare case of kdump
>     failure rather than increasing the chance of success.

If this is such a risky knob, one that leads to bugs that folks are backing
away from with disgust on their faces, then perhaps the only way to go about
this is to limit the exposure to known working situations on firmware that
we can control?

That is, enable only a subset of post notifiers, which determine for
themselves whether they are OK to run when the conditions are blessed?

I think that would satisfy the conditions where you have to deal with
unsavory bugs that end up on your plate (and aren't fun because there is no
way of fixing them), while at the same time allowing multiple ways to save
the crash?

Please don't take away something that is quite useful in the field. Can we
hammer out something that will remove your pain points?

> >
> > > This failure is a case in point.
> >
> > I think I am at my I told you so point. This is what all of the testing
> > over all the years has said. Leaving functionality to the peculiarities
> > of firmware when you don't have to, and can actually control what is
> > going on doesn't work.
> >
> > Eric
> >
> >
>
> Thanks
> Dave
>
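
Editorial sketch, not part of the original message: the patch under discussion proposes removing the sysfs write permission on this parameter so that it can only be set at boot. The Python snippet below is a minimal way to check a running system, assuming the parameter is exposed at /sys/module/kernel/parameters/crash_kexec_post_notifiers and that boolean parameters read back as "Y" or "N". The helper name describe_param is hypothetical and exists only for illustration.

```python
import os
import stat
from pathlib import Path

# Assumed location: kernel core parameters are exposed under
# /sys/module/kernel/parameters/ when sysfs is mounted.
PARAM_PATH = Path("/sys/module/kernel/parameters/crash_kexec_post_notifiers")


def describe_param(path=PARAM_PATH):
    """Return (enabled, writable_at_runtime) for a boolean parameter file.

    Hypothetical helper: 'enabled' is parsed from the file content and
    'writable_at_runtime' is derived from the file's permission bits.
    """
    raw = Path(path).read_text().strip()
    enabled = raw in ("Y", "y", "1")
    mode = stat.S_IMODE(os.stat(path).st_mode)
    writable = bool(mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return enabled, writable


if __name__ == "__main__":
    if PARAM_PATH.exists():
        enabled, writable = describe_param()
        print(f"crash_kexec_post_notifiers enabled={enabled} "
              f"runtime-writable={writable}")
    else:
        print("parameter file not found (not Linux, or sysfs not mounted)")
```

If the file reports no write bits, user space (systemd included) cannot flip the setting after boot, which is the behaviour the patch aims for; the boot-time option remains available either way.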