linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
@ 2015-03-06 16:31 "Hatayama, Daisuke/畑山 大輔"
  2015-03-06 18:08 ` Vivek Goyal
  2015-03-23  3:47 ` Baoquan He
  0 siblings, 2 replies; 19+ messages in thread
From: "Hatayama, Daisuke/畑山 大輔" @ 2015-03-06 16:31 UTC (permalink / raw)
  To: ebiederm, Vivek Goyal
  Cc: masami.hiramatsu.pt, hidehiro.kawai.ez, bhe, linux-kernel, kexec

The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
"crash_kexec_post_notifiers" kernel boot option, which toggles
wheather panic() calls crash_kexec() before panic_notifiers and dump
kmsg or after.

The problem is that the commit overlooks panic_on_oops kernel boot
option. If it is enabled, crash_kexec() is called directly without
going through panic() in oops path.

To fix this issue, this patch adds a check to
"crash_kexec_post_notifiers" in the condition of kexec_should_crash().

Also, put a comment in kexec_should_crash() to explain not obvious
things on this patch.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 include/linux/kernel.h |  3 +++
 kernel/kexec.c         | 11 +++++++++++
 kernel/panic.c         |  2 +-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d6d630d..07483c7 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi;
 extern int panic_on_io_nmi;
 extern int panic_on_warn;
 extern int sysctl_panic_on_stackoverflow;
+
+extern bool crash_kexec_post_notifiers;
+
 /*
  * Only to be used by arch init code. If the user over-wrote the default
  * CONFIG_PANIC_TIMEOUT, honor it.
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 38c25b1..5bf6077 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -84,6 +84,17 @@ struct resource crashk_low_res = {

 int kexec_should_crash(struct task_struct *p)
 {
+	/*
+	 * If crash_kexec_post_notifiers is enabled, don't run
+	 * crash_kexec() here yet, which must be run after panic
+	 * notifiers in panic().
+	 */
+	if (crash_kexec_post_notifiers)
+		return 0;
+	/*
+	 * There are 4 panic() calls in do_exit() path, each of which
+	 * calls corresponds to each of these 4 conditions.
+	 */
 	if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
 		return 1;
 	return 0;
diff --git a/kernel/panic.c b/kernel/panic.c
index 8136ad7..79ca912 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -32,7 +32,7 @@ static unsigned long tainted_mask;
 static int pause_on_oops;
 static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
-static bool crash_kexec_post_notifiers;
+bool crash_kexec_post_notifiers;
 int panic_on_warn __read_mostly;

 int panic_timeout = CONFIG_PANIC_TIMEOUT;
-- 
1.9.3



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-06 16:31 [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path "Hatayama, Daisuke/畑山 大輔"
@ 2015-03-06 18:08 ` Vivek Goyal
  2015-03-23  3:47 ` Baoquan He
  1 sibling, 0 replies; 19+ messages in thread
From: Vivek Goyal @ 2015-03-06 18:08 UTC (permalink / raw)
  To: "Hatayama, Daisuke/畑山 大輔"
  Cc: ebiederm, masami.hiramatsu.pt, hidehiro.kawai.ez, bhe,
	linux-kernel, kexec

On Sat, Mar 07, 2015 at 01:31:01AM +0900, "Hatayama, Daisuke/畑山 大輔" wrote:
> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> "crash_kexec_post_notifiers" kernel boot option, which toggles
> wheather panic() calls crash_kexec() before panic_notifiers and dump
> kmsg or after.
> 
> The problem is that the commit overlooks panic_on_oops kernel boot
> option. If it is enabled, crash_kexec() is called directly without
> going through panic() in oops path.
> 
> To fix this issue, this patch adds a check to
> "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> 
> Also, put a comment in kexec_should_crash() to explain not obvious
> things on this patch.
> 
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> Acked-by: Baoquan He <bhe@redhat.com>
> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Looks good to me.

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Thanks
Vivek

> ---
>  include/linux/kernel.h |  3 +++
>  kernel/kexec.c         | 11 +++++++++++
>  kernel/panic.c         |  2 +-
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index d6d630d..07483c7 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi;
>  extern int panic_on_io_nmi;
>  extern int panic_on_warn;
>  extern int sysctl_panic_on_stackoverflow;
> +
> +extern bool crash_kexec_post_notifiers;
> +
>  /*
>   * Only to be used by arch init code. If the user over-wrote the default
>   * CONFIG_PANIC_TIMEOUT, honor it.
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 38c25b1..5bf6077 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -84,6 +84,17 @@ struct resource crashk_low_res = {
> 
>  int kexec_should_crash(struct task_struct *p)
>  {
> +	/*
> +	 * If crash_kexec_post_notifiers is enabled, don't run
> +	 * crash_kexec() here yet, which must be run after panic
> +	 * notifiers in panic().
> +	 */
> +	if (crash_kexec_post_notifiers)
> +		return 0;
> +	/*
> +	 * There are 4 panic() calls in do_exit() path, each of which
> +	 * calls corresponds to each of these 4 conditions.
> +	 */
>  	if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
>  		return 1;
>  	return 0;
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8136ad7..79ca912 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -32,7 +32,7 @@ static unsigned long tainted_mask;
>  static int pause_on_oops;
>  static int pause_on_oops_flag;
>  static DEFINE_SPINLOCK(pause_on_oops_lock);
> -static bool crash_kexec_post_notifiers;
> +bool crash_kexec_post_notifiers;
>  int panic_on_warn __read_mostly;
> 
>  int panic_timeout = CONFIG_PANIC_TIMEOUT;
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-06 16:31 [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path "Hatayama, Daisuke/畑山 大輔"
  2015-03-06 18:08 ` Vivek Goyal
@ 2015-03-23  3:47 ` Baoquan He
  2015-03-23  7:19   ` Ingo Molnar
  1 sibling, 1 reply; 19+ messages in thread
From: Baoquan He @ 2015-03-23  3:47 UTC (permalink / raw)
  To: "Hatayama, Daisuke/畑山 大輔"
  Cc: ebiederm, Vivek Goyal, masami.hiramatsu.pt, hidehiro.kawai.ez,
	linux-kernel, kexec, akpm, mingo, bp

CC more people ...

On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> "crash_kexec_post_notifiers" kernel boot option, which toggles
> wheather panic() calls crash_kexec() before panic_notifiers and dump
> kmsg or after.
> 
> The problem is that the commit overlooks panic_on_oops kernel boot
> option. If it is enabled, crash_kexec() is called directly without
> going through panic() in oops path.
> 
> To fix this issue, this patch adds a check to
> "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> 
> Also, put a comment in kexec_should_crash() to explain not obvious
> things on this patch.
> 
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> Acked-by: Baoquan He <bhe@redhat.com>
> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> ---
>  include/linux/kernel.h |  3 +++
>  kernel/kexec.c         | 11 +++++++++++
>  kernel/panic.c         |  2 +-
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index d6d630d..07483c7 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi;
>  extern int panic_on_io_nmi;
>  extern int panic_on_warn;
>  extern int sysctl_panic_on_stackoverflow;
> +
> +extern bool crash_kexec_post_notifiers;
> +
>  /*
>   * Only to be used by arch init code. If the user over-wrote the default
>   * CONFIG_PANIC_TIMEOUT, honor it.
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 38c25b1..5bf6077 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -84,6 +84,17 @@ struct resource crashk_low_res = {
> 
>  int kexec_should_crash(struct task_struct *p)
>  {
> +	/*
> +	 * If crash_kexec_post_notifiers is enabled, don't run
> +	 * crash_kexec() here yet, which must be run after panic
> +	 * notifiers in panic().
> +	 */
> +	if (crash_kexec_post_notifiers)
> +		return 0;
> +	/*
> +	 * There are 4 panic() calls in do_exit() path, each of which
> +	 * calls corresponds to each of these 4 conditions.
> +	 */
>  	if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
>  		return 1;
>  	return 0;
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8136ad7..79ca912 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -32,7 +32,7 @@ static unsigned long tainted_mask;
>  static int pause_on_oops;
>  static int pause_on_oops_flag;
>  static DEFINE_SPINLOCK(pause_on_oops_lock);
> -static bool crash_kexec_post_notifiers;
> +bool crash_kexec_post_notifiers;
>  int panic_on_warn __read_mostly;
> 
>  int panic_timeout = CONFIG_PANIC_TIMEOUT;
> -- 
> 1.9.3
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23  3:47 ` Baoquan He
@ 2015-03-23  7:19   ` Ingo Molnar
  2015-03-23 13:37     ` Vivek Goyal
                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Ingo Molnar @ 2015-03-23  7:19 UTC (permalink / raw)
  To: Baoquan He
  Cc: "Hatayama, Daisuke/畑山 大輔",
	ebiederm, Vivek Goyal, masami.hiramatsu.pt, hidehiro.kawai.ez,
	linux-kernel, kexec, akpm, mingo, bp


* Baoquan He <bhe@redhat.com> wrote:

> CC more people ...
> 
> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > "crash_kexec_post_notifiers" kernel boot option, which toggles
> > wheather panic() calls crash_kexec() before panic_notifiers and dump
> > kmsg or after.
> > 
> > The problem is that the commit overlooks panic_on_oops kernel boot
> > option. If it is enabled, crash_kexec() is called directly without
> > going through panic() in oops path.
> > 
> > To fix this issue, this patch adds a check to
> > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > 
> > Also, put a comment in kexec_should_crash() to explain not obvious
> > things on this patch.
> > 
> > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > Acked-by: Baoquan He <bhe@redhat.com>
> > Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > ---
> >  include/linux/kernel.h |  3 +++
> >  kernel/kexec.c         | 11 +++++++++++
> >  kernel/panic.c         |  2 +-
> >  3 files changed, 15 insertions(+), 1 deletion(-)

This is hack upon hack, but why was this crap merged in the first 
place?

I see two problems just by cursory review:

1)

Firstly, the real bug in:

  f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")

Was that crash_kexec() was called unconditionally after notifiers were 
called, which should be fixed via the simple patch below (untested). 
Looks much simpler than your fix.

2)

Secondly, and more importantly, the whole premise of commit 
f06e5153f4ae is broken IMHO:

 "This can help rare situations where kdump fails because of unstable
  crashed kernel or hardware failure (memory corruption on critical
  data/code)"

wtf?

If the kernel crashed due to a kernel crash, then the kernel booting 
up in whatever hardware state should be able to do a clean bootup. The 
fix for those 'rare situations' should be to fix the real bug (for 
example by making hardware driver init (or deinit) sequences more 
robust), not to paper it over by ordering around crash-time sequences 
...

If it crashed due to some hardware failure, there's literally an 
infinite amount of failure modes that may or may not be impacted by 
kexec crash-time handling ordering. We don't want to put a zillion 
such flags into the kernel proper just to allow the perturbation of 
the kernel.

Thanks,

	Ingo

diff --git a/kernel/panic.c b/kernel/panic.c
index 8136ad76e5fd..774614f72cbd 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -142,7 +142,8 @@ void panic(const char *fmt, ...)
 	 * Note: since some panic_notifiers can make crashed kernel
 	 * more unstable, it can increase risks of the kdump failure too.
 	 */
-	crash_kexec(NULL);
+	if (crash_kexec_post_notifiers)
+		crash_kexec(NULL);
 
 	bust_spinlocks(0);
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23  7:19   ` Ingo Molnar
@ 2015-03-23 13:37     ` Vivek Goyal
  2015-03-23 13:50       ` Ingo Molnar
  2015-03-23 15:36     ` Vivek Goyal
  2015-03-24  3:30     ` Masami Hiramatsu
  2 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-03-23 13:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, masami.hiramatsu.pt, hidehiro.kawai.ez, linux-kernel,
	kexec, akpm, mingo, bp

On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote:
> 
> * Baoquan He <bhe@redhat.com> wrote:
> 
> > CC more people ...
> > 
> > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > > "crash_kexec_post_notifiers" kernel boot option, which toggles
> > > wheather panic() calls crash_kexec() before panic_notifiers and dump
> > > kmsg or after.
> > > 
> > > The problem is that the commit overlooks panic_on_oops kernel boot
> > > option. If it is enabled, crash_kexec() is called directly without
> > > going through panic() in oops path.
> > > 
> > > To fix this issue, this patch adds a check to
> > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > > 
> > > Also, put a comment in kexec_should_crash() to explain not obvious
> > > things on this patch.
> > > 
> > > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > > Acked-by: Baoquan He <bhe@redhat.com>
> > > Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > ---
> > >  include/linux/kernel.h |  3 +++
> > >  kernel/kexec.c         | 11 +++++++++++
> > >  kernel/panic.c         |  2 +-
> > >  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> This is hack upon hack, but why was this crap merged in the first 
> place?
> 
> I see two problems just by cursory review:
> 
> 1)
> 
> Firstly, the real bug in:
> 
>   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> 
> Was that crash_kexec() was called unconditionally after notifiers were 
> called, which should be fixed via the simple patch below (untested). 
> Looks much simpler than your fix.
> 

Hi Ingo,

Agreed. Your patch looks good.

> 2)
> 
> Secondly, and more importantly, the whole premise of commit 
> f06e5153f4ae is broken IMHO:
> 
>  "This can help rare situations where kdump fails because of unstable
>   crashed kernel or hardware failure (memory corruption on critical
>   data/code)"
> 
> wtf?
> 
> If the kernel crashed due to a kernel crash, then the kernel booting 
> up in whatever hardware state should be able to do a clean bootup. The 
> fix for those 'rare situations' should be to fix the real bug (for 
> example by making hardware driver init (or deinit) sequences more 
> robust), not to paper it over by ordering around crash-time sequences 
> ...
> 
> If it crashed due to some hardware failure, there's literally an 
> infinite amount of failure modes that may or may not be impacted by 
> kexec crash-time handling ordering. We don't want to put a zillion 
> such flags into the kernel proper just to allow the perturbation of 
> the kernel.

I think one of the motivations behind this patch was call to kmsg_dump().
Some vendors have been wanting to have the capability to save kernel logs
to some NVRAM before transition to second kernel happens. Their argument
is that kdump does not succeed all the time and if kdump does not succeed
then atleast they have something to work with (kernel logs retrieved
from pstore interface).

Not that I agree fully with this as problem might happen while we try
to run panic_notifiers or kmsg_dump hooks and never transition into
kdump kernel.

And it has been literally years since some developers have been pushing for
allowing to run panic notifiers before crash_kexec(). Eric Biederman has been
pushing back saying it reduces the reliability of kdump operation so this
is not acceptable.

So while it is very hacky, this command line option was intorduced which
allowed to override default crash_kexec() behavior and those who want
to do additional things (at their own risk) before transition to second
kernel, can specify this parameter.

Thanks
Vivek

> 
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8136ad76e5fd..774614f72cbd 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -142,7 +142,8 @@ void panic(const char *fmt, ...)
>  	 * Note: since some panic_notifiers can make crashed kernel
>  	 * more unstable, it can increase risks of the kdump failure too.
>  	 */
> -	crash_kexec(NULL);
> +	if (crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
>  
>  	bust_spinlocks(0);
>  

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23 13:37     ` Vivek Goyal
@ 2015-03-23 13:50       ` Ingo Molnar
  2015-03-23 14:31         ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2015-03-23 13:50 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, masami.hiramatsu.pt, hidehiro.kawai.ez, linux-kernel,
	kexec, akpm, mingo, bp


* Vivek Goyal <vgoyal@redhat.com> wrote:

> On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote:
> > 
> > * Baoquan He <bhe@redhat.com> wrote:
> > 
> > > CC more people ...
> > > 
> > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > > > "crash_kexec_post_notifiers" kernel boot option, which toggles
> > > > wheather panic() calls crash_kexec() before panic_notifiers and dump
> > > > kmsg or after.
> > > > 
> > > > The problem is that the commit overlooks panic_on_oops kernel boot
> > > > option. If it is enabled, crash_kexec() is called directly without
> > > > going through panic() in oops path.
> > > > 
> > > > To fix this issue, this patch adds a check to
> > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > > > 
> > > > Also, put a comment in kexec_should_crash() to explain not obvious
> > > > things on this patch.
> > > > 
> > > > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > > > Acked-by: Baoquan He <bhe@redhat.com>
> > > > Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > > > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > > ---
> > > >  include/linux/kernel.h |  3 +++
> > > >  kernel/kexec.c         | 11 +++++++++++
> > > >  kernel/panic.c         |  2 +-
> > > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > This is hack upon hack, but why was this crap merged in the first 
> > place?
> > 
> > I see two problems just by cursory review:
> > 
> > 1)
> > 
> > Firstly, the real bug in:
> > 
> >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> > 
> > Was that crash_kexec() was called unconditionally after notifiers were 
> > called, which should be fixed via the simple patch below (untested). 
> > Looks much simpler than your fix.
> > 
> 
> Hi Ingo,
> 
> Agreed. Your patch looks good.

In case you want that simpler fix and need my SOB:

  Signed-off-by: Ingo Molnar <mingo@kernel.org>

(but I have not tested it.)

> > Secondly, and more importantly, the whole premise of commit 
> > f06e5153f4ae is broken IMHO:
> > 
> >  "This can help rare situations where kdump fails because of unstable
> >   crashed kernel or hardware failure (memory corruption on critical
> >   data/code)"
> > 
> > wtf?
> > 
> > If the kernel crashed due to a kernel crash, then the kernel booting 
> > up in whatever hardware state should be able to do a clean bootup. The 
> > fix for those 'rare situations' should be to fix the real bug (for 
> > example by making hardware driver init (or deinit) sequences more 
> > robust), not to paper it over by ordering around crash-time sequences 
> > ...
> > 
> > If it crashed due to some hardware failure, there's literally an 
> > infinite amount of failure modes that may or may not be impacted by 
> > kexec crash-time handling ordering. We don't want to put a zillion 
> > such flags into the kernel proper just to allow the perturbation of 
> > the kernel.
> 
> I think one of the motivations behind this patch was call to kmsg_dump().
> Some vendors have been wanting to have the capability to save kernel logs
> to some NVRAM before transition to second kernel happens. Their argument
> is that kdump does not succeed all the time and if kdump does not succeed
> then atleast they have something to work with (kernel logs retrieved
> from pstore interface).

Doesn't pstore attach itself to printk itself? AFAICS it does:

 fs/pstore/platform.c:   register_console(&pstore_console);

so the printk log leading up to and including the crash should be 
available, regardless of this patch. What am I missing?

> Not that I agree fully with this as problem might happen while we 
> try to run panic_notifiers or kmsg_dump hooks and never transition 
> into kdump kernel.

btw., this is the big problem with 'notifiers' in general: they are 
opaque with barely any semantics defined, and a source of constant 
confusion.

> And it has been literally years since some developers have been 
> pushing for allowing to run panic notifiers before crash_kexec(). 
> Eric Biederman has been pushing back saying it reduces the 
> reliability of kdump operation so this is not acceptable.

So what do those notifiers do?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23 13:50       ` Ingo Molnar
@ 2015-03-23 14:31         ` Vivek Goyal
  2015-03-23 16:01           ` Don Zickus
  2015-03-24  3:58           ` Masami Hiramatsu
  0 siblings, 2 replies; 19+ messages in thread
From: Vivek Goyal @ 2015-03-23 14:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, masami.hiramatsu.pt, hidehiro.kawai.ez, linux-kernel,
	kexec, akpm, mingo, bp, Don Zickus

On Mon, Mar 23, 2015 at 02:50:46PM +0100, Ingo Molnar wrote:
> 
> * Vivek Goyal <vgoyal@redhat.com> wrote:
> 
> > On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote:
> > > 
> > > * Baoquan He <bhe@redhat.com> wrote:
> > > 
> > > > CC more people ...
> > > > 
> > > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > > > > "crash_kexec_post_notifiers" kernel boot option, which toggles
> > > > > wheather panic() calls crash_kexec() before panic_notifiers and dump
> > > > > kmsg or after.
> > > > > 
> > > > > The problem is that the commit overlooks panic_on_oops kernel boot
> > > > > option. If it is enabled, crash_kexec() is called directly without
> > > > > going through panic() in oops path.
> > > > > 
> > > > > To fix this issue, this patch adds a check to
> > > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > > > > 
> > > > > Also, put a comment in kexec_should_crash() to explain not obvious
> > > > > things on this patch.
> > > > > 
> > > > > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > > > > Acked-by: Baoquan He <bhe@redhat.com>
> > > > > Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > > > > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > > > ---
> > > > >  include/linux/kernel.h |  3 +++
> > > > >  kernel/kexec.c         | 11 +++++++++++
> > > > >  kernel/panic.c         |  2 +-
> > > > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > This is hack upon hack, but why was this crap merged in the first 
> > > place?
> > > 
> > > I see two problems just by cursory review:
> > > 
> > > 1)
> > > 
> > > Firstly, the real bug in:
> > > 
> > >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> > > 
> > > Was that crash_kexec() was called unconditionally after notifiers were 
> > > called, which should be fixed via the simple patch below (untested). 
> > > Looks much simpler than your fix.
> > > 
> > 
> > Hi Ingo,
> > 
> > Agreed. Your patch looks good.
> 
> In case you want that simpler fix and need my SOB:
> 
>   Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> (but I have not tested it.)

I will quickly test it.

So this is a general fix but not a replacement for fix in this patch?

Because the problem original patch is trying to fix is that crash_kexec()
can be called from outside panic() too (kexec_should_crash()) and in that
case panic notifiers will not be called. So this patch is just trying to
delay the call to crash_kexec() to make it run much later.

> 
> > > Secondly, and more importantly, the whole premise of commit 
> > > f06e5153f4ae is broken IMHO:
> > > 
> > >  "This can help rare situations where kdump fails because of unstable
> > >   crashed kernel or hardware failure (memory corruption on critical
> > >   data/code)"
> > > 
> > > wtf?
> > > 
> > > If the kernel crashed due to a kernel crash, then the kernel booting 
> > > up in whatever hardware state should be able to do a clean bootup. The 
> > > fix for those 'rare situations' should be to fix the real bug (for 
> > > example by making hardware driver init (or deinit) sequences more 
> > > robust), not to paper it over by ordering around crash-time sequences 
> > > ...
> > > 
> > > If it crashed due to some hardware failure, there's literally an 
> > > infinite amount of failure modes that may or may not be impacted by 
> > > kexec crash-time handling ordering. We don't want to put a zillion 
> > > such flags into the kernel proper just to allow the perturbation of 
> > > the kernel.
> > 
> > I think one of the motivations behind this patch was call to kmsg_dump().
> > Some vendors have been wanting to have the capability to save kernel logs
> > to some NVRAM before transition to second kernel happens. Their argument
> > is that kdump does not succeed all the time and if kdump does not succeed
> > then atleast they have something to work with (kernel logs retrieved
> > from pstore interface).
> 
> Doesn't pstore attach itself to printk itself? AFAICS it does:
> 
>  fs/pstore/platform.c:   register_console(&pstore_console);
> 
> so the printk log leading up to and including the crash should be 
> available, regardless of this patch. What am I missing?

That's a good point. I was not aware of it. I am Ccing Don Zickus as
he has spent some time on this in the past.

Masami, would you have thougths on this? IIRC, one reason why kmsg_dump()
was written so that one could dump kernel messages to an NVRAM. Of one
could simple register pstore as console, then how kmsg_dump() will
continue to be useful?

> 
> > Not that I agree fully with this as problem might happen while we 
> > try to run panic_notifiers or kmsg_dump hooks and never transition 
> > into kdump kernel.
> 
> btw., this is the big problem with 'notifiers' in general: they are 
> opaque with barely any semantics defined, and a source of constant 
> confusion.

Agreed. That's the reason Eric never liked the idea of letting panic
notifiers run before crash_kexec().

> 
> > And it has been literally years since some developers have been 
> > pushing for allowing to run panic notifiers before crash_kexec(). 
> > Eric Biederman has been pushing back saying it reduces the 
> > reliability of kdump operation so this is not acceptable.
> 
> So what do those notifiers do?

IIRC, two main reasons had come in the past.

- In a cluster of nodes, people wanted to send some sort of notifications
  to main server that a node has crashed and don't fence it off as it
  might be saving dump.

- And saving kernel logs to non volatile store.

There might be more and I might not be aware about these. Hatayama and
Masami, can you shed more light on this.

BTW, first problem we faced in our clusters too and now it has been fixed.
Basically we send notifications in second kernel in user space to master
server that this node is still saving dump so don't fence it off.

Thanks
Vivek

> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23  7:19   ` Ingo Molnar
  2015-03-23 13:37     ` Vivek Goyal
@ 2015-03-23 15:36     ` Vivek Goyal
  2015-03-24  3:30     ` Masami Hiramatsu
  2 siblings, 0 replies; 19+ messages in thread
From: Vivek Goyal @ 2015-03-23 15:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, masami.hiramatsu.pt, hidehiro.kawai.ez, linux-kernel,
	kexec, akpm, mingo, bp

On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote:
> 
> * Baoquan He <bhe@redhat.com> wrote:
> 
> > CC more people ...
> > 
> > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > > "crash_kexec_post_notifiers" kernel boot option, which toggles
> > > wheather panic() calls crash_kexec() before panic_notifiers and dump
> > > kmsg or after.
> > > 
> > > The problem is that the commit overlooks panic_on_oops kernel boot
> > > option. If it is enabled, crash_kexec() is called directly without
> > > going through panic() in oops path.
> > > 
> > > To fix this issue, this patch adds a check to
> > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > > 
> > > Also, put a comment in kexec_should_crash() to explain not obvious
> > > things on this patch.
> > > 
> > > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > > Acked-by: Baoquan He <bhe@redhat.com>
> > > Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > ---
> > >  include/linux/kernel.h |  3 +++
> > >  kernel/kexec.c         | 11 +++++++++++
> > >  kernel/panic.c         |  2 +-
> > >  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> This is hack upon hack, but why was this crap merged in the first 
> place?
> 
> I see two problems just by cursory review:
> 
> 1)
> 
> Firstly, the real bug in:
> 
>   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> 
> Was that crash_kexec() was called unconditionally after notifiers were 
> called, which should be fixed via the simple patch below (untested). 
> Looks much simpler than your fix.
> 
> 2)
> 
> Secondly, and more importantly, the whole premise of commit 
> f06e5153f4ae is broken IMHO:
> 
>  "This can help rare situations where kdump fails because of unstable
>   crashed kernel or hardware failure (memory corruption on critical
>   data/code)"
> 
> wtf?
> 
> If the kernel crashed due to a kernel crash, then the kernel booting 
> up in whatever hardware state should be able to do a clean bootup. The 
> fix for those 'rare situations' should be to fix the real bug (for 
> example by making hardware driver init (or deinit) sequences more 
> robust), not to paper it over by ordering around crash-time sequences 
> ...
> 
> If it crashed due to some hardware failure, there's literally an 
> infinite amount of failure modes that may or may not be impacted by 
> kexec crash-time handling ordering. We don't want to put a zillion 
> such flags into the kernel proper just to allow the perturbation of 
> the kernel.
> 
> Thanks,
> 
> 	Ingo
> 

I quickly tested this patch to make sure I can still transition into
second kernel when crash_kexec_post_notifiers is specified on command
line. I have not tried running any notifiers. So.

Tested-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>

This should be a general fix and not a replacement for the patch
in question in this mail thread. 

Thanks
Vivek

> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8136ad76e5fd..774614f72cbd 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -142,7 +142,8 @@ void panic(const char *fmt, ...)
>  	 * Note: since some panic_notifiers can make crashed kernel
>  	 * more unstable, it can increase risks of the kdump failure too.
>  	 */
> -	crash_kexec(NULL);
> +	if (crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
>  
>  	bust_spinlocks(0);
>  

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23 14:31         ` Vivek Goyal
@ 2015-03-23 16:01           ` Don Zickus
  2015-03-24  3:58           ` Masami Hiramatsu
  1 sibling, 0 replies; 19+ messages in thread
From: Don Zickus @ 2015-03-23 16:01 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Ingo Molnar, Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, masami.hiramatsu.pt, hidehiro.kawai.ez, linux-kernel,
	kexec, akpm, mingo, bp

On Mon, Mar 23, 2015 at 10:31:58AM -0400, Vivek Goyal wrote:
> > > I think one of the motivations behind this patch was call to kmsg_dump().
> > > Some vendors have been wanting to have the capability to save kernel logs
> > > to some NVRAM before transition to second kernel happens. Their argument
> > > is that kdump does not succeed all the time and if kdump does not succeed
> > > then atleast they have something to work with (kernel logs retrieved
> > > from pstore interface).
> > 
> > Doesn't pstore attach itself to printk itself? AFAICS it does:
> > 
> >  fs/pstore/platform.c:   register_console(&pstore_console);
> > 
> > so the printk log leading up to and including the crash should be 
> > available, regardless of this patch. What am I missing?
> 
> That's a good point. I was not aware of it. I am Ccing Don Zickus as
> he has spent some time on this in the past.

Hi,

I will throw my two cents in here, though I expect Daisuke to provide better
info.

A number of years ago when I was helping work through some of the birthing
pains of the backend for pstore, we didn't have console support.  I don't
think it made sense for x86 either because:

- lack of space in nvram (for large logs)
- you could mark space for deletion, but space was only recovered on reboot
- the state machine would be slow to write (though mmap might have been
  faster)

Looking through the history of who introduced register_console, it looks
like it was the ARM folks.  They might have a better implementation for a
backend that does not have the above limitations.  I don't know.


> 
> Masami, would you have thougths on this? IIRC, one reason why kmsg_dump()
> was written so that one could dump kernel messages to an NVRAM. Of one
> could simple register pstore as console, then how kmsg_dump() will
> continue to be useful?
> 
> > 
> > > Not that I agree fully with this as problem might happen while we 
> > > try to run panic_notifiers or kmsg_dump hooks and never transition 
> > > into kdump kernel.
> > 
> > btw., this is the big problem with 'notifiers' in general: they are 
> > opaque with barely any semantics defined, and a source of constant 
> > confusion.
> 
> Agreed. That's the reason Eric never liked the idea of letting panic
> notifiers run before crash_kexec().
> 
> > 
> > > And it has been literally years since some developers have been 
> > > pushing for allowing to run panic notifiers before crash_kexec(). 
> > > Eric Biederman has been pushing back saying it reduces the 
> > > reliability of kdump operation so this is not acceptable.
> > 
> > So what do those notifiers do?

I think it was just philosophical differences.  kexec on panic is a
complicated path and a bunch of stuff could go wrong.  As insurance in case kexec
on panic did not work, some companies wanted to pre-maturely capture info,
so they have _something_ to use for a debug analysis.

Vivek, Matthew Garret and myself argued to provide us with failure cases and
we will fix kexec on panic instead.  I think the stability period for kexec
on panic was so long that companies still do not trust it.  Just my guess.

Vivek provided examples of what folks were doing with the notifiers.

> 
> IIRC, two main reasons had come in the past.
> 
> - In a cluster of nodes, people wanted to send some sort of notifications
>   to main server that a node has crashed and don't fence it off as it
>   might be saving dump.
> 
> - And saving kernel logs to non volatile store.
> 
> There might be more and I might not be aware about these. Hatayama and
> Masami, can you shed more light on this.

Cheers,
Don

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23  7:19   ` Ingo Molnar
  2015-03-23 13:37     ` Vivek Goyal
  2015-03-23 15:36     ` Vivek Goyal
@ 2015-03-24  3:30     ` Masami Hiramatsu
  2015-03-24  7:11       ` Ingo Molnar
  2 siblings, 1 reply; 19+ messages in thread
From: Masami Hiramatsu @ 2015-03-24  3:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, Vivek Goyal, hidehiro.kawai.ez, linux-kernel, kexec,
	akpm, mingo, bp

(2015/03/23 16:19), Ingo Molnar wrote:
> 
> * Baoquan He <bhe@redhat.com> wrote:
> 
>> CC more people ...
>>
>> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
>>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
>>> "crash_kexec_post_notifiers" kernel boot option, which toggles
>>> wheather panic() calls crash_kexec() before panic_notifiers and dump
>>> kmsg or after.
>>>
>>> The problem is that the commit overlooks panic_on_oops kernel boot
>>> option. If it is enabled, crash_kexec() is called directly without
>>> going through panic() in oops path.
>>>
>>> To fix this issue, this patch adds a check to
>>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
>>>
>>> Also, put a comment in kexec_should_crash() to explain not obvious
>>> things on this patch.
>>>
>>> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>>> Acked-by: Baoquan He <bhe@redhat.com>
>>> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
>>> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>>> ---
>>>  include/linux/kernel.h |  3 +++
>>>  kernel/kexec.c         | 11 +++++++++++
>>>  kernel/panic.c         |  2 +-
>>>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> This is hack upon hack, but why was this crap merged in the first 
> place?
> 
> I see two problems just by cursory review:
> 
> 1)
> 
> Firstly, the real bug in:
> 
>   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> 
> Was that crash_kexec() was called unconditionally after notifiers were 
> called, which should be fixed via the simple patch below (untested). 
> Looks much simpler than your fix.

No, Daisuke's patch is not for that case. Since the kdump has a special hook in
kernel oops, when both of panic_on_oops and crash_kernel are set, panic() is
never called. Please see oops_end@arch/x86/kernel/dumpstack.c

----
void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
        if (regs && kexec_should_crash(current))
                crash_kexec(regs);
----
Of course crash_kexec() never return except failing kexec unexpectedly.

Thus, kexec_should_crash should returns 0 if crash_kexec_post_notifiers is set.
(Semantically, it is a bit strange that panic_on_oops doesn't call panic(), but
that is another topic.)

However, your patch is also needed since the first crash_kexec() can fail in panic()
when crash_kexec_post_notifiers is not set. In that case, kernel tries to call
notifiers and call the 2nd crash_kexec() again. Actually the 2nd one is useless.

So, here is my reviewed-by.

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

I'll be reply the latter part in other mail.

Thank you,

> 
> 2)
> 
> Secondly, and more importantly, the whole premise of commit 
> f06e5153f4ae is broken IMHO:
> 
>  "This can help rare situations where kdump fails because of unstable
>   crashed kernel or hardware failure (memory corruption on critical
>   data/code)"
> 
> wtf?
> 
> If the kernel crashed due to a kernel crash, then the kernel booting 
> up in whatever hardware state should be able to do a clean bootup. The 
> fix for those 'rare situations' should be to fix the real bug (for 
> example by making hardware driver init (or deinit) sequences more 
> robust), not to paper it over by ordering around crash-time sequences 
> ...
> 
> If it crashed due to some hardware failure, there's literally an 
> infinite amount of failure modes that may or may not be impacted by 
> kexec crash-time handling ordering. We don't want to put a zillion 
> such flags into the kernel proper just to allow the perturbation of 
> the kernel.
> 
> Thanks,
> 
> 	Ingo
> 
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8136ad76e5fd..774614f72cbd 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -142,7 +142,8 @@ void panic(const char *fmt, ...)
>  	 * Note: since some panic_notifiers can make crashed kernel
>  	 * more unstable, it can increase risks of the kdump failure too.
>  	 */
> -	crash_kexec(NULL);
> +	if (crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
>  
>  	bust_spinlocks(0);
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-23 14:31         ` Vivek Goyal
  2015-03-23 16:01           ` Don Zickus
@ 2015-03-24  3:58           ` Masami Hiramatsu
  1 sibling, 0 replies; 19+ messages in thread
From: Masami Hiramatsu @ 2015-03-24  3:58 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Ingo Molnar, Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, hidehiro.kawai.ez, linux-kernel, kexec, akpm, mingo,
	bp, Don Zickus

(2015/03/23 23:31), Vivek Goyal wrote:
[...]
>>>> Secondly, and more importantly, the whole premise of commit 
>>>> f06e5153f4ae is broken IMHO:
>>>>
>>>>  "This can help rare situations where kdump fails because of unstable
>>>>   crashed kernel or hardware failure (memory corruption on critical
>>>>   data/code)"
>>>>
>>>> wtf?
>>>>
>>>> If the kernel crashed due to a kernel crash, then the kernel booting 
>>>> up in whatever hardware state should be able to do a clean bootup. The 
>>>> fix for those 'rare situations' should be to fix the real bug (for 
>>>> example by making hardware driver init (or deinit) sequences more 
>>>> robust), not to paper it over by ordering around crash-time sequences 
>>>> ...
>>>>
>>>> If it crashed due to some hardware failure, there's literally an 
>>>> infinite amount of failure modes that may or may not be impacted by 
>>>> kexec crash-time handling ordering. We don't want to put a zillion 
>>>> such flags into the kernel proper just to allow the perturbation of 
>>>> the kernel.
>>>
>>> I think one of the motivations behind this patch was call to kmsg_dump().
>>> Some vendors have been wanting to have the capability to save kernel logs
>>> to some NVRAM before transition to second kernel happens. Their argument
>>> is that kdump does not succeed all the time and if kdump does not succeed
>>> then atleast they have something to work with (kernel logs retrieved
>>> from pstore interface).
>>
>> Doesn't pstore attach itself to printk itself? AFAICS it does:
>>
>>  fs/pstore/platform.c:   register_console(&pstore_console);
>>
>> so the printk log leading up to and including the crash should be 
>> available, regardless of this patch. What am I missing?
> 
> That's a good point. I was not aware of it. I am Ccing Don Zickus as
> he has spent some time on this in the past.
> 
> Masami, would you have thougths on this? IIRC, one reason why kmsg_dump()
> was written so that one could dump kernel messages to an NVRAM. Of one
> could simple register pstore as console, then how kmsg_dump() will
> continue to be useful?

Yes, actually, kmsg_dump and pstore can help a lot to dump the last
message (even though kmsg_dump() is called only when setting
crash_kexec_post_notifiers...)

However, there are some machines which don't support pstore, but
only IPMI. pstore(kmsg) stores messages to a local NVRAM, and IPMI
stores messages to BMC(Board Management Controller)'s NVRAM (SEL:
System Event Log).
Some enterprise servers only have BMC, but no NVRAM. For such kind
of servers, we still need to call panic_notifier to store messages
via IPMI.
And also, using IPMI has another secondary feature, we can notice
machine failure from remote machine via IPMI over LAN by monitoring
SEL :)

You might want to integrate IPMI and pstore. But since IPMI SEL is
very limited and very slow, those are very different.

>>> Not that I agree fully with this as problem might happen while we 
>>> try to run panic_notifiers or kmsg_dump hooks and never transition 
>>> into kdump kernel.
>>
>> btw., this is the big problem with 'notifiers' in general: they are 
>> opaque with barely any semantics defined, and a source of constant 
>> confusion.
> 
> Agreed. That's the reason Eric never liked the idea of letting panic
> notifiers run before crash_kexec().

I see. thus I added a notice on documentation.

                        Note that this also increases risks of kdump failure,
                        because some panic notifiers can make the crashed
                        kernel more unstable.

I personally don't recommend to use this in usual situation. Only for
the machines which is very well configured and tested, this feature can
be enabled.

>>> And it has been literally years since some developers have been 
>>> pushing for allowing to run panic notifiers before crash_kexec(). 
>>> Eric Biederman has been pushing back saying it reduces the 
>>> reliability of kdump operation so this is not acceptable.
>>
>> So what do those notifiers do?
> 
> IIRC, two main reasons had come in the past.
> 
> - In a cluster of nodes, people wanted to send some sort of notifications
>   to main server that a node has crashed and don't fence it off as it
>   might be saving dump.
> 
> - And saving kernel logs to non volatile store.
> 
> There might be more and I might not be aware about these. Hatayama and
> Masami, can you shed more light on this.

Yes, as I described above, we'd like to use IPMI to write the log to SEL
and that also allow us to monitor the machine remotely.

> 
> BTW, first problem we faced in our clusters too and now it has been fixed.
> Basically we send notifications in second kernel in user space to master
> server that this node is still saving dump so don't fence it off.

Yeah, that's the usual way, I think. In some "mission-critical" use-cases,
we can't relay only on the kdump stability.

Thank you,



-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24  3:30     ` Masami Hiramatsu
@ 2015-03-24  7:11       ` Ingo Molnar
  2015-03-24 10:27         ` Eric W. Biederman
  2015-03-24 14:46         ` Vivek Goyal
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2015-03-24  7:11 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, Vivek Goyal, hidehiro.kawai.ez, linux-kernel, kexec,
	akpm, mingo, bp


* Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:

> (2015/03/23 16:19), Ingo Molnar wrote:
> > 
> > * Baoquan He <bhe@redhat.com> wrote:
> > 
> >> CC more people ...
> >>
> >> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> >>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> >>> "crash_kexec_post_notifiers" kernel boot option, which toggles
> >>> wheather panic() calls crash_kexec() before panic_notifiers and dump
> >>> kmsg or after.
> >>>
> >>> The problem is that the commit overlooks panic_on_oops kernel boot
> >>> option. If it is enabled, crash_kexec() is called directly without
> >>> going through panic() in oops path.
> >>>
> >>> To fix this issue, this patch adds a check to
> >>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> >>>
> >>> Also, put a comment in kexec_should_crash() to explain not obvious
> >>> things on this patch.
> >>>
> >>> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> >>> Acked-by: Baoquan He <bhe@redhat.com>
> >>> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> >>> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> >>> ---
> >>>  include/linux/kernel.h |  3 +++
> >>>  kernel/kexec.c         | 11 +++++++++++
> >>>  kernel/panic.c         |  2 +-
> >>>  3 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > This is hack upon hack, but why was this crap merged in the first 
> > place?
> > 
> > I see two problems just by cursory review:
> > 
> > 1)
> > 
> > Firstly, the real bug in:
> > 
> >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> > 
> > Was that crash_kexec() was called unconditionally after notifiers were 
> > called, which should be fixed via the simple patch below (untested). 
> > Looks much simpler than your fix.
> 
> No, Daisuke's patch is not for that case. [...]

Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was 
clearly not a no-op in the default case, against expectations.

So the first step should be to restore the original behavior (my 
patch), then should any new tweaks be added.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24  7:11       ` Ingo Molnar
@ 2015-03-24 10:27         ` Eric W. Biederman
  2015-03-24 14:32           ` Vivek Goyal
  2015-03-24 14:46         ` Vivek Goyal
  1 sibling, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2015-03-24 10:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, Baoquan He, Hatayama,
	Daisuke/畑山 大輔,
	Vivek Goyal, hidehiro.kawai.ez, linux-kernel, kexec, akpm, mingo,
	bp

Ingo Molnar <mingo@kernel.org> writes:

> * Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
>
>> > 
>> >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
>> > 
>> > Was that crash_kexec() was called unconditionally after notifiers were 
>> > called, which should be fixed via the simple patch below (untested). 
>> > Looks much simpler than your fix.
>> 
>> No, Daisuke's patch is not for that case. [...]
>
> Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was 
> clearly not a no-op in the default case, against expectations.
>
> So the first step should be to restore the original behavior (my 
> patch), then should any new tweaks be added.

Honestly I think the proper fix is to simply revert f06e5153f4ae.

It was clearly not properly tested by the people who wanted it because
they came back quite a while later with additional bleh.

I think this pretty much counts as hitting the code doesn't work let's
remove it threshold.

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24 10:27         ` Eric W. Biederman
@ 2015-03-24 14:32           ` Vivek Goyal
  2015-03-25 15:07             ` Hidehiro Kawai
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-03-24 14:32 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Ingo Molnar, Masami Hiramatsu, Baoquan He, Hatayama,
	Daisuke/畑山 大輔,
	hidehiro.kawai.ez, linux-kernel, kexec, akpm, mingo, bp

On Tue, Mar 24, 2015 at 05:27:10AM -0500, Eric W. Biederman wrote:
> Ingo Molnar <mingo@kernel.org> writes:
> 
> > * Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
> >
> >> > 
> >> >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> >> > 
> >> > Was that crash_kexec() was called unconditionally after notifiers were 
> >> > called, which should be fixed via the simple patch below (untested). 
> >> > Looks much simpler than your fix.
> >> 
> >> No, Daisuke's patch is not for that case. [...]
> >
> > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was 
> > clearly not a no-op in the default case, against expectations.
> >
> > So the first step should be to restore the original behavior (my 
> > patch), then should any new tweaks be added.
> 
> Honestly I think the proper fix is to simply revert f06e5153f4ae.
> 
> It was clearly not properly tested by the people who wanted it because
> they came back quite a while later with additional bleh.
> 
> I think this pretty much counts as hitting the code doesn't work let's
> remove it threshold.

IMHO, we should give users flexibility of running panic notifiers before
crash_kexec(). Different people have been asking for it since last 7-8
years and it is a pretty small code in kernel so no major maintenance
headache. 

Agreed that this might be very unreliable, but if users want to shoot
themseleves in the foot, it is their choice. This will not be upstream
default and I am hoping that distributions don't make it their default
either.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24  7:11       ` Ingo Molnar
  2015-03-24 10:27         ` Eric W. Biederman
@ 2015-03-24 14:46         ` Vivek Goyal
  2015-03-24 16:18           ` Ingo Molnar
  1 sibling, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-03-24 14:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, hidehiro.kawai.ez, linux-kernel, kexec, akpm, mingo,
	bp

On Tue, Mar 24, 2015 at 08:11:29AM +0100, Ingo Molnar wrote:
> 
> * Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
> 
> > (2015/03/23 16:19), Ingo Molnar wrote:
> > > 
> > > * Baoquan He <bhe@redhat.com> wrote:
> > > 
> > >> CC more people ...
> > >>
> > >> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote:
> > >>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced
> > >>> "crash_kexec_post_notifiers" kernel boot option, which toggles
> > >>> wheather panic() calls crash_kexec() before panic_notifiers and dump
> > >>> kmsg or after.
> > >>>
> > >>> The problem is that the commit overlooks panic_on_oops kernel boot
> > >>> option. If it is enabled, crash_kexec() is called directly without
> > >>> going through panic() in oops path.
> > >>>
> > >>> To fix this issue, this patch adds a check to
> > >>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash().
> > >>>
> > >>> Also, put a comment in kexec_should_crash() to explain not obvious
> > >>> things on this patch.
> > >>>
> > >>> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > >>> Acked-by: Baoquan He <bhe@redhat.com>
> > >>> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > >>> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > >>> ---
> > >>>  include/linux/kernel.h |  3 +++
> > >>>  kernel/kexec.c         | 11 +++++++++++
> > >>>  kernel/panic.c         |  2 +-
> > >>>  3 files changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > This is hack upon hack, but why was this crap merged in the first 
> > > place?
> > > 
> > > I see two problems just by cursory review:
> > > 
> > > 1)
> > > 
> > > Firstly, the real bug in:
> > > 
> > >   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
> > > 
> > > Was that crash_kexec() was called unconditionally after notifiers were 
> > > called, which should be fixed via the simple patch below (untested). 
> > > Looks much simpler than your fix.
> > 
> > No, Daisuke's patch is not for that case. [...]
> 
> Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was 
> clearly not a no-op in the default case, against expectations.

Hi Ingo,

I did a quick test and in default case crash_kexec() runs before panic
notifiers. So it does look like crash_kexec_post_notifiers is a no-op
in default case.

What am I missing.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24 14:46         ` Vivek Goyal
@ 2015-03-24 16:18           ` Ingo Molnar
  2015-03-24 17:04             ` Vivek Goyal
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2015-03-24 16:18 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Masami Hiramatsu, Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, hidehiro.kawai.ez, linux-kernel, kexec, akpm, mingo,
	bp


* Vivek Goyal <vgoyal@redhat.com> wrote:

> > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' 
> > was clearly not a no-op in the default case, against expectations.
> 
> Hi Ingo,
> 
> I did a quick test and in default case crash_kexec() runs before 
> panic notifiers. So it does look like crash_kexec_post_notifiers is 
> a no-op in default case.
> 
> What am I missing.

Well, look at f06e5153f4ae:

diff --git a/kernel/panic.c b/kernel/panic.c
index d02fa9fef46a..62e16cef9cc2 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -32,6 +32,7 @@ static unsigned long tainted_mask;
 static int pause_on_oops;
 static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
+static bool crash_kexec_post_notifiers;
 
 int panic_timeout = CONFIG_PANIC_TIMEOUT;
 EXPORT_SYMBOL_GPL(panic_timeout);
@@ -112,9 +113,11 @@ void panic(const char *fmt, ...)
 	/*
 	 * If we have crashed and we have a crash kernel loaded let it handle
 	 * everything else.
-	 * Do we want to call this before we try to display a message?
+	 * If we want to run this after calling panic_notifiers, pass
+	 * the "crash_kexec_post_notifiers" option to the kernel.
 	 */
-	crash_kexec(NULL);
+	if (!crash_kexec_post_notifiers)
+		crash_kexec(NULL);
 
 	/*
 	 * Note smp_send_stop is the usual smp shutdown function, which
@@ -131,6 +134,15 @@ void panic(const char *fmt, ...)
 
 	kmsg_dump(KMSG_DUMP_PANIC);
 
+	/*
+	 * If you doubt kdump always works fine in any situation,
+	 * "crash_kexec_post_notifiers" offers you a chance to run
+	 * panic_notifiers and dumping kmsg before kdump.
+	 * Note: since some panic_notifiers can make crashed kernel
+	 * more unstable, it can increase risks of the kdump failure too.
+	 */
+	crash_kexec(NULL);
+
 	bust_spinlocks(0);
 
 	if (!panic_blink)


Without knowing what crash_kexec() does, the patch looks buggy: it 
should preserve the old behavior by default, yet it will now execute a 
second crash_kexec() after the kmsg_dump() line.

So the invariant change would have been to do:

-	crash_kexec(NULL);
+	if (!crash_kexec_post_notifiers)
+		crash_kexec(NULL);

...

+	if (crash_kexec_post_notifiers)
+		crash_kexec(NULL);

Which in the !crash_kexec_post_notifiers flag case reduces to:

	crash_kexec();

	...

	/* NOP */

I.e. to exactly what the kernel was doing without the patch 
originally.

Which is what my patch does. Nothing more, nothing less.

There might be other bugs with the patch, I didn't consider that.

A revert would be fine as well.

Thanks,

	Ingo

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24 16:18           ` Ingo Molnar
@ 2015-03-24 17:04             ` Vivek Goyal
  2015-05-12  8:43               ` Hidehiro Kawai
  0 siblings, 1 reply; 19+ messages in thread
From: Vivek Goyal @ 2015-03-24 17:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, hidehiro.kawai.ez, linux-kernel, kexec, akpm, mingo,
	bp

On Tue, Mar 24, 2015 at 05:18:14PM +0100, Ingo Molnar wrote:
> 
> * Vivek Goyal <vgoyal@redhat.com> wrote:
> 
> > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' 
> > > was clearly not a no-op in the default case, against expectations.
> > 
> > Hi Ingo,
> > 
> > I did a quick test and in default case crash_kexec() runs before 
> > panic notifiers. So it does look like crash_kexec_post_notifiers is 
> > a no-op in default case.
> > 
> > What am I missing.
> 
> Well, look at f06e5153f4ae:
> 
> diff --git a/kernel/panic.c b/kernel/panic.c
> index d02fa9fef46a..62e16cef9cc2 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -32,6 +32,7 @@ static unsigned long tainted_mask;
>  static int pause_on_oops;
>  static int pause_on_oops_flag;
>  static DEFINE_SPINLOCK(pause_on_oops_lock);
> +static bool crash_kexec_post_notifiers;
>  
>  int panic_timeout = CONFIG_PANIC_TIMEOUT;
>  EXPORT_SYMBOL_GPL(panic_timeout);
> @@ -112,9 +113,11 @@ void panic(const char *fmt, ...)
>  	/*
>  	 * If we have crashed and we have a crash kernel loaded let it handle
>  	 * everything else.
> -	 * Do we want to call this before we try to display a message?
> +	 * If we want to run this after calling panic_notifiers, pass
> +	 * the "crash_kexec_post_notifiers" option to the kernel.
>  	 */
> -	crash_kexec(NULL);
> +	if (!crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
>  
>  	/*
>  	 * Note smp_send_stop is the usual smp shutdown function, which
> @@ -131,6 +134,15 @@ void panic(const char *fmt, ...)
>  
>  	kmsg_dump(KMSG_DUMP_PANIC);
>  
> +	/*
> +	 * If you doubt kdump always works fine in any situation,
> +	 * "crash_kexec_post_notifiers" offers you a chance to run
> +	 * panic_notifiers and dumping kmsg before kdump.
> +	 * Note: since some panic_notifiers can make crashed kernel
> +	 * more unstable, it can increase risks of the kdump failure too.
> +	 */
> +	crash_kexec(NULL);
> +
>  	bust_spinlocks(0);
>  
>  	if (!panic_blink)
> 
> 
> Without knowing what crash_kexec() does, the patch looks buggy: it 
> should preserve the old behavior by default, yet it will now execute a 
> second crash_kexec() after the kmsg_dump() line.
> 
> So the invariant change would have been to do:
> 
> -	crash_kexec(NULL);
> +	if (!crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
> 
> ...
> 
> +	if (crash_kexec_post_notifiers)
> +		crash_kexec(NULL);
> 
> Which in the !crash_kexec_post_notifiers flag case reduces to:
> 
> 	crash_kexec();
> 
> 	...
> 
> 	/* NOP */
> 
> I.e. to exactly what the kernel was doing without the patch 
> originally.
> 
> Which is what my patch does. Nothing more, nothing less.

Ok, I got it what you mean.

crash_kexec() does not return if a kdump kernel is loaded. If kdump
kernel is not loaded, then crash_kexec() returns without doing anything.

I think that explains why not making second call to crash_kexec() under
if, did not create problems. In first case it will never be called and
in second case, it will do nothing and simply return back.

But anyway, we need your patch as that's right thing to do.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24 14:32           ` Vivek Goyal
@ 2015-03-25 15:07             ` Hidehiro Kawai
  0 siblings, 0 replies; 19+ messages in thread
From: Hidehiro Kawai @ 2015-03-25 15:07 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Eric W. Biederman, Ingo Molnar, Masami Hiramatsu, Baoquan He,
	"Hatayama, Daisuke/畑山 大輔",
	linux-kernel, kexec, akpm, mingo, bp

Hello all,

(2015/03/24 23:32), Vivek Goyal wrote:
> On Tue, Mar 24, 2015 at 05:27:10AM -0500, Eric W. Biederman wrote:
>> Ingo Molnar <mingo@kernel.org> writes:
>>
>>> * Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
>>>
>>>>>
>>>>>   f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers")
>>>>>
>>>>> Was that crash_kexec() was called unconditionally after notifiers were 
>>>>> called, which should be fixed via the simple patch below (untested). 
>>>>> Looks much simpler than your fix.
>>>>
>>>> No, Daisuke's patch is not for that case. [...]
>>>
>>> Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was 
>>> clearly not a no-op in the default case, against expectations.
>>>
>>> So the first step should be to restore the original behavior (my 
>>> patch), then should any new tweaks be added.
>>
>> Honestly I think the proper fix is to simply revert f06e5153f4ae.
>>
>> It was clearly not properly tested by the people who wanted it because
>> they came back quite a while later with additional bleh.
>>
>> I think this pretty much counts as hitting the code doesn't work let's
>> remove it threshold.
> 
> IMHO, we should give users flexibility of running panic notifiers before
> crash_kexec(). Different people have been asking for it since last 7-8
> years and it is a pretty small code in kernel so no major maintenance
> headache. 
> 
> Agreed that this might be very unreliable, but if users want to shoot
> themseleves in the foot, it is their choice. This will not be upstream
> default and I am hoping that distributions don't make it their default
> either.

We are going to use panic notifier to write SEL record, and actually
it seems to be unreliable.  At least I found two problems in IPMI driver
code while testing Hatayama-san's patch, and they will cause an infinite
loop.  I think users wouldn't notice this bug because most of users use
kdump and there is no difference on display between the infinite loop
case and successful case.

Anyway, we need to harden panic notifier callee.  I will post bug fix
patches for IPMI driver ASAP.

Best regards,

Hidehiro Kawai
Hitachi, Yokohama Research Laboratory



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
  2015-03-24 17:04             ` Vivek Goyal
@ 2015-05-12  8:43               ` Hidehiro Kawai
  0 siblings, 0 replies; 19+ messages in thread
From: Hidehiro Kawai @ 2015-05-12  8:43 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Ingo Molnar, Masami Hiramatsu, Baoquan He, "Hatayama,
	Daisuke/畑山 大輔",
	ebiederm, linux-kernel, kexec, akpm, mingo, bp

Hi all,

What is the current status of this bug fix patch?
I think it's OK if resending Hatayama-san's patch with Ingo's.

Thanks,

(2015/03/25 2:04), Vivek Goyal wrote:
> On Tue, Mar 24, 2015 at 05:18:14PM +0100, Ingo Molnar wrote:
>>
>> * Vivek Goyal <vgoyal@redhat.com> wrote:
>>
>>>> Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' 
>>>> was clearly not a no-op in the default case, against expectations.
>>>
>>> Hi Ingo,
>>>
>>> I did a quick test and in default case crash_kexec() runs before 
>>> panic notifiers. So it does look like crash_kexec_post_notifiers is 
>>> a no-op in default case.
>>>
>>> What am I missing.
>>
>> Well, look at f06e5153f4ae:
>>
>> diff --git a/kernel/panic.c b/kernel/panic.c
>> index d02fa9fef46a..62e16cef9cc2 100644
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -32,6 +32,7 @@ static unsigned long tainted_mask;
>>  static int pause_on_oops;
>>  static int pause_on_oops_flag;
>>  static DEFINE_SPINLOCK(pause_on_oops_lock);
>> +static bool crash_kexec_post_notifiers;
>>  
>>  int panic_timeout = CONFIG_PANIC_TIMEOUT;
>>  EXPORT_SYMBOL_GPL(panic_timeout);
>> @@ -112,9 +113,11 @@ void panic(const char *fmt, ...)
>>  	/*
>>  	 * If we have crashed and we have a crash kernel loaded let it handle
>>  	 * everything else.
>> -	 * Do we want to call this before we try to display a message?
>> +	 * If we want to run this after calling panic_notifiers, pass
>> +	 * the "crash_kexec_post_notifiers" option to the kernel.
>>  	 */
>> -	crash_kexec(NULL);
>> +	if (!crash_kexec_post_notifiers)
>> +		crash_kexec(NULL);
>>  
>>  	/*
>>  	 * Note smp_send_stop is the usual smp shutdown function, which
>> @@ -131,6 +134,15 @@ void panic(const char *fmt, ...)
>>  
>>  	kmsg_dump(KMSG_DUMP_PANIC);
>>  
>> +	/*
>> +	 * If you doubt kdump always works fine in any situation,
>> +	 * "crash_kexec_post_notifiers" offers you a chance to run
>> +	 * panic_notifiers and dumping kmsg before kdump.
>> +	 * Note: since some panic_notifiers can make crashed kernel
>> +	 * more unstable, it can increase risks of the kdump failure too.
>> +	 */
>> +	crash_kexec(NULL);
>> +
>>  	bust_spinlocks(0);
>>  
>>  	if (!panic_blink)
>>
>>
>> Without knowing what crash_kexec() does, the patch looks buggy: it 
>> should preserve the old behavior by default, yet it will now execute a 
>> second crash_kexec() after the kmsg_dump() line.
>>
>> So the invariant change would have been to do:
>>
>> -	crash_kexec(NULL);
>> +	if (!crash_kexec_post_notifiers)
>> +		crash_kexec(NULL);
>>
>> ...
>>
>> +	if (crash_kexec_post_notifiers)
>> +		crash_kexec(NULL);
>>
>> Which in the !crash_kexec_post_notifiers flag case reduces to:
>>
>> 	crash_kexec();
>>
>> 	...
>>
>> 	/* NOP */
>>
>> I.e. to exactly what the kernel was doing without the patch 
>> originally.
>>
>> Which is what my patch does. Nothing more, nothing less.
> 
> Ok, I got it what you mean.
> 
> crash_kexec() does not return if a kdump kernel is loaded. If kdump
> kernel is not loaded, then crash_kexec() returns without doing anything.
> 
> I think that explains why not making second call to crash_kexec() under
> if, did not create problems. In first case it will never be called and
> in second case, it will do nothing and simply return back.
> 
> But anyway, we need your patch as that's right thing to do.
> 
> Thanks
> Vivek
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 


-- 
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-05-12  8:43 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-06 16:31 [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path "Hatayama, Daisuke/畑山 大輔"
2015-03-06 18:08 ` Vivek Goyal
2015-03-23  3:47 ` Baoquan He
2015-03-23  7:19   ` Ingo Molnar
2015-03-23 13:37     ` Vivek Goyal
2015-03-23 13:50       ` Ingo Molnar
2015-03-23 14:31         ` Vivek Goyal
2015-03-23 16:01           ` Don Zickus
2015-03-24  3:58           ` Masami Hiramatsu
2015-03-23 15:36     ` Vivek Goyal
2015-03-24  3:30     ` Masami Hiramatsu
2015-03-24  7:11       ` Ingo Molnar
2015-03-24 10:27         ` Eric W. Biederman
2015-03-24 14:32           ` Vivek Goyal
2015-03-25 15:07             ` Hidehiro Kawai
2015-03-24 14:46         ` Vivek Goyal
2015-03-24 16:18           ` Ingo Molnar
2015-03-24 17:04             ` Vivek Goyal
2015-05-12  8:43               ` Hidehiro Kawai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).