linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [V3 PATCH 0/2] kexec: crash_kexec_post_notifiers boot option related fixes
@ 2016-07-05 11:33 Hidehiro Kawai
  2016-07-05 11:33 ` [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version Hidehiro Kawai
  2016-07-05 11:33 ` [V3 PATCH 2/2] kexec: Use core_param for crash_kexec_post_notifiers boot option Hidehiro Kawai
  0 siblings, 2 replies; 16+ messages in thread
From: Hidehiro Kawai @ 2016-07-05 11:33 UTC (permalink / raw)
  To: Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: x86, kexec, linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu,
	H. Peter Anvin, Daniel Walker, Thomas Gleixner, Borislav Petkov,
	Ingo Molnar, Vivek Goyal

This is the update version of the patch which I posted one year ago
(https://lkml.org/lkml/2015/7/23/864).  Because I couldn't received
positive opinions against the previous version, I had changed the
approach to another one which utilizes kexec purgatory.  Then I'm
back here.

PATCH 1/2 is a bugfix for crash_kexec_post_notifiers boot option
which allows users to call panic notifiers and kmsg dumpers before
kdump.  When this option is enabled, smp_send_stop() is used to stop
other CPUs instead of a function provided by kdump.  smp_send_stop()
just stops other CPUs and doesn't other things needed for kdump.
So, fix it by replacing smp_send_stop() with new appropriate one,
panic_smp_send_stop().  Since panic_smp_send_stop() is architecture
specific and I can't simply copy from kdump routines, only
x86-specific version is provided.

PATCH 2/2 makes crash_kexec_post_notifiers boot option modifiable
after boot.  We don't need to limit it to being modifiable at boot
time.

Changes in V3:
- Revise comments, description, and symbol names (the logic doesn't
  change)
- Make crash_kexec_post_notifiers boot option modifiable after boot

Changes in V2: https://lkml.org/lkml/2015/7/23/864
- Replace smp_send_stop() call with crash_kexec version which
  saves cpu states and does cleanups instead of changing execution
  flow
- Drop a fix for Problem 1
- Drop other patches because they aren't needed anymore

V1: https://lkml.org/lkml/2015/7/10/316

---

Hidehiro Kawai (2):
      x86/panic: Replace smp_send_stop() with kdump friendly version
      kexec: Use core_param for crash_kexec_post_notifiers boot option


 arch/x86/kernel/crash.c |   14 ++++++++----
 kernel/panic.c          |   56 ++++++++++++++++++++++++++++++-----------------
 2 files changed, 46 insertions(+), 24 deletions(-)


-- 
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-05 11:33 [V3 PATCH 0/2] kexec: crash_kexec_post_notifiers boot option related fixes Hidehiro Kawai
@ 2016-07-05 11:33 ` Hidehiro Kawai
  2016-07-11  8:34   ` Dave Young
  2016-07-12  3:11   ` Xunlei Pang
  2016-07-05 11:33 ` [V3 PATCH 2/2] kexec: Use core_param for crash_kexec_post_notifiers boot option Hidehiro Kawai
  1 sibling, 2 replies; 16+ messages in thread
From: Hidehiro Kawai @ 2016-07-05 11:33 UTC (permalink / raw)
  To: Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

This patch fixes one of the problems reported by Daniel Walker
(https://lkml.org/lkml/2015/6/24/44).

If crash_kexec_post_notifiers boot option is specified, other CPUs
are stopped by smp_send_stop() instead of machine_crash_shutdown()
in crash_kexec() path.  This behavior change leads two problems.

 Problem 1:
 octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
 still online and try to stop their watchdog timer.  If
 smp_send_stop() is called before octeon_generic_shutdown(), stopping
 watchdog timer will fail because other CPUs have been offlined by
 smp_send_stop().

   panic()
     if crash_kexec_post_notifiers == 1
       smp_send_stop()
       atomic_notifier_call_chain()
       kmsg_dump()
     crash_kexec()
       machine_crash_shutdown()
         octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs

 Problem 2:
 Most of architectures stop other CPUs in machine_crash_shutdown()
 path, and they also do something needed for kdump.  For example,
 they save registers, disable virtualization extensions, and so on.
 However, if smp_send_stop() stops other CPUs before
 machine_crash_shutdown(), we miss those operations.

How do we fix these problems?  In the first place, we should stop
other CPUs as soon as possible when panic() was called, otherwise
other CPUs may wipe out a clue to the cause of the failure.  So, we
replace smp_send_stop() with more suitable one for kdump.

This patch solves Problem 2 by replacing smp_send_stop() in panic()
with panic_smp_send_stop().  This is a weak function which calls
smp_send_stop(), and architecture dependent code may override this
with appropriate one.  This patch only provides x86-specific version.

Changes in V3:
- Revise comments, description, and symbol names

Changes in V2:
- Replace smp_send_stop() call with crash_kexec version which
  saves cpu states and cleans up VMX/SVM
- Drop a fix for Problem 1 at this moment

Reported-by: Daniel Walker <dwalker@fifo99.com>
Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
Cc: Minfei Huang <mnfhuang@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/kernel/crash.c |   14 ++++++++++----
 kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
 2 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 9ef978d..3305433 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
 	disable_local_APIC();
 }
 
-static void kdump_nmi_shootdown_cpus(void)
+/* Override the weak function in kernel/panic.c */
+void panic_smp_send_stop(void)
 {
-	nmi_shootdown_cpus(kdump_nmi_callback);
+	static int cpus_stopped;
+
+	if (cpus_stopped)
+		return;
 
+	nmi_shootdown_cpus(kdump_nmi_callback);
 	disable_local_APIC();
+	cpus_stopped = 1;
 }
 
 #else
-static void kdump_nmi_shootdown_cpus(void)
+void panic_smp_send_stop(void)
 {
 	/* There are no cpus to shootdown */
 }
@@ -160,7 +166,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 	/* The kernel is broken so disable interrupts */
 	local_irq_disable();
 
-	kdump_nmi_shootdown_cpus();
+	panic_smp_send_stop();
 
 	/*
 	 * VMCLEAR VMCSs loaded on this cpu if needed.
diff --git a/kernel/panic.c b/kernel/panic.c
index 8aa7449..da8062d2 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
 	panic_smp_self_stop();
 }
 
+/*
+ * Stop other CPUs in panic.  Architecture dependent code may override this
+ * with more suitable version.  For example, if the architecture supports
+ * crash dump, it should save registers of each stopped CPU and disable
+ * per-CPU features such as virtualization extensions.
+ */
+void __weak panic_smp_send_stop(void)
+{
+	static int cpus_stopped;
+
+	/*
+	 * This function can be called twice in panic path, but obviously
+	 * we execute this only once.
+	 */
+	if (cpus_stopped)
+		return;
+
+	/*
+	 * Note smp_send_stop is the usual smp shutdown function, which
+	 * unfortunately means it may not be hardened to work in a panic
+	 * situation.
+	 */
+	smp_send_stop();
+	cpus_stopped = 1;
+}
+
 atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
 
 /*
@@ -125,7 +151,7 @@ void panic(const char *fmt, ...)
 	 * Only one CPU is allowed to execute the panic code from here. For
 	 * multiple parallel invocations of panic, all other CPUs either
 	 * stop themself or will wait until they are stopped by the 1st CPU
-	 * with smp_send_stop().
+	 * with panic_smp_send_stop().
 	 *
 	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
 	 * comes here, so go ahead.
@@ -165,12 +191,7 @@ void panic(const char *fmt, ...)
 		__crash_kexec(NULL);
 	}
 
-	/*
-	 * Note smp_send_stop is the usual smp shutdown function, which
-	 * unfortunately means it may not be hardened to work in a panic
-	 * situation.
-	 */
-	smp_send_stop();
+	panic_smp_send_stop();
 
 	/*
 	 * Run any panic handlers, including those that might need to
@@ -198,10 +219,10 @@ void panic(const char *fmt, ...)
 
 	/*
 	 * We may have ended up stopping the CPU holding the lock (in
-	 * smp_send_stop()) while still having some valuable data in the console
-	 * buffer.  Try to acquire the lock then release it regardless of the
-	 * result.  The release will also print the buffers out.  Locks debug
-	 * should be disabled to avoid reporting bad unlock balance when
+	 * panic_smp_send_stop()) while still having some valuable data in the
+	 * console buffer.  Try to acquire the lock then release it regardless
+	 * of the result.  The release will also print the buffers out.  Locks
+	 * debug should be disabled to avoid reporting bad unlock balance when
 	 * panic() is not being callled from OOPS.
 	 */
 	debug_locks_off();

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [V3 PATCH 2/2] kexec: Use core_param for crash_kexec_post_notifiers boot option
  2016-07-05 11:33 [V3 PATCH 0/2] kexec: crash_kexec_post_notifiers boot option related fixes Hidehiro Kawai
  2016-07-05 11:33 ` [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version Hidehiro Kawai
@ 2016-07-05 11:33 ` Hidehiro Kawai
  1 sibling, 0 replies; 16+ messages in thread
From: Hidehiro Kawai @ 2016-07-05 11:33 UTC (permalink / raw)
  To: Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Petr Mladek, Michal Hocko, Josh Poimboeuf, x86, kexec,
	linux-kernel, Vitaly Kuznetsov, HATAYAMA Daisuke,
	Masami Hiramatsu, H. Peter Anvin, Tejun Heo, Daniel Walker,
	Thomas Gleixner, Borislav Petkov, Ingo Molnar, Vivek Goyal

crash_kexec_post_notifiers ia a boot option which controls whether
the 1st kernel calls panic notifiers or not before booting the 2nd
kernel.  However, there is no need to limit it to being modifiable
only at boot time.  So, use core_param instead of early_param.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/panic.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index da8062d2..0d0edeb 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -134,6 +134,7 @@ void panic(const char *fmt, ...)
 	long i, i_next = 0;
 	int state = 0;
 	int old_cpu, this_cpu;
+	bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
 
 	/*
 	 * Disable local interrupts. This will prevent panic_smp_self_stop
@@ -186,7 +187,7 @@ void panic(const char *fmt, ...)
 	 *
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
-	if (!crash_kexec_post_notifiers) {
+	if (!_crash_kexec_post_notifiers) {
 		printk_nmi_flush_on_panic();
 		__crash_kexec(NULL);
 	}
@@ -212,7 +213,7 @@ void panic(const char *fmt, ...)
 	 *
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
-	if (crash_kexec_post_notifiers)
+	if (_crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
 	bust_spinlocks(0);
@@ -592,13 +593,7 @@ EXPORT_SYMBOL(__stack_chk_fail);
 core_param(panic, panic_timeout, int, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
 core_param(panic_on_warn, panic_on_warn, int, 0644);
-
-static int __init setup_crash_kexec_post_notifiers(char *s)
-{
-	crash_kexec_post_notifiers = true;
-	return 0;
-}
-early_param("crash_kexec_post_notifiers", setup_crash_kexec_post_notifiers);
+core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
 
 static int __init oops_setup(char *s)
 {

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-05 11:33 ` [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version Hidehiro Kawai
@ 2016-07-11  8:34   ` Dave Young
  2016-07-12  2:49     ` 河合英宏 / KAWAI,HIDEHIRO
  2016-07-12  3:11   ` Xunlei Pang
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Young @ 2016-07-11  8:34 UTC (permalink / raw)
  To: Hidehiro Kawai
  Cc: Andrew Morton, Eric W. Biederman, Baoquan He, Michal Hocko,
	Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> This patch fixes one of the problems reported by Daniel Walker
> (https://lkml.org/lkml/2015/6/24/44).
> 
> If crash_kexec_post_notifiers boot option is specified, other CPUs
> are stopped by smp_send_stop() instead of machine_crash_shutdown()
> in crash_kexec() path.  This behavior change leads two problems.
> 
>  Problem 1:
>  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
>  still online and try to stop their watchdog timer.  If
>  smp_send_stop() is called before octeon_generic_shutdown(), stopping
>  watchdog timer will fail because other CPUs have been offlined by
>  smp_send_stop().
> 
>    panic()
>      if crash_kexec_post_notifiers == 1
>        smp_send_stop()
>        atomic_notifier_call_chain()
>        kmsg_dump()
>      crash_kexec()
>        machine_crash_shutdown()
>          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
> 
>  Problem 2:
>  Most of architectures stop other CPUs in machine_crash_shutdown()
>  path, and they also do something needed for kdump.  For example,
>  they save registers, disable virtualization extensions, and so on.
>  However, if smp_send_stop() stops other CPUs before
>  machine_crash_shutdown(), we miss those operations.
> 
> How do we fix these problems?  In the first place, we should stop
> other CPUs as soon as possible when panic() was called, otherwise
> other CPUs may wipe out a clue to the cause of the failure.  So, we
> replace smp_send_stop() with more suitable one for kdump.

We have been avoiding extra things in panic path, but unfortunately
crash_kexec_post_notifiers were added. I tend to agree the best place
for this stuff is in 2nd kernel or purgatory instead of in 1st kernel.

As for this patch I'm not sure it is safe to replace the smp_send_stop
with the kdump friendly function. I'm also not sure if the kdump friendly
function is safe for kdump. Will glad to hear opinions from other
arch experts.

BTW, if one want to use crash_kexec_post_notifiers he should take the
risk of unreliable kdump. How about only call smp_send_stop in case no
crash_kexec_post_notifiers being used.

> 
> This patch solves Problem 2 by replacing smp_send_stop() in panic()
> with panic_smp_send_stop().  This is a weak function which calls
> smp_send_stop(), and architecture dependent code may override this
> with appropriate one.  This patch only provides x86-specific version.

It does not fix the Problem 1, it seem not possible to fix it?

> 
> Changes in V3:
> - Revise comments, description, and symbol names
> 
> Changes in V2:
> - Replace smp_send_stop() call with crash_kexec version which
>   saves cpu states and cleans up VMX/SVM
> - Drop a fix for Problem 1 at this moment
> 
> Reported-by: Daniel Walker <dwalker@fifo99.com>
> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Toshi Kani <toshi.kani@hpe.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
> Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
> Cc: Minfei Huang <mnfhuang@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/kernel/crash.c |   14 ++++++++++----
>  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
>  2 files changed, 42 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 9ef978d..3305433 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>  	disable_local_APIC();
>  }
>  
> -static void kdump_nmi_shootdown_cpus(void)
> +/* Override the weak function in kernel/panic.c */
> +void panic_smp_send_stop(void)
>  {
> -	nmi_shootdown_cpus(kdump_nmi_callback);
> +	static int cpus_stopped;
> +
> +	if (cpus_stopped)
> +		return;
>  
> +	nmi_shootdown_cpus(kdump_nmi_callback);
>  	disable_local_APIC();
> +	cpus_stopped = 1;
>  }
>  
>  #else
> -static void kdump_nmi_shootdown_cpus(void)
> +void panic_smp_send_stop(void)
>  {
>  	/* There are no cpus to shootdown */
>  }
> @@ -160,7 +166,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>  	/* The kernel is broken so disable interrupts */
>  	local_irq_disable();
>  
> -	kdump_nmi_shootdown_cpus();
> +	panic_smp_send_stop();
>  
>  	/*
>  	 * VMCLEAR VMCSs loaded on this cpu if needed.
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8aa7449..da8062d2 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
>  	panic_smp_self_stop();
>  }
>  
> +/*
> + * Stop other CPUs in panic.  Architecture dependent code may override this
> + * with more suitable version.  For example, if the architecture supports
> + * crash dump, it should save registers of each stopped CPU and disable
> + * per-CPU features such as virtualization extensions.
> + */
> +void __weak panic_smp_send_stop(void)
> +{
> +	static int cpus_stopped;
> +
> +	/*
> +	 * This function can be called twice in panic path, but obviously
> +	 * we execute this only once.
> +	 */
> +	if (cpus_stopped)
> +		return;
> +
> +	/*
> +	 * Note smp_send_stop is the usual smp shutdown function, which
> +	 * unfortunately means it may not be hardened to work in a panic
> +	 * situation.
> +	 */
> +	smp_send_stop();
> +	cpus_stopped = 1;
> +}
> +
>  atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
>  
>  /*
> @@ -125,7 +151,7 @@ void panic(const char *fmt, ...)
>  	 * Only one CPU is allowed to execute the panic code from here. For
>  	 * multiple parallel invocations of panic, all other CPUs either
>  	 * stop themself or will wait until they are stopped by the 1st CPU
> -	 * with smp_send_stop().
> +	 * with panic_smp_send_stop().
>  	 *
>  	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
>  	 * comes here, so go ahead.
> @@ -165,12 +191,7 @@ void panic(const char *fmt, ...)
>  		__crash_kexec(NULL);
>  	}
>  
> -	/*
> -	 * Note smp_send_stop is the usual smp shutdown function, which
> -	 * unfortunately means it may not be hardened to work in a panic
> -	 * situation.
> -	 */
> -	smp_send_stop();
> +	panic_smp_send_stop();
>  
>  	/*
>  	 * Run any panic handlers, including those that might need to
> @@ -198,10 +219,10 @@ void panic(const char *fmt, ...)
>  
>  	/*
>  	 * We may have ended up stopping the CPU holding the lock (in
> -	 * smp_send_stop()) while still having some valuable data in the console
> -	 * buffer.  Try to acquire the lock then release it regardless of the
> -	 * result.  The release will also print the buffers out.  Locks debug
> -	 * should be disabled to avoid reporting bad unlock balance when
> +	 * panic_smp_send_stop()) while still having some valuable data in the
> +	 * console buffer.  Try to acquire the lock then release it regardless
> +	 * of the result.  The release will also print the buffers out.  Locks
> +	 * debug should be disabled to avoid reporting bad unlock balance when
>  	 * panic() is not being callled from OOPS.
>  	 */
>  	debug_locks_off();
> 
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-11  8:34   ` Dave Young
@ 2016-07-12  2:49     ` 河合英宏 / KAWAI,HIDEHIRO
  2016-07-13  2:03       ` 'Dave Young'
  0 siblings, 1 reply; 16+ messages in thread
From: 河合英宏 / KAWAI,HIDEHIRO @ 2016-07-12  2:49 UTC (permalink / raw)
  To: 'Dave Young'
  Cc: Andrew Morton, Eric W. Biederman, Baoquan He, Michal Hocko,
	Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

Hi Dave,

Thanks for the comments.

> From: Dave Young [mailto:dyoung@redhat.com]
> Sent: Monday, July 11, 2016 5:35 PM
> 
> On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> > This patch fixes one of the problems reported by Daniel Walker
> > (https://lkml.org/lkml/2015/6/24/44).
> >
> > If crash_kexec_post_notifiers boot option is specified, other CPUs
> > are stopped by smp_send_stop() instead of machine_crash_shutdown()
> > in crash_kexec() path.  This behavior change leads two problems.
> >
> >  Problem 1:
> >  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
> >  still online and try to stop their watchdog timer.  If
> >  smp_send_stop() is called before octeon_generic_shutdown(), stopping
> >  watchdog timer will fail because other CPUs have been offlined by
> >  smp_send_stop().
> >
> >    panic()
> >      if crash_kexec_post_notifiers == 1
> >        smp_send_stop()
> >        atomic_notifier_call_chain()
> >        kmsg_dump()
> >      crash_kexec()
> >        machine_crash_shutdown()
> >          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
> >
> >  Problem 2:
> >  Most of architectures stop other CPUs in machine_crash_shutdown()
> >  path, and they also do something needed for kdump.  For example,
> >  they save registers, disable virtualization extensions, and so on.
> >  However, if smp_send_stop() stops other CPUs before
> >  machine_crash_shutdown(), we miss those operations.
> >
> > How do we fix these problems?  In the first place, we should stop
> > other CPUs as soon as possible when panic() was called, otherwise
> > other CPUs may wipe out a clue to the cause of the failure.  So, we
> > replace smp_send_stop() with more suitable one for kdump.
> 
> We have been avoiding extra things in panic path, but unfortunately
> crash_kexec_post_notifiers were added. I tend to agree the best place
> for this stuff is in 2nd kernel or purgatory instead of in 1st kernel.

Several months ago, I posted a patch set which writes regs to SEL, generate
an event to send SNMP message, and start/stop BMC's watchdog timer in
purgatory.  This feature requires BMC with KCS (Keyboard Controller Style)
I/F, but the most of enterprise grade server would have it.
(http://thread.gmane.org/gmane.linux.kernel.kexec/15382)

Doing kmsg_dump things in purgatory wouldn't be suitable (should be done
in the 2nd kernel before enabling devices and IRQs?)
 
> As for this patch I'm not sure it is safe to replace the smp_send_stop
> with the kdump friendly function. I'm also not sure if the kdump friendly
> function is safe for kdump. Will glad to hear opinions from other
> arch experts.

This stuff depends on architectures, so I speak only about
x86 (the logic doesn't change on other architectures at this time).

kdump path with crash_kexec_post_notifiers disabled:
 panic()
   __crash_kexec()
     crash_setup_regs()
     crash_save_vmcoreinfo()
     machine_crash_shutdown()
       native_machine_crash_shutdown()
         panic_smp_send_stop() /* mostly same as original 
                                * kdump_nmi_shootdown_cpus()
                                */

kdump path with crash_kexec_post_notifiers enabled:
 panic()
   panic_smp_send_stop()
   __crash_kexec()
     crash_setup_regs()
     crash_save_vmcoreinfo()
     machine_crash_shutdown()
       native_machine_crash_shutdown()
         panic_smp_send_stop() // do nothing

The difference is that stopping other CPUs before crash_setup_regs()
and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
crash_save_vmcoreinfo() just save information to some memory area, 
they wouldn't be affected by panic_smp_send_stop().  This means
placing panic_smp_send_stop before __crash_kexec is safe.

BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the
next version.

> BTW, if one want to use crash_kexec_post_notifiers he should take the
> risk of unreliable kdump. How about only call smp_send_stop in case no
> crash_kexec_post_notifiers being used.

Unlike panic_smp_send_stop()/kdump_nmi_shootdown_cpus(), smp_send_stop()
for x86 tries to stop other CPUs with normal IPI before issuing NMI IPI.
This would be because NMI IPI has a risk of deadlock.  We checked if
the kdump path has a risk of deadlock in the case of NMI panic and fixed
it.  But I'm not sure about normal panic path.  I agree with that use
smp_send_stop if crash_kexec_post_notifiers or kdump is disabled.

> > This patch solves Problem 2 by replacing smp_send_stop() in panic()
> > with panic_smp_send_stop().  This is a weak function which calls
> > smp_send_stop(), and architecture dependent code may override this
> > with appropriate one.  This patch only provides x86-specific version.
> 
> It does not fix the Problem 1, it seem not possible to fix it?

Problem 1 depends on architectures, and at least it doesn't happen
on x86.  I can try to fix the Problem 1 for MIPS, but I can't test it.
Possible solution will be to use an smp_send_stop variant which stop
the CPU without offlining.

Best regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

> > Changes in V3:
> > - Revise comments, description, and symbol names
> >
> > Changes in V2:
> > - Replace smp_send_stop() call with crash_kexec version which
> >   saves cpu states and cleans up VMX/SVM
> > - Drop a fix for Problem 1 at this moment
> >
> > Reported-by: Daniel Walker <dwalker@fifo99.com>
> > Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> > Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > Cc: Dave Young <dyoung@redhat.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Eric Biederman <ebiederm@xmission.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Toshi Kani <toshi.kani@hpe.com>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
> > Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
> > Cc: Minfei Huang <mnfhuang@gmail.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Cc: Petr Mladek <pmladek@suse.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  arch/x86/kernel/crash.c |   14 ++++++++++----
> >  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
> >  2 files changed, 42 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> > index 9ef978d..3305433 100644
> > --- a/arch/x86/kernel/crash.c
> > +++ b/arch/x86/kernel/crash.c
> > @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
> >  	disable_local_APIC();
> >  }
> >
> > -static void kdump_nmi_shootdown_cpus(void)
> > +/* Override the weak function in kernel/panic.c */
> > +void panic_smp_send_stop(void)
> >  {
> > -	nmi_shootdown_cpus(kdump_nmi_callback);
> > +	static int cpus_stopped;
> > +
> > +	if (cpus_stopped)
> > +		return;
> >
> > +	nmi_shootdown_cpus(kdump_nmi_callback);
> >  	disable_local_APIC();
> > +	cpus_stopped = 1;
> >  }
> >
> >  #else
> > -static void kdump_nmi_shootdown_cpus(void)
> > +void panic_smp_send_stop(void)
> >  {
> >  	/* There are no cpus to shootdown */
> >  }
> > @@ -160,7 +166,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
> >  	/* The kernel is broken so disable interrupts */
> >  	local_irq_disable();
> >
> > -	kdump_nmi_shootdown_cpus();
> > +	panic_smp_send_stop();
> >
> >  	/*
> >  	 * VMCLEAR VMCSs loaded on this cpu if needed.
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index 8aa7449..da8062d2 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
> >  	panic_smp_self_stop();
> >  }
> >
> > +/*
> > + * Stop other CPUs in panic.  Architecture dependent code may override this
> > + * with more suitable version.  For example, if the architecture supports
> > + * crash dump, it should save registers of each stopped CPU and disable
> > + * per-CPU features such as virtualization extensions.
> > + */
> > +void __weak panic_smp_send_stop(void)
> > +{
> > +	static int cpus_stopped;
> > +
> > +	/*
> > +	 * This function can be called twice in panic path, but obviously
> > +	 * we execute this only once.
> > +	 */
> > +	if (cpus_stopped)
> > +		return;
> > +
> > +	/*
> > +	 * Note smp_send_stop is the usual smp shutdown function, which
> > +	 * unfortunately means it may not be hardened to work in a panic
> > +	 * situation.
> > +	 */
> > +	smp_send_stop();
> > +	cpus_stopped = 1;
> > +}
> > +
> >  atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
> >
> >  /*
> > @@ -125,7 +151,7 @@ void panic(const char *fmt, ...)
> >  	 * Only one CPU is allowed to execute the panic code from here. For
> >  	 * multiple parallel invocations of panic, all other CPUs either
> >  	 * stop themself or will wait until they are stopped by the 1st CPU
> > -	 * with smp_send_stop().
> > +	 * with panic_smp_send_stop().
> >  	 *
> >  	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
> >  	 * comes here, so go ahead.
> > @@ -165,12 +191,7 @@ void panic(const char *fmt, ...)
> >  		__crash_kexec(NULL);
> >  	}
> >
> > -	/*
> > -	 * Note smp_send_stop is the usual smp shutdown function, which
> > -	 * unfortunately means it may not be hardened to work in a panic
> > -	 * situation.
> > -	 */
> > -	smp_send_stop();
> > +	panic_smp_send_stop();
> >
> >  	/*
> >  	 * Run any panic handlers, including those that might need to
> > @@ -198,10 +219,10 @@ void panic(const char *fmt, ...)
> >
> >  	/*
> >  	 * We may have ended up stopping the CPU holding the lock (in
> > -	 * smp_send_stop()) while still having some valuable data in the console
> > -	 * buffer.  Try to acquire the lock then release it regardless of the
> > -	 * result.  The release will also print the buffers out.  Locks debug
> > -	 * should be disabled to avoid reporting bad unlock balance when
> > +	 * panic_smp_send_stop()) while still having some valuable data in the
> > +	 * console buffer.  Try to acquire the lock then release it regardless
> > +	 * of the result.  The release will also print the buffers out.  Locks
> > +	 * debug should be disabled to avoid reporting bad unlock balance when
> >  	 * panic() is not being callled from OOPS.
> >  	 */
> >  	debug_locks_off();
> >
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> 
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-05 11:33 ` [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version Hidehiro Kawai
  2016-07-11  8:34   ` Dave Young
@ 2016-07-12  3:11   ` Xunlei Pang
  2016-07-12  3:56     ` 河合英宏 / KAWAI,HIDEHIRO
  1 sibling, 1 reply; 16+ messages in thread
From: Xunlei Pang @ 2016-07-12  3:11 UTC (permalink / raw)
  To: Hidehiro Kawai, Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

On 2016/07/05 at 19:33, Hidehiro Kawai wrote:
> This patch fixes one of the problems reported by Daniel Walker
> (https://lkml.org/lkml/2015/6/24/44).
>
> If crash_kexec_post_notifiers boot option is specified, other CPUs
> are stopped by smp_send_stop() instead of machine_crash_shutdown()
> in crash_kexec() path.  This behavior change leads two problems.
>
>  Problem 1:
>  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
>  still online and try to stop their watchdog timer.  If
>  smp_send_stop() is called before octeon_generic_shutdown(), stopping
>  watchdog timer will fail because other CPUs have been offlined by
>  smp_send_stop().
>
>    panic()
>      if crash_kexec_post_notifiers == 1
>        smp_send_stop()
>        atomic_notifier_call_chain()
>        kmsg_dump()
>      crash_kexec()
>        machine_crash_shutdown()
>          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
>
>  Problem 2:
>  Most of architectures stop other CPUs in machine_crash_shutdown()
>  path, and they also do something needed for kdump.  For example,
>  they save registers, disable virtualization extensions, and so on.
>  However, if smp_send_stop() stops other CPUs before
>  machine_crash_shutdown(), we miss those operations.
>
> How do we fix these problems?  In the first place, we should stop
> other CPUs as soon as possible when panic() was called, otherwise
> other CPUs may wipe out a clue to the cause of the failure.  So, we
> replace smp_send_stop() with more suitable one for kdump.
>
> This patch solves Problem 2 by replacing smp_send_stop() in panic()
> with panic_smp_send_stop().  This is a weak function which calls
> smp_send_stop(), and architecture dependent code may override this
> with appropriate one.  This patch only provides x86-specific version.
>
> Changes in V3:
> - Revise comments, description, and symbol names
>
> Changes in V2:
> - Replace smp_send_stop() call with crash_kexec version which
>   saves cpu states and cleans up VMX/SVM
> - Drop a fix for Problem 1 at this moment
>
> Reported-by: Daniel Walker <dwalker@fifo99.com>
> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Toshi Kani <toshi.kani@hpe.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
> Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
> Cc: Minfei Huang <mnfhuang@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/kernel/crash.c |   14 ++++++++++----
>  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
>  2 files changed, 42 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 9ef978d..3305433 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>  	disable_local_APIC();
>  }
>  
> -static void kdump_nmi_shootdown_cpus(void)
> +/* Override the weak function in kernel/panic.c */
> +void panic_smp_send_stop(void)
>  {
> -	nmi_shootdown_cpus(kdump_nmi_callback);
> +	static int cpus_stopped;

Should be atomic_t type?

> +
> +	if (cpus_stopped)
> +		return;
>  
> +	nmi_shootdown_cpus(kdump_nmi_callback);
>  	disable_local_APIC();
> +	cpus_stopped = 1;
>  }
>  
>  #else
> -static void kdump_nmi_shootdown_cpus(void)
> +void panic_smp_send_stop(void)
>  {
>  	/* There are no cpus to shootdown */
>  }
> @@ -160,7 +166,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>  	/* The kernel is broken so disable interrupts */
>  	local_irq_disable();
>  
> -	kdump_nmi_shootdown_cpus();
> +	panic_smp_send_stop();
>  
>  	/*
>  	 * VMCLEAR VMCSs loaded on this cpu if needed.
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 8aa7449..da8062d2 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
>  	panic_smp_self_stop();
>  }
>  
> +/*
> + * Stop other CPUs in panic.  Architecture dependent code may override this
> + * with more suitable version.  For example, if the architecture supports
> + * crash dump, it should save registers of each stopped CPU and disable
> + * per-CPU features such as virtualization extensions.
> + */
> +void __weak panic_smp_send_stop(void)
> +{
> +	static int cpus_stopped;
> +
> +	/*
> +	 * This function can be called twice in panic path, but obviously
> +	 * we execute this only once.
> +	 */
> +	if (cpus_stopped)
> +		return;
> +
> +	/*
> +	 * Note smp_send_stop is the usual smp shutdown function, which
> +	 * unfortunately means it may not be hardened to work in a panic
> +	 * situation.
> +	 */
> +	smp_send_stop();
> +	cpus_stopped = 1;
> +}
> +
>  atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
>  
>  /*
> @@ -125,7 +151,7 @@ void panic(const char *fmt, ...)
>  	 * Only one CPU is allowed to execute the panic code from here. For
>  	 * multiple parallel invocations of panic, all other CPUs either
>  	 * stop themself or will wait until they are stopped by the 1st CPU
> -	 * with smp_send_stop().
> +	 * with panic_smp_send_stop().
>  	 *
>  	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
>  	 * comes here, so go ahead.
> @@ -165,12 +191,7 @@ void panic(const char *fmt, ...)
>  		__crash_kexec(NULL);
>  	}
>  
> -	/*
> -	 * Note smp_send_stop is the usual smp shutdown function, which
> -	 * unfortunately means it may not be hardened to work in a panic
> -	 * situation.
> -	 */
> -	smp_send_stop();
> +	panic_smp_send_stop();

If crash_kexec_post_notifiers and kdump both are not used, you actually changed the original behaviour,
it would be better to make it conditionally like:
    if (crash_kexec_post_notifiers)
        panic_smp_send_stop(); // renaming it to crash_smp_send_stop() is better?
    else
        smp_send_stop();

Regards,
Xunlei

>  
>  	/*
>  	 * Run any panic handlers, including those that might need to
> @@ -198,10 +219,10 @@ void panic(const char *fmt, ...)
>  
>  	/*
>  	 * We may have ended up stopping the CPU holding the lock (in
> -	 * smp_send_stop()) while still having some valuable data in the console
> -	 * buffer.  Try to acquire the lock then release it regardless of the
> -	 * result.  The release will also print the buffers out.  Locks debug
> -	 * should be disabled to avoid reporting bad unlock balance when
> +	 * panic_smp_send_stop()) while still having some valuable data in the
> +	 * console buffer.  Try to acquire the lock then release it regardless
> +	 * of the result.  The release will also print the buffers out.  Locks
> +	 * debug should be disabled to avoid reporting bad unlock balance when
>  	 * panic() is not being callled from OOPS.
>  	 */
>  	debug_locks_off();
>
>
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-12  3:11   ` Xunlei Pang
@ 2016-07-12  3:56     ` 河合英宏 / KAWAI,HIDEHIRO
  2016-07-12  6:57       ` Xunlei Pang
  0 siblings, 1 reply; 16+ messages in thread
From: 河合英宏 / KAWAI,HIDEHIRO @ 2016-07-12  3:56 UTC (permalink / raw)
  To: 'xlpang@redhat.com',
	Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

Hi Xunlei,

Thanks for the review.

> From: Xunlei Pang [mailto:xpang@redhat.com]
> Sent: Tuesday, July 12, 2016 12:12 PM
> On 2016/07/05 at 19:33, Hidehiro Kawai wrote:
> > This patch fixes one of the problems reported by Daniel Walker
> > (https://lkml.org/lkml/2015/6/24/44).
> >
> > If crash_kexec_post_notifiers boot option is specified, other CPUs
> > are stopped by smp_send_stop() instead of machine_crash_shutdown()
> > in crash_kexec() path.  This behavior change leads two problems.
> >
> >  Problem 1:
> >  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
> >  still online and try to stop their watchdog timer.  If
> >  smp_send_stop() is called before octeon_generic_shutdown(), stopping
> >  watchdog timer will fail because other CPUs have been offlined by
> >  smp_send_stop().
> >
> >    panic()
> >      if crash_kexec_post_notifiers == 1
> >        smp_send_stop()
> >        atomic_notifier_call_chain()
> >        kmsg_dump()
> >      crash_kexec()
> >        machine_crash_shutdown()
> >          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
> >
> >  Problem 2:
> >  Most of architectures stop other CPUs in machine_crash_shutdown()
> >  path, and they also do something needed for kdump.  For example,
> >  they save registers, disable virtualization extensions, and so on.
> >  However, if smp_send_stop() stops other CPUs before
> >  machine_crash_shutdown(), we miss those operations.
> >
> > How do we fix these problems?  In the first place, we should stop
> > other CPUs as soon as possible when panic() was called, otherwise
> > other CPUs may wipe out a clue to the cause of the failure.  So, we
> > replace smp_send_stop() with more suitable one for kdump.
> >
> > This patch solves Problem 2 by replacing smp_send_stop() in panic()
> > with panic_smp_send_stop().  This is a weak function which calls
> > smp_send_stop(), and architecture dependent code may override this
> > with appropriate one.  This patch only provides x86-specific version.
> >
> > Changes in V3:
> > - Revise comments, description, and symbol names
> >
> > Changes in V2:
> > - Replace smp_send_stop() call with crash_kexec version which
> >   saves cpu states and cleans up VMX/SVM
> > - Drop a fix for Problem 1 at this moment
> >
> > Reported-by: Daniel Walker <dwalker@fifo99.com>
> > Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> > Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > Cc: Dave Young <dyoung@redhat.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Eric Biederman <ebiederm@xmission.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Toshi Kani <toshi.kani@hpe.com>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
> > Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
> > Cc: Minfei Huang <mnfhuang@gmail.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> > Cc: Petr Mladek <pmladek@suse.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  arch/x86/kernel/crash.c |   14 ++++++++++----
> >  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
> >  2 files changed, 42 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> > index 9ef978d..3305433 100644
> > --- a/arch/x86/kernel/crash.c
> > +++ b/arch/x86/kernel/crash.c
> > @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
> >  	disable_local_APIC();
> >  }
> >
> > -static void kdump_nmi_shootdown_cpus(void)
> > +/* Override the weak function in kernel/panic.c */
> > +void panic_smp_send_stop(void)
> >  {
> > -	nmi_shootdown_cpus(kdump_nmi_callback);
> > +	static int cpus_stopped;
> 
> Should be atomic_t type?

panic_smp_send_stop() can be called by only one panicking CPU
(but can be called twice). It is sufficient to be normal variable.

> > +
> > +	if (cpus_stopped)
> > +		return;
> >
> > +	nmi_shootdown_cpus(kdump_nmi_callback);
> >  	disable_local_APIC();
> > +	cpus_stopped = 1;
> >  }
> >
> >  #else
> > -static void kdump_nmi_shootdown_cpus(void)
> > +void panic_smp_send_stop(void)
> >  {
> >  	/* There are no cpus to shootdown */
> >  }
> > @@ -160,7 +166,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
> >  	/* The kernel is broken so disable interrupts */
> >  	local_irq_disable();
> >
> > -	kdump_nmi_shootdown_cpus();
> > +	panic_smp_send_stop();
> >
> >  	/*
> >  	 * VMCLEAR VMCSs loaded on this cpu if needed.
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index 8aa7449..da8062d2 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
> >  	panic_smp_self_stop();
> >  }
> >
> > +/*
> > + * Stop other CPUs in panic.  Architecture dependent code may override this
> > + * with more suitable version.  For example, if the architecture supports
> > + * crash dump, it should save registers of each stopped CPU and disable
> > + * per-CPU features such as virtualization extensions.
> > + */
> > +void __weak panic_smp_send_stop(void)
> > +{
> > +	static int cpus_stopped;
> > +
> > +	/*
> > +	 * This function can be called twice in panic path, but obviously
> > +	 * we execute this only once.
> > +	 */
> > +	if (cpus_stopped)
> > +		return;
> > +
> > +	/*
> > +	 * Note smp_send_stop is the usual smp shutdown function, which
> > +	 * unfortunately means it may not be hardened to work in a panic
> > +	 * situation.
> > +	 */
> > +	smp_send_stop();
> > +	cpus_stopped = 1;
> > +}
> > +
> >  atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
> >
> >  /*
> > @@ -125,7 +151,7 @@ void panic(const char *fmt, ...)
> >  	 * Only one CPU is allowed to execute the panic code from here. For
> >  	 * multiple parallel invocations of panic, all other CPUs either
> >  	 * stop themself or will wait until they are stopped by the 1st CPU
> > -	 * with smp_send_stop().
> > +	 * with panic_smp_send_stop().
> >  	 *
> >  	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
> >  	 * comes here, so go ahead.
> > @@ -165,12 +191,7 @@ void panic(const char *fmt, ...)
> >  		__crash_kexec(NULL);
> >  	}
> >
> > -	/*
> > -	 * Note smp_send_stop is the usual smp shutdown function, which
> > -	 * unfortunately means it may not be hardened to work in a panic
> > -	 * situation.
> > -	 */
> > -	smp_send_stop();
> > +	panic_smp_send_stop();
> 
> If crash_kexec_post_notifiers and kdump both are not used, you actually changed the original behaviour,
> it would be better to make it conditionally like:
>     if (crash_kexec_post_notifiers)
>         panic_smp_send_stop(); // renaming it to crash_smp_send_stop() is better?
>     else
>         smp_send_stop();

Dave also pointed about this, and I'll fix it.

Best regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

> >  	/*
> >  	 * Run any panic handlers, including those that might need to
> > @@ -198,10 +219,10 @@ void panic(const char *fmt, ...)
> >
> >  	/*
> >  	 * We may have ended up stopping the CPU holding the lock (in
> > -	 * smp_send_stop()) while still having some valuable data in the console
> > -	 * buffer.  Try to acquire the lock then release it regardless of the
> > -	 * result.  The release will also print the buffers out.  Locks debug
> > -	 * should be disabled to avoid reporting bad unlock balance when
> > +	 * panic_smp_send_stop()) while still having some valuable data in the
> > +	 * console buffer.  Try to acquire the lock then release it regardless
> > +	 * of the result.  The release will also print the buffers out.  Locks
> > +	 * debug should be disabled to avoid reporting bad unlock balance when
> >  	 * panic() is not being callled from OOPS.
> >  	 */
> >  	debug_locks_off();
> >
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-12  3:56     ` 河合英宏 / KAWAI,HIDEHIRO
@ 2016-07-12  6:57       ` Xunlei Pang
  2016-07-12  7:12         ` 河合英宏 / KAWAI,HIDEHIRO
  0 siblings, 1 reply; 16+ messages in thread
From: Xunlei Pang @ 2016-07-12  6:57 UTC (permalink / raw)
  To: 河合英宏 / KAWAI,HIDEHIRO,
	'xlpang@redhat.com',
	Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

On 2016/07/12 at 11:56, 河合英宏 / KAWAI,HIDEHIRO wrote:
> Hi Xunlei,
>
> Thanks for the review.
>
>> From: Xunlei Pang [mailto:xpang@redhat.com]
>> Sent: Tuesday, July 12, 2016 12:12 PM
>> On 2016/07/05 at 19:33, Hidehiro Kawai wrote:
>>> This patch fixes one of the problems reported by Daniel Walker
>>> (https://lkml.org/lkml/2015/6/24/44).
>>>
>>> If crash_kexec_post_notifiers boot option is specified, other CPUs
>>> are stopped by smp_send_stop() instead of machine_crash_shutdown()
>>> in crash_kexec() path.  This behavior change leads two problems.
>>>
>>>  Problem 1:
>>>  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
>>>  still online and try to stop their watchdog timer.  If
>>>  smp_send_stop() is called before octeon_generic_shutdown(), stopping
>>>  watchdog timer will fail because other CPUs have been offlined by
>>>  smp_send_stop().
>>>
>>>    panic()
>>>      if crash_kexec_post_notifiers == 1
>>>        smp_send_stop()
>>>        atomic_notifier_call_chain()
>>>        kmsg_dump()
>>>      crash_kexec()
>>>        machine_crash_shutdown()
>>>          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
>>>
>>>  Problem 2:
>>>  Most of architectures stop other CPUs in machine_crash_shutdown()
>>>  path, and they also do something needed for kdump.  For example,
>>>  they save registers, disable virtualization extensions, and so on.
>>>  However, if smp_send_stop() stops other CPUs before
>>>  machine_crash_shutdown(), we miss those operations.
>>>
>>> How do we fix these problems?  In the first place, we should stop
>>> other CPUs as soon as possible when panic() was called, otherwise
>>> other CPUs may wipe out a clue to the cause of the failure.  So, we
>>> replace smp_send_stop() with more suitable one for kdump.
>>>
>>> This patch solves Problem 2 by replacing smp_send_stop() in panic()
>>> with panic_smp_send_stop().  This is a weak function which calls
>>> smp_send_stop(), and architecture dependent code may override this
>>> with appropriate one.  This patch only provides x86-specific version.
>>>
>>> Changes in V3:
>>> - Revise comments, description, and symbol names
>>>
>>> Changes in V2:
>>> - Replace smp_send_stop() call with crash_kexec version which
>>>   saves cpu states and cleans up VMX/SVM
>>> - Drop a fix for Problem 1 at this moment
>>>
>>> Reported-by: Daniel Walker <dwalker@fifo99.com>
>>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
>>> Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
>>> Cc: Dave Young <dyoung@redhat.com>
>>> Cc: Baoquan He <bhe@redhat.com>
>>> Cc: Vivek Goyal <vgoyal@redhat.com>
>>> Cc: Eric Biederman <ebiederm@xmission.com>
>>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>> Cc: Borislav Petkov <bp@suse.de>
>>> Cc: Toshi Kani <toshi.kani@hpe.com>
>>> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
>>> Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
>>> Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
>>> Cc: Minfei Huang <mnfhuang@gmail.com>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> Cc: Petr Mladek <pmladek@suse.com>
>>> Cc: Tejun Heo <tj@kernel.org>
>>> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
>>> ---
>>>  arch/x86/kernel/crash.c |   14 ++++++++++----
>>>  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
>>>  2 files changed, 42 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>> index 9ef978d..3305433 100644
>>> --- a/arch/x86/kernel/crash.c
>>> +++ b/arch/x86/kernel/crash.c
>>> @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>>>  	disable_local_APIC();
>>>  }
>>>
>>> -static void kdump_nmi_shootdown_cpus(void)
>>> +/* Override the weak function in kernel/panic.c */
>>> +void panic_smp_send_stop(void)
>>>  {
>>> -	nmi_shootdown_cpus(kdump_nmi_callback);
>>> +	static int cpus_stopped;
>> Should be atomic_t type?
> panic_smp_send_stop() can be called by only one panicking CPU
> (but can be called twice). It is sufficient to be normal variable.

There are other call sites of __crash_kexec() for oops cases, which can
call panic_smp_send_stop() concurrently on a different cpu.

Regards,
Xunlei

>>> +
>>> +	if (cpus_stopped)
>>> +		return;
>>>
>>> +	nmi_shootdown_cpus(kdump_nmi_callback);
>>>  	disable_local_APIC();
>>> +	cpus_stopped = 1;
>>>  }
>>>
>>>  #else
>>> -static void kdump_nmi_shootdown_cpus(void)
>>> +void panic_smp_send_stop(void)
>>>  {
>>>  	/* There are no cpus to shootdown */
>>>  }
>>> @@ -160,7 +166,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>>>  	/* The kernel is broken so disable interrupts */
>>>  	local_irq_disable();
>>>
>>> -	kdump_nmi_shootdown_cpus();
>>> +	panic_smp_send_stop();
>>>
>>>  	/*
>>>  	 * VMCLEAR VMCSs loaded on this cpu if needed.
>>> diff --git a/kernel/panic.c b/kernel/panic.c
>>> index 8aa7449..da8062d2 100644
>>> --- a/kernel/panic.c
>>> +++ b/kernel/panic.c
>>> @@ -71,6 +71,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
>>>  	panic_smp_self_stop();
>>>  }
>>>
>>> +/*
>>> + * Stop other CPUs in panic.  Architecture dependent code may override this
>>> + * with more suitable version.  For example, if the architecture supports
>>> + * crash dump, it should save registers of each stopped CPU and disable
>>> + * per-CPU features such as virtualization extensions.
>>> + */
>>> +void __weak panic_smp_send_stop(void)
>>> +{
>>> +	static int cpus_stopped;
>>> +
>>> +	/*
>>> +	 * This function can be called twice in panic path, but obviously
>>> +	 * we execute this only once.
>>> +	 */
>>> +	if (cpus_stopped)
>>> +		return;
>>> +
>>> +	/*
>>> +	 * Note smp_send_stop is the usual smp shutdown function, which
>>> +	 * unfortunately means it may not be hardened to work in a panic
>>> +	 * situation.
>>> +	 */
>>> +	smp_send_stop();
>>> +	cpus_stopped = 1;
>>> +}
>>> +
>>>  atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
>>>
>>>  /*
>>> @@ -125,7 +151,7 @@ void panic(const char *fmt, ...)
>>>  	 * Only one CPU is allowed to execute the panic code from here. For
>>>  	 * multiple parallel invocations of panic, all other CPUs either
>>>  	 * stop themself or will wait until they are stopped by the 1st CPU
>>> -	 * with smp_send_stop().
>>> +	 * with panic_smp_send_stop().
>>>  	 *
>>>  	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
>>>  	 * comes here, so go ahead.
>>> @@ -165,12 +191,7 @@ void panic(const char *fmt, ...)
>>>  		__crash_kexec(NULL);
>>>  	}
>>>
>>> -	/*
>>> -	 * Note smp_send_stop is the usual smp shutdown function, which
>>> -	 * unfortunately means it may not be hardened to work in a panic
>>> -	 * situation.
>>> -	 */
>>> -	smp_send_stop();
>>> +	panic_smp_send_stop();
>> If crash_kexec_post_notifiers and kdump both are not used, you actually changed the original behaviour,
>> it would be better to make it conditionally like:
>>     if (crash_kexec_post_notifiers)
>>         panic_smp_send_stop(); // renaming it to crash_smp_send_stop() is better?
>>     else
>>         smp_send_stop();
> Dave also pointed about this, and I'll fix it.
>
> Best regards,
>
> Hidehiro Kawai
> Hitachi, Ltd. Research & Development Group
>
>>>  	/*
>>>  	 * Run any panic handlers, including those that might need to
>>> @@ -198,10 +219,10 @@ void panic(const char *fmt, ...)
>>>
>>>  	/*
>>>  	 * We may have ended up stopping the CPU holding the lock (in
>>> -	 * smp_send_stop()) while still having some valuable data in the console
>>> -	 * buffer.  Try to acquire the lock then release it regardless of the
>>> -	 * result.  The release will also print the buffers out.  Locks debug
>>> -	 * should be disabled to avoid reporting bad unlock balance when
>>> +	 * panic_smp_send_stop()) while still having some valuable data in the
>>> +	 * console buffer.  Try to acquire the lock then release it regardless
>>> +	 * of the result.  The release will also print the buffers out.  Locks
>>> +	 * debug should be disabled to avoid reporting bad unlock balance when
>>>  	 * panic() is not being callled from OOPS.
>>>  	 */
>>>  	debug_locks_off();
>>>
>>>
>>>
>>> _______________________________________________
>>> kexec mailing list
>>> kexec@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-12  6:57       ` Xunlei Pang
@ 2016-07-12  7:12         ` 河合英宏 / KAWAI,HIDEHIRO
  2016-07-12  7:26           ` Xunlei Pang
  0 siblings, 1 reply; 16+ messages in thread
From: 河合英宏 / KAWAI,HIDEHIRO @ 2016-07-12  7:12 UTC (permalink / raw)
  To: 'xlpang@redhat.com',
	Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Xunlei Pang
> Sent: Tuesday, July 12, 2016 3:57 PM
> On 2016/07/12 at 11:56, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > Hi Xunlei,
> >
> > Thanks for the review.
> >
> >> From: Xunlei Pang [mailto:xpang@redhat.com]
> >> Sent: Tuesday, July 12, 2016 12:12 PM
> >> On 2016/07/05 at 19:33, Hidehiro Kawai wrote:
> >>> This patch fixes one of the problems reported by Daniel Walker
> >>> (https://lkml.org/lkml/2015/6/24/44).
> >>>
> >>> If crash_kexec_post_notifiers boot option is specified, other CPUs
> >>> are stopped by smp_send_stop() instead of machine_crash_shutdown()
> >>> in crash_kexec() path.  This behavior change leads two problems.
> >>>
> >>>  Problem 1:
> >>>  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
> >>>  still online and try to stop their watchdog timer.  If
> >>>  smp_send_stop() is called before octeon_generic_shutdown(), stopping
> >>>  watchdog timer will fail because other CPUs have been offlined by
> >>>  smp_send_stop().
> >>>
> >>>    panic()
> >>>      if crash_kexec_post_notifiers == 1
> >>>        smp_send_stop()
> >>>        atomic_notifier_call_chain()
> >>>        kmsg_dump()
> >>>      crash_kexec()
> >>>        machine_crash_shutdown()
> >>>          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
> >>>
> >>>  Problem 2:
> >>>  Most of architectures stop other CPUs in machine_crash_shutdown()
> >>>  path, and they also do something needed for kdump.  For example,
> >>>  they save registers, disable virtualization extensions, and so on.
> >>>  However, if smp_send_stop() stops other CPUs before
> >>>  machine_crash_shutdown(), we miss those operations.
> >>>
> >>> How do we fix these problems?  In the first place, we should stop
> >>> other CPUs as soon as possible when panic() was called, otherwise
> >>> other CPUs may wipe out a clue to the cause of the failure.  So, we
> >>> replace smp_send_stop() with more suitable one for kdump.
> >>>
> >>> This patch solves Problem 2 by replacing smp_send_stop() in panic()
> >>> with panic_smp_send_stop().  This is a weak function which calls
> >>> smp_send_stop(), and architecture dependent code may override this
> >>> with appropriate one.  This patch only provides x86-specific version.
> >>>
> >>> Changes in V3:
> >>> - Revise comments, description, and symbol names
> >>>
> >>> Changes in V2:
> >>> - Replace smp_send_stop() call with crash_kexec version which
> >>>   saves cpu states and cleans up VMX/SVM
> >>> - Drop a fix for Problem 1 at this moment
> >>>
> >>> Reported-by: Daniel Walker <dwalker@fifo99.com>
> >>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> >>> Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> >>> Cc: Dave Young <dyoung@redhat.com>
> >>> Cc: Baoquan He <bhe@redhat.com>
> >>> Cc: Vivek Goyal <vgoyal@redhat.com>
> >>> Cc: Eric Biederman <ebiederm@xmission.com>
> >>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> >>> Cc: Thomas Gleixner <tglx@linutronix.de>
> >>> Cc: Ingo Molnar <mingo@redhat.com>
> >>> Cc: "H. Peter Anvin" <hpa@zytor.com>
> >>> Cc: Borislav Petkov <bp@suse.de>
> >>> Cc: Toshi Kani <toshi.kani@hpe.com>
> >>> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> >>> Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
> >>> Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
> >>> Cc: Minfei Huang <mnfhuang@gmail.com>
> >>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>> Cc: Michal Hocko <mhocko@suse.com>
> >>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> >>> Cc: Petr Mladek <pmladek@suse.com>
> >>> Cc: Tejun Heo <tj@kernel.org>
> >>> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> >>> ---
> >>>  arch/x86/kernel/crash.c |   14 ++++++++++----
> >>>  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
> >>>  2 files changed, 42 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> >>> index 9ef978d..3305433 100644
> >>> --- a/arch/x86/kernel/crash.c
> >>> +++ b/arch/x86/kernel/crash.c
> >>> @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
> >>>  	disable_local_APIC();
> >>>  }
> >>>
> >>> -static void kdump_nmi_shootdown_cpus(void)
> >>> +/* Override the weak function in kernel/panic.c */
> >>> +void panic_smp_send_stop(void)
> >>>  {
> >>> -	nmi_shootdown_cpus(kdump_nmi_callback);
> >>> +	static int cpus_stopped;
> >> Should be atomic_t type?
> > panic_smp_send_stop() can be called by only one panicking CPU
> > (but can be called twice). It is sufficient to be normal variable.
> 
> There are other call sites of __crash_kexec() for oops cases, which can
> call panic_smp_send_stop() concurrently on a different cpu.

In oops cases, crash_kexec() is called first, then __crash_kexec() is
called.  crash_kexec() excludes concurrent execution of panic() and
crash_kexec() via panic_cpu, so panic_smp_send_stop() shouldn't be
called concurrently.

Best regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-12  7:12         ` 河合英宏 / KAWAI,HIDEHIRO
@ 2016-07-12  7:26           ` Xunlei Pang
  0 siblings, 0 replies; 16+ messages in thread
From: Xunlei Pang @ 2016-07-12  7:26 UTC (permalink / raw)
  To: 河合英宏 / KAWAI,HIDEHIRO,
	'xlpang@redhat.com',
	Andrew Morton, Dave Young, Eric W. Biederman, Baoquan He
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Minfei Huang, H. Peter Anvin, Daniel Walker, Ingo Molnar,
	Takao Indoh, x86, Lee, Chun-Yi, Borislav Petkov, Vivek Goyal,
	Petr Mladek, Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Masami Hiramatsu, Tejun Heo,
	Vitaly Kuznetsov

On 2016/07/12 at 15:12, 河合英宏 / KAWAI,HIDEHIRO wrote:
>> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Xunlei Pang
>> Sent: Tuesday, July 12, 2016 3:57 PM
>> On 2016/07/12 at 11:56, 河合英宏 / KAWAI,HIDEHIRO wrote:
>>> Hi Xunlei,
>>>
>>> Thanks for the review.
>>>
>>>> From: Xunlei Pang [mailto:xpang@redhat.com]
>>>> Sent: Tuesday, July 12, 2016 12:12 PM
>>>> On 2016/07/05 at 19:33, Hidehiro Kawai wrote:
>>>>> This patch fixes one of the problems reported by Daniel Walker
>>>>> (https://lkml.org/lkml/2015/6/24/44).
>>>>>
>>>>> If crash_kexec_post_notifiers boot option is specified, other CPUs
>>>>> are stopped by smp_send_stop() instead of machine_crash_shutdown()
>>>>> in crash_kexec() path.  This behavior change leads two problems.
>>>>>
>>>>>  Problem 1:
>>>>>  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
>>>>>  still online and try to stop their watchdog timer.  If
>>>>>  smp_send_stop() is called before octeon_generic_shutdown(), stopping
>>>>>  watchdog timer will fail because other CPUs have been offlined by
>>>>>  smp_send_stop().
>>>>>
>>>>>    panic()
>>>>>      if crash_kexec_post_notifiers == 1
>>>>>        smp_send_stop()
>>>>>        atomic_notifier_call_chain()
>>>>>        kmsg_dump()
>>>>>      crash_kexec()
>>>>>        machine_crash_shutdown()
>>>>>          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
>>>>>
>>>>>  Problem 2:
>>>>>  Most of architectures stop other CPUs in machine_crash_shutdown()
>>>>>  path, and they also do something needed for kdump.  For example,
>>>>>  they save registers, disable virtualization extensions, and so on.
>>>>>  However, if smp_send_stop() stops other CPUs before
>>>>>  machine_crash_shutdown(), we miss those operations.
>>>>>
>>>>> How do we fix these problems?  In the first place, we should stop
>>>>> other CPUs as soon as possible when panic() was called, otherwise
>>>>> other CPUs may wipe out a clue to the cause of the failure.  So, we
>>>>> replace smp_send_stop() with more suitable one for kdump.
>>>>>
>>>>> This patch solves Problem 2 by replacing smp_send_stop() in panic()
>>>>> with panic_smp_send_stop().  This is a weak function which calls
>>>>> smp_send_stop(), and architecture dependent code may override this
>>>>> with appropriate one.  This patch only provides x86-specific version.
>>>>>
>>>>> Changes in V3:
>>>>> - Revise comments, description, and symbol names
>>>>>
>>>>> Changes in V2:
>>>>> - Replace smp_send_stop() call with crash_kexec version which
>>>>>   saves cpu states and cleans up VMX/SVM
>>>>> - Drop a fix for Problem 1 at this moment
>>>>>
>>>>> Reported-by: Daniel Walker <dwalker@fifo99.com>
>>>>> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
>>>>> Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
>>>>> Cc: Dave Young <dyoung@redhat.com>
>>>>> Cc: Baoquan He <bhe@redhat.com>
>>>>> Cc: Vivek Goyal <vgoyal@redhat.com>
>>>>> Cc: Eric Biederman <ebiederm@xmission.com>
>>>>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
>>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>>>> Cc: Borislav Petkov <bp@suse.de>
>>>>> Cc: Toshi Kani <toshi.kani@hpe.com>
>>>>> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
>>>>> Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
>>>>> Cc: "Lee, Chun-Yi" <joeyli.kernel@gmail.com>
>>>>> Cc: Minfei Huang <mnfhuang@gmail.com>
>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>>>>> Cc: Petr Mladek <pmladek@suse.com>
>>>>> Cc: Tejun Heo <tj@kernel.org>
>>>>> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
>>>>> ---
>>>>>  arch/x86/kernel/crash.c |   14 ++++++++++----
>>>>>  kernel/panic.c          |   43 ++++++++++++++++++++++++++++++++-----------
>>>>>  2 files changed, 42 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>>>> index 9ef978d..3305433 100644
>>>>> --- a/arch/x86/kernel/crash.c
>>>>> +++ b/arch/x86/kernel/crash.c
>>>>> @@ -133,15 +133,21 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>>>>>  	disable_local_APIC();
>>>>>  }
>>>>>
>>>>> -static void kdump_nmi_shootdown_cpus(void)
>>>>> +/* Override the weak function in kernel/panic.c */
>>>>> +void panic_smp_send_stop(void)
>>>>>  {
>>>>> -	nmi_shootdown_cpus(kdump_nmi_callback);
>>>>> +	static int cpus_stopped;
>>>> Should be atomic_t type?
>>> panic_smp_send_stop() can be called by only one panicking CPU
>>> (but can be called twice). It is sufficient to be normal variable.
>> There are other call sites of __crash_kexec() for oops cases, which can
>> call panic_smp_send_stop() concurrently on a different cpu.
> In oops cases, crash_kexec() is called first, then __crash_kexec() is
> called.  crash_kexec() excludes concurrent execution of panic() and
> crash_kexec() via panic_cpu, so panic_smp_send_stop() shouldn't be
> called concurrently.

Right, that's why oops calls crash_kexec() not __crash_kexec().
I have no problem on this. Thanks!

Regards,
Xunlei

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-12  2:49     ` 河合英宏 / KAWAI,HIDEHIRO
@ 2016-07-13  2:03       ` 'Dave Young'
  2016-07-15 11:50         ` 河合英宏 / KAWAI,HIDEHIRO
  0 siblings, 1 reply; 16+ messages in thread
From: 'Dave Young' @ 2016-07-13  2:03 UTC (permalink / raw)
  To: 河合英宏 / KAWAI,HIDEHIRO
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Vitaly Kuznetsov, Minfei Huang, H. Peter Anvin, Daniel Walker,
	Ingo Molnar, Takao Indoh, Baoquan He, x86, Lee, Chun-Yi,
	Borislav Petkov, Vivek Goyal, Masami Hiramatsu, Petr Mladek,
	Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Eric W. Biederman, Tejun Heo,
	Andrew Morton

On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> Hi Dave,
> 
> Thanks for the comments.
> 
> > From: Dave Young [mailto:dyoung@redhat.com]
> > Sent: Monday, July 11, 2016 5:35 PM
> > 
> > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> > > This patch fixes one of the problems reported by Daniel Walker
> > > (https://lkml.org/lkml/2015/6/24/44).
> > >
> > > If crash_kexec_post_notifiers boot option is specified, other CPUs
> > > are stopped by smp_send_stop() instead of machine_crash_shutdown()
> > > in crash_kexec() path.  This behavior change leads two problems.
> > >
> > >  Problem 1:
> > >  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
> > >  still online and try to stop their watchdog timer.  If
> > >  smp_send_stop() is called before octeon_generic_shutdown(), stopping
> > >  watchdog timer will fail because other CPUs have been offlined by
> > >  smp_send_stop().
> > >
> > >    panic()
> > >      if crash_kexec_post_notifiers == 1
> > >        smp_send_stop()
> > >        atomic_notifier_call_chain()
> > >        kmsg_dump()
> > >      crash_kexec()
> > >        machine_crash_shutdown()
> > >          octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
> > >
> > >  Problem 2:
> > >  Most of architectures stop other CPUs in machine_crash_shutdown()
> > >  path, and they also do something needed for kdump.  For example,
> > >  they save registers, disable virtualization extensions, and so on.
> > >  However, if smp_send_stop() stops other CPUs before
> > >  machine_crash_shutdown(), we miss those operations.
> > >
> > > How do we fix these problems?  In the first place, we should stop
> > > other CPUs as soon as possible when panic() was called, otherwise
> > > other CPUs may wipe out a clue to the cause of the failure.  So, we
> > > replace smp_send_stop() with more suitable one for kdump.
> > 
> > We have been avoiding extra things in panic path, but unfortunately
> > crash_kexec_post_notifiers were added. I tend to agree the best place
> > for this stuff is in 2nd kernel or purgatory instead of in 1st kernel.
> 
> Several months ago, I posted a patch set which writes regs to SEL, generate
> an event to send SNMP message, and start/stop BMC's watchdog timer in
> purgatory.  This feature requires BMC with KCS (Keyboard Controller Style)
> I/F, but the most of enterprise grade server would have it.
> (http://thread.gmane.org/gmane.linux.kernel.kexec/15382)
> 
> Doing kmsg_dump things in purgatory wouldn't be suitable (should be done
> in the 2nd kernel before enabling devices and IRQs?)

In theory it is doable maybe do something like oldmem_kmsg_dump while /proc/vmcore
initializing?

>  
> > As for this patch I'm not sure it is safe to replace the smp_send_stop
> > with the kdump friendly function. I'm also not sure if the kdump friendly
> > function is safe for kdump. Will glad to hear opinions from other
> > arch experts.
> 
> This stuff depends on architectures, so I speak only about
> x86 (the logic doesn't change on other architectures at this time).
> 
> kdump path with crash_kexec_post_notifiers disabled:
>  panic()
>    __crash_kexec()
>      crash_setup_regs()
>      crash_save_vmcoreinfo()
>      machine_crash_shutdown()
>        native_machine_crash_shutdown()
>          panic_smp_send_stop() /* mostly same as original 
>                                 * kdump_nmi_shootdown_cpus()
>                                 */
> 
> kdump path with crash_kexec_post_notifiers enabled:
>  panic()
>    panic_smp_send_stop()
>    __crash_kexec()
>      crash_setup_regs()
>      crash_save_vmcoreinfo()
>      machine_crash_shutdown()
>        native_machine_crash_shutdown()
>          panic_smp_send_stop() // do nothing
> 
> The difference is that stopping other CPUs before crash_setup_regs()
> and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
> crash_save_vmcoreinfo() just save information to some memory area, 
> they wouldn't be affected by panic_smp_send_stop().  This means
> placing panic_smp_send_stop before __crash_kexec is safe.
> 
> BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the
> next version.

But it does breaks stuff which depends on cpu not being disabled
like problem 1 you mentioned in patch log.

> 
> > BTW, if one want to use crash_kexec_post_notifiers he should take the
> > risk of unreliable kdump. How about only call smp_send_stop in case no
> > crash_kexec_post_notifiers being used.
> 
> Unlike panic_smp_send_stop()/kdump_nmi_shootdown_cpus(), smp_send_stop()
> for x86 tries to stop other CPUs with normal IPI before issuing NMI IPI.
> This would be because NMI IPI has a risk of deadlock.  We checked if
> the kdump path has a risk of deadlock in the case of NMI panic and fixed
> it.  But I'm not sure about normal panic path.  I agree with that use
> smp_send_stop if crash_kexec_post_notifiers or kdump is disabled.

What I mean is like below, problem 1 will not exist in this way, but
kdump will be unreliable:
if (!crash_kexec_post_notifiers)
	smp_send_stop()

> 
> > > This patch solves Problem 2 by replacing smp_send_stop() in panic()
> > > with panic_smp_send_stop().  This is a weak function which calls
> > > smp_send_stop(), and architecture dependent code may override this
> > > with appropriate one.  This patch only provides x86-specific version.
> > 
> > It does not fix the Problem 1, it seem not possible to fix it?
> 
> Problem 1 depends on architectures, and at least it doesn't happen
> on x86.  I can try to fix the Problem 1 for MIPS, but I can't test it.
> Possible solution will be to use an smp_send_stop variant which stop
> the CPU without offlining.
> 
> Best regards,
> 
> Hidehiro Kawai
> Hitachi, Ltd. Research & Development Group
> 
[snip]
Thanks
Dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-13  2:03       ` 'Dave Young'
@ 2016-07-15 11:50         ` 河合英宏 / KAWAI,HIDEHIRO
  2016-07-18  9:02           ` 'Dave Young'
  0 siblings, 1 reply; 16+ messages in thread
From: 河合英宏 / KAWAI,HIDEHIRO @ 2016-07-15 11:50 UTC (permalink / raw)
  To: 'Dave Young'
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Vitaly Kuznetsov, Minfei Huang, H. Peter Anvin, Daniel Walker,
	Ingo Molnar, Takao Indoh, Baoquan He, x86, Lee, Chun-Yi,
	Borislav Petkov, Vivek Goyal, Masami Hiramatsu, Petr Mladek,
	Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Eric W. Biederman, Tejun Heo,
	Andrew Morton

Hi Dave,

Thanks for your reply.

> From: 'Dave Young' [mailto:dyoung@redhat.com]
> Sent: Wednesday, July 13, 2016 11:04 AM
> 
> On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > Hi Dave,
> >
> > Thanks for the comments.
> >
> > > From: Dave Young [mailto:dyoung@redhat.com]
> > > Sent: Monday, July 11, 2016 5:35 PM
> > >
> > > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> > > > This patch fixes one of the problems reported by Daniel Walker
> > > > (https://lkml.org/lkml/2015/6/24/44).
> > > >
> > > > If crash_kexec_post_notifiers boot option is specified, other CPUs
> > > > are stopped by smp_send_stop() instead of machine_crash_shutdown()
> > > > in crash_kexec() path.  This behavior change leads two problems.
> > > >
> > > >  Problem 1:
> > > >  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs
> > > > are  still online and try to stop their watchdog timer.  If
> > > >  smp_send_stop() is called before octeon_generic_shutdown(),
> > > > stopping  watchdog timer will fail because other CPUs have been
> > > > offlined by  smp_send_stop().
> > > >
> > > >    panic()
> > > >      if crash_kexec_post_notifiers == 1
> > > >        smp_send_stop()
> > > >        atomic_notifier_call_chain()
> > > >        kmsg_dump()
> > > >      crash_kexec()
> > > >        machine_crash_shutdown()
> > > >          octeon_generic_shutdown() // shutdown watchdog for ONLINE
> > > > CPUs
> > > >
> > > >  Problem 2:
> > > >  Most of architectures stop other CPUs in machine_crash_shutdown()
> > > > path, and they also do something needed for kdump.  For example,
> > > > they save registers, disable virtualization extensions, and so on.
> > > >  However, if smp_send_stop() stops other CPUs before
> > > > machine_crash_shutdown(), we miss those operations.
> > > >
> > > > How do we fix these problems?  In the first place, we should stop
> > > > other CPUs as soon as possible when panic() was called, otherwise
> > > > other CPUs may wipe out a clue to the cause of the failure.  So,
> > > > we replace smp_send_stop() with more suitable one for kdump.
> > >
> > > We have been avoiding extra things in panic path, but unfortunately
> > > crash_kexec_post_notifiers were added. I tend to agree the best
> > > place for this stuff is in 2nd kernel or purgatory instead of in 1st kernel.
> >
> > Several months ago, I posted a patch set which writes regs to SEL,
> > generate an event to send SNMP message, and start/stop BMC's watchdog
> > timer in purgatory.  This feature requires BMC with KCS (Keyboard
> > Controller Style) I/F, but the most of enterprise grade server would have it.
> > (http://thread.gmane.org/gmane.linux.kernel.kexec/15382)
> >
> > Doing kmsg_dump things in purgatory wouldn't be suitable (should be
> > done in the 2nd kernel before enabling devices and IRQs?)
> 
> In theory it is doable maybe do something like oldmem_kmsg_dump while /proc/vmcore initializing?

Hmm, I checked the case of using ACPI ERST as a persistent
storage. The feature is initialized by device_initcall():

device_initcall
  erst_init
    pstore_register

And vmcore is initialized by fs_initcall() which is called
before device_initcall().  We may be able to change the sequence,
but anyway, these are done in later phase of the kernel initialization.
So, it may get less reliable although it would be doable.

> > > As for this patch I'm not sure it is safe to replace the
> > > smp_send_stop with the kdump friendly function. I'm also not sure if
> > > the kdump friendly function is safe for kdump. Will glad to hear
> > > opinions from other arch experts.
> >
> > This stuff depends on architectures, so I speak only about
> > x86 (the logic doesn't change on other architectures at this time).
> >
> > kdump path with crash_kexec_post_notifiers disabled:
> >  panic()
> >    __crash_kexec()
> >      crash_setup_regs()
> >      crash_save_vmcoreinfo()
> >      machine_crash_shutdown()
> >        native_machine_crash_shutdown()
> >          panic_smp_send_stop() /* mostly same as original
> >                                 * kdump_nmi_shootdown_cpus()
> >                                 */
> >
> > kdump path with crash_kexec_post_notifiers enabled:
> >  panic()
> >    panic_smp_send_stop()
> >    __crash_kexec()
> >      crash_setup_regs()
> >      crash_save_vmcoreinfo()
> >      machine_crash_shutdown()
> >        native_machine_crash_shutdown()
> >          panic_smp_send_stop() // do nothing
> >
> > The difference is that stopping other CPUs before crash_setup_regs()
> > and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
> > crash_save_vmcoreinfo() just save information to some memory area,
> > they wouldn't be affected by panic_smp_send_stop().  This means
> > placing panic_smp_send_stop before __crash_kexec is safe.
> >
> > BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the next
> > version.
> 
> But it does breaks stuff which depends on cpu not being disabled like problem 1 you mentioned in patch log.

As I mentioned in the description of this patch, we should stop
other CPUs ASAP to preserve current state either
crash_kexec_post_notifiers is enabled or not.
Then, all remaining procedures should work well
after stopping other CPUs (but keep the CPU map online).

Vivek also mentioned similar things:
https://lkml.org/lkml/2015/7/14/433

> > > BTW, if one want to use crash_kexec_post_notifiers he should take
> > > the risk of unreliable kdump. How about only call smp_send_stop in
> > > case no crash_kexec_post_notifiers being used.
> >
> > Unlike panic_smp_send_stop()/kdump_nmi_shootdown_cpus(),
> > smp_send_stop() for x86 tries to stop other CPUs with normal IPI before issuing NMI IPI.
> > This would be because NMI IPI has a risk of deadlock.  We checked if
> > the kdump path has a risk of deadlock in the case of NMI panic and
> > fixed it.  But I'm not sure about normal panic path.  I agree with
> > that use smp_send_stop if crash_kexec_post_notifiers or kdump is disabled.
> 
> What I mean is like below, problem 1 will not exist in this way, but kdump will be unreliable:
> if (!crash_kexec_post_notifiers)
> 	smp_send_stop()

Remaining procedures including atomic_notifier_call_chain and
kmsg_dump may assume that other CPUs have stopped.  Actually,
IPMI driver callback assumes so that.  Also, other CPUs may
change the current state while calling these callbacks and make
the dump analysis difficult.

[snip]

Best regards,

Hidehiro Kawai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-15 11:50         ` 河合英宏 / KAWAI,HIDEHIRO
@ 2016-07-18  9:02           ` 'Dave Young'
  2016-07-19  5:51             ` 河合英宏 / KAWAI,HIDEHIRO
  0 siblings, 1 reply; 16+ messages in thread
From: 'Dave Young' @ 2016-07-18  9:02 UTC (permalink / raw)
  To: 河合英宏 / KAWAI,HIDEHIRO
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Vitaly Kuznetsov, Minfei Huang, H. Peter Anvin, Daniel Walker,
	Ingo Molnar, Takao Indoh, Baoquan He, x86, Lee, Chun-Yi,
	Borislav Petkov, Vivek Goyal, Masami Hiramatsu, Petr Mladek,
	Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Eric W. Biederman, Tejun Heo,
	Andrew Morton

Hi,

On 07/15/16 at 11:50am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> Hi Dave,
> 
> Thanks for your reply.
> 
> > From: 'Dave Young' [mailto:dyoung@redhat.com]
> > Sent: Wednesday, July 13, 2016 11:04 AM
> > 
> > On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > Hi Dave,
> > >
> > > Thanks for the comments.
> > >
> > > > From: Dave Young [mailto:dyoung@redhat.com]
> > > > Sent: Monday, July 11, 2016 5:35 PM
> > > >
> > > > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> > > > > This patch fixes one of the problems reported by Daniel Walker
> > > > > (https://lkml.org/lkml/2015/6/24/44).
> > > > >
> > > > > If crash_kexec_post_notifiers boot option is specified, other CPUs
> > > > > are stopped by smp_send_stop() instead of machine_crash_shutdown()
> > > > > in crash_kexec() path.  This behavior change leads two problems.
> > > > >
> > > > >  Problem 1:
> > > > >  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs
> > > > > are  still online and try to stop their watchdog timer.  If
> > > > >  smp_send_stop() is called before octeon_generic_shutdown(),
> > > > > stopping  watchdog timer will fail because other CPUs have been
> > > > > offlined by  smp_send_stop().
> > > > >
> > > > >    panic()
> > > > >      if crash_kexec_post_notifiers == 1
> > > > >        smp_send_stop()
> > > > >        atomic_notifier_call_chain()
> > > > >        kmsg_dump()
> > > > >      crash_kexec()
> > > > >        machine_crash_shutdown()
> > > > >          octeon_generic_shutdown() // shutdown watchdog for ONLINE
> > > > > CPUs
> > > > >
> > > > >  Problem 2:
> > > > >  Most of architectures stop other CPUs in machine_crash_shutdown()
> > > > > path, and they also do something needed for kdump.  For example,
> > > > > they save registers, disable virtualization extensions, and so on.
> > > > >  However, if smp_send_stop() stops other CPUs before
> > > > > machine_crash_shutdown(), we miss those operations.
> > > > >
> > > > > How do we fix these problems?  In the first place, we should stop
> > > > > other CPUs as soon as possible when panic() was called, otherwise
> > > > > other CPUs may wipe out a clue to the cause of the failure.  So,
> > > > > we replace smp_send_stop() with more suitable one for kdump.
> > > >
> > > > We have been avoiding extra things in panic path, but unfortunately
> > > > crash_kexec_post_notifiers were added. I tend to agree the best
> > > > place for this stuff is in 2nd kernel or purgatory instead of in 1st kernel.
> > >
> > > Several months ago, I posted a patch set which writes regs to SEL,
> > > generate an event to send SNMP message, and start/stop BMC's watchdog
> > > timer in purgatory.  This feature requires BMC with KCS (Keyboard
> > > Controller Style) I/F, but the most of enterprise grade server would have it.
> > > (http://thread.gmane.org/gmane.linux.kernel.kexec/15382)
> > >
> > > Doing kmsg_dump things in purgatory wouldn't be suitable (should be
> > > done in the 2nd kernel before enabling devices and IRQs?)
> > 
> > In theory it is doable maybe do something like oldmem_kmsg_dump while /proc/vmcore initializing?
> 
> Hmm, I checked the case of using ACPI ERST as a persistent
> storage. The feature is initialized by device_initcall():
> 
> device_initcall
>   erst_init
>     pstore_register
> 
> And vmcore is initialized by fs_initcall() which is called
> before device_initcall().  We may be able to change the sequence,
> but anyway, these are done in later phase of the kernel initialization.
> So, it may get less reliable although it would be doable.

Agreed, it is just an idea, it may need more experiments if you need.

> 
> > > > As for this patch I'm not sure it is safe to replace the
> > > > smp_send_stop with the kdump friendly function. I'm also not sure if
> > > > the kdump friendly function is safe for kdump. Will glad to hear
> > > > opinions from other arch experts.
> > >
> > > This stuff depends on architectures, so I speak only about
> > > x86 (the logic doesn't change on other architectures at this time).
> > >
> > > kdump path with crash_kexec_post_notifiers disabled:
> > >  panic()
> > >    __crash_kexec()
> > >      crash_setup_regs()
> > >      crash_save_vmcoreinfo()
> > >      machine_crash_shutdown()
> > >        native_machine_crash_shutdown()
> > >          panic_smp_send_stop() /* mostly same as original
> > >                                 * kdump_nmi_shootdown_cpus()
> > >                                 */
> > >
> > > kdump path with crash_kexec_post_notifiers enabled:
> > >  panic()
> > >    panic_smp_send_stop()
> > >    __crash_kexec()
> > >      crash_setup_regs()
> > >      crash_save_vmcoreinfo()
> > >      machine_crash_shutdown()
> > >        native_machine_crash_shutdown()
> > >          panic_smp_send_stop() // do nothing
> > >
> > > The difference is that stopping other CPUs before crash_setup_regs()
> > > and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
> > > crash_save_vmcoreinfo() just save information to some memory area,
> > > they wouldn't be affected by panic_smp_send_stop().  This means
> > > placing panic_smp_send_stop before __crash_kexec is safe.
> > >
> > > BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the next
> > > version.
> > 
> > But it does breaks stuff which depends on cpu not being disabled like problem 1 you mentioned in patch log.
> 
> As I mentioned in the description of this patch, we should stop
> other CPUs ASAP to preserve current state either
> crash_kexec_post_notifiers is enabled or not.
> Then, all remaining procedures should work well
> after stopping other CPUs (but keep the CPU map online).
> 
> Vivek also mentioned similar things:
> https://lkml.org/lkml/2015/7/14/433

The implementation in this patchset is different from suggestion in above link?

I think Vivek's suggestion is a good idea, to drop smp_send_stop and do below:
 
stop_cpus_save_register_state;

if (!crash_kexec_post_notifiers)
	crash_kexec()
atomic_notifier_call_chain()
kmsg_dump()

I'm just commenting from code flow point of view, the detail implementation
definitely need more comments from Arch experts.

Any reason did not move the kdump friendly function to earlier point like
before previous __crash_kexec() below? 
        if (!crash_kexec_post_notifiers) {
                printk_nmi_flush_on_panic();
                __crash_kexec(NULL);
        }
> 
> > > > BTW, if one want to use crash_kexec_post_notifiers he should take
> > > > the risk of unreliable kdump. How about only call smp_send_stop in
> > > > case no crash_kexec_post_notifiers being used.
> > >
> > > Unlike panic_smp_send_stop()/kdump_nmi_shootdown_cpus(),
> > > smp_send_stop() for x86 tries to stop other CPUs with normal IPI before issuing NMI IPI.
> > > This would be because NMI IPI has a risk of deadlock.  We checked if
> > > the kdump path has a risk of deadlock in the case of NMI panic and
> > > fixed it.  But I'm not sure about normal panic path.  I agree with
> > > that use smp_send_stop if crash_kexec_post_notifiers or kdump is disabled.
> > 
> > What I mean is like below, problem 1 will not exist in this way, but kdump will be unreliable:
> > if (!crash_kexec_post_notifiers)
> > 	smp_send_stop()
> 
> Remaining procedures including atomic_notifier_call_chain and
> kmsg_dump may assume that other CPUs have stopped.  Actually,
> IPMI driver callback assumes so that.  Also, other CPUs may

Ok, if so please ignore my previous suggestion.

> change the current state while calling these callbacks and make
> the dump analysis difficult.
> 
> [snip]
> 
> Best regards,
> 
> Hidehiro Kawai
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-18  9:02           ` 'Dave Young'
@ 2016-07-19  5:51             ` 河合英宏 / KAWAI,HIDEHIRO
  2016-07-19  6:52               ` 'Dave Young'
  0 siblings, 1 reply; 16+ messages in thread
From: 河合英宏 / KAWAI,HIDEHIRO @ 2016-07-19  5:51 UTC (permalink / raw)
  To: 'Dave Young'
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Vitaly Kuznetsov, Minfei Huang, H. Peter Anvin, Daniel Walker,
	Ingo Molnar, Takao Indoh, Baoquan He, x86, Lee, Chun-Yi,
	Borislav Petkov, Vivek Goyal, Masami Hiramatsu, Petr Mladek,
	Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Eric W. Biederman, Tejun Heo,
	Andrew Morton

Hi,

> From: 'Dave Young' [mailto:dyoung@redhat.com]
> Sent: Monday, July 18, 2016 6:02 PM
> On 07/15/16 at 11:50am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > Hi Dave,
> >
> > Thanks for your reply.
> >
> > > From: 'Dave Young' [mailto:dyoung@redhat.com]
> > > Sent: Wednesday, July 13, 2016 11:04 AM
> > >
> > > On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > Hi Dave,
> > > >
> > > > Thanks for the comments.
> > > >
> > > > > From: Dave Young [mailto:dyoung@redhat.com]
> > > > > Sent: Monday, July 11, 2016 5:35 PM
> > > > >
> > > > > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
[snip] 
> > > > > As for this patch I'm not sure it is safe to replace the
> > > > > smp_send_stop with the kdump friendly function. I'm also not sure if
> > > > > the kdump friendly function is safe for kdump. Will glad to hear
> > > > > opinions from other arch experts.
> > > >
> > > > This stuff depends on architectures, so I speak only about
> > > > x86 (the logic doesn't change on other architectures at this time).
> > > >
> > > > kdump path with crash_kexec_post_notifiers disabled:
> > > >  panic()
> > > >    __crash_kexec()
> > > >      crash_setup_regs()
> > > >      crash_save_vmcoreinfo()
> > > >      machine_crash_shutdown()
> > > >        native_machine_crash_shutdown()
> > > >          panic_smp_send_stop() /* mostly same as original
> > > >                                 * kdump_nmi_shootdown_cpus()
> > > >                                 */
> > > >
> > > > kdump path with crash_kexec_post_notifiers enabled:
> > > >  panic()
> > > >    panic_smp_send_stop()
> > > >    __crash_kexec()
> > > >      crash_setup_regs()
> > > >      crash_save_vmcoreinfo()
> > > >      machine_crash_shutdown()
> > > >        native_machine_crash_shutdown()
> > > >          panic_smp_send_stop() // do nothing
> > > >
> > > > The difference is that stopping other CPUs before crash_setup_regs()
> > > > and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
> > > > crash_save_vmcoreinfo() just save information to some memory area,
> > > > they wouldn't be affected by panic_smp_send_stop().  This means
> > > > placing panic_smp_send_stop before __crash_kexec is safe.
> > > >
> > > > BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the next
> > > > version.
> > >
> > > But it does breaks stuff which depends on cpu not being disabled like problem 1 you mentioned in patch log.
> >
> > As I mentioned in the description of this patch, we should stop
> > other CPUs ASAP to preserve current state either
> > crash_kexec_post_notifiers is enabled or not.
> > Then, all remaining procedures should work well
> > after stopping other CPUs (but keep the CPU map online).
> >
> > Vivek also mentioned similar things:
> > https://lkml.org/lkml/2015/7/14/433
> 
> The implementation in this patchset is different from suggestion in above link?
> 
> I think Vivek's suggestion is a good idea, to drop smp_send_stop and do below:
> 
> stop_cpus_save_register_state;
> 
> if (!crash_kexec_post_notifiers)
> 	crash_kexec()
> atomic_notifier_call_chain()
> kmsg_dump()
> 
> I'm just commenting from code flow point of view, the detail implementation
> definitely need more comments from Arch experts.
> 
> Any reason did not move the kdump friendly function to earlier point like
> before previous __crash_kexec() below?
>         if (!crash_kexec_post_notifiers) {
>                 printk_nmi_flush_on_panic();
>                 __crash_kexec(NULL);
>         }

The reason why the implementation differs from Vivek's is to keep
the current code flow if crash_kexec_post_notifiers is not specified.

If we apply Vivek's or your suggestion, it may always cause kdump
to fail on MIPS OCTEON due to Problem 1.  I don't want to make things
any worse.  I may post a patch for MIPS OCTEON, but I can't test it.
For other architectures, I'm not sure what problems there are.
So at first, I want to fix the case where crash_kexec_post_notifiers is
specified on x86.  Then, if all other architectures support
`stop other CPUs before crash_kexec', switch to your or Vivek's
suggesting code.

Is this acceptable?

Best regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-19  5:51             ` 河合英宏 / KAWAI,HIDEHIRO
@ 2016-07-19  6:52               ` 'Dave Young'
  2016-07-19 11:23                 ` 河合英宏 / KAWAI,HIDEHIRO
  0 siblings, 1 reply; 16+ messages in thread
From: 'Dave Young' @ 2016-07-19  6:52 UTC (permalink / raw)
  To: 河合英宏 / KAWAI,HIDEHIRO
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Vitaly Kuznetsov, Minfei Huang, H. Peter Anvin, Daniel Walker,
	Ingo Molnar, Takao Indoh, Baoquan He, x86, Lee, Chun-Yi,
	Borislav Petkov, Vivek Goyal, Masami Hiramatsu, Petr Mladek,
	Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Eric W. Biederman, Tejun Heo,
	Andrew Morton

Hi, 
On 07/19/16 at 05:51am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> Hi,
> 
> > From: 'Dave Young' [mailto:dyoung@redhat.com]
> > Sent: Monday, July 18, 2016 6:02 PM
> > On 07/15/16 at 11:50am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > Hi Dave,
> > >
> > > Thanks for your reply.
> > >
> > > > From: 'Dave Young' [mailto:dyoung@redhat.com]
> > > > Sent: Wednesday, July 13, 2016 11:04 AM
> > > >
> > > > On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > > Hi Dave,
> > > > >
> > > > > Thanks for the comments.
> > > > >
> > > > > > From: Dave Young [mailto:dyoung@redhat.com]
> > > > > > Sent: Monday, July 11, 2016 5:35 PM
> > > > > >
> > > > > > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> [snip] 
> > > > > > As for this patch I'm not sure it is safe to replace the
> > > > > > smp_send_stop with the kdump friendly function. I'm also not sure if
> > > > > > the kdump friendly function is safe for kdump. Will glad to hear
> > > > > > opinions from other arch experts.
> > > > >
> > > > > This stuff depends on architectures, so I speak only about
> > > > > x86 (the logic doesn't change on other architectures at this time).
> > > > >
> > > > > kdump path with crash_kexec_post_notifiers disabled:
> > > > >  panic()
> > > > >    __crash_kexec()
> > > > >      crash_setup_regs()
> > > > >      crash_save_vmcoreinfo()
> > > > >      machine_crash_shutdown()
> > > > >        native_machine_crash_shutdown()
> > > > >          panic_smp_send_stop() /* mostly same as original
> > > > >                                 * kdump_nmi_shootdown_cpus()
> > > > >                                 */
> > > > >
> > > > > kdump path with crash_kexec_post_notifiers enabled:
> > > > >  panic()
> > > > >    panic_smp_send_stop()
> > > > >    __crash_kexec()
> > > > >      crash_setup_regs()
> > > > >      crash_save_vmcoreinfo()
> > > > >      machine_crash_shutdown()
> > > > >        native_machine_crash_shutdown()
> > > > >          panic_smp_send_stop() // do nothing
> > > > >
> > > > > The difference is that stopping other CPUs before crash_setup_regs()
> > > > > and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
> > > > > crash_save_vmcoreinfo() just save information to some memory area,
> > > > > they wouldn't be affected by panic_smp_send_stop().  This means
> > > > > placing panic_smp_send_stop before __crash_kexec is safe.
> > > > >
> > > > > BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the next
> > > > > version.
> > > >
> > > > But it does breaks stuff which depends on cpu not being disabled like problem 1 you mentioned in patch log.
> > >
> > > As I mentioned in the description of this patch, we should stop
> > > other CPUs ASAP to preserve current state either
> > > crash_kexec_post_notifiers is enabled or not.
> > > Then, all remaining procedures should work well
> > > after stopping other CPUs (but keep the CPU map online).
> > >
> > > Vivek also mentioned similar things:
> > > https://lkml.org/lkml/2015/7/14/433
> > 
> > The implementation in this patchset is different from suggestion in above link?
> > 
> > I think Vivek's suggestion is a good idea, to drop smp_send_stop and do below:
> > 
> > stop_cpus_save_register_state;
> > 
> > if (!crash_kexec_post_notifiers)
> > 	crash_kexec()
> > atomic_notifier_call_chain()
> > kmsg_dump()
> > 
> > I'm just commenting from code flow point of view, the detail implementation
> > definitely need more comments from Arch experts.
> > 
> > Any reason did not move the kdump friendly function to earlier point like
> > before previous __crash_kexec() below?
> >         if (!crash_kexec_post_notifiers) {
> >                 printk_nmi_flush_on_panic();
> >                 __crash_kexec(NULL);
> >         }
> 
> The reason why the implementation differs from Vivek's is to keep
> the current code flow if crash_kexec_post_notifiers is not specified.
> 
> If we apply Vivek's or your suggestion, it may always cause kdump
> to fail on MIPS OCTEON due to Problem 1.  I don't want to make things
> any worse.  I may post a patch for MIPS OCTEON, but I can't test it.
> For other architectures, I'm not sure what problems there are.
> So at first, I want to fix the case where crash_kexec_post_notifiers is
> specified on x86.  Then, if all other architectures support
> `stop other CPUs before crash_kexec', switch to your or Vivek's
> suggesting code.
> 
> Is this acceptable?

Maybe you can find someone who can test MIPS OCTEON so that they can give
some thoughts first and maybe test a fix?

[dyoung@localhost linux]$ ./scripts/get_maintainer.pl -f arch/mips/cavium-octeon
Ralf Baechle <ralf@linux-mips.org> (supporter:MIPS,commit_signer:32/35=91%)
David Daney <david.daney@cavium.com> (commit_signer:21/35=60%,authored:8/35=23%)
Aaro Koskinen <aaro.koskinen@iki.fi> (commit_signer:15/35=43%,authored:8/35=23%)
Janne Huttunen <janne.huttunen@nokia.com>
(commit_signer:7/35=20%,authored:7/35=20%)
Thomas Gleixner <tglx@linutronix.de> (commit_signer:4/35=11%,authored:2/35=6%)
linux-mips@linux-mips.org (open list:MIPS)
linux-kernel@vger.kernel.org (open list)

Thanks
Dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version
  2016-07-19  6:52               ` 'Dave Young'
@ 2016-07-19 11:23                 ` 河合英宏 / KAWAI,HIDEHIRO
  0 siblings, 0 replies; 16+ messages in thread
From: 河合英宏 / KAWAI,HIDEHIRO @ 2016-07-19 11:23 UTC (permalink / raw)
  To: 'Dave Young'
  Cc: Michal Hocko, Toshi Kani, Peter Zijlstra (Intel),
	Vitaly Kuznetsov, Minfei Huang, H. Peter Anvin, Daniel Walker,
	Ingo Molnar, Takao Indoh, Baoquan He, x86, Lee, Chun-Yi,
	Borislav Petkov, Vivek Goyal, Masami Hiramatsu, Petr Mladek,
	Josh Poimboeuf, Thomas Gleixner, Ingo Molnar, kexec,
	linux-kernel, HATAYAMA Daisuke, Eric W. Biederman, Tejun Heo,
	Andrew Morton

> From: 'Dave Young' [mailto:dyoung@redhat.com]
> Sent: Tuesday, July 19, 2016 3:52 PM
> Hi,
> On 07/19/16 at 05:51am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > Hi,
> >
> > > From: 'Dave Young' [mailto:dyoung@redhat.com]
> > > Sent: Monday, July 18, 2016 6:02 PM
> > > On 07/15/16 at 11:50am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > Hi Dave,
> > > >
> > > > Thanks for your reply.
> > > >
> > > > > From: 'Dave Young' [mailto:dyoung@redhat.com]
> > > > > Sent: Wednesday, July 13, 2016 11:04 AM
> > > > >
> > > > > On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > > > Hi Dave,
> > > > > >
> > > > > > Thanks for the comments.
> > > > > >
> > > > > > > From: Dave Young [mailto:dyoung@redhat.com]
> > > > > > > Sent: Monday, July 11, 2016 5:35 PM
> > > > > > >
> > > > > > > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> > [snip]
> > > > > > > As for this patch I'm not sure it is safe to replace the
> > > > > > > smp_send_stop with the kdump friendly function. I'm also not sure if
> > > > > > > the kdump friendly function is safe for kdump. Will glad to hear
> > > > > > > opinions from other arch experts.
> > > > > >
> > > > > > This stuff depends on architectures, so I speak only about
> > > > > > x86 (the logic doesn't change on other architectures at this time).
> > > > > >
> > > > > > kdump path with crash_kexec_post_notifiers disabled:
> > > > > >  panic()
> > > > > >    __crash_kexec()
> > > > > >      crash_setup_regs()
> > > > > >      crash_save_vmcoreinfo()
> > > > > >      machine_crash_shutdown()
> > > > > >        native_machine_crash_shutdown()
> > > > > >          panic_smp_send_stop() /* mostly same as original
> > > > > >                                 * kdump_nmi_shootdown_cpus()
> > > > > >                                 */
> > > > > >
> > > > > > kdump path with crash_kexec_post_notifiers enabled:
> > > > > >  panic()
> > > > > >    panic_smp_send_stop()
> > > > > >    __crash_kexec()
> > > > > >      crash_setup_regs()
> > > > > >      crash_save_vmcoreinfo()
> > > > > >      machine_crash_shutdown()
> > > > > >        native_machine_crash_shutdown()
> > > > > >          panic_smp_send_stop() // do nothing
> > > > > >
> > > > > > The difference is that stopping other CPUs before crash_setup_regs()
> > > > > > and crash_save_vmcoreinfo() or not.  Since crash_setup_regs() and
> > > > > > crash_save_vmcoreinfo() just save information to some memory area,
> > > > > > they wouldn't be affected by panic_smp_send_stop().  This means
> > > > > > placing panic_smp_send_stop before __crash_kexec is safe.
> > > > > >
> > > > > > BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the next
> > > > > > version.
> > > > >
> > > > > But it does breaks stuff which depends on cpu not being disabled like problem 1 you mentioned in patch log.
> > > >
> > > > As I mentioned in the description of this patch, we should stop
> > > > other CPUs ASAP to preserve current state either
> > > > crash_kexec_post_notifiers is enabled or not.
> > > > Then, all remaining procedures should work well
> > > > after stopping other CPUs (but keep the CPU map online).
> > > >
> > > > Vivek also mentioned similar things:
> > > > https://lkml.org/lkml/2015/7/14/433
> > >
> > > The implementation in this patchset is different from suggestion in above link?
> > >
> > > I think Vivek's suggestion is a good idea, to drop smp_send_stop and do below:
> > >
> > > stop_cpus_save_register_state;
> > >
> > > if (!crash_kexec_post_notifiers)
> > > 	crash_kexec()
> > > atomic_notifier_call_chain()
> > > kmsg_dump()
> > >
> > > I'm just commenting from code flow point of view, the detail implementation
> > > definitely need more comments from Arch experts.
> > >
> > > Any reason did not move the kdump friendly function to earlier point like
> > > before previous __crash_kexec() below?
> > >         if (!crash_kexec_post_notifiers) {
> > >                 printk_nmi_flush_on_panic();
> > >                 __crash_kexec(NULL);
> > >         }
> >
> > The reason why the implementation differs from Vivek's is to keep
> > the current code flow if crash_kexec_post_notifiers is not specified.
> >
> > If we apply Vivek's or your suggestion, it may always cause kdump
> > to fail on MIPS OCTEON due to Problem 1.  I don't want to make things
> > any worse.  I may post a patch for MIPS OCTEON, but I can't test it.
> > For other architectures, I'm not sure what problems there are.
> > So at first, I want to fix the case where crash_kexec_post_notifiers is
> > specified on x86.  Then, if all other architectures support
> > `stop other CPUs before crash_kexec', switch to your or Vivek's
> > suggesting code.
> >
> > Is this acceptable?
> 
> Maybe you can find someone who can test MIPS OCTEON so that they can give
> some thoughts first and maybe test a fix?
> 
> [dyoung@localhost linux]$ ./scripts/get_maintainer.pl -f arch/mips/cavium-octeon
> Ralf Baechle <ralf@linux-mips.org> (supporter:MIPS,commit_signer:32/35=91%)
> David Daney <david.daney@cavium.com> (commit_signer:21/35=60%,authored:8/35=23%)
> Aaro Koskinen <aaro.koskinen@iki.fi> (commit_signer:15/35=43%,authored:8/35=23%)
> Janne Huttunen <janne.huttunen@nokia.com>
> (commit_signer:7/35=20%,authored:7/35=20%)
> Thomas Gleixner <tglx@linutronix.de> (commit_signer:4/35=11%,authored:2/35=6%)
> linux-mips@linux-mips.org (open list:MIPS)
> linux-kernel@vger.kernel.org (open list)

So I'll try to fix for MIPS OCTEON, but I'm going to keep the current
behavior when crash_kexec_post_notifiers is not specified because
I'm not sure what will happen on other architectures.

Best regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-07-19 11:25 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-05 11:33 [V3 PATCH 0/2] kexec: crash_kexec_post_notifiers boot option related fixes Hidehiro Kawai
2016-07-05 11:33 ` [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version Hidehiro Kawai
2016-07-11  8:34   ` Dave Young
2016-07-12  2:49     ` 河合英宏 / KAWAI,HIDEHIRO
2016-07-13  2:03       ` 'Dave Young'
2016-07-15 11:50         ` 河合英宏 / KAWAI,HIDEHIRO
2016-07-18  9:02           ` 'Dave Young'
2016-07-19  5:51             ` 河合英宏 / KAWAI,HIDEHIRO
2016-07-19  6:52               ` 'Dave Young'
2016-07-19 11:23                 ` 河合英宏 / KAWAI,HIDEHIRO
2016-07-12  3:11   ` Xunlei Pang
2016-07-12  3:56     ` 河合英宏 / KAWAI,HIDEHIRO
2016-07-12  6:57       ` Xunlei Pang
2016-07-12  7:12         ` 河合英宏 / KAWAI,HIDEHIRO
2016-07-12  7:26           ` Xunlei Pang
2016-07-05 11:33 ` [V3 PATCH 2/2] kexec: Use core_param for crash_kexec_post_notifiers boot option Hidehiro Kawai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).