linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug
@ 2022-05-11 23:43 Sean Christopherson
  2022-05-11 23:43 ` [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add Sean Christopherson
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Sean Christopherson @ 2022-05-11 23:43 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

Fix a double list_add() bug found and debugged by Guilherme, who did all
the hard work.  nmi_shootdown_cpus() doesn't play nice with being called
more than once.  With the "right" kexec/kdump configuration,
emergency_vmx_disable_all() can be reached after kdump_nmi_shootdown_cpus()
(the two users of nmi_shootdown_cpus()).
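
For reference, the colliding call paths look roughly like this (a condensed
sketch with real function names; the intermediate panic() plumbing is
omitted):

  panic()
    crash_smp_send_stop()                       /* crash_kexec_post_notifiers */
      kdump_nmi_shootdown_cpus()
        nmi_shootdown_cpus(kdump_nmi_callback)  /* first shootdown */
    ...
    emergency_restart()
      native_machine_emergency_restart()
        emergency_vmx_disable_all()
          nmi_shootdown_cpus(vmxoff_nmi)        /* second shootdown: double list_add() */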

My solution is to turn the emergency_vmx_disable_all() shootdown into a
nop of sorts, and move the disabling of virtualization into the core
crash_nmi_callback() handler.  The only thing emergency_vmx_disable_all()
cares about is disabling VMX/SVM (obviously), and since I can't envision a
use case for an NMI shootdown that doesn't want to disable virtualization,
doing that in the core handler means emergency_vmx_disable_all() only
needs to ensure _a_ shootdown occurs; it doesn't care when that shootdown
happened or what callback was run.

This obviously punts on making nmi_shootdown_cpus() truly multi-caller
friendly, but notifier chains tend to be messy, and it's not obvious to
me what would be the desired/correct behavior for a true multi-shootdown
use case.

Patch 2 is a related bug fix found while exploring ideas for patch 1.

Sean Christopherson (2):
  x86/crash: Disable virt in core NMI crash handler to avoid double
    list_add
  x86/reboot: Disable virtualization in an emergency if SVM is supported

 arch/x86/include/asm/reboot.h |  1 +
 arch/x86/kernel/crash.c       | 16 +--------
 arch/x86/kernel/reboot.c      | 64 +++++++++++++++++++++++++++--------
 3 files changed, 51 insertions(+), 30 deletions(-)


base-commit: feb9c5e19e913b53cb536a7aa7c9f20107bb51ec
-- 
2.36.0.512.ge40c2bad7a-goog



* [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add
  2022-05-11 23:43 [PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug Sean Christopherson
@ 2022-05-11 23:43 ` Sean Christopherson
  2022-05-12  9:14   ` Vitaly Kuznetsov
  2022-05-12 10:51   ` Thomas Gleixner
  2022-05-11 23:43 ` [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported Sean Christopherson
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Sean Christopherson @ 2022-05-11 23:43 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

Disable virtualization in crash_nmi_callback() and skip the requested NMI
shootdown if a shootdown has already occurred, i.e. a callback has been
registered.  The NMI crash shootdown path doesn't play nice with multiple
invocations, e.g. attempting to register the NMI handler multiple times
will trigger a double list_add() and hang the system (in addition to
multiple other issues).  If "crash_kexec_post_notifiers" is specified on
the kernel command line, panic() will invoke crash_smp_send_stop() and
result in a second call to nmi_shootdown_cpus() during
native_machine_emergency_restart().

Invoke the callback _before_ disabling virtualization, as the current
VMCS needs to be cleared before doing VMXOFF.  Note, this results in a
subtle change in ordering between disabling virtualization and stopping
Intel PT on the responding CPUs.  While VMX and Intel PT do interact,
VMXOFF and writes to MSR_IA32_RTIT_CTL do not induce faults between one
another, which is all that matters when panicking.
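
Condensed, the resulting order on each responding CPU is:

	/* crash_nmi_callback(), condensed: */
	shootdown_callback(cpu, regs);		/* kdump: VMCLEAR VMCSs, stop Intel PT, ... */
	cpu_crash_disable_virtualization();	/* then VMXOFF / disable SVM */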

WARN if nmi_shootdown_cpus() is called a second time with anything other
than the reboot path's "nop" handler, as bailing means the requested
callback isn't being invoked.  Punt true handling of multiple shootdown callbacks
until there's an actual use case for doing so (beyond disabling
virtualization).

Extract the disabling logic to a common helper to deduplicate code, and
to prepare for doing the shootdown in the emergency reboot path if SVM
is supported.

Note, prior to commit ed72736183c4 ("x86/reboot: Force all cpus to exit
VMX root if VMX is supported"), nmi_shootdown_cpus() was subtly protected
against a second invocation by a cpu_vmx_enabled() check as the kdump
handler would disable VMX if it ran first.
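
For reference, a rough reconstruction of the pre-ed72736183c4 logic (from
git history, abbreviated; treat as a sketch):

	static void emergency_vmx_disable_all(void)
	{
		local_irq_disable();

		/*
		 * If the kdump shootdown ran first, VMX is already off on
		 * this CPU, so cpu_vmx_enabled() fails and the second
		 * shootdown is skipped.
		 */
		if (cpu_has_vmx() && cpu_vmx_enabled()) {
			cpu_vmxoff();
			nmi_shootdown_cpus(vmxoff_nmi);
		}
	}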

Fixes: ed72736183c4 ("x86/reboot: Force all cpus to exit VMX root if VMX is supported")
Cc: stable@vger.kernel.org
Reported-and-tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/all/20220427224924.592546-2-gpiccoli@igalia.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/reboot.h |  1 +
 arch/x86/kernel/crash.c       | 16 +--------------
 arch/x86/kernel/reboot.c      | 38 ++++++++++++++++++++++++++++++++---
 3 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 04c17be9b5fd..8f2da36435a6 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -25,6 +25,7 @@ void __noreturn machine_real_restart(unsigned int type);
 #define MRR_BIOS	0
 #define MRR_APM		1
 
+void cpu_crash_disable_virtualization(void);
 typedef void (*nmi_shootdown_cb)(int, struct pt_regs*);
 void nmi_panic_self_stop(struct pt_regs *regs);
 void nmi_shootdown_cpus(nmi_shootdown_cb callback);
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index e8326a8d1c5d..fe0cf83843ba 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -81,15 +81,6 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
 	 */
 	cpu_crash_vmclear_loaded_vmcss();
 
-	/* Disable VMX or SVM if needed.
-	 *
-	 * We need to disable virtualization on all CPUs.
-	 * Having VMX or SVM enabled on any CPU may break rebooting
-	 * after the kdump kernel has finished its task.
-	 */
-	cpu_emergency_vmxoff();
-	cpu_emergency_svm_disable();
-
 	/*
 	 * Disable Intel PT to stop its logging
 	 */
@@ -148,12 +139,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 	 */
 	cpu_crash_vmclear_loaded_vmcss();
 
-	/* Booting kdump kernel with VMX or SVM enabled won't work,
-	 * because (among other limitations) we can't disable paging
-	 * with the virt flags.
-	 */
-	cpu_emergency_vmxoff();
-	cpu_emergency_svm_disable();
+	cpu_crash_disable_virtualization();
 
 	/*
 	 * Disable Intel PT to stop its logging
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index fa700b46588e..f9543a4e9b09 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -528,9 +528,9 @@ static inline void kb_wait(void)
 	}
 }
 
-static void vmxoff_nmi(int cpu, struct pt_regs *regs)
+static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
 {
-	cpu_emergency_vmxoff();
+	/* Nothing to do, the NMI shootdown handler disables virtualization. */
 }
 
 /* Use NMIs as IPIs to tell all CPUs to disable virtualization */
@@ -554,7 +554,7 @@ static void emergency_vmx_disable_all(void)
 		__cpu_emergency_vmxoff();
 
 		/* Halt and exit VMX root operation on the other CPUs. */
-		nmi_shootdown_cpus(vmxoff_nmi);
+		nmi_shootdown_cpus(nmi_shootdown_nop);
 	}
 }
 
@@ -802,6 +802,18 @@ static nmi_shootdown_cb shootdown_callback;
 static atomic_t waiting_for_crash_ipi;
 static int crash_ipi_issued;
 
+void cpu_crash_disable_virtualization(void)
+{
+	/*
+	 * Disable virtualization, i.e. VMX or SVM, so that INIT is recognized
+	 * during reboot.  VMX blocks INIT if the CPU is post-VMXON, and SVM
+	 * blocks INIT if GIF=0.  Note, CLGI #UDs if SVM isn't enabled, so it's
+	 * easier to just disable SVM unconditionally.
+	 */
+	cpu_emergency_vmxoff();
+	cpu_emergency_svm_disable();
+}
+
 static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
 {
 	int cpu;
@@ -819,6 +831,12 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
 
 	shootdown_callback(cpu, regs);
 
+	/*
+	 * Prepare the CPU for reboot _after_ invoking the callback so that the
+	 * callback can safely use virtualization instructions, e.g. VMCLEAR.
+	 */
+	cpu_crash_disable_virtualization();
+
 	atomic_dec(&waiting_for_crash_ipi);
 	/* Assume hlt works */
 	halt();
@@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
 	unsigned long msecs;
 	local_irq_disable();
 
+	/*
+	 * Invoking multiple callbacks is not currently supported; registering
+	 * the NMI handler twice will cause a list_add() double add BUG().
+	 * The exception is the "nop" handler in the emergency reboot path,
+	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
+	 * handler has already run, i.e. has already prepared other CPUs; the
+	 * reboot path doesn't have any work of its own to do, it just needs to
+	 * ensure all CPUs have prepared for reboot.
+	 */
+	if (shootdown_callback) {
+		WARN_ON_ONCE(callback != nmi_shootdown_nop);
+		return;
+	}
+
 	/* Make a note of crashing cpu. Will be used in NMI callback. */
 	crashing_cpu = safe_smp_processor_id();
 
-- 
2.36.0.512.ge40c2bad7a-goog



* [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported
  2022-05-11 23:43 [PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug Sean Christopherson
  2022-05-11 23:43 ` [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add Sean Christopherson
@ 2022-05-11 23:43 ` Sean Christopherson
  2022-05-12  8:37   ` Vitaly Kuznetsov
  2022-05-12 10:57   ` Thomas Gleixner
  2022-05-13 11:10 ` [PATCH] x86/nmi: Make register_nmi_handler() more robust Thomas Gleixner
  2022-05-17  7:34 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  3 siblings, 2 replies; 16+ messages in thread
From: Sean Christopherson @ 2022-05-11 23:43 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

Disable SVM on all CPUs via NMI shootdown during an emergency reboot.
Like VMX, SVM can block INIT and thus prevent bringing up other CPUs via
INIT-SIPI-SIPI.
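
For reference, cpu_emergency_svm_disable() boils down to clearing
EFER.SVME; roughly along these lines (a sketch of the asm/virtext.h
helpers, details elided):

	static inline void cpu_svm_disable(void)
	{
		uint64_t efer;

		wrmsrl(MSR_VM_HSAVE_PA, 0);
		rdmsrl(MSR_EFER, efer);
		wrmsrl(MSR_EFER, efer & ~EFER_SVME);
	}

	static inline void cpu_emergency_svm_disable(void)
	{
		if (cpu_has_svm(NULL))
			cpu_svm_disable();
	}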

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/reboot.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index f9543a4e9b09..33c1f4883b27 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -533,27 +533,29 @@ static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
 	/* Nothing to do, the NMI shootdown handler disables virtualization. */
 }
 
-/* Use NMIs as IPIs to tell all CPUs to disable virtualization */
-static void emergency_vmx_disable_all(void)
+static void emergency_reboot_disable_virtualization(void)
 {
 	/* Just make sure we won't change CPUs while doing this */
 	local_irq_disable();
 
 	/*
-	 * Disable VMX on all CPUs before rebooting, otherwise we risk hanging
-	 * the machine, because the CPU blocks INIT when it's in VMX root.
+	 * Disable virtualization on all CPUs before rebooting to avoid hanging
+	 * the system, as VMX and SVM block INIT when running in the host.
 	 *
 	 * We can't take any locks and we may be on an inconsistent state, so
-	 * use NMIs as IPIs to tell the other CPUs to exit VMX root and halt.
+	 * use NMIs as IPIs to tell the other CPUs to disable VMX/SVM and halt.
 	 *
-	 * Do the NMI shootdown even if VMX if off on _this_ CPU, as that
-	 * doesn't prevent a different CPU from being in VMX root operation.
+	 * Do the NMI shootdown even if virtualization is off on _this_ CPU, as
+	 * other CPUs may have virtualization enabled.
 	 */
-	if (cpu_has_vmx()) {
-		/* Safely force _this_ CPU out of VMX root operation. */
-		__cpu_emergency_vmxoff();
+	if (cpu_has_vmx() || cpu_has_svm(NULL)) {
+		/* Safely force _this_ CPU out of VMX/SVM operation. */
+		if (cpu_has_vmx())
+			__cpu_emergency_vmxoff();
+		else
+			cpu_emergency_svm_disable();
 
-		/* Halt and exit VMX root operation on the other CPUs. */
+		/* Disable VMX/SVM and halt on other CPUs. */
 		nmi_shootdown_cpus(nmi_shootdown_nop);
 	}
 }
@@ -590,7 +592,7 @@ static void native_machine_emergency_restart(void)
 	unsigned short mode;
 
 	if (reboot_emergency)
-		emergency_vmx_disable_all();
+		emergency_reboot_disable_virtualization();
 
 	tboot_shutdown(TB_SHUTDOWN_REBOOT);
 
-- 
2.36.0.512.ge40c2bad7a-goog



* Re: [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported
  2022-05-11 23:43 ` [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported Sean Christopherson
@ 2022-05-12  8:37   ` Vitaly Kuznetsov
  2022-05-12 10:57   ` Thomas Gleixner
  1 sibling, 0 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2022-05-12  8:37 UTC
  To: Sean Christopherson
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86

Sean Christopherson <seanjc@google.com> writes:

> Disable SVM on all CPUs via NMI shootdown during an emergency reboot.
> Like VMX, SVM can block INIT and thus prevent bringing up other CPUs via
> INIT-SIPI-SIPI.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kernel/reboot.c | 26 ++++++++++++++------------
>  1 file changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index f9543a4e9b09..33c1f4883b27 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -533,27 +533,29 @@ static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
>  	/* Nothing to do, the NMI shootdown handler disables virtualization. */
>  }
>  
> -/* Use NMIs as IPIs to tell all CPUs to disable virtualization */
> -static void emergency_vmx_disable_all(void)
> +static void emergency_reboot_disable_virtualization(void)
>  {
>  	/* Just make sure we won't change CPUs while doing this */
>  	local_irq_disable();
>  
>  	/*
> -	 * Disable VMX on all CPUs before rebooting, otherwise we risk hanging
> -	 * the machine, because the CPU blocks INIT when it's in VMX root.
> +	 * Disable virtualization on all CPUs before rebooting to avoid hanging
> +	 * the system, as VMX and SVM block INIT when running in the host.
>  	 *
>  	 * We can't take any locks and we may be on an inconsistent state, so
> -	 * use NMIs as IPIs to tell the other CPUs to exit VMX root and halt.
> +	 * use NMIs as IPIs to tell the other CPUs to disable VMX/SVM and halt.
>  	 *
> -	 * Do the NMI shootdown even if VMX if off on _this_ CPU, as that
> -	 * doesn't prevent a different CPU from being in VMX root operation.
> +	 * Do the NMI shootdown even if virtualization is off on _this_ CPU, as
> +	 * other CPUs may have virtualization enabled.
>  	 */
> -	if (cpu_has_vmx()) {
> -		/* Safely force _this_ CPU out of VMX root operation. */
> -		__cpu_emergency_vmxoff();
> +	if (cpu_has_vmx() || cpu_has_svm(NULL)) {
> +		/* Safely force _this_ CPU out of VMX/SVM operation. */
> +		if (cpu_has_vmx())
> +			__cpu_emergency_vmxoff();
> +		else
> +			cpu_emergency_svm_disable();
>  
> -		/* Halt and exit VMX root operation on the other CPUs. */
> +		/* Disable VMX/SVM and halt on other CPUs. */
>  		nmi_shootdown_cpus(nmi_shootdown_nop);
>  	}

Nitpicking: we can get rid of the second cpu_has_vmx() if we write this
as:

	if (cpu_has_vmx())
		__cpu_emergency_vmxoff();
	else if (cpu_has_svm(NULL))
		cpu_emergency_svm_disable();
	else 
		return;

	nmi_shootdown_cpus(nmi_shootdown_nop);

>  }
> @@ -590,7 +592,7 @@ static void native_machine_emergency_restart(void)
>  	unsigned short mode;
>  
>  	if (reboot_emergency)
> -		emergency_vmx_disable_all();
> +		emergency_reboot_disable_virtualization();
>  
>  	tboot_shutdown(TB_SHUTDOWN_REBOOT);

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly



* Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add
  2022-05-11 23:43 ` [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add Sean Christopherson
@ 2022-05-12  9:14   ` Vitaly Kuznetsov
  2022-05-12 10:51   ` Thomas Gleixner
  1 sibling, 0 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2022-05-12  9:14 UTC
  To: Sean Christopherson
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Michael Kelley, Dexuan Cui

Sean Christopherson <seanjc@google.com> writes:

> Disable virtualization in crash_nmi_callback() and skip the requested NMI
> shootdown if a shootdown has already occurred, i.e. a callback has been
> registered.  The NMI crash shootdown path doesn't play nice with multiple
> invocations, e.g. attempting to register the NMI handler multiple times
> will trigger a double list_add() and hang the system (in addition to
> multiple other issues).  If "crash_kexec_post_notifiers" is specified on
> the kernel command line, panic() will invoke crash_smp_send_stop() and
> result in a second call to nmi_shootdown_cpus() during
> native_machine_emergency_restart().

crash_kexec_post_notifiers is also automatically enabled on Hyper-V (see
hv_common_init()), which means the hang should also be observed there.

Cc: Michael and Dexuan
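
For reference, the relevant hunk in hv_common_init() is roughly this
(quoted from memory, so treat it as a sketch):

	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
		crash_kexec_post_notifiers = true;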

>
> Invoke the callback _before_ disabling virtualization, as the current
> VMCS needs to be cleared before doing VMXOFF.  Note, this results in a
> subtle change in ordering between disabling virtualization and stopping
> Intel PT on the responding CPUs.  While VMX and Intel PT do interact,
> VMXOFF and writes to MSR_IA32_RTIT_CTL do not induce faults between one
> another, which is all that matters when panicking.
>
> WARN if nmi_shootdown_cpus() is called a second time with anything other
> than the reboot path's "nop" handler, as bailing means the requested
> callback isn't being invoked.  Punt true handling of multiple shootdown callbacks
> until there's an actual use case for doing so (beyond disabling
> virtualization).
>
> Extract the disabling logic to a common helper to deduplicate code, and
> to prepare for doing the shootdown in the emergency reboot path if SVM
> is supported.
>
> Note, prior to commit ed72736183c4 ("x86/reboot: Force all cpus to exit
> VMX root if VMX is supported"), nmi_shootdown_cpus() was subtly protected
> against a second invocation by a cpu_vmx_enabled() check as the kdump
> handler would disable VMX if it ran first.
>
> Fixes: ed72736183c4 ("x86/reboot: Force all cpus to exit VMX root if VMX is supported")
> Cc: stable@vger.kernel.org
> Reported-and-tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Link: https://lore.kernel.org/all/20220427224924.592546-2-gpiccoli@igalia.com
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/reboot.h |  1 +
>  arch/x86/kernel/crash.c       | 16 +--------------
>  arch/x86/kernel/reboot.c      | 38 ++++++++++++++++++++++++++++++++---
>  3 files changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
> index 04c17be9b5fd..8f2da36435a6 100644
> --- a/arch/x86/include/asm/reboot.h
> +++ b/arch/x86/include/asm/reboot.h
> @@ -25,6 +25,7 @@ void __noreturn machine_real_restart(unsigned int type);
>  #define MRR_BIOS	0
>  #define MRR_APM		1
>  
> +void cpu_crash_disable_virtualization(void);
>  typedef void (*nmi_shootdown_cb)(int, struct pt_regs*);
>  void nmi_panic_self_stop(struct pt_regs *regs);
>  void nmi_shootdown_cpus(nmi_shootdown_cb callback);
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index e8326a8d1c5d..fe0cf83843ba 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -81,15 +81,6 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>  	 */
>  	cpu_crash_vmclear_loaded_vmcss();
>  
> -	/* Disable VMX or SVM if needed.
> -	 *
> -	 * We need to disable virtualization on all CPUs.
> -	 * Having VMX or SVM enabled on any CPU may break rebooting
> -	 * after the kdump kernel has finished its task.
> -	 */
> -	cpu_emergency_vmxoff();
> -	cpu_emergency_svm_disable();
> -
>  	/*
>  	 * Disable Intel PT to stop its logging
>  	 */
> @@ -148,12 +139,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>  	 */
>  	cpu_crash_vmclear_loaded_vmcss();
>  
> -	/* Booting kdump kernel with VMX or SVM enabled won't work,
> -	 * because (among other limitations) we can't disable paging
> -	 * with the virt flags.
> -	 */
> -	cpu_emergency_vmxoff();
> -	cpu_emergency_svm_disable();
> +	cpu_crash_disable_virtualization();
>  
>  	/*
>  	 * Disable Intel PT to stop its logging
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index fa700b46588e..f9543a4e9b09 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -528,9 +528,9 @@ static inline void kb_wait(void)
>  	}
>  }
>  
> -static void vmxoff_nmi(int cpu, struct pt_regs *regs)
> +static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
>  {
> -	cpu_emergency_vmxoff();
> +	/* Nothing to do, the NMI shootdown handler disables virtualization. */
>  }
>  
>  /* Use NMIs as IPIs to tell all CPUs to disable virtualization */
> @@ -554,7 +554,7 @@ static void emergency_vmx_disable_all(void)
>  		__cpu_emergency_vmxoff();
>  
>  		/* Halt and exit VMX root operation on the other CPUs. */
> -		nmi_shootdown_cpus(vmxoff_nmi);
> +		nmi_shootdown_cpus(nmi_shootdown_nop);
>  	}
>  }
>  
> @@ -802,6 +802,18 @@ static nmi_shootdown_cb shootdown_callback;
>  static atomic_t waiting_for_crash_ipi;
>  static int crash_ipi_issued;
>  
> +void cpu_crash_disable_virtualization(void)
> +{
> +	/*
> +	 * Disable virtualization, i.e. VMX or SVM, so that INIT is recognized
> +	 * during reboot.  VMX blocks INIT if the CPU is post-VMXON, and SVM
> +	 * blocks INIT if GIF=0.  Note, CLGI #UDs if SVM isn't enabled, so it's
> +	 * easier to just disable SVM unconditionally.
> +	 */
> +	cpu_emergency_vmxoff();
> +	cpu_emergency_svm_disable();
> +}
> +
>  static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
>  {
>  	int cpu;
> @@ -819,6 +831,12 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
>  
>  	shootdown_callback(cpu, regs);
>  
> +	/*
> +	 * Prepare the CPU for reboot _after_ invoking the callback so that the
> +	 * callback can safely use virtualization instructions, e.g. VMCLEAR.
> +	 */
> +	cpu_crash_disable_virtualization();
> +
>  	atomic_dec(&waiting_for_crash_ipi);
>  	/* Assume hlt works */
>  	halt();
> @@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
>  	unsigned long msecs;
>  	local_irq_disable();
>  
> +	/*
> +	 * Invoking multiple callbacks is not currently supported; registering
> +	 * the NMI handler twice will cause a list_add() double add BUG().
> +	 * The exception is the "nop" handler in the emergency reboot path,
> +	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
> +	 * handler has already run, i.e. has already prepared other CPUs; the
> +	 * reboot path doesn't have any work of its own to do, it just needs to
> +	 * ensure all CPUs have prepared for reboot.
> +	 */
> +	if (shootdown_callback) {
> +		WARN_ON_ONCE(callback != nmi_shootdown_nop);
> +		return;
> +	}
> +
>  	/* Make a note of crashing cpu. Will be used in NMI callback. */
>  	crashing_cpu = safe_smp_processor_id();

Assuming I didn't get lost in the (infinite) multiverse of possible
shutdown/reboot/crash/kexec-not-kexec paths,

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly



* Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add
  2022-05-11 23:43 ` [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add Sean Christopherson
  2022-05-12  9:14   ` Vitaly Kuznetsov
@ 2022-05-12 10:51   ` Thomas Gleixner
  2022-05-12 14:14     ` Sean Christopherson
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-12 10:51 UTC
  To: Sean Christopherson, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

Sean,

On Wed, May 11 2022 at 23:43, Sean Christopherson wrote:
> @@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
>  	unsigned long msecs;
>  	local_irq_disable();
>  
> +	/*
> +	 * Invoking multiple callbacks is not currently supported, registering
> +	 * the NMI handler twice will cause a list_add() double add BUG().
> +	 * The exception is the "nop" handler in the emergency reboot path,
> +	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
> +	 * handler has already run, i.e. has already prepared other CPUs, the
> +	 * reboot path doesn't have any work of its to do, it just needs to
> +	 * ensure all CPUs have prepared for reboot.

This is confusing at best. The double list add is just one part of the
problem, which would be trivial enough to fix.

The real point is that after the first shoot down all other CPUs are
stuck in crash_nmi_callback() and won't respond to the second NMI.

So trying to run this twice is completely pointless and guaranteed to
run into the timeout.
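
For reference, the tail of crash_nmi_callback() parks the responding CPU
for good (condensed from the current code):

	shootdown_callback(cpu, regs);
	...
	atomic_dec(&waiting_for_crash_ipi);
	/* Assume hlt works */
	halt();
	for (;;)
		cpu_relax();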

> +	 */
> +	if (shootdown_callback) {
> +		WARN_ON_ONCE(callback != nmi_shootdown_nop);
> +		return;

Instead of playing games with the callback pointer, I prefer to make
this all explicit. Delta patch below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -528,10 +528,7 @@ static inline void kb_wait(void)
 	}
 }
 
-static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
-{
-	/* Nothing to do, the NMI shootdown handler disables virtualization. */
-}
+static inline void nmi_shootdown_cpus_on_restart(void);
 
 /* Use NMIs as IPIs to tell all CPUs to disable virtualization */
 static void emergency_vmx_disable_all(void)
@@ -554,7 +551,7 @@ static void emergency_vmx_disable_all(vo
 		__cpu_emergency_vmxoff();
 
 		/* Halt and exit VMX root operation on the other CPUs. */
-		nmi_shootdown_cpus(nmi_shootdown_nop);
+		nmi_shootdown_cpus_on_restart();
 	}
 }
 
@@ -829,7 +826,8 @@ static int crash_nmi_callback(unsigned i
 		return NMI_HANDLED;
 	local_irq_disable();
 
-	shootdown_callback(cpu, regs);
+	if (shootdown_callback)
+		shootdown_callback(cpu, regs);
 
 	/*
 	 * Prepare the CPU for reboot _after_ invoking the callback so that the
@@ -846,31 +844,30 @@ static int crash_nmi_callback(unsigned i
 	return NMI_HANDLED;
 }
 
-/*
- * Halt all other CPUs, calling the specified function on each of them
+/**
+ * nmi_shootdown_cpus - Stop other CPUs via NMI
+ * @callback:	Optional callback to be invoked from the NMI handler
  *
- * This function can be used to halt all other CPUs on crash
- * or emergency reboot time. The function passed as parameter
- * will be called inside a NMI handler on all CPUs.
+ * The NMI handler on the remote CPUs invokes @callback, if not
+ * NULL, first and then disables virtualization to ensure that
+ * INIT is recognized during reboot.
+ *
+ * nmi_shootdown_cpus() can only be invoked once. After the first
+ * invocation all other CPUs are stuck in crash_nmi_callback() and
+ * cannot respond to a second NMI.
  */
 void nmi_shootdown_cpus(nmi_shootdown_cb callback)
 {
 	unsigned long msecs;
+
 	local_irq_disable();
 
 	/*
-	 * Invoking multiple callbacks is not currently supported; registering
-	 * the NMI handler twice will cause a list_add() double add BUG().
-	 * The exception is the "nop" handler in the emergency reboot path,
-	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
-	 * handler has already run, i.e. has already prepared other CPUs; the
-	 * reboot path doesn't have any work of its own to do, it just needs to
-	 * ensure all CPUs have prepared for reboot.
+	 * Aside from being pointless, this would register the NMI handler
+	 * twice causing list corruption.
 	 */
-	if (shootdown_callback) {
-		WARN_ON_ONCE(callback != nmi_shootdown_nop);
+	if (WARN_ON_ONCE(crash_ipi_issued))
 		return;
-	}
 
 	/* Make a note of crashing cpu. Will be used in NMI callback. */
 	crashing_cpu = safe_smp_processor_id();
@@ -902,6 +899,12 @@ void nmi_shootdown_cpus(nmi_shootdown_cb
 	/* Leave the nmi callback set */
 }
 
+static inline void nmi_shootdown_cpus_on_restart(void)
+{
+	if (!crash_ipi_issued)
+		nmi_shootdown_cpus(NULL);
+}
+
 /*
  * Check if the crash dumping IPI got issued and if so, call its callback
  * directly. This function is used when we have already been in NMI handler.
@@ -929,6 +932,8 @@ void nmi_shootdown_cpus(nmi_shootdown_cb
 	/* No other CPUs to shoot down */
 }
 
+static inline void nmi_shootdown_cpus_on_restart(void) { }
+
 void run_crash_ipi_callback(struct pt_regs *regs)
 {
 }


* Re: [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported
  2022-05-11 23:43 ` [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported Sean Christopherson
  2022-05-12  8:37   ` Vitaly Kuznetsov
@ 2022-05-12 10:57   ` Thomas Gleixner
  2022-05-12 14:39     ` Sean Christopherson
  1 sibling, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-12 10:57 UTC
  To: Sean Christopherson, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

On Wed, May 11 2022 at 23:43, Sean Christopherson wrote:
> Disable SVM on all CPUs via NMI shootdown during an emergency reboot.
> Like VMX, SVM can block INIT and thus prevent bringing up other CPUs via
> INIT-SIPI-SIPI.

With the delta patch applied, I'd make that:

--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -530,29 +530,25 @@ static inline void kb_wait(void)
 
 static inline void nmi_shootdown_cpus_on_restart(void);
 
-/* Use NMIs as IPIs to tell all CPUs to disable virtualization */
-static void emergency_vmx_disable_all(void)
+static void emergency_reboot_disable_virtualization(void)
 {
 	/* Just make sure we won't change CPUs while doing this */
 	local_irq_disable();
 
 	/*
-	 * Disable VMX on all CPUs before rebooting, otherwise we risk hanging
-	 * the machine, because the CPU blocks INIT when it's in VMX root.
+	 * Disable virtualization on all CPUs before rebooting to avoid hanging
+	 * the system, as VMX and SVM block INIT when running in the host.
 	 *
 	 * We can't take any locks and we may be on an inconsistent state, so
-	 * use NMIs as IPIs to tell the other CPUs to exit VMX root and halt.
+	 * use NMIs as IPIs to tell the other CPUs to disable VMX/SVM and halt.
 	 *
-	 * Do the NMI shootdown even if VMX if off on _this_ CPU, as that
-	 * doesn't prevent a different CPU from being in VMX root operation.
+	 * Do the NMI shootdown even if virtualization is off on _this_ CPU, as
+	 * other CPUs may have virtualization enabled.
 	 */
-	if (cpu_has_vmx()) {
-		/* Safely force _this_ CPU out of VMX root operation. */
-		__cpu_emergency_vmxoff();
+	cpu_crash_disable_virtualization();
 
-		/* Halt and exit VMX root operation on the other CPUs. */
+	if (cpu_has_vmx() || cpu_has_svm(NULL))
 		nmi_shootdown_cpus_on_restart();
-	}
 }
 
 
@@ -587,7 +583,7 @@ static void native_machine_emergency_res
 	unsigned short mode;
 
 	if (reboot_emergency)
-		emergency_vmx_disable_all();
+		emergency_reboot_disable_virtualization();
 
 	tboot_shutdown(TB_SHUTDOWN_REBOOT);
 


* Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add
  2022-05-12 10:51   ` Thomas Gleixner
@ 2022-05-12 14:14     ` Sean Christopherson
  2022-05-12 14:35       ` Sean Christopherson
  2022-05-12 15:48       ` Thomas Gleixner
  0 siblings, 2 replies; 16+ messages in thread
From: Sean Christopherson @ 2022-05-12 14:14 UTC
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, Guilherme G . Piccoli, Vitaly Kuznetsov,
	Paolo Bonzini

On Thu, May 12, 2022, Thomas Gleixner wrote:
> Sean,
> 
> On Wed, May 11 2022 at 23:43, Sean Christopherson wrote:
> > @@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
> >  	unsigned long msecs;
> >  	local_irq_disable();
> >  
> > +	/*
> > +	 * Invoking multiple callbacks is not currently supported; registering
> > +	 * the NMI handler twice will cause a list_add() double add BUG().
> > +	 * The exception is the "nop" handler in the emergency reboot path,
> > +	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
> > +	 * handler has already run, i.e. has already prepared other CPUs; the
> > +	 * reboot path doesn't have any work of its own to do, it just needs to
> > +	 * ensure all CPUs have prepared for reboot.
> 
> This is confusing at best. The double list add is just one part of the
> problem, which would be trivial enough to fix.
> 
> The real point is that after the first shoot down all other CPUs are
> stuck in crash_nmi_callback() and won't respond to the second NMI.

Well that's embarrassingly obvious in hindsight.

> 
> So trying to run this twice is completely pointless and guaranteed to
> run into the timeout.
> 
> > +	 */
> > +	if (shootdown_callback) {
> > +		WARN_ON_ONCE(callback != nmi_shootdown_nop);
> > +		return;
> 
> Instead of playing games with the callback pointer, I prefer to make
> this all explicit. Delta patch below.

Much better.  If you're planning on doing fixup, can you also include a comment
tweak about why the callback is left set?  If not, I'll do it in v2.  A bit
overkill, but it's another thing that for me was obvious only once I realized "why".

Thanks!

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 4e3a839ae146..808d3e75fb2d 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -896,7 +896,11 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
                msecs--;
        }
 
-       /* Leave the nmi callback set */
+       /*
+        * Leave the nmi callback set, shootdown is a one-time thing.  Clearing
+        * the callback could result in a NULL pointer dereference if a CPU
+        * (finally) responds after the timeout expires.
+        */
 }
 
 static inline void nmi_shootdown_cpus_on_restart(void)


* Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add
  2022-05-12 14:14     ` Sean Christopherson
@ 2022-05-12 14:35       ` Sean Christopherson
  2022-05-12 15:48       ` Thomas Gleixner
  1 sibling, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2022-05-12 14:35 UTC
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, Guilherme G . Piccoli, Vitaly Kuznetsov,
	Paolo Bonzini

On Thu, May 12, 2022, Sean Christopherson wrote:
> On Thu, May 12, 2022, Thomas Gleixner wrote:
> > Sean,
> > 
> > On Wed, May 11 2022 at 23:43, Sean Christopherson wrote:
> > > @@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
> > >  	unsigned long msecs;
> > >  	local_irq_disable();
> > >  
> > > +	/*
> > > +	 * Invoking multiple callbacks is not currently supported; registering
> > > +	 * the NMI handler twice will cause a list_add() double add BUG().
> > > +	 * The exception is the "nop" handler in the emergency reboot path,
> > > +	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
> > > +	 * handler has already run, i.e. has already prepared other CPUs; the
> > > +	 * reboot path doesn't have any work of its own to do, it just needs to
> > > +	 * ensure all CPUs have prepared for reboot.
> > 
> > This is confusing at best. The double list add is just one part of the
> > problem, which would be trivial enough to fix.
> > 
> > The real point is that after the first shoot down all other CPUs are
> > stuck in crash_nmi_callback() and won't respond to the second NMI.
> 
> Well that's embarrassingly obvious in hindsight.
> 
> > 
> > So trying to run this twice is completely pointless and guaranteed to
> > run into the timeout.
> > 
> > > +	 */
> > > +	if (shootdown_callback) {
> > > +		WARN_ON_ONCE(callback != nmi_shootdown_nop);
> > > +		return;
> > 
> > Instead of playing games with the callback pointer, I prefer to make
> > this all explicit. Delta patch below.
> 
> Much better.  If you're planning on doing fixup, can you also include a comment
> tweak about why the callback is left set?  If not, I'll do it in v2.  A bit
> overkill, but it's another thing that for me was obvious only once I realized "why".

Actually, I'll send a v2, there's some more cleanup that can be done if patch 2
uses cpu_crash_disable_virtualization().


* Re: [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported
  2022-05-12 10:57   ` Thomas Gleixner
@ 2022-05-12 14:39     ` Sean Christopherson
  2022-05-12 15:47       ` Thomas Gleixner
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-05-12 14:39 UTC
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, Guilherme G . Piccoli, Vitaly Kuznetsov,
	Paolo Bonzini

On Thu, May 12, 2022, Thomas Gleixner wrote:
> On Wed, May 11 2022 at 23:43, Sean Christopherson wrote:
> > Disable SVM on all CPUs via NMI shootdown during an emergency reboot.
> > Like VMX, SVM can block INIT and thus prevent bringing up other CPUs via
> > INIT-SIPI-SIPI.
> 
> With the delta patch applied, I'd make that:
> 
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -530,29 +530,25 @@ static inline void kb_wait(void)
>  
>  static inline void nmi_shootdown_cpus_on_restart(void);
>  
> -/* Use NMIs as IPIs to tell all CPUs to disable virtualization */
> -static void emergency_vmx_disable_all(void)
> +static void emergency_reboot_disable_virtualization(void)
>  {
>  	/* Just make sure we won't change CPUs while doing this */
>  	local_irq_disable();
>  
>  	/*
> -	 * Disable VMX on all CPUs before rebooting, otherwise we risk hanging
> -	 * the machine, because the CPU blocks INIT when it's in VMX root.
> +	 * Disable virtualization on all CPUs before rebooting to avoid hanging
> +	 * the system, as VMX and SVM block INIT when running in the host.
>  	 *
>  	 * We can't take any locks and we may be on an inconsistent state, so
> -	 * use NMIs as IPIs to tell the other CPUs to exit VMX root and halt.
> +	 * use NMIs as IPIs to tell the other CPUs to disable VMX/SVM and halt.
>  	 *
> -	 * Do the NMI shootdown even if VMX if off on _this_ CPU, as that
> -	 * doesn't prevent a different CPU from being in VMX root operation.
> +	 * Do the NMI shootdown even if virtualization is off on _this_ CPU, as
> +	 * other CPUs may have virtualization enabled.
>  	 */
> -	if (cpu_has_vmx()) {
> -		/* Safely force _this_ CPU out of VMX root operation. */
> -		__cpu_emergency_vmxoff();
> +	cpu_crash_disable_virtualization();
>  
> -		/* Halt and exit VMX root operation on the other CPUs. */
> +	if (cpu_has_vmx() || cpu_has_svm(NULL))
>  		nmi_shootdown_cpus_on_restart();
> -	}

What about leaving cpu_crash_disable_virtualization() inside the if-statement?
It feels weird to "disable" virtualization on the current CPU but ignore others,
e.g. if there's some newfangled type of virtualization in the future, I would be
quite surprised if only the CPU doing the transfer needed to disable virtualization.

	if (cpu_has_vmx() || cpu_has_svm(NULL)) {
		/* Safely force _this_ CPU out of VMX/SVM operation. */
		cpu_crash_disable_virtualization();

		/* Disable VMX/SVM and halt on other CPUs. */
		nmi_shootdown_cpus_on_restart()
	}


>  }
>  
>  
> @@ -587,7 +583,7 @@ static void native_machine_emergency_res
>  	unsigned short mode;
>  
>  	if (reboot_emergency)
> -		emergency_vmx_disable_all();
> +		emergency_reboot_disable_virtualization();
>  
>  	tboot_shutdown(TB_SHUTDOWN_REBOOT);
>  


* Re: [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported
  2022-05-12 14:39     ` Sean Christopherson
@ 2022-05-12 15:47       ` Thomas Gleixner
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-12 15:47 UTC
  To: Sean Christopherson
  Cc: Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, Guilherme G . Piccoli, Vitaly Kuznetsov,
	Paolo Bonzini

On Thu, May 12 2022 at 14:39, Sean Christopherson wrote:
> What about leaving cpu_crash_disable_virtualization() inside the if-statement?
> It feels weird to "disable" virtualization on the current CPU but ignore others,
> e.g. if there's some newfangled type of virtualization in the future, I would be
> quite surprised if only the CPU doing the transfer needed to disable virtualization.

No real preference, though you have the unconditional invocation already
in the crash code IIRC.

Thanks,

        tglx


* Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add
  2022-05-12 14:14     ` Sean Christopherson
  2022-05-12 14:35       ` Sean Christopherson
@ 2022-05-12 15:48       ` Thomas Gleixner
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-12 15:48 UTC
  To: Sean Christopherson
  Cc: Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, Guilherme G . Piccoli, Vitaly Kuznetsov,
	Paolo Bonzini

On Thu, May 12 2022 at 14:14, Sean Christopherson wrote:
> On Thu, May 12, 2022, Thomas Gleixner wrote:
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 4e3a839ae146..808d3e75fb2d 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -896,7 +896,11 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
>                 msecs--;
>         }
>  
> -       /* Leave the nmi callback set */
> +       /*
> +        * Leave the nmi callback set, shootdown is a one-time thing.  Clearing
> +        * the callback could result in a NULL pointer dereference if a CPU
> +        * (finally) responds after the timeout expires.
> +        */

Good point!


* [PATCH] x86/nmi: Make register_nmi_handler() more robust
  2022-05-11 23:43 [PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug Sean Christopherson
  2022-05-11 23:43 ` [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add Sean Christopherson
  2022-05-11 23:43 ` [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported Sean Christopherson
@ 2022-05-13 11:10 ` Thomas Gleixner
  2022-05-15 11:37   ` Thomas Gleixner
  2022-05-15 11:39   ` [PATCH V2] " Thomas Gleixner
  2022-05-17  7:34 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  3 siblings, 2 replies; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-13 11:10 UTC
  To: Sean Christopherson, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

register_nmi_handler() has no sanity check whether a handler has been
registered already. Such an unintended double-add leads to list corruption
and hard to diagnose problems during the next NMI handling.

Init the list head in the static nmi action struct and check it for being
empty in register_nmi_handler().
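
To illustrate the failure mode and the check outside the kernel: a minimal
userspace analogue, with hand-rolled stand-ins for list_head and
hypothetical names (not kernel code):

	#include <stdio.h>
	#include <stdbool.h>

	/* Hand-rolled stand-ins for the kernel's list_head and helpers. */
	struct list_head { struct list_head *next, *prev; };

	#define LIST_HEAD_INIT(name) { &(name), &(name) }

	static void list_add(struct list_head *new, struct list_head *head)
	{
		new->next = head->next;
		new->prev = head;
		head->next->prev = new;
		head->next = new;
	}

	static bool list_empty(const struct list_head *head)
	{
		return head->next == head;
	}

	/* Like the static nmiaction embedded in register_nmi_handler(). */
	static struct list_head desc_head = LIST_HEAD_INIT(desc_head);
	static struct list_head action    = LIST_HEAD_INIT(action);

	static int register_action(struct list_head *a)
	{
		/*
		 * The sanity check: a node that is already on a list is not
		 * "empty", so a second registration is refused instead of
		 * corrupting the list.
		 */
		if (!list_empty(a))
			return -1;
		list_add(a, &desc_head);
		return 0;
	}

	int main(void)
	{
		printf("first add:  %d\n", register_action(&action));	/* 0  */
		printf("second add: %d\n", register_action(&action));	/* -1 */
		return 0;
	}

The re-init on unregister (the INIT_LIST_HEAD() below) is what makes the
node registerable again after a successful unregister.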

Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/nmi.h |    1 +
 arch/x86/kernel/nmi.c      |   10 +++++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -47,6 +47,7 @@ struct nmiaction {
 #define register_nmi_handler(t, fn, fg, n, init...)	\
 ({							\
 	static struct nmiaction init fn##_na = {	\
+		.list = LIST_HEAD_INIT(fn##_na.list),	\
 		.handler = (fn),			\
 		.name = (n),				\
 		.flags = (fg),				\
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -157,7 +157,7 @@ int __register_nmi_handler(unsigned int
 	struct nmi_desc *desc = nmi_to_desc(type);
 	unsigned long flags;
 
-	if (!action->handler)
+	if (WARN_ON_ONCE(action->handler || !list_empty(&action->list)))
 		return -EINVAL;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
@@ -186,7 +186,7 @@ EXPORT_SYMBOL(__register_nmi_handler);
 void unregister_nmi_handler(unsigned int type, const char *name)
 {
 	struct nmi_desc *desc = nmi_to_desc(type);
-	struct nmiaction *n;
+	struct nmiaction *n, *found = NULL;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
@@ -200,12 +200,16 @@ void unregister_nmi_handler(unsigned int
 			WARN(in_nmi(),
 				"Trying to free NMI (%s) from NMI context!\n", n->name);
 			list_del_rcu(&n->list);
+			found = n;
 			break;
 		}
 	}
 
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
-	synchronize_rcu();
+	if (found) {
+		synchronize_rcu();
+		INIT_LIST_HEAD(&found->list);
+	}
 }
 EXPORT_SYMBOL_GPL(unregister_nmi_handler);
 


* Re: [PATCH] x86/nmi: Make register_nmi_handler() more robust
  2022-05-13 11:10 ` [PATCH] x86/nmi: Make register_nmi_handler() more robust Thomas Gleixner
@ 2022-05-15 11:37   ` Thomas Gleixner
  2022-05-15 11:39   ` [PATCH V2] " Thomas Gleixner
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-15 11:37 UTC
  To: Sean Christopherson, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

On Fri, May 13 2022 at 13:10, Thomas Gleixner wrote:
> @@ -157,7 +157,7 @@ int __register_nmi_handler(unsigned int
>  	struct nmi_desc *desc = nmi_to_desc(type);
>  	unsigned long flags;
>  
> -	if (!action->handler)
> +	if (WARN_ON_ONCE(action->handler || !list_empty(&action->list)))

Bah. That should obviously be !action->handler. Let me send V2



* [PATCH V2] x86/nmi: Make register_nmi_handler() more robust
  2022-05-13 11:10 ` [PATCH] x86/nmi: Make register_nmi_handler() more robust Thomas Gleixner
  2022-05-15 11:37   ` Thomas Gleixner
@ 2022-05-15 11:39   ` Thomas Gleixner
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Gleixner @ 2022-05-15 11:39 UTC
  To: Sean Christopherson, Ingo Molnar, Borislav Petkov, Dave Hansen, x86
  Cc: H. Peter Anvin, linux-kernel, Guilherme G . Piccoli,
	Vitaly Kuznetsov, Paolo Bonzini, Sean Christopherson

register_nmi_handler() has no sanity check whether a handler has been
registered already. Such an unintended double-add leads to list corruption
and hard to diagnose problems during the next NMI handling.

Init the list head in the static nmi action struct and check it for being
empty in register_nmi_handler().

Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20220511234332.3654455-1-seanjc@google.com
---
V2: Warn when no handler is provided rather than when one is (Boris)
---
 arch/x86/include/asm/nmi.h |    1 +
 arch/x86/kernel/nmi.c      |   10 +++++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -47,6 +47,7 @@ struct nmiaction {
 #define register_nmi_handler(t, fn, fg, n, init...)	\
 ({							\
 	static struct nmiaction init fn##_na = {	\
+		.list = LIST_HEAD_INIT(fn##_na.list),	\
 		.handler = (fn),			\
 		.name = (n),				\
 		.flags = (fg),				\
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -157,7 +157,7 @@ int __register_nmi_handler(unsigned int
 	struct nmi_desc *desc = nmi_to_desc(type);
 	unsigned long flags;
 
-	if (!action->handler)
+	if (WARN_ON_ONCE(!action->handler || !list_empty(&action->list)))
 		return -EINVAL;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
@@ -186,7 +186,7 @@ EXPORT_SYMBOL(__register_nmi_handler);
 void unregister_nmi_handler(unsigned int type, const char *name)
 {
 	struct nmi_desc *desc = nmi_to_desc(type);
-	struct nmiaction *n;
+	struct nmiaction *n, *found = NULL;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
@@ -200,12 +200,16 @@ void unregister_nmi_handler(unsigned int
 			WARN(in_nmi(),
 				"Trying to free NMI (%s) from NMI context!\n", n->name);
 			list_del_rcu(&n->list);
+			found = n;
 			break;
 		}
 	}
 
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
-	synchronize_rcu();
+	if (found) {
+		synchronize_rcu();
+		INIT_LIST_HEAD(&found->list);
+	}
 }
 EXPORT_SYMBOL_GPL(unregister_nmi_handler);
 


* [tip: x86/core] x86/nmi: Make register_nmi_handler() more robust
  2022-05-11 23:43 [PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-05-13 11:10 ` [PATCH] x86/nmi: Make register_nmi_handler() more robust Thomas Gleixner
@ 2022-05-17  7:34 ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 16+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-05-17  7:34 UTC
  To: linux-tip-commits
  Cc: Sean Christopherson, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     a7fed5c0431dbfa707037848830f980e0f93cfb3
Gitweb:        https://git.kernel.org/tip/a7fed5c0431dbfa707037848830f980e0f93cfb3
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sun, 15 May 2022 13:39:34 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Tue, 17 May 2022 09:25:25 +02:00

x86/nmi: Make register_nmi_handler() more robust

register_nmi_handler() has no sanity check whether a handler has been
registered already. Such an unintended double-add leads to list corruption
and hard to diagnose problems during the next NMI handling.

Init the list head in the static NMI action struct and check it for being
empty in register_nmi_handler().

  [ bp: Fixups. ]

Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/lkml/20220511234332.3654455-1-seanjc@google.com
---
 arch/x86/include/asm/nmi.h |  1 +
 arch/x86/kernel/nmi.c      | 12 ++++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 1cb9c17..5c5f1e5 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -47,6 +47,7 @@ struct nmiaction {
 #define register_nmi_handler(t, fn, fg, n, init...)	\
 ({							\
 	static struct nmiaction init fn##_na = {	\
+		.list = LIST_HEAD_INIT(fn##_na.list),	\
 		.handler = (fn),			\
 		.name = (n),				\
 		.flags = (fg),				\
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index e73f7df..cec0bfa 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -157,7 +157,7 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
 	struct nmi_desc *desc = nmi_to_desc(type);
 	unsigned long flags;
 
-	if (!action->handler)
+	if (WARN_ON_ONCE(!action->handler || !list_empty(&action->list)))
 		return -EINVAL;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
@@ -177,7 +177,7 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
 		list_add_rcu(&action->list, &desc->head);
 	else
 		list_add_tail_rcu(&action->list, &desc->head);
-	
+
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
 	return 0;
 }
@@ -186,7 +186,7 @@ EXPORT_SYMBOL(__register_nmi_handler);
 void unregister_nmi_handler(unsigned int type, const char *name)
 {
 	struct nmi_desc *desc = nmi_to_desc(type);
-	struct nmiaction *n;
+	struct nmiaction *n, *found = NULL;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
@@ -200,12 +200,16 @@ void unregister_nmi_handler(unsigned int type, const char *name)
 			WARN(in_nmi(),
 				"Trying to free NMI (%s) from NMI context!\n", n->name);
 			list_del_rcu(&n->list);
+			found = n;
 			break;
 		}
 	}
 
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
-	synchronize_rcu();
+	if (found) {
+		synchronize_rcu();
+		INIT_LIST_HEAD(&found->list);
+	}
 }
 EXPORT_SYMBOL_GPL(unregister_nmi_handler);
 



Thread overview: 16+ messages
2022-05-11 23:43 [PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug Sean Christopherson
2022-05-11 23:43 ` [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list_add Sean Christopherson
2022-05-12  9:14   ` Vitaly Kuznetsov
2022-05-12 10:51   ` Thomas Gleixner
2022-05-12 14:14     ` Sean Christopherson
2022-05-12 14:35       ` Sean Christopherson
2022-05-12 15:48       ` Thomas Gleixner
2022-05-11 23:43 ` [PATCH 2/2] x86/reboot: Disable virtualization in an emergency if SVM is supported Sean Christopherson
2022-05-12  8:37   ` Vitaly Kuznetsov
2022-05-12 10:57   ` Thomas Gleixner
2022-05-12 14:39     ` Sean Christopherson
2022-05-12 15:47       ` Thomas Gleixner
2022-05-13 11:10 ` [PATCH] x86/nmi: Make register_nmi_handler() more robust Thomas Gleixner
2022-05-15 11:37   ` Thomas Gleixner
2022-05-15 11:39   ` [PATCH V2] " Thomas Gleixner
2022-05-17  7:34 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
