kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] KVM: Fix race between reboot and hardware enabling
@ 2023-03-10 22:14 Sean Christopherson
  2023-03-10 22:14 ` [PATCH 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown Sean Christopherson
  2023-03-10 22:14 ` [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated Sean Christopherson
  0 siblings, 2 replies; 7+ messages in thread
From: Sean Christopherson @ 2023-03-10 22:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv,
	Sean Christopherson

Fix a bug where enabling hardware virtualization can race with a forced
reboot, e.g. `reboot -f`, and result in virt hardware being enabled when
the reboot is attempted, and thus hanging the reboot.

Found by inspection, confirmed by hacking the reboot flow to wait until
KVM loads (the problematic window is ridiculously small).

Tested only on x86, though there would have to be some seriously subtle
arch and/or driver code for this to break other architectures.

Sean Christopherson (2):
  KVM: Use syscore_ops instead of reboot_notifier to hook
    restart/shutdown
  KVM: Don't enable hardware after a restart/shutdown is initiated

 virt/kvm/kvm_main.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)


base-commit: 45dd9bc75d9adc9483f0c7d662ba6e73ed698a0b
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown
  2023-03-10 22:14 [PATCH 0/2] KVM: Fix race between reboot and hardware enabling Sean Christopherson
@ 2023-03-10 22:14 ` Sean Christopherson
  2023-03-12 10:12   ` Marc Zyngier
  2023-03-10 22:14 ` [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated Sean Christopherson
  1 sibling, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2023-03-10 22:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv,
	Sean Christopherson

Use syscore_ops.shutdown to disable hardware virtualization during a
reboot instead of using the dedicated reboot_notifier so that KVM disables
virtualization _after_ system_state has been updated.  This will allow
fixing a race in KVM's handling of a forced reboot where KVM can end up
enabling hardware virtualization between kernel_restart_prepare() and
machine_restart().

Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: kvmarm@lists.linux.dev
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: kvm-riscv@lists.infradead.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d255964ec331..6cdfbb2c641b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5211,8 +5211,7 @@ static int hardware_enable_all(void)
 	return r;
 }
 
-static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
-		      void *v)
+static void kvm_reboot(void)
 {
 	/*
 	 * Some (well, at least mine) BIOSes hang on reboot if
@@ -5223,14 +5222,8 @@ static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
 	pr_info("kvm: exiting hardware virtualization\n");
 	kvm_rebooting = true;
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
-	return NOTIFY_OK;
 }
 
-static struct notifier_block kvm_reboot_notifier = {
-	.notifier_call = kvm_reboot,
-	.priority = 0,
-};
-
 static int kvm_suspend(void)
 {
 	/*
@@ -5261,6 +5254,8 @@ static void kvm_resume(void)
 static struct syscore_ops kvm_syscore_ops = {
 	.suspend = kvm_suspend,
 	.resume = kvm_resume,
+	.shutdown = kvm_reboot,
+
 };
 #else /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */
 static int hardware_enable_all(void)
@@ -5967,7 +5962,6 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	if (r)
 		return r;
 
-	register_reboot_notifier(&kvm_reboot_notifier);
 	register_syscore_ops(&kvm_syscore_ops);
 #endif
 
@@ -6039,7 +6033,6 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 err_vcpu_cache:
 #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 	unregister_syscore_ops(&kvm_syscore_ops);
-	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 #endif
 	return r;
@@ -6065,7 +6058,6 @@ void kvm_exit(void)
 	kvm_async_pf_deinit();
 #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 	unregister_syscore_ops(&kvm_syscore_ops);
-	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 #endif
 	kvm_irqfd_exit();
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated
  2023-03-10 22:14 [PATCH 0/2] KVM: Fix race between reboot and hardware enabling Sean Christopherson
  2023-03-10 22:14 ` [PATCH 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown Sean Christopherson
@ 2023-03-10 22:14 ` Sean Christopherson
  2023-03-12 10:21   ` Marc Zyngier
  1 sibling, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2023-03-10 22:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv,
	Sean Christopherson

Reject hardware enabling, i.e. VM creation, if a restart/shutdown has
been initiated to avoid re-enabling hardware between kvm_reboot() and
machine_{halt,power_off,restart}().  The restart case is especially
problematic (for x86) as enabling VMX (or clearing GIF in KVM_RUN on
SVM) blocks INIT, which results in the restart/reboot hanging as BIOS
is unable to wake and rendezvous with APs.

Note, this bug, and the original issue that motivated the addition of
kvm_reboot(), is effectively limited to a forced reboot, e.g. `reboot -f`.
In a "normal" reboot, userspace will gracefully teardown userspace before
triggering the kernel reboot (modulo bugs, errors, etc), i.e. any process
that might do ioctl(KVM_CREATE_VM) is long gone.

Fixes: 8e1c18157d87 ("KVM: VMX: Disable VMX when system shutdown")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6cdfbb2c641b..b2bf4c105181 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5182,7 +5182,20 @@ static void hardware_disable_all(void)
 static int hardware_enable_all(void)
 {
 	atomic_t failed = ATOMIC_INIT(0);
-	int r = 0;
+	int r;
+
+	/*
+	 * Do not enable hardware virtualization if the system is going down.
+	 * If userspace initiated a forced reboot, e.g. reboot -f, then it's
+	 * possible for an in-flight KVM_CREATE_VM to trigger hardware enabling
+	 * after kvm_reboot() is called.  Note, this relies on system_state
+	 * being set _before_ kvm_reboot(), which is why KVM uses a syscore ops
+	 * hook instead of registering a dedicated reboot notifier (the latter
+	 * runs before system_state is updated).
+	 */
+	if (system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF ||
+	    system_state == SYSTEM_RESTART)
+		return -EBUSY;
 
 	/*
 	 * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
@@ -5195,6 +5208,8 @@ static int hardware_enable_all(void)
 	cpus_read_lock();
 	mutex_lock(&kvm_lock);
 
+	r = 0;
+
 	kvm_usage_count++;
 	if (kvm_usage_count == 1) {
 		on_each_cpu(hardware_enable_nolock, &failed, 1);
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown
  2023-03-10 22:14 ` [PATCH 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown Sean Christopherson
@ 2023-03-12 10:12   ` Marc Zyngier
  0 siblings, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2023-03-12 10:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv

On Fri, 10 Mar 2023 22:14:13 +0000,
Sean Christopherson <seanjc@google.com> wrote:
> 
> Use syscore_ops.shutdown to disable hardware virtualization during a
> reboot instead of using the dedicated reboot_notifier so that KVM disables
> virtualization _after_ system_state has been updated.  This will allow
> fixing a race in KVM's handling of a forced reboot where KVM can end up
> enabling hardware virtualization between kernel_restart_prepare() and
> machine_restart().
> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: James Morse <james.morse@arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> Cc: kvmarm@lists.linux.dev
> Cc: Huacai Chen <chenhuacai@kernel.org>
> Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> Cc: Anup Patel <anup@brainfault.org>
> Cc: Atish Patra <atishp@atishpatra.org>
> Cc: kvm-riscv@lists.infradead.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  virt/kvm/kvm_main.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d255964ec331..6cdfbb2c641b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5211,8 +5211,7 @@ static int hardware_enable_all(void)
>  	return r;
>  }
>  
> -static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
> -		      void *v)
> +static void kvm_reboot(void)
>  {
>  	/*
>  	 * Some (well, at least mine) BIOSes hang on reboot if
> @@ -5223,14 +5222,8 @@ static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
>  	pr_info("kvm: exiting hardware virtualization\n");
>  	kvm_rebooting = true;
>  	on_each_cpu(hardware_disable_nolock, NULL, 1);
> -	return NOTIFY_OK;
>  }
>  
> -static struct notifier_block kvm_reboot_notifier = {
> -	.notifier_call = kvm_reboot,
> -	.priority = 0,
> -};
> -
>  static int kvm_suspend(void)
>  {
>  	/*
> @@ -5261,6 +5254,8 @@ static void kvm_resume(void)
>  static struct syscore_ops kvm_syscore_ops = {
>  	.suspend = kvm_suspend,
>  	.resume = kvm_resume,
> +	.shutdown = kvm_reboot,
> +

nit: consider renaming the kvm_reboot to kvm_shutdown to match the
syscore structure, and drop the spurious blank line.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated
  2023-03-10 22:14 ` [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated Sean Christopherson
@ 2023-03-12 10:21   ` Marc Zyngier
  2023-03-13 15:02     ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2023-03-12 10:21 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv

On Fri, 10 Mar 2023 22:14:14 +0000,
Sean Christopherson <seanjc@google.com> wrote:
> 
> Reject hardware enabling, i.e. VM creation, if a restart/shutdown has
> been initiated to avoid re-enabling hardware between kvm_reboot() and
> machine_{halt,power_off,restart}().  The restart case is especially
> problematic (for x86) as enabling VMX (or clearing GIF in KVM_RUN on
> SVM) blocks INIT, which results in the restart/reboot hanging as BIOS
> is unable to wake and rendezvous with APs.
> 
> Note, this bug, and the original issue that motivated the addition of
> kvm_reboot(), is effectively limited to a forced reboot, e.g. `reboot -f`.
> In a "normal" reboot, userspace will gracefully teardown userspace before
> triggering the kernel reboot (modulo bugs, errors, etc), i.e. any process
> that might do ioctl(KVM_CREATE_VM) is long gone.
> 
> Fixes: 8e1c18157d87 ("KVM: VMX: Disable VMX when system shutdown")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  virt/kvm/kvm_main.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 6cdfbb2c641b..b2bf4c105181 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5182,7 +5182,20 @@ static void hardware_disable_all(void)
>  static int hardware_enable_all(void)
>  {
>  	atomic_t failed = ATOMIC_INIT(0);
> -	int r = 0;
> +	int r;
> +
> +	/*
> +	 * Do not enable hardware virtualization if the system is going down.
> +	 * If userspace initiated a forced reboot, e.g. reboot -f, then it's
> +	 * possible for an in-flight KVM_CREATE_VM to trigger hardware enabling
> +	 * after kvm_reboot() is called.  Note, this relies on system_state
> +	 * being set _before_ kvm_reboot(), which is why KVM uses a syscore ops
> +	 * hook instead of registering a dedicated reboot notifier (the latter
> +	 * runs before system_state is updated).
> +	 */
> +	if (system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF ||
> +	    system_state == SYSTEM_RESTART)
> +		return -EBUSY;

Since we now seem to be relying on system_state for most things, is
there any use for 'kvm_rebooting' other than the ease of evaluation in
__svm_vcpu_run?

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated
  2023-03-12 10:21   ` Marc Zyngier
@ 2023-03-13 15:02     ` Sean Christopherson
  2023-03-13 17:57       ` Marc Zyngier
  0 siblings, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2023-03-13 15:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Paolo Bonzini, kvm, linux-kernel, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv

On Sun, Mar 12, 2023, Marc Zyngier wrote:
> On Fri, 10 Mar 2023 22:14:14 +0000,
> Sean Christopherson <seanjc@google.com> wrote:
> > 
> > Reject hardware enabling, i.e. VM creation, if a restart/shutdown has
> > been initiated to avoid re-enabling hardware between kvm_reboot() and
> > machine_{halt,power_off,restart}().  The restart case is especially
> > problematic (for x86) as enabling VMX (or clearing GIF in KVM_RUN on
> > SVM) blocks INIT, which results in the restart/reboot hanging as BIOS
> > is unable to wake and rendezvous with APs.
> > 
> > Note, this bug, and the original issue that motivated the addition of
> > kvm_reboot(), is effectively limited to a forced reboot, e.g. `reboot -f`.
> > In a "normal" reboot, userspace will gracefully teardown userspace before
> > triggering the kernel reboot (modulo bugs, errors, etc), i.e. any process
> > that might do ioctl(KVM_CREATE_VM) is long gone.
> > 
> > Fixes: 8e1c18157d87 ("KVM: VMX: Disable VMX when system shutdown")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  virt/kvm/kvm_main.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 6cdfbb2c641b..b2bf4c105181 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -5182,7 +5182,20 @@ static void hardware_disable_all(void)
> >  static int hardware_enable_all(void)
> >  {
> >  	atomic_t failed = ATOMIC_INIT(0);
> > -	int r = 0;
> > +	int r;
> > +
> > +	/*
> > +	 * Do not enable hardware virtualization if the system is going down.
> > +	 * If userspace initiated a forced reboot, e.g. reboot -f, then it's
> > +	 * possible for an in-flight KVM_CREATE_VM to trigger hardware enabling
> > +	 * after kvm_reboot() is called.  Note, this relies on system_state
> > +	 * being set _before_ kvm_reboot(), which is why KVM uses a syscore ops
> > +	 * hook instead of registering a dedicated reboot notifier (the latter
> > +	 * runs before system_state is updated).
> > +	 */
> > +	if (system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF ||
> > +	    system_state == SYSTEM_RESTART)
> > +		return -EBUSY;
> 
> Since we now seem to be relying on system_state for most things, is
> there any use for 'kvm_rebooting' other than the ease of evaluation in
> __svm_vcpu_run?

Sadly, yes.  The x86 implementations of emergency_restart(), __crash_kexec() and
other emergency reboot flows disable virtualization and set 'kvm_rebooting'
without touching system_state.  VMX and SVM rely on 'kvm_rebooting' being set to
avoid triggering (another) BUG() during the emergency.

On my todo list is to better understand whether or not the other architectures
that utilize the generic hardware enabling (ARM, RISC-V, MIPS) truly need to disable
virtualization during a reboot, versus KVM simply being polite.  E.g. on x86, if VMX
is left enabled, reboot may hang depending on how the reboot is performed.   If
other architectures really truly need to disable virtualization, then they likely
need something similar to x86's emergency reboot shenanigans.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated
  2023-03-13 15:02     ` Sean Christopherson
@ 2023-03-13 17:57       ` Marc Zyngier
  0 siblings, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2023-03-13 17:57 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, kvmarm, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra, kvm-riscv

On Mon, 13 Mar 2023 15:02:27 +0000,
Sean Christopherson <seanjc@google.com> wrote:
> 
> On my todo list is to better understand whether or not the other architectures
> that utilize the generic hardware enabling (ARM, RISC-V, MIPS) truly need to disable
> virtualization during a reboot, versus KVM simply being polite.  E.g. on x86, if VMX
> is left enabled, reboot may hang depending on how the reboot is performed.   If
> other architectures really truly need to disable virtualization, then they likely
> need something similar to x86's emergency reboot shenanigans.

At least pre-CCA, there isn't much to do, because there is no such
thing as "disabling virtualisation". For kexec, the only things we
need to do are to go back to EL2 in the nVHE case, and in any case to
put all other CPUs back into the firmware (PSCI CPU_OFF).

CCA may well add other things into the picture, because it is a
parallel exception level that KVM doesn't really control. That's one
of the many open questions I have about this "lovely" piece of
architecture.

Of course, if we were to completely ignore CCA and instead use the
underlying HW (aka RME), things would be a lot simpler and we'd be
back to my original statement...

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-03-13 17:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 22:14 [PATCH 0/2] KVM: Fix race between reboot and hardware enabling Sean Christopherson
2023-03-10 22:14 ` [PATCH 1/2] KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown Sean Christopherson
2023-03-12 10:12   ` Marc Zyngier
2023-03-10 22:14 ` [PATCH 2/2] KVM: Don't enable hardware after a restart/shutdown is initiated Sean Christopherson
2023-03-12 10:21   ` Marc Zyngier
2023-03-13 15:02     ` Sean Christopherson
2023-03-13 17:57       ` Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).