kvm.vger.kernel.org archive mirror
* [PATCH v2 0/4] Improve KVM's interaction with CPU hotplug
@ 2022-01-18  6:44 Chao Gao
  2022-01-18  6:44 ` [PATCH v2 1/4] KVM: x86: Move check_processor_compatibility from init ops to runtime ops Chao Gao
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Chao Gao @ 2022-01-18  6:44 UTC
  To: kvm, seanjc, pbonzini, kevin.tian, tglx
  Cc: Chao Gao, Albert Ou, Aleksandar Markovic, Alexander Gordeev,
	Alexandru Elisei, Anup Patel, Atish Patra,
	Benjamin Herrenschmidt, Bharata B Rao, Borislav Petkov,
	Catalin Marinas, Cédric Le Goater, Christian Borntraeger,
	Claudio Imbrenda, Dave Hansen, David Hildenbrand, Fabiano Rosas,
	Hector Martin, Heiko Carstens, H. Peter Anvin, Huacai Chen,
	Ingo Molnar, James Morse, Janosch Frank, Jim Mattson,
	Joerg Roedel, John Garry, kvmarm, kvm-riscv, linux-arm-kernel,
	linux-kernel, linux-mips, linuxppc-dev, linux-riscv, linux-s390,
	Maciej S. Szmigiero, Marc Zyngier, Michael Ellerman,
	Nicholas Piggin, Nick Desaulniers, Palmer Dabbelt,
	Paul Mackerras, Paul Walmsley, Ravi Bangoria, Shaokun Zhang,
	Sumanth Korikkar, Suzuki K Poulose, Thomas Bogendoerfer,
	Thomas Richter, Vasily Gorbik, Vitaly Kuznetsov, Wanpeng Li,
	Will Deacon, x86

Changes from v1->v2: (all comments/suggestions on v1 are from Sean, thanks)
 - Merged v1's patch 2 into patch 1, and v1's patch 5 into patch 6.
 - Use static_call for check_processor_compatibility().
 - Generate patch 2 with "git revert" and do manual changes based on that.
 - Loosen the WARN_ON() in kvm_arch_check_processor_compat() instead of
   removing it.
 - KVM always prevents incompatible CPUs from being brought up, regardless
   of whether any VMs are running.
 - Use pr_warn instead of pr_info to emit logs when KVM finds offending
   CPUs.

KVM registers its CPU hotplug callback in the CPU STARTING section. In that
callback, KVM enables hardware virtualization on hotplugged CPUs if any VM
is running on existing CPUs.

There are two problems with this process:
1. KVM doesn't do compatibility checks before enabling hardware
virtualization on hotplugged CPUs. This may cause #GP if VMX isn't
supported, or vmentry failure if some in-use VMX features are missing on
hotplugged CPUs. Both break running VMs.
2. Callbacks in the CPU STARTING section cannot fail. So, even if KVM finds
an incompatible CPU, its callback cannot block the CPU hotplug.

This series improves KVM's interaction with CPU hotplug to prevent
incompatible CPUs from breaking running VMs. The following changes are made:

1. Move KVM's CPU hotplug callback to the ONLINE section (suggested by Thomas).
2. Do compatibility checks on hotplugged CPUs.
3. Abort onlining incompatible CPUs.
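
The three changes above can be sketched in userspace C roughly as follows.
This is only an illustrative model, not kernel code: the arrays
cpu_compatible[], cpu_is_online[] and virt_enabled[], the vm_count variable
and every function name ending in _sketch are hypothetical stand-ins, and
the model simplifies the real flow (the kernel marks a CPU online before
the ONLINE callbacks run and rolls back on failure).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for "this CPU supports the in-use VMX features". */
static bool cpu_compatible[NR_CPUS] = { true, true, false, true };
static bool cpu_is_online[NR_CPUS];
static bool virt_enabled[NR_CPUS];
static int vm_count = 1;	/* pretend one VM is already running */

/* Change 2: a per-CPU compatibility check that can report an error. */
static int check_processor_compat_sketch(unsigned int cpu)
{
	return cpu_compatible[cpu] ? 0 : -EIO;
}

/* Change 1: an ONLINE-style callback, which may return an error. */
static int kvm_online_cpu_sketch(unsigned int cpu)
{
	int ret = check_processor_compat_sketch(cpu);

	if (ret)
		return ret;
	if (vm_count)
		virt_enabled[cpu] = true;
	return 0;
}

/* Change 3: the hotplug core aborts the bring-up if the callback fails. */
int bring_cpu_online_sketch(unsigned int cpu)
{
	int ret;

	assert(cpu < NR_CPUS);
	ret = kvm_online_cpu_sketch(cpu);
	if (ret)
		return ret;	/* CPU stays offline, VMs are unaffected */
	cpu_is_online[cpu] = true;
	return 0;
}
```

In this model, bringing up CPU1 succeeds and enables virtualization on it,
while bringing up CPU2 returns -EIO and leaves it offline.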

This series is a follow-up to the discussion about KVM and CPU hotplug:
https://lore.kernel.org/lkml/3d3296f0-9245-40f9-1b5a-efffdb082de9@redhat.com/T/

Note: this series has been tested only on Intel systems.

Chao Gao (4):
  KVM: x86: Move check_processor_compatibility from init ops to runtime
    ops
  Partially revert "KVM: Pass kvm_init()'s opaque param to additional
    arch funcs"
  KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  KVM: Do compatibility checks on hotplugged CPUs

 arch/arm64/kvm/arm.c               |  2 +-
 arch/mips/kvm/mips.c               |  2 +-
 arch/powerpc/kvm/powerpc.c         |  2 +-
 arch/riscv/kvm/main.c              |  2 +-
 arch/s390/kvm/kvm-s390.c           |  2 +-
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 +-
 arch/x86/kvm/svm/svm.c             |  4 +-
 arch/x86/kvm/vmx/evmcs.c           |  2 +-
 arch/x86/kvm/vmx/evmcs.h           |  2 +-
 arch/x86/kvm/vmx/vmx.c             | 12 +++---
 arch/x86/kvm/x86.c                 | 16 +++++---
 include/linux/cpuhotplug.h         |  2 +-
 include/linux/kvm_host.h           |  2 +-
 virt/kvm/kvm_main.c                | 62 ++++++++++++++++++++----------
 15 files changed, 71 insertions(+), 44 deletions(-)

-- 
2.25.1



* [PATCH v2 1/4] KVM: x86: Move check_processor_compatibility from init ops to runtime ops
  2022-01-18  6:44 [PATCH v2 0/4] Improve KVM's interaction with CPU hotplug Chao Gao
@ 2022-01-18  6:44 ` Chao Gao
  2022-01-18  6:44 ` [PATCH v2 2/4] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs" Chao Gao
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Chao Gao @ 2022-01-18  6:44 UTC
  To: kvm, seanjc, pbonzini, kevin.tian, tglx
  Cc: Chao Gao, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

Move check_processor_compatibility() from the init ops to the runtime ops
so that KVM can do compatibility checks on hotplugged CPUs. Drop __init
from check_processor_compatibility() and its callees.

Use a static_call() to invoke .check_processor_compatibility().

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 +-
 arch/x86/kvm/svm/svm.c             |  4 ++--
 arch/x86/kvm/vmx/evmcs.c           |  2 +-
 arch/x86/kvm/vmx/evmcs.h           |  2 +-
 arch/x86/kvm/vmx/vmx.c             | 12 ++++++------
 arch/x86/kvm/x86.c                 |  3 +--
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index f658bb4dbb74..ab9b4eca56be 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -123,6 +123,7 @@ KVM_X86_OP_NULL(enable_direct_tlbflush)
 KVM_X86_OP_NULL(migrate_timers)
 KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP_NULL(complete_emulated_msr)
+KVM_X86_OP(check_processor_compatibility)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_NULL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 89d1fdb39c46..a916b16edd89 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1314,6 +1314,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
 struct kvm_x86_ops {
 	const char *name;
 
+	int (*check_processor_compatibility)(void);
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
@@ -1526,7 +1527,6 @@ struct kvm_x86_nested_ops {
 struct kvm_x86_init_ops {
 	int (*cpu_has_kvm_support)(void);
 	int (*disabled_by_bios)(void);
-	int (*check_processor_compatibility)(void);
 	int (*hardware_setup)(void);
 
 	struct kvm_x86_ops *runtime_ops;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c3d9006478a4..5725fed5ced7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4064,7 +4064,7 @@ svm_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
 	hypercall[2] = 0xd9;
 }
 
-static int __init svm_check_processor_compat(void)
+static int svm_check_processor_compat(void)
 {
 	return 0;
 }
@@ -4613,6 +4613,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = "kvm_amd",
 
 	.hardware_unsetup = svm_hardware_teardown,
+	.check_processor_compatibility = svm_check_processor_compat,
 	.hardware_enable = svm_hardware_enable,
 	.hardware_disable = svm_hardware_disable,
 	.cpu_has_accelerated_tpr = svm_cpu_has_accelerated_tpr,
@@ -4746,7 +4747,6 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
 	.hardware_setup = svm_hardware_setup,
-	.check_processor_compatibility = svm_check_processor_compat,
 
 	.runtime_ops = &svm_x86_ops,
 };
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index ba6f99f584ac..50f923e9917e 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -296,7 +296,7 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 };
 const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
 
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
+void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
 {
 	vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
 	vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 16731d2cf231..17a7c956396b 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -181,7 +181,7 @@ static inline void evmcs_load(u64 phys_addr)
 	vp_ap->enlighten_vmentry = 1;
 }
 
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
+void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
 #else /* !IS_ENABLED(CONFIG_HYPERV) */
 static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
 static inline void evmcs_write32(unsigned long field, u32 value) {}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 15e30602782b..364348e134df 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2402,8 +2402,8 @@ static bool cpu_has_sgx(void)
 	return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
 }
 
-static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-				      u32 msr, u32 *result)
+static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
+			       u32 msr, u32 *result)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 ctl = ctl_min | ctl_opt;
@@ -2421,8 +2421,8 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 	return 0;
 }
 
-static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
-				    struct vmx_capability *vmx_cap)
+static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
+			     struct vmx_capability *vmx_cap)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 min, opt, min2, opt2;
@@ -7051,7 +7051,7 @@ static int vmx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
-static int __init vmx_check_processor_compat(void)
+static int vmx_check_processor_compat(void)
 {
 	struct vmcs_config vmcs_conf;
 	struct vmx_capability vmx_cap;
@@ -7663,6 +7663,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 
 	.hardware_unsetup = hardware_unsetup,
 
+	.check_processor_compatibility = vmx_check_processor_compat,
 	.hardware_enable = hardware_enable,
 	.hardware_disable = hardware_disable,
 	.cpu_has_accelerated_tpr = report_flexpriority,
@@ -7999,7 +8000,6 @@ static __init int hardware_setup(void)
 static struct kvm_x86_init_ops vmx_init_ops __initdata = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
-	.check_processor_compatibility = vmx_check_processor_compat,
 	.hardware_setup = hardware_setup,
 
 	.runtime_ops = &vmx_x86_ops,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 60da2331ec32..f8bc1948a8b5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11473,7 +11473,6 @@ void kvm_arch_hardware_unsetup(void)
 int kvm_arch_check_processor_compat(void *opaque)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
-	struct kvm_x86_init_ops *ops = opaque;
 
 	WARN_ON(!irqs_disabled());
 
@@ -11481,7 +11480,7 @@ int kvm_arch_check_processor_compat(void *opaque)
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
 		return -EIO;
 
-	return ops->check_processor_compatibility();
+	return static_call(kvm_x86_check_processor_compatibility)();
 }
 
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
-- 
2.25.1



* [PATCH v2 2/4] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"
  2022-01-18  6:44 [PATCH v2 0/4] Improve KVM's interaction with CPU hotplug Chao Gao
  2022-01-18  6:44 ` [PATCH v2 1/4] KVM: x86: Move check_processor_compatibility from init ops to runtime ops Chao Gao
@ 2022-01-18  6:44 ` Chao Gao
  2022-01-18  6:44 ` [PATCH v2 3/4] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Chao Gao
  2022-01-18  6:44 ` [PATCH v2 4/4] KVM: Do compatibility checks on hotplugged CPUs Chao Gao
  3 siblings, 0 replies; 8+ messages in thread
From: Chao Gao @ 2022-01-18  6:44 UTC
  To: kvm, seanjc, pbonzini, kevin.tian, tglx
  Cc: Chao Gao, Marc Zyngier, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Catalin Marinas, Will Deacon, Huacai Chen,
	Aleksandar Markovic, Thomas Bogendoerfer, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Anup Patel, Atish Patra,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger,
	Janosch Frank, David Hildenbrand, Claudio Imbrenda,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Maciej S. Szmigiero, Cédric Le Goater, Ravi Bangoria,
	Nicholas Piggin, Bharata B Rao, Nick Desaulniers, Fabiano Rosas,
	linux-arm-kernel, kvmarm, linux-kernel, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390

This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque
param to additional arch funcs"): remove the opaque parameter from
kvm_arch_check_processor_compat() because nothing uses it anymore.
Address conflicts for ARM (due to file movement) and manually handle RISC-V,
which was added after that commit.

The changes to kvm_arch_hardware_setup() in the original commit are still
needed, so they are not reverted.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 arch/arm64/kvm/arm.c       |  2 +-
 arch/mips/kvm/mips.c       |  2 +-
 arch/powerpc/kvm/powerpc.c |  2 +-
 arch/riscv/kvm/main.c      |  2 +-
 arch/s390/kvm/kvm-s390.c   |  2 +-
 arch/x86/kvm/x86.c         |  2 +-
 include/linux/kvm_host.h   |  2 +-
 virt/kvm/kvm_main.c        | 16 +++-------------
 8 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 868109cf96b4..92ab3d5516ce 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -73,7 +73,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index e59cb6246f76..c5dc4fe53bfc 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -140,7 +140,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 2ad0ccd202d5..30c817f3fa0c 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -423,7 +423,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return kvmppc_core_check_processor_compat();
 }
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 2e5ca43c8c49..992877e78393 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -20,7 +20,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
 	return -EINVAL;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 9c6d45d0d345..99c70d881cb6 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -252,7 +252,7 @@ int kvm_arch_hardware_enable(void)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f8bc1948a8b5..6f3bf78afb29 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11470,7 +11470,7 @@ void kvm_arch_hardware_unsetup(void)
 	static_call(kvm_x86_hardware_unsetup)();
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3c47b146851a..a51e9ab520fc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1310,7 +1310,7 @@ int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
 int kvm_arch_hardware_setup(void *opaque);
 void kvm_arch_hardware_unsetup(void);
-int kvm_arch_check_processor_compat(void *opaque);
+int kvm_arch_check_processor_compat(void);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6e8e9d36f382..148f7169b431 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5603,22 +5603,14 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
         return &kvm_running_vcpu;
 }
 
-struct kvm_cpu_compat_check {
-	void *opaque;
-	int *ret;
-};
-
-static void check_processor_compat(void *data)
+static void check_processor_compat(void *rtn)
 {
-	struct kvm_cpu_compat_check *c = data;
-
-	*c->ret = kvm_arch_check_processor_compat(c->opaque);
+	*(int *)rtn = kvm_arch_check_processor_compat();
 }
 
 int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 		  struct module *module)
 {
-	struct kvm_cpu_compat_check c;
 	int r;
 	int cpu;
 
@@ -5646,10 +5638,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	if (r < 0)
 		goto out_free_1;
 
-	c.ret = &r;
-	c.opaque = opaque;
 	for_each_online_cpu(cpu) {
-		smp_call_function_single(cpu, check_processor_compat, &c, 1);
+		smp_call_function_single(cpu, check_processor_compat, &r, 1);
 		if (r < 0)
 			goto out_free_2;
 	}
-- 
2.25.1



* [PATCH v2 3/4] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-01-18  6:44 [PATCH v2 0/4] Improve KVM's interaction with CPU hotplug Chao Gao
  2022-01-18  6:44 ` [PATCH v2 1/4] KVM: x86: Move check_processor_compatibility from init ops to runtime ops Chao Gao
  2022-01-18  6:44 ` [PATCH v2 2/4] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs" Chao Gao
@ 2022-01-18  6:44 ` Chao Gao
  2022-02-09  0:29   ` Sean Christopherson
  2022-01-18  6:44 ` [PATCH v2 4/4] KVM: Do compatibility checks on hotplugged CPUs Chao Gao
  3 siblings, 1 reply; 8+ messages in thread
From: Chao Gao @ 2022-01-18  6:44 UTC
  To: kvm, seanjc, pbonzini, kevin.tian, tglx
  Cc: Chao Gao, John Garry, Will Deacon, Darrick J. Wong,
	Shaokun Zhang, Thomas Richter, Tony Lindgren, linux-kernel

The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
hotplug callback to the ONLINE section so that it can abort onlining a CPU
in certain cases, e.g. when KVM fails to enable hardware virtualization on
the hotplugged CPU, to avoid potentially breaking VMs running on existing
CPUs.

Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY because that
state ensures that, when a CPU is offlined, all user tasks and non-pinned
kernel tasks have left the CPU, i.e. there cannot be a vCPU task around.
So, it is safe for KVM's CPU offline callback to disable hardware
virtualization at that point. Likewise, KVM's online callback can enable
hardware virtualization before any vCPU task gets a chance to run on
hotplugged CPUs.

Rename KVM's CPU hotplug callbacks as well.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 include/linux/cpuhotplug.h |  2 +-
 virt/kvm/kvm_main.c        | 28 ++++++++++++++++++++--------
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 773c83730906..14d354c8ce35 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -182,7 +182,6 @@ enum cpuhp_state {
 	CPUHP_AP_CSKY_TIMER_STARTING,
 	CPUHP_AP_TI_GP_TIMER_STARTING,
 	CPUHP_AP_HYPERV_TIMER_STARTING,
-	CPUHP_AP_KVM_STARTING,
 	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
 	CPUHP_AP_KVM_ARM_VGIC_STARTING,
 	CPUHP_AP_KVM_ARM_TIMER_STARTING,
@@ -200,6 +199,7 @@ enum cpuhp_state {
 
 	/* Online section invoked on the hotplugged CPU from the hotplug thread */
 	CPUHP_AP_ONLINE_IDLE,
+	CPUHP_AP_KVM_ONLINE,
 	CPUHP_AP_SCHED_WAIT_EMPTY,
 	CPUHP_AP_SMPBOOT_THREADS,
 	CPUHP_AP_X86_VDSO_VMA_ONLINE,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 148f7169b431..528741601122 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4856,13 +4856,25 @@ static void hardware_enable_nolock(void *junk)
 	}
 }
 
-static int kvm_starting_cpu(unsigned int cpu)
+static int kvm_online_cpu(unsigned int cpu)
 {
+	int ret = 0;
+
 	raw_spin_lock(&kvm_count_lock);
-	if (kvm_usage_count)
+	/*
+	 * Abort the CPU online process if hardware virtualization cannot
+	 * be enabled. Otherwise running VMs would encounter unrecoverable
+	 * errors when scheduled to this CPU.
+	 */
+	if (kvm_usage_count) {
 		hardware_enable_nolock(NULL);
+		if (atomic_read(&hardware_enable_failed)) {
+			ret = -EIO;
+			pr_warn("kvm: abort onlining CPU%d", cpu);
+		}
+	}
 	raw_spin_unlock(&kvm_count_lock);
-	return 0;
+	return ret;
 }
 
 static void hardware_disable_nolock(void *junk)
@@ -4875,7 +4887,7 @@ static void hardware_disable_nolock(void *junk)
 	kvm_arch_hardware_disable();
 }
 
-static int kvm_dying_cpu(unsigned int cpu)
+static int kvm_offline_cpu(unsigned int cpu)
 {
 	raw_spin_lock(&kvm_count_lock);
 	if (kvm_usage_count)
@@ -5644,8 +5656,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 			goto out_free_2;
 	}
 
-	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
-				      kvm_starting_cpu, kvm_dying_cpu);
+	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
+				      kvm_online_cpu, kvm_offline_cpu);
 	if (r)
 		goto out_free_2;
 	register_reboot_notifier(&kvm_reboot_notifier);
@@ -5708,7 +5720,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 out_free_2:
 	kvm_arch_hardware_unsetup();
 out_free_1:
@@ -5734,7 +5746,7 @@ void kvm_exit(void)
 	kvm_async_pf_deinit();
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
-- 
2.25.1



* [PATCH v2 4/4] KVM: Do compatibility checks on hotplugged CPUs
  2022-01-18  6:44 [PATCH v2 0/4] Improve KVM's interaction with CPU hotplug Chao Gao
                   ` (2 preceding siblings ...)
  2022-01-18  6:44 ` [PATCH v2 3/4] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Chao Gao
@ 2022-01-18  6:44 ` Chao Gao
  2022-02-09  0:36   ` Sean Christopherson
  3 siblings, 1 reply; 8+ messages in thread
From: Chao Gao @ 2022-01-18  6:44 UTC
  To: kvm, seanjc, pbonzini, kevin.tian, tglx
  Cc: Chao Gao, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, linux-kernel

At init time, KVM does compatibility checks to ensure that all online
CPUs support hardware virtualization and a common set of features. But
KVM uses hotplugged CPUs without such compatibility checks. On Intel
CPUs, this leads to #GP if the hotplugged CPU doesn't support VMX, or
vmentry failure if the hotplugged CPU doesn't meet minimal feature
requirements.

Do compatibility checks when onlining a CPU and abort the online process
if the hotplugged CPU is incompatible with online CPUs.

Disable CPU hotplug during hardware_enable_all() to prevent the corner
case shown below. A hotplugged CPU marks itself online in
cpu_online_mask (1) and enables interrupts (2) before invoking callbacks
registered in the ONLINE section (3). So, if hardware_enable_all() is
invoked on another CPU right after (2), then on_each_cpu() in
hardware_enable_all() invokes hardware_enable_nolock() on the hotplugged
CPU before kvm_online_cpu() is called. The CPU would then escape the
compatibility checks, which is risky.

	start_secondary { ...
		set_cpu_online(smp_processor_id(), true); <- 1
		...
		local_irq_enable();  <- 2
		...
		cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
	}
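
The ordering problem above can be modelled with a small single-threaded
userspace sketch. It is an illustration under stated assumptions, not
kernel code: online_mask[], compat_checked[], virt_enabled[] and
hotplug_in_progress are hypothetical variables, all function names are made
up, and the hotplug lock is modelled by returning -EBUSY where the real
cpus_read_lock() would sleep until the hotplug operation finishes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 2

static bool online_mask[NR_CPUS] = { true, false };	/* CPU1 not up yet */
static bool compat_checked[NR_CPUS] = { true, false };
static bool virt_enabled[NR_CPUS];
static bool hotplug_in_progress;

/* Steps (1) and (2): the new CPU marks itself online, enables interrupts. */
void hotplug_mark_online_sketch(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	hotplug_in_progress = true;
	online_mask[cpu] = true;
}

/* Step (3): ONLINE callbacks run; the compatibility check happens here. */
void hotplug_run_online_callbacks_sketch(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	compat_checked[cpu] = true;
	virt_enabled[cpu] = true;
	hotplug_in_progress = false;
}

/*
 * Enable virtualization on every CPU in online_mask. Returns the number of
 * CPUs that were enabled without a prior compatibility check, or -EBUSY if
 * the caller asked for hotplug exclusion while a hotplug was in flight.
 */
int hardware_enable_all_sketch(bool take_hotplug_lock)
{
	int unchecked = 0;

	if (take_hotplug_lock && hotplug_in_progress)
		return -EBUSY;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!online_mask[cpu])
			continue;
		virt_enabled[cpu] = true;
		if (!compat_checked[cpu])
			unchecked++;
	}
	return unchecked;
}
```

Calling hardware_enable_all_sketch(false) between steps (2) and (3) reports
one unchecked CPU, whereas the variant that takes the hotplug lock never
observes the half-onlined CPU.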

Keep the compatibility checks at KVM init time. They help find
incompatibility issues earlier and allow KVM to refuse to load the arch
KVM module (e.g., kvm-intel).

Loosen the WARN_ON() in kvm_arch_check_processor_compat() so that it
can be invoked from KVM's CPU hotplug callback (i.e., kvm_online_cpu()).

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/kvm/x86.c  | 11 +++++++++--
 virt/kvm/kvm_main.c | 20 +++++++++++++++++++-
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6f3bf78afb29..21bdb5783f71 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11472,9 +11472,16 @@ void kvm_arch_hardware_unsetup(void)
 
 int kvm_arch_check_processor_compat(void)
 {
-	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
+	int cpu = smp_processor_id();
+	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-	WARN_ON(!irqs_disabled());
+	/*
+	 * Compatibility checks are done when loading KVM or in KVM's CPU
+	 * hotplug callback. It ensures all online CPUs are compatible to run
+	 * vCPUs. For other cases, compatibility checks are unnecessary or
+	 * even problematic. Try to detect improper usages here.
+	 */
+	WARN_ON(!irqs_disabled() && cpu_active(cpu));
 
 	if (__cr4_reserved_bits(cpu_has, c) !=
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 528741601122..83f87fb1fa0a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4858,7 +4858,13 @@ static void hardware_enable_nolock(void *junk)
 
 static int kvm_online_cpu(unsigned int cpu)
 {
-	int ret = 0;
+	int ret;
+
+	ret = kvm_arch_check_processor_compat();
+	if (ret) {
+		pr_warn("kvm: CPU%d is incompatible with online CPUs", cpu);
+		return ret;
+	}
 
 	raw_spin_lock(&kvm_count_lock);
 	/*
@@ -4916,6 +4922,17 @@ static int hardware_enable_all(void)
 {
 	int r = 0;
 
+	/*
+	 * During onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
+	 * is called. on_each_cpu() between them includes the CPU. As a result,
+	 * hardware_enable_nolock() may get invoked before kvm_online_cpu().
+	 * This would enable hardware virtualization on that cpu without
+	 * compatibility checks, which can potentially crash system or break
+	 * running VMs.
+	 *
+	 * Disable CPU hotplug to prevent this case from happening.
+	 */
+	cpus_read_lock();
 	raw_spin_lock(&kvm_count_lock);
 
 	kvm_usage_count++;
@@ -4930,6 +4947,7 @@ static int hardware_enable_all(void)
 	}
 
 	raw_spin_unlock(&kvm_count_lock);
+	cpus_read_unlock();
 
 	return r;
 }
-- 
2.25.1



* Re: [PATCH v2 3/4] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-01-18  6:44 ` [PATCH v2 3/4] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Chao Gao
@ 2022-02-09  0:29   ` Sean Christopherson
  2022-02-09  7:59     ` Chao Gao
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2022-02-09  0:29 UTC
  To: Chao Gao
  Cc: kvm, pbonzini, kevin.tian, tglx, John Garry, Will Deacon,
	Darrick J. Wong, Shaokun Zhang, Thomas Richter, Tony Lindgren,
	linux-kernel

On Tue, Jan 18, 2022, Chao Gao wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 148f7169b431..528741601122 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4856,13 +4856,25 @@ static void hardware_enable_nolock(void *junk)
>  	}
>  }
>  
> -static int kvm_starting_cpu(unsigned int cpu)
> +static int kvm_online_cpu(unsigned int cpu)
>  {
> +	int ret = 0;
> +
>  	raw_spin_lock(&kvm_count_lock);
> -	if (kvm_usage_count)
> +	/*
> +	 * Abort the CPU online process if hardware virtualization cannot
> +	 * be enabled. Otherwise running VMs would encounter unrecoverable
> +	 * errors when scheduled to this CPU.
> +	 */
> +	if (kvm_usage_count) {


>  		hardware_enable_nolock(NULL);
> +		if (atomic_read(&hardware_enable_failed)) {

This needs:

		atomic_set(&hardware_enable_failed, 0);

otherwise failure to online one CPU will prevent onlining other non-broken CPUs.
It's probably worth adding a WARN_ON_ONCE above this too, e.g.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 70e034cbe813..b25a00c76b3a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4863,8 +4863,11 @@ static int kvm_online_cpu(unsigned int cpu)
         * errors when scheduled to this CPU.
         */
        if (kvm_usage_count) {
+               WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
+
                hardware_enable_nolock(NULL);
                if (atomic_read(&hardware_enable_failed)) {
+                       atomic_set(&hardware_enable_failed, 0);
                        ret = -EIO;
                        pr_warn("kvm: abort onlining CPU%d", cpu);
                }
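
As a rough userspace illustration of why the reset matters, the sketch
below models hardware_enable_failed as a C11 atomic counter. Everything in
it is an assumption made for the example: cpu_broken[], the _sketch
functions and the reset_on_failure flag are hypothetical, and the model
ignores locking entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Shared failure counter, analogous in spirit to hardware_enable_failed. */
static atomic_int hardware_enable_failed;

/* Hypothetical: CPU1 cannot enable virtualization, the others can. */
static bool cpu_broken[NR_CPUS] = { false, true, false, false };

static void hardware_enable_sketch(unsigned int cpu)
{
	if (cpu_broken[cpu])
		atomic_fetch_add(&hardware_enable_failed, 1);
}

/*
 * Online callback model. With reset_on_failure == false the counter stays
 * non-zero after the first failure, so later healthy CPUs are rejected too.
 */
int online_cpu_sketch(unsigned int cpu, bool reset_on_failure)
{
	int ret = 0;

	assert(cpu < NR_CPUS);
	hardware_enable_sketch(cpu);
	if (atomic_load(&hardware_enable_failed)) {
		if (reset_on_failure)
			atomic_store(&hardware_enable_failed, 0);
		ret = -EIO;
	}
	return ret;
}
```

Without the reset, onlining the healthy CPU2 after the broken CPU1 also
fails with -EIO; with the reset, only CPU1 is rejected.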


> +			ret = -EIO;
> +			pr_warn("kvm: abort onlining CPU%d", cpu);

This is somewhat redundant with the pr_info() message in hardware_enable_nolock().
What about adding the below as a prep patch?  I think/hope it would be obvious to
the user/admin that onlining the CPU failed?  E.g. the output would be:

  kvm: enabling virtualization on CPU2 failed during hardware_enable_all()

From: Sean Christopherson <seanjc@google.com>
Date: Tue, 8 Feb 2022 13:26:19 -0800
Subject: [PATCH] KVM: Provide more information in kernel log if hardware
 enabling fails

Provide the name of the calling function to hardware_enable_nolock() and
include it in the error message to provide additional information on
exactly what path failed.

Opportunistically bump the pr_info() to pr_warn(); failure to enable
virtualization support is warn-worthy as _something_ is wrong with the
system.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index be614a6325e4..23481fd746aa 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4833,7 +4833,7 @@ static struct miscdevice kvm_dev = {
 	&kvm_chardev_ops,
 };

-static void hardware_enable_nolock(void *junk)
+static void hardware_enable_nolock(void *caller_name)
 {
 	int cpu = raw_smp_processor_id();
 	int r;
@@ -4848,7 +4848,8 @@ static void hardware_enable_nolock(void *junk)
 	if (r) {
 		cpumask_clear_cpu(cpu, cpus_hardware_enabled);
 		atomic_inc(&hardware_enable_failed);
-		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
+		pr_warn("kvm: enabling virtualization on CPU%d failed during %s()\n",
+			cpu, (const char *)caller_name);
 	}
 }

@@ -4856,7 +4857,7 @@ static int kvm_starting_cpu(unsigned int cpu)
 {
 	raw_spin_lock(&kvm_count_lock);
 	if (kvm_usage_count)
-		hardware_enable_nolock(NULL);
+		hardware_enable_nolock((void *)__func__);
 	raw_spin_unlock(&kvm_count_lock);
 	return 0;
 }
@@ -4905,7 +4906,7 @@ static int hardware_enable_all(void)
 	kvm_usage_count++;
 	if (kvm_usage_count == 1) {
 		atomic_set(&hardware_enable_failed, 0);
-		on_each_cpu(hardware_enable_nolock, NULL, 1);
+		on_each_cpu(hardware_enable_nolock, (void *)__func__, 1);

 		if (atomic_read(&hardware_enable_failed)) {
 			hardware_disable_all_nolock();
@@ -5530,7 +5531,7 @@ static void kvm_resume(void)
 #ifdef CONFIG_LOCKDEP
 		WARN_ON(lockdep_is_held(&kvm_count_lock));
 #endif
-		hardware_enable_nolock(NULL);
+		hardware_enable_nolock((void *)__func__);
 	}
 }


base-commit: 357ef9d9c0728bc2bbb9810c662263bba6b8dbc7
--



* Re: [PATCH v2 4/4] KVM: Do compatibility checks on hotplugged CPUs
  2022-01-18  6:44 ` [PATCH v2 4/4] KVM: Do compatibility checks on hotplugged CPUs Chao Gao
@ 2022-02-09  0:36   ` Sean Christopherson
  0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2022-02-09  0:36 UTC
  To: Chao Gao
  Cc: kvm, pbonzini, kevin.tian, tglx, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, linux-kernel

On Tue, Jan 18, 2022, Chao Gao wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 528741601122..83f87fb1fa0a 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4858,7 +4858,13 @@ static void hardware_enable_nolock(void *junk)
>  
>  static int kvm_online_cpu(unsigned int cpu)
>  {
> -	int ret = 0;
> +	int ret;
> +
> +	ret = kvm_arch_check_processor_compat();
> +	if (ret) {
> +		pr_warn("kvm: CPU%d is incompatible with online CPUs", cpu);

Hmm, I'd say omit the pr_warn() here too and instead add a pr_err() for the
setup_vmcs_config() path in vmx_check_processor_compat().

> +		return ret;
> +	}


* Re: [PATCH v2 3/4] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-02-09  0:29   ` Sean Christopherson
@ 2022-02-09  7:59     ` Chao Gao
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Gao @ 2022-02-09  7:59 UTC
  To: Sean Christopherson
  Cc: kvm, pbonzini, kevin.tian, tglx, John Garry, Will Deacon,
	Darrick J. Wong, Shaokun Zhang, Thomas Richter, Tony Lindgren,
	linux-kernel

On Wed, Feb 09, 2022 at 12:29:57AM +0000, Sean Christopherson wrote:
>On Tue, Jan 18, 2022, Chao Gao wrote:
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 148f7169b431..528741601122 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -4856,13 +4856,25 @@ static void hardware_enable_nolock(void *junk)
>>  	}
>>  }
>>  
>> -static int kvm_starting_cpu(unsigned int cpu)
>> +static int kvm_online_cpu(unsigned int cpu)
>>  {
>> +	int ret = 0;
>> +
>>  	raw_spin_lock(&kvm_count_lock);
>> -	if (kvm_usage_count)
>> +	/*
>> +	 * Abort the CPU online process if hardware virtualization cannot
>> +	 * be enabled. Otherwise running VMs would encounter unrecoverable
>> +	 * errors when scheduled to this CPU.
>> +	 */
>> +	if (kvm_usage_count) {
>
>
>>  		hardware_enable_nolock(NULL);
>> +		if (atomic_read(&hardware_enable_failed)) {
>
>This needs:
>
>		atomic_set(&hardware_enable_failed, 0);
>
>otherwise failure to online one CPU will prevent onlining other non-broken CPUs.
>It's probably worth adding a WARN_ON_ONCE above this too, e.g.

Thanks. All your comments on this series make sense. I have just posted a
revised version.


